15.7. On-line Backup of MySQL Cluster

This section describes how to create a backup and how to restore the database from a backup at a later time.

15.7.1. Cluster Backup Concepts

A backup is a snapshot of the database at a given time. The backup consists of three main parts:

Metadata: the names and definitions of all database tables
Table records: the data actually stored in the database tables at the time that the backup was made
Transaction log: a sequential record telling how and when data was stored in the database

Each of these parts is saved on all nodes participating in the backup. During backup, each node saves these three parts into three files on disk:

BACKUP-backup_id.node_id.ctl
A control file containing control information and metadata. Each node saves the same table definitions (for all tables in the cluster) to its own version of this file.
BACKUP-backup_id-0.node_id.data
A data file containing the table records, which are saved on a per-fragment basis. That is, different nodes save different fragments during the backup. The file saved by each node starts with a header that states the tables to which the records belong. Following the list of records there is a footer containing a checksum for all records.
BACKUP-backup_id.node_id.log
A log file containing records of committed transactions. Only transactions on tables stored in the backup are stored in the log. Nodes involved in the backup save different records because different nodes host different database fragments.

In the listing above, backup_id stands for the backup identifier and node_id is the unique identifier for the node creating the file.

15.7.2. Using The Management Client to Create a Backup

Before starting a backup, make sure that the cluster is properly configured for performing one. (See Section 15.7.4, “Configuration for Cluster Backup”.)

Creating a backup using the management client involves the following steps:

Start the management client (ndb_mgm).
Execute the command START BACKUP.
The management client responds as shown here:
```
Waiting for completed, this may take several minutes
Node 1: Backup backup_id started from node management_node_id
```
Here, backup_id is the unique identifier for this particular backup. (This identifier will also be saved in the cluster log, if it has not been configured otherwise.) management_node_id is the node ID of the management to which the management client is connected.
This means that the cluster has received and processed the backup request. It does not mean that the backup has been completed.
Note: Backup messages were not recorded in the cluster log in MySQL 5.1.12 or 5.1.13. The logging of backup operations was restored in MySQL 5.1.14 (see Bug#24544).
When the backup is completed, the management client will indicate this as shown here:
```
Node 1: Backup backup_id started from node management_node_id completed
 StartGCP: 417599 StopGCP: 417602
 #Records: 105957 #LogRecords: 0
 Data: 99719356 bytes Log: 0 bytes
```
The values shown for StartGCP, StopGCP, #Records, #LogRecords, Data, and Log will vary according to the specifics of your cluster.

Cluster backups are created by default in the BACKUP subdirectory of the DataDir on each data node. This can be overridden for one or more data nodes individually, or for all cluster data nodes in the config.ini file using the BackupDataDir configuration parameter as discussed in Identifying Data Nodes. The backup files created for a backup with a given backup_id are stored in a subdirectory named BACKUP-backup_id in the backup directory.

To abort a backup that is already in progress:

Start the management client.
Execute this command:
```
ndb_mgm> ABORT BACKUP backup_id
```
The number backup_id is the identifier of the backup that was included in the response of the management client when the backup was started (in the message Backup backup_id started from node management_node_id).
The management client will acknowledge the abort request with Abort of backup backup_id ordered. Note: At this point, the management client has not yet received a response from the cluster data nodes to this request, and the backup has not yet actually been aborted.
After the backup has been aborted, the management client will report this fact in a manner similar to what is shown here:
```
Node 1: Backup 3 started from 5 has been aborted. Error: 1321 - Backup aborted by user request: Permanent error: User defined error
Node 3: Backup 3 started from 5 has been aborted. Error: 1323 - 1323: Permanent error: Internal error
Node 2: Backup 3 started from 5 has been aborted. Error: 1323 - 1323: Permanent error: Internal error
Node 4: Backup 3 started from 5 has been aborted. Error: 1323 - 1323: Permanent error: Internal error
```
In this example, we have shown sample output for a cluster with 4 data nodes, where the sequence number of the backup to be aborted is 3, and the management node to which the cluster management client is connected has the node ID 5. The first node to complete its part in aborting the backup reports that the reason for the abort was due to a request by the user. (The remaining nodes report that the backup was aborted due to an unspecified internal error.) Note: There is no guarantee that the cluster nodes will respond to an ABORT BACKUP command in any particular order.
The Backup backup_id started from node management_node_id has been aborted messages mean that the backup has been terminated and that all files relating to this backup have been removed from the cluster filesystem.

It is also possible to abort a backup in progress from a system shell using this command:

shell> ndb_mgm -e "ABORT BACKUP backup_id"

Note: If there is no backup with ID backup_id running when an ABORT BACKUP is issued, the management client makes no response, nor is it indicated in the cluster log that an invalid abort command was sent.

15.7.3. ndb_restore — Restore a Cluster Backup

The cluster restoration program is implemented as a separate command-line utility ndb_restore, which can normally be found in the MySQL bin directory. This program reads the files created as a result of the backup and inserts the stored information into the database.

ndb_restore must be executed once for each of the backup files that were created by the START BACKUP command used to create the backup (see Section 15.7.2, “Using The Management Client to Create a Backup”). This is equal to the number of data nodes in the cluster at the time that the backup was created.

Note: Before using ndb_restore, it is recommended that the cluster be running in single user mode, unless you are restoring multiple data nodes in parallel. See Section 15.6.4, “Single User Mode”, for more information about single user mode.

Typical options for this utility are shown here:

ndb_restore [-c connectstring] -n node_id [-m] -b backup_id -r [backup_path=]/path/to/backup/files

The -c option is used to specify a connectstring which tells ndb_restore where to locate the cluster management server. (See Section 15.3.4.2, “The Cluster Connectstring”, for information on connectstrings.) If this option is not used, then ndb_restore attempts to connect to a management server on localhost:1186. This utility acts as a cluster API node, and so requires a free connection “slot” to connect to the cluster management server. This means that there must be at least one [API] or [MYSQLD] section that can be used by it in the cluster config.ini file. It is a good idea to keep at least one empty [API] or [MYSQLD] section in config.ini that is not being used for a MySQL server or other application for this reason (see Section 15.3.4.6, “Defining SQL and Other API Nodes”).

You can verify that ndb_restore is connected to the cluster by using the SHOW command in the ndb_mgm management client. You can also accomplish this from a system shell, as shown here:

shell> ndb_mgm -e "SHOW"

-n is used to specify the node ID of the data node on which the backups were taken.

The first time you run the ndb_restore restoration program, you also need to restore the metadata. In other words, you must re-create the database tables — this can be done by running it with the -m option. Note that the cluster should have an empty database when starting to restore a backup. (In other words, you should start ndbd with --initial prior to performing the restore.)

The -b option is used to specify the ID or sequence number of the backup, and is the same number shown by the management client in the Backup backup_id completed message displayed upon completion of a backup. (See Section 15.7.2, “Using The Management Client to Create a Backup”.)

The path to the backup directory is required, and must include the subdirectory corresponding to the ID backup of the backup to be restored. For example, if the data node's DataDir is /var/lib/mysql-cluster, then the backup directory is /var/lib/mysql-cluster/BACKUP, and the backup files for the backup with the ID 3 can be found in /var/lib/mysql-cluster/BACKUP/BACKUP-3. The path may be absolute or relative to the directory in which the ndb_restore executable is located, and may be optionally prefixed with backup_path=.

Important

When restoring cluster backups, you must be sure to restore all data nodes from backups having the same backup ID. Using files from different backups will at best result in restoring the cluster to an inconsistent state, and may fail altogether.

It is possible to restore a backup to a database with a different configuration than it was created from. For example, suppose that a backup with backup ID 12, created in a cluster with two database nodes having the node IDs 2 and 3, is to be restored to a cluster with four nodes. Then ndb_restore must be run twice — once for each database node in the cluster where the backup was taken. However, ndb_restore cannot always restore backups made from a cluster running one version of MySQL to a cluster running a different MySQL version. See Section 15.4.2, “Cluster Upgrade and Downgrade Compatibility”, for more information.

Note

For more rapid restoration, the data may be restored in parallel, provided that there is a sufficient number of cluster connections available. That is, when restoring to multiple nodes in parallel, you must have an [API] or [MYSQLD] section in the cluster config.ini file available for each concurrent ndb_restore process. However, the data files must always be applied before the logs.

Most of the options available for this program are shown in the following table:

Long Form	Short Form	Description	Default Value
`--backup-id`	`-b`	Backup sequence ID	`0`
`--backup_path`	None	Path to backup files	`./`
`--character-sets-dir`	None	Specify the directory where character set information can be found	None
`--connect`, `--connectstring`, or `--ndb-connectstring`	`-c` or `-C`	Set the connectstring in `[nodeid=node_id;][host=]host[:port]` format	`localhost:1186`
`--core-file`	None	Write a core file in the event of an error	`TRUE`
`--debug`	`-#`	Output debug log	`d:t:O,/tmp/ndb_restore.trace`
`--help` or `--usage`	`-?`	Display help message with available options and current values, then exit	[N/A]
`--ndb-mgmd-host`	None	Set the host and port in `host[:port]` format for the management server to connect to; this is the same as `--connect`, `--connectstring`, or `--ndb-connectstring`, but without a way to specify the `nodeid`	None
`--ndb-nodeid`	None	Specify a node ID for the ndb_restore process	`0`
`--ndb-optimized-node-selection`	None	Optimize selection of nodes for transactions	`TRUE`
`--ndb-shm`	None	Use shared memory connections when available	`FALSE`
`--nodeid`	`-n`	Use backup files from node with the specified ID	`0`
`--parallelism`	`-p`	Set from 1 to 1024 parallel transactions to be used during the restoration process	`128`
`--print`	None	Print metadata and log to `stdout`	`FALSE`
`--print_data`	None	Print data to `stdout`	`FALSE`
`--print_log`	None	Print log to `stdout`	`FALSE`
`--print_meta`	None	Print metadata to `stdout`	`FALSE`
`--restore_data`	`-r`	Restore data and logs	`FALSE`
`--restore_meta`	`-m`	Restore table metadata	`FALSE`
`--version`	`-V`	Output version information and exit	[N/A]

Beginning with MySQL 5.0.40, several additional options are available for use with the --print_data option in generating data dumps, either to stdout, or to a file. These are similar to some of the options used with mysqldump, and are shown in the following table:

Long Form	Short Form	Description	Default Value
`--tab`	`-T`	Creates dumpfiles, one per table, each named `tbl_name.txt`. Takes as its argument the path to the directory where the files should be saved (required; use `.` for the current directory).	None
`--fields-enclosed-by`	None	String used to enclose all column values	None
`--fields-optionally-enclosed-by`	None	String used to enclose column values containing character data (such as `CHAR`, `VARCHAR`, `BINARY`, `TEXT`, or `ENUM`)	None
`--fields-terminated-by`	None	String used to separate column values	`\t` (tab character)
`--hex`	None	Use hex format for binary values	[N/A]
`--lines-terminated-by`	None	String used to terminate each line	`\n` (linefeed character)
`--appends`	None	When used with `--tab`, causes the data to be appended to existing files of the same name	[N/A]

Note

If a table has no explicit primary key, then the output generated when using the --print includes the table's hidden primary key.

Beginning with MySQL 5.0.40, it is possible to restore selected databases, or to restore selected tables from a given database using the syntax shown here:

ndb_restore other_options db_name_1 [db_name_2[, db_name_3][, ...] | tbl_name_1[, tbl_name_2][, ...]]

In other words, you can specify either of the following to be restored:

All tables from one or more databases
One or more tables from a single database

Note

ndb_restore reports both temporary and permanent errors. In the case of temporary errors, it may able to recover from them. Beginning with MySQL 5.0.29, it reports Restore successful, but encountered temporary error, please look at configuration in such cases.

15.7.4. Configuration for Cluster Backup

Five configuration parameters are essential for backup:

BackupDataBufferSize
The amount of memory used to buffer data before it is written to disk.
BackupLogBufferSize
The amount of memory used to buffer log records before these are written to disk.
BackupMemory
The total memory allocated in a database node for backups. This should be the sum of the memory allocated for the backup data buffer and the backup log buffer.
BackupWriteSize
The default size of blocks written to disk. This applies for both the backup data buffer and the backup log buffer.
BackupMaxWriteSize
The maximum size of blocks written to disk. This applies for both the backup data buffer and the backup log buffer.

More detailed information about these parameters can be found in Backup Parameters.

15.7.5. Backup Troubleshooting

If an error code is returned when issuing a backup request, the most likely cause is insufficient memory or disk space. You should check that there is enough memory allocated for the backup. Important: If you have set BackupDataBufferSize and BackupLogBufferSize and their sum is greater than 4MB, then you must also set BackupMemory as well. See BackupMemory.

You should also make sure that there is sufficient space on the hard drive partition of the backup target.

NDB does not support repeatable reads, which can cause problems with the restoration process. Although the backup process is “hot”, restoring a MySQL Cluster from backup is not a 100% “hot” process. This is due to the fact that, for the duration of the restore process, running transactions get non-repeatable reads from the restored data. This means that the state of the data is inconsistent while the restore is in progress.