Make innobackupex use rsync to copy non-InnoDB files
Currently, innobackupex copies non-InnoDB files by spawning a separate cp or tar process for every file, which is inefficient on servers with very large numbers of databases and tables. This can be optimized for local, non-streaming backups by using rsync instead of cp.
Blueprint information
- Status: Complete
- Approver: Stewart Smith
- Priority: Undefined
- Drafter: Alexey Kopytov
- Direction: Approved
- Assignee: Alexey Kopytov
- Definition: Approved
- Series goal: Accepted for 1.6
- Implementation: Implemented
- Milestone target: 1.6.4
- Started by: Alexey Kopytov
- Completed by: Alexey Kopytov
Whiteboard
In addition to handling a large number of files more efficiently than cp, rsync can also reduce the time spent under FLUSH TABLES WITH READ LOCK (FTWRL) if it is run twice: first to copy non-InnoDB files before the tables are locked, and then to sync any remaining changes while the lock is held.
The goal of this blueprint is to introduce a --rsync option for innobackupex that is mutually exclusive with --remote-host and --stream. When it is specified, backup_files() is called twice: first before mysql_lockall(), to make a preliminary copy of non-InnoDB files with rsync, and again after mysql_lockall(), to sync any changes made between the first pass and the lock.
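The two-pass sequence can be illustrated with a minimal Python sketch. This is not the innobackupex code (which is written in Perl) and not the contributed patch; the names two_pass_backup and rsync_command, and the idea of passing the lock and unlock steps as callables, are assumptions made for illustration. The only rsync options used are -t (preserve modification times, so the second pass can skip unchanged files) and --files-from (read the list of files to copy from a file).

```python
import subprocess
from typing import Callable, List


def rsync_command(datadir: str, backup_dir: str, files_from: str) -> List[str]:
    """Build an rsync command line copying the files listed in files_from.

    Paths in files_from are taken relative to datadir. -t preserves
    modification times so a later run can skip files that did not change.
    """
    return ["rsync", "-t", "--files-from=" + files_from, datadir, backup_dir]


def two_pass_backup(sync: Callable[[], None],
                    lock_tables: Callable[[], None],
                    unlock_tables: Callable[[], None]) -> None:
    """Run sync once without the lock, then once more while it is held."""
    sync()              # pass 1: bulk copy, tables are not locked
    lock_tables()       # hypothetical stand-in for mysql_lockall()
    try:
        sync()          # pass 2: only files changed since pass 1 are sent
    finally:
        unlock_tables() # always release the lock, even if pass 2 fails


def make_rsync_sync(datadir: str, backup_dir: str,
                    files_from: str) -> Callable[[], None]:
    """Return a sync callable that actually invokes rsync."""
    cmd = rsync_command(datadir, backup_dir, files_from)
    return lambda: subprocess.run(cmd, check=True)
```

Because the second pass only has to transfer files whose size or modification time changed after the first pass, the work done under the lock is usually much smaller than a full copy.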
We have a contributed patch for this blueprint in bug #803556.
There is also a dependency on a proper fix for bug #408803. Because rsync can run longer than wait_timeout, a background process must ping the server while rsync executes. Bug #408803 must therefore be fixed before this blueprint is implemented, so that separate keepalive logic does not have to be implemented here.
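The keepalive idea can be sketched as follows. This is only an illustration of pinging in the background while a long operation runs; it is not the fix for bug #408803, it uses a thread rather than a separate process, and the keepalive helper and its ping callable are hypothetical. In practice ping would issue a cheap query on the backup connection, and the interval would need to be well below wait_timeout.

```python
import threading
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def keepalive(ping: Callable[[], None], interval_seconds: float) -> Iterator[None]:
    """Call ping() repeatedly in a background thread until the block exits."""
    stop = threading.Event()

    def worker() -> None:
        while True:
            ping()
            # wait() returns True as soon as stop is set, ending the loop
            if stop.wait(interval_seconds):
                break

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    try:
        yield
    finally:
        stop.set()
        thread.join()
```

A caller would wrap the long-running step, for example `with keepalive(ping_server, 30): run_rsync()`, where ping_server and run_rsync are placeholders for the real operations.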