How to shrink the ibdata file by transporting tables with Trite

You’ve probably had some troubles with the shared InnoDB tablespace stored in the ibdata file. Especially when it has grown for some reasons and reached a critical size.

This behavior occurs in some cases, due to excessive rollback segments growth or during a migration from a unique shared tablespace to a file-per-table configuration for example.

In this post, I would like to explain how to shrink the ibdata file after an unwanted file growth in a file-per-table configuration.
Note that the process could be done without Trite but the tool avoids to write the script used to transport tables yourself.

Initial situation

Here is a sample of the InnoDB configuration:

innodb_data_file_path = ibdata1:100M:autoextend
innodb_file_per_table

And the status of your datafiles in the datadir directory:

drwx------ 2 mysql mysql 4,0K déc. 20 2012 performance_schema
drwxrwx--- 2 mysql mysql 4,0K déc. 20 2012 mysql
drwxrwx--- 2 mysql mysql 4,0K juin 10 23:58 DB_INNO_FPT_1
drwxrwx--- 2 mysql mysql 4,0K sept. 8 15:53 DB_INNO_FPT_2
-rwxrwx--- 1 mysql mysql 317G sept. 8 23:37 ibdata1

The ibdata file size is 317Go and you want to recover this lost space.

What is Trite?

Trite is a client/server tool that provides customizable transport of binary InnoDB table files.
The tool was developed by Joshua Prunier and is available on GitHub.

The tool allows to connect a client to your database backup (XtraBackup) and stream the files to another database.
Again, you can do that without Trite, but this tool automate the manual process.

Also, this procedure is much quicker than traditional mysqldump restores when the tables size becomes very large.

Finally, Trite can be useful in many use cases, it deserves your attention.

Why Trite?

Because in this case the dataset is too big to use a traditional export/import with mysqldump.
We have to copy very large InnoDB tables and Trite could help to da that with ease.

Also, we don’t simply want to restore a backup, we want to retrieve a ibdata file with a reasonable size.
So, we have to copy each table files from the backup to a new clean MySQL instance.

Pre-requisites

You need a spare server to restore the backuped files.
The source server should be master ready (binlog, server_id…)
Note that you can use a single server but the load generated by the process could be a problem on a production server.

Trite is written in Go, so, you need Go and a Go MySQL driver to install Trite on your server.
The Trite installation process is clearly detailed on the GitHub page of the project.

All tables must be InnoDB with the innodb_file_per_table configuration activated.

Xtrabackup is needed to perform the physical database backup.
Note that you must have Percona Server (5.1 to 5.6), Oracle MySQL 5.6 or MariaDB (5.5 to 10) for the target MySQL instance.

Overview of the procedure

We’ll copy the InnoDB tables from the source server (aka S) with the larger ibdata file to the target server (aka T).
I consider that MySQL is installed and configured on both servers.

Here is an overview of the procedure:

  • Install Trite on both servers (not detailed here)
  • [S] Make a backup of the source database with XtraBackup
  • [S] Apply logs with the --export option
  • [S] Make a dump of your database objects with Trite
  • [S] Start a Trite server pointing on the previous backup and dump directories
  • [T] Use the Trite client to restore all the tables in a brand new MySQL instance
  • [T] Configure the spare server as a slave and wait for the replication lag (not detailed here)
  • Switch your application from [S] to [T] or copy the fresh database on the source server (not detailed here)

Detailled procedure

Backup the source database with XtraBackup

On the source server, you have to make a full physical backup of your databases.
The backup process is not detailed here, please, refer to the documentation for the details on the installation and use.

Output example:

# innobackupex --user=root --password=pass /backup
... output truncated ...
innobackupex: Backup created in directory '/backup/2014-01-25_14-22-06'
140125 14:30:29 innobackupex: Connection to database server closed
140125 14:30:29 innobackupex: completed OK!

Backup files are generated in the /backup/2014-01-25_14-22-06 directory.

Apply logs on the backuped files

To allow transport tablespaces, you have to use the export option as follow:

# innobackupex --apply-log --use-memory=1G --export /backup/2014-01-25_14-22-06
... output truncated ...
140125 14:33:03 InnoDB: Starting shutdown...
140125 14:33:07 InnoDB: Shutdown completed; log sequence number 70808870412
140125 14:33:07 innobackupex: completed OK!

Make a dump of the database objects with Trite

On the source server, use Trite to make a dump of the database objects.
This step will create a directory that contains all the scripts to create databases and objects (tables, views, procedures…)

# trite -dump -user=root -password=pass -host=localhost
Dumping to: /backup/localhost_dump20140125173337
DB_INNO_FPT_1: 21 tables, 0 procedures, 0 functions, 0 triggers, 0 views
DB_INNO_FPT_2: 42 tables, 0 procedures, 0 functions, 0 triggers, 3 views
63 total objects dumped
Total runtime = 2.19331264s

This scripts will be used by the Trite client to create databases and tables on the target server.

Start the Trite server

On the source server, start the Trite server that allow the copy of the data files.
It could be better to start the server in a tmux or a screen.

You have to specified where are the backup and dump files:

# trite -server -dump_path=/backup/localhost_dump20140125173337 -backup_path=/backup/2014-01-25_14-22-06
Starting server listening on port 12000

Restore the databases and tables

On the target server, MySQL is running with a fresh instance.
The ibdata file is just fine for this instance (size depending of your setup).

The Trite client works as expected:

  • Create the databases and objects (tables, views…)
  • Copy the data files from the source server to the target server
  • Import the tablespaces for each table on the target MySQL instance
# trite -client -user=root -password=pass -socket=/var/lib/mysql/mysql.sock -server_host=localhost
DB_INNO_FPT_1.T1 has been restored
DB_INNO_FPT_1.T2 has been restored
DB_INNO_FPT_1.T3 has been restored
... output truncated ...
Applying triggers for DB_INNO_FPT_1
Applying views for DB_INNO_FPT_1
Applying procedures for DB_INNO_FPT_1
Applying functions for DB_INNO_FPT_1
Total runtime = 13m3.881954844s

All the databases and tables are now available in the target instance.
Note that you can run the client with multiple worker threads.

Attach the target server as a slave

XtraBackup created a file named xtrabackup_binlog_info on the backup directory on the source server.
This file contains the binary log file name and position of the exact point in the binary log to which the prepared backup corresponds.

Use the information stored in this file to setup the replication between the servers.
The process to configure a replication slave is not detailed here. You should find that easily through the World Wide Web.

Final steps

Just wait to have a replication slave up to date and chose your favorite switching method:

  • Switch your application from [S] to [T]
  • Restore [T] datafiles on [S]
  • Your way…

And don’t forget to add a maximum size for the ibdata file in your MySQL configuration file:

innodb_data_file_path = ibdata1:100M:autoextend:max:10G

What can you expect?

The performance of this process depends of the size of your tables, how fast your disks are and how fast your network is.
Now, be sure this process will be very efficient for huge datasets.

Again, Trite could be useful for many other use cases, all feedbacks are welcome.

Finally, I would like to thanks Joshua for the tool and the talks about how to improve Trite.

Source: http://joshuaprunier.blogspot.fr/2014/02/introducing-trite-tool-for-automating.html

How to exclude a database from your dump with ZRM (MySQL Community help needed)

Last month, Ronald Bradford, Giuseppe Maxia and Mark Leith spoke about how to simulate a mysqldump –ignore-database.
This mysqldump option doesn’t exist and these three guys have given us various helpful solutions.

But for those of us who use ZRM community to make backup with mysqldump, the –exclude-pattern seems to do the job :

--exclude-pattern "pattern" 
All databases or tables that match the pattern are not backed up. 
If --all-databases or --databases are specified, the exclude pattern applies
to database names. If --database is specified, the exclude pattern applies
to table names. Wildcard characters * (match one or more characters),
? (match one character), [] (match one of the characters specified
within parenthesis) and | (match one of the patterns) are supported.
For example: Exclude pattern "*_db|dbase[123]|abcd" will match names
tom_db, dbase1, dbase2, abcd. All tables or databases that match this pattern
will be excluded. Character # is not permitted in the exclude pattern.

But this option seems to not work properly and I remember that I’ve used a hack to force ZRM to exclude a database from the database list since I used this tools.
I’m not talented enough in perl to really hack the code of this tool, so, I made it simple.

I made ​​a simple change to this file : /usr/bin/mysql-zrm-backup
By adding this new line after the row number 1597 :  $params = “–databases @pdbs”

To obtain that :

} else {
 $params = "--all-databases ";
 @pdbs = &enumAllDatabases();
 if( defined $inputs{"exclude-pattern"} ){
 @pdbs = &filterPattern( @pdbs );
 my $l = @pdbs;
 if( $l == 0 ){
 &printAndDie( "Nothing to backup after exclude-pattern is applied\n" );
 }
 $params = "--databases @pdbs"
 }
 if( $verbose ){
 &printLog( "backup of the following databases will be done @pdbs\n" );
 }

I can now exclude a database from my dump (but not a table).

But I’m sure a talented person could try to review the code to improve it in a right way.
If you are this person, you know what needs to be done…

If you are a ZRM Community edition practicer, share your thoughts about that.

My slides of MySQL Meetup Viadeo / LeMUG Paris

I was glad to present how to schedule and monitor mysqldump with ZRM community last week in Paris as part of the MySQL Meetup Viadeo / LeMUG

You can find my slides below, enjoy ! :

Thanks to Olivier and all the viadeo team in Paris for this event.
And, of course, thanks to all attendees.
We need more events like that in France !