How to shrink the ibdata file by transporting tables with Trite

You’ve probably had some trouble with the shared InnoDB tablespace stored in the ibdata file, especially once it has grown for some reason and reached a critical size.

This can happen in several cases: excessive rollback segment growth, or a migration from a single shared tablespace to a file-per-table configuration, for example.

In this post, I would like to explain how to shrink the ibdata file after unwanted growth in a file-per-table configuration.
Note that the process could be done without Trite, but the tool saves you from writing the table transport scripts yourself.

Initial situation

Here is a sample of the InnoDB configuration:

innodb_data_file_path = ibdata1:100M:autoextend
innodb_file_per_table

And the status of your datafiles in the datadir directory:

drwx------ 2 mysql mysql 4.0K Dec. 20 2012 performance_schema
drwxrwx--- 2 mysql mysql 4.0K Dec. 20 2012 mysql
drwxrwx--- 2 mysql mysql 4.0K Jun. 10 23:58 DB_INNO_FPT_1
drwxrwx--- 2 mysql mysql 4.0K Sep. 8 15:53 DB_INNO_FPT_2
-rwxrwx--- 1 mysql mysql 317G Sep. 8 23:37 ibdata1

The ibdata file size is 317GB and you want to recover this lost space.

What is Trite?

Trite is a client/server tool that provides customizable transport of binary InnoDB table files.
The tool was developed by Joshua Prunier and is available on GitHub.

The tool lets a client connect to your database backup (made with XtraBackup) and stream the files to another database.
Again, you can do that without Trite, but this tool automates the manual process.

Also, this procedure is much quicker than a traditional mysqldump restore when table sizes become very large.

Finally, Trite can be useful in many use cases; it deserves your attention.

Why Trite?

Because in this case the dataset is too big for a traditional export/import with mysqldump.
We have to copy very large InnoDB tables, and Trite can help us do that with ease.

Also, we don’t simply want to restore a backup, we want to get back an ibdata file with a reasonable size.
So, we have to copy each table’s files from the backup to a new, clean MySQL instance.

Pre-requisites

You need a spare server to restore the backed-up files.
The source server should be master ready (binlog, server_id…).
Note that you can use a single server, but the load generated by the process could be a problem on a production server.
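
For example, a minimal master-ready configuration on [S] could look like this (a sketch; server_id and the binlog base name are placeholder values):

[mysqld]
server_id = 1
log_bin   = mysql-bin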

Trite is written in Go, so you need Go and a Go MySQL driver to install Trite on your server.
The Trite installation process is clearly detailed on the GitHub page of the project.

All tables must be InnoDB and the innodb_file_per_table option must be enabled.
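You can quickly check for non-InnoDB tables with a query like this one (system schemas excluded):

SELECT table_schema, table_name, engine
FROM information_schema.TABLES
WHERE engine <> 'InnoDB'
AND table_schema NOT IN ('mysql', 'information_schema', 'performance_schema');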

XtraBackup is needed to perform the physical database backup.
Note that the target MySQL instance must be Percona Server (5.1 to 5.6), Oracle MySQL 5.6 or MariaDB (5.5 to 10).

Overview of the procedure

We’ll copy the InnoDB tables from the source server (aka [S]), which holds the oversized ibdata file, to the target server (aka [T]).
I assume that MySQL is installed and configured on both servers.

Here is an overview of the procedure:

  • Install Trite on both servers (not detailed here)
  • [S] Make a backup of the source database with XtraBackup
  • [S] Apply logs with the --export option
  • [S] Make a dump of your database objects with Trite
  • [S] Start a Trite server pointing on the previous backup and dump directories
  • [T] Use the Trite client to restore all the tables in a brand new MySQL instance
  • [T] Configure the spare server as a slave and wait for replication to catch up (not detailed here)
  • Switch your application from [S] to [T] or copy the fresh data files back to the source server (not detailed here)

Detailed procedure

Backup the source database with XtraBackup

On the source server, you have to make a full physical backup of your databases.
The backup process is not detailed here; please refer to the XtraBackup documentation for details on installation and usage.

Output example:

# innobackupex --user=root --password=pass /backup
... output truncated ...
innobackupex: Backup created in directory '/backup/2014-01-25_14-22-06'
140125 14:30:29 innobackupex: Connection to database server closed
140125 14:30:29 innobackupex: completed OK!

Backup files are generated in the /backup/2014-01-25_14-22-06 directory.

Apply logs on the backuped files

To allow transporting the tablespaces, you have to use the --export option as follows:

# innobackupex --apply-log --use-memory=1G --export /backup/2014-01-25_14-22-06
... output truncated ...
140125 14:33:03 InnoDB: Starting shutdown...
140125 14:33:07 InnoDB: Shutdown completed; log sequence number 70808870412
140125 14:33:07 innobackupex: completed OK!

Make a dump of the database objects with Trite

On the source server, use Trite to make a dump of the database objects.
This step creates a directory that contains all the scripts needed to recreate the databases and objects (tables, views, procedures…).

# trite -dump -user=root -password=pass -host=localhost
Dumping to: /backup/localhost_dump20140125173337
DB_INNO_FPT_1: 21 tables, 0 procedures, 0 functions, 0 triggers, 0 views
DB_INNO_FPT_2: 42 tables, 0 procedures, 0 functions, 0 triggers, 3 views
63 total objects dumped
Total runtime = 2.19331264s

These scripts will be used by the Trite client to create the databases and tables on the target server.

Start the Trite server

On the source server, start the Trite server that will serve the data files for copy.
It is better to start it inside a tmux or screen session.

You have to specify where the backup and dump files are located:

# trite -server -dump_path=/backup/localhost_dump20140125173337 -backup_path=/backup/2014-01-25_14-22-06
Starting server listening on port 12000

Restore the databases and tables

On the target server, MySQL is running as a fresh instance.
The ibdata file of this instance is still small (its size depends on your settings).

The Trite client performs the following steps:

  • Create the databases and objects (tables, views…)
  • Copy the data files from the source server to the target server
  • Import the tablespaces for each table on the target MySQL instance
# trite -client -user=root -password=pass -socket=/var/lib/mysql/mysql.sock -server_host=localhost
DB_INNO_FPT_1.T1 has been restored
DB_INNO_FPT_1.T2 has been restored
DB_INNO_FPT_1.T3 has been restored
... output truncated ...
Applying triggers for DB_INNO_FPT_1
Applying views for DB_INNO_FPT_1
Applying procedures for DB_INNO_FPT_1
Applying functions for DB_INNO_FPT_1
Total runtime = 13m3.881954844s

All the databases and tables are now available in the target instance.
Note that you can run the client with multiple worker threads.
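
For reference, here is roughly what the Trite client automates for each table, following the XtraBackup transportable tablespace procedure (T1 is just an example table; on Percona Server 5.5 the import also requires innodb_import_table_from_xtrabackup=1):

-- On the target instance, create the empty table from the dump scripts,
-- then drop its empty tablespace:
ALTER TABLE DB_INNO_FPT_1.T1 DISCARD TABLESPACE;
-- Copy T1.ibd and T1.exp from the backup into the database directory,
-- fix the file ownership, then import the backup tablespace:
ALTER TABLE DB_INNO_FPT_1.T1 IMPORT TABLESPACE;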

Attach the target server as a slave

XtraBackup created a file named xtrabackup_binlog_info in the backup directory on the source server.
This file contains the binary log file name and position of the exact point in the binary log to which the prepared backup corresponds.

Use the information stored in this file to set up replication between the servers.
The process of configuring a replication slave is not detailed here; you should find that easily on the web.
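
For example, if xtrabackup_binlog_info shows mysql-bin.000042 and position 1234 (hypothetical values, and a repl user is assumed to exist on [S]), the setup on [T] would look like this:

CHANGE MASTER TO
  MASTER_HOST='source_host',
  MASTER_USER='repl',
  MASTER_PASSWORD='repl_password',
  MASTER_LOG_FILE='mysql-bin.000042',
  MASTER_LOG_POS=1234;
START SLAVE;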

Final steps

Just wait for the replication slave to be up to date and choose your favorite switching method:

  • Switch your application from [S] to [T]
  • Restore [T] datafiles on [S]
  • Your way…

And don’t forget to add a maximum size for the ibdata file in your MySQL configuration file:

innodb_data_file_path = ibdata1:100M:autoextend:max:10G

What can you expect?

The performance of this process depends on the size of your tables, how fast your disks are and how fast your network is.
But rest assured, this process is very efficient for huge datasets.

Again, Trite could be useful in many other use cases; all feedback is welcome.

Finally, I would like to thank Joshua for the tool and for the discussions about how to improve Trite.

Source: http://joshuaprunier.blogspot.fr/2014/02/introducing-trite-tool-for-automating.html

Time to forget show processlist for monitoring?

Disclaimer: I’m not especially a benchmarking expert; this post just compares different options. All comments and advice are welcome.

I’m not telling you anything new: the show processlist command is a fantastic command line tool for an instant check.
But what about monitoring your databases with this command embedded in a tool?

Just have a look at this graph:

[graph: ~5K queries per second on a production server]

With 5K queries per second, how much will be hidden from a show processlist executed every second? Probably a lot: a once-per-second snapshot only catches the queries running at that instant, so most short-lived queries never show up.
So, I wanted to test which alternatives could efficiently retrieve all the queries during a given time window.

Test procedure and configuration

I used sysbench 0.5 (with oltp.lua) to run my tests against different configurations and tools.
My test server is a 20-core (hyper-threaded) server with 128GB of RAM and a RAID 10 disk setup.
These tests are based on Percona Server 5.5.31.
I measured TPS from 1 to 64 concurrent threads.

Here is the standard benchmark graph (TPS):

[baseline TPS graph]

Slow query log set to 0 (file output)

[TPS graph]

Estimated size of the file: 52GB
We reach a stable peak of 1,000 transactions per second.
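
For reference, the settings behind this test were probably along these lines (a sketch, not the exact configuration used):

slow_query_log      = 1
long_query_time     = 0
log_output          = FILE
slow_query_log_file = /var/log/mysql/slow.log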

Slow query log set to 0 (table output)

[TPS graph]

Estimated size of the table: 8.9GB
Table contention is probably the cause of the stabilization around 600 transactions per second.
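
For this variant, only the log destination changes; the queries then go to the mysql.slow_log table:

log_output = TABLE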

General log in a file

[TPS graph]

Estimated size of the file: 11GB
Around 2,000 transactions per second can be expected here.
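
Again, a sketch of the corresponding settings (the table variant of the next test only differs by log_output = TABLE, which writes to mysql.general_log):

general_log      = 1
log_output       = FILE
general_log_file = /var/log/mysql/general.log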

General log in a table

[TPS graph]

Estimated size of the table: 9.9GB
Again, the results are lower than the previous file-based results.

Infinite show processlist in a file

[TPS graph]

Estimated size of the file: 1.1GB
The results don’t look so dramatic, but I’m not sure all the queries were captured in the file.
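
The exact command is not shown here; a simple way to reproduce it is a shell loop with no sleep between iterations, hence “infinite”:

while true; do
    mysql -u root -ppass -e 'SHOW FULL PROCESSLIST' >> /var/log/mysql/processlist.log
done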

MariaDB Audit Plugin in a file

[TPS graph]

Estimated size of the file: 14GB
plugin-load=server_audit=server_audit.so
server_audit_events             = QUERY
server_audit_file_path          = /var/log/mysql/audit.log
server_audit_logging            = ON
server_audit_output_type        = file

McAfee Audit Plugin (default configuration)

[TPS graph]

Estimated size of the file: 22GB

audit_offsets=6480, 6528, 4080, 4520, 104, 2584
audit_json_file=1
audit_json_socket_name=/tmp/audit.sock
audit_json_log_file=/var/log/mysql/mysql-audit.log
audit_record_cmds=select,insert,update,delete

McAfee Audit Plugin (via socket)

[TPS graph]

Estimated size of the file: 22GB

audit_offsets=6480, 6528, 4080, 4520, 104, 2584
audit_json_file=1
audit_json_socket=1
audit_json_socket_name=/tmp/audit.sock
audit_json_log_file=/var/log/mysql/mysql-audit.log
audit_record_cmds=select,insert,update,delete

McAfee Audit Plugin (via socket + file sync 1000)

[TPS graph]

Estimated size of the file: 22GB

audit_offsets=6480, 6528, 4080, 4520, 104, 2584
audit_json_file=1
audit_json_socket=1
audit_json_socket_name=/tmp/audit.sock
audit_json_log_file=/var/log/mysql/mysql-audit.log
audit_record_cmds=select,insert,update,delete
audit_json_file_sync=1000

Performance Schema activated

[TPS graph]

performance_schema
performance_schema_events_waits_history_long_size = 10000
performance_schema_events_waits_history_size = 1000

Conclusions

Auditing a system necessarily has an impact on that system.
The tools and options presented here each provide different kinds of information.
Also, you have to consider how you’ll process this data once collected.

The show processlist command doesn’t seem so dramatic for performance, but there is no guarantee that all the queries will be retrieved.
You probably have to choose another solution if you really need to collect reliable data.

I hope these quick tests help you choose the right tool for your needs.

Wake up European DBA, call for papers for Percona Live London 2014 is open!

The call for papers for Percona Live London 2014 is open. For the fourth consecutive year, PLUK is going to be one of the best community events in Europe.
I have the honour of being conference committee chairman, and the hard task of reviewing the talks with my colleagues on the committee.

First, let me introduce the committee members:

  • Art van Scheppingen (Spil Games)
  • Nicolai Plum (Booking.com)
  • Luis Motta Campos (Ebay Classifieds Group)
  • Colin Charles (MariaDB)
  • David Busby (Percona)
  • Morgan Tocker (Oracle)
  • Cédric PEINTRE (Dailymotion)

Amazing, isn’t it?! I think we couldn’t have a better committee for a community event.
I’m very glad to take part in the adventure with you guys!

And if you wonder what the committee does, please read this review from Shlomi.
You should know that the committee is fully independent. We are working hard to offer you the best conference, to be as fair as possible, to stay true to our beliefs, and to have fun.

Now, it’s time to submit your talks and tutorials. Like I said last year, if you are using databases, you might have something to say.
We are looking for the best use cases related to these topics, so surprise us!

  • High Availability
  • DevOps
  • Programming
  • Performance Optimization
  • Replication & Backup
  • MySQL in the Cloud
  • MySQL and NoSQL
  • MySQL Case Studies
  • Security
  • What’s New in MySQL

Tell us about your life, I’m sure you have a ton of stories to share.
We would love to see new faces, new companies and new topics this year again.
Don’t be shy, everyone is eager to hang on your every word.

Read the guidelines and submit your talk now!

Looking forward to meeting you in London in November.
Cheers

FOSDEM slides ok, but where are your slides from PLUK?

As usual, you can find the slides from PLUK 2013 and FOSDEM 2014 on [Plus].
You just have to visit the slides page and free up some time to discover and read all the referenced content.

Remember the programs of these fantastic events:

Now, I’ve just discovered that many slides are missing for PLUK 2013 on the perconalive website.

For latecomers, there is still time to upload your slides from your dedicated speaker page.
Please do it quickly; the attendees, the community, the committee and I will be grateful to enjoy your slides!

Cheers

How do you want to get [Plus] in 2014!

2013 is over, and we wish you a happy new year 2014!
Last year was amazing, and we are looking forward to 2014.

I’m honored to take part in two amazing events in 2014: FOSDEM and PLMCE.
Through my humble contribution as a committee member of these two events, I wanted to promote open source in general and MySQL in particular.
And the schedules of these events tell me that we won the bet!
(PLMCE schedule will be published next week)

MySQL keeps raising more and more interest in 2014 and is still at the heart of major architectures ;-)
We are proud to promote community events, community projects and open source initiatives for many years!

Also, for several months we have been working hard on glimpsee.net.
We hope to offer you a private beta release of our elegantly simple dashboard in the next few months.
Glimpsee has already been tested by privileged users since last summer.
And I heard they couldn’t live without our dynamic dashboard…

Hope to meet you somewhere on earth in 2014.

Cheers