How to shrink the ibdata file by transporting tables with Trite

You’ve probably had some troubles with the shared InnoDB tablespace stored in the ibdata file. Especially when it has grown for some reasons and reached a critical size.

This behavior occurs in some cases, due to excessive rollback segments growth or during a migration from a unique shared tablespace to a file-per-table configuration for example.

In this post, I would like to explain how to shrink the ibdata file after an unwanted file growth in a file-per-table configuration.
Note that the process could be done without Trite but the tool avoids to write the script used to transport tables yourself.

Initial situation

Here is a sample of the InnoDB configuration:

innodb_data_file_path = ibdata1:100M:autoextend
innodb_file_per_table

And the status of your datafiles in the datadir directory:

drwx------ 2 mysql mysql 4,0K déc. 20 2012 performance_schema
drwxrwx--- 2 mysql mysql 4,0K déc. 20 2012 mysql
drwxrwx--- 2 mysql mysql 4,0K juin 10 23:58 DB_INNO_FPT_1
drwxrwx--- 2 mysql mysql 4,0K sept. 8 15:53 DB_INNO_FPT_2
-rwxrwx--- 1 mysql mysql 317G sept. 8 23:37 ibdata1

The ibdata file size is 317Go and you want to recover this lost space.

What is Trite?

Trite is a client/server tool that provides customizable transport of binary InnoDB table files.
The tool was developed by Joshua Prunier and is available on GitHub.

The tool allows to connect a client to your database backup (XtraBackup) and stream the files to another database.
Again, you can do that without Trite, but this tool automate the manual process.

Also, this procedure is much quicker than traditional mysqldump restores when the tables size becomes very large.

Finally, Trite can be useful in many use cases, it deserves your attention.

Why Trite?

Because in this case the dataset is too big to use a traditional export/import with mysqldump.
We have to copy very large InnoDB tables and Trite could help to da that with ease.

Also, we don’t simply want to restore a backup, we want to retrieve a ibdata file with a reasonable size.
So, we have to copy each table files from the backup to a new clean MySQL instance.

Pre-requisites

You need a spare server to restore the backuped files.
The source server should be master ready (binlog, server_id…)
Note that you can use a single server but the load generated by the process could be a problem on a production server.

Trite is written in Go, so, you need Go and a Go MySQL driver to install Trite on your server.
The Trite installation process is clearly detailed on the GitHub page of the project.

All tables must be InnoDB with the innodb_file_per_table configuration activated.

Xtrabackup is needed to perform the physical database backup.
Note that you must have Percona Server (5.1 to 5.6), Oracle MySQL 5.6 or MariaDB (5.5 to 10) for the target MySQL instance.

Overview of the procedure

We’ll copy the InnoDB tables from the source server (aka S) with the larger ibdata file to the target server (aka T).
I consider that MySQL is installed and configured on both servers.

Here is an overview of the procedure:

  • Install Trite on both servers (not detailed here)
  • [S] Make a backup of the source database with XtraBackup
  • [S] Apply logs with the --export option
  • [S] Make a dump of your database objects with Trite
  • [S] Start a Trite server pointing on the previous backup and dump directories
  • [T] Use the Trite client to restore all the tables in a brand new MySQL instance
  • [T] Configure the spare server as a slave and wait for the replication lag (not detailed here)
  • Switch your application from [S] to [T] or copy the fresh database on the source server (not detailed here)

Detailled procedure

Backup the source database with XtraBackup

On the source server, you have to make a full physical backup of your databases.
The backup process is not detailed here, please, refer to the documentation for the details on the installation and use.

Output example:

# innobackupex --user=root --password=pass /backup
... output truncated ...
innobackupex: Backup created in directory '/backup/2014-01-25_14-22-06'
140125 14:30:29 innobackupex: Connection to database server closed
140125 14:30:29 innobackupex: completed OK!

Backup files are generated in the /backup/2014-01-25_14-22-06 directory.

Apply logs on the backuped files

To allow transport tablespaces, you have to use the export option as follow:

# innobackupex --apply-log --use-memory=1G --export /backup/2014-01-25_14-22-06
... output truncated ...
140125 14:33:03 InnoDB: Starting shutdown...
140125 14:33:07 InnoDB: Shutdown completed; log sequence number 70808870412
140125 14:33:07 innobackupex: completed OK!

Make a dump of the database objects with Trite

On the source server, use Trite to make a dump of the database objects.
This step will create a directory that contains all the scripts to create databases and objects (tables, views, procedures…)

# trite -dump -user=root -password=pass -host=localhost
Dumping to: /backup/localhost_dump20140125173337
DB_INNO_FPT_1: 21 tables, 0 procedures, 0 functions, 0 triggers, 0 views
DB_INNO_FPT_2: 42 tables, 0 procedures, 0 functions, 0 triggers, 3 views
63 total objects dumped
Total runtime = 2.19331264s

This scripts will be used by the Trite client to create databases and tables on the target server.

Start the Trite server

On the source server, start the Trite server that allow the copy of the data files.
It could be better to start the server in a tmux or a screen.

You have to specified where are the backup and dump files:

# trite -server -dump_path=/backup/localhost_dump20140125173337 -backup_path=/backup/2014-01-25_14-22-06
Starting server listening on port 12000

Restore the databases and tables

On the target server, MySQL is running with a fresh instance.
The ibdata file is just fine for this instance (size depending of your setup).

The Trite client works as expected:

  • Create the databases and objects (tables, views…)
  • Copy the data files from the source server to the target server
  • Import the tablespaces for each table on the target MySQL instance
# trite -client -user=root -password=pass -socket=/var/lib/mysql/mysql.sock -server_host=localhost
DB_INNO_FPT_1.T1 has been restored
DB_INNO_FPT_1.T2 has been restored
DB_INNO_FPT_1.T3 has been restored
... output truncated ...
Applying triggers for DB_INNO_FPT_1
Applying views for DB_INNO_FPT_1
Applying procedures for DB_INNO_FPT_1
Applying functions for DB_INNO_FPT_1
Total runtime = 13m3.881954844s

All the databases and tables are now available in the target instance.
Note that you can run the client with multiple worker threads.

Attach the target server as a slave

XtraBackup created a file named xtrabackup_binlog_info on the backup directory on the source server.
This file contains the binary log file name and position of the exact point in the binary log to which the prepared backup corresponds.

Use the information stored in this file to setup the replication between the servers.
The process to configure a replication slave is not detailed here. You should find that easily through the World Wide Web.

Final steps

Just wait to have a replication slave up to date and chose your favorite switching method:

  • Switch your application from [S] to [T]
  • Restore [T] datafiles on [S]
  • Your way…

And don’t forget to add a maximum size for the ibdata file in your MySQL configuration file:

innodb_data_file_path = ibdata1:100M:autoextend:max:10G

What can you expect?

The performance of this process depends of the size of your tables, how fast your disks are and how fast your network is.
Now, be sure this process will be very efficient for huge datasets.

Again, Trite could be useful for many other use cases, all feedbacks are welcome.

Finally, I would like to thanks Joshua for the tool and the talks about how to improve Trite.

Source: http://joshuaprunier.blogspot.fr/2014/02/introducing-trite-tool-for-automating.html

An incomplete list of what your developers would like to know before migrating to MySQL 5.5

A few years ago, I asked to check with me in the long (very long) change history of MySQL 5.5 documentation what are the changes in relation to the SQL syntax.
Chris Calender helped me to retrieve a list of the main changes, thanks again Chris.

Today, I would like to share this list with you.
It is simply a curated transcript of what you might find in the documentation but I’m sure it can help some of you.
 

INTO clause in nested SELECT statements

 
Previously, the parser accepted an INTO clause in nested SELECT statements, which is invalid because such statements must return their results to the outer context. As of MySQL 5.5.3, this syntax is no longer permitted and statements that use it must be changed.
 

Table aliases in DELETE statements

 
In MySQL 5.5.3, several changes were made to alias resolution in multiple-table DELETE statements so that it is no longer possible to have inconsistent or ambiguous table aliases.

In MySQL 5.1.23, alias declarations outside the table_references part of the statement were disallowed for the USING variant of multiple-table DELETE syntax, to reduce the possibility of ambiguous aliases that could lead to ambiguous statements that have unexpected results such as deleting rows from the wrong table.

As of MySQL 5.5.3, alias declarations outside table_references are disallowed for all multiple-table DELETE statements. Alias declarations are permitted only in the table_references part.

Incorrect:

  • DELETE FROM t1 AS a2 USING t1 AS a1 INNER JOIN t2 AS a2;
  • DELETE t1 AS a2 FROM t1 AS a1 INNER JOIN t2 AS a2;

Correct:

  • DELETE FROM t1 USING t1 AS a1 INNER JOIN t2 AS a2;
  • DELETE t1 FROM t1 AS a1 INNER JOIN t2 AS a2;

Previously, for alias references in the list of tables from which to delete rows in a multiple-table delete, the default database is used unless one is specified explicitly. For example, if the default database is db1, the following statement does not work because the unqualified alias reference a2 is interpreted as having a database of db1:

  • DELETE a1, a2 FROM db1.t1 AS a1 INNER JOIN db2.t2 AS a2 WHERE a1.id=a2.id;

To correctly match an alias that refers to a table outside the default database, you must explicitly qualify the reference with the name of the proper database:

  • DELETE a1, db2.a2 FROM db1.t1 AS a1 INNER JOIN db2.t2 AS a2 WHERE a1.id=a2.id;

As of MySQL 5.5.3, alias resolution does not require qualification and alias references should not be qualified with the database name. Qualified names are interpreted as referring to tables, not aliases.

Statements containing alias constructs that are no longer permitted must be rewritten.
 

New reserved words

 
There are several new reserved words in 5.5 what were not reserved in 5.1.
So if you are using any of these as table names, etc., then you’ll need to surround them with backticks (`)

  •  GENERAL
  • IGNORE_SERVER_IDS
  • MASTER_HEARTBEAT_PERIOD
  • MAXVALUE
  • RESIGNAL
  • SIGNAL
  • SLOW

 

FLUSH TABLES and LOCK TABLES

 
As of MySQL 5.5.3, due to work done for Bug #989, FLUSH TABLES is not permitted when there is an active LOCK TABLES … READ.
To provide a workaround for this restriction, FLUSH TABLES has a new variant, FLUSH TABLES tbl_list WITH READ LOCK, that enables tables to be flushed and locked in a single operation.
As a result of this change, applications that previously used this statement sequence to lock and flush tables will fail:

  • LOCK TABLES tbl_list READ;
  • FLUSH TABLES tbl_list;

Such applications should now use this statement instead:

  •  FLUSH TABLES tbl_list WITH READ LOCK;

 

Fast truncation

 
As of MySQL 5.5.7, InnoDB always uses the fast truncation technique, equivalent to DROP TABLE and CREATE TABLE.
It no longer performs a row-by-row delete for tables with parent-child foreign key relationships.
TRUNCATE TABLE returns an error for such tables.
Modify your SQL to issue DELETE FROM table_name for such tables instead.
 

TIMESTAMP display width

 
In very old versions of MySQL (prior to 4.1), the TIMESTAMP data type supported a display width, which was silently ignored beginning with MySQL 4.1.
This is deprecated in MySQL 5.1, and removed altogether in MySQL 5.5.

The display width is no longer supporter for TIMESTAMP data type in MySQL 5.5
 

Aliases and wildcards

 
Aliases for wildcards (as in SELECT t.* AS 'alias' FROM t) are no longer accepted and result in an error.
Previously, such aliases were ignored silently. (Bug #27249)
 

Scientific notation

 
In addition to these official parts of the MySQL release notes, Sheeri write a post about floats, doubles ans scientific notation between MySQL 5.1 and MySQL 5.5 :

http://blog.mozilla.org/it/2013/01/17/mysql-5-1-vs-mysql-5-5-floats-doubles-and-scientific-notation/

 

This list is incomplete because I’m sure you have your own tips to share.

Source : http://dev.mysql.com/doc/relnotes/mysql/5.5/en/

How I use twitter to follow the MySQL Community

Here is a perfect post to read during your holidays :-)
There are many ways to get news on the internet : blogs, rss feeds, facebook, linkedin, twitter…
What I want to talk about here is how I use twitter to follow the MySQL Community and how I stay up to date of the latest news.
I use twitter only for focus on the MySQL news and the MySQL community, that’s why I would like to share this experience with you.

Step 1 : Choose your friends

Q: What is the most complicated with Twitter ?
A: Read all the tweets that we receive every day

Of course it can be complicated and that takes a long long time…
That’s why you must choose who is your best friends carefully.

Except for the MySQL rock stars, I advise you to use TweetStats before to blindly follow anyone who tweeted a dark grigri about MySQL.
It’s very simple, just enter the username of the woman, man, firm or product you want to follow on the home page of Tweestats and let the magic gone.
This tool let you know exactly how many tweets a user published by month (or day).

I recommend to follow only the people who don’t publish more than 100 tweets per month (or 4 tweets per day).
Yes, only 4 tweets per day, if you have 100 friends, this means you have to read 400 tweets per day. Are you really able to read all of them ?!
And don’t forget that behind the Tweet there is often a link…

Step 2 :  Lists, Search and tools

Lists are one of the most interesting Twitter feature and I’m sure many of you don’t use them.
Lists are useful in many cases, take a look at this two examples  :

  • You use Twitter to track different types of information (Family and Friends, Sports, MySQL, Geek news…)
  • The people you follow are cosmopolitan and speak different languages
So, how to stay focused on the MySQL news across this jungle ?
I use TweetDeck to read through the lists, one list for the UK/US twittos, another one for the French twittos and a special column for my favorites.

 

 

Another really awesome feature of TweetDeck is the research in live streaming, perfect for tracking a live events.
For example, I’m currently using 4 search columns : “MySQL OR Percona OR MariaDB“, “#perconalive“, “MySQL AND slides”  and “#OSCON

 

Step 3 : The best way to tweet

 

If you want that your tweets are broadcast in a perfect fluidity, you need to use a Buffer.
This tool helps you to schedule your tweets, on twitter and linkedIn (very useful now that twitter no longer speak to linkedIn)

 

Q: Why I really need to use Bufferapp ?
A: Because your followers don’t want to receive all your tweets in the same time ;-)

 

And this tool will also let you know if your tweets are read, this is why you write them, isn’t it ?

 

Step 4 : One more thing (or two)

 

I like to backup my own tweets because these tweets are often some links to articles that I like.
My other vice is to receive my favorite tweets via email.
To do that, I put the internet to work for me with ifttt.

These two rules send my own tweets in my Evernote and send me an email for each favorite tweet, automatically !
Really awesome when I wish to find an old tweet in my Evernote or when I press the “favorite” button in TweetBot.

 

If you want to follow me, I always tweet about MySQL !




Fatal timeout !

There are several parameters to set a timeout on MySQL :

But I would like to focus on wait_timeout (or interactive_timeout depending on how you connect)

This timeout allows MySQL to automatically close a connection in case of non-activity during the time defined by this parameter (default value is 28000 seconds).

The problem I wish to explain here may happen when this timeout is set to a low value (about 10 to 30 seconds).

Indeed, in this case, this timeout may have serious consequences, look at that :
[The wait_timeout parameter is set to 10 seconds in this case]

You are connected through the standard MySQL client and you need to run a delete query (bypass autocommit) :

  • mysql> start transaction;
  • [ You wait more than 10s because a friend calls your attention about the latest news on iphone 5 ! ]
  • mysql> delete from most_valuable_table;

Your problem starts here because you forgot the where clause !
After the delete command, MySQL said :

ERROR 2006 (HY000): MySQL server has gone awayNo connection. Trying to reconnect... Connection id:    98 Current database: masterQuery OK, 33698541 rows affected (0.01 sec)

This simply means that the connection was lost and the last order re-submitted.
Then, you try to cancel your incomplete “delete” with a rollback command (your heart beats faster now…) :

  • mysql> rollback;
  • mysql> select * from most_valuable_table;

This select returns 0 rows, cruel !

It isn’t a mistake, the “start transaction” order was lost when MySQL has gone away.
The rollback is no longer possible. Bye bye to your 33.000.000 of rows !

It’s time to go to vacation…