Few words about pt-archiver

I really like the percona toolkit, we all love the percona toolkit.
I know how it’s difficult to write operational and efficient scripts (I try to do that myself everyday)
And it is even more difficult to share a script, to take the responsibility to share its own code.
From there, understand that this article is simply a review of my own thoughts about pt-archiver (with the invaluable assistance of @maximefouilleul), I don’t want to question the quality or usefulness of this tool.

I tried pt-archiver for the first time this week, and the first thing I do before using a tool is read the documentation (yes, I really like to read documentations)

I was intrigued by some options of this tool, first, I can read “It deletes data from the source by default“.
Personally, I hate that you want to remove my data by default, I prefer to have a –delete option instead of a –no-delete option (even if there is a –dry-run option).
The difference is not trivial for me, I love my data…

–why-quit ??? I don’t understand the real interest of this option, if the tool stops without doing its job, why hide this information by default ?

Also, it seems that –txn-size–progress and –limit options must have the same values for the tool to work properly. But I have to open a bug report for that.

It is only few words, no bitterness, just a feedback.

How to quickly identify queries with pt-query-digest and pt-query-advisor from rules ?

Today I’m working on integrate the Percona toolkit (instead of maatkit) in my own tools and I’m playing with pt-query-digest and pt-query-advisor.
These tools can be very interesting to identify some queries from established rules.
The  –review option is available for two of them and helps me to store a sample of each class of query and match them with an advice.

The rules (or advices) are available in the pt-query-advisor documentation and let you identify various problems such as queries with an argument with leading wildcard or with a table joined twice, for example.
There are 3 types of rules : note, warning and  critical.

The goal here is to quickly identify the queries corresponding to a rule, from the slow query log.
So, let me explain what was my method to do that, and, please, give me yours.

First, I need to create a table to store the queries (you can also use –create-review-table option) :

CREATE TABLE query_review (
   checksum     BIGINT UNSIGNED NOT NULL PRIMARY KEY,
   fingerprint  TEXT NOT NULL,
   sample       TEXT NOT NULL,
   first_seen   DATETIME,
   last_seen    DATETIME,
   reviewed_by  VARCHAR(20),
   reviewed_on  DATETIME,
   comments     TEXT
)

It’s interesting to note that the 3 last columns are used to annotate the queries. A simple PHP interface or script can be used to fill these fields (sorry, not enough time to write this script now).

Second easy step, put the queries from the slow query log to the review table with pt-query-digest :
(Note that my review table is in the mysql database and my slow query log is /var/log/slowq.log)

pt-query-digest /var/log/slowq.log --review <DSN OPTIONS>
DSN OPTIONS : u=<user>,p=<pass>,D=mysql,t=query_review,S=/tmp/mysql.sock

And obtain this result in database :
(fingerprint, sample and number of queries truncated for readability)

 
Now, generate an analyze file with pt-query-advisor from the review table :

pt-query-advisor --review <DSN OPTIONS> >/tmp/analyze_queries.log
DSN OPTIONS : u=<user>,p=<pass>,D=mysql,t=query_review,S=/tmp/mysql.sock

In this file, you can find a list of rules with associated queries (but you don’t need to edit the file, go to next step) :

CLA.007 0x4D97479E1774609A 0x6040F6055F330176 0x6BF38A80507513C8 0x7A4ACE71E642E539 0xCE21735A6006AB3D 0xEE13BCBC394FCD83

Finaly, to retrieve all the queries for a particular rule, use the command below (on 1 line) :

cat /tmp/analyze_queries.log | grep <RULE ID>
| awk '{for ( i = 2 ; i <= NF ; i++ ) print $i}'
| xargs -i mysql -u<user> -p<password> -E
-e"SELECT * FROM mysql.query_review WHERE CHECKSUM={}"

Replace the RULE ID, user and password with your informations.
The rules ID can be find in the pt-query-advisor documentation.

This is where the real work begins !

Hope that can help.

Vote for MySQL[plus] awards 2011 !

First of all, I wish you a happy new year.
Many things happened last year, it was really exciting to be involved in the MySQL ecosystem.
I hope this enthusiasm will be increased this year, up to you !

To start the year, I propose the MySQL[plus] Awards 2011
It will only take 5 minutes to fill out these polls.
Answer with your heart first and then with your experience with some of these tools or services.

Polls will be closed January 31, so, vote now !
For “other” answers, please,  let me a comment with details.

Don’t hesitate to submit proposal for tools or services in the comments.
And, please, share these polls !

 

Best MySQL GUI client Tool

  • SQLyog (47%, 85 Votes)
  • MySQL Workbench (20%, 36 Votes)
  • HeidiSQL (8%, 14 Votes)
  • Navicat for MySQL (7%, 12 Votes)
  • MySQL Administrator / Query browser (6%, 10 Votes)
  • Other (5%, 9 Votes)
  • phpMyAdmin (4%, 8 Votes)
  • SQLWave (1%, 1 Votes)
  • SIDU DB Web GUI (1%, 1 Votes)
  • EMS SQL Manager for MySQL (1%, 1 Votes)
  • Sequel Pro (1%, 1 Votes)
  • sqldeveloper (1%, 1 Votes)
  • dbForge Studio (1%, 1 Votes)
  • MyQuery (1%, 1 Votes)
  • DBTools Manager (0%, 0 Votes)
  • SQL Maestro MySQL Tools (0%, 0 Votes)
  • Toad for MySQL (0%, 0 Votes)

Total Voters: 181

Loading ... Loading ...

 

Best MySQL Monitoring Tool

  • MONyog (35%, 135 Votes)
  • MySQL Enterprise Monitor (29%, 113 Votes)
  • Nagios (13%, 52 Votes)
  • Cacti (10%, 38 Votes)
  • Other (6%, 25 Votes)
  • Zabbix (5%, 19 Votes)
  • Ganglia (2%, 4 Votes)

Total Voters: 386

Loading ... Loading ...

 

Best MySQL Backup Tool (two possible answers)

  • Xtrabackup (46%, 151 Votes)
  • mysqldump (29%, 94 Votes)
  • MySQL Enterprise Backup (28%, 91 Votes)
  • mydumper (9%, 30 Votes)
  • LVM Backup (FS snapshot) (6%, 19 Votes)
  • mylvmbackup (5%, 17 Votes)
  • ZRM Enterprise (5%, 15 Votes)
  • ZRM Community (5%, 15 Votes)
  • Other (5%, 15 Votes)
  • Cold backup (2%, 6 Votes)
  • r1soft CDP for MySQL (1%, 2 Votes)

Total Voters: 329

Loading ... Loading ...

 

Best MySQL Replication/Cluster Tools (two possible answers)

  • MySQL Replication (37%, 183 Votes)
  • Tungsten Replicator (21%, 105 Votes)
  • MySQL NDB Cluster (17%, 83 Votes)
  • ShoonerSQL (16%, 76 Votes)
  • Galera Replication (13%, 66 Votes)
  • DRBD (9%, 45 Votes)
  • Other (2%, 11 Votes)
  • ScaleDB (2%, 9 Votes)

Total Voters: 490

Loading ... Loading ...

 

Best MySQL community Tool

  • Percona toolkit (37%, 117 Votes)
  • MySQL Sandbox (28%, 89 Votes)
  • MHA for MySQL (mysql-master-ha) (14%, 45 Votes)
  • Other (6%, 18 Votes)
  • Xtrabackup Manager (5%, 15 Votes)
  • mysqltuner (3%, 11 Votes)
  • Innotop (3%, 10 Votes)
  • mycheckpoint (2%, 5 Votes)
  • openark kit (1%, 3 Votes)
  • MMM (1%, 3 Votes)
  • Common schema (0%, 2 Votes)

Total Voters: 318

Loading ... Loading ...

 

Best MySQL Support Service

  • SkySQL (38%, 130 Votes)
  • Percona (28%, 96 Votes)
  • Oracle (26%, 90 Votes)
  • Other (4%, 15 Votes)
  • FromDual (3%, 9 Votes)
  • Pythian (1%, 4 Votes)
  • Blue Gecko (0%, 1 Votes)
  • PalominoDB (0%, 0 Votes)

Total Voters: 345

Loading ... Loading ...

 

Best MySQL based distribution

  • Oracle MySQL (36%, 122 Votes)
  • MariaDB (33%, 114 Votes)
  • Percona server (27%, 91 Votes)
  • Drizzle (3%, 11 Votes)
  • Other (1%, 4 Votes)
  • TokuDB (0%, 0 Votes)
  • InfiniDB (0%, 0 Votes)

Total Voters: 342

Loading ... Loading ...

 

Best NoSQL solution (two possible answers)

  • Memcached (46%, 120 Votes)
  • Hadoop / HBase (28%, 72 Votes)
  • MongoDB (25%, 65 Votes)
  • Cassandra (10%, 27 Votes)
  • Other (10%, 27 Votes)
  • HandlerSocket (9%, 24 Votes)
  • CouchDB (6%, 16 Votes)
  • Redis (5%, 13 Votes)
  • Tarantool (4%, 11 Votes)
  • Riak (3%, 7 Votes)
  • Amazon SimpleDB (2%, 5 Votes)
  • Tokyo Cabinet / Tyrant (1%, 3 Votes)

Total Voters: 259

Loading ... Loading ...

 

Best MySQL 5.6 improvement

  • Performances (27%, 77 Votes)
  • Replication (26%, 74 Votes)
  • Fulltext (16%, 47 Votes)
  • memcached API (13%, 36 Votes)
  • Backup/Restore buffer pool (8%, 22 Votes)
  • Partitioning (6%, 16 Votes)
  • Other (4%, 11 Votes)

Total Voters: 285

Loading ... Loading ...

 

Happy 2012 !
Cédric

This article is obviously not sponsored !
(MySQL is a trademark of Oracle Corporation and/or its affiliates)