A few words about pt-archiver

I really like the Percona Toolkit; we all love the Percona Toolkit.
I know how difficult it is to write efficient, production-ready scripts (I try to do that myself every day).
And it is even more difficult to share a script, to take the responsibility to share its own code.
So please understand that this article is simply a review of my own thoughts about pt-archiver (with the invaluable assistance of @maximefouilleul); I don’t want to question the quality or usefulness of this tool.

I tried pt-archiver for the first time this week, and the first thing I do before using a tool is read the documentation (yes, I really like reading documentation).

I was intrigued by some of this tool's options. First, I read: “It deletes data from the source by default”.
Personally, I hate that it removes my data by default; I would prefer a --delete option instead of a --no-delete option (even if there is a --dry-run option).
The difference is not trivial for me, I love my data…
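
To make this concrete, here is a minimal sketch of the kind of invocation I have in mind: copy rows to another server, keep them on the source, and only print what would be done. The hosts, database and table names (src-host, dst-host, mydb, events) are placeholders made up for illustration, and the script only prints the command so it can be reviewed before anything runs.

```shell
#!/bin/sh
# Hypothetical connection details: src-host, dst-host, mydb and events are placeholders.
SRC="h=src-host,D=mydb,t=events"
DST="h=dst-host,D=mydb,t=events"

# --no-delete keeps the archived rows on the source.
# --dry-run prints the queries and exits without doing anything.
CMD="pt-archiver --source $SRC --dest $DST --where 1=1 --no-delete --dry-run"

# Print the command for review instead of running it.
echo "$CMD"
```

Once the printed queries look right, removing --dry-run performs the copy; removing --no-delete as well restores the default behaviour of deleting from the source.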

--why-quit??? I don’t see the point of making this optional: if the tool stops without finishing its job, why hide this information by default?

Also, it seems that the --txn-size, --progress and --limit options must have the same value for the tool to work properly. But I have to open a bug report for that.
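
Until that is confirmed, one way to avoid the values drifting apart is to derive all three from a single shell variable. This is only a sketch based on my own observation, not on documented behaviour; the batch size of 1000 is an arbitrary assumption, and the script just prints the options so they can be appended to a pt-archiver command line.

```shell
#!/bin/sh
# Hypothetical batch size; pick a value that suits your workload.
BATCH=1000

# Reuse one value for --limit, --txn-size and --progress so they stay identical.
# --why-quit asks the tool to print why it stopped if it exits before the rows are exhausted.
ARGS="--limit $BATCH --txn-size $BATCH --progress $BATCH --why-quit"

# Print the options for review.
echo "$ARGS"
```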

These are only a few words, no bitterness, just feedback.





4 Comments

  1. I agree with your assessment. The tool is one of the oldest and most time- and field-proven tools in the toolkit: at least 5 years old (copyright 2007). It hasn’t changed fundamentally in all those years, which on the one hand proves the usefulness of its original design, but on the other hand makes me think it’s time to bring it up to our new standards and ways of doing things. pt-table-checksum v2 is an example of those new standards: fewer options, more automation, and _a lot_ of safety checks.

    It took months to rewrite pt-table-checksum, which is a low-risk tool. pt-archiver is a high-risk tool because it modifies data. As you aptly noted, “it is even more difficult to share a script, to take the responsibility to share its own code”. Add to that that Percona Toolkit tools must work (and be tested) on a large matrix of environments: MySQL 5.0-5.6, Perl 5.8-5.14, DBD::mysql 3.x-4.x, different character sets (latin1 vs. UTF8), different setups (replication, clusters, etc.), InnoDB vs. MyISAM (people still use the latter), etc. I want to rewrite pt-archiver, but I can’t say when that would happen (not because it’s a secret but because I don’t know yet).

    In any case, I appreciate and agree with your feedback. If and when we rewrite the tool, I’ll be sure to run early versions of it by you for more feedback. :-)

    • Hi Daniel, thanks very much for this full and honest reply.
      I have no doubt that the Percona team is able to provide tested, quality tools. I understand what you mean about pt-archiver.
      After one week I still haven’t been able to get pt-archiver to work properly (I have to move 2 billion rows), so we are currently trying to write our own tool.
      Do you think it is still useful to open a bug report?

  2. Cédric, yes, it’s definitely worth opening a bug. The tool does work, and real bugs in it have been very rare, so maybe the problem is related to how you’re running it (which you should include in the bug report, and also PTDEBUG if possible).

