Changing MySQL database encoding

For some time now, I had a pending task of converting the latin1-encoded MySQL database powering this site into utf8. I finally managed to do it after getting some advice from the kind people at a Vienna WordPress meetup.

The essence is that out of all the methods suggested in the official documentation, what worked best for me was dumping all the data to a text file, marking up and encoding it as utf-8 in a text editor and then importing it into a new database (instead of working on the production DB) that we point the wp-config.php file to after everything is verified.

First I used the basic backup and restore procedure to set up the DB in my local MySQL DB. Note that I had to import the data from the command line, due to PHPMyAdmin’s import file size limit.

$ mysql -u root -p mydatabase < mydatabase.sql

Then I dumped it once again with the forced encoding.

$ mysqldump -u root -p --opt --quote-names --skip-set-charset --default-character-set=latin1 mydatabase > database2.sql

After this it can be opened in a (reasonably good) editor (as there will be many lines). I used Emacs and it’s easy to play with the encoding in it: M-x revert-buffer-with-coding-system to force reading as latin1 and C-x C-m f to save it as utf8. Another M-x replace-string to substitute every occurrence of latin1_swedish_ci with utf8_general_ci to make sure all the tables and columns are properly defined, save under a different name and we’re good to go.

What remains is to import it into your local DB and use PHPMyAdmin to see that everything is readable and that there are no obvious errors.

$ mysql -u root -p --default-character-set=utf8 mydatabase-utf8 < mydatabase-utf8.sql

If everything’s fine, create a new database on your site, import data from the fixed file into it and edit wp-config.php to use it. If something doesn’t work, you can always change right back to your old database, because we haven’t touched it on the site. After a few weeks, once it’s clear that everything is still working, you can delete the old database.

I also expermiented with changing the DB engine to InnoDB (said to be recommended as the default in newer MySQL versions) by just replacing MyISAM with InnoDB in the text file, but that pumped my DB size from ~6 MB to ~40 MB (probably some transaction-related redundancy) and I needed root access to the my.cnf file to set InnoDB as the default for new tables (though this can be solved in WordPress), which proved to be not so desirable on my bet-you-can’t-run-it-cheaper-than-me shared baby-size hosting, so I decided to stay with MyISAM for now. Maybe once I start getting some serious traffic… 🙂

Published by

metakermit

Building apps, analysing data at Punk Rock Dev and sharing weird & cool photographs, drawings, music, films, games... More about me here. You can get new blog posts via RSS or follow @metakermit on Twitter where I also announce new stuff.

One thought on “Changing MySQL database encoding”

Leave a Reply

Your email address will not be published. Required fields are marked *