For some time now, I had a pending task of converting the latin1-encoded MySQL database powering this site into utf8. I finally managed to do it after getting some advice from the kind people at a Vienna WordPress meetup.
The essence is that out of all the methods suggested in the official documentation, what worked best for me was dumping all the data to a text file, marking up and encoding it as utf-8 in a text editor and then importing it into a new database (instead of working on the production DB) that we point the wp-config.php file to after everything is verified.
$ mysql -u root -p mydatabase < mydatabase.sql
Then I dumped it once again with the forced encoding.
$ mysqldump -u root -p --opt --quote-names --skip-set-charset --default-character-set=latin1 mydatabase > database2.sql
After this it can be opened in a (reasonably good) editor (as there will be many lines). I used Emacs and it’s easy to play with the encoding in it: M-x revert-buffer-with-coding-system to force reading as latin1 and C-x C-m f to save it as utf8. Another M-x replace-string to substitute every occurrence of latin1_swedish_ci with utf8_general_ci to make sure all the tables and columns are properly defined, save under a different name and we’re good to go.
What remains is to import it into your local DB and use PHPMyAdmin to see that everything is readable and that there are no obvious errors.
$ mysql -u root -p --default-character-set=utf8 mydatabase-utf8 < mydatabase-utf8.sql
If everything’s fine, create a new database on your site, import data from the fixed file into it and edit wp-config.php to use it. If something doesn’t work, you can always change right back to your old database, because we haven’t touched it on the site. After a few weeks, once it’s clear that everything is still working, you can delete the old database.
I also expermiented with changing the DB engine to InnoDB (said to be recommended as the default in newer MySQL versions) by just replacing MyISAM with InnoDB in the text file, but that pumped my DB size from ~6 MB to ~40 MB (probably some transaction-related redundancy) and I needed root access to the my.cnf file to set InnoDB as the default for new tables (though this can be solved in WordPress), which proved to be not so desirable on my bet-you-can’t-run-it-cheaper-than-me shared baby-size hosting, so I decided to stay with MyISAM for now. Maybe once I start getting some serious traffic… 🙂