MySQL: Fix Microsoft Word characters. Shows weird characters on the web page.

As a consultant, I do a lot of content migrations for clients. One issue I run into quite often is the encoding of databases, tables, columns differs between source and destination. Most clients do not want me to go and change the way their encoding is to fix issues since they are too afraid about messing with production data. Of course amongst other issues, it creates weird characters for data which is copied/pasted from Microsoft Word. You see weird characters like: ’ … – “ †‘

So if you just want to replace these with appropriate symbols, you may do it with a simple sql query. Note that below queries are without where clause. You may what to test it with one of your rows before making changes to the whole table. Of course, you should always backup your data before you try this out. If you have a dev system, that is even better. I put all my sql queries into a file ex: fix.sql and sourced it with mysql client.

vi fix.sql

update table_name set fieldname = replace(fieldname, '’', '\'');
update table_name set fieldname = replace(fieldname, '…','...');
update table_name set fieldname = replace(fieldname, '–','-');
update table_name set fieldname = replace(fieldname, '“','"');
update table_name set fieldname = replace(fieldname, '”','"');
update table_name set fieldname = replace(fieldname, '‘','\'');
update table_name set fieldname = replace(fieldname, '•','-');
update table_name set fieldname = replace(fieldname, '‡','c');

Save/exit.

# mysql
mysql> source fix.sql;

I am not sure if I am missing any other chars. If you know of any other chars, please comment with them and I will add on to the script here.

————————————-
DISCLAIMER: Please be smart and use code found on internet carefully. Make backups often. And yeah.. last but not least.. I am not responsible for any damage caused by this posting. Use at your own risk.

2 thoughts on “MySQL: Fix Microsoft Word characters. Shows weird characters on the web page.

  1. Chris Henry

    Wow…this looks like it will do exactly what I’m looking for. I run a job posting site where users will often paste directly from Word, and wind up with lots of these characters. Do you have a php equivalent?

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>