MySQL and Unicode Thaana

Anyone developing custom code for a Thaana application or website will undoubtedly run into many interesting challenges and issues. Judging by how often I get emails asking for help with getting MySQL to play nice when using Unicode Thaana, I'd say it's one of the very first issues many (newbie?) web developers face.

Here are some of the basics that one should keep in mind when attempting to use Unicode Thaana and MySQL.

Charset
For MySQL to recognize Unicode Thaana text correctly, the correct character set must be used. If you are going nuts over your code returning "???" or junk characters after trying to store Thaana text in a MySQL database, it is almost certainly because an incorrect charset is being used somewhere between your code and the database.

I recommend issuing the following SQL commands immediately after establishing a connection to the MySQL server.
SET NAMES utf8;
SET CHARACTER SET utf8;
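
If you want to confirm that the connection settings actually took effect, a quick check along these lines may help. This is only a sketch using standard MySQL statements. Note that, per the MySQL manual, SET CHARACTER SET sets character_set_connection to the default database's charset rather than to the charset you name, so the values you see depend on how your database was created.

```sql
-- Set the client, connection and result charsets in one statement.
SET NAMES utf8;

-- List the character set variables for this session.
-- character_set_client, character_set_connection and character_set_results
-- should all report utf8 (newer servers may display it as utf8mb3).
SHOW VARIABLES LIKE 'character_set_%';

-- The connection collation should be one of the utf8 collations.
SHOW VARIABLES LIKE 'collation_connection';
```

If any of the three variables still shows something like latin1, that is the likely source of the "???" output.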

Read the MySQL reference page on connection character sets and collations for more details.

Collation
MySQL does not have special language-specific collation rules for Thaana, and as far as I am aware there is no need for any. I recommend using the "utf8_unicode_ci" collation when creating tables and fields in MySQL.
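
As a rough illustration, a table set up this way could look like the following. The table and column names are made up for the example and are not from any real schema.

```sql
-- Hypothetical table: names are for illustration only.
-- The table-level default applies to every text column that does not
-- specify its own charset and collation.
CREATE TABLE articles (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(255) NOT NULL,
    body TEXT NOT NULL
) DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci;

-- The charset and collation can also be set on an individual column,
-- for example when changing a column in an existing table.
ALTER TABLE articles
    MODIFY title VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL;
```

Setting the default at table level keeps things simple, since new text columns pick it up automatically.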

Refer to the MySQL reference page on Unicode character sets for more details.

Happy databasing :-)