MySQL and Unicode Thaana

Anyone developing custom code for a Thaana application or website will undoubtedly run into many interesting challenges and issues. Judging by how often I get emails asking for help with getting MySQL to play nice when using Unicode Thaana, I'd say it's one of the very first issues many (newbie?) web developers face.

Here are some of the basics that one should keep in mind when attempting to use Unicode Thaana and MySQL.

Charset
It is very important to use the correct character set so that MySQL recognizes Unicode Thaana text correctly. If you are going nuts over your code returning "???" or junk characters after storing Thaana text in a MySQL database, an incorrect charset is almost certainly the cause.

I recommend issuing the following SQL commands upon successfully establishing a connection with the MySQL server.
SET NAMES utf8;
SET CHARACTER SET utf8;


Read the MySQL reference page on connection charsets and collations for more details, including how these two statements differ.
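To see where the "???" and the junk characters come from, here is a small Python sketch of the two usual failure modes. It does not talk to MySQL at all; it only mimics what happens to the bytes. Using latin-1 as the "wrong" charset is my assumption for illustration, and any charset without Thaana letters would behave similarly.

```python
# The word "Dhivehi" in Thaana, written with escapes so the file encoding
# does not matter. Thaana letters live in the range U+0780..U+07BF.
text = "\u078b\u07a8\u0788\u07ac\u0780\u07a8"

# Correct path: Thaana encoded as UTF-8 takes two bytes per character.
utf8_bytes = text.encode("utf-8")

# Failure mode 1: the text gets converted to a charset that has no Thaana
# letters, so every character is replaced by "?".
question_marks = text.encode("latin-1", errors="replace").decode("latin-1")

# Failure mode 2: the UTF-8 bytes are kept as-is but read back as latin-1,
# so every byte turns into an unrelated character (the "junk").
junk = utf8_bytes.decode("latin-1")

# When both ends agree on UTF-8, the text survives the round trip.
round_trip = utf8_bytes.decode("utf-8")

print(question_marks)       # ??????
print(len(utf8_bytes))      # 12
print(round_trip == text)   # True
```

If your output looks like the first case, the text was most likely damaged on the way in; if it looks like the second, the stored bytes may still be fine and only the charset used to read them is wrong.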

Collation
MySQL does not have special language-specific collation rules for Thaana - and there is no need for such, as far as I am aware. I recommend using the "utf8_unicode_ci" collation when creating tables and fields in MySQL.

Refer to the MySQL reference page on Unicode character sets for more details.
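As a rough illustration of the collation advice, the Python sketch below builds a CREATE TABLE statement that sets the utf8 charset and the utf8_unicode_ci collation on both the table and its text fields. The table name "articles", the column names and the helper create_table_sql are all made up for this example; adapt them to your own schema.

```python
CHARSET = "utf8"
COLLATION = "utf8_unicode_ci"


def create_table_sql(table, text_columns):
    """Return a CREATE TABLE statement whose text columns use CHARSET/COLLATION.

    `table` and `text_columns` are hard-coded, trusted identifiers here.
    Do not build identifiers from user input this way.
    """
    lines = ["    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY"]
    for name, sql_type in text_columns.items():
        lines.append(
            f"    {name} {sql_type} CHARACTER SET {CHARSET} "
            f"COLLATE {COLLATION} NOT NULL"
        )
    body = ",\n".join(lines)
    return (
        f"CREATE TABLE {table} (\n{body}\n) "
        f"DEFAULT CHARSET={CHARSET} COLLATE={COLLATION};"
    )


# Hypothetical table with two Thaana text fields.
ddl = create_table_sql("articles", {"title": "VARCHAR(255)", "body": "TEXT"})
print(ddl)
```

Setting the collation at the table level means any text field added later picks it up by default; spelling it out per field as well is redundant but makes the intent obvious.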

Happy databasing :-)

Guide to blogging in Dhivehi using WordPress

Here is a short guide to writing posts in Dhivehi (in Thaana) on the free WordPress blogging service. I regularly get emails from different people asking me for help on this topic, so I hope this is helpful to all such people :-).

The solution presented here can be used on the free WordPress blogging accounts available from WordPress.com and also on custom installs of WordPress. Please be aware, however, that there are much better ways to set up WordPress for Dhivehi posting if you have a custom install of WordPress or a paid WordPress account.

Requirements:
You need to have the Dhivehi keyboard installed on your computer. If you already type Dhivehi, say using MS Word or OpenOffice, then you have this already. See this post by Fayid if you need help with getting the Dhivehi keyboard installed.

Steps:
1. Log in to WordPress and click on the "Write" tab to start a new post.


2. In the post editor area, find and click the "HTML" tab to switch the view to HTML mode.


3. Copy and paste the following code into the editing area. It contains the bare minimum HTML/CSS needed to correctly display Dhivehi on all browsers supporting right-to-left text display in Unicode.
<div style="direction: rtl; text-align: right; font-family: faruma, 'mv iyyu nala', 'mv elaaf normal'; font-size: 14px;">Test</div>



4. Tinker around with the font family and font size settings if you know some CSS (or are feeling adventurous enough!). Faruma is probably the best Dhivehi Unicode font around and is installed on most computers, hence it is listed as the first preference. Different people like different font sizes for Dhivehi, but 14px and 16px, I think, tend to be the easiest on the eye.

5. Click the "Visual" tab to switch the view back to the normal WYSIWYG mode.


6. Switch the keyboard to "Divehi" on the computer.


7. Select or delete the "Test" text in the post editor and start writing in Dhivehi.


That's all it takes!
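If you would rather generate the wrapper from step 3 than edit it by hand each time you try a different font or size, something like the Python sketch below could do it. The function name dhivehi_div and its parameters are my own invention for illustration; with the default values it should produce exactly the snippet from step 3.

```python
from html import escape


def dhivehi_div(text, fonts=("faruma", "mv iyyu nala", "mv elaaf normal"), size_px=14):
    """Wrap text in a right-to-left div using the given font list and size."""

    def css_font(name):
        # Font names containing spaces must be quoted in CSS.
        return f"'{name}'" if " " in name else name

    family = ", ".join(css_font(name) for name in fonts)
    style = (
        f"direction: rtl; text-align: right; "
        f"font-family: {family}; font-size: {size_px}px;"
    )
    # escape() keeps characters like < and & in the text from breaking the HTML.
    return f'<div style="{style}">{escape(text)}</div>'


snippet = dhivehi_div("Test")
bigger = dhivehi_div("Test", size_px=16)
print(snippet)
print(bigger)
```

Paste the printed line into the HTML tab as in step 3, then carry on from step 5.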

Notes:
Unfortunately, there is no way to apply proper formatting to the post title with the free accounts on WordPress, so you will have to settle for the out-of-alignment title display.

Video guide:
I thought it'd be fun to make a video guide for this, and here is what materialized: View good quality | View crap quality (YouTube)

Good luck with the blogging!