Guide to using Thaana on the WWW - updated

I published an article last year, titled "Guide to using Thaana on the WWW", with the aim of presenting a quick overview of the various approaches to developing Thaana-based websites. It introduced six different methods and included enough implementation detail to help a beginner get started. I've now rewritten parts of the article for clarity and added some examples to reinforce the usage instructions.

Click here to read the updated article.

MySQL and Unicode Thaana

Anyone developing custom code for a Thaana application or website will undoubtedly run into many interesting challenges and issues. Judging by how often I get emails asking for help with getting MySQL to play nice when using Unicode Thaana, I'd say it's one of the very first issues many (newbie?) web developers face.

Here are some of the basics that one should keep in mind when attempting to use Unicode Thaana and MySQL.

Charset
For MySQL to correctly recognize Unicode Thaana text, it is very important that the correct character set be used. If you are going nuts over your code returning "???" or junk characters after trying to store Thaana text in a MySQL database, it is almost certainly because an incorrect charset is being used.

I recommend issuing the following SQL commands upon successfully establishing a connection with the MySQL server.
SET NAMES utf8;
SET CHARACTER SET utf8;

Read the MySQL reference page on connection charsets and collations for more details.
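To see where the "???" and junk characters come from, here is a minimal Python sketch. It does not talk to MySQL at all; it only mimics the two failure modes using Python's own codecs, with latin-1 standing in (as an assumption, purely for illustration) for a wrongly configured connection charset. The `use_utf8` helper and the `cursor` object are hypothetical: any database cursor with an `execute()` method would do.

```python
# Three Thaana letters (U+0780..U+0782), written as escapes so the example
# does not depend on the editor's own encoding.
thaana = "\u0780\u0781\u0782"
utf8_bytes = thaana.encode("utf-8")

# Failure 1: a charset that cannot represent Thaana turns each letter into "?".
question_marks = thaana.encode("latin-1", errors="replace").decode("latin-1")

# Failure 2: correct UTF-8 bytes read back as latin-1 come out as junk.
junk = utf8_bytes.decode("latin-1")

# Success: the same charset on both sides round-trips cleanly.
round_trip = utf8_bytes.decode("utf-8")

# The two statements recommended above, issued right after connecting.
CHARSET_SETUP = ("SET NAMES utf8", "SET CHARACTER SET utf8")


def use_utf8(cursor):
    """Run the charset statements on a cursor-like object (hypothetical helper)."""
    for statement in CHARSET_SETUP:
        cursor.execute(statement)


print(question_marks)        # ???
print(len(junk))             # 6 (each letter became two junk characters)
print(round_trip == thaana)  # True
```

The point of the sketch is only that both symptoms are encoding mismatches, which is why setting the charset on the connection is the first thing to check.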

Collation
MySQL does not have special language-specific collation rules for Thaana - and there is no need for any, as far as I am aware. I recommend using the "utf8_unicode_ci" collation when creating tables and fields in MySQL.

Refer to the MySQL reference page on Unicode character sets for more details.
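As a rough illustration of where the collation goes, the following Python sketch builds a CREATE TABLE statement with the charset and collation recommended above. The table name `articles`, its columns and the helper `create_table_sql` are made up for the example, and identifiers are not escaped, so treat it as a sketch rather than something to paste into production code.

```python
def create_table_sql(table, columns, charset="utf8", collation="utf8_unicode_ci"):
    """Build a CREATE TABLE statement with an explicit charset and collation.

    `columns` is a list of (name, type) pairs. Hypothetical helper: names are
    inserted as-is, so only use it with trusted identifiers.
    """
    column_sql = ",\n  ".join(f"{name} {ctype}" for name, ctype in columns)
    return (
        f"CREATE TABLE {table} (\n  {column_sql}\n) "
        f"DEFAULT CHARACTER SET {charset} COLLATE {collation};"
    )


sql = create_table_sql(
    "articles",  # hypothetical table name
    [("id", "INT AUTO_INCREMENT PRIMARY KEY"), ("title", "VARCHAR(255)"), ("body", "TEXT")],
)
print(sql)
```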

Happy databasing :-)

Guide to blogging in Dhivehi using WordPress

Here is a short guide to writing posts in Dhivehi (in Thaana script) on the free WordPress blogging service. I regularly get emails from different people asking me for help on this topic, so I hope this is helpful to all such people :-).

The solution presented here can be used on the free WordPress blogging accounts available from WordPress.com and also on custom installs of WordPress. Please be aware, however, that there are much better ways to set up WordPress for Dhivehi posting if you have a custom install of WordPress or a paid WordPress account.

Requirements

You need to have the Dhivehi keyboard installed on your computer. If you already type Dhivehi, say using MS Word or OpenOffice, then you have this already. See this post by Fayid if you need help with getting the Dhivehi keyboard installed.

Steps

1. Login to WordPress and click on the "Write" tab to start a new post.


2. In the post editor area, find and click the "HTML" tab to switch the view to HTML mode.


3. Copy and paste the following code into the editing area. It contains the bare minimum HTML/CSS needed to correctly display Dhivehi on all browsers supporting right-to-left text display in Unicode.
Test



4. Tinker around with the font family and font size settings if you know some CSS (or are feeling adventurous enough!). Faruma is probably the most decent Dhivehi Unicode font and is installed on most computers - hence it is listed as the first preference. Different people like different font sizes for Dhivehi but 14px and 16px, I think, tend to be the easiest on the eye.

5. Click the "Visual" tab to switch the view back to the normal WYSIWYG mode.


6. Switch the keyboard to "Divehi" on the computer.


7. Select or delete the "Test" text in the post editor and start writing in Dhivehi.


That's all it takes!
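For readers who want to see roughly what the pasted code in step 3 amounts to, here is a small Python sketch that generates such a wrapper. It is an assumption, not the exact snippet from the post: the `div` wrapper, the helper name `dhivehi_block` and the second font are my own choices, picked to match the font, size and right-to-left settings described in the steps.

```python
from html import escape


def dhivehi_block(text="Test", fonts=("Faruma", "MV Elaaf Normal"), size_px=14):
    """Return an HTML block styled for right-to-left Unicode Dhivehi text.

    Hypothetical helper: font names containing spaces are wrapped in single
    quotes so they can sit inside the double-quoted style attribute.
    """
    font_list = ", ".join(f"'{name}'" if " " in name else name for name in fonts)
    style = (
        f"font-family: {font_list}; font-size: {size_px}px; "
        "direction: rtl; text-align: right;"
    )
    return f'<div style="{style}">{escape(text)}</div>'


print(dhivehi_block())
print(dhivehi_block(size_px=16))
```

Pasting the printed line into the HTML view and then replacing "Test" in the Visual view follows the same flow as steps 3 to 7.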

Notes:
Unfortunately, there is no way to apply proper formatting to the post title with the free accounts on WordPress so you will have to settle for the out-of-alignment title display.

Video guide:
I thought it'd be fun to make a video guide for this, and here is what materialized: View good quality | View crap quality (YouTube)

Good luck with the blogging!

Guide to using Thaana on the WWW

Developing Dhivehi web pages is pretty easy and there are quite a few methods to do it. However, information on how to go about it seems to be lacking, leaving newbies stumped. Here is a general overview of the various methods for displaying Thaana on the WWW; it should contain enough information to help anyone, designer or programmer, get started.

1. CSS: rtl + bidi-override

This method is applicable only to non-Unicode text. It works on all modern browsers but requires the user to have at least one of the fonts specified in the page - otherwise the text is displayed as a mostly meaningless jumble of English letters.

This is the least-effort route to getting any non-Unicode Thaana text (such as text written using MS Word 97/2000, Accent Express, MLS or Faseyha Thaana) on to the web. The websites of Haveeru and Miadhu currently take this approach.

Usage:
To use this method, apply the following CSS to any HTML elements that contain Thaana text. You may use inline style attributes or CSS class/ids to achieve this. You may change the font names to suit your needs but make sure you list several popular fonts and that the fonts specified are all non-Unicode fonts. You could, of course, also add further CSS styling (font size, font color, line height etc) but the following are the required minimum.
font-family: A_Ilham, A_Randhoo, A_Faruma, A_Waheed;
direction: rtl;
unicode-bidi: bidi-override;

Demo:
View example
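As a small illustration of the inline-style route, the Python sketch below wraps a piece of non-Unicode (ASCII-encoded) Thaana text in an element carrying the three required properties. The helper names and the placeholder text are hypothetical; only the CSS itself comes from the usage notes above.

```python
from html import escape

# Non-Unicode Thaana fonts, as listed in the usage notes above.
LEGACY_FONTS = ["A_Ilham", "A_Randhoo", "A_Faruma", "A_Waheed"]


def legacy_thaana_style(fonts=LEGACY_FONTS):
    """Return the minimum inline CSS for non-Unicode Thaana text."""
    return f"font-family: {', '.join(fonts)}; direction: rtl; unicode-bidi: bidi-override;"


def wrap_legacy_thaana(ascii_text, tag="p"):
    """Wrap ASCII-encoded Thaana text in an element with the inline style."""
    return f'<{tag} style="{legacy_thaana_style()}">{escape(ascii_text)}</{tag}>'


# Placeholder text: with one of the fonts installed, these Latin letters are
# drawn as Thaana glyphs and laid out right to left.
print(wrap_legacy_thaana("abc"))
```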



2. Unicode Dhivehi

This method is applicable to text in Unicode. It works well on all modern browsers but requires the user to have at least one Unicode Thaana font. Unlike method (1), if none of the fonts named in the page can be found, the system falls back to any Thaana font it does have.

This is the best method for any new and modern Thaana-based website. It is used in the online Radheef, Jazeera Daily and Haama Daily.

Usage:
To use this method, first add a UTF-8 character set declaration (a Content-Type meta tag with charset=utf-8) to the page's HTML HEAD section.


Next, apply the following CSS to any HTML elements that contain Thaana text. You may use inline style attributes or CSS class/ids to achieve this. You may change the font names to suit your needs but make sure that the fonts specified are all Unicode fonts. You could, of course, also add further CSS styling (font size, font color, line height etc) but the following are the required minimum.
font-family: Faruma, "MV Elaaf Normal";
direction: rtl;
text-align: right;

Demo:
View example
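Putting the two parts of this method together, here is a minimal Python sketch that builds a complete page and encodes it as UTF-8. It assumes the HEAD addition is a UTF-8 Content-Type meta tag; the class name `thaana` and the function name are made up for the example, while the CSS properties are the ones listed above.

```python
from html import escape

# Unicode Thaana fonts, as listed in the usage notes above.
UNICODE_FONTS = 'Faruma, "MV Elaaf Normal"'


def unicode_thaana_page(text):
    """Return a minimal HTML page that displays Unicode Thaana text."""
    return (
        "<html>\n<head>\n"
        # Assumed HEAD addition: declare the page encoding as UTF-8.
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n'
        "<style>\n"
        f".thaana {{ font-family: {UNICODE_FONTS}; direction: rtl; text-align: right; }}\n"
        "</style>\n</head>\n<body>\n"
        f'<p class="thaana">{escape(text)}</p>\n'
        "</body>\n</html>\n"
    )


# The word "Dhivehi" in Thaana, written with code point escapes.
sample = "\u078b\u07a8\u0788\u07ac\u0780\u07a8"

page = unicode_thaana_page(sample)
page_bytes = page.encode("utf-8")  # save or serve these bytes as UTF-8
print(len(sample), len(page_bytes) > len(page))
```

Whatever produces the page, the bytes actually sent to the browser need to be UTF-8 as well, otherwise the declaration and the content disagree.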



3. Image

This approach basically renders the Dhivehi text as an image. This is perhaps the most obvious method and was the only one available early on. It is still a pretty attractive solution, especially given that many computers just don't have the required fonts available. Using an image for the text removes the requirement for the client browser/computer to have the proper fonts.

The basic approach of rendering the text into an image using Photoshop, MS Word etc is pretty tedious as the process is entirely manual. However, there is a more sophisticated approach that renders the text into an image on the fly on the web server (perhaps coupled with caching to reduce load). A server-side scripting language such as PHP can be used to render text into an image using any font the designer/programmer chooses. The rendered images (typically PNGs) are very small and hence have a negligible effect on the page load time in most cases.

Refer to the imagettftext function for details on how to do it in PHP.
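The caching idea mentioned above can be sketched in a few lines of Python. This is not a rendering example: the actual drawing is left to a `render` callable that you supply (a hypothetical stand-in for whatever produces the PNG bytes, such as a call into an image library), and the sketch only shows how a rendered image could be reused for identical text, font and size.

```python
import hashlib
from pathlib import Path


def cached_text_image(text, font, size_px, cache_dir, render):
    """Return the path of a cached PNG for the text, rendering it only once.

    `render(text, font, size_px)` is a hypothetical callable that returns the
    PNG image as bytes. The cache key covers everything that affects the image.
    """
    key = hashlib.sha1(f"{font}|{size_px}|{text}".encode("utf-8")).hexdigest()
    path = Path(cache_dir) / f"{key}.png"
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(render(text, font, size_px))
    return path
```

On a cache hit the renderer is never called, so a headline that appears on every page is drawn once and afterwards served as a static file.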



4. Flash

This method uses text loaded in Macromedia Flash, with the required font(s) embedded in the Flash clip. ActionScript and/or Flash variables are used to load the text into text areas in the Flash file. This method has the advantage that it works whether or not the client computer/browser has a Dhivehi font available, but it does require the client to have Flash installed and enabled. If you only want nice one-line, headline-style text in Dhivehi then you might consider using sIFR.

Refer to Font Embedding help page at Adobe LiveDocs for details on font embedding in Flash.



5. WEFT

The Web Embedding Fonts Tool (WEFT) is an Internet Explorer-only solution offered by Microsoft. It involves using the Windows-only WEFT utility to create font "objects" that can then be placed on web pages. This method is not recommended unless the target audience uses only Internet Explorer.

Refer to Microsoft WEFT page for more information.



6. TrueDoc

TrueDoc is a solution offered by Bitstream Inc. It is similar to Microsoft's WEFT in that TrueDoc creates an embeddable font resource called a Portable Font Resource. Any font (i.e. a Dhivehi font) can be loaded once users install a custom font "viewer" (called the Character Shape Player by the company). This solution is NOT free and requires the purchase of special software from Bitstream to produce the custom embeddable font packages.

Refer to the TrueDoc site for more information.



Good luck ;-)

Update (24-Nov-2008): Methods 1 and 2 rewritten for clarity and demos added.