Firefox 3 Thaana display bug: review and fixes

Maldivians who use Firefox will have noticed that certain Dhivehi websites, such as Miadhu Online, no longer display Thaana fonts correctly after switching to the recently released version 3 of the popular browser. I would like to review the issue for the benefit of Maldivian web developers and put forward some solutions. I would also like to make available a fix that ordinary web users can use themselves until website owners (or the Firefox developers) fix the issue.

Problem description

The Firefox 3.x series (and, to a lesser extent, the 2.x series) fails to display Thaana correctly in web pages when certain non-Unicode Thaana fonts are applied to elements using CSS. The same pages, however, render correctly in Internet Explorer, Safari and Opera.

DOCTYPE - One contributing factor seems to be the DOCTYPE of the page. My guess is that this issue has something to do with quirks mode rendering or standards compliance. Pages without a DOCTYPE in the markup render the Thaana fonts correctly. However, omitting the DOCTYPE cannot and should not be considered a solution, as a DOCTYPE is required for most page markup and browsers need the correct DOCTYPE specification to render modern pages correctly.

Font - Another factor seems to be the font file used. Thaana characters fail to render correctly with almost all of the commonly used Thaana fonts, such as A_Faseyha, A_Waheed and A_Randhoo. However, some fonts do work without issue - A_Ilham for example.

Here are some demo pages to highlight the issues. Each page has four lines of Thaana. The first is Thaana text enclosed in a font tag specifying a (problematic) Thaana font. The second is an H3 headline whose font family is set to a (problematic) Thaana font using CSS alone. The third is again an H3 headline whose font family is set to a (problematic) Thaana font using CSS, but with the text placed inside a font tag. The fourth is an H3 headline whose font family is set to a (working) Thaana font using CSS alone.
View Thaana on page with: no DOCTYPE, HTML 4.01 DOCTYPE and XHTML 1.0 DOCTYPE.

Developer's fix

There are two definite solutions that can be easily applied by web developers.

Solution 1: Add HTML Font tags around any and all text that is to be displayed in Thaana. Specify the font to be used within the "face" attribute of the Font tags as usual. The downside of this method is that it results in a significant increase in page size. Haveeru News seems to have addressed the problem using this method. Here's an example:
bwlimIhunc aufulunc 

should be transformed into 

bwlimIhunc aufulunc
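
As a rough sketch of the transformation in Solution 1, the helper below wraps a run of ASCII-encoded Thaana text in a Font tag. The function names and the choice of A_Faseyha as the font are assumptions for illustration, not code taken from any of the sites mentioned.

```javascript
// Hypothetical helper: escape characters that would break the generated markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: wrap ASCII-encoded Thaana text in a <font> tag
// whose "face" attribute names the Thaana font to use.
function wrapInFontTag(text, face) {
  return '<font face="' + escapeHtml(face) + '">' + escapeHtml(text) + '</font>';
}

// The sample text from the example above, with A_Faseyha as an assumed font.
console.log(wrapInFontTag('bwlimIhunc aufulunc', 'A_Faseyha'));
// prints: <font face="A_Faseyha">bwlimIhunc aufulunc</font>
```

Every wrapped run of text carries the extra tag and attribute, which is where the increase in page size comes from.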

Solution 2: Change the font used in the CSS definition to "A_Ilham". It is, perhaps, not as clean and pretty as "A_Faseyha" but until Firefox is fixed it will have to do.

A further alternative solution would be for the site owners and developers to take this occasion to shift to Unicode Thaana. It is much more reliable and is the currently recommended method of displaying Thaana on the web. Jazeera Daily, Haama Daily and MvHeadlines, to name a few, are all using Unicode for text display and entry. You can utilize the PHP-based Thaana Conversions class I released to convert the existing non-Unicode Thaana text to Unicode - and you can do such conversion on-the-fly on page requests.

User's fix

I wrote a quick bookmarklet-based solution several weeks ago for my use after getting annoyed with having to open Internet Explorer to view pages from sites affected by this bug. This solution will, or rather should, work on any affected site and on any computer.

Simply right click on this link - Jaa's Thaana Fix - and select "Bookmark this link" from the drop-down menu. Alternatively, you can drag and drop the link onto your bookmarks toolbar. When you are on a page that is messed up by the bug, such as Miadhu Online, Vaikaradhoo Live or Kavaasaa, click the "Thaana fix" link on your Bookmarks menu or toolbar. You will need to do this for each page you view.
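
For the curious, here is a minimal sketch of how a fix of this kind could work. It is not the code behind the link above. It simply applies the idea from Solution 2 to every element on a page. The function names and the font lists are assumptions for illustration, and it is written in current JavaScript rather than what a 2008 bookmarklet would have used.

```javascript
// Fonts the post reports as failing in Firefox 3, and the one it reports as working.
const PROBLEM_FONTS = ['a_faseyha', 'a_waheed', 'a_randhoo'];
const WORKING_FONT = 'A_Ilham';

// True if a CSS font-family value names any of the problematic fonts.
function usesProblemFont(fontFamily) {
  return String(fontFamily || '')
    .split(',')
    .map((name) => name.trim().replace(/^["']|["']$/g, '').toLowerCase())
    .some((name) => PROBLEM_FONTS.includes(name));
}

// Walk every element of `doc` and switch the affected ones to the working font.
// Returns the number of elements that were changed.
function applyThaanaFix(doc) {
  const view = doc.defaultView;
  let changed = 0;
  for (const el of Array.from(doc.getElementsByTagName('*'))) {
    if (usesProblemFont(view.getComputedStyle(el).fontFamily)) {
      el.style.fontFamily = WORKING_FONT;
      changed += 1;
    }
  }
  return changed;
}
```

In a browser this would be run as `applyThaanaFix(document)`. Because the change is made to the live page only, it has to be repeated on every page load.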

Happy reading :-)

Javascript Thaana Keyboard version 3.0

I released my Javascript Unicode Keyboard Handler for Thaana early this year as open-source software so that web developers producing Dhivehi websites can allow users to type Thaana straight into text entry fields without forcing them to switch keyboards using the relevant features of their computer's operating system. The code has since made its way into many different Dhivehi websites. However, the code I released then was mostly as-is from its original version, which I had written back in 2003. Sadly, this means that its behavior can be a little unpredictable with certain modern browsers - especially Opera and Safari.

I've now rewritten the code with the intent of producing cleaner, easier-to-use code that works without fail on all modern browsers. This version is (more or less!) guaranteed to work, and has been tested, on Firefox 2+, Opera 9+, Internet Explorer 6+ and Safari 2+ and has also been tested on Windows, Mac and Linux operating systems.

I am a big fan of separating code from design, so in keeping with that ideal this new version uses a more modern way of assigning the Thaana keyboard functionality instead of the inline javascript event handling used by the previous version (look below for an example). Since everything needs a spunky name I've also changed the old name to the more descriptive "Javascript Thaana Keyboard", which future versions of the script will maintain.

As before, it is being released under the MIT License, which allows its use in both personal and commercial applications as long as the copyright and license permission notice remains intact - so what the guy at basfoiy.com has done is a definite no-no.

Usage:

1. Link the file in the HEAD section of the page:


2. For any text input element (i.e. INPUTs or TEXTAREAs), assign it the class name "thaanaKeyboardInput". You can assign further classes to the element without ill effect, if needed.

3. Using CSS, set any Unicode-compatible Dhivehi font (and size) to be used for the fields. You can easily do that by adding a class definition for the "thaanaKeyboardInput" class or by any other method of your choice.

4. The Thaana functionality is automatically applied to any elements with the required class name when the page is loaded!
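
To give an idea of what class-based assignment involves, here is a minimal sketch. It is not the released script. The function names are made up for illustration, and the handler passed in stands for whatever function does the actual keystroke translation.

```javascript
const KEYBOARD_CLASS = 'thaanaKeyboardInput';

// True if a space-separated class attribute contains the keyboard class.
// Other classes on the same element are simply ignored.
function hasKeyboardClass(className) {
  return String(className || '').split(/\s+/).includes(KEYBOARD_CLASS);
}

// Attach `handler` to every INPUT and TEXTAREA in `doc` that carries the class.
// Returns the number of fields that were wired up.
function attachThaanaKeyboard(doc, handler) {
  let attached = 0;
  for (const tag of ['input', 'textarea']) {
    for (const field of Array.from(doc.getElementsByTagName(tag))) {
      if (hasKeyboardClass(field.className)) {
        field.addEventListener('keypress', handler);
        attached += 1;
      }
    }
  }
  return attached;
}
```

In a page this would be called once the document has loaded, for example `window.addEventListener('load', () => attachThaanaKeyboard(document, myHandler))`, where `myHandler` is a placeholder name. Older versions of Internet Explorer lack `addEventListener` and would need `attachEvent` instead.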

Demo:

Check out the demonstration and testing page here.

Download:

- original full source version (7.34 KB)
- minified version (2.01 KB)
I recommend you use the minified version.

As always, drop a line here if you use it and/or have problems or suggestions. Enjoy. :-)

Update (20-Oct-2008): This version is now superseded by the new and improved v4.0.

Thaana date formatting for PHP 5

Here is a PHP 5 class that provides a drop-in function replacement/equivalent for the built-in PHP date() function to output formatted dates in Thaana/Dhivehi. It follows the standard method of writing Gregorian dates in Thaana by using transliterations of the English month names and using the native Dhivehi names for the week days. It accepts all the usual formatting arguments permitted by the original date() function thus allowing the same degree of formatting freedom as the original. The output returned from the function uses ASCII Thaana and, if needed, can then be converted to Unicode/UTF-8 by using the Thaana Conversions class. This class does not support Hijri dates (yet).

The class is being released under the Open Source MIT License.

Functions exposed

format()
Returns a Dhivehi date string formatted according to the given format string using the given integer timestamp

Usage

<?php
// Load class include
require 'thaana_date.obj.php';

// Format date
$thaanatoday = Thaana_Date::format('j M Y', time());
?>

Download

- Thaana_Date.zip (v0.2, 1.4KB)

Drop me a line if you have comments/queries. Enjoy :-)

Javascript Unicode Keyboard Handler for Thaana

Here's something that is probably going to be very useful to Maldivian web developers working on Unicode-based Thaana web pages. It is a Javascript utility function that translates keystrokes into the appropriate Unicode Thaana characters. Hence, it makes it possible for HTML text input and textarea fields (and similar) to accept Thaana without requiring the user to switch the keyboard language on their computer. Such a feature contributes to a better user experience as the user can simply enter Dhivehi without the extra hassle. The code has been tested with no problems found on Firefox 1/2/3 and Internet Explorer 5/6/7.

If you would like a demo, I recommend you check out the text entry box at Radheef.com and see the HTML behind it. A few developers seem to have already adopted the code used at Radheef.com in their own work - haamadaily.com, sangudaily.com, jazeera.com.mv and haveeru.com.mv are using it as far as I know.

I originally wrote this around 2002 while experimenting with different methods of Thaana entry for the web. The version I'm releasing here, marked as version 2.0, is a modified version from 2006. It is being released under the MIT License.

- Download unicodehandler-2.0.js

Usage

1. Link the file in the HEAD section of the page:
<script type="text/javascript" src="/unicodehandler.js"></script>

2. Attach the handler to any text INPUT, TEXTAREA or editable DIV tag:
<textarea rows="1" onkeypress="return juk_HandleKeyPress(event);"></textarea>

3. Set any Unicode-compatible Dhivehi font to be used for the field using CSS.

4. That's it!
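
For readers wondering what a handler of this kind does internally, here is a minimal sketch of the idea: look up the typed key in a table and insert the matching Thaana character at the cursor. This is not the released unicodehandler code. The function names are made up, the key table is only a small subset assumed to follow the common phonetic layout, and it uses the current `event.key` property rather than the key codes a 2008 script would rely on.

```javascript
// Illustrative subset of a phonetic key map (Latin key -> Thaana character).
// Assumed for this sketch; the released script's table may differ.
const KEY_MAP = {
  h: '\u0780', // haa
  r: '\u0783', // raa
  v: '\u0788', // vaavu
  d: '\u078B', // dhaalu
  i: '\u07A8', // ibifili
  e: '\u07AC', // ebefili
};

// Return the Thaana character for a key, or the key itself if it is unmapped.
function translateKey(ch) {
  return Object.prototype.hasOwnProperty.call(KEY_MAP, ch) ? KEY_MAP[ch] : ch;
}

// Replace the selection [start, end) of `value` with `text`.
function insertAtCursor(value, start, end, text) {
  return {
    value: value.slice(0, start) + text + value.slice(end),
    caret: start + text.length,
  };
}

// keypress handler: swaps mapped keys for Thaana, lets everything else through.
function handleKeyPress(event) {
  const ch = event.key;
  if (!ch || ch.length !== 1 || event.ctrlKey || event.metaKey || event.altKey) {
    return true;
  }
  const mapped = translateKey(ch);
  if (mapped === ch) return true;
  const field = event.target;
  const result = insertAtCursor(field.value, field.selectionStart, field.selectionEnd, mapped);
  field.value = result.value;
  field.selectionStart = field.selectionEnd = result.caret;
  event.preventDefault();
  return false;
}
```

Returning `false` and calling `preventDefault()` stops the browser from also inserting the Latin letter, which is why the inline attribute in step 2 uses `return`.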

Drop a line here if you use it and/or have problems. Enjoy.

Update (16-Aug-2008): This version is now superseded by the new and improved v3.0.

Thaana Unicode<->Ascii conversions PHP class

Here is something that would probably be very handy to Maldivian web developers dabbling with Dhivehi sites. This PHP class addresses the need for converting text to and from Thaana in Ascii and Thaana in Unicode.

The class makes it easy to standardize text into one format irrespective of how it was/is written. This means that you can take text written in Accent, MS Word 97 (and prior) or written using Unicode as featured on recent MS Word editions and use the class to present output in the format of your choice without the need for imposing restrictions on the people who write the text. The class comes in even more handy when you have a form submission that takes input in Unicode but needs to be stored in the database or presented later as Ascii, or vice versa.

The class was something I originally wrote around 2001 and was used in the free Online Document Converter that featured on maldivianunderground.net. I rewrote it for PHP 5 recently for use in a project I am working on. The original class had support for Latin Dhivehi -> Unicode/Ascii conversions as well, which I haven't included in this release but will add in a future update.

Usage should be pretty straightforward but here is an example just to illustrate:

Example:
<?php
require 'thaana_conversions.obj.php';
$thaana = new Thaana_Conversions();

echo $thaana->convertUnicodeToAscii('&#1931;&#1960;&#1928;&#1964;&#1920;&#1960;');
echo $thaana->convertAsciiToUnicode('rWacje');
?>
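
To illustrate the kind of mapping involved, here is a minimal sketch of a table-driven conversion. It is written in JavaScript for brevity and is not the class's implementation. The table covers only the characters that appear in the two example strings above, and the function names are assumptions.

```javascript
// Partial ASCII <-> Unicode table, limited to the characters used in the
// two example strings. A real converter needs the full Thaana alphabet.
const ASCII_TO_UNICODE = {
  h: '\u0780', // haa
  r: '\u0783', // raa
  a: '\u0787', // alifu
  v: '\u0788', // vaavu
  d: '\u078B', // dhaalu
  j: '\u0796', // javiyani
  W: '\u07A7', // aabaafili
  i: '\u07A8', // ibifili
  e: '\u07AC', // ebefili
  c: '\u07B0', // sukun
};

// Build the reverse table by swapping keys and values.
const UNICODE_TO_ASCII = Object.fromEntries(
  Object.entries(ASCII_TO_UNICODE).map(([ascii, uni]) => [uni, ascii])
);

// Map each character through `table`, leaving unknown characters untouched.
function convertWith(table, text) {
  return Array.from(text, (ch) =>
    Object.prototype.hasOwnProperty.call(table, ch) ? table[ch] : ch
  ).join('');
}

const asciiToUnicode = (text) => convertWith(ASCII_TO_UNICODE, text);
const unicodeToAscii = (text) => convertWith(UNICODE_TO_ASCII, text);

// The Unicode example string, given as decimal code points.
const dhivehi = String.fromCharCode(1931, 1960, 1928, 1964, 1920, 1960);
console.log(unicodeToAscii(dhivehi)); // prints: divehi
```

Because the two tables are exact inverses, converting a string one way and back returns the original text for every character the table covers.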

Download:
- Thaana_Conversions.zip (v0.1, 2KB)

Enjoy :-)

Update (7-May-2008): This version is now superseded by v0.2.

Guide to using Thaana on the WWW

Developing Dhivehi web pages is pretty easy and there are quite a few methods to do it. However, information on how to go about it seems to be lacking, leaving newbies stumped. Here is a general overview of the various methods for displaying Thaana on the WWW. It should contain enough information to help anyone, designer or programmer, get started.

1. CSS: rtl + bidi-override

This method is applicable only to non-Unicode text. It works on all modern browsers but requires the user to have at least one of the fonts specified in the page - otherwise the text is displayed as a mostly meaningless jumble of English letters.

This is the least-effort route to getting any non-Unicode Thaana text (such as those written using MS Word 97/2000, Accent Express, MLS or Faseyha Thaana) on to the web. The websites of Haveeru and Miadhu currently take this approach.

Usage:
To use this method, apply the following CSS to any HTML elements that contain Thaana text. You may use inline style attributes or CSS class/ids to achieve this. You may change the font names to suit your needs but make sure you list several popular fonts and that the fonts specified are all non-Unicode fonts. You could, of course, also add further CSS styling (font size, font color, line height etc) but the following are the required minimum.
font-family: A_Ilham, A_Randhoo, A_Faruma, A_Waheed;
direction: rtl;
unicode-bidi: bidi-override;

Demo:
View example



2. Unicode Dhivehi

This method is applicable to text in Unicode. It works well on all modern browsers but requires the user to have at least one Unicode Thaana font - and unlike method (1), the system falls back to whatever Thaana font it does have if it cannot find any of the fonts named in the page.

This is the best method for any new and modern Thaana-based website. It is used in the online Radheef, Jazeera Daily and Haama Daily.

Usage:
To use this method, first add the following to the page's HTML HEAD section.
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

Next, apply the following CSS to any HTML elements that contain Thaana text. You may use inline style attributes or CSS class/ids to achieve this. You may change the font names to suit your needs but make sure that the fonts specified are all Unicode fonts. You could, of course, also add further CSS styling (font size, font color, line height etc) but the following are the required minimum.
font-family: Faruma, "MV Elaaf Normal";
direction: rtl;
text-align: right;

Demo:
View example



3. Image

This approach basically renders the Dhivehi text as an image. This is perhaps the most obvious method and was the only one available early on. It is still a pretty attractive solution, especially given that many computers just don't have the required fonts available. Using an image for the text removes the requirement for the client browser/computer to have the proper fonts available.

The basic approach of rendering the text into an image using Photoshop, MS Word etc is pretty tedious as the process is entirely manual. However, there is a more sophisticated approach that renders the Dhivehi text into an image on the fly on the web server (perhaps coupled with caching to reduce load). A server-side scripting language such as PHP can be used to render text into an image using any font chosen by the designer/programmer. The rendered images (typically PNGs) are very small and hence have a negligible effect on the page load time in most cases.

Refer to the imagettftext function for details on how to do it in PHP.



4. Flash

This method uses text loaded in Macromedia Flash with the required font(s) embedded in the Flash clip. ActionScript and/or Flash variables are used to load the text into text areas in the Flash file. This method has the advantage that it works whether or not the client computer/browser has a Dhivehi font available, but then again it does require the client to have Flash installed and enabled. If you are only seeking to have nice one-line headline sort of text in Dhivehi then you might consider using sIFR.

Refer to Font Embedding help page at Adobe LiveDocs for details on font embedding in Flash.



5. WEFT

Web Embedding Fonts Tool (WEFT) is an Internet Explorer-only solution offered by Microsoft. It involves using the Windows-only WEFT utility to create font "objects" that can then be placed on web pages. This method is not recommended unless the target audience uses only Internet Explorer.

Refer to Microsoft WEFT page for more information.



6. TrueDoc

TrueDoc is a solution offered by Bitstream Inc. It is similar to Microsoft's WEFT in that TrueDoc creates an embeddable font resource called a Portable Font Resource. Any font (e.g. a Dhivehi font) can be loaded once users install a custom font "viewer" (called the Character Shape Player by the company). This solution is NOT free and requires the purchase of special software from Bitstream to produce the custom embeddable font packages.

Refer to the TrueDoc site for more information.



Good luck ;-)

Update (24-Nov-2008): Methods 1 and 2 rewritten for clarity and demos added.