
Guide to using Thaana on the WWW

Developing Dhivehi web pages is pretty easy and there are quite a few methods to do it. However, information on how to go about it seems to be lacking, leaving newbies stumped. Here is a general overview of the various methods for displaying Thaana on the WWW; it should contain enough information to help anyone, designer or programmer, get started.

1. CSS: rtl + bidi-override

This method is applicable only to non-Unicode text. It works on all modern browsers but requires the user to have at least one of the fonts specified in the page; otherwise the text is displayed as a mostly meaningless jumble of English letters.

This is the least-effort route to getting any non-Unicode Thaana text (such as text written using MS Word 97/2000, Accent Express, MLS or Faseyha Thaana) onto the web. The websites of Haveeru and Miadhu currently take this approach.

Usage:
To use this method, apply the following CSS to any HTML elements that contain Thaana text. You may use inline style attributes or CSS classes/IDs to achieve this. You may change the font names to suit your needs, but make sure you list several popular fonts and that the fonts specified are all non-Unicode fonts. You could, of course, also add further CSS styling (font size, font color, line height etc.) but the following are the required minimum.
font-family: A_Ilham, A_Randhoo, A_Faruma, A_Waheed;
direction: rtl;
unicode-bidi: bidi-override;

Demo:
View example



2. Unicode Dhivehi

This method is applicable to text in Unicode. It works well on all modern browsers but requires the user to have at least one Unicode Thaana font. Unlike method (1), the system falls back to a Thaana font it does have if it cannot find any of the fonts named in the page.

This is the best method for any new and modern Thaana-based website. It is used in the online Radheef, Jazeera Daily and Haama Daily.

Usage:
To use this method, first declare the page's character encoding as UTF-8 in the page's HTML HEAD section, typically with a meta charset declaration.


Next, apply the following CSS to any HTML elements that contain Thaana text. You may use inline style attributes or CSS classes/IDs to achieve this. You may change the font names to suit your needs, but make sure that the fonts specified are all Unicode fonts. You could, of course, also add further CSS styling (font size, font color, line height etc.) but the following are the required minimum.
font-family: Faruma, "MV Elaaf Normal";
direction: rtl;
text-align: right;

Demo:
View example



3. Image

This approach basically renders the Dhivehi text as an image. This is perhaps the most obvious method and was the only one available early on. However, it is still a pretty attractive solution, especially given that many computers just don't have the required fonts available. Using an image for the text removes the requirement for the client browser/computer to have the proper fonts available.

The basic approach of rendering the text into an image using Photoshop, MS Word etc. is pretty tedious as the process is entirely manual. However, there is a more sophisticated approach that renders the Dhivehi text into an image on the fly on the web server side (perhaps coupled with caching to reduce load). A server-side scripting language such as PHP can be used to render text into an image using any font of the designer's/programmer's choice. The rendered images (typically PNGs) are very small and hence have a negligible effect on the page load time in most cases.

Refer to the imagettftext function for details on how to do it in PHP.
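
The caching idea mentioned above can be sketched independently of the rendering itself. The following is only an illustration in JavaScript, not a PHP implementation: `createCachedRenderer` and the `renderImage` callback are hypothetical names, and the renderer used at the bottom is a stand-in that returns a string instead of real PNG data. The point is simply that each distinct combination of text, font and size is rendered once and then reused.

```javascript
// Wrap any renderer so that repeated requests for the same
// text/font/size combination reuse the earlier result.
// "renderImage" is a hypothetical callback: a real one would draw the
// text with the given font and return the image bytes.
function createCachedRenderer(renderImage) {
  const cache = new Map();
  return function render(text, font, size) {
    // The key must cover every input that affects the output image.
    const key = JSON.stringify([text, font, size]);
    if (!cache.has(key)) {
      cache.set(key, renderImage(text, font, size));
    }
    return cache.get(key);
  };
}

// Stand-in renderer that only counts how often it is called.
let renderCalls = 0;
const render = createCachedRenderer((text, font, size) => {
  renderCalls += 1;
  return "image of '" + text + "' in " + font + " at " + size + "px";
});

render("headline", "Faruma", 14);
render("headline", "Faruma", 14); // served from the cache
render("headline", "Faruma", 18); // different size, rendered again
console.log(renderCalls); // 2
```

On a real server the cache would more likely live on disk than in memory, so that rendered images survive between requests.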



4. Flash

This method uses text loaded in Macromedia Flash with the required font(s) embedded in the Flash clip. ActionScript and/or Flash variables are used to load the text into text areas in the Flash file. This method has the advantage that it works whether or not the client computer/browser has a Dhivehi font available, but then again it does require the client to have Flash installed and enabled. If you are only seeking to have nice one-line headline sort of text in Dhivehi then you might consider using sIFR.

Refer to Font Embedding help page at Adobe LiveDocs for details on font embedding in Flash.



5. WEFT

Web Embedding Fonts Tool (WEFT) is an Internet Explorer-only solution offered by Microsoft. It involves using the Windows-only WEFT utility to create font "objects" that can then be placed on web pages. This method is not recommended unless the target audience uses only Internet Explorer.

Refer to Microsoft WEFT page for more information.



6. TrueDoc

TrueDoc is a solution offered by Bitstream Inc. It is similar to Microsoft's WEFT in that TrueDoc creates an embeddable font resource called a Portable Font Resource. Any font (i.e. a Dhivehi font) can be loaded once users install a custom font "viewer" (called the Character Shape Player by the company). This solution is NOT free and requires the purchase of special software from Bitstream to produce the custom embeddable font packages.

Refer to the TrueDoc site for more information.



Good luck ;-)

Update (24-Nov-2008): Methods 1 and 2 rewritten for clarity and demos added.

JavaScript Dhivehi Character Recognition

Here is another of my pet projects brought back from the land of the deceased.

This one is called "JavaScript Dhivehi Character Recognition". It was created in early 2003 (or maybe late 2002) and made available on bichoo.net. Basically, it lets you draw a Thaana character using your mouse and then it "recognizes" what you have drawn. The purpose was mostly to satisfy my curiosity about artificial intelligence and pattern recognition at the time. However, it also hinted at a future where Dhivehi documents may be scanned in and processed by a computer to convert them to text, just as Optical Character Recognition technology has been doing for English documents. I think this rudimentary application was the first ever Dhivehi character recognition implementation released to the public. More interestingly, this seems to be the only character recognition implementation programmed in JavaScript floating around on the Internet even now. :-D

I spent a bit of time tonight reworking some bits of the code for clarity. The entire implementation is done using JavaScript and DHTML. You are welcome to study the code to see how it works. The code is well commented and may be a good starter on AI and pattern recognition basics. It uses a single-layer perceptron model to really simplify things; however, it is a good enough practical implementation to work for characters drawn on a 10x10 grid. The grid makes up the input data to the neural network. The neural network is hard-coded into the page and has definitions for each character in the alphabet. I do hope you are surprised by the accuracy of the recognition in this little application.
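
For readers who want the gist before opening the demo, here is a minimal sketch of the general idea and not the actual code from the page. It assumes one weighted-sum unit per character with the highest score winning, and its weights are derived directly from template shapes rather than trained. The two shapes ("vertical stroke" and "horizontal stroke") are placeholders standing in for real Thaana character definitions, and all function names are made up for illustration.

```javascript
const GRID_SIZE = 10;

// Build a flat 100-cell grid; isOn(row, col) says which cells are inked.
function makeGrid(isOn) {
  const cells = [];
  for (let row = 0; row < GRID_SIZE; row++) {
    for (let col = 0; col < GRID_SIZE; col++) {
      cells.push(isOn(row, col) ? 1 : 0);
    }
  }
  return cells;
}

// Assumption: weights are +1 where the template has ink and -1 elsewhere.
function weightsFromTemplate(template) {
  return template.map((cell) => (cell === 1 ? 1 : -1));
}

// Hard-coded "network": one weight vector per placeholder shape.
const units = {
  "vertical stroke": weightsFromTemplate(makeGrid((row, col) => col === 4)),
  "horizontal stroke": weightsFromTemplate(makeGrid((row, col) => row === 4)),
};

// Weighted sum of the 100 grid inputs for one unit.
function score(weights, grid) {
  return grid.reduce((sum, cell, i) => sum + cell * weights[i], 0);
}

// Return the label of the unit with the highest score.
function recognize(grid) {
  let bestLabel = null;
  let bestScore = -Infinity;
  for (const [label, weights] of Object.entries(units)) {
    const s = score(weights, grid);
    if (s > bestScore) {
      bestScore = s;
      bestLabel = label;
    }
  }
  return bestLabel;
}

// A slightly wobbly vertical line: column 4 for rows 0-7, column 5 below.
const drawn = makeGrid((row, col) => col === (row < 8 ? 4 : 5));
console.log(recognize(drawn)); // "vertical stroke"
```

Because every inked cell outside a template counts against that template, a drawing only has to resemble one shape more than the others to be recognized, which is why a wobbly stroke still matches.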

Have a look at it HERE. Let me know if you find it amusing... or not.

My company - Technova Pvt Ltd - is currently working on bringing full-fledged Dhivehi OCR software to the Maldivian public. It will probably be made available in early 2006, as a service for customers requiring bulk OCR processing. We shall be releasing Windows, Linux and Mac versions of the software for home and business use around mid 2006.

Dhivehi article on the lunar eclipse of 31st December

I published a Dhivehi article over at Muraasil on the lunar eclipse to occur on the 31st of this month.

Charles Anderson on dragonfly migration to the Maldives

Here's a TED Talk that should be of interest to all curious Maldivians. Charles Anderson, a British marine biologist who has been working and living in the Maldives for 26 years, reports on how his noticing the sudden emergence of dragonflies in the Maldives at certain times of the year led him to discover the world's longest migratory journey taken by any insect. It is a truly riveting story of curiosity and scientific discovery.

I now have an answer to a question I used to wonder about when I was a kid: Where do the dragonflies come from?

Last few days at...

I've changed home 6 times in the past 6 years and it's time again to indulge myself in the fun and hassle of packing up and moving to a new address to call home...

Thaana text rendering: A solution for devices without the required fonts

A few years ago I wrote a PHP-based Thaana text rendering class while investigating solutions to the problem of displaying Thaana text in web browsers on various devices. The class dynamically converts any given Thaana text into a formatted image of given dimensions and type. The use of images to display Thaana means that the information can be viewed on a large variety of devices and does away with the demand for the device to support Thaana fonts. On the flip side, the use of images does mean that this approach has higher bandwidth and data transfer requirements than text.

Features

The class makes use of the powerful image manipulation services provided by the GD library to create images from text and hence inherits the wide range of features it offers. However, since the GD library does not (or at least did not, back then) support right-to-left scripts and does not offer line wrapping to fit text within a bounding box, custom code had to be written to handle the unsupported text direction and formatting. The class also supports the use of any Thaana font, made possible by GD support for loading TrueType fonts.
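
To give a rough idea of what such custom formatting code involves, here is a small sketch in JavaScript rather than PHP. It is not the class described here: the function names are hypothetical, and `measureWidth` assumes every character has the same width, whereas a real renderer would ask the font for the measured width of each string. The sketch covers only greedy line wrapping and right alignment within a box; it does not deal with character ordering for right-to-left text.

```javascript
// Hypothetical width measurement: every character is charWidth pixels wide.
function measureWidth(text, charWidth) {
  return text.length * charWidth;
}

// Greedy wrapping: keep adding words to a line until the next one
// would exceed maxWidth. A single word wider than the box is left on
// its own line and will overflow.
function wrapLines(text, maxWidth, charWidth) {
  const words = text.split(/\s+/).filter((word) => word.length > 0);
  const lines = [];
  let current = "";
  for (const word of words) {
    const candidate = current === "" ? word : current + " " + word;
    if (current !== "" && measureWidth(candidate, charWidth) > maxWidth) {
      lines.push(current);
      current = word;
    } else {
      current = candidate;
    }
  }
  if (current !== "") {
    lines.push(current);
  }
  return lines;
}

// Right-align each wrapped line inside the box, as a right-to-left
// script needs: x is the left edge at which the line should be drawn.
function layoutRightAligned(text, boxWidth, charWidth, lineHeight) {
  return wrapLines(text, boxWidth, charWidth).map((line, index) => ({
    text: line,
    x: boxWidth - measureWidth(line, charWidth),
    y: (index + 1) * lineHeight,
  }));
}

console.log(layoutRightAligned("aa bb cc dd", 60, 10, 20));
```

Each returned entry gives the text of one line together with the position at which a drawing routine would place it.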

Applications

This piece of code was briefly put to live use around 2004 on the (now defunct) MUnet.net's Radheef service. More recently, it has been put to great use by Muraasil.com to display Thaana on their mobile service so that users can read news in Thaana on mobile devices, including Windows Mobile-based phones and the iPhone.

Demo

Give it a go and play around: Thaana text rendering demo.

I am not releasing the code publicly just yet...

CFR on Muraasil

Summer glow

A few snaps of the summer goodness from a few days ago...

Thaana Common Fonts Research

Thaana Common Fonts Research (CFR) is a Thaana-related research project that I launched late last month and that has been running since. Today, I finally got around to writing down some introductory information on the project, so here it is.

Introduction

This project will conduct some basic research into the prevalence and distribution of Thaana fonts.

Purpose

The investigation is aimed at obtaining:
- An understanding of the prevalence of individual Thaana fonts
- An understanding of the co-occurrence dynamics of Thaana fonts
- The distribution of Thaana Unicode and non-Unicode Thaana fonts
- The OS dependence of the fonts
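
As an illustration of the first two aims, prevalence and co-occurrence can be computed from the collected samples roughly as follows. This is only a sketch: the sample records are made up for the example (they are not project data), and the record shape with `os` and `fonts` fields as well as the function names are assumptions.

```javascript
// Made-up sample records; real samples would come from the server log.
const samples = [
  { os: "Windows", fonts: ["Faruma", "A_Ilham"] },
  { os: "Windows", fonts: ["Faruma"] },
  { os: "Mac", fonts: ["A_Ilham", "A_Randhoo"] },
  { os: "Linux", fonts: ["Faruma", "A_Ilham"] },
];

// Prevalence: the fraction of samples in which each font appears.
function fontPrevalence(records) {
  const counts = new Map();
  for (const record of records) {
    // A Set guards against a font being listed twice in one sample.
    for (const font of new Set(record.fonts)) {
      counts.set(font, (counts.get(font) || 0) + 1);
    }
  }
  const prevalence = {};
  for (const [font, count] of counts) {
    prevalence[font] = count / records.length;
  }
  return prevalence;
}

// Co-occurrence: how many samples contain both fonts of each pair.
function coOccurrence(records) {
  const pairs = {};
  for (const record of records) {
    const fonts = [...new Set(record.fonts)].sort();
    for (let i = 0; i < fonts.length; i++) {
      for (let j = i + 1; j < fonts.length; j++) {
        const key = fonts[i] + " + " + fonts[j];
        pairs[key] = (pairs[key] || 0) + 1;
      }
    }
  }
  return pairs;
}

console.log(fontPrevalence(samples));
console.log(coOccurrence(samples));
```

Grouping the same records by the `os` field before counting would give a first look at the OS dependence of each font.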

Significance

This study will help us to:
- Get a first look into the distribution of Thaana fonts
- Develop recommendations for the use of fonts on the web
- Develop recommendations for the use of fonts in software and in documents
- Formulate plans for improving the reach of Thaana (and hence, Dhivehi)

Method

The research is conducted via the World Wide Web by sampling the fonts installed on the devices used by Maldivian web users.

The process is as follows:
1) A small, invisible Flash-based data collector is embedded into websites.
2) When a user visits a participating website, the data collector automatically compiles a list of the fonts installed on the system. This is done once per user device.
3) The font names and the operating system of the user are sent to my server where the data is logged for later analysis.

It is intended that data sample collection will be carried out until the end of this month (June 2009).

Participate

Webmasters and website owners can participate and contribute to this research by embedding the Flash-based data collector using the HTML code shown below into their website. Please change the DOMAINHERE bit to the domain name of your site so that I know who to chase if there are issues. The field is also used to note your contribution and participation in the project.
<object data="http://labs.jawish.org/cfr/cfr.swf" height="1" width="1" type="application/x-shockwave-flash">
	<param name="flashvars" value="site=DOMAINHERE" />
	<param name="movie" value="http://labs.jawish.org/cfr/cfr.swf" />
</object>


If you operate a high traffic Thaana-based website, I urge you to consider participating and help make this project a success. My thanks in advance!

Statistics

As of writing this post, 2293 data samples have been collected and 269 Thaana fonts have been identified and are being tracked.

You can see LIVE stats on the CFR project home.