Guide to using Thaana on the WWW - updated

I published an article last year, titled "Guide to using Thaana on the WWW", with the aim of presenting a quick overview of the various approaches/methods for developing Thaana-based websites. It introduced 6 different methods and included enough implementation details to help a beginner get started. I've now rewritten bits of the article for increased clarity and also added some examples to help fortify the usage instructions.

Click here to read the updated article.

Athuliyun: Thaana Handwriting Recognition demo

I dug up this old project from my backup disks today and worked a little magic to bring it back to life. This was and still is among my favourite experiments. I developed this software, named "Athuliyun", shortly after I bought my first PDA around 2005, with the goal of getting Thaana handwriting recognition on the platform. I didn't have much experience with software development for Windows CE (a.k.a. Windows Mobile), so it ended up being a Windows application. The project got binned when my interests moved to Optical Character Recognition for document scanning...

As it stands now, Athuliyun supports the Thaana characters but not the fili (diacritics). This of course severely limits its practical use, but I reckon adding support for fili would be a relatively trivial task. I will be releasing this publicly, hopefully later this month, after adding that functionality and also retraining the recognition neural networks used in the software for improved performance.

Anyway, below is a short screencast of the application where you can see me scribble Thaana letters quite clumsily using the touchpad on my laptop - let's call it a software/technology preview ;-)



[An alternative lower-quality version can be found on YouTube]

Latin Thaana Converter 2.0

Latin Thaana Converter is a small, simple program for Microsoft Windows that transliterates latinized (i.e. romanized) Thaana back into the Thaana script. This is a tool I originally released in 2003 under the name "Latin Dhivehi Converter"/"Lat2Dhiv". This new release carries a new name (which I think is a more technically correct name for what it does) and sports a few aesthetic changes but is functionally almost exactly the same as the original - it is basically a recompile of my old code within the .NET Framework.

Automated transliteration of Latin Thaana is not an entirely easy task. Lookup-table-based algorithms are simple to implement but are unable to correctly handle cases of sukun, have issues with most other fili and generally have a host of other problems as well. Latin Thaana Converter utilizes a finite state machine, and its transliteration mappings are based on a more extensive scheme extracted from an analysis of a body of Latin Thaana-to-Thaana sample data. It may be worth mentioning that the analysis revealed that up to 4 Latin characters were being used (and needed) for some Thaana transliterations. However, it must be said that the quality of the transliteration is limited by the accuracy and diversity of the sample data I used, and hence it is by no means perfect.
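
To make the state machine idea a little more concrete, here is a rough Python sketch. It is not the converter's actual code or mapping scheme: the tables are a small hand-picked subset of a common romanization, the names (CONSONANTS, VOWELS, tokenize, transliterate) are made up for illustration, and the sukun rule is deliberately naive (a consonant that is not followed by a vowel simply gets a sukun). It only shows how tracking one bit of state, "is a consonant still waiting for its fili?", already does something a plain look-up table cannot.

```python
# Illustrative sketch only: a toy Latin-to-Thaana transliterator.
# The mappings are an assumed, partial romanization, not the scheme
# used by Latin Thaana Converter.

ALIFU = "\u0787"   # carrier letter for a vowel with no consonant
SUKUN = "\u07b0"   # marks a consonant that carries no vowel

CONSONANTS = {
    "h": "\u0780", "sh": "\u0781", "n": "\u0782", "r": "\u0783",
    "b": "\u0784", "lh": "\u0785", "k": "\u0786", "v": "\u0788",
    "m": "\u0789", "f": "\u078a", "dh": "\u078b", "th": "\u078c",
    "l": "\u078d", "g": "\u078e", "s": "\u0790", "d": "\u0791",
    "z": "\u0792", "t": "\u0793", "y": "\u0794", "p": "\u0795",
    "j": "\u0796", "ch": "\u0797",
}
VOWELS = {
    "a": "\u07a6", "aa": "\u07a7", "i": "\u07a8", "ee": "\u07a9",
    "u": "\u07aa", "oo": "\u07ab", "e": "\u07ac", "ey": "\u07ad",
    "o": "\u07ae", "oa": "\u07af",
}
MAX_TOKEN_LEN = 4  # the post notes some mappings need up to 4 Latin characters


def tokenize(text):
    """Split text into (kind, value) pairs using greedy longest match."""
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(MAX_TOKEN_LEN, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in CONSONANTS:
                tokens.append(("consonant", CONSONANTS[chunk]))
                break
            if chunk in VOWELS:
                tokens.append(("vowel", VOWELS[chunk]))
                break
        else:
            # Nothing matched: pass the character through unchanged.
            size = 1
            tokens.append(("other", text[i]))
        i += size
    return tokens


def transliterate(text):
    """Two-state machine: 'pending' is True while the last consonant has no fili."""
    out = []
    pending = False
    for kind, value in tokenize(text.lower()):
        if kind == "vowel":
            if not pending:
                out.append(ALIFU)   # bare vowel sits on alifu
            out.append(value)
            pending = False
        else:
            if pending:
                out.append(SUKUN)   # close the previous vowel-less consonant
            out.append(value)
            pending = (kind == "consonant")
    if pending:
        out.append(SUKUN)
    return "".join(out)


if __name__ == "__main__":
    for word in ("dhivehi", "thaana", "aharen"):
        print(word, "->", transliterate(word))
```

Even this toy version gets words like "dhivehi" and "aharen" right, but it fails on doubled consonants and the other sukun special cases, which is exactly where a richer set of states and mappings becomes necessary.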

Since writing this program in 2003, I have experimented with probabilistic FSMs and also put machine learning techniques to the task with better results. I plan to write more extensively on Thaana transliteration algorithms at a later time...

Usage

1. Copy-paste or type the Latin Thaana text into the "Text in Latin Thaana" box.
2. Click "Convert".
3. The converted text appears in the "Text in Thaana" box.

Download

- Latin Thaana Converter 2.0 Installer (126KB, MS Windows)
- Latin Thaana Converter 2.0 Executable only (22.8KB, MS Windows)

Hope someone finds it useful :-)

My new favourite show: Prototype This!

I'm in love with this new series on Discovery Channel called Prototype This!. The show has four engineers coming up with creative technological "inventions" and developing a rapid prototype within 2 weeks. The show definitely has massive appeal to geeks like me who want to tinker around, invent and just build stuff.

The show is broadcast every Wednesday on Discovery Channel but can be downloaded off the web too, if you know where to look.