Towards a (true) Dhivehi search engine

As much as I would like the Dhivehi language to die and rot away, it seems it won't happen, at least for a while. The (relatively) newly minted freedom to publish newspapers and the growth of web-based news sites may have poised Dhivehi for a serious revival. The revival probably isn't so much in terms of improvements in vocabulary or other linguistic changes but rather in terms of the amount of information now being pumped out in Dhivehi - and in my opinion, that's a great start.

A (if not THE) point worth noting here is that much of this new information is being produced - and published - by digital means. Most government authorities now have web portals and an increasing number of them maintain them diligently. Most, if not all, newspapers and magazines also seem to maintain web portals with their content made available online. This modern revival thus presents a very interesting and very modern problem (to geeks like me at least :-P): accessing all of it. It is probably the first time in Maldivian history that a "Dhivehi search engine" makes practical sense.

Now, I am aware that Google and other search engines can be used to search for Dhivehi and I'm also aware that there are a few local operations that purport/aspire to be Maldivian search engines but they all share important shortcomings. These shortcomings are mostly inherent to the various methods of writing Thaana as used on the World Wide Web.

Say you want to search for the word "rayyithunge". Typing that into a search engine would bring an entirely different set of results from typing in "rwacyituncge" or "ރައްޔިތުންގެ" - both of which are alternative forms of representing the same thing in Dhivehi. The results differ because different sites use different representation schemes. A search for "rayyithunge" brings in pages that seem to mostly contain English, because "rayyithunge" is Dhivehi "Latin"ised so that standard English characters can be used to write Dhivehi words. People commonly use such Latinised Dhivehi when writing emails or chatting - say "haalu kihineththa" etc. Meanwhile, a search for "rwacyituncge" lists content from sites like Haveeru and Miadhu, which use standard ASCII coupled with custom Dhivehi fonts that map each ASCII character to a Thaana glyph. If you try copy-pasting something written on the Haveeru page you'd see that it comes out as a seemingly meaningless jumble of letters. Lastly, a search for "ރައްޔިތުންގެ" brings in results from sites like Minivan Daily and Sangu Daily, which use Unicode to display Dhivehi. Anyway, the technical explanations aside, the point is that Dhivehi search is (currently) a messy enterprise.

The solution to this problem can (seem to) be pretty simple. A custom search interface could simply take the search query from a user, convert it into the three different representation schemes and then spawn a search for each representation on any of the existing search engines. This would work just fine... until you run into peculiar problems related to the Latinised and ASCII Dhivehi schemes. Take for example the word "ފަލަ" Latinised into "fala" - a search on the word would return almost entirely non-Dhivehi results totally unrelated to what we really want. Similarly, a search on the ASCII'ed phrase "Oled" (which is the word "ދެލޯ") would return a large number of non-Dhivehi results with no bearing on what we wanted. These problems occur because Latinised and ASCII Dhivehi representations can produce text that has meaning in English as well - such as the case of "Oled" above, which happens to be a popular technical term in English.
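To make the "convert the query" step a bit more concrete, here is a minimal Python sketch of going from a Unicode query to its ASCII-font form. The table is an assumption on my part: it lists only the key-to-letter pairs that can be read off the "rwacyituncge" example above, and a real converter would need the full table for the fonts in question. The function names are made up for illustration. The Latinised form is left for the user to supply, since it has no fixed spelling rules.

```python
# Partial table: ASCII-font key -> Unicode Thaana character.
# Assumption: only the pairs visible in the "rwacyituncge" example are
# listed here; a real table would cover every letter and vowel sign.
ASCII_TO_THAANA = {
    "r": "\u0783",  # raa
    "a": "\u0787",  # alifu
    "y": "\u0794",  # yaa
    "t": "\u078c",  # thaa
    "n": "\u0782",  # noonu
    "g": "\u078e",  # gaafu
    "w": "\u07a6",  # abafili (vowel sign)
    "i": "\u07a8",  # ibifili (vowel sign)
    "u": "\u07aa",  # ubufili (vowel sign)
    "e": "\u07ac",  # ebefili (vowel sign)
    "c": "\u07b0",  # sukun
}
THAANA_TO_ASCII = {thaana: key for key, thaana in ASCII_TO_THAANA.items()}


def unicode_to_ascii_font(text):
    """Rewrite Unicode Thaana as the ASCII keys a custom-font site stores.

    Characters missing from the (partial) table pass through unchanged.
    """
    return "".join(THAANA_TO_ASCII.get(ch, ch) for ch in text)


def query_variants(unicode_query, latin_query=None):
    """Return the forms of a query to send to an existing search engine."""
    variants = [unicode_query, unicode_to_ascii_font(unicode_query)]
    if latin_query:
        variants.append(latin_query)
    return variants


# The word from the example above, written with escapes for clarity.
RAYYITHUNGE = "\u0783\u07a6\u0787\u07b0\u0794\u07a8\u078c\u07aa\u0782\u07b0\u078e\u07ac"
print(query_variants(RAYYITHUNGE, latin_query="rayyithunge"))
```

Each variant would then be sent as its own search, which is exactly where the "fala" and "Oled" style collisions creep in.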

A more sophisticated approach to the search problem probably could successfully iron out (most of) these quirks. An ideal solution would be to do away with the existing search engines such as Google, despite their awesomeness, and develop a custom search engine. A custom engine would allow for the recognition of the various representation schemes used and the subtle differences between them. A search phrase entered on such an engine would perhaps standardize the phrase and search through a standardized index to return results that are a better mirror of the Dhivehi content that is out there. Such a custom search engine could bundle in extra Dhivehi-related facilities such as conversions to allow for lack of (particular) fonts as used on sites and spelling correction among others.
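As a rough illustration of the "standardized index" idea, here is a small Python sketch in which every page is converted to Unicode before it is indexed, so a single query finds the word whichever scheme the site uses. Everything here is hypothetical: the class and function names, the example.com URLs, and the assumption that the crawler already knows which scheme a site uses. The conversion table is partial and covers only the letters in the "rwacyituncge" example.

```python
from collections import defaultdict

# Partial ASCII-font -> Unicode Thaana table (assumption: only the keys
# needed for the "rwacyituncge" example; a real table would be complete).
ASCII_FONT_TABLE = str.maketrans({
    "r": "\u0783", "a": "\u0787", "y": "\u0794", "t": "\u078c",
    "n": "\u0782", "g": "\u078e", "w": "\u07a6", "i": "\u07a8",
    "u": "\u07aa", "e": "\u07ac", "c": "\u07b0",
})


def normalise(text, scheme):
    """Convert text to Unicode Thaana according to the scheme it was written in."""
    if scheme == "ascii-font":
        return text.translate(ASCII_FONT_TABLE)
    return text  # "unicode" text is already in the standard form


class ThaanaIndex:
    """Tiny inverted index keyed on the normalised (Unicode) word."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, url, text, scheme):
        for word in normalise(text, scheme).split():
            self._postings[word].add(url)

    def search(self, query, scheme="unicode"):
        return sorted(self._postings.get(normalise(query, scheme), set()))


index = ThaanaIndex()
# Hypothetical pages: one ASCII-font site and one Unicode site.
index.add("http://ascii-site.example.com/1", "rwacyituncge", "ascii-font")
index.add("http://unicode-site.example.com/1",
          "\u0783\u07a6\u0787\u07b0\u0794\u07a8\u078c\u07aa\u0782\u07b0\u078e\u07ac",
          "unicode")
print(index.search("rwacyituncge", scheme="ascii-font"))
```

Because the scheme is recorded per site, an ASCII-font word like "Oled" is only ever matched against pages known to be Dhivehi, which sidesteps the English collisions described earlier.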

So, perhaps the question now is, is there a real need for a Dhivehi search engine yet? When should a Maldivian "Google" be born?

Guide to using Thaana on the WWW

Developing Dhivehi web pages is pretty easy and there are quite a few methods to do it. However, information on how to go about it seems to be lacking, leaving newbies stumped. Here is a general overview of the various methods for displaying Thaana on the WWW. It should contain enough information to help anyone, designer or programmer, get started.

1. CSS: rtl + bidi-override

This method is applicable only to non-Unicode text. It works on all modern browsers but requires the user to have at least one of the fonts specified in the page - otherwise the text is displayed as a mostly meaningless jumble of English letters.

This is the least-effort route to getting any non-Unicode Thaana text (such as text written using MS Word 97/2000, Accent Express, MLS or Faseyha Thaana) on to the web. The websites of Haveeru and Miadhu currently take this approach.

Usage:
To use this method, apply the following CSS to any HTML elements that contain Thaana text. You may use inline style attributes or CSS class/ids to achieve this. You may change the font names to suit your needs but make sure you list several popular fonts and that the fonts specified are all non-Unicode fonts. You could, of course, also add further CSS styling (font size, font color, line height etc) but the following are the required minimum.
font-family: A_Ilham, A_Randhoo, A_Faruma, A_Waheed;
direction: rtl;
unicode-bidi: bidi-override;

Demo:
View example



2. Unicode Dhivehi

This method is applicable to text in Unicode. It works well on all modern browsers but requires the user to have at least one Unicode Thaana font. Unlike method (1), if the system cannot find any of the fonts named in the page, it falls back to whatever Thaana font it does have.

This is the best method for any new and modern Thaana-based website. It is used in the online Radheef, Jazeera Daily and Haama Daily.

Usage:
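To use this method, first declare the page's encoding in its HTML HEAD section. A UTF-8 charset declaration such as the following is the usual way to do this.
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">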

Next, apply the following CSS to any HTML elements that contain Thaana text. You may use inline style attributes or CSS class/ids to achieve this. You may change the font names to suit your needs but make sure that the fonts specified are all Unicode fonts. You could, of course, also add further CSS styling (font size, font color, line height etc) but the following are the required minimum.
font-family: Faruma, "MV Elaaf Normal";
direction: rtl;
text-align: right;

Demo:
View example



3. Image

This approach basically renders the Dhivehi text as an image. This is perhaps the most obvious method and was the only one available early on. However, it is still a pretty attractive solution, especially given that many computers just don't have the required fonts available. Using an image for the text removes the need for the client browser/computer to have the proper fonts.

The basic approach of rendering the text into an image using Photoshop, MS Word etc is pretty tedious as the process is entirely manual. However, there is a more sophisticated approach that renders the Dhivehi text into an image on the fly on the web server (perhaps coupled with caching to reduce load). A server-side scripting language such as PHP can be used to render text into an image using any font the designer/programmer chooses. The rendered images (typically PNGs) are very small and hence have a negligible effect on page load time in most cases.

Refer to the imagettftext function for details on how to do it in PHP.
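The caching part is independent of how the image actually gets drawn, so here is a minimal Python sketch of just that piece. It assumes a `render` callable that returns PNG bytes for a given text, font and size (in PHP that would be a wrapper around imagettftext). The function names and the cache layout are made up for illustration.

```python
import hashlib
from pathlib import Path


def cache_path(cache_dir, text, font, size):
    """Derive a stable file name from everything that affects the image."""
    key = f"{font}|{size}|{text}".encode("utf-8")
    return Path(cache_dir) / (hashlib.sha1(key).hexdigest() + ".png")


def get_image(cache_dir, text, font, size, render):
    """Return the path of the rendered image, rendering only on a cache miss.

    `render(text, font, size)` is a hypothetical callable that returns the
    image as PNG bytes; it is only invoked when no cached file exists yet.
    """
    path = cache_path(cache_dir, text, font, size)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(render(text, font, size))
    return path
```

A headline that appears on every page is then rendered once and served as a static file afterwards. Changing the text, the font or the size produces a different file name, so stale images are never served by mistake.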



4. Flash

This method uses text loaded in Macromedia Flash with the required font(s) embedded in the Flash clip. ActionScript and/or Flash variables are used to load the text into text areas in the Flash file. This method has the advantage that it works whether or not the client computer/browser has a Dhivehi font available, but then again it does require the client to have Flash installed and enabled. If you are only seeking to have nice one-line headline sort of text in Dhivehi then you might consider using sIFR.

Refer to Font Embedding help page at Adobe LiveDocs for details on font embedding in Flash.



5. WEFT

Web Embedding Fonts Tool (WEFT) is an Internet Explorer-only solution offered by Microsoft. It involves using the Windows-only WEFT utility to create font "objects" that can then be placed on web pages. This method is not recommended unless the target audience uses only Internet Explorer.

Refer to Microsoft WEFT page for more information.



6. TrueDoc

TrueDoc is a solution offered by Bitstream Inc. It is similar to Microsoft's WEFT in that TrueDoc creates an embeddable font resource called a Portable Font Resource. Any font (i.e. a Dhivehi font) can be loaded once users install a custom font "viewer" (called the Character Shape Player by the company). This solution is NOT free and requires the purchase of special software from Bitstream to produce the custom embeddable font packages.

Refer to the TrueDoc site for more information.



Good luck ;-)

Update (24-Nov-2008): Method 1 and 2 rewritten for clarity and demos added.

IETF: The twelve networking truths

Whether you are a network engineer, a technician or just use computers for the occasional game and porn, the Internet Engineering Task Force's RFC 1925 is a must read. The Network Working Group supposedly produced this indispensable memo to document a few things about networking that almost every networking and computer training course happily skips on.

- Check out IETF RFC 1925 ;-)

Multi-touch computing: simply amazing!

I was very excited when I first saw the multi-touch-screen technology demo by Jeff Han on TED Talks earlier this year. Like Jeff said in his talk, it hinted at what new turns standard human-computer interaction might take in the near future. A lot of different researchers and companies had been working on it for at least a decade, but Jeff's demo was the first of its kind that I had seen that delivered such an impressive and seemingly feature-complete product. However, since it was just a technology demo I expected to be left to drool at this marvel till the technology is perfected and hits the market in a few years.

It really didn't occur to me that such products may hit the market as soon as this year. So, I was very surprised when Microsoft recently announced their Surface computing device for release in November! Their "Surface" product delivers the full multi-touch computing experience with an interaction surface the size of a coffee table. Apparently, it can track up to 52 touch points and can even recognize objects placed on it. The product essentially follows similar technology to what was demoed at TED Talks by Jeff. But what really astounded me were the technology demos that Microsoft and technology reviewers have published on the product. Microsoft seems to have done a lot of mock applications to show how the multi-touch interface can be used and exploited towards a radically fresh computing experience. This really is a case of seeing is believing (and being impressed) and requires a look at the demo videos.

Sadly though, with the product's supposed price tag of around US$ 5000, it really packs a blow to the wallet. The price will certainly go down as more multi-touch devices from other companies appear on the market. Apple has already incorporated multi-touch technology on their soon-to-be-released iPhone but will deliver the multi-touch experience at a smaller scale.

Check out the video below of Microsoft's Surface - there's more on YouTube. If anyone would like to give me a spontaneous gift for any reason, I surely wouldn't mind receiving one of these babies! ;-)

Web Operating Systems: a personal review...

There are Web OSes springing up on the internet left and right these days. The web operating system, in its broadest definition, includes everything from complete browser based operating-system-like environments to terminal access-like services. I've been keeping a keen eye on the developments, partly because I think it will become one of the next big raves on the internet and partly because I find such services quite useful.

The currently active Web OS services all have free sign up available or at least demo versions for try-outs. Here are a few I've jumped through:

Oos

I quite liked the looks of Oos although I must say it is very very basic and very much incomplete for the moment. However, their interface loads fast and is clean and uncluttered. They've gone to lengths to copy the Windows looks and styles though, which may not sit well with die-hard users of other OSes.
- Oos homepage


EyeOS

EyeOS is an open source project towards the development of a web operating system and has the source available for download, allowing you to install it on your own site or intranet. The basic package has office, PIM and some utilities bundled in the download. They have a separate website EyeApps where further "applications" for EyeOS can be found.
- EyeOS homepage


YouOS

This is one of the more famous of the current bunch of WebOSes despite not being the best. There are a few applications available on it - a text editor, an instant messenger, notes app and a couple more utilities. The interface isn't too pleasing and the menu systems aren't that user friendly either. That said, it is quite usable though if all you want is the very basics.
- YouOS homepage


AstraNOS

AstraNOS failed to impress me a single bit. The interface was ugly and cluttered and lacked any decent feature. Their approach seems to be more towards amalgamating existing independent web services and applications and providing links to those services. Seems like just another WebOS attempt which totally fails to hit any mark, in my humble opinion.
- AstraNOS homepage


Desktoptwo

DesktopTwo is definitely one of the better web OSes around. There is a number of simpler web-based applications (e.g. an instant messenger, mail application, address book, mp3 player) available in addition to the full OpenOffice package and Acrobat Reader applications which seem to be instantiated separately via VNC connections. The interface uses Adobe Flash and is quite pretty and usable. They also offer 1GB of storage space for free to get started.
- Desktoptwo homepage


Fenestela

This WebOS is totally based on the Windows looks - Windows XP to be more exact. There are a few applications such as a HTML editor, a text editor and some utilities available already. This is a commercial product, although I can't really see why anyone would want to purchase this... Ahem.
- Fenestela homepage


Glide

Glide is definitely one of the better and more feature rich WebOSes around. A text editor, music player, email, calendar, contacts and even a photo editor application are available. They also provide 2 GB of free storage space. I'd use this as soon as I get over my disgust for their appalling interface!
- Glide homepage


CorneliOS

An open source project that seems to be producing a quite impressive platform. It is multi-user web OS software that is available for download and comes complete with user management, access control as well as a content management system. It maintains separate user directories and individual desktop environments. It is quite feature rich with office applications, a calendar and development applications, and has a number of settings for controlling the operations and looks of the desktop environment.
- CorneliOS homepage


Goowy

Goowy is far from being gooey and sports a pretty and very nifty interface. At the moment it has instant messenger, email, calendar, contacts and file management features available. Sadly, it is missing an office package, which I reckon should be essential to any web OS. They have a feature called minis, which are basically widgets/gadgets that perform little utility tasks or act as information displays. Goowy makes itself less attractive thanks to the lack of the office package and may well be gooey for now feature-wise.
- Goowy homepage


SSOE

One of the worst Web OSes I've come across! It's done in all Adobe Flash, extremely slow and buggy. Nuff said.
- SSOE homepage


DesktopOnDemand

DoD takes a different approach to a web OS in that theirs is not browser based but rather provides remote terminal access to a hosted OS environment - one based on Linux and Gnome. Personally, I think this is the best approach to creating a Web OS, as browser based OSes can be notoriously slow and make the mistake of relying on the stateless (and inherently vulnerable) HTTP protocol for communications.

The DoD approach provides access to the OS via any NX client and has the option of using a browser based Java plugin as well. They provide 1 GB of free storage and the data can be accessed without entering the OS by using their web based file manager. NX technology uses compression on its data communications and achieves surprising performance. The DoD desktop was as fast as, if not faster than, any of the browser based web OSes listed above, at least on my broadband connection. DoD also benefits from NX's use of SSH encryption for data communications, making it a very safe way to browse. It won't leave any discernible logs, can't be sniffed/tapped easily and you can store data and browse/chat without leaving any traces behind on the computers that are used to access it. These are great plus points when looking for a practical web OS that can be accessed from anywhere and is safe.

There is a useful set of applications available as well: office apps, GIMP, instant messenger, browser, video/music player etc. This is my favourite for now and I reckon many others will like this one - especially the Linux fans!
- DesktopOnDemand homepage


CosmoPOD

CosmoPOD takes the same approach as DesktopOnDemand by providing remote terminal access to a KDE-based Linux desktop. CosmoPOD provides a lot more applications bundled in with their service: there is the complete OpenOffice package, IRC/IM clients, mail/newsgroup readers, project/money management software, web development package, a programming IDE, raster/vector graphics editors and a bunch of the usual KDE utilities as well. This alone makes this one of the most desirable web/online OS services around!

CosmoPOD also provides 1 GB free storage and an online browser based file manager that can be accessed without using the NX client.

Sad thing is the free offering is annoyingly slow and also shows advertising banners on the desktop. They do offer the option of switching to a premium service that gives fast access, more applications and control.
- CosmoPOD homepage

Enjoy :-)

Developing apps for mobiles with Adobe Flash Lite

Writing (simple) programs and games for mobile phones has gotten a whole lot easier thanks to Adobe/Macromedia's Flash Lite technology. It basically extends their flagship Flash presentation engine to the realm of mobile handsets by providing a player, similar to the one installed on desktop PCs, for many mobile phones. The product effectively paves the way for the many animators, web developers and even beginners to easily develop rich programs for mobiles without having to delve into C or Java, and presents a more attractive solution than MIDP or BREW for creating device-independent light applications.

Anyone familiar with the standard Flash development techniques can quickly develop/port to Flash Lite and get it working. Transparent internet access, XML/HTML support and ActionScript are all supported - allowing for quite sophisticated applications to be built with ease. Flash Lite is included in the recent Adobe Flash CS3 release or can be added to Flash Professional 8 by downloading the free Flash Lite authoring update from the Adobe website. Many of the popular phones are supported and developers can download/update development profiles for different phones.

The possibilities for applications are numerous - games, data access front-ends etc - and Flash Lite might be a great way for businesses to provide interactive information or services to customers, probably at a lower development cost!

- Adobe Flash Lite product page
- Adobe Mobile & Devices Developer Center


Quick n dirty implementation of a Reverse Number lookup application in Flash Lite for Nokia phones running S60

Thermite: Towards the rapid destruction of hard disk(s)

A friend of mine, a very paranoid and drama-loving fellow, asked me recently for some suggestions on how to rapidly destroy a computer hard disk. He wanted to destroy his hard drive banks "if police came to get him". I don't know what possible reason the police may have for wanting him and his hard drives but being the dramatic fellow he is, I know he'd want to do it purely for the drama alone. Anyway, my solution to him was simple and most importantly, very dramatic: thermite!

A thermite reaction is an exothermic chemical reaction that generates temperatures reaching up to 2500 °C - more than enough to melt the entire hard drive and entirely destroy the magnetic lining of the platters. Thermite consists of aluminium and iron(III) oxide (better known as rust!). The two are to be mixed in a ratio of approximately 8 or 9 parts iron oxide to 3 parts aluminium. The aluminium needs to be powdered and can be obtained by filing/sanding soft drink cans or aluminium tubing. The iron oxide (rust) can be shaved off from a rusting iron rod and should be in powder form as well. I've successfully tried with aluminium filed off aluminium tubing that I had purchased for building an antenna and rust collected from a bunch of iron nails that had been rotting away.

Ignition is the most important part for getting the thermite reaction going and isn't an easy step for the method described above. The easiest way, which I recommended to my friend, was to use a magnesium strip. They aren't available in Male' but can be ordered online or purchased from abroad easily. Alternatively, I suggested, convince a chemistry student at CHSE to get some - they often use it as part of their weekly practicals and throw out a load of half-used strips to the bin anyway!

Finally, the thermite mixture can be placed in a little container and a piece of magnesium strip neatly stuck into it. The container can be placed on top of the hard drive with the magnesium strip accessible and ready for lighting when required! Once ignited, nothing will stop the reaction and extinguishers - be it water, foam or CO2 - will miserably fail too.

Needless to say, the prospects of having his hard disks melted in a fiery fire as cops watch helplessly really excited my friend...

Further info:
- http://www.ilpi.com/genchem/demo/thermite/
- Watch some thermite reactions

Note: Thermite is not fun play - stand well away and avoid using large amounts. Do be careful if you are curious enough to experiment ;-)