Thaana conversions class for PHP 5 - v0.2

Here is an update to the Thaana conversions class I released in Nov 2007. Version 0.2 expands the range of conversions available and should be adequate for almost all uses. Most importantly, it adds UTF-8 conversion functions, allowing more flexibility in PHP-based Unicode/UTF-8 Thaana handling. The class is also now licensed under the liberal, open-source MIT License. The code still relies solely on core PHP 5 functions and does not require any extra PHP extensions to be installed.

Functions exposed by the class:
- convertUtf8ToUnicodeIntegers()
Convert UTF-8 data to Unicode character integer representations

- convertUtf8ToAscii()
Convert UTF-8 data to Ascii

- convertEntitiesToUnicodeIntegers()
Convert a string of HTML Unicode entities to an array of Unicode character integers

- convertEntitiesToUtf8()
Convert HTML Unicode entities to UTF-8

- convertEntitiesToAscii()
Convert HTML Unicode entities to Dhivehi Ascii equivalents

- convertUnicodeIntegersToUtf8()
Convert Unicode Integer array to UTF-8

- convertUnicodeIntegersToEntities()
Convert Unicode char integers to HTML entities

- convertUnicodeIntegersToAscii()
Convert Unicode char integers to Ascii

- convertAsciiToUtf8()
Convert Ascii Thaana to UTF-8

- convertAsciiToUnicodeEntities()
Convert Ascii Thaana to Unicode HTML entities

- convertAsciiToUnicodeIntegers()
Convert Ascii Thaana to an array of Unicode integers
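
Several of the functions above take or return an array of Unicode integers (code points). As a rough illustration of what that intermediate form looks like, here is a minimal sketch in core PHP that decodes UTF-8 into code points and writes them back out as HTML entities. The function names sketchUtf8ToIntegers() and sketchIntegersToEntities() are made up for this example and are not part of the class; the class's own implementation may well differ, and the sketch assumes well-formed UTF-8 input.

```php
<?php
// Hypothetical helpers for illustration only; not the class's own code.

// Decode a UTF-8 string into an array of Unicode code points.
// Assumes well-formed UTF-8; malformed input is not validated.
function sketchUtf8ToIntegers($utf8)
{
    $codePoints = array();
    $length = strlen($utf8);
    $i = 0;
    while ($i < $length) {
        $byte = ord($utf8[$i]);
        if ($byte < 0x80) {                    // 1-byte sequence (plain ASCII)
            $cp = $byte;
            $extra = 0;
        } elseif (($byte & 0xE0) === 0xC0) {   // 2-byte sequence (covers the Thaana block)
            $cp = $byte & 0x1F;
            $extra = 1;
        } elseif (($byte & 0xF0) === 0xE0) {   // 3-byte sequence
            $cp = $byte & 0x0F;
            $extra = 2;
        } else {                               // 4-byte sequence
            $cp = $byte & 0x07;
            $extra = 3;
        }
        // Each continuation byte contributes its low 6 bits.
        for ($j = 1; $j <= $extra && $i + $j < $length; $j++) {
            $cp = ($cp << 6) | (ord($utf8[$i + $j]) & 0x3F);
        }
        $codePoints[] = $cp;
        $i += $extra + 1;
    }
    return $codePoints;
}

// Write each code point as a decimal HTML entity, e.g. 1931 becomes "&#1931;".
function sketchIntegersToEntities($codePoints)
{
    $entities = '';
    foreach ($codePoints as $cp) {
        $entities .= '&#' . $cp . ';';
    }
    return $entities;
}

// "\xDE\x8B\xDE\xA8" is the UTF-8 encoding of U+078B followed by U+07A8.
$integers = sketchUtf8ToIntegers("\xDE\x8B\xDE\xA8");
$entities = sketchIntegersToEntities($integers);
echo $entities; // prints &#1931;&#1960;
```

Going through an integer array keeps each conversion small: every format only needs a decoder into integers and an encoder out of them.
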

Usage:
<?php
$thaana = new Thaana_Conversions();
echo $thaana->convertEntitiesToAscii('&#1931;&#1960;&#1928;&#1964;&#1920;&#1960;');
echo $thaana->convertAsciiToUtf8('rWacje');
?>
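
If you are curious what an entities-to-UTF-8 conversion involves, the sketch below does it with nothing but core PHP functions, using the same entity string as the usage example. It is not the class's code: sketchEntitiesToIntegers(), sketchIntegerToUtf8() and sketchIntegersToUtf8() are hypothetical names invented here, and the sketch only handles decimal entities of the form &#NNNN;.

```php
<?php
// Hypothetical helpers for illustration only; not part of Thaana_Conversions.

// Collect the decimal code points from numeric entities such as "&#1931;".
function sketchEntitiesToIntegers($text)
{
    preg_match_all('/&#(\d+);/', $text, $matches);
    return array_map('intval', $matches[1]);
}

// Encode a single Unicode code point as a UTF-8 byte sequence.
function sketchIntegerToUtf8($cp)
{
    if ($cp < 0x80) {
        return chr($cp);
    }
    if ($cp < 0x800) {
        // Two bytes: this range includes the Thaana block (U+0780 to U+07BF).
        return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {
        return chr(0xE0 | ($cp >> 12))
            . chr(0x80 | (($cp >> 6) & 0x3F))
            . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xF0 | ($cp >> 18))
        . chr(0x80 | (($cp >> 12) & 0x3F))
        . chr(0x80 | (($cp >> 6) & 0x3F))
        . chr(0x80 | ($cp & 0x3F));
}

// Encode an array of code points as one UTF-8 string.
function sketchIntegersToUtf8($codePoints)
{
    return implode('', array_map('sketchIntegerToUtf8', $codePoints));
}

$integers = sketchEntitiesToIntegers('&#1931;&#1960;&#1928;&#1964;&#1920;&#1960;');
$utf8 = sketchIntegersToUtf8($integers);
// $integers is array(1931, 1960, 1928, 1964, 1920, 1960);
// $utf8 holds the six Thaana characters as twelve bytes of UTF-8.
```
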

Download:
- Thaana_Conversions.zip (v0.2, 3KB)

Drop me a line if you run into trouble with any of the functionality or have comments/queries. Enjoy :-)

Thaana Unicode<->Ascii conversions PHP class

Here is something that would probably be very handy to Maldivian web developers dabbling with Dhivehi sites. This PHP class addresses the need for converting text to and from Thaana in Ascii and Thaana in Unicode.

The class makes it easy to standardize text into one format irrespective of how it was/is written. This means that you can take text written in Accent, MS Word 97 (and prior) or written using Unicode as featured on recent MS Word editions and use the class to present output in the format of your choice without the need for imposing restrictions on the people who write the text. The class comes in even more handy when you have a form submission that takes input in Unicode but needs to be stored in the database or presented later as Ascii, or vice versa.

The class was something I originally wrote around 2001, and it was used in the free Online Document Converter that featured on maldivianunderground.net. I rewrote it for PHP 5 recently for use in a project I am working on. The original class also supported Latin Dhivehi -> Unicode/Ascii conversions, which I haven't included in this release but will add in a future update.

Usage should be pretty straightforward but here is an example just to illustrate:

Example:
<?php
$thaana = new Thaana_Conversions();
echo $thaana->convertUnicodeToAscii('&#1931;&#1960;&#1928;&#1964;&#1920;&#1960;');
echo $thaana->convertAsciiToUnicode('rWacje');
?>


Download:
- Thaana_Conversions.zip (v0.1, 2KB)

Enjoy :-)

Update (7-May-2008): This version is now superseded by v0.2.