Wednesday, September 15, 2004

Displaying Indian languages on a web site using EOT fonts

This article describes the use of Embedded OpenType fonts to display Indian languages on a website. We will use the Bengali language character-set for this example. The words Bangla and Bengali have been used interchangeably throughout this post and they refer to the same Indo-Aryan language native to the Bengal region of South Asia.

Objective

We want to build a website that displays its content in the Bangla (or Bengali) language. One way to achieve this is to type the text in a desktop publishing program like Adobe PageMaker or Microsoft Word, generate images by taking screenshots, and load those images on the web pages using IMG HTML tags. Though this is possible, it is a cumbersome process, and because downloading these image files consumes a lot of bandwidth, the web pages can load slowly on a slow or moderately slow internet connection. We intend to improve the user experience by having the text load in a supported font that the browser can render as soon as the HTML loads, instead of waiting for images to load before the Bangla content appears. We want the Bangla font file to be delivered along with the web page so that the Internet Explorer browser doesn't have to depend on locally installed TrueType Fonts (TTF) to render the Bangla letters on the HTML pages.

Typing in Bangla

A screenshot of the Avro Keyboard homepage
Avro Keyboard 2.1 is a small piece of software that can be installed on a Windows XP or Windows 2000 system, and it lets you type Bangla letters in any application that supports Unicode. Its documentation says that it is the first free Bangla typing software for Windows with full Unicode support. The project aims to add all popular Bangla keyboard layouts from Bangladesh and India to give its users maximum flexibility and usability. It provides two keyboard layouts: UniBijoy (which closely matches the Bijoy keyboard layout) and Avro Easy (a custom easy-to-use layout developed by OmicronLab). The Unicode Consortium has listed Avro Keyboard as a Unicode Keyboard Layout resource here.

Configuring your system for Avro Keyboard

After Avro Keyboard is installed, to configure a Windows 2000 system, go to Control Panel > Regional Options and select Indic under Language settings for the system. If you are on Windows XP, go to Control Panel > Regional and Language Options > Language and select Install files for complex script and right-to-left languages (including Thai). In addition, you also need to update the Uniscribe Engine (usp10.dll) which resides in the System32 folder inside the Windows installation folder. The detailed steps for configuration are available here.

Text Editor

To type Bangla in Notepad once Avro Keyboard is installed, go to Format > Font and select one of the Bangla fonts that were installed: Solaiman Lipi, Rupali, Akaash Normal, Likhan, Mitra Mono, Sagar or Mukti Narrow. Back in Notepad's main window, ensure that Avro's keyboard mode is set to Bangla Keyboard. Now, typing on your keyboard will generate Bangla text. Normally the file must be saved as Unicode instead of the default ANSI format. To do this, go to File > Save As... and set Encoding to Unicode. However, for the purposes of this experiment, we will save the file as ANSI but enable "Character Codes" in Avro. More on that later.
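Why the encoding choice matters can be illustrated with a small Python sketch. Python is not part of the workflow described in this post; it is used here only to make the point concrete. The sketch assumes that Notepad's "Unicode" option corresponds to UTF-16 and that "ANSI" corresponds to the Windows-1252 code page, as on a typical Western-locale system. The helper name can_encode is hypothetical.

```python
# Illustration only. Assumptions: "Unicode" in Notepad means UTF-16 and
# "ANSI" means Windows-1252; neither mapping is stated in this post.
text = "কেমন"


def can_encode(s, encoding):
    """Return True if every character of s is representable in encoding."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


unicode_ok = can_encode(text, "utf-16")  # Bangla letters exist in Unicode
ansi_ok = can_encode(text, "cp1252")     # Windows-1252 has no Bangla letters

print("Saves as Unicode:", unicode_ok)
print("Saves as ANSI:", ansi_ok)
```

Under these assumptions, raw Bangla letters cannot be stored in an ANSI file, which is why the ANSI route needs Avro to emit character codes instead of the letters themselves.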

HTML Editor

I have been using Microsoft FrontPage 98, a WYSIWYG HTML editor, to design web pages for a while. Even though it is a couple of years old, FrontPage 98 works fine for me because it is bare-bones and lets me take full control of all the pages that go into the website, i.e., there are no auto-generated files with scripts and other extras as there are with FrontPage 2002 or 2003. But to type Bangla characters in an HTML file, we need an HTML editor that supports Unicode, and FrontPage 98 doesn't seem to be one of them. Notepad in Windows XP/2000 and Microsoft FrontPage 2002 both support Unicode and work with Avro Keyboard, so we will have to use one of them.

Embedded OpenType Fonts

As I said before, we don't want to depend on fonts that are already available on the user's system, as it is very unlikely that the font is already installed. Embedded OpenType provides a mechanism to deliver fonts along with the web pages so that the web browser knows how to render the text. The Microsoft Web Embedding Fonts Tool, WEFT Version 3, lets web authors create 'font objects' that are linked to their web pages so that when an Internet Explorer user views the pages, they'll see them displayed in the font style contained within the font object. So, we will use WEFT to create Embedded OpenType font objects (.EOT files) from the TrueType Fonts (TTF) that are used in the web page. While generating the HTML page, we will insert the Bangla letters using a TrueType Font on the local machine, but we will make the web browser render the characters using the font object, which will be hosted on the same website.

Embedding EOT in HTML

The EOT font will be embedded in the HTML using Cascading Style Sheets (CSS). Here is a sample:

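<style type="text/css">
 /* The src descriptor is only valid inside an @font-face rule */
 @font-face {
  font-family: SolaimanLipi;
  font-style: normal;
  font-weight: normal;
  src: url(SOLAIMA0.eot);
 }
 /* Elements with this class are rendered in the embedded font */
 .beng {
  color: #000000;
  font-family: SolaimanLipi;
  font-style: normal;
  font-weight: normal;
 }
</style>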

Implementation

Create a Basic HTML file

Create a basic HTML file that you will use as the basis for generating the EOT files. Open Microsoft Notepad and set its font to SolaimanLipi (go to Format > Font). From the Avro Keyboard options, enable "Character Codes" so that it generates character codes while you type using the selected Bangla font. When "Character Codes" is enabled, the character code of the key pressed is entered in Notepad instead of the character itself. Save the file using ANSI encoding and close Notepad. Here is an example:

<html>
<head>
  <meta content="text/html; charset=x-user-defined"
  http-equiv="Content-Type">
</head>

<body>
  <font face="SolaimanLipi">কেমন</font>
  <font face="Verdana">what's new</font>
</body>
</html>
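The idea behind character codes can be sketched in Python. This is an illustration, not Avro's actual implementation: it assumes that the character codes are decimal HTML numeric character references of the form &#N;, which the post does not state explicitly. The function name to_char_refs is hypothetical.

```python
import html

# Assumption: the "Character Codes" output is taken to be decimal HTML
# numeric character references; the exact format is not stated in the post.


def to_char_refs(text):
    """Replace every non-ASCII character with a decimal reference (&#N;)."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


bangla = "কেমন"
refs = to_char_refs(bangla)
print(refs)

# The references are plain ASCII, so they fit in an ANSI-encoded file.
refs.encode("cp1252")

# Decoding the references recovers the original Bangla text.
print(html.unescape(refs) == bangla)
```

Because the references are plain ASCII, a file that contains them can be saved as ANSI and still describe Bangla text once a browser decodes it.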

Generate the EOT files

Provide the HTML file created above as input to Microsoft WEFT 3 to generate the EOT files. In the example above, we have used the font faces SolaimanLipi and Verdana, so WEFT generates two files: SOLAIMA0.eot and VERDANA0.eot.

Create the final HTML file

Next, ensure that the EOT files have been stored at a location the web page can reach. Type your final HTML file using the same method that was used previously to create the basic HTML file. Reference the EOT files from Cascading Style Sheets (CSS). Here is an example:

<html>
 <head>
  <meta content="text/html; charset=x-user-defined"
   http-equiv="Content-Type">

  <style type="text/css">
   <!-- /* $WEFT -- Created by: Debjyoti Das on 2004-09-14 -- */
   @font-face {
    font-family: SolaimanLipi;
    font-style: normal;
    font-weight: normal;
    src: url(SOLAIMA0.eot);
   }

   @font-face {
    font-family: Verdana;
    font-style: normal;
    font-weight: normal;
    src: url(VERDANA0.eot);
   }
   -->
  </style>

  <style type="text/css">
   .beng {
    color: #000000;
    font-family: SolaimanLipi;
    font-style: normal;
    font-weight: normal;
   }

   .verd {
    color: #000000;
    font-family: Verdana;
    font-style: normal;
    font-weight: normal;
   }
  </style>

 </head>

 <body>
  <span class="beng">কেমন</span>
  <span class="verd">what's up?</span>
 </body>
</html>

If everything works well, Internet Explorer will download the page and font object and render the Bangla text correctly. This was tested on Microsoft Internet Explorer 5 and 6.
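If the Bangla text does not render, one thing worth checking is whether every font object named in the style sheet was actually uploaded next to the page. The following Python sketch is a hypothetical helper, not part of WEFT: it lists the files named in src: url(...) declarations and reports which of them are missing from a given folder. The names referenced_fonts and missing_fonts are illustrative, and the sketch assumes the EOT files sit in the same folder as the HTML file.

```python
import re
from pathlib import Path

# Hypothetical helper, not part of WEFT. It assumes the font objects are
# referenced with relative URLs and stored in the same folder as the page.
URL_PATTERN = re.compile(r"src\s*:\s*url\(\s*([^)]+?)\s*\)", re.IGNORECASE)


def referenced_fonts(page_text):
    """Return the file names found in src: url(...) declarations, in order."""
    return [name.strip("'\"") for name in URL_PATTERN.findall(page_text)]


def missing_fonts(page_text, site_dir):
    """Return the referenced font files that do not exist in site_dir."""
    root = Path(site_dir)
    return [n for n in referenced_fonts(page_text) if not (root / n).is_file()]


sample = """
@font-face { font-family: SolaimanLipi; src: url(SOLAIMA0.eot); }
@font-face { font-family: Verdana; src: url(VERDANA0.eot); }
"""

print(referenced_fonts(sample))
print(missing_fonts(sample, "."))
```

Running missing_fonts against the folder that holds the final HTML file should return an empty list before the site is published.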

Avro Keyboard is available at this website.
There is a short and simple guide to typing in Bangla at their website here.
The SolaimanLipi and other Bangla fonts - distributed under the GNU General Public License - can be downloaded from here.
The Unicode Consortium is located here.
The Microsoft Web Embedding Fonts Tool (WEFT) Version 3 is located here.

Friday, May 21, 2004

The Windows Security Nightmare

I just stumbled upon this article that closely resonates with my personal experiences with different versions of the Windows operating system over the years, starting with Windows 95. It all starts with a fresh installation, but as I install and uninstall various programs, things gradually deteriorate. Eventually, the registry becomes cluttered with unnecessary entries, forcing me to perform a clean install. However, this comes with a drawback: all the previously applied security updates are lost, leaving the system vulnerable to worms and malware. In this journey from one Windows version to another, this has been a common theme with them all: Windows 95, Windows 98, Windows 98 SE, Windows Me and presently Windows XP. Here is a Sydney Morning Herald article written by Usman Latif.

Thursday, April 1, 2004

Google Launches Gmail

Google announced the preview release of Gmail, a free webmail service with a massive storage capacity of 1 gigabyte per user. The idea came from a user who complained about the limitations of existing email services. Gmail lets users search all their emails easily and keeps them organized without the need to file or delete messages. The preview version is being tested by a select group of users. The service will be available at http://gmail.google.com. Here is the press release from Google.