
How to Create a Multilingual JasperReport

By: Manoj Debnath
To read more DBA articles, visit http://dba.fyicenter.com/article/

Almost all modern software that renders text, such as a browser or text editor, supports Unicode text encoded in UTF-8 or UTF-16. A Java reporting library such as JasperReport is no exception and provides APIs to load external Unicode fonts for the scripts of different languages. This feature can be leveraged to create reports that combine one or more language scripts. Databases such as PostgreSQL support the UTF-8 encoding scheme, so records can be persisted in languages other than English as well. However, the construction of words varies widely across languages and is often quite complex. For example, creating a report with Indic Unicode fonts that is culturally and linguistically correct, and rendering it appropriately in a report viewer, is still a challenging task. This article explores the idea and shows how to fetch Unicode characters persisted in a database and subsequently create a multilingual report with the help of the JasperReport library in Java.

Problems with Indic Unicode

Indian languages are among the most problematic to work with. Languages such as Hindi and Marathi (both written in the Devanagari script), Gujarati, Bengali/Bangla, and so forth combine several Unicode characters to create a syntactically correct word or conjunct consonant. Combining here means that two or more Unicode characters merge into a single meaningful symbol according to the context. Representing these combinations exactly and rendering them appropriately in a viewer requires more than the software's basic ability to render text and the Unicode font's ability to encode the characters of a script. The ordering of letters in English text is pretty straightforward because letters come one after another with very little variation (such as subscripts, superscripts, and so on). Unfortunately, the ordering in Indic scripts is far more complicated than that. For example, in Bengali/Bangla the Unicode character U+0997 (গ) combines with U+09C1 (vowel sign u) to form the single symbol গু.
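The Bengali/Bangla example can be checked with a small Java sketch that uses only the standard library. It is an illustration rather than part of the JasperReport workflow: the class name `BengaliClusterDemo` and its helper methods are hypothetical, and the count of user-perceived characters relies on the JDK's `BreakIterator`, whose behavior may differ slightly between Java versions.

```java
import java.text.BreakIterator;
import java.util.Locale;

public class BengaliClusterDemo {

    // Counts user-perceived characters (base letter plus any attached marks)
    // by walking the boundaries reported by the JDK's character BreakIterator.
    public static int countPerceivedCharacters(String text) {
        BreakIterator it = BreakIterator.getCharacterInstance(Locale.ROOT);
        it.setText(text);
        int count = 0;
        while (it.next() != BreakIterator.DONE) {
            count++;
        }
        return count;
    }

    // True when the code point is a non-spacing mark, that is, a sign that
    // attaches to the preceding base letter instead of standing alone.
    public static boolean isNonSpacingMark(int codePoint) {
        return Character.getType(codePoint) == Character.NON_SPACING_MARK;
    }

    public static void main(String[] args) {
        // U+0997 followed by U+09C1, the pair discussed in the text.
        String syllable = "\u0997\u09C1";

        // List each stored code point with its Unicode name.
        syllable.codePoints().forEach(cp ->
                System.out.printf("U+%04X %s (non-spacing mark: %b)%n",
                        cp, Character.getName(cp), isNonSpacingMark(cp)));

        System.out.println("Stored UTF-16 units: " + syllable.length());
        System.out.println("Perceived characters: " + countPerceivedCharacters(syllable));
    }
}
```

The string holds two code points, but a reader sees one symbol. A renderer therefore has to shape the pair as a unit, which is why a font that merely contains both glyphs is not sufficient for correct output.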
