Chillu + Consonant => Conjunct?
Author: Cibu C Johny
Email: Cibu (at) yahoo.com
Date: July 12, 2005
Version: 2
Abstract
This document discusses the pitfalls of allowing Malayalam Chillu letters to
form a conjunct with a subsequent consonant. It also suggests the specific
scenarios in which this should be allowed.
Conventions used in this document
DDHA U+0D22
NNA U+0D23
RRA U+0D31
Virama U+0D4D
Introduction
We look at the issues that arise when text written with an old orthography
font is read with a new orthography font, and vice versa. Specifically, we
consider whether any Chillu-C1 + C2 sequence should form a conjunct in this
mixed context.
There is no general rule for conjunct formation
The argument is by example:
Unique Encoding Rule
I agree that there could be a Chillu-conjunct formation rule for a proper
subset of Chillu-C1 + C2 permutations. Even then, we cannot have the same
rendering for two different encodings (joiners not considered). Let me
illustrate how that applies here. Assume there is a conjunct formation rule
for a subset of Chillu-C1 + C2 permutations and that, as per that rule,
Chillu-NNA + DDHA can form a conjunct in an old orthography font. Of course,
NNA + VIRAMA + DDHA will also form the same conjunct. Therefore, a document
written by multiple people (e.g. a wiktionary.org document) can quite possibly
contain both spellings of this conjunct without readers or writers being aware
of it. This can cause searches to miss matches and sorted word lists to be
inconsistent, and ultimately confuse users. So we cannot allow
Chillu-C1 + C2 and C1 + VIRAMA + C2 to form the same conjunct. I would call
this the unique encoding rule.
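The search and sorting problem described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the atomic chillu code point U+0D7A (CHILLU NN) from later Unicode versions to stand for Chillu-NNA, which is not how this document itself encodes chillus, and the variable names are hypothetical.

```python
import unicodedata

NNA, DDHA, VIRAMA = "\u0D23", "\u0D22", "\u0D4D"
MA = "\u0D2E"          # an arbitrary other letter, used only for the sorting demo
CHILLU_NN = "\u0D7A"   # assumption: atomic chillu code point, not from this document

spelling_a = CHILLU_NN + DDHA       # Chillu-NNA + DDHA
spelling_b = NNA + VIRAMA + DDHA    # NNA + VIRAMA + DDHA


def code_points(text):
    """Format a string as a list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


print(code_points(spelling_a))
print(code_points(spelling_b))

# The two spellings are different strings, even if a font draws them alike.
same = spelling_a == spelling_b

# A plain substring search for one spelling misses the other.
found = spelling_a in ("x" + spelling_b + "y")

# Unicode normalization does not unify them either.
nfc_same = (unicodedata.normalize("NFC", spelling_a)
            == unicodedata.normalize("NFC", spelling_b))

# Code point order puts an unrelated entry between the two spellings.
ordered = sorted([spelling_a, MA, spelling_b])

print(same, found, nfc_same)
print([code_points(w) for w in ordered])
```

If both sequences rendered as the same conjunct, a reader would see identical text while every one of these comparisons treats the two as different, which is the situation the unique encoding rule is meant to prevent.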
This rule has a side effect: many words, such as /alpam/, can potentially
have two spellings, one with chillu-LA and the other with the /lpa/ conjunct.
Both of these spellings are used interchangeably in contemporary Malayalam
text. This is very similar to the two spellings 'colour' and 'color' ('color'
is the corresponding American spelling). A British English font should not try
to convert 'color' to 'colour'; it should remain as intended by the author.
The same should be the case with the two spellings of /alpam/ in Malayalam:
each should be displayed as intended by the author(s) of the text.
Except for the two cases described below, C1 + VIRAMA + C2 forms the conjunct.
Exception 1: Malayalam version of eyelash repha
In the same way as in Exception 1, we can find a way to produce the eyelash
repha conjunct in an old orthography font while producing an explicit
Chillu-RA in a new orthography font. That is,
Chillu-RA + C2 => eyelash repha over C2, if available in the font.
Example:
+ =>
We can use the joiners ZWJ and ZWNJ in their usual meaning: forcing and
avoiding conjunct formation, respectively.
As per the unique encoding rule, RA + VIRAMA + C2 should not form the eyelash repha conjunct.
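The eyelash repha rule above can be written down as a small decision function. This is a sketch only, not a rendering engine implementation: the token names, the function name `forms_eyelash_repha`, the example consonant KA, and the placement of the joiner between Chillu-RA and C2 are all assumptions made for illustration.

```python
def forms_eyelash_repha(tokens, font_has_glyph):
    """Decide whether a sequence should display as eyelash repha over C2.

    tokens is a tuple of symbolic names, for example
    ("CHILLU-RA", "KA") or ("CHILLU-RA", "ZWNJ", "KA").
    font_has_glyph says whether the font provides the eyelash repha form.
    """
    # Only Chillu-RA + C2 is eligible. RA + VIRAMA + C2 is excluded,
    # following the unique encoding rule.
    if not tokens or tokens[0] != "CHILLU-RA":
        return False
    # ZWNJ asks the renderer to avoid the conjunct.
    if "ZWNJ" in tokens:
        return False
    # With ZWJ or with no joiner, the conjunct forms only if the font has it
    # (assumption: ZWJ cannot conjure a glyph the font lacks).
    return font_has_glyph


old_font, new_font = True, False  # hypothetical fonts: with and without the glyph

print(forms_eyelash_repha(("CHILLU-RA", "KA"), old_font))          # True
print(forms_eyelash_repha(("CHILLU-RA", "KA"), new_font))          # False
print(forms_eyelash_repha(("CHILLU-RA", "ZWNJ", "KA"), old_font))  # False
print(forms_eyelash_repha(("RA", "VIRAMA", "KA"), old_font))       # False
```

Under this model the same encoded text shows an explicit Chillu-RA in a new orthography font and eyelash repha in an old orthography font, while the RA + VIRAMA + C2 encoding never competes for the same rendering.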
Confusion on /nta/ encoding
Representation of /nta/ is
closely related to what stands for. Malayalam's practice of
representing /ta/ and /rra/ with the same letter has definitely contributed
to the confusion over whether the conjunct is /nta/ or /nrra/. Here are the
details of the two ways in which it is used:
These facts give rise to two quite reasonable input scenarios:
Along with the above input scenarios, the following not-so-obvious facts
should also be considered:
Thus, we end up with three mutually exclusive choices:
Due to the lack of a perfect solution, I feel option 1 should be the
pragmatic choice.