Started by: Automatic
On: 17 Apr 2008, 16:44 UTC
Number of posts: 2
This is the discussion related to the wiki page Language Processing Modules.
Reference page has been created
thaweesak, 19 Apr 2008, 18:26 UTC

I would appreciate it if we could add useful references, or links to documents, to the reference page on this website. I have started adding some.

Should we say "cluster" boundary?
samphan, 22 Apr 2008, 20:27 UTC

Instead of saying "character boundary", should we reserve the word "character" for the concept that can be represented by a single code point, i.e. a single TIS-620 character, and use "cluster" (or, in Unicode terms, "grapheme cluster") for a combining character sequence such as กี่, กู้ or กำ?

One pattern that is always an issue is whether "กำ" or "น้ำ" counts as one cluster or two. OpenOffice.org and Microsoft Office both treat the pattern as one unit.
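
To make the distinction concrete, here is a minimal Python sketch that contrasts counting code points with counting clusters. It is only an illustration, not the full Unicode grapheme cluster algorithm: it attaches every nonspacing mark (general category Mn) to the preceding base character. The function name `clusters` and the flag `sara_am_joins` are made up for this example. The flag models the open question above, namely whether SARA AM (U+0E33), which is not a nonspacing mark, should join the preceding cluster.

```python
import unicodedata

SARA_AM = "\u0e33"  # THAI CHARACTER SARA AM (general category Lo, not Mn)


def clusters(text, sara_am_joins=True):
    """Split text into simplified clusters.

    A cluster is a base character plus any following nonspacing marks.
    If sara_am_joins is True, SARA AM is also attached to the preceding
    cluster (the "one unit" behaviour); if False, it starts a new one.
    """
    out = []
    for ch in text:
        is_mark = unicodedata.category(ch) == "Mn"
        joins = is_mark or (sara_am_joins and ch == SARA_AM)
        if out and joins:
            out[-1] += ch   # extend the current cluster
        else:
            out.append(ch)  # start a new cluster
    return out


kii = "\u0e01\u0e35\u0e48"  # กี่ : KO KAI + SARA II + MAI EK
kam = "\u0e01\u0e33"        # กำ  : KO KAI + SARA AM
nam = "\u0e19\u0e49\u0e33"  # น้ำ : NO NU + MAI THO + SARA AM

for word in (kii, kam, nam):
    print(len(word),
          len(clusters(word, sara_am_joins=True)),
          len(clusters(word, sara_am_joins=False)))
```

Under these assumptions กี่ is three code points but one cluster either way, while กำ and น้ำ are one cluster when SARA AM joins and two when it does not.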
