[icq-devel] Language encoding

Gerke Preussner j3rky at tactical-ops.de
Mon Jun 23 17:44:55 CEST 2003

Hi all,

I recently resumed work on my ICQ Database Manager 'Flowershop', and
one of my beta testers informed me about problems with displaying
multilingual characters.

I ran some tests with a Chinese friend of mine and figured out that
Chinese, Japanese and Korean characters are encoded with the GB 2312-80
character set (using the EUC-CN encoding form by default).

The problem I have now is figuring out which encoding was used for a
given message. The message item itself does not hold any relevant
information. In particular, the message string has no leading encoding
indicator. Furthermore, I did not find any relevant encoding
information in the user details.

In principle, ICQ could handle all messages as GB 2312-80 (since it's
compatible with 0x21-0x7E single-byte ASCII), but I doubt that, because
then other scripts such as Latin, Cyrillic or Hebrew wouldn't be
possible.
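
To illustrate the ambiguity, here is a minimal Python sketch of a structural
check for EUC-CN. The helper name looks_like_euc_cn is hypothetical, and the
byte ranges (lead byte 0xA1-0xF7, trail byte 0xA1-0xFE) are the usual EUC-CN
ranges for GB 2312-80; it does not verify that each pair is an assigned
character, and it says nothing about what ICQ actually does.

```python
def looks_like_euc_cn(data: bytes) -> bool:
    """Return True if data is structurally valid EUC-CN (GB 2312-80).

    Hypothetical helper: it checks byte ranges only, not whether each
    two-byte pair is an assigned GB 2312-80 character.
    """
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            # Single-byte ASCII range passes through unchanged.
            i += 1
        elif 0xA1 <= b <= 0xF7:
            # Lead byte of a two-byte character; a trail byte must follow.
            if i + 1 >= len(data) or not (0xA1 <= data[i + 1] <= 0xFE):
                return False
            i += 2
        else:
            # Bytes 0x80-0xA0 and 0xF8-0xFF never start an EUC-CN character.
            return False
    return True
```

A check like this can only rule EUC-CN out, not confirm it: a Cyrillic
string in Windows-1251 whose high bytes happen to pair up (for example
"Привет") passes as well, which is exactly why a message without an
encoding indicator is hard to classify.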

I know that most of you are working on hacking ICQ's network protocol,
but maybe someone has an idea how the encoding of a certain message
could be determined. I don't wanna waste my time with re-inventing the
wheel :)
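
One possible starting point, sketched in Python under the assumption that
only a handful of encodings are in play: try to decode the raw message bytes
with each candidate and keep the ones that succeed. The candidate list and
the function name candidate_encodings are my own assumptions, not anything
taken from ICQ; the codec names are the ones Python's standard library ships.

```python
# Assumed candidate list; adjust to the languages your users actually write.
CANDIDATES = ("ascii", "gb2312", "cp1251", "cp1255", "cp1252")


def candidate_encodings(data: bytes, candidates=CANDIDATES) -> list:
    """Return the candidate encodings that decode data without an error."""
    matches = []
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            # This encoding cannot represent the byte sequence; rule it out.
            continue
        matches.append(name)
    return matches
```

Single-byte code pages accept almost any byte sequence, so this narrows the
choice rather than settling it; the result would still need to be combined
with something like the user's locale or a character-frequency heuristic.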

Many thanks in advance,


headcrash industries
email: j3rky at gerke-preussner.de
www: http://www.gerke-preussner.de
latest project: http://flowershop.gerke-preussner.de


TacticalOps Coder, Designer & Mapper
TacticalOps Germany PR & Community
Kamehan Studios, Paris, France
email: j3rky at tactical-ops.de

sick of playing Counterstrike?
try: http://www.tactical-ops.to
and: http://www.tactical-ops.de

