200004031037
Windows CE
CE programmers...
- Take a look at the MSDN Library. The CE
API documentation for 2.10 and higher is pretty good and should answer most
questions on how to implement another language on a device or for an
application. Conversion routines are provided for several codepages. See the
Tnef API, among others.
- To display an HTML (or text) file in Unicode (UCS-2) encoding on
WinCE, the HTML Viewer Control can be used. The CE SDK has an example HTML
viewer which can display Unicode HTML with minimal changes (as I understand
it, DTM_ADDTEXT has to be changed to DTM_ADDTEXTW and that's it).
- Editing Unicode (UCS-2) files on CE: PocketWord of CE 2.11 can
read/write UCS encoded files ("Unicode text"). For CE 2.0, the editors of
PocketC and of PocketScheme can do this. There is also a nice (freeware)
Notepad by Starlight Computer Wizardry.
- Converting between different encodings: see the Tnef API of CE
2.10. The Unix utility TCS (public domain) can also be used; its source is
available. It's written in ANSI C, so a port to CE should be easy.
- Remember that IE 4 / IE 5 can be used to convert between different
encodings, useful for testing.
- Superseded by SP-1 for CE 2.11: How could the Inbox be used for
non-Latin1 mail? Inbox relies on the CE database, which is Unicode based. So:
read the message to convert via the Mail API, convert it to Unicode (UCS-2)
and update the Inbox with the message you just manipulated. It can then be
displayed via PocketInbox's normal GUI. Before sending, reverse the steps. It
might be necessary to change the system font if the language you're handling
is not covered by the system fonts; for this the registry would have to be
changed: HKLM/SYSTEM/GDI/SYSFNT/Nm. To work, the font specified in that key
has to be placed in the \windows\ folder. (Thanks to Jango for noting this.)
Another possibility: a program that automatically converts text in the
clipboard to/from UCS. Usage: open a message in Inbox, mark all the text,
copy it (it is converted), then paste it.
If you're doing this, consider supporting the UTF-8 format for
incoming/outgoing mail: being a Unicode transformation format, it can handle
any (!) language, and the conversion between UCS and UTF that would be
required is easy; Netscape and Outlook can read UTF-8 mail. You can be sure
that UTF will become standard for emails in the future, and later on for HTML.
- The different ways to store Unicode - Unicode coding methods are
documented in the help file of UnicEdit by H. Eichmann (do a websearch).
- Unicode UCS files exist in several flavours: the byte order is not
defined, so quite often the first two bytes of the file denote it. This marker
is not mandatory; for example, IE 4 SP-1 does not add it when saving as
Unicode. Microsoft normally writes the marker as FF FE (little-endian).
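The byte-order check described in the last item can be sketched in a few lines. This is only an illustration in Python, not code from any of the tools mentioned; the function name `read_ucs2` and the fallback to little-endian for files without a marker are assumptions made for the example.

```python
import struct

BOM_LE = b"\xff\xfe"  # marker for little-endian data (the order Microsoft tools normally write)
BOM_BE = b"\xfe\xff"  # marker for big-endian data


def read_ucs2(data, default_little_endian=True):
    """Return the 16-bit code units of a UCS-2 byte string as a list of ints.

    default_little_endian is an assumption that is used only when the
    data carries no byte-order marker at all.
    """
    if data[:2] == BOM_LE:
        little, body = True, data[2:]
    elif data[:2] == BOM_BE:
        little, body = False, data[2:]
    else:
        # No marker (the IE 4 case): fall back to the assumed default.
        little, body = default_little_endian, data
    if len(body) % 2:
        raise ValueError("UCS-2 data must have an even number of bytes")
    fmt = ("<" if little else ">") + "%dH" % (len(body) // 2)
    return list(struct.unpack(fmt, body))
```

A file without a marker cannot be identified reliably, so a real tool would probably let the user override the default.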
The UTF-8 format
The ASCII characters 0-7f are identical with their
UTF-8 representation. Any characters beyond that range will be encoded using 2
bytes (80-7ff) or 3 bytes (800-ffff). If 2 or 3 bytes are used, the second and
third byte will always have a value between 80 and BF:
Code range (hex) | Byte 1   | Byte 2   | Byte 3
0 - 7f           | 0aaaaaaa |          |
80 - 7ff         | 110aaaaa | 10bbbbbb |
800 - ffff       | 1110aaaa | 10bbbbbb | 10cccccc
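The bit layout in the table translates almost directly into code. The following is a minimal sketch in Python, limited to the 16-bit range covered above; the function names `ucs_to_utf8` and `utf8_to_ucs` are made up for this example, and the decoder does not reject overlong sequences.

```python
def ucs_to_utf8(code):
    """Encode one 16-bit UCS code value as 1, 2 or 3 UTF-8 bytes."""
    if not 0 <= code <= 0xFFFF:
        raise ValueError("code value outside the 16-bit range")
    if code < 0x80:        # 0aaaaaaa
        return bytes([code])
    if code < 0x800:       # 110aaaaa 10bbbbbb
        return bytes([0xC0 | (code >> 6), 0x80 | (code & 0x3F)])
    # 1110aaaa 10bbbbbb 10cccccc
    return bytes([0xE0 | (code >> 12),
                  0x80 | ((code >> 6) & 0x3F),
                  0x80 | (code & 0x3F)])


def utf8_to_ucs(data):
    """Decode UTF-8 bytes (1 to 3 byte sequences only) into a list of code values."""
    out = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            code, extra = lead, 0
        elif 0xC0 <= lead < 0xE0:
            code, extra = lead & 0x1F, 1
        elif 0xE0 <= lead < 0xF0:
            code, extra = lead & 0x0F, 2
        else:
            raise ValueError("unexpected lead byte at offset %d" % i)
        tail = data[i + 1:i + 1 + extra]
        # Every following byte must lie between 80 and BF.
        if len(tail) != extra or any((b & 0xC0) != 0x80 for b in tail):
            raise ValueError("bad continuation byte after offset %d" % i)
        for b in tail:
            code = (code << 6) | (b & 0x3F)
        out.append(code)
        i += 1 + extra
    return out
```

For example, `ucs_to_utf8(0x20AC)` (the euro sign) yields the three bytes E2 82 AC, and `utf8_to_ucs` turns them back into 0x20AC.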
A UTF/UCS converter in Scheme
(updated for PocketScheme 0.4.0)
This is a very basic (and clumsily programmed) converter for UTF/UCS. It runs
in PocketScheme and can be used for sending/receiving mail in UTF (that is, in
any language). Its primary use was to support mail in UTF-8. This support has
since been added to PocketOutlook with Microsoft's SP-1 for CE 2.11.
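When testing any converter of this kind, it helps to have reference output to compare against. The sketch below is not the Scheme converter; it uses Python's built-in codecs to convert whole buffers between UTF-8 and little-endian UCS-2 with an FF FE marker. The function names are hypothetical, and writing the marker is a choice made for this example.

```python
import codecs


def utf8_to_ucs2_file_bytes(utf8_bytes):
    """Convert UTF-8 bytes to little-endian UCS-2 bytes, preceded by the FF FE marker."""
    text = utf8_bytes.decode("utf-8")
    # UCS-2 holds 16-bit values only, so anything larger is refused here.
    if any(ord(ch) > 0xFFFF for ch in text):
        raise ValueError("character outside the 16-bit UCS-2 range")
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")


def ucs2_file_bytes_to_utf8(ucs2_bytes):
    """Convert little-endian UCS-2 bytes (marker optional) back to UTF-8 bytes."""
    if ucs2_bytes.startswith(codecs.BOM_UTF16_LE):
        ucs2_bytes = ucs2_bytes[2:]
    return ucs2_bytes.decode("utf-16-le").encode("utf-8")
```

Feeding the same text through a hand-written converter and through these two functions should give identical bytes in both directions.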
PocketScheme is a free Scheme implementation with SIOD ancestry for CE devices
by Ben Goetter. Scheme is a Lisp dialect. The pscheme project is progressing
steadily, and it is now possible to call the CE API from within a Scheme
script. Unicode is supported. IMHO, it's the only alternative to programming
in C++ for CE devices.
Do you have a suggestion? Email or use the guestbook.