Paul Eggert wrote:
* At the HTML level the document specifies charset=windows-1252 and the four HTML lines containing non-ASCII characters didn't come out right in my browser. Can you please use plain ASCII instead, e.g., use "&laquo;" instead of the Windows-1252 byte that means left-pointing double angle quotation mark? Or it may be simpler to reformulate the examples to avoid non-ASCII characters.
It gave no problem in a couple of my browsers, for instance Macintosh Netscape 4 and Internet Explorer 5. The "offending" characters were encoded correctly, namely as literals according to the character set windows-1252.

Another way to encode the "left-pointing double angle quotation mark" is indeed "&laquo;". A serious problem with this type of "named entity" is that there are 2451 named entities defined in the ISO/IEC DTR 9573-13 2nd Ed. standard <http://www.w3.org/2003/entities/iso9573-2003doc/9573.html> (or a more recent version), but only about one hundred of these are really supported by most browsers since version 4 (Netscape/Microsoft) on most platforms.

Much better supported are numbered entities like &#xAB; (hexadecimal) or &#171; (decimal). Decimal entities are preferred because they are somewhat more compatible with older browsers and because you can copy them directly from most DTD entity definition files - e.g. the *.ent files from w3.org.

If you have to use characters outside the declared character set - in this case windows-1252 - then hexadecimal/decimal entities are obligatory. In that case a utf-8 character set designation for the html document is advised, but this means that all windows-1252 literals should be re-encoded as numbered entities according to their Unicode positions.

In short: for multilingual html pages one should use a utf-8 character set designation (via a meta element or http header) and encode all special characters as numbered entities according to their Unicode positions.

Oscar van Vlijmen
2004-06-12

Sorry, but I don't Cc emails to discussion partners personally if an email to the tz-list should be sufficient.
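
For anyone who wants to do the re-encoding mechanically, here is a minimal sketch in Python, not a tested conversion tool. It assumes the input bytes really are windows-1252; the function name cp1252_to_ascii_html is made up for this illustration. It relies on Python's standard "xmlcharrefreplace" error handler, which writes every non-ASCII character as a decimal numbered entity at its Unicode position.

```python
def cp1252_to_ascii_html(raw: bytes) -> str:
    """Return pure-ASCII text in which every non-ASCII character of a
    windows-1252 byte string is written as a decimal numbered entity.

    Hypothetical helper for illustration. It assumes the input is valid
    windows-1252; Python raises UnicodeDecodeError for the few byte
    values that windows-1252 leaves undefined (such as 0x81).
    """
    # Step 1: map the windows-1252 bytes to Unicode characters.
    text = raw.decode("cp1252")
    # Step 2: encode to ASCII; characters that do not fit are replaced
    # by &#NNN; where NNN is the decimal Unicode code point.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    # 0xAB and 0xBB are the double angle quotation marks in windows-1252.
    print(cp1252_to_ascii_html(b"\xabtz\xbb"))
```

Note that byte 0x80, the euro sign in windows-1252, comes out as &#8364; and not &#128;, because numbered entities always refer to Unicode positions rather than to byte values in the document's character set.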