On 20/06/14 01:51, Paul Eggert wrote:
> Garrett Wollman reported privately that XEmacs 21.4.22, the current stable version, doesn't work with the UTF-8 recently introduced into our commentary. For example, the UTF-8 character '−' (MINUS SIGN), which is stored as the three bytes "\342\210\222", displays as 'â\210\222'.
>
> For proper names this is annoying but tolerable, as there's little loss in utility from (say) 'Racoviță' to its display form 'RacoviÈ\233Ä\203'. But for symbols this is a real pain that can make it hard to understand the documentation, e.g., 'Release 2014e – 2014-06-12 21:53:52 −0700' displays as 'Release 2014e â\200\223 2014-06-12 21:53:52 â\210\2220700'.
>
> To work around this problem, make the following substitutions in commentary to mostly revert these symbols to their pre-UTF-8 versions: '§' -> 'section', '°' -> 'degrees', '±' -> '+-', '–' -> '-' (en dash), '—' -> '--' (em dash), '′' -> "'", '″' -> '"', '→' -> '->', '−' -> '-' (minus sign), '≤' -> '<='. Leave proper names and foreign words in UTF-8.
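For anyone who wants to reproduce the symptom without XEmacs, here is a minimal Python sketch. It assumes the mis-display amounts to reading each UTF-8 byte as one ISO 8859-1 character, which matches the examples quoted above but is not confirmed for XEmacs itself. The names `as_latin1`, `SUBSTITUTIONS`, and `asciify_symbols` are made up for illustration and are not part of the tz distribution.

```python
# Sketch only: models the mis-display as "UTF-8 bytes read as ISO 8859-1"
# and applies the substitution table from the message above.

MINUS_SIGN = "\u2212"  # the character '−'


def as_latin1(text: str) -> str:
    """Return text as it looks when its UTF-8 bytes are decoded byte-by-byte."""
    return text.encode("utf-8").decode("latin-1")


# Hypothetical table name; the pairs themselves come from the message above.
SUBSTITUTIONS = {
    "\u00a7": "section",  # section sign
    "\u00b0": "degrees",  # degree sign
    "\u00b1": "+-",       # plus-minus sign
    "\u2013": "-",        # en dash
    "\u2014": "--",       # em dash
    "\u2032": "'",        # prime
    "\u2033": '"',        # double prime
    "\u2192": "->",       # rightwards arrow
    "\u2212": "-",        # minus sign
    "\u2264": "<=",       # less-than or equal to
}


def asciify_symbols(text: str) -> str:
    """Replace only the listed symbols; proper names stay in UTF-8."""
    return text.translate(str.maketrans(SUBSTITUTIONS))


if __name__ == "__main__":
    print(MINUS_SIGN.encode("utf-8"))   # the three bytes \342\210\222
    print(repr(as_latin1(MINUS_SIGN)))  # 'â' followed by two control bytes
    print(asciify_symbols("Release 2014e \u2013 2014-06-12 21:53:52 \u22120700"))
```

Because the table is keyed on individual symbols, a name such as 'Racoviță' passes through `asciify_symbols` unchanged, which is the behaviour the last sentence of the message asks for.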
I'd have thought that was XEmacs' problem defaulting to the "wrong" charset. Could it be fixed by adding a modeline or perhaps a UTF-8 BOM as a hint about the character set? (I don't know how well zic or third-party tools would cope with a BOM, but a modeline ought to be no problem.)

-- 
-=( Ian Abbott @ MEV Ltd.    E-mail: <abbotti@mev.co.uk> )=-
-=( Tel: +44 (0)161 477 1898   FAX: +44 (0)161 718 3587 )=-
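To make the two hints concrete, here is a rough Python sketch of what each would put at the top of a file. The modeline text is the usual Emacs file-variable form behind a '#' comment and is an assumption about how it would be written here. The helper names `add_bom` and `first_line` are hypothetical. Nothing below says how zic actually behaves; it only shows what a parser that is not BOM-aware would see.

```python
import codecs

# Assumed spelling of the modeline hint, placed in an ordinary '#' comment.
MODELINE = "# -*- coding: utf-8 -*-"


def add_bom(data: bytes) -> bytes:
    """Prefix the UTF-8 BOM (bytes EF BB BF) unless it is already present."""
    if data.startswith(codecs.BOM_UTF8):
        return data
    return codecs.BOM_UTF8 + data


def first_line(data: bytes, encoding: str) -> str:
    """Decode data with the given codec and return its first line."""
    return data.decode(encoding).splitlines()[0]


if __name__ == "__main__":
    plain = (MODELINE + "\n# commentary\n").encode("utf-8")
    marked = add_bom(plain)
    # A BOM-aware reader ('utf-8-sig') still sees a line starting with '#'.
    print(repr(first_line(marked, "utf-8-sig")))
    # A reader that is not BOM-aware sees U+FEFF before the '#'.
    print(repr(first_line(marked, "utf-8")))
```

The modeline leaves the first byte of the file as '#', so a tool that skips comment lines is unaffected, whereas the BOM changes the first three bytes of the file.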