Re: [tz] [PATCH] Switch back to ASCII for symbols.

June 20, 2014


      On 20/06/14 01:51, Paul Eggert wrote:
...
Garrett Wollman reported privately that XEmacs 21.4.22, the current
stable version, doesn't work with the UTF-8 recently introduced
into our commentary.  For example, the UTF-8 character '−'
(MINUS SIGN), which is stored as the three bytes "\342\210\222",
displays as 'â\210\222'.  For proper names this is annoying but
tolerable, as there's little loss in utility from (say) 'Racoviță'
to its display form 'RacoviÈ\233Ä\203'.  But for symbols this is a
real pain that can make it hard to understand the documentation, e.g.,
'Release 2014e – 2014-06-12 21:53:52 −0700' displays as
'Release 2014e â\200\223 2014-06-12 21:53:52 â\210\2220700'.
To work around this problem, make the following substitutions in
commentary to mostly revert these symbols to their pre-UTF-8 versions:
'§' -> 'section', '°' -> 'degrees', '±' -> '+-', '–' -> '-' (en
dash), '—' -> '--' (em dash), '′' -> "'", '″' -> '"', '→' -> '->',
'−' -> '-' (minus sign), '≤' -> '<='.  Leave proper names and
foreign words in UTF-8.
I'd have thought that was XEmacs' problem defaulting to the "wrong" 
charset.  Could it be fixed by adding a modeline or perhaps a UTF-8 BOM 
as a hint about the character set?  (I don't know how well zic or 
third-party tools would cope with a BOM, but a modeline ought to be no 
problem.)

-- 
-=( Ian Abbott @ MEV Ltd.    E-mail: <abbotti@mev.co.uk>        )=-
-=( Tel: +44 (0)161 477 1898   FAX: +44 (0)161 718 3587         )=-