[PATCH v2] tzfile.5: Fix indentation
RS after IP, if the indentation amount is not specified, takes the same indentation that IP had. The values being used were wrong, and by removing them, we're fixing the indentation of the page. Also, one RS was not just incorrect, but completely unnecessary, and there was a missing RE.

Cc: "G. Branden Robinson" <branden@debian.org>
Cc: Paul Eggert <eggert@cs.ucla.edu>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 tzfile.5 | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tzfile.5 b/tzfile.5
index 867348d6..aa6b858d 100644
--- a/tzfile.5
+++ b/tzfile.5
@@ -42,7 +42,7 @@ .SH DESCRIPTION
 Fifteen bytes containing zeros reserved for future use.
 .IP \(bu
 Six four-byte integer values, in the following order:
-.RS "\w' \(bu 'u"
+.RS
 .TP "\w' 'u"
 .B tzh_ttisutcnt
 The number of UT/local indicators stored in the file.
@@ -66,6 +66,7 @@ .SH DESCRIPTION
 The number of bytes of time zone abbreviation strings
 stored in the file.
 .RE
+.RE
 .PP
 The above header is followed by the following fields, whose lengths
 depend on the contents of the header:
@@ -134,8 +135,7 @@ .SH DESCRIPTION
 is in the range [\-89999, 93599] (i.e., more than \-25 hours and less
 than 26 hours); this allows easy support by implementations that
 already support the POSIX-required range [\-24:59:59, 25:59:59].
-.RS "\w' 'u"
-.IP \(bu "\w'\(bu 'u"
+.IP \(bu
 .B tzh_charcnt
 bytes that represent time zone designations,
 which are null-terminated byte strings, each indexed by the
-- 
2.43.0
On 2024-03-17 05:43, Alejandro Colomar wrote:
RS after IP, if the indentation amount is not specified, takes the same indentation that IP had. The values being used were wrong, and by removing them, we're fixing the indentation of the page. Also, one RS was not just incorrect, but completely unnecessary, and there was a missing RE.
Thanks for the bug report. Unfortunately the proposed fix generates too much white space between the bullet and the tzh_timecnt entry, when generating PDF. I installed the attached patch instead, which fixes the wrong .RS/.RE nesting and indentation in a different way; hope it works for you too.
Hi Paul! Do I need to subscribe to write to the list? I don't see my posts in the archives. On Sun, Mar 17, 2024 at 11:35:48AM -0700, Paul Eggert wrote:
On 2024-03-17 05:43, Alejandro Colomar wrote:
RS after IP, if the indentation amount is not specified, takes the same indentation that IP had. The values being used were wrong, and by removing them, we're fixing the indentation of the page. Also, one RS was not just incorrect, but completely unnecessary, and there was a missing RE.
Thanks for the bug report. Unfortunately the proposed fix generates too much white space between the bullet and the tzh_timecnt entry, when generating PDF. I installed the attached patch instead, which fixes the wrong .RS/.RE nesting and indentation in a different way; hope it works for you too.
Ahh, sorry, I missed the 2 spaces of indent before the TP tags.
From e555300159da0916b93cf36b5355910b1bdf080a Mon Sep 17 00:00:00 2001 From: Paul Eggert <eggert@cs.ucla.edu> Date: Sun, 17 Mar 2024 11:26:48 -0700 Subject: [PROPOSED] Fix .RS/.RE problem in tzfile.5
Problem reported by Alejandro Colomar. * tzfile.5: Fix improperly nested .RS/.RE that caused indenting to be slightly off, by adding an .RE and removing an .RS, and by fixing one of the .RS indentings. --- tzfile.5 | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tzfile.5 b/tzfile.5 index 867348d6..742c8af8 100644 --- a/tzfile.5 +++ b/tzfile.5 @@ -42,7 +42,7 @@ or Fifteen bytes containing zeros reserved for future use. .IP \(bu Six four-byte integer values, in the following order: -.RS "\w' \(bu 'u" +.RS "\w' \(bu 'u"
This is technically incorrect: You already gained the first two spaces from the previous RS, in which you're nested. Remember that we have .RS "\w' 'u" .IP \(bu "\w'\(bu 'u" a few lines above. The second RS, by default, does the same as the previous IP, that is, "\w'\(bu 'u". What you want here is to change that, to add two more spaces before the TP tag, to the right of the IP indent. So, what you want is .RS "\w'\(bu 'u"
.TP "\w' 'u" .B tzh_ttisutcnt The number of UT/local indicators stored in the file. @@ -66,6 +66,7 @@ in the file (must not be zero). The number of bytes of time zone abbreviation strings stored in the file. .RE +.RE
LGTM.
.PP The above header is followed by the following fields, whose lengths depend on the contents of the header: @@ -134,7 +135,6 @@ Also, in realistic applications is in the range [\-89999, 93599] (i.e., more than \-25 hours and less than 26 hours); this allows easy support by implementations that already support the POSIX-required range [\-24:59:59, 25:59:59]. -.RS "\w' 'u"
LGTM.
.IP \(bu "\w'\(bu 'u"
You can remove the amount from this line. Since we've removed the RS call, now we're continuing the existing IP list. I'll send a patch in a moment. Have a lovely day! Alex
.B tzh_charcnt bytes that represent time zone designations, -- 2.40.1
-- <https://www.alejandro-colomar.es/> Looking for a remote C programming job at the moment.
While this doesn't change output, it changes how we calculate it, to be more correct.

Cc: Paul Eggert <eggert@cs.ucla.edu>
Cc: "G. Branden Robinson" <branden@debian.org>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 tzfile.5 | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tzfile.5 b/tzfile.5
index 742c8af8..daa55a08 100644
--- a/tzfile.5
+++ b/tzfile.5
@@ -42,7 +42,7 @@ .SH DESCRIPTION
 Fifteen bytes containing zeros reserved for future use.
 .IP \(bu
 Six four-byte integer values, in the following order:
-.RS "\w' \(bu 'u"
+.RS "\w'\(bu 'u"
 .TP "\w' 'u"
 .B tzh_ttisutcnt
 The number of UT/local indicators stored in the file.
@@ -135,7 +135,7 @@ .SH DESCRIPTION
 is in the range [\-89999, 93599] (i.e., more than \-25 hours and less
 than 26 hours); this allows easy support by implementations that
 already support the POSIX-required range [\-24:59:59, 25:59:59].
-.IP \(bu "\w'\(bu 'u"
+.IP \(bu
 .B tzh_charcnt
 bytes that represent time zone designations,
 which are null-terminated byte strings, each indexed by the
-- 
2.43.0
[looping in groff mailing list to pitch a terminological reform] Hi Alex, At 2024-03-17T19:56:10+0100, Alejandro Colomar wrote:
You already gained the first two spaces from the previous RS, in which you're nested. Remember that we have
.RS "\w' 'u" .IP \(bu "\w'\(bu 'u"
a few lines above. The second RS, by default, does the same as the previous IP, that is, "\w'\(bu 'u".
I see that I need to clarify the groff_man(7) page in this department. In fact, I suspect Doug McIlroy's term "prevailing indent" was tailor-made for expressing this behavior. But I want to amend it, in documentation and the "an.tmac" source file, to "prevailing _inset_", because "indentation" is overloaded to also refer to the additional spacing applied to (at least some) lines of an `IP`, `TP`, or (deprecated) `HP` paragraph.

The resulting concept is simple.

    prevailing inset = base paragraph inset + sum of relative insets

Equivalently:

    prevailing inset = value of `BP` register[1] + amounts in `RS` calls[2]

Regards,
Branden

[1] forthcoming in groff 1.24

[2] for each active inset, if specified; if not, the indentation of the previous paragraph in the (sub)section is used, and if none (as with `P` and its synonyms), then the value of the `IN` register

I _think_ I've got that right.
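For readers following the arithmetic, here is a minimal man(7) sketch of how insets accumulate under the rule described above. The page name, text, and the 2n amounts are made up for illustration; the comments restate the behavior as described in this thread and are not checked against an.tmac.

```roff
.\" Hypothetical demo page; the title and all amounts are illustrative only.
.TH INSETDEMO 7 2024-03-17 "hypothetical example"
.SH DESCRIPTION
.P
Ordinary paragraph, set at the base paragraph inset.
.RS 2n
.\" One active inset: every line up to the matching RE moves right by 2n.
.IP \(bu 2n
Bulleted paragraph: the bullet sits at the inset and the text is indented 2n past it.
.RS
.\" No argument: as described in the thread, RS reuses the preceding IP's 2n.
.P
Paragraph nested under the bullet; two insets are active here (2n + 2n).
.RE
.\" The first RE closes the inner inset; the bullet list continues.
.IP \(bu
Second bullet, still inside the outer 2n inset.
.RE
.\" The second RE closes the outer inset.
.P
Back at the base paragraph inset.
```

Formatting this with something like `nroff -man` should show each RS shifting all following lines, while the IP amount only affects the paragraphs that use it. Each RS needs its own RE, which is the nesting problem the patches address.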
On Sun, Mar 17, 2024 at 02:24:29PM -0500, G. Branden Robinson wrote:
[looping in groff mailing list to pitch a terminological reform]
Hi Alex,
Hi Branden,
At 2024-03-17T19:56:10+0100, Alejandro Colomar wrote:
You already gained the first two spaces from the previous RS, in which you're nested. Remember that we have
.RS "\w' 'u" .IP \(bu "\w'\(bu 'u"
a few lines above. The second RS, by default, does the same as the previous IP, that is, "\w'\(bu 'u".
I see that I need to clarify the groff_man(7) page in this department. In fact, I suspect Doug McIlroy's term "prevailing indent" was tailor-made for expressing this behavior. But I want to amend it, in documentation and the "an.tmac" source file, to "prevailing _inset_", because "indentation" is overloaded to also refer to the additional spacing applied to (at least some) lines of an `IP`, `TP`, or (deprecated) `HP` paragraph.
Hmmm, I was about to say inset, but double-checked groff_man(7) to be sure of the exact term, and then I got confused even more: $ man groff_man | grep ' \.\(IP\|RS\) \[' .RS [inset‐amount] .IP [tag [indentation]] Since RS uses the same amount that IP used before it, it seems they could be the same thing. But then RS uses inset and IP uses indentation. How do pears be added to apples? :) Have a lovely day! Alex
The resulting concept is simple.
prevailing inset = base paragraph inset + sum of relative insets
Equivalently:
prevailing inset = value of `BP` register[1] + amounts in `RS` calls[2]
Regards, Branden
[1] forthcoming in groff 1.24
[2] for each active inset, if specified; if not, the indentation of the previous paragraph in the (sub)section is used, and if none (as with `P` and its synonyms), then the value of the `IN` register
I _think_ I've got that right.
-- <https://www.alejandro-colomar.es/> Looking for a remote C programming job at the moment.
Hi Branden, On Sun, Mar 17, 2024 at 08:31:55PM +0100, Alejandro Colomar wrote:
Hmmm, I was about to say inset, but double-checked groff_man(7) to be sure of the exact term, and then I got confused even more:
$ man groff_man | grep ' \.\(IP\|RS\) \[' .RS [inset‐amount] .IP [tag [indentation]]
Since RS uses the same amount that IP used before it, it seems they could be the same thing. But then RS uses inset and IP uses indentation. How do pears be added to apples? :)
I think this paragraph, about the IN register is where the confusion lies. So IN holds an amount that will be used as inset-amount or as indentation, depending on who uses it. That ambiguity was what confused me. Cheers, Alex -- <https://www.alejandro-colomar.es/> Looking for a remote C programming job at the moment.
On Sun, Mar 17, 2024 at 08:53:22PM +0100, Alejandro Colomar wrote:
Hi Branden,
On Sun, Mar 17, 2024 at 08:31:55PM +0100, Alejandro Colomar wrote:
Hmmm, I was about to say inset, but double-checked groff_man(7) to be sure of the exact term, and then I got confused even more:
$ man groff_man | grep ' \.\(IP\|RS\) \[' .RS [inset‐amount] .IP [tag [indentation]]
Since RS uses the same amount that IP used before it, it seems they could be the same thing. But then RS uses inset and IP uses indentation. How do pears be added to apples? :)
I think this paragraph, about the IN register is where the confusion lies. So IN holds an amount that will be used as inset-amount or as indentation, depending on who uses it. That ambiguity was what confused me.
Gah, I forgot to paste the paragraph. It's from groff_man(7). Ordinary paragraphs not within an .RS/.RE inset region are inset by the amount stored in the BP register; see section “Options” below. The IN register configures the default indentation amount used by .RS (as the inset‐amount), .IP, .TP, and the deprecated .HP; an overriding argument is a number plus an optional scaling unit. If no scaling unit is given, the man package assumes “n”. An indentation specified in a call to .IP, .TP, or the deprecated .HP persists until (1) another of these macros is called with an indentation argument, or (2) .SH, .SS, or .P or its synonyms is called; these clear the indentation entirely.
Cheers, Alex
-- <https://www.alejandro-colomar.es/> Looking for a remote C programming job at the moment.
Hi Alex, At 2024-03-17T20:54:12+0100, Alejandro Colomar wrote:
Gah, I forgot to paste the paragraph. It's from groff_man(7).
Ordinary paragraphs not within an .RS/.RE inset region are inset by the amount stored in the BP register; see section “Options” below. The IN register configures the default indentation amount used by .RS (as the inset‐amount), .IP, .TP, and the deprecated .HP; an overriding argument is a number plus an optional scaling unit. If no scaling unit is given, the man package assumes “n”. An indentation specified in a call to .IP, .TP, or the deprecated .HP persists until (1) another of these macros is called with an indentation argument, or (2) .SH, .SS, or .P or its synonyms is called; these clear the indentation entirely.
Yes, that's the paragraph that gave me pause when I reviewed it for this thread. So change may...prevail upon it soon. Regards, Branden
At 2024-03-17T20:31:55+0100, Alejandro Colomar wrote:
Hmmm, I was about to say inset, but double-checked groff_man(7) to be sure of the exact term, and then I got confused even more:
$ man groff_man | grep ' \.\(IP\|RS\) \[' .RS [inset‐amount] .IP [tag [indentation]]
Since RS uses the same amount that IP used before it, it seems they could be the same thing. But then RS uses inset and IP uses indentation. How do pears be added to apples? :)
Both are (horizontal) measurements, so they are commensurable and additive. Where they differ is in application. An inset applies to all output lines, period. An indentation applies only to a thing called a "paragraph". The various paragraphs types are distinguished primarily by where they apply indentation to their lines. As I put it in the groff's ms(7) documentation, which presents a similar matter... groff_ms(7): Paragraphs Paragraphing macros break, or terminate, any pending output line so that a new paragraph can begin. Several paragraph types are available, differing in how indentation applies to them: ... to the first output line of the paragraph, all output lines, or all but the first. These calls insert vertical space in the amount stored in the PD register, except at page or column breaks, ... Regards, Branden
[looping in Thomas Dickey since I mention him below] Hi Paul, At 2024-03-17T11:35:48-0700, Paul Eggert wrote:
On 2024-03-17 05:43, Alejandro Colomar wrote:
RS after IP, if the indentation amount is not specified, takes the same indentation that IP had. The values being used were wrong, and by removing them, we're fixing the indentation of the page. Also, one RS was not just incorrect, but completely unnecessary, and there was a missing RE.
Thanks for the bug report. Unfortunately the proposed fix generates too much white space between the bullet and the tzh_timecnt entry, when generating PDF. I installed the attached patch instead, which fixes the wrong .RS/.RE nesting and indentation in a different way; hope it works for you too.
Six four-byte integer values, in the following order: -.RS "\w' \(bu 'u" +.RS "\w' \(bu 'u" .TP "\w' 'u" [...] already support the POSIX-required range [\-24:59:59, 25:59:59]. -.RS "\w' 'u" .IP \(bu "\w'\(bu 'u"
From what I've seen it's uncommon, especially in man pages, to measure the space width of the current font. (Also, non-roff man page formatters have historically had fits with this sort of input--or totally ignored it--because of the 257 people who decided to write something called "man2html" in a fever dream of conquering the world with their mighty Perl scripting powers during the dot-com era, none wanted to implement a full *roff numerical expression evaluator.)

Can I ask how the existing system of measurement units in *roff is unsatisfactory for your application?

Thomas Dickey has the following idiom for bulleted paragraphs.

    .de bP
    .ie n .IP \(bu 4
    .el .IP \(bu 2
    ..

It seems I can't talk him out of a macro definition for this application because he truly wants _wider_ spacing after the bullet on a terminal device than a typesetter. If his preferences were inverted, that is

    .de bP
    .ie n .IP \(bu 2
    .el .IP \(bu 4
    ..

...like that, then he could cut the Gordian Knot as follows and not define a macro at all.

    .IP \(bu 2m

On typesetters, an em is twice the width of en (`IP`'s default unit). But on terminals, 1 em equals 1 en (which is typographically true on devices lacking proportional type).

I bring all this up because as a man(7) macro language advocate and (unpaid) instructor, I strive to keep the language's expression as simple as possible. When inexperienced authors don't see scary syntax in others' pages, they're less likely to be discouraged, and perhaps more likely to stay on the horse and contribute worthy documentation.

At the same time I understand the authorial desire to bring one's work to a high degree of polish.

Regards,
Branden
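To compare the two bulleting idioms from the message above side by side, a small self-contained page along these lines could be formatted once with nroff and once with a typesetter device. The page title and item text are hypothetical; the macro body is the one quoted above.

```roff
.\" Hypothetical demo page comparing the two bullet idioms discussed above.
.TH BULLETDEMO 7 2024-03-17 "hypothetical example"
.\" Conditional macro as quoted: 4n on terminals (nroff), 2n on typesetters.
.de bP
.ie n .IP \(bu 4
.el .IP \(bu 2
..
.SH DESCRIPTION
Items using the conditional macro:
.bP
first item
.bP
second item
.P
Items using an em-based amount and no macro:
.\" Per the message above, 2m is as wide as 2n on a terminal
.\" and as wide as 4n on a typesetter, the inverse of bP.
.IP \(bu 2m
first item
.IP \(bu 2m
second item
```

The two lists should differ in opposite directions between terminal and typeset output, which is why the em-based form cannot replace this particular macro.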
On 2024-03-17 12:06, G. Branden Robinson wrote:
Can I ask how the existing system of measurement units in *roff is unsatisfactory for your application?
Previously, tzfile.5 used only directives like ".IP *", ".IP * 2", ".RS", and ".RE" to control indenting. But after Alex suggested here: https://mm.icann.org/pipermail/tz/2023-October/033116.html that we switch to ".IP * 3", I noticed that the resulting PDF output had too much white space around the "*", even though the nroff output looked sorta OK. (The problem had already been present with "2", but it got worse with "3".) The problem got even a bit worse if I used "\(bu" instead of "*". So the patch I installed computed widths with \w instead. See: https://mm.icann.org/pipermail/tz/2023-October/058168.html
I bring all this up because as a man(7) macro language advocate and (unpaid) instructor, I strive to keep the language's expression as simple as possible.
Yes, if users don't care about PDF or varying-width HTML output there's no point to using \w here. The TZDB man pages already used \w for other things (lining up code and tables). Although a man page formatter that can't handle \w may be out of luck with \w in .IP directives, they were out of luck already.
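As an illustration of the difference being described, the following hypothetical page sets one bullet with a fixed amount and one with a width measured by \w. The page title and text are made up; the `\w'\(bu 'u` expression is the one used in the tzfile.5 patches in this thread.

```roff
.\" Hypothetical demo page; only the two IP calls matter.
.TH WDEMO 7 2024-03-17 "hypothetical example"
.SH DESCRIPTION
Fixed amount, in ens (the default unit of the second argument to IP):
.IP \(bu 2
text after a fixed two-en indentation
.P
Measured amount:
.\" \w'...' yields the width of its argument in basic units, hence the
.\" trailing "u" scaling unit.  The argument is a bullet plus one space.
.IP \(bu "\w'\(bu 'u"
text after an indentation exactly as wide as a bullet and a space
```

On a terminal the two forms are expected to look much alike; the measured form should only make a visible difference with proportional fonts, such as in PDF output.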
Hi Paul & Branden, On Sun, Mar 17, 2024 at 03:07:49PM -0700, Paul Eggert wrote:
On 2024-03-17 12:06, G. Branden Robinson wrote:
Can I ask how the existing system of measurement units in *roff is unsatisfactory for your application?
Previously, tzfile.5 used only directives like ".IP *", ".IP * 2", ".RS", and ".RE" to control indenting. But after Alex suggested here:
https://mm.icann.org/pipermail/tz/2023-October/033116.html
that we switch to ".IP * 3", I noticed that the resulting PDF output had too much white space around the "*", even though the nroff output looked sorta OK. (The problem had already been present with "2", but it got worse with "3".) The problem got even a bit worse if I used "\(bu" instead of "*". So the patch I installed computed widths with \w instead. See:
https://mm.icann.org/pipermail/tz/2023-October/058168.html
I bring all this up because as a man(7) macro language advocate and (unpaid) instructor, I strive to keep the language's expression as simple as possible.
Yes, if users don't care about PDF or varying-width HTML output there's no point to using \w here.
I didn't want to suggest changing that, as it's personal taste. I personally don't find the indentation of IP \[bu] 3 that we use in the Linux man-pages, but then we don't wrap that in RS 2, so it's not an apples-to-apples comparison. In case you want to have a quick look at how it looks, here's an example from the Linux man-pages: <https://www.alejandro-colomar.es/share/dist/man-pages/git/HEAD/man-pages-HEA...> I personally think it's fine what you're doing, and don't find your output ugly. Yeah, the source code is... meh. But it looks reasonable to me. It's sad that some programs crash on that input.
The TZDB man pages already used \w for other things (lining up code and tables). Although a man page formatter that can't handle \w may be out of luck with \w in .IP directives, they were out of luck already.
Actually not. Surprising as it may be, Debian's man2html(1) could handle (probably by ignoring them; I didn't really check) previous uses of \w, but started crashing with \w in IP. Did you receive a copy of the Debian bug report? Have a lovely day! Alex -- <https://www.alejandro-colomar.es/> Looking for a remote C programming job at the moment.
On 2024-03-17 15:20, Alejandro Colomar wrote:
In case you want to have a quick look at how it looks, here's an example from the Linux man-pages:
<https://www.alejandro-colomar.es/share/dist/man-pages/git/HEAD/man-pages-HEA...>
Yes, unfortunately that looks subpar to me. There's too much space between the bullets and the text they're bulleting. For example, in the last page of man-pages(7) the bullets should be indented with respect to the parent text, and there should be less space between the bullets and the text. Much better is what tzfile(5) does now (see attached); this is particularly important when something is nested under the bullet level, as it is in tzfile(5). The current tzfile(5) bulleting approach is closer to how Joe Ossanna used bullets in section 7.2 of the Nroff/Troff User's Manual (1976)[1], which is what I learned troff from. (Ossanna doesn't subindent so his larger indents are not that much of a problem in the manual, but tzfile(5) needs to subindent.) There are other things not to like about the man page PDF output. The man pages are confused about when to use constant-width fonts vs varying-width fonts. The lines are too long to read comfortably; this is inherent to how a good font squeezes in more text. Indents are too large in general. The PDF man pages should be formatted for smaller pages, or with tons more margin, or two-column, or something. Of course I realize we can't fix all this, as there's long tradition of hasty and/or bad formatting dating back to 7th Edition Unix man pages. Still, if someone wants to make little improvements we should let them.
Surprising as it may be, Debian's man2html(1) could handle (probably by ignoring them; I didn't really check) previous uses of \w, but started crashing with \w in IP. Did you receive a copy of the Debian bug report?
I followed up separately to that. In short, that man2html appears to be unmaintained upstream and should be retired, but I sent in a patch anyway. [1]: https://www.tuhs.org/Archive/Documentation/Manuals/Unix_4.0/Volume_1/C.1.2_N...
[looping in linux-man@, as we discuss about improvements in the Linux man pages' PDF book] Hi Paul, On Sun, Mar 17, 2024 at 09:59:41PM -0700, Paul Eggert wrote:
On 2024-03-17 15:20, Alejandro Colomar wrote:
In case you want to have a quick look at how it looks, here's an example from the Linux man-pages:
<https://www.alejandro-colomar.es/share/dist/man-pages/git/HEAD/man-pages-HEA...>
Yes, unfortunately that looks subpar to me. There's too much space between the bullets and the text they're bulleting. For example, in the last page of man-pages(7) the bullets should be indented with respect to the parent text, and there should be less space between the bullets and the text. Much better is what tzfile(5) does now (see attached); this is particularly important when something is nested under the bullet level, as it is in tzfile(5). The current tzfile(5) bulleting approach is closer to how Joe Ossanna used bullets in section 7.2 of the Nroff/Troff User's Manual (1976)[1], which is what I learned troff from. (Ossanna doesn't subindent so his larger indents are not that much of a problem in the manual, but tzfile(5) needs to subindent.)
Hmm, while Ossanna's indents might be a bit excessive, TZDB's might be too short. Maybe I would RS 4 spaces instead of 2 before the tag. Maybe you being used to programs with 2 spaces and me with 1 tab means we have our brains hard-wired for different indentation width preferences. But I kind of do like pre-indenting bullets; in some cases I've felt that having the bullets not indented was sub-par, but wasn't convinced enough to go and pre-indent them, since that would add complexity, and also allow less room for text in terminals.
There are other things not to like about the man page PDF output. The man pages are confused about when to use constant-width fonts vs varying-width fonts.
Can you please point to an example of this? I try to be consistent, but probably there are still cases that I haven't fixed due to lack of time.
The lines are too long to read comfortably; this is inherent to how a good font squeezes in more text.
I'm not sure I understand this. Do you mean there are too many letters in a line in the Linux man-pages PDF or too few? If we compare <https://www.alejandro-colomar.es/share/dist/man-pages/git/HEAD/man-pages-HEA...> with the PDF you attached to your email, you can see there are fewer words in a line in the Linux man-pages PDF than in yours. Also, your PDF has slightly smaller margins. When I first saw the PDF book, I had a feeling that lines were too long, and that a larger/better font might be necessary.
Indents are too large in general. The PDF man pages should be formatted for smaller pages, or with tons more margin, or two-column, or something. Of course I realize we can't fix all this, as there's long tradition of hasty and/or bad formatting dating back to 7th Edition Unix man pages. Still, if someone wants to make little improvements we should let them.
Sure. I do accept improvements for that. If you have more specific suggestions, or even patches, they're welcome!
Surprising as it may be, Debian's man2html(1) could handle (probably by ignoring them; I didn't really check) previous uses of \w, but started crashing with \w in IP. Did you receive a copy of the Debian bug report?
I followed up separately to that. In short, that man2html appears to be unmaintained upstream and should be retired, but I sent in a patch anyway.
Thanks. Have a lovely day! Alex
[1]: https://www.tuhs.org/Archive/Documentation/Manuals/Unix_4.0/Volume_1/C.1.2_N...
-- <https://www.alejandro-colomar.es/> Looking for a remote C programming job at the moment.
On 2024-03-18 01:35, Alejandro Colomar wrote:
Hmm, while Ossanna's indents might be a bit excessive, TZDB's might be too short. Maybe I would RS 4 spaces instead of 2 before the tag.
That'd be too long for the nroff case. The nroff case is a bit too long already. It looks like the following in the current TZDB version:

    The goals of this section are:

      o  to help TZif writers output files that avoid common
         pitfalls in older or buggy TZif readers,

      o  to help TZif readers avoid common pitfalls when reading
         files generated by future TZif writers, and

... and if there were four spaces (instead of two) around the bullets, it'd be too much white space.

Of course we could indent more or less depending on whether it's nroff or troff, but that's complexity I'd rather avoid.
I kind of do like pre-indenting bullets; in some cases I've felt that having the bullets not indented was sub-par, but wasn't convinced enough to go and pre-indent them, since that would add complexity, and also allow less room for text in terminals.
Glad you like preindenting. As you say, once one does it, one should use less white space.
There are other things not to like about the man page PDF output. The man pages are confused about when to use constant-width fonts vs varying-width fonts.
Can you please point to an example of this? I try to be consistent, but probably there are still cases that I haven't fixed due to lack of time.
See the attached, which is the output of "groff -man -Tpdf zdump.8".

First, I had to do shenanigans like this:

    .ie \n(.g .ds - \f(CR-\fP
    .el .ds - \-

and later use \*- every time I wanted to specify a zdump option like -v. Using plain "-v" in zdump.8 doesn't work, because it generates a hyphen in troff mode and hyphens are too narrow. Using "\-v" doesn't work, because it generates a mathematical minus sign in the PDF, which differs from "-", which means you can't easily search for "-v" in the PDF.

So I have to use "\*-v" with the above code. And this means the "-" is in a different font than the "v".

On page 2, there are some examples in constant width font to make things line up. But shouldn't we be using constant width font for all code? That's what the rest of the world is doing nowadays (or, if you want to be fancy, a sans serif font that stands out in a different way).

But Linux man page fonts are still stuck with a style defined by the limitations of the 1970s C/A/T phototypesetter <https://en.wikipedia.org/wiki/CAT_(phototypesetter)> and are using Times Bold and Times Italic to refer to program and file names.

Also, it should be ragged right, in both nroff and troff output. Right-adjusted text looks nicer but is less functional, and man pages should be all about function. (See the reference below.)
The lines are too long to read comfortably; this is inherent to how a good font squeezes in more text.
I'm not sure I understand this. Do you mean there are too many letters in a line in the Linux man-pages PDF or too few?
Too many. I'm getting about 100 characters per line in the PDF, which is on the extreme high end of the usual recommendations (it should be closer to 60 characters per line). There's no single answer here of course (opinions do differ), but the man page lines are pretty clearly too long in the PDFs. See: Nanavati AA, Bias RG. Optimal line length in reading - a literature review. Visible Language. 2005;39(2):120-44. https://journals.uc.edu/index.php/vl/article/view/5765
If we compare <https://www.alejandro-colomar.es/share/dist/man-pages/git/HEAD/man-pages-HEA...> with the PDF you attached to your email, you can see there are fewer words in a line in the Linux man-pages PDF than in yours. Also, your PDF has slightly smaller margins.
They're pretty close, and both have too many characters per line.
Hi Paul, Branden, On Sun, Apr 07, 2024 at 11:33:38PM -0700, Paul Eggert wrote:
On 2024-03-18 01:35, Alejandro Colomar wrote:
Hmm, while Ossanna's indents might be a bit excessive, TZDB's might be too short. Maybe I would RS 4 spaces instead of 2 before the tag.
That'd be too long for the nroff case. The nroff case is a bit too long already. It looks like the following in the current TZDB version:
The goals of this section are:
o to help TZif writers output files that avoid common pitfalls in older or buggy TZif readers,
o to help TZif readers avoid common pitfalls when reading files generated by future TZif writers, and
... and if there were four spaces (instead of two) around the bullets, it'd be too much white space.
Of course we could indent more or less depending on whether it's nroff or troff, but that's complexity I'd rather avoid.
Yeah, I was thinking only of the typeset version. And I agree in not wanting the complexity of a conditional.
I kind of do like pre-indenting bullets; in some cases I've felt that having the bullets not indented was sub-par, but wasn't convinced enough to go and pre-indent them, since that would add complexity, and also allow less room for text in terminals.
Glad you like preindenting. As you say, once one does it, one should use less white space.
I'll think about it. Maybe I add some preindent to the Linux man-pages.
There are other things not to like about the man page PDF output. The man pages are confused about when to use constant-width fonts vs varying-width fonts.
Can you please point to an example of this? I try to be consistent, but probably there are still cases that I haven't fixed due to lack of time.
See the attached, which is the output of "groff -man -Tpdf zdump.8".
First, I had to do shenanigans like this:
    .ie \n(.g .ds - \f(CR-\fP
    .el .ds - \-

and later use \*- every time I wanted to specify a zdump option like -v. Using plain "-v" in zdump.8 doesn't work, because it generates a hyphen in troff mode and hyphens are too narrow. Using "\-v" doesn't work, because it generates a mathematical minus sign in the PDF, which differs from "-", which means you can't easily search for "-v" in the PDF.
Hmmm. I use "\-v" in the Linux man-pages, and it works, in the sense that you can search for "-v" with ^F in the PDF viewer. See <https://kernel.org/pub/linux/docs/man-pages/book/man-pages-6.7.pdf#ldconfig....> It works for me in all the readers I tried, which are firefox(1), atril(1), and okular(1). In what systems does it not work for you?
So I have to use "\*-v" with the above code. And this means the "-" is in a different font than the "v".
On page 2, there are some examples in constant width font to make things line up. But shouldn't we be using constant width font for all code? That's what the rest of the world is doing nowadays (or, if you want to be fancy, a sans serif font that stands out in a different way).
Hmmm, with a set of macros C CR RC CI and IC to use them it could be a good idea. Branden, how does it look to you? I don't think CB and BC would be necessary.
But Linux man page fonts are still stuck with a style defined by the limitations of the 1970s C/A/T phototypesetter <https://en.wikipedia.org/wiki/CAT_(phototypesetter)> and are using Times Bold and Times Italic to refer to program and file names.
Also, it should be ragged right, in both nroff and troff output. Right-adjusted text looks nicer but is less functional, and man pages should be all about function. (See the reference below.)
You can probably configure that in man.local, no? I know at least you can disable hyphenation, which solves most of the functional problems.
The lines are too long to read comfortably; this is inherent to how a good font squeezes in more text.
I'm not sure I understand this. Do you mean there are too many letters in a line in the Linux man-pages PDF or too few?
Too many. I'm getting about 100 characters per line in the PDF, which is on the extreme high end of the usual recommendations (it should be closer to 60 characters per line).
Completely agree. CC += groff. Branden, do you think we can fix that somehow? Literally, the first thing I thought about the Linux man-pages PDF when I saw it was "Lines are so long that it's hard for me to read them.". Well, it was the second; I first saw the front page, which was beautiful; that thought was the first one when I saw the first page after the front.
There's no single answer here of course (opinions do differ), but the man page lines are pretty clearly too long in the PDFs. See:
Nanavati AA, Bias RG. Optimal line length in reading - a literature review. Visible Language. 2005;39(2):120-44. https://journals.uc.edu/index.php/vl/article/view/5765
Hmmmm. Very interesting.
If we compare <https://www.alejandro-colomar.es/share/dist/man-pages/git/HEAD/man-pages-HEA...> with the PDF you attached to your email, you can see there are fewer words in a line in the Linux man-pages PDF than in yours. Also, your PDF has slightly smaller margins.
They're pretty close, and both have too many characters per line.
Yup. Have a lovely day! Alex -- <https://www.alejandro-colomar.es/>
On 2024-04-08 01:31, Alejandro Colomar wrote:
Hmmm. I use "\-v" in the Linux man-pages, and it works
Ha! I just checked and it works for me too. It did not work in 2014. Apparently since 2014 PDF and HTML viewers have gotten smarter about searching, so that "-" matches any form of dash. So perhaps I should remove this \- complication from the TZDB man pages.
You can probably configure that in man.local, no? I know at least you can disable hyphenation, which solves most of the functional problems.
Fine, but that should be the default. Users shouldn't have to fiddle with man.local to tailor output format to be good for the usual case. man.local should be for the unusual cases.
On 2024-04-08 10:46, Paul Eggert via tz wrote:
On 2024-04-08 01:31, Alejandro Colomar wrote:
Hmmm. I use "\-v" in the Linux man-pages, and it works
Ha! I just checked and it works for me too. It did not work in 2014. A
Unfortunately I spoke too quickly, as this does not work with Solaris 10 troff. On that platform, this command: printf ' - \\- \\(mi\n' | troff | dpost outputs a .ps file that, when converted to PDF, gives you U+002D (HYPHEN-MINUS), U+2013 (EN DASH), U+2212 (MINUS SIGN). So if TZDB wants to play nicely even on this obsolescent platform, it still needs to play its game with \*- instead. Oh well.
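To make the three code points above concrete, here is a minimal Python sketch of the kind of dash folding a viewer's search might apply so that "-v" matches regardless of which dash was typeset. The folding table and both function names are hypothetical illustrations, not a description of how any particular PDF viewer works.

```python
# Hypothetical dash folding: map dash-like characters to ASCII hyphen-minus
# before comparing, so a search for "-v" also matches text typeset with
# \- (seen above as EN DASH on Solaris 10) or \(mi (MINUS SIGN).
DASH_LIKE = {
    "\u2010": "-",  # HYPHEN
    "\u2013": "-",  # EN DASH
    "\u2212": "-",  # MINUS SIGN
}
_FOLD = str.maketrans(DASH_LIKE)


def fold_dashes(text: str) -> str:
    """Return text with every dash-like character replaced by U+002D."""
    return text.translate(_FOLD)


def find_option(haystack: str, needle: str) -> int:
    """Index of needle in haystack after dash folding, or -1 if absent."""
    return fold_dashes(haystack).find(fold_dashes(needle))


if __name__ == "__main__":
    # The three characters reported for the Solaris 10 troff pipeline.
    reported = "\u002d\u2013\u2212"
    print([f"U+{ord(c):04X}" for c in reported])
    # A plain search misses the minus sign; the folded search finds it.
    print("zdump \u2212v".find("-v"), find_option("zdump \u2212v", "-v"))
```

Because the folding is one character to one character, match offsets in the folded text are also valid offsets in the original text.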
[Caveat lector: this is not a short email and I hyperlink to multiple longer ones]
Hi Paul & Alex,
At 2024-04-07T23:33:38-0700, Paul Eggert wrote:
On 2024-03-18 01:35, Alejandro Colomar wrote:
Hmm, while Ossana's indents might be a bit excessive, TZDB's might be too short. Maybe I would RS 4 spaces instead of 2 before the tag.
That'd be too long for the nroff case. The nroff case is a bit too long already. It looks like the following in the current TZDB version:
The goals of this section are:
o to help TZif writers output files that avoid common pitfalls in older or buggy TZif readers,
o to help TZif readers avoid common pitfalls when reading files generated by future TZif writers, and
... and if there were four spaces (instead of two) around the bullets, it'd be too much white space.
Of course we could indent more or less depending on whether it's nroff or troff, but that's complexity I'd rather avoid.
Depending on what you want, you can apply a scaling unit to the measurement. On terminals, 1 em equals 1 en, but on typesetters they're different (1 en is one half em). This doesn't work for Thomas Dickey's case, unfortunately, where he wants _wider_ spacing on terminals than typesetters. For example:

man/curs_addwstr.3x:.de bP
man/curs_addwstr.3x-.ie n .IP \(bu 4
man/curs_addwstr.3x-.el .IP \(bu 2
man/curs_addwstr.3x-..
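The em/en relationship described above can be worked through numerically. The following Python sketch is only an illustration: the function name and its parameters are hypothetical, and it assumes that on a typesetter 1 em equals the current type size in points, while on a terminal both units are one character cell.

```python
def indent_width(amount: float, unit: str, point_size: float = 10.0,
                 typesetter: bool = True) -> float:
    """Width of an indentation given in ems ("m") or ens ("n").

    Returns points for a typesetter and character cells for a terminal.
    """
    if unit not in ("m", "n"):
        raise ValueError("unit must be 'm' or 'n'")
    if not typesetter:
        # On terminals 1 em equals 1 en: one character cell each.
        return amount
    em = point_size  # assumption: 1 em is the type size in points
    en = em / 2      # 1 en is one half em
    return amount * (em if unit == "m" else en)


if __name__ == "__main__":
    for unit in ("n", "m"):
        print(f"2{unit}: terminal={indent_width(2, unit, typesetter=False)} cells,"
              f" typesetter={indent_width(2, unit)} pt")
```

Under these assumptions a 2n indent is 2 cells on a terminal and 10 points at 10-point type, whereas 2m is still 2 cells on a terminal but 20 points when typeset. Choosing the unit can therefore widen the typeset indent relative to the terminal one, but never the reverse, which is why it does not help in the case mentioned above.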
There are other things not to like about the man page PDF output. The man pages are confused about when to use constant-width fonts vs varying-width fonts.
Can you please point to an example of this? I try to be consistent, but probably there are still cases that I haven't fixed due to lack of time.
See the attached, which is the output of "groff -man -Tpdf zdump.8".
First, I had to do shenanigans like this:
.ie \n(.g .ds - \f(CR-\fP
.el .ds - \-
and later use \*- every time I wanted to specify a zdump option like -v. Using plain "-v" in zdump.8 doesn't work, because it generates a hyphen in troff mode and hyphens are too narrow. Using "\-v" doesn't work, because it generates a mathematical minus sign in the PDF, which differs from "-", which means you can't easily search for "-v" in the PDF. So I have to use "\*-v" with the above code. And this means the "-" is in a different font than the "v".
Like Alex, I am curious why the PDF CMap isn't solving the copy-and-paste part of this problem.
On page 2, there are some examples in constant width font to make things line up. But shouldn't we be using constant width font for all code?
I'd say no. For code _displays_, sure. Inline? That's less certain. Used _judiciously_, the way Brian Kernighan does, it's fine.

mdoc mavens like to pound the table, trumpeting the superiority of their "semantic" macros. The problem is that for many years, the coupling of macros of code/literal semantic denotation to the Courier typeface led to _horrible_ typography in groff, because things like the square brackets in "synopsis language" weren't in Courier--logically enough, because they're not "literal"--but this made it difficult to tell how wide the spaces you were looking at were, or if space was even present between a bracket and a semantically muscular adjacent code item.

I submit that mdoc advocates lost sight of basic readability. I guess it was more fun (and quite possibly more remunerative from an employer) performing automated transformations on semantic tags than attending to the basics of typesetting. (I'm no great practitioner myself! But I assume I'm not alone in preferring to be able to tell whether a space is present at a given location in the text, especially if it's showing me how to type in a Unix command, which follow varying conventions.)

I recently drove a bulldozer through this nonsense in groff Git HEAD and am steeling myself for a summons to the International Criminal Court on charges of "semantic heresy" levied by people who don't even use groff anyway, but mandoc(1). (mandoc(1)'s solution to the typesetting problem is to format HTML and then use a third-party tool to convert HTML to a PDF.)

https://lists.gnu.org/archive/html/groff/2024-03/msg00163.html
That's what the rest of the world is doing nowadays (or, if you want to be fancy, a sans serif font that stands out in a different way). But Linux man page fonts are still stuck with a style defined by the limitations of the 1970s C/A/T phototypesetter <https://en.wikipedia.org/wiki/CAT_(phototypesetter)> and are using Times Bold and Times Italic to refer to program and file names.
Not exactly. With groff, you can remap these font names. I recently showed a groff mailing list subscriber who _hates_ monospaced fonts (and especially Courier) how to customize the way it's rendered in his system. <https://lists.gnu.org/archive/html/groff/2024-03/msg00181.html>:
... I don't think many people hate monospaced fonts as much as you do. I won't change the default but I will do my best to ensure that you can perform the font substitutions you desire. In fact I think I have.
Here's a recipe that works with groff 1.23.0 and Git HEAD.
$ git di tmac/
diff --git a/tmac/man.local b/tmac/man.local
index 8f75330bf..491bda1ee 100644
--- a/tmac/man.local
+++ b/tmac/man.local
@@ -21,6 +21,12 @@
 .\" 4: x-man-doc://1/groff -- ManOpen (Mac OS X pre-2005)
 .\" Set this register to configure which the `MR` macro uses.
 .\" .nr an*MR-URL-format 1
+.
+.ftr CR AR
+.ftr CB AB
+.ftr CI AI
+.ftr CBI ABI
+.ds an*example-family A\"
 .\"
 .\" Local Variables:
 .\" mode: nroff
(The `ftr` requests are necessary to snag "manual" font selections with `ft` requests or, more likely, `\f` escape sequences, and to handle tbl(1) tables using those font names.)
Apply the foregoing patch to wherever your man.local is installed. The "Files" section of groff_man(7) in groff 1.23.0 (or Git) should tell you where it is.
Also, it should be ragged right, in both nroff and troff output.
This _also_ came up just in the past month or so on the groff list. https://lists.gnu.org/archive/html/groff/2024-03/msg00079.html Short version: historical practice in this area has been divergent with respect to nroff output, and overwhelmingly against your preference with respect to troff. (That doesn't mean you're _wrong_. But you might be iconoclastic.)
Right-adjusted text looks nicer but is less functional, and man pages should be all about function. (See the reference below.)
"All about" can be enlisted to do entirely too much work here, but I agree with the principle.
The lines are too long to read comfortably; this is inherent to how a good font squeezes in more text.
I'm not sure I understand this. Do you mean there are too many letters in a line in the Linux man-pages PDF or too few?
Too many. I'm getting about 100 characters per line in the PDF, which is on the extreme high end of the usual recommendations (it should be closer to 60 characters per line). There's no single answer here of course (opinions do differ), but the man page lines are pretty clearly too long in the PDFs.
One straightforward means of addressing this problem is simply to typeset the manual at a larger type size. Say, 11 or 12 points. groff's supported that for a couple of decades. For these sizes, Werner Lemberg even chose vertical spacing counterparts inspired by TeX.

groff_man(7):
    -rStype‐size
        Use type‐size for the document’s body text; acceptable values are 10, 11, or 12 points. See subsection “Font style macros” above for the default.
See:
Nanavati AA, Bias RG. Optimal line length in reading - a literature review. Visible Language. 2005;39(2):120-44. https://journals.uc.edu/index.php/vl/article/view/5765
I've got this queued up to read during a doctor's appointment today. (More like a waiting room appointment.)

I have a personal shell function that exercises the new groff man `PO` register to use the default line length but center the man page in the terminal window, and have been enjoying it for months.

An inevitable problem we will face in trying to set man pages on narrower lines is the heavy use of tables and other means of filling disablement by page authors. No sooner did they get a feel for the 13n of additional elbow room that groff gave them (over historical *roffs' 65n), than they started overrunning that limit too.

Documenters of C wanted function synopses to look just so, and turned off filling to get it. Other page authors wanted to depict what the terminal would look like, and ran roughshod over considerations of circumstances under which a man page might actually be typeset.

I wouldn't be at all averse to reimposing a 65n line length limit as a _style_ recommendation. And I think I know where to poke the formatter to get it to emit a warning diagnostic if the line length is overrun when filling is disabled. (This would be kin to TeX's notoriously discouraging "overfull hbox" warnings, but if I can't write a diagnostic message more intelligible than that, I'll put in for retirement.)

At 2024-04-08T10:31:32+0200, Alejandro Colomar wrote:
Hmmm, with a set of macros C CR RC CI and IC to use them it could be a good idea. Branden, how does it look to you? I don't think CB and BC would be necessary.
I don't like that idea at all. I don't want to add _any_ more font macros to man(7).

Incidentally, Eighth or Ninth Edition Unix man(7) did the foregoing, with a "literal" font named "L" in the package. (It mapped to Courier, as I recall.) Stepping up an abstraction level was good. Growing another dimension in the vector space of font macros was less good. Not being taken up anywhere else is neither good nor bad, but may be suggestive.

However, that doesn't mean we must surrender to futility. My idea for attacking this is the user-definable "tag class".

https://lists.gnu.org/archive/html/groff/2023-10/msg00034.html
Too many. I'm getting about 100 characters per line in the PDF, which is on the extreme high end of the usual recommendations (it should be closer to 60 characters per line).
Completely agree. CC += groff. Branden, do you think we can fix that somehow? Literally, the first thing I thought about the Linux man-pages PDF when I saw it was "Lines are so long that it's hard for me to read them.". Well, it was the second; I first saw the front page, which was beautiful; that thought was the first one when I saw the first page after the front.
Pass `-rS11` (or -rS12) to the formatter when building and see if you like the result. Regards, Branden
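As a rough back-of-the-envelope check on what a larger type size buys, the Python sketch below scales the observed figure of about 100 characters per line. It is an estimate only: it assumes the line length stays fixed, that average glyph width grows in proportion to the type size, and that the 100-character observation was made at 10 points. The function name is hypothetical.

```python
def estimated_chars_per_line(observed: float, old_size: float,
                             new_size: float) -> float:
    """Scale an observed characters-per-line count to another type size.

    Assumes a fixed line length and glyph widths proportional to type size.
    """
    return observed * old_size / new_size


if __name__ == "__main__":
    for size in (10, 11, 12):
        estimate = estimated_chars_per_line(100, 10, size)
        print(f"-rS{size}: about {round(estimate)} characters per line")
```

On these assumptions 12-point type brings the estimate down to roughly 83 characters per line, which is an improvement but still well above the 60 or so suggested earlier in the thread.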
Hi Branden! On Mon, Apr 08, 2024 at 10:59:25AM -0500, G. Branden Robinson wrote:
[Caveat lector: this is not a short email and I hyperlink to multiple longer ones]
Hi Paul & Alex,
At 2024-04-07T23:33:38-0700, Paul Eggert wrote:
The lines are too long to read comfortably; this is inherent to how a good font squeezes in more text.
I'm not sure I understand this. Do you mean there are too many letters in a line in the Linux man-pages PDF or too few?
Too many. I'm getting about 100 characters per line in the PDF, which is on the extreme high end of the usual recommendations (it should be closer to 60 characters per line). There's no single answer here of course (opinions do differ), but the man page lines are pretty clearly too long in the PDFs.
One straightforward means of addressing this problem is simply to typeset the manual at a larger type size. Say, 11 or 12 points. groff's supported that for a couple of decades. For these sizes, Werner Lemberg even chose vertical spacing counterparts inspired by TeX.
groff_man(7):
    -rStype‐size
        Use type‐size for the document’s body text; acceptable values are 10, 11, or 12 points. See subsection “Font style macros” above for the default.
See:
Nanavati AA, Bias RG. Optimal line length in reading - a literature review. Visible Language. 2005;39(2):120-44. https://journals.uc.edu/index.php/vl/article/view/5765
I've got this queued up to read during a doctor's appointment today. (More like a waiting room appointment.)
I have a personal shell function that exercises the new groff man `PO` register to use the default line length but center the man page in the terminal window, and have been enjoying it for months.
An inevitable problem we will face in trying to set man pages on narrower lines is the heavy use of tables and other means of filling disablement by page authors. No sooner did they get a feel for the 13n of additional elbow room that groff gave them (over historical *roffs' 65n), than they started overrunning that limit too.
Documenters of C wanted function synopses to look just so, and turned off filling to get it. Other page authors wanted to depict what the terminal would look like, and ran roughshod over considerations of circumstances under which a man page might actually be typeset.
I wouldn't be at all averse to reimposing a 65n line length limit as a _style_ recommendation. And I think I know where to poke the formatter to get it to emit a warning diagnostic if the line length is overrun when filling is disabled. (This would be kin to TeX's notoriously discouraging "overfull hbox" warnings, but if I can't write a diagnostic message more intelligible than that, I'll put in for retirement.)
Since manual pages often have a few levels of indentation, lines need to be rather wide on the terminal (and using those levels of indentation, the actual length of the text isn't too much). I wouldn't narrow the line length in nroff(1) mode. I find troff(1) mode the one that's hardly readable by default.
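The overrun diagnostic discussed above would live inside the formatter, but the idea can be approximated from outside it. The Python sketch below is a hypothetical lint, not anything groff provides: it scans man page source for text lines inside `.nf`/`.fi` or `.EX`/`.EE` regions and reports those longer than 65 characters. It counts raw source characters, so escape sequences inflate the count, and it ignores indentation and text produced by macro calls.

```python
import re

LIMIT = 65  # the 65n style limit discussed above, counted here in characters


def overlong_unfilled_lines(source: str, limit: int = LIMIT):
    """Yield (line_number, length) for overlong text lines in no-fill regions.

    Only .nf/.fi and .EX/.EE are tracked; lengths are raw source lengths.
    """
    unfilled = False
    for number, line in enumerate(source.splitlines(), start=1):
        if line[:1] in (".", "'"):
            # Control line: update the fill state, never report it as text.
            match = re.match(r"[.']\s*([A-Za-z]+)", line)
            name = match.group(1) if match else ""
            if name in ("nf", "EX"):
                unfilled = True
            elif name in ("fi", "EE"):
                unfilled = False
            continue
        if unfilled and len(line) > limit:
            yield number, len(line)


if __name__ == "__main__":
    page = ".SH EXAMPLES\n.nf\n" + "x" * 70 + "\nshort\n.fi\n" + "y" * 80 + "\n"
    for number, length in overlong_unfilled_lines(page):
        print(f"line {number}: {length} characters in unfilled text")
```

The 80-character line after `.fi` is not reported, since filled text is rewrapped by the formatter and cannot overrun in the same way.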
At 2024-04-08T10:31:32+0200, Alejandro Colomar wrote:
Hmmm, with a set of macros C CR RC CI and IC to use them it could be a good idea. Branden, how does it look to you? I don't think CB and BC would be necessary.
I don't like that idea at all. I don't want to add _any_ more font macros to man(7).
Okay.
Too many. I'm getting about 100 characters per line in the PDF, which is on the extreme high end of the usual recommendations (it should be closer to 60 characters per line).
Completely agree. CC += groff. Branden, do you think we can fix that somehow? Literally, the first thing I thought about the Linux man-pages PDF when I saw it was "Lines are so long that it's hard for me to read them.". Well, it was the second; I first saw the front page, which was beautiful; that thought was the first one when I saw the first page after the front.
Pass `-rS11` (or -rS12) to the formatter when building and see if you like the result.
Hmm, that's much more pleasing!

commit 5ba7ca38f758370c9cbfcb901aa0f0f1efb31f52 (HEAD -> contrib)
Author: Alejandro Colomar <alx@kernel.org>
Date:   Mon Apr 8 19:15:35 2024 +0200

    share/mk/: $TROFFFLAGS: Use a larger font size

    Link: <https://journals.uc.edu/index.php/vl/article/view/5765>
    Reported-by: Paul Eggert <eggert@cs.ucla.edu>
    Suggested-by: "G. Branden Robinson" <branden@debian.org>
    Cc: "Thomas E. Dickey" <dickey@his.com>
    Signed-off-by: Alejandro Colomar <alx@kernel.org>

diff --git a/share/mk/configure/build-depends/groff-base/troff.mk b/share/mk/configure/build-depends/groff-base/troff.mk
index 051172ce7..b9b7518cf 100644
--- a/share/mk/configure/build-depends/groff-base/troff.mk
+++ b/share/mk/configure/build-depends/groff-base/troff.mk
@@ -6,7 +6,9 @@ ifndef MAKEFILE_CONFIGURE_BUILD_DEPENDS_GROFF_BASE_TROFF_INCLUDED
 MAKEFILE_CONFIGURE_BUILD_DEPENDS_GROFF_BASE_TROFF_INCLUDED := 1
 
 
-DEFAULT_TROFFFLAGS := -wbreak
+DEFAULT_TROFFFLAGS := \
+    -wbreak \
+    -rS12
 EXTRA_TROFFFLAGS := 
 TROFFFLAGS := $(DEFAULT_TROFFFLAGS) $(EXTRA_TROFFFLAGS)
 TROFF := troff
Regards, Branden
Have a lovely day! Alex -- <https://www.alejandro-colomar.es/>
NetBSD maintains the same pages (https://nxr.netbsd.org/search?q=&project=src&defs=&refs=&path=lib%2Flibc%2Ft...) in mandoc format which is a semantics based format as opposed to a presentation one like man, if that's helpful. christos
On Mar 17, 2024, at 11:59 PM, Paul Eggert via tz <tz@iana.org> wrote:
On 2024-03-17 15:20, Alejandro Colomar wrote:
In case you want to have a quick look at how it looks, here's an example from the Linux man-pages: <https://www.alejandro-colomar.es/share/dist/man-pages/git/HEAD/man-pages-HEA...>
Yes, unfortunately that looks subpar to me. There's too much space between the bullets and the text they're bulleting. For example, in the last page of man-pages(7) the bullets should be indented with respect to the parent text, and there should be less space between the bullets and the text. Much better is what tzfile(5) does now (see attached); this is particularly important when something is nested under the bullet level, as it is in tzfile(5). The current tzfile(5) bulleting approach is closer to how Joe Ossanna used bullets in section 7.2 of the Nroff/Troff User's Manual (1976)[1], which is what I learned troff from. (Ossanna doesn't subindent so his larger indents are not that much of a problem in the manual, but tzfile(5) needs to subindent.)
There are other things not to like about the man page PDF output. The man pages are confused about when to use constant-width fonts vs varying-width fonts. The lines are too long to read comfortably; this is inherent to how a good font squeezes in more text. Indents are too large in general. The PDF man pages should be formatted for smaller pages, or with tons more margin, or two-column, or something. Of course I realize we can't fix all this, as there's long tradition of hasty and/or bad formatting dating back to 7th Edition Unix man pages. Still, if someone wants to make little improvements we should let them.
Surprising as it may be, Debian's man2html(1) could handle (probably by ignoring them; I didn't really check) previous uses of \w, but started crashing with \w in IP. Did you receive a copy of the Debian bug report?
I followed up separately to that. In short, that man2html appears to be unmaintained upstream and should be retired, but I sent in a patch anyway.
[looping in groff list because I started talking about my plans again]
Hi Paul,
At 2024-03-17T15:07:49-0700, Paul Eggert wrote:
On 2024-03-17 12:06, G. Branden Robinson wrote:
Can I ask how the existing system of measurement units in *roff is unsatisfactory for your application?
Previously, tzfile.5 used only directives like ".IP *", ".IP * 2", ".RS", and ".RE" to control indenting. But after Alex suggested here:
https://mm.icann.org/pipermail/tz/2023-October/033116.html
that we switch to ".IP * 3",
Ah. Hmm. I would not have made that suggestion, myself. For bullets and list enumerators, I find the robotic enforcement of a 2n separation between the paragraph marker and its content to be unnecessarily prescriptive. Consequently, groff 1.24 will no longer do so for `IP` paragraphs. (Rather, the minimum separation it enforces will be zero; it will permit abutment but not overlapping.) I think a lot depends on the sigil one chooses for a paragraph marker, which could be anything in Unicode, and of course on personal taste. (I wanted to maintain separation enforcement for the `TP` because paragraph tags are so often words or phrases. I have longer-term plans to perform automatic "tagging" (in the hyperlink sense) of paragraph tags, to facilitate improved navigation and search features in the man(7) applications.)
I noticed that the resulting PDF output had too much white space around the "*", even though the nroff output looked sorta OK. (The problem had already been present with "2", but it got worse with "3".) The problem got even a bit worse if I used "\(bu" instead of "*". So the patch I installed computed widths with \w instead. See:
Well, I won't tell you not to use, say, "IP \(bu 1" if you like it. ;-)
Yes, if users don't care about PDF or varying-width HTML output there's no point to using \w here.
If _you_ care about formatting man pages in PDF, you might be interested in some things I've recently landed in groff Git. https://lists.gnu.org/archive/html/groff/2024-01/msg00125.html https://lists.gnu.org/archive/html/groff/2024-03/msg00139.html
The TZDB man pages already used \w for other things (lining up code and tables). Although a man page formatter that can't handle \w may be out of luck with \w in .IP directives, they were out of luck already.
This sounds perfectly reasonable. Thanks for helping me to understand your use case. Regards, Branden