FW: timezone code - re-entrancy issue
I'm forwarding this message from John Dlugosz, who is not on the time zone mailing list. Those of you who are on the list, please direct replies appropriately.

--ado

-----Original Message-----
From: John Dlugosz [mailto:JDlugosz@TradeStation.com]
Sent: Tuesday, January 06, 2009 7:36
To: tz@lecserver.nci.nih.gov
Subject: timezone code - re-entrancy issue

At the bottom of localsub,

    tzname[tmp->tm_isdst] = &sp->chars[ttisp->tt_abbrind];

I think this is to update the tzname string to the last time zone abbr. actually used to produce the tm values.

It is not re-entrant. This is called even by localtime_r, which may be changing the values even as another thread (who called plain localtime) is reading them. It also changes all the time as the "binary search" is being done by mktime. That violates the "behaves as if never called" rule, and putting it in a subroutine doesn't help there.

Am I missing something?

--John
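A minimal sketch of the concern raised above, using toy stand-ins rather than the real tzcode structures: every name prefixed with toy_ is hypothetical. It only illustrates how a function with a re-entrant-looking signature can still store into a process-wide array on each call, so that probing two different transition records (as a mktime-style search might) flips the global.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the tzcode structures; not the real definitions. */
struct toy_ttinfo { int isdst; const char *abbr; };
struct toy_tm { int tm_isdst; };

/* Models the legacy global, shared by every thread in the process. */
const char *toy_tzname[2] = { "STD", "DST" };

/* Models the tail of localsub(): fills the caller's struct, then
 * also stores into the global array as a side effect. */
void toy_localsub(const struct toy_ttinfo *tt, struct toy_tm *out)
{
    assert(tt != NULL && out != NULL);
    out->tm_isdst = tt->isdst;
    toy_tzname[out->tm_isdst] = tt->abbr;   /* the non-re-entrant store */
}

/* "Re-entrant" wrapper: the result goes to caller-owned memory,
 * yet the global is still written on every call. */
struct toy_tm *toy_localtime_r(const struct toy_ttinfo *tt, struct toy_tm *out)
{
    toy_localsub(tt, out);
    return out;
}
```

Calling toy_localtime_r() first with a record whose abbreviation is "LMT" and then with one whose abbreviation is "EST" (both with isdst 0) leaves toy_tzname[0] pointing at whichever record was probed last, regardless of which thread asked.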
Date: Mon, 12 Jan 2009 13:09:52 -0500
From: "Olson, Arthur David (NIH/NCI) [E]" <olsona@dc37a.nci.nih.gov>
Message-ID: <B410D30A78C6404C9DABEA31B54A2813029A0408@nihcesmlbx10.nih.gov>

| At the bottom of localsub,
|
| tzname[tmp->tm_isdst] = &sp->chars[ttisp->tt_abbrind];
|
| I think this is to update the tzname string to the last time zone abbr.
| actually used to produce the tm values.
|
| It is not re-entrant.

We know. The tzname[] array is a bogus legacy interface that has to be maintained for compatibility with ancient applications. Nothing rational is going to be using it these days. Just ignore that nonsense - if your application has no need for this legacy compatibility, just delete tzname[] completely.

kre

ps: in most timezones, even though those assignments happen relatively frequently in theory, in practice they're assigning what are essentially constants (for the zone), so the values mostly don't ever actually change. This isn't guaranteed of course, but even if a threaded application were written badly enough that it was using global variables like this, in most cases no-one would ever notice the problems.
We know. The tzname[] array is a bogus legacy interface that has to be maintained for compatibility with ancient applications. Nothing rational is going to be using it these days. Just ignore that nonsense - if your application has no need for this legacy compatibility, just delete tzname[] completely.
So what is the way, in this code (or POSIX), to get the name of the timezone? And I assume what is really wanted is the name at the time of interest, so you can label output with the local time and the abbreviation. --John (Forgive me for using Outlook; I can't find a way to format quotes. Or to do anything else for that matter.)
Date: Mon, 12 Jan 2009 17:45:39 -0500
From: "John Dlugosz" <JDlugosz@TradeStation.com>
Message-ID: <450196A1AAAE4B42A00A8B27A59278E70925C93A@EXCHANGE.trad.tradestation.com>

| So what is the way, in this code (or POSIX), to get the name of the
| timezone?

For what you ask there, there is none, here, or elsewhere. tzname[] (despite its name) is not that, it is the current timezone abbreviation (which in this code is better obtained from tm_tzname if it is really needed for anything - I'd actually suggest avoiding its use if you possibly can, use the numeric offset instead).

In POSIX my guess is (I'm no POSIX expert) that strftime("%Z") is probably the correct way, that at least allows for some degree of localisation, which localtime() certainly does not.

| And I assume what is really wanted is the name at the time of
| interest, so you can label output with the local time and the
| abbreviation.

If it is just for human consumption, and the human will already have a pretty good idea what zone might be being referenced, then the zone abbreviation is probably acceptable - you just cannot rely on its value being useful for anything more than that (they're wildly ambiguous, and in the timezone package, for some zones, totally arbitrary and meaningless - in many (particularly smaller) countries the time is just the time, and has no name by which it is distinguished from the time someplace else.)

| (Forgive me for using Outlook;

Ah, sorry, that one is a sin beyond redemption - eternal damnation is your only prospect...

kre
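To make the strftime("%Z") and numeric-offset suggestions above concrete, here is a small sketch. The helper names use_tz and format_local are hypothetical, and the POSIX-style TZ string in the usage note is just a convenient assumption that avoids depending on installed zone files. It is one way to label a timestamp, not the only one.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>   /* setenv */
#include <string.h>
#include <time.h>

/* Hypothetical helper: select a zone through a POSIX-style TZ string. */
int use_tz(const char *tz)
{
    assert(tz != NULL);
    if (setenv("TZ", tz, 1) != 0)
        return -1;
    tzset();
    return 0;
}

/* Hypothetical helper: format t as local time followed by the
 * abbreviation (%Z) and the numeric UTC offset (%z).
 * Returns the number of bytes written, or 0 on failure. */
size_t format_local(time_t t, char *buf, size_t size)
{
    struct tm tm;

    assert(buf != NULL);
    if (localtime_r(&t, &tm) == NULL)
        return 0;
    return strftime(buf, size, "%Y-%m-%d %H:%M:%S %Z %z", &tm);
}
```

With use_tz("EST5EDT,M3.2.0,M11.1.0"), format_local(1232020800, ...) should produce "2009-01-15 07:00:00 EST -0500". The abbreviation is there for the human reader; the -0500 is the part a program can rely on.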
-----Original Message-----
From: kre@munnari.OZ.AU [mailto:kre@munnari.OZ.AU]
Sent: Monday, January 12, 2009 9:28 PM
To: John Dlugosz
Cc: tz@elsie.nci.nih.gov
Subject: Re: FW: timezone code - re-entrancy issue

Date: Mon, 12 Jan 2009 17:45:39 -0500
From: "John Dlugosz" <JDlugosz@TradeStation.com>
Message-ID: <450196A1AAAE4B42A00A8B27A59278E70925C93A@EXCHANGE.trad.tradestation.com>

| | So what is the way, in this code (or POSIX), to get the name of the
| | timezone?
|
| For what you ask there, there is none, here, or elsewhere. tzname[] (despite its name) is not that, it is the current timezone abbreviation (which in this code is better obtained from tm_tzname if it is really needed for anything - I'd actually suggest avoiding its use if you possibly can, use the numeric offset instead).

Hmm, the code reads

    tzname[tmp->tm_isdst] = &sp->chars[ttisp->tt_abbrind];

where ttisp points to the record actually found that is in governance at the specified time, and whose offset is used in the final calculation.

So I maintain that it does update the global tzname with the actual name in force at that time, each time it is called.

If there is no proper way to obtain that, why is it so deprecated that only ancient programs would use the old way? It's the only choice!

| In POSIX my guess is (I'm no POSIX expert) that strftime("%Z") is probably the correct way, that at least allows for some degree of localisation, which localtime() certainly does not.

But that begs the question of where strftime gets the information.

| | And I assume what is really wanted is the name at the time of
| | interest, so you can label output with the local time and the
| | abbreviation.
|
| If it is just for human consumption, and the human will already have a pretty good idea what zone might be being referenced, then the zone abbreviation is probably acceptable - you just cannot rely on its value being useful for anything more than that

Understood.

| (they're wildly ambiguous, and in the timezone package, for some zones, totally arbitrary and meaningless - in many (particularly smaller) countries the time is just the time, and has no name by which it is distinguished from the time someplace else.)

Gotcha.

| | (Forgive me for using Outlook;

--John
Date: Tue, 13 Jan 2009 14:49:15 -0500
From: "John Dlugosz" <JDlugosz@TradeStation.com>
Message-ID: <450196A1AAAE4B42A00A8B27A59278E70925CC97@EXCHANGE.trad.tradestation.com>

| So I maintain that it does update the global tzname with the actual name
| in force at that time, each time it is called.

It updates it with the tz abbreviation each time it is called, yes. I personally wouldn't call the abbreviation a "name", so I guess whether I really am disagreeing or not depends upon whether your emphasis there is "the actual name" part or "updates ... each time it is called".

The latter is clearly true, and the tzname[] array is most certainly not thread safe - a property it shares with lots of the other very old Unix interfaces.

On the other hand, for most applications, which don't go altering the timezone during execution (single common timezone for all threads) and for most timezones, the string assigned is the same every time (for each value of tm_isdst), so the fact that the assignment is being done tends to be hidden. That isn't something to rely upon, but applications that use tzname[] (for some weird reason) will mostly work OK, even if multi-threaded.

New applications just shouldn't go near tzname (nor the old timezone and whatever the other global var was that some implementations supported).

| If there is no proper way to obtain that, why is it so deprecated that
| only ancient programs would use the old way? It's the only choice!

No, tm_zone (which I mistakenly called tm_tzname last time) is the "proper" way to get the abbreviation associated with a time_t (after a call to localtime() of course). That one is perfectly safe (just call localtime_r() if you need it).

| But that begs the question of where strftime gets the information.

No, strftime(), localtime() (et al.) are all internal library functions, they can communicate with each other using private interfaces if necessary. All that matters to the application (or POSIX) is that it works. You're not supposed to know or care how. For this implementation, the tm_zone field is used, an implementation that doesn't have tm_zone in struct tm would use some other way.

kre
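A short sketch of reading the abbreviation from tm_zone after localtime_r(), as suggested above. It assumes a C library whose struct tm carries the tm_zone and tm_gmtoff members (glibc and the BSDs do, exposed here through _DEFAULT_SOURCE); the helper name abbr_at is hypothetical. The string is copied out immediately rather than keeping the pointer, since the storage behind tm_zone belongs to the library.

```c
#define _DEFAULT_SOURCE   /* assumption: exposes tm_zone and tm_gmtoff on glibc */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>       /* setenv, for callers that select a zone via TZ */
#include <string.h>
#include <time.h>

/* Hypothetical helper: copy the abbreviation in force at time t into buf
 * and, if gmtoff is non-NULL, store the offset from UTC in seconds.
 * Returns 0 on success, -1 if the conversion fails. */
int abbr_at(time_t t, char *buf, size_t size, long *gmtoff)
{
    struct tm tm;

    assert(buf != NULL && size > 0);
    if (localtime_r(&t, &tm) == NULL)
        return -1;
    /* tm_zone points into library-owned storage; copy it right away. */
    snprintf(buf, size, "%s", tm.tm_zone != NULL ? tm.tm_zone : "");
    if (gmtoff != NULL)
        *gmtoff = tm.tm_gmtoff;
    return 0;
}
```

Because the result lives in the caller's struct tm and buffer, two threads calling abbr_at() for different times do not disturb each other's answer, which is the property tzname[] lacks.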
The localtime.c file does not mention tm_zone, so "this implementation" is missing the proper way. Shouldn't there be conditional code to copy to tmp->tm_zone if that field is present (as indicated in private.h)?

I didn't find a strftime implementation in this package, either, so if some "private interface" exists that allows some other package to cooperate with it, that's what I'm asking about.

--John
Given conditional compilation requirements, you'll get to look for TM_ZONE rather than tm_zone in localtime.c

--ado
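The conditional-compilation pattern referred to above can be sketched as follows. This is an illustration of the idea, not the text of localtime.c: the helper store_abbr is hypothetical, and TM_ZONE is hard-coded here on the assumption that the platform's struct tm has a tm_zone member, whereas a real build would leave the macro undefined on systems without one.

```c
#define _DEFAULT_SOURCE   /* assumption: exposes tm_zone on glibc */
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Assumption: the build configuration names the struct tm member that
 * holds the abbreviation. Hard-coded for this sketch only. */
#define TM_ZONE tm_zone

/* Hypothetical helper modelling a conditional store into struct tm. */
void store_abbr(struct tm *tmp, const char *abbr)
{
    assert(tmp != NULL);
#ifdef TM_ZONE
    /* Cast because some systems declare the member as plain char *. */
    tmp->TM_ZONE = (char *)abbr;
#else
    (void)abbr;           /* no per-result field available on this system */
#endif
}
```

Searching the source for the macro name rather than the member name is therefore what finds the assignment, since the member is only ever spelled through the macro.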
Date: Wed, 14 Jan 2009 17:14:32 -0500
From: "John Dlugosz" <JDlugosz@TradeStation.com>
Message-ID: <450196A1AAAE4B42A00A8B27A59278E7093A3DC0@EXCHANGE.trad.tradestation.com>

The first part of your mail was already answered.

| I didn't find a strftime implementation in this package, either, so if
| some "private interface" exists that allows some other package to
| cooperate with it, that's what I'm asking about.

strftime() needs to deal with locale info, which is highly system dependent, so there's no reasonable way to include it with this code. Further, it is really just a binary->string conversion function for a struct tm, it itself knows nothing much about time (am/pm & 12 hour clocks is almost it!), so it wouldn't even gain much (or probably anything) by being included.

But strftime (along with all the rest of the locale info) gets added to the system library, along with the tzcode - if the implementor of strftime() needs to do some private fiddle to localtime() to make strftime() possible (or just easier) then that's what they do.

kre
Date: Wed, 14 Jan 2009 17:14:32 -0500
From: "John Dlugosz" <JDlugosz@TradeStation.com>
Message-ID: <450196A1AAAE4B42A00A8B27A59278E7093A3DC0@EXCHANGE.trad.tradestation.com>

| I didn't find a strftime implementation in this package,

And of course, look in strftime.c ... I should have known it would be there (even though I'm not sure it should be.)

kre
On Thursday, January 15 2009, "Robert Elz" wrote to "John Dlugosz, tz@lecserver.nci.nih.gov" saying:
No, tm_zone (which I mistakenly called tm_tzname last time) is the "proper" way to get the abbreviation associated with a time_t (after a call to localtime() of course). That one is perfectly safe (just call localtime_r() if you need it).
Actually, this isn't true -- tm_zone isn't safe if the value of the "TZ" environment variable is changed. (This is true even in single-threaded code, in fact.) -- Jonathan Lennox lennox@cs.columbia.edu
Date: Wed, 14 Jan 2009 18:49:10 -0500
From: lennox@cs.columbia.edu
Message-ID: <18798.31222.292955.425549@amman.clic.cs.columbia.edu>

| Actually, this isn't true -- tm_zone isn't safe if the value of the "TZ"
| environment variable is changed. (This is true even in single-threaded
| code, in fact.)

Hmm - yes, that's non-trivial to fix as there's no tm_free() function.

Probably we'd need to simply retain all of the tzabbr lists (rather than including them directly in the state struct), for all zones ever loaded (just once, so we don't burn memory when returning to a zone for the second time) and suffer the cost of that - which isn't much, as the string size is small (typically < 20 bytes/zone, usually much less) and the number of zones is bounded, the number actually used by an application likely much smaller, so the total cost is going to be small.

kre
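The retention idea above could look roughly like this. It is a sketch under assumptions, not a proposed patch: it interns individual abbreviation strings rather than whole per-zone lists, the names abbr_node and abbr_intern are hypothetical, and it ignores locking. The point is only that entries are stored once and never freed, so a pointer placed in tm_zone would stay valid after TZ changes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical append-only table of abbreviation strings. Entries are
 * never freed, so pointers handed out stay valid for the process lifetime. */
struct abbr_node {
    struct abbr_node *next;
    char text[];              /* NUL-terminated abbreviation */
};

static struct abbr_node *abbr_list;   /* not thread-safe; a real version needs a lock */

/* Return a stable pointer to a stored copy of abbr, or NULL if out of memory. */
const char *abbr_intern(const char *abbr)
{
    struct abbr_node *n;
    size_t len;

    assert(abbr != NULL);
    for (n = abbr_list; n != NULL; n = n->next)
        if (strcmp(n->text, abbr) == 0)
            return n->text;   /* seen before: reuse, no new memory */

    len = strlen(abbr) + 1;
    n = malloc(sizeof *n + len);
    if (n == NULL)
        return NULL;
    memcpy(n->text, abbr, len);
    n->next = abbr_list;
    abbr_list = n;
    return n->text;
}
```

Returning to a zone whose abbreviations were already interned costs a list walk and no allocation, which matches the "just once" requirement; the linear search is tolerable only because the number of distinct abbreviations is small.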
participants (4)
- John Dlugosz
- lennox@cs.columbia.edu
- Olson, Arthur David (NIH/NCI) [E]
- Robert Elz