Let me fill in a bit of background.
For internationalization identifiers in general, including languages, scripts, countries, and so on, we need some identifier value that means "the value for that identifier is unknown". It is also often used in APIs for "the value supplied was invalid". Some examples are in https://cldr-smoke.unicode.org/spec/main/ldml/tr35.html#Unknown_or_Invalid_I... . (Note that in that table there is 'unk'. CLDR provides short, stable identifiers for IANA TZDB identifiers, based where possible on UN/LOCODEs. All of these are defined by a mapping to the regular long IANA TZDB identifiers (plus Etc/Unknown), as per https://github.com/unicode-org/cldr/blob/main/common/bcp47/timezone.xml.)
Around 2010, we added the "Etc/Unknown" in CLDR for use with the TZDB, to serve the purpose of a TZDB identifier for 'unknown'. The name was chosen so that it would be very unlikely to collide with any identifier that the TZDB would itself define in the future with a different meaning.
Much more recently (2023) an effort was started to support some legacy POSIX identifiers that had been deliberately omitted from CLDR. You can read more about that in https://unicode-org.atlassian.net/browse/CLDR-17111, although the reasoning for the choices is not documented there. As a part of that effort, the best mapping for Factory (based upon its usage) was concluded to be Etc/Unknown. That led to the issue being discussed (see https://issues.chromium.org/issues/381620359).
Justin's proposal for resolving the issue by formally adding Etc/Unknown to the TZDB would be very welcome. For the definition of Etc/Unknown in CLDR, we would then simply refer to the IANA TZDB. We could do this as early as the release of CLDR 47, due mid-March 2025.
On Sat, Dec 7, 2024 at 12:11 PM Brian Inglis via tz <tz@iana.org> wrote:
Many distros build fat-format files (as third party readers break on slim) from rearguard with backward, backzone, and zone.tab for compatibility, and some add posixrules America/New_York. It would not surprise me if others, and commercial vendors, also set posixrules and/or Factory to some localized zone so users always have a valid default and never see -00.
I could see this being a sound and convenient decision for orgs who provide localized installers, especially as most countries with their own languages also have only a single timezone, so Factory, /etc/localtime, and /etc/timezone can default to the same zone.
If the other UTC (Unicode Technical Committee) or ICU would like Etc/Unknown to be instantiated as a timezone that looks like the default Factory, and submits a patch, it should be included.
I do not think any link between Factory and Unknown should ever be assumed.
On 2024-12-06 19:19, Arthur Olson via tz wrote:
May be prudent to keep Etc/Factory and Etc/Unknown separate to guard against the possibility that a vendor does something funky with Factory (such as automatically modifying the distribution so that Factory is their local time). There's a small price to pay given the small sizes of the existing Factory and proposed Unknown binaries.
@dashdashado
On Fri, Dec 6, 2024, 9:11 PM Justin Grant via tz <tz@iana.org> wrote:
Hi TZ friends - Should the time zone identifier "Etc/Unknown" (standardized in Unicode Technical Standard 35 and used by the ICU <https://icu.unicode.org/> library that implements time zone support in all major web browsers, in Java, and in other platforms) be added to the IANA Time Zone Database?
Etc/Unknown would behave the same as the existing time zone identifier "Factory". Ideally, Etc/Unknown could be a Zone and Factory could be turned into a Link pointing to Etc/Unknown, because both of them have the meaning of "the time zone of this computer is not known" but Etc/Unknown is more self-describing and is better aligned with modern Zone naming conventions.
Here's more context: "Etc/Unknown" is standardized in https://unicode.org/reports/tr35/#Time_Zone_Identifiers. Here's the relevant text from the standard:
> There is a special code "unk" for an Unknown or Invalid time zone.
> This can be expressed in the tz database style ID "Etc/Unknown",
> although it is not defined in the tz database.
Following this standard, Etc/Unknown is returned by ICU <https://icu.unicode.org/> when the time zone of a computer cannot be determined. Here's an example of an ICU method that can return Etc/Unknown: icu::TimeZone::detectHostTimeZone() <https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/classicu_1_1TimeZone.html#a5ca5a356ff03ed1f7cd0b1550117f529>.
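For readers who consume identifiers produced by ICU-based platforms, here is a minimal sketch of defensive handling in Python. It assumes only the standard-library zoneinfo module. The helper name resolve_zone and the UNKNOWN_IDS set are hypothetical, chosen for illustration; whether "Factory" should also be in that set is not settled in this thread, so it is left out by default.

```python
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

# Hypothetical set of identifiers meaning "host time zone not known".
# "Etc/Unknown" is the CLDR/ICU identifier; add others only if your
# application deliberately treats them the same way.
UNKNOWN_IDS = frozenset({"Etc/Unknown"})


def resolve_zone(tz_id, unknown_ids=UNKNOWN_IDS):
    """Return a ZoneInfo for tz_id, or None if it is unknown or not loadable."""
    if tz_id in unknown_ids:
        # Do not consult the database: the identifier itself says "unknown",
        # regardless of whether the installed TZDB happens to define it.
        return None
    try:
        return ZoneInfo(tz_id)
    except (ZoneInfoNotFoundError, ValueError):
        # Not present in the installed TZDB, or a malformed key.
        return None
```

A caller would then pick its own fallback (for example UTC, or prompting the user) when resolve_zone returns None, rather than passing the identifier on to code that expects a TZDB zone.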
The fact that Etc/Unknown looks like an IANA identifier but is not actually in TZDB has been a long-running source of problems. Most recently, this week Chrome is rushing out a patch <https://issues.chromium.org/issues/381620359> that reverted a recent change that added support for the "Factory" zone by making it an alias for "Etc/Unknown". Given that ICU previously didn't recognize Factory and so returned Etc/Unknown for computers in the Factory zone, this was assumed by everyone to be a safe change. But this change turned out to break some 3rd-party libraries and had to be reverted.
After discussing this recent bug with engineers at Google, our consensus was that it'd be helpful for the time zone ecosystem if Etc/Unknown stopped being a special case and started being a regular Zone in the IANA Time Zone database. Especially if we could Link-ify Factory at the same time so that Zone picker UIs won't have two distinct "I don't know what time zone this is" choices.
--
Take care. Thanks, Brian Inglis
Calgary, Alberta, Canada
La perfection est atteinte non pas lorsqu'il n'y a plus rien à ajouter mais lorsqu'il n'y a plus rien à retirer
Perfection is achieved not when there is no more to add but when there is no more to cut
-- Antoine de Saint-Exupéry