Tim Parenti wrote:
concatenating the reference output in a single tarball alongside the source dataset would encourage implementations which simply parse the reference output, which may be counter to our goals
I don't see how putting the reference output into a separate tarball would affect the degree of encouragement. Implementations that wanted to parse the reference output could do that regardless of whether it's in a separate tarball.

For what it's worth, the reference output format is extensional, so it can lose some of the input's information. For example, the input could specify a special rule for the year 2051, info that is discarded by the current cutoff of 2050. So as things stand now, in principle the reference output is not a good choice for a downstream implementation. (If we change the reference output format to capture everything, this obstacle would go away of course.)

By "extensional" I mean this: https://en.wikipedia.org/wiki/Extensional_definition
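
To make the information-loss point concrete, here is a minimal toy sketch in Python. It is not the real input or reference output format: every name in it (offset_rule, reference_output, and so on) is hypothetical, and the offsets are made up. It only assumes the 2050 cutoff mentioned above, and shows that two inputs differing only in a 2051 rule yield the same enumerated output.

```python
# Toy model only: these names and values are hypothetical and do not
# come from the actual source dataset or reference output format.

CUTOFF_YEAR = 2050  # the cutoff mentioned in the message above


def offset_rule(year):
    """Intensional definition: a rule giving the UTC offset (hours) for any year."""
    if year == 2051:
        return -4  # made-up special rule for the year 2051
    return -5


def offset_rule_without_special_case(year):
    """The same rule, minus the special case for 2051."""
    return -5


def reference_output(rule, first_year, cutoff_year):
    """Extensional definition: list the rule's values, year by year, up to the cutoff."""
    return {year: rule(year) for year in range(first_year, cutoff_year + 1)}


out_a = reference_output(offset_rule, 2045, CUTOFF_YEAR)
out_b = reference_output(offset_rule_without_special_case, 2045, CUTOFF_YEAR)

# The two inputs disagree about 2051 ...
print(offset_rule(2051) == offset_rule_without_special_case(2051))  # False
# ... but their enumerated outputs are identical, so the 2051 rule
# cannot be recovered from the output alone.
print(out_a == out_b)  # True
```

A downstream implementation that parsed only the enumerated output could not tell these two inputs apart, which is the sense in which the output is lossy.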