zic fails for soft-links-only target volume
I received a complaint that zic recently started to fail when the target directory is on a volume that doesn't support hard links. (The complainant is using some weird VMware Fusion setup, but I think this might be true on Windows generally.) It used to go through just fine, creating symlinks instead, but as of 2016g it fails with something like ./zic: link from /home/postgres/testversion/share/timezone/US/Eastern failed: Operation not permitted It appears to me that the problem is that commit 35484e98c broke the case of linking to an existing symlink. This happened because itsdir() was modified to return 2 not 0 for a symlink, but dolink() throws an EPERM error for that, because it wasn't taught about the change. There's no reason to disallow that case, so I propose this patch: diff --git a/zic.c b/zic.c *** a/zic.c --- b/zic.c *************** dolink(char const *fromfield, char const *** 824,830 **** ** there's a fair chance of root running us. */ fromisdir = itsdir(fromfield); ! if (fromisdir) { char const *e = strerror(fromisdir < 0 ? errno : EPERM); fprintf(stderr, _("%s: link from %s/%s failed: %s\n"), progname, directory, fromfield, e); --- 824,830 ---- ** there's a fair chance of root running us. */ fromisdir = itsdir(fromfield); ! if (fromisdir < 0 || fromisdir == 1) { char const *e = strerror(fromisdir < 0 ? errno : EPERM); fprintf(stderr, _("%s: link from %s/%s failed: %s\n"), progname, directory, fromfield, e); BTW, you can test this behavior without needing to have such a volume at hand, by forcing the link() attempt to fail like this: *************** dolink(char const *fromfield, char const *** 840,845 **** --- 840,846 ---- progname, directory, tofield, e); exit(EXIT_FAILURE); } + staysymlink = true; // for testing purposes only!! link_errno = (staysymlink ? ENOTSUP : link(fromfield, tofield) == 0 ? 0 : errno); if (link_errno == ENOENT && !todirs_made) { regards, tom lane
I wrote:
It appears to me that the problem is that commit 35484e98c broke the case of linking to an existing symlink. This happened because itsdir() was modified to return 2 not 0 for a symlink, but dolink() throws an EPERM error for that, because it wasn't taught about the change. There's no reason to disallow that case, so I propose this patch:
On closer inspection, I notice that there really aren't any callers of itsdir() that are interested in both conditions simultaneously. So a less error-prone solution would be to revert 35484e98c's changes to itsdir(), taking it back to doing only what the name suggests, and invent an itssymlink() function that has a similar API but tests for symlink-ness not directory-ness. Then you'd do this: - staysymlink = itsdir(tofield) == 2; + staysymlink = itssymlink(tofield) > 0; and leave dolink()'s first usage of itsdir() alone. I also note that 35484e98c has possibly broken the other caller of itsdir(): if (mkdir(name, MKDIR_UMASK) != 0) { int err = errno; if (err != EEXIST && itsdir(name) < 0) { error(_("%s: Can't create directory %s: %s"), progname, name, strerror(err)); exit(EXIT_FAILURE); } } as I doubt we want that to consider "it's a symlink" as an OK outcome. I would actually say this should be - if (err != EEXIST && itsdir(name) < 0) { + if (err != EEXIST || itsdir(name) <= 0) { as we really don't want the code to keep going unless there is a directory there. Maybe even drop the test on what the err code is, and only check itsdir()? regards, tom lane
On 11/03/2016 07:55 AM, Tom Lane wrote:
It appears to me that the problem is that commit 35484e98c broke the case of linking to an existing symlink.
That's right. Thanks for the diagnosis.
There's no reason to disallow that case
Unfortunately the 'link' system call has system-dependent behavior when linking to a symlink. Some systems follow the symlink, and others don't. For the systems that don't, hardlinking to a relative symlink can cause trouble, since the symlink is resolved relative to its parent directory, and this interpretation can differ in the source vs the destination directory. I fixed this in the attached patch by using linkat with AT_SYMLINK_FOLLOW, as this is immune to the problem of resolving symlinks and is the intended meaning here anyway. On older systems lacking linkat, the code checks whether the source is a symlink and refuses to hard-link to it; this should be good enough.
So a less error-prone solution would be to revert 35484e98c's changes to itsdir(), taking it back to doing only what the name suggests, and invent an itssymlink() function that has a similar API but tests for symlink-ness not directory-ness.
Thanks for this suggestion too. Done in the attached patch.
I would actually say this should be
- if (err != EEXIST && itsdir(name) < 0) { + if (err != EEXIST || itsdir(name) <= 0) {
as we really don't want the code to keep going unless there is a directory there.
Yes, that sounds right, and is also in the attached patch.
Maybe even drop the test on what the err code is, and only check itsdir()? If errno == EEXIST (a common case), there should be no need to check itsdir, as the callers of mkdir do not consider an existing non-directory to be trouble (they'll diagnose that themselves). So unless I'm missing something, I'm inclined to keep the err code test, as a minor optimization.
Thanks again for the diagnosis and suggestions. Please try the attached patch, which I've installed into the development repository.
Paul Eggert <eggert@cs.ucla.edu> writes:
I fixed this in the attached patch by using linkat with AT_SYMLINK_FOLLOW, as this is immune to the problem of resolving symlinks and is the intended meaning here anyway. On older systems lacking linkat, the code checks whether the source is a symlink and refuses to hard-link to it; this should be good enough.
hmmm ...
Thanks again for the diagnosis and suggestions. Please try the attached patch, which I've installed into the development repository.
Running with this code on a Linux box, with a modification to force staysymlink true, I get a lot of symlinks that seem to be misdirected. For example the toplevel Cuba link looks like lrwxrwxrwx. 1 postgres postgres 17 Nov 3 20:39 Cuba -> ../America/Havana when it should be just America/Havana. regards, tom lane
I wrote:
Running with this code on a Linux box, with a modification to force staysymlink true, I get a lot of symlinks that seem to be misdirected. For example the toplevel Cuba link looks like
lrwxrwxrwx. 1 postgres postgres 17 Nov 3 20:39 Cuba -> ../America/Havana
when it should be just America/Havana.
It looks to me like this is already broken in 2016h. I think it's a thinko in relname(): should not the number of dotdots added depend on the number of additional directory levels in the *target* name, not the *source* name? This seems to fix it for me: *** a/zic.c --- b/zic.c *************** relname(char const *from, char const *to *** 804,812 **** for (i = 0; f[i] && f[i] == to[i]; i++) if (f[i] == '/') dir_len = i + 1; ! for (; f[i]; i++) ! dotdots += f[i] == '/' && f[i - 1] != '/'; ! taillen = i - dir_len; dotdotetcsize = 3 * dotdots + taillen + 1; if (dotdotetcsize <= linksize) { if (!result) --- 804,812 ---- for (i = 0; f[i] && f[i] == to[i]; i++) if (f[i] == '/') dir_len = i + 1; ! for (; to[i]; i++) ! dotdots += to[i] == '/' && to[i - 1] != '/'; ! taillen = strlen(f + dir_len); dotdotetcsize = 3 * dotdots + taillen + 1; if (dotdotetcsize <= linksize) { if (!result) With that, all the symlinks seem to point somewhere valid. regards, tom lane
Paul Eggert <eggert@cs.ucla.edu> writes:
Tom Lane wrote:
With that, all the symlinks seem to point somewhere valid.
Thanks, I wrote the same patch while thinking about your earlier email, and installed the attached.
Thanks, I can confirm that the original portability problem is fixed: zic seems to install a valid set of symlinks when its hard-link attempts fail, according to both my testing and that of my original complainant. However, I'm still dubious about the error handling in mkdirs(), specifically I think you should do this: - if (err != EEXIST && !itsdir(name)) { + if (err != EEXIST || !itsdir(name)) { I see that you modified the function's header comment to acknowledge that it doesn't consider a collision with a non-directory to be an error, but that doesn't make any sense to me. All the call sites would prefer the invariant to be "on return, there is a directory there"; they are all going to fail if there isn't. regards, tom lane
On 11/04/2016 07:17 AM, Tom Lane wrote:
- if (err != EEXIST && !itsdir(name)) { + if (err != EEXIST || !itsdir(name)) {
I see that you modified the function's header comment to acknowledge that it doesn't consider a collision with a non-directory to be an error, but that doesn't make any sense to me.
For example, that change could cause zic to cry wolf when it attempts to make an already-existing directory whose parent is unwritable. In this case mkdir can fail with errno == EACCES (because when there are multiple reasons to fail POSIX does not specify which errno value to use). This would cause "err != EXIST" to yield 1, which would generate a bogus diagnostic. Although it would be correct to replace the code with just 'if (!itsdir(name)) {', this would slow zic down in the common case where mkdir fails with errno == EEXIST due to an existing directory.
All the call sites would prefer the invariant to be "on return, there is a directory there"; they are all going to fail if there isn't. That should be OK, since when they fail they should diagnose their failures, with error messages that are good enough. Plus, any such failures can happen only when some other process is simultaneously doing something weird like replacing directories with non-directories. I don't think we need to worry about exactly which diagnostic is generated in such situations, so long as the problem is diagnosed somehow.
I installed the attached patch to help comment this a bit better.
participants (2)
-
Paul Eggert -
Tom Lane