pkgadd segfaults

Alan Mizrahi alan-crux at mizrahi.com.ve
Thu Mar 9 04:30:15 UTC 2006


On Wednesday, 8 March 2006 7:59 pm, Daniel Mueller wrote:
> Hi Alan,
>
> Your gdb-backtrace says it dies at handle.c:118
>
> i = (*(t->type->closefunc))(t->fd);
>
> > #0  0x080d58d0 in _IO_un_link ()
> > #1  0x080cf24f in fclose ()
> > #2  0x0805e0b8 in destroy ()
> > #3  0x0805c1fa in tar_close (t=0x818cc30) at handle.c:118
>
> "closefunc" is a pointer to zlib's "gzclose" function (defined in
> libtar/libtar.c:97). Have you (re-)compiled your zlib with some strange
> optimization options?
>
> bye, danm

I didn't use any strange flags. I have always used "-O2 -march=i686 -pipe" as
my CFLAGS and CXXFLAGS, and it has always worked fine (this is a Pentium II).

Anyway, I rebuilt libz.a with: -O2 -DDEBUG -ggdb, and then proceeded to build 
pkgutils again (with -O2 -ggdb), and now I get this backtrace:


Program received signal SIGSEGV, Segmentation fault.
0x080d5ee0 in _IO_un_link ()
(gdb) bt
#0  0x080d5ee0 in _IO_un_link ()
#1  0x080cf85f in fclose ()
#2  0x0805e0b8 in destroy (s=0x818dc30) at gzio.c:375
#3  0x0805c1fa in tar_close (t=0x82508c0) at handle.c:118
#4  0x0804dd0e in pkgutil::pkg_install (this=0x816db00, filename=@0xbff4c8e0,
    keep_list=@0xbff4c800) at pkgutil.cc:425
#5  0x080568d6 in pkgadd::run (this=0x816db00, argc=-1074476992,
    argv=0xbff4cc54) at pkgadd.cc:104
#6  0x08048687 in main (argc=3, argv=0xbff4cc54) at memory:285


This is gzio.c:

368:    if (s->stream.state != NULL) {
369:        if (s->mode == 'w') {
xxx:#ifdef NO_GZCOMPRESS
370:            err = Z_STREAM_ERROR;
xxx:#else
370:            err = deflateEnd(&(s->stream));
xxx:#endif
371:        } else if (s->mode == 'r') {
372:            err = inflateEnd(&(s->stream));
373:        }
374:    }
375:    if (s->file != NULL && fclose(s->file)) {
xxx:#ifdef ESPIPE
xxx:        if (errno != ESPIPE) /* fclose is broken for pipes in HP/UX */
xxx:#endif
xxx:            err = Z_ERRNO;
xxx:    }


I guess the segfault is at fclose(s->file), but why is this happening?

After this test, I installed the stock zlib from CRUX 2.1 and rebuilt
pkgutils-5.20, with the same result.

Then I tried installing the stock pkgutils from CRUX 2.1 and still got the same
result, which makes no sense.

The next thing I tried was running pkgadd with strace.  I got this output:

munmap(0xb7b95000, 131072)              = 0
rt_sigaction(SIGPIPE, {SIG_IGN}, {SIG_DFL}, 8) = 0
stat64("/etc/openldap/ldap.conf", {st_mode=S_IFREG|0644, st_size=1043, ...}) = 0
geteuid32()                             = 0
stat64("/etc/openldap/ldap.conf", {st_mode=S_IFREG|0644, st_size=1043, ...}) = 0
geteuid32()                             = 0
time(NULL)                              = 1141878180
write(5, "0\201\234\2\1\30c\201\226\4\27ou=group,dc=bwv2,dc=c"..., 159) = 159
select(1024, [5], [], NULL, NULL)       = 1 (in [5])
read(5, "0H\2\1\30dC\4", 8)             = 8
read(5, "\37cn=root,ou=Group,dc=bwv2,dc=com"..., 66) = 66
select(1024, [5], [], NULL, NULL)       = 1 (in [5])
read(5, "0\f\2\1\30e\7\n", 8)           = 8
read(5, "\1\0\4\0\4\0", 6)              = 6
time(NULL)                              = 1141878180
time([1141878180])                      = 1141878180
rt_sigaction(SIGPIPE, {SIG_DFL}, NULL, 8) = 0
geteuid32()                             = 0
lchown32("/usr/include/gmpxx.h", 0, 0)  = 0
utime("/usr/include/gmpxx.h", [2006/03/08-16:55:24, 2006/03/08-16:55:24]) = 0
chmod("/usr/include/gmpxx.h", 0100644)  = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++

As you can see, I am using nss-ldap.  So the next thing I did was to comment 
out ldap from my nsswitch.conf, and it worked! pkgadd doesn't segfault 
anymore!
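
For anyone who wants to check whether their own setup matches mine, here is a minimal sketch of a helper that tests whether one nsswitch.conf line lists a given source (such as ldap) for a given database (such as group). The function name nss_line_uses_source is made up for this illustration and is not part of glibc or pkgutils. The parsing is deliberately simplified: it assumes the usual "database: source source ..." layout and treats bracketed actions like any other token.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a glibc or pkgutils API).
 * Returns 1 if `line` is an nsswitch.conf entry for database `db`
 * that lists `source` among its sources, and 0 otherwise.
 * Anything after a '#' on the line is ignored. */
int nss_line_uses_source(const char *line, const char *db, const char *source)
{
    assert(line != NULL && db != NULL && source != NULL);

    size_t dblen = strlen(db);
    size_t srclen = strlen(source);

    /* Skip leading whitespace, then require "<db>" followed by ':'. */
    while (isspace((unsigned char)*line))
        line++;
    if (strncmp(line, db, dblen) != 0)
        return 0;
    line += dblen;
    while (*line == ' ' || *line == '\t')
        line++;
    if (*line != ':')
        return 0;
    line++;

    /* Walk the whitespace-separated tokens up to end of line or a comment. */
    while (*line != '\0' && *line != '#') {
        while (isspace((unsigned char)*line))
            line++;
        const char *start = line;
        while (*line != '\0' && *line != '#' && !isspace((unsigned char)*line))
            line++;
        size_t len = (size_t)(line - start);
        if (len == 0)
            break; /* nothing left but whitespace or a comment */
        if (len == srclen && strncmp(start, source, srclen) == 0)
            return 1;
    }
    return 0;
}
```

Feeding it each line read from /etc/nsswitch.conf with db set to "passwd" or "group" and source set to "ldap" would show whether ldap is active for those lookups.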

This is very strange, since I run many programs that call the nss functions on 
ldap, and they always work.  And by many I mean: exim, sshd, samba, mysql, 
apache, etc.  I even tried with different nss-ldap versions, with the same 
results.
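
A small standalone test might help narrow this down. The sketch below assumes that the only ingredients needed are the ones visible in the traces above: an open stdio stream, an NSS lookup that goes through the sources in nsswitch.conf, and then fclose(). That is an assumption on my part, not something I have confirmed. The function name nss_between_fopen_fclose is made up for this illustration.

```c
#include <assert.h>
#include <grp.h>
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical test helper (not part of pkgutils).
 * Opens `path`, performs passwd and group lookups through NSS while the
 * stream is open, then closes the stream.
 * Returns 0 on success, -1 if fopen() or fclose() fails. */
int nss_between_fopen_fclose(const char *path)
{
    assert(path != NULL);

    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    /* These calls consult the sources configured in /etc/nsswitch.conf.
     * The results are not needed; only the lookups themselves matter. */
    struct passwd *pw = getpwuid((uid_t)0);
    struct group *gr = getgrgid((gid_t)0);
    (void)pw;
    (void)gr;

    /* In the backtrace above, the crash happened inside fclose(). */
    return fclose(fp) == 0 ? 0 : -1;
}
```

Calling this on any readable file, from one binary built dynamically and one built with -static, could show whether the way the program is linked makes a difference. The fclose() address in the backtrace sits inside the program's own address range, which makes me suspect pkgadd is statically linked, but that is only an inference.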

As if this weren't strange enough, if I put ldap back in my nsswitch.conf and
run nscd, everything works fine!

So my guess is that this is a glibc bug. What do you think?

I'll take a deeper look when I have more time, but for now I am satisfied (and 
puzzled) with the results.

Thanks for reading.

Regards,


--
Alan Mizrahi




