On Wednesday, 8 March 2006 7:59 pm, Daniel Mueller wrote:
Hi Alan,
Your gdb-backtrace says it dies at handle.c:118

    i = (*(t->type->closefunc))(t->fd);

    #0  0x080d58d0 in _IO_un_link ()
    #1  0x080cf24f in fclose ()
    #2  0x0805e0b8 in destroy ()
    #3  0x0805c1fa in tar_close (t=0x818cc30) at handle.c:118

"closefunc" is a pointer to zlib's "gzclose" function (defined in libtar/libtar.c:97). Have you (re-)compiled your zlib with some strange optimization options?
bye, danm
I didn't use any strange flags. I have always used "-O2 -march=i686 -pipe" as my CFLAGS and CXXFLAGS, and it has always worked fine (this is a Pentium II).

Anyway, I rebuilt libz.a with "-O2 -DDEBUG -ggdb", then built pkgutils again (with "-O2 -ggdb"), and now I get this backtrace:

    Program received signal SIGSEGV, Segmentation fault.
    0x080d5ee0 in _IO_un_link ()
    (gdb) bt
    #0  0x080d5ee0 in _IO_un_link ()
    #1  0x080cf85f in fclose ()
    #2  0x0805e0b8 in destroy (s=0x818dc30) at gzio.c:375
    #3  0x0805c1fa in tar_close (t=0x82508c0) at handle.c:118
    #4  0x0804dd0e in pkgutil::pkg_install (this=0x816db00, filename=@0xbff4c8e0, keep_list=@0xbff4c800) at pkgutil.cc:425
    #5  0x080568d6 in pkgadd::run (this=0x816db00, argc=-1074476992, argv=0xbff4cc54) at pkgadd.cc:104
    #6  0x08048687 in main (argc=3, argv=0xbff4cc54) at memory:285

This is gzio.c:

    368:    if (s->stream.state != NULL) {
    369:        if (s->mode == 'w') {
    xxx:#ifdef NO_GZCOMPRESS
    370:            err = Z_STREAM_ERROR;
    xxx:#else
    370:            err = deflateEnd(&(s->stream));
    xxx:#endif
    371:        } else if (s->mode == 'r') {
    372:            err = inflateEnd(&(s->stream));
    373:        }
    374:    }
    375:    if (s->file != NULL && fclose(s->file)) {
    xxx:#ifdef ESPIPE
    xxx:        if (errno != ESPIPE) /* fclose is broken for pipes in HP/UX */
    xxx:#endif
    xxx:            err = Z_ERRNO;
    xxx:    }

I guess the segfault is at fclose(s->file), but why is this happening?

After this test, I installed the stock zlib from CRUX 2.1 and rebuilt pkgutils-5.20, with the same result. Then I tried installing the stock pkgutils from CRUX 2.1, and still got the same result. This makes no sense.

The next thing I tried was running pkgadd with strace.
I got this output:

    munmap(0xb7b95000, 131072) = 0
    rt_sigaction(SIGPIPE, {SIG_IGN}, {SIG_DFL}, 8) = 0
    stat64("/etc/openldap/ldap.conf", {st_mode=S_IFREG|0644, st_size=1043, ...}) = 0
    geteuid32() = 0
    stat64("/etc/openldap/ldap.conf", {st_mode=S_IFREG|0644, st_size=1043, ...}) = 0
    geteuid32() = 0
    time(NULL) = 1141878180
    write(5, "0\201\234\2\1\30c\201\226\4\27ou=group,dc=bwv2,dc=c"..., 159) = 159
    select(1024, [5], [], NULL, NULL) = 1 (in [5])
    read(5, "0H\2\1\30dC\4", 8) = 8
    read(5, "\37cn=root,ou=Group,dc=bwv2,dc=com"..., 66) = 66
    select(1024, [5], [], NULL, NULL) = 1 (in [5])
    read(5, "0\f\2\1\30e\7\n", 8) = 8
    read(5, "\1\0\4\0\4\0", 6) = 6
    time(NULL) = 1141878180
    time([1141878180]) = 1141878180
    rt_sigaction(SIGPIPE, {SIG_DFL}, NULL, 8) = 0
    geteuid32() = 0
    lchown32("/usr/include/gmpxx.h", 0, 0) = 0
    utime("/usr/include/gmpxx.h", [2006/03/08-16:55:24, 2006/03/08-16:55:24]) = 0
    chmod("/usr/include/gmpxx.h", 0100644) = 0
    --- SIGSEGV (Segmentation fault) @ 0 (0) ---
    +++ killed by SIGSEGV +++

As you can see, I am using nss-ldap. So the next thing I did was to comment out ldap from my nsswitch.conf, and it worked! pkgadd doesn't segfault anymore!

This is very strange, since I run many programs that call the nss functions on ldap, and they always work. By many I mean exim, sshd, samba, mysql, apache, etc. I even tried different nss-ldap versions, with the same results.

If this wasn't strange enough, read this: if I put ldap back in my nsswitch.conf and run nscd, everything works fine!

So my guess is that this is a glibc bug. What do you think? I'll take a deeper look when I have more time, but for now I am satisfied (and puzzled) with the results.

Thanks for reading.

Regards,
--
Alan Mizrahi
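
P.S. A minimal sketch for trying this outside pkgadd, assuming the relevant sequence is "open a stdio stream, do a group lookup through NSS, then fclose()", which is roughly what the strace and the backtrace suggest. The function name close_after_group_lookup is hypothetical and is not taken from pkgutils, libtar or zlib; it only uses the standard tmpfile(), getgrnam() and fclose() calls.

```c
/* Hypothetical reproduction sketch, not code from pkgutils, libtar or zlib.
 * It performs an NSS group lookup between opening and closing a stdio
 * stream, which is the order of events the traces above seem to show. */
#include <grp.h>
#include <stdio.h>
#include <sys/types.h>

/* Returns 0 if the stream closes cleanly, -1 on any stdio error. */
int close_after_group_lookup(const char *group_name)
{
    /* Any stdio stream will do; a temporary file avoids needing a package. */
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;

    /* getgrnam() is resolved through the sources listed for "group" in
     * nsswitch.conf, so with ldap enabled this lookup goes to nss-ldap. */
    struct group *gr = getgrnam(group_name);
    if (gr != NULL)
        printf("%s has gid %ld\n", gr->gr_name, (long)gr->gr_gid);
    else
        printf("no group entry for %s\n", group_name);

    /* The reported crash is inside fclose(), so the stream is closed last. */
    return fclose(fp) == 0 ? 0 : -1;
}
```

Calling close_after_group_lookup("root") from a small main() with ldap enabled and disabled in nsswitch.conf, and with nscd running and stopped, might show whether the crash depends on pkgadd at all. Building it both normally and with -static could also be worth comparing, on the unconfirmed assumption that pkgadd is linked statically.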