PDA

View Full Version : amdump segfault in 2.5.1b



bethany
August 8th, 2006, 07:48 AM
This week I installed Amanda 2.5.1b from RPM on a RedHat EL4 x86_64 machine. When I run amdump, I get the following:


~]$ sudo su amandabackup -c "amdump ISDaily2.5"
*** glibc detected *** double free or corruption (fasttop): 0x0805d658 ***

This shows up in dmesg:

driver[11483]: segfault at 0000000000000011 rip 0000000000707f5b rsp 00000000ffff1bec error 4

excerpt from /var/lib/amanda/ISDaily/amdump.1:

taper: DONE [idle wait: 2.637 secs]
/usr/sbin/amdump: line 109: 11847 Done $libexecdir/planner$SUF $conf "$@"
11848 Aborted | $libexecdir/driver$SUF $conf "$@"
taper: writing end marker. [ISDaily04 OK kb 0 fm 0]

The only other record I see of the glibc double free error on this forum is a user who didn't have the right version of Tar. I'm running tar-1.14-9.RHEL4. Could this be the issue?

What other info can I provide to help debug? Output of amadmin version?

thanks!
B.

paddy
August 8th, 2006, 07:54 AM
Which version of 2.5.1? Is it 2.5.1b1 or 2.5.1b2? 2.5.1b2 addresses significant
memory allocation issues.

Please use tar 1.15.1 version. Some tar 1.14 versions have known to problems
that will cause restoration issues.

Are you using rpms from http://www.zmanda.com/downloads.html or
are you building on your own? If you have compiled on your own,
please provide output of "amadmin xx version"

Thanks,
Paddy

bethany
August 8th, 2006, 08:00 AM
Hi Paddy - it's 2.5.1b2 from the zmanda downloads page:

output of 'rpm -qa | grep amanda':
amanda-backup_client-2.5.1b2-1.rhel4
amanda-backup_server-2.5.1b2-1.rhel4

I'll update Tar to 1.15.1 and see how it goes.

bethany
August 8th, 2006, 08:11 AM
Hm... same result with 1.15.1:


# su amandabackup -c "amdump ISDaily2.5"
*** glibc detected *** double free or corruption (fasttop): 0x0805d658 ***
# tar --version
tar (GNU tar) 1.15.1

paddy
August 8th, 2006, 09:03 AM
Can you please run amdump with -x option?

Change the first line in amdump script to
#!/bin/sh -x

Depending on which command in the script causes this problem, you will
have to run the command under "valgrind". Please note that you will
have to run as a root.

Thanks for your help in debugging the problem. Unfortunately, we are not
seeing this problem.

Paddy

bethany
August 9th, 2006, 04:56 AM
# su amandabackup -c "amdump ISDaily2.5"

+ prefix=/usr
+ exec_prefix=/usr
+ sbindir=/usr/sbin
+ libexecdir=/usr/lib/amanda
+ confdir=/etc/amanda
+ PATH=/usr/sbin:/usr/lib/amanda:/usr/bin:/bin:/usr/sbin:/sbin:/usr/ucb
+ export PATH
+ USE_VERSION_SUFFIXES=no
+ test no = yes
+ SUF=
+ '[' 1 -lt 1 ']'
+ conf=ISDaily2.5
+ '[' '!' -d /etc/amanda/ISDaily2.5 ']'
+ shift
+ cd /etc/amanda/ISDaily2.5
++ amgetconf ISDaily2.5 logdir
+ logdir=/var/lib/amanda/ISDaily
+ '[' 0 -ne 0 ']'
+ errfile=/var/lib/amanda/ISDaily/amdump
++ amgetconf ISDaily2.5 tapecycle
+ tapecycle=15
+ '[' 0 -ne 0 ']'
++ amgetconf ISDaily2.5 dumpuser
+ dumpuser=amandabackup
+ '[' 0 -ne 0 ']'
+ runuser=amandabackup
+ '[' 0 -ne 0 ']'
+ test -f hold
+ test -f /var/lib/amanda/ISDaily/amdump
+ test -f /var/lib/amanda/ISDaily/log
+ umask 077
+ touch /var/lib/amanda/ISDaily/amdump
+ exec
*** glibc detected *** double free or corruption (fasttop): 0x0805d658 ***

This seems to imply that either planner or dumper are at fault since the last line in amdump to run is exec:

touch $errfile
exec </dev/null 2>>$errfile 1>&2
echo "amdump: start at `date`"
echo "amdump: datestamp `date +%Y%m%d`"
$libexecdir/planner$SUF $conf "$@" | $libexecdir/driver$SUF $conf "$@"
echo "amdump: end at `date`"

I removed the suid bit for dumper and ran Valgrind but am not sure what I'm looking for:


# valgrind --tool=memcheck /usr/lib/amanda/dumper ISDaily2.5
==15828== Memcheck, a memory error detector for x86-linux.
==15828== Copyright (C) 2002-2004, and GNU GPL'd, by Julian Seward et al.
==15828== Using valgrind-2.2.0, a program supervision framework for x86-linux.
==15828== Copyright (C) 2000-2004, and GNU GPL'd, by Julian Seward et al.
==15828== For more details, rerun with: -v
==15828==
==15828== Warning: invalid file descriptor 820 in syscall close()
==15828== Warning: invalid file descriptor 821 in syscall close()
==15828== Warning: invalid file descriptor 822 in syscall close()
==15828== Use --log-fd=<number> to select an alternative log fd.
==15828== Warning: invalid file descriptor 823 in syscall close()
==15828== Warning: invalid file descriptor 824 in syscall close()
==15828== Warning: invalid file descriptor 825 in syscall close()
==15828== Warning: invalid file descriptor 826 in syscall close()
==15828== Warning: invalid file descriptor 827 in syscall close()
==15828== Warning: invalid file descriptor 828 in syscall close()
< snip >
15828== Warning: invalid file descriptor 1020 in syscall close()
==15828== Warning: invalid file descriptor 1021 in syscall close()
==15828== Warning: invalid file descriptor 1022 in syscall close()
==15828== Warning: invalid file descriptor 1023 in syscall close()
dumper: pid 15828 executable /usr/lib/amanda/dumper version 2.5.1b2
dumper> (null)
==15828== Invalid read of size 1
==15828== at 0x1B9651E0: quote (token.c:178)
==15828== by 0x1B9653A6: squote (token.c:161)
==15828== by 0x804C79A: main (dumper.c:514)
==15828== Address 0x1B9ADAD0 is 0 bytes inside a block of size 128 free'd
==15828== at 0x1B904EA5: free (vg_replace_malloc.c:153)
==15828== by 0x1B92D56B: getcmd (server_util.c:74)
==15828== by 0x804C834: main (dumper.c:335)
==15828==
==15828== Invalid read of size 1
==15828== at 0x1B96524F: quote (token.c:205)
==15828== by 0x1B9653A6: squote (token.c:161)
==15828== by 0x804C79A: main (dumper.c:514)
==15828== Address 0x1B9ADAD0 is 0 bytes inside a block of size 128 free'd
==15828== at 0x1B904EA5: free (vg_replace_malloc.c:153)
==15828== by 0x1B92D56B: getcmd (server_util.c:74)
==15828== by 0x804C834: main (dumper.c:335)
BAD-COMMAND
dumper> (null)==15828==
==15828== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 24 from 2)
==15828== malloc/free: in use at exit: 29448 bytes in 496 blocks.
==15828== malloc/free: 656 allocs, 160 frees, 42370 bytes allocated.
==15828== For a detailed leak analysis, rerun with: --leak-check=yes
==15828== For counts of detected errors, rerun with: -v

bethany
August 11th, 2006, 05:47 AM
I think I'm going to rebuild this machine in 32-bit mode and see if it makes a difference...

bethany
August 11th, 2006, 01:27 PM
Interesting - I rebuilt the machine with the 32bit version of RHEL4 and it is still dumping:

$ sudo su amandabackup -c "amdump ISDaily2.5"
*** glibc detected *** double free or corruption (fasttop): 0x09f6a658 ***
$ uname -a
Linux 2.6.9-34.0.2.ELsmp #1 SMP Fri Jun 30 10:33:58 EDT 2006 i686 athlon i386 GNU/Linux

ktill
August 11th, 2006, 02:28 PM
Hi,
I just downloaded amanda-2.5.1b2 and compiled. I don't see the problem on Red Hat:
Linux boston.zmanda.com 2.6.9-11.ELsmp #1 SMP Fri May 20 18:26:27 EDT 2005 i686 i686 i386 GNU/Linux
[amandabackup@boston DailySet1]$ amcheck DailySet1
Amanda Tape Server Host Check
-----------------------------
Holding disk /backups/holding: 5707536 KB disk space available, that's plenty
slot 8: read label `DailySet18', date `20060628101832'
NOTE: skipping tape-writable test
Tape DailySet18 label ok
Server check took 0.112 seconds

Amanda Backup Client Hosts Check
--------------------------------
Client check: 1 host checked in 0.070 seconds, 0 problems found

(brought to you by Amanda 2.5.1b2)
[amandabackup@boston DailySet1]$ amdump DailySet1
[amandabackup@boston DailySet1]$




Can you please do "amcheck ISDaily2.5" to see if there is any error?
Next, please list the following files:
1) /tmp/amanda/amandad/amandad.*.debug.
2) /etc/amanda/ISDaily2.5/amdump.1
3) /etc/amanda/ISDaily2.5/log.xxx <-- the latest one.

Thanks!

Kevin Till
Zmanda

bethany
August 12th, 2006, 07:13 AM
OK, I've applied the latest round of Redhat errata that hit our satellite server last night and and booted into the new kernel.

amcheck:
$ sudo su amandabackup -c "amcheck ISDaily2.5"

Amanda Tape Server Host Check
-----------------------------
Holding disk /amandahold: 395262320 kB disk space available, using 393165168 kB
slot 8: read label `ISDaily02', date `20060812004501'
cannot overwrite active tape ISDaily02
slot 9: Found a non-amanda tape, will label it `ISDaily03'.
NOTE: skipping tape-writable test
Found a brand new tape, will label it ISDaily03.
NOTE: conf info dir /var/lib/amanda/ISDaily/curinfo does not exist
NOTE: it will be created on the next run.
NOTE: index dir /var/lib/amanda/ISDaily/index does not exist
NOTE: it will be created on the next run.
Server check took 0.423 seconds

Amanda Backup Client Hosts Check
--------------------------------
WARNING: Usage of fully qualified hostname recommended for Client localhost.
WARNING: Usage of fully qualified hostname recommended for Client localhost.
WARNING: Usage of fully qualified hostname recommended for Client localhost.
Client check: 2 hosts checked in 0.182 seconds, 0 problems found

(brought to you by Amanda 2.5.1b2)

[bj56@middenheap-dev ~]$ sudo su amandabackup -c "amdump ISDaily2.5"
*** glibc detected *** double free or corruption (fasttop): 0x09dba658 ***

1) /tmp/amanda/amandad/amandad.*.debug.

$ sudo cat /tmp/amanda/amandad/amandad.20060812104701.debug

amandad: debug 1 pid 4104 ruid 502 euid 502: start at Sat Aug 12 10:47:01 2006
security_getdriver(name=BSD) returns 0x6fb0e0
amandad: version 2.5.1b2
amandad: build: VERSION="Amanda-2.5.1b2"
amandad: BUILT_DATE="Fri Jul 28 18:18:32 PDT 2006"
amandad: BUILT_MACH="Linux rocky.zmanda.com 2.6.9-22.0.2.ELsmp #1 SMP Thu Jan 5 17:13:01 EST 2006 i686 i686 i386 GNU/Linux"
amandad: CC="gcc"
amandad: CONFIGURE_COMMAND="'./configure' '--build=i386-redhat-linux' '--prefix=/usr' '--bindir=/usr/bin' '--sbindir=/usr/sbin' '--libexecdir=/usr/lib/amanda' '--datadir=/usr/share' '--sysconfdir=/etc' '--sharedstatedir=/var/lib/amanda' '--localstatedir=/var/lib/amanda' '--libdir=/usr/lib' '--includedir=/usr/include' '--infodir=/usr/info' '--mandir=/usr/share/man' '--with-gnutar=/bin/tar' '--with-gnutar-listdir=/var/lib/amanda/gnutar-lists' '--with-dumperdir=/usr/lib/amanda' '--with-index-server=localhost' '--with-tape-server=localhost' '--with-user=amandabackup' '--with-group=disk' '--with-owner=paddy' '--with-fqdn' '--with-bsd-security' '--with-bsdtcp-security' '--with-bsdudp-security' '--with-ssh-security' '--with-assertions'"
amandad: paths: bindir="/usr/bin" sbindir="/usr/sbin"
amandad: libexecdir="/usr/lib/amanda" mandir="/usr/share/man"
amandad: AMANDA_TMPDIR="/tmp/amanda" AMANDA_DBGDIR="/tmp/amanda"
amandad: CONFIG_DIR="/etc/amanda" DEV_PREFIX="/dev/"
amandad: RDEV_PREFIX="/dev/" DUMP="/sbin/dump"
amandad: RESTORE="/sbin/restore" VDUMP=UNDEF VRESTORE=UNDEF
amandad: XFSDUMP=UNDEF XFSRESTORE=UNDEF VXDUMP=UNDEF VXRESTORE=UNDEF
amandad: SAMBA_CLIENT="/usr/bin/smbclient" GNUTAR="/bin/tar"
amandad: COMPRESS_PATH="/bin/gzip" UNCOMPRESS_PATH="/bin/gzip"
amandad: LPRCMD="/usr/bin/lpr" MAILER="/usr/bin/Mail"
amandad: listed_incr_dir="/var/lib/amanda/gnutar-lists"
amandad: defs: DEFAULT_SERVER="localhost" DEFAULT_CONFIG="DailySet1"
amandad: DEFAULT_TAPE_SERVER="localhost" HAVE_MMAP HAVE_SYSVSHM
amandad: LOCKING=POSIX_FCNTL SETPGRP_VOID ASSERTIONS DEBUG_CODE
amandad: AMANDA_DEBUG_DAYS=4 BSD_SECURITY RSH_SECURITY USE_AMANDAHOSTS
amandad: CLIENT_LOGIN="amandabackup" FORCE_USERID HAVE_GZIP
amandad: COMPRESS_SUFFIX=".gz" COMPRESS_FAST_OPT="--fast"
amandad: COMPRESS_BEST_OPT="--best" UNCOMPRESS_OPT="-dc"
amandad: time 0.000: dgram_recv(dgram=(nil), timeout=7389296, fromaddr=0x1)
amandad: (sockaddr_in *)0x70c070 = { 0, 0, 0.0.0.0 }
security_handleinit(handle=0x99cbe88, driver=0x6fb0e0 (BSD))
amandad: time 0.011: accept recv REQ pkt:
<<<<<
SERVICE noop
OPTIONS features=fffffeff9ffeffffff7f;
>>>>>
amandad: time 0.012: creating new service: noop
OPTIONS features=fffffeff9ffeffffff7f;

amandad: time 0.013: sending ACK pkt:
<<<<<
>>>>>
amandad: dgram_send_addr(addr=0xbfe629a0, dgram=0x6fc084)
amandad: (sockaddr_in *)0xbfe629a0 = { 2, -5117, 127.0.0.1 }
amandad: dgram_send_addr: 0x6fc084->socket = 0
amandad: time 0.031: sending REP pkt:
<<<<<
OPTIONS features=fffffeff9ffeffffff7f;
>>>>>
amandad: dgram_send_addr(addr=0xbfe62ae0, dgram=0x6fc084)
amandad: (sockaddr_in *)0xbfe62ae0 = { 2, -5117, 127.0.0.1 }
amandad: dgram_send_addr: 0x6fc084->socket = 0
amandad: time 0.031: dgram_recv(dgram=(nil), timeout=7389296, fromaddr=0xb26401)
amandad: (sockaddr_in *)0x70c070 = { 2, -5117, 127.0.0.1 }
amandad: time 0.031: received ACK pkt:
<<<<<
>>>>>
security_close(handle=0x99cbe88, driver=0x6fb0e0 (BSD))
amandad: time 0.032: dgram_recv(dgram=(nil), timeout=7389296, fromaddr=0xb2433a)
amandad: (sockaddr_in *)0x70c070 = { 2, -5117, 127.0.0.1 }
security_handleinit(handle=0x99cbe88, driver=0x6fb0e0 (BSD))
amandad: time 0.032: accept recv REQ pkt:
<<<<<
SERVICE selfcheck
OPTIONS features=fffffeff9ffeffffff7f;maxdumps=1;hostname= localhost;config=ISDaily2.5;
GNUTAR /var 0 OPTIONS |;auth=BSD;compress-fast;index;
GNUTAR / 0 OPTIONS |;auth=BSD;compress-fast;index;
>>>>>
amandad: time 0.032: creating new service: selfcheck
OPTIONS features=fffffeff9ffeffffff7f;maxdumps=1;hostname= localhost;config=ISDaily2.5;
GNUTAR /var 0 OPTIONS |;auth=BSD;compress-fast;index;
GNUTAR / 0 OPTIONS |;auth=BSD;compress-fast;index;

amandad: time 0.033: sending ACK pkt:
<<<<<
>>>>>
amandad: dgram_send_addr(addr=0xbfe629a0, dgram=0x6fc084)
amandad: (sockaddr_in *)0xbfe629a0 = { 2, -5117, 127.0.0.1 }
amandad: dgram_send_addr: 0x6fc084->socket = 0
amandad: time 0.133: sending REP pkt:
<<<<<
OPTIONS features=fffffeff9ffeffffff7f;
OK /var
OK /var
OK /var
OK /
OK /
OK /
OK /usr/lib/amanda/runtar executable
OK /bin/tar executable
OK /var/lib/amanda/gnutar-lists/. read/writable
OK /etc/amandates read/writable
OK /bin/gzip executable
OK /dev/null read/writable
OK /tmp/amanda has more than 64 KB available.
OK /tmp/amanda has more than 64 KB available.
OK /etc has more than 64 KB available.
>>>>>
amandad: dgram_send_addr(addr=0xbfe62ae0, dgram=0x6fc084)
amandad: (sockaddr_in *)0xbfe62ae0 = { 2, -5117, 127.0.0.1 }
amandad: dgram_send_addr: 0x6fc084->socket = 0
amandad: time 0.134: dgram_recv(dgram=(nil), timeout=7389296, fromaddr=0xbfe62aac)
amandad: (sockaddr_in *)0x70c070 = { 2, -5117, 127.0.0.1 }
amandad: time 0.134: received ACK pkt:
<<<<<
>>>>>
security_close(handle=0x99cbe88, driver=0x6fb0e0 (BSD))
amandad: time 30.127: pid 4104 finish time Sat Aug 12 10:47:31 2006

bethany
August 12th, 2006, 07:15 AM
$ sudo cat /tmp/amanda/amandad/amandad.20060812104810.debug

amandad: debug 1 pid 4171 ruid 502 euid 502: start at Sat Aug 12 10:48:10 2006
security_getdriver(name=BSD) returns 0x13e0e0
amandad: version 2.5.1b2
amandad: build: VERSION="Amanda-2.5.1b2"
amandad: BUILT_DATE="Fri Jul 28 18:18:32 PDT 2006"
amandad: BUILT_MACH="Linux rocky.zmanda.com 2.6.9-22.0.2.ELsmp #1 SMP Thu Jan 5 17:13:01 EST 2006 i686 i686 i386 GNU/Linux"
amandad: CC="gcc"
amandad: CONFIGURE_COMMAND="'./configure' '--build=i386-redhat-linux' '--prefix=/usr' '--bindir=/usr/bin' '--sbindir=/usr/sbin' '--libexecdir=/usr/lib/amanda' '--datadir=/usr/share' '--sysconfdir=/etc' '--sharedstatedir=/var/lib/amanda' '--localstatedir=/var/lib/amanda' '--libdir=/usr/lib' '--includedir=/usr/include' '--infodir=/usr/info' '--mandir=/usr/share/man' '--with-gnutar=/bin/tar' '--with-gnutar-listdir=/var/lib/amanda/gnutar-lists' '--with-dumperdir=/usr/lib/amanda' '--with-index-server=localhost' '--with-tape-server=localhost' '--with-user=amandabackup' '--with-group=disk' '--with-owner=paddy' '--with-fqdn' '--with-bsd-security' '--with-bsdtcp-security' '--with-bsdudp-security' '--with-ssh-security' '--with-assertions'"
amandad: paths: bindir="/usr/bin" sbindir="/usr/sbin"
amandad: libexecdir="/usr/lib/amanda" mandir="/usr/share/man"
amandad: AMANDA_TMPDIR="/tmp/amanda" AMANDA_DBGDIR="/tmp/amanda"
amandad: CONFIG_DIR="/etc/amanda" DEV_PREFIX="/dev/"
amandad: RDEV_PREFIX="/dev/" DUMP="/sbin/dump"
amandad: RESTORE="/sbin/restore" VDUMP=UNDEF VRESTORE=UNDEF
amandad: XFSDUMP=UNDEF XFSRESTORE=UNDEF VXDUMP=UNDEF VXRESTORE=UNDEF
amandad: SAMBA_CLIENT="/usr/bin/smbclient" GNUTAR="/bin/tar"
amandad: COMPRESS_PATH="/bin/gzip" UNCOMPRESS_PATH="/bin/gzip"
amandad: LPRCMD="/usr/bin/lpr" MAILER="/usr/bin/Mail"
amandad: listed_incr_dir="/var/lib/amanda/gnutar-lists"
amandad: defs: DEFAULT_SERVER="localhost" DEFAULT_CONFIG="DailySet1"
amandad: DEFAULT_TAPE_SERVER="localhost" HAVE_MMAP HAVE_SYSVSHM
amandad: LOCKING=POSIX_FCNTL SETPGRP_VOID ASSERTIONS DEBUG_CODE
amandad: AMANDA_DEBUG_DAYS=4 BSD_SECURITY RSH_SECURITY USE_AMANDAHOSTS
amandad: CLIENT_LOGIN="amandabackup" FORCE_USERID HAVE_GZIP
amandad: COMPRESS_SUFFIX=".gz" COMPRESS_FAST_OPT="--fast"
amandad: COMPRESS_BEST_OPT="--best" UNCOMPRESS_OPT="-dc"
amandad: time 0.000: dgram_recv(dgram=(nil), timeout=1372272, fromaddr=0x1)
amandad: (sockaddr_in *)0x14f070 = { 0, 0, 0.0.0.0 }
security_handleinit(handle=0x8cd9e88, driver=0x13e0e0 (BSD))
amandad: time 0.000: accept recv REQ pkt:
<<<<<
SERVICE noop
OPTIONS features=fffffeff9ffeffffff7f;
>>>>>
amandad: time 0.000: creating new service: noop
OPTIONS features=fffffeff9ffeffffff7f;

amandad: time 0.002: sending ACK pkt:
<<<<<
>>>>>
amandad: dgram_send_addr(addr=0xbfebf500, dgram=0x13f084)
amandad: (sockaddr_in *)0xbfebf500 = { 2, 29442, 127.0.0.1 }
amandad: dgram_send_addr: 0x13f084->socket = 0
amandad: time 0.003: sending REP pkt:
<<<<<
OPTIONS features=fffffeff9ffeffffff7f;
>>>>>
amandad: dgram_send_addr(addr=0xbfebf640, dgram=0x13f084)
amandad: (sockaddr_in *)0xbfebf640 = { 2, 29442, 127.0.0.1 }
amandad: dgram_send_addr: 0x13f084->socket = 0
amandad: time 0.003: dgram_recv(dgram=(nil), timeout=1372272, fromaddr=0x833401)
amandad: (sockaddr_in *)0x14f070 = { 2, 29442, 127.0.0.1 }
amandad: time 0.003: received ACK pkt:
<<<<<
>>>>>
security_close(handle=0x8cd9e88, driver=0x13e0e0 (BSD))
amandad: time 0.004: dgram_recv(dgram=(nil), timeout=1372272, fromaddr=0x83133a)
amandad: (sockaddr_in *)0x14f070 = { 2, 29442, 127.0.0.1 }
security_handleinit(handle=0x8cd9e88, driver=0x13e0e0 (BSD))
amandad: time 0.004: accept recv REQ pkt:
<<<<<
SERVICE sendsize
OPTIONS features=fffffeff9ffeffffff7f;maxdumps=1;hostname= localhost;config=ISDaily2.5;
GNUTAR /var 0 1970:1:1:0:0:0 -1 OPTIONS |;auth=BSD;compress-fast;index;
GNUTAR / 0 1970:1:1:0:0:0 -1 OPTIONS |;auth=BSD;compress-fast;index;
>>>>>
amandad: time 0.004: creating new service: sendsize
OPTIONS features=fffffeff9ffeffffff7f;maxdumps=1;hostname= localhost;config=ISDaily2.5;
GNUTAR /var 0 1970:1:1:0:0:0 -1 OPTIONS |;auth=BSD;compress-fast;index;
GNUTAR / 0 1970:1:1:0:0:0 -1 OPTIONS |;auth=BSD;compress-fast;index;

amandad: time 0.005: sending ACK pkt:
<<<<<
>>>>>
amandad: dgram_send_addr(addr=0xbfebf500, dgram=0x13f084)
amandad: (sockaddr_in *)0xbfebf500 = { 2, 29442, 127.0.0.1 }
amandad: dgram_send_addr: 0x13f084->socket = 0
amandad: time 0.038: sending PREP pkt:
<<<<<
OPTIONS features=fffffeff9ffeffffff7f;
>>>>>
amandad: dgram_send_addr(addr=0xbfebf600, dgram=0x13f084)
amandad: (sockaddr_in *)0xbfebf600 = { 2, 29442, 127.0.0.1 }
amandad: dgram_send_addr: 0x13f084->socket = 0
amandad: time 29.372: sending PREP pkt:
<<<<<
OPTIONS features=fffffeff9ffeffffff7f;
/ 0 SIZE 1237170
>>>>>
amandad: dgram_send_addr(addr=0xbfebf600, dgram=0x13f084)
amandad: (sockaddr_in *)0xbfebf600 = { 2, 29442, 127.0.0.1 }
amandad: dgram_send_addr: 0x13f084->socket = 0
amandad: time 30.778: sending PREP pkt:
<<<<<
OPTIONS features=fffffeff9ffeffffff7f;
/ 0 SIZE 1237170
/var 0 SIZE 76380
>>>>>
amandad: dgram_send_addr(addr=0xbfebf600, dgram=0x13f084)
amandad: (sockaddr_in *)0xbfebf600 = { 2, 29442, 127.0.0.1 }
amandad: dgram_send_addr: 0x13f084->socket = 0
amandad: time 30.779: sending REP pkt:
<<<<<
OPTIONS features=fffffeff9ffeffffff7f;
/ 0 SIZE 1237170
/var 0 SIZE 76380
>>>>>
amandad: dgram_send_addr(addr=0xbfebf640, dgram=0x13f084)
amandad: (sockaddr_in *)0xbfebf640 = { 2, 29442, 127.0.0.1 }
amandad: dgram_send_addr: 0x13f084->socket = 0
amandad: time 30.780: dgram_recv(dgram=(nil), timeout=1372272, fromaddr=0xbfebf60c)
amandad: (sockaddr_in *)0x14f070 = { 2, 29442, 127.0.0.1 }
amandad: time 30.780: received ACK pkt:
<<<<<
>>>>>
security_close(handle=0x8cd9e88, driver=0x13e0e0 (BSD))
amandad: time 59.773: pid 4171 finish time Sat Aug 12 10:49:10 2006

bethany
August 12th, 2006, 07:18 AM
2)$ sudo cat /var/lib/amanda/ISDaily/amdump.1

amdump: start at Sat Aug 12 10:48:10 EDT 2006
amdump: datestamp 20060812
planner: pid 4169 executable /usr/lib/amanda/planner version 2.5.1b2
planner: build: VERSION="Amanda-2.5.1b2"
planner: BUILT_DATE="Fri Jul 28 18:18:32 PDT 2006"
planner: BUILT_MACH="Linux rocky.zmanda.com 2.6.9-22.0.2.ELsmp #1 SMP Thu Jan 5 17:13:01 EST 2006 i686 i686 i386 GNU/Linux"
planner: CC="gcc"
planner: CONFIGURE_COMMAND="'./configure' '--build=i386-redhat-linux' '--prefix=/usr' '--bindir=/usr/bin' '--sbindir=/usr/sbin' '--libexecdir=/usr/lib/amanda' '--datadir=/usr/share' '--sysconfdir=/etc' '--sharedstatedir=/var/lib/amanda' '--localstatedir=/var/lib/amanda' '--libdir=/usr/lib' '--includedir=/usr/include' '--infodir=/usr/info' '--mandir=/usr/share/man' '--with-gnutar=/bin/tar' '--with-gnutar-listdir=/var/lib/amanda/gnutar-lists' '--with-dumperdir=/usr/lib/amanda' '--with-index-server=localhost' '--with-tape-server=localhost' '--with-user=amandabackup' '--with-group=disk' '--with-owner=paddy' '--with-fqdn' '--with-bsd-security' '--with-bsdtcp-security' '--with-bsdudp-security' '--with-ssh-security' '--with-assertions'"
planner: paths: bindir="/usr/bin" sbindir="/usr/sbin"
planner: libexecdir="/usr/lib/amanda" mandir="/usr/share/man"
planner: AMANDA_TMPDIR="/tmp/amanda" AMANDA_DBGDIR="/tmp/amanda"
planner: CONFIG_DIR="/etc/amanda" DEV_PREFIX="/dev/"
planner: RDEV_PREFIX="/dev/" DUMP="/sbin/dump"
planner: RESTORE="/sbin/restore" VDUMP=UNDEF VRESTORE=UNDEF
planner: XFSDUMP=UNDEF XFSRESTORE=UNDEF VXDUMP=UNDEF VXRESTORE=UNDEF
planner: SAMBA_CLIENT="/usr/bin/smbclient" GNUTAR="/bin/tar"
planner: COMPRESS_PATH="/bin/gzip" UNCOMPRESS_PATH="/bin/gzip"
planner: LPRCMD="/usr/bin/lpr" MAILER="/usr/bin/Mail"
planner: listed_incr_dir="/var/lib/amanda/gnutar-lists"
planner: defs: DEFAULT_SERVER="localhost" DEFAULT_CONFIG="DailySet1"
planner: DEFAULT_TAPE_SERVER="localhost" HAVE_MMAP HAVE_SYSVSHM
planner: LOCKING=POSIX_FCNTL SETPGRP_VOID ASSERTIONS DEBUG_CODE
planner: AMANDA_DEBUG_DAYS=4 BSD_SECURITY RSH_SECURITY USE_AMANDAHOSTS
planner: CLIENT_LOGIN="amandabackup" FORCE_USERID HAVE_GZIP
planner: COMPRESS_SUFFIX=".gz" COMPRESS_FAST_OPT="--fast"
planner: COMPRESS_BEST_OPT="--best" UNCOMPRESS_OPT="-dc"
READING CONF FILES...
planner: timestamp 20060812104810
planner: time 0.002: startup took 0.002 secs

SENDING FLUSHES...
ENDFLUSH

SETTING UP FOR ESTIMATES...
planner: time 0.002: setting up estimates for cfdev:/
cfdev:/ overdue 13373 days for level 0
setup_estimate: cfdev:/: command 0, options: none last_level -1 next_level0 -13373 level_days 0 getting estimates 0 (-2) -1 (-2) -1 (-2)
planner: time 0.002: setting up estimates for cfdev:/var
cfdev:/var overdue 13373 days for level 0
setup_estimate: cfdev:/var: command 0, options: none last_level -1 next_level0 -13373 level_days 0 getting estimates 0 (-2) -1 (-2) -1 (-2)
planner: time 0.002: setting up estimates for localhost:/
localhost:/ overdue 13373 days for level 0
setup_estimate: localhost:/: command 0, options: none last_level -1 next_level0 -13373 level_days 0 getting estimates 0 (-2) -1 (-2) -1 (-2)
planner: time 0.002: setting up estimates for localhost:/var
localhost:/var overdue 13373 days for level 0
setup_estimate: localhost:/var: command 0, options: none last_level -1 next_level0 -13373 level_days 0 getting estimates 0 (-2) -1 (-2) -1 (-2)
planner: time 0.002: setting up estimates took 0.000 secs

GETTING ESTIMATES...
driver: pid 4170 executable /usr/lib/amanda/driver version 2.5.1b2
driver: tape size 1048576000
driver: adding holding disk 0 dir /amandahold size 393165168 chunksize 2097152
planner: time 0.052: got partial result for host cfdev disk /var: 0 -> -2K, -1 -> -2K, -1 -> -2K
planner: time 0.052: got partial result for host cfdev disk /: 0 -> -2K, -1 -> -2K, -1 -> -2K
planner: time 0.052: got partial result for host localhost disk /var: 0 -> -2K, -1 -> -2K, -1 -> -2K
planner: time 0.052: got partial result for host localhost disk /: 0 -> -2K, -1 -> -2K, -1 -> -2K
reserving 393165168 out of 393165168 for degraded-mode dumps
driver: send-cmd time 0.095 to taper: START-TAPER 20060812104810
driver: started dumper0 pid 4181
driver: send-cmd time 0.095 to dumper0: START 20060812104810
driver: started dumper1 pid 4182
driver: send-cmd time 0.096 to dumper1: START 20060812104810
driver: started dumper2 pid 4183
driver: send-cmd time 0.096 to dumper2: START 20060812104810
driver: started dumper3 pid 4184
driver: send-cmd time 0.097 to dumper3: START 20060812104810
driver: start time 0.097 inparallel 4 bandwidth 11400 diskspace 393165168 dir OBSOLETE datestamp 20060812104810 driver: drain-ends tapeq FIRST big-dumpers ttt
dumper: pid 4181 executable dumper0 version 2.5.1b2
dumper: pid 4182 executable dumper1 version 2.5.1b2
dumper: pid 4183 executable dumper2 version 2.5.1b2
dumper: pid 4184 executable dumper3 version 2.5.1b2
taper: pid 4180 executable taper version 2.5.1b2
taper: page size = 4096
taper: buffer size is 32768
changer: opening pipe to: /usr/lib/amanda/chg-disk -info
changer: opening pipe to: /usr/lib/amanda/chg-disk -slot current
taper: slot: 9 wrote label `ISDaily03' date `20060812104810'
driver: result time 0.200 from taper: TAPER-OK
driver: state time 0.200 free kps: 11400 space: 393165168 taper: idle idle-dumpers: 4 qlen tapeq: 0 runq: 0 roomq: 0 wakeup: 0 driver-idle: not-idle
driver: interface-state time 0.200 if default: free 10000 if LOCAL: free 1000 if ETH0: free 400
driver: hdisk-state time 0.200 hdisk 0: free 393165168 dumpers 0
planner: time 3.250: got partial result for host cfdev disk /var: 0 -> -2K, -1 -> -2K, -1 -> -2K
planner: time 3.250: got partial result for host cfdev disk /: 0 -> 1532070K, -1 -> -2K, -1 -> -2K
planner: time 3.730: got partial result for host cfdev disk /var: 0 -> 76900K, -1 -> -2K, -1 -> -2K
planner: time 3.730: got partial result for host cfdev disk /: 0 -> 1532070K, -1 -> -2K, -1 -> -2K
planner: time 3.730: got result for host cfdev disk /var: 0 -> 76900K, -1 -> -2K, -1 -> -2K
planner: time 3.730: got result for host cfdev disk /: 0 -> 1532070K, -1 -> -2K, -1 -> -2K
planner: time 29.387: got partial result for host localhost disk /var: 0 -> -2K, -1 -> -2K, -1 -> -2K
planner: time 29.387: got partial result for host localhost disk /: 0 -> 1237170K, -1 -> -2K, -1 -> -2K
planner: time 30.793: got partial result for host localhost disk /var: 0 -> 76380K, -1 -> -2K, -1 -> -2K
planner: time 30.793: got partial result for host localhost disk /: 0 -> 1237170K, -1 -> -2K, -1 -> -2K
planner: time 30.793: got result for host localhost disk /var: 0 -> 76380K, -1 -> -2K, -1 -> -2K
planner: time 30.793: got result for host localhost disk /: 0 -> 1237170K, -1 -> -2K, -1 -> -2K
planner: time 30.793: getting estimates took 30.791 secs
FAILED QUEUE: empty
DONE QUEUE:
0: cfdev /var
1: cfdev /
2: localhost /var
3: localhost /

ANALYZING ESTIMATES...
pondering cfdev:/var... next_level0 -13373 last_level -1 (due for level 0) (new disk, can't switch to degraded mode)
curr level 0 nsize 76900 csize 38450 total size 41546 total_lev0 38450 balanced-lev0size 7690
pondering cfdev:/... next_level0 -13373 last_level -1 (due for level 0) (new disk, can't switch to degraded mode)
curr level 0 nsize 1532070 csize 766035 total size 808613 total_lev0 804485 balanced-lev0size 160897
pondering localhost:/var... next_level0 -13373 last_level -1 (due for level 0) (new disk, can't switch to degraded mode)
curr level 0 nsize 76380 csize 38190 total size 847835 total_lev0 842675 balanced-lev0size 168535
pondering localhost:/... next_level0 -13373 last_level -1 (due for level 0) (new disk, can't switch to degraded mode)
curr level 0 nsize 1237170 csize 618585 total size 1467452 total_lev0 1461260 balanced-lev0size 292252
INITIAL SCHEDULE (size 1467452):
localhost / pri 13374 lev 0 nsize 1237170 csize 618585
cfdev /var pri 13374 lev 0 nsize 76900 csize 38450
localhost /var pri 13374 lev 0 nsize 76380 csize 38190
cfdev / pri 13373 lev 0 nsize 1532070 csize 766035

DELAYING DUMPS IF NEEDED, total_size 1467452, tape length 1048576000 mark 1000
delay: Total size now 1467452.

PROMOTING DUMPS IF NEEDED, total_lev0 1461260, balanced_size 292252...
planner: time 30.794: analysis took 0.000 secs

GENERATING SCHEDULE:
--------
DUMP localhost fffffeff9ffeffffff7f / 20060812104810 13374 0 1970:1:1:0:0:0 1237170 618585 604 1024
DUMP cfdev fffffeff9ffeffffff7f /var 20060812104810 13374 0 1970:1:1:0:0:0 76900 38450 37 1024
DUMP localhost fffffeff9ffeffffff7f /var 20060812104810 13374 0 1970:1:1:0:0:0 76380 38190 37 1024
DUMP cfdev fffffeff9ffeffffff7f / 20060812104810 13373 0 1970:1:1:0:0:0 1532070 766035 748 1024
--------
driver: flush size 0
driver: find_diskspace: time 30.789: want 38496 K
driver: find_diskspace: time 30.789: find diskspace: size 38496 hf 393165168 df 393159152 da 38496 ha 38528
driver: find_diskspace: time 30.789: find diskspace: selected /amandahold free 393165168 reserved 38528 dumpers 0
driver: find_diskspace: time 30.789: want 766080 K
driver: find_diskspace: time 30.789: find diskspace: size 766080 hf 393165168 df 393159152 da 766080 ha 766112
driver: find_diskspace: time 30.789: find diskspace: selected /amandahold free 393165168 reserved 766112 dumpers 0

continued on next post...

bethany
August 12th, 2006, 07:19 AM
taper: DONE [idle wait: 30.589 secs]
/usr/sbin/amdump: line 109: 4169 Done $libexecdir/planner$SUF $conf "$@"
4170 Aborted | $libexecdir/driver$SUF $conf "$@"
taper: writing end marker. [ISDaily03 OK kb 0 fm 0]
amdump: end at Sat Aug 12 10:48:41 EDT 2006
Scanning /amandahold...
20060812004501: found Amanda directory.
20060812104810: found Amanda directory.
lost+found: 20060811171148: found Amanda directory.


3) /etc/amanda/ISDaily2.5/log.xxx <-- the latest one.

$ sudo cat /var/lib/amanda/ISDaily/log.20060812104810.0
DISK planner cfdev /
DISK planner cfdev /var
DISK planner localhost /
DISK planner localhost /var
START planner date 20060812104810
INFO planner Adding new disk cfdev:/.
INFO planner Adding new disk cfdev:/var.
INFO planner Adding new disk localhost:/.
INFO planner Adding new disk localhost:/var.
START driver date 20060812104810
STATS driver startup time 0.097
START taper datestamp 20060812104810 label ISDaily03 tape 0
FINISH planner date 20060812104810 time 30.794
INFO taper tape ISDaily03 kb 0 fm 0 [OK]

ktill
August 14th, 2006, 12:57 PM
thanks for providing additional information! However, I don't see anything obviously wrong.

Is there a core file produced? If so, could you please show the backtrace?
You might need to change the core file size allowed by doing
"ulimit -c unlimited"

Also, could you try to log in as amandabackup and do the amdump command.

bethany
August 14th, 2006, 01:11 PM
So while I was waiting for a response on this thread I decided to work a bit more on my cfengine scripts that install and configure amanda. Previously I had been installing/upgrading tar by hand after installing the RPM. In the process of testing the cfengine actions, I was rpm -e'ing the amanda-backup_server RPM then running the cfengine script to upgrade tar then install Amanda server. Now amdump is running without crashing! So I guess the key is to upgrade tar to 1.15 BEFORE you install the amanda-backup_server RPM. I've only run it once but so far so good...

Have you guys considered making Tar >=1.15 a dependency for the RPM?

bethany
August 16th, 2006, 05:27 AM
Well I guess I spoke too soon - amdump is still seg faulting. I will try doing the dumps as amandabackup and will look for the core files.

I also might try starting fresh with an Amada config file just to rule out some funkiness with my amanda.conf which was recycled and adjusted from a 2.4.0 configuration...

bethany
August 21st, 2006, 06:38 AM
OK, so I started fresh with a new amanda.conf derived from the example amanda.conf that comes with the RPM. I think I've narrowed down the segfaults to when my disklist file contains a combination of local and remote disks. If only local disks are in disklist, it dumps fine. If only remote disks are in disklist, that also works. Examples of the disklist files and result of amdump:

3. Aug 17 12:37
middenheap-dev.mydomain.edu / holding-disk -1 local
Result: Success

4. Aug 17 12:39
cfdev.mydomain.edu / comp-root-tar
Result: Success

5. Aug 17 12:44
cfdev.mydomain.edu / comp-root-tar
cfdev.mydomain.edu /var comp-user-tar
Result: Success

7 - Aug 17 12:50
cfdev.mydomain.edu / comp-root-tar
cfdev.mydomain.edu /var comp-user-tar
bethany-30.mydomain.edu / root-tar
Result: Success

8 - Aug 17 4:15
cfdev.mydomain.edu / comp-root-tar
cfdev.mydomain.edu /var comp-user-tar
bethany-30.mydomain.edu / server-encrypt-fast
bethany-30.mydomain.edu / root-tar
Result: Success

9 Aug 17 5:15
cfdev.mydomain.edu / server-encrypt-fast
cfdev.mydomain.edu /var comp-user-tar
middenheap-dev.mydomain.edu / holding-disk -1 local
middenheap-dev.mydomain.edu /var comp-user-tar -1 local
Result: Failed

paddy
August 22nd, 2006, 11:12 AM
Hi Bethany,

I think we may have a fix for this problem. I will build a RHEL4 amanda server
rpm to verify if it fixes your problem. Please confirm that you are using
RHEL4.

Thanks,
Paddy

bethany
August 22nd, 2006, 11:14 AM
Yes, I am running RHEL4.
Fingers crossed...

paddy
August 22nd, 2006, 03:02 PM
I have uploaded the amanda-backup_server rpm for RHEL4 for testing. Please use wget to download the rpm.

wget http://www.zmanda.com/downloads/community/2.5.1b2/Test/amanda-backup_server-2.5.1b2-1.rhel4.i386.rpm

Please let us know the results.

Thanks,
Paddy

bethany
August 23rd, 2006, 06:02 AM
So far so good! I had to rpm -e the amanda-backup_client on the server machine before this new RPM would install, but I guess that means I shouldn't have had both the server and client RPMs installed on my Amanda server to begin with.

I just ran a dump that included disk configurations that crashed in the past and everything went fine. Will run a few more tests today and tonight...