amanda 3.1.0 virtual changer hardware failed



thewizard
June 29th, 2010, 04:00 AM
Hi,


I'm coming from amanda 2.5, where I used the bash script virtual changer.
Recently a restoration test failed due to a tar error, and manually joining the splits (correctly) yielded the same error, so I decided to switch to the more recent amanda 3.1.0.

I started the configuration from scratch. Everything should have been fine (and amcheck was good), but I got a failure email report:


DailySet1 FAIL: AMANDA MAIL REPORT
...
driver: Taper protocol error
driver: going into degraded mode because of taper component error.
taper: Will request retry of failed split part.
...


Inspecting the logs, I can see:


PARTIAL taper ... "Could not seek '(cache file)' for reading: Bad file descriptor partial taper"


This was on the fourth vtape, the backup correctly spanned across tapes 1 through 3 and failed on the fourth (I gave it 5 tapes per run).

Any thoughts?

Thanks.

thewizard
June 30th, 2010, 10:36 AM
I see no replies...

I've checked the logs a bit more and saw:


driver taper pid 21787 exited with signal 6


On my system (defined in signum.h) I see it's one of the following two:


#define SIGABRT 6 /* Abort (ANSI). */
#define SIGIOT 6 /* IOT trap (4.2 BSD). */
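
As a cross-check that does not depend on reading signum.h, here is a minimal Python sketch that asks the local platform for the name of a signal number. The helper name `signal_name` is made up for illustration; only the standard `signal` module is used, and the name returned depends on the platform the script runs on.

```python
import signal


def signal_name(number: int) -> str:
    """Return the name the local platform's Python uses for a signal number."""
    try:
        return signal.Signals(number).name
    except ValueError:
        # The number does not correspond to a signal known on this platform.
        return f"unknown signal {number}"


# The taper process in the log above exited with signal 6.
print(signal_name(6))
```

On Linux this normally reports SIGABRT, with SIGIOT being an alias for the same number. An abort usually means the process called abort() itself, for example after a failed assertion or a critical error.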


But I think this happened after it failed with:


...
... "No space left on device"
...
... "Could not seek '(cache file)' for reading: Bad file descriptor"


The "No space left on device" is normal when switching vtapes; the error that follows it is the strange one.
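
To tell the normal end-of-vtape case apart from the failing one, a small script can look for a "No space left on device" line that is followed shortly afterwards by the seek error. The following is only a sketch: the function name `find_failed_switches`, the `window` parameter, and the sample lines are assumptions for illustration, not real Amanda log output, so the matching strings would need adjusting to the actual log files.

```python
from typing import Iterable, List, Tuple

# Message fragments quoted in this thread.
ENOSPC_TEXT = "No space left on device"
SEEK_TEXT = "Could not seek '(cache file)' for reading"


def find_failed_switches(lines: Iterable[str], window: int = 20) -> List[Tuple[int, int]]:
    """Return (enospc_line, seek_error_line) pairs, using 1-based line numbers.

    A pair is reported when a seek error appears within `window` lines
    of the most recent "No space left on device" message.
    """
    pairs = []
    last_enospc = None
    for lineno, line in enumerate(lines, start=1):
        if ENOSPC_TEXT in line:
            last_enospc = lineno
        elif SEEK_TEXT in line and last_enospc is not None:
            if lineno - last_enospc <= window:
                pairs.append((last_enospc, lineno))
            last_enospc = None
    return pairs


# Hypothetical sample: one clean vtape switch, then one that fails.
sample = [
    'taper: "No space left on device"',
    "taper: wrote label `DailySet1-14'",
    'taper: "No space left on device"',
    "taper: \"Could not seek '(cache file)' for reading: Bad file descriptor\"",
]
print(find_failed_switches(sample))  # [(3, 4)]
```

Run against a real log, for example via `open(path)`, this would show how many vtape switches succeeded before the first failing one.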

P.S.

ulimit for the amanda user seems fine:


time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) 8192
coredump(blocks) 0
memory(kbytes) unlimited
locked memory(kbytes) 32
process 38912
nofiles 1024
vmemory(kbytes) unlimited
locks unlimited

thewizard
July 4th, 2010, 07:35 AM
This is still happening:



...
driver: result time 9469.691 from taper: REQUEST-NEW-TAPE 00-00001
driver: send-cmd time 9469.691 to taper: NEW-TAPE
taper: wrote label `DailySet1-14'
driver: state time 9469.797 free kps: 6976 space: 32579584 taper: writing idle-dumpers: 3 qlen tapeq: 0 runq: 0 roomq: 0 wakeup: 0 driver-idle: not-idle
driver: interface-state time 9469.797 if default: free 6976
driver: hdisk-state time 9469.797 hdisk 0: free 32579584 dumpers 0
driver: result time 9469.797 from taper: NEW-TAPE 00-00001 DailySet1-14
aespipe: write failed
driver: state time 9469.859 free kps: 6976 space: 32579584 taper: writing idle-dumpers: 3 qlen tapeq: 0 runq: 0 roomq: 0 wakeup: 0 driver-idle: not-idle
driver: interface-state time 9469.859 if default: free 6976
driver: hdisk-state time 9469.859 hdisk 0: free 32579584 dumpers 0
driver: result time 9469.859 from taper: PARTIAL 00-00001 INPUT-GOOD TAPE-ERROR "[sec 9193.000000 kb 50331648 kps 5474.997063]" "" "Could not seek '(cache file)' for reading: Bad file descriptor"


This always happens after the second or third vtape change: it gets through at least two vtapes (sometimes three) and then dies.
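
The PARTIAL result line in the log above also says how much data reached the vtapes before the failure. Below is a minimal Python sketch that pulls the `[sec ... kb ... kps ...]` block out of such a line and recomputes the rate. The function name `parse_taper_stats` is made up for illustration, and the conversion to GiB assumes that Amanda's "kb" means 1024 bytes.

```python
import re

# Matches the statistics block quoted in the taper PARTIAL result line.
STATS_RE = re.compile(r"\[sec (?P<sec>[\d.]+) kb (?P<kb>\d+) kps (?P<kps>[\d.]+)\]")


def parse_taper_stats(line: str) -> dict:
    """Extract elapsed seconds, kilobytes written and rate from a result line."""
    match = STATS_RE.search(line)
    if match is None:
        raise ValueError("no [sec ... kb ... kps ...] block found")
    sec = float(match["sec"])
    kb = int(match["kb"])
    return {
        "sec": sec,
        "kb": kb,
        "kps_reported": float(match["kps"]),
        "kps_computed": kb / sec,
        "gib": kb / (1024 * 1024),  # assumption: 1 kb = 1024 bytes
    }


line = (
    "driver: result time 9469.859 from taper: PARTIAL 00-00001 INPUT-GOOD TAPE-ERROR "
    '"[sec 9193.000000 kb 50331648 kps 5474.997063]" "" '
    "\"Could not seek '(cache file)' for reading: Bad file descriptor\""
)
stats = parse_taper_stats(line)
print(stats["gib"], round(stats["kps_computed"], 3))  # 48.0 5474.997
```

For this line the recomputed rate matches the reported kps. Under the 1024-byte assumption, the taper had written exactly 48 GiB in about 2.5 hours when it gave up.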

I've checked the amanda-client logs (it's the same machine) and saw the following in the sendbackup files (this is one example; six others show the same output):


sendbackup.20100702140915.debug-Fri Jul 2 14:09:15 2010: sendbackup: gnutar: /usr/libexec/amanda/runtar: pid 16755
sendbackup.20100702140915.debug-Fri Jul 2 14:09:15 2010: sendbackup: Started backup
sendbackup.20100702140915.debug-Fri Jul 2 14:09:15 2010: sendbackup: Started index creator: "/bin/tar -tf - 2>/dev/null | sed -e 's/^\.//'"
sendbackup.20100702140915.debug:Fri Jul 2 14:10:00 2010: sendbackup: critical (fatal): index tee cannot write [Broken pipe]
sendbackup.20100702140915.debug-/usr/lib/amanda/libamanda-3.1.0.so[0x7fe96f5f1b6e]
sendbackup.20100702140915.debug-/usr/lib/libglib-2.0.so.0(g_logv+0x1ad)[0x7fe96e66de8d]
sendbackup.20100702140915.debug-/usr/lib/libglib-2.0.so.0(g_log+0x83)[0x7fe96e66e123]
sendbackup.20100702140915.debug-/usr/libexec/amanda/sendbackup(start_index+0x4d4)[0x408031]
sendbackup.20100702140915.debug-/usr/libexec/amanda/sendbackup[0x40acf8]
sendbackup.20100702140915.debug-/usr/libexec/amanda/sendbackup(main+0x2b9f)[0x4065f6]
sendbackup.20100702140915.debug-/lib/libc.so.6(__libc_start_main+0xf4)[0x7fe96c51e1c4]
sendbackup.20100702140915.debug-/usr/libexec/amanda/sendbackup[0x403879]

thewizard
July 5th, 2010, 03:41 AM
Solved.

No one bothered to reply, but in case anyone else is interested:

I did nothing special but changed the following:

updated to 3.1.1.
changed the xinetd service to tcp, stream, nowait, with the bsdtcp auth parameter.
added auth bsdtcp in both the dumptype and amanda-client.
changed compression from client to server (though it's the same machine).
changed to bzip2, but I don't think it's relevant.

One of those solved it.

thewizard
July 7th, 2010, 10:00 PM
It seems it was a fluke!


I have the error again, in the mail:


partial taper: Error reading '(cache file)': Unexpected EOF
FAILED [data write: Broken pipe]


This always happens when switching vtapes (using the new perl switcher).
It switches fine between vtape1 and vtape2, continues, then dies on the next switch to vtape3.
I'm almost sure it's not tied to this particular vtape2-vtape3 switch; I think it once finished vtape3 and died on switching to vtape4.

I've now changed the scheme to very large vtapes to see how it goes, but I think it is a problem. My current configuration is without a holding disk (with a split disk though), and I've made sure it's not because I'm out of space. I'll test today and see how it goes.