Hello, all.
I'm experiencing a problem backing up the /home directory on only one of our servers. We have 15 other Linux servers and 6 Windows servers that all back up successfully.
Here are the details: The Amanda backup server (named Gemini) is running Ubuntu Hardy 8.04 32-bit and Amanda 2.6.1p2 from the .deb on this site. The problematic client box (named Scorpio) is running CentOS 4.8 32-bit and the 2.6.1p2 client .rpm from this site.
I have several Disk List Entries being backed up on Scorpio (/etc, /root, /usr/local/apache, et al.) and they all work flawlessly. However, the backups for /home on this box fail consistently.
Here's the output from 'amreport':
Code:
*** THE DUMPS DID NOT FINISH PROPERLY!
Hostname: gemini
Org : ScorpioBackupsSet
Config : DailySet2
Date : March 1, 2010
These dumps were to tape DailySet2-6.
The next 2 tapes Amanda expects to use are: DailySet2-7, DailySet2-8.
STRANGE DUMP SUMMARY:
scorpio.pmt.org /home lev 1 STRANGE (see below)
STATISTICS:
Total Full Incr.
-------- -------- --------
Estimate Time (hrs:min) 0:00
Run Time (hrs:min) 0:01
Dump Time (hrs:min) 0:13 0:07 0:07
Output Size (meg) 1334.3 1334.3 0.0
Original Size (meg) 9376.2 8019.1 1357.0
Avg Compressed Size (%) 22.1 16.6 54.5 (level:#disks ...)
Filesystems Dumped 14 13 1 (1:1)
Avg Dump Rate (k/s) 1709.2 3405.7 0.0
Tape Time (hrs:min) 0:01 0:01 0:00
Tape Size (meg) 1334.3 1334.3 0.0
Tape Used (%) 6.7 6.7 0.0
Filesystems Taped 13 13 0
Chunks Taped 13 13 0
Avg Tp Write Rate (k/s) 40718.2 40718.2 --
USAGE BY TAPE:
Label Time Size % Nb Nc
DailySet2-6 0:01 1334M 6.7 13 13
STRANGE DUMP DETAILS:
-- scorpio.mydomain.tld /home lev 1 STRANGE
sendbackup: start [scorpio.mydomain.tld:/home level 1]
sendbackup: info BACKUP=/bin/tar
sendbackup: info RECOVER_CMD=/usr/bin/gzip -dc |/bin/tar -xpGf - ...
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
? /bin/tar: ./[user email caused the STRANGE message - nothing to worry about here]: Warning: Cannot stat: No such file or directory
| Total bytes written: 1422960640 (1.4GiB, 3.5MiB/s)
sendbackup: size 1389610
sendbackup: end
INFO dumper pid-done 11172
SUCCESS chunker scorpio.mydomain.tld /home 20100301020001 1 [sec 398.257 kb 756921 kps 1900.7]
INFO chunker pid-done 11171
STATS driver estimate scorpio.mydomain.tld /home 20100301020001 1 [sec 521 nkb 1568197 ckb 827808 kps 1587]
PART taper DailySet2-6 14 scorpio.mydomain.tld /home 20100301020001 1/1 1 [sec 26.279571 kb 756920 kps 28802.617465]
DONE taper scorpio.mydomain.tld /home 20100301020001 1 1 [sec 26.279571 kb 756920 kps 28802.617465]
INFO dumper pid-done 11046
INFO taper tape DailySet2-6 kb 2123270 fm 14 [OK]
INFO taper pid-done 11045
INFO dumper pid-done 11047
INFO dumper pid-done 11048
INFO dumper pid-done 11049
FINISH driver date 20100301020001 time 935.586
INFO driver pid-done 11044
--------
NOTES:
planner: Full dump of scorpio.mydomain.tld:/var/lib/mysql promoted from 6 days ahead.
<SNIP>
</SNIP>
planner: Full dump of scorpio.mydomain.tld:/usr/share/ssl promoted from 6 days ahead.
DUMP SUMMARY:
DUMPER STATS TAPER STATS
HOSTNAME DISK L ORIG-MB OUT-MB COMP% MMM:SS KB/s MMM:SS KB/s
-------------------------- ------------------------------------- -------------
scorpio.pmt. /etc 0 19 5 25.7 0:03 1970.6 0:00 491170.5
scorpio.pmt. /home 1 1357 739 54.5 6:38 1900.7
<SNIP></SNIP>
scorpio.pmt. -spool/cron 0 0 0 -- 0:00 29.1 0:00 3147.2
(brought to you by Amanda version 2.6.1p2)
So I see /home with information under the DUMPER, but absolutely nothing under the TAPER.
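For a longer report, rows like this can be picked out with a small script. The following is only a minimal sketch: it assumes the whitespace-flattened row layout pasted above (host, disk, level, ORIG-MB, OUT-MB, COMP%, dumper time, dumper KB/s, then taper time and taper KB/s), and the names `SAMPLE_ROWS` and `rows_missing_taper` are made up for illustration. It is not part of Amanda, and real amreport output for failed or partial dumps may be laid out differently.

```python
# Sample rows copied from the DUMP SUMMARY in the report above.
SAMPLE_ROWS = """\
scorpio.pmt. /etc 0 19 5 25.7 0:03 1970.6 0:00 491170.5
scorpio.pmt. /home 1 1357 739 54.5 6:38 1900.7
scorpio.pmt. -spool/cron 0 0 0 -- 0:00 29.1 0:00 3147.2
"""


def rows_missing_taper(summary_text):
    """Return (host, disk) pairs that show dumper stats but no taper stats.

    Assumed layout per row (an assumption based on the pasted report):
    host, disk, level, orig-MB, out-MB, comp%, dumper time, dumper KB/s,
    optionally followed by taper time and taper KB/s.
    """
    missing = []
    for line in summary_text.splitlines():
        fields = line.split()
        # Skip headers, separators, and anything that is not a data row.
        if len(fields) < 8 or not fields[2].isdigit():
            continue
        # A complete row has 10 fields; fewer means the taper columns are empty.
        if len(fields) < 10:
            missing.append((fields[0], fields[1]))
    return missing


if __name__ == "__main__":
    for host, disk in rows_missing_taper(SAMPLE_ROWS):
        print(f"{host} {disk}: dumped but no taper stats")
```

Run against the three sample rows, only the /home row is flagged, which matches what the summary shows by eye.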
To troubleshoot, I made a separate set of vtapes and a whole other config exclusively for /home on Scorpio, and the _exact_ same thing happens. I can see all the data pile up from dumper, get split up by chunker in my holding disk, and start writing to vtape. So everything appears to work UP TO the point that taper starts writing to vtape.
I will post output of the taper log for the last run for JUST the /home directory in my next post since I'm about to go over the limit for length.
This had been backing up successfully for several weeks. I know that we as IT folks always hear "I didn't change _anything_ and it just stopped working," and we know that generally doesn't happen, but I've queried my whole team and no one can identify any changes they've made on either the client OR the server (OS updates, network settings, etc.). In the process of troubleshooting, I removed and reinstalled the RPM package on the client _after_ the problems began. As previously mentioned, I also added a new vtape/config set to try to diagnose the problem before seeking outside help. I've tried several different dumptypes (comp-user-tar, user-tar, nocomp-user, nocomp-user-span) and they all fail when backing up /home.
Is it likely I'm just missing some obvious message in the logs, or is there somewhere else I should look for further troubleshooting?
Thanks in advance!
-Dave E.