FAIL Missing Part problem



angadsingh
July 1st, 2010, 11:29 PM
Hello Amanda Community,

I am the part-time IT administrator of a startup. We recently set up Amanda for the continuous backup of all our company's critical data to a 6 TB RAID array. We back up several directories from two machines (olabs1 and olabs2). Both machines run CentOS 5.3, and the Amanda server itself is installed on olabs1 (which is also a client).

Everything is configured and running, and it has been about a week now. I monitor Amanda's backup runs every day and have been running amadmin, amstatus, etc. to observe and tune Amanda's parameters for the best use of space and speed.

That said, today I saw a serious error in the output of the amadmin find command:



-sh-3.2$ amadmin DailySet1 find olabs2

date                host   disk           lv tape or file file part  status
2010-06-28 19:00:26 olabs2 /mnt/san/data   0 DailySet1-1     1 1/3   OK
2010-06-28 19:00:26 olabs2 /mnt/san/data   0 DailySet1-1     2 2/3   OK
2010-06-28 19:00:26 olabs2 /mnt/san/data   0 DailySet1-1     3 3/3   OK
2010-06-28 19:17:34 olabs2 /mnt/san/data   1 DailySet1-2     1 1/1   OK
2010-06-29 00:45:01 olabs2 /mnt/san/data   1 DailySet1-4    36 1/1   OK
2010-06-30 00:45:01 olabs2 /mnt/san/data   1 DailySet1-3     2 1/1   OK
2010-07-01 00:45:01 olabs2 /mnt/san/data   1 DailySet1-5    16 1/15  OK FAIL Missing part
2010-07-02 00:45:01 olabs2 /mnt/san/data   0 DailySet1-6     1 1/3   OK
2010-07-02 00:45:01 olabs2 /mnt/san/data   0 DailySet1-6     2 2/3   OK
2010-07-02 00:45:01 olabs2 /mnt/san/data   0 DailySet1-6     3 3/3   OK
2010-06-28 19:17:34 olabs2 /var/lib/mysql  0 DailySet1-2     2 1/1   OK
2010-06-29 00:45:01 olabs2 /var/lib/mysql  0 DailySet1-4    35 1/33  OK FAIL Missing part
2010-06-30 00:45:01 olabs2 /var/lib/mysql  1 DailySet1-3     1 1/1   OK
2010-07-01 00:45:01 olabs2 /var/lib/mysql  1 DailySet1-5    19 1/1   OK
2010-07-02 00:45:01 olabs2 /var/lib/mysql  1 DailySet1-6    10 1/4   OK FAIL Missing part


So amanda complains that file "16" on the vtape slot "DailySet1-5", which has part of the backup for "olabs2:/mnt/san/data" for the backup run at "2010-07-01 00:45:01" is MISSING!
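To pull the affected rows out of a longer amadmin find listing, a small filter like the following might help. This is only a sketch: the column layout in the regular expression is inferred from the output pasted above, not from any Amanda specification, and the names ROW and failed_rows are made up for illustration.

```python
import re

# Assumed row layout, inferred from the pasted output:
# date time host disk level label file part/total status...
ROW = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+(?P<disk>\S+)\s+(?P<level>\d+)\s+"
    r"(?P<label>\S+)\s+(?P<file>\d+)\s+"
    r"(?P<part>\d+)/(?P<total>\d+)\s+(?P<status>.*)$"
)

def failed_rows(find_output):
    """Return the parsed rows whose status column mentions FAIL."""
    rows = []
    for line in find_output.splitlines():
        m = ROW.match(line.strip())
        if m and "FAIL" in m.group("status"):
            rows.append(m.groupdict())
    return rows

# Three rows copied from the listing above.
sample = """\
2010-07-01 00:45:01 olabs2 /mnt/san/data 1 DailySet1-5 16 1/15 OK FAIL Missing part
2010-07-02 00:45:01 olabs2 /mnt/san/data 0 DailySet1-6 1 1/3 OK
2010-06-29 00:45:01 olabs2 /var/lib/mysql 0 DailySet1-4 35 1/33 OK FAIL Missing part
"""

for row in failed_rows(sample):
    # Print the vtape label, file number, disk and reported part count.
    print(row["label"], row["file"], row["disk"], f'{row["part"]}/{row["total"]}')
```

Each reported row gives the vtape label and file number to look for on disk, along with the part count Amanda expects for that dump.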

I then ls'd the backup vtape directory on the attached RAID device (all configured using linux software RAID) and got this:



-sh-3.2$ cd /mnt/raid-backup-1/vtapes/slot5
-sh-3.2$ ls -l
total 15203660
-rw------- 1 amandabackup disk 32768 Jul 1 01:31 00000.DailySet1-5
-rw------- 1 amandabackup disk 1073774592 Jul 1 01:31 00001.olabs1._mnt_san_data.0
-rw------- 1 amandabackup disk 1073774592 Jul 1 01:33 00002.olabs1._mnt_san_data.0
-rw------- 1 amandabackup disk 1073774592 Jul 1 01:34 00003.olabs1._mnt_san_data.0
-rw------- 1 amandabackup disk 1073774592 Jul 1 01:36 00004.olabs1._mnt_san_data.0
-rw------- 1 amandabackup disk 1073774592 Jul 1 01:38 00005.olabs1._mnt_san_data.0
-rw------- 1 amandabackup disk 1073774592 Jul 1 01:39 00006.olabs1._mnt_san_data.0
-rw------- 1 amandabackup disk 1073774592 Jul 1 01:41 00007.olabs1._mnt_san_data.0
-rw------- 1 amandabackup disk 1073774592 Jul 1 01:42 00008.olabs1._mnt_san_data.0
-rw------- 1 amandabackup disk 1073774592 Jul 1 01:44 00009.olabs1._mnt_san_data.0
-rw------- 1 amandabackup disk 1073774592 Jul 1 01:45 00010.olabs1._mnt_san_data.0
-rw------- 1 amandabackup disk 1073774592 Jul 1 01:47 00011.olabs1._mnt_san_data.0
-rw------- 1 amandabackup disk 1073774592 Jul 1 01:48 00012.olabs1._mnt_san_data.0
-rw------- 1 amandabackup disk 1073774592 Jul 1 01:50 00013.olabs1._mnt_san_data.0
-rw------- 1 amandabackup disk 1073774592 Jul 1 01:51 00014.olabs1._mnt_san_data.0
-rw------- 1 amandabackup disk 267780071 Jul 1 01:51 00015.olabs1._mnt_san_data.0
-rw------- 1 amandabackup disk 16797428 Jul 1 01:52 00016.olabs2._mnt_san_data.1
-rw------- 1 amandabackup disk 208942580 Jul 1 02:08 00017.olabs1._home.1
-rw------- 1 amandabackup disk 16006069 Jul 1 02:08 00018.olabs1._var_lib_mysql.1
-rw------- 1 amandabackup disk 10867531 Jul 1 02:08 00019.olabs2._var_lib_mysql.1


As you can see, "00016.olabs2._mnt_san_data.1" does exist, and its size is about 16 MB. The other two backup files that Amanda complains about exist too.

I then ran amrecover, did sethost to olabs2, setdisk to /mnt/san/data, and setdate to the date of this particular backup (the one for which Amanda reports missing files). To my shock, Amanda shows the datestamp of the previous backup for all the files. As far as amrecover is concerned, the backup for that date does not exist (or is treated as failed because of the reported missing part).
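To compare what is actually on a vtape with the part counts that amadmin find reports, the file names in the slot directory can be grouped per dump. The sketch below assumes the naming pattern visible in the listing above (file number, host, disk with "/" shown as "_", level); hosts with dots in their names would not parse correctly. NAME and group_parts are illustrative names only.

```python
import re
from collections import defaultdict

# Assumed vtape file naming, inferred from the ls output:
# <5-digit file number>.<host>.<disk with "/" as "_">.<level>
NAME = re.compile(r"^(?P<num>\d{5})\.(?P<host>[^.]+)\.(?P<disk>.+)\.(?P<level>\d+)$")

def group_parts(filenames):
    """Map (host, disk, level) to the list of file numbers found for it."""
    groups = defaultdict(list)
    for name in filenames:
        m = NAME.match(name)
        if m is None:
            continue  # skips the tape label file, e.g. 00000.DailySet1-5
        key = (m.group("host"), m.group("disk"), int(m.group("level")))
        groups[key].append(int(m.group("num")))
    return dict(groups)

# File names taken from the slot5 listing above.
names = ["00000.DailySet1-5"]
names += [f"{n:05d}.olabs1._mnt_san_data.0" for n in range(1, 16)]
names += [
    "00016.olabs2._mnt_san_data.1",
    "00017.olabs1._home.1",
    "00018.olabs1._var_lib_mysql.1",
    "00019.olabs2._var_lib_mysql.1",
]

for (host, disk, level), nums in sorted(group_parts(names).items()):
    print(host, disk, level, len(nums), "file(s), first =", nums[0])
```

On this listing, the 15 consecutive files belong to the level 0 dump of olabs1:/mnt/san/data, while olabs2:/mnt/san/data has a single file (number 16). That can then be compared against the part count shown for the same file in the amadmin find output. In practice the names would come from something like os.listdir on the slot directory rather than a hard-coded list.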

Can anyone help me with this? The files do exist on the backup disks. Are the checksums failing? Is this a bug in amanda?

Any help is really appreciated.

Thanks,
Angad Singh
Senior Software Engineer
Oxylabs Networks