Thread: Abrupt Backup Failure on Very Large Filesystem

  1. #1

    We have an installation of Amanda 3.3.7 that exists primarily to back up a 130 TB ZFS filesystem. Backups go to a 34-cartridge LTO-6 jukebox and had been running mostly smoothly for the past year. Until recently, the filesystem held about 9 TB of data, and a level 0 backup could be completed on 4 cartridges. In the past couple of weeks we have added another 2 TB of data, and our backups have begun to fail very abruptly at the end of the SECOND tape. The only error message that we get is:

    FAILURE DUMP SUMMARY:
    *host* *filesystem* lev 0 FAILED [data write: Broken pipe]
    *host* *filesystem* lev 0 partial taper: No space left on device, splitting not enabled
    *host* *filesystem* lev 0 FAILED [data write: Broken pipe]
    *host* *filesystem* lev 0 partial taper: No space left on device, splitting not enabled

    This makes very little sense to me as this is happening at the end of the second tape, and 3 weeks ago, the last level 0 worked just fine, filling 4 tapes. I also know that I haven't gone in and changed anything in the configuration in that time period.

    The only other thing out of the ordinary about this past dump cycle is that we have added a number of very large (50 GB) files. I can imagine ways that these might cause difficulties, but I don't know that they are the problem.

    I'm not an amanda expert by any means. I've gotten my head around the fact that I don't schedule the dumps or select which tape to use, but I inherited this configuration. Is there anyone out there who can help me suss out what's going on?

  2. #2
    Join Date
    Nov 2005
    Location
    Canada
    Posts
    1,049

    I do not understand how amanda could report 'splitting not enabled' on the second tape.
    Can you attach the complete taper debug file?
    Also post the changer configuration, the tapetype configuration, and the dumptype definition.

  3. #3

    The taper debug file (or at least what I think you are asking for) is 41KB long. Is there a good place for me to send it to?

    changer.conf (in total) is
    havereader=1
    (The changer is a Dell ML6010 with two LTO-6 drives and 34 slots.)


    tapetype from amanda.conf is
    define tapetype "LTO-6-1" {
    comment "Created by amtapetype; compression enabled"
    length 2459392960 kbytes
    filemark 9 kbytes
    speed 75596 kps
    blocksize 32 kbytes
    part_size 256G
    }

    The dumptype is:
    define dumptype simple-gnutar-local {
    auth "ssh"
    ssh_keys "*redacted*"
    compress none
    program "GNUTAR"
    }

    (We also tried it once without the holding disk, using the same dumptype plus the added line "holdingdisk never".)

    Since I suspect that this may have something to do with the holding disk, here is that configuration as well:
    holdingdisk hd1 {
    comment "main holding disk"
    directory "*redacted*"
    use 19000Gb
    chunksize 2000Gb
    }
    reserve 20 # percent

    Thanks in advance for your time and attention
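
A quick way to see how the part_size in this tapetype relates to the declared tape length is to work out how many whole parts fit on one tape. The sketch below is only arithmetic on the numbers posted above, not anything Amanda itself runs. It assumes "kbytes" means 1024-byte units and that "256G" means 256 GiB, and it ignores filemark overhead; the function name parts_per_tape is made up for illustration.

```python
KIB = 1024
GIB = 1024 ** 3

# Values copied from the tapetype posted above.
tape_length_kib = 2459392960     # "length 2459392960 kbytes" (assumed KiB)
part_size_bytes = 256 * GIB      # "part_size 256G" (assumed to mean 256 GiB)


def parts_per_tape(length_kib, part_bytes):
    """Return (whole_parts, leftover_bytes) for a tape of the declared length.

    Hypothetical helper: plain integer division, no filemark overhead.
    """
    return divmod(length_kib * KIB, part_bytes)


full_parts, leftover_bytes = parts_per_tape(tape_length_kib, part_size_bytes)
print(f"whole parts per tape: {full_parts}")
print(f"space left after the last whole part: {leftover_bytes / GIB:.1f} GiB")
```

With these figures the declared length holds 9 whole parts plus a fraction of a tenth, so a part boundary does not line up with the end of the tape.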

  4. #4
    Join Date
    Nov 2005
    Location
    Canada
    Posts
    1,049

    I didn't know there is a limit on the size of the file that can be attached.
    What was the error when you tried to attach the debug file?
    You must rename the file with a '.txt' extension before attaching it.

  5. #5

    The file length limit appears to be 19.5KB, and the three I have are all over 40KB

  6. #6

    Looking into the log file I'm seeing a line:
    Mon May 2 16:46:23 2016: thd-0x172ff00: taper: Amanda::Taper::Scribe preparing to write, part size 274877906944, using no cache (PEOM will be fatal) (splitter) (no LEOM)

    and then:
    Tue May 3 04:28:49 2016: thd-0x24d4e10: taper: Building type SPLIT_FILE header of 32768-32768 bytes with name='backuphost.domain' disk='/bigfilesystem' dumplevel=0 and blocksize=32768
    Tue May 3 04:53:46 2016: thd-0x24d4e10: taper: Device tape:/dev/nst0 error = 'No space left on device'
    Tue May 3 04:53:46 2016: thd-0x24d4e10: taper: Device tape:/dev/nst0 setting status flag(s): DEVICE_STATUS_VOLUME_ERROR
    Tue May 3 04:53:48 2016: thd-0x24d4e10: taper: xfer-dest-taper-splitter CRC: 9930dd96:2606024458240
    Tue May 3 04:55:21 2016: thd-0x172ff00: taper: Cancelling <Xfer@0x235dcb0 (<XferSourceDirectTCPListen@0x2359990> -> <XferDestTaperSplitter@0x229d200>)>
    Tue May 3 04:55:21 2016: thd-0x24d4e60: taper: xfer-source-fd CRC: bc9c5784:2606025801728
    Tue May 3 04:55:21 2016: thd-0x172ff00: taper: tape ShiLab_Daily026-BNY776 kb 2415919104 fm 10 [OK]

    right before the job ends. If I read this correctly, amanda isn't using the holding disk to write the backup of the large filesystem, and it's dying when it reaches PEOM (I guess that's premature end of media, or something similar). Is there an easy way to test this hypothesis, and do you have any suggestions about the best way to get amanda to use the holding disk?

    Thanks,

  7. #7
    Join Date
    Nov 2005
    Location
    Canada
    Posts
    1,049

    I think it was not able to put a single part on the tape; it got EOF before it wrote 256G of data.

    Unset the part_size
    and set: device-property "LEOM" "yes"

  8. #8

    Actually there are 9 more lines of the same form as the "building type SPLIT_FILE" in the log file immediately before the start of the second excerpt. I'll assume that means that it wrote out 9 parts of 256GB before hitting EOF? I've made the changes. It will be a little while before I see if this works, though.
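
The figures in the taper log excerpts can be checked against that reading. This is a rough sketch using numbers copied from the log lines quoted earlier in the thread. It assumes "kb" in the final tape line is in 1024-byte units and that the number after the colon in the splitter CRC line is a byte count for this one transfer, neither of which the thread confirms.

```python
GIB = 1024 ** 3

# Numbers copied from the taper debug excerpts quoted in the thread.
part_size_bytes = 274877906944   # "part size 274877906944" in the Scribe line
tape_kb_written = 2415919104     # "kb 2415919104" in the final tape line (assumed KiB)
splitter_bytes = 2606024458240   # after the colon in the splitter CRC line (assumed bytes)

# Whole parts that the tape line accounts for, and any remainder.
parts_on_tape, remainder = divmod(tape_kb_written * 1024, part_size_bytes)

# How far the splitter had got into the next part when the device ran out of space.
bytes_into_failed_part = splitter_bytes - parts_on_tape * part_size_bytes

print(f"whole parts recorded on tape: {parts_on_tape} (remainder {remainder} bytes)")
print(f"progress into the next part: {bytes_into_failed_part / GIB:.1f} GiB")
```

Under those assumptions the tape line accounts for exactly 9 parts of 274877906944 bytes, and the splitter had passed roughly 123 GiB of a tenth part when the device reported "No space left on device".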

  9. #9

    I'm clearly doing something wrong. When I change the tape definition to:

    define tapetype "LTO-6-1" {
    comment "Created by amtapetype; compression enabled"
    length 2459392960 kbytes
    filemark 9 kbytes
    speed 75596 kps
    blocksize 32 kbytes
    device_property "LEOM" "TRUE"
    }

    and I try to run amstatus on the configuration, I get an error on the device_property line: "tapetype parameter expected" and "end of line is expected". (If I delete the device_property line, amstatus works.)
    Last edited by afant_nih; May 6th, 2016 at 06:07 AM. Reason: added additional information

  10. #10
    Join Date
    Nov 2005
    Location
    Canada
    Posts
    1,049

    The device-property goes in the changer section if you have one, or it can be global.
    It must not be in a tapetype section.
