Slow backup (1MiB/s) of local machine (large DLE)



stevecs
December 14th, 2009, 04:30 AM
OK, I am just starting with Amanda, per my post here (http://forums.zmanda.com/showthread.php?t=2227). I have some large systems that, because of their size, I want to back up with a locally attached backup system. Locally I can back up to tape (LTO4) with GNU tar (v1.19 / Ubuntu 8.04.3) at ~105-110MiB/s with no problems. The drive array consists of 64 drives on hardware RAID controllers, with LVM striping across the various RAID-6 sets. Raw I/O to the volume is over 500MiB/s.

When using Amanda with the configs below, the estimate phase takes a very long time (it fails to complete on /var/ftp), and according to iostat it reads at only 1-2MiB/s, which could very well be the problem there. When the dump finally /does/ start on the disks, the backup also runs at about 1-2MiB/s (nearly 10 hours to back up less than 100GiB).
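
To put those rates in perspective, here is a rough shell sketch of the arithmetic. The function name `backup_seconds` is made up for illustration, and it uses integer division, so results are rounded down.

```shell
# Hypothetical helper: seconds needed to move SIZE_MIB at RATE_MIB_S.
# Integer arithmetic only, so the result is rounded down.
backup_seconds() {
    size_mib=$1
    rate_mib_s=$2
    echo $(( size_mib / rate_mib_s ))
}

# 100GiB expressed in MiB
hundred_gib=$(( 100 * 1024 ))

backup_seconds "$hundred_gib" 2     # at 2MiB/s
backup_seconds "$hundred_gib" 105   # at the low end of the raw tar rate
```

At 2MiB/s, 100GiB takes 51200 seconds (a little over 14 hours); at 105MiB/s it takes about 975 seconds, or roughly 16 minutes.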

I did notice that Amanda does NOT pass any blocking factor to tar, so I wrapped tar in a script that adds a blocking factor of 1024 (512KiB), which is what I use for direct tar backups and what gave me the >100MiB/s numbers above.
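
A wrapper of that kind might look roughly like the sketch below. This is not the script from the post: `REAL_TAR`, `record_bytes` and `run_tar` are names invented for illustration, and the path to the real tar binary is an assumption. The only tar option used is GNU tar's `--blocking-factor`.

```shell
# Sketch of a tar wrapper. REAL_TAR and the function names are
# assumptions for illustration, not the script from the post.
REAL_TAR="${REAL_TAR:-/bin/tar}"
BLOCKING_FACTOR=1024

# GNU tar counts the blocking factor in 512-byte blocks.
record_bytes() {
    echo $(( $1 * 512 ))
}

# Call the real tar with the blocking factor placed ahead of the
# caller's arguments. This assumes the caller uses dash-prefixed
# options (--create, -c), not the old "tar cf" style.
run_tar() {
    "$REAL_TAR" --blocking-factor="$BLOCKING_FACTOR" "$@"
}

record_bytes "$BLOCKING_FACTOR"
```

The last line prints 524288, i.e. the 512KiB record size mentioned above. Installed in place of tar, the script would end with `run_tar "$@"` instead.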

Since this is so pathetic, I am hoping that someone can spot something here, as I just can't believe that Amanda or any backup utility could be this bad by design.

I also noticed that all the taper 'PART' lines in the log mention 10240 kb, which may be a hard-coded limit somewhere, but I can't find it or tell whether it is even important (it does seem very low, though).
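
For anyone wanting to check part sizes and per-part rates in their own logs, here is a small sketch that pulls the `kb` and `kps` values out of taper PART lines like the ones in the log snippet further down. The function name `part_stats` is made up, and it assumes the field layout shown in this thread (2.6.1p2).

```shell
# Hypothetical helper: read Amanda log lines on stdin and print
# "<file number> <kb> <kps>" for each taper PART line.
part_stats() {
    awk '$1 == "PART" && $2 == "taper" {
        kb = ""; kps = ""
        for (i = 1; i < NF; i++) {
            if ($i == "kb")  kb  = $(i + 1)
            if ($i == "kps") kps = $(i + 1)
        }
        sub(/\]$/, "", kps)   # drop the closing bracket, if present
        print $4, kb, kps
    }'
}

part_stats <<'EOF'
PART taper AA0001 1 xxxx /home 20091213212136 1/-1 0 [sec 18.061213 kb 10240 kps 566.960813]
PART taper AA0001 2 xxxx /home 20091213212136 2/-1 0 [sec 0.718334 kb 10240 kps 14255.207188]
EOF
```

Each output line gives the tape file number, the part size in kb and the rate for that part, which makes it easy to see that every part here is exactly 10240 kb.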

EDIT: Note, I am running the 2.6.1p2 binary from zmanda.com for Ubuntu 8.04.

On a test machine I have this:
---(df -h)
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3              58G   11G   44G  20% /
/dev/sda1             236M   58M  167M  26% /boot
/dev/mapper/vg_media-lv_ftpshare
                       40T   29T   12T  72% /var/ftp
/dev/mapper/vg_media-lv_usershare
                      2.0T  1.4T  685G  67% /home
---- (disklist)
xxxx / nocomp-tar
xxxx /boot nocomp-tar
xxxx /home nocomp-tar
xxxx /var/ftp nocomp-tar

----(amanda.conf)
org "BackupSetAA" # your organization name for reports
mailto "xxxxxx" # space separated list of operators at your site
dumpcycle 12weeks # the number of days in the normal dump cycle
runspercycle 12 # the number of amdump runs in dumpcycle days
tapecycle 40 # the number of tapes in rotation
runtapes 40 # number of tapes to be used in a single run of amdump
tpchanger "chg-manual" # the tape-changer glue script
tapedev "/dev/nst0" # the no-rewind tape device
changerfile "/etc/amanda/BackupSetAA/chg-manual.conf" # tape changer configuration parameter file
changerdev "/dev/null" # tape changer configuration parameter device
tapetype LTO4-HWC # what kind of tape it is
labelstr "^AA[0-9][0-9][0-9][0-9]*$" # label constraint regex: all tapes must match
dtimeout 1800 # number of idle seconds before a dump is aborted
ctimeout 30 # max number of seconds amcheck waits for each client
etimeout 3000 # number of seconds per filesystem for estimates

dumpuser "amandabackup" # the user to run dumps under
inparallel 4 # maximum dumpers that will run in parallel (max 63)
dumporder "sssS" # specify the priority order of each dumper
taperalgo first # The algorithm used to choose which dump image to send
displayunit "g" # Possible values: "k|m|g|t"
netusage 150000 Kbps # maximum net bandwidth for Amanda, in KB per sec
bumpsize 200 Gb # minimum savings (threshold) to bump level 1 -> 2
bumppercent 20 # minimum savings (threshold) to bump level 1 -> 2
bumpdays 1 # minimum days at each level
usetimestamps yes
device_output_buffer_size 128000k
label_new_tapes "AA%%%%"

maxdumpsize -1 # Maximum total size the planner will schedule
# for a run (default: runtapes * tape_length) (kbytes).
amrecover_do_fsf yes # amrecover will call amrestore with the
# -f flag for faster positioning of the tape.
amrecover_check_label yes # amrecover will call amrestore with the
# -l flag to check the label.
bumpmult 4 # threshold = bumpsize * bumpmult^(level-1)
amrecover_changer "changer" # amrecover will use the changer if you restore
# from this device. It could be a string like 'changer' and amrecover will use your
# changer if you set your tape to 'changer' with 'setdevice changer' or via
# 'tapedev "changer"' in amanda-client.conf
autoflush no
infofile "/etc/amanda/BackupSetAA/curinfo" # database DIRECTORY
logdir "/etc/amanda/BackupSetAA/logs" # log directory
indexdir "/etc/amanda/BackupSetAA/index" # index directory
tapelist "/etc/amanda/BackupSetAA/tapelist" # list of used tapes
define interface local {
comment "a local disk"
use 200000 kbps
}

define dumptype global {
comment "Global definitions"
auth "bsdtcp"
}
define dumptype nocomp-tar {
comment "GNUTAR based dump"
global
program "GNUTAR"
exclude list "/etc/amanda/exclude.gtar"
compress none
priority high
tape_splitsize 25 Gb
index yes
record yes
}
define tapetype LTO4-HWC {
comment "just produced by tapetype prog (hardware compression on)"
length 788480 mbytes
filemark 0 kbytes
tape_splitsize 25 Gb
speed 94896 kps
}
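
As a rough sanity check on the tapetype numbers above, this sketch works out how long one tape takes to fill at the listed speed and how many tapes a given amount of data needs. The helper names are hypothetical, the 29T figure is the rounded `df -h` value for /var/ftp, and hardware compression is ignored.

```shell
# Hypothetical helpers, integer arithmetic only.

# Seconds to fill a tape of LENGTH_MB at SPEED_KPS (result rounded down).
tape_fill_seconds() {
    length_mb=$1
    speed_kps=$2
    echo $(( length_mb * 1024 / speed_kps ))
}

# Tapes needed for DATA_MB given tapes of LENGTH_MB (rounded up).
tapes_needed() {
    data_mb=$1
    length_mb=$2
    echo $(( (data_mb + length_mb - 1) / length_mb ))
}

tape_fill_seconds 788480 94896               # length and speed from LTO4-HWC
tapes_needed $(( 29 * 1024 * 1024 )) 788480  # ~29T used on /var/ftp
```

That comes to 8508 seconds (about 2.4 hours) per tape at full speed, and 39 tapes for /var/ftp alone.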

------------- (log snippet)
INFO amdump amdump pid 30272
INFO planner planner pid 30318
DISK planner xxxx /
DISK planner xxxx /boot
DISK planner xxxx /home
DISK planner xxxx /var/ftp
START planner date 20091213212136
INFO planner Adding new disk xxxx:/.
INFO planner Adding new disk xxxx:/boot.
INFO planner Adding new disk xxxx:/home.
INFO planner Adding new disk xxxx:/var/ftp.
INFO driver driver pid 30319
START driver date 20091213212136
STATS driver hostname loki
STATS driver startup time 0.005
INFO dumper dumper pid 30323
INFO dumper dumper pid 30324
INFO dumper dumper pid 30321
INFO dumper dumper pid 30322
INFO taper taper pid 30320
ERROR planner Request to xxxx failed: timeout waiting for REP
WARNING planner disk xxxx:/var/ftp, estimate of level 0 timed out.
FAIL planner xxxx /var/ftp 20091213212136 0 "[disk /var/ftp, all estimate timed out]"
FINISH planner date 20091213212136 time 3601.427
INFO planner pid-done 30318
INFO dumper gzip pid 21929
START taper datestamp 20091213212136 label AA0001 tape 1
PART taper AA0001 1 xxxx /home 20091213212136 1/-1 0 [sec 18.061213 kb 10240 kps 566.960813]
PART taper AA0001 2 xxxx /home 20091213212136 2/-1 0 [sec 0.718334 kb 10240 kps 14255.207188]
PART taper AA0001 3 xxxx /home 20091213212136 3/-1 0 [sec 0.324078 kb 10240 kps 31597.331507
------

stevecs
December 15th, 2009, 03:37 PM
OK, after much experimenting I got it to about 40MiB/s, which is better but still nowhere near the speed of a raw tar backup, and right at the starvation point of the tape drive.

Another thing I noticed is that program "DUMP" does not work with XFS volumes. It seems that 2.6.1p2 tries to use /sbin/dump for /any/ filesystem, whether or not that is correct (or it doesn't properly detect the filesystem type).


So it looks like there is still a major performance problem somewhere, though I can't find anything about it in the docs, nor anyone really talking about pushing real speeds (i.e. >100MiB/s) with Amanda.

Anyone have pointers?


Current amanda.conf
---
dumpuser "amandabackup"
org "BackupSetAA"
mailto "xxxx"
send-amreport-on never
dumpcycle 12weeks
runspercycle 12
tapecycle 40
runtapes 40
tpchanger "chg-manual"
tapedev "/dev/nst0"
changerfile "/etc/amanda/BackupSetAA/chg-manual.conf"
changerdev "/dev/null"
tapetype LTO4-HWC
labelstr "^AA[0-9][0-9][0-9][0-9]*$"
dtimeout 1800
ctimeout 30
etimeout 300
inparallel 4
dumporder "BTBT"
taperalgo first
displayunit "g"
netusage 150000 Kbps
bumpsize 5000 Gb
bumppercent 0
bumpdays 6
usetimestamps yes
device_output_buffer_size 128000k
label_new_tapes "AA%%%%"
maxdumpsize -1
amrecover_do_fsf yes
amrecover_check_label yes
bumpmult 4
amrecover_changer "changer"
autoflush no
infofile "/etc/amanda/BackupSetAA/curinfo"
logdir "/etc/amanda/BackupSetAA/logs"
indexdir "/etc/amanda/BackupSetAA/index"
tapelist "/etc/amanda/BackupSetAA/tapelist"
define interface local {
comment "a local disk"
use 200000 kbps
}
define dumptype global {
comment "Global definitions"
auth "bsdtcp"
}
define dumptype nocomp-tar {
comment "GNUTAR based dump"
global
program "GNUTAR"
exclude list "/etc/amanda/exclude.gtar"
compress none
priority high
tape_splitsize 50 Gb
fallback_splitsize 2 Gb
index yes
record yes
estimate server
holdingdisk never
}
define dumptype nocomp-dump {
comment "DUMP based dump"
global
program "DUMP"
compress none
priority high
tape_splitsize 50 Gb
fallback_splitsize 2 Gb
index yes
record yes
estimate server
holdingdisk never
}
define tapetype LTO4-HWC {
comment "just produced by tapetype prog (hardware compression on)"
length 788480 mbytes
filemark 0 kbytes
speed 94896 kps
blocksize 512 kbytes
readblocksize 512 kbytes
filemark 0
}
------

(amdump snippet from failed nocomp-dump of /home)
--
GETTING ESTIMATES...
driver: pid 10229 executable /usr/libexec/amanda/driver version 2.6.1p2
driver: tape size 807403520
reserving 0 out of 0 for degraded-mode dumps
driver: send-cmd time 0.001 to taper: START-TAPER 20091215164518
driver: started dumper0 pid 10234
driver: send-cmd time 0.002 to dumper0: START 20091215164518
driver: started dumper1 pid 10236
driver: send-cmd time 0.003 to dumper1: START 20091215164518
driver: started dumper2 pid 10237
driver: send-cmd time 0.003 to dumper2: START 20091215164518
driver: started dumper3 pid 10238
driver: send-cmd time 0.004 to dumper3: START 20091215164518
driver: start time 0.004 inparallel 4 bandwidth 150000 diskspace 0 dir OBSOLETE datestamp 20091215164518 driver: drain-ends tapeq FIRST big-dumpers BTBT
planner time 0.012: got result for host xxxx disk /home: 0 -> 1000000K, -1 -> -2K, -1 -> -2K
planner time 0.012: got result for host xxxx disk /boot: 0 -> 1000000K, -1 -> -2K, -1 -> -2K
planner time 0.012: got result for host xxxx disk /: 0 -> 1000000K, -1 -> -2K, -1 -> -2K
dumper: pid 10234 executable dumper0 version 2.6.1p2
dumper: pid 10237 executable dumper2 version 2.6.1p2
dumper: pid 10236 executable dumper1 version 2.6.1p2
dumper: pid 10238 executable dumper3 version 2.6.1p2
taper: pid 10233 executable taper version 2.6.1p2
planner: time 0.104: got partial result for host xxxx disk /home: 0 -> 1000000K, -1 -> -2K, -1 -> -2K
planner: time 0.105: got partial result for host xxxx disk /boot: 0 -> 1000000K, -1 -> -2K, -1 -> -2K
planner: time 0.105: got partial result for host xxxx disk /: 0 -> 1000000K, -1 -> -2K, -1 -> -2K
planner: time 0.105: got partial result for host xxxx disk /home: 0 -> 1000000K, -1 -> -2K, -1 -> -2K
planner: time 0.105: got partial result for host xxxx disk /boot: 0 -> 1000000K, -1 -> -2K, -1 -> -2K
planner: time 0.105: got partial result for host xxxx disk /: 0 -> 1000000K, -1 -> -2K, -1 -> -2K
planner: time 0.105: got partial result for host xxxx disk /home: 0 -> 1000000K, -1 -> -2K, -1 -> -2K
planner: time 0.105: got partial result for host xxxx disk /boot: 0 -> 1000000K, -1 -> -2K, -1 -> -2K
planner: time 0.105: got partial result for host xxxx disk /: 0 -> 1000000K, -1 -> -2K, -1 -> -2K
planner: time 1.094: got result for host xxxx disk /home: 0 -> 1000000K, -1 -> -2K, -1 -> -2K
planner: time 1.094: got result for host xxxx disk /boot: 0 -> 1000000K, -1 -> -2K, -1 -> -2K
planner: time 1.094: got result for host xxxx disk /: 0 -> 1000000K, -1 -> -2K, -1 -> -2K
planner: time 1.094: getting estimates took 1.094 secs
FAILED QUEUE: empty
DONE QUEUE:
0: xxxx /home
1: xxxx /boot
2: xxxx /

ANALYZING ESTIMATES...
pondering xxxx:/home... next_level0 -14594 last_level -1 (due for level 0) (new disk, can't switch to degraded mode)
curr level 0 nsize 1000000 csize 1000000 total size 1001536 total_lev0 1000000 balanced-lev0size 83333
pondering xxxx:/boot... next_level0 -14594 last_level -1 (due for level 0) (new disk, can't switch to degraded mode)
curr level 0 nsize 1000000 csize 1000000 total size 2002048 total_lev0 2000000 balanced-lev0size 166666
pondering xxxx:/... next_level0 -14594 last_level -1 (due for level 0) (new disk, can't switch to degraded mode)
curr level 0 nsize 1000000 csize 1000000 total size 3002560 total_lev0 3000000 balanced-lev0size 249999
INITIAL SCHEDULE (size 3002560):
xxxx /home pri 14596 lev 0 nsize 1000000 csize 1000000
xxxx /boot pri 14596 lev 0 nsize 1000000 csize 1000000
xxxx / pri 14596 lev 0 nsize 1000000 csize 1000000

DELAYING DUMPS IF NEEDED, total_size 3002560, tape length 32296140800 mark 0
delay: Total size now 3002560.

PROMOTING DUMPS IF NEEDED, total_lev0 3000000, balanced_size 249999...
planner: time 1.094: analysis took 0.000 secs

GENERATING SCHEDULE:
--------
DUMP xxxx ffffffff9ffeffffffff7f /home 20091215164518 14596 0 1970:1:1:0:0:0 1000000 1000000 976 1024 "Can't switch to degraded mode when using a new disk"
DUMP xxxx ffffffff9ffeffffffff7f /boot 20091215164518 14596 0 1970:1:1:0:0:0 1000000 1000000 976 1024 "Can't switch to degraded mode when using a new disk"
DUMP xxxx ffffffff9ffeffffffff7f / 20091215164518 14596 0 1970:1:1:0:0:0 1000000 1000000 976 1024 "Can't switch to degraded mode when using a new disk"
--------
taper: using label `AA0000' date `20091215164518'
driver: result time 1.191 from taper: TAPER-OK
driver: state time 1.191 free kps: 150000 space: 0 taper: idle idle-dumpers: 4 qlen tapeq: 0 runq: 0 roomq: 0 wakeup: 0 driver-idle: not-idle
driver: interface-state time 1.191 if default: free 150000
driver: hdisk-state time 1.191
driver: flush size 0
driver: dumping xxxx:/home directly to tape
driver: send-cmd time 1.191 to taper: PORT-WRITE 00-00001 xxxx /home 0 20091215164518 53687091200 NULL 2147483648
driver: result time 1.192 from taper: PORT 11027
driver: send-cmd time 1.192 to dumper0: PORT-DUMP 00-00001 11027 xxxx ffffffff9ffeffffffff7f /home NODEVICE 0 1970:1:1:0:0:0 DUMP X X X bsdtcp |" <auth>bsdtcp</auth>\n <record>YES</record>\n <index>YES</index>\n"
driver: state time 1.192 free kps: 148976 space: 0 taper: writing idle-dumpers: 3 qlen tapeq: 0 runq: 0 roomq: 0 wakeup: 0 driver-idle: not-idle
driver: interface-state time 1.192 if default: free 148976
driver: hdisk-state time 1.192
driver: state time 1.214 free kps: 148976 space: 0 taper: writing idle-dumpers: 3 qlen tapeq: 0 runq: 0 roomq: 0 wakeup: 0 driver-idle: not-idle
driver: interface-state time 1.214 if default: free 148976
driver: hdisk-state time 1.214
driver: result time 1.214 from taper: REQUEST-NEW-TAPE 00-00001
driver: send-cmd time 1.214 to taper: NEW-TAPE
dumper: kill index command
driver: state time 1.356 free kps: 148976 space: 0 taper: writing idle-dumpers: 3 qlen tapeq: 0 runq: 0 roomq: 0 wakeup: 0 driver-idle: not-idle
driver: interface-state time 1.356 if default: free 148976
driver: hdisk-state time 1.356
driver: result time 1.356 from dumper0: FAILED 00-00001 "[dump (10367) /sbin/dump returned 1]"
driver: state time 18.857 free kps: 148976 space: 0 taper: writing idle-dumpers: 3 qlen tapeq: 0 runq: 0 roomq: 0 wakeup: 0 driver-idle: not-idle
driver: interface-state time 18.857 if default: free 148976
driver: hdisk-state time 18.857

----

martineau
December 16th, 2009, 04:46 AM
The Zmanda binaries for Ubuntu 8.04 don't include XFS support. You need to compile it yourself.

If your dump fits on one tape, disabling splitting can improve speed.
Using a holding disk improves speed too.

stevecs
December 16th, 2009, 09:02 PM
So far I've found that a holding disk slows down the backup or, at best, has zero positive effect. No DLE will ever fit on a single tape. As for XFS, I tried compiling it myself, but found that even with manual linking it still seems to go to /sbin/dump for file systems. I manually linked xfsdump to /sbin/dump, but then I found that it does not take the xfsdump -e option to exclude files.

Currently the best performance I can get is with the config below, which, frankly, sucks: consistently 1/4 to 1/5 the performance of a raw GNU tar v1.19 backup of the same volumes.



-------
dumpuser "amandabackup" # the user to run dumps under
org "BackupSetAA" # your organization name for reports
mailto "xxxxx" # space separated list of operators at your site
send-amreport-on never # [all|strange|error|never]
dumpcycle 18weeks # the number of days in the normal dump cycle
runspercycle 18 # the number of amdump runs in dumpcycle days
tapecycle 40 # the number of tapes in rotation
runtapes 40 # number of tapes to be used in a single run of amdump
tpchanger "chg-manual" # the tape-changer glue script
tapedev "/dev/nst0" # the no-rewind tape device
changerfile "/etc/amanda/BackupSetAA/chg-manual.conf" # tape changer configuration parameter file
changerdev "/dev/null" # tape changer configuration parameter device
tapetype LTO4-HWC # what kind of tape it is
labelstr "^AA[0-9][0-9][0-9][0-9]*$" # label constraint regex: all tapes must match
dtimeout 8000 # number of idle seconds before a dump is aborted
ctimeout 30 # max number of seconds amcheck waits for each client
etimeout 7200 # number of seconds per filesystem for estimates
inparallel 4 # maximum dumpers that will run in parallel (max 63)
# this maximum can be increased at compile-time,
# modifying MAX_DUMPERS in server-src/driverio.h
dumporder "sSsS" # specify the priority order of each dumper
# s -> smallest size
# S -> biggest size
# t -> smallest time
# T -> biggest time
# b -> smallest bandwidth
# B -> biggest bandwidth
# try "BTBTBTBTBTBT" if you are not holding
# disk constrained
taperalgo first # The algorithm used to choose which dump image to send
# to the taper.
# Possible values: [first|firstfit|largest|largestfit|smallest|last]
# Default: first.
# first First in - first out.
# firstfit The first dump image that will fit on the current tape.
# largest The largest dump image.
# largestfit The largest dump image that will fit on the current tape.
# smallest The smallest dump image.
# last Last in - first out.
displayunit "g" # Possible values: "k|m|g|t"
# Default: k.
# The unit used to print many numbers.
# k=kilo, m=mega, g=giga, t=tera
netusage 150000 Kbps # maximum net bandwidth for Amanda, in KB per sec
bumpsize 50 Gb # minimum savings (threshold) to bump level 1 -> 2
bumppercent 0 # minimum savings (threshold) to bump level 1 -> 2
bumpdays 2 # minimum days at each level
bumpmult 100 # threshold = bumpsize * bumpmult^(level-1)
usetimestamps yes
device_output_buffer_size 131072k # amount of buffer space to use when writing to devices
label_new_tapes "AA%%%%"
maxdumpsize -1 # Maximum total size the planner will schedule
# for a run (default: runtapes * tape_length) (kbytes).
amrecover_do_fsf yes # amrecover will call amrestore with the
# -f flag for faster positioning of the tape.
amrecover_check_label yes # amrecover will call amrestore with the
# -l flag to check the label.
amrecover_changer "changer" # amrecover will use the changer if you restore
# from this device. It could be a string like 'changer' and amrecover will use your
# changer if you set your tape to 'changer' with 'setdevice changer' or via
# 'tapedev "changer"' in amanda-client.conf
autoflush no
flush-threshold-dumped 100
flush-threshold-scheduled 100
infofile "/etc/amanda/BackupSetAA/curinfo/" # database DIRECTORY
logdir "/etc/amanda/BackupSetAA/logs" # log directory
indexdir "/etc/amanda/BackupSetAA/index" # index directory
tapelist "/etc/amanda/BackupSetAA/tapelist" # list of used tapes
holdingdisk hd0 {
comment "Holding disk 0"
directory "/var/lib/amanda/holdings"
use 800 Gb
chunksize 4 Gb
}
define interface local {
comment "a local disk"
use 200000 kbps
}
define dumptype global {
comment "Global definitions"
auth "bsdtcp"
}
define dumptype nocomp-tar {
comment "GNUTAR based dump"
global
program "GNUTAR"
exclude list "/etc/amanda/exclude.gtar"
compress none
priority high
tape_splitsize 4 Gb
fallback_splitsize 4 Gb
index yes
record yes
estimate calcsize # server | calcsize | client
holdingdisk off # off | on | required
}
define tapetype LTO4-HWC {
comment "just produced by tapetype prog (hardware compression on)"
length 788480 mbytes
filemark 0 kbytes
speed 94896 kps
blocksize 512 kbytes
readblocksize 512 kbytes
filemark 0
}
-----------
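
The `bumpmult` comment in the config above gives the threshold formula as bumpsize * bumpmult^(level-1). Here is a quick sketch of what that works out to for the values used here (bumpsize 50 Gb, bumpmult 100). The function name is made up, and the result is in the same unit as bumpsize.

```shell
# Hypothetical helper: threshold = bumpsize * bumpmult^(level-1),
# as given in the bumpmult comment. Uses a loop for the power so it
# stays within plain POSIX shell arithmetic.
bump_threshold() {
    threshold=$1
    bumpmult=$2
    level=$3
    i=1
    while [ "$i" -lt "$level" ]; do
        threshold=$(( threshold * bumpmult ))
        i=$(( i + 1 ))
    done
    echo "$threshold"
}

bump_threshold 50 100 1   # level 1 -> 2
bump_threshold 50 100 2   # level 2 -> 3
```

Reading the formula that way, a bump from level 1 to 2 needs 50 Gb of savings, and a bump from level 2 to 3 needs 5000 Gb.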

martineau
December 17th, 2009, 04:14 AM
If you have a compilation tree, try the following:
cd client-src
make getfsent
./getfsent /home

What is the fstype reported for the filesystem?

Did the compilation find xfsdump?
$ grep XFSDUMP config/config.h
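
For comparison with what getfsent reports, the kernel's view of the filesystem type can be read from /proc/mounts. The sketch below does that and then maps the type to a dump program. Both function names and the mapping are illustrative assumptions, not Amanda's actual selection code.

```shell
# Hypothetical helper: print the filesystem type of a mount point.
# The optional second argument is a mounts file in /proc/mounts format.
fstype_of() {
    awk -v mp="$1" '$2 == mp { t = $3 } END { if (t != "") print t }' \
        "${2:-/proc/mounts}"
}

# Illustrative mapping from filesystem type to dump program.
dump_program_for() {
    case "$1" in
        xfs)        echo xfsdump ;;
        ext2|ext3)  echo dump ;;
        *)          echo "unknown" ;;
    esac
}

dump_program_for xfs
dump_program_for ext3
```

On the machine itself, `dump_program_for "$(fstype_of /home)"` would show which program this mapping picks for /home.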

stevecs
December 17th, 2009, 07:16 PM
Only the large partitions are XFS: / and /boot are ext3, while /home and /var/ftp are both XFS.

The compilation found XFSDUMP, but for some reason it kept trying to use /sbin/dump.

Anyway, I have since removed Amanda, mainly because of the performance issues, and moved over to Bacula, which gives me 105-115MiB/s with very little CPU (<50% of two 2.4GHz cores). With these figures I can easily add another 1-2 LTO4 drives without affecting file serving.