#1

So now we have working zfs send on snapshots, i.e. level 0 backups. On my test installation that takes 5 LTO-2 tapes and 18 hours.

Any discussion on how to handle incremental backups on ZFS? I have been looking at ways to estimate and have come to the following:

Level 0: good approximation from the filesystem's referenced count. It may be a little low, but the error is small and does not warrant taking a dummy snapshot just for level 0 estimates.
Level 1+: no way to estimate; the only way is running
    zfs send -i pool/fs@0 pool/fs@1 | wc -c

Any thoughts? How accurate must the estimate be?

Does Amanda do 0 -> 1 -> 2 -> 1 or only 0 -> 1 -> 2 -> 0? I only see the latter when running dump. If the latter sequence is the only one, then only the latest snapshot needs to be kept between backups.

Is level 9 sufficient? On some large filesystems with streaming media, I was consistently doing level 5 to 8 backups when they were UFS and used dump.

Is the snapshot name fixed in the current code, or can it be changed to something like pool/fs@amanda.diskname.level, i.e. system/usr/local@amanda._usr_local.0, system/usr/local@amanda._usr_local.1, etc.?

Unless there is anything that I can test already, I will try and experiment over the weekend and see what I come up with.

/glz

Last edited by glowkrantz; December 19th, 2008 at 02:43 AM. Reason: Spelling, clarification
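To make the proposal in this post concrete, here is a minimal shell sketch of the naming scheme and the two estimate paths. The function names, the `ZFS` variable and the mapping of "/" to "_" in the disk name are assumptions made for illustration; this is not the code from the Amanda plugin.

```shell
#!/bin/sh
# Sketch only. ZFS defaults to /sbin/zfs (an assumption); override it to test.
ZFS=${ZFS:-/sbin/zfs}

# Build pool/fs@amanda.<diskname>.<level>, mapping "/" in the disk name to "_".
amanda_snapshot_name() {
    fs=$1 disk=$2 level=$3
    printf '%s@amanda.%s.%s\n' "$fs" "$(printf '%s' "$disk" | tr '/' '_')" "$level"
}

# Estimate the size of a level N backup in bytes.
# Level 0: read the filesystem's referenced property (no dummy snapshot needed).
# Level N > 0: run the incremental send from level N-1 and count the bytes.
estimate_send_size() {
    fs=$1 disk=$2 level=$3
    if [ "$level" -eq 0 ]; then
        "$ZFS" get -Hp -o value referenced "$fs"
    else
        prev=$(amanda_snapshot_name "$fs" "$disk" $((level - 1)))
        new=$(amanda_snapshot_name "$fs" "$disk" "$level")
        "$ZFS" send -i "$prev" "$new" | wc -c | tr -d ' '
    fi
}
```

For example, `amanda_snapshot_name system/usr/local /usr/local 0` prints `system/usr/local@amanda._usr_local.0`. The level 1+ path reads the whole incremental stream, so it costs as much I/O as the backup itself.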

#2

Attached is what's needed to test incremental zfs send backups. It's been tested on

FreeBSD 7.1-PRERELEASE: Sat Nov 29 03:33:04 CET 2008
    ZFS filesystem version 6
    ZFS storage pool version 6
    ZFS on-disk version 1

and

FreeBSD 8.0-CURRENT: Wed Dec 24 13:07:11 CET 2008
    ZFS filesystem version 13
    ZFS storage pool version 13
    ZFS on-disk version 3

> unzip -l amanda-zfs-send-incremental.zip
Archive:  amanda-zfs-send-incremental.zip
  Length     Date   Time    Name
 --------    ----   ----    ----
     7713  12-25-08 21:08   amanda-2.6.1b2-20081222.tar.gz
      437  12-25-08 21:03   amwc.sh
     4257  12-25-08 20:51   patch-application-src::amzfs-sendrecv.pl
     4710  12-25-08 20:54   patch-perl::Amanda::Application::Zfs.pm
 --------                   -------
    17117                   4 files

For *BSD ports users:
amanda-2.6.1b2-20081222.tar.gz: amanda-devel port of the community build.

For anyone building from scratch or just wanting to update an installation:
amwc.sh: hack for reading the actual sendsize. MUST BE PLACED IN /usr/local/bin/amwc (yes, I know, but I hope it's temporary).
patch-perl::Amanda::Application::Zfs.pm: patch to support versioned snapshots and to allow overlap between amcheck and amdump.
patch-application-src::amzfs-sendrecv.pl: patch to handle incremental backups up to level 9.

All comments etc. welcome.

/glz

#3

One error in the port - missed a moved manpage.

/glz

#4

Some more info:

Dumptype definitions to use with these patches:

    # My global
    define dumptype admin {
        comment "Global definitions for all administered dumps"
        # This is quite useful for setting global parameters, so you don't have
        # to type them everywhere. All dumptype definitions in this sample file
        # do include these definitions, either directly or indirectly.
        # There's nothing special about the name `global'; if you create any
        # dumptype that does not contain the word `global' or the name of any
        # other dumptype that contains it, these definitions won't apply.
        # Note that these definitions may be overridden in other
        # dumptypes, if the redefinitions appear *after* the `global'
        # dumptype name.
        # You may want to use this for globally enabling or disabling
        # indexing, recording, etc. Some examples:
        index yes
        record yes
    }

    #
    # Amanda ZFS dump using snapshot and zfs send
    #
    define application-tool amzfs_sendrecv {
        comment "amzfs-sendrecv"
        plugin "amzfs-sendrecv"
        property "DF-PATH" "/bin/df"
        property "ZFS-PATH" "/sbin/zfs"
        # FreeBSD 7, delegation works in CURRENT
        property "PFEXEC-PATH" "/usr/local/bin/sudo"
        property "PFEXEC" "YES"
    }

    define dumptype user-zfs-sendrecv {
        admin
        index no
        program "APPLICATION"
        application "amzfs_sendrecv"
        maxdumps 2
    }

    define dumptype user-zfs-sendrecv-split {
        user-zfs-sendrecv
        tape_splitsize 1 Gb
    }

The use of sudo for operator access in FreeBSD 7, before delegation, requires the following line in the sudoers file:

    operator ALL=(root) NOPASSWD: /sbin/zfs

/glz

Last edited by glowkrantz; January 4th, 2009 at 07:02 PM. Reason: More info

#5

Thanks for the patch. I started to look at it and I have one question:

Why is zfs_purge_snapshot called before the backup is tried? I think it should be called only if the backup succeeds, just before zfs_rename_snapshot. If Amanda tries a level 0 and it fails, it can try a higher level on the next run.
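The ordering suggested here can be sketched in shell as follows. `purge_snapshot` and `rename_snapshot` are hypothetical stand-ins for the plugin's zfs_purge_snapshot and zfs_rename_snapshot steps, and `backup_then_cleanup` is a name invented for this illustration; the real logic lives in the Perl module and may differ.

```shell
#!/bin/sh
# Placeholders (hypothetical): real code would destroy and rename snapshots.
purge_snapshot()  { echo "purge"; }
rename_snapshot() { echo "rename"; }

# Run the backup command given as $1; touch snapshots only if it succeeded.
backup_then_cleanup() {
    backup_cmd=$1
    if "$backup_cmd"; then
        purge_snapshot      # drop snapshots superseded by the new backup
        rename_snapshot     # promote the working snapshot to its final name
    else
        # Older snapshots survive, so a later run can still do an incremental.
        echo "backup failed, snapshots left untouched" >&2
        return 1
    fi
}
```

With this ordering a failed level 0 leaves the previous snapshots in place, which is what allows a higher level to be attempted on the next run.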

#6

Hi all,

I think the current proposal will have significant performance issues from piping the output through wc (especially during the estimate phase!). Could you not use:

    $cmd = "$self->{pfexec_cmd} $self->{zfs_path} get -Hp -o value used $self->{filesystem}\@$self->{snapshot}"

for snapshots with level > 0?

Regards,
Nick
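For reference, the command built by that Perl string corresponds roughly to the shell sketch below, leaving out the pfexec/sudo prefix. The function name and the `ZFS` variable are assumptions. Note that a snapshot's `used` property reports the space held only by that snapshot, so whether it tracks the size of an incremental send stream is not settled in this thread.

```shell
#!/bin/sh
# ZFS path as in the ZFS-PATH property earlier in the thread; override to test.
ZFS=${ZFS:-/sbin/zfs}

# Print the 'used' property of <filesystem>@<snapshot> in exact bytes.
# -H drops the header, -p prints exact numbers, -o value prints only the value.
snapshot_used_bytes() {
    "$ZFS" get -Hp -o value used "$1@$2"
}
```

Unlike the `zfs send | wc -c` pipeline, this only reads a property and moves no data, which is the performance point being made.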

#7

Quote:
Let me know if you find a way to compute the estimate.

#8

Quote:
How about iterating through the snapshots and subtracting the 'reference' value of the 'level-1' snapshot from the 'reference' value of the 'level' snapshot? I've attached a script of my testing. Any comments welcome!

Regards,
Nick

#9

You didn't try to remove files.

If you remove /rpool/test/sol-nv-b101-x86-dvd.iso.1 before creating the level 3 snapshot, then the reference for the level 3 snapshot will be 7.37G. You will get:

    # zfs list -r rpool/test
    NAME           USED  AVAIL  REFER  MOUNTPOINT
    rpool/test    10.4G   305G  10.4G  /rpool/test
    rpool/test@0    17K      -  2.15G  -
    rpool/test@1    17K      -  5.22G  -
    rpool/test@2    17K      -  7.37G  -
    rpool/test@3      0      -  7.37G  -

but the backup will be 2.15G. How do you compute it?
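The objection can be checked with a little arithmetic on the REFER column from the listing in this post. The sketch below (the `refer_deltas` name and the two-column input format are assumptions for illustration) subtracts each snapshot's REFER value from the next one's, which is the estimate proposed in the previous post.

```shell
#!/bin/sh
# REFER values in G, copied from the zfs list output in this post: "level refer".
refer_table='0 2.15
1 5.22
2 7.37
3 7.37'

# Print "level delta", where delta = REFER(level) - REFER(level-1).
refer_deltas() {
    awk 'NR > 1 { printf "%s %.2f\n", $1, $2 - prev } { prev = $2 }'
}

printf '%s\n' "$refer_table" | refer_deltas
```

This prints deltas of 3.07, 2.15 and 0.00 for levels 1 to 3. The level 3 delta is zero because the removed file and the newly written data cancel out in REFER, while the post states the actual level 3 backup is 2.15G.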

#10

Which is why I followed the tradition of GNU tar, where the size is calculated by a dummy backup.

And looking in the sendsize debug file, I see that it's also done for dump, even at level 0. There are times when we just don't know unless we do it. An alternative could be server-side estimates, which I think use the statistics from actual dump runs.

/glz

Last edited by glowkrantz; January 16th, 2009 at 05:56 AM.