#1
December 19th, 2008, 02:06 AM
glowkrantz
Amanda hacker

Join Date: May 2008
Posts: 45
Incremental backup using zfs send

So now we have working zfs send on snapshots, i.e. level 0 backups. On my test installation, that takes 5 LTO-2 tapes and 18 hours.

Has there been any discussion on how to handle incremental backups on zfs?
I have been looking at ways to estimate and have come to the following:
Level 0: the filesystem's referenced count is a good approximation, maybe a little low, but the error is small. It does not warrant creating a dummy snapshot just for level 0 estimates.
Level 1+: No way to estimate; the only way is to actually run
zfs send -i pool/fs@0 pool/fs@1 | wc -c
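
The same byte count can be obtained from a script. Below is a minimal sketch in Python (this is not the attached amwc.sh, whose contents are not shown in this thread): it reads the send stream in chunks and only counts bytes, the same idea as piping to wc -c. The function names, the default /sbin/zfs path and the chunk size are assumptions for illustration.

```python
import subprocess


def count_stream_bytes(stream, chunk_size=1 << 20):
    """Read a binary stream to EOF and return the number of bytes seen."""
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return total
        total += len(chunk)


def estimate_incremental(filesystem, base_snap, new_snap, zfs_path="/sbin/zfs"):
    """Estimate an incremental dump by running `zfs send -i` and discarding the data.

    Equivalent in spirit to: zfs send -i fs@base fs@new | wc -c
    """
    cmd = [zfs_path, "send", "-i",
           f"{filesystem}@{base_snap}", f"{filesystem}@{new_snap}"]
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc:
        size = count_stream_bytes(proc.stdout)
    # Leaving the `with` block waits for the process, so returncode is set.
    if proc.returncode != 0:
        raise RuntimeError(f"zfs send exited with status {proc.returncode}")
    return size
```

A call such as estimate_incremental("pool/fs", "0", "1") would correspond to the command line above. Note that this still generates the complete stream, so the estimate costs about as much as the dump itself minus writing it out.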

Any thoughts?

How accurate must the estimate be?

Does amanda do 0 -> 1 -> 2 -> 1 or only 0 -> 1 -> 2 -> 0? I only see the latter when running dump. If the latter sequence is the only one, it means that only the latest snapshot needs to be kept between backups.
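
To make the question concrete, here is a small sketch assuming the traditional dump convention, where a level N backup is taken relative to the most recent earlier backup of a lower level. Whether Amanda actually schedules a sequence like 0 -> 1 -> 2 -> 1 is the open question; find_base is a hypothetical helper, not Amanda code.

```python
def find_base(history, level):
    """Return the index in `history` of the backup a new dump at `level` builds on.

    `history` lists the levels of earlier runs, oldest first. A level 0 dump
    has no base, so None is returned.
    """
    if level == 0:
        return None
    # Walk backwards to the most recent run with a lower level.
    for index in range(len(history) - 1, -1, -1):
        if history[index] < level:
            return index
    raise ValueError(f"no earlier backup below level {level}")


history = [0, 1, 2]
print(find_base(history, 3))  # index 2: the latest snapshot is enough
print(find_base(history, 1))  # index 0: the level 0 snapshot is still needed
```

Under that assumption, dropping back from level 2 to level 1 requires the level 0 snapshot, so keeping only the latest snapshot is safe only if the level never drops to anything but 0.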

Is level 9 sufficient? On some large filesystems with streaming media, I was consistently doing level 5 to 8 backups when they were UFS and used dump.

Is the snapshot name fixed in the current code, or can it be changed to something like pool/fs@amanda.diskname.level, i.e. system/usr/local@amanda._usr_local.0, system/usr/local@amanda._usr_local.1, etc.?
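
A sketch of the proposed naming scheme in Python. The helper name is hypothetical, and the rule for turning the disk name into the middle component (slashes replaced by underscores) is only inferred from the system/usr/local example above; the level limit of 9 follows the question about level 9.

```python
def snapshot_name(filesystem, diskname, level):
    """Build a per-level snapshot name of the form fs@amanda.<diskname>.<level>."""
    if not 0 <= level <= 9:
        raise ValueError("level must be between 0 and 9")
    # Assumption: the disk name is a mount point such as /usr/local.
    safe_diskname = diskname.replace("/", "_")
    return f"{filesystem}@amanda.{safe_diskname}.{level}"


print(snapshot_name("system/usr/local", "/usr/local", 0))
print(snapshot_name("system/usr/local", "/usr/local", 1))
```

Encoding the level in the name lets several snapshots of the same filesystem coexist, which is what an incremental run needs in order to find its base.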

Unless there is anything that I can test already, I will experiment over the weekend and see what I come up with.

/glz

Last edited by glowkrantz; December 19th, 2008 at 02:43 AM. Reason: Spelling, clarification

#2
December 25th, 2008, 12:44 PM
glowkrantz
Amanda hacker

Join Date: May 2008
Posts: 45

Attached is what's needed to test incremental zfs send backups. It has been tested on:

FreeBSD 7.1-PRERELEASE: Sat Nov 29 03:33:04 CET 2008
ZFS filesystem version 6
ZFS storage pool version 6
ZFS on-disk version 1

and FreeBSD 8.0-CURRENT: Wed Dec 24 13:07:11 CET 2008
ZFS filesystem version 13
ZFS storage pool version 13
ZFS on-disk version 3

> unzip -l amanda-zfs-send-incremental.zip
Archive:  amanda-zfs-send-incremental.zip
  Length     Date   Time    Name
 --------    ----   ----    ----
     7713  12-25-08 21:08   amanda-2.6.1b2-20081222.tar.gz
      437  12-25-08 21:03   amwc.sh
     4257  12-25-08 20:51   patch-application-src::amzfs-sendrecv.pl
     4710  12-25-08 20:54   patch-perl::Amanda::Application::Zfs.pm
 --------                   -------
    17117                   4 files

For *BSD ports users:
amanda-2.6.1b2-20081222.tar.gz: amanda-devel port of the community build.

For anyone building from scratch or just wanting to update an installation:
amwc.sh: hack for reading the actual sendsize, MUST BE PLACED IN /usr/local/bin/amwc (yes, I know, but I hope it's temporary).
patch-perl::Amanda::Application::Zfs.pm: patch to support versioned snapshots and to allow overlap between amcheck and amdump.
patch-application-src::amzfs-sendrecv.pl: patch to handle incremental backups up to level 9.

All comments etc. welcome.

/glz
Attached Files
File Type: zip amanda-zfs-send-incremental.zip (10.7 KB, 7 views)

#3
December 26th, 2008, 11:47 PM
glowkrantz
Amanda hacker

Join Date: May 2008
Posts: 45

One error in the port - missed a moved manpage.

/glz
Attached Files
File Type: zip amanda-devel.zip (7.7 KB, 4 views)

#4
January 4th, 2009, 06:57 PM
glowkrantz
Amanda hacker

Join Date: May 2008
Posts: 45

Some more info:
Dumptype definitions to use with these patches:

# My global
define dumptype admin {
    comment "Global definitions for all administered dumps"
    # This is quite useful for setting global parameters, so you don't have
    # to type them everywhere. All dumptype definitions in this sample file
    # do include these definitions, either directly or indirectly.
    # There's nothing special about the name `global'; if you create any
    # dumptype that does not contain the word `global' or the name of any
    # other dumptype that contains it, these definitions won't apply.
    # Note that these definitions may be overridden in other
    # dumptypes, if the redefinitions appear *after* the `global'
    # dumptype name.
    # You may want to use this for globally enabling or disabling
    # indexing, recording, etc. Some examples:
    index yes
    record yes
}

#
# Amanda ZFS dump using snapshot and zfs send
#
define application-tool amzfs_sendrecv {
    comment "amzfs-sendrecv"
    plugin "amzfs-sendrecv"
    property "DF-PATH" "/bin/df"
    property "ZFS-PATH" "/sbin/zfs"
    # FreeBSD 7, delegation works in CURRENT
    property "PFEXEC-PATH" "/usr/local/bin/sudo"
    property "PFEXEC" "YES"
}

define dumptype user-zfs-sendrecv {
    admin
    index no
    program "APPLICATION"
    application "amzfs_sendrecv"
    maxdumps 2
}

define dumptype user-zfs-sendrecv-split {
    user-zfs-sendrecv
    tape_splitsize 1 Gb
}

Using sudo for operator access in FreeBSD 7, which predates delegation, requires the following line in the sudoers file:

operator ALL=(root) NOPASSWD: /sbin/zfs
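
As an illustration of how the PFEXEC properties and the sudoers line fit together, here is a sketch in Python of the command line that results. It is not the plugin's code; build_zfs_command is a hypothetical helper, and the interpretation of the properties (PFEXEC "YES" means prefix every zfs call with PFEXEC-PATH) is an assumption based on their names.

```python
def build_zfs_command(properties, zfs_args):
    """Build the argv list for a zfs call from application-tool properties."""
    command = []
    # Assumption: PFEXEC "YES" means the call is prefixed with PFEXEC-PATH.
    if properties.get("PFEXEC", "NO").upper() == "YES":
        command.append(properties["PFEXEC-PATH"])
    command.append(properties.get("ZFS-PATH", "zfs"))
    command.extend(zfs_args)
    return command


props = {
    "ZFS-PATH": "/sbin/zfs",
    "PFEXEC-PATH": "/usr/local/bin/sudo",
    "PFEXEC": "YES",
}
print(build_zfs_command(props, ["send", "pool/fs@0"]))
```

With the properties above, every zfs call runs as /usr/local/bin/sudo /sbin/zfs ..., which is exactly the command the sudoers entry allows the operator user to run without a password.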

/glz

Last edited by glowkrantz; January 4th, 2009 at 07:02 PM. Reason: More info

#5
January 13th, 2009, 08:21 AM
martineau
Amanda hacker

Join Date: Nov 2005
Posts: 366

Thanks for the patch. I have started looking at it and I have one question:

Why is zfs_purge_snapshot called before the backup is attempted?
I think it should be called only if the backup succeeds, and just before zfs_rename_snapshot.

If amanda tries a level 0 and it fails, it can try a higher level on the next run, so the older snapshots must still exist at that point.
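
A sketch of the ordering suggested here, in Python. The three callables are hypothetical stand-ins for the zfs send step, zfs_purge_snapshot and zfs_rename_snapshot; their real signatures in the patch are not shown in this thread.

```python
def run_backup(send_snapshot, purge_old_snapshots, rename_new_snapshot):
    """Run one backup and touch existing snapshots only after it succeeded."""
    if not send_snapshot():
        # Failed run: leave every existing snapshot in place so that a later
        # run at any level still finds its base.
        return False
    purge_old_snapshots()
    rename_new_snapshot()
    return True


calls = []
ok = run_backup(lambda: False,
                lambda: calls.append("purge"),
                lambda: calls.append("rename"))
print(ok, calls)
```

With a failing send, neither purge nor rename runs, so the snapshots from earlier levels are still available for the next attempt.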

#6
January 15th, 2009, 02:49 AM
nick.smith@techop.ch

Join Date: Jul 2008
Posts: 8

Hi all,

I think the current proposal will have significant performance issues from piping the output through wc (especially during the estimate phase!).

Could you not use:

$cmd = "$self->{pfexec_cmd} $self->{zfs_path} get -Hp -o value used $self->{filesystem}\@$self->{snapshot}"

for snapshots with level > 0?

Regards,

Nick

#7
January 15th, 2009, 06:12 AM
martineau
Amanda hacker

Join Date: Nov 2005
Posts: 366

Quote:
Originally Posted by nick.smith@techop.ch
Hi all,

I think the current proposal will have significant performance issues from piping the output through wc (especially during the estimate phase!).

Could you not use:

$cmd = "$self->{pfexec_cmd} $self->{zfs_path} get -Hp -o value used $self->{filesystem}\@$self->{snapshot}"

This is always 0 for a newly created snapshot.
Let me know if you find a way to compute the estimate.

#8
January 16th, 2009, 01:05 AM
nick.smith@techop.ch

Join Date: Jul 2008
Posts: 8

Quote:
Originally Posted by martineau
This is always 0 for a newly created snapshot.
Let me know if you find a way to compute the estimate.

Apologies - my last post was somewhat erroneous!

How about iterating through the snapshots and subtracting the 'reference' value of the 'level-1' snapshot from the 'reference' value of the 'level' snapshot?

I've attached a script of my testing.

Any comments welcome!

Regards,

Nick
Attached Files
File Type: txt zfs-snapshot-test.txt (2.0 KB, 3 views)

#9
January 16th, 2009, 04:08 AM
martineau
Amanda hacker

Join Date: Nov 2005
Posts: 366

You didn't try removing files.

If you remove /rpool/test/sol-nv-b101-x86-dvd.iso.1 before creating the level 3 snapshot, then the reference for the level 3 snapshot will be 7.37G, and you will get:
# zfs list -r rpool/test
NAME           USED  AVAIL  REFER  MOUNTPOINT
rpool/test    10.4G   305G  10.4G  /rpool/test
rpool/test@0    17K      -  2.15G  -
rpool/test@1    17K      -  5.22G  -
rpool/test@2    17K      -  7.37G  -
rpool/test@3      0      -  7.37G  -

but the backup will be 2.15G. How do you compute that?
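
Applying the proposed subtraction to the REFER column of this listing shows the problem. This is a small Python sketch using only the numbers from the zfs list output; refer_delta is a hypothetical name for the proposed estimate.

```python
# REFER values in GiB, copied from the zfs list output for rpool/test@0..@3.
REFER = {0: 2.15, 1: 5.22, 2: 7.37, 3: 7.37}


def refer_delta(level):
    """Estimate a level's size as REFER(level) - REFER(level - 1)."""
    if level == 0:
        return REFER[0]
    return round(REFER[level] - REFER[level - 1], 2)


for level in sorted(REFER):
    print(level, refer_delta(level))
```

For level 3 the difference is zero, because the removed file and the newly written data cancel out in REFER, while the incremental stream still has to carry the 2.15G of new data.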

#10
January 16th, 2009, 05:02 AM
glowkrantz
Amanda hacker

Join Date: May 2008
Posts: 45

Which is why I followed the tradition of GNU tar, where the size is calculated by a dummy backup.

And looking in the sendsize debug file, I see that it's also done for dump, even on level 0.

There are times when we just don't know unless we do it.

An alternative could be to do server-side estimates, which I think use the statistics from previous dump runs to produce the estimate.
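
To illustrate the idea of estimating from earlier runs, here is a deliberately simple Python sketch. It is not Amanda's server-side estimate algorithm, which is not described in this thread; the function name, the averaging rule and the window size are all assumptions.

```python
def estimate_from_history(previous_sizes, window=3):
    """Estimate the next dump size as the mean of the last `window` dump sizes.

    `previous_sizes` holds the byte counts of earlier dumps of the same disk
    at the same level, oldest first. Returns None when there is no history.
    """
    recent = previous_sizes[-window:]
    if not recent:
        return None
    return sum(recent) // len(recent)


print(estimate_from_history([100, 200, 300]))
```

An estimate like this costs nothing on the client, at the price of being wrong whenever the amount of changed data differs from previous runs.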

/glz

Last edited by glowkrantz; January 16th, 2009 at 05:56 AM.
All times are GMT -8.