Amanda for use on large (>40TiB) systems?



stevecs
October 13th, 2009, 01:12 PM
I have used other products for backup in the past (NBU, Arkeia, Legato, and the old standbys: tar, star, cpio, et al.). I am on the hunt for a good low-cost backup solution for a few (read: < 5) servers that have very large data sets on them. I am not opposed to commercial Zmanda, but I don't like the subscription-only model (I assume that if I don't continue to pay every year, I lose the software and web license for the last version I had while under support?).

Anyway, since I haven't seen any real examples of people using Amanda at large scale, I was hoping someone would chime in here to confirm that 1) it works OK and 2) the performance doesn't suck (i.e. no long backup times). A corollary question: does Amanda support using multiple tape drives per job automatically (i.e. write file x to drive 0 and file x+1 to drive 1, and keep alternating)? I've done the manual method (pointing streams at each drive), but that does not scale as well for reducing backup time.



- 40 to 50TiB / ~5,000,000 files per server
- Quarterly full backups
- Daily differentials to a tape pool that gets rotated offsite weekly

- Tape drive is LTO-4 technology (Quantum Scalar 50)

- Maintain 3 full backups
  - last full backup on-site
  - n-1 backup offsite
  - n-2 backup in transit to become the next full n backup

- Differentials
  - keep n pool onsite
  - n-1 through n-11 pools offsite
  - n-12 in transit to become the next 'n'
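
To make the differential rotation above concrete, here is a minimal Python sketch of it. It assumes 13 weekly pools (n, n-1 through n-11, and n-12) that are reused round-robin. The pool numbering and the function name `pool_locations` are made up for illustration and have nothing to do with Amanda's own tape management.

```python
def pool_locations(week, pool_count=13):
    """Return {pool_id: location} for the given week number.

    Assumption: one pool per week, reused round-robin. The pool written
    this week is 'n'; the oldest pool (n-12) is on its way back to
    become next week's 'n'.
    """
    locations = {}
    for age in range(pool_count):
        pool_id = (week - age) % pool_count  # hypothetical pool numbering
        if age == 0:
            where = "onsite"        # current pool, n
        elif age == pool_count - 1:
            where = "in transit"    # n-12, becomes the next n
        else:
            where = "offsite"       # n-1 through n-11
        locations[pool_id] = where
    return locations


if __name__ == "__main__":
    for pool_id, where in sorted(pool_locations(week=0).items()):
        print(pool_id, where)
```

With these assumptions, the pool that is "in transit" in one week is the one that comes up as "onsite" the week after.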


Since these are large systems, I figure that running the Amanda server and client locally on each would be the most efficient (avoiding network latency and bandwidth issues). Assuming a staging drive is needed, I was thinking of a couple of SSDs (say 200-400GiB total space), as that would be the only thing fast enough to really keep up. Each system has 4 channels going to 64-128 spindles and has no problem pushing 120MiB/s to a tape drive (it could probably feed 4 drives for most of the backup, but some directories involve a lot of lookups, which would lower throughput to about 120-140MiB/s, i.e. a single drive).
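
For a rough sense of scale, the numbers above can be turned into a back-of-the-envelope estimate. This Python sketch assumes a sustained 120MiB/s per drive, no compression, no time lost to tape changes, and the LTO-4 native capacity of 800 GB per tape. The function names are only illustrative.

```python
import math

MIB = 1024 ** 2
TIB = 1024 ** 4
LTO4_NATIVE_BYTES = 800 * 10 ** 9  # LTO-4 native (uncompressed) capacity


def full_backup_hours(data_tib, rate_mib_s=120, drives=1):
    """Hours to stream data_tib TiB at rate_mib_s MiB/s per drive.

    Assumes every drive is kept streaming for the whole run, which
    is optimistic for directories with a lot of lookups.
    """
    seconds = (data_tib * TIB) / (rate_mib_s * MIB * drives)
    return seconds / 3600


def tapes_needed(data_tib):
    """Number of LTO-4 tapes for one full backup, uncompressed."""
    return math.ceil(data_tib * TIB / LTO4_NATIVE_BYTES)


if __name__ == "__main__":
    for size in (40, 50):
        print(size, "TiB:",
              round(full_backup_hours(size), 1), "h on 1 drive,",
              round(full_backup_hours(size, drives=4), 1), "h on 4 drives,",
              tapes_needed(size), "tapes")
```

Under those assumptions a 40TiB full backup takes roughly 97 hours on a single drive and about 24 hours spread evenly over four, and needs 55 tapes; 50TiB comes to roughly 121 hours and 69 tapes.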


Anyway, does anyone have any experience with or comments on large single-server setups and Amanda's suitability for them?

zmanda_jacob
October 14th, 2009, 01:56 PM
Hello,

I work at Zmanda in professional services, and have implemented Amanda in many environments that contain both large amounts of data (40TB) and large numbers of files (I recently worked on a dataset that had 46 million files). I can say from my experience that Amanda, with proper backup planning, can handle these large amounts of data and files. As for multiple tape drives, Amanda does support them and you can back up to multiple tape drives simultaneously, but you will have to divide your data across multiple backup sets. Using these methods, you can certainly keep your backup times low.

Jacob Shucart
Professional Services Engineer
Zmanda - jacob@zmanda.com
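
One way to picture "dividing your data across multiple backup sets" is as a balancing problem: assign directories to one set per tape drive so that each drive gets a similar amount of data. The Python sketch below uses a simple greedy largest-first heuristic. It is not something Amanda does for you; the directory names and sizes are hypothetical, and each resulting group would still have to be entered by hand into the disklist of its own backup set.

```python
import heapq


def split_into_backup_sets(dir_sizes, drive_count):
    """Greedy largest-first split of directories into per-drive sets.

    dir_sizes: {directory: size}, in any consistent unit.
    Returns (sets, totals): the directories assigned to each set and
    the total size of each set.
    """
    # Min-heap of (current total, set index): always fill the emptiest set.
    heap = [(0, i) for i in range(drive_count)]
    heapq.heapify(heap)
    sets = [[] for _ in range(drive_count)]
    by_size = sorted(dir_sizes.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in by_size:
        total, idx = heapq.heappop(heap)
        sets[idx].append(name)
        heapq.heappush(heap, (total + size, idx))
    totals = [0] * drive_count
    for total, idx in heap:
        totals[idx] = total
    return sets, totals


if __name__ == "__main__":
    # Hypothetical directory sizes in TiB.
    sizes = {"/data/a": 14, "/data/b": 11, "/data/c": 9,
             "/data/d": 6, "/data/e": 5}
    sets, totals = split_into_backup_sets(sizes, drive_count=2)
    for idx, (members, total) in enumerate(zip(sets, totals)):
        print("set", idx, members, total, "TiB")
```

The greedy split is not guaranteed to be optimal, but it is usually close enough to keep the drives finishing at about the same time.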

stevecs
October 15th, 2009, 10:03 PM
OK, so Amanda does not do anything more intelligent with tape/file load balancing; it still requires manual stream or job configuration, each with its own pool.

You did not comment on the offsite rotations: does Amanda handle that in its tape management setup? Does it have the ability to know about tapes that are not physically on-site until they expire (and ideally the ability to run a report of which tapes should be brought back from offsite, i.e. a pull/pack list)?

Are there any references or specs as to what type of platform (CPU MHz, # of CPUs/cores, memory, et al.) is required to sustain backup speeds of 120MiB/s+ or multiples thereof (i.e. multiple LTO-4 drives)? I know the network load required to run this traffic over a wire, but I'm more interested in what overhead Amanda itself needs to push that load, i.e. its own processing overhead.

zbackup
October 16th, 2009, 12:13 PM
A few comments:

1. Amanda does do load balancing at a different level. Its scheduler automatically determines backup levels for clients to optimize network and storage usage.

2. Regarding subscription vs. perpetual: Zmanda's sales team can work with you on that. You can connect with them at zsales (at) zmanda.com. If you went with subscription, after your subscription expires you will be able to use the software with the same feature set as the community edition.

3. Specs for the backup server will depend on several factors, including, for example, whether you are doing encryption on the server. In general, additional CPU power and I/O bandwidth on the backup server always come in handy.

zmanda_jacob
October 20th, 2009, 03:01 PM
As far as offsite tape rotations go, Amanda supports removing tapes from the changer. If you remove them from the changer to send them offsite, Amanda will maintain the indexes for those tapes so when you bring them back Amanda will know what is on them and you can perform restores and put them back in the rotation if needed.