Index file locking weirdness

mjm1138
November 14th, 2007, 09:53 AM
Hi everyone,

I've searched through the forum and haven't found anything on this one. I'm attempting to do mysql-zrm dumps to a Data Domain restorer device, which appears to the system as an NFS share (export options are rw,no_root_squash,no_all_squash,secure). The OS is CentOS 5 and the mysql-zrm version is 1.2.1. When I attempt to run the job I get the following error message:

dailyrun:backup:ERROR: Cannot lock index file for writing. Permission denied

I've tried several permutations of directory ownership for the destination directory, all with the same result. I've also tried running mysql-zrm-scheduler both as root and the mysql user, and tried doing the backup both with the backup-user database user I created and the mysql "root" user.

Any theories? A file called "index" does get created in the datetime-stamped subdirectory of the destination directory, and it's owned by root:root. I'm stumped.

Thanks,
-Mike

paddy
November 14th, 2007, 02:12 PM
A guess:
What kernel version are you using? http://nfs.sourceforge.net mentions that the NFS client gained support for the flock interface in kernel 2.6.12.

Try writing a small program that calls flock, to see whether the issue is with the Data Domain server or the NFS client.
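For example, here is a minimal sketch (the mount point is a placeholder; flock here is the util-linux flock(1) command):

#!/bin/bash
# Try to take an exclusive lock on a scratch file on the NFS mount in question.
TESTFILE="/mnt/datadomain/flock-test.$$"   # placeholder path: point this at your mount

exec 9>"$TESTFILE"              # open fd 9 on the test file (creates it)
if flock -x -n 9; then          # exclusive, non-blocking lock on fd 9
    echo "flock succeeded on $TESTFILE"
else
    echo "flock FAILED on $TESTFILE -- the server may not support it"
fi
exec 9>&-                       # close the fd, which releases the lock
rm -f "$TESTFILE"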

One option is to remove flock from the mysql-zrm-backup.pl code. In ZRM, flock is used at the time the index file is created.

Paddy

mjm1138
November 14th, 2007, 02:46 PM
Kernel version is 2.6.18. Thanks for the pointer though. I note that I can successfully use flock against a file exported from another NFS server, but not on the DataDomain device. This may be a peculiarity of their NFS implementation or their filesystem; I'll seek advice from them and post their reply.

In the meantime, can you tell me what the potential implications are of removing flock from the backup script? If this is the only host writing to this directory, what is my risk?

Thanks,
-Mike

paddy
November 14th, 2007, 03:00 PM
> Kernel version is 2.6.18. Thanks for the pointer though. I note that I can successfully use flock against a file exported from another NFS server, but not on the DataDomain device. This may be a peculiarity of their NFS implementation or their filesystem; I'll seek advice from them and post their reply.


Please post the information you get from the Data Domain folks. Data Domain does not support some operations, such as hard links; I'm not sure about flock.



> In the meantime, can you tell me what the potential implications are of removing flock from the backup script? If this is the only host writing to this directory, what is my risk?


There is no risk as long as you haven't scheduled multiple backup runs at the same time (backup runs should not overlap).
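If you want to be sure runs never overlap, you could serialize them with a lock on the local filesystem instead, so the NFS destination never needs flock. A rough sketch (the lock path is arbitrary; "dailyrun" is the backup set from your error message):

#!/bin/bash
# Allow only one backup run at a time, using a lock file on LOCAL disk.
LOCKFILE="/var/run/mysql-zrm-backup.lock"

exec 9>"$LOCKFILE"
if ! flock -x -n 9; then
    echo "a previous backup run is still in progress, exiting" >&2
    exit 1
fi

mysql-zrm --action backup --backup-set dailyrun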

Paddy

mjm1138
November 15th, 2007, 08:32 AM
Word from DataDomain is that they do not support flock, so there it is. I'm currently doing a logical dump with flock stripped out of the script, which seems to be working. It is sssssssllllllllllloooooooooooooooooowwwww, but that's another matter.

Thanks for the help.

paddy
November 15th, 2007, 11:53 AM
Why logical backup? You should try "raw" backup.

What is the size of the data being backed up?

See Dmitri's post (http://forums.zmanda.com/showthread.php?t=489&highlight=mysql+paper) on ZRM performance with various backup methods.

Paddy

mjm1138
November 15th, 2007, 01:44 PM
Found the performance issue, which was a mismatch in MTU between the database host and the DataDomain device. I now get good performance with both raw and logical backups. We're going to be doing logical backups because of the portability implications of using the raw method. For our current 44GB of data on an 8-way box with 24GB RAM I find this makes a difference of about 10 minutes, but that's not a big problem for us. As the data set grows, we may ultimately decide that performance trumps portability.
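In case anyone else hits this, checking and matching the MTUs only takes a minute (the interface and host names below are just examples):

# See the current MTU on the database host's interface:
ip link show eth0

# Set both ends to match, e.g. jumbo frames:
ip link set eth0 mtu 9000

# Confirm large packets pass unfragmented (9000 minus 28 bytes of IP/ICMP headers):
ping -M do -s 8972 datadomain-host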

Thanks again for all the help.

-Mike

raminix
June 16th, 2008, 06:47 AM
I'm in a test pilot right now with a DataDomain DDR-530 and was running into the same issue with file locks. I found this thread and commented out the lines in the mysql-zrm-backup file that created the file locks on the index file, and have been backing up successfully to the DDR appliance.

I discovered that the purges are hitting the same issue. I haven't been able to locate where in the code the file lock is created, but in a tcpdump I can see lock requests going to lockd on the DDR.
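For reference, this is roughly how I spotted it (the hostname and interface are placeholders; the lock manager's port is assigned by the portmapper, so look it up first):

# Ask the appliance's portmapper where its lock manager (nlockmgr) is registered:
rpcinfo -p ddr-host | grep nlockmgr

# Then watch for NLM traffic on that port (32803 is just an example):
tcpdump -n -i eth0 host ddr-host and port 32803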

Any suggestions on how to get mysql-zrm-purge running against a DDR as well?

kkg
June 16th, 2008, 09:47 PM
> I'm in a test pilot right now with a DataDomain DDR-530 and was running into the same issue with file locks. I found this thread and commented out the lines in the mysql-zrm-backup file that created the file locks on the index file, and have been backing up successfully to the DDR appliance.
>
> I discovered that the purges are hitting the same issue. I haven't been able to locate where in the code the file lock is created, but in a tcpdump I can see lock requests going to lockd on the DDR.
>
> Any suggestions on how to get mysql-zrm-purge running against a DDR as well?

If I remember correctly, the only other place flock is used is in /usr/lib/mysql-zrm/ZRM/Common.pm for locking the log file.

You could try commenting that out.
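A quick way to find any remaining flock calls in the installed code (adjust the paths if your install differs):

grep -rn flock /usr/lib/mysql-zrm/ /usr/bin/mysql-zrm*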

--kkg

raminix
June 30th, 2008, 08:40 AM
Well, I did try that already and didn't have any luck. Thanks, though!

raminix
July 2nd, 2008, 05:17 PM
If anyone else happens to be stuck in the same situation, where they can't get the purge action to work due to lockd issues, feel free to modify this to suit your needs. I flung together a quick-and-dirty shell script that does the same thing as mysql-zrm-purge, using the same criteria. I only wrote it to handle retention periods expressed in weeks, and for no more than nine (we max out at eight for all of our backups). It checks the index file for each backup and acts according to the stated retention period and backup timestamp. If you have a different retention requirement, it should only take a minute or two to rewrite it to suit your needs.

Keep in mind there is NO error checking - it wasn't meant to be pretty.


Hope this helps someone!

--------


#!/bin/bash

##################################################
# purge-zrm-backups
#
# Purges expired backups without file locking.
# Assumes a retention policy of the form "NW"
# (N weeks, single digit) in each backup's index.
##################################################

BACKUPDIR="/path/to/your/mysql-zrm/backups"
PURGELOG="/var/log/mysql-zrm/purgelog"
CURDATE=$(date +%s)              # current time as an epoch timestamp
TIMESTAMP="date -Iseconds"       # command used to stamp log lines

echo "$($TIMESTAMP) -- Starting purge session" >> "$PURGELOG"

for buset in "$BACKUPDIR"/*              # one directory per backup set
do
    for budate in "$buset"/*             # one datetime-stamped directory per run
    do
        # Retention policy from the index, e.g. "8W"; the leading digit is weeks.
        KEEP=$(grep retention-policy "$budate/index" | awk -F= '{print $2}')
        WEEKS=${KEEP:0:1}
        TICKS=$((WEEKS * 7 * 86400))     # retention period in seconds
        CUTOFF=$((CURDATE - TICKS))

        # Epoch timestamp ZRM recorded when the backup ran.
        TSTAMP=$(grep backup-date-epoch "$budate/index" | awk -F= '{print $2}')

        if [ "$TSTAMP" -lt "$CUTOFF" ]; then
            echo "$($TIMESTAMP) -- | Purging $budate" >> "$PURGELOG"
            rm -rf "$budate" >> "$PURGELOG" 2>&1
        fi
    done
done

echo "$($TIMESTAMP) -- Finished purge session" >> "$PURGELOG"