Thread: selfcheck request failed: timeout waiting for REP

    Hello,

    I am having trouble with a couple of aspects of my Amanda deployment. Unfortunately, one of them is also among the most important: PostgreSQL database backups with ampgsql.


    When I run amcheck against my config, this is what I get:

    Code:
    [amandabackup@amanda ~]$ amcheck SNJC
    Amanda Tape Server Host Check
    -----------------------------
    WARNING: part-size of 0 MB < 0.1% of tape length.
     This may create > 1000 parts, severely degrading backup/restore performance.
     See http://wiki.zmanda.com/index.php/Splitsize_too_small for more information.
    Holding disk /mnt/store/amanda: 1077312 MB disk space available, using 872512 MB
    slot 3: volume 'SNJC-3'
    Will write to volume 'SNJC-3' in slot 3.
    NOTE: skipping tape-writable test
    Server check took 3.554 seconds
    
    Amanda Backup Client Hosts Check
    --------------------------------
    ERROR: VIRTCENT14.summitnjhome.com: [can not read/write /var/lib/amanda/gnutar-lists/.: Read-only file system]
    WARNING: VIRTCENT09.summitnjhome.com: selfcheck request failed: timeout waiting for REP
    Client check: 11 hosts checked in 99.108 seconds.  2 problems found.
    
    (brought to you by Amanda 3.2.1)

    The first warning is odd, because this is how I have my tapetype set:

    Code:
    tapetype HARDDISK
    define tapetype HARDDISK {
    part_size 20
    comment "Backup to Virtual Tape"
    length 20480 mbytes # each tape is 20 Gigs
    }

    According to my calculations, that should be about 0.1 percent of the tape length (20 MB out of 20480 MB), so I'm not sure why Amanda reports a part size of 0 MB.
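
As a rough check of that arithmetic, here is a minimal Python sketch comparing the part size to the tape length under two readings of `part_size 20`. It assumes that a value written without a unit suffix is read as kilobytes; that is an assumption here, though it would be consistent with the "part-size of 0 MB" wording in the warning. The function name is made up for illustration.

```python
# Assumption: a size without a unit suffix is interpreted as kilobytes.
KB_PER_MB = 1024


def part_size_percent(part_size_kb: float, tape_length_kb: float) -> float:
    """Return the part size as a percentage of the tape length."""
    return 100.0 * part_size_kb / tape_length_kb


# "length 20480 mbytes" from the tapetype definition, converted to kilobytes.
tape_length_kb = 20480 * KB_PER_MB

# Reading 1: the 20 was meant as 20 megabytes.
as_mbytes = part_size_percent(20 * KB_PER_MB, tape_length_kb)

# Reading 2: the 20 is taken as 20 kilobytes (no unit suffix given).
as_kbytes = part_size_percent(20, tape_length_kb)

print(f"20 MB parts: {as_mbytes:.4f}% of tape length")
print(f"20 KB parts: {as_kbytes:.6f}% of tape length")
```

Under either reading the result is below 0.1%: 20 MB out of 20480 MB is roughly 0.098%, and 20 KB is far smaller still, so the warning would fire in both cases.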


    As for the issue on VIRTCENT14, that is a read-only file system problem, and I'm dealing with it. What I need help with most of all is the PostgreSQL DB access issue.


    Here is my client side config:

    Code:
    #
    # amanda.conf - sample Amanda client configuration file.
    #
    # This file normally goes in /etc/amanda/amanda-client.conf.
    #
    
    conf "SNJH"		# your config name
    
    index_server "amanda.summitnjhome.com"	# your amindexd server
    tape_server  "amanda.summitnjhome.com"	# your amidxtaped server
    tapedev      "changer"	# your tape device
    			# if not set, Use configure or ask server.
    			# if set to empty string "", ask server
    			# amrecover will use the changer if set to the value
    			# of 'amrecover_changer' in the server amanda.conf.
    
    #   auth	- authentication scheme to use between server and client.
    #		  Valid values are "bsd", "bsdudp", "bsdtcp", "krb5", "local",
    #		  "rsh" and "ssh".  
    #		  Default: [auth "bsdtcp"]
    auth "bsdtcp"
    
    #ssh_keys ""			# your ssh keys file if you use ssh auth
    
    property "PG-DATADIR" "/var/lib/pgsql/data"
    property "PG-ARCHIVEDIR" "/var/lib/pgsql/archive"
    property "PG-HOST" "/tmp"
    property "PG-USER" "amandabackup"
    property "PG-PASSFILE" "/etc/amanda/pg_passfile"
    Both machines (working and non-working) have the same client config.

    Here is how the permissions look on those files in the /etc/amanda directory on the database client machine:

    Code:
    [root@VIRTCENT09:/etc/amanda] #ls -l
    total 8
    -rw------- 1 amandabackup disk 989 Feb  1 03:24 amanda-client.conf
    -rw------- 1 amandabackup disk  38 Jan  7 10:50 pg_passfile
    And here is how those directories look on the problem server:

    Code:
    [root@VIRTCENT09:~] #ls -l /var/lib/pgsql/
    total 20
    drwxr-xr-x  2 postgres postgres 4096 Jan 29 22:13 archive
    drwx------  2 postgres postgres 4096 Oct 11 10:42 backups
    drwx------ 11 postgres postgres 4096 Feb  1 03:10 data
    drwxr-xr-x  3 postgres postgres 4096 Jan 15 23:42 lib
    -rw-------  1 postgres postgres 3123 Jan 31 22:56 pgstartup.log

    Here is how they look on the working server:

    Code:
    [root@VIRTCENT10:/var/lib/pgsql] #ls -l
    total 28
    drwxr-x---  2 postgres postgres 12288 Feb  1 03:10 archive
    drwx------  2 postgres postgres  4096 Oct 11 10:42 backups
    drwx------ 12 postgres postgres  4096 Feb  1 03:10 data
    -rw-------  1 postgres postgres  2740 Jan 14 12:13 pgstartup.log
    drwxr-xr-x  3 postgres postgres  4096 Feb  1 02:55 test

    I notice that the non-working one has some large directories in it (I'm not sure whether that could be contributing to the problem):

    Code:
    [root@VIRTCENT09:/var/lib/pgsql] #du -sh *
    275M    archive
    4.0K    backups
    109M    data
    1.1G    lib
    4.0K    pgstartup.log

    This is what disk usage looks like on the working server (the one that can successfully back up its pgsql directory):

    Code:
    [root@VIRTCENT10:/var/lib/pgsql] #du -sh *
    674M    archive
    4.0K    backups
    96M     data
    4.0K    pgstartup.log
    32K     test

    I have added the amanda user to the postgres group (for access to the archive directory):

    Code:
    [root@VIRTCENT09:~] #groups amandabackup
    amandabackup : disk postgres
    But I am unclear on how the amanda user is meant to access the data directory.
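
To see what the backup account can actually reach, a minimal Python sketch like the one below could be run as that user. It only reports whether the account running it can read and traverse each directory. It does not confirm what access ampgsql itself requires, and the function name is hypothetical.

```python
import os


def check_dir_access(paths):
    """Return {path: (readable, traversable)} for the user running this script."""
    results = {}
    for path in paths:
        readable = os.access(path, os.R_OK)     # can list the directory contents
        traversable = os.access(path, os.X_OK)  # can enter the directory
        results[path] = (readable, traversable)
    return results


if __name__ == "__main__":
    # Paths taken from the PG-DATADIR and PG-ARCHIVEDIR properties above.
    dirs = ["/var/lib/pgsql/data", "/var/lib/pgsql/archive"]
    for path, (readable, traversable) in check_dir_access(dirs).items():
        print(f"{path}: read={readable} traverse={traversable}")
```

Running it as the amandabackup user on both VIRTCENT09 and VIRTCENT10 would show whether the two hosts differ in what that account can see, which the `ls -l` listings alone only hint at.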


    This is how the non-working pgsql directory is listed in the disk list:

    Code:
    VIRTCENT09.summitnjhome.com /var/lib/pgsql/data dt_ampgsql
    And this is how the working pgsql directory is listed in the disk list:

    Code:
    VIRTCENT10.summitnjhome.com /var/lib/pgsql/data dt_ampgsql


    I am including my complete config as an attachment. The only thing I've tried so far is raising the etimeout and ctimeout values, following a troubleshooting article I found on the web for this type of error, but to no avail.


    I am an inexperienced Amanda administrator. Could someone please help me find a solution to these errors?

    Thanks in advance!