Thread: amcheck, amdump failing when a single client is unavailable

  1. #1
    Join Date
    Aug 2006
    Posts
    34

    Hi all -
    Since I changed to bsdtcp auth, amcheck and amdump fail/never finish when one of my backup clients is unavailable. This means that if one Amanda client is down, my entire backup fails.

    For example, if I run amcheck against all clients, or just the client that is unavailable:
    Code:
    -sh-3.00$ amcheck mybackupset
    I get the expected results, then it just hangs after "Server check took 0.069". Here is a snippet of the amcheck log in /tmp/amanda/server/(backupset)/amcheck.*.debug:

    Code:
    amcheck-clients: time 4000.034: connect_port: Try  port 1922: Available   - amcheck-clients: time 4003.035: connect_portrange: connect from 0.0.0.0.1922 failed
    amcheck-clients: time 4003.035: connect_portrange: connect to XXX.XXX.XXX.XX.10080 failed: No route to host
    amcheck-clients: time 4003.035: connect_port: Try  port 1923: Available   - amcheck-clients: time 4006.036: connect_portrange: connect from 0.0.0.0.1923 failed
    amcheck-clients: time 4006.036: connect_portrange: connect to XXX.XXX.XXX.XX.10080 failed: No route to host
    amcheck-clients: time 4006.036: connect_port: Try  port 1924: Available   - amcheck-clients: time 4009.036: connect_portrange: connect from 0.0.0.0.1924 failed
    amcheck-clients: time 4009.036: connect_portrange: connect to XXX.XXX.XXX.XX.10080 failed: No route to host
    amcheck-clients: time 4009.036: connect_port: Try  port 1925: Available   - amcheck-clients: time 4012.036: connect_portrange: connect from 0.0.0.0.1925 failed
    amcheck-clients: time 4012.036: connect_portrange: connect to XXX.XXX.XXX.XX.10080 failed: No route to host
    amcheck-clients: time 4012.037: connect_port: Try  port 1926: Available   -
    It looks as if it's enumerating source ports while trying to reach the unavailable client. It will run for hours before I give up and kill the process. I thought bsdtcp was only supposed to use 10080/tcp or 512,1023/tcp? Do I need to specify the reserved-tcp-port in amanda.conf?

    Server: amanda-backup_server-2.5.1b2-1.rhel4 RPM
    thanks!
    B.

  2. #2
    Join Date
    Oct 2005
    Location
    Bay Area, CA
    Posts
    124

    Hi,

    I cannot reproduce the problem here. I purposely brought down one of the systems in the disklist. amcheck reports:
    WARNING: ultra2.zmanda.com: selfcheck request failed: No route to host

    amdump reports "result missing" from ultra2, while the other system on the disklist got backed up correctly.

    What are the ctimeout, dtimeout and etimeout settings?

    --Kevin Till

  3. #3
    Join Date
    Aug 2006
    Posts
    34

    Hi Kevin,
    My timeout values in amanda.conf are as follows:

    etimeout 300 # number of seconds per filesystem for estimates.
    dtimeout 3600 # number of idle seconds before a dump is aborted.
    ctimeout 30

  4. #4
    Join Date
    Aug 2006
    Posts
    34

    P.S. - I tried specifying the "reserved-tcp-port" variable as mentioned here: [url]http://wiki.zmanda.com/index.php/Amanda.conf[/url] by putting the following in amanda.conf:
    Code:
    reserved-tcp-port "512,1023"
    But Amanda doesn't seem to like it:
    Code:
    -sh-3.00$ amcheck ISDaily2.5
    "/etc/amanda/ISDaily2.5/amanda.conf", line 65: configuration keyword expected
    "/etc/amanda/ISDaily2.5/amanda.conf", line 65: end of line is expected
    amcheck: errors processing config file "/etc/amanda/ISDaily2.5/amanda.conf"

  5. #5
    Join Date
    Aug 2006
    Posts
    34

    Here I run amcheck against a host that is in fact up and running:

    Code:
    -sh-3.00$ amcheck ISDaily2.5 cfdev.xxx.xxxx.xxx
    Amanda Tape Server Host Check
    -----------------------------
    Holding disk /amandahold: 385996 MB disk space available, using 395258080 MB
    slot 28: read label `ISDaily27', date `20060906030001'
    NOTE: skipping tape-writable test
    Tape ISDaily27 label ok
    (snip)
    
    Amanda Backup Client Hosts Check
    --------------------------------
    ERROR: NAK cfdev.xxx.xxxx.xxx: host middenheap-dev.xxx.xxxx.xxx: port 1025 not secure
    Client check: 18 hosts checked in 188.978 seconds, 1 problem found
    
    (brought to you by Amanda 2.5.1b2)
    On the client, a ps -aux turns up two amandad processes still hanging around:
    Code:
    504      23785  0.0  0.1  2204  924 ?        Ss   Oct29   0:00 amandad -auth=bsdtcp amdump
    504      27214  0.0  0.1  2204  924 ?        Ss   Oct30   0:00 amandad -auth=bsdtcp amdump
    I'm guessing those are left over from last night's two backups, which I had to kill on the server because they were timing out due to the single host being unavailable. I did a "sudo killall amandad" on the client, then ran amcheck on the server again, and it ran fine.
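
A small Python sketch for spotting leftover amandad processes in `ps aux`-style output like the lines quoted above. It is not part of Amanda; the function name `stale_amandad_pids` is hypothetical, and it assumes the usual 11-column `ps aux` layout with the command in the last column.

```python
def stale_amandad_pids(ps_output):
    """Return the PIDs of processes whose command is amandad.

    Assumes `ps aux` columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND.
    """
    pids = []
    for line in ps_output.splitlines():
        # Split into at most 11 fields so the full command line stays together.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        command = fields[10].split()[0]
        if command == "amandad":
            pids.append(int(fields[1]))
    return pids


SAMPLE_PS = """\
504      23785  0.0  0.1  2204  924 ?        Ss   Oct29   0:00 amandad -auth=bsdtcp amdump
504      27214  0.0  0.1  2204  924 ?        Ss   Oct30   0:00 amandad -auth=bsdtcp amdump
"""

print(stale_amandad_pids(SAMPLE_PS))
```

Whether a listed process is really stale still has to be judged by hand, for example from its start date.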

  6. #6
    Join Date
    Aug 2006
    Posts
    34

    And, one more time, an amcheck run on just the host that is down:

    cat /tmp/amanda/server/ISDaily2.5/amcheck.20061101104702.debug
    Code:
    amcheck: debug 1 pid 14416 ruid 502 euid 0: start at Wed Nov  1 10:47:02 2006
    amcheck: debug 1 pid 14416 ruid 502 euid 502: rename at Wed Nov  1 10:47:02 2006
    security_getdriver(name=bsdtcp) returns 0xf57140
    security_handleinit(handle=0x8ff4670, driver=0xf57140 (BSDTCP))
    security_streaminit(stream=0x8ff4ef0, driver=0xf57140 (BSDTCP))
    amcheck-clients: time 0.005: connect_port: Skip port 512: Owned by exec.
    amcheck-clients: time 0.005: connect_port: Skip port 513: Owned by login.
    amcheck-clients: time 0.005: connect_port: Skip port 514: Owned by shell.
    amcheck-clients: time 0.005: connect_port: Skip port 515: Owned by printer.
    amcheck-clients: time 0.006: connect_port: Try  port 516: Available   - changer_query: changer return was 60 1
    changer_query: searchable = 0
    changer_find: looking for ISDaily27 changer is searchable = 0
    amcheck-clients: time 3.006: connect_portrange: connect from 0.0.0.0.516 failed
    amcheck-clients: time 3.006: connect_portrange: connect to XXX.XXX.XXX.99.10080 failed: No route to host
    amcheck-clients: time 3.007: connect_port: Try  port 517: Available   - amcheck-clients: time 6.008: connect_portrange: connect from 0.0.0.0.517 failed
    amcheck-clients: time 6.008: connect_portrange: connect to XXX.XXX.XXX.99.10080 failed: No route to host
    amcheck-clients: time 6.008: connect_port: Try  port 518: Available   - amcheck-clients: time 9.008: connect_portrange: connect from 0.0.0.0.518 failed
    amcheck-clients: time 9.008: connect_portrange: connect to XXX.XXX.XXX.99.10080 failed: No route to host
    amcheck-clients: time 9.008: connect_port: Skip port 519: Owned by utime.
    amcheck-clients: time 9.008: connect_port: Skip port 520: Owned by efs.
    amcheck-clients: time 9.008: connect_port: Skip port 521: Owned by ripng.
    amcheck-clients: time 9.009: connect_port: Try  port 522: Available   - amcheck-clients: time 12.009: connect_portrange: connect from 0.0.0.0.522 failed
    amcheck-clients: time 12.009: connect_portrange: connect to XXX.XXX.XXX.99.10080 failed: No route to host
    amcheck-clients: time 12.009: connect_port: Try  port 523: Available   - amcheck-clients: time 15.009: connect_portrange: connect from 0.0.0.0.523 failed
    amcheck-clients: time 15.009: connect_portrange: connect to XXX.XXX.XXX.99.10080 failed: No route to host
    amcheck-clients: time 15.010: connect_port: Try  port 524: Available   - amcheck-clients: time 18.010: connect_portrange: connect from 0.0.0.0.524 failed
    amcheck-clients: time 18.010: connect_portrange: connect to XXX.XXX.XXX.99.10080 failed: No route to host
    amcheck-clients: time 18.010: connect_port: Skip port 525: Owned by timed.
    amcheck-clients: time 18.010: connect_port: Skip port 526: Owned by tempo.
    amcheck-clients: time 18.010: connect_port: Try  port 527: Available   - amcheck-clients: time 21.011: connect_portrange: connect from 0.0.0.0.527 failed
    amcheck-clients: time 21.011: connect_portrange: connect to XXX.XXX.XXX.99.10080 failed: No route to host
    amcheck-clients: time 21.011: connect_port: Try  port 528: Available   - amcheck-clients: time 24.012: connect_portrange: connect from 0.0.0.0.528 failed
    amcheck-clients: time 24.012: connect_portrange: connect to XXX.XXX.XXX.99.10080 failed: No route to host
    amcheck-clients: time 24.012: connect_port: Try  port 529: Available   - amcheck-clients: time 27.013: connect_portrange: connect from 0.0.0.0.529 failed
    amcheck-clients: time 27.013: connect_portrange: connect to XXX.XXX.XXX.99.10080 failed: No route to host
    amcheck-clients: time 27.014: connect_port: Skip port 530: Owned by courier.
    amcheck-clients: time 27.014: connect_port: Skip port 531: Owned by conference.
    amcheck-clients: time 27.014: connect_port: Skip port 532: Owned by netnews.
    Why does it keep trying the host even though it has no route to it? I tried turning off iptables on the server, thinking it might be preventing the server from getting the hint that the host is completely unavailable, but it made no difference.
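
One way to quantify the retry loop in a debug file like the one above is to count the failed connection attempts and measure the time between them. The following is a minimal Python sketch, not part of Amanda; the name `summarize_failures` and the regular expression are illustrative assumptions based only on the log lines quoted in this thread.

```python
import re

# Matches lines such as:
#   amcheck-clients: time 3.006: connect_portrange: connect to XXX.XXX.XXX.99.10080 failed: No route to host
FAILED_CONNECT = re.compile(
    r"time (\d+\.\d+): connect_portrange: connect to (\S+) failed: (.+)$"
)


def summarize_failures(log_text):
    """Return (attempts, first_time, last_time, mean_gap) for failed connects."""
    times = []
    for line in log_text.splitlines():
        match = FAILED_CONNECT.search(line)
        if match:
            times.append(float(match.group(1)))
    if not times:
        return 0, None, None, None
    # Seconds elapsed between consecutive failed attempts.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    mean_gap = sum(gaps) / len(gaps) if gaps else None
    return len(times), times[0], times[-1], mean_gap


SAMPLE_LOG = """\
amcheck-clients: time 3.006: connect_portrange: connect to XXX.XXX.XXX.99.10080 failed: No route to host
amcheck-clients: time 3.007: connect_port: Try  port 517: Available   - amcheck-clients: time 6.008: connect_portrange: connect from 0.0.0.0.517 failed
amcheck-clients: time 6.008: connect_portrange: connect to XXX.XXX.XXX.99.10080 failed: No route to host
amcheck-clients: time 9.008: connect_portrange: connect to XXX.XXX.XXX.99.10080 failed: No route to host
"""

attempts, first, last, mean_gap = summarize_failures(SAMPLE_LOG)
print(attempts, first, last, mean_gap)
```

In the excerpt above each failed attempt costs about 3 seconds. The first log in this thread had reached port 1926 at roughly 4012 seconds, so the loop was evidently not stopping at port 1023.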

  7. #7
    Join Date
    Oct 2005
    Location
    Bay Area, CA
    Posts
    124

    Hi,

    try
    reserved-tcp-port 512,1023 # no quotes

  8. #8
    Join Date
    Aug 2006
    Posts
    34

    Without quotes doesn't work either. Here's the paste from my amanda.conf:

    Code:
    tapebufs 20             # A positive integer telling taper how many
                            # 32k buffers to allocate.  The default is 20 (640k).
    
    reserved-tcp-port 512,1023 # ports used by bsdtcp auth
    And the results:
    Code:
    -sh-3.00$ amcheck ISDaily2.5 cfdev.XX.XXX.XXX
    "/etc/amanda/ISDaily2.5/amanda.conf", line 65: configuration keyword expected
    "/etc/amanda/ISDaily2.5/amanda.conf", line 65: end of line is expected
    "/etc/amanda/ISDaily2.5/amanda.conf", line 66: configuration keyword expected
    "/etc/amanda/ISDaily2.5/amanda.conf", line 66: end of line is expected
    amcheck: errors processing config file "/etc/amanda/ISDaily2.5/amanda.conf"
    Does it need to be in a specific location in amanda.conf?

  9. #9
    Join Date
    Oct 2005
    Location
    Bay Area, CA
    Posts
    124

    reserved-tcp-port support was added on 9/22/06. I suspect your Amanda build predates that.

  10. #10
    Join Date
    Aug 2006
    Posts
    34

    amdump: start at Wed Nov 1 03:15:01 EST 2006
    amdump: datestamp 20061101
    planner: pid 7183 executable /usr/lib/amanda/planner version 2.5.1b2
    planner: build: VERSION="Amanda-2.5.1b2"
    planner: BUILT_DATE="Tue Aug 22 12:12:18 PDT 2006"
    planner: BUILT_MACH="Linux rocky.zmanda.com 2.6.9-22.0.2.ELsmp #1 SMP Thu Jan 5 17:13:01 EST 2006 i686 i686 i386 GNU/Linux"

    OK, I guess I need to uninstall and re-install with a newer RPM. Is it normal to add new features without incrementing the version/build ID?
    Edit: maybe I'm a little confused about versions - should I be working with the 2.5.1p RPMs available on the download page?
    Last edited by bethany; November 1st, 2006 at 12:30 PM.
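
The build date printed by planner can be compared against the 9/22/06 date given earlier in the thread. A minimal Python sketch, assuming the `BUILT_DATE` format shown above; the helper names `parse_built_date` and `built_before_feature` are hypothetical and nothing here is part of Amanda.

```python
from datetime import datetime, date

# Date the thread gives for when reserved-tcp-port support was added (9/22/06).
FEATURE_ADDED = date(2006, 9, 22)


def parse_built_date(value):
    """Parse a BUILT_DATE string like 'Tue Aug 22 12:12:18 PDT 2006'.

    The weekday, clock time and timezone abbreviation are ignored:
    strptime's %Z only recognizes a few zone names, and day-level
    precision is enough for this comparison.
    """
    _weekday, month, day, _clock, _tz, year = value.split()
    return datetime.strptime(f"{month} {day} {year}", "%b %d %Y").date()


def built_before_feature(built_date_value):
    """True if the build predates the reserved-tcp-port change."""
    return parse_built_date(built_date_value) < FEATURE_ADDED


print(built_before_feature("Tue Aug 22 12:12:18 PDT 2006"))
```

For the build quoted above (Aug 22, 2006) the check comes out true, which matches the "configuration keyword expected" errors seen when the option is added to amanda.conf.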
