Thread: bug: selfcheck request failed: recv error: Connection reset by peer

  1. #1
    Join Date
    Oct 2012
    Posts
    34

    bug: selfcheck request failed: recv error: Connection reset by peer

    For about a week I have had problems with amcheck and amdump. The errors are shown below.

    I was running 3.3.6, and today I compiled Amanda 3.3.7p1 for our Debian machine. I have obfuscated the hostnames for privacy reasons.

    Version 3.3.6 worked until a week ago, then stopped working without a version change. 3.3.7p1 has the same problem, and I really don't understand it, because not all machines/hostnames are affected.

    Additionally, if I run amcheck against a single machine (in this case git.example.de), it works.

    backup@chronos:/usr/local$ amcheck a2g-4
    Amanda Tape Server Host Check
    -----------------------------
    WARNING: holding disk /u1/amanda/ahd-4: only 6808342528 kB available (7340032000 kB requested)
    Searching for label 'a2g-4-03': volume ''
    slot 1: volume 'a2g-4-00' is still active and cannot be overwritten
    slot 2: volume 'a2g-4-01' is still active and cannot be overwritten
    slot 3: volume 'a2g-4-02' is still active and cannot be overwritten
    slot 4: volume 'a2g-4-03'
    Will write to volume 'a2g-4-03' in slot 4.
    NOTE: skipping tape-writable test
    Server check took 716.477 seconds

    Amanda Backup Client Hosts Check
    --------------------------------
    WARNING: localhost: selfcheck request failed: recv error: Connection reset by peer
    WARNING: yoda.example.de: selfcheck request failed: error sending REQ: write error to: Broken pipe
    WARNING: didi.example.de: selfcheck request failed: error sending REQ: write error to: Broken pipe
    WARNING: ci.example.de: selfcheck request failed: error sending REQ: write error to: Broken pipe
    WARNING: bdp.example.de: selfcheck request failed: error sending REQ: write error to: Broken pipe
    WARNING: connect.example.de: selfcheck request failed: error sending REQ: write error to: Broken pipe
    WARNING: git.example.de: selfcheck request failed: error sending REQ: write error to: Broken pipe
    WARNING: ext-prt2.exampleplus.de: selfcheck request failed: error sending REQ: write error to: Broken pipe
    WARNING: git.example.de: selfcheck request failed: error sending REQ: write error to: Broken pipe
    WARNING: pbx.example.de: selfcheck request failed: error sending REQ: write error to: Broken pipe
    WARNING: dockerlabs.example.de: selfcheck request failed: error sending REQ: write error to: Broken pipe
    WARNING: minkki-newsletter.staging.example.de: selfcheck request failed: error sending REQ: write error to: Broken pipe
    WARNING: mv1.example.de: selfcheck request failed: error sending REQ: write error to: Broken pipe
    WARNING: jdev.example.de: selfcheck request failed: error sending REQ: write error to: Broken pipe
    WARNING: ontranet.example.com: selfcheck request failed: error sending REQ: write error to: Broken pipe
    WARNING: jira.example.com: selfcheck request failed: error sending REQ: write error to: Broken pipe
    WARNING: spm.example.de: selfcheck request failed: Connection timed out
    Client check: 51 hosts checked in 788.644 seconds. 17 problems found.

    (brought to you by Amanda 3.3.7p1)
    ...

    backup@chronos:/usr/local$ amcheck a2g-4 git.example.de
    Amanda Tape Server Host Check
    -----------------------------
    WARNING: holding disk /u1/amanda/ahd-4: only 6808342528 kB available (7340032000 kB requested)
    Searching for label 'a2g-4-03':found in slot 4: volume 'a2g-4-03'
    Will write to volume 'a2g-4-03' in slot 4.
    NOTE: skipping tape-writable test
    Server check took 2.303 seconds

    Amanda Backup Client Hosts Check
    --------------------------------
    Client check: 1 host checked in 2.427 seconds. 0 problems found.

    (brought to you by Amanda 3.3.7p1)
    Last edited by r8qt7; June 26th, 2015 at 06:47 AM.

  2. #2
    Join Date
    Aug 2008
    Location
    Sunnyvale, CA
    Posts
    306

    Hello!

    If amcheck is run against each of these hosts individually, do any of them fail? Are any of them powered down?

    I see the check for localhost gets a "Connection reset by peer". Is it under any particular load while the amcheck for all hosts is running?

    I also see a "Connection timed out" for spm.example.de. Probably a firewall?

    Paul

  3. #3
    Join Date
    Oct 2012
    Posts
    34

    Quote: Originally posted by pyeatman
    Hello!

    If amcheck is run against each of these hosts individually, do any of them fail? Are any of them powered down?

    I see the check for localhost gets a "Connection reset by peer". Is it under any particular load while the amcheck for all hosts is running?

    I also see a "Connection timed out" for spm.example.de. Probably a firewall?

    Paul

    The spm host probably has a different problem, but the rest are working.

    The firewall (iptables) is disabled. Localhost (and the other hosts) work if I use amcheck a2g-4 localhost instead of the 'global' amcheck a2g-4.

    There may have been a Perl update to 5.20, but since amcheck and amdump work when I specify individual hosts and fail only on a global run, I wonder whether that is really the problem.

    A similar problem was reported earlier:

    https://www.mail-archive.com/amanda-users@amanda.org/msg47439.html

    I'm wondering if the bug fix disimproved the whole system?

  4. #4
    Join Date
    Aug 2008
    Location
    Sunnyvale, CA
    Posts
    306

    If an amcheck of "localhost" by itself succeeds but gets "Connection reset" when run with all hosts, I would first ask whether the additional network traffic and processes are preventing the client on that host from responding correctly. Is there any particular load on "localhost" during the host checks?

    As for the bug fix you linked, you are correct. It did disimprove the whole system, and it was reverted over a year ago, so it should not be in 3.3.7. That means one host failing its check can still cause other host checks to fail. This is why I asked whether any of these systems were powered down at the time of the check: a powered-down host is one kind of check "failure" that can cause others to fail.
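To separate hosts that may be triggering the failures from hosts that only show the knock-on "error sending REQ" message, the amcheck output can be sorted by error text. The following is a minimal sketch, not part of Amanda: the function names `classify_failures` and `suspect_hosts` and the labels `send-req`, `reset`, `timeout` are made up for illustration, and the matching assumes the WARNING line format shown in the first post.

```shell
#!/bin/sh
# Hypothetical helpers (not Amanda tools). They assume lines of the form
#   WARNING: <host>: selfcheck request failed: <reason>
# as seen in the amcheck output earlier in this thread.

# classify_failures: read amcheck output on stdin and print
# "<host> <kind>" for every failed selfcheck.
classify_failures() {
    awk '/selfcheck request failed/ {
        host = $2
        sub(/:$/, "", host)          # drop the trailing colon after the host
        kind = "other"
        if (/error sending REQ/)             kind = "send-req"
        else if (/Connection reset by peer/) kind = "reset"
        else if (/Connection timed out/)     kind = "timeout"
        print host, kind
    }'
}

# suspect_hosts: print only hosts that failed for a reason other than
# "error sending REQ", i.e. candidates for the original failure.
suspect_hosts() {
    classify_failures | awk '$2 != "send-req" { print $1 }'
}
```

A possible use, with the configuration name from this thread, would be `amcheck a2g-4 2>&1 | suspect_hosts`. For the run in the first post this should list localhost and spm.example.de.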

    Does removing hosts that are failing a check for reasons other than "error sending REQ" allow the rest to succeed? What is the result if you run a host check for all but "localhost" and "spm", for instance? If removing any that are showing certain failures allows the rest to succeed, we have narrowed it down to this. If the failures begin to occur for all once reaching a certain number of hosts no matter what combination of hosts you check (yet they check fine individually), this would generally indicate a networking issue.
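The "certain number of hosts" test described above could be scripted roughly as follows. This is only a sketch under assumptions: it assumes amcheck accepts several host names after the configuration name (the thread only shows it with one), that host names contain no whitespace, and the names `AMCHECK`, `CONFIG`, `count_failures`, `first_n` and `grow_batches` are hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch: check the first 1, 2, 4, 8, ... hosts together and
# report how many selfcheck failures each batch produces.
AMCHECK="${AMCHECK:-amcheck}"   # assumption: amcheck is on PATH
CONFIG="${CONFIG:-a2g-4}"       # configuration name taken from this thread

# count_failures HOST...: run one check for the given hosts and print the
# number of "selfcheck request failed" lines in its output.
count_failures() {
    "$AMCHECK" "$CONFIG" "$@" 2>&1 | grep -c 'selfcheck request failed'
}

# first_n N HOST...: print the first N hosts, one per line.
first_n() {
    n=$1
    shift
    i=0
    for h in "$@"; do
        [ "$i" -ge "$n" ] && break
        printf '%s\n' "$h"
        i=$((i + 1))
    done
}

# grow_batches HOST...: double the batch size until all hosts are included.
grow_batches() {
    [ "$#" -gt 0 ] || return 1   # never fall back to checking every host
    size=1
    while :; do
        [ "$size" -gt "$#" ] && size=$#
        batch=$(first_n "$size" "$@")
        # Unquoted on purpose: split the batch back into separate host names.
        fails=$(count_failures $batch)
        printf '%s host(s): %s failed\n' "$size" "$fails"
        [ "$size" -ge "$#" ] && break
        size=$((size * 2))
    done
}
```

Calling `grow_batches localhost yoda.example.de didi.example.de ci.example.de` would then print one line per batch size. If failures start at roughly the same batch size regardless of the order in which the hosts are listed, that points toward the networking explanation rather than one particular host.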

    Paul
