amdump fails with planner: ERROR Request to mail.mydomain.com failed: timeout waiting



jgduke
July 14th, 2007, 08:06 AM
Hi,

I have installed Amanda 2.5.2p1 successfully. amcheck runs without reporting
any errors, but whenever I start amdump, I get the same error report after
about 15 minutes:

Hostname: mail
Org : DailySet1
Config : DailySet1
Date : July 13, 2007

These dumps were to tape DailySet1-05.
The next 4 tapes Amanda expects to use are: 4 new tapes.
The next 4 new tapes already labelled are: DailySet1-06, DailySet1-07, DailySet1-08, DailySet1-09.

FAILURE AND STRANGE DUMP SUMMARY:
mail.jgduke.dnsalias.com / lev 0 FAILED [disk /, all estimate timed out]
planner: ERROR Request to mail.mydomain.com failed: timeout waiting for REP

Looking in the planner's log file, I see the following:
planner: time 0.073: bind_portrange2: Try port 659: Available - Success
planner: time 0.073: dgram_bind: socket 3 bound to ::.659
planner: time 0.078: dgram_send_addr(addr=0x8060330, dgram=0xb7f4a344)
planner: time 0.078: (sockaddr_in6 *)0x8060330 = { 10, 10080, ::ffff:192.168.0.5 }
planner: time 0.078: dgram_send_addr: 0xb7f4a344->socket = 3
planner: time 0.082: dgram_recv(dgram=0xb7f4a344, timeout=0, fromaddr=0xb7f5a330)
planner: time 0.082: (sockaddr_in6 *)0xb7f5a330 = { 10, 10080, ::ffff:192.168.0.5 }
planner: time 0.082: dgram_recv(dgram=0xb7f4a344, timeout=0, fromaddr=0xb7f5a330)
planner: time 0.082: (sockaddr_in6 *)0xb7f5a330 = { 10, 10080, ::ffff:192.168.0.5 }
planner: time 0.082: dgram_send_addr(addr=0x8060330, dgram=0xb7f4a344)
planner: time 0.082: (sockaddr_in6 *)0x8060330 = { 10, 10080, ::ffff:192.168.0.5 }
planner: time 0.082: dgram_send_addr: 0xb7f4a344->socket = 3
planner: time 0.083: security_getdriver(name=BSD) returns 0xb7f38380
planner: time 0.083: security_handleinit(handle=0x80627b8, driver=0xb7f38380 (BSD))
planner: time 0.089: dgram_send_addr(addr=0x80627d8, dgram=0xb7f4a344)
planner: time 0.089: (sockaddr_in6 *)0x80627d8 = { 10, 10080, ::ffff:192.168.0.5 }
planner: time 0.089: dgram_send_addr: 0xb7f4a344->socket = 3
planner: time 0.094: security_close(handle=0x8060310, driver=0xb7f38380 (BSD))
planner: time 0.094: dgram_recv(dgram=0xb7f4a344, timeout=0, fromaddr=0xb7f5a330)
planner: time 0.094: (sockaddr_in6 *)0xb7f5a330 = { 10, 10080, ::ffff:192.168.0.5 }
planner: time 0.152: dgram_recv(dgram=0xb7f4a344, timeout=0, fromaddr=0xb7f5a330)
planner: time 0.152: (sockaddr_in6 *)0xb7f5a330 = { 10, 10080, ::ffff:192.168.0.5 }
planner: time 301.164: dgram_send_addr(addr=0x80627d8, dgram=0xb7f4a344)
planner: time 301.165: (sockaddr_in6 *)0x80627d8 = { 10, 10080, ::ffff:192.168.0.5 }
planner: time 301.165: dgram_send_addr: 0xb7f4a344->socket = 3
planner: time 301.165: dgram_recv(dgram=0xb7f4a344, timeout=0, fromaddr=0xb7f5a330)
planner: time 301.165: (sockaddr_in6 *)0xb7f5a330 = { 10, 10080, ::ffff:192.168.0.5 }
planner: time 601.179: dgram_send_addr(addr=0x80627d8, dgram=0xb7f4a344)
planner: time 601.179: (sockaddr_in6 *)0x80627d8 = { 10, 10080, ::ffff:192.168.0.5 }
planner: time 601.179: dgram_send_addr: 0xb7f4a344->socket = 3
planner: time 601.180: dgram_recv(dgram=0xb7f4a344, timeout=0, fromaddr=0xb7f5a330)
planner: time 601.180: (sockaddr_in6 *)0xb7f5a330 = { 10, 10080, ::ffff:192.168.0.5 }
planner: time 901.201: security_seterror(handle=0x80627b8, driver=0xb7f38380 (BSD) error=timeout waiting for REP)
planner: time 901.265: security_close(handle=0x80627b8, driver=0xb7f38380 (BSD))
planner: time 901.266: pid 25854 finish time Fri Jul 13 20:13:40 2007
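
The timestamps in this log can be checked mechanically. Below is a minimal Python sketch that lists long pauses between consecutive planner log lines. The find_gaps helper and the 60-second threshold are illustrative assumptions, not part of Amanda. On the excerpt above it would flag the three waits of roughly 300 seconds each that precede the timeout at about 900 seconds, which matches the "about 15 minutes" before the error report.

```python
import re

# Matches the "planner: time <seconds>:" prefix used in the debug log.
TIME_RE = re.compile(r"^planner: time (\d+\.\d+):")

def find_gaps(lines, threshold=60.0):
    """Return (previous_time, current_time) pairs more than `threshold` seconds apart."""
    gaps = []
    prev = None
    for line in lines:
        m = TIME_RE.match(line)
        if not m:
            continue  # lines without the timestamp prefix are skipped
        t = float(m.group(1))
        if prev is not None and t - prev > threshold:
            gaps.append((prev, t))
        prev = t
    return gaps

# A few lines taken from the log above.
sample = [
    "planner: time 0.152: dgram_recv(dgram=0xb7f4a344, timeout=0,",
    "fromaddr=0xb7f5a330)",
    "planner: time 301.164: dgram_send_addr(addr=0x80627d8, dgram=0xb7f4a344)",
    "planner: time 601.179: dgram_send_addr(addr=0x80627d8, dgram=0xb7f4a344)",
    "planner: time 901.201: security_seterror(handle=0x80627b8, driver=0xb7f38380",
]

for start, end in find_gaps(sample):
    print(f"{start:.3f} -> {end:.3f} ({end - start:.1f} s)")
```

Each flagged gap is a stretch where the planner resent its request and waited without getting a reply.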

I am using SuSE Linux 10.1. The client and the server are the same machine,
and I have even deactivated the firewall, thinking it could be causing
communication problems, but nothing helped. I have browsed through the
archive of the amanda mailing list, also without success.
I have already set etimeout to 3600 seconds (one hour), which did not help either.
I have no idea where to look next for the cause of this issue, and hope
someone can point me in the right direction.

Thanks for any help.

Regards,
jgduke

jgduke
July 15th, 2007, 07:37 AM
OK, I found the reason for this: one directory had so many subdirectories and files in it that even a du -k . returned only after several hours. I have excluded this directory, and now everything works almost as expected.
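
For anyone hitting the same symptom, here is a rough Python sketch of one way to spot such a directory without waiting on a full du run. It counts the files and directories beneath each immediate subdirectory of a starting point and lists the largest counts first. The function names count_entries and largest are made up for this illustration, and the script is not part of Amanda. Counting entries still walks the whole tree, so it can be slow too, but it skips the per-file size accounting.

```python
import os

def count_entries(root):
    """Count files and directories beneath each immediate subdirectory of root."""
    totals = {}
    with os.scandir(root) as it:
        for entry in it:
            # Only descend into real directories, not symlinks to them.
            if not entry.is_dir(follow_symlinks=False):
                continue
            n = 0
            for _dirpath, dirnames, filenames in os.walk(entry.path):
                n += len(dirnames) + len(filenames)
            totals[entry.path] = n
    return totals

def largest(root, top=5):
    """Return the `top` subdirectories of root with the most entries, largest first."""
    ranked = sorted(count_entries(root).items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top]

# Example use (the starting path is up to you):
#     for path, n in largest("/"):
#         print(n, path)
```

A subdirectory whose count is far above the rest is a candidate for excluding from the dump or for splitting into its own disk entry.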