Thread: [FreeBSD 7.1-STABLE] amtape operations hang

  1. #1
    Join Date
    May 2008
    Posts
    45

    [FreeBSD 7.1-STABLE] amtape operations hang

    When I run some amtape operations, like taper, I get a hang in Amanda/MainLoop/libMainLoop.so. It seems to be related to glib and pthreads.


    Command line: amtape lule2 taper
    Call: /usr/bin/perl5 /usr/local/libexec/amanda/chg-glue lule2

    I have tested with both Perl 5.8.8 and 5.8.9.

    Backtrace is attached.

    The hang can be worked around by building a threaded perl, but this may not be possible for all users.

    /glz

  2. #2
    Join Date
    Mar 2007
    Location
    Chicago, IL
    Posts
    688

    Hmm.. from what I can see in the backtrace, this is not a bug where perl is called in a thread (that would be bad!), but perhaps results from linking code that expects to be single-threaded (perl) with code that uses threads (glib). There are some versions of glib with versions of child_watch that contain race conditions which could trigger this kind of hang - what version of glib are you using?

    I know that FreeBSD's threading is quite a bit different from other systems, so any other suggestions you can offer as to why building a threaded perl is an effective workaround would be helpful.

  3. #3
    Join Date
    May 2008
    Posts
    45

    Hi Dustin,

    Here is the list of ports used by Amanda:
    # pkg_info -rx amanda
    Information for amanda-devel-2.6.1b2.20081222:

    Depends on:
    Dependency: mtx-1.3.11
    Dependency: python25-2.5.2_3
    Dependency: perl-threaded-5.8.9
    Dependency: pkg-config-0.23_1
    Dependency: pcre-7.8
    Dependency: libiconv-1.11_1
    Dependency: gettext-0.17_1
    Dependency: glib-2.18.4
    Dependency: gamin-0.1.10
    Dependency: gio-fam-backend-2.18.4
    Dependency: lzo2-2.03_2
    Dependency: lzop-1.02.r1
    Dependency: lzmautils-4.32.7
    Dependency: gtar-1.21


    Attached are the link lists for threaded and non-threaded perl. The only difference I see is that in the threaded version /lib/libthr.so.3 is linked with the perl main.

    It only seems to be happening in the server part: the only failure I have encountered so far is in amtape, and my clients seem to work with non-threaded perl. The clients back up using zfs send, GNU tar and FreeBSD dump.

    By the way, what is the standard on the Linux systems where you build Amanda? Looking around, some of the distributions seem to default to the threaded version.

    /glz
    Last edited by glowkrantz; January 18th, 2009 at 05:33 AM. Reason: Spelling, better syntax.

  4. #4
    Join Date
    Mar 2007
    Location
    Chicago, IL
    Posts
    688

    glib-2.18 is fairly new, so my race-condition hypothesis is less likely.

    Various Linuxes have threaded or non-threaded perls, but the thread library is invariably libpthread (rather than libthr or, worse, a distinct C library like libc_r), which does not require that all code in the process be compiled to be thread-aware. That is, it's fine to use threads from C in a non-threaded perl process, as long as you don't call any perl functions in a thread.

    Digging into the backtrace a little bit, it looks like pthread_create ends up calling calloc, which, in trying to lock its arena, calls some mutex functions, which in turn call calloc. I see this in thr_mutex.c:

    /* This function is used internally by malloc. */
    int
    _pthread_mutex_init_calloc_cb(pthread_mutex_t *mutex,
        void *(calloc_cb)(size_t, size_t))
    {

    which suggests that there is some funny business that should be going on to make libc and libthr play nicely together. I can't quite trace the logic from _pthread_mutex_trylock through _destroy and _mutexattr_init, so it's not entirely clear what's going wrong.
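
    One plausible reading of that comment is that malloc, when it has to create its own lock, hands the mutex code an allocation callback so that the mutex setup does not re-enter the regular calloc. Below is a toy sketch of that pattern only. Every name in it (toy_mutex, boot_calloc, toy_calloc, and so on) is hypothetical, and none of it is taken from libc or libthr.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy types and names; this is not libc or libthr code. */
typedef struct toy_mutex { int locked; } toy_mutex;

static toy_mutex *arena_lock = NULL;  /* lock guarding the toy arena */
static int alloc_depth = 0;           /* current nesting of toy_calloc() */
static int max_alloc_depth = 0;       /* deepest nesting observed */

static _Alignas(16) unsigned char boot_pool[1024];
static size_t boot_used = 0;

/* Bootstrap allocator: bumps through a static pool and takes no lock. */
static void *boot_calloc(size_t n, size_t size)
{
    if (size != 0 && n > (size_t)-1 / size)
        return NULL;                          /* n * size would overflow */
    size_t want = n * size;
    if (want > sizeof boot_pool - boot_used)
        return NULL;                          /* pool exhausted */
    void *p = boot_pool + boot_used;
    boot_used += (want + 15u) & ~(size_t)15u; /* keep 16-byte alignment */
    memset(p, 0, want);
    return p;
}

/* Create a mutex using whichever allocator the caller supplies. */
static int toy_mutex_init_cb(toy_mutex **m,
    void *(*calloc_cb)(size_t, size_t))
{
    *m = calloc_cb(1, sizeof **m);
    return *m != NULL ? 0 : -1;
}

/* Toy calloc that needs a lock, which itself needs memory. */
static void *toy_calloc(size_t n, size_t size)
{
    void *p;

    alloc_depth++;
    if (alloc_depth > max_alloc_depth)
        max_alloc_depth = alloc_depth;
    if (arena_lock == NULL) {
        /* Passing toy_calloc here instead would re-enter this function
         * before arena_lock is set, and recurse without end. */
        if (toy_mutex_init_cb(&arena_lock, boot_calloc) != 0) {
            alloc_depth--;
            return NULL;
        }
    }
    arena_lock->locked = 1;
    p = boot_calloc(n, size);   /* stand-in for the real arena */
    arena_lock->locked = 0;
    alloc_depth--;
    return p;
}
```

    In this sketch a first call such as toy_calloc(4, sizeof(int)) creates the lock through boot_calloc and never nests, so max_alloc_depth stays at 1. Whether the hang in the backtrace comes from this bootstrap step going wrong is exactly what is not yet clear.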

    At a higher level, this looks like one of two things:
    1. A bug in libc/libthr.
    2. Amanda, or your build of Amanda, is doing something that is known not to work on FreeBSD, and which triggers this bug-like behavior.

    I have no idea how to distinguish those two things. If it's #2, what are we doing wrong, and how can we fix it?

  5. #5
    Join Date
    May 2008
    Posts
    45

    I found this one:
    http://www.freebsd.org/cgi/cvsweb.cgi/src/lib/libc/stdlib/malloc.c?rev=1.183;content-type=text%2Fx-cvsweb-markup

    I will try to backport it to 7.1 and also try and get hold of a second tape station so I can test on an 8-CURRENT system.

    /glz

  6. #6
    Join Date
    Mar 2007
    Location
    Chicago, IL
    Posts
    688

    Any updates?

  7. #7
    Join Date
    May 2008
    Posts
    45

    I backported the patch and it didn't help, so I have created a jail, built a non-stripped debug version of Amanda and the software it depends on, and will do some debugging this weekend.

    As real life(tm) is interfering, it may be next week before I know more.

    /glz

  8. #8
    Join Date
    May 2008
    Posts
    45

    The official FreeBSD ports maintainer, Jun Kuriyama, got it working with the attached patch.

    It seems that on FreeBSD the race protection must be enabled all the time.

    /glz

    PS. For general consumption, maybe change the 1 to __FreeBSD__ if this truly is not a problem on other architectures.

    PPS. Now tested on FreeBSD 6.4 and 7.1.
    Last edited by glowkrantz; February 4th, 2009 at 01:21 AM. Reason: Spelling, PS, PPS

  9. #9
    Join Date
    Mar 2007
    Location
    Chicago, IL
    Posts
    688

    Awesome! We certainly haven't seen this kind of failure on other systems, so perhaps this is because of some assumption that newer versions of Glib make regarding child signals or other racy conditions - assumptions that don't hold on FreeBSD. I'm happy to apply a __FreeBSD__ conditional and call it a day.
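
    For anyone reading without the attachment, a __FreeBSD__ conditional of the kind discussed here might look roughly like the sketch below. The patch itself is not reproduced in this thread, so the macro name ALWAYS_PROTECT_CHILD_WATCH, the function use_race_protection and its glib_has_known_race parameter are all hypothetical stand-ins, not Amanda's actual identifiers.

```c
#include <assert.h>

/* Hypothetical names; the attached patch is not reproduced in this thread. */
#if defined(__FreeBSD__)
#  define ALWAYS_PROTECT_CHILD_WATCH 1   /* FreeBSD: always take the safe path */
#else
#  define ALWAYS_PROTECT_CHILD_WATCH 0   /* elsewhere: decide per glib version */
#endif

/* Returns 1 when the race-protected code path should be used, else 0.
 * glib_has_known_race is an assumed flag for "this glib has the old bug". */
static int use_race_protection(int glib_has_known_race)
{
    return ALWAYS_PROTECT_CHILD_WATCH || glib_has_known_race;
}
```

    Testing defined(__FreeBSD__) rather than a hard-coded 1 leaves other systems on their existing behavior while forcing the protected path on FreeBSD.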
