Thread lockup problem possibly nfs related. Solaris 8 - Sun Solaris

  1. #1

    Thread lockup problem possibly nfs related. Solaris 8

    Hi, I have a problem with a threaded application on Solaris. It
    keeps 'locking up'; a pstack reveals that all the threads are stuck
    in this state:

    ff91f160 lwp_sema_wait (f430de30)
    ff7f9ac4 _park (f430de30, ff81e000, 0, f430dd70, 81010100, 0) + 114
    ff7f94c0 _swtch (f430dd70, 0, ff81e000, 5, 1000, 0) + 158
    ff7f83bc _cond_wait (f430dd70, 4356, ff81e000, ff8209b0, ff940450, f43091b4) + d4
    ff7fc568 rw_rdlock (ff8298e4, 5000, ff81e000, 5257, f43093a2, ff940450) + d8
    ff911620 _realbufend (0, ff93c004, c053f8, ff919678, ff904618, 0) + 30
    ff904868 _doprnt (0, f4309294, c053f8, ff93c004, f4308d33, 2b4590) + d4
    ff907cb0 fprintf (c053f8, 2b4590, ff943a4c, ff93fee0, 48a5, f4309778) + e8

    Now it's possible one thread may be writing to an NFS drive and thus
    might reasonably be blocked, but this specific thread is NOT writing
    to an NFS volume, yet it has been locked for at least 30 seconds. All
    threads seem to block on rw_rdlock as soon as any IO function is
    called (even vsnprintf will block, as it seems to use the IO system).

    Is it possible that Solaris has a bug where the library takes out a
    global mutex of some kind before doing certain file IO, so that one
    thread talking to NFS can block all other threads?

    Here is another example thread:

    ff91f160 lwp_sema_wait (f1c0fe30)
    ff7f9ac4 _park (f1c0fe30, ff81e000, 0, f1c0fd70, 0, 0) + 114
    ff7f94c0 _swtch (f1c0fd70, 0, ff81e000, 5, 1000, 0) + 158
    ff7f83bc _cond_wait (f1c0fd70, 4356, ff81e000, ff8209b0, ff940460, 0) + d4
    ff7fc690 pthread_rwlock_wrlock (ff940460, ff81e000, ff940460, ff8209b0, ff940430, ff91738c) + b4
    ff9113e0 _findiop (0, ff93c004, 0, 0, 1bb9c, ff91738c) + 28
    ff911e38 fopen (f1bfd7c8, 2687e0, f1bfd7c8, 6434, 220c8, 560f4) + 4

    This one 'is' talking to an NFS drive.

    Any advice appreciated. I could rewrite all my IO to NFS drives to
    have timeouts, but this strikes me as only a partial patch for a fault
    in the libraries, as it would just decrease the time that all threads
    stay stuck. Surely I must be wrong about the cause?

    Thanks in advance.
    Chris Guest

  2. #2

    Re: Thread lockup problem possibly nfs related. Solaris 8

    co.nz (Chris Pugmire) writes:
     
     

    Blocking in NFS would show you blocked in an I/O system call;
    what you're seeing here is being blocked in the innards of stdio.
     
     
     
     

    No it isn't; again it's blocked on a rw lock; again inside stdio.
     

    Some thread in your application appears to be holding a r/w lock which is
    essential for stdio to make progress.

    Are you using printf or such from signal handlers? Are you using longjmp
    out of signal handlers?

    Or do you suspend threads?

    In S9 we rewrote the stdio locking primitives so that stdio no longer
    needs to acquire the locks quite as often.

    Casper
    --
    Expressed in this posting are my opinions. They are in no way related
    to opinions held by my employer, Sun Microsystems.
    Statements on Sun products included here are not gospel and may
    be fiction rather than truth.
    Casper Guest
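
On the signal-handler question above: a minimal C sketch, assuming a POSIX system, of a handler that stays out of stdio altogether. The names `got_signal`, `on_signal` and `install_handler` are hypothetical and do not come from the original application. The handler only sets a flag and calls write(), which is async-signal-safe; printf() is not, because it may need stdio locks that the interrupted thread already holds.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical flag: set by the handler, examined by normal code later. */
static volatile sig_atomic_t got_signal = 0;

/* Handler limited to async-signal-safe work: a flag store and write(). */
static void on_signal(int signo)
{
    static const char msg[] = "signal received\n";
    int saved_errno = errno;        /* write() may clobber errno */

    (void)signo;
    got_signal = 1;
    if (write(STDERR_FILENO, msg, sizeof msg - 1) < 0) {
        /* Nothing safe to do about a failed write in here. */
    }
    errno = saved_errno;
}

/* Install on_signal for signo; returns 0 on success, -1 on error. */
static int install_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}
```

A caller would run `install_handler(SIGUSR1)` once at startup and do any printf-style reporting from ordinary code after noticing that `got_signal` is set.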

  3. #3

    Re: Thread lockup problem possibly nfs related. Solaris 8

    > Some thread in your application appears to be holding a r/w lock which is 

    Yes, I'm using printf in some odd places. I thought this was OK in a
    detached process, since the printfs go to /dev/null. Here is the
    detach code I use, in case it's relevant. I will remove the printfs now.

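    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    #ifndef _PATH_DEVNULL
    #define _PATH_DEVNULL "/dev/null"
    #endif

    /* Detach from the controlling terminal and carry on in a child process. */
    int lib_detach(void)
    {
        int fd;
        int noclose = 0;

        fflush(NULL);           /* flush all stdio buffers before forking */
        switch (fork()) {
        case -1:
            printf("Detach(), fork failed %s\n", strerror(errno));
            return (-1);
        case 0:
            break;              /* child continues below */
        default:
            _exit(0);           /* parent exits immediately */
        }

        if (setsid() == -1) {
            printf("Detach(), child setsid failed %s\n", strerror(errno));
            return (-1);
        }

        /* Redirect stdin, stdout and stderr to /dev/null. */
        if (!noclose && (fd = open(_PATH_DEVNULL, O_RDWR, 0)) != -1) {
            (void)dup2(fd, STDIN_FILENO);
            (void)dup2(fd, STDOUT_FILENO);
            (void)dup2(fd, STDERR_FILENO);
            if (fd > 2)
                (void)close(fd);
        }
        return (0);
    }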


     
    nope.

    p.s. thank you so much for answering, I've been beating my head against this
    for 3 weeks now :-)
    Chris Guest

  4. #4

    Re: Thread lockup problem possibly nfs related. Solaris 8

    co.nz (Chris Pugmire) writes:
     

    Is it the child of the fork() which hangs? If so, remove the printfs.

    After fork(), only async-signal-safe functions may be called [in the child].

    Casper
    Casper Guest
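
To illustrate the advice above, here is a hedged C sketch of a detach routine whose child calls only async-signal-safe functions. `write_msg` and `detach_sketch` are hypothetical names, not part of the code posted earlier, and the error text is fixed because strerror() is not guaranteed to be async-signal-safe. It assumes a POSIX system and is one possible shape, not a confirmed fix for the hang.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: emit a fixed message using only write(2). */
static void write_msg(int fd, const char *msg)
{
    size_t len = 0;

    while (msg[len] != '\0')    /* manual length count, no library calls */
        len++;
    if (write(fd, msg, len) < 0) {
        /* Nothing async-signal-safe left to try. */
    }
}

/* Hypothetical variant of lib_detach() with no stdio use after fork(). */
static int detach_sketch(void)
{
    int fd;

    switch (fork()) {
    case -1:
        write_msg(STDERR_FILENO, "detach: fork failed\n");
        return -1;
    case 0:
        break;                  /* child continues below */
    default:
        _exit(0);               /* parent exits without touching stdio */
    }

    if (setsid() == -1) {
        write_msg(STDERR_FILENO, "detach: setsid failed\n");
        return -1;
    }

    /* Redirect the standard descriptors to /dev/null. */
    fd = open("/dev/null", O_RDWR);
    if (fd != -1) {
        (void)dup2(fd, STDIN_FILENO);
        (void)dup2(fd, STDOUT_FILENO);
        (void)dup2(fd, STDERR_FILENO);
        if (fd > 2)
            (void)close(fd);
    }
    return 0;
}
```

The caller can still run fflush(NULL) before calling `detach_sketch()`, as the posted code does, since that happens in the parent before the fork.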
