Tru64 w command reports an idle user which doesn't exist ? Is utmp corrupt ? - Linux / Unix Administration

  1. #1

    Tru64 w command reports an idle user which doesn't exist ? Is utmp corrupt ?

    System is Tru64 5.1B running on a cluster.

    We have monitoring scripts checking for idle users which picked up the
    following:

    [4]swips2:/usr/users/jamesb # w |egrep "[d]ays|User"
    11:58 up 29 days, 2:44, 69 users, load average: 3.10, 3.09, 3.04
    User tty from login idle JCPU PCPU what
    YENQP1 pts/162 192.168.16.60 23:06 11days

    This suggests there should be an idle process for this connection, but
    there isn't:

    [4]swips2:/usr/users/jamesb # ps -ef | grep [Y]ENQ
    [4]swips2:/usr/users/jamesb #


    I think this means that the utmp file is incorrect. I cannot reboot
    this machine (it's a 24/7 service and this is only a minor irritation),
    so is there a way to refresh/rebuild utmp or fix this another way ?

    Thanks.
    James Guest

  2. #2

    Re: Tru64 w command reports an idle user which doesn't exist ? Is utmp corrupt ?

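    James Blackmore wrote:
    > We have monitoring scripts checking for idle users which picked up the
    > following:
    > [...]
    > User tty from login idle JCPU PCPU what
    > YENQP1 pts/162 192.168.16.60 23:06 11days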

    Not a particularly good method to use, as you have
    discovered. What you have encountered is common to
    all versions of Unix and Unix-alikes as far as I know.
     
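    > This suggests there should be an idle process for this connection, but
    > there isn't:
    > [...]
    > I think this means that the utmp file is incorrect,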

    Define "correct". Login sessions log their start in
    utmp. Login sessions that exit gracefully log their
    exit in utmp. Login sessions that are killed
    ungracefully do not log their exit in utmp.

    The most typical way these ghosts are created is someone
    exiting their windowing session without exiting their
    login sessions first. It used to happen under various
    X11 window managers but eventually they switched to
    more graceful kill methods. It still happens when
    folks exit their Windows login while a Unix window is
    open.
     
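    > I cannot reboot
    > this machine (it's a 24/7 service and this is only a minor irritation),
    > so is there a way to refresh/rebuild utmp or fix this another way ?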

    First thing, you already know this is an issue so modify
    your script. If there are no processes, move on to the
    next user.
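A minimal sketch of that check, assuming the `w` column order shown in the first post (user, tty, from, login, idle) and that `ps -t` accepts the tty name as printed by `w`. The function names are made up for illustration, and a login with an empty "from" column would shift the fields.

```shell
#!/bin/sh
# Print "user tty" for every w(1) line whose idle column is measured in days.
# Assumes the idle value is the fifth whitespace-separated field.
idle_ttys() {
    awk '$5 ~ /days$/ { print $1, $2 }'
}

# Succeed if at least one process is attached to the given tty.
# The ps header line is dropped before checking for any remaining output.
tty_has_procs() {
    ps -t "$1" 2>/dev/null | sed 1d | grep -q .
}

# Read w(1) output on stdin; report idle users that still own processes
# and skip ghost utmp entries that have none.
report_idle() {
    idle_ttys | while read -r user tty; do
        if tty_has_procs "$tty"; then
            echo "idle: $user on $tty"
        else
            echo "ghost utmp entry, skipped: $user on $tty" >&2
        fi
    done
}

# Usage: w | report_idle
```

With this shape, the YENQP1 entry on pts/162 would be reported as a ghost on stderr rather than as an idle user.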

    Next thing, if there are plenty of logins on the host
    they eventually clean up on their own. Each login
    session takes an unused pty and when there are enough
    logins to reach the pty with the ghost, the ghost goes
    away.

    Last thing, if you really want to clean-up utmp, there
    are various programs on various freeware sites. Look
    for "fix utmp" and so on on your favorite freeware
    site.

    Doug Guest

  3. #3

    Re: Tru64 w command reports an idle user which doesn't exist ? Is utmp corrupt ?

    Thanks Doug,
     

    Thanks, I didn't realise this, and this script has been running for 2
    years and this is the first time this has occurred so I can only assume
    Wintegrate/Powerterm (the common clients here) are reasonably good at
    exiting cleanly even when the window is closed.
     

    Thanks, I will do this. I might even drop the w | grep days altogether
    and use STIME on a ps -ef, something like:

    ps -ef | grep "`date +'%b'`" | egrep -v "`date +'%b %e'`"
     

    I don't understand why this didn't happen then, as we have several
    hundred logins a day, so surely it should have been re-used. From
    midnight to 9am today we have already had 200 logins, and the busy
    time starts at 9am, so in 11 days we should have had several thousand
    logins ?

    [4]swips2:/ # date
    Fri Mar 18 08:47:16 GMT 2005
    [4]swips2:/ # last | tail -1
    wtmp begins Fri Mar 18 00:02
    [4]swips2:/ # last | grep -v ftp | wc
    197 1956 14481

    I wonder if the pty is not properly returned to the 'free list' in this
    'unclean exit' case, otherwise this would have been cleaned up within
    11 days, I think.
     

    Thanks, but user accounting information is not too critical, so once I
    was sure this was just an 'incorrect' utmp file I flushed it with
    logclean.

    All users are kicked out for a nightly 2am backup anyway, so I can
    easily check for any 'idle' sessions manually from before then which
    stayed up, and the accounting info will be correct from now on (till
    the next time).

    Thanks for the response though, all very useful info !

    James.
    James Guest

  4. #4

    Re: Tru64 w command reports an idle user which doesn't exist ? Is utmp corrupt ?

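    James Blackmore wrote:
    > Thanks, I didn't realise this, and this script has been running for 2
    > years and this is the first time this has occurred so I can only assume
    > Wintegrate/Powerterm (the common clients here) are reasonably good at
    > exiting cleanly even when the window is closed.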

    Sounds like it. If you're using W2K, most of the
    time folks will log out and it appears that is
    being handled gracefully. It looks like someone
    powered off without logging out or some sort of
    application crash happened.
     
    > I don't understand why this didn't happen then, as we have several
    > hundred logins a day, so surely it should have been re-used. From
    > midnight to 9am today we have already had 200 logins, and the busy
    > time starts at 9am, so in 11 days we should have had several thousand
    > logins ?

    It isn't just the number of logins that
    determines pty recycling. Each session tends
    to use the lowest-numbered available pty,
    though occasionally a race condition will make
    a session skip a couple. So what really counts
    for reclaiming these ghosts is the peak number
    of concurrent sessions, not the raw number of logins.
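The lowest-available rule can be illustrated with a small sketch. It assumes pty numbers start at 0 and ignores the race condition mentioned above; `lowest_free_pty` is a hypothetical helper, not a system command.

```shell
#!/bin/sh
# Given the pty numbers currently in use as arguments, print the lowest
# number that is free. Assumes numbering starts at 0.
lowest_free_pty() {
    printf '%s\n' "$@" | sort -n | awk '
        BEGIN { want = 0 }
        NF && $1 == want { want++ }   # this number is taken, try the next
        END { print want }'
}

# Usage: lowest_free_pty 0 1 2 4    # prints 3
```

Under this rule a new login only lands on pts/162 when numbers 0 through 161 are all in use at the same moment, which is why thousands of short logins on a quieter day never touch the ghost.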
     

    It's just a missing entry in utmp and the ownership
    of the device pair in /dev. There's not all that much
    clean-up involved. No process has the device open any
    longer, so a scan will show it as available.

    So what I think happened: Your app is usually good
    about exiting gracefully so the logout gets logged in
    utmp. On this occasion there was an application
    crash, or a kill -9 rather than -15, or a power off
    without logout or similar. It happened to be a
    session with a high pty number because the login
    took place during a monthly peak.
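The difference between kill -15 and kill -9 can be sketched with a child shell that traps SIGTERM. This is only an analogy: the marker file is a hypothetical stand-in for the logout record in utmp, and `demo_signal` is a made-up name.

```shell
#!/bin/sh
# demo_signal SIGNAL MARKER
# Starts a child shell that writes MARKER when it receives SIGTERM, sends it
# SIGNAL (TERM or KILL), and reports whether the cleanup handler ran.
demo_signal() {
    sig=$1
    marker=$2
    rm -f "$marker"
    sh -c 'trap "echo logged-out > \"$0\"; exit 0" TERM
           while :; do sleep 1; done' "$marker" &
    pid=$!
    sleep 1                        # let the child install its trap
    kill -s "$sig" "$pid"
    wait "$pid" 2>/dev/null || :   # a killed child has a non-zero status
    if [ -f "$marker" ]; then
        echo "cleanup ran"
    else
        echo "no cleanup"
    fi
    rm -f "$marker"
}

# Usage:
#   demo_signal TERM /tmp/ghost_demo
#   demo_signal KILL /tmp/ghost_demo
```

SIGTERM gives the process a chance to run its handler, while SIGKILL cannot be caught, so whatever bookkeeping the handler would have done is simply skipped.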
     
    > Thanks, but user accounting information is not too critical, so once I
    > was sure this was just an 'incorrect' utmp file I flushed it with
    > logclean.


    Yup. Logclean is just fine for utmp clean-up.
    As long as there aren't any processes you know
    it is really available.

    Doug Guest
