keeping ntpd running - Sun Solaris

  1. #1

    keeping ntpd running

    apologies if I'm really missing something obvious here...

    we have systems running ntpd started by startup script.

    Typically the process dies at some time and the box's date stamp "erodes" over time.

    So - how to keep the ntpd daemon running even when it dies?

    inittab seems to be the answer thought I and so used the following


    ntpd:23:respawn:/usr/bin/ntpd


    .... then ran init q of course.

    Thing is when I kill off ntpd ... for testing purposes... it doesn't respawn...


    so what am I doing wrong?


    or am I barking up the proverbial trees? and if so... any obvious solutions?

    cheers

    ian
    Ian Guest

  2. #2

    Re: keeping ntpd running

    Ian Diddams wrote 

    Generally when ntpd dies it's because there's some kind of
    configuration problem. It finds it can't stay synchronized,
    so it decides there is no point in continuing to run and
    exits.

    I suggest that you focus your efforts on figuring out why
    ntpd is exiting. For example, you may need to use "tickadj"
    to make a coarse modification to the system's time-keeping,
    so that it is close enough that ntpd can make up the rest.
    (ntpd does not work well if the clock is *really* off.)

    Probably the best thing to do is to start looking for error
    messages in ntpd's log files.

    - Logan

    Logan Guest

  3. #3

    Re: keeping ntpd running

    If we had a problem like this we would write a script to check that ntpd is
    running and put it in cron and run it every hour or so.
    There is something called daemontools which is designed for this purpose.
    http://cr.yp.to/daemontools/faq.html

    I believe it's written by the author of qmail.
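A minimal sketch of the cron approach described above, assuming a POSIX shell (on older Solaris that would be /usr/xpg4/bin/sh or ksh rather than /bin/sh). The check command and the start script path are assumptions, not something taken from this thread, and both can be overridden through the environment.

```shell
#!/bin/sh
# Watchdog sketch: restart ntpd when it is not running.
# ASSUMPTIONS (hypothetical): the two default commands below. The bundled
# Solaris daemon is called xntpd, so adjust the process name and the
# start script path to match your system.
: "${NTPD_CHECK_CMD:=pgrep -x ntpd}"
: "${NTPD_START_CMD:=/etc/init.d/ntpd start}"

check_ntpd() {
    # The check command exits 0 when a matching process exists.
    if $NTPD_CHECK_CMD >/dev/null 2>&1; then
        echo "ntpd is running"
    else
        echo "ntpd is not running; restarting"
        $NTPD_START_CMD
    fi
}

# Only act when invoked as "<script> run", so the function can also be
# sourced by other scripts without side effects.
if [ "${1:-}" = "run" ]; then
    check_ntpd
fi
```

Saved under a hypothetical path such as /usr/local/bin/check_ntpd.sh, an hourly crontab entry would look like `0 * * * * /usr/local/bin/check_ntpd.sh run`. Note that this only restarts the daemon; it does not explain why it died.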


    George


    george Guest

  4. #4

    Re: keeping ntpd running

    Ian Diddams <com> wrote: 
     
     

    How does it die? Do you get a message in /var/adm/messages? If so,
    what does it say?

    Are you using a locally compiled version of ntpd? If so, what version?
    (Solaris still ships the old xntpd stuff).
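One rough way to answer the first question is to pull the daemon's lines out of the system log. This is only a sketch, assuming a POSIX shell and a syslog-style text file; the default path is the one mentioned above, and the function name is made up for illustration.

```shell
#!/bin/sh
# Print ntpd/xntpd lines from a syslog-style file.
# The default path is the one named in this post; pass another file as $1.
ntp_log_lines() {
    logfile="${1:-/var/adm/messages}"
    # -i makes the match case-insensitive; "ntpd" also matches "xntpd".
    grep -i 'ntpd' "$logfile"
}

# Only search the real log when invoked as "<script> run".
if [ "${1:-}" = "run" ]; then
    ntp_log_lines
fi
```

If nothing shows up, the daemon may be logging somewhere else, depending on how syslog is configured on the box.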

    --
    Darren Dunham com
    Unix System Administrator Taos - The SysAdmin Company
    Got some Dr Pepper? San Francisco, CA bay area
    < This line left intentionally blank to confuse you. >
    Darren Guest

  5. #5

    Re: keeping ntpd running

    Logan Shaw <rr.com> wrote in message news:<8oTbb.71638$austin.rr.com>...
     


    Hmmm... I don't doubt your thinking but I can't help feeling a solution
    to restart it would be more time effective than tracking down why ntpd
    dies intermittently/sporadically on any box at random when 50+ other
    "clone" boxes are all running quite happily at the same time... it
    doesn't die after 20 seconds say... it might be up for a few hours,
    or days or weeks... then drop out. Meanwhile every other
    similar-in-every-way box continues quite happily ... until it drops
    out ... while every other exactly similar box blah blah blah...

    cheers

    Ian
    Ian Guest

  6. Moderated Post

    Re: keeping ntpd running

    Removed by Administrator
    Dan Guest
    Moderated Post

  7. #7

    Re: keeping ntpd running

    Ian Diddams <didds2(at)excite.com> wrote:
    ::
    :: we have systems running ntpd started by startup script.
    :: Typically the process dies at some time
    :: ^^^^^^^^^^^^
    :: and the box's date stamp "erodes" over time.
    :: So--how to keep the ntpd daemon running even when it dies?


    Have you checked to see precisely when the "some time" is?
    Check to see if ntpd runs for about 20 minutes after reboot
    and then quits. Just to quickly see if it is running, do

    $ ntpq -p

    which will report "connection refused" if ntpd is not there.
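The same check can be scripted. The sketch below assumes a POSIX shell and relies only on the "connection refused" text described above; the exact wording may differ between ntpq versions, and the function name is made up for illustration.

```shell
#!/bin/sh
# Decide from ntpq's output whether ntpd answered.
# ASSUMPTION: a missing daemon shows up as "connection refused" somewhere
# in the output text; anything else is treated as "up".
ntpd_state() {
    case "$1" in
        *[Cc]onnection\ refused*) echo "down" ;;
        *) echo "up" ;;
    esac
}

# Run the real query only when invoked as "<script> run".
if [ "${1:-}" = "run" ]; then
    command -v ntpq >/dev/null 2>&1 || { echo "ntpq not found" >&2; exit 2; }
    ntpd_state "$(ntpq -p 2>&1)"
fi
```

Running it every few minutes after a reboot and noting when the answer flips from "up" to "down" would show whether the 20-minute pattern applies.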

    If ntpd is found to run for 20 minutes and then quit, what's
    going on is probably this: If the ntpd daemon, when starting
    up, finds that the system's internal clock is more than 1000
    seconds different from the time ticks received from external
    sources, ntpd wants a human to figure out why and make an
    intentional clock change; the daemon is programmed not to
    simply trust the external ticks and change the system's clock
    on its own. So the daemon rather quietly logs a message:

    time error is way too large (set clock manually)

    and then it gives up. Without updating the clock. If left
    to its own devices, it will never succeed, and the system's
    time setting will erode until you have the Temporal Dust Bowl.

    Sample-to-sample variation in clock chips, even in PCs that
    are allegedly identical, could cause just one machine out of
    50 to have a "bad enough" clock that the time error grows
    too large for ntpd to deal with.

    If you have the "ntpdate" program, you may want to invoke
    it in your start-up script before you try to start "ntpd".
    The idea is to set the system clock approximately correct
    before handing the ball to ntpd. (Solaris does something
    like this when starting its xntpd program.) Or use
    "ntpd -g". See the NTP FAQ:

    http://www.ntp.org/ntpfaq/NTP-s-trouble.htm#AEN4599
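A start-up script along those lines might look like the sketch below. It assumes a POSIX shell; the server name and the ntpdate path are placeholders, and /usr/bin/ntpd is simply the path quoted in the original post.

```shell
#!/bin/sh
# Startup sketch: set the clock roughly right once, then start the daemon.
# ASSUMPTIONS (hypothetical): the server name and the command paths below.
: "${NTP_SERVER:=ntp.example.com}"
: "${NTPDATE_CMD:=/usr/sbin/ntpdate}"
: "${NTPD_CMD:=/usr/bin/ntpd}"

start_ntp() {
    # One-shot correction so ntpd starts out within its 1000-second limit.
    if ! $NTPDATE_CMD "$NTP_SERVER"; then
        echo "ntpdate failed; starting ntpd anyway" >&2
    fi
    $NTPD_CMD
}

# Only touch the clock when invoked as "<script> run".
if [ "${1:-}" = "run" ]; then
    start_ntp
fi
```

With a newer ntpd, setting NTPD_CMD to "/usr/bin/ntpd -g" is the alternative mentioned above, in which case the ntpdate step could be dropped.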

    Sun distributes some Blueprints which give advice on how to
    configure the NTP package distributed with Solaris, most of
    which applies to the newer releases of NTP as well.

    http://www.sun.com/blueprints/0701/NTP.pdf
    http://www.sun.com/blueprints/0801/NTPpt2.pdf
    http://www.sun.com/blueprints/0901/NTPpt3.pdf

    ...RSS

    --
    /usr/xpg4/bin/date '+%C%y-%m-%d_%H:%M:%S'
    Richard Guest

  8. #8

    Re: keeping ntpd running

    "Richard S. Shuford" <stratagy.REM0VE-THlS-PART.com>
    wrote in
     


    Interesting!

    So you're saying that a system might run for several days apparently
    quite happily, but in all that time the clock is actually getting
    further away from centralised time rather than closer to it until such
    time as it is > 1000 seconds out, at which time it fails/dies/stops/gives
    up/stops running?

     

    Yeah - this is done. So at boot/ntpd restart via script, the system
    is presumably using centralised time..


    ian
    Ian Guest

  9. #9

    Re: keeping ntpd running



    Ian Diddams wrote: 
    > Interesting!
    >
    > So you're saying that a system might run for several days apparently
    > quite happily, but in all that time the clock is actually getting
    > further away from centralised time rather than closer to it until such
    > time as it is > 1000 seconds out, at which time it fails/dies/stops/gives
    > up/stops running?

    I don't think this is quite what Richard is saying. If you have a server
    that is giving bad time, or the current time on the system is greater than
    1000 seconds off of the correct time, if you are using xntpd and did not
    run ntpdate prior to starting it, it may exit shortly after starting,
    giving the message above. After that, the system clock is free running,
    since xntpd is not running.

    This scenario is less likely with ntpd if the -g option is used as well.
    In this case, ntpd will act as its own ntpdate.

    A big problem at start up of xntpd and ntpd is that the mitigation algorithms
    are initially crippled in the interest of a fast start, and may not detect
    a falseticker and will happily sync to it if it is the first server that becomes
    available. Suppose you have a system that is 500 seconds off, and there are four
    servers available, one of which is 1001 seconds off in the same direction. If the
    timing just happens to be right and the falseticker is the first to reach usability
    after 5 polls, the system will step the clock to match the falseticker and will then
    reset. After 5 more polls, the truechimers will vote the falseticker off the island,
    and the system will now want to reset to the correct time, but since this is not the
    first sync, the 1000 second limit will come into play and ntpd (or xntpd) will exit.
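The numbers in that example can be worked through in a few lines of shell. This is only arithmetic on the figures above, assuming a POSIX shell; it does not model ntpd itself, and the function names are made up for illustration.

```shell
#!/bin/sh
# Worked numbers for the falseticker scenario. All values are in seconds.
PANIC_LIMIT=1000   # the 1000-second limit discussed in this thread

check_step() {
    # $1 = label, $2 = size of the requested step (may be negative)
    size=$2
    [ "$size" -lt 0 ] && size=$(( 0 - size ))
    if [ "$size" -gt "$PANIC_LIMIT" ]; then
        echo "$1: ${size}s, over the limit, daemon exits"
    else
        echo "$1: ${size}s, within the limit, clock is stepped"
    fi
}

simulate() {
    # $1 = local clock error, $2 = falseticker error, both relative to
    # true time and in the same direction.
    check_step "first sync" $(( $2 - $1 ))
    # The local clock now carries the falseticker's full error, so the
    # truechimers ask for a correction of that size.
    check_step "second sync" "$2"
}

simulate 500 1001
# prints:
# first sync: 501s, within the limit, clock is stepped
# second sync: 1001s, over the limit, daemon exits
```

The point of the sketch is that neither clock error looks fatal at the first sync; it is the step to the falseticker that pushes the later correction over the limit.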

    --
    blu

    Lesson from the blackout of 2003:
    The power grid is THE most critical infrastructure, upon which all
    others depend, and nobody really knows how it works.
    --------------------------------------------------------------------------------
    Brian Utterback - Solaris Sustaining (NFS/Naming) - Sun Microsystems Inc.,
    Ph/VM: 781-442-1343, Em:brian.utterback-at-ess-you-enn-dot-kom

    Brian Guest

  10. #10

    Re: keeping ntpd running

    Brian,

    I'm glad you seized on that bone, which has been bothering me for some
    time. There are two thresholds implemented in NTPv4 (ntpd) just for this
    problem, minclock and minsane, both arguments to the tos command. These
    have been around for a while, but are probably badly documented. The minclock
    threshold is used by the clustering algorithm as it casts off outlier
    servers until the total remaining is not more than this value. At the
    moment, minclock defaults to three mostly for historic reasons. From
    Byzantine agreement principles, it really should be four.

    The interesting threshold is minsane, which is the minimum number of
    survivors necessary to declare the client synchronized. It defaults to
    one in the interest of fast synchronization, but really should be
    something higher like four, assuming that number of servers can always
    be found. If minclock and minsane are both set to four and some greater
    number of servers, like six, were available, once several samples have
    been collected from each of at least four servers, the clock would be
    set. As more servers are found, the best four of them would survive to
    set the clock. This would be the ideal configuration from a
    sanity/antiterrorist point of view, but if this were the default case
    the volume of confused mail on this list would easily double.

    So, stick a "tos minclock 4 minsane 4" in your configuration file along
    with six servers and watch the fun. But, please note, no such feature is
    in NTPv3 (xntpd).
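As a configuration fragment, that suggestion would look roughly like the following. The six server names are placeholders, not real hosts; only the tos line comes from the text above.

```conf
# ntp.conf fragment for NTPv4 (ntpd); not understood by NTPv3 (xntpd).
# The server names below are hypothetical placeholders.
server ntp1.example.com
server ntp2.example.com
server ntp3.example.com
server ntp4.example.com
server ntp5.example.com
server ntp6.example.com

# Keep at least four survivors in the clustering algorithm, and require
# four survivors before the client is declared synchronized.
tos minclock 4 minsane 4
```

With fewer than four usable servers, this configuration would be expected to leave the client unsynchronized, which is the trade-off described above.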

    Dave

    David Guest
