
Sudden reboot - Sun Solaris


  1. #1

    Re: Sudden reboot

    [[ This message was both posted and mailed: see
    the "To," "Cc," and "Newsgroups" headers for details. ]]


    There are two ways to reboot UNIX (in theory):
    1 - Go to run level 6 (commanded by root or power monitor).
    2 - Kernel executed panic() (will eventually hit run level 6).

    That is a very short list. The lack of a log entry excludes a root
    command and the power monitor.

    panic() is executed to "bail" when a "bad thing" happens. Bad things
    usually occur in situations where software reaches an invalid state.
    This is usually due to the following.
    A - program error (buggy driver or function)
    B - hardware error (broken or incompatible hardware)

    This is also a very short list.

    Intermittent problems (such as this one) can only be captured using
    crash files.

    When an application executes panic(), it issues an API call to the
    kernel that immediately suspends the process (zombie). The kernel
    records the application's memory in a crash file for debugging, and
    then the process is harvested. That file can be used to debug the
    application. When the kernel itself performs panic(), it can no longer
    handle things like file systems, so crash files are recorded by an
    entirely different means, but the usage is identical
    (troubleshooting).

    The panic() function in the kernel calls a dump routine which
    stores kernel RAM in the swap partition. On the next boot (which
    happens right away), the crash utility is run. If it finds a kernel
    dump in the swap partition, it compresses the dump and saves it to a
    file (probably in /var or /adm). This occurs before virtual memory is
    started.
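
To check after the reboot whether a dump was actually saved, a small script can scan the crash directory. This is a minimal sketch, not part of the original advice. It assumes the Solaris savecore convention of paired files named unix.N and vmcore.N; the function name find_crash_dumps is hypothetical, and the directory is a parameter because the location depends on the dump configuration.

```python
import os
import re

# Assumption: saved dumps appear as paired files unix.N and vmcore.N.
DUMP_NAME = re.compile(r"^(unix|vmcore)\.(\d+)$")


def find_crash_dumps(crash_dir):
    """Return the sorted dump numbers N that have both unix.N and vmcore.N."""
    seen = {}
    for name in os.listdir(crash_dir):
        match = DUMP_NAME.match(name)
        if match:
            kind, number = match.group(1), int(match.group(2))
            seen.setdefault(number, set()).add(kind)
    # Only a complete pair is usable by a debugger.
    return sorted(n for n, kinds in seen.items() if kinds == {"unix", "vmcore"})
```

Calling find_crash_dumps on the configured crash directory returns an empty list when no complete dump pair was saved, which suggests the dump device or crash directory still needs to be configured.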

    The kernel may need to be reconfigured or rebuilt to obtain a crash
    file, and a local disk needs to be connected (kernel dump will probably
    not work over NFS).

    The debug utility is used to analyze the crash file (probably kdb). You
    will need to specify the location of the kernel file and the location
    of the crash file. There are four important functions required for
    debugging (use "?" to show a list within the debugger).
    Process list
    Process selection
    Stack walkback
    Register dump

    First, dump out the process list. Second, select each process (one at a
    time) and display the stack and registers. You are looking for the
    register set for a process that shows panic() in the stack. This might
    be the "active" process (the last one running).
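
The search described above can be sketched in code once the stacks have been copied out of the debugger. This is only an illustration: the dictionary layout, the process ids, and every function name other than panic are invented, and find_panic_processes is a hypothetical helper, not a debugger command.

```python
def find_panic_processes(stacks):
    """Return the ids of processes whose stack contains a panic frame.

    stacks maps a process id to its stack, given as a list of function
    names (an assumed layout, filled in by hand from debugger output).
    """
    return sorted(
        pid
        for pid, frames in stacks.items()
        if any("panic" in frame for frame in frames)
    )


# Invented sample data for illustration only.
sample_stacks = {
    0: ["swtch", "sched"],
    312: ["panic", "trap", "example_drv_intr", "idle"],
    415: ["cv_wait", "read"],
}
print(find_panic_processes(sample_stacks))
```

With the sample data, only process 312 is reported, so that is the process whose registers and stack deserve a closer look.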

    The chain of events that ended up at the panic() function may start
    with an interrupt. Most kernel events would start with an API call from
    an application, but an unimplemented interrupt or error interrupt could
    have occurred. Interrupts will stop whatever function was active and
    start a new function. Interrupt service functions probably have "intr"
    in the function name (may be easy to recognize), and all interrupt
    handling functions are in kernel memory. The interrupt may have been
    triggered by an API call from a process that became active shortly
    before panic(). In other words: the functions later in the stack may
    not have been called by those listed earlier in the stack.
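
One way to apply this when reading a stack is to split it at the interrupt handler. The sketch below assumes the stack is listed with the most recent frame first and uses the "intr" naming heuristic mentioned above; split_at_interrupt and the sample function names are hypothetical.

```python
def split_at_interrupt(frames):
    """Split a stack (most recent frame first) at its outermost interrupt frame.

    Returns (interrupt_chain, interrupted_context). Frames in
    interrupted_context were running when the interrupt arrived and did
    not call the frames in interrupt_chain. The "intr" substring test is
    a naming heuristic, not a guarantee.
    """
    outermost = None
    for index, name in enumerate(frames):
        if "intr" in name:
            outermost = index  # keep the last match: the oldest interrupt frame
    if outermost is None:
        return list(frames), []
    return list(frames[: outermost + 1]), list(frames[outermost + 1 :])


# Invented stack for illustration only.
chain, interrupted = split_at_interrupt(
    ["panic", "example_drv_intr", "intr_thread", "cv_wait", "read"]
)
print(chain)
print(interrupted)
```

In this example the chain that led to panic() is the first list, and the second list is the unrelated code that happened to be running when the interrupt fired.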

    Unfortunately I do not know enough about SPARC to help you further.
    Once you get this far, you should be able to get further assistance to
    isolate the cause.

    You may get more information by typing "kernel+driver+development" into
    the search menu at http://docs.sun.com.

    Best of luck.

    Greg Wilson
    nanoatzin99netscape.net

    ---------

    In article <db4636cf.0306200450.e27c3d9posting.google.com> , Tanya
    <tanya.levitskygetronics.com> wrote:
    > I am monitoring Unix servers for a customer, and yesterday one of
    > the servers rebooted suddenly without any errors or warnings. I checked
    > /var/adm/messages and authlog; no users were logged on, and there is no
    > evidence of any problems. The server is a Sun 3800, split into 2 domains
    > (only one went down), running Solaris 8. Has something like that ever
    > happened to anyone? Am I missing anything? I would really appreciate any
    > input. Thanks.
    Methusela Oreiley Guest

  2. #2

    Re: Sudden reboot



    In article <db4636cf.0306200450.e27c3d9posting.google.com> , Tanya
    <tanya.levitskygetronics.com> wrote:
    > I am monitoring Unix servers for a customer, and yesterday one of
    > the servers rebooted suddenly without any errors or warnings. I checked
    > /var/adm/messages and authlog; no users were logged on, and there is no
    > evidence of any problems. The server is a Sun 3800, split into 2 domains
    > (only one went down), running Solaris 8. Has something like that ever
    > happened to anyone? Am I missing anything? I would really appreciate any
    > input. Thanks.
    I can't help you here, but the Sun Fire 3800 comes with one year of onsite
    hardware support and one year of Solaris support with the purchase. Call Sun
    Support and get professional help; if the system is less than one year old,
    they will replace the hardware.

    Also note that the firmware on the system is very important. You should
    run the latest possible; depending on your configuration, it's one of the
    following patches...

    Sun Fire 3800-6800 112883-06 (5.14.5) Solaris 8 4/01 ***
    112494-08 (5.13.5)
    112127-03 (5.12.7)

    This list is from Infodoc 18474 available from sunsolve.com

    And of course, the latest recommended cluster...

    Do you have any compact PCI cards in the system?

    /Johan A

    Mr. Johan Andersson Guest

  3. #3

    Re: Sudden reboot


    > I can't help you here, but the Sun Fire 3800 comes with one year of onsite
    > hardware support and one year of Solaris support with the purchase. Call Sun
    > Support and get professional help; if the system is less than one year old,
    > they will replace the hardware.
    To clarify: 1 year warranty on hardware, onsite same day, if it NEEDS to
    be replaced of course :-) and one year of Solaris support. This is free with
    the purchase.

    /Johan A

    Mr. Johan Andersson Guest

  4. #4

    Re: Sudden reboot

    The SunFire machines have been rumored to have manufacturing defects. Sun
    has corrected those in the past couple of months. If your hardware was
    purchased prior to April of 2003, then you ought to escalate the issue with
    Sun Support.

    V.
    "Mr. Johan Andersson" <johansolace.mh.se> wrote in message
    news:Pine.GSO.4.53.0306271105010.5686krynn.solace .mh.se...
    >
    >
    > > I can't help you here, but the Sun Fire 3800 comes with one year of onsite
    > > hardware support and one year of Solaris support with the purchase. Call Sun
    > > Support and get professional help; if the system is less than one year old,
    > > they will replace the hardware.
    >
    > To clarify: 1 year warranty on hardware, onsite same day, if it NEEDS to
    > be replaced of course :-) and one year of Solaris support. This is free with
    > the purchase.
    >
    > /Johan A
    >

    Venkatesh Padmanabhan Guest
