Joe #1
Strange crash on E-450
I have an E-450 that has crashed the past 2 days. When I get in in
the morning the server is single-user waiting for a password. The
strange part is, it appears the server did not go all the way down and
then boot back up. The /var, /opt and some other file systems are
corrupt. I can fsck /var, but not the others: I get a "Can't open
/dev/vx/dsk/opt" error. When I fix /var and mount it, all the users
are still logged in. The server logs show that it has NOT rebooted.
The uptime is from the last known reboot. There are no logs, since
/var is not mounted. And I am not getting a crash dump.
We are running Solaris 8. The only app running is Sybase 12.5.
Any ideas...
Joe
Joe Guest
-
Anthony Mandic #2
Re: Strange crash on E-450
Joe wrote:
>
> I have an E-450 that has crashed the past 2 days. When I get in in
> the morning the server is single-user waiting for a password.

Can you describe this in a little more detail?

> The strange part is, it appears the server did not go all the way down
> and then boot back up. The /var, /opt and some other file systems are
> corrupt. I can fsck /var, but not the others: I get a "Can't open
> /dev/vx/dsk/opt" error.

You do realise you should fsck the raw device /dev/vx/rdsk/...?

> When I fix /var and mount it, all the users are still logged in.

Are you sure it was in single user mode?

> The server logs show that it has NOT rebooted.
> The uptime is from the last known reboot.

What does last show?

> There are no logs, since /var is not mounted.

Hmmm ... last or the others won't show anything then.

> And I am not getting a crash dump.
>
> We are running Solaris 8. The only app running is Sybase 12.5.

What about ASE's log? Is its filesystem unmounted? If not,
check ASE's log for anything unusual.
-am © 2003
Anthony Mandic Guest
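The point above about running fsck against the raw device rather than the block device can be sketched as a small helper. This is only an illustration, assuming a POSIX-compatible shell; the function name to_raw_device is hypothetical, and the example path is simply the one from Joe's error message. The helper only rewrites the path and does not run fsck itself.

```sh
# Hypothetical helper: map a block device path to its raw (character)
# counterpart by turning the "dsk" path component into "rdsk".
# Paths that are already raw, or that have no "dsk" component, pass
# through unchanged.
to_raw_device() {
    printf '%s\n' "$1" | sed 's|/dsk/|/rdsk/|'
}

# Example use (not executed here):
#   fsck "$(to_raw_device /dev/vx/dsk/opt)"
```

Keeping the mapping in one place avoids hand-typing the raw path, which is where a slip is most likely.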
-
Kent Smith #3
Re: Strange crash on E-450
On 9 Jul 2003 06:06:15 -0700, joehoechst@cs.com (Joe) wrote:
>I have an E-450 that has crashed the past 2 days. When I get in in
>the morning the server is single-user waiting for a password. The
>strange part is, it appears the server did not go all the way down and
>then boot back up. The /var, /opt and some other file systems are
>corrupt. I can fsck /var, but not the others: I get a "Can't open
>/dev/vx/dsk/opt" error. When I fix /var and mount it, all the users
>are still logged in. The server logs show that it has NOT rebooted.
>The uptime is from the last known reboot. There are no logs, since
>/var is not mounted. And I am not getting a crash dump.
>
>We are running Solaris 8. The only app running is Sybase 12.5.
>
>Any ideas...
>
>
>Joe

You might want to set up mirrors for your system logs on partitions
that stay up. (see /etc/syslog.conf)

Does Sybase stay up? Are there any anomalies in the Sybase logs?

When you say "is in single user waiting for the password", do you mean
that you can't telnet in from another machine, or that the console is
in text mode at the password prompt?

Are there any errors in the system logs at all that might indicate a
problem?

When you say "the users are still logged in", do you mean that their
processes are still there, but disconnected from their clients, or
that these users can continue working as before? This bit is
particularly confusing to me.

How do you fix /opt and the other FS? Do you have to reboot?

Does prtdiag -v show everything you think it should?

You have to be more specific.

--Kent
=================================
Kent Smith * IPSO Incorporated
Business * Technology * Solutions
Financial Services and Accounting Systems Consulting
http://www.ipsoinc.com
Kent Smith Guest
-
Joe #4
Re: Strange crash on E-450
Kent Smith <ksmith@ipsoinc.com> wrote in message news:<c49ogv446o517lo5phugikp84b1lkqvl9e@4ax.com>...
> On 9 Jul 2003 06:06:15 -0700, joehoechst@cs.com (Joe) wrote:
> >
> >Joe
> You might want to set up mirrors for your system logs on partitions
> that stay up. (see /etc/syslog.conf)
>
> Does Sybase stay up? Are there any anomalies in the Sybase logs?

No, Sybase does not stay up. It is on one of the corrupted file systems.

> When you say "is in single user waiting for the password", do you mean
> that you can't telnet in from another machine, or that the console is
> in text mode at the password prompt?

No, you can not telnet from other servers. This is what is at the
console. (without the dashes)
--------------
Login incorrect
Type control-d to proceed with normal startup,
(or give root password for system maintenance):
-------------------

> Are there any errors in the system logs at all that might indicate a
> problem?

No, there are no logs to speak of. We send the messages to another
server (and keep them local). The only thing I get in the messages
log is the typical messages when I reboot the server. There are no
crash dumps either.

> When you say "the users are still logged in", do you mean that their
> processes are still there, but disconnected from their clients, or
> that these users can continue working as before? This bit is
> particularly confusing to me.
>

I did not check on their processes. When I did the last command, I
saw they were still logged in. The finger command confirmed that. I
did a "who -b" and it said it was still up from April 7th (the first
time this happened).

> How do you fix /opt and the other FS? Do you have to reboot?

The only file system that I could fsck was /var. I had to reboot to
fix the rest. When I tried to fsck the file systems I got "Can't
open /dev/vx/dsk/opt". I did try to fsck /dev/vx/rdk/opt also, with the
same message.

> does prtdiag -v show everything you think it should?

I believe it does. I don't see any errors. The one thing I notice is
prtdiag on an E-450 does not show last power failure.

> You have to be more specific.

I know. This sounds very strange. It appears as if the server is
going DOWN to single user mode, not rebooting to single user mode.
Otherwise I would assume a bad power supply/input. But since it
appears that the server does not completely re-boot, that theory does
not seem possible.

>
> --Kent
> =================================
> Kent Smith * IPSO Incorporated
> Business * Technology * Solutions
> Financial Services and Accounting Systems Consulting
>
> http://www.ipsoinc.com
Joe Guest
-
Lon Stowell #5
Re: Strange crash on E-450
Joe wrote:
>
> I know. This sounds very strange. It appears as if the server is
> going DOWN to single user mode, not rebooting to single user mode.
> Otherwise I would assume a bad power supply/input. But since it
> appears that the server does not completely re-boot, that theory does
> not seem possible.

Caution: this is from another unix variant, belief is that Solaris
would do the same thing tho... and someone will correct if wrong.

Have seen a similar issue before, mysteriously going to single
mode. A wacked out user application was sending kill signals
without first making sure that the PID destination was properly
initialized and made sense. Murphy's Law dictated that the
PID it would end up with was "-1" which would kill the init
process group. Due to an admin oopsie, this wacked app was
running as root.
Lon Stowell Guest
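The failure mode described above (a signal sent to a PID that was never properly initialized) can be guarded against with a simple check before calling kill. Under POSIX, a PID of -1 makes kill signal every process the caller is permitted to signal, apart from an implementation-defined set of system processes, so as root it reaches nearly everything. The sketch below assumes a POSIX-compatible shell; the names is_safe_pid and safe_kill are hypothetical and not taken from any application mentioned in this thread.

```sh
# Hypothetical guard: succeed only if $1 is a plain decimal PID greater
# than 1. Rejects empty strings, negative values such as -1, 0, and
# PID 1 (init).
is_safe_pid() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit (including "-")
    esac
    [ "$1" -gt 1 ]
}

# Hypothetical wrapper: send signal $1 to PID $2 only when the PID
# passes the guard; otherwise complain on stderr and return non-zero.
safe_kill() {
    sig=$1
    pid=$2
    if is_safe_pid "$pid"; then
        kill -s "$sig" "$pid"
    else
        echo "refusing to signal suspicious PID: '$pid'" >&2
        return 1
    fi
}

# Example use (not executed here):
#   safe_kill TERM "$child_pid"
```

The guard is deliberately strict: anything that is not an ordinary positive PID is refused, which also blocks intentional process-group signalling through this wrapper.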
-
Scott Richardson #6
Re: Strange crash on E-450
"Lon Stowell" <lon.stowell@comcast.net> wrote in message
news:Di_Oa.19146$Ph3.1404@sccrnsc04...
> Joe wrote:
> >
> > I know. This sounds very strange. It appears as if the server is
> > going DOWN to single user mode, not rebooting to single user mode.
> > Otherwise I would assume a bad power supply/input. But since it
> > appears that the server does not completely re-boot, that theory does
> > not seem possible.
>
> Caution: this is from another unix variant, belief is that Solaris
> would do the same thing tho... and someone will correct if wrong.
>
> Have seen a similar issue before, mysteriously going to single
> mode. A wacked out user application was sending kill signals
> without first making sure that the PID destination was properly
> initialized and made sense. Murphy's Law dictated that the
> PID it would end up with was "-1" which would kill the init
> process group. Due to an admin oopsie, this wacked app was
> running as root.
>

That makes very good sense.

I still say have someone detail profile the power and environmental
aspects...

Is this system accessible to the internet? A Firewall in place?

There are no logs anywhere on this server, with any clue as to what
happened?

Have you checked with Sun? Any patches available that address symptoms like
this?

Put a performance monitor product on both the power and on the E-450 /
Solaris and see what it says, over a week's worth of time. Detail profile
what goes on, when, and what effect it has on operational dynamics and
resources. If something is going on, you'll have a clear indication as to
exactly when, what, how & why. You can get a 10 day free trial version of an
extremely detailed, low overhead Solaris performance monitor at
www.deltekonline.com. (Solaris Agent, Windows Performance Console - two
components).
hth.
--
Regards,
Scott
Scott Richardson Guest
-
Kent Smith #7
Re: Strange crash on E-450
On 9 Jul 2003 12:39:20 -0700, joehoechst@cs.com (Joe) wrote:
>Kent Smith <ksmith@ipsoinc.com> wrote in message news:<c49ogv446o517lo5phugikp84b1lkqvl9e@4ax.com>...
>> On 9 Jul 2003 06:06:15 -0700, joehoechst@cs.com (Joe) wrote:
>> >
>> >Joe
>I know. This sounds very strange. It appears as if the server is
>going DOWN to single user mode, not rebooting to single user mode.
>Otherwise I would assume a bad power supply/input. But since it
>appears that the server does not completely re-boot, that theory does
>not seem possible.
>

What does "uptime" report? Does the system think it rebooted? What
is the date of the "init" process (as reported by ps -ef)? Your last
scheduled reboot, or the time of the anomalous behavior?

Do you have savecore enabled? (in my 2.6 system that is done in
rc2.d/S20sysetup). If not, you might want to. If so, did you get a
system core? Anything interesting in it?

This is a real stumper!

--Kent
=================================
Kent Smith * IPSO Incorporated
Business * Technology * Solutions
Financial Services and Accounting Systems Consulting
http://www.ipsoinc.com
Kent Smith Guest
-
Anthony Mandic #8
Re: Strange crash on E-450
Scott Richardson wrote:
>
> Lon Stowell wrote:
> >
> > Have seen a similar issue before, mysteriously going to single
> > mode. A wacked out user application was sending kill signals
> > without first making sure that the PID destination was properly
> > initialized and made sense. Murphy's Law dictated that the
> > PID it would end up with was "-1" which would kill the init
> > process group. Due to an admin oopsie, this wacked app was
> > running as root.
>
> That makes very good sense.

Yeah, I knew someone who once did "kill 1 1" instead of
"kill -1 1".

> I still say have someone detail profile the power and environmental
> aspects...
>
> Is this system accessible to the internet? A Firewall in place?
>
> There are no logs anywhere on this server, with any clue as to what
> happened?

I'd suggest pointing the loghost to another machine. Even if
this one hangs and doesn't write to its logs, its syslogd
might be able to send a log message out to another machine.
-am © 2003
Anthony Mandic Guest
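One way to confirm that a machine really is forwarding messages to a remote loghost is to look for rules in its syslog configuration whose action field starts with "@". The sketch below is a rough check only, assuming a POSIX-compatible shell and awk; the function name remote_log_rules is hypothetical, and entries wrapped in m4 ifdef() macros are matched only crudely (by their last whitespace-separated field).

```sh
# Hypothetical check: print the lines of a syslog.conf-style file whose
# last field starts with "@", i.e. rules that forward to a remote host.
# Blank lines and lines whose first field begins with "#" are skipped.
remote_log_rules() {
    awk 'NF >= 2 && $1 !~ /^#/ && $NF ~ /^@/ { print }' "$1"
}

# Example use (not executed here):
#   remote_log_rules /etc/syslog.conf
```

An empty result suggests that nothing is being sent off the box, in which case a hang that leaves /var unwritable would also leave no trace elsewhere.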
-
Lacour #9
Re: Strange crash on E-450
This sounds similar to an issue I had about 8 months ago. I had an e4500
with 8 CPUs and 4 GB of RAM with an external storage array in a room that was
supposed to have 24x7 AC. As it turned out, the room's AC unit was on a
controller that was set to cool only during work hours. Bottom line is that the
server was shutting itself down because it was getting too hot during the
night when the AC was not on. We did not catch the issue until I had to
physically do something to one of the other servers in the same room. When I
walked into the room it was easily 100 degrees Fahrenheit in the room and
the system was reporting an internal temp of 140 degrees centigrade.
Charles
"Joe" <joehoechst@cs.com> wrote in message
news:461cdbbf.0307090506.71bddbb5@posting.google.com...
> I have an E-450 that has crashed the past 2 days. When I get in in
> the morning the server is single-user waiting for a password. The
> strange part is, it appears the server did not go all the way down and
> then boot back up. The /var, /opt and some other file systems are
> corrupt. I can fsck /var, but not the others: I get a "Can't open
> /dev/vx/dsk/opt" error. When I fix /var and mount it, all the users
> are still logged in. The server logs show that it has NOT rebooted.
> The uptime is from the last known reboot. There are no logs, since
> /var is not mounted. And I am not getting a crash dump.
>
> We are running Solaris 8. The only app running is Sybase 12.5.
>
> Any ideas...
>
>
> Joe
Lacour Guest


