Best of the Newsgroups: double panic smp osr5 crash dump interpret


From: Bela Lubkin <belal@caldera.com>
Subject: Re: Server crashes - need help! :(
Date: Mon, 30 Dec 2002 12:35:07 GMT
References: <3E0F2797.1040102@dniq-online.com> <20021229115029.I10531@mammoth.ca.caldera.com> <3E0FDE64.505@dniq-online.com>



Farlander wrote:



>    Ok :) Here's the most frequent one:
>
> KERNEL STACK TRACE FOR PROCESS 94:
> STKADDR   FRAMEPTR  FUNCTION   POSSIBLE ARGUMENTS
> e0000844  e0000970  prf_task_s (0x4,0,0x1000,0xe)
> e0000978  e0000994  cmn_err    (0x3,got_RESERVEDFLT+0x26c,0xe,u+0x9d4)
> e000099c  e00009c8  k_trap     (u+0x9d4)
>            e00009d4  kern_trap  from 0xf0013ae5 in bcpalign
>    ax:dffda000 cx:     400 dx:   1ffda bx:    1000 fl:    10206 ds: 160 fs:   0
>    sp:e0000a04 bp:e0000a24 si:dffda000 di:c0120000 err:       0 es: 160 gs:   0
> e00009dc  e0000a24  bcpalign   (tmpva_pages,0xc0120000,0x1000,0x1ffda)
> e0000a2c  e0000a50  dumpnextpa (0xc0120000,u+0xb30,0x3,got_RESERVEDFLT+0x26c)
> e0000a58  e0000b74  sysdump    (0x4,0,0xfd8bd2b8,0xe)
> e0000b7c  e0000b98  cmn_err    (0x3,got_RESERVEDFLT+0x26c,0xe,u+0xbd8)
> e0000ba0  e0000bcc  k_trap     (u+0xbd8)
>            e0000bd8  kern_trap  from 0xf005f234 in freeb
>    ax:ffffffff cx:       1 dx:f03560c4 bx:fd8bd2b8 fl:    10282 ds: 160 fs:   0
>    sp:e0000c08 bp:e0000c30 si:fd8c87d8 di:       0 err:       0 es: 160 gs:   0
> e0000be0  e0000c30  freeb      (0xfd8c87d8,0xfd8c87d8,0x1,0xf2d745e8)
> e0000c38  e0000c48  freemsg    (0xfd8c87d8,0xfd8c87d8,0xf2d745e8,0xf2d5c700)
> e0000c50  e0000c88  sr_device  (0xf2d5c700,0xfd8c87d8,0xfd8c87d8,0xf2d5c700)
> e0000c90  e0000cb4  sramsendcm (0xf2d5c700,0xfd8c87d8,0xf27aaa00,0)
> e0000cbc  e0000cd4  _dlgn_send (0xfd8c87d8,0xfd8c87d8,0,0xfd8c87d8)
> e0000cdc  e0000cf4  _dlgn_putc (0xf2d5c700,0xfd8c87d8,0xfce2df7c,streams+0x1998)
> e0000cfc  e0000d18  dlgnwput   (0xfce2df7c,0xfd8c87d8,0xfce2bfb4,0)
> e0000d20  e0000d44  putnext    (0xfce2bfb4,0xfd8c87d8,streams+0x1998,0)
> e0000d4c  e0000d7c  strputpmsg (inode+0x12de0,u+0xdcc,u+0xdc0,0)
> e0000d84  e0000d9c  strputmsg  (inode+0x12de0,u+0xdcc,u+0xdc0,0)
> e0000da4  e0000ddc  msgio      (0x2)
> e0000de4  e0000de8  putmsg     (0x80d5810,0x80d1b34,0x80d1ab4,0x80474d0)
> e0000df0  e0000e10  systrap    (u+0xe1c)
>            e0000e1c  scall_noke from 0x80053348
>    ax:      56 cx:       4 dx:       0 bx: 80d5810 fl:      202 ds:  1f fs:   0
>    sp:e0000e4c bp: 804742c si: 80d1b34 di: 80d1ab4 err:      56 es:  1f gs:   0














Well, that's clearly in the Dialogic driver (and then a double-panic in
the panic dump writing code...!)
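
A rough sketch of how the double-panic can be spotted mechanically in
a pasted trace: each kernel trap shows up as a "kern_trap from ... in
<function>" line, most recent first, so more than one such line means
the panic code itself trapped.  The helper name kernel_traps and the
regular expression are illustrative only, not part of crash(ADM); the
sample lines are taken from the first trace above.

```python
import re

# Matches lines such as "e00009d4  kern_trap  from 0xf0013ae5 in bcpalign".
# The pattern is an assumption based on the two traces quoted in this post.
TRAP_RE = re.compile(r"kern_trap\s+from\s+0x([0-9a-fA-F]+)\s+in\s+(\w+)")


def kernel_traps(trace):
    """Return (address, function) for each kernel trap, most recent first."""
    traps = []
    for line in trace.splitlines():
        match = TRAP_RE.search(line)
        if match:
            traps.append((int(match.group(1), 16), match.group(2)))
    return traps


sample = """\
> e000099c  e00009c8  k_trap     (u+0x9d4)
>            e00009d4  kern_trap  from 0xf0013ae5 in bcpalign
> e0000a58  e0000b74  sysdump    (0x4,0,0xfd8bd2b8,0xe)
> e0000ba0  e0000bcc  k_trap     (u+0xbd8)
>            e0000bd8  kern_trap  from 0xf005f234 in freeb
> e0000be0  e0000c30  freeb      (0xfd8c87d8,0xfd8c87d8,0x1,0xf2d745e8)
"""

traps = kernel_traps(sample)
first_panic = traps[-1]            # oldest trap: the original panic
is_double_panic = len(traps) > 1   # a later trap happened while panicking
```

Here first_panic names freeb(), the original fault, and the extra
entry for bcpalign is the second panic inside the dump code.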



When it hits this double-panic (2nd panic in bcpalign()), has it printed
any of the dump-in-progress dots?



>    And here's another one - for msgcount:

> KERNEL STACK TRACE FOR PROCESS 87:
> STKADDR   FRAMEPTR  FUNCTION   POSSIBLE ARGUMENTS
> e0000910  e0000a3c  prf_task_s (0x4,0,0x1000,0xe)
> e0000a44  e0000a60  cmn_err    (0x3,got_RESERVEDFLT+0x26c,0xe,u+0xaa0)
> e0000a68  e0000a94  k_trap     (u+0xaa0)
>            e0000aa0  kern_trap  from 0xf0013ae5 in bcpalign
>    ax:dffda000 cx:     400 dx:   1ffda bx:    1000 fl:    10206 ds: 160 fs:   0
>    sp:e0000ad0 bp:e0000af0 si:dffda000 di:c0110000 err:       0 es: 160 gs:   0
> e0000aa8  e0000af0  bcpalign   (tmpva_pages,0xc0110000,0x1000,0x1ffda)
> e0000af8  e0000b1c  dumpnextpa (0xc0110000,u+0xbfc,0x3,got_RESERVEDFLT+0x26c)
> e0000b24  e0000c40  sysdump    (0x4,0,0,0xe)
> e0000c48  e0000c64  cmn_err    (0x3,got_RESERVEDFLT+0x26c,0xe,u+0xca4)
> e0000c6c  e0000c98  k_trap     (u+0xca4)
>            e0000ca4  kern_trap  from 0xf005fc7a in msgcount
>    ax:       0 cx:      78 dx:       0 bx:       0 fl:    10286 ds: 160 fs:   0
>    sp:e0000cd4 bp:e0000ce8 si:f3005e50 di:       7 err:       0 es: 160 gs:   0
> e0000cac  e0000ce8  msgcount (0xfd8c84c0,0xfd4297f0,0xfd423e78,shlock_str_qnext)
> e0000cf0  e0000d18  putnextqru (0,0,0x1,0)
> e0000d20  e0000d44  queuerun   (0xfd8c98a0,0x1)
> e0000d4c  e0000d54  runqueues (u+0xdcc,inode+0x52d50,u+0x1148,region+0xcae0)
> e0000d5c  e0000d7c  strputpmsg (inode+0x52d50,u+0xdcc,u+0xdc0,0)
> e0000d84  e0000d9c  strputmsg  (inode+0x52d50,u+0xdcc,u+0xdc0,0)
> e0000da4  e0000ddc  msgio      (0x2)
> e0000de4  e0000de8  putmsg     (0x80d5870,0x80d1b94,0x80d1b0c,u+0xe10)
> e0000df0  e0000e10  systrap    (u+0xe1c)
>            e0000e1c  scall_noke from 0x80053348
>    ax:      56 cx:       2 dx:       0 bx: 80d5870 fl:      202 ds:  1f fs:   0
>    sp:e0000e4c bp: 8047508 si: 80d1b94 di: 80d1b0c err:      56 es:  1f gs:   0

>    I hope there's something useful in there that I'm missing...




Again with the double-panic...



This doesn't have the Dialogic driver on the stack, but if that driver
fed bad data into a STREAMS queue, this could be related.



If the double-panics are happening after the dump has printed some dots,
there is something bad happening in the hardware: something like a DMA
transfer being written to a wrong address, corrupting memory not owned
by the driver.  The loop in sysdump() that calls dumpnextpage() and then
bcopy() (which we see here as "bcpalign") uses the same addresses over
and over.  0xc0110000 (0xc0120000 in the first trace) is the unchanging
address of a disk buffer it's using to stage writes.  tmpva_pages is the
unchanging virtual address at which it is sequentially mapping every
page of memory.  The mapping cannot fail (if no memory existed at that
physical address, reads would just return all 0xff's).  So if it
double-panics after some dots have been printed, something very strange
is happening.  In fact it's pretty strange even if this is the first
page.
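
To make the "same addresses over and over" point concrete, here is a
toy model of that loop.  It is not the real sysdump() code, just an
illustration: the names are invented, the staging address is the one
from the second trace, and the value used for tmpva_pages is assumed
to be the %esi value from the trap frame.

```python
PAGE_SIZE = 0x1000        # the 0x1000 length passed to bcpalign
STAGING_BUF = 0xC0110000  # disk staging buffer seen in the second trace
TMPVA = 0xDFFDA000        # ASSUMPTION: tmpva_pages equals %esi in the trap frame


def dump_copies(num_pages):
    """Yield (src, dst, length, phys_page) for each page a dump would copy.

    Toy model only: the physical page mapped at TMPVA changes on every
    iteration, while the virtual source and destination never do.
    """
    for phys_page in range(num_pages):
        yield (TMPVA, STAGING_BUF, PAGE_SIZE, phys_page)


copies = list(dump_copies(4))
distinct_addresses = {(src, dst) for src, dst, _, _ in copies}
```

distinct_addresses ends up with a single (source, destination) pair
no matter how many pages are copied, which is why a fault partway
through the dump is so hard to explain by software alone.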









What is the value of register CR2 in these dumps?  That's the address it
got the fault on.  Should be the same as either %esi or %edi in the most
recent trap frame, the one for bcpalign (si:dffda000 di:c0110000 in the
2nd example).
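
A small sketch of that comparison, assuming the register dump is
pasted in the "name:value" layout shown in the traces above.  The
helper names are made up for illustration, and the CR2 value in the
usage lines is hypothetical, since the post does not include it.

```python
import re

# Register fields look like "si:dffda000" or "cx:     400" (hex values).
REG_RE = re.compile(r"\b([a-z]{2,3}):\s*([0-9a-fA-F]+)")


def parse_trap_frame(lines):
    """Collect register values from the lines of one trap frame."""
    regs = {}
    for line in lines:
        for name, value in REG_RE.findall(line):
            regs[name] = int(value, 16)
    return regs


def faulting_operand(regs, cr2):
    """Say whether CR2 matches the copy source or destination, if either."""
    if cr2 == regs.get("si"):
        return "source (%esi)"
    if cr2 == regs.get("di"):
        return "destination (%edi)"
    return None


frame = [
    ">    ax:dffda000 cx:     400 dx:   1ffda bx:    1000 fl:    10206 ds: 160 fs:   0",
    ">    sp:e0000ad0 bp:e0000af0 si:dffda000 di:c0110000 err:       0 es: 160 gs:   0",
]
regs = parse_trap_frame(frame)
hypothetical_cr2 = 0xDFFDA000  # made-up value; substitute the real CR2
result = faulting_operand(regs, hypothetical_cr2)
```

A result of None would mean CR2 matches neither pointer, which would
be worth reporting in its own right.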



For that matter, how are you displaying these stacks?!  Those are
crash(ADM) output.  To get crash output on a panic, you would need a
finished panic dump, but these show the system going down in flames in
mid-dump!  I could understand scodb traces (you could be using a serial
console and capturing the output), but crash output from a double-panic
in the dump code?!?



>Bela<





/Bofcusm/1918.html copyright 1997-2004 Bela Lubkin All Rights Reserved

Have you tried Searching this site?

Unix/Linux/Mac OS X support by phone, email or on-site: Support Rates

This is a Unix/Linux resource website. It contains technical articles about Unix, Linux and general computing related subjects, opinion, news, help files, how-to's, tutorials and more. We appreciate comments and article submissions.

Publishing your articles here

More:
       - Bela




Unix/Linux Consultants

Your ad here - $48.00 yearly!

UBB Computer Services Support for Openserver, Unixware and Linux. Windows integration with Unix/Linux servers. Hardware, Backup and Networking issues. Located near Sacramento CA, we provide onsite support throughout Northern CA and Nationwide via remote access. We are a SCO Authorized Partner and a Microlite BackupEdge Certified Reseller.


SCO, OpenServer, UnixWare, software, servers, security, networks, installation, administration, troubleshooting, maintenance, Watchguard, firewalls, VPNs, e-mail. Visit us at http://opensystemscomputing.com and www.go2unix.com.


http://www.breakthru.com.au SCO (Openserver and Unixware), Unix, Solaris and Linux Consulting services including: Secure Networking Solutions; Linux based Firewalls; Backup Solutions; Secure Home to Office Network Setup; Phone, Remote and On-Site Support available - Satisfaction Guaranteed!



Twitter
  • Nov 18 20:41
    I'll be out all day Wednesday the 19th, hard to reach even by phone. Leave a message and have patience.
  • Nov 18 10:07
    What am I doing? Not doing what I WANT to be doing! Oh, well, in every life a little work must fall..




card_image








Change Congress


Related Posts