Need Expert Help - dbca fails (ORA-27047) on raw vol. 9201 SuSE SLES8 - Oracle Server

  1. #1

    Need Expert Help - dbca fails (ORA-27047) on raw vol. 9201 SuSE SLES8

    Hi,
    I have done this many times on AIX, Solaris, Reliant UNIX, HP and Linux,
    almost always with third-party clusters. This time I have to work with
    Oracle's own cluster manager, which does not look stable, and I am trying
    to find the root cause.

    The problem is that dbca fails at 37%.

    create/cloneDBCreation.log:
    ORACLE instance started.
    Total System Global Area 252776588 bytes
    Fixed Size 450700 bytes
    Variable Size 218103808 bytes
    Database Buffers 33554432 bytes
    Redo Buffers 667648 bytes
    Create controlfile reuse set database rac
    *
    ERROR at line 1:
    ORA-01503: CREATE CONTROLFILE failed
    ORA-01565: error in identifying file
    '/opt/oracle/oradata/rac/cwmlite01.dbf'
    ORA-27047: unable to read the header block of file
    Linux Error: 4: Interrupted system call


    alert_rac1.log:
    '/opt/oracle/oradata/rac/undotbs01.dbf' ,
    '/opt/oracle/oradata/rac/users01.dbf' ,
    '/opt/oracle/oradata/rac/xdb01.dbf'
    LOGFILE GROUP 1 ('/opt/oracle/oradata/rac/redo01.log') SIZE 102400K
    REUSE,
    GROUP 2 ('/opt/oracle/oradata/rac/redo02.log') SIZE 102400K REUSE
    RESETLOGS
    Fri Sep 26 13:21:13 2003
    lmon registered with NM - instance id 1 (internal mem no 0)
    Fri Sep 26 13:21:13 2003
    Reconfiguration started
    List of nodes: 0,
    Global Resource Directory frozen
    one node partition
    Communication channels reestablished
    Master broadcasted resource hash value bitmaps

    The list of nodes is always 0, regardless of how many nodes are up.


    rac1_diag_18951.trc:
    *** SESSION ID:(2.1) 2003-09-26 13:21:09.847
    CMCLI WARNING: CMInitContext: init ctx(0xabba1fc)
    kjzcprt:rcv port created
    Node id: 0
    List of nodes: 0,
    *** 2003-09-26 13:21:09.853
    Reconfiguration starts [incarn=0]
    I'm the master node
    *** 2003-09-26 13:21:09.853
    Reconfiguration completes [incarn=1]
    CMCLI WARNING: ReadCommPort: received error=104 on recv().
    kjzmpoll: slos err[12 CMGroupGetList 2 RPC failed status(-1)
    respMsg->status(0) 0]
    [kjzmpoll1]: Error [category=12] is encountered
    CMCLI ERROR: OpenCommPort: connect failed with error 111.
    kjzmdreg1: slos err[12 CMGroupExit 2 RPC failed status(1) 0]
    [kjzmleave1]: Error [category=12] is encountered
    error 32700 detected in background process
    OPIRIP: Uncaught error 447. Error stack:
    ORA-00447: fatal error in background process
    ORA-32700: error occurred in DIAG Group Service
    ORA-27300: OS system dependent operation:CMGroupExit failed with
    status: 0
    ORA-27301: OS failure message: Error 0
    ORA-27302: failure occurred at: 2
    ORA-27303: additional information: RPC failed status(1)
    ORA-32700: error occurred in DIAG Group Service
    ORA-27300: OS system dependent operation:CMGroupGetList failed with
    status: 0
    ORA-27301: OS failure message: Error 0
    ORA-27302: failure occurred at: 2
    ORA-27303: additional information: RPC failed status(-1)
    respMsg->status(0)
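
    For triage, it can help to reduce a trace like the one above to its
    distinct ORA- codes and the cluster manager calls that failed. The
    following is only a minimal sketch, assuming the trace text is available
    as a string; the names ora_codes and failed_operations are made up for
    illustration and are not part of any Oracle tool.

```python
import re
from collections import Counter

# Excerpt copied from the DIAG trace above; a real run would read the .trc file.
TRACE = """\
ORA-00447: fatal error in background process
ORA-32700: error occurred in DIAG Group Service
ORA-27300: OS system dependent operation:CMGroupExit failed with
status: 0
ORA-27301: OS failure message: Error 0
ORA-27302: failure occurred at: 2
ORA-27303: additional information: RPC failed status(1)
ORA-32700: error occurred in DIAG Group Service
ORA-27300: OS system dependent operation:CMGroupGetList failed with
status: 0
"""

ORA_CODE = re.compile(r"\bORA-(\d{5})\b")
FAILED_OP = re.compile(r"ORA-27300: OS system dependent operation:(\w+) failed")


def ora_codes(text):
    """Return (code, count) pairs in order of first appearance."""
    # Counter keeps insertion order, so the first error seen stays first.
    return list(Counter(ORA_CODE.findall(text)).items())


def failed_operations(text):
    """Return the operation names reported by ORA-27300 lines."""
    return FAILED_OP.findall(text)


if __name__ == "__main__":
    for code, count in ora_codes(TRACE):
        print(f"ORA-{code} x{count}")
    print("failed operations:", ", ".join(failed_operations(TRACE)))
```

    On this excerpt both failing operations (CMGroupExit and CMGroupGetList)
    are cluster manager group calls, which is consistent with looking at
    oracm rather than at the datafiles first.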

    According to this posting:
    http://lists.suse.com/archive/suse-oracle/2002-Feb/0058.html
    logical volumes should not start at 0, but all of my LVM volumes start
    at 0 and end at 124 according to yast2.
    Could this have something to do with my shared hard drive setup?
    I would really appreciate any comments and ideas.


    Here is my 9201 RAC setup:
    HW: Two Dell PowerEdge 1600SC (two Xeon CPUs, 2.40GHz), 1Gbit Ethernet
    interconnect.
    Shared disk: Adaptec 29160 Ultra160 SCSI adapters in both nodes,
    connected externally to a SEAGATE ST373307LW. (I don't know of any tools
    to verify that this setup is OK. Or must I go for certified shared
    storage?)
    SW: SuSE SLES8 (/etc/SuSE-release: SuSE SLES-8 (i386) VERSION = 8.1),
    kernel 2.4.19-64GB-SMP #1 SMP, with the k_smp-2.4.19-196.i586.rpm patch
    applied (Oracle is certified on this OS).

    I have set up raw partitions with LVM and bound them to /dev/raw/raw*:
    $ ls -dl /dev/oracle
    drwxrwxrwx 2 root root 4096 2003-09-27 14:33 /dev/oracle
    $ ls -dl /dev/oracle/lvol1
    brw-rw---- 1 oracle dba 58, 0 2003-09-27 14:33 /dev/oracle/lvol1
    .... up to 25 vols
    $ ls -ld /dev/raw
    drwxrwxrwx 2 root root 4096 2003-09-19 17:39 /dev/raw
    $ ls -ld /dev/raw/raw1
    crw------- 1 oracle dba 162, 1 2003-09-19 17:39 /dev/raw/raw1
    .... up to 25 volumes bind with
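
    With 25 volumes, checking owner, mode and device numbers by eye is error
    prone. The following is a minimal sketch that parses `ls -l` lines like
    the ones above; the helper names parse_device and owner_can_read_write
    are hypothetical, and the expected owner "oracle" is simply taken from
    the listing.

```python
import re

# The two device lines from the listing above, each on a single line.
LISTING = """\
brw-rw---- 1 oracle dba 58, 0 2003-09-27 14:33 /dev/oracle/lvol1
crw------- 1 oracle dba 162, 1 2003-09-19 17:39 /dev/raw/raw1
"""

# mode, link count, owner, group, "major, minor", date, time, path
LS_DEVICE = re.compile(
    r"^(?P<mode>[bc][rwx-]{9})\s+\d+\s+(?P<owner>\S+)\s+(?P<group>\S+)\s+"
    r"(?P<major>\d+),\s*(?P<minor>\d+)\s+\S+\s+\S+\s+(?P<path>\S+)$"
)


def parse_device(line):
    """Parse one `ls -l` line for a block or character device node."""
    match = LS_DEVICE.match(line.strip())
    if match is None:
        raise ValueError(f"not a device listing line: {line!r}")
    dev = match.groupdict()
    dev["major"] = int(dev["major"])
    dev["minor"] = int(dev["minor"])
    return dev


def owner_can_read_write(dev, owner="oracle"):
    """True if the node belongs to `owner` and the owner bits include rw."""
    return dev["owner"] == owner and dev["mode"][1:3] == "rw"


if __name__ == "__main__":
    for line in LISTING.splitlines():
        dev = parse_device(line)
        kind = "block" if dev["mode"][0] == "b" else "char"
        status = "ok" if owner_can_read_write(dev) else "CHECK"
        print(f'{dev["path"]}: {kind} {dev["major"]},{dev["minor"]} {status}')
```

    In the Linux 2.4 device list, block major 58 is reserved for LVM and
    character major 162 is the raw device interface, so both nodes shown
    here at least have the expected device types.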

    The cluster manager is the first component to use one of the shared
    disks. I started getting errors when I started dbca, so oracm seems to
    be buggy. I completely removed the previous install, installed the 9201
    cluster manager and applied the 9203 cluster manager patch. Oracle
    installed successfully.
    I then started ocmstart.sh.
    cm.log:
    oracm, version[ 9.2.0.2.0.41 ] started {Fri Sep 26 14:51:10 2003 }
    KernelModuleName is hangcheck-timer {Fri Sep 26 14:51:10 2003 }
    OemNodeConfig(): Network Address of node0: 192.168.1.1 (port 9998)
    {Fri Sep 26 14:51:10 2003 }
    OemNodeConfig(): Network Address of node1: 192.168.1.2 (port 9998)
    {Fri Sep 26 14:51:10 2003 } 
    file = oem.c, line = 491 {Fri Sep 26 14:51:10 2003 }
    InitializeCM: ModuleName = hangcheck-timer {Fri Sep 26 14:51:10 2003
    }
    InitializeCM: Kernel module hangcheck-timer is already loaded {Fri Sep
    26 14:51:10 2003 }
    ClusterListener (pid=1553, tid=3076): Registered with watchdog daemon.
    {Fri Sep 26 14:51:10 2003 }
    CreateLocalEndpoint(): Network Address: 192.168.1.1
    {Fri Sep 26 14:51:10 2003 }
    UpdateNodeState(): node(0) added udpated {Fri Sep 26 14:51:13 2003 }
    HandleUpdate(): SYNC(1) from node(1) completed {Fri Sep 26 14:51:13
    2003 }
    HandleUpdate(): NODE(0) IS ACTIVE MEMBER OF CLUSTER, INCARNATION(2)
    {Fri Sep 26 14:51:13 2003 }
    HandleUpdate(): NODE(1) IS ACTIVE MEMBER OF CLUSTER, INCARNATION(1)
    {Fri Sep 26 14:51:13 2003 }
    NMEVENT_RECONFIG [00][00][00][00][00][00][00][03] {Fri Sep 26 14:51:13
    2003 }
    Successful reconfiguration, 2 active node(s) node 1 is the master, my
    node num is 0 (reconfig 2) {Fri Sep 26 14:51:14 2003 } 
    ClientProcListener:11273 file = unixinc.c, line = 754 {Fri Sep 26
    14:51:32 2003 } 
    ClientProcListener:12297 file = unixinc.c, line = 754 {Fri Sep 26
    14:51:32 2003 }
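
    To compare what oracm itself believes about membership with what lsnodes
    reports, the relevant cm.log lines can be pulled out mechanically. The
    following is a minimal sketch, assuming the log format shown above; the
    function names active_members and last_reconfig are made up for
    illustration.

```python
import re

# Lines copied from the cm.log excerpt above, including the mail line wraps.
CM_LOG = """\
HandleUpdate(): NODE(0) IS ACTIVE MEMBER OF CLUSTER, INCARNATION(2)
{Fri Sep 26 14:51:13 2003 }
HandleUpdate(): NODE(1) IS ACTIVE MEMBER OF CLUSTER, INCARNATION(1)
{Fri Sep 26 14:51:13 2003 }
Successful reconfiguration, 2 active node(s) node 1 is the master, my
node num is 0 (reconfig 2) {Fri Sep 26 14:51:14 2003 }
"""

MEMBER = re.compile(
    r"NODE\((\d+)\) IS ACTIVE MEMBER OF CLUSTER, INCARNATION\((\d+)\)"
)
RECONFIG = re.compile(
    r"Successful reconfiguration, (\d+) active node\(s\) "
    r"node (\d+) is the master, my node num is (\d+)"
)


def active_members(text):
    """Map node number to the last incarnation logged for it."""
    return {int(node): int(inc) for node, inc in MEMBER.findall(text)}


def last_reconfig(text):
    """Return the most recent reconfiguration summary, or None."""
    flat = " ".join(text.split())  # undo line wrapping inside log entries
    hits = RECONFIG.findall(flat)
    if not hits:
        return None
    active, master, me = (int(value) for value in hits[-1])
    return {"active": active, "master": master, "me": me}


if __name__ == "__main__":
    print("members:", active_members(CM_LOG))
    print("last reconfiguration:", last_reconfig(CM_LOG))
```

    Joining the text on whitespace before matching is a deliberate choice
    here, because the pasted log wraps entries across lines.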

    lsmod shows:
    hangcheck-timer 1248 0 (unused)
    That 0 doesn't seem right.
    Even after the 9203 patch, watchdogd is still part of ocmstart.sh. I am
    not sure whether I have to comment it out.

    The lsnodes output is wrong (oracm is started on only one node):
    $ lsnodes -l
    nd1
    $ lsnodes (this is wrong, ocms not running on the other node)
    nd1
    nd2
    $ lsnodes -n
    nd1 0
    nd2 1
    $ lsnodes -v
    CMCLI WARNING: CMInitContext: init ctx(0x804ad00)

    nd1
    nd2
    CMCLI WARNING: CommonContextCleanup: closing comm port
    $
    gdb_dsr Guest

  2. #2

    Re: Need Expert Help - dbca fails (ORA-27047) on raw vol. 9201 SuSE SLES8

    I applied the 9203 cluster patch and created the database manually.
    gdb_dsr Guest
