
Disk error with initial RAID 1 Sync - Linux Setup, Configuration & Administration


  1. #1

    Disk error with initial RAID 1 Sync

    Hi,

    I have a 1U server with dual Western Digital 80 GB HDs, both on IDE
    channel 1. I would like to set both of these drives up in a RAID 1
    configuration. I am using Gentoo Linux and have been going through
    the configuration process. Previously, the server was running RH8
    with RAID 1, and I noticed that one drive was down. Before doing the
    full install of Gentoo, I ran a battery of tests on the drives with
    Western Digital's tools. Both hard drives passed without a single
    problem.

    Now I am configuring Gentoo and am finding that during the initial
    RAID 1 sync I get an error on one of the drives. I am stumped at
    this point because, as far as I can tell, the drives are good and
    should not be giving me this problem. If it helps, the motherboard is
    an MSI-6378, the machine has 1 GB of RAM, and the CPU is an Athlon XP
    2000+.

    Here are the exact details of everything I did and the resulting error
    messages that I received during the sync. I re-ran this this
    afternoon, so it is fresh off the machine. I am installing the new OS
    on a machine with dual 80 GB HDs.

    I used fdisk to give both drives exactly matching partitions. Here is
    what the layout looks like:

    Disk /dev/hda: 80.0 GB, 80026361856 bytes
    255 heads, 63 sectors/track, 9729 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes

    Device     Boot  Start    End    Blocks  Id  System
    /dev/hda1  *         1      5     40131  fd  Linux raid autodetect
    /dev/hda2            6     68    506047+ 82  Linux swap
    /dev/hda3           69   9729  77601982+ fd  Linux raid autodetect

    Command (m for help):

    (It is the same for both so all you need to do is replace hda with
    hdb)
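As a cross-check on the partition table above, the "Blocks" column can be reproduced from the cylinder ranges and the reported geometry. This is only a sketch: it assumes the classic DOS layout in which the first partition starts one track into the disk, and the helper names are made up for illustration.

```python
# Geometry reported by fdisk in the post above.
HEADS = 255
SECTORS_PER_TRACK = 63
SECTOR_BYTES = 512
SECTORS_PER_CYLINDER = HEADS * SECTORS_PER_TRACK  # 16065, as fdisk prints


def partition_sectors(start_cyl, end_cyl, first_on_disk=False):
    """Sectors in a partition spanning start_cyl..end_cyl (inclusive).

    Assumption: the first partition begins one track into the disk,
    leaving track 0 for the partition table.
    """
    sectors = (end_cyl - start_cyl + 1) * SECTORS_PER_CYLINDER
    if first_on_disk:
        sectors -= SECTORS_PER_TRACK
    return sectors


def fdisk_blocks(sectors):
    """Format a sector count like fdisk's 1 KiB 'Blocks' column.

    fdisk appends '+' when the sector count is odd (half a block left over).
    """
    return f"{sectors // 2}{'+' if sectors % 2 else ''}"


layout = {
    "hda1": partition_sectors(1, 5, first_on_disk=True),
    "hda2": partition_sectors(6, 68),
    "hda3": partition_sectors(69, 9729),
}
for name, sectors in layout.items():
    print(name, fdisk_blocks(sectors))
```

If the printed values match the table, the two disks really are partitioned on cylinder boundaries as described, which rules out a mismatched layout as the cause of the sync failure.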

    I loaded the RAID Kernel module and then created the following
    raidtab:

    # /boot (RAID 1)
    raiddev /dev/md0
    raid-level 1
    nr-raid-disks 2
    chunk-size 32
    persistent-superblock 1
    device /dev/hda1
    raid-disk 0
    device /dev/hdb1
    raid-disk 1

    # / (RAID 1)
    raiddev /dev/md2
    raid-level 1
    nr-raid-disks 2
    chunk-size 32
    persistent-superblock 1
    device /dev/hda3
    raid-disk 0
    device /dev/hdb3
    raid-disk 1

    I then began the sync process with the command mkraid /dev/md*,
    where * is either 0 or 2. The initial sync of md0 went without a
    hitch, though I had to use mkraid -R since there were remnants from
    the last sync on the disks. I then began to sync md2. Here is the
    output of /proc/mdstat during the md2 sync:


    Personalities : [raid1]
    read_ahead 1024 sectors
    md2 : active raid1 ide/host0/bus0/target1/lun0/part3[1] ide/host0/bus0/target0/lun0/part3[0]
          77601856 blocks [2/2] [UU]
          [=======>.............]  resync = 37.5% (29110720/77601856) finish=36.7min speed=21993K/sec
    md0 : active raid1 ide/host0/bus0/target1/lun0/part1[1] ide/host0/bus0/target0/lun0/part1[0]
          40064 blocks [2/2] [UU]

    unused devices: <none>
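The numbers in this mdstat output can be sanity-checked with a little arithmetic. The sketch below assumes the arrays use the old 0.90 persistent superblock (which is what raidtools with "persistent-superblock 1" would normally write), where 64 KiB is reserved at the 64 KiB-aligned end of each member. The function names are illustrative only.

```python
MD_RESERVED_BLOCKS = 64  # 64 KiB reserved per member (assumed 0.90 superblock)


def md_usable_blocks(partition_blocks):
    """Usable 1 KiB blocks of a member: round down to 64 KiB, then drop 64 KiB."""
    aligned = partition_blocks - (partition_blocks % MD_RESERVED_BLOCKS)
    return aligned - MD_RESERVED_BLOCKS


def resync_percent(done_blocks, total_blocks):
    """Resync progress as a percentage."""
    return 100.0 * done_blocks / total_blocks


def resync_eta_minutes(done_blocks, total_blocks, speed_kib_per_s):
    """Minutes left if the current resync speed holds."""
    return (total_blocks - done_blocks) / speed_kib_per_s / 60


# Partition sizes come from the fdisk table, the rest from /proc/mdstat.
print(md_usable_blocks(77601982))  # members of md2 (hda3/hdb3)
print(md_usable_blocks(40131))     # members of md0 (hda1/hdb1)
print(round(resync_percent(29110720, 77601856), 1))
print(round(resync_eta_minutes(29110720, 77601856, 21993), 1))
```

Under that assumption the array sizes, the percentage, and the finish estimate all agree with what the kernel printed, so the resync itself was progressing normally at this point.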


    During the sync of md2, the following errors appeared (I do not know
    exactly when, but I know that it was after at least 80% of the drive
    had synced):

    hdb: dma_timer_expiry: dma status == 0x61
    hdb: timeout waiting for DMA (repeats again)
    hdb: (__ide_dma_test_irq) called while not waiting
    hda: status timeout: status=0xd0 { Busy }
    hda: drive not ready for command
    ide0: reset: success
    hdb: irq timeout: status=0xd0 { Busy } (2 more times)
    end_request: I/O error, dev 03:43 (hdb), sector 138982783
    raid1: Disk failure on ide/host0/bus0/target1/lun0/part3, disabling device
    Operation continuing on 1 devices
    hdb: status timeout: status=0xd0 { Busy }
    hdb: drive not ready for command
    ide0: reset: success
    md2: no spare disk to reconstruct array! -- continuing in degraded mode
    hdb: irq timeout: status=0xd0 { Busy }
    ide0: reset: success
    hdb: irq timeout: status=0xd0 { Busy } (these two lines are repeated once more)
    end_request: I/O error, dev 03:43 (hdb), sector 138982911
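The failing sector number gives a rough idea of how far the sync had gotten. The log does not say whether the sector is counted from the start of the partition or from the start of the disk, so the sketch below tries both readings. It assumes 512-byte sectors and that the resync proceeds linearly from the start of the partition.

```python
SECTOR_BYTES = 512
MD2_BLOCKS = 77601856                # size of md2 from /proc/mdstat (1 KiB blocks)
PART3_START_SECTOR = 68 * 255 * 63   # partition 3 starts at cylinder 69


def fraction_synced(sector, relative_to_partition):
    """Approximate fraction of md2 already synced when `sector` failed."""
    if not relative_to_partition:
        # Treat the number as an absolute disk sector instead.
        sector -= PART3_START_SECTOR
    kib = sector * SECTOR_BYTES / 1024
    return kib / MD2_BLOCKS


first_error = 138982783  # from the first end_request line
for relative in (True, False):
    print(relative, f"{fraction_synced(first_error, relative):.1%}")
```

Either reading puts the failure close to 89 percent of the way through, which fits the "at least 80%" estimate above.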

    And now cat /proc/mdstat looks like this:

    Personalities : [raid1]
    read_ahead 1024 sectors
    md2 : active raid1 ide/host0/bus0/target1/lun0/part3[1](F) ide/host0/bus0/target0/lun0/part3[0]
          77601856 blocks [2/1] [U_]

    md0 : active raid1 ide/host0/bus0/target1/lun0/part1[1] ide/host0/bus0/target0/lun0/part1[0]
          40064 blocks [2/2] [UU]

    unused devices: <none>
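For anyone who wants to watch for this condition automatically, here is a minimal sketch that reads mdstat-style text and reports arrays running with fewer members than configured. It only handles the simple layout shown in this post (one header line per array followed by a status line) and the function name is hypothetical.

```python
import re

SAMPLE = """\
Personalities : [raid1]
read_ahead 1024 sectors
md2 : active raid1 ide/host0/bus0/target1/lun0/part3[1](F) ide/host0/bus0/target0/lun0/part3[0]
      77601856 blocks [2/1] [U_]

md0 : active raid1 ide/host0/bus0/target1/lun0/part1[1] ide/host0/bus0/target0/lun0/part1[0]
      40064 blocks [2/2] [UU]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return {array: [members marked (F)]} for arrays missing a member."""
    result = {}
    current = None
    members = []
    for line in mdstat_text.splitlines():
        header = re.match(r"\s*(md\d+)\s*:\s*active\s+\S+\s+(.*)", line)
        if header:
            current = header.group(1)
            # Each member looks like name[index], optionally followed by (F).
            members = re.findall(r"(\S+)\[\d+\](\(F\))?", header.group(2))
            continue
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[[U_]+\]", line)
        if status and current:
            configured, working = int(status.group(1)), int(status.group(2))
            if working < configured:
                result[current] = [name for name, failed in members if failed]
            current = None
    return result


print(degraded_arrays(SAMPLE))
```

On a live system the same function could be fed the contents of /proc/mdstat, though newer kernels print extra fields that this sketch does not try to parse.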

    The question I have now is what to troubleshoot. This feels like some
    sort of hardware problem, but I am not sure where to even start since
    the disks passed all tests. Anyone have any thoughts about this?

    TIA for any advice

    JL
    Jay Guest

  2. #2

    Re: Disk error with initial RAID 1 Sync

    Jay L writes:
     
     

    You can do that, but I would strongly recommend putting the drives
    on separate IDE channels. Not only will performance suffer
    drastically if you keep the drives in a MASTER/SLAVE setup on
    a single IDE channel, it will eventually cause problems if one of
    the drives starts to fail, as that might result in both drives
    becoming unavailable at the same time.

    [...]
     

    [...]

    Seems that /dev/hdb has DMA problems. This may or may not be the
    result of the drives being connected to the same IDE channel;
    DMA problems may also indicate faulty hardware, low-quality
    cables or driver problems. At the very least, try this setup
    with /dev/hda and /dev/hdc (that is, putting the SLAVE drive
    as MASTER on the secondary IDE channel).

    Michael
    --
    Michael Buchenrieder * greenie.muc.de * http://www.muc.de/~mibu
    Lumber Cartel Unit #456 (TINLC) & Official Nets
    Note: If you want me to send you email, don't munge your address.
    Michael Guest
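To make the point about the shared channel concrete, the sketch below tallies the timeout and DMA complaints per drive from the log in the first post (with the transcription typos corrected). The keyword list is an ad hoc choice for this particular excerpt, not a general classifier.

```python
import re
from collections import Counter

LOG = """\
hdb: dma_timer_expiry: dma status == 0x61
hdb: timeout waiting for DMA
hdb: (__ide_dma_test_irq) called while not waiting
hda: status timeout: status=0xd0 { Busy }
hda: drive not ready for command
ide0: reset: success
hdb: irq timeout: status=0xd0 { Busy }
end_request: I/O error, dev 03:43 (hdb), sector 138982783
hdb: status timeout: status=0xd0 { Busy }
hdb: drive not ready for command
ide0: reset: success
hdb: irq timeout: status=0xd0 { Busy }
ide0: reset: success
hdb: irq timeout: status=0xd0 { Busy }
end_request: I/O error, dev 03:43 (hdb), sector 138982911
"""


def errors_per_drive(log_text):
    """Count lines where a drive (hda, hdb, ...) reports a timeout or DMA issue."""
    counts = Counter()
    for line in log_text.splitlines():
        m = re.match(r"\s*(hd[a-z]):\s*(.+)", line)
        if m and re.search(r"timeout|dma|not ready", m.group(2), re.IGNORECASE):
            counts[m.group(1)] += 1
    return counts


for drive, count in sorted(errors_per_drive(LOG).items()):
    print(drive, count)
```

Most of the complaints come from hdb, but hda also reports timeouts while the channel is being reset, which is the "both drives becoming unavailable" effect of sharing one IDE channel.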

  3. #3

    Re: Disk error with initial RAID 1 Sync

    Believe it or not, I think that I found a solution to the problem. I
    pulled the drives, reviewed the jumpers, and found that they were in
    "Cable Select" mode. According to the people who sold me the machine,
    this is the normal configuration and allows the system to
    automatically select master and slave depending on each drive's
    position on the IDE cable. I opted to change this and set master and
    slave manually (master for the last drive on the cable and slave for
    the middle drive). After making this change, my problems have
    disappeared! I will keep my eyes on it in case it is not a permanent
    solution, but so far so good.

    I wanted to post this here in case anyone else experiences this issue.

    JL

    Michael Buchenrieder wrote:

    >
    > You can do that, but I would strongly recommend putting the drives
    > on separate IDE channels. Not only will the performance suffer
    > drastically if you keep the drives in a MASTER/SLAVE setup on
    > a single IDE channel, it will eventually cause problems if one of
    > the drives starts to fail - as that might result in both drives
    > becoming unavailable at the same time.
    >
    > [...]

    >
    > [...]
    >
    > Seems that /dev/hdb has DMA problems. This may or may not be the
    > result of the drives being connected to the same IDE channel;
    > DMA problems may also indicate faulty hardware, low-quality
    > cables or driver problems. At the very least, try this setup
    > with /dev/hda and /dev/hdc (that is, putting the SLAVE drive
    > as MASTER on the secondary IDE channel).
    >
    > Michael
    Jay Guest

  4. #4

    Re: Disk error with initial RAID 1 Sync

    On 8 Jan 2004 11:01:55 -0800, Jay L wrote:
     

    Your dealer is obviously a 15-year-old kid. CS is a fairly recent
    development and it works every bit as well and consistently as plug &
    pray.

    Repeat after me: I will NEVER allow the computer to make an important
    decision, I will do it myself, so I know that it is done properly.

    Cable Select, Plug & Play... standards would be wonderful, if only
    everyone implemented the same ones the same way!

    Mike-
    Mornings: Evolution in action. Only the grumpy will survive.
    Michael Guest
