
Serious issue with SATA disks again - FreeBSD

  1. #1

    Serious issue with SATA disks again

    I'm still getting errors like this:

    ad10: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=5601695
    ad10: FAILURE - WRITE_DMA timed out
    ad10: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=4848803
    ad10: FAILURE - WRITE_DMA timed out
    ad10: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=5618815
    ad10: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=4848959
    ad10: FAILURE - WRITE_DMA timed out
    ad10: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=4472607
    ad10: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=4860959
    ad10: FAILURE - WRITE_DMA timed out
    ad10: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=4861087
    ad10: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=4861695

    Yesterday, for the first time, the system crashed (ungracefully) after
    some of these errors occurred, and I had to reset the system manually;
    fsck had to correct errors after boot.

    I need to know what is causing these problems. They have been reported
    for a year by various people on various configurations (different
    motherboards and chipsets). I've seen lots of complaints and reports,
    but no solutions. It's not hardware, so don't bother suggesting that
    unless you can _prove_ that the OS is eliminated from consideration.

    Doesn't anyone actually know how FreeBSD works? Someone wrote the code
    that prints the above cryptic messages. What do they mean, _exactly_?

    These errors occur most often while I'm running a Perl program (awstats)
    to analyze web logs. That may explain why the LBAs seem to be in the
    same region. ad10 contains /tmp and /var; ad12 (which doesn't seem to
    show the error messages) contains /usr. The root and swap file are on a
    different drive entirely.
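
To check whether the LBAs really do cluster, the messages can be tallied with a short script. This is only a sketch: the regular expression is inferred from the twelve lines quoted above rather than from the ata driver source, and the names `LINE_RE`, `SAMPLE`, and `summarize` are made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the messages quoted in this post (an assumption,
# not taken from the driver source), e.g.
# "ad10: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=5601695"
LINE_RE = re.compile(
    r"^(?P<dev>ad\d+): (?P<level>TIMEOUT|FAILURE|WARNING) - "
    r"(?P<cmd>[A-Z_]+)(?P<rest>.*?)(?: LBA=(?P<lba>\d+))?$"
)

# The twelve lines from the post, used as sample input.
SAMPLE = """
ad10: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=5601695
ad10: FAILURE - WRITE_DMA timed out
ad10: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=4848803
ad10: FAILURE - WRITE_DMA timed out
ad10: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=5618815
ad10: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=4848959
ad10: FAILURE - WRITE_DMA timed out
ad10: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=4472607
ad10: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=4860959
ad10: FAILURE - WRITE_DMA timed out
ad10: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=4861087
ad10: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=4861695
""".splitlines()


def summarize(lines):
    """Count messages per (device, command, kind) and find the LBA span."""
    counts = Counter()
    lbas = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip blank or unrelated lines
        # Separate the ICRC warnings from plain timeouts and failures.
        kind = "ICRC" if "ICRC" in m.group("rest") else m.group("level")
        counts[(m.group("dev"), m.group("cmd"), kind)] += 1
        if m.group("lba"):
            lbas.append(int(m.group("lba")))
    span = (min(lbas), max(lbas)) if lbas else None
    return counts, span


if __name__ == "__main__":
    counts, span = summarize(SAMPLE)
    for key, n in sorted(counts.items()):
        print(key, n)
    print("LBA span:", span)
```

For the lines quoted above, the LBAs run from 4472607 to 5618815, a span of about 1.1 million sectors. If the sectors are 512 bytes (an assumption), that is a bit under 600 MB of the disk.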

    I'm beginning to get the impression that support for disks is rather
    weak in FreeBSD 5.x. I have mysterious SCSI errors on one machine that
    nobody seems to have any clue about, and mysterious SATA errors on
    another machine that nobody seems to have any clue about. I can't
    really brag about the reliability or uptime of the OS if it crashes once
    a week due to unresolved bugs in disk-handling code.

    --
    Anthony


    Anthony Guest

  2. #2

    Re: Serious issue with SATA disks again

    On Sat, Mar 19, 2005 at 10:38:13AM +0100, Anthony Atkielski wrote: 

    It's impossible to _prove_ the software is _not_ at fault, just as it's
    impossible to prove the hardware is not at fault. When software works
    for others but not on your hardware, then one can only conclude there is
    _something_ about your hardware.

    With seemingly random timeouts such as you are seeing I would suspect
    the SATA cable. SATA runs gigabits/sec and could be very sensitive. Try
    a different cable from another source.

    Also run the HD manufacturer's test utility. This week a bad block
    appeared on one of my SATA drives after 4,000 hours of runtime.
    Downloaded a bootable ISO from Hitachi Global Storage which booted into
    DOS, found and remapped my bad block.

    "smartctl" from ports was also quite useful for reading the error log
    maintained by the HD firmware. Interesting reading: for example, my drive
    temperature was 35, with a lifetime min/max of 19/45 (Celsius).
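
As a rough sketch of that, the snippet below collects the smartctl reports most relevant here: overall health, the attribute table, the firmware error log, and the self-test log. The device path `/dev/ad10` is an assumption based on the `ad10` name in the error messages, and the helper names `smartctl_commands` and `run_reports` are made up for illustration. Only standard smartctl options (`-H`, `-A`, `-l error`, `-l selftest`) are used.

```python
import shutil
import subprocess


def smartctl_commands(device):
    """Build the smartctl command lines for one device (nothing is run)."""
    return {
        "health": ["smartctl", "-H", device],                  # overall health
        "attributes": ["smartctl", "-A", device],              # temperature etc.
        "error_log": ["smartctl", "-l", "error", device],      # firmware error log
        "selftest_log": ["smartctl", "-l", "selftest", device],  # self-test results
    }


def run_reports(device):
    """Run each report and return its text; empty dict if smartctl is absent."""
    if shutil.which("smartctl") is None:
        return {}
    reports = {}
    for name, cmd in smartctl_commands(device).items():
        result = subprocess.run(cmd, capture_output=True, text=True)
        reports[name] = result.stdout
    return reports


if __name__ == "__main__":
    # /dev/ad10 is assumed from the device name in the thread.
    for name, text in run_reports("/dev/ad10").items():
        print("==", name, "==")
        print(text)
```

Running it normally needs root privileges, since smartctl talks to the drive directly.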
     

    It means the driver asked the HD to fill a buffer, but it didn't
    complete the task within the allotted time. Either the drive didn't
    begin, or data was lost and fell short.
     

    A few years ago one of my then-new machines could not write a floppy in
    FreeBSD but could in NT4. Tried lots of things, also got the attention
    of the floppy driver maintainer. A few weeks later got the idea to
    "Reset to Defaults" in the BIOS. Then reset the few specific things I
    needed back the way they were. Magic. There was something undocumented
    being set by BIOS at boot that didn't bother NT.

    One of my BIOS settings above was to hold PCI back to version 2.0 or 2.1
    vs 2.1 or 2.2. Learned one of my PCI cards didn't like something about
    the new PCI spec and that the system was not smart enough to know.

    More recently, in 5.2.1, I had no problems with a parallel ATA drive
    with Hyperthreading enabled on my P4. No problems running sysinstall to
    prep the new SATA drives. But the SATA drives locked the kernel solid
    moments after first use. Disabled HT and all was fine. Something about
    HT and the new GEOM framework used for SATA (but not for PATA, at least
    then) didn't work. Until a block went bad on one drive there hasn't been
    a drive problem in 4,000+ hours. I only reboot for power failures and
    updates.

    --
    David Kelly N4HHE, net
    ========================================================================
    Whom computers would destroy, they must first drive mad.
    David Guest

  3. #3

    Re: Serious issue with SATA disks again

    David Kelly writes:
     

    It doesn't work for others. I found lots of messages complaining about
    this on various platforms, but no explanations.
     

    I don't want suspicions, I want answers. Who generates the message, and
    exactly what does it mean?

    I see the string in ata-queue.c, and references in a couple of other
    modules, but as usual, there are no comments at all, so there's no way
    to figure out what's going on.
     

    I don't think Western Digital has one (?). If it does, where can I find
    it?
     

    I tried running the offline self-test, but it didn't seem to do
    anything.
     

    Or there's a bug in the code.
     

    Or that NT was programmed to handle (i.e., a better driver in NT than in
    FreeBSD).
     

    Or something about the way FreeBSD handled this situation contained a
    bug.

    --
    Anthony


    Anthony Guest

  4. #4

    Re: Serious issue with SATA disks again

    Erik de Jong writes:
     

    What made you think it was a hardware problem?
     

    Somebody, somewhere knows what causes this error.
     

    Such as code without any trace of comments, like I see in FreeBSD.
     

    Including the operating system.
     

    Which hardware issue do I have?
     

    Somebody wrote the code.
     

    It has been around for at least a year, judging by what I've found on the
    Net.
     

    That doesn't make it okay for FreeBSD.

    --
    Anthony


    Anthony Guest

  5. #5

    Re: Serious issue with SATA disks again

     
    Here is WDC's data lifeguard utility for DOS:
    http://support.wdc.com/download/index.asp?cxml=n&pid=2&swid=30

    Also, you might want to try flashing the firmware for the
    controller/motherboard with the latest versions. I've had several
    occasions recently where I couldn't get hardware to work with BSD until I
    got up to the latest firmware (an old Dell PERC 2 most recently). I
    didn't see the original email, so I may be off-base on the controller
    being the problem, though.

    Jerry
    http://www.syslog.org





    Jerry Guest
