hda/SMART error messages

Issues related to hardware problems

Post by TomE » 2013/11/13 18:05:33

I am getting frequent error messages that look as if they are coming from smartd.

Code:

hda: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
hda: drive_cmd: error=0x04 { DriveStatusError }
ide: failed opcode was: 0xb0

Google searches suggest that smartd triggers these messages when it queries the BIOS and does not understand the reply it gets back. (See THIS THREAD for an explanation of the error message from Alan Cox.) According to logwatch, I get exactly the same number of these messages every day (558). They do not seem to degrade server performance, at least not noticeably, but it is annoying to have this error pop up roughly every 3 minutes.
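For reference, the two hex values in the messages above are bitmasks read from the drive's status and error registers, and the names in braces are the kernel's labels for the bits that are set. A minimal sketch of that decoding follows. It assumes the bit names printed by the legacy Linux IDE driver; the `decode` helper and both tables are illustrative, not kernel code, and the error table is deliberately partial.

```python
# Bit names as the legacy Linux IDE driver prints them (assumed from its log output).
STATUS_BITS = {
    0x80: "Busy",
    0x40: "DriveReady",
    0x20: "DeviceFault",
    0x10: "SeekComplete",
    0x08: "DataRequest",
    0x04: "CorrectedError",
    0x02: "Index",
    0x01: "Error",
}

# Partial table: only the error bits with unambiguous names are listed.
ERROR_BITS = {
    0x40: "UncorrectableError",
    0x10: "SectorIdNotFound",
    0x04: "DriveStatusError",  # ATA "command aborted" (ABRT)
    0x02: "TrackZeroNotFound",
    0x01: "AddrMarkNotFound",
}


def decode(value, names):
    """Return the names of the bits set in value, highest bit first."""
    return [names[mask] for mask in sorted(names, reverse=True) if value & mask]


print(decode(0x51, STATUS_BITS))  # ['DriveReady', 'SeekComplete', 'Error']
print(decode(0x04, ERROR_BITS))   # ['DriveStatusError']
```

Bit 0x04 of the error register is the ATA "command aborted" flag, and opcode 0xb0 is the ATA SMART command, so taken together the three lines say that the drive rejected a SMART request.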

My mobo is fairly old, a Tyan Thunder K7X. The v4.06 BIOS is from 2003, and I cannot find any SMART-related setting among its changeable options. Any info or pointers would be greatly appreciated.

EDIT: Forgot to mention that turning smartd off does not make the error go away.


Re: hda/SMART error messages

Post by TrevorH » 2013/11/13 18:35:27

That looks more like a failing disk to me.


Re: hda/SMART error messages

Post by TomE » 2013/11/14 12:20:13

TrevorH wrote: That looks more like a failing disk to me

I thought so as well, so much so that I started accumulating new hardware for a server build, since this machine is pretty old. Still, I thought it strange that the number of error messages was exactly the same every day. I would expect a bad drive to spit out error messages whenever it is accessed, and surely that wouldn't happen exactly the same number of times each day. I looked into it a bit more, updated the kernel and smartmontools, and edited smartd.conf to monitor that drive. There was one line in the old smartd.conf:

Code:

DEVICESCAN -H -m root


I changed it to monitor all attributes:

Code:

/dev/hda -a -m root

...and the error messages stopped. I don't know what changed; perhaps smartd is now getting the feedback it was looking for earlier. I'll have to study it some more to make sure I didn't simply tell it to stop sending error messages, but from what I read in the man pages that isn't the case.
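One way to check that the kernel messages themselves stopped, and not just the reports about them, is to count the matching lines per day in the system log. A rough sketch, assuming classic syslog lines that begin with a `Mon DD HH:MM:SS` timestamp; the sample lines, the host name `server`, and the `count_per_day` helper are made up for illustration.

```python
from collections import Counter

# Substring that identifies the first line of each error triplet.
PATTERN = "drive_cmd: status=0x51"


def count_per_day(lines, pattern=PATTERN):
    """Count lines containing pattern, keyed by the 'Mon DD' syslog date prefix."""
    counts = Counter()
    for line in lines:
        if pattern in line:
            fields = line.split()
            if len(fields) >= 2:
                counts[" ".join(fields[:2])] += 1
    return counts


# Hypothetical log excerpt, not taken from the real machine.
sample = [
    "Nov 13 18:02:11 server kernel: hda: drive_cmd: status=0x51 { DriveReady SeekComplete Error }",
    "Nov 13 18:02:11 server kernel: hda: drive_cmd: error=0x04 { DriveStatusError }",
    "Nov 13 18:05:33 server kernel: hda: drive_cmd: status=0x51 { DriveReady SeekComplete Error }",
    "Nov 14 12:20:13 server kernel: some unrelated message",
]

for day, count in sorted(count_per_day(sample).items()):
    print(day, count)  # prints "Nov 13 2"; Nov 14 has no matches
```

Against a real log the same function can be fed an open file, for example `count_per_day(open("/var/log/messages"))`, where the path is an assumption about where the kernel messages are written. A day that drops from 558 to zero after the smartd.conf change would confirm that the errors are gone.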
