On Thu, July 13, 2006 9:50 am, William L. Maltby wrote:
> On Wed, 2006-07-12 at 19:33 -0400, Paul wrote:
>> OK, I'm still trying to solve this. The server has been up rock
>> steady, but the errors concern me. I built this on a test box months
>> ago, and now that I think about it, I may have built it originally on
>> a drive from a different manufacturer, although about the same size
>> (20G). This may have something to do with it. What is the easiest way
>> to get these errors taken care of? I've tried e2fsck, and also ran
>> fsck on Vol00. Looks like I made a fine mess of things. Is there a
>> way to fix it without reloading
>
> AFAIK, there is no "easiest way". From my *limited* knowledge, you have
> a couple of different problems (maybe) and they are not identified.
> I'll offer some guesses and suggestions, but without my own hard-headed
> stubbornness in play, results are even more iffy.
>
>> CentOS? Here are some outputs:
>>
>> Snapshot from /var/log/messages:
>>
>> Jul 12 04:03:21 hostname kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
>> Jul 12 04:03:21 hostname kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
>> Jul 12 04:03:21 hostname kernel: ide: failed opcode was: unknown
>
> I've experienced these regularly on a certain brand of older drive
> (*really* older, probably not your situation). Maxtor, IIRC. Anyway,
> the problem occurred mostly on cold boot or when re-spinning the drive
> after it slept. It apparently had a really *slow* spin-up speed and a
> timeout would occur (not handled in the protocol, I guess), IIRC.

This is definitely a symptom. I wonder if LVM has anything to do with
it? I'm running an "IBM-DTLA-307020" (20 gig). I was previously running
an "IBM-DTLA-307015" on FC1 on ext3 partitions and never had a problem.
When I find the time, I am just going to reload CentOS 4.3 onto ext3
partitions, restore the data, and see how it goes.

> Your post doesn't mention if this might be related. If all your log
> occurrences tend to indicate it happens only after long periods of
> inactivity, or upon cold boot, it might not be an issue. But even
> there, hdparm might have some help. Also, if it does seem to be only
> on cold boot or after long periods of "sleeping", is it possible that
> a bunch of things starting at the same time are taxing the power
> supply? Is the PS "weak"? Remember that PSs must not only have a
> maximum wattage sufficient to support the maximum draw of all devices
> at the same time (plus a margin for safety), but also that the various
> 5/12 volt lines are limited. Different PSs have different limits on
> those lines, and often they are not published on the PS label. Lots of
> 12 or 5 volt draws at the same time (as happens in a non-sequenced
> start-up) might be producing an unacceptable voltage or amperage drop.
>
> Is your PCI bus 33/66/100 MHz? Do you get messages on boot saying
> "assume 33MHz.... use idebus=66"? I hear it's OK to have an idebus
> param that is too fast, but it's a problem if your bus is faster than
> what the kernel thinks it is.
>
> Re-check and make sure all cables are well-seated and that power is
> well connected. Speaking of cables, is it new or "old"? Maybe the
> cable has a small intermittent break? Try replacing the cable. Try
> using an 80-conductor (UDMA?) cable, if not using one already. If the
> problem is only on cold boot, can you get a DC volt-meter on the power
> connector? If so, look for the voltages to "sag". That might tell you
> that you are taxing your PS. Or use the labels, do the math, and
> calculate whether you are close to the max wattage in a worst-case
> scenario.
>
> I suggest using hdparm (*very* carefully) to see if the problem can be
> replicated on demand. Take the drive into various reduced-power modes
> and restart it, and see if the problem is fairly consistent.
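Just so I'm reading you right, is this roughly the kind of test you
mean? Only a sketch of what I'd try; I pulled the options from the
hdparm man page and haven't actually run this on the box yet, so
correct me if I have it wrong:

    # check the drive's current power state
    hdparm -C /dev/hda

    # spin the drive down into low-power standby
    hdparm -y /dev/hda

    # wait a bit, then force a spin-up by reading from the disk
    sleep 30
    dd if=/dev/hda of=/dev/null bs=1M count=64

    # see whether the forced spin-up produced any new CRC errors
    grep BadCRC /var/log/messages | tail

If the BadCRC lines show up right after the forced spin-up, I'd take
that as pointing at the slow-spin-up/power theory rather than at LVM.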
>> sfdisk -l:
>>
>> Disk /dev/hda: 39870 cylinders, 16 heads, 63 sectors/track
>> Warning: The partition table looks like it was made
>> for C/H/S=*/255/63 (instead of 39870/16/63).
>> For this listing I'll assume that geometry.
>> Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0
>>
>>    Device Boot Start     End   #cyls    #blocks   Id  System
>> /dev/hda1   *      0+     12      13-    104391   83  Linux
>> /dev/hda2         13    2500    2488   19984860   8e  Linux LVM
>> /dev/hda3          0       -       0          0    0  Empty
>> /dev/hda4          0       -       0          0    0  Empty
>> Warning: start=63 - this looks like a partition rather than
>> the entire disk. Using fdisk on it is probably meaningless.
>> [Use the --force option if you really want this]
>
> What does your BIOS show for this drive? It's likely that the drive
> was labeled (or copied from a drive that was labeled) in another
> machine. The "key" for me is the "255" vs. the "16". The only fix here
> (not important to do it, though) is to get the drive properly labeled
> for this machine. Back up the data, make sure the BIOS is set
> correctly, and fdisk (or sfdisk) it to get the partitions correct.
>
> WARNING! Although this can be done "live", use sfdisk -l -uS to get
> the starting sector numbers and make the partitions match. When you
> re-label at "255", some of the calculated translations internal to the
> drivers(?) might change (Do things *still* translate to CHS on modern
> drives? I'll need to look into that some day. I bet not.). Also, the
> *desired* starting and ending sectors of the partitions are likely to
> change. What I'm saying is that the final partitioning will likely be
> "non-standard" in layout and lying in wait to bite your butt.
>
> I would back up the data, change the BIOS, and sfdisk it (or fdisk or
> cfdisk, or any other partitioner, your choice). If the system is hot,
> sfdisk -R will re-read the params and get them into the kernel. Then
> reload the data (if needed). If it's "hot": single user, or run level
> 1, mounted "ro", of course. Careful reading of sfdisk can allow you to
> script and test (on another drive) parts of this.

I really want to try some of this, but not until I have a hot, ready
standby HD to throw in if it gets hosed. I'm hosting some stuff and
like to be known for reliable 24x7 service.

> Easy enough so far? >:-)

Yea, piece of cake. Thanks for sharing your knowledge! I do need to
play around with LVM more and get comfortable with it. LVM seems to be
somewhere between Solaris metadb's and ZFS.

>> sfdisk -lf
>
> The "f" does you no good here, as you can see. It is really useful
> only when trying to change the disk label. What would be useful
> (maybe) to you is "-uS".
>
>> <snip>
>
> HTH
> --
> Bill
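P.S. For my own notes, here is roughly the sequence I'm picturing once
a standby drive is in place -- just a sketch pieced together from
Bill's description, not something I've run yet, and only after a full
backup and dropping to single user:

    # record the current layout in sectors, so the new table can match it
    sfdisk -l -uS /dev/hda

    # dump the partition table to a file as an extra safety net
    sfdisk -d /dev/hda > /root/hda-table.dump

    # (fix the drive geometry in the BIOS, then repartition so the
    # start/size sector values match the ones recorded above)

    # ask the kernel to re-read the partition table without a reboot
    sfdisk -R /dev/hda

Does that look about right?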