Re: jbod + SMART : how to identify failing disks ?

On Mon, 17 Nov 2014 13:31:57 -0800 Craig Lewis wrote:

> I use `dd` to force activity to the disk I want to replace, and watch the
> activity lights.  That only works if your disks aren't 100% busy.  If
> they are, stop the ceph-osd daemon, and see which drive stops having
> activity. Repeat until you're 100% confident that you're pulling the
> right drive.
>
I use smartctl to light up the disk, but it's the same idea.
JBOD can quickly become a big PITA in large deployments, or when the
people doing disk replacements aren't sufficiently skilled.
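
For illustration, here is a minimal sketch of the "force read activity and watch the light" approach in Python. It is a stand-in for the `dd` run described above, not the exact command anyone in this thread uses; the function name and its parameters are made up for the example. It only reads, never writes, and needs root to open a real /dev/sdX.

```python
import time


def generate_read_activity(device_path, seconds=10.0, block_size=1024 * 1024):
    """Read sequentially from device_path for roughly `seconds` seconds.

    Hypothetical helper: returns the number of bytes read. The device is
    opened read-only, so nothing on the disk is modified.
    """
    total = 0
    deadline = time.monotonic() + seconds
    # Unbuffered binary open; Python adds no caching layer of its own,
    # although the kernel page cache can still satisfy repeated reads.
    with open(device_path, "rb", buffering=0) as dev:
        while time.monotonic() < deadline:
            chunk = dev.read(block_size)
            if not chunk:
                # Reached the end (small device or test file): start over.
                dev.seek(0)
                continue
            total += len(chunk)
    return total
```

Calling `generate_read_activity("/dev/sdc", seconds=30)` should keep the activity LED of that drive busy for about half a minute, provided the disk is otherwise idle. Once the reads wrap around they may be served from the page cache and stop reaching the disk, so on a small device the light may go quiet early.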

Also, depending on how a disk died, you might not be able to reclaim the
drive ID (sdc for example) without a reboot, making things even more
confusing.

Some RAID cards in IT/JBOD mode _will_ actually light up the fail LED if
a disk fails and/or have tools to blink a specific disk.
However, with the latter, the task of matching a disk from the controller's
perspective to what Linux enumerated it as is still on you.
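
One way to do that matching is through the udev-generated symlinks in /dev/disk/by-id. The sketch below assumes the serial number reported by the controller tool appears in the link name, which is typical for "ata-" links but depends on the bus and udev rules, so treat it as a starting point. The function name is hypothetical.

```python
import os
import re


def find_devices_by_serial(serial, by_id_dir="/dev/disk/by-id"):
    """Map kernel device paths to the by-id link names containing `serial`.

    Hypothetical helper: returns e.g. {"/dev/sdc": ["ata-MODEL_SERIAL"]}.
    """
    matches = {}
    for name in sorted(os.listdir(by_id_dir)):
        # Skip partition links such as "...-part1"; we want whole disks.
        if re.search(r"-part\d+$", name):
            continue
        if serial not in name:
            continue
        # Each entry is a symlink like ../../sdc; resolve it to /dev/sdc.
        target = os.path.realpath(os.path.join(by_id_dir, name))
        matches.setdefault(target, []).append(name)
    return matches
```

Because the lookup starts from the serial rather than from sdX, it keeps working when the kernel hands out a different drive letter after a disk has died and been replaced.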

Ceph might scale up to really large deployments, but you'd better have a
well-staffed data center to go with that, or deploy it in a non-JBOD
fashion.

Christian

> On Wed, Nov 12, 2014 at 5:05 AM, SCHAER Frederic <frederic.schaer@xxxxxx>
> wrote:
> 
> >  Hi,
> >
> >
> >
> > I’m used to RAID software giving me the failing disks’ slots, and most
> > often blinking the disks in the disk bays.
> >
> > I recently installed a DELL “6GB HBA SAS” JBOD card, said to be an LSI
> > 2008 one, and I now have to identify 3 pre-failed disks (so says
> > S.M.A.R.T.).
> >
> >
> >
> > Since this is an LSI, I thought I’d use MegaCli to identify the disk
> > slots, but MegaCli does not see the HBA card.
> >
> > Then I found the LSI “sas2ircu” utility, but again, this one fails at
> > giving me the disk slots (it finds the disks, serials and so on, but
> > the slot is always 0).
> >
> > Because of this, I’m going to head over to the disk bay and unplug the
> > disk which I think corresponds to the alphabetical order in Linux, and
> > see if it’s the correct one… But even if this is correct this time,
> > it might not be next time.
> >
> >
> >
> > But this makes me wonder: how do you guys, Ceph users, manage your
> > disks if you really have JBOD servers?
> >
> > I can’t imagine having to guess slots like that each time, nor can I
> > imagine creating serial number stickers for every single disk
> > I might have to manage…
> >
> > Is there any specific advice regarding JBOD cards people should (not)
> > use in their systems?
> >
> > Any magical way to “blink” a drive in Linux?
> >
> >
> >
> > Thanks && regards
> >
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users@xxxxxxxxxxxxxx
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> >


-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx   	Global OnLine Japan/Fusion Communications
http://www.gol.com/