Re: RAID question for Ceph


On 19/07/2018 10:53, Simon Ironside wrote:
On 19/07/18 07:59, Dietmar Rieder wrote:

We have P840ar controllers with battery backed cache in our OSD nodes
and configured an individual RAID-0 for each OSD (ceph luminous +
bluestore). We have not seen any problems with this setup so far and
performance is great at least for our workload.

I'm doing the same with LSI RAID controllers for the same reason, to take advantage of the battery backed cache. No problems with this here either. As Troy said, you do need to go through the additional step of creating a single-disk RAID-0 whenever you replace a disk, which you wouldn't need with a regular HBA.
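
If you want to script that replacement step, here is a minimal sketch. It assumes an LSI/Broadcom controller managed with storcli, and the controller/enclosure/slot numbers are purely hypothetical; the exact command syntax can differ per firmware, so treat it as illustrative only:

#!/usr/bin/env python3
"""Sketch: recreate a single-drive RAID-0 after swapping a failed disk.

Assumes an LSI/Broadcom controller managed with storcli; the controller,
enclosure and slot numbers are hypothetical and the exact command syntax
may differ per firmware version.
"""
import subprocess
import sys


def make_single_drive_raid0(controller: int, enclosure: int, slot: int) -> None:
    # "add vd type=raid0 drives=E:S" creates a one-disk RAID-0 volume,
    # which the OS (and thus the Ceph OSD) then sees as a plain block device.
    cmd = [
        "storcli",
        f"/c{controller}",
        "add", "vd",
        "type=raid0",
        f"drives={enclosure}:{slot}",
    ]
    print("Running:", " ".join(cmd))
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        sys.exit(f"storcli failed:\n{result.stdout}\n{result.stderr}")
    print(result.stdout)


if __name__ == "__main__":
    # Hypothetical example: controller 0, enclosure 252, slot 3.
    make_single_drive_raid0(0, 252, 3)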

This discussion has been running on ZFS lists for quite some time, and it extends to Ceph. ZFS really depends on the software getting direct access to the disks, without extra abstraction layers in between. And for both ZFS and Ceph, RAID is dead: these newly designed storage systems solve problems that RAID no longer can. (Read up on why newer RAID versions will not really save you from a crashed disk, now that the rebuild time of today's large disks approaches the MTBF window.)
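
To put a rough number on that last point, here is a back-of-the-envelope calculation. All input figures are illustrative assumptions, not from the post above: a 12 TB disk, ~200 MB/s sustained rebuild speed, and the common spec-sheet URE rate of 1 per 1e14 bits:

# Back-of-the-envelope: rebuild time and chance of an unrecoverable read
# error (URE) while rebuilding. All input figures are assumptions.
import math

disk_bytes   = 12e12    # 12 TB drive
rebuild_bps  = 200e6    # ~200 MB/s sustained rebuild rate (optimistic)
ure_per_bit  = 1e-14    # 1 URE per 1e14 bits read (typical spec sheet)

rebuild_hours  = disk_bytes / rebuild_bps / 3600
expected_ures  = disk_bytes * 8 * ure_per_bit
# Poisson approximation: chance of at least one URE over the full read.
p_at_least_one = 1 - math.exp(-expected_ures)

print(f"rebuild time          : ~{rebuild_hours:.0f} hours")
print(f"expected UREs         : ~{expected_ures:.2f}")
print(f"P(>=1 URE on rebuild) : ~{p_at_least_one:.0%}")

Under those assumptions a single disk already takes ~17 hours to read in full with roughly a 60% chance of hitting a URE; in a wide RAID5/6 set you have to read several surviving disks, so the numbers only get worse.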

The basic fact remains that RAID controllers lie to their users to some extent, and the advanced ones with backup batteries even more so. If all is well in paradise you will usually get away with it. But if not, that expensive piece of hardware will turn everything into cr..p.

For example, a lot of LSI firmware has had bugs in it; especially the Enterprise version can do really weird things. That is why we install the IT version of the firmware, to cripple the RAID functionality as much as one can. It basically turns your expensive RAID controller into a plain HBA. (No more per-disk RAID configs for extra disks.)

So unless you HAVE to take it, because you cannot rule it out in the system configurator while buying: go for the simple controllers that can act as an HBA.

There are a few more things to consider, like:
 - What is the bandwidth on the disk carrier backplane?
	What kind of port multipliers are used, and is the design what it
	should be? I've seen boards with 2 multipliers where it turns out
	that only one is used, and the other can only be used for
	multipath... So is the feed to that one multiplier going to be a
	bottleneck?
 - How many lanes of your expensive multi-lane SAS/SATA HBA are
	actually used?
	I have seen 24-tray backplanes that want to run over only 2 or 4
	SAS lanes, even when you think you are using all 8 lanes from the
	HBA because you have 2 SFF-8087 cables. It is not without reason
	that SuperMicro also offers a disk-tray backplane with 24
	individually wired SAS/SATA ports. Just ordering the basic cabinet
	will probably get you the wrong stuff.
 - And once you have sort of fixed the bottlenecks there, can you
	actually run all disks at full speed over the controller to the
	PCIe bus(ses)?
	Even a 16-lane PCIe 3.0 slot will at its very theoretical best do
	about 16 GB/s. Now connect a bunch of 12 Gb/s SAS SSDs to that
	controller and watch the bottleneck arise. Even with more than 20
	HDDs it is going to get crowded on this controller (see the rough
	budget sketch below).
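
For that last point, a rough bandwidth budget along the lines above. All lane counts and per-device throughputs are illustrative assumptions; check your own spec sheets:

# Rough bandwidth budget for an HBA + expander/backplane chain.
# All lane counts and per-device throughputs are illustrative assumptions.

GBps = 1e9  # bytes per second

pcie3_lane  = 0.985 * GBps  # PCIe 3.0, ~985 MB/s usable per lane
sas3_lane   = 1.2   * GBps  # 12 Gb/s SAS, ~1.2 GB/s per lane
hdd_seq     = 0.25  * GBps  # ~250 MB/s sequential per HDD
sas_ssd_seq = 1.0   * GBps  # ~1 GB/s per 12G SAS SSD

pcie_lanes_used = 8  # a typical HBA is an x8 card, even when in an x16 slot
sas_lanes_used  = 4  # e.g. only one SFF-8087 cable actually feeds the expander

chain_limit = min(pcie_lanes_used * pcie3_lane, sas_lanes_used * sas3_lane)

for label, per_dev in (("HDDs", hdd_seq), ("12G SAS SSDs", sas_ssd_seq)):
    print(f"{label}: ~{chain_limit / per_dev:.0f} devices saturate the "
          f"{chain_limit / GBps:.1f} GB/s chain limit")

With only one 4-lane feed to the expander the chain tops out around 4.8 GB/s in this example, which roughly 19 HDDs, or as few as 5 SAS SSDs, can already saturate.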

Normally I'd say: Lies, damned lies, and statistics.
But in this case: Lies, damned lies and hardware..... 8-D

--WjW
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


