> -----Original Message-----
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Gurvinder Singh
> Sent: 04 September 2015 08:57
> To: Wang, Warren <Warren_Wang@xxxxxxxxxxxxxxxxx>; Mark Nelson <mnelson@xxxxxxxxxx>; ceph-users@xxxxxxxxxxxxxx
> Subject: Re: high density machines
>
> On 09/04/2015 02:31 AM, Wang, Warren wrote:
> > In the minority on this one. We have a number of the big SM 72-drive units with 40 GbE. Definitely not as fast as even the 36-drive units, but it isn't awful for our average mixed workload. We can exceed all available performance with some workloads, though.
> >
> > So while we can't extract all the performance out of the box, as long as we don't max out on performance, the cost is very appealing.
>
> I am wondering how big a cost difference you have seen between the SM 72-drive unit and, let's say, http://www.supermicro.com/products/system/1U/6017/SYS-6017R-73THDP_.cfm or any other smaller machine you have compared it with. As the discussion on this thread makes clear, the 72-drive box is actually 4 x 18-drive boxes sharing power and cooling. Regarding performance, I think the network might be the bottleneck (maybe the CPU too), as it is 40 Gbit for the whole box, so you get 10 Gbit per node (18 drives each), which can be saturated.
>
> Gurvinder

I think the 72-disk box is one unit; it's the fat twins that have 12x 3.5" + 2x 2.5" per sled, or 56 drives per 4U. That other server you linked is pretty similar to the fat twin sleds; the only disadvantages I can see are that it has a single PSU and one fewer 2.5" drive. Unless you can spread your CRUSH map over a sufficient number of different PDUs/feeds, I would be wary about running single-PSU nodes, as you could quite easily end up with a 50% cluster failure.

> > And as far as filling a unit goes, I'm not sure how many folks have filled big production clusters, but you really don't want them even running into the 70+% range, due to some inevitable uneven filling and the need for room to handle failures.
> >
> > Also, I'm betting that Ceph will continue to optimize things like the messenger and reduce some of the massive CPU and TCP overhead, so we can claw back performance. I would love to see a thread count reduction. These can see over 130K threads per box.
> >
> > Warren
> >
> > -----Original Message-----
> > From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Mark Nelson
> > Sent: Thursday, September 03, 2015 3:58 PM
> > To: Gurvinder Singh <gurvindersinghdahiya@xxxxxxxxx>; ceph-users@xxxxxxxxxxxxxx
> > Subject: Re: high density machines
> >
> > On 09/03/2015 02:49 PM, Gurvinder Singh wrote:
> >> Thanks everybody for the feedback.
> >> On 09/03/2015 05:09 PM, Mark Nelson wrote:
> >>> My take is that you really only want to do these kinds of systems if you have massive deployments. At least 10 of them, but probably more like 20-30+. You do get massive density with them, but I think if you are considering 5 of these, you'd be better off with 10 of the 36-drive units. An even better solution might be ~30-40 of these:
> >>>
> >>> http://www.supermicro.com/products/system/1U/6017/SYS-6017R-73THDP_.cfm
> >>>
> >> This one does look interesting.
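To put rough numbers on the trade-off being discussed here (a shared 40 GbE uplink across 72 drives, and how much of a cluster one chassis failure takes out), here is a quick back-of-the-envelope sketch in Python. All inputs are illustrative assumptions, not measurements from any of the systems mentioned:

    # Back-of-the-envelope numbers for the dense-vs-small-node trade-off.
    # All inputs below are illustrative assumptions.

    def per_drive_bandwidth_gbit(nic_gbit, drives):
        """Network bandwidth available per OSD drive in one chassis/sled."""
        return nic_gbit / drives

    def failure_impact(total_drives, drives_per_failure_domain):
        """Fraction of the cluster that must be re-replicated when one
        failure domain (chassis, PSU feed, PDU, ...) goes down."""
        return drives_per_failure_domain / total_drives

    # 72-drive 4U box with a shared 40 GbE uplink,
    # i.e. roughly 10 Gbit per 18-drive node
    print(per_drive_bandwidth_gbit(40, 72))   # ~0.56 Gbit/s per drive
    print(per_drive_bandwidth_gbit(10, 18))   # same ratio per 18-drive node

    # Losing one 72-drive chassis in a 5-box cluster vs. a 30-box cluster
    print(failure_impact(5 * 72, 72))         # 0.20 -> 20% of the cluster at once
    print(failure_impact(30 * 72, 72))        # ~0.033 -> roughly 3%

    # Compare with 10 of the 36-drive units, losing one
    print(failure_impact(10 * 36, 36))        # 0.10 -> 10%

The per-drive bandwidth is the same whether you treat the 72-drive box as one node or as 4 x 18-drive nodes; the bigger lever is how large a slice of the cluster a single failure domain represents, which is the argument for buying these only at 20-30+ node scale.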
> >>> An extremely compelling solution would be if they took this system:
> >>>
> >>> http://www.supermicro.com/products/system/1U/5018/SSG-5018A-AR12L.cfm?parts=SHOW
> >>>
> >> This one could be a really good solution for archiving purposes, with the CPU replaced to get more juice into it.
> >>>
> >>> and replaced the C2750 with a Xeon-D 1540 (but kept the same number of SATA ports).
> >>>
> >>> Potentially you could have:
> >>>
> >>> - 8x 2.0GHz Xeon Broadwell-DE cores, 45W TDP
> >>> - Up to 128GB RAM (32GB probably the sweet spot)
> >>> - 2x 10GbE
> >>> - 12x 3.5" spinning disks
> >>> - single PCIe slot for PCIe SSD/NVMe
> >>
> >> I am wondering whether a single PCIe SSD/NVMe device can support 12 OSD journals and still perform the same as 4 OSDs per SSD?
> >
> > Basically the limiting factor is how fast the device can do O_DSYNC writes. We've seen that some PCIe SSD and NVMe devices can do 1-2GB/s depending on the capacity, which is enough to reasonably support 12-24 OSDs. Whether or not it's good to have a single PCIe card be a point of failure is a worthwhile topic (probably only high write-endurance cards should be considered). There are plenty of other things that can bring the node down too (motherboard, RAM, CPU, etc.), though. A single node failure will also have less impact if there are lots of small nodes vs. a couple of big ones.
> >
> >>> The density would be higher than the 36-drive units but lower than the 72-drive units (though with shorter rack depth, afaik).
> >>
> >> You mean the 1U solution with 12 disks is longer than the 72-disk 4U version?
> >
> > Sorry, the other way around, I believe.
> >
> >> - Gurvinder
> >>
> >>> Probably more CPU per OSD and far better distribution of OSDs across servers. Given that the 10GbE and processor are embedded on the motherboard, there's a decent chance these systems could be priced reasonably and wouldn't have excessive power/cooling requirements.
> >>>
> >>> Mark
> >>>
> >>> On 09/03/2015 09:13 AM, Jan Schermer wrote:
> >>>> It's not exactly a single system:
> >>>>
> >>>> SSG-F618H-OSD288P*
> >>>> 4U FatTwin, 4x 1U, 72TB per node, Ceph OSD Storage Node
> >>>>
> >>>> This could actually be pretty good; it even has decent CPU power.
> >>>>
> >>>> I'm not a big fan of blades and blade-like systems - sooner or later a backplane will die and you'll need to power off everything, which is a huge PITA. But assuming you get 3 of these, it could be pretty cool!
> >>>> It would be interesting to have a price comparison to an SC216 chassis or similar; I'm afraid it won't be much cheaper.
> >>>>
> >>>> Jan
> >>>>
> >>>>> On 03 Sep 2015, at 16:09, Kris Gillespie <kgillespie@xxxxxxx> wrote:
> >>>>>
> >>>>> It's funny, because in my mind such dense servers seem like a bad idea for exactly the reason you mention: what if one fails? Losing 400+TB of storage is going to have quite some impact, 40G interfaces or not, and no matter what options you tweak. Sure it'll be cost effective per TB, but that isn't the only aspect to look at (for production use anyway).
> >>>>>
> >>>>> But I'd also be curious about real world feedback.
> >>>>>
> >>>>> Cheers
> >>>>>
> >>>>> Kris
> >>>>>
> >>>>> On 09/03/2015 16:01, Gurvinder Singh wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> I am wondering if anybody in the community is running a Ceph cluster with high density machines, e.g. the Supermicro SYS-F618H-OSD288P (288 TB), the Supermicro SSG-6048R-OSD432 (432 TB), or some other high density machine. I am assuming that the installation would be of petabyte scale, as you would want to have at least 3 of these boxes.
> >>>>>>
> >>>>>> It would be good to hear their experiences in terms of reliability and performance (especially during node failures). As these machines have a 40 Gbit network connection it can be OK, but experience from real users would be great to hear, as these machines are mentioned in the reference architecture published by Red Hat and Supermicro.
> >>>>>>
> >>>>>> Thanks for your time.
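Coming back to Mark's point above that the limiting factor for a shared journal device is how fast it can do O_DSYNC writes: below is a minimal sketch of how one might get a rough feel for that on Linux before putting 12+ journals on a single card. The target path and sizes are assumptions to adapt to your setup, and serious measurement is normally done with a dedicated tool such as fio using sync/direct I/O rather than a Python loop.

    # Rough O_DSYNC write-throughput check for a would-be journal device.
    # Assumptions: Linux (os.O_DSYNC available), Python 3, and a TARGET you
    # can safely overwrite. /tmp is often tmpfs, so point TARGET at the real
    # device or a file on the filesystem that sits on it.
    import os
    import time

    TARGET = "/tmp/dsync-test.bin"       # hypothetical test target - change this
    WRITE_SIZE = 4 * 1024 * 1024         # 4 MiB per write
    NUM_WRITES = 256                     # 1 GiB written in total

    buf = os.urandom(WRITE_SIZE)
    fd = os.open(TARGET, os.O_WRONLY | os.O_CREAT | os.O_DSYNC, 0o600)
    try:
        start = time.time()
        for _ in range(NUM_WRITES):
            os.write(fd, buf)            # each write must reach stable storage
        elapsed = time.time() - start
    finally:
        os.close(fd)
        if os.path.isfile(TARGET):       # don't try to unlink a block device
            os.unlink(TARGET)

    mib_per_s = (WRITE_SIZE * NUM_WRITES) / elapsed / (1024 * 1024)
    print(f"O_DSYNC write throughput: {mib_per_s:.0f} MiB/s")

If a single PCIe/NVMe card sustains on the order of 1-2 GB/s of this kind of write, the 12-24 journals per card that Mark mentions look plausible from a pure throughput standpoint; whether you want that many OSDs behind one point of failure is, as he says, a separate question.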