Re: high throughput storage server?

Did you contact Texas SSD Solutions? I don't know how much $$$ you'd
have to pay for that kind of setup, but it's a nice solution...

2011/3/18 Stan Hoeppner <stan@xxxxxxxxxxxxxxxxx>:
> Christoph Hellwig put forth on 3/18/2011 9:05 AM:
>
> Thanks for the confirmations and explanations.
>
>> The kernel is pretty smart in placement of user and page cache data, but
>> it can't really second-guess your intentions.  With the numactl tool you
>> can help it do the proper placement for your workload.  Note that the
>> choice isn't always trivial - a NUMA system tends to have memory on
>> multiple nodes, so you'll either have to find a good partitioning of
>> your workload or live with off-node references.  I don't think
>> partitioning NFS workloads is trivial, but then again I'm not a
>> networking expert.
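
A quick aside on the numactl point, since it keeps coming up: a minimal
sketch of what explicit placement can look like.  The node number and the
dd run below are placeholders of mine, not a recipe for nfsd itself (the
kernel nfsd threads would have to be confined separately, e.g. with
cpusets):

  numactl --hardware                     # show nodes, CPUs and memory per node
  numactl --cpunodebind=0 --membind=0 \
      dd if=/dev/md0 of=/dev/null bs=1M count=1024   # CPU and memory on node 0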
>
> Bringing mdraid back into the fold, I'm wondering what kind of load the
> mdraid threads would place on a system of the caliber needed to push
> 10GB/s NFS.
>
> Neil, I spent quite a bit of time yesterday spec'ing out what I believe
> is the bare minimum AMD64 based hardware needed to push 10GB/s NFS.
> This includes:
>
>  4 LSI 9285-8e 8-port SAS 800MHz dual-core PCIe x8 HBAs
>  3 NIAGARA 32714 PCIe x8 Quad Port Fiber 10 Gigabit Server Adapters
>
> This gives us 32 6Gb/s SAS ports and 12 10GbE ports total, for a raw
> hardware bandwidth of 20GB/s SAS and 15GB/s Ethernet.
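
Spelling out the arithmetic behind those two raw numbers as I read them
(the ~600MB/s of usable bandwidth per 6Gb/s SAS port is my assumption,
allowing for encoding overhead):

  32 SAS ports   x ~600 MB/s usable  ~= 19.2 GB/s  (call it 20GB/s)
  12 10GbE ports x  1.25 GB/s raw     = 15 GB/s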
>
> I made the assumption that RAID 10 would be the only suitable RAID level
> due to a few reasons:
>
> 1.  The workload is 50+ concurrent large file NFS reads at an aggregate
> 10GB/s, which yields a massively random IO pattern at the disk head level.
>
> 2.  We'll need 384 15k SAS drives to service a 10GB/s random IO load.
>
> 3.  We'll need multiple "small" arrays enabling multiple mdraid threads,
> assuming a single 2.4GHz core isn't enough to handle something like 48
> or 96 mdraid disks.
>
> 4.  Rebuild times for parity RAID schemes would be unacceptably high, and
> a rebuild would eat all of the CPU core its thread runs on.
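
On point 2, the rough per-drive arithmetic, for what it's worth (the
25-30MB/s figure is my own guess at what a 15k spindle sustains once the
heads are seeking between 50+ concurrent streams):

  10 GB/s / 384 drives  ~=  26 MB/s per drive

which lands inside that guess, but without much margin to spare.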
>
> To get the bandwidth we need while making sure we don't run out of
> controller chip IOPS, my calculations show we'd need 16 x 24-drive
> mdraid RAID 10 arrays.  Thus, ignoring all other considerations momentarily,
> a dual AMD 6136 platform with 16 2.4GHz cores seems suitable, with one
> mdraid thread per core, each managing a 24-drive RAID 10.  Would we then
> want to layer a --linear array across the 16 RAID 10 arrays?  If we did
> this, would the linear thread bottleneck instantly as it runs on only
> one core?  How many additional memory copies (interconnect transfers)
> are we going to be performing per mdraid thread for each block read
> before the data is picked up by the nfsd kernel threads?
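
To make that topology concrete, here is a sketch of how the sixteen arrays
might be strung together with mdadm (device names are made up, and I
certainly haven't tried this at such a scale):

  # one of the sixteen 24-drive RAID 10 arrays; sda..sdx stand in for
  # whatever the real SAS device names turn out to be
  mdadm --create /dev/md0 --level=10 --raid-devices=24 /dev/sd[a-x]

  # repeat for /dev/md1 through /dev/md15 on their own drive sets, then
  # concatenate all sixteen into one device
  mdadm --create /dev/md16 --level=linear --raid-devices=16 /dev/md{0..15}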
>
> How much of each core's cycles will we consume with normal random read
> operations, assuming 10GB/s of continuous aggregate throughput?  Would
> the mdraid threads consume sufficient cycles that, when combined with
> network stack processing and interrupt processing, 16 cores at
> 2.4GHz would be insufficient?  If so, would bumping the two sockets up
> to 24 cores at 2.1GHz be enough for the total workload?  Or would we
> need to move to a 4 socket system with 32 or 48 cores?
>
> Is this possibly a situation where mdraid just isn't suitable due to the
> CPU, memory, and interconnect bandwidth demands, making hardware RAID
> the only real option?  And if it does require hardware RAID, would it
> be possible to stick 16 block devices together in a --linear mdraid
> array and maintain the 10GB/s performance?  Or would the single
> --linear array be processed by a single thread?  If so, would a single
> 2.4GHz core be able to handle an mdraid --linear thread managing 8
> devices at 10GB/s aggregate?
>
> Unfortunately I don't currently work in a position allowing me to test
> such a system, and I certainly don't have the personal financial
> resources to build it.  My rough estimate on the hardware cost is
> $150-200K USD.  The 384 Hitachi 15k SAS 146GB drives at $250 each
> wholesale are a little over $90k.
>
> It would be really neat to have a job that allowed me to setup and test
> such things. :)
>
> --
> Stan



-- 
Roberto Spadim
Spadim Technology / SPAEmpresarial
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

