Re: To GlusterFS or not...

Alexey Zilber <alexeyzilber@xxxxxxxxx> · Tue, 23 Sep 2014 22:31:12 +0800

Yes, Roman is correct.   Also, if you have lots of random IO you're better off with many smaller SAS drives.   This is because the more spindles you have, the more random IO the array can sustain.  This is also why we went with SSD drives: SAS drives weren't cutting it on the random IO front.
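To put rough numbers on the spindle argument, here is a back-of-the-envelope sketch in Python. The drive counts and seek times are assumed figures for illustration only, not measurements from anyone's setup in this thread, and the model ignores RAID write penalties, controller cache and queueing.

```python
def drive_iops(rpm, avg_seek_ms):
    """Rough random IOPS for one spinning drive: 1 / (avg seek + avg rotational latency)."""
    # On average the head waits half a revolution for the sector to come around.
    rotational_latency_ms = 0.5 * 60_000.0 / rpm
    return 1000.0 / (avg_seek_ms + rotational_latency_ms)


def array_iops(drive_count, rpm, avg_seek_ms):
    """Upper bound for an array, assuming requests spread evenly over all spindles."""
    return drive_count * drive_iops(rpm, avg_seek_ms)


# Hypothetical comparison: a few large slow drives vs. many small fast ones.
few_large = array_iops(drive_count=12, rpm=7200, avg_seek_ms=8.5)
many_small = array_iops(drive_count=48, rpm=10000, avg_seek_ms=4.0)

print(f"12 x 7200 RPM: ~{few_large:.0f} random IOPS")
print(f"48 x 10k RPM : ~{many_small:.0f} random IOPS")
```

The point is only the scaling: each spindle contributes a roughly fixed number of random operations per second, so the total grows with the number of drives rather than with their capacity.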
Another option you may try is using SAS drives with ZFS compression.   Compression will be especially helpful if you're using SATA drives.
-Alex
On Sep 23, 2014 2:10 PM, "Roman" <romeo.r@xxxxxxxxx> wrote:
Hi,

SAS 7200 RPM disks are not that small at all (basically the same sizes as SATA). If I remember right, the reason for switching to SAS here would be that SAS is full duplex (you can read from and write to the drive at the same time), while SATA is half duplex (read or write only at any one moment).

2014-09-23 9:02 GMT+03:00 Chris Knipe <savage@xxxxxxxxxxxxx>:
Hi,

SSD has been considered but is not an option due to cost.  SAS has
been considered but is not an option due to the relatively small sizes
of the drives.  We are *rapidly* growing towards a PB of actual online
storage.

We are exploring raid controllers with onboard SSD cache which may help.

On Tue, Sep 23, 2014 at 7:59 AM, Roman <romeo.r@xxxxxxxxx> wrote:

> Hi,
>
> just a question ...
>
> Would SAS disks be better in a situation with lots of seek times using
> GlusterFS?
>
> 2014-09-22 23:03 GMT+03:00 Jeff Darcy <jdarcy@xxxxxxxxxx>:
>>
>> > The biggest issue that we are having, is that we are talking about
>> > -billions- of small (max 5MB) files. Seek times are killing us
>> > completely from what we can make out. (OS, HW/RAID has been tweaked to
>> > kingdom come and back).
>>
>> This is probably the key point.  It's unlikely that seek times are going
>> to get better with GlusterFS, unless it's because the new servers have
>> more memory and disks, but if that's the case then you might as well
>> just deploy more memory and disks in your existing scheme.  On top of
>> that, using any distributed file system is likely to mean more network
>> round trips, to maintain consistency.  There would be a benefit from
>> letting GlusterFS handle the distribution (and redistribution) of files
>> automatically instead of having to do your own sharding, but that's not
>> the same as a performance benefit.
>>
>> > I’m not yet too clued up on all the GlusterFS naming, but essentially
>> > if we do go the GlusterFS route, we would like to use non replicated
>> > storage bricks on all the front-end, as well as back-end servers in
>> > order to maximize storage.
>>
>> That's fine, so long as you recognize that recovering from a failed
>> server becomes more of a manual process, but it's probably a moot point
>> in light of the seek-time issue mentioned above.  As much as I hate to
>> discourage people from using GlusterFS, it's even worse to have them be
>> disappointed, or for other users with other needs to be so as we spend
>> time trying to fix the unfixable.
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users@xxxxxxxxxxx
>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>
> --
> Best regards,
> Roman.

--
Regards,
Chris Knipe

-- 
Best regards,
Roman.

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users