Re: gluster fails under heavy array job load

Hi Harry,

My best guess is that you overloaded your interconnect. Do you have metrics showing whether, and when, your network was saturated? A saturated network would cause Gluster clients to time out.

In terms of the USE (Utilization, Saturation, Errors) method, I suspect you went past saturation and into the "E" (errors) state.

IME, that is a common pattern for our Lustre/GPFS clients: you get all kinds of weird error states if you manage to saturate your I/O for an extended period of time and fill all of the buffers everywhere.
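
If you don't have network metrics at hand, here is a rough sketch of one way to estimate link utilization on a Linux node by sampling the byte counters in /proc/net/dev twice. The function names, the interface name, and the link speed are all assumptions for illustration, not anything taken from your setup, and this only covers the "U" part of USE for Ethernet-style interfaces that appear in /proc/net/dev.

```python
import time


def parse_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev content."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def link_utilization(before, after, interval_s, link_bps):
    """Return (rx, tx) utilization as fractions of the link speed.

    before/after are (rx_bytes, tx_bytes) tuples taken interval_s apart;
    link_bps is the nominal link speed in bits per second (an assumption
    the caller has to supply).
    """
    rx = (after[0] - before[0]) * 8 / interval_s / link_bps
    tx = (after[1] - before[1]) * 8 / interval_s / link_bps
    return rx, tx


def measure(iface, interval_s, link_bps, path="/proc/net/dev"):
    """Sample the counters twice and return (rx, tx) utilization."""
    with open(path) as f:
        before = parse_net_dev(f.read())[iface]
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_net_dev(f.read())[iface]
    return link_utilization(before, after, interval_s, link_bps)
```

For example, measure("eth0", 5.0, 10_000_000_000) would report utilization of a hypothetical 10 Gbit/s interface named eth0 over five seconds. Values that sit near 1.0 for long stretches during the array jobs would be consistent with the saturation theory.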

Regards,
Alex


On 12/12/2013 05:03 PM, harry mangalam wrote:
Short version: Our gluster fs (~340TB) provides scratch space for a
~5000core academic compute cluster.

Much of our load is streaming IO, doing a lot of genomics work, and that
is the load under which we saw this latest failure.


--
Alex Chekholko chekh@xxxxxxxxxxxx
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users



