Performance

josh at moonfruit.com (Hiren Joshi) · Thu, 13 Aug 2009 09:27:30 +0100

We've had large ext3 filesystems go read-only (underlying hardware
problem) and recovery can take days.

For this I'm using ext3 as well, and as it's a RAID 6 disk (hardware
RAID) I can probably get away with 4 x 1 TB. But I'm currently
experiencing bad performance on a single brick that's not mirrored...

________________________________

	From: Mark Mielke [mailto:mark at mark.mielke.cc] 
	Sent: 12 August 2009 18:51
	To: Hiren Joshi
	Cc: gluster-users at gluster.org
	Subject: Re: Performance

	On 08/12/2009 01:24 PM, Hiren Joshi wrote: 

			36 partitions on each server - the word "partition" is
			ambiguous. Are they 36 separate drives? Or multiple
			partitions on the same drive? If multiple partitions on
			the same drive, this would be a bad idea, as it would
			require the disk head to move back and forth between the
			partitions, significantly increasing the latency, and
			therefore significantly reducing the performance. If each
			partition is on its own drive, you still won't see benefit
			unless you have many clients concurrently changing many
			different files. In your above case, it's touching a
			single file in sequence, and having a cluster is costing
			you rather than benefitting you.

		We went with 36 partitions (on a single RAID 6 drive) in case
		we got file system corruption: it would take less time to fsck
		a 100G partition than a 3.6TB one. Would a 3.6TB single disk
		be better?

	Putting 3.6 TB on a single disk sounds like a lot of eggs in one
	basket. :-)

	If you are worried about fsck, I would definitely do as the
other poster suggested and use a journalled file system. This nearly
eliminates the fsck time for most situations. This would be whether
using 100G partitions or using 3.6T partitions. In fact, there are very
few reasons not to use a journalled file system these days.

	As for how to deal with data on this partition - the file system
is going to have a better chance of placing files close to each other,
than setting up 36 partitions and having Gluster scatter the files
across all of them based on a hash. Personally, I would choose 4 x 1
Tbyte drives over 1 x 3.6 Tbyte drive, as this nearly quadruples my
bandwidth and for highly concurrent loads, nearly divides by four the
average latency to access files. 
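
To make the scattering point concrete, here is a minimal sketch of what hash-based placement does to files that live in the same directory. It is not Gluster's actual distribution algorithm: `brick_for`, the MD5-modulo scheme, and the example paths are all assumptions made up for illustration.

```python
import hashlib
from collections import Counter


def brick_for(path: str, n_bricks: int) -> int:
    """Pick a brick index from a hash of the path.

    Illustrative only: a plain MD5-modulo scheme, not Gluster's real hash.
    """
    digest = hashlib.md5(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % n_bricks


def spread(paths, n_bricks):
    """Count how many of the given paths land on each brick."""
    return Counter(brick_for(p, n_bricks) for p in paths)


# Hypothetical example: 1000 files from one directory, 36 bricks.
paths = [f"/mail/user1/msg{i:04d}" for i in range(1000)]
counts = spread(paths, 36)

# Files that sit next to each other in the namespace end up on many
# different bricks, so on one physical disk the head has to seek between
# partitions to read them in sequence.
print(f"{len(counts)} of 36 bricks hold files from a single directory")
```

With a single file system the allocator at least has the chance to keep those files near each other; with 36 hashed partitions on one spindle that chance is gone.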

	But, if you already have the 3.6 Tbyte drive, I think the only
performance-friendly use would be to partition it based upon access
requirements, rather than a hash (random). That is, files that are
accessed frequently should be clustered together at the front of a disk,
files accessed less frequently could be in the middle, and files
accessed infrequently could be at the end. This would be a three
partition disk. Gluster does not have a file system that does this
automatically (that I can tell), so it would probably require a software
solution on your end. For example, I believe dovecot (IMAP server)
allows an "alternative storage" location to be defined, so that
infrequently read files can be moved to another disk, and it knows to
check the primary storage first, and fall back to the alternative
storage after.
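
As a rough idea of what such a software solution could start from, the sketch below sorts files into three tiers by last access time. The tier names, the 7-day and 90-day thresholds, and the function names are hypothetical, and it assumes access times are actually recorded (a file system mounted with noatime would make the result meaningless).

```python
import os
import time

HOT_DAYS = 7    # hypothetical threshold: accessed within the last week
WARM_DAYS = 90  # hypothetical threshold: accessed within the last quarter


def tier_for(age_days, hot_days=HOT_DAYS, warm_days=WARM_DAYS):
    """Map 'days since last access' to a tier name."""
    if age_days <= hot_days:
        return "hot"    # candidate for the partition at the front of the disk
    if age_days <= warm_days:
        return "warm"   # candidate for the middle partition
    return "cold"       # candidate for the partition at the end


def classify_tree(root, now=None):
    """Walk a directory tree and group file paths by access-time tier."""
    now = time.time() if now is None else now
    tiers = {"hot": [], "warm": [], "cold": []}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                atime = os.stat(path).st_atime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            age_days = (now - atime) / 86400.0
            tiers[tier_for(age_days)].append(path)
    return tiers
```

This only reports which files belong where; actually moving them, and teaching the application to look in more than one place, is the part that something like dovecot's alternative storage would have to handle.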

	If you can't break up your storage by access patterns, then I
think a 3.6 Tbyte file system might still be the next best option - it's
still better than 36 partitions. But, make sure you have a good file
system on it, that scales well to this size.

	Cheers,
	mark

	-- 
	Mark Mielke <mark at mielke.cc>