On 02/24/2012 01:50 AM, Sabuj Pattanayek wrote:
> This seems to be a bug in XFS as Joe pointed out:
>
> http://oss.sgi.com/archives/xfs/2011-06/msg00233.html

This was in a different context though. Are your files sparse by
default? (A quick way to check is sketched at the end of this mail.)

> http://stackoverflow.com/questions/6940516/create-sparse-file-with-alternate-data-and-hole-on-ext3-and-xfs
>
> It seems to be there in XFS available natively in RHEL6 and RHEL5

Yes.

> On Thu, Feb 23, 2012 at 5:12 PM, Sabuj Pattanayek <sabujp at gmail.com> wrote:
>> Hi,
>>
>> I've been migrating data from an old striped 3.0.x gluster install to
>> a 3.3 beta install. I copied all the data to a regular XFS partition
>> (4K blocksize) from the old gluster striped volume and it totaled
>> 9.2TB. With the old setup I used the following option in a "volume
>> stripe" block in the configuration file on a client:
>>
>> volume stripe
>>   type cluster/stripe
>>   option block-size 2MB
>>   subvolumes ....
>> end-volume
>>
>> IIRC, the data was using up about the same space on the old striped
>> volume (9.2T). While copying the data back to the new v3.3 striped
>> gluster volume on the same 5 servers / same brick filesystems (XFS
>> w/4K blocksize), I noticed that the amount stored on disk increased
>> by 5x.

512B blocks are 1/8 the size of 4096B blocks, so a scheme where 512B
blocks are naively replaced by 4096B blocks should net an 8x space
change if that were the issue.

>> Currently if I do a du -sh on the gluster fuse mount of the new
>> striped volume I get 4.3TB (I haven't finished copying all 9.2TB of
>> data over; I stopped it prematurely because it looks like it will
>> use up all the physical disk if I let it keep going). However, if I
>> do a du -sh at the filesystem / brick level on each of the 5
>> directories on the 5 servers that store the striped data, it shows
>> that each one is storing 4.1TB. So basically, 4.3TB of data from a
>> 4K block size FS took up 20.5TB of storage on a 128KB block size
>> striped gluster

So you have 5 servers, each storing a portion of a stripe, and you see
a 5x change in allocation? This sounds less like an XFS issue and more
like a gluster allocation issue. I've not looked lately at the stripe
code, but it may allocate the same space on each node, using the access
pattern for performance. (Rough numbers for both hypotheses are at the
end of this mail.)

>> volume. What is the correlation between the "option block-size"
>> setting in client configs in cluster/stripe blocks in 3.0.x and the
>> cluster.stripe-block-size parameter in 3.3? If these settings mean
>> what I think they mean, then basically a file that is 1MB in size
>> would be written out to the stripe in 128KB chunks across N servers,
>> i.e. 128/N KB of data per brick? What happens when the stripe block
>> size isn't evenly divisible by N (e.g. 128/5 = 25.6)? If the old
>> block-size and new stripe-block-size options are describing the same
>> thing, then wouldn't a 2MB block size from the old config cause more
>> storage to be used up than a 128KB block size?
>>
>> Thanks,
>> Sabuj

> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
       http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
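
P.S. A few sketches to make the above concrete; all three are
illustrative Python written against my assumptions, not code pulled
from gluster. First, the sparse question: a minimal check, assuming
Linux stat() semantics where st_blocks is counted in 512-byte units:

import os
import sys

# Compare bytes actually allocated on disk (st_blocks * 512; st_blocks
# is reported in 512-byte units on Linux) against the apparent size.
# allocated < apparent means the file has holes (sparse); on a striped
# brick, allocated far above 1/N of apparent would be the symptom.
for path in sys.argv[1:]:
    st = os.stat(path)
    allocated = st.st_blocks * 512
    verdict = "sparse" if allocated < st.st_size else "fully allocated"
    print("%s: apparent=%d allocated=%d (%s)" % (
        path, st.st_size, allocated, verdict))

Run it on the source files and on the brick copies; a brick file that
comes back "fully allocated" at close to its apparent size would point
at holes being filled in.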
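
Second, the two inflation hypotheses, plugging in the figures from
this thread (taken from the mails above, not re-measured):

TB = 1024 ** 4
logical   = 4.3 * TB   # du -sh on the fuse mount
per_brick = 4.1 * TB   # du -sh on each of the 5 bricks
bricks    = 5

# Hypothesis 1: 512B blocks naively rewritten as 4096B blocks -> 8x
print("block-size inflation would be %.0fx" % (4096.0 / 512))

# Hypothesis 2: each node allocates (roughly) the whole file -> ~Nx
observed = bricks * per_brick / logical
print("observed inflation is %.1fx (a full copy per node would be %dx)"
      % (observed, bricks))

The observed ~4.8x lands essentially on the brick count rather than on
8x, which is why this smells like per-node allocation rather than
sector-size rounding.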
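
Third, the 128/N question from the quoted mail. A striping translator
normally places each whole block on one brick, round-robin, rather
than slicing a block N ways; the round-robin order below is my
assumption, not read out of the gluster stripe source:

def stripe_layout(file_size, block_size=128 * 1024, bricks=5):
    """Bytes landing on each brick for one file, whole blocks round-robin."""
    per_brick = [0] * bricks
    offset, i = 0, 0
    while offset < file_size:
        chunk = min(block_size, file_size - offset)  # last block may be short
        per_brick[i % bricks] += chunk
        offset += chunk
        i += 1
    return per_brick

# A 1MB file = eight 128KB blocks across 5 bricks:
print(stripe_layout(1024 * 1024))
# -> [262144, 262144, 262144, 131072, 131072]

So nothing has to divide evenly: some bricks simply end up holding one
block more than others. On that reading, 2MB vs 128KB changes the
layout granularity, but not, by itself, the total space allocated.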