Re: XFS on top of LVM span in AWS. Stripe or are AG's good enough?

On 8/16/16 12:05 PM, Jeff Gibson wrote:
>> On Mon, Aug 15, 2016 at 11:36:14PM +0000, Jeff Gibson wrote:

...

>> Define "run". AGs can allocate/free blocks in parallel.
> By run I meant read/write data to/from the AGs.
> 
>> If IO does
>> not require allocation, then AGs play no part in the IO path.
> Can you explain this a bit please? From my understanding data is
> written and read from space inside of AGs, so I don't see how it
> couldn't be part of the IO path. Or do you simply mean reads just use
> inodes and don't care about the AGs?

I think Dave just means that IO to already-allocated blocks simply
addresses the block and goes.  There is no AG locking or concurrency or
anything else that comes into play w.r.t. the specific AG the block
under IO happens to live in.
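To picture that, here is a toy sketch in Python (a simplification, not
actual XFS code; the extent numbers are made up): once an extent exists,
a read or overwrite resolves the file offset through the inode's extent
map and goes straight to the disk block, with no AG lookup or locking.

```python
# Hypothetical extent map: (file_block, disk_block, length) triples,
# as an allocator might have recorded them at write time.
extent_map = [
    (0, 1000, 8),   # file blocks 0-7  live at disk blocks 1000-1007
    (8, 5024, 4),   # file blocks 8-11 live at disk blocks 5024-5027
]

def file_block_to_disk_block(extents, fileoff):
    """Map a file block number to a disk block via the extent map."""
    for start, disk, length in extents:
        if start <= fileoff < start + length:
            return disk + (fileoff - start)
    return None  # a hole: a write here would need allocation

print(file_block_to_disk_block(extent_map, 9))   # -> 5025
```

The AG only mattered when those extents were *allocated*; the IO itself
is just arithmetic on the map plus a request to the block device.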

>>> In a
>>> non-striped volume, if some of the AGs are temporarily slower to
>>> respond than others due to one of the underlying volumes being
>>> slow, will XFS prefer the quicker responding AGs
>>
>> No, it does not.
>>
>>> or is I/O always
>>> evenly distributed?
>>
>> No, it is not.
>>
>>> If XFS prefers the more responsive AG's it
>>> seems to me that it would be better NOT to stripe the underlying
>>> disk since all AG's that are distributed in a stripe will
>>> continuously hit all component volumes, including the slow volume
>>> (unless if XFS compensates for this?)
>>
>> I think you have the wrong idea about what allocation groups do.

> I'm reading the XFS File System Structure doc on xfs.org. It says,
> "XFS filesystems are divided into a number of equally sized chunks
> called Allocation Groups. Each AG can almost be thought of as an
> individual filesystem." so that's where most of my assumptions are
> coming from.

Well, the above quote is correct, but it doesn't say anything about
IO time, latency, responsiveness, or anything like that.  Each AG
does indeed include its own structures to track allocation, but that's
unrelated to any notion of "fast" or "slow."
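Since the AGs are equally sized slices of the device, finding which AG a
block lives in is pure arithmetic on the layout. A toy illustration
(the filesystem size and AG count below are assumed example numbers,
not from any real system):

```python
# AGs divide the filesystem into equal chunks; locating a block's AG
# is simple division -- it says nothing about speed or load.
fs_blocks = 26214400           # e.g. 100 GiB at 4 KiB blocks
agcount = 4                    # chosen at mkfs time
agsize = fs_blocks // agcount  # blocks per AG

def ag_of_block(blockno):
    """Return (AG number, offset within that AG) for a fs block."""
    return blockno // agsize, blockno % agsize

print(ag_of_block(7000000))    # lands in AG 1
```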

>> They are for maintaining allocation concurrency and locality of
>> related objects on disk - they have no influence on where IO is
>> directed based on IO load or response time.

> I understand that XFS has locality as far as trying to write files to
> the same AG as the parent directory. Are there other cases?

In general, new directories go to a new AG.  Inodes within that directory
tend to stay in the same AG as their parent, and data blocks associated
with those inodes tend to stay nearby as well.  That's the high-level
goal, but fragmented freespace and near-full conditions can of course
cause it to break down.
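A toy model of those two heuristics (a deliberate simplification; the
real inode allocator is far more involved and also depends on mount
options like inode32/inode64): new directories rotate across AGs, while
regular files land in their parent directory's AG.

```python
class ToyAGPlacer:
    """Toy sketch of XFS-style placement: dirs rotate, files follow parent."""

    def __init__(self, agcount):
        self.agcount = agcount
        self.next_ag = 0
        self.inode_ag = {"/": 0}   # root directory lives in AG 0

    def create(self, parent, name, is_dir):
        if is_dir:
            ag = self.next_ag                       # rotor: spread dirs out
            self.next_ag = (self.next_ag + 1) % self.agcount
        else:
            ag = self.inode_ag[parent]              # locality: stay with parent
        self.inode_ag[parent.rstrip("/") + "/" + name] = ag
        return ag

p = ToyAGPlacer(agcount=4)
print(p.create("/", "home", is_dir=True))            # -> 0
print(p.create("/", "var", is_dir=True))             # -> 1
print(p.create("/home", "notes.txt", is_dir=False))  # -> 0 (same as /home)
```

Note what is absent: nothing in the placement decision looks at how busy
or responsive an AG's underlying storage is.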

> I get that it's probably not measuring the responsiveness of each AG.

It is *definitely* not measuring the responsiveness of each AG :)

> I guess what I'm trying to ask is - will XFS *indirectly* compensate
> if one subvolume is busier?  For example, if writes to a "slow"
> subvolume and resident AGs take longer to complete, will XFS tend to
> prefer to use other less-busy AGs more often (with the exception of
> locality) for writes?  What is the basic algorithm for determining
> where new data is written?  In load-balancer terms, does it
> round-robin, pick the least busy, etc?

XFS has no notion of fast vs. slow regions.  See above for the basic
algorithm: round-robin across AGs for new directories, and keep inodes
and data blocks near their parent where possible.  There are a few other
finer-grained heuristics related to stripe geometry as well.

-Eric

> Thank you very much!
> JG
>     
> _______________________________________________
> xfs mailing list
> xfs@xxxxxxxxxxx
> http://oss.sgi.com/mailman/listinfo/xfs
> 