Re: Issues with delalloc->real extent allocation

Dave Chinner <david@xxxxxxxxxxxxx> · Mon, 24 Jan 2011 10:26:37 +1100

On Fri, Jan 21, 2011 at 08:41:52AM -0600, Geoffrey Wehrman wrote:
> On Fri, Jan 21, 2011 at 01:51:40PM +1100, Dave Chinner wrote:
> | Realistically, for every disadvantage or advantage we can enumerate
> | for specific workloads, I think one of us will be able to come up
> | with a counter example that shows the opposite of the original
> | point. I don't think this sort of argument is particularly
> | productive. :/
> 
> Sorry, I wasn't trying to be argumentative.  Rather I was just documenting
> what I saw as potential issues.  I'm not arguing against your proposed
> change.  If you don't find my sharing of observations productive, I'm
> happy to keep my thoughts to myself in the future.

Ah, that's not what I meant, Geoffrey. Reading it back, I probably
should have said "direction of discussion" rather than "sort of
argument" to make it more obvious that I was trying not to get stuck
with us going round and round trying to demonstrate the pros and cons
of different approaches on a workload-by-workload basis.

Basically all I was trying to do was move the discussion past a
potential sticking point - I definitely value the input and insight
you provide, and I'll try to write more clearly to hopefully avoid
such misunderstandings in future discussions.

> | Instead, I look at it from the point of view that a 64k IO is little
> | slower than a 4k IO so such a change would not make much difference
> | to performance. And given that terabytes of storage capacity is
> | _cheap_ these days (and getting cheaper all the time), the extra
> | space of using 64k instead of 4k for sparse blocks isn't a big deal.
> | 
> | When I combine that with my experience from SGI where we always
> | recommended using filesystems block size == page size for best IO
> | performance on HPC setups, there's a fair argument that using page
> | size extents for small sparse writes isn't a problem we really need
> | to care about.
> | 
> | I'd prefer to design for where we expect storage to be in the next
> | few years e.g. 10TB spindles. Minimising space usage is not a big
> | priority when we consider that in 2-3 years 100TB of storage will
> | cost less than $5000 (it's about $15-20k right now).  Even on
> | desktops we're going to have more capacity than we know what to do
> | with, so trading off storage space for lower memory overhead, lower
> | metadata IO overhead and lower potential fragmentation seems like
> | the right way to move forward to me.
> | 
> | Does that seem like a reasonable position to take, or are there
> | other factors that you think I should be considering?
> 
> Keep in mind that storage of the future may not be on spindles, and
> fragmentation may not be an issue.  Even so, with SSD 64K I/O is very
> reasonable as most flash memory implements at a minimum 64K page.  I'm
> fully in favor of your proposal to require page sized I/O.

With flash memory there is the potential that we don't even need to
care. The trend is towards on-device compression (e.g. Sandforce
controllers already do this) to reduce write amplification to values
lower than one. Hence a 4k write surrounded by 60k of zeros is
unlikely to be a major issue as it will compress really well.... :)
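
As a rough illustration of the compression point above, here is a minimal
sketch that pads a 4k payload with zeros to 64k and compares compressed
sizes. It assumes zlib as a stand-in compressor; what a real flash
controller does is not described in this thread and is likely different.
The names BLOCK, EXTENT and compressed_size are hypothetical, chosen only
for this example.

```python
import os
import zlib

BLOCK = 4 * 1024    # size of the small sparse write (assumed 4k)
EXTENT = 64 * 1024  # size of the page-sized extent (assumed 64k)


def compressed_size(payload: bytes, extent_size: int = EXTENT) -> int:
    """Zero-pad payload to extent_size and return its zlib-compressed length."""
    padded = payload + bytes(extent_size - len(payload))
    return len(zlib.compress(padded))


# Random bytes are close to incompressible, so this is a pessimistic
# model of the 4k of real data inside the extent.
data = os.urandom(BLOCK)

alone = len(zlib.compress(data))   # the 4k payload on its own
padded = compressed_size(data)     # the same payload inside a 64k extent

print(f"4k alone: {alone} bytes, 4k + 60k zeros: {padded} bytes")
```

Under these assumptions the zero padding adds only a small amount to the
compressed size, far less than the 60k it occupies uncompressed.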

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs