Re: [RFC PATCH] discarding swap

On Fri, 12 Sep 2008, David Woodhouse wrote:
> On Fri, 2008-09-12 at 13:10 +0100, Hugh Dickins wrote:
> > So long as the I/O schedulers guarantee that a WRITE bio submitted
> > to an area already covered by a DISCARD_NOBARRIER bio cannot pass that
> > DISCARD_NOBARRIER - ...
> 
> > That seems a reasonable guarantee to me, and perhaps it's trivially
> > obvious to those who know their I/O schedulers; but I don't, so I'd
> > like to hear such assurance given.
> 
> No, that's the point. The I/O schedulers _don't_ give you that guarantee
> at all. They can treat DISCARD_NOBARRIER just like a write. That's all
> it is, really -- a special kind of WRITE request without any data.

Hmmm.  In that case I'll need to continue with DISCARD_BARRIER,
unless/until I rejig swap allocation to wait for discard completion,
which I've no great desire to do.

Is there any particular reason why DISCARD_NOBARRIER shouldn't be
enhanced to give the intuitive guarantee I suggest?  It is distinct
from a WRITE; I don't see why it has to be treated in the same way
if that's unhelpful to its users.

I expect the answer will be: it could be so enhanced, but we really
don't know if it's worth adding special code for that without the
experience of more users.

> 
> But -- and this came as a bit of a shock to me -- they don't guarantee
> that writes don't cross writes on their queue. If you issue two WRITE
> requests to the same sector, you have to make sure for _yourself_ that
> there is some kind of barrier between them to keep them in the right
> order.

Right, I recall from skimming the linux-fsdevel threads that it
emerged that currently WRITEs are depending on page lock for
that serialization, which cannot apply in the discard case.

So, there's been no need for such a guarantee in the WRITE case;
but it sure would be helpful in the DISCARD case, which has no
pages to lock anyway.
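
As a rough illustration of why the ordering matters, here is a toy
userspace model, not kernel code. The function name, the tuple layout
and the sector number are all hypothetical; the only thing taken from
the thread is that the queue may reorder a DISCARD and a WRITE to the
same sector unless something prevents it.

```python
# Toy model only: a "disk" is a dict from sector number to contents.
# Nothing here corresponds to a real block-layer API.

DISCARDED = None  # marker for a sector whose contents were thrown away


def apply_requests(requests):
    """Apply (op, sector, data) tuples in the given order.

    Returns the final contents of every sector that was touched.
    """
    disk = {}
    for op, sector, data in requests:
        if op == "WRITE":
            disk[sector] = data
        elif op == "DISCARD":
            disk[sector] = DISCARDED
        else:
            raise ValueError("unknown op: %r" % (op,))
    return disk


# Order in which swap would submit them: discard the freed slot,
# then write a new page into the reallocated slot.
submitted = [("DISCARD", 7, None), ("WRITE", 7, "new swap page")]

# Order the queue is allowed to pick if no ordering is guaranteed.
reordered = list(reversed(submitted))

in_order = apply_requests(submitted)
crossed = apply_requests(reordered)
```

In the submitted order the sector ends up holding the new page; if the
WRITE passes the DISCARD, the new page is what gets discarded.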

> 
> Does swap do that, when a page on the disk is deallocated and then used
> for something else?

Yes, that's managed through the PageWriteback flag: there are various
places where we'd like to free up swap, but cannot do so because it's
still attached to a cached page with PageWriteback set; in which case
its freeing has to be left until vmscan.c finds PageWriteback cleared,
then removes the page from the swapcache and frees the swap.
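
A minimal sketch of that deferral, again as a toy userspace model
rather than the actual mm code: the class names, `try_free` and the
`writeback` attribute (standing in for PageWriteback) are assumptions
made for illustration only.

```python
# Toy model only: none of these names exist in the kernel.

class Page:
    def __init__(self, swap_slot):
        self.swap_slot = swap_slot  # slot this cached page is attached to
        self.writeback = False      # stands in for the PageWriteback flag


class SwapArea:
    def __init__(self, nslots):
        self.free_slots = set(range(nslots))

    def alloc(self):
        """Hand out any free slot (raises KeyError if none is left)."""
        return self.free_slots.pop()

    def try_free(self, page):
        """Free the page's slot unless writeback is still in flight.

        Returns True if the slot was freed, False if freeing must be
        left for a later pass (the role vmscan.c plays in the text).
        """
        if page.writeback or page.swap_slot is None:
            return False
        self.free_slots.add(page.swap_slot)
        page.swap_slot = None
        return True


area = SwapArea(1)
page = Page(area.alloc())

page.writeback = True                  # write to swap has been submitted
freed_early = area.try_free(page)      # refused: I/O still in flight

page.writeback = False                 # I/O completion clears the flag
freed_later = area.try_free(page)      # a later pass can now free the slot
```

Because the slot cannot return to the free set while the write is in
flight, it cannot be reallocated and written again underneath that
write, which is the serialization the question asks about.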

Hugh