Re: O_DIRECT on deep-scrub read

On Thu, Oct 8, 2015 at 4:11 AM, Paweł Sadowski <ceph@xxxxxxxxx> wrote:
>
> On 10/07/2015 10:52 PM, Sage Weil wrote:
> > On Wed, 7 Oct 2015, David Zafman wrote:
> >> There would be a benefit to doing fadvise POSIX_FADV_DONTNEED after
> >> deep-scrub reads for objects not recently accessed by clients.
> > Yeah, it's the 'except for stuff already in cache' part that we don't do
> > (and the kernel doesn't give us a good interface for).  IIRC there was a
> > patch that guessed based on whether the obc was already in cache, which
> > seems like a pretty decent heuristic, but I forget if that was in the
> > final version.
>
> I've run some tests and it looks like on XFS the cache is discarded on
> O_DIRECT writes and reads, but on EXT4 it is discarded only on O_DIRECT
> writes. I've found some patches to add support for "read only if in page
> cache" (preadv2/RWF_NONBLOCK) but can't find them in the kernel source.
> Maybe Milosz Tanski can tell us more about that. I think it could help a
> bit during deep scrub.
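
The fadvise idea quoted above (drop the pages a scrub read pulled in, but only for objects clients have not touched recently) can be sketched roughly as follows. This is only an illustration, not Ceph code: `scrub_read` and the `recently_accessed` flag are hypothetical names, and the flag stands in for whatever heuristic decides that an object is "hot" (for example, whether its obc was already in cache).

```python
import os
import zlib


def scrub_read(path, recently_accessed, chunk_size=1 << 20):
    """Checksum a file the way a scrub pass might, then optionally
    ask the kernel to drop the file's pages from the page cache.

    `recently_accessed` is a hypothetical hint supplied by the caller;
    the kernel offers no direct way to ask for it.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        crc = 0
        offset = 0
        while True:
            buf = os.pread(fd, chunk_size, offset)
            if not buf:
                break  # end of file
            crc = zlib.crc32(buf, crc)
            offset += len(buf)

        if not recently_accessed:
            try:
                # offset=0, len=0 means "the whole file".
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
            except OSError:
                # The call is advisory; ignore filesystems that reject it.
                pass
        return crc
    finally:
        os.close(fd)
```

Note that POSIX_FADV_DONTNEED cannot tell pages the scrub brought in from pages that were already cached for clients, so a wrong `recently_accessed=False` evicts data clients still wanted.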


After a fair amount of bike shedding on the API (and removing
pwritev2) it looked like we (Christoph and I) had enough consensus to
get it upstream. But sadly, akpm preferred a different approach
(fincore), and with enough roadblocks it died :/
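
To make the "read only if in page cache" idea concrete, here is a minimal sketch of what a caller would do with such an interface. It assumes a kernel and Python build that expose preadv2 with the RWF_NOWAIT flag (`os.preadv` and `os.RWF_NOWAIT`); that flag is not the RWF_NONBLOCK patch set discussed here, only a similar mechanism, and `read_if_cached` is a made-up helper name.

```python
import errno
import os


def read_if_cached(fd, length, offset):
    """Try to read without waiting on backing storage.

    Returns the bytes that could be served immediately (possibly fewer
    than `length`), or None if nothing was available without blocking
    or the platform does not support the flag.
    """
    flag = getattr(os, "RWF_NOWAIT", None)
    if flag is None or not hasattr(os, "preadv"):
        return None  # assumption: treat "unsupported" like "not cached"

    buf = bytearray(length)
    try:
        n = os.preadv(fd, [buf], offset, flag)
    except OSError as e:
        # EAGAIN: data not immediately available.
        # The others: kernel or filesystem does not support the flag.
        if e.errno in (errno.EAGAIN, errno.EOPNOTSUPP,
                       errno.ENOSYS, errno.EINVAL):
            return None
        raise
    return bytes(buf[:n])
```

A scrub could call this first and fall back to an O_DIRECT or fadvise-and-drop read only when it returns None, so data that is already cached is used and left in place.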

>
>
> >> I see the NewStore objectstore sometimes using the O_DIRECT flag for writes.
> >> This concerns me because the open(2) man page says:
> >>
> >> "Applications should avoid mixing O_DIRECT and normal I/O to the same file,
> >> and especially to overlapping byte regions in the same file.  Even when the
> >> filesystem correctly handles the coherency issues in this situation, overall
> >> I/O throughput is likely to be slower than using either mode alone."
> > Yeah: an O_DIRECT write will do a cache flush on the write range, so if
> > there was already dirty data in cache you'll write twice.  There's
> > similarly an invalidate on read.  I need to go back through the newstore
> > code and see how the modes are being mixed and how it can be avoided...
> >
> > sage
> >
> >
> >> On 10/7/15 7:50 AM, Sage Weil wrote:
> >>> It's not, but it would not be hard to do this.  There are fadvise-style
> >>> hints being passed down that could trigger O_DIRECT reads in this case.
> >>> That may not be the best choice, though--it won't use data that happens
> >>> to be in cache and it'll also throw it out.
> >>>
> >>> On Wed, 7 Oct 2015, Paweł Sadowski wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> Can anyone tell if deep scrub is done using the O_DIRECT flag or not?
> >>>> I'm not able to verify that in the source code.
> >>>>
> >>>> If not, would it be possible to add such a feature (maybe a config
> >>>> option) to help keep the Linux page cache in better shape?
> >>>>
> >>>> Thanks,
>
> --
> PS




-- 
Milosz Tanski
CTO
16 East 34th Street, 15th floor
New York, NY 10016

p: 646-253-9055
e: milosz@xxxxxxxxx