On Thu, Mar 26, 2015 at 11:08:33PM -0700, Andrew Morton wrote:
> On Fri, 27 Mar 2015 06:41:25 +0100 Volker Lendecke <Volker.Lendecke@xxxxxxxxx> wrote:
>
> > On Thu, Mar 26, 2015 at 08:28:24PM -0700, Andrew Morton wrote:
> > > A thing which bugs me about pread2() is that it is specifically
> > > tailored to applications which are able to use a partial read result.
> > > ie, by sending it over the network.
> >
> > Can you explain what you mean by this? Samba gets a pread
> > request from a client for some bytes. The client will be
> > confused when we send less than requested although the file
> > is long enough to satisfy all.
>
> Well it was my assumption that samba would be able to do something
> useful with a partial read - pread() is allowed to return less than requested.

No, this is not the case. Maybe my whole understanding of
pread is wrong: I always thought that it won't return short
if the file spans the pread range, EINTR notwithstanding.

> if (it's all in cache)

I know I'm repeating myself: We have a race condition here.
A small one, but it is racy. I've seen loaded systems where
seconds pass before we get scheduled again. On these systems,
blocking in the later read will be the norm. And we don't
have a good way to detect this situation afterwards and
turn to threads as a precaution next time.

>         read it all now
> else
>         ask a worker thread to read it all
>
> Bear in mind that these operations involve physical IO and large
> memcpy's. Yes, a fincore() approach will consume more CPU but the
> additional overhead will be relatively small.

We have to pay this price for every single chunk. Without
oplocks we get 10-byte read requests. This is hard to
swallow for many vendors with small CPUs.

With best regards,

Volker Lendecke

--
SerNet GmbH, Bahnhofsallee 1b, 37081 Göttingen
phone: +49-551-370000-0, fax: +49-551-370000-9
AG Göttingen, HRB 2816, GF: Dr.
Johannes Loxen http://www.sernet.de, mailto:kontakt@xxxxxxxxx
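
For reference, the short-read question discussed in this message can be made concrete. POSIX permits pread() to return fewer bytes than requested, so a caller that must deliver the complete range (as a file server answering a client read must) would typically loop until the range is satisfied or end of file is hit. The following is only a minimal sketch under that assumption; pread_full is a hypothetical helper name, not a function from Samba or the kernel, and it does nothing about the blocking and caching concerns raised above.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from Samba): call pread() repeatedly until
 * `count` bytes have been read, end of file is reached, or an error
 * other than EINTR occurs.
 *
 * Returns the number of bytes read (less than `count` only at EOF),
 * or -1 with errno set by pread().
 */
static ssize_t pread_full(int fd, void *buf, size_t count, off_t offset)
{
    char *p = buf;
    size_t done = 0;

    while (done < count) {
        ssize_t n = pread(fd, p + done, count - done,
                          offset + (off_t)done);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data: retry */
            return -1;          /* real error, errno is set */
        }
        if (n == 0)
            break;              /* end of file */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

With such a wrapper, a result smaller than `count` can only mean that the file ended inside the requested range, which is the behaviour the message above expects from a single pread() call.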