On Mon, 2008-02-04 at 20:56 +0300, Vladislav Bolkhovitin wrote: > James Bottomley wrote: > > On Mon, 2008-02-04 at 20:16 +0300, Vladislav Bolkhovitin wrote: > > > >>James Bottomley wrote: > >> > >>>>>>So, James, what is your opinion on the above? Or the overall SCSI target > >>>>>>project simplicity doesn't matter much for you and you think it's fine > >>>>>>to duplicate Linux page cache in the user space to keep the in-kernel > >>>>>>part of the project as small as possible? > >>>>> > >>>>> > >>>>>The answers were pretty much contained here > >>>>> > >>>>>http://marc.info/?l=linux-scsi&m=120164008302435 > >>>>> > >>>>>and here: > >>>>> > >>>>>http://marc.info/?l=linux-scsi&m=120171067107293 > >>>>> > >>>>>Weren't they? > >>>> > >>>>No, sorry, it doesn't look so for me. They are about performance, but > >>>>I'm asking about the overall project's architecture, namely about one > >>>>part of it: simplicity. Particularly, what do you think about > >>>>duplicating Linux page cache in the user space to have zero-copy cached > >>>>I/O? Or can you suggest another architectural solution for that problem > >>>>in the STGT's approach? > >>> > >>> > >>>Isn't that an advantage of a user space solution? It simply uses the > >>>backing store of whatever device supplies the data. That means it takes > >>>advantage of the existing mechanisms for caching. > >> > >>No, please reread this thread, especially this message: > >>http://marc.info/?l=linux-kernel&m=120169189504361&w=2. This is one of > >>the advantages of the kernel space implementation. The user space > >>implementation has to have data copied between the cache and user space > >>buffer, but the kernel space one can use pages in the cache directly, > >>without extra copy. > > > > > > Well, you've said it thrice (the bellman cried) but that doesn't make it > > true. > > > > The way a user space solution should work is to schedule mmapped I/O > > from the backing store and then send this mmapped region off for target > > I/O. For reads, the page gather will ensure that the pages are up to > > date from the backing store to the cache before sending the I/O out. > > For writes, You actually have to do a msync on the region to get the > > data secured to the backing store. > > James, have you checked how fast is mmaped I/O if work size > size of > RAM? It's several times slower comparing to buffered I/O. It was many > times discussed in LKML and, seems, VM people consider it unavoidable. Erm, but if you're using the case of work size > size of RAM, you'll find buffered I/O won't help because you don't have the memory for buffers either. > So, using mmaped IO isn't an option for high performance. Plus, mmaped > IO isn't an option for high reliability requirements, since it doesn't > provide a practical way to handle I/O errors. I think you'll find it does ... the page gather returns -EFAULT if there's an I/O error in the gathered region. msync does something similar if there's a write failure. > > You also have to pull tricks with > > the mmap region in the case of writes to prevent useless data being read > > in from the backing store. > > Can you be more exact and specify what kind of tricks should be done for > that? Actually, just avoid touching it seems to do the trick with a recent kernel. James - To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html