On Fri, 2008-08-22 at 14:55 -0400, Vivek Goyal wrote:
> > > > As an aside, when the IO context of a certain IO operation is known
> > > > (synchronous IO comes to mind) I think it should be cached in the
> > > > resulting bio so that we can do without the expensive accesses to
> > > > bio_cgroup once it enters the block layer.
> > >
> > > Will this give you everything you need for accounting and control
> > > (from the block layer)?
> >
> > Well, it depends on what you are trying to achieve.
> >
> > Current IO schedulers such as CFQ only care about the io_context when
> > scheduling requests. When a new request comes in, CFQ assumes that it
> > originated in the context of the current task, which obviously does
> > not hold true for buffered IO and aio. This problem could be solved by
> > using bio-cgroup for IO tracking, but accessing the io context
> > information is somewhat expensive:
> >
> > page->page_cgroup->bio_cgroup->io_context.
> >
> > If at the time of building a bio we know its io context (i.e. the
> > context of the task or cgroup that generated that bio), I think we
> > should store it in the bio itself, too. With this scheme, whenever the
> > kernel needs to know the io_context of a particular block IO operation
> > it would first try to retrieve the io_context directly from the bio
> > and, if not available there, would resort to the slow path (accessing
> > it through bio_cgroup). My gut feeling is that elevator-based IO
> > resource controllers would benefit from such an approach, too.
>
> Hi Fernando,
>
> Had a question.
>
> IIUC, at the time of submitting the bio, the io_context will be known
> only for synchronous requests. For asynchronous requests it will not be
> known (e.g. when writing dirty pages back to disk) and one shall have
> to take the longer path (the bio-cgroup walk) to ascertain the
> io_context associated with a request.
>
> If that's the case, then it looks like we shall always have to traverse
> the longer path for asynchronous IO. By putting the io_context pointer
> in the bio, we would just shift the pointer traversal from CFQ to the
> higher layers.
>
> So it is probably not worthwhile to put an io_context pointer in the
> bio? Am I missing something?

Hi Vivek!

IMHO, optimizing the synchronous path alone would justify adding
io_context to the bio.

There is more to this, though. As you point out, it would seem that aio
and buffered IO would not benefit from caching the io context in the bio
itself, but there are some subtleties here. Let's consider stacking
devices and buffered IO, for example. When a bio enters such a device it
may get replicated several times and, depending on the topology, other
derivative bios will be created (RAID1 and parity configurations come to
mind, respectively). The problem here is that the memory allocated for
the newly created bios will be owned by the corresponding dm or md kernel
thread, not by the originator of the bio we are replicating or
calculating the parity bits from. The implication of this is that if we
took the longer path (via bio_cgroup) to obtain the io_context of those
bios, we would end up charging the wrong guy for that IO: the kernel
thread, not the perpetrator of the IO.

A possible solution could be to track the original bio inside the
stacking device so that the io context of derivative bios can be
obtained from its bio_cgroup. However, I am afraid such an approach
would be overly complex and slow.
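To make this concrete, what I have in mind is something along the
following lines. This is only a rough sketch, not against any real tree:
the bi_io_context field and the get_bio_cgroup()/bio_cgroup_io_context()
helpers are made-up names for illustration, and reference counting is
omitted for brevity.

/*
 * Resolve the io_context of a bio: try the cached pointer first, and
 * only fall back to the bio-cgroup walk when it is not there.
 */
struct io_context *bio_io_context(struct bio *bio)
{
	/*
	 * Fast path: the submitter knew its context (synchronous IO)
	 * and cached it in the bio at build time.
	 */
	if (bio->bi_io_context)
		return bio->bi_io_context;

	/*
	 * Slow path: page -> page_cgroup -> bio_cgroup -> io_context,
	 * as we have to do today (hypothetical helpers).
	 */
	return bio_cgroup_io_context(get_bio_cgroup(bio));
}

/*
 * Stacking drivers (dm/md) would call something like this whenever
 * they clone or derive a bio, so the charge stays with the originator
 * of the IO rather than with the kernel thread doing the cloning.
 */
static inline void bio_copy_io_context(struct bio *dst, struct bio *src)
{
	dst->bi_io_context = src->bi_io_context;
}

With something like this, CFQ (or an elevator-based resource controller)
would only pay for the full walk when the fast path fails, and
derivative bios created inside stacking devices would keep pointing at
the right io_context.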
My feeling is that storing the io_context also in bios is the right way
to go: once the bio enters the block layer the kernel can forget about
memory-related issues, thus avoiding what is arguably a layering
violation; io context information is not lost inside stacking devices
(we just need to make sure that whenever new bios are created the
io_context is carried over from the original one); and, finally, the
synchronous path can be easily optimized.

I hope this makes sense. Thank you for your comments.

- Fernando