On Fri, 17 Oct 2008 02:59:53 -0400
"Alexander Nezhinsky" <nezhinsky@xxxxxxxxx> wrote:

> > What we really need to do is improve Linux AIO support, which
> > benefits not only tgt but everyone.
>
> What about introducing a new module (or expanding the existing stgt kernel
> infrastructure), so that it supports an sg-like interface but translates SCSI
> commands into kernel AIO API calls?
>
> This will delegate the data-path part of the target to the kernel (which was
> always considered the right thing to do), it is protocol independent, and it
> breaks the dependency on the AIO and eventfd implementation etc.
>
> I also thought about it because it can allow introducing a bunch of special
> optimizations, which I can discuss separately.
>
> What do you think? If we agree that it is a plausible direction, I am
> ready to start writing a prototype.

I'm not sure what API you are proposing, but my opinion is irrelevant
here. You can propose whatever you want to the kernel developers. I'm
not the person who decides what will be merged into the mainline kernel.

tgt uses the best API that the mainline Linux kernel provides. The first
requirement is flexibility of device management, which every sane
proprietary storage system provides. Then, of course, the performance
of the API is important.

> >> This bs_sg implementation uses DIO at all times. I guess we don't have
> >> to care about WCE because when we send a status to initiator, the data is
> >> not merely written to cache (well, it is not), it has been actually sent to the
> >> i/o device and acked by it.
> >
> > Not correct. Using sg's DIO means nothing for this issue. DIO just
> > represents a way to move data between kernel and user space.
> >
> > For example, we enable WCE by default, so initiators send
> > SYNCHRONIZE_CACHE and this sg code ignores it. If a SCSI device
> > enables WCE, you are in trouble.
>
> In the case of SG when DIO is used, I guess that an I/O completion
> delivered to user space always means that the I/O reached the device;
> it can't be queued anywhere in the kernel.
> So only the device itself may really be WCE'ed. But then, if we pass through
> the INQUIRY command and do not fake the WCE bit, then we are ok.
> We report the WCE of the device and forward
> SYNCHRONIZE_CACHE to it, so everything is consistent.
>
> Please answer how you envision a desired solution. You did not respond
> to my suggestion about an additional API between the device type and the
> backing store. Do you suggest that I introduce a new device type with more
> commands forwarded to the backing store than sbc currently does?
> I would like to fix the issue.

I made it clear: SCSI passthrough is not an option. So any command
filtering (that's exactly passthrough, as I wrote in the previous
mail) is not an option either.

If you let some SCSI commands pass (maybe you want to let MODE SELECT
pass too, to change WCE), you need to track the state of the SCSI
devices (passed commands could change that state). It opens up a whole
new can of worms. For example, you need to make sure that the unit
attentions that tgtd fakes are consistent with the real state of the
SCSI devices.
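
The state-tracking concern in the reply can be made concrete with a small sketch. This is not tgt code, and it is written in Python only for brevity. The names ShadowState and note_passed_command are hypothetical, and the sketch assumes the Caching mode page has already been extracted from the MODE SELECT parameter list. It relies only on the standard layout in which WCE is bit 2 of byte 2 of the Caching mode page (page code 0x08).

```python
from dataclasses import dataclass

# Standard SCSI opcodes (SPC/SBC).
MODE_SELECT_6 = 0x15
MODE_SELECT_10 = 0x55
SYNCHRONIZE_CACHE_10 = 0x35
SYNCHRONIZE_CACHE_16 = 0x91

CACHING_PAGE = 0x08  # Caching mode page code
WCE_BIT = 0x04       # byte 2, bit 2 of the Caching mode page


@dataclass
class ShadowState:
    """Hypothetical target-side copy of the state it reports to initiators."""
    wce: bool = True                      # the mail says tgt enables WCE by default
    pending_unit_attention: bool = False  # must be reported to initiators


def caching_page_wce(page: bytes) -> bool:
    """Return the WCE bit of a Caching mode page."""
    if len(page) < 3 or (page[0] & 0x3F) != CACHING_PAGE:
        raise ValueError("not a Caching mode page")
    return bool(page[2] & WCE_BIT)


def note_passed_command(state: ShadowState, opcode: int, caching_page=None) -> bool:
    """Update the shadow state after a command was passed to the device.

    Returns True if the reported state had to change. Only MODE SELECT
    carrying a Caching mode page is handled; every other state-changing
    command would need its own case, which is the "can of worms".
    """
    if opcode not in (MODE_SELECT_6, MODE_SELECT_10) or caching_page is None:
        return False
    new_wce = caching_page_wce(caching_page)
    if new_wce == state.wce:
        return False
    state.wce = new_wce
    state.pending_unit_attention = True
    return True
```

Even this one-bit case forces the target to parse passed commands and keep a second copy of device state. A SYNCHRONIZE_CACHE passing through changes nothing in the shadow copy, but any command that alters mode pages, reservations, or media state would need equivalent handling.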