On Thu, 2016-05-05 at 07:24 -0700, Christoph Hellwig wrote:
> On Mon, May 02, 2016 at 06:41:51PM +0300, Boaz Harrosh wrote:
> > >
> > > All IO in a dax filesystem used to go through dax_do_io, which
> > > cannot handle media errors, and thus cannot provide a recovery
> > > path that can send a write through the driver to clear errors.
> > >
> > > Add a new iocb flag for DAX, and set it only for DAX mounts. In
> > > the IO path for DAX filesystems, use the same direct_IO path for
> > > both DAX and direct_io iocbs, but use the flags to identify when
> > > we are in O_DIRECT mode vs non O_DIRECT with DAX, and for
> > > O_DIRECT, use the conventional direct_IO path instead of DAX.
> > >
> > Really? What is your thinking here?
> >
> > What about all the current users of O_DIRECT, you have just made
> > them 4 times slower and "less concurrent*" than "buffered io"
> > users. Since direct_IO path will queue an IO request and all.
> > (And if it is not so slow then why do we need dax_do_io at all?
> > [Rhetorical])
> >
> > I hate it that you overload the semantics of a known and expected
> > O_DIRECT flag, for special pmem quirks. This is an incompatible
> > and unrelated overload of the semantics of O_DIRECT.
> Agreed - making O_DIRECT less direct than not having it is plain
> stupid, and I somehow missed this initially.

How is it any 'less direct'? All it does now is follow the blockdev
O_DIRECT path. There still isn't any page cache involved.

>
> This whole DAX story turns into a major nightmare, and I fear all our
> hodge podge tweaks to the semantics aren't helping it.
>
> It seems like we simply need an explicit O_DAX for the read/write
> bypass if we can't sort out the semantics (error, writer
> synchronization) just as we need a special flag for MMAP.
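
For reference, the path selection described in the quoted patch
description can be modelled roughly as below. This is only a userspace
sketch of the decision, not kernel code: the SKETCH_* flag names, the
enum, and sketch_pick_path() are all made up for illustration, and the
flag values are arbitrary. It assumes the behaviour as stated above:
O_DIRECT always takes the conventional direct_IO path, and only
non-O_DIRECT IO on a DAX mount takes the DAX path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits, for illustration only (not kernel values). */
#define SKETCH_IOCB_DIRECT (1u << 0) /* file was opened with O_DIRECT */
#define SKETCH_IOCB_DAX    (1u << 1) /* set only for DAX mounts */

/* The three IO paths this model distinguishes. */
enum sketch_io_path {
	SKETCH_PATH_BUFFERED,  /* ordinary page cache IO */
	SKETCH_PATH_DAX,       /* dax_do_io-style copy, no media error recovery */
	SKETCH_PATH_DIRECT_IO  /* conventional direct_IO, goes through the driver */
};

/*
 * Pick an IO path from the iocb flags, as the patch description reads:
 * O_DIRECT wins over DAX, so a write can reach the driver and clear
 * media errors; DAX is used only when O_DIRECT is not set.
 */
static enum sketch_io_path sketch_pick_path(unsigned int flags)
{
	if (flags & SKETCH_IOCB_DIRECT)
		return SKETCH_PATH_DIRECT_IO;
	if (flags & SKETCH_IOCB_DAX)
		return SKETCH_PATH_DAX;
	return SKETCH_PATH_BUFFERED;
}

/* Human-readable name for a path; returns NULL for an unknown value. */
static const char *sketch_path_name(enum sketch_io_path path)
{
	switch (path) {
	case SKETCH_PATH_BUFFERED:
		return "buffered";
	case SKETCH_PATH_DAX:
		return "dax";
	case SKETCH_PATH_DIRECT_IO:
		return "direct_IO";
	}
	assert(!"unknown sketch_io_path value");
	return NULL;
}
```

In this model, sketch_pick_path(SKETCH_IOCB_DAX | SKETCH_IOCB_DIRECT)
yields SKETCH_PATH_DIRECT_IO, which is the case being argued about in
this thread: O_DIRECT on a DAX mount no longer takes the DAX path.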