On Fri, Sep 30, 2022 at 09:56:41AM +0900, Gotou, Yasunori/五島 康文 wrote: > Hello everyone, > > On 2022/09/20 11:38, Yang, Xiao/杨 晓 wrote: > > Hi Darrick, Brian and Christoph > > > > Ping. I hope to get your feedback. > > > > 1) I have confirmed that the following patch set did not change the test > > result of generic/470 with thin-volume. Besides, I didn't see any > > failure when running generic/470 based on normal PMEM device instaed of > > thin-volume. > > https://lore.kernel.org/linux-xfs/20211129102203.2243509-1-hch@xxxxxx/ > > > > 2) I can reproduce the failure of generic/482 without thin-volume. > > > > 3) Is it necessary to make thin-volume support DAX. Is there any use > > case for the requirement? > > > Though I asked other place(*), I really want to know the usecase of > dm-thin-volume with DAX and reflink. > > > In my understanding, dm-thin-volume seems to provide similar feature like > reflink of xfs. Both feature provide COW update to reduce usage of > its region, and snapshot feature, right? > > I found that docker seems to select one of them (or other feature which > supports COW). Then user don't need to use thin-volume and reflink at same > time. > > Database which uses FS-DAX may want to use snapshot for its data of FS-DAX, > its user seems to be satisfied with reflink or thin-volume. > > So I could not find on what use-case user would like to use dm-thin-volume > and reflink at same time. > > The only possibility is that the user has mistakenly configured dm-thinpool > and reflink to be used at the same time, but if that is the case, it seems > to be better for the user to disable one or the other. > > I really wander why dm-thin-volume must be used with reflik and FS-DAX. There isn't a hard requirement between fsdax and dm-thinp. The /test/ needs dm-logwrites to check that write page faults on a MAP_SYNC mmapping are persisted directly to disk. dm-logwrites requires a fast way to zero an entire device for correct operation of the replay step, and thinp is the only way to guarantee that. --D > If my understanding is something wrong, please correct me. > > (*)https://lore.kernel.org/all/TYWPR01MB1008258F474CA2295B4CD3D9B90549@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/ > > Thanks, > --- > Yasunori Goto > > > > > > Best Regards, > > Xiao Yang > > > > On 2022/9/16 10:04, Yang, Xiao/杨 晓 wrote: > > > On 2022/9/15 18:14, Yang, Xiao/杨 晓 wrote: > > > > On 2022/9/15 0:28, Darrick J. Wong wrote: > > > > > On Wed, Sep 14, 2022 at 08:34:26AM -0400, Brian Foster wrote: > > > > > > On Wed, Sep 14, 2022 at 05:38:02PM +0800, Yang, Xiao/杨 晓 wrote: > > > > > > > On 2022/9/14 14:44, Yang, Xiao/杨 晓 wrote: > > > > > > > > On 2022/9/9 21:01, Brian Foster wrote: > > > > > > > > > Yes.. I don't recall all the internals of > > > > > > > > > the tools and test, but IIRC > > > > > > > > > it relied on discard to perform zeroing > > > > > > > > > between checkpoints or some such > > > > > > > > > and avoid spurious failures. The purpose of running on dm-thin was > > > > > > > > > merely to provide reliable discard zeroing > > > > > > > > > behavior on the target device > > > > > > > > > and thus to allow the test to run reliably. > > > > > > > > Hi Brian, > > > > > > > > > > > > > > > > As far as I know, generic/470 was original designed to verify > > > > > > > > mmap(MAP_SYNC) on the dm-log-writes device enabling DAX. Due to the > > > > > > > > reason, we need to ensure that all underlying devices under > > > > > > > > dm-log-writes device support DAX. However > > > > > > > > dm-thin device never supports > > > > > > > > DAX so > > > > > > > > running generic/470 with dm-thin device always returns "not run". > > > > > > > > > > > > > > > > Please see the difference between old and new logic: > > > > > > > > > > > > > > > > old logic new logic > > > > > > > > --------------------------------------------------------------- > > > > > > > > log-writes device(DAX) log-writes device(DAX) > > > > > > > > | | > > > > > > > > PMEM0(DAX) + PMEM1(DAX) Thin device(non-DAX) + PMEM1(DAX) > > > > > > > > | > > > > > > > > PMEM0(DAX) > > > > > > > > --------------------------------------------------------------- > > > > > > > > > > > > > > > > We think dm-thin device is not a good solution > > > > > > > > for generic/470, is there > > > > > > > > any other solution to support both discard zero and DAX? > > > > > > > > > > > > > > Hi Brian, > > > > > > > > > > > > > > I have sent a patch[1] to revert your fix because I > > > > > > > think it's not good for > > > > > > > generic/470 to use thin volume as my revert patch[1] describes: > > > > > > > [1] https://lore.kernel.org/fstests/20220914090625.32207-1-yangx.jy@xxxxxxxxxxx/T/#u > > > > > > > > > > > > > > > > > > > > > > > > > > I think the history here is that generic/482 was changed over first in > > > > > > commit 65cc9a235919 ("generic/482: use thin volume as > > > > > > data device"), and > > > > > > then sometime later we realized generic/455,457,470 had > > > > > > the same general > > > > > > flaw and were switched over. The dm/dax compatibility > > > > > > thing was probably > > > > > > just an oversight, but I am a little curious about that > > > > > > because it should > > > > > > > > > > It's not an oversight -- it used to work (albeit with EXPERIMENTAL > > > > > tags), and now we've broken it on fsdax as the pmem/blockdev divorce > > > > > progresses. > > > > Hi > > > > > > > > Do you mean that the following patch set changed the test result > > > > of generic/470 with thin-volume? (pass => not run/failure) > > > > https://lore.kernel.org/linux-xfs/20211129102203.2243509-1-hch@xxxxxx/ > > > > > > > > > > > > > > > have been obvious that the change caused the test to no > > > > > > longer run. Did > > > > > > something change after that to trigger that change in behavior? > > > > > > > > > > > > > With the revert, generic/470 can always run > > > > > > > successfully on my environment > > > > > > > so I wonder how to reproduce the out-of-order replay issue on XFS v5 > > > > > > > filesystem? > > > > > > > > > > > > > > > > > > > I don't quite recall the characteristics of the failures > > > > > > beyond that we > > > > > > were seeing spurious test failures with generic/482 that were due to > > > > > > essentially putting the fs/log back in time in a way that wasn't quite > > > > > > accurate due to the clearing by the logwrites tool not > > > > > > taking place. If > > > > > > you wanted to reproduce in order to revisit that, perhaps start with > > > > > > generic/482 and let it run in a loop for a while and see if it > > > > > > eventually triggers a failure/corruption..? > > > > > > > > > > > > > PS: I want to reproduce the issue and try to find a > > > > > > > better solution to fix > > > > > > > it. > > > > > > > > > > > > > > > > > > > It's been a while since I looked at any of this tooling > > > > > > to semi-grok how > > > > > > it works. > > > > > > > > > > I /think/ this was the crux of the problem, back in 2019? > > > > > https://lore.kernel.org/fstests/20190227061529.GF16436@dastard/ > > > > > > > > Agreed. > > > > > > > > > > > > > > > Perhaps it could learn to rely on something more explicit like > > > > > > zero range (instead of discard?) or fall back to manual zeroing? > > > > > > > > > > AFAICT src/log-writes/ actually /can/ do zeroing, but (a) it probably > > > > > ought to be adapted to call BLKZEROOUT and (b) in the worst case it > > > > > writes zeroes to the entire device, which is/can be slow. > > > > > > > > > > For a (crass) example, one of my cloudy test VMs uses 34GB partitions, > > > > > and for cost optimization purposes we're only "paying" for the cheapest > > > > > tier. Weirdly that maps to an upper limit of 6500 write iops and > > > > > 48MB/s(!) but that would take about 20 minutes to zero the entire > > > > > device if the dm-thin hack wasn't in place. Frustratingly, it doesn't > > > > > support discard or write-zeroes. > > > > > > > > Do you mean that discard zero(BLKDISCARD) is faster than both > > > > fill zero(BLKZEROOUT) and write zero on user space? > > > > > > Hi Darrick, Brian and Christoph > > > > > > According to the discussion about generic/470. I wonder if it is > > > necessary to make thin-pool support DAX. Is there any use case for > > > the requirement? > > > > > > Best Regards, > > > Xiao Yang > > > > > > > > Best Regards, > > > > Xiao Yang > > > > > > > > > > > If the > > > > > > eventual solution is simple and low enough overhead, it > > > > > > might make some > > > > > > sense to replace the dmthin hack across the set of tests mentioned > > > > > > above. > > > > > > > > > > That said, for a *pmem* test you'd expect it to be faster than that... > > > > > > > > > > --D > > > > > > > > > > > Brian > > > > > > > > > > > > > Best Regards, > > > > > > > Xiao Yang > > > > > > > > > > > > > > > > > > > > > > > BTW, only log-writes, stripe and linear support DAX for now. > > > > > > > > > > > > > >