On Thu, Sep 21, 2023 at 04:03:56PM +1000, Dave Chinner wrote: > On Wed, Sep 20, 2023 at 09:57:56PM -0700, Luis Chamberlain wrote: > > On Wed, Sep 20, 2023 at 08:00:12PM -0700, Luis Chamberlain wrote: > > > https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=large-block-linus > > > > > > I haven't tested yet the second branch I pushed though but it applied without any changes > > > so it should be good (usual famous last words). > > > > I have run some preliminary tests on that branch as well above using fsx > > with larger LBA formats running them all on the *same* system at the > > same time. Kernel is happy. <-- snip --> > So I just pulled this, built it and run generic/091 as the very > first test on this: > > # ./run_check.sh --mkfs-opts "-m rmapbt=1 -b size=64k" --run-opts "-s xfs_64k generic/091" The cover letter for this patch series acknowledged failures in fstests. For kdevops now, we borrow the same last linux-next baseline: git grep "generic/091" workflows/fstests/expunges/6.6.0-rc2-large-block-linus-nobdev workflows/fstests/expunges/6.6.0-rc2-large-block-linus-nobdev/xfs/unassigned/xfs_reflink_1024.txt:generic/091 # possible regression workflows/fstests/expunges/6.6.0-rc2-large-block-linus-nobdev/xfs/unassigned/xfs_reflink_16k.txt:generic/091 workflows/fstests/expunges/6.6.0-rc2-large-block-linus-nobdev/xfs/unassigned/xfs_reflink_32k.txt:generic/091 workflows/fstests/expunges/6.6.0-rc2-large-block-linus-nobdev/xfs/unassigned/xfs_reflink_64k_4ks.txt:generic/091 So well, we already know this fails. > For all these assertions about how none of your testing is finding > bugs in this code, It's taken me *4 seconds* of test runtime to find > the first failure. Because you know what to look for and this is not yet perfect. > And, well, it's the same failure as I reported for the previous > version of this code: And we haven't done *any* new changes to the patch series so no surprise either. > Guess what? The fsx parameters being used means it is testing things you > aren't. I actualy found quite a bit of issues with -W. And it was useful. > Yes, the '-Z -R -W' mean it is using direct IO for reads and writes, > mmap() is disabled. Other parameters indicate that using 4k aligned reads and > 512 byte aligned writes and truncates. Thanks! This will help for sure!. > There is a reason there are multiple different fsx tests in fstests; You made it clear, and I documented the goal to ensure we get to the point we pass all those: https://kernelnewbies.org/KernelProjects/large-block-size#fsx > they all exercise different sets of IO behaviours and alignments, > and they exercise the IO paths differently. > > So there's clearly something wrong here - it's likely that the > filesystem IO alignment parameters pulled from the underlying block > device (4k physical, 512 byte logical sector sizes) are improperly > interpreted. i.e. for a filesystem with a sector size of 4kB, > direct IO with an alignment of 512 bytes should be rejected...... So yes, this is not yet complete. But now let's step back and I want you to realize where we started and why we decided to post, in particular me, I was suggesting we post now, instead of waiting for us to resolve *it all*. When we first started this work we simply thought it was impossible. Unless of course you are Matthew and you believed hard in your work. The progress, which you don't see, is that steps towards fixing fsx issues have been logarithmic. Days, weeks, months before decent progress, but the progress was steady... And so to get to where we are today only just shows, well this is actually not impossible, and Matthew did the right thing with the right data structure, and the changes to the page cache with multi index array stuff, it seems to be able to also be used for LBS. At this point, from a logarithmic perspective, we have huge progress, and I don't think it will stop. It gives us confidence Matthew was right and LBS is possible indeed with the multi-index stuff. It's not about, can this crash. Yes, we know, it can crash. It's about how many different ways, and how many fixes left. Because clearly the multi-index stuff is working well. The code feedback so far on this patch series has mostly been "I don't think this patch is needed" or "perhaps this way is better", and that's the kind of feedback we're looking for. Because *each* new patch adds a huge a milestone. And it seems the progress has been logarithmic. It is exactly why this series went out with a few patches which ... we felt safer with them than without. For instance the batch delete.. I still am suspicious about us not needing as Hannes' patches also seem to rely on similer rounding on the wait stuff, and it seems to bring back memories on issues found on permissions. But anyway, the point is that, this is clearly not ready. But try to think of progress here as logarithmic, and any *dent* we make on the page cache to fix the last corner cases will be huge, not small. If you want to try, you can see for yourself, what's the next fix? :) And if found, was it logarithmic? How do we polish this? That's the goal of this patch series. Luis