On Sat, Apr 20, 2024 at 1:08 AM Veter Kamenev wrote:
>
> Hi folks!
>
> I learned about Nilfs2 many years ago and was impressed by its capabilities. The idea is brilliant!
>
> But due to glitches, I returned back to reiserfs3, which works like a charm, although it will be removed from the kernel in 2025.
>
> Now, after several years, I decided to return to Nilfs2, hoping that the bugs have been corrected.
>
> To my greatest regret, I received errors in all kernels and the system went into read-only state.
> Luckily I didn't lose any data.
>
> The texts of the errors will be below in the letter. In this regard, I have 2 questions:
>
> 1. What is the reason that after so many years, Nilfs2 still cannot work stably?

Hi Veter,

I've been fixing bugs reported by Google's automatic fuzzing tool called "syzbot" (particularly intensively over the last several years), so I believe the current version is pretty stable. However, since this activity is still ongoing, it is not possible to say that there are no bugs. Fortunately or unfortunately, a few major bugs have recently been fixed.

> 2. Are the errors I sent corrected? If fixed, then starting with what version of the kernel?

I can't say for sure due to lack of information, but the logs you reported appear to be symptoms of a high-impact mmap regression that was fixed within the last three months.

Assuming the problem occurred in v6.6.12, as given in the subject line, the following important fixes, including the mmap regression fix, have not yet been backported to that version. Regarding stable-6.6.y, versions below v6.6.24 (latest) do not include all important fixes.
$ git shortlog v6.6.12..v6.6.24 fs/nilfs2
Ryusuke Konishi (5):
      nilfs2: fix data corruption in dsync block recovery for small block sizes
      nilfs2: fix hang in nilfs_lookup_dirty_data_buffers()
      nilfs2: fix potential bug in end_buffer_async_write
      nilfs2: fix failure to detect DAT corruption in btree and direct mappings
      nilfs2: prevent kernel bug at submit_bh_wbc()

On the other hand, v6.7.12 (latest) and v6.8.4 (not latest) include all of these important fixes.

As for the logs, all of the messages in them are warnings and are harmless in themselves. However, they are often produced as a side effect of other problems, so you need to be careful in that sense.

> Apr 14 15:10:10 MainPC kernel: NILFS (dm-5): discard dirty page: offset=2579136512, ino=3

This message indicates that there were changes that could not be written to disk and were discarded during unmounting. It is likely to occur when the file system has been degraded to read-only.

> Apr 14 15:10:24 MainPC kernel: NILFS (dm-5): nilfs_get_block (ino=55694): a race condition while inserting a data block at offset=0

This message indicates that an unexpected race occurred when writing. A bug that causes this message to be output has recently been fixed, so that may be the cause. Of the patches listed above that have not been applied to v6.6.12, the relevant ones are the following two:

      nilfs2: fix potential bug in end_buffer_async_write
      nilfs2: fix failure to detect DAT corruption in btree and direct mappings

If the environment in which these messages were output is v6.7.12 or v6.8.4 and you think this is not the cause, it would be helpful if you could report the situation in more detail.

Regards,
Ryusuke Konishi
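
The version thresholds mentioned above can be checked mechanically. What follows is only a rough sketch, not part of nilfs2 or nilfs-utils: the function name `has_nilfs2_fixes` and the lookup table are illustrative assumptions, and the table covers only the three stable series named in this message (6.6.y, 6.7.y, 6.8.y). Any other series, or a release string that cannot be parsed, is reported as unknown (`None`) rather than guessed.

```python
import platform
import re
from typing import Optional

# Minimum patch level per stable series, taken from the message above.
# Series not mentioned there are deliberately absent (treated as unknown).
MIN_FIXED_PATCH = {
    (6, 6): 24,  # v6.6.24
    (6, 7): 12,  # v6.7.12
    (6, 8): 4,   # v6.8.4
}

# Matches the leading "6.6.12" of strings such as "v6.6.12" or "6.6.12-gentoo".
_RELEASE_RE = re.compile(r"^v?(\d+)\.(\d+)(?:\.(\d+))?")


def has_nilfs2_fixes(release: str) -> Optional[bool]:
    """Return True/False for a covered series, None if the answer is unknown."""
    match = _RELEASE_RE.match(release.strip())
    if match is None:
        return None
    major, minor = int(match.group(1)), int(match.group(2))
    patch = int(match.group(3) or 0)  # a bare "6.8" counts as patch level 0
    minimum = MIN_FIXED_PATCH.get((major, minor))
    if minimum is None:
        return None
    return patch >= minimum


if __name__ == "__main__":
    # platform.release() returns the running kernel release on Linux.
    running = platform.release()
    print(running, has_nilfs2_fixes(running))
```

A `False` result only means the running kernel predates the fixes listed here; a `True` result does not guarantee that no other bugs remain.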