Re: I'm trying NILFS2 in kernels 6.6.12, 6.7.12, 6.8.4

On Sat, Apr 20, 2024 at 1:08 AM Veter Kamenev wrote:
>
> Hi folks!
>
> I learned about Nilfs2 many years ago and was impressed by its capabilities. The idea is brilliant!
>
> But due to glitches, I returned back to reiserfs3, which works like a charm, although it will be removed from the kernel in 2025.
>
> Now, after several years, I decided to return to Nilfs2, hoping that the bugs have been corrected.
>
> To my greatest regret, I received errors in all kernels and the system went into read-only state.
> Luckily I didn't lose any data.
>
> The texts of the errors will be below in the letter. In this regard, I have 2 questions:
>
> 1. What is the reason that after so many years, Nilfs2 still cannot work stably?

Hi Veter,

I've been fixing bugs reported by Google's automatic fuzzing tool
called "syzbot" (particularly intensively over the last several
years), so I believe the current version is pretty stable.

However, since this work is still ongoing, I cannot say that there are
no bugs left.  Fortunately or unfortunately, a few major bugs have
recently been fixed.

> 2. Are the errors I sent corrected? If fixed, then starting with what version of the kernel?

I can't say for sure due to lack of information, but the logs you
reported appear to be symptoms of a high-impact mmap regression that
was fixed within the last three months.

Assuming the problem occurred in v6.6.12 (the first kernel in the
subject line), that version does not yet contain the following
important fixes, including the mmap regression fix.  In stable-6.6.y,
versions before v6.6.24 (the latest) do not include all of them.

$ git shortlog v6.6.12..v6.6.24 fs/nilfs2

Ryusuke Konishi (5):
      nilfs2: fix data corruption in dsync block recovery for small block sizes
      nilfs2: fix hang in nilfs_lookup_dirty_data_buffers()
      nilfs2: fix potential bug in end_buffer_async_write
      nilfs2: fix failure to detect DAT corruption in btree and direct mappings
      nilfs2: prevent kernel bug at submit_bh_wbc()

On the other hand, v6.7.12 (latest) and v6.8.4 (not latest) include
all of these important fixes.
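
As a rough way to check a kernel version against the releases
mentioned above, a sketch like the following could be used.  The
function name nilfs2_has_fixes is hypothetical.  The thresholds come
only from this mail (v6.6.24, v6.7.12, v6.8.4), and it assumes that
later releases of the same stable series keep the fixes; anything else
is reported as "unknown" rather than guessed.

```shell
#!/bin/sh
# Hypothetical helper: report whether a kernel version is known (from
# this mail) to contain the nilfs2 fixes listed above.
# Prints "yes", "no", or "unknown".
nilfs2_has_fixes() {
    ver=${1#v}          # drop a leading "v", e.g. v6.6.12 -> 6.6.12
    ver=${ver%%-*}      # drop a local suffix, e.g. 6.8.4-arch1 -> 6.8.4
    major=${ver%%.*}
    rest=${ver#*.}
    case "$rest" in
        *.*) minor=${rest%%.*}; patch=${rest#*.}; patch=${patch%%.*} ;;
        *)   minor=$rest; patch=0 ;;
    esac
    # Refuse to compare anything that is not a plain number.
    case "$patch" in
        ''|*[!0-9]*) echo unknown; return 0 ;;
    esac
    case "$major.$minor" in
        6.6) if [ "$patch" -ge 24 ]; then echo yes; else echo no; fi ;;
        6.7) if [ "$patch" -ge 12 ]; then echo yes; else echo unknown; fi ;;
        6.8) if [ "$patch" -ge 4 ];  then echo yes; else echo unknown; fi ;;
        *)   echo unknown ;;
    esac
}
```

For the running kernel this could be called as
nilfs2_has_fixes "$(uname -r)".  The authoritative check is still the
git shortlog of fs/nilfs2 for the exact tag in question.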

As for the logs, all of the messages in them are warnings and are
harmless in themselves.  However, they are often produced as a side
effect of other problems, so you need to be careful in that sense.

> Apr 14 15:10:10 MainPC kernel: NILFS (dm-5): discard dirty page: offset=2579136512, ino=3

This message indicates that there were changes that could not be
written to disk and were discarded during unmounting, and is likely to
occur when the file system is degraded to read-only.

> Apr 14 15:10:24 MainPC kernel: NILFS (dm-5): nilfs_get_block (ino=55694): a race condition while inserting a data block at offset=0

This message indicates that an unexpected race occurred during a
write.  A bug that produces this message was fixed recently, so that
bug may be the cause.  Of the patches listed above that have not been
applied to v6.6.12, the relevant ones are the following two:

      nilfs2: fix potential bug in end_buffer_async_write
      nilfs2: fix failure to detect DAT corruption in btree and direct mappings
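
To pick the two kinds of messages discussed above out of a larger
kernel log, a small filter along these lines might help.  It is only a
sketch: the function name classify_nilfs_log and the labels
"discarded-write" and "write-race" are made up for illustration, and
the patterns are based solely on the two log lines quoted in this
mail.

```shell
#!/bin/sh
# Hypothetical filter: read kernel log lines on stdin and print only
# the two NILFS warnings discussed in this mail, each with a label.
classify_nilfs_log() {
    while IFS= read -r line; do
        case "$line" in
            # Changes that could not be written and were dropped.
            *"NILFS ("*"): discard dirty page:"*)
                echo "discarded-write: $line" ;;
            # Unexpected race while inserting a data block.
            *"NILFS ("*"): nilfs_get_block"*"race condition"*)
                echo "write-race: $line" ;;
        esac
    done
}
```

It could be fed from the kernel log, for example with
dmesg | classify_nilfs_log or journalctl -k | classify_nilfs_log.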


If the environment in which these messages were output is v6.7.12 or
v6.8.4 and you think this is not the cause, it would be helpful if you
could report the situation in more detail.


Regards,
Ryusuke Konishi




