Re: [PATCH 2/7] btrfs: reduce the log level for btrfs_dev_stat_inc_and_print()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





在 2024/3/15 03:47, Filipe Manana 写道:
On Thu, Mar 14, 2024 at 9:54 AM Qu Wenruo <wqu@xxxxxxxx> wrote:

Currently when we increase the device statistics, it would always lead
to an error message in the kernel log.

I would argue this behavior is not ideal:

- It would flood the dmesg and bury real important messages
   One common scenario is scrub.
   If scrub hit some errors, it would cause both scrub and
   btrfs_dev_stat_inc_and_print() to print error messages.

   And in that case, btrfs_dev_stat_inc_and_print() is completely
   useless.

- The results of btrfs_dev_stat_inc_and_print() is mostly for history
   monitoring, doesn't has enough details

   If we trigger the errors during regular read, such messages from
   btrfs_dev_stat_inc_and_print() won't help us to locate the cause
   either.

The real usage for the btrfs device statistics is for some user space
daemon to check if there is any new errors, acting like some checks on
SMART, thus we don't really need/want those messages in dmesg.

This patch would reduce the log level to debug (disabled by default) for
btrfs_dev_stat_inc_and_print().
For users really want to utilize btrfs devices statistics, they should
go check "btrfs device stats" periodically, and we should focus the
kernel error messages to more important things.

Not sure if this is the right thing to do.

In the scrub context it can be annoying for sure.
Other cases I'm not so sure about, because having error messages in
dmesg/syslog may help notice issues more quickly.

For non-scrub cases, I'd argue we already have enough output:

No matter if the error is fixed or not, every time a mirror got csum
mismatch or other errors, we already have error message output:

 Data: btrfs_print_data_csum_error()
 Meta: btrfs_validate_extent_buffer()

For repaired ones, we have extra output from bio layer for both metadata
and data:
 btrfs_repair_io_failure()

So I'd say the dev_stat ones are already duplicated.

Thanks,
Qu


CC: stable@xxxxxxxxxxxxxxx
Signed-off-by: Qu Wenruo <wqu@xxxxxxxx>
---
  fs/btrfs/volumes.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index e49935a54da0..126145950ed3 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -7828,7 +7828,7 @@ void btrfs_dev_stat_inc_and_print(struct btrfs_device *dev, int index)

         if (!dev->dev_stats_valid)
                 return;
-       btrfs_err_rl_in_rcu(dev->fs_info,
+       btrfs_debug_rl_in_rcu(dev->fs_info,
                 "bdev %s errs: wr %u, rd %u, flush %u, corrupt %u, gen %u",
                            btrfs_dev_name(dev),
                            btrfs_dev_stat_read(dev, BTRFS_DEV_STAT_WRITE_ERRS),
--
2.44.0








[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux