Re: WARNING in break_stripe_batch_list with "stripe state: 2001"

Hi,

Could someone comment on this? Thanks.

BRs,
Guoqing

On 8/7/19 6:06 PM, Guoqing Jiang wrote:
Hi,

The warning below appeared on a 4.14.86 kernel, and the relevant code does not seem to have
changed in any obvious way in the mainline kernel.

  [7028915.431770] stripe state: 2001
  [7028915.431815] ------------[ cut here ]------------
  [7028915.431828] WARNING: CPU: 18 PID: 29089 at drivers/md/raid5.c:4614 break_stripe_batch_list+0x203/0x240 [raid456]
  [7028915.431829] Modules linked in: cpufreq_ondemand cpufreq_conservative cpufreq_userspace cpufreq_powersave ibnbd_server(O) ibtrs_server(O) ibtrs_core(O) sb_edac x86_pkg_temp_thermal crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ib_umad pcbc rdma_ucm ib_ipoib rdma_cm aesni_intel iw_cm aes_x86_64 ib_uverbs efi_pstore ib_cm crypto_simd glue_helper cryptd efivars sg ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter acpi_pad pcc_cpufreq button yars(O) andbd_server(O) andbd_client(O) andbd_shared(O) efivarfs raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq linear mlx5_ib ib_core ipv6 hid_generic usbhid crc32c_intel mlx5_core mlx_compat ehci_pci xhci_pci i2c_i801 ixgbe ahci i2c_core ehci_hcd xhci_hcd mdio libahci hwmon mpt3sas dca
  [7028915.431879] CPU: 18 PID: 29089 Comm: kworker/u82:5 Tainted: G           O    4.14.86-1-storage #4.14.86-1.2~deb9
  [7028915.431881] Hardware name: Supermicro SSG-2028R-ACR24L/X10DRH-iT, BIOS 3.1 06/18/2018
  [7028915.431888] Workqueue: raid5wq raid5_do_work [raid456]
  [7028915.431890] task: ffff9ab0ef36d7c0 task.stack: ffffb72926f84000
  [7028915.431896] RIP: 0010:break_stripe_batch_list+0x203/0x240 [raid456]
  [7028915.431898] RSP: 0018:ffffb72926f87ba8 EFLAGS: 00010286
  [7028915.431900] RAX: 0000000000000012 RBX: ffff9aaa84a98000 RCX: 0000000000000000
  [7028915.431901] RDX: 0000000000000000 RSI: ffff9ab2bfa15458 RDI: ffff9ab2bfa15458
  [7028915.431902] RBP: ffff9aaa8fb4e900 R08: 0000000000000001 R09: 0000000000002eb4
  [7028915.431903] R10: 00000000ffffffff R11: 0000000000000000 R12: ffff9ab1736f1b00
  [7028915.431904] R13: 0000000000000000 R14: ffff9aaa8fb4e900 R15: 0000000000000001
  [7028915.431906] FS:  0000000000000000(0000) GS:ffff9ab2bfa00000(0000) knlGS:0000000000000000
  [7028915.431907] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [7028915.431908] CR2: 00007ff953b9f5d8 CR3: 0000000bf4009002 CR4: 00000000003606e0
  [7028915.431909] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [7028915.431910] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [7028915.431910] Call Trace:
  [7028915.431923]  handle_stripe+0x8e7/0x2020 [raid456]
  [7028915.431930]  ? __wake_up_common_lock+0x89/0xc0
  [7028915.431935]  handle_active_stripes.isra.58+0x35f/0x560 [raid456]
  [7028915.431939]  raid5_do_work+0xc6/0x1f0 [raid456]
  [7028915.431944]  ? process_one_work+0x1c5/0x3b0
  [7028915.431945]  process_one_work+0x1c5/0x3b0
  [7028915.431948]  worker_thread+0x241/0x3e0
  [7028915.431951]  kthread+0xfc/0x130
  [7028915.431954]  ? trace_event_raw_event_workqueue_execute_start+0xa0/0xa0
  [7028915.431956]  ? kthread_create_on_node+0x70/0x70
  [7028915.431961]  ret_from_fork+0x1f/0x30


After going through the code, I found three places that call break_stripe_batch_list: two are in handle_stripe, and the third is in handle_stripe_clean_event (which is only called from handle_stripe).

The state "2001" means that STRIPE_IO_STARTED (not related to the warning) and STRIPE_ACTIVE are set. handle_stripe sets the STRIPE_ACTIVE flag at the beginning and clears it just before it returns, which means that whenever break_stripe_batch_list is called, STRIPE_ACTIVE is already set. Please see the snippet below.

handle_stripe
{
            ...
            test_and_set_bit_lock(STRIPE_ACTIVE, &sh->state);
            ...
            if (test_and_clear_bit(STRIPE_BATCH_ERR, &sh->state))
                break_stripe_batch_list(sh, 0);
            ...
            if (s.failed > conf->max_degraded ||
                (s.log_failed && s.injournal == 0)) {
                ...
                break_stripe_batch_list(sh, 0);
                ...
            }
            ...
            if (s.written &&
                (s.p_failed || ((test_bit(R5_Insync, &pdev->flags)
                             && !test_bit(R5_LOCKED, &pdev->flags)
                             && (test_bit(R5_UPTODATE, &pdev->flags) ||
                                 test_bit(R5_Discard, &pdev->flags))))) &&
                (s.q_failed || ((test_bit(R5_Insync, &qdev->flags)
                             && !test_bit(R5_LOCKED, &qdev->flags)
                             && (test_bit(R5_UPTODATE, &qdev->flags) ||
                                 test_bit(R5_Discard, &qdev->flags))))))
                handle_stripe_clean_event(conf, sh, disks);
            ...
            clear_bit_unlock(STRIPE_ACTIVE, &sh->state);
}
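
The reading of "stripe state: 2001" given above can be checked with a small userspace sketch. The bit positions are an assumption taken from the order of the stripe state enum in drivers/md/raid5.h (STRIPE_ACTIVE as bit 0, STRIPE_IO_STARTED as bit 13); only these two flags are modelled, and the helper names are made up for illustration.

```c
#include <stdio.h>

/* Assumed bit positions, following the enum order in drivers/md/raid5.h.
 * Only the two flags relevant to "stripe state: 2001" are listed. */
enum {
	STRIPE_ACTIVE     = 0,
	STRIPE_IO_STARTED = 13,
};

/* Hypothetical helper: returns 1 if flag bit 'bit' is set in 'state'. */
static inline int stripe_flag_set(unsigned long state, int bit)
{
	return (int)((state >> bit) & 1UL);
}

/* Hypothetical helper: returns the bits of 'state' that are not one of
 * the two flags modelled here, so unexpected flags stay visible. */
static inline unsigned long stripe_unknown_bits(unsigned long state)
{
	return state & ~((1UL << STRIPE_ACTIVE) | (1UL << STRIPE_IO_STARTED));
}

/* Prints the modelled flags found in 'state' (the kernel prints it in hex). */
static inline void print_stripe_state(unsigned long state)
{
	printf("stripe state: %lx\n", state);
	if (stripe_flag_set(state, STRIPE_ACTIVE))
		printf("  STRIPE_ACTIVE\n");
	if (stripe_flag_set(state, STRIPE_IO_STARTED))
		printf("  STRIPE_IO_STARTED\n");
	if (stripe_unknown_bits(state))
		printf("  other bits: %lx\n", stripe_unknown_bits(state));
}
```

Calling print_stripe_state(0x2001UL) from a test program lists both flags and no other bits, under the assumed bit positions.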

So from my understanding, break_stripe_batch_list always triggers the warning when it is called, which is not good. However, the function is only called under certain conditions (too many failed devices, a write error, failure of pdisk/qdisk, etc.), so the warning happens rarely, though I did find the same
issue reported on the list [1].

Maybe it makes sense to remove the check of STRIPE_ACTIVE, similar to commit 550da24f8d62f
("md/raid5: preserve STRIPE_PREREAD_ACTIVE in break_stripe_batch_list").

@@ -4606,8 +4607,7 @@ static void break_stripe_batch_list(struct stripe_head *head_sh,

                list_del_init(&sh->batch_list);

-               WARN_ONCE(sh->state & ((1 << STRIPE_ACTIVE) |
-                                         (1 << STRIPE_SYNCING) |
+               WARN_ONCE(sh->state & ((1 << STRIPE_SYNCING) |
                                          (1 << STRIPE_REPLACED) |
                                          (1 << STRIPE_DELAYED) |
                                          (1 << STRIPE_BIT_DELAY) |
@@ -4626,6 +4626,7 @@ static void break_stripe_batch_list(struct stripe_head *head_sh,

                set_mask_bits(&sh->state, ~(STRIPE_EXPAND_SYNC_FLAGS |
                                            (1 << STRIPE_PREREAD_ACTIVE) |
+                                           (1 << STRIPE_ACTIVE) |
                                            (1 << STRIPE_DEGRADED) |
                                            (1 << STRIPE_ON_UNPLUG_LIST)),
                              head_sh->state & (1 << STRIPE_INSYNC));
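
To illustrate the effect of the two hunks, here is a rough userspace model, not the kernel code itself. It assumes the bit positions from the stripe state enum in drivers/md/raid5.h, models only the flags visible in the hunks (the rest of the WARN_ONCE mask, STRIPE_EXPAND_SYNC_FLAGS and STRIPE_ON_UNPLUG_LIST are left out), and replaces the atomic set_mask_bits() with a plain function computing (old & ~mask) | bits. All helper names are hypothetical.

```c
/* Assumed bit positions, following the enum order in drivers/md/raid5.h. */
enum {
	STRIPE_ACTIVE         = 0,
	STRIPE_SYNCING        = 3,
	STRIPE_INSYNC         = 4,
	STRIPE_REPLACED       = 5,
	STRIPE_PREREAD_ACTIVE = 6,
	STRIPE_DELAYED        = 7,
	STRIPE_DEGRADED       = 8,
	STRIPE_BIT_DELAY      = 9,
	STRIPE_IO_STARTED     = 13,
};

#define BIT_UL(n) (1UL << (n))

/* Partial WARN_ONCE mask, only the flags visible in the first hunk.
 * patched == 0 includes STRIPE_ACTIVE, patched != 0 drops it. */
static inline unsigned long warn_mask(int patched)
{
	unsigned long m = BIT_UL(STRIPE_SYNCING) | BIT_UL(STRIPE_REPLACED) |
			  BIT_UL(STRIPE_DELAYED) | BIT_UL(STRIPE_BIT_DELAY);

	return patched ? m : (m | BIT_UL(STRIPE_ACTIVE));
}

/* Partial set of flags preserved by the second hunk; the patched
 * version additionally preserves STRIPE_ACTIVE. */
static inline unsigned long keep_mask(int patched)
{
	unsigned long k = BIT_UL(STRIPE_PREREAD_ACTIVE) | BIT_UL(STRIPE_DEGRADED);

	return patched ? (k | BIT_UL(STRIPE_ACTIVE)) : k;
}

/* Non-atomic model of set_mask_bits(): clear 'mask', then set 'bits'. */
static inline unsigned long set_mask_bits_model(unsigned long old,
						unsigned long mask,
						unsigned long bits)
{
	return (old & ~mask) | bits;
}

/* State of a stripe after the modelled set_mask_bits() call:
 * keep the preserved flags and copy STRIPE_INSYNC from the head. */
static inline unsigned long break_batch_state(unsigned long sh_state,
					      unsigned long head_state,
					      int patched)
{
	return set_mask_bits_model(sh_state, ~keep_mask(patched),
				   head_state & BIT_UL(STRIPE_INSYNC));
}
```

With the reported state 0x2001, (0x2001 & warn_mask(0)) is non-zero while (0x2001 & warn_mask(1)) is zero, and break_batch_state(0x2001, 0, 1) keeps bit 0 (STRIPE_ACTIVE) where the unpatched variant clears it.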



[1]. https://www.spinics.net/lists/raid/msg62552.html

Any comments? Thanks in advance.

BRs,
Guoqing




