On Sat, Oct 26, 2019 at 3:07 AM Anssi Hannula <anssi.hannula@xxxxxx> wrote:
>
> Song Liu wrote on 2019-10-25 01:56:
> > On Thu, Oct 24, 2019 at 12:42 PM Anssi Hannula <anssi.hannula@xxxxxx>
> > wrote:
> >> Song Liu wrote on 2019-10-24 21:50:
> >> > On Sat, Oct 19, 2019 at 2:10 AM Anssi Hannula <anssi.hannula@xxxxxx>
> >> > wrote:
> >> >>
> >> >> Hi all,
> >> >>
> >> >> I'm seeing a reshape issue where the array gets stuck with requests
> >> >> seemingly getting blocked and the md0_raid6 process taking 100% CPU
> >> >> whenever I --continue the reshape.
> >> >>
> >> >> From what I can tell, the md0_raid6 process is stuck processing a set
> >> >> of stripes over and over via handle_stripe() without progressing.
> >> >>
> >> >> A log excerpt of one handle_stripe() of an affected stripe with some
> >> >> extra logging is below.
> >> >> The 4600-5200 integers are line numbers for
> >> >> http://onse.fi/files/reshape-infloop-issue/raid5.c .
> >> >
> >> > Maybe add sh->sector to DEBUGPRINT()?
> >>
> >> Note that the XX debug printing was guarded by
> >>
> >>     bool debout = (sh->sector == 198248960) && __ratelimit(&_rsafasfas);
> >>
> >> So everything was for sector 198248960 and rate limited to every 20sec
> >> to avoid a flood.
> >>
> >> > Also, please add more DEBUGPRINT() in the
> >> >
> >> >     if (sh->reconstruct_state == reconstruct_state_result) {
> >> >
> >> > case.
> >>
> >> OK, added prints there.
> >>
> >> Though after logging I noticed that the execution never gets there;
> >> sh->reconstruct_state is always reconstruct_state_idle at that point.
> >> It gets cleared on the "XX too many failed" log message (line 4798).
> >>
> > I guess the failed = 10 is the problem here..
> >
> > What does /proc/mdstat say?
>
> After --assemble --backup-file=xx, before --grow:
>
> md0 : active raid6 sdac[0] sdf[21] sdh[17] sdj[18] sde[26] sdr[20]
> sds[15] sdad[25] sdk[13] sdp[27] sdo[11] sdl[10] sdn[9] sdt[16] md8[28]
> sdi[22] sdae[23] sdaf[24] sdm[3] sdg[2] sdq[1]
>       74232661248 blocks super 1.1 level 6, 64k chunk, algorithm 2
> [20/20] [UUUUUUUUUUUUUUUUUUUU]
>       [===================>.]  reshape = 97.5% (4024886912/4124036736)
> finish=10844512.0min speed=0K/sec
>       bitmap: 5/31 pages [20KB], 65536KB chunk
>
> After --grow --continue --backup-file=xx (i.e. during the
> handle_stripe() loop):
>
> md0 : active raid6 sdac[0] sdf[21] sdh[17] sdj[18] sde[26] sdr[20]
> sds[15] sdad[25] sdk[13] sdp[27] sdo[11] sdl[10] sdn[9] sdt[16] md8[28]
> sdi[22] sdae[23] sdaf[24] sdm[3] sdg[2] sdq[1]
>       74232661248 blocks super 1.1 level 6, 64k chunk, algorithm 2
> [20/20] [UUUUUUUUUUUUUUUUUUUU]
>       [===================>.]  reshape = 97.5% (4024917256/4124036736)
> finish=7674.2min speed=215K/sec
>       bitmap: 5/31 pages [20KB], 65536KB chunk
>
> After a reboot due to the stuck array, the reshape progress gets reset
> back to 4024886912.

So, I am looking at this part.
[  274.084437] check 19: state 0x813 read 00000000b79fab11 write 00000000b79fab11 written 00000000b79fab11
[  274.084969] check 18: state 0xa01 read 00000000b79fab11 write 00000000b79fab11 written 00000000b79fab11
[  274.085491] check 17: state 0xa01 read 00000000b79fab11 write 00000000b79fab11 written 00000000b79fab11
[  274.086036] check 16: state 0xa01 read 00000000b79fab11 write 00000000b79fab11 written 00000000b79fab11
[  274.086569] check 15: state 0x811 read 00000000b79fab11 write 00000000b79fab11 written 00000000b79fab11
[  274.087088] check 14: state 0xa01 read 00000000b79fab11 write 00000000b79fab11 written 00000000b79fab11
[  274.087612] check 13: state 0xa01 read 00000000b79fab11 write 00000000b79fab11 written 00000000b79fab11
[  274.088129] check 12: state 0xa01 read 00000000b79fab11 write 00000000b79fab11 written 00000000b79fab11
[  274.088652] check 11: state 0x811 read 00000000b79fab11 write 00000000b79fab11 written 00000000b79fab11
[  274.089167] check 10: state 0x811 read 00000000b79fab11 write 00000000b79fab11 written 00000000b79fab11
[  274.089691] check 9: state 0x811 read 00000000b79fab11 write 00000000b79fab11 written 00000000b79fab11
[  274.090209] check 8: state 0x811 read 00000000b79fab11 write 00000000b79fab11 written 00000000b79fab11
[  274.090727] check 7: state 0xa01 read 00000000b79fab11 write 00000000b79fab11 written 00000000b79fab11
[  274.091243] check 6: state 0xa01 read 00000000b79fab11 write 00000000b79fab11 written 00000000b79fab11
[  274.091763] check 5: state 0xa01 read 00000000b79fab11 write 00000000b79fab11 written 00000000b79fab11
[  274.092279] check 4: state 0xa01 read 00000000b79fab11 write 00000000b79fab11 written 00000000b79fab11
[  274.092800] check 3: state 0x811 read 00000000b79fab11 write 00000000b79fab11 written 00000000b79fab11
[  274.093309] check 2: state 0x811 read 00000000b79fab11 write 00000000b79fab11 written 00000000b79fab11
[  274.093810] check 1: state 0x811 read 00000000b79fab11 write 00000000b79fab11 written 00000000b79fab11
[  274.094317] check 0: state 0x813 read 00000000b79fab11 write 00000000b79fab11 written 00000000b79fab11

So 10 devices don't have R5_Insync here. (Assuming your debug raid5.c
keeps the upstream r5dev_flags bit numbering, 0x811/0x813 decode to
R5_UPTODATE|R5_Insync|R5_Expanded, plus R5_LOCKED for 0x813, while the
ten 0xa01 devices decode to R5_UPTODATE|R5_ReadError|R5_Expanded, i.e.
R5_ReadError is set and R5_Insync is not.) Could you please add more
prints to the following section:

		if (!rdev)
			/* Not in-sync */;
		else if (is_bad) {
			/* also not in-sync */

[...]

		else if (test_bit(R5_UPTODATE, &dev->flags) &&
			 test_bit(R5_Expanded, &dev->flags))
			/* If we've reshaped into here, we assume it is Insync.
			 * We will shortly update recovery_offset to make
			 * it official.
			 */
			set_bit(R5_Insync, &dev->flags);

I guess we get into the "is_bad" case, but it should not be the case?

Thanks,
Song
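
In case it is useful, here is the kind of print I have in mind: an
untested sketch that mirrors the upstream analyse_stripe() branch
structure (your modified raid5.c may differ), with the set_bit()
bodies elided. i, rdev, dev, is_bad and first_bad are the locals
already in scope there, and pr_debug() can be swapped for your
DEBUGPRINT():

	if (!rdev)
		/* Not in-sync */
		pr_debug("XX sec %llu dev %d: no rdev\n",
			 (unsigned long long)sh->sector, i);
	else if (is_bad) {
		/* also not in-sync */
		pr_debug("XX sec %llu dev %d: is_bad=%d first_bad=%llu flags=0x%lx\n",
			 (unsigned long long)sh->sector, i, is_bad,
			 (unsigned long long)first_bad, dev->flags);
	} else if (test_bit(In_sync, &rdev->flags))
		pr_debug("XX sec %llu dev %d: rdev In_sync\n",
			 (unsigned long long)sh->sector, i);
	else if (sh->sector + STRIPE_SECTORS <= rdev->recovery_offset)
		/* in sync if before recovery_offset */
		pr_debug("XX sec %llu dev %d: before recovery_offset %llu\n",
			 (unsigned long long)sh->sector, i,
			 (unsigned long long)rdev->recovery_offset);
	else if (test_bit(R5_UPTODATE, &dev->flags) &&
		 test_bit(R5_Expanded, &dev->flags))
		pr_debug("XX sec %llu dev %d: UPTODATE+Expanded\n",
			 (unsigned long long)sh->sector, i);
	else
		pr_debug("XX sec %llu dev %d: no Insync branch, recovery_offset=%llu\n",
			 (unsigned long long)sh->sector, i,
			 (unsigned long long)rdev->recovery_offset);

That should show directly whether the ten devices take the is_bad
branch or just miss the recovery_offset check during the reshape.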