Re: Assembling journaled array fails

On 6/4/20 12:07 AM, Song Liu wrote:

The hang happens at expected place.

[Jun 3 09:02] INFO: task mdadm:2858 blocked for more than 120 seconds.
[  +0.060545]       Tainted: G            E     5.4.19-msl-00001-gbf39596faf12 #2
[  +0.062932] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

Could you please try disabling the timeout message with

echo 0 > /proc/sys/kernel/hung_task_timeout_secs

And during this wait (after the message
"r5c_recovery_flush_data_only_stripes before wait_event"),
check whether the raid disks (not the journal disk) are taking IOs
(using tools like iostat).
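
If iostat is not at hand, the same check can be approximated by sampling /proc/diskstats directly. The following is only a minimal sketch, not part of the original request: the helper names are made up for illustration, and the member-disk names in the usage comment (sda, sdb, sdc) are hypothetical placeholders for the actual raid members.

```python
import time

SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


def parse_diskstats(text):
    """Return {device: (sectors_read, sectors_written)} from /proc/diskstats text."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue
        # Layout: major minor name reads merged sectors_read ms
        #         writes merged sectors_written ...
        stats[fields[2]] = (int(fields[5]), int(fields[9]))
    return stats


def io_rates_kb(before, after, interval_s):
    """Return {device: (kB_read/s, kB_written/s)} between two snapshots."""
    rates = {}
    for dev, (r1, w1) in after.items():
        if dev not in before:
            continue
        r0, w0 = before[dev]
        rates[dev] = (
            (r1 - r0) * SECTOR_BYTES / 1024 / interval_s,
            (w1 - w0) * SECTOR_BYTES / 1024 / interval_s,
        )
    return rates


def watch(devices, interval_s=10):
    """Print read/write rates for the given devices until interrupted."""
    with open("/proc/diskstats") as f:
        prev = parse_diskstats(f.read())
    while True:
        time.sleep(interval_s)
        with open("/proc/diskstats") as f:
            cur = parse_diskstats(f.read())
        rates = io_rates_kb(prev, cur, interval_s)
        for dev in devices:
            if dev in rates:
                rd, wr = rates[dev]
                print(f"{dev:10s} read {rd:10.2f} kB/s   write {wr:10.2f} kB/s")
        prev = cur


# Usage (device names are hypothetical, substitute the raid member disks):
# watch(["sda", "sdb", "sdc"], interval_s=10)
```

Non-zero read or write rates on the member disks during the wait would indicate that they are taking IOs; all zeros would suggest that only the journal device is being touched.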


Will report tomorrow (the machine was restarted, so I have to wait 19+ hours again until r5c_recovery_flush_log / processing completes its part of the job).

Non-assembling raid issue aside - any idea why it is so inhumanly slow? It's not really of much use in a production scenario in this state.

Following are the stats, sampled every 10 seconds, from the journal device after the assembly of the main raid started.

Device             tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
md125             3.00      3072.00         0.00      30720          0
md125             2.80      2867.20         0.00      28672          0
md125             2.10      2150.40         0.00      21504          0
md125             1.90      1945.60         0.00      19456          0
md125             2.00      1920.40         0.00      19204          0
md125             1.30      1331.20         0.00      13312          0
md125             1.50      1536.00         0.00      15360          0
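
To put numbers on "slow", the samples above can be reduced to an average read rate and an approximate request size (kB_read/s divided by tps). This is only a rough sketch: the function name is made up for illustration, and since iostat rounds tps to two decimals, the per-request figure is approximate.

```python
SAMPLES = """\
md125             3.00      3072.00         0.00      30720          0
md125             2.80      2867.20         0.00      28672          0
md125             2.10      2150.40         0.00      21504          0
md125             1.90      1945.60         0.00      19456          0
md125             2.00      1920.40         0.00      19204          0
md125             1.30      1331.20         0.00      13312          0
md125             1.50      1536.00         0.00      15360          0
"""


def summarize(text):
    """Return (mean kB_read/s, list of approximate kB per request)."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Expect: Device tps kB_read/s kB_wrtn/s kB_read kB_wrtn
        if len(fields) != 6 or fields[0] == "Device":
            continue
        rows.append((float(fields[1]), float(fields[2])))
    mean_kb_s = sum(read for _, read in rows) / len(rows)
    kb_per_req = [read / tps for tps, read in rows if tps > 0]
    return mean_kb_s, kb_per_req


mean_kb_s, kb_per_req = summarize(SAMPLES)
print(f"mean read rate: {mean_kb_s:.1f} kB/s")
print("kB per request:", ", ".join(f"{k:.0f}" for k in kb_per_req))
```

On these seven samples the journal device averages roughly 2.1 MB/s of reads, issued as about 1 MB per request at 1 to 3 requests per second, and the rate is trending downward.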


