Hello Neil,

Many thanks for your input! The support you guys are providing is one of a kind!

I've been able to kill some of the blocked processes, releasing the locked files on the `/data` mount point, but some of them remained locked. I've booted with SystemRescueCD and it automatically detected and assembled the array as md127. The array was in a read-only state, but after mounting it under `/mnt` in SystemRescueCD the reshape process started from the beginning. Right now `/proc/mdstat` looks as follows:

[root@sysresccd ~]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md127 : active raid6 sda1[5] sdg1[6] sdd1[4] sdf1[3]
      7813771264 blocks super 1.2 level 6, 512k chunk, algorithm 18 [4/3] [UUU_]
      [=================>...]  reshape = 88.1% (3444089344/3906885632) finish=189.5min speed=40688K/sec
      bitmap: 8/30 pages [32KB], 65536KB chunk

unused devices: <none>

Initially the speed was reaching 63M/sec; it has dropped a bit since, but is still at a very good level.
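For reference, this is roughly how I'm keeping an eye on the reshape. The speed_limit value below is only what I would try if it slows down much more - I haven't changed anything yet:

  # refresh /proc/mdstat every 30 seconds
  watch -n 30 cat /proc/mdstat

  # current md resync/reshape speed limits (KB/s per device)
  cat /proc/sys/dev/raid/speed_limit_min /proc/sys/dev/raid/speed_limit_max

  # example value only: raise the guaranteed minimum if the reshape gets starved
  echo 100000 > /proc/sys/dev/raid/speed_limit_min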
The `mdadm --examine` output is shown below:

[root@sysresccd ~]# mdadm --examine /dev/sda
/dev/sda:
   MBR Magic : aa55
Partition[0] : 4294967295 sectors at 1 (type ee)

[root@sysresccd ~]# mdadm --examine /dev/sda1
/dev/sda1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x45
     Array UUID : a3d7766c:aeb658d3:8e2d29b7:4de30dab
           Name : debnas:0
  Creation Time : Thu Dec 25 12:15:20 2014
     Raid Level : raid6
   Raid Devices : 4

 Avail Dev Size : 7813771264 (3725.90 GiB 4000.65 GB)
     Array Size : 7813771264 (7451.79 GiB 8001.30 GB)
    Data Offset : 262144 sectors
     New Offset : 131072 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : 1ca5a215:ce082450:6ebb8d48:ae2e78f6

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 6856704000 (6539.06 GiB 7021.26 GB)
     New Layout : left-symmetric

    Update Time : Tue Sep  3 11:17:21 2019
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 159054c7 - correct
         Events : 210442

         Layout : left-symmetric-6
     Chunk Size : 512K

    Device Role : Active device 0
    Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)

What can you say about the data offset?
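In case the offsets on the other members are of interest too, I can collect them all like this (sdd1, sdf1 and sdg1 being the remaining members listed in `/proc/mdstat`):

  # show the offset fields for every member of md127
  for dev in /dev/sda1 /dev/sdd1 /dev/sdf1 /dev/sdg1; do
      echo "== $dev =="
      mdadm --examine "$dev" | grep -i offset
  done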
The question that still remains: when the reshape under SystemRescueCD finishes, can I safely mount the array on the outdated host and perform the mdadm and system updates?
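For what it's worth, this is the rough sequence I have in mind once the reshape reports 100%. The read-only mount is just my own precaution, the device-mapper path is a placeholder, and I've left out activating/deactivating the dm layer that sits between the array and the filesystem:

  # in SystemRescueCD: confirm the reshape is finished, then shut it down cleanly
  cat /proc/mdstat
  mdadm --detail /dev/md127
  umount /mnt
  mdadm --stop /dev/md127

  # back on the Debian host: mount read-only at first (placeholder device name),
  # then run the pending mdadm/kernel/system updates
  mount -o ro /dev/mapper/VG-data /data
  apt-get update && apt-get upgrade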
Also, what I've found in `dmesg` is quite distressing:

[ 1149.107869] md: reshape of RAID array md127
[ 1149.521336] EXT4-fs (dm-0): 1 orphan inode deleted
[ 1149.521338] EXT4-fs (dm-0): recovery complete
[ 1149.826795] EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: (null)
[ 1151.392376] md/raid:md127: read error corrected (8 sectors at 9990072 on sda1)
[ 1151.392379] md/raid:md127: read error corrected (8 sectors at 9990080 on sda1)
[ 1151.392380] md/raid:md127: read error corrected (8 sectors at 9990088 on sda1)
[ 1151.392382] md/raid:md127: read error corrected (8 sectors at 9990096 on sda1)
[ 1151.392383] md/raid:md127: read error corrected (8 sectors at 9990104 on sda1)
[ 1151.392385] md/raid:md127: read error corrected (8 sectors at 9990112 on sda1)
[ 1151.392387] md/raid:md127: read error corrected (8 sectors at 9990120 on sda1)
[ 1151.392388] md/raid:md127: read error corrected (8 sectors at 9990128 on sda1)
[ 1151.392390] md/raid:md127: read error corrected (8 sectors at 9990136 on sda1)
[ 1151.392391] md/raid:md127: read error corrected (8 sectors at 9990144 on sda1)
[ 1273.654745] raid5_end_read_request: 364 callbacks suppressed
[ 1273.654748] md/raid:md127: read error corrected (8 sectors at 24437552 on sda1)
[ 1273.654750] md/raid:md127: read error corrected (8 sectors at 24437560 on sda1)
[ 1273.654752] md/raid:md127: read error corrected (8 sectors at 24437568 on sda1)
[ 1273.654754] md/raid:md127: read error corrected (8 sectors at 24437576 on sda1)
[ 1273.654756] md/raid:md127: read error corrected (8 sectors at 24437584 on sda1)
[ 1273.654758] md/raid:md127: read error corrected (8 sectors at 24437592 on sda1)
[ 1273.654759] md/raid:md127: read error corrected (8 sectors at 24437600 on sda1)
[ 1273.654761] md/raid:md127: read error corrected (8 sectors at 24437608 on sda1)
[ 1273.654763] md/raid:md127: read error corrected (8 sectors at 24437616 on sda1)
[ 1273.654765] md/raid:md127: read error corrected (8 sectors at 24437624 on sda1)
[ 1324.777467] raid5_end_read_request: 118 callbacks suppressed
[ 1324.777471] md/raid:md127: read error corrected (8 sectors at 30680296 on sdd1)
[ 1324.777474] md/raid:md127: read error corrected (8 sectors at 30680304 on sdd1)
[ 1324.777476] md/raid:md127: read error corrected (8 sectors at 30680312 on sdd1)
[ 1324.777478] md/raid:md127: read error corrected (8 sectors at 30680320 on sdd1)
[ 1324.777480] md/raid:md127: read error corrected (8 sectors at 30680328 on sdd1)
[ 1324.777483] md/raid:md127: read error corrected (8 sectors at 30680336 on sdd1)
[ 1324.777485] md/raid:md127: read error corrected (8 sectors at 30680344 on sdd1)
[ 1324.777487] md/raid:md127: read error corrected (8 sectors at 30680352 on sdd1)
[ 1324.777489] md/raid:md127: read error corrected (8 sectors at 30680360 on sdd1)
[ 1324.777491] md/raid:md127: read error corrected (8 sectors at 30680368 on sdd1)
[ 5225.912468] raid5_end_read_request: 22 callbacks suppressed
[ 5225.912472] md/raid:md127: read error corrected (8 sectors at 509112152 on sda1)
[ 5225.912475] md/raid:md127: read error corrected (8 sectors at 509112160 on sda1)
[ 5225.912477] md/raid:md127: read error corrected (8 sectors at 509112168 on sda1)
[ 5225.912478] md/raid:md127: read error corrected (8 sectors at 509112176 on sda1)
[ 5225.912481] md/raid:md127: read error corrected (8 sectors at 509112184 on sda1)
[ 5225.912482] md/raid:md127: read error corrected (8 sectors at 509112192 on sda1)
[ 5225.912484] md/raid:md127: read error corrected (8 sectors at 509112200 on sda1)
[ 5225.912486] md/raid:md127: read error corrected (8 sectors at 509112208 on sda1)
[ 5225.912488] md/raid:md127: read error corrected (8 sectors at 509112216 on sda1)
[ 5225.912489] md/raid:md127: read error corrected (8 sectors at 509112224 on sda1)
[ 7234.974190] perf: interrupt took too long (2517 > 2500), lowering kernel.perf_event_max_sample_rate to 79400
[ 9732.823248] perf: interrupt took too long (3152 > 3146), lowering kernel.perf_event_max_sample_rate to 63400
[16490.333419] perf: interrupt took too long (3943 > 3940), lowering kernel.perf_event_max_sample_rate to 50700
[24515.160554] audit: type=1130 audit(1567468823.835:34): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=shadow comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[24515.217087] audit: type=1131 audit(1567468823.885:35): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=shadow comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[24515.519762] audit: type=1130 audit(1567468824.195:36): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=logrotate comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[24515.519767] audit: type=1131 audit(1567468824.195:37): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=logrotate comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[24515.970085] audit: type=1130 audit(1567468824.645:38): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=updatedb comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[24515.970098] audit: type=1131 audit(1567468824.645:39): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=updatedb comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[24750.433082] audit: type=1130 audit(1567469059.105:40): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=man-db comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[24750.433086] audit: type=1131 audit(1567469059.105:41): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=man-db comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[24891.316197] raid5_end_read_request: 86 callbacks suppressed
[24891.316200] md/raid:md127: read error corrected (8 sectors at 2883434712 on sda1)
[24891.316204] md/raid:md127: read error corrected (8 sectors at 2883434720 on sda1)
[24891.316205] md/raid:md127: read error corrected (8 sectors at 2883434728 on sda1)
[24891.316207] md/raid:md127: read error corrected (8 sectors at 2883434736 on sda1)
[24891.316208] md/raid:md127: read error corrected (8 sectors at 2883434744 on sda1)
[24891.316210] md/raid:md127: read error corrected (8 sectors at 2883434752 on sda1)
[24891.316211] md/raid:md127: read error corrected (8 sectors at 2883434760 on sda1)
[24891.316213] md/raid:md127: read error corrected (8 sectors at 2883434768 on sda1)
[24891.316214] md/raid:md127: read error corrected (8 sectors at 2883434776 on sda1)
[24891.316216] md/raid:md127: read error corrected (8 sectors at 2883434784 on sda1)
[24974.432792] raid5_end_read_request: 22 callbacks suppressed
[24974.432795] md/raid:md127: read error corrected (8 sectors at 2892979280 on sdd1)
[24974.432799] md/raid:md127: read error corrected (8 sectors at 2892979288 on sdd1)
[24974.432801] md/raid:md127: read error corrected (8 sectors at 2892979296 on sdd1)
[24974.432803] md/raid:md127: read error corrected (8 sectors at 2892979304 on sdd1)
[24974.432805] md/raid:md127: read error corrected (8 sectors at 2892979312 on sdd1)
[24974.432807] md/raid:md127: read error corrected (8 sectors at 2892979320 on sdd1)
[24974.432810] md/raid:md127: read error corrected (8 sectors at 2892979328 on sdd1)
[24974.432812] md/raid:md127: read error corrected (8 sectors at 2892979336 on sdd1)
[24974.432814] md/raid:md127: read error corrected (8 sectors at 2892979344 on sdd1)
[24974.432816] md/raid:md127: read error corrected (8 sectors at 2892979352 on sdd1)

What do you think of that? The `smartctl -a` output for the reported drives is not showing anything unusual, and the drives are new, so it shouldn't be a hardware problem (or at least SMART has not noticed one yet):
[root@sysresccd ~]# smartctl -a /dev/sda
smartctl 7.0 2018-12-30 r4883 [x86_64-linux-4.19.34-1-lts] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68N32N0
Serial Number:    WD-WCC7K0YEYNV9
LU WWN Device Id: 5 0014ee 2663bb359
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Tue Sep  3 11:34:24 2019 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (45360) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 482) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x303d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   210   163   021    Pre-fail  Always       -       4475
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       67
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       195
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       67
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       21
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       52
194 Temperature_Celsius     0x0022   110   105   000    Old_age   Always       -       40
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Can I somehow check whether the files at those reported sectors are still correct?
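As a first step I was thinking of simply re-reading the reported ranges from the component devices to confirm they now come back clean, and letting the drives scan themselves. I'm assuming the offsets in those messages are 512-byte sectors relative to the partition (sda1/sdd1) - that's my reading of the log, not something I've verified - and mapping them back to actual files through the dm/ext4 layers is the part I'm unsure how to do:

  # re-read one of the reported ranges (8 sectors at 9990072 on sda1)
  dd if=/dev/sda1 of=/dev/null bs=512 skip=9990072 count=8 iflag=direct

  # run a long SMART self-test so each drive scans its entire surface
  smartctl -t long /dev/sda
  smartctl -t long /dev/sdd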
Best regards,
Krzysztof Jakobczyk

On Tue, Sep 3, 2019 at 02:18 NeilBrown <neilb@xxxxxxx> wrote:
>
> On Mon, Sep 02 2019, Krzysztof Jakóbczyk wrote:
>
> > Gentlemen,
> >
> > Just in order for me not to mix anything important I will quickly
> > summarize what I'm about to do:
> > I will try to release all the files that are being used on the target
> > md0, by checking what is still being used with "lsof /data" and then
> > will kill the processes that are still trying to use the array.
>
> You won't be able to kill those processes, and there is half a chance
> that the "lsof /data" will hang and be unkillable.
>
> > After the files are being unlocked I will perform the outdated host shutdown.
>
> I would
>   sync &
>   wait a little while
>   reboot -f -n
>
> A Linux system should always survive "reboot -f -n" with little data
> loss, usually none.
>
> > I will boot a thumbstick on that computer with SystemRescueCD and will
> > try to assemble the array with the "mdadm --assemble --scan -v --run"
> > applying --force if necessary.
>
> --force shouldn't be necessary, so if the first version doesn't work,
> check with us first.
>
> > Please confirm me if my understanding is correct.
>
> I'd like some more details: particular "mdadm -E" of one or more
> component drives.  I'm curious what the data offset is.  As you didn't
> need to git a "--backup=...." arg to mdadm, I suspect it is reasonably
> large, which is good.
> Sometimes raid reshape needs an 'mdadm' running to help the kernel, and
> if that mdadm gets killed, the reshape will hang.
> But with a largeish data-offset, no mdadm helper is needed.
>
> The hang was reported 307 seconds after a "read error corrected"
> message.  And by that time it had hung for at least 120 seconds - maybe
> as much as 240.  So there isn't obviously a strong connection, but maybe
> there is a cause/effect there.
>
> Looking at code fixes since 3.16, I can see a couple of live-lock bugs
> fixed, but they were fixed well before 2016-12-30, so probably got back
> ported to the Debian kernel.
>
> So I cannot easily find an explanation.
>
> I suspect that if you just rebooted, the reshape would restart and
> continue happily (unless/until another read error was found).
> Rebooting to a rescue CD is likely to be safer.
> Likely worst case is that it will hang again, and we'll need to look
> more deeply.
>
> In any case, I'd like to see that "mdadm --examine" output.
>
> Thanks,
> NeilBrown