On Mon, 27 Mar 2017 09:42:29 +1100 NeilBrown <neilb@xxxxxxxx> wrote:

> On Fri, Mar 24 2017, pdi wrote:
>
> > On Fri, 24 Mar 2017 16:25:35 +1100
> > NeilBrown <neilb@xxxxxxxx> wrote:
> >
> >> On Thu, Mar 23 2017, pdi wrote:
> >>
> >> > Greetings all,
> >> >
> >> > The problem in a nutshell is that an array is clean after boot,
> >> > until some specific jobs switch it to active, where it remains
> >> > until reboot.
> >> >
> >> > A similar problem was discussed, and solved, in
> >> > https://www.spinics.net/lists/raid/msg46450.html. However,
> >> > AFAICT, it is not the same issue.
> >> >
> >> > I would be grateful for any insights as to why this happens
> >> > and/or how to prevent it.
> >> >
> >> > The relevant info follows; please let me know if anything further
> >> > might help.
> >> >
> >> > Many thanks in advance.
> >> >
> >> > - uname -a
> >> >   Linux hostname 4.4.38 #1 SMP Sun Dec 11 16:03:41 CST 2016
> >> >   x86_64 Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz GenuineIntel
> >> >   GNU/Linux
> >> > - mdadm -V
> >> >   mdadm - v3.3.4 - 3rd August 2015
> >> > - Desktop drives without SCT/ERC,
> >> >   with timeout mismatch correction as per
> >> >   https://raid.wiki.kernel.org/index.php/Timeout_Mismatch
> >> > - /dev/md9 is a RAID10 array, 4 devices, far=2,
> >> >   with various dirs used as Samba and NFS shares
> >> > - The array is in *constant* array_state active
> >> > - mdadm -D /dev/md9 | grep 'State :'
> >> >   State : active
> >> > - cat /sys/block/md9/md/array_state
> >> >   active
> >> > - watch -d 'grep md9 /proc/diskstats'
> >> >   numbers remain unchanged
> >> > - uptime
> >> >   load average: 0.00, 0.00, 0.00
> >> > - cat /sys/block/md9/md/safe_mode_delay
> >> >   0.201
> >> > - echo 0.1 > /sys/block/md9/md/safe_mode_delay
> >> >   array_state remains active
> >> > - echo clean > /sys/block/md9/md/array_state
> >> >   echo: write error: Device or resource busy
> >> > - reboot (with or without prior check)
> >> >   array_state clean
> >> > - After reboot, the array remains clean until some specific
> >> >   jobs put it in constant active state. Such jobs so far
> >> >   identified:
> >> >   - echo check > /sys/block/md9/md/sync_action
> >> >   - run an rsnapshot job
> >> >   - start a qemu/kvm vm
> >> > - Other jobs, like text/doc editing, multimedia playback,
> >> >   etc., keep array_state clean
> >>
> >> This bug was introduced by
> >> Commit: 20d0189b1012 ("block: Introduce new bio_split()")
> >> in 3.14, and fixed by
> >> Commit: 9b622e2bbcf0 ("raid10: increment write counter after bio is
> >> split") in 4.8.
> >>
> >> Maybe the latter patch should be sent to -stable??
> >>
> >> NeilBrown
> >
> > NeilBrown, thank you for your swift and concise answer.
> >
> > I gather you are referring to kernel version numbers. The described
> > behaviour was first noticed many months ago with kernel 2.6.37.6,
> > and persisted after a system upgrade to kernel 4.4.38. However,
> > after the upgrade two things were corrected: the timeout mismatch,
> > and a Current_Pending_Sector in one of the drives, which may, or
> > may not, explain the occurrence with the older kernel.
> >
> > Is this constant active state in the data array something to worry
> > about and try kernel >= 4.8, or shall I let it be?
>
> The only important consequence of the constant active state is that if
> your machine crashes at a moment when the array would otherwise have
> been idle, then a resync will be needed after reboot. Without the
> constant active state, that resync would not have been needed.
>
> If you have a write-intent bitmap, this is not particularly relevant.
>
> I cannot say how important it is to you to avoid a resync after a
> crash, so I don't know if you should just let it be or not.
>
> NeilBrown

NeilBrown,

Thank you for your clear explanation.

Best regards,
pdi
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
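[Editor's note] The affected range NeilBrown identifies (bug introduced in mainline 3.14, fixed in 4.8) can be sketched as a shell check. The helper names `version_ge` and `affected` are hypothetical, and distro kernels may backport the fix, so a pure version-number test is only a heuristic; it uses GNU `sort -V` for version-order comparison.

```shell
# Sketch: is a given kernel version inside the raid10 bug window
# described in the thread?  Introduced in 3.14 (commit 20d0189b1012),
# fixed in 4.8 (commit 9b622e2bbcf0).  Backports are not detected.

version_ge() {
    # true if $1 >= $2 in version order (GNU sort -V)
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

affected() {
    # affected iff 3.14 <= version < 4.8
    version_ge "$1" 3.14 && ! version_ge "$1" 4.8
}

for v in 2.6.37.6 3.14 4.4.38 4.8; do
    if affected "$v"; then
        echo "$v: in affected range [3.14, 4.8)"
    else
        echo "$v: outside affected range"
    fi
done
```

This places the reporter's 4.4.38 inside the window but 2.6.37.6 outside it, consistent with Neil's commits and with the suspicion that the older-kernel observation had a different cause. Neil's mitigation, a write-intent bitmap, can be added to a live array with `mdadm --grow --bitmap=internal /dev/md9`.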