Hi,

On 2019/6/28 18:57, Mathias G wrote:
> Hi Song
>
> On 21.06.19 21:51, Mathias G wrote:
>>> Question: are you running the two drives with write cache on?
>>> If yes, and if your application is not heavy on writes, could you try
>>> turn off HDD write cache and see if the issue repros?
>> Thanks for this. I just disabled the write cache with hdparm in rc.local
>> for both RAID members and will let you know if the problem occurs again.
>
> Today the problem occurred again:
>
> kern.log
>> Jun 28 12:39:11 $hostname kernel: [    2.098096] md/raid1:md0: not clean -- starting background reconstruction
>> Jun 28 12:39:11 $hostname kernel: [    2.098099] md/raid1:md0: active with 2 out of 2 mirrors

"not clean" means the resync has not been completed yet. Was the array
also "not clean" in the previous boot/reboot, or not?

If it is "not clean" only in the boot where the problem reproduces, that
may mean the final update of the MD superblock was lost, which would
make it possible for the events counter in the bitmap superblock to lag
behind.

If the array was also "not clean" in the previous boot/reboot, could you
please check when the status of the array changes from "clean" to "not
clean"? (Some commands for checking this are sketched in the P.S. below.)

Is the RAID array (md0) used as the rootfs or for some other filesystem?
And how do you reproduce the problem? Just by rebooting continuously
until it reoccurs, rather than by a sudden power cut?

Regards,
Tao

>> Jun 28 12:39:11 $hostname kernel: [    2.098201] md0: bitmap file is out of date (236662 < 236663) -- forcing full recovery
>> Jun 28 12:39:11 $hostname kernel: [    2.098252] md0: bitmap file is out of date, doing full recovery
>
> And the write cache is disabled for both RAID members:
>> # hdparm -i /dev/sdb |grep WriteCache
>>  AdvancedPM=yes: disabled (255) WriteCache=disabled
>
>> # hdparm -i /dev/sdc |grep WriteCache
>>  AdvancedPM=no WriteCache=disabled
>
> I'm a little at a loss..
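
P.S. In case it helps, a rough sketch of how the array state and the
events counters could be checked (assuming an internal bitmap, and the
device names md0/sdb from your logs; adjust as needed):

    # current array state ("clean", "active", ...)
    cat /sys/block/md0/md/array_state

    # log the state with timestamps, to see when it leaves "clean"
    while true; do
        echo "$(date '+%F %T') $(cat /sys/block/md0/md/array_state)"
        sleep 1
    done

    # events counter in the MD superblock of a member device
    mdadm --examine /dev/sdb | grep Events

    # events counter in the bitmap superblock of the same member
    mdadm --examine-bitmap /dev/sdb | grep Events

If the bitmap "Events" falls behind the superblock "Events" just before
a reboot, the "bitmap file is out of date" message on the next boot
would be the expected result.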