On Thu, Dec 1, 2016 at 5:35 PM, NeilBrown <neilb@xxxxxxxx> wrote: > On Fri, Dec 02 2016, Marc Smith wrote: > >> On Wed, Nov 30, 2016 at 9:52 PM, NeilBrown <neilb@xxxxxxxx> wrote: >>> On Mon, Nov 28 2016, Marc Smith wrote: >>> >>>> >>>> # find /sys/block/md127/md >>>> /sys/block/md127/md >>>> /sys/block/md127/md/reshape_position >>>> /sys/block/md127/md/layout >>>> /sys/block/md127/md/raid_disks >>>> /sys/block/md127/md/bitmap >>>> /sys/block/md127/md/bitmap/chunksize >>> >>> This tells me that: >>> sysfs_remove_group(&mddev->kobj, &md_bitmap_group); >>> hasn't been run, so mddev_delayed_delete() hasn't run. >>> That suggests the final mddev_put() hsn't run. i.e. mddev->active is > 0 >>> >>> Everything else suggests that array has been stopped and cleaned and >>> should be gone... >>> >>> This seems to suggest that there is an unbalanced mddev_get() without a >>> matching mddev_put(). I cannot find it though. >>> >>> If I could reproduce it, I would try to see what is happening by: >>> >>> - putting >>> printk("mddev->active = %d\n", atomic_read(&mddev->active)); >>> in the top of mddev_put(). That shouldn't be *too* noisy. >>> >>> - putting >>> printk("rd=%d empty=%d ctime=%d hold=%d\n", mddev->raid_disks, >>> list_empty(&mddev->disks), mddev->ctime, mddev->hold_active); >>> >>> in mddev_put() just before those values are tested. >>> >>> - putting >>> printk("queue_work\n"); >>> just before the 'queue_work()' call in mddev_put. >>> >>> - putting >>> printk("mddev_delayed_delete\n"); >>> in mddev_delayed_delete() >>> >>> Then see what gets printed when you stop the array. >> >> I made those modifications to md.c and here is the kernel log when stopping: >> >> --snip-- >> [ 3937.233487] mddev->active = 2 >> [ 3937.233503] mddev->active = 2 >> [ 3937.233509] mddev->active = 2 >> [ 3937.233516] mddev->active = 1 >> [ 3937.233516] rd=2 empty=0 ctime=1480617270 hold=0 > > At this point, mdadm has opened the /dev/md127 device, accessed a few > attributes via sysfs just to check on the status, and then closed it > again. > The array is still active, but we know that no other process has it > open. > > >> [ 3937.233679] udevd[492]: inotify event: 8 for /dev/md127 >> [ 3937.241489] md127: detected capacity change from 73340747776 to 0 >> [ 3937.241493] md: md127 stopped. > > Now mdadm has opened the array again and issued the STOP_ARRAY ioctl. > Still nothing else has the array open. > >> [ 3937.241665] udevd[492]: device /dev/md127 closed, synthesising 'change' >> [ 3937.241726] udevd[492]: seq 3631 queued, 'change' 'block' >> [ 3937.241829] udevd[492]: seq 3631 forked new worker [4991] >> [ 3937.241989] udevd[4991]: seq 3631 running >> [ 3937.242002] dlm: dc18e34c-b136-1964-1c34-4509a7c60a19: leaving the >> lockspace group... >> [ 3937.242039] udevd[4991]: removing watch on '/dev/md127' >> [ 3937.242068] mddev->active = 3 > > But somehow the ->active count got up to 3. > mdadm probably still has it open, but two other things do too. > If you have "mdadm --monitor" running in the background (which is good) > it will temporarily increase, then decrease the count. > udevd opens the device temporarily too. > So this isn't necessarily a problem. > >> [ 3937.242069] udevd[492]: seq 3632 queued, 'offline' 'dlm' >> [ 3937.242080] mddev->active = 3 >> [ 3937.242104] udevd[4991]: IMPORT 'probe-bcache -o udev /dev/md127' >> /usr/lib/udev/rules.d/69-bcache.rules:16 >> [ 3937.242161] udevd[492]: seq 3632 forked new worker [4992] >> [ 3937.242259] udevd[4993]: starting 'probe-bcache -o udev /dev/md127' >> [ 3937.242753] dlm: dc18e34c-b136-1964-1c34-4509a7c60a19: group event done 0 0 >> [ 3937.242847] dlm: dc18e34c-b136-1964-1c34-4509a7c60a19: >> release_lockspace final free >> [ 3937.242861] md: unbind<dm-1> >> [ 3937.256606] md: export_rdev(dm-1) >> [ 3937.256612] md: unbind<dm-0> >> [ 3937.263601] md: export_rdev(dm-0) >> [ 3937.263688] mddev->active = 4 >> [ 3937.263751] mddev->active = 3 > > But here, the active count only drops down to 2. (it is decremented > after it is printed). Assuming there really were no more messages like > this, there are two active references to the md device, and we don't > know what they are. > >> >> I didn't use my modified mdadm which stops the synthesized CHANGE from >> occurring, but if needed, I can re-run the test using that. > > It would be good to use the modified mdadm, if only to reduce the > noise. It won't change the end result, but might make it easier to see > what is happening. > Also please add > WARN_ON(1); > > in the start of mddev_get() and mddev_put(). > That will provide a stack trace whenever either of these are called, so > we can see who takes a references, and who doesn't release it. Okay, I added that to both functions, and now I can't get stopping the array to misbehave (eg, not generate the REMOVE event). I've been trying all morning! I literally just added the WARN_ON(1) to those two functions, and that's all I changed. I compiled and reinstalled image, no other changes. I've tried quite a few times now to reproduce this, and I'm failing to do so -- every time the REMOVE event is generated and everything is removed correctly. I'm going to switch back to the previous image and confirm its reproducible with that. --Marc > > Thanks, > NeilBrown > -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html