On Thu, Dec 08 2016, Jinpu Wang wrote:

This number:

>   nr_pending = {
>     counter = 1
>   },

and this number:

>   nr_pending = {
>     counter = 856
>   },

might be interesting. There are 855 requests on the list. Adding the one
that is currently being retried gives 856, which is nr_pending for the
device that failed.

But nr_pending on the device that didn't fail is 1. I would expect zero.
When a read or write request succeeds, rdev_dec_pending() is called
immediately, so this should quickly go to zero.

It seems as though there must be a request to the loop device that is
stuck somewhere between the atomic_inc(&rdev->nr_pending) (possibly
inside read_balance) and the call to generic_make_request(). I cannot
yet see how that would happen.

Can you check if this is a repeatable observation? Is nr_pending.counter
always '1' on the loop device?

Thanks,
NeilBrown
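
The counting argument above can be illustrated with a small user-space model. This is only a sketch, not the md code: struct rdev_model, model_inc_pending(), model_dec_pending() and simulate() are hypothetical names invented for the illustration, and C11 atomics stand in for the kernel's atomic_t. It assumes each request increments the counter once when it is assigned to a device and decrements it once when it completes.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the per-device state; only the counter is modelled. */
struct rdev_model {
    atomic_int nr_pending;
};

/* Models the atomic_inc(&rdev->nr_pending) done when a request is assigned. */
static void model_inc_pending(struct rdev_model *rdev)
{
    atomic_fetch_add(&rdev->nr_pending, 1);
}

/* Models the decrement that rdev_dec_pending() performs on completion. */
static void model_dec_pending(struct rdev_model *rdev)
{
    atomic_fetch_sub(&rdev->nr_pending, 1);
}

/*
 * Assign `total` requests to one device, complete `completed` of them,
 * and return the value left in nr_pending.
 */
static int simulate(int total, int completed)
{
    struct rdev_model rdev;

    assert(completed >= 0 && completed <= total);
    atomic_init(&rdev.nr_pending, 0);

    for (int i = 0; i < total; i++)
        model_inc_pending(&rdev);
    for (int i = 0; i < completed; i++)
        model_dec_pending(&rdev);

    return atomic_load(&rdev.nr_pending);
}
```

Under this model, simulate(856, 0) returns 856, matching the failed device (855 queued requests plus the one being retried, none completed). A device on which every request completes returns 0 whatever the total, and a single request that was counted but never completed leaves exactly 1, as in simulate(1000, 999), where 1000 is an arbitrary illustrative total.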