Re: Overlong wait for old active object to go away

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 2 Sep 2011, David Howells wrote:

> Mark Hills <mark@xxxxxxxxxxx> wrote:
> 
> > In dmesg "Overlong wait for old active object to go away" (see full 
> > message below), and an application will hang or lock, eg. blocking on 
> > read().
> > 
> > Sometimes it unblocks, sometimes it seems to hang indefinitely. Is there a 
> > higher level explanation for what is going on here?
> 
> I suspect that the OBJECT_RECYCLING state doesn't complete the 
> transition to the OBJECT_DYING state since we saw that we had to wait 
> for outstanding operations to complete at the time OBJECT_DYING was 
> processed.  I think there's probably a race in waking up the object 
> state machine when the last operation completes.
> 
> It would be useful you could try is grepping in /proc/fs/fscache/objects 
> for the objects named in the error text.  The object debug ID is 
> right-aligned in the first column without the "OBJ" attached.

Below is the relevant entries from the object list, during the hang. The 
RCYC entry also correlates with the hang, and the LOOK entry.

I also took the corresponding stats (during the hang) and the dmesg around 
the error message, which I took with:

  # echo $((1|2|4|8|16|32|512|1024|2048)) > /sys/module/fscache/parameters/debug

If I find some time then perhaps I will try and understand the meaning of 
some of this information! But for now I will post it here in the hope that 
it is useful -- perhaps there is a clear indication of a problem, 
somewhere.

Thanks

-- 
Mark


$ uname -a
Linux joker 2.6.39.4-mh #34 SMP Mon Sep 5 13:40:21 BST 2011 x86_64 x86_64 x86_64 GNU/Linux

$ cat /proc/fs/fscache/objects | grep -v ACTV
OBJECT   PARENT   STAT CHLDN OPS OOP IPR EX READS EM EV F S | NETFS_COOKIE_DEF TY FL NETFS_DATA       OBJECT_KEY, AUX_DATA
======== ======== ==== ===== === === === == ===== == == = = | ================ == == ================ ================
     63a        3 RCYC     0   0   0   0  0     0  3 20 0 2 | <no_cookie>
     63b        3 LOOK     0   0   0   0  0     0 7b  0 0 2 | NFS.fh           DT  7 ffff880226a60420 010006015cd1795b8c9f417c8beaffe8b6b51c538c33cc0a424c3793, eef90a4e00000000dae96d3800000000eef90a4e00000000dae96d380000000069000000000000000000000000000000

$ cat /proc/fs/fscache/stats 
FS-Cache statistics
Cookies: idx=8 dat=1606 spc=0
Objects: alc=1595 nal=0 avl=1593 ded=881
ChkAux : non=0 ok=620 upd=0 obs=0
Pages  : mrk=15345 unc=3064
Acquire: n=1614 nul=0 noc=0 ok=1614 nbf=0 oom=0
Lookups: n=1603 neg=981 pos=613 crt=981 tmo=8
Updates: n=0 nul=0 run=0
Relinqs: n=884 nul=0 wcr=1 rtr=870
AttrChg: n=0 ok=0 nbf=0 oom=0 run=0
Allocs : n=0 ok=0 wt=0 nbf=0 int=0
Allocs : ops=0 owt=0 abt=0
Retrvls: n=1003 ok=617 wt=865 nod=350 nbf=35 int=0 oom=0
Retrvls: ops=967 owt=287 abt=0
Stores : n=8112 ok=8112 agn=0 nbf=0 oom=0
Stores : ops=385 run=8497 pgs=8112 rxd=8112 olm=0
VmScan : nos=1709 gon=0 bsy=0 can=0
Ops    : pend=287 run=1352 enq=16432 can=0 rej=0
Ops    : dfr=13 rel=1352 gc=13
CacheOp: alo=0 luo=1 luc=0 gro=0
CacheOp: upo=0 dro=1 pto=0 atc=0 syn=0
CacheOp: rap=0 ras=0 alp=0 als=0 wrp=0 ucp=0 dsp=0

$ dmesg
<cut>
[ 2383.150006] [opera ] ==> __fscache_relinquish_cookie(ffff88021ad44f00{NFS.fh,ffff880226a60420},1)
[ 2383.150012] [opera ] RELEASE OBJ63a
[ 2383.150016] [opera ] ==> fscache_enqueue_object({OBJ63a})
[ 2383.150032] [kworke] ==> fscache_object_work_func({OBJ63a})
[ 2383.150036] [kworke] ==> fscache_object_state_machine({OBJ63a,OBJECT_ACTIVE,24})
[ 2383.150041] [opera ] ==> __fscache_cookie_put(ffff88021ad44f00)
[ 2383.150045] [opera ] FREE COOKIE ffff88021ad44f00
[ 2383.150049] [opera ] <== __fscache_cookie_put()
[ 2383.150052] [opera ] <== __fscache_relinquish_cookie()
[ 2383.150057] [opera ] ==> __fscache_acquire_cookie({NFS.server},{NFS.fh},ffff880226a60420)
[ 2383.150061] [opera ] ==> fscache_acquire_non_index_cookie()
[ 2383.150065] [opera ] ==> fscache_select_cache_for_object()
[ 2383.150069] [opera ] <== fscache_select_cache_for_object() = ffff880229124c00 [parent]
[ 2383.150073] [opera ] cache mycache
[ 2383.150077] [opera ] ==> fscache_alloc_object(ffff880229124c00,ffff88021ad44f00{NFS.fh})
[ 2383.150082] [kworke] <== fscache_object_state_machine() [->OBJECT_DYING]
[ 2383.150086] [opera ] ALLOC OBJ63b: NFS.fh {0}
[ 2383.150091] [opera ] ==> fscache_alloc_object(ffff880229124c00,ffff88022910d000{NFS.server})
[ 2383.150105] [opera ] <== fscache_alloc_object() = 0 [found]
[ 2383.150109] [opera ] ==> fscache_attach_object({NFS.fh},{OBJ63b})
[ 2383.150113] [kworke] ==> fscache_enqueue_object({OBJ63a})
[ 2383.150117] [opera ] <== fscache_attach_object() = 0
[ 2383.150120] [opera ] <== fscache_alloc_object() = 0
[ 2383.150123] [opera ] ==> fscache_enqueue_object({OBJ63b})
[ 2383.150132] [opera ] <== fscache_acquire_non_index_cookie() = 0 [deferred]
[ 2383.150136] [kworke] ==> fscache_object_work_func({OBJ63b})
[ 2383.150140] [kworke] ==> fscache_object_state_machine({OBJ63b,OBJECT_INIT,0})
[ 2383.150144] [kworke] ==> fscache_initialise_object()
[ 2383.150149] [kworke] parent OBJECT_ACTIVE
[ 2383.150152] [opera ] <== __fscache_acquire_cookie() = ffff88021ad44f00
[ 2383.150156] [kworke] go
[ 2383.150159] [kworke] <== fscache_initialise_object()
[ 2383.150163] [kworke] <== fscache_object_state_machine() [->OBJECT_LOOKING_UP]
[ 2383.150166] [kworke] ==> fscache_enqueue_object({OBJ63b})
[ 2383.150205] [kworke] ==> fscache_object_work_func({OBJ63a})
[ 2383.150209] [kworke] ==> fscache_object_work_func({OBJ63b})
[ 2383.150213] [kworke] ==> fscache_object_state_machine({OBJ63b,OBJECT_LOOKING_UP,0})
[ 2383.150217] [kworke] ==> fscache_lookup_object()
[ 2383.150221] [kworke] LOOKUP "NFS.server/NFS.fh" in "mycache"
[ 2383.150227] [kworke] ==> fscache_object_state_machine({OBJ63a,OBJECT_DYING,24})
[ 2383.150233] [kworke] dying OBJ63a {0,0}
[ 2383.150238] [kworke] ==> fscache_enqueue_dependents({OBJ63a})
[ 2383.150243] [kworke] woke 0 ops on OBJ63a
[ 2383.150248] [kworke] <== fscache_object_state_machine() [->OBJECT_RECYCLING]
[ 2383.150253] [kworke] ==> fscache_enqueue_object({OBJ63a})
[ 2383.150261] [kworke] ==> fscache_object_work_func({OBJ63a})
[ 2383.150266] [kworke] ==> fscache_object_state_machine({OBJ63a,OBJECT_RECYCLING,20})
[ 2383.150271] [kworke] ==> fscache_release_object()
[ 2383.150276] [kworke] ==> fscache_drop_object({OBJ63a,0})
[ 2388.575662] [cat   ] ==> fscache_fsdef_netfs_get_aux({nfs.0},)
[ 2391.559612] [cat   ] ==> fscache_fsdef_netfs_get_key({nfs.0},)
[ 2443.150062] 
[ 2443.150066] CacheFiles: Error: Overlong wait for old active object to go away
[ 2443.150071] object: OBJ63b
[ 2443.150075] objstate=OBJECT_LOOKING_UP fl=0 wbusy=2 ev=0[7b]
[ 2443.150078] ops=0 inp=0 exc=0
[ 2443.150080] parent=ffff8802213f0480
[ 2443.150084] cookie=ffff88021ad44f00 [pr=ffff88022910d000 nd=ffff880226a60420 fl=7]
[ 2443.150087] key=[28] '010006015cd1795b8c9f417c8beaffe8b6b51c538c33cc0a424c3793'
[ 2443.150102] xobject: OBJ63a
[ 2443.150106] xobjstate=OBJECT_RECYCLING fl=0 wbusy=2 ev=20[3]
[ 2443.150108] xops=0 inp=0 exc=0
[ 2443.150110] xparent=ffff8802213f0480
[ 2443.150112] xcookie=NULL
[ 2443.150122] [kworke] <== fscache_lookup_object()
[ 2443.150125] [kworke] <== fscache_object_state_machine() [->OBJECT_LOOKING_UP]
[ 2443.150129] [kworke] ==> fscache_enqueue_object({OBJ63b})
[ 2443.150134] [kworke] ==> fscache_object_work_func({OBJ63b})
[ 2443.150138] [kworke] ==> fscache_object_state_machine({OBJ63b,OBJECT_LOOKING_UP,0})
[ 2443.150141] [kworke] ==> fscache_lookup_object()
[ 2443.150145] [kworke] LOOKUP "NFS.server/NFS.fh" in "mycache"
[ 2443.150215] [kworke] ==> fscache_object_lookup_negative({OBJ63b,OBJECT_LOOKING_UP})
[ 2443.150221] [kworke] release parent OBJ3 {616}
[ 2443.150226] [kworke] wake up lookup ffff88021ad44f40
[ 2443.150229] [kworke] <== fscache_drop_object()
[ 2443.150234] [kworke] <== fscache_object_state_machine() [->OBJECT_DEAD]
[ 2443.150240] [kworke] <== fscache_object_lookup_negative()
[ 2443.150250] [opera ] ==> fscache_submit_op({OBJ63b OP24f9},{1})
[ 2443.150369] [kworke] ==> fscache_obtained_object({OBJ63b,OBJECT_CREATING})
[ 2443.150372] [kworke] <== fscache_obtained_object()
[ 2443.150396] [kworke] <== fscache_lookup_object()
[ 2443.150399] [kworke] <== fscache_object_state_machine() [->OBJECT_AVAILABLE]
[ 2443.150402] [kworke] ==> fscache_enqueue_object({OBJ63b})
[ 2443.150410] [kworke] ==> fscache_object_work_func({OBJ63b})
[ 2443.150413] [kworke] ==> fscache_object_state_machine({OBJ63b,OBJECT_AVAILABLE,0})
[ 2443.150416] [kworke] ==> fscache_object_available({OBJ63b})
[ 2443.150420] [kworke] ==> fscache_done_parent_op(OBJ63b {OBJ3,1})
[ 2443.150430] [kworke] ==> fscache_put_operation({OBJ63b OP24f9,2})
[ 2443.150433] [kworke] woke 1 ops on OBJ63b
[ 2443.150437] [kworke] ==> fscache_enqueue_dependents({OBJ63b})
[ 2443.150441] [opera ] ==> fscache_put_operation({OBJ63b OP24f9,1})
[ 2443.150445] [opera ] PUT OP
[ 2443.150449] [opera ] woke 0 ops on OBJ63b
[ 2443.150452] [opera ] <== fscache_put_operation() [done]
[ 2443.150515] [kworke] <== fscache_object_available()
[ 2443.150518] [kworke] <== fscache_object_state_machine() [->OBJECT_ACTIVE]
[ 2443.150934] [kworke] ==> fscache_submit_op({OBJ63b OP24fb},{1})
[ 2443.150938] [kworke] ==> fscache_enqueue_operation({OBJ63b OP24fb,1})
[ 2443.150941] [kworke] queue async
[ 2443.150948] [kworke] ==> fscache_put_operation({OBJ63b OP24fb,2})
[ 2443.150953] [kworke] ==> fscache_op_work_func({OBJ63b OP24fb,1})
[ 2443.150982] [kworke] ==> fscache_enqueue_operation({OBJ63b OP24fb,1})
[ 2443.150985] [kworke] queue async
[ 2443.150994] [kworke] ==> fscache_put_operation({OBJ63b OP24fb,2})
[ 2443.150997] [kworke] <== fscache_op_work_func()
[ 2443.151001] [kworke] ==> fscache_op_work_func({OBJ63b OP24fb,1})
[ 2443.151014] [kworke] ==> fscache_put_operation({OBJ63b OP24fb,1})
[ 2443.151018] [kworke] PUT OP
[ 2443.151021] [kworke] woke 0 ops on OBJ63b
[ 2443.151025] [kworke] <== fscache_put_operation() [done]
[ 2443.151028] [kworke] <== fscache_op_work_func()

--
Linux-cachefs mailing list
Linux-cachefs@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cachefs


[Index of Archives]     [LARTC]     [Bugtraq]     [Yosemite Forum]
  Powered by Linux