David Howells <dhowells@xxxxxxxxxx> writes: > Hi Luis, Hi David, (thanks for the feedback and sorry for the delay replying -- in the meantime I had problems with my test environment that I had to fix first.) >> I've been seeing fscache complaining about duplicate cookies in 9p: >> >> FS-Cache: Duplicate cookie detected >> FS-Cache: O-cookie c=00000000ba929e80 [p=000000002e706df1 fl=226 nc=0 na=1] > > This cookie is marked acquired (fl=2xx), but not relinquished (fl=4xx), so it > would still seem to be active:-/. Pretty much one of the first things > __fscache_relinquish_cookie() does is to set that bit (should be bit 10). > > One thing that might be useful is if you can turn on a couple of fscache > tracepoints: > > echo 1 >/sys/kernel/debug/tracing/events/fscache/fscache_acquire/enable > echo 1 >/sys/kernel/debug/tracing/events/fscache/fscache_relinquish/enable > > The cookie pointers in the duplicate cookie report should match the entries in > the trace output. Awesome, thanks for the suggestion. So, here's what I did: I enabled the 2 tracepoints you suggested *and* I've also enabled the 'fscache_cookie' tracepoint so that I would see the collision in the trace. I then started my test and after I see the duplicated cookie warning I stopped it. And here's what I got in the logs: [ 29.702428] FS-Cache: Duplicate cookie detected [ 29.703859] FS-Cache: O-cookie c=0000000046a8b5b3 [p=00000000b2d5d5a9 fl=216 nc=0 na=0] [ 29.705922] FS-Cache: O-cookie d=0000000000000000 n=0000000000000000 [ 29.706996] FS-Cache: O-key=[8] '0312710100000000' [ 29.707597] FS-Cache: N-cookie c=000000002b6a2db1 [p=00000000b2d5d5a9 fl=2 nc=0 na=1] [ 29.708712] FS-Cache: N-cookie d=000000006a7abbdd n=00000000188a0c4c [ 29.709568] FS-Cache: N-key=[8] '0312710100000000' and in the trace ('=>' identifies the old cookie): => <...>-2712 [000] ...1 29.551257: fscache_acquire: c=0000000046a8b5b3 p=00000000b2d5d5a9 pu=50 pc=49 pf=22 n=9p.inod <...>-2820 [000] ...1 29.707083: fscache_relinquish: c=000000003ca97471 u=1 p=00000000b2d5d5a9 Nc=0 Na=1 f=26 r=0 <...>-2820 [000] ...1 29.707087: fscache_cookie: PUT rlq c=000000003ca97471 u=0 p=00000000b2d5d5a9 Nc=0 Na=0 f=16 <...>-2820 [000] ...1 29.707090: fscache_cookie: PUT prn c=00000000b2d5d5a9 u=49 p=000000005cf9f76e Nc=48 Na=1 f=22 <...>-2821 [002] ...2 29.709010: fscache_cookie: GET prn c=00000000b2d5d5a9 u=50 p=000000005cf9f76e Nc=48 Na=1 f=22 <...>-2821 [002] ...1 29.709011: fscache_acquire: c=00000000ca586ae6 p=00000000b2d5d5a9 pu=50 pc=49 pf=22 n=9p.inod <...>-2847 [002] ...1 29.750224: fscache_relinquish: c=00000000ca586ae6 u=1 p=00000000b2d5d5a9 Nc=0 Na=1 f=26 r=0 <...>-2847 [002] ...1 29.750228: fscache_cookie: PUT rlq c=00000000ca586ae6 u=0 p=00000000b2d5d5a9 Nc=0 Na=0 f=16 <...>-2847 [002] ...1 29.750230: fscache_cookie: PUT prn c=00000000b2d5d5a9 u=49 p=000000005cf9f76e Nc=48 Na=1 f=22 <...>-2851 [003] ...2 29.751561: fscache_cookie: GET prn c=00000000b2d5d5a9 u=50 p=000000005cf9f76e Nc=48 Na=1 f=22 <...>-2851 [003] ...1 29.751562: fscache_acquire: c=0000000019a5a5e8 p=00000000b2d5d5a9 pu=50 pc=49 pf=22 n=9p.inod <...>-2863 [001] ...1 29.771136: fscache_relinquish: c=0000000019a5a5e8 u=1 p=00000000b2d5d5a9 Nc=0 Na=1 f=26 r=0 <...>-2863 [001] ...1 29.771140: fscache_cookie: PUT rlq c=0000000019a5a5e8 u=0 p=00000000b2d5d5a9 Nc=0 Na=0 f=16 <...>-2863 [001] ...1 29.771143: fscache_cookie: PUT prn c=00000000b2d5d5a9 u=49 p=000000005cf9f76e Nc=48 Na=1 f=22 <...>-2865 [002] ...2 29.773439: fscache_cookie: GET prn c=00000000b2d5d5a9 u=50 p=000000005cf9f76e Nc=48 Na=1 f=22 <...>-2865 [002] ...1 29.773440: fscache_acquire: c=00000000ca586ae6 p=00000000b2d5d5a9 pu=50 pc=49 pf=22 n=9p.inod => <...>-2869 [003] ...2 29.779784: fscache_cookie: *COLLISION* c=0000000046a8b5b3 u=1 p=00000000b2d5d5a9 Nc=0 Na=1 f=26 <...>-2876 [002] ...1 29.779966: fscache_relinquish: c=0000000046a8b5b3 u=1 p=00000000b2d5d5a9 Nc=0 Na=1 f=26 r=0 <...>-2876 [002] .N.1 29.779969: fscache_cookie: PUT rlq c=0000000046a8b5b3 u=0 p=00000000b2d5d5a9 Nc=0 Na=0 f=16 <...>-2869 [003] .N.1 29.787568: fscache_cookie: DISCARD c=000000002b6a2db1 u=1 p=00000000b2d5d5a9 Nc=0 Na=1 f=02 [ I wonder why the timestamps don't match between the traces and the kernel log... ] So, can we infer from this trace that an evict could actually be on-going but the old cookie wasn't relinquished yet and hence the collision? > Note that my fscache-iter branch[1] improves the situation where the disk I/O > required to effect the destruction of a cache object delays the completion of > relinquishment by inserting waits, but that oughtn't to help here. Right, I haven't looked at it yet (I'll try to) but that could make things even worse, right? Cheers, -- Luis