On 24 Feb 2013 23:53:00 +0100 neilb@xxxxxxx wrote:

> On 21 Feb 2013 16:44:41 +0100 Bodo Stroesser <bstroesser@xxxxxxxxxxxxxx>
> wrote:
> 
> > On 21 Feb 2013 00:55:00 +0100 neilb@xxxxxxx wrote:
> > 
> > > On 20 Feb 2013 14:57:07 +0100 bstroesser@xxxxxxxxxxxxxx wrote:
> > > 
> > > > On 20 Feb 2013 04:09:00 +0100 neilb@xxxxxxx wrote:
> > > > 
> > > > snip
> > > > 
> > > > > Maybe:
> > > > > 
> > > > > 	switch (cache_make_upcall(detail, h)) {
> > > > > 	case -EINVAL:
> > > > > 		if (rv) {
> > > > > 			set_bit(CACHE_NEGATIVE, &h->flags);
> > > > > 			cache_fresh_locked(h, get_seconds() + CACHE_NEW_EXPIRY);
> > > > > 			rv = -ENOENT;
> > > > > 		}
> > > > > 		/* FALLTHROUGH */
> > > > > 	case -EAGAIN:
> > > > > 		cache_fresh_unlocked(h, detail);
> > > > > 	}
> > > > 
> > > > I agree, your patch is obviously better than mine.
> > > > But let me suggest one little change: I would like to substitute
> > > > cache_fresh_unlocked() with clear_bit() and cache_revisit_request(),
> > > > as the call to cache_dequeue() in cache_fresh_unlocked() seems to
> > > > be obsolete here:
> > > 
> > > It is exactly this sort of thinking (on my part) that got us into this
> > > mess in the first place. I reasoned that the full locking/testing/
> > > whatever wasn't necessary and took a short cut. It wasn't a good idea.
> > 
> > Maybe I'm totally wrong, but AFAICS, calling cache_dequeue() here in
> > extreme situations (two threads calling cache_check() in parallel while
> > a first reader is about to open cache access) could again cause a
> > race (?)
> 
> Can you explain the race you see?

O.k., let me try ...

Suppose there are two threads calling cache_check() concurrently and a
user process that is about to become a reader for the involved cache.

The first thread calls cache_is_valid() and gets -EAGAIN. Next, it sets
CACHE_PENDING and calls cache_make_upcall(), which returns -EINVAL, as
there is no reader yet.

Meanwhile, the second thread also calls cache_is_valid() and also gets
-EAGAIN, but now it is interrupted and delayed for a while.

The first thread continues its work, negates the entry and calls
cache_fresh_locked() followed by cache_fresh_unlocked(). In
cache_fresh_unlocked() it clears CACHE_PENDING and calls
cache_revisit_request(). At this point it is interrupted and delayed.

The user process is scheduled and opens the channel to become a reader.

The second thread wakes up again and sets CACHE_PENDING. Next it calls
cache_make_upcall(), which results in a request being queued, as there
is a reader now.

The first thread wakes up and continues its work by calling
cache_dequeue(). Thus the request may be dropped before the reader
could process it.
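As a step-by-step sketch (the same sequence as described above; the
exact scheduling points are hypothetical, of course):

	 1. thread 1: cache_is_valid() -> -EAGAIN
	 2. thread 1: test_and_set_bit(CACHE_PENDING),
	              cache_make_upcall() -> -EINVAL (no reader yet)
	 3. thread 2: cache_is_valid() -> -EAGAIN, then is delayed
	 4. thread 1: set_bit(CACHE_NEGATIVE), cache_fresh_locked()
	 5. thread 1: in cache_fresh_unlocked(): clear_bit(CACHE_PENDING),
	              cache_revisit_request(), then is delayed before
	              reaching cache_dequeue()
	 6. user process: opens the channel and becomes a reader
	 7. thread 2: wakes up, test_and_set_bit(CACHE_PENDING) succeeds
	 8. thread 2: cache_make_upcall() -> a request is queued (there
	              is a reader now)
	 9. thread 1: wakes up, cache_dequeue() -> the request queued in
	              step 8 may be dropped unprocessed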
Do you agree or did I miss something?

> > BTW: if there is a reader for a cache, is there any protection against
> > many upcalls being queued for the same cache entry?
> 
> The CACHE_PENDING flag should provide that protection. We only queue an
> upcall after "test_and_set", and always dequeue after "clear_bit".

Sorry, my question wasn't very clear. CACHE_PENDING *does* provide the
protection against parallel upcalls. OTOH, as cache_is_valid() is called
before test_and_set_bit(CACHE_PENDING) and the upcall is then made
without calling cache_is_valid() again, I see a small chance of a second
request being queued unnecessarily just after the first request was
answered.

Bodo

> NeilBrown
> 
> > Bodo
> 
> > > Given that this is obviously difficult code to get right, we should
> > > make it as easy to review as possible. Having "cache_fresh_unlocked"
> > > makes it more obviously correct, and that is a good thing.
> > > Maybe cache_dequeue isn't needed here, but it won't hurt, so I'd much
> > > rather have the clearer code.
> > > In fact, I'd also like to change
> > > 
> > > 	if (test_and_clear_bit(CACHE_PENDING, &ch->flags))
> > > 		cache_dequeue(current_detail, ch);
> > > 	cache_revisit_request(ch);
> > > 
> > > near the end of cache_clean to call cache_fresh_unlocked().
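P.S.: Regarding that change near the end of cache_clean(): I guess it
would look roughly like this (a sketch only; assuming ch and
current_detail are still in scope at that point, as in the snippet
quoted above):

	/* end of cache_clean() as it is now: */
	if (test_and_clear_bit(CACHE_PENDING, &ch->flags))
		cache_dequeue(current_detail, ch);
	cache_revisit_request(ch);

	/* with the suggested change: */
	cache_fresh_unlocked(ch, current_detail);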