On 25 Feb 2013 20:55:50 +0100 Bodo Stroesser <bstroesser@xxxxxxxxxxxxxx> wrote:

> On 24 Feb 2013 23:53:00 +0100 neilb@xxxxxxx wrote:
> 
> > On 21 Feb 2013 16:44:41 +0100 Bodo Stroesser <bstroesser@xxxxxxxxxxxxxx>
> > wrote:
> > 
> > > On 21 Feb 2013 00:55:00 +0100 neilb@xxxxxxx wrote:
> > > 
> > > > On 20 Feb 2013 14:57:07 +0100 bstroesser@xxxxxxxxxxxxxx wrote:
> > > > 
> > > > > On 20 Feb 2013 04:09:00 +0100 neilb@xxxxxxx wrote:
> > > 
> > > snip
> > > 
> > > > > > Maybe:
> > > > > > 	switch (cache_make_upcall(detail, h)) {
> > > > > > 	case -EINVAL:
> > > > > > 		if (rv) {
> > > > > > 			set_bit(CACHE_NEGATIVE, &h->flags);
> > > > > > 			cache_fresh_locked(h, get_seconds() + CACHE_NEW_EXPIRY);
> > > > > > 			rv = -ENOENT;
> > > > > > 		}
> > > > > > 		/* FALLTHROUGH */
> > > > > > 	case -EAGAIN:
> > > > > > 		cache_fresh_unlocked(h, detail);
> > > > > > 	}
> > > > > > 
> > > > > I agree, your patch is obviously better than mine.
> > > > > But let me suggest one little change: I would like to substitute
> > > > > cache_fresh_unlocked() with clear_bit() and cache_revisit_request(),
> > > > > as the call to cache_dequeue() in cache_fresh_unlocked() seems to
> > > > > be obsolete here:
> > > > 
> > > > It is exactly this sort of thinking (on my part) that got us into this mess
> > > > in the first place.  I reasoned that the full locking/testing/whatever wasn't
> > > > necessary and took a short cut.  It wasn't a good idea.
> > > 
> > > Maybe I'm totally wrong, but AFAICS, calling cache_dequeue() here in extreme
> > > situations (two threads calling cache_check() in parallel while a first reader
> > > is about to open cache access) could again cause a race (?)
> > 
> > Can you explain the race you see?
> 
> O.k., let me try ...
> 
> Suppose there are 2 threads calling cache_check() concurrently and a
> user process that is about to become a reader for the involved cache.
> 
> The first thread calls cache_is_valid() and gets -EAGAIN.
> Next, it sets CACHE_PENDING and calls cache_make_upcall(), which
> returns -EINVAL, as there is no reader yet.
> 
> Meanwhile the second thread also calls cache_is_valid() and also gets
> -EAGAIN, but now it is interrupted and delayed for a while.
> 
> The first thread continues its work, negates the entry and calls
> cache_fresh_locked() followed by cache_fresh_unlocked().
> In cache_fresh_unlocked() it clears CACHE_PENDING and calls
> cache_revisit_request(). While doing this, it is interrupted and delayed.
> 
> The user process is scheduled and opens the channel to become a reader.
> 
> The second thread wakes up again and sets CACHE_PENDING. Next it calls
> cache_make_upcall(), which results in a request being queued, as there
> is a reader now.
> 
> The first thread wakes up and continues its work by calling cache_dequeue().
> So the request may be dropped before the reader can process it.
> 
> Do you agree or did I miss something?

Yes, I see what you mean, thanks.  I would close that race with:

@@ -1022,6 +1016,9 @@ static void cache_dequeue(struct cache_detail *detail, struct cache_head *ch)
 		struct cache_request *cr = container_of(cq, struct cache_request, q);
 		if (cr->item != ch)
 			continue;
+		if (test_bit(CACHE_PENDING, &ch->flags))
+			/* Lost a race and it is pending again */
+			break;
 		if (cr->readers != 0)
 			continue;
 		list_del(&cr->q.list);

> > 
> > 
> > > 
> > > BTW: if there is a reader for a cache, is there any protection against many
> > > upcalls being queued for the same cache entry?
> > The CACHE_PENDING flag should provide that protection.  We only queue an
> > upcall after "test_and_set", and always dequeue after "clear_bit".
> 
> Sorry, my question wasn't very clear. CACHE_PENDING *does* provide the
> protection against parallel upcalls.
> 
> OTOH, as cache_is_valid() is called before test_and_set_bit(CACHE_PENDING)
> and the upcall is then made without calling cache_is_valid() again,
> I see a small chance of a second request being queued unnecessarily just
> after the first request was answered.

A small chance, yes.  However I don't think it is harmful, just unnecessary.
So I'm not sure it is worth fixing.

Thanks,
NeilBrown

> 
> Bodo
> 
> > 
> > NeilBrown
> > 
> > 
> > > 
> > > Bodo
> > > 
> > > > Given that this is obviously difficult code to get right, we should make it
> > > > as easy to review as possible.  Having "cache_fresh_unlocked" makes it more
> > > > obviously correct, and that is a good thing.
> > > > Maybe cache_dequeue isn't needed here, but it won't hurt, so I'd much rather
> > > > have the clearer code.
> > > > In fact, I'd also like to change
> > > > 
> > > > 	if (test_and_clear_bit(CACHE_PENDING, &ch->flags))
> > > > 		cache_dequeue(current_detail, ch);
> > > > 	cache_revisit_request(ch);
> > > > 
> > > > near the end of cache_clean to call cache_fresh_unlocked().
> > > > 
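For readers without the source at hand, the interleaving Bodo describes and the
effect of Neil's proposed re-check can be replayed in a small standalone sketch.
This is plain userspace C, not the actual net/sunrpc/cache.c code; the flag, the
queue counter and the helpers are simplified stand-ins, and the two kernel
threads are replayed by hand in main():

/*
 * Standalone sketch (not kernel code) of the race discussed above.
 * "pending" stands in for CACHE_PENDING in ch->flags, "queued" for the
 * upcall queue of one cache_head.
 */
#include <stdio.h>
#include <stdbool.h>

static bool pending;         /* models CACHE_PENDING in ch->flags      */
static bool reader_present;  /* models an open cache channel           */
static int  queued;          /* models requests queued for the reader  */

/* models cache_make_upcall(): queue a request only if a reader exists */
static int make_upcall(void)
{
	if (!reader_present)
		return -1;   /* -EINVAL in the real code */
	queued++;
	return 0;
}

/* models cache_dequeue() with the proposed CACHE_PENDING re-check */
static void dequeue(void)
{
	if (pending)
		return;      /* lost a race and it is pending again */
	if (queued > 0)
		queued--;
}

int main(void)
{
	/* Thread 1: sets CACHE_PENDING; the upcall fails, no reader yet. */
	pending = true;
	make_upcall();

	/* Thread 1: cache_fresh_unlocked() clears CACHE_PENDING ... */
	pending = false;

	/* ... the user process opens the channel and becomes a reader ... */
	reader_present = true;

	/* ... and thread 2 sets CACHE_PENDING and queues a real request. */
	pending = true;
	make_upcall();

	/* Thread 1 resumes and calls cache_dequeue(). */
	dequeue();

	printf("requests left for the reader: %d\n", queued);  /* prints 1 */
	return 0;
}

With the pending re-check in dequeue() the program reports one request left for
the reader; dropping that check makes the count fall to zero, which is exactly
the lost request described above.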