...
> 	p += BIT_WORD(nr);
> -	if (READ_ONCE(*p) & mask)
> -		return 1;
> -
> 	old = arch_atomic_long_fetch_or(mask, (atomic_long_t *)p);
> 	return !!(old & mask);
> }

This looks like the same pattern (attempting to avoid a locked bus cycle)
that caused the qdisc code to sit on transmit packets (even on x86).
That had some barriers in it (possibly nops on x86) that didn't help -
although the comments suggested otherwise.

I wonder if the pattern has been used anywhere else?

	David