On Mon, Oct 01, 2012 at 08:56:21PM +0000, Brad Walker wrote:
> Kent Overstreet <koverstreet@...> writes:
>
> > On Mon, Oct 01, 2012 at 08:05:14PM +0000, Brad Walker wrote:
> > > Kent Overstreet <koverstreet@...> writes:
> > >
> > > > What about cache_bypass_hits, cache_bypass_misses?
> > >
> > > cache_bypass_hits = 0
> > > cache_bypass_misses = 0
> >
> > I should've just asked you for all the stats - what about
> > cache_miss_collision?

So cache_miss_collisions, cache_read_races are 0...

----

I was just browsing around the code, and I bet I know what it is -
btree_insert_check_key() is failing because the btree node is full.

The way the code works is that on a cache miss, we can't just blindly
insert that data into the cache: if a write happens to the same location
after the cache miss but before the data from the cache miss gets
inserted, we'd overwrite the write with stale data.

So btree_insert_check_key() inserts a fake key atomically with the cache
miss - we don't need that key to be persisted, so we can skip journalling
and all the normal btree insert code, which is how we can insert this
fake key atomically.

Then, when we go to insert the real key that points to the data from the
cache miss, we check if the fake key we inserted is still present and
fail the insert if it's not. It's cmpxchg(), but for the btree.

Anyways... since we're skipping all the normal btree_insert() code,
btree_insert_check_key() can't split the btree node if it's full - if
the btree node is full, it just fails.

This'd be perfectly fine in any normal workload where you've got some
mix of reads and writes... if the btree node is full, a write will come
along to split it. But the synthetic workload is a bit of a pathological
case here :)

But, we should confirm this really is what's going on... Can you apply
this patch and rerun to test my theory? See if the number of times the
printk fires lines up with the number of cache misses.

diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 4102267..d5c5313 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -1875,9 +1875,13 @@ bool bch_btree_insert_check_key(struct btree *b, struct btree_op *op,
 	rw_unlock(false, b);
 	rw_lock(true, b, b->level);
 
+	if (should_split(b)) {
+		printk(KERN_DEBUG "bcache: bch_btree_insert_check_key() failed because btree node full\n");
+		goto out;
+	}
+
 	if (b->key.ptr[0] != btree_ptr ||
-	    b->seq != seq + 1 ||
-	    should_split(b))
+	    b->seq != seq + 1)
 		goto out;
 
 	op->replace = KEY(op->inode, bio_end(bio), bio_sectors(bio));
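
For anyone following along, here is a rough userspace sketch of the
check-key idea described above. It is a toy model, not the bcache code:
struct toy_node, NODE_CAPACITY and every toy_*() function are
hypothetical names made up for illustration, and a write here simply
drops whole overlapping keys rather than trimming them. It only aims to
show the two failure cases: the placeholder insert failing because the
node is full (there is no split path), and the final insert failing
because a write removed the placeholder in the meantime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NODE_CAPACITY 4	/* hypothetical; real btree nodes hold far more keys */

/* Toy key covering sectors [start, end); "check" marks the fake placeholder. */
struct toy_key {
	unsigned long start, end;
	bool check;
};

struct toy_node {
	struct toy_key keys[NODE_CAPACITY];
	size_t nkeys;
};

static bool overlaps(const struct toy_key *k, unsigned long start,
		     unsigned long end)
{
	return k->start < end && start < k->end;
}

/* Drop every key overlapping [start, end), as a newer write would. */
static void drop_overlapping(struct toy_node *n, unsigned long start,
			     unsigned long end)
{
	size_t i, out = 0;

	for (i = 0; i < n->nkeys; i++)
		if (!overlaps(&n->keys[i], start, end))
			n->keys[out++] = n->keys[i];
	n->nkeys = out;
}

/* Cache miss: add the placeholder. No split path, so a full node fails. */
static bool toy_insert_check_key(struct toy_node *n, unsigned long start,
				 unsigned long end)
{
	assert(n->nkeys <= NODE_CAPACITY);
	if (n->nkeys == NODE_CAPACITY)
		return false;
	n->keys[n->nkeys++] = (struct toy_key){ start, end, true };
	return true;
}

/* Normal write: replaces whatever overlaps, including a placeholder. */
static bool toy_write(struct toy_node *n, unsigned long start,
		      unsigned long end)
{
	drop_overlapping(n, start, end);
	if (n->nkeys == NODE_CAPACITY)
		return false;	/* a real insert would split the node here */
	n->keys[n->nkeys++] = (struct toy_key){ start, end, false };
	return true;
}

/* Miss data arrived: promote the placeholder only if it is still there. */
static bool toy_insert_replace(struct toy_node *n, unsigned long start,
			       unsigned long end)
{
	size_t i;

	for (i = 0; i < n->nkeys; i++) {
		struct toy_key *k = &n->keys[i];

		if (k->check && k->start == start && k->end == end) {
			k->check = false;	/* now a real key */
			return true;
		}
	}
	return false;	/* placeholder gone: a write raced with the miss */
}
```

In this model a read-only workload never calls toy_write(), so once the
node holds NODE_CAPACITY keys every later toy_insert_check_key() fails,
which is the behaviour the debug printk in the patch is meant to detect.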