In pick_data_bucket(), bcache searches c->data_buckets from tail to
head. If there are buckets (e.g. b1, b2) with the same write_point, we
want to write into b2 until it is full and then write into b1. But
currently, bcache picks b1 and b2 alternately, because ret_task is
overwritten by every matching bucket during the scan, so the match
nearest the head wins.

E.g., with data_buckets as shown below:

c->data_buckets
	|->b1->b2

(1) The first write picks b1 from the list.
(2) After the write, b1 is moved to the tail.
(3) The list then looks like this:

c->data_buckets
	|->b2->b1

(4) The next write picks b2 from the list and then moves b2 to the
    tail.

This commit makes sure we put the writes into b2 until b2 is full and
then pick b1 to continue writing.

The remaining question is: why are there buckets with the same
write_point? Below is one possible scenario:

(1) Process-A writes [16K - 20K]; there is a Bucket-A with
    write_point(A) and key_off (20K).
(2) Process-B writes [4K - 8K]; a new Bucket-B is added, with
    write_point(B) and key_off (8K).
(3) Process-A writes [8K - 12K]; pick_data_bucket() picks Bucket-B
    for the write, as its key_off is 8K.

After (3), we have Bucket-A and Bucket-B with the same write_point(A).

Signed-off-by: Dongsheng Yang <dongsheng.yang@xxxxxxxxxxxx>
---
 drivers/md/bcache/alloc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/md/bcache/alloc.c b/drivers/md/bcache/alloc.c
index 8f5ef27..45c19b0 100644
--- a/drivers/md/bcache/alloc.c
+++ b/drivers/md/bcache/alloc.c
@@ -576,7 +576,8 @@ static struct open_bucket *pick_data_bucket(struct cache_set *c,
 		else if (!bkey_cmp(&ret->key, search))
 			goto found;
 		else if (ret->last_write_point == write_point)
-			ret_task = ret;
+			if (!ret_task)
+				ret_task = ret;
 
 	ret = ret_task ?: list_first_entry(&c->data_buckets,
 					   struct open_bucket, list);
-- 
1.8.3.1