Coly Li <colyli@xxxxxxx> 于2023年2月2日周四 20:22写道: > Hmm, there should be something to be fixed, but not the non-exist NULL dereference. > > If you look inside btree_node_alloc_replacement(), you may find __bch_btree_node_alloc() which does the real work indeed. And yes, I would suggest you to improve a bit inside __bch_btree_node_alloc(). > > 1088 struct btree *__bch_btree_node_alloc(struct cache_set *c, struct btree_op *op, > [snipped] > 1093 struct btree *b = ERR_PTR(-EAGAIN); > 1094 > 1095 mutex_lock(&c->bucket_lock); > 1096 retry: > 1097 if (__bch_bucket_alloc_set(c, RESERVE_BTREE, &k.key, wait)) > 1098 goto err; > [snipped] > 1102 > 1103 b = mca_alloc(c, op, &k.key, level); > 1104 if (IS_ERR(b)) > 1105 goto err_free; > 1106 > 1107 if (!b) { > 1108 cache_bug(c, > 1109 "Tried to allocate bucket that was in btree cache"); > 1110 goto retry; > 1111 } > 1112 > > From the above code, at line 1097 if __bch_bucket_alloc_set() returns non-zero value, the code will jump to label err: and returns ERR_PTR(-EAGAIN). But if the code fails at line 1103 and b is set to NULL, at line 1110 the code will jump back to label retry: and calls __bch_bucket_alloc_set() again. If the second time __bch_bucket_alloc_set() returns non-zero value and the code jumps to label err:, a NULL pointer other than ERR_PTR(-EAGAIN) will be returned. This inconsistent return value on same location at line 1097 seems incorrect, where ERR_PTR(-EAGAIN) should be returned IMHO. > > Therefore I feel that “b = ERR_PTR(-EAGAIN)” should be moved to the location after label retry:, then btree_node_alloc_replacement() will only return error code, and no NULL pointer returned. > > After this change, all following IS_ERR_OR_NULL() checks to btree_node_alloc_replacement() are unnecessary and IS_ERR() just enough, because no NULL will be returned. > > This is a nice catch. I’d like to have it to be fixed. I do appreciate if you want to compose two patches for the fix. Otherwise I can handle it myself. > Hi Coly, Thanks for your reply and detailed explaination! As you explain, I found __bch_btree_node_alloc may return NULL in some situation. So I add some more check in upper code. Your suggestion is more constructive. It'll make the function more clear for other developer. I'd like to help with the patch. And you have kindly pointed the right way to fix. May I merge fix it in one patch with the commit msg "refactor __bch_btree_node_alloc to avoid poential NULL dereference"? Because I think if __bch_btree_node_alloc returns NULL to bch_btree_node_alloc, the function btree_node_alloc_replacement will strill return NULL to n1 in btree_split. I think the possibility is low, if it's not proper, please feel free to let me know. Best regards, Zheng Wang