On 2020-09-14 13:41, Miklos Szeredi wrote:
On Thu, Sep 10, 2020 at 5:42 PM <ppvk@xxxxxxxxxxxxxx> wrote:
On 2020-09-08 16:55, Miklos Szeredi wrote:
> On Tue, Sep 8, 2020 at 10:17 AM Pradeep P V K
> <pragalla@xxxxxxxxxxxxxxxx> wrote:
>>
>> From: Pradeep P V K <ppvk@xxxxxxxxxxxxxx>
>>
>> There is a potential race between fuse_abort_conn() and
>> fuse_copy_page() as shown below, due to which VM_BUG_ON_PAGE
>> crash is observed for accessing a free page.
>>
>> context#1:                 context#2:
>> fuse_dev_do_read()         fuse_abort_conn()
>>  ->fuse_copy_args()         ->end_requests()
>
> This shouldn't happen due to FR_LOCKED logic. Are you seeing this on
> an upstream kernel? Which version?
>
> Thanks,
> Miklos
This happens just after unlock_request() in fuse_ref_page().
unlock_request() clears the FR_LOCKED bit.
Since nothing protects context#1 from context#2 once unlock_request()
has cleared the bit, the race can occur at that point.
Ah, indeed, I missed that one.
Similar issue in fuse_try_move_page(), which dereferences oldpage
after unlock_request().
Fix for both is to grab a reference to the page from ap->pages[] array
*before* calling unlock_request().
Attached untested patch. Could you please verify that it fixes the
bug?
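
For readers following the thread, the ordering described above (take a
page reference before unlock_request(), drop it after use) can be
modelled in user space. This is only a sketch under stated assumptions:
it is not the attached patch and not kernel code, and every demo_* name
is hypothetical. The refcount is a plain C11 atomic standing in for the
page reference count, and demo_abort() stands in for the abort path
dropping the request's reference.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page: just a reference count. */
struct demo_page {
	atomic_int refcount;
	bool freed;		/* set when the last reference is dropped */
};

static void demo_page_init(struct demo_page *p)
{
	atomic_init(&p->refcount, 1);	/* reference held by the request */
	p->freed = false;
}

static void demo_get_page(struct demo_page *p)
{
	atomic_fetch_add(&p->refcount, 1);
}

static void demo_put_page(struct demo_page *p)
{
	int old = atomic_fetch_sub(&p->refcount, 1);

	assert(old > 0);	/* dropping a reference we do not hold is a bug */
	if (old == 1)
		p->freed = true;	/* last reference gone: page is "freed" */
}

/* Models the abort path dropping the request's reference. */
static void demo_abort(struct demo_page *p)
{
	demo_put_page(p);
}

/* Pin first, then open the unlocked window. Returns true if the page
 * was still alive at the point where the reader would use it. */
static bool demo_copy_with_pin(struct demo_page *p)
{
	bool alive;

	demo_get_page(p);	/* grab the reference before "unlock" */
	demo_abort(p);		/* abort runs inside the unlocked window */
	alive = !p->freed;	/* our own reference keeps the page alive */
	demo_put_page(p);	/* done with the page */
	return alive;
}

/* No pin: the abort path drops the only reference before the use. */
static bool demo_copy_without_pin(struct demo_page *p)
{
	demo_abort(p);
	return !p->freed;	/* false: this would be a use-after-free */
}
```

In this model, demo_copy_with_pin() sees a live page even though the
abort runs in the window, and the page is freed only on the final put.
demo_copy_without_pin() finds the page already freed, which corresponds
to the reported access to a free page.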
Thanks for the patch. We have seen this issue only once and it is hard
to reproduce, but we will still verify the proposed patch and post the
test results here.
A minor comment on the commit text of the proposed patch: I originally
reported this issue, and the kernel test robot identified compilation
errors in the patch that I submitted.
The confusion might be due to an improper "changes since v1" note in
my commit text.
Thanks,
Miklos
Thanks and Regards,
Pradeep