Re: [PATCH] netfs: Add a check for NULL folioq in netfs_writeback_unlock_folios

Chang Yu <marcus.yu.56@xxxxxxxxx> · Fri, 25 Oct 2024 18:01:06 -0700

On Fri, Oct 25, 2024 at 09:05:53AM +0100, David Howells wrote:
> Chang Yu <marcus.yu.56@xxxxxxxxx> wrote:
> 
> > syzkaller reported a null-pointer dereference bug
> > (https://syzkaller.appspot.com/bug?extid=af5c06208fa71bf31b16) in
> > netfs_writeback_unlock_folios caused by passing a NULL folioq to
> > folioq_folio. Fix by adding a check before entering the loop.
> 
> And, of course, the preceding:
> 
> 	if (slot >= folioq_nr_slots(folioq)) {
> 
> doesn't oops because it doesn't actually dereference folioq.
> 
> However... if we get into this function, there absolutely *should* be at least
> one folioq in the rolling buffer.  Part of the rolling buffer's method of
> operation involves keeping at least one folioq around at all times so that we
> don't need to use locks to add/remove from the queue.
> 
> Either the rolling buffer wasn't initialised yet (and it should be initialised
> for all write requests by netfs_create_write_req()) or it has been destroyed
> already.
> 
> Either way, your patch is, unfortunately, just covering up the symptoms rather
> than fixing the root cause.  I suggest instead that we patch the function to
> detect the empty rolling buffer up front, dump some information about the bad
> request and return.
> 
> David
> 
I see. This might be a stupid question, but is it ever possible that we have
exactly one folioq and at the same time

        slot >= folioq_nr_slots(folioq)

is true? Then I imagine netfs_delete_buffer_head would return NULL and
cause the bug to trigger as well?