On Tue, Dec 08, 2020 at 03:40:52PM -0800, Dan Williams wrote:
> On Tue, Dec 8, 2020 at 2:49 PM Darrick J. Wong <darrick.wong@xxxxxxxxxx> wrote:
> [..]
> > > So what's your preferred poison?
> > >
> > > 1. Corrupt random data in whatever's been mapped into the next page (which
> > > is what the helpers currently do)
> >
> > Please no.
>
> My assertion is that the kernel can't know it's corruption, it can
> only know that the driver is abusing the API. So over-copy and WARN
> seems better than violently regress by crashing what might have been
> working silently before.

Right now we have a mixed bag. zero_user() [and its variants, circa 2008]
does a BUG_ON.[0] The other ones do nothing: clear_highpage(),
clear_user_highpage(), copy_user_highpage(), and copy_highpage().

While continuing to audit the code I don't see any users who would violate
the API after a simple conversion of the code. The calls which I have worked
on [which are many at this point] all have checks in place which are well
aware of page boundaries.

Therefore, I tend to agree with Dan that if anything is to be done it should
be a WARN_ON(), which only reports that something has probably been wrong all
along and should be fixed, while the code continues running as before.

BUG_ON() is a very big hammer. And I don't think that Linus is going to
appreciate a BUG_ON here.[1] Callers of this API should be well aware that
they are operating on a page and that specifying parameters beyond the bounds
of a page is going to have bad consequences...

Furthermore, I'm still leery of adding the WARN_ON's because Greg KH says many
people will be converting them to BUG_ON's via panic-on-warn anyway. But at
least that is their choice.

FWIW I think this is a 'bad BUG_ON' use because we are "checking something
that we know we might be getting wrong".[1] And because, "BUG() is only good
for something that never happens and that we really have no other option
for".[2]

IMO, these calls are like memcpy/memmove. memcpy/memmove don't validate
bounds and developers have lived with those constructs for a long time.

Ira

[0] BTW, after writing this email, with various URL research, I think this
BUG_ON() is also probably wrong...

[1] <quote>
    ...
    It's [BUG_ON] not a "let's check that everybody did things right", it's a
    "this is a major design rule in this core code".
    ...
    </quote>
    -- Linus (https://lkml.org/lkml/2016/10/4/337)

[2] https://yarchive.net/comp/linux/BUG.html
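
To make the "over-copy and WARN" option concrete, here is a minimal userspace
sketch in C. It is not the kernel code: SKETCH_PAGE_SIZE, sketch_warn_on(),
and sketch_zero_range() are hypothetical names invented for illustration, and
sketch_warn_on() only imitates the spirit of WARN_ON() (report and keep
running). The helper trusts its caller the way memcpy() does, and the only
addition is a warning when the range crosses the page boundary.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Assumed page size for this sketch; real page sizes are per-architecture. */
#define SKETCH_PAGE_SIZE 4096u

/* Counts warnings so a caller (or a test) can observe them. */
static unsigned int sketch_warn_count;

/*
 * Hypothetical stand-in for WARN_ON(): report the condition and keep
 * running. Returns the condition so callers may branch on it.
 */
static int sketch_warn_on(int cond, const char *what)
{
	if (cond) {
		sketch_warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/*
 * Hypothetical page helper: zero 'len' bytes starting at 'offset' within
 * the page at 'page'. A range that runs past the page boundary triggers a
 * warning, but the write is still performed exactly as before, so callers
 * that happened to work keep working.
 */
static void sketch_zero_range(unsigned char *page, size_t offset, size_t len)
{
	/* Written to avoid overflow in offset + len. */
	sketch_warn_on(offset > SKETCH_PAGE_SIZE ||
		       len > SKETCH_PAGE_SIZE - offset,
		       "range crosses page boundary");
	memset(page + offset, 0, len);
}
```

With a two-page buffer, sketch_zero_range(buf, SKETCH_PAGE_SIZE - 8, 16) warns
once and still zeroes the 8 bytes that spill into the second page. Swapping
the warning for abort() would model the BUG_ON() behaviour instead, which is
the "very big hammer" case.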