On Sat, May 20, 2023 at 05:25:52AM -0300, Jason Gunthorpe wrote: > On Sat, May 20, 2023 at 06:19:37AM +0100, Lorenzo Stoakes wrote: > > On Fri, May 19, 2023 at 07:17:41PM -0300, Jason Gunthorpe wrote: > > > On Fri, May 19, 2023 at 03:51:51PM +0100, Lorenzo Stoakes wrote: > > > > Given you are sharply criticising the code I authored here, is it too much > > > > to ask for you to cc- me, the author on commentaries like this? Thanks. > > > > > > > > On Fri, May 19, 2023 at 11:39:13AM +0200, Arnd Bergmann wrote: > > > > > From: Arnd Bergmann <arnd@xxxxxxxx> > > > > > > > > > > While looking at an unused-variable warning, I noticed a new interface coming > > > > > in that requires the use of IS_ERR_OR_NULL(), which tends to indicate bad > > > > > interface design and is usually surprising to users. > > > > > > > > I am not sure I understand your reasoning, why does it 'tend to indicate > > > > bad interface design'? You say that as if it is an obvious truth. Not > > > > obvious to me at all. > > > > > > > > There are 3 possible outcomes from the function - an error, the function > > > > failing to pin a page, or it succeeding in doing so. For some of the > > > > callers that results in an error, for others it is not an error. > > > > > > No, there really isn't. > > > > > > Either it pins the page or it doesn't. Returning "NULL" to mean a > > > specific kind of failure was encountered is crazy.. Especially if we > > > don't document what that specific failure even was. > > > > > > > It's not a specific kind of failure, it's literally "I didn't pin any > > pages" which a caller may or may not choose to interpret as a failure. > > Any time gup fails it didn't pin any pages, that is the whole > point. All that is happening is some ill defined subset of gup errors > are returning 0 instead of an error code. > > If we want to enable callers to ignore certain errors then we need to > return error codes with well defined meanings, have documentation what > errors are included and actually make it sane. Yeah I agree it's not exactly great that a failure to pin can be considered an ordinary case, but I don't think a wrapper function is where we should be trying to fix this. In fact this patch isn't even fixing it, it's treating EIO as a success case for the (possibly broken) uprobe case. I think we are at the wrong level of abstraction here, basically. > > > That can be a reason for gup returning 0 but also if it you look at the > > main loop in __get_user_pages_locked(), if it can't find the VMA it will > > bail early, OR if the VMA flags are not as expected it'll bail early. > > And how does that make any sense? Missing VMA should be EFAULT. Yeah missing VMA doesn't really make sense since we invoke the function with the mmap lock held (it _could_ make sense if you were calling it via one of the unlocked functions, speculatively, though how sensible doing that is another discussion...) > > > caller differentiates between an error and not being able to pin - > > uprobe_write_opcode() - which treats failure to pin as a non-error state. > > That looks like a bug since the function returns 0 on success but it > clearly didn't succeed. The code is specifically handling a failure-to-pin separately - set_swbp() -> uprobe_write_opcode() -> install_breakpoint() explicitly does the following:- ret = set_swbp(&uprobe->arch, mm, vaddr); if (!ret) clear_bit(MMF_RECALC_UPROBES, &mm->flags); So even if this is... questionable, the code literally does want to differentiate between an error and a failure to pin. Presumably this is because of the flag check, but yeah we should be differentiating between sub-cases. > > > Also if we decided at some point to return -EIO as an error suddenly we > > would be treating an error state as not an error state in the proposed code > > which sounds like a foot gun. > > No, this returning 0 on failure is a foot gun. Failing to pin a single > page is always an error, the only question is what sub reason caused > the error to happen. There is no third case where it is not an error. > > Jason The uprobe path thinks otherwise, but maybe the answer is that we just need to -EFAULT on missing VMA and -EPERM on invalid flags. I could look into a patch that tries to undo this convention, and then we could revisit this for the wrapper function too.