On Thu, 7 Dec 2017, Geert Uytterhoeven wrote:

> Hi Alan,
>
> On Wed, Dec 6, 2017 at 11:02 PM, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
> > On Wed, 6 Dec 2017, SF Markus Elfring wrote:
> >> >>> Does the existing memory allocation error message include the
> >> >>> &udev->dev device name and driver name? If it doesn't, there will be
> >> >>> no way for the user to tell that the error message is related to the
> >> >>> device failure.
> >> >>
> >> >> No, but the effect is similar.
> >> >>
> >> >> OOM does a dump_stack() so this function's call tree is shown.
> >> >
> >> > A call stack doesn't tell you which device was being handled.
> >>
> >> Do you find a default Linux allocation failure report insufficient then?
> >>
> >> Would you like to achieve that the requested information can be determined
> >> from a backtrace?
> >
> > It is not practical to do this. The memory allocation routines do not
> > know for what purpose the memory is being allocated; hence when a failure
> > occurs they cannot tell what device (or other part of the system) will
> > be affected.
>
> If even allocation of 24 bytes fails, lots of other devices and other parts of
> the system will start failing really soon...

In fact, one wonders if the allocation routine's own error message and
stack trace would actually appear anywhere...

> > That's why we have a secondary error message.
>
> ... and the secondary error message would still be useless.

Well, there is still a difference between GFP_ATOMIC and GFP_KERNEL
allocations. Failure of the first doesn't necessarily imply failure of
the second, so perhaps the system could recover.

The real problem is that the kernel development community doesn't have
a fixed policy on how to handle memory allocation errors. There are
several possibilities:

	Ignore them on the grounds that they will never happen.
	(Really? And what is the size limit above which they might
	happen?)

	Ignore them on the grounds that the machine will hang or
	crash in the near future. (Is this guaranteed?)

	Treat them like other errors: try to press forward (perhaps
	in a degraded mode).

	Treat them like other errors: log an error message and try
	to press forward.

And probably a few more that haven't occurred to me. No doubt there
are examples of each at various places in the kernel.

Nobody seems able to agree on a single course of action. Maybe not
even Linus. If there was one agreed-upon policy, then we could
definitively point to old code and say "That's wrong, and here is how
it should be fixed." But currently this is not possible, and we end up
with repetitive discussions like this one that aren't of general use.

Alan Stern
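
For illustration only, the last policy in the list (log an error message and try to press forward) might look roughly like the userspace sketch below. It is not kernel code: struct fake_device, fake_alloc() and probe_device() are hypothetical names invented here, and the force_fail flag exists only so the failure path can be exercised. The point is that the secondary message names the affected device, which the allocator's own report cannot do.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a device; not a kernel structure. */
struct fake_device {
	const char *name;
	void *buf;
};

/* Hypothetical allocator wrapper that can be told to fail. */
static void *fake_alloc(size_t size, int force_fail)
{
	if (force_fail)
		return NULL;
	return malloc(size);
}

/*
 * "Log an error message and try to press forward": say which device is
 * affected, return -ENOMEM, and leave the next step to the caller.
 */
static int probe_device(struct fake_device *dev, int force_fail)
{
	assert(dev != NULL);

	dev->buf = fake_alloc(24, force_fail);
	if (!dev->buf) {
		fprintf(stderr, "%s: out of memory, device not configured\n",
			dev->name);
		return -ENOMEM;
	}
	return 0;
}
```

Calling probe_device() with force_fail set returns -ENOMEM and writes one line naming the device to stderr; with force_fail clear it returns 0 and the caller owns dev->buf. Dropping the fprintf() gives the "press forward without logging" variant, and not checking the return value at all corresponds to the first two policies.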