On 2021/1/12 19:42, Vlastimil Babka wrote:
On 1/12/21 8:24 AM, Michal Hocko wrote:
If we're going to do a separate "patch: make process_sysctl_arg()
return an errno instead of 0" then fine, we can discuss that. But it's
conceptually a different work from fixing this situation.
.
However, are the logs generated by process_sysctl_arg() clearer and more
accurate than parse_args()? Should the logs generated by
process_sysctl_arg() be deleted?
I think the individual logs are very useful and should be retained.
Yes, other sysfs specific error messages are likely useful. I just fail
to see why a missing value should be handled here when there is an
existing handling in the caller. Not sure whether a complete shadow
reporting in process_sysctl_arg is a deliberate decision or not.
Vlastimil?
Yes, it's a way to have more useful sysctl-specific reports than the generic
ones. And I think I was inspired by some other existing code, but don't remember
exactly. The options are:
1) the current sysctl-specific reports, return 0 as the values are only consumed
2) be silent and return error, invent new error codes to have generic report be
more useful for sysctl, but inevitably lose some nuances anyway
3) a mix where 2) is used for situations where generic report is sufficient
enough, 1) where not
Patch v2 went with option 1), v3 with option 3). I think it's down to
preferences. I would personally go with v2 and message similar to the existing
ones, i.e.:
"Failed to set sysctl parameter '%s': no value given\n"
Also we seem to be silently doing nothing when strlen(val) == 0, i.e.
"hung_task_panic=" was passed. Worth reporting the same error.
But v3 is fine with me as well. The generic error message works. We could just
add "if (!len) return -EINVAL" below the strlen() call.
Also please Cc: stable.
Anyway one way or the other, all I care about is to have a reporting in
place because this shouldn't be a silent failure.
The current v2 is already in the linux-next branch and throws a new
error:
https://lore.kernel.org/lkml/cb54e349-7147-0a1f-a349-1e16ba603fce@xxxxxxxxxxxxx/
This bug has been mentioned in the previous discussion and has been
fixed in the current v3 patch.
https://lore.kernel.org/linux-fsdevel/202101111149.20A58E1@keescook/
What am I supposed to do now?
- Resend V3?
- Rewrite a new fix patch based on the current code of linux-next.
- Develop a new V4 patch: Use V2 to discuss how to use the Patch4
solution.
https://lore.kernel.org/linux-fsdevel/bc098af4-c0cd-212e-d09d-46d617d0acab@xxxxxxxxxx/#t
Thanks
Xiaoming Ni