On Fri 19-03-21 15:34:50, David Hildenbrand wrote:
Exploring /dev/kmem and /dev/mem in the context of memory hot(un)plug and memory ballooning, I started questioning the existance of /dev/kmem. Comparing it with the /proc/kcore implementation, it does not seem to be able to deal with things like a) Pages unmapped from the direct mapping (e.g., to be used by secretmem) -> kern_addr_valid(). virt_addr_valid() is not sufficient. b) Special cases like gart aperture memory that is not to be touched -> mem_pfn_is_ram() Unless I am missing something, it's at least broken in some cases and might fault/crash the machine. Looks like its existance has been questioned before in 2005 and 2010 [1], after ~11 additional years, it might make sense to revive the discussion. CONFIG_DEVKMEM is only enabled in a single defconfig (on purpose or by mistake?). All distributions I looked at disable it. 1) /dev/kmem was popular for rootkits [2] before it got disabled basically everywhere. Ubuntu documents [3] "There is no modern user of /dev/kmem any more beyond attackers using it to load kernel rootkits.". RHEL documents in a BZ [5] "it served no practical purpose other than to serve as a potential security problem or to enable binary module drivers to access structures/functions they shouldn't be touching" 2) /proc/kcore is a decent interface to have a controlled way to read kernel memory for debugging puposes. (will need some extensions to deal with memory offlining/unplug, memory ballooning, and poisoned pages, though) 3) It might be useful for corner case debugging [1]. KDB/KGDB might be a better fit, especially, to write random memory; harder to shoot yourself into the foot. 4) "Kernel Memory Editor" hasn't seen any updates since 2000 and seems to be incompatible with 64bit [1]. For educational purposes, /proc/kcore might be used to monitor value updates -- or older kernels can be used. 5) It's broken on arm64, and therefore, completely disabled there. Looks like it's essentially unused and has been replaced by better suited interfaces for individual tasks (/proc/kcore, KDB/KGDB). Let's just remove it. [1] https://lwn.net/Articles/147901/ [2] https://www.linuxjournal.com/article/10505 [3] https://wiki.ubuntu.com/Security/Features#A.2Fdev.2Fkmem_disabled [4] https://sourceforge.net/projects/kme/ [5] https://bugzilla.redhat.com/show_bug.cgi?id=154796 Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> Cc: Hillf Danton <hdanton@xxxxxxxx> Cc: Michal Hocko <mhocko@xxxxxxxx> Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx> Cc: Oleksiy Avramchenko <oleksiy.avramchenko@xxxxxxxxxxxxxx> Cc: Steven Rostedt <rostedt@xxxxxxxxxxx> Cc: Minchan Kim <minchan@xxxxxxxxxx> Cc: huang ying <huang.ying.caritas@xxxxxxxxx> Cc: Jonathan Corbet <corbet@xxxxxxx> Cc: Russell King <linux@xxxxxxxxxxxxxxx> Cc: Liviu Dudau <liviu.dudau@xxxxxxx> Cc: Sudeep Holla <sudeep.holla@xxxxxxx> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@xxxxxxx> Cc: Andrew Lunn <andrew@xxxxxxx> Cc: Gregory Clement <gregory.clement@xxxxxxxxxxx> Cc: Sebastian Hesselbarth <sebastian.hesselbarth@xxxxxxxxx> Cc: Yoshinori Sato <ysato@xxxxxxxxxxxxxxxxxxxx> Cc: Brian Cain <bcain@xxxxxxxxxxxxxx> Cc: Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> Cc: Jonas Bonn <jonas@xxxxxxxxxxxx> Cc: Stefan Kristiansson <stefan.kristiansson@xxxxxxxxxxxxx> Cc: Stafford Horne <shorne@xxxxxxxxx> Cc: Rich Felker <dalias@xxxxxxxx> Cc: "David S. Miller" <davem@xxxxxxxxxxxxx> Cc: Chris Zankel <chris@xxxxxxxxxx> Cc: Max Filippov <jcmvbkbc@xxxxxxxxx> Cc: Arnd Bergmann <arnd@xxxxxxxx> Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> Cc: Alexander Viro <viro@xxxxxxxxxxxxxxxxxx> Cc: Rob Herring <robh@xxxxxxxxxx> Cc: "Pavel Machek (CIP)" <pavel@xxxxxxx> Cc: Theodore Dubois <tblodt@xxxxxxxxxx> Cc: "Alexander A. Klimov" <grandmaster@xxxxxxxxxxxx> Cc: Pavel Machek <pavel@xxxxxx> Cc: Sam Ravnborg <sam@xxxxxxxxxxxx> Cc: Alexandre Belloni <alexandre.belloni@xxxxxxxxxxx> Cc: Andrey Zhizhikin <andrey.zhizhikin@xxxxxxxxxxxxxxxxxxxx> Cc: Randy Dunlap <rdunlap@xxxxxxxxxxxxx> Cc: Krzysztof Kozlowski <krzk@xxxxxxxxxx> Cc: Viresh Kumar <viresh.kumar@xxxxxxxxxx> Cc: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx> Cc: Xiaoming Ni <nixiaoming@xxxxxxxxxx> Cc: Robert Richter <rric@xxxxxxxxxx> Cc: William Cohen <wcohen@xxxxxxxxxx> Cc: Corentin Labbe <clabbe@xxxxxxxxxxxx> Cc: Kairui Song <kasong@xxxxxxxxxx> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> Cc: linux-doc@xxxxxxxxxxxxxxx Cc: linux-kernel@xxxxxxxxxxxxxxx Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx Cc: uclinux-h8-devel@xxxxxxxxxxxxxxxxxxxx Cc: linux-hexagon@xxxxxxxxxxxxxxx Cc: linux-m68k@xxxxxxxxxxxxxxxxxxxx Cc: openrisc@xxxxxxxxxxxxxxxxxxxx Cc: linux-sh@xxxxxxxxxxxxxxx Cc: sparclinux@xxxxxxxxxxxxxxx Cc: linux-xtensa@xxxxxxxxxxxxxxxx Cc: linux-fsdevel@xxxxxxxxxxxxxxx Cc: Linux API <linux-api@xxxxxxxxxxxxxxx> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
Acked-by: Michal Hocko <mhocko@xxxxxxxx> -- Michal Hocko SUSE Labs