On Mon, Sep 25, 2023 at 10:27:02PM -0500, Eric W. Biederman wrote: > Kees Cook <keescook@xxxxxxxxxxxx> writes: > > > On Mon, Sep 25, 2023 at 05:27:12PM +0200, Sebastian Ott wrote: > >> On Mon, 25 Sep 2023, Eric W. Biederman wrote: > >> > > >> > Implement a helper elf_load that wraps elf_map and performs all > >> > of the necessary work to ensure that when "memsz > filesz" > >> > the bytes described by "memsz > filesz" are zeroed. > >> > > >> > Link: https://lkml.kernel.org/r/20230914-bss-alloc-v1-1-78de67d2c6dd@xxxxxxxxxxxxxx > >> > Reported-by: Sebastian Ott <sebott@xxxxxxxxxx> > >> > Reported-by: Thomas Weißschuh <linux@xxxxxxxxxxxxxx> > >> > Signed-off-by: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx> > >> > --- > >> > fs/binfmt_elf.c | 111 +++++++++++++++++++++--------------------------- > >> > 1 file changed, 48 insertions(+), 63 deletions(-) > >> > > >> > Can you please test this one? > > > > Eric thanks for doing this refactoring! This does look similar to the > > earlier attempt: > > https://lore.kernel.org/lkml/20221106021657.1145519-1-pedro.falcato@xxxxxxxxx/ > > and it's a bit easier to review. > > I need to context switch away for a while so Kees if you will > I will let you handle the rest of this one. > > > A couple of thoughts running through my head for anyone whose ambitious > might include cleaning up binfmt_elf.c > > The elf_bss variable in load_elf_binary can be removed. > > Work for a follow on patch is using my new elf_load in load_elf_interp > and possibly in load_elf_library. (More code size reduction). > > An outstanding issue is if the first segment has filesz 0, and has a > randomized locations. But that is the same as today. > > There is a whole question does it make sense for the elf loader > to have it's own helper vm_brk_flags in mm/mmap.c or would it > make more sense for binfmt_elf to do what binfmt_elf_fdpic does and > have everything to go through vm_mmap. > > I think replacing vm_brk_flags with vm_mmap would allow fixing the > theoretical issue of filesz 0 and randomizing locations. > > > > In this change I replaced an open coded padzero that did not clear > all of the way to the end of the page, with padzero that does. Yeah, the resulting code is way more readable now. > I also stopped checking the return of padzero as there is at least > one known case where testing for failure is the wrong thing to do. > It looks like binfmt_elf_fdpic may have the proper set of tests > for when error handling can be safely completed. > > I found a couple of commits in the old history > https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git, > that look very interesting in understanding this code. > > commit 39b56d902bf3 ("[PATCH] binfmt_elf: clearing bss may fail") > commit c6e2227e4a3e ("[SPARC64]: Missing user access return value checks in fs/binfmt_elf.c and fs/compat.c") > commit 5bf3be033f50 ("v2.4.10.1 -> v2.4.10.2") > > Looking at commit 39b56d902bf3 ("[PATCH] binfmt_elf: clearing bss may fail"): > > commit 39b56d902bf35241e7cba6cc30b828ed937175ad > > Author: Pavel Machek <pavel@xxxxxx> > > Date: Wed Feb 9 22:40:30 2005 -0800 > > > > [PATCH] binfmt_elf: clearing bss may fail > > > > So we discover that Borland's Kylix application builder emits weird elf > > files which describe a non-writeable bss segment. > > > > So remove the clear_user() check at the place where we zero out the bss. I > > don't _think_ there are any security implications here (plus we've never > > checked that clear_user() return value, so whoops if it is a problem). > > > > Signed-off-by: Pavel Machek <pavel@xxxxxxx> > > Signed-off-by: Andrew Morton <akpm@xxxxxxxx> > > Signed-off-by: Linus Torvalds <torvalds@xxxxxxxx> > > It seems pretty clear that binfmt_elf_fdpic with skipping clear_user > for non-writable segments and otherwise calling clear_user (aka padzero) > and checking it's return code is the right thing to do. > > I just skipped the error checking as that avoids breaking things. > > It looks like Borland's Kylix died in 2005 so it might be safe to > just consider read-only segments with memsz > filesz an error. I really feel like having a read-only BSS is a pathological state that should be detected early? > Looking at commit 5bf3be033f50 ("v2.4.10.1 -> v2.4.10.2") the > binfmt_elf.c bits confirm my guess that the weird structure is because > before that point binfmt_elf.c assumed there would be only a single > segment with memsz > filesz. Which is why the code was structured so > weirdly. Agreed. > Looking a little farther it looks like the binfmt_elf.c was introduced > in Linux v1.0, with essentially the same structure in load_elf_binary as > it has now. Prior to that Linux hard coded support for a.out binaries > in execve. So if someone wants to add a Fixes tag it should be > "Fixes: v1.0" > > Which finally explains to me why the code is so odd. For the most part > the code has only received maintenance for the last 30 years or so. > Strictly 29 years, but 30 has a better ring to it. > > Anyway those are my rambling thoughts that might help someone. > For now I will be happy if we can get my elf_load helper tested > to everyone's satisfaction and merged. I'm probably going to pull most of this email into the commit log for the v2 patch -- there's good history here worth capturing. -- Kees Cook