On Wed, Jan 17, 2018 at 11:26 AM, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Wed, Jan 17, 2018 at 6:17 AM, Alan Cox <alan@xxxxxxxxxxxxxxx> wrote:
>>
>> Can we kill off the remaining users of set_fs() ?
>
> I would love to, but it's not going to happen short-term. If ever.
>
> Some could be removed today: the code in arch/x86/net/bpf_jit_comp.c
> seems to be literally the ramblings of a diseased mind. There's no
> reason for the set_fs(), there's no reason for the
> flush_icache_range() (it's a no-op on x86 anyway), and the smp_wmb()
> looks bogus too.
>
> I have no idea how that braindamage happened, but I assume it got
> copied from some broken source.

At the time commit 0a14842f5a3c0e88a1e59fac5c3025db39721f74 went in,
this was the first JIT implementation for BPF, so I may have wanted to
keep other arches from forgetting to flush the icache: you can bet that
my implementation served as a reference for the other JITs.

At that time, various calls to flush_icache_range() were definitely
present in arch/x86 or kernel/module.c (I believe I copied the code
from kernel/module.c, but I am not sure about that).

>
> But there are about ~100 set_fs() calls in generic code, and some of
> those really are pretty fundamental. Doing things like "kernel_read()"
> without set_fs() is basically impossible.
>
> We've had set_fs() since the beginning. The naming is obviously very
> historical. We have it for a very good reason, and I don't really see
> that reason going away.
>
> So realistically, we want to _minimize_ set_fs(), and we might want to
> make sure that it's done only in limited settings (it might, for
> example, be a good idea and a realistic goal to make sure that drivers
> and modules can't do it, and use proper helper functions like that
> "read_kernel()").
>
> But getting rid of the concept entirely? Doesn't seem likely.
>
> Linus