Hi Thomas,

On Mon, Aug 01, 2022 at 10:48:56PM +0200, Thomas Gleixner wrote:
> On Sun, Jul 31 2022 at 03:31, Jason A. Donenfeld wrote:
>
> You clearly forgot to tell people that they need a special config to
> make this compile.

As I wrote in my patch body:

 | The actual place that has the most work to do is in all of the other
 | files. Most of the vDSO shared page infrastructure is centered around
 | gettimeofday, and so the main structs are all in arrays for different
 | timestamp types, and attached to time namespaces, and so forth. I've
 | done the best I could to add onto this in an unintrusive way, but you'll
 | notice almost immediately from glancing at the code that it still needs
 | some untangling work. This also only works on x86 at the moment. I could
 | certainly use a hand with this part.

So I'm not surprised other things are screwed up. This works well in my
test harness, indeed, but I imagine there are lots of fiddly bits like
that to work out. I wanted to send an RFC to elicit comments on the idea
and API before moving forward, as I have a strong sense this is one of
those "90% 10%" things, where 10% of the details take 90% of the time.

Also, I haven't hooked up vdso32 yet.

> > +vobjs-y := vdso-note.o vclock_gettime.o vgetcpu.o vgetrandom.o
>
> I don't even have to try to see that this cannot build with a defconfig:
>
> Lacks -pg for that file and the included chacha.c contains
> EXPORT_SYMBOL() which is not really working in the VDSO.

Thanks, I'll address this if I do a v3. You meant the removal of -pg,
right? For the EXPORT_SYMBOL() stuff (and other symbols), I'm not sure
whether I'll add an #ifdef maze, hoist a static function into a .h file,
or just make another minier implementation of the necessary functions.
Each approach has a pitfall.

> > +DECLARE_VVAR_SINGLE(640, struct vdso_rng_data, _vdso_rng_data)
> ...
> > +#define __vdso_rng_data (VVAR(_vdso_rng_data))
> > +
> > +static __always_inline const struct vdso_rng_data *__arch_get_vdso_rng_data(void)
> > +{
> > +	return &__vdso_rng_data;
> > +}
>
> That's not working with time name spaces.
>
> > +static __always_inline ssize_t
> > +__cvdso_getrandom(void *opaque_state, void *buffer, size_t len, unsigned int flags)
> > +{
> > +	struct getrandom_state *state = opaque_state;
> > +	const struct vdso_rng_data *rng_info = __arch_get_vdso_rng_data();
>
> This gives you vvar__vdso_rng_data and that points to the VVAR page at
> offset 640. That works up to the point where a task is part of a
> non-root time name space.
>
> The kernel side mapping (the one which is updated) looks like this:
>
>       VVAR_PAGE
>       VIRT_CLOCK_PAGE[S]
>       TIMENS_PAGE
>
> If time namespaces are disabled or the task is in the root time
> namespace then the user mapping is in the same order.
>
> If the task is in the non-root time namespace, then the user mapping is:
>
>       TIMENS_PAGE
>       VIRT_CLOCK_PAGE[S]
>       VVAR_PAGE
>
> So your user space looks at offset 640 in the TIMENS_PAGE, which has
> rand_data->ready and rand_data->generation == 0 forever.
>
> See the comment above timens_setup_vdso_data() and look at the way how
> e.g. __cvdso_time_data() deals with that.

Ahhh, bingo! Thanks a lot for that. I couldn't quite grok before what
was happening with the timens stuff, but I think I get it now. When a
process is made in a timens, these pages are mapped differently, so that
the timens page is in the same place as the init ns page would be.
That's clever.

So I need to figure out some way to make __arch_get_vdso_rng_data()
always return the address of VVAR_PAGE, even when it's been scooted
down... I guess this means checking a bit in what's normally in the vvar
slot, and if it's a timens one, then loading the one that's in the
timens slot, since that'll be the vvar one. Maybe that'll do it.

> VDSO hacking is special and not a sunday evening project.

:) While initially a somewhat bewildering maze, it's a rather fun puzzle.

Jason
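
To make the "check a bit in the vvar slot, then fall through to the timens
slot" idea above concrete, here is a rough userspace mock of the lookup. It
is only a sketch and not kernel code: every mock_* name, the is_timens
marker field, and the slot indices are made up for illustration, and the
only thing taken from the thread is the page ordering Thomas described and
the 640 offset from the quoted DECLARE_VVAR_SINGLE() line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offset of the rng data within a page, as in the quoted patch hunk. */
#define MOCK_RNG_OFFSET 640

/* Hypothetical slot indices of the three pages in the user mapping. */
enum { SLOT_VVAR = 0, SLOT_VIRT_CLOCK = 1, SLOT_TIMENS = 2, SLOT_COUNT = 3 };

/* Hypothetical stand-in for struct vdso_rng_data. */
struct mock_rng_data {
	uint64_t generation;
	bool ready;
};

/* Hypothetical page layout: a marker word first, rng data at offset 640. */
struct mock_page {
	uint32_t is_timens; /* assumption: nonzero only on the timens page */
	unsigned char pad[MOCK_RNG_OFFSET - sizeof(uint32_t)];
	struct mock_rng_data rng;
};

_Static_assert(offsetof(struct mock_page, rng) == MOCK_RNG_OFFSET,
	       "rng data must sit at the fixed offset");

/*
 * Build the user-visible mapping. In the root namespace the order is
 * VVAR, VIRT_CLOCK, TIMENS; in a non-root time namespace the VVAR and
 * TIMENS pages trade places.
 */
void mock_build_mapping(struct mock_page *mapping, bool in_timens,
			uint64_t generation)
{
	struct mock_page *vvar = &mapping[in_timens ? SLOT_TIMENS : SLOT_VVAR];
	struct mock_page *timens = &mapping[in_timens ? SLOT_VVAR : SLOT_TIMENS];

	memset(mapping, 0, sizeof(*mapping) * SLOT_COUNT);
	timens->is_timens = 1;
	vvar->rng.generation = generation;
	vvar->rng.ready = true;
}

/*
 * Look at the page in the usual vvar slot. If it turns out to be the
 * timens page, the real vvar page is the one sitting in the timens slot.
 */
const struct mock_rng_data *mock_get_rng_data(const struct mock_page *mapping)
{
	const struct mock_page *page = &mapping[SLOT_VVAR];

	if (page->is_timens)
		page = &mapping[SLOT_TIMENS];
	return &page->rng;
}
```

Calling mock_build_mapping() once with in_timens false and once with it
true, then reading through mock_get_rng_data(), should yield the live
generation in both cases, whereas reading slot 0 directly in the timens
case would see zeros. The real thing would presumably key off whatever
marker the existing timens code already uses rather than a new field.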