On Mon, Feb 24, 2025 at 02:12:13PM +0200, Lilith Gkini wrote: > On Mon, Feb 24, 2025 at 09:00:14AM +0900, Harry Yoo wrote: > > On second thought (after reading your email), > > I think it should be (fp == NULL) && (search == NULL)? > > > > When fp is NULL after the loop, it means (the end of the freelist > > is NULL) AND (there were equal to or less than (slab->objects - nr) objects > > on the freelist). > > > > If the loop has ended after observing fp == NULL, > > on_freelist() should return true only when search == NULL. > > > > If fp != NULL, it means there were more objects than it should have on > > the free list. In that case, we can't take risk of iterating the loop > > forever and it's reasonable to assume that the freelist does not > > end with NULL. > > > > > The reason I m saying this is cause the While loop is looking through > > > every non-NULL freelist pointer for the "search" pattern. If someone > > > wants to search for NULL in the freelist that return 1 will never > > > trigger, even for normal freelists. If "fp" was ever null it wouldn't re > > > enter the loop. Thus the result of the function would be search == NULL > > > because the original writers assumed the freelist will always end with > > > NULL. > > > > Agreed. > > > > > As you correctly pointed out there might be a case where the freelist > > > is corrupted and it doesnt end in NULL. In that case two things could happen: > > > > > > 1) The check_valid_pointer() catches it and fixes it, restoring the > > > corrupted freelist. > > > > > > 2) The check_valid_pointer() fails to catch it and there isn't any NULL. > > > > > > > > > In the first case the problem fixes itself and thats probably why > > > validate_slab() worked fine all this time. > > > > > > The second case is pretty rare, but thats the case that validate_slab() > > > wants to rout out when it checks the `!on_freelist(s, slab, NULL)`, > > > right? > > > > Yes. > > > > > Also to make my added suggestion of `return fp == NULL` work in the first > > > case (since it does corrrect the freelist we want it to now return TRUE) > > > we could also add a line that nulls out the fp right after the > > > `set_freepointer(s, object, NULL);`. > > > > Why? > > fp = get_freepionter() will observe NULL anyway. > > > > > But words are cheap, I should test it out dynamically rather than just > > > reading the code to see how it behaves. Feel free to also test it > > > yourself. > > > > Yes, testing is important, and you should test > > to some extent before submitting a patch. > > > > > I know I am supposed to send a proper Patch with the multiple added > > > lines, but for now we are mostly brainstorming solutions. It will be > > > better to see its behavior first in a debugger I think. > > > > I think it's generally considered good practice to discuss before > > writing any code. > > > > > I hope I am making sense with the thought process I outlined for the > > > return thing. I probably shouldn't be writing emails ealry Saturday > > > morning, haha. > > > > > > Also I really appreciate the kind feedback! The Linux Kernel is infamously > > > known for how rude and unhinged people can be, which can make it a bit > > > stressful and intimidating sending any emails, especially for someone > > > starting out such as myself. > > > > You're welcome ;-) > > > > -- > > Cheers, > > Harry Hi, Lilith. If you don't mind, could you please avoid bottom posting and reply with inline comments like how I reply to you? It makes it easier to follow the conversation. > Alright, I managed to run some tests through a debugger. > > You are right, my code isn't correct, `return fp == search` should be > more appropriate. > > So I think the right way would be to do these changes: > - while (fp && nr < slab->objects) { > > - fp = NULL; > > - return fp == search; > > The first line removes the "=" so it doesnt iterate more times than > slab->objects. Yes. > The second line sets fp to NULL for the case where the code caught a > corrupted freelist (check_valid_pointer) and fixes it by setting > the freepointer to NULL (set_freepointer). Now NULL will be in the > freelist. Yes. but do we care about the return value of on_freelist() when the freelist is corrupted? (we don't know if it was null-terminated or not because it's corrupted.) > The third line is the return value: > TRUE if the final fp we got happens to be the search value the caller > was looking for in the freelist. > FALSE if the final fp we got was not the same as the search value. > > This should make the validate_slab() work right and if anyone ever uses > the on_freelist() again with some other search value other than NULL it > should also work as intended. Yes! that makes sense to me. > For my tests I wrote a kernel module that creates an isolated cache with > 8 objects per slab, allocated all 8 of them and freed them. Then called > validate_slab_cache() with my cache and set a breakpoint at on_freelist(). > From there I could set any value I wanted and observe its behavior > through GDB (I used QEMU and GDB-ed remotely from my host). > This way I didn't have to set a bunch of EXPORT_SYMBOL() and change > stuff to not static; It made testing a lot easier. > > Here are the tests I did with this new change I just mentioned. > > Note: By "full slab" I mean that I allocated every object in the slab > and freed them. FYI in slab terms (slab->inuse == slab->objects) means full slab, You probably meant empty slab? > By "partial" I mean that I only allocated and freed some > objects, but not every object in the slab, ie if the total objects can > be 8 I only alloc-ed and freed 7. > > search == NULL > - full slab > - correct tail true > - corrupted tail with garbage false > - corrupted tail with valid address false Two falses because when the freepointer of the tail object is corrupted, the loop stops when nr equals slab->objects? > - partial slab > - correct tail true > - corrupted tail with garbage true This is true because for partial slabs, the number of objects plus 1 for the garbage, does not exceed slab->objects? > - corrupted tail with valid address false > > search == some fake address > - full slab > - correct tail false > - corrupted tail with garbage false > - corrupted tail with valid address false > > - partial slab > - correct tail false > - corrupted tail with garbage false > - corrupted tail with valid address false > > > search == some address in the freelist > - full slab > - correct tail true > - corrupted tail with garbage true > - corrupted tail with valid address true > > - partial slab > - correct tail true > - corrupted tail with garbage true > - corrupted tail with valid address true The result seems valid to me. At least, this is the best SLUB can do while avoiding the risk of infinite loop iteration. > I apologize if am going into too many details with my testing, I just > wanna make sure I didn't miss anything. No, it's ok ;-) > If my proposed changes look confusing I can send a proper patch. > Should I send it in this chain as a reply, or send a new email > and add you as well to the cc? You can either send a new email or reply to this email with In-Reply-To header. And please cc SLAB ALLOCATOR folks in MAINTAINERS file including me—I recently changed my name and email (previously Hyeonggon Yoo). -- Cheers, Harry