On Fri, Sep 24, 2021 at 6:07 AM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote: > > On Thu, 23 Sep 2021 23:35:47 -0400 > Steven Rostedt <rostedt@xxxxxxxxxxx> wrote: > > > The pid_mask will start out with 1024 entries for the first 10 MSB bits. > > This will cost 4K for 32 bit architectures and 8K for 64 bit. Each of > > these will have a 1024 array to store the next 10 bits of the pid (another > > 4 or 8K). These will hold an 512 byte bitmask (which will cover the LSB 12 > > bits or 4096 bits). > > Thinking about this more, I should adjust this split. > > Currently, this is a 10,10,12 split, but since the upper chunks are > arrays of pointers, and the lower chunk is a bitmask, and that pids > tend to be close together, I should make the lower split bigger. > > As a 4K page is 32768 bits (2^15), the lower split should be 15 bits, > not 12. This will probably allocate better. > > Of course 32 - 15 is 17. So maybe to keep it simple, by having the two > upper chunks still the same in size, I could have it be 14 bits for the > lower (2048 bytes). And since pid_max can only be 2^30, we don't even > need to cover the full 32 bits. > > 30 - 14 = 16 = 8 * 2. > > Then I can make the upper chunks cover 8 bits (arrays of 256 pointers) > and the lower chunks cove 14 bits. This would coincidentally make all > chunks 2K in size (on 64 bit machines). > > Hmm, in that case, I can make the lower and upper chunks use the same > memory and not separate them. They are unions after all. But that may > be unfair for 32 bit machines, as the upper chunks are only going to be > 1K in size for them (256 * 4). Note that there is only one top-level chunk (so its size doesn't matter much), (usually) about one middle-tier chunk (except for some heavy cases, since pids are allocated linearly), and quite some lower-tier bitset leaves. So I'd optimise towards smaller leaves at the expense of middle-tier (and especially top-tier) chunk size (especially considering the fact that in the kernel, buddy allocator is used), like 12-8-12 or something like that, but I have no factual basis for arguing about specific split. Also, I cannot resist from noticing that this reminds me an awful lot of XArray and [1]. Maybe, some wrapper around XArray would do? [1] https://gitlab.com/strace/strace/-/raw/master/src/trie.h > > -- Steve > -- Eugene Syromyatnikov mailto:evgsyr@xxxxxxxxx xmpp:esyr@jabber.{ru|org}