On Tue, 21 Jul 2009 09:03:33 -0700 Dave Hansen <dave@xxxxxxxxxxxxxxxxxx> wrote: > > Once a structure goes over PAGE_SIZE*2, we see occasional > allocation failures. Some people have chosen to switch > over to things like vmalloc() that will let them keep > array-like access to such a large structures. But, > vmalloc() has plenty of downsides. Thanks for looking into this. > Here's an alternative. I think it's what Andrew was > suggesting here: > > http://lkml.org/lkml/2009/7/2/518 > > I call it a flexible array. It does all of its work in > PAGE_SIZE bits, so never does an order>0 allocation. > The base level has PAGE_SIZE-2*sizeof(int) bytes of > storage for pointers to the second level. So, with a > 32-bit arch, you get about 4MB (4183112 bytes) of total > storage when the objects pack nicely into a page. It > is half that on 64-bit because the pointers are twice > the size. > > The interface is dirt simple. 4 functions: > alloc_flex_array() > free_flex_array() flex_array_alloc() and flex_array_free(), please. > flex_array_put() > flex_array_get() It's important to document the arguments too! The lack of an `index' arg to flex_array_put() was important info. > put() appends an item into the array while get() takes > indexes and does array-style access. The interface is rather unexpected. Some callers will want random-access writes and will have sparse data sets. Can we make it more array-like? What are the constraints of this implementation? Maximum index, maximum number of elements, etc? What are the locking rules? Caller-provided, obviously (and correctly). If the caller wants to use a spinlock the caller must use GFP_ATOMIC and handle the fallout when that fails. (But they'd need to handle the fallout with mutex/GFP_KERNEL too). > One thought is that we should perhaps make the base > structure half the size on 32-bit arches. That will > ensure that someone testing on 32-bit will not get > bitten by the size shrinking by half when moving to > 64-bit. {scratches head} If you say so ;) > We could also potentially just pass the "element_size" > into each of the API functions instead of storing it > internally. That would get us one more base pointer > on 32-bit. > > The last improvement that I thought about was letting > the individual array members span pages. In this > implementation, if you have a 2049-byte object, it > will only pack one of them into each "part" with > no attempt to pack them. At this point, I don't think > the added complexity would be worth it. I expect the majority of callers will be storing plain old pointers in here. In fact the facility would be quite useful if it explicitly stored and returned void*'s, like radix-tree and IDR. Do we know of any potential callers which would want flex_array to store elements by value in this manner? > ... > > +struct flex_array *alloc_flex_array(int element_size, int total, gfp_t flags) > +{ > + struct flex_array *ret; > + int max_size = __nr_part_ptrs() * __elements_per_part(element_size); > + > + /* max_size will end up 0 if element_size > PAGE_SIZE */ > + if (total > max_size) > + return NULL; > + ret = kzalloc(sizeof(struct flex_array), flags); > + if (!ret) > + return NULL; > + ret->element_size = element_size; > + return ret; > +} I expect that a decent proportion of users of this facility will only ever want a single flex_array. So they'll want to be able to define and initialise their flex_array at compile-time. That looks pretty easy to do? _______________________________________________ Containers mailing list Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linux-foundation.org/mailman/listinfo/containers