On Mon, 2014-03-31 at 17:05 -0700, Andrew Morton wrote:
> On Mon, 31 Mar 2014 16:25:32 -0700 Davidlohr Bueso <davidlohr@xxxxxx> wrote:
>
> > On Mon, 2014-03-31 at 16:13 -0700, Andrew Morton wrote:
> > > On Mon, 31 Mar 2014 15:59:33 -0700 Davidlohr Bueso <davidlohr@xxxxxx> wrote:
> > >
> > > > >
> > > > > - Shouldn't there be a way to alter this namespace's shm_ctlmax?
> > > >
> > > > Unfortunately this would also add the complexity I previously mentioned.
> > >
> > > But if the current namespace's shm_ctlmax is too small, you're screwed.
> > > Have to shut down the namespace all the way back to init_ns and start
> > > again.
> > >
> > > > > - What happens if we just nuke the limit altogether and fall back to
> > > > > the next check, which presumably is the rlimit bounds?
> > > >
> > > > afaik we only have rlimit for msgqueues. But in any case, while I like
> > > > that simplicity, it's too late. Too many workloads (specially DBs) rely
> > > > heavily on shmmax. Removing it and relying on something else would thus
> > > > cause a lot of things to break.
> > >
> > > It would permit larger shm segments - how could that break things? It
> > > would make most or all of these issues go away?
> > >
> >
> > So sysadmins wouldn't be very happy, per man shmget(2):
> >
> > EINVAL A new segment was to be created and size < SHMMIN or size >
> > SHMMAX, or no new segment was to be created, a segment with given key
> > existed, but size is greater than the size of that segment.
>
> So their system will act as if they had set SHMMAX=enormous. What
> problems could that cause?

So, just like any sysctl configurable, only privileged users can change
this value. If we remove this option, users can theoretically create
huge segments, thus ignoring any custom limit previously set. This is
what I fear. Think of it kind of like mlock's rlimit. And for that
matter, why does sysctl exist at all, the same would go for the rest of
the limits.

> Look. The 32M thing is causing problems. Arbitrarily increasing the
> arbitrary 32M to an arbitrary 128M won't fix anything - we still have
> the problem. Think bigger, please: how can we make this problem go
> away for ever?

That's the thing, I don't think we can make it go away without breaking
userspace. I'm not saying that my 4x increase is the correct value, I
don't think any default value is really correct, as with any other
hardcoded limits there are pros and cons. That's really why we give
users the option to change it to the "correct" one via sysctl. All I'm
saying is that 32mb is just too small for default in today's systems,
and increasing it is just making a bad situation a tiny bit better.

Thanks,
Davidlohr
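
For readers following the thread, the size rule quoted from shmget(2) for a newly created segment can be modelled in a few lines of C. This is only a sketch under stated assumptions, not the kernel's implementation: shm_size_check is a hypothetical helper, and its shmmin/shmmax parameters are illustrative stand-ins for SHMMIN and the sysctl-configured SHMMAX.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper, not kernel code: models the size rule that the
 * shmget(2) excerpt describes for a segment that is about to be created.
 * shmmin and shmmax stand in for SHMMIN and the sysctl-set SHMMAX.
 */
static int shm_size_check(size_t size, size_t shmmin, size_t shmmax)
{
    if (size < shmmin || size > shmmax)
        return -EINVAL;   /* size falls outside [SHMMIN, SHMMAX] */
    return 0;             /* size is acceptable under this limit */
}
```

Under this model, with the limit at 32M a 64M request returns -EINVAL, while the same request passes once the limit is 128M. Removing the check entirely would let any size through, which is the "SHMMAX=enormous" behaviour debated above.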