On 1.5.2020 6.14, Hugh Dickins wrote:
On Tue, 28 Apr 2020, Topi Miettinen wrote:
On 28.4.2020 4.34, Hugh Dickins wrote:
On Sat, 25 Apr 2020, Topi Miettinen wrote:
Hi,
It seems that tmpfs does not count memory which is allocated for short
symlinks or xattrs towards the size= limit.
Yes, you are right. And that is why tmpfs does not (so far) support
user xattrs, because the unprivileged user could take up too much
memory that way.
I guess the fix would be to change
shmem_sb_info->{used_blocks,max_blocks} to use bytes as units (instead of
blocks) and then add accounting and checks to shmem_symlink() and
shmem_initxattrs(). Would a patch for that be acceptable?
Thank you for offering, but I don't think a patch for exactly that would
be acceptable. Because "size=" has just been another way of expressing
"nr_blocks=" ever since it was added in 2.4.4, and tmpfs has never
counted the kernel metadata towards its data blocks limit.
You could certainly argue that it should have done from the start; but
in order to keep the accounting suitably simple (pages rather than bytes)
it never did. And I believe there are many users who expect a tmpfs of a
certain size to be able to accommodate data of that size, who would not
care to have to change their scripts or apps to meet a lower limitation.
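
To make the pages-rather-than-bytes point concrete, here is a minimal sketch of how a size= value maps onto a block count. It assumes 4 KiB pages and rounding up to whole pages; `size_to_blocks` is a hypothetical helper for illustration, not a kernel function.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages; the real value depends on the architecture


def size_to_blocks(size_bytes: int, page_size: int = PAGE_SIZE) -> int:
    """Round a byte limit up to a whole number of pages (hypothetical helper)."""
    if size_bytes < 0:
        raise ValueError("size must be non-negative")
    return (size_bytes + page_size - 1) // page_size


# A 1 MiB limit becomes a count of data pages; symlink targets, xattrs and
# inodes are not charged against this count.
print(size_to_blocks(1 << 20))
```

Under this model the limit only ever constrains whole data pages, which is why byte-sized metadata such as a short symlink target never shows up in it.
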
Another issue is that inodes aren't counted towards size= limit either, but
perhaps that's intentional because there's nr_inodes= mount option for
exactly that.
Yes, tmpfs lets the nr_inodes limit be used to constrain the kernel
metadata (and tmpfs has a peculiarity, that it actually counts hard
links out of nr_inodes, in order to limit the memory spent on dentries).
I doubt the nr_inodes limit is depended upon so critically as the
nr_blocks, and I think we might extend it (say, consider each 1 of
nr_inodes to grant approximately 1kB of unswappable lowmem metadata)
to enable limited user xattrs: a patch along those lines might well
be acceptable.
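
The nr_inodes idea above could be modelled roughly as follows. This is only a toy sketch of the accounting, assuming each unit of nr_inodes grants about 1 kB of metadata and that xattrs are charged in whole units; `InodeBudget` and `charge_xattr` are hypothetical names, not anything in mm/shmem.c.

```python
METADATA_PER_INODE = 1024  # assumption: roughly 1 kB of metadata per nr_inodes unit


class InodeBudget:
    """Toy model: xattr bytes are charged against the nr_inodes allowance."""

    def __init__(self, nr_inodes: int) -> None:
        self.free_inodes = nr_inodes

    def charge_xattr(self, name: str, value: bytes) -> bool:
        """Charge one xattr; return False if the budget cannot cover it."""
        size = len(name.encode()) + len(value)
        # Round up to whole units so that even a tiny xattr costs something.
        units = max(1, -(-size // METADATA_PER_INODE))
        if units > self.free_inodes:
            return False
        self.free_inodes -= units
        return True


budget = InodeBudget(nr_inodes=4)
print(budget.charge_xattr("user.comment", b"x" * 100))  # small xattr, one unit
print(budget.charge_xattr("user.blob", b"y" * 4000))    # needs more units than remain
```

Charging in whole units keeps the arithmetic as simple as the existing inode count, at the cost of over-charging small xattrs.
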
I'm interested in restricting the amount of memory allocated to tmpfs mounts
in the system rather than granting more. I've seen a system lock up because
tmpfs mounts consumed the entire memory. Possible contributing factors could
be use of LVM and encryption for the swap.
Yes, it is too easy to get into a terrible state that way, with the OOM
killer doing no good at all: it is busy killing processes, which
does nothing to free the memory held by tmpfs files. I've never found
a good answer to that in general, though marking files as suitable for
truncation on OOM has been useful in special cases.
It seems that a similar state can also be reached when an unprivileged
process allocates and maps lots of SysV shm (if the limit isn't set,
which seems to be the case at least for Debian).
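
For checking what the SysV shm limits are on a given machine, a minimal sketch that reads the usual sysctl files under /proc/sys/kernel might look like this. `read_shm_limit` is a hypothetical helper, and it returns None wherever the files are missing or unreadable (for example on a non-Linux system).

```python
from pathlib import Path
from typing import Optional


def read_shm_limit(name: str, base: str = "/proc/sys/kernel") -> Optional[int]:
    """Return a SysV shm sysctl (e.g. shmmax, shmall) as an int, or None if unreadable."""
    try:
        return int((Path(base) / name).read_text().strip())
    except (OSError, ValueError):
        return None


# shmmax is in bytes, shmall in pages, shmmni is the number of segments.
for knob in ("shmmax", "shmall", "shmmni"):
    print(knob, read_shm_limit(knob))
```
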
-Topi