Re: [PATCH 0/6] Maintain the relative size of fs.file-max and fs.nr_open

On Sat, 23 Nov 2024 19:32:27 +0000, Al Viro wrote:
> On Sat, Nov 23, 2024 at 06:27:30PM +0000, Al Viro wrote:
> > On Sun, Nov 24, 2024 at 02:08:55AM +0800, Jinliang Zheng wrote:
> > > According to Documentation/admin-guide/sysctl/fs.rst, fs.nr_open and
> > > fs.file-max represent the number of file-handles that can be opened
> > > by each process and the entire system, respectively.
> > > 
> > > Therefore, it's necessary to maintain a relative size between them,
> > > meaning we should ensure that files_stat.max_files is not less than
> > > sysctl_nr_open.
> > 
> > NAK.
> > 
> > You are confusing descriptors (nr_open) and open IO channels (max_files).
> > 
> > We very well _CAN_ have more of the former.  For further details,
> > RTFM dup(2) or any introductory Unix textbook.
> 
> Short version: there are 3 different notions -
> 	1) file as a collection of data kept by filesystem. Such things as
> contents, ownership, permissions, timestamps belong there.
> 	2) IO channel used to access one of (1).  open(2) creates such;
> things like current position in file, whether it's read-only or read-write
> open, etc. belong there.  It does not belong to a process - after fork(),
> child has access to all open channels parent had when it had spawned
> a child.  If you open a file in parent, read 10 bytes from it, then spawn
> a child that reads 10 more bytes and exits, then have parent read another
> 5 bytes, the first read by parent will have read bytes 0 to 9, read by
> child - bytes 10 to 19 and the second read by parent - bytes 20 to 24.
> Position is a property of IO channel; it belongs neither to underlying
> file (otherwise another process opening the file and reading from it
> would play havoc on your process) nor to process (otherwise reads done
> by child would not have affected the parent and the second read from
> parent would have gotten bytes 10 to 14).  Same goes for access mode -
> it belongs to IO channel.
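
The read sequence described above can be replayed from userland. What follows is only a rough sketch, not anything from the patch series: the function name, the scratch file under /tmp and its contents (the byte at offset i has value i) are all made up for illustration. If position really belongs to the IO channel, the parent's second read should start at offset 20.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Replays the parent/child/parent read sequence on a scratch file whose
 * byte at offset i has value i.  Returns the offset at which the parent's
 * second read started, or -1 on error.
 */
long second_parent_read_offset(void)
{
    char path[] = "/tmp/iochan-XXXXXX";   /* illustrative scratch file */
    char data[32], buf[10];
    int fd, status, i;
    pid_t pid;
    off_t pos = -1;

    for (i = 0; i < (int)sizeof(data); i++)
        data[i] = (char)i;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);   /* the open channel keeps the data reachable */

    if (write(fd, data, sizeof(data)) != (ssize_t)sizeof(data) ||
        lseek(fd, 0, SEEK_SET) != 0)
        goto out;

    if (read(fd, buf, 10) != 10)   /* parent: bytes 0 to 9 */
        goto out;

    pid = fork();
    if (pid < 0)
        goto out;
    if (pid == 0) {
        /* child: inherited descriptor, same channel, bytes 10 to 19 */
        _exit(read(fd, buf, 10) == 10 ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) != pid ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        goto out;

    pos = lseek(fd, 0, SEEK_CUR);   /* where the parent will resume */
    if (pos < 0)
        goto out;
    if (read(fd, buf, 5) != 5) {    /* parent: bytes 20 to 24 */
        pos = -1;
        goto out;
    }
    assert(buf[0] == (char)pos);    /* byte value equals its offset */
out:
    close(fd);
    return (long)pos;
}
```

Calling second_parent_read_offset() from a small main() and printing the result is enough to try it; a result of 10 would mean the position was per-process, which is not what happens.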

I'm sorry, I don't know much about how UNIX is implemented, but in the
Linux implementation specifically, struct file looks more like a
combination of your 1) and 2).

But I see your point: I missed the dup() case. dup() takes up another
slot in the fdtable->fd array, but it does not create a new
struct file.
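
A rough userland sketch of that case, under stated assumptions: the function name, the MAX_DUPS cap and the scratch file are hypothetical, and a shared file position is used as a visible stand-in for "same struct file", since the kernel object itself cannot be inspected from userland. One open plus N dup() calls gives N + 1 descriptors, and a single read through any of them should move the position seen by all.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_DUPS 16   /* arbitrary cap for this sketch */

/*
 * Opens one scratch file, dup()s the descriptor ndups times, reads 7 bytes
 * through the last descriptor, then counts how many descriptors report a
 * position of 7.  Returns that count, or -1 on error.
 */
int descriptors_sharing_offset(int ndups)
{
    char path[] = "/tmp/dupdemo-XXXXXX";   /* illustrative scratch file */
    char buf[7];
    int fds[MAX_DUPS + 1];
    int n = 0, shared = 0, i;

    assert(ndups >= 0 && ndups <= MAX_DUPS);

    fds[0] = mkstemp(path);
    if (fds[0] < 0)
        return -1;
    n = 1;
    unlink(path);

    if (write(fds[0], "0123456789", 10) != 10 ||
        lseek(fds[0], 0, SEEK_SET) != 0) {
        shared = -1;
        goto out;
    }

    /* each dup() takes a new descriptor slot for the same open file */
    for (i = 0; i < ndups; i++) {
        int fd = dup(fds[0]);

        if (fd < 0) {
            shared = -1;
            goto out;
        }
        fds[n++] = fd;
    }

    /* one read through the last descriptor moves the shared position */
    if (read(fds[n - 1], buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
        shared = -1;
        goto out;
    }

    for (i = 0; i < n; i++)
        if (lseek(fds[i], 0, SEEK_CUR) == (off_t)sizeof(buf))
            shared++;
out:
    for (i = 0; i < n; i++)
        close(fds[i]);
    return shared;
}
```

With ndups = 8 this should report 9 descriptors all sitting at the same position, i.e. the descriptor count grows while the number of open files stays at one.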

Thank you.
Jinliang Zheng

> 	3) file descriptor - a number that has a meaning only in context
> of a process and refers to IO channel.  That's what system calls use
> to identify the IO channel to operate upon; open() picks a descriptor
> unused by the calling process, associates the new channel with it and
> returns that descriptor (a number) to caller.  Multiple descriptors can
> refer to the same IO channel; e.g. dup(fd) grabs a new descriptor and
> associates it with the same IO channel fd currently refers to.
> 
> 	IO channels are not directly exposed to userland, but they are
> very much present in Unix-style IO API.  Note that results of e.g.
> 	int fd1 = open("/etc/issue", 0);
> 	int fd2 = open("/etc/issue", 0);
> and
> 	int fd1 = open("/etc/issue", 0);
> 	int fd2 = dup(fd1);
> are not identical, even though in both cases fd1 and fd2 are opened
> descriptors and reading from them will access the contents of the
> /etc/issue; in the former case the positions being accessed by read from
> fd1 and fd2 will be independent, in the latter they will be shared.
> 
> 	It's really quite basic - Unix Programming 101 stuff.  It's not
> just that POSIX requires that and that any Unix behaves that way,
> anything even remotely Unix-like will be like that.
> 
> 	You won't find the words 'IO channel' in POSIX, but I refuse
> to use the term they have chosen instead - 'file description'.  Yes,
> alongside 'file descriptor', in the contexts where the distinction
> between these notions is quite important.  I would rather not say what
> I really think of those unsung geniuses, lest CoC gets overexcited...
> 
> 	Anyway, in casual conversations the expression 'opened file'
> usually refers to that thing.  Which is somewhat clumsy (sounds like
> 'file on filesystem that happens to be opened'), but usually it's
> good enough.  If you need to be pedantic (e.g. when explaining that
> material in aforementioned Unix Programming 101 class), 'IO channel'
> works well enough, IME.
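
The open()/open() versus open()/dup() contrast quoted above can be checked with something like the following. It is only a sketch: the helper name and its use_dup parameter are invented for illustration, and O_RDONLY is spelled out where the quoted snippet passes 0 (the same value on Linux). The helper reads one byte through fd1 and then asks where fd2 is.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Opens path as fd1, obtains fd2 either by dup(fd1) or by a second open(),
 * reads one byte through fd1 and checks whether fd2's position moved too.
 * Returns 1 if the position is shared, 0 if independent, -1 on error.
 */
int second_fd_follows_first(const char *path, int use_dup)
{
    char c;
    int fd1, fd2, ret = -1;

    assert(path != NULL);

    fd1 = open(path, O_RDONLY);
    if (fd1 < 0)
        return -1;

    /* dup(): same IO channel; second open(): a brand new one */
    fd2 = use_dup ? dup(fd1) : open(path, O_RDONLY);
    if (fd2 < 0)
        goto out;

    if (read(fd1, &c, 1) == 1)
        ret = (lseek(fd2, 0, SEEK_CUR) == 1);

    close(fd2);
out:
    close(fd1);
    return ret;
}
```

For any readable, non-empty regular file, second_fd_follows_first(path, 0) should come back 0 (independent positions) and second_fd_follows_first(path, 1) should come back 1 (shared position).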



