Re: [RFC v3][PATCH 8/9] File descriprtors (dump)

Oren Laadan <orenl@xxxxxxxxxxxxxxx> · Sun, 07 Sep 2008 00:52:12 -0400

Dave Hansen wrote:
> On Thu, 2008-09-04 at 04:05 -0400, Oren Laadan wrote:
>> +/**
>> + * cr_scan_fds - scan file table and construct array of open fds
>> + * @files: files_struct pointer
>> + * @fdtable: (output) array of open fds
>> + * @return: the number of open fds found
>> + *
>> + * Allocates the file descriptors array (*fdtable), caller should free
>> + */
>> +int cr_scan_fds(struct files_struct *files, int **fdtable)
>> +{
>> +	struct fdtable *fdt;
>> +	int *fdlist;
>> +	int i, n, max;
>> +
>> +	max = CR_DEFAULT_FDTABLE;
>> +
>> + repeat:
>> +	n = 0;
>> +	fdlist = kmalloc(max * sizeof(*fdlist), GFP_KERNEL);
>> +	if (!fdlist)
>> +		return -ENOMEM;
>> +
>> +	spin_lock(&files->file_lock);
>> +	fdt = files_fdtable(files);
>> +	for (i = 0; i < fdt->max_fds; i++) {
>> +		if (fcheck_files(files, i)) {
>> +			if (n == max) {
>> +				spin_unlock(&files->file_lock);
>> +				kfree(fdlist);
>> +				max *= 2;
>> +				if (max < 0) {	/* overflow ? */
>> +					n = -EMFILE;
>> +					break;
>> +				}
>> +				goto repeat;
>> +			}
>> +			fdlist[n++] = i;
>> +		}
>> +	}
>> +	spin_unlock(&files->file_lock);
>> +
>> +	*fdtable = fdlist;
>> +	return n;
>> +}
> 
> That loop needs some love.  At least save us from one level of
> indenting:
> 
>> +	for (i = 0; i < fdt->max_fds; i++) {
>> +		if (!fcheck_files(files, i)
>> 			continue;
>> 		if (n == max) {
>> +			spin_unlock(&files->file_lock);
>> +			kfree(fdlist);
>> +			max *= 2;
>> +			if (max < 0) {	/* overflow ? */
>> +				n = -EMFILE;
>> +				break;
>> +			}
>> +			goto repeat;
>> +		}
>> +		fdlist[n++] = i;
>> +	}
> 
> My gut also says that there has to be a better way to find a good size
> for fdlist() than growing it this way.  

Can you suggest a better way to find the open files of a task ?

Either I loop twice (loop to count, then allocate, then loop to fill),
or optimistically try an initial guess and expand on demand.

> 
> Why do we even have a fixed size for this?
> 
> +#define CR_DEFAULT_FDTABLE  256
> 
>> +/* cr_write_fd_data - dump the state of a given file pointer */
>> +static int cr_write_fd_data(struct cr_ctx *ctx, struct file *file, int parent)
>> +{
>> +	struct cr_hdr h;
>> +	struct cr_hdr_fd_data *hh = cr_hbuf_get(ctx, sizeof(*hh));
>> +	struct dentry *dent = file->f_dentry;
>> +	struct inode *inode = dent->d_inode;
>> +	enum fd_type fd_type;
>> +	int ret;
>> +
>> +	h.type = CR_HDR_FD_DATA;
>> +	h.len = sizeof(*hh);
>> +	h.parent = parent;
>> +
>> +	BUG_ON(!inode);
> 
> Why a BUG_ON()?  We'll deref it in just a sec anyway.  We prefer to just
> get the NULL dereference rather than an explicit BUG_ON().
> 
>> +	hh->f_flags = file->f_flags;
>> +	hh->f_mode = file->f_mode;
>> +	hh->f_pos = file->f_pos;
>> +	hh->f_uid = file->f_uid;
>> +	hh->f_gid = file->f_gid;
> 
> Is there a plan to save off the 'struct user' here instead?  Nested user
> namespaces in one checkpoint image might get confused otherwise.

Of course. Eventually, 'struct user' will be another shared object that
is encountered and saved with the checkpoint image.

> 
>> +	hh->f_version = file->f_version;
>> +	/* FIX: need also file->f_owner */
>> +
>> +	switch (inode->i_mode & S_IFMT) {
>> +	case S_IFREG:
>> +		fd_type = CR_FD_FILE;
>> +		break;
>> +	case S_IFDIR:
>> +		fd_type = CR_FD_DIR;
>> +		break;
>> +	case S_IFLNK:
>> +		fd_type = CR_FD_LINK;
>> +		break;
>> +	default:
>> +		return -EBADF;
>> +	}
> 
> Why don't we just store (and use) (inode->i_mode & S_IFMT) in fd_type
> instead of making our own types?

There will be others that cannot be inferred from inode->i_mode,
e.g. CR_FD_FILE_UNLINKED, CR_FD_DIR_UNLINKED, CR_FD_SOCK_UNIX,
CR_FD_SOCK_INET_V4, CR_FD_EVENTPOLL etc.

> 
>> +	/* FIX: check if the file/dir/link is unlinked */
>> +	hh->fd_type = fd_type;

[...]

Oren.

_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linux-foundation.org/mailman/listinfo/containers