Re: [PATCH] staging/lustre/llite: fix O_TMPFILE/O_LOV_DELAY_CREATE conflict

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2014/02/03, 4:07 PM, "Andreas Dilger" <andreas.dilger@xxxxxxxxx> wrote:
>In kernel 3.11 O_TMPFILE was introduced, but the open flag value
>conflicts with the O_LOV_DELAY_CREATE flag 020000000 previously used
>by Lustre-aware applications.  O_LOV_DELAY_CREATE allows applications
>to defer file layout and object creation from open time (the default)
>until it can instead be specified by the application using an ioctl.
>
>Instead of trying to find a non-conflicting O_LOV_DELAY_CREATE flag
>or define a Lustre-specific flag that isn't of use to most/any other
>filesystems, use (O_NOCTTY|FASYNC) as the new value.  These flag
>are not meaningful for newly-created regular files and should be
>OK since O_LOV_DELAY_CREATE is only meaningful for new files.
>
>I looked into using O_ACCMODE/FMODE_WRITE_IOCTL, which allows calling
>ioctl() on the minimally-opened fd and is close to what is needed,
>but that doesn't allow specifying the actual read or write mode for
>the file, and fcntl(F_SETFL) doesn't allow O_RDONLY/O_WRONLY/O_RDWR
>to be set after the file is opened.

A few extra comments here that I wasn't sure should be in the commit
comment.

The main goal of the O_LOV_DELAY_CREATE flag is to allow the file to be
opened in a "preliminary" manner to allow the application to specify the
layout of the file across the Lustre storage targets (e.g. whether the
app has millions of separate files, each one written to a single server,
or there is a single huge file spread across all of the servers, or some
combination of the two, if it is RAID-0 or RAID-1, or whatever).

I'm open to a separate flag value for this, but I don't know if that is
of much interest to other filesystems since they don't have similar needs.
We want to avoid the need to have lots of syscalls to do this, since they
translate into multiple RPCs that we want to avoid when creating
potentially
millions of files over the network.

Cheers, Andreas

>Lustre-change: http://review.whamcloud.com/8312
>Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4209
>Signed-off-by: Andreas Dilger <andreas.dilger@xxxxxxxxx>
>Signed-off-by: Oleg Drokin <oleg.drokin@xxxxxxxxx>
>Signed-off-by: Peng Tao <bergwolf@xxxxxxxxx>
>---
> .../lustre/lustre/include/lustre/lustre_user.h     |   12 ++++------
> drivers/staging/lustre/lustre/include/lustre_mdc.h |   11 ++++++++++
> drivers/staging/lustre/lustre/llite/file.c         |   21
>++++++++++---------
> drivers/staging/lustre/lustre/mdc/mdc_lib.c        |    2 +-
> 4 files changed, 28 insertions(+), 18 deletions(-)
>
>diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
>b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
>index 6b6c0240..91f22a8 100644
>--- a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
>+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
>@@ -265,13 +265,11 @@ struct ost_id {
> 
> #define MAX_OBD_NAME 128 /* If this changes, a NEW ioctl must be added */
> 
>-/* Hopefully O_LOV_DELAY_CREATE does not conflict with standard O_xxx
>flags.
>- * Previously it was defined as 0100000000 and conflicts with
>FMODE_NONOTIFY
>- * which was added since kernel 2.6.36, so we redefine it as 020000000.
>- * To be compatible with old version's statically linked binary, finally
>we
>- * define it as (020000000 | 0100000000).
>- * */
>-#define O_LOV_DELAY_CREATE      0120000000
>+/* Define O_LOV_DELAY_CREATE to be a mask that is not useful for regular
>+ * files, but are unlikely to be used in practice and are not harmful if
>+ * used incorrectly.  O_NOCTTY and FASYNC are only meaningful for
>character
>+ * devices and are safe for use on new files (See LU-812, LU-4209). */
>+#define O_LOV_DELAY_CREATE	(O_NOCTTY | FASYNC)
> 
> #define LL_FILE_IGNORE_LOCK     0x00000001
> #define LL_FILE_GROUP_LOCKED    0x00000002
>diff --git a/drivers/staging/lustre/lustre/include/lustre_mdc.h
>b/drivers/staging/lustre/lustre/include/lustre_mdc.h
>index c1e0270..468f363 100644
>--- a/drivers/staging/lustre/lustre/include/lustre_mdc.h
>+++ b/drivers/staging/lustre/lustre/include/lustre_mdc.h
>@@ -166,6 +166,17 @@ void it_clear_disposition(struct lookup_intent *it,
>int flag);
> void it_set_disposition(struct lookup_intent *it, int flag);
> int it_open_error(int phase, struct lookup_intent *it);
> 
>+static inline bool cl_is_lov_delay_create(unsigned int flags)
>+{
>+	return (flags & O_LOV_DELAY_CREATE) == O_LOV_DELAY_CREATE;
>+}
>+
>+static inline void cl_lov_delay_create_clear(unsigned int *flags)
>+{
>+	if ((*flags & O_LOV_DELAY_CREATE) == O_LOV_DELAY_CREATE)
>+		*flags &= ~O_LOV_DELAY_CREATE;
>+}
>+
> /** @} mdc */
> 
> #endif
>diff --git a/drivers/staging/lustre/lustre/llite/file.c
>b/drivers/staging/lustre/lustre/llite/file.c
>index c12821a..dc9da77 100644
>--- a/drivers/staging/lustre/lustre/llite/file.c
>+++ b/drivers/staging/lustre/lustre/llite/file.c
>@@ -671,14 +671,13 @@ restart:
> 
> 	ll_capa_open(inode);
> 
>-	if (!lli->lli_has_smd) {
>-		if (file->f_flags & O_LOV_DELAY_CREATE ||
>-		    !(file->f_mode & FMODE_WRITE)) {
>-			CDEBUG(D_INODE, "object creation was delayed\n");
>-			GOTO(out_och_free, rc);
>-		}
>+	if (!lli->lli_has_smd &&
>+	    (cl_is_lov_delay_create(file->f_flags) ||
>+	     (file->f_mode & FMODE_WRITE) == 0)) {
>+		CDEBUG(D_INODE, "object creation was delayed\n");
>+		GOTO(out_och_free, rc);
> 	}
>-	file->f_flags &= ~O_LOV_DELAY_CREATE;
>+	cl_lov_delay_create_clear(&file->f_flags);
> 	GOTO(out_och_free, rc);
> 
> out_och_free:
>@@ -1381,23 +1380,25 @@ int ll_lov_setstripe_ea_info(struct inode *inode,
>struct file *file,
> 		ccc_inode_lsm_put(inode, lsm);
> 		CDEBUG(D_IOCTL, "stripe already exists for ino %lu\n",
> 		       inode->i_ino);
>-		return -EEXIST;
>+		GOTO(out, rc = -EEXIST);
> 	}
> 
> 	ll_inode_size_lock(inode);
> 	rc = ll_intent_file_open(file, lum, lum_size, &oit);
> 	if (rc)
>-		GOTO(out, rc);
>+		GOTO(out_unlock, rc);
> 	rc = oit.d.lustre.it_status;
> 	if (rc < 0)
> 		GOTO(out_req_free, rc);
> 
> 	ll_release_openhandle(file->f_dentry, &oit);
> 
>- out:
>+out_unlock:
> 	ll_inode_size_unlock(inode);
> 	ll_intent_release(&oit);
> 	ccc_inode_lsm_put(inode, lsm);
>+out:
>+	cl_lov_delay_create_clear(&file->f_flags);
> 	return rc;
> out_req_free:
> 	ptlrpc_req_finished((struct ptlrpc_request *) oit.d.lustre.it_data);
>diff --git a/drivers/staging/lustre/lustre/mdc/mdc_lib.c
>b/drivers/staging/lustre/lustre/mdc/mdc_lib.c
>index 91f6876..5b9f371 100644
>--- a/drivers/staging/lustre/lustre/mdc/mdc_lib.c
>+++ b/drivers/staging/lustre/lustre/mdc/mdc_lib.c
>@@ -197,7 +197,7 @@ static __u64 mds_pack_open_flags(__u64 flags, __u32
>mode)
> 	if (flags & FMODE_EXEC)
> 		cr_flags |= MDS_FMODE_EXEC;
> #endif
>-	if (flags & O_LOV_DELAY_CREATE)
>+	if (cl_is_lov_delay_create(flags))
> 		cr_flags |= MDS_OPEN_DELAY_CREATE;
> 
> 	if (flags & O_NONBLOCK)
>-- 
>1.7.3.4
>
>


Cheers, Andreas
-- 
Andreas Dilger

Lustre Software Architect
Intel High Performance Data Division


--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux