Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:

> On Sat, Oct 29, 2016 at 1:25 AM, Johannes Schindelin
> <Johannes.Schindelin@xxxxxx> wrote:
>>
>> Correct. We cannot change an open file handle's state ("FD_CLOEXEC") on
>> Windows, but we can ask open() to open said file handle with the correct
>> flag ("O_CLOEXEC", which is mapped to O_NOINHERIT on Windows.
>
> Ok. So then I have no issues with it, and let's use O_CLOEXEC if it
> exists and fcntl(FD_CLOEXEC) if O_CLOEXEC doesn't exist.

I am not sure if the fallback to FD_CLOEXEC is worth doing, because
it does not exist on Windows, where leaking an open file descriptor
to a child process matters more than other platforms [*1*].  

Of course, the world is not all Windows, and there may be other
platforms this fallback may help.  But there is a chance that some
other thread may fork between the time we open(2) and the time we
call fcntl(2), leaving FD_CLOEXEC ineffective and unreliable, so
that's another thing that makes me doubt if FD_CLOEXEC fallback is
worth doing.

Having said that, here is a rewrite of [*2*] to use the fallback
would look like.  After this topic settles, we may want to also
update the tempfile.c::create_tempfile() implementation to use the
helper function we use here.


[Footnotes]

*1* The call in ce_compare_data() is to see if the contents in the
    working tree file match what is in the index.  Perhaps the
    program is "git checkout another-branch" that knows this path is
    absent in the branch being switched to and trying to make sure
    we lose no local modification.  When the path needs to be passed
    through the clean/smudge filter before hashing to see if the
    working tree contents hashes to the object in the index, we need
    to fork and then after the filter finishes its processing, we
    close the file descriptor and then unlink(2) the path if it is
    clean.  We are about to add a variant of clean/smudge mechanism
    where a lazily spawned process can clean contents of multiple
    paths and that is where cloexec starts to matter.  This function
    starts to inspect one path, opens a fd to it, finds that it
    needs filtering, spawns a long-lingering process, while leaking
    the original fd to it.  After feeding the contents via pipe and
    then receiving the filtered contents (the protocol is not full
    duplex and there won't be a stuck pipe), the parent would decide
    that the path is clean and attempts to unlink(2).  Windows does
    not let you unlink a path with open filedescriptor held, causing
    "checkout" to fail.  As it is only one fd that leaks to the
    child, no matter how many subsequent paths are filtered by the
    same long-lingering child process, it will be a far less
    important issue on other platforms that lets you unlink(2) such
    a file.

*2* http://public-inbox.org/git/xmqqh97w38gj.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxx

---
 cache.h           |  1 +
 git-compat-util.h |  4 ++++
 read-cache.c      |  9 +--------
 sha1_file.c       | 47 ++++++++++++++++++++++++++++-------------------
 4 files changed, 34 insertions(+), 27 deletions(-)

diff --git a/cache.h b/cache.h
index a902ca1f8e..6f1c21a352 100644
--- a/cache.h
+++ b/cache.h
@@ -1122,6 +1122,7 @@ extern int write_sha1_file(const void *buf, unsigned long len, const char *type,
 extern int hash_sha1_file_literally(const void *buf, unsigned long len, const char *type, unsigned char *sha1, unsigned flags);
 extern int pretend_sha1_file(void *, unsigned long, enum object_type, unsigned char *);
 extern int force_object_loose(const unsigned char *sha1, time_t mtime);
+extern int git_open_cloexec(const char *name, int flags);
 extern int git_open(const char *name);
 extern void *map_sha1_file(const unsigned char *sha1, unsigned long *size);
 extern int unpack_sha1_header(git_zstream *stream, unsigned char *map, unsigned long mapsize, void *buffer, unsigned long bufsiz);
diff --git a/git-compat-util.h b/git-compat-util.h
index 43718dabae..64407c701c 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -683,6 +683,10 @@ char *gitstrdup(const char *s);
 #define O_CLOEXEC 0
 #endif
 
+#ifndef FD_CLOEXEC
+#define FD_CLOEXEC 0
+#endif
+
 #ifdef FREAD_READS_DIRECTORIES
 #ifdef fopen
 #undef fopen
diff --git a/read-cache.c b/read-cache.c
index db5d910642..c27d3e240b 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -156,14 +156,7 @@ void fill_stat_cache_info(struct cache_entry *ce, struct stat *st)
 static int ce_compare_data(const struct cache_entry *ce, struct stat *st)
 {
 	int match = -1;
-	static int cloexec = O_CLOEXEC;
-	int fd = open(ce->name, O_RDONLY | cloexec);
-
-	if ((cloexec & O_CLOEXEC) && fd < 0 && errno == EINVAL) {
-		/* Try again w/o O_CLOEXEC: the kernel might not support it */
-		cloexec &= ~O_CLOEXEC;
-		fd = open(ce->name, O_RDONLY | cloexec);
-	}
+	int fd = git_open_cloexec(ce->name, O_RDONLY);
 
 	if (fd >= 0) {
 		unsigned char sha1[20];
diff --git a/sha1_file.c b/sha1_file.c
index 09045df1dc..fc8613a847 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1559,31 +1559,40 @@ int check_sha1_signature(const unsigned char *sha1, void *map,
 	return hashcmp(sha1, real_sha1) ? -1 : 0;
 }
 
-int git_open(const char *name)
+int git_open_cloexec(const char *name, int flags)
 {
-	static int sha1_file_open_flag = O_NOATIME | O_CLOEXEC;
+	int fd;
+	static int o_cloexec = O_CLOEXEC;
+	static int fd_cloexec = FD_CLOEXEC;
 
-	for (;;) {
-		int fd;
+	fd = open(name, flags | o_cloexec);
+	if ((o_cloexec & O_CLOEXEC) && fd < 0 && errno == EINVAL) {
+		/* Try again w/o O_CLOEXEC: the kernel might not support it */
+		o_cloexec &= ~O_CLOEXEC;
+		fd = open(name, flags | o_cloexec);
+	}
 
-		errno = 0;
-		fd = open(name, O_RDONLY | sha1_file_open_flag);
-		if (fd >= 0)
-			return fd;
+	if (!o_cloexec && 0 <= fd && fd_cloexec) {
+		/* Opened w/o O_CLOEXEC?  try with fcntl(2) to add it */
+		int flags = fcntl(fd, F_GETFL);
+		if (fcntl(fd, F_SETFL, flags | fd_cloexec))
+			fd_cloexec = 0;
+	}
 
-		/* Try again w/o O_CLOEXEC: the kernel might not support it */
-		if ((sha1_file_open_flag & O_CLOEXEC) && errno == EINVAL) {
-			sha1_file_open_flag &= ~O_CLOEXEC;
-			continue;
-		}
+	return fd;
+}
 
-		/* Might the failure be due to O_NOATIME? */
-		if (errno != ENOENT && (sha1_file_open_flag & O_NOATIME)) {
-			sha1_file_open_flag &= ~O_NOATIME;
-			continue;
-		}
-		return -1;
+int git_open(const char *name)
+{
+	static int noatime = O_NOATIME;
+	int fd = git_open_cloexec(name, O_RDONLY);
+
+	if (0 <= fd && (noatime & O_NOATIME)) {
+		int flags = fcntl(fd, F_GETFL);
+		if (fcntl(fd, F_SETFL, flags | noatime))
+			noatime = 0;
 	}
+	return fd;
 }
 
 static int stat_sha1_file(const unsigned char *sha1, struct stat *st)





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]