Re: [mdadm PATCH 4/4] Create: tell udev device is not ready when first created.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Apr 20 2017, Jes Sorensen wrote:

> On 04/19/2017 10:40 PM, NeilBrown wrote:
>> When an array is created the content is not initializes,
>> so it could have remnants of an old filesystem or md array
>> etc on it.
>> udev will see this and might try to activate it, which is almost
>> certainly not what is wanted.
>>
>> So create a temporary udev rules files to set ENV{SYSTEMD_READY}="0"
>> while the creation event is processed.  This is fairly uniformly
>> used to suppress actions based on the contents of the device.
>>
>> Signed-off-by: NeilBrown <neilb@xxxxxxxx>
>> ---
>>  Create.c |    9 +++++++++
>>  lib.c    |   22 ++++++++++++++++++++++
>>  mdadm.h  |    2 ++
>>  3 files changed, 33 insertions(+)
>
> Neil,
>
> I have heard of this problem before, but I have some concerns about the 
> solution. First of all, /run/udev/rules.d/ isn't a universal directory. 
> At least Fedora doesn't have it, so we need to take that into account. I 
> wasn't aware it was possible to create temporary udev rules like this.

Hmmm.. Debian doesn't have it either.  However I suspect that udev will
still try to read from the directory.  The man page explicitly lists it:

RULES FILES
       The udev rules are read from the files located in the system rules
       directory /usr/lib/udev/rules.d, the volatile runtime directory
       /run/udev/rules.d and the local administration directory
       /etc/udev/rules.d.  ....

So possibly mdadm would need to mkdir("/run/udev/rules.d", 0755) first.
The man page uses the term "volatile" rather than "temporary", but this
possibility does seem to be part of the design of udev.
 

>
> Second, isn't this going to be racey if you have multiple arrays 
> running? I am wondering if we cannot find a solution that relies on a 
> permanently installed udev rule that we enable/disable with systemctl 
> and use the device name as an argument?

There would only be a problematic race of an array as being created at
the same time that another array is being assembled.  Is that at all
likely?  They tend to happen at different parts of the usage cycle...  I
guess we shouldn't assume though.

I admit that blocking *all* arrays was a short cut.  We need to create
the udev rule before the new device is first activated, and keep it in
existence until after the array has been started and the fd on /dev/mdXX
has been closed - and then a bit more because udev doesn't respond
instantly.
In some cases we don't know the name of the md device until
create_mddev() chooses on with find_free_devnum().
So if we want to block udev from handling a device, we need to put the
block into effect (e.g. create the rules.d file) in mddev_create() just
before the call to open_dev_excl().

If we wanted an more permanent udev rule, we would need to record the
devices that should be ignored in the filesystem somewhere else.
Maybe in /run/mdadm.
e.g.

 KERNEL=="md*", TEST="/run/mdadm/creating-$kernel", ENV{SYSTEMD_READY}="0"

Then we could have something like the following (untested) in mdadm.
Does that seem more suitable?

Thanks,
NeilBrown

---
 Assemble.c    |  2 +-
 Build.c       |  2 +-
 Create.c      |  9 ++++++++-
 Incremental.c |  4 ++--
 lib.c         | 29 +++++++++++++++++++++++++++++
 mdadm.h       |  4 +++-
 mdopen.c      |  4 +++-
 7 files changed, 47 insertions(+), 7 deletions(-)

diff --git a/Assemble.c b/Assemble.c
index b8285239354b..309841432ff5 100644
--- a/Assemble.c
+++ b/Assemble.c
@@ -1459,7 +1459,7 @@ try_again:
 			name = strchr(name, ':')+1;
 
 		mdfd = create_mddev(mddev, name, ident->autof, trustworthy,
-				    chosen_name);
+				    chosen_name, 0);
 	}
 	if (mdfd < 0) {
 		st->ss->free_super(st);
diff --git a/Build.c b/Build.c
index 11ba12f4ae7d..665d9067b8d6 100644
--- a/Build.c
+++ b/Build.c
@@ -109,7 +109,7 @@ int Build(char *mddev, struct mddev_dev *devlist,
 	/* We need to create the device.  It can have no name. */
 	map_lock(&map);
 	mdfd = create_mddev(mddev, NULL, c->autof, LOCAL,
-			    chosen_name);
+			    chosen_name, 0);
 	if (mdfd < 0) {
 		map_unlock(&map);
 		return 1;
diff --git a/Create.c b/Create.c
index 6ca092449880..df1bc20c635b 100644
--- a/Create.c
+++ b/Create.c
@@ -605,7 +605,7 @@ int Create(struct supertype *st, char *mddev,
 
 	/* We need to create the device */
 	map_lock(&map);
-	mdfd = create_mddev(mddev, name, c->autof, LOCAL, chosen_name);
+	mdfd = create_mddev(mddev, name, c->autof, LOCAL, chosen_name, 1);
 	if (mdfd < 0) {
 		map_unlock(&map);
 		return 1;
@@ -620,6 +620,7 @@ int Create(struct supertype *st, char *mddev,
 			chosen_name);
 		close(mdfd);
 		map_unlock(&map);
+		udev_unblock();
 		return 1;
 	}
 	mddev = chosen_name;
@@ -1053,9 +1054,15 @@ int Create(struct supertype *st, char *mddev,
 		pr_err("not starting array - not enough devices.\n");
 	}
 	close(mdfd);
+	/* Give udev a moment to process the Change event caused
+	 * by the close.
+	 */
+	usleep(100*1000);
+	udev_unblock();
 	return 0;
 
  abort:
+	udev_unblock();
 	map_lock(&map);
  abort_locked:
 	map_remove(&map, fd2devnm(mdfd));
diff --git a/Incremental.c b/Incremental.c
index 28f1f7734956..63ed4fa1a88d 100644
--- a/Incremental.c
+++ b/Incremental.c
@@ -321,7 +321,7 @@ int Incremental(struct mddev_dev *devlist, struct context *c,
 
 		/* Couldn't find an existing array, maybe make a new one */
 		mdfd = create_mddev(match ? match->devname : NULL,
-				    name_to_use, c->autof, trustworthy, chosen_name);
+				    name_to_use, c->autof, trustworthy, chosen_name, 0);
 
 		if (mdfd < 0)
 			goto out_unlock;
@@ -1605,7 +1605,7 @@ static int Incremental_container(struct supertype *st, char *devname,
 					    ra->name,
 					    c->autof,
 					    trustworthy,
-					    chosen_name);
+					    chosen_name, 0);
 		}
 		if (only && (!mp || strcmp(mp->devnm, only) != 0))
 			continue;
diff --git a/lib.c b/lib.c
index b640634ef6f2..d8692cae813d 100644
--- a/lib.c
+++ b/lib.c
@@ -163,6 +163,35 @@ char *fd2devnm(int fd)
 	return NULL;
 }
 
+/* When we create a new array, we don't want the content to
+ * be immediately examined by udev - it is probably meaningless.
+ * So create /run/udev/rules.d/01-mdadm-create.rules to tell udev
+ * that the device isn't ready.
+ */
+static char block_path[] = "/run/mdadm/creating-%s";
+static char *unblock_path = NULL;
+void udev_block(char *devnm)
+{
+	int fd;
+	char *path = NULL;
+
+	xasprintf(&path, block_path, devnm);
+	fd = open(path, O_CREAT|O_RDWR, 0600);
+	if (fd >= 0) {
+		close(fd);
+		unblock_path = path;
+	} else
+		free(path);
+}
+
+void udev_unblock(void)
+{
+	if (unblock_path)
+		unlink(unblock_path);
+	free(unblock_path);
+	unblock_path = NULL;
+}
+
 /*
  * convert a major/minor pair for a block device into a name in /dev, if possible.
  * On the first call, walk /dev collecting name.
diff --git a/mdadm.h b/mdadm.h
index f1f643c794d4..187e60050068 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -1522,7 +1522,7 @@ extern char *get_md_name(char *devnm);
 extern char DefaultConfFile[];
 
 extern int create_mddev(char *dev, char *name, int autof, int trustworthy,
-			char *chosen);
+			char *chosen, int block_udev);
 /* values for 'trustworthy' */
 #define	LOCAL	1
 #define	LOCAL_ANY 10
@@ -1556,6 +1556,8 @@ extern char *stat2kname(struct stat *st);
 extern char *fd2kname(int fd);
 extern char *stat2devnm(struct stat *st);
 extern char *fd2devnm(int fd);
+extern void udev_block(char *devnm);
+extern void udev_unblock(void);
 
 extern int in_initrd(void);
 
diff --git a/mdopen.c b/mdopen.c
index 82b97fc90339..1dbcdcddb9f6 100644
--- a/mdopen.c
+++ b/mdopen.c
@@ -135,7 +135,7 @@ void make_parts(char *dev, int cnt)
  */
 
 int create_mddev(char *dev, char *name, int autof, int trustworthy,
-		 char *chosen)
+		 char *chosen, int block_udev)
 {
 	int mdfd;
 	struct stat stb;
@@ -414,6 +414,8 @@ int create_mddev(char *dev, char *name, int autof, int trustworthy,
 				make_parts(chosen, parts);
 		}
 	}
+	if (block_udev)
+		udev_block(devnm);
 	mdfd = open_dev_excl(devnm);
 	if (mdfd < 0)
 		pr_err("unexpected failure opening %s\n",
-- 
2.12.2

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux