Hello Christophe, Ben, Hannes,
Martin, Bart,
I am a member of the host-side software development team of the ZXUSP storage
project at ZTE Corporation. In response to market demand, our team has decided
to start writing code next month to improve multipath efficiency. The whole
idea is described in the mail below. We hope to participate in and make
progress with the open source community, so any suggestions and comments would
be welcome.
Thanks,
Tang
------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------
1. Problem
In these scenarios, multipath processing efficiency is low:
1) Many paths exist in each multipath device,
2) Devices are added or deleted during iSCSI login/logout or FC link up/down.
2. Reasons
Multipath processes uevents one by one, and each one in turn produces a new dm
addition, change, or deletion uevent, which increases system resource
consumption. In fact, most of these uevents serve no purpose.
For example, consider the login procedure of 4 iSCSI sessions with 3 LUNs:
1) Multipath processes these uevents one by one:
UDEV [89068.806214] add /devices/platform/host3/session44/target3:0:0/3:0:0:0/block/sdc (block)
UDEV [89068.909457] add /devices/platform/host3/session44/target3:0:0/3:0:0:2/block/sdg (block)
UDEV [89068.944956] add /devices/platform/host3/session44/target3:0:0/3:0:0:1/block/sde (block)
UDEV [89068.959215] add /devices/platform/host5/session46/target5:0:0/5:0:0:0/block/sdh (block)
UDEV [89068.978558] add /devices/platform/host5/session46/target5:0:0/5:0:0:2/block/sdk (block)
UDEV [89069.004217] add /devices/platform/host5/session46/target5:0:0/5:0:0:1/block/sdj (block)
UDEV [89069.486361] add /devices/platform/host4/session45/target4:0:0/4:0:0:1/block/sdf (block)
UDEV [89069.495194] add /devices/platform/host4/session45/target4:0:0/4:0:0:0/block/sdd (block)
UDEV [89069.511628] add /devices/platform/host4/session45/target4:0:0/4:0:0:2/block/sdi (block)
UDEV [89069.716292] add /devices/platform/host6/session47/target6:0:0/6:0:0:0/block/sdl (block)
UDEV [89069.748456] add /devices/platform/host6/session47/target6:0:0/6:0:0:1/block/sdm (block)
UDEV [89069.789662] add /devices/platform/host6/session47/target6:0:0/6:0:0:2/block/sdn (block)
2) As a result of step 1), multipath also produces DM uevents, which are
processed by udev and by other processes listening to the kernel:
KERNEL[89068.899614] add /devices/virtual/block/dm-2 (block)
KERNEL[89068.902477] change /devices/virtual/block/dm-2 (block)
KERNEL[89068.955364] add /devices/virtual/block/dm-3 (block)
KERNEL[89068.960663] change /devices/virtual/block/dm-3 (block)
KERNEL[89069.018903] add /devices/virtual/block/dm-4 (block)
KERNEL[89069.042102] change /devices/virtual/block/dm-4 (block)
KERNEL[89069.297252] change /devices/virtual/block/dm-2 (block)
KERNEL[89069.346718] change /devices/virtual/block/dm-4 (block)
KERNEL[89069.388361] change /devices/virtual/block/dm-3 (block)
KERNEL[89069.548270] change /devices/virtual/block/dm-4 (block)
KERNEL[89069.607306] change /devices/virtual/block/dm-2 (block)
KERNEL[89070.118067] change /devices/virtual/block/dm-3 (block)
KERNEL[89070.136256] change /devices/virtual/block/dm-2 (block)
KERNEL[89070.157222] change /devices/virtual/block/dm-4 (block)
KERNEL[89070.216269] change /devices/virtual/block/dm-3 (block)
3) After processing the uevents from step 2), udev also forwards them to
multipath:
UDEV [89068.926428] add /devices/virtual/block/dm-2 (block)
UDEV [89069.007511] add /devices/virtual/block/dm-3 (block)
UDEV [89069.098054] add /devices/virtual/block/dm-4 (block)
UDEV [89069.291184] change /devices/virtual/block/dm-2 (block)
UDEV [89069.320632] change /devices/virtual/block/dm-4 (block)
UDEV [89069.381434] change /devices/virtual/block/dm-3 (block)
UDEV [89069.637666] change /devices/virtual/block/dm-2 (block)
UDEV [89069.682303] change /devices/virtual/block/dm-4 (block)
UDEV [89069.860877] change /devices/virtual/block/dm-2 (block)
UDEV [89069.904735] change /devices/virtual/block/dm-4 (block)
UDEV [89070.327167] change /devices/virtual/block/dm-2 (block)
UDEV [89070.371114] change /devices/virtual/block/dm-4 (block)
UDEV [89070.434592] change /devices/virtual/block/dm-3 (block)
UDEV [89070.572072] change /devices/virtual/block/dm-3 (block)
UDEV [89070.703181] change /devices/virtual/block/dm-3 (block)
4) Multipath processes the uevents above.
Processing uevents one by one is inefficient, and it produces too many
uevents, which further reduces processing efficiency. The problem is similar
in the logout procedure of iSCSI sessions.
3. Negative effect
Multipath processes uevents so slowly that it cannot satisfy some applications.
For example, OpenStack often times out while waiting for the creation of
multipath devices.
4. Proposal
Rather than processing uevents one by one, uevents which come from the same
LUN can be merged into one. The uevent processing thread then only needs to
process the merged uevent once, and only one DM addition uevent is produced,
which could reduce system resource consumption.
The example from section 2 is reused to explain the proposal:
1) Multipath receives block device addition uevents from udev:
UDEV [89068.806214] add /devices/platform/host3/session44/target3:0:0/3:0:0:0/block/sdc (block)
UDEV [89068.909457] add /devices/platform/host3/session44/target3:0:0/3:0:0:2/block/sdg (block)
UDEV [89068.944956] add /devices/platform/host3/session44/target3:0:0/3:0:0:1/block/sde (block)
UDEV [89068.959215] add /devices/platform/host5/session46/target5:0:0/5:0:0:0/block/sdh (block)
UDEV [89068.978558] add /devices/platform/host5/session46/target5:0:0/5:0:0:2/block/sdk (block)
UDEV [89069.004217] add /devices/platform/host5/session46/target5:0:0/5:0:0:1/block/sdj (block)
UDEV [89069.486361] add /devices/platform/host4/session45/target4:0:0/4:0:0:1/block/sdf (block)
UDEV [89069.495194] add /devices/platform/host4/session45/target4:0:0/4:0:0:0/block/sdd (block)
UDEV [89069.511628] add /devices/platform/host4/session45/target4:0:0/4:0:0:2/block/sdi (block)
UDEV [89069.716292] add /devices/platform/host6/session47/target6:0:0/6:0:0:0/block/sdl (block)
UDEV [89069.748456] add /devices/platform/host6/session47/target6:0:0/6:0:0:1/block/sdm (block)
UDEV [89069.789662] add /devices/platform/host6/session47/target6:0:0/6:0:0:2/block/sdn (block)
2) Multipath merges these 12 uevents into 3 internal uevents:
UEVENT add sdc sdh sdd sdl
UEVENT add sde sdj sdf sdm
UEVENT add sdg sdk sdi sdn
3) Multipath processes these 3 uevents one by one and produces only 3 DM
addition uevents; no dm change uevents are generated:
KERNEL[89068.899614] add /devices/virtual/block/dm-2 (block)
KERNEL[89068.955364] add /devices/virtual/block/dm-3 (block)
KERNEL[89069.018903] add /devices/virtual/block/dm-4 (block)
4) Udev processes the uevents above and forwards them to multipath:
UDEV [89068.926428] add /devices/virtual/block/dm-2 (block)
UDEV [89069.007511] add /devices/virtual/block/dm-3 (block)
UDEV [89069.098054] add /devices/virtual/block/dm-4 (block)
5) Multipath processes the uevents above and finishes the creation of the
multipath devices.
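The merge in step 2) amounts to grouping path uevents by LUN identity. Below is
a minimal sketch of that grouping, assuming a simplified event record.
struct path_event, struct merged_event and group_by_wwid() are hypothetical
names used only for illustration; they are not part of multipath-tools.

```c
#include <assert.h>
#include <string.h>

#define MAX_GROUPS 8   /* illustrative limits, not taken from multipath-tools */
#define MAX_PATHS  8

/* Hypothetical, simplified stand-in for a path addition uevent. */
struct path_event {
	const char *devname;   /* kernel name, e.g. "sdc" */
	const char *wwid;      /* LUN identity used as the merge key */
};

/* One merged internal uevent: all paths that share a WWID. */
struct merged_event {
	const char *wwid;
	const char *devnames[MAX_PATHS];
	int count;
};

/*
 * Group events that share a WWID, preserving arrival order both between
 * groups (first-seen order) and inside each group.
 * Returns the number of merged events written to out[].
 */
static int group_by_wwid(const struct path_event *ev, int n,
			 struct merged_event *out, int max_out)
{
	int ngroups = 0;

	for (int i = 0; i < n; i++) {
		int g;

		/* Look for an existing group with the same WWID. */
		for (g = 0; g < ngroups; g++)
			if (strcmp(out[g].wwid, ev[i].wwid) == 0)
				break;

		/* None found: open a new group. */
		if (g == ngroups) {
			assert(ngroups < max_out);
			out[g].wwid = ev[i].wwid;
			out[g].count = 0;
			ngroups++;
		}

		assert(out[g].count < MAX_PATHS);
		out[g].devnames[out[g].count++] = ev[i].devname;
	}
	return ngroups;
}
```

Feeding the 12 addition uevents from step 1) into group_by_wwid(), with a
placeholder key per LUN number (the udev log does not show the real WWIDs),
gives 3 groups of 4 paths each. The first group is sdc sdh sdd sdl, matching
the first internal uevent in step 2).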
5. Coding
After taking over uevents from the uevent listening thread, the uevent
processing thread can merge these uevents before processing them:
int uevent_dispatch(int (*uev_trigger)(struct uevent *, void *trigger_data),
                    void *trigger_data)
{
        ...
        while (1) {
                ...
                list_splice_init(&uevq, &uevq_tmp);
                ...
                list_merger_uevents(&uevq_tmp);
                service_uevq(&uevq_tmp);
        }
        ...
}
In “struct uevent”, a new member “char wwid[WWID_SIZE]” is added to record the
device WWID for each addition or change uevent, so that we can identify whether
uevents come from the same LUN. A second new member,
“struct list_head merger_node”, is added to record the list of uevents which
have been merged into this uevent:
struct uevent {
        struct list_head node;
        struct list_head merger_node;
        char wwid[WWID_SIZE];
        struct udev_device *udev;
        ...
};
In list_merger_uevents(&uevq_tmp), the nodes are traversed from the latest to
the oldest, and each older node with the same WWID and uevent type (e.g. add)
is moved to the merger_node list of the later node. If a deletion uevent node
is encountered, the older uevent nodes for the same device are filtered out
(thanks to Martin for this idea).
After this processing, note that the parameter “struct uevent *uev” no longer
represents a single uevent in and after uev_trigger(). The code needs to be
modified to process batched uevents in uev_add_path() and so on.
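The traversal described above can be sketched as follows. This is only an
illustration under simplifying assumptions: it uses a plain array instead of
the list_head queue, and struct sketch_uevent, enum uev_action and
sketch_merge_uevents() are hypothetical names, not existing multipath-tools
code. Instead of moving nodes onto a merger_node list, each entry just records
the index of the later uevent that absorbed it. Ordering constraints between
uevents of different types for the same LUN are not handled here and would
need care in a real implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum uev_action { UEV_ADD, UEV_CHANGE, UEV_REMOVE };

/* Hypothetical, simplified uevent; index 0 is the oldest entry in the queue. */
struct sketch_uevent {
	enum uev_action action;
	const char *devname;   /* kernel name, e.g. "sdc" */
	const char *wwid;      /* set for add/change uevents; may be NULL for remove */
	bool dropped;          /* filtered out by a later remove of the same device */
	int merged_into;       /* index of the later uevent that absorbed it, or -1 */
};

/* Walk the queue from the latest uevent to the oldest and merge or filter. */
static void sketch_merge_uevents(struct sketch_uevent *q, int n)
{
	for (int i = 0; i < n; i++) {
		q[i].dropped = false;
		q[i].merged_into = -1;
	}

	for (int later = n - 1; later > 0; later--) {
		/* A uevent that was already merged or filtered absorbs nothing. */
		if (q[later].dropped || q[later].merged_into >= 0)
			continue;

		for (int older = later - 1; older >= 0; older--) {
			if (q[older].dropped || q[older].merged_into >= 0)
				continue;

			if (q[later].action == UEV_REMOVE) {
				/* A later remove makes older uevents of that device moot. */
				if (strcmp(q[older].devname, q[later].devname) == 0)
					q[older].dropped = true;
				continue;
			}

			/* Same type and same WWID: fold the older uevent into the later one. */
			if (q[older].action == q[later].action &&
			    strcmp(q[older].wwid, q[later].wwid) == 0)
				q[older].merged_into = later;
		}
	}
}
```

After the call, only entries with merged_into == -1 and dropped == false still
need to be passed to the trigger function; each of them stands for itself plus
every entry whose merged_into points at it.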
-- dm-devel mailing list dm-devel@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/dm-devel