On Mon, Nov 28, 2016 at 10:19:15AM +0800, tang.junhui@xxxxxxxxxx wrote:
> Hello Christophe, Ben, Hannes, Martin, Bart,
> I am a member of host-side software development team of ZXUSP storage
> project in ZTE Corporation. Facing the market demand, our team decides to
> write code to promote multipath efficiency next month. The whole idea is
> in the mail below. We hope to participate in and make progress with the
> open source community, so any suggestion and comment would be welcome.

Like I mentioned before, I think this is a good idea in general, but the
devil is in the details here.

> Thanks,
> Tang
>
> ------------------------------------------------------------------------
> 1. Problem
> In these scenarios, multipath processing efficiency is low:
> 1) Many paths exist in each multipath device,
> 2) Devices addition or deletion during iSCSI login/logout or FC link
>    up/down.

<snip>

> 4. Proposal
> Other than processing uevents one by one, uevents which come from the
> same LUN devices can be merged to one, and then uevent processing
> thread only needs to process it once, and it only produces one DM addition
> uevent which could reduce system resource consumption.
>
> The example in Chapter 2 is continued to use to explain the proposal:
> 1) Multipath receives block device addition uevents from udev:
> UDEV [89068.806214] add /devices/platform/host3/session44/target3:0:0/3:0:0:0/block/sdc (block)
> UDEV [89068.909457] add /devices/platform/host3/session44/target3:0:0/3:0:0:2/block/sdg (block)
> UDEV [89068.944956] add /devices/platform/host3/session44/target3:0:0/3:0:0:1/block/sde (block)
> UDEV [89068.959215] add /devices/platform/host5/session46/target5:0:0/5:0:0:0/block/sdh (block)
> UDEV [89068.978558] add /devices/platform/host5/session46/target5:0:0/5:0:0:2/block/sdk (block)
> UDEV [89069.004217] add /devices/platform/host5/session46/target5:0:0/5:0:0:1/block/sdj (block)
> UDEV [89069.486361] add /devices/platform/host4/session45/target4:0:0/4:0:0:1/block/sdf (block)
> UDEV [89069.495194] add /devices/platform/host4/session45/target4:0:0/4:0:0:0/block/sdd (block)
> UDEV [89069.511628] add /devices/platform/host4/session45/target4:0:0/4:0:0:2/block/sdi (block)
> UDEV [89069.716292] add /devices/platform/host6/session47/target6:0:0/6:0:0:0/block/sdl (block)
> UDEV [89069.748456] add /devices/platform/host6/session47/target6:0:0/6:0:0:1/block/sdm (block)
> UDEV [89069.789662] add /devices/platform/host6/session47/target6:0:0/6:0:0:2/block/sdn (block)
>
> 2) Multipath merges these 12 uevents to 3 internal uevents
> UEVENT add sdc sdh sdd sdl
> UEVENT add sde sdj sdf sdm
> UEVENT add sdg sdk sdi sdn
>
> 3) Multipath processes these 3 uevents one by one, and only produces 3
> addition DM uevents, no dm change uevent exists.
> KERNEL[89068.899614] add /devices/virtual/block/dm-2 (block)
> KERNEL[89068.955364] add /devices/virtual/block/dm-3 (block)
> KERNEL[89069.018903] add /devices/virtual/block/dm-4 (block)

Just because I'm pedantic: There will, of course, be dm change events.
Without them, you couldn't have a multipath device.
Whenever you load a table in a dm device (including during the initial
creation), you get a change event.

> 4) Udev processes these uevents above, and transfers them to multipath
> UDEV [89068.926428] add /devices/virtual/block/dm-2 (block)
> UDEV [89069.007511] add /devices/virtual/block/dm-3 (block)
> UDEV [89069.098054] add /devices/virtual/block/dm-4 (block)

multipathd ignores add events for dm devices (look at uev_trigger). A dm
device isn't set up until its initial change event happens.

> 5) Multipath processes these uevents above, and finishes the creation of
> multipath devices.
>
> 5. Coding
> After taking over uevents from uevent listening thread, uevent processing
> thread can merge these uevents before processing:
> int uevent_dispatch(int (*uev_trigger)(struct uevent *, void * trigger_data),
>                     void * trigger_data)
> {
>         ...
>         while (1) {
>                 ...
>                 list_splice_init(&uevq, &uevq_tmp);
>                 ...
>                 list_merger_uevents(&uevq_tmp);
>                 service_uevq(&uevq_tmp);
>         }
>         ...
> }
> In structure of "struct uevent", an additional member of
> "char wwid[WWID_SIZE]" is added to record each device WWID for addition or
> change uevent to identify whether these uevents come from the same LUN,
> and an additional member of "struct list_head merger_node" is added to
> record the list of uevents which have been merged with this uevent:
> struct uevent {
>         struct list_head node;
>         struct list_head merger_node;
>         char wwid[WWID_SIZE];
>         struct udev_device *udev;
>         ...
> };

You can't just get the wwid with no work (look at all the work uev_add_path
does, specifically alloc_path_with_pathinfo). Now you could reorder this,
but there isn't much point, since it is doing useful things, like checking
if this is a spurious uevent, and necessary things, like figuring out the
device type and using that and the configuration to figure out HOW to get
the wwid.
It seems like what you want to do is to call uev_add_path multiple times,
but defer most of the work that ev_add_path does (creating or updating the
multipath device), until you've processed all the paths.

> In list_merger_uevents(&uevq_tmp), each node is traversed from the latest
> to the oldest, and the older node with the same WWID and uevent type (e.g.
> add) would be moved to the merger_node list of the later node. If a
> deletion uevent node occurred, other older uevent nodes about this device
> would be filtered (Thanks to Martin's idea).
>
> After above processing, attention must be paid to that the parameter
> "struct uevent * uev" is not a single uevent any more in and after
> uev_trigger(), code needs to be modified to process batch uevents in
> uev_add_path() and so on.

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel
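
For illustration only, here is a rough, self-contained sketch of the merge
pass described in the quoted section 5: walk the queue from newest to
oldest, fold older events with the same WWID and action into the newer one,
and let a remove event filter out older events for the same device. It is
not multipathd code. The struct, the sketch_* names, and the array-based
queue are all hypothetical stand-ins, and it simply assumes the WWID is
already known for each event, which (as noted above) is not free in
practice. It also assumes "this device" in the filtering rule means the
path device name (e.g. sde), not the LUN.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_WWID_SIZE 128
#define SKETCH_NAME_SIZE 32

enum sketch_action { UEV_ADD, UEV_CHANGE, UEV_REMOVE };

/* Hypothetical, simplified stand-in for multipathd's struct uevent. */
struct sketch_uevent {
	enum sketch_action action;
	char devname[SKETCH_NAME_SIZE];
	char wwid[SKETCH_WWID_SIZE];	/* assumed to be known already */
	int merged_into;		/* index of absorbing event, or -1 */
	bool dropped;			/* filtered by a later remove event */
};

void sketch_uevent_init(struct sketch_uevent *uev, enum sketch_action action,
			const char *devname, const char *wwid)
{
	uev->action = action;
	snprintf(uev->devname, sizeof(uev->devname), "%s", devname);
	snprintf(uev->wwid, sizeof(uev->wwid), "%s", wwid);
	uev->merged_into = -1;
	uev->dropped = false;
}

/* q[0] is the oldest event, q[n - 1] the newest. */
void sketch_merge_uevents(struct sketch_uevent *q, size_t n)
{
	/* Pass 1: a remove event filters older events for the same device. */
	for (size_t i = n; i-- > 0; ) {
		if (q[i].action != UEV_REMOVE || q[i].dropped)
			continue;
		for (size_t j = i; j-- > 0; )
			if (strcmp(q[j].devname, q[i].devname) == 0)
				q[j].dropped = true;
	}

	/* Pass 2: newest to oldest, fold same-WWID, same-action events. */
	for (size_t i = n; i-- > 0; ) {
		struct sketch_uevent *newer = &q[i];

		if (newer->dropped || newer->merged_into >= 0 ||
		    newer->action == UEV_REMOVE || newer->wwid[0] == '\0')
			continue;
		for (size_t j = i; j-- > 0; ) {
			struct sketch_uevent *older = &q[j];

			if (older->dropped || older->merged_into >= 0)
				continue;
			if (older->action == newer->action &&
			    strcmp(older->wwid, newer->wwid) == 0)
				older->merged_into = (int)i;
		}
	}
}

/* Number of events that still need to be serviced on their own. */
size_t sketch_count_leaders(const struct sketch_uevent *q, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (!q[i].dropped && q[i].merged_into < 0)
			count++;
	return count;
}
```

Filtering runs as a separate first pass so that an old add which a later
remove should cancel is never folded into an even newer add for the same
LUN. Feeding it the 12 add events from the example, with one made-up WWID
per LUN, should leave 3 events to service.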