On 2022/9/16 2:17, Benjamin Marzinski wrote:
> On Thu, Sep 15, 2022 at 02:56:36PM +0800, Wu Guanghao wrote:
>> Sorry for the late feedback.
>>
>> The version we are currently testing is 0.8.4, so we only merged the
>> first 3 patches in this series. After actual testing, we found that
>> the improvement was not very significant.
>>
>> Test environment: 1000 multipath devices, 16 paths per device
>> Test command: aggregate multipath devices using multipathd add path
>> Time taken (time for 16 paths to aggregate into 1 multipath device):
>> 1. Original:
>>    < 4s: 76.9%;
>>    4s~10s: 18.4%;
>>    >10s: 4.7%;
>> 2. After applying the patches:
>>    < 4s: 79.1%;
>>    4s~10s: 18.2%;
>>    >10s: 2.7%;
>>
>> From the results, the improvement from the patches is not obvious, so
>> we made a few changes to the patch and it then worked as expected. The
>> modifications are as follows:
>>
>> paths_checked was made configurable; we currently set n = 2.
>> Removed the check on diff_time.
>> Changed the sleep from 10us to 5ms.
>
> I worry that this is giving too much deference to the uevents. If there
> is a uevent storm, checking will stop for 5ms after every 2 paths
> checked. I'm worried that this will make it take significantly longer
> for the path checker to complete a cycle. Making paths_checked
> configurable, so that this doesn't trigger too often, does help to
> avoid the issue that checking the time from chk_start_time was dealing
> with, and makes the mechanics of this simpler. But 5ms seems like a
> long time to have to wait just to make sure that another process has
> had time to grab the vecs lock. Perhaps it would be better to sleep for
> a shorter length of time, but in a loop, where we can check to see if
> progress has been made, perhaps by checking some counter of events and
> user commands serviced. This way, we aren't sleeping too long for no
> good reason.
I agree with this; we are also going to adjust the sleep time and then
continue testing.

>> if (++paths_checked % n == 0 &&
>>     lock_has_waiters(&vecs->lock)) {
>>         get_monotonic_time(&end_time);
>>         timespecsub(&end_time, &chk_start_time,
>>                     &diff_time);
>>         if (diff_time.tv_sec > 0) /* deleted this check: goto unlock directly */
>>                 goto unlock;
>> }
>>
>> if (checker_state != CHECKER_FINISHED) {
>>         /* Yield to waiters */
>>         /* struct timespec wait = { .tv_nsec = 10000, }; */
>>         /* nanosleep(&wait, NULL); */
>>         usleep(5000); /* sleep changed from 10us to 5ms */
>> }
>>
>> Time taken (after making the above changes to the patch):
>> < 4s: 99.5%;
>> = 4s: 0.5%;
>> > 4s: 0%;
>
> Since I believe that this is likely causing a real impact on the
> checker speed, I'm not sure that we're looking for results like this.
> Are you seeing that "path checkers took longer than ..." message? How
> long does it say the checker is taking (and what do you have
> max_polling_interval set to)?

Yes, I saw this message. In the current test environment, using the
original code, all path checks take at least 50s to complete. We
currently set max_polling_interval=20s, so this message is always
printed. Adding the sleep just makes it print more often.

> -Ben

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/dm-devel