On 2024-01-17 11:15 +0800, Hangbin Liu wrote: > On Tue, Jan 16, 2024 at 02:47:46PM -0500, Benjamin Poirier wrote: > > On 2024-01-16 11:29 -0800, Jakub Kicinski wrote: > > > On Tue, 16 Jan 2024 14:21:51 -0500 Benjamin Poirier wrote: > > > > real 13m35.065s > > > > user 0m1.657s > > > > sys 0m27.918s > > > > > > > > The test is not cpu bound; as Jay pointed out, it spends most of its > > > > time sleeping. > > > > > > Ugh, so it does multiple iterations of 118 sec? > > > > There are other test functions in the script which include a lot of > > sleeping. > > The arp_validate_test need to check the mii_status, which sleep too much time. > Maybe we can use busywait to save more time. > > > > > > Could you send a patch to bump the timeout to 900 or 1200 in this case? > > > > Sure but I'd like to give a chance for Hangbin to reply first. Would the > > test be just as good if it was shortened by removing some cases or > > reducing the time intervals? Or is increasing the timeout the best > > approach? > > The purpose of grat_arp is testing commit 9949e2efb54e ("bonding: fix > send_peer_notif overflow"). As the send_peer_notif was defined to u8, > to overflow it, we need to > > send_peer_notif = num_peer_notif * peer_notif_delay = num_grat_arp * peer_notify_delay / miimon > 255 > (kernel) (kernel parameter) (user parameter) > > e.g. 30 (num_grat_arp) * 1000 (peer_notify_delay) / 100 (miimon) > 255. > > Which need 30s to complete sending garp messages. To save the testing time, > the only way is reduce the miimon number. Something like > 30 (num_grat_arp) * 500 (peer_notify_delay) / 50 (miimon) > 255. > > To save more time, we can remove the 50 num_grat_arp testing. The patch would > like > > diff --git a/tools/testing/selftests/drivers/net/bonding/bond_options.sh b/tools/testing/selftests/drivers/net/bonding/bond_options.sh > index c54d1697f439..20c4d862c436 100755 > --- a/tools/testing/selftests/drivers/net/bonding/bond_options.sh > +++ b/tools/testing/selftests/drivers/net/bonding/bond_options.sh > @@ -277,7 +277,7 @@ garp_test() > ip -n ${s_ns} link set ${active_slave} down > > exp_num=$(echo "${param}" | cut -f6 -d ' ') > - sleep $((exp_num + 2)) > + sleep $((exp_num / 2 + 2)) > > active_slave=$(cmd_jq "ip -n ${s_ns} -d -j link show bond0" ".[].linkinfo.info_data.active_slave") > > @@ -296,8 +296,8 @@ garp_test() > num_grat_arp() > { > local val > - for val in 10 20 30 50; do > - garp_test "mode active-backup miimon 100 num_grat_arp $val peer_notify_delay 1000" > + for val in 10 20 30; do > + garp_test "mode active-backup miimon 50 num_grat_arp $val peer_notify_delay 500" > log_test "num_grat_arp" "active-backup miimon num_grat_arp $val" > done > } > > With this we can save 100s. > Thanks for looking into it. This change got the runtime down from 13m35s to 12m7s on the same system I used to test yesterday. That's a start but since it's still well above the current timeout of 120s, I sent a patch to increase the timeout to 1200s.