Hi Ilpo,

On 7/14/2023 3:35 AM, Ilpo Järvinen wrote:
> On Thu, 13 Jul 2023, Reinette Chatre wrote:
>> On 7/13/2023 6:19 AM, Ilpo Järvinen wrote:
>>> Perf event fd (fd_lm) is not closed on some error paths.
>>>
>>> Always close fd_lm in get_llc_perf() and add close into an error
>>> handling block in cat_val().
>>>
>>> Fixes: 790bf585b0ee ("selftests/resctrl: Add Cache Allocation Technology (CAT) selftest")
>>> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx>
>>> ---
>>>  tools/testing/selftests/resctrl/cache.c | 10 +++++-----
>>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
>>> index 8a4fe8693be6..ced47b445d1e 100644
>>> --- a/tools/testing/selftests/resctrl/cache.c
>>> +++ b/tools/testing/selftests/resctrl/cache.c
>>> @@ -87,21 +87,20 @@ static int reset_enable_llc_perf(pid_t pid, int cpu_no)
>>>  static int get_llc_perf(unsigned long *llc_perf_miss)
>>>  {
>>>  	__u64 total_misses;
>>> +	int ret;
>>>
>>>  	/* Stop counters after one span to get miss rate */
>>>
>>>  	ioctl(fd_lm, PERF_EVENT_IOC_DISABLE, 0);
>>>
>>> -	if (read(fd_lm, &rf_cqm, sizeof(struct read_format)) == -1) {
>>> +	ret = read(fd_lm, &rf_cqm, sizeof(struct read_format));
>>> +	close(fd_lm);
>>> +	if (ret == -1) {
>>>  		perror("Could not get llc misses through perf");
>>> -
>>>  		return -1;
>>>  	}
>>>
>>>  	total_misses = rf_cqm.values[0].value;
>>> -
>>> -	close(fd_lm);
>>> -
>>>  	*llc_perf_miss = total_misses;
>>>
>>>  	return 0;
>>> @@ -253,6 +252,7 @@ int cat_val(struct resctrl_val_param *param)
>>>  					 memflush, operation, resctrl_val)) {
>>>  				fprintf(stderr, "Error-running fill buffer\n");
>>>  				ret = -1;
>>> +				close(fd_lm);
>>>  				break;
>>>  			}
>>>
>>
>> Instead of fixing these existing patterns I think it would make the code
>> easier to understand and maintain if it is made symmetrical.
>> Having the perf event fd opened in one place but its close()
>> scattered elsewhere has the potential for confusion and making later
>> mistakes easy to miss.
>>
>> What if perf event fd is closed in a new "disable_llc_perf()" that
>> is matched with "reset_enable_llc_perf()" and called
>> from cat_val()?
>>
>> I think this raises another issue with the test trickery where
>> measure_cache_vals() has some assumptions about state based on the
>> test name.
>
> I very much agree on the principle here, and thus I already have created
> patches which will do a major cleanup on this area. The cleaned-up code
> has pe_fd local var to cat_val() and handles closing it in cat_val() with
> the usual patterns.
>
> However, the patch currently resides post L3 CAT test rewrite.
> Backporting the cleanups/refactors into this series would require
> considerable effort due to how convoluted all those n-step cleanup patches
> and L3 CAT test rewrite are in this area. There's just very much to
> cleanup here and L3 rewrite will touch the same areas so it's a net
> full of conflicts.
>
> Do you want me to spend the effort to backport them into this series
> (I expect it will take some time)?

Considering the "Fixes" tag, having a smaller fix that can easily be
backported would be ideal, so I am ok with deferring a bigger rework.
I do think this fix can be made more robust with a couple of small changes
that should not introduce significant conflicts:
* initialize fd_lm to -1
* do not close() fd_lm in get_llc_perf() but instead move its close() to
  the exit of cat_val()
* add a check in get_llc_perf() that it does not attempt ioctl() on
  "fd_lm == -1" (a later addition would be error checking of the ioctl())

> I currently have these items pending besides this series (in order):
> - L3 CAT test rewrite and its preparatory patches
> - More cleanups (including the pe_fd cleanup)
> - New generalized test framework
> - L2 CAT test

Thank you very much for taking this on.

Reinette
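
For illustration only, the three small changes suggested above could look
roughly like the sketch below. This is not the selftest code: /dev/zero
stands in for the perf event fd so the example runs without perf
privileges, the ioctl() is left out, and every sketch_* name is
hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Suggestion 1: -1 means "no perf event fd is open". */
static int fd_lm = -1;

/* Stand-in for reset_enable_llc_perf(): opens /dev/zero, NOT a perf event. */
static int sketch_enable_llc_perf(void)
{
	/* Opening twice without a close in between would leak the first fd. */
	assert(fd_lm == -1);

	fd_lm = open("/dev/zero", O_RDONLY);
	if (fd_lm == -1) {
		perror("open");
		return -1;
	}
	return 0;
}

/* Suggestion 2: read the counter here, but leave close() to the caller. */
static int sketch_get_llc_perf(unsigned long *llc_perf_miss)
{
	unsigned long long value;

	/* Suggestion 3: never operate on an fd that was not opened. */
	if (fd_lm == -1)
		return -1;

	if (read(fd_lm, &value, sizeof(value)) != (ssize_t)sizeof(value)) {
		perror("Could not get llc misses through perf");
		return -1;
	}

	*llc_perf_miss = (unsigned long)value;
	return 0;
}

/* Hypothetical cat_val() shape: one open, one close, all paths exit via out. */
static int sketch_cat_val(int fail_before_read)
{
	unsigned long misses = 0;
	int ret;

	ret = sketch_enable_llc_perf();
	if (ret)
		return ret;

	if (fail_before_read) {
		/* Models the "Error-running fill buffer" error path. */
		ret = -1;
		goto out;
	}

	ret = sketch_get_llc_perf(&misses);
out:
	close(fd_lm);
	fd_lm = -1;
	return ret;
}
```

With this shape, both sketch_cat_val(0) and sketch_cat_val(1) leave fd_lm
at -1 on return, and a stray sketch_get_llc_perf() call made while no fd is
open fails early instead of touching an invalid descriptor.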