On 3/20/23 10:07, Moger, Babu wrote: > Hi Reinette, > > On 3/16/23 15:33, Reinette Chatre wrote: >> Hi Babu, >> >> On 3/16/2023 12:51 PM, Moger, Babu wrote: >>> On 3/16/23 12:12, Reinette Chatre wrote: >>>> On 3/16/2023 9:27 AM, Moger, Babu wrote: >>>>>> -----Original Message----- >>>>>> From: Reinette Chatre <reinette.chatre@xxxxxxxxx> >>>>>> Sent: Wednesday, March 15, 2023 1:33 PM >>>>>> To: Moger, Babu <Babu.Moger@xxxxxxx>; corbet@xxxxxxx; >>>>>> tglx@xxxxxxxxxxxxx; mingo@xxxxxxxxxx; bp@xxxxxxxxx >>>>>> Cc: fenghua.yu@xxxxxxxxx; dave.hansen@xxxxxxxxxxxxxxx; x86@xxxxxxxxxx; >>>>>> hpa@xxxxxxxxx; paulmck@xxxxxxxxxx; akpm@xxxxxxxxxxxxxxxxxxxx; >>>>>> quic_neeraju@xxxxxxxxxxx; rdunlap@xxxxxxxxxxxxx; >>>>>> damien.lemoal@xxxxxxxxxxxxxxxxxx; songmuchun@xxxxxxxxxxxxx; >>>>>> peterz@xxxxxxxxxxxxx; jpoimboe@xxxxxxxxxx; pbonzini@xxxxxxxxxx; >>>>>> chang.seok.bae@xxxxxxxxx; pawan.kumar.gupta@xxxxxxxxxxxxxxx; >>>>>> jmattson@xxxxxxxxxx; daniel.sneddon@xxxxxxxxxxxxxxx; Das1, Sandipan >>>>>> <Sandipan.Das@xxxxxxx>; tony.luck@xxxxxxxxx; james.morse@xxxxxxx; >>>>>> linux-doc@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; >>>>>> bagasdotme@xxxxxxxxx; eranian@xxxxxxxxxx; christophe.leroy@xxxxxxxxxx; >>>>>> jarkko@xxxxxxxxxx; adrian.hunter@xxxxxxxxx; quic_jiles@xxxxxxxxxxx; >>>>>> peternewman@xxxxxxxxxx >>>>>> Subject: Re: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group >>>>>> at once >>>>>> >>>>>> Hi Babu, >>>>>> >>>>>> On 3/2/2023 12:24 PM, Babu Moger wrote: >>>>>>> The resctrl task assignment for MONITOR or CONTROL group needs to be >>>>>>> done one at a time. For example: >>>>>>> >>>>>>> $mount -t resctrl resctrl /sys/fs/resctrl/ >>>>>>> $mkdir /sys/fs/resctrl/clos1 >>>>>>> $echo 123 > /sys/fs/resctrl/clos1/tasks >>>>>>> $echo 456 > /sys/fs/resctrl/clos1/tasks >>>>>>> $echo 789 > /sys/fs/resctrl/clos1/tasks >>>>>>> >>>>>>> This is not user-friendly when dealing with hundreds of tasks. Also, >>>>>>> there is a syscall overhead for each command executed from user space. >>>>>> >>>>>> To support this change it may also be helpful to add that moving tasks take the >>>>>> mutex so attempting to move tasks in parallel will not achieve a significant >>>>>> performance gain. >>>>> >>>>> Agree. It may not be significant performance gain. Will remove this line. >>>> >>>> It does not sound as though you are actually responding to my comment. >>> >>> I am confused. I am already saying there is syscall overhead for each >>> command if we move the tasks one by one. Now do you want me to add "moving >>> tasks take the mutex so attempting to move tasks in parallel will not >>> achieve a significant performance gain". >>> >>> It is contradictory, So, I wanted to remove the line about performance. >>> Did I still miss something? >> >> Where is the contradiction? >> >> Consider your example: >> $echo 123 > /sys/fs/resctrl/clos1/tasks >> $echo 456 > /sys/fs/resctrl/clos1/tasks >> $echo 789 > /sys/fs/resctrl/clos1/tasks >> >> Yes, there is syscall overhead for each of the above lines. My statement was in >> support of this work by stating that a user aiming to improve performance by >> attempting the above in parallel would not be able to see achieve significant >> performance gain since the calls would end up being serialized. > > ok. Sure. Will add the text. I may modify little bit. >> >> You are providing two motivations (a) "user-friendly when dealing with >> hundreds of tasks", and (b) syscall overhead. Have you measured the >> improvement this solution provides? > > No. I have not measured the performance improvement. > >> >>>>>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c >>>>>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c >>>>>>> @@ -683,16 +683,34 @@ static ssize_t rdtgroup_tasks_write(struct >>>>>> kernfs_open_file *of, >>>>>>> char *buf, size_t nbytes, loff_t off) { >>>>>>> struct rdtgroup *rdtgrp; >>>>>>> + char *pid_str; >>>>>>> int ret = 0; >>>>>>> pid_t pid; >>>>>>> >>>>>>> - if (kstrtoint(strstrip(buf), 0, &pid) || pid < 0) >>>>>>> + /* Valid input requires a trailing newline */ >>>>>>> + if (nbytes == 0 || buf[nbytes - 1] != '\n') >>>>>>> return -EINVAL; >>>>>> >>>>>> The resctrl files should be seen as user space API. With the above change you >>>>>> take an interface that did not require a newline and dictate that it should have >>>>>> a trailing newline. How convinced are you that this does not break any current >>>>>> user space scripts or applications? Why does this feature require a trailing >>>>>> newline? >>>>> >>>>> I have tested these changes with intel_cmt_cat tool. It didn’t have any problems. >>>>> We are already doing newline check for few other inputs. >>>> >>>> You tested this with the _one_ user space tool that you use. This is not sufficient >>>> to be convincing that this change has no impact. I do not believe that it is a valid >>>> argument that other inputs do a newline check. This input never required a newline >>>> check and it is not clear why this change now requires it. It seems that this is an >>>> unnecessary new requirement that runs the risk of breaking an existing application. >>>> >>>> I would like to ask again: How convinced are you that this does not break _any_ current >>>> user space scripts or applications? Why does this feature require a trailing >>>> newline? > > I dont think this feature required trailing newline. I may have carried > away from similar code in the area. I will remove that requirement. > >>> >>> I do not know of any other tool using resctrl fs. >>> So, you want me to drop the newline requirement for this. I can try that. >>> Will let you know how it goes. >> >> You continue to avoid my question about why this requires a newline. Until >> I learn why this is required, yes, from what I can tell based on this patch >> this requirement can and should be dropped. >> >>>>>>> + >>>>>>> + buf[nbytes - 1] = '\0'; >>>>>>> + >>>>>>> rdtgrp = rdtgroup_kn_lock_live(of->kn); >>>>>>> if (!rdtgrp) { >>>>>>> rdtgroup_kn_unlock(of->kn); >>>>>>> return -ENOENT; >>>>>>> } >>>>>>> + >>>>>>> +next: >>>>>>> + if (!buf || buf[0] == '\0') >>>>>>> + goto unlock; >>>>>>> + >>>>>>> + pid_str = strim(strsep(&buf, ",")); >>>>>>> + >>>>>> >>>>>> Could lib/cmdline.c:get_option() be useful? >>>>> >>>>> Yes. We could that also. May not be required for the simple case like this. >>>> >>>> Please keep an eye out for how much of it you end up duplicating .... >>> >>> Using the get_options will require at least two calls(one to get the >>> length and then read the integers). Also need to allocate the integers >>> array dynamically. That is lot code if we are going that route. >>> >> >> I did not ask about get_options(), I asked about get_option(). > > If you insist, will use get_option. But we still have to loop thru all the > string till get_option returns 0. I can try that. > >> >>>> >>>>>>> + ret = -EINVAL; >>>>>>> + goto unlock; >>>>>>> + } >>>>>>> + >>>>>>> rdt_last_cmd_clear(); >>>>>>> >>>>>>> if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED || @@ -703,6 >>>>>> +721,10 @@ >>>>>>> static ssize_t rdtgroup_tasks_write(struct kernfs_open_file *of, >>>>>>> } >>>>>>> >>>>>>> ret = rdtgroup_move_task(pid, rdtgrp, of); >>>>>>> + if (ret) >>>>>>> + goto unlock; >>>>>>> + else >>>>>>> + goto next; >>>>>>> >>>>>> >>>>>> The documentation states "The failure details will be logged in >>>>>> resctrl/info/last_cmd_status file." but I do not see how this is happening here. >>>>>> From what I can tell this implementation does not do anything beyond what >>>>>> last_cmd_status already does so any special mention in the docs is not clear to >>>>>> me. The cover letter stated "Added pid in last_cmd_status when applicable." - it >>>>>> sounded as though last_cmd_status would contain the error with the pid that >>>>>> encountered the error but I do not see this happening here. >>>>> >>>>> You are right we are not doing anything special here. pid failures error was already there. >>>>> I will have to change the text here. >>>> >>>> What do you mean with "pid failures error was already there"? From what >>>> I understand your goal is to communicate to the user which pid >>>> encountered the error and I do not see that done. How will user know >>>> which pid encountered a failure? >>> >>> We only have couple of failures to take here. Those failures are already >>> handled by rdtgroup_move_task. It logs the pid for failure(using >>> rdt_last_cmd_printf). >> >> The pid is only logged if there is no task with that pid. How about the >> error in __rdtgroup_move_task() - how will the user know which pid triggered >> that error? > > Yes. These cases we may be able to report the pid. I meant "we may not" > >> >>> >>> I can say "The failure pid will be logged in >>> /sys/fs/resctrl/info/last_cmd_status file." >> >> That will not be accurate. Not all errors include the pid. > > Can you please suggest? > Thanks > Babu Moger -- Thanks Babu Moger