Beau Belgrave <beaub@xxxxxxxxxxxxxxxxxxx> wrote on Tue, Jun 20, 2023 at 02:40:
>
> On Mon, Jun 19, 2023 at 04:51:56PM +0800, sunliming wrote:
> > Beau Belgrave <beaub@xxxxxxxxxxxxxxxxxxx> wrote on Sat, Jun 17, 2023 at 00:08:
> > >
> > > On Fri, Jun 09, 2023 at 11:03:00AM +0800, sunliming wrote:
> > > > The writing operation return the count of writes whether events are
> > > > enabled or disabled. This is incorrect when events are disabled. Fix
> > > > this by just return -ENOENT when events are disabled.
> > > >
> > >
> > > When testing this patch locally I found that we would occasionally get
> > > -ENOENT when events were enabled, but then become disabled, since writes
> > > do not have any locking around the tracepoint checks for performance
> > > reasons.
> > >
> > > I've asked a few peers of mine their thoughts on this, whether an error
> > > should result when there are no enabled events. The consensus I've heard
> > > back is that they would not consider this case an actual error, just as
> > > writing to /dev/null does not actually return an error.
> > >
> > > However, if you feel strongly we need this and have a good use case, it
> > > seems better to enable this logic behind a flag instead of having it
> > > default based on my conversations with others.
> > >
> > > Thanks,
> > > -Beau
> > >
> > >
> > There is indeed a problem. Once enabled, perform the write operation
> > immediately.
> >
>
> The immediate write does work, and gets put into a buffer. The ftrace
> and perf self tests do the above case. So, no worries at this point.
>
> > Now,when the event is disabled, the trace record appears to be lost.
>
> I'm taking this to mean, if in between the time of the bit check and the
> actual write() /writev() syscall the event becomes disabled, the event
> won't write to the buffer. Yes, that is expected.
>

Yes, got it. Thank you for the explanation.

> > In some situations
> > where data timing is sensitive, it may cause confusion.
> > In this case, not returning an
> > error (as mentioned in your reply, it is not considered this case an
> > actual error) and
> > returning 0 ( meaning that the number of data to be written is 0) may
> > be a good way
> > to handle it?
>
> This is where I get a little lost. What would a user process do with a
> return of 0 bytes? It shouldn't retry, since it just hit that small
> timing window. In reality, it just incurred a temporary excessive
> syscall cost, but no real data loss (the operator/admin turned the event
> off).
>
> I'm missing why you feel it's important the user process know such a
> window was hit?
>
> Can you help me understand that?
>

I haven't encountered a specific scenario where it is important for the
user process to know that such a window was hit. This may be a mistake in
my understanding. When someone uses user events and checks the output of
an event to confirm the execution status of a program, it may cause
confusion if someone else disables the event. This shouldn't be a serious
issue; this patch just makes things look better.

Thanks,
-Sunliming

> I do think returning 0 bytes is better than an error here, but I'd
> really like to know why the user process wants to know at all. Maybe
> they have user-space only logging and want to be able to mark there if
> it's in both spots (kernel and user buffers)?
>
> Thanks,
> -Beau
>
> > Thanks,
> > -Sunliming
> >
> > >
> > > > Signed-off-by: sunliming <sunliming@xxxxxxxxxx>
> > > > ---
> > > >  kernel/trace/trace_events_user.c | 3 ++-
> > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/kernel/trace/trace_events_user.c b/kernel/trace/trace_events_user.c
> > > > index 1ac5ba5685ed..92204bbe79da 100644
> > > > --- a/kernel/trace/trace_events_user.c
> > > > +++ b/kernel/trace/trace_events_user.c
> > > > @@ -1957,7 +1957,8 @@ static ssize_t user_events_write_core(struct file *file, struct iov_iter *i)
> > > >
> > > >  		if (unlikely(faulted))
> > > >  			return -EFAULT;
> > > > -	}
> > > > +	} else
> > > > +		return -ENOENT;
> > > >
> > > >  	return ret;
> > > >  }
> > > > --
> > > > 2.25.1
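
To make the three return conventions discussed in this thread easier to compare, here is a small user-space model. It is only a sketch and not kernel code: model_write(), caller_sees_drop() and the policy names are hypothetical, and the "enabled" argument simply stands for the tracepoint state at the moment the write is handled, which may differ from what the process saw when it checked its enable bit.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical names for the three behaviours discussed in the thread. */
enum disabled_policy {
	POLICY_RETURN_COUNT,	/* before the patch: always report the count */
	POLICY_RETURN_ENOENT,	/* the patch: error when the event is disabled */
	POLICY_RETURN_ZERO,	/* alternative raised in the reply: 0 bytes */
};

/*
 * Toy stand-in for the write path (an assumption, not the kernel code).
 * Returns what a write() of len bytes would report to the caller.
 */
static ssize_t model_write(bool enabled, size_t len,
			   enum disabled_policy policy)
{
	/* A zero-length write would make a 0 return ambiguous. */
	assert(len > 0);

	if (enabled)
		return (ssize_t)len;	/* record reached the buffer */

	switch (policy) {
	case POLICY_RETURN_ENOENT:
		return -ENOENT;
	case POLICY_RETURN_ZERO:
		return 0;
	case POLICY_RETURN_COUNT:
	default:
		return (ssize_t)len;	/* drop is invisible to the caller */
	}
}

/* True if the caller can tell from the return value that nothing was recorded. */
static bool caller_sees_drop(ssize_t ret, size_t len)
{
	return ret < 0 || (size_t)ret < len;
}
```

With the event disabled, only the second and third policies let caller_sees_drop() report true; the first hides the drop, which is the behaviour the patch description objects to. Whether a process can do anything useful with that knowledge is the open question above.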