On Tue, 21 Aug 2018 11:16:49 -0600 Mathieu Poirier <mathieu.poirier@xxxxxxxxxx> wrote: > On Mon, 20 Aug 2018 at 08:36, Suzuki K Poulose <suzuki.poulose@xxxxxxx> wrote: > > > > On 08/20/2018 03:22 PM, Kim Phillips wrote: > > > On Mon, 20 Aug 2018 11:03:03 +0100 > > > Suzuki K Poulose <suzuki.poulose@xxxxxxx> wrote: > > > > > >> On 08/16/2018 08:28 PM, Mathieu Poirier wrote: > > >>> On Wed, 15 Aug 2018 at 09:28, Kim Phillips <kim.phillips@xxxxxxx> wrote: > > >>>> > > >>>> On Wed, 15 Aug 2018 10:39:13 +0100 > > >>>> Will Deacon <will.deacon@xxxxxxx> wrote: > > >>>> > > >>>>> On Tue, Aug 14, 2018 at 01:42:27PM -0600, Mathieu Poirier wrote: > > >>>>>> On Tue, 14 Aug 2018 at 11:09, Kim Phillips <kim.phillips@xxxxxxx> wrote: > > >>>>>>> The other thing that's going on here is that I'm becoming numb to the > > >>>>>>> loathsome "failed to mmap with 12 (Cannot allocate memory)" being > > >>>>>>> returned no matter what the error is/was. E.g., an error that would > > >>>>>>> indicate a sense of non-implementation would be much better > > >>>>>>> appreciated than presumably what the above is doing, i.e., returning > > >>>>>>> -ENOMEM. That, backed up with specific details in the form of human > > >>>>>>> readable text in dmesg would be *most* welcome. > > >>>>>> > > >>>>>> As part of the refactoring of the code to support CPU-wide scenarios I > > >>>>>> intend to emit better diagnostic messages from the driver. Modifying > > >>>>>> rb_alloc_aux() to propagate the error message generated by the > > >>>>>> architecture specific PMUs doesn't look hard either and I _may_ get to > > >>>>>> it as part of this work. > > >>>>> > > >>>>> For the record, I will continue to oppose PMU drivers that dump diagnostics > > >>>>> about user-controlled input into dmesg, but the coresight drivers are yours > > >>>>> so it's up to you and I won't get in the way! > > >>>> > > >>>> That sounds technically self-contradicting to me. Why shouldn't > > >>>> coresight share the same policies as those used for PMU drivers? Or > > >>>> why not allow the individual vendor PMU driver authors control the > > >>>> level of user-friendliness of their own drivers? > > >>>> > > >>>> That being said, Matheiu, would you accept patches that make coresight > > >>>> more verbose in dmesg? > > >>> > > >>> It depends on the issue you're hoping to address. I'd rather see the > > >>> root cause of the problem fixed than adding temporary code. Suzuki > > >>> added the ETR perf API and I'm currently working on CPU-wide > > >>> scenarios. From there and with regards to what can happen in > > >>> setup_aux(), we should have things covered. > > >> > > >> I think the main issue is the lack of error code propagation from > > >> setup_aux() back to the perf_aux_output_handle_begin(), which always > > >> return -ENOMEM. If we fix that, we could get better idea of whats > > >> wrong. > > > > > > Why get a better idea when we can get the exact details? > > > > The different values for error numbers are there for a reason... > > > > > > > >> If someone is planning to add verbose messages, they may do so by adding > > >> dev_dbg() / pr_debug(), which can be turned on as and when needed. > > > > > > I disagree: that just adds another usage and kernel configuration > > > obstacle. Why not use pr_err straight up? > > I think everything on this topic has been said already. As I remarked Apparently not :) > earlier once I'm done with CPU-wide support we shouldn't need detailed > error reporting. Everything should be handled via error code > propagation and that is what I'd like to see addressed in the first > place. From there we can think about individual error cases as they > come up. FYI, there are a lot of -EINVAL cases: $ git grep -- -EINVAL drivers/hwtracing/coresight/ | wc -l 142 > > I personally don't agree to usage of pr_err() in paths which are easily > > triggered from user input. Also, we are moving all the "debugging" > > messages to the dynamic debug, to prevent lockdep splats. > > A slight correction here - we are moving most of the framework error > reporting to dynamic debug because they clog the log file and aren't > useful outside of a development context. It is not a remedy for the I didn't see anyone complain about any 'clogging', and I sure don't mind seeing the extra messaging, esp. when using coresight in sysfs mode: it's reinforcement to the user that it's doing work. > negative interaction between event locking and console access > generated by the framework's reporting of device status as a path is > built/enabled/disabled. I'd like to see this splat from this 'negative interaction', and it be fixed before we allow code in that makes it silently go away... Kim