Re: [RFCv11 PATCH 04/29] media-request: core request support

Hans Verkuil <hverkuil@xxxxxxxxx> · Mon, 23 Apr 2018 14:43:36 +0200

On 04/12/2018 09:11 AM, Sakari Ailus wrote:
> On Wed, Apr 11, 2018 at 01:13:44PM -0300, Mauro Carvalho Chehab wrote:
>> Em Wed, 11 Apr 2018 18:35:14 +0300
>> Sakari Ailus <sakari.ailus@xxxxxx> escreveu:
>>
>>> On Wed, Apr 11, 2018 at 12:17:27PM -0300, Mauro Carvalho Chehab wrote:
>>>> Em Wed, 11 Apr 2018 18:02:19 +0300
>>>> Sakari Ailus <sakari.ailus@xxxxxx> escreveu:
>>>>   
>>>>> On Wed, Apr 11, 2018 at 10:49:35AM -0300, Mauro Carvalho Chehab wrote:  
>>>>>> Em Wed, 11 Apr 2018 16:21:16 +0300
>>>>>> Sakari Ailus <sakari.ailus@xxxxxx> escreveu:
>>>>>>
>>>>>>     
>>>>>>>>>> Btw, this is a very good reason why you should define the ioctl to
>>>>>>>>>> have an integer argument instead of a struct with a __s32 field
>>>>>>>>>> on it, as per my comment to patch 02/29:
>>>>>>>>>>
>>>>>>>>>> 	#define MEDIA_IOC_REQUEST_ALLOC	_IOWR('|', 0x05, int)
>>>>>>>>>>
>>>>>>>>>> At 64 bit architectures, you're truncating the file descriptor!        
>>>>>>>>>
>>>>>>>>> I'm not quite sure what do you mean. int is 32 bits on 64-bit systems as
>>>>>>>>> well.      
>>>>>>>>
>>>>>>>> Hmm.. you're right. I was thinking that it could be 64 bits on some
>>>>>>>> archs like sparc64 (Tru64 C compiler declares it with 64 bits), but,
>>>>>>>> according with:
>>>>>>>>
>>>>>>>> 	https://www.gnu.org/software/gnu-c-manual/gnu-c-manual.html
>>>>>>>>
>>>>>>>> This is not the case on gcc.      
>>>>>>>
>>>>>>> Ok. The reasoning back then was that what "int" means varies across
>>>>>>> compilers and languages. And the intent was to codify this to __s32 which
>>>>>>> is what the kernel effectively uses.    
>>>>>>
>>>>>> ...
>>>>>>     
>>>>>>> The rest of the kernel uses int rather liberally in the uAPI so I'm not
>>>>>>> sure in the end whether something desirable was achieved. Perhaps it'd be
>>>>>>> good to go back to the original discussion to find out for sure.
>>>>>>>
>>>>>>> Still binaries compiled with Tru64 C compiler wouldn't work on Linux anyway
>>>>>>> due to that difference.
>>>>>>>
>>>>>>> Well, I stop here for this begins to be off-topic. :-)    
>>>>>>
>>>>>> Yes. Let's keep it as s32 as originally proposed. Just ignore my comments
>>>>>> about that :-)
>>>>>>     
>>>>>>>>>>> +	get_task_comm(comm, current);
>>>>>>>>>>> +	snprintf(req->debug_str, sizeof(req->debug_str), "%s:%d",
>>>>>>>>>>> +		 comm, fd);        
>>>>>>>>>>
>>>>>>>>>> Not sure if it is a good idea to store the task that allocated
>>>>>>>>>> the request. While it makes sense for the dev_dbg() below, it
>>>>>>>>>> may not make sense anymore on other dev_dbg() you would be
>>>>>>>>>> using it.        
>>>>>>>>>
>>>>>>>>> The lifetime of the file handle roughly matches that of the request. It's
>>>>>>>>> for debug only anyway.
>>>>>>>>>
>>>>>>>>> Better proposals are always welcome of course. But I think we should have
>>>>>>>>> something here that helps debugging by meaningfully making the requests
>>>>>>>>> identifiable from logs.      
>>>>>>>>
>>>>>>>> What I meant to say is that one PID could be allocating the
>>>>>>>> request, while some other one could be actually doing Q/DQ_BUF.
>>>>>>>> On such scenario, the debug string could provide mislead prints.      
>>>>>>>
>>>>>>> Um, yes, indeed it would no longer match the process. But the request is
>>>>>>> still the same. That's actually a positive thing since it allows you to
>>>>>>> identify the request.
>>>>>>>
>>>>>>> With a global ID space this was trivial; you could just print the request
>>>>>>> ID and that was all that was ever needed. (I'm not proposing to consider
>>>>>>> that though.)
>>>>>>>     
>>>>>>
>>>>>> IMO, a global ID number would work better than get_task_comm().
>>>>>>
>>>>>> Just add a static int monotonic counter and use it for the debug purposes,
>>>>>> e. g.:
>>>>>>
>>>>>> {
>>>>>> 	static unsigned int req_count = 0;
>>>>>>
>>>>>> 	snprintf(req->debug_str, sizeof(req->debug_str), "%u:%d",
>>>>>> 		req_count++, fd);    
>>>>>>
>>>>>> Ok, eventually, it will overflow, but, it will be unique within
>>>>>> a reasonable timeframe to be good enough for debugging purposes.    
>>>>>
>>>>> Yes, but you can't figure out which process allocated it anymore, making
>>>>> associating kernel debug logs with user space process logs harder.
>>>>>
>>>>> How about process id + file handle? That still doesn't tell which process
>>>>> operated on the request though, but I'm not sure whether that's really a
>>>>> crucial piece of information.  
>>>>
>>>> You don't need that. With dev_dbg() - and other *_dbg() macros - you can
>>>> enable process ID for all debug messages.  
>>>
>>> With this, the problem again is that it does not uniquely identify the
>>> request: the request is the same request independently of which process
>>> would operate on it. Or whether it is being processed in an interrupt
>>> context.
>>>
>>> AFAICT, the allocator PID (or process name) + file handle are both required
>>> to match a request between user and kernel space logs.
>>
>> Sorry, I was unable to understand what you're saying.
>>
>> If you set the debug string with:
>>
>> 	snprintf(req->debug_str, sizeof(req->debug_str), "%u:%d", req_count++, fd);  
>>
>> With the remaining stuff at patch 04/29, e. g. those two printks:
>>
>> 	dev_dbg(mdev->dev, "request: allocated %s\n", req->debug_str);
>> 	dev_dbg(mdev->dev, "request: release %s\n", req->debug_str);
>>
>> And use "+pt" to enable those debug messages, for the request #1 with fd #45, 
>> created by PID 16613 you would have a log like:
>>
>> 	[  269.021116] [16613] request: allocated 1:45
>> 	[  269.024118] [16613] request: release 1:45
>>
>> (assuming that the same PID would create and release)
>>
>> The "1:45" is an unique global ID that would allow tracking it, even
>> if Q/DQ_BUF is done by some other PID.
>>
>> E. g. if a PID#16618 were responsible for Q/DQ_BUF, you would have
>> something like:
>>
>> 	[  269.021116] [16613] request: allocated 1:45
>> 	[  269.021117] [16618] request: Q_BUF 1:45
>> 	[  269.021118] [16618] request: DQ_BUF 1:45
>> 	[  269.024118] [16613] request: release 1:45
>>
>> (assuming that you would have a Q_BUF/DQ_BUF similar dev_dbg())
>>
>> That seems good enough to track it.
>>
>> Yet, in order to make easier to track, I would actually change the
>> dev_dbg() parameter order everywhere to something like:
>>
>> 	dev_dbg(mdev->dev, "request#%s: allocated\n", req->debug_str)
>> 	dev_dbg(mdev->dev, "request#%s: release\n", req->debug_str);
>>
>> In order to print something like:
>>
>> 	[  269.021116] [16613] request#1:45: allocated 
>> 	[  269.021117] [16618] request#1:45: Q_BUF
>> 	[  269.021118] [16618] request#1:45: DQ_BUF
>> 	[  269.024118] [16613] request#1:45: release
>>
>> Then, getting everything related to the first request would be as simple as:
>>
>> 	$ dmesg|grep request#1:
>>
>> That will provide the PID for both processes: the one that
>> created/released and the one that queued/dequeued.
> 
> Ah, right; yes, then you can. It's still a bit more complicated as you have
> one more piece of information to follow (the ID) vs. just PID and FD. For
> instance, you can't grep for requests created by a given process. Note that
> you can still print the PID of the process that operates on the request
> through dyndbg.
> 
> I'd like to hear what Hans thinks.
> 

Frankly, I have no opinion :-)

I'm OK with changing this to:

snprintf(req->debug_str, sizeof(req->debug_str), "%u:%d", req_count++, fd);

if Sakari agrees with that.

Regards,

	Hans