Re: [PATCH v3] ipc: Modify message queue accounting to not take kernel data structures into account

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 07/07/2015 09:01 AM, Michael Kerrisk (man-pages) wrote:
> Hi David,
> 
> On 7 July 2015 at 07:16, Davidlohr Bueso <dave@xxxxxxxxxxxx> wrote:
>> On Mon, 2015-07-06 at 17:49 +0200, Marcus Gelderie wrote:
>>> A while back, the message queue implementation in the kernel was
>>> improved to use btrees to speed up retrieval of messages (commit
>>> d6629859b36). The patch introducing the improved kernel handling of
>>> message queues (using btrees) has, as a by-product, changed the
>>> meaning of the QSIZE field in the pseudo-file created for the queue.
>>> Before, this field reflected the size of the user-data in the queue.
>>> Since, it also takes kernel data structures into account. For
>>> example, if 13 bytes of user data are in the queue, on my machine the
>>> file reports a size of 61 bytes.
>>>
>>> There was some discussion on this topic before (for example
>>> https://lkml.org/lkml/2014/10/1/115). Commenting on a th lkml, Michael
>>> Kerrisk gave the following background (https://lkml.org/lkml/2015/6/16/74):
>>>
>>>     The pseudofiles in the mqueue filesystem (usually mounted at
>>>     /dev/mqueue) expose fields with metadata describing a message
>>>     queue. One of these fields, QSIZE, as originally implemented,
>>>     showed the total number of bytes of user data in all messages in
>>>     the message queue, and this feature was documented from the
>>>     beginning in the mq_overview(7) page. In 3.5, some other (useful)
>>>     work happened to break the user-space API in a couple of places,
>>>     including the value exposed via QSIZE, which now includes a measure
>>>     of kernel overhead bytes for the queue, a figure that renders QSIZE
>>>     useless for its original purpose, since there's no way to deduce
>>>     the number of overhead bytes consumed by the implementation.
>>>     (The other user-space breakage was subsequently fixed.)
>>
>> Michael, this breakage was never finally documented in the manpage,
>> right?
> 
> Right.
> 
>> I took a look and there is no mention, but it was a quick look.
>> It's just that if this patch goes in, I'd hate ending up with something
>> like this in the manpage:
>>
>> as of 3.5
>>   <accounts for kernel overhead>
>>
>> as of 4.3
>>   <behavior reverted back to not include kernel overhead... *sigh*>
>>
>> If there are changes to be made to the manpage, it should probably be
>> posted with this patch, methinks.
> 
> I think something like the above will have to end up in the man page.
> The only thing I'd add is that the fix also has gone to -stable
> kernels. At least: I think this patch should also be marked for
> -stable. I'll take care of the man page updates as the patch goes
> through.
> 
>>> This patch removes the accounting of kernel data structures in the
>>> queue. Reporting the size of these data-structures in the QSIZE field
>>> was a breaking change (see Michael's comment above). Without the QSIZE
>>> field reporting the total size of user-data in the queue, there is no
>>> way to deduce this number.
>>>
>>> It should be noted that the resource limit RLIMIT_MSGQUEUE is counted
>>> against the worst-case size of the queue (in both the old and the new
>>> implementation). Therefore, the kernel overhead accounting in QSIZE is
>>> not necessary to help the user understand the limitations RLIMIT imposes
>>> on the processes.
>>
>> Also, I would suggest adding some comment in struct mqueue_inode_info
>> for future reference, ie:
>>
>> -       unsigned long qsize; /* size of queue in memory (sum of all msgs) */
>> +       /*
>> +        * Size of queue in memory (sum of all msgs). Accounts for
>> +        * only userspace overhead; ignoring any in-kernel rbtree nodes.
>> +        */
>> +       unsigned long qsize;
>>
>> But no big deal in any case.
>>
>> I think this is the right approach,
> 
> Me too.
> 
>> but would still like to know if Doug
>> has any concerns about it.
> 
> Looking on gmane, Doug does not appear to have been active on any
> lists since late May! Not sure why.

I responded yesterday in the v2 patch thread I believe.  In any case, I
think this patch is fine, and can go to stable.  This patch doesn't
actually change the math related to the rlimit checks (which is the main
thing I wanted to correct in my original patches), instead it corrects a
mistake I made.  At the time, I mistakenly thought that the qsize
included the current message data total + the struct msg_msg size total.
 It didn't, it was just the current user data total.  I added the rbtree
nodes in order to keep the total accurate but I shouldn't have added the
rbtree nodes to this total because none of the other kernel usage was
previously included.

Acked-by: Doug Ledford <dledford@xxxxxxxxxx>



-- 
Doug Ledford <dledford@xxxxxxxxxx>
              GPG KeyID: 0E572FDD


Attachment: signature.asc
Description: OpenPGP digital signature


[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux