Re: [PATCH] can: j1939: fix memory leak of skbs

Fedor Pchelkin <pchelkin@xxxxxxxxx> · Fri, 5 Aug 2022 11:56:18 +0300

Hi Oleksij,

On 29.07.2022 07:22, Oleksij Rempel wrote:
Initial issue can be reproduced by using real (slow) CAN with j1939cat[1]
tool. Both parts should be started to make sure the j1939_session_tx_dat() will
actually start using the queue. After pushing about 100K of data, application
will try to close the socket and exit. After socket is closed, all skb related
to this socket will be freed and j1939_session_tx_dat() will use freed skbs.

Ok, the patch I suggested was a kind of a guess, now I understand that
it breaks important logic.

On 29.07.2022 07:22, Oleksij Rempel wrote:
> This skb_get() is counter part of skb_unref()
> j1939_session_skb_drop_old().

However, we have a case [1] where j1939_session_skb_queue() is called
but the corresponding j1939_session_skb_drop_old() is not called and it
causes a memory leak.

I tried to investigate it a little bit: the thing is that
j1939_session_skb_drop_old() can be called only when processing
J1939_ETP_CMD_CTS. On the contrary, as I can see,
j1939_session_skb_queue() can be called independently from
J1939_ETP_CMD_CTS so the two functions obviously do not correspond to
each other.

In reproducer case there is a J1939_ETP_CMD_RTS processing, then
we send some messages (where j1939_session_skb_queue() is called) and
after that J1939_ETP_CMD_ABORT is processed and we lose those skbs.

[1] https://forge.ispras.ru/issues/11743