Re: j1939: control messages and PGN

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, May 28, 2019 at 04:48:03PM +0200, Kurt Van Dijck wrote:
> On di, 28 mei 2019 16:27:57 +0200, David Jander wrote:
> > On Tue, 28 May 2019 15:13:44 +0200
> > Kurt Van Dijck <dev.kurt@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> > 
> > > On di, 28 mei 2019 13:54:35 +0200, Oleksij Rempel wrote:
> > > > Hi all,
> > > > 
> > > > when receiving j1939 control messages, the current code looks up the
> > > > session by DA and SA only, not taking the PGN (which is part of the
> > > > control messages' data) into account.
> > > > 
> > > > When it comes to error control messages the session is aborted, even if
> > > > the PGN doesn't match. In EOMA the session is aborted, too. This means
> > > > receiving control messages with non matching PGNs lead to session abort.
> > > > 
> > > > Is this in general a good behavior?
> > 
> > Tl;DR: Yes, if PGN does not match the (E)TP session should be aborted.
> > 
> > > It is indeed a bit stupid.
> > > 
> > > If 2 ends are talking to each other, and 1 of those 2 talks about
> > > something else, that implies that you talk not about the same thing
> > > either, and you probably want to abort soon. It would be better if
> > > you only abort 'probably soon' and not 'immediately' in such case, since
> > > you're right that reception of another PGN control frame does not imply
> > > that you're current session became invalid.
> > 
> > If the session is with me and I see conflicting PGN (not start of a new
> > session), why not send an abort immediately?
> > 
> > > In j1939 however, the data part does not carry PGN info, since only 1
> > > session can be open. This implies 2 things:
> > > * Ignoring PGN difference in control frames makes you blind to the data
> > >   consistency, so you may think in certain cases that you continue to receive
> > >   data while it's actually the data that belongs to another PGN.
> > 
> > The sender cannot start sending data without the receiver acknowledging that
> > with a CTS message first. If the CTS contains a different PGN than the RTS,
> > then the sender should abort immediately.
> > For ETP, if the receiver sees a DPO frame with a different PGN, it should also
> > send an abort for that PGN immediately.
> > 
> > > * If node A talks to node B on a different PGN that node B thinks that
> > >   node A is talking, then this is AFAIK considered as protocol
> > >   violation because you risk data corruption.
> > > 
> > > The PGN in all-but-1st control frame could be considered redundant, but
> > > since it's there, it should match.
> > 
> > Ack.
> > 
> > > So, it's still not a good behaviour, but j1939 IMHO requires you to do so.
> > > 
> > > So you think this is bad, let's make it even worse :-)
> > > Between 2 nodes, actually 2 sessions may exist, 1 recv & 1 send.
> > 
> > Actually four in this case: ETPrx, TPrx, ETPtx and ETPrx, right?
> 
> Right, I tend to forget ETP for simplicity :-)
> > 
> > > Still, control frames that to RTS, CTS, DPO, ... are uni-directional,
> > > i.e. they map to only 1 of those 2 sessions exclusively.
> > > This is not the case for an abort message.
> > > _If I'm not mistaken_, the PGN info should be ignored for abort frames,
> > > since it may be unclear what exactly you abort: a old PGN, or a newly
> > > requested PGN. And due to that, it's also unclear if it applies to the
> > > send or recv path, so you abort, AFAIK, both directions at once.
> > > But I have not the specifications around now, I can't verify.
> > 
> > The abort message also contains the PGN of the packeted message, so AFAICS, you
> > can abort any one specific of the 4 theoretically simultaneous sessions,
> > because the should have different PGN's for the different directions (rx/tx).
> > 
> > That's probably one of the reasons why there is always a different PGN used to
> > talk in one direction than in the other.
> I see.
> I did not implement that very nice, I think.

Let's take some example to make me better understand all possible
scenarios.
We have node 0x80 and node 0x90:
- 0x80 is transmitting data to the 0x90 with RTS PGN 0x12300
- if 0x90 get control signal from 0x80 (DPO) with PGN 0x13300, 0x90 should
  send abort message to 0x13300 and cancel currently running 0x80->0x90,0x12300
  session.

- if 0x80 get control signal from 0x90 (CTS, EOMA) with PGN 0x13300, 0x80 should
  send abort message to 0x90 0x13300 and cancel currently running 0x80->0x90,0x12300
  session.

- if 0x80 and 0x90 will get abort signal for 0x80->0x90,0x13300, which
  was send by 0x80 or evil third ECU, currently running 0x80->0x90,0x12300
  session should not be aborted.
 
Correct?

  What is about not related  control signals. For 0x90 - CTS, EOMA; and
  for 0x80 - DPO?
  I ask because this stack has loop back design, so 0x90 and 0x80 will get own
  signals as well.

I can imagine at least some reason why we can get wrong signals:
- address conflicts (multiple ECUs configured with same address)
- buggy software 
- some CAN bus issues
- malicious attempts to exploit ECU remotely. 

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |



[Index of Archives]     [Automotive Discussions]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]     [CAN Bus]

  Powered by Linux