Hello,
On 18.06.21 at 11:16, Marc Kleine-Budde wrote:
> On 17.06.2021 14:22:03, Harald Mommer wrote:
> > we are currently in the process of developing a draft specification for
> > Virtio CAN. In the scope of this work I am developing a Virtio CAN Linux
> > driver and a Virtio CAN Linux device
> Oh that sounds interesting. Please keep the linux-can mailing list in
> the loop. Do you have a first draft version for review, yet?
The first draft went to virtio-comment@xxxxxxxxxxxxxxxxxxxx and
virtio-dev@xxxxxxxxxxxxxxxxxxxx.
https://markmail.org/search/?q=virtio-can&q=list%3Aorg.oasis-open.lists.virtio-comment#query:virtio-can%20list%3Aorg.oasis-open.lists.virtio-comment+page:1+mid:hdxj35fsthypllkt+state:results
The link should show the short conversation. I am currently working on
the next draft, which incorporates the review comments I have received so
far and will also address the "TX ACK" problem we are discussing here.
In the future I will keep the linux-can list in the loop.
> > running on top of our hypervisor solution.
> > The Virtio CAN Linux device forwards an existing SocketCAN CAN device
> > (currently vcan) via Virtio to the Virtio driver guest so that the virtual
> > driver guest can send and receive CAN frames via SocketCAN.
> > What was originally planned (probably with too much AUTOSAR CAN driver
> > semantics in my head and too little SocketCAN knowledge) is to mark a
> > transmission request as used (done) when it's finally sent on the CAN bus
> > (vs. when it's given to SocketCAN, not really done but still pending
> > somewhere in the protocol stack).
> Makes sense.
I read the "Makes sense". But reading the rest of the e-mail (and the
thread) as well, I get the impression that making this timing requirement
mandatory when SocketCAN is used is asking for trouble. I see two options:
- Remove the timing requirement. This is the easy solution. But
there is the "Makes sense".
- Make the original strict timing requirement optional, so that it is no
longer mandatory.
The second is my favorite (but I tend to over-engineer on the first
attempt, so the first option may indeed be the better one).
Dropping this timing behavior implies that some other things in the next
virtio draft spec have to be changed, which in this case means
simplified.
> > Thought this was doable with some implementation effort using
> > setsockopt(..., SOL_CAN_RAW, CAN_RAW_RECV_OWN_MSGS, ...) and evaluating the
> > MSG_CONFIRM bit on received messages.
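For illustration, a minimal sketch of this idea in Python (chosen only for
brevity, it is not the device application's code). It assumes a Linux system
with SocketCAN. The helper names open_tx_echo_socket, is_tx_confirmation,
unpack_frame and receive_one are made up for this sketch, and the numeric
fallbacks are the Linux header values in case the socket module does not
export the constants. According to the SocketCAN documentation, MSG_CONFIRM is
set in msg_flags when a frame is received on the socket it was sent on, and it
can only be read as a transmission confirmation if the CAN driver echoes
frames at driver level.

```python
import socket
import struct

# Linux header values, used as fallbacks if the socket module lacks the names.
SOL_CAN_RAW = getattr(socket, "SOL_CAN_RAW", 101)
CAN_RAW_RECV_OWN_MSGS = getattr(socket, "CAN_RAW_RECV_OWN_MSGS", 4)
MSG_CONFIRM = getattr(socket, "MSG_CONFIRM", 0x800)

# Classic struct can_frame: can_id (u32), length (u8), 3 pad bytes, data[8].
CAN_FRAME_FMT = "=IB3x8s"
CAN_FRAME_SIZE = struct.calcsize(CAN_FRAME_FMT)


def open_tx_echo_socket(ifname):
    """Open a CAN_RAW socket that also receives its own sent frames (Linux only)."""
    sock = socket.socket(socket.AF_CAN, socket.SOCK_RAW, socket.CAN_RAW)
    sock.setsockopt(SOL_CAN_RAW, CAN_RAW_RECV_OWN_MSGS, 1)
    sock.bind((ifname,))
    return sock


def is_tx_confirmation(msg_flags):
    """True if msg_flags from recvmsg() marks a frame sent via this same socket."""
    return bool(msg_flags & MSG_CONFIRM)


def unpack_frame(raw):
    """Split a raw classic CAN frame into (can_id, payload)."""
    can_id, length, data = struct.unpack(CAN_FRAME_FMT, raw)
    return can_id, data[:length]


def receive_one(sock):
    """Receive one frame and report whether it is an echo of our own TX."""
    raw, _ancdata, msg_flags, _addr = sock.recvmsg(CAN_FRAME_SIZE)
    can_id, data = unpack_frame(raw)
    return can_id, data, is_tx_confirmation(msg_flags)
```

A device could then mark a TX request as used only once receive_one() reports
a confirmation for the matching frame. How frames are matched to pending
requests is left open here.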
> Where does that code run? Would that be part of qemu running on the host
> of an open source solution?
The device application is closed source and runs under the COQOS
hypervisor, which is also closed source. A qemu device implementation is
not planned as of now. The virtio CAN driver is a Linux device driver and
will be open sourced at some point, in the hope of getting it upstreamed
in the more distant future. Currently the driver lives on an internal
development branch that outsiders cannot see (still better for everyone),
and colleagues are reviewing it and helping to bring it into an
acceptable shape.
> Can you sketch a quick block diagram showing guest, host, Virtio device,
> Virtio driver, etc...
I hope this arrives on the list as it was sent and not garbled:
            Guest 2             |     Guest 3
----------------                | ----------------
! cangen,      !                | ! cangen,      !
! candump,     !                | ! candump,     !
! cansend      !                | ! cansend      !
! using vcan0  !                | ! using can0   !
----------------                | ----------------
   ^                            |             ^
   !  ---------------------     |             !
   !  ! Service process   !     |             !
   !  ! in user space     !     |             !
   !  ! virtio-can device !     |             !
   !  ! forwarding vcan0  !     |             !
   !  ---------------------     |             !
   !     ^             ^        |             !
   !     !             !        |             !
--------------------------------------------------
   !     ! Device side ! kernel | Driver side ! kernel
   v     v             v        |             v
---------------- -------------- | ----------------
! Device Linux ! ! HV support ! | ! Driver Linux !
! VCan         ! ! module     ! | ! Virtio CAN   !
! vcan0        ! ! on device  ! | ! can0         !
!              ! ! side       ! | !              !
---------------- -------------- | ----------------
       ^               ^        |             ^
       !               !        |             !
--------------------------------------------------
       !               !                      ! Hypervisor
       v               v                      v
--------------------------------------------------
!                    COQOS-HV                    !
--------------------------------------------------
> > This works fine with
> >   cangen -g 0 -i can0
> > on the driver side sending CAN messages to the device guest. No confirmation
> > is lost testing for several minutes.
> Where's the driver side? On the host or the guest?
Both sides are guests of the hypervisor in our architecture. There is no
host in this sense: COQOS-HV is a type 1 hypervisor. The hypervisor does
not provide devices directly on its own; devices are provided with the
support of a device (provider) guest, which is itself only a guest of
the hypervisor.
> Have you activated SO_RXQ_OVFL?
> With recvmsg() you get the number of dropped messages in the socket.
> Have a look at:
> https://github.com/linux-can/can-utils/blob/master/cansequence.c
I had no idea about SO_RXQ_OVFL. It looks useful for implementing an
emergency recovery mechanism so as not to get stuck: if a loss of
received frames is detected, the controller is still active and TX
messages have been pending for too long, then the pending TX messages
are marked as used (done) to cope with the situation and not stay stuck
(for too long). This might be acceptable if it is something that normally
does not happen except in really exceptional situations.
Nothing that should be done now; it gets far too complicated for a first
shot at implementing a Virtio CAN device.
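As a note for later, a rough Python sketch of how the drop counter could be
read and how the recovery decision might look. It assumes Linux. SO_RXQ_OVFL
falls back to 40 (the asm-generic value, some architectures differ) if the
socket module does not export it. The function names, the pending dictionary
and the timeout are hypothetical and only illustrate the idea described
above.

```python
import socket
import struct

# Assumption: 40 is the asm-generic Linux value; a few architectures differ.
SO_RXQ_OVFL = getattr(socket, "SO_RXQ_OVFL", 40)


def enable_drop_counter(sock):
    """Ask the kernel to attach the RX queue overflow counter to received frames."""
    sock.setsockopt(socket.SOL_SOCKET, SO_RXQ_OVFL, 1)


def dropped_frames(ancdata):
    """Return the drop counter found in recvmsg() ancillary data, or None."""
    for level, ctype, cdata in ancdata:
        if level == socket.SOL_SOCKET and ctype == SO_RXQ_OVFL:
            return struct.unpack("=I", cdata[:4])[0]
    return None


def receive_with_drops(sock, frame_size=16):
    """Receive one raw frame together with msg_flags and the drop counter."""
    raw, ancdata, msg_flags, _addr = sock.recvmsg(frame_size, socket.CMSG_SPACE(4))
    return raw, msg_flags, dropped_frames(ancdata)


def stale_tx_requests(pending, now, drops_seen, controller_active, timeout_s):
    """Pick pending TX requests to give up on and mark as used.

    pending maps a request id to the time it was handed to SocketCAN.
    Requests are only released if frames were lost and the controller is
    still active, so a missing confirmation is the likely explanation.
    """
    if not (drops_seen and controller_active):
        return []
    return [req for req, sent_at in pending.items() if now - sent_at > timeout_s]
```

The counter is cumulative per socket, so a caller would compare it with the
previously seen value to decide whether new frames were lost.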
> We don't have a feature flag to query if the Linux driver supports proper
> CAN echo on TX complete notification.
Not so nice. But the device integrator should know which backend is used,
and with a command line option for the device application the issue can
be handled. I need the command line switch now anyway to do experiments.
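Such a switch could look roughly like the following Python sketch. The
option name --tx-ack, its two values and the default are invented for
illustration; the real device application may use something entirely
different.

```python
import argparse


def parse_args(argv=None):
    """Parse the (hypothetical) command line of a virtio-can device application."""
    parser = argparse.ArgumentParser(description="virtio-can device (sketch)")
    parser.add_argument(
        "--tx-ack",
        choices=["immediate", "on-echo"],
        default="immediate",
        help="mark a TX request as used when it is handed to SocketCAN "
             "(immediate) or when the echoed frame is confirmed (on-echo)",
    )
    return parser.parse_args(argv)
```

The integrator would pass --tx-ack on-echo only for backends whose driver is
known to echo frames on TX completion.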
Regards
Harald
--
Dipl.-Ing. Harald Mommer
Senior Software Engineer
OpenSynergy GmbH
Rotherstr. 20, 10245 Berlin
Phone: +49 (30) 60 98 540-0 <== switchboard
Fax: +49 (30) 60 98 540-99
E-Mail:harald.mommer@xxxxxxxxxxxxxxx
www.opensynergy.com
Commercial register: Amtsgericht Charlottenburg, HRB 108616B
Geschäftsführer/Managing Director: Regis Adjamah