Re: Linux system dead if cannot complete a memory read request

On Friday, 27 May 2011, 09:10:41, Ricardo Martínez wrote:
> Dear all,
>
> I'm trying to read a memory-mapped register in a PCIe peripheral. If that
> peripheral is busy doing any kind of work and cannot receive the Memory Read
> Request from the CPU, what is likely to happen? That packet is buffered
> at level 2 in that peripheral, but the TLP is not processed and not
> answered.
> I mean that a read operation of a PCI memory mapped address is supposed to
> be atomic from the CPU point of view. Am I correct? Then, if that atomic
> operation cannot end, will system crash?

Well, it depends. I am not absolutely sure on what, but in my experience it
comes down to the BIOS or the driver. In most cases your read will return -1
(i.e. all bits set). I have seen machines that raise a machine check exception
(MCE) or fail in other ways instead (especially under Windows). This situation
must therefore IMHO be avoided under all sane circumstances.
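
To illustrate the all-bits-set case, here is a minimal sketch in C of a read
wrapper that treats 0xFFFFFFFF as a probable failed read. The names
mmio_read_looks_failed() and checked_read32() are hypothetical, and the plain
pointer dereference stands in for whatever accessor your driver really uses
(readl() or ioread32() in a kernel driver). It also assumes the register can
never legitimately hold all ones, which you have to check for your own device.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: an all-ones value is what many hosts return
 * when a non-posted read never gets its completion. */
static bool mmio_read_looks_failed(uint32_t val)
{
    return val == UINT32_MAX;
}

/* Hypothetical wrapper around a 32-bit register read.
 * Returns 0 and stores the value in *out on success, or -1 if the
 * value looks like a failed read (*out is left untouched then).
 * Assumption: the register never legitimately reads as 0xFFFFFFFF. */
static int checked_read32(const volatile uint32_t *reg, uint32_t *out)
{
    uint32_t val = *reg; /* stand-in for readl()/ioread32() */

    if (mmio_read_looks_failed(val))
        return -1;

    *out = val;
    return 0;
}
```

This only helps on boards that hand back -1; it does nothing on a machine
that raises an MCE or hangs, so it is a diagnostic aid and not a fix.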

> In this concrete scenario, my Linux system goes down without any log
> messages, it is a quick and quiet death. Is this the expected behaviour for
> you?

As above: it depends. Try using a different mainboard for debugging your FPGA
device. There is a good chance you will just see -1 as the return value of the
read request. The majority of boards I have seen so far handle this just fine.

> Could you please give me any hints, any ideas about how to work around this
> situation? Or about how I can debug further? Does the Linux kernel have any
> compile-time debug option for this kind of PCI problem?

There are credit (flow control) mechanisms in the PCIe spec, and there is also
a maximum time allowed for answering a Read Request (the completion timeout).
You should look at both and also implement some sort of priority mechanism in
the scheduling in your FPGA. Basically, Completions should always be sent
first.
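
As a rough software model of that priority rule (the real thing would live in
your HDL, not in C), the sketch below picks the next outbound TLP with strict
priority for Completions. All names here (tlp_kind, tx_state, next_tlp) are
made up for illustration, and flow-control credits are left out entirely.

```c
#include <stddef.h>

/* Hypothetical classes of outbound traffic in the device. */
enum tlp_kind { TLP_NONE, TLP_COMPLETION, TLP_POSTED_WRITE };

/* Hypothetical transmit state: how many TLPs of each class are waiting. */
struct tx_state {
    size_t pending_completions; /* answers to the host's read requests */
    size_t pending_writes;      /* device-initiated outbound data */
};

/* Strict-priority arbiter: a waiting Completion always wins, so bulk
 * outbound traffic can never starve the answer to a host read. */
static enum tlp_kind next_tlp(struct tx_state *s)
{
    if (s->pending_completions > 0) {
        s->pending_completions--;
        return TLP_COMPLETION;
    }
    if (s->pending_writes > 0) {
        s->pending_writes--;
        return TLP_POSTED_WRITE;
    }
    return TLP_NONE;
}
```

With this ordering, a Completion that becomes pending in the middle of a long
burst of writes goes out at the next TLP boundary instead of waiting for the
burst to finish.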

Just to give you some numbers: I have worked with PCIe x4 FPGA devices using
Xilinx Spartan FPGAs (some 4000* stuff) that had outbound transfers of
~900 MB/s with parallel reads working just fine. So if your device can't handle
it, you need to fix your FPGA design. Do Completions first. Always. Otherwise
you are asking for serious, hard-to-debug trouble.

Eike
