Hi,

On Fri, Apr 24, 2020 at 07:46:26PM +0200, Martin Burnicki wrote:
> I came across this thread and want to let you know that I also have
> problems with the cx23885 driver on a Ryzen system.
>
> The only solution I found on the 'net that could make it work was to add
> a line
>
>   options cx23885 debug=7
>
> to the file /etc/modprobe.d/cx23885.conf

Have you tried:

  options cx23885 dma_reset_workaround=2

?

> However, this causes a *huge* number of debug messages, so I also run
> the command
>
>   rm -f /var/log/kern.log*
>
> in a daily cronjob. This works stable here for some months now.
>
> With lower debug levels the problem occurred less often, but it still
> occurred. Only with debug level 7 (at least) the driver runs stable over
> time here.
>
> In case somebody is interested in details of the system I'm running here:
> https://burnicki.net/public_html/martin/tmp/system-etails.txt

Not found, I'm afraid.

> The commit messages mentioned earlier in this thread are already pretty
> old (from ~2018 or so), and I'm running kernel 5.3 on my Ubuntu system,
> so I guess those commits are already in there, but the problem still occurs.

Those commits check for a particular PCI PID/VIDs; newer IDs could be
missing, if they are still broken.

> I'm not familiar with the video stuff, with the cx23885 driver, etc.,
> but I'm maintaining another kernel driver for different PCI cards and
> encountered similar problems as the cx23885 driver.
>
> The symptoms were that the driver worked stable for many years on all
> systems, but suddenly failed to work properly on systems with very new
> chips sets and/or CPUs (not only AMD Ryzen).
>
> It turned out that the problem was due to missing barriers when
> accessing memory mapped registers.
>
> In my original driver code (written many years ago) the driver accessed
> the memory mapped registers directly
>
>   val = *mem_addr
>   *mem_addr = val
>
> which worked without problems for a long time, so it looks like older
> CPUs/chipsets didn't do reordering which would have been inhibited by
> barriers.
>
> As said above, with recent versions of CPUs/chipsets this seems indeed
> to happen, but since I changed the driver code so that all access to
> memory mapped registers uses the specific kernel inline functions (which
> use barriers, AFAIK), all problems have vanished and my driver works
> fine with the latest CPUs and chipsets.
>
> So maybe somewhere in the cx23885 driver code a memory barrier may be
> missing, and depending on whether debugging is enabled, or not, accesses
> to the device are re-ordered, or not.
>
> This is just an idea, and maybe this is not the case here, but by chance
> someone who is familiar with the cx23885 code may have a look.

That does seem possible. Actually I think it would be very useful if you
could try and track down this issue, by replacing the various lines that
do some debug action with a memory barrier or nothing. That would tell
where the problem is.

Unless anyone has better ideas, of course.

Thanks,
Sean
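
For anyone following along, here is a rough userspace sketch of the
difference Martin describes between a plain dereference and an
accessor that adds ordering. It is only an analogue: fake_regs,
reg_read32_plain(), reg_read32() and reg_write32() are hypothetical
names, and C11 fences stand in for whatever the real accessors do.
In a kernel driver one would use the existing MMIO helpers such as
readl()/writel() or ioread32()/iowrite32() rather than anything like
this, and nothing here is taken from the cx23885 code.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a device's memory-mapped register window. */
static volatile uint32_t fake_regs[4];

/*
 * Plain access in the "val = *mem_addr" style. volatile keeps the
 * compiler from dropping the load, but nothing here orders it against
 * surrounding accesses to other memory.
 */
static inline uint32_t reg_read32_plain(const volatile uint32_t *addr)
{
	return *addr;
}

/*
 * Accessor-style read: do the load, then issue a full fence so that
 * later memory accesses are not moved ahead of it.
 */
static inline uint32_t reg_read32(const volatile uint32_t *addr)
{
	uint32_t val = *addr;

	atomic_thread_fence(memory_order_seq_cst);
	return val;
}

/*
 * Accessor-style write: issue a full fence first so that earlier
 * memory accesses (e.g. filling a descriptor) are not moved after
 * the register write that tells the device to go.
 */
static inline void reg_write32(volatile uint32_t *addr, uint32_t val)
{
	atomic_thread_fence(memory_order_seq_cst);
	*addr = val;
}
```

Swapping individual debug calls for a bare fence like the one above
(or for nothing) is the kind of bisection suggested in the reply; the
sketch only shows the shape of such an accessor, not where one might
be missing.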