Re: [PATCH 1/3] drm/radeon: stop poisoning the GART TLB

On 27.06.2014 10:59, Michel Dänzer wrote:
On 27.06.2014 17:26, Christian König wrote:
On 27.06.2014 04:31, Michel Dänzer wrote:
On 25.06.2014 12:59, Michel Dänzer wrote:
With these patches, 3.15 just survived two piglit runs on my Bonaire,
one with the GART poisoning fix and one without. It never survived a
single run before.

Acked-and-Tested-by: Michel Dänzer <michel.daenzer@xxxxxxx>
So, are these patches going to 3.16 and 3.15?
We could send them in for 3.15,
What's the alternative for 3.15?

Well, figuring out the real reason behind those lockups would be a good start :)

Looks like e.g. https://bugs.freedesktop.org/show_bug.cgi?id=80141 is
confirmed to be this.


but for 3.16 we have some new features that depend on the new code.

We could backport them to the old code, but I really want to work on
figuring out what's wrong with the new approach instead.

Going to prepare a branch for you to test over the weekend, would be
nice if you could give it a try on Monday and see if that fixes the
issues as well.
Sure, will do.

I've just pushed the branch testing-3.15 to git://people.freedesktop.org/~deathsimple/linux. It's based on 3.15.2 and contains the "stop poisoning the GART TLB" patch backported to 3.15, plus a couple of things that I would like to try.

I've disabled the redirection of page faults to the dummy page for now, so the system should lock up on the first page fault it encounters. Apart from that, the page directory and page tables are now completely over-allocated and over-aligned.

Setting the READABLE bit on invalid entries shouldn't have any effect other than making those entries nonzero. So please try to lock up your Bonaire with this branch, and as soon as you encounter the first page fault, take a look at VM_CONTEXT1_PROTECTION_FAULT_STATUS and figure out which VMID caused the lockup.
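
A minimal Python sketch for pulling the VMID out of a raw VM_CONTEXT1_PROTECTION_FAULT_STATUS value could look like the following. The field positions are assumptions recalled from the CIK register definitions in the radeon driver (cikd.h) and should be checked against the tree you are running; decode_fault_status is a hypothetical helper name, not part of any existing tool.

```python
# Assumed CIK field layout of VM_CONTEXT1_PROTECTION_FAULT_STATUS.
# These masks and shifts are NOT taken from this mail; verify them
# against cikd.h before relying on the decoded values.
PROTECTIONS_MASK = 0xF << 0
PROTECTIONS_SHIFT = 0
MEMORY_CLIENT_ID_MASK = 0xFF << 12
MEMORY_CLIENT_ID_SHIFT = 12
MEMORY_CLIENT_RW_MASK = 1 << 24
FAULTY_VMID_MASK = 0xF << 25
FAULTY_VMID_SHIFT = 25


def decode_fault_status(status):
    """Split a raw fault status value into its (assumed) fields."""
    return {
        # VMID whose access triggered the fault
        "vmid": (status & FAULTY_VMID_MASK) >> FAULTY_VMID_SHIFT,
        # True if the faulting access was a write, False for a read
        "write": bool(status & MEMORY_CLIENT_RW_MASK),
        # ID of the memory client that issued the access
        "client_id": (status & MEMORY_CLIENT_ID_MASK) >> MEMORY_CLIENT_ID_SHIFT,
        # Protection fault bits
        "protections": (status & PROTECTIONS_MASK) >> PROTECTIONS_SHIFT,
    }


if __name__ == "__main__":
    # Made-up status value, only to show the call.
    print(decode_fault_status(0x03044004))
```

The "vmid" field is the number to pass on to the dump script.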

Then use the attached script to dump the complete page directory and page tables of the VMID in question, e.g. "./dump_vm.sh 1" if the lockup was caused by VMID 1. Make sure you've got a radeontool that supports CIK, otherwise it will only return zeros for the page directory address.

Since even the invalid page table entries should now have at least the READABLE bit set, there shouldn't be anything zero in this dump. Look out for anything else suspicious as well (0xdeadbeef etc.).
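
Checking a large dump by eye is tedious, so here is a rough Python sketch of that check. The output format of dump_vm.sh isn't shown in this mail, so the sketch assumes the entries have already been parsed into a list of 64-bit integers; find_suspicious and POISON_VALUES are hypothetical names introduced only for illustration.

```python
MASK64 = (1 << 64) - 1

# Poison patterns to look for; only 0xdeadbeef is named in the mail.
POISON_VALUES = (0xDEADBEEF,)


def find_suspicious(entries):
    """Return (index, value, reason) for every zero or poisoned entry.

    `entries` is assumed to be an iterable of 64-bit page directory or
    page table entries, in the order they appear in the dump.
    """
    hits = []
    for index, entry in enumerate(entries):
        entry &= MASK64
        low, high = entry & 0xFFFFFFFF, entry >> 32
        if entry == 0:
            # With READABLE set on invalid entries, nothing should be zero.
            hits.append((index, entry, "zero"))
        elif low in POISON_VALUES or high in POISON_VALUES:
            # Either 32-bit half matches a known poison pattern.
            hits.append((index, entry, "poison"))
    return hits


if __name__ == "__main__":
    # Made-up entries, only to show the call.
    for index, value, reason in find_suspicious([0x21, 0x0, 0xDEADBEEF]):
        print(f"entry {index}: {value:#018x} ({reason})")
```

The check looks at both 32-bit halves of each entry because a 32-bit poison pattern could land in either half of a 64-bit entry.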

Thanks for the help,
Christian.

Attachment: dump_vm.sh
Description: application/shellscript
