On Tue, Jul 10, 2012 at 12:24 PM, Don Dutile <ddutile@xxxxxxxxxx> wrote:
> On 07/10/2012 01:12 PM, Alex wrote:
>>
>> I am trying to write a driver with custom mmap() function for PCIe
>> BAR, with the goal to make this BAR cacheable in the processor cache.
>> I am aware this is not the best way to achieve highest bandwidth and
>> that the order of writes is unpredictable (neither are the issues in
>> this case).
>>
>> The processor is Sandy Bridge i7, PCIe device is Altera Stratix IV dev.
>> board.
>>
>> First, I tried to do it on CentOS 5 (2.6.18). I changed the MTRR
>> settings to make sure the BAR is not within uncacheable MTRR and used
>> io_remap_pfn_range() with _PAGE_PCD and _PAGE_PWT bits cleared. Reads
>> worked as expected: reads returned correct values and second read to
>> the same address does not necessarily cause the read to go to PCIe
>> (read counter was checked in FPGA). However, the writes caused the
>> system to freeze and then reboot.
>>
>> Second, I tried to do it on CentOS 6 (2.6.32), which has PAT support.
>> The result is the same: reads work correctly, writes cause system
>> freeze and reboot. Interestingly, non-temporal/write-combining full
>> cache line writes (AVX/SSE) work as expected, i.e. they always go to
>> FPGA and FPGA observes full cache line writes, reads return correct
>> values afterwards. However, simple 64-bit writes still cause system
>> freeze/reboot.
>>
>> The message on the screen: Machine Check Exception: 5 Bank 5:
>> be2000000003110a.
>>
>> Third, I also tried to ioremap_cache() and then iowrite32() inside the
>> driver code. The result is the same.
>>
>>
>> I also tried to do the same thing on 2-socket Sandy Bridge (Romley):
>> reads and non-temporal write behavior is the same, simple writes do
>> not cause MCE/crash but have no effect on system state, i.e. value in
>> memory does not change.
>>
>> Also, I tried the same code on older 2-socket Nehalem system: simple
>> writes also cause MCE, although the codes are different.
>>
>>
>> I think it is a hardware issue but I would appreciate if somebody can
>> share any ideas about what's going on.
>> Alex
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
> Once moving the registers into cacheable address space, your
> device may be getting a full cache block write with varying byte masks
> set. Does your FPGA handle such a large write packet with varying byte
> masks? or does it cause a PCIe error that gets translated into the
> Machine-checks you're seeing under various write cases?
> i.e., even an iowrite32() will write an entire cache block with
> a large number of byte mask bits not set.
>

The FPGA does handle full and partial line writes correctly. In fact, the
FPGA setup was initially tested with uncached and write-combining BAR
mappings, and both full and partial line (with byte mask) writes were
extensively tested.
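
For reference, the status value in the MCE message quoted above can be
split into its architectural fields. The following is only a sketch: it
assumes the printed value is a raw IA32_MCi_STATUS register and uses the
bit positions from the Intel SDM. The names mci_status, mci_bits,
mci_decode and mci_print are made up for illustration and are not kernel
APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Architectural fields of an IA32_MCi_STATUS value (hypothetical struct). */
struct mci_status {
    bool val;            /* bit 63: register contents are valid */
    bool over;           /* bit 62: an earlier error was overwritten */
    bool uc;             /* bit 61: uncorrected error */
    bool en;             /* bit 60: error reporting was enabled */
    bool miscv;          /* bit 59: IA32_MCi_MISC is valid */
    bool addrv;          /* bit 58: IA32_MCi_ADDR is valid */
    bool pcc;            /* bit 57: processor context corrupt */
    uint16_t model_code; /* bits 31:16: model-specific error code */
    uint16_t mca_code;   /* bits 15:0: architectural MCA error code */
};

/* Extract bits hi..lo (inclusive) of v; field widths below 64 only. */
uint64_t mci_bits(uint64_t v, unsigned hi, unsigned lo)
{
    assert(hi >= lo && hi < 64 && hi - lo < 63);
    return (v >> lo) & ((1ULL << (hi - lo + 1)) - 1);
}

/* Split a raw status word into the fields declared above. */
struct mci_status mci_decode(uint64_t status)
{
    struct mci_status s;

    s.val = mci_bits(status, 63, 63) != 0;
    s.over = mci_bits(status, 62, 62) != 0;
    s.uc = mci_bits(status, 61, 61) != 0;
    s.en = mci_bits(status, 60, 60) != 0;
    s.miscv = mci_bits(status, 59, 59) != 0;
    s.addrv = mci_bits(status, 58, 58) != 0;
    s.pcc = mci_bits(status, 57, 57) != 0;
    s.model_code = (uint16_t)mci_bits(status, 31, 16);
    s.mca_code = (uint16_t)mci_bits(status, 15, 0);
    return s;
}

/* Print the decoded fields on one line. */
void mci_print(struct mci_status s)
{
    printf("VAL=%d OVER=%d UC=%d EN=%d MISCV=%d ADDRV=%d PCC=%d "
           "model=0x%04x mca=0x%04x\n",
           s.val, s.over, s.uc, s.en, s.miscv, s.addrv, s.pcc,
           (unsigned)s.model_code, (unsigned)s.mca_code);
}
```

Calling mci_print(mci_decode(0xbe2000000003110aULL)) on the value from the
report shows VAL, UC, EN, MISCV, ADDRV and PCC set, OVER clear, MCA error
code 0x110a and model-specific code 0x0003. What those codes mean for this
particular bank is model-specific and is not decoded here.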