Hi Linus, please pull from:

  git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm libnvdimm-fixes

...to receive several fixes to the DSM (ACPI device specific method)
marshaling implementation.

I consider these urgent enough to send for 4.9 consideration since they
fix the kernel's handling of ARS (Address Range Scrub) commands.
Especially for platforms without machine-check-recovery capabilities,
successful execution of ARS commands enables the platform to potentially
break out of an infinite reboot problem if a media error is present in
the boot path. There is also a one-line fix for a device-dax read-only
mapping regression.

"acpi, nfit: fix extended status translations for ACPI DSMs" and
"device-dax: fix private mapping restriction, permit read-only" are true
regression fixes for changes introduced this cycle.

"acpi, nfit, libnvdimm: fix / harden ars_status output length handling"
fixes the kernel's handling of zero-length results. This never would
have worked in the past, but we only just recently discovered a BIOS
implementation that emits this arguably spec non-compliant result.

The remaining two commits are additional fallout from thinking through
the implications of a zero / truncated length result of the ARS Status
command.

In order to mitigate the risk that these changes introduce yet more
regressions, they are backstopped by a new unit test in
"tools/testing/nvdimm: unit test acpi_nfit_ctl()" that mocks inputs to
acpi_nfit_ctl().

Please consider pulling for 4.9; it has appeared in a -next release with
no reported issues.
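For readers who want the zero / truncated length case made concrete, here
is a minimal userspace sketch. It is not the code in
drivers/acpi/nfit/core.c. It assumes a simplified ARS Status layout of a
fixed header followed by fixed-size error records. HYP_HDR_BYTES,
HYP_REC_BYTES, and hyp_usable_records() are hypothetical names and sizes
chosen only for illustration. A reported length of 0, 4, or 8 (or anything
else too short for the header) yields no payload, and a trailing partial
record is never counted.

```c
#include <stdint.h>

/* Illustrative sizes only: assumptions for this sketch, not spec text. */
#define HYP_HDR_BYTES 48u /* status, length, and range fields before records */
#define HYP_REC_BYTES 24u /* one error record */

/*
 * Return how many whole records may safely be consumed, given the
 * output length the firmware reported and the record count it claims.
 */
static uint32_t hyp_usable_records(uint32_t out_length, uint32_t claimed)
{
	uint32_t fit;

	/* 0, 4, 8, or anything short of a full header: no payload. */
	if (out_length < HYP_HDR_BYTES)
		return 0;

	/* Integer division drops a trailing partial record. */
	fit = (out_length - HYP_HDR_BYTES) / HYP_REC_BYTES;

	/* Never trust a claimed count beyond what the buffer holds. */
	return claimed < fit ? claimed : fit;
}
```

With these assumed sizes, hyp_usable_records(8, 3) is 0, and a buffer
holding the header plus one and a half records reports one usable record.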
The following changes since commit 3e5de27e940d00d8d504dfb96625fb654f641509:

  Linux 4.9-rc8 (2016-12-04 12:50:51 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm libnvdimm-fixes

for you to fetch changes up to 325896ffdf90f7cbd59fb873b7ba20d60d1ddf3c:

  device-dax: fix private mapping restriction, permit read-only (2016-12-06 17:42:37 -0800)

----------------------------------------------------------------
Dan Williams (5):
      acpi, nfit, libnvdimm: fix / harden ars_status output length handling
      acpi, nfit: validate ars_status output buffer size
      acpi, nfit: fix bus vs dimm confusion in xlat_status
      tools/testing/nvdimm: unit test acpi_nfit_ctl()
      device-dax: fix private mapping restriction, permit read-only

Vishal Verma (1):
      acpi, nfit: fix extended status translations for ACPI DSMs

 drivers/acpi/nfit/core.c              |  55 +++++---
 drivers/acpi/nfit/nfit.h              |   2 +
 drivers/dax/dax.c                     |   2 +-
 drivers/nvdimm/bus.c                  |  25 +++-
 include/linux/libnvdimm.h             |   2 +-
 tools/testing/nvdimm/Kbuild           |   1 +
 tools/testing/nvdimm/test/iomap.c     |  23 +++-
 tools/testing/nvdimm/test/nfit.c      | 236 +++++++++++++++++++++++++++++++++-
 tools/testing/nvdimm/test/nfit_test.h |   8 +-
 9 files changed, 326 insertions(+), 28 deletions(-)

commit 9a901f5495e26e691c7d0ea7b6057a2f3e6330ed
Author: Vishal Verma <vishal.l.verma@intel.com>
Date:   Mon Dec 5 17:00:37 2016 -0700

    acpi, nfit: fix extended status translations for ACPI DSMs

    ACPI DSMs can have an 'extended' status which can be non-zero to convey
    additional information about the command. In the xlat_status routine,
    where we translate the command statuses, we were returning an error for
    a non-zero extended status, even if the primary status indicated success.

    Return from each command's 'case' once we have verified both its status
    and extended status are good.

    Cc: <stable@vger.kernel.org>
    Fixes: 11294d63ac91 ("nfit: fail DSMs that return non-zero status by default")
    Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
    Signed-off-by: Dan Williams <dan.j.williams@intel.com>

commit efda1b5d87cbc3d8816f94a3815b413f1868e10d
Author: Dan Williams <dan.j.williams@intel.com>
Date:   Tue Dec 6 09:10:12 2016 -0800

    acpi, nfit, libnvdimm: fix / harden ars_status output length handling

    Given ambiguities in the ACPI 6.1 definition of the "Output (Size)"
    field of the ARS (Address Range Scrub) Status command, a firmware
    implementation may in practice return 0, 4, or 8 to indicate that there
    is no output payload to process.

    The specification states "Size of Output Buffer in bytes, including this
    field.". However, 'Output Buffer' is also the name of the entire
    payload, and earlier in the specification it states "Max Query ARS
    Status Output Buffer Size: Maximum size of buffer (including the Status
    and Extended Status fields)".

    Without this fix if the BIOS happens to return 0 it causes memory
    corruption as evidenced by this result from the acpi_nfit_ctl() unit
    test.

     ars_status00000000: 00020000 00000000                    ........
     BUG: stack guard page was hit at ffffc90001750000 (stack is ffffc9000174c000..ffffc9000174ffff)
     kernel stack overflow (page fault): 0000 [#1] SMP DEBUG_PAGEALLOC
     task: ffff8803332d2ec0 task.stack: ffffc9000174c000
     RIP: 0010:[<ffffffff814cfe72>]  [<ffffffff814cfe72>] __memcpy+0x12/0x20
     RSP: 0018:ffffc9000174f9a8  EFLAGS: 00010246
     RAX: ffffc9000174fab8 RBX: 0000000000000000 RCX: 000000001fffff56
     RDX: 0000000000000000 RSI: ffff8803231f5a08 RDI: ffffc90001750000
     RBP: ffffc9000174fa88 R08: ffffc9000174fab0 R09: ffff8803231f54b8
     R10: 0000000000000008 R11: 0000000000000001 R12: 0000000000000000
     R13: 0000000000000000 R14: 0000000000000003 R15: ffff8803231f54a0
     FS:  00007f3a611af640(0000) GS:ffff88033ed00000(0000) knlGS:0000000000000000
     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     CR2: ffffc90001750000 CR3: 0000000325b20000 CR4: 00000000000406e0
     Stack:
      ffffffffa00bc60d 0000000000000008 ffffc90000000001 ffffc9000174faac
      0000000000000292 ffffffffa00c24e4 ffffffffa00c2914 0000000000000000
      0000000000000000 ffffffff00000003 ffff880331ae8ad0 0000000800000246
     Call Trace:
      [<ffffffffa00bc60d>] ? acpi_nfit_ctl+0x49d/0x750 [nfit]
      [<ffffffffa01f4fe0>] nfit_test_probe+0x670/0xb1b [nfit_test]

    Cc: <stable@vger.kernel.org>
    Fixes: 747ffe11b440 ("libnvdimm, tools/testing/nvdimm: fix 'ars_status' output buffer sizing")
    Signed-off-by: Dan Williams <dan.j.williams@intel.com>

commit 82aa37cf09867c5e2c0326649d570e5b25c1189a
Author: Dan Williams <dan.j.williams@intel.com>
Date:   Tue Dec 6 12:45:24 2016 -0800

    acpi, nfit: validate ars_status output buffer size

    If an ARS Status command returns truncated output, do not process
    partial records or otherwise consume non-status fields.

    Cc: <stable@vger.kernel.org>
    Fixes: 0caeef63e6d2 ("libnvdimm: Add a poison list and export badblocks")
    Signed-off-by: Dan Williams <dan.j.williams@intel.com>

commit d6eb270c57fef35798525004ddf2ac5dcdadd43b
Author: Dan Williams <dan.j.williams@intel.com>
Date:   Tue Dec 6 15:06:55 2016 -0800

    acpi, nfit: fix bus vs dimm confusion in xlat_status

    Given dimms and bus commands share the same command number space we need
    to be careful that we are translating status in the correct context.
    Otherwise we can, for example, fail an ND_CMD_GET_CONFIG_SIZE command
    because max_xfer is zero. It fails because that condition erroneously
    correlates with the 'cleared == 0' failure of ND_CMD_CLEAR_ERROR.

    Cc: <stable@vger.kernel.org>
    Fixes: aef253382266 ("libnvdimm, nfit: centralize command status translation")
    Signed-off-by: Dan Williams <dan.j.williams@intel.com>

commit a7de92dac9f0dbf01deb56fe1d661d7baac097e1
Author: Dan Williams <dan.j.williams@intel.com>
Date:   Mon Dec 5 13:43:25 2016 -0800

    tools/testing/nvdimm: unit test acpi_nfit_ctl()

    A recent flurry of bug discoveries in the nfit driver's DSM marshalling
    routine has highlighted the fact that we do not have unit test coverage
    for this routine. Add a self-test of acpi_nfit_ctl() routine before
    probing the "nfit_test.0" device. This mocks stimulus to acpi_nfit_ctl()
    and if any of the tests fail "nfit_test.0" will be unavailable causing
    the rest of the tests to not run / fail.

    This unit test will also be a place to land reproductions of quirky BIOS
    behavior discovered in the field and ensure the kernel does not regress
    against implementations it has seen in practice.

    Signed-off-by: Dan Williams <dan.j.williams@intel.com>

commit 325896ffdf90f7cbd59fb873b7ba20d60d1ddf3c
Author: Dan Williams <dan.j.williams@intel.com>
Date:   Tue Dec 6 17:03:35 2016 -0800

    device-dax: fix private mapping restriction, permit read-only

    Hugh notes in response to commit 4cb19355ea19 "device-dax: fail all
    private mapping attempts":

      "I think that is more restrictive than you intended: haven't tried,
       but I believe it rejects a PROT_READ, MAP_SHARED, O_RDONLY fd
       mmap, leaving no way to mmap /dev/dax without write permission to
       it."

    Indeed it does restrict read-only mappings, switch to checking
    VM_MAYSHARE, not VM_SHARED.

    Cc: <stable@vger.kernel.org>
    Cc: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: Pawel Lebioda <pawel.lebioda@intel.com>
    Fixes: 4cb19355ea19 ("device-dax: fail all private mapping attempts")
    Reported-by: Hugh Dickins <hughd@google.com>
    Signed-off-by: Dan Williams <dan.j.williams@intel.com>
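
As an illustration of why the device-dax check moves from VM_SHARED to
VM_MAYSHARE, here is a small userspace model. It is a sketch, not the code
in drivers/dax/dax.c. The MODEL_* constants and the model_vm_flags(),
old_check_rejects(), and new_check_rejects() helpers are hypothetical names
for this example. The flag values are copied from include/linux/mm.h as I
understand them, and the flag derivation mirrors, in simplified form, what
the mmap path does for a MAP_SHARED mapping of a file descriptor that was
not opened for writing.

```c
#include <stdbool.h>

/* Values believed to match include/linux/mm.h; copied so this is standalone. */
#define MODEL_VM_SHARED   0x08ul
#define MODEL_VM_MAYWRITE 0x20ul
#define MODEL_VM_MAYSHARE 0x80ul

/* Simplified model of how mmap derives the sharing-related vm_flags. */
static unsigned long model_vm_flags(bool map_shared, bool fd_writable)
{
	unsigned long flags = MODEL_VM_MAYWRITE;

	if (map_shared) {
		flags |= MODEL_VM_SHARED | MODEL_VM_MAYSHARE;
		/* A read-only fd can never be written, so VM_SHARED is dropped. */
		if (!fd_writable)
			flags &= ~(MODEL_VM_MAYWRITE | MODEL_VM_SHARED);
	}
	return flags;
}

/* Check before the fix: anything without VM_SHARED is treated as private. */
static bool old_check_rejects(unsigned long flags)
{
	return !(flags & MODEL_VM_SHARED);
}

/* Check after the fix: only mappings without VM_MAYSHARE are private. */
static bool new_check_rejects(unsigned long flags)
{
	return !(flags & MODEL_VM_MAYSHARE);
}
```

In this model a MAP_SHARED mapping of an O_RDONLY descriptor is rejected by
the old check and accepted by the new one, while a MAP_PRIVATE mapping is
rejected by both.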