On Mon, Oct 05, 2015 at 03:18:05PM +0200, Paolo Bonzini wrote: > > > On 05/10/2015 14:50, Peter Maydell wrote: > > If you want to try to support "firmware might also be reading > > fw_cfg at the same time as the kernel" this is a (painful) > > problem regardless of how the kernel figures out whether a > > fw_cfg device is present. I had assumed that one of the design > > assumptions of this series was that firmware would only > > read the fw_cfg before booting the guest kernel and never touch > > it afterwards. If it might touch it later then letting the > > guest kernel also mess with fw_cfg seems like a really bad idea. > > The idea of tinkering with fw_cfg from the AML code (DSDT/SSDT) has been > proposed many times, and always dropped. One of the reasons was that > the OS could have a driver for fw_cfg. > > So I think that we can define the QEMU0002 id as owned by the OSPM, > similar to the various standard ACPI ids that are usually found in the > x86 world (e.g. PNP0B00 is a mc146818 RTC, PNP0303 is an 8042 keyboard > controller, PNP0501 is a 16550 or similar UART, and so on). This > basically sanctions _CRS as the way to pass information from the > firmware to the OSPM, also similarly to those standard PNP ids. OK, so I don't expect to go the "pure ACPI" route in the final version, mainly because I'm still hoping to fill the requirement of writing a driver which can use either ACPI or DT to detect the presence of fw_cfg (how I'm going to compile this on kernels with no ACPI or no DT support is still TBD, and probably will have to involve #ifdef, but I digress :) However, Michael's idea of having ACPI supply an "accessor method" for the guest kernel driver to call, without having to worry about the specific details in _CRS, sounded intriguing enough to at least explore in further detail. So, assuming we have the following fw_cfg AML in ssdt (i386) or dsdt (arm) (using cpp-style #ifdef to highlight the arch-specific bits): Scope (\_SB) { Device (FWCF) { Name (_HID, EisaId ("QMU0002")) // _HID: Hardware ID Name (_STA, 0x0B) // _STA: Status #ifdef ARCH_X86 Name (_CRS, ResourceTemplate () { IO (Decode16, 0x0510, // Range Minimum 0x0510, // Range Maximum 0x01, // Alignment 0x02, // Length ) }) #else /* ARCH_ARM */ NAME (_CRS, ResourceTemplate () { Memory32Fixed (ReadWrite, 0x09020000, // Address Base 0x0000000a, // Address Length ) }) #endif } } I can easily patch QEMU to additionally insert the following into "Device (FWCF)": #ifdef ARCH_X86 OperationRegion (FWCR, SystemIO, 0x0510, 0x02) Field (FWCR, WordAcc, NoLock, Preserve) { FWCC, 16 } Field (FWCR, ByteAcc, NoLock, Preserve) { Offset (0x01), FWCD, 8 } #else /* ARCH_ARM */ OperationRegion (FWCR, SystemMemory, 0x09020000, 0x0a) Field (FWCR, WordAcc, NoLock, Preserve) { Offset (0x08), FWCC, 16 // not sure about endianness on ARM here, but // I can still address this from the userspace // kernel driver if needed } Field (FWCR, ByteAcc, NoLock, Preserve) { FWCD, 8 } #endif Method (RDBL, 3, Serialized) // (Arg0 = key, Arg1 = skip, Arg2 = count) { FWCC = Arg0 Local0 = Zero While ((Local0 < Arg1)) { Local1 = FWCD Local0++ } Name (BBUF, Buffer (Arg2) {}) Local0 = Zero While ((Local0 < Arg2)) { Index (BBUF, Local0) = FWCD /* \_SB_.FWCF.FWCD */ Local0++ } Return (BBUF) /* \_SB_.FWCF.RDBL.BBUF */ } With the host generating the additional RDBL method above, I could write a "pure ACPI" driver for the guest kernel, where instead of the current fw_cfg_read_blob() logic: static DEFINE_MUTEX(fw_cfg_dev_lock); static bool fw_cfg_is_mmio; static inline u16 fw_cfg_sel_endianness(u16 key) { return fw_cfg_is_mmio ? cpu_to_be16(key) : cpu_to_le16(key); } static inline void fw_cfg_read_blob(u16 key, void *buf, loff_t pos, size_t count) { mutex_lock(&fw_cfg_dev_lock); iowrite16(fw_cfg_sel_endianness(key), fw_cfg_reg_ctrl); while (pos-- > 0) ioread8(fw_cfg_reg_data); ioread8_rep(fw_cfg_reg_data, buf, count); mutex_unlock(&fw_cfg_dev_lock); } I could instead write something like this: struct acpi_device *dev; /* set during module init. */ static inline void fw_cfg_read_blob(u16 key, void *buf, loff_t pos, size_t count) { union acpi_object arg_elem[3], *obj; struct acpi_object_list arg; struct acpi_buffer acpi_buf = { ACPI_ALLOCATE_BUFFER, NULL }; acpi_status status; arg.count = 3; arg.pointer = &arg_elem[0]; arg_elem[0].type = arg_elem[1].type = arg_elem[2].type = ACPI_TYPE_INTEGER; arg_elem[0].integer.value = key; arg_elem[1].integer.value = pos; arg_elem[2].integer.value = count; status = acpi_evaluate_object_typed(dev->handle, "RDBL", &arg, &acpi_buf, ACPI_TYPE_BUFFER); if (ACPI_FAILURE(status)) { return; /* FIXME: actual error signaling to caller TBD */ } obj = (union acpi_object *)acpi_buf.pointer; /* FIXME: in lieu of better error signaling to caller: */ assert(count == obj->buffer.length); memcpy(buf, obj->buffer.pointer, obj->buffer.length); kfree(acpi_buf.pointer); } Now, if ACPI-less DT-enabled architectures are to be supported, this wouldn't get me out of having to spell out the original ioread8_rep() based fw_cfg_read_blob(), so probably not worth doing. But I simply *had* to try and chase this down before writing it off, since part of the reason I'm doing all this in the first place is to teach myself new tricks... :) Sorry for going off on a somewhat lengthy (and maybe just a *little* bit off-topic) tangent, but I figured I'd put it out there to at least facilitate future archaeology :) Obviously, any comments or feedback much appreciated! Cheers, --Gabriel -- To unsubscribe from this list: send the line "unsubscribe linux-api" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html