On Thu, 2015-06-25 at 15:00 -0700, Dan Williams wrote: > On Thu, Jun 25, 2015 at 2:51 PM, Toshi Kani <toshi.kani@xxxxxx> wrote: > > On Thu, 2015-06-25 at 14:31 -0700, Dan Williams wrote: > >> On Thu, Jun 25, 2015 at 11:34 AM, Williams, Dan J > >> <dan.j.williams@xxxxxxxxx> wrote: > >> > On Thu, 2015-06-25 at 11:45 -0600, Toshi Kani wrote: > >> >> On Thu, 2015-06-25 at 05:37 -0400, Dan Williams wrote: > >> >> > From: Toshi Kani <toshi.kani@xxxxxx> > >> >> > > >> >> > ACPI NFIT table has System Physical Address Range Structure entries that > >> >> > describe a proximity ID of each range when ACPI_NFIT_PROXIMITY_VALID is > >> >> > set in the flags. > >> >> > > >> >> > Change acpi_nfit_register_region() to map a proximity ID to its node ID, > >> >> > and set it to a new numa_node field of nd_region_desc, which is then > >> >> > conveyed to the nd_region device. > >> >> > > >> >> > The device core arranges for btt and namespace devices to inherit their > >> >> > node from their parent region. > >> >> > > >> >> > Signed-off-by: Toshi Kani <toshi.kani@xxxxxx> > >> >> > [djbw: move set_dev_node() from region 'probe' to 'create'] > >> >> > >> >> Sorry, I failed to mention other issue, which led me call set_dev_node() > >> >> in probe. nd_async_device_register() calls device_add(), which does: > >> >> > >> >> /* use parent numa_node */ > >> >> if (parent) > >> >> set_dev_node(dev, dev_to_node(parent)); > >> >> > >> >> and overwrites numa_node to -1. Since region's parent is ndbusN, we > >> >> cannot set numa_node to the parent. So, I had to set it in probe. > >> > > >> > In general, I still don't like leaving it up to ->probe() which is > >> > within its rights to fail and not set the node. How about the following > >> > that moves it to the bus uevent code? Should get triggered before probe > >> > so the numa_node is valid before userspace is ever notified about the > >> > device. > >> > > >> > device_add() does: > >> > > >> > kobject_uevent(&dev->kobj, KOBJ_ADD); > >> > bus_probe_device(dev); > >> > > >> > ...so I think we're good, agree? I also added a missing init of > >> > ndr_desc.numa_node in arch/x86/kernel/pmem.c, see below. > >> > >> This looks good in a quick manual test. It's interesting/illustrative > >> that I inadvertently broke the one bit of the libnvdimm sysfs > >> interface that did not have unit test coverage. > > > > Sorry I had some interrupt. Yes, this works fine for region & > > namespace. I'd like to check with you for btt since the attach logic > > has changed in v2. > > > > Previously, as described in patch 16/17, bttN bound to pmem had a valid > > numa_node value, and seeding btt0 had -1. > > > > /sys/bus/nd/devices > > |-- btt0/numa_node:-1 > > |-- btt1/numa_node:0 > > > > In this version, there are unbound (seeding?) btt0-3 for every region > > (there are 4 regions) and btt4 & 5 bound to pmem0 & 3 on my system. > > > > btt0/numa_node:0 > > btt1/numa_node:0 > > btt2/numa_node:1 > > btt3/numa_node:1 > > btt4/numa_node:0 > > btt5/numa_node:1 > > > > btt0 > > -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region0/btt0 > > btt1 > > -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region1/btt1 > > btt2 > > -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region2/btt2 > > btt3 > > -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region3/btt3 > > btt4 > > -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region0/btt4 > > btt5 > > -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region3/btt5 > > > > And unbound bttNs attach to different regions across a reboot. > > > > btt0/numa_node:0 > > btt1/numa_node:1 > > btt2/numa_node:1 > > btt3/numa_node:0 > > btt4/numa_node:0 > > btt5/numa_node:1 > > > > btt0 > > -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region0/btt0 > > btt1 > > -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region3/btt1 > > btt2 > > -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region2/btt2 > > btt3 > > -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region1/btt3 > > btt4 > > -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region0/btt4 > > btt5 > > -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region3/btt5 > > > > Is this how you'd expect btt to work in this version? (I have not > > looked at the btt changes yet) > > Yes, this looks fine. > > As requested by Christoph, in the latest version BTTs are child > devices of regions rather than busses. They automatically inherit the > numa_node of the parent region. In your dump above the numa_nodes are > not changing from boot-to-boot, instead the BTTs are registered > asynchronously so get different ids from boot-to-boot. Userspace > should not care what the btt id is and the same naming trick we use to > give block devices static names would not work for BTTs. The child > block device of the BTT will still have the static name as we > discussed earlier (/dev/pmemXs or /dev/ndblkX.Ys) because the scan > order of those is deterministic. Yes, I see no problem with bound BTTs and their device files. So, how do we bind BTT with this new version? Thanks, -Toshi -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html