On 06/25/2014 03:24 PM, Grant Likely wrote: > On Tue, 24 Jun 2014 15:10:55 -0500, Nathan Fontenot <nfont@xxxxxxxxxxxxxx> wrote: >> On 06/23/2014 09:48 AM, Grant Likely wrote: >>> On Thu, 19 Jun 2014 10:26:15 -0500, Nathan Fontenot <nfont@xxxxxxxxxxxxxx> wrote: >>>> On 06/18/2014 03:07 PM, Grant Likely wrote: >>>>> Hi Nathan and Tyrel, >>>>> >>>>> I'm looking into lifecycle issues on nodes modified by OF_DYNAMIC, and >>>>> I'm hoping you can help me. Right now, pseries seems to be the only >>>>> user of OF_DYNAMIC, but making OF_DYNAMIC work has a huge impact on >>>>> the entire kernel because it requires all DT code to manage reference >>>>> counting with iterating over nodes. Most users simply get it wrong. >>>>> Pantelis did some investigation and found that the reference counts on >>>>> a running kernel are all over the place. I have my doubts that any >>>>> code really gets it right. >>>>> >>>>> The problem is that users need to know when it is appropriate to call >>>>> of_node_get()/of_node_put(). All list traversals that exit early need >>>>> an extra call to of_node_put(), and code that is searching for a node >>>>> in the tree and holding a reference to it needs to call of_node_get(). >>>>> >>>>> I've got a few pseries questions: >>>>> - What are the changes being requested by pseries firmware? Is it only >>>>> CPUs and memory nodes, or does it manipulate things all over the tree? >>>> >>>> The short answer, everything. >>> >>> :-) >>> >>>> For pseries the two big actions that can change the device tree are >>>> adding/removing resources and partition migration. >>>> >>>> The most frequent updates to the device tree happen during resource >>>> (cpu, memory, and pci/phb) add and remove. During this process we add >>>> and remove the node and its properties from the device tree. >>>> - For memory on newer systems this just involves updating the >>>> ibm,dynamic-reconfiguration-memory/ibm,dynamic-memory property. Older >>>> firmware levels add and remove the memroy@XXX nodes and their properties. >>>> - For cpus the cpus/PowerPC,POWERXXXX nodes and its properties are added >>>> or removed >>>> - For pci/phb the pci@XXXXX nodes and properties are added/removed. >>>> >>>> The less frequent operation of live partition migration (and suspend/resume) >>>> can update just about anything in the device tree. When this occurs and the >>>> systems starts after being migrated (or waking up after a suspend) we make >>>> a call to firmware to get updates to the device tree for the new hardware >>>> we are running on. >>>> >>>>> - How frequent are the changes? How many changes would be likely over >>>>> the runtime of the system? >>>> >>>> This can happen frequently. >>> >>> Thanks, that is exactly the information that I want. I'm not so much >>> concerned with the addition or removal of nodes/properties, which is >>> actually pretty easy to handle. It is the lifecycle of allocations on >>> dynamic nodes that causes heartburn. >>> >>>>> - Are you able to verify that removed nodes are actually able to be >>>>> freed correctly? Do you have any testcases for node removal? >>>> >>>> I have always tested this by doing resource add/remove, usually cpu and memory >>>> since it is the easiest. >>> >>> Is that just testing the functionality, or do you have tests that check >>> if the memory gets freed? >> >> In general it's just functionality testing. >> >>> >>>>> I'm thinking very seriously about changing the locking semantics of DT >>>>> code entirely so that most users never have to worry about >>>>> of_node_get/put at all. If the DT code is switched to use rcu >>>>> primitives for tree iteration (which also means making DT code use >>>>> list_head, something I'm already investigating), then instead of >>>>> trying to figure out of_node_get/put rules, callers could use >>>>> rcu_read_lock()/rcu_read_unlock() to protect the region that is >>>>> searching over nodes, and only call of_node_get() if the node pointer >>>>> is needed outside the rcu read-side lock. >>>>> >>>> >>>> This sounds good. I like just taking the rcu lock around accessing the DT. >>>> Do we have many places where DT node pointers are held that require >>>> keeping the of_node_get/put calls? If this did exist perhaps we could >>>> update those places to look up the DT node every time instead of >>>> holding on to the pointer. We could just get rid of the reference counting >>>> altogether then. >>> >>> There are a few, but I would be happy to restrict reference counting to >>> only those locations. Most places will decode the DT data, and then >>> throw away the reference. We /might/ even be able to do rcu_lock/unlock >>> around the entire probe path which would make it transparent to all >>> device drivers. >>> >>>>> I'd really like to be rid of the node reference counting entirely, but >>>>> I can't figure out a way of doing that safely, so I'd settle for >>>>> making it a lot easier to get correct. >>>>> >>>> >>>> heh! I have often thought about adding reference counting to device tree >>>> properties. >>> >>> You horrible, horrible man. >> >> Yes. I are evil :) >> >> After looking again the work needed to add reference counts to properties >> would be huge. The few properties I am concerned with are specific to powerpc >> so perhaps just adding an arch specific lock around updating those >> properties would work. > > Which code/properties? I'd like to have a look myself. /ibm,dynamic-reconfiguration-memory/ibm,dynamic-memory The property is updated in arch/powerpc/platforms/pseries/hotplug-memory.c:pseries_update_drconf_memory() -Nathan -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html