Hi, Dan

> From: Dan Williams [mailto:dan.j.williams@xxxxxxxxx]
> Subject: Re: [PATCH v5] ACPICA: Tables: Add mechanism to allow to balance late stage acpi_get_table()
> independently
>
> On Wed, Jun 7, 2017 at 2:14 PM, Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> > On Wed, Jun 7, 2017 at 8:41 AM, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> >> On Tue, Jun 6, 2017 at 9:54 PM, Lv Zheng <lv.zheng@xxxxxxxxx> wrote:
> >>> Considering this case:
> >>> 1. A program opens a sysfs table file 65535 times, it can increase
> >>>    validation_count and first increment cause the table to be mapped:
> >>>     validation_count = 65535
> >>> 2. AML execution causes "Load" to be executed on the same table, this time
> >>>    it cannot increase validation_count, so validation_count remains:
> >>>     validation_count = 65535
> >>> 3. The program closes sysfs table file 65535 times, it can decrease
> >>>    validation_count and the last decrement cause the table to be unmapped:
> >>>     validation_count = 0
> >>> 4. AML code still accessing the loaded table, kernel crash can be observed.
> >>>
> >>> This is because originally ACPICA doesn't support unmapping tables during
> >>> OS late stage. So the current code only allows unmapping tables during OS
> >>> early stage, and for late stage, no acpi_put_table() clones should be
> >>> invoked, especially cases that can trigger frequent invocations of
> >>> acpi_get_table()/acpi_put_table() are forbidden:
> >>> 1. sysfs table accesses
> >>> 2. dynamic Load/Unload opcode executions
> >>> 3. acpi_load_table()
> >>> 4. etc.
> >>> Such frequent acpi_put_table() balance changes have to be done altogether.
> >>>
> >>> This philosophy is not convenient for Linux driver writers. Since the API
> >>> is just there, developers will start to use acpi_put_table() during late
> >>> stage. So we need to consider a better mechanism to allow them to safely
> >>> invoke acpi_put_table().
> >>>
> >>> This patch provides such a mechanism by adding a validation_count
> >>> threshold. When it is reached, the validation_count can no longer be
> >>> incremented/decremented to invalidate the table descriptor (means
> >>> preventing table unmappings) so that acpi_put_table() balance changes can be
> >>> done independently to each others.
> >>>
> >>> Note: code added in acpi_tb_put_table() is actually a no-op but changes the
> >>> warning message into a warning once message. Lv Zheng.
> >>>
> >>
> >> This still seems to be unnecessary gymnastics to keep the validation
> >> count around and make it work for random drivers.
> >
> > Well, I'm not sure I agree here.
> >
> > If we can make it work at one point, it should not be too hard to
> > maintain that status.
> >
>
> I agree with that, my concern was with driver writers needing to be
> worried about when it is safe to call acpi_put_table(). This reference
> count behaves differently than other reference counts like kobjects.

I don't think they behave differently.

"kref" doesn't need to consider unbalanced "get/put", because the drivers
(users) that deploy "kref" are responsible for keeping "get/put" balanced.
"kref" also doesn't need to care much about "overflow/underflow": if all
users keep "get/put" balanced, "overflow/underflow" cannot happen. An
"overflow/underflow" therefore means a bug, and can be caught as a "panic".

If "kref" wanted to "warn_once" about overflow/underflow users, the logic in
this commit could be introduced to kref as well. However, that would be
useless, since all users already ensure balanced "get/put". Putting a useless
check, rather than a panic, on a hot path would be a waste.

> The difference is not necessarily bad, but hopefully it can be
> contained within the acpi core.

The old warning logic for the table descriptor is simply derived from
utdelete.c, which reduces the communication cost when the mechanism is
upstreamed.

The ACPICA table "validation_count" is deployed on top of the old design, in
which "table unmap" is forbidden in the late stage. Thus there are no users
ensuring balanced "get/put". Under these circumstances, when we start to
deploy balanced "get/put", we need to consider all users as a whole. You
cannot say that the current unbalanced "get/put" users have bugs. They are
there for historical reasons.

Fortunately, after applying this patch, drivers should have a better
environment in which to use the new APIs.

Cheers,
Lv
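
For illustration only, the threshold idea from the quoted commit message can be sketched as a small user-space C model. This is not the ACPICA implementation: struct table_desc, table_get(), table_put(), replay_commit_scenario() and VALIDATION_COUNT_MAX are hypothetical names, and the threshold value is an assumption taken from the 65535 figure used in the scenario. Once the count reaches the threshold it is pinned, so later puts can no longer unmap the table.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed threshold: the 65535 value used in the commit message scenario. */
#define VALIDATION_COUNT_MAX UINT16_MAX

/* Hypothetical stand-in for a table descriptor. */
struct table_desc {
    uint16_t validation_count;
    bool mapped;
};

/* Take a reference; the first one maps the table. */
static void table_get(struct table_desc *t)
{
    if (t->validation_count == VALIDATION_COUNT_MAX)
        return;                 /* pinned: the count no longer changes */
    if (t->validation_count == 0)
        t->mapped = true;       /* first reference maps the table */
    t->validation_count++;
}

/* Drop a reference; the last one unmaps the table unless it is pinned. */
static void table_put(struct table_desc *t)
{
    if (t->validation_count == VALIDATION_COUNT_MAX)
        return;                 /* pinned: the table is never unmapped */
    if (t->validation_count == 0)
        return;                 /* unbalanced put; real code would warn once */
    if (--t->validation_count == 0)
        t->mapped = false;      /* last reference unmaps the table */
}

/* Replays steps 1-3 of the scenario and reports whether the table survives. */
static bool replay_commit_scenario(void)
{
    struct table_desc t = { 0, false };
    unsigned int i;

    for (i = 0; i < 65535; i++)
        table_get(&t);          /* step 1: sysfs opens */
    table_get(&t);              /* step 2: "Load" on the same table */
    for (i = 0; i < 65535; i++)
        table_put(&t);          /* step 3: sysfs closes */
    return t.mapped;
}
```

In this model a balanced get/put pair that stays below the threshold still maps and unmaps the table as usual, while the scenario from the commit message leaves the table mapped for the AML code that is still using it.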