On 21.2.2023. 14:52, Mirsad Goran Todorovac wrote:
On 20. 02. 2023. 14:43, Andy Shevchenko wrote:
On Mon, Feb 20, 2023 at 02:10:00PM +0100, Mirsad Todorovac wrote:
On 2/16/23 15:16, Bartosz Golaszewski wrote:
...
As Mr. McKenney once said, a bunch of monkeys with keyboards could
have done it, given a considerable number of trials and errors ;-)
But here I have something that could potentially leak as well. I could not devise a
reproducer, because the leak would likely be triggered only under extreme memory contention.
See it for yourself:
drivers/gpio/gpio-sim.c:
301 static int gpio_sim_setup_sysfs(struct gpio_sim_chip *chip)
302 {
303         struct device_attribute *val_dev_attr, *pull_dev_attr;
304         struct gpio_sim_attribute *val_attr, *pull_attr;
305         unsigned int num_lines = chip->gc.ngpio;
306         struct device *dev = chip->gc.parent;
307         struct attribute_group *attr_group;
308         struct attribute **attrs;
309         int i, ret;
310
311         chip->attr_groups = devm_kcalloc(dev, sizeof(*chip->attr_groups),
312                                          num_lines + 1, GFP_KERNEL);
313         if (!chip->attr_groups)
314                 return -ENOMEM;
315
316         for (i = 0; i < num_lines; i++) {
317                 attr_group = devm_kzalloc(dev, sizeof(*attr_group), GFP_KERNEL);
318                 attrs = devm_kcalloc(dev, GPIO_SIM_NUM_ATTRS, sizeof(*attrs),
319                                      GFP_KERNEL);
320                 val_attr = devm_kzalloc(dev, sizeof(*val_attr), GFP_KERNEL);
321                 pull_attr = devm_kzalloc(dev, sizeof(*pull_attr), GFP_KERNEL);
322                 if (!attr_group || !attrs || !val_attr || !pull_attr)
323                         return -ENOMEM;
324
325                 attr_group->name = devm_kasprintf(dev, GFP_KERNEL,
326                                                   "sim_gpio%u", i);
327                 if (!attr_group->name)
328                         return -ENOMEM;
Apparently, if the memory allocation only partially succeeds, in the theoretical case
that the system is close to its kernel memory exhaustion, `return -ENOMEM` would not
free the partially succeeded allocs, would it?
To explain it better, I tried a version that does not yet fully do "all or nothing"
memory allocation for the gpio-sim driver, because I am not that familiar with the
driver internals.
devm_*() means that the resource allocation is made in a managed manner, so the
resource is freed automatically when the device is unbound from its driver.
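To make the point about managed allocation concrete, here is a minimal userspace sketch. It is only a model: toy_device, toy_devm_zalloc(), toy_release_all(), toy_setup() and toy_probe() are hypothetical names, not kernel APIs, and the real devres code is more involved. It assumes that whoever calls the setup function releases everything tied to the device once setup fails, which is the role the driver core plays after a failed probe.

```c
#include <assert.h>
#include <stdlib.h>

/* One node per managed allocation (toy model, not the kernel's struct). */
struct toy_devres {
	struct toy_devres *next;
	void *data;
};

/* Toy stand-in for struct device: it only tracks managed allocations. */
struct toy_device {
	struct toy_devres *head;	/* newest allocation first */
	unsigned int live;		/* allocations not yet freed */
};

/* Hypothetical analogue of devm_kzalloc(): zeroed memory owned by dev. */
static void *toy_devm_zalloc(struct toy_device *dev, size_t size)
{
	struct toy_devres *dr = malloc(sizeof(*dr));

	if (!dr)
		return NULL;
	dr->data = calloc(1, size);
	if (!dr->data) {
		free(dr);
		return NULL;
	}
	dr->next = dev->head;
	dev->head = dr;
	dev->live++;
	return dr->data;
}

/* Frees everything still on the list, newest first. */
static void toy_release_all(struct toy_device *dev)
{
	while (dev->head) {
		struct toy_devres *dr = dev->head;

		dev->head = dr->next;
		free(dr->data);
		free(dr);
		dev->live--;
	}
	assert(dev->live == 0);
}

/* Bails out before allocation number fail_at, with no manual cleanup. */
static int toy_setup(struct toy_device *dev, unsigned int fail_at)
{
	unsigned int i;

	for (i = 0; i < 4; i++) {
		if (i == fail_at)
			return -1;	/* simulated -ENOMEM */
		if (!toy_devm_zalloc(dev, 32))
			return -1;
	}
	return 0;
}

/*
 * Runs the setup, reports how many allocations were live when it returned,
 * then releases them as the caller of a failed probe would. Returns the
 * number of allocations still live afterwards.
 */
static unsigned int toy_probe(unsigned int fail_at, unsigned int *live_before)
{
	struct toy_device dev = { NULL, 0 };

	(void)toy_setup(&dev, fail_at);
	*live_before = dev.live;
	toy_release_all(&dev);
	return dev.live;
}
```

In this model, toy_probe(2, &n) leaves n at 2 (the two allocations that succeeded before the simulated failure) and returns 0, because the early return in toy_setup() did not need to free anything itself.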
Didn't see that one coming ... :-/ "buzzing through the bush ..."
The question is: should the lifetime of attr_groups be shorter than or the
same as that of chip->gc.parent? Maybe it's incorrect to call devm_*() in the first place?
Speaking in good faith, I hope that automatic deallocation does things in the right order.
I've realised that devm_kzalloc() calls devm_kmalloc(), which registers allocations on
a per-device list. But I am not sure how chip->gc was allocated.
Here it says it is allocated in drivers/gpio/gpio-sim.c:386 in gpio_sim_add_bank(), as part of
struct gpio_sim_chip *chip;
struct gpio_chip *gc;
gc = &chip->gc;
and gc->parent is set to
gc->parent = dev;
in line 420, which appears to be called before gpio_sim_setup_sysfs() and the lines above.
P.S.
The exact line is:
chip = devm_kzalloc(dev, sizeof(*chip), GFP_KERNEL);
so I guess it is reasonable to assume that chip will also
be deallocated after attr_groups. chip->gc.parent appears to be a mere pointer to dev parameter in
static int gpio_sim_add_bank(struct fwnode_handle *swnode, struct device *dev)
This is OTOH called from:
static int gpio_sim_probe(struct platform_device *pdev)
{
        struct device *dev = &pdev->dev;
        struct fwnode_handle *swnode;
        int ret;
        device_for_each_child_node(dev, swnode) {
                ret = gpio_sim_add_bank(swnode, dev);
Which means dev passed to chip->gc.parent is initialised with &pdev->dev from pdev parm of
gpio_sim_probe(). This is OTOH referenced from the very:
static struct platform_driver gpio_sim_driver = {
        .driver = {
                .name = "gpio-sim",
                .of_match_table = gpio_sim_of_match,
        },
        .probe = gpio_sim_probe,
};
Hope this helps. There's more to this than meets the eye, but
this is really an idiot's attempt to analyse a Linux kernel driver. :-)
If I understood well, automatic deallocation on unloading the driver goes
in the reverse order, so the lifetime of chip appears to be longer than that of
attr_groups, but I am really not that good at this ...
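The reverse-order argument can be checked with a small userspace sketch. This is a model under stated assumptions, not the kernel's code: lifo_device, lifo_register(), lifo_release_all() and lifo_release_order() are hypothetical names, and the three resource labels only mirror the order in which the driver appears to allocate them (chip first, then attr_groups, then the per-line attr_group).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define LIFO_LOG_LEN 64

/* One registered resource; only its label matters in this model. */
struct lifo_entry {
	struct lifo_entry *next;
	const char *name;
};

/* Toy device: a list of resources plus a log of the release order. */
struct lifo_device {
	struct lifo_entry *head;	/* newest registration first */
	char log[LIFO_LOG_LEN];
};

/* Registers a resource by pushing it on the front of the list. */
static int lifo_register(struct lifo_device *dev, const char *name)
{
	struct lifo_entry *e = malloc(sizeof(*e));

	if (!e)
		return -1;
	e->name = name;
	e->next = dev->head;
	dev->head = e;
	return 0;
}

/* Releases from the front, so the last registered goes first. */
static void lifo_release_all(struct lifo_device *dev)
{
	while (dev->head) {
		struct lifo_entry *e = dev->head;

		dev->head = e->next;
		/* Append "<name> " to the log, never overflowing it. */
		strncat(dev->log, e->name,
			sizeof(dev->log) - strlen(dev->log) - 1);
		strncat(dev->log, " ",
			sizeof(dev->log) - strlen(dev->log) - 1);
		free(e);
	}
}

/* Registers three resources in allocation order and returns the release log. */
static const char *lifo_release_order(void)
{
	static struct lifo_device dev;

	memset(&dev, 0, sizeof(dev));
	if (lifo_register(&dev, "chip") ||
	    lifo_register(&dev, "attr_groups") ||
	    lifo_register(&dev, "attr_group[0]"))
		assert(!"out of memory in the toy model");
	lifo_release_all(&dev);
	return dev.log;
}
```

In this model the log reads "attr_group[0] attr_groups chip ", so the resource registered first (chip) is the last one released and outlives everything registered after it.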
Or maybe the chip->gc.parent should be changed to something else (the actual GPIO
device), but then it's unclear how to provide the attributes in a non-racy way.
Really, dunno. I have to repeat that my learning curve cannot adapt so quickly.
I merely gave the report of KMEMLEAK; otherwise I am not a Linux kernel
device expert, nor would it be appropriate for me to attempt a craft I have not earned ;-)
Regards,
Mirsad
--
Mirsad Todorovac
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb
Republic of Croatia, the European Union