On 16.10.19 17:28, Stefan Haberland wrote: > On 16.10.19 17:26, Qian Cai wrote: >> On Wed, 2019-10-16 at 16:56 +0200, Stefan Haberland wrote: >>> On 16.10.19 16:09, Qian Cai wrote: >>>> On Wed, 2019-10-16 at 15:29 +0200, Stefan Haberland wrote: >>>>> Hi, >>>>> >>>>> thanks for reporting this. >>>>> >>>>> On 02.10.19 21:33, Qian Cai wrote: >>>>>> For some reasons, dasd_eckd_check_characteristics() received -ENOMEM and then >>>>>> dasd_generic_set_online() emits this message, >>>>>> >>>>>> dasd: 0.0.0122 Setting the DASD online with discipline ECKD failed with rc=-12 >>>>>> >>>>>> After that, there are several memory leaks below. There are "config_data" and >>>>>> then stored as, >>>>>> >>>>>> /* store per path conf_data */ >>>>>> device->path[pos].conf_data = conf_data; >>>>>> >>>>>> When it processes the error path in dasd_generic_set_online(), it calls >>>>>> dasd_delete_device() which nuke the whole "struct dasd_device" without freeing >>>>>> the device->path[].conf_data first. >>>>> Usually dasd_delete_device() calls dasd_generic_free_discipline() which >>>>> takes care of >>>>> the device->path[].conf_data in dasd_eckd_uncheck_device(). >>>>> From a first look this looks sane. >>>>> >>>>> So I need to spend a closer look if this does not happen correctly here. >>>> When dasd_eckd_check_characteristics() failed here, >>>> >>>> if (!private) { >>>> private = kzalloc(sizeof(*private), GFP_KERNEL | GFP_DMA); >>>> if (!private) { >>>> dev_warn(&device->cdev->dev, >>>> "Allocating memory for private DASD data " >>>> "failed\n"); >>>> return -ENOMEM; >>>> } >>>> device->private = private; >>>> >>>> The device->private is NULL. >>>> >>>> Then, in dasd_eckd_uncheck_device(), it will return immediately. >>>> >>>> if (!private) >>>> return; >>> Yes but in this case there is no per_path configuration data stored. >>> This is done after the private structure is allocated successfully. >> Yes, you are right. It has been a while so I must lost a bit memory. Actually, >> it looks like in dasd_eckd_check_characteristic() that device->private is set to >> NULL from this path, >> >> /* Read Configuration Data */ >> rc = dasd_eckd_read_conf(device); >> if (rc) >> goto out_err1; >> >> out_err1: >> kfree(private->conf_data); >> kfree(device->private); >> device->private = NULL; >> return rc; >> >> because dasd_eckd_read_conf() returns -ENOMEM and calls, >> >> out_error: >> kfree(rcd_buf); >> *rcd_buffer = NULL; >> *rcd_buffer_size = 0; >> return ret; >> >> It will only free its own config_data in one operational path, but there could >> has already allocated in earlier paths in dasd_eckd_read_conf() without any >> rollback and calls return, >> >> for (lpm = 0x80; lpm; lpm>>= 1) { >> if (!(lpm & opm)) >> continue; >> rc = dasd_eckd_read_conf_lpm(device, &conf_data, >> &conf_len, lpm); >> if (rc && rc != -EOPNOTSUPP) { /* -EOPNOTSUPP is ok */ >> DBF_EVENT_DEVID(DBF_WARNING, device->cdev, >> "Read configuration data returned " >> "error %d", rc); >> return rc; >> } >> >> Later, dasd_eckd_uncheck_device() see private->device is NULL without cleaning >> up. Does it make sense? > Yes, this looks like an error path not handling this correctly. > I will provide a patch for this. Most likely I will cover this in the error handling of the function. Thanks again for reporting. Regards, Stefan