Re: Regression caused by "eeprom: at24: Probe for DDR3 thermal sensor in the SPD case" - "sysfs: cannot create duplicate filename"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 27.06.2024 13:24, Krzysztof Olędzki wrote:
> On 24.06.2024 at 07:54, Guenter Roeck wrote:
>> On 6/24/24 01:38, Krzysztof Olędzki wrote:
>>> On 23.06.2024 at 22:33, Guenter Roeck wrote:
>>>> On 6/23/24 11:47, Krzysztof Olędzki wrote:
>>>>> Hi,
>>>>>
>>>>> After upgrading kernel to Linux 6.6.34 on one of my systems, I noticed "sysfs: cannot create duplicate filename" and i2c registration errors in dmesg, please see below.
>>>>>
>>>>> This seems to be related to https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-6.6.y&id=4d5ace787273cb159bfdcf1c523df957938b3e42 - reverting the change fixes the problem.
>>>>>
>>>>> Note that jc42 devices are registered correctly and work with and without the change.
>>>>>
>>>>
>>>> My guess is that the devices are fist instantiated through the jc42
>>>> driver's _detect function and then again from the at24 driver.
>>>> The at24 driver should possibly call i2c_new_scanned_device() instead
>>>> of i2c_new_client_device() to only instantiate the device if it wasn't
>>>> already instantiated.
>>>
>>> i2c_new_scanned_device() also calls i2c_default_probe() at the end (unless
>>> different probe is provided) which seems risky given the comment that explains
>>> that it would use quick write for that address. However, maybe it is safe in this case?
>>> I wish we had a way to just tell "no probing is needed".
>>>
>>
>> Sorry, I don't understand why it would be less risky to just probe the device
>> without such a test.
> 
> 
> I'm referring to this comment on i2c_default_probe():
> 
> /*
>  * Legacy default probe function, mostly relevant for SMBus. The default
>  * probe method is a quick write, but it is known to corrupt the 24RF08
>  * EEPROMs due to a state machine bug, and could also irreversibly
>  * write-protect some EEPROMs, so for address ranges 0x30-0x37 and 0x50-0x5f,
>  * we use a short byte read instead. Also, some bus drivers don't implement
>  * quick write, so we fallback to a byte read in that case too.
>  * On x86, there is another special case for FSC hardware monitoring chips,
>  * which want regular byte reads (address 0x73.) Fortunately, these are the
>  * only known chips using this I2C address on PC hardware.
>  * Returns 1 if probe succeeded, 0 if not.
>  */
> 
> <CUT>
> 
>>> For now compile-tested only given the write-test concern above.
>>>
>>
>> The device detect code in the i2c core does that same write-test that you
>> are concerned about.
> 
> 
> There is no write-test testing in i2c_new_client_device() and I cannot find how 
> i2c_detect() would do that. However, after looking at this more, it seems that
> we actually have jc42_detect() precisely for this, as this is what jc42.c
> provides as the .detect callback.
> 
>>
>>> That said, I have some follow-up questions:
>>>
>>> 1. if the jc42 driver handles this already, I wonder what's the point of adding
>>> at24_probe_temp_sensor()? Is there a situation where it would not do it properly?
>>> Or do we expect to remove the probing functionally from jc42.c?
>>>
>>
>> The jc42 driver is not auto-loaded. When suggesting to remove the "probing
>> functionally", I assume you mean to remove its detect function. That would only
>> work if SPD EEPROMs were only connected to I2C adapters calling i2c_register_spd(),
>> and if the systems with those adapters would support DMI.
> 
> I'm not suggesting to remove it. I'm just asking why we have two two different mechanisms
> for doing the same thing and what is the plan longer term? A new code was added, where is
> seems the old one has worked just fine. As you mentioned in the other response, it even
> handles well a buggy DIMMS with their eeprom data incorrectly stating no temp sensor.
>  
Auto-probing (scanning) is a legacy mechanism, and all use cases but hwmon have been
removed meanwhile. One of the reasons is that the detect i2c access could do harm to
a device of another type that sits on one of the scanned addresses.
Explicit instantiation is preferred.

Another reason why explicit instantiation is preferred: It allows auto-loading of
i2c client driver modules. W/o explicit instantiation jc42 needs to be loaded
manually (or by some user space tool).


It seems we have two issues:
1. Calls to i2c_new_client_device() aren't serialized. Parallel instantiation
   of the same device can cause errors like reported here.
2. i2c_new_client_device() complains also if the same device has been instantiated
   already.

Issue 2 we may be able to tackle w/o big effort:
So far we just check for registered clients with the same bus address, and complain
if found. It should be possible (not tested yet) to include the client device name
in the check. If the names are identical, then we can assume it's the same device
that has been registered by other means already, and silence the error message.

>> In v6.9, i2c_register_spd() is only called from the i801 driver (Intel systems).
>> In v6.11, piix4 (AMD) will be added. Even after that, all non-Intel / non-AMD systems
>> would no longer be able to support jc42 compatible chips by just loading the jc42
>> driver. That would not be acceptable.
> 
> Indeed.
>  
>>> 2. I don't understand why we are also getting the "Failed creating jc42" and
>>> "sysfs: cannot create duplicate filename" errors since i2c_new_client_device() calls
>>> i2c_check_addr_busy() on its own and should abort after the first error message?
>>>
>>
>> The "Failed creating" message is from the i2c core's detect function which
>> is only called if a new i2c adapter is added. This is actually the case here,
>> since the call sequence of the backtrace includes i801_probe(). It looks like
>> i2c_detect() runs asynchronously and doesn't protect itself against having
>> devices added to a bus while it is running on that same bus. That is just
>> a guess, though - I have not tried to verify it.
>>
>> That does suggest, though, that even your suggested code above might not
>> completely fix the problem. It may be necessary to call i2c_lock_bus()
>> or similar from i2c_new_scanned_device() and i2c_detect(), but I don't know
>> if that is save, sufficient, or even possible.
> 
> Right... Which again brings the original question: is there a situation where
> jc42 is not able to detect the temp sensors on its own?
> 
> Krzysztof





[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux