Re: "ata: libata: move ata_{port,link,dev}_dbg to standard pr_XXX() macros" - 742bef476ca5352b16063161fb73a56629a6d995 changed logging behavior and disabled a number of messages

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2024/05/13 15:50, Krzysztof Olędzki wrote:
> On 06.05.2024 at 15:08, Damien Le Moal wrote:
>> On 5/7/24 06:49, Niklas Cassel wrote:
>>> Hello Krzysztof,
> 
> Hello and thank you for your quick and detailed response from both of you!
> 
>>> On Sun, May 05, 2024 at 01:06:28PM -0700, Krzysztof Olędzki wrote:
>>>> Hi,
>>>>
>>>> I noticed that since https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=742bef476ca5352b16063161fb73a56629a6d995 (which also got backported to -stable kernels) several of messages from dmesg regarding the ATA subsystem are no longer logged.
>>>>
>>>> For example, on my Dell PowerEdge 840 which has only one PATA port I used to get:
>>>>
>>>> scsi host1: ata_piix
>>>> ata1: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0xfc00 irq 14
>>>> ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0xfc08 irq 15
>>>> ata_piix 0000:00:1f.2: MAP [ P0 P2 P1 P3 ]
>>>> ata2: port disabled--ignoring
>>>> ata1.01: NODEV after polling detection
>>>> ata1.00: ATAPI: SAMSUNG CD-R/RW SW-248F, R602, max UDMA/33
>>>>
>>>> After that commit, the following two log entries are missing:
>>>> ata2: port disabled--ignoring
>>>> ata1.01: NODEV after polling detection
>>>>
>>>> Note that these are just examples, there are many more messages impacted by that.
>>>>
>>>> Looking at the code, these messages are logged via ata_link_dbg / ata_dev_dbg:
>>>> ata_link_dbg(link, "port disabled--ignoring\n");	[in drivers/ata/libata-eh.c]
>>>> ata_dev_dbg(dev, "NODEV after polling detection\n");	[in drivers/ata/libata-core.c]
>>>>
>>>> The commit change how the logging is called - ata_dev_printk function which was calling printk() directly got replaced with the following macro:
>>>>
>>>> +#define ata_dev_printk(level, dev, fmt, ...)			\
>>>> +        pr_ ## level("ata%u.%02u: " fmt,			\
>>>> +               (dev)->link->ap->print_id,			\
>>>> +	       (dev)->link->pmp + (dev)->devno,			\
>>>> +	       ##__VA_ARGS__)
>>>> (...)
>>>>  #define ata_link_dbg(link, fmt, ...)				\
>>>> -	ata_link_printk(link, KERN_DEBUG, fmt, ##__VA_ARGS__)
>>>> +	ata_link_printk(debug, link, fmt, ##__VA_ARGS__)
>>>> (...)
>>>>  #define ata_dev_dbg(dev, fmt, ...)				\
>>>> -	ata_dev_printk(dev, KERN_DEBUG, fmt, ##__VA_ARGS__)
>>>> +	ata_dev_printk(debug, dev, fmt, ##__VA_ARGS__
>>>>
>>>> So, instead of printk(..., level == KERN_DEBUG, ) we now call pr_debug(...). This is a problem as printk(msg, KERN_DEBUG) != pr_debug(msg).
>>>>
>>>> pr_debug is defined as:
>>>> /* If you are writing a driver, please use dev_dbg instead */
>>>> #if defined(CONFIG_DYNAMIC_DEBUG) || \
>>>> 	(defined(CONFIG_DYNAMIC_DEBUG_CORE) && defined(DYNAMIC_DEBUG_MODULE))
>>>> #include <linux/dynamic_debug.h>
>>>>
>>>> /**
>>>>  * pr_debug - Print a debug-level message conditionally
>>>>  * @fmt: format string
>>>>  * @...: arguments for the format string
>>>>  *
>>>>  * This macro expands to dynamic_pr_debug() if CONFIG_DYNAMIC_DEBUG is
>>>>  * set. Otherwise, if DEBUG is defined, it's equivalent to a printk with
>>>>  * KERN_DEBUG loglevel. If DEBUG is not defined it does nothing.
>>>>  *
>>>>  * It uses pr_fmt() to generate the format string (dynamic_pr_debug() uses
>>>>  * pr_fmt() internally).
>>>>  */
>>>> #define pr_debug(fmt, ...)			\
>>>> 	dynamic_pr_debug(fmt, ##__VA_ARGS__)
>>>> #elif defined(DEBUG)
>>>> #define pr_debug(fmt, ...) \
>>>> 	printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__)
>>>> #else
>>>> #define pr_debug(fmt, ...) \
>>>> 	no_printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__)
>>>> #endif
>>>>
>>>> Without CONFIG_DYNAMIC_DEBUG and if no CONFIG_DEBUG is enabled, the end result is calling no_printk which means nothing gets logged.
>>>>
>>>> Looking at the code, there are more impacted calls, like for example ata_dev_dbg(dev, "disabling queued TRIM support\n") for ATA_HORKAGE_NO_NCQ_TRIM, which also seems like an important information to log, and there are more.
>>>>
>>>> Was this change done intentionally? Note that most of the "debug" printks in libata code seem to be guarded by ata_msg_info / ata_msg_probe / ATA_DEBUG which was sufficient to prevent excess debug information logging.
>>>> One of the cases like this was covered in the patch:
>>>> -#ifdef ATA_DEBUG
>>>>         if (status != 0xff && (status & (ATA_BUSY | ATA_DRQ)))
>>>> -               ata_port_printk(ap, KERN_DEBUG, "abnormal Status 0x%X\n",
>>>> -                               status);
>>>> -#endif
>>>> +               ata_port_dbg(ap, "abnormal Status 0x%X\n", status);
>>>>
>>>> Assuming this is the intended direction, would it make sense for now to at promote "unconditionally" logged messages from ata_link_dbg/ata_dev_dbg to ata_link_info/ata_dev_info?
>>>>
>>>> Longer term, perhaps we want to revisit ata_msg_info/ata_msg_probe/ATA_DEBUG/ATA_VERBOSE_DEBUG vs ata_dev_printk/ata_link_printk/pr_debug (and maybe also pr_devel), especially that DYNAMIC_DEBUG is available these days...
>>>
>>> The change to the dynamic debug model was very much intentional.
>>>
>>> CONFIG_ATA_DEBUG/CONFIG_ATA_VERBOSE_DEBUG (and ata_msg_info/ata_msg_probe)
>>> was replaced with dynamic debug.
> 
> Understood.
> 
>>> So the reason why you saw the messages before was most likely because you
>>> built your kernel with CONFIG_ATA_DEBUG / CONFIG_ATA_VERBOSE_DEBUG.
> 
> I don't believe this is the case - please note these were ATA_VERBOSE_DEBUG / ATA_DEBUG (i.e. not CONFIG_...) and these logging entries were not covered by ifdefs, for example:
> 
> if (rc == -ENOENT) {
>         ata_link_dbg(link, "port disabled--ignoring\n");
>         ehc->i.action &= ~ATA_EH_RESET;
> 
>         ata_for_each_dev(dev, link, ALL)
>                 classes[dev->devno] = ATA_DEV_NONE;
> 
>         rc = 0;
> } else
>         ata_link_err(link,
>                      "prereset failed (errno=%d)\n",
>                      rc);
> 
>>> If you want to see the same error messages, you are now expected to use
>>> dynamic debug for the file which you want debug messages enabled.
> 
> Totally agree on the "debug" type of messages but I'm not sure if asking users to use dynamic debug to see error messages is what we want...
> 
>>> I do agree that the debug prints that were NOT guarded by CONFIG_ATA_DEBUG
>>> do now work differently, as they are no longer printed to dmesg at all.
>>> I'm not sure if this part was intentional. I'm guessing that it wasn't.
> 
> That's precisely the point I have been trying to make here. The dynamic debugging change seems like a very welcome one, the messages disappearing - perhaps not so much. Especially in a middle of a -stable kernel.
> 
>>> Looking at these three prints specifically:
>>> ata_link_dbg(link, "port disabled--ignoring\n");
>>
>> Given the recent addition of the port map mask, we could restore this print as
>> an info level.
>>
>>> ata_dev_dbg(dev, "NODEV after polling detection\n");
>>
>> This one is weird. It should be an error print.
>>
>>> ata_dev_dbg(dev, "disabling queued TRIM support\n");
>>
>> This is also OK to restore as an info print. But given that, there are a lot
>> more horkage disabling features that we probably should print.
> 
> Thanks. Just a note that these were just some examples. Also, can I help somehow or should I just wait for patches to test?
> 
>>> None of them were protected with CONFIG_ATA_DEBUG/CONFIG_ATA_VERBOSE_DEBUG
>>> before the series converting to dynamic debug was merged.
>>>
>>>
>>> The best way forward is probably to look at all debug prints before the
>>> series converting to dynamic debug was merged, and see if any print NOT
>>> guarded by CONFIG_ATA_DEBUG/CONFIG_ATA_VERBOSE_DEBUG should be considered
>>> to promoted to an info print instead.
> 
> Same question as the above: can I help somehow or should I just wait for patches to test?

If you have something in mind, please, by all means, send patches and we'll go
from there. What I think we need is:

1) Review the code to make sure that we print an error message for everything
that fails but should really succeeds. Beware here that some things may "fail"
during port scan if a device is not attached. That is normal and should not
result in printing errors.

2) Add prints whenever a horkage flag is applied to a device, port or link.

That should not be a lot.


-- 
Damien Le Moal
Western Digital Research





[Index of Archives]     [Linux Filesystems]     [Linux SCSI]     [Linux RAID]     [Git]     [Kernel Newbies]     [Linux Newbie]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Samba]     [Device Mapper]

  Powered by Linux