Re: [PATCH net] net/smc: use the correct ndev to find pnetid by pnetid table

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 2025/1/8 03:32, Halil Pasic wrote:
> On Tue, 7 Jan 2025 09:44:30 +0100
> Paolo Abeni <pabeni@xxxxxxxxxx> wrote:
> 
>> On 12/27/24 5:04 AM, Guangguan Wang wrote:
> 
> @Guangguan Wang: please use my linux.ibm.com address
> in the future.
Get it.

> 
>>> The command 'smc_pnet -a -I <ethx> <pnetid>' will add <pnetid>
>>> to the pnetid table and will attach the <pnetid> to net device
>>> whose name is <ethx>. But When do SMCR by <ethx>, in function
>>> smc_pnet_find_roce_by_pnetid, it will use <ethx>'s base ndev's
>>> pnetid to match rdma device, not <ethx>'s pnetid. The asymmetric
>>> use of the pnetid seems weird. Sometimes it is difficult to know
>>> the hierarchy of net device what may make it difficult to configure
>>> the pnetid and to use the pnetid. Looking into the history of
>>> commit, it was the commit 890a2cb4a966 ("net/smc: rework pnet table")
>>> that changes the ndev from the <ethx> to the <ethx>'s base ndev
>>> when finding pnetid by pnetid table. It seems a mistake.
>>>
>>> This patch changes the ndev back to the <ethx> when finding pnetid
>>> by pnetid table.
>>>
>>> Fixes: 890a2cb4a966 ("net/smc: rework pnet table")
>>> Signed-off-by: Guangguan Wang <guangguan.wang@xxxxxxxxxxxxxxxxx>  
>>
>> If I read correctly, this will break existing applications using the
>> lookup schema introduced by the blamed commit - which is not very
>> recent.
> 
> Hi Paolo,
> 
> sorry for chiming in late. Wenjia is on vacation and Jan is out sick!
> After some reading and thinking I could not figure out how 890a2cb4a966
> ("net/smc: rework pnet table") is broken.

Before commit 890a2cb4a966:
smc_pnet_find_roce_resource
    smc_pnet_find_roce_by_pnetid(ndev, ...) /* lookup via hardware-defined pnetid */
        smc_pnetid_by_dev_port(base_ndev, ...)
    smc_pnet_find_roce_by_table(ndev, ...) /* lookup via SMC PNET table */
    {
        ...
        list_for_each_entry(pnetelem, &smc_pnettable.pnetlist, list) {
                if (ndev == pnetelem->ndev) { /* notice here, it was ndev to matching pnetid element in pnet table */
        ...
    }

After commit 890a2cb4a966:
smc_pnet_find_roce_resource
    smc_pnet_find_roce_by_pnetid
    {
        ...
        base_ndev = pnet_find_base_ndev(ndev); /* rename the variable name to base_ndev for better understanding */
        smc_pnetid_by_dev_port(base_ndev, ...)
        smc_pnet_find_ndev_pnetid_by_table(base_ndev, ...)
        {
                ...
                list_for_each_entry(pnetelem, &smc_pnettable.pnetlist, list) {
                if (base_ndev == pnetelem->ndev) { /* notice here, it is base_ndev to matching pnetid element in pnet table */
                ...
        }

    }

The commit 890a2cb4a966 has changed ndev to base_ndev when matching pnetid element in pnet table.
But in the function smc_pnet_add_eth, the pnetid is attached to the ndev itself, not the base_ndev.
smc_pnet_add_eth(...)
{
    ...
    ndev = dev_get_by_name(net, eth_name);
    ...
        if (new_netdev) {
            if (ndev) {
                new_pe->ndev = ndev;
                netdev_tracker_alloc(ndev, &new_pe->dev_tracker,
                    GFP_ATOMIC);
            }
            list_add_tail(&new_pe->list, &pnettable->pnetlist);
            mutex_unlock(&pnettable->lock);
        } else {
    ...
}

> 
> Admittedly I'm not really a net guy,and I'm mostly guessing what that
> lower and upper device stuff is, so please bear with me. All that said, I
> do think that going to the lowest netdev in the hierarchy is a sane
> thing to do here.  I assume  that lower and upper devices are applicable
> to stuff like bonding. 
> 
> PNETID stands for "Physical Network Identifier" and the idea is that iff
> two ports are connected to the same physical network then they should
> have the same PNETID. And on s390 PNETID can come and often is comming
> "from the hardware". Now for something like a bond of two OSA
> interfaces, I would expect the two legs of the bond to probably have a
> "HW PNETID", but the netdev representing the bond itself won't have one
> unless the Linux admin defines a software PNETID, which is work, and
> can't have a HW PNETID because it is a software construct within Linux.
> Breaking for example an active-backup bond setup where the legs have
> HW PNETIDs and the admin did not bother to specify a PNETID for the bond
> is not acceptable.
> 
> Let me also note that if ndev is a leaf (i.e. there is no lower device to
> it) then ndev == base_ndev, and the whole discussion does not matter for
> that case.
> 
> Again I have to emphasize that my domain knowledge is very limited, but
> I really don't feel comfortable going forward with this without Jan or
> Wenjia weighing in on the matter.
> 
> Paolo thanks for bringing this up!
> 
>>
>> Perhaps for a net patch would be better to support both lookup schemas
>> i.e.
>>
>> 	(smc_pnet_find_ndev_pnetid_by_table(ndev, ndev_pnetid) ||
>> 	 smc_pnet_find_ndev_pnetid_by_table(base_ndev, ndev_pnetid))
>>
>> ?
>>
> 
> Hm, I guess the idea here is that if ndev has a PNETID then it should
> take precedence, but if not we should try to obtain the PNETID of its
> "base_ndev". I'm not sure this would make things better compared to the
> original idea of caring about the leaf. Which makes me question my
> understanding of the problem statement from the commit message.
> 
> BTW to implement the logic proposed by you Paolo, as understood by me,
> we would have to use "&&" instead of "||". The whole expression is
> supposed
> to evaluate to false if a pnetid is found and to true if no pnet_id is
> found. smc_pnet_find_ndev_pnetid_by_table(ndev) returns false if a pnetid
> is found. I.e. if not found we would just short circuit to true and not
> call smc_pnet_find_ndev_pnetid_by_table(base_ndev), which is not what I
> believe you wanted to propose.

Yes, it should be
        (smc_pnet_find_ndev_pnetid_by_table(ndev, ndev_pnetid) &&
 	 smc_pnet_find_ndev_pnetid_by_table(base_ndev, ndev_pnetid))
if for the consideration of application's compatible usage of smc_pnet.

Thanks,
Guangguan Wang

> 
> To sum it up, please let us wait until Wenjia or Jan chime in. Copying
> Alexandra as well: she is more of a net person than I am, and maybe she
> has a more informed opinion.
> 
> Regards,
> Halil
> [...]
> 




[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux