BPF office hours summary, August 2022

Hello,

Below is a short summary of the BPF office hours held in August 2022.

Let me know if you have any feedback on how to improve this summary letter.

Thanks,
Mykola

8/4: “Discuss fix to tcp_inq for bpf sockmap socket redirect” with Ashok Dwarakinath <ashokd@xxxxxxxxxxx>: 

Ashok is running into an issue where an ioctl returns an incorrect number of bytes when sockops is enabled (similar to [1]). He investigated whether the bug is in sockops or in the TCP stack and found that FIONREAD isn’t implemented for sockets that are in a sockmap. His fix counts the bytes in the ingress message queue and reports that number to the caller. Ashok plans to submit a patch for upstream review.

[1] https://github.com/cilium/cilium/issues/19304
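
For readers unfamiliar with FIONREAD, here is a minimal user-space sketch of what the ioctl is expected to report on an ordinary TCP socket, namely the number of unread bytes queued for the receiver. It does not involve sockmap or sockops and is not Ashok's patch; the helper names pending_bytes() and demo() are made up for this illustration, and a Linux host with a working loopback interface is assumed.

```python
import fcntl
import select
import socket
import struct
import termios


def pending_bytes(sock):
    """Ask the kernel (via FIONREAD) how many unread bytes are queued on sock."""
    raw = fcntl.ioctl(sock.fileno(), termios.FIONREAD, struct.pack("i", 0))
    return struct.unpack("i", raw)[0]


def demo():
    """Send 5 bytes over loopback TCP and query FIONREAD before and after a partial read."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()
    try:
        client.sendall(b"hello")
        # Wait until the data has actually reached the receiving socket.
        select.select([server], [], [], 5.0)
        before = pending_bytes(server)
        server.recv(2)  # consume two of the five bytes
        after = pending_bytes(server)
        return before, after
    finally:
        for s in (client, server, listener):
            s.close()


if __name__ == "__main__":
    print(demo())
```

On a plain TCP socket the two values should be 5 and 3. The issue described above is that a socket placed in a sockmap does not report the bytes sitting in its ingress message queue this way.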

8/4: “Discuss bpf_verify_pkcs7_signature() kfunc” with Roberto Sassu <roberto.sassu@xxxxxxxxxx>:

Roberto Sassu wanted to clarify Alexei Starovoitov’s response to one of his patches in the “bpf: Add kfuncs for PKCS#7 signature verification” patchset. Alexei explained that his request was to remove the __ref parameter suffix because there wasn’t a use case for it, and that all the arguments of bpf_verify_pkcs7_signature() can be marked as trusted since all of them are valid. Roberto asked whether this would work given that the system_keyring parameter is of type “long”, not a pointer. Alexei clarified that trust is only relevant to pointers and that integers are automatically trusted. Roberto also asked how to make dynamic pointers trusted using the KF_TRUSTED_ARGS annotation. Alexei noted that this is currently not supported. Roberto then asked how bpf_verify_pkcs7_signature() should handle possible NULL pointers to the user or system keyrings. Daniel Borkmann suggested getting rid of the two key-related parameters and implementing a bpf_lookup_[system/user]_key() kfunc that produces a struct bpf_key containing pointers to the system and user keyrings.

8/11: “eBPF standardization/documentation” with Dave Thaler (dthaler@xxxxxxxxxxxxx), James Harris (james.r.harris@xxxxxxxxx) and Quentin Monnet (quentin@xxxxxxxxxxxxx)

Dave Thaler from Microsoft led the BPF standardization discussion. He reviewed slides describing his ideas on how standardization should proceed and took notes inline. Alexei, Daniel Borkmann, KP Singh, Dave Tucker, Christoph Hellwig, Jim Harris and Quentin Monnet, among others, actively participated. The main goal of this effort is to have a central source of truth on how BPF should be implemented without getting into platform-specific implementation details. Participants agreed that “must/should/may” language should be used throughout the document(s) and that the document(s) will be versioned. The initial scope will include a description of the ISA (instruction set architecture), the ELF format, BTF (for debug/metadata info) and some verifier expectations not covered by the ISA. Standardization artifacts will be upstreamed in the Linux kernel tree. Communication will take place over email using the BPF mailing lists. As a next step, the BPF community will look for volunteers to drive the effort.

8/18: “scheduler BPF” with Ren Zhijie (renzhijie2@xxxxxxxxxx) and Hanjun Guo (guohanjun@xxxxxxxxxx)  

Ren Zhijie led a presentation [1] about the Huawei BPF scheduler implementation and its design considerations. The Huawei BPF scheduler is based on Roman’s patchset [2], which introduces a set of hooks and helpers in CFS. The goal is to have a scheduler that can be easily adapted to different workloads, including cloud computing, databases and mobile platforms. The Huawei patchset is ready for inclusion into the OpenEuler Linux distribution.

The Huawei patchset adds a tag management system, additional hooks, helpers, and self-tests. All code is isolated behind CONFIG_BPF_SCHED. The tag management system allows grouping tasks that use the same scheduling policy, as well as communication between different Linux kernel subsystems and user-space. The tag is an s64 value added to the struct task and struct task_group. The Huawei BPF scheduler patches do not support advanced CFS features such as load balancing and load tracking; these are left out of the scope of the current project. Huawei plans to start the upstreaming effort around the end of September.

A discussion followed between Ren Zhijie, Hanjun Guo, Hao Luo (haoluo@xxxxxxxxxx), Dan Schatzberg (dschatzberg@xxxxxx) and Tejun Heo (tj@xxxxxxxxxx). Google is working on a similar effort called Ghost [3]. Meta engineers have also been looking into various kinds of scheduler optimizations. Engineers discussed different approaches to the problem, from extending CFS with hooks to building a completely new scheduling class with logic implemented in BPF. Every approach has its pros and cons. In the end, engineers from Huawei, Meta and Google agreed to collaborate on the alignment and upstreaming effort.

[1] https://gitee.com/openeuler/kernel/issues/I5KUFB (see the PDF slides attached)
[2] https://lore.kernel.org/bpf/20210916162451.709260-1-guro@xxxxxx/
[3] https://dl.acm.org/doi/10.1145/3477132.3483542



