Hi All,

I apologize for the lengthy email, but I have a lot of things to cover.

As some of you know, a goal of mine is to make it possible to write blk-mq device drivers in Rust. The RFC patches I have sent to this list are the first steps toward making that goal a reality. They are a sample of the work I am doing.

My current plan of action is to provide a Rust API that allows implementation of blk-mq device drivers, along with a Rust implementation of null_blk to serve as a reference implementation. This reference implementation will demonstrate how to use the API.

I attended LSF in Vancouver a few weeks back, where I led a discussion on the topic. My goal for that session was to obtain input from the community on how to upstream the work as it becomes more mature.

I received a lot of feedback during the session, in the hallway, and on the mailing list. Ultimately, we did not achieve consensus on a path forward. I will try to condense the key points raised by the community here. If anyone feels their point is not contained below, please chime in.

Please note that I am paraphrasing the points below; they are not citations.

1) "Block layer community does not speak Rust and thus cannot review Rust patches"
This work hinges on one of two things happening. Either block layer reviewers and maintainers eventually become fluent in Rust, or they accept code in their tree that is maintained by the "rust people". I would very much prefer the first option.

I would suggest using this work to facilitate gradual adoption of Rust. I understand that this will be a multi-year effort. By giving the community access to Rust bindings specifically designed for the block layer, the block layer community will have a helpful reference to consult when investigating Rust.

While the block community is getting up to speed in Rust, the Rust for Linux community is ready to conduct review of patches targeting the block layer.
Until such time as Rust code can be reviewed by block layer experts, the work could be gated behind an "EXPERIMENTAL" flag.

Selection of the null_blk driver as a reference implementation to drive the Rust block API was not random. The null_blk driver is relatively simple and thus makes for a good platform to demonstrate the Rust API without having to deal with actual hardware.

The null_blk driver is a piece of testing infrastructure that is not usually deployed in production environments, so people who are worried about Rust in general will not have to worry about their production environments being infested with Rust.

Finally, there have been suggestions to either replace or complement the existing C null_blk driver with the Rust version. I would suggest (eventually, not _now_) complementing the existing driver, since it can be very useful to benchmark and test the two drivers side by side.

2) "Having Rust bindings for the block layer in-tree is a burden for the maintainers"
I believe we can integrate the bindings in such a way that any potential breakage in the Rust API does not impact current maintenance work. Maintainers and reviewers who do not wish to bother with Rust should be able to opt out. All Rust parts should be gated behind a kconfig option that defaults to N. With this scheme there should be very little inconvenience for current maintainers.

I will take the necessary steps to make sure the block layer Rust bindings are always up to date with changes to the kernel C API. I would run CI against:

 - for-next of https://git.kernel.dk/linux.git
 - master of https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
 - mainline releases including RCs
 - stable and longterm kernels with queues applied
 - stable and longterm releases including RCs

Samsung will provide resources to support this CI effort. Through this effort I will aim to minimize any inconvenience for maintainers.

3) "How will you detect breakage in the Rust API caused by changes to C code?"
The way we call C code from Rust in the kernel guarantees that most changes to C APIs that are called by Rust code will cause a compile failure when building the kernel with Rust enabled. This includes changing C function argument names or types, and struct field names or types. Thus, we do not need to rely on symvers CRC calculation as suggested by James Bottomley at LSF.

However, if the semantics of a kernel C function are changed without changing its name or signature, the potential breakage will not be detected by the build system. To detect breakage resulting from this kind of change, we have to rely _on the same mechanics_ that maintainers of kernel C code are relying on today:

 - kunit tests
 - blktests
 - fstests
 - staying in the loop wrt changes in general

We also have Rust support in Intel 0-day CI, although only compile tests for now.

4) "How will you prevent breakage in C code resulting from changes to Rust code?"
The way the Rust API is designed, existing C code is not going to be reliant on Rust code. If anything breaks, just disable Rust and no Rust code will be built. Or disable only the block layer Rust code if you want to keep general Rust support. As long as Rust is disabled by default, nothing in the kernel should break because of Rust unless it is explicitly enabled.

5) "Block drivers in general are not security sensitive because they are mostly privileged code and have limited user visible API"
There are probably easier ways to exploit a Linux system than to target the block layer, although people are plugging in potentially malicious block devices all the time in the form of USB Mass Storage devices or CF cards.

While memory safety is very relevant for preventing exploitable security vulnerabilities, it is also incredibly useful in preventing memory safety bugs in general. Fewer bugs means less risk of bugs leading to data corruption. It means less time spent on tracking down and fixing bugs, and less time spent reviewing bug fixes.
It also means less time required to review patches in general, because reviewers do not have to review for memory safety issues.

So while Rust has high merit in exposed and historically exploited subsystems, this does not mean that it has no merit in other subsystems.

6) "Other subsystems may benefit more from adopting Rust"
While this might be true, it does not prevent the block subsystem from benefiting from adopting Rust (see 5).

7) "Do not waste time re-implementing null_blk, it is test infrastructure so memory safety does not matter. Why don't you do loop instead?"
I strongly believe that memory safety is also relevant in test infrastructure. We waste time and energy fixing memory safety issues in our code, whether or not the code is test infrastructure. I refer to the statistics I posted to the list at an earlier date [3].

Further, I think it is a benefit to all if the storage community can become fluent in Rust before any critical infrastructure is deployed using Rust. This is one reason that I switched my efforts to null_blk and that I am not pushing Rust NVMe.

8) "Why don't you wait with this work until you have a driver for a new storage standard?"
Let's be proactive. I think it is important to iron out the details of the Rust API before we implement any potential new driver. When we eventually need to implement a driver for a future storage standard, the choice to do so in Rust should be easy. By making the API available ahead of time, we will be able to provide future developers with a stable implementation to choose from.

9) "You are a new face in our community. How do we know you will not disappear?"
I recognize this consideration and I acknowledge that the community is trust based. Trust takes time to build. I can do little more than state that I intend to stay with my team at Samsung to take care of this project for many years to come. Samsung is behind this particular effort.
In general, Google and Microsoft are actively contributing to the wider Rust for Linux project. Perhaps that can be an indication that the project in general is not going away.

10) "How can I learn how to build the kernel with Rust enabled?"
We have a guide in `Documentation/rust/quick-start.rst`. If that guide does not get you started, please reach out to us [1] and we will help you get started (and fix the documentation since it must not be good enough then).

11) "What if something catches fire and you are out of office?"
If I am for some reason not responding to pings during a merge, please contact the Rust subsystem maintainer and the Rust for Linux list [2]. There are quite a few people capable of firefighting if it should ever become necessary.

12) "These patches are not ready yet, we should not accept them"
They most definitely are _not_ ready, and I would not ask for them to be included at all in their current state. The RFC is meant to give a sample of the work that I am doing and to start this conversation. I would rather have this conversation preemptively. I did not intend to give the impression that the patches are in a finalized state at all.

With all this in mind, I would suggest that we treat the Rust block layer API and the associated null block driver as an experiment. I would suggest that we merge it when it is ready, and that we gate it behind an experimental kconfig option. If it turns out that all your worst nightmares come true and it becomes an unbearable load for maintainers, reviewers and contributors, it will be low effort to remove it again. I very much doubt this will be the case though.

Jens, Keith, Christoph, Ming, I would kindly ask you to comment on my suggestion for next steps, or perhaps suggest an alternate path. In general I would appreciate any constructive feedback from the community.
[1] https://rust-for-linux.com/contact
[2] rust-for-linux@xxxxxxxxxxxxxxx
[3] https://lore.kernel.org/all/87y1ofj5tt.fsf@xxxxxxxxxxxx/

Best regards,
Andreas Hindborg