On 7/19/21 8:31 PM, longli@xxxxxxxxxxxxxxxxx wrote:
> From: Long Li <longli@xxxxxxxxxxxxx>
>
> Microsoft Azure Blob storage service exposes a REST API to applications
> for data access.
> (https://docs.microsoft.com/en-us/rest/api/storageservices/blob-service-rest-api)
>
> This patchset implements a VSC (Virtualization Service Consumer) that
> communicates with a VSP (Virtualization Service Provider) on the Hyper-V
> host to execute Blob storage access via the native network stack on the
> host. This VSC doesn't implement the semantics of the REST API. Those are
> implemented in user-space. The VSC provides a fast data path to the VSP.
>
> Answers to some previous questions discussing the driver:
>
> Q: Why doesn't this driver use the block layer?
>
> A: Azure Blob is based on a model of object-oriented storage. The storage
> object is not modeled in block sectors. While it's possible to present
> the storage object as a block device (assuming it makes sense to fake all
> the block device attributes), we would lose the ability to express
> functionality that is defined in the REST API.
>
> Q: You just lost all use of caching and io_uring and loads of other
> kernel infrastructure that has been developed and relied on for decades?
>
> A: The REST API is not designed to have caching at the system level. This
> driver doesn't attempt to improve on this. There are discussions on
> supporting ioctl() in io_uring (https://lwn.net/Articles/844875/), which
> would benefit this driver. Block I/O scheduling is not helpful in this
> case, as the Blob application and the Blob storage server have complete
> knowledge of the I/O pattern based on the storage object type. This
> knowledge cannot easily be consumed by the block layer.
>
> Q: You also just abandoned the POSIX model and forced people to use a
> random-custom-library just to access their storage devices, breaking all
> existing programs in the world?
>
> A: Existing Blob applications access storage via HTTP (REST API). They
> don't use a POSIX interface. The interface for Azure Blob is not designed
> around POSIX.
>
> Q: What programs today use this new API?
>
> A: Currently none are released. But per the above, there are also none
> using POSIX.
>
> Q: Where is the API published and what ensures that it will remain
> stable?
>
> A: Cloud-based REST protocols have similar considerations to the kernel
> in terms of interface stability. Applications depend on cloud services
> via REST in much the same way as they depend on kernel services. Because
> applications can consume cloud APIs over the Internet, there is no
> opportunity to recompile applications to ensure compatibility. This
> causes the underlying APIs to be exceptionally stable, and Azure Blob has
> not removed support for an exposed API to date. This driver supports a
> pass-through model where requests in a guest process can be reflected to
> a VM host environment. Like the current REST interface, the goal is to
> ensure that each host provides a high degree of compatibility with each
> guest, but that task is largely outside the scope of this driver, which
> exists to communicate requests in the same way an HTTP stack would. Just
> as an HTTP stack does not require updates to add a new custom header or
> receive one from a server, this driver does not require updates for new
> functionality so long as the high-level request/response model is
> retained.
>
> Q: What happens when it changes over time, do we have to rebuild all
> userspace applications?
>
> A: No. We don’t rebuild them all to talk HTTP either.
> In the current HTTP scheme, applications specify the version of the
> protocol they speak, and the storage backend responds with that version.
>
> Q: What happens to the kernel code over time, how do you handle changes
> to the API there?
>
> A: The goal of this driver is to get requests to the Hyper-V host, so the
> kernel isn’t involved in API changes, in the same way that HTTP
> implementations are robust to extra functionality being added to HTTP.

Another question: why do we need this in the kernel at all? Has it been
considered to provide a driver similar to vfio on top of the Hyper-V bus,
such that this object storage driver could be implemented as a user-space
library instead of as a kernel driver? As you may know, vfio users can use
either eventfds or polling for completion notifications. An interface like
io_uring can be built easily on top of vfio.

Thanks,

Bart.
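P.S. To make the eventfd-based completion path concrete, here is a minimal,
purely illustrative sketch of how a user-space Blob library could wait for
completions from a vfio-style character device. The device node
/dev/azure_blob and the AZ_BLOB_* ioctl names below are hypothetical and
only stand in for whatever interface such a driver would actually expose.

#include <stdint.h>
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/eventfd.h>
#include <sys/ioctl.h>

int main(void)
{
	/* Hypothetical device node exposed by a vfio-style driver. */
	int dev_fd = open("/dev/azure_blob", O_RDWR);
	/* eventfd the kernel side would use to signal request completions. */
	int ev_fd = eventfd(0, EFD_CLOEXEC);
	uint64_t completions;

	if (dev_fd < 0 || ev_fd < 0)
		return 1;

	/* Hypothetical ioctls: register the eventfd, then submit a request. */
	ioctl(dev_fd, 0 /* AZ_BLOB_SET_EVENTFD */, &ev_fd);
	ioctl(dev_fd, 1 /* AZ_BLOB_SUBMIT_REQUEST */, NULL);

	/* read() blocks until at least one completion has been signaled;
	 * the returned counter is the number of completions since the last
	 * read. The same fd could instead be registered with epoll or
	 * io_uring for asynchronous notification. */
	if (read(ev_fd, &completions, sizeof(completions)) ==
	    sizeof(completions))
		printf("%llu request(s) completed\n",
		       (unsigned long long)completions);

	close(ev_fd);
	close(dev_fd);
	return 0;
}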