Hi, Jan!
在 2024/03/21 19:27, Jan Kara 写道:
Hello!
On Tue 19-03-24 16:26:19, Yu Kuai wrote:
在 2024/03/19 7:22, Christoph Hellwig 写道:
On Mon, Mar 18, 2024 at 03:19:03PM +0800, Yu Kuai wrote:
I come up with an ideal:
While opening the block_device the first time, store the generated new
file in "bd_inode->i_private". And release it after the last opener
close the block_device.
The advantages are:
- multiple openers can share the same bdev_file;
- raw block device ops can use the bdev_file as well, and there is no
need to distinguish iomap/buffer_head for raw block_device;
Please let me know what do you think?
That does sound very reasonable to me.
I just implement the ideal with following patch(not fully tested, just
boot and some blktests)
So I was looking into this and I'm not sure I 100% understand the problem.
I understand that the inode you get e.g. in blkdev_get_block(),
blkdev_iomap_begin() etc. may be an arbitrary filesystem block device
inode. But why can't you use I_BDEV(inode->i_mapping->host) to get to the
block device instead of your file_bdev(inode->i_private)? I don't see any
advantage in stashing away that special bdev_file into inode->i_private but
perhaps I'm missing something...
Because we're goning to remove the 'block_device' from iomap and
buffer_head, and replace it with a 'bdev_file'.
patch 19 from this set is using a union of block_device and bdev_file,
this can work as well.
Thanks,
Kuai
Honza
diff --git a/block/fops.c b/block/fops.c
index 4037ae72a919..059f6c7d3c09 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -382,7 +382,7 @@ static ssize_t blkdev_direct_IO(struct kiocb *iocb,
struct iov_iter *iter)
static int blkdev_iomap_begin(struct inode *inode, loff_t offset, loff_t
length,
unsigned int flags, struct iomap *iomap, struct iomap
*srcmap)
{
- struct block_device *bdev = I_BDEV(inode);
+ struct block_device *bdev = file_bdev(inode->i_private);
loff_t isize = i_size_read(inode);
iomap->bdev = bdev;
@@ -404,7 +404,7 @@ static const struct iomap_ops blkdev_iomap_ops = {
static int blkdev_get_block(struct inode *inode, sector_t iblock,
struct buffer_head *bh, int create)
{
- bh->b_bdev = I_BDEV(inode);
+ bh->b_bdev = file_bdev(inode->i_private);
bh->b_blocknr = iblock;
set_buffer_mapped(bh);
return 0;
@@ -598,6 +598,7 @@ blk_mode_t file_to_blk_mode(struct file *file)
static int blkdev_open(struct inode *inode, struct file *filp)
{
+ struct file *bdev_file;
struct block_device *bdev;
blk_mode_t mode;
int ret;
@@ -614,9 +615,28 @@ static int blkdev_open(struct inode *inode, struct file
*filp)
if (!bdev)
return -ENXIO;
+ bdev_file = alloc_and_init_bdev_file(bdev,
+ BLK_OPEN_READ | BLK_OPEN_WRITE, NULL);
+ if (IS_ERR(bdev_file)) {
+ blkdev_put_no_open(bdev);
+ return PTR_ERR(bdev_file);
+ }
+
+ bdev_file->private_data = ERR_PTR(-EINVAL);
+ get_bdev_file(bdev, bdev_file);
ret = bdev_open(bdev, mode, filp->private_data, NULL, filp);
- if (ret)
+ if (ret) {
+ put_bdev_file(bdev);
blkdev_put_no_open(bdev);
+ } else {
+ filp->f_flags |= O_LARGEFILE;
+ filp->f_mode |= FMODE_BUF_RASYNC | FMODE_CAN_ODIRECT;
+ if (bdev_nowait(bdev))
+ filp->f_mode |= FMODE_NOWAIT;
+ filp->f_mapping = bdev_mapping(bdev);
+ filp->f_wb_err =
filemap_sample_wb_err(bdev_file->f_mapping);
+ }
+
return ret;
}
.