Re: [LSF/MM/BPF TOPIC] Optimizing Page Cache Readahead Behavior

On Sun, Feb 23, 2025 at 11:04:50AM +0530, Ritesh Harjani wrote:
> Kalesh Singh <kaleshsingh@xxxxxxxxxx> writes:
> 
> > Hi organizers of LSF/MM,
> >
> > I realize this is a late submission, but I was hoping there might
> > still be a chance to have this topic considered for discussion.
> >
> > Problem Statement
> > ===============
> >
> > Readahead can result in unnecessary page cache pollution for mapped
> > regions that are never accessed. Current mechanisms to disable
> > readahead lack granularity and rather operate at the file or VMA
> 
> From what I understand the readahead setting is done at the per-bdi
> level (default set to 128K). That means we don't get to control the
> amount of readahead pages needed on a per file basis. If say we can
> control the amount of readahead pages on a per open fd, will that solve
> the problem you are facing? That also means we don't need to change the
> setting for the entire system, but we can control this knob on a per fd
> basis? 
> 
> I just quickly hacked fcntl to allow setting no. of ra_pages in
> inode->i_ra_pages. Readahead algorithm then takes this setting whenever
> it initializes the readahead control in "file_ra_state_init()"
> So after one opens the file, we can set the fcntl F_SET_FILE_READAHEAD
> to the preferred value on the open fd. 
> 
> 
> Note: I am not saying the implementation could be 100% correct. But it's
> just a quick working PoC to discuss whether this is the right approach
> to the given problem.

> @@ -678,6 +678,8 @@ struct inode {
>  	unsigned short          i_bytes;
>  	u8			i_blkbits;
>  	enum rw_hint		i_write_hint;
> +	/* Per inode setting for max readahead in page_size units */
> +	unsigned long		i_ra_pages;
>  	blkcnt_t		i_blocks;

If your final patch needs to store data in struct inode, please try to
optimize it so that the size of the struct does not change. There are at
least two 4-byte holes, so if you're fine with page-size units for
readahead, one of those should be sufficient.
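
To illustrate the point about holes, here is a minimal userspace sketch. It
is NOT the real struct inode: the toy_* structs are hypothetical stand-ins
that only copy the field names around the hunk, with the types simplified
(i_write_hint as a one-byte field, i_blocks as a 64-bit integer). The
numbers assume an LP64 target such as x86-64; other ABIs may lay this out
differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Toy layout before the patch (hypothetical, not the real struct inode). */
struct toy_before {
	unsigned short	i_bytes;
	unsigned char	i_blkbits;
	unsigned char	i_write_hint;
	/* LP64: padding here so that i_blocks is 8-byte aligned */
	long long	i_blocks;
};

/* As in the PoC: an unsigned long does not fit in the padding. */
struct toy_ulong {
	unsigned short	i_bytes;
	unsigned char	i_blkbits;
	unsigned char	i_write_hint;
	unsigned long	i_ra_pages;
	long long	i_blocks;
};

/* Alternative: a 32-bit count of pages placed in the padding. */
struct toy_uint {
	unsigned short	i_bytes;
	unsigned char	i_blkbits;
	unsigned char	i_write_hint;
	unsigned int	i_ra_pages;
	long long	i_blocks;
};

/* Number of padding bytes between i_write_hint and i_blocks. */
size_t toy_hole_bytes(void)
{
	return offsetof(struct toy_before, i_blocks) -
	       (offsetof(struct toy_before, i_write_hint) +
		sizeof(unsigned char));
}

/* Print the three sizes so the effect of each field type is visible. */
void toy_print_layout(void)
{
	/* Holds on any ABI: the narrower field never makes the struct larger. */
	assert(sizeof(struct toy_uint) <= sizeof(struct toy_ulong));

	printf("hole before i_blocks: %zu bytes\n", toy_hole_bytes());
	printf("before:               %zu bytes\n", sizeof(struct toy_before));
	printf("with unsigned long:   %zu bytes\n", sizeof(struct toy_ulong));
	printf("with unsigned int:    %zu bytes\n", sizeof(struct toy_uint));
}
```

Calling toy_print_layout() from main() on an LP64 build should show the
unsigned int variant keeping the original size while the unsigned long
variant grows. For the real structure, something like "pahole -C inode
vmlinux" on a kernel built with debug info reports the actual holes for a
given config.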



