Re: Tracking actual disk write sources instead of flush thread

On Apr 23, 2014, at 1:39 PM, Phillip Susi <psusi@xxxxxxxxxx> wrote:
> On 4/23/2014 9:48 AM, Matthew Wilcox wrote:
> > I don't understand your high-level goal, which makes suggesting
> > low-overhead solutions hard.  Can you tolerate a certain amount of
> > ambiguity, for example?  Do you really only want to track back to
> > the UID that is causing the I/O?  With shared mmaps, are you OK
> > attributing the I/O to one of the processes that has written to it,
> > or do you need to attribute the write to all the processes that
> > have written to that page?
> 
> I suppose the first process that dirties the page would be fine.  It
> isn't very often that more than one process is writing to the same
> data at the same time.

Adding a pointer or integer per page would likely meet resistance, but
tracking this on a per-inode basis seems pretty reasonable.  It is fairly
uncommon to have multiple threads writing to the same file, and I would
guess it is vanishingly rare that different applications are writing to
the same file at one time.

Storing {current->comm}.{pid} would take 20 bytes of space per inode (16
bytes for comm plus 4 for the pid), but would be much more useful than just
storing {pid}, since a process may be long gone by the time that the blocks
are even submitted to disk due to delayed allocation and such.
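
As a rough illustration of the 20-byte figure, here is a minimal userspace
sketch.  The names struct write_source, write_source_set_first() and
write_source_format() are hypothetical and do not exist in the kernel; the
"first writer wins" policy is an assumption taken from the discussion
above, and an empty comm is used to mean "not yet recorded".

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define TASK_COMM_LEN 16	/* same length the kernel uses for task names */

/* Hypothetical per-inode record of the first writer (not a kernel struct). */
struct write_source {
	char	comm[TASK_COMM_LEN];	/* copy of current->comm, NUL-terminated */
	int32_t	pid;			/* stand-in for pid_t */
};

/* 16 bytes of comm plus a 4-byte pid, no padding needed. */
static_assert(sizeof(struct write_source) == 20, "expected 20 bytes");

/*
 * Record the writer only if nothing has been recorded yet.
 * Returns 1 if this call stored the source, 0 if one was already set.
 */
static int write_source_set_first(struct write_source *ws,
				  const char *comm, int32_t pid)
{
	if (ws->comm[0] != '\0')
		return 0;
	strncpy(ws->comm, comm, TASK_COMM_LEN - 1);
	ws->comm[TASK_COMM_LEN - 1] = '\0';
	ws->pid = pid;
	return 1;
}

/* Format the record as "comm.pid" into buf. */
static void write_source_format(const struct write_source *ws,
				char *buf, size_t len)
{
	snprintf(buf, len, "%s.%d", ws->comm, (int)ws->pid);
}
```

Because comm is copied rather than referenced, the record stays readable
after the writing process has exited, which is the point of storing more
than the pid.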

It would be possible to store a refcounted struct with this info pointed
to from the inode, since it would only be useful on inodes being written,
but that has to be balanced against the complexity of maintaining that
struct and the potential of saving 12 bytes per inode (the 20 bytes of
info, less the 8-byte pointer that would still be needed in the inode).
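
A userspace sketch of what the refcounted variant might look like follows.
All names here (struct write_source_ref, struct toy_inode, i_wsrc,
wsrc_alloc/get/put) are hypothetical, and the plain integer refcount is an
assumption made to keep the example short; real kernel code would need an
atomic count (something like kref) and locking around the inode pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical shared record; allocated only for inodes being written. */
struct write_source_ref {
	unsigned int	refcount;	/* NOT atomic: illustration only */
	char		comm[16];
	int32_t		pid;
};

/* Toy inode: carries only a pointer, NULL when the file is not written. */
struct toy_inode {
	struct write_source_ref *i_wsrc;
};

/* Allocate a record with one reference held by the caller. */
static struct write_source_ref *wsrc_alloc(const char *comm, int32_t pid)
{
	struct write_source_ref *ws = calloc(1, sizeof(*ws));

	if (!ws)
		return NULL;
	ws->refcount = 1;
	strncpy(ws->comm, comm, sizeof(ws->comm) - 1);
	ws->pid = pid;
	return ws;
}

/* Take an extra reference, e.g. when a second inode shares the record. */
static struct write_source_ref *wsrc_get(struct write_source_ref *ws)
{
	ws->refcount++;
	return ws;
}

/* Drop a reference; returns 1 if this was the last one and ws was freed. */
static int wsrc_put(struct write_source_ref *ws)
{
	if (--ws->refcount == 0) {
		free(ws);
		return 1;
	}
	return 0;
}
```

The get/put bookkeeping on every path that attaches or clears the pointer
is the maintenance cost being weighed against the 12 bytes.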

There are potentially a number of other fields in struct inode that are
only used during writes (i_size_seqcount, dirtied_when (how did that avoid
getting an i_ prefix?), i_wb_list, i_writecount) that might also be moved
to a separate struct that is allocated only for files being written (add
8-byte pointer, subtract 32 bytes for fields).  That would have the benefit
of slimming down the majority of files not currently being written, and
make the additional "write source" information less costly.
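
To make the arithmetic concrete, here is a userspace sketch of such a
split.  The struct and function names are hypothetical, and the field
types are stand-ins whose sizes I am assuming for a 64-bit build (4-byte
seqcount, 8-byte timestamp, 16-byte list head, 4-byte atomic counter); the
real sizes depend on architecture and kernel config.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for the kernel's struct list_head (two pointers). */
struct toy_list_head {
	struct toy_list_head *next, *prev;
};

/* Hypothetical container for the write-only fields named above. */
struct toy_write_state {
	uint32_t		i_size_seqcount;	/* stand-in for seqcount_t */
	int32_t			i_writecount;		/* stand-in for atomic_t */
	uint64_t		dirtied_when;		/* stand-in for unsigned long */
	struct toy_list_head	i_wb_list;
};

/* Toy inode: one pointer replaces the four inline fields. */
struct toy_inode {
	struct toy_write_state *i_wstate;	/* NULL until first write */
};

/* Bytes saved per inode that is NOT being written. */
static size_t write_state_net_saving(void)
{
	return sizeof(struct toy_write_state) - sizeof(void *);
}

/* Allocate the write state lazily, on the first write to the inode. */
static struct toy_write_state *toy_inode_write_state(struct toy_inode *inode)
{
	if (!inode->i_wstate)
		inode->i_wstate = calloc(1, sizeof(*inode->i_wstate));
	return inode->i_wstate;
}

/* Release the write state once the inode is clean again. */
static void toy_inode_drop_write_state(struct toy_inode *inode)
{
	free(inode->i_wstate);
	inode->i_wstate = NULL;
}
```

With 8-byte pointers the struct is 32 bytes and the net saving works out
to 24 bytes for every inode that is only read, at the price of an extra
allocation and pointer dereference on the write path.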

Cheers, Andreas






