Re: quota: dqio_mutex design

Hi Andrew,

On Fri 23-06-17 02:43:44, Andrew Perepechko wrote:
> The original workload was 50 threads sequentially creating files, each
> thread in its own directory, over a fast RAID array.

OK, I can reproduce this. Actually, I can reproduce it even on a normal SATA
drive. Originally I tried a ramdisk to simulate a really fast drive, but there
the dq_list_lock and dq_data_lock contention is much more visible and the
contention on dqio_mutex is minimal (two orders of magnitude smaller). On a
SATA drive we spend ~45% of the runtime contending on dqio_mutex when creating
empty files.

The problem is that if it is a single user creating all these files, it is
not clear how we could do much better - all processes contend to update the
same location on disk with quota information for that user and they have to
be synchronized somehow. If there are more users, we could do better by
splitting dqio_mutex on a per-dquot basis (I have some preliminary patches
for that).
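
To illustrate what I mean by the per-dquot split, here is a simplified
userspace sketch (the names like my_dquot are made up, this is not the actual
kernel code): each in-memory dquot carries its own I/O lock, so commits of
dquots for different ids no longer serialize on one global dqio_mutex.

#include <pthread.h>

/* Simplified stand-in for struct dquot; names are illustrative only. */
struct my_dquot {
        unsigned int    dq_id;          /* user/group/project id */
        long long       dq_curspace;    /* in-memory usage counters */
        long long       dq_curinodes;
        pthread_mutex_t dq_io_lock;     /* serializes I/O on this dquot only */
};

/* Placeholder for ->commit_dqblk(): write this dquot's on-disk record. */
static void my_commit_dqblk(struct my_dquot *dq)
{
        /* ... format and write the quota file entry for dq->dq_id ... */
        (void)dq;
}

static void my_dquot_commit(struct my_dquot *dq)
{
        /*
         * Only writers of the *same* dquot contend here; commits of
         * dquots for other ids take a different lock and run in parallel.
         */
        pthread_mutex_lock(&dq->dq_io_lock);
        my_commit_dqblk(dq);
        pthread_mutex_unlock(&dq->dq_io_lock);
}

For a single heavy user this of course does not help, which is why it only
matters for the multi-user case.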

One idea I have for how we could make things faster is to replace the dquot
dirty flag with a sequence counter. Currently a dquot modification looks
like:

update counters in dquot
dquot_mark_dquot_dirty(dquot);
dquot_commit(dquot)
  mutex_lock(dqio_mutex);
  if (!clear_dquot_dirty(dquot))
    nothing to do -> bail
  ->commit_dqblk(dquot)
  mutex_unlock(dqio_mutex);

When several processes race updating the same dquot, they very often all
end up updating the dquot on disk even though another process has already
written the dquot for them while they were waiting for dqio_mutex - in my
test above the ratio of commit_dqblk / dquot_commit calls was 59%. What we
could do is have dquot_mark_dquot_dirty() return the "current sequence of
the dquot"; dquot_commit() would then be given the sequence that needs to be
written and, if that has already been written (we would also store the latest
written sequence in the dquot), it would bail out doing nothing. This should
cut down dqio_mutex hold times and thus wait times, but I need to experiment
and measure that...
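
As a self-contained sketch of that sequence-counter scheme (again with
made-up names, and glossing over the real locking and snapshotting details):

#include <pthread.h>

struct my_dquot {
        long long       dq_curspace;    /* in-memory usage counters */
        long long       dq_curinodes;
        unsigned long   dq_seq;         /* bumped on every modification */
        unsigned long   dq_written_seq; /* newest sequence already on disk */
        pthread_mutex_t dq_data_lock;   /* protects the counters and dq_seq */
        pthread_mutex_t dq_io_lock;     /* serializes the actual write-out */
};

/* Would take the role of dquot_mark_dquot_dirty(): return the sequence
 * the caller needs to see committed. */
static unsigned long my_mark_dquot_dirty(struct my_dquot *dq)
{
        unsigned long seq;

        pthread_mutex_lock(&dq->dq_data_lock);
        seq = ++dq->dq_seq;
        pthread_mutex_unlock(&dq->dq_data_lock);
        return seq;
}

/* Would take the role of dquot_commit(): skip the write entirely if a
 * sequence >= seq has already been written by someone else. */
static void my_dquot_commit(struct my_dquot *dq, unsigned long seq)
{
        unsigned long cur;

        pthread_mutex_lock(&dq->dq_io_lock);
        if (dq->dq_written_seq >= seq) {
                /* Another process already wrote a state at least this new. */
                pthread_mutex_unlock(&dq->dq_io_lock);
                return;
        }

        pthread_mutex_lock(&dq->dq_data_lock);
        cur = dq->dq_seq;       /* the write below covers everything up to cur */
        /* ... snapshot the counters here as well ... */
        pthread_mutex_unlock(&dq->dq_data_lock);

        /* ... ->commit_dqblk(): write the snapshot to the quota file ... */

        dq->dq_written_seq = cur;
        pthread_mutex_unlock(&dq->dq_io_lock);
}

With something like this, the waiters that pile up behind dqio_mutex would
mostly find dq_written_seq already past their sequence and return immediately
instead of rewriting the same block.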

								Honza

> > On Fri 03-03-17 11:08:42, Jan Kara wrote:
> > > Hello!
> > >
> > > On Thu 02-02-17 15:23:44, Andrew Perepechko wrote:
> > > > We have a heavy metadata related workload (ext4, quota journalling)
> > > > and profiling shows that there's significant dqio_mutex contention.
> > > >
> > > > From the quota code, it looks like every time dqio_mutex is taken
> > > > it protects access to only one quota file.
> > > >
> > > > Is it possible to split dqio_mutex for each of MAXQUOTAS so that
> > > > e.g. 2 parallel dquot_commit()'s can be running for user and group
> > > > quota update? Am I missing any dqio_mutex function that requires
> > > > dqio_mutex to be monolithic?
> > >
> > > So we can certainly make dqio_mutex less heavy. Making it per-quota-type
> > > would be OK but I suspect it will not bring a big benefit. What would
> > > likely be more noticeable is if we avoided dqio_mutex for updates of
> > > quota information - that should not be that hard to do since we update
> > > that in-place and so don't really need the serialization for anything
> > > substantial. However we will need some restructuring of the code to make
> > > such a locking scheme possible in a clean way...
> >
> > So I'm experimenting with some patches. However I have trouble creating
> > a workload where quota updates would show significant overhead. Can you
> > share which workload is problematic for you? Thanks!
> >
> > 								Honza
> 
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR


