Re: reiser4: FITRIM ioctl -- how to grab the space?

Ivan Shapovalov <intelfx100@xxxxxxxxx> · Sat, 16 Aug 2014 21:02:38 +0400

On Saturday 16 August 2014 at 14:15:29, Edward Shishkin wrote:	
> 
> On 08/16/2014 01:17 PM, Ivan Shapovalov wrote:
> > On Saturday 16 August 2014 at 10:09:44, Edward Shishkin wrote:	
> >> On 08/16/2014 02:44 AM, Ivan Shapovalov wrote:
> >>> On Monday 11 August 2014 at 13:39:12, Ivan Shapovalov wrote:	
> >>>> [...]
> >>>>>> I've meant "grabbing all space and then allocating all space" -- so there won't
> >>>>>> be multiple grabs or multiple atoms.
> >>>>>>
> >>>>>> Then all processes grabbing space with BA_CAN_COMMIT will wait for the discard
> >>>>>> atom to commit.
> >>>>> It seems such waiting will screw up the system. No?
> >>>> I was afraid of such situations, but how would that happen? The discard atom's
> >>>> commit will always be able to proceed as it doesn't grab space at all.
> >>>>
> >>>>>>     (Actually, there is a small race window between grabbing space
> >>>>>> and creating an atom...)
> >>>>> Which one?
> >>>> BA_CAN_COMMIT machinery does wait only for atoms, not for contexts. If
> >>>> process X happens to grab space between us grabbing space and creating an atom,
> >>>> it will get -ENOSPC even with BA_CAN_COMMIT.
> >>
> >> I still don't see any "races" here. How is atom creation related to grabbing
> >> space? Are we talking about races in the existing code? If so, please show
> >> the racing paths.
> > Well, this is not a race per se - it does not involve locking. But it is
> > a race-like behavior.
> >
> > taskA                                 taskB
> > --------------------------------------------------------------------------
> > grab very much space
> 
> Ok, assume A wants X blocks.
> 
> >                                          grab some space with BA_CAN_COMMIT
> 
> Assume B wants Y blocks.
> 
> > create an atom using the grabbed space
> 
> 
> Please, specify which code is executing at this point.
> 
> Anyway, we don't need any reservation to _create_ an atom.
> Reservation is expended when allocating blocks on the low level
> (bitmaps). Reservation (grabbing space) is needed to avoid hard
> ENOSPC (= no free bits in bitmaps) in situations where we cannot
> fail (e.g. flush, commit, etc.).

Let's take reiser4_sync_file_common().

The grabbing is
	reserve = estimate_update_common(dentry->d_inode);
	if (reiser4_grab_space(reserve, BA_CAN_COMMIT)) {

The atom is created (somewhere deep in the call stack) in
	write_sd_by_inode_common(dentry->d_inode);

Clearly, syncing a file won't increase the real space occupied by data on disk.
However, because there is WA + journaling, such a transaction still needs some
space to complete. This is "X blocks".

Suppose a second sync is scheduled between the grab and the atom creation of
the first sync. In the same vein it needs Y blocks, and Y is such that
Y < free-space < X+Y.

In this case, the second sync will fail despite the BA_CAN_COMMIT flag given to
reiser4_grab_space(): at the time of its execution, the first sync has not yet
created its atom, so there is nothing to commit to reclaim those X blocks.

However, if the second sync gets ordered after write_sd_by_inode_common() of
the first sync, BA_CAN_COMMIT machinery will eventually execute
txnmgr_force_commit_all() which will wait for the first sync to complete and
reclaim those X blocks.

So, the second transaction's result depends on scheduling. It is a race-like
behavior.
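
To make the two orderings concrete, here is a toy userspace model. It is not
reiser4 code: every toy_* name is hypothetical, the numbers (100 free blocks,
X = 80, Y = 40) are made up to satisfy Y < free-space < X+Y, and it assumes
that a forced commit hands back the whole reservation of every context that
already has an atom. It is only a sketch of the argument above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these names exist in reiser4. */
struct toy_fs {
	long free_blocks;     /* blocks still available for grabbing */
	long grabbed_no_atom; /* grabbed by contexts that have no atom yet */
	long grabbed_in_atom; /* grabbed by contexts that already have an atom */
};

/* Model of a forced commit: only reservations tied to an atom come back. */
static void toy_force_commit_all(struct toy_fs *fs)
{
	fs->free_blocks += fs->grabbed_in_atom;
	fs->grabbed_in_atom = 0;
}

/*
 * Model of a grab with BA_CAN_COMMIT: on shortage, force a commit and
 * retry once. Returns 0 on success, -1 to stand in for -ENOSPC.
 */
static int toy_grab_can_commit(struct toy_fs *fs, long count)
{
	assert(count >= 0);
	if (fs->free_blocks < count)
		toy_force_commit_all(fs);
	if (fs->free_blocks < count)
		return -1;
	fs->free_blocks -= count;
	fs->grabbed_no_atom += count;
	return 0;
}

/* Model of atom creation: the reservation becomes reclaimable by commit. */
static void toy_create_atom(struct toy_fs *fs, long count)
{
	fs->grabbed_no_atom -= count;
	fs->grabbed_in_atom += count;
}

/*
 * Run the two-sync scenario and return the result of the second grab.
 * b_after_atom selects whether the second sync is ordered after the
 * first sync has created its atom.
 */
static int toy_run(bool b_after_atom)
{
	struct toy_fs fs = { .free_blocks = 100 };
	const long x = 80, y = 40; /* Y < free-space < X+Y */

	if (toy_grab_can_commit(&fs, x) != 0) /* first sync grabs X */
		return -2;
	if (b_after_atom)
		toy_create_atom(&fs, x);
	return toy_grab_can_commit(&fs, y);   /* second sync grabs Y */
}
```

In this model toy_run(false) returns -1 (the second grab finds nothing to
commit), while toy_run(true) returns 0 (the forced commit reclaims X first).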

-- 
Ivan Shapovalov / intelfx /

> 
> 
> >
> > In this case, the taskB's grab will fail though it could wait for taskA's
> > not yet created atom.
> 
> 
> I still don't see why somebody should fail if X+Y < free-space-on-disk.
> If X+Y > free-space, then yes, someone will fail, and it is correct.