Re: Orangefs ABI documentation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



>> Objections, comments?

I have no objections so far to your suggestions, I'm trying to keep
my co-workers looking in on this discussion... most other days
we're all lined up in adjacent cubes...

Your attention and fixes will almost certainly all be improvements...
I found a few fixes you made to the old
version of the writev code that were off base, but that
code made it look like impossible situations needed
to be handled...

I'll compile your repo again and run the quick tests
when you think your additions are stable...

-Mike

On Fri, Jan 22, 2016 at 12:43 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> On Fri, Jan 22, 2016 at 05:08:38PM +0000, Al Viro wrote:
>
>> BTW, what should happen to requests that got a buggered response in
>> orangefs_devreq_write_iter()?  As it is, you just free them and to hell
>> with whatever might've been waiting on them; that's easy to fix, but
>> what about the submitter?  Should we let it time out/simulate an error
>> properly returned by daemon/something else?
>
> FWIW, here's what I have in mind for lifetime rules:
>         1) op_alloc() creates them with refcount 1
>         2) when we decide to pass the sucker to daemon, we bump the refcount
> before taking the thing out of request list.  After successful copying
> to userland, we check if the submitter has given up on it.  If it hasn't,
> we move the sucker to hash and decrement the refcount (it can't be the
> last reference there).  If it has, we do put_op().  After _failed_ copying
> to userland, we also check if it's been given up.  If it hasn't, we move
> it back to request list and decrement the refcount.  If it has, we do
> put_op().
>         3) when we get a response from daemon, we bump the refcount before
> taking the thing out of hash.  Once we are done, we call put_op() - in *all*
> cases.  Malformed response is treated like a well-formed error.
>         4) if submitter decides to give up upon a request, it sets a flag in
> ->op_state.  From that point on nobody else is going to touch op->list.
>         5) once submitter is done, it does put_op() - *NOT* op_release().
> Normally it ends up with request freed, but if we'd raced with interaction
> with daemon we'll get it freed by put_op() done in (2) or (3).  No
>         /*
>          * tell the device file owner waiting on I/O that this read has
>          * completed and it can return now.  in this exact case, on
>          * wakeup the daemon will free the op, so we *cannot* touch it
>          * after this.
>          */
>         wake_up_daemon_for_return(new_op);
>         new_op = NULL;
> anymore - just wake it up and do put_op() regardless.  In that case daemon
> has already done get_op() and will free the sucker once it does its put_op().
>
> Objections, comments?
>
> BTW, it really looks like we might be better off without atomic_t here -
> op->lock is held for all non-trivial transitions and we might as well have
> put_op() require op->lock held and do something like
>         if (--op->ref_count) {
>                 spin_unlock(op->lock);
>                 return;
>         }
>         spin_unlock(op->lock);
>         op_release(op);
>
> Speaking of op->lock:
>                         /*
>                          * let process sleep for a few seconds so shared
>                          * memory system can be initialized.
>                          */
>                         spin_lock_irqsave(&op->lock, irqflags);
>                         prepare_to_wait(&orangefs_bufmap_init_waitq,
>                                         &wait_entry,
>                                         TASK_INTERRUPTIBLE);
>                         spin_unlock_irqrestore(&op->lock, irqflags);
> What do you need to block interrupts for?
>
> Oh, and why do you call orangefs_op_initialize(orangefs_op) from op_release(),
> when you immediately follow it with freeing orangefs_op and do it on
> allocation anyway?
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux