Re: serializing access to configfs

On Thu, 2014-08-14 at 10:05 -0700, Andy Grover wrote:
> On 08/12/2014 04:52 PM, Nicholas A. Bellinger wrote:
> > So I've been thinking about this recently, and wondering what use cases
> > could end up being problematic with a 'BCL' approach, especially if done
> > below the targetcli (end-user) level that end up preventing the inherent
> > parallelism provided by existing /sys/kernel/config/target/* logic.
> >
> > One case that comes to mind is where a large number (say 256) of
> > backends + fabric endpoints + LUNs are failing over from one physical
> > host to another.  Having to sequentially perform these operations under the
> > guise of a single BCL can significantly slow down the total time to
> > recreate backends + fabric endpoints + LUNs, potentially resulting in
> > forward-facing I/O timeouts and other types of unpleasantness.
> 
> I don't follow - if userspace is doing a large would-be atomic 
> transition like this, wouldn't grabbing the BCL allow it to do its thing 
> without disruption? Every time a finer-grained lock is released there is 
> a chance for another entity to get in and change things unexpectedly, so 
> to be safe the initial entity would have to re-read configfs under the 
> new lock instead of carrying knowledge of the state across the entire 
> transaction.
> 

I was specifically talking about parallel creation of configfs groups
and symlinks in target/core/$HBA/$DEV/ + target/$FABRIC/$WWPN/$TPG/.

The only real ordering requirement in rtslib for parallel creation is
that backend devices are created before individual target WWPN endpoints
attempt to symlink them as configfs LUNs.
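
To make that ordering concrete, here is a minimal sketch (not rtslib
code; the HBA/device names, WWPNs, and counts are invented for
illustration) of what parallel creation against
/sys/kernel/config/target can look like:

    # Sketch only: configfs group creation is mkdir(2), and LUN mapping
    # is symlink(2).  All names and counts here are hypothetical.
    import os
    from concurrent.futures import ThreadPoolExecutor

    CFG = "/sys/kernel/config/target"
    WWPNS = ["21:00:00:24:ff:31:4c:48", "21:00:00:24:ff:31:4c:49"]

    def create_backend(i):
        # each backend device is an independent configfs group
        os.mkdir("%s/core/iblock_0/dev%d" % (CFG, i))
        # ... configure the device via its control attributes here ...

    def link_lun(wwpn, i):
        # assumes $WWPN/tpgt_1 has already been created the same way
        lun = "%s/qla2xxx/%s/tpgt_1/lun/lun_%d" % (CFG, wwpn, i)
        os.mkdir(lun)
        # the LUN mapping is a symlink back to the backend device group
        os.symlink("%s/core/iblock_0/dev%d" % (CFG, i),
                   "%s/dev%d" % (lun, i))

    os.mkdir("%s/core/iblock_0" % CFG)    # the HBA group, created once
    with ThreadPoolExecutor() as pool:    # phase 1: backends in parallel
        list(pool.map(create_backend, range(256)))
    with ThreadPoolExecutor() as pool:    # phase 2: LUN links in parallel
        for wwpn in WWPNS:
            for i in range(256):
                pool.submit(link_lun, wwpn, i)

Everything inside each phase can run in parallel; the only barrier is
between the two phases, which is a much weaker constraint than a single
BCL held across the whole sequence.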

> > This can be compounded by the amount of time it takes for hardware to
> > re-initialize when transitioning into target mode.  One example that
> > comes to mind is with qla2xxx, where enabling target mode ends up taking
> > 5-10 seconds per WWPN.
> 
> This isn't the best example, because afaik qla2xxx must be set into 
> target mode at boot.

The loading of qla2xxx.ko at boot does not put the hardware ISPs into
target mode; it simply prevents normal initiator-mode operation from
being enabled during the initial PCI probe_one callbacks.  The ISP's
switch into target mode, and the subsequent reset delay, occur each time
a port is activated via target/qla2xxx/$WWPN/$TPGT/enable.

This is an important reason why the notion of a BCL can be problematic
in real-world usage: configfs operations for target fabric endpoints can
block longer than expected, preventing rtslib operations on separate
port endpoints from completing independently.
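
For example (a sketch only, with invented WWPNs), the practical
difference is whether those 5-10 second resets accumulate or overlap:

    # Sketch: each write may block for 5-10s in the LLD while the ISP
    # resets into target mode.  Serialized under a BCL the delays add
    # up (N ports -> N * 5-10s); issued independently they overlap.
    import threading

    def enable_tpg(wwpn):
        path = "/sys/kernel/config/target/qla2xxx/%s/tpgt_1/enable" % wwpn
        with open(path, "w") as f:
            f.write("1")   # returns only once the ISP reset completes

    threads = [threading.Thread(target=enable_tpg, args=(w,))
               for w in ("21:00:00:24:ff:31:4c:48",
                         "21:00:00:24:ff:31:4c:49")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()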

>  As a counterexample I would say that in general configfs 
> doesn't wait for hardware; it allows the setting to be set and doesn't 
> block. iscsi lets you configure portals for IPs that have not been 
> configured, and tcm_fc also lets you configure WWPNs before they are 
> present, as examples.
> 

That's a fine assumption for creation of a few simple iscsi + tcm_fc
endpoints, but what happens when one outstanding I/O blocks waiting to
complete during active session shutdown..?

In general there is no delay if all backend devices give back all of
their I/O requests all of the time, but in practice individual backend
I/Os take longer than expected to complete, or, in the case of some
broken LLDs, never return at all.

This makes the BCL approach particularly problematic for H/A usage when
shutdown I/O timeouts occur on multiple devices at the same time,
causing further delays to rtslib operations that are logically unrelated
to the affected configfs context.
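
To illustrate the failure mode (a sketch only; the lock files are
invented, and the configfs teardown is simplified to a single rmdir):

    # Sketch: with one shared lock file (a BCL), endpoint B's teardown
    # queues behind endpoint A's even when A is stuck on a hung I/O;
    # with per-endpoint lock files, only the affected endpoint stalls.
    import fcntl
    import os

    def delete_endpoint(wwpn, lockpath):
        with open(lockpath, "w") as lock:
            fcntl.flock(lock, fcntl.LOCK_EX)  # may queue behind a peer
            # removing an enabled TPG waits for active session
            # shutdown; a backend I/O that never completes parks us
            # here with the lock still held
            os.rmdir("/sys/kernel/config/target/qla2xxx/%s/tpgt_1"
                     % wwpn)

    # BCL flavour: every caller contends on the same file
    delete_endpoint("21:00:00:24:ff:31:4c:48", "/var/lock/target.lock")

    # per-endpoint flavour: unrelated teardowns stay independent
    delete_endpoint("21:00:00:24:ff:31:4c:49",
                    "/var/lock/target-21:00:00:24:ff:31:4c:49.lock")

Either way the stuck rmdir blocks, but in the second flavour it blocks
holding only its own lock, so teardown of other endpoints proceeds.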

> > So that said, I think there is some value in a BCL approach at the
> > targetcli level to help avoid simple beginner mistakes for new users,
> > but such an approach at the rtslib level really limits what the kernel
> > design was intended to do: allowing many outstanding operations to
> > be executed in parallel from different processes within separate
> > configfs group contexts.
> >
> > So please, let's avoid putting training wheels on rtslib that limit the
> > underlying parallelism already built into target/configfs.
> 
> Taking a step back, I think we all agree that contention will be 
> extremely low -- this is really about a safeguard.
>
> I think there's still room for an entity grabbing the lock to 
> parallelize its own configfs accesses while holding the BCL.
> 
> Given that, and that we'd want to evangelize this to other people 
> writing LIO tools in different languages, simplicity is key. A single 
> BCL enables easier parallelism within an accessing entity, and the 
> downside of not enabling cross-entity parallelism is moot if contention 
> is low.
> 
> BTW we could also prototype some different designs and try them out, to 
> help reach a consensus.
> 

A safeguard for end-users in the shell, probably yes.

A set of training wheels for developers using rtslib to protect them
from their own apps, I'm not entirely convinced...

Jerome, any more thoughts..?

--nab
