Re: [PATCH 3/3] opensm/osm_perfmgr.c: Fix perfmgr sweep_state race

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 6/27/2014 1:56 PM, Albert Chu wrote:
> Introduce new sweep state PERFMGR_SWEEP_POST_PROCESSING to fix
> race in perfmgr.
> 
> Race occurs as follows:

Nice find!

> 
> Under typical conditions, osm_perfmgr_process() is entered
> with sweep_state set to PERFMGR_SWEEP_SLEEP.  osm_perfmgr_process()
> sets sweep_state to PERFMGR_SWEEP_ACTIVE when it begins to sweep.
> 
> osm_perfmgr_process() will eventually call perfmgr_send_mad() by
> way of perfmgr_query_counters() and several other functions.
> 
> Responses to performance counter MADs may initiate the sending
> of more MADs via perfmgr_send_mad(), such as through redirection
> or the desire to clear counters.
> 
> If too many MADs have been put on the wire, perfmgr_send_mad()
> will throttle sending out MADS and temporarily change sweep_state
> between PERFMGR_SWEEP_SUSPENDED and PERFMGR_SWEEP_ACTIVE as it
> throttles.  The sweep_state is set to PERFMGR_SWEEP_ACTIVE
> when all performance counter MADs have been sent out by the sweeper.
> 
> osm_perfmgr_process() eventually completes its sweep and puts
> sweep_state back into PERFMGR_SWEEP_SLEEP.
> 
> At this point, some MADs may still be on the wire.  New MADs may be
> put back on the wire if responses necessitate it (redirection or
> clearing counters).  If enough MADs are put back onto the wire,
> perfmgr_send_mad() will throttle as normal, temporarily moving
> between PERFMGR_SWEEP_SUSPENDED and PERFMGR_SWEEP_ACTIVE.  After
> the throttling is complete, sweep_state is put into
> PERFMGR_SWEEP_ACTIVE state.
> 
> This is the key problem, the sweep_state is changed from
> PERFMGR_SWEEP_SLEEP to PERFMGR_SWEEP_ACTIVE outside of
> osm_perfmgr_process().
> 
> Now that the perfmgr is in ACTIVE state, any future sweep call to
> osm_perfmgr_process() will not sweep b/c the sweep_state is set
> to PERFMGR_SWEEP_ACTIVE.
> 
> The introduction of a new sweep_state PERFMGR_SWEEP_POST_PROCESSING
> fixes this problem.
> 
> If perfmgr_send_mad() throttles mads while in PERFMGR_SWEEP_SLEEP.
> sweep_state will be moved into the PERFMGR_SWEEP_POST_PROCESSING
> state instead of PERFMGR_SWEEP_SUSPENDED/PERFMGR_SWEEP_ACTIVE.
> 
> When all post-SLEEP state MAD processing is complete, the sweep_state
> will move from PERFMGR_SWEEP_POST_PROCESSING back to PERFMGR_SWEEP_SLEEP,
> so that future sweeps can operate as normal.
> 
> Signed-off-by: Albert L. Chu <chu11@xxxxxxxx>
> ---
>  include/opensm/osm_perfmgr.h |    6 +++++-
>  opensm/osm_perfmgr.c         |   21 +++++++++++++++++++--
>  2 files changed, 24 insertions(+), 3 deletions(-)
> 
> diff --git a/include/opensm/osm_perfmgr.h b/include/opensm/osm_perfmgr.h
> index ab02c26..44a278d 100644
> --- a/include/opensm/osm_perfmgr.h
> +++ b/include/opensm/osm_perfmgr.h
> @@ -88,7 +88,8 @@ typedef enum {
>  typedef enum {
>  	PERFMGR_SWEEP_SLEEP,
>  	PERFMGR_SWEEP_ACTIVE,
> -	PERFMGR_SWEEP_SUSPENDED
> +	PERFMGR_SWEEP_SUSPENDED,
> +	PERFMGR_SWEEP_POST_PROCESSING
>  } osm_perfmgr_sweep_state_t;
>  
>  typedef struct monitored_port {
> @@ -239,6 +240,9 @@ inline static const char *osm_perfmgr_get_sweep_state_str(osm_perfmgr_t * perfmg
>  	case PERFMGR_SWEEP_SUSPENDED:
>  		return "Suspended";
>  		break;
> +	case PERFMGR_SWEEP_POST_PROCESSING:
> +		return "PostProcessing";
> +		break;
>  	}
>  	return "UNKNOWN";
>  }
> diff --git a/opensm/osm_perfmgr.c b/opensm/osm_perfmgr.c
> index bfa63dc..863e9bf 100644
> --- a/opensm/osm_perfmgr.c
> +++ b/opensm/osm_perfmgr.c
> @@ -164,6 +164,16 @@ static inline void decrement_outstanding_queries(osm_perfmgr_t * pm)
>  {
>  	cl_atomic_dec(&pm->outstanding_queries);
>  	cl_event_signal(&pm->sig_query);
> +
> +	if (!pm->outstanding_queries) {
> +		cl_spinlock_acquire(&pm->lock);
> +		if (pm->sweep_state == PERFMGR_SWEEP_POST_PROCESSING) {
> +			pm->sweep_state = PERFMGR_SWEEP_SLEEP;
> +			OSM_LOG(pm->log, OSM_LOG_INFO,
> +				"PM sweep state exiting Post Processing\n");
> +		}
> +		cl_spinlock_release(&pm->lock);
> +	}

Shouldn't this potential sweep state update occur prior to the event
being signaled ?

-- Hal

>  }
>  
>  /**********************************************************************
> @@ -439,7 +449,13 @@ static ib_api_status_t perfmgr_send_mad(osm_perfmgr_t *perfmgr,
>  		while (perfmgr->outstanding_queries >
>  		       (int32_t)perfmgr->max_outstanding_queries) {
>  			cl_spinlock_acquire(&perfmgr->lock);
> -			perfmgr->sweep_state = PERFMGR_SWEEP_SUSPENDED;
> +			if (perfmgr->sweep_state == PERFMGR_SWEEP_SLEEP) {
> +				perfmgr->sweep_state = PERFMGR_SWEEP_POST_PROCESSING;
> +				OSM_LOG(perfmgr->log, OSM_LOG_INFO,
> +					"PM sweep state going into Post Processing\n");
> +			}
> +			else if (perfmgr->sweep_state == PERFMGR_SWEEP_ACTIVE)
> +				perfmgr->sweep_state = PERFMGR_SWEEP_SUSPENDED;
>  			cl_spinlock_release(&perfmgr->lock);
>  wait:
>  			sts = cl_event_wait_on(&perfmgr->sig_query,
> @@ -1015,7 +1031,8 @@ void osm_perfmgr_process(osm_perfmgr_t * pm)
>  
>  	cl_spinlock_acquire(&pm->lock);
>  	if (pm->sweep_state == PERFMGR_SWEEP_ACTIVE ||
> -	    pm->sweep_state == PERFMGR_SWEEP_SUSPENDED) {
> +	    pm->sweep_state == PERFMGR_SWEEP_SUSPENDED ||
> +	    pm->sweep_state == PERFMGR_SWEEP_POST_PROCESSING) {
>  		cl_spinlock_release(&pm->lock);
>  		OSM_LOG(pm->log, OSM_LOG_INFO,
>  			"PM sweep state %d, skipping sweep\n",

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux