Re: [RFC PATCH v8 08/10] ice: implement dpll interface to control cgu

Jiri Pirko <jiri@xxxxxxxxxxx> · Wed, 21 Jun 2023 14:28:38 +0200

Mon, Jun 19, 2023 at 08:08:11PM CEST, arkadiusz.kubalewski@xxxxxxxxx wrote:
>>From: Jiri Pirko <jiri@xxxxxxxxxxx>
>>Sent: Saturday, June 10, 2023 11:57 AM
>>
>>Fri, Jun 09, 2023 at 02:18:51PM CEST, arkadiusz.kubalewski@xxxxxxxxx wrote:

[...]

>>>+/**
>>>+ * ice_dpll_cb_lock - lock dplls mutex in callback context
>>>+ * @pf: private board structure
>>>+ *
>>>+ * Lock the mutex from the callback operations invoked by dpll subsystem.
>>>+ * Prevent dead lock caused by `rmmod ice` when dpll callbacks are under
>>>stress
>>
>>"dead lock", really? Which one? Didn't you want to write "livelock"?
>>
>
>As explained, rmmod takes and destroys lock, it can happen that
>netlink request/ops hangs on trying to lock when rmmod started deinit.

Yeah, don't take the lock in rmmod, see below.

>
>>If this is livelock prevention, is this something you really see or
>>just an assumption? Seems to me unlikely.
>>
>>Plus, see my note in ice_dpll_init(). If you remove taking the lock from
>>ice_dpll_init() and ice_dpll_deinit(), do you still need this? I don't
>>think so.
>>
>
>I won't remove it from there, as the lock shall serialize the access to ice dpll
>data, so it must be held on init until everything is ready for netlink access,

Before you register the dpll device/pin, no netlink cmd can happen. So
do whatever you want in the init, no lock needed, register at the and
when you are ready. Same on deinit. After you unregister, no call can
happen, no need to lock anything in deinit.

Either I'm missing something really odd, or your locking scheme is very
wrong.

If this locking here is needed, could you please present an example of a
race it solves here?

>or when deinit takes place, until everything is released.
>
>>
>>>+ * tests.
>>>+ *
>>>+ * Return:
>>>+ * 0 - if lock acquired
>>>+ * negative - lock not acquired or dpll was deinitialized
>>>+ */
>>>+static int ice_dpll_cb_lock(struct ice_pf *pf)
>>>+{
>>>+	int i;
>>>+
>>>+	for (i = 0; i < ICE_DPLL_LOCK_TRIES; i++) {
>>>+		if (mutex_trylock(&pf->dplls.lock))
>>>+			return 0;
>>>+		usleep_range(100, 150);
>>>+		if (!test_bit(ICE_FLAG_DPLL, pf->flags))
>>
>>How exactly could this happen? I don't think it can. Drop it.
>>
>
>Asynchronously called deinit, and kworker tries to update at the same time.

Could you provide an example please?
Because I see 2 possible paths:
1) dpll op call
2) periodic work

But when ICE_FLAG_DPLL is cleared in ice_dpll_deinit(), no dpll op call
can happen anymore and no periodic work is scheduled.

Is this another "just in case" situation? If yes, please remove.
If no, please provide an example of a race this solves.

>
>>
>>>+			return -EFAULT;
>>>+	}
>>>+
>>>+	return -EBUSY;
>>>+}

[...]

>>>+static void ice_dpll_periodic_work(struct kthread_work *work)
>>>+{
>>>+	struct ice_dplls *d = container_of(work, struct ice_dplls,
>>>work.work);
>>>+	struct ice_pf *pf = container_of(d, struct ice_pf, dplls);
>>>+	struct ice_dpll *de = &pf->dplls.eec;
>>>+	struct ice_dpll *dp = &pf->dplls.pps;
>>>+	int ret = 0;
>>>+
>>>+	if (!test_bit(ICE_FLAG_DPLL, pf->flags))
>>
>>How exactly could this happen? I don't think it can. Drop it.
>>
>
>I will change a bit here, ice_dpll_cb_lock returns an error
>if that flag was already down, so will use it instead.

Yeah, I believe that this simply cannot happen as when the bit is
cleared, no work is scheduled anymore. See above.

>
>>
>>>+		return;
>>>+	ret = ice_dpll_cb_lock(pf);
>>>+	if (ret) {
>>>+		d->lock_err_num++;
>>

[...]

>
>>>+		mutex_lock(&pf->dplls.lock);
>>
>>Related to my question in ice_dpll_init(), why do you need to lock the
>>mutex
>>here?
>
>Because before deinit takes place on driver unload the user can still ask
>for the info from the dpll subsystem or kworker can try to alter the status.

When you unregister dpll and stop the kworker, you can't see anything
like that. No need to take this lock.

>
>>
>>
>>>+		ice_dpll_deinit_pins(pf, cgu);
>>>+		ice_dpll_deinit_info(pf);
>>>+		ice_dpll_deinit_dpll(pf, &pf->dplls.pps, cgu);
>>>+		ice_dpll_deinit_dpll(pf, &pf->dplls.eec, cgu);
>>
>>Please reorder to match error path in ice_dpll_init()
>>
>
>Fixed.
>
>>>+		if (cgu)
>>
>>In ice_dpll_init() you call this "cgu_present". Please be consistent in
>>naming.
>>
>
>Fixed.
>
>>
>>>+			ice_dpll_deinit_worker(pf);
>>>+		clear_bit(ICE_FLAG_DPLL, pf->flags);
>>>+		mutex_unlock(&pf->dplls.lock);
>>>+		mutex_destroy(&pf->dplls.lock);
>>>+	}
>>>+}
>>>+
>>>+/**
>>>+ * ice_dpll_init_info_direct_pins - initializes direct pins info
>>>+ * @pf: board private structure
>>>+ * @pin_type: type of pins being initialized
>>>+ *
>>>+ * Init information for directly connected pins, cache them in pf's pins
>>>+ * structures.
>>>+ *
>>>+ * Context: Function initializes and holds pf->dplls.lock mutex.
>>>+ * Return:
>>>+ * * 0 - success
>>>+ * * negative - init failure reason
>>>+ */
>>>+static int
>>>+ice_dpll_init_info_direct_pins(struct ice_pf *pf,
>>>+			       enum ice_dpll_pin_type pin_type)
>>>+{
>>>+	struct ice_dpll *de = &pf->dplls.eec, *dp = &pf->dplls.pps;
>>>+	int num_pins, i, ret = -EINVAL;
>>>+	struct ice_hw *hw = &pf->hw;
>>>+	struct ice_dpll_pin *pins;
>>>+	u8 freq_supp_num;
>>>+	bool input;
>>>+
>>>+	switch (pin_type) {
>>>+	case ICE_DPLL_PIN_TYPE_INPUT:
>>>+		pins = pf->dplls.inputs;
>>>+		num_pins = pf->dplls.num_inputs;
>>>+		input = true;
>>>+		break;
>>>+	case ICE_DPLL_PIN_TYPE_OUTPUT:
>>>+		pins = pf->dplls.outputs;
>>>+		num_pins = pf->dplls.num_outputs;
>>>+		input = false;
>>>+		break;
>>>+	default:
>>>+		return ret;
>>>+	}
>>>+
>>>+	for (i = 0; i < num_pins; i++) {
>>>+		pins[i].idx = i;
>>>+		pins[i].prop.board_label = ice_cgu_get_pin_name(hw, i, input);
>>>+		pins[i].prop.type = ice_cgu_get_pin_type(hw, i, input);
>>>+		if (input) {
>>>+			ret = ice_aq_get_cgu_ref_prio(hw, de->dpll_idx, i,
>>>+						      &de->input_prio[i]);
>>>+			if (ret)
>>>+				return ret;
>>>+			ret = ice_aq_get_cgu_ref_prio(hw, dp->dpll_idx, i,
>>>+						      &dp->input_prio[i]);
>>>+			if (ret)
>>>+				return ret;
>>>+			pins[i].prop.capabilities |=
>>>+				DPLL_PIN_CAPS_PRIORITY_CAN_CHANGE;
>>>+		}
>>>+		pins[i].prop.capabilities |= DPLL_PIN_CAPS_STATE_CAN_CHANGE;
>>>+		ret = ice_dpll_pin_state_update(pf, &pins[i], pin_type);
>>>+		if (ret)
>>>+			return ret;
>>>+		pins[i].prop.freq_supported =
>>>+			ice_cgu_get_pin_freq_supp(hw, i, input, &freq_supp_num);
>>>+		pins[i].prop.freq_supported_num = freq_supp_num;
>>>+		pins[i].pf = pf;
>>>+	}
>>>+
>>>+	return ret;
>>>+}
>>>+
>>>+/**
>>>+ * ice_dpll_init_rclk_pin - initializes rclk pin information
>>>+ * @pf: board private structure
>>>+ * @pin_type: type of pins being initialized
>>>+ *
>>>+ * Init information for rclk pin, cache them in pf->dplls.rclk.
>>>+ *
>>>+ * Return:
>>>+ * * 0 - success
>>>+ * * negative - init failure reason
>>>+ */
>>>+static int ice_dpll_init_rclk_pin(struct ice_pf *pf)
>>>+{
>>>+	struct ice_dpll_pin *pin = &pf->dplls.rclk;
>>>+	struct device *dev = ice_pf_to_dev(pf);
>>>+
>>>+	pin->prop.board_label = dev_name(dev);
>>
>>What??? Must be some sort of joke, correct?
>>"board_label" should be an actual writing on a board. For syncE, I don't
>>think it makes sense to fill any label. The connection to the netdev
>>should be enough. That is what I do in mlx5.
>>
>>Please drop this.
>>
>
>No, we want a label, as this is recovered clock, will change it to

Okay, so it is recovered clock, so what? I want a lot of things, that
does not make them meaningful.

>package_label but the name will stay for now, this is much more meaningful
>then i.e. "phy0" or "RCLK".

No, dev_name() here is total non-sense!
The label should contain the actual label as for example a writing on a
front panel, board, etc.

Why exactly do you need this? Why a link from netdev to this dpll pin is
not enough for you.

Please describe exactly what you need the label for. Usecases, examples,
etc.

Jakub, if you read this, this is very nice example of a misuse that even
very precisely defined netlink attribute cannot prevent :/

>
>>
>>
>>>+	pin->prop.type = DPLL_PIN_TYPE_SYNCE_ETH_PORT;
>>>+	pin->prop.capabilities |= DPLL_PIN_CAPS_STATE_CAN_CHANGE;
>>>+	pin->pf = pf;
>>>+

[...]

>>>+int ice_dpll_init(struct ice_pf *pf)
>>>+{
>>>+	bool cgu_present = ice_is_feature_supported(pf, ICE_F_CGU);
>>>+	struct ice_dplls *d = &pf->dplls;
>>>+	int err = 0;
>>>+
>>>+	mutex_init(&d->lock);
>>>+	mutex_lock(&d->lock);
>>
>>Seeing pattern like this always triggers questions.
>>Why exactly do you need to lock the mutex here?
>>
>
>Could do it few lines below before the dplls are inited,
>but this would make error path confusing.

See above, you don't need to do this if you register dpll and start the
periodic work only after everything is ready. I believe that you
actually have it like this now.

>
>>
>>>+	err = ice_dpll_init_info(pf, cgu_present);
>>>+	if (err)
>>>+		goto err_exit;
>>>+	err = ice_dpll_init_dpll(pf, &pf->dplls.eec, cgu_present,
>>>+				 DPLL_TYPE_EEC);
>>>+	if (err)
>>>+		goto deinit_info;
>>>+	err = ice_dpll_init_dpll(pf, &pf->dplls.pps, cgu_present,
>>>+				 DPLL_TYPE_PPS);
>>>+	if (err)
>>>+		goto deinit_eec;
>>>+	err = ice_dpll_init_pins(pf, cgu_present);
>>>+	if (err)
>>>+		goto deinit_pps;
>>>+	set_bit(ICE_FLAG_DPLL, pf->flags);
>>>+	if (cgu_present) {
>>>+		err = ice_dpll_init_worker(pf);
>>>+		if (err)
>>>+			goto deinit_pins;
>>>+	}
>>>+	mutex_unlock(&d->lock);
>>>+	dev_info(ice_pf_to_dev(pf), "DPLLs init successful\n");
>>
>>What is this good for? Please avoid polluting dmesg and drop this.
>>
>
>Sure, removed.
>
>>
>>>+
>>>+	return err;
>>>+
>>>+deinit_pins:
>>>+	ice_dpll_deinit_pins(pf, cgu_present);
>>>+deinit_pps:
>>>+	ice_dpll_deinit_dpll(pf, &pf->dplls.pps, cgu_present);
>>>+deinit_eec:
>>>+	ice_dpll_deinit_dpll(pf, &pf->dplls.eec, cgu_present);
>>>+deinit_info:
>>>+	ice_dpll_deinit_info(pf);
>>>+err_exit:
>>>+	clear_bit(ICE_FLAG_DPLL, pf->flags);
>>>+	mutex_unlock(&d->lock);
>>>+	mutex_destroy(&d->lock);
>>>+	dev_warn(ice_pf_to_dev(pf), "DPLLs init failure err:\n");
>>
>>You are missing the err. But why do you need the message?
>>
>
>To give a clue that something went wrong on dpll init.

Yeah, you ignore the err in the caller. That makes sense.
Don't forget to add the "err" :)

>
>>
>>>+
>>>+	return err;

[...]