Hi,

On Mon, Mar 9, 2020 at 2:31 AM Maulik Shah <mkshah@xxxxxxxxxxxxxx> wrote:
>
> Add changes to invoke rpmh flush() from within cache_lock when the data in
> cache is dirty.
>
> Introduce two new APIs for this. Clients can use rpmh_start_transaction()
> before any rpmh transaction once done invoke rpmh_end_transaction() which
> internally invokes rpmh_flush() if the caches has become dirty.
>
> Add support to control this with flush_dirty flag.
>
> Signed-off-by: Maulik Shah <mkshah@xxxxxxxxxxxxxx>
> Reviewed-by: Srinivas Rao L <lsrao@xxxxxxxxxxxxxx>
> ---
>  drivers/soc/qcom/rpmh-internal.h |  4 +++
>  drivers/soc/qcom/rpmh-rsc.c      |  6 +++-
>  drivers/soc/qcom/rpmh.c          | 64 ++++++++++++++++++++++++++++++++--------
>  include/soc/qcom/rpmh.h          | 10 +++++++
>  4 files changed, 71 insertions(+), 13 deletions(-)

As mentioned previously but not addressed [3], I believe your series
breaks things if there are zero ACTIVE TCSs and you're using the
immediate-flush solution.  Specifically any attempt to set something's
"active" state will clobber the sleep/wake.  I believe this is hard to
fix, especially if you want rpmh_write_async() to work properly and
need to be robust to the last man going down while rpmh_write_async()
is running but hasn't finished.  My suggestion was to consider it to
be an error at probe time for now.

Actually, though, I'd be super surprised if the "active == 0" case
works anyway.  Aside from subtle problems of not handling -EAGAIN (see
another previous message that you didn't respond to [2]), I think
you'll also get failures because you never enable interrupts in
RSC_DRV_IRQ_ENABLE for anything other than the ACTIVE_TCS.  Thus
you'll never get interrupts saying when your transactions on the
borrowed "wake" TCS finish.

Speaking of previous emails that you didn't respond to, I think you
still have these action items:

* Document that rpmh_write(active) and rpmh_write_async(active) also
  update wake state.
  [1]
* Change is_req_valid() to still return true if (sleep == wake), or
  keep track of "active" and return true if (sleep != wake || wake !=
  active). [1]
* Document that for batch a write to active doesn't update wake. [1]

> diff --git a/drivers/soc/qcom/rpmh-internal.h b/drivers/soc/qcom/rpmh-internal.h
> index 6eec32b..d36be3d 100644
> --- a/drivers/soc/qcom/rpmh-internal.h
> +++ b/drivers/soc/qcom/rpmh-internal.h
> @@ -70,13 +70,17 @@ struct rpmh_request {
>   *
>   * @cache: the list of cached requests
>   * @cache_lock: synchronize access to the cache data
> + * @active_clients: count of rpmh transaction in progress
>   * @dirty: was the cache updated since flush
> + * @flush_dirty: if the dirty cache need immediate flush
>   * @batch_cache: Cache sleep and wake requests sent as batch
>   */
>  struct rpmh_ctrlr {
>         struct list_head cache;
>         spinlock_t cache_lock;
> +       u32 active_clients;
>         bool dirty;
> +       bool flush_dirty;
>         struct list_head batch_cache;
>  };
>
> diff --git a/drivers/soc/qcom/rpmh-rsc.c b/drivers/soc/qcom/rpmh-rsc.c
> index e278fc1..b6391e1 100644
> --- a/drivers/soc/qcom/rpmh-rsc.c
> +++ b/drivers/soc/qcom/rpmh-rsc.c
> @@ -61,6 +61,8 @@
>  #define CMD_STATUS_ISSUED              BIT(8)
>  #define CMD_STATUS_COMPL               BIT(16)
>
> +#define FLUSH_DIRTY                    1
> +
>  static u32 read_tcs_reg(struct rsc_drv *drv, int reg, int tcs_id, int cmd_id)
>  {
>         return readl_relaxed(drv->tcs_base + reg + RSC_DRV_TCS_OFFSET * tcs_id +
> @@ -670,13 +672,15 @@ static int rpmh_rsc_probe(struct platform_device *pdev)
>         INIT_LIST_HEAD(&drv->client.cache);
>         INIT_LIST_HEAD(&drv->client.batch_cache);
>
> +       drv->client.flush_dirty = device_get_match_data(&pdev->dev);
> +
>         dev_set_drvdata(&pdev->dev, drv);
>
>         return devm_of_platform_populate(&pdev->dev);
>  }
>
>  static const struct of_device_id rpmh_drv_match[] = {
> -       { .compatible = "qcom,rpmh-rsc", },
> +       { .compatible = "qcom,rpmh-rsc", .data = (void *)FLUSH_DIRTY },

Ick.  This is just confusing.
IMO better to set 'drv->client.flush_dirty = true' directly in probe
with a comment saying that it could be removed if we had OSI.

...and while you're at it, why not fire off a separate patch (not in
your series) adding the stub to 'include/linux/psci.h'.  Then when we
revisit this in a year it'll be there and it'll be super easy to set
the value properly.

>         { }
>  };
>
> diff --git a/drivers/soc/qcom/rpmh.c b/drivers/soc/qcom/rpmh.c
> index 5bed8f4..9d40209 100644
> --- a/drivers/soc/qcom/rpmh.c
> +++ b/drivers/soc/qcom/rpmh.c
> @@ -297,12 +297,10 @@ static int flush_batch(struct rpmh_ctrlr *ctrlr)
>  {
>         struct batch_cache_req *req;
>         const struct rpmh_request *rpm_msg;
> -       unsigned long flags;
>         int ret = 0;
>         int i;
>
>         /* Send Sleep/Wake requests to the controller, expect no response */
> -       spin_lock_irqsave(&ctrlr->cache_lock, flags);
>         list_for_each_entry(req, &ctrlr->batch_cache, list) {
>                 for (i = 0; i < req->count; i++) {
>                         rpm_msg = req->rpm_msgs + i;
> @@ -312,7 +310,6 @@ static int flush_batch(struct rpmh_ctrlr *ctrlr)
>                                 break;
>                 }
>         }
> -       spin_unlock_irqrestore(&ctrlr->cache_lock, flags);
>
>         return ret;
>  }
> @@ -433,16 +430,63 @@ static int send_single(struct rpmh_ctrlr *ctrlr, enum rpmh_state state,
>  }
>
>  /**
> + * rpmh_start_transaction: Indicates start of rpmh transactions, this
> + * must be ended by invoking rpmh_end_transaction().
> + *
> + * @dev: the device making the request
> + */
> +void rpmh_start_transaction(const struct device *dev)
> +{
> +       struct rpmh_ctrlr *ctrlr = get_rpmh_ctrlr(dev);
> +       unsigned long flags;
> +
> +       if (!ctrlr->flush_dirty)
> +               return;
> +
> +       spin_lock_irqsave(&ctrlr->cache_lock, flags);
> +       ctrlr->active_clients++;

Wouldn't hurt to have something like:

/*
 * Detect likely leak; we shouldn't have 1000
 * people making in-flight changes at the same time.
 */
WARN_ON(ctrlr->active_clients > 1000);

> +       spin_unlock_irqrestore(&ctrlr->cache_lock, flags);
> +}
> +EXPORT_SYMBOL(rpmh_start_transaction);
> +
> +/**
> + * rpmh_end_transaction: Indicates end of rpmh transactions. All dirty data
> + * in cache can be flushed immediately when ctrlr->flush_dirty is set
> + *
> + * @dev: the device making the request
> + *
> + * Return: 0 on success, error number otherwise.
> + */
> +int rpmh_end_transaction(const struct device *dev)
> +{
> +       struct rpmh_ctrlr *ctrlr = get_rpmh_ctrlr(dev);
> +       unsigned long flags;
> +       int ret = 0;
> +
> +       if (!ctrlr->flush_dirty)
> +               return ret;
> +
> +       spin_lock_irqsave(&ctrlr->cache_lock, flags);

WARN_ON(!ctrlr->active_clients);

> +
> +       ctrlr->active_clients--;
> +       if (ctrlr->dirty && !ctrlr->active_clients)
> +               ret = rpmh_flush(ctrlr);

As mentioned previously [2], I don't think it's valid to call
rpmh_flush() with interrupts disabled.  Specifically (as of your
previous patch) rpmh_flush now loops if rpmh_rsc_invalidate() returns
-EAGAIN.  I believe that the caller needs to enable interrupts for a
little bit before trying again.

If the caller doesn't need to enable interrupts for a little bit
before trying again then why was -EAGAIN even returned?
tcs_invalidate() could have just looped itself and all the code would
be much simpler.

> +
> +       spin_unlock_irqrestore(&ctrlr->cache_lock, flags);
> +
> +       return ret;
> +}
> +EXPORT_SYMBOL(rpmh_end_transaction);
> +
> +/**
>   * rpmh_flush: Flushes the buffered active and sleep sets to TCS
>   *
>   * @ctrlr: controller making request to flush cached data
>   *
> - * Return: -EBUSY if the controller is busy, probably waiting on a response
> - * to a RPMH request sent earlier.
> + * Return: 0 on success, error number otherwise.
>   *
> - * This function is always called from the sleep code from the last CPU
> - * that is powering down the entire system. Since no other RPMH API would be
> - * executing at this time, it is safe to run lockless.
> + * This function can either be called from sleep code on the last CPU
> + * (thus no spinlock needed) or with the ctrlr->cache_lock already held.
>   */
>  int rpmh_flush(struct rpmh_ctrlr *ctrlr)
>  {
> @@ -464,10 +508,6 @@ int rpmh_flush(struct rpmh_ctrlr *ctrlr)
>         if (ret)
>                 return ret;
>
> -       /*
> -        * Nobody else should be calling this function other than system PM,
> -        * hence we can run without locks.
> -        */
>         list_for_each_entry(p, &ctrlr->cache, list) {
>                 if (!is_req_valid(p)) {
>                         pr_debug("%s: skipping RPMH req: a:%#x s:%#x w:%#x",
> diff --git a/include/soc/qcom/rpmh.h b/include/soc/qcom/rpmh.h
> index f9ec353..85e1ab2 100644
> --- a/include/soc/qcom/rpmh.h
> +++ b/include/soc/qcom/rpmh.h
> @@ -22,6 +22,10 @@ int rpmh_write_batch(const struct device *dev, enum rpmh_state state,
>
>  int rpmh_invalidate(const struct device *dev);
>
> +void rpmh_start_transaction(const struct device *dev);
> +
> +int rpmh_end_transaction(const struct device *dev);
> +
>  #else
>
>  static inline int rpmh_write(const struct device *dev, enum rpmh_state state,
> @@ -41,6 +45,12 @@ static inline int rpmh_write_batch(const struct device *dev,
>  static inline int rpmh_invalidate(const struct device *dev)
>  { return -ENODEV; }
>
> +void rpmh_start_transaction(const struct device *dev)
> +{ return -ENODEV; }

Unexpected return from void function.

> +
> +int rpmh_end_transaction(const struct device *dev)
> +{ return -ENODEV; }
> +
>  #endif /* CONFIG_QCOM_RPMH */
>
>  #endif /* __SOC_QCOM_RPMH_H__ */

[1] https://lore.kernel.org/r/CAD=FV=VzNnRdDN5uPYskJ6kQHq2bAi2ysEqt0=taagdd_qZb-g@xxxxxxxxxxxxxx
[2] https://lore.kernel.org/r/CAD=FV=UYpO2rSOoF-OdZd3jKfSZGKnpQJPoiE5fzH+u1uafS6g@xxxxxxxxxxxxxx
[3] https://lore.kernel.org/r/CAD=FV=VNaqwiti+UB8fLgjF5r2CD2xeF_p7qHS-_yXqf+ZDrBg@xxxxxxxxxxxxxx

-Doug
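
For illustration only, the two counter sanity checks suggested in the
review (warn when active_clients grows past 1000, warn when an end
arrives with no matching start) can be sketched as a small user-space
model.  This is not kernel code and not the patch author's
implementation: struct model_ctrlr, model_warn() and the model_*()
functions are hypothetical names, locking and interrupts are not
modeled, the "flush" is just a counter, and clearing dirty after a
flush is an assumption.  Unlike a bare WARN_ON(), this sketch also
refuses to decrement on an unbalanced end so the counter cannot wrap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical limit taken from the review's "1000 people" heuristic. */
#define MODEL_MAX_CLIENTS 1000u

/* Hypothetical user-space stand-in for struct rpmh_ctrlr (no locking). */
struct model_ctrlr {
	unsigned int active_clients;	/* transactions currently in progress */
	bool dirty;			/* cache changed since the last flush */
	unsigned int flushes;		/* number of modeled flushes */
	unsigned int warnings;		/* number of sanity checks that fired */
};

/* Stand-in for WARN_ON(): count the event and print a message. */
void model_warn(struct model_ctrlr *c, const char *msg)
{
	c->warnings++;
	fprintf(stderr, "WARN: %s\n", msg);
}

/* Models rpmh_start_transaction(): bump the count, flag a likely leak. */
void model_start_transaction(struct model_ctrlr *c)
{
	assert(c != NULL);
	c->active_clients++;
	if (c->active_clients > MODEL_MAX_CLIENTS)
		model_warn(c, "active_clients > 1000, likely leaked start");
}

/*
 * Models rpmh_end_transaction(): flag an unbalanced end, otherwise drop
 * the count and "flush" when the last client leaves a dirty cache.
 * Returns 0 on success, -1 on an unbalanced end (a choice of this sketch).
 */
int model_end_transaction(struct model_ctrlr *c)
{
	assert(c != NULL);
	if (c->active_clients == 0) {
		model_warn(c, "end_transaction without start_transaction");
		return -1;
	}

	c->active_clients--;
	if (c->dirty && c->active_clients == 0) {
		c->flushes++;
		c->dirty = false;	/* assumption: a flush clears dirty */
	}
	return 0;
}
```

With two overlapping transactions and a dirty cache, the first
model_end_transaction() leaves the flush count untouched and the
second one performs the single flush; a third, unbalanced call trips
the warning instead of wrapping the counter.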