Re: [RFC] drm/tegra: Add a flag to mark that there is only one display pll

Dmitry Osipenko <digetx@xxxxxxxxx> · Sun, 2 Sep 2018 04:54:09 +0300

On 9/2/18 3:27 AM, r yang wrote:
> On Fri, Aug 31, 2018 at 11:44:11PM +0300, Dmitry Osipenko wrote:
>> On 8/31/18 10:31 PM, r yang wrote:
>>> On Wed, Aug 29, 2018 at 08:52:37PM +0300, Dmitry Osipenko wrote:
>>>> On 25.08.2018 04:11, r yang wrote:
>>>>> On Fri, Aug 24, 2018 at 05:40:11AM +0300, Dmitry Osipenko wrote:
>>>>>> On Friday 24 August 2018 03:53:22 ryang wrote:
>>>>>>> There is a workaround in which the tegra rgb driver initializes
>>>>>>> the tegra dc pclk to 0 so that it will skip setting the parent clk rate.
>>>>>>>
>>>>>>> The relevant commits:
>>>>>>> 3cebae6737b100323baca21de6bce6647249e778
>>>>>>> 76d59ed049197bdaaa24c0e061a105af6ce74457
>>>>>>>
>>>>>>> A more recent commit sets the rate of the dc clk itself:
>>>>>>> 39e08affecf0998be1b01f4752016e33fa98eb9a
>>>>>>>
>>>>>>> This doesn't make sense because it always sets the dc clk to 0.
>>>>>>> Is this intended behavior or does it just happen to be working for
>>>>>>> the current tegra 2 boards in the mainline kernel?
>>>>>>>
>>>>>>>
>>>>>>> For context I am running the kernel on a tegra 2 based
>>>>>>> Galaxy Tab 10.1. The display panel driver is out of tree.
>>>>>>> This panel has very low clock rate tolerances so it must be
>>>>>>> driven at a rate very close to the required specification (68.75Mhz).
>>>>>>> pll_p is not adequate to drive this panel so pll_d must be used.
>>>>>>>
>>>>>>> Another issue is the current workaround is always forcing the disp1
>>>>>>> clk to zero. The Galaxy Tab 10.1 locks up completely when calling
>>>>>>> clk_set_rate with a clk rate of 0 whether the parent is pll_p or
>>>>>>> pll_d.
>>>>>>>
>>>>>>>
>>>>>>> This patch adds a flag single_display_pll to mark that the device has
>>>>>>> only one display pll. This replaces the the pclk = 0 workaround.
>>>>>>>
>>>>>>> There is a comment in rgb.c about using the shift clock divider for
>>>>>>> tegra 2 but the divider is not set due to the has_nvdisplay flag.
>>>>>>> This patch also sets the shift clock divider based on the
>>>>>>> single_display_pll flag.
>>>>>>>
>>>>>>> A change I'm uncertain about. In tegra_dc_commit_state()
>>>>>>> the dc clock is now being set to the display panel rate rather than zero.
>>>>>>>
>>>>>>> I don't have any other tegra devices to test with. So I don't know
>>>>>>> if this breaks other devices. Previously the code was trying to set the
>>>>>>> clock rate to zero anyways.
>>>>>>>
>>>>>>> Signed-off-by: ryang <decatf@xxxxxxxxx>
>>>>>>> ---
>>>>>>>  drivers/gpu/drm/tegra/dc.c  | 9 +++++++--
>>>>>>>  drivers/gpu/drm/tegra/dc.h  | 1 +
>>>>>>>  drivers/gpu/drm/tegra/rgb.c | 1 -
>>>>>>>  3 files changed, 8 insertions(+), 3 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
>>>>>>> index 965088afcfad..03f1ad630254 100644
>>>>>>> --- a/drivers/gpu/drm/tegra/dc.c
>>>>>>> +++ b/drivers/gpu/drm/tegra/dc.c
>>>>>>> @@ -1664,7 +1664,7 @@ static void tegra_dc_commit_state(struct tegra_dc
>>>>>>> *dc, * which is shared with other peripherals. Changing the clock rate *
>>>>>>> should therefore be avoided.
>>>>>>>  	 */
>>>>>>> -	if (state->pclk > 0) {
>>>>>>> +	if (!dc->soc->single_display_pll) {
>>>>>>
>>>>>> We don't want to change pll_p rate here. It will be better to check whether 
>>>>>> parent clock for disp is pll_d in tegra_rgb_encoder_atomic_check() and if it 
>>>>>> is the parent, then allow to propagate pclk into commit_state. We may request 
>>>>>> the pll_d using clk_get_sys(), then get the parent clock of disp by 
>>>>>> clk_get_parent(disp_clk) and finally compare the parent with pll_d using 
>>>>>> clk_is_math(parent, pll_d).
>>>>>
>>>>> Understood.
>>>>>
>>>>>>
>>>>>> Certainly parent clock selection could be more advanced, ideally the best 
>>>>>> parent clock should be selected in the atomic check and applied to disp clk 
>>>>>> on the commit by changing the disp's parent. So pll_d could be always 
>>>>>> selected if only one display controller is active at a time, otherwise the 
>>>>>> atomic check should resolve the parent clocks based on the required rates. It 
>>>>>> also could be permitable to adjust the pll_c rate.
>>>>>
>>>>> Right, the TRM describes quite a wide range of clock selection which currently
>>>>> aren't all possible to use on the kernel. The Galaxy Tab 10.1 stock kernel uses
>>>>> pll_c solely for disp1. Every clock normally on pll_c such as gr3d/gr2d/vde/etc
>>>>> are on pll_m. I suspect it was designed this way because the TRM says pll_d has
>>>>> a very power usage and that it should be avoided if possible.
>>>>>
>>>>> In absense of more advanced clock selection. It was my intention with this
>>>>> patch to try to at least have pll_p and pll_d working right for the internal
>>>>> display.
>>>>>
>>>>
>>>> Could you show the clock tree from downstream kernel? I'm curious what it looks
>>>> like, that should be in "/sys/kernel/debug/clock/clock_tree".
>>>>
>>>>
>>>> The patch below seems to work reasonably good, both HDMI and panel either are
>>>> working simultaneously or only panel works when both displays are running off
>>>> the pll_d. Could you give it a try?
>>>>
>>>>
>>>> diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
>>>> index bdc418bcd898..c91f1203418d 100644
>>>> --- a/drivers/gpu/drm/tegra/dc.c
>>>> +++ b/drivers/gpu/drm/tegra/dc.c
>>>> @@ -26,6 +26,33 @@
>>>>  #include <drm/drm_atomic_helper.h>
>>>>  #include <drm/drm_plane_helper.h>
>>>>
>>>> +unsigned int tegra_dc_div61(unsigned long  parent, unsigned long pixel)
>>>> +{
>>>> +	unsigned int rem, fract = 0;
>>>> +	unsigned int div61;
>>>> +	u64 integer;
>>>> +
>>>> +	if (parent <= pixel)
>>>> +		return 0;
>>>> +
>>>> +	integer = parent;
>>>> +	rem = do_div(integer, pixel);
>>>> +
>>>> +	if (rem > pixel / 4) {
>>>> +		if (rem > pixel * 2 / 3)
>>>> +			integer += 1;
>>>> +		else
>>>> +			fract = 1;
>>>> +	}
>>>> +
>>>> +	div61 = (integer << 1) | fract;
>>>> +
>>>> +	if (div61 > 255)
>>>> +		return 255;
>>>> +
>>>> +	return div61;
>>>> +}
>>>> +
>>>>  static void tegra_dc_stats_reset(struct tegra_dc_stats *stats)
>>>>  {
>>>>  	stats->frames = 0;
>>>> diff --git a/drivers/gpu/drm/tegra/dc.h b/drivers/gpu/drm/tegra/dc.h
>>>> index 382d38b1524b..dc3f681ddf1f 100644
>>>> --- a/drivers/gpu/drm/tegra/dc.h
>>>> +++ b/drivers/gpu/drm/tegra/dc.h
>>>> @@ -174,6 +174,7 @@ int tegra_dc_state_setup_clock(struct tegra_dc *dc,
>>>>  			       struct drm_crtc_state *crtc_state,
>>>>  			       struct clk *clk, unsigned long pclk,
>>>>  			       unsigned int div);
>>>> +unsigned int tegra_dc_div61(unsigned long  parent, unsigned long pixel);
>>>>
>>>>  /* from rgb.c */
>>>>  int tegra_dc_rgb_probe(struct tegra_dc *dc);
>>>> diff --git a/drivers/gpu/drm/tegra/hdmi.c b/drivers/gpu/drm/tegra/hdmi.c
>>>> index 0082468f703c..3c19af0eafe7 100644
>>>> --- a/drivers/gpu/drm/tegra/hdmi.c
>>>> +++ b/drivers/gpu/drm/tegra/hdmi.c
>>>> @@ -1450,10 +1450,13 @@ tegra_hdmi_encoder_atomic_check(struct drm_encoder *encoder,
>>>>  	struct tegra_dc *dc = to_tegra_dc(conn_state->crtc);
>>>>  	unsigned long pclk = crtc_state->mode.clock * 1000;
>>>>  	struct tegra_hdmi *hdmi = to_hdmi(output);
>>>> +	unsigned long rate;
>>>>  	int err;
>>>>
>>>> +	rate = clk_round_rate(hdmi->clk_parent, pclk);
>>>> +
>>>>  	err = tegra_dc_state_setup_clock(dc, crtc_state, hdmi->clk_parent,
>>>> -					 pclk, 0);
>>>> +					 rate, tegra_dc_div61(rate, pclk));
>>>>  	if (err < 0) {
>>>>  		dev_err(output->dev, "failed to setup CRTC state: %d\n", err);
>>>>  		return err;
>>>> diff --git a/drivers/gpu/drm/tegra/rgb.c b/drivers/gpu/drm/tegra/rgb.c
>>>> index 28a78d3120bc..3ec3e13ce47f 100644
>>>> --- a/drivers/gpu/drm/tegra/rgb.c
>>>> +++ b/drivers/gpu/drm/tegra/rgb.c
>>>> @@ -19,6 +19,7 @@ struct tegra_rgb {
>>>>  	struct tegra_output output;
>>>>  	struct tegra_dc *dc;
>>>>
>>>> +	struct clk *pll_d_out0;
>>>>  	struct clk *clk_parent;
>>>>  	struct clk *clk;
>>>>  };
>>>> @@ -130,6 +131,8 @@ static void tegra_rgb_encoder_disable(struct drm_encoder
>>>> *encoder)
>>>>
>>>>  	if (output->panel)
>>>>  		drm_panel_unprepare(output->panel);
>>>> +
>>>> +	clk_rate_exclusive_put(rgb->clk);
>>>>  }
>>>>
>>>>  static void tegra_rgb_encoder_enable(struct drm_encoder *encoder)
>>>> @@ -165,6 +168,8 @@ static void tegra_rgb_encoder_enable(struct drm_encoder
>>>> *encoder)
>>>>
>>>>  	if (output->panel)
>>>>  		drm_panel_enable(output->panel);
>>>> +
>>>> +	clk_rate_exclusive_get(rgb->clk);
>>>>  }
>>>>
>>>>  static int
>>>> @@ -174,11 +179,17 @@ tegra_rgb_encoder_atomic_check(struct drm_encoder *encoder,
>>>>  {
>>>>  	struct tegra_output *output = encoder_to_output(encoder);
>>>>  	struct tegra_dc *dc = to_tegra_dc(conn_state->crtc);
>>>> -	unsigned long pclk = crtc_state->mode.clock * 1000;
>>>>  	struct tegra_rgb *rgb = to_rgb(output);
>>>> -	unsigned int div;
>>>> +	unsigned long pclk = crtc_state->mode.clock * 1000;
>>>> +	unsigned long rate;
>>>> +	bool pll_d_parent;
>>>>  	int err;
>>>>
>>>> +	if (clk_is_match(rgb->clk_parent, rgb->pll_d_out0))
>>>> +		pll_d_parent = true;
>>>> +	else
>>>> +		pll_d_parent = false;
>>>> +
>>>>  	/*
>>>>  	 * We may not want to change the frequency of the parent clock, since
>>>>  	 * it may be a parent for other peripherals. This is due to the fact
>>>> @@ -194,12 +205,17 @@ tegra_rgb_encoder_atomic_check(struct drm_encoder *encoder,
>>>>  	 * The best we can do at this point is to use the shift clock divider
>>>>  	 * and hope that the desired frequency can be matched (or at least
>>>>  	 * matched sufficiently close that the panel will still work).
>>>> +	 *
>>>> +	 * Make excuse for pll_d_out0 if it is explicitly set as a parent
>>>> +	 * for display panel.
>>>>  	 */
>>>> -	div = ((clk_get_rate(rgb->clk) * 2) / pclk) - 2;
>>>> -	pclk = 0;
>>>> +	if (pll_d_parent)
>>>> +		rate = clk_round_rate(rgb->clk_parent, pclk * 4);
>>>> +	else
>>>> +		rate = clk_get_rate(rgb->clk);
>>>>
>>>>  	err = tegra_dc_state_setup_clock(dc, crtc_state, rgb->clk_parent,
>>>> -					 pclk, div);
>>>> +					 rate, tegra_dc_div61(rate, pclk));
>>>>  	if (err < 0) {
>>>>  		dev_err(output->dev, "failed to setup CRTC state: %d\n", err);
>>>>  		return err;
>>>> @@ -254,6 +270,12 @@ int tegra_dc_rgb_probe(struct tegra_dc *dc)
>>>>  		return err;
>>>>  	}
>>>>
>>>> +	rgb->pll_d_out0 = clk_get_sys(NULL, "pll_d_out0");
>>>> +	if (IS_ERR(rgb->pll_d_out0)) {
>>>> +		dev_err(dc->dev, "failed to get pll_d_out0\n");
>>>> +		return PTR_ERR(rgb->pll_d_out0);
>>>> +	}
>>>> +
>>>>  	dc->rgb = &rgb->output;
>>>>
>>>>  	return 0;
>>>> @@ -265,6 +287,7 @@ int tegra_dc_rgb_remove(struct tegra_dc *dc)
>>>>  		return 0;
>>>>
>>>>  	tegra_output_remove(dc->rgb);
>>>> +	clk_put(to_rgb(dc->rgb)->pll_d_out0);
>>>>  	dc->rgb = NULL;
>>>>
>>>>  	return 0;
>>>
>>> I've tested the patch and it works. The disp1 will run off pll_d without
>>> any issues.
>>>
>>> Here is the clock tree from the downstream kernel.
>>
>> Okay, thank you. Now I think that it's better not to touch the display driver at
>> all, but change the PLLC clock to 586 MHz in the clock driver. There is no
>> appreciable difference in 600 vs 586 MHz, so that should be good enough.
>>
>> It would be ideal if clock rates could be specified via device tree, but that's
>> not possible today. So just change the PLLC rate globally in the clocks
>> init_table. Maybe later we'll manage to implement something that will allow to
>> set rates via device tree.
>>
> 
> There are two variants of the display panel chip for this tablet.
> They run at different clock rates:
> 
>  - 68.9Mhz divided down from 586Mhz
>  - 72Mhhz divided down from 570Mhz
> 
> With this in consideration I don't think it makes sense to change the
> initial pll_c rate.

The 72MHz variant will work perfectly with the PLLP as a parent. Have you tried
to use timings and rate of the 72MHz variant?

> Secondly, the issue with the workaround in the display driver still
> exists. The display driver is setting the display clock rate to zero
> in atomic check and applies the disp clock on the commit.
> This results in the device locking up.
> 
> It appears to me that there isn't a way to avoid the issue with this
> workaround.
> 
> I thought this lock up was due to the display panel chip in this
> particular tablet. I have discovered that the device will lock up if
> I set pll_d or pll_c clock rate to zero. And that this happens even
> if disp1 is not running off the pll.

Please show the clock tree of the upstream kernel from
"/sys/kernel/debug/clk/clk_summary".

> There is no lock up when trying to set pll_p rate to zero. I suspect
> this might be due to it being a fixed rate clock.
> Is this expected behavior? Can you reproduce this lock up?

Yes, PLLP is defined as fixed clock in the clock driver.

It is a bug in the DC driver. The disp clocks are registered with the
CLK_SET_RATE_PARENT flag, hence disp clock rate changes propagate to the parent.
Please feel free to send a patch, just remove the clk_set_rate(dc->clk,
state->pclk) from tegra_dc_commit_state().

> Is it not possible to go the route of your first proposed patch to check
> if the parent of display clock is pll_p?

It is possible. If you'll decide to choose that route, then please drop the
tegra_dc_div61() stuff as it is actually div71 and is not needed at all, I was
just trying something using it.

I think both variants are acceptable, though let's wait for the comment from
Thierry, as the last word will be after him.

>> Note that changing PLLC rate will makes sense only if you're going to upstream
>> the device tree for your device. Please prepare and submit the device tree and
>> clock patches. I think Thierry and Peter won't oppose to the variant with
>> changing rate of PLLC, for now that should be the best option.
>>
> 
> I would like to eventually upstream the device tree. There are only two things
> at the moment that are blocking this device from working on upstream kernel.
> One is this display clock initialization issue. The other is the
> non-standard partition table which required a patch from the Anrdoid kernels.
> For now I am using CONFIG_CMDLINE_PARTITION to specify the partition
> layout.

The partition layout shouldn't be a real stopper as seems there is external MMC.

IIRC, the custom partition is the ms-dos partition with a custom offsets, isn't
it? At least some of Tegra devices also have a spare GPT partition that is
specified by bootloader via gpt_sector argument of the kernels commandline.