Re: [RFC] drm/tegra: Add a flag to mark that there is only one display pll

Dmitry Osipenko <digetx@xxxxxxxxx> · Mon, 3 Sep 2018 15:10:49 +0300

On 9/3/18 2:28 PM, Peter Geis wrote:
> 
> 
> On 9/1/2018 9:54 PM, Dmitry Osipenko wrote:
>> On 9/2/18 3:27 AM, r yang wrote:
>>> On Fri, Aug 31, 2018 at 11:44:11PM +0300, Dmitry Osipenko wrote:
>>>> On 8/31/18 10:31 PM, r yang wrote:
>>>>> On Wed, Aug 29, 2018 at 08:52:37PM +0300, Dmitry Osipenko wrote:
>>>>>> On 25.08.2018 04:11, r yang wrote:
>>>>>>> On Fri, Aug 24, 2018 at 05:40:11AM +0300, Dmitry Osipenko wrote:
>>>>>>>> On Friday 24 August 2018 03:53:22 ryang wrote:
>>>>>>>>> There is a workaround in which the tegra rgb driver initializes
>>>>>>>>> the tegra dc pclk to 0 so that it will skip setting the parent clk rate.
>>>>>>>>>
>>>>>>>>> The relevant commits:
>>>>>>>>> 3cebae6737b100323baca21de6bce6647249e778
>>>>>>>>> 76d59ed049197bdaaa24c0e061a105af6ce74457
>>>>>>>>>
>>>>>>>>> A more recent commit sets the rate of the dc clk itself:
>>>>>>>>> 39e08affecf0998be1b01f4752016e33fa98eb9a
>>>>>>>>>
>>>>>>>>> This doesn't make sense because it always sets the dc clk to 0.
>>>>>>>>> Is this intended behavior or does it just happen to be working for
>>>>>>>>> the current tegra 2 boards in the mainline kernel?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> For context I am running the kernel on a tegra 2 based
>>>>>>>>> Galaxy Tab 10.1. The display panel driver is out of tree.
>>>>>>>>> This panel has very low clock rate tolerances so it must be
>>>>>>>>> driven at a rate very close to the required specification (68.75Mhz).
>>>>>>>>> pll_p is not adequate to drive this panel so pll_d must be used.
>>>>>>>>>
>>>>>>>>> Another issue is the current workaround is always forcing the disp1
>>>>>>>>> clk to zero. The Galaxy Tab 10.1 locks up completely when calling
>>>>>>>>> clk_set_rate with a clk rate of 0 whether the parent is pll_p or
>>>>>>>>> pll_d.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> This patch adds a flag single_display_pll to mark that the device has
>>>>>>>>> only one display pll. This replaces the the pclk = 0 workaround.
>>>>>>>>>
>>>>>>>>> There is a comment in rgb.c about using the shift clock divider for
>>>>>>>>> tegra 2 but the divider is not set due to the has_nvdisplay flag.
>>>>>>>>> This patch also sets the shift clock divider based on the
>>>>>>>>> single_display_pll flag.
>>>>>>>>>
>>>>>>>>> A change I'm uncertain about. In tegra_dc_commit_state()
>>>>>>>>> the dc clock is now being set to the display panel rate rather than zero.
>>>>>>>>>
>>>>>>>>> I don't have any other tegra devices to test with. So I don't know
>>>>>>>>> if this breaks other devices. Previously the code was trying to set the
>>>>>>>>> clock rate to zero anyways.
>>>>>>>>>
>>>>>>>>> Signed-off-by: ryang <decatf@xxxxxxxxx>
>>>>>>>>> ---
>>>>>>>>>   drivers/gpu/drm/tegra/dc.c  | 9 +++++++--
>>>>>>>>>   drivers/gpu/drm/tegra/dc.h  | 1 +
>>>>>>>>>   drivers/gpu/drm/tegra/rgb.c | 1 -
>>>>>>>>>   3 files changed, 8 insertions(+), 3 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
>>>>>>>>> index 965088afcfad..03f1ad630254 100644
>>>>>>>>> --- a/drivers/gpu/drm/tegra/dc.c
>>>>>>>>> +++ b/drivers/gpu/drm/tegra/dc.c
>>>>>>>>> @@ -1664,7 +1664,7 @@ static void tegra_dc_commit_state(struct tegra_dc
>>>>>>>>> *dc, * which is shared with other peripherals. Changing the clock rate *
>>>>>>>>> should therefore be avoided.
>>>>>>>>>        */
>>>>>>>>> -    if (state->pclk > 0) {
>>>>>>>>> +    if (!dc->soc->single_display_pll) {
>>>>>>>>
>>>>>>>> We don't want to change pll_p rate here. It will be better to check whether
>>>>>>>> parent clock for disp is pll_d in tegra_rgb_encoder_atomic_check() and
>>>>>>>> if it
>>>>>>>> is the parent, then allow to propagate pclk into commit_state. We may
>>>>>>>> request
>>>>>>>> the pll_d using clk_get_sys(), then get the parent clock of disp by
>>>>>>>> clk_get_parent(disp_clk) and finally compare the parent with pll_d using
>>>>>>>> clk_is_math(parent, pll_d).
>>>>>>>
>>>>>>> Understood.
>>>>>>>
>>>>>>>>
>>>>>>>> Certainly parent clock selection could be more advanced, ideally the best
>>>>>>>> parent clock should be selected in the atomic check and applied to disp clk
>>>>>>>> on the commit by changing the disp's parent. So pll_d could be always
>>>>>>>> selected if only one display controller is active at a time, otherwise the
>>>>>>>> atomic check should resolve the parent clocks based on the required
>>>>>>>> rates. It
>>>>>>>> also could be permitable to adjust the pll_c rate.
>>>>>>>
>>>>>>> Right, the TRM describes quite a wide range of clock selection which
>>>>>>> currently
>>>>>>> aren't all possible to use on the kernel. The Galaxy Tab 10.1 stock
>>>>>>> kernel uses
>>>>>>> pll_c solely for disp1. Every clock normally on pll_c such as
>>>>>>> gr3d/gr2d/vde/etc
>>>>>>> are on pll_m. I suspect it was designed this way because the TRM says
>>>>>>> pll_d has
>>>>>>> a very power usage and that it should be avoided if possible.
>>>>>>>
>>>>>>> In absense of more advanced clock selection. It was my intention with this
>>>>>>> patch to try to at least have pll_p and pll_d working right for the internal
>>>>>>> display.
>>>>>>>
>>>>>>
>>>>>> Could you show the clock tree from downstream kernel? I'm curious what it
>>>>>> looks
>>>>>> like, that should be in "/sys/kernel/debug/clock/clock_tree".
>>>>>>
>>>>>>
>>>>>> The patch below seems to work reasonably good, both HDMI and panel either are
>>>>>> working simultaneously or only panel works when both displays are running off
>>>>>> the pll_d. Could you give it a try?
>>>>>>
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
>>>>>> index bdc418bcd898..c91f1203418d 100644
>>>>>> --- a/drivers/gpu/drm/tegra/dc.c
>>>>>> +++ b/drivers/gpu/drm/tegra/dc.c
>>>>>> @@ -26,6 +26,33 @@
>>>>>>   #include <drm/drm_atomic_helper.h>
>>>>>>   #include <drm/drm_plane_helper.h>
>>>>>>
>>>>>> +unsigned int tegra_dc_div61(unsigned long  parent, unsigned long pixel)
>>>>>> +{
>>>>>> +    unsigned int rem, fract = 0;
>>>>>> +    unsigned int div61;
>>>>>> +    u64 integer;
>>>>>> +
>>>>>> +    if (parent <= pixel)
>>>>>> +        return 0;
>>>>>> +
>>>>>> +    integer = parent;
>>>>>> +    rem = do_div(integer, pixel);
>>>>>> +
>>>>>> +    if (rem > pixel / 4) {
>>>>>> +        if (rem > pixel * 2 / 3)
>>>>>> +            integer += 1;
>>>>>> +        else
>>>>>> +            fract = 1;
>>>>>> +    }
>>>>>> +
>>>>>> +    div61 = (integer << 1) | fract;
>>>>>> +
>>>>>> +    if (div61 > 255)
>>>>>> +        return 255;
>>>>>> +
>>>>>> +    return div61;
>>>>>> +}
>>>>>> +
>>>>>>   static void tegra_dc_stats_reset(struct tegra_dc_stats *stats)
>>>>>>   {
>>>>>>       stats->frames = 0;
>>>>>> diff --git a/drivers/gpu/drm/tegra/dc.h b/drivers/gpu/drm/tegra/dc.h
>>>>>> index 382d38b1524b..dc3f681ddf1f 100644
>>>>>> --- a/drivers/gpu/drm/tegra/dc.h
>>>>>> +++ b/drivers/gpu/drm/tegra/dc.h
>>>>>> @@ -174,6 +174,7 @@ int tegra_dc_state_setup_clock(struct tegra_dc *dc,
>>>>>>                      struct drm_crtc_state *crtc_state,
>>>>>>                      struct clk *clk, unsigned long pclk,
>>>>>>                      unsigned int div);
>>>>>> +unsigned int tegra_dc_div61(unsigned long  parent, unsigned long pixel);
>>>>>>
>>>>>>   /* from rgb.c */
>>>>>>   int tegra_dc_rgb_probe(struct tegra_dc *dc);
>>>>>> diff --git a/drivers/gpu/drm/tegra/hdmi.c b/drivers/gpu/drm/tegra/hdmi.c
>>>>>> index 0082468f703c..3c19af0eafe7 100644
>>>>>> --- a/drivers/gpu/drm/tegra/hdmi.c
>>>>>> +++ b/drivers/gpu/drm/tegra/hdmi.c
>>>>>> @@ -1450,10 +1450,13 @@ tegra_hdmi_encoder_atomic_check(struct drm_encoder
>>>>>> *encoder,
>>>>>>       struct tegra_dc *dc = to_tegra_dc(conn_state->crtc);
>>>>>>       unsigned long pclk = crtc_state->mode.clock * 1000;
>>>>>>       struct tegra_hdmi *hdmi = to_hdmi(output);
>>>>>> +    unsigned long rate;
>>>>>>       int err;
>>>>>>
>>>>>> +    rate = clk_round_rate(hdmi->clk_parent, pclk);
>>>>>> +
>>>>>>       err = tegra_dc_state_setup_clock(dc, crtc_state, hdmi->clk_parent,
>>>>>> -                     pclk, 0);
>>>>>> +                     rate, tegra_dc_div61(rate, pclk));
>>>>>>       if (err < 0) {
>>>>>>           dev_err(output->dev, "failed to setup CRTC state: %d\n", err);
>>>>>>           return err;
>>>>>> diff --git a/drivers/gpu/drm/tegra/rgb.c b/drivers/gpu/drm/tegra/rgb.c
>>>>>> index 28a78d3120bc..3ec3e13ce47f 100644
>>>>>> --- a/drivers/gpu/drm/tegra/rgb.c
>>>>>> +++ b/drivers/gpu/drm/tegra/rgb.c
>>>>>> @@ -19,6 +19,7 @@ struct tegra_rgb {
>>>>>>       struct tegra_output output;
>>>>>>       struct tegra_dc *dc;
>>>>>>
>>>>>> +    struct clk *pll_d_out0;
>>>>>>       struct clk *clk_parent;
>>>>>>       struct clk *clk;
>>>>>>   };
>>>>>> @@ -130,6 +131,8 @@ static void tegra_rgb_encoder_disable(struct drm_encoder
>>>>>> *encoder)
>>>>>>
>>>>>>       if (output->panel)
>>>>>>           drm_panel_unprepare(output->panel);
>>>>>> +
>>>>>> +    clk_rate_exclusive_put(rgb->clk);
>>>>>>   }
>>>>>>
>>>>>>   static void tegra_rgb_encoder_enable(struct drm_encoder *encoder)
>>>>>> @@ -165,6 +168,8 @@ static void tegra_rgb_encoder_enable(struct drm_encoder
>>>>>> *encoder)
>>>>>>
>>>>>>       if (output->panel)
>>>>>>           drm_panel_enable(output->panel);
>>>>>> +
>>>>>> +    clk_rate_exclusive_get(rgb->clk);
>>>>>>   }
>>>>>>
>>>>>>   static int
>>>>>> @@ -174,11 +179,17 @@ tegra_rgb_encoder_atomic_check(struct drm_encoder
>>>>>> *encoder,
>>>>>>   {
>>>>>>       struct tegra_output *output = encoder_to_output(encoder);
>>>>>>       struct tegra_dc *dc = to_tegra_dc(conn_state->crtc);
>>>>>> -    unsigned long pclk = crtc_state->mode.clock * 1000;
>>>>>>       struct tegra_rgb *rgb = to_rgb(output);
>>>>>> -    unsigned int div;
>>>>>> +    unsigned long pclk = crtc_state->mode.clock * 1000;
>>>>>> +    unsigned long rate;
>>>>>> +    bool pll_d_parent;
>>>>>>       int err;
>>>>>>
>>>>>> +    if (clk_is_match(rgb->clk_parent, rgb->pll_d_out0))
>>>>>> +        pll_d_parent = true;
>>>>>> +    else
>>>>>> +        pll_d_parent = false;
>>>>>> +
>>>>>>       /*
>>>>>>        * We may not want to change the frequency of the parent clock, since
>>>>>>        * it may be a parent for other peripherals. This is due to the fact
>>>>>> @@ -194,12 +205,17 @@ tegra_rgb_encoder_atomic_check(struct drm_encoder
>>>>>> *encoder,
>>>>>>        * The best we can do at this point is to use the shift clock divider
>>>>>>        * and hope that the desired frequency can be matched (or at least
>>>>>>        * matched sufficiently close that the panel will still work).
>>>>>> +     *
>>>>>> +     * Make excuse for pll_d_out0 if it is explicitly set as a parent
>>>>>> +     * for display panel.
>>>>>>        */
>>>>>> -    div = ((clk_get_rate(rgb->clk) * 2) / pclk) - 2;
>>>>>> -    pclk = 0;
>>>>>> +    if (pll_d_parent)
>>>>>> +        rate = clk_round_rate(rgb->clk_parent, pclk * 4);
>>>>>> +    else
>>>>>> +        rate = clk_get_rate(rgb->clk);
>>>>>>
>>>>>>       err = tegra_dc_state_setup_clock(dc, crtc_state, rgb->clk_parent,
>>>>>> -                     pclk, div);
>>>>>> +                     rate, tegra_dc_div61(rate, pclk));
>>>>>>       if (err < 0) {
>>>>>>           dev_err(output->dev, "failed to setup CRTC state: %d\n", err);
>>>>>>           return err;
>>>>>> @@ -254,6 +270,12 @@ int tegra_dc_rgb_probe(struct tegra_dc *dc)
>>>>>>           return err;
>>>>>>       }
>>>>>>
>>>>>> +    rgb->pll_d_out0 = clk_get_sys(NULL, "pll_d_out0");
>>>>>> +    if (IS_ERR(rgb->pll_d_out0)) {
>>>>>> +        dev_err(dc->dev, "failed to get pll_d_out0\n");
>>>>>> +        return PTR_ERR(rgb->pll_d_out0);
>>>>>> +    }
>>>>>> +
>>>>>>       dc->rgb = &rgb->output;
>>>>>>
>>>>>>       return 0;
>>>>>> @@ -265,6 +287,7 @@ int tegra_dc_rgb_remove(struct tegra_dc *dc)
>>>>>>           return 0;
>>>>>>
>>>>>>       tegra_output_remove(dc->rgb);
>>>>>> +    clk_put(to_rgb(dc->rgb)->pll_d_out0);
>>>>>>       dc->rgb = NULL;
>>>>>>
>>>>>>       return 0;
>>>>>
>>>>> I've tested the patch and it works. The disp1 will run off pll_d without
>>>>> any issues.
>>>>>
>>>>> Here is the clock tree from the downstream kernel.
>>>>
>>>> Okay, thank you. Now I think that it's better not to touch the display
>>>> driver at
>>>> all, but change the PLLC clock to 586 MHz in the clock driver. There is no
>>>> appreciable difference in 600 vs 586 MHz, so that should be good enough.
>>>>
>>>> It would be ideal if clock rates could be specified via device tree, but that's
>>>> not possible today. So just change the PLLC rate globally in the clocks
>>>> init_table. Maybe later we'll manage to implement something that will allow to
>>>> set rates via device tree.
>>>>
>>>
>>> There are two variants of the display panel chip for this tablet.
>>> They run at different clock rates:
>>>
>>>   - 68.9Mhz divided down from 586Mhz
>>>   - 72Mhhz divided down from 570Mhz
>>>
>>> With this in consideration I don't think it makes sense to change the
>>> initial pll_c rate.
>>
>> The 72MHz variant will work perfectly with the PLLP as a parent. Have you tried
>> to use timings and rate of the 72MHz variant?
>>
>>> Secondly, the issue with the workaround in the display driver still
>>> exists. The display driver is setting the display clock rate to zero
>>> in atomic check and applies the disp clock on the commit.
>>> This results in the device locking up.
>>>
>>> It appears to me that there isn't a way to avoid the issue with this
>>> workaround.
>>>
>>> I thought this lock up was due to the display panel chip in this
>>> particular tablet. I have discovered that the device will lock up if
>>> I set pll_d or pll_c clock rate to zero. And that this happens even
>>> if disp1 is not running off the pll.
>>
>> Please show the clock tree of the upstream kernel from
>> "/sys/kernel/debug/clk/clk_summary".
>>
>>> There is no lock up when trying to set pll_p rate to zero. I suspect
>>> this might be due to it being a fixed rate clock.
>>> Is this expected behavior? Can you reproduce this lock up?
>>
>> Yes, PLLP is defined as fixed clock in the clock driver.
>>
>> It is a bug in the DC driver. The disp clocks are registered with the
>> CLK_SET_RATE_PARENT flag, hence disp clock rate changes propagate to the parent.
>> Please feel free to send a patch, just remove the clk_set_rate(dc->clk,
>> state->pclk) from tegra_dc_commit_state().
>>
>>> Is it not possible to go the route of your first proposed patch to check
>>> if the parent of display clock is pll_p?
>>
>> It is possible. If you'll decide to choose that route, then please drop the
>> tegra_dc_div61() stuff as it is actually div71 and is not needed at all, I was
>> just trying something using it.
>>
>> I think both variants are acceptable, though let's wait for the comment from
>> Thierry, as the last word will be after him.
>>
>>>> Note that changing PLLC rate will makes sense only if you're going to upstream
>>>> the device tree for your device. Please prepare and submit the device tree and
>>>> clock patches. I think Thierry and Peter won't oppose to the variant with
>>>> changing rate of PLLC, for now that should be the best option.
>>>>
>>>
>>> I would like to eventually upstream the device tree. There are only two things
>>> at the moment that are blocking this device from working on upstream kernel.
>>> One is this display clock initialization issue. The other is the
>>> non-standard partition table which required a patch from the Anrdoid kernels.
>>> For now I am using CONFIG_CMDLINE_PARTITION to specify the partition
>>> layout.
>>
>> The partition layout shouldn't be a real stopper as seems there is external MMC.
>>
>> IIRC, the custom partition is the ms-dos partition with a custom offsets, isn't
>> it? At least some of Tegra devices also have a spare GPT partition that is
>> specified by bootloader via gpt_sector argument of the kernels commandline.
>>
> 
> Indeed, both the Ouya and Moto Xoom use the gpt_sector method, however this does
> not work without a patch.
> Currently this is the only actual blocker to supporting these two devices.
> Does anyone know why the force gpt sector patch was never submitted to mainline?

Probably because nobody cared to do that yet.

It isn't really a blocker since you could utilize other storage devices, like
external MMC; USB drive or even NFS over usb-net, you could also use the
CONFIG_CMDLINE_PARTITION.