On 05/03/2021 14:45, Doug Anderson wrote:
Hi,
On Fri, Mar 5, 2021 at 2:28 AM Srinivas Kandagatla
<srinivas.kandagatla@xxxxxxxxxx> wrote:
On 27/02/2021 00:26, Douglas Anderson wrote:
When running the latest kernel on an sc7180 with KASAN I got this
splat:
BUG: KASAN: slab-out-of-bounds in a6xx_gpu_init+0x618/0x644
Read of size 4 at addr ffffff8088f36100 by task kworker/7:1/58
CPU: 7 PID: 58 Comm: kworker/7:1 Not tainted 5.11.0+ #3
Hardware name: Google Lazor (rev1 - 2) with LTE (DT)
Workqueue: events deferred_probe_work_func
Call trace:
dump_backtrace+0x0/0x3a8
show_stack+0x24/0x30
dump_stack+0x174/0x1e0
print_address_description+0x70/0x2e4
kasan_report+0x178/0x1bc
__asan_report_load4_noabort+0x44/0x50
a6xx_gpu_init+0x618/0x644
adreno_bind+0x26c/0x438
This is because the speed bin is defined like this:
gpu_speed_bin: gpu_speed_bin@1d2 {
reg = <0x1d2 0x2>;
bits = <5 8>;
};
As you can see the "length" is 2 bytes. That means that the nvmem
subsystem allocates only 2 bytes. The GPU code, however, was casting
the pointer allocated by nvmem to a (u32 *) and dereferencing. That's
not so good.
Let's fix this to just use the nvmem_cell_read_u16() accessor function
which simplifies things and also gets rid of the splat.
Let's also put an explicit conversion from little endian in place just
to make things clear. The nvmem subsystem today is assuming little
endian and this makes it clear. Specifically, the way the above sc7180
cell is interpreted:
NVMEM:
+--------+--------+--------+--------+--------+
| ...... | 0x1d3 | 0x1d2 | ...... | 0x000 |
+--------+--------+--------+--------+--------+
^ ^
msb lsb
You can see that the least significant data is at the lower address
which is little endian.
NOTE: someone who is truly paying attention might wonder about me
picking the "u16" version of this accessor instead of the "u8" (since
the value is 8 bits big) or the u32 version (just for fun). At the
moment you need to pick the accessor that exactly matches the length
the cell was specified as in the device tree. Hopefully future
patches to the nvmem subsystem will fix this.
Fixes: fe7952c629da ("drm/msm: Add speed-bin support to a618 gpu")
Signed-off-by: Douglas Anderson <dianders@xxxxxxxxxxxx>
---
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 31 +++++++--------------------
1 file changed, 8 insertions(+), 23 deletions(-)
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index ba8e9d3cf0fe..0e2024defd79 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1350,35 +1350,20 @@ static int a6xx_set_supported_hw(struct device *dev, struct a6xx_gpu *a6xx_gpu,
u32 revn)
{
struct opp_table *opp_table;
- struct nvmem_cell *cell;
u32 supp_hw = UINT_MAX;
- void *buf;
-
- cell = nvmem_cell_get(dev, "speed_bin");
- /*
- * -ENOENT means that the platform doesn't support speedbin which is
- * fine
- */
- if (PTR_ERR(cell) == -ENOENT)
- return 0;
- else if (IS_ERR(cell)) {
- DRM_DEV_ERROR(dev,
- "failed to read speed-bin. Some OPPs may not be supported by hardware");
- goto done;
- }
+ u16 speedbin;
+ int ret;
- buf = nvmem_cell_read(cell, NULL);
I think the issue here is not passing len pointer which should return
how many bytes the cell is!
Then from there we can decide to do le16_to_cpu or le32_to_cpu or not!
This will also future proof the code to handle speed_bins of different
sizes!
I think what you're saying is that you want to copy/paste this code
(or something similar) everywhere that accesses an nvmem cell. Is
that correct? ...or maybe you can suggest some smaller / shorter code
that I'm missing?
It depends what the consumer is doing! If it is already aware of what
size of data its expecting then you can use nvmem_cell_read_u8/16/32/64
variants, however it wants to do bit more with the data then
nvmem_cell_read() should give more flexibility!
---
{
struct nvmem_cell *cell;
ssize_t len;
char *ret;
int i;
*data = 0;
cell = nvmem_cell_get(dev, cname);
if (IS_ERR(cell)) {
if (PTR_ERR(cell) != -EPROBE_DEFER)
dev_err(dev, "undefined cell %s\n", cname);
return PTR_ERR(cell);
}
ret = nvmem_cell_read(cell, &len);
nvmem_cell_put(cell);
if (IS_ERR(ret)) {
dev_err(dev, "can't read cell %s\n", cname);
return PTR_ERR(ret);
}
for (i = 0; i < len; i++)
*data |= ret[i] << (8 * i);
kfree(ret);
dev_dbg(dev, "efuse read(%s) = %x, bytes %zd\n", cname, *data, len);
return 0;
}
---
The above code is from cpr_read_efuse() in "cpr.c". I mentioned in
the cover letter that I thought about doing this and decided it wasn't
a great idea. There should be _some_ function in the nvmem core that
says: there's an integer that's 32-bits or less stored in nvmem.
There is no such helper function other than using the above snippet to do.
Please read it for me. If you don't think we can use one of the
existing functions for that, would you be opposed to me creating a new
one?
I have no issue in adding a new helper function in nvmem to allow such
thing!
---
In any case, while we discuss what we should do long term, I still
hope that Rob can merge this patch since it fixes the bug.
Yes, I agree this definitely fixes the mentioned bug!
-Doug
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel