media: mediatek: vcodec: fix AV1 decode fail for 36bit iova

Wed Jun 28 09:07:13 PDT 2023

Hi,

On Wednesday, 28 June 2023 at 13:41 +0800, Xiaoyong Lu wrote:
> Decoder hardware will access incorrect iova address when tile buffer is
> 36bit, leading to iommu fault when hardware access dram data.
> 
> Fixes: 2f5d0aef37c6 ("media: mediatek: vcodec: support stateless AV1 decoder")
> Signed-off-by: Xiaoyong Lu<xiaoyong.lu at mediatek.com>
> ---
> - Test ok: mt8195 32bit and mt8188 36bit iova.
> ---
>  .../platform/mediatek/vcodec/vdec/vdec_av1_req_lat_if.c    | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/media/platform/mediatek/vcodec/vdec/vdec_av1_req_lat_if.c b/drivers/media/platform/mediatek/vcodec/vdec/vdec_av1_req_lat_if.c
> index 404a1a23fd40..420222c8a56d 100644
> --- a/drivers/media/platform/mediatek/vcodec/vdec/vdec_av1_req_lat_if.c
> +++ b/drivers/media/platform/mediatek/vcodec/vdec/vdec_av1_req_lat_if.c
> @@ -1658,9 +1658,9 @@ static void vdec_av1_slice_setup_tile_buffer(struct vdec_av1_slice_instance *ins
>  	u32 allow_update_cdf = 0;
>  	u32 sb_boundary_x_m1 = 0, sb_boundary_y_m1 = 0;
>  	int tile_info_base;
> -	u32 tile_buf_pa;
> +	u64 tile_buf_pa;
>  	u32 *tile_info_buf = instance->tile.va;
> -	u32 pa = (u32)bs->dma_addr;
> +	u64 pa = (u64)bs->dma_addr;
>  
>  	if (uh->disable_cdf_update == 0)
>  		allow_update_cdf = 1;
> @@ -1673,7 +1673,8 @@ static void vdec_av1_slice_setup_tile_buffer(struct vdec_av1_slice_instance *ins
>  		tile_info_buf[tile_info_base + 0] = (tile_group->tile_size[tile_num] << 3);
>  		tile_buf_pa = pa + tile_group->tile_start_offset[tile_num];
>  
> -		tile_info_buf[tile_info_base + 1] = (tile_buf_pa >> 4) << 4;
> +		tile_info_buf[tile_info_base + 1] = (unsigned int)(tile_buf_pa >> 4) << 4 +
> +			((unsigned int)(tile_buf_pa >> 32) & 0xf);

I'm not sure I understand how this works. The original code was a roundabout
way of clearing the 4 least significant bits. Something like this would avoid
the cast and make that clearer:

		tile_info_buf[tile_info_base + 1] = tile_buf_pa & 0xFFFFFFFFFFFFFFF0ull;
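
For reference, shifting right and then left by 4 gives the same result as
ANDing with a mask whose lowest hex digit is 0 (...F0); a mask ending in 00
would clear 8 bits instead. A minimal sketch of that equivalence, using
hypothetical helper names that are not part of the driver:

```c
#include <stdint.h>

/* Hypothetical helper: clear the 4 low bits by shifting, as the original code did. */
static inline uint32_t clear_low4_by_shift(uint32_t addr)
{
	return (addr >> 4) << 4;
}

/* Hypothetical helper: clear the 4 low bits with a mask (16-byte alignment). */
static inline uint32_t clear_low4_by_mask(uint32_t addr)
{
	return addr & 0xFFFFFFF0u;
}

/* For contrast: a mask ending in 00 clears 8 bits, i.e. 256-byte alignment. */
static inline uint32_t clear_low8_by_mask(uint32_t addr)
{
	return addr & 0xFFFFFF00u;
}
```

With addr = 0x12345678, the first two helpers return 0x12345670 and the third
returns 0x12345600.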

But in the updated code, with a 36-bit address, you store the 4 upper address
bits (35:32) in the lower part, which was originally cleared. Can you confirm
this is exactly what you wanted? And if so, add a comment? It could also be
written as follows (this is just me considering it more readable; I also
prefer | (or) rather than +, and I dislike casting):

		tile_info_buf[tile_info_base + 1] = (tile_buf_pa & 0xFFFFFFFFFFFFFFF0ull) |
			(tile_buf_pa & 0x0000000F00000000ull) >> 32;
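
To make the question concrete, here is a small sketch of the packing I think
is intended: address bits 31:4 stay in place and address bits 35:32 land in
bits 3:0 of the 32-bit word. That layout is my assumption, not something
confirmed for this hardware, and the helper names are hypothetical. The second
helper spells out how C groups the expression as posted, since + binds tighter
than <<, so "a << 4 + b" is parsed as "a << (4 + b)":

```c
#include <stdint.h>

/*
 * Hypothetical helper, assuming the hardware expects address bits 31:4 in
 * bits 31:4 of the word and address bits 35:32 in bits 3:0.
 */
static inline uint32_t pack_iova36(uint64_t iova)
{
	uint32_t low = (uint32_t)(iova & 0xFFFFFFF0ull);    /* bits 31:4 */
	uint32_t high = (uint32_t)((iova >> 32) & 0xFull);  /* bits 35:32 */

	return low | high;
}

/*
 * The expression from the patch with the grouping the C grammar applies:
 * the shift count becomes 4 plus the high nibble.
 */
static inline uint32_t pack_as_parsed(uint64_t iova)
{
	return (uint32_t)(iova >> 4) << (4 + ((uint32_t)(iova >> 32) & 0xf));
}
```

For an address below 4 GiB both helpers agree, because the high nibble is 0.
For 0x312345678 the first returns 0x12345673 while the second shifts by 7
and returns something else.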

>  		tile_info_buf[tile_info_base + 2] = (tile_buf_pa % 16) << 3;

Is this the same as the following?

		tile_info_buf[tile_info_base + 2] = (tile_buf_pa & 0xFull) << 3;
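
A quick way to check which mask matches "% 16": for an unsigned value, the
remainder modulo 16 is exactly the 4 low bits, so the matching mask is 0xF. A
wider mask such as 0xFF keeps 8 bits and gives a different result. The helper
names below are hypothetical and only serve this comparison:

```c
#include <stdint.h>

/* Byte offset within a 16-byte unit, converted to bits, as in the patch. */
static inline uint32_t offset_bits_mod16(uint64_t pa)
{
	return (uint32_t)(pa % 16) << 3;
}

/* Same computation with a 4-bit mask. */
static inline uint32_t offset_bits_mask_f(uint64_t pa)
{
	return (uint32_t)(pa & 0xFull) << 3;
}

/* An 8-bit mask keeps bits 7:4 as well, so it is not equivalent. */
static inline uint32_t offset_bits_mask_ff(uint64_t pa)
{
	return (uint32_t)(pa & 0xFFull) << 3;
}
```

With pa = 0x12345678, the first two helpers return 64 (8 bytes times 8 bits),
while the third returns 960.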

> 
>  
>  		sb_boundary_x_m1 =