Re: [AArch64][Spec2017] Question about mlow-precision-div optimization.

Richard Sandiford <richard.sandiford@xxxxxxx> · Mon, 09 Mar 2020 12:28:51 +0000

Hi,

bule <bule1@xxxxxxxxxx> writes:
> Thanks for the reply.
>
> I am an engineer from Huawei Technologies Co.,Ltd. And my company has signed
> the copyright assignment.
>
> My huawei email is: bule1@xxxxxxxxxx

OK, great.

> diff -Nurp gcc-10.0/gcc/config/aarch64/aarch64.c gcc-10.0_opti/gcc/config/aarch64/aarch64.c
> --- gcc-10.0/gcc/config/aarch64/aarch64.c	2020-03-08 18:00:34.581798076 +0800
> +++ gcc-10.0_opti/gcc/config/aarch64/aarch64.c	2020-03-08 17:36:15.400515481 +0800
> @@ -12854,10 +12854,10 @@ aarch64_emit_approx_div (rtx quo, rtx nu
>    /* Iterate over the series twice for SF and thrice for DF.  */
>    int iterations = (GET_MODE_INNER (mode) == DFmode) ? 3 : 2;
>  
> -  /* Optionally iterate over the series once less for faster performance,
> -     while sacrificing the accuracy.  */
> +  /* Optionally iterate over the series less for faster performance,
> +     while sacrificing the accuracy. The default is 2 for DF and 1 for SF.  */
>    if (flag_mlow_precision_div)
> -    iterations--;
> +    iterations = aarch64_double_recp_precision : aarch64_float_recp_precision;

This is missing the "GET_MODE_INNER (mode) == DFmode ?" part of the
condition, so the ":" has no matching "?" and the line won't compile as
written.  Adding that will take it over the 80-char limit, so it should be
formatted as:

    iterations = (GET_MODE_INNER (mode) == DFmode
		  ? aarch64_double_recp_precision
		  : aarch64_float_recp_precision);
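
To make the effect of the iteration count concrete, here is a rough
standalone C model of the selection above.  It is only a sketch, not the
RTL that aarch64_emit_approx_div emits: the helper names
(approx_div_iterations, refine_reciprocal), the two bool parameters and
the plain-C variables standing in for the tunables are all made up for
illustration, and I'm assuming the "series" is the usual Newton-Raphson
refinement x = x * (2 - d * x) of a reciprocal estimate.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the tunables named in the patch.  The values
   mirror the defaults stated in the patch comment (2 for DF, 1 for SF).  */
static int double_recp_precision = 2;
static int float_recp_precision = 1;

/* Model of the iteration-count selection.  IS_DOUBLE plays the role of
   "GET_MODE_INNER (mode) == DFmode" and LOW_PRECISION the role of
   flag_mlow_precision_div.  */
static int
approx_div_iterations (bool is_double, bool low_precision)
{
  /* Twice for SF and thrice for DF by default.  */
  int iterations = is_double ? 3 : 2;

  if (low_precision)
    iterations = (is_double
                  ? double_recp_precision
                  : float_recp_precision);

  return iterations;
}

/* Refine the estimate X0 of 1/D with ITERATIONS Newton-Raphson steps
   (assumed form of the series).  Each step roughly squares the relative
   error of the estimate.  */
static double
refine_reciprocal (double d, double x0, int iterations)
{
  assert (iterations >= 0);

  double x = x0;
  for (int i = 0; i < iterations; i++)
    x = x * (2.0 - d * x);

  return x;
}
```

For example, starting from the estimate 0.3 for 1/3 (relative error 0.1),
one step leaves a relative error of about 1e-2 and three steps about 1e-8,
which is the accuracy/speed trade-off the option is controlling.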

Looks good otherwise.

Could you try doing a bootstrap with that change and seeing if it
still works for your use case?  If so, could you post the final patch?

Thanks,
Richard