[PATCH 07/27] arc: TCG instruction definitions

Cupertino Miranda Cupertino.Miranda at synopsys.com
Mon Apr 12 15:27:12 BST 2021


Hi Richard,

I totally understand your position with a new scripting language and the 
unclean code produced by the auto generated tools.
In order to ease out the review process, I propose to drop the idea of 
the generated code and cleanup by hand all of the semfunc.c functions.
What is you opinion about this?

Just to clarify my initial position:

I agree that output code from my generation tools are far from optimal 
and way too verbose.
First thing to improve would be to replace the temp_locals when possible.

 From my early experiments, I got the impression that TCG optimizer was 
not that bad and that hand optimized TCG would not be producing 
significantly better x86 code, except when using those temp_locals, 
obviously.
My personal inclination, and initial thought, was that more verbose code 
would be acceptable. Also my perception, since I did not had the 
opportunity to dig into the TCG optimizer, was that TCG optimizer before 
being able to generate host code would need to decompose any TCG 
constructs into simpler forms and only then construct host machine code, 
and for that reason having more compact TCG code would be more of a code 
size optimization rather then a real improvement in final execution result.

Answering also on the duplication from v2 and v3. I understand that 
duplication in general seems sloppy. However, please take into 
consideration that the semfunc.c, mapping and decoder code are generated 
or reused from binutils, did not seem to be so bad to keep them in the 
original form.

Regards,
Cupertino

On 4/8/21 1:20 AM, Richard Henderson wrote:
> On 4/5/21 7:31 AM, cupertinomiranda at gmail.com wrote:
>> +/*
>> + * ADD
>> + *    Variables: @b, @c, @a
>> + *    Functions: getCCFlag, getFFlag, setZFlag, setNFlag, setCFlag, 
>> CarryADD,
>> + *               setVFlag, OverflowADD
>> + * --- code ---
>> + * {
>> + *   cc_flag = getCCFlag ();
>> + *   lb = @b;
>> + *   lc = @c;
>> + *   if((cc_flag == true))
>> + *     {
>> + *       lb = @b;
>> + *       lc = @c;
>> + *       @a = (@b + @c);
>> + *       if((getFFlag () == true))
>> + *         {
>> + *           setZFlag (@a);
>> + *           setNFlag (@a);
>> + *           setCFlag (CarryADD (@a, lb, lc));
>> + *           setVFlag (OverflowADD (@a, lb, lc));
>> + *         };
>> + *     };
>> + * }
>> + */
>> +
>> +int
>> +arc_gen_ADD(DisasCtxt *ctx, TCGv b, TCGv c, TCGv a)
>> +{
>> +    int ret = DISAS_NEXT;
>> +    TCGv temp_3 = tcg_temp_local_new();
>> +    TCGv cc_flag = tcg_temp_local_new();
>> +    TCGv lb = tcg_temp_local_new();
>> +    TCGv lc = tcg_temp_local_new();
>> +    TCGv temp_1 = tcg_temp_local_new();
>> +    TCGv temp_2 = tcg_temp_local_new();
>> +    TCGv temp_5 = tcg_temp_local_new();
>> +    TCGv temp_4 = tcg_temp_local_new();
>> +    TCGv temp_7 = tcg_temp_local_new();
>> +    TCGv temp_6 = tcg_temp_local_new();
>> +    getCCFlag(temp_3);
>> +    tcg_gen_mov_tl(cc_flag, temp_3);
>> +    tcg_gen_mov_tl(lb, b);
>> +    tcg_gen_mov_tl(lc, c);
>> +    TCGLabel *done_1 = gen_new_label();
>> +    tcg_gen_setcond_tl(TCG_COND_EQ, temp_1, cc_flag, arc_true);
>> +    tcg_gen_xori_tl(temp_2, temp_1, 1);
>> +    tcg_gen_andi_tl(temp_2, temp_2, 1);
>> +    tcg_gen_brcond_tl(TCG_COND_EQ, temp_2, arc_true, done_1);
>> +    tcg_gen_mov_tl(lb, b);
>> +    tcg_gen_mov_tl(lc, c);
>> +    tcg_gen_add_tl(a, b, c);
>> +    if ((getFFlag () == true)) {
>> +        setZFlag(a);
>> +        setNFlag(a);
>> +        CarryADD(temp_5, a, lb, lc);
>> +        tcg_gen_mov_tl(temp_4, temp_5);
>> +        setCFlag(temp_4);
>> +        OverflowADD(temp_7, a, lb, lc);
>> +        tcg_gen_mov_tl(temp_6, temp_7);
>> +        setVFlag(temp_6);
>> +    }
>> +    gen_set_label(done_1);
>> +    tcg_temp_free(temp_3);
>> +    tcg_temp_free(cc_flag);
>> +    tcg_temp_free(lb);
>> +    tcg_temp_free(lc);
>> +    tcg_temp_free(temp_1);
>> +    tcg_temp_free(temp_2);
>> +    tcg_temp_free(temp_5);
>> +    tcg_temp_free(temp_4);
>> +    tcg_temp_free(temp_7);
>> +    tcg_temp_free(temp_6);
>> +
>> +    return ret;
>> +}
>
> I must say I'm not really impressed by the results here.
>
> Your input is clearly intended to be fed to an optimizing compiler, 
> which TCG is not.
>
>
>> +/*
>> + * DIV
>> + *    Variables: @src2, @src1, @dest
>> + *    Functions: getCCFlag, divSigned, getFFlag, setZFlag, setNFlag, 
>> setVFlag
>> + * --- code ---
>> + * {
>> + *   cc_flag = getCCFlag ();
>> + *   if((cc_flag == true))
>> + *     {
>> + *       if(((@src2 != 0) && ((@src1 != 2147483648) || (@src2 != 
>> 4294967295))))
>> + *         {
>> + *           @dest = divSigned (@src1, @src2);
>> + *           if((getFFlag () == true))
>> + *             {
>> + *               setZFlag (@dest);
>> + *               setNFlag (@dest);
>> + *               setVFlag (0);
>> + *             };
>> + *         }
>> + *       else
>> + *         {
>> + *         };
>> + *     };
>> + * }
>> + */
>> +
>> +int
>> +arc_gen_DIV(DisasCtxt *ctx, TCGv src2, TCGv src1, TCGv dest)
>> +{
>> +    int ret = DISAS_NEXT;
>> +    TCGv temp_9 = tcg_temp_local_new();
>> +    TCGv cc_flag = tcg_temp_local_new();
>> +    TCGv temp_1 = tcg_temp_local_new();
>> +    TCGv temp_2 = tcg_temp_local_new();
>> +    TCGv temp_3 = tcg_temp_local_new();
>> +    TCGv temp_4 = tcg_temp_local_new();
>> +    TCGv temp_5 = tcg_temp_local_new();
>> +    TCGv temp_6 = tcg_temp_local_new();
>> +    TCGv temp_7 = tcg_temp_local_new();
>> +    TCGv temp_8 = tcg_temp_local_new();
>> +    TCGv temp_10 = tcg_temp_local_new();
>> +    TCGv temp_11 = tcg_temp_local_new();
>> +    getCCFlag(temp_9);
>> +    tcg_gen_mov_tl(cc_flag, temp_9);
>> +    TCGLabel *done_1 = gen_new_label();
>> +    tcg_gen_setcond_tl(TCG_COND_EQ, temp_1, cc_flag, arc_true);
>> +    tcg_gen_xori_tl(temp_2, temp_1, 1);
>> +    tcg_gen_andi_tl(temp_2, temp_2, 1);
>> +    tcg_gen_brcond_tl(TCG_COND_EQ, temp_2, arc_true, done_1);
>> +    TCGLabel *else_2 = gen_new_label();
>> +    TCGLabel *done_2 = gen_new_label();
>> +    tcg_gen_setcondi_tl(TCG_COND_NE, temp_3, src2, 0);
>> +    tcg_gen_setcondi_tl(TCG_COND_NE, temp_4, src1, 2147483648);
>> +    tcg_gen_setcondi_tl(TCG_COND_NE, temp_5, src2, 4294967295);
>> +    tcg_gen_or_tl(temp_6, temp_4, temp_5);
>> +    tcg_gen_and_tl(temp_7, temp_3, temp_6);
>> +    tcg_gen_xori_tl(temp_8, temp_7, 1);
>> +    tcg_gen_andi_tl(temp_8, temp_8, 1);
>> +    tcg_gen_brcond_tl(TCG_COND_EQ, temp_8, arc_true, else_2);
>> +    divSigned(temp_10, src1, src2);
>> +    tcg_gen_mov_tl(dest, temp_10);
>> +    if ((getFFlag () == true)) {
>> +        setZFlag(dest);
>> +        setNFlag(dest);
>> +        tcg_gen_movi_tl(temp_11, 0);
>> +        setVFlag(temp_11);
>> +    }
>> +    tcg_gen_br(done_2);
>> +    gen_set_label(else_2);
>> +    gen_set_label(done_2);
>> +    gen_set_label(done_1);
>
> Nor is your compiler, for that matter, creating branches for empty 
> elses.  The two together produce cringe-worthy results.
>
> I can't help but feeling that the same amount of effort would have 
> produced a legible, maintainable conversion directly to TCG, and 
> without the fantastic amount of duplication you have created with your 
> independent v2 and v3 files.
>
>
> r~




More information about the linux-snps-arc mailing list