[PATCH 05/15] arc: TCG instruction generator and hand-definitions
Richard Henderson
richard.henderson at linaro.org
Fri Jan 15 15:17:53 EST 2021
On 1/15/21 7:11 AM, Cupertino Miranda wrote:
>> On 11/11/20 10:17 AM, cupertinomiranda at gmail.com wrote:
>>> +/*
>>> + * The macro to add boiler plate code for conditional execution.
>>> + * It will add tcg_gen codes only if there is a condition to
>>> + * be checked (ctx->insn.cc != 0). This macro assumes that there
>>> + * is a "ctx" variable of type "DisasCtxt *" in context. Remember
>>> + * to pair it with CC_EPILOGUE macro.
>>> + */
>>> +#define CC_PROLOGUE \
>>> + TCGv cc = tcg_temp_local_new(); \
>>> + TCGLabel *done = gen_new_label(); \
>>> + do { \
>>> + if (ctx->insn.cc) { \
>>> + arc_gen_verifyCCFlag(ctx, cc); \
>>> + tcg_gen_brcondi_tl(TCG_COND_NE, cc, 1, done); \
>>> + } \
>>> + } while (0)
>>> +
>>> +/*
>>> + * The finishing counter part of CC_PROLUGE. This is supposed
>>> + * to be put at the end of the function using it.
>>> + */
>>> +#define CC_EPILOGUE \
>>> + if (ctx->insn.cc) { \
>>> + gen_set_label(done); \
>>> + } \
>>> + tcg_temp_free(cc)
>>
>> Why would this need to be boiler-plate? Why would this not simply exist in
>> exactly one location?
>>
>> You don't need a tcg_temp_local_new, because the cc value is not used past the
>> branch. You should free the temp immediately after the branch.
>>
>
> I wonder if what you thought was to move those macros to functions and
> call it when CC_PROLOGUE and CC_EPILOGUE are used.
> I think the macros were choosen due to the sharing of the 'cc' and
> 'done' variables in both CC_PROLOGUE AND CC_EPILOGUE.
I meant that the checking of ctx->insn.cc could be done at a higher level, so
that this code existed in exactly one place, not scattered between all of the
different instructions.
But if that isn't possible for some reason, you can certainly put "done" into
the DisasContext so that you can have to functions cc_prologue(ctx) and
cc_epilogue(ctx).
>>> +void gen_goto_tb(DisasContext *ctx, int n, TCGv dest)
>>> +{
>>> + tcg_gen_mov_tl(cpu_pc, dest);
>>> + tcg_gen_andi_tl(cpu_pcl, dest, 0xfffffffc);
>>> + if (ctx->base.singlestep_enabled) {
>>> + gen_helper_debug(cpu_env);
>>> + }
>>> + tcg_gen_exit_tb(NULL, 0);
>>
>> Missing else. This is dead code for single-step.
> Goes a little above my knowledge of QEMU internals to be honest.
> Do you have a recommendation what we should be doing here ?
Both of these actions end the TB, so:
if (ctx->base.singlestep_enabled) {
gen_helper_debug(cpu_env);
} else {
tcg_gen_exit_tb(NULL, 0);
}
>> You could put all of these into a const static table.
> What do you mean, can we make the effect of tcg_global_mem_new_i32 as
> constant ?
No, I mean all of the data that is passed to tcg_global_mem_new. See for
instance target/sparc/translate.c, sparc_tcg_init().
>>> +static void init_constants(void)
>>> +{
>>> +#define SEMANTIC_FUNCTION(...)
>>> +#define MAPPING(...)
>>> +#define CONSTANT(NAME, MNEMONIC, OP_NUM, VALUE) \
>>> + add_constant_operand(MAP_##MNEMONIC##_##NAME, OP_NUM, VALUE);
>>> +#include "target/arc/semfunc_mapping.def"
>>> +#include "target/arc/extra_mapping.def"
>>> +#undef MAPPING
>>> +#undef CONSTANT
>>> +#undef SEMANTIC_FUNCTION
>>> +}
>>
>> Ew. Yet another thing that can be done at build time.
> As far as I remember it, there was no way I could generate this table
> using the C pre-processor. Do you suggest to make this table using an
> external tool ?
I assumed that you could just as easily generate a table using the c
preprocessor as this function. I guess I'd like to know more about why you
can't...
>>> + int32_t limm = operand.value;
>>> + if (operand.type & ARC_OPERAND_LIMM) {
>>> + limm = ctx->insn.limm;
>>> + tcg_gen_movi_tl(cpu_limm, limm);
>>> + ret = cpu_r[62];
>>> + } else {
>>> + ret = tcg_const_local_i32(limm);
>>> + }
>>> + }
>>> + }
>>> +
>>> + return ret;
>>
>> Why are you using locals for everything? Is it because you have no proper
>> control over your use of branching?
>
> Initially we though locals the good way to define temporaries. :-(
> What should be the best ? We will need to change a lot of code for this.
TCG is a poor optimizer. If you can at all avoid branches, while implementing
a single instruction, do so. Because this means that you can use
tcg_temp_new_i32 (et al) which are "normal" temps, and are not spilled to the
stack at the end of the BB.
This does not necessarily apply to conditional execution (cc_prologue, et al)
because you can think of those as "outside" of the instruction, merely wrapping
them. The actual live temps will be "inside" and not live past the done label.
>>> +/* Return from exception. */
>>> +static void gen_rtie(DisasContext *ctx)
>>> +{
>>> + tcg_gen_movi_tl(cpu_pc, ctx->cpc);
>>> + gen_helper_rtie(cpu_env);
>>> + tcg_gen_mov_tl(cpu_pc, cpu_pcl);
>>> + gen_goto_tb(ctx, 1, cpu_pc);
>>> +}
>>
>> You must return to the main loop here, not goto_tb. You must return to the
>> main loop every time your interrupt mask changes, so that pending interrupts
>> may be accepted.
>>
> "gen_goto_tb" calls in the end "tcg_gen_exit_tb(NULL, 0)", is it not the
> same ?
No. Because gen_goto_tb uses tcg_gen_goto_tb, which ends the TB right there.
Another instance of the "else" bug above.
> We need to investigate this implementation further. A quick change to
> gen_rtie broke linux booting.
> Can you recomend some target that implements the loop exit on rtie as
> you suggest ?
target/riscv/ -- see trans_mret() and exit_tb().
r~
More information about the linux-snps-arc
mailing list