[RFC PATCH v2 09/20] objtool: arm64: Implement command to invoke the decoder

Tue Jun 7 11:13:54 PDT 2022

On 6/1/22 17:45, Madhavan T. Venkataraman wrote:
> 
> 
> On 5/30/22 02:51, Peter Zijlstra wrote:
>> On Sun, May 29, 2022 at 09:49:44AM -0500, Madhavan T. Venkataraman wrote:
>>>
>>>
>>> On 5/24/22 09:09, Mark Brown wrote:
>>>> On Mon, May 23, 2022 at 07:16:26PM -0500, madvenka at linux.microsoft.com wrote:
>>>>> From: "Madhavan T. Venkataraman" <madvenka at linux.microsoft.com>
>>>>>
>>>>> Implement a built-in command called cmd_fpv() that can be invoked as
>>>>> follows:
>>>>>
>>>>> 	objtool fpv generate file.o
>>>>>
>>>>> The built-in command invokes decode_instructions() to walk each function
>>>>> and decode the instructions of the function.
>>>>
>>>> In commit b51277eb9775ce91 ("objtool: Ditch subcommands") Josh removed
>>>> subcommands so this interface is going to need a rethink.
>>>
>>> Thanks for mentioning this. I will sync my patchset to the latest and send out version 3.
>>
>> Before you do; why are you duplicating lots of validate_branch() ? Why
>> can't you use the regular code to generate ORC data?
>>
>> I'm really not happy about all this.
> 
> Hi Peter,
> 
> I am preparing a detailed response to this explaining why I have not used validate_branch().
> The short answer is that no validation is required for my approach. But I will send my detailed
> response shortly.
> 
> Thanks.
> 
> Madhavan

Sorry for the delay in responding to your comment. I had to make changes
to address it and complete testing.

I will address your comment in version 3. There are two parts to your comment:

- Why is validate_branch() not being used in Dynamic FP Validation?

	I will use it in version 3. See below.

- Why is the current code not being used to compute ORC?

	There are some differences in the CFI code between X86 and ARM64.
	So, I have defined handle_insn_ops() separately for the two in v3.

What I am doing in v3
=====================

I have discarded my own function (walk_code()) to walk instructions. Instead,
I have used validate_branch() per your comment.

However, my approach requires no validation. More on this below.

So, what I have done is to move all of the validation checks and actions into
their own functions. I define the functions separately for Static Stack
Validation (in check.c) and Dynamic Frame Pointer Validation (in fpv.c). The
two files are mutually exclusive.

The functions in fpv.c contain their own checks and actions.

	validate_insn_initial()		Initial checks.
	validate_insn_cfi()		CFI-related checks.
	validate_insn_alt()		Alternative instruction checks.
	validate_insn_return()		Return instruction checks.
	validate_insn_call()		Call/Call dynamic instruction checks.
	validate_insn_jump()		Jump instruction checks.
	validate_insn_jump_dynamic()	Jump Dynamic instruction checks.
	validate_insn_rest()		Checks for miscellaneous instructions.
	validate_ibt_insn()		IBT instruction checks.
	handle_insn_ops()		Update CFI from stack ops generated by
					the decoder.

validate_branch() looks like this now:

int validate_branch(struct objtool_file *file, struct symbol *func,
		    struct instruction *insn, struct insn_state state)
{
	struct instruction *next_insn, *prev_insn = NULL;
	struct section *sec;
	u8 visited;
	int ret;

	sec = insn->sec;

	while (1) {
		next_insn = next_insn_to_validate(file, insn);

		if (validate_insn_initial(file, func, insn, &ret))
			return ret;

		if (validate_insn_cfi(prev_insn, insn, &state, &visited, &ret))
			return ret;

		if (state.noinstr)
			state.instr += insn->instr;

		if (validate_insn_alt(file, func, insn, &state, &ret))
			return ret;

		if (handle_insn_ops(insn, next_insn, &state))
			return 1;

		switch (insn->type) {

		case INSN_RETURN:
			validate_insn_return(func, insn,
					     next_insn, &state, &ret);
			return ret;

		case INSN_CALL:
		case INSN_CALL_DYNAMIC:
			if (validate_insn_call(file, func, insn, &state, &ret))
				return ret;
			break;

		case INSN_JUMP_CONDITIONAL:
		case INSN_JUMP_UNCONDITIONAL:
			if (validate_insn_jump(file, func, insn, &state, &ret))
				return ret;
			break;

		case INSN_JUMP_DYNAMIC:
		case INSN_JUMP_DYNAMIC_CONDITIONAL:
			if (validate_insn_jump_dynamic(file, insn, next_insn,
						       &state, &ret)) {
				return ret;
			}
			break;

		default:
			if (validate_insn_rest(func, insn, next_insn,
					       &state, &ret)) {
				return ret;
			}
			break;
		}

		if (ibt)
			validate_ibt_insn(file, insn);

		if (insn->dead_end)
			return 0;

		if (!next_insn) {
			if (state.cfi.cfa.base == CFI_UNDEFINED)
				return 0;
			WARN("%s: unexpected end of section", sec->name);
			return 1;
		}

		prev_insn = insn;
		insn = next_insn;
	}

	return 0;
}
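
To illustrate the calling convention used by the helpers above (a helper
returns non-zero when the walk must stop, and passes the return code back
through a pointer), here is a minimal standalone sketch. It is not code from
the series. The toy_* names and types are hypothetical, and the "already
visited" check is only a stand-in for the real checks.

```c
#include <stdbool.h>

/* Hypothetical, simplified instruction; not objtool's struct instruction. */
enum toy_insn_type { TOY_INSN_OTHER, TOY_INSN_RETURN };

struct toy_insn {
	enum toy_insn_type type;
	bool visited;
};

/*
 * Helper in the style of the validate_insn_*() functions: returns true
 * when the caller must stop walking and return *ret.
 */
bool toy_check_visited(struct toy_insn *insn, int *ret)
{
	if (insn->visited) {
		*ret = 0;	/* already walked this path; nothing to do */
		return true;
	}
	insn->visited = true;
	return false;
}

/* Walk loop in the style of validate_branch(). */
int toy_walk(struct toy_insn *insns, int n)
{
	int ret;

	for (int i = 0; i < n; i++) {
		if (toy_check_visited(&insns[i], &ret))
			return ret;

		if (insns[i].type == TOY_INSN_RETURN)
			return 0;
	}

	return 1;	/* ran off the end without a return instruction */
}
```

With this shape, the loop body stays the same and only the helper bodies
differ between the two sets of functions.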

Why no validation?
==================

There are two approaches for reliable stacktrace.

1. Static Stack Validation (the current approach).

   Analyze the code statically and perform checks for ABI compliance and valid
   stack operations. If any warnings or errors are encountered, "fix" the
   kernel and/or the toolchain so the generated code conforms to Objtool's
   expectations.

2. Dynamic Frame Pointer Validation.

   Don't perform any validation of kernel code. Simply compute the SP and FP
   offsets at each instruction address based on the actual code. During unwind,
   compute the expected frame pointer for each frame from these offsets and
   check the actual frame pointer against it. If an FP cannot be computed or
   the computed FP does not match the actual FP, consider the frame unreliable
   for unwind.

   Since the unwinder can clearly tell whether a frame is reliable or not,
   reliable stacktrace can be provided.

I am doing (2) in my patch series.
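
As a rough illustration of the FP match in (2), here is a minimal standalone
sketch. It is not the unwinder code from the series. The struct layout, the
names, and the formula (CFA = SP + sp_offset, expected FP = CFA + fp_offset)
are assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-instruction record; not the ORC layout in the patches. */
struct fp_entry {
	bool	defined;	/* false if no offsets could be computed */
	int32_t	sp_offset;	/* assumed: CFA = SP + sp_offset */
	int32_t	fp_offset;	/* assumed: expected FP = CFA + fp_offset */
};

/* Return true if the frame can be considered reliable for unwind. */
bool frame_is_reliable(const struct fp_entry *e, uint64_t sp, uint64_t fp)
{
	uint64_t cfa, computed_fp;

	/* No entry, or an undefined one: an FP cannot be computed. */
	if (!e || !e->defined)
		return false;

	cfa = sp + (int64_t)e->sp_offset;
	computed_fp = cfa + (int64_t)e->fp_offset;

	/* The frame is reliable only if computed and actual FP agree. */
	return computed_fp == fp;
}
```

For example, with a 32-byte frame where the FP equals the SP after the
prolog, an entry of { true, 32, -32 } matches when SP and FP hold the same
value and fails for any other FP.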

Different cases
===============

C Functions
-----------

I find that the compiler generates a proper FP prolog and epilog for C functions.
The only exceptions I found are functions that have multiple code paths sharing
some instructions with differing CFIs. See CFI mismatch below. This mismatch
happens only for a very small percentage of the functions.

	Buggy code generated by compiler
	--------------------------------

	Even if the compiler sometimes generates code that does not follow the
	ABI rules, this is not a problem: the unwinder does an FP match and can
	tell whether a given frame is reliable for unwind or not.

Assembly Functions
------------------

There are two cases:

	SYM_CODE functions
	------------------

	Functions defined using the SYM_CODE_*() macros.

	AFAICT, Objtool does not process these. These are low-level functions
	that don't follow any ABI rules. The ORC entries for these would be
	undefined. So, the unwinder will rightly consider them unreliable
	for unwind.

	SYM_FUNC functions
	------------------

	Functions defined using the SYM_FUNC_*() macros.

	These are supposed to have proper FP prologs and epilogs.

	On ARM64, they currently don't, so for now the unwinder will consider
	them unreliable for unwind.

	That said, I am working on a separate patch series to add the prologs
	and epilogs to these functions (except in cases where functionality
	or performance would be affected). This is not required to support
	reliable stack trace. This is only to reduce potential retries during
	the livepatch process.

Functions without a proper FP prolog/epilog
-------------------------------------------

For leaf functions, the compiler may not generate FP prologs/epilogs for
performance reasons. In Dynamic FP Validation, the unwinder will recognize
these to be unreliable for unwind.

Assembly functions that don't have a proper FP prolog/epilog are treated like
leaf functions.

CFI mismatch
------------

This is based on actual code on ARM64.

Let us say there are two code paths in a function. The two code paths share
some instructions. If the SP and FP offsets are different in the two code paths,
the shared instructions will have a CFI mismatch. But this is not invalid or
buggy code. It is just a case that Objtool cannot handle because only one CFI
is associated with an instruction in Objtool.

In my approach, one CFI will be recorded. The other will be ignored. The
computed FP will match the actual FP in one code path. It will not match
in the other one. The unwinder will consider the former reliable and the
latter unreliable.
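
The effect can be shown with a small standalone sketch. It is only an
illustration, not code from the series. The names are hypothetical, the FP
formula (expected FP = SP + sp_offset + fp_offset) is an assumption, and so is
the choice that the first CFI seen is the one that gets recorded.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical offsets for one instruction. */
struct cfi_offsets {
	int32_t sp_offset;	/* assumed: CFA = SP + sp_offset */
	int32_t fp_offset;	/* assumed: expected FP = CFA + fp_offset */
};

/* Hypothetical slot holding the single CFI kept per instruction. */
struct insn_slot {
	bool recorded;
	struct cfi_offsets cfi;
};

/*
 * Record the offsets for an instruction. The first CFI seen is kept and
 * later ones are ignored. Returns false when an ignored CFI differs from
 * the recorded one, i.e. on a CFI mismatch.
 */
bool record_cfi(struct insn_slot *slot, struct cfi_offsets cfi)
{
	if (!slot->recorded) {
		slot->cfi = cfi;
		slot->recorded = true;
		return true;
	}

	return slot->cfi.sp_offset == cfi.sp_offset &&
	       slot->cfi.fp_offset == cfi.fp_offset;
}

/* FP match against the single recorded CFI. */
bool fp_matches(const struct insn_slot *slot, uint64_t sp, uint64_t fp)
{
	if (!slot->recorded)
		return false;

	return sp + (int64_t)slot->cfi.sp_offset +
		    (int64_t)slot->cfi.fp_offset == fp;
}
```

If path A reaches the shared instruction with { 32, -32 } and path B with
{ 48, -32 }, only A's offsets are stored. A frame from path A (SP equal to FP)
matches. A frame from path B (SP 16 bytes below FP) does not, so it is treated
as unreliable.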

This happens only for a very small number of functions in the entire kernel.

That said, I am investigating the possibility of storing both CFIs in ORC
entries in a manner similar to alternative instructions. If this is feasible,
the unwinder can do an FP match using any of these entries.
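
A match against any of several entries could look roughly like the sketch
below. This is speculative and standalone: how multiple entries would actually
be stored in ORC is still open, and the names and the FP formula (expected
FP = SP + sp_offset + fp_offset) are assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical offsets for one candidate CFI of an instruction. */
struct cfi_offsets {
	int32_t sp_offset;	/* assumed: CFA = SP + sp_offset */
	int32_t fp_offset;	/* assumed: expected FP = CFA + fp_offset */
};

/*
 * Return true if the actual FP matches the FP computed from at least one
 * of the n candidate entries for this instruction address.
 */
bool fp_matches_any(const struct cfi_offsets *cands, size_t n,
		    uint64_t sp, uint64_t fp)
{
	for (size_t i = 0; i < n; i++) {
		uint64_t computed = sp + (int64_t)cands[i].sp_offset +
					 (int64_t)cands[i].fp_offset;

		if (computed == fp)
			return true;
	}

	/* No candidate matched (or there were none): unreliable. */
	return false;
}
```

With candidates { 32, -32 } and { 48, -32 }, a frame whose SP equals its FP
and a frame whose SP is 16 bytes below its FP would both match. Any other
combination would still be reported as unreliable.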

Thanks!

Madhavan