[PATCH v4 3/3] pwm: Add support for Xilinx AXI Timer

Sean Anderson sean.anderson at seco.com
Tue Jun 29 11:01:31 PDT 2021



On 6/29/21 4:31 AM, Uwe Kleine-König wrote:
 > Hello Sean,
 >
 > On Mon, Jun 28, 2021 at 01:41:43PM -0400, Sean Anderson wrote:
 >> On 6/28/21 1:20 PM, Uwe Kleine-König wrote:
 >> > On Mon, Jun 28, 2021 at 12:35:19PM -0400, Sean Anderson wrote:
 >> >> On 6/28/21 12:24 PM, Uwe Kleine-König wrote:
 >> >> > On Mon, Jun 28, 2021 at 11:50:33AM -0400, Sean Anderson wrote:
 >> >> > > On 6/27/21 2:19 PM, Uwe Kleine-König wrote:
 >> >> > > > On Fri, Jun 25, 2021 at 01:46:26PM -0400, Sean Anderson wrote:
 >> >> > > > > So for the moment, why not give an error? This will be legal code both
 >> >> > > > > now and after round_state is implemented.
 >> >> > > >
 >> >> > > > The problem is where to draw the line. To stay with your example: If a
 >> >> > > > request for period = 150 ns comes in, and let X be the biggest period <=
 >> >> > > > 150 ns that the hardware can configure. For which values of X should an
 >> >> > > > error be returned and for which values the setting should be
 >> >> > > > implemented.
 >> >> > > >
 >> >> > > > In my eyes the only sensible thing to implement here is to tell the
 >> >> > > > consumer about X and let it decide if it's good enough. If you have a
 >> >> > > > better idea let me hear about it.
 >> >> > >
 >> >> > > Sure. And I think it's ok to tell the consumer that X is the best we can
 >> >> > > do. But if they go along and request an unconfigurable state anyway, we
 >> >> > > should tell them as much.
 >> >> >
 >> >> > I have the impression you didn't understand where I see the problem. If
 >> >> > you request 150 ns and the controller can only do 149 ns (or 149.6667 ns)
 >> >> > should we refuse? If yes: This is very unusable, e.g. the led-pwm driver
 >> >> > expects that it can configure the duty_cycle in 1/256 steps of the
 >> >> > period, and then maybe only steps 27 and 213 of the 256 possible steps
 >> >> > work. (This example doesn't really match because the led-pwm driver
 >> >> > varies duty_cycle and not period, but I assume the principle is
 >> >> > clear.) If no: Should we accept 151 ns? Isn't that ridiculous?
 >> >>
 >> >> I am fine with this sort of rounding. The part I take issue with is when
 >> >> the consumer requests (e.g.) a 10ns period, but the best we can do is
 >> >> 20ns. Or at the other end if they request a 4s period but the best we
 >> >> can do is 2s. Here, there is no obvious way to round it, so I think we
 >> >> should just say "come back with a reasonable period" and let whoever
 >> >> wrote the device tree pick a better period.
 >> >
 >> > Note that giving ridiculous examples is easy, but this doesn't help to
 >> > actually implement something sensible. Please tell us for your example
 >> > where the driver can only implement 20 ns what is the smallest requested
 >> > period the driver should accept.
 >>
 >> 20ns :)
 >>
 >> In the case of this device, that would result in 0% duty cycle with a
 >> 100MHz input. So the smallest reasonable period is 30ns with a duty
 >> cycle of 20ns.
 >
 > I took the time to understand the hardware a bit better, also to be able
 > to reply to your formulae below. So to recap (and simplify slightly
 > assuming TCSR_UDT = 1):
 >
 >
 >                TLR0 + 2
 >   period     = --------
 >                clkrate
 >
 >                TLR1 + 2
 >   duty_cycle = -------- if TLR1 < TLR0, else 0
 >                clkrate
 >
 >
 > where TLRx has the range [0..0xffffffff] (for some devices the range is
 > smaller). So clkrate seems to be 100 MHz?

On my system, yes.
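
For reference, that relationship boils down to something like the
following (just a sketch with made-up helper names, assuming TCSR_UDT = 1;
this is not code from the patch):

	#include <linux/math64.h>
	#include <linux/time64.h>

	/* nanoseconds produced by a given TLR value: (TLR + 2) / clkrate */
	static u64 xlnx_tlr_to_ns(u32 tlr, u32 clkrate)
	{
		return div64_ul((u64)(tlr + 2) * NSEC_PER_SEC, clkrate);
	}

	/* biggest TLR whose period does not exceed the requested value */
	static int xlnx_ns_to_tlr(u64 ns, u32 clkrate, u32 max, u32 *tlr)
	{
		u64 max_ns = xlnx_tlr_to_ns(max, clkrate);
		u64 cycles;

		/* cap first so that ns * clkrate cannot overflow 64 bits */
		if (ns > max_ns)
			ns = max_ns;

		cycles = div64_ul(ns * clkrate, NSEC_PER_SEC);
		if (cycles < 2)
			return -ERANGE;	/* even TLR = 0 takes two cycles */

		*tlr = cycles - 2;
		return 0;
	}
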

 >
 >> >> > > IMO, this is the best way to prevent surprising results in the API.
 >> >> >
 >> >> > I think it's not possible in practice to refuse "near" misses and every
 >> >> > definition of "near" is in some case ridiculous. Also if you consider
 >> >> > the pwm_round_state() case you don't want to refuse any request to tell
 >> >> > as much as possible about your controller's capabilities. And then it's
 >> >> > straightforward to let .apply behave in the same way to keep complexity
 >> >> > low.
 >> >> >
 >> >> > > The real issue here is that it is impossible to determine the correct
 >> >> > > way to round the PWM a priori, and in particular, without considering
 >> >> > > both duty_cycle and period. If a consumer requests very small
 >> >> > > period/duty cycle which we cannot produce, how should it be rounded?
 >> >> >
 >> >> > Yeah, because there is no obviously right one, I picked one that is as
 >> >> > wrong as the other possibilities but is easy to work with.
 >> >> >
 >> >> > > Should we just set TLR0=1 and TLR1=0 to give them 66% duty cycle with
 >> >> > > the least period? Or should we try and increase the period to better
 >> >> > > approximate the % duty cycle? And both of these decisions must be made
 >> >> > > knowing both parameters. We cannot (for example) just always round up,
 >> >> > > since we may produce a configuration with TLR0 == TLR1, which would
 >> >> > > produce 0% duty cycle instead of whatever was requested. Rounding rate
 >> >> > > will introduce significant complexity into the driver. Most of the time
 >> >> > > if a consumer requests an invalid rate, it is due to misconfiguration
 >> >> > > which is best solved by fixing the configuration.
 >> >> >
 >> >> > In the first step, pick the biggest period not bigger than the requested
 >> >> > one, and then pick the biggest duty cycle that is not bigger than the
 >> >> > requested one and that can be set with the just-picked period. That is
 >> >> > the behaviour that all new drivers should implement. This is somewhat
 >> >> > arbitrary, but after quite some thought the most sensible in my eyes.
 >> >>
 >> >> And if there are no periods smaller than the requested period?
 >> >
 >> > Then return -ERANGE.
 >>
 >> Ok, so instead of
 >>
 >> 	if (cycles < 2 || cycles > priv->max + 2)
 >> 		return -ERANGE;
 >>
 >> you would prefer
 >>
 >> 	if (cycles < 2)
 >> 		return -ERANGE;
 >> 	else if (cycles > priv->max + 2)
 >> 		cycles = priv->max;
 >
 > The actual calculation is a bit harder in order to handle TCSR_UDT = 0, but in
 > principle, yes, but see below.
 >
 >> But if we do the above clamping for TLR0, then we have to recalculate
 >> the duty cycle for TLR1. Which I guess means doing something like
 >>
 >> 	ret = xilinx_timer_tlr_period(priv, &tlr0, tcsr0, state->period);
 >> 	if (ret)
 >> 		return ret;
 >>
 >> 	state->duty_cycle = mult_frac(state->duty_cycle,
 >> 				      xilinx_timer_get_period(priv, tlr0, tcsr0),
 >> 				      state->period);
 >>
 >> 	ret = xilinx_timer_tlr_period(priv, &tlr1, tcsr1, state->duty_cycle);
 >> 	if (ret)
 >> 		return ret;
 >
 > No, you need something like:
 >
 > 	/*
 > 	 * The multiplication cannot overflow as both priv->max and
 > 	 * NSEC_PER_SEC fit into a u32.
 > 	 */
 > 	max_period = div64_ul((u64)priv->max * NSEC_PER_SEC, clkrate);
 >
 > 	/* cap period to the maximal possible value */
 > 	if (state->period > max_period)
 > 		period = max_period;
 > 	else
 > 		period = state->period;
 >
 > 	/* cap duty_cycle to the maximal possible value */
 > 	if (state->duty_cycle > max_period)
 > 		duty_cycle = max_period;
 > 	else
 > 		duty_cycle = state->duty_cycle;

These caps may increase the relative duty cycle. For example, consider
the case where max_period is 100 and the consumer requests a period of
150 and a duty cycle of 75, i.e. a 50% relative duty cycle. The current
logic is equivalent to

	period = min(state->period, max_period);
	duty_cycle = min(state->duty_cycle, max_period);

which will result in a period of 100 and a duty cycle of 75, i.e. a 75%
relative duty cycle. Instead, we should do

	period = min(state->period, max_period);
	duty_cycle = mult_frac(state->duty_cycle, period, state->period);

which will result in a period of 100 and a duty cycle of 50.
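
Or, as a self-contained helper (sketch only, the name is made up and not
from the patch; 64-bit overflow for extreme periods is ignored, and the
PWM core guarantees state->period > 0 before .apply() is called):

	#include <linux/kernel.h>
	#include <linux/pwm.h>

	/*
	 * Cap the period to what the hardware can do, then scale the
	 * duty cycle by the same factor so that the requested ratio is
	 * preserved rather than inflated.
	 */
	static void xlnx_pwm_round_down(const struct pwm_state *state,
					u64 max_period,
					u64 *period, u64 *duty_cycle)
	{
		*period = min(state->period, max_period);
		*duty_cycle = mult_frac(state->duty_cycle, *period,
					state->period);
	}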

 > 	period_cycles = period * clkrate / NSEC_PER_SEC;
 >
 > 	if (period_cycles < 2)
 > 		return -ERANGE;
 >
 > 	duty_cycles = duty_cycle * clkrate / NSEC_PER_SEC;
 >
 > 	/*
 > 	 * The hardware cannot emit a 100% relative duty cycle; if
 > 	 * duty_cycles >= period_cycles is programmed, the hardware
 > 	 * emits a 0% relative duty cycle instead.
 > 	 */
 > 	if (duty_cycles == period_cycles)
 > 		duty_cycles = period_cycles - 1;
 >
 > 	/*
 > 	 * The hardware cannot emit a duty_cycle of one clk step, so
 > 	 * emit 0 instead.
 > 	 */
 > 	if (duty_cycles < 2)
 > 		duty_cycles = period_cycles;

Of course, the above may still round a requested 100% duty cycle down to
0% (e.g. when period_cycles is 2). I feel that is too big a jump to
ignore. Perhaps, if we cannot return -ERANGE, we should at least
dev_warn().
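
Something like this is what I have in mind (sketch only; chip and state
being the usual .apply() arguments):

	/*
	 * After the adjustments above, duty_cycles == period_cycles
	 * means the hardware will emit 0%.  If the consumer actually
	 * asked for a non-zero duty cycle, leave a note in the log
	 * rather than dropping to 0% silently.
	 */
	if (duty_cycles >= period_cycles && state->duty_cycle)
		dev_warn(chip->dev,
			 "duty cycle %llu ns rounded down to 0%%\n",
			 state->duty_cycle);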

--Sean

 >> >> > > > > Perhaps I should add
 >> >> > > > >
 >> >> > > > > 	if (tlr0 <= tlr1)
 >> >> > > > > 		return -EINVAL;
 >> >> > > > >
 >> >> > > > > here to prevent accidentally getting 0% duty cycle.
 >> >> > > >
 >> >> > > > You can assume that duty_cycle <= period when .apply is called.
 >> >> > >
 >> >> > > Ok, I will only check for == then.
 >> >> >
 >> >> > You just have to pay attention to the case that you had to decrement
 >> >> > .period to the next possible value. Then .duty_cycle might be bigger
 >> >> > than the corrected period.
 >> >>
 >> >> This is specifically to prevent 100% duty cycle from turning into 0%. My
 >> >> current draft is
 >> >>
 >> >> 	/*
 >> >> 	 * If TLR0 == TLR1, then we will produce 0% duty cycle instead of 100%
 >> >> 	 * duty cycle. Try and reduce the high time to compensate. If we can't
 >> >> 	 * do that because the high time is already 0 cycles, then just error
 >> >> 	 * out.
 >> >> 	 */
 >> >> 	if (tlr0 == tlr1 && !tlr1--)
 >> >> 		return -EINVAL;
 >> >
 >> > If you follow my suggested policy this isn't an error and you should
 >> > yield the biggest duty_cycle here even if it is zero.
 >>
 >> So like this?
 >>
 >> 	if (tlr0 == tlr1) {
 >> 		if (tlr1)
 >> 			tlr1--;
 >> 		else if (tlr0 != priv->max)
 >> 			tlr0++;
 >> 		else
 >> 			return -ERANGE;
 >> 	}
 >
 > No, this is wrong as it configures a longer period than requested in
 > some cases.
 >
 >> And I would really appreciate it if you could write up some documentation
 >> with common errors and how to handle them. It's not at all obvious to me
 >> what all the implications of the above guidelines are.
 >
 > Yes, I fully agree this should be documented and doing that is on my
 > todo list. Until I get around to doing this, enabling PWM_DEBUG should
 > help you get this right (assuming you test extensively and read the
 > resulting kernel messages).
 >
 > Best regards
 > Uwe
 >
