[OpenWrt-Devel] [PATCH v2] gpio-button-hotplug: mind debounce interval consistently

Christian Lamparter chunkeey at gmail.com
Sat Jul 13 18:04:24 EDT 2019


Hi David,

On Saturday, July 13, 2019 7:53:05 PM CEST David Bauer wrote:
> first of all, i wouldn't have imagined that there is so mush to talk about pressing buttons ;)

Well, if you remember my first versions that I pushed out of the gpio-button
rework were really buggy, so... :(
On the other hand, the "git commit && git push" is just one
"enter" away. ;) We can finish this "right now" and deal with
the "shoot from the hip" later. (Though this is not advised)

> On 13.07.19 18:04, Christian Lamparter wrote:
> > On Wednesday, July 10, 2019 12:31:12 AM CEST David Bauer wrote:
> >> thanks for your reworked patch. I think i found two bugs around the debounce
> >> interval for both flavors. I'll have to admit, they are both edgecases ;)
> >>
> >> On 06.07.19 23:57, Christian Lamparter wrote
> >>> -static void gpio_keys_polled_check_state(struct gpio_keys_button_data *bdata)
> >>> +static void gpio_keys_handle_button(struct gpio_keys_button_data *bdata)
> >>>   {
> >>> +	unsigned int type = bdata->b->type ?: EV_KEY;
> >>>   	int state = gpio_button_get_value(bdata);
> >>> +	unsigned long seen = jiffies;
> >>>   
> >>> -	if (state != bdata->last_state) {
> >>> -		unsigned int type = bdata->b->type ?: EV_KEY;
> >>> +	pr_debug(PFX "event type=%u, code=%u, pressed=%d\n",
> >>> +		 type, bdata->b->code, state);
> >>> +
> >>> +	/* is this the initialization state? */
> >>> +	if (bdata->last_state == -1) {
> >>> +		/*
> >>> +		 * Don't advertise unpressed buttons on initialization.
> >>> +		 * Just save their state and continue otherwise this
> >>> +		 * can cause OpenWrt to enter failsafe.
> >>> +		 */
> >>> +		if (type == EV_KEY && state == 0)
> >>> +			goto set_state;
> >>
> >> As we are calling gpio_keys_handle_button every poll-interval, we need 
> >> to make sure we save the initial state AFTER the button has been stable 
> >> for the debounce interval. For irq-based keys we get this "for free" as 
> >> we modify the execution timer for the irq handling function.
> > 
> > Ah, I think due to how the patch looks (as in this mail and not applied
> > to the source code) this feature/behavior got lost there. 
> > 
> > I've added the gpio_keys_handle_button() below after that version of the
> > patch is applied.
> > 
> > ---
> > static void gpio_keys_handle_button(struct gpio_keys_button_data *bdata)
> > {
> >         unsigned int type = bdata->b->type ?: EV_KEY;
> >         int state = gpio_button_get_value(bdata);
> >         unsigned long seen = jiffies;
> > 
> >         pr_debug(PFX "event type=%u, code=%u, pressed=%d\n",
> >                  type, bdata->b->code, state);
> > 
> >         /* is this the initialization state? */
> >         if (bdata->last_state == -1) {
> >                 /*
> >                  * Don't advertise unpressed buttons on initialization.
> >                  * Just save their state and continue otherwise this
> >                  * can cause OpenWrt to enter failsafe.
> >                  */
> >                 if (type == EV_KEY && state == 0)
> >                         goto set_state;
> >                 /*
> >                  * But we are very interested in pressed buttons and
> >                  * initial switch state. These will be reported to
> >                  * userland.
> >                  */
> >         } else if (bdata->last_state == state) {
> >                 /* reset asserted counter (only relevant for polled keys) */
> >                 bdata->count = 0;
> >                 return;
> >         }
> > 
> >         if (bdata->count < bdata->threshold) {
> >                 bdata->count++;
> >                 return;
> >         }
> > 
> >         if (bdata->seen == 0)
> >                 bdata->seen = seen;
> > 
> >         button_hotplug_create_event(button_map[bdata->map_entry].name, type,
> >                                     (seen - bdata->seen) / HZ, state);
> >         bdata->seen = seen;
> > 
> > set_state:
> >         bdata->last_state = state;
> >         bdata->count = 0;
> > }
> > ---
> > 
> > The pressed button state (for the polled) version will now always have to be
> > stable for the debounce-duration (including on the initial state). That's
> > guaranteed by the "if (bdata->count < bdata->threshold) { count++;return; }"
> > path. If there's a laps during the initial state then last_state gets set to
> > 0 and the counter is reset. But this bogus "released" event is not reported
> > to userspace, so if it's an inital pressed state it will still be reported
> > correctly, altough it might take another full debounce-duration. 
> 
> Hmm, i might have some problems grasping here, but let's take the following
> situation:
> 
>      goto set_state called here
>      |
> 0    5    10   15   20   25   30
> |    |    |    |    |    |    |
> 1111110000000000000000000000000
> 
> Debounce: 10ms
> Poll:     5ms

This might be "easy" or really complicated... (So please hold on.)

> >                 if (type == EV_KEY && state == 0)
> >                         goto set_state;

The "goto set_setstate;" only happens if state = 0 (Which in gpio-button
context means) that the button is in an unpressed state.

But, there's more going on here. In this example for this button there is:

bdata->threshold = DIV_ROUND_UP(button->debounce_interval,
                                pdata->poll_interval); // this is existing code 
Which breaks down to:
bdata->threshold = __KERNEL_DIV_ROUND_UP(10, 5)
bdata->threshold = (10 + 5 - 1) / 5 = 14 / 5 = 2 (integer math)

Due to: 

        if (bdata->count < bdata->threshold) {
                bdata->count++;
                return;
        }

count must be 2 for this exit path to get skipped. However the "count" always
starts at 0 (devm_kzalloc and count=0;)  so there are effectively 3 poll intervals
 (0, 1 and 2) that get checked before the event goes out. Which should work
out as >=10 ms in this case, since on the first go through we start at 0.

> So, from my understanding (and to keep the behavior in line), the initial stable
> state for GPIO-polled should be set to 0 at 20ms. No event should be created
> for this.
> 
> With the gpio_keys_handle_button function you snipped above however, we would initially
> set our state to 1 and would fire a release event at 20ms, even though this is our real
> initial stable state.
> 
> failsafe is triggered on both edges, so pressed & released events would trigger it.
> 
> Or am i missing something here?

Ok, now I'm not so sure anymore what you mean (see above and below).

In my comment on first "v2" of this patch 
<https://patchwork.ozlabs.org/patch/1117787/> I commented that for the polled
case we already wait "software_debounce" until gpio-button does the first read.
(For reference, it's the "Hm, well since the first state is -1 we could just as
well schedule the work immediately here... ") so in a way the first 10 ms in this
example are already skipped/ignored at the moment. But we can change that and
schedule the poll immediately (like in the current openwrt version) 

But in regards to the current state of the failsafe script, you are
definitly right. This is confusing and something smells rotten.

Currently the failsafe script looks like this (procd basically forwards
everything related to BUTTON during preinit to this script thanks to
the handler in /etc/hotplug-preinit.json):

|#!/bin/sh
|
|[ "${TYPE}" = "switch" ] || echo ${BUTTON} > /tmp/failsafe_button
|
|return 0

So before my gpio-button commits it looked like this:

|static void gpio_keys_polled_check_state(struct gpio_keys_button_data *bdata)
|{
|	int state = gpio_button_get_value(bdata);
|
|	if (state != bdata->last_state) {
|		unsigned int type = bdata->b->type ?: EV_KEY;
|
|		if (bdata->count < bdata->threshold) {
|			bdata->count++;
| 			return;
|		}
|
|		if ((bdata->last_state != -1) || (type == EV_SW))
|			button_hotplug_event(bdata, type, state);
|
|		bdata->last_state = state;
|	}
|
|	bdata->count = 0;
|}

You can see that gpio_keys_polled_check_state() always "ate" initial
state event for polled buttons (yes, both states - pressed and unpressed -) got
ignored.  Which I think is very wrong... mostly because it goes against the 
decades of experience I have with "holding down buttons while powering up devices"
and expect them to get noticed properly :D. That's why my code only eats the initial
unpressed/0 event, but reports the pressed button event. This should still be
compatible with the current failsafe setup. 

Related Note: 
As for the sudden burst of the "device is always entering failsafe".
I can see how this is causing problems now. Because if the polarity of
a button was declared (or that got copied from another device without
checking ;) ) with the wrong way; This will now become a problem because
it "physically unpressed" button gets reported as "pressed" and this causes
the device to always enter failsafe (unless the very button with the wrong
polarity is pressed, which in this context means unpressed).

However, I think this is a problem of the individual dts/board support code.
But sadly I have currently no time to monitor the forums, bugtracker, ML or
github for these problems... so what to do?

> > For the irq-driven case, threshold is now 0 and therefore this gets skipped.
> > Which should be fine since we schedule the gpio_keys_handle_button() after
> > the debounce-interval so it should be stable at that point (if not then the
> > debounce interval needs tweaking).
> > 
> > So, the button behavoir should be Ok. The "switch" case is a definitely
> > more lenient in that the initial state. It could jump between 0 and 1 and
> > not reset. However, a "physical" switch that manages to constantly
> > flip-flop during the poll interval probably has a electrical problem.
> > 
> >>> +		/*
> >>> +		 * But we are very interested in pressed buttons and
> >>> +		 * initial switch state. These will be reported to
> >>> +		 * userland.
> >>> +		 */
> >>> +	} else if (bdata->last_state == state) {
> >>> +		/* reset asserted counter (only relevant for polled keys) */
> >>
> >> This is also needed for irq-driven keys. If the state changes two times 
> >> within the debounce interval, gpio_keys_handle_button is still executed 
> >> and we need to terminate here. Otherwise, we would skip a "release" or
> >> "push" action.
> >>
> >> I did rework those two parts a bit and tested it with the irq and
> >> polled flavors without a problem. See below:
> >>
> > I don't follow there? In your v2, you have fixed the debounce-rearming
> > timer for the irq-case by doing mod_delayed_work() function, which is
> > correct! So, if a new event within the debounce interval, the timer
> > will get rescheduled by another irq to fire after after a new
> > debounce-interval passed (so it should be stable by then).
> 
> You are right. Your commit was relating to the reset of the counter, my remark was
> about "There should be no events in case last_state == current_state also for irq
> keys. So no functional problem to see here :)

Ok, so this is fine?

As everyone who follows the thread can see: The struggle is real!
Even for something as seemingly simple and benign as a gpio-button.

Cheers,
Christian



_______________________________________________
openwrt-devel mailing list
openwrt-devel at lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel



More information about the openwrt-devel mailing list