[Pcsclite-muscle] Rare race condition in SCardGetStatusChange()

Maximilian Stein maximilian.stein at secunet.com
Tue Apr 17 06:12:26 PDT 2018


Hello all,

recently we stumbled upon another (rare) race condition in the
SCardGetStatusChange() function. I especially ask Maksim Ivanov and
Florian Kaiser kindly to share their opinion about the problem and the
proposed fix, since they did a good job in improving
SCardGetStatusChange/SCardCancel in the recent past.


The problem:
The problem is basically the small gap between fetching the reader
states from pcscd [winscard_clnt.c:1741 and 2119] and registering for
event notifications [winscard_clnt.c:2070]. If a notification is sent in
this time period, SCardGetStatusChange() will sleep until another event
or the internal time-out is reached. Consider the following flow of
execution and events:

Precondition: [client] knows the current state of all readers from a
previous call to SCardGetStatusChange() and no changes happened since
then. Basically consider a thread that monitors the reader states which
calls SCardGetStatusChange() in a loop.
1. [client] calls SCardGetStatusChange() and fetches the reader states
from pcscd (line 1741). The fetched states are equal to the already
known states.
2. [pcscd] notices a change in any terminal state (smartcard movement or
number of connected clients to the smartcard changes) and sends a signal
to all *registered* clients via EHSignalEventToClients(). [client] is
not yet registered for event notifications, so EHSignalEventToClients()
won't send anything.
3. [client] continues execution and compares the given reader states to
the states fetched before [pcscd] noticed the change. The states are
equal so it registers for event notifications (line 2070). No
notification was sent, so [client] sleeps until the internal time-out
(60 seconds) is reached.
4. [client] reaches the internal time-out and fetches the current reader
states from [pcscd]. Now the states are different and
SCardGetStatusChange returns.

In the end the state change is noticed thanks to the internal polling
mechanism, but in automated environments, 60 seconds is a very long time
until a state change is detected. The error is most likely to occur if
multiple readers are used and state changes are frequent. Even more
likely it occurs if reader state polling is used by pcscd, i.e. when the
IFD handler does not support asynchronous card event notification (no
capability TAG_IFD_POLLING_THREAD_WITH_TIMEOUT). Then multiple readers
can accumulate their events to be processed in a very small time frame.


Suggestion for a fix:
The proposed fix makes fetching the reader states and registering for
event notifications an atomic action. The command
CMD_WAIT_READER_STATE_CHANGE expected no return value anyway, so I made
it return the reader states equal to CMD_GET_READERS_STATE. The action
is protected by the ClientsWaitingForEvent_lock in eventhandler.c, which
prevents parallel calls of MSGSignalClient() via
EHSignalEventToClients(). This is necessary to prevent a signal before
the reader states are sent, which would appear as garbage in the client
socket.

With the proposed fix, the client is registered for events after the
reader states were fetched. So if any difference is found in the local
and remote state (so that SCardGetStatusChange() returns) we have to
unregister from events. This was not necessary before, but works just
like unregistering after a timeout. This could be refined by checking
why the loop was exited and only unregister if necessary.

Unfortunately the proposed fix will slightly alter the internal protocol
between libpcsclite and pcscd, breaking statically linked client
applications with newer pcscd versions.


Further related thoughts:
I'm a bit uncertain if my proposed fix works nicely with SCardCancel(),
because I can think of one very rare situation when
SCardGetStatusChange() times out and unregisters from event
notifications, then gets cancelled but is not informed about that. Then
it re-registers for notifications, because no changes happened. Thus it
will not returned despite it was cancelled. But this should have been an
issue even before my fix.

I think the notification mechanism could be improved by using "response
headers" analogous to the server side, or just an additional field
"command" in the data structs. This way every message related to reader
state events could be identified by the client and handled respectively.
As I understand it, some of the past issues were because of signal,
cancel or stop-reader-state-change messages messing up the client socket
data. With a command field it can be decided what data the client
received, and what data is still expected to be received.


Best regards
Maximilian Stein
-------------- next part --------------
A non-text attachment was scrubbed...
Name: SCardGetStatusChange-Fix-rare-race-condition.patch
Type: text/x-patch
Size: 8253 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/pcsclite-muscle/attachments/20180417/9dff30bc/attachment.bin>


More information about the pcsclite-muscle mailing list