[PATCH v2 06/28] kernel: Define gettimeofday vdso common code

Thomas Gleixner tglx at linutronix.de
Fri Feb 8 11:28:07 PST 2019


On Fri, 8 Feb 2019, Will Deacon wrote:
> On Fri, Dec 07, 2018 at 05:53:21PM +0000, Will Deacon wrote:
> > Anyway, moving the counter read into the protected region is a little fiddly
> > because the memory barriers we have in there won't give us the ordering we
> > need. We'll instead need to do something nasty, like create a dependency
> > from the counter read to the read of the seqlock:
> > 
> > Maybe the untested crufty hack below, although this will be a nightmare to
> > implement in C.
> We discussed this in person this week, but you couldn't recall the details
> off the top of your head so I'm replying here. Please could you clarify what
> your concern was with the existing code, and whether or not I've got the
> wrong end of the stick?

If you just collect the variables under the seqcount protection and have
the readout of the timer outside of it then you are not guaranteeing
consistent state.

The problem is:

	do {
		seq = read_seqcount_begin(d->seq);
		last = d->cycle_last;
		mult = d->mult;
		shift = d->shift;
		ns = d->ns_base;
	while (read_seqcount_retry(d->seq, seq));

	ns += ((read_clock() - last) * mult);
	ns >>= shift;

So on the first glance this looks consistent because you collect all data
and then do the read and calc outside the loop.

But if 'd->mult' gets updated _before_ read_clock() then you can run into a
situation where time goes backwards with the next read.

Here is the flow you need for that:

t1 = gettime()

     ---> Interrupt, updates mult (mult becomes smaller)

     	  This can expand over a very long time when the task is scheduled
     	  out here and there are multiple updates in between. The farther
     	  out the read is delayed, the more likely the problem is going to


t2 = gettime()

This second read uses updated data, i.e. the smaller mult. So depending on
the distance of the two reads and the difference of the mult, the caller
can observe t2 < t1, which is a NONO. You can observe it on a single
CPU. No virt, SMP, migration needed at all.



More information about the linux-arm-kernel mailing list