[RFC PATCH v2] ptp: Add vDSO-style vmclock support

David Woodhouse dwmw2 at infradead.org
Thu Jun 27 09:03:04 PDT 2024


I've updated the tree at
https://git.infradead.org/users/dwmw2/linux.git/shortlog/refs/heads/vmclock
(but not yet the qemu one).

I think I've taken into account all your comments apart from the one
about non-64-bit counters wrapping. I reduced seq_count to 32 bits to
make room for a 32-bit flags field, and added the time type
(UTC/TAI/MONOTONIC) and a smearing hint, with straw-man definitions
for the smearing algorithms for which I could actually find a
specification.

The structure now looks like this:


struct vmclock_abi {
	uint32_t magic;
#define VMCLOCK_MAGIC	0x4b4c4356 /* "VCLK" */
	uint16_t size;		/* Size of page containing this structure */
	uint16_t version;	/* 1 */

	/* Sequence lock. Low bit means an update is in progress. */
	uint32_t seq_count;

	uint32_t flags;
	/* Indicates that the tai_offset_sec field is valid */
#define VMCLOCK_FLAG_TAI_OFFSET_VALID		(1 << 0)
	/*
	 * Optionally used to notify guests of pending maintenance events.
	 * A guest may wish to remove itself from service if an event is
	 * coming up. Two flags indicate the rough imminence of the event.
	 */
#define VMCLOCK_FLAG_DISRUPTION_SOON		(1 << 1) /* About a day */
#define VMCLOCK_FLAG_DISRUPTION_IMMINENT	(1 << 2) /* About an hour */
	/* Indicates that the utc_time_maxerror_picosec field is valid */
#define VMCLOCK_FLAG_UTC_MAXERROR_VALID		(1 << 3)
	/* Indicates counter_period_error_rate_frac_sec is valid */
#define VMCLOCK_FLAG_PERIOD_ERROR_VALID		(1 << 4)

	/*
	 * This field changes to another non-repeating value when the CPU
	 * counter is disrupted, for example on live migration. This lets
	 * the guest know that it should discard any calibration it has
	 * performed of the counter against external sources (NTP/PTP/etc.).
	 */
	uint64_t disruption_marker;

	uint8_t clock_status;
#define VMCLOCK_STATUS_UNKNOWN		0
#define VMCLOCK_STATUS_INITIALIZING	1
#define VMCLOCK_STATUS_SYNCHRONIZED	2
#define VMCLOCK_STATUS_FREERUNNING	3
#define VMCLOCK_STATUS_UNRELIABLE	4

	uint8_t counter_id;
#define VMCLOCK_COUNTER_INVALID		0
#define VMCLOCK_COUNTER_X86_TSC		1
#define VMCLOCK_COUNTER_ARM_VCNT	2
#define VMCLOCK_COUNTER_X86_ART		3

	/*
	 * By providing the offset from UTC to TAI, the guest can know both
	 * UTC and TAI reliably, whichever is indicated in the time_type
	 * field. Valid if VMCLOCK_FLAG_TAI_OFFSET_VALID is set in flags.
	 */
	int16_t tai_offset_sec;

	/*
	 * The time exposed through this device is never smeared; if it
	 * claims to be VMCLOCK_TIME_UTC then it MUST be UTC. This field
	 * provides a hint to the guest operating system, such that *if*
	 * the guest OS wants to provide its users with an alternative
	 * clock which does not follow the POSIX CLOCK_REALTIME standard,
	 * it may do so in a fashion consistent with the other systems
	 * in the nearby environment.
	 */
	uint8_t leap_second_smearing_hint;
	/* Provide true UTC to users, unsmeared. */
#define VMCLOCK_SMEARING_NONE			0
	/*
	 * https://aws.amazon.com/blogs/aws/look-before-you-leap-the-coming-leap-second-and-aws/
	 * From noon on the day before to noon on the day after, smear the
	 * clock by a linear 1/86400s per second.
	 */
#define VMCLOCK_SMEARING_LINEAR_86400		1
	/*
	 * draft-kuhn-leapsecond-00
	 * For the 1000s leading up to the leap second, smear the clock by
	 * a linear 1ms per second.
	 */
#define VMCLOCK_SMEARING_UTC_SLS		2

	/*
	 * What time is exposed in the time_sec/time_frac_sec fields?
	 */
	uint8_t time_type;
#define VMCLOCK_TIME_UNKNOWN		0	/* Invalid / no time exposed */
#define VMCLOCK_TIME_UTC		1	/* Since 1970-01-01 00:00:00z */
#define VMCLOCK_TIME_TAI		2	/* Since 1970-01-01 00:00:00z */
#define VMCLOCK_TIME_MONOTONIC		3	/* Since undefined epoch */

	/* Bit shift for counter_period_frac_sec and its error rate */
	uint8_t counter_period_shift;

	/*
	 * Unlike in NTP, this can indicate a leap second in the past. This
	 * is needed to allow guests to derive an imprecise clock with
	 * smeared leap seconds for themselves, as some modes of smearing
	 * need the adjustments to continue even after the moment at which
	 * the leap second should have occurred.
	 */
	int8_t leapsecond_direction;
	uint64_t leapsecond_tai_sec; /* Since 1970-01-01 00:00:00z */

	/*
	 * Paired values of counter and time at a given point in time. The
	 * kind of time (UTC/TAI/MONOTONIC) is indicated by time_type.
	 */
	uint64_t counter_value;
	uint64_t time_sec; /* Since 1970-01-01 00:00:00z */
	uint64_t time_frac_sec;

	/*
	 * Counter period, and error margin. The unit of these fields is
	 * seconds >> (64 + counter_period_shift)
	 */
	uint64_t counter_period_frac_sec;
	uint64_t counter_period_error_rate_frac_sec;

	/* Error margin of UTC reading above (± picoseconds) */
	uint64_t utc_time_maxerror_picosec;
};

#endif /*  __VMCLOCK_H__ */
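
For illustration, here is roughly how I'd expect a guest to consume the
seq_count field. This is only a sketch, not the driver code: struct
vmclock_view, struct vmclock_snapshot, vmclock_snapshot_read() and the
retry bound are names invented for this example, the view carries only
a few of the real fields, and it assumes native-endian fields and the
GCC/Clang __atomic builtins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cut-down view of the shared page: only the fields used here. */
struct vmclock_view {
	uint32_t seq_count;	/* low bit set: update in progress */
	uint64_t counter_value;
	uint64_t time_sec;
	uint64_t time_frac_sec;
};

/* Hypothetical private copy taken by the reader. */
struct vmclock_snapshot {
	uint64_t counter_value;
	uint64_t time_sec;
	uint64_t time_frac_sec;
};

/*
 * Try to take a consistent snapshot. Returns false if no stable copy
 * could be taken within max_tries attempts (the bound is an assumption
 * of this sketch, not part of the proposed ABI).
 */
static bool vmclock_snapshot_read(const volatile struct vmclock_view *clk,
				  struct vmclock_snapshot *out,
				  unsigned int max_tries)
{
	assert(clk && out);

	while (max_tries--) {
		uint32_t seq = __atomic_load_n(&clk->seq_count, __ATOMIC_ACQUIRE);

		if (seq & 1)
			continue;	/* update in progress, try again */

		out->counter_value = clk->counter_value;
		out->time_sec = clk->time_sec;
		out->time_frac_sec = clk->time_frac_sec;

		/* Order the field reads before the re-check of seq_count. */
		__atomic_thread_fence(__ATOMIC_ACQUIRE);
		if (__atomic_load_n(&clk->seq_count, __ATOMIC_RELAXED) == seq)
			return true;
	}
	return false;
}
```

A caller that gets false back would fall back to whatever clock source
it was using before, rather than spin indefinitely.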

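To make the units of counter_period_frac_sec concrete, this is a
minimal sketch of turning a counter reading into a time. It assumes
that time_frac_sec is in units of 2^-64 seconds (my reading of the
"seconds >> (64 + counter_period_shift)" definition, not something
spelled out above), and it relies on the unsigned __int128 extension
in GCC/Clang. struct vmclock_time and vmclock_counter_to_time() are
made-up names for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: whole seconds plus a 64-bit binary fraction. */
struct vmclock_time {
	uint64_t sec;
	uint64_t frac_sec;	/* assumed unit: 2^-64 s */
};

/*
 * Extrapolate from the paired (counter_value, time_sec, time_frac_sec)
 * reference point to the counter reading 'now'.
 */
static struct vmclock_time
vmclock_counter_to_time(uint64_t now, uint64_t counter_value,
			uint64_t time_sec, uint64_t time_frac_sec,
			uint64_t counter_period_frac_sec,
			uint8_t counter_period_shift)
{
	struct vmclock_time t;
	unsigned __int128 elapsed;

	/* Shifting a 128-bit value by 128 or more is undefined in C. */
	assert(counter_period_shift < 128);

	/* Ticks since the reference point, in units of 2^-(64 + shift) s. */
	elapsed = (unsigned __int128)(now - counter_value) *
		  counter_period_frac_sec;

	/* Drop the extra shift to get units of 2^-64 s (rounds down). */
	elapsed >>= counter_period_shift;

	/* Add the fractional part of the reference time. */
	elapsed += time_frac_sec;

	/* The top 64 bits are whole seconds, the low 64 bits the fraction. */
	t.sec = time_sec + (uint64_t)(elapsed >> 64);
	t.frac_sec = (uint64_t)elapsed;
	return t;
}
```

For example, a period of 1ULL << 63 with a shift of 0 means half a
second per tick, so three ticks past a reference time of 10s come out
as 11s plus a fraction of 1ULL << 63.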