draft-ietf-ntp-ntpv4-proto-02.txt   draft-ietf-ntp-ntpv4-proto-03.txt 
NTP WG J. Burbank, Ed. NTP WG J. Burbank, Ed.
Internet-Draft JHU/APL Internet-Draft JHU/APL
Obsoletes: RFC 4330 (if approved) J. Martin, Ed. Obsoletes: RFC 4330 (if approved) J. Martin, Ed.
Expires: September 2, 2006 Netzwert AG Intended status: Standards Track Netzwert AG
D. Mills Expires: April 26, 2007 D. Mills
U. Del. U. Del.
October 23, 2006
The Network Time Protocol Version 4 Protocol Specification Network Time Protocol Version 4 Reference and Implementation Guide
draft-ietf-ntp-ntpv4-proto-02 draft-ietf-ntp-ntpv4-proto-03
Status of this Memo Status of this Memo
By submitting this Internet-Draft, each author represents that any By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79. aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
skipping to change at page 1, line 36 skipping to change at page 1, line 37
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on September 2, 2006. This Internet-Draft will expire on April 26, 2007.
Copyright Notice Copyright Notice
Copyright (C) The Internet Society (2006). Copyright (C) The Internet Society (2006).
Abstract Abstract
The Network Time Protocol (NTP) is widely used to synchronize The Network Time Protocol (NTP) is widely used to synchronize
computer clocks in the Internet. This memorandum describes Version 4 computer clocks in the Internet. This memorandum describes Version 4
of the NTP (NTPv4), introducing several changes from Version 3 of NTP of the NTP (NTPv4), introducing several changes from Version 3 of NTP
(NTPv3) described in RFC 1305, including the introduction of a (NTPv3) described in RFC 1305, including the introduction of a
modified protocol header to accommodate Internet Protocol Version 6. modified protocol header to accommodate Internet Protocol Version 6.
NTPv4 also includes optional extensions to the NTPv3 NTPv4 also includes optional extensions to the NTPv3
protocol, including a dynamic server discovery mechanism. protocol, including a dynamic server discovery mechanism.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1. Requirements Notation . . . . . . . . . . . . . . . . . . 4 1.1. Requirements Notation . . . . . . . . . . . . . . . . . . 4
2. NTP Timestamp . . . . . . . . . . . . . . . . . . . . . . . . 4 2. Modes of Operation . . . . . . . . . . . . . . . . . . . . . 4
3. NTP Message Formats . . . . . . . . . . . . . . . . . . . . . 6 3. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 6
3.1. Leap Indicator (LI) . . . . . . . . . . . . . . . . . . . 7 4. Implementation Model . . . . . . . . . . . . . . . . . . . . 9
3.2. Version (VN) . . . . . . . . . . . . . . . . . . . . . . . 8 5. Data Types . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.3. Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 6. Data Structures . . . . . . . . . . . . . . . . . . . . . . . 15
3.4. Stratum (Strat) . . . . . . . . . . . . . . . . . . . . . 9 6.1. Structure Conventions . . . . . . . . . . . . . . . . . . 16
3.5. Poll Interval (Poll) . . . . . . . . . . . . . . . . . . . 9 6.2. Global Parameters . . . . . . . . . . . . . . . . . . . . 16
3.6. Precision (Prec) . . . . . . . . . . . . . . . . . . . . . 9 6.3. Packet Header Variables . . . . . . . . . . . . . . . . . 18
3.7. Root Delay . . . . . . . . . . . . . . . . . . . . . . . . 9 6.3.1. The Kiss-o'-Death Packet . . . . . . . . . . . . . . 22
3.8. Root Dispersion . . . . . . . . . . . . . . . . . . . . . 10 6.3.2. NTP Extension Field Format . . . . . . . . . . . . . 23
3.9. Reference Identifier . . . . . . . . . . . . . . . . . . . 10 7. On Wire Protocol . . . . . . . . . . . . . . . . . . . . . . 25
3.10. Reference Timestamp . . . . . . . . . . . . . . . . . . . 11 8. Peer Process . . . . . . . . . . . . . . . . . . . . . . . . 29
3.11. Originate Timestamp . . . . . . . . . . . . . . . . . . . 11 8.1. Peer Process Variables . . . . . . . . . . . . . . . . . 30
3.12. Receive Timestamp . . . . . . . . . . . . . . . . . . . . 11 8.2. Peer Process Operations . . . . . . . . . . . . . . . . . 32
3.13. Transmit Timestamp . . . . . . . . . . . . . . . . . . . . 11 9. Clock Filter Algorithm . . . . . . . . . . . . . . . . . . . 39
3.14. NTPv4 Extension Fields . . . . . . . . . . . . . . . . . . 12 10. System Process . . . . . . . . . . . . . . . . . . . . . . . 42
3.15. Authentication (optional) . . . . . . . . . . . . . . . . 13 10.1. System Process Variables . . . . . . . . . . . . . . . . 42
4. NTP Protocol Operation . . . . . . . . . . . . . . . . . . . . 14 10.2. System Process Operations . . . . . . . . . . . . . . . . 43
5. SNTP Protocol Operation . . . . . . . . . . . . . . . . . . . 17 10.2.1. Selection Algorithm . . . . . . . . . . . . . . . . . 44
6. NTP Server Operations . . . . . . . . . . . . . . . . . . . . 18 10.2.2. Clustering Algorithm . . . . . . . . . . . . . . . . 46
7. NTP Client Operations . . . . . . . . . . . . . . . . . . . . 20 10.2.3. Combining Algorithm . . . . . . . . . . . . . . . . . 48
8. NTP Symmetric Peer Operations . . . . . . . . . . . . . . . . 22 10.2.4. Clock Discipline Algorithm . . . . . . . . . . . . . 52
9. Dynamic Server Discovery . . . . . . . . . . . . . . . . . . . 22 10.3. Clock Adjust Process . . . . . . . . . . . . . . . . . . 60
10. The Kiss-o'-Death Packet . . . . . . . . . . . . . . . . . . . 23 11. Poll Process . . . . . . . . . . . . . . . . . . . . . . . . 61
11. Security Considerations . . . . . . . . . . . . . . . . . . . 24 11.1. Poll Process Variables and Parameters . . . . . . . . . . 61
12. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 25 11.2. Poll Process Operations . . . . . . . . . . . . . . . . . 62
13. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 25 12. Security Considerations . . . . . . . . . . . . . . . . . . . 63
14. References . . . . . . . . . . . . . . . . . . . . . . . . . . 25 13. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 63
14.1. Normative References . . . . . . . . . . . . . . . . . . . 25 14. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 64
14.2. Informative References . . . . . . . . . . . . . . . . . . 25 15. References . . . . . . . . . . . . . . . . . . . . . . . . . 64
Appendix A. NTP Control Messages . . . . . . . . . . . . . . . . 27 15.1. Normative References . . . . . . . . . . . . . . . . . . 64
A.1. NTP Control Message Format . . . . . . . . . . . . . . . . 28 15.2. Informative References . . . . . . . . . . . . . . . . . 64
A.2. Status Words . . . . . . . . . . . . . . . . . . . . . . . 30 Appendix A. Code Skeleton . . . . . . . . . . . . . . . . . . . 65
A.2.1. System Status Word . . . . . . . . . . . . . . . . . . 31 A.1. Global Definitions . . . . . . . . . . . . . . . . . . . 65
A.2.2. Peer Status Word . . . . . . . . . . . . . . . . . . . 33 A.2. Definitions, Constants, Parameters . . . . . . . . . . . 65
A.2.3. Clock Status Word . . . . . . . . . . . . . . . . . . 34 A.3. Packet Data Structures . . . . . . . . . . . . . . . . . 69
A.2.4. Error Status Word . . . . . . . . . . . . . . . . . . 35 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 109
A.3. Commands . . . . . . . . . . . . . . . . . . . . . . . . . 36 Intellectual Property and Copyright Statements . . . . . . . . . 110
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 39
Intellectual Property and Copyright Statements . . . . . . . . . . 40
1. Introduction 1. Introduction
The Network Time Protocol Version 3 (NTPv3) [1] has been widely used This document is a reference and implementation guide for the Network
to synchronize computer clocks in the global Internet. It provides Time Protocol Version 4 (NTPv4), which is widely used to synchronize
comprehensive mechanisms to access national time and frequency the system clocks among a set of distributed time servers and
dissemination services, organize the NTP subnet of servers and clients. This document defines the core architecture, protocol,
clients and adjust the system clock in each participant. In most state machines, data structures and algorithms. This document
places on the Internet of today, NTP provides accuracies of 1-50 ms, describes NTPv4, which introduces new functionality to NTPv3 as
depending on the characteristics of the synchronization source and described in [1], and functionality expanded from that of SNTPv4 as
network paths. described in [2] (SNTPv4 is a subset of NTPv4). This document
obsoletes RFC 1305 and RFC 4330. While certain minor changes have
NTP is designed for use by clients and servers with a wide range of been made in some protocol header fields, these do not affect the
capabilities. Thus, the Simple Network Time Protocol Version 4 interoperability between NTPv4 and previous versions.
(SNTPv4) as described in [2] was developed for platforms that cannot
afford the size and complexity of NTP as a whole.
Since the standardization of NTPv3, there has been significant The NTP subnet model includes a number of widely accessible primary
development which has led to Version 4 of the Network Time Protocol time servers synchronized by wire or radio to national standards.
(NTPv4). This document describes NTPv4, which introduces new The purpose of the NTP protocol is to convey timekeeping information
functionality to NTPv3 as described in RFC 1305, and functionality from these primary servers to secondary time servers and clients via
expanded from that of SNTPv4 as described in RFC 4330 (SNTPv4 is a both private networks and the public Internet. Crafted algorithms
subset of NTPv4). This document obsoletes RFC 4330. mitigate errors that may result from network disruptions, server
failures and possible hostile action. Servers and clients are
configured such that values flow from the primary servers at the root
via branching secondary servers toward clients.
When operating with current and previous versions of NTP and SNTP, The NTPv4 design overcomes significant shortcomings in the NTPv3
NTPv4 requires no changes to the protocol or implementations now design, corrects certain bugs and incorporates new features. In
running or likely to be implemented specifically for future NTP or particular, expanded NTP timestamp definitions encourage the use of
SNTP versions. The NTP and SNTP packet formats are the same and the floating double data types throughout any implementation. The time
arithmetic operations to calculate the client time, clock offset and resolution is better than one nanosecond and frequency resolution
round trip delay are the same. To a NTP or SNTP server, NTP and SNTP better than one nanosecond per second. Additional improvements
clients are indistinguishable; to a NTP or SNTP client, NTP and SNTP include a new clock discipline algorithm which is more responsive to
servers are indistinguishable. system clock hardware frequency fluctuations. Typical primary
servers using modern machines are precise within a few tens of
microseconds. Typical secondary servers and clients on fast LANs are
within a few hundred microseconds with poll intervals up to 1024
seconds, which was the maximum with NTPv3. With NTPv4, servers and
clients are within a few tens of milliseconds with poll intervals up
to 36 hours.
An important provision in this memo is the interpretation of certain The main body of this document describes only the core protocol and
NTP header fields which provide for IPv6 [3]and OSI [4] addressing. data structures necessary to interoperate between conforming
The only significant difference between the NTPv3 and NTPv4 header implementations. Additional detail is provided in the form of a
formats is the four-octet Reference Identifier field, which is used skeleton program included as an appendix. This program includes data
primarily to detect and avoid synchronization loops. In all NTP and structures and code segments for the core algorithms and in addition
SNTP versions providing IPv4 addressing, primary servers use a four- the mitigation algorithms used to enhance reliability and accuracy.
character ASCII reference clock identifier in this field, while While the skeleton and other descriptions in this document apply to a
secondary servers use the 32-bit IPv4 address of the synchronization particular implementation, they are not intended as the only way the
source. In NTPv4 providing IPv6 and OSI addressing, primary servers required functions can be implemented. While the NTPv3 symmetric key
use the same clock identifier, but secondary servers use the first 32 authentication scheme described in this document carries over from
bits of the MD5 hash of the IPv6 or NSAP address of the NTPv3, the Autokey public key authentication scheme new to NTPv4 is
synchronization source. A further use of this field is when the described in [3].
server sends a kiss-o'-death message documented later in this
document.
In the case of OSI, the Connectionless Transport Service (CLTS) is The NTP protocol includes the modes of operation described in Section
used as in [5]. Each NTP packet is transmitted as the TS- Userdata 2 using the data types described in Section 5 and the data structures
parameter of a T-UNITDATA Request primitive. Alternately, the header in Section 6. The implementation model described in Section 4 is
can be encapsulated in a TPDU which itself is transported using UDP, based on a multiple-process, threaded architecture, although other
as described in [6]. It is not advised that NTP be operated at the architectures could be used as well. The on-wire protocol described
upper layers of the OSI stack, such as might be inferred from [7], as in Section 7 is based on a returnable-time design which depends only
this could seriously degrade accuracy. With the header formats on measured clock offsets, but does not require reliable message
defined in this memo, it is, in principle, possible to interwork delivery. The synchronization subnet is a self-organizing,
between servers and clients of one protocol family and another, hierarchical, master-slave network with synchronization paths
although the practical difficulties may make this inadvisable. determined by a shortest-path spanning tree and defined metric.
While multiple masters (primary servers) may exist, there is no
requirement for an election protocol.
This document is organized as follows. Section 2 describes the NTP The remaining sections of this document define the data structures
timestamp format and Section 3 the NTP message format. Section 4 and algorithms suitable for a fully featured NTPv4 implementation.
provides general NTP protocol details, with the subset SNTP described Appendix A contains the code skeleton with definitions, structures
in Section 5. This is followed by specific sections on Server and code segments that represent the basic structure of the reference
(Section 6), Client(Section 7), and Symmetric Peer(Section 8) modes implementation.
of operation. Section 9 defines the new mechanism for server
discovery. describes the control and management mechanism for NTP.
Section 10 describes the kiss-o'-death message, whose functionality
is similar to the ICMP Source Quench and ICMP Destination Unreachable
messages. Section 11 presents NTPv4 security considerations and
Section 12 discusses IANA Considerations.Appendix A presents optional
NTP control messages.
NTPv4 is hereafter referred to simply as NTP, unless explicitly The remainder of this document contains numerous variables and
noted. mathematical expressions. Those variables take the form of Greek
characters. Those Greek characters are spelled out by their full
name, with the "cap" prefix added to variables referring to the
corresponding upper case Greek character. For example capdelta
refers to the uppercase Greek character, where delta refers to the
lowercase Greek character. Furthermore, subscripts are denoted with
a '_' separating the variable name and the subscript. For example
'theta_i' refers to the variable lowercase Greek character theta with
subscript i, or phonetically 'theta sub i.'
1.1. Requirements Notation 1.1. Requirements Notation
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [8]. document are to be interpreted as described in RFC 2119 [4].
2. NTP Timestamp 2. Modes of Operation
There are three NTP formats used to represent time values: a 128-bit An NTP implementation operates as a primary server, secondary server
date format, a 64-bit timestamp format, and a 32-bit short format. or client. A primary server is synchronized directly to a reference
NTP data are specified as integer or fixed-point quantities, with clock, such as a GPS receiver or telephone modem service. A client
bits numbered in big-endian fashion from 0 starting at the left or is synchronized to one or more upstream servers, but does not provide
most significant end. Unless specified otherwise, all quantities are synchronization to dependent clients. A secondary server has one or
unsigned and may occupy the full field width with an implied 0 more upstream servers and one or more downstream servers or clients.
preceding bit 0. Note that dates cannot be produced by NTP, but can All servers and clients claiming full NTPv4 compliance must implement
rather be obtained from external means and conveyed via the protocol. the entire suite of algorithms described in this document. In order
Date values are represented in twos compliment arithmetic relative to to maintain stability in large NTP subnets, secondary servers must be
the base date of 0628:16h 7 February 2036 UTC (when all 128 bits are fully NTPv4 compliant.
zero). Values greater than zero represent times after the base date;
values less than zero represent times before the base date. Dates
are signed values. Timestamps are signed values. A value of zero is
a special case representing unknown or unsynchronized time.
Figure 1 illustrates the three NTP time formats. Primary servers and clients complying with a subset of NTP, called
the Simple Network Time Protocol (SNTPv4) [2], do not need to
implement all algorithms. SNTP is intended for primary servers
equipped with a single reference clock, as well as clients with a
single upstream server and no dependent clients. The fully developed
NTPv4 implementation is intended for secondary servers with multiple
upstream servers and multiple downstream servers or clients. Other
than these considerations, NTP and SNTP servers and clients are
completely interoperable and can be mixed and matched in NTP subnets.
+-------------------+--------------+-------------+
| Association Mode  | Assoc. Mode  | Packet Mode |
+-------------------+--------------+-------------+
| Symmetric Active  |      1       |   1 or 2    |
| Symmetric Passive |      2       |      1      |
| Client            |      3       |      4      |
| Server            |      4       |      3      |
| Broadcast Server  |      5       |      5      |
| Broadcast Client  |      6       |     N/A     |
+-------------------+--------------+-------------+
Table 1: Association and Packet Modes
There are three NTP protocol variants, symmetric, client/server and
broadcast. Each is associated with an association mode as shown in
Table 1. In the client/server variant a client association sends
mode-3 (client) packets to a server, which returns mode-4 (server)
packets. Servers provide synchronization to one or more clients, but
do not accept synchronization from them. A server can also be a
reference clock which obtains time directly from a standard source
such as a GPS receiver or telephone modem service. We say that
clients pull synchronization from servers.
In the symmetric variant a peer operates as both a server and client
using either a symmetric-active or symmetric-passive association. A
symmetric-active association sends mode-1 (symmetric-active) packets
to a symmetric-active peer association. Alternatively, a symmetric-
passive association can be mobilized upon arrival of a mode-1 packet.
That association sends mode-2 (symmetric-passive) packets and
persists until error or timeout. Peers both push and pull
synchronization to and from each other. For the purposes of this
document, a peer operates like a client, so a reference to client
implies peer as well.
In the broadcast variant a broadcast server association sends
periodic mode-5 (broadcast) packets which are received by multiple
mode-6 (broadcast client) associations. It is useful to provide an
initial volley where the client operating in mode 3 exchanges several
packets with the server in order to calibrate the propagation delay
and to run the Autokey security protocol, after which the client
reverts to mode 6. We say that broadcast servers push
synchronization to willing consumers.
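
The mode numbers above can be written down as a small lookup, which is handy when checking an implementation against Table 1. The following is an illustrative Python sketch, not part of the draft's code skeleton. The names Mode, TABLE_1_PACKET_MODES, SENDS and sent_packet_mode are hypothetical. The first mapping transcribes the "Packet Mode" column of Table 1 as-is; the second records the packet mode each association sends, as stated in the prose.

```python
from enum import IntEnum


class Mode(IntEnum):
    """Association modes, numbered as in Table 1."""
    SYMMETRIC_ACTIVE = 1
    SYMMETRIC_PASSIVE = 2
    CLIENT = 3
    SERVER = 4
    BROADCAST_SERVER = 5
    BROADCAST_CLIENT = 6


# "Packet Mode" column of Table 1, transcribed without interpretation.
# None stands for the N/A entry.
TABLE_1_PACKET_MODES = {
    Mode.SYMMETRIC_ACTIVE: (1, 2),
    Mode.SYMMETRIC_PASSIVE: (1,),
    Mode.CLIENT: (4,),
    Mode.SERVER: (3,),
    Mode.BROADCAST_SERVER: (5,),
    Mode.BROADCAST_CLIENT: None,
}

# Packet mode each association sends, taken from the prose of this section.
# A broadcast client is omitted: after the initial volley it only listens.
SENDS = {
    Mode.SYMMETRIC_ACTIVE: 1,
    Mode.SYMMETRIC_PASSIVE: 2,
    Mode.CLIENT: 3,
    Mode.SERVER: 4,
    Mode.BROADCAST_SERVER: 5,
}


def sent_packet_mode(assoc):
    """Return the packet mode an association sends, or None if it sends none."""
    return SENDS.get(assoc)
```

For example, sent_packet_mode(Mode.CLIENT) yields 3, while Table 1 lists mode 4 for the same association.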
Following conventions established by the telephone industry, the
level of each server in the hierarchy is defined by a number called
the stratum, with the primary servers assigned stratum one and the
secondary servers at each level assigned one greater than the
preceding level. As the stratum increases from one, the accuracies
achievable degrade somewhat depending on the particular network path
and system clock stability. It is useful to assume that mean errors,
and thus a metric called the synchronization distance, increase
approximately in proportion to the stratum and measured roundtrip
delay. It is important to note that NTP stratum is only loosely
modeled after telecommunications stratum. The NTP stratum numbers
and telecommunications stratum numbers do not correlate with one
another. Telecommunications stratum numbers are rigorously defined
by international standards that are not covered within this document.
Drawing from the experience of the telephone industry, which learned
such lessons at considerable cost, the subnet topology should be
organized to produce the lowest synchronization distances, but must
never be allowed to form a loop. In NTP the subnet topology is
determined using a variant of the Bellman-Ford distributed routing
algorithm, which computes the shortest-distance spanning tree rooted
on the primary servers. As a result of this design, the algorithm
automatically reorganizes the subnet to produce the most accurate and
reliable time, even when one or more primary or secondary servers or
the network paths between them fail.
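
As an illustration of the shortest-distance spanning tree, the following Python sketch runs a centralized Bellman-Ford relaxation over a small set of servers. It is a simplification under stated assumptions: NTP reaches the same result in a distributed fashion, and the function name build_sync_tree, the links dictionary and the per-link distance increments are all hypothetical. Stratum is derived from the resulting tree, with primary servers at stratum one.

```python
def build_sync_tree(primaries, links):
    """Centralized Bellman-Ford sketch.

    primaries: set of primary server names (distance 0, stratum 1).
    links: dict mapping (upstream, downstream) to the non-negative
           distance added when downstream synchronizes to upstream.
    Returns (distance, parent, stratum) dictionaries keyed by server name.
    """
    nodes = set(primaries)
    for up, down in links:
        nodes.update((up, down))

    inf = float("inf")
    distance = {n: (0.0 if n in primaries else inf) for n in nodes}
    parent = {n: None for n in nodes}

    # Relax every link until nothing improves (at most len(nodes) - 1 passes).
    for _ in range(max(len(nodes) - 1, 0)):
        changed = False
        for (up, down), increment in links.items():
            candidate = distance[up] + increment
            if candidate < distance[down]:
                distance[down] = candidate
                parent[down] = up
                changed = True
        if not changed:
            break

    # Stratum is one more than the number of hops to a primary server.
    stratum = {}
    for n in nodes:
        if distance[n] == inf:
            stratum[n] = None  # unreachable, hence unsynchronized
            continue
        hops, cursor = 0, n
        while parent[cursor] is not None:
            cursor = parent[cursor]
            hops += 1
        stratum[n] = hops + 1
    return distance, parent, stratum
```

Because each server follows only its parent pointer toward a primary server, the resulting topology is a tree and cannot contain a loop.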
3. Definitions
A number of terms used throughout this document have a precise
technical definition. A timescale is a frame of reference where time
is expressed as the value of a monotonically increasing binary counter
with an indefinite number of bits. It counts in seconds and fractions
of a second, with the decimal point somewhere in the middle. The Coordinated
Universal Time (UTC) timescale represents mean solar time as
disseminated by national standards laboratories. The system time is
represented by the system clock maintained by the operating system
kernel. The goal of the NTP algorithms is to minimize both the time
difference and frequency difference between UTC and the system clock.
When these differences have been reduced below nominal tolerances,
the system clock is said to be synchronized to UTC.
The date of an event is the UTC time at which it takes place. Dates
are ephemeral values which always increase in step with reality and
are designated with upper case T in this document. It is convenient
to define another timescale coincident with the running time of the
NTP program that provides the synchronization function. This is
convenient in order to determine intervals for the various repetitive
functions like poll events. Running time is usually designated with
lower case t.
A timestamp T(t) represents either the UTC date or time offset from
UTC at running time t. Which meaning is intended should be clear
from the context. Let T(t) be the time offset, R(t) the frequency
offset, D(t) the ageing rate (first derivative of R(t) with respect
to t). Then, if T(t_0) is the UTC time offset determined at t=t_0,
the UTC time offset after an interval t is:
T(t_0+t) = T(t_0) + R(t_0)*t + (1/2)*D(t_0)*t^2 + e,
where e is a stochastic error term discussed later in this document.
While the D(t) term is important when characterizing precision
oscillators, it is ordinarily neglected for computer oscillators. In
this document all time values are in seconds (s) and all frequency
values are in seconds-per-second (s/s). It is sometimes convenient
to express frequency offsets in parts-per-million (PPM), where 1 PPM
is equal to 1*10^(-6) s/s.
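
A short worked example may help with the units. The Python sketch below evaluates the time offset expression, taking the elapsed interval since t_0 as its argument (an assumption about the intended reading of the formula). The names PPM and predicted_offset are hypothetical.

```python
PPM = 1e-6  # one part-per-million expressed in s/s


def predicted_offset(offset0, freq0, aging0, interval, error=0.0):
    """Time offset after `interval` seconds.

    offset0: time offset T(t_0) in seconds
    freq0:   frequency offset R(t_0) in s/s
    aging0:  ageing rate D(t_0) in s/s per second
    error:   stochastic term e, zero unless supplied
    """
    return offset0 + freq0 * interval + 0.5 * aging0 * interval ** 2 + error
```

With the ageing term neglected, a frequency offset of 15 PPM accumulates roughly 15 ms of time offset over 1000 s.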
It is important in computer timekeeping applications to assess the
performance of the timekeeping function. The NTP performance model
includes four statistics which are updated each time a client makes a
measurement with a server. The offset theta represents the maximum
likelihood time offset of the server clock relative to the system
clock. The delay del represents the roundtrip delay between the
client and server. The dispersion epsilon represents the maximum
error inherent in the measurement. It increases at a rate equal to
the maximum disciplined system clock frequency tolerance phi,
typically 15 PPM. The jitter varphi, defined as the root-mean-square
(RMS) average of the most recent time offset differences, represents
the nominal error in estimating theta.
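
The dispersion and jitter statistics can be illustrated with the following Python sketch. It is not the formulation used by the clock filter, which is given later in the document. It assumes that dispersion grows linearly at the rate phi and that jitter is the RMS of the differences between a chosen reference offset and the other recent offsets. The names PHI, dispersion_after and rms_jitter are hypothetical.

```python
import math

PHI = 15e-6  # frequency tolerance phi in s/s (the typical 15 PPM)


def dispersion_after(epsilon0, elapsed, phi=PHI):
    """Dispersion after growing at rate phi for `elapsed` seconds."""
    return epsilon0 + phi * elapsed


def rms_jitter(reference_offset, other_offsets):
    """RMS of the differences between a reference offset and other offsets.

    Assumption: differences are taken against a single reference sample.
    """
    if not other_offsets:
        return 0.0
    squares = [(reference_offset - o) ** 2 for o in other_offsets]
    return math.sqrt(sum(squares) / len(squares))
```

For instance, a dispersion of 1 ms left to age for 1000 s grows to about 16 ms at 15 PPM.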
While the theta, del, epsilon, and varphi statistics represent
measurements of the system clock relative to each server clock
separately, the NTP protocol includes mechanisms to combine the
statistics of several servers to more accurately discipline and
calibrate the system clock. The system offset captheta represents
the maximum-likelihood offset estimate for the server population.
The system jitter vartheta represents the nominal error in estimating
captheta. The del and epsilon statistics are accumulated at each
stratum level from the reference clocks to produce the root delay
delta and root dispersion E statistics. The synchronization distance
gamma=E+delta/2 represents the maximum error due to all causes. The
detailed formulations of these statistics are given later in this
document. They are available to the dependent applications in order
to assess the performance of the synchronization function.
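
As a numeric illustration of the synchronization distance, the Python sketch below sums per-hop delay and dispersion along a path from the reference clock and then applies gamma = E + delta/2. Plain summation is an assumption made for illustration only, since the detailed formulations appear later in the document. The names accumulate and synchronization_distance are hypothetical.

```python
def accumulate(path):
    """Sum per-hop (del, epsilon) pairs into (root delay, root dispersion).

    Assumption: statistics simply add along the path.
    """
    root_delay = sum(d for d, _ in path)
    root_dispersion = sum(e for _, e in path)
    return root_delay, root_dispersion


def synchronization_distance(root_delay, root_dispersion):
    """gamma = E + delta / 2, as defined in the text."""
    return root_dispersion + root_delay / 2.0
```

Two hops with delays of 10 ms and 20 ms and dispersions of 1 ms and 2 ms give delta = 30 ms, E = 3 ms and gamma = 18 ms.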
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|LI | VN  |Mode |     Strat     |     Poll      |     Prec      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          Root Delay                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Root Dispersion                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Reference ID                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                      Reference Timestamp                      +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                       Origin Timestamp                        +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                       Receive Timestamp                       +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                      Transmit Timestamp                       +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                 Extension Field 1 (Optional)                  +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                 Extension Field 2 (Optional)                  +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
.                                                               .
.                        Authentication                         .
.                     (Optional) (160 bits)                     .
.                                                               .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 1: NTPv4 Message Format
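
The fixed 48-octet portion of the message format can be decoded as in the following Python sketch. It is illustrative only. The names HEADER, NtpHeader and parse_header are hypothetical, Poll and Prec are assumed to be signed 8-bit values, and the Root Delay, Root Dispersion and timestamp fields are returned as raw unsigned integers without converting them to seconds. Extension fields and the authentication trailer are ignored.

```python
import struct
from collections import namedtuple

# Network byte order: LI/VN/Mode octet, Strat, Poll, Prec, three 32-bit
# words, then four 64-bit timestamps. Total 48 octets.
HEADER = struct.Struct("!BBbbIII4Q")

NtpHeader = namedtuple(
    "NtpHeader",
    "li vn mode stratum poll precision root_delay root_dispersion "
    "reference_id reference_ts origin_ts receive_ts transmit_ts",
)


def parse_header(packet):
    """Decode the fixed header fields from the first 48 octets of a packet."""
    (first, stratum, poll, precision, root_delay, root_dispersion,
     reference_id, reference_ts, origin_ts, receive_ts,
     transmit_ts) = HEADER.unpack(packet[:HEADER.size])
    return NtpHeader(
        li=first >> 6,            # top 2 bits
        vn=(first >> 3) & 0x7,    # next 3 bits
        mode=first & 0x7,         # low 3 bits
        stratum=stratum,
        poll=poll,
        precision=precision,
        root_delay=root_delay,
        root_dispersion=root_dispersion,
        reference_id=reference_id,
        reference_ts=reference_ts,
        origin_ts=origin_ts,
        receive_ts=receive_ts,
        transmit_ts=transmit_ts,
    )
```

A first octet of 0x23, for example, decodes to LI 0, VN 4 and Mode 3. A packet shorter than 48 octets raises struct.error.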
4. Implementation Model
Figure 2 shows two processes dedicated to each server, a peer process
to receive messages from the server or reference clock and a poll
process to transmit messages to the server or reference clock.
..........................................................
. Remote   ..  Peer/Poll  ..           System            .
. Servers  ..  Processes  ..           Process           .
.          ..             ..                             .
.----------..-------------..--------------               .
.|        |->|           |..|            |               .
.|Server 1|..|Peer/Poll 1|->|            |               .
.|        |<-|           |..|            |               .............
.----------..-------------..|            |               .   Clock   .
.          ..      ^      ..|            |               .Discipline .
.          ..      |      ..|            |               .  Process  .
.----------..-------------..|            |  |-----------|.           .
.|        |->|           |..| Selection  |->|           |..--------  .
.|Server 2|..|Peer/Poll 2|->|    and     |  | Combining |->| Loop |  .
.|        |<-|           |..| Clustering |  | Algorithm |..|Filter|  .
.----------..-------------..| Algorithms |->|           |..--------  .
.          ..      ^      ..|            |  |-----------|.....|.......
.          ..      |      ..|            |               .    |
.----------..-------------..|            |               .    |
.|        |->|           |..|            |               .    |
.|Server 3|..|Peer/Poll 3|->|            |               .    |
.|        |<-|           |..|            |               .    |
.----------..-------------..--------------               .    |
...................^......................................    |
                   |                                          |
                   |                                         \|/
                   |                                   ...............
                   |                                   .   /-----\   .
                   '-----------------------------------<---| VFO |   .
                                                       .   \-----/   .
                                                       . Clock Adjust.
                                                       .   Process   .
                                                       ...............
Figure 2: NTPv4 Algorithm Interactions
These processes operate on a common data structure called an
association, which contains the statistics described above along with
various other data described later. A client sends an NTP packet to
one or more servers and processes the replies as received. The
server interchanges addresses and ports, overwrites certain fields in
the packet and returns it immediately (client/server mode) or at
some time later (symmetric modes). As each NTP message is received,
the offset theta between the peer clock and the system clock is
computed along with the associated statistics del, epsilon, and
varphi.
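
This section does not state how theta and del are obtained from a packet exchange. The Python sketch below assumes the standard four-timestamp calculation, where t1 and t4 are read from the client clock at transmission and reception, and t2 and t3 are read from the server clock at reception and transmission. The function name offset_and_delay is hypothetical.

```python
def offset_and_delay(t1, t2, t3, t4):
    """Return (theta, delay) from one client/server exchange.

    t1: client transmit time (client clock)
    t2: server receive time  (server clock)
    t3: server transmit time (server clock)
    t4: client receive time  (client clock)
    """
    theta = ((t2 - t1) + (t3 - t4)) / 2.0   # server clock minus client clock
    delay = (t4 - t1) - (t3 - t2)           # roundtrip minus server hold time
    return theta, delay
```

With a server running 1 s ahead, a symmetric one-way delay of 50 ms and a 10 ms server turnaround, the timestamps (0.0, 1.05, 1.06, 0.11) give an offset of about 1 s and a roundtrip delay of about 100 ms.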
The system process includes the selection, clustering and combining
algorithms which mitigate among the various servers and reference
clocks to determine the most accurate and reliable candidates to
synchronize the system clock. The selection algorithm uses Byzantine
principles to remove the falsetickers from the incident population,
leaving only truechimers. A 'truechimer' is a clock that maintains
timekeeping accuracy to a previously published (and trusted)
standard, while a 'falseticker' is a clock that does not maintain
that level of timekeeping accuracy. The clustering algorithm uses
statistical principles to sift the most accurate truechimers, leaving
the survivors as the result. The combining algorithm develops the final
clock offset as a statistical average of the survivors.
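
The final combining step can be pictured with the following Python sketch, which forms a weighted average of the survivor offsets. Weighting each survivor by the reciprocal of its synchronization distance is an assumption made here for illustration. The actual combining algorithm is specified later in the document, and the name combine is hypothetical.

```python
def combine(survivors):
    """Weighted average of survivor offsets.

    survivors: list of (offset, distance) pairs with distance > 0.
    Assumption: each weight is 1 / distance, so closer servers count more.
    """
    if not survivors:
        raise ValueError("no survivors to combine")
    weights = [1.0 / distance for _, distance in survivors]
    weighted = sum(w * offset for w, (offset, _) in zip(weights, survivors))
    return weighted / sum(weights)
```

Two survivors at equal distance contribute equally, so offsets of 10 ms and 20 ms combine to 15 ms.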
The clock discipline process, which is actually part of the system
process, includes engineered algorithms to control the time and
frequency of the system clock, here represented as a variable
frequency oscillator (VFO). Timestamps struck from the VFO close the
feedback loop which maintains the system clock time. Associated with
the clock discipline process is the clock adjust process, which runs
once each second to inject a computed time offset and maintain
constant frequency. The RMS average of past time offset differences
represents the nominal error or system jitter vartheta. The RMS
average of past frequency offset differences represents the
oscillator frequency stability or frequency wander cappsi.
A client sends messages to each server with a poll interval of 2^tau
seconds, as determined by the poll exponent tau. In NTPv4 tau ranges
from 4 (16 s) through 17 (36 h). The value of tau is determined by
the clock discipline algorithm to match the loop time constant T_c =
2^tau. A server responds with messages at an update interval of mu
seconds. For stateless servers, mu = T_c, since the server responds
immediately. However, in symmetric modes each of two peers manages
the time constant as a function of current system offset and system
jitter, so they may not agree on the same tau. It is important that the
dynamic behavior of the clock discipline algorithms be carefully
controlled in order to maintain stability in the NTP subnet at large.
This requires that the peers agree on a common tau equal to the
minimum poll exponent of both peers. The NTP protocol includes
provisions to properly negotiate this value.
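The arithmetic behind the poll exponent can be sketched as follows. This is only an illustration of the rule stated above (the common tau is the minimum of the two peers' exponents, held within the range 4 through 17); the function names are hypothetical and the actual on-the-wire negotiation is not shown.

```python
MINPOLL = 4    # minimum poll exponent (16 s), from the range stated above
MAXPOLL = 17   # maximum poll exponent (36 h)

def common_poll_exponent(local_tau, peer_tau):
    """Return the tau two symmetric peers would share: the smaller of the
    two exponents, clamped to the allowed range (hypothetical helper)."""
    tau = min(local_tau, peer_tau)
    return max(MINPOLL, min(MAXPOLL, tau))

def poll_interval(tau):
    """Poll interval in seconds, 2^tau."""
    return 1 << tau
```

For example, peers proposing exponents 10 and 6 would settle on 6, a poll interval of 64 s.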
While not shown in the figure, the implementation model includes some
means to set and adjust the system clock. The operating system is
assumed to provide two functions, one to set the time directly, for
example the Unix settimeofday() function, and another to adjust the
time in small increments advancing or retarding the time by a
designated amount, for example the Unix adjtime() function
(parentheses following a name indicate reference to a function rather
than a simple variable). In the intended design the clock discipline
process uses the adjtime() function if the adjustment is less than a
designated threshold, and the settimeofday() function if above the
threshold. The manner in which this is done and the value of the
threshold are described later.
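The choice between the two operating system functions can be sketched as a simple decision. The threshold value below is a placeholder chosen only for illustration, since the actual value is given later in the document; the function name is hypothetical, and the handling of an offset exactly equal to the threshold is an assumption.

```python
STEP_THRESHOLD = 0.128  # seconds; placeholder value, an assumption for this sketch

def choose_adjustment(offset, threshold=STEP_THRESHOLD):
    """Return "slew" when the offset is small enough to be amortized
    gradually (adjtime() style), otherwise "step" (settimeofday() style)."""
    return "slew" if abs(offset) < threshold else "step"
```

An offset of 10 ms would be slewed under this sketch, while an offset of -2.5 s would be stepped.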
5. Data Types
All NTP time values are represented in twos-complement format, with
bits numbered in big-endian (as described in Appendix A of [5])
fashion from zero starting at the left, or high-order, position.
There are three NTP time formats, a 128-bit date format, a 64-bit
timestamp format and a 32-bit short format, as shown in Figure 3.
The 128-bit date format is used where sufficient storage and word
size are available. It includes a 64-bit signed seconds field
spanning 584 billion years and a 64-bit fraction field resolving 0.05
attosecond (i.e., about 0.05e-18 s). For convenience in mapping between
formats, the seconds field is divided into a 32-bit era field and a
32-bit timestamp field. Eras cannot be produced by NTP directly, nor
is there need to do so. When necessary, they can be derived from
external means, such as the filesystem or dedicated hardware.
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Seconds | Fraction | | Seconds | Fraction |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
NTP Short Format NTP Short Format
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
skipping to change at page 5, line 37 skipping to change at page 13, line 34
| Era Number | | Era Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Era Offset | | Era Offset |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| | | |
| Fraction | | Fraction |
| | | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
NTP Date Format NTP Date Format
Figure 1: NTP Timestamp Format Figure 3: NTP Time Format
Note that, since some time in 1968 (second 2,147,483,648) the most The 64-bit timestamp format is used in packet headers and other
significant bit (bit 0 of the integer part) has been set and that the places with limited word size. It includes a 32-bit unsigned seconds
64-bit field will overflow some time in 2036 (second 4,294,967,296). field spanning 136 years and a 32-bit fraction field resolving 232
There will exist a 232-picosecond interval, henceforth ignored, every picoseconds. The 32-bit short format is used in delay and dispersion
136 years when the 64-bit field will be 0, which by convention is header fields where the full resolution and range of the other
interpreted as an invalid or unavailable timestamp. formats are not justified. It includes a 16-bit unsigned seconds
field and a 16-bit fraction field.
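As a rough illustration of the timestamp and short formats described above, the following sketch splits a 64-bit timestamp into its seconds and fraction fields and converts the short format to and from seconds. The helper names are hypothetical. Integer arithmetic is used for the 64-bit format because a 64-bit floating double cannot hold all 64 bits exactly.

```python
def split_timestamp(ts):
    """Split a 64-bit NTP timestamp into (seconds, fraction) fields."""
    return ts >> 32, ts & 0xFFFFFFFF

def fraction_to_picoseconds(fraction):
    """Convert a 32-bit fraction field to whole picoseconds (rounded down)."""
    return fraction * 10**12 // 2**32

def to_short(seconds):
    """Encode non-negative seconds in the 32-bit short format (16.16)."""
    return int(round(seconds * 2**16)) & 0xFFFFFFFF

def from_short(value):
    """Decode a 32-bit short format value to seconds."""
    return value / 2**16
```

One unit of the 32-bit fraction is 10^12 / 2^32 picoseconds, which rounds down to 232, in agreement with the resolution quoted above.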
If bit 0 is set, the UTC time is in the range 1968-2036 and UTC time In the date format the prime epoch, or base date of era 0, is 0 h 1
is reckoned from 0h 0m 0s UTC on 1 January 1900. If bit 0 is not January 1900 UTC, when all bits are zero. It should be noted that
set, the time is in the range 2036-2104 and UTC time is calculated strictly speaking, UTC did not exist prior to 1 January 1972, but it
from 6h 28m 16s UTC on 7 February 2036. Note that when calculating is convenient to assume it has existed for all eternity, even if all
the correspondence, 2000 is a leap year and leap seconds are not knowledge of historic leap seconds has been lost. Dates are relative
included in the reckoning. to the prime epoch; values greater than zero represent times after
that date; values less than zero represent times before it.
3. NTP Message Formats Timestamps are unsigned values and operations on them produce a
result in the same or adjacent eras. Era 0 includes dates from the
prime epoch to some time in 2036, when the timestamp field wraps
around and the base date for era 1 is established. In either format
a value of zero is a special case representing unknown or
unsynchronized time. Figure 4 shows a number of historic NTP dates
together with their corresponding Modified Julian Day (MJD), NTP era
and NTP timestamp.
Year          MJD          NTP Era   NTP Timestamp   Epoch
1 Jan -4712   -2,400,001   -49       1,795,583,104   1st day Julian
1 Jan -1      -679,306     -14       139,775,744     2 BCE
1 Jan 0       -678,491     -14       171,311,744     1 BCE
1 Jan 1       -678,575     -14       202,939,144     1 CE
4 Oct 1582    -100,851     -3        2,873,647,488   Last day Julian
15 Oct 1582   -100,840     -3        2,874,597,888   1st day Gregorian
31 Dec 1899   15,019       -1        4,294,880,896   Last day NTP Era -1
1 Jan 1900    15,020       0         0               First day NTP Era 0
1 Jan 1970    40,587       0         2,208,988,800   First day UNIX
1 Jan 1972    41,317       0         2,272,060,800   First day UTC
31 Dec 1999   51,543       0         3,155,587,200   Last day 20th Century
8 Feb 2036    64,731       1         63,104          1st day NTP Era 1
Both NTP and SNTP are layered above the User Datagram Protocol (UDP) Figure 4: Interesting Historic NTP Dates
[9], which itself is layered on the Internet Protocol (IP) [10] [3].
The structure of the IP and UDP headers is described in the cited
specification documents and will not be detailed further here. The
UDP port number assigned to NTP is 123, which MUST be used in both
the Source Port and Destination Port fields in the UDP header. The
remaining UDP header fields should be set as described in the
specification.
Figure 2 provides a description of the NTPv4 message format. This Let p be the number of significant bits in the second fraction. The
format is identical to that described in RFC 1305, with the exception clock resolution is defined as 2^(-p), in seconds. In order to minimize
of the contents of the reference identifier field and optional bias and help make timestamps unpredictable to an intruder, the non-
extension fields. The header fields are defined in Figure 2. significant bits should be set to an unbiased random bit string. The
clock precision is defined as the running time to read the system
clock, in seconds. Note that the precision defined in this way can
be larger or smaller than the resolution. The term rho, representing
the precision used in this document, is the larger of the two.
The only operation permitted with dates and timestamps is twos-
complement subtraction, yielding a 127-bit or 63-bit signed result.
It is critical that the first-order differences between two dates
preserve the full 128-bit precision and the first-order differences
between two timestamps preserve the full 64-bit precision. However,
the differences are ordinarily small compared to the seconds span, so
they can be converted to floating double format for further
processing and without compromising the precision.
It is important to note that twos-complement arithmetic does not
distinguish between signed and unsigned values; only the
conditional branch instructions do. Thus, although the distinction is
made between signed dates and unsigned timestamps, they are processed
the same way. A perceived hazard is that 64-bit timestamp calculations
spanning an era boundary, such as is possible in 2036, might result in
incorrect values. In point of fact, if the client is set within 68
years of the server before the protocol is started, correct values are
obtained even if the client and server are in adjacent eras.
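The era-spanning case can be checked with a small sketch of twos-complement subtraction on 64-bit timestamps. Python integers are unbounded, so the 64-bit wraparound is emulated with a mask; the function name is hypothetical.

```python
MASK64 = (1 << 64) - 1

def timestamp_diff(a, b):
    """Return a - b in seconds for two 64-bit NTP timestamps, emulating
    64-bit twos-complement subtraction with a signed result."""
    d = (a - b) & MASK64      # wrap to 64 bits, as hardware would
    if d >= 1 << 63:          # high bit set: interpret as negative
        d -= 1 << 64
    return d / 2**32          # scale the 32.32 fixed-point value to seconds
```

With a server 10 s into era 1 (timestamp seconds field 10) and a client 5 s before the end of era 0 (seconds field 2^32 - 5), the difference comes out as 15 s even though the raw unsigned values are far apart.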
Some time values are represented in exponent format, including the
precision, time constant and poll interval values. These are in
8-bit signed integer format in log2 (log to the base 2) seconds.
The only operations permitted on them are increment and decrement.
For the purpose of this document and to simplify the presentation, a
reference to one of these state variables by name means the
exponentiated value, e.g., the poll interval is 1024 s, while
reference by name and exponent means the actual value, e.g., the poll
exponent is 10.
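The relation between an exponent and its exponentiated value can be sketched as follows. The helper names are hypothetical; the decoding step assumes the exponent arrives as a raw octet holding an 8-bit signed integer.

```python
def decode_exponent(octet):
    """Interpret a raw octet (0..255) as an 8-bit signed integer."""
    return octet - 256 if octet >= 128 else octet

def exponent_to_seconds(exponent):
    """Return the exponentiated value, 2^exponent seconds."""
    return 2.0 ** exponent
```

A poll exponent of 10 yields a poll interval of 1024 s, and the octet 0xEE decodes to -18, or about 3.8 microseconds.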
To convert system time in any format to NTP date and timestamp
formats requires that the number of seconds s from the prime epoch to
the system time be determined. The era is the integer quotient and
the timestamp the integer remainder as in:
era = s / 2^(32) and timestamp = s - era*2^(32)
which works for positive and negative dates, provided the quotient is
rounded toward minus infinity. To convert from NTP era
and timestamp to system time requires the calculation
s = era*2^(32) + timestamp
to determine the number of seconds since the prime epoch. Converting
between NTP and system time can be a little messy and is beyond the
scope of this document. Note that the number of days in era 0 is one
more than the number of days in most other eras and this won't happen
again until the year 2400 in era 3.
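The era and timestamp calculation can be sketched in a few lines. The function names are hypothetical. The sketch relies on Python's // operator rounding toward minus infinity, which is what makes negative dates come out right; in a language whose integer division truncates toward zero, dates before the prime epoch would need separate handling.

```python
ERA_SECONDS = 2**32

def to_era_timestamp(s):
    """Split seconds since the prime epoch into (era, timestamp)."""
    era = s // ERA_SECONDS              # floor division, also for s < 0
    timestamp = s - era * ERA_SECONDS   # always in 0 .. 2^32 - 1
    return era, timestamp

def from_era_timestamp(era, timestamp):
    """Recombine era and timestamp into seconds since the prime epoch."""
    return era * ERA_SECONDS + timestamp
```

One day before the prime epoch (s = -86400) maps to era -1 with timestamp 4,294,880,896, matching the 31 Dec 1899 entry in Figure 4.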
In the description of state variables to follow, explicit reference
to integer type implies a 32-bit unsigned integer. This simplifies
bounds checks, since only the upper limit needs to be defined.
Without explicit reference, the default type is 64-bit floating
double. Exceptions will be noted as necessary.
6. Data Structures
The NTP protocol state machines described in following sections are
defined using state variables and flow chart fragments. State
variables are separated into classes according to their function in
packet headers, peer and poll processes, the system process and the
clock discipline process. Packet variables represent the NTP header
values in transmitted and received packets. Peer and poll variables
represent the contents of the association for each server separately.
System variables represent the state of the server as seen by its
dependent clients. Clock discipline variables represent the internal
workings of the clock discipline algorithm. Additional constant and
variable classes are defined in Appendix A.
6.1. Structure Conventions
In order to distinguish between different variables of the same name
but used in different processes, the naming convention summarized in
Table 2 is employed. A receive packet variable v is a member of the
packet structure r with fully qualified name r.v. In a similar
manner x.v is a transmit packet variable, p.v is a peer variable, s.v
is a system variable and c.v is a clock discipline variable. There
is a set of peer variables for each association; there is only one
set of system and clock variables. Most flow chart fragments begin
with a statement label and end with a named go-to or exit. A
subroutine call includes a dummy () following the name and returns at
the end to the point following the call.
+------+---------------------------------+
| Name | Description |
+------+---------------------------------+
| r. | receive packet header variable |
| x. | transmit packet header variable |
| p. | peer/poll variable |
| s. | system variable |
| c. | clock discipline variable |
+------+---------------------------------+
Table 2: Name Prefix Conventions
6.2. Global Parameters
In addition to the variable classes a number of global parameters are
defined in this document, including those shown with values in
Figure 5.
Name        Value   Description
PORT        123     NTP port number
VERSION     4       version number
TOLERANCE   15e-6   frequency tolerance (s/s)
MINPOLL     4       minimum poll exponent (16 s)
MAXPOLL     17      maximum poll exponent (36 h)
MAXDISP     16      maximum dispersion (s)
MINDISP     .005    minimum dispersion increment (s)
MAXDIST     1       distance threshold (s)
MAXSTRAT    16      maximum stratum number
Figure 5: Global Parameters
While these are the only parameters needed in this document, a
larger collection is necessary in the skeleton and larger still for
any implementation. Section B.1 contains those used by the skeleton
for the mitigation algorithms, clock discipline algorithm and related
implementation-dependent functions. Some of these parameter values
are cast in stone, like the NTP port number assigned by the IANA and
the version number assigned to NTPv4 itself. Others, like the frequency
tolerance, involve an assumption about the worst case behavior of a
system clock once synchronized and then allowed to drift when its
sources have become unreachable. The minimum and maximum parameters
define the limits of state variables as described in later sections.
While shown with fixed values in this document, some implementations
may make them variables adjustable by configuration commands. For
instance, the reference implementation computes the value of
PRECISION as log2 of the minimum time in several iterations to read
the system clock.
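A minimal sketch of such a measurement follows, assuming the clock is read in integer nanoseconds. The function name, the default iteration count and the choice to round the exponent down are assumptions of this sketch, not taken from the reference implementation.

```python
import math
import time

def measure_precision(read_clock_ns=time.time_ns, iterations=10):
    """Return log2 of the minimum observed time (in seconds) between two
    successive clock readings, rounded down to an integer exponent."""
    best = None
    for _ in range(iterations):
        start = read_clock_ns()
        end = read_clock_ns()
        while end <= start:          # clock too coarse to show an advance yet
            end = read_clock_ns()
        elapsed = end - start
        if best is None or elapsed < best:
            best = elapsed
    return math.floor(math.log2(best * 1e-9))
```

Because the inner loop waits for the reading to change, a clock with coarse resolution yields its resolution rather than zero, so the result reflects the larger of the two quantities.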
Name        Formula   Description
leap        leap      leap indicator (LI)
version     version   version number (VN)
mode        mode      mode
stratum     stratum   stratum
poll        poll      poll exponent
precision   rho       precision exponent
rootdelay   delta     root delay
rootdisp    E         root dispersion
refid       refid     reference ID
reftime     reftime   reference timestamp
org         T1        origin timestamp
rec         T2        receive timestamp
xmt         T3        transmit timestamp
dst         T4        destination timestamp
keyid       keyid     key ID
digest      digest    message digest
Figure 6: Packet Header Variables
6.3. Packet Header Variables
The most important state variables from an external point of view are
the packet header variables described below. The NTP packet consists
of a number of 32-bit (4 octet) words in network byte order. The
packet format consists of three components, the header itself, one or
more optional extension fields and an optional message authentication
code (MAC). The header component is identical to the header of NTPv3
and previous versions. The optional extension fields are used by the
Autokey public key cryptographic algorithms described in [3]. The
optional MAC is used by both Autokey and the symmetric key
cryptographic algorithms described in the main body of this report.
The NTP packet header follows the UDP and IP headers and the physical
header specific to the underlying transport network. It consists of
a number of 32-bit (4-octet) words, although some fields use multiple
words and others are packed in smaller fields within a word. The NTP
packet header shown in Appendix A has 12 words followed by optional
extension fields and finally an optional message authentication code
(MAC) consisting of the key identifier and message digest fields.
The optional extension fields described in this section are used by
the Autokey security protocol [3], which is not described here. The
MAC is used by both Autokey and the symmetric key authentication
scheme described in Appendix A. As is the convention in other
Internet protocols, all fields are in network byte order, commonly
called big-endian.
A list of the packet header variables is shown in Figure 6 and
described in detail below. The packet header fields apply to both
transmitted (x prefix) and received packets (r prefix). The NTP
header is shown in Figure 7
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|LI | VN |Mode | Strat | Poll | Prec | |LI | VN |Mode | Strat | Poll | Prec |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Root Delay | | Root Delay |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Root Dispersion | | Root Dispersion |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
skipping to change at page 7, line 46 skipping to change at page 19, line 51
| | | |
+ Extension Field 2 (Optional) + + Extension Field 2 (Optional) +
| | | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
. . . .
. Authentication . . Authentication .
. (Optional) (160 bits) . . (Optional) (160 bits) .
. . . .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 2: NTPv4 Message Format Figure 7: NTPv4 Message Format
, where the size of some multiple-word fields is shown in bits if not
3.1. Leap Indicator (LI) the default 32 bits. The header extends from the beginning of the
packet to the end of the Transmit Timestamp field. The
This is a two-bit field indicating an impending leap second to be interpretation of the header fields is shown in the main body of this
inserted in the NTP timescale. The bits are set before 23:59 on the report. When using the IPv4 address family these fields are
day of insertion and reset after 00:00 on the following day. This backwards compatible with NTPv3. When using the IPv6 address family
causes the number of seconds (rollover interval) in the day of on an NTPv4 server with an NTPv3 client, the Reference Identifier
insertion to be increased or decreased by one. A leap second is field appears to be a random value and a timing loop might not be
inserted or deleted in the timescale on the last day of June or detected. The message authentication code (MAC) consists of a 32-bit
December. The possible values of the LI field, and corresponding Key Identifier followed by a 128-bit Message Digest. The message
meanings, are given in Table 1. digest, or cryptosum, is calculated as in [6] over all header and
optional extension fields.
+----+--------------------------------------------+
| LI | Meaning |
+----+--------------------------------------------+
| 0 | no warning |
| 1 | last minute of the day has 61 seconds |
| 2 | last minute of the day has 59 seconds |
| 3 | alarm condition (clock never synchronized) |
+----+--------------------------------------------+
Table 1: Leap Indicator Field Values
On startup, servers set this field to 3 (clock not synchronized) and
set this field to some other value when synchronized to the primary
reference clock. Once set to other than 3, the field is never set to
that value again, even if all synchronization sources become
unreachable or defective.
3.2. Version (VN)
This is a three-bit integer indicating the NTP/SNTP version number,
set to 4 for NTPv4. If necessary to distinguish between IPv4, IPv6
and OSI, the encapsulating context must be inspected.
3.3. Mode
This is a three-bit number indicating the protocol mode. The values
are defined in Table 2.
+------+--------------------------+
| Mode | Meaning |
+------+--------------------------+
| 0 | reserved |
| 1 | symmetric active |
| 2 | symmetric passive |
| 3 | client |
| 4 | server |
| 5 | broadcast |
| 6 | NTP control message |
| 7 | reserved for private use |
+------+--------------------------+
Table 2: Mode Field Values
Mode 0 is reserved. Modes 1 and 2 are intended for use by symmetric
peers who set this mode to 1 or 2 depending on whether it is active
or passive mode. In unicast mode or discovery mode, the client sets
this field to 3 (client) in the request and the server sets it to 4
(server) in the reply. In broadcast mode, the server sets this field
to 5 (broadcast). A mode type of 6 is reserved for NTP control
messages. Mode 7 is reserved for private usage.
3.4. Stratum (Strat) The variables are interpreted as follows:
leap: 2-bit integer warning of an impending leap second to be
inserted or deleted in the last minute of the current month,
coded as follows:
This is an eight-bit unsigned integer indicating the stratum. This 0 no warning
field is significant only in SNTP server messages, where the values 1 last minute of the day has 61 seconds
are defined in Table 3. 2 last minute of the day has 59 seconds
3 alarm condition (the clock is not synchronized)
+---------+-------------------------------------------------------+ version:
| Stratum | Meaning | 3-bit integer representing the NTP version number, currently 4.
+---------+-------------------------------------------------------+
| 0 | kiss-o'-death message |
| 1 | primary reference (e.g., synchronized by radio clock) |
| 2-255 | secondary reference (synchronized by NTP or SNTP) |
+---------+-------------------------------------------------------+
Table 3: Stratum Field Values mode: 3-bit integer representing the mode, with values defined
as follows:
3.5. Poll Interval (Poll) 0 reserved
1 symmetric active
2 symmetric passive
3 client
4 server
5 broadcast
6 NTP control message
7 reserved for private use
This is an eight-bit unsigned integer indicating the maximum interval stratum: 8-bit integer representing the stratum, with values
between successive messages, in log2 seconds. A client SHOULD NOT defined as follows:
use a poll interval less than 15 seconds, except at initial startup
when it MAY send a sequence of 8 packets at 1 second intervals to
provide initial synchronization of the clients with each server. A
client SHOULD increase the poll interval as performance permits and
especially if the server does not respond within a reasonable time.
3.6. Precision (Prec) 0 unspecified or invalid
1 primary server (e.g., equipped with a GPS receiver)
2-15 secondary server (via NTP)
16 client-only
17-255 undefined
This is an eight-bit signed integer indicating the precision of the It is customary to map the stratum value 0 in received packets to
system clock in log2 seconds. Precision is normally determined when MAXSTRAT (16) in the peer variable p.stratum and to map
the service is established as the minimum number of iterations of the p.stratum values of MAXSTRAT or greater to 0 in transmitted
time to read the system clock. As an example, a value of -18 packets. This allows reference clocks, which normally appear at
corresponds to a precision of about one microsecond. stratum 0, to be conveniently mitigated using the same algorithms
used for external sources.
3.7. Root Delay poll: 8-bit signed integer representing the maximum interval
between successive messages, in log2 seconds. Suggested default
limits for minimum and maximum poll intervals are 6 and 10,
respectively.
This is a 32-bit signed fixed-point number indicating the total precision: 8-bit signed integer representing the precision of
roundtrip delay to the primary reference source, in 32-bit NTP short the system clock, in log2 seconds. For instance a value of -18
format. Note that this variable can take on both positive and corresponds to a precision of about one microsecond. The
negative values, depending on the relative time and frequency precision can be determined when the service first starts up as
offsets. This field is significant only in server messages, where the minimum time of several iterations to read the system clock.
the values range from negative values of a few milliseconds to
positive values of several hundred milliseconds.
3.8. Root Dispersion rootdelay: Total roundtrip delay to the reference clock, in NTP
short format.
This is a 32-bit unsigned fixed-point number indicating the nominal rootdisp: Total dispersion to the reference clock, in NTP short
error relative to the primary reference source in seconds, in 32-bit format.
NTP short format.
3.9. Reference Identifier refid: 32-bit code identifying the particular server or
reference clock. The interpretation depends on the value in the
stratum field. For packet stratum 0 (unspecified or invalid)
this is a four-character ASCII string, called the kiss code,
used for debugging and monitoring purposes. For stratum 1
(reference clock) this is a four-octet, left-justified,
zero-padded ASCII string assigned to the radio clock. While not
specifically enumerated in this document, the following have
been used as ASCII identifiers:
This is a 32-bit bitstring identifying the particular reference GOES Geosynchronous Orbit Environment Satellite
source. The interpretation of this field depends on the value in the GPS Global Positioning System
stratum field. For stratum 0, this is a four-character ASCII string, PPS Generic pulse-per-second
referred to as a 'kiss code' and is used for debugging and monitoring IRIG Inter-Range Instrumentation Group
purposes. For stratum 1, this is a four-octet, left-justified, zero- WWVB LF Radio WWVB Ft. Collins, CO 60 kHz
padded ASCII string assigned to the reference source. Above stratum DCF LF Radio DCF77 Mainflingen, DE 77.5 kHz
1 (secondary servers and clients), this is the reference identifier HBG LF Radio HBG Prangins, HB 75 kHz
of the server. If employing IPv4, the value is the 32-bit IPv4 MSF LF Radio MSF Rugby, UK 60 kHz
address of the synchronization source. For IPv6 and OSI, the value JJY LF Radio JJY Fukushima, JP 40 kHz, Saga, JP 60 kHz
is the first 32 bits of the MD5 hash of the IPv6 or NSAP address of LORC MF Radio LORAN C 100 kHz
the synchronization source. The ASCII identifiers that are TDF MF Radio Allouis, FR 162 kHz
currently defined are given in Table 4. CHU HF Radio CHU Ottawa, Ontario
WWV HF Radio WWV Ft. Collins, CO
WWVH HF Radio WWVH Kauai, HI
NIST NIST telephone modem
USNO USNO telephone modem
PTB etc. European telephone modem
Primary (stratum 1) servers set this field to a code identifying the Above stratum 1 (secondary servers and clients) this is the
external reference source according to Table 4. reference identifier of the server. If using the IPv4 address
family, the identifier is the four-octet IPv4 address. If using
the IPv6 address family, it is the first four octets of the MD5
hash of the IPv6 address.
+-------+----------------------------------------------------+ reftime: Time when the system clock was last set or corrected,
| Code | External Reference Source | in NTP timestamp format.
+-------+----------------------------------------------------+
| GOES | Geosynchronous Orbit Environment Satellite |
| GPS | Global Positioning System |
| PPS | Generic pulse-per-second |
| IRIG | Inter-Range Instrumentation Group |
| WWVB | LF Radio WWVB Ft. Collins, CO 60 kHz |
| DCF77 | LF Radio DCF77 Mainflingen, DE 77.5 kHz |
| HBG | LF Radio HBG Prangins, HB 75 kHz |
| MSF | LF Radio MSF Rugby, UK 60 kHz |
| JJY | LF Radio JJY Fukushima, JP 40 kHz, Saga, JP 60 kHz |
| LORC | MF Radio LORAN C 100 kHz |
| TDF | MF Radio Allouis, FR 162 kHz |
| CHU | HF Radio CHU Ottawa, Ontario |
| WWV | HF Radio WWV Ft. Collins, CO |
| WWVH | HF Radio WWVH Kauai, HI |
| NIST | NIST telephone modem |
| USNO | USNO telephone modem |
| PTB | European telephone modem |
+-------+----------------------------------------------------+
Table 4: Currently-defined Reference Identifiers org: Time at the client when the request departed for the
server, in NTP timestamp format.
If the external reference is one of those listed, the associated code rec: Time at the server when the request arrived from the
should be used. Codes for sources not listed can be created as client, in NTP timestamp format.
appropriate (see IANA Considerations section of this document).
3.10. Reference Timestamp xmt: Time at the server when the response left for the
client, in NTP timestamp format.
This is a 64 bit signed integer indicating the time when the system dst: Time at the client when the reply arrived from the
clock was last set or corrected, in 64-bit NTP timestamp format. server, in NTP timestamp format. Note: This value is not
included in a header field; it is determined upon arrival
of the packet and made available in the packet buffer data
structure.
3.11. Originate Timestamp keyid: 32-bit unsigned integer used by the client and server to
designate a secret 128-bit MD5 key. Together, the keyid and
digest fields are called the message authentication
code (MAC).
This is the time at which the request departed the client for the digest: 128-bit bitstring computed by the keyed MD5 message
server, in 64-bit NTP timestamp format. digest algorithm described in Appendix A.
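The 12-word header described above can be sketched with a fixed binary layout. This is a minimal illustration using Python's struct module; the names HEADER, parse_header and build_client_request are hypothetical, extension fields and the MAC are ignored, and a real client would fill in more fields than the bare request built here.

```python
import struct

# LI/VN/Mode octet, stratum, poll, precision, root delay, root dispersion,
# refid, then the reference, origin, receive and transmit timestamps,
# all in network byte order: 48 octets (12 words) in total.
HEADER = struct.Struct("!BBbbII4sQQQQ")

def parse_header(packet):
    """Unpack the first 48 octets of an NTP packet into named variables."""
    (flags, stratum, poll, precision, rootdelay, rootdisp, refid,
     reftime, org, rec, xmt) = HEADER.unpack_from(packet)
    return {
        "leap": flags >> 6,               # top 2 bits
        "version": (flags >> 3) & 0x7,    # next 3 bits
        "mode": flags & 0x7,              # low 3 bits
        "stratum": stratum,
        "poll": poll,                     # signed, log2 seconds
        "precision": precision,           # signed, log2 seconds
        "rootdelay": rootdelay / 2**16,   # NTP short format to seconds
        "rootdisp": rootdisp / 2**16,     # NTP short format to seconds
        "refid": refid,                   # 4 raw octets
        "reftime": reftime, "org": org, "rec": rec, "xmt": xmt,
    }

def build_client_request(xmt):
    """Build a bare mode 3 (client) request carrying a transmit timestamp."""
    flags = (0 << 6) | (4 << 3) | 3       # leap 0, version 4, mode 3
    return HEADER.pack(flags, 0, 0, 0, 0, 0, b"\0\0\0\0", 0, 0, 0, xmt)
```

The destination timestamp (dst) does not appear in the layout because, as noted above, it is not carried in a header field.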
3.12. Receive Timestamp 6.3.1. The Kiss-o'-Death Packet
This is the time at which the request arrived at the server or the If the Stratum field is 0, which is an 'unspecified' Stratum field
reply arrived at the client, in 64-bit NTP timestamp format. value, the Reference Identifier field can be used to convey messages
useful for status reporting and access control. In NTPv4 and SNTPv4,
packets of this kind are called Kiss-o'-Death (KoD) packets and the
ASCII messages they convey are called kiss codes. The KoD packets
got their name because an early use was to tell clients to stop
sending packets that violate server access controls. The kiss codes
can provide useful information for an intelligent client. These
codes are encoded in four-character ASCII strings left justified and
zero filled. The strings are designed for character displays and log
files. A list of the currently-defined kiss codes is given below:
3.13. Transmit Timestamp +------+------------------------------------------------------------+
| Code | Meaning |
+------+------------------------------------------------------------+
| ACST | The association belongs to a unicast server |
| AUTH | Server authentication failed |
| AUTO | Autokey sequence failed |
| BCST | The association belongs to a broadcast server |
| CRYP | Cryptographic authentication or identification failed |
| DENY | Access denied by remote server |
| DROP | Lost peer in symmetric mode |
| RSTR | Access denied due to local policy |
| INIT | The association has not yet synchronized for the first |
| | time |
| MCST | The association belongs to a dynamically discovered server |
| NKEY | No key found. Either the key was never installed or is |
| | not trusted |
| RATE | Rate exceeded. The server has temporarily denied access |
| | because the client exceeded the rate threshold |
| RMOT | Alteration of association from a remote host running |
| | ntpdc. |
| STEP | A step change in system time has occurred, but the |
| | association has not yet resynchronized |
+------+------------------------------------------------------------+
This is the time at which the request departed the client or the Figure 9: Currently-defined NTP Kiss Codes
reply departed the server, in 64-bit NTP timestamp format.
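The kiss code encoding (four ASCII characters, left justified and zero filled, valid only when the stratum field is 0) can be sketched as follows. The helper names are hypothetical, and the refid is assumed to be available as four raw octets.

```python
def encode_kiss_code(code):
    """Encode an ASCII kiss code as four octets, left justified, zero filled."""
    raw = code.encode("ascii")
    if len(raw) > 4:
        raise ValueError("kiss codes are at most four characters")
    return raw.ljust(4, b"\x00")

def decode_kiss_code(stratum, refid):
    """Return the kiss code carried in refid, or None if the packet is
    not a Kiss-o'-Death packet (stratum other than 0)."""
    if stratum != 0:
        return None
    return refid.rstrip(b"\x00").decode("ascii", errors="replace")
```

A client might log the decoded string or, for a code such as RATE, reduce its polling rate.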
3.14. NTPv4 Extension Fields 6.3.2. NTP Extension Field Format
NTPv4 defines new extension field formats. The minimum extension In NTPv4 one or more extension fields can be inserted after the
field length is 8 octets. The format of the NTP extension field is header and before the MAC, which is always present when extension
given in Figure 3. fields are present. The extension fields can occur in any order;
however, in some cases there is a preferred order which improves the
protocol efficiency.
An extension field contains a request or response message in the
format shown in Figure 10.
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Field Type | Length | | Field Type | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Association ID | | Association ID |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Timestamp | | Timestamp |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Filestamp | | Filestamp |
skipping to change at page 12, line 37 skipping to change at page 24, line 30
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Signature Length | | Signature Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
. . . .
. Signature . . Signature .
. . . .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Padding (as needed) | | Padding (as needed) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 3: NTP Extension Field Format Figure 10: NTP Extension Field Format
The Field Type field is a 16-bit integer which indicates the type of All extension fields are zero-padded to a word (4 octets)
extension message contained within the extension field. boundary. The Length field covers the entire extension field,
including the Length and Padding fields. While the minimum field
length is 4 words (16 octets), a maximum field length remains to be
established.
The RE, VN, and Code fields together form a Field Type field, a 16-
bit integer which indicates the type of extension message contained
within the extension field.
The Length field is a 16-bit integer that indicates the length of the The Length field is a 16-bit integer that indicates the length of the
entire extension field in octets, including the Length and Padding entire extension field in octets, including the Length and Padding
fields. fields.
The 32-bit Association ID field is set by clients to the value The 32-bit Association ID field is set by clients to the value
previously received from the server or 0 otherwise. The server sets previously received from the server or 0 otherwise. The server sets
the Association ID field when sending a response as a handle for the Association ID field when sending a response as a handle for
subsequent exchanges. If the association ID value in a request does subsequent exchanges. If the association ID value in a request does
not match the association ID of any association, the server returns not match the association ID of any association, the server returns
skipping to change at page 13, line 14 skipping to change at page 25, line 14
The Timestamp and Filestamp 32-bit fields carry the seconds field of The Timestamp and Filestamp 32-bit fields carry the seconds field of
an NTP timestamp. The Timestamp field establishes the signature an NTP timestamp. The Timestamp field establishes the signature
epoch of the data field in the message, while the filestamp epoch of the data field in the message, while the filestamp
establishes the generation epoch of the file that ultimately produced establishes the generation epoch of the file that ultimately produced
the data. the data.
The 32-bit Value Length field indicates the length of the Value field The 32-bit Value Length field indicates the length of the Value field
in octets. The minimum length of the Value field is 0. in octets. The minimum length of the Value field is 0.
The 32-bit Signature Length field indicates the length of the The 32-bit Signature Length field indicates the length of the
Signature field in octets. Signature field in octets.
Zero padding is applied, as necessary, to extend the extension field Zero padding is applied, as necessary, to extend the extension field
to a word (4-octet) boundary. If multiple extension fields are to a word (4-octet) boundary. If multiple extension fields are
present, the last extension field is zero-padded to a double-word (8 present, the last extension field is zero-padded to a double-word (8
octet) boundary. octet) boundary.
3.15. Authentication (optional) The presence of the MAC and extension fields in the packet is
determined from the length of the remaining area after the header to
NTPv4 provides an optional 160-bit Authentication field. When the end of the packet. The parser initializes a pointer just after
implemented, the 32-bit Key Identifier and 128-bit Message Digest the header. If the Length field is not a multiple of 4, a format
fields contain the Message Authentication Code (MAC) information error has occurred and the packet is discarded. The following cases
which uses an MD5 cryptosum of NTP header plus extension fields. The are possible based on the remaining length in words.
authentication field format is shown in Figure 4. 0 The packet is not authenticated.
0 1 2 3 1 The packet is an error report or crypto-NAK.
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2, 3, 4 The packet is discarded with a format error.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 5 The remainder of the packet is the MAC.
| Key Identifier | >5 One or more extension fields are present.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+ +
| |
+ Message Digest +
| |
+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 4: NTP Authentication Field
The 32-bit Key Identifier is an integer identifying the 128-bit
private key used to generate the MAC. The Message Digest field
contains the MD5 Message Digest. In NTPv4, the presence of one or
more extension fields requires the presence of an authentication
field. The presence of the Authentication field and extension fields
is determined from the Length field.
The Key Identifier is initialized to zero at the start of an
association. The type of association then determines the key
identifier. If the association is active (modes 1, 3, 5) the key is
determined from the system key identifier. If the association is
passive (modes 2, 4) the key is determined from the peer key
identifier, if the authentic bit is set (see [1]), or as the default
key (zero) otherwise.
4. NTP Protocol Operation
The NTP protocol defines three operational roles, Client, Server, and
Symmetric Peer. Clients request or receive time from Servers
(solicited or unsolicited). Servers respond to requests or send
periodic time updates to Clients. Symmetric Peers exchange time data
bidirectionally. A given NTPv4 implementation can operate in any or
all of these modes.
NTP messages make use of two different communication modes, one to
one and one to many, commonly referred to as unicast and broadcast.
For the purposes of this document, the term broadcast is interpreted
to mean any available one to many mechanism. For IPv4 this equates
to either IPv4 broadcast or IPv4 multicast. For IPv6 this equates to
IPv6 multicast. For this purpose, IANA has allocated the IPv4
multicast address 224.0.1.1 and the IPv6 multicast address ending
:101, with prefix determined by scoping rules.
Except in broadcast mode, an NTP association is formed when two peers
exchange messages and one or both of them create and maintain an
instantiation of the protocol machine, called an association. The
association can operate in one of five modes as indicated by the
host-mode variable (peer.mode) (see [1] for a description of the NTP
variables): symmetric active, symmetric passive, client, server and
broadcast, which are defined as follows:
Symmetric Active (1): A host operating in this mode sends periodic
messages regardless of the reachability state or stratum of its peer.
By operating in this mode the host announces its willingness to
synchronize and be synchronized by the peer.
Symmetric Passive (2): This type of association is ordinarily created
upon arrival of a message from a peer operating in the symmetric
active mode and persists only as long as the peer is reachable and
operating at a stratum level less than or equal to the host;
otherwise, the association is dissolved. However, the association
will always persist until at least one message has been sent in
reply. By operating in this mode the host announces its willingness
to synchronize and be synchronized by the peer.
Client (3): A host operating in this mode sends periodic messages
regardless of the reachability state or stratum of its peer. By
operating in this mode the host announces its willingness to be
synchronized by, but not to synchronize the peer.
Server (4): This type of association is ordinarily created upon
arrival of a client request message and exists only in order to reply
to that request, after which the association is dissolved. By
operating in this mode the host announces its willingness to
synchronize, but not to be synchronized by the peer.
Broadcast (5): A host operating in this mode sends periodic messages
regardless of the reachability state or stratum of the peers. By
operating in this mode the host announces its willingness to
synchronize all of the peers, but not to be synchronized by any of
them.
NTP messages are layered on top of UDP. All messages MUST be sent
with a destination port of 123, and SHOULD be sent with a source port
of 123.
The on-wire protocol uses four timestamps numbered T1 through T4 and If an extension field is present, the parser examines the Length
three state variables org, rec, and xmt, as shown in Figure 5, field. If the length is less than 4 or not a multiple of 4, a format
where T1 corresponds to the Reference Timestamp, T2 corresponds to the error has occurred and the packet is discarded; otherwise, the parser
Originate Timestamp, T3 corresponds to the Receive Timestamp, and T4 increments the pointer by this value. The parser now uses the same
corresponds to the Transmit Timestamp. rules as above to determine whether a MAC is present and/or another
extension field. An additional implementation dependent test is
necessary to ensure the pointer does not stray outside the buffer
space occupied by the packet.
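The parsing rules above can be sketched in Python as follows. This is a minimal sketch, not the reference implementation: the function name and return shape are hypothetical, the fixed header is taken to be 48 octets, the word-alignment test is applied to the remaining length, and the Length field is read as the second 16-bit integer of the extension field, as drawn in the extension field figure.

```python
import struct

HEADER_LEN = 48  # octets in the fixed NTP header (assumed constant)


def parse_after_header(packet: bytes):
    """Return (extension_fields, mac) for the area after the header.

    extension_fields is a list of raw extension fields; mac is None when
    the packet is not authenticated, 4 octets for an error report or
    crypto-NAK, and 20 octets for a key ID plus 128-bit digest.
    """
    if len(packet) < HEADER_LEN:
        raise ValueError("format error: packet shorter than the header")
    ptr = HEADER_LEN
    fields = []
    while True:
        remaining = len(packet) - ptr
        if remaining % 4 != 0:
            raise ValueError("format error: not a multiple of 4 octets")
        words = remaining // 4
        if words == 0:
            return fields, None            # not authenticated
        if words == 1:
            return fields, packet[ptr:]    # error report or crypto-NAK
        if words in (2, 3, 4):
            raise ValueError("format error: invalid remaining length")
        if words == 5:
            return fields, packet[ptr:]    # remainder is the MAC
        # More than 5 words remain, so an extension field starts here.
        (length,) = struct.unpack_from("!H", packet, ptr + 2)
        if length < 4 or length % 4 != 0:
            raise ValueError("format error: bad extension field length")
        if length > remaining:
            # Implementation-dependent bounds test: stay inside the buffer.
            raise ValueError("format error: extension field overruns packet")
        fields.append(packet[ptr:ptr + length])
        ptr += length
```

The loop reuses the same length rules after each extension field, so a MAC following one or more extension fields is found without a separate code path.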
7. On Wire Protocol
t2 t3 t6 t7 t2 t3 t6 t7
+---------+ +---------+ +---------+ +---------+ +---------+ +---------+ +---------+ +---------+
T1 | 0 | | t2 | | t4 | | t6 | T1 | 0 | | t2 | | t4 | | t6 |
+---------+ +---------+ +---------+ +---------+ +---------+ +---------+ +---------+ +---------+
T2 | 0 | | t1 | | t3 | | t5 | Packet T2 | 0 | | t1 | | t3 | | t5 | Packet
+---------+ +---------+ +---------+ +---------+ Variables +---------+ +---------+ +---------+ +---------+ Variables
T3 |t2=clock | | t2 | |t6=clock | | t6 | T3 |t2=clock | | t2 | |t6=clock | | t6 |
+---------+ +---------+ +---------+ +---------+ +---------+ +---------+ +---------+ +---------+
T4 | t1 | |t3=clock | | t5 | |t7=clock | T4 | t1 | |t3=clock | | t5 | |t7=clock |
+---------+ +---------+ +---------+ +---------+ +---------+ +---------+ +---------+ +---------+
skipping to change at page 16, line 52 skipping to change at page 26, line 51
+---------+ +---------+ +---------+ +---------+ +---------+ +---------+ +---------+ +---------+
Peer A Peer A
+---------+ +---------+ +---------+ +---------+ +---------+ +---------+ +---------+ +---------+
org | 0 | | T3<>0? | | t3 | | T3<>t3? | org | 0 | | T3<>0? | | t3 | | T3<>t3? |
+---------+ +---------+ +---------+ +---------+ State +---------+ +---------+ +---------+ +---------+ State
rec | 0 | | t4 | | t4 | | t8 | Variables rec | 0 | | t4 | | t4 | | t8 | Variables
+---------+ +---------+ +---------+ +---------+ +---------+ +---------+ +---------+ +---------+
xmt | t1 | | T1=t1? | | t5 | | T1<>t5? | xmt | t1 | | T1=t1? | | t5 | | T1<>t5? |
+---------+ +---------+ +---------+ +---------+ +---------+ +---------+ +---------+ +---------+
Figure 5: NTPState Figure 12: On-Wire Protocol
This figure shows the most general case, where each of two peers, A The NTP on-wire protocol is the core mechanism to exchange time
values between servers, peers and clients. It is inherently
resistant to lost or duplicate data packets. Data integrity is
provided by the IP and UDP checksums. No flow-control or
retransmission facilities are provided or necessary. The protocol
uses timestamps, either extracted from packet headers or struck from
the system clock upon the arrival or departure of a packet.
Timestamps are precision data and should be restruck in case of link
level retransmission and corrected for the time to compute a MAC on
transmit.
NTP messages make use of two different communication modes, one to
one and one to many, commonly referred to as unicast and broadcast.
For the purposes of this document, the term broadcast is interpreted
to mean any available one to many mechanism. For IPv4 this equates
to either IPv4 broadcast or IPv4 multicast. For IPv6 this equates to
IPv6 multicast. For this purpose, IANA has allocated the IPv4
multicast address 224.0.1.1 and the IPv6 multicast address ending
:101, with prefix determined by scoping rules.
The on-wire protocol uses four timestamps numbered T_1 through T_4
and three state variables org, rec and xmt, as shown in Figure 12.
This figure shows the most general case where each of two peers, A
and B, independently measure the offset and delay relative to the and B, independently measure the offset and delay relative to the
other. For illustrative purposes, the individual timestamp values other. For purposes of illustration the individual timestamp values
are shown in lower case with subscripts indicating the order of are shown in lower case with subscripts indicating the order of
transmission and reception. In the figure, the first packet transmission and reception.
transmitted by A contains only the transmit timestamp T4 with value
t1. B receives the packet at t2 and saves the originate timestamp T2 In the figure the first packet transmitted by A contains only the
with value t1 in state variable org and the receive timestamp T3 with transmit timestamp T3 with value t1. B receives the packet at t2 and
value t2 in state variable rec. Afterwards, B sends a packet to A saves the origin timestamp T1 with value t1 in state variable org and
containing the org and rec state variables in T2 and T1 respectively the destination timestamp T4 with value t2 in state variable rec. At
and additionally the transmit timestamp T4 with value t3, which is this time or some time later B sends a packet to A containing the org
saved in the xmt state variable. When this packet arrives at A the and rec state variables in T1 and T2, respectively and in addition
packet header variables T1, T2, T3, and T4 represent the four the transmit timestamp T3 with value t3, which is saved in the xmt
timestamps necessary to compute the offset and delay of B relative state variable. When this packet arrives at A the packet header
to A. variables T1, T2, T3 and destination timestamp T4 represent the four
timestamps necessary to compute the offset and delay of B relative to
A, as described later.
Before the A state variables are updated, two sanity checks are Before the A state variables are updated, two sanity checks are
performed in order to protect against duplicate or invalid packets. performed in order to protect against duplicate or bogus packets. A
A packet is a duplicate if the transmit timestamp T4 in the packet packet is a duplicate if the transmit timestamp T3 in the packet
matches the xmt state variable. A packet is invalid if the origin matches the xmt state variable. A packet is bogus if the origin
timestamp T2 in the packet does not match the org state variable. In timestamp T1 in the packet does not match the org state variable. In
either of these cases the state variables are updated, but the packet either of these cases the state variables are updated, but the packet
is discarded. is discarded.
The general rules that govern the updating of state variables and The four most recent timestamps, T1 through T4, are used to compute
packet variables are given in Figure 6. the offset of B relative to A
+-------------------------------------------------------+
| Receive | Transmit |
+-------------------------------------------------------+
| org=T4 | org=unchanged |
| rec=Time of Receipt | rec=unchanged |
| xmt=unchanged | xmt=Time of transmission |
| | |
| T1=Received T3 | T1=rcv |
| T2=Received T2 | T2=org |
| T3=rec | T3=unchanged |
| T4=Received T4 | T4=xmt |
+-------------------------------------------------------+
Figure 6: Relationship between NTP State Variables and NTP Packet theta = T(B) - T(A) = 1/2*[(T_2-T_1)+(T_3-T_4)]
Variables
5. SNTP Protocol Operation and the roundtrip delay
SNTP operates using the same message formats, addresses, and ports as del = T(ABA) = (T_4-T_1)-(T_3-T_2)
NTP. However, it is stateless, operating only in the client or
server roles. Thus it is compatible with, and a subset of, NTP.
6. NTP Server Operations Note that the quantities within parentheses are computed from 64-bit
unsigned timestamps and result in signed values with 63 significant
bits plus sign. These values can represent dates from 68 years in
the past to 68 years in the future. However, the offset and delay
are computed as the sum and difference of these values, which contain
62 significant bits and two sign bits, so can represent unambiguous
values from 34 years in the past to 34 years in the future. In other
words, the time of the client must be set within 34 years of the
server before the service is started. This is a fundamental
limitation with 64-bit integer arithmetic.
Fundamentally, the NTP Server role consists of listening for client In implementations where floating double arithmetic is available, the
requests, and providing time and associated details as a response. first-order differences can be converted to floating double and the
Additionally, a server can provide time and associated details second-order sums and differences computed in that arithmetic. Since
periodically via a broadcast mechanism. the second-order terms are typically very small relative to the
timestamps themselves, there is no loss in significance, yet the
unambiguous range is increased from 34 years to 68 years.
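The computation described above can be sketched in Python. The sketch assumes 64-bit unsigned timestamps with 32 bits of seconds and 32 bits of fraction, forms each first-order difference as a wrapped signed 64-bit value, converts it to floating double seconds, and takes the offset as half the sum of the two first-order differences. The function names are hypothetical.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def first_order_diff(a: int, b: int) -> float:
    """Return a - b in seconds for two 64-bit unsigned NTP timestamps."""
    d = (a - b) & MASK64        # wrap as 64-bit unsigned arithmetic would
    if d >= 1 << 63:            # reinterpret the result as a signed value
        d -= 1 << 64
    return d / 2**32            # 32 fraction bits, so scale to seconds


def offset_and_delay(t1: int, t2: int, t3: int, t4: int):
    """Return (theta, del) in seconds from the four on-wire timestamps."""
    theta = 0.5 * (first_order_diff(t2, t1) + first_order_diff(t3, t4))
    delay = first_order_diff(t4, t1) - first_order_diff(t3, t2)
    return theta, delay
```

Because only the small first-order differences are converted, the second-order sums and differences are carried out in floating double without loss of significance.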
An NTP server can communicate via unicast, broadcast, or both. A In some scenarios where the frequency offset between the client and
server receiving a unicast request (NTP mode 3), modifies fields in server is relatively large and the actual propagation time small, it
the NTP header as described below, and sends a reply (NTP mode 4), is possible that the delay computation becomes negative. For
possibly using the same message buffer as the request. When instance, if the frequency difference is 100 PPM and the interval T_4
operating in a broadcast mode, unsolicited messages (NTP mode 5) with - T_1 is 64 s, the apparent delay is -6.4 ms. Since negative values
field values as described below are normally sent at intervals are misleading in subsequent computations, the value of del should be
ranging from 64 s to 1024 s, depending on the expected frequency clamped not less than the system precision s.precision rho defined
tolerance of the client clocks and the required accuracy. below.
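A short Python sketch of the arithmetic and the clamp follows. It assumes that the precision rho is available as a value in seconds; both function names are hypothetical.

```python
def frequency_error(freq_ppm: float, interval_s: float) -> float:
    """Return the time error in seconds accumulated over interval_s."""
    return freq_ppm * 1e-6 * interval_s


def clamp_delay(delay: float, rho: float) -> float:
    """Return the delay, clamped so it is never less than the precision rho."""
    return max(delay, rho)
```

With a 100 PPM frequency difference over 64 s, frequency_error() gives about 0.0064 s, which matches the 6.4 ms magnitude quoted above. A computed delay of -0.0064 s would therefore be replaced by rho.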
A broadcast server may or may not send messages if not synchronized The discussion above assumes the most general case where two
to a correctly operating source, but the preferred option is to symmetric peers independently measure the offsets and delays between
transmit, since this allows reachability to be determined regardless them. In the case of a stateless server, the protocol can be
of synchronization state. simplified. A stateless server copies T_3 and T_4 from the client
packet to T_1 and T_2 of the server packet and tacks on the transmit
timestamp T_3 before sending it to the client. Additional details
for filling in the remaining protocol fields are given in the next
section and in Appendix A.
The Leap Indicator (LI) is set to 3 (unsynchronized) if the server A SNTP primary server implementing the on-wire protocol has no
has never synchronized to a reference source. Once synchronized, the upstream servers except a single reference clock. In principle, it is
LI field is set to one of the other three values and remains at the indistinguishable from an NTP primary server which has the mitigation
last value set even if the reference source becomes unreachable or algorithms, presumably to mitigate between multiple reference clocks.
turns faulty. Upon receiving a client request, a SNTP primary server constructs and
sends the reply packet as shown in Figure 5 below. Note that the
dispersion field in the packet header must be calculated in the same
way as in the NTP case.
The Version (VN) is copied from the request packet, if responding to A SNTP client using the on-wire protocol has a single server and no
a unicast request. For broadcast, this is set to 4. downstream clients. It can operate with any subset of the NTP on-
wire protocol, the simplest using only the transmit timestamp of the
server packet and ignoring all other fields. However, the additional
complexity to implement the full on-wire protocol is minimal and is
encouraged.
The Mode is set to Server (4) if in response to a unicast request. 8. Peer Process
For broadcast, this is set to Broadcast (5).
The Stratum field is set to the server's current stratum, if The peer process is called upon arrival of a server packet. It runs
synchronized. If synchronized to a primary reference source the the on-wire protocol to determine the clock offset and roundtrip
Stratum field is set to 1. If unsynchronized this field is set to 0. delay and in addition computes statistics used by the system and poll
processes. Peer variables are instantiated in the association data
structure when the structure is initialized and updated by arriving
packets. There is a peer process, poll process and association for
each server.
The Poll field is copied from the request, if responding to a The discussion in this section covers only the variables and routines
unicast request. For broadcast, this is set to the nearest integer necessary for a conforming NTPv4 implementation. Additional
log2 of the poll interval. implementation details are in Section B.5.
The Precision field is set to reflect the maximum reading error of 8.1. Peer Process Variables
the system clock. The Root Delay and Root Dispersion fields are set Name Formula Description
to 0 for a primary server; optionally, the Root Dispersion field can Configuration Variables
be set to a value corresponding to the maximum error of the radio srcaddr srcaddr source address
clock itself. srcport srcport source port
dstaddr dstaddr destination address
dstport dstport destination port
keyid keyid key identifier key ID
Packet Variables
leap leap leap indicator
version version version number
mode mode mode
stratum stratum stratum
ppoll ppoll peer poll exponent
rootdelay delta root delay
rootdisp E root dispersion
refid refid reference ID
reftime reftime reference timestamp
Timestamp Variables
t t epoch
org T1 origin timestamp
rec T2 receive timestamp
xmt T3 transmit timestamp
Statistics Variables
offset theta clock offset
delay del roundtrip delay
disp epsilon dispersion
jitter varphi jitter
If the server is synchronized to a reference source, the value of the Figure 13: Peer Process Variables
Reference ID is set to a four-character ASCII string identifying the
source, left justified and zero padded to 32 bits. For IPv4 secondary
servers, the value is the 32-bit IPv4 address of the synchronization
source. For IPv6 and OSI secondary servers, the value is the first
32 bits of the MD5 hash of the IPv6 or NSAP address of the
synchronization source. If unsynchronized, it is set to an ASCII
error identifier.
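The Reference ID rules above can be sketched in Python. The sketch assumes that the MD5 hash is taken over the 16-octet binary form of the IPv6 address, which the text does not state explicitly; the function names are hypothetical and the OSI (NSAP) case is omitted.

```python
import hashlib
import ipaddress


def refid_for_reference_source(name: str) -> bytes:
    """Return an ASCII identifier, left justified and zero padded to 32 bits."""
    raw = name.encode("ascii")
    if len(raw) > 4:
        raise ValueError("identifier is limited to four characters")
    return raw.ljust(4, b"\x00")


def refid_for_secondary(address: str) -> bytes:
    """Return the 4-octet Reference ID for a synchronization source address."""
    ip = ipaddress.ip_address(address)
    if ip.version == 4:
        return ip.packed                         # the 32-bit IPv4 address
    # Assumption: hash the packed 16-octet address, keep the first 32 bits.
    return hashlib.md5(ip.packed).digest()[:4]
```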
The timestamp fields in the server message are set as follows. If Figure 13 summarizes the common names, formula names and a short
the server is unsynchronized or first coming up, all timestamp fields description of each peer variable, all of which have prefix p. The
are set to zero with one exception. If the server is synchronized, following configuration variables are normally initialized when the
the Transmit Timestamp field of the request is copied unchanged to association is mobilized, either from a configuration file or upon
the Originate Timestamp field of the reply. arrival of the first packet for an ephemeral association.
If the server is synchronized, the Reference Timestamp is set to the p.srcaddr: IP address of the remote server or reference clock. This
time the last update was received from the reference source. The becomes the destination IP address in packets sent from this
Originate Timestamp field is set as in the unsynchronized case above. association.
The Transmit Timestamp field is set to the time of day when the
message is sent. In broadcast messages the Receive Timestamp field
is set to zero and copied from the Transmit Timestamp field in other
messages.
Table 5 summarizes these actions. p.srcport: UDP port number of the server or reference clock. This
becomes the destination port number in packets sent from this
association. When operating in symmetric modes (1 and 2) this field
must contain the NTP port number PORT (123) assigned by the IANA. In
other modes it can contain any number consistent with local policy.
+---------------+-----------+-------------------+-------------------+ p.dstaddr: IP address of the client. This becomes the source IP
| Field Name | Unicast | Unicast Reply | Broadcast | address in packets sent from this association.
| | Request | | |
+---------------+-----------+-------------------+-------------------+
| LI | ignore | as needed | as needed |
| VN | 1-4 | copied from | 4 |
| | | request | |
| Mode | 1 or 3 | 2 or 4 | 5 |
| Stratum | ignore | 1 | 1 |
| Poll | ignore | copied from | log2 poll |
| | | request | interval |
| Precision | ignore | -log2 server | -log2 server |
| | | significant bits | significant bits |
| Root Delay | ignore | 0 | 0 |
| Root | ignore | 0 | 0 |
| Dispersion | | | |
| Reference | ignore | source ident | source ident |
| Identifier | | | |
| Reference | ignore | time of last src. | time of last src. |
| Timestamp | | update | update |
| Originate | ignore | copied from xmit | 0 |
| Timestamp | | timestamp | |
| Receive | ignore | time of day | 0 |
| Timestamp | | | |
| Transmit | (see | time of day | time of day |
| Timestamp | text) | | |
| Authenticator | optional | optional | optional |
+---------------+-----------+-------------------+-------------------+
Table 5: NTP Server Message Field Population p.dstport: UDP port number of the client, ordinarily the NTP port
number PORT (123) assigned by the IANA. This becomes the source port
number in packets sent from this association.
Broadcast servers should respond to client unicast requests, as p.keyid: Symmetric key ID for the 128-bit MD5 key used to generate
well as send unsolicited broadcast messages. Broadcast clients and verify the MAC. The client and server or peer can use different
may send unicast requests in order to measure the network values, but they must map to the same key.
propagation delay between the server and client and then continue
operation in listen-only mode. However, broadcast servers may
choose not to respond to unicast requests, so unicast clients
should be prepared to abandon the measurement and assume a default
value for the delay.
7. NTP Client Operations The variables defined below are updated from the packet header as
each packet arrives. They are interpreted in the same way as the
packet variables of the same names.
------------------
| receive |
------------------
\| /
------------------ no------------------
| format OK? |-->| format error |
------------------ ------------------
\| / yes
------------------ no------------------
| access OK? |-->| access error |
------------------ ------------------
\| / yes
------------------yes------------------
| mode = 3? |-->| client_packet |
------------------ ------------------
\| / no
------------------ no------------------
| auth OK? |-->| auth error |
------------------ ------------------
\| / yes
------------------
| match_assoc |
------------------
The role of an NTP client is to determine the current time (and Figure 14: Receive Processing
associated information) from an NTP server. This can be done
actively, by sending a unicast request to a configured server, or
passively by listening on a known address for periodic server
messages.
An NTP client can operate in unicast or broadcast modes. In unicast p.leap, p.version, p.mode, p.stratum, p.ppoll, p.rootdelay,
mode the client sends a request (NTP mode 3) to a designated unicast p.rootdisp, p.refid, p.reftime
server and expects a reply (NTP mode 4) from that server. In
broadcast client mode it sends no request and waits for a broadcast
(NTP mode 5) from one or more broadcast servers.
A unicast client initializes the NTP message header, sends the It is convenient for later processing to convert the NTP short format
request to the server and strips the time of day from the Transmit packet values p.rootdelay and p.rootdisp to floating doubles as peer
Timestamp field of the reply. For this purpose, all of the NTP variables.
header fields shown in Section 3 are set to 0, except the Mode, VN
and optional Transmit Timestamp fields.
NTP and SNTP clients set the mode field to 3 (client) for unicast The p.org, p.rec, p.xmt variables represent the timestamps computed
requests. They set the VN field to any version number supported by by the on-wire protocol described previously. The p.offset, p.delay,
the server selected by configuration or discovery and can p.disp, p.jitter variables represent the current time values and
interoperate with all previous version NTP and SNTP servers. Servers statistics produced by the clock filter algorithm. The offset and
reply with the same version as the request, so the VN field of the delay are computed by the on-wire protocol; the dispersion and jitter
request also specifies the VN field of the reply. An NTP client can are calculated as described below. Strictly speaking, the epoch p.t
specify the earliest acceptable version on the expectation that any is not a timestamp; it records the system timer upon arrival of the
server of that or later version will respond. NTPv4 servers are latest packet selected by the clock filter algorithm.
backwards compatible with NTPv3 as defined in RFC 1305, NTPv2 as
defined in [11], and NTPv1 as defined in [12]. NTPv0 defined in [13]
is not supported.
In unicast mode, the Transmit Timestamp field in the request SHOULD 8.2. Peer Process Operations
be set to the time of day according to the client clock in NTP
timestamp format. This allows for the determination of the
propagation delay between the server and client and to align the
system clock relative to the server. In addition, this provides a
simple method to verify that the server reply is in fact a legitimate
response to the specific client request and avoid replays. Note that
in broadcast mode, the client cannot necessarily calculate the
propagation delay or determine the validity of the server.
There is some latitude on the part of most clients to forgive invalid Figure 14 shows the peer process code flow upon the arrival of a
timestamps, such as might occur when first coming up or during packet. There is no specific method required for access control,
periods when the reference source is inoperative. The most important although it is recommended that implementations include a match-and-
indicator of an unhealthy server is the Stratum field, in which a mask scheme similar to many others now in widespread use. Format
value of 0 indicates an unsynchronized condition. When this value is checks require correct field length and alignment, acceptable version
displayed, clients should discard the server message, regardless of number (1-4) and correct extension field syntax, if present. There
the contents of other fields. is no specific requirement for authentication; however, if
authentication is implemented, the symmetric key scheme described in
Section 6 must be included among those supported. This scheme uses the
MD5 keyed hash algorithm described in Section A.2. For the most vulnerable
applications the Autokey public key scheme described in [3] is
recommended.
Table 6 summarizes the required NTP client operations in unicast and Next, the association table is searched for matching source address
broadcast modes and source port using the find_assoc() routine in Section A.5. The
dispatch table near the beginning of that section is indexed by the
packet mode and association mode (0 if no matching association) to
determine the dispatch code and thus the case target. The
significant cases are FXMIT, NEWPS and NEWBC.
-----------------
| client_packet |
-----------------
\ | /
-----------------
| copy header |
-----------------
\ | /
-----------------
| copy T_1,T_2 |
-----------------
\ | /
-----------------
| T_3 = clock |
-----------------
\ | /
-----------------yes-----------------
| copy header |-->| MD5 digest |-\
----------------- ----------------- |
| no |
\ | / |
----------------- |
| NAK digest | |
----------------- |
|-----------------------------/
\ | /
-----------------
| fast_xmit() |
-----------------
\ | /
-----------------
| xmt = T_3 |
-----------------
\ | /
-----------------
| return |
-----------------
+-------------------+---------------+-------------------+-----------+ Packet Variable <-- Variable
| Field Name | Unicast | Unicast Reply | Broadcast | x.leap <-- s.leap
| | Request | | | x.version <-- r.version
+-------------------+---------------+-------------------+-----------+ x.mode <-- 4
| LI | 0 | 0-3 | 0-3 | x.stratum <-- s.stratum
| VN | 1-4 | copied from | 1-4 | x.poll <-- r.poll
| | | request | | x.precision <-- s.precision
| Mode | 1 or 3 | 2 or 4 | 5 | x.rootdelay <-- s.rootdelay
| Stratum | 0 | 0-15 | 0-15 | x.rootdisp <-- s.rootdisp
| Poll | 0 | ignore | ignore | x.refid <-- s.refid
| Precision | 0 | ignore | ignore | x.reftime <-- s.reftime
| Root Delay | 0 | ignore | ignore | x.org <-- r.xmt
| Root Dispersion | 0 | ignore | ignore | x.rec <-- r.dst
| Reference | 0 | ignore | ignore | x.xmt <-- clock
| Identifier | | | | x.keyid <-- r.keyid
| Reference | 0 | ignore | ignore | x.digest <-- md5 digest
| Timestamp | | | |
| Originate | 0 | (see text) | ignore |
| Timestamp | | | |
| Receive Timestamp | 0 | (see text) | ignore |
| Transmit | (see text) | nonzero | nonzero |
| Timestamp | | | |
| Authenticator | optional | optional | optional |
+-------------------+---------------+-------------------+-----------+
Table 6: NTP Client Message Field Population Figure 15: Client Packet Processing
8. NTP Symmetric Peer Operations FXMIT. This is a client (mode 3) packet matching no association.
The server constructs a server (mode 4) packet and returns it to the
client without retaining state. The server packet is constructed as
in Figure 15 and the fast_xmit() routine in Section B.5. If the
s.rootdelay and s.rootdisp system variables are stored in floating
double, they must be converted to NTP short format first. Note that,
if authentication fails, the server returns a special message called
a crypto-NAK. This message includes the normal NTP header data shown
in the figure, but with a MAC consisting of four octets of zeros.
The client is free to accept or reject the data in the message.
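As a rough illustration of the FXMIT case, the following Python sketch builds a stateless server reply using the assignments listed in Figure 15. The dictionary representation of packets, the names fast_reply() and to_ntp_short(), and the auth_ok flag are assumptions made for this example and are not part of the specification; computation of a real MD5 digest is omitted.

```python
def to_ntp_short(seconds):
    """Convert non-negative seconds (float) to NTP short format:
    16 bits of integer seconds and 16 bits of fraction in one 32-bit word."""
    return int(round(seconds * 65536)) & 0xFFFFFFFF


def fast_reply(r, s, clock_now, auth_ok=True):
    """Build the server (mode 4) reply x for request r from system
    variables s, following the assignments of Figure 15."""
    x = {
        "leap": s["leap"],
        "version": r["version"],   # the reply echoes the request version
        "mode": 4,
        "stratum": s["stratum"],
        "poll": r["poll"],
        "precision": s["precision"],
        # Floating doubles are converted to NTP short format first.
        "rootdelay": to_ntp_short(s["rootdelay"]),
        "rootdisp": to_ntp_short(s["rootdisp"]),
        "refid": s["refid"],
        "reftime": s["reftime"],
        "org": r["xmt"],           # client transmit timestamp
        "rec": r["dst"],           # arrival time at the server
        "xmt": clock_now,          # departure time from the server
    }
    if "keyid" in r:
        if auth_ok:
            x["keyid"] = r["keyid"]   # a real digest would be appended here
        else:
            x["mac"] = b"\x00" * 4    # crypto-NAK: MAC of four zero octets
    return x
```

The function keeps no state between calls, which mirrors the requirement that the server answer a mode 3 packet without mobilizing an association.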
NTP Symmetric Peer mode is intended for configurations where a set of NEWBC. This is a broadcast (mode 5) packet matching no association.
low-stratum peers operate as mutual backups for each other. Each The client mobilizes a client (mode 3) association as shown in the
peer normally operates with one or more sources, such as a reference mobilize() and clear() routines in Section A.2. Implementations
clock, or a subset of primary or secondry servers known to be supporting authentication first perform the necessary steps to run
reliable or authentic. the Autokey or other protocol, and determine the propagation delay,
then continue in listen-only mode (mode 6) to receive further packets.
Note the distinction between a mode-6 packet, which is reserved for
the NTP monitor and control functions, and a mode-6 association.
Symmetric Peer mode is exclusive to the NTP protocol and is NEWPS. This is a symmetric active (1) packet matching no
specifically excluded from SNTP operation. For the purposes of this association. The client mobilizes a symmetric passive (mode 2)
document, an NTP peer operates like a client. association as shown in the mobilize() and clear() routines in
Section A.2. Code flow continues to the match_assoc() fragment
described below. In other cases the packet matches an existing
association and code flows to the match_assoc fragment in Figure 16.
The packet timestamps are carefully checked to avoid invalid,
duplicate or bogus packets, as shown in the figure. Note that a
crypto-NAK is considered valid only if it survives these tests.
Next, the peer variables are copied from the packet header variables
as shown in Figure 17 and the packet() routine in Section A.5.
Implementations must include a number of data range checks as shown
in Table 3 and discard the packet if the ranges are exceeded;
however, the header fields are copied even if errors occur, since
they are necessary in symmetric modes to construct the subsequent
poll message.
9. Dynamic Server Discovery ------------------
| match assoc |
----------------
\ | /
----------------yes----------------
| T_3 = 0? |-->| format error |
---------------- ----------------
\ | / no
----------------yes----------------
| T_3 = xmt? |-->| duplicate |
---------------- ----------------
\ | / no
----------------no ----------------yes
| mode = 5? |-->|T_1 or T2 = 0?|--\
---------------- ---------------- |
| yes \ | / no |
\ | /<-----\ ---------------- |
| \-| T_1 = xmt? | |
---------------- ---------------- |
| auth = NAK? | no \ | /<-----/
---------------- |
yes\|/ no\|/ ----------------
--------- ------ | org = T_3 |
|org=T_3| |auth| | rec = T_4 |
|rec=T_4| |err | ----------------
--------- ------ \ | /
\|/ ----------------
--------- | return |
|packet | ----------------
---------
NTPv4 provides a mechanism, commonly known as "Manycast", for a Figure 16: Timestamp Processing
client to dynamically discover the existance of one or more servers ----------------
with no a-priori knowledge. Once servers are discovered, they are | packet |
then treated as any other unicast server. ----------------
\ | /
----------------
| copy header |
----------------
\ | /
----------------bad----------------
| header? |-->|header error |
---------------- ----------------
\ | /
----------------
| reach |= 1 |
----------------
\ | /
----------------
| poll update |
----------------
\ | /
----------------------------------------
| theta = 1/2*[(T_2-T_1)+(T_3-T_4)]    |
| del = (T_4-T_1)-(T_3-T_2)            |
| epsilon = rho_r+rho+capphi*(T_4-T_1) |
----------------------------------------
\ | /
----------------
| clock filter |
----------------
A client employing server discovery is configured with MinServers, Peer Variables <-- Packet Variables
the minimum number of desired servers and MaxServers, the maximum p.leap <-- r.leap
number of desired servers. The discovery mechanism is a simple p.mode <-- r.mode
expanding ring search, using IP multicast with increasing TTLs or Hop p.stratum <-- r.stratum
Counts. The multicast address used MUST be scoped to the local site, p.ppoll <-- r.ppoll
as defined by [14]. p.rootdelay <-- r.rootdelay
p.rootdisp <-- r.rootdisp
p.refid <-- r.refid
p.reftime <-- r.reftime
The client initiates the discovery process by sending an NTP message Figure 17: Packet Processing
to the configured multicast address (224.0.1.1 for IPv4 and a +----------------+--------------------------------------------------+
multicast address ending :101 for IPv6 with proper scoping.) with an | Packet Type | Description |
IP TTL or Hop Count of 1. This message has all of the NTP header +----------------+--------------------------------------------------+
fields set to 0, except the Mode, VN and optional Transmit Timestamp | 1 duplicate | The packet is at best an old duplicate or at |
fields. The Mode is set to 3. It then starts a retry timer | packet | worst a replay by a hacker. This can happen in |
(Default: 64 seconds) and listens for unicast responses from servers. | | symmetric modes if the poll intervals are |
The source address of any server responses are treated as newly | | uneven. |
configured unicast servers, up to a limit of MaxServers. If the | 2 bogus packet | |
number of discovered servers is less than MinServers when the retry | 3 invalid | One or more timestamp fields are invalid. This |
timer expires, an identical NTP message is sent with an increased | | normally happens in symmetric modes when one |
TTL/Hop Count, and the retry timer is restarted. This continues | | peer sends the first packet to the other and |
until either MinServers servers have been discovered or a configured | | before the other has received its first reply. |
maximum TTL/Hop Count is reached. If the configured maximum TTL/Hop | 4 access | The access controls have black |
Count is reached, packets continue to be periodically sent at the | denied | |
maximum TTL/Hop Count. If at some subsequent time, the number of | 5 | The cryptographic message digest does not match |
valid servers drops below MinServers, the process restarts at the | authentication | the MAC. |
initial state. | failure | |
| 6 | The server is not synchronized to a valid |
| unsynchronized | source. |
| 7 bad header | One or more header fields are invalid. |
| data | |
| 8 autokey | Public key cryptography has failed to |
| error | authenticate the packet. |
| 9 crypto error | Mismatched or missing cryptographic keys or |
| | certificates. |
+----------------+--------------------------------------------------+
A server configured to provide server discovery will listen on the Table 3: Packet Error Checks
specified multicast address for discovery messages from clients. If
the server is in scope of the current TTL and is itself synchronized
to a valid source it replies to the discovery message from the client
with an ordinary unicast server message as described in Section 6
10. The Kiss-o'-Death Packet The 8-bit p.reach shift register in the poll process described later
is used to determine whether the server is reachable or not and
provide information useful to insure the server is reachable and the
data are fresh. The register is shifted left by one bit when a
packet is sent and the rightmost bit is set to zero. As valid
packets arrive, the rightmost bit is set to one. If the register
contains any nonzero bits, the server is considered reachable;
otherwise, it is unreachable. Since the peer poll interval might
have changed since the last packet, the poll_update() routine in
Section A.8 is called to re-determine the host poll interval.
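The behavior of the reach register can be sketched in a few lines of Python. The class and method names below are hypothetical and chosen only for this illustration; the logic follows the shift, set, and nonzero test described above.

```python
class Reach:
    """Sketch of the 8-bit p.reach shift register."""

    def __init__(self):
        self.reg = 0

    def on_poll_sent(self):
        # Shift left by one bit; the rightmost bit becomes zero and
        # anything shifted past bit 7 is lost.
        self.reg = (self.reg << 1) & 0xFF

    def on_valid_packet(self):
        # A valid packet sets the rightmost bit.
        self.reg |= 1

    def reachable(self):
        # Any nonzero bit means a packet arrived in the last eight polls.
        return self.reg != 0
```

A server that answers one poll and then falls silent remains reachable for the next seven polls and becomes unreachable on the eighth.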
According to the NTPv3 specification [1], if the Stratum field in the The on-wire protocol calculates the clock offset theta and roundtrip
NTP header is 1, indicating a primary server, the Reference delay del from the four most recent timestamps as shown in Figure 12.
Identifier field contains an ASCII string identifying the particular While it is in principle possible to do all calculations except the
reference clock type. However, in [1] nothing is said about the first-order timestamp differences in fixed-point arithmetic, it is
Reference Identifier field if the Stratum field is 0, which is called much easier to convert the first-order differences to floating
out as "unspecified". However, if the Stratum field is 0, the doubles and do the remaining calculations in that arithmetic, and
Reference Identifier field can be used to convey messages useful for this will be assumed in the following description. The dispersion
status reporting and access control. In NTPv4 and SNTPv4, packets of statistic epsilon(t) represents the maximum error due to the
this kind are called Kiss-o'-Death (KoD) packets and the ASCII frequency tolerance and time since the last measurement. It is
messages they convey are called kiss codes. The KoD packets got initialized
their name because an early use was to tell clients to stop sending
packets that violate server access controls.
The kiss codes can provide useful information for an intelligent epsilon(t_0) = rho_r + rho + cappsi(T_4-T_1)
client. These codes are encoded in four-character ASCII strings left
justified and zero filled. The strings are designed for character
displays and log files. A list of the currently-defined kiss codes
is given in Table 7.
+------+------------------------------------------------------------+ when the measurement is made at t_0. Here rho_r is the peer
| Code | Meaning | precision in the packet header r.precision and rho the system
+------+------------------------------------------------------------+ precision s.precision, both expressed in seconds. These terms are
| ACST | The association belongs to a unicast server | necessary to account for the uncertainty in reading the system clock
| AUTH | Server authentication failed | in both the server and the client. The dispersion then grows at
| AUTO | Autokey sequence failed | constant rate TOLERANCE (cappsi); in other words, at time t,
| BCST | The association belongs to a broadcast server | epsilon(t) = epsilon(t_0) + cappsi(t-t_0). With the default value
| CRYP | Cryptographic authentication or identification failed | cappsi = 15 PPM, this amounts to about 1.3 s per day. With this
| DENY | Access denied by remote server | understanding, the argument t will be dropped and the dispersion
| DROP | Lost peer in symmetric mode | represented simply as epsilon. The remaining statistics are computed
| RSTR | Access denied due to local policy | by the clock filter algorithm described in the next section.
| INIT | The association has not yet synchronized for the first |
| | time |
| MCST | The association belongs to a dynamically discovered server |
| NKEY | No key found. Either the key was never installed or is |
| | not trusted |
| RATE | Rate exceeded. The server has temporarily denied access |
| | because the client exceeded the rate threshold |
| RMOT | Alteration of association from a remote host running |
| | ntpdc. |
| STEP | A step change in system time has occurred, but the |
| | association has not yet resynchronized |
+------+------------------------------------------------------------+
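The on-wire calculations and the growth of dispersion described in the peer process text can be illustrated with the following Python sketch. The function names are assumptions made for this example. Timestamps are taken as floating-point seconds, precisions are passed as log2-seconds exponents, and TOLERANCE uses the 15 PPM default quoted in the text.

```python
TOLERANCE = 15e-6  # frequency tolerance, 15 PPM (default given in the text)


def on_wire(t1, t2, t3, t4, r_precision, s_precision):
    """Return (theta, del, epsilon) from the four timestamps.

    r_precision and s_precision are log2-seconds exponents (for example
    -18), converted to seconds before use.
    """
    theta = ((t2 - t1) + (t3 - t4)) / 2      # clock offset
    delay = (t4 - t1) - (t3 - t2)            # roundtrip delay
    epsilon = (2.0 ** r_precision + 2.0 ** s_precision
               + TOLERANCE * (t4 - t1))      # initial dispersion
    return theta, delay, epsilon


def dispersion_at(epsilon_0, t_0, t):
    """Dispersion at time t for a measurement made at t_0."""
    return epsilon_0 + TOLERANCE * (t - t_0)
```

With these definitions, a dispersion that starts at zero grows to 15e-6 * 86400 = 1.296 s after one day, in line with the figure of about 1.3 s per day.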
Table 7: Currently-defined NTP Kiss Codes 9. Clock Filter Algorithm
-----------------------
| clock filter |
-----------------------
\ | /
-----------------------
| shift sample theta, |
| del, epsilon, and t |
| into shift register |
-----------------------
\ | /
-----------------------
| copy filter to a |
| temporary list. sort|
| list by increasing |
| del. Let theta_i |
| del_i, epsilon_i, |
| t_i be the ith entry|
| on the sorted list. |
-----------------------
\ | /
----------------------- no
| t_0 > t? |----\
----------------------- |
\ | / yes |
----------------------- |
| theta = theta_0 | |
| del = del_0 | |
| epsilon | |
| = sum(epsilon_i) | |
| ---------- | |
| 2^(i+1) | |
| varphi | |
| = sqrt(1/7* ... | |
| ... sum( ... | |
| (theta_0-theta_i)^2 | |
| t = t_0 | |
----------------------- |
\ | / |
----------------------- |
| clock_select() | |
----------------------- |
\ | /<------------/
-----------------------
| return |
-----------------------
In general, an NTP client should stop sending to a particular server Figure 18: Clock Filter Processing
if that server returns a reply with a Stratum field of 0, regardless The clock filter algorithm grooms the stream of on-wire data to
of kiss code, and an alternate server is available. If no alternate select the samples most likely to represent the correct time. The
server is available, the client SHOULD increase the poll interval as algorithm produces the p.offset theta, p.delay del, p.dispersion
performance permits. epsilon, p.jitter varphi, and time of arrival p.t t used by the
mitigation algorithms to determine the best and final offset used to
discipline the system clock. They are also used to determine the
server health and whether it is suitable for synchronization. The
core processing steps of this algorithm are shown in Figure 18 with
more detail in the clock_filter() routine in Section A.5.
11. Security Considerations The clock filter algorithm saves the most recent sample tuples
(theta, del, epsilon, t) in an 8-stage shift register in the order
that packets arrive. Here t is the system timer, not the peer
variable of the same name. The following scheme is used to insure
sufficient samples are in the register and that old stale data are
discarded. Initially, the tuples of all stages are set to the dummy
tuple (0, MAXDISP, MAXDISP, t). As valid packets arrive, the (theta,
del, epsilon, t) tuples are shifted into the register causing old
samples to be discarded, so eventually only valid samples remain. If
the three low order bits of the reach register are zero, indicating
three poll intervals have expired with no valid packets received, the
poll process calls the clock filter algorithm with the dummy tuple
just as if the tuple had arrived from the network. If this persists
for eight poll intervals, the register returns to the initial
condition.
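A minimal Python sketch of the register handling follows. It assumes MAXDISP is 16 s and uses a deque with eight stages; the helper names new_filter(), shift_sample() and needs_dummy() are hypothetical and exist only for this illustration.

```python
from collections import deque

MAXDISP = 16.0   # maximum dispersion in seconds (assumed value)
NSTAGE = 8       # number of stages in the clock filter register


def new_filter(t):
    """Return a register whose stages all hold the dummy tuple."""
    dummy = (0.0, MAXDISP, MAXDISP, t)
    return deque([dummy] * NSTAGE, maxlen=NSTAGE)


def shift_sample(register, sample):
    """Shift a (theta, del, epsilon, t) tuple in; the oldest falls out."""
    register.appendleft(sample)


def needs_dummy(reach):
    """True when the three low-order bits of the reach register are zero."""
    return (reach & 0x07) == 0
```

Because the deque is bounded, eight consecutive dummy tuples push out every real sample and the register is back in its initial condition.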
NTPv4 provides an optional authentication field that utilizes the MD5 In the next step the shift register stages are copied to a temporary
algorithm. MD5, as the case for SHA-1, is derived from MD4, which list and the list sorted by increasing del. Let j index the stages
has long been known to be weak. In 2004, techniques for efficiently starting with the lowest del. If the sample epoch t_0 is not later
finding collisions in MD5 were announced. A summary of the weakness than the last valid sample epoch p.t, the routine exits without
of MD5 can be found in [15]. affecting the current peer variables. Otherwise, let epsilon_j be
the dispersion of the jth entry, then
                j=n-1
                 ---    epsilon_j
      epsilon =  \     -----------
                 /        (j+1)
                 ---     2
                j=0
In the case of NTP as specified herein, there is a vulnerability that is the peer dispersion p.disp. Note the overload of epsilon, whether
NTP broadcast clients can be disrupted by misbehaving or hostile SNTP input to the clock filter or output, the meaning should be clear from
or NTP broadcast servers elsewhere in the Internet. Access controls context.
and/or cryptographic authentication means should be provided for
additional security in such cases.
While not required in a conforming NTP client implementation, there The observer should note (a) if all stages contain the dummy tuple
are a variety of recommended checks that an NTP client can perform with dispersion MAXDISP, the computed dispersion is a little less
that are designed to avoid various types of abuse that might happen than 16 s, (b) each time a valid tuple is shifted into the register,
as the result of server implementation errors or malicious attack. the dispersion drops by a little less than half, depending on the
These recommended checks are as follows: valid tuple's dispersion, (c) after the fourth valid packet the
dispersion is usually a little less than 1 s, which is the assumed
value of the MAXDIST parameter used by the selection algorithm to
determine whether the peer variables are acceptable or not.
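The three observations can be checked numerically. The sketch below assumes MAXDISP is 16 s and takes the dispersions already ordered by increasing delay; the function name peer_dispersion() is chosen for this example only.

```python
MAXDISP = 16.0  # assumed value of the MAXDISP parameter, in seconds


def peer_dispersion(eps_sorted):
    """Weighted sum of stage dispersions: stage j has weight 1/2^(j+1)."""
    return sum(e / 2 ** (j + 1) for j, e in enumerate(eps_sorted))


# (a) all eight stages hold the dummy tuple
all_dummy = peer_dispersion([MAXDISP] * 8)

# (c) four valid samples with negligible dispersion, then four dummies
four_valid = peer_dispersion([0.0] * 4 + [MAXDISP] * 4)
```

Here all_dummy evaluates to 15.9375 s and four_valid to 0.9375 s, matching "a little less than 16 s" and "a little less than 1 s" when the valid samples themselves carry very small dispersion.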
When the IP source and destination addresses are available for the Let the first stage offset in the sorted list be theta_0; then, for
client request, they should match the interchanged addresses in the other stages in any order, the jitter is the RMS average
the server reply.
         +-----                                 -----+^1/2
         |             n-1                           |
         |             ---                           |
         |     1       \                      2      |
varphi = |  -------  * /   (theta_0 - theta_j)       |
         |   (n-1)     ---                           |
         |             j=1                           |
         +-----                                 -----+
When the UDP source and destination ports are available for the where n is the number of valid tuples in the register. In order to
client request, they should match the interchanged ports in the insure consistency and avoid divide exceptions in other computations,
server reply. the varphi is bounded from below by the system precision rho
expressed in seconds. While not in general considered a major factor
in ranking server quality, jitter is a valuable indicator of
fundamental timekeeping performance and network congestion state.
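The jitter calculation can be sketched as follows. The function name is hypothetical, the offsets are assumed to be those of the valid tuples with the first-stage offset theta_0 at index 0, and returning the precision when fewer than two valid samples exist is an assumption of this example.

```python
import math


def peer_jitter(thetas, precision):
    """RMS of offset differences relative to thetas[0], bounded below by
    the system precision (both in seconds)."""
    n = len(thetas)
    if n < 2:
        # No differences to average yet; fall back to the lower bound.
        return precision
    theta_0 = thetas[0]
    sum_sq = sum((theta_0 - theta_j) ** 2 for theta_j in thetas[1:])
    return max(math.sqrt(sum_sq / (n - 1)), precision)
```

The lower bound matters when all samples agree exactly: the RMS term is then zero, and later computations that divide by the jitter would otherwise fail.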
The Originate Timestamp in the server reply should match the Of particular importance to the mitigation algorithms is the peer
Transmit Timestamp used in the client request. synchronization distance, which is computed from the root delay and
root dispersion. The root delay is
A client can check the Root Delay and Root Dispersion fields are del ' = delta_r + del
each greater than or equal to 0 and less than infinity, where
infinity is is on the order of 15-20 seconds. This check avoids
using a server whose synchronization source has expired for a very
long time.
12. IANA Considerations and the root dispersion is
UDP/TCP Port 123 was previously assigned by IANA for this protocol. epsilon ' = E_r + epsilon + varphi
The IANA has assigned the IPv4 multicast group address 224.0.1.1 and
the IPv6 multicast address ending :101 for NTP.
This document identifies the set of defined 4-character (ASCII) Note that epsilon and therefore epsilon ' increase at rate capphi. The peer
Reference Identifier values. This document also defines the set of synchronization distance is defined
defined Kiss Codes. This document also introduces NTP extension
fields allowing for the development of future extensions to the
protocol, where a particular extension is to be identified by the
Field Type sub-field within the extension field.
IANA is requested to establish and maintain a registry for Reference lambda = (del ' / 2) + epsilon '
Identifiers, Kiss codes, and Extension Field Types associated with
this protocol, populating this registry from the Reference
Identifiers given in Section 3.9 and Kiss Codes given in Section 11
as the initial entries. The Extension Field Types registry will have
no initial entries. As future needs arise, new Reference
Identifiers, Kiss Codes, and Extension Field Types may be defined.
Following the policies outlined in [16], new values are to be defined
by IETF Consensus.
13. Acknowledgements and recalculated as necessary. The lambda is a component of the root
synchronization distance caplambda used by the mitigation algorithms
as a metric to evaluate the quality of time available from each
server. Note that there is no state variable for lambda, as it
depends on the time since the last update.
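A short Python sketch of the synchronization distance follows. It reads the dispersion term of lambda as the root dispersion epsilon ', in keeping with the statement that the distance is computed from the root delay and root dispersion; that reading, the function name and the argument names are assumptions of this example.

```python
def sync_distance(delta_r, eps_r, delay, epsilon, jitter):
    """Peer synchronization distance lambda, all arguments in seconds.

    delta_r, eps_r : root delay and root dispersion from the packet
    delay, epsilon : peer delay and dispersion from the clock filter
    jitter         : peer jitter varphi
    """
    root_delay = delta_r + delay             # del '
    root_disp = eps_r + epsilon + jitter     # epsilon '
    return root_delay / 2 + root_disp
```

Since epsilon grows with time since the last update, a caller would recompute this value whenever it is needed instead of caching it, which is why no state variable is kept for lambda.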
This document has drawn material from RFC 4330, "Simple Network Time 10. System Process
Protocol (SNTP) Version 4 for IPv4, IPv6 and OSI." As a result, the
authors would like to acknowledge D. Plonka of the University of
Wisconsin and J. Montgomery of Netgear, who were contributors. The
authors would also like to thank B. Haberman for providing rigorous
reviews of this document.
14. References As each new sample (theta, delta, epsilon, t) is produced by the
clock filter algorithm, the sample is processed by the mitigation
algorithms consisting of the selection, clustering, combining and
clock discipline algorithms in the system process. The selection
algorithm scans all associations and casts off the falsetickers,
which have demonstrably incorrect time, leaving the truechimers as
result. In a series of rounds the clustering algorithm discards the
association statistically furthest from the centroid until a minimum
number of survivors remain. The combining algorithm produces the
best and final offset on a weighted average basis and selects one of
the associations as the system peer providing the best statistics for
performance evaluation. The final offset is passed to the clock
discipline algorithm to steer the system clock to the correct time.
The statistics (theta, delta, epsilon, t) associated with the system
peer are used to construct the system variables inherited by
dependent servers and clients and made available to other
applications running on the same machine.
14.1. Normative References The discussion in following sections covers only the basic variables
and routines necessary for a conforming NTPv4 implementation.
Additional implementation details are in Section B.6. An interface
that might be considered in a formal specification is represented by
the function prototypes in Section B.1.
10.1. System Process Variables
The variables and parameters associated with the system process are
summarized in Figure 21, which gives the variable name, formula name
and short description. Unless noted otherwise, all variables are
assumed to have the prefix s.
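+-----------+----------+---------------------+
| Name      | Formula  | Description         |
+-----------+----------+---------------------+
| t         | t        | epoch               |
| leap      | leap     | leap indicator      |
| stratum   | stratum  | stratum             |
| precision | rho      | precision           |
| p         | p        | system peer pointer |
| offset    | captheta | combined offset     |
| jitter    | varsigma | combined jitter     |
| rootdelay | capdelta | root delay          |
| rootdisp  | E        | root dispersion     |
| refid     | refid    | reference ID        |
| reftime   | reftime  | reference time      |
| NMIN      | 3        | minimum survivors   |
| CMIN      | 1        | minimum candidates  |
+-----------+----------+---------------------+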
Figure 21: System Process Variables and Parameters
All the variables except s.t and s.p have the same format and
interpretation as the peer variables of the same name. The remaining
variables are defined below.
s.t: Integer representing the value of the system timer at the last
update.
s.p: System peer association pointer.
s.precision: 8-bit signed integer representing the precision of the
system clock, in log2 seconds.
s.offset: Offset computed by the combining algorithm.
s.jitter: Jitter computed by the cluster and combining algorithms.
The variables defined below are updated from the system peer process
as described later. They are interpreted in the same way as the
peer variables of the same names.
s.leap, s.stratum, s.rootdelay, s.rootdisp, s.refid, s.reftime
Initially, all variables are cleared to zero, then the s.leap is set
to 3 (unsynchronized) and s.stratum is set to MAXSTRAT (16). The
remaining statistics are determined as described below.
10.2. System Process Operations
The system process implements the selection, clustering, combining
and clock discipline algorithms. The clock_select() routine in
Figure 22 includes the selection algorithm of Section 9.2.1 that
produces a majority clique of truechimers based on agreement
principles. The clustering algorithm of Section 9.2.2 discards the
outliers of the clique to produce the survivors used by the combining
algorithm in Section 9.2.3, which in turn provides the final offset
for the clock discipline algorithm in Section 9.2.4. If the
selection algorithm cannot produce a majority clique, or if the
clustering algorithm cannot produce at least CMIN survivors, the
system process terminates with no further processing. If successful,
the clustering algorithm selects the statistically best candidate as
the system peer and its variables are inherited as the system
variables. The selection and clustering algorithms are described
below separately, but combined in the code skeleton.
-------------------------
| clock_select() |
-------------------------
\|/
-----------------------------------|---------------
| ----------- ---------------------- |
| /---| accept? | | scan candidates | |
| | ----------- | | |
| | yes no| | | |
| ----------- | | | |
| | add peer| | | | |
| ----------- | | | |
| | \|/ | | |
| \-------->----->| | |
| | | |
| selection algorithm ---------------------- |
| \|/ |
------------------------------------|--------------
no -----------------------
/--------------| survivors? |
| -----------------------
| \|/ yes
| -----------------------
| | clustering algorithm|
| -----------------------
| \|/
| -----------------------
|<---------yes-| n < CMIN? |
\|/ -----------------------
------------------------- \|/ no
| s.p = NULL | -----------------------
------------------------- | s.p = vo.p |
\|/ -----------------------
------------------------- \|/
| return (UNSYNC) | -----------------------
------------------------- | return (SYNC) |
-----------------------
Figure 22: clock_select() routine
10.2.1. Selection Algorithm
The selection algorithm operates to find the truechimers using
Byzantine agreement principles originally proposed by Marzullo [7],
but modified to improve accuracy. An overview of the algorithm is
listed below and in the first half of the clock_select() routine in
Section A.6.1. First, those servers which are unusable according to
the rules of the protocol are detected and discarded by the accept()
routine in Figure 23 and Section B.6.3. Next, a set of tuples {p,
type, edge} is generated for the remaining servers, where p is an
association pointer, type identifies the upper (+1), middle (0) or
lower (-1) endpoint of a correctness interval [theta - lambda,
theta + lambda], edge is the value of that endpoint, and lambda is
the root distance.
1. For each of m associations, construct a correctness interval
[(theta - rootdist()), (theta + rootdist())].
2. Select the lowpoint, midpoint and highpoint of these
intervals. Sort these values in a list from lowest to highest.
Set the number of falsetickers f = 0.
3. Set the number of midpoints d = 0. Set c = 0. Scan from
lowest endpoint to highest. Add one to c for every lowpoint,
subtract one for every highpoint, add one to d for every
midpoint. If c >= m - f, stop; set l = current lowpoint.
4. Set c = 0. Scan from highest endpoint to lowest. Add one to
c for every highpoint, subtract one for every lowpoint, add one
to d for every midpoint. If c >= m - f, stop; set u = current
highpoint.
5. Is d = f and l < u? If yes, then follow step 5y; else, follow
step 5n.
5y. Success: the intersection interval is [l, u].
5n. Add one to f. Is f < (m / 2)? If yes, then go to step 3
again. If no, then go to step 6.
6. Failure; a majority clique could not be found. Stop
algorithm.
The tuples are placed on a list and sorted by edge. The list is
processed from the lowest to the highest, then from highest to lowest
as described in detail in [8]. The algorithm starts with the
assumption that there are no falsetickers (f = 0) and attempts to
find a nonempty intersection interval containing the midpoints of all
correct servers, i.e., truechimers. If a nonempty interval cannot be
found, it increases the number of assumed falsetickers by one and
tries again. If a nonempty interval is found and the number of
falsetickers is less than the number of truechimers, a majority
clique has been found and the midpoints (offsets) represent the
survivors available for the clustering algorithm. Otherwise, there
are no suitable candidates to synchronize the system clock.
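The numbered steps of the selection algorithm can be transcribed fairly directly into Python. The sketch below is one literal reading of those steps and is not the clock_select() routine itself; the function name and the representation of candidates as (theta, lambda) pairs are assumptions made for this example.

```python
def intersection(candidates):
    """candidates: list of (theta, lam) pairs, lam being the root distance.
    Returns the intersection interval (l, u), or None on failure."""
    m = len(candidates)
    # Steps 1-2: endpoints tagged -1 (lowpoint), 0 (midpoint), +1 (highpoint).
    edges = []
    for theta, lam in candidates:
        edges += [(theta - lam, -1), (theta, 0), (theta + lam, +1)]
    edges.sort(key=lambda e: e[0])

    f = 0                                    # assumed number of falsetickers
    while True:
        d = 0                                # midpoints seen during the scans
        low = high = None
        c = 0
        for value, kind in edges:            # step 3: lowest to highest
            if kind == -1:
                c += 1
            elif kind == +1:
                c -= 1
            else:
                d += 1
            if c >= m - f:
                low = value
                break
        c = 0
        for value, kind in reversed(edges):  # step 4: highest to lowest
            if kind == +1:
                c += 1
            elif kind == -1:
                c -= 1
            else:
                d += 1
            if c >= m - f:
                high = value
                break
        # Step 5: success needs both scans to stop, d = f and l < u.
        if low is not None and high is not None and d == f and low < high:
            return low, high
        f += 1                               # step 5n
        if not f < m / 2:
            return None                      # step 6: no majority clique
```

For example, two servers near offset 5 and one near offset 25 yield the interval (4, 6) after one falseticker is assumed, while three mutually disjoint intervals produce no majority clique.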
--------------------
| accept() |
--------------------
\|/
--------------------
| leap = 11? |
| stratum >= |--any yes---\ server not
| MAXSTRAT? | | synchronized
-------------------- |
\|/ all no |
-------------------- |
| reach = 0? |---yes----->| server not
-------------------- | reachable
\|/ no |
-------------------- |
| root_dist() >= | |
| MAXDIST? |---yes----->| root distance
-------------------- | exceeded
\|/ no |
-------------------- |
| refid = addr? |---yes----->| server/client
-------------------- | sync loop
\|/ no |
-------------------- |
| return (YES) | -----------------------
-------------------- | return (NO) |
-----------------------
Figure 23: accept() routine
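The decision chain in Figure 23 can be read as a sequence of early returns. The following C fragment is a sketch only. The struct layout, the function name accept_candidate() and the parameter own_addr are hypothetical, and the values given for MAXSTRAT and MAXDIST are assumptions since this section does not state them.

```c
#define MAXSTRAT    16   /* assumed value: maximum stratum */
#define MAXDIST     1.0  /* assumed value: distance threshold (s) */
#define LEAP_NOSYNC 3    /* leap = 11 (binary): not synchronized */

/* Hypothetical subset of the peer variables used by the test. */
struct candidate {
    int           leap;      /* leap indicator */
    int           stratum;   /* stratum */
    unsigned      reach;     /* reach register */
    double        rootdist;  /* result of root_dist() */
    unsigned long refid;     /* reference ID */
};

/* Returns 1 (YES) if the candidate is acceptable, else 0 (NO). */
int accept_candidate(const struct candidate *p, unsigned long own_addr)
{
    if (p->leap == LEAP_NOSYNC || p->stratum >= MAXSTRAT)
        return 0;            /* server not synchronized */
    if (p->reach == 0)
        return 0;            /* server not reachable */
    if (p->rootdist >= MAXDIST)
        return 0;            /* root distance exceeded */
    if (p->refid == own_addr)
        return 0;            /* server/client sync loop */
    return 1;
}
```

The order of the tests follows the figure from top to bottom, so the cheapest checks run first and any single failure rejects the candidate.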
10.2.2. Clustering Algorithm
The members of the majority clique are placed on the survivor list,
and sorted first by stratum, then by root distance lambda. The
sorted list is processed by the clustering algorithm below and the
second half of the clock_select() algorithm in Section B.6.1.
1. Let (theta, phi, Lambda) represent a candidate peer with
offset theta, jitter phi and a weight factor Lambda = stratum *
MAXDIST + rootdist().
2. Sort the candidates by increasing Lambda. Let n be the number
of candidates and NMIN the minimum number of survivors.
3. For each candidate compute the selection jitter j_S (RMS
peer offset differences between this and all other candidates).
4. Select j_max as the candidate with maximum j_S.
5. Select j_min as the candidate with minimum peer jitter phi.
6. Is j_max < j_min or n <= NMIN? If yes, go to step 6y. If no,
go to step 6n.
6y. Done. The remaining cluster survivors are correct. The
survivors are in the v. structure sorted by Lambda.
6n. Delete the outlier candidate with j_max; reduce n by one, and
go back to step 3.
The clustering algorithm operates in a series of rounds, where each
round discards the furthest statistical outlier until a specified
minimum number of survivors NMIN (3) are left or until no further
improvement is possible. In each round let n be the number of
survivors and s index the survivor list. Assume j_p is the peer
jitter of the s-th survivor.
Compute
+----- -----+
| 1/2 |
| +----- -----+ |
| | n-1 | |
| | --- | |
| 1 | \ 2 | |
varphi_s = | -------- * | / (theta_s-theta_j) | |
| (n-1) | --- | |
| | j=1 | |
| +----- -----+ |
| |
+----- -----+
as the selection jitter. Then choose varphi_max as the largest
selection jitter and varphi_min as the smallest peer jitter among the
survivors. If varphi_max < varphi_min or n <= NMIN, no further
reduction in selection jitter is possible, so the algorithm
terminates and the remaining survivors are processed by the
combining algorithm. Otherwise, the algorithm casts off the
varphi_max survivor, reduces n by one and makes another round.
10.2.3. Combining Algorithm
---------------------
| clock_combine() |
---------------------
\|/
---------------------
| y = z = w = 0 |
---------------------
\|/
---------------------
| scan cluster | ------------------
| survivors |-->| x = rootdist() |
| | ------------------
| | \|/
| | ------------------
| |<--| y+= 1/x |
| | | z+=theta_i/x |
| | | w+=(theta_i - |
| | | theta_0)^2 |
--------------------- ------------------
\|/ done
-----------------------
| captheta = z/y |
| vartheta = sqrt(w/y)|
-----------------------
\|/
-----------------------
| return |
-----------------------
Variable     Process         Description
--------     -------         -----------
captheta     system          combined clock offset
vartheta_p   system          combined jitter
theta_0      survivor list   first survivor offset
theta_i      survivor list   ith survivor offset
x,y,z,w                      temporaries
Figure 25: clock_combine() routine
--------------------
| clock_update() |
--------------------
\|/
--------------------
/----no----->| p.t > s.t |
| --------------------
| \|/ yes
| --------------------
| | s.t = p.t |
| --------------------
| \|/
| --------------------
| | local_clock() |
| --------------------
| \|/
|<--------------------+-----------------\
| panic\|/ | adj step\|/
| ------------- | -------------------
| | panic exit| | | clear all assoc.|
| ------------- | -------------------
| ----------------- \|/
| |*update system | -----------------
| | variables | | leap = 3 |
| ----------------- | stratum = |
| \|/ | MAXSTRAT |
| | -----------------
\---------------------+----------------/
|
---------------
| return |
---------------
System Variables <-- System Peer Variables
leap <-- leap
stratum <-- stratum + 1
refid <-- refid
reftime <-- reftime
capdelta <-- capdelta_r + del
E <-- E_r+epsilon+cappsi*mu+varphi+|captheta|
* update system variables
Figure 26: clock_update() routine
The remaining survivors are processed by the clock_combine() routine
in Figure 25 and Section A.6.4 to produce the best and final data for
the clock discipline algorithm. The routine processes the peer
offset theta and jitter varphi to produce the system offset captheta
and system peer jitter vartheta_p, where each server statistic is
weighted by the reciprocal of the root distance and the result
normalized. The system peer jitter vartheta_p is a component of the
system jitter described later.
The system statistics are passed to the clock_update() routine in
Figure 26 and Section A.6.4. If there is only one survivor, the
offset passed to the clock discipline algorithm is captheta = theta
and the system peer jitter is vartheta=varphi. Otherwise, the
selection jitter vartheta_s is computed as in (8), where theta_0
represents the offset of the system peer and j ranges over the
survivors.
Peer Variables Client System Variables
---------------- -----------------
| theta = 1/2* |-------------------->| captheta = |
| [(T_2 - T_1)+| | (combine |
| (T_3 - T_4)] | | (theta_j)) |
---------------- -----------------
| del = [(T_4 -|--sum--------------->| capdelta= |
| T_1) - (T_3 -| /|\ | capdelta_r + |
| T_2)] | | | del |
---------------- | -----------------
| epsilon = | | | E = E_r + |
| rho_r + rho +| | | epsilon + |
| captheta*( | | | vartheta + |
| T_4 - T_1) |------------sum----->| absolutevalue(|
---------------- | /|\ | theta) |
| varphi = | | | -----------------
| sqrt((1/n)-1)*| | | | varphi_s = |
| (sum(theta_0)| | | | sqrt(1/(m-1)* |
| -theta_i)^2))|---|---\ | | sum(theta_0- |
---------------- | | | | theta_j)^2) |
/|\ | | | -----------------
| | | | \|/
| | \------------------>sum
server| | | |
---------------- | | \|/
| rho_r | | | |
---------------- | | -----------------
| capdelta_r |>--/ | | vartheta = |
---------------- | | sqrt( |
| E_r |>------------/ | (vartheta_p)^2|
---------------- | + |
| (vartheta_s)^2|
-----------------
Figure 27: System Variables Processing
The first survivor on the survivor list is selected as the system
peer, here represented by the statistics (theta, del, epsilon,
varphi). By rule, an update is discarded if its time of arrival p.t
is not strictly later than the last update used s.t. Let mu = p.t -
s.t be the time since the last update or update interval. If the
update interval is less than or equal to zero, the update is
discarded. Otherwise, the system variables are updated from the
system peer variables as shown in Figure 26. Note that s.stratum is
set to p.stratum plus one.
The arrows labeled IGNOR, PANIC, ADJ and STEP refer to return codes
from the local_clock() routine described in the next section. IGNOR
means the update has been ignored as an outlier. PANIC means the
offset is greater than the panic threshold PANICT (1000 s) and
normally causes the program to exit with a diagnostic message to the
system log. STEP means the offset is less than the panic threshold,
but greater than the step threshold STEPT (125 ms). Since this means
all peer data have been invalidated, all associations are reset and
the client begins as at initial start. ADJ means the offset is less
than the step threshold and thus a valid update for the local_clock()
routine described later. In this case the system variables are
updated as shown in Figure 26.
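The threshold tests behind the PANIC, STEP and ADJ codes can be sketched as a small C function. This is a simplification offered for illustration. The function name classify_offset() is hypothetical, and the sketch deliberately leaves out the IGNOR case and the state machine that decides when a large offset is treated as an outlier.

```c
#define STEPT  0.125    /* step threshold (s) */
#define PANICT 1000.0   /* panic threshold (s) */

enum rval { IGNOR, PANIC, STEP, ADJ };

/* Classify an offset in seconds by magnitude alone. */
enum rval classify_offset(double offset)
{
    double mag = (offset < 0.0) ? -offset : offset;

    if (mag > PANICT)
        return PANIC;   /* exit with a diagnostic message */
    if (mag > STEPT)
        return STEP;    /* reset all associations and restart */
    return ADJ;         /* valid update; update system variables */
}
```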
There is one exception not shown. The dispersion increment is
bounded from below by MINDISP. In subnets with very fast processors
and networks and very small dispersion and delay, this forces a
monotone-definite increase in E, which avoids loops between peers
operating at the same stratum.
Figure 27 shows how the error budget grows from the packet variables,
on-wire protocol and system peer process to produce the system
variables that are passed to dependent applications and clients. The
system jitter is defined
vartheta = sqrt((vartheta_p)^2+(vartheta_s)^2)
where vartheta_s is the selection jitter relative to the system peer.
The system jitter is passed to dependent application programs as the
nominal error statistic. The root delay capdelta and root dispersion
E statistics are relative to the primary server reference clock and
thus inherited by each server along the path. The system
synchronization distance is defined
caplambda = capdelta/2 + E
which is passed to dependent application programs as the maximum
error statistic.
10.2.4. Clock Discipline Algorithm
---------
thetar + | \ +----------------+
NTP --------->| Phase \ V_d | | V_s
thetac - | Detector ------>| Clock Filter |-----+
+-------->| / | | |
| | / +----------------+ |
| --------- |
| |
----- |
/ \ |
| VFO | |
\ / |
----- +-------------------------------------+ |
^ | Loop Filter | |
| | | |
| | +---------+ x +-------------+ | |
| | | |<-----| | | |
+------|-| Clock | y | Phase/Freq |<---|------+
| | Adjust |<-----| Prediction | |
| | | | | |
| +---------+ +-------------+ |
| |
+-------------------------------------+
Figure 28: Clock Discipline Feedback Loop
The NTPv4 clock discipline algorithm, shortened to discipline in the
following, functions as a combination of two philosophically quite
different feedback control systems. In a phase-locked loop (PLL)
design, periodic phase updates at update intervals mu are used
directly to minimize the time error and indirectly the frequency
error. In a frequency-locked loop (FLL) design, periodic frequency
updates at intervals mu are used directly to minimize the frequency
error and indirectly the time error. As shown in [8], a PLL usually
works better when network jitter dominates, while an FLL works better
when oscillator wander dominates. This section contains an outline
of how the NTPv4 design works. An in-depth discussion of the design
principles is provided in [8], which also includes a performance
analysis.
The clock discipline and clock adjust processes interact with the
other algorithms in NTPv4. The output of the combining algorithm
represents the best estimate of the system clock offset relative to
the server ensemble. The discipline adjusts the frequency of the VFO
to minimize this offset. Finally, the timestamps of each server are
compared to the timestamps derived from the VFO in order to calculate
the server offsets and close the feedback loop.
The discipline is implemented as the feedback control system shown in
Figure 28. The variable theta_r represents the combining algorithm
offset (reference phase) and theta_c the VFO offset (control phase).
Each update produces a signal Vd representing the instantaneous phase
difference theta_r - theta_c. The clock filter for each server
functions as a tapped delay line, with the output taken at the tap
selected by the clock filter algorithm. The selection, clustering
and combining algorithms combine the data from multiple filters to
produce the signal Vs. The loop filter, with impulse response F(t),
produces the signal Vc which controls the VFO frequency omega_c and
thus its phase theta_c = integral (omega_c, dt) which closes the
loop. The Vc signal is generated by the clock adjust process in
Section 9.3. The characteristic behavior of this model is
determined by F(t) and the various gain factors given in Section
A.6.6.
The transient behavior of the PLL/FLL feedback loop is determined by
the impulse response of the loop filter F(t). The loop filter shown
in Figure 29 predicts a phase adjustment x as a function of Vs. The
PLL predicts a frequency adjustment y_PLL as an integral of Vs*mu with
respect to t, while the FLL predicts an adjustment y_FLL as a function
of Vs/mu. The two adjustments are combined to correct the frequency
y as shown in Figure 29. The x and y are then used by the
clock_adjust() routine to control the VFO frequency. The detailed
equations that implement these functions are best presented in the
routines of Sections A.6.6 and A.7.1.
x <------(Phase Correction)<--.
|
y_FLL |
.-(FLL Predict)<-------+<--V_s
| |
\|/ |
y <--(Sum) |
^ |
| |
'-(PLL Predict)<-------'
y_PLL
Figure 29: Clock Discipline Loop Filter
Ordinarily, the pseudo-linear feedback loop described above operates
to discipline the system clock. However, there are cases where a
nonlinear algorithm offers considerable improvement. One case is
when the discipline starts without knowledge of the intrinsic clock
frequency. The pseudo-linear loop takes several hours to develop an
accurate measurement and during most of that time the poll interval
cannot be increased. The nonlinear loop described below does this in
15 minutes. Another case is when occasional bursts of large jitter
are present due to congested network links. The state machine
described below resists error bursts lasting less than 15 minutes.
The remainder of this section describes how the discipline works.
Figure 30 contains a summary of the variables and parameters
including the program name, formula name and short description.
Unless noted otherwise, all variables have assumed prefix c. The
variables c.t, c.tc, c.state, and c.count are integers; the remainder
are floating doubles. The function of each will be explained in the
algorithm descriptions below.
Name Formula Description
---- ------- -----------
t timer seconds counter
offset captheta combined offset
resid captheta_r residual offset
freq phi clock frequency
jitter varphi clock jitter
wander cappsi frequency wander
tc tau time constant(log2)
state state state
adj adj frequency adjustment
count count hysteresis counter
STEPT 125 step threshold (.125 s)
WATCH 900 stepout thresh(s)
PANICT 1000 panic threshold(1000 s)
LIMIT 30 hysteresis limit
PGATE 4 hysteresis gate
TC 16 time constant scale
AVG 8 averaging constant
Figure 30
=====================================================================
| State | captheta < STEP | captheta > STEP | Comments |
---------------------------------------------------------------------
| NSET | > FREQ; adjust | > FREQ; step | no frequency |
| | time | time | file |
---------------------------------------------------------------------
| FSET | > SYNC; adjust | > SYNC; step | frequency file |
| | time | time | |
---------------------------------------------------------------------
| SPIK | > SYNC; adjust | if (<900 s)>SPIK | outlier detected |
| | freq, adjust time | else SYNC; step | |
| | | freq; step time | |
---------------------------------------------------------------------
| FREQ | if (<900 s)> FREQ | if (<900 s)>FREQ | initial frequency |
| | else >SYNC; step | else >SYNC; step | |
| | freq, adjust time | freq, adjust time | |
---------------------------------------------------------------------
| SYNC | >SYNC; adjust freq| if (<900 s)>SPIK | normal operation |
| | adjust time | else >SYNC; step | |
| | | freq; step time | |
---------------------------------------------------------------------
Figure 31
The discipline is implemented by the local_clock() routine, which is
called from the clock_update() routine. The local_clock() routine
pseudo code in Section B.6.6 has two parts; first the state machine
shown in Figure 32 and second the algorithm that determines the time
constant and thus the poll interval in Figure 33. The state
transition function in Figure 32 is implemented by the rst() function
shown at the lower left of the figure. The local_clock() routine
exits immediately if the offset is greater than the panic threshold.
---
| A |
---
||
\/
--- yes ---
| B |-->| C |
--- ---
no ||
\/
---
| D |
---
||
\/
--- no --- yes SYNC SPIK FREQ
| E |<--| F |----------------------------------
--- --- || ||
SYNC || \/ \/
SPIKE FSET \/ FREQ NSET --- ---
------------------------- | G | | H |
|| || || || --- ---
|| || \/ \/ || yes || || no
|| || --- --- || || \/
|| --- | H | | I | || || ---
\/ | I | --- --- || || | J |
--- --- no || ||yes || || || ---
| K | || || || \/ || || || || yes
--- || \/ || --- || || || \/
|| || --- || | L | || || || ---
|| || | M ||| --- || || || | M |
|| || --- || || || || || ---
|| || || \/ \/ \/ \/ || ||
|| || || ------------>\/<----------- \/ \/
|| || || --- --->\/<-----
|| || || | N | ---
|| || || --- | O |
|| || || ---
|| || || ||
|| || || \/
|| || || --- --- ---
----->-------->----| P |----><--------| Q |<------| R |
--- || --- ---
--- \/ ||
| S | --- \/
--- | T | ---
|| --- | U |
\/ ---
--- ||
| V | \/
--- ---
|| | W |
\/ ---
---
| X |
---
A: local_clock()
B: |captheta|>PANICT?
C: return(PANIC)
D: freq=0
rval=IGNOR
E:
F: |captheta|>STEPT?
G: state=SPIK
H: mu<WATCH
I: captheta_g=captheta
J: FREQ?
K: Calculate new freq adjustment from captheta, tau, and mu using
hybrid PLL and FLL
L: rst(FREQ,0)
M: freq=((captheta-captheta_B-captheta_R)/mu)
N: return(rval)
O: step_time(captheta)
rval=STEP
P: rval=ADJ
Q: rst(SYNC,0)
R: state=NSET?
S: rst(new,off)
T: tc
U: rst(FREQ,0)
V: state=new
captheta_B=off-captheta_R
captheta_R=off
W: return(rval)
X: return
Figure 32: local_clock() routine (1 of 2)
-----
| A |
-----
\|/
-----
| B |
-----
\|/
-----
| C |-no-----\
----- |
\|/yes |
----- -----
| D | | E |
----- -----
\|/ \|/
----- -----
| F |no\ | G |no\
----- | ----- |
\|/yes| \|/yes|
| | | |
----- | ----- |
| H | | | I | |
----- | ----- |
| J | | | K | |
----- | ----- |
|y no-><-no y| |
---- | ---- |
| L| | | M| |
-------><---------/
\|/
-----
| N |
-----
\|/
-----
| O |
-----
\|/
-----
| P |
-----
A: tc
B: state=SYNC
C: |captheta_g| > PGATE?
D: count -= 2*tau
E: count += tau
F: count <= -LIMIT?
G: count >= LIMIT?
H: count = 0
I: count = 0
J: tau>MINPOLL
K: tau<MAXPOLL
L: tau--
M: tau++
N: phi += freq
O: cappsi = sqrt(expectationvalue(phi^2))
P: return(rval)
Figure 33: local_clock() routine (2 of 2)
The remaining portion of the local_clock() routine is shown in
Figure 33. The time constant tau is determined by comparing the
clock jitter varphi with the magnitude of the current residual offset
captheta_R produced by the clock adjust routine in the next
section. If the residual offset is greater than PGATE (4) times the
clock jitter, the hysteresis counter is reduced by two; otherwise, it
is increased by one. If the hysteresis counter increases to the
upper limit LIMIT (30), the time constant is increased by one; if it
decreases to the lower limit -LIMIT (-30), the time constant is
decreased by one. Normally, the time constant hovers near MAXPOLL,
but quickly decreases if frequency surges occur, due to a temperature
spike, for example.
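The hysteresis mechanism can be sketched in C as follows. The sketch follows the boxes of Figure 33, where the counter moves by tau and 2*tau, whereas the paragraph above describes steps of one and two; which of the two is intended is not settled here. The names poll_adjust() and struct disc are hypothetical, and MINPOLL and MAXPOLL are passed as parameters because their values are not given in this section.

```c
#define PGATE 4    /* hysteresis gate */
#define LIMIT 30   /* hysteresis limit */

/* Hypothetical subset of the clock discipline variables. */
struct disc {
    int tc;      /* time constant tau (log2) */
    int count;   /* hysteresis counter */
};

/*
 * Adjust the time constant from the residual offset and clock jitter.
 * minpoll and maxpoll bound the time constant.
 */
void poll_adjust(struct disc *c, double resid, double jitter,
                 int minpoll, int maxpoll)
{
    double mag = (resid < 0.0) ? -resid : resid;

    if (mag > PGATE * jitter) {
        /* Offset is large relative to jitter: lean toward shorter tau. */
        c->count -= 2 * c->tc;
        if (c->count <= -LIMIT) {
            c->count = 0;
            if (c->tc > minpoll)
                c->tc--;
        }
    } else {
        /* Offset is within the gate: lean toward longer tau. */
        c->count += c->tc;
        if (c->count >= LIMIT) {
            c->count = 0;
            if (c->tc < maxpoll)
                c->tc++;
        }
    }
}
```

Because the downward step is twice the upward step, the time constant falls about twice as fast as it rises, which matches the stated goal of reacting quickly to frequency surges.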
The clock jitter statistic vartheta and the clock wander statistic
cappsi are implemented as exponential averages of RMS offset
differences and RMS frequency differences, respectively. Let x_i be
a measurement at time i of either vartheta or cappsi, y_i = x_i -
x_(i-1) the first-order sample difference and y_i_HAT the exponential
average. Then,
y_(i+1)_HAT = sqrt((y_i_HAT)^2 + [(y_i)^2 - (y_i_HAT)^2]/AVG)
where AVG (4) is the averaging parameter in Figure 30, is the
exponential average at time i + 1. The clock jitter statistic is
used by the poll-adjust algorithm above; the clock wander statistic
is used only for performance monitoring.
10.3. Clock Adjust Process
-----
| A |
-----
\|/
-----
| B |
-----
\|/
-----
| C |
-----
\|/
-----
| D |
-----
\|/
-----
| E |
-----
\|/
-----
| F |-----no----\
----- |
\|/yes \|/
----- -----
| H |<--------| G |
----- -----
A: clock_adjust()
B: E += capphi
C: tmp = captheta_R/TC(tau)
D: captheta_R -= tmp
E: adjust_time(phi + tmp)
F: next < timer?
G: poll()
H: return
Figure 34: clock_adjust() Routine
The actual clock adjustment is performed by the clock_adjust()
routine shown in Figure 34 and Section B.7.1. It runs at one-second
intervals to add the frequency offset in Figure 33 and a fixed
percentage of the residual offset captheta_R. The captheta_R is in
effect the exponential decay of the captheta value produced by the
loop filter at each update. The TC parameter scales the time
constant to match the poll interval for convenience. Note that the
dispersion E increases by capphi at each second.
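The once-per-second arithmetic can be sketched in C as shown below. This is an illustration only. The names clock_adjust_tick() and struct clock_state are hypothetical, and both the dispersion increment and the divisor derived from TC and tau are passed in as parameters because their exact definitions are not given in this section.

```c
/* Hypothetical subset of the state touched once per second. */
struct clock_state {
    double rootdisp;   /* dispersion E */
    double resid;      /* residual offset captheta_R */
    double freq;       /* frequency offset phi */
};

/*
 * One clock adjust tick.
 *   phi_tol  : dispersion increment per second (capphi)
 *   tc_scale : divisor derived from TC and tau, assumed > 1
 * Returns the amount to hand to the time adjustment for this second.
 */
double clock_adjust_tick(struct clock_state *c, double phi_tol,
                         double tc_scale)
{
    double tmp;

    c->rootdisp += phi_tol;       /* dispersion grows every second */
    tmp = c->resid / tc_scale;    /* fixed fraction of the residual */
    c->resid -= tmp;              /* remainder decays exponentially */
    return c->freq + tmp;
}
```

After k ticks without a new update the residual is the starting value multiplied by (1 - 1/tc_scale) raised to the power k, which is the exponential decay the paragraph describes.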
The clock adjust process includes a timer interrupt facility driving
the system timer c.t. It begins at zero when the service starts and
increments once each second. At each interrupt the clock_adjust()
routine is called to incorporate the clock discipline time and
frequency adjustments, then the associations are scanned to determine
if the system timer equals or exceeds the p.next state variable
defined in the next section. If so, the poll process is called to
send a packet and compute the next p.next value.
11. Poll Process
Each association supports a poll process that runs at regular
intervals to construct and send packets in symmetric, client and
broadcast server associations. It runs continuously, whether or not
servers are reachable. The discussion in this section covers only
the variables and routines necessary for a conforming NTPv4
implementation. Additional implementation details are in Section
B.8. Further details and rationale for the engineering design are
discussed in [8].
Name Formula Description
---- ------- -----------
hpoll hpoll host poll exponent
last last last poll time
next next next poll time
reach reach reach register
unreach unreach unreach counter
UNREACH 24 unreach limit
BCOUNT 8 burst count
BURST flag burst enable
IBURST flag iburst enable
Figure 35
11.1. Poll Process Variables and Parameters
The poll process variables are allocated in the association data
structure along with the peer process variables. Figure 35 shows the
names, formula names and short definition for each one. Following is
a detailed description of the variables, all of which carry the p
prefix.
p.hpoll: Signed integer representing the poll exponent, in log2
seconds.
p.last: Integer representing the system timer value when the most
recent packet was sent.
p.next: Integer representing the system timer value when the next
packet is to be sent.
p.reach: 8-bit integer shift register. When a packet is sent, the
register is shifted left one bit, with zero entering from the right
and overflow bits discarded.
p.unreach: Integer representing the number of seconds the server has
been unreachable.
11.2. Poll Process Operations
As described previously, once each second the clock_adjust() routine
is called. This routine calls the poll() routine in Section B.8.1
for each association in turn. If the time for the next poll message
is greater than the system timer, the routine returns immediately. A
mode-5 (broadcast server) association always sends a packet; a
mode-6 (broadcast client) association never sends a packet, but still
runs the routine to update the p.reach and p.unreach variables. The
poll() routine calls the peer_xmit() routine in Section B.8.3 to send
a packet. If in a burst (p.burst > 0), nothing further is done
except call the poll_update() routine to set the next poll interval.
If not in a burst, the p.reach variable is shifted left by one bit,
with zero replacing the rightmost bit. If the server has not been
heard for the last three poll intervals, the clock_filter() routine
is called to increase the dispersion as described in Section 8.3. If
the BURST flag is lit and the server is reachable and a valid source
of synchronization is available, the client sends a burst of BCOUNT
(8) packets at each poll interval. This is useful to accurately
measure jitter with long poll intervals. If the IBURST flag is lit
and this is the first packet sent when the server becomes
unreachable, the client sends a burst. This is useful to quickly
reduce the synchronization distance below the distance threshold and
synchronize the clock. The figure also shows the mechanism which
backs off the poll interval if the server becomes unreachable. If
p.reach is nonzero, the server is reachable and p.unreach is set to
zero; otherwise, p.unreach is incremented by one for each poll to the
maximum UNREACH (24). Thereafter for each poll p.hpoll is increased
by one, which doubles the poll interval up to the maximum MAXPOLL
determined by the poll_update() routine. When the server again
becomes reachable, p.unreach is set to zero, p.hpoll is reset to tau
and operation resumes normally.
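The reach register and back-off bookkeeping can be sketched in C as follows. This is a sketch under assumptions and not the poll() routine itself. The names poll_reach_update() and struct assoc are hypothetical, MAXPOLL is passed as a parameter, and the code that sets the rightmost bit of the register when a packet arrives is outside the scope of the sketch.

```c
#include <stdint.h>

#define UNREACH 24   /* unreach limit */

/* Hypothetical subset of the poll process variables. */
struct assoc {
    uint8_t reach;     /* 8-bit reach shift register */
    int     unreach;   /* unreach counter */
    int     hpoll;     /* host poll exponent */
};

/*
 * Bookkeeping done when a poll is sent outside a burst.
 * Returns 1 if the server was not heard on the last three polls,
 * in which case the caller would increase the dispersion.
 */
int poll_reach_update(struct assoc *p, int maxpoll)
{
    /* Low three bits record the replies to the last three polls. */
    int silent = (p->reach & 0x07) == 0;

    /* Shift left; zero enters from the right, overflow is dropped. */
    p->reach = (uint8_t)(p->reach << 1);

    if (p->reach != 0) {
        p->unreach = 0;              /* server is reachable */
    } else if (p->unreach < UNREACH) {
        p->unreach++;                /* count polls while unreachable */
    } else if (p->hpoll < maxpoll) {
        p->hpoll++;                  /* back off: double the interval */
    }
    return silent;
}
```

A server that stops answering therefore drains out of the register after eight polls, is then counted as unreachable for UNREACH further polls, and only after that has its poll interval doubled step by step.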
When a packet is sent from an association, some header values are
copied from the peer variables left by a previous packet and others
from the system variables. includes a flow diagram and a table
showing which values are copied to each header field. In those
implementations using floating double data types for root delay and
root dispersion, these must be converted to NTP short format. All
other fields are either copied intact from peer and system variables
or struck as a timestamp from the system clock.
The poll_update() routine shown in Section B.8.2 is called when a
valid packet is received and immediately after a poll message is
sent. If in a burst, the poll interval is fixed at 2 s; otherwise,
the host poll exponent is set to the minimum of p.poll from the last
packet received and p.hpoll from the poll() routine, but not less
than MINPOLL nor greater than MAXPOLL. Thus the clock discipline can
be oversampled, but not undersampled. This is necessary to preserve
subnet dynamic behavior and protect against protocol errors.
Finally, the poll exponent is converted to an interval, which
establishes the time of the next poll, p.next.
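The clamping rule can be sketched in C as follows. This is a minimal illustration, not the routine from the appendix. The names poll_update_sketch() and struct poll_state are hypothetical, MINPOLL and MAXPOLL are passed as parameters, and the next poll time is assumed to be measured from the time the last packet was sent.

```c
/* Hypothetical subset of the poll process variables. */
struct poll_state {
    int  hpoll;   /* host poll exponent (log2 s) */
    int  burst;   /* packets remaining in the current burst */
    long last;    /* system timer value at the last poll */
    long next;    /* system timer value at the next poll */
};

/*
 * Set the next poll time.  peer_poll is the poll exponent from the
 * last packet received (p.poll).
 */
void poll_update_sketch(struct poll_state *p, int peer_poll,
                        int minpoll, int maxpoll)
{
    int e;

    if (p->burst > 0) {
        p->next = p->last + 2;    /* fixed 2 s interval in a burst */
        return;
    }
    /* Use the smaller exponent so the loop is never undersampled. */
    e = (peer_poll < p->hpoll) ? peer_poll : p->hpoll;
    if (e < minpoll) e = minpoll;
    if (e > maxpoll) e = maxpoll;
    p->hpoll = e;
    p->next = p->last + (1L << e);   /* exponent to seconds */
}
```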
12. Security Considerations
NTPv4 provides an optional authentication field that utilizes the MD5
algorithm. MD5, as is the case for SHA-1, is derived from MD4, which
has long been known to be weak. In 2004, techniques for efficiently
finding collisions in MD5 were announced. A summary of the weakness
of MD5 can be found in [9].
In the case of NTP as specified herein, NTP broadcast clients are
vulnerable to disruption by misbehaving or hostile SNTP or NTP
broadcast servers elsewhere in the Internet. Access controls and/or
cryptographic authentication means should be provided for additional
security in such cases.
13. IANA Considerations
UDP/TCP Port 123 was previously assigned by IANA for this protocol.
The IANA has assigned the IPv4 multicast group address 224.0.1.1 and
the IPv6 multicast address ending :101 for NTP. This document
introduces NTP extension fields allowing for the development of
future extensions to the protocol, where a particular extension is to
be identified by the Field Type sub-field within the extension field.
IANA is requested to establish and maintain a registry for Extension
Field Types associated with this protocol, populating this registry
with no initial entries. As future needs arise, new Extension Field
Types may be defined. Following the policies outlined in [10], new
values are to be defined by IETF Consensus.
14. Acknowledgements
The authors would like to thank Brian Haberman, Greg Dowd, Mark
Elliot, and Harlan Stenn for technical reviews of this document.
15. References
15.1. Normative References
[1] Mills, D., "Network Time Protocol (Version 3) Specification,
Implementation", RFC 1305, March 1992.
15.2. Informative References
[2] Mills, D., "Simple Network Time Protocol (SNTP) Version 4 for
IPv4, IPv6 and OSI", RFC 4330, January 2006.
[3] University of Delaware, "The Autokey security architecture,
protocol and algorithms. Electrical and Computer Engineering
Technical Report 06-1-1", NDSS, January 2006.
[4] Bradner, S., "Key words for use in RFCs to Indicate Requirement
Levels", BCP 14, RFC 2119, March 1997.
[5] Postel, J., "Internet Protocol", STD 5, RFC 791,
September 1981.
[6] Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321,
April 1992.
[7] Marzullo and S. Owicki, "Maintaining the time in a distributed
system.", ACM Operating Systems Review 19, July 1985.
[8] Mills, D. L., "Computer Network Time Synchronization - the
Network Time Protocol. CRC Press, 304pp.", 2006.
[9] Bellovin, S. and E. Rescorla, Proceedings of the 13th annual
ISOC Network and Distributed System Security Symposium,
"Deploying a new Hash Algorithm", February 2006.
[10] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA
Considerations Section in RFCs", BCP 26, RFC 2434,
October 1998.
Appendix A. Code Skeleton
This appendix is intended to describe the protocol and algorithms of
an implementation in a general way using what is called a code
skeleton program. This consists of a set of definitions, structures
and code segments which illustrate the protocol operations without
the complexities of an actual implementation of the protocol. This
program is not an executable and is not designed to run in the
ordinary sense. It is designed to be compiled only in order to
verify consistent variable and type usage. The program is not
intended to be fast or compact, just to demonstrate the algorithms
with sufficient fidelity to understand how they work.
The code skeleton consists of five segments: a header segment
included by each of the other segments, plus a code segment for the
main program and peer, system, clock_adjust and poll processes.
These are presented in order below along with definitions and
variables specific to each process.
A.1. Global Definitions
Following are definitions and other data shared by all programs.
These values are defined in a header file ntp4.h which is included in
all files.
A.2. Definitions, Constants, Parameters
Proceedings of the 13th Annual ISOC Network and Distributed #include <math.h> s/* avoids complaints about sqrt() */
System Security Symposium (NDSS) , February 2006. #include <sys/time.h> /* for gettimeofday() and friends */
#include <stdlib.h> /* for malloc() and friends */
[16] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA /*
Considerations Section in RFCs", BCP 26, RFC 2434, * Data types
October 1998. *
* This program assumes the int data type is 32 bits and the long data
* type is 64 bits. The native data type used in most calculations is
* floating double. The data types used in some packet header fields
* require conversion to and from this representation. Some header
* fields involve partitioning an octet, here represented by individual
* octets.
*
* The 64-bit NTP timestamp format used in timestamp calculations is
* unsigned seconds and fraction with the decimal point to the left of
* bit 32. The only operation permitted with these values is
* subtraction, yielding a signed 31-bit difference. The 32-bit NTP
* short format used in delay and dispersion calculations is seconds and
* fraction with the decimal point to the left of bit 16. The only
* operations permitted with these values are addition and
* multiplication by a constant.
*
* The IPv4 address is 32 bits, while the IPv6 address is 128 bits. The
* message digest field is 128 bits as constructed by the MD5 algorithm.
* The precision and poll interval fields are signed log2 seconds.
*/
Appendix A. NTP Control Messages typedef unsigned long tstamp;
typedef unsigned int tdist;
typedef unsigned long ipaddr;
typedef unsigned int ipport;
typedef unsigned long digest;
typedef signed char s_char;
In a comprehensive network-management environment, facilities are /*
presumed available to perform routine NTP control and monitoring * Arithmetic conversion macroni
functions, such as setting the leap-indicator bits at the primary */
servers, adjusting the various system parameters and monitoring
regular operations. Ordinarily, these functions can be implemented
using a network-management protocol such as SNMP and suitable
extensions to the MIB database. However, in those cases where such
facilities are not available, these functions can be implemented
using special NTP control messages described herein. These messages
are intended for use only in systems where no other management
facilities are available or appropriate, such as in dedicated-
function bus peripherals. Support for these messages is not required
in order to conform to this specification.
The NTP Control Message has the value 6 specified in the mode field /* NTP timestamp format */
of the first octet of the NTP header and is formatted as shown in /* NTP short format */
Section 10.1. The format of the data field is specific to each /* IPv4 or IPv6 address */
command or response; however, in most cases the format is designed to /* IP port number */
be constructed and viewed by humans and so is coded in free-form /* md5 digest */
ASCII. This facilitates the specification and implementation of /* precision and poll interval (log2) */
simple management tools in the absence of fully evolved network-
management facilities. As in ordinary NTP messages, the
authenticator field follows the data field. If the authenticator is
used the data field is zero-padded to a 32-bit boundary, but the
padding bits are not considered part of the data field and are not
included in the field count.
IP hosts are not required to reassemble datagrams larger than 576 #define LOG2D(a) ((a) < 0 ? 1. / (1L << -(a)) : \
octets; however, some commands or responses may involve more data
than will fit into a single datagram. Accordingly, a simple
reassembly feature is included in which each octet of the message
data is numbered starting with zero. As each fragment is transmitted
the number of its first octet is inserted in the offset field and the
number of octets is inserted in the count field. The more-data (M)
bit is set in all fragments except the last.
Most control functions involve sending a command and receiving a 1L << (a)) /* poll, etc. */
response, perhaps involving several fragments. The sender chooses a
distinct, nonzero sequence number and sets the status field and R and
E bits to zero. The responder interprets the opcode and additional
information in the data field, updates the status field, sets the R
bit to one and returns the three 32-bit words of the header along
with additional information in the data field. In case of invalid
message format or contents the responder inserts a code in the status
field, sets the R and E bits to one and, optionally, inserts a
diagnostic message in the data field.
Some commands read or write system variables and peer variables for #define LFP2D(a) ((double)(a) / 0x100000000L) /* NTP timestamp */
an association identified in the command. Others read or write
variables associated with a radio clock or other device directly
connected to a source of primary synchronization information. To
identify which type of variable and association a 16-bit association
identifier is used. System variables are indicated by the identifier
zero. As each association is mobilized a unique, nonzero identifier
is created for it. These identifiers are used in a cyclic fashion,
so that the chance of using an old identifier which matches a newly
created association is remote. A management entity can request a
list of current identifiers and subsequently use them to read and
write variables for each association. An attempt to use an expired
identifier results in an exception response, following which the list
can be requested again.
Some exception events, such as when a peer becomes reachable or #define D2LFP(a) ((tstamp)((a) * 0x100000000L))
unreachable, occur spontaneously and are not necessarily associated
with a command. An implementation may elect to save the event
information for later retrieval or to send an asynchronous response
(called a trap) or both. In case of a trap the IP address and port
number is determined by a previous command and the sequence field is
set as described below. Current status and summary information for
the latest exception event is returned in all normal responses. Bits
in the status field indicate whether an exception has occurred since
the last response and whether more than one exception has occurred.
Commands need not necessarily be sent by an NTP peer, so ordinary #define FP2D(a) ((double)(a) / 0x10000L) /* NTP short */
access-control procedures may not apply; however, the optional mask/ #define D2FP(a) ((tdist)((a) * 0x10000L))
match mechanism suggested elsewhere in this document provides the #define SQUARE(x) ((x) * (x))
capability to control access by mode number, so this could be used to #define SQRT(x) (sqrt(x))
limit access for control messages (mode 6) to selected address
ranges.
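The conversion macros above rely on the long data type being 64 bits wide. As a minimal sketch, assuming a C99 compiler, the same conversions can be written with the fixed-width types from <stdint.h>. The type and function names below are illustrative only and are not part of ntp4.h.

```c
#include <stdint.h>

typedef uint64_t tstamp64; /* hypothetical: NTP timestamp, 32.32 fixed point */
typedef uint32_t tdist32;  /* hypothetical: NTP short format, 16.16 fixed point */

/* NTP timestamp (32.32 fixed point) to seconds as a double. */
static double lfp_to_double(tstamp64 a)
{
    return (double)a / 4294967296.0; /* divide by 2^32 */
}

/* Non-negative seconds to NTP timestamp; the fraction is truncated. */
static tstamp64 double_to_lfp(double a)
{
    return (tstamp64)(a * 4294967296.0);
}

/* NTP short format (16.16 fixed point) to seconds as a double. */
static double fp_to_double(tdist32 a)
{
    return (double)a / 65536.0; /* divide by 2^16 */
}

/* Non-negative seconds to NTP short format; the fraction is truncated. */
static tdist32 double_to_fp(double a)
{
    return (tdist32)(a * 65536.0);
}

/* Signed log2 seconds (precision, poll interval) to seconds. */
static double log2_to_double(int a)
{
    return a < 0 ? 1.0 / (double)(1ULL << -a) : (double)(1ULL << a);
}
```

With these helpers a poll exponent of 4 maps to 16 seconds, which agrees with the MINPOLL comment in the constants that follow.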
A.1. NTP Control Message Format /*
* Global constants. Some of these might be converted to variables
* which can be tinkered by configuration or computed on the fly. For
* instance, PRECISION could be calculated on the fly and
* provide performance tuning for the defines marked with % below.
*/
#define VERSION 4 /* version number */
#define PORT 123 /* NTP port number */
#define MINDISP .01 /* % minimum dispersion (s) */
#define MAXDISP 16 /* % maximum dispersion (s) */
#define MAXDIST 1 /* % distance threshold (s) */
#define NOSYNC 3 /* leap unsync */
#define MAXSTRAT 16 /* maximum stratum (infinity metric) */
#define MINPOLL 4 /* % minimum poll interval (16 s)*/
#define MAXPOLL 17 /* % maximum poll interval (36.4 h) */
#define PHI 15e-6 /* % frequency tolerance (15 PPM) */
#define NSTAGE 8 /* clock register stages */
#define NMAX 50 /* % maximum number of peers */
#define NSANE 1 /* % minimum intersection survivors */
#define NMIN 3 /* % minimum cluster survivors */
The format of the NTP Control Message header, which immediately /*
follows the UDP header, is shown in Figure 7. Following is a * Global return values
description of its fields. Bit positions marked as zero are reserved */
and should always be transmitted as zero. #define TRUE 1 /* boolean true */
#define FALSE 0 /* boolean false */
#define NULL 0 /* empty pointer */
0 1 2 3 /*
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 * Local clock process return codes
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ */
|00 | VN | 6 | REM | Op | Sequence | #define IGNORE 0 /* ignore */
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ #define SLEW 1 /* slew adjustment */
| Status | Association ID | #define STEP 2 /* step adjustment */
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ #define PANIC 3 /* panic - no adjustment */
| Offset | Count |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
. .
. Data (468 Octets Max) .
. .
| | Padding (zeros) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Authenticator (optional)(96) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 7: NTP Control Message Format /*
* System flags
*/
#define S_FLAGS 0 /* any system flags */
#define S_BCSTENAB 0x1 /* enable broadcast client */
/*
* Peer flags
*/
#define P_FLAGS 0 /* any peer flags */
#define P_EPHEM 0x01 /* association is ephemeral */
#define P_BURST 0x02 /* burst enable */
#define P_IBURST 0x04 /* initial burst enable */
#define P_NOTRUST 0x08 /* authenticated access */
#define P_NOPEER 0x10 /* authenticated mobilization */
Version Number (VN): This is a three-bit integer indicating the NTP /*
version number, currently four (4) * Authentication codes
*/
#define A_NONE 0 /* no authentication */
#define A_OK 1 /* authentication OK */
#define A_ERROR 2 /* authentication error */
#define A_CRYPTO 3 /* crypto-NAK */
Mode: This is a three-bit integer indicating the mode. It must have /*
the value 6, indicating an NTP control message. * Association state codes
*/
#define X_INIT 0 /* initialization */
#define X_STALE 1 /* timeout */
#define X_STEP 2 /* time step */
#define X_ERROR 3 /* authentication error */
#define X_CRYPTO 4 /* crypto-NAK received */
#define X_NKEY 5 /* untrusted key */
Response Bit (R): Set to zero for commands, one for responses. /*
* Protocol mode definitions
*/
#define M_RSVD 0 /* reserved */
#define M_SACT 1 /* symmetric active */
#define M_PASV 2 /* symmetric passive */
#define M_CLNT 3 /* client */
#define M_SERV 4 /* server */
#define M_BCST 5 /* broadcast server */
#define M_BCLN 6 /* broadcast client */
/*
* Clock state definitions
*/
#define NSET 0 /* clock never set */
#define FSET 1 /* frequency set from file */
#define SPIK 2 /* spike detected */
#define FREQ 3 /* frequency mode */
#define SYNC 4 /* clock synchronized */
Error Bit (E): Set to zero for normal response, one for error A.3. Packet Data Structures
response. /*
* The receive and transmit packets may contain an optional message
* authentication code (MAC) consisting of a key identifier (keyid) and
* message digest (mac). NTPv4 supports optional extension fields which
* are inserted after the header and before the MAC, but these are
* not described here.
*
More Bit (M): Set to zero for last fragment, one for all others. * Receive packet
*
* Note the dst timestamp is not part of the packet itself. It is
* captured upon arrival and returned in the receive buffer along with
* the buffer length and data. Note that some of the char fields are
* packed in the actual header, but the details are omitted here.
*/
struct r {
ipaddr srcaddr; /* source (remote) address */
ipaddr dstaddr; /* destination (local) address */
char version; /* version number */
char leap; /* leap indicator */
char mode; /* mode */
char stratum; /* stratum */
char poll; /* poll interval */
s_char precision; /* precision */
tdist rootdelay; /* root delay */
tdist rootdisp; /* root dispersion */
char refid; /* reference ID */
tstamp reftime; /* reference time */
tstamp org; /* origin timestamp */
tstamp rec; /* receive timestamp */
tstamp xmt; /* transmit timestamp */
int keyid; /* key ID */
digest digest; /* message digest */
tstamp dst; /* destination timestamp */
} r;
Operation Code (Op): This is a five-bit integer specifying the /*
command function. Values currently defined are given in Table 8. * Transmit packet
*/
struct x {
ipaddr dstaddr; /* source (local) address */
ipaddr srcaddr; /* destination (remote) address */
char version; /* version number */
char leap; /* leap indicator */
char mode; /* mode */
char stratum; /* stratum */
char poll; /* poll interval */
s_char precision; /* precision */
tdist rootdelay; /* root delay */
tdist rootdisp; /* root dispersion */
char refid; /* reference ID */
tstamp reftime; /* reference time */
tstamp org; /* origin timestamp */
tstamp rec; /* receive timestamp */
tstamp xmt; /* transmit timestamp */
int keyid; /* key ID */
digest digest; /* message digest */
} x;
+-------+----------------------------------------+ A.1.3 Association Data Structures
| Value | Meaning |
+-------+----------------------------------------+
| 0 | reserved |
| 1 | read status command/response |
| 2 | read variables command/response |
| 3 | write variables command/response |
| 4 | read clock variables command/response |
| 5 | write clock variables command/response |
| 6 | set trap address/port command/response |
| 7 | trap response |
| 8-31 | reserved |
+-------+----------------------------------------+
Table 8: Currently-defined Operation Codes /*
* Filter stage structure. Note the t member in this and other
* structures refers to process time, not real time. Process time
* increments by one second for every elapsed second of real time.
*/
struct f {
tstamp t; /* update time */
double offset; /* clock offset */
double delay; /* roundtrip delay */
double disp; /* dispersion */
} f;
Sequence: This is a 16-bit integer indicating the sequence number of /*
the command or response. * Association structure. This is shared between the
* peer process and poll process.
*/
struct p {
Status: This is a 16-bit code indicating the current status of the /*
system, peer or clock, with values coded as described in following * Variables set by configuration
sections. */
ipaddr srcaddr; /* source (remote) address */
ipport srcport; /* source port number */
ipaddr dstaddr; /* destination (local) address */
ipport dstport; /* destination port number */
char version; /* version number */
char mode; /* mode */
int keyid; /* key identifier */
int flags; /* option flags */
/*
* Variables set by received packet
*/
char leap; /* leap indicator */
char pmode; /* peer mode */
char stratum; /* stratum */
char ppoll; /* peer poll interval */
double rootdelay; /* root delay */
double rootdisp; /* root dispersion */
char refid; /* reference ID */
tstamp reftime; /* reference time */
#define begin_clear org /* beginning of clear area */
tstamp org; /* originate timestamp */
tstamp rec; /* receive timestamp */
tstamp xmt; /* transmit timestamp */
Association ID: This is a 16-bit integer identifying a valid /*
association. * Computed data
*/
double t; /* update time */
struct f f[NSTAGE]; /* clock filter */
double offset; /* peer offset */
double delay; /* peer delay */
double disp; /* peer dispersion */
double jitter; /* RMS jitter */
Offset: This is a 16-bit integer indicating the offset, in octets, of /*
the first octet in the data area. * Poll process variables
*/
char hpoll; /* host poll interval */
int burst; /* burst counter */
int reach; /* reach register */
#define end_clear unreach /* end of clear area */
int unreach; /* unreach counter */
int last; /* last poll time */
int next; /* next poll time */
} p;
A.1.4 System Data Structures
Count: This is a 16-bit integer indicating the length of the data /*
field, in octets. * Chime list. This is used by the intersection algorithm.
*/
struct m { /* m is for Marzullo */
struct p *p; /* peer structure pointer */
int type; /* high +1, mid 0, low -1 */
double edge; /* correctness interval edge */
} m;
/*
* Survivor list. This is used by the clustering algorithm.
*/
struct v {
struct p *p; /* peer structure pointer */
double metric; /* sort metric */
} v;
/*
* System structure
*/
struct s {
tstamp t; /* update time */
char leap; /* leap indicator */
char stratum; /* stratum */
char poll; /* poll interval */
s_char precision; /* precision */
double rootdelay; /* root delay */
double rootdisp; /* root dispersion */
char refid; /* reference ID */
tstamp reftime; /* reference time */
struct m m[NMAX]; /* chime list */
struct v v[NMAX]; /* survivor list */
struct p *p; /* association ID */
double offset; /* combined offset */
double jitter; /* combined jitter */
int flags; /* option flags */
} s;
A.1.5 Local Clock Data Structure
Data: This contains the message data for the command or response. /*
The maximum number of data octets is 468. * Local clock structure
*/
struct c {
tstamp t; /* update time */
int state; /* current state */
double offset; /* current offset */
double base; /* base offset */
double last; /* previous offset */
int count; /* jiggle counter */
double freq; /* frequency */
double jitter; /* RMS jitter */
double wander; /* RMS wander */
} c;
A.1.6 Function Prototypes
Authenticator (optional): When the NTP authentication mechanism is /*
implemented, this contains the authenticator information. * Peer process
*/
void receive(struct r *); /* receive packet */
void fast_xmit(struct r *, int, int);
/* transmit a reply packet */
struct p *find_assoc(struct r *);
/* search the association table */
void packet(struct p *, struct r *);
/* process packet */
void clock_filter(struct p *, double, double, double);
/* filter */
int accept(struct p *);
/* determine fitness of server */
int access(struct r *);
/* determine access restrictions */
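The control message header shown in Figure 7 occupies three 32-bit words. The following is a minimal sketch of how a receiver might unpack it, assuming network byte order and that bit 0 in the figure is the most significant bit of each octet. The structure and function names are hypothetical and are not defined by this document.

```c
#include <stddef.h>
#include <stdint.h>

struct ctl_hdr {        /* hypothetical container for the Figure 7 fields */
    uint8_t  version;   /* VN, three bits */
    uint8_t  mode;      /* mode, three bits, must be 6 */
    uint8_t  r, e, m;   /* response, error and more bits */
    uint8_t  opcode;    /* Op, five bits */
    uint16_t sequence;
    uint16_t status;
    uint16_t assoc_id;
    uint16_t offset;
    uint16_t count;
};

/* Read a 16-bit big-endian value. */
static uint16_t get16(const uint8_t *b)
{
    return (uint16_t)((b[0] << 8) | b[1]);
}

/* Returns 0 on success, -1 if the buffer is short or the mode is not 6. */
static int ctl_parse(const uint8_t *buf, size_t len, struct ctl_hdr *h)
{
    if (len < 12)
        return -1;
    h->version  = (uint8_t)((buf[0] >> 3) & 0x7);
    h->mode     = (uint8_t)(buf[0] & 0x7);
    h->r        = (uint8_t)((buf[1] >> 7) & 0x1);
    h->e        = (uint8_t)((buf[1] >> 6) & 0x1);
    h->m        = (uint8_t)((buf[1] >> 5) & 0x1);
    h->opcode   = (uint8_t)(buf[1] & 0x1f);
    h->sequence = get16(buf + 2);
    h->status   = get16(buf + 4);
    h->assoc_id = get16(buf + 6);
    h->offset   = get16(buf + 8);
    h->count    = get16(buf + 10);
    return h->mode == 6 ? 0 : -1;
}
```

A fragment with the M bit set would be followed by further fragments whose offset fields continue where this one's offset plus count leaves off.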
A.2. Status Words /*
* System process
*/
void clock_select(); /* find the best clocks */
void clock_update(struct p *); /* update the system clock */
void clock_combine(); /* combine the offsets */
double root_dist(struct p *); /* calculate root distance */
Status words indicate the present status of the system, associations /*
and clock. They are designed to be interpreted by network-monitoring * Clock discipline process
programs and are in one of four 16-bit formats described in this */
section. System and peer status words are associated with responses int local_clock(struct p *, double); /* clock discipline */
for all commands except the read clock variables, write clock void rstclock(int, double, double); /* clock state transition */
variables and set trap address/port commands. The association
identifier zero specifies the system status word, while a nonzero
identifier specifies a particular peer association. The status word
returned in response to read clock variables and write clock
variables commands indicates the state of the clock hardware and
decoding software. A special error status word is used to report
malformed command fields or invalid values.
A.2.1. System Status Word /*
* Clock adjust process
*/
void clock_adjust(); /* one-second timer process */
The system status word appears in the status field of the response to /*
a read status or read variables command with a zero association * Poll process
identifier. The format of the system status word is given in */
Figure 8. void poll(struct p *); /* poll process */
0 1 void poll_update(struct p *, int); /* update the poll interval */
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 void peer_xmit(struct p *); /* transmit a packet */
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| LI | Clock Source | Count | Code |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
Figure 8: System Status Word Format /*
* Main program and utility routines
*/
int main(); /* main program */
struct p *mobilize(ipaddr, ipaddr, int, int, int, int);
/* mobilize */
void clear(struct p *, int); /* clear association */
digest md5(int); /* generate a message digest */
Leap Indicator (LI): This is a two-bit code warning of an impending /*
leap second to be inserted/deleted in the last minute of the current * Kernel I/O Interface
day, with bit 0 and bit 1, respectively, coded as shown in Table 9. */
struct r *recv_packet(); /* wait for packet */
void xmit_packet(struct x *); /* send packet */
+-------+------------------------------------------+ /*
| Value | Meaning | * Kernel system clock interface
+-------+------------------------------------------+ */
| 00 | no warning | void step_time(double); /* step time */
| 01 | last minute has 61 seconds | void adjust_time(double); /* adjust (slew) time */
| 10 | last minute has 59 seconds | tstamp get_time(); /* read time */
| 11 | alarm condition (clock not synchronized) | A.2 Main Program and Utility Routines
+-------+------------------------------------------+
Table 9: Leap Indicator Field #include "ntp4.h"
Clock Source: This is a six-bit integer indicating the current /*
synchronization source, with values coded as shown in Table 10. * Definitions
*/
#define PRECISION -18 /* precision (log2 s) */
#define IPADDR 0 /* any IP address */
#define MODE 0 /* any NTP mode */
#define KEYID 0 /* any key identifier */
+-------+---------------------------------------------------------+ /*
| Value | Meaning | * main() - main program
+-------+---------------------------------------------------------+ */
| 0 | unspecified or unknown | int
| 1 | Calibrated atomic clock (e.g., HP 5061) | main()
| 2 | VLF (band 4) or LF (band 5) radio (e.g., OMEGA, WWVB) | {
| 3 | HF (band 7) radio (e.g., CHU, MSF, WWV/H) | struct p *p; /* peer structure pointer */
| 4 | UHF (band 9) satellite (e.g., GOES, GPS) | struct r *r; /* receive packet pointer */
| 5 | local net (e.g., DCN, TSP, DTS) |
| 6 | UDP/NTP |
| 7 | UDP/TIME |
| 8 | wall time |
| 9 | telephone modem (e.g. NIST) |
| 10-31 | reserved |
| 32 | PPS signal |
| 33-63 | reserved |
+-------+---------------------------------------------------------+
Table 10: Clock Source Field Values /*
* Read command line options and initialize system variables.
* Implementations MAY measure the precision specific to each
* machine by measuring the clock increments to read the system
* clock.
*/
memset(&s, 0, sizeof(s));
s.leap = NOSYNC;
s.stratum = MAXSTRAT;
s.poll = MINPOLL;
s.precision = PRECISION;
s.p = NULL;
/*
* Initialize local clock variables
*/
memset(&c, 0, sizeof(c));
if (/* frequency file */ 0) {
c.freq = /* freq */ 0;
rstclock(FSET, 0, 0);
} else {
rstclock(NSET, 0, 0);
}
c.jitter = LOG2D(s.precision);
System Event Counter: This is a four-bit integer indicating the /*
number of system exception events occurring since the last time the * Read the configuration file and mobilize persistent
system status word was returned in a response or included in a trap * associations with specified addresses, version, mode,
message. The counter is cleared when returned in the status field of * key ID and flags.
a response and freezes when it reaches the value 15. */
while (/* mobilize configured associations */ 0) {
p = mobilize(IPADDR, IPADDR, VERSION, MODE, KEYID,
P_FLAGS);
}
System Event Code: This is a four-bit integer identifying the latest /*
system exception event, with new values overwriting previous values, * Start the system timer, which ticks once per second. Then
and coded as shown in Table 11. * read packets as they arrive, strike receive timestamp and
* call the receive() routine.
*/
while (0) {
r = recv_packet();
r->dst = get_time();
receive(r);
}
}
+-------+-----------------------------------------------------------+ /*
| Value | Meaning | * mobilize() - mobilize and initialize an association
+-------+-----------------------------------------------------------+ */
| 0 | unspecified | struct p
| 1 | system restart | *mobilize(
| 2 | system or hardware fault | ipaddr srcaddr, /* IP source address */
| 3 | system new status word (leap bits or synchronization | ipaddr dstaddr, /* IP destination address */
| | change) | int version, /* version */
| 4 | system new synchronization source or stratum (sys.peer or | int mode, /* host mode */
| | sys.stratum) change | int keyid, /* key identifier */
| 5 | system clock reset (offset correction exceeds CLOCK.MAX) | int flags /* peer flags */
| 6 | system invalid time or date | )
| 7 | system clock exception (see system clock status word) | {
| 8-15 | reserved | struct p *p; /* peer process pointer */
+-------+-----------------------------------------------------------+
Table 11: System Event Code Values /*
* Allocate and initialize association memory
*/
p = malloc(sizeof(struct p));
p->srcaddr = srcaddr;
p->srcport = PORT;
p->dstaddr = dstaddr;
p->dstport = PORT;
p->version = version;
p->mode = mode;
p->keyid = keyid;
p->hpoll = MINPOLL;
clear(p, X_INIT);
p->flags = flags;
return (p);
}
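The system status word of Figure 8 packs four fields into 16 bits and can be unpacked with shifts and masks. The following is a minimal sketch, assuming bit 0 in the figure is the most significant bit of the word and that the word has already been converted to host byte order. The structure and function names are hypothetical.

```c
#include <stdint.h>

struct sys_status {     /* hypothetical container for the Figure 8 fields */
    uint8_t leap;       /* LI, two bits (Table 9) */
    uint8_t source;     /* clock source, six bits (Table 10) */
    uint8_t count;      /* system event counter, four bits */
    uint8_t code;       /* system event code, four bits (Table 11) */
};

/* Split a 16-bit system status word into its four fields. */
static struct sys_status sys_status_decode(uint16_t w)
{
    struct sys_status st;

    st.leap   = (uint8_t)((w >> 14) & 0x03);
    st.source = (uint8_t)((w >> 8) & 0x3f);
    st.count  = (uint8_t)((w >> 4) & 0x0f);
    st.code   = (uint8_t)(w & 0x0f);
    return st;
}
```

For example, the word 0xC613 would decode as leap 3 (alarm condition), clock source 6 (UDP/NTP), one event since the last response, and event code 3.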
A.2.2. Peer Status Word /*
* clear() - reinitialize for persistent association, demobilize
* for ephemeral association.
*/
void
clear(
struct p *p, /* peer structure pointer */
int kiss /* kiss code */
)
{
int i;
A peer status word is returned in the status field of a response to a /*
read status, read variables or write variables command and appears * The first thing to do is return all resources to the bank.
also in the list of association identifiers and status words returned * Typical resources are not detailed here, but they include
by a read status command with a zero association identifier. The * dynamically allocated structures for keys, certificates, etc.
format of a peer status word is shown in Figure 9. * If an ephemeral association and not initialization, return
0 1 * the association memory as well.
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ */
| Peer Status | Sel | Count | Code | /* return resources */
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ if (s.p == p)
Peer Status Word s.p = NULL;
if (kiss != X_INIT && (p->flags & P_EPHEM)) {
free(p);
return;
}
Figure 9: Peer Status Word Format /*
* Initialize the association fields for general reset.
*/
memset(BEGIN_CLEAR(p), 0, LEN_CLEAR);
p->leap = NOSYNC;
p->stratum = MAXSTRAT;
p->ppoll = MAXPOLL;
p->hpoll = MINPOLL;
p->disp = MAXDISP;
p->jitter = LOG2D(s.precision);
p->refid = kiss;
for (i = 0; i < NSTAGE; i++)
p->f[i].disp = MAXDISP;
Peer Status: This is a five-bit code indicating the status of the /*
peer determined by the packet procedure, with bits assigned as shown * Randomize the first poll just in case thousands of broadcast
in Table 12. * clients have just been stirred up after a long absence of the
* broadcast server.
*/
p->last = p->t = c.t;
p->next = p->last + (random() & ((1 << MINPOLL) - 1));
}
+-------+------------------------------------------+ /*
| Value | Meaning | * md5() - compute message digest
+-------+------------------------------------------+ */
| 0 | configured (peer.config) | digest
| 1 | authentication enabled (peer.authenable) | md5(
| 2 | authentication okay (peer.authentic) | int keyid /* key identifier */
| 3 | reachability okay (peer.reach) | )
| 4 | reserved | {
+-------+------------------------------------------+ /*
* Compute a keyed cryptographic message digest. The key
* identifier is associated with a key in the local key cache.
* The key is prepended to the packet header and extension fields
* and the result hashed by the MD5 algorithm as described in
* RFC 1321. Return a MAC consisting of the 32-bit key ID
* concatenated with the 128-bit digest.
*/
return (/* MD5 digest */ 0);
}
A.3 Kernel Input/Output Interface
Table 12: Peer Status Values /*
* Kernel interface to transmit and receive packets. Details are
* deliberately vague and depend on the operating system.
*
* recv_packet - receive packet from network
*/
struct r /* receive packet pointer*/
*recv_packet() {
return (/* receive packet r */ 0);
}
Peer Selection (Sel): This is a three-bit integer indicating the /*
status of the peer determined by the clock-selection procedure, with * xmit_packet - transmit packet to network
values coded as shown in Table 13. */
void
xmit_packet(
struct x *x /* transmit packet pointer */
)
{
/* send packet x */
}
A.4 Kernel System Clock Interface
+-------+-----------------------------------------------------------+ /*
| Value | Meaning | * There are three time formats: native (Unix), NTP and floating double.
+-------+-----------------------------------------------------------+
| 0 | rejected | * The get_time() routine returns the time in NTP long format. The Unix
| 1 | passed sanity checks |
| 2 | passed correctness checks | * routines expect arguments as a structure of two signed 32-bit words
| 3 | passed candidate checks (if limit check implemented) |
| 4 | passed outlier checks | * in seconds and microseconds (timeval) or nanoseconds (timespec). The
| 5 | current synchronization source; max distance exceeded (if |
| | limit check implemented) | * step_time() and adjust_time() routines expect signed arguments in
| 6 | current synchronization source; max distance okay |
| 7 | reserved | * floating double. The simplified code shown here is for illustration
+-------+-----------------------------------------------------------+
* only and has not been verified.
*/
#define JAN_1970 2208988800UL /* 1970 - 1900 in seconds */
Table 13: Peer Selection Field Values /*
* get_time - read system time and convert to NTP format
*/
tstamp
get_time()
{
struct timeval unix_time;
Peer Event Counter: This is a four-bit integer indicating the number /*
of peer exception events that occurred since the last time the peer * There are only two calls on this routine in the program. One
status word was returned in a response or included in a trap message. * when a packet arrives from the network and the other when a
The counter is cleared when returned in the status field of a * packet is placed on the send queue. Call the kernel time of
response and freezes when it reaches the value 15. * day routine (such as gettimeofday()) and convert to NTP
* format.
*/
gettimeofday(&unix_time, NULL);
return ((unix_time.tv_sec + JAN_1970) * 0x100000000L +
(unix_time.tv_usec * 0x100000000L) / 1000000);
}
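The arithmetic in get_time() depends on the long data type being 64 bits. As a minimal sketch, assuming a C99 compiler, the same conversion can be expressed as a pure function over fixed-width integers so that it can be checked without reading the system clock. The function name and the JAN_1970_SEC constant name are hypothetical.

```c
#include <stdint.h>

#define JAN_1970_SEC 2208988800ULL /* seconds from 1900 to 1970 */

/*
 * Convert Unix seconds and microseconds to a 64-bit NTP timestamp.
 * The upper 32 bits hold seconds since 1900 and the lower 32 bits
 * hold the binary fraction. Seconds that do not fit in 32 bits are
 * discarded by the shift.
 */
static uint64_t unix_to_ntp(uint64_t sec, uint32_t usec)
{
    /* Scale microseconds to units of 2^-32 s; usec must be < 1000000. */
    uint64_t frac = ((uint64_t)usec << 32) / 1000000ULL;

    return ((sec + JAN_1970_SEC) << 32) | frac;
}
```

Half a second (500000 microseconds) maps to a fraction of 0x80000000, since the fraction field counts units of 2^-32 seconds.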
Peer Event Code: This is a four-bit integer identifying the latest /*
peer exception event, with new values overwriting previous values, * step_time() - step system time to given offset value
and coded as shown in Table 14. */
void
step_time(
double offset /* clock offset */
)
{
struct timeval unix_time;
tstamp ntp_time;
+-------+-----------------------------------------------------------+ /*
| Value | Meaning | * Convert from double to native format (signed) and add to the
+-------+-----------------------------------------------------------+ * current time. Note the addition is done in native format to
| 0 | unspecified | * avoid overflow or loss of precision.
| 1 | peer IP error | */
| 2 | peer authentication failure (peer.authentic bit was one | ntp_time = D2LFP(offset); gettimeofday(&unix_time, NULL);
| | now zero) | unix_time.tv_sec += ntp_time / 0x100000000L;
| 3 | peer unreachable (peer.reach was nonzero now zero) | unix_time.tv_usec += ntp_time % 0x100000000L;
| 4 | peer reachable (peer.reach was zero now nonzero) | unix_time.tv_sec += unix_time.tv_usec / 1000000;
| 5 | peer clock exception (see peer clock status word) | unix_time.tv_usec %= 1000000;
| 6-15 | reserved | settimeofday(&unix_time, NULL);
+-------+-----------------------------------------------------------+ }
Table 14: Peer Event Codes /*
* adjust_time() - slew system clock to given offset value
*/
void
adjust_time(
double offset /* clock offset */
)
{
struct timeval unix_time;
tstamp ntp_time;
A.2.3. Clock Status Word /*
* Convert from double to native format (signed) and add to the
* current time.
*/
ntp_time = D2LFP(offset);
unix_time.tv_sec = ntp_time / 0x100000000L;
unix_time.tv_usec = ntp_time % 0x100000000L;
unix_time.tv_sec += unix_time.tv_usec / 1000000;
unix_time.tv_usec %= 1000000;
adjtime(&unix_time, NULL);
}
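The step_time() and adjust_time() listings above are marked as unverified. Two points may need care in a real implementation: the offset is signed while tstamp is unsigned, and the low 32 bits of an NTP timestamp count units of 2^-32 seconds rather than microseconds. The following is a minimal sketch of one way to split a signed offset directly into a timeval, assuming a POSIX <sys/time.h>. The helper name is hypothetical.

```c
#include <sys/time.h>

/*
 * Split a signed offset in seconds into a timeval normalized so that
 * 0 <= tv_usec < 1000000. A negative offset such as -0.25 s becomes
 * tv_sec = -1, tv_usec = 750000.
 */
static struct timeval offset_to_timeval(double offset)
{
    struct timeval tv;
    long sec = (long)offset;            /* truncates toward zero */
    double frac = offset - (double)sec; /* same sign as offset */
    long usec;

    if (frac < 0) {                     /* borrow one second */
        sec -= 1;
        frac += 1.0;
    }
    usec = (long)(frac * 1e6);          /* truncate to whole microseconds */
    if (usec >= 1000000) {              /* guard against rounding up to 1 s */
        usec -= 1000000;
        sec += 1;
    }
    tv.tv_sec = sec;
    tv.tv_usec = usec;
    return tv;
}
```

The result could be added to the current time field by field, carrying tv_usec into tv_sec, before a call such as settimeofday(). Whether a given kernel's adjtime() expects this normalized form is platform dependent and is not addressed here.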
A.5 Peer Process
There are two ways a reference clock can be attached to a NTP service #include "ntp4.h"
host, as an dedicated device managed by the operating system and as a
synthetic peer managed by NTP. As in the read status command, the
association identifier is used to identify which one, zero for the
system clock and nonzero for a peer clock. Only one system clock is
supported by the protocol, although many peer clocks can be
supported. A system or peer clock status word appears in the status
field of the response to a read clock variables or write clock
variables command. This word can be considered an extension of the
system status word or the peer status word as appropriate. The
format of the clock status word is shown in Figure 10.
0 1
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| Clock Status | Code |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
Figure 10: Clock Status Word Format /*
* A crypto-NAK packet includes the NTP header followed by a MAC
* consisting only of the key identifier with value zero. It tells the
* receiver that a prior request could not be properly authenticated,
* but the NTP header fields are correct.
*
* A kiss-o'-death packet has an NTP header with leap 3 (NOSYNC) and
* stratum 0. It tells the receiver that something drastic
* has happened, as revealed by the kiss code in the refid field. The
* NTP header fields may or may not be correct.
*/
/*
* Definitions
*/
#define SGATE 3 /* spike gate (clock filter) */
#define BDELAY .004 /* broadcast delay (s) */
Clock Status: This is an eight-bit integer indicating the current
clock status, with values coded as shown in Table 15.
+-------+----------------------------+
| Value | Meaning                    |
+-------+----------------------------+
| 0     | clock nominally operating  |
| 1     | reply timeout              |
| 2     | bad reply format           |
| 3     | hardware or software fault |
| 4     | propagation failure        |
| 5     | bad date format or value   |
| 6     | bad time format or value   |
| 7-255 | reserved                   |
+-------+----------------------------+
Table 15: Clock Status Values
/*
* Dispatch codes
*/
#define ERR -1 /* error */
#define DSCRD 0 /* discard packet */
#define PROC 1 /* process packet */
#define BCST 2 /* broadcast packet */
#define FXMIT 3 /* client packet */
#define NEWPS 4 /* new symmetric passive client */
#define NEWBC 5 /* new broadcast client */
/*
* Dispatch matrix
*              active passv  client server bcast */
int table[7][5] = {
/* nopeer  */ { NEWPS, DSCRD, FXMIT, DSCRD, NEWBC },
/* active  */ { PROC,  PROC,  DSCRD, DSCRD, DSCRD },
/* passv   */ { PROC,  ERR,   DSCRD, DSCRD, DSCRD },
/* client  */ { DSCRD, DSCRD, DSCRD, PROC,  DSCRD },
/* server  */ { DSCRD, DSCRD, DSCRD, DSCRD, DSCRD },
/* bcast   */ { DSCRD, DSCRD, DSCRD, DSCRD, DSCRD },
/* bclient */ { DSCRD, DSCRD, DSCRD, DSCRD, PROC }
};
/*
* Miscellaneous macroni
*
* This macro defines the authentication state. If x is 0,
* authentication is optional, otherwise it is required.
*/
#define AUTH(x, y)((x) ? (y) == A_OK : (y) == A_OK || \
(y) == A_NONE)
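The effect of the AUTH(x, y) macro can be checked with a small stand-alone sketch. The numeric values of the A_* codes below are assumptions made only for illustration (the real definitions live in the header file), and auth_acceptable() is a hypothetical wrapper, not part of the skeleton.

```c
#include <stdbool.h>

/* Assumed values for illustration; the real codes are defined elsewhere. */
enum auth_code { A_NONE, A_OK, A_ERROR, A_CRYPTO };

/* Same logic as the macro above: x nonzero means authentication required. */
#define AUTH(x, y) ((x) ? (y) == A_OK : (y) == A_OK || (y) == A_NONE)

/* Hypothetical wrapper: true when outcome `auth` is acceptable. */
bool auth_acceptable(int required, enum auth_code auth)
{
    return AUTH(required, auth);
}
```

With authentication optional (x is zero) both A_NONE and A_OK pass; with authentication required only A_OK passes. A_ERROR and A_CRYPTO are rejected in either case.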
/*
* These are used by the clear() routine
*/
#define BEGIN_CLEAR(p) ((char *)&((p)->begin_clear))
#define END_CLEAR(p) ((char *)&((p)->end_clear))
#define LEN_CLEAR (END_CLEAR ((struct p *)0) - \
BEGIN_CLEAR((struct p *)0))
Clock Event Code: This is an eight-bit integer identifying the latest
clock exception event, with new values overwriting previous values.
When a change to any nonzero value occurs in the radio status field,
the radio status field is copied to the clock event code field and a
system or peer clock exception event is declared as appropriate.
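As a rough illustration of the clock status word of Figure 10, the sketch below splits a 16-bit word into its two eight-bit fields. It assumes the usual Internet convention that bit 0 is the most significant bit, and that the word has already been converted to host byte order. The structure and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical container for the two fields of the clock status word. */
struct clock_status {
    uint8_t status; /* bits 0-7: clock status (Table 15) */
    uint8_t event;  /* bits 8-15: clock event code */
};

/* Split a host-order status word; bit 0 is taken as the MSB. */
struct clock_status decode_clock_status(uint16_t word)
{
    struct clock_status cs;

    cs.status = (uint8_t)(word >> 8);
    cs.event = (uint8_t)(word & 0xff);
    return cs;
}
```

Under these assumptions, a word of 0x0103 decodes to clock status 1 (reply timeout) with event code 3.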
A.2.4. Error Status Word
An error status word is returned in the status field of an error
response as the result of invalid message format or contents. Its
presence is indicated when the E (error) bit is set along with the
response (R) bit in the response. It consists of an eight-bit
integer coded as shown in Figure 11.
0 1
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|          Error Code           |           Reserved            |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
Figure 11: Error Status Word Format
A.5.1 receive()
/*
* receive() - receive packet and decode modes
*/
void
receive(
struct r *r /* receive packet pointer */
)
{
struct p *p; /* peer structure pointer */
int auth; /* authentication code */
int has_mac; /* size of MAC */
int synch; /* synchronized switch */
/*
* Check access control lists. The intent here is to implement a
* whitelist of those IP addresses specifically accepted and/or
* a blacklist of those IP addresses specifically rejected.
* There could be different lists for authenticated clients and
* unauthenticated clients.
*/
if (!access(r))
return; /* access denied */
/*
* The version must not be in the future. Format checks include
* packet length, MAC length and extension field lengths, if
* present.
*/
if (r->version > VERSION /* or format error */)
return; /* format error */
/*
* Authentication is conditioned by two switches which can be
* specified on a per-client basis.
*
* P_NOPEER do not mobilize an association unless
* authenticated
* P_NOTRUST do not allow access unless authenticated
* (implies P_NOPEER)
*
* There are four outcomes:
*
* A_NONE the packet has no MAC
* A_OK the packet has a MAC and authentication
* succeeds
* A_ERROR the packet has a MAC and authentication fails
* A_CRYPTO crypto-NAK; the MAC has four octets only
*
* Note: The AUTH(x, y) macro is used to filter outcomes. If x
* is zero, acceptable outcomes of y are NONE and OK. If x is
* one, the only acceptable outcome of y is OK.
*/
has_mac = /* length of MAC field */ 0;
if (has_mac == 0) {
auth = A_NONE; /* not required */
} else if (has_mac == 4) {
auth = A_CRYPTO; /* crypto-NAK */
} else {
if (r->mac != md5(r->keyid))
auth = A_ERROR; /* auth error */
else
auth = A_OK; /* auth OK */
}
Currently defined error codes are given in Table 16.
+-------+----------------------------------+
| Value | Meaning                          |
+-------+----------------------------------+
| 0     | unspecified                      |
| 1     | authentication failure           |
| 2     | invalid message length or format |
| 3     | invalid opcode                   |
| 4     | unknown association identifier   |
| 5     | unknown variable name            |
| 6     | invalid variable value           |
| 7     | administratively prohibited      |
| 8-255 | reserved                         |
+-------+----------------------------------+
Table 16: Error Code Values
/*
* Find association and dispatch code. If there is no
* association to match, the value of p->mode is assumed NULL.
*/
p = find_assoc(r);
switch(table[p->mode][r->mode]) {
/*
* Client packet. Send server reply (no association). If
* authentication fails, send a crypto-NAK packet.
*/
case FXMIT:
if (AUTH(p->flags & P_NOTRUST, auth))
fast_xmit(r, M_SERV, auth);
else if (auth == A_ERROR)
fast_xmit(r, M_SERV, A_CRYPTO);
return; /* M_SERV packet sent */
/*
* New symmetric passive client (ephemeral association). It is
* mobilized in the same version as in the packet. If
* authentication fails, send a crypto-NAK packet. If restrict
* no-mobilize, send a symmetric active packet instead.
*/
case NEWPS:
if (!AUTH(p->flags & P_NOTRUST, auth)) {
if (auth == A_ERROR)
fast_xmit(r, M_SACT, A_CRYPTO);
return; /* crypto-NAK packet sent */
}
if (!AUTH(p->flags & P_NOPEER, auth)) {
fast_xmit(r, M_SACT, auth);
return; /* M_SACT packet sent */
}
p = mobilize(r->srcaddr, r->dstaddr, r->version, M_PASV,
r->keyid, P_EPHEM);
break;
/*
* New broadcast client (ephemeral association). It is mobilized
* in the same version as in the packet. If authentication
* error, ignore the packet.
*/
case NEWBC:
if (!AUTH(p->flags & (P_NOTRUST | P_NOPEER), auth))
return; /* authentication error */
if (!(s.flags & S_BCSTENAB))
return; /* broadcast not enabled */
p = mobilize(r->srcaddr, r->dstaddr, r->version, M_BCLN,
r->keyid, P_EPHEM);
break; /* processing continues */
/*
* Process packet. Placeholder only.
*/
case PROC:
break; /* processing continues */
/*
* Invalid mode combination. We get here only in case of
* ephemeral associations, so the correct action is simply to
* toss it.
*/
case ERR:
clear(p, X_ERROR);
return; /* invalid mode combination */
A.3. Commands
Commands consist of the header and optional data field of the Status
Word. When present, the data field contains a list of identifiers or
assignments in the form
<<identifier>>[=<<value>>],<<identifier>>[=<<value>>],...
where <<identifier>> is the ASCII name of a system or peer variable
specified in Table 2 or Table 3 of [1] and <<value>> is expressed as
a decimal, hexadecimal or string constant in the syntax of the C
programming language. Where no ambiguity exists, the "sys."
or "peer." prefixes shown in Table 2 or Table 4 of [1] can be
suppressed. Whitespace (ASCII nonprinting format effectors) can be
added to improve readability for simple monitoring programs that do
not reformat the data field. Internet addresses are represented as
four octets in the form [n.n.n.n], where n is in decimal notation and
the brackets are optional. Timestamps, including reference,
originate, receive and transmit values, as well as the logical clock,
are represented in units of seconds and fractions, preferably in
hexadecimal notation, while delay, offset, dispersion and distance
values are represented in units of milliseconds and fractions,
preferably in decimal notation. All other values are represented
as-is, preferably in decimal notation.
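The identifier[=value] list format described above can be split with a few lines of C. The following is only a sketch under stated assumptions: the structure and function names are hypothetical, the buffer is modified in place, and string values that themselves contain commas or equals signs are not handled.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result record for one list item. */
struct assignment {
    char *name;  /* identifier, NUL-terminated */
    char *value; /* value text, or NULL when "=value" is absent */
};

/* Strip leading and trailing whitespace in place. */
static char *trim(char *s)
{
    char *end;

    while (isspace((unsigned char)*s))
        s++;
    end = s + strlen(s);
    while (end > s && isspace((unsigned char)end[-1]))
        end--;
    *end = '\0';
    return s;
}

/* Split a comma-separated data field; returns the number of items. */
size_t parse_data_field(char *buf, struct assignment *out, size_t max)
{
    size_t n = 0;
    char *item = buf;

    while (item != NULL && n < max) {
        char *comma = strchr(item, ',');
        char *eq;

        if (comma != NULL)
            *comma = '\0';          /* terminate this item */
        eq = strchr(item, '=');
        if (eq != NULL)
            *eq = '\0';             /* separate name from value */
        out[n].name = trim(item);
        out[n].value = (eq != NULL) ? trim(eq + 1) : NULL;
        if (out[n].name[0] != '\0')
            n++;                    /* skip empty items */
        item = (comma != NULL) ? comma + 1 : NULL;
    }
    return n;
}
```

For example, a buffer holding "sys.stratum=2, sys.leap ,peer.delay=1.5" (illustrative variable names) yields three items, the second of which has no value.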
Implementations may define variables other than those listed in Table
2 or Table 3 of [1]. Called extramural variables, these are
distinguished by the inclusion of some character type other than
alphanumeric or "." in the name. For those commands that
return a list of assignments in the response data field, if the
command data field is empty, it is expected that all available
variables defined in Table 3 or Table 4 of [1] will be included in
the response. For the read commands, if the command data field is
nonempty, an implementation may choose to process this field to
individually select which variables are to be returned.
Commands are interpreted as follows:
Read Status (1): The command data field is empty or contains a list
of identifiers separated by commas. The command operates in two ways
depending on the value of the association identifier. If this
identifier is nonzero, the response includes the peer identifier and
status word. Optionally, the response data field may contain other
information, such as described in the Read Variables command. If the
association identifier is zero, the response includes the system
identifier (0) and status word, while the data field contains a list
of binary-coded pairs
<<association identifier>> <<status word>>,
one for each currently defined association.
Read Variables (2): The command data field is empty or contains a
list of identifiers separated by commas. If the association
identifier is nonzero, the response includes the requested peer
identifier and status word, while the data field contains a list of
peer variables and values as described above. If the association
identifier is zero, the data field contains a list of system
variables and values. If a peer has been selected as the
synchronization source, the response includes the peer identifier and
status word; otherwise, the response includes the system identifier
(0) and status word.
Write Variables (3): The command data field contains a list of
assignments as described above. The variables are updated as
indicated. The response is as described for the Read Variables
command.
Read Clock Variables (4): The command data field is empty or contains
a list of identifiers separated by commas. The association
identifier selects the system clock variables or peer clock variables
in the same way as in the Read Variables command. The response
includes the requested clock identifier and status word and the data
field contains a list of clock variables and values, including the
last timecode message received from the clock.
Write Clock Variables (5): The command data field contains a list of
assignments as described above. The clock variables are updated as
indicated. The response is as described for the Read Clock Variables
command.
Set Trap Address/Port (6): The command association identifier, status
and data fields are ignored. The address and port number for
subsequent trap messages are taken from the source address and port
of the control message itself. The initial trap counter for trap
response messages is taken from the sequence field of the command.
The response association identifier, status and data fields are not
significant. Implementations should include sanity timeouts which
prevent trap transmissions if the monitoring program does not renew
this information after a lengthy interval.
Trap Response (7): This message is sent when a system, peer or clock
exception event occurs. The opcode field is 7 and the R bit is set.
The trap counter is incremented by one for each trap sent and the
sequence field is set to that value. The trap message is sent using
the IP address and port fields established by the set trap
address/port command. For a system trap, the association identifier
field is set to zero and the status field contains the system status
word. For a peer trap, the association identifier field is set to
that peer and the status field contains the peer status word.
Optional ASCII-coded information can be included in the data field.
/*
* No match; just discard the packet.
*/
case DSCRD:
return; /* orphan abandoned */
}
/*
* Next comes a rigorous schedule of timestamp checking. If the
* transmit timestamp is zero, the server is horribly broken.
*/
if (r->xmt == 0)
return; /* invalid timestamp */
/*
* If the transmit timestamp duplicates a previous one, the
* packet is a replay.
*/
if (r->xmt == p->xmt)
return; /* duplicate packet */
/*
* If this is a broadcast mode packet, skip further checking.
* If the origin timestamp is zero, the sender has not yet heard
* from us. Otherwise, if the origin timestamp does not match
* the transmit timestamp, the packet is bogus.
*/
synch = TRUE;
if (r->mode != M_BCST) {
if (r->org == 0)
synch = FALSE; /* unsynchronized */
else if (r->org != p->xmt)
synch = FALSE; /* bogus packet */
}
/*
* Update the origin and destination timestamps. If
* unsynchronized or bogus, abandon ship.
*/
p->org = r->xmt;
p->rec = r->dst;
if (!synch)
return; /* unsynch */
/*
* The timestamps are valid and the receive packet matches the
* last one sent. If the packet is a crypto-NAK, the server
* might have just changed keys. We demobilize the association
* and wait for better times.
*/
if (auth == A_CRYPTO) {
clear(p, X_CRYPTO);
return; /* crypto-NAK */
}
/*
* If the association is authenticated, the key ID is nonzero
* and received packets must be authenticated. This is designed
* to avoid a bait-and-switch attack, which was possible in past
* versions.
*/
if (!AUTH(p->keyid || (p->flags & P_NOTRUST), auth))
return; /* bad auth */
/*
* Everything possible has been done to validate the timestamps
* and prevent bad guys from disrupting the protocol or
* injecting bogus data. Earn some revenue.
*/
packet(p, r);
}
/*
* find_assoc() - find a matching association
*/
struct p /* peer structure pointer or NULL */
*find_assoc(
struct r *r /* receive packet pointer */
)
{
struct p *p; /* dummy peer structure pointer */
/*
* Search association table for matching source address and
* source port.
*/
while (/* all associations */ 0) {
if (r->srcaddr == p->srcaddr && r->port == p->port)
return(p);
}
return (NULL);
}
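The skeleton above leaves the table walk as a placeholder. One possible shape for the lookup and the subsequent dispatch is sketched below. Everything here is an assumption made for illustration: the trimmed-down structures stand in for struct p and struct r, the association table is a plain array, a missing association selects the "nopeer" row, and packet modes 1 through 5 are mapped to columns 0 through 4.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-ins for struct p and struct r. */
struct assoc {
    uint32_t srcaddr;
    uint16_t port;
    int mode;       /* 1 active ... 6 bclient (matrix row) */
};

struct pkt {
    uint32_t srcaddr;
    uint16_t port;
    int mode;       /* 1 active ... 5 broadcast */
};

enum { ERR = -1, DSCRD, PROC, BCST, FXMIT, NEWPS, NEWBC };

/* Same entries as the dispatch matrix in the text. */
static const int table[7][5] = {
    /* nopeer  */ { NEWPS, DSCRD, FXMIT, DSCRD, NEWBC },
    /* active  */ { PROC,  PROC,  DSCRD, DSCRD, DSCRD },
    /* passv   */ { PROC,  ERR,   DSCRD, DSCRD, DSCRD },
    /* client  */ { DSCRD, DSCRD, DSCRD, PROC,  DSCRD },
    /* server  */ { DSCRD, DSCRD, DSCRD, DSCRD, DSCRD },
    /* bcast   */ { DSCRD, DSCRD, DSCRD, DSCRD, DSCRD },
    /* bclient */ { DSCRD, DSCRD, DSCRD, DSCRD, PROC  }
};

/* Linear search by source address and port; NULL when no match. */
struct assoc *find_assoc(struct assoc *tab, size_t n, const struct pkt *r)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (tab[i].srcaddr == r->srcaddr && tab[i].port == r->port)
            return &tab[i];
    return NULL;
}

/* Look up the dispatch code; out-of-range modes are discarded. */
int dispatch(const struct assoc *p, const struct pkt *r)
{
    int row = (p != NULL) ? p->mode : 0;  /* 0 selects "nopeer" */
    int col = r->mode - 1;                /* assumed column mapping */

    if (row < 0 || row > 6 || col < 0 || col > 4)
        return DSCRD;
    return table[row][col];
}
```

Treating a NULL result as row zero avoids dereferencing a missing association before the dispatch code is known. Whether that matches the intended handling is not stated in the text.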
A.5.2 packet()
/*
* packet() - process packet and compute offset, delay and
* dispersion.
*/
void
packet(
struct p *p, /* peer structure pointer */
struct r *r /* receive packet pointer */
)
{
double offset; /* sample offset */
double delay; /* sample delay */
double disp; /* sample dispersion */
/*
* By golly the packet is valid. Light up the remaining header
* fields. Note that we map stratum 0 (unspecified) to MAXSTRAT
* to make stratum comparisons simpler and to provide a natural
* interface for radio clock drivers that operate for
* convenience at stratum 0.
*/
p->leap = r->leap;
if (r->stratum == 0)
p->stratum = MAXSTRAT;
else
p->stratum = r->stratum;
p->mode = r->mode;
p->ppoll = r->poll;
p->rootdelay = FP2D(r->rootdelay);
p->rootdisp = FP2D(r->rootdisp);
p->refid = r->refid;
p->reftime = r->reftime;
/*
* Verify the server is synchronized with valid stratum and
* reference time not later than the transmit time.
*/
if (p->leap == NOSYNC || p->stratum >= MAXSTRAT)
return; /* unsynchronized */
/*
* Verify valid root distance.
*/
if (p->rootdelay / 2 + p->rootdisp >= MAXDISP || p->reftime >
r->xmt)
return; /* invalid header values */
poll_update(p, p->hpoll);
p->reach |= 1;
/*
* Calculate offset, delay and dispersion, then pass to the
* clock filter. Note carefully the implied processing. The
* first-order difference is done directly in 64-bit arithmetic,
* then the result is converted to floating double. All further
* processing is in floating double arithmetic with rounding
* done by the hardware. This is necessary in order to avoid
* overflow and preserve precision.
*
* The delay calculation is a special case. In cases where the
* server and client clocks are running at different rates and
* with very fast networks, the delay can appear negative. In
* order to avoid violating the Principle of Least Astonishment,
* the delay is clamped not less than the system precision.
*/
if (p->mode == M_BCST) {
offset = LFP2D(r->xmt - r->dst);
delay = BDELAY;
disp = LOG2D(r->precision) + LOG2D(s.precision) + PHI *
2 * BDELAY;
} else {
/* T1 = org, T2 = rec, T3 = xmt, T4 = dst */
offset = (LFP2D(r->rec - r->org) + LFP2D(r->xmt -
r->dst)) / 2;
delay = max(LFP2D(r->dst - r->org) - LFP2D(r->xmt -
r->rec), LOG2D(s.precision));
disp = LOG2D(r->precision) + LOG2D(s.precision) + PHI *
LFP2D(r->dst - r->org);
}
clock_filter(p, offset, delay, disp);
}
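The client/server branch of packet() can be illustrated with plain doubles. The sketch below assumes the usual on-wire formulas, offset = ((T2 - T1) + (T3 - T4)) / 2 and delay = (T4 - T1) - (T3 - T2), where T1 is the origin, T2 the receive, T3 the transmit and T4 the destination timestamp. The structure and function names are hypothetical, and the precisions and frequency tolerance are passed in as seconds rather than taken from the system variables.

```c
/* Hypothetical result record: one sample for the clock filter. */
struct sample {
    double offset; /* clock offset (s) */
    double delay;  /* round-trip delay (s) */
    double disp;   /* sample dispersion (s) */
};

/*
 * t1..t4 are the four timestamps in seconds; rho_r and rho_s are
 * the remote and system precisions in seconds; phi is the assumed
 * frequency tolerance (s/s).
 */
struct sample on_wire_sample(double t1, double t2, double t3, double t4,
                             double rho_r, double rho_s, double phi)
{
    struct sample s;
    double d = (t4 - t1) - (t3 - t2);

    s.offset = ((t2 - t1) + (t3 - t4)) / 2;
    s.delay = (d < rho_s) ? rho_s : d; /* clamp to system precision */
    s.disp = rho_r + rho_s + phi * (t4 - t1);
    return s;
}
```

With T1 = 100.0, T2 = 100.5, T3 = 100.75 and T4 = 100.5, the offset is 0.375 s and the delay is 0.25 s, which corresponds to a server 0.375 s ahead of the client over a path with 0.125 s of one-way delay in each direction.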
A.5.3 clock_filter()
/*
* clock_filter(p, offset, delay, dispersion) - select the best