NTP server misuse and abuse

From Wikipedia, the free encyclopedia

NTP server misuse and abuse covers a number of practices which cause damage or degradation to an NTP server, ranging from flooding it with traffic (effectively a DDoS attack) to violating the server's access policy or the NTP rules of engagement. One incident was branded NTP vandalism in an open letter from Poul-Henning Kamp to the router manufacturer D-Link in 2006. Others have since applied the term retroactively to earlier incidents. There is, however, no evidence that any of these problems are deliberate vandalism; they are more usually caused by shortsighted or poorly chosen default configurations.

Common NTP client problems

The most troublesome problems have involved NTP server addresses hardcoded in the firmware of consumer networking devices. As major manufacturers produce hundreds of thousands of devices and since most customers never upgrade the firmware, any problems will persist for as long as the devices are in service.

One particularly common software error is to generate query packets at short (less than five second) intervals until a response is received. When such an implementation finds itself behind a packet filter that refuses to pass the incoming response, the result is a never-ending stream of requests to the NTP server. Such grossly over-eager clients (particularly those polling once per second) commonly make up more than 50% of the traffic of public NTP servers, despite being a minuscule fraction of the total clients. While it is reasonable to send a few initial packets at short intervals, it is essential for the health of any connectionless network that unacknowledged packets be generated at exponentially decreasing rates, that is, with the interval between retries growing exponentially (exponential backoff). This applies to any connectionless protocol, and to many parts of connection-based protocols. Examples can be found in the TCP specification for connection establishment, zero-window probing, and keepalive transmissions.
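
The difference between fixed-interval polling and exponential backoff can be illustrated with a minimal Python sketch. The function names and the parameter values (a 2-second initial wait, doubling, a 1024-second cap) are assumptions chosen for illustration; they are not taken from any NTP specification or from the firmware discussed in this article.

```python
import itertools


def retry_intervals(initial=2.0, factor=2.0, maximum=1024.0):
    """Yield the wait (in seconds) before each retry of an unanswered query.

    Hypothetical defaults: start at 2 s, double each time, cap at 1024 s.
    """
    interval = initial
    while True:
        yield interval
        interval = min(interval * factor, maximum)


def first_intervals(n, **kwargs):
    """Return the first n retry waits as a list."""
    return list(itertools.islice(retry_intervals(**kwargs), n))


def queries_sent(duration, **kwargs):
    """Count the queries a client sends in `duration` seconds if no reply ever arrives."""
    sent, elapsed = 1, 0.0  # the first query goes out at t = 0
    for wait in retry_intervals(**kwargs):
        elapsed += wait
        if elapsed > duration:
            return sent
        sent += 1


if __name__ == "__main__":
    print(first_intervals(5))                            # doubling waits
    print(queries_sent(3600))                            # backoff client, one hour
    print(queries_sent(3600, initial=1.0, factor=1.0))   # once-per-second client
```

Under these assumed parameters, a client stuck behind a packet filter sends about a dozen queries in its first hour, whereas a client that retries once per second sends several thousand.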

Tardis and Trinity College, Dublin

In October 2002, one of the earliest known cases of time server misuse resulted in problems for a web server at Trinity College, Dublin. The traffic was ultimately traced to misbehaving copies of a program called Tardis, with thousands of copies around the world contacting the web server and obtaining a timestamp via HTTP. The solution was to modify the web server configuration so as to deliver a customized version of the home page (greatly reduced in size) and to return a bogus time value, which caused most of the clients to choose a different time server. An updated version of Tardis was later released to correct this problem. The incident was described by David Malone in the April 2006 issue of ;login: magazine, published by the USENIX Association.

NETGEAR and the University of Wisconsin

The first widely known case of NTP server problems began in May 2003, when NETGEAR's hardware products flooded the University of Wisconsin's NTP server with requests. University personnel initially assumed this was a malicious distributed denial of service attack and took action to block the flood at their network border. Rather than abating (as most DDoS attacks do), the flow increased, reaching 250,000 packets per second (150 megabits per second) by June. Subsequent investigation revealed that four models of NETGEAR routers were the source of the problem. The SNTP (Simple NTP) client in the routers was found to have two serious flaws. First, it relied on a single NTP server (at the University of Wisconsin) whose IP address was hard-coded in the firmware. Second, it polled the server at one-second intervals until it received a response. A total of 707,147 products with the faulty client were produced.

NETGEAR has released firmware updates for the affected products (DG814, HR314, MR814 and RP614) which query NETGEAR's own servers, poll only once every ten minutes, and give up after five failures. While this update fixes the flaws in the original SNTP client, it does not solve the larger problem. Most consumers will never update their routers' firmware, particularly if the device seems to be operating properly. The University of Wisconsin NTP server continues to receive high levels of traffic from NETGEAR routers, with occasional floods of up to 100,000 packets per second. NETGEAR has donated $375,000 to the University of Wisconsin's Division of Information Technology for its help in identifying the flaw.

SMC and CSIRO

Also in 2003, another case forced the NTP servers of the National Measurement Laboratory at Australia's Commonwealth Scientific and Industrial Research Organisation (CSIRO) to close to the public[1]. The traffic was shown to come from a bad NTP implementation in some SMC router models where the IP address of the CSIRO server was embedded in the firmware. SMC has released firmware updates for the products; the 7004VBR and 7004VWBR models are known to be affected.

D-Link and Poul-Henning Kamp

The most recent problem began in 2005, when Poul-Henning Kamp, the manager of the only Danish Stratum 1 NTP server available to the general public, observed a huge rise in traffic and discovered that between 75 and 90% of it originated from D-Link's router products. Stratum 1 NTP servers receive their time signal from an accurate external source, such as a GPS receiver, radio clock, or a calibrated atomic clock. By convention, Stratum 1 time servers should only be used by Stratum 2 servers, and by applications requiring extremely precise time measurements, such as scientific applications.[2] A home networking router does not meet either of these criteria. In addition, the access policy of Kamp's server explicitly limited it to servers directly connected to the Danish Internet Exchange (DIX). The direct use of this and other Stratum 1 servers by D-Link's routers resulted in a huge rise in traffic, increasing bandwidth costs and server load.

In many countries, official timekeeping services are provided by a government agency (such as NIST in the U.S.). Since there is no Danish equivalent, Kamp provides his time service "pro bono publico". In return, DIX agreed to provide a free connection for his time server under the assumption that the bandwidth involved would be relatively low, given the limited number of servers and potential clients. With the increased traffic caused by the D-Link routers, DIX requested that he pay a yearly connection fee of DKK 54,000 (approximately US$8,800).

Kamp contacted D-Link in November 2005, hoping to get them to fix the problem and compensate him for the time and money he spent tracking down the problem and the bandwidth charges caused by D-Link products. The company denied any problem, accused him of extortion, and offered an amount in compensation which Kamp asserted did not cover his expenses. On April 7, 2006 Kamp posted the story on his website. The story was picked up by Slashdot, reddit and other news sites. After going public, Kamp realized that D-Link routers were directly querying other Stratum 1 time servers, violating the access policies of at least 43 of them in the process. On April 27, 2006, D-Link and Kamp announced that they had "amicably resolved" their dispute.[3]

Technical solutions

After these incidents, it became clear that, apart from stating a server's access policy, a technical means of enforcing the policy was needed. One such mechanism was provided by extending the semantics of the Reference Identifier field in an NTP packet when the Stratum field is 0.

In January 2006, RFC 4330 was published, updating details of the SNTP protocol, but also provisionally clarifying and extending the related NTP protocol in some areas. Sections 8 to 11 of RFC 4330 are of particular relevance to this topic (The Kiss-o'-Death Packet, On Being a Good Network Citizen, Best Practices, Security Considerations). Section 8 introduces Kiss-o'-Death packets:

"In NTPv4 and SNTPv4, packets of this kind are called Kiss-o'-Death (KoD) packets, and the ASCII messages they convey are called kiss codes. The KoD packets got their name because an early use was to tell clients to stop sending packets that violate server access controls."
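
As a rough illustration of how a client might recognize such a packet, the following Python sketch reads the Stratum field and the Reference Identifier field from a raw 48-byte NTP header. The function name `kiss_code` and the hand-built sample packets are hypothetical; the byte offsets follow the standard NTP header layout (Stratum in byte 1, Reference Identifier in bytes 12 to 15). How a client should react to each kiss code is left out, since that depends on the specification it implements.

```python
def kiss_code(packet: bytes):
    """Return the 4-character kiss code if `packet` is a KoD reply, else None.

    Assumes `packet` is a raw NTP header as received over UDP.
    """
    if len(packet) < 48:
        raise ValueError("an NTP header is at least 48 bytes long")
    stratum = packet[1]          # byte 0 holds LI/VN/Mode, byte 1 holds Stratum
    if stratum != 0:
        return None              # ordinary reply: Reference Identifier is not a kiss code
    ref_id = packet[12:16]       # Reference Identifier field
    return ref_id.decode("ascii", errors="replace")


# Hand-built sample packets for illustration only (not captured traffic).
# 0xE4 encodes LI=3, VN=4, Mode=4 (server).
kod_packet = bytes([0xE4, 0]) + bytes(10) + b"DENY" + bytes(32)
# 0x24 encodes LI=0, VN=4, Mode=4; Stratum 2 with an arbitrary reference ID.
normal_packet = bytes([0x24, 2]) + bytes(10) + bytes([192, 0, 2, 1]) + bytes(32)

if __name__ == "__main__":
    print(kiss_code(kod_packet))
    print(kiss_code(normal_packet))
```

A client that checks for a kiss code before processing a reply can stop or slow its queries when the server asks it to, instead of retrying indefinitely.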

Unfortunately, the new requirements of the NTP protocol do not work retroactively: old clients and implementations of earlier versions of the protocol do not recognize KoD packets and do not act on them. For the time being there are no good technical means to counteract misuse of NTP servers.

External links