Unix time
From Wikipedia, the free encyclopedia
Unix time, or POSIX time, is a system for describing points in time: it is the number of seconds elapsed since midnight UTC on the morning of January 1, 1970, not counting leap seconds. It is widely used not only on Unix-like operating systems but in many other computing systems, including the Java programming language. It is not a true encoding of UTC time, but is sufficiently similar to a linear representation of the passage of time that it is frequently mistaken for one. The main complication is that UTC accounts for leap seconds, while Unix time does not.
Definition
There are two layers of encoding that make up Unix time, and they can be usefully separated. The first layer encodes a point in time as a scalar real number, and the second encodes that number as a sequence of bits or in some other manner.
Encoding time as a number
Modern Unix time is based strictly on UTC. UTC counts time using SI seconds, and breaks up the span of time into days. UTC days are mostly 86400 s long, but are occasionally 86401 s and could be 86399 s long (though the latter option has never been used as of December 2005) in order to keep the days synchronised with the rotation of the Earth. As is standard with UTC, this article will label days using the Gregorian calendar, and count times within each day in hours, minutes, and seconds. Some of the examples will also show TAI, another time scheme, which uses the same seconds and is displayed in the same format as UTC, but has every day exactly 86400 s long, making no attempt to stay synchronised with the Earth's rotation.
The Unix epoch is the time 00:00:00 UTC on January 1, 1970. There is a problem with this definition, in that UTC did not exist in its current form until 1972; this issue is discussed below. For brevity, the remainder of this section will use ISO 8601 date format, in which the Unix epoch is 1970-01-01T00:00:00Z.
The Unix time number is zero at the Unix epoch, and increases by exactly 86 400 per day since the epoch. Thus 2004-09-16T00:00:00Z, 12 677 days after the epoch, is represented by the Unix time number 12 677 × 86 400 = 1 095 292 800. This can be extended backwards from the epoch too, using negative numbers; thus 1957-10-04T00:00:00Z, 4472 days before the epoch, is represented by the Unix time number -4472 × 86 400 = -386 380 800.
Within each day, the Unix time number is as calculated in the preceding paragraph at midnight UTC (00:00:00Z), and increases by exactly 1 per second since midnight. Thus 2004-09-16T17:55:43.54Z, 64 543.54 s since midnight on the day in the example above, is represented by the Unix time number 1 095 292 800 + 64 543.54 = 1 095 357 343.54. On dates before the epoch the number still increases, thus becoming less negative, as time moves forward.
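The two rules above (86 400 per day since the epoch, plus the seconds elapsed since midnight UTC) can be sketched in Python. This is only an illustration: the function name `unix_time_number` is hypothetical, and `datetime.date` is used solely to count days on the Gregorian calendar.

```python
from datetime import date

EPOCH = date(1970, 1, 1)
SECONDS_PER_DAY = 86400

def unix_time_number(year, month, day, seconds_since_midnight=0):
    """Days since the epoch times 86 400, plus seconds elapsed since midnight UTC."""
    days = (date(year, month, day) - EPOCH).days  # negative before the epoch
    return days * SECONDS_PER_DAY + seconds_since_midnight

print(unix_time_number(2004, 9, 16))                            # 1095292800
print(unix_time_number(1957, 10, 4))                            # -386380800
print(unix_time_number(2004, 9, 16, 17 * 3600 + 55 * 60 + 43))  # 1095357343
```

Whole seconds are used in the last call so that the result stays an exact integer; a fractional part such as .54 would simply be added on.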
The above scheme means that on a normal UTC day, of duration 86 400 s, the Unix time number changes in a continuous manner across midnight. For example, at the end of the day used in the examples above, the time representations progress like this:
TAI | UTC | Unix time |
---|---|---|
2004-09-17T00:00:30.75 | 2004-09-16T23:59:58.75 | 1 095 379 198.75 |
2004-09-17T00:00:31.00 | 2004-09-16T23:59:59.00 | 1 095 379 199.00 |
2004-09-17T00:00:31.25 | 2004-09-16T23:59:59.25 | 1 095 379 199.25 |
2004-09-17T00:00:31.50 | 2004-09-16T23:59:59.50 | 1 095 379 199.50 |
2004-09-17T00:00:31.75 | 2004-09-16T23:59:59.75 | 1 095 379 199.75 |
2004-09-17T00:00:32.00 | 2004-09-17T00:00:00.00 | 1 095 379 200.00 |
2004-09-17T00:00:32.25 | 2004-09-17T00:00:00.25 | 1 095 379 200.25 |
2004-09-17T00:00:32.50 | 2004-09-17T00:00:00.50 | 1 095 379 200.50 |
2004-09-17T00:00:32.75 | 2004-09-17T00:00:00.75 | 1 095 379 200.75 |
2004-09-17T00:00:33.00 | 2004-09-17T00:00:01.00 | 1 095 379 201.00 |
2004-09-17T00:00:33.25 | 2004-09-17T00:00:01.25 | 1 095 379 201.25 |
When a leap second occurs, so that the UTC day is not exactly 86 400 s long, a discontinuity occurs in the Unix time number. The Unix time number increases by exactly 86 400 each day, regardless of how long the day is. When a leap second is deleted (which has never occurred as of 2006), the Unix time number jumps up by 1 at the instant where the leap second was deleted from, which is the end of the day. When a leap second is inserted (which has occurred on average once every year and a half), the Unix time number increases continuously during the leap second, during which time it is more than 86 400 s since the start of the current day, and then jumps down by 1 at the end of the leap second, which is the start of the next day. For example, this is what happened on strictly conforming POSIX.1 systems at the end of 1998:
TAI | UTC | Unix time |
---|---|---|
1999-01-01T00:00:29.75 | 1998-12-31T23:59:58.75 | 915 148 798.75 |
1999-01-01T00:00:30.00 | 1998-12-31T23:59:59.00 | 915 148 799.00 |
1999-01-01T00:00:30.25 | 1998-12-31T23:59:59.25 | 915 148 799.25 |
1999-01-01T00:00:30.50 | 1998-12-31T23:59:59.50 | 915 148 799.50 |
1999-01-01T00:00:30.75 | 1998-12-31T23:59:59.75 | 915 148 799.75 |
1999-01-01T00:00:31.00 | 1998-12-31T23:59:60.00 | 915 148 800.00 |
1999-01-01T00:00:31.25 | 1998-12-31T23:59:60.25 | 915 148 800.25 |
1999-01-01T00:00:31.50 | 1998-12-31T23:59:60.50 | 915 148 800.50 |
1999-01-01T00:00:31.75 | 1998-12-31T23:59:60.75 | 915 148 800.75 |
1999-01-01T00:00:32.00 | 1999-01-01T00:00:00.00 | 915 148 800.00 |
1999-01-01T00:00:32.25 | 1999-01-01T00:00:00.25 | 915 148 800.25 |
1999-01-01T00:00:32.50 | 1999-01-01T00:00:00.50 | 915 148 800.50 |
1999-01-01T00:00:32.75 | 1999-01-01T00:00:00.75 | 915 148 800.75 |
1999-01-01T00:00:33.00 | 1999-01-01T00:00:01.00 | 915 148 801.00 |
1999-01-01T00:00:33.25 | 1999-01-01T00:00:01.25 | 915 148 801.25 |
Observe that when a positive leap second occurs (i.e., when a leap second is inserted) the Unix time numbers repeat themselves. The Unix time number 915 148 800.50 is ambiguous: it can refer either to the instant in the middle of the leap second, or to the instant one second later, half a second after midnight UTC. In the theoretical case when a negative leap second occurs (i.e., when a leap second is deleted) no ambiguity is caused, but instead there is a range of Unix time numbers that do not refer to any point in time at all.
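The repetition can be reproduced with a small sketch of the encoding. The helper `posix_time` is hypothetical; it applies the day and second rules described earlier and simply allows the seconds field to reach 60 during an inserted leap second.

```python
from datetime import date

def posix_time(year, month, day, hour, minute, second):
    """Encode a UTC time; `second` may reach 60.x during an inserted leap second."""
    days = (date(year, month, day) - date(1970, 1, 1)).days
    return days * 86400 + hour * 3600 + minute * 60 + second

during_leap = posix_time(1998, 12, 31, 23, 59, 60.5)
after_midnight = posix_time(1999, 1, 1, 0, 0, 0.5)
print(during_leap, after_midnight)  # 915148800.5 915148800.5
```

Two distinct instants, one second apart, receive the same number, which is why the number alone cannot be decoded unambiguously.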
A Unix clock is often implemented with a different type of positive leap second handling associated with the Network Time Protocol. This yields a system that does not conform to the POSIX standard. See the section below concerning NTP for details.
When dealing with time periods that do not encompass a UTC leap second, the difference between two Unix time numbers is equal to the duration in seconds of the time period between the corresponding points in time. This is a common computational technique. However, where leap seconds occur, such calculations give the wrong answer. In applications where this level of accuracy is required, it is necessary to consult a table of leap seconds when dealing with Unix times, and it is often preferable to use a different time encoding that does not suffer this problem.
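A minimal sketch of such a table lookup follows. The table here is deliberately incomplete (it lists only the leap seconds inserted at the end of 1998 and the end of 2005), and the names `LEAP_BOUNDARIES` and `elapsed_seconds` are hypothetical.

```python
from bisect import bisect_right

# Unix time numbers at which the day following an inserted leap second begins.
# Illustrative excerpt only: the leap seconds at the end of 1998 and of 2005.
LEAP_BOUNDARIES = [915148800, 1136073600]

def elapsed_seconds(t0, t1):
    """SI seconds between two Unix times, counting inserted leap seconds.

    A time number that could fall within a leap second is ambiguous and is
    treated here as the instant after the leap second.
    """
    leaps = bisect_right(LEAP_BOUNDARIES, t1) - bisect_right(LEAP_BOUNDARIES, t0)
    return (t1 - t0) + leaps

print(elapsed_seconds(915148799, 915148801))  # 3
```

Plain subtraction would give 2 for this interval, although three seconds (23:59:59, 23:59:60 and 00:00:00) actually elapsed.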
A Unix time number is easily converted back into UTC by dividing it by 86 400 and taking the quotient and the remainder. The quotient is the number of days since the epoch, and the remainder is the number of seconds since midnight UTC on that day. (For times before the epoch the division must round towards negative infinity, so that the remainder is never negative.) If given a Unix time number that is ambiguous due to a positive leap second, this algorithm will interpret it as the time just after midnight. It will never generate a time that is during a leap second. If given a Unix time number that is invalid due to a negative leap second, it will generate an equally invalid UTC time. If these conditions are significant then it is necessary to consult a table of leap seconds in order to detect them.
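The conversion can be sketched as follows for integer time numbers. The function name `decode_unix_time` is hypothetical. Python's `divmod` rounds the quotient towards negative infinity, which is the behaviour needed for times before the epoch; in languages whose integer division truncates towards zero, an explicit adjustment would be required.

```python
from datetime import date, timedelta

def decode_unix_time(t):
    """Split an integer Unix time into a UTC calendar date and time of day."""
    days, secs = divmod(t, 86400)  # floored: 0 <= secs < 86400 even when t < 0
    d = date(1970, 1, 1) + timedelta(days=days)
    hours, rem = divmod(secs, 3600)
    minutes, seconds = divmod(rem, 60)
    return d.isoformat(), hours, minutes, seconds

print(decode_unix_time(1095357343))  # ('2004-09-16', 17, 55, 43)
print(decode_unix_time(-1))          # ('1969-12-31', 23, 59, 59)
print(decode_unix_time(915148800))   # ('1999-01-01', 0, 0, 0)
```

The last call shows the ambiguous value being decoded as midnight; the function can never return 23:59:60.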
NTP-based variant
A popular model for managing a Unix clock is that given in "A Kernel Model for Precision Timekeeping" by David L. Mills, the inventor of the Network Time Protocol (NTP). The objective of the model is to allow the Unix clock to be used as a component and an output of a synchronisation system such as NTP. Part of it specifies leap second behaviour.
In addition to the conventional Unix time number, this type of clock maintains a state variable that controls its leap second state machine. This variable can be retrieved along with the Unix time number using the ntp_gettime() function. The state variable normally has the value TIME_OK when no leap second is nearby. It can be used to disambiguate timestamps around a leap second.
This is what happens around a positive leap second:
TAI | UTC | state | Unix clock |
---|---|---|---|
1999-01-01T00:00:29.75 | 1998-12-31T23:59:58.75 | TIME_INS | 915 148 798.75 |
1999-01-01T00:00:30.00 | 1998-12-31T23:59:59.00 | TIME_INS | 915 148 799.00 |
1999-01-01T00:00:30.25 | 1998-12-31T23:59:59.25 | TIME_INS | 915 148 799.25 |
1999-01-01T00:00:30.50 | 1998-12-31T23:59:59.50 | TIME_INS | 915 148 799.50 |
1999-01-01T00:00:30.75 | 1998-12-31T23:59:59.75 | TIME_INS | 915 148 799.75 |
1999-01-01T00:00:31.00 | 1998-12-31T23:59:60.00 | TIME_OOP | 915 148 799.00 |
1999-01-01T00:00:31.25 | 1998-12-31T23:59:60.25 | TIME_OOP | 915 148 799.25 |
1999-01-01T00:00:31.50 | 1998-12-31T23:59:60.50 | TIME_OOP | 915 148 799.50 |
1999-01-01T00:00:31.75 | 1998-12-31T23:59:60.75 | TIME_OOP | 915 148 799.75 |
1999-01-01T00:00:32.00 | 1999-01-01T00:00:00.00 | TIME_WAIT | 915 148 800.00 |
1999-01-01T00:00:32.25 | 1999-01-01T00:00:00.25 | TIME_WAIT | 915 148 800.25 |
1999-01-01T00:00:32.50 | 1999-01-01T00:00:00.50 | TIME_WAIT | 915 148 800.50 |
1999-01-01T00:00:32.75 | 1999-01-01T00:00:00.75 | TIME_WAIT | 915 148 800.75 |
1999-01-01T00:00:33.00 | 1999-01-01T00:00:01.00 | TIME_WAIT | 915 148 801.00 |
1999-01-01T00:00:33.25 | 1999-01-01T00:00:01.25 | TIME_WAIT | 915 148 801.25 |
This behaviour of the Unix time number does not conform to POSIX.1, but is very common. As a result, Unix times such as 915 148 799.50, apparently in the second preceding a leap second, are de facto ambiguous, as are (both de facto and de jure) times such as 915 148 800.50. For programs that need to handle the leap second properly, the TIME_OOP state makes it clear which is the leap second.
The only leap second state that needs to be distinguished in order to decode the time properly is TIME_OOP. TIME_INS and TIME_DEL theoretically act as advance warning of the leap, but with no guarantee of how far in advance the state changes this is of limited use. Applications requiring advance knowledge of leap seconds are better off using out-of-band mechanisms. However, the full set of states has another use. Commonly the leap second handling is not synchronous with the change of the Unix time number, so the behaviour is not quite as described here. This is discussed in the next subsection.
Non-synchronous NTP-based variant
Commonly a Mills-style Unix clock (as described in the preceding section) is implemented with leap second handling not synchronous with the change of the Unix time number. The time number initially increases normally where a leap should have occurred, and then it leaps a few milliseconds later. This is done in order to make implementation easier, and is suggested in passing by Mills's paper. This is what happens across a positive leap second:
TAI | UTC | state | Unix clock |
---|---|---|---|
1999-01-01T00:00:29.75 | 1998-12-31T23:59:58.75 | TIME_INS | 915 148 798.75 |
1999-01-01T00:00:30.00 | 1998-12-31T23:59:59.00 | TIME_INS | 915 148 799.00 |
1999-01-01T00:00:30.25 | 1998-12-31T23:59:59.25 | TIME_INS | 915 148 799.25 |
1999-01-01T00:00:30.50 | 1998-12-31T23:59:59.50 | TIME_INS | 915 148 799.50 |
1999-01-01T00:00:30.75 | 1998-12-31T23:59:59.75 | TIME_INS | 915 148 799.75 |
1999-01-01T00:00:31.00 | 1998-12-31T23:59:60.00 | TIME_INS | 915 148 800.00 |
1999-01-01T00:00:31.25 | 1998-12-31T23:59:60.25 | TIME_OOP | 915 148 799.25 |
1999-01-01T00:00:31.50 | 1998-12-31T23:59:60.50 | TIME_OOP | 915 148 799.50 |
1999-01-01T00:00:31.75 | 1998-12-31T23:59:60.75 | TIME_OOP | 915 148 799.75 |
1999-01-01T00:00:32.00 | 1999-01-01T00:00:00.00 | TIME_OOP | 915 148 800.00 |
1999-01-01T00:00:32.25 | 1999-01-01T00:00:00.25 | TIME_WAIT | 915 148 800.25 |
1999-01-01T00:00:32.50 | 1999-01-01T00:00:00.50 | TIME_WAIT | 915 148 800.50 |
1999-01-01T00:00:32.75 | 1999-01-01T00:00:00.75 | TIME_WAIT | 915 148 800.75 |
1999-01-01T00:00:33.00 | 1999-01-01T00:00:01.00 | TIME_WAIT | 915 148 801.00 |
1999-01-01T00:00:33.25 | 1999-01-01T00:00:01.25 | TIME_WAIT | 915 148 801.25 |
This can be decoded properly by paying attention to the leap second state variable, which unambiguously indicates whether the leap has been performed yet. The state variable change is synchronous with the leap.
A similar situation arises with a negative leap second, where the second that is skipped is slightly too late. Very briefly the system shows a nominally impossible time number, but this can be detected by the TIME_DEL state and corrected.
In this type of system the Unix time number violates POSIX around both types of leap second. Collecting the leap second state variable along with the time number allows for unambiguous decoding, so the correct POSIX time number can be generated if desired, or the full UTC time can be stored in a more suitable format.
The decoding logic required to cope with this style of Unix clock would also correctly decode a hypothetical POSIX-conforming clock using the same interface. This would be achieved by indicating the TIME_INS state during the entirety of an inserted leap second, then indicating TIME_WAIT during the entirety of the following second while repeating the seconds count. This requires synchronous leap second handling. This is probably the best way to express UTC time in Unix clock form, via a Unix interface, when the underlying clock is fundamentally untroubled by leap seconds.
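One possible form of that decoding logic, for an inserted leap second, is sketched below. It is an illustration under stated assumptions, not the interface itself: the states are passed as plain strings standing in for the constants reported with the time, the function name `decode_with_state` is hypothetical, and TIME_INS is assumed to be indicated only in the run-up to the day boundary at which the leap is applied.

```python
def decode_with_state(clock, state):
    """Return (posix_time_number, in_leap_second) around a positive leap second.

    `clock` is the reported Unix clock value; `state` is one of "TIME_OK",
    "TIME_INS", "TIME_OOP" or "TIME_WAIT" (string stand-ins, an assumption).
    """
    secs_into_day = clock % 86400
    if state == "TIME_INS" and secs_into_day < 1:
        # The clock has run past midnight but the leap has not been applied:
        # this instant lies in the leap second, and the value is already the
        # number POSIX assigns to it.
        return clock, True
    if state == "TIME_OOP" and secs_into_day >= 86399:
        # The leap has been applied and the clock is repeating the last
        # second of the day; POSIX numbers the leap second one higher.
        return clock + 1, True
    return clock, False

print(decode_with_state(915148800.00, "TIME_INS"))  # (915148800.0, True)
print(decode_with_state(915148799.25, "TIME_OOP"))  # (915148800.25, True)
print(decode_with_state(915148800.00, "TIME_OOP"))  # (915148800.0, False)
```

The same function handles the synchronous variant (TIME_OOP throughout the repeated second) and the hypothetical conforming clock (TIME_INS throughout the leap second, TIME_WAIT afterwards).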
TAI-based variant
Another, much rarer, non-conforming variant of Unix time keeping involves encoding TAI rather than UTC. Because TAI has no leap seconds, and every TAI day is exactly 86 400 s long, this encoding is actually a pure linear count of seconds elapsed since 1970-01-01T00:00:00 TAI. This makes time interval arithmetic much easier. Time values from these systems do not suffer the ambiguity that strictly conforming POSIX systems or NTP-driven systems have.
In these systems it is necessary to consult a table of leap seconds in order to correctly convert between UTC and the pseudo-Unix-time representation. This resembles the manner in which time zone tables must be consulted in order to convert to and from civil time. The leap second table must be updated (from the published leap second bulletins) more frequently than the time zone tables, because leap seconds occur at shorter notice than changes to daylight saving time rules. (A standard Unix time system must similarly consult a leap second table to convert to and from TAI, but this is a much rarer requirement.) Conversion also runs into definitional problems prior to the 1972 commencement of the current form of UTC (see the later section about UTC).
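The table lookup might look like the following sketch, which converts a standard Unix time to the TAI-based count. The table is an excerpt containing only the two offsets that took effect at the start of 1999 and of 2006, and the names `OFFSETS` and `unix_to_tai_count` are hypothetical.

```python
from bisect import bisect_right

# (Unix time from which the offset applies, TAI minus UTC in seconds).
# Illustrative excerpt; a complete table lists every leap second since 1972.
OFFSETS = [(915148800, 32), (1136073600, 33)]
STARTS = [start for start, _ in OFFSETS]

def unix_to_tai_count(unix_time):
    """Seconds since 1970-01-01T00:00:00 TAI for a given UTC-based Unix time."""
    i = bisect_right(STARTS, unix_time) - 1
    if i < 0:
        raise ValueError("time precedes the excerpted table")
    return unix_time + OFFSETS[i][1]

print(unix_to_tai_count(1095379200))  # 1095379232
```

The example value is 2004-09-17T00:00:00 UTC, which corresponds to 2004-09-17T00:00:32 TAI.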
This TAI-based system, despite its superficial resemblance, is not Unix time. It encodes times with significantly different values from the POSIX time values, and does not have the simple mathematical relationship to UTC that is mandated by POSIX.
Representing the number
A Unix time number can be represented in any form capable of representing numbers. In some applications the number is simply represented textually as a string of decimal digits, raising only trivial additional issues. However, there are certain binary representations of Unix times that are of particular significance.
The standard Unix time_t (data type representing a point in time) is a signed integer data type, traditionally of 32 bits (but see below), directly encoding the Unix time number as described in the preceding section. Being integer means that it has a resolution of one second; many Unix applications therefore handle time only to that resolution. Being 32 bits (of which one bit is the sign bit) means that it covers a range of about 136 years in total. The minimum representable time is 1901-12-13T20:45:52Z, and the maximum representable time is 2038-01-19T03:14:07Z. At 2038-01-19T03:14:08Z this representation will overflow. This milestone is anticipated with a mixture of amusement and dread; see the separate section below.
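The limits quoted above follow directly from the range of a signed 32-bit integer, as this short sketch shows. The helper `as_utc` is hypothetical and ignores leap seconds, as Unix time itself does.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def as_utc(t):
    """Calendar form of an integer Unix time, ignoring leap seconds."""
    return (EPOCH + timedelta(seconds=t)).strftime("%Y-%m-%dT%H:%M:%SZ")

print(as_utc(-2**31))     # 1901-12-13T20:45:52Z
print(as_utc(2**31 - 1))  # 2038-01-19T03:14:07Z
```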
In some newer operating systems, time_t has been widened to 64 bits. In the negative direction, this goes back more than twenty times the age of the universe, and so suffices. In the positive direction, whether the 290 billion representable years is truly sufficient depends on the ultimate fate of the universe, but it is certainly adequate for most practical purposes.
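The orders of magnitude can be checked with a rough calculation. The sketch assumes a mean Gregorian year of 365.2425 days, and the function name `years_representable` is hypothetical.

```python
SECONDS_PER_YEAR = 365.2425 * 86400  # mean Gregorian year (assumption)

def years_representable(bits):
    """Approximate span in years on one side of the epoch for a signed integer."""
    return 2 ** (bits - 1) / SECONDS_PER_YEAR

print(round(years_representable(32), 1))  # 68.1
print(f"{years_representable(64):.3g}")   # 2.92e+11
```

The 32-bit figure, doubled to cover both directions, gives the range of about 136 years mentioned earlier.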
There has been some controversy over whether the Unix time_t should be signed or unsigned. If unsigned, its range in the future would be doubled, postponing the 32-bit overflow. However, it would then be incapable of representing times prior to 1970. Dennis Ritchie, when asked about this issue, said that he hadn't thought very deeply about it, but was of the opinion that the ability to represent all times within his lifetime would be nice. (Ritchie's birth time is around Unix time -893,400,000.) The consensus, and universal practice, is for time_t to be signed.
The POSIX and Open Group Unix specifications include the ISO C standard library, which includes the time types and functions defined in the <time.h> header file. The ISO C standard states that time_t must be an arithmetic type, but does not mandate any specific type or encoding for it.
Unix has no tradition of directly representing non-integer Unix time numbers as binary fractions. Instead, times with sub-second precision are represented using compound data types that consist of two integers, the first being a time_t (the integral part of the Unix time), and the second being the fractional part of the time number in millionths (in struct timeval) or billionths (in struct timespec). These structures provide a decimal-based fixed-point data format, which is useful for some applications, and trivial to convert for others.
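The split into two integers can be illustrated as follows. This is a sketch in Python rather than the C structures themselves: the functions `to_timeval` and `to_timespec` are hypothetical and return plain tuples, and the input is taken as a decimal string so that no binary floating-point rounding is introduced. Digits beyond the structure's resolution are truncated.

```python
from decimal import Decimal

def to_timeval(unix_time):
    """(seconds, microseconds), mirroring the two fields of struct timeval."""
    total_usec = int(Decimal(unix_time) * 1_000_000)
    return divmod(total_usec, 1_000_000)

def to_timespec(unix_time):
    """(seconds, nanoseconds), mirroring the two fields of struct timespec."""
    total_nsec = int(Decimal(unix_time) * 1_000_000_000)
    return divmod(total_nsec, 1_000_000_000)

print(to_timeval("1095357343.54"))   # (1095357343, 540000)
print(to_timespec("1095357343.54"))  # (1095357343, 540000000)
print(to_timeval("-0.25"))           # (-1, 750000)
```

The last line shows one common normalisation for times before the epoch, in which the fractional field stays non-negative and the seconds field is rounded down.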
UTC basis
The present form of UTC, with leap seconds, is defined only from January 1, 1972 onwards. Prior to that, since January 1, 1961 there was an older form of UTC in which not only were there occasional time steps, which were by non-integer numbers of seconds, but also the UTC second was slightly longer than the SI second, and periodically changed, in order to continuously approximate the Earth's rotation. Prior to 1961 there was no UTC, and prior to 1958 there was no widespread atomic timekeeping; in these eras, some approximation of GMT (based directly on the Earth's rotation) was used instead of an atomic timescale.
The precise definition of Unix time as an encoding of UTC is only uncontroversially applicable to the present form of UTC. Fortunately, the fact that the Unix epoch predates the start of this form of UTC does not affect its use in this era: the number of days from January 1, 1970 (the Unix epoch) to January 1, 1972 (the start of UTC) is not in question, and the number of days is all that is significant to Unix time.
The meaning of Unix time values below +63072000 (i.e., prior to January 1, 1972) is not precisely defined. The basis of such Unix times is best understood to be an unspecified approximation of GMT. Computers of that era rarely had clocks set sufficiently accurately to provide meaningful sub-second timestamps in any case. Unix time is not a suitable way to represent times prior to 1972 in applications requiring sub-second precision; such applications must, at least, define which form of UT or GMT they are using.
As of 2004, the possibility of ending the use of leap seconds in civil time is being considered. A likely means to execute this change is to define a new time scale, called "International Time", that initially matches UTC but thereafter has no leap seconds, thus remaining at a constant offset from TAI. If this happens, it is likely that Unix time will be prospectively defined in terms of this new time scale, instead of UTC. Uncertainty about whether this will occur makes prospective Unix time no less predictable than it already is: if UTC were simply to have no further leap seconds the result would be the same.
History
The earliest versions of Unix time had a 32-bit integer incrementing at a rate of 60 Hz, which was the rate of the system clock on the hardware of the early Unix systems. The value 60 Hz still appears in some software interfaces as a result. The epoch also differed from the current value. The first edition Unix Programmer's Manual dated November 3, 1971 defines the Unix time as "the time since 00:00:00, Jan. 1, 1971, measured in sixtieths of a second". It also comments that "the chronologically-minded user will note that 2^32 sixtieths of a second is only about 2.5 years". Because of this limited range, the epoch was redefined more than once, before the rate was changed to 1 Hz and the epoch was set to its present value. This yielded a range in excess of 130 years, though with more than half the range in the past (see discussion of signedness above).
As indicated by the definition quoted above, the Unix time scale was originally intended to be a simple linear representation of time elapsed since an epoch. However, there was no consideration of the details of time scales, and it was implicitly assumed that there was a simple linear time scale already available and agreed upon. Indeed, the first edition manual's definition doesn't even specify which timezone is used. Several later problems, including the complexity of the present definition, result from Unix time having been defined gradually by usage rather than fully defined to start with.
When POSIX.1 was written, in the 1980s (it was published in 1988), the question arose of how to precisely define time_t in the face of leap seconds. Some argued for it to remain, as intended, a linear count of seconds since the epoch, at the expense of complexity in conversions with civil time. Others argued for it to remain, as conflictingly intended, easily interconvertible with the conventional representation of civil time, at the expense of inconsistency around leap seconds. Computer clocks of the era were not sufficiently precisely set to form a precedent one way or the other. The POSIX committee was swayed by arguments against complexity in the library functions, and firmly defined the Unix time in a simple manner in terms of the elements of UTC time. Unfortunately, this definition was so simple that it didn't even encompass the entire leap year rule of the Gregorian calendar, and would make 2100 a leap year.
The 2001 edition of POSIX.1 rectified the faulty leap year rule in the definition of Unix time, but retained the essential definition of Unix time as an encoding of UTC rather than a linear time scale. Also, since the mid-1990s computer clocks have been routinely set with sufficient precision for this to matter, and they have most commonly been set using the UTC-based definition of Unix time. This has resulted in considerable complexity in Unix implementations, and in the Network Time Protocol, in order to execute steps in the Unix time number whenever leap seconds occur.
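The effect of the correction can be sketched by comparing the two definitions. The expressions below are written from the form in which POSIX is generally understood to state them (fields `tm_year`, counted from 1900, and `tm_yday`, counted from 0), and should be checked against the standard itself; the function names `posix_1988` and `posix_2001` are hypothetical. Python's `//` matches C integer division here because all operands are non-negative for years from 1970 onwards.

```python
import calendar

def posix_1988(tm_year, tm_yday, tm_hour=0, tm_min=0, tm_sec=0):
    """Original expression: every fourth year is counted as a leap year."""
    return (tm_sec + tm_min * 60 + tm_hour * 3600 + tm_yday * 86400
            + (tm_year - 70) * 31536000 + ((tm_year - 69) // 4) * 86400)

def posix_2001(tm_year, tm_yday, tm_hour=0, tm_min=0, tm_sec=0):
    """Corrected expression with the century and 400-year terms added."""
    return (posix_1988(tm_year, tm_yday, tm_hour, tm_min, tm_sec)
            - ((tm_year - 1) // 100) * 86400
            + ((tm_year + 299) // 400) * 86400)

# 2101-01-01T00:00:00Z is the first instant at which the two disagree.
print(posix_2001(201, 0) == calendar.timegm((2101, 1, 1, 0, 0, 0)))  # True
print(posix_1988(201, 0) - posix_2001(201, 0))                       # 86400
```

The one-day difference is the spurious February 29, 2100 that the original expression counts.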
In 2004 POSIX added new interfaces making several different time scales available to programs, splitting up the many uses to which Unix times have traditionally been put. The future is one where time values are accompanied by explicit labels of the time scale defining their significance. Unix time as described in this article will still be in wide use for decades to come, but is likely to be increasingly treated as a legacy system and superseded by better-defined systems.
32-bit overflow
At 03:14:08 UTC on January 19, 2038 (+2^31), a 32-bit signed integer representation of Unix time will overflow. Systems using a 32-bit signed integer Unix time_t will therefore be unable to represent that time, or any later, and will likely wrap around to 20:45:52 UTC on December 13, 1901, with integer value -2^31.
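The wrap-around can be simulated with a fixed-width integer type. This sketch uses Python's `ctypes.c_int32`, which silently discards the overflowing bit; the function name `next_second_32bit` is hypothetical. In C itself, signed overflow is undefined behaviour, so wrapping is the typical outcome rather than a guaranteed one.

```python
import ctypes

def next_second_32bit(t):
    """Increment a Unix time held in a signed 32-bit integer, wrapping on overflow."""
    return ctypes.c_int32(t + 1).value

print(next_second_32bit(2**31 - 1))  # -2147483648
```

The result is the most negative 32-bit value, which decodes to the December 1901 time given above.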
Programs which must handle times beyond the overflow date will need to be changed to use a 64-bit time_t, a bignum representation of Unix time, or some other means of representing points in time. This is similar to the year 2000 problem. Adapting existing programs may be as easy as re-compiling them with header files that declare time_t as a 64-bit integer. However, this will create problems if the time_t value is passed from that program to other code that has not had the definition changed (such as the system's C shared library). In addition, some programs make deep assumptions about the nature of time_t, and the source code of some software packages may have been lost by then, in which case programmers might have to reverse engineer the software to change its date behavior. Some claim that the expiration of 32-bit time_t may cause more damage than was predicted for the year 2000 problem.
time_t parties
Unix enthusiasts have a history of holding time_t parties to celebrate significant values of the Unix time number. These are directly analogous to the new year celebrations that occur at the change of year in many calendars. As the use of Unix time has spread, so has the practice of celebrating its milestones. Usually it is time values that are round numbers in decimal that are celebrated, following the Unix convention of viewing time_t values in decimal. Among some groups round binary numbers are also celebrated, such as +2^30 which occurred at 13:37:04 UTC on January 10, 2004.
The events that these celebrate are typically described as "N seconds since the Unix epoch", but this is inaccurate. As discussed above, due to the handling of leap seconds in Unix time, the number of seconds elapsed since the Unix epoch is slightly greater than the Unix time number, for times later than the epoch.
At 01:46:40 UTC on September 9, 2001, the Unix billennium (Unix time number 1000000000) was celebrated.
At 01:58:31 UTC on March 18, 2005, the Unix time number reached 1111111111.
At 23:31:30 UTC on February 13, 2009, a celebration is expected as the Unix time number reaches 1234567890. This day happens to fall on Friday the 13th on the Gregorian calendar.
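The dates of such milestones can be checked with a few lines of Python. The helper `milestone_utc` is hypothetical; it relies on the standard library's conversion, which, like Unix time, takes no account of leap seconds.

```python
from datetime import datetime, timezone

def milestone_utc(n):
    """UTC date and time at which the Unix time number reaches n."""
    return datetime.fromtimestamp(n, tz=timezone.utc)

for n in (2**30, 1_000_000_000, 1_111_111_111, 1_234_567_890):
    print(n, milestone_utc(n).strftime("%Y-%m-%dT%H:%M:%SZ"))

# weekday() counts from Monday = 0, so 4 is a Friday.
print(milestone_utc(1_234_567_890).weekday())  # 4
```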
Unix time in literature
Vernor Vinge's novel A Deepness in the Sky describes a space-faring trading civilization tens of thousands of years (hundreds of gigaseconds) in the future that still uses the Unix epoch. It is noted that this epoch is the first second of the first year after man first walked on the moon. Leap seconds are not mentioned, and the Earth makes no appearance in the story, so it seems unlikely that the time system in use was Unix time as it is currently understood.
External links
- Unix Programmer's Manual, first edition
- The UnixTime Apocalypse
- Online tool to convert from Unix time to human-friendly representations, and back
- Unixtime online: Convert unix time to plain English, and vice-versa!
- personal account of the POSIX decisions by Landon Curt Noll
- BASH: Convert Unix Timestamp to a Date