Keystroke dynamics

From Wikipedia, the free encyclopedia

Keystroke dynamics, or typing dynamics, is the detailed timing information that describes exactly when each key was depressed and when it was released as a person is typing at a computer keyboard.

Science of Keystroke Dynamics

The behavioral biometric of keystroke dynamics uses the manner and rhythm in which an individual types characters on a keyboard or keypad. The keystroke rhythms of a user are measured to develop a unique biometric template of the user's typing pattern for future authentication. Raw measurements available from almost every keyboard can be recorded to determine dwell time (the time a key is held down) and flight time (the time between “key down” and the next “key down”, and the time between “key up” and the next “key up”). The recorded keystroke timing data is then processed by an algorithm, in some systems a neural network, which determines a primary pattern for future comparison.
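As an illustration of the two raw measurements described above, the following minimal sketch derives dwell and flight times from a list of key events. The `KeyEvent` record, its field names, and the sample timestamps are hypothetical; real keyboards and operating systems report events in their own formats, and the events are assumed to be ordered by key-down time.

```python
from dataclasses import dataclass


@dataclass
class KeyEvent:
    """Hypothetical record for one keystroke (not a real OS structure)."""
    key: str        # key label, e.g. "t"
    down_ms: float  # time the key was pressed, in milliseconds
    up_ms: float    # time the key was released, in milliseconds


def dwell_times(events):
    """Dwell time: how long each key was held down."""
    return [e.up_ms - e.down_ms for e in events]


def flight_times(events):
    """Flight times for each consecutive pair of keys.

    Returns (down-to-down, up-to-up) tuples, matching the two
    intervals named in the text.
    """
    return [
        (b.down_ms - a.down_ms, b.up_ms - a.up_ms)
        for a, b in zip(events, events[1:])
    ]


# Made-up timings for the word "the".
sample = [KeyEvent("t", 0, 90), KeyEvent("h", 140, 220), KeyEvent("e", 260, 330)]
print(dwell_times(sample))
print(flight_times(sample))
```

A template for a user could then be built from statistics over many such measurements; how that is done varies by system and is not specified here.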

Data needed to analyze keystroke dynamics is obtained by keystroke logging. Normally, all that is retained when logging a typing session is the sequence of characters corresponding to the order in which keys were pressed; the timing information is discarded. Someone reading an email containing the phrase "I saw 3 zebras!" cannot tell from the text alone whether:

  • the phrase was typed rapidly or slowly
  • the writer used the left shift, the right shift, or the caps-lock key to turn the "i" into a capital "I"
  • the letters were all typed at the same pace, or there was a long pause before the letter "z" or the numeral "3" while the writer looked for that key
  • the writer typed any letters wrong initially and then went back and corrected them, or got them right the first time

Origin of Keystroke Dynamics

On May 24, 1844, the message "What hath God wrought" was sent by telegraph from the U.S. Capitol in Washington, D.C. to the B&O Railroad "outer depot" in Baltimore, Maryland. A new era in long-distance communications had begun. By the 1860s the telegraph revolution was in full swing and telegraph operators were a valuable resource. With experience, each operator developed a unique “signature” and could be identified simply by their tapping rhythm.

As late as World War II the military transmitted messages in Morse code. Using a methodology called "The Fist of the Sender," military intelligence found that each individual had a unique way of keying in a message's "dots" and "dashes," creating a rhythm that could help distinguish ally from enemy.

Use as biometric data

Researchers are interested in using this keystroke dynamic information, which is normally discarded, to verify or even try to determine the identity of the person who is producing those keystrokes. This is often possible because some characteristics of keystroke production are as individual as handwriting or a signature. The techniques used to do this vary widely in power and sophistication, and range from statistical techniques to neural networks and other artificial intelligence methods.

In the simplest case, very simple rules can be used to rule out a possible user. For example, if we know that John types at 20 words per minute, and the person at the keyboard is going at 70 words per minute, it's a pretty safe bet that it's not John. That would be a test based simply on raw speed uncorrected for errors. It's only a one-way test, as it's always possible for people to go slower than normal, but it's unusual or impossible for them to go twice their normal speed.

Or, it may be that the mystery user at the keyboard and John both type at 50 words per minute; but John never really learned the numbers, and always has to slow down an extra half-second whenever a number has to be entered. If the mystery user doesn't slow down for numbers, then, again, it's a safe bet this isn't John.
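The two rule-out tests described above can be sketched as follows. The function names, the 1.5x speed ratio, and the 0.25-second tolerance are illustrative assumptions rather than values taken from any published system.

```python
from statistics import mean


def ruled_out_by_speed(known_wpm, observed_wpm, max_ratio=1.5):
    """One-way test on raw speed.

    Typing much faster than the known user rules that user out.
    Typing slower proves nothing, since anyone can slow down.
    max_ratio is an assumed cut-off, not an established constant.
    """
    return observed_wpm > known_wpm * max_ratio


def ruled_out_by_digit_pause(known_extra_pause_s, letter_gaps_s, digit_gaps_s,
                             tolerance_s=0.25):
    """Compare the observed extra pause before digits with the known habit.

    letter_gaps_s and digit_gaps_s are the gaps (in seconds) preceding
    letter keys and digit keys in the observed session.
    """
    observed_extra = mean(digit_gaps_s) - mean(letter_gaps_s)
    return abs(observed_extra - known_extra_pause_s) > tolerance_s


# John types at 20 wpm; the mystery user types at 70 wpm.
print(ruled_out_by_speed(20, 70))
# John pauses an extra half-second before digits; this user does not.
print(ruled_out_by_digit_pause(0.5, [0.20, 0.22, 0.18], [0.21, 0.19]))
```

Both tests can only exclude a candidate; passing them does not confirm that the person at the keyboard is John.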

The time to get to and depress a key (seek-time) and the time the key is held down (hold-time) may be very characteristic of a person, regardless of how fast they are typing overall. Most people have specific letters that take them longer to find or reach than their average seek-time over all letters; which letters those are may vary dramatically from person to person, but consistently for any one person. Right-handed people may be statistically faster at reaching keys they hit with their right-hand fingers than keys they hit with their left-hand fingers. Index fingers may be characteristically faster than other fingers, to a degree that is consistent for a person from day to day regardless of their overall speed that day.
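One way to capture per-key timing that is independent of overall speed is to express each key's mean time relative to the typist's overall mean. The sketch below is an assumption about how this could be done, not a method taken from the cited literature; the function names and sample data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def relative_profile(samples):
    """Build a speed-independent per-key profile.

    samples: list of (key, seconds) pairs, e.g. seek-times or hold-times.
    Returns a dict mapping key -> (mean time for that key / overall mean).
    """
    by_key = defaultdict(list)
    for key, seconds in samples:
        by_key[key].append(seconds)
    overall = mean(seconds for _, seconds in samples)
    return {key: mean(times) / overall for key, times in by_key.items()}


def profile_distance(p, q):
    """Mean absolute difference over the keys present in both profiles."""
    shared = p.keys() & q.keys()
    if not shared:
        raise ValueError("profiles share no keys")
    return mean(abs(p[k] - q[k]) for k in shared)


# The same typist on a fast day and on a day at half speed.
fast_day = [("a", 0.10), ("q", 0.30), ("a", 0.10), ("z", 0.30)]
slow_day = [(key, 2 * seconds) for key, seconds in fast_day]
# A different typist who is slow on "a" and quick on "q".
other = [("a", 0.30), ("q", 0.10), ("z", 0.20)]

print(profile_distance(relative_profile(fast_day), relative_profile(slow_day)))
print(profile_distance(relative_profile(fast_day), relative_profile(other)))
```

Because every time is divided by the overall mean, halving the typing speed leaves the profile essentially unchanged, while a typist with different slow keys produces a clearly larger distance.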

In addition, sequences of letters may have characteristic properties for a person. In English, the word "the" is very common, and those three letters may be typed as a practiced rapid-fire sequence rather than as three unrelated letters hit in that order. Common endings, such as "ing", may be entered far faster than, say, the same letters in reverse order ("gni"), to a degree that varies consistently by person. Readers can try this themselves by comparing their speed at entering "ing ing ing ing" with "gni gni gni gni". This consistency may hold, and may reveal the common sequences of the person's native language, even when they are writing entirely in a different language, much as an accent might in spoken English.
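The "ing" versus "gni" comparison can be measured directly from logged key-down times. The following sketch collects the duration of every letter sequence of a given length; the input format and the sample timings are made up for illustration.

```python
from collections import defaultdict


def ngram_durations(keystrokes, n=3):
    """Collect durations of every n-character sequence.

    keystrokes: list of (char, key_down_ms) pairs in typing order.
    Returns a dict mapping each n-gram to a list of durations, measured
    from the first key-down to the last key-down of the sequence.
    """
    durations = defaultdict(list)
    for i in range(len(keystrokes) - n + 1):
        window = keystrokes[i:i + n]
        text = "".join(char for char, _ in window)
        durations[text].append(window[-1][1] - window[0][1])
    return durations


# Hypothetical session: "ing" typed quickly, then "gni" typed slowly.
typed = [("i", 0), ("n", 80), ("g", 150), (" ", 300),
         ("g", 500), ("n", 720), ("i", 950)]
durations = ngram_durations(typed)
print(durations["ing"], durations["gni"])
```

Averaging such durations over a long session would give a per-person table of fast and slow sequences that could be compared between sessions.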

Common "errors" may also be quite characteristic of a person, and there is an entire taxonomy of errors, such as a person's most common "substitutions", "reversals", "drop-outs", "double-strikes", "adjacent letter hits", "homonyms", and "hold-length errors" (a shift key held down for too short or too long a time). Even without knowing what language a person is working in, these errors might be detected by looking at the rest of the text and at which letters the person goes back and replaces. Again, the patterns of errors might be sufficiently different to distinguish two people.
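A few of the error categories named above can be recognized mechanically when the intended word is known, for instance from the correction the typist eventually made. The sketch below covers only four categories and is a simplified assumption about how such a classifier might look; the function name and labels are hypothetical.

```python
def classify_error(intended, typed):
    """Classify a single typing error by comparing typed text to intent.

    Handles only substitution, reversal, drop-out, and double-strike;
    anything else is reported as "other".
    """
    if typed == intended:
        return "correct"
    if len(typed) == len(intended):
        diffs = [i for i, (a, b) in enumerate(zip(intended, typed)) if a != b]
        if len(diffs) == 1:
            return "substitution"
        if (len(diffs) == 2 and diffs[1] == diffs[0] + 1
                and intended[diffs[0]] == typed[diffs[1]]
                and intended[diffs[1]] == typed[diffs[0]]):
            return "reversal"  # two adjacent letters swapped
    if len(typed) == len(intended) - 1:
        # Drop-out: removing one intended letter yields the typed text.
        if any(intended[:i] + intended[i + 1:] == typed
               for i in range(len(intended))):
            return "drop-out"
    if len(typed) == len(intended) + 1:
        # Double-strike: one letter was typed twice in a row.
        if any(typed[i] == typed[i + 1] and typed[:i] + typed[i + 1:] == intended
               for i in range(len(typed) - 1)):
            return "double-strike"
    return "other"


for attempt in ["teh", "tha", "te", "thhe"]:
    print(attempt, classify_error("the", attempt))
```

Counting how often each label occurs for a given typist would give one possible error profile to compare between sessions.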

Authentication versus identification

Keystroke dynamics patterns are statistical in nature, and are not as reliable as other biometrics used for authentication, such as fingerprints, retinal scans, or DNA. However, they can be captured continuously, not just at start-up time, and may be accurate enough to trigger an alarm prompting another system or person to double-check the situation.

In some cases, a person at gunpoint might be forced to obtain start-up access by entering a password or presenting a particular fingerprint, and could then be replaced at the keyboard by someone else taking over for some bad purpose. In other, less dramatic cases, a doctor might violate business rules by sharing a password with a secretary, or by logging onto a medical system and then leaving the computer logged in while someone else, with or without the doctor's knowledge, uses the system. Keystroke dynamics is one way to detect such problems reliably enough to be worth investigating, because even a 20% true-positive rate would send the word out that this type of behavior is being watched and caught.
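Continuous checking of this kind could be sketched as a sliding window over per-sample distance scores, raising an alarm when recent typing drifts too far from the enrolled pattern. The class name, the window size, and the threshold below are illustrative assumptions; the distance score itself would come from whatever comparison method a system uses.

```python
from collections import deque


class ContinuousMonitor:
    """Raise an alarm when the recent average distance exceeds a threshold."""

    def __init__(self, window=3, threshold=0.5):
        # window and threshold are assumed values chosen for illustration.
        self.scores = deque(maxlen=window)
        self.threshold = threshold

    def observe(self, distance):
        """Record one distance score; return True if an alarm should fire."""
        self.scores.append(distance)
        window_full = len(self.scores) == self.scores.maxlen
        average = sum(self.scores) / len(self.scores)
        return window_full and average > self.threshold


monitor = ContinuousMonitor(window=3, threshold=0.5)
# The enrolled user types for a while, then someone else takes over.
alarms = [monitor.observe(d) for d in [0.1, 0.2, 0.1, 0.9, 0.9]]
print(alarms)
```

Averaging over a window rather than reacting to a single sample is one way to tolerate the occasional unusual keystroke; the alarm would prompt a further check rather than lock the user out.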

Researchers are still a long way from being able to read a keylogger session from a public computer in a library or cafe and identify the person from the keystroke dynamics. It may, however, be possible to confidently rule out certain people as the author if the analysis indicates that the author is "a left-handed person with small hands who doesn't write in English as their primary language."

Temporal variation

One of the major problems that keystroke dynamics runs into is that a person's typing varies substantially during a day and between different days. People may get tired, or angry, or have a beer, or switch computers, or move their keyboard tray to a new location, or be pasting in information from another source (cut-and-paste) or from a voice-to-text converter. Even while typing, a person, for example, may be on the phone or pausing to talk.

On some mornings, perhaps after a long night with little sleep and a lot of drinking, a person's typing may bear little resemblance to the way they type when well-rested. Extra or missed doses of medication could also change their rhythm. There are hundreds of confounders.

Because of these variations, almost any system will have error rates, both false positives and false negatives. A valid solution that uses keystroke dynamics must take these elements into account.
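The two kinds of error are commonly summarized as a false acceptance rate (an impostor is accepted) and a false rejection rate (the genuine user is rejected). The sketch below computes both for a given decision threshold, assuming each typing sample has already been reduced to a distance from the enrolled pattern; the function name and the sample distances are made up.

```python
def error_rates(genuine_distances, impostor_distances, threshold):
    """Return (false acceptance rate, false rejection rate).

    A sample is accepted when its distance is at or below the threshold.
    """
    false_rejects = sum(d > threshold for d in genuine_distances)
    false_accepts = sum(d <= threshold for d in impostor_distances)
    far = false_accepts / len(impostor_distances)
    frr = false_rejects / len(genuine_distances)
    return far, frr


genuine = [0.1, 0.2, 0.3, 0.6]   # includes one "bad morning" sample
impostor = [0.4, 0.7, 0.8, 0.9]

for threshold in (0.35, 0.5, 0.65):
    print(threshold, error_rates(genuine, impostor, threshold))
```

Raising the threshold tolerates more day-to-day variation in the genuine user but admits more impostors, so the two rates generally have to be traded off against each other.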

Commercial products

There are several home and commercial software products which claim to use keystroke dynamics to authenticate a user.

ID Control (http://www.idcontrol.net) delivers keystroke dynamics with KeystrokeID, which the vendor claims offers a low false rejection rate (FRR) and false acceptance rate (FAR) for verification and identification. KeystrokeID users are enrolled and managed through the vendor's integrated, centralized identity and access management solution, ID Control Server.

BioPassword (http://www.biopassword.com) is a patented commercial system which uses keystroke dynamics to restrict access; see the References section below for a link to a review from PC Magazine. The vendor, BioPassword, Inc., received $8 million in new funding, according to a January 2006 trade press release. [1]

Deepnet Security (http://www.deepnetsecurity.com) has also developed a keystroke biometric authentication system, TypeSense. The vendor claims that the product employs new algorithms, such as auto-correlative training and adaptive learning, and achieves better results than other similar products.

iMagic Software [2] makes Trustable Passwords, which is designed for use by large enterprises (the vendor recommends 2,000 or more users) and, according to the vendor, interfaces with all major enterprise infrastructure. Trustable Passwords won the audience vote at the Forrester IT Forum 2006 in Las Vegas; a video of that demonstration is available on the iMagic Software website. iMagic was founded in 2002, and its product is based on patent-pending algorithms. Trustable Passwords is reportedly in use in several major multi-hospital health systems for user authentication, where the vendor states that it outperforms fingerprints in both recognition and user satisfaction. Several major financial institutions are also running pilots. The company has historically kept a low profile but appears to be beginning to publicize its product.

Anyone considering building a new product using keystroke dynamics should understand the legal issues (see below), and should also work out how an authorized program's use of keystroke interception can survive the removal efforts of multiple anti-spyware programs. In this case, the security-enhancing programs may be fighting with each other.

On top of that, if the desired result for a web-based product is to use keystroke dynamics to decide whether to cause a pop-up window to appear, asking for re-entry of a password or other verification question, new pop-up blockers may prevent that feature from functioning.

Legal and regulatory issues

Surreptitious use of key-logging software is on the rise, as of this writing. Use of such software may be in direct and explicit violation of local laws, such as the U.S. Patriot Act, under which such use may constitute wiretapping. This could carry severe penalties, including jail time. See spyware for a better description of user-consent issues and various fraud statutes. Spyware and its use for illegal operations such as bank fraud and identity theft are very much in the news, with even Microsoft issuing new spyware defense products, and tougher laws are very likely in the near future.

Competent legal advice should be obtained before attempting to use or even experiment with such software and keystroke dynamic analysis if consent is not clearly obtained from the people at the keyboard, even though the actual residual "content" of the message (the resultant text) is never analyzed, read, or retained. The status of the "dynamic context" of the text is probably in legal limbo.

There are some patents in this area. Examples:

  • J. Garcia. Personal identification apparatus. Patent No. 4 621 334, U.S. Patent and Trademark Office, 1986.
  • J.R. Young and R.W. Hammon. Method and apparatus for verifying an individual’s identity. Patent No. 4 805 222, U.S. Patent and Trademark Office, 1989.

Other uses

Because keystroke timings are generated by human beings, they are not well correlated with external processes, and are frequently used as a source of hardware-generated random numbers for computer systems.
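The idea of drawing randomness from keystroke timing can be illustrated with a toy sketch that hashes the intervals between keystrokes into a fixed-size value. The use of SHA-256, the function name, and the nanosecond timestamps are assumptions made for illustration; real random number generators combine many sources and estimate entropy far more carefully, so this is not suitable for actual cryptographic use.

```python
import hashlib
import struct


def mix_timings(timestamps_ns):
    """Hash inter-keystroke intervals into 32 bytes (toy entropy mixing).

    timestamps_ns: key-down times in nanoseconds, in typing order.
    """
    digest = hashlib.sha256()
    for earlier, later in zip(timestamps_ns, timestamps_ns[1:]):
        interval = (later - earlier) & 0xFFFFFFFFFFFFFFFF  # fit in 64 bits
        digest.update(struct.pack("<Q", interval))
    return digest.digest()


# Made-up timestamps; only the unpredictable low-order digits matter.
stamps = [0, 1_200_345, 2_950_112, 3_100_987]
pool_bytes = mix_timings(stamps)
print(pool_bytes.hex())
```

The fine-grained variation in human timing is what makes the intervals hard to predict; the hash only condenses that variation into a uniform-looking output.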

References

  • Bergadano, F., Gunetti, D., & Picardi, C. (2002). User authentication through Keystroke Dynamics. ACM Transactions on Information and System Security (TISSEC), 5(4), 367-397.
  • iMagic Software. (vendor web-site [3] May 2006). Notes: Vendor specializing in keystroke authentication for large enterprises.
  • BioPassword. (vendor web-site home [Web Page]). URL [4] [2006, March 6]. Notes: Vendor specializing in keystroke dynamics.
  • Garcia, J. (Inventor). (1986). Personal identification apparatus. (USA 4621334). Notes: US Patent Office - [5]
  • Joyce, R., & Gupta, G. (1990). Identity authorization based on keystroke latencies. Communications of the ACM, 33(2), 168-176. Notes: Review up through 1990
  • Mahar, D., Napier, R., Wagner, M., Laverty, W., Henderson, R. D., & Hiron, M. (1995). Optimizing digraph-latency based biometric typist verification systems: inter and intra typist differences in digraph latency distributions. International Journal of Human-Computer Studies, 43(4), 579-592.
  • Monrose, F., & Rubin, A. D. (1997). Authentication via Keystroke Dynamics. ACM Conference on Computer and Communications Security. Notes: available to subscribers at [6]; much cited
  • Monrose, F., & Rubin, A. D. (2000). Keystroke Dynamics as a Biometric for Authentication. Future Generation Computer Systems, 16, 351-359. Notes: Review 1990 - 1999 [7]
  • Monrose, F., Reiter, M. K., & Wetzel, S. (1999). Password hardening based on keystroke dynamics. Proceedings of the 6th ACM Conference on Computer and Communications Security, 73-82. Notes: Kent Ridge Digital Labs, Singapore
  • Robinson, J. A., Liang, V. M., Chambers, J. A. M., & MacKenzie, C. L. (1998). Computer user Verification using Login String Keystroke Dynamics. IEEE Transactions on Systems, Man, and Cybernetics Part A, 28(2). Notes: [8] Highlights: 10 users were distinguished from 10 "forgers" using 3 classification systems. Hold times were more effective than interkey times for discrimination. Best results used both with a learning classifier. There was a high rate of confounding errors and backspaces in the password samples.
  • Young, J. R., & Hammon, R. W. (Inventors). (1989). Method and apparatus for verifying an individual's identity. (USA 4805222). Notes: US Patent Office - [9]
  • Vertical Company LTD. (vendor web-site [10] October 2006). Notes: Vendor specializing in keystroke authentication solutions for government and commercial agencies.

See also

  • Fist (telegraphy)