Name resolution
Name resolution can refer to any process that further identifies an object or entity from an associated, not-necessarily-unique alphanumeric name:
- In computer systems, it refers to the retrieval of the underlying numeric values corresponding to computer hostnames, account user names, group names, and other named entities;
- In programming languages, it refers to the resolution of the tokens within program expressions to the intended program components;
- In semantics and text extraction, it refers to the determination of the specific person, actor, or object a particular use of a name refers to.
In computer systems
Computer operating systems commonly employ multiple key/value lists that associate easily remembered names with the integer numbers used to identify users, groups, other computers, hardware devices, and other entities. In that context, name resolution refers to the retrieval of numeric values given the associated names, while reverse name resolution refers to the opposite process of finding the name(s) associated with specified numeric values:
- In computer networking, it refers to processes used to obtain the assigned IP addresses needed to communicate with devices whose host or domain names are known. Examples include the Domain Name System (DNS), Network Information Service and Multicast DNS (mDNS). IP addresses for devices on the local segment can in turn be resolved to MAC addresses by invoking the Address Resolution Protocol (ARP).
- Unix operating systems associate both an alphanumeric name and a user or group ID with each user account or defined group of user names.
The GNU C library provides various operating system facilities that shell commands and other applications can call to resolve such names to the corresponding addresses or IDs, and vice versa. Some Linux distributions use an nsswitch.conf file to specify the order in which multiple resolution services are consulted for such lookups.
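The forward and reverse lookups described above can be sketched with Python's standard socket and pwd modules, which expose the platform's resolver and account-database lookups. This is a minimal sketch that assumes a Unix-like system; the helper names resolve_host, user_id, and user_name are illustrative and not part of any library.

```python
import pwd
import socket


def resolve_host(name):
    """Forward resolution: host name -> sorted list of IP address strings."""
    infos = socket.getaddrinfo(name, None)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # the address string is the first element of sockaddr.
    return sorted({info[4][0] for info in infos})


def user_id(name):
    """Forward resolution: account name -> numeric user ID."""
    return pwd.getpwnam(name).pw_uid


def user_name(uid):
    """Reverse resolution: numeric user ID -> account name."""
    return pwd.getpwuid(uid).pw_name
```

For instance, resolve_host("localhost") would normally return a loopback address such as 127.0.0.1, and user_id(user_name(0)) returns 0 on systems where user ID 0 has an account entry. Which services answer these calls (local files, DNS, a directory service) depends on the system's configuration.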
In programming languages
Expressions in computer programs reference variables, data types, functions, classes, objects, libraries, packages and other entities by name. In that context, name resolution refers to the association of those not-necessarily-unique names with the intended program entities. The algorithms that determine what those identifiers refer to in specific contexts are part of the language definition.
The complexity of these algorithms is influenced by the sophistication of the language. For example, name resolution in assembly language usually involves only a single simple table lookup, while name resolution in C++ is extremely complicated as it involves:
- namespaces, which make it possible for an identifier to have different meanings depending on its associated namespace;
- scopes, which make it possible for an identifier to have different meanings at different scope levels, and which involve various scope overriding and hiding rules. At the most basic level, name resolution usually attempts to find the binding in the smallest enclosing scope, so that, for example, local variables supersede global variables; this is called shadowing;
- visibility rules, which determine whether identifiers from specific namespaces or scopes are visible from the current context;
- overloading, which makes it possible for an identifier to have different meanings depending on how it is used, even in a single namespace or scope;
- accessibility, which determines whether identifiers from an otherwise visible scope are actually accessible and participate in the name resolution process.
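The "smallest enclosing scope" rule from the list above can be modelled in a few lines. The following is a toy sketch, not how any particular compiler works: each scope is a plain dictionary, and Python's collections.ChainMap searches them from innermost to outermost. The scope contents and the resolve helper are invented for illustration.

```python
from collections import ChainMap

# Each dict stands for one scope; names and values are illustrative only.
global_scope = {"x": "global x", "y": "global y"}
function_scope = {"x": "local x"}

# ChainMap searches its maps left to right, so the innermost scope goes first.
scopes = ChainMap(function_scope, global_scope)


def resolve(scope_chain, name):
    """Return the binding for `name` from the smallest enclosing scope."""
    try:
        return scope_chain[name]
    except KeyError:
        raise NameError(f"name {name!r} is not defined") from None


print(resolve(scopes, "x"))  # local x  (the local binding shadows the global one)
print(resolve(scopes, "y"))  # global y (found only in the outer scope)
```

Namespaces, visibility, overloading, and accessibility add further filtering steps on top of this basic search, which is where most of the complexity in a language such as C++ comes from.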
Static versus dynamic
In programming languages, name resolution can be performed either at compile time or at runtime. The former is called static name resolution, the latter is called dynamic name resolution.
A somewhat common misconception is that dynamic typing implies dynamic name resolution. For example, Erlang is dynamically typed but has static name resolution. However, static typing does imply static name resolution.
Static name resolution catches, at compile time, the use of variables that are not in scope, preventing programmer errors. Languages with dynamic name resolution sacrifice this safety for more flexibility; they can typically set and get variables in the same scope at runtime.
For example, in Python:
    locals()['num'] = 999  # equivalent to: num = 999
    noun = "troubles"
    noun2 = "hound"
    # which variables to use are decided at runtime
    print("{num} {noun} and a {noun2} ain't one".format(**locals()))
    # outputs: 999 troubles and a hound ain't one
However, relying on dynamic name resolution in code is discouraged by the Python community.[1][2] The feature also may be removed in a later version of Python.[3]
Examples of languages that use static name resolution include C, C++, E, Erlang, Haskell, Java, Pascal, Scheme, and Smalltalk. Examples of languages that use dynamic name resolution include some Lisp dialects, Perl, PHP, Python, REBOL, and Tcl.
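The safety trade-off of dynamic name resolution can be seen from the error-detection side as well. In the sketch below (the function name and the misspelled identifier are invented for illustration), Python accepts the function definition without complaint, and the unbound name is reported only when the line that uses it actually runs.

```python
def describe(count, verbose):
    if verbose:
        # Typo: `cuont` is not bound anywhere. Python looks the name up only
        # when this line executes, so defining the function raises no error.
        return "count is " + str(cuont)
    return str(count)


print(describe(3, False))  # 3  (the faulty branch is never reached)

try:
    describe(3, True)
except NameError as exc:
    print("caught at runtime:", exc)
```

A language with static name resolution would typically reject the equivalent program before it runs, regardless of which branch is taken.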
Name masking
Masking occurs when the same identifier is used for different entities in overlapping lexical scopes. At the level of variables (rather than names), this is known as variable shadowing. An identifier I' (for variable X') masks an identifier I (for variable X) when two conditions are met:
- I' has the same name as I
- I' is defined in a scope which is a subset of the scope of I
The outer variable X is said to be shadowed by the inner variable X'.
For example, the parameter foo masks the field foo in this common pattern:
    private int foo;  // A declaration with name "foo" in an outer scope

    public void setFoo(int foo) {  // A declaration with the same name in the inner scope
        // "foo" is resolved by looking in the innermost scope first,
        // so the author uses a different syntax, this.foo, to refer to the name "foo"
        // in the outer scope.
        this.foo = foo;
    }

    // "foo" here means the same as this.foo above,
    // since setFoo's parameter is no longer in scope.
    public int getFoo() {
        return foo;
    }
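The two masking conditions can also be observed directly in Python. In this small sketch (all names are illustrative), the assignment inside show creates a new local binding whose scope lies inside the module scope, so it masks the module-level name without changing it.

```python
label = "outer"  # binding in the enclosing (module) scope


def show():
    label = "inner"  # same name, narrower scope: masks the outer binding
    return label


def show_outer():
    return label  # no local binding here, so the outer one is found


print(show())        # inner
print(show_outer())  # outer
print(label)         # outer (the shadowed variable was never modified)
```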
Alpha renaming to make name resolution trivial
In programming languages with lexical scoping that do not reflect over variable names, α-conversion (or α-renaming) can be used to make name resolution easy by finding a substitution that makes sure that no variable name masks another name in a containing scope. Alpha-renaming can make static code analysis easier since only the alpha renamer needs to understand the language's scoping rules.
For example, in this code:
    class Point {
    private:
      double x, y;

    public:
      Point(double x, double y) {  // x and y declared here mask the privates
        setX(x);
        setY(y);
      }

      void setX(double newx) { x = newx; }
      void setY(double newy) { y = newy; }
    };
within the Point constructor, the member variables x and y are shadowed by parameters of the same name. This might be alpha-renamed to:
    class Point {
    private:
      double x, y;

    public:
      Point(double a, double b) {
        setX(a);
        setY(b);
      }

      void setX(double newx) { x = newx; }
      void setY(double newy) { y = newy; }
    };
In the new version, there is no masking, so it is immediately obvious which uses correspond to which declarations.
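Alpha-renaming itself is a short recursive pass. The following is a minimal sketch over a toy lambda-calculus representation rather than C++: terms are tuples of the form ("var", name), ("lam", param, body), or ("app", function, argument). The representation, the alpha_rename helper, and the name_N naming scheme are all assumptions made for illustration, and the sketch further assumes that no free variable already uses a name of the form name_N.

```python
import itertools


def alpha_rename(term, env=None, counter=None):
    """Return a copy of `term` in which every lambda binder has a unique name."""
    if env is None:
        env = {}
    if counter is None:
        counter = itertools.count()
    kind = term[0]
    if kind == "var":
        # Bound variables take their binder's new name; free ones are kept.
        return ("var", env.get(term[1], term[1]))
    if kind == "lam":
        _, param, body = term
        fresh = f"{param}_{next(counter)}"
        # The inner environment overrides any outer binding of `param`,
        # which is exactly the masking rule the renamer has to understand.
        return ("lam", fresh, alpha_rename(body, {**env, param: fresh}, counter))
    if kind == "app":
        _, fn, arg = term
        return ("app", alpha_rename(fn, env, counter), alpha_rename(arg, env, counter))
    raise ValueError(f"unknown term kind: {kind!r}")


# \x. x (\x. x): the inner x masks the outer x.
term = ("lam", "x", ("app", ("var", "x"), ("lam", "x", ("var", "x"))))
print(alpha_rename(term))
# ('lam', 'x_0', ('app', ('var', 'x_0'), ('lam', 'x_1', ('var', 'x_1'))))
```

After the pass every name has exactly one binder, so later analyses can match uses to declarations by simple name equality.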
In semantics and text extraction
In this context (also referred to as entity resolution), name resolution refers to the ability of text mining software to determine which actual person, actor, or object a particular use of a name refers to.
Name resolution in simple text
For example, in the text mining field, software frequently needs to interpret the following text:
John gave Edward the book. He then stood up and called to John to come back into the room.
In these sentences, the software must determine whether the pronoun "he" refers to "John" or to "Edward" from the first sentence. The software must also determine whether the "John" referred to in the second sentence is the same as the "John" in the first sentence, or a third person whose name also happens to be "John". Such examples arise in almost all languages, not only in English.
Name resolution across documents
Frequently, this type of name resolution is also used across documents, for example to determine whether the "George Bush" referenced in an old newspaper article as President of the United States (George H. W. Bush) is the same person as the "George Bush" mentioned in a separate news article years later about a man who is running for President (George W. Bush). Because many people may have the same name, analysts and software must take into account substantially more information than only a name to determine whether two identical references ("George Bush") actually refer to the same specific entity or person.
Name/entity resolution in text extraction and semantics is a notoriously difficult problem, in part because in many cases there is not sufficient information to make an accurate determination. Numerous partial solutions exist that rely on specific contextual clues found in the data, but there is no currently known general solution.
The problem is sometimes referred to as name disambiguation and, for digital libraries, author disambiguation.
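As a rough illustration of a partial solution based on contextual clues, the toy sketch below treats two mentions as the same entity only if the names match and the words found near them overlap sufficiently. The Mention class, the Jaccard-overlap heuristic, the 0.5 threshold, and the sample data are all invented for illustration; real systems use far richer evidence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    name: str
    context: frozenset  # words observed near the mention (hypothetical data)


def jaccard(a, b):
    """Overlap of two sets as |intersection| / |union| (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def same_entity(m1, m2, threshold=0.5):
    """Heuristic: identical name plus sufficiently similar context."""
    return m1.name == m2.name and jaccard(m1.context, m2.context) >= threshold


a = Mention("J. Smith", frozenset({"chemistry", "oxford", "polymer"}))
b = Mention("J. Smith", frozenset({"chemistry", "oxford", "catalysis"}))
c = Mention("J. Smith", frozenset({"jazz", "chicago", "trumpet"}))

print(same_entity(a, b))  # True  (2 shared words out of 4 distinct)
print(same_entity(a, c))  # False (no shared context)
```

Such a heuristic fails whenever the available context is sparse or misleading, which reflects the lack of a general solution noted above.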
For examples of software that might provide name resolution benefits, see also:
- AeroText
- AlchemyAPI
- Attensity
- Autonomy
- DBpedia Spotlight, providing a simple approach for name resolution using DBpedia and Wikipedia
- Nerso, another approach for name resolution using DBpedia.
- NetOwl
See also
- Name server
- Multicast DNS
- Name Service Switch
- Identity resolution
- Namespace (programming)
- Scope (programming)
- Named entity recognition
- Naming collision
- Anaphor resolution
References
- [1] "[Python-Ideas] str.format utility function". 9 May 2009. Retrieved 2011-01-23.
- [2] "8.6. Dictionary-based string formatting". diveintopython.org. Mark Pilgrim. Retrieved 2011-01-23.
- [3] "9. Classes - Python v2.7.1 documentation". Retrieved 2011-01-23. "search for names is done dynamically, at run time; however, the language definition is evolving towards static name resolution"