Domain Name System
The Domain Name System (DNS) associates various information with domain names; most importantly, it serves as the "phone book" for the Internet by translating human-readable computer hostnames, e.g. www.example.com, into IP addresses, e.g. 208.77.188.166, which networking equipment needs to deliver information. The DNS also stores other information, such as the list of mail servers that accept email for a given domain. By providing a worldwide keyword-based redirection service, the Domain Name System is an essential component of contemporary Internet use.
Above all, the DNS makes it possible to assign domain names to organizations independent of the physical routing hierarchy represented by the numerical IP address. Because of this, hyperlinks and Internet contact information can remain the same, whatever the current IP routing arrangements may be, and can take a human-readable form (such as "example.com"). These Internet names are easier to remember than the IP address 208.77.188.166. People take advantage of this when they recite meaningful URLs and e-mail addresses without caring how the machine will actually locate them.
The Domain Name System distributes the responsibility for assigning domain names and mapping them to IP networks by allowing an authoritative name server for each domain to keep track of its own changes, avoiding the need for a central register to be continually consulted and updated.
Additionally, other identifiers such as RFID tags, UPC codes, and international characters in email addresses and host names could all potentially utilize DNS [1].
The practice of using a name as a more human-legible abstraction of a machine's numerical address on the network predates even TCP/IP and dates back to the ARPAnet era. Before the DNS was invented in 1983, shortly after TCP/IP was deployed, each computer on the network retrieved a file called HOSTS.TXT from a computer at SRI (now SRI International)[2][3]. The HOSTS.TXT file mapped numerical addresses to names. A hosts file still exists on most modern operating systems, either by default or through configuration, and allows users to specify an IP address (e.g. 208.77.188.166) to use for a hostname (e.g. www.example.net) without checking DNS. Systems based on a hosts file have an inherent limitation: every time a given computer's address changes, every computer that seeks to communicate with it needs an update to its hosts file.
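The lookup that a hosts file provides can be illustrated with a minimal sketch. It assumes the layout commonly used by modern hosts files (one address per line followed by one or more names, with `#` starting a comment), not the original HOSTS.TXT format; the function name `parse_hosts` and the sample entries are purely illustrative.

```python
def parse_hosts(text):
    """Map each hostname (lowercased) to the address given on its line."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue  # skip blank or comment-only lines
        address, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), address)  # first entry wins
    return table


# Illustrative sample data, not a real system file.
SAMPLE = """
# address        hostname(s)
208.77.188.166   www.example.net example-alias
127.0.0.1        localhost
"""

hosts = parse_hosts(SAMPLE)
print(hosts["www.example.net"])
```

Every machine holding its own copy of such a table is exactly what makes the approach hard to scale: a change to one line has to reach every copy.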
The growth of networking required a more scalable system that recorded a change in a host's address in one place only. Other hosts would learn about the change dynamically through a notification system, thus completing a globally accessible network of all hosts' names and their associated IP addresses.
At the request of Jon Postel, Paul Mockapetris invented the Domain Name System in 1983 and wrote the first implementation. The original specifications appear in RFC 882 and RFC 883. In November 1987, the publication of RFC 1034 and RFC 1035 updated the DNS specification and made RFC 882 and RFC 883 obsolete. Several more recent RFCs have proposed various extensions to the core DNS protocols.
In 1984, four Berkeley students (Douglas Terry, Mark Painter, David Riggle and Songnian Zhou) wrote the first UNIX implementation, which was maintained by Ralph Campbell thereafter. In 1985, Kevin Dunlap of DEC significantly re-wrote the DNS implementation and renamed it BIND (Berkeley Internet Name Domain, previously: Berkeley Internet Name Daemon). Mike Karels, Phil Almquist and Paul Vixie have maintained BIND since then. BIND was ported to the Windows NT platform in the early 1990s.
Due to BIND's long history of security issues, several alternative nameserver and resolver programs have been written and distributed in recent years.
The domain name space consists of a tree of domain names. Each node or leaf in the tree has zero or more resource records, which hold information associated with the domain name. The tree sub-divides into zones beginning at the root zone. A DNS zone consists of a collection of connected nodes authoritatively served by an authoritative DNS nameserver. (Note that a single nameserver can host several zones.)
When a system administrator wants to let another administrator control a part of the domain name space within the first administrator’s zone of authority, control can be delegated to the second administrator. This splits off a part of the old zone into a new zone, which comes under the authority of the second administrator's nameservers. The old zone ceases to be authoritative for the new zone.
A domain name usually consists of two or more parts (technically labels), which are conventionally written separated by dots, such as example.com.
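The relationship between labels and the tree can be made concrete with a short sketch. The helper names `split_labels` and `ancestors` are hypothetical; the code simply splits a name on dots and lists the chain of domains from the top-level domain down to the full name.

```python
def split_labels(name):
    """Split a domain name into its labels, ignoring one trailing root dot."""
    if name.endswith("."):
        name = name[:-1]
    return name.split(".")


def ancestors(name):
    """List the domains on the path from the top-level domain to the name."""
    parts = split_labels(name)
    return [".".join(parts[i:]) for i in range(len(parts) - 1, -1, -1)]


print(split_labels("www.example.com"))
print(ancestors("www.example.com"))
```

Each entry in the second list is a node in the tree, and any of them may be the point at which a new zone is delegated.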
The Domain Name System is maintained by a distributed database system, which uses the client-server model. The nodes of this database are the name servers. Each domain or subdomain has one or more authoritative DNS servers that publish information about that domain and the name servers of any domains subordinate to it. The top of the hierarchy is served by the root nameservers: the servers to query when looking up (resolving) a top-level domain name (TLD).
The client-side of the DNS is called a DNS resolver. It is responsible for initiating and sequencing the queries that ultimately lead to a full resolution (translation) of the resource sought, e.g., translation of a domain name into an IP address.
A DNS query may be either a recursive query or a non-recursive query. The resolver (or another DNS server acting recursively on behalf of the resolver) negotiates use of recursive service using bits in the query headers.
Resolving usually entails iterating through several name servers to find the needed information. However, some resolvers function simplistically and can communicate only with a single name server. These simple resolvers rely on a recursive query to a recursive name server to perform the work of finding information for them.
In theory a full host name may have several name segments (e.g. ahost.ofasubnet.ofabiggernet.inadomain.example). In practice, full host names will frequently consist of just three segments (ahost.inadomain.example, and most often www.inadomain.example). For querying purposes, software interprets the name segment by segment, from right to left, using an iterative search procedure. At each step along the way, the program queries a corresponding DNS server to provide a pointer to the next server which it should consult.
As originally envisaged, the process was as simple as:
The diagram illustrates this process for the real host www.wikipedia.org.
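The right-to-left, referral-following procedure described above can be sketched with a small simulation. Nothing here touches the network: the `SERVERS` table, the server names and the address 192.0.2.10 (taken from a range reserved for documentation) are invented placeholders, and real servers return far richer responses than the two-element tuples used here.

```python
# Hypothetical delegation data: each simulated server maps a name suffix to
# either a referral (the next server to ask) or a final answer.
SERVERS = {
    "root": {"org": ("referral", "org-server")},
    "org-server": {"wikipedia.org": ("referral", "wikipedia-server")},
    "wikipedia-server": {"www.wikipedia.org": ("answer", "192.0.2.10")},
}


def resolve_iteratively(name, servers=SERVERS):
    """Follow referrals from the root, one label at a time, right to left."""
    labels = name.rstrip(".").split(".")
    server = "root"
    path = [server]  # servers consulted, in order
    for i in range(len(labels) - 1, -1, -1):
        suffix = ".".join(labels[i:])
        entry = servers[server].get(suffix)
        if entry is None:
            raise LookupError(f"{server} knows nothing about {suffix}")
        kind, value = entry
        if kind == "answer":
            return value, path
        server = value  # referral: ask the next server about a longer suffix
        path.append(server)
    raise LookupError(f"no answer found for {name}")


address, consulted = resolve_iteratively("www.wikipedia.org")
print(address, consulted)
```

Note that every lookup in this simplified form begins at the simulated root server.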
The mechanism in this simple form has a difficulty: it places a huge operating burden on the root servers, with every search for an address starting by querying one of them. Because the root servers are so critical to the overall function of the system, such heavy use would create an insurmountable bottleneck for the trillions of queries placed every day. The section DNS in practice describes how this is addressed.
Name servers in delegations appear listed by name, rather than by IP address. This means that a resolving name server must issue another DNS request to find out the IP address of the server to which it has been referred. Since this can introduce a circular dependency if the nameserver referred to is under the domain for which it is authoritative, it is occasionally necessary for the nameserver providing the delegation to also provide the IP address of the next nameserver. This record is called a glue record.
For example, assume that the sub-domain en.wikipedia.org contains further sub-domains (such as something.en.wikipedia.org) and that the authoritative name server for these lives at ns1.something.en.wikipedia.org. A computer trying to resolve something.en.wikipedia.org will thus first have to resolve ns1.something.en.wikipedia.org. Since ns1 is also under the something.en.wikipedia.org subdomain, resolving ns1.something.en.wikipedia.org requires resolving something.en.wikipedia.org which is exactly the circular dependency mentioned above. The dependency is broken by the glue record in the nameserver of en.wikipedia.org that provides the IP address of ns1.something.en.wikipedia.org directly to the requestor, enabling it to bootstrap the process by figuring out where ns1.something.en.wikipedia.org is located.
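The condition that makes glue necessary, namely a nameserver whose own name lies inside the zone it serves, can be checked with a few lines of code. This is only a sketch of the naming test; the function names are hypothetical and `ns1.example.net` is an invented out-of-zone nameserver used for contrast.

```python
def is_within(name, zone):
    """Return True if name equals zone or is a subdomain of it."""
    name = name.rstrip(".").lower()
    zone = zone.rstrip(".").lower()
    return name == zone or name.endswith("." + zone)


def needs_glue(delegated_zone, nameserver):
    """A delegation needs glue when its nameserver sits inside the zone."""
    return is_within(nameserver, delegated_zone)


print(needs_glue("something.en.wikipedia.org", "ns1.something.en.wikipedia.org"))
print(needs_glue("something.en.wikipedia.org", "ns1.example.net"))
```

In the first case the parent zone must hand out the address of the nameserver along with the referral; in the second, the nameserver's name can be resolved independently.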
When an application (such as a web browser) tries to find the IP address of a domain name, it doesn't necessarily follow all of the steps outlined in the Theory section above. We will first look at the concept of caching, and then outline the operation of DNS in "the real world."
Because of the huge volume of requests generated by a system like DNS, the designers wished to provide a mechanism to reduce the load on individual DNS servers. To this end, the DNS resolution process allows for caching (i.e. the local recording and subsequent consultation of the results of a DNS query) for a given period of time after a successful answer. How long a resolver caches a DNS response (i.e. how long a DNS response remains valid) is determined by a value called the time to live (TTL). The TTL is set by the administrator of the DNS server handing out the response. The period of validity may vary from just seconds to days or even weeks.
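A TTL-driven cache of this kind can be sketched in a few lines. The class name `TTLCache` and its methods are illustrative rather than taken from any particular resolver, and the clock is injectable so that expiry can be demonstrated without waiting.

```python
import time


class TTLCache:
    """Remember answers until the TTL supplied with them has elapsed."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (value, expiry time)

    def put(self, name, value, ttl):
        self._entries[name] = (value, self._clock() + ttl)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None  # never cached
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # expired: a fresh query is needed
            return None
        return value


# Demonstration with a fake clock measured in seconds.
now = [0]
cache = TTLCache(clock=lambda: now[0])
cache.put("www.example.com", "208.77.188.166", ttl=300)
fresh = cache.get("www.example.com")
now[0] = 300
expired = cache.get("www.example.com")
print(fresh, expired)
```

Once an entry has expired, the cache returns nothing and the resolver has to ask a name server again, which is when it would notice any change.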
As a noteworthy consequence of this distributed and caching architecture, changes to DNS do not always take effect immediately and globally. This is best explained with an example: if an administrator has set a TTL of 6 hours for the host www.wikipedia.org, and then changes the IP address to which www.wikipedia.org resolves at 12:01 pm, the administrator must consider that a person who cached a response with the old IP address at 12:00 noon will not consult the DNS server again until 6:00 pm. The period between 12:01 pm and 6:00 pm in this example is called caching time, which is best defined as a period of time that begins when a change is made to a DNS record and ends when the maximum amount of time specified by the TTL has expired. This leads to an important logistical consideration when making changes to DNS: not everyone necessarily sees the same data as the administrator who made the change. RFC 1537 helps to convey basic rules for how to set the TTL.
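The arithmetic of this example can be worked through directly. The calendar date used below is arbitrary and only needed to construct the timestamps.

```python
from datetime import datetime, timedelta

ttl = timedelta(hours=6)
cached_at = datetime(2008, 1, 1, 12, 0)   # old answer cached at 12:00 noon
changed_at = datetime(2008, 1, 1, 12, 1)  # record changed at 12:01 pm

expires_at = cached_at + ttl              # when the stale answer is dropped
stale_window = expires_at - changed_at    # how long the old address lingers

print(expires_at.strftime("%H:%M"), stale_window)
```

The cached answer survives until 18:00, so this particular client keeps using the old address for 5 hours and 59 minutes after the change.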
Note that the term "propagation", although very widely used in this context, does not describe the effects of caching well. Specifically, it implies (1) that a DNS change somehow spreads to all other DNS servers (instead, other DNS servers check in with the authoritative server as needed), and (2) that the administrator has no control over the amount of time the record is cached (in fact, the administrator controls the TTL values for all DNS records in the domain, except the NS records and any authoritative DNS servers that use the domain name).
Some resolvers may override TTL values, as the protocol supports caching for up to 68 years or no caching at all. Negative caching (caching the non-existence of records) is determined by the name servers authoritative for a zone, which MUST include the Start of Authority (SOA) record when reporting that no data of the requested type exists. The smaller of the MINIMUM field of the SOA record and the TTL of the SOA record itself is used as the TTL for the negative answer (RFC 2308).
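Both figures in this paragraph follow from simple arithmetic. The sketch below assumes the TTL ceiling of 2^31 - 1 seconds (the limit given in RFC 2181) and the RFC 2308 rule that a negative answer is cached for the smaller of the SOA MINIMUM field and the SOA record's own TTL; the function name and the sample values are illustrative.

```python
MAX_TTL_SECONDS = 2**31 - 1            # largest TTL value, per RFC 2181
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # average year length

max_ttl_years = MAX_TTL_SECONDS / SECONDS_PER_YEAR


def negative_ttl(soa_ttl, soa_minimum):
    """TTL for caching a 'no such data' answer, following RFC 2308."""
    return min(soa_ttl, soa_minimum)


print(round(max_ttl_years, 2))
print(negative_ttl(soa_ttl=86400, soa_minimum=3600))
```

With these sample values the negative answer would be cached for one hour, even though the SOA record itself could be cached for a day.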
Many people incorrectly refer to a mysterious 48-hour or 72-hour propagation time that follows a DNS change. When one changes the NS records for one's domain or the IP addresses for hostnames of authoritative DNS servers using one's domain (if any), there can be a lengthy period of time before all DNS servers use the new information. This is because those records are handled by the zone parent DNS servers (for example, the .com DNS servers if the domain is example.com), which typically cache those records for 48 hours. However, those DNS changes will be immediately available to any DNS servers that do not have them cached. Any DNS changes on a domain other than the NS records and authoritative DNS server names can be nearly instantaneous, if the administrator chooses (by lowering the TTL once or twice ahead of time, and waiting until the old TTL expires before making the change).
Users generally do not communicate directly with a DNS resolver. Instead DNS-resolution takes place transparently in client-applications such as web-browsers, mail-clients, and other Internet applications. When an application makes a request which requires a DNS lookup, such programs send a resolution request to the local DNS resolver in the local operating system, which in turn handles the communications required.
The DNS resolver will almost invariably have a cache (see above) containing recent lookups. If the cache can provide the answer to the request, the resolver will return the value in the cache to the program that made the request. If the cache does not contain the answer, the resolver will send the request to one or more designated DNS servers. In the case of most home users, the Internet service provider to which the machine connects will usually supply this DNS server: such a user will either have configured that server's address manually or allowed DHCP to set it; however, where systems administrators have configured systems to use their own DNS servers, their DNS resolvers point to separately maintained nameservers of the organization. In any event, the name server thus queried will follow the process outlined above, until it either successfully finds a result or does not. It then returns its results to the DNS resolver; assuming it has found a result, the resolver duly caches that result for future use, and hands the result back to the software which initiated the request.
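From an application's point of view, all of this machinery sits behind a single call to the operating system's resolver. The sketch below uses Python's standard `socket.getaddrinfo`; the wrapper name `lookup` is illustrative, and the addresses returned depend entirely on the local configuration (hosts file, caches and configured DNS servers), so no particular output is implied.

```python
import socket


def lookup(hostname):
    """Ask the local resolver for the addresses of a host name."""
    infos = socket.getaddrinfo(hostname, None)
    # Each entry is (family, type, proto, canonname, sockaddr); the address
    # is the first element of sockaddr for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})


print(lookup("localhost"))
```

The application never learns whether the answer came from a cache, a hosts file or a fresh query to a name server.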
An additional level of complexity emerges when resolvers violate the rules of the DNS protocol. A number of large ISPs have configured their DNS servers to violate rules (presumably to allow them to run on less-expensive hardware than a fully-compliant resolver), such as by disobeying TTLs, or by indicating that a domain name does not exist just because one of its name servers does not respond.[citation needed]
As a final level of complexity, some applications (such as web browsers) also have their own DNS cache, in order to reduce the use of the DNS resolver library itself. This practice can add extra difficulty when debugging DNS issues, as it obscures the freshness of data and which data comes from which cache. These caches typically use very short caching times, on the order of one minute. Internet Explorer offers a notable exception: recent versions cache DNS records for half an hour.[5]
The system outlined above provides a somewhat simplified scenario. The Domain Name System includes several other functions:
DNS primarily uses UDP on port 53 [6] to serve requests. Almost all DNS queries consist of a single UDP request from the client followed by a single UDP reply from the server. TCP comes into play only when the response data size exceeds 512 bytes, or for such tasks as zone transfer. Some operating systems such as HP-UX are known to have resolver implementations that use TCP for all queries, even when UDP would suffice.
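The size of a typical query shows why a single UDP datagram normally suffices. The sketch below builds a question for an A record in the RFC 1035 wire format without sending it; the function names, the default query ID and the choice to set the recursion-desired bit are illustrative assumptions.

```python
import struct


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name, qtype=1, qclass=1, query_id=0x1234, recursion_desired=True):
    """Build a DNS query message: 12-byte header plus one question."""
    flags = 0x0100 if recursion_desired else 0  # RD bit in the flags field
    # Header fields: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", qtype, qclass)
    return header + question


packet = build_query("www.example.com")
print(len(packet), packet.hex())
```

A question for www.example.com occupies only a few dozen bytes, far below the 512-byte limit, so it is the size of the response rather than the query that may force a switch to TCP.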
EDNS is an extension of the DNS protocol which allows the transport over UDP of DNS replies exceeding 512 bytes, and adds support for expanding the space of request and response codes. It is described in RFC 2671.
When sent over the Internet, all records use the common format specified in RFC 1035, shown below.
| Field | Description | Length (octets) |
|---|---|---|
| NAME | Name of the node to which this record pertains. | (variable) |
| TYPE | Type of RR. For example, MX is type 15. | 2 |
| CLASS | Class code. | 2 |
| TTL | Signed time in seconds that RR stays valid. | 4 |
| RDLENGTH | Length of RDATA field. | 2 |
| RDATA | Additional RR-specific data. | (variable) |
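
The field layout in the table maps directly onto a binary encoding. The following sketch packs and unpacks a single A record; it is an illustration only, with hypothetical function names, and it does not handle the name compression that real DNS messages may use.

```python
import ipaddress
import struct


def encode_name(name):
    """Encode NAME as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def pack_rr(name, rtype, rclass, ttl, rdata):
    """Serialize NAME, TYPE, CLASS, TTL, RDLENGTH and RDATA in order."""
    fixed = struct.pack("!HHiH", rtype, rclass, ttl, len(rdata))
    return encode_name(name) + fixed + rdata


def unpack_rr(data):
    """Parse one uncompressed resource record from the start of data."""
    labels, pos = [], 0
    while data[pos] != 0:
        length = data[pos]
        labels.append(data[pos + 1:pos + 1 + length].decode("ascii"))
        pos += 1 + length
    pos += 1  # skip the terminating zero byte of NAME
    rtype, rclass, ttl, rdlength = struct.unpack_from("!HHiH", data, pos)
    pos += 10  # TYPE (2) + CLASS (2) + TTL (4) + RDLENGTH (2)
    return ".".join(labels), rtype, rclass, ttl, data[pos:pos + rdlength]


# An A record (type 1) in the Internet class (class 1) with a one-hour TTL.
rdata = ipaddress.IPv4Address("208.77.188.166").packed
record = pack_rr("www.example.com", 1, 1, 3600, rdata)
print(unpack_rr(record))
```

The fixed-size fields always occupy ten octets; only NAME and RDATA vary in length from record to record.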
The type of the record indicates what the format of the data is, and gives a hint of its intended use; for instance, the A record is used to translate from a domain name to an IPv4 address, the NS record lists which name servers can answer lookups on a DNS zone, and the MX record is used to translate from a name in the right-hand side of an e-mail address to the name of a machine able to handle mail for that address.
Many more record types exist and can be found in the complete List of DNS record types.
While domain names technically have no restrictions on the characters they use and can include non-ASCII characters, the same is not true for host names.[7] Host names are the names most people see and use for things like e-mail and web browsing. Host names are restricted to a small subset of the ASCII character set known as LDH, the Letters A–Z in upper and lower case, Digits 0–9, Hyphen, and the dot to separate LDH-labels; see RFC 3696 section 2 for details. This prevented the representation of names and words of many languages natively. ICANN has approved the Punycode-based IDNA system, which maps Unicode strings into the valid DNS character set, as a workaround to this issue. Some registries have adopted IDNA.
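The LDH rule and the IDNA workaround can both be demonstrated briefly. The validator below is a sketch that also applies two label rules from RFC 3696 (at most 63 characters per label, and no leading or trailing hyphen); the function name is hypothetical, and the conversion uses the `idna` codec built into Python, which implements the original IDNA specification.

```python
import re

# One LDH label: letters, digits and hyphens, 1 to 63 characters long,
# neither starting nor ending with a hyphen.
LDH_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_valid_hostname(name):
    """Check that every dot-separated label obeys the LDH rule."""
    if name.endswith("."):
        name = name[:-1]  # allow one trailing root dot
    if not name:
        return False
    return all(LDH_LABEL.fullmatch(label) for label in name.split("."))


unicode_name = "bücher.example"
ascii_name = unicode_name.encode("idna").decode("ascii")

print(is_valid_hostname(unicode_name))
print(ascii_name, is_valid_hostname(ascii_name))
```

The Unicode form fails the LDH check, while its Punycode-based equivalent, which begins with the `xn--` prefix, passes and can therefore be stored in the DNS unchanged.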
DNS was not originally designed with security in mind, and thus has a number of security issues.
One class of vulnerabilities is DNS cache poisoning, which tricks a DNS server into believing it has received authentic information when, in reality, it has not.
DNS responses are traditionally not cryptographically signed, leading to many attack possibilities; DNSSEC modifies DNS to add support for cryptographically signed responses. There are various extensions to support securing zone transfer information as well.
Even with cryptographically signed responses, a DNS server could become compromised by a virus (or, for that matter, a disgruntled employee) and made to return a malicious IP address with a long TTL for the names it serves. This could have a far-reaching impact on potentially millions of Internet users if busy DNS servers cache the bad IP data. Recovery would require manually purging all affected DNS caches, since the long TTL (up to 68 years) would otherwise keep the bad data alive.
Some domain names can spoof other, similar-looking domain names. For example, "paypal.com" and "paypa1.com" are different names, yet users may be unable to tell the difference when the user's typeface (font) does not clearly differentiate the letter l and the number 1. This problem is much more serious in systems that support internationalized domain names, since many characters that are different, from the point of view of ISO 10646, appear identical on typical computer screens. This vulnerability is often exploited in phishing.
Techniques such as Forward Confirmed reverse DNS can also be used to help validate DNS results.
The right to use a domain name is delegated by domain name registrars which are accredited by the Internet Corporation for Assigned Names and Numbers (ICANN), the organization charged with overseeing the name and number systems of the Internet. In addition to ICANN, each top-level domain (TLD) is maintained and serviced technically by a sponsoring organization, the TLD Registry. The registry is responsible for maintaining the database of names registered within the TLD it administers. The registry receives registration information from each domain name registrar authorized to assign names in the corresponding TLD and publishes the information using a special service, the whois protocol.
Registrars usually charge an annual fee for the service of delegating a domain name to a user and providing a default set of name servers. Often this transaction is termed a sale or lease of the domain name, and the registrant is called an "owner", but no such legal relationship is actually associated with the transaction, only the exclusive right to use the domain name. More correctly, authorized users are known as "registrants" or as "domain holders".
ICANN publishes a complete list of TLD registries and domain name registrars in the world. One can obtain information about the registrant of a domain name by looking in the WHOIS database held by many domain registries.
For most of the more than 240 country code top-level domains (ccTLDs), the domain registries hold the authoritative WHOIS (Registrant, name servers, expiration dates, etc.). For instance, DENIC, Germany NIC, holds the authoritative WHOIS to a .DE domain name. Since about 2001, most gTLD registries (.ORG, .BIZ, .INFO) have adopted this so-called "thick" registry approach, i.e. keeping the authoritative WHOIS in the central registries instead of the registrars.
For .COM and .NET domain names, a "thin" registry is used: the domain registry (e.g. VeriSign) holds a basic WHOIS (registrar and name servers, etc.). One can find the detailed WHOIS (registrant, name servers, expiry dates, etc.) at the registrars.
Some domain name registries, also called Network Information Centres (NIC), also function as registrars, and deal directly with end users. But most of the main ones, such as for .COM, .NET, .ORG, .INFO, etc., use a registry-registrar model. There are hundreds of Domain Name Registrars that actually perform the domain name registration with the end user (see lists at ICANN or VeriSign). By using this method of distribution, the registry only has to manage the relationship with the registrar, and the registrar maintains the relationship with the end users, or 'registrants', in some cases through additional layers of resellers.
A registrant usually designates an administrative contact to manage the domain name. In practice, the administrative contact usually has the most immediate power over a domain. Management functions delegated to the administrative contacts may include (for example):
A technical contact manages the name servers of a domain name. The many functions of a technical contact include:
Billing contact: the party whom a domain name registrar invoices.
Name servers: the authoritative name servers that host the zone of a domain name.
Critics often claim abuse of administrative power over domain names. Particularly noteworthy was the VeriSign Site Finder system, which redirected all unregistered .com and .net domains to a VeriSign webpage. For example, at a public meeting with VeriSign to air technical concerns about Site Finder [8], numerous people active in the IETF and other technical bodies explained how they were surprised by VeriSign's changing the fundamental behavior of a major component of Internet infrastructure without having obtained the customary consensus. Site Finder at first assumed every Internet query was for a website, and it monetized queries for incorrect domain names, taking the user to VeriSign's search site. Unfortunately, other applications, such as many implementations of email, treat a lack of response to a domain name query as an indication that the domain does not exist, and that the message can be treated as undeliverable. The original VeriSign implementation broke this assumption for mail, because it would always resolve an erroneous domain name to that of Site Finder. While VeriSign later changed Site Finder's behaviour with regard to email, there was still widespread protest about VeriSign's action being more in its financial interest than in the interest of the Internet infrastructure component for which VeriSign was the steward.
Despite widespread criticism, VeriSign only reluctantly removed Site Finder after the Internet Corporation for Assigned Names and Numbers (ICANN) threatened to revoke its contract to administer the root name servers. ICANN published the extensive set of letters exchanged, committee reports, and ICANN decisions [9].
There is also significant disquiet regarding the United States' political influence over ICANN. This was a significant issue in the attempt to create a .xxx top-level domain and sparked greater interest in alternative DNS roots that would be beyond the control of any single country.[citation needed]
Additionally, there are numerous accusations of domain name "front running", whereby registrars, when given whois queries, automatically register the domain name for themselves. Recently, Network Solutions has been accused of this.[10]
In the United States, the "Truth in Domain Names Act" (actually the "Anticybersquatting Consumer Protection Act"), in combination with the PROTECT Act, forbids the use of a misleading domain name with the intention of attracting people into viewing a visual depiction of sexually explicit conduct on the Internet.
The Domain Name System is defined by Request for Comments (RFC) documents published by the Internet Engineering Task Force (Internet standards). The following is a list of some of the RFCs that pertain to DNS.