The History of the URL

To understand the history of the URL, we must go back to January 11, 1982, when twenty-two computer scientists met to discuss a problem with "computerized mail" (what we know today as email). Among those present were the future founder of Sun Microsystems, the creator of Zork, the person responsible for NTP, and the person who convinced the government to fund Unix. The problem was simple: there were 455 hosts on ARPANET, and the situation was spiraling out of control.

The transition from ARPANET and the birth of the domain

This problem arose as ARPANET was about to switch from its original protocol, NCP, to TCP/IP, the foundation of what we now call the Internet. The change would create a multitude of interconnected networks, which called for a hierarchical naming system. In that system, ARPANET could resolve its own domains while other networks would resolve theirs.

At that time, the networks had creative names like "COMSAT," "CHAOSNET," "UCLNET," and "INTELPOSTNET," and were maintained by groups of universities and companies across the United States. These groups wanted to communicate with each other and could afford to lease 56k lines from telephone companies and purchase the equipment needed to handle routing.

Originally, the ARPANET design relied on a Network Information Center (NIC) to maintain a file listing all hosts on the network, called HOSTS.TXT, similar to the /etc/hosts file on Linux or OS X systems. However, this system could not scale indefinitely.
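To make the idea of a single shared host table concrete, here is a minimal sketch of a lookup against a file in the /etc/hosts style. The sample entries and the parse_hosts helper are invented for illustration; the real HOSTS.TXT used its own record layout, so treat this as an analogy rather than a reconstruction.

```python
# Hypothetical sample in /etc/hosts style; the addresses and names are made up.
SAMPLE_HOSTS = """\
# address      name        aliases
10.0.0.1       host-a      a
10.0.0.2       host-b
"""


def parse_hosts(text):
    """Return a dict mapping every name and alias to its address."""
    table = {}
    for line in text.splitlines():
        # Drop comments, then skip lines that are empty afterwards.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table[name.lower()] = address
    return table


hosts = parse_hosts(SAMPLE_HOSTS)
print(hosts["host-a"])
```

Every machine needs a complete, current copy of the table for this to work, which is the scaling problem the text describes.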

The birth of the domain and the evolution of email

The priority at that time was email, which presented a challenge in terms of addressing. The solution was a hierarchical system in which an external service could be queried to find the right domain. Thus the concept of a domain was born, and addresses went from "user@host" to "user@host.domain", a format we still use today.

The format was not chosen with a clear vision of the future; it was chosen because it caused the fewest problems for existing systems.

UUCP and bang paths

The UUCP system, created in 1976, allowed computers to exchange files and email along routes called "bang paths," in which the sender listed each machine the message had to pass through, separated by exclamation marks ("bangs"). This system was a predecessor to the public internet we know today.
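As a rough illustration of a bang path, the sketch below splits one into its hops and final recipient. The host names, the user name, and the parse_bang_path helper are all hypothetical; the only assumption about the format is that hops are separated by "!" with the recipient last.

```python
def parse_bang_path(path):
    """Split a bang path into (list of relay hosts, recipient)."""
    *hops, user = path.split("!")
    return hops, user


hops, user = parse_bang_path("hosta!hostb!hostc!alice")
print("relay through:", " -> ".join(hops))
print("deliver to:", user)
```

The sender has to know the whole route in advance, which is exactly the burden a domain system removes.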

DNS and the first TLDs

DNS, which we still use today, was proposed in 1983. It resolves domain names to IP addresses so that users can reach a host by name rather than by number. The first TLDs (top-level domains) included .com, .gov, .org, .edu, and .mil, which are still widely used today.

DNS was designed to be hierarchical, with a set of root servers at the top that direct queries to the servers for each top-level domain. Today, the root DNS system consists of thirteen server clusters. Historically, DNS responses were sent in UDP packets, which limited a response to 512 bytes.
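The hierarchy can be pictured as a walk from the root down to the full name. The sketch below only lists the names along that walk; delegation_chain is a hypothetical helper, not a resolver, and it assumes a delegation at every dot, which real zones do not guarantee.

```python
def delegation_chain(name):
    """List the names from the root down to the full domain name."""
    labels = name.rstrip(".").split(".")
    chain = ["."]  # the root comes first
    # Walk the labels right to left, growing the name one label at a time.
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]) + ".")
    return chain


for zone in delegation_chain("www.example.com"):
    print(zone)
```

In practice resolvers cache answers at every level, so most lookups never reach the root.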

Punycode and the internationalization of domains

As the internet grew and expanded globally, the need arose to support non-Latin characters in domain names. This led to the creation of Punycode, a system that converts Unicode characters (such as Chinese or Arabic letters) into ASCII, making them compatible with existing infrastructure.
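Python's standard library happens to ship codecs for this, which makes a quick demonstration possible. The domain below is only an example; note that the built-in "idna" codec follows the older IDNA 2003 rules, so it is a sketch of the idea rather than a full modern implementation.

```python
label = "m\u00fcnchen"      # "münchen", written with an escape for portability
domain = label + ".de"

# Raw Punycode of a single label: the ASCII letters, then the encoded rest.
raw = label.encode("punycode")

# The form used in DNS adds the "xn--" prefix to each non-ASCII label.
ascii_domain = domain.encode("idna")

print(raw)                          # b'mnchen-3ya'
print(ascii_domain)                 # b'xn--mnchen-3ya.de'
print(ascii_domain.decode("idna"))  # back to the Unicode form
```

The result contains only letters, digits, and hyphens, so existing DNS software can carry it unchanged.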

Punycode was not the first proposal to solve this problem, but it was adopted because it is efficient in character encoding and ensures that domain names are understandable to both machines and users.

Protocols, ports and the "extra" component

The most common protocol we use to access websites is HTTP, invented by Tim Berners-Lee specifically for the web. However, other protocols such as FTP and Gopher were also popular in the early days of the internet.

The default port for HTTP is port 80, assigned in the early days of the web. This allowed browsers to access web pages via HTTP without needing to specify the port in the URL.
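A small sketch of how a client might fill in the missing port. The DEFAULT_PORTS table and the effective_port helper are illustrative assumptions (only port 80 for HTTP comes from the text above); urlsplit is the standard-library URL parser.

```python
from urllib.parse import urlsplit

# Well-known default ports; an illustrative table, not an exhaustive one.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21, "gopher": 70}


def effective_port(url):
    """Return the explicit port if the URL has one, else the scheme default."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)


print(effective_port("http://example.com/"))       # 80
print(effective_port("http://example.com:8080/"))  # 8080
```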

The double slash "//" that separates the protocol from the rest of the URL was borrowed from an earlier computer system called Apollo. Although Tim Berners-Lee has expressed regret for this decision, it remains part of the URL standard today.

The road to the future: Fragments, URNs, and authentication

URLs also include fragments, introduced by the "#" symbol, which allow linking to a specific part of a page. Query parameters and authentication credentials can be added to a URL as well; query parameters remain in everyday use, while embedding credentials in the URL has largely been replaced by more secure methods.
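To see how these pieces fit together, the sketch below pulls a URL apart with Python's standard urlsplit. The URL itself, including the user name and password, is made up for the example.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical URL that uses every component discussed here.
url = "https://user:secret@example.com:8080/docs/page?lang=en#history"
parts = urlsplit(url)

print(parts.scheme)           # https
print(parts.username)         # user
print(parts.password)         # secret
print(parts.hostname)         # example.com
print(parts.port)             # 8080
print(parts.path)             # /docs/page
print(parse_qs(parts.query))  # {'lang': ['en']}
print(parts.fragment)         # history
```

The fragment is handled by the browser and is not sent to the server, whereas credentials embedded this way are visible to anyone who sees the URL, one reason the practice has fallen out of favor.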

Although URLs have evolved since their inception, they remain a crucial part of how the web works. Alternatives such as URNs (Uniform Resource Names) have been proposed, but URLs continue to be the standard way to identify and access online resources.

This article explores how the URL has evolved from its humble beginnings on ARPANET to become one of the most essential pieces of the internet infrastructure. Over the years, there have been numerous proposals, errors, and advancements that have led us to the web addressing system we use today.

Original article by Zack Bloom | May 05