Enabling Global Communication through IP Addressing and Datagram Delivery
Welcome back to our exploration of computer networks. Having examined the foundational layers that govern physical connections and local data exchange, we now ascend to a pivotal abstraction: the Network Layer. This layer is fundamental to the very concept of the Internet, transforming disparate physical links into a unified, global communication system.
At the network layer, the Internet can be viewed as a collection of subnetworks or autonomous systems that are connected together. While the Internet architecture lacks a rigid, centralized structure, it prominently features several major backbones. These are constructed from high-bandwidth lines and high-speed routers, serving as primary conduits. Connected to these backbones are regional (mid-level) networks, which in turn link to the Local Area Networks (LANs) found within universities, companies, and Internet service providers. A conceptual illustration of this quasi-hierarchical organization is presented in Figure 1 below, demonstrating how diverse networks integrate to form the global Internet.
Figure 1: Conceptual illustration of the Internet's quasi-hierarchical organization
The core element that binds this distributed infrastructure is its network layer protocol: IP (Internet Protocol). Unlike many predecessor network layer protocols, IP was designed from its inception with internetworking as a primary goal. Fundamentally, the Network Layer's role is to provide a "best-effort" service for transporting data units, known as datagrams, from a source machine to a destination machine. This delivery occurs irrespective of whether the machines reside on the same local network or if multiple intermediate networks separate them. The "best-effort" philosophy implies that IP does not guarantee delivery, ensure order, or prevent duplication; it simply endeavors to forward datagrams towards their destination.
Communication at this layer typically proceeds as follows: The transport layer segments application data streams into individual datagrams. While theoretically, IP datagrams can be up to 64 Kbytes in size, practical considerations, such as the maximum transmission unit (MTU) of underlying network technologies, usually limit them to approximately 1500 bytes. Each datagram is then transmitted independently across the Internet.
If a datagram encounters a network segment with a smaller MTU, it may undergo fragmentation, that is, be broken down into smaller units. For instance, to traverse a network with an 800-byte MTU, a 1500-byte datagram with a 20-byte header might be split into two fragments of 796 and 724 bytes: each fragment carries its own copy of the header, and the payload of every fragment except the last must be a multiple of 8 bytes. Upon reaching the destination machine, the network layer is responsible for reassembling all fragmented pieces into the original datagram. This reassembled datagram is then handed over to the transport layer for delivery to the appropriate receiving process.
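The fragmentation arithmetic can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the header is taken to be 20 bytes with no options, and the function name `fragment` and its return format are hypothetical, not part of any standard API.

```python
def fragment(total_length, mtu, header_len=20):
    """Split a datagram of total_length bytes so each piece fits in mtu bytes.

    Returns a list of (fragment_length, offset_in_8_byte_units, more_fragments).
    Assumes a fixed header_len and ignores IP options.
    """
    payload = total_length - header_len
    # Largest payload a fragment can carry, rounded down to a multiple of 8.
    max_payload = (mtu - header_len) // 8 * 8
    if max_payload <= 0:
        raise ValueError("MTU too small to carry any payload")
    fragments = []
    offset = 0
    while offset < payload:
        size = min(max_payload, payload - offset)
        more = offset + size < payload  # True on every fragment but the last
        fragments.append((header_len + size, offset // 8, more))
        offset += size
    return fragments


print(fragment(1500, 800))  # [(796, 0, True), (724, 97, False)]
```

For a 1500-byte datagram and an 800-byte MTU, the first fragment carries 776 payload bytes (the largest multiple of 8 that fits beside a 20-byte header) and the second carries the remaining 704, starting at offset 97 × 8 = 776.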
Our detailed study of the Network Layer in the Internet appropriately begins with the structure of the IP datagram itself. An IP datagram comprises two main sections: a header, containing control information, and a data (text) part, carrying the actual payload. The header includes a 20-byte fixed portion and an optional, variable-length section.
The precise format of the IP header is depicted in Figure 2 below. Header fields are transmitted in big-endian order; specifically, from left to right, with the high-order bit of the Version field transmitted first. For systems like SPARC, this is native; however, for little-endian machines (such as Pentium-based systems), software conversion is required during both transmission and reception to align with the network byte order.
Figure 2: Structure of the IP Datagram Header
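As a rough illustration of the fixed 20-byte layout and the byte-order rule, the sketch below unpacks a header with Python's standard `struct` module. The `!` prefix in the format string selects network (big-endian) byte order, so the same code behaves identically on big-endian and little-endian hosts. The function name, the dictionary keys, and the sample field values are assumptions made for this example.

```python
import socket
import struct

FIXED_HEADER = "!BBHHHBBH4s4s"  # 20 bytes, network byte order


def parse_ipv4_header(data):
    """Unpack the fixed 20-byte portion of an IPv4 header into a dict."""
    (ver_ihl, tos, total_length, ident, flags_frag,
     ttl, protocol, checksum, src, dst) = struct.unpack(FIXED_HEADER, data[:20])
    ihl = ver_ihl & 0x0F
    return {
        "version": ver_ihl >> 4,            # high 4 bits of the first byte
        "ihl": ihl,                         # header length in 32-bit words
        "header_bytes": ihl * 4,
        "tos": tos,
        "total_length": total_length,
        "identification": ident,
        "dont_fragment": bool(flags_frag & 0x4000),
        "more_fragments": bool(flags_frag & 0x2000),
        "fragment_offset": flags_frag & 0x1FFF,  # in units of 8 bytes
        "ttl": ttl,
        "protocol": protocol,
        "checksum": checksum,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }


# A made-up header: version 4, IHL 5, TTL 64, protocol 6, checksum left at 0.
sample = struct.pack(FIXED_HEADER, 0x45, 0, 1500, 0x1234, 0x2000, 64, 6, 0,
                     socket.inet_aton("192.41.6.20"),
                     socket.inet_aton("130.50.15.6"))
fields = parse_ipv4_header(sample)
print(fields["version"], fields["header_bytes"], fields["source"])
```

Options, if present, would follow the fixed portion and occupy `header_bytes - 20` bytes; this sketch does not decode them.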
Let us examine the key fields within the IP header:
Version (4 bits): This field indicates the version of the IP protocol to which the datagram conforms. Including the version in each datagram facilitates a gradual transition between protocol versions, enabling networks to operate with mixed versions over extended periods.
Internet Header Length (IHL) (4 bits): As the header length is not constant due to the presence of optional fields, this field specifies the length of the entire IP header, measured in 32-bit words. The minimum value is 5, corresponding to a 20-byte header with no options. The maximum value of 15 (representing 1111 in 4 bits) limits the header to 60 bytes, consequently restricting the options field to 40 bytes. For some options, such as recording a packet's complete route, 40 bytes can prove insufficient, thus limiting the option's practical utility.
Type of Service (ToS) (8 bits): This field allows the sending host to convey to the network the desired service characteristics for the datagram. Various combinations of reliability and speed are theoretically possible. For real-time applications like digitized voice, expedited delivery (minimizing delay) takes precedence over perfect accuracy. Conversely, for file transfers, error-free transmission is more critical than rapid delivery.
The ToS field comprises a three-bit Precedence field, three flag bits (D for Delay, T for Throughput, R for Reliability), and two unused bits. The Precedence field indicates a priority level, ranging from 0 (normal) to 7 (network control packet). The flags allow the host to express its primary concern from the set {Delay, Throughput, Reliability}. In theory, these preferences would enable routers to make informed routing decisions, for instance, choosing between a satellite link (high throughput, high delay) and a leased line (lower throughput, lower delay). In practice, however, most contemporary routers largely disregard the Type of Service field, often relying instead on higher-layer Quality of Service (QoS) mechanisms to manage traffic priorities.
Total Length (16 bits): This field specifies the entire length of the IP datagram, encompassing both the header and the data payload, measured in bytes. The maximum possible length is 65,535 bytes. While this upper limit remains tolerable for current network speeds, the advent of future gigabit networks may necessitate larger datagrams, a consideration addressed in newer IP versions like IPv6.
Identification (16 bits): This field is crucial for reassembling fragmented datagrams at the destination host. All fragments belonging to the same original datagram carry an identical Identification value, allowing the destination to correctly group them.
Flags (3 bits): Following an unused bit, two significant 1-bit flags control fragmentation behavior. DF (Don't Fragment) instructs routers not to fragment the datagram, for example because the destination is unable to reassemble the pieces. MF (More Fragments) is set on every fragment except the last one, so the destination can tell when all fragments of a datagram have arrived.
Fragment Offset (13 bits): This field specifies where in the original datagram this particular fragment belongs. It is measured in units of 8 bytes. All fragments, except the last one in a datagram, must have a payload size that is a multiple of 8 bytes, which is the elementary fragment unit. With 13 bits, there can be at most 8192 fragments per datagram, giving a theoretical maximum datagram length of 65,536 bytes, one more than the 65,535 bytes that the Total Length field can express.
Time to Live (TTL) (8 bits): The TTL field serves as a counter to limit a packet's lifetime within the network. Conceptually, it is designed to count time in seconds, allowing a maximum lifetime of 255 seconds. In practice, however, it primarily functions as a hop counter, being decremented by one at each router it traverses. If the TTL value reaches zero, the packet is discarded, and an ICMP (Internet Control Message Protocol) warning packet is typically sent back to the source host. This feature is vital for preventing datagrams from circulating indefinitely in the event of corrupted routing tables.
Protocol (8 bits): Once the network layer has received or assembled a complete datagram, this field indicates which higher-level transport layer protocol (e.g., TCP, UDP, or others) the data portion should be handed to at the destination. The numbering of these protocols is globally defined and standardized; it was listed in RFC 1700, which has since been replaced by an online database maintained by IANA.
Header Checksum (16 bits): This field provides a checksum specifically for the IP header. It is used to detect errors introduced by memory corruption inside a router. The algorithm typically involves adding up all 16-bit half-words of the header using one's complement arithmetic, then taking the one's complement of the result (the Header Checksum field itself is treated as zero during this calculation). Because fields like the Time to Live change at each hop, the Header Checksum must be recomputed by every router, though optimizations exist to speed up this process.
Source Address (32 bits): This field contains the IP address of the originating host or router.
Destination Address (32 bits): This field contains the IP address of the intended recipient host or router.
Options (Variable length): This field was incorporated into the IP design to provide extensibility. It allows for future protocol enhancements, enables experimenters to test new concepts, and avoids allocating fixed header bits for information that is rarely required. Options are variable in length; each begins with a 1-byte code identifying the option type. Some options are followed by a 1-byte option length field, then one or more data bytes. The Options field is padded with zeros to ensure its total length is a multiple of four bytes. Five standard options are currently defined, as listed in the table below, though not all routers provide support for all of them.
| Option | Description |
|---|---|
| Security | Specifies how secret the datagram is |
| Strict source routing | Gives the complete path to be followed |
| Loose source routing | Gives a list of routers not to be missed |
| Record route | Makes each router append its IP address |
| Timestamp | Makes each router append its address and timestamp |
Let us briefly detail these options:
Security: Intended to specify the confidentiality level of the datagram. While designed for policy enforcement, its practical utility in the general Internet has been limited due to a lack of universal implementation by network devices.
Strict Source Routing: This option provides the complete, exact path that the datagram must follow from source to destination, specified as a sequence of IP addresses. The datagram is strictly required to traverse only these specified routers. Its primary utility lies in network diagnostics, such as enabling system managers to send emergency packets when routing tables are corrupted, or for making precise timing measurements along a predefined path.
Loose Source Routing: Similar to strict source routing, this option requires the packet to traverse the list of routers specified, and in the order specified, but it permits the packet to pass through other routers between the listed ones. Typically, this option would specify only a few key routers to influence a particular general path. For example, to compel a packet traveling from London to Sydney to route westward instead of eastward, this option might specify intermediate routers in New York, Los Angeles, and Honolulu. This option is most useful when political or economic considerations necessitate passing through or avoiding certain countries.
Record Route: This option instructs each router along the datagram's path to append its IP address to the option field. This functionality is invaluable for system managers in diagnosing and tracing routing anomalies (e.g., "Why are packets from Houston to Dallas consistently routing via Tokyo first?").
Timestamp: Similar to the Record Route option, but in addition to recording its 32-bit IP address, each router also appends a 32-bit timestamp. This option primarily serves debugging routing algorithms by providing detailed timing information for each hop.
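Returning to the Header Checksum field described earlier, the computation can be sketched as follows. This is a minimal, unoptimized version; the function name is hypothetical and the sample header bytes are an arbitrary illustrative header, not one taken from this article.

```python
def header_checksum(header):
    """Return the 16-bit one's complement checksum of an IP header.

    The caller is expected to pass the header with its checksum field
    set to zero when computing a fresh checksum.
    """
    if len(header) % 2:
        raise ValueError("header length must be an even number of bytes")
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]  # big-endian 16-bit word
    # Fold any carries back into the low 16 bits (end-around carry).
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Sample 20-byte header with the checksum field (bytes 10-11) zeroed.
header = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
checksum = header_checksum(header)
print(hex(checksum))  # 0xb861

# A receiver sums the header including the stored checksum; 0 means no error detected.
filled = header[:10] + checksum.to_bytes(2, "big") + header[12:]
print(header_checksum(filled))  # 0
```

A router that decrements the TTL would repeat this computation (or apply an incremental update) before forwarding the datagram.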
Every host and router connected to the Internet possesses a unique IP address, which serves to identify its network number and host number. No two machines simultaneously connected to the Internet possess the same IP address. All IP addresses are 32 bits long and are utilized in the Source address and Destination address fields of IP packets. Machines connected to multiple networks (e.g., a router) will have a different IP address on each network interface.
The formats used for IP addresses are illustrated in Figure 3 below.
Figure 3: Formats for IP Addresses (IPv4)
The Class A, B, C, and D formats were originally designed to accommodate a range of network sizes:
Class A: Allows for up to 126 networks, each with approximately 16 million hosts.
Class B: Supports 16,382 networks, each with up to 64K (65,534) hosts.
Class C: Enables approximately 2 million networks (e.g., LANs), each with up to 254 hosts.
Class D: Reserved for multicast addresses, where a datagram is directed to a group of multiple hosts.
Addresses beginning with 11110 are reserved for future use (Class E). Tens of thousands of networks are now connected to the Internet, and this number continues to grow rapidly. Network numbers are assigned by the NIC (Network Information Center) or its successors to prevent conflicts and ensure uniqueness.
Network addresses, being 32-bit numbers, are commonly written in dotted decimal notation. In this format, each of the four bytes is written in decimal, ranging from 0 to 255, separated by dots. For example, 192.41.6.20. The lowest possible IP address is 0.0.0.0, and the highest is 255.255.255.255.
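The relationship between dotted decimal notation, the underlying 32-bit value, and the address class can be illustrated with a short sketch. The function names are hypothetical, and for simplicity every address whose first byte is 240 or higher is labelled "E" here.

```python
def to_int(dotted):
    """Convert a dotted decimal string such as '192.41.6.20' to a 32-bit integer."""
    parts = [int(p) for p in dotted.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError("not a valid dotted decimal address: " + dotted)
    value = 0
    for p in parts:
        value = (value << 8) | p
    return value


def address_class(dotted):
    """Classify an address by the leading bits of its first byte."""
    first = to_int(dotted) >> 24
    if first < 128:
        return "A"  # leading bit 0
    if first < 192:
        return "B"  # leading bits 10
    if first < 224:
        return "C"  # leading bits 110
    if first < 240:
        return "D"  # leading bits 1110 (multicast)
    return "E"      # remaining range, reserved


print(hex(to_int("192.41.6.20")))    # 0xc0290614
print(address_class("192.41.6.20"))  # C
```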
Certain IP address values hold special meanings, as detailed in Figure 4 below.
Figure 4: Special Meanings of IP Address Values
As established, all hosts within a traditional IP network must share the same network number. This property of IP addressing can present challenges as networks expand. Consider a company initially operating with a single Class C LAN connected to the Internet. As its needs grow, it might acquire more than 254 machines, necessitating an additional Class C address. Alternatively, it might acquire a second LAN of a different type and require a separate IP address for it (though bridging LANs to form a single IP network is possible, bridges introduce their own complexities). Eventually, the company could manage numerous LANs, each with its own router and dedicated Class C network number.
As the number of distinct local networks proliferates, their management can become a significant administrative burden. Each time a new network is installed, the system administrator traditionally had to contact the NIC (or its regional equivalents) to obtain a new network number. This new number then required global announcement and updates to routing tables across the Internet. Furthermore, relocating a machine from one LAN to another would necessitate a change in its IP address, potentially requiring modifications to its configuration files and a global announcement of the new IP address. If another machine were subsequently assigned the newly released IP address, it could inadvertently receive email and other data intended for the original machine.
The solution to these problems is to allow a larger network address space to be logically divided into smaller parts for internal use, while still appearing as a single, unified network to the outside world. In Internet terminology, these internal divisions are called subnets. It is important to note that this usage of "subnet" specifically refers to a logical subdivision of a larger network's address space, which differs from its broader, older usage meaning the set of all routers and communication lines in a network. In this section, the former definition is to be applied.
If our growing company had initially acquired a Class B address instead of multiple Class C addresses, it could begin by simply numbering hosts from 1 to 254. When a second LAN became necessary, the company could decide to logically split the 16-bit host number of its Class B address into a smaller subnet number and a host number within that subnet, as illustrated in Figure 5 below. For instance, this split might allocate 6 bits for the subnet number and 10 bits for the host number. This configuration would allow for 62 distinct subnets (2^6 - 2, since the all-0s and all-1s values are typically reserved), each capable of accommodating up to 1022 hosts (2^10 - 2, for the same reason).
Figure 5: Splitting a 16-bit Class B host number into a subnet number and a host number
Crucially, to networks outside the organization, this internal subnetting structure is not visible. Consequently, allocating a new subnet internally does not necessitate contacting the NIC or updating any external databases. In the example configuration, the first subnet might use IP addresses starting at 130.50.4.1, the second subnet at 130.50.8.1, and so forth.
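The numbers in this example can be checked with a short sketch using Python's standard `ipaddress` module. The constant and function names are illustrative, and 130.50.0.0 is assumed to be the company's Class B network, as the sample addresses suggest.

```python
import ipaddress

SUBNET_BITS = 6
HOST_BITS = 10

# All-0s and all-1s values are reserved, hence the "- 2".
usable_subnets = 2 ** SUBNET_BITS - 2
hosts_per_subnet = 2 ** HOST_BITS - 2

# The mask has 1s over the network and subnet bits and 0s over the host bits.
mask = ipaddress.IPv4Address((0xFFFFFFFF << HOST_BITS) & 0xFFFFFFFF)


def first_host(network, subnet):
    """Return the first host address on the given subnet of a Class B network."""
    base = int(ipaddress.IPv4Address(network))
    return str(ipaddress.IPv4Address(base + (subnet << HOST_BITS) + 1))


print(usable_subnets, hosts_per_subnet)  # 62 1022
print(mask)                              # 255.255.252.0
print(first_host("130.50.0.0", 1))       # 130.50.4.1
print(first_host("130.50.0.0", 2))       # 130.50.8.1
```

Because the host part is 10 bits wide, consecutive subnets start 1024 addresses apart, which is why the third byte advances in steps of 4.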
To understand how subnets function, it is necessary to explain how IP packets are processed at a router. Each router maintains a routing table that lists various network IP addresses and specific host IP addresses (for local hosts). The former entries dictate how to reach distant networks, while the latter specify how to reach hosts directly connected to the router's local network segments. Associated with each table entry is the network interface to use for reaching the destination, along with other routing metrics.
When an IP packet arrives, its destination address is examined and looked up in the routing table. If the packet is destined for a distant network, it is forwarded to the next router on the interface specified in the table. If it is destined for a local host (e.g., a machine on the router's LAN), it is sent directly to that destination. If the destination network is not explicitly present in the router's table, the packet is typically forwarded to a default router with more extensive routing knowledge. This hierarchical algorithm significantly reduces the size of routing tables, as each router primarily needs to track other networks and its directly connected local hosts, rather than individual (network, host) pairs for every machine on the Internet.
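The lookup procedure just described might be sketched as below. This is a deliberately simplified model, not how a production router is implemented: the tables are plain dictionaries, the interface and router names are invented for the example, and only Class A, B, and C destinations are handled.

```python
# Hypothetical routing state for one router (all names are made up).
LOCAL_HOSTS = {"130.50.15.6": "eth0"}                   # directly reachable hosts
NETWORK_ROUTES = {"192.41.6.0": ("eth1", "router-B")}   # distant networks
DEFAULT_ROUTE = ("eth2", "default-router")


def classful_network(dotted):
    """Return the classful network address for a Class A, B, or C destination."""
    octets = dotted.split(".")
    first = int(octets[0])
    if first < 128:
        keep = 1   # Class A: 1 network byte
    elif first < 192:
        keep = 2   # Class B: 2 network bytes
    elif first < 224:
        keep = 3   # Class C: 3 network bytes
    else:
        raise ValueError("only Class A, B, and C are handled in this sketch")
    return ".".join(octets[:keep] + ["0"] * (4 - keep))


def forward(destination):
    """Decide what to do with a packet addressed to destination."""
    if destination in LOCAL_HOSTS:
        return ("deliver directly", LOCAL_HOSTS[destination], destination)
    network = classful_network(destination)
    if network in NETWORK_ROUTES:
        interface, next_hop = NETWORK_ROUTES[network]
        return ("forward", interface, next_hop)
    interface, next_hop = DEFAULT_ROUTE
    return ("forward via default", interface, next_hop)


print(forward("130.50.15.6"))  # local host
print(forward("192.41.6.20"))  # known distant network
print(forward("10.1.2.3"))     # unknown network, default router
```

Note that the table holds one entry per distant network and one per local host, which is the size reduction the hierarchical scheme is meant to achieve.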
When subnetting is introduced, the routing tables are adapted to include entries of the form (this-network, subnet, 0) and (this-network, this-subnet, host). Thus, a router on subnet k knows how to reach all other subnets within its larger network and also how to reach all hosts specifically on its own subnet k. It is not required to know the intricate details of individual hosts on other subnets within the same larger network.
The mechanism that enables this involves the use of a subnet mask (as depicted in Figure 5 above). When a packet arrives, each router performs a Boolean AND operation between the packet's destination IP address and the network's subnet mask to derive the subnet address. This resulting subnet address is then looked up in the routing tables. For example, if a packet addressed to 130.50.15.6 arrives at a router on subnet 5, and the subnet mask is 255.255.252.0 (which corresponds to the 6-bit subnet, 10-bit host split), the Boolean AND operation would yield the address 130.50.12.0. This address would then be looked up in the routing tables to find the appropriate path to hosts on subnet 3. Consequently, the router on subnet 5 is spared the task of maintaining data link addresses for hosts beyond its immediate subnet. Subnetting therefore establishes a logical three-level hierarchy (network, subnet, host), substantially reducing router table space and simplifying network administration.
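The masking step in this example can be reproduced with a small sketch based on Python's standard `ipaddress` module. The function names are hypothetical; the 10-bit host and 6-bit subnet widths are the ones assumed in the running example.

```python
import ipaddress


def subnet_address(destination, mask):
    """Boolean AND of a destination address with a subnet mask."""
    dest = int(ipaddress.IPv4Address(destination))
    mask_bits = int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(dest & mask_bits))


def subnet_number(destination, host_bits=10, subnet_bits=6):
    """Extract the subnet number from the host part of a Class B address."""
    dest = int(ipaddress.IPv4Address(destination))
    return (dest >> host_bits) & ((1 << subnet_bits) - 1)


print(subnet_address("130.50.15.6", "255.255.252.0"))  # 130.50.12.0
print(subnet_number("130.50.15.6"))                    # 3
```

The third byte shows the effect most clearly: 15 is 00001111 in binary, 252 is 11111100, and their AND is 00001100, which is 12. The upper six bits of that byte, 000011, give subnet 3.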
The Network Layer, driven by the Internet Protocol, forms the resilient backbone of global digital communication. Its robust design, encompassing flexible addressing schemes, efficient datagram handling, and adaptable routing mechanisms, allows data to navigate the Internet's complex and ever-changing landscape. The concepts of IP addressing, subnetting, and the detailed structure of the IP datagram are fundamental building blocks for comprehending how data truly finds its way from any source to any destination across the world.
As we continue to build our understanding of computer networks, our next exploration will delve deeper into the sophisticated world of routing protocols. We will examine how routers dynamically learn and share information about network paths to make intelligent forwarding decisions, ensuring efficient and reliable data delivery across the Internet.
Rajeev Kumar is the primary author of How2Lab. He holds a B.Tech. from IIT Kanpur and has several years of experience in IT education and software development. He has taught a wide spectrum of people, including fresh young talents, students of premier engineering colleges and management institutes, and IT professionals.
Rajeev founded Computer Solutions & Web Services Worldwide. He has hands-on experience building a variety of websites and business applications, including SaaS-based ERP and e-commerce systems and cloud-deployed operations management software for healthcare, manufacturing, and other industries.