How the Internet Works

Despite the internet shaping modern life, few of us know how it really works.

On the surface, we all share the same definition of the internet. Colloquially, we understand it to be a global network of interconnected computers. We associate the internet with connectivity and with having access to the world’s information.

The Dream Machine is a decent chronological account of how the internet came to be: it emerged from rapid experimentation and practical progress. Its origin story is a messy one, full of inventions and innovations that build on each other to create something incredible. As a result, the best theoretical framework to help us understand the structure of the ‘internet’ arrived after its invention.

In 1980, Hubert Zimmermann proposed the OSI Reference Model, a step toward a shared set of standards for ‘Open Systems Interconnection’. The paper doesn’t use the word ‘internet’, which at the time referred specifically to a practical implementation using TCP/IP.

Its primary contribution toward standardising the rules of interaction between interconnected systems was to propose a seven-layer reference architecture.

| Layer Name | Layer Description | Examples |
| --- | --- | --- |
| Application | Where end-user applications access network services. | HTTP, FTP, SMTP |
| Presentation | Responsible for the translation, encryption, and compression of data. Converts data from the application format to a common format and vice versa, ensuring that data sent from the application layer of one system is readable by the application layer of another. | SSL, TLS, ASCII |
| Session | Establishes, manages, and terminates sessions between applications, e.g. re-synchronising data after a disconnection. | NetBIOS, RPC |
| Transport | Provides data transfer between hosts across a network, including (with protocols such as TCP) complete and error-free delivery. It also handles flow control, ensuring data is sent at a rate that can be handled by the receiving end. | TCP, UDP |
| Network | Handles addressing and routing of data packets. Determines how data is transferred and routed through the network. This layer also handles packet switching and congestion control. | IP, ICMP |
| Data Link | Responsible for node-to-node data transfer; detects and may correct errors from the physical layer. Divided into two sublayers, Logical Link Control (LLC) and Media Access Control (MAC). This layer sets up links across the physical network, putting packets into network frames. | Ethernet, PPP, Wi-Fi, Bluetooth |
| Physical | The physical transmission of raw bitstreams over a physical medium. The electrical and physical aspects of data transmission (switches, cables, etc). | Ethernet, USB, Wi-Fi, Bluetooth |
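To make the layering concrete, here is a minimal Python sketch of encapsulation: on the way down, each layer wraps the data from the layer above with its own header, and on the way up each layer strips its header off again. The bracketed text headers and the three-layer subset are invented for illustration; real headers are binary structures defined by each protocol.

```python
# Toy encapsulation. The "[layer]" headers are made up for this example.
LAYERS = ["transport", "network", "data-link"]  # illustrative subset, top to bottom


def encapsulate(payload: bytes) -> bytes:
    """Sender side: each lower layer prepends its own header."""
    for layer in LAYERS:
        payload = f"[{layer}]".encode() + payload
    return payload


def decapsulate(frame: bytes) -> bytes:
    """Receiver side: strip headers in the opposite order."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode()
        if not frame.startswith(header):
            raise ValueError(f"missing {layer} header")
        frame = frame[len(header):]
    return frame


frame = encapsulate(b"GET / HTTP/1.1")
print(frame)               # outermost header belongs to the lowest layer
print(decapsulate(frame))  # the original application data
```

The point of the sketch is only the ordering: the application never sees the lower-layer headers, and each layer only needs to understand its own.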

The Application Layer (Layer 7)

The Application Layer provides the interface through which users interact with network services. It encompasses a wide range of protocols and services that form the backbone of much of the internet and network communication used today, including:

  • HTTP stands for Hypertext Transfer Protocol, and it is the foundational protocol used for transmitting data over the World Wide Web. It defines how messages are formatted and transmitted, and what actions web servers and browsers should take in response to various commands. HTTP is largely about defining requests and responses. Key points include:
    • The ‘S’ in HTTPS stands for secure, usually via SSL or TLS, which are presentation layer protocols.
    • Each request and response has a header and a body.
      • The body is likely to be the HTML page or JSON you’re requesting or submitting to the server.
      • The header can be broken into three parts. Here are some of the fields:
        • General (example): Request URL, Request Method (GET), Status Code (201), Remote Address, Referrer Policy
        • Response (example): Server, Set-Cookie, Content-Type (HTML), Content-Length (bytes), Date
        • Request (example): Cookies, Accept (response formats the client can handle), Content-Type, Content-Length, Authorisation (Auth Token), User-Agent (OS, Browser Version), Referrer (Last Website Visited)
      Request Methods…
        • POST: Create or search. Adds something to the server (blog post, form submission, etc.)
        • GET: Retrieves data from the server
        • PUT: Updates something on the server
        • DELETE: Deletes something from the server
        • PATCH: Partial update
      Response Codes…
        • 1xx: Informational responses indicate a provisional response, acknowledging that the request has been received and understood.
          • 100 Continue
          • 101 Switching Protocols
          • 102 Processing
        • 2xx: Successful responses indicate the request was received, understood, and processed by the server.
          • 200 OK
          • 201 Created
          • 202 Accepted
          • 203 Non-Authoritative Information
          • 204 No Content
          • 205 Reset Content
          • 206 Partial Content
          • 207 Multi-Status
          • 208 Already Reported
        • 3xx: Redirects indicate further action is needed to complete the request.
          • 300 Multiple Choices
          • 301 Moved Permanently
          • 302 Found
          • 303 See Other
          • 304 Not Modified
          • 305 Use Proxy
          • 307 Temporary Redirect
          • 308 Permanent Redirect
        • 4xx: Client errors indicate that there was a problem with the request.
          • 400 Bad Request
          • 401 Unauthorised
          • 402 Payment Required
          • 403 Forbidden
          • 404 Not Found
          • 405 Method Not Allowed
          • 407 Proxy Authentication Required
          • 408 Request Timeout
          • 409 Conflict
          • 410 Gone
          • 412 Precondition Failed
          • 416 Requested Range Not Satisfiable
          • 417 Expectation Failed
          • 422 Unprocessable Entity
          • 424 Failed Dependency
          • 426 Upgrade Required
          • 429 Too Many Requests
          • 431 Request Header Fields Too Large
          • 451 Unavailable for Legal Reasons
        • 5xx: Server errors indicate the request appeared valid, but an error on the server prevented fulfilment.
          • 500 Internal Server Error
          • 501 Not Implemented
          • 502 Bad Gateway
          • 503 Service Unavailable
          • 504 Gateway Timeout
          • 505 HTTP Version Not Supported
          • 506 Variant Also Negotiates
          • 507 Insufficient Storage
          • 508 Loop Detected
          • 510 Not Extended
          • 511 Network Authentication Required
  • FTP stands for File Transfer Protocol. FTP is also a protocol used for transferring data over the internet. FTP though is specialised for file management and transfer, while HTTP is designed for accessing web resources.
    • FTP can maintain a connection for multiple file transfers, whereas early HTTP (1.0) opened a new connection for each request/response cycle; HTTP/1.1 and later can reuse a connection across requests.
    • FTP often requires explicit authentication, whereas historically HTTP didn’t
    • FTP is more efficient for continuous file transfers.
  • The WebSockets protocol is optimised for real-time communication.
    1. WebSockets have two main components, the handshake and the data frames
      • The handshake. A WebSocket connection is initiated with an HTTP handshake: the client requests an upgrade to WebSockets, and the server responds with an upgrade header in its response, establishing the WebSocket connection.
      • The Data Frames. Once the connection is established, data is transmitted in frames, which can be text or binary data. This allows for efficient and fast data transfer.
      WebSockets have a number of interesting attributes…
      • Bidirectional: they allow for two-way communication, i.e. the server and client can each send data without a request being made by the other.
      • Full-Duplex Channel: the client and the server can send and receive messages independently and simultaneously over the same connection.
      • Persistent Connection: Once established, the WebSocket connection remains open, providing a persistent connection between client and server.
      • Lower Overhead: After the initial handshake, data can be sent back and forth with minimal overhead, making WebSockets more efficient for scenarios where frequent, small messages are exchanged.
    2. WebSockets are used for interactive web apps (dashboards), real-time applications (chat) or Internet of Things applications.
    3. Long polling is an alternative, HTTP-based approach to WebSockets.

      The server holds the client’s HTTP request open instead of responding immediately. This allows the server to reply later, when data is available or the timeout threshold is reached. The client then sends the next request after receiving the response.

  • SMTP/IMAP/POP are all standard email communication protocols for sending, receiving and managing emails. They differ in their approach to storage: SMTP is for transmission, not storage; with IMAP, emails are stored on the server; with POP3, emails are typically downloaded to a device and removed from the server.
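Returning to HTTP: the request/response structure described above (a start line, headers, a blank line, then an optional body) can be sketched in a few lines of Python. The function names `build_request`, `parse_response` and `status_class`, and the host `example.com`, are assumptions made for this example. It is a simplified illustration rather than a complete parser; a real client would normally use a library.

```python
def build_request(method, path, host, headers=None, body=b""):
    """Serialise a request: start line, headers, blank line, body."""
    fields = {"Host": host, **(headers or {})}
    if body:
        fields["Content-Length"] = str(len(body))
    head = [f"{method} {path} HTTP/1.1"] + [f"{k}: {v}" for k, v in fields.items()]
    return "\r\n".join(head).encode("ascii") + b"\r\n\r\n" + body


def parse_response(raw: bytes):
    """Split a raw response into status code, reason, headers and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return int(code), reason, headers, body


def status_class(code: int) -> str:
    """Map a status code to its class using the first digit."""
    classes = {1: "informational", 2: "success", 3: "redirect",
               4: "client error", 5: "server error"}
    return classes[code // 100]


print(build_request("GET", "/posts/1", "example.com"))

raw = b'HTTP/1.1 201 Created\r\nContent-Type: application/json\r\n\r\n{"id": 1}'
code, reason, headers, body = parse_response(raw)
print(code, reason, status_class(code), headers, body)
```

Lower-casing the header names on the way in is a deliberate choice: HTTP header names are case-insensitive, so normalising them makes lookups predictable.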
On REST and GraphQL
  • REST (Representational State Transfer) is an architectural style that uses existing web standards and protocols, primarily HTTP. It is based on the concept of resources, each identified by URLs, and uses standard HTTP methods like GET, POST, PUT, and DELETE. RESTful services are stateless, and they often use JSON or XML to format data.
  • GraphQL is a query language for APIs and a runtime for executing queries using a type system defined for the data. It allows clients to specify exactly what data they need, reducing over-fetching or under-fetching issues common in traditional REST APIs. Unlike REST, which is an architectural style, GraphQL is more prescriptive, offering a specific syntax and system for requesting and delivering data.
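The over-fetching point can be illustrated with a toy Python sketch. Nothing here is a real REST framework or GraphQL runtime: `POSTS`, `rest_get` and `select_fields` are hypothetical names, and the field selection is only a stand-in for what a GraphQL server does with a query such as `{ post(id: 1) { title } }` (which assumes a schema exposing a `post` field).

```python
# Hypothetical in-memory "server" data.
POSTS = {1: {"id": 1, "title": "Hello", "body": "A long article...", "author": "ann"}}


def rest_get(path: str) -> dict:
    """REST style: the URL names a resource and GET returns all of it."""
    _, collection, ident = path.split("/")
    if collection != "posts":
        raise KeyError(path)
    return POSTS[int(ident)]


def select_fields(resource: dict, fields: list) -> dict:
    """GraphQL style (greatly simplified): return only the requested fields."""
    return {name: resource[name] for name in fields}


full = rest_get("/posts/1")                      # every field, needed or not
only_title = select_fields(POSTS[1], ["title"])  # just what the client asked for
print(sorted(full))
print(only_title)
```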

The Presentation Layer (Layer 6)

The Presentation Layer plays a crucial role in the management and interpretation of data formats in network communication.

  • The presentation layer helps networks have the following attributes…
    • Interoperability across diverse systems and applications. Ensures that data formatted in a proprietary way by one application is readable by another.
    • Efficient Communication. By compressing data, the Presentation Layer reduces the bandwidth needed for data transmission. Effective data compression and decompression techniques can significantly speed up data transfer rates.
    • Security: With its role in encryption, it's a critical layer for ensuring data privacy and security in communication.
  • The fundamental functions of the presentation layer:
    • Data Translation: Converting data between the formats the network requires and the formats the application needs. Ensures data from the application layer of one system can be understood by the application layer of another.
    • Data Encryption and Decryption: Responsible for encrypting data before it's transmitted and decrypting data upon receipt. Enhances security by ensuring that data cannot be easily understood if intercepted.
    • Data Compression: Reduces the size of data to be transmitted over the network, increasing transmission efficiency.
  • The key protocols and standards:
    • SSL/TLS for Security. Protocols that provide security measures, primarily used in HTTPS for secure communication over the Internet.
      • SSL (Secure Sockets Layer) is a cryptographic protocol designed to provide secure communication over a computer network. It was widely used for secure transactions on the Internet, such as online shopping and banking, by encrypting the data transmitted between a web server and a client. It is now deprecated in favour of TLS.
      • TLS (Transport Layer Security) is the successor to SSL, providing stronger and more versatile encryption capabilities. It is the standard protocol used for establishing secure internet connections and ensuring data privacy and integrity between two communicating applications.
      • ℹ️
        Major web browsers and search engines, like Google, advocate the use of SSL/TLS, often marking non-HTTPS sites as insecure and potentially ranking them lower in search results.
    • ASCII for Text Data. ASCII (American Standard Code for Information Interchange) is commonly used for text data representation in English.
    • JPEG, GIF, TIFF for Images. Standards for image encoding, ensuring images are consistently rendered across different systems.
    • MPEG, QuickTime, MIDI for Multimedia. Protocols and standards for video and audio data.
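Two of the functions above, data translation and data compression, can be sketched with Python’s standard library. The helper names `to_wire` and `from_wire` are invented for this example, and UTF-8 plus zlib are just one possible choice of common format and compression scheme; encryption is left out because it needs more than a few lines to do properly.

```python
import zlib


def to_wire(text: str) -> bytes:
    """Translate text to a common byte format (UTF-8), then compress it."""
    return zlib.compress(text.encode("utf-8"))


def from_wire(data: bytes) -> str:
    """Reverse the steps on the receiving side."""
    return zlib.decompress(data).decode("utf-8")


# ASCII maps each character to one byte: "A" is 0x41.
print("A".encode("ascii"))

message = "hello " * 100
wire = to_wire(message)
print(len(message.encode("utf-8")), "bytes before,", len(wire), "bytes after")
print(from_wire(wire) == message)
```

Repetitive data like this compresses very well; already-compressed formats such as JPEG would gain little from a second pass.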

The Session Layer (Layer 5)

The Session Layer manages and controls the interactions between applications on different network devices.

  • The session layer helps provide the following system attributes:
    • Session Integrity by ensuring that data exchange sessions are reliably maintained, providing stable communication links for applications.
    • Interoperability by facilitating interaction between different systems and applications, essential for diverse network environments.
    • Error Recovery by incorporating checkpoints that enable effective error recovery mechanisms, enhancing overall network resilience.
  • The fundamental functions of the session layer:
    • Session Management. Manages sessions between two parties, ensuring proper opening and closing for data exchange integrity.
    • Dialog Control. Manages application dialogues, enabling communication in half-duplex or full-duplex modes. Coordinates communication directions for orderly data exchange.
    • Synchronisation. Introduces data checkpoints for recovery during disruptions, ensuring efficient data recovery and continuous communication.
  • In modern tech stacks, the functionalities traditionally associated with the Session Layer are often implemented at the Application Layer or are an integral part of the Transport Layer protocols.
  • The key protocols and standards:
    • RPC (Remote Procedure Call) allows a program to execute a procedure in another address space, often on a different computer on a network. It simplifies inter-process communication by handling the necessary network communication. RPC is widely used in web services and APIs, such as XML-RPC and JSON-RPC. In microservices architecture, services often communicate via RPC, enabling modular and scalable application development. RPC is also used in server-client applications on the web for server-side processing triggered by client requests.
    • PPTP (Point-to-Point Tunneling Protocol) is used for implementing virtual private networks (VPNs). Operates at the Session Layer to manage tunnels between points in a network.
    • NetBIOS (Network Basic Input/Output System) provides services related to the Session Layer for LANs. Used for establishing and managing sessions between devices on a network.
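The synchronisation idea (checkpoints that let a transfer resume after a disruption) can be sketched as follows. This is a toy simulation, not any real session protocol: `send_with_checkpoints` and its `fail_at` parameter are hypothetical, and the "disconnection" is just an early return.

```python
def send_with_checkpoints(chunks, start=0, fail_at=None):
    """Send chunks from `start`; return (delivered, last_checkpoint).

    `fail_at` simulates a disconnection just before that chunk index.
    """
    delivered = []
    checkpoint = start
    for i in range(start, len(chunks)):
        if i == fail_at:
            return delivered, checkpoint  # connection lost here
        delivered.append(chunks[i])
        checkpoint = i + 1                # everything before this index is confirmed
    return delivered, checkpoint


chunks = ["a", "b", "c", "d"]
first, checkpoint = send_with_checkpoints(chunks, fail_at=2)
# Resume from the last checkpoint instead of starting again from zero.
rest, checkpoint = send_with_checkpoints(chunks, start=checkpoint)
print(first, rest, checkpoint)
```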

The Transport Layer (Layer 4)

The Transport Layer is a pivotal component in managing the end-to-end communication process across a network.

  • The Transport Layer contributes to the following system attributes:
    • End-to-End Communication. Manages data transmission between the source and destination hosts, regardless of the underlying network types.
    • Quality of Service (QoS). Provides different levels of service based on the application's requirements, such as real-time data transmission for video calls.
    • Independence from Network Layer. Operates independently of the Network Layer, providing flexibility to work over different network types and topologies.
  • Foundational functions of the Transport Layer:
    • Segmentation and Reassembly. Divides data from the Application Layer into smaller segments for easier transmission and reassembles these segments at the destination.
    • Connection Management. Establishes, maintains, and terminates connections between communicating devices.
    • Flow Control. Regulates data transmission rate to ensure that a fast sender does not overwhelm a slow receiver.
    • Error Handling and Correction. Detects and, in some cases, corrects errors that may occur during data transmission.
    • Port Management. Uses port numbers to direct data to the correct application processes on the source and destination devices.
    • ℹ️
      What’s the difference between a Session and a Connection? While a connection sets up the pathway for data to travel between nodes, a session ensures that this data exchange is coherent, continuous, and meaningful from an application’s perspective.
  • Protocols and functions of the Transport Layer:
    • TCP (Transmission Control Protocol) ensures reliable, ordered, and error-checked delivery of data. Connection-oriented: establishes a connection before data can be sent.
      • TCP connections are initiated through a process called the three-way handshake. This involves the exchange of SYN (synchronise) and ACK (acknowledgment) packets between the client and server, establishing a connection before data transfer begins.
      • TCP is widely used by the Internet's most popular applications and services, including HTTP/HTTPS (for web traffic), SMTP (for email), and FTP (for file transfer).
      • While TCP is highly reliable, it can be slower than UDP due to its emphasis on reliability and ordered delivery. However, this makes it ideal for applications where data integrity is more important than speed.
    • UDP (User Datagram Protocol) enables low-latency, loss-tolerating communication between applications. Connectionless: sends messages, called datagrams, without establishing a dedicated connection.
    • SCTP (Stream Control Transmission Protocol) combines features of TCP and UDP, used for message-oriented data transmission. Provides multi-streaming and multi-homing capabilities for more reliable connections.
    • DCCP (Datagram Congestion Control Protocol) is suited to unreliable data streams that still require congestion control, such as streaming media.
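Segmentation and reassembly, as described above, can be sketched like this. The sequence numbers here count segments for simplicity (TCP actually numbers bytes), and `segment` and `reassemble` are names invented for the example.

```python
import random


def segment(data: bytes, size: int):
    """Split data into (sequence_number, chunk) pairs of at most `size` bytes."""
    return [(seq, data[offset:offset + size])
            for seq, offset in enumerate(range(0, len(data), size))]


def reassemble(segments):
    """Put segments back in order by sequence number and join them."""
    return b"".join(chunk for _, chunk in sorted(segments))


segments = segment(b"hello world", 4)
print(segments)

random.shuffle(segments)     # packets may arrive in any order
print(reassemble(segments))  # the receiver restores the original order
```

The sequence number is what makes ordered delivery possible: without it, the receiver would have no way to tell that segments arrived out of order.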

The Network Layer (Layer 3)

The Network Layer is pivotal in global network infrastructure, facilitating the routing and forwarding of data packets across different networks. This layer is where you'll find some of the core protocols that make the internet function.

  • The Network Layer contributes to the following system attributes:
    • Internetworking Capability: Enables internetworking, which is the interconnection of multiple networks, forming a single large network.
    • Decoupling of Data: Decouples the data transport from the physical network layout, allowing different network technologies to communicate.
    • Path Determination: Uses various routing algorithms and metrics to determine the best path for data.
    • Error Reporting and Diagnostics: Provides means to report errors and network issues back to the source for diagnostic purposes.
  • Fundamental functions of the Network Layer:
    • Routing determines the optimal path along which network traffic should be forwarded from the source to the destination.
    • Logical Addressing uses logical addresses (like IP addresses) to uniquely identify each device on the network.
    • Packet Forwarding handles the forwarding of data units (known as packets in the context of the Network Layer) from one router to another until they reach their destination.
    • Subnetting and Addressing divides the network into smaller sub-networks (subnets) to manage and optimise traffic.
    • Traffic Congestion Handling implements congestion control mechanisms to avoid network bottlenecks.
  • Key Protocols and Standards:
    • IP (Internet Protocol) is the fundamental protocol that defines IP addresses and how data packets are routed through the internet. Includes both IPv4 and IPv6 standards.
      • It provides the foundation for sending and receiving data across networks.
      • IP addresses are unique numerical identifiers assigned to each device connected to a network.
      • IPv4 is the most widely used version of IP, using a 32-bit address format.
      • However, with the growth of the internet and the increasing number of devices, IPv6 (Internet Protocol version 6) was developed to provide a larger address space using a 128-bit format.
      • IP is connectionless, meaning that it does not establish a dedicated connection before transmitting data. Instead, it breaks data into packets and adds source and destination IP addresses to each packet.
      • These packets are then routed independently across the network, potentially taking different paths to reach the destination.
      • IP relies on routers to forward packets based on the destination IP address, making routing decisions in real-time to optimise data transmission.
      • In addition to addressing and routing, IP also handles fragmentation and reassembly of data packets.
      • When data is too large to fit into a single packet, IP divides it into smaller fragments, each with its own IP header. At the destination, IP reassembles the fragments back into the original data.
    • ICMP (Internet Control Message Protocol) is used for diagnostic and error-reporting purposes. For example, it's used by the ping utility to test connectivity.
      • The Ping utility’s primary purpose is to determine whether a specific IP address is accessible and to measure how long it takes for messages to travel round-trip between the originating host and the destination.
    • IPsec (Internet Protocol Security) is a suite of protocols for securing IP communications by authenticating and encrypting each IP packet in a data stream.
    • Routing Protocols (OSPF, BGP, RIP). OSPF (Open Shortest Path First), BGP (Border Gateway Protocol), RIP (Routing Information Protocol) are among the key protocols that manage routing decisions.
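Logical addressing, subnetting and packet forwarding can be explored with Python’s standard `ipaddress` module. The routing table below, including the router names, is entirely made up; the lookup uses longest-prefix matching, which is how routers commonly pick the most specific route for a destination.

```python
import ipaddress

# IPv4 addresses are 32 bits wide, IPv6 addresses 128 bits.
print(ipaddress.ip_address("192.0.2.1").max_prefixlen)
print(ipaddress.ip_address("2001:db8::1").max_prefixlen)

# Subnetting: split one /24 network into four /26 subnets.
for subnet in ipaddress.ip_network("192.168.0.0/24").subnets(new_prefix=26):
    print(subnet)

# A hypothetical routing table: network -> next hop.
ROUTES = {
    ipaddress.ip_network("10.0.0.0/8"): "router-a",
    ipaddress.ip_network("10.1.0.0/16"): "router-b",
    ipaddress.ip_network("0.0.0.0/0"): "default-gateway",
}


def next_hop(destination: str) -> str:
    """Pick the matching route with the longest (most specific) prefix."""
    address = ipaddress.ip_address(destination)
    matches = [network for network in ROUTES if address in network]
    best = max(matches, key=lambda network: network.prefixlen)
    return ROUTES[best]


print(next_hop("10.1.2.3"))  # matches /8, /16 and /0; the /16 wins
print(next_hop("8.8.8.8"))   # only the default route matches
```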

The Data Link Layer (Layer 2)

The Data Link Layer is fundamental to network communication, providing a reliable link between adjacent network devices. It plays a critical role in handling the physical and logical connections to the network hardware.

  • Attributes of the Data Link Layer:
    • Reliable Link. Ensures a reliable and error-free link between two directly connected nodes.
    • Point-to-Point and Point-to-Multipoint. Can facilitate both point-to-point and point-to-multipoint connections.
    • Hardware-Oriented. Closely tied to network hardware and technologies, providing a layer of abstraction above the raw data transmission of the Physical Layer.
  • Fundamental Functions of the Data Link Layer:
    • Framing: Divides data streams into frames, adding necessary headers and trailers for error checking and flow control.
    • Physical Addressing: Adds MAC (Media Access Control) addresses to frames, identifying the source and destination devices on the local network.
      • MAC addresses are used as network addresses for most IEEE 802 network technologies, including Ethernet and Wi-Fi. A MAC address is a 48-bit number (6 bytes long) displayed as six pairs of hexadecimal digits, separated by colons or hyphens (e.g. 00:1A:C2:7B:00:47). They are intended to be globally unique identifiers for network interfaces, assigned by the manufacturer of the network interface card (NIC) and stored in its hardware.
    • Error Detection and Handling: Detects and, in some cases, corrects errors within a frame. Common methods include Cyclic Redundancy Check (CRC).
    • Flow Control: Manages the pace of data transmission between devices to prevent a fast sender from overwhelming a slow receiver.
    • Access Control: Controls which device has access to the network at any one time, particularly in shared media (like Wi-Fi or Ethernet).
  • Key Protocols and Standards:
    • Ethernet: The most common Data Link Layer protocol used in LANs. Ethernet frames encapsulate packets of data, adding source and destination MAC addresses.
    • PPP (Point-to-Point Protocol): Used for creating direct connections between two nodes, commonly in dial-up Internet connections.
    • ARP (Address Resolution Protocol): Resolves IP addresses into MAC addresses, enabling communication on a local network.
    • VLAN (Virtual LAN): VLAN tags are used to segment a network at the Data Link Layer, allowing separate broadcast domains within the same physical network.
    • IEEE 802.11: The set of standards for implementing wireless local area networking, commonly known as Wi-Fi.
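Framing, physical addressing and error detection can be combined in one small sketch. The frame layout here (destination MAC, source MAC, payload, CRC trailer) is a deliberately simplified assumption and not the real Ethernet format, which has additional fields; `make_frame` and `check_frame` are hypothetical helpers, and `zlib.crc32` stands in for the frame check sequence.

```python
import zlib


def mac_to_bytes(mac: str) -> bytes:
    """Convert '00:1A:C2:7B:00:47' (or hyphen-separated) to 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", "").replace("-", ""))


def make_frame(dst: str, src: str, payload: bytes) -> bytes:
    """Header (two MAC addresses) + payload + 4-byte CRC trailer."""
    body = mac_to_bytes(dst) + mac_to_bytes(src) + payload
    return body + zlib.crc32(body).to_bytes(4, "big")


def check_frame(frame: bytes) -> bool:
    """Recompute the CRC over the body and compare it with the trailer."""
    body, trailer = frame[:-4], frame[-4:]
    return zlib.crc32(body).to_bytes(4, "big") == trailer


frame = make_frame("00:1A:C2:7B:00:47", "00-1A-C2-7B-00-48", b"hello")
print(check_frame(frame))      # intact frame passes the check

corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]  # flip one bit in transit
print(check_frame(corrupted))  # the CRC mismatch exposes the error
```

A CRC only detects the error; what happens next (dropping the frame, requesting retransmission) depends on the protocol.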

The Physical Layer (Layer 1)

The Physical Layer deals with the physical aspects of network communication. It's concerned with the transmission and reception of raw bitstreams over a physical medium. It serves as the bedrock of network communication, providing the means for transmitting raw data over various forms of media. Its roles in defining the physical properties of the network and in handling the actual transmission of data are fundamental to all types of network communications.

  • Attributes of the Physical Layer:
    • Hardware-Oriented. Involves hardware components like hubs, repeaters, network adapters, cables, and connectors.
    • Signal Transmission. Handles the details of signal transmission, including voltage levels, data rates, and maximum transmission distances.
    • Interface Characteristics. Defines the electrical and physical specifications for devices and transmission media.
  • Fundamental Functions:
    • Data Transmission: Transforms data into electrical, optical, or radio signals suitable for transmission over physical media.
    • Physical Medium: Defines the characteristics of the physical medium used for data transmission, including cable types (like coaxial, fiber optic) and wireless transmission methods.
    • Signal Modulation: Manages the modulation and demodulation processes, which involve converting digital data into signals and vice versa.
    • Bit Rate Control: Determines the rate at which data is transmitted, measured in bits per second (bps).
    • Physical Topology: Influences the physical design of the network, including the layout of network devices and connections.
  • Key Protocols and Standards:
    • Ethernet standards define the wiring and signaling for the physical layer of local area networks (LANs).
    • Wi-Fi (IEEE 802.11) specifies the standards for implementing wireless local area networking.
    • Bluetooth (IEEE 802.15.1) is a standard for short-range wireless communication between devices.
    • DSL (Digital Subscriber Line) is a family of technologies used to transmit digital data over telephone lines.
    • USB (Universal Serial Bus) is an industry standard for cables, connectors, and protocols for connection, communication, and power supply between devices.
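As a rough illustration of turning data into signals and of bit rate, the sketch below converts bytes to bits, maps each bit to a signal level, and estimates transmission time. The two-level scheme and the +1.0/-1.0 values are assumptions chosen for simplicity; real line codes and modulation schemes are considerably more involved.

```python
def to_bits(data: bytes):
    """Expand bytes into a list of bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def signal_levels(bits, high=1.0, low=-1.0):
    """Map each bit to a signal level (a simple two-level scheme)."""
    return [high if bit else low for bit in bits]


def transmission_time(num_bytes: int, bits_per_second: float) -> float:
    """Seconds needed to put `num_bytes` on the wire at a given bit rate."""
    return num_bytes * 8 / bits_per_second


bits = to_bits(b"A")  # "A" is 0x41
print(bits)
print(signal_levels(bits))
print(transmission_time(1_000_000, 100_000_000))  # 1 MB over a 100 Mbps link
```

Note that this only accounts for the time to serialise the bits; propagation delay and protocol overhead are ignored.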

The OSI seven-layer model isn’t the only way to represent the layers of the network. In fact, the TCP/IP model is more popular.

The TCP/IP model folds the Presentation and Session layers into the Application Layer, and combines the Data Link and Physical layers into the Network Access Layer.

This changes the picture somewhat, but gives a more precise definition of the internet (as the Internet Layer). See below:

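| OSI Model | TCP/IP Model |
| --- | --- |
| Application Layer | Application Layer |
| Presentation Layer | Application Layer |
| Session Layer | Application Layer |
| Transport Layer | Transport Layer |
| Network Layer | Internet Layer |
| Data Link Layer | Network Access Layer |
| Physical Layer | Network Access Layer |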
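The same mapping can be written down as a small lookup table in Python, which makes the grouping easy to check. The dictionary and the `group_by_tcpip` helper are illustrative names only.

```python
# OSI layer -> TCP/IP layer, listed from top to bottom.
OSI_TO_TCPIP = {
    "Application": "Application",
    "Presentation": "Application",
    "Session": "Application",
    "Transport": "Transport",
    "Network": "Internet",
    "Data Link": "Network Access",
    "Physical": "Network Access",
}


def group_by_tcpip(mapping):
    """Invert the mapping: TCP/IP layer -> list of OSI layers it covers."""
    grouped = {}
    for osi_layer, tcpip_layer in mapping.items():
        grouped.setdefault(tcpip_layer, []).append(osi_layer)
    return grouped


for tcpip_layer, osi_layers in group_by_tcpip(OSI_TO_TCPIP).items():
    print(f"{tcpip_layer}: {', '.join(osi_layers)}")
```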

I’ll leave it there for now, but may come back to this page as I learn more about the different parts of the jigsaw puzzle.
