How a Browser Works: A Beginner-Friendly Guide to Browser Internals
Learn what happens after you type a URL — from networking to parsing, DOM creation, layout, painting, and pixels on screen.

BCA student and developer who loves learning in public. I build web and mobile projects, explore databases and backend systems, and document my journey through blogs. Currently focused on writing clean code and growing one commit at a time.
You type a URL, press Enter, and a webpage appears. It seems like magic, but behind the scenes, your browser is doing an incredible amount of work — fetching files, parsing code, building structures, calculating layouts, and painting pixels.
What actually happens between pressing Enter and seeing a webpage?
This guide will take you on a journey through browser internals — not to overwhelm you with specifications, but to give you a mental model of how all the pieces fit together.
Introduction
A browser is much more than "a thing that opens websites." It's a sophisticated piece of software that translates code into the visual, interactive experiences we use every day.
Understanding how browsers work helps you:
Write more performant websites
Debug layout and rendering issues
Appreciate the complexity behind the web
Don't worry about memorizing everything. Focus on the flow — how one step leads to the next.
Prerequisites
Basic familiarity with HTML and CSS
Curiosity about what happens "under the hood"
What Is a Browser, Really?
A browser is a program that:
Fetches files from the internet (HTML, CSS, JavaScript, images)
Understands those files (parses and interprets them)
Displays the result as a visual, interactive webpage
Think of a browser as a translator — it takes code written by developers and translates it into pixels on your screen.
┌─────────────────────────────────────────────────────────────────┐
│ What a Browser Does │
└─────────────────────────────────────────────────────────────────┘
[Developer's Code] [Browser] [Your Screen]
│ │ │
HTML ───┐ │ │
CSS ───┼────────────────────►│─────────────────────►│ 👁️
JS ───┘ │ │
Images ─── (Translate) (Display)
Popular browsers include:
Chrome (uses Blink engine)
Firefox (uses Gecko engine)
Safari (uses WebKit engine)
Edge (uses Blink engine)
Each has its own implementation, but they all follow similar principles.
The Main Parts of a Browser
A browser isn't one monolithic program — it's a collection of components working together. Let's look at the main parts:
┌─────────────────────────────────────────────────────────────────┐
│ High-Level Browser Architecture │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ USER INTERFACE │
│ ┌──────────┐ ┌─────────────────────────────┐ ┌──────────────┐ │
│ │ ← → ↻ │ │ 🔒 https://example.com │ │ ☆ ⋮ tabs │ │
│ │ buttons │ │ address bar │ │ actions │ │
│ └──────────┘ └─────────────────────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ BROWSER ENGINE │
│ (Coordinates between UI and Rendering Engine) │
└─────────────────────────────────────────────────────────────────┘
│
┌───────────────────┼───────────────────┐
▼ ▼ ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ RENDERING │ │ NETWORKING │ │ JAVASCRIPT │
│ ENGINE │ │ │ │ ENGINE │
│ │ │ HTTP requests │ │ │
│ HTML → DOM │ │ DNS lookup │ │ V8 (Chrome) │
│ CSS → CSSOM │ │ TCP/TLS │ │ SpiderMonkey │
│ Layout & Paint │ │ Fetch files │ │ (Firefox) │
└─────────────────┘ └─────────────────┘ └─────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ DATA STORAGE │
│ (Cookies, LocalStorage, Cache, IndexedDB) │
└─────────────────────────────────────────────────────────────────┘
Component Overview
| Component | What It Does |
| User Interface | Everything you see and interact with — address bar, tabs, buttons |
| Browser Engine | Coordinates actions between UI and rendering engine |
| Rendering Engine | Parses HTML/CSS, builds the DOM and CSSOM, paints pixels |
| Networking | Makes HTTP requests, handles DNS, fetches resources |
| JavaScript Engine | Executes JavaScript code (V8, SpiderMonkey, etc.) |
| Data Storage | Stores cookies, cache, localStorage, etc. |
Browser Engine vs Rendering Engine
These terms can be confusing:
Browser Engine: The "manager" that coordinates between UI and rendering
Rendering Engine: The "worker" that actually parses files and paints the page
In practice, people often use these terms interchangeably. The key thing to remember is that the rendering engine is responsible for turning code into visuals.
💡 Tip: You don't need to memorize these distinctions. Just know that the browser has different parts responsible for different tasks.
From URL to Request: The Networking Layer
Let's start our journey. You type https://example.com and press Enter. What happens first?
Step 1: Parse the URL
The browser breaks down the URL:
https://example.com/page?id=123
│ │ │ │
│ │ │ └── Query parameters
│ │ └── Path
│ └── Domain (hostname)
└── Protocol (scheme)
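As a rough sketch, the same breakdown can be reproduced with the standard URL class, which is available in browsers and in Node.js. The variable names (parsedUrl, urlParts) are only illustrative.

```javascript
// Break a URL into the parts shown above using the standard URL API.
const parsedUrl = new URL("https://example.com/page?id=123");

const urlParts = {
  protocol: parsedUrl.protocol,         // scheme, including the trailing colon
  hostname: parsedUrl.hostname,         // domain
  path: parsedUrl.pathname,             // path
  id: parsedUrl.searchParams.get("id"), // query values are always strings
};

console.log(urlParts);
```

Note that the protocol comes back as "https:" with the colon, and the id value is the string "123", not a number.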
Step 2: DNS Lookup
The browser needs the IP address of example.com. It asks DNS servers (as we covered in previous articles):
Browser: "What's the IP for example.com?"
DNS: "It's 93.184.216.34"
Step 3: Establish Connection
Using the IP address, the browser:
Opens a TCP connection (3-way handshake)
Performs a TLS handshake (for HTTPS, to encrypt traffic)
Step 4: Send HTTP Request
GET /page HTTP/1.1
Host: example.com
Accept: text/html
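To make the request format concrete, here is a minimal sketch that builds the text of such a request from a URL. buildGetRequest is a hypothetical helper, not a browser API, and real browsers send many more headers. When the URL has a query string, it travels as part of the request target on the first line.

```javascript
// Build the text of a minimal HTTP/1.1 GET request for a URL.
// buildGetRequest is a hypothetical helper for illustration only.
function buildGetRequest(urlString) {
  const url = new URL(urlString);
  const target = url.pathname + url.search; // path plus query string
  return [
    `GET ${target} HTTP/1.1`,
    `Host: ${url.host}`,
    "Accept: text/html",
    "", // an empty line marks the end of the headers
    "",
  ].join("\r\n");
}

console.log(buildGetRequest("https://example.com/page?id=123"));
```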
Step 5: Receive Response
The server sends back HTML:
HTTP/1.1 200 OK
Content-Type: text/html

<!DOCTYPE html>
<html>
<head><title>Example</title></head>
<body><h1>Hello!</h1></body>
</html>
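A response like this is plain text with a fixed shape: a status line, headers, an empty line, then the body. The sketch below splits it apart. parseHttpResponse is a hypothetical helper, and it assumes a complete, well-formed HTTP/1.1 response held in a single string.

```javascript
// Split a raw HTTP/1.1 response into status, headers, and body.
// parseHttpResponse is a hypothetical helper for illustration only.
function parseHttpResponse(raw) {
  const separator = raw.indexOf("\r\n\r\n"); // empty line between headers and body
  const head = raw.slice(0, separator).split("\r\n");
  const body = raw.slice(separator + 4);

  const [version, statusCode, ...reason] = head[0].split(" ");
  const headers = {};
  for (const line of head.slice(1)) {
    const colon = line.indexOf(":");
    // Header names are case-insensitive, so store them in lowercase.
    headers[line.slice(0, colon).trim().toLowerCase()] = line.slice(colon + 1).trim();
  }
  return { version, status: Number(statusCode), reason: reason.join(" "), headers, body };
}

const parsedResponse = parseHttpResponse(
  "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<!DOCTYPE html><html></html>"
);
console.log(parsedResponse.status, parsedResponse.headers["content-type"]);
```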
┌─────────────────────────────────────────────────────────────────┐
│ Networking Flow │
└─────────────────────────────────────────────────────────────────┘
Browser Server
│ │
│ 1. DNS Lookup │
│ "What's the IP for example.com?" │
│ ─────────────────────────────────────────────►│ DNS
│◄───────────────────────────────────────────── │
│ "93.184.216.34" │
│ │
│ 2. TCP + TLS Handshake │
│ ◄═══════════════════════════════════════════► │
│ │
│ 3. HTTP Request │
│ GET /page HTTP/1.1 │
│ ─────────────────────────────────────────────►│
│ │
│ 4. HTTP Response │
│ 200 OK + HTML content │
│◄───────────────────────────────────────────── │
│ │
Now the browser has the HTML. Time to make sense of it!
What Is Parsing?
Before diving into HTML and CSS parsing, let's understand what parsing means.
Parsing = Understanding Structure
Parsing is breaking down text into a structured format that a program can work with.
Think about how you understand a sentence:
"The cat sat on the mat"
│ │ │
subject verb location
Your brain "parses" the sentence — identifying parts and their relationships.
A Simple Parsing Example
Let's parse a math expression: 3 + 5 * 2
Step 1: Tokenization (break into pieces)
Tokens: [3] [+] [5] [*] [2]
Step 2: Build a Tree (understand relationships)
[+]
/ \
[3] [*]
/ \
[5] [2]
This tree shows that 5 * 2 happens first (because * has higher precedence), then we add 3.
Step 3: Evaluate
5 * 2 = 10
3 + 10 = 13
Why Trees?
Trees are powerful because they show:
Hierarchy — what contains what
Relationships — how pieces connect
Order — what to process first
The browser does exactly this with HTML and CSS!
┌─────────────────────────────────────────────────────────────────┐
│ Parsing Example: 3 + 5 * 2 │
└─────────────────────────────────────────────────────────────────┘
Raw Text Tokens Tree
│ │ │
"3 + 5 * 2" ────► [3][+][5][*][2] ────► +
/ \
(tokenization) 3 *
/ \
5 2
(evaluation: 13)
ℹ️ Note: This is a simplified example. Real parsers handle much more complexity, but the concept is the same: text → tokens → tree.
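The three steps above can be sketched in a few lines of JavaScript. This is a toy parser that assumes non-negative integers and the four basic operators, with no parentheses or error handling. The function names (tokenize, parse, evaluate) are my own choices for illustration.

```javascript
// Step 1: tokenization. Split the text into numbers and operators.
function tokenize(text) {
  return text.match(/\d+|[+\-*/]/g) || [];
}

// Step 2: build a tree. "*" and "/" bind tighter than "+" and "-".
function parse(tokens) {
  let pos = 0;

  function parseNumber() {
    return { type: "number", value: Number(tokens[pos++]) };
  }
  function parseTerm() {
    let left = parseNumber();
    while (tokens[pos] === "*" || tokens[pos] === "/") {
      const op = tokens[pos++];
      left = { type: "binary", op, left, right: parseNumber() };
    }
    return left;
  }
  function parseExpression() {
    let left = parseTerm();
    while (tokens[pos] === "+" || tokens[pos] === "-") {
      const op = tokens[pos++];
      left = { type: "binary", op, left, right: parseTerm() };
    }
    return left;
  }
  return parseExpression();
}

// Step 3: evaluate the tree from the leaves up.
function evaluate(node) {
  if (node.type === "number") return node.value;
  const a = evaluate(node.left);
  const b = evaluate(node.right);
  switch (node.op) {
    case "+": return a + b;
    case "-": return a - b;
    case "*": return a * b;
    default: return a / b;
  }
}

const tree = parse(tokenize("3 + 5 * 2"));
console.log(evaluate(tree)); // 13
```

Precedence falls out of the structure: parseExpression asks parseTerm for its operands, so every multiplication is grouped into a subtree before any addition sees it.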
HTML Parsing and DOM Creation
Now let's see how browsers parse HTML into something usable.
What Is the DOM?
DOM (Document Object Model) is a tree representation of your HTML document. It's how the browser "sees" your HTML internally.
From HTML to DOM
Consider this HTML:
<!DOCTYPE html>
<html>
<head>
<title>My Page</title>
</head>
<body>
<h1>Hello World</h1>
<p>Welcome to my site</p>
</body>
</html>
The browser parses this into a DOM tree:
┌─────────────────────────────────────────────────────────────────┐
│ HTML → DOM Tree │
└─────────────────────────────────────────────────────────────────┘
document
│
html
/ \
head body
│ / \
title h1 p
│ │ │
"My Page" │ "Welcome to
│ my site"
"Hello World"
How Parsing Works
Step 1: Tokenization. The HTML is broken into tokens:
[DOCTYPE] [<html>] [<head>] [<title>] [text: "My Page"] [</title>] ...
Step 2: Tree Construction. Tokens are used to build the DOM tree, following HTML rules:
Opening tags create nodes
Closing tags complete nodes
Nesting creates parent-child relationships
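Here is a deliberately tiny sketch of those two steps. It assumes well-formed markup with every tag explicitly closed, so it ignores the error recovery, void elements, and attributes that a real HTML parser has to handle. tokenizeHtml and buildTree are hypothetical names, and the node shape ({ tag, children }) is a simplification of real DOM nodes.

```javascript
// Step 1: tokenization. Split markup into tags and text runs.
function tokenizeHtml(html) {
  return html.match(/<[^>]+>|[^<]+/g) || [];
}

// Step 2: tree construction, using a stack of currently open elements.
function buildTree(tokens) {
  const root = { tag: "#document", children: [] };
  const open = [root];

  for (const token of tokens) {
    const parent = open[open.length - 1];
    if (token.startsWith("<!")) {
      continue; // this sketch skips the doctype
    } else if (token.startsWith("</")) {
      open.pop(); // a closing tag completes the current node
    } else if (token.startsWith("<")) {
      const tag = token.slice(1, -1).trim().split(/\s+/)[0].toLowerCase();
      const node = { tag, children: [] }; // an opening tag creates a node
      parent.children.push(node);         // nesting creates the parent-child link
      open.push(node);
    } else if (token.trim() !== "") {
      parent.children.push({ text: token.trim() });
    }
  }
  return root;
}

const dom = buildTree(tokenizeHtml(`
  <!DOCTYPE html>
  <html>
    <head><title>My Page</title></head>
    <body><h1>Hello World</h1><p>Welcome to my site</p></body>
  </html>`));

console.log(JSON.stringify(dom, null, 2));
```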
The DOM Is Live
The DOM isn't just a snapshot — it's a live structure that:
JavaScript can modify (add/remove elements)
The browser keeps in sync with what's displayed
Reacts to user interactions
// JavaScript can modify the DOM
document.querySelector("h1").textContent = "New Title!";
// The browser shows the new text the next time it renders the page
Analogy: DOM as a Family Tree
Think of the DOM like a family tree:
<html> is the ancestor of everything
<head> and <body> are siblings, children of <html>
<h1> and <p> are children of <body>, siblings of each other
html (grandparent)
/ \
head (parent) body (parent)
│ / \
title (child) h1 (child) p (child)
CSS Parsing and CSSOM Creation
While HTML is being parsed, the browser also encounters CSS (either in <style> tags or linked files). CSS gets its own tree structure.
What Is the CSSOM?
CSSOM (CSS Object Model) is a tree representation of all CSS rules and how they apply to elements.
From CSS to CSSOM
Consider this CSS:
body {
font-family: Arial;
background: white;
}
h1 {
color: blue;
font-size: 24px;
}
p {
color: gray;
}
The browser builds a CSSOM:
┌─────────────────────────────────────────────────────────────────┐
│ CSS → CSSOM Tree │
└─────────────────────────────────────────────────────────────────┘
CSSOM
│
body
│
┌───────────────┼───────────────┐
│ │ │
font-family: background: (inherited by
Arial white children)
│
┌─────────────────────┼─────────────────────┐
│ │ │
h1 p other
│ │
┌─────────┴─────────┐ │
│ │ │
color: blue font-size: 24px color: gray
CSS Inheritance in the CSSOM
Notice how styles can inherit:
body sets font-family: Arial
All children (h1, p) inherit this unless they override it
The CSSOM captures these relationships.
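To see the idea in code, the sketch below parses the three rules from the example into plain objects and then looks up a property by walking up to the parent when the property is inheritable. It is a simplification under stated assumptions: only type selectors (body, h1, p) are supported, INHERITED lists just a few inheritable properties, and parseCss and resolve are hypothetical names rather than browser APIs.

```javascript
// Parse flat CSS rules such as "h1 { color: blue; }" into plain objects.
function parseCss(css) {
  const rules = [];
  for (const [, selector, block] of css.matchAll(/([^{}]+)\{([^}]*)\}/g)) {
    const declarations = {};
    for (const declaration of block.split(";")) {
      const colon = declaration.indexOf(":");
      if (colon === -1) continue; // skip empty fragments
      declarations[declaration.slice(0, colon).trim()] = declaration.slice(colon + 1).trim();
    }
    rules.push({ selector: selector.trim(), declarations });
  }
  return rules;
}

// A tiny subset of the properties that inherit by default.
const INHERITED = new Set(["color", "font-family", "font-size"]);

// Look up a property for a node: its own rules first, then the parent's value.
function resolve(node, property, rules) {
  let value;
  for (const rule of rules) {
    if (rule.selector === node.tag && property in rule.declarations) {
      value = rule.declarations[property]; // later rules overwrite earlier ones
    }
  }
  if (value !== undefined) return value;
  if (INHERITED.has(property) && node.parent) return resolve(node.parent, property, rules);
  return undefined;
}

const rules = parseCss(`
  body { font-family: Arial; background: white; }
  h1 { color: blue; font-size: 24px; }
  p { color: gray; }
`);
const body = { tag: "body", parent: null };
const h1 = { tag: "h1", parent: body };

console.log(resolve(h1, "font-family", rules)); // inherited from body
console.log(resolve(h1, "background", rules));  // undefined: background does not inherit
```

The h1 still appears on a white page because its own background defaults to transparent, so the body's background shows through. That is different from inheriting the value.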
Why a Separate Tree?
You might wonder: "Why not just attach styles to DOM nodes?"
Keeping them separate allows:
Parallel processing: HTML and CSS can be parsed simultaneously
Reusability: Same CSS can apply to different HTML structures
Efficiency: A stylesheet change updates the CSSOM without re-parsing the HTML or rebuilding the DOM
┌─────────────────────────────────────────────────────────────────┐
│ Parallel Processing │
└─────────────────────────────────────────────────────────────────┘
HTML Document CSS Files
│ │
▼ ▼
┌───────────┐ ┌───────────┐
│ HTML │ │ CSS │
│ Parser │ │ Parser │
└─────┬─────┘ └─────┬─────┘
│ │
▼ ▼
┌───────────┐ ┌───────────┐
│ DOM │ │ CSSOM │
│ Tree │ │ Tree │
└───────────┘ └───────────┘
│ │
└──────────────┬─────────────────────┘
▼
Render Tree
Bringing It Together: The Render Tree
Now the browser has two trees:
DOM — What's on the page (structure)
CSSOM — How it should look (styles)
Time to combine them!
What Is the Render Tree?
The Render Tree combines DOM and CSSOM to create a tree of visible elements with their computed styles.
┌─────────────────────────────────────────────────────────────────┐
│ DOM + CSSOM = Render Tree │
└─────────────────────────────────────────────────────────────────┘
DOM CSSOM Render Tree
│ │ │
▼ ▼ ▼
┌─────┐ ┌───────┐ ┌─────────┐
│html │ + │ styles│ = │ html │
└──┬──┘ └───┬───┘ │(visible)│
│ │ └────┬────┘
┌──┴──┐ │ │
head body │ ┌──┴──┐
│ │ │ body ...
title h1 ◄─────────────┘ │
│ (apply ┌┴┐
"Hi" styles) h1 p
│ │
"Hi" "text"
+blue +gray
+24px
What Gets Included?
Not everything in the DOM appears in the Render Tree:
| Included | Excluded |
| Visible elements (<h1>, <p>, <div>) | <head> and its children (<title>, <meta>) |
| Elements with styles | Elements with display: none |
| Text nodes | <script> tags |
<div style="display: none">Hidden</div>
<!-- NOT in Render Tree -->
<div style="visibility: hidden">Invisible</div>
<!-- IN Render Tree (takes space) -->
⚠️ Warning: display: none removes an element from the Render Tree entirely. visibility: hidden keeps it in the tree (it's just invisible, and it still takes up space).
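The filtering rule can be sketched as a recursive walk over a mock DOM. The node shape ({ tag, style, children }), the NON_VISUAL_TAGS set, and buildRenderTree are assumptions made for this example. A real engine works from computed styles rather than a style object attached to each node.

```javascript
// Tags that never produce boxes in this sketch.
const NON_VISUAL_TAGS = new Set(["head", "title", "meta", "script", "style", "link"]);

// Build a render tree from a mock DOM whose nodes may carry a `style` object.
function buildRenderTree(node) {
  if (NON_VISUAL_TAGS.has(node.tag)) return null;
  if (node.style && node.style.display === "none") return null; // whole subtree is dropped

  const children = (node.children || [])
    .map((child) => buildRenderTree(child))
    .filter((child) => child !== null);
  return { tag: node.tag, style: node.style || {}, children };
}

const mockDom = {
  tag: "html",
  children: [
    { tag: "head", children: [{ tag: "title" }] },
    {
      tag: "body",
      children: [
        { tag: "h1", style: { color: "blue" } },
        { tag: "div", style: { display: "none" } },
        { tag: "div", style: { visibility: "hidden" } },
      ],
    },
  ],
};

const renderTree = buildRenderTree(mockDom);
// Only the h1 and the visibility: hidden div remain under body.
console.log(renderTree.children[0].children.map((n) => n.tag));
```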
Computed Styles
The Render Tree contains computed styles — the final values after:
Cascading (which rule wins?)
Inheritance (what comes from parents?)
Defaults (browser's default styles)
/* CSS */
h1 {
color: blue;
}
h1 {
color: red;
} /* This wins: both rules have the same specificity, so the later one applies */
/* Computed style for h1: color: red */
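The "which rule wins?" step compares specificity first and falls back to source order on a tie. The sketch below assumes only id, class, and type selectors, with no attribute selectors, pseudo-classes, or !important. The function names are hypothetical, and cascade expects to receive only rules that already match the element.

```javascript
// Specificity as [ids, classes, types] for simple selectors only.
function specificity(selector) {
  const ids = (selector.match(/#[\w-]+/g) || []).length;
  const classes = (selector.match(/\.[\w-]+/g) || []).length;
  const rest = selector.replace(/[#.][\w-]+/g, " "); // what remains are type names
  const types = (rest.match(/[a-zA-Z][\w-]*/g) || []).length;
  return [ids, classes, types];
}

// Negative if a is less specific than b, positive if more, 0 if equal.
function compareSpecificity(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// Pick the winning value among rules that already match the element.
// Rules are given in source order; on a tie, the later rule wins.
function cascade(matchingRules, property) {
  let winner = null;
  for (const rule of matchingRules) {
    if (!(property in rule.declarations)) continue;
    if (winner === null ||
        compareSpecificity(specificity(rule.selector), specificity(winner.selector)) >= 0) {
      winner = rule;
    }
  }
  return winner ? winner.declarations[property] : undefined;
}

const sameSpecificity = cascade([
  { selector: "h1", declarations: { color: "blue" } },
  { selector: "h1", declarations: { color: "red" } },
], "color");

const classBeatsType = cascade([
  { selector: ".title", declarations: { color: "blue" } },
  { selector: "h1", declarations: { color: "red" } },
], "color");

console.log(sameSpecificity, classBeatsType); // red blue
```

In the second case the earlier rule wins even though it comes first, because a class selector is more specific than a type selector.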
Layout, Paint, and Display
We have a Render Tree with styled elements. Now the browser needs to figure out WHERE to put everything and HOW to draw it.
Step 1: Layout (Reflow)
Layout calculates the exact position and size of each element.
The browser asks: "Where does each box go? How big is it?"
┌─────────────────────────────────────────────────────────────────┐
│ Layout Phase │
└─────────────────────────────────────────────────────────────────┘
Render Tree Layout Result
│ │
▼ ▼
┌───────┐ ┌────────────────────┐
│ body │ │ body: 0,0 │
└───┬───┘ │ width: 1200px │
│ │ height: 800px │
┌────┴────┐ ├────────────────────┤
│ │ │ h1: 0,0 │
h1 p │ width: 1200px │
│ height: 36px │
├────────────────────┤
│ p: 0, 36 │
│ width: 1200px │
│ height: 20px │
└────────────────────┘
Layout considers:
Box model (margin, border, padding, content)
Display type (block, inline, flex, grid)
Position (static, relative, absolute, fixed)
Float and clear
Viewport size
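For plain block elements, the core of layout is simple enough to sketch: each block fills the available width and starts where the previous one ended. This toy version takes heights as inputs because it has no fonts or text to measure, and it ignores margins, padding, and every display type other than block. layoutBlocks is a hypothetical helper.

```javascript
// Lay out block boxes top to bottom inside a container of fixed width.
// Heights are given, not measured, because this sketch has no text metrics.
function layoutBlocks(boxes, viewportWidth) {
  let cursorY = 0;
  const placed = [];
  for (const box of boxes) {
    placed.push({ name: box.name, x: 0, y: cursorY, width: viewportWidth, height: box.height });
    cursorY += box.height; // the next block starts where this one ends
  }
  return { placed, contentHeight: cursorY };
}

const layout = layoutBlocks(
  [{ name: "h1", height: 36 }, { name: "p", height: 20 }],
  1200
);
console.log(layout.placed);
```

With the heights from the diagram above, the p lands at y = 36, directly below the h1.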
Step 2: Paint
Paint fills in the pixels. It determines the drawing order and creates layers.
The browser answers: "What color is each pixel? What order do I draw things?"
Paint Order (simplified):
1. Background colors
2. Background images
3. Borders
4. Children (recursively)
5. Outlines
Complex pages may have multiple layers that are painted independently and then composited (combined) together.
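One way to picture paint is as a recorded list of drawing commands, produced by walking the tree in the simplified order above. In this sketch the box shape and the command strings are invented for illustration, and stacking contexts, z-index, and layers are left out.

```javascript
// Record drawing commands for a box and its children in paint order.
function paint(box, commands = []) {
  if (box.background) commands.push(`background ${box.name}`);
  if (box.backgroundImage) commands.push(`background-image ${box.name}`);
  if (box.border) commands.push(`border ${box.name}`);
  for (const child of box.children || []) paint(child, commands); // children draw on top
  if (box.outline) commands.push(`outline ${box.name}`);
  return commands;
}

const displayList = paint({
  name: "body",
  background: "white",
  outline: true,
  children: [
    { name: "h1", background: "yellow", border: "1px solid" },
    { name: "p", border: "1px solid" },
  ],
});
console.log(displayList);
```

The body's outline comes last because a parent's outline is drawn after all of its children.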
Step 3: Composite and Display
Finally, all layers are composited (combined) and sent to the screen.
┌─────────────────────────────────────────────────────────────────┐
│ Layout → Paint → Composite → Display │
└─────────────────────────────────────────────────────────────────┘
Render Tree Layout Paint Display
│ │ │ │
▼ ▼ ▼ ▼
┌─────────┐ ┌───────────┐ ┌──────────┐ ┌─────────┐
│ Styled │ ──► │ Positioned│ ──►│ Pixel │ ──►│ SCREEN │
│ Elements│ │ Boxes │ │ Colors │ │ 👁️ │
└─────────┘ └───────────┘ └──────────┘ └─────────┘
"What?" "Where?" "How?" "Show it!"
Reflow vs Repaint
These terms come up often:
| Term | What Triggers It | Cost |
| Reflow (Layout) | Changing size, position, adding/removing elements | Expensive |
| Repaint | Changing colors, shadows, visibility | Less expensive |
// Triggers reflow (layout change)
element.style.width = "200px";
// Triggers repaint only (just color)
element.style.backgroundColor = "blue";
💡 Tip: Minimize reflows for better performance. Batch DOM changes together rather than making them one at a time.
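Why batching helps can be shown with a toy model that runs without a browser. Here a write marks layout as dirty, and a read of a layout property forces a layout if anything is dirty. createPage and its fake elements are invented for this sketch. Real engines are far more nuanced, but the counting pattern is the same.

```javascript
// Toy model: writes mark layout as dirty; reads force a layout if it is dirty.
function createPage() {
  const page = { dirty: false, layoutCount: 0 };
  page.createElement = (width) => ({
    set width(value) { width = value; page.dirty = true; },
    get offsetWidth() {
      if (page.dirty) { page.layoutCount++; page.dirty = false; } // forced reflow
      return width;
    },
  });
  return page;
}

// Interleaved: write, read, write, read. Every read forces a new layout.
const slow = createPage();
const slowBoxes = [slow.createElement(100), slow.createElement(100), slow.createElement(100)];
let slowTotal = 0;
for (const box of slowBoxes) {
  box.width = 200;
  slowTotal += box.offsetWidth;
}

// Batched: all writes first, then all reads. One layout serves every read.
const fast = createPage();
const fastBoxes = [fast.createElement(100), fast.createElement(100), fast.createElement(100)];
let fastTotal = 0;
for (const box of fastBoxes) box.width = 200;
for (const box of fastBoxes) fastTotal += box.offsetWidth;

console.log(slow.layoutCount, fast.layoutCount); // 3 1
```

Both loops end with the same widths, but the interleaved version pays for three layouts where the batched one pays for one.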
The Complete Journey: URL to Pixels
Let's trace the entire journey from typing a URL to seeing pixels on screen:
┌─────────────────────────────────────────────────────────────────┐
│ Complete Browser Flow: URL to Pixels │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ 1. USER TYPES URL AND PRESSES ENTER │
│ https://example.com │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ 2. NETWORKING │
│ • Parse URL │
│ • DNS Lookup → Get IP address │
│ • TCP Connection (3-way handshake) │
│ • TLS Handshake (for HTTPS) │
│ • Send HTTP Request │
│ • Receive HTTP Response (HTML) │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ 3. PARSING (parallel) │
│ │
│ HTML ──────────► DOM Tree │
│ │
│ CSS ───────────► CSSOM Tree │
│ │
│ (Also: fetch more resources - images, fonts, JS) │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ 4. RENDER TREE CONSTRUCTION │
│ DOM + CSSOM ────► Render Tree │
│ (Only visible elements with computed styles) │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ 5. LAYOUT (Reflow) │
│ Calculate position and size of every element │
│ "Where does each box go? How big is it?" │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ 6. PAINT │
│ Fill in pixels - colors, text, images, borders │
│ Create layers for complex content │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ 7. COMPOSITE & DISPLAY │
│ Combine layers │
│ Send to GPU │
│ Display on screen │
│ │
│ 👁️ USER SEES THE PAGE │
└─────────────────────────────────────────────────────────────┘
Timeline Summary
| Step | What Happens | Output |
| 1. URL Entry | User types and presses Enter | URL string |
| 2. Networking | DNS, TCP, TLS, HTTP request/response | HTML content |
| 3. HTML Parsing | Tokenize and build tree | DOM Tree |
| 4. CSS Parsing | Tokenize and build tree | CSSOM Tree |
| 5. Render Tree | Combine DOM + CSSOM | Render Tree |
| 6. Layout | Calculate positions and sizes | Layout boxes |
| 7. Paint | Determine pixel colors | Paint records |
| 8. Composite | Combine layers | Final image |
| 9. Display | Show on screen | Pixels! 👁️ |
What About JavaScript?
JavaScript can interrupt this flow:
When the parser encounters a <script> without defer or async, parsing pauses
The script is downloaded (if external) and executed (it might modify the DOM!)
Parsing resumes
This is why you often see scripts at the end of <body> or with defer/async attributes.
<!-- Blocks parsing -->
<script src="app.js"></script>
<!-- Doesn't block parsing, runs after HTML is parsed -->
<script defer src="app.js"></script>
<!-- Downloads without blocking parsing, runs as soon as downloaded (execution order not guaranteed) -->
<script async src="app.js"></script>
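The difference between the three is easiest to see on a timeline. The sketch below is a toy model, not how a browser schedules work. It assumes script execution takes no time, each download starts when the parser reaches the tag, and the numbers are made up. executionOrder and its input shape are hypothetical.

```javascript
// Toy timeline: in what order do the scripts run? Times are in milliseconds.
// `at` is how much parsing work precedes the tag; execution time is treated as zero.
function executionOrder(scripts, totalParseMs) {
  let clock = 0;    // wall-clock time
  let parsedMs = 0; // parsing work finished so far
  const runs = [];
  const deferred = [];

  for (const script of scripts) {
    clock += script.at - parsedMs; // parse up to the <script> tag
    parsedMs = script.at;
    if (script.mode === "blocking") {
      clock += script.downloadMs; // the parser waits for the download
      runs.push({ name: script.name, time: clock });
    } else if (script.mode === "async") {
      runs.push({ name: script.name, time: clock + script.downloadMs });
    } else {
      deferred.push({ name: script.name, ready: clock + script.downloadMs });
    }
  }

  clock += totalParseMs - parsedMs; // finish parsing the rest of the document
  for (const script of deferred) {  // deferred scripts run in document order
    clock = Math.max(clock, script.ready);
    runs.push({ name: script.name, time: clock });
  }
  return runs.sort((a, b) => a.time - b.time).map((run) => run.name);
}

const order = executionOrder([
  { name: "blocking.js", mode: "blocking", at: 10, downloadMs: 50 },
  { name: "deferred.js", mode: "defer", at: 20, downloadMs: 10 },
  { name: "async.js", mode: "async", at: 30, downloadMs: 5 },
], 100);
console.log(order);
```

With these numbers the deferred script runs last even though it appears before the async one in the document, because it waits for parsing to finish.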
Best Practices
Understand the flow — Knowing the pipeline helps you write performant code
Minimize DOM manipulation — Batch changes to avoid multiple reflows
Use CSS efficiently — Complex selectors take longer to match
Defer non-critical JavaScript — Don't block HTML parsing
Optimize images — They affect networking and paint times
Common Mistakes to Avoid
Forcing synchronous layouts: Reading layout properties right after changing styles causes immediate reflow
Overly complex CSS selectors: .nav ul li a span is slower than .nav-link
Blocking scripts in <head>: Delays first paint; use defer or move to end of body
Ignoring the critical rendering path: First paint matters for user experience
Conclusion
When you press Enter on a URL, your browser:
Fetches the HTML (DNS, TCP, TLS, HTTP)
Parses HTML into the DOM tree
Parses CSS into the CSSOM tree
Combines them into the Render Tree
Calculates layout (where and how big)
Paints pixels (colors and content)
Displays the final result
The browser is a remarkable piece of software, orchestrating all these steps in milliseconds. You don't need to memorize every detail — just understand the flow: Code → Parse → Build Trees → Layout → Paint → Display.
Next Steps / Further Reading
Explore browser DevTools (Performance tab shows this pipeline!)
Learn about the Critical Rendering Path for performance optimization
Study how JavaScript interacts with the DOM
Investigate CSS containment and layout optimization





