Discover Git's Internal Version Control Magic

Ever stop and think about what really happens when you run git add or git commit? A lot of developers use Git every day, but honestly, most don’t know much about what’s going on behind the scenes. Let’s change that. Let’s take a peek under the hood and see what makes Git tick.

Introduction

So, here’s the thing: Git isn’t just a version control tool. At its core, it’s more like a content-addressable filesystem, with a version control system layered on top. Basically, you give Git some content, and it hands you back a unique key you can use to get that content again. It’s a pretty clever system.
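To make the "content in, key out" idea concrete, here is a minimal sketch of a content-addressable store in Python. The `ContentStore` class and its method names are made up for illustration. It also hashes the raw bytes directly, whereas real Git adds a small header first, so these keys will not match Git's.

```python
import hashlib


class ContentStore:
    """Toy content-addressable store: the key is derived from the content itself."""

    def __init__(self):
        self._objects = {}  # key (hex digest) -> content bytes

    def put(self, content: bytes) -> str:
        key = hashlib.sha1(content).hexdigest()
        self._objects[key] = content  # storing identical content again changes nothing
        return key

    def get(self, key: str) -> bytes:
        return self._objects[key]


store = ContentStore()
key = store.put(b"Hello, World!\n")
same_key = store.put(b"Hello, World!\n")

print(key == same_key)  # True: identical content, identical key
print(store.get(key))   # b'Hello, World!\n'
```

Because the key comes from the content, you never pick a name for what you store, and the same bytes always land in the same slot.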

Once you get how Git actually works inside, you stop just memorizing commands. You start to really understand it. Suddenly, debugging weird issues gets easier. You can fix mistakes without panicking. And those advanced features? They don’t seem so intimidating anymore.

Prerequisites

Basic familiarity with Git commands (init, add, commit)
Access to a terminal/command line
A code editor

The `.git` Folder Explained

When you run git init, Git creates a hidden .git directory. This folder is your repository — everything Git needs lives here.

$ git init my-project
$ cd my-project
$ ls -la .git/

Here's what you'll find:

.git/
├── HEAD           # Points to current branch
├── config         # Repository-specific configuration
├── description    # Used by GitWeb (rarely needed)
├── hooks/         # Client/server-side scripts
├── info/          # Repository-local exclude patterns (info/exclude)
├── objects/       # All content (blobs, trees, commits)
└── refs/          # Pointers to commits (branches, tags)

The Four Critical Components

Component	Purpose
`HEAD`	A symbolic reference pointing to the current branch
`index`	The staging area (created after first `git add`)
`objects/`	The object database — stores all your content
`refs/`	References to commit objects (branches, tags, remotes)
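To see how `HEAD` and `refs/` work together, here is a rough Python sketch that follows `HEAD` to a commit hash by reading the files directly. The `resolve_head` helper is hypothetical, and it only handles refs stored as individual files (not ones Git has moved into `.git/packed-refs`). The demo builds a throwaway directory that imitates a `.git` folder, so it runs without a real repository.

```python
import tempfile
from pathlib import Path


def resolve_head(git_dir) -> str:
    """Return the commit hash HEAD points to, following one symbolic ref."""
    git_dir = Path(git_dir)
    head = (git_dir / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        # Normal case: HEAD names a branch, and the branch file holds the hash.
        return (git_dir / head[len("ref: "):]).read_text().strip()
    return head  # detached HEAD: the file holds a commit hash directly


with tempfile.TemporaryDirectory() as tmp:
    fake_git = Path(tmp) / ".git"
    (fake_git / "refs" / "heads").mkdir(parents=True)
    (fake_git / "HEAD").write_text("ref: refs/heads/main\n")
    (fake_git / "refs" / "heads" / "main").write_text(
        "cac0cab538b970a37ea1e769cbbde608743bc96d\n"
    )
    resolved = resolve_head(fake_git)

print(resolved)  # cac0cab538b970a37ea1e769cbbde608743bc96d
```

In a freshly initialized repository the branch file does not exist yet, because there is no commit for it to point to. In a real repository, `git rev-parse HEAD` is the reliable way to get this answer.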

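💡 Tip: Want to back up your repository's full history? Copy the .git folder: it holds every commit, branch, and tag. Changes you haven't staged or committed live only in your working directory, so they are not included.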

Git Objects: The Building Blocks

At the heart of Git, things are refreshingly straightforward. Almost everything in your repository boils down to three main object types (a fourth type, the annotated tag object, isn't covered here):

1. Blob (Binary Large Object)

A blob is as simple as it gets. It holds the content of a file—nothing else. No filenames, no permissions, no extra details. Just the raw data.

# See what type of object a hash represents
$ git cat-file -t 8ab686eafeb1f44702738c8b0f24f2567c36da6d
blob

# View the content of a blob
$ git cat-file -p 8ab686eafeb1f44702738c8b0f24f2567c36da6d
Hello, World!

ℹ️ Note: Two files with identical content share the same blob object, regardless of their names. This is how Git achieves efficient storage!

2. Tree

Think of a tree as Git's way of organizing files and folders. A tree:

Points to blobs (which are files)
Points to other trees (which are subdirectories)
Keeps track of each entry's filename and file mode (permissions)

$ git cat-file -p main^{tree}
100644 blob a906cb2a4a904a152e80877d4088654daad0c859    README.md
100644 blob 8f94139338f9404f26296befa88755fc2598c289    index.js
040000 tree 99f1a6d12cb4b6f19c8655fca46c3ecf317074e0    src

The numbers represent file modes:

100644 — Normal file
100755 — Executable file
040000 — Directory (tree)
120000 — Symbolic link
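What `git cat-file -p` prints is a prettified view. Inside the object, each tree entry is stored as the mode, a space, the filename, a NUL byte, and then the 20 raw bytes of the hash. The Python sketch below builds such a body and parses it back. The helper names `tree_entry` and `parse_tree` are made up for this example. One detail worth knowing: the directory mode is stored as `40000`, and the leading zero is only added for display.

```python
def tree_entry(mode: str, name: str, hex_hash: str) -> bytes:
    """Encode one entry the way it sits inside a tree object."""
    return mode.encode() + b" " + name.encode() + b"\0" + bytes.fromhex(hex_hash)


def parse_tree(body: bytes):
    """Yield (mode, name, hex_hash) for each entry in a raw tree body."""
    pos = 0
    while pos < len(body):
        space = body.index(b" ", pos)
        nul = body.index(b"\0", space)
        mode = body[pos:space].decode()
        name = body[space + 1:nul].decode()
        raw_hash = body[nul + 1:nul + 21]  # SHA-1 is always 20 raw bytes
        yield mode, name, raw_hash.hex()
        pos = nul + 21


body = (
    tree_entry("100644", "README.md", "a906cb2a4a904a152e80877d4088654daad0c859")
    + tree_entry("100644", "index.js", "8f94139338f9404f26296befa88755fc2598c289")
    + tree_entry("40000", "src", "99f1a6d12cb4b6f19c8655fca46c3ecf317074e0")
)

entries = list(parse_tree(body))
for mode, name, hex_hash in entries:
    kind = "tree" if mode == "40000" else "blob"
    print(f"{mode:0>6} {kind} {hex_hash}\t{name}")  # pad the mode to 6 digits for display
```

Running it should reproduce the three lines of listing shown above, which is a decent sanity check that the layout is understood correctly.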

3. Commit

This is where it all comes together. A commit:

Points to a specific tree (your project at that moment)
Links back to its parent commit or commits (so you can trace the history)
Records who made the change and when it happened
Stores the message that explains why

$ git cat-file -p HEAD
tree d8329fc1cc938780ffdd9f94e0d364e0ea74f579
parent cac0cab538b970a37ea1e769cbbde608743bc96d
author John Doe <john.doe@example.com> 1609459200 +0000
committer John Doe <john.doe@example.com> 1609459200 +0000

Add new feature to the project
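A commit is plain text: a few header lines, a blank line, then the message. Here is a small Python sketch that splits that text into fields. `parse_commit` is a hypothetical helper, and it assumes simple one-line headers like the ones above (it does not handle multi-line headers such as signatures). The email address in the sample is a placeholder.

```python
def parse_commit(text: str) -> dict:
    """Split the text of a commit object into header fields and message."""
    headers, _, message = text.partition("\n\n")
    commit = {"parents": [], "message": message.strip()}
    for line in headers.splitlines():
        key, _, value = line.partition(" ")
        if key == "parent":
            commit["parents"].append(value)  # a merge commit has more than one
        else:
            commit[key] = value
    return commit


raw = """tree d8329fc1cc938780ffdd9f94e0d364e0ea74f579
parent cac0cab538b970a37ea1e769cbbde608743bc96d
author John Doe <john.doe@example.com> 1609459200 +0000
committer John Doe <john.doe@example.com> 1609459200 +0000

Add new feature to the project
"""

commit = parse_commit(raw)
print(commit["tree"])     # d8329fc1cc938780ffdd9f94e0d364e0ea74f579
print(commit["parents"])  # ['cac0cab538b970a37ea1e769cbbde608743bc96d']
print(commit["message"])  # Add new feature to the project
```

Notice what is missing: there are no filenames and no file contents in a commit. It only names one tree, and the tree does the rest.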

The Object Relationship Model

Here's how these objects connect:

Commit ──────────────────────────────────────────────────┐
│                                                        │
├── tree: d8329fc...  ─────────────────────────┐         │
├── parent: cac0cab...                         │         │
├── author: John Doe                           ▼         │
├── committer: John Doe                      Tree        │
└── message: "Add feature"                     │         │
                                               ├── blob: README.md
                                               ├── blob: index.js
                                               └── tree: src/
                                                    └── blob: app.js

How Git Tracks Changes

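Git doesn't store each commit as a list of changes to files. Instead, every commit records a full snapshot of the project. Sounds like it'd eat up a ton of space, right? It doesn't: a file that hasn't changed is still the same blob, so the new snapshot simply points to the object that already exists, and Git compresses objects on disk.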

The Staging Area (Index)

The staging area lives in a file called .git/index. It keeps track of what you’re about to commit next. Picture it like a draft of your upcoming snapshot, not the final version, but pretty close.

Working Directory          Staging Area           Repository
     │                          │                      │
     │    git add file.txt      │                      │
     │ ─────────────────────►   │                      │
     │                          │    git commit        │
     │                          │ ─────────────────►   │
     │                          │                      │

How Git Stores Objects

Every object starts out as a zlib-compressed file under .git/objects/, named after its SHA-1 hash (Git may later bundle these "loose" objects into packfiles):

$ find .git/objects -type f
.git/objects/83/baae61804e65cc73a7201a7252750c76066a30
.git/objects/d6/70460b4b4aece5915caf5c68d12f560a9fe3e4

Notice the directory structure:

First 2 characters of the hash → directory name
Remaining 38 characters → filename

This prevents any single directory from having too many files.
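Here is a rough Python sketch of that storage scheme: build the object, hash it, compress it with zlib, and write it under a two-character directory. The function names are invented for this example, and it writes into a temporary folder rather than a real `.git/objects/`. The compressed bytes may not be byte-for-byte what Git writes, but the decompressed content has the same layout.

```python
import hashlib
import tempfile
import zlib
from pathlib import Path


def write_loose_object(objects_dir: Path, obj_type: str, content: bytes) -> str:
    """Store content as a loose object and return its id."""
    store = obj_type.encode() + b" " + str(len(content)).encode() + b"\0" + content
    object_id = hashlib.sha1(store).hexdigest()
    path = objects_dir / object_id[:2] / object_id[2:]  # 2 chars dir, 38 chars file
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(zlib.compress(store))
    return object_id


def read_loose_object(objects_dir: Path, object_id: str):
    """Return (type, content) for a loose object."""
    raw = zlib.decompress((objects_dir / object_id[:2] / object_id[2:]).read_bytes())
    header, _, content = raw.partition(b"\0")
    obj_type, _size = header.decode().split(" ")
    return obj_type, content


with tempfile.TemporaryDirectory() as tmp:
    objects_dir = Path(tmp) / "objects"
    object_id = write_loose_object(objects_dir, "blob", b"test content\n")
    relative_path = f"{object_id[:2]}/{object_id[2:]}"
    obj_type, content = read_loose_object(objects_dir, object_id)

print(relative_path)      # d6/70460b4b4aece5915caf5c68d12f560a9fe3e4
print(obj_type, content)  # blob b'test content\n'
```

The printed path matches the second entry in the `find` output above, which is the blob for the text "test content" plus a newline.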

SHA-1 Hashes and Data Integrity

Git uses a 40-character SHA-1 hash to identify every object. This isn’t just some random string—it’s a cryptographic fingerprint of the object’s actual content.

How Git Calculates Hashes

Git hashes the content along with a header that records the object type and the content's size in bytes:

header = "blob " + content.length + "\0"
hash = SHA1(header + content)

For example, hashing "Hello, World!" followed by a newline (echo adds the newline, and leaving it out would give a different hash):

$ echo "Hello, World!" | git hash-object --stdin
8ab686eafeb1f44702738c8b0f24f2567c36da6d
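The same calculation fits in a few lines of Python using only the standard library. `git_blob_hash` is a name chosen for this sketch, not something Git provides. Feed it the exact bytes you pipe into `git hash-object --stdin` and the two results should agree.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Compute the id Git would assign to a blob with this content."""
    header = b"blob " + str(len(content)).encode() + b"\0"  # size is in bytes
    return hashlib.sha1(header + content).hexdigest()


print(git_blob_hash(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 (the empty blob)
print(git_blob_hash(b"Hello, World!\n"))
print(git_blob_hash(b"Hello, World!"))  # no trailing newline, so a different hash
```

The last two lines are a reminder that a trailing newline is part of the content. That one byte is the usual reason a hand-computed hash doesn't match what Git reports.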

Why This Matters

Data Integrity: If even a single bit in an object changes, the hash changes completely. That way, Git can detect corruption whenever it checks an object against its hash.
Deduplication: When two files have the same content, they get the same hash. Git only stores them once.
Immutability: You can't change an object's content without changing its hash. An "edited" object is really a new object, and anything that referred to the old one has to be rewritten to point to the new one.

# Verify object integrity
$ git fsck
Checking object directories: 100% (256/256), done.

Warning: Never manually edit files in .git/objects/. You'll corrupt your repository because the filename (hash) won't match the content.
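To see why a hand-edited object no longer matches its name, here is a simplified Python sketch of the check. It only recomputes the hash of a single object, which is a small part of what `git fsck` does, and `is_intact` is a hypothetical helper.

```python
import hashlib
import zlib


def is_intact(expected_id: str, compressed: bytes) -> bool:
    """Return True if the stored object still hashes to its own name."""
    try:
        store = zlib.decompress(compressed)
    except zlib.error:
        return False  # not even valid zlib data any more
    return hashlib.sha1(store).hexdigest() == expected_id


content = b"Hello, World!\n"
original = b"blob " + str(len(content)).encode() + b"\0" + content
name = hashlib.sha1(original).hexdigest()  # the filename Git would use

tampered = original.replace(b"World", b"world")  # same length, one letter changed

print(is_intact(name, zlib.compress(original)))  # True
print(is_intact(name, zlib.compress(tampered)))  # False
```

Changing one letter is enough: the file still sits under the old name, but its content now hashes to something else.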

What Actually Happens When You Run git add and git commit

Let’s break down what’s really going on under the hood with these commands.

What `git add` Does

So, you hit git add file.txt. Here’s what happens:

Git takes the contents of file.txt and hashes it. This creates what’s called a “blob” object.
Git stores that blob inside the .git/objects/ directory.
Then it updates the index (that's .git/index) to include your file. That way, Git knows you want to track this exact version for your next commit.

# Before git add
$ git status
Untracked files:
    file.txt

# Run git add
$ git add file.txt

# A new blob is created
$ find .git/objects -type f
.git/objects/83/baae61804e65cc73a7201a7252750c76066a30
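The three steps of git add can be sketched in Python with two dictionaries standing in for `.git/objects/` and `.git/index`. This is a toy model under those assumptions: the real index is a binary file that also records file modes and timestamps, and the `add` function here is made up for illustration.

```python
import hashlib

objects = {}  # object id -> stored bytes (stands in for .git/objects/)
index = {}    # path -> blob id (stands in for .git/index)


def add(path: str, content: bytes) -> str:
    """Hash the content, store the blob, and point the index entry at it."""
    store = b"blob " + str(len(content)).encode() + b"\0" + content
    blob_id = hashlib.sha1(store).hexdigest()
    objects[blob_id] = store
    index[path] = blob_id
    return blob_id


first = add("file.txt", b"version 1\n")
second = add("file.txt", b"version 2\n")  # edit the file, then add it again

print(first)                        # 83baae61804e65cc73a7201a7252750c76066a30
print(index["file.txt"] == second)  # True: the index tracks the latest staged version
print(len(objects))                 # 2: the earlier blob is still in the object store
```

The blob is written at add time, not at commit time. That is why content you staged but never committed can sometimes still be found among the objects.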

What `git commit` Does

When you run git commit -m "message", here’s what goes on:

Git turns the index into tree objects (one per directory). Together they form the snapshot of everything you've staged.
Then it builds a commit object. This one points to the tree and adds info like who made the commit and when.
Finally, Git moves the branch pointer forward, so it now points to your new commit.

# Create a commit
$ git commit -m "Initial commit"

# Inspect the new commit (a first commit has no parent line)
$ git cat-file -p HEAD
tree d8329fc1cc938780ffdd9f94e0d364e0ea74f579
author You <you@email.com> 1609459200 +0000
committer You <you@email.com> 1609459200 +0000

Initial commit
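Here is a rough Python model of those three steps: write a tree from the index, write a commit that points to it, then move the branch. It is a sketch under simplifying assumptions. The tree is flat (no subdirectories), every file gets mode 100644, and `store_object`, `commit`, and the two dictionaries are invented names rather than anything Git exposes.

```python
import hashlib

objects = {}  # object id -> stored bytes
refs = {}     # ref name -> commit id


def store_object(obj_type: str, body: bytes) -> str:
    store = obj_type.encode() + b" " + str(len(body)).encode() + b"\0" + body
    object_id = hashlib.sha1(store).hexdigest()
    objects[object_id] = store
    return object_id


def commit(index: dict, message: str, identity: str, branch: str = "refs/heads/main") -> str:
    # 1. Snapshot the index as a tree (entries sorted by path).
    tree_body = b"".join(
        b"100644 " + path.encode() + b"\0" + bytes.fromhex(blob_id)
        for path, blob_id in sorted(index.items())
    )
    tree_id = store_object("tree", tree_body)

    # 2. Build the commit, linking to the branch's previous commit if there is one.
    lines = [f"tree {tree_id}"]
    if branch in refs:
        lines.append(f"parent {refs[branch]}")
    lines += [f"author {identity}", f"committer {identity}", "", message]
    commit_id = store_object("commit", ("\n".join(lines) + "\n").encode())

    # 3. Move the branch pointer forward.
    refs[branch] = commit_id
    return commit_id


identity = "You <you@email.com> 1609459200 +0000"
blob_id = store_object("blob", b"version 1\n")
first = commit({"file.txt": blob_id}, "Initial commit", identity)
second = commit({"file.txt": blob_id}, "Second commit", identity)

print(refs["refs/heads/main"] == second)              # True: the branch moved
print(f"parent {first}".encode() in objects[second])  # True: history is linked
print(len(objects))                                   # 4: one blob, one tree, two commits
```

Both commits share a single tree object because the staged content didn't change between them. That is the snapshot model and deduplication working together.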

The Complete Picture

                    Working Directory
                           │
                     git add file
                           │
                           ▼
    ┌─────────────────────────────────────────┐
    │           Staging Area (Index)          │
    │  ┌────────────────────────────────────┐ │
    │  │ file.txt → blob 83baae61...        │ │
    │  └────────────────────────────────────┘ │
    └─────────────────────────────────────────┘
                           │
                    git commit -m "msg"
                           │
                           ▼
    ┌─────────────────────────────────────────┐
    │           Object Database               │
    │  ┌──────────────────────────────────┐   │
    │  │ Commit: abc123...                │   │
    │  │   └── Tree: d8329f...            │   │
    │  │         └── Blob: 83baae...      │   │
    │  └──────────────────────────────────┘   │
    └─────────────────────────────────────────┘
                           │
                           ▼
    ┌─────────────────────────────────────────┐
    │           refs/heads/main               │
    │           Points to: abc123...          │
    └─────────────────────────────────────────┘

Exploring Git Internals Yourself

Here are some commands to explore your own repositories:

Plumbing Commands (Low-Level)

# Hash content without storing
$ echo "test content" | git hash-object --stdin

# Hash and store content
$ echo "test content" | git hash-object -w --stdin

# View object type
$ git cat-file -t <hash>

# View object content
$ git cat-file -p <hash>

# View object size
$ git cat-file -s <hash>

Examine Your Repository

# List all objects
$ find .git/objects -type f

# View the staging area
$ git ls-files --stage

# Check HEAD reference
$ cat .git/HEAD

# Check branch reference
$ cat .git/refs/heads/main

# Verify repository integrity
$ git fsck --full

Best Practices

Don’t mess with files inside .git/ by hand. Always use Git commands.
Run git fsck once in a while. It checks if your repo is healthy.
Try to understand what each command does instead of just memorizing them.
Want to poke around? Set up a test repo and experiment there. It’s safer.

Common Mistakes to Avoid

Never delete files from .git/objects/. You’ll break your repo for good.
Don’t edit files in .git/objects/pack/ either. These are compressed—leave them alone.
If git fsck throws warnings, don’t ignore them. They’re usually a sign something’s wrong with your data.

Wrapping Up

Git’s design is actually pretty straightforward when you break it down:

Blobs hold your file data.
Trees keep track of directory layouts.
Commits save snapshots and remember your history.
SHA-1 hashes glue everything together, making sure nothing gets lost or duplicated.
The staging area lines up your next changes.
References—like branches and tags—just point to specific commits.

Once you get how these pieces fit, Git stops being this black box and starts making sense. You can actually reason about what’s going on under the hood.

Want to dig deeper?

Git Internals - Official Documentation
Try using git cat-file and git hash-object on your own projects
Explore how git pack-objects compresses your repository
Learn about git reflog for recovering lost commits

If this helped, stick around for more hands-on guides and deep dives into the tools developers use every day.
