MD5 Hash Function

An interactive walkthrough of how MD5 works — from message padding to the four round functions — with live hashing, avalanche effect visualization, and an honest look at why MD5 is no longer safe.

128-bit output · 4 rounds × 16 ops · One-way function · ⚠ Collisions found · Educational

0 What Is a Hash Function?

A cryptographic hash function takes an input of any size and produces a fixed-size "fingerprint." Unlike encryption, hashing is one-way — you cannot reverse it to recover the original input.

Encryption (reversible)

plaintext → encrypt(key) → ciphertext
ciphertext → decrypt(key) → plaintext

Purpose: keep data confidential while allowing authorized recovery.

Hashing (one-way)

any input → hash() → fixed-size digest
digest → ❌ no way back

Purpose: verify integrity, store passwords, create digital signatures.

Real-world uses of hashing:
  • Password storage — store hash(password), never the password itself
  • File integrity — download a file, verify its hash matches the published one
  • Digital signatures — sign hash(document) instead of the full document
  • Deduplication — identical files have identical hashes

1 The Four Properties

📏
Fixed-size output
No matter if the input is 1 byte or 1 GB — MD5 always outputs exactly 128 bits (32 hex characters).
🔁
Deterministic
Same input always produces the same output. Hashing "hello" today and tomorrow gives identical results.
🚪
One-way (pre-image resistant)
Given only the hash, it is computationally infeasible to find an input that produces it. There is no "unhash" operation.
🌊
Avalanche effect
Change a single bit in the input → roughly 50% of output bits flip. Small changes produce completely different hashes.
Collision resistance, the 5th property (and where MD5 fails): It should be computationally infeasible to find two different inputs with the same hash. MD5 breaks this property: researchers can produce collisions in seconds. See Section 7.
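The first two properties are easy to check with Python's standard `hashlib` module. This is a minimal sketch; the helper name `md5_hex` is an illustrative assumption, not something defined elsewhere on this page.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as 32 lowercase hex characters."""
    return hashlib.md5(data).hexdigest()

# Fixed-size output: a 1-byte input and a 1 MiB input both give 32 hex characters.
short_digest = md5_hex(b"a")
long_digest = md5_hex(b"a" * (1024 * 1024))
print(len(short_digest), len(long_digest))  # 32 32

# Deterministic: hashing the same bytes twice gives the same digest.
print(md5_hex(b"hello") == md5_hex(b"hello"))  # True
print(md5_hex(b"hello"))  # 5d41402abc4b2a76b9719d911017c592
```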

2 MD5 at a Glance

Output size
128 bits / 32 hex
Block size
512 bits / 64 bytes
Word size
32 bits
Rounds
4 rounds × 16 ops = 64
Published
1992 (RFC 1321)
Status
⚠ Broken

Pipeline Overview

1. Pad message → length ≡ 448 (mod 512) bits, then append 64-bit length
2. Initialize state: four 32-bit words A, B, C, D (fixed constants)
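3. Process blocks: for each 512-bit block, run 64 operations (4 rounds of 16, using F, G, H, I) and add the result back into the state
4. Output: concatenate the final A‖B‖C‖D (each word in little-endian byte order) = 128-bit hash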

Initial State (Magic Constants)

MD5 starts with four hard-coded 32-bit words. They are "nothing up my sleeve" numbers: written out as bytes, they simply count up from 01 to EF and then back down from FE to 10, so there is no reason to suspect a hidden structure.

A₀ = 0x67452301
B₀ = 0xEFCDAB89
C₀ = 0x98BADCFE
D₀ = 0x10325476
Note: A₀ bytes in order are: 01 23 45 67
B₀ bytes: 89 AB CD EF
C₀ bytes: FE DC BA 98
D₀ bytes: 76 54 32 10
MD5 uses little-endian byte order throughout.
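The byte order noted above can be checked by serializing each initial word as a little-endian 32-bit integer. A small sketch follows; `INITIAL_STATE` and `word_to_le_bytes` are names chosen here for illustration.

```python
import struct

# The four initial state words A0, B0, C0, D0 listed above.
INITIAL_STATE = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)

def word_to_le_bytes(word: int) -> bytes:
    """Serialize a 32-bit word with the least significant byte first."""
    return struct.pack("<I", word)

for name, word in zip("ABCD", INITIAL_STATE):
    print(name, word_to_le_bytes(word).hex(" "))
# A 01 23 45 67
# B 89 ab cd ef
# C fe dc ba 98
# D 76 54 32 10
```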

3 Step 1: Message Padding

Before processing, the message must be padded to a multiple of 512 bits (64 bytes). The padding scheme is carefully designed so different messages produce different padded results.

Padded length ≡ 0 (mod 512 bits)  where  padded = original + pad_bits + 64-bit-length

Padding Rules

  1. Append a single 1 bit (byte 0x80) after the message.
  2. Append 0 bits until the message length ≡ 448 (mod 512) bits (i.e., 56 mod 64 bytes).
  3. Append the original message length in bits (modulo 2⁶⁴) as a 64-bit little-endian integer. This fills the last 8 bytes of the final 512-bit block.
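The three rules above can be sketched as a short function. The name `md5_pad` is an assumption made for this example, and the sketch works on whole bytes only (as most implementations do), so the single 1 bit is always written as the byte 0x80.

```python
def md5_pad(message: bytes) -> bytes:
    """Pad message to a multiple of 64 bytes following the three rules above."""
    bit_length = (len(message) * 8) % (1 << 64)     # original length in bits, mod 2^64
    padded = message + b"\x80"                      # rule 1: a 1 bit followed by seven 0 bits
    padded += b"\x00" * ((56 - len(padded)) % 64)   # rule 2: zeros until length = 56 (mod 64)
    padded += bit_length.to_bytes(8, "little")      # rule 3: 64-bit little-endian length
    return padded

print(len(md5_pad(b"abc")))       # 64
print(len(md5_pad(b"a" * 55)))    # 64  (55 bytes is the most that fits in one block)
print(len(md5_pad(b"a" * 56)))    # 128 (one more byte spills into a second block)
```

Note that a message of exactly 56 bytes still needs a second block, because the 0x80 byte and the 8-byte length no longer fit in the first one.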

Padding Visualizer

Type a short message to see how the 64-byte (512-bit) padded block looks.

Block layout: message bytes | 0x80 separator | zero padding | length (64-bit LE)

4 The Four Round Functions

MD5 processes each block in 4 rounds of 16 operations. Each round uses a different non-linear bitwise function (F, G, H, or I) applied to the current state words B, C, D.

Round 1 — F (Choice)
F(b,c,d) = (b ∧ c) ∨ (¬b ∧ d)

If bit of b is 1, choose bit of c. Otherwise choose bit of d. Acts as a bitwise "if-then-else".

Round 2 — G (Choice)
G(b,c,d) = (b ∧ d) ∨ (c ∧ ¬d)

The same multiplexer as F, but with d as the selector: if a bit of d is 1, choose the bit of b; otherwise choose the bit of c. Equivalently, G(b,c,d) = F(d,b,c).

Round 3 — H (Parity)
H(b,c,d) = b ⊕ c ⊕ d

XOR parity of all three words. Each output bit is 1 iff an odd number of b, c, d bits are 1.

Round 4 — I (OR-NOT mix)
I(b,c,d) = c ⊕ (b ∨ ¬d)

A nonlinear combination of OR, NOT, and XOR (it is not a majority function). Designed to complement H and increase diffusion.
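The four formulas above translate directly into Python. In this sketch the helper `not32` and the constant `MASK32` are assumptions added here, needed because Python integers are unbounded and `~x` would otherwise produce a negative number.

```python
MASK32 = 0xFFFFFFFF

def not32(x: int) -> int:
    """Bitwise NOT restricted to 32 bits."""
    return ~x & MASK32

def F(b: int, c: int, d: int) -> int:
    return (b & c) | (not32(b) & d)      # if b then c else d

def G(b: int, c: int, d: int) -> int:
    return (b & d) | (c & not32(d))      # if d then b else c

def H(b: int, c: int, d: int) -> int:
    return b ^ c ^ d                     # parity of the three inputs

def I(b: int, c: int, d: int) -> int:
    return c ^ (b | not32(d))

b, c, d = 0xEFCDAB89, 0x98BADCFE, 0x10325476
print(G(b, c, d) == F(d, b, c))  # True: G is F with d acting as the selector
```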

One Operation Within a Round

Each of the 64 operations updates one state word using this formula; the four words then rotate roles so that a different word is updated next. The rotations and additions (all modulo 2³²) provide further diffusion.

\( a \leftarrow b + \text{LeftRotate}_{s}(a + F(b,c,d) + M_k + T_i) \)
Components:
  • F: the round function for the current round (F, G, H, or I)
  • M_k: a 32-bit word from the message block (the index k follows a round-specific schedule)
  • T_i: precomputed constant floor(2³² × |sin(i)|), with i in radians, for i = 1…64
  • s: per-operation left-rotate amount
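As a sketch of the components listed above, the T constants can be generated from the sine formula and a single operation written as one function. The names `left_rotate` and `md5_step` are illustrative assumptions; the round function is passed in as `func` so the block stays self-contained.

```python
import math

MASK32 = 0xFFFFFFFF

# T[0] holds T_1, ..., T[63] holds T_64; math.sin takes its argument in radians.
T = [int(abs(math.sin(i)) * 2**32) & MASK32 for i in range(1, 65)]

def left_rotate(x: int, s: int) -> int:
    """Rotate a 32-bit value left by s bits (0 < s < 32)."""
    x &= MASK32
    return ((x << s) | (x >> (32 - s))) & MASK32

def md5_step(a, b, c, d, func, m_k, t_i, s):
    """Return the new value of a: b + LeftRotate_s(a + func(b,c,d) + M_k + T_i)."""
    total = (a + func(b, c, d) + m_k + t_i) & MASK32   # additions are modulo 2^32
    return (b + left_rotate(total, s)) & MASK32

print(hex(T[0]), hex(T[63]))  # 0xd76aa478 0xeb86d391
```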

Implementation Walkthrough

This section traces every step of MD5, from raw bytes to final hash, with enough detail to write your own implementation from scratch. The trace assumes a short message (at most 55 bytes once UTF-8 encoded, so that the message plus padding fits in one 64-byte block).

1 Encode Input to Bytes (UTF-8)

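MD5 hashes bytes, not characters, so the one-block limit counts encoded bytes. A minimal sketch of this phase, with `encode_message` and `fits_in_one_block` as hypothetical helper names:

```python
def encode_message(text: str) -> bytes:
    """Encode text as UTF-8, the byte sequence that actually gets hashed."""
    return text.encode("utf-8")

def fits_in_one_block(text: str) -> bool:
    # A block is 64 bytes; padding needs at least 1 byte (0x80) plus 8 bytes (length).
    return len(encode_message(text)) <= 55

sample = "h\u00e9llo"  # "héllo": the accented letter takes two bytes in UTF-8
print(len(sample), len(encode_message(sample)))  # 5 6
```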

2 Pad to 512-bit Block


3 Pack Bytes → 16 Message Words (Little-Endian!)

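Packing a 64-byte block into sixteen 32-bit words is a single `struct.unpack` call once the little-endian format is specified. The sketch below uses the padded block for the message "abc" as its example input; `block_to_words` is a name chosen here.

```python
import struct

def block_to_words(block: bytes) -> tuple:
    """Split one 64-byte block into sixteen little-endian 32-bit words M_0..M_15."""
    if len(block) != 64:
        raise ValueError("an MD5 block is exactly 64 bytes")
    return struct.unpack("<16I", block)

# Padded block for b"abc": message, 0x80, zero padding, 64-bit length (24 bits).
block = b"abc" + b"\x80" + b"\x00" * 52 + (24).to_bytes(8, "little")
words = block_to_words(block)
print(hex(words[0]))          # 0x80636261 (bytes 61 62 63 80, least significant first)
print(words[14], words[15])   # 24 0 (low and high halves of the bit length)
```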

4 Initialize State: A, B, C, D


5 64 Operations — 4 Rounds × 16 (F, G, H, I)


6 Add Back, Encode Output (Little-Endian)

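Putting the six phases together, a compact from-scratch version might look like the sketch below. It is written for readability, not speed, and is checked against `hashlib` only as a sanity test. The function name `md5_from_scratch` is an assumption, and the loop uses the common "rotate the four variables" formulation, which is equivalent to updating a, d, c, b in turn.

```python
import hashlib
import math
import struct

MASK32 = 0xFFFFFFFF
# Per-operation rotate amounts: four values per round, each pattern repeated four times.
SHIFTS = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
T = [int(abs(math.sin(i + 1)) * 2**32) & MASK32 for i in range(64)]

def left_rotate(x: int, s: int) -> int:
    x &= MASK32
    return ((x << s) | (x >> (32 - s))) & MASK32

def md5_from_scratch(message: bytes) -> str:
    # Phase 2: pad to a multiple of 64 bytes.
    bit_length = (len(message) * 8) & 0xFFFFFFFFFFFFFFFF
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack("<Q", bit_length)

    # Phase 4: initial state.
    a0, b0, c0, d0 = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476

    for offset in range(0, len(padded), 64):
        # Phase 3: sixteen little-endian message words.
        m = struct.unpack("<16I", padded[offset:offset + 64])
        a, b, c, d = a0, b0, c0, d0
        # Phase 5: 64 operations in four rounds of 16.
        for i in range(64):
            if i < 16:
                f, k = (b & c) | (~b & d), i
            elif i < 32:
                f, k = (b & d) | (c & ~d), (5 * i + 1) % 16
            elif i < 48:
                f, k = b ^ c ^ d, (3 * i + 5) % 16
            else:
                f, k = c ^ (b | (~d & MASK32)), (7 * i) % 16
            total = (a + f + m[k] + T[i]) & MASK32
            a, b, c, d = d, (b + left_rotate(total, SHIFTS[i])) & MASK32, b, c
        # Phase 6 (first half): add the block result back into the state.
        a0, b0 = (a0 + a) & MASK32, (b0 + b) & MASK32
        c0, d0 = (c0 + c) & MASK32, (d0 + d) & MASK32

    # Phase 6 (second half): output the four words in little-endian byte order.
    return struct.pack("<4I", a0, b0, c0, d0).hex()

print(md5_from_scratch(b""))  # d41d8cd98f00b204e9800998ecf8427e
print(md5_from_scratch(b"abc") == hashlib.md5(b"abc").hexdigest())  # True
```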

5 Live MD5 Demo

Type anything and watch the hash update in real time. The output is always exactly 32 hexadecimal characters = 128 bits.

Notice: "hello" vs "Hello" differ by just one bit in the first character — yet produce completely different hashes. That's the avalanche effect.

6 Avalanche Effect

A good hash function magnifies tiny changes. Flipping a single bit in the input should flip roughly 50% of the output bits — making every output look completely random relative to nearby inputs.
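One way to measure this is to XOR two digests and count the 1 bits. The sketch below uses `hashlib` and a helper named `differing_bits` (an illustrative name); the exact count depends on the inputs, so no specific number is claimed here.

```python
import hashlib

def differing_bits(x: bytes, y: bytes) -> int:
    """Count how many of the 128 MD5 output bits differ between two inputs."""
    dx = int.from_bytes(hashlib.md5(x).digest(), "big")
    dy = int.from_bytes(hashlib.md5(y).digest(), "big")
    return bin(dx ^ dy).count("1")

# "hello" and "Hello" differ in exactly one input bit (0x68 vs 0x48).
input_bits = bin(ord("h") ^ ord("H")).count("1")
flipped = differing_bits(b"hello", b"Hello")
print(f"{input_bits} input bit changed, {flipped} of 128 output bits differ")
```

For an ideal hash the count would be close to 64 on average, with some variation from pair to pair.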

Bit-level Comparison

Edit either input and see exactly which output bits change (shown in red).

Try It: Increment One Character

Enter a base message. Each row shows the hash when exactly one character position is changed by +1 in ASCII value (which may flip more than one input bit).

7 Why MD5 Is Broken

⚠ Do not use MD5 for passwords, digital signatures, or any security-critical purpose. MD5 is fully broken for cryptographic applications. Use SHA-256 or SHA-3 instead.

Collision Attack

A collision is two different inputs that produce the same hash. For an ideal 128-bit hash, the birthday paradox says you'd expect to need ~2⁶⁴ ≈ 18 quintillion attempts. But MD5's internal structure has weaknesses that allow collisions to be found in seconds on ordinary hardware.
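As a worked version of the birthday estimate above (a standard approximation for an idealized n-bit hash, not a statement about MD5's actual structure), the probability of at least one collision after k random hashes is roughly:

\[
P(k) \approx 1 - \exp\!\left(-\frac{k^{2}}{2^{\,n+1}}\right), \qquad n = 128,\; k = 2^{64} \;\Rightarrow\; P \approx 1 - e^{-1/2} \approx 0.39
\]

So about 2⁶⁴ ≈ 1.8 × 10¹⁹ attempts already give a substantial chance of a collision. This is the generic cost that attacks on MD5's structure undercut.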

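Famous MD5 collision (Wang et al., 2004; two inputs with the same hash). As printed, each row is 64 bytes, whereas the published collision consists of two 128-byte (two-block) messages, so treat the rows as an abbreviated illustration, not as data to hash directly:
d131dd02c5e6eec4693d9a0698aff95c 2fcab58712467eab4004583eb8fb7f89 55ad340609f4b30283e4888325718161 2e57f8d9b4f95bb25568be6f9d1f68e6
d131dd02c5e6eec4693d9a0698aff95c 2fcab50712467eab4004583eb8fb7f89 55ad340609f4b30283e488832571816e 2e57f8d9b4f95bb25568be6f9d1f68e6
The full messages both hash to: 79054025255fb1a26e4bc422aef54eb4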

When MD5 is still OK

  • Checksums for accidental corruption (not malicious)
  • Deduplication / content-addressing where security is not needed
  • Non-security hash tables
  • Legacy system compatibility

Never use MD5 for…

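  • Passwords: use bcrypt, scrypt, or Argon2
  • Digital signatures: use SHA-256 or SHA-3
  • TLS/SSL certificates: collisions have been practical since 2004, and a rogue CA certificate was forged in 2008
  • New HMAC deployments: HMAC-MD5 does not rely on collision resistance and has no known practical break, but HMAC-SHA-256 is the safer choice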

Timeline of MD5's demise:
1992: MD5 published by Ron Rivest (RFC 1321)
1996: Dobbertin finds weaknesses in the MD5 compression function
2004: Wang et al. demonstrate the first full MD5 collisions
2008: Rogue CA certificate forged using an MD5 collision (Sotirov et al.)
2012: Flame malware forges a Microsoft code-signing certificate with an MD5 collision to spoof Windows Update
Now: Use SHA-256 or SHA-3 for all security uses