MD5 Hash Function
An interactive walkthrough of how MD5 works — from message padding to the four round functions — with live hashing, avalanche effect visualization, and an honest look at why MD5 is no longer safe.
0 What Is a Hash Function?
A cryptographic hash function takes an input of any size and produces a fixed-size "fingerprint." Unlike encryption, hashing is one-way — you cannot reverse it to recover the original input.
Encryption (reversible)
ciphertext → decrypt(key) → plaintext
Purpose: keep data confidential while allowing authorized recovery.
Hashing (one-way)
digest → ❌ no way back
Purpose: verify integrity, store passwords, create digital signatures.
- Password storage — store hash(password), never the password itself
- File integrity — download a file, verify its hash matches the published one
- Digital signatures — sign hash(document) instead of the full document
- Deduplication — identical files have identical hashes
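The fixed-size and deterministic behaviour described above can be checked with Python's standard `hashlib` module. This is a minimal sketch; the helper name `md5_hex` is only illustrative.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as 32 hexadecimal characters."""
    return hashlib.md5(data).hexdigest()

# Inputs of very different sizes produce digests of the same length.
short_digest = md5_hex(b"")
long_digest = md5_hex(b"x" * 1_000_000)
print(short_digest, len(short_digest))  # d41d8cd98f00b204e9800998ecf8427e 32
print(len(long_digest))                 # 32

# Deduplication / integrity check: identical bytes give identical digests.
print(md5_hex(b"same bytes") == md5_hex(b"same bytes"))  # True
```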
1 The Four Properties
2 MD5 at a Glance
Pipeline Overview
1. Pad the message to a multiple of 512 bits
2. Initialize state: four 32-bit words A, B, C, D (fixed constants)
3. Process blocks: for each 512-bit block, run 64 operations in 4 rounds of 16 (F, G, H, I)
4. Output — concatenate final A‖B‖C‖D = 128-bit hash
Initial State (Magic Constants)
MD5 starts with four hard-coded 32-bit words. These specific values are chosen to have no special mathematical relationship — they're just "nothing up my sleeve" numbers.
A₀ bytes: 01 23 45 67
B₀ bytes: 89 AB CD EF
C₀ bytes: FE DC BA 98
D₀ bytes: 76 54 32 10
MD5 uses little-endian byte order throughout.
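To see what little-endian order means for the initial state, the following sketch reads the sixteen bytes listed above as four 32-bit words using the standard `struct` module. The variable names are illustrative.

```python
import struct

# The 16 initial-state bytes, in the order they sit in memory.
initial_bytes = bytes.fromhex("0123456789abcdeffedcba9876543210")

# "<4I" means four unsigned 32-bit integers, little-endian.
A0, B0, C0, D0 = struct.unpack("<4I", initial_bytes)

for name, word in zip("ABCD", (A0, B0, C0, D0)):
    print(f"{name}0 = 0x{word:08x}")
# A0 = 0x67452301
# B0 = 0xefcdab89
# C0 = 0x98badcfe
# D0 = 0x10325476
```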
3 Step 1: Message Padding
Before processing, the message must be padded to a multiple of 512 bits (64 bytes). The padding scheme is carefully designed so different messages produce different padded results.
Padding Rules
- Append a single 1 bit (byte 0x80) after the message.
- Append 0 bits until the message length ≡ 448 (mod 512) bits (i.e., 56 mod 64 bytes).
- Append the original message length in bits as a 64-bit little-endian integer. This fills the last 8 bytes of the final 512-bit block.
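The three padding rules can be sketched in a few lines of Python. The function name `md5_pad` is an illustrative choice, and the sketch assumes byte-aligned input (whole bytes, not arbitrary bit strings).

```python
import struct

def md5_pad(message: bytes) -> bytes:
    """Pad a byte string to a multiple of 64 bytes using the MD5 rules."""
    bit_length = (8 * len(message)) % 2**64      # length in bits, mod 2^64
    padded = message + b"\x80"                   # rule 1: a single 1 bit
    padded += b"\x00" * ((56 - len(padded)) % 64)  # rule 2: zeros up to 56 mod 64
    padded += struct.pack("<Q", bit_length)      # rule 3: 64-bit little-endian length
    return padded

block = md5_pad(b"abc")
print(len(block))         # 64
print(block[:4].hex())    # 61626380
print(block[-8:].hex())   # 1800000000000000  (24 bits, little-endian)
```

A 55-byte message still fits in one block (55 + 1 + 8 = 64), while a 56-byte message spills into a second block.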
Padding Visualizer
Type a short message to see how the 64-byte (512-bit) padded block looks.
4 The Four Round Functions
MD5 processes each block in 4 rounds of 16 operations. Each round uses a different non-linear bitwise function (F, G, H, or I) applied to the current state words B, C, D.
F(b, c, d) = (b ∧ c) ∨ (¬b ∧ d). If a bit of b is 1, choose the bit of c; otherwise choose the bit of d. Acts as a bitwise "if-then-else".
G(b, c, d) = (b ∧ d) ∨ (c ∧ ¬d). Like F, but with d as the selector: if a bit of d is 1, choose the bit of b; otherwise choose the bit of c.
H(b, c, d) = b ⊕ c ⊕ d. XOR parity of all three words: each output bit is 1 iff an odd number of the b, c, d bits are 1.
I(b, c, d) = c ⊕ (b ∨ ¬d). A nonlinear combination using XOR, OR, and NOT, designed to complement H and increase diffusion.
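As a sketch, the four round functions can be written directly with Python's bitwise operators, using the standard definitions from RFC 1321. Python integers are unbounded, so each result is masked to 32 bits; the `MASK` name is an illustrative choice.

```python
MASK = 0xFFFFFFFF  # keep results within 32 bits

def F(b, c, d):
    """Bitwise if-then-else: where b is 1 take c, else take d."""
    return ((b & c) | (~b & d)) & MASK

def G(b, c, d):
    """Same selection idea, but d is the selector: where d is 1 take b, else c."""
    return ((b & d) | (c & ~d)) & MASK

def H(b, c, d):
    """Parity: each output bit is 1 iff an odd number of input bits are 1."""
    return (b ^ c ^ d) & MASK

def I(b, c, d):
    """Nonlinear mix using OR, NOT and XOR."""
    return (c ^ (b | ~d)) & MASK

# F with b = upper half ones: upper half comes from c, lower half from d.
print(hex(F(0xFFFF0000, 0x12345678, 0x9ABCDEF0)))  # 0x1234def0
# G with d = lower half ones: lower half comes from b, upper half from c.
print(hex(G(0x12345678, 0x9ABCDEF0, 0x0000FFFF)))  # 0x9abc5678
```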
One Operation Within a Round
Each of the 64 operations updates one state word using the formula below. The rotations and modular additions provide further diffusion.
a ← b + ((a + F(b, c, d) + M_k + T_i) <<< s), with all additions modulo 2³²
- F — round function (F, G, H, or I)
- M_k — a 32-bit word from the message block (k varies by round)
- T_i — precomputed constant: floor(2³² × |sin(i)|) for i = 1…64
- s — per-operation left-rotate amount
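The sine-based constants and a single operation can be sketched as follows. The helper names `rotl32` and `step` are illustrative, and `step` takes the round function as a parameter rather than choosing it from the operation index.

```python
import math

MASK = 0xFFFFFFFF

# T[0] holds T_1, ..., T[63] holds T_64 (i is in radians).
T = [int(2**32 * abs(math.sin(i))) & MASK for i in range(1, 65)]

def rotl32(x, s):
    """Rotate a 32-bit value left by s bits (0 < s < 32)."""
    x &= MASK
    return ((x << s) | (x >> (32 - s))) & MASK

def step(a, b, c, d, f, m_k, t_i, s):
    """One MD5 operation: returns the new value b + ((a + f(b,c,d) + M_k + T_i) <<< s)."""
    return (b + rotl32(a + f(b, c, d) + m_k + t_i, s)) & MASK

print(hex(T[0]))   # 0xd76aa478
print(hex(T[63]))  # 0xeb86d391
```

After each operation the four state words rotate roles, so every word is updated 16 times per block.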
★ Implementation Walkthrough
This section traces every single step of MD5, from raw bytes to final hash — with enough detail to write your own implementation from scratch. Enter a short message (≤ 55 chars to fit in one block) and click Trace →.
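For readers who want to follow along in code, here is a compact from-scratch sketch of the whole pipeline (padding, initial state, 64 operations per block, output). It follows the RFC 1321 structure, is written for clarity rather than speed, and the helper names are illustrative. It is meant for study only; for real use, call `hashlib.md5`.

```python
import math
import struct

MASK = 0xFFFFFFFF
# Sine-based constants T_1..T_64 and per-operation rotate amounts.
T = [int(2**32 * abs(math.sin(i + 1))) & MASK for i in range(64)]
S = ([7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 +
     [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4)

def rotl32(x, s):
    """Rotate a 32-bit value left by s bits."""
    x &= MASK
    return ((x << s) | (x >> (32 - s))) & MASK

def md5(message: bytes) -> str:
    # Step 1: padding (0x80, zeros to 56 mod 64, 64-bit little-endian bit length).
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack("<Q", (8 * len(message)) % 2**64)

    # Step 2: initial state.
    a0, b0, c0, d0 = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476

    # Step 3: process each 64-byte block.
    for offset in range(0, len(padded), 64):
        M = struct.unpack("<16I", padded[offset:offset + 64])
        a, b, c, d = a0, b0, c0, d0
        for i in range(64):
            if i < 16:                       # round 1: F
                f, k = (b & c) | (~b & d), i
            elif i < 32:                     # round 2: G
                f, k = (b & d) | (c & ~d), (5 * i + 1) % 16
            elif i < 48:                     # round 3: H
                f, k = b ^ c ^ d, (3 * i + 5) % 16
            else:                            # round 4: I
                f, k = c ^ (b | ~d), (7 * i) % 16
            new_b = (b + rotl32(a + f + M[k] + T[i], S[i])) & MASK
            a, b, c, d = d, new_b, b, c      # rotate the roles of the words
        # Add this block's result into the running state.
        a0, b0 = (a0 + a) & MASK, (b0 + b) & MASK
        c0, d0 = (c0 + c) & MASK, (d0 + d) & MASK

    # Step 4: output A, B, C, D as little-endian bytes.
    return struct.pack("<4I", a0, b0, c0, d0).hex()

print(md5(b""))     # d41d8cd98f00b204e9800998ecf8427e
print(md5(b"abc"))  # 900150983cd24fb0d6963f7d28e17f72
```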
5 Live MD5 Demo
Type anything and watch the hash update in real time. The output is always exactly 32 hexadecimal characters = 128 bits.
Real-time Hash
Preset Examples
6 Avalanche Effect
A good hash function magnifies tiny changes. Flipping a single bit in the input should flip roughly 50% of the output bits — making every output look completely random relative to nearby inputs.
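One way to measure this is to flip each input bit in turn and count how many of the 128 output bits change. The sketch below does that with `hashlib`; the helper names and the sample message are illustrative.

```python
import hashlib

def bit_difference(x: bytes, y: bytes) -> int:
    """Number of differing bits between MD5(x) and MD5(y)."""
    dx = int.from_bytes(hashlib.md5(x).digest(), "big")
    dy = int.from_bytes(hashlib.md5(y).digest(), "big")
    return bin(dx ^ dy).count("1")

def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit flipped."""
    flipped = bytearray(data)
    flipped[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(flipped)

message = b"hello world"
counts = [bit_difference(message, flip_bit(message, i))
          for i in range(8 * len(message))]

# The average should land close to 64, i.e. about half of the 128 output bits.
print(min(counts), max(counts), sum(counts) / len(counts))
```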
Bit-level Comparison
Edit either input and see exactly which output bits change (shown in red).
Try It: Increment one character
Enter a base message. Each row below shows the hash when we change exactly one character position by +1 ASCII value:
7 Why MD5 Is Broken
Collision Attack
A collision is two different inputs that produce the same hash. For an ideal 128-bit hash, the birthday paradox puts the expected cost of finding one at roughly 2⁶⁴ ≈ 18 quintillion attempts. MD5's internal structure has weaknesses that cut this cost dramatically: collisions can now be found within seconds on ordinary hardware.
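The birthday bound itself can be illustrated on a deliberately shortened hash. The sketch below searches for two inputs whose MD5 digests agree in the first 3 bytes (24 bits), which should take on the order of 2¹² tries rather than 2²⁴. Note that this is a generic birthday search, not the structural attack that breaks full MD5; the function name and the counter-based inputs are illustrative assumptions.

```python
import hashlib

def find_truncated_collision(prefix_bytes: int = 3):
    """Find two inputs whose MD5 digests share the first prefix_bytes bytes."""
    seen = {}          # truncated digest -> first input that produced it
    counter = 0
    while True:
        data = str(counter).encode()
        tag = hashlib.md5(data).digest()[:prefix_bytes]
        if tag in seen:
            return seen[tag], data, counter + 1
        seen[tag] = data
        counter += 1

first, second, attempts = find_truncated_collision(3)
print(first, second, attempts)
print(hashlib.md5(first).hexdigest())
print(hashlib.md5(second).hexdigest())
```

Each extra byte of prefix multiplies the expected work by about 16 (the square root of 256), which is why a full 128-bit match would be out of reach by brute force alone.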
When MD5 is still OK
- Checksums for accidental corruption (not malicious)
- Deduplication / content-addressing where security is not needed
- Non-security hash tables
- Legacy system compatibility
Never use MD5 for…
- Passwords — use bcrypt, scrypt, or Argon2
- Digital signatures — use SHA-256 or SHA-3
- TLS/SSL certificates — unsafe since the 2004 collision results, and exploited in 2008 to forge a CA certificate
- New HMAC designs (HMAC-MD5 has no known practical break, but HMAC-SHA-256 is the safer default)
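For the password case, a minimal sketch using scrypt from Python's standard library might look like this. The function names are hypothetical, the cost parameters (`n=2**14, r=8, p=1`) are illustrative assumptions rather than a recommendation, and `hashlib.scrypt` requires a Python build linked against OpenSSL with scrypt support.

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None):
    """Derive a 32-byte key from a password with scrypt; returns (salt, key)."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt,
                         n=2**14, r=8, p=1, dklen=32)
    return salt, key

def verify_password(password, salt, expected_key):
    """Recompute the key and compare in constant time."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected_key)

salt, key = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, key))  # True
print(verify_password("wrong guess", salt, key))                   # False
```

Unlike a bare MD5 call, scrypt is deliberately slow and memory-hungry, and the per-password salt prevents precomputed lookup tables.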
1996 — Dobbertin finds weaknesses in the MD5 compression function
2004 — Wang et al. demonstrate full MD5 collisions (about an hour of computation at the time)
2008 — Forged CA certificate using MD5 collision (Sotirov et al.)
2012 — Flame malware uses an MD5 chosen-prefix collision to forge a Microsoft code-signing certificate and spoof Windows Update
Now — Use SHA-256 or SHA-3 for all security uses