I used to think hashing was enough. SHA-256 the data, compare the hashes, done. Then I learned about HMAC and realized I'd been missing something fundamental: a plain hash only tells you the data matches the hash, while an HMAC also tells you the signature was made by someone who holds the secret key.
Here's the difference that clicked for me.
The Problem with Plain Hashes
Say you're building an API. You want to make sure data hasn't been tampered with. Your first instinct might be:
import hashlib

def sign_data(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

# Sign the message
message = b'{"user_id": 123, "action": "transfer", "amount": 100}'
signature = sign_data(message)

The problem? Anyone can compute that hash. If an attacker intercepts the message, they can:
- Modify the data (amount: 100 → amount: 10000)
- Recompute the hash
- Send the modified message with a valid signature
The hash doesn't prove the message came from you—it just proves the data matches some hash.
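To make the attack concrete, here's a minimal sketch of the three steps above. The `verify` helper and the variable names are mine, added only for illustration; the message is the same one used earlier.

```python
import hashlib

def sign_data(data: bytes) -> str:
    # Unkeyed hash: anyone who sees the data can compute this
    return hashlib.sha256(data).hexdigest()

def verify(data: bytes, signature: str) -> bool:
    # Hypothetical receiver-side check (no secret is involved here)
    return sign_data(data) == signature

original = b'{"user_id": 123, "action": "transfer", "amount": 100}'
original_sig = sign_data(original)

# The attacker changes the amount and simply recomputes the hash
tampered = original.replace(b'"amount": 100', b'"amount": 10000')
forged_sig = hashlib.sha256(tampered).hexdigest()

print(verify(tampered, forged_sig))    # True: the forgery is accepted
print(verify(tampered, original_sig))  # False: only helps if the hash itself can't be replaced
```

The check only catches tampering when the attacker can't also replace the hash, which is exactly what an API request travelling over the wire can't guarantee.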
Enter HMAC
HMAC (Hash-based Message Authentication Code) solves this by mixing in a secret key. Only someone who knows the key can create a valid signature.
import hmac
import hashlib
def sign_data(key: bytes, data: bytes) -> str:
    return hmac.new(key, data, hashlib.sha256).hexdigest()

# Now the signature requires the secret
secret_key = b'my-super-secret-key'
message = b'{"user_id": 123, "action": "transfer", "amount": 100}'
signature = sign_data(secret_key, message)

An attacker who modifies the message can't forge a valid signature without knowing the key. That's the whole point.
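Here's the same tampering attempt replayed against the keyed version, as a rough sketch. The `verify` helper and the attacker's `b'wrong-key'` guess are illustrative assumptions, not part of any real protocol.

```python
import hashlib
import hmac

def sign_data(key: bytes, data: bytes) -> str:
    return hmac.new(key, data, hashlib.sha256).hexdigest()

def verify(key: bytes, data: bytes, signature: str) -> bool:
    # Hypothetical receiver-side check: recompute with the shared key and compare
    expected = sign_data(key, data)
    return hmac.compare_digest(expected, signature)

secret_key = b'my-super-secret-key'  # placeholder; a real key should be random
message = b'{"user_id": 123, "action": "transfer", "amount": 100}'
signature = sign_data(secret_key, message)

# The attacker changes the amount but has to guess at the key
tampered = message.replace(b'"amount": 100', b'"amount": 10000')
attacker_sig = sign_data(b'wrong-key', tampered)

print(verify(secret_key, message, signature))      # True
print(verify(secret_key, tampered, signature))     # False: old signature no longer matches
print(verify(secret_key, tampered, attacker_sig))  # False: signed with the wrong key
```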
The hmac.new() API
The basic pattern is simple:
import hmac
import hashlib
# Create HMAC
h = hmac.new(
    key=b'secret',             # Your secret key (bytes)
    msg=b'message',            # The data to sign (bytes)
    digestmod=hashlib.sha256,  # Hash algorithm
)

# Get the signature
hex_signature = h.hexdigest()  # As hex string
raw_signature = h.digest()     # As raw bytes

You can also build it incrementally for large data:
h = hmac.new(b'secret', digestmod=hashlib.sha256)
h.update(b'first chunk')
h.update(b'second chunk')
h.update(b'third chunk')
signature = h.hexdigest()

The compare_digest Gotcha
Here's something that surprised me. This code looks fine:
def verify_signature(expected: str, provided: str) -> bool:
    return expected == provided  # DON'T DO THIS

But it's vulnerable to timing attacks. String comparison in Python returns False as soon as it finds a mismatch. An attacker can measure response times to guess the signature one character at a time.
The fix is hmac.compare_digest():
import hmac

def verify_signature(expected: str, provided: str) -> bool:
    return hmac.compare_digest(expected, provided)

This function takes constant time regardless of where the strings differ. Always use it when comparing secrets.
Here's my mental rule: If you're comparing anything security-sensitive, use compare_digest.
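One wrinkle worth knowing: `compare_digest` accepts two `str` arguments only when both are pure ASCII, and raises `TypeError` otherwise. Since the "provided" value usually comes from an untrusted header, one option is to compare bytes instead. The `safe_equal` helper below is a hypothetical name for that idea, sketched under the assumption that both inputs are ordinary text.

```python
import hmac

def safe_equal(expected: str, provided: str) -> bool:
    """Constant-time comparison that tolerates non-ASCII input (illustrative helper)."""
    # Encoding to bytes sidesteps the ASCII-only restriction on str arguments
    return hmac.compare_digest(expected.encode('utf-8'), provided.encode('utf-8'))

print(safe_equal('abc123', 'abc123'))  # True
print(safe_equal('abc123', 'abc124'))  # False
print(safe_equal('abc123', 'abç123'))  # False, and no exception

# The plain str form rejects non-ASCII input with an exception instead
try:
    hmac.compare_digest('abc123', 'abç123')
except TypeError as exc:
    print('TypeError:', exc)
```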
Common Use Case: Webhook Verification
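Most third-party services (GitHub, Stripe, Slack) sign their webhooks with HMAC, though each has its own header name and its own rules for what gets signed. The example here follows GitHub's scheme, where the X-Hub-Signature-256 header carries 'sha256=' followed by the hex HMAC-SHA256 of the raw request body: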
import hmac
import hashlib
from flask import Flask, request, abort
app = Flask(__name__)
WEBHOOK_SECRET = 'whsec_your_secret_here'
def verify_webhook(payload: bytes, signature: str, secret: str) -> bool:
    """Verify a webhook signature."""
    expected = hmac.new(
        secret.encode(),
        payload,
        hashlib.sha256
    ).hexdigest()
    # Many services prefix with 'sha256='
    if signature.startswith('sha256='):
        signature = signature[7:]
    # Compare as bytes: compare_digest raises TypeError on non-ASCII str input
    return hmac.compare_digest(expected.encode(), signature.encode())

@app.route('/webhook', methods=['POST'])
def handle_webhook():
    signature = request.headers.get('X-Hub-Signature-256', '')
    # get_data() returns the raw body bytes; verify exactly what was received
    if not verify_webhook(request.get_data(), signature, WEBHOOK_SECRET):
        abort(403)  # Forbidden
    # Process the webhook...
    return 'OK', 200

Without this verification, anyone could POST fake events to your endpoint.
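When testing a handler like this locally, it helps to produce the header value yourself. This is a small standard-library sketch, assuming the GitHub-style 'sha256=<hex>' format; `github_style_signature`, the secret, and the payload are all made up for the example, and this stricter `verify_webhook` variant expects the prefix to be present.

```python
import hashlib
import hmac

def github_style_signature(secret: str, payload: bytes) -> str:
    """Build a 'sha256=<hex>' header value (hypothetical helper for local tests)."""
    digest = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    return f'sha256={digest}'

def verify_webhook(payload: bytes, signature: str, secret: str) -> bool:
    # Recompute the full header value and compare in constant time
    expected = github_style_signature(secret, payload)
    return hmac.compare_digest(expected.encode(), signature.encode())

secret = 'whsec_your_secret_here'  # placeholder secret
body = b'{"event": "ping"}'        # made-up payload
header = github_style_signature(secret, body)

print(verify_webhook(body, header, secret))               # True
print(verify_webhook(body + b' ', header, secret))        # False: body changed by one byte
print(verify_webhook(body, header, 'some-other-secret'))  # False: wrong secret
```

The one-byte case is why the handler has to verify the raw bytes: re-serializing parsed JSON can change whitespace and break the signature.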
Common Use Case: API Authentication
For signing your own API requests:
import hmac
import hashlib
import time
class SignedAPIClient:
    def __init__(self, api_key: str, api_secret: str):
        self.api_key = api_key
        self.api_secret = api_secret.encode()

    def make_request(self, method: str, path: str, body: str = ''):
        timestamp = str(int(time.time()))
        # Build the string to sign
        message = f"{method}\n{path}\n{timestamp}\n{body}"
        signature = hmac.new(
            self.api_secret,
            message.encode(),
            hashlib.sha256
        ).hexdigest()
        headers = {
            'X-API-Key': self.api_key,
            'X-Timestamp': timestamp,
            'X-Signature': signature,
        }
        # Now make the HTTP request with these headers...
        return headers

# Usage
client = SignedAPIClient('my_key', 'my_secret')
headers = client.make_request('POST', '/api/orders', '{"item": "book"}')

On the server side, you'd reconstruct the message the same way and verify the signature matches.
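Here's roughly what that server-side check could look like. It's a sketch, not a spec: `verify_request`, `build_message`, and the five-minute `MAX_SKEW_SECONDS` window are my assumptions, and the `now` parameter exists only to make the function easy to test.

```python
import hashlib
import hmac
import time
from typing import Optional

MAX_SKEW_SECONDS = 300  # assumed tolerance; tune for your own clients

def build_message(method: str, path: str, timestamp: str, body: str) -> bytes:
    # Must match the client's string-to-sign exactly
    return f"{method}\n{path}\n{timestamp}\n{body}".encode()

def verify_request(api_secret: bytes, method: str, path: str, body: str,
                   timestamp: str, signature: str,
                   now: Optional[float] = None) -> bool:
    now = time.time() if now is None else now
    try:
        sent_at = int(timestamp)
    except ValueError:
        return False  # malformed timestamp header
    # Reject stale or far-future requests; this is the replay protection
    if abs(now - sent_at) > MAX_SKEW_SECONDS:
        return False
    expected = hmac.new(api_secret, build_message(method, path, timestamp, body),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode(), signature.encode())

# Simulate a client signing a request at a fixed time
secret = b'my_secret'
ts = '1700000000'
body = '{"item": "book"}'
sig = hmac.new(secret, build_message('POST', '/api/orders', ts, body),
               hashlib.sha256).hexdigest()

print(verify_request(secret, 'POST', '/api/orders', body, ts, sig, now=1700000010))  # True
print(verify_request(secret, 'POST', '/api/orders', body, ts, sig, now=1700000301))  # False: too old
print(verify_request(secret, 'POST', '/api/refunds', body, ts, sig, now=1700000010))  # False: path changed
```

Because the timestamp is part of the signed string, an attacker can't refresh it without invalidating the signature.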
Practical Pattern: Simple Signed Tokens
Need a lightweight token system? HMAC can do that:
import hmac
import hashlib
import json
import base64
import time
def create_token(secret: str, data: dict, expires_in: int = 3600) -> str:
    """Create a signed token with expiration."""
    payload = {
        **data,
        'exp': int(time.time()) + expires_in
    }
    # Encode the payload
    payload_json = json.dumps(payload, sort_keys=True)
    payload_b64 = base64.urlsafe_b64encode(payload_json.encode()).decode()
    # Sign it
    signature = hmac.new(
        secret.encode(),
        payload_b64.encode(),
        hashlib.sha256
    ).hexdigest()
    return f"{payload_b64}.{signature}"

def verify_token(secret: str, token: str) -> dict | None:
    """Verify token and return payload if valid."""
    try:
        payload_b64, signature = token.rsplit('.', 1)
    except ValueError:
        return None
    # Verify signature
    expected = hmac.new(
        secret.encode(),
        payload_b64.encode(),
        hashlib.sha256
    ).hexdigest()
    # Compare as bytes so a non-ASCII signature fails cleanly instead of raising TypeError
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        return None
    # Decode and check expiration
    payload_json = base64.urlsafe_b64decode(payload_b64).decode()
    payload = json.loads(payload_json)
    if payload.get('exp', 0) < time.time():
        return None  # Expired
    return payload

# Usage
token = create_token('my-secret', {'user_id': 42, 'role': 'admin'})
print(token)  # eyJleHAi... (sort_keys=True puts "exp" first)
payload = verify_token('my-secret', token)
if payload:
    print(f"User: {payload['user_id']}")

This is basically a stripped-down JWT. For production, you'd probably use a real JWT library, but understanding this helps you see what's happening under the hood.
HMAC vs Hash: When to Use Each
| Scenario | Use |
|---|---|
| Password storage | Neither! Use bcrypt/argon2 |
| File integrity (no attacker) | Plain hash |
| Webhook signatures | HMAC ✓ |
| API request signing | HMAC ✓ |
| Session tokens | HMAC ✓ |
| Verifying downloads | Plain hash (if hash is from trusted source) |
The key question: Does someone need to prove they know a secret? If yes, HMAC. If you just need to detect accidental corruption, a plain hash works.
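For the "verifying downloads" row, a plain hash really is all you need, as long as the published checksum comes from somewhere you trust. A minimal sketch, with `sha256_file` as an illustrative helper and a temporary file standing in for the download:

```python
import hashlib
import os
import tempfile

def sha256_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large downloads don't need to fit in memory."""
    h = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            h.update(chunk)
    return h.hexdigest()

# Stand-in for a downloaded file and the checksum published by a trusted source
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'hello world')
    demo_path = tmp.name

try:
    published_checksum = hashlib.sha256(b'hello world').hexdigest()
    matches = sha256_file(demo_path) == published_checksum
finally:
    os.remove(demo_path)

print(matches)  # True
```

No secret is involved, so there's nothing for an attacker to learn from timing and nothing an HMAC would add. The trust comes entirely from where the checksum was published.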
Which Hash Algorithm?
import hmac
import hashlib
key = b'secret'
msg = b'data'
# Use SHA-256 (recommended default)
hmac.new(key, msg, hashlib.sha256).hexdigest()
# SHA-512 for extra security margin
hmac.new(key, msg, hashlib.sha512).hexdigest()
# SHA-1 (legacy; avoid for new code)
hmac.new(key, msg, hashlib.sha1).hexdigest()
# MD5 (broken; don't use)
# hmac.new(key, msg, hashlib.md5).hexdigest()

Stick with SHA-256 unless you have a specific reason for something else.
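If you're wondering what changes in practice between the two recommended options, it's mostly the size of the output. A quick sketch using the same placeholder key and message; it also shows that `digestmod` accepts an algorithm name as a string, and that `hmac.digest()` gives the raw bytes in one call.

```python
import hashlib
import hmac

key = b'secret'
msg = b'data'

h256 = hmac.new(key, msg, hashlib.sha256)
h512 = hmac.new(key, msg, 'sha512')  # the algorithm name as a string also works

print(h256.digest_size, len(h256.hexdigest()))  # 32 64
print(h512.digest_size, len(h512.hexdigest()))  # 64 128

# One-shot form: returns the raw digest without creating an HMAC object
raw = hmac.digest(key, msg, 'sha256')
print(raw == h256.digest())  # True
```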
Quick Reference
import hmac
import hashlib
# Create signature
sig = hmac.new(key, message, hashlib.sha256).hexdigest()
# Verify (constant-time!)
is_valid = hmac.compare_digest(expected, provided)
# Incremental update
h = hmac.new(key, digestmod=hashlib.sha256)
h.update(chunk1)
h.update(chunk2)
sig = h.hexdigest()

What I Wish I'd Known Earlier
- HMAC isn't encryption: it doesn't hide data, it signs it
- Always use compare_digest: timing attacks are real
- Include timestamps and reject stale ones on the server: that's what prevents replay attacks
- Key length matters: use at least 256 bits of entropy
- Plan for rotation: you'll need to change keys eventually
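The last two points are easier to see in code. This is a rough sketch of one way to handle them: keys come from the `secrets` module, and during a rotation the verifier accepts signatures from either the current or the previous key. The `new_key`, `sign`, and `verify_any` names are mine, not a standard API.

```python
import hashlib
import hmac
import secrets

def new_key() -> bytes:
    # 32 random bytes = 256 bits of entropy from the OS CSPRNG
    return secrets.token_bytes(32)

def sign(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha256).digest()

def verify_any(keys: list[bytes], msg: bytes, tag: bytes) -> bool:
    """Accept a tag made with any key in the keyring (illustrative rotation helper)."""
    ok = False
    for key in keys:
        # Check every key rather than returning early on the first match
        ok |= hmac.compare_digest(sign(key, msg), tag)
    return ok

old_key = new_key()
current_key = new_key()
keyring = [current_key, old_key]  # keep the old key until its signatures expire

old_tag = sign(old_key, b'payload')
new_tag = sign(current_key, b'payload')

print(verify_any(keyring, b'payload', old_tag))        # True: old key still accepted
print(verify_any(keyring, b'payload', new_tag))        # True
print(verify_any([current_key], b'payload', old_tag))  # False once the old key is retired
```

New signatures always use the current key; the old one stays in the keyring only long enough for outstanding tokens or requests to age out.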
HMAC is one of those things that seems simple but has real security implications. Get it right once, and you'll use the same patterns everywhere.