CSE4303 Introduction to Computer Security (Lecture 9)
Cryptographic Hash Functions
What is a Hash Function
A hash function maps a variable-length input to a fixed-length output.
h : X \to Y
Typical examples:
- Java hashCode(): input is an Object, output is a 4-byte integer.
- String polynomial hash example:
h("cs433s") = 'c' \cdot 31^5 + 's' \cdot 31^4 + \dots + 's'
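A minimal sketch of the polynomial string hash, written in Python and modeled on the base-31 scheme of Java's String.hashCode(). The function name java_style_hash is illustrative, and the sketch assumes ASCII input (Java hashes UTF-16 code units, which coincide with Python code points only for such strings).

```python
def java_style_hash(s: str) -> int:
    """Polynomial hash with base 31, wrapped to a signed 32-bit integer."""
    h = 0
    for ch in s:
        # Horner's rule: multiply the running value by 31, add the next character,
        # and keep only the low 32 bits (the 4-byte output).
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit value as a signed int, as Java would.
    return h - (1 << 32) if h >= (1 << 31) else h


if __name__ == "__main__":
    print(java_style_hash("cs433s"))
    # "Aa" and "BB" collide: 65*31 + 97 == 66*31 + 66 == 2112
    print(java_style_hash("Aa"), java_style_hash("BB"))
```

The "Aa" / "BB" pair shows how easy collisions are for a non-cryptographic hash of this kind.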
Key property:
- Domain |X| is much larger than range |Y|.
- Collisions are unavoidable in principle since |X| > |Y|.
Main uses:
- Compact numerical representation
- Hash tables (Set, Map, dictionaries)
- Object comparison
- Integrity checking (fingerprint)
Security Properties
Let h : X \to Y.
- Preimage Resistance (One-way)
Given y \in Y, it is computationally infeasible to find x \in X such that h(x) = y.
- Second Preimage Resistance (Weak collision resistance)
Given a specific x \in X, it is computationally infeasible to find x' \neq x such that h(x') = h(x).
- Collision Resistance (Strong collision resistance)
It is computationally infeasible to find any two distinct values x, x' \in X such that h(x) = h(x').
Adversarial definition:
Let H : M \to T where |M| is much larger than |T|.
H is collision resistant if for all efficient algorithms A:
Adv_{CR}[A, H] = Pr[A outputs a collision for H]
is negligible.
Generic Collision Attack (Birthday Attack)
Let H : M \to \{0,1\}^n.
Generic algorithm to find a collision in time on the order of 2^{n/2}:
- Choose 2^{n/2} random messages m_1, \dots, m_{2^{n/2}}.
- Compute t_i = H(m_i).
- Look for t_i = t_j with i \neq j.
Birthday phenomenon:
If the output space has size B,
about 1.2 \sqrt{B} uniformly random samples suffice for a collision probability greater than 50\%.
Thus:
- 128-bit hash gives about 2^{64} collision attack
- 256-bit hash gives about 2^{128} collision attack
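The generic attack can be tried on a deliberately weakened hash. The sketch below truncates SHA-256 to n bits (an assumption made only so the search finishes quickly) and hashes distinct messages until two tags match. It uses counter-derived messages rather than random ones so that runs are repeatable; the function names are illustrative.

```python
import hashlib


def truncated_hash(message: bytes, n_bits: int) -> int:
    """First n_bits of SHA-256(message), returned as an integer."""
    digest = int.from_bytes(hashlib.sha256(message).digest(), "big")
    return digest >> (256 - n_bits)


def find_collision(n_bits: int):
    """Hash distinct messages until two of them share a truncated hash."""
    seen = {}  # truncated hash -> message that produced it
    counter = 0
    while True:
        message = counter.to_bytes(8, "big")
        tag = truncated_hash(message, n_bits)
        if tag in seen:
            # Two different messages, same n-bit output
            return seen[tag], message, counter + 1
        seen[tag] = message
        counter += 1


if __name__ == "__main__":
    m1, m2, tries = find_collision(24)
    print(f"collision after {tries} hashes; 2^(24/2) = {2 ** 12}")
```

For a 24-bit output the number of hashes is typically a few thousand, in line with the 2^{n/2} estimate; the table of seen tags also shows that this simple version needs memory on the same order.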
Practical Hash Functions
From performance and security table (AMD Opteron 2.2 GHz):
- MD5: 128 bits, collision resistance broken since 2004
- SHA-1: 160 bits, practical collision attack demonstrated
- SHA-256: 256 bits
- SHA-512: 512 bits
- Whirlpool: 512 bits
SHA-1 collision example: SHAttered attack (Google and CWI).
Two different PDF files were produced with identical SHA-1 hash.
Construction of Cryptographic Hash Functions
Merkle-Damgard Construction
Given compression function:
h : T \times X \to T
We build:
H : X^{\le L} \to T
Process:
- Apply padding: append 1000\ldots0 concatenated with the message length (64 bits). If no space remains in the last block, add another block.
- Split the padded message into blocks m[0], m[1], \dots, m[L-1].
- Use a fixed initialization vector IV.
- Iterate chaining:
H_0 = IV
H_1 = h(H_0, m[0])
H_2 = h(H_1, m[1])
\dots
H_L = h(H_{L-1}, m[L-1])
- Output H(m) = H_L.
Theorem:
If compression function h is collision resistant,
then H is collision resistant.
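A minimal sketch of the Merkle-Damgard iteration in Python. The compression function here is a stand-in built from SHA-256 purely for illustration (a real design defines its own h), and the all-zero IV, the 64-byte block size, and the names compress, md_pad, and md_hash are assumptions of this sketch.

```python
import hashlib

BLOCK_SIZE = 64      # bytes per message block (assumed)
IV = bytes(32)       # fixed initialization vector (assumed all-zero)


def compress(chaining: bytes, block: bytes) -> bytes:
    """Stand-in compression function h : T x X -> T (illustrative only)."""
    return hashlib.sha256(chaining + block).digest()


def md_pad(message: bytes) -> bytes:
    """Append 0x80, zero bytes, then the 64-bit message length in bits."""
    bit_length = 8 * len(message)
    padded = message + b"\x80"
    # Zero-fill so that, after the 8-byte length, the total is a block multiple.
    padded += bytes((-len(padded) - 8) % BLOCK_SIZE)
    return padded + bit_length.to_bytes(8, "big")


def md_hash(message: bytes) -> bytes:
    """Chain the compression function over the padded message blocks."""
    padded = md_pad(message)
    state = IV  # H_0 = IV
    for i in range(0, len(padded), BLOCK_SIZE):
        state = compress(state, padded[i:i + BLOCK_SIZE])  # H_{i+1} = h(H_i, m[i])
    return state


if __name__ == "__main__":
    print(md_hash(b"hello").hex())
    # A 56-byte message leaves no room for the length field, so a second block is added.
    print(len(md_pad(b"a" * 55)), len(md_pad(b"a" * 56)))
```

Including the length in the padding is what lets the collision-resistance proof go through: two colliding messages of different lengths would already differ in their final blocks.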
Davies-Meyer Compression from Block Cipher
Given block cipher:
E : K \times \{0,1\}^n \to \{0,1\}^n
Define compression function:
h(H, m) = E(m, H) \oplus H
If E behaves like an ideal cipher,
finding a collision in h takes about 2^{n/2} evaluations.
This matches the generic birthday bound, so it is optimal for an n-bit output.
Example: SHA-256
Built using:
- Merkle-Damgard construction
- Davies-Meyer style compression
- Block cipher-like core: SHACAL-2
Structure:
- 512-bit message block
- 256-bit chaining value
- 256-bit output
Applications for Integrity and Authentication
Standalone Usage: Message Integrity
Application 1: Delayed Knowledge Verification
Idea:
Publish h(secret) first.
Later reveal secret.
Anyone can recompute hash and verify.
Justification: Preimage resistance keeps the secret hidden until it is revealed; collision resistance prevents the publisher from later revealing a different secret.
Example: Stock market prediction commitment.
Example for delayed knowledge verification
- Publish H("Stock will rise on May 1").
- On May 1, reveal the prediction string.
- Anyone computes hash and checks equality.
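The publish-then-reveal steps above can be sketched with SHA-256 from Python's hashlib. One addition beyond the notes, flagged as an assumption: the sketch prepends a random nonce, because a short, guessable prediction could otherwise be found by hashing candidate strings. The names commit and verify are illustrative.

```python
import hashlib
import hmac
import secrets


def commit(prediction: str):
    """Return (digest to publish now, nonce to reveal later with the prediction)."""
    nonce = secrets.token_bytes(16)  # assumption: random nonce hides guessable secrets
    digest = hashlib.sha256(nonce + prediction.encode("utf-8")).hexdigest()
    return digest, nonce


def verify(published_digest: str, nonce: bytes, prediction: str) -> bool:
    """Recompute the hash from the revealed values and compare with the published one."""
    recomputed = hashlib.sha256(nonce + prediction.encode("utf-8")).hexdigest()
    return hmac.compare_digest(recomputed, published_digest)


if __name__ == "__main__":
    digest, nonce = commit("Stock will rise on May 1")
    print("published:", digest)
    print("valid reveal:", verify(digest, nonce, "Stock will rise on May 1"))
    print("altered reveal:", verify(digest, nonce, "Stock will fall on May 1"))
```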
Application 2: Password Storage
Model: System must verify password but not store plaintext.
Solution:
Store hash of password.
During login:
- Hash input
- Compare with stored value
Example:
Linux stores hashed passwords in the /etc/shadow file.
Includes:
- Salt
- Password hash
- Metadata
Security relies on:
- One-way property
- Salting to prevent precomputed attacks
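A minimal sketch of salted password storage. It uses PBKDF2-HMAC-SHA256 from hashlib as one reasonable choice; this is an assumption of the sketch, not a description of the scheme a given Linux system uses in /etc/shadow. The iteration count and function names are illustrative.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # assumed work factor; tune for the deployment


def hash_password(password: str, iterations: int = ITERATIONS):
    """Return (salt, digest); both are stored, the plaintext is not."""
    salt = secrets.token_bytes(16)  # fresh random salt per user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def check_password(password: str, salt: bytes, stored: bytes,
                   iterations: int = ITERATIONS) -> bool:
    """Hash the login attempt with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, stored)


if __name__ == "__main__":
    salt, stored = hash_password("correct horse")
    print(check_password("correct horse", salt, stored))
    print(check_password("wrong guess", salt, stored))
```

Because each user gets a different salt, two users with the same password have different stored hashes, and a precomputed table for one salt is useless for another.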
Application 3: Trusted Timestamping and Blockchains
Goal: Prove document existed before a given date.
Methods:
- Publish document hash in newspaper.
- Time Stamping Authority signs hash.
- Publish hash in blockchain block.
Blockchain relies on:
- One-way hash functions
- Linking blocks via hash pointers
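Linking blocks via hash pointers can be sketched as a simple hash chain. This is a toy model under stated assumptions: blocks are Python dictionaries, there is no proof-of-work or signature, and the names block_hash, build_chain, and verify_chain are illustrative.

```python
import hashlib

GENESIS_PREV = "0" * 64  # assumed placeholder pointer for the first block


def block_hash(prev_hash: str, data: str) -> str:
    """Hash of a block: covers both its data and its pointer to the previous block."""
    return hashlib.sha256((prev_hash + "|" + data).encode("utf-8")).hexdigest()


def build_chain(items):
    """Return (list of blocks, hash of the last block)."""
    chain = []
    prev = GENESIS_PREV
    for data in items:
        chain.append({"prev_hash": prev, "data": data})
        prev = block_hash(prev, data)
    return chain, prev


def verify_chain(chain, head_hash: str) -> bool:
    """Recompute every hash pointer, ending at the trusted head hash."""
    prev = GENESIS_PREV
    for block in chain:
        if block["prev_hash"] != prev:
            return False
        prev = block_hash(block["prev_hash"], block["data"])
    return prev == head_hash


if __name__ == "__main__":
    chain, head = build_chain(["doc hash A", "doc hash B", "doc hash C"])
    print(verify_chain(chain, head))
    chain[1]["data"] = "forged"      # tamper with a middle block
    print(verify_chain(chain, head))
```

Only the head hash needs to be stored in a trusted place; changing any earlier block changes every later pointer.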
Application 4: Software Integrity with Secure Read-Only Space
Context: Trusted read-only public space (for example official website).
Process:
- Publisher computes H(F_1), H(F_2), \dots, H(F_n).
- Publish hashes publicly.
- User downloads file F_i and verifies hash.
If H is collision resistant:
Attacker cannot modify file without detection.
No encryption required.
Public verifiability works if read-only space is trusted.
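The user-side check can be sketched as follows, assuming SHA-256 as H and a hex digest copied from the trusted read-only page. The function names and the chunk size are illustrative.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_download(path: str, published_hex: str) -> bool:
    """Compare the local file's hash with the publisher's hash."""
    return hmac.compare_digest(sha256_of_file(path), published_hex.strip().lower())
```

A typical use is to call verify_download on the downloaded file with the digest shown on the official site, and to refuse to install the file if the result is False.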
Symmetric Crypto Authentication: MACs and AE
Message Authentication Codes (MACs)
Definition:
MAC I = (S, V) over (K, M, T)
S(k, m) \to t
V(k, m, t) \to yes or no
Security model:
Attacker can query S(k, m_i).
Goal: produce new (m, t) not previously seen such that V accepts.
Adv_{MAC}[A, I] must be negligible.
MAC from PRF
Given PRF:
F : K \times X \to Y
Define MAC:
S(k, m) = F(k, m)
V(k, m, t) accepts if t = F(k, m)
Theorem:
If F is a secure PRF and |Y| is large,
then the derived MAC is secure.
Condition:
1 / |Y| must be negligible.
Example: |Y| = 2^{80}.
MACs from Hash Functions
Construction:
S_{big}(k, m) = S(k, H(m))
V_{big}(k, m, t) = V(k, H(m), t)
If:
- S is a secure MAC for short messages
- H is collision resistant
Then S_{big} is a secure MAC.
If collision exists:
If H(m_0) = H(m_1),
query tag for m_0,
forge (m_1, t).
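The forgery can be demonstrated when H is weak. In the sketch below, weak_hash is SHA-256 truncated to 16 bits (a deliberately bad choice, assumed only so a collision is easy to find), HMAC-SHA256 plays the role of the short-message MAC S, and all names are illustrative.

```python
import hashlib
import hmac


def weak_hash(message: bytes) -> bytes:
    """H: SHA-256 truncated to 2 bytes, so collisions are easy to find."""
    return hashlib.sha256(message).digest()[:2]


def sign_big(key: bytes, message: bytes) -> bytes:
    """S_big(k, m) = S(k, H(m)), with HMAC-SHA256 standing in for S."""
    return hmac.new(key, weak_hash(message), hashlib.sha256).digest()


def verify_big(key: bytes, message: bytes, tag: bytes) -> bool:
    return hmac.compare_digest(sign_big(key, message), tag)


def find_weak_collision():
    """Return two distinct messages with the same weak_hash."""
    seen = {}
    i = 0
    while True:
        m = b"msg-%d" % i
        d = weak_hash(m)
        if d in seen:
            return seen[d], m
        seen[d] = m
        i += 1


if __name__ == "__main__":
    key = b"secret key unknown to the attacker"
    m0, m1 = find_weak_collision()   # needs no key
    t = sign_big(key, m0)            # one legitimate signing query
    print(m0, m1, verify_big(key, m1, t))  # the tag also verifies for m1
```

The attacker never learns the key: the collision search is offline, and a single signing query on m_0 yields a valid tag for m_1.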
HMAC
HMAC(k, m) = H((k \oplus opad) \| H((k \oplus ipad) \| m))
Used in:
- TLS
- IPsec
- SSH
Properties:
- Built from hash function (for example SHA-256)
- Provably secure under PRF assumptions
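The HMAC formula can be written out directly and checked against Python's hmac module. This sketch fixes H to SHA-256 with its 64-byte block; ipad and opad are the standard constants 0x36 and 0x5c repeated across the block, and keys are zero-padded (or first hashed, if longer than a block) as in the HMAC specification.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-256 block size in bytes


def hmac_sha256(key: bytes, message: bytes) -> bytes:
    """HMAC(k, m) = H((k XOR opad) || H((k XOR ipad) || m)) with H = SHA-256."""
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha256(key).digest()   # long keys are hashed first
    key = key.ljust(BLOCK_SIZE, b"\x00")     # then zero-padded to one block
    ipad_key = bytes(b ^ 0x36 for b in key)
    opad_key = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(ipad_key + message).digest()
    return hashlib.sha256(opad_key + inner).digest()


if __name__ == "__main__":
    key, msg = b"shared key", b"attack at dawn"
    ours = hmac_sha256(key, msg)
    reference = hmac.new(key, msg, hashlib.sha256).digest()
    print(ours.hex())
    print(ours == reference)
```

In practice the library function should be used; the hand-written version is only meant to make the nested structure visible.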
Timing Attacks on MAC Verification
Problem: Byte-by-byte comparison leaks timing information.
Attack:
- Send random tag.
- Guess first byte.
- Detect timing increase.
- Repeat per byte.
Defense 1: Constant-time comparison loop.
Defense 2:
Double-HMAC comparison:
Compare HMAC(k, mac) with HMAC(k, sig).
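The leak and both defenses can be sketched in Python. The function names are illustrative, and a pure-Python loop gives no hard timing guarantee, so production code would normally call hmac.compare_digest from the standard library.

```python
import hashlib
import hmac


def leaky_equal(a: bytes, b: bytes) -> bool:
    """Byte-by-byte comparison: returns at the first mismatch, which leaks timing."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False
    return True


def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Defense 1: always scan every byte and accumulate the differences."""
    if len(a) != len(b):
        return False
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y
    return diff == 0


def double_hmac_equal(key: bytes, mac: bytes, sig: bytes) -> bool:
    """Defense 2: compare HMAC(k, mac) with HMAC(k, sig) instead of the raw tags."""
    left = hmac.new(key, mac, hashlib.sha256).digest()
    right = hmac.new(key, sig, hashlib.sha256).digest()
    # The attacker cannot predict these bytes, so a leaky comparison reveals little.
    return left == right


if __name__ == "__main__":
    key = b"verification key"
    good = hmac.new(key, b"message", hashlib.sha256).digest()
    guess = bytes(32)
    print(constant_time_equal(good, good), constant_time_equal(good, guess))
    print(double_hmac_equal(key, good, good), double_hmac_equal(key, good, guess))
```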
Authenticated Encryption (AE)
AE provides:
- Confidentiality (CPA security)
- Ciphertext integrity
Cipher:
E : K \times M \times N \to C
D : K \times C \times N \to M \cup \{\bot\}
Ciphertext integrity: Attacker cannot produce new valid ciphertext.
Theorem: AE implies CCA security.
Implication:
If D(k, c, n) \neq \bot,
the receiver knows the sender had the key.
Encrypt-then-MAC
Correct construction:
- Compute c = E(k_E, m)
- Compute tag = S(k_I, c)
- Send (c, tag)
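Encrypt-then-MAC is the ordering that always yields authenticated encryption, provided E is CPA secure, S is a secure MAC, and k_E and k_I are independent keys.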
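A minimal sketch of the Encrypt-then-MAC ordering. The Python standard library ships no block cipher, so encryption here is a toy counter-mode stream built from HMAC-SHA256 used as a PRF; that cipher, the 16-byte nonce, and all function names are assumptions for illustration, not a vetted design. The MAC is HMAC-SHA256 under a separate key.

```python
import hashlib
import hmac
import secrets

NONCE_LEN = 16


def _keystream(k_enc: bytes, nonce: bytes, length: int) -> bytes:
    """Toy CTR keystream: PRF(k_enc, nonce || counter) for counter = 0, 1, ..."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hmac.new(k_enc, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return bytes(out[:length])


def encrypt_then_mac(k_enc: bytes, k_mac: bytes, message: bytes):
    """c = E(k_E, m), then tag = S(k_I, c); the tag covers the nonce as well."""
    nonce = secrets.token_bytes(NONCE_LEN)
    body = bytes(m ^ s for m, s in zip(message, _keystream(k_enc, nonce, len(message))))
    c = nonce + body
    tag = hmac.new(k_mac, c, hashlib.sha256).digest()
    return c, tag


def verify_then_decrypt(k_enc: bytes, k_mac: bytes, c: bytes, tag: bytes):
    """Check the tag first; return None (playing the role of bottom) if it fails."""
    expected = hmac.new(k_mac, c, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        return None
    nonce, body = c[:NONCE_LEN], c[NONCE_LEN:]
    return bytes(x ^ s for x, s in zip(body, _keystream(k_enc, nonce, len(body))))


if __name__ == "__main__":
    k_enc, k_mac = secrets.token_bytes(32), secrets.token_bytes(32)
    c, tag = encrypt_then_mac(k_enc, k_mac, b"wire $100 to Bob")
    print(verify_then_decrypt(k_enc, k_mac, c, tag))
    tampered = c[:-1] + bytes([c[-1] ^ 1])
    print(verify_then_decrypt(k_enc, k_mac, tampered, tag))
```

Because the tag is checked before any decryption happens, a modified ciphertext is rejected without the receiver ever processing attacker-chosen plaintext.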
AE Standards
- GCM: CTR mode encryption then polynomial MAC
- CCM: CBC-MAC then CTR mode encryption
- EAX: CTR mode encryption then CMAC
All support AEAD:
Authenticated Encryption with Associated Data.
Example: authenticate packet headers but do not encrypt them.
Asymmetric Crypto Authentication: Digital Signatures
Motivation
Goal: Bind document to author.
Digital problem: Anyone can copy a visible signature from one document to another.
Solution: Make signature depend on document contents.
Digital Signature Scheme
Components:
- Secret signing key sk
- Public verification key pk
Sign(sk, m) \to signature
Verify(pk, m, sig) \to accept or reject
Property:
Anyone can verify.
Only signer can produce valid signature.
Signing a Certificate
Process:
- Compute hash of data.
- Sign hash with secret key.
- Attach signature to data.
Verification:
- Compute hash of received data.
- Verify signature using public key.
- Accept if the signature is valid for the recomputed hash.
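The hash-then-sign flow can be illustrated with textbook RSA. This is a toy under loud assumptions: the modulus is the product of two small Mersenne primes, there is no padding scheme, and the hash is simply reduced modulo N, so it is insecure and only shows the data flow. Real systems use a vetted library with a standard signature scheme. All names are illustrative.

```python
import hashlib

# Toy key material (assumption: far too small and too structured for real use)
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent: pk = (N, E)
D = pow(E, -1, (P - 1) * (Q - 1))      # secret exponent: sk = D (Python 3.8+)


def _digest_as_int(data: bytes) -> int:
    """Hash the data and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """Compute the hash of the data, then sign the hash with the secret key."""
    return pow(_digest_as_int(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """Recompute the hash and check the signature against it with the public key."""
    return pow(signature, E, N) == _digest_as_int(data)


if __name__ == "__main__":
    cert = b"subject=example.org; key=..."
    sig = sign(cert)
    print(verify(cert, sig))
    print(verify(b"subject=evil.example; key=...", sig))
```

Signing the hash rather than the data keeps the signature operation short and fixed-size regardless of how large the document is.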
Software Signing
Software vendor:
- Signs update with secret key.
- Publishes update and signature.
Clients:
- Use vendor public key.
- Verify signature.
- Install only if valid.
Allows distribution via untrusted hosting site.
Review: Three Approaches to Data Integrity
- Collision resistant hashing
Requires secure read-only public space.
No secret keys.
Suitable for public verification.
- MACs
Requires shared secret key.
Must compute new MAC per user.
Suitable when one signs and one verifies.
- Digital signatures
Requires long-term secret key.
Public verification.
Suitable when one signs and many verify.
Crypto Summary
Cryptographic goals:
- Confidentiality
- Data integrity
- Authentication
- Non-repudiation
Primitives:
- Hash functions
- MACs
- Digital signatures
- Symmetric ciphers
- Public key ciphers