Trance-0 571efa1bad update
2026-02-12 12:49:16 -06:00


# CSE4303 Introduction to Computer Security (Lecture 9)
## Cryptographic Hash Functions
### What is a Hash Function
A hash function maps a variable-length input to a fixed-length output.
$h : X \to Y$
Typical examples:
- Java hashCode(): input is an Object, output is a 4-byte integer.
- String polynomial hash example:
$h("cs433s") = 'c' \cdot 31^5 + 's' \cdot 31^4 + \dots + 's'$
Key property:
- The domain size $|X|$ is much larger than the range size $|Y|$.
- Collisions are unavoidable in principle: since $|X| > |Y|$, at least two inputs must share an output (pigeonhole principle).
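As a minimal sketch of the polynomial string hash above, assuming it follows the rule used by Java's `String.hashCode()` (multiplier 31, wrap-around to a signed 32-bit integer), the function below reproduces it for ASCII strings. The name `java_string_hash` is hypothetical. It also shows a concrete collision between two short strings.

```python
def java_string_hash(s: str) -> int:
    """Polynomial hash with multiplier 31, wrapped to a signed 32-bit int.

    Assumption: characters are ASCII, so ord() matches Java's UTF-16 code units.
    """
    h = 0
    for ch in s:
        # Horner's rule: h = s[0]*31^(n-1) + ... + s[n-1], reduced modulo 2^32
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit value as signed, as a Java int would be
    return h - (1 << 32) if h >= (1 << 31) else h


# Two different inputs with the same hash value (a collision)
print(java_string_hash("Aa"), java_string_hash("BB"))  # 2112 2112
```

A collision this easy to find is harmless for hash tables but rules the function out for any security use.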
Main uses:
- Compact numerical representation
- Hash tables (Set, Map, dictionaries)
- Object comparison
- Integrity checking (fingerprint)
### Security Properties
Let $h : X \to Y$.
1. Preimage Resistance (One-way)
Given $y \in Y$, it is computationally infeasible to find $x \in X$ such that
$h(x) = y$.
2. Second Preimage Resistance (Weak collision resistance)
Given a specific $x \in X$, it is computationally infeasible to find $x' \neq x$ such that
$h(x') = h(x)$.
3. Collision Resistance (Strong collision resistance)
It is computationally infeasible to find any two distinct values $x, x' \in X$ such that
$h(x) = h(x')$.
Adversarial definition:
Let $H : M \to T$ where $|M|$ is much larger than $|T|$.
$H$ is collision resistant if for all efficient algorithms $A$:
$Adv_{CR}[A, H] = \Pr[A \text{ outputs a collision for } H]$
is negligible.
### Generic Collision Attack (Birthday Attack)
Let $H : M \to \{0,1\}^n$.
Generic algorithm to find a collision in time on the order of $2^{n/2}$:
1. Choose $2^{n/2}$ random messages $m_1, \dots, m_{2^{n/2}}$.
2. Compute $t_i = H(m_i)$.
3. Look for $t_i = t_j$.
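The generic attack can be tried on a deliberately weakened hash. The sketch below assumes a toy hash made by truncating SHA-256 to `n_bits` bits (the names `truncated_hash` and `birthday_collision` are hypothetical). It is a slight variant of the steps above: instead of sampling all $2^{n/2}$ messages up front, it samples one message at a time and stops at the first repeated tag.

```python
import hashlib
import os


def truncated_hash(msg: bytes, n_bits: int) -> int:
    """First n_bits of SHA-256, used as a small toy hash H: M -> {0,1}^n."""
    digest = hashlib.sha256(msg).digest()
    return int.from_bytes(digest, "big") >> (256 - n_bits)


def birthday_collision(n_bits: int = 24):
    """Sample random messages until two different ones share a tag."""
    seen = {}  # tag -> message that produced it
    while True:
        m = os.urandom(16)
        t = truncated_hash(m, n_bits)
        if t in seen and seen[t] != m:
            # Return both messages and the number of samples drawn so far
            return seen[t], m, len(seen) + 1
        seen[t] = m


m1, m2, samples = birthday_collision(24)
print(m1.hex(), m2.hex(), samples)  # sample count is typically a few thousand
```

For a 24-bit output, $2^{n/2} = 2^{12} = 4096$, which is the order of magnitude the sample count should land near.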
Birthday phenomenon:
If the output space has size $B$,
then about $1.2\sqrt{B}$ random samples suffice for a collision probability above $50\%$.
Thus:
- A 128-bit hash admits a generic collision attack in about $2^{64}$ evaluations.
- A 256-bit hash admits a generic collision attack in about $2^{128}$ evaluations.
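As a rough worked check of the birthday estimate (a standard approximation, given here as a sketch rather than a tight bound), drawing $k$ independent uniform samples from a space of size $B$ gives:
$\Pr[\text{collision}] \approx 1 - e^{-k^2 / (2B)}$
Setting $k = 1.2\sqrt{B}$ gives $k^2 / (2B) = 0.72$ and
$1 - e^{-0.72} \approx 0.51$
so for $B = 2^n$ on the order of $2^{n/2}$ samples are enough.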
### Practical Hash Functions
From the performance and security table (AMD Opteron, 2.2 GHz):
- MD5: 128 bits, completely broken since 2004
- SHA-1: 160 bits, practical collision attack demonstrated
- SHA-256: 256 bits
- SHA-512: 512 bits
- Whirlpool: 512 bits
SHA-1 collision example: the SHAttered attack (Google and CWI, 2017)
produced two different PDF files with the same SHA-1 hash.
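The output lengths listed above can be checked against Python's `hashlib`. This is only a sketch: Whirlpool is left out because it is not among the algorithms `hashlib` guarantees, and MD5 may be disabled on some locked-down builds. The "generic collision cost" column is just the birthday bound $2^{n/2}$, not the cost of the best known attack.

```python
import hashlib

# Digest length in bits, and the generic (birthday) collision cost 2^(n/2)
for name in ("md5", "sha1", "sha256", "sha512"):
    bits = hashlib.new(name).digest_size * 8
    print(f"{name}: {bits} bits, generic collision cost about 2^{bits // 2}")
```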
## Construction of Cryptographic Hash Functions
### Merkle-Damgard Construction
Given compression function:
$h : T \times X \to T$
We build:
$H : X^{\le L} \to T$
Process:
- Pad the message: append $1000\ldots0$ followed by the message length (64 bits).
If the last block has no room for the padding, add another block.
- Split the padded message into blocks $m[0], m[1], \dots, m[L-1]$.
- Use fixed initialization vector $IV$.
- Iterate chaining:
$H_0 = IV$
$H_1 = h(H_0, m[0])$
$H_2 = h(H_1, m[1])$
$\dots$
$H_L = h(H_{L-1}, m[L-1])$
- Output $H(m) = H_L$.
Theorem:
If compression function $h$ is collision resistant,
then $H$ is collision resistant.
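The construction can be sketched in a few lines. Everything specific here is an assumption made for illustration: the compression function `compress` is a stand-in built from SHA-256 (not the real SHA-256 internals), the block size is 64 bytes, and the all-zero `IV` is hypothetical. Only the padding rule and the chaining loop mirror the description above.

```python
import hashlib
import struct

BLOCK = 64       # bytes per message block (assumed)
IV = bytes(32)   # hypothetical all-zero initialization vector


def compress(chaining: bytes, block: bytes) -> bytes:
    """Stand-in compression function h: T x X -> T (illustrative only)."""
    return hashlib.sha256(chaining + block).digest()


def md_pad(msg: bytes) -> bytes:
    """Append 1000...0 and the 64-bit message length, filling whole blocks."""
    bit_len = 8 * len(msg)
    padded = msg + b"\x80"  # a single 1 bit followed by seven 0 bits
    # Zero-fill so that the 8-byte length field ends exactly on a block boundary
    padded += b"\x00" * ((-len(padded) - 8) % BLOCK)
    return padded + struct.pack(">Q", bit_len)


def md_hash(msg: bytes) -> bytes:
    """Iterate the compression function over the padded blocks."""
    h = IV
    padded = md_pad(msg)
    for i in range(0, len(padded), BLOCK):
        h = compress(h, padded[i:i + BLOCK])
    return h


print(len(md_pad(b"a" * 55)))  # 64: padding fits in the same block
print(len(md_pad(b"a" * 56)))  # 128: no room left, so an extra block is added
```

Including the message length in the padding is what makes the reduction in the theorem go through: two colliding messages of different lengths must already differ in their final blocks.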
### Davies-Meyer Compression from Block Cipher
Given block cipher:
$E : K \times \{0,1\}^n \to \{0,1\}^n$
Define compression function:
$h(H, m) = E(m, H) \oplus H$
If $E$ behaves like an ideal cipher,
finding a collision in $h$ takes about $2^{n/2}$ evaluations.
This matches the generic birthday bound, so it is the best possible for an $n$-bit output.
### Example: SHA-256
Built using:
- Merkle-Damgard construction
- Davies-Meyer style compression
- Block cipher-like core: SHACAL-2
Structure:
- 512-bit message block
- 256-bit chaining value
- 256-bit output
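The block and output sizes above can be read directly off Python's `hashlib` implementation of SHA-256, shown here on the standard test input `abc`:

```python
import hashlib

h = hashlib.sha256(b"abc")
print(h.block_size * 8)   # 512 (message block size in bits)
print(h.digest_size * 8)  # 256 (output size in bits)
print(h.hexdigest())
# ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
```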
## Applications for Integrity and Authentication
### Standalone Usage: Message Integrity
#### Application 1: Delayed Knowledge Verification
Idea:
Publish $h(secret)$ first.
Later reveal secret.
Anyone can recompute hash and verify.
Justification:
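Preimage resistance keeps the secret hidden until it is revealed, provided the secret is hard to guess.
Collision resistance prevents the publisher from later revealing a different secret with the same hash.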
Example:
Stock market prediction commitment.
<details>
<summary>Example for delayed knowledge verification</summary>
1. Publish $H("Stock will rise on May 1")$.
2. On May 1, reveal the prediction string.
3. Anyone computes hash and checks equality.
</details>
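The publish-then-reveal flow can be sketched as follows, assuming SHA-256 as the hash and hypothetical helper names `commit` and `verify`:

```python
import hashlib
import hmac


def commit(secret: str) -> str:
    """Value to publish ahead of time: the hex digest of the secret."""
    return hashlib.sha256(secret.encode("utf-8")).hexdigest()


def verify(commitment: str, revealed: str) -> bool:
    """Recompute the hash of the revealed string and compare."""
    return hmac.compare_digest(commitment, commit(revealed))


published = commit("Stock will rise on May 1")
print(verify(published, "Stock will rise on May 1"))  # True
print(verify(published, "Stock will fall on May 1"))  # False
```

A short, guessable prediction like this one could be recovered by hashing candidate strings, so in practice a random value would likely be appended to the secret before hashing. That detail is outside what the notes describe.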
#### Application 2: Password Storage
Model:
System must verify password but not store plaintext.
Solution:
Store hash of password.
During login:
- Hash input
- Compare with stored value
Example:
Linux stores hashed passwords in the /etc/shadow file.
Includes:
- Salt
- Password hash
- Metadata
Security relies on:
- One-way property
- Salting to prevent precomputed attacks
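A minimal sketch of salted password storage and login checking. Note the substitution: instead of a single plain hash, it uses PBKDF2-HMAC-SHA256 from `hashlib`, a deliberately slow hash-based construction. The iteration count and the helper names are assumptions chosen for illustration, and this is not the exact format used in /etc/shadow.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # hypothetical work factor chosen for illustration


def make_record(password: str):
    """Return (salt, digest) to store in place of the plaintext password."""
    salt = os.urandom(16)  # fresh random salt per user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def check_password(password: str, salt: bytes, stored: bytes) -> bool:
    """Hash the login attempt with the stored salt and compare."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, stored)


salt, stored = make_record("correct horse")
print(check_password("correct horse", salt, stored))  # True
print(check_password("wrong guess", salt, stored))    # False
```

Because each user gets a different salt, two users with the same password end up with different stored digests, which is what defeats precomputed tables.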
#### Application 3: Trusted Timestamping and Blockchains
Goal:
Prove document existed before a given date.
Methods:
- Publish document hash in newspaper.
- Time Stamping Authority signs hash.
- Publish hash in blockchain block.
Blockchain relies on:
- One-way hash functions
- Linking blocks via hash pointers
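Linking blocks via hash pointers can be illustrated with a tiny hash chain. This is a sketch under simplifying assumptions (no signatures, no consensus, a hypothetical all-zero pointer for the first block); it only shows why editing an old block is detectable.

```python
import hashlib

GENESIS = bytes(32)  # hypothetical all-zero "previous hash" for the first block


def block_hash(prev_hash: bytes, data: bytes) -> bytes:
    """Hash of a block: covers its data and the pointer to the previous block."""
    return hashlib.sha256(prev_hash + data).digest()


def build_chain(items):
    chain, prev = [], GENESIS
    for data in items:
        h = block_hash(prev, data)
        chain.append({"prev": prev, "data": data, "hash": h})
        prev = h
    return chain


def chain_is_valid(chain) -> bool:
    """Recompute every hash and check that each block points at its predecessor."""
    prev = GENESIS
    for block in chain:
        if block["prev"] != prev or block["hash"] != block_hash(prev, block["data"]):
            return False
        prev = block["hash"]
    return True


chain = build_chain([b"doc A", b"doc B", b"doc C"])
print(chain_is_valid(chain))  # True
chain[1]["data"] = b"doc B (edited)"
print(chain_is_valid(chain))  # False
```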
#### Application 4: Software Integrity with Secure Read-Only Space
Context:
Trusted read-only public space (for example official website).
Process:
1. Publisher computes $H(F_1), H(F_2), \dots, H(F_n)$.
2. Publish hashes publicly.
3. User downloads file $F_i$ and verifies hash.
If $H$ is collision resistant:
Attacker cannot modify file without detection.
No encryption required.
Public verifiability works if read-only space is trusted.
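The user-side check in step 3 can be sketched as below, assuming SHA-256 and a published hex digest. The helper names and the temporary file standing in for a download are hypothetical.

```python
import hashlib
import os
import tempfile


def sha256_file(path: str) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_download(path: str, published_hex: str) -> bool:
    """Compare the local file's hash with the one from the trusted space."""
    return sha256_file(path) == published_hex.lower()


# Demo: a temporary file plays the role of the downloaded F_i
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"installer bytes")
    path = tmp.name
published = hashlib.sha256(b"installer bytes").hexdigest()
print(verify_download(path, published))  # True
with open(path, "ab") as f:
    f.write(b"!")                        # simulate tampering
print(verify_download(path, published))  # False
os.remove(path)
```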
## Symmetric Crypto Authentication: MACs and AE
This section is also covered in [CSE442T Introduction to Cryptography (Lecture 18)](https://notenextra.trance-0.com/CSE442T/CSE442T_L18/#chapter-5-authentication).
### Message Authentication Codes (MACs)
Definition:
MAC $I = (S, V)$ over $(K, M, T)$
- $S(k, m) \to t$
- $V(k, m, t) \to$ yes or no
Security model:
Attacker can query $S(k, m_i)$.
Goal: produce a new pair $(m, t)$, different from every previously seen pair $(m_i, t_i)$, such that $V(k, m, t)$ accepts.
$Adv_{MAC}[A, I]$ must be negligible.
### MAC from PRF
Given PRF:
$F : K \times X \to Y$
Define MAC:
$S(k, m) = F(k, m)$
$V(k, m, t)$ accepts if $t = F(k, m)$
Theorem:
If $F$ is a secure PRF and $|Y|$ is large,
then derived MAC is secure.
Condition:
$1 / |Y|$ must be negligible.
Example: $|Y| = 2^{80}$.
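The PRF-based MAC can be sketched with a concrete stand-in for $F$. The assumption here is that HMAC-SHA256 truncated to 80 bits behaves like a secure PRF, which gives $|Y| = 2^{80}$ as in the example. The function names `prf`, `sign`, and `verify` are hypothetical.

```python
import hashlib
import hmac
import os

TAG_BYTES = 10  # 80-bit tags, so |Y| = 2^80


def prf(key: bytes, msg: bytes) -> bytes:
    """Stand-in for F(k, m): HMAC-SHA256 truncated to TAG_BYTES (assumed PRF)."""
    return hmac.new(key, msg, hashlib.sha256).digest()[:TAG_BYTES]


def sign(key: bytes, msg: bytes) -> bytes:
    """S(k, m) = F(k, m)"""
    return prf(key, msg)


def verify(key: bytes, msg: bytes, tag: bytes) -> bool:
    """V(k, m, t) accepts iff t = F(k, m); comparison is constant-time."""
    return hmac.compare_digest(tag, prf(key, msg))


key = os.urandom(32)
tag = sign(key, b"transfer 10 to Bob")
print(verify(key, b"transfer 10 to Bob", tag))   # True
print(verify(key, b"transfer 99 to Eve", tag))   # False
```

With 80-bit tags, a blind guess at a valid tag succeeds with probability $1 / 2^{80}$, which is the quantity the condition above requires to be negligible.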