Update CSE4303_L9.md

2026-02-12 12:41:50 -06:00
parent 0e28ba6261
commit 6d5c80d257
1 changed files with 424 additions and 1 deletions
--- a/content/CSE4303/CSE4303_L9.md
+++ b/content/CSE4303/CSE4303_L9.md
@@ -1 +1,424 @@
-# CSE4303 Introduction to Computer Security (Lecture 9)
+# CSE4303 Introduction to Computer Security (Lecture 9)
+
+## Cryptographic Hash Functions
+
+### What is a Hash Function
+
+A hash function maps a variable-length input to a fixed-length output.
+
+$h : X \to Y$
+
+Typical examples:
+- Java hashCode(): input is an Object, output is a 4-byte integer.
+- String polynomial hash example:
+  $h("cs433s") = 'c' \cdot 31^5 + 's' \cdot 31^4 + \dots + 's'$
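The polynomial hash above can be sketched in Python as follows. This is a minimal illustration that assumes ASCII input and mimics the 32-bit wraparound of a Java `int`; the function name `java_string_hash` is hypothetical.

```python
def java_string_hash(s: str) -> int:
    """Polynomial hash with base 31, wrapped to a signed 32-bit integer."""
    h = 0
    for ch in s:
        # Horner's rule: h = 31*h + character code, kept within 32 bits.
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit value as a signed one, like a Java int.
    return h - (1 << 32) if h >= (1 << 31) else h


print(java_string_hash("hello"))
# Two different strings with the same hash: a collision.
print(java_string_hash("Aa") == java_string_hash("BB"))
```

The last line shows that collisions in this non-cryptographic hash are easy to find by hand, which is why it is suitable for hash tables but not for integrity checking.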
+
+Key property:
+- The domain is much larger than the range: $|X| \gg |Y|$.
+- Collisions are therefore unavoidable: by the pigeonhole principle, $|X| > |Y|$ forces at least two inputs to share an output.
+
+Main uses:
+- Compact numerical representation
+- Hash tables (Set, Map, dictionaries)
+- Object comparison
+- Integrity checking (fingerprint)
+
+### Security Properties
+
+Let $h : X \to Y$.
+
+1. Preimage Resistance (One-way)  
+   Given $y \in Y$, it is computationally infeasible to find $x \in X$ such that  
+   $h(x) = y$.
+
+2. Second Preimage Resistance (Weak collision resistance)  
+   Given a specific $x \in X$, it is computationally infeasible to find $x' \neq x$ such that  
+   $h(x') = h(x)$.
+
+3. Collision Resistance (Strong collision resistance)  
+   It is computationally infeasible to find any two distinct values $x, x' \in X$ such that  
+   $h(x) = h(x')$.
+
+Adversarial definition:
+
+Let $H : M \to T$ where $|M|$ is much larger than $|T|$.  
+$H$ is collision resistant if for all efficient algorithms $A$:
+
+$Adv_{CR}[A, H] = \Pr[A \text{ outputs a collision for } H]$
+
+is negligible.
+
+### Generic Collision Attack (Birthday Attack)
+
+Let $H : M \to \{0,1\}^n$.
+
+Generic algorithm to find a collision in time on the order of $2^{n/2}$:
+
+1. Choose $2^{n/2}$ random messages $m_1, \dots, m_{2^{n/2}}$.
+2. Compute $t_i = H(m_i)$.
+3. Look for $t_i = t_j$.
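The three steps above can be sketched against a deliberately weakened hash. The sketch assumes SHA-256 truncated to `n_bits` bits as the target and uses distinct counter values in place of random messages so that the run is reproducible; `truncated_hash` and `find_collision` are hypothetical names.

```python
import hashlib


def truncated_hash(message: bytes, n_bits: int) -> int:
    """Top n_bits of SHA-256, standing in for an n-bit hash H."""
    digest = hashlib.sha256(message).digest()
    return int.from_bytes(digest, "big") >> (256 - n_bits)


def find_collision(n_bits: int):
    """Hash distinct messages until two of them share a tag."""
    seen = {}  # tag -> message that produced it
    counter = 0
    while True:
        message = counter.to_bytes(8, "big")
        tag = truncated_hash(message, n_bits)
        if tag in seen:
            return seen[tag], message, counter + 1
        seen[tag] = message
        counter += 1


m1, m2, evaluations = find_collision(24)
print(m1.hex(), m2.hex(), evaluations)
```

For a 24-bit output the search is expected to stop after a few thousand evaluations, on the order of $2^{12}$, rather than the $2^{24}$ a brute-force second-preimage search would need.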
+
+Birthday phenomenon:
+
+If the output space has size $B$,  
+the probability of at least one collision exceeds $50\%$ after about $1.2\sqrt{B}$ uniformly random samples.
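The birthday estimate can be checked numerically. The sketch below computes the exact probability that `samples` uniform draws from a space of size `space_size` contain a repeat; the function name is hypothetical.

```python
def collision_probability(space_size: int, samples: int) -> float:
    """Exact probability of at least one repeat among uniform samples."""
    p_all_distinct = 1.0
    for i in range(samples):
        # The (i+1)-th sample must avoid the i values already drawn.
        p_all_distinct *= (space_size - i) / space_size
    return 1.0 - p_all_distinct


# Classic birthday setting: B = 365.
print(collision_probability(365, 22), collision_probability(365, 23))
# A 16-bit output space: sqrt(B) = 256, and 1.2 * sqrt(B) is about 307.
print(collision_probability(2**16, 256), collision_probability(2**16, 307))
```

In both settings the probability crosses one half slightly above $\sqrt{B}$ samples, which is where the constant of roughly 1.2 comes from.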
+
+Thus:
+- A 128-bit hash admits a generic collision attack in about $2^{64}$ hash evaluations.
+- A 256-bit hash admits a generic collision attack in about $2^{128}$ hash evaluations.
+
+### Practical Hash Functions
+
+From performance and security table (AMD Opteron 2.2 GHz):
+
+- MD5: 128 bits, collisions found in 2004, no longer collision resistant
+- SHA-1: 160 bits, practical collision attack demonstrated in 2017
+- SHA-256: 256 bits
+- SHA-512: 512 bits
+- Whirlpool: 512 bits
+
+SHA-1 collision example: the SHAttered attack (Google and CWI).  
+It produced two different PDF files with the same SHA-1 hash.
+
+## Construction of Cryptographic Hash Functions
+
+### Merkle-Damgard Construction
+
+Given compression function:
+
+$h : T \times X \to T$
+
+We build:
+
+$H : X^{\le L} \to T$
+
+Process:
+- Apply padding:
+  append $1000\ldots0$ followed by the message length (64 bits).  
+  If the last block has no room for the padding, add another block.
+- Split the padded message into blocks $m[0], m[1], \dots, m[L-1]$.
+- Use a fixed initialization vector $IV$.
+- Iterate the chaining:
+
+  $H_0 = IV$  
+  $H_1 = h(H_0, m[0])$  
+  $H_2 = h(H_1, m[1])$  
+  $\dots$  
+  $H_L = h(H_{L-1}, m[L-1])$
+
+- Output $H(m) = H_L$.
+
+Theorem:  
+If compression function $h$ is collision resistant,  
+then $H$ is collision resistant.
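A toy version of the construction is sketched below. It assumes 16-byte blocks, an 8-byte chaining value, an all-zero IV, and a stand-in compression function built from truncated SHA-256; none of these choices come from a real standard, and `compress`, `pad`, and `md_hash` are hypothetical names.

```python
import hashlib

BLOCK = 16          # toy block size in bytes
STATE = 8           # toy chaining-value size in bytes
IV = bytes(STATE)   # fixed initialization vector (all zeros here)


def compress(chain: bytes, block: bytes) -> bytes:
    """Stand-in compression function h : T x X -> T."""
    return hashlib.sha256(chain + block).digest()[:STATE]


def pad(message: bytes) -> bytes:
    """Append a 1 bit, zeros, then the 64-bit message length in bits."""
    length_field = (8 * len(message)).to_bytes(8, "big")
    padded = message + b"\x80"
    # Zero-fill so that padded plus the length field is a multiple of BLOCK.
    padded += bytes((-len(padded) - 8) % BLOCK)
    return padded + length_field


def md_hash(message: bytes) -> bytes:
    """Chain the compression function over the padded blocks."""
    chain = IV
    data = pad(message)
    for i in range(0, len(data), BLOCK):
        chain = compress(chain, data[i:i + BLOCK])
    return chain


print(md_hash(b"abc").hex())
print(len(pad(b"a" * 7)), len(pad(b"a" * 8)))
```

The second print shows the extra-block rule: a 7-byte message still fits into one block with its padding, while an 8-byte message forces a second block.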
+
+### Davies-Meyer Compression from Block Cipher
+
+Given block cipher:
+
+$E : K \times \{0,1\}^n \to \{0,1\}^n$
+
+Define compression function:
+
+$h(H, m) = E(m, H) \oplus H$
+
+If $E$ behaves like an ideal cipher,  
+finding a collision in $h$ takes about $2^{n/2}$ evaluations of $E$.
+
+This is the best possible for an $n$-bit output, since it matches the generic birthday attack.
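The Davies-Meyer formula can be illustrated with a toy block cipher. The sketch assumes a 4-round Feistel network over 64-bit blocks with a SHA-256-based round function; this cipher is purely illustrative, is not secure, and all names are hypothetical.

```python
import hashlib

HALF = 4  # bytes per Feistel half, so the block size n is 64 bits
ROUNDS = 4


def _round(key: bytes, i: int, half: bytes) -> bytes:
    """Toy round function derived from SHA-256."""
    return hashlib.sha256(key + bytes([i]) + half).digest()[:HALF]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt(key: bytes, block: bytes) -> bytes:
    """E(key, block): a Feistel network, hence a permutation for each key."""
    left, right = block[:HALF], block[HALF:]
    for i in range(ROUNDS):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def toy_decrypt(key: bytes, block: bytes) -> bytes:
    """Inverse of toy_encrypt: undo the rounds in reverse order."""
    left, right = block[:HALF], block[HALF:]
    for i in reversed(range(ROUNDS)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right


def davies_meyer(chain: bytes, message_block: bytes) -> bytes:
    """h(H, m) = E(m, H) xor H: the message block is used as the cipher key."""
    return _xor(toy_encrypt(message_block, chain), chain)


H0 = bytes(8)
print(davies_meyer(H0, b"first block").hex())
```

The feed-forward XOR with the chaining value is what removes invertibility: without it, anyone could decrypt with a chosen message block and walk the chain backwards.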
+
+### Example: SHA-256
+
+Built using:
+- Merkle-Damgard construction
+- Davies-Meyer style compression
+- Block cipher-like core: SHACAL-2
+
+Structure:
+- 512-bit message block
+- 256-bit chaining value
+- 256-bit output
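These parameters can be read off Python's standard `hashlib` implementation. The snippet is only a quick check of the sizes listed above; the variable names are illustrative.

```python
import hashlib

hasher = hashlib.sha256()
block_bits = hasher.block_size * 8    # size of one message block
output_bits = hasher.digest_size * 8  # size of the final hash

# Feeding the message in pieces gives the same result as hashing it at once,
# which reflects the block-by-block chaining.
hasher.update(b"ab")
hasher.update(b"c")
incremental = hasher.hexdigest()
one_shot = hashlib.sha256(b"abc").hexdigest()

print(block_bits, output_bits)
print(incremental == one_shot)
```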
+
+## Applications for Integrity and Authentication
+
+### Standalone Usage: Message Integrity
+
+#### Application 1: Delayed Knowledge Verification
+
+Idea:
+Publish $h(secret)$ first.  
+Later reveal secret.  
+Anyone can recompute hash and verify.
+
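+Justification:
+Preimage resistance keeps the secret hidden until it is revealed, provided the secret is hard to guess.  
+Collision resistance prevents the publisher from later revealing a different secret with the same hash.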
+
+Example:
+Stock market prediction commitment.
+
+<details>
+<summary>Example for delayed knowledge verification</summary>
+
+1. Publish $H("Stock will rise on May 1")$.
+2. On May 1, reveal the prediction string.
+3. Anyone computes hash and checks equality.
+
+</details>
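A minimal sketch of this publish-then-reveal flow follows. It assumes SHA-256 as $H$ and adds a random nonce to the hashed string, which the notes do not mention, so that a short or guessable prediction cannot be recovered by trying candidates; `commit` and `verify` are hypothetical names.

```python
import hashlib
import hmac
import secrets


def commit(prediction: str):
    """Return (published digest, nonce kept private until the reveal)."""
    nonce = secrets.token_bytes(32)  # assumption: hides low-entropy secrets
    digest = hashlib.sha256(nonce + prediction.encode("utf-8")).hexdigest()
    return digest, nonce


def verify(digest: str, nonce: bytes, prediction: str) -> bool:
    """Recompute the hash from the revealed values and compare."""
    recomputed = hashlib.sha256(nonce + prediction.encode("utf-8")).hexdigest()
    return hmac.compare_digest(recomputed, digest)


published, nonce = commit("Stock will rise on May 1")
print(verify(published, nonce, "Stock will rise on May 1"))
print(verify(published, nonce, "Stock will fall on May 1"))
```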
+
+#### Application 2: Password Storage
+
+Model:
+System must verify password but not store plaintext.
+
+Solution:
+Store hash of password.  
+During login:
+- Hash input
+- Compare with stored value
+
+Example:
+Linux stores hashed passwords in the /etc/shadow file.  
+Includes:
+- Salt
+- Password hash
+- Metadata
+
+Security relies on:
+- One-way property
+- Salting to prevent precomputed attacks
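A sketch of salted password storage and login checking is shown below. It swaps a single hash for PBKDF2-HMAC-SHA256 from the standard library, a deliberately slow salted construction; this choice, the iteration count, and the function names are assumptions, not what the notes or any particular Linux system prescribe.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # assumed work factor for this sketch


def hash_password(password: str):
    """Return (salt, digest) to store in place of the plaintext password."""
    salt = secrets.token_bytes(16)  # fresh random salt per user
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def check_password(password: str, salt: bytes, stored: bytes) -> bool:
    """Hash the login attempt with the stored salt and compare."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(candidate, stored)


salt, stored = hash_password("correct horse")
print(check_password("correct horse", salt, stored))
print(check_password("wrong guess", salt, stored))
```

Because each user gets a different salt, two users with the same password have different stored values, so a table of precomputed hashes does not apply.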
+
+#### Application 3: Trusted Timestamping and Blockchains
+
+Goal:
+Prove document existed before a given date.
+
+Methods:
+- Publish document hash in newspaper.
+- Time Stamping Authority signs hash.
+- Publish hash in blockchain block.
+
+Blockchain relies on:
+- One-way hash functions
+- Linking blocks via hash pointers
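Hash pointers can be sketched as a list of records in which each record stores the hash of its predecessor. This is a bare hash chain under assumed field names (`prev`, `payload`, `hash`), with no consensus, signatures, or proof of work.

```python
import hashlib

GENESIS = "0" * 64  # assumed placeholder pointer for the first block


def block_hash(prev_hash: str, payload: str) -> str:
    """Hash of a block covers its payload and the pointer to its parent."""
    return hashlib.sha256((prev_hash + "|" + payload).encode("utf-8")).hexdigest()


def build_chain(payloads):
    chain, prev = [], GENESIS
    for payload in payloads:
        current = block_hash(prev, payload)
        chain.append({"prev": prev, "payload": payload, "hash": current})
        prev = current
    return chain


def is_valid(chain) -> bool:
    """Recompute every hash and check every pointer."""
    prev = GENESIS
    for block in chain:
        if block["prev"] != prev:
            return False
        if block["hash"] != block_hash(prev, block["payload"]):
            return False
        prev = block["hash"]
    return True


chain = build_chain(["doc hash A", "doc hash B", "doc hash C"])
print(is_valid(chain))
chain[0]["payload"] = "doc hash X"  # rewrite history in the first block
print(is_valid(chain))
```

Changing an early block invalidates its hash, and fixing that hash would break the pointer stored in the next block, so tampering propagates to the head of the chain.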
+
+#### Application 4: Software Integrity with Secure Read-Only Space
+
+Context:
+Trusted read-only public space (for example official website).
+
+Process:
+1. Publisher computes $H(F_1), H(F_2), \dots, H(F_n)$.
+2. Publish hashes publicly.
+3. User downloads file $F_i$ and verifies hash.
+
+If $H$ is collision resistant:  
+Attacker cannot modify file without detection.
+
+No encryption required.  
+Public verifiability works if read-only space is trusted.
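The user-side check in step 3 can be sketched as follows, assuming SHA-256 as $H$ and a hex digest copied from the trusted site. The file name `F_1.bin` and the helper names are hypothetical, and a temporary file stands in for the download.

```python
import hashlib
import os
import tempfile


def sha256_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    hasher = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_download(path: str, published_hex: str) -> bool:
    """Compare the local file's hash with the published one."""
    return sha256_file(path) == published_hex.strip().lower()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "F_1.bin")
    with open(path, "wb") as f:
        f.write(b"release contents")
    published = hashlib.sha256(b"release contents").hexdigest()
    intact_ok = verify_download(path, published)
    with open(path, "ab") as f:
        f.write(b"!")  # simulate tampering in transit
    tampered_ok = verify_download(path, published)

print(intact_ok, tampered_ok)
```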
+
+## Symmetric Crypto Authentication: MACs and AE
+
+### Message Authentication Codes (MACs)
+
+Definition:
+MAC $I = (S, V)$ over $(K, M, T)$
+
+- $S(k, m) \to t$
+- $V(k, m, t) \to$ yes or no
+
+Security model:
+Attacker can query $S(k, m_i)$.  
+Goal: produce new $(m, t)$ not previously seen such that $V$ accepts.
+
+$Adv_{MAC}[A, I]$ must be negligible.
+
+### MAC from PRF
+
+Given PRF:
+
+$F : K \times X \to Y$
+
+Define MAC:
+
+$S(k, m) = F(k, m)$  
+$V(k, m, t)$ accepts if $t = F(k, m)$
+
+Theorem:
+If $F$ is secure PRF and $|Y|$ is large,  
+then derived MAC is secure.
+
+Condition:
+$1 / |Y|$ must be negligible.  
+Example: $|Y| = 2^{80}$.
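The PRF-based MAC can be sketched with HMAC-SHA256 truncated to 80 bits standing in for $F$. Treating HMAC-SHA256 as a PRF is an assumption of this sketch, chosen only because it is available in the standard library.

```python
import hashlib
import hmac

TAG_BYTES = 10  # 80-bit tags, so |Y| = 2^80


def F(k: bytes, x: bytes) -> bytes:
    """Stand-in PRF: HMAC-SHA256 truncated to TAG_BYTES."""
    return hmac.new(k, x, hashlib.sha256).digest()[:TAG_BYTES]


def S(k: bytes, m: bytes) -> bytes:
    """Signing: the tag is the PRF output on the message."""
    return F(k, m)


def V(k: bytes, m: bytes, t: bytes) -> bool:
    """Verification: recompute the tag and compare in constant time."""
    return hmac.compare_digest(t, F(k, m))


key = b"shared secret key"
tag = S(key, b"transfer 10")
print(V(key, b"transfer 10", tag), V(key, b"transfer 99", tag))
```

An attacker who simply guesses a tag succeeds with probability $1/|Y| = 2^{-80}$, which is why the output space must be large.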
+
+### MACs from Hash Functions
+
+Construction:
+
+$S_{big}(k, m) = S(k, H(m))$  
+$V_{big}(k, m, t) = V(k, H(m), t)$
+
+If:
+- $S$ is secure MAC for short messages
+- $H$ is collision resistant
+
+Then $(S_{big}, V_{big})$ is a secure MAC.
+
+If a collision exists:
+an attacker who finds $m_0 \neq m_1$ with $H(m_0) = H(m_1)$
+queries the tag $t$ for $m_0$
+and outputs $(m_1, t)$ as a valid forgery.
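The forgery can be demonstrated with a deliberately weak $H$. The sketch assumes a 16-bit hash (SHA-256 truncated to two bytes) so that a collision is found instantly, and HMAC-SHA256 as the short-message MAC; all names are hypothetical.

```python
import hashlib
import hmac


def weak_hash(m: bytes) -> bytes:
    """Deliberately weak H: only 16 bits of output."""
    return hashlib.sha256(m).digest()[:2]


def s_big(k: bytes, m: bytes) -> bytes:
    """S_big(k, m) = S(k, H(m)) with HMAC-SHA256 as the small MAC S."""
    return hmac.new(k, weak_hash(m), hashlib.sha256).digest()


def v_big(k: bytes, m: bytes, t: bytes) -> bool:
    return hmac.compare_digest(t, s_big(k, m))


def find_collision():
    """Birthday search on weak_hash; needs no knowledge of the key."""
    seen = {}
    i = 0
    while True:
        m = b"msg-%d" % i
        d = weak_hash(m)
        if d in seen:
            return seen[d], m
        seen[d] = m
        i += 1


key = b"known only to signer and verifier"
m0, m1 = find_collision()
t = s_big(key, m0)            # one legitimate signing query on m0
forged_ok = v_big(key, m1, t)  # the same tag verifies for m1
print(m0, m1, forged_ok)
```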
+
+### HMAC
+
+$HMAC(k, m) = H((k \oplus opad) \| H((k \oplus ipad) \| m))$
+
+Used in:
+- TLS
+- IPsec
+- SSH
+
+Properties:
+- Built from hash function (for example SHA-256)
+- Provably secure under PRF assumptions
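The formula can be written out directly and compared with the standard library. The sketch instantiates $H$ with SHA-256 and uses the usual constants for that setting: a 64-byte block, `ipad` bytes of `0x36`, `opad` bytes of `0x5C`, and hashing of keys longer than one block.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-256 block size in bytes


def hmac_sha256(key: bytes, message: bytes) -> bytes:
    """HMAC(k, m) = H((k xor opad) || H((k xor ipad) || m))."""
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha256(key).digest()  # long keys are hashed first
    key = key.ljust(BLOCK_SIZE, b"\x00")    # then zero-padded to one block
    ipad = bytes(b ^ 0x36 for b in key)
    opad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(ipad + message).digest()
    return hashlib.sha256(opad + inner).digest()


key, message = b"key", b"The quick brown fox"
matches_stdlib = hmac_sha256(key, message) == hmac.new(
    key, message, hashlib.sha256
).digest()
print(matches_stdlib)
```

In practice the `hmac` module should be used directly; the hand-written version is only there to make the two nested hash calls visible.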
+
+### Timing Attacks on MAC Verification
+
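+Problem:
+A byte-by-byte comparison that returns at the first mismatch leaks, through its running time, how many leading bytes of the tag are correct.
+
+Attack:
+1. Send the message with a random tag and measure the response time.
+2. Try every value for the first byte.
+3. The value that makes verification take slightly longer is correct.
+4. Fix that byte and repeat for each following byte.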
+
+Defense 1:
+Constant-time comparison loop.
+
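+Defense 2:
+Double-HMAC comparison:
+compare $HMAC(k, mac)$ with $HMAC(k, sig)$, where $mac$ is the recomputed tag and $sig$ is the received tag,
+so the attacker no longer controls the bytes being compared.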
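The leaky comparison and both defenses are sketched below. The function names are hypothetical, and pure-Python loops give no hard timing guarantees, so the standard library's `hmac.compare_digest` is the tool to prefer in real code.

```python
import hashlib
import hmac


def leaky_equal(a: bytes, b: bytes) -> bool:
    """What not to do: returns at the first mismatching byte."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False  # early exit reveals the mismatch position
    return True


def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Defense 1: always scan every byte and accumulate differences."""
    if len(a) != len(b):
        return False
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y
    return diff == 0


def double_hmac_equal(k: bytes, mac: bytes, sig: bytes) -> bool:
    """Defense 2: compare HMACs of the tags instead of the tags themselves."""
    left = hmac.new(k, mac, hashlib.sha256).digest()
    right = hmac.new(k, sig, hashlib.sha256).digest()
    return left == right


key = b"verification key"
good = hmac.new(key, b"message", hashlib.sha256).digest()
bad = bytes([good[0] ^ 1]) + good[1:]
print(constant_time_equal(good, good), constant_time_equal(good, bad))
print(double_hmac_equal(key, good, good), double_hmac_equal(key, good, bad))
print(hmac.compare_digest(good, bad))
```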
+
+### Authenticated Encryption (AE)
+
+AE provides:
+1. Confidentiality (CPA security)
+2. Ciphertext integrity
+
+Cipher:
+
+$E : K \times M \times N \to C$  
+$D : K \times C \times N \to M \cup \{\bot\}$
+
+Ciphertext integrity:
+Attacker cannot produce new valid ciphertext.
+
+Theorem:
+AE implies CCA security.
+
+Implication:
+If $D(k, c, n) \neq \bot$,  
+receiver knows sender had key.
+
+### Encrypt-then-MAC
+
+Correct construction:
+
+1. Compute $c = E(k_E, m)$
+2. Compute $tag = S(k_I, c)$
+3. Send $(c, tag)$
+
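+Encrypt-then-MAC is the ordering that is always secure: with independent keys $k_E$ and $k_I$, a CPA-secure cipher, and a secure MAC, it provides authenticated encryption.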
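The three steps can be sketched with standard-library pieces only. Since Python ships no block cipher, the sketch assumes a toy stream cipher whose keystream is HMAC-SHA256 in counter mode under a random nonce; this stands in for $E$ for illustration and is not a vetted cipher. HMAC-SHA256 plays the role of $S$, and all names are hypothetical.

```python
import hashlib
import hmac
import secrets

NONCE_BYTES = 16


def keystream(k_enc: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256(k_enc, nonce || counter) blocks."""
    out, counter = b"", 0
    while len(out) < length:
        block_input = nonce + counter.to_bytes(8, "big")
        out += hmac.new(k_enc, block_input, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def encrypt_then_mac(k_enc: bytes, k_mac: bytes, message: bytes):
    nonce = secrets.token_bytes(NONCE_BYTES)
    stream = keystream(k_enc, nonce, len(message))
    c = nonce + bytes(a ^ b for a, b in zip(message, stream))  # step 1
    tag = hmac.new(k_mac, c, hashlib.sha256).digest()          # step 2
    return c, tag                                              # step 3


def decrypt(k_enc: bytes, k_mac: bytes, c: bytes, tag: bytes):
    """Check the tag first; return None (the reject symbol) on failure."""
    expected = hmac.new(k_mac, c, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    nonce, body = c[:NONCE_BYTES], c[NONCE_BYTES:]
    stream = keystream(k_enc, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


k_enc, k_mac = secrets.token_bytes(32), secrets.token_bytes(32)
c, tag = encrypt_then_mac(k_enc, k_mac, b"attack at dawn")
print(decrypt(k_enc, k_mac, c, tag))
tampered = c[:-1] + bytes([c[-1] ^ 1])
print(decrypt(k_enc, k_mac, tampered, tag))
```

The receiver never touches the decryption key material until the tag has been verified, which is the point of putting the MAC over the ciphertext.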
+
+### AE Standards
+
+- GCM: CTR mode encryption then polynomial MAC
+- CCM: CBC-MAC then CTR mode encryption
+- EAX: CTR mode encryption then CMAC
+
+All support AEAD:
+Authenticated Encryption with Associated Data.  
+Example: authenticate packet headers but do not encrypt them.
+
+## Asymmetric Crypto Authentication: Digital Signatures
+
+### Motivation
+
+Goal:
+Bind document to author.
+
+Digital problem:
+Anyone can copy a visible signature from one document to another.
+
+Solution:
+Make signature depend on document contents.
+
+### Digital Signature Scheme
+
+Components:
+- Secret signing key $sk$
+- Public verification key $pk$
+- $Sign(sk, m) \to signature$
+- $Verify(pk, m, sig) \to$ accept or reject
+
+Property:
+Anyone can verify.  
+Only signer can produce valid signature.
+
+### Signing a Certificate
+
+Process:
+1. Compute hash of data.
+2. Sign hash with secret key.
+3. Attach signature to data.
+
+Verification:
+1. Compute the hash of the received data.
+2. Verify the signature on that hash using the public key.
+3. Accept only if the signature is valid for the recomputed hash.
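The hash-then-sign flow can be illustrated with textbook RSA. This is a toy under loud assumptions: the modulus is the product of two small, publicly known Mersenne primes and no padding is used, so it is trivially breakable and only shows the shape of signing and verification. All names are hypothetical.

```python
import hashlib

# Toy key generation: two known Mersenne primes (far too small for real use).
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # secret exponent (needs Python 3.8+)


def digest_as_int(data: bytes) -> int:
    """Step 1: hash the data, then reduce it into the range of the modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """Step 2: apply the secret key to the hash."""
    return pow(digest_as_int(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """Recompute the hash and check it against the signature via the public key."""
    return pow(signature, E, N) == digest_as_int(data)


certificate = b"subject=example, key=..."
signature = sign(certificate)
print(verify(certificate, signature))
print(verify(b"subject=attacker, key=...", signature))
```

Because the signature is computed over the hash of the contents, copying it onto different data makes verification fail, which addresses the copy problem from the motivation.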
+
+### Software Signing
+
+Software vendor:
+- Signs update with secret key.
+- Publishes update and signature.
+
+Clients:
+- Use vendor public key.
+- Verify signature.
+- Install only if valid.
+
+Allows distribution via untrusted hosting site.
+
+## Review: Three Approaches to Data Integrity
+
+1. Collision resistant hashing  
+   Requires secure read-only public space.  
+   No secret keys.  
+   Suitable for public verification.
+
+2. MACs  
+   Requires shared secret key.  
+   Must compute new MAC per user.  
+   Suitable when one signs and one verifies.
+
+3. Digital signatures  
+   Requires long-term secret key.  
+   Public verification.  
+   Suitable when one signs and many verify.
+
+## Crypto Summary
+
+Cryptographic goals:
+- Confidentiality
+- Data integrity
+- Authentication
+- Non-repudiation
+
+Primitives:
+- Hash functions
+- MACs
+- Digital signatures
+- Symmetric ciphers
+- Public key ciphers