diff --git a/content/CSE4303/CSE4303_E1.md b/content/CSE4303/CSE4303_E1.md new file mode 100644 index 0000000..59c980f --- /dev/null +++ b/content/CSE4303/CSE4303_E1.md @@ -0,0 +1,770 @@ +# CSE4303 Introduction to Computer Security (Exam Review) + +## Details + +Time and location + +– In class exam – Thursday, 3/5 at 11:30 AM +– What is allowed: + - One 8.5" X 11" paper of notes, single-sided only, typed or hand-written + +Topics covered: + +– Security fundamentals +– TCP/IP network stack +– Crypto fundamentals +– Symmetric key cryptography +– Hash functions +– Asymmetric key cryptography + +## Security fundamentals + +### Defining security + +- Understand principles of security analysis + - The security of a system, application, or protocol is always relative to + - A set of desired properties + - An adversary with specific capabilities ("threat model") + +### Key security concepts + +C.I.A. triad: + +- Integrity: Prevent unauthorized modification of data, and/or detect if modification occurred. + - ARP poisoning (ARP spoofing) + - Authentication codes +- Confidentiality: Prevent unauthorized parties from learning the contents of data (in transit or at rest). + - Packet sniffing / eavesdropping + - Data encryption +- Availability: Ensure systems and data are accessible to authorized users when needed. + - Denial-of-Service (DoS) / Distributed DoS (DDoS) + - Rate limiting + traffic filtering (often with DDoS protection/CDN) + +Other security goals: + +- Authenticity: identity of an entity (issuer of info/message) is verified +- Anonymity: identity of an entity remains unknown +- Non-repudiation: messages can't be denied or taken back (e.g. 
online transaction commitments) + +### Modeling attacks + +Common components: + +- System being attacked (usually a model, with assumptions and abstractions) +- Threat model +- Attack surface: what can be attacked + - Open ports and exposed services + - Public APIs and their parameters + - Web endpoints, forms, cookies + - File system permissions + - Hardware interfaces (USB, JTAG) + - User roles and privilege boundaries +- Attack vector: how the attacker attacks + - SQL injection via POST /login + - Phishing to steal credentials, then SSH login + - Buffer overflow in a network daemon + - Cross-site scripting through a comment field + - Supply-chain poisoning of a dependency +- Vulnerability: the weakness that makes an attack possible +- Exploit: how the attacker takes advantage of the vulnerability +- Damage: what harm the attacker can cause once the exploit succeeds +- Mitigation: reduce the likelihood or impact of exploiting a vulnerability +- Defense: close the vulnerability gap + +Importance of correct modeling + +- Attack-surface awareness guides defenses + - E.g. pre-Covid-19 vs. post-Covid attack surface of company servers +- Match resources to expected threat actors + - "Script kiddie": individual or group running off-the-shelf attacks + - Caveat: off-the-shelf attacks can still be quite powerful! Metasploit, Shodan, dark web market. + - "Insider attack": employee with access to internal machines/networks + - "Advanced Persistent Threat (APT)": nation-state level resources and patience + - All these threats have different motivations, require different defenses/responses! 
+- Reevaluate often + - Threat capabilities change over time + +### TCP/IP network stack + +Local and interdomain routing + +- TCP/IP for routing and messaging +- BGP for routing announcements + +Domain Name System + +- Find IP address from symbolic name (cse.wustl.edu) + +#### Layer Summary + +- Application: message (the data actually being sent) +- Transport (TCP, UDP): segment +- Network (IP): packet +- Data Link (Ethernet): frame + +### Types of Addresses in Internet + +- Media Access Control (MAC) addresses in the network access layer + - Associated w/ network interface card (NIC) + - 00-50-56-C0-00-01 +- IP addresses for the network layer + - IPv4 (32 bit) vs IPv6 (128 bit) + - 128.1.1.3 vs fe80::fc38:6673:f04d:b37b%4 +- IP addresses + ports for the transport layer + - E.g., 10.0.0.2:8080 +- Domain names for the application/human layer + - E.g., www.wustl.edu + +#### Routing and Translation of Addresses + +(All of them are attack surfaces) + +- Translation between IP addresses and MAC addresses + - Address Resolution Protocol (ARP) for IPv4 + - Neighbor Discovery Protocol (NDP) for IPv6 +- Routing with IP addresses + - TCP, UDP for end-to-end transport, IP for routing packets + - Border Gateway Protocol for routing table updates +- Translation between IP addresses and domain names + - Domain Name System (DNS) + +### Summary for security + +- Confidentiality + - Packet sniffing +- Integrity + - ARP poisoning +- Availability + - Denial of service attacks +- Common + - Address translation poisoning attacks (DNS, ARP) + - Packet spoofing +- Core protocols not designed for security + - Eavesdropping, packet injection, route stealing, DNS poisoning + - Patched over time to prevent basic attacks +- More secure variants exist: + - IP $\to$ IPsec + - DNS $\to$ DNSSEC + - BGP $\to$ sBGP + +## Crypto fundamentals + +- Well-defined statement about difficulty of compromising a system + - ...with clear implicit or explicit assumptions about: + - Parameters of the system + - Threat model +
- Attack surfaces +- Example: "A one-time pad cipher is secure against any cryptanalysis, including a brute-force attack, assuming: + - the key is the same length as the plaintext, + - the key is truly random, and + - the key is never re-used." + +### Common roles in cryptography + +Alice and Bob: Sender and receiver + +Eve: Adversary that can see but not create any packets + +Mallory: Man in the middle, can create and modify packets + +The message M is called the **plaintext**. + +Alice will convert plaintext M to an encrypted form using an +encryption algorithm E that outputs a **ciphertext** C for M. + +#### Cryptography goals + +Confidentiality: + +- Mallory and Eve cannot recover original message from ciphertext + +Integrity: + +- Mallory cannot modify message from Alice to Bob without detection by Bob + +#### Threat models + +- Attacker may have (with increasing power): + - a) collection of ciphertexts (ciphertext-only attack) + - b) collection of plaintext/ciphertext pairs (known plaintext attack: KPA) + - c) collection of plaintext/ciphertext pairs for plaintexts selected by the attacker (chosen plaintext attack: CPA) + - d) collection of plaintext/ciphertext pairs for ciphertexts selected by the attacker (chosen ciphertext attack: CCA/CCA2) + +### Symmetric key cryptography + +#### Classical cryptography + +Techniques: substitution and transposition + +- Substitution: 1:1 mapping of alphabet onto itself +- Transposition: permutation of elements (i.e. 
rearrange letters) + +- Caesar cipher: rotate each letter by k positions (k is fixed) +- Vigenère cipher: shift each letter by an amount given by a repeating key. Attack: if the length of the key is known, split letters into groups based on index within the key and do frequency analysis within each group + +> The three steps in cryptography: +> +> - Precisely specify threat model +> - Propose a construction +> - Prove that breaking construction under threat model will solve an underlying hard problem + +#### Perfect secrecy + +The ciphertext reveals no "info" about the plaintext under a ciphertext-only attack. + +Def: A cipher $(E, D)$ over $(K, M, C)$ has perfect secrecy if + +- $\forall m_0, m_1 \in M$ $(|m_0| = |m_1|)$ and $\forall c \in C$, + - $\Pr[E(k, m_0) = c] = \Pr[E(k, m_1) = c]$ where $k \leftarrow K$ + +#### XOR One-time pad (perfect secrecy) + +Assumptions: + +- Key is as long as message +- Key is random +- Key is never re-used + +In practice, relaxing the key-length assumption (a short key is stretched into a long keystream) gives "stream ciphers". + +### Stream cipher + +- Use a pseudorandom generator (PRG) $G$ to produce the keystream for XOR encryption (security rests on the pseudorandom generator) + +Security abstraction: + +1. XOR transfers randomness of keystream to randomness of CT regardless of PT’s content +2. Security depends on G being "practically" indistinguishable from random string and "practically" unpredictable +3. Idea: shouldn’t be able to predict next bit of generator given all bits seen so far + +#### Semantic security + +- $(E, D)$ has semantic security if $\forall m_0, m_1 \in M$ $(|m_0| = |m_1|)$, + - $\{E(k, m_0)\} \approx_p \{E(k, m_1)\}$ where $k \leftarrow K$ +- ...and the adversary exhibits $m_0, m_1 \in M$ explicitly + +The advantage of an adversary $A$ is $\left|\Pr[A(E(k, m_0)) = 1] - \Pr[A(E(k, m_1)) = 1]\right|$, i.e. how differently $A$ behaves on encryptions of $m_0$ versus $m_1$. + +#### Weakness for stream ciphers + +- Weak pseudorandom generator +- Key re-use +- Predictable effect of modifying ciphertext on the decrypted plaintext (malleability). 
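
The last two weaknesses can be seen directly on a toy XOR cipher. The sketch below is illustrative only: the messages, the 16-byte keystream, and the helper name `xor_bytes` are assumptions made for this example, not part of the lecture material.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical 16-byte messages and a random keystream (stand-in for PRG output).
keystream = secrets.token_bytes(16)
m1 = b"PAY ALICE $0100 "
m2 = b"attack at dawn!!"

c1 = xor_bytes(m1, keystream)
c2 = xor_bytes(m2, keystream)  # key re-use: same keystream for a second message

# Key re-use: the keystream cancels, leaking m1 XOR m2 to an eavesdropper.
two_time_pad_leak = xor_bytes(c1, c2)

# Malleability: XORing a chosen difference into the ciphertext flips exactly
# those bits of the plaintext, without the attacker ever knowing the keystream.
delta = xor_bytes(b"PAY ALICE $0100 ", b"PAY ALICE $9100 ")
tampered_ciphertext = xor_bytes(c1, delta)
decrypted_after_tampering = xor_bytes(tampered_ciphertext, keystream)
```

Here `two_time_pad_leak` equals `m1 XOR m2`, and `decrypted_after_tampering` reads `PAY ALICE $9100 `, which is why stream ciphers need both fresh keystreams and a separate integrity mechanism.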
+ +### Block cipher + +View cipher as a Pseudo-Random Permutation (PRP) + +#### Pseudorandom permutation + +- PRP defined over $(K, X)$: + - $E: K \times X \to X$ + - such that: + 1. There exists an "efficient" deterministic algorithm to evaluate $E(k, x)$. + 2. The function $E(k, \cdot)$ is one-to-one. + 3. There exists an "efficient" inversion algorithm $D(k, y)$. + +- i.e. a PRF that is an invertible one-to-one mapping from message space to message space + +#### Security of block ciphers + +Intuition: a PRP is secure if a random function in $S_F = \{E(k, \cdot) : k \in K\}$ (the cipher under a random key) is indistinguishable from a random function in $\operatorname{Perms}[X]$ (a truly random permutation of $X$). + +The adversarial game: the adversary chooses inputs $x$; the challenger either samples a random key $k$ and answers with $E(k,x)$, or samples a truly random permutation from $\operatorname{Perms}[X]$ and answers with its value at $x$. The adversary must decide which of the two it is talking to. + +#### Block cipher constructions: Feistel network + +Forward network: + +![Feistel network](https://notenextra.trance-0.com/CSE4303/Feistel_network.png) + +- Forward (round $i$): given $(L_{i-1}, R_{i-1}) \in \{0,1\}^n \times \{0,1\}^n$, + - $L_i = R_{i-1}$ + - $R_i = L_{i-1} \oplus f_i(R_{i-1})$ + +- Proof that the network is invertible, even when the $f_i$ are not (construct the inverse): + - Suppose we are given the output of round $i$, namely $(L_i, R_i)$. + - Recover the previous right half immediately: + - $R_{i-1} = L_i$ + - Then recover the previous left half by undoing the XOR: + - $L_{i-1} = R_i \oplus f_i(R_{i-1}) = R_i \oplus f_i(L_i)$ + - Therefore each round map is invertible, with inverse transformation: + - $R_{i-1} = L_i$ + - $L_{i-1} = f_i(L_i) \oplus R_i$ + - Applying this inverse for $i=d,d-1,\ldots,1$ recovers $(L_0,R_0)$ from $(L_d,R_d)$, so the whole Feistel network $F$ is invertible. 
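
The inverse construction in the proof can be checked with a short round-trip sketch. The round function below (truncated SHA-256 of key and half-block) is an assumption chosen only because it is easy to write and is itself not invertible; it is not the round function of any real cipher, and the round keys are hypothetical.

```python
import hashlib


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def round_fn(round_key: bytes, half: bytes) -> bytes:
    """Illustrative f_i: a non-invertible function of the right half."""
    return hashlib.sha256(round_key + half).digest()[: len(half)]


def feistel_forward(block: bytes, round_keys: list[bytes]) -> bytes:
    n = len(block) // 2
    left, right = block[:n], block[n:]
    for key in round_keys:
        # L_i = R_{i-1},  R_i = L_{i-1} XOR f_i(R_{i-1})
        left, right = right, xor_bytes(left, round_fn(key, right))
    return left + right


def feistel_inverse(block: bytes, round_keys: list[bytes]) -> bytes:
    n = len(block) // 2
    left, right = block[:n], block[n:]
    for key in reversed(round_keys):
        # R_{i-1} = L_i,  L_{i-1} = R_i XOR f_i(L_i)
        left, right = xor_bytes(right, round_fn(key, left)), left
    return left + right


round_keys = [b"k1", b"k2", b"k3", b"k4"]  # hypothetical round keys
plaintext = b"16-byte message!"
ciphertext = feistel_forward(plaintext, round_keys)
recovered = feistel_inverse(ciphertext, round_keys)
```

Note that `feistel_inverse` never inverts `round_fn`; it only recomputes it, which is the point of the proof.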
+ +- Notation sketch (each wire is $n$ bits): + - Input: $(L_0, R_0)$ + - Rounds: + - $L_1 = R_0,\ \ R_1 = L_0 \oplus f_1(R_0)$ + - $L_2 = R_1,\ \ R_2 = L_1 \oplus f_2(R_1)$ + - $\cdots$ + - $L_d = R_{d-1},\ \ R_d = L_{d-1} \oplus f_d(R_{d-1})$ + - Output: $(L_d, R_d)$ + +#### Block ciphers: block modes: ECB + +New attacker model for multi-use keys (e.g. multiple blocks): CPA (Chosen Plaintext)-capable, not just CT-only + +- Attacker sees many PT/CT pairs for same key +- Conservative model: attacker submits arbitrary PT (hence "C"PA) +- Cipher goal: maintain semantic security against CPA + +#### CPA indistinguishability game + +- Updated adversarial game for a CPA attacker: + - Let $E = (E, D)$ be a cipher defined over $(K, M, C)$. For $b \in \{0,1\}$ define $\operatorname{EXP}(b)$ as: + +- Experiment $\operatorname{EXP}(b)$: + - Challenger samples $k \leftarrow K$. + - For each query $i = 1,\ldots,q$: + - Adversary outputs messages $m_{i,0}, m_{i,1} \in M$ such that $|m_{i,0}| = |m_{i,1}|$. + - Challenger returns $c_i \leftarrow E(k, m_{i,b})$. + - Finally, the adversary outputs a bit $b' \in \{0,1\}$; $\operatorname{EXP}(b)$ denotes this output. + +- Encryption-oracle access (CPA): + - If the adversary wants $c = E(k, m)$, it queries with $m_{j,0} = m_{j,1} = m$ (so the response is $E(k,m)$ regardless of $b$). + +#### Semantic security under CPA + +- Def: $E$ is semantically secure under CPA if for all "efficient" adversaries $A$, + - $\operatorname{Adv}^{\operatorname{CPA}}[A,E] = \left|\Pr[\operatorname{EXP}(0)=1] - \Pr[\operatorname{EXP}(1)=1]\right|$ + - is negligible. + +### Summary for symmetric encryption + +1. Stream ciphers + - Rely on secure PRG + - No key re-use + - Fast, low-mem, less robust +2. Block ciphers + - Rely on secure PRP + - Allow key re-use (usually only across blocks, not sessions) + - Provide authenticated encryption in some modes (e.g. 
GCM) + - Slower, higher-mem, more robust + - Used in practice for most crypto tasks (including secure network channels) + +## Hash functions + +### Hash function security properties + +- Given a function $h:X \to Y$, we say that $h$ is: + +- 1. Preimage resistant (one-way) if: + - given $y \in Y$ it is computationally infeasible to find a value $x \in X$ s.t. $h(x) = y$ + +- 2. 2nd preimage resistant (weak collision resistant) if: + - given a specific $x \in X$ it is computationally infeasible to find a value $x' \in X$ s.t. $x' \ne x$ and $h(x') = h(x)$ + +- 3. Collision resistant (strong collision resistant) if: + - it is computationally infeasible to find any two distinct values $x', x \in X$ s.t. $h(x') = h(x)$ + +### Collision resistance: adversarial definition + +- Let $H: M \to T$ be a hash function ($|M| \gg |T|$). +- A function $H$ is collision resistant if for all (explicit) "efficient" algorithms $A$, + - $\operatorname{Adv}^{\operatorname{CR}}[A,H] = Pr[$A outputs a collision for $H$ $]$ + - is negligible + +### Hash function integrity applications + +1. Delayed knowledge verification +2. Password storage +3. Trusted timestamping / blockchains +4. Integrity check on software + +#### File integrity with secure read-only space + +- When user downloads package, can verify that contents are valid +- $H$ collision resistant $\Rightarrow$ attacker cannot modify package without detection +- No encryption needed (public verifiability) if publisher has secure read-only space (e.g. 
trusted website, social media account) + +#### Symmetric-crypto message authentication + +- Context: Assume no secure RO space (insecure channel only) + - Need means of message authentication +- Idea: add tag to message +- System: Message Authentication Code (MAC) +- Def: a MAC $I=(S,V)$ defined over $(K,M,T)$ is a pair of algorithms: + - $S(k,m)$ outputs $t \in T$ // "Sign" + - $V(k,m,t)$ outputs `yes' or `no' // "Verify" + +- Symmetric-crypto message authentication: + - Alice and Bob share secret key $k$ + - Generate tag: $\text{tag} \leftarrow S(k,m)$ + - Verify tag: $V(k,m,\text{tag}) = \texttt{yes}?$ + +#### MAC security model + +- For a MAC $I=(S,V)$ and adversary $A$, define a MAC game as: +- Def: $I=(S,V)$ is a secure MAC if for all "efficient" $A$, + - $\operatorname{Adv}^{\operatorname{MAC}}[A,I] = \Pr[\text{Chal. outputs }1]$ + - is negligible + +- MAC game (sketch): + - Challenger samples $k \leftarrow K$ + - Adversary makes queries $m_1,\ldots,m_q \in M$ + - For each $i$, challenger returns $t_i \leftarrow S(k,m_i)$ + - Adversary outputs a candidate forgery $(m,t)$ + - Challenger outputs $b=1$ if: + - $V(k,m,t)=\texttt{yes}$ and + - $(m,t) \notin \{(m_1,t_1),\ldots,(m_q,t_q)\}$ + - Otherwise challenger outputs $b=0$ + +- MAC security example: secure PRF not sufficient + - Suppose $F: K \times X \to Y$ is a secure PRF with $Y=\{0,1\}^{10}$. + - Is the derived MAC $I_F$ a secure MAC system? + - No: tags are too short, anyone can guess the tag for any message + +#### MACs from PRFs: sufficient security condition + +- Thm: If $F: K \times X \to Y$ is a secure PRF and $1/|Y|$ is negligible (i.e. $|Y|$ is large), then $I_F$ is a secure MAC. +- In particular, for every efficient MAC adversary $A$ attacking $I_F$, there exists an efficient PRF adversary $B$ attacking $F$ such that: + - $\operatorname{Adv}^{\operatorname{MAC}}[A, I_F] \le \operatorname{Adv}^{\operatorname{PRF}}[B, F] + 1/|Y|$ +- Therefore $I_F$ is secure as long as $|Y|$ is large, e.g. 
$|Y| = 2^{80}$. + +#### MACs from collision resistance + +- Let $I=(S,V)$ be a MAC for short messages over $(K,M,T)$ (e.g. AES used as a PRF). +- Let $H: M_{\text{big}} \to M$. +- Def: $I_{\text{big}}=(S_{\text{big}},V_{\text{big}})$ over $(K,M_{\text{big}},T)$ as: + - $S_{\text{big}}(k,m) = S(k, H(m))$ + - $V_{\text{big}}(k,m,t) = V(k, H(m), t)$ +- Thm: If $I$ is a secure MAC and $H$ is collision resistant, then $I_{\text{big}}$ is a secure MAC. +- Example: $S(k,m) = \operatorname{AES2\text{-}block\text{-}cbc}(k, \operatorname{SHA\text{-}256}(m))$ is a secure MAC. + +#### Using HMACs for confidentiality + integrity + +- Confidentiality: + - Semantic security under a CPA + - Encryption secure against eavesdropping only +- Integrity: + - Existential unforgeability under a chosen message attack (CMA) + - CBC-MAC, HMAC + - Hash functions +- Confidentiality + integrity: + - CCA security + - Secure against tampering + - Method: Authenticated Encryption (AE) + - Encryption + MAC, combined in the correct order (e.g. encrypt-then-MAC) + +#### Authenticated Encryption: security defs + +- An authenticated encryption system $(E,D)$ is a cipher where: + - $E: K \times M \times N \to C$ + - $D: K \times C \times N \to M \cup \{\bot\}$, where $\bot$ means the ciphertext is rejected +- Security: the system must provide + - semantic security under a CPA, and + - ciphertext integrity: attacker cannot create new ciphertexts that decrypt properly + +#### Ciphertext integrity + +- Let $(E,D)$ be a cipher with message space $M$. +- Def: $(E,D)$ has ciphertext integrity if for all "efficient" $A$, + - $\operatorname{Adv}^{\operatorname{CI}}[A,E] = \Pr[\text{Chal. 
outputs }1]$ + - is negligible + +- Security model: ciphertext integrity (sketch): + - Challenger samples $k \leftarrow K$ + - Adversary makes encryption queries $m_1,\ldots,m_q \in M$ + - For each $i$, challenger returns $c_i \leftarrow E(k,m_i)$ + - Adversary outputs a ciphertext $c$ + - Challenger outputs $b=1$ if: + - $D(k,c) \ne \bot$ and + - $c \notin \{c_1,\ldots,c_q\}$ + - Otherwise challenger outputs $b=0$ + +#### Authenticated encryption implies CCA security + +- Thm: Let $(E,D)$ be a cipher that provides AE. Then $(E,D)$ is CCA secure. +- In particular, for any $q$-query efficient adversary $A$, there exist efficient $B_1,B_2$ such that: + - $\operatorname{Adv}^{\operatorname{CCA}}[A,E] \le 2q \cdot \operatorname{Adv}^{\operatorname{CI}}[B_1,E] + \operatorname{Adv}^{\operatorname{CPA}}[B_2,E]$ +- Interpretation: CCA advantage is $\le O(\text{CT-integrity advantage}) + \text{CPA advantage}$. + +- AE implication: authenticity + - Attacker cannot fool Bob into thinking a message was sent from Alice + - If attacker cannot create a valid ciphertext $c \notin \{c_1,\ldots,c_q\}$, then whenever $D(k,c) \ne \bot$ Bob knows the message is from someone who knows $k$ (but it could be a replay) + +- DS construction example: signing a certificate + +### Comparison: integrity/authentication approaches + +- 1) Collision resistant hashing: need a read-only public space + - Allows public verification if the hash is published in a small read-only public space +- 2) MACs: must compute a new MAC for every client/user + - Must manage a long-term secret key per user to verify MACs (depending on application) + - Typically useful when one party signs, one verifies +- 3) Digital signatures: must manage a long-term secret key + - E.g. 
vendor's signature on software is shipped with software + - Allows software to be downloaded from an untrusted distribution site + - Public-key verification/rejection works, provided public key distribution is trustworthy + - Typically useful when one party signs, many verify + +## Asymmetric key cryptography + +### Asymmetric crypto overview + +- Parties: sender, recipient, attacker (eavesdropping) +- Goal: sender encrypts a plaintext to a ciphertext using a public key; recipient decrypts using a private key. + +#### Public-key encryption system + +- Def: a public-key encryption system is a triple of algorithms $(G, E, D)$: + - $G()$: randomized algorithm that outputs a key pair $(pk, sk)$ + - $E(pk, m)$: randomized algorithm that takes $m \in M$ and outputs $c \in C$ + - $D(sk, c)$: deterministic algorithm that takes $c \in C$ and outputs $m \in M$ or $\bot$ +- Consistency: for all $(pk, sk)$ output by $G$, for all $m \in M$, + - $D(sk, E(pk, m)) = m$ + +#### Trapdoor function + +- Def: a trapdoor function $X \to Y$ is a triple of efficient algorithms $(G, F, F^{-1})$: + - $G()$: randomized algorithm that outputs a key pair $(pk, sk)$ + - $F(pk, \cdot)$: deterministic algorithm that defines a function $X \to Y$ + - $F^{-1}(sk, \cdot)$: defines a function $Y \to X$ that inverts $F(pk, \cdot)$ +- More precisely: for all $(pk, sk)$ output by $G$, for all $x \in X$, + - $F^{-1}(sk, F(pk, x)) = x$ + +#### Symmetric vs. 
asymmetric security: attacker models + +- Symmetric ciphers: two security notions for a passive attacker + - One-time security (stream ciphers: ciphertext-only) + - Many-time security (block ciphers: CPA) + - One-time security $\not\Rightarrow$ many-time security + - Example: ECB mode is one-time secure but not many-time secure +- Public-key encryption: single notion for a passive attacker + - Attacker can encrypt by themselves using the public key + - Therefore one-time security $\Rightarrow$ many-time security (CPA) + - Implication: public-key encryption must be randomized + - Analogous to secure block modes for block ciphers + +### Semantic security of asymmetric crypto (IND-CPA) + +#### IND-CPA game for public-key encryption + +- For $b \in \{0,1\}$ define experiments $\operatorname{EXP}(0)$ and $\operatorname{EXP}(1)$: + +- Experiment $\operatorname{EXP}(b)$: + - Challenger runs $(pk, sk) \leftarrow G()$ + - Challenger sends $pk$ to adversary $A$ + - Adversary outputs $m_0, m_1 \in M$ such that $|m_0| = |m_1|$ + - Challenger returns $c \leftarrow E(pk, m_b)$ + - Adversary outputs a bit $b' \in \{0,1\}$ (often modeled as outputting 1 if it "guesses $b=1$"); $\operatorname{EXP}(b)$ denotes this output + +#### Semantic security (IND-CPA) + +- Def: $E = (G, E, D)$ is semantically secure (a.k.a. IND-CPA) if for all efficient adversaries $A$, + - $\operatorname{Adv}^{\operatorname{SS}}[A, E] = \left|\Pr[\operatorname{EXP}(0)=1] - \Pr[\operatorname{EXP}(1)=1]\right|$ + - is negligible +- Note: the definition is inherently many-time because the attacker can always encrypt on their own using $pk$ (CPA power is "built in"). 
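
To see why a deterministic public-key scheme cannot be IND-CPA secure, the game above can be played against one. The sketch below uses unpadded "textbook" RSA with a tiny public key $(n, e) = (119, 5)$ as a stand-in deterministic scheme; the key, the two messages, and all function names are assumptions made purely for illustration.

```python
# Toy deterministic public-key scheme: unpadded RSA with tiny parameters.
PUBLIC_KEY = (119, 5)  # (n, e), illustrative only


def encrypt(pk: tuple[int, int], m: int) -> int:
    n, e = pk
    return pow(m, e, n)


def challenger(pk: tuple[int, int], m0: int, m1: int, b: int) -> int:
    """Return the challenge ciphertext c = E(pk, m_b)."""
    return encrypt(pk, m1 if b else m0)


def adversary_guess(pk: tuple[int, int], m0: int, m1: int, c: int) -> int:
    """Re-encrypt m0 with the public key and compare with the challenge."""
    return 0 if encrypt(pk, m0) == c else 1


def experiment(b: int) -> int:
    """EXP(b): the adversary's output bit when the challenger encrypts m_b."""
    m0, m1 = 19, 20
    c = challenger(PUBLIC_KEY, m0, m1, b)
    return adversary_guess(PUBLIC_KEY, m0, m1, c)


# The scheme is deterministic, so EXP(b) is a fixed value rather than a
# probability, and the advantage |Pr[EXP(0)=1] - Pr[EXP(1)=1]| is 1.
advantage = abs(experiment(0) - experiment(1))
```

The adversary never touches the secret key: access to $pk$ alone lets it recompute $E(pk, m_0)$, which is exactly the attack that randomized encryption is meant to prevent.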
+ +### RSA cryptosystem: overview + +- Setup: + - $n = pq$, with $p$ and $q$ primes + - Choose $e$ relatively prime to $\phi(n) = (p-1)(q-1)$ + - Choose $d$ as the inverse of $e$ in $\mathbb{Z}_{\phi(n)}$ +- Keys: + - Public key: $K_E = (n, e)$ + - Private key: $K_D = d$ +- Encryption: + - Plaintext $M \in \mathbb{Z}_n$ + - $C = M^e \bmod n$ +- Decryption: + - $M = C^d \bmod n$ + +- Example: + - Setup: + - $p = 7$, $q = 17$ + - $n = 7 \cdot 17 = 119$ + - $\phi(n) = 6 \cdot 16 = 96$ + - $e = 5$ + - $d = 77$ + - Keys: + - public key: $(119, 5)$ + - private key: $77$ + - Encryption: + - $M = 19$ + - $C = 19^5 \bmod 119 = 66$ + - Decryption: + - $M = 66^{77} \bmod 119 = 19$ + +- Security intuition: + - To invert RSA without $d$, attacker must compute $x$ from $c = x^e \pmod n$. + - Best known approach: + - Step 1: factor $n$ (hard) + - Step 2: compute $e$-th roots modulo $p$ and $q$ (easy once factored) + - Notes (as commonly stated in lectures): + - 1024-bit RSA is within reach; 2048-bit is recommended usage + +### Diffie-Hellman key exchange (informal) + +- Fix a large prime $p$ (e.g., 2000 bits) +- Fix an integer $g \in \{1,\ldots,p\}$ + +- Protocol: + - Alice chooses random $a \in \{1,\ldots,p-1\}$ and sends $A = g^a \bmod p$ + - Bob chooses random $b \in \{1,\ldots,p-1\}$ and sends $B = g^b \bmod p$ + - Shared key: + - Alice computes $k_{AB} = B^a \bmod p = g^{ab} \bmod p$ + - Bob computes $k_{AB} = A^b \bmod p = g^{ab} \bmod p$ + +- Hardness assumptions: + - Discrete log problem: given $p, g, y = g^x \bmod p$, find $x$ + - Diffie-Hellman function: $\operatorname{DH}_g(g^a, g^b) = g^{ab} \bmod p$ + +#### Diffie-Hellman: security notes + +- As described, the protocol is insecure against active attacks: + - A man-in-the-middle (MiTM) can insert themselves and create 2 separate secure sessions +- Fix idea: need a way to bind identity to a public key + - In practice: web of trust (e.g., GPG) or Public Key Infrastructure (PKI) + +### Implementing trapdoor functions 
securely + +- Never encrypt by applying $F$ directly to plaintext: + - Deterministic: cannot be semantically secure + - Many attacks exist for concrete TDFs + - Same plaintext blocks yield same ciphertext blocks + +- Naive (insecure) sketch: + - $E(pk, m)$: output $c \leftarrow F(pk, m)$ + - $D(sk, c)$: output $F^{-1}(sk, c)$ + +### Public-key encryption from TDFs + +- Components: + - $(G, F, F^{-1})$: secure TDF $X \to Y$ + - $(E_s, D_s)$: symmetric authenticated encryption over $(K, M, C)$ + - $H: X \to K$: a hash function + +- Construction of $(G, E, D)$ (with $G$ same as in the TDF): + - $E(pk, m)$: + - sample $x \leftarrow X$, compute $y \leftarrow F(pk, x)$ + - derive $k \leftarrow H(x)$, compute $c \leftarrow E_s(k, m)$ + - output $(y, c)$ + - $D(sk, (y, c))$: + - compute $x \leftarrow F^{-1}(sk, y)$ + - derive $k \leftarrow H(x)$, compute $m \leftarrow D_s(k, c)$ + - output $m$ + +- Visual intuition: + - header: $y = F(pk, x)$ + - body: $c = E_s(H(x), m)$ + +- Security theorem (lecture-style statement): + - If $(G, F, F^{-1})$ is a secure TDF, $(E_s, D_s)$ provides authenticated encryption, and $H$ is modeled as a random oracle, then $(G, E, D)$ is CCA-secure in the random oracle model (often denoted CCA-RO). + - Extension exists to reach full CCA (outside the RO idealization). + +### Wrapup: symmetric vs. asymmetric systems + +- Symmetric: faster, but key distribution is hard +- Asymmetric: slower, but key distribution/management is easier +- Application: secure web sessions (e.g., online shopping) + - Use symmetric-key encrypted sessions for bulk traffic + - Exchange symmetric keys using an asymmetric scheme + - Authenticate public keys (PKI or web of trust) + +### Key exchange: summary + +- Symmetric-key encryption challenges: + - Key storage: one per user pair, $O(n^2)$ total for $n$ users + - Key exchange: how to do it over a non-secure channel? 
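
One answer to the key-exchange question is the Diffie-Hellman protocol described earlier in these notes. The sketch below runs it with toy parameters; the prime `P = 23` and base `G = 5` are assumptions chosen for readability and are far too small for real use (the notes suggest a prime of roughly 2000 bits).

```python
import secrets

P = 23  # toy prime, illustrative only
G = 5   # toy base, illustrative only


def dh_keypair(p: int, g: int) -> tuple[int, int]:
    """Pick a random secret exponent in {1, ..., p-1} and return (secret, g^secret mod p)."""
    secret = secrets.randbelow(p - 1) + 1
    return secret, pow(g, secret, p)


# Each party keeps its secret exponent and sends only the public value.
alice_secret, alice_public = dh_keypair(P, G)
bob_secret, bob_public = dh_keypair(P, G)

# Both sides arrive at g^(ab) mod p without ever transmitting it.
alice_key = pow(bob_public, alice_secret, P)
bob_key = pow(alice_public, bob_secret, P)
```

Nothing in this exchange tells Alice that `bob_public` really came from Bob, which is the man-in-the-middle gap that certificates or a trusted third party are meant to close.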
+ +- Possible solutions: + +- 1) Trusted Third Party (TTP) + - All users establish separate secret keys with the TTP + - TTP helps manage user-user keys (storage and secure channel) + - Applicability: + - Works for local domains + - Popular implementation: Kerberos for Single Sign On (SSO) + - Challenges: + - Scale: central authentication server is not suitable for the entire Internet + - Latency: requires online response from central server for every user-user session + +- 2) Public/private keys with certificates + - All users have a single stable public key (helps with key storage and exchange) + - Users exchange per-session symmetric keys via a secure channel using public/private keys + - Trusting public keys: binding is validated by a third-party authority (Certificate Authority, CA) + - Why better than TTP? CAs can validate statically by issuing certificates, then be uninvolved + - CA/certificate process covered in a future lecture + +## Appendix for additional algorithms and methods + +### Feistel network (used by several items below) + +A **Feistel network** splits a block into left/right halves and iterates rounds of the form $(L_{i+1},R_{i+1})=(R_i, L_i\oplus F(R_i,K_i))$, so decryption reuses the same structure with subkeys in reverse order. + +Feistel-based here: **DES, 3DES, CAMELLIA, SEED, GOST 28147-89 (and thus GOST89MAC uses a Feistel block cipher internally).** + +### Key exchange and authentication selectors (not symmetric encryption, not MAC) + +These describe *how keys are negotiated* and/or *how the peer is authenticated*, not whether payload is a block/stream cipher. + +#### RSA / DH / ECDH families + +- **kRSA, RSA**: (key exchange) the premaster secret is sent encrypted under the server’s RSA public key (classic TLS RSA KX). +- **aRSA, aECDSA, aDSS, aGOST, aGOST01**: (authentication) the server identity is proven via a certificate signature scheme (RSA / ECDSA / DSA / GOST). 
+- **kDHr, kDHd, kDH**: (key exchange) *static* DH key agreement using DH certificates (obsolete/removed in newer OpenSSL). +- **kDHE, kEDH, DH / DHE, EDH / ECDHE, EECDH / kEECDH, kECDHE, ECDH**: (key exchange) *ephemeral* (EC)DH derives a fresh shared secret each handshake; "authenticated" variants bind it to a cert/signature. +- **aDH**: (authentication selector) indicates DH-authenticated suites (DH certs; also removed in newer OpenSSL). + +#### PSK family + +- **PSK**: (keying model) uses a pre-shared secret as the authentication/secret basis. +- **kPSK, kECDHEPSK, kDHEPSK, kRSAPSK**: (key exchange) PSK combined with (EC)DHE or RSA to derive/transport session keys. +- **aPSK**: (authentication) PSK itself authenticates endpoints (except RSA_PSK where cert auth may be involved). + +--- + +### Symmetric encryption / AEAD (this is where "block vs stream" applies) + +#### AES family + +- **AES128 / AES256 / AES**: **encryption/decryption**; **block cipher**; core algorithm: AES is an SPN (substitution–permutation network) of repeated SubBytes/ShiftRows/MixColumns/AddRoundKey rounds. +- **AES-GCM**: **both encryption + message authentication (AEAD)**; **both** (AES block cipher used in counter mode + auth); core algorithm: encrypt with AES-CTR and authenticate with GHASH over ciphertext/AAD to produce a tag. +- **AES-ECB**: Functionality is encryption/decryption (confidentiality only) using a block cipher mode; core algorithm encrypts each 128-bit plaintext block independently under the same key, which deterministically leaks patterns because equal plaintext blocks map to equal ciphertext blocks. +- **AES-CBC**: Functionality is encryption/decryption (confidentiality only) using a block cipher mode; core algorithm XORs each plaintext block with the previous ciphertext block (starting from a fresh unpredictable IV) before AES-encrypting, which hides repetitions but requires correct IV handling and padding for non-multiple-of-block messages. 
+- **AES-OFB** — **encryption**; both (stream-like); repeatedly AES-encrypts an internal state to generate a keystream and XORs it with plaintext, where the state evolves independently of the plaintext/ciphertext. +- **AESCCM / AESCCM8** — **both encryption + message authentication (AEAD)**; **both**; core algorithm: compute CBC-MAC then encrypt with CTR mode, with 16-byte vs 8-byte tag length variants. + +#### ARIA family + +- **ARIA128 / ARIA256 / ARIA** — **encryption/decryption**; **block cipher**; core algorithm: ARIA is an SPN-style block cipher with byte-wise substitutions and diffusion layers across rounds. + +#### CAMELLIA family + +- **CAMELLIA128 / CAMELLIA256 / CAMELLIA** — **encryption/decryption**; **block cipher**; core algorithm: Camellia is a **Feistel network** with round functions plus extra FL/FL$^{-1}$ layers for nonlinearity and diffusion. *(Feistel: yes)* + +#### ChaCha20 + +- **CHACHA20** — **encryption/decryption**; **stream cipher**; core algorithm: ChaCha20 generates a keystream via repeated ARX (add-rotate-xor) quarter-rounds on a 512-bit state and XORs it with plaintext. + +#### DES / 3DES + +- **DES** — **encryption/decryption**; **block cipher**; core algorithm: DES is a 16-round **Feistel network** using expansion, S-boxes, and permutations. *(Feistel: yes)* +- **3DES** — **encryption/decryption**; **block cipher**; core algorithm: applies DES three times (EDE or EEE) to increase effective security while retaining the **Feistel** DES core. *(Feistel: yes)* + +#### RC4 + +- **RC4** — **encryption/decryption**; **stream cipher**; core algorithm: maintains a 256-byte permutation and produces a keystream byte-by-byte that is XORed with plaintext. + +#### RC2 / IDEA / SEED + +- **RC2** — **encryption/decryption**; **block cipher**; core algorithm: mixes key-dependent operations (adds, XORs, rotates) across rounds with "mix" and "mash" steps (not Feistel). 
+- **IDEA** — **encryption/decryption**; **block cipher**; core algorithm: combines modular addition, modular multiplication, and XOR in a Lai–Massey-like structure to achieve diffusion/nonlinearity (not Feistel). +- **SEED** — **encryption/decryption**; **block cipher**; core algorithm: a 16-round **Feistel network** with nonlinear S-box-based round functions. *(Feistel: yes)* + +--- + +### Hash / MAC / digest selectors (message authentication side) + +These are not "ciphers" but are used for integrity/authentication (often as HMAC, PRF, signatures). + +- **MD5** — **message authentication component** (typically via HMAC, historically); **cipher method: N/A**; core algorithm: iterated Merkle–Damgård hash compressing 512-bit blocks into a 128-bit digest (now considered broken for collision resistance). +- **SHA1, SHA** — **message authentication component** (typically HMAC-SHA1 historically); **N/A**; core algorithm: Merkle–Damgård hash producing 160-bit output via 80-step compression (collisions known). +- **SHA256 / SHA384** — **message authentication component** (HMAC / TLS PRF / signatures); **N/A**; core algorithm: SHA-2 family Merkle–Damgård hashes with different word sizes/output lengths (256-bit vs 384-bit). +- **GOST94** — **message authentication component** (HMAC based on GOST R 34.11-94); **N/A**; core algorithm: builds an HMAC tag by hashing inner/outer padded key with the message using the GOST hash. +- **GOST89MAC** — **message authentication**; **block-cipher-based MAC (so "block" internally)**; core algorithm: computes a MAC using the GOST 28147-89 block cipher in a MAC mode (cipher-based chaining). *(Feistel internally via GOST 28147-89)* + +> Latest version of cheatsheet distilled from this note. 
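
The HMAC usage mentioned for the digests above maps onto the MAC interface $(S, V)$ from the message-authentication section. The sketch below uses Python's standard `hmac` and `hashlib` modules with SHA-256; the function names `sign` and `verify`, the key length, and the sample messages are assumptions made for illustration.

```python
import hashlib
import hmac
import secrets


def sign(key: bytes, message: bytes) -> bytes:
    """S(k, m): compute an HMAC-SHA256 tag."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """V(k, m, t): recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign(key, message), tag)


shared_key = secrets.token_bytes(32)  # key shared by Alice and Bob
message = b"transfer 10 to bob"
tag = sign(shared_key, message)

accepted = verify(shared_key, message, tag)
# A modified message with the old tag should be rejected.
forgery_accepted = verify(shared_key, b"transfer 99 to bob", tag)
```

The 256-bit tag keeps $1/|Y|$ negligible, in line with the tag-length condition in the PRF-based MAC theorem.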
diff --git a/content/CSE4303/_meta.js b/content/CSE4303/_meta.js index ab7862b..d62ca52 100644 --- a/content/CSE4303/_meta.js +++ b/content/CSE4303/_meta.js @@ -3,6 +3,7 @@ export default { "---":{ type: 'separator' }, + CSE4303_E1: "Exam review", CSE4303_L1: "Introduction to Computer Security (Lecture 1)", CSE4303_L2: "Introduction to Computer Security (Lecture 2)", CSE4303_L3: "Introduction to Computer Security (Lecture 3)", diff --git a/content/Math4202/Exam_reviews/Math4202_E1.md b/content/Math4202/Exam_reviews/Math4202_E1.md index 4f2801c..f5bfd97 100644 --- a/content/Math4202/Exam_reviews/Math4202_E1.md +++ b/content/Math4202/Exam_reviews/Math4202_E1.md @@ -78,6 +78,27 @@ An $m$-dimensional **manifold** is a topological space $X$ that is 2. Second countable: With a countable basis 3. Local euclidean: Each point of $x$ of $X$ has a neighborhood that is homeomorphic to an open subset of $\mathbb{R}^m$. +
+Examples of spaces that are not manifolds but satisfy part of the definition + +Non-Hausdorff: + +Consider the line with two origins, $(\mathbb{R}\setminus\{0\})\cup\{p,q\}$, with the topology generated by all open intervals that do not contain the origin together with the sets of the form $(-a,0)\cup \{p\}\cup (0,a)$ and $(-a,0)\cup \{q\}\cup (0,a)$ for $a>0$. It is second countable and locally Euclidean, but $p$ and $q$ cannot be separated by disjoint open sets. + +--- + +Not second countable: + +Consider the long line: $S_\Omega\times [0,1)$ in the dictionary order with its smallest element deleted, where $S_\Omega$ is the minimal uncountable well-ordered set. It is Hausdorff and locally Euclidean but not second countable. + +--- + +Not locally Euclidean: + +Any finite 1-dimensional CW complex (graph) that has a vertex with 3 or more edges connected to it is Hausdorff and second countable, but not locally Euclidean at those vertices. + +
+ #### Whitney's Embedding Theorem If $X$ is a compact $m$-manifold, then $X$ can be imbedded in $\mathbb{R}^N$ for some positive integer $N$. @@ -97,6 +118,12 @@ Let $\{U_i\}_{i=1}^n$ be a finite open cover of a normal space $X$ (Every pair o Then there exists a partition of unity dominated by $\{U_i\}_{i=1}^n$. +#### Definition of paracompact space + +Locally finite: a collection $\mathcal{B}$ of subsets of $X$ is locally finite if every $x\in X$ has an open neighborhood $U$ that intersects only finitely many elements of $\mathcal{B}$. + +A space $X$ is paracompact if every open cover $\mathcal{A}$ of $X$ has a **locally finite** open refinement $\mathcal{B}$ that covers $X$. + ### Homotopy #### Definition of homotopy equivalent spaces @@ -128,7 +155,6 @@ Two pathes $f$ and $f'$ are path homotopic if The $\simeq$, $\simeq_p$ are both equivalence relations. - #### Definition for product of paths Given $f$ a path in $X$ from $x_0$ to $x_1$ and $g$ a path in $X$ from $x_1$ to $x_2$. diff --git a/public/CSE4303/Feistel_network.png b/public/CSE4303/Feistel_network.png new file mode 100644 index 0000000..ad6be5c Binary files /dev/null and b/public/CSE4303/Feistel_network.png differ