diff --git a/pages/CSE332S/CSE332S_L10.md b/pages/CSE332S/CSE332S_L10.md index 8e77085..a1ac8f8 100644 --- a/pages/CSE332S/CSE332S_L10.md +++ b/pages/CSE332S/CSE332S_L10.md @@ -1 +1,274 @@ -# CSE332S Lecture 10 \ No newline at end of file +# CSE332S Lecture 10 + +## Associative Containers + +| Container | Sorted | Unique Key | Allow duplicates | +| -------------------- | ------ | ---------- | ---------------- | +| `set` | Yes | Yes | No | +| `multiset` | Yes | Yes | Yes | +| `unordered_set` | No | Yes | No | +| `unordered_multiset` | No | Yes | Yes | +| `map` | Yes | Yes | No | +| `multimap` | Yes | Yes | Yes | +| `unordered_map` | No | Yes | No | +| `unordered_multimap` | No | Yes | Yes | + +Associative containers support efficient key lookup +vs. sequence containers, which lookup by position +Associative containers differ in 3 design dimensions + +- Ordered vs. unordered (tree vs. hash structured) +- We’ll look at ordered containers today, unordered next time +- Set vs. map (just the key or the key and a mapped type) +- Unique vs. multiple instances of a key + +### Ordered Associative Containers + +Example: `set`, `multiset`, `map`, `multimap` + +Ordered associative containers are tree structured +- Insert/delete maintain sorted order, e.g. operator `<` +- Don’t use sequence algorithms like `sort` or `find` with them + - Already sorted, so sorting unnecessary (or harmful) + - `find` is more efficient (logarithmic time) as a container method +Ordered associative containers are bidirectional +- Can iterate through them in either direction, find sub-ranges +- Can use as source or destination for algorithms like `copy` + +### Set vs. Map + +A set/multiset stores keys (the key is the entire value) + +- Used to collect single-level information (e.g., a set of words to ignore) +- Avoid in-place modification of keys (especially in a set or multiset) + +A map/multimap associates keys with mapped types + +- That style of data structure is sometimes called an associative array +- Map subscripting operator takes key, returns reference to mapped type +- E.g., `string s = employees[id]; // returns employee name` +- If key does not exist, `[]` creates new entry with the key, value-initialized (0 if numeric, default initialized if class) instance of the mapped type + +### Unique vs. Multiple Instances of a Key + +In set and map containers, keys are unique + +- In set, keys are the entire value, so every element is unique +- In map, multiple keys may map to same value, but can’t duplicate keys +- Attempt to insert a duplicate key is ignored by the container (returns false) + +In multiset and multimap containers, duplicate keys ok + +- Since containers are ordered, duplicates are kept next to each other +- Insertion will always succeed, at appropriate place in the order + +### Key Types, Comparators, Strict Weak Ordering + +Like `sort` algorithm, can modify container’s order ... +... with any callable object that can be used correctly for sort + +Must establish a **strict weak ordering** over elements + +- Two keys cannot both be less than each other (inequality), so comparison operator must return `false` if they are equal +- If `a < b` and `b < c` then `a < c` (transitivity of inequality) +- If `!(a < b)` and `! (b < a)` then `a == b` (equivalence) +- If `a == b` and `b == c` then `a == c` (transitivity of eqivalence) + +_Sounds like definition of order in math_ + +Type of the callable object is used in container type + +- Cool example in LLM pp. 426 using `decltype` for a function + - Could do this by declaring your own pointer to function type + - But much easier to let compiler’s type inference figure it out for you + +### Pairs + +Maps use `pair` template to hold key, mapped type + +- A `pair` can be used hold any two types +- Maps use the key type as the 1st element of the pair (`p.first`) +- Maps use the mapped type as the 2nd element of the pair (`p.second`) + +Can compare `pair` variables using operators + +- Equivalence, less than, other relational operators + +Can declare `pair` variables several different ways + +- Easiest uses initialization list (curly braces around values) (e.g. `pair p = {"hello", 1};`) +- Can also default construct (value initialization) (e.g. `pair p;`) +- Can also construct with two values (e.g. `pair p("hello", 1);`) +- Can also use special `make_pair` function (e.g. `pair p = make_pair("hello", 1);`) + +### Unordered Containers (UCs) + +Example: `unordered_set`, `unordered_multiset`, `unordered_map`, `unordered_multimap` + +UCs use `==` to compare elements instead of `<` to order them + +- Types in unordered containers must be equality comparable +- When you write your own structs, overload `==` as well as `<` + +UCs store elements in indexed buckets instead of in a tree + +- Useful for types that don’t have an obvious ordering relation over their values + +UCs use hash functions to put and find elements in buckets + +- May improve performance in some cases (if performance profiling suggests so) +- Declare UCs with pluggable hash functions via callable objects, decltype, etc. +- Or specialize the `std::hash()` template for your type, used by default + +### Summary + +Use associative containers for key based lookup + +- Ordering of elements is maintained over the keys +- Think ranges and ordering rather than position indexes +- A sorted vector may be a better alternative (depends on which operations you will use most often, and their costs) + +Ordered associative containers use strict weak order + +- Operator `<` or any callable object that acts like `<` over `int` can be used + +Maps allow two-level (dictionary-like) lookup + +- Vs. sets which are used for “there or not there” lookup +- Map uses a `pair` to associate key with mapped type + +Can enforce uniqueness or allow duplicates + +- Duplicates are still stored in order, creating “equal ranges” + +## IO Libraries + +### `std::copy()` + +`std::copy()` + +http://www.cplusplus.com/reference/algorithm/copy/ + +Takes 3 parameters: + +- `copy(InputIterator first, InputIterator last, OutputIterator result);` + - `[first, last)` specifies the range of elements to copy. + - `result` specifies where we are copying to. + +Example: + +```cpp +vector v = {1, 2, 3, 4, 5}; +// copy v to cout +std::copy(v.begin(), v.end(), std::ostream_iterator(std::cout, " ")); +``` + +Some useful destination iterator types: + +1. `ostream_iterator` - iterator over an output stream (like `cout`) +2. `insert_iterator` - inserts elements directly into an STL container (will be practiced in studio) + +```cpp +#include +#include +#include +#include +#include + +using namespace std; + +int main(int argc, char *argv[]) { + if (argc != 3) { + cerr << "Usage: " << argv[0] << " " << endl; + return 1; + } + + string input_file = argv[1]; + string output_file = argv[2]; + + ifstream input_file(input_file.c_str()); + ofstream output_file(output_file.c_str()); + + // don't skip whitespace + input_file >> noskipws; + + istream_iterator i (input_file); + ostream_iterator o (output_file); + + // copy the input file to the output file: copy(InputIterator first, InputIterator last, OutputIterator result); + copy(i, istream_iterator(), o); + + cout << "Copied input file" << input_file << " to " << output_file << endl; + + return 0; +} +``` + +### IO reviews + +How to move data into and out of a program: + +- Using `argc` and `argv` to pass command line args +- Using `cout` to print data out to the terminal +- Using `cin` to obtain data from the user at run-time +- Using an `ifstream` to read data in from a file +- Using an `ofstream` to write data out to a file + +How to move data between strings, basic types + +- Using an `istringstream` to extract formatted int values +- Using an `ostringstream` to assemble a string + +### Streams + +Simply a buffer of data (array of bytes). + +Insertion operator (`<<`) specifies how to move data from a variable into an output stream +Extraction operator (`>>`) specifies how to pull data off of an input stream and store it into a variable + +Both operators defined for built-in types: + +- Numeric types +- Pointers +- Pointers to char (char *) + +Cannot copy or assign stream objects + +- Copy construction or assignment syntax using them results in a compile-time error + +Extraction operator consumes data from input stream + +- "Destructive read" that reads a different element each time +- Use a variable if you want to read same value repeatedly + +Need to test streams’ condition states + +- E.g., calling the `is_open` method on a file stream +- E.g., use the stream object in a while or if test +- Insertion and extraction operators return a reference to a stream object, so can test them too + +File stream destructor calls close automatically + +### Flushing and stream manipulators + +An output stream may hold onto data for a while, internally + +- E.g., writing chunks of text rather than a character at a time is efficient +- When it writes data out (e.g., to a file, the terminal window, etc.) is entirely up to the stream, **unless you tell it to flush out its buffers** +- If a program crashes, any un-flushed stream data is lost +- So, flushing streams reasonably often is an excellent debugging trick + +Can tie an input stream directly to an output stream + +- Output stream is then flushed by call to input stream extraction operator +- E.g., `my_istream.tie(&my_ostream);` +- `cout` is already tied to `cin` (useful for prompting the user, getting input) + +Also can flush streams directly using stream manipulators + +- E.g., `cout << flush;` or `cout << endl;` or `cout << unitbuf;` + +Other stream manipulators are useful for formatting streams + +- Field layout: `setwidth`, `setprecision`, etc. +- Display notation: `oct`, `hex`, `dec`, `boolalpha`, `nobooleanalpha`, `scientific`, etc.