Files
NoteNextra-origin/content/CSE332S/CSE332S_L10.md
2025-07-06 12:40:25 -05:00

275 lines
9.6 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# CSE332S Lecture 10
## Associative Containers
| Container | Sorted | Unique Key | Allow duplicates |
| -------------------- | ------ | ---------- | ---------------- |
| `set` | Yes | Yes | No |
| `multiset` | Yes | Yes | Yes |
| `unordered_set` | No | Yes | No |
| `unordered_multiset` | No | Yes | Yes |
| `map` | Yes | Yes | No |
| `multimap` | Yes | Yes | Yes |
| `unordered_map` | No | Yes | No |
| `unordered_multimap` | No | Yes | Yes |
Associative containers support efficient key lookup
vs. sequence containers, which lookup by position
Associative containers differ in 3 design dimensions
- Ordered vs. unordered (tree vs. hash structured)
- Well look at ordered containers today, unordered next time
- Set vs. map (just the key or the key and a mapped type)
- Unique vs. multiple instances of a key
### Ordered Associative Containers
Example: `set`, `multiset`, `map`, `multimap`
Ordered associative containers are tree structured
- Insert/delete maintain sorted order, e.g. `operator<`
- Dont use sequence algorithms like `sort` or `find` with them
- Already sorted, so sorting unnecessary (or harmful)
- `find` is more efficient (logarithmic time) as a container method
Ordered associative containers are bidirectional
- Can iterate through them in either direction, find sub-ranges
- Can use as source or destination for algorithms like `copy`
### Set vs. Map
A set/multiset stores keys (the key is the entire value)
- Used to collect single-level information (e.g., a set of words to ignore)
- Avoid in-place modification of keys (especially in a set or multiset)
A map/multimap associates keys with mapped types
- That style of data structure is sometimes called an associative array
- Map subscripting operator takes key, returns reference to mapped type
- E.g., `string s = employees[id]; // returns employee name`
- If key does not exist, `[]` creates new entry with the key, value-initialized (0 if numeric, default initialized if class) instance of the mapped type
### Unique vs. Multiple Instances of a Key
In set and map containers, keys are unique
- In set, keys are the entire value, so every element is unique
- In map, multiple keys may map to same value, but cant duplicate keys
- Attempt to insert a duplicate key is ignored by the container (returns false)
In multiset and multimap containers, duplicate keys ok
- Since containers are ordered, duplicates are kept next to each other
- Insertion will always succeed, at appropriate place in the order
### Key Types, Comparators, Strict Weak Ordering
Like `sort` algorithm, can modify containers order ...
... with any callable object that can be used correctly for sort
Must establish a **strict weak ordering** over elements
- Two keys cannot both be less than each other (inequality), so comparison operator must return `false` if they are equal
- If `a < b` and `b < c` then `a < c` (transitivity of inequality)
- If `!(a < b)` and `! (b < a)` then `a == b` (equivalence)
- If `a == b` and `b == c` then `a == c` (transitivity of eqivalence)
_Sounds like definition of order in math_
Type of the callable object is used in container type
- Cool example in LLM pp. 426 using `decltype` for a function
- Could do this by declaring your own pointer to function type
- But much easier to let compilers type inference figure it out for you
### Pairs
Maps use `pair` template to hold key, mapped type
- A `pair` can be used hold any two types
- Maps use the key type as the 1st element of the pair (`p.first`)
- Maps use the mapped type as the 2nd element of the pair (`p.second`)
Can compare `pair` variables using operators
- Equivalence, less than, other relational operators
Can declare `pair` variables several different ways
- Easiest uses initialization list (curly braces around values) (e.g. `pair<string, int> p = {"hello", 1};`)
- Can also default construct (value initialization) (e.g. `pair<string, int> p;`)
- Can also construct with two values (e.g. `pair<string, int> p("hello", 1);`)
- Can also use special `make_pair` function (e.g. `pair<string, int> p = make_pair("hello", 1);`)
### Unordered Containers (UCs)
Example: `unordered_set`, `unordered_multiset`, `unordered_map`, `unordered_multimap`
UCs use `==` to compare elements instead of `<` to order them
- Types in unordered containers must be equality comparable
- When you write your own structs, overload `==` as well as `<`
UCs store elements in indexed buckets instead of in a tree
- Useful for types that dont have an obvious ordering relation over their values
UCs use hash functions to put and find elements in buckets
- May improve performance in some cases (if performance profiling suggests so)
- Declare UCs with pluggable hash functions via callable objects, decltype, etc.
- Or specialize the `std::hash()` template for your type, used by default
### Summary
Use associative containers for key based lookup
- Ordering of elements is maintained over the keys
- Think ranges and ordering rather than position indexes
- A sorted vector may be a better alternative (depends on which operations you will use most often, and their costs)
Ordered associative containers use strict weak order
- Operator `<` or any callable object that acts like `<` over `int` can be used
Maps allow two-level (dictionary-like) lookup
- Vs. sets which are used for “there or not there” lookup
- Map uses a `pair` to associate key with mapped type
Can enforce uniqueness or allow duplicates
- Duplicates are still stored in order, creating “equal ranges”
## IO Libraries
### `std::copy()`
`std::copy()`
http://www.cplusplus.com/reference/algorithm/copy/
Takes 3 parameters:
- `copy(InputIterator first, InputIterator last, OutputIterator result);`
- `[first, last)` specifies the range of elements to copy.
- `result` specifies where we are copying to.
Example:
```cpp
vector<int> v = {1, 2, 3, 4, 5};
// copy v to cout
std::copy(v.begin(), v.end(), std::ostream_iterator<int>(std::cout, " "));
```
Some useful destination iterator types:
1. `ostream_iterator` - iterator over an output stream (like `cout`)
2. `insert_iterator` - inserts elements directly into an STL container (will be practiced in studio)
```cpp
#include <iostream>
#include <string>
#include <fstream>
#include <iterator>
#include <algorithm>
using namespace std;
int main(int argc, char *argv[]) {
if (argc != 3) {
cerr << "Usage: " << argv[0] << " <input file> <output file>" << endl;
return 1;
}
string input_file = argv[1];
string output_file = argv[2];
ifstream input_file(input_file.c_str());
ofstream output_file(output_file.c_str());
// don't skip whitespace
input_file >> noskipws;
istream_iterator<char> i (input_file);
ostream_iterator<char> o (output_file);
// copy the input file to the output file: copy(InputIterator first, InputIterator last, OutputIterator result);
copy(i, istream_iterator<char>(), o);
cout << "Copied input file" << input_file << " to " << output_file << endl;
return 0;
}
```
### IO reviews
How to move data into and out of a program:
- Using `argc` and `argv` to pass command line args
- Using `cout` to print data out to the terminal
- Using `cin` to obtain data from the user at run-time
- Using an `ifstream` to read data in from a file
- Using an `ofstream` to write data out to a file
How to move data between strings, basic types
- Using an `istringstream` to extract formatted int values
- Using an `ostringstream` to assemble a string
### Streams
Simply a buffer of data (array of bytes).
Insertion operator (`<<`) specifies how to move data from a variable into an output stream
Extraction operator (`>>`) specifies how to pull data off of an input stream and store it into a variable
Both operators defined for built-in types:
- Numeric types
- Pointers
- Pointers to char (char *)
Cannot copy or assign stream objects
- Copy construction or assignment syntax using them results in a compile-time error
Extraction operator consumes data from input stream
- "Destructive read" that reads a different element each time
- Use a variable if you want to read same value repeatedly
Need to test streams condition states
- E.g., calling the `is_open` method on a file stream
- E.g., use the stream object in a while or if test
- Insertion and extraction operators return a reference to a stream object, so can test them too
File stream destructor calls close automatically
### Flushing and stream manipulators
An output stream may hold onto data for a while, internally
- E.g., writing chunks of text rather than a character at a time is efficient
- When it writes data out (e.g., to a file, the terminal window, etc.) is entirely up to the stream, **unless you tell it to flush out its buffers**
- If a program crashes, any un-flushed stream data is lost
- So, flushing streams reasonably often is an excellent debugging trick
Can tie an input stream directly to an output stream
- Output stream is then flushed by call to input stream extraction operator
- E.g., `my_istream.tie(&my_ostream);`
- `cout` is already tied to `cin` (useful for prompting the user, getting input)
Also can flush streams directly using stream manipulators
- E.g., `cout << flush;` or `cout << endl;` or `cout << unitbuf;`
Other stream manipulators are useful for formatting streams
- Field layout: `setwidth`, `setprecision`, etc.
- Display notation: `oct`, `hex`, `dec`, `boolalpha`, `nobooleanalpha`, `scientific`, etc.