275 lines
9.6 KiB
Markdown
275 lines
9.6 KiB
Markdown
# CSE332S Lecture 10
|
||
|
||
## Associative Containers
|
||
|
||
| Container | Sorted | Unique Key | Allow duplicates |
|
||
| -------------------- | ------ | ---------- | ---------------- |
|
||
| `set` | Yes | Yes | No |
|
||
| `multiset` | Yes | Yes | Yes |
|
||
| `unordered_set` | No | Yes | No |
|
||
| `unordered_multiset` | No | Yes | Yes |
|
||
| `map` | Yes | Yes | No |
|
||
| `multimap` | Yes | Yes | Yes |
|
||
| `unordered_map` | No | Yes | No |
|
||
| `unordered_multimap` | No | Yes | Yes |
|
||
|
||
Associative containers support efficient key lookup
|
||
vs. sequence containers, which lookup by position
|
||
Associative containers differ in 3 design dimensions
|
||
|
||
- Ordered vs. unordered (tree vs. hash structured)
|
||
- We’ll look at ordered containers today, unordered next time
|
||
- Set vs. map (just the key or the key and a mapped type)
|
||
- Unique vs. multiple instances of a key
|
||
|
||
### Ordered Associative Containers
|
||
|
||
Example: `set`, `multiset`, `map`, `multimap`
|
||
|
||
Ordered associative containers are tree structured
|
||
- Insert/delete maintain sorted order, e.g. `operator<`
|
||
- Don’t use sequence algorithms like `sort` or `find` with them
|
||
- Already sorted, so sorting unnecessary (or harmful)
|
||
- `find` is more efficient (logarithmic time) as a container method
|
||
Ordered associative containers are bidirectional
|
||
- Can iterate through them in either direction, find sub-ranges
|
||
- Can use as source or destination for algorithms like `copy`
|
||
|
||
### Set vs. Map
|
||
|
||
A set/multiset stores keys (the key is the entire value)
|
||
|
||
- Used to collect single-level information (e.g., a set of words to ignore)
|
||
- Avoid in-place modification of keys (especially in a set or multiset)
|
||
|
||
A map/multimap associates keys with mapped types
|
||
|
||
- That style of data structure is sometimes called an associative array
|
||
- Map subscripting operator takes key, returns reference to mapped type
|
||
- E.g., `string s = employees[id]; // returns employee name`
|
||
- If key does not exist, `[]` creates new entry with the key, value-initialized (0 if numeric, default initialized if class) instance of the mapped type
|
||
|
||
### Unique vs. Multiple Instances of a Key
|
||
|
||
In set and map containers, keys are unique
|
||
|
||
- In set, keys are the entire value, so every element is unique
|
||
- In map, multiple keys may map to same value, but can’t duplicate keys
|
||
- Attempt to insert a duplicate key is ignored by the container (returns false)
|
||
|
||
In multiset and multimap containers, duplicate keys ok
|
||
|
||
- Since containers are ordered, duplicates are kept next to each other
|
||
- Insertion will always succeed, at appropriate place in the order
|
||
|
||
### Key Types, Comparators, Strict Weak Ordering
|
||
|
||
Like `sort` algorithm, can modify container’s order ...
|
||
... with any callable object that can be used correctly for sort
|
||
|
||
Must establish a **strict weak ordering** over elements
|
||
|
||
- Two keys cannot both be less than each other (inequality), so comparison operator must return `false` if they are equal
|
||
- If `a < b` and `b < c` then `a < c` (transitivity of inequality)
|
||
- If `!(a < b)` and `! (b < a)` then `a == b` (equivalence)
|
||
- If `a == b` and `b == c` then `a == c` (transitivity of eqivalence)
|
||
|
||
_Sounds like definition of order in math_
|
||
|
||
Type of the callable object is used in container type
|
||
|
||
- Cool example in LLM pp. 426 using `decltype` for a function
|
||
- Could do this by declaring your own pointer to function type
|
||
- But much easier to let compiler’s type inference figure it out for you
|
||
|
||
### Pairs
|
||
|
||
Maps use `pair` template to hold key, mapped type
|
||
|
||
- A `pair` can be used hold any two types
|
||
- Maps use the key type as the 1st element of the pair (`p.first`)
|
||
- Maps use the mapped type as the 2nd element of the pair (`p.second`)
|
||
|
||
Can compare `pair` variables using operators
|
||
|
||
- Equivalence, less than, other relational operators
|
||
|
||
Can declare `pair` variables several different ways
|
||
|
||
- Easiest uses initialization list (curly braces around values) (e.g. `pair<string, int> p = {"hello", 1};`)
|
||
- Can also default construct (value initialization) (e.g. `pair<string, int> p;`)
|
||
- Can also construct with two values (e.g. `pair<string, int> p("hello", 1);`)
|
||
- Can also use special `make_pair` function (e.g. `pair<string, int> p = make_pair("hello", 1);`)
|
||
|
||
### Unordered Containers (UCs)
|
||
|
||
Example: `unordered_set`, `unordered_multiset`, `unordered_map`, `unordered_multimap`
|
||
|
||
UCs use `==` to compare elements instead of `<` to order them
|
||
|
||
- Types in unordered containers must be equality comparable
|
||
- When you write your own structs, overload `==` as well as `<`
|
||
|
||
UCs store elements in indexed buckets instead of in a tree
|
||
|
||
- Useful for types that don’t have an obvious ordering relation over their values
|
||
|
||
UCs use hash functions to put and find elements in buckets
|
||
|
||
- May improve performance in some cases (if performance profiling suggests so)
|
||
- Declare UCs with pluggable hash functions via callable objects, decltype, etc.
|
||
- Or specialize the `std::hash()` template for your type, used by default
|
||
|
||
### Summary
|
||
|
||
Use associative containers for key based lookup
|
||
|
||
- Ordering of elements is maintained over the keys
|
||
- Think ranges and ordering rather than position indexes
|
||
- A sorted vector may be a better alternative (depends on which operations you will use most often, and their costs)
|
||
|
||
Ordered associative containers use strict weak order
|
||
|
||
- Operator `<` or any callable object that acts like `<` over `int` can be used
|
||
|
||
Maps allow two-level (dictionary-like) lookup
|
||
|
||
- Vs. sets which are used for “there or not there” lookup
|
||
- Map uses a `pair` to associate key with mapped type
|
||
|
||
Can enforce uniqueness or allow duplicates
|
||
|
||
- Duplicates are still stored in order, creating “equal ranges”
|
||
|
||
## IO Libraries
|
||
|
||
### `std::copy()`
|
||
|
||
`std::copy()`
|
||
|
||
http://www.cplusplus.com/reference/algorithm/copy/
|
||
|
||
Takes 3 parameters:
|
||
|
||
- `copy(InputIterator first, InputIterator last, OutputIterator result);`
|
||
- `[first, last)` specifies the range of elements to copy.
|
||
- `result` specifies where we are copying to.
|
||
|
||
Example:
|
||
|
||
```cpp
|
||
vector<int> v = {1, 2, 3, 4, 5};
|
||
// copy v to cout
|
||
std::copy(v.begin(), v.end(), std::ostream_iterator<int>(std::cout, " "));
|
||
```
|
||
|
||
Some useful destination iterator types:
|
||
|
||
1. `ostream_iterator` - iterator over an output stream (like `cout`)
|
||
2. `insert_iterator` - inserts elements directly into an STL container (will be practiced in studio)
|
||
|
||
```cpp
|
||
#include <iostream>
|
||
#include <string>
|
||
#include <fstream>
|
||
#include <iterator>
|
||
#include <algorithm>
|
||
|
||
using namespace std;
|
||
|
||
int main(int argc, char *argv[]) {
|
||
if (argc != 3) {
|
||
cerr << "Usage: " << argv[0] << " <input file> <output file>" << endl;
|
||
return 1;
|
||
}
|
||
|
||
string input_file = argv[1];
|
||
string output_file = argv[2];
|
||
|
||
ifstream input_file(input_file.c_str());
|
||
ofstream output_file(output_file.c_str());
|
||
|
||
// don't skip whitespace
|
||
input_file >> noskipws;
|
||
|
||
istream_iterator<char> i (input_file);
|
||
ostream_iterator<char> o (output_file);
|
||
|
||
// copy the input file to the output file: copy(InputIterator first, InputIterator last, OutputIterator result);
|
||
copy(i, istream_iterator<char>(), o);
|
||
|
||
cout << "Copied input file" << input_file << " to " << output_file << endl;
|
||
|
||
return 0;
|
||
}
|
||
```
|
||
|
||
### IO reviews
|
||
|
||
How to move data into and out of a program:
|
||
|
||
- Using `argc` and `argv` to pass command line args
|
||
- Using `cout` to print data out to the terminal
|
||
- Using `cin` to obtain data from the user at run-time
|
||
- Using an `ifstream` to read data in from a file
|
||
- Using an `ofstream` to write data out to a file
|
||
|
||
How to move data between strings, basic types
|
||
|
||
- Using an `istringstream` to extract formatted int values
|
||
- Using an `ostringstream` to assemble a string
|
||
|
||
### Streams
|
||
|
||
Simply a buffer of data (array of bytes).
|
||
|
||
Insertion operator (`<<`) specifies how to move data from a variable into an output stream
|
||
Extraction operator (`>>`) specifies how to pull data off of an input stream and store it into a variable
|
||
|
||
Both operators defined for built-in types:
|
||
|
||
- Numeric types
|
||
- Pointers
|
||
- Pointers to char (char *)
|
||
|
||
Cannot copy or assign stream objects
|
||
|
||
- Copy construction or assignment syntax using them results in a compile-time error
|
||
|
||
Extraction operator consumes data from input stream
|
||
|
||
- "Destructive read" that reads a different element each time
|
||
- Use a variable if you want to read same value repeatedly
|
||
|
||
Need to test streams’ condition states
|
||
|
||
- E.g., calling the `is_open` method on a file stream
|
||
- E.g., use the stream object in a while or if test
|
||
- Insertion and extraction operators return a reference to a stream object, so can test them too
|
||
|
||
File stream destructor calls close automatically
|
||
|
||
### Flushing and stream manipulators
|
||
|
||
An output stream may hold onto data for a while, internally
|
||
|
||
- E.g., writing chunks of text rather than a character at a time is efficient
|
||
- When it writes data out (e.g., to a file, the terminal window, etc.) is entirely up to the stream, **unless you tell it to flush out its buffers**
|
||
- If a program crashes, any un-flushed stream data is lost
|
||
- So, flushing streams reasonably often is an excellent debugging trick
|
||
|
||
Can tie an input stream directly to an output stream
|
||
|
||
- Output stream is then flushed by call to input stream extraction operator
|
||
- E.g., `my_istream.tie(&my_ostream);`
|
||
- `cout` is already tied to `cin` (useful for prompting the user, getting input)
|
||
|
||
Also can flush streams directly using stream manipulators
|
||
|
||
- E.g., `cout << flush;` or `cout << endl;` or `cout << unitbuf;`
|
||
|
||
Other stream manipulators are useful for formatting streams
|
||
|
||
- Field layout: `setwidth`, `setprecision`, etc.
|
||
- Display notation: `oct`, `hex`, `dec`, `boolalpha`, `nobooleanalpha`, `scientific`, etc.
|