This commit is contained in:
Trance-0
2025-09-03 16:59:28 -05:00
parent fb1350038e
commit f8ef68ff7f
4 changed files with 33 additions and 4 deletions

$\beta$ is a hyperparameter that scales the loss of the camera orientation so that position and orientation errors contribute comparably.
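The combined position/orientation loss can be sketched as follows (a minimal NumPy sketch; the function name and the default `beta` value are illustrative, not taken from the paper):

```python
import numpy as np

def posenet_loss(x_pred, x_true, q_pred, q_true, beta=500.0):
    # position error plus beta-weighted quaternion orientation error;
    # the ground-truth quaternion is normalized to unit length
    q_true = q_true / np.linalg.norm(q_true)
    return np.linalg.norm(x_pred - x_true) + beta * np.linalg.norm(q_pred - q_true)
```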
### Network architecture
Based on GoogLeNet (SOTA in 2014), but with a few changes:
- Replace all three softmax classifiers with affine regressors.
- Insert another fully connected layer of feature size 2048 before the final regressor.
- At test time, normalize the quaternion to unit length.
<details>
<summary>Architecture</summary>
```python
from network import Network
class GoogLeNet(Network):
    # ... (layers omitted in this excerpt)
        .fc(4, relu=False, name='cls3_fc_pose_wpqr'))
```
</details>
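The test-time step from the list above, normalizing the raw regressor output to a unit quaternion, can be sketched as (function name is illustrative):

```python
import numpy as np

def normalize_quaternion(q):
    # scale the predicted 4-vector to unit length so it is a valid rotation
    return q / np.linalg.norm(q)
```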
## Unsupervised Learning of Depth and Ego-Motion From Video
[link to the paper](https://openaccess.thecvf.com/content_cvpr_2017/papers/Zhou_Unsupervised_Learning_of_CVPR_2017_paper.pdf)
This method estimates both depth and camera ego-motion from monocular video using CNNs, jointly training a single-view depth CNN and a camera pose estimation CNN on unlabelled monocular video sequences.
![View synthesis as supervision](https://notenextra.trance-0.com/CSE5519/Unsupervised_Learning_DepthNet_PoseNet.png)
### View synthesis as supervision
Notice that in monocular video no ground-truth depth or pose is available; instead, the target view is synthesized by warping nearby source views using the predicted depth and relative camera pose, and the photometric reconstruction error serves as the supervision signal.
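The photometric supervision can be sketched as follows (a minimal sketch assuming the warped source view has already been synthesized; function name is illustrative):

```python
import numpy as np

def photometric_loss(target, synthesized):
    # mean absolute photometric error between the target frame and the
    # view synthesized by warping a source frame with predicted depth/pose
    return np.mean(np.abs(target - synthesized))
```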
## Unsupervised Monocular Depth Estimation with Left-Right Consistency
(CVPR 2017)
[link to the paper](https://arxiv.org/pdf/1609.03677)
This method uses pairs of stereo images, treated as left and right eyes, to estimate depth, and improves accuracy by enforcing consistency between the disparity maps predicted for the left and right views.
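The left-right consistency idea can be sketched as a penalty on disagreement between the two disparity maps (a simplified nearest-neighbor NumPy sketch; the sampling convention `x - d` and the function name are assumptions, not the paper's exact formulation):

```python
import numpy as np

def lr_consistency_loss(disp_left, disp_right):
    # compare the left disparity map against the right disparity map
    # sampled at the locations the left disparities point to
    h, w = disp_left.shape
    cols = np.arange(w)
    total = 0.0
    for row in range(h):
        # project left-image columns into the right image (nearest neighbor)
        sample = np.clip(np.round(cols - disp_left[row]).astype(int), 0, w - 1)
        total += np.mean(np.abs(disp_left[row] - disp_right[row, sample]))
    return total / h
```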
## GeoNet
Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose (CVPR 2018)
Depth estimation from a single monocular image.
### Rigid structure constructor
Combines DepthNet and PoseNet to estimate depth and camera motion, as in [Unsupervised Learning of Depth and Ego-Motion From Video](#unsupervised-learning-of-depth-and-ego-motion-from-video).
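Given the DepthNet and PoseNet outputs, the rigid (camera-motion-induced) flow can be computed by back-projecting pixels with the predicted depth, applying the predicted pose, and re-projecting (a minimal NumPy sketch; function name and array layout are assumptions, and depth is assumed positive):

```python
import numpy as np

def rigid_flow(depth, K, R, t):
    # depth: (H, W) predicted depth; K: (3, 3) intrinsics;
    # R: (3, 3), t: (3,) predicted relative camera pose
    H, W = depth.shape
    ys, xs = np.mgrid[0:H, 0:W]
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3).T  # (3, HW)
    cam = np.linalg.inv(K) @ pix * depth.reshape(-1)   # back-project to 3D
    cam2 = R @ cam + t[:, None]                        # apply camera motion
    proj = K @ cam2                                    # re-project
    proj = proj[:2] / proj[2]                          # perspective divide
    flow = proj - pix[:2]                              # displacement per pixel
    return flow.T.reshape(H, W, 2)
```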
### Non-rigid motion localizer
Uses [Left-right consistency](#unsupervised-monocular-depth-estimation-with-left-right-consistency) to estimate non-rigid motion by training ResFlowNet, which predicts a residual flow on top of the rigid flow.
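The composition of the two flows can be sketched in one line (a trivial sketch; the function name and array shapes are assumptions):

```python
import numpy as np

def full_flow(rigid_flow, residual_flow):
    # total flow = rigid flow from camera motion + learned residual flow
    # for independently moving (non-rigid) objects
    return rigid_flow + residual_flow
```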
### Geometric consistency enforcement
Finally, an additional geometric consistency enforcement, based on forward-backward flow consistency, handles occlusions and non-Lambertian surfaces (e.g., metal, plastic).
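A forward-backward consistency check of this kind can be sketched as a validity mask (a simplified sketch, not GeoNet's exact criterion; the fixed pixel threshold `alpha` and the function name are assumptions):

```python
import numpy as np

def consistency_mask(flow_fwd, flow_bwd, alpha=3.0):
    # a pixel is marked consistent when forward and backward flows roughly
    # cancel; inconsistent pixels (occlusions, non-Lambertian surfaces)
    # can then be excluded from the photometric loss
    diff = np.linalg.norm(flow_fwd + flow_bwd, axis=-1)
    return diff < alpha
```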