diff --git a/content/CSE5519/CSE5519_C3.md b/content/CSE5519/CSE5519_C3.md
index 5bc5ec4..7cf3cb7 100644
--- a/content/CSE5519/CSE5519_C3.md
+++ b/content/CSE5519/CSE5519_C3.md
@@ -1,2 +1,30 @@
 # CSE5519 Advances in Computer Vision (Topic C: 2023: Neural Rendering)
 
+## NoPe-NeRF: Optimising Neural Radiance Field with No Pose Prior
+
+[link to paper](https://arxiv.org/pdf/2212.07388)
+
+Incorporates undistorted monocular depth priors.
+
+These priors are generated by correcting scale and shift parameters during training, which then lets us constrain the relative poses between consecutive frames.
+
+### Novelty in Methods
+
+Since the test views are sampled from video sequences, they are close to the training views. We fit a Bezier curve to the estimated training poses and sample interpolated poses from it to render novel-view videos.
+
+### Novelty in Implementations
+
+1. Replacing the ReLU activation function with Softplus.
+2. Sampling 128 points uniformly (with noise) along each ray.
+
+### Loss function construction
+
+1. Impact of the distortion parameters on pose accuracy
+2. Impact of the inter-frame consistency loss on pose accuracy
+3. Impact of the NeRF losses on pose accuracy
+
+> [!TIP]
+>
+> This paper presents a new method integrating monocular depth estimation with neural rendering. Note that the model is trained on video sequences with inter-frame losses. The authors even fit a Bezier curve to the estimated training poses and sample interpolated poses for novel-view videos. Is this unfair to other models that don't assume the trajectory of predicted views is continuous?
+>
+> How accurately does the model perform for arbitrary poses, rather than poses selected from video sequences?
\ No newline at end of file
diff --git a/content/CSE5519/CSE5519_G3.md b/content/CSE5519/CSE5519_G3.md
index b7ddf9f..25b5c2a 100644
--- a/content/CSE5519/CSE5519_G3.md
+++ b/content/CSE5519/CSE5519_G3.md
@@ -1,2 +1,17 @@
 # CSE5519 Advances in Computer Vision (Topic G: 2023: Correspondence Estimation and Structure from Motion)
 
+## Detector-Free Structure from Motion
+
+[link to paper](https://arxiv.org/abs/2306.15669)
+
+- A new detector-free SfM framework built upon detector-free matchers to handle texture-poor scenes.
+- An iterative refinement pipeline with a transformer-based multi-view matching network to efficiently refine both feature tracks and reconstruction results:
+  - a multi-view feature transformer to enhance the discriminativeness of extracted features;
+  - bundle adjustment (viewpoint consistency);
+  - topology adjustment (merge, complete, or remove vertices using predefined rules).
+
+> [!TIP]
+>
+> This paper proposes a new detector-free SfM framework built upon detector-free matchers to handle texture-poor scenes, and uses an iterative refinement pipeline with a transformer-based multi-view matching network to efficiently refine both feature tracks and reconstruction results.
+>
+> I'm particularly interested in the detector-free matchers and the transformer-based multi-view matching network. Due to time constraints, I haven't been able to look into how the detector-free matchers work and how they generate the coarse model from predicted matches. I'm looking forward to hearing more about this topic in tomorrow's presentation.
\ No newline at end of file
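The C3 notes mention that the depth priors are obtained by correcting scale and shift parameters during training. Below is a minimal sketch of that idea in PyTorch, assuming one learnable scale and shift per frame and an L1 penalty against the depth rendered by the radiance field; all shapes, random tensors, and names such as `undistorted_depth` are illustrative, not the paper's implementation.

```python
import torch

# Stand-ins for monocular depth predictions (hypothetical shapes and values).
num_frames, H, W = 10, 120, 160
mono_depth = torch.rand(num_frames, H, W) + 0.5

# One learnable scale and shift per frame, initialised to the identity mapping.
scale = torch.nn.Parameter(torch.ones(num_frames))
shift = torch.nn.Parameter(torch.zeros(num_frames))

def undistorted_depth(frame_idx: torch.Tensor) -> torch.Tensor:
    """Apply the per-frame linear correction D*_i = a_i * D_i + b_i."""
    return (scale[frame_idx][:, None, None] * mono_depth[frame_idx]
            + shift[frame_idx][:, None, None])

# Placeholder for depth maps rendered by the NeRF; in the real method these
# come from volume rendering and are optimised jointly with the poses.
rendered_depth = torch.rand(num_frames, H, W) + 0.5

optimizer = torch.optim.Adam([scale, shift], lr=1e-3)
for _ in range(100):
    idx = torch.randint(0, num_frames, (4,))  # a mini-batch of frames
    loss = (undistorted_depth(idx) - rendered_depth[idx]).abs().mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```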
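For the Bezier-curve trick in "Novelty in Methods", here is a toy sketch of sampling interpolated camera positions with De Casteljau's algorithm. The control points are made up, and a full pipeline would also interpolate rotations (e.g. with SLERP); this only covers camera centres.

```python
import numpy as np

def bezier_point(control_pts: np.ndarray, t: float) -> np.ndarray:
    """Evaluate a Bezier curve at parameter t via De Casteljau's algorithm."""
    pts = control_pts.copy()
    while len(pts) > 1:
        pts = (1.0 - t) * pts[:-1] + t * pts[1:]
    return pts[0]

# Hypothetical camera centres estimated for the training views.
train_centres = np.array([[0.0,  0.0, 0.0],
                          [1.0,  0.2, 0.0],
                          [2.0,  0.1, 0.3],
                          [3.0, -0.1, 0.5]])

# A smooth trajectory of novel-view positions sampled along the curve.
novel_centres = np.stack([bezier_point(train_centres, t)
                          for t in np.linspace(0.0, 1.0, 30)])
print(novel_centres.shape)  # (30, 3)
```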
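The "128 points along each ray, uniformly with noise" item matches the standard NeRF-style stratified sampling scheme, sketched below; the near/far bounds are arbitrary.

```python
import torch

def sample_along_ray(near: float, far: float, n_samples: int = 128,
                     perturb: bool = True) -> torch.Tensor:
    """Pick n_samples depths per ray: evenly spaced, then jittered within each bin."""
    t = torch.linspace(near, far, n_samples)
    if perturb:
        mids = 0.5 * (t[1:] + t[:-1])
        upper = torch.cat([mids, t[-1:]])
        lower = torch.cat([t[:1], mids])
        t = lower + (upper - lower) * torch.rand(n_samples)
    return t

depths = sample_along_ray(2.0, 6.0)  # 128 jittered sample depths for one ray
```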
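For the G3 notes, one way to picture the topology-adjustment step is merging feature tracks whose observations are transitively matched. Below is a toy union-find sketch with made-up matches; the paper's actual rules for merging, completing, and removing tracks are more involved.

```python
from collections import defaultdict

# Hypothetical matches between observations, each observation being
# an (image_id, keypoint_id) pair produced by a detector-free matcher.
matches = [(("img0", 3), ("img1", 7)),
           (("img1", 7), ("img2", 2)),
           (("img3", 5), ("img4", 1))]

parent = {}

def find(x):
    """Union-find root lookup with path halving."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def union(a, b):
    parent[find(a)] = find(b)

# Observations that are transitively matched collapse into one track.
for a, b in matches:
    union(a, b)

tracks = defaultdict(list)
for obs in parent:
    tracks[find(obs)].append(obs)
print(list(tracks.values()))
# e.g. [[('img0', 3), ('img1', 7), ('img2', 2)], [('img3', 5), ('img4', 1)]]
```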