Learning Objectives
Introduction
- I am motivated.
- I have an idea of what is coming.
What is 3D Computer Vision about
- I know what computer vision means and encompasses.
- I know some first ideas of how 3D information can be reconstructed from images.
Cameras and perception
- I know how a camera creates an image.
- I know the pinhole camera model.
- I understand why we use lenses in cameras.
- I know that human eyes are not cameras and perception is tricky.
Projection
- I understood that an image is a projection from 3D to 2D, and therefore information is lost.
- I know how a projection can be calculated using matrix multiplication.
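The two projection objectives above can be illustrated with a minimal sketch. It assumes an idealized pinhole camera whose 3x4 projection matrix contains only a focal length `f`; the function name `project` and all numeric values are illustrative and do not come from the course material. Two points on the same viewing ray land on the same pixel, which is the depth information that the projection loses.

```python
def project(P, X):
    """Project a 3D point X = (x, y, z) with a 3x4 matrix P; returns (u, v)."""
    Xh = (X[0], X[1], X[2], 1.0)  # homogeneous coordinates of the 3D point
    u, v, w = (sum(p * x for p, x in zip(row, Xh)) for row in P)
    return (u / w, v / w)  # perspective division

f = 2.0  # assumed focal length (arbitrary units)
P = [
    [f, 0.0, 0.0, 0.0],
    [0.0, f, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

near = project(P, (1.0, 0.5, 2.0))
far = project(P, (2.0, 1.0, 4.0))  # same viewing ray, twice as far away
print(near, far)
```

Both calls return the same image point, so the depth along the ray cannot be recovered from a single image without additional information.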
Camera parameters
- I know that there are internal and external camera parameters and how they are related.
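As a rough sketch of how internal and external parameters work together: the external parameters (rotation `R`, translation `t`) move a world point into camera coordinates, and the internal parameters (matrix `K` with focal length and principal point) map camera coordinates to pixels. The helper names and all numeric values below are assumptions chosen for illustration.

```python
def matvec(M, v):
    """Multiply a matrix (nested lists) with a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def project_point(K, R, t, X_world):
    # External parameters: world coordinates -> camera coordinates
    X_cam = [c + ti for c, ti in zip(matvec(R, X_world), t)]
    # Internal parameters: camera coordinates -> homogeneous pixel coordinates
    u, v, w = matvec(K, X_cam)
    return (u / w, v / w)

# Assumed internal parameters: focal length 800 px, principal point (320, 240)
K = [[800.0, 0.0, 320.0],
     [0.0, 800.0, 240.0],
     [0.0, 0.0, 1.0]]
# Assumed external parameters: no rotation, world origin 5 units in front of the camera
R = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 5.0]

center = project_point(K, R, t, [0.0, 0.0, 0.0])
shifted = project_point(K, R, t, [1.0, 0.0, 0.0])
print(center, shifted)
```

The world origin lies on the optical axis here, so it projects to the principal point; moving one unit along the world x-axis shifts the pixel by `f / Z` = 160 px.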
Distortion
- I learned that distortion of an image can occur due to perspective as well as the lens.
- I learned about a method that can correct distortion in wide-angle shots.
Projective geometry
- I understood that projective geometry can be used to easily calculate intersections of lines in images.
- I can imagine that it also works in 3D with planes.
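The line-intersection objective can be sketched in a few lines. In homogeneous coordinates, the line through two points is their cross product, and the intersection of two lines is again a cross product. The function names are illustrative assumptions.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def line_through(p, q):
    """Homogeneous line through two 2D points."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))

def intersect(l1, l2):
    """Intersection of two homogeneous lines, or None if they are parallel."""
    x, y, w = cross(l1, l2)
    if abs(w) < 1e-12:
        return None  # parallel lines meet in a point at infinity (w = 0)
    return (x / w, y / w)

diag_up = line_through((0.0, 0.0), (2.0, 2.0))
diag_down = line_through((0.0, 2.0), (2.0, 0.0))
crossing = intersect(diag_up, diag_down)

bottom = line_through((0.0, 0.0), (1.0, 0.0))
top = line_through((0.0, 1.0), (1.0, 1.0))
parallel = intersect(bottom, top)
print(crossing, parallel)
```

The same pattern carries over to 3D, where planes and points are 4-vectors, although the intersection is then no longer a single cross product.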
Camera calibration
- I have understood the principle of camera calibration.
Monocular reconstruction
- I have understood how to use vanishing points and a reference object to measure the height of other objects in an image.
- I know how to calculate the position of a vanishing point.
- I know what an artificial neuron is and I can imagine that millions of them can be connected to each other.
- I understand why it is called “deep” learning.
- I know why you should never use training data for testing.
- I understand what training, testing and prediction mean and can understand the process.
- I understand the role of the loss function.
- I know what the aim and principle of the gradient descent method is and have understood what it has to do with derivatives.
- I know that backpropagation is the algorithm used to train a neural network.
- I know about DepthAnything and other deep-learning-based methods for obtaining a depth map from a single image.
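The objectives on the loss function and gradient descent can be made concrete with a toy example. The sketch below fits a single weight `w` in the model `y = w * x` by repeatedly stepping against the derivative of a mean squared error loss. The data, the learning rate and the function names are assumptions for illustration; a real network has many weights, and their derivatives are obtained by backpropagation.

```python
def mse_loss(w, xs, ys):
    """Mean squared error of the model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def mse_grad(w, xs, ys):
    """Derivative of the loss with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

# Assumed training data generated by y = 3 * x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]

w = 0.0     # initial guess
lr = 0.05   # learning rate (step size), chosen by hand
for step in range(200):
    w -= lr * mse_grad(w, xs, ys)  # move against the gradient

final_loss = mse_loss(w, xs, ys)
print(w, final_loss)
```

The derivative tells the method in which direction the loss decreases, and the learning rate controls how far each step goes.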
Stereo and Triangulation
Stereo
- I have understood the principle of epipolar geometry.
- I have understood the purpose and use of rectification.
- I have understood why stereo matching is so difficult, especially on edges.
- I know the distinction between local and global stereo algorithms.
- I understand how a global stereo match can be found using dynamic programming on the disparity space image.
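The dynamic programming objective can be sketched for a single scanline. The example assumes the disparity space image is given as `dsi[x][d]`, the matching cost of pixel `x` at disparity `d`, and adds a penalty `smooth * |d - d_prev|` for disparity changes between neighboring pixels. This is a simplified, Viterbi-style formulation; occlusion handling and ordering constraints used in full scanline methods are left out, and all names and numbers are illustrative.

```python
def dp_scanline(dsi, smooth=1.0):
    """Minimum-cost disparity path through one scanline of a DSI."""
    n, m = len(dsi), len(dsi[0])
    cost = [list(dsi[0])]  # accumulated cost per disparity for the first pixel
    back = []              # back-pointers to the best previous disparity
    for x in range(1, n):
        row, ptr = [], []
        for d in range(m):
            best_prev = min(range(m),
                            key=lambda dp: cost[-1][dp] + smooth * abs(d - dp))
            row.append(dsi[x][d] + cost[-1][best_prev] + smooth * abs(d - best_prev))
            ptr.append(best_prev)
        cost.append(row)
        back.append(ptr)
    # Backtrack from the cheapest final disparity
    d = min(range(m), key=lambda k: cost[-1][k])
    path = [d]
    for ptr in reversed(back):
        d = ptr[d]
        path.append(d)
    path.reverse()
    return path

# Toy DSI: pixel 2 slightly prefers disparity 1, its neighbors clearly prefer 0
dsi = [[0.0, 5.0, 5.0],
       [0.0, 5.0, 5.0],
       [1.0, 0.5, 5.0],
       [0.0, 5.0, 5.0]]

local_result = dp_scanline(dsi, smooth=0.0)   # behaves like per-pixel winner-takes-all
global_result = dp_scanline(dsi, smooth=1.0)  # smoothness removes the outlier
print(local_result, global_result)
```

With `smooth=0.0` every pixel picks its own cheapest disparity, which corresponds to a local method; with a positive penalty the whole scanline is optimized jointly.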
Triangulation
- I understand the calculation process of stereo triangulation and can reproduce it.
- I understand the link between structured illumination and stereo triangulation.
- I understand the idea of using binary codes for structured-light scanners and why this speeds up the process.
- I have understood how triangulation can be used to reconstruct the shape of objects and I have learned about various applications.
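The calculation behind stereo triangulation can be reproduced with a small sketch. It assumes a rectified stereo pair with identical cameras, focal length `f` in pixels, principal point `(cx, cy)` and baseline length `baseline`; the function name and the numbers are illustrative. The depth follows from the disparity `d = u_left - u_right` as `Z = f * baseline / d`.

```python
def triangulate_rectified(u_left, u_right, v, f, baseline, cx, cy):
    """3D point in the left camera frame from a rectified correspondence."""
    d = u_left - u_right  # disparity in pixels
    if d <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    Z = f * baseline / d          # depth from similar triangles
    X = (u_left - cx) * Z / f     # back-project the left pixel
    Y = (v - cy) * Z / f
    return (X, Y, Z)

# Assumed setup: f = 800 px, baseline = 0.1 m, principal point (320, 240)
point = triangulate_rectified(u_left=360.0, u_right=320.0, v=240.0,
                              f=800.0, baseline=0.1, cx=320.0, cy=240.0)
print(point)
```

In a structured-light scanner one of the two cameras is replaced by a projector, and the projected code identifies the projector column that plays the role of `u_right`.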
Epipolar geometry
- I have understood what epipoles are.
- I have learned that the mapping from one point to the epipolar line in another image is described by the fundamental matrix.
- I can estimate where the epipolar lines are in two images of the same scene.
- I know that you can rectify a stereo image pair if you know the fundamental matrix.
- I have understood the difference between essential matrix and fundamental matrix.
- I understood that you can calculate the fundamental matrix from matching image points.
- I learned that it is still a relatively simple minimization problem.
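The relation between the essential and the fundamental matrix can be checked numerically. The sketch below assumes two cameras with the same intrinsics (focal length `f`, principal point `(cx, cy)`) and a known relative pose `X2 = R * X1 + t`. It builds `E = [t]x R` and `F = K^-T E K^-1`, maps a point in image 1 to its epipolar line in image 2, and verifies that the matching point lies on that line. All names and values are illustrative; in practice `F` is estimated from matching image points rather than from a known pose.

```python
import math

def matmul(A, B):
    """Product of two matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def skew(t):
    """Cross-product matrix [t]x, so that matvec(skew(t), v) equals t x v."""
    return [[0.0, -t[2], t[1]],
            [t[2], 0.0, -t[0]],
            [-t[1], t[0], 0.0]]

def fundamental_from_pose(f, cx, cy, R, t):
    """Essential and fundamental matrix for two cameras sharing the same K."""
    K_inv = [[1.0 / f, 0.0, -cx / f],
             [0.0, 1.0 / f, -cy / f],
             [0.0, 0.0, 1.0]]
    E = matmul(skew(t), R)                           # works on normalized coordinates
    F = matmul(matmul(transpose(K_inv), E), K_inv)   # works on pixel coordinates
    return E, F

def to_pixel(f, cx, cy, X):
    """Homogeneous pixel coordinates of a point given in camera coordinates."""
    return [f * X[0] / X[2] + cx, f * X[1] / X[2] + cy, 1.0]

f, cx, cy = 800.0, 320.0, 240.0
angle = 0.1  # assumed rotation about the y-axis in radians
R = [[math.cos(angle), 0.0, math.sin(angle)],
     [0.0, 1.0, 0.0],
     [-math.sin(angle), 0.0, math.cos(angle)]]
t = [-0.2, 0.0, 0.0]
E, F = fundamental_from_pose(f, cx, cy, R, t)

X1 = [0.3, -0.1, 2.0]                                # point in camera-1 coordinates
X2 = [r + ti for r, ti in zip(matvec(R, X1), t)]     # same point in camera-2 coordinates
x1 = to_pixel(f, cx, cy, X1)
x2 = to_pixel(f, cx, cy, X2)

line2 = matvec(F, x1)                                # epipolar line in image 2
residual = sum(a * b for a, b in zip(x2, line2))     # x2^T F x1, close to zero
print(line2, residual)
```

The essential matrix needs calibrated (normalized) coordinates, while the fundamental matrix absorbs the intrinsics and works directly on pixels.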
Multiview Stereo (MVS) and NeRF (and tbd: 3DGS)
MVS
- I have understood that multi-view stereo assumes that one knows the intrinsic and extrinsic camera parameters.
- I learned about different weighting functions for matching.
- I learned that you can reconstruct very accurately with many cameras and I know some applications.
- I learned about the influence of the baseline length.
- I have understood the principle of how to arrive at 3D models instead of a single depth map.
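The influence of the baseline mentioned above can be made explicit with a short derivation. It assumes a rectified camera pair with focal length $f$, baseline length $B$ and disparity $d$; these symbols are introduced here for illustration and are not taken from the course material. Differentiating the depth with respect to the disparity gives the first-order depth error caused by a disparity error $\Delta d$:

```latex
Z = \frac{f\,B}{d}
\quad\Longrightarrow\quad
\frac{\partial Z}{\partial d} = -\frac{f\,B}{d^{2}} = -\frac{Z^{2}}{f\,B}
\quad\Longrightarrow\quad
\left|\Delta Z\right| \approx \frac{Z^{2}}{f\,B}\,\left|\Delta d\right|
```

Under these assumptions a longer baseline reduces the depth error for the same matching accuracy, while the error grows quadratically with distance. A longer baseline also makes the two views differ more, which tends to make matching harder.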
NeRF (and Gaussian Splatting)
- I know what a NeRF is and how it is created.
- I have understood that a NeRF does not store geometries.
- I can differentiate how NeRFs are trained and rendered.
- I understand why controlling and editing a NeRF is complicated.
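The rendering side of a NeRF can be sketched as the volume rendering sum along one camera ray. The example assumes that densities `sigmas` and colors `colors` at the sample positions are already available, whereas in an actual NeRF they come from querying the trained network; `deltas` are the distances between neighboring samples. The function name and the values are illustrative.

```python
import math

def render_ray(sigmas, colors, deltas):
    """Accumulate color along a ray from per-sample densities and colors."""
    transmittance = 1.0          # fraction of light not yet absorbed
    pixel = [0.0, 0.0, 0.0]
    for sigma, color, delta in zip(sigmas, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)   # opacity of this ray segment
        weight = transmittance * alpha           # contribution of this sample
        pixel = [p + weight * c for p, c in zip(pixel, color)]
        transmittance *= 1.0 - alpha
    return pixel

colors = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
deltas = [0.1, 0.1, 0.1]

# Empty space, then a practically opaque green sample that hides the blue one
opaque = render_ray([0.0, 1e9, 5.0], colors, deltas)
empty = render_ray([0.0, 0.0, 0.0], colors, deltas)
print(opaque, empty)
```

Training compares pixels rendered this way with the pixels of the input photographs and adjusts the network weights. No explicit surface is stored, which is one reason why editing a NeRF is complicated.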
Feature Matching and Structure from motion (SfM)
Features
- I understand what is meant by image feature.
- I am aware of different applications that use image features and am motivated to learn more about features.
- I understand the advantages of image features over individual pixels, regions or whole images.
- I understand the approach: detection, description, matching.
- I know what a blob is and know and understand the concept of scale invariance.
- I have understood what invariance is and why it is necessary.
- I know applications based on finding pairs of features.
- I understand how invariance can be achieved by the descriptor.
- I know SIFT as a detector and descriptor.
- I can calculate the distance between two features.
- I understood that there is a problem with repetitions and know possible solutions to it.
- I understand RANSAC and have ideas about where to use it.
- I go through the day exhilarated and whistling the RANSAC song!
- I can distinguish between feature detection and feature description.
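The RANSAC objective can be illustrated with a minimal sketch that fits a 2D line to points containing outliers. Line fitting is used here only because it is the simplest model; with feature matches the same loop would estimate, for example, a homography or a fundamental matrix. The function names, the threshold, the iteration count and the data are assumptions for illustration.

```python
import math
import random

def fit_line(p, q):
    """Line a*x + b*y + c = 0 through p and q, scaled so (a, b) has unit length."""
    a = p[1] - q[1]
    b = q[0] - p[0]
    c = p[0] * q[1] - q[0] * p[1]
    norm = math.hypot(a, b)
    return (a / norm, b / norm, c / norm)

def ransac_line(points, iterations=200, threshold=0.1, seed=0):
    rng = random.Random(seed)  # fixed seed keeps the example reproducible
    best_line, best_inliers = None, []
    for _ in range(iterations):
        p, q = rng.sample(points, 2)              # minimal sample: two points
        line = fit_line(p, q)
        inliers = [pt for pt in points
                   if abs(line[0] * pt[0] + line[1] * pt[1] + line[2]) < threshold]
        if len(inliers) > len(best_inliers):      # keep the model with most support
            best_line, best_inliers = line, inliers
    return best_line, best_inliers

# Ten points on y = 2x + 1 plus four gross outliers
points = [(float(x), 2.0 * x + 1.0) for x in range(10)]
points += [(0.0, 10.0), (3.0, -5.0), (5.0, 30.0), (8.0, 0.0)]

line, inliers = ransac_line(points)
slope = -line[0] / line[1]
intercept = -line[2] / line[1]
print(slope, intercept, len(inliers))
```

A least-squares fit over all points would be pulled away by the outliers, whereas the model with the largest consensus set ignores them.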
SfM
- I know what Structure from Motion means and where the name comes from.
- I understand that you can take advantage of the low rank when solving the system of equations.
- I understand that there is a lot of data and large matrices that need to be computed together and that is why iterative methods are used.
- I know many different applications of SfM.
3D data structures and Time-of-Flight (ToF) imaging
Data structures and algorithms
- I know what data structures exist for 3D data.
- I understand how the ICP algorithm works.
- I know octrees.
- I know how a k-d tree is constructed and how to find the nearest neighbors in it.
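Construction and nearest-neighbor search in a k-d tree can be sketched as follows. The tree is stored in plain dictionaries, the splitting axis cycles with the depth, and the median point becomes the node; these are common choices, not necessarily the ones used in the course. The function names and the sample points are illustrative.

```python
def build_kdtree(points, depth=0):
    """Recursively split the points at the median along alternating axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(node, query, best=None):
    """Nearest neighbor of query; prunes subtrees that cannot contain a closer point."""
    if node is None:
        return best
    if best is None or sq_dist(node["point"], query) < sq_dist(best, query):
        best = node["point"]
    axis = node["axis"]
    diff = query[axis] - node["point"][axis]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best match
    if diff ** 2 < sq_dist(best, query):
        best = nearest(far, query, best)
    return best

points = [(2, 3, 1), (5, 4, 7), (9, 6, 2), (4, 7, 9), (8, 1, 5), (7, 2, 6)]
tree = build_kdtree(points)
query = (9, 2, 4)
found = nearest(tree, query)
brute_force = min(points, key=lambda p: sq_dist(p, query))
print(found, brute_force)
```

Nearest-neighbor queries of this kind are the expensive step in each ICP iteration, which is why ICP implementations typically use such a tree.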
3D cameras
- I know different 3D camera technologies.
- I have understood how the Time-Of-Flight measurement principle works.
- I know the necessary hardware components of a ToF camera.
- I know which problems exist in ToF imaging and how they can be solved.
- I know different applications for 3D cameras.
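The Time-of-Flight measurement principle can be written down in a few lines. The sketch covers two common variants: a pulsed measurement, where the distance follows from the round-trip time, and a continuous-wave measurement, where it follows from the phase shift of a modulated signal. Which variant the course focuses on is not stated in the objectives, so both are shown as assumptions; the function names and numbers are illustrative.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def distance_from_pulse(round_trip_time_s):
    """Pulsed ToF: light travels to the object and back, hence the factor 1/2."""
    return C * round_trip_time_s / 2.0

def distance_from_phase(phase_rad, mod_freq_hz):
    """Continuous-wave ToF: distance from the phase shift of the modulation."""
    return C * phase_rad / (4.0 * math.pi * mod_freq_hz)

def unambiguous_range(mod_freq_hz):
    """Largest distance before the phase wraps around (phase = 2*pi)."""
    return C / (2.0 * mod_freq_hz)

pulse_distance = distance_from_pulse(20e-9)           # 20 ns round trip
phase_distance = distance_from_phase(math.pi, 20e6)   # half a period at 20 MHz
max_range = unambiguous_range(20e6)
print(pulse_distance, phase_distance, max_range)
```

The phase wrap-around is one of the typical problems of continuous-wave ToF imaging: objects beyond the unambiguous range appear closer than they are unless, for example, several modulation frequencies are combined.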