SIFT

Jul 19, 2024

a.k.a. Scale-Invariant Feature Transform

Local feature matching that is invariant, or at least robust, to:

scale
orientation/rotation
illumination/brightness
occlusion
noise
small changes in viewpoint

Keypoint Localization

See Laplacian of Gaussian Edge Detector

apply the σ-normalized Laplacian of Gaussian (NLoG = σ²∇²G) to the image at multiple values of the scale parameter σ
for each blob, as we increase σ, the response at its center grows to a peak and then fades away
the σ at which the response peaks is the blob's characteristic scale
characteristic scale (σ) ∝ the size of the blob
Selection of σ:
- σ_k = σ₀s^k, k = 0, 1, 2, 3, …
- s = constant multiplier (s > 1)
- σ₀ = initial scale
Fast approximation → Difference of Gaussians (DoG)
- DoG = G(sσ) − G(σ) ≈ (s − 1) · NLoG
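
To make the link between blob size and characteristic scale concrete, here is a minimal sketch for one idealized case: a bright disk of radius R with unit intensity on a dark background. For that case the NLoG response at the disk center has the closed form −2U·e^(−U) with U = R²/(2σ²), which is largest in magnitude at σ = R/√2. The function names and the choice s = 2^(1/4) are assumptions made for illustration, not part of SIFT itself.

```python
import math

def sigma_schedule(sigma0, s, count):
    """Scales sigma_k = sigma0 * s**k for k = 0 .. count-1."""
    return [sigma0 * s ** k for k in range(count)]

def nlog_at_disk_center(radius, sigma):
    """NLoG response at the center of an ideal unit-intensity disk.

    Integrating sigma^2 * LoG over the disk gives -2 * U * exp(-U),
    with U = radius^2 / (2 * sigma^2). Valid for this idealized blob only.
    """
    u = radius ** 2 / (2.0 * sigma ** 2)
    return -2.0 * u * math.exp(-u)

def characteristic_scale(radius, sigmas):
    """Scale in `sigmas` with the strongest absolute NLoG response."""
    return max(sigmas, key=lambda sg: abs(nlog_at_disk_center(radius, sg)))

# Hypothetical schedule: sigma0 = 1, four scales per doubling of sigma.
sigmas = sigma_schedule(sigma0=1.0, s=2 ** 0.25, count=20)
for radius in (4.0, 8.0, 16.0):
    print(radius, round(characteristic_scale(radius, sigmas), 3))
```

Doubling the disk radius doubles the selected σ, which is the proportionality stated above. On a real image the response would come from convolving with the NLoG kernel rather than from a closed form.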

see Difference of Gaussians Edge Detector
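
The DoG approximation can be checked numerically. The sketch below evaluates the 2D Gaussian kernel, its σ-normalized Laplacian, and the difference of two Gaussians at a few sample points, then prints DoG next to (s − 1)·NLoG. The relation is a first-order approximation, so it is tightest when s is close to 1; the values σ = 2 and s = 1.01 are arbitrary choices for the demonstration.

```python
import math

def gaussian(x, y, sigma):
    """2D isotropic Gaussian kernel value at (x, y)."""
    return math.exp(-(x * x + y * y) / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)

def nlog(x, y, sigma):
    """Sigma-normalized Laplacian of Gaussian: sigma^2 * laplacian(G)."""
    r2 = x * x + y * y
    laplacian = (r2 - 2 * sigma ** 2) / sigma ** 4 * gaussian(x, y, sigma)
    return sigma ** 2 * laplacian

def dog(x, y, sigma, s):
    """Difference of Gaussians at scales s*sigma and sigma."""
    return gaussian(x, y, s * sigma) - gaussian(x, y, sigma)

sigma, s = 2.0, 1.01
for x, y in [(0.0, 0.0), (1.0, 0.5), (3.0, 0.0), (4.0, 3.0)]:
    print((x, y), dog(x, y, sigma, s), (s - 1) * nlog(x, y, sigma))
```

The practical benefit is that the Gaussian-blurred images are needed anyway to build the scale stack, so each DoG layer costs only one image subtraction.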

detect the local extrema in the resulting stack of responses
Non-maximal suppression: slide an N×N×N window (typically 3×3×3) over the stack; if the absolute value of the center pixel is significantly larger than the absolute values of all its neighbors in its own scale and in the neighbouring scales, it is declared to be an extremum.

See Non-Maximal Suppression

apply a contrast threshold to remove weak extrema, since low-contrast responses are easily produced by noise
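
The extremum test and the contrast threshold can be sketched together as follows. This assumes the DoG stack is a nested list indexed as `stack[scale][row][column]`, uses a 3×3×3 neighborhood, and requires the center to strictly exceed all 26 neighbors in absolute value. The function name and the default threshold of 0.03 are assumptions for illustration; a suitable threshold depends on the intensity range of the image.

```python
def find_extrema(stack, contrast_threshold=0.03):
    """Return (scale, row, col) positions that pass both tests.

    A position is kept when the absolute value at the center is at least
    `contrast_threshold` and strictly larger than the absolute values of
    all 26 neighbors in the 3x3x3 window around it.
    """
    keypoints = []
    n_scales, height, width = len(stack), len(stack[0]), len(stack[0][0])
    # Border positions are skipped because their windows are incomplete.
    for k in range(1, n_scales - 1):
        for y in range(1, height - 1):
            for x in range(1, width - 1):
                center = abs(stack[k][y][x])
                if center < contrast_threshold:
                    continue  # weak extremum, rejected by the threshold
                neighbors = [
                    abs(stack[k + dk][y + dy][x + dx])
                    for dk in (-1, 0, 1)
                    for dy in (-1, 0, 1)
                    for dx in (-1, 0, 1)
                    if (dk, dy, dx) != (0, 0, 0)
                ]
                if center > max(neighbors):
                    keypoints.append((k, y, x))
    return keypoints

# Toy stack: 3 scales of 5x7 pixels, one strong and one weak response.
stack = [[[0.0] * 7 for _ in range(5)] for _ in range(3)]
stack[1][2][2] = -0.5   # strong response, should be kept
stack[1][2][5] = 0.01   # weak response, should be rejected
print(find_extrema(stack))
```

A stricter reading of "significantly larger" would require the center to beat its neighbors by some margin; that margin is left out here to keep the sketch short.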

Orientation invariant Region Selection

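consider a square window of pixels around the blob, sized in proportion to its characteristic scale
compute the image gradient direction at each pixel in the window
principal orientation: the most common gradient direction (the peak of the gradient direction histogram)
measure all later gradient directions relative to the principal orientation, so the description does not change when the image is rotated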
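
A minimal sketch of finding the principal orientation, assuming the window is given as a nested list of intensities indexed as `patch[row][column]`. Gradients come from central differences, and each pixel votes into a direction histogram with a weight equal to its gradient magnitude. The magnitude weighting and the choice of 36 bins (10° each) are assumptions of this sketch; the function name is hypothetical.

```python
import math

def principal_orientation(patch, n_bins=36):
    """Dominant gradient direction of a 2D patch, in degrees.

    Angles are measured in image coordinates (x to the right, y downward)
    and the result is quantized to a multiple of 360 / n_bins.
    """
    hist = [0.0] * n_bins
    bin_width = 360.0 / n_bins
    for y in range(1, len(patch) - 1):
        for x in range(1, len(patch[0]) - 1):
            # Central differences approximate the image gradient.
            gx = (patch[y][x + 1] - patch[y][x - 1]) / 2.0
            gy = (patch[y + 1][x] - patch[y - 1][x]) / 2.0
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 360.0
            # Bins are centered on multiples of bin_width.
            hist[round(angle / bin_width) % n_bins] += magnitude
    peak = max(range(n_bins), key=hist.__getitem__)
    return peak * bin_width

# Intensity ramps: one increasing along x, one increasing along y.
ramp_x = [[float(x) for x in range(7)] for _ in range(7)]
ramp_y = [[float(y)] * 7 for y in range(7)]
print(principal_orientation(ramp_x), principal_orientation(ramp_y))
```

Once the principal orientation is known, the window can be rotated by that angle (or all gradient directions can be offset by it) before the descriptor is computed.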

SIFT Descriptors

divide the window into a 4 × 4 grid of cells, calculate the gradient orientation histogram of each cell, and concatenate them → normalized histogram = SIFT descriptor
each histogram has 8 direction bins
descriptor = 4 × 4 cells × 8 bins = 128 elements
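
A simplified sketch of building such a descriptor, assuming a square patch (for example 16×16 pixels) that has already been rotated to its principal orientation. With `grid=4` and 8 bins the result has 4 × 4 × 8 = 128 elements; with `grid=2` the same code gives the four-quadrant variant with 32 elements. The function name is hypothetical, and refinements used in full SIFT implementations (Gaussian weighting of the window, interpolation between bins, clipping of large values) are deliberately left out.

```python
import math

def sift_like_descriptor(patch, grid=4, n_bins=8):
    """Concatenated, L2-normalized gradient direction histograms.

    `patch` is a square nested list whose side is divisible by `grid`.
    Each of the grid*grid cells gets its own n_bins-bin histogram,
    weighted by gradient magnitude.
    """
    size = len(patch)
    cell = size // grid
    bin_width = 360.0 / n_bins
    descriptor = [0.0] * (grid * grid * n_bins)
    for y in range(size):
        for x in range(size):
            # Differences with clamped borders so every pixel contributes.
            gx = patch[y][min(x + 1, size - 1)] - patch[y][max(x - 1, 0)]
            gy = patch[min(y + 1, size - 1)][x] - patch[max(y - 1, 0)][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 360.0
            direction_bin = int(angle // bin_width) % n_bins
            cell_index = (y // cell) * grid + (x // cell)
            descriptor[cell_index * n_bins + direction_bin] += magnitude
    norm = math.sqrt(sum(v * v for v in descriptor))
    return [v / norm for v in descriptor] if norm > 0 else descriptor

# Synthetic 16x16 patch, used only to exercise the function.
patch = [[math.sin(0.5 * x) + math.cos(0.3 * y) for x in range(16)] for y in range(16)]
descriptor = sift_like_descriptor(patch)
print(len(descriptor))
```

Normalizing the concatenated histogram is what gives the descriptor its tolerance to uniform changes in brightness and contrast.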
Comparing SIFT descriptors
- L2 Norm
  - smaller value = better match
- Normalized Correlation
  - larger value = better match; perfect match when d(H₁,H₂) = 1
- Intersection
  - larger value = better match
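
The three comparison measures can be sketched as below, treating each descriptor as a plain list of non-negative numbers of equal length. The function names are chosen for illustration. Normalized correlation is taken here as the mean-centered correlation coefficient, which is an assumption about the exact definition intended above.

```python
import math

def l2_distance(h1, h2):
    """Euclidean distance between two histograms (smaller = better)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))

def normalized_correlation(h1, h2):
    """Mean-centered correlation in [-1, 1] (larger = better, 1 = perfect)."""
    m1 = sum(h1) / len(h1)
    m2 = sum(h2) / len(h2)
    num = sum((a - m1) * (b - m2) for a, b in zip(h1, h2))
    den = math.sqrt(sum((a - m1) ** 2 for a in h1) * sum((b - m2) ** 2 for b in h2))
    # A constant histogram has no variation to correlate with.
    return num / den if den > 0 else 0.0

def intersection(h1, h2):
    """Sum of bin-wise minima (larger = better)."""
    return sum(min(a, b) for a, b in zip(h1, h2))

# Two small example histograms, each summing to 1.
h1 = [0.1, 0.4, 0.3, 0.2]
h2 = [0.2, 0.3, 0.3, 0.2]
print(l2_distance(h1, h2), normalized_correlation(h1, h2), intersection(h1, h2))
```

Note the opposite conventions: a match search minimizes the L2 distance but maximizes normalized correlation and intersection.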

Applications

Object Recognition
Panorama Stitching
Auto Collages

Limitations

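3D objects: the appearance of a local feature changes as the object rotates out of the image plane
large changes in viewpoint or viewing angle (only small changes are tolerated)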
