Pirates are evil? The Marines are righteous? These terms have always changed throughout the course of history! Kids who have never seen peace and kids who have never seen war have different values! Those who stand at the top determine what's wrong and what's right! This very place is neutral ground! Justice will prevail, you say? But of course it will! Whoever wins this war becomes justice! - Donquixote Doflamingo

  • Can perform semantic segmentation

  • The ability to distinguish areas of the data

  • PointNet extracts both local and global features from a point cloud in any orientation

  • Still not entirely sure what this means

  • Analyses point clouds

  • A point cloud is an unordered set of 3D points; what it represents is invariant to both the order of the points and rigid motion (rotation and translation)

  • Point cloud networks (like CNNs) are feature extractors and usually go before a classifier (like an SVM or MLP)

  • When we want to learn an object like an airplane, how can we get the model to understand they have a nose, wings, tail etc.?

  • We can’t specify this for every class, so we need a way for the model to learn the features itself

  • PointNet consumes an entire point cloud, learns a spatial encoding of each point, aggregates learned encodings into features, and feeds them into Classification and Segmentation heads.

  • Encoding is the transformation of data into model inputs

  • Embeddings are the representation of the inputs in latent space

  • The PointNet model attempts to approximate the following function: f(x_1, x_2, ..., x_n) ≈ g(h(x_1), h(x_2), ..., h(x_n))

  • f is symmetric where “a symmetric function takes n vectors as input and outputs a new vector that is invariant to the input order”

  • This is important given that point clouds are unordered: shuffling the points must not change the result

  • x_1, x_2, ..., x_n are the input points

  • g is the max pooling function

  • Max pooling occurs on features/dimensions across inputs

  • h is an MLP whose weights are shared across all input points
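
The approximation described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the PointNet implementation: the random weights `W` and `b`, the single ReLU layer standing in for the shared MLP, and the helper names `h`, `g`, and `f` are all made up for the example.

```python
import random

# Hypothetical weights for a one-layer shared "MLP": 3 inputs -> 4 features.
random.seed(0)
W = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
b = [random.uniform(-1, 1) for _ in range(4)]

def h(point):
    """Shared per-point function: the same W and b are applied to every point."""
    return [max(0.0, sum(w * x for w, x in zip(row, point)) + bias)
            for row, bias in zip(W, b)]

def g(features):
    """Max pooling: the maximum of each feature dimension across all points."""
    return [max(column) for column in zip(*features)]

def f(points):
    """Symmetric function: f(x_1, ..., x_n) = g(h(x_1), ..., h(x_n))."""
    return g([h(p) for p in points])

cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.5, 0.5, 3.0)]
shuffled = cloud[::-1]  # same points, different order

global_feature = f(cloud)
same_after_reorder = f(shuffled) == global_feature
```

Because `h` treats each point independently and `g` takes a per-dimension maximum, reordering the points cannot change the output, which is the order invariance the notes ask for.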

  • A T-net is a special Neural Network that learns a transformation matrix that will rotate the input point cloud to a consistent orientation.

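  • Rotate here isn’t literal: the learned matrix is a general linear transformation and is not constrained to be a pure rotation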

  • T-net begins with a learned transformation matrix

  • The T-net is a mini PointNet that performs its own feature extraction with a shared MLP and a max pooling function. It then uses fully connected layers to scale the features down into a transformation matrix

  • The transformation matrix begins as the identity, and the output of the model is then added to it
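
The identity-plus-output step can be sketched as follows. This is only an illustration under assumptions: the helper names are hypothetical, points are treated as row vectors, and the all-zero network output stands in for what an untrained T-net might produce.

```python
def identity(k):
    """k x k identity matrix as nested lists."""
    return [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]

def add_to_identity(offset):
    """Transformation = identity + network output (offset is a k x k list)."""
    k = len(offset)
    eye = identity(k)
    return [[eye[i][j] + offset[i][j] for j in range(k)] for i in range(k)]

def transform(points, matrix):
    """Multiply every point (as a row vector) by the same k x k matrix."""
    k = len(matrix)
    return [[sum(p[i] * matrix[i][j] for i in range(k)) for j in range(k)]
            for p in points]

cloud = [[1.0, 2.0, 3.0], [-1.0, 0.5, 0.0]]

# Hypothetical T-net output at the start of training: all zeros.
zero_offset = [[0.0] * 3 for _ in range(3)]

# Identity + zero output leaves the point cloud untouched.
unchanged = transform(cloud, add_to_identity(zero_offset))

# Using the zero matrix directly (no identity added) collapses every point.
collapsed = transform(cloud, zero_offset)
```

Starting from the identity means the network initially passes the point cloud through unchanged and only learns a correction on top of that.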

  • The local (per-point) features lie right before the max pool

  • After the max pool, there are global features

  • The max pooling layer retains the highest importance points that define the shape

  • On why the T-net transformation starts as the identity: if we initialized it to zero, it would map every point to zero, and if we used a random initialization, we could disrupt the structure of the point cloud.

  • The critical points are the points whose features survive the max pool, so they are what the global features are built from
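
One way to make the idea of critical points concrete is to find which points supply the maximum in at least one feature dimension. The sketch below is a toy illustration: the feature values and the helper names `critical_point_indices` and `max_pool` are assumptions made for the example.

```python
def critical_point_indices(per_point_features):
    """Indices of points that supply the maximum of at least one feature dimension."""
    indices = set()
    for column in zip(*per_point_features):
        indices.add(column.index(max(column)))
    return sorted(indices)

def max_pool(per_point_features):
    """Per-dimension maximum across all points."""
    return [max(column) for column in zip(*per_point_features)]

# Hypothetical per-point features: 4 points, 3 feature dimensions.
features = [
    [0.9, 0.1, 0.2],
    [0.3, 0.8, 0.1],
    [0.2, 0.2, 0.1],  # never the maximum in any dimension
    [0.1, 0.3, 0.7],
]

critical = critical_point_indices(features)
pooled_all = max_pool(features)
pooled_critical = max_pool([features[i] for i in critical])
```

Pooling over only the critical points gives the same global feature as pooling over the full set, which is why dropping the remaining points does not change the result.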

  • Classification head just uses the global features

  • Segmentation head uses global and local features

  • The global features are essentially concatenated onto each point's features: n x 64 local features plus a 1024-dimensional global feature vector give an n x 1088 input to the segmentation head.
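
The step from n x 64 local features and 1024 global features to an n x 1088 input amounts to a concatenation (64 + 1024 = 1088). The sketch below shows only the shape bookkeeping; the placeholder values and the helper name `segmentation_input` are assumptions for illustration.

```python
def segmentation_input(local_features, global_feature):
    """Append the same global feature vector to every point's local features."""
    return [list(local) + list(global_feature) for local in local_features]

n = 5
local_features = [[0.0] * 64 for _ in range(n)]  # placeholder n x 64 local features
global_feature = [1.0] * 1024                    # placeholder 1024-d global feature

combined = segmentation_input(local_features, global_feature)
shape = (len(combined), len(combined[0]))
```

Every point ends up with its own 64 local values followed by the same 1024 global values, so each per-point prediction can use both local detail and whole-shape context.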