Learning from data without target variables
## Discrete Latent Variables
The unsupervised analogue of classification: each data point is explained by a discrete latent variable (its cluster label)
- Clustering
    - Different kinds of clustering models
        - Centroid
        - Distribution
        - Connectivity (hierarchical)
        - Others
    - Assignment of data into clusters
        - Hard
            - each data point is assigned to exactly one cluster
        - Soft
            - each data point can belong to multiple clusters, with a degree of membership (e.g. a probability) for each
[[K-Means Clustering]]
[[Gaussian Mixture Models]]
[[Expectation Maximization (EM) Algorithm]]
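
The difference between hard and soft assignment can be sketched in a few lines of NumPy. This is a minimal illustration, not a full clustering algorithm: the centroids are fixed by hand, and the soft assignment assumes equal-weight isotropic Gaussians with a shared, hypothetical width `sigma`. The function names `hard_assign` and `soft_assign` are made up for this example.

```python
import numpy as np

def squared_distances(X, centroids):
    # Squared Euclidean distance from every point to every centroid, shape (n, k)
    return ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)

def hard_assign(X, centroids):
    # Hard assignment: each point gets the index of its nearest centroid
    return squared_distances(X, centroids).argmin(axis=1)

def soft_assign(X, centroids, sigma=1.0):
    # Soft assignment: membership weights that sum to 1 for each point.
    # Assumption: equal-weight isotropic Gaussians with shared width sigma.
    logits = -squared_distances(X, centroids) / (2 * sigma**2)
    logits -= logits.max(axis=1, keepdims=True)  # for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)

X = np.array([[0.0, 0.0], [4.0, 0.0], [2.0, 0.0]])
centroids = np.array([[0.0, 0.0], [4.0, 0.0]])

hard = hard_assign(X, centroids)  # one cluster index per point
soft = soft_assign(X, centroids)  # one row of membership weights per point
```

The third point lies exactly halfway between the two centroids. Hard assignment still has to pick one cluster for it, while soft assignment splits its membership evenly between both.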
## Continuous Latent Variables
The unsupervised analogue of regression: each data point is explained by continuous latent variables
Model variables that are not observed but are assumed to exist --> ***latent*** variables
- assumed to be part of the process that generates the data
Example: digit recognition
- observed raw data --> pixel values
- hidden latent data
    - scaling
    - rotation
    - translation
High dimensional feature space vs. lower dimensional manifold
- observed data --> high dimensional space
- latent data --> lower dimensional manifold
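
As a rough illustration of the digit example, the sketch below generates images whose only source of variation is a single latent variable, a horizontal translation. The 8x8 grid, the 2x2 blob standing in for a digit, and the helper name `make_image` are all assumptions made for this example.

```python
import numpy as np

def make_image(shift, size=8):
    # Draw a fixed 2x2 blob on a size x size grid, then translate it horizontally
    img = np.zeros((size, size))
    img[3:5, 0:2] = 1.0
    return np.roll(img, shift, axis=1)

# Latent data: one number (the translation) per image
shifts = np.arange(6)

# Observed data: 64 pixel values per image
X = np.stack([make_image(s).ravel() for s in shifts])
```

Every row of `X` is a point in a 64-dimensional pixel space, yet the whole data set is traced out by varying one number. Adding rotation and scaling would give a three-dimensional latent space while the observed dimensionality stays at 64.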
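Simplest case:
- [[Principal Component Analysis]]: in its probabilistic formulation, both observed and latent variables are assumed to be Gaussian, with the observed variables depending linearly on the latent ones
- Probabilistic PCA --> generalizes to factor analysis (diagonal instead of isotropic noise covariance)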
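
The linear-Gaussian setting can be sketched as follows: data is generated from a low-dimensional Gaussian latent variable and then analyzed with classical PCA computed via the SVD. This is not a maximum-likelihood fit of probabilistic PCA. The dimensions, the noise level, and the random mixing matrix `W` are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 500, 10, 2  # samples, observed dimensions, latent dimensions

# Generative model: Gaussian latent z, linear map W, small Gaussian noise
Z = rng.normal(size=(n, m))
W = rng.normal(size=(d, m))
X = Z @ W.T + 0.01 * rng.normal(size=(n, d))

# Classical PCA: center the data, then take the SVD
mean = X.mean(axis=0)
Xc = X - mean
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Fraction of the total variance captured by each principal component
explained = S**2 / (S**2).sum()

# Project onto the first m principal directions and reconstruct
Z_hat = Xc @ Vt[:m].T
X_rec = Z_hat @ Vt[:m] + mean
```

Because the data was generated from two latent dimensions plus a small amount of noise, almost all of the variance should fall on the first two components. `Z_hat` recovers the latent coordinates only up to a rotation and scaling of the latent space, not the original `Z` itself.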
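Non-Gaussian or nonlinear assumptions:
- Independent component analysis (linear, with non-Gaussian latent variables)
- Neural network autoencoders (nonlinear mapping between observed and latent variables)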