KNN, Naive Bayes
Cornell College
STA 362 Spring 2024 Block 8
Given positive integer \(K\), and a test observation \(x_0\), KNN identifies the \(K\) points in the training data that are closest to \(x_0\), represented by \(\mathcal{N}_0\).
KNN then estimates the conditional probabilities for each class \(j\) as the fraction of the points in \(\mathcal{N}_0\) whose response values equal \(j\):
\[P(Y=j|X=x_0)=\frac{1}{K}\sum_{i\in \mathcal{N}_0}I(y_i=j)\]
Lastly, KNN classifies \(x_0\) into the class with the largest estimated probability.
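Below is a minimal NumPy sketch of this procedure; the function name, the simulated "blue"/"orange" training data, and the test point are illustrative placeholders, not from the slides.

```python
import numpy as np

def knn_classify(X_train, y_train, x0, K=3):
    # Euclidean distance from x0 to every training observation
    dists = np.linalg.norm(X_train - x0, axis=1)
    # Indices of the K nearest neighbors: this set plays the role of N_0
    nearest = np.argsort(dists)[:K]
    # Estimated conditional probability for each class j:
    # the fraction of the K neighbors whose response equals j
    classes = np.unique(y_train)
    probs = {j: np.mean(y_train[nearest] == j) for j in classes}
    # Classify x0 into the class with the largest estimated probability
    return max(probs, key=probs.get), probs

# Example: six "blue" and six "orange" training points in two dimensions
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(0, 1, (6, 2)), rng.normal(2, 1, (6, 2))])
y_train = np.array(["blue"] * 6 + ["orange"] * 6)
label, probs = knn_classify(X_train, y_train, x0=np.array([1.0, 1.0]), K=3)
print(label, probs)
```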
The KNN approach, using \(K = 3\), is illustrated in a situation with six blue observations and six orange observations.
A test observation, \(x_0\), at which a predicted class label is desired is shown as a black cross.
The three closest points to the test observation are identified, and it is predicted that the test observation belongs to the most commonly-occurring class, in this case blue.
The KNN decision boundary for this example is shown in black. The blue grid indicates the region in which a test observation will be assigned to the blue class, and the orange grid indicates the region in which it will be assigned to the orange class.
The black curve indicates the KNN decision boundary, using \(K = 10\). The Bayes decision boundary is shown as a purple dashed line.
The KNN and Bayes decision boundaries are very similar.
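A decision boundary like the black KNN curve can be traced by predicting the class at every point of a fine grid over the predictor space. A short sketch using scikit-learn's KNeighborsClassifier with \(K = 10\); the simulated two-class data are placeholders.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(2, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

knn = KNeighborsClassifier(n_neighbors=10).fit(X, y)

# Evaluate the fitted classifier on a grid covering the predictor space
xx, yy = np.meshgrid(np.linspace(X[:, 0].min(), X[:, 0].max(), 200),
                     np.linspace(X[:, 1].min(), X[:, 1].max(), 200))
grid_pred = knn.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
# The decision boundary is where grid_pred switches class; plotting it with
# matplotlib's contour(xx, yy, grid_pred, levels=[0.5]) reproduces the black curve.
```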
These independence assumptions allow us to write, for \(k=1,2,...,K\),
\[f_k(x) = f_{k1}(x_1)\cdot f_{k2}(x_2)\cdots f_{kp}(x_p)\]
Making these assumptions, we now have:
\[P(Y=k|X=x) = \frac{\pi_k\cdot f_{k1}(x_1)\cdots f_{kp}(x_p)}{\sum_{l=1}^K\pi_l\cdot f_{l1}(x_1)\cdots f_{lp}(x_p)}\] for \(k=1,2,...,K\)
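As a sketch of this formula, the function below computes the posterior using Gaussian one-dimensional densities for each \(f_{kj}\), estimated from class-specific means and standard deviations. The Gaussian choice and the function name are assumptions for illustration; the slide's formula allows any one-dimensional densities.

```python
import numpy as np
from scipy.stats import norm

def naive_bayes_posterior(X_train, y_train, x):
    classes = np.unique(y_train)
    numerators = []
    for k in classes:
        Xk = X_train[y_train == k]
        pi_k = Xk.shape[0] / X_train.shape[0]          # prior pi_k
        # product of the p one-dimensional densities f_k1(x_1)...f_kp(x_p),
        # assumed Gaussian with class-specific mean and standard deviation
        dens = np.prod(norm.pdf(x, loc=Xk.mean(axis=0), scale=Xk.std(axis=0)))
        numerators.append(pi_k * dens)
    numerators = np.array(numerators)
    # divide by the sum over classes (the denominator in Bayes' theorem)
    return dict(zip(classes, numerators / numerators.sum()))

# e.g. naive_bayes_posterior(X_train, y_train, x=np.array([1.0, 1.0]))
```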
Naive Bayes - Given Y, the predictors X are conditionally independent.
LDA - Assumes the covariance matrix is the same across all classes.
QDA - Does not assume a constant covariance matrix across classes; each class has its own.
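A short scikit-learn sketch that fits all three classifiers on the same simulated data, so the differing assumptions can be compared side by side; the data and the printed training accuracies are illustrative only.

```python
import numpy as np
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                           QuadraticDiscriminantAnalysis)
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(2)
X = np.vstack([rng.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], 200),
               rng.multivariate_normal([2, 2], [[1, -0.5], [-0.5, 2]], 200)])
y = np.array([0] * 200 + [1] * 200)

models = {
    "LDA (shared covariance)": LinearDiscriminantAnalysis(),
    "QDA (per-class covariance)": QuadraticDiscriminantAnalysis(),
    "Naive Bayes (independent predictors given Y)": GaussianNB(),
}
for name, model in models.items():
    print(name, model.fit(X, y).score(X, y))   # training accuracy for comparison
```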