[Coursera Stanford Machine Learning (week 7)] SVM: Support Vector Machine

sunnyshiny 2023. 2. 17. 15:55



Large Margin Classification

In logistic regression, we want $\theta^Tx \gg 0$ when $y=1$ and $\theta^Tx \ll 0$ when $y=0$. The cost for a single example is

$$ -\left(y\log h_{\theta}(x)+(1-y)\log(1-h_{\theta}(x))\right)\\=-\left(y\log\frac{1}{1+e^{-\theta^Tx}}+(1-y)\log\left(1-\frac{1}{1+e^{-\theta^Tx}}\right)\right) $$

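When $y=1$, the cost is $-\log\frac{1}{1+e^{-\theta^Tx}}$. The SVM replaces this term with a piecewise-linear approximation $cost_1(\theta^Tx)$ that is exactly 0 when $\theta^Tx \ge 1$.

When $y=0$, the cost is $-\log(1-\frac{1}{1+e^{-\theta^Tx}})$. The SVM replaces this term with a piecewise-linear approximation $cost_0(\theta^Tx)$ that is exactly 0 when $\theta^Tx \le -1$. Writing the cost with these two terms gives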

$$ y\ cost_1(\theta^Tx)+(1-y)\ cost_0(\theta^Tx) \Rightarrow \\ \min_{\theta}\frac{1}{m}\sum_{i=1}^m\left[y^{(i)}cost_1(\theta^Tx^{(i)})+(1-y^{(i)})cost_0(\theta^Tx^{(i)})\right]+\frac{\lambda}{2m}\sum_{j=1}^n \theta_j^2 $$
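As a rough illustration of the two cost terms, the sketch below compares the logistic terms with piecewise-linear surrogates that are exactly zero once $\theta^Tx \ge 1$ (for $y=1$) or $\theta^Tx \le -1$ (for $y=0$). The function names and the slope of 1 on the linear part are assumptions made for this example; the notes do not fix a slope.

```python
import math


def logistic_cost_1(z):
    """Logistic cost of a y=1 example: -log(1 / (1 + e^{-z}))."""
    return math.log1p(math.exp(-z))


def logistic_cost_0(z):
    """Logistic cost of a y=0 example: -log(1 - 1 / (1 + e^{-z}))."""
    return math.log1p(math.exp(z))


def cost_1(z):
    """Assumed piecewise-linear surrogate for y=1: zero when z >= 1."""
    return max(0.0, 1.0 - z)


def cost_0(z):
    """Assumed piecewise-linear surrogate for y=0: zero when z <= -1."""
    return max(0.0, 1.0 + z)


if __name__ == "__main__":
    # Print both versions side by side for a few values of z = theta^T x.
    for z in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(z, logistic_cost_1(z), cost_1(z), logistic_cost_0(z), cost_0(z))
```

The logistic terms only approach zero as $\theta^Tx$ grows, while the surrogates reach zero at $\pm 1$ and stay there.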

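Since $\frac{1}{m}$ is a positive constant, dropping it does not change the minimizing $\theta$. Likewise, dividing the whole objective by $\lambda$ and writing $C = \frac{1}{\lambda}$ moves the weight from the regularization term onto the cost term.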

Example: $\min_u\ (u-5)^2+1\ \rightarrow u=5$ and $\min_u\ 10(u-5)^2+10\ \rightarrow u=5$, so multiplying the objective by a positive constant leaves the minimizer unchanged.
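The same example can be checked numerically. This is only a sketch: `argmin_on_grid` is a hypothetical helper written for this note, and it does a brute-force search over an evenly spaced grid rather than any real optimization.

```python
def argmin_on_grid(f, lo, hi, steps):
    """Return the grid point in [lo, hi] where f is smallest."""
    best_u = lo
    best_val = f(lo)
    for k in range(1, steps + 1):
        u = lo + (hi - lo) * k / steps
        val = f(u)
        if val < best_val:
            best_u, best_val = u, val
    return best_u


# Original objective and the same objective multiplied by 10.
u_plain = argmin_on_grid(lambda u: (u - 5) ** 2 + 1, 0.0, 10.0, 1000)
u_scaled = argmin_on_grid(lambda u: 10 * (u - 5) ** 2 + 10, 0.0, 10.0, 1000)

print(u_plain, u_scaled)
```

Both searches land on the same $u$; only the minimum value of the objective changes, not where it is attained.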

$$ \min_{\theta}\ C \sum_{i=1}^m\left[y^{(i)} cost_1(\theta^Tx^{(i)})+(1-y^{(i)})cost_0(\theta^Tx^{(i)})\right]+\frac{1}{2}\sum_{j=1}^n \theta_j^2 $$

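When $y^{(i)}=1$ and $\theta^Tx^{(i)} \ge 1$, the SVM cost term is exactly zero, so the objective reduces to $\min_{\theta} \ C \cdot 0 +\frac{1}{2}\sum_{j=1}^n \theta_j^2$.
When $y^{(i)}=0$ and $\theta^Tx^{(i)} \le -1$, the cost term is also zero, and the objective again reduces to $\min_{\theta} \ C \cdot 0 +\frac{1}{2}\sum_{j=1}^n \theta_j^2$.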
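A minimal sketch of this objective in plain Python, assuming piecewise-linear cost terms with slope 1 and a leading 1 in each row of `X` for the intercept $\theta_0$ (which is left out of the regularization sum, since that sum starts at $j=1$). The function name `svm_objective` and the two-point data set are made up for illustration.

```python
def cost_1(z):
    """Assumed surrogate cost for y=1: zero when z >= 1."""
    return max(0.0, 1.0 - z)


def cost_0(z):
    """Assumed surrogate cost for y=0: zero when z <= -1."""
    return max(0.0, 1.0 + z)


def svm_objective(theta, X, y, C):
    """C * sum of per-example costs + 0.5 * sum_{j>=1} theta_j^2."""
    data_term = 0.0
    for x_i, y_i in zip(X, y):
        z = sum(t * v for t, v in zip(theta, x_i))  # theta^T x
        data_term += y_i * cost_1(z) + (1 - y_i) * cost_0(z)
    reg_term = 0.5 * sum(t * t for t in theta[1:])  # skip theta_0
    return C * data_term + reg_term


# Toy data: one positive and one negative example (first column is the 1 for theta_0).
X = [[1.0, 1.0, 0.0], [1.0, -1.0, 0.0]]
y = [1, 0]

# theta^T x is +2 and -2 here, so both cost terms vanish for any C.
print(svm_objective([0.0, 2.0, 0.0], X, y, C=1.0))
print(svm_objective([0.0, 2.0, 0.0], X, y, C=100.0))

# theta^T x is +0.5 and -0.5 here, so both examples are inside the margin.
print(svm_objective([0.0, 0.5, 0.0], X, y, C=10.0))
```

With the first `theta`, every example satisfies its margin condition, so the value is just $\frac{1}{2}\sum_j \theta_j^2$ and does not depend on `C`.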

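For a linearly separable problem, the SVM chooses a decision boundary with a large margin. However, when C is very large the SVM is strongly affected by outliers; the influence of outliers can be reduced by adjusting C, in particular by making it smaller.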
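To get a feel for how C trades margin against outliers, the sketch below evaluates the objective at two hand-picked candidate boundaries on a made-up 1D data set with one positive outlier sitting near the negatives. It does not run an optimizer; it only shows which of the two candidates the objective prefers. The data, the candidate parameters, and the slope-1 cost terms are all assumptions for this example.

```python
def cost_1(z):
    """Assumed surrogate cost for y=1: zero when z >= 1."""
    return max(0.0, 1.0 - z)


def cost_0(z):
    """Assumed surrogate cost for y=0: zero when z <= -1."""
    return max(0.0, 1.0 + z)


def svm_objective_1d(theta0, theta1, xs, ys, C):
    """SVM objective for a single feature; theta0 is not regularized."""
    data_term = 0.0
    for x, y in zip(xs, ys):
        z = theta0 + theta1 * x
        data_term += y * cost_1(z) + (1 - y) * cost_0(z)
    return C * data_term + 0.5 * theta1 ** 2


# Positives at 3 and 4, negatives at -3 and -4, plus a positive outlier at -2.
xs = [3.0, 4.0, -2.0, -3.0, -4.0]
ys = [1, 1, 1, 0, 0]

# "wide": boundary at x = 0, ignores the outlier and keeps theta1 small.
# "narrow": boundary at x = -2.5, fits the outlier but needs a large theta1.
candidates = {"wide": (0.0, 0.5), "narrow": (5.0, 2.0)}


def preferred(C):
    """Name of the candidate with the lower objective value for this C."""
    return min(
        candidates,
        key=lambda name: svm_objective_1d(*candidates[name], xs, ys, C),
    )


for C in (100.0, 0.1):
    print(C, preferred(C))
```

With a large C the cost of violating the outlier's margin dominates, so the narrow boundary that accommodates the outlier wins; with a small C the regularization term dominates and the wide boundary is preferred.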



Sunny Finance & Tech Blog