[Coursera Stanford Machine Learning (week 2)] Computing Parameters Analytically

sunnyshiny 2023. 2. 3. 16:56

These are my lecture notes from Professor Andrew Ng's Machine Learning course on Coursera.

Normal Equation

Gradient descent finds the solution by iteratively converging toward the global minimum, whereas the normal equation solves for $\theta$ directly.

$\theta = (X^TX)^{-1} X^{T} y$
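As a minimal sketch of the formula above, assuming NumPy and a small made-up dataset (the function name `normal_equation` and the toy data are illustrative, not from the course), $\theta$ can be computed in one step. The sketch solves the linear system $(X^TX)\theta = X^Ty$ instead of forming the inverse explicitly, which gives the same result when $X^TX$ is invertible.

```python
import numpy as np

def normal_equation(X, y):
    """Return theta satisfying (X^T X) theta = X^T y.

    Equivalent to theta = (X^T X)^{-1} X^T y when X^T X is invertible.
    """
    return np.linalg.solve(X.T @ X, X.T @ y)

# Toy data generated from y = 1 + 2*x (hypothetical, for illustration only).
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])  # prepend the intercept column x_0 = 1
y = 1.0 + 2.0 * x

theta = normal_equation(X, y)
print(theta)  # approximately [1. 2.]
```

No learning rate and no iteration appear anywhere in this computation.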

Gradient Descent

Need to choose $\alpha$
Needs many iterations
$O(kn^2)$
Works well even when $n$ (the number of features) is large

Normal Equation

No need to choose $\alpha$
Don’t need to iterate
$O(n^3)$, need to compute $(X^TX)^{-1}$
Slow if $n$ is very large (roughly $n \gtrsim 10000$)

Gradient descent requires choosing a learning rate and iterating until it converges to the optimum. The normal equation needs neither a learning rate nor iteration. However, it has to compute $(X^TX)^{-1}$, which costs $O(n^3)$, so the computation slows down as the number of features grows. When there are many features, gradient descent is therefore the better choice.
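To make the comparison concrete, here is a small sketch, assuming NumPy and the same kind of toy data as before. The function `gradient_descent` and the values `alpha=0.1` and `num_iters=5000` are illustrative choices, not settings from the course. Both approaches should arrive at essentially the same $\theta$; gradient descent just needs a learning rate and many passes to get there.

```python
import numpy as np

def gradient_descent(X, y, alpha=0.1, num_iters=5000):
    """Batch gradient descent for linear regression (squared-error cost)."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(num_iters):
        grad = X.T @ (X @ theta - y) / m  # gradient of the cost J(theta)
        theta -= alpha * grad             # step size controlled by alpha
    return theta

# Toy data generated from y = 1 + 2*x (hypothetical, for illustration only).
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])  # intercept column plus one feature
y = 1.0 + 2.0 * x

theta_gd = gradient_descent(X, y)              # iterative, needs alpha
theta_ne = np.linalg.solve(X.T @ X, X.T @ y)   # direct, no alpha, no loop

print(theta_gd, theta_ne)
```

On a problem this small the direct solve is clearly simpler. The trade-off only reverses when $n$ is large enough that the $O(n^3)$ solve becomes the bottleneck.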

What if $X^TX$ is non-invertible?

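Redundant features (linearly dependent) → remove the duplicated features
Too many features → number of training examples $\le$ number of features
- Reduce the number of features or apply regularization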
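The non-invertible case can be sketched as follows, assuming NumPy and a made-up design matrix whose third column is exactly twice the second. Two common workarounds are shown: the pseudoinverse (`np.linalg.pinv`) and a regularized normal equation $\theta = (X^TX + \lambda L)^{-1}X^Ty$, where $L$ is the identity matrix with its top-left entry set to 0 so the intercept is not penalized. The value `lam = 1e-3` is an arbitrary illustrative choice.

```python
import numpy as np

# Toy data generated from y = 1 + 2*x (hypothetical, for illustration only).
x = np.array([0.0, 1.0, 2.0, 3.0])
# The third column is 2 * the second column, so the features are linearly dependent.
X = np.column_stack([np.ones_like(x), x, 2.0 * x])
y = 1.0 + 2.0 * x

# X has 3 columns but rank 2, so X^T X (which has the same rank) is singular.
rank = np.linalg.matrix_rank(X)

# Workaround 1: the pseudoinverse returns the minimum-norm least-squares solution.
theta_pinv = np.linalg.pinv(X) @ y

# Workaround 2: regularization makes the matrix invertible again.
lam = 1e-3
L = np.eye(X.shape[1])
L[0, 0] = 0.0  # do not regularize the intercept term
theta_reg = np.linalg.solve(X.T @ X + lam * L, X.T @ y)

print(rank)
print(X @ theta_pinv)  # predictions still match y
print(X @ theta_reg)   # predictions are very close to y
```

Neither workaround recovers a unique "true" coefficient for the duplicated features. They only pick one of the many $\theta$ values that fit the data equally well, which is why removing the redundant feature is usually the cleaner fix.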

Reference
Machine Learning, Coursera, Andrew Ng


Sunny Finance & Tech Blog