Note: Lecture 3 is about the Linear Algebra review; check the end of the “week2 note” for more info.
Notes for the Coursera Machine Learning course by Andrew Ng.
We still use the previous example (the house price prediction problem). We add the following notation for convenience.
$n$ = number of features. $x^{(i)}$ = input (features) of the $i^{th}$ training example. $x^{(i)}_j$ = value of feature $j$ in the $i^{th}$ training example.
Because the hypothesis is defined as

$$h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n$$

which, with the convention $x_0 = 1$, is equal to

$$h_\theta(x) = \theta_0 x_0 + \theta_1 x_1 + \cdots + \theta_n x_n$$

we can then represent the hypothesis in matrix form:

$$h_\theta(x) = \theta^T x$$
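As a quick illustration of the matrix form (a minimal NumPy sketch; the feature values and parameters below are just example numbers), stacking the training examples as rows of a design matrix $X$ lets us predict for every example at once:

```python
import numpy as np

# Each row is one training example: (size in sqft, number of bedrooms).
X = np.array([[2104.0, 3.0],
              [1600.0, 3.0],
              [2400.0, 4.0]])

# Prepend the x0 = 1 column so that theta_0 acts as the intercept.
X = np.hstack([np.ones((X.shape[0], 1)), X])

theta = np.array([10.0, 0.1, 50.0])  # example parameters [theta_0, theta_1, theta_2]

# Vectorized hypothesis: X @ theta gives h_theta(x^(i)) for every example i.
predictions = X @ theta
print(predictions)
```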
Below is the proof I did to better understand vectorized gradient descent.
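For reference, here is a sketch of the vectorized form (assuming the usual course symbols: $m$ training examples, design matrix $X$ of shape $m \times (n+1)$, target vector $y$, learning rate $\alpha$, and squared-error cost $J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)^2$):

$$
\frac{\partial J}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\, x^{(i)}_j
\quad\Longrightarrow\quad
\nabla_\theta J(\theta) = \frac{1}{m} X^T (X\theta - y)
$$

so the simultaneous update of all parameters becomes

$$
\theta := \theta - \frac{\alpha}{m}\, X^T (X\theta - y)
$$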
The purpose of Feature Scaling is to speed up gradient descent.
- Idea: Make sure features are on a similar scale (e.g., see the figure below).
We can see from the figure above that gradient descent requires many more iterations to approach the minimum if the features are not on a similar scale.
This is a technique that normalizes the range of each feature (to optimize gradient descent).
For details, see the figure below (and the sketch after it):
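A minimal sketch of mean normalization in NumPy (assuming the rule $x_j := \frac{x_j - \mu_j}{s_j}$, with $\mu_j$ the mean and $s_j$ the standard deviation of feature $j$; using the range $\max - \min$ for $s_j$ also works):

```python
import numpy as np

def feature_scale(X):
    """Mean-normalize each feature column: (x - mean) / std."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    return (X - mu) / sigma, mu, sigma

# Example features: (size in sqft, number of bedrooms).
X = np.array([[2104.0, 3.0],
              [1600.0, 3.0],
              [2400.0, 4.0]])

X_scaled, mu, sigma = feature_scale(X)
print(X_scaled)  # every column now has mean ~0 and a comparable scale
```

Note that when predicting on a new example, the same $\mu$ and $s$ computed from the training set should be applied to it.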
The normal equation is a method to solve for $\theta$ analytically:

$$\theta = (X^T X)^{-1} X^T y$$

The proof is below.
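A short sketch of that derivation (assuming the same cost $J(\theta)$ as in the gradient descent section): set the gradient of $J$ to zero and solve for $\theta$.

$$
\nabla_\theta J(\theta) = \frac{1}{m} X^T (X\theta - y) = 0
\quad\Longrightarrow\quad
X^T X\, \theta = X^T y
\quad\Longrightarrow\quad
\theta = (X^T X)^{-1} X^T y
$$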
- In Octave/Matlab, we use pinv (pseudo-inverse) instead of inv, so it is rare that we can’t compute the inverse of $X^T X$ (see the sketch after this list).
- Pseudo-inverse: https://www.youtube.com/watch?v=pTUfUjIQjoE
- Generalized inverse (synonym of Moore–Penrose inverse): https://en.wikipedia.org/wiki/Generalized_inverse
- Reason for using pinv instead of inv (e.g., when the matrix is singular): https://stats.stackexchange.com/a/69459
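A minimal NumPy sketch of the normal equation using a pseudo-inverse (the data below are just example numbers; np.linalg.pinv plays the role of Octave’s pinv):

```python
import numpy as np

# Design matrix with the x0 = 1 intercept column, and example target prices.
X = np.array([[1.0, 2104.0, 3.0],
              [1.0, 1600.0, 3.0],
              [1.0, 2400.0, 4.0]])
y = np.array([400.0, 330.0, 369.0])

# theta = pinv(X'X) X'y -- pinv still returns a sensible answer when
# X'X is singular (e.g. redundant features), where inv would fail.
theta = np.linalg.pinv(X.T @ X) @ X.T @ y
print(theta)
```

Unlike gradient descent, this needs no feature scaling and no choice of learning rate, but it becomes expensive when the number of features is large.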