Getting Started with AI — Notes on Andrew Ng's "Machine Learning" Course, Week 3

3. Logistic Regression

This week is divided into three main parts: Classification and Representation, Logistic Regression Model, and Multiclass Classification.

3.1 Classification and Representation

3.1.1 Classification

y ∈ {0, 1}, where 0 is the 'Negative Class' and 1 the 'Positive Class'
[Figure 3_1]

Classification: hθ(x) can be > 1 or < 0

Logistic Regression: 0 ≤ hθ(x) ≤ 1

3.1.2 Hypothesis Representation

[Figure 3_2]

hθ(x) = estimated probability that y=1 on input x
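The hypothesis from the course is hθ(x) = g(θᵀx), where g(z) = 1/(1 + e⁻ᶻ) is the sigmoid function. A minimal NumPy sketch (the function names are my own):

```python
import numpy as np

def sigmoid(z):
    # Logistic (sigmoid) function g(z) = 1 / (1 + e^(-z)), always in (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def h(theta, x):
    # Hypothesis h_theta(x) = g(theta^T x): estimated P(y = 1 | x; theta).
    return sigmoid(np.dot(theta, x))
```

Because g maps every real number into (0, 1), the output can be read directly as a probability.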

3.1.3 Decision Boundary

[Figure 3_3]

[Figure 3_4]
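Since g(z) ≥ 0.5 exactly when z ≥ 0, we predict y = 1 whenever θᵀx ≥ 0; the set where θᵀx = 0 is the decision boundary. A small sketch using the course's example θ = (−3, 1, 1), which gives the boundary x₁ + x₂ = 3:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(theta, x):
    # h_theta(x) >= 0.5 exactly when theta^T x >= 0, so the sign of
    # theta^T x decides which side of the boundary x falls on.
    return 1 if np.dot(theta, x) >= 0 else 0

# Boundary x1 + x2 = 3 (x = [1, x1, x2] includes the intercept term):
theta = np.array([-3.0, 1.0, 1.0])
print(predict(theta, np.array([1.0, 2.0, 2.0])))  # x1 + x2 = 4 >= 3, prints 1
print(predict(theta, np.array([1.0, 1.0, 1.0])))  # x1 + x2 = 2 <  3, prints 0
```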

3.2 Logistic Regression Model

3.2.1 Cost Function

Cost(hθ(x),y) = -log(hθ(x))    if y=1
                -log(1-hθ(x))  if y=0

3.2.2 Simplified Cost Function and Gradient Descent

Cost(hθ(x),y) = -ylog(hθ(x)) - (1-y)log(1-hθ(x))
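Averaging this cost over m training examples gives J(θ) = −(1/m) Σᵢ [y⁽ⁱ⁾ log hθ(x⁽ⁱ⁾) + (1 − y⁽ⁱ⁾) log(1 − hθ(x⁽ⁱ⁾))], which can be sketched in vectorized NumPy (names are my own; X holds one example per row):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    # J(theta) = -(1/m) * sum( y*log(h) + (1-y)*log(1-h) )
    m = len(y)
    h = sigmoid(X @ theta)
    return -np.sum(y * np.log(h) + (1 - y) * np.log(1 - h)) / m
```

At θ = 0 every prediction is 0.5, so J(θ) = −log(0.5) = log 2 ≈ 0.693 regardless of the data — a handy sanity check.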

[Figure 3_5]

[Figure 3_6]

[Figure 3_7]
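Gradient descent for this cost uses the update θⱼ := θⱼ − α·(1/m)·Σᵢ (hθ(x⁽ⁱ⁾) − y⁽ⁱ⁾)·xⱼ⁽ⁱ⁾ — the same form as for linear regression, but with the sigmoid hypothesis. A minimal sketch, assuming X already contains a leading column of ones for the intercept:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, y, alpha=0.1, iters=1000):
    # Repeat: theta_j := theta_j - alpha * (1/m) * sum((h - y) * x_j),
    # updating all theta_j simultaneously (vectorized below).
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        h = sigmoid(X @ theta)
        theta -= alpha * (X.T @ (h - y)) / m
    return theta
```

The learning rate α and iteration count here are illustrative; in practice they need tuning (or an optimizer such as the fminunc routine the course mentions).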

3.3 Multiclass Classification

[Figure 3_8]
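One-vs-all trains one binary logistic classifier per class — class c relabeled as "1", everything else as "0" — and predicts with whichever classifier is most confident. A sketch reusing the gradient descent update from 3.2.2 (function names are my own):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_one_vs_all(X, y, num_classes, alpha=0.1, iters=1000):
    # One binary logistic regression per class: relabel class c as "1"
    # and everything else as "0", then run gradient descent.
    m, n = X.shape
    thetas = np.zeros((num_classes, n))
    for c in range(num_classes):
        yc = (y == c).astype(float)
        for _ in range(iters):
            h = sigmoid(X @ thetas[c])
            thetas[c] -= alpha * (X.T @ (h - yc)) / m
    return thetas

def predict_multiclass(thetas, x):
    # Pick the class whose classifier is most confident: argmax_c h_c(x).
    return int(np.argmax(sigmoid(thetas @ x)))
```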

3.4 Solving the Problem of Overfitting

Overfitting: If we have too many features, the
learned hypothesis may fit the training set very well,
but fail to generalize to new examples.

Addressing overfitting:

Options:

1. Reduce number of features.
    --Manually select which features to keep.
    --Model selection algorithm.
2. Regularization.
    --Keep all the features, but reduce the magnitude
      /values of the parameters θj.
    --Works well when we have a lot of features, each
      of which contributes a bit to predicting y.
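Regularization (option 2) adds a penalty term (λ/2m)·Σⱼ₌₁ⁿ θⱼ² to the cost, shrinking the parameters while keeping all the features; by the course's convention the intercept θ₀ is not penalized. A minimal sketch of the regularized logistic cost:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def regularized_cost(theta, X, y, lam):
    # Unregularized logistic cost plus (lam / 2m) * sum_{j>=1} theta_j^2;
    # theta_0 (the intercept) is conventionally not penalized.
    m = len(y)
    h = sigmoid(X @ theta)
    unreg = -np.sum(y * np.log(h) + (1 - y) * np.log(1 - h)) / m
    penalty = lam / (2.0 * m) * np.sum(theta[1:] ** 2)
    return unreg + penalty
```

A larger λ pushes the θⱼ toward zero (risking underfitting); a smaller λ leaves the fit closer to the unregularized one (risking overfitting).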