Frances Hu's Blog

Born to be wild!



Intro to AI: Notes on Andrew Ng's "Machine Learning" Course, Week 4

Published on 2016-11-13

4. Neural Networks

4.1 Motivations

Neural Networks
Origins: algorithms that try to mimic the brain.
They were very widely used in the 80s and early 90s;
popularity diminished in the late 90s.
Recent resurgence: state-of-the-art technique for many applications.

4.2 Neural Networks

4.2.1 Model Representation I

Sigmoid (logistic) activation function.
[Figure 4_1]

[Figure 4_2]

4.2.2 Model Representation II

input layer
hidden layer
output layer

[Figure 4_3]
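
For concreteness, a minimal Octave sketch of forward propagation through one hidden layer (the weight matrices Theta1 and Theta2 and the layer sizes are illustrative assumptions, not values from the course):

% x: one training example as a column vector (without the bias unit)
% Theta1: hidden_units x (n+1), Theta2: 1 x (hidden_units+1) -- assumed shapes
a1 = [1; x];                  % input layer activations plus bias unit
z2 = Theta1 * a1;
a2 = [1; sigmoid(z2)];        % hidden layer activations plus bias unit
z3 = Theta2 * a2;
h  = sigmoid(z3);             % hypothesis output hθ(x)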

4.3 Multiple output units: One-vs-all

[Figure 4_4]
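
With K output classes the output layer has K units, so h is a K x 1 vector; a minimal sketch of the one-vs-all prediction step (reusing the forward-propagation sketch above, with Theta2 now assumed to have K rows):

a1 = [1; x];
a2 = [1; sigmoid(Theta1 * a1)];
h  = sigmoid(Theta2 * a2);        % K x 1 vector, one activation per class
[~, predicted_class] = max(h);    % predict the class with the largest output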

Machine Learning Programming Assignment 2: Logistic Regression

Published on 2016-11-07

ex2.m
=============plotting================

data = load('ex2data1.txt');

X = data(:,[1,2]);

y = data(:,3);

plotData(X,y);

plotData.m

function plotData(X,y)
figure;hold on;

pos = find(y==1);
neg = find(y==0);
plot(X(pos,1),X(pos,2),'k+','LineWidth',2,...
    'MarkerSize',7);
plot(X(neg,1),X(neg,2),'ko','MarkerFaceColor','y',...
    'MarkerSize',7);

hold off;
end

=============compute cost and gradient===========

[m,n] = size(X);

X = [ones(m,1) X];

initial_theta = zeros(n+1,1);

[cost,grad] = costFunction(initial_theta,X,y);

costFunction.m

function [J,grad] = costFunction(theta,X,y)
m = length(y);
J = 0;
grad = zeros(size(theta));

J = (-1)/m * (log(sigmoid(X*theta))'*y + ...
            log(1-sigmoid(X*theta))'*(1-y));
for i = 1: size(X,2)
  grad(i) = 1/m * sum((sigmoid(X*theta)-y) .* X(:,i));
end
end
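
costFunction.m calls a sigmoid helper that is used throughout the assignment but not listed above; a minimal sketch of sigmoid.m:

function g = sigmoid(z)
% Compute the sigmoid of each element of z (works on scalars, vectors, and matrices).
g = 1 ./ (1 + exp(-z));
end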

plotDecisionBoundary(theta,X,y);

=============predict and accuracies===============

prob = sigmoid([1 45 85] * theta);

p = predict(theta,X);

predict.m

function p = predict(theta,X)

m = size(X,1);
p = zeros(m,1);

for i = 1:m
    if sigmoid(X(i,:) * theta) >= 0.5
        p(i) = 1;
    else
        p(i) = 0;
    end
end
end
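
To check the predictions against the labels, the training accuracy can be reported with a one-liner (a usage sketch; the exact wording of the printout in ex2.m may differ):

fprintf('Train Accuracy: %f\n', mean(double(p == y)) * 100);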

ex2_reg.m

clear;

data = load('ex2data2.txt');

X = data(:,[1,2]);

y = data(:,3);

plotData(X,y);

=====================regularized Logistic Regression======

X = mapFeature(X(:,1),X(:,2));
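
mapFeature is provided with the assignment and expands the two input features into polynomial terms up to degree 6, plus a leading bias column; a sketch of such an expansion:

function out = mapFeature(X1, X2)
% Map two feature columns to the polynomial terms X1^(i-j) .* X2^j for all
% degrees up to 6, prepending a column of ones as the bias term.
degree = 6;
out = ones(size(X1(:,1)));
for i = 1:degree
    for j = 0:i
        out(:, end+1) = (X1 .^ (i - j)) .* (X2 .^ j);
    end
end
end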

initial_theta = zeros(size(X,2),1);

lambda = 1;

[cost,grad] = costFunctionReg(initial_theta,X,y,lambda);

costFunctionReg.m

function [J,grad] = costFunctionReg(theta,X,y,lambda)
m = length(y);
J = 0;
grad = zeros(size(theta));

temp = theta(2:size(theta,1),:) .^2;
value = sum(temp);
J = (-1)/m * (log(sigmoid(X*theta))'*y + ...
            log(1-sigmoid(X*theta))'*(1-y)) ...
        + lambda/(2*m) * value;
grad(1) = 1/m*sum((sigmoid(X*theta) - y) .* X(:,1));

for i = 2 : size(X,2)
    grad(i) = 1/m * sum((sigmoid(X*theta) - y) .* X(:,i)) ...
              + lambda/m * theta(i);
end
end

Intro to AI: Notes on Andrew Ng's "Machine Learning" Course, Week 3

Published on 2016-11-01

3. Logistic Regression

This week has three parts: Classification and Representation, Logistic Regression Model, and Multiclass Classification.

3.1 Classification and Representation

3.1.1 Classification

y ∈ {0,1}   0: 'Negative Class'   1: 'Positive Class'

[Figure 3_1]

Classification with linear regression: hθ(x) can be > 1 or < 0.

Logistic regression: 0 <= hθ(x) <= 1.

3.1.2 Hypothesis Representation

[Figure 3_2]

hθ(x) = estimated probability that y=1 on input x
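
Written out, the hypothesis applies the sigmoid to a linear combination of the inputs:

hθ(x) = g(θᵀx),  where g(z) = 1/(1+e^(-z)),  so 0 <= hθ(x) <= 1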

3.1.3 Decision Boundary

[Figure 3_3]

[Figure 3_4]

3.2 Logistic Regression Model

3.2.1 Cost Function

Cost(hθ(x),y) = -log(hθ(x))    if y=1
                -log(1-hθ(x))  if y=0

3.2.2 Simplified Cost Function and Gradient Descent

Cost(hθ(x),y) = -ylog(hθ(x)) - (1-y)log(1-hθ(x))
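
Summing the cost over all m training examples, and applying gradient descent, gives:

J(θ) = -(1/m) * Σ_{i=1..m} [ y(i)*log(hθ(x(i))) + (1-y(i))*log(1-hθ(x(i))) ]

θj := θj - (α/m) * Σ_{i=1..m} (hθ(x(i)) - y(i)) * xj(i)   (update all θj simultaneously)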

[Figure 3_5]

[Figure 3_6]

[Figure 3_7]

3.3 Multiclass Classification

[Figure 3_8]

3.4 Solving the Problem of Overfitting

Overfitting: if we have too many features, the learned
hypothesis may fit the training set very well
but fail to generalize to new examples.

Addressing overfitting:

Options:

1. Reduce the number of features.
    -- Manually select which features to keep.
    -- Model selection algorithm.
2. Regularization (see the regularized cost below).
    -- Keep all the features, but reduce the magnitude
       /values of the parameters θj.
    -- Works well when we have a lot of features, each of
       which contributes a bit to predicting y.
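
With regularization the cost adds a penalty on the parameters (excluding θ0), which is what costFunctionReg.m in programming assignment 2 computes:

J(θ) = -(1/m) * Σ_{i=1..m} [ y(i)*log(hθ(x(i))) + (1-y(i))*log(1-hθ(x(i))) ] + (λ/(2m)) * Σ_{j=1..n} θj²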

Machine Learning Programming Assignment 1: Linear Regression

Published on 2016-10-31

Assignment source code

warmUpExercise.m ____basic function

fprintf('Running warmUpExercise ...\n');

fprintf('5*5 Identity Matrix: \n');

warmUpExercise()

function A = warmUpExercise()

A = [];
A = eye(5);

end

plotData.m ____Plotting

fprintf('Plotting Data ...\n')

data = load('ex1data1.txt');

X = data(:,1); y = data(:,2);

m = length(y); %num of training examples

plotData(X,y)

function plotData(x,y)

figure;
plot(x,y,'rx','MarkerSize',10);
ylabel('Profit in $10,000s');
xlabel('Population of City in 10,000s');

end

gradientDescent.m ____Gradient Descent

fprintf('Running Gradient Descent ...\n')

X = [ones(m,1), data(:,1)];

theta = zeros(2,1);

iterations = 1500;

alpha = 0.01;

computeCost(X,y,theta);

computeCost.m ____compute initial cost

function J = computeCost(X,y,theta)

m = length(y);
J = 0;
cost = 0;
for i = 1 : m
    cost += (theta(1,1) * X(i,1) + theta(2,1) * X(i,2) ...
             - y(i))^2;
end

J = 1/(2*m) * cost;

end
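
The same cost can be written as one vectorized line (assuming X already contains the bias column, as set up above):

J = 1/(2*m) * sum((X*theta - y) .^ 2);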

theta = gradientDescent(X,y,theta,alpha,iterations);

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)

m = length(y);
J_history = zeros(num_iters,1);

for iter = 1 : num_iters
    cost_theta1 = 0;
    cost_theta2 = 0;
    for i = 1 : m
        cost_theta1 += (theta(1,1) * X(i,1) + ...
            theta(2,1) * X(i,2) - y(i)) * X(i,1);
        cost_theta2 += (theta(1,1) * X(i,1) + ...
            theta(2,1) * X(i,2) - y(i)) * X(i,2);
    end
    new_theta1 = theta(1,1) - alpha*cost_theta1 * 1/m;
    new_theta2 = theta(2,1) - alpha*cost_theta2 * 1/m;
    theta(1,1) = new_theta1;
    theta(2,1) = new_theta2;

    J_history(iter) = computeCost(X,y,theta);
end
end
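
The two per-parameter updates inside the loop can also be written as a single vectorized update, which generalizes to any number of features:

theta = theta - (alpha/m) * X' * (X*theta - y);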

Intro to AI: Notes on Andrew Ng's "Machine Learning" Course, Week 2

Published on 2016-10-24

Course overview

The course mainly covers machine learning, data mining, and statistical pattern recognition. Topics include:
i) supervised learning (parametric and non-parametric algorithms, support vector machines, kernels, neural networks);
ii) unsupervised learning (clustering, dimensionality reduction, recommender systems, deep learning);
iii) machine learning in practice (bias/variance theory, innovation in machine learning and AI).

Course notes

Week 2

1. Environment setup.

2. Multivariate Linear Regression.

[Figure 9]

[Figure 10]

Feature Scaling:
Make sure features are on a similar scale.

Why use feature scaling?
It speeds up gradient descent by making it require fewer
iterations to get to a good solution.


(Get every feature into approximately a -1 <= xi <= 1 range.)
(Mean normalization: replace xi with xi - μi to make
 features have approximately zero mean.)
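
A minimal Octave sketch of mean normalization combined with scaling by the standard deviation (the function name and return values here are illustrative):

function [X_norm, mu, sigma] = featureNormalize(X)
% Scale every feature (column of X) to roughly zero mean and unit standard deviation.
mu = mean(X);
sigma = std(X);
X_norm = (X - mu) ./ sigma;   % relies on Octave's implicit broadcasting
end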

How to make sure gradient descent works correctly?

[Figure 11]

Making sure gradient descent is working correctly:
if J(θ) keeps increasing as the number of iterations grows, use a smaller α.
If α is too small: slow convergence.
If α is too large: J(θ) may not decrease on
            every iteration, and may not converge.

[Figure 12]

3. Computing Parameters Analytically

Normal equation

[Figure 13]

Octave: pinv(X' * X) * X' * y   (X' is Xᵀ, the transpose of X)
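
As a runnable one-liner (X including the bias column of ones, y the label vector):

theta = pinv(X' * X) * X' * y;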

When should we use gradient descent, and when the normal equation?

m training examples, n features.

Gradient Descent
* Need to choose α.
* Need many iterations.
* Works well even when n is large.

Normal Equation
* No need to choose α.
* Don't need to iterate.
* Need to compute (XᵀX)⁻¹, which is O(n³).
* Slow if n is very large.

What if XᵀX is non-invertible?

Octave has two functions for computing a matrix inverse: pinv and inv.
With pinv you always get a usable result, whether or not the matrix is invertible.

* Redundant features (linearly dependent).
* Too many features (e.g. m <= n):
    delete some features, or use regularization.

Intro to AI: Notes on Andrew Ng's "Machine Learning" Course, Week 1

Published on 2016-10-18

Course overview

The course mainly covers machine learning, data mining, and statistical pattern recognition. Topics include:
i) supervised learning (parametric and non-parametric algorithms, support vector machines, kernels, neural networks);
ii) unsupervised learning (clustering, dimensionality reduction, recommender systems, deep learning);
iii) machine learning in practice (bias/variance theory, innovation in machine learning and AI).

Course notes

Week 1

1. On October 18 I studied the Introduction chapter. It mainly explains what Machine Learning is and why it matters, and then introduces supervised and unsupervised learning: supervised learning is presented mainly through regression and classification, unsupervised learning mainly through clustering. (It took me three reviews to pass; my grasp of supervised vs. unsupervised learning is still shallow.)

Machine Learning is the field of study that gives computers 
the ability to learn without being explicitly programmed.

2. On October 19 I studied the Model and Cost Function chapter. It has four sections: Model Representation, Cost Function, Cost Function Intuition I, and Cost Function Intuition II.

hypothesis structure

Cost Function: squared error function

[Figure 2]

Our goal is to make the value of the Cost Function as small as possible.

[Figure 3]

3. On October 19 I studied the Parameter Learning chapter. It has three sections: Gradient Descent, Gradient Descent Intuition, and Gradient Descent for Linear Regression.

Therefore we use the gradient descent method.

[Figure 4]

[Figure 5]

Note that temp0 and temp1 above are updated simultaneously.
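
A minimal Octave sketch of that simultaneous update for two parameters (alpha and the partial derivatives d_theta0 and d_theta1 are assumed to have been computed already):

temp0 = theta0 - alpha * d_theta0;    % new value for θ0, using the old θ0 and θ1
temp1 = theta1 - alpha * d_theta1;    % new value for θ1, using the old θ0 and θ1
theta0 = temp0;                       % only now overwrite the parameters
theta1 = temp1;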

The choice of the parameter α is also critical: it affects not only the efficiency of the algorithm but also whether a local optimum can be found at all.

[Figure 6]

As we approach a local minimum, gradient descent will automatically
take smaller steps. So, there is no need to decrease α over time.

The final algorithm is as follows:

[Figure 7]

[Figure 8]

4. On October 20 I studied the Linear Algebra Review, which revisits the relevant linear algebra: matrix operations, matrix inverse and transpose, and so on.

How to Set Up a Blog on a Mac: Hexo + GitHub

Published on 2016-09-05

Environment setup

Preparation

1. git: download the Mac version from the git-scm website

2. node.js: download and install it directly from the node.js website

3. Markdown editor: on the Mac, Mou or MacDown is recommended (when I downloaded Mou it did not yet support macOS Sierra)

4. Domain name: just buy one from Aliyun

Installation and initialization

Install Hexo from the command line on the Mac:

sudo npm install hexo -g
hexo init blog
cd blog
sudo npm install
hexo server

Configuration

Configure the site file _config.yml; the main changes are in the Site, URL, and deploy sections:

#Site
title: Frances Hu's Blog
subtitle: 技术迷,本命李宇春
author: Frances Hu
...
language: zh_CN

#URL
url: http://xiaozhazi.win

deploy:
    type: git
    repository: https://github.com/xiaozhazi/xiaozhazi.github.io.git
    branch: master

Deploying the site

In the blog directory, run the commands that generate the static pages and deploy them:

hexo generate
hexo deploy

If this fails with an error that git cannot be recognized, run the following command to install hexo-deployer-git:

sudo npm install hexo-deployer-git --save

Binding a custom domain

Create a CNAME file in the /blog/themes/landscape/source directory and write your own domain, xiaozhazi.win, into it.

Redeploy:

hexo clean
hexo d -g

Installing a theme

Run the following command in the blog directory:

git clone https://github.com/iissnan/hexo-theme-next themes/next

Then change the theme name in _config.yml in the blog directory to next.

You can also download a theme you like from the themes page on the Hexo website.

Hello World

Published on 2016-09-04

Welcome to Hexo! This is your very first post. Check documentation for more info. If you get any problems when using Hexo, you can find the answer in troubleshooting or you can ask me on GitHub.

Quick Start

Create a new post

$ hexo new "My New Post"

More info: Writing

Run server

$ hexo server

More info: Server

Generate static files

$ hexo generate

More info: Generating

Deploy to remote sites

$ hexo deploy

More info: Deployment
