Introduction to AI: Study Notes on Andrew Ng's "Machine Learning" Course, Week 9

Published on 2016-12-16

9、Anomaly detection

9.1、Density Estimation

9.1.1、Problem motivation

Density estimation is used to decide whether a test example is anomalous.

Anomaly detection examples:

Fraud detection

$x^{(i)}$ = features of user i's activities.
Model p(x) from data.
Identify unusual users by checking which have p(x) < ε.

Manufacturing

Monitoring computers in a data center.

$x^{(i)}$ = features of machine i,
e.g. memory use, number of disk accesses/sec, CPU load...

9.1.2、Gaussian distribution
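
For reference, the standard density of a Gaussian (normal) distribution with mean μ and variance σ², which the anomaly detection algorithm in 9.1.3 relies on, can be written as:

```latex
x \sim \mathcal{N}(\mu, \sigma^2), \qquad
p(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma}
    \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
```

A larger σ gives a wider, flatter curve, and a smaller σ gives a narrower, taller one; the area under the curve is always 1.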

9.1.3、Anomaly detection algorithm

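Choose features $x_j$ that might be indicative of anomalous examples.

Fit parameters $\mu_1, \dots, \mu_n, \sigma_1^2, \dots, \sigma_n^2$:

$\mu_j = \frac{1}{m} \sum_{i=1}^{m} x_j^{(i)}$
$\sigma_j^2 = \frac{1}{m} \sum_{i=1}^{m} \left(x_j^{(i)} - \mu_j\right)^2$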

Given a new example x, compute p(x); flag it as an anomaly if p(x) < ε.
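
The three steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: it assumes the usual form of the model in which p(x) is the product of independent per-feature Gaussian densities, and the function names (`fit_gaussian`, `density`, `is_anomaly`), the toy data, and the threshold value are all made up for the example.

```python
import math

def fit_gaussian(X):
    """Estimate the per-feature mean and variance from training rows X (m x n)."""
    m = len(X)
    n = len(X[0])
    mu = [sum(row[j] for row in X) / m for j in range(n)]
    sigma2 = [sum((row[j] - mu[j]) ** 2 for row in X) / m for j in range(n)]
    return mu, sigma2

def density(x, mu, sigma2):
    """p(x), assumed here to be a product of univariate Gaussian densities."""
    p = 1.0
    for xj, mj, s2 in zip(x, mu, sigma2):
        p *= math.exp(-((xj - mj) ** 2) / (2 * s2)) / math.sqrt(2 * math.pi * s2)
    return p

def is_anomaly(x, mu, sigma2, epsilon):
    """Flag x as anomalous when its density falls below the threshold epsilon."""
    return density(x, mu, sigma2) < epsilon

# Toy training data (invented): two features per example.
X_train = [[1.0, 10.0], [1.2, 9.5], [0.8, 10.5], [1.0, 10.0]]
mu, sigma2 = fit_gaussian(X_train)

epsilon = 0.01  # hypothetical threshold; in practice chosen on a CV set
print(is_anomaly([1.0, 10.0], mu, sigma2, epsilon))  # point near the mean
print(is_anomaly([3.0, 2.0], mu, sigma2, epsilon))   # point far from the data
```

For high-dimensional inputs the product of many small densities can underflow, so summing log-densities instead is a common variation.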

9.1.4、Developing and evaluating an anomaly detection system

The importance of real-number evaluation

For example, with 10000 good engines and 20 flawed engines, we can split the data as follows:

Training set: 6000 good engines
CV: 2000 good engines, 10 anomalous
Test: 2000 good engines, 10 anomalous

Algorithm evaluation

The algorithm can be evaluated with the F1 score, and the cross-validation (CV) set can also be used to choose the parameter ε.
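
One way to pick ε is to scan candidate thresholds and keep the one with the best F1 score on the CV set. The sketch below assumes the densities p(x) for the CV examples have already been computed; the helper names (`f1_score`, `select_epsilon`), the number of scan steps, and the sample values are hypothetical.

```python
def f1_score(y_true, y_pred):
    """F1 = 2PR / (P + R), with label 1 meaning 'anomalous'."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def select_epsilon(p_cv, y_cv, steps=1000):
    """Scan thresholds between min and max density; keep the best F1."""
    lo, hi = min(p_cv), max(p_cv)
    best_eps, best_f1 = lo, 0.0
    for k in range(1, steps + 1):
        eps = lo + (hi - lo) * k / steps
        y_pred = [1 if p < eps else 0 for p in p_cv]
        score = f1_score(y_cv, y_pred)
        if score > best_f1:
            best_eps, best_f1 = eps, score
    return best_eps, best_f1

# Invented CV densities and labels (1 = anomalous).
p_cv = [0.30, 0.25, 0.28, 0.001, 0.27, 0.002]
y_cv = [0, 0, 0, 1, 0, 1]
best_eps, best_f1 = select_epsilon(p_cv, y_cv)
print(best_eps, best_f1)
```

F1 is preferred over plain accuracy here because the classes are heavily skewed: always predicting "normal" would score high accuracy while detecting nothing.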

9.1.5、Anomaly detection vs. supervised learning

Anomaly detection:

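Very small number of positive examples.
Large number of negative examples.
Many different types of anomalies, so it is hard to learn what anomalies look like from the positive examples alone.
Future anomalies may look nothing like the anomalous examples seen so far.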

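Supervised learning:

Large number of positive and negative examples.
There are enough positive examples to learn what positives look like, and future positive examples are likely to be very similar to those in the training set.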

9.1.6、Multivariate Gaussian distribution

The shape of the multivariate Gaussian distribution is adjusted through the mean vector μ and the covariance matrix Σ.
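
For reference, the standard multivariate Gaussian density for $x \in \mathbb{R}^n$, with $\mu \in \mathbb{R}^n$ and $\Sigma \in \mathbb{R}^{n \times n}$, together with the usual parameter estimates from m training examples, can be written as:

```latex
p(x; \mu, \Sigma) = \frac{1}{(2\pi)^{n/2}\,|\Sigma|^{1/2}}
    \exp\!\left(-\frac{1}{2}(x-\mu)^{T}\,\Sigma^{-1}\,(x-\mu)\right)

\mu = \frac{1}{m}\sum_{i=1}^{m} x^{(i)}, \qquad
\Sigma = \frac{1}{m}\sum_{i=1}^{m}\left(x^{(i)}-\mu\right)\left(x^{(i)}-\mu\right)^{T}
```

When Σ is diagonal with entries $\sigma_1^2, \dots, \sigma_n^2$, this reduces to a product of n univariate Gaussian densities; off-diagonal entries let the model capture correlations between features.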

9.2、Recommender systems

9.2.1、Content-based recommendations

Problem formulation:

$r(i,j) = 1$ if user j has rated movie i
$y^{(i,j)}$ = rating by user j on movie i
$\theta^{(j)}$ = parameter vector for user j
$x^{(i)}$ = feature vector for movie i
For user j, movie i, predicted rating: $(\theta^{(j)})^{T} x^{(i)}$
$m^{(j)}$ = no. of movies rated by user j
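
To make the notation concrete, here is a small sketch of the predicted rating as an inner product, plus a squared-error measure over the movies a user has actually rated. The function names, the feature meanings (an intercept term followed by "romance" and "action" scores), and all numbers are invented for illustration.

```python
def predict_rating(theta_j, x_i):
    """Predicted rating for user j on movie i: the inner product theta_j^T x_i."""
    return sum(t * f for t, f in zip(theta_j, x_i))

def squared_error(theta_j, X, y_j, r_j):
    """Half the sum of squared errors over movies with r(i,j) = 1."""
    return 0.5 * sum(
        (predict_rating(theta_j, x_i) - y_ij) ** 2
        for x_i, y_ij, rated in zip(X, y_j, r_j)
        if rated == 1
    )

# Hypothetical movie features: [intercept, romance, action].
X = [
    [1.0, 0.9, 0.0],
    [1.0, 0.1, 1.0],
    [1.0, 0.8, 0.1],
]
theta_j = [0.0, 5.0, 0.0]   # a user who likes romance
y_j = [5.0, 0.0, None]      # the third movie is unrated
r_j = [1, 1, 0]

print(predict_rating(theta_j, X[2]))       # prediction for the unrated movie
print(squared_error(theta_j, X, y_j, r_j)) # fit on the rated movies
```

In a full learning setup, θ(j) would be fitted by minimizing this error (typically with a regularization term) rather than set by hand.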

9.2.2、Collaborative filtering

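9.2.3、Implementation tips

Mean normalization: compute the average rating of each movie, then subtract that average from all of the movie's ratings.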
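
A minimal sketch of mean normalization follows, assuming the average is taken per movie over rated entries only. The layout (`Y[i][j]` is the rating of movie i by user j, `R[i][j]` is 1 if that rating exists), the function name, and the data are assumptions made for the example.

```python
def mean_normalize(Y, R):
    """Subtract each movie's mean rating, computed over rated entries only."""
    means, Y_norm = [], []
    for y_row, r_row in zip(Y, R):
        rated = [y for y, r in zip(y_row, r_row) if r == 1]
        mu = sum(rated) / len(rated) if rated else 0.0
        means.append(mu)
        # Unrated entries are left at 0.0; they are ignored wherever r == 0.
        Y_norm.append([y - mu if r == 1 else 0.0 for y, r in zip(y_row, r_row)])
    return Y_norm, means

# Invented ratings: 2 movies x 3 users.
Y = [[5.0, 4.0, 0.0],
     [1.0, 0.0, 2.0]]
R = [[1, 1, 0],
     [1, 0, 1]]
Y_norm, means = mean_normalize(Y, R)
print(Y_norm)
print(means)
```

When predicting, the movie's mean is added back to the inner product, so a user who has rated nothing is predicted to give each movie its average rating.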