Machine Learning is a subfield of artificial intelligence (AI) or artificial general intelligence (AGI).
Introduction
Definition: the field of study that gives computers the ability to learn without being explicitly programmed (Arthur Samuel, 1959).
Categories of machine learning algorithms:
- Supervised Learning: the most widely used category. It takes inputs (x) together with labels (y) and learns the pattern that maps inputs to outputs.
  - Learns from data labeled with the "right answers" (the correct outputs are given).
  - Examples:
    - Spam filtering: email -> spam? (0/1)
    - Speech recognition: audio -> text transcripts
    - Machine translation: English -> Chinese
    - Online advertising: ad, user info -> click
    - Self-driving car: image, radar, lidar info -> position of other cars
    - Visual inspection: image of phone -> defect
  - Common algorithms:
    - Regression algorithms: predict a number, with infinitely many possible outputs. Example: fitting a curve to house price versus size in order to predict prices.
    - Classification algorithms: predict categories, with a small number of possible outputs. Example: breast cancer detection, where tumor size and patient age are used to predict whether a tumor is benign or malignant; the learning algorithm fits a boundary line that separates benign from malignant cases.
- Unsupervised Learning
  - Finds something interesting in unlabeled data.
  - The data comes only with inputs x, not output labels y.
  - The algorithm has to find structure in the data.
  - Common algorithms:
    - Clustering algorithms: group similar data points together, placing unlabeled data into different clusters (groups). Example: DNA microarray data.
    - Anomaly detection: find unusual data points, such as anomalous transactions in a financial system.
    - Dimensionality reduction: compress data using fewer numbers, turning a large data set into a smaller one while losing as little information as possible.
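- Reinforcement Learning
  - The agent interacts with the environment continually: each interaction yields the next action to take, the action changes the environment, and the cycle repeats step by step.
  - The input is a state and the output is an action; this process is called an episode.
  - Requires a large amount of data to train the model.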
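  - Application areas
    - etc.; the first two are the most commonly used
  - Proximal Policy Optimization
    - On-policy: the training data comes from the agent itself as it continually interacts with the environment.
    - Off-policy: the data is obtained through a proxy that does the interacting.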
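
To make the contrast between the first two categories in the list above concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not a reference implementation: the function names `fit_line` and `kmeans_1d`, the toy house data, and the starting centers are all hypothetical. The supervised half fits a straight line to labeled (x, y) pairs by least squares; the unsupervised half runs a simple one-dimensional k-means on unlabeled points.

```python
from statistics import mean


def fit_line(xs, ys):
    """Supervised regression: least-squares fit of y = w * x + b.

    The labels ys are the "right answers" given for each input in xs.
    """
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    w = covariance / variance
    b = y_bar - w * x_bar
    return w, b


def kmeans_1d(points, centers, iterations=10):
    """Unsupervised clustering: basic k-means on unlabeled 1-D points.

    Only inputs are given; the algorithm has to find the groups itself.
    """
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


if __name__ == "__main__":
    # Hypothetical labeled data: house size -> price.
    sizes = [50, 80, 100, 120]
    prices = [150, 240, 300, 360]
    w, b = fit_line(sizes, prices)
    print("predicted price for size 90:", w * 90 + b)

    # Hypothetical unlabeled data: no y values, only inputs.
    points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
    centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
    print("cluster centers:", centers)
    print("clusters:", clusters)
```

The only difference in what each function receives mirrors the definitions above: `fit_line` needs the labels `ys`, while `kmeans_1d` gets inputs only and must discover the grouping.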
Tools:
Case studies
Example:
- Collect data
- Analyze: Iterate many times to get good insights
- Suggest hypotheses/actions
- Deploy changes
- Re-analyze new data periodically
Job roles
Because AI is developing quickly, there are no settled definitions; the roles generally include:
- Software Engineer: implements the actual programs
- Machine Learning Engineer: trains deep learning algorithms or neural networks
- Machine Learning Researcher
- Applied ML Scientist
- Data Scientist: helps drive business decisions
- Data Engineer: stores data securely and makes it easy to access
- AI Product Manager