Automated Machine Learning Frameworks, Part 2: AutoML

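Auto ML (Automated Machine Learning) is a broad concept, and more than one piece of software goes by that name. The Auto-ML introduced in this post is not Google's cloud-based AutoML; it is likewise an open-source tool that runs offline. Its strengths are that it is simple and fast, and that its output is fairly informative. Out of the box it supports models from Keras, TensorFlow, XGBoost, LightGBM, CatBoost, and Sklearn, and it uses evolutionary grid search throughout for feature processing and model optimization.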
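To give a feel for what "evolutionary grid search" means, here is a minimal pure-Python sketch. It is not Auto-ML's actual implementation: the parameter grid, the `toy_score` function (a stand-in for a cross-validated model score), and all function names are assumptions made up for illustration. The idea is to keep a small population of parameter combinations drawn from the grid, keep the best half in each generation, and fill the rest with mutated crossovers of the survivors, instead of evaluating every cell of the grid.

```python
import random

# Hypothetical search space; names and values are illustrative only.
PARAM_GRID = {
    "learning_rate": [0.01, 0.05, 0.1, 0.3],
    "max_depth": [2, 4, 6, 8],
    "n_estimators": [50, 100, 200, 400],
}


def toy_score(params):
    """Stand-in for a cross-validated model score (higher is better)."""
    return -(
        abs(params["learning_rate"] - 0.1)
        + abs(params["max_depth"] - 6) / 10
        + abs(params["n_estimators"] - 200) / 1000
    )


def random_individual(rng):
    """Pick one value per parameter at random from the grid."""
    return {name: rng.choice(values) for name, values in PARAM_GRID.items()}


def crossover(a, b, rng):
    """Take each parameter from one of the two parents at random."""
    return {name: rng.choice([a[name], b[name]]) for name in PARAM_GRID}


def mutate(params, rng, rate=0.3):
    """With probability `rate`, resample a parameter from the grid."""
    child = dict(params)
    for name, values in PARAM_GRID.items():
        if rng.random() < rate:
            child[name] = rng.choice(values)
    return child


def evolutionary_search(score_fn, generations=10, population_size=8, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(population_size)]
    for _ in range(generations):
        # Keep the best half unchanged, so the best score never gets worse.
        ranked = sorted(population, key=score_fn, reverse=True)
        parents = ranked[: population_size // 2]
        # Refill the population with mutated crossovers of the survivors.
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(population_size - len(parents))
        ]
        population = parents + children
    return max(population, key=score_fn)


best = evolutionary_search(toy_score)
print(best, toy_score(best))
```

In a real setting, `toy_score` would be replaced by training and validating a model with the given parameters, which is the expensive step this kind of search tries to call as few times as possible.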

Installation

Auto-ML is installed as follows:

$ pip install auto-ml

To learn more about what auto-ml can do and how to use it, it is worth downloading the source code as well:

$ git clone https://github.com/ClimbsRocks/auto_ml

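Example

This example again uses the 1996 US presidential election data, with vote as the dependent variable. It takes only two values, 0 and 1, so a classifier is used: type_of_estimator='classifier'. For training, the type of each field is specified in a dictionary. The available types are: output (the dependent variable), categorical (categorical variables), date (date/time variables), nlp (text), and ignore (variables excluded from training).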

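from auto_ml import Predictor
import statsmodels.api as sm

# Load the 1996 American National Election Study data as a pandas DataFrame
data = sm.datasets.anes96.load_pandas().data

# Describe the columns: 'vote' is the target, the others are categorical
column_descriptions = {
    'vote': 'output',
    'TVnews': 'categorical',
    'educ': 'categorical',
    'income': 'categorical',
}

ml_predictor = Predictor(type_of_estimator='classifier',
                         column_descriptions=column_descriptions)
model = ml_predictor.train(data)
# Scored on the training data itself, so the result is optimistic
model.score(data, data.vote)
# Xie Yan's tech blog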
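When a dataset has many columns, writing the column description dictionary by hand gets tedious. The sketch below shows one way to build it from per-type lists. Note that `build_column_descriptions` is a hypothetical helper written for this post, not part of auto_ml; it only assumes the five type names listed above.

```python
def build_column_descriptions(output, categorical=(), date=(), nlp=(), ignore=()):
    """Build a column description dict from per-type column lists.

    Hypothetical helper, not part of auto_ml. Columns that are not
    mentioned are simply left out of the dictionary.
    """
    groups = {
        "categorical": categorical,
        "date": date,
        "nlp": nlp,
        "ignore": ignore,
    }
    descriptions = {output: "output"}
    for col_type, columns in groups.items():
        for column in columns:
            # A column can carry only one description
            if column in descriptions:
                raise ValueError(f"column {column!r} is described more than once")
            descriptions[column] = col_type
    return descriptions


column_descriptions = build_column_descriptions(
    output="vote",
    categorical=["TVnews", "educ", "income"],
)
print(column_descriptions)
```

The resulting dictionary is the same as the hand-written one in the example, and can be passed to Predictor in the same way.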

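The program produces a lot of output, so it is not listed here. Compared with Auto-Sklearn, Auto-ML's output is much richer: it includes the best model, feature importances, and various scores for the predictions. Readers are encouraged to run the example above themselves. Because Auto-ML supports both deep learning and classical machine learning models, a deep learning model can be used to extract features and a machine learning model to make the actual predictions, which can lead to better training results.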

(Please credit the source when reposting: https://www.jianshu.com/p/6e4e40d29339)
