Building and Deploying a Machine Learning Model with Azure ML Service

Translated article

Compiled by 布加迪, 2019-01-23 11:12:42



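[51CTO.com Quick Translation] In this tutorial we will build and deploy a machine learning model that predicts salaries from the Stack Overflow dataset. By the end of the article, you will be able to call a RESTful web service to get predictions.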

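Because the goal is to demonstrate the workflow, we will experiment with a simple two-column dataset containing years of work experience and salary. For details about the dataset, see my earlier article introducing linear regression.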

Prerequisites

1. Basic knowledge of Python and Scikit-learn

2. An active Microsoft Azure subscription

3. Anaconda or Miniconda

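Configuring the development environment

Configure a virtual environment with the Azure ML SDK. Run the following commands to install the Python SDK and launch Jupyter Notebook. From Jupyter, start a new Python 3 kernel.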

$ conda create -n aml -y python=3.6  
$ conda activate aml  
$ conda install nb_conda  
$ pip install azureml-sdk[notebooks]  
$ jupyter notebook

Initializing the Azure ML environment

Start by importing all the necessary Python modules, including the standard Scikit-learn modules and the Azure ML modules.

import datetime  
import numpy as np  
import pandas as pd  
from sklearn.model_selection import train_test_split  
from sklearn.linear_model import LinearRegression  
import joblib  # older scikit-learn releases exposed this as sklearn.externals.joblib  
import azureml.core  
from azureml.core import Workspace  
from azureml.core.model import Model 
from azureml.core import Experiment  
from azureml.core.webservice import Webservice  
from azureml.core.image import ContainerImage  
from azureml.core.webservice import AciWebservice  
from azureml.core.conda_dependencies import CondaDependencies

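We need to create an Azure ML Workspace, which acts as the logical boundary for this experiment. Creating a Workspace also provisions a Storage Account for storing datasets, a Key Vault for storing secrets, a Container Registry for maintaining the image registry, and Application Insights for logging metrics.

Don't forget to replace the placeholder with your subscription ID.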

ws = Workspace.create(name='salary',
                      subscription_id='<your-subscription-id>',  # placeholder: replace with your own ID
                      resource_group='mi2',
                      create_resource_group=True,
                      location='southeastasia')

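After a few minutes, we will see the resources created within the Workspace.

Now we can create an Experiment to start logging metrics. Since we don't have many parameters to log, we simply capture the start time of the training process.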

exp = Experiment(workspace=ws, name='salexp')  
run = exp.start_logging()  
run.log("Experiment start time", str(datetime.datetime.now()))

Training and testing the Scikit-learn ML model

Now we will go on to train and test the model with Scikit-learn.

sal = pd.read_csv('data/sal.csv',header=0, index_col=None)  
X = sal[['x']]  
y = sal['y']  
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=10)  
lm = LinearRegression()  
lm.fit(X_train,y_train)
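
To make concrete what `lm.fit` computes for a single feature, here is a minimal sketch of ordinary least squares in plain Python. The sample values are invented for illustration and are not taken from `data/sal.csv`, and `fit_line` is a hypothetical helper name, not part of Scikit-learn.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    x_bar, y_bar = mean(xs), mean(ys)
    # slope = sum of co-deviations of x and y / sum of squared deviations of x
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Invented sample: years of experience vs. salary (illustrative values only)
years = [1.0, 2.0, 3.0, 4.0, 5.0]
salary = [40000.0, 45000.0, 50000.0, 55000.0, 60000.0]

slope, intercept = fit_line(years, salary)
predicted = slope * 6.0 + intercept  # predicted salary for 6 years of experience
```

The two returned numbers play the same role as `lm.coef_[0]` and `lm.intercept_` in the fitted Scikit-learn model.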

The trained model is serialized as a pickle file in the outputs directory. Azure ML automatically copies the contents of the outputs directory to the cloud.

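import os
os.makedirs('outputs', exist_ok=True)  # make sure the target directory exists before writing
filename = 'outputs/sal_model.pkl'
joblib.dump(lm, filename)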
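
The idea behind this step is a serialize-then-reload round trip. As a rough sketch, the same pattern can be tried with the standard-library `pickle` module, on which joblib's persistence is based. The `params` dictionary and the file name below are hypothetical stand-ins for the fitted model, chosen only so the snippet runs without Scikit-learn.

```python
import os
import pickle
import tempfile

# Hypothetical fitted parameters standing in for the trained model object
params = {"slope": 5000.0, "intercept": 35000.0}

with tempfile.TemporaryDirectory() as tmp:
    out_dir = os.path.join(tmp, "outputs")
    os.makedirs(out_dir, exist_ok=True)  # the target directory must exist before writing
    path = os.path.join(out_dir, "params.pkl")

    with open(path, "wb") as f:
        pickle.dump(params, f)     # serialize to disk
    with open(path, "rb") as f:
        restored = pickle.load(f)  # deserialize into a new object

round_trip_ok = restored == params
```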

Let's log the slope, intercept, and end time of the training job to complete the experiment.

run.log('Intercept :', lm.intercept_)  
run.log('Slope :', lm.coef_[0])  
run.log("Experiment end time", str(datetime.datetime.now()))  
run.complete()

We can track the metrics and execution time through the Azure Dashboard.

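Registering and loading the trained model

Whenever we freeze a model, it can be registered with Azure ML under a unique version. This lets us switch easily between different models when loading.

Let's register the salary model from the training job above by pointing the SDK to the location of the PKL file. We also add some extra metadata to the model in the form of tags.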

model = Model.register(model_path="outputs/sal_model.pkl",
                       model_name="sal_model",
                       tags={"key": "1"},
                       description="Salary Prediction",
                       workspace=ws)

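Check the Models section of the Workspace to make sure our model is registered.

It is time to package the model as a container image (which will later be exposed as a web service) and deploy it.

To create the container image, we need to tell Azure ML about the environment the model requires. We then pass in a Python script that contains the code for predicting values from incoming data points.

The Azure ML API provides convenient methods for both. Let's first create the environment file salenv.yml, which tells the runtime to add Scikit-learn to the container image.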

salenv = CondaDependencies()
salenv.add_conda_package("scikit-learn")
# Write the environment definition to disk
with open("salenv.yml", "w") as f:
    f.write(salenv.serialize_to_string())
# Print it back to verify the contents
with open("salenv.yml", "r") as f:
    print(f.read())

When run from Jupyter Notebook, the following snippet creates a file named score.py, which contains the model's inference logic.

%%writefile score.py
import json
import numpy as np
import joblib  # older scikit-learn releases exposed this as sklearn.externals.joblib
from azureml.core.model import Model

def init():
    # Called once when the container starts: load the registered model
    global model
    # retrieve the path to the model file using the model name
    model_path = Model.get_model_path('sal_model')
    model = joblib.load(model_path)

def run(raw_data):
    # Called for every request: raw_data is a JSON string with a "data" key
    data = np.array(json.loads(raw_data)['data'])
    # make prediction
    y_hat = model.predict(data)
    return json.dumps(y_hat.tolist())
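
Before building the image, it can help to check the request and response format that `run` expects. The sketch below mirrors that contract in plain Python. `StubModel` and its coefficients are invented stand-ins for the real regressor, so the predictions are illustrative only.

```python
import json

class StubModel:
    """Hypothetical stand-in for the loaded regressor (coefficients are invented)."""
    def predict(self, rows):
        # Each row holds one feature: years of experience
        return [5000.0 * row[0] + 35000.0 for row in rows]

model = StubModel()

def run(raw_data):
    # Same contract as score.py: a JSON string with a "data" key holding rows of features
    data = json.loads(raw_data)["data"]
    y_hat = model.predict(data)
    return json.dumps(y_hat)

# Two data points, each wrapped in its own row because predict expects 2-D input
payload = json.dumps({"data": [[2.0], [6.0]]})
result = run(payload)
```

Note that every data point is a list of its own, even with a single feature. Sending a flat list such as `[2.0, 6.0]` would not match the two-dimensional input that Scikit-learn's `predict` expects.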

Now let's connect the dots by passing the inference file and the environment configuration to the image.

%%time  
image_config = ContainerImage.image_configuration(execution_script="score.py",  
runtime="python", 
conda_file="salenv.yml")

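This eventually creates the container image, which appears in the Images section of the Workspace.

We are all set to create a deployment configuration that defines the target environment, and to run it as a web service hosted on Azure Container Instances. We could also have chosen AKS or an IoT Edge environment as the deployment target.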

aciconfig = AciWebservice.deploy_configuration(cpu_cores=1,  
memory_gb=1,  
tags={"data": "Salary", "method" : "sklearn"},  
description='Predict Stackoverflow Salary')  
service = Webservice.deploy_from_model(workspace=ws,  
name='salary-svc',  
deployment_config=aciconfig,  
models=[model],  
image_config=image_config)  
service.wait_for_deployment(show_output=True)
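
Once the deployment finishes, the service can be called over REST. The following is a minimal sketch using only the standard library. `build_scoring_request` is a hypothetical helper, and the URI shown is a placeholder: in practice you would use the value of `service.scoring_uri` from the deployed service.

```python
import json
import urllib.request

def build_scoring_request(scoring_uri, years_of_experience):
    """Build a POST request whose body matches what score.py's run() expects."""
    body = json.dumps({"data": [[y] for y in years_of_experience]}).encode("utf-8")
    return urllib.request.Request(
        scoring_uri,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Placeholder URI for illustration; after deployment, use service.scoring_uri instead
request = build_scoring_request("http://localhost:8080/score", [4.5, 10])

# Sending the request needs a live service, so it is left commented out here:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

The response body is the JSON list returned by `run`, with one predicted salary per input row.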