MLflow is a tool for monitoring and tracking ML runs, with a full server UI. Here we go over some examples of its use and how it can be incorporated into a Google Colab notebook.
# Here is the code (meant to be run in a Colab notebook)
!pip install mlflow
!pip install pyngrok
# First import the MLflow library
import mlflow
# Start an MLflow run and log an example metric and parameter
with mlflow.start_run(run_name="MLflow on Colab"):
    mlflow.log_metric("m1", 2.0)
    mlflow.log_param("p1", "mlflow-colab")
# run tracking UI in the background
get_ipython().system_raw("mlflow ui --port 5000 &")
borrowed from
#import pyngrok
from pyngrok import ngrok
# Terminate any open tunnels
ngrok.kill()
# Replace the placeholder with your own token from the ngrok dashboard; do not publish a real token
NGROK_AUTH_TOKEN = "<YOUR_NGROK_AUTH_TOKEN>"
ngrok.set_auth_token(NGROK_AUTH_TOKEN)
# Open an HTTPS tunnel on port 5000 for http://localhost:5000
# If you are on the VPN, you will have to disconnect to use ngrok
# Also comment out any proxies set in your bash configuration
ngrok_tunnel = ngrok.connect(addr="5000", proto="http", bind_tls=True)
print("MLflow Tracking UI:", ngrok_tunnel.public_url)
# Log a second run so that it shows up in the tracking UI
import mlflow
with mlflow.start_run(run_name="MLflow in Notebook"):
    mlflow.log_metric("m1", 2.0)
    mlflow.log_param("p1", "mlflow-colab")
The output of this notebook will be a pyngrok-generated URL like:
MLflow Tracking UI: https://0a23d7a7d0c4.ngrok.io
Clicking the URL opens the MLflow UI.
(Slight modification of the original code thanks to pyngrok creator, Alex Laird)
Tested with MLflow versions 1.10.0 and 1.11.0.
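The notebook above pastes the ngrok auth token in as a string literal, which is easy to leak when a notebook is shared. A minimal sketch of one alternative, assuming the token is stored in an environment variable: the helper name load_ngrok_token and the variable name NGROK_AUTH_TOKEN are illustrative choices, not part of pyngrok.

```python
import os
from getpass import getpass


def load_ngrok_token(env=os.environ, var_name="NGROK_AUTH_TOKEN"):
    """Return the token from the environment, prompting only if it is absent.

    Both the function name and the variable name are hypothetical choices
    made for this sketch.
    """
    token = env.get(var_name, "").strip()
    if not token:
        # getpass hides the typed value, so it does not end up in the cell output
        token = getpass("ngrok auth token: ").strip()
    if not token:
        raise ValueError("an ngrok auth token is required")
    return token
```

The result can then be passed to ngrok.set_auth_token(load_ngrok_token()) in place of the hard-coded string.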
Next, try setting up some ML runs to track. See the ngrok tutorials at https://dashboard.ngrok.com/get-started/tutorials and the MLflow tutorial below:
https://www.mlflow.org/docs/latest/tutorials-and-examples/tutorial.html
This tutorial showcases how you can use MLflow end-to-end to:
Train a linear regression model
Package the code that trains the model in a reusable and reproducible model format
Deploy the model into a simple HTTP server that will enable you to score predictions
This tutorial uses a dataset to predict the quality of wine based on quantitative features like the wine’s “fixed acidity”, “pH”, “residual sugar”, and so on. The dataset is from UCI’s machine learning repository.
To run this tutorial, you’ll need to:
Install MLflow and scikit-learn. There are two options for installing these dependencies:
Install MLflow with extra dependencies, including scikit-learn (via pip install mlflow[extras])
Install MLflow (via pip install mlflow) and install scikit-learn separately (via pip install scikit-learn)
Install conda
Clone (download) the MLflow repository via git clone https://github.com/mlflow/mlflow
cd into the examples directory within your clone of MLflow - we’ll use this working directory for running the tutorial. We avoid running directly from our clone of MLflow as doing so would cause the tutorial to use MLflow from source, rather than your PyPI installation of MLflow.
First, train a linear regression model that takes two hyperparameters: alpha and l1_ratio.
# The data set used in this example is from http://archive.ics.uci.edu/ml/datasets/Wine+Quality
# P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis.
# Modeling wine preferences by data mining from physicochemical properties. In Decision Support Systems, Elsevier, 47(4):547-553, 2009.
import os
import warnings
import sys
import pandas as pd
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.linear_model import ElasticNet
from urllib.parse import urlparse
import mlflow
import mlflow.sklearn
import logging
logging.basicConfig(level=logging.WARN)
logger = logging.getLogger(__name__)
def eval_metrics(actual, pred):
    # Compute RMSE, MAE and R2 for a set of predictions
    rmse = np.sqrt(mean_squared_error(actual, pred))
    mae = mean_absolute_error(actual, pred)
    r2 = r2_score(actual, pred)
    return rmse, mae, r2


def Build_ML_Flow():
    warnings.filterwarnings("ignore")
    np.random.seed(40)
    # Read the wine-quality csv file from the URL
    csv_url = (
        "http://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv"
    )
    try:
        data = pd.read_csv(csv_url, sep=";")
    except Exception as e:
        logger.exception(
            "Unable to download training & test CSV, check your internet connection. Error: %s", e
        )
        # Without the data there is nothing to train on, so stop here
        raise
    # Split the data into training and test sets. (0.75, 0.25) split.
    train, test = train_test_split(data)
    # The predicted column is "quality" which is a scalar from [3, 9]
    train_x = train.drop(["quality"], axis=1)
    test_x = test.drop(["quality"], axis=1)
    train_y = train[["quality"]]
    test_y = test[["quality"]]
    # Read the hyperparameters from the command line. Fall back to 0.5 when an
    # argument is missing or is not a number (in a notebook, sys.argv holds
    # the kernel's own arguments).
    try:
        alpha = float(sys.argv[1]) if len(sys.argv) > 1 else 0.5
    except ValueError:
        alpha = 0.5
    try:
        l1_ratio = float(sys.argv[2]) if len(sys.argv) > 2 else 0.5
    except ValueError:
        l1_ratio = 0.5
    # Train the model and log its parameters and metrics inside an MLflow run
    with mlflow.start_run():
        lr = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, random_state=42)
        lr.fit(train_x, train_y)
        predicted_qualities = lr.predict(test_x)
        (rmse, mae, r2) = eval_metrics(test_y, predicted_qualities)
        print("Elasticnet model (alpha=%f, l1_ratio=%f):" % (alpha, l1_ratio))
        print(" RMSE: %s" % rmse)
        print(" MAE: %s" % mae)
        print(" R2: %s" % r2)
        mlflow.log_param("alpha", alpha)
        mlflow.log_param("l1_ratio", l1_ratio)
        mlflow.log_metric("rmse", rmse)
        mlflow.log_metric("r2", r2)
        mlflow.log_metric("mae", mae)


if __name__ == "__main__":
    Build_ML_Flow()
This example uses the familiar pandas, numpy, and sklearn APIs to create a simple machine learning model. The MLflow tracking APIs log information about each training run, such as the hyperparameters alpha and l1_ratio used to train the model and the metrics, such as the root mean square error, used to evaluate it. The original tutorial script also serializes the model in a format that MLflow knows how to deploy; the version shown here logs only parameters and metrics.
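To make the three logged metrics concrete, here is a minimal sketch that computes RMSE, MAE and R2 directly from their definitions in plain Python. The function name regression_metrics and the sample values are made up for illustration; the training script itself relies on the sklearn.metrics functions.

```python
import math


def regression_metrics(actual, predicted):
    """Compute RMSE, MAE and R2 from their textbook definitions."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    # Sum of squared residuals, shared by RMSE and R2
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    # Total variance of the targets around their mean
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    return rmse, mae, r2


# Illustrative values only, not taken from the wine dataset
rmse, mae, r2 = regression_metrics([3, 5, 6, 7], [2.5, 5, 7, 8])
print(rmse, mae, round(r2, 4))  # prints: 0.75 0.625 0.7429
```

Lower RMSE and MAE and a higher R2 indicate a better fit, which is what to look for when comparing runs in the UI.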
You can run the example with default hyperparameters as follows:
python sklearn_elasticnet_wine/train.py
Try out some other values for alpha and l1_ratio by passing them as arguments to train.py:
python sklearn_elasticnet_wine/train.py <alpha> <l1_ratio>
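The script reads both hyperparameters from positional command-line arguments and falls back to 0.5 when one is missing. A minimal sketch of that lookup as a reusable helper follows; the name float_arg is hypothetical, and treating a non-numeric argument as "use the default" is an assumption that suits notebooks, where sys.argv holds the kernel's own flags.

```python
import sys


def float_arg(argv, position, default):
    """Return argv[position] as a float, or default if missing or not numeric."""
    if len(argv) <= position:
        return default
    try:
        return float(argv[position])
    except ValueError:
        # For example "-f" when running inside a Jupyter or Colab kernel
        return default


alpha = float_arg(sys.argv, 1, 0.5)
l1_ratio = float_arg(sys.argv, 2, 0.5)
```

Catching only ValueError keeps unrelated errors visible instead of silently replacing them with the default.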
Next, use the MLflow UI to compare the models that you have produced. With mlflow ui running from the working directory that contains the mlruns directory, open a tunnel to it:
# Open an HTTPS tunnel on port 5000 for http://localhost:5000
ngrok_tunnel = ngrok.connect(addr="5000", proto="http", bind_tls=True)
print("MLflow Tracking UI:", ngrok_tunnel.public_url)
The next example trains and evaluates a simple MLP on the Reuters newswire topic classification task.
import numpy as np
from tensorflow import keras
from tensorflow.keras.datasets import reuters
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation
from tensorflow.keras.preprocessing.text import Tokenizer
# The following import and function call are the only additions to code required
# to automatically log metrics and parameters to MLflow.
import mlflow.keras
mlflow.keras.autolog()
max_words = 1000
batch_size = 32
epochs = 5
print("Loading data...")
(x_train, y_train), (x_test, y_test) = reuters.load_data(num_words=max_words, test_split=0.2)
print(len(x_train), "train sequences")
print(len(x_test), "test sequences")
num_classes = np.max(y_train) + 1
print(num_classes, "classes")
print("Vectorizing sequence data...")
tokenizer = Tokenizer(num_words=max_words)
x_train = tokenizer.sequences_to_matrix(x_train, mode="binary")
x_test = tokenizer.sequences_to_matrix(x_test, mode="binary")
print("x_train shape:", x_train.shape)
print("x_test shape:", x_test.shape)
print("Convert class vector to binary class matrix " "(for use with categorical_crossentropy)")
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
print("y_train shape:", y_train.shape)
print("y_test shape:", y_test.shape)
print("Building model...")
model = Sequential()
model.add(Dense(512, input_shape=(max_words,)))
model.add(Activation("relu"))
model.add(Dropout(0.5))
model.add(Dense(num_classes))
model.add(Activation("softmax"))
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
history = model.fit(
    x_train, y_train, batch_size=batch_size, epochs=epochs, verbose=1, validation_split=0.1
)
score = model.evaluate(x_test, y_test, batch_size=batch_size, verbose=1)
print("Test score:", score[0])
print("Test accuracy:", score[1])
# Open an HTTPS tunnel on port 5000 for http://localhost:5000
ngrok_tunnel = ngrok.connect(addr="5000", proto="http", bind_tls=True)
print("MLflow Tracking UI:", ngrok_tunnel.public_url)
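The vectorizing steps in the Reuters example turn each newswire (a list of word indices) into a fixed-length binary vector and each topic label into a one-hot row. A minimal NumPy sketch of both transformations follows, to show what the Keras helpers produce. The function names multi_hot and one_hot are illustrative, and the sketch mirrors the behavior of the "binary" mode rather than reproducing the library code.

```python
import numpy as np


def multi_hot(sequences, num_words):
    """Mark with 1.0 every word index that occurs in each sequence."""
    matrix = np.zeros((len(sequences), num_words))
    for row, sequence in enumerate(sequences):
        for index in sequence:
            # Indices outside the vocabulary are ignored
            if index < num_words:
                matrix[row, index] = 1.0
    return matrix


def one_hot(labels, num_classes):
    """Turn integer class labels into rows with a single 1.0."""
    matrix = np.zeros((len(labels), num_classes))
    matrix[np.arange(len(labels)), labels] = 1.0
    return matrix


print(multi_hot([[1, 3], [2, 2, 5]], 4))
print(one_hot([0, 2], 3))
```

Repeated indices do not change the result in this mode, and word order is discarded, which is why a plain dense network can consume the vectors.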