Deploying TensorFlow models with Go

It has been a while since I last posted here. I decided a few months ago to watch the extended editions of The Lord of the Rings, and I just finished them. Of course, I couldn't leave all my readers (Yes! All three of you! [including web crawlers]) without posts, so here I am.

(I also spent a month in New York for Google's Machine Learning Advanced Solutions Lab - an amazing course offered by Google to people interested in building a solid foundation in Machine Learning. It is a fantastic course, and I could not recommend it enough!)

Before we start, here is a philosophical thought experiment:

If an ML model is trained and never leaves Jupyter Lab, does it really make predictions? - Abraham Lincoln

Technically yes, but whether it is useful or not is another story. We can, however, make it useful by actually putting it in production and making it accessible to the public! Productionizing a model is not an easy task, though, especially if you want great performance. Models are normally trained with Python, but we probably want to write our APIs (or whatever else will do the predictions) in a more performant language such as Go. Luckily, there are bindings available for Go that allow us to import TensorFlow models and run predictions!

In this post, I am going to make a simple model with TensorFlow using Python, export the trained model, load it in Go, and wrap it in an API to run predictions!

Building the model

Let's start by making a simple ML model in TensorFlow. The focus of this post is not to show how to build models, so I will keep this part short. Let's suppose this is what my data looks like:

 x | y
---|---
10 | 21
33 | 67
24 | 43 
21 | 38
34 | 72
12 | 26
35 | 75
42 | 80

If you look closely, it can almost be defined by the function y = x * 2. I am going to generate a lot of data like this, and we are going to train a model to predict the value of y based on x.

We will begin by creating the dataset:

import random

# 100 noisy samples: x is roughly n and y is roughly 2n
x = [n   + random.randint(-3, 3) for n in range(0, 100)]
y = [n*2 + random.randint(-3, 3) for n in range(0, 100)]

If I plot x and y, this is what I get:

X and Y Plotted

We have our dataset. Before we can make the model, however, I am going to load my x and y arrays into a Pandas DataFrame. This is the input format that my model will consume.

import pandas as pd

df = pd.DataFrame({ "x": x, "y": y })

Now let's make a little linear regression model in TensorFlow.

import tensorflow as tf

# My model has "x" as a feature column. Meaning I am going to use "x" to predict "y"
model = tf.estimator.LinearRegressor(
    feature_columns = [ tf.feature_column.numeric_column(key = "x") ],
    config = tf.estimator.RunConfig(),
)

My model is done. Now we need to train it.

# This function will serve the input to the trainer
# We will shuffle the input dataset and serve it in batches
def train_input_fn(df):
    dataset = tf.data.Dataset.from_tensor_slices(tensors = (dict(df[["x"]]), df["y"]))
    dataset = dataset.shuffle(10).repeat().batch(10)
    return dataset

# Here we are training the model. We will train it with 300 steps - this should be
# enough to have good accuracy
model.train(input_fn = lambda: train_input_fn(df = df), steps = 300)

The model is now trained. Now let's test it!

# Function that serves the input to the model. We need to wrap the
# raw number with a tensor
def predict_input_fn(x):
    dataset = tf.data.Dataset.from_tensors(tensors = { "x": [x] })
    dataset = dataset.batch(batch_size = 120)
    return dataset

# Runs the prediction and unwraps the result from the 100 billion layers
# of arrays that tensorflow produces
def predict(x):
    p = model.predict(lambda: predict_input_fn(x))
    return list(p)[0]["predictions"][0]

predict(30) # Result: 60.912247
predict(50) # Result: 100.33565
predict(75) # Result: 149.61488

Not bad! Now that we have a trained model, we need to export it as a SavedModel (a directory containing a .pb file). Our Go program will load this later.

# These are the placeholders for the inputs of the model (in our case, "x"). I am
# giving it the name of "input_x". 
p = { "x": tf.placeholder(tf.float64, [1], name="input_x") }

# This function will bind the placeholder we just created with the actual model
export_input_fn = tf.estimator.export.build_raw_serving_input_receiver_fn(p)

# Exporting the model to the "model" directory
model.export_savedmodel("model", export_input_fn)

And voilà! Our model is saved!

$ tree
...
├── model
│   └── 1557625293
│       ├── saved_model.pb
│       └── variables
│           ├── variables.data-00000-of-00002
│           ├── variables.data-00001-of-00002
│           └── variables.index
...

We can inspect the model we just saved with the saved_model_cli utility from TensorFlow:

$ saved_model_cli show --dir model/1557625293/ --all

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['predict']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['x'] tensor_info:
        dtype: DT_DOUBLE
        shape: (-1)
        name: input_x:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['predictions'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 1)
        name: linear/linear_model/linear_model/linear_model/weighted_sum:0
  Method name is: tensorflow/serving/predict

There is some important information there. Notice that the input is named input_x (we specified this when exporting the model)! We should also pay attention to the model tag, serve. Finally, the output tensor name is important: linear/linear_model/linear_model/linear_model/weighted_sum.

Now we are good to Go!

Predicting with Go

To make predictions, we are going to use a library called tfgo. It makes working with the models much easier! We will have to install the following libraries:

$ go get github.com/galeone/tfgo
$ go get github.com/tensorflow/tensorflow/tensorflow/go

Also, make sure you have TensorFlow installed - not the Python library, but the TensorFlow C library that the Go bindings link against. Otherwise you will run into this error, just like I did:

/usr/bin/ld: cannot find -ltensorflow

The first thing we have to do is create our main.go file and import the libraries:

package main

import (
    "fmt"
    tg "github.com/galeone/tfgo"
    tf "github.com/tensorflow/tensorflow/tensorflow/go"
)

func main() {
    // I will do my work here
}

Now we can load the model.

// The first argument is the location of the model. The second argument
// is the model tag
model := tg.LoadModel("model/1557625293", []string{"serve"}, nil)

Next, we are going to create the input for the prediction. I am going to predict for the number 30, so I will put it in a tensor.

// Ignoring the error here for brevity - we will handle it properly later
xInput, _ := tf.NewTensor([]float64{30})

We are now ready to run the prediction.

// Executing the operation in the model
results := model.Exec(

    // This part describes how to obtain the output. We are saying that our output
    // is in the path "linear/linear_model/linear_model/linear_model/weighted_sum"
    // (remember this from the output of "saved_model_cli"?) at position 0
    []tf.Output{
        model.Op("linear/linear_model/linear_model/linear_model/weighted_sum", 0),
    },

    // And this part describes how to serve the input. We are saying that at
    // position 0 of the "input_x" placeholder we are placing our xInput tensor
    map[tf.Output]*tf.Tensor{
        model.Op("input_x", 0): xInput,
    },
)

Now we just have to unwrap the result from 100 billion more arrays from TensorFlow!

predictions := results[0].Value().([][]float32)

fmt.Println(predictions[0][0]) // Result: 60.912247

Victory!
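
One caveat: that type assertion will panic if the model ever hands back an output with a different shape. In a real service, you may prefer the comma-ok form. Here is a small defensive sketch, reusing the results variable from above:

// Assert defensively instead of panicking on an unexpected output shape
predictions, ok := results[0].Value().([][]float32)
if !ok || len(predictions) == 0 || len(predictions[0]) == 0 {
    // Handle the unexpected output here
}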

This is my main function so far:

func main() {
    model := tg.LoadModel("model/1557625293", []string{"serve"}, nil)

    xInput, _ := tf.NewTensor([]float64{30})

    results := model.Exec(
        []tf.Output{
            model.Op("linear/linear_model/linear_model/linear_model/weighted_sum", 0),
        },
        map[tf.Output]*tf.Tensor{
            model.Op("input_x", 0): xInput,
        },
    )

    predictions := results[0].Value().([][]float32)
    fmt.Println(predictions[0][0])
}

Just for fun, let's turn this into an API!

I am going to wrap the prediction part with a factory:

// I can initialize this factory by giving it the model only once. Then I can
// make as many predictions as I want!
func makePredictor(model *tg.Model) func(float64) (float32, error) {
    return func(x float64) (float32, error) {

        // Creating the input tensor
        xInput, err := tf.NewTensor([]float64{x})
        if err != nil {
            return 0, err
        }

        // Running the prediction
        results := model.Exec(
            []tf.Output{
                model.Op("linear/linear_model/linear_model/linear_model/weighted_sum", 0),
            },
            map[tf.Output]*tf.Tensor{
                model.Op("input_x", 0): xInput,
            },
        )

        // Unwrapping and returning
        predictions := results[0].Value().([][]float32)
        return predictions[0][0], nil
    }
}
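
Initialize once, predict as many times as we want. A quick usage sketch (reusing the model we loaded earlier - the expected output comes from our earlier Python test):

predictor := makePredictor(model)

p, err := predictor(30)
if err != nil {
    panic(err)
}
fmt.Println(p) // 60.912247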

I am also going to make a factory for an HTTP controller:

// This factory accepts a function to make predictions and will return
// a controller that will handle HTTP requests for predictions!
func makePredictionController(
    predict func(float64) (float32, error),
) http.HandlerFunc {
    return func(w http.ResponseWriter, r *http.Request) {

        // Reading the ?v= query string from the URL
        vs := r.URL.Query().Get("v")
        v, err := strconv.ParseFloat(vs, 64)
        if err != nil {
            w.WriteHeader(http.StatusBadRequest)
            w.Write([]byte("param 'v' must be a float"))
            return
        }

        // Running the prediction for the value
        p, err := predict(v)
        if err != nil {
            w.WriteHeader(http.StatusInternalServerError)
            w.Write([]byte("an error happened while running prediction"))
            return
        }

        // Returning the prediction
        w.Write([]byte(fmt.Sprintf("predicted: %f\n", p)))
    }
}
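
A nice side effect of taking the predict function as a parameter is that we can unit test the controller without loading the model at all. Here is a minimal sketch - assuming a hypothetical main_test.go next to our main.go - with a stub predictor that simply doubles its input:

package main

import (
    "io/ioutil"
    "net/http/httptest"
    "testing"
)

func TestPredictionController(t *testing.T) {
    // Stub predictor: no TensorFlow involved
    stub := func(x float64) (float32, error) { return float32(x * 2), nil }
    controller := makePredictionController(stub)

    req := httptest.NewRequest("GET", "/?v=15", nil)
    rec := httptest.NewRecorder()
    controller(rec, req)

    body, _ := ioutil.ReadAll(rec.Result().Body)
    if rec.Code != 200 {
        t.Fatalf("expected status 200, got %d", rec.Code)
    }
    if string(body) != "predicted: 30.000000\n" {
        t.Fatalf("unexpected body: %q", body)
    }
}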

Now to start the server (remember to add net/http and strconv to the import block):

func main() {
    model := tg.LoadModel("model/1557625293", []string{"serve"}, nil)

    predictor := makePredictor(model)
    pcontroller := makePredictionController(predictor)

    http.Handle("/", pcontroller)
    fmt.Println("listening on port 8000")
    http.ListenAndServe(":8000", nil)
}

Let's try it out!

$ curl localhost:8000?v=15
predicted: 31.344702

$ curl localhost:8000?v=68
predicted: 135.816696

$ curl localhost:8000?v=25
predicted: 51.056400
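
And if you would rather call the API from Go than from curl, here is a minimal client sketch (the port matches the server above):

package main

import (
    "fmt"
    "io/ioutil"
    "net/http"
)

func main() {
    // Ask the prediction server for the value 30
    resp, err := http.Get("http://localhost:8000/?v=30")
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    body, _ := ioutil.ReadAll(resp.Body)
    fmt.Print(string(body)) // predicted: 60.912247
}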

Another victory!

I hope you found this post useful. I am very glad the community is putting so much effort into these awesome libraries.

You can find the full source code, the model, and the Jupyter notebook here.