Using Titan with CI/CD tools

Akoios
May 5, 2020

Titan Tutorial #10: A basic pipeline for Machine Learning

Introduction

Ever since its inception, every detail and feature of Titan has been designed and built with interoperability in mind.

To make it easy to integrate our product into any corporate architecture, Titan is agnostic both with regard to the underlying cloud (public or on-prem) and with regard to integrations with other applications and pipelines.

Titan has been designed to fit in any IT architecture

In this tutorial we will learn how to use Titan with two CI/CD services: GitLab CI and GitHub Actions.

CI/CD Basics

Continuous Integration and Deployment (CI/CD) is an umbrella term for the automated processes used to build, package and deploy all types of applications (including Machine Learning ones).

Using CI/CD services brings several important benefits for the life-cycle management of our applications (AI/ML models in our case), such as:

  • Reducing human errors in repetitive tasks
  • Speeding up release cycles
  • Integrating with the source code repository

A generic structure of a CI/CD pipeline is shown in the following picture:

A common CI/CD Pipeline

Our first CI/CD model

In order to understand how Titan can be used in these pipelines, we will work with a very simple process, shown in the figure below:

Our basic pipeline

As can be seen, our pipeline will do the following:

  • Connect to the source code repository
  • Run a linter to identify errors, bugs or bad coding practices (for our example we will use flake8-nb, the Jupyter Notebook version of flake8)
  • Deploy the code using Titan once it has been checked

In terms of source code, we will use the simplest possible model, a Hello World, to illustrate the example.

The model we will be using

GitLab Implementation

Let’s start with GitLab’s service for CI/CD: GitLab CI. As with other CI/CD services, the pipeline is configured by defining a YAML specification of its steps.

Note that a GitLab repository is required in order to use GitLab CI!

The best way to understand how this all works is to go straight to the YAML specification.
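Put together, the pieces analyzed in the rest of this section would form a file along these lines. This is a sketch assembled from the fragments in this tutorial; the file name .gitlab-ci.yml is GitLab's default for the pipeline definition and is an assumption here, as is the exact indentation.

```yaml
# .gitlab-ci.yml (GitLab's default pipeline file at the repository root)
stages:
  - lint
  - deploy

# Lint the Jupyter Notebook
lint:
  image: python:3.8
  stage: lint
  script:
    # Install Linter
    - pip install flake8-nb
    # Run Linter
    - flake8-nb helloworld.ipynb

# Deploy stage will deploy our Titan service
deploy:
  image: python:3.8
  stage: deploy
  script:
    # Install Titan CLI
    - curl -sf https://install.akoios.com/beta | sh
    # Deploy Notebook API service
    - titan deploy --image scipy helloworld.ipynb
```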

Let’s analyze the structure of the file:

stages:
  - lint
  - deploy

First of all, we define the stages which will form the pipeline. In our case, we will have just two: linting and deployment.

After that, we can define each of the jobs.

# Lint the Jupyter Notebook
lint:
  image: python:3.8
  stage: lint
  script:
    # Install Linter
    - pip install flake8-nb
    # Run Linter
    - flake8-nb helloworld.ipynb

This first job is called lint, uses a Python image in the GitLab environment, and is linked to the previously defined lint stage through the stage: lint line.

The commands in this job are quite simple:

  1. Install the linter in the GitLab environment
  2. Run the linter

Note that if this stage fails, the pipeline will stop and will not proceed to the deploy stage.

We proceed in the same way with the deploy stage:

# Deploy stage will deploy our Titan service
deploy:
  image: python:3.8
  stage: deploy
  script:
    # Install Titan CLI
    - curl -sf https://install.akoios.com/beta | sh
    # Deploy Notebook API service
    - titan deploy --image scipy helloworld.ipynb

As with the linting, we create a new job, called deploy, which is linked to the deploy stage. The process is quite similar: we first install Titan and then we run our well-known command:

$ titan deploy

You might be wondering how we can make Titan work without previous authentication. The trick is that GitLab supports secret environment variables for this purpose, allowing us to run Titan without exposing our credentials.

In the CI/CD settings of the GitLab repository it is possible to define these variables:

Env. variables at GitLab CI
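Variables defined in these settings are exposed to the jobs as ordinary environment variables, so no credential has to appear in the YAML file. As a hedged sketch, a job could fail early when a credential is missing. TITAN_TOKEN is a hypothetical name used only for illustration; use whichever variables your Titan setup expects.

```yaml
deploy:
  image: python:3.8
  stage: deploy
  before_script:
    # TITAN_TOKEN is a hypothetical variable name, not taken from Titan's documentation.
    # Abort before deploying if the secret was not injected into the job.
    - 'if [ -z "$TITAN_TOKEN" ]; then echo "TITAN_TOKEN is not set"; exit 1; fi'
  script:
    - curl -sf https://install.akoios.com/beta | sh
    - titan deploy --image scipy helloworld.ipynb
```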

That would be it! Now, every time a commit is made to the master branch of the repository, the defined pipeline will be automatically started and will run the stages and jobs we have defined.
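Jobs that carry no branch restriction are run by GitLab on pushes to any branch, not only master. If deployment should happen only from master, one option is GitLab's rules keyword. A minimal sketch, assuming the branch is named master:

```yaml
deploy:
  image: python:3.8
  stage: deploy
  rules:
    # Run this job only for pipelines triggered on the master branch
    - if: '$CI_COMMIT_BRANCH == "master"'
  script:
    - curl -sf https://install.akoios.com/beta | sh
    - titan deploy --image scipy helloworld.ipynb
```

With this rule in place, the lint job would still run on every branch, while the deploy job would be skipped everywhere except master.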

GitLab CI running the first stage
GitLab CI running the second stage
Process finished!

When the process is finished, we will have our model running in Titan as expected, as we can see in the dashboard:

It’s up and running!

Imagine now that we want to check if the linter is doing its job. In order to do that, we will introduce a syntactic mistake in our code (note the missing quotes in the print command):

If we commit and push the changes, the CI process will start again but, as shown below, it will fail as expected:

Lint stage failed due to the error

Checking the logs at GitLab CI, we see that the error returned by the linter is the following:

The linter detected the error

You can find all the code in this GitLab Repository.

GitHub Actions Implementation

GitHub Actions is GitHub’s approach to CI and works in a very similar fashion to GitLab CI.

Similarly, GitHub Actions uses a YAML configuration file to create our pipelines.

As in the previous example with GitLab, we define several jobs, each with its environment setup, variables and tasks to perform.

Likewise, Titan’s authentication credentials are managed using secret variables:

Secret Management in GitHub Actions
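As a rough sketch, an equivalent of the GitLab pipeline could be expressed as a GitHub Actions workflow along these lines. The file path, workflow name, action versions and the TITAN_TOKEN secret name are all assumptions made for illustration and are not taken from the original repository.

```yaml
# .github/workflows/titan.yml (hypothetical file name)
name: lint-and-deploy

on:
  push:
    branches:
      - master

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          # Mirrors the python:3.8 image of the GitLab example; adjust as needed
          python-version: "3.8"
      - name: Install linter
        run: pip install flake8-nb
      - name: Run linter
        run: flake8-nb helloworld.ipynb

  deploy:
    # Wait for the lint job and run only if it succeeded
    needs: lint
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install Titan CLI and deploy Notebook API service
        env:
          # Hypothetical variable and secret names; use the ones your Titan setup expects
          TITAN_TOKEN: ${{ secrets.TITAN_TOKEN }}
        run: |
          curl -sf https://install.akoios.com/beta | sh
          titan deploy --image scipy helloworld.ipynb
```

The needs keyword plays the role of GitLab's stage ordering: without it, GitHub Actions would run both jobs in parallel.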

You can find all the code in this GitHub repository.

Wrap-up

In this post we have seen how to use Titan from two different CI/CD services, GitLab CI and GitHub Actions.

By using Titan in this type of service, it is possible to easily create pipelines that automate processes and reduce human errors. Moreover, this capability makes it easier to integrate Titan into different IT architectures and existing infrastructures.

Thanks for reading!

Afterword

Titan can help you to radically reduce and simplify the effort required to put AI/ML models into production, enabling Data Science teams to be agile, more productive and closer to the business impact of their developments.

If you want to know more about how to start using Titan or get a free demo, please visit our website or drop us a line at info@akoios.com.

If you prefer, you can schedule a meeting with us here.

Akoios: Frictionless solutions for modern data science.
