Model Experimentation

Overview

Machine learning model experimentation involves uncertainty about both the expected model results and future operationalization. To reduce this uncertainty as much as possible, we propose a semi-structured process that balances engineering and research best practices with rapid model and data exploration.

Model experimentation goals

Model experimentation challenges

The overarching challenge is to create an experimentation framework that facilitates rapid experimentation, collaboration, experiment and model reproducibility, evaluation, and well-defined APIs, and that lets each team member focus on model development and improvement while trusting the framework to do the rest.

The following tools and guidelines are aimed at achieving experimentation goals as well as addressing the aforementioned challenges.

Tools and guidelines for successful model experimentation

Virtual environments

In languages like Python and R, it is always advisable to use virtual environments. Virtual environments facilitate reproducibility, collaboration and productization. They keep our local development environments consistent with each other and with the compute resources we run on. These environments’ configuration files can be used to build the code from source in a consistent way. For more details on why we need virtual environments visit this blog post.

Which virtual environment framework should I choose

All virtual environment frameworks provide isolation; some also offer dependency management and additional features. The decision on which framework to use depends on the complexity of the development environment (dependencies and other required resources) and on the framework's ease of use.

Types of virtual environments

In ISE, we often choose between venv, Conda and Poetry, depending on the project requirements and complexity.

Expected outcomes for virtual environments setup

  1. Documentation describing how to create the selected virtual environment and how to install dependencies.
  2. Environment configuration files if applicable (e.g. requirements.txt for venv, environment.yml for Conda or pyproject.toml for Poetry).
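As an illustration of how a configuration file supports consistency, the following minimal sketch compares the exact pins in a requirements.txt-style file against the packages installed in the active environment. The helper names (`parse_pins`, `find_mismatches`, `installed_versions`) are hypothetical, and the parsing is deliberately simplified: it only understands `name==version` lines and does not handle extras, environment markers or name normalization.

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Parse 'name==version' lines; skip blanks, comments and unpinned entries."""
    pins = {}
    for raw_line in requirements_text.splitlines():
        # Drop trailing comments, then surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins, installed):
    """Return {package: (expected, found)} for every pin that is not satisfied.

    `found` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, expected in pins.items():
        found = installed.get(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


def installed_versions():
    """Collect the versions of the distributions visible to this interpreter."""
    versions = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with broken metadata
            versions[name.lower()] = dist.version
    return versions


if __name__ == "__main__":
    # Assumes a requirements.txt in the current working directory.
    with open("requirements.txt", encoding="utf-8") as handle:
        problems = find_mismatches(parse_pins(handle.read()), installed_versions())
    for package, (expected, found) in sorted(problems.items()):
        print(f"{package}: expected {expected}, found {found}")
```

A check like this could run at the start of a notebook or a CI job so that a drifting environment is noticed before results are compared.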

Virtual environments benefits

Source control and folder or package structure

Applied ML projects often contain source code, notebooks, devops scripts, documentation, scientific resources, datasets and more. We recommend coming up with an agreed folder structure to keep resources tidy. Consider deciding upon a generic folder structure for projects (e.g. which contains the folders data, src, docs and notebooks), or adopt popular structures like the CookieCutter Data Science folder structure.
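An agreed structure is easier to keep when it is created by a script rather than by hand. The sketch below is one possible way to scaffold the generic layout mentioned above; the function name `scaffold_project` and the `.gitkeep` placeholder convention are assumptions for illustration, not part of any particular template.

```python
from pathlib import Path

# Folder names taken from the example above; adjust to the team's agreement.
PROJECT_FOLDERS = ("data", "src", "docs", "notebooks")


def scaffold_project(root, folders=PROJECT_FOLDERS):
    """Create the agreed folders under `root`.

    Returns the list of folders created by this call, so running it again
    on an existing project is harmless and returns an empty list.
    """
    root = Path(root)
    created = []
    for name in folders:
        folder = root / name
        if not folder.exists():
            folder.mkdir(parents=True)
            # A placeholder file lets Git track the otherwise empty folder.
            (folder / ".gitkeep").touch()
            created.append(folder)
    return created


if __name__ == "__main__":
    for path in scaffold_project("my-ml-project"):
        print(f"created {path}")
```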

Source control should be applied to allow collaboration, versioning, code reviews, traceability and backup. In data science projects, source control should be used for code, and the storing and versioning of other artifacts (e.g. data, scientific literature) should be decided upon depending on the scenario.

Folder structure and source control expected outcomes

Source control and folder structure benefits

Experiment tracking

Experiment tracking tools allow data scientists and researchers to keep track of previous experiments for better understanding of the experimentation process and for the reproducibility of experiments or models.

Types of experiment tracking frameworks

Experiment tracking frameworks differ in the features they provide for collecting experiment metadata and for comparing and analyzing experiments. In ISE, we mainly use MLflow on Databricks or Azure ML Experimentation. Note that some experiment tracking frameworks require a deployment, while others are SaaS.

Experiment tracking outcomes

  1. Decide on an experiment tracking framework.
  2. Ensure it is accessible to all users.
  3. Document the set-up on local environments.
  4. Define datasets and evaluation in a way that allows all experiments to be compared. Consistency across datasets and evaluation is paramount for experiment comparison.
  5. Ensure full reproducibility by tracking all required details (i.e. dataset names and versions, parameters, code, environment).
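To make the last outcome concrete, here is a minimal, framework-independent sketch of the details a single run might record. The `RunRecord` class and its field names are hypothetical; in practice a tracker such as MLflow or Azure ML would store this information, and the JSON output here only stands in for that.

```python
import hashlib
import json
import platform
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class RunRecord:
    """Everything needed to rerun and compare one experiment (illustrative)."""

    dataset_name: str
    dataset_version: str
    code_version: str  # e.g. a Git commit hash, supplied by the caller
    params: dict = field(default_factory=dict)
    metrics: dict = field(default_factory=dict)
    python_version: str = field(default_factory=platform.python_version)

    def fingerprint(self):
        """Hash the inputs of the run (not its metrics).

        Two runs with the same fingerprint used the same data, code,
        parameters and interpreter version, so their metrics are comparable.
        """
        inputs = asdict(self)
        inputs.pop("metrics")
        payload = json.dumps(inputs, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]

    def to_json(self):
        """Serialize the record, including its fingerprint, for storage."""
        record = {**asdict(self), "fingerprint": self.fingerprint()}
        return json.dumps(record, sort_keys=True, indent=2)


if __name__ == "__main__":
    run = RunRecord(
        dataset_name="reviews",
        dataset_version="v3",
        code_version="abc123",
        params={"learning_rate": 0.1},
        metrics={"f1": 0.8},
    )
    print(run.to_json())
```

Keeping metrics out of the fingerprint is a design choice: it lets a rerun of the same configuration be matched to the original even when the results differ slightly.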

Experiment tracking benefits

Datasets and models abstractions

By creating abstractions for building blocks (e.g., datasets, models, evaluators), we make it easy to introduce new logic into the experimentation pipeline while keeping the agreed-upon experimentation flow intact.

These abstractions can be created using different mechanisms. For example, we can use Object-Oriented Programming (OOP) solutions like abstract classes.
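A minimal sketch of such abstractions in Python, using the standard `abc` module, might look as follows. The class names, method names and the `run_experiment` function are assumptions chosen for illustration; a real project would agree on its own interfaces.

```python
from abc import ABC, abstractmethod
from collections import Counter


class Dataset(ABC):
    @abstractmethod
    def load_train(self):
        """Return (features, labels) used for fitting."""

    @abstractmethod
    def load_test(self):
        """Return (features, labels) used for evaluation."""


class Model(ABC):
    @abstractmethod
    def fit(self, features, labels):
        """Learn from the training data."""

    @abstractmethod
    def predict(self, features):
        """Return one prediction per input row."""


class Evaluator(ABC):
    @abstractmethod
    def evaluate(self, labels, predictions):
        """Return a dict mapping metric names to values."""


def run_experiment(dataset, model, evaluator):
    """The fixed experimentation flow; only the building blocks vary."""
    train_features, train_labels = dataset.load_train()
    model.fit(train_features, train_labels)
    test_features, test_labels = dataset.load_test()
    return evaluator.evaluate(test_labels, model.predict(test_features))


# Trivial implementations, useful as baselines or as stand-ins in unit tests.
class InMemoryDataset(Dataset):
    def __init__(self, train, test):
        self._train, self._test = train, test

    def load_train(self):
        return self._train

    def load_test(self):
        return self._test


class MajorityClassModel(Model):
    def fit(self, features, labels):
        # Remember the most frequent training label.
        self._majority = Counter(labels).most_common(1)[0][0]

    def predict(self, features):
        return [self._majority for _ in features]


class AccuracyEvaluator(Evaluator):
    def evaluate(self, labels, predictions):
        correct = sum(1 for y, p in zip(labels, predictions) if y == p)
        return {"accuracy": correct / len(labels)}
```

Because `run_experiment` depends only on the abstract interfaces, a new model or dataset can be swapped in without touching the flow, and instantiating an incomplete implementation fails early with a `TypeError`.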

Abstraction outcomes

  1. Different building blocks have defined APIs allowing them to be replaced or extended.
  2. Replacing building blocks does not break the original experimentation flow.
  3. Mock building blocks are used for unit tests.
  4. APIs/mocks are shared with the engineering teams for integration with other modules.

Abstraction benefits

Model evaluation

When deciding on the evaluation of the ML model/process, consider the following checklist:

Evaluation development process outcomes

  1. Evaluation strategy is agreed upon by all stakeholders.
  2. Research and discussion on various evaluation methods and metrics is documented.
  3. The code holding the logic and data structures for evaluation is reviewed and tested.
  4. Documentation on how to apply evaluation is reviewed.
  5. Performance metrics are automatically tracked into the experiment tracker.
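The last outcome can be sketched by separating metric computation from metric logging. In the example below, `log_metric` is a hypothetical callback that accepts a metric name and a value; a tracker function with a compatible signature (for example MLflow's `mlflow.log_metric`) could be passed in, but that wiring is an assumption and is not shown here. The metrics are limited to binary classification for brevity.

```python
def binary_classification_metrics(labels, predictions, positive=1):
    """Compute precision, recall and F1 for the `positive` class."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == positive and p == positive)
    fp = sum(1 for y, p in pairs if y != positive and p == positive)
    fn = sum(1 for y, p in pairs if y == positive and p != positive)

    # Fall back to 0.0 when a denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def evaluate_and_track(labels, predictions, log_metric):
    """Evaluate once and send every metric to the tracker callback."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    metrics = binary_classification_metrics(labels, predictions)
    for name, value in metrics.items():
        log_metric(name, value)
    return metrics


if __name__ == "__main__":
    # Printing stands in for a real experiment tracker.
    evaluate_and_track([1, 1, 0, 0], [1, 0, 1, 0], lambda k, v: print(k, v))
```

Passing the logger in as a parameter keeps the evaluation code easy to unit test, since a test can supply a plain dictionary setter instead of a live tracker.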

Evaluation development process benefits