Blog: Facebook is Making Deep Learning Experimentation Easier With These Two New PyTorch-Based Frameworks
Streamlining the cycle from experimentation to production is one of the hardest things to achieve in modern machine learning applications. Among the deep learning frameworks on the market, Facebook-incubated PyTorch has become a favorite of the data science community for the flexibility with which it lets practitioners rapidly build models and run experiments. However, many of the challenges of experimentation in deep learning applications go beyond the capabilities of any specific framework. Data scientists' ability to evaluate different models or hyperparameter configurations is typically hindered by the expensive compute resources and time needed to run those experiments. A few days ago, Facebook open sourced two new tools designed to streamline adaptive experimentation in PyTorch applications:
- Ax: an accessible, general-purpose platform for understanding, managing, deploying, and automating adaptive experiments.
- BoTorch: a flexible, modern library for Bayesian optimization, a probabilistic method for data-efficient global optimization, built on top of PyTorch.
The goal of both tools is to lower the barrier for PyTorch developers to conduct rapid experiments and find optimal models for a specific problem. Both Ax and BoTorch are based on probabilistic models that simplify the exploration of a given environment in a machine learning problem. However, the two frameworks target different dimensions of the experimentation problem space.
BoTorch is a Bayesian Optimization library built on top of PyTorch. The goal of Bayesian Optimization is to find an optimal solution to a problem within constrained resources. Typically, Bayesian Optimization is applied to black-box optimization problems such as hyperparameter optimization for machine learning algorithms, A/B testing, as well as many scientific and engineering problems.
A Bayesian optimization problem attempts to maximize an expensive-to-evaluate black-box function f without having access to its functional form. In that context, the optimization technique evaluates f at a sequence of test points, with the hope of determining a near-optimal value after a small number of evaluations. To achieve that, a Bayesian optimization method needs a way of extrapolating its belief about what f looks like at points that have not yet been evaluated. In Bayesian optimization, this extrapolation mechanism is referred to as the surrogate model. Importantly, the surrogate model should be able to quantify the uncertainty of its predictions in the form of a posterior distribution over function values f(x) at points x.
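To make the surrogate-model idea concrete, here is a minimal NumPy sketch (an illustration, not BoTorch code) of a Gaussian process surrogate over a 1-D function. The kernel choice, length scale, and observation points are illustrative assumptions; the key behavior is that the posterior variance collapses near observed points and reverts to the prior far away from them.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between two sets of 1-D points."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP surrogate."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    K_ss = rbf_kernel(x_test, x_test)
    K_inv_y = np.linalg.solve(K, y_train)
    K_inv_Ks = np.linalg.solve(K, K_s)
    mean = K_s.T @ K_inv_y
    var = np.diag(K_ss) - np.sum(K_s * K_inv_Ks, axis=0)
    return mean, var

# The surrogate is nearly certain at observed points and reverts to
# the prior (variance ~1) far away from them.
x_train = np.array([0.0, 1.0, 2.0])
y_train = np.sin(x_train)
mean, var = gp_posterior(x_train, y_train, np.array([1.0, 5.0]))
```

A Bayesian optimization loop would query this posterior through an acquisition function to decide where to evaluate f next.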
BoTorch is the result of Facebook’s repeated work on Bayesian Optimization and the integration of these techniques into the PyTorch programming model. Conceptually, BoTorch brings a series of unique benefits compared to alternative optimization approaches.
· PyTorch Capabilities: BoTorch is built on top of the PyTorch framework and takes advantage of native capabilities such as auto-differentiation, support for highly parallelized modern hardware (such as GPUs) using device-agnostic code, and a dynamic computation graph that facilitates interactive development.
· State-Of-The-Art Modeling: BoTorch supports cutting edge probabilistic modeling in GPyTorch, including support for multitask Gaussian processes (GPs), scalable GPs, deep kernel learning, deep GPs, and approximate inference.
· Improved Developer Efficiency: BoTorch provides a simple programming model for composing Bayesian optimization primitives. Specifically, BoTorch relies on Monte Carlo-based acquisition functions, which makes it straightforward to implement new ideas without having to impose restrictive assumptions about the underlying model.
· Parallelism: BoTorch's programming model is optimized for concurrency and parallelism through batched computations, which improves its scalability on large infrastructures.
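The Monte Carlo acquisition functions and batched computations mentioned in the points above can be illustrated with a small NumPy sketch (a hand-rolled illustration, not BoTorch's actual qExpectedImprovement implementation): Expected Improvement is estimated by averaging clipped improvements over posterior samples, and a batch of q candidates is scored jointly in one vectorized pass.

```python
import numpy as np

def mc_expected_improvement(posterior_samples, best_f):
    """Monte Carlo Expected Improvement for individual candidates.

    posterior_samples: (num_mc_samples, num_points) draws from the
    surrogate model's posterior over f at the candidate points.
    """
    improvement = np.clip(posterior_samples - best_f, 0.0, None)
    return improvement.mean(axis=0)

def mc_q_expected_improvement(joint_samples, best_f):
    """Batched ("q") variant: a batch of q candidates is credited with
    the best value any of its points achieves in each joint sample."""
    batch_best = joint_samples.max(axis=1)
    return np.clip(batch_best - best_f, 0.0, None).mean()

rng = np.random.default_rng(0)
# Two candidates: one whose posterior sits above the incumbent
# best_f = 0, one far below it; EI rewards the first, ignores the second.
samples = rng.normal(loc=[1.0, -5.0], scale=0.01, size=(10000, 2))
ei = mc_expected_improvement(samples, best_f=0.0)
# A batch of q=3 candidates evaluated jointly in one vectorized pass.
joint = rng.normal(loc=[0.2, 1.5, -1.0], scale=0.01, size=(10000, 3))
qei = mc_q_expected_improvement(joint, best_f=0.0)
```

Because the estimate is just sampling plus elementwise tensor operations, a new acquisition idea needs no closed-form derivation, which is the developer-efficiency point above.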
BoTorch's design allows PyTorch developers to change, swap, or rearrange different components of a deep neural network architecture without having to rebuild the entire graph and retrain the entire model. That said, building low-level Bayesian optimization primitives is a task that requires a deep level of expertise. To address that challenge, Facebook decided to integrate BoTorch with another project that provides a simple programming model for deep learning experimentation: Ax.
Conceptually, Ax is a platform for optimizing experiments such as A/B tests, simulations, or machine learning models. Ax provides a high-level, easy-to-use API that interfaces with BoTorch, allowing developers to rapidly model and run experiments. The relationship between Ax and BoTorch is illustrated in the following diagram: while new optimization algorithms can be implemented using BoTorch primitives, Ax provides a simple API for dispatching configurations, querying the data, and evaluating results.
From the optimization standpoint, Ax can handle discrete configurations (e.g., variants of an A/B test) using multi-armed bandit optimization, as well as continuous (e.g., integer- or floating-point-valued) configurations using Bayesian optimization. Ax provides a very extensible framework that allows developers to customize all sorts of experiments for PyTorch models. From the programming model standpoint, Ax offers three main APIs:
- Loop API: This API is intended for synchronous optimization loops, where trials can be evaluated right away. With this API, optimization can be executed in a single call and experiment introspection is available once optimization is complete.
- Service API: This API can be used as a lightweight service for parameter-tuning applications where trials might be evaluated in parallel and data is available asynchronously.
- Developer API: This API is for ad-hoc use by data scientists, machine learning engineers, and researchers. The developer API allows for a great deal of customization and introspection, and is recommended for those who plan to use Ax to optimize A/B tests.
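Before looking at those APIs in code, the multi-armed bandit optimization that Ax applies to discrete configurations is worth a quick illustration. The following self-contained sketch (not Ax's implementation) uses Thompson sampling: each variant keeps a Beta posterior over its success rate, and traffic flows to whichever arm samples highest. The variant names and simulated rates are assumptions for the example.

```python
import random

def thompson_sampling(rates, num_rounds, seed=0):
    """Allocate trials among discrete variants (arms) by sampling each
    arm's plausible success rate from a Beta posterior and playing the
    arm with the highest draw.  `rates` are the true Bernoulli success
    rates (unknown to the algorithm) used to simulate feedback."""
    rng = random.Random(seed)
    n = len(rates)
    successes, failures = [1] * n, [1] * n  # Beta(1, 1) uniform priors
    pulls = [0] * n
    for _ in range(num_rounds):
        draws = [rng.betavariate(successes[i], failures[i]) for i in range(n)]
        arm = max(range(n), key=lambda i: draws[i])
        reward = 1 if rng.random() < rates[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls

# The highest-converting variant ends up receiving most of the traffic.
pulls = thompson_sampling([0.1, 0.5, 0.9], num_rounds=2000)
```

The appeal for A/B testing is that exploration is adaptive: poorly performing variants are abandoned quickly rather than being served a fixed share of traffic.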
From the programming model standpoint, the Loop API offers the greatest degree of simplicity, while the Developer API enables the highest levels of customization. Using the Loop API to optimize an unconstrained synthetic Branin function is as simple as the following code:
from ax import optimize
from ax.utils.measurement.synthetic_functions import branin

best_parameters, values, experiment, model = optimize(
    parameters=[
        {"name": "x1", "type": "range", "bounds": [-5.0, 10.0]},
        {"name": "x2", "type": "range", "bounds": [0.0, 10.0]},
    ],
    evaluation_function=lambda p: branin(p["x1"], p["x2"]),
    minimize=True,
)
The Developer API requires deeper manipulation of the Ax architecture components:
from ax import Models, ParameterType, RangeParameter, SearchSpace, SimpleExperiment
from ax.utils.measurement.synthetic_functions import branin

branin_search_space = SearchSpace(
    parameters=[
        RangeParameter(name="x1", parameter_type=ParameterType.FLOAT, lower=-5, upper=10),
        RangeParameter(name="x2", parameter_type=ParameterType.FLOAT, lower=0, upper=15),
    ]
)
exp = SimpleExperiment(
    name="branin_experiment",
    search_space=branin_search_space,
    evaluation_function=lambda p: branin(p["x1"], p["x2"]),
    objective_name="branin",
    minimize=True,
)

# Five quasi-random Sobol trials to initialize the experiment
sobol = Models.SOBOL(exp.search_space)
for i in range(5):
    exp.new_trial(generator_run=sobol.gen(1))

# Refit a Gaussian process with Expected Improvement on each iteration
best_arm = None
for i in range(15):
    gpei = Models.GPEI(experiment=exp, data=exp.eval())
    generator_run = gpei.gen(1)
    best_arm, _ = generator_run.best_arm_predictions
    exp.new_trial(generator_run=generator_run)

best_parameters = best_arm.parameters
Ax provides some clear advantages compared to other experimentation frameworks. For starters, its programming model can be used with different optimization frameworks beyond BoTorch. Additionally, Ax automates the selection of optimization routines, which reduces the effort data scientists spend fine-tuning a model. Finally, the framework is complemented by visualization tools and benchmarking suites that streamline the evaluation of optimization techniques.
Both Ax and BoTorch are widely used across different Facebook teams. The open source availability of these frameworks is a huge addition to the PyTorch ecosystem, which is already considered one of the most flexible deep learning stacks for data science experimentation. As the data science community starts experimenting with Ax and BoTorch, new ideas are likely to be incorporated into both stacks to improve the experimentation cycles of PyTorch applications.