Introduction to Bayesian Optimization
Javier González
Masterclass, 7 February 2017 @ Lancaster University
Big picture
“Civilization advances by extending the
number of important operations which
we can perform without thinking of them.”
(Alfred North Whitehead)
We are interested in optimizing data science pipelines:
Automatic model configuration.
Automatic design of physical experiments.
Agenda of the day
9:00-11:00, Introduction to Bayesian Optimization:
What is BayesOpt and why does it work?
Relevant things to know.
11:30-13:00, Connections, extensions and
applications:
Extensions to multi-task problems, constrained domains,
early-stopping, high dimensions.
Connections to multi-armed bandits and ABC.
An application in genetics.
14:00-16:00, GPyOpt LAB!: Bring your own problem!
16:30-17:30, Hot topics and current challenges:
Parallelization.
Non-myopic methods.
Interactive Bayesian Optimization.
Section I: Introduction to Bayesian Optimization
What is BayesOpt and why does it work?
Relevant things to know.
Data Science pipeline/Autonomous System
Challenges and needs for automation
[Figure: the data science pipeline and the automation each stage needs]
Data collection: optimal design.
Feature extraction: filters, dimensionality reduction.
Modelling: model tuning and configuration, code optimisation.
Results interpretation / decision: data visualisation, sequential experimentation.
Pipeline improvement / interaction with the environment.
Experimental Design - Uncertainty Quantification
Can we automate/simplify the process of designing complex experiments?
Emulator - Simulator - Physical system
Global optimization
Consider a ‘well behaved’ function f : X → R, where X ⊆ R^D is
a bounded domain. We want to find
x_M = arg min_{x ∈ X} f(x).
The explicit form of f is unknown, and f may be multimodal.
Evaluations of f may be perturbed by noise.
Evaluations of f are expensive.
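This is precisely the setting targeted in the GPyOpt lab later today. As a minimal sketch of how such a problem can be set up (the noisy, multimodal objective below is invented for illustration, and the settings are just one reasonable choice):

import numpy as np
import GPyOpt

# A made-up noisy, multimodal stand-in for the expensive f.
# GPyOpt expects f to take a 2-D array (one row per point) and
# return a column of values.
def f(x):
    return np.sin(3 * x) + x ** 2 - 0.7 * x + 0.05 * np.random.randn(*x.shape)

# Bounded domain X (here D = 1 and X = [-1, 2]).
domain = [{'name': 'x', 'type': 'continuous', 'domain': (-1, 2)}]

opt = GPyOpt.methods.BayesianOptimization(
    f=f, domain=domain,
    acquisition_type='EI',         # expected improvement
    exact_feval=False)             # tell the model evaluations are noisy
opt.run_optimization(max_iter=15)  # 15 evaluations after the initial design
print(opt.x_opt, opt.fx_opt)       # best location and value found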
Expensive functions: who doesn't have one?
Parameter tuning in ML algorithms.
Number of layers / units per layer.
Weight penalties
Learning rates, etc.
Figure source: http://theanalyticsstore.com/deep-learning
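A tuning problem like this maps naturally onto the formulation above: x collects the hyperparameters, and f(x) is, say, the validation error of the trained model. A sketch of how such a search space might be encoded as a mixed continuous/discrete GPyOpt domain (the variable names and ranges are illustrative, not prescribed):

domain = [
    {'name': 'learning_rate',   'type': 'continuous', 'domain': (1e-4, 1e-1)},
    {'name': 'weight_penalty',  'type': 'continuous', 'domain': (1e-6, 1e-2)},
    {'name': 'num_layers',      'type': 'discrete',   'domain': (1, 2, 3, 4)},
    {'name': 'units_per_layer', 'type': 'discrete',   'domain': (32, 64, 128, 256)},
]
# Here f(x) would train the network with these settings and
# return the validation error to be minimised.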