# Schedule

Date: 20th and 21st of May, 2019.

Venue: Centre for Mathematical Sciences, Cambridge

# Monday 20th May, Day 1

12:30 – 13:30 | Registration with Lunch | ||

13:30 – 13:45 | Housekeeping/Opening remarks | ||

13:45 – 15:15 | TensorFlow Tutorial |
Chair: Hong Ge | |

"Tutorial: Edward2 for Bayesian Deep Learning" [Slides], [Extra material] | Dustin Tran (Google) | ||

"MCMC in TensorFlow Probability" | Matthew Pearce (Google) | ||

15:15 – 16:45 | Stan Tutorial |
Chair: Mark Briers | |

"Stan: A practical probabilistic programming language for Bayesian inference" | Daniel Simpson (University of Toronto) | ||

16:45 – 17:45 | Wine Reception and Networking | ||

19:30 – 22:30 | Conference Dinner at Downing College (Speakers and Organisers) |

# Tuesday 21st May, Day 2

09:00 – 09:30 | Arrival Tea and Coffee | ||

Session 1 | Chair: Petros Dellaportas |
|||

09:30 – 10:30 | Keynote: "Modern Data Oriented Programming" [Slides/material] |
Neil Lawrence (University of Sheffield, Amazon) | |

10:30 – 11:00 | "A compositional approach to scalable Bayesian computation and probabilistic programming" [Slides] | Darren Wilkinson (Newcastle University, Alan Turing Institute) | |

11:00 – 11:30 | "MultiBUGS: A parallel implementation of the BUGS modelling framework for faster Bayesian inference" [Slides] | Robert Goudie (MRC Biostatistics Unit, University of Cambridge) | |

11:30 – 12:00 | Morning Tea and Coffee | ||

Session 2 | Chair: Neil Lawrence |
|||

12:00 – 13:00 | Keynote: "Piecewise-deterministic Markov chain Monte Carlo" |
Arnaud Doucet (University of Oxford) | |

13:00 – 13:30 | "Probabilistic Programming At Uber; Pyro: a maturing Language" | Theofanis Karaletsos (Uber AI) | |

13:30 – 14:00 | "MLJ, Machine Learning in Julia" | Yiannis Simillides and Sebastian Vollmer (Warwick University, Alan Turing Institute) | |

14:00 – 15:00 | Lunch and Poster session | ||

Session 3 | Chair: Maria Skoularidou |
|||

15:00 – 16:00 | Keynote: "What are the important advances and challenges in machine learning, from this user’s perspective?" |
Andrew Gelman (Columbia University) | |

16:00 – 16:30 | "Probabilistic Programming for Agent-Based Models" | Christoforos Anagnostopoulos (Imperial College, Improbable) | |

16:30 – 17:00 | Afternoon Tea and Coffee | ||

Session 4 | Chair: Christoforos Anagnostopoulos |
|||

17:00 – 18:00 | Keynote: "From automatic differentiation to message passing" [Slides] |
Tom Minka (Microsoft Research Cambridge) | |

18:00 – 18:30 | "Building a Differentiable Programming Language" [Slides] | Mike Innes (Julia Computing) | |

18:30 – 19:00 | "Building a better language for array-oriented programming" [Slides] | Adam Paszke (PyTorch Team) | |

19:00 – 19:15 | Closing Remarks | ||

19:15 | Meeting Ends |

# Speaker Abstracts

**Speaker:** Dustin Tran (Google)

**Title:** Edward2 for Bayesian Deep Learning

**Abstract:** Probabilistic methods have expanded the scope of deep learning, with applications ranging from perceptual tasks such as image generation, to scientific challenges such as understanding how genetic factors cause diseases. In this talk, I will give an overview of the probabilistic approach to machine learning—specifically, the idea of Box’s loop which formulates the scientific method via building models, performing inference and predictions, validating the models, and repeating this loop by revising the models. I’ll tie this into our software design with Edward2, showing how to build and train large-scale deep probabilistic models.

**Speaker:** Matthew Pearce (Google)

**Title:** MCMC in TensorFlow Probability

**Abstract:** This talk will focus on the practicalities of using TensorFlow
Probability for conducting Markov Chain Monte Carlo sampling from
probability distributions. The package facilitates substantial
flexibility of modelling, supported by hardware acceleration.

**Speaker:** Daniel Simpson (University of Toronto)

**Title:** Stan: A practical probabilistic programming language for Bayesian inference

**Abstract:** Stan is a probabilistic programming language. But more than that, it’s a language specifically built to construct log-density functions and their corresponding gradients that can be used in modern MCMC or optimization methods. In this tutorial I will go through the structure of a Stan program and highlight how the density and gradient information can be extracted through two of the main interfaces RStan and PyStan to prototype new algorithms. I will also cover some of the more recent developments in the Stan universe like parallelism and GPU support.

**Speaker:** Neil Lawrence (University of Sheffield, Amazon)

**Title:** Modern Data Oriented Programming

**Abstract:** There has been a great deal of interest in probabilistic
programs: placing modeling at the heart of the programming language. In this
talk we set the scene for data oriented programming. Data is a
fundamental component of machine learning, yet the availability, quality
and discoverability of data are often ignored in formal computer
science. While languages for data manipulation exist (for example SQL),
they are not suitable for the modern world of machine learning data.
Modern data oriented languages should place data at the center of modern
digital systems design and provide an infrastructure in which monitoring
of data quality and model decision making are automatically available.
We provide the context for Modern Data Oriented Programming, and give
some insight into our initial ideas in this space.

**Speaker:** Darren Wilkinson (Newcastle University, Alan Turing Institute)

**Title:** A compositional approach to scalable Bayesian computation and probabilistic programming

**Abstract:** In the Big Data era, some kind of hierarchical “divide and conquer” approach seems necessary for the development of genuinely scalable Bayesian models and algorithms, where (solutions to) sub-problems are combined to obtain (solutions to) the full problem of interest. It is therefore unfortunate that statistical models and algorithms are not usually formulated in a composable way, and that the programming languages typically used for scientific and statistical computing fail to naturally support the composition of models, data and computation. The mathematical subject of category theory is in many ways the study of composition, and provides significant insight into the development of more compositional models of computation. Functional programming languages which are strongly influenced by category theory turn out to be much better suited to the development of scalable statistical models and algorithms than the imperative programming languages more commonly used. Expressing algorithms in a functional/categorical way is not only more elegant, concise and less error-prone, but provides numerous more tangible scalability benefits, such as automatic parallelisation and distribution of computation. Categorical concepts such as monoids, functors, monads and comonads turn out to be useful for formulating (Monte Carlo based) Bayesian inferential algorithms in a composable way. Further, probability monads form the foundation for the development of flexible and compositional probabilistic programming languages.
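As an illustration of the monoid idea mentioned in this abstract (a minimal sketch, not taken from the talk), summary statistics can be made composable by giving them an associative `combine` operation with an identity element. The `Moments` class and `summarise` helper below are hypothetical names introduced for this example; the merge formula is the standard pairwise update for counts, means and sums of squared deviations.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Moments:
    """Running count, mean and sum of squared deviations (hypothetical helper)."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the mean

    def combine(self, other: "Moments") -> "Moments":
        # Associative merge of two partial summaries; Moments() is the identity.
        if self.n == 0:
            return other
        if other.n == 0:
            return self
        n = self.n + other.n
        delta = other.mean - self.mean
        mean = self.mean + delta * other.n / n
        m2 = self.m2 + other.m2 + delta * delta * self.n * other.n / n
        return Moments(n, mean, m2)


def summarise(chunk):
    """Fold a chunk of observations into a single Moments value."""
    singletons = (Moments(1, float(x), 0.0) for x in chunk)
    return reduce(Moments.combine, singletons, Moments())


data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# Summarise everything in one pass ...
whole = summarise(data)

# ... or summarise sub-problems separately and combine the partial results.
parts = reduce(Moments.combine, [summarise(data[:2]), summarise(data[2:])], Moments())
```

Because `combine` is associative, the chunks could be summarised on separate workers and merged in any grouping, which is the sense in which sub-problem solutions compose into a solution to the full problem.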

**Speaker:** Robert Goudie (MRC Biostatistics Unit, University of Cambridge)

**Title:** MultiBUGS: A parallel implementation of the BUGS modelling framework for faster Bayesian inference

**Abstract:** BUGS is a long-running software project that makes general-purpose Bayesian modelling software available to the statistics community. It has evolved through several versions: ClassicBUGS, then WinBUGS, then OpenBUGS. In this talk, I will describe the BUGS language, and then describe the newly developed version of BUGS called MultiBUGS (https://www.multibugs.org) that is able to automatically parallelise the broad range of statistical models that can be fitted using BUGS-language software, making the dramatic speed-ups of modern multi-core computing accessible to applied statisticians, without requiring any experience of parallel programming.

**Speaker:** Arnaud Doucet (University of Oxford)

**Title:** Piecewise-deterministic Markov chain Monte Carlo

**Abstract:** A novel class of continuous-time non-reversible Markov chain Monte Carlo (MCMC) algorithms based on piecewise-deterministic processes has recently emerged. In these algorithms, the state of the Markov process evolves according to deterministic dynamics which are modified using a Markov transition kernel at random event times. These schemes enjoy remarkable properties including the ability to update only a subset of the state components while other components implicitly keep evolving and the ability to use an unbiased estimate of the gradient of the log-target while preserving the target as invariant distribution. The deterministic dynamics used so far do not exploit the geometry of the target. Moreover, exact simulation of the event times is feasible for an important yet restricted class of problems. In this talk, I will show that it is possible to introduce discrete-time non-reversible algorithms which address these shortcomings and still enjoy the remarkable properties of the continuous-time algorithms. I will demonstrate the performance of these schemes on a variety of applications including Bayesian inference for big data and Bayesian inference for high-dimensional graphical models.

**Speaker:** Theofanis Karaletsos (Uber AI)

**Title:** Probabilistic Programming At Uber; Pyro: a maturing Language

**Abstract:** I will be reviewing our PPL, Pyro, showing advances in the language such as tensor elimination and alternative backends. These changes are accompanied by examples of models which benefit from these representations within Uber, for example spatiotemporal modeling of markets and sensor fusion for mapping. I will also be teasing new results on automatic guides and discussing the weak spots from a user perspective that we aim to work on.

**Speaker:** Yiannis Simillides and Sebastian Vollmer (Warwick University, Alan Turing Institute)

**Title:** MLJ, Machine Learning in Julia

**Abstract:** In this talk we will describe our new software package called MLJ.jl, developed recently at The Alan Turing Institute. We will talk about its development, features and release to the wider Julia and ML communities. Features described will include automated tuning of hyperparameters, learning networks and the export as composite models. We will also describe efforts to integrate MLJ with other relevant software packages and if time/space permits, we will provide a quick demonstration of the package.

**Speaker:** Andrew Gelman (Columbia University)

**Title:** What are the important advances and challenges in machine learning, from this user’s perspective?

**Abstract:** Our rationale for creating and developing Stan was to fit models for applied problems that I could not otherwise easily solve, in areas ranging from climate reconstruction to pharmacology to opinion polling. I will discuss the sorts of applied research that my colleagues and I have worked on, and specific features that we would like in Stan or other machine learning languages that would facilitate future progress on these and related problems.

**Speaker:** Christoforos Anagnostopoulos (Imperial College, Improbable)

**Title:** Probabilistic Programming for Agent-Based Models

**Abstract:** The recent advance in computational Bayesian inference and probabilistic programming has focused mostly on hierarchical, semi-parametric and dynamic models where the state transition is mostly smooth and constant. This emphasis serves poorly a rich modelling sub-domain, that of agent-based modelling, wherein a population of agents is forward simulated in a way that allows non-linear, non-smooth, localised interactions between agents. Though largely intractable, agent-based modelling remains valuable because it naturally encodes prior knowledge on a number of domains, including economics and ecology; it is capable of reproducing emergent behaviour in complex systems that is hard to reason about macroscopically, and, finally, it is in principle capable of counterfactual analysis. Nevertheless, little methodological progress has been made in inference for ABMs, which remains a somewhat haphazard affair. In this talk, we present the motivation for ABMs in modern modelling, a set of challenges and a number of tentative solutions.

**Speaker:** Tom Minka (Microsoft Research Cambridge)

**Title:** From automatic differentiation to message passing

**Abstract:** Automatic differentiation is an elegant technique for converting a computable function expressed as a program into a derivative-computing program with similar time complexity. It does not execute the original program as a black-box, nor does it expand the program into a mathematical formula, both of which would be counter-productive. By generalizing this technique, you can produce efficient algorithms for constraint satisfaction, optimization, and Bayesian inference on models specified as programs. This approach can be broadly described as compiling into a message-passing program.
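To make the starting point of this abstract concrete, here is a minimal sketch of forward-mode automatic differentiation using dual numbers. It is an illustration only, not the speaker's method; the `Dual` class and the `lift`, `sin` and `derivative` helpers are hypothetical names introduced for this example. The program is neither treated as a black box nor expanded into a formula: each primitive operation propagates a value together with its derivative.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative with respect to the input (hypothetical helper)."""
    val: float
    der: float = 0.0

    def __add__(self, other):
        other = lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__


def lift(x):
    """Treat plain numbers as constants, i.e. duals with zero derivative."""
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)


def sin(x):
    """Sine with the chain rule applied to the derivative part."""
    x = lift(x)
    return Dual(math.sin(x.val), math.cos(x.val) * x.der)


def derivative(f, x):
    """Evaluate f once on a dual number seeded with derivative 1."""
    return f(Dual(float(x), 1.0)).der


def f(x):
    # An ordinary program; it runs unchanged on floats or on Dual values.
    return x * x * x + 2 * x + sin(x)


slope = derivative(f, 1.5)  # analytically 3 * 1.5**2 + 2 + cos(1.5)
```

Each arithmetic step does a constant amount of extra work, so the derivative program has the same order of cost as the original for a single input, which matches the "similar time complexity" described above.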

**Speaker:** Mike Innes (Julia Computing)

**Title:** Building a Differentiable Programming Language

**Abstract:** As researchers increasingly push the limits of frameworks like TensorFlow and PyTorch, they are looking to more powerful tools for the next generation of machine learning and statistical models. Where ML frameworks generalised machine learning beyond simple feed-forward chains to more complex architectures, new differentiable programming languages make virtually any numerical program a potential model.

This provides an elegant and expressive new way to do ML, without tradeoffs between performance and expressiveness. But more importantly, it enables a differentiable library ecosystem; algorithms in areas as diverse as colour theory, finance, physics and ray tracing can be differentiated and used directly in models, opening up exciting new research areas. This talk will cover the recent advances and open challenges in the field, with a focus on both the underlying technology and the applications for ML, data science and statistics researchers.

**Speaker:** Adam Paszke (PyTorch Team)

**Title:** Building a better language for array-oriented programming

**Abstract:** The recent surge in popularity of Python has been driven by a plethora of numerical computing libraries in its ecosystem which were the perfect fit for the rising interest in machine learning. The object central to most of those tools is a multi-dimensional array, which allows users to easily organise and manipulate complex collections of data. While some see this paradigm as limiting and overly specialised, in this talk I will try to argue that this is not necessarily the case. The problem lies mostly in the limitations of tools available today, and many recent approaches can help alleviate those issues, while providing a programming model understandable to a wide and inclusive audience.