Prannay Kaul and Shaan Desai working on a Husky in Mobile Robotics Week

Current DPhil Projects

Following the first year of taught modules and mini-projects our students move on to a variety of DPhil projects.

You'll find some of our current research projects below.

2019 Cohort

Probabilistic Snapshot GNSS - Jonas Beuchert

Conventional global navigation satellite system (GNSS) receivers operate in multiple consecutive steps to estimate their position. Instead, direct position estimation (DPE) is based on a probabilistic model of the received GNSS signal and performs position estimation in one step using maximum-likelihood estimation (MLE). This approach has the potential to be robust in scenarios where conventional GNSS fails, such as low-quality signals recorded with an energy-saving low-cost device, weak signals (e.g. in a multi-path environment), or signals as short as one millisecond. Furthermore, Bayesian DPE allows prior knowledge to be integrated directly into the probabilistic model. Advancing DPE to the point where it can be employed in practice would make it possible to build GNSS receivers with significantly lower cost and energy consumption. Such devices would, for example, enable conservationists to perform more affordable wildlife tracking on a broader scale.

Robust & Transparent Machine Learning in Biomedicine - Jan Brauner

Deep learning methods excel at spotting patterns in large datasets. The increasing availability of medical datasets has thus led to many reports of deep learning models performing on par with human physicians in prediction and diagnosis tasks. In my DPhil research, I will develop and apply machine learning models in the context of biomedicine and healthcare, with a particular focus on two topics: interpretability and robustness.

Interpretability: Deep learning models are notoriously opaque: even if they give highly accurate predictions, it’s not easy to figure out how they do so. There would be two major advantages in improving our ability to interpret, explain, and understand neural networks. First, black-box algorithms are unlikely to inspire the required trust in medical decision-makers and are thus unlikely to find their way into clinical practice. Second, the ability to interpret deep neural networks could help us to capitalise on the implicit knowledge captured in these trained models, potentially elucidating the mechanisms that play a role in disease development and progression.

Robustness: Robustness to dataset shift is a central topic in many areas of machine learning. In the medical context, robustness is critical as covariate shifts are ubiquitous. Most notably, different healthcare providers serve different populations, with different demographics, baseline prevalence levels, behaviours and needs. If we ever hope to apply machine learning models in healthcare settings, they need to generalise to these different populations, or at least ‘know’ when they should be uncertain and be readily adaptable to new populations.

Sound, Automated, and Scalable Synthesis of Digital Controllers for Physical Systems - Alec Edwards

Connections between Verification and Control underpin work on the development of symbolic methods and automated techniques based on SAT/SMT theory for the synthesis of cyber-physical systems (CPS). This project will employ powerful techniques from bounded model checking and inductive synthesis (CEGIS and SyGuS) to automatically design sound digital controllers for physical plants [1,2]. The approach allows for the design and synthesis of modern control architectures, implemented over digital devices such as FPGAs, using automatic procedures that are correct by construction. The synthesis is sound with respect to the complete range of approximations related to utilising digital architectures for physical plants, including time discretisation, quantisation and saturation effects, and finite-precision arithmetic with rounding errors.

Data and Model-Based Reinforcement Learning for Performance, Requirements, & Multi-Agent Setups - James Fox

Despite many recent successes in the field of AI, AI systems can still only solve a narrow set of tasks in a restricted environment. Reinforcement learning (RL) is a machine learning technique that holds promise for achieving generality because almost all real-world cognitive tasks can be cast as a reinforcement learning problem. In such a problem, an agent is coupled with an environment and receives reward according to the action it takes in each situation. The agent must decide on a policy of actions to maximise its expected cumulative future reward.
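
As a minimal sketch of the quantity an RL agent tries to maximise, the snippet below computes the cumulative reward of a single finite episode. The discount factor `gamma` and the function name `discounted_return` are assumptions made for illustration; the project description does not specify a particular formulation.

```python
def discounted_return(rewards, gamma=0.99):
    """Cumulative future reward of one episode, seen from its first step.

    rewards: rewards received at successive time steps.
    gamma:   assumed discount factor in [0, 1]; gamma=1 gives the plain sum.
    """
    total = 0.0
    # Work backwards so each step adds its reward to the discounted future.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Three steps of reward 1 with gamma = 0.5: 1 + 0.5 + 0.25
print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))  # 1.75
```

A policy is then judged by the expectation of this quantity over the episodes it generates.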

Two key shortcomings limiting the applications of current RL systems are reward misspecification and inefficient sampling. Reward misspecification refers to the fact that it is difficult for a user to codify exactly what they want in an objective function. This can result in negative side effects or ‘reward hacking’ where an agent learns to exploit a loophole in the objective function to gain reward for undesired behaviours. RL’s inefficient sampling refers to the fact that RL agents must currently acquire vast amounts of experience before reaching any degree of competence at a task.

Inverse Reinforcement Learning (IRL) and Active Learning try to address these shortcomings. IRL seeks to determine the objective function given observations of optimal behaviour. Several approaches to IRL have recently been put forward including Maximum entropy IRL, Cooperative IRL and Bayesian IRL. The idea behind Active Learning is that if one prioritises training on data, trajectories, or samples that would result in the greatest learning effect, then one can significantly increase the sample efficiency of learning systems (including RL agents or IRL algorithms). By addressing shortcomings in existing RL systems, I will be advancing and expediting the project of creating safe and scalable RL systems to tackle real world problems and benefit humanity.

Spatial Reasoning and Planning with Deep Neural Networks - Shu Ishida

Performing tasks autonomously in novel environments is a major challenge for many applications of robotics, from indoor robots to autonomous vehicles. In classical robotics, such tasks have been separated into smaller components: localisation, mapping, planning and navigation. Techniques such as Simultaneous Localisation and Mapping (SLAM) have been popular for localisation and mapping, while optimisation, graph search and control theory have been prominent in the field of planning and navigation. While these methods can solve well-specified individual problems, their performance is upper-bounded by the assumptions made by each particular component.

Advances in neural network architectures, together with successful integration of these models into Reinforcement Learning (RL) frameworks, have transformed the field of computer vision and robotics, allowing a learning-based approach to problems traditionally solved by manually implementing insights of domain experts. Learning-based approaches have two major advantages. Firstly, learning-based algorithms can keep improving and adapting to the application domain with more availability of data, whereas manually implemented methods are fixed and do not learn to adapt. Secondly, learning-based methods are capable of automatically discovering inherent regularities and characteristics of the application domain and exploiting them to improve their performances without having such strategies hardcoded.

Deep Reinforcement Learning - Cong Lu

Deep reinforcement learning has become ubiquitous for learning control policies in challenging environments such as robotic control, Go-playing and autonomous driving. However, standard approaches are often sample inefficient, unstable, and use ad-hoc tricks that are not theoretically well justified. This project looks at deriving principled new objectives and algorithms for deep reinforcement learning through a Bayesian lens, and new interpretations of existing reinforcement learning algorithms.

Low-Cost Conservation Technology - Amanda Matthes

We are currently witnessing an ongoing mass extinction. This Anthropocene extinction is mainly caused by human activities that either directly decimate populations (e.g. overfishing and overhunting) or destroy their habitats (e.g. deforestation and pollution). Recently, human-induced climate change has additionally sped up habitat destruction, which is leading to an even greater loss of biodiversity. We hope to address this problem by providing better technology to conservation biologists. This should speed up and improve their investigations, which will allow for more effective policy decisions.

Akam Rahimi


Efficient Inference for Higher-Order Probabilistic Programs - Tim Reichelt

Many scientific models can be naturally expressed as stochastic simulators. Probabilistic programming allows users to exploit the source code information of these simulators to conduct Bayesian inference. Full-scale Bayesian inference in general stochastic simulators essentially provides users with a principled way to invert simulators based on observed data. For example, given a simulator that models disease outbreaks and some observed data we can infer the underlying latent parameters which best describe the given disease outbreak.

However, most inference algorithms in Bayesian statistics are designed for models which have a fixed dimensionality. In contrast, higher-order probabilistic programming allows the user to define models which have a variable (possibly even infinite) number of latent variables. The generality of models expressed in higher-order probabilistic programs requires the design of new inference algorithms which are sufficiently general and can exploit the program structure of the simulator. The potential impact of efficient and general Bayesian inference in these simulators would be enormous as it would allow for entirely new scientific workflows of building accurate simulators which can be inverted and improved based on observed data.

Understanding and Improving Deep Generative Models - Mrinank Sharma

The meteoric rise of the quantity of available data in the 21st century demands algorithms that can leverage this data without requiring expensive labels. Deep Generative Models (DGMs) uncover rich patterns hidden within the data by learning to generate the data. These models are used to perform several tasks including: compression; clustering; representation learning; and density estimation. Improvements to these models could be deployed across a number of real world applications.

2018 Cohort

Efficient Algorithms for Neural Network Bounds - Alessandro De Palma

Neural networks have been used to solve problems in a variety of contexts, including healthcare, autonomous driving and collision avoidance for commercial aircraft. It is crucial that these systems are robust, and verify a number of desirable properties. The knowledge of the output range for the network activations is essential to fully understand the resulting system, and many verification problems can be formulated as optimisation problems to compute network bounds.
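
To make the idea of network bounds concrete, here is a minimal sketch of interval bound propagation through a small ReLU network. This is a standard, simple bounding technique chosen for illustration and is not claimed to be the method developed in this project; the function names and the toy network are assumptions.

```python
def affine_bounds(lower, upper, weights, bias):
    """Propagate the box [lower, upper] through y = W x + b."""
    out_lower, out_upper = [], []
    for row, b in zip(weights, bias):
        lo = hi = b
        for w, l, u in zip(row, lower, upper):
            # A positive weight maps lower to lower; a negative weight swaps them.
            if w >= 0:
                lo += w * l
                hi += w * u
            else:
                lo += w * u
                hi += w * l
        out_lower.append(lo)
        out_upper.append(hi)
    return out_lower, out_upper


def relu_bounds(lower, upper):
    """ReLU is monotone, so it can be applied to both ends of each interval."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]


def network_bounds(layers, lower, upper):
    """layers: list of (weights, bias); ReLU follows every layer except the last."""
    for index, (weights, bias) in enumerate(layers):
        lower, upper = affine_bounds(lower, upper, weights, bias)
        if index < len(layers) - 1:
            lower, upper = relu_bounds(lower, upper)
    return lower, upper


# Toy two-layer network with inputs constrained to [-1, 1] x [-1, 1].
toy_layers = [
    ([[1.0, -1.0], [1.0, 1.0]], [0.0, 0.0]),
    ([[1.0, 1.0]], [0.0]),
]
print(network_bounds(toy_layers, [-1.0, -1.0], [1.0, 1.0]))  # ([0.0], [4.0])
```

Bounds obtained this way are sound but generally loose, which is one reason tighter, optimisation-based bounding algorithms are of interest.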

Learning the Laws of Physics using Deep Learning - Shaan Desai

Neural networks have a unique ability to learn and generalize from data. However, they still struggle to learn the basic laws of physics which underpin a majority of datasets e.g. moving objects in videos adhere to the equations of motion and conservation laws. This has sparked significant interest in developing physics-based priors and methods to interpret hidden layers of a neural network.

The impact of being able to bridge the divide between learning from data and theoretical laws is an ability to explain why neural networks make the decisions they do.

Understanding Deep Learning - Bryn Elesedy

The success of deep learning has played a significant role in the resurgence of interest in artificial intelligence and machine learning. Today, while large-scale practical applications and engineering feats abound, a comprehensive mathematical theory of deep learning remains elusive. While being interesting and important in its own right, a grounded understanding of deep learning is essential if modern machine intelligence is to be safe and interpretable. Moreover, increased clarity on the inner workings of deep learning algorithms has the potential to guide practitioners in advancing state-of-the-art techniques and applications.

Noncooperative Path Planning under Uncertainty - Anna Gautier

Robots owned and controlled by independent parties frequently need to coexist in the same environment. For example in a grocery store, one company may be tasked with autonomously scanning shelves while another company is tasked with autonomously cleaning the floors. The two companies are hesitant to share data and proprietary information, so there is a need for a third party to design a system that allows all companies to operate in the same space, achieve their respective individual goals, and avoid collisions.

The goal of this thesis is to design an offline protocol for noncooperative multi agent robot path planning that is both individually rational and incentive compatible. Further goals include modifying the proposal to be online, in which case it must be flexible under changes in environment and individual goals before completion and robust to uncertainty in individual transitions during path execution. Further goals may include approximate solutions with lower runtimes.

Gaussian Processes with Learnt Kernels for Probabilistic Numerical Methods in High Dimensional Spaces - Saad Hamid

The challenge of designing agents that are capable of innovation has two facets. The first is to create spaces which correspond to ideas or candidates for the task, such as possible tools for a robot reaching task. Recent work focussing on object-centric representations using VAEs enables the creation of such spaces [Bur+19]. The second is to intelligently explore this space. For this, Bayesian Optimisation offers the ideal framework; however, there are challenges. The spaces over which we want to optimise are high-dimensional, and it has historically been infeasible to fit Gaussian Processes over such spaces. However, recent advancements in learning kernels in their spectral domain have made this possible. We may be able to leverage the Fourier features thus learnt to intelligently explore the objective function.

Developing Visual Data Understanding through Self-Similarity - Prannay Kaul

Comparison between data of the same type (e.g. vectors, text files, images) is a critical task in information processing. In the context of visual data (images and videos), however, computer vision methods still lack meaningful tools for comparison relative to human visual perception. For example, a child who sees a single picture of a zebra at the end of an alphabet book will often be able to recognise the animal in a zoo, despite inevitable differences in pose, appearance, scale and other quantities. In this case the child has learnt the similarity between two very different pieces of raw visual data.

Much of the current progress of deep neural networks has relied on large human-annotated datasets, which are expensive and labour-intensive to collect. Returning to the example, current methods will struggle to understand the character of a zebra from a single example which can then be applied immediately, as a child could.

This research aims to leverage self-similarity present in natural images to solve multiple tasks in the supervised-learning framework and to develop representations of visual data which can be widely applied to multiple computer vision tasks.

Short-term objectives revolve around using self-similarity to solve the problem of class-agnostic counting and salient object subitizing, i.e. predicting the existence and the number of salient objects in an image. Provided an example of any object of interest, we aim to provide accurate counts in a collection of query images, with no/minimal fine-tuning. This could be applied in microbiology in the context of counting cell growth given different initial conditions or in zoology to count the number of a certain species at different times of the day.

Long-term research aims are to develop the current work of feature-learning using the self-supervised framework with a focus on techniques relying on self-similarity.

Information Theory in Deep Learning - Andreas Kirsch

Within active learning and deep learning, information theory holds a prominent place as motivator for recent advances and explanations of fruitful results.

We want to further advance this by finding explanations grounded in information theory for successful applications of DL, and expand and generalize these where possible. At the same time, we want to examine results and show where the semantics and intuitions fail and offer alternatives.

The guiding example of this is our success with BatchBALD, which is based on an expansion of the BALD acquisition function using a better grounded intuition for information theory. I hope to continue this line of research. 
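
For readers unfamiliar with BALD, the sketch below estimates the BALD score of a single candidate point from sampled class probabilities: the entropy of the averaged prediction minus the average entropy of the individual predictions, i.e. an estimate of the mutual information between the prediction and the model parameters. The function names and the sampled-probability input format are assumptions for illustration, and the sketch covers plain BALD only, not the batch extension.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def bald_score(sampled_probs):
    """BALD score for one input.

    sampled_probs: one class-probability vector per posterior sample
    (for example, per stochastic forward pass of the model).
    """
    n = len(sampled_probs)
    num_classes = len(sampled_probs[0])
    mean_probs = [sum(s[c] for s in sampled_probs) / n for c in range(num_classes)]
    expected_entropy = sum(entropy(s) for s in sampled_probs) / n
    return entropy(mean_probs) - expected_entropy


# Samples that confidently disagree: high score, the label would be informative.
print(bald_score([[1.0, 0.0], [0.0, 1.0]]))
# Samples that agree on being uncertain: zero score, a label resolves nothing
# about the parameters.
print(bald_score([[0.5, 0.5], [0.5, 0.5]]))
```

Scoring points one at a time like this ignores redundancy between the points selected together in a batch.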

Reinforcement Learning for the Environments with Dynamic State and Action Spaces - Vitaly Kurin

Most of the research in Reinforcement Learning (RL) assumes that an environment has a fixed state-space representation. Fixed state space implies that an agent has a viewport of a predefined size through which the agent perceives the world. This might be a fixed window of an Atari 2600 screen, or a fixed number of joints for the robot. Fixed action space means that an agent cannot have more than k actions to choose from. Fixed number of torques for robots or predefined game controllers are examples of fixed action spaces in RL.

Why is it essential to loosen the limitations of the fixed state-action space? I can see the following potential benefits. The algorithms should be able to generalize better, bringing the power of function approximation to the structured representation (i.e. as a graph) of the state-action space. We need generalization to be able to scale methods for complex real-world tasks. This might benefit even further from the permutation invariance properties of the nodes in the graph.

I plan to tackle the problem of the fixed state-action space using Graph Neural Networks. Both of my mini-projects were devoted to this topic. In the first one, I applied Graph Neural Networks to a multi-agent RL studying their generalization properties. In the second, I applied Graph Neural Networks and RL for solving boolean satisfiability problems.
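
The following sketch illustrates why a graph representation removes the dependence on a fixed input size and ordering: one round of sum-aggregation message passing, followed by a sum readout, gives the same result however the nodes are ordered and works for any number of nodes. It deliberately omits learned weights, and the function names and toy graph are assumptions for illustration rather than the architecture used in the project.

```python
def message_passing_step(features, edges):
    """Each node adds the feature vectors of its in-neighbours to its own.

    features: dict mapping node id -> feature vector (list of floats).
    edges:    list of (source, destination) pairs.
    """
    updated = {node: list(vec) for node, vec in features.items()}
    for src, dst in edges:
        for i, value in enumerate(features[src]):
            updated[dst][i] += value
    return updated


def readout(features):
    """Sum node vectors into one graph-level vector (order independent)."""
    dim = len(next(iter(features.values())))
    return [sum(vec[i] for vec in features.values()) for i in range(dim)]


# A three-node chain a - b - c, e.g. three joints of a robot.
features = {"a": [1.0, 0.0], "b": [0.0, 2.0], "c": [3.0, 1.0]}
edges = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "b")]
print(readout(message_passing_step(features, edges)))  # [8.0, 8.0]
```

Adding a fourth node only means adding an entry to `features` and the corresponding edges; nothing in the functions assumes a fixed number of nodes.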

Generative Modelling for Robot Locomotion and Manipulation - Alexander Mitchell

Deep generative models are an increasingly popular choice for representation learning. The aim of my PhD is to apply these methods to robotics. These models have shown the potential to allow robots to reason about their environments at an object-centric level. This means that robots will be able to interact naturally with their environment.

I am working on legged robot locomotion and manipulation of robotic arms. The impact of my research will be to allow legged robots to perform search and rescue or inspection tasks in environments too dangerous for humans. Work applied to robot arms will have an impact in areas from health care to manufacturing and warehousing.

The aim of this research is to improve how robots interact with and reason about the environments they work in. The objectives are to have robots perform tasks that humans take for granted, such as being able to pack delicate objects into a shipping box or walking over rough terrain.

Analysis of Robust Performance of Dynamical Systems using Sum of Squares - Matthew Newton

Research into dynamical systems is a large area and requires a variety of mathematical techniques to solve the problems it contains. This DPhil project will build upon the existing expansive research to explore various problems within dynamical systems. Sum of squares is a technique that can be used with semi-definite programming, through its use in solving linear matrix inequalities. Sum of squares is well established and has been an integral part of Prof. Papachristodoulou’s research. The technique can be employed to create algorithms that can be combined with dynamical systems to analyse system performance. The performance of the system can be modified by selecting parameters and then optimised to improve the operational efficiency of the system.

The research in this project will be an extension of the work I completed in two mini-projects, which were a part of the AIMS CDT program. The first project was titled “An Investigation into Higher-order Interactions in The Lotka-Volterra Model”, which used a sum of squares decomposition to find Lyapunov functions for a higher-order Lotka-Volterra model. The second project was titled “ADMM for Control of Mixed Traffic Flow with Human-driven and Autonomous Vehicles”, which defined a dynamical system for a multi-vehicle traffic problem and created a control strategy for the autonomous vehicle to reduce traffic congestion.

Multi-sensory Self-supervision from Ego-centric Platforms - Mandela Patrick

Today's computer vision methods need human supervision, such as object labels, to learn about the world. Humans, on the other hand, learn a great deal from associations between senses: for example, early in development, seeing a face and hearing a voice teaches us about other people’s presence and identities. Other senses are involved in this cross-correlation, such as touch (relating seen objects and their physical shapes), proprioception (associating our motion with viewpoint changes), and others. Inspired by this idea, we hope to develop models that learn about the world by finding structure in multimodal sensation, particularly from first-person/ego-centric platforms.

A Motion Primitive Approach to Data-driven and Model-predictive Optimal Control - Steffen Ridderbusch

Model-predictive optimal control has been used successfully in recent years to provide accurate and fast control for a range of complex systems. The bulk of research focusses on only a single objective, as adding further objectives can increase the effort exponentially and requires a choice regarding the trade-off. One example is the conflict between speed and energy-efficiency.

To address this issue, a new method based on motion-primitives has been proposed, which can be seen as nonlinear explicit model-predictive control. The idea is to pre-compute motion primitives, or pieces of trajectories, based on symmetries in the model of the system. This reduces the problem during runtime to selecting one of the available primitives in the motion library instead of optimizing over an appropriate function space.

This will make the method feasible for online usage.
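
As a rough sketch of the runtime step described above, the snippet below picks one primitive from a small pre-computed library by scoring each candidate, rather than solving an optimisation problem over trajectories. The one-dimensional setting, the `Primitive` fields, the scoring rule and the `energy_weight` trade-off parameter are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Primitive:
    name: str
    dx: float        # displacement produced by executing the primitive
    duration: float  # time cost
    energy: float    # energy cost


def select_primitive(library, position, goal, energy_weight=0.5):
    """Return the primitive with the lowest combined score.

    energy_weight in [0, 1] trades off energy against time (hypothetical rule).
    """
    def score(primitive):
        remaining = abs(goal - (position + primitive.dx))
        cost = (1.0 - energy_weight) * primitive.duration + energy_weight * primitive.energy
        return remaining + cost

    return min(library, key=score)


# The library would be computed offline; here it is written out by hand.
library = [
    Primitive("coast", dx=1.0, duration=1.0, energy=0.1),
    Primitive("sprint", dx=3.0, duration=1.0, energy=2.0),
]
print(select_primitive(library, position=0.0, goal=3.0).name)  # sprint
print(select_primitive(library, position=0.0, goal=1.0).name)  # coast
```

The runtime cost is a single pass over the library, which is what makes the lookup cheap compared with online optimisation.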

Quantitative Rational Verification - Thomas Steeples

Rational verification asks whether certain temporal properties hold within the equilibria of multi-agent systems. There are a number of models for this framework, including iterated boolean games, simple reactive module games and concurrent game structures. Whilst effective in verifying binary properties, such as ensuring the safety of a configuration, these lack the flexibility of differentiating between non-functional behaviours of a system.

To illustrate why this might be important, consider verifying the behaviour of some robot: there may be multiple ways of having it achieve its goal, but these may have vastly different resource (such as energy or time) consumptions. As such, it is natural to ask questions like ‘how can this fleet of autonomous cars get from point A to their respective destinations both safely and in the least amount of time?’ or ‘how can these industrial robots perform their respective functions, whilst collectively expending the least amount of battery power?’. These are real-world problems which also invoke rich questions in theoretical computer science. It is also worth noting that whilst the two examples above are framed in terms of minimising consumption of some resource, we can also consider situations where we are looking to maximise some utility or reward.

In this project, we aim to introduce non-dichotomous preferences to the framework of rational verification. We will build on the existing literature by extending the traditional temporal goals of players with mean-payoff rewards. Whilst the verification of temporal properties of systems where the agents have mean-payoff rewards has been looked at before, we believe this will be the first work to directly incorporate both temporal goals and numerical payoffs into the rewards of the agents. As such, this will give us games with a greater strategic interest than previously analysed, which also have direct applicability to the real world. We will also consider other ways of representing non-dichotomous rewards within rational verification, analysing their expressiveness and complexity, for the sake of completeness.
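
To illustrate what a mean-payoff reward measures, the sketch below evaluates it for an infinite play that eventually repeats a finite cycle of per-step rewards. For such a play the long-run average equals the average over the cycle, and the finite prefix has no effect. The prefix-plus-cycle representation and the function name are assumptions for illustration.

```python
from fractions import Fraction


def mean_payoff(prefix, cycle):
    """Long-run average reward of the play prefix . cycle . cycle . ...

    prefix: rewards received before the play starts repeating.
    cycle:  non-empty list of rewards that is then repeated forever.
    """
    if not cycle:
        raise ValueError("the repeated cycle must be non-empty")
    # The prefix contributes a bounded amount, so it vanishes in the limit.
    return Fraction(sum(cycle), len(cycle))


# A large one-off reward does not change the long-run average.
print(mean_payoff([100, -5], [1, 2, 3]))  # 2
```

This insensitivity to any finite prefix is what distinguishes mean-payoff rewards from discounted or total-sum rewards.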

Deep Learning Efficiency: a Path towards a Rational Automated Machine Learner - Filip Svoboda

Deep model architecture design in its current form is indifferent to its resource requirements. Consequently, state-of-the-art models are too large and too costly to run on most everyday hardware. This has serious negative implications for the field. First, the current situation monopolizes the technology, as only a few very large corporations and governments have the ability to deploy and monetize these models. Unless access is broadened, deep learning product development will be dominated by a small number of large players, with the predictable monopoly effects of slower innovation, higher prices, and lower quality. Second, the current state introduces serious data privacy concerns, as data needs to be transmitted and centralized. Centralization introduces a single point of attack, while transmission exposes potentially highly private data to the public. Both need to be avoided or mitigated in the design of a secure system. Finally, the current modus operandi prohibits some of the most important applications from being developed. Embedded systems in particular stand to gain the most from running models locally: this could be anything from a deep-learning-powered robotic arm in space to a smart insulin dispenser off the grid.

State Space Representation Learning - Panagiotis Tigkas

For autonomous agents to succeed, we need systems that can operate in environments of rich observations. However, such high-dimensional input spaces can make the task of perception and learning challenging. In representation learning, a fundamental concept in machine intelligence, one tries to learn a mapping from a high-dimensional space (e.g. raw pixel space) to a lower-dimensional space (features), where learning is more manageable. Problems like control (e.g. robotics) and multi-agent communication depend on “good” representations in order to solve the downstream tasks, and usually such a mapping is created manually by experts who incorporate domain knowledge (feature engineering). Recent advances in deep learning have allowed learning end-to-end control systems, where low-dimensional representations are learned via backpropagation (e.g. embed to control, world models). However, what makes a “good” representation, and how one can learn it efficiently, is an open research problem with significant impact in various fields like robotics and (single/multi-agent) autonomous systems.

An Automated Screening Tool for Spinal Cancer - Rhydian Windsor

Magnetic Resonance (MR) scans are a workhorse of clinical imaging, used in the diagnosis and prognosis of a wide range of spinal diseases. The aim of this project is to develop a tool to automatically screen such scans for signs of cancer and alert clinicians to the possibility in order to minimise the risk of them being missed. Such cases are not uncommon; two thirds of all patients diagnosed with cancer will develop some form of bone metastasis (spread of cancer to the bone) [1, 2]. Furthermore, a 2011 meta-analysis suggested expert radiologists achieve 90.6% recall/sensitivity at finding these metastases and 96% precision/specificity [3]. This means that 1 in 10 cases of these cancers are missed and 1 in 20 diagnoses are not in fact myeloma. Other than bone metastases, we would also like to explore the automated diagnosis of multiple myeloma, a blood cancer which will affect 1 in 132 people over their lifetime. Catching this disease early is vital for patients' survival odds and current guidelines suggest MRI as the primary modality for such investigations [4]. We hope to explore the use of automated analysis to achieve this goal.

2017 Cohort

Home Monitoring of Patients with Early and Late Stages of Dementia – Antigoni Alevizaki

In recent years, various changes in modern societies have resulted in a significant number of people spending a considerable amount of their day in their home environments. More and more people turn to self-employment, while businesses seem to be exploring recent studies related to increasing productivity, by allowing their employees to have flexible working times, often working from home. At the same time, the advances in medicine and the increase in life expectancy have resulted in the phenomenon of the ageing population; even though nowadays older retired adults normally have many more years to live, they are often faced with age-related diseases, such as arthritis, Parkinson’s, dementia or geriatric depression, that might keep them at home as they become more severe.

In this DPhil project, the task of monitoring human behaviour in their home environment employing widely available, low-cost and light-weight sensors is tackled. In particular, we will explore the following research directions:

  1. An algorithm for room identification, based only on BLE beacons and IMU data from smartwatches. Even though RSSI methods based on the use of smartwatches have been popular these last few years, the use of smartwatches as the main tool for tracking is not met frequently in existing literature; it is also particularly challenging, as smartwatch recordings are very noisy and also related to tasks that might be performed alongside movement intended to travel from one place to another.
  2. An algorithm to perform PDR from smartwatches; the VICON system will be used to provide ground truth positioning, and the noisy acceleration data will have to be analysed carefully to identify steps. Smartwatches have been used in PDR methods before, but only as a sensor fusion method, with smartphones or smartglasses being the primary sensing device.
  3. Motion pattern analysis at home using location and gait information. Though motion patterns have long been studied, it is either specific movements that are usually tackled, or the sensors used are either complicated networks, or intrusive (e.g., cameras), both of which are not appropriate for the privacy-preserving home environment.
  4. Should data from dementia patients become available, application of the aforementioned

On Self-Supervised Learning and where Labels are Necessary – Yuki Asano

Artificial Intelligence, and more specifically machine learning, has recently seen a huge gain in both its research impact and novel applications. This has been partly due to novel insights, more computing power, and, to a large extent, more data. Data, and the algorithms used to process it, can be briefly categorized as either having a supervisory signal (e.g. a label such as `dog', indicating a dog's presence in a picture) or being unsupervised, such as plain images or videos from the internet. In this work, we seek to understand exactly where and when rare and expensive supervisory signals are necessary to facilitate learning good models. The aim is thus to explore and extend the boundary between supervised and unsupervised learning further by developing and analyzing self-supervised learning and meta learning methods. As the field of machine vision has been at the forefront of neural network research and has standard testing data sets and well established baselines, it is well suited for starting this line of research. Developing new models will allow us to be more economical about our data, use resources for gathering human-generated labels more effectively and ultimately lead to better models and understanding.

Dynamic Motion Planning for Full-body Manipulators – Mark Finean

State-of-the-art robots still appear very 'robotic' in their movements and are generally poor at interacting with moving objects, or with static objects while the robot is moving. Overcoming these challenges should improve efficiency in industrial automation processes such as warehouse pick-and-place tasks. Human Support Robots are now at the forefront of research and becoming much more prevalent. In order to develop better relationships with robots, in particular in a care or hospital environment, these robots should appear more natural in their movements as well as be able to perform useful tasks. This research proposal will address these areas.

In this research, I propose expanding on the latest research in robotic control, such as new variants of path planning algorithms, in combination with the use of machine learning techniques. The aim will be to develop more natural and efficient movements, decrease the planning time needed and apply these techniques to real robots. This research will complement the main focus areas of research for the EPSRC such as Artificial Intelligence Technologies, Assistive Technology, and Robotics.

Deep Learning and Control Theory Based Hierarchical Underactuated Robotic Control – Siddhant Gangapurwala

Much of the research in robotic control aims to develop solutions that, depending on the environment of operation, exploit the machine’s dynamics in order to achieve a highly agile behavior. This, however, is limited by the use of traditional control techniques such as model predictive control (MPC) [1] and quadratic programming (QP) [2] which are often based on simplified rigid body dynamics and contact models. A model-based optimization strategy employed over such simplified models often results in a constrained range of solutions that do not fully exploit the versatility of the robotic system, thereby limiting the agility of the robot in question.

Treating the control of robotic systems as an RL problem enables the use of model-free algorithms that attempt to learn a policy which maximizes the expected future (discounted) reward without inferring the effects of an executed action on the environment. The authors of [3], [4] and [5] have successfully implemented these strategies for various robotic applications including control of robotic manipulators, helicopter aerobatics, and even quadrupedal locomotion. However, despite the successful implementation of these RL algorithms for the mentioned tasks, one of the main challenges faced in solving an RL problem is defining a reward function in order to learn an optimal policy resulting in a sensible robotic behavior. Often, this reward function needs to be tuned by a human expert. For tasks such as quadrupedal navigation through rough terrain, computing a reward function is also significantly more difficult than for tasks such as posture recovery, which when solved using an RL algorithm results in a near-optimal policy.
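
For reference, the "expected future (discounted) reward" mentioned above is commonly written as below. The notation is the standard textbook one and is an assumption here, not taken from the project description: \pi is the policy, \gamma \in [0, 1) the discount factor, and r(s_t, a_t) the reward received at time step t along a trajectory \tau generated by following \pi.

```latex
J(\pi) = \mathbb{E}_{\tau \sim \pi}\left[ \sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t) \right],
\qquad
\pi^{*} = \arg\max_{\pi} J(\pi)
```

The reward-design difficulty described in the paragraph corresponds to choosing r so that the maximiser \pi^{*} actually produces sensible robot behaviour.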

Deep Visuomotor Policies for Robot Manipulation – Chia-Man Hung

Robot manipulation has been one of the main drivers of robotics and a key breakthrough of the first robotic revolution, e.g. big robot arms for automation of factory manufacturing. It has been an active area since the 1960s and roboticists are still working to develop robots capable of picking and placing objects in unstructured environments. In the early days, robot manipulation was mainly about carefully prescribed hand-engineered movement sequences with no ability to adapt to changes. As time passed, robot manipulation gradually shifted to a pipeline of components that have different capabilities, e.g. task-level planning, vision, etc. However, each component is independent and cannot correct for errors propagated from other components. For instance, if a vision component has a slight error on an object position, the manipulation component would attempt to pick it up at the wrong position and would have no clue how to correct it.
Learning vision and control in an end-to-end manner gives us the opportunity to overcome this difficulty and has emerged as a new trend. We want our robot system to be able to adapt to changes and generalise to unseen environments. This is what more precise modelling and traditional control cannot achieve. On the other hand, learning-to-control approaches tend to fail due to accumulated errors in long-horizon tasks. In most deep learning settings, a prediction is made without knowing how uncertain it is. We want our robot system to be able to estimate the uncertainty of its predictions and recover from positions where it knows its prediction is uncertain. This new research field has the potential to shape the future of industrial manufacturing, assistance in daily life, etc.

Branch and Bound Methods for Neural Networks Verification – Florian Jaeckle

Despite the recent success deep learning has had in a variety of scientific fields, its use in safety-critical settings is still limited by the lack of formal verification. However, even though neural networks are generally treated as black-box methods, some progress has been made on verifying straightforward properties of simple networks. In my research I will focus on improving existing branch and bound methods that exploit the piecewise linear structure of neural networks, with the aim of being able to apply them to larger networks. Improvements can be made to all three parts of the branch and bound algorithm: the search strategy, which picks the next domain to branch on; the branching rule, which divides a given domain into non-intersecting subdomains; and the bounding methods, which estimate lower and upper bounds for each subdomain.
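
To make the three parts concrete, here is a minimal sketch of branch and bound on a toy problem: bounding the minimum output of a tiny ReLU network over an input box. It is an illustration under stated assumptions, not the method developed in this project. The choices are deliberately the simplest available: a best-first search strategy, a longest-edge branching rule, and interval arithmetic as the bounding method. All function names and the example network are hypothetical.

```python
import heapq
import numpy as np

def network(x, W1, b1, W2, b2):
    """Tiny ReLU network with one hidden layer and a scalar output."""
    return float(W2 @ np.maximum(W1 @ x + b1, 0.0) + b2)

def interval_lower_bound(lo, hi, W1, b1, W2, b2):
    """Bounding: sound (but loose) lower bound on the output over the box [lo, hi]."""
    W1p, W1n = np.maximum(W1, 0.0), np.minimum(W1, 0.0)
    h_lo = np.maximum(W1p @ lo + W1n @ hi + b1, 0.0)  # hidden lower bounds
    h_hi = np.maximum(W1p @ hi + W1n @ lo + b1, 0.0)  # hidden upper bounds
    W2p, W2n = np.maximum(W2, 0.0), np.minimum(W2, 0.0)
    return float(W2p @ h_lo + W2n @ h_hi + b2)

def branch_and_bound(lo, hi, params, tol=1e-3, max_iter=10_000):
    """Return (lower, upper) bounds on the minimum of the network over [lo, hi]."""
    # Any concrete input gives an upper bound on the minimum; use the box centre.
    best_upper = network((lo + hi) / 2, *params)
    # Heap entries: (lower bound, tie-breaker, box lower corner, box upper corner).
    heap = [(interval_lower_bound(lo, hi, *params), 0, lo, hi)]
    counter = 1
    for _ in range(max_iter):
        if not heap or best_upper - heap[0][0] <= tol:
            break
        # Search strategy: best-first, i.e. the subdomain with the smallest lower bound.
        _, _, lo_d, hi_d = heapq.heappop(heap)
        # Branching rule: split the longest edge at its midpoint.
        k = int(np.argmax(hi_d - lo_d))
        mid = (lo_d[k] + hi_d[k]) / 2
        left_hi, right_lo = hi_d.copy(), lo_d.copy()
        left_hi[k], right_lo[k] = mid, mid
        for c_lo, c_hi in ((lo_d, left_hi), (right_lo, hi_d)):
            best_upper = min(best_upper, network((c_lo + c_hi) / 2, *params))
            child_lower = interval_lower_bound(c_lo, c_hi, *params)
            # Prune subdomains that cannot contain a better minimum.
            if child_lower <= best_upper:
                heapq.heappush(heap, (child_lower, counter, c_lo, c_hi))
                counter += 1
    lower = heap[0][0] if heap else best_upper
    return min(lower, best_upper), best_upper

# Hypothetical example: f(x) = |x1| + |x2| + 0.5 written as a ReLU network,
# so the true minimum over the box is 0.5 (attained at the origin).
W1 = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
params = (W1, np.zeros(4), np.ones(4), 0.5)
lower, upper = branch_and_bound(np.array([-1.0, -1.0]), np.array([2.0, 1.0]), params)
print(lower, upper)
```

If the returned lower bound is positive, the property "the output is positive on the whole box" is verified. Tighter bounding methods and smarter branching rules reduce the number of subdomains that need to be explored, which is what matters when scaling to larger networks.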

Signal Processing and Deep Learning on Graphs for Network Data Analysis – Henry Kenlay

Modern information processing tasks typically involve data that come with not only a large volume but also increasingly complex structures. In particular, data are often collected in non-Euclidean domains such as networks and graphs, where the observations are influenced by the underlying structures as well as by the underlying dynamics at each node. For example, mobility trajectories may follow the physical constraints of the environment, and behaviours of a group of people may be influenced by the friendship among them. This poses a series of challenges to classical learning approaches, which are mostly successful on data with an underlying Euclidean or grid-like structure with a built-in notion of metric and invariance. To cope with such challenges, geometric deep learning (GDL) [1] is a branch of emerging deep learning techniques that makes use of novel concepts and ideas brought about by graph signal processing (GSP) [2], a fast-growing field by itself, to generalise classical deep learning approaches to data lying in non-Euclidean domains such as graphs and manifolds.

This project aims to develop novel signal processing and machine learning techniques within the context of GSP and GDL. In particular, owing to the infancy of the field there remain many open challenges in GDL. An example of one of these open problems which we hope to explore early on is how do we construct an underlying graph? Although many GDL techniques have been proposed, they mostly focus on building models on a predefined or known graph, but the importance of such a choice remains largely unexplored. We will aim to understand how this choice impacts the efficacy of GDL models. Furthermore, there is still considerable research to be done in exploring novel filter design on a predefined graph.

Towards Detecting and Understanding Change – Hala Lamdouar

This research aims at detecting and understanding significant spatio-temporal change from discrete observations in the form of multiple frames or video sequences. Our goal is to identify relevant differences that are not caused by signal noise, illumination alteration or camera motion. This typically involves image alignment or registration which can be described as the task of inferring correspondences and transformations that map images to the same coordinate system and therefore emphasise change. While the previous methods have shown considerable success in the well-defined cases, the problem is far from being solved in the ambiguous examples presenting texture-less regions or repetitive patterns.

Whereas change induced by camera motion is generally considered extraneous and can be compensated through robust image alignment, the notion of relevance of change may vary according to applications, which makes the problem ill-defined. In most cases, this includes significant modifications of an object's position or location with respect to its environment. Hence, another intimately connected task to our problem is that of motion segmentation. Like many areas of computer vision, this task has leveraged the great progress in CNNs especially with the recent publication of large video datasets. However, the current state-of-the-art methods still lack robustness in managing abrupt movements and occlusions; therefore they require computationally expensive post-processing. With a better understanding of the underlying geometry and incorporating robust feature matching and occlusion awareness, we can train networks to attend to relevant change without the need for further post-processing. We are particularly interested in the ambitious task of camouflage breaking which can only be addressed with such a robust model.

Machine Learning for Autonomous Driving – Robert McCraith

In the past few years machine learning techniques have seen rapid development and incredible results on tasks which were previously much more challenging. One such area is Computer Vision, where deep learning techniques are state of the art in many tasks. This has motivated many people to apply these techniques to various robotics tasks including autonomous driving, which incorporates many classical vision problems such as segmentation, classification, depth prediction, and uncertainty estimation. Developing such systems therefore both contributes to the fields of computer vision and machine learning and benefits greatly from other developments in these fields.

Deep Learning for Inverse Problems – Ben Moseley

Solving inversion problems is core to many scientific areas. In geophysics, we wish to infer properties of the Earth from seismic recordings. In medical imaging, we wish to decode biological properties from sets of electromagnetic and acoustic measurements. In robotics, we wish to intuitively understand the physics in the world around us. For many areas the associated inverse problem is well studied and challenging to solve. Often the inverse problem is underdetermined and highly non-linear, and optimisation is heavily relied upon to provide a solution. Recently, deep learning has made an impressive impact on these problems. In seismic imaging, convolutional autoencoders have been used to predict underlying velocity models given a set of wavefield measurements, in a single inference step (Wu, Lin, & Zhou, 2018). Convolutional networks have rapidly become a method of choice in medical imaging (Litjens et al., 2017). Theoretical approaches have recently been suggested for combining the power of deep learning and optimisation, for example by using a deep neural network as a regulariser (Adler & Öktem, 2017; Li, Schwab, Antholzer, & Haltmeier, 2018). Closely linked to inversion is the ability to carry out forward modelling, and deep learning has made an impact here too (Guo, Li, & Iorio, 2016).

Optimisation for Efficient Machine Vision – Alasdair Paren

Machine Vision has undergone rapid development during the last 6 years, with the state of the art on a range of benchmarks being persistently improved by new machine vision techniques. Many of these recent techniques leverage large convolutional neural networks (CNNs) that require graphics processing units (GPUs) both to train and to run at inference time because of their large computational load. However, the power, cost and space requirements of GPUs prohibit the application of these techniques in many settings.

This research aims to develop novel machine vision methods, with a focus on efficient operation. As a starting point, this research will look to develop novel methods for training binary and quantised neural networks, beginning with discrete programming relaxations for training binary neural networks.

If results comparable to modern CNNs could be replicated on low-powered CPUs such as those found in mobile devices, this would have a huge impact on the areas of self-driving cars, robotics, smart data acquisition and portable AI.

Probabilistic Inference for Reinforcement Learning and Meta-Learning – Tim Rudner

Probabilistic machine learning uses probability theory to represent and manipulate uncertainty and is based on the idea that learning can be thought of as inferring plausible models to explain observed data. This way, probabilistic methods provide a mathematically principled approach to learning that can be applied to other areas of machine learning such as reinforcement learning (RL) or meta-learning.

Probabilistic models stand to play a crucial role in a wide variety of RL problems, including: smart exploration; hierarchical RL; and model-based RL. Meta-learning also naturally lends itself to probabilistic approaches, as they allow for information about sets of models to be encoded and inferred probabilistically.

My research will aim to elucidate and bridge the gap between probabilistic inference, reinforcement learning, and meta-learning. The two main research foci will be: (i) improving the data efficiency of reinforcement learning through the use of probabilistic inference in model-based RL and meta-learning; as well as (ii) establishing optimization dualities between probabilistic inference and either RL or meta-learning. The former research focus will help open up a new range of problems to which reinforcement learning can be applied, while the latter will make training in reinforcement learning and meta-learning amenable to a wide range of probabilistic inference methods.

A Probabilistic Approach to Structure and Robustness in Machine Learning – Lewis Smith

Machine learning has made remarkable progress in recent years by exploiting 'deep' models, which promise to learn complex representations of their input, aiming to discover the underlying structure of the problem directly from data. However, despite their empirical successes, the theoretical evidence that this is actually the explanation for the success of deep models is mixed. Even in toy cases where a very simple invariance in the data exists, empirically deep models do not always infer it even in the limit of large amounts of data, showing failure to learn even simple structure. In addition, models are often sensitive to extremely small perturbations to their input, which shows that they often achieve their performance by using features not semantically relevant to the task at hand. Sometimes this is acceptable, but in other cases enforcing appropriate structure on the model will result in greater robustness and interpretability.

Probabilistic modelling is one solution to enforcing model structure. However, it is challenging: structured models are often too restrictive, and as a result are extremely difficult to fit. This is especially true for data like images, where specifying a direct likelihood over pixels is often both restrictive and artificial.

We think exploring ways both to add more flexibility to structured probabilistic models and to use newer techniques, like GANs, to learn more traditional graphical models will be a fruitful area of research.


2016 Cohort

Computer Vision for Understanding Human Communication – Daffy Afouras

Speech recognition and machine translation have been thoroughly researched in the past and continue to be a popular area, due to the large impact of their applications. Human communication, however, is multimodal and uses visual signals to complement the acoustic and linguistic information. Attempting to transcribe only one of the modalities, namely speech, often yields ambiguous results. In fact, even a perfect transcript of a speaker’s verbal expression is sometimes not enough to communicate more abstract notions such as their emotional state. In these cases, visual messages such as lip motion, gestures, body language, and facial expressions carry a great deal of information that substantially aids our understanding.

Autonomous Agents for Augmented Decision-Making – Oliver Bent

Artificial Intelligence Agents present enormous opportunities to inform decisions made by expert and non-expert humans across industries. This research develops the potential for Agents to augment complex decision-making. Complex decisions impact the future data, observations and state of the system under consideration. To achieve some confidence in the decision-making process, Agents will have to efficiently explore high-dimensional decision spaces and collaborate by sharing information.

Learning Invariant Representations with Deep Neural Networks – Fabian Fuchs

My research topic is learning invariant representations. Simply put: whereas most of deep learning is concerned with finding the important information in an input, I focus on ignoring harmful or irrelevant parts of information. This can be important to counteract biases or to better leverage structure in the data.

Inference Amortization for Probabilistic Programming – Adam Golinski

Probabilistic modelling and reasoning are widespread techniques lying on the boundary of statistics and machine learning. Probabilistic programming simplifies the use of probabilistic modelling thanks to the ease of defining generative models, and saves the effort of deriving custom inference algorithms for the model of interest thanks to the general purpose Monte Carlo or black box variational inference algorithms which are available as part of some prominent probabilistic programming languages or systems, such as Anglican.

AI+ : Applied, Interpretable Inference – Bradley Gram-Hansen

Artificial intelligence (AI) has received increasing interest over the last two decades, but is AI actually the thing that we should be working towards to solve machine intelligence? I would argue not. The very premise of AI is the ability to emulate the decision making and deductive skills of a human being. However, humans, in general, do not tend to make rational decisions. There are approaches within machine learning, such as Bayesian and Frequentist methods, that have long been used to quantify decision-making processes, enabling practitioners to make rational decisions. However, such approaches are not entirely accessible, nor entirely scalable. In order to make them both scalable and accessible we require a new approach to the current paradigm, that combines human domain knowledge and a hybrid of old and new machine learning technologies, to provide robust solutions to big problems within the machine learning and machine intelligence communities; such as the interpretability of predictions, reliable inference on small datasets and making inference globally accessible. I call this new paradigm Applied Interpretable Inference (AI+).

In order to create AI+ we need to build intelligent systems that leverage new and existing techniques to generate informed, rational decisions in an automated way. To do this we will employ probabilistic programming (PP) Gordon et al. [2014], Staton [2017], Goodman et al. [2012], Staton et al. [2016], Minka et al. [2013], Wood et al. [2014], whose aims can be seen from two vantage points. On the one hand, there is the programming languages community Staton et al. [2016], who look at how to formally define the semantics of a probabilistic programming language (PPL), which in turn enables them to analyze how programs can be transformed into something that represents a probability, or a density. This then characterizes the class of models that a program defined by a particular set of semantics can represent. On the other hand, there is the statistics and machine learning community, who look at how one can apply PPLs to the real world, via flexible systems that leverage existing inference algorithms Hoffman et al. [2013], Neal [2011], Welling and Teh [2011], Gelman et al. [2013]. By turning those inference algorithms into generalizable algorithms, the community provides an automated way to perform general-purpose inference. It is standard nomenclature to refer to the system as the thing doing the inference and the PPL as the semantics and syntax that define the rules and characteristics of a language. The combination of the two creates a probabilistic programming system (PPS). It is common among those of the latter vantage point to call a PPS a PPL; throughout this work I shall keep the distinction clear.

Unsupervised and Multi-task Learning for Computer Vision – Xu Ji

The need for large-scale manual annotations is a bottleneck for many machine learning methods that use deep neural networks, especially for computer vision problems such as image classification. Methods that are able to learn visual understanding in an unsupervised manner, i.e. without manual annotation, could be deployed in a wider range of applications, as the amount of real-world unlabelled data far exceeds that of labelled data. Catastrophic interference is another drawback of deep neural networks: learning from changing (i.e. non-stationary) distributions leads to forgetting previously learned modes of the functions being approximated. Consequently stationary distributions must be simulated for many real-world applications in computer vision and reinforcement learning, where for example video and game sequence data are both highly temporally correlated, meaning online (real time) learning and testing is inhibited. Furthermore, the need for neural networks to employ variable learning rates (few shot and episodic learning; the ability for humans to immediately retain specific observed events) is also hindered by catastrophic interference, as higher retention of new function modes equates to faster catastrophic forgetting of old ones. Solving these issues would result in making neural networks hardier: able to cope with the lack of dense manual annotation and non-stationarity that human learning can.

Learning Meaningful Embeddings of Complex Data – Shuyu Lin

Humans are able to build a profound understanding of the world and how different objects interact with each other. To develop such high-level intelligence, we rely on two important factors: 1) acute sensing systems that allow us to collect information about different aspects of the world; and 2) a strong comprehension ability that can form a systematic understanding of the information collected by our sensors. The understanding that we develop from observing a large amount of sensory information (often at a young age) enables us to achieve two tasks: 1) to predict certain properties of unseen objects (e.g. an exotic fruit is likely to be juicy and nutritious); and 2) to make a decision given some sensory observations (e.g. if I see that the cars in my lane are moving slowly, then I should switch to the other lane to overtake the traffic). Recently there has been much interest in reproducing a similar level of intelligence on machines, leading to an important research topic: artificial intelligence. To do this, we need to design a system that contains the two necessary factors that human intelligence relies on.

Affective Disorders Monitoring with Wearable Technologies – Andrea Patane

Mental health problems affect mood and the way people behave, think and react. Referred to as affective or mood disorders, this group of psychiatric diseases includes depression, bipolar disorder and anxiety disorder. With over 33 million people diagnosed, the yearly healthcare costs related to affective disorders exceed 100 billion euros. Traditionally, affective disorders have been treated through medication and psychotherapy, but over the past decades psychotherapeutic practice has been supplemented with computerised technologies.

Standing on the Shoulders of Giants: Domain and Task Transfer Reinforcement Learning – Sasha Salter

Due to the recent successful deployment of deep learning architectures in reinforcement learning (RL), the field has gained a lot of popularity as of late. Mastery of challenges such as the Atari suite and AlphaGo builds excitement as to what artificial intelligence may be able to achieve in the near future. However, this success relies on the ability to learn at low cost, often within the confines of a virtual environment, by trial and error over as many episodes as are required. In many domains, such as robotics, this presents a significant challenge. For embodied systems not only is there a cost (either monetary or execution time) associated with an episode, thereby limiting the number of training samples obtainable, but there also exist safety constraints making exploration of state space undesirable. One of the principal challenges for the future of artificial intelligence in real-world systems is therefore the ability to train agents in a safe and data-efficient manner.

Probabilistic Numerics for Reinforcement Learning – Ed Wagstaff

Reinforcement learning is an established paradigm for machine learning which has seen impressive results in recent years, creating systems with state-of-the-art performance on a range of problems[1][2]. Probabilistic numerics is an emerging field which applies probabilistic inference to numerical problems (i.e. to problems of approximation)[3]. Practical reinforcement learning algorithms often depend heavily on numerical approximations. Further improvement of reinforcement learning algorithms has the potential to improve the performance of automated systems on a broad variety of real-world problems.


2015 Cohort

Deep Learning for Large Heterogeneous Human-Centric Data – Leo Berrada

Deep generative neural networks and their conditional variants have recently witnessed a surge of interest due to their impressive ability to model very complex probability distributions, such as the modelling of human face images or human voice audio signal. However, parameter estimation for such models from large data sets and over large structured outputs remains an open area of research.

Next Best View Planning with Point Clouds for Detailed Mapping of Large Environments – Rowan Border

ORI has state-of-the-art systems for dense reconstruction and 3D localisation with autonomous ground vehicles. These systems can be leveraged to design similar capabilities for autonomous aerial vehicles. Drone operation with vision (for applications such as aerial inspection) is an important research area that has been dominated by photogrammetry techniques, which often require human control and offline processing. Aerial vehicles that can operate autonomously and provide onboard vision processing are vastly more capable and open up new possibilities. A drone with these capabilities would be able to provide an autonomous aerial inspection of a demarcated area, navigating the environment and computing a complete dense reconstruction in a closed loop.

Scalable machine learning in the presence of uncertainty – Adam Cobb

This is a study of applying new machine learning techniques to challenges which require robust measures of uncertainty. The thesis will cover novel techniques building on both Bayesian non-parametric methods and highly parametric deep neural networks. The emphasis throughout the work will be how to incorporate notions of uncertainty into real-world problems, while trying to avoid the overcomplication of models.

Bayesian Inference with Big Data – Rob Cornish

Many Bayesian methods, particularly those based on sampling, are not yet capable of handling very large datasets, which are becoming increasingly common across many scientific and engineering disciplines. My research aims to improve on this. For instance, we seek to build on recently proposed methods based on piecewise-deterministic Markov processes — such as the bouncy particle sampler — which have demonstrated scalability by providing a mechanism to subsample data correctly. We aim to extract and generalise these developments so they can be applied to a broader range of Bayesian inference tasks.

Structured value and policy learning for deep reinforcement learning – Greg Farquhar

Reinforcement learning (RL) aims to train systems that choose optimal actions given the state of their environment, by allowing agents to explore possible policies and learn from their experiences. This kind of trial-and-error learning is plagued by high variance in value estimates, non-stationarity in data distributions, and a number of other critical obstacles. Deep reinforcement learning uses deep neural networks as function approximators for policies, models, and value functions. Structure in the problems may be exploited in the architecture of these neural networks and algorithms used to train them. For example, convolutional neural networks exploit the translational invariance of the observation space to learn rapidly in visual domains. However, many aspects of the structure of RL agents and optimal policies have not been explored. Further work in this area will help develop better RL methods with applications from robotic control to logistics or predictions in financial markets.

Inference and Probabilistic Programming in Reinforcement Learning – Max Igl

Recently major advances in Reinforcement Learning for game playing, a by now widely accepted benchmark[1], have been made by using Deep Q-Learning[2] (DQN). However, current state of the art methods still struggle with the combination of visual environments and structured hierarchical tasks.

In those cases exploration using a flat policy is highly inefficient, as recurring subtasks, such as movement primitives, have to be re-learned in each situation. Several methods have been proposed to incorporate hierarchical policies, which impose structure on the search space and enable re-use of subroutines[3]-[6]. However, it is not yet clear how the visual input and higher-level policies should be combined or which higher-level policy representation should be used.

Unifying Motion Segmentation, Estimation, and Tracking for Complex Dynamic Scenes – Kevin Judd

The field of autonomous robotics is accelerating rapidly, and there has been significant research and development in visual navigation. Specifically, visual odometry (VO) addresses the challenge of estimating the egomotion of a moving camera in a largely static environment. Recently, VO approaches have been extended to scenarios where large regions of the scene are dynamic; however, these systems are still primarily focused on only estimating egomotion and estimate other motions separately or even ignore them. Knowledge of all motions of the scene gives important context for navigating safely and intelligently through an environment.

A Scalable, Robust, and Stable Approach to Signal Detection in Non-stationary Noise – Ivan Kiskin

Detecting signals in noise is a fundamental problem applicable to vastly diverse research areas. These range from potential planet discovery and trend identification in finance to disease-bearing insect detection. The latter application, aimed at battling malaria, has received attention and funding after winning the 2014 Google Impact Challenge for its strong potential societal impact. The project aims to identify mosquito swarms through a distributed network of low-cost sensors. Correct identification maximises the chances of targeting affected areas with aid. Within the scope of the project, effective detection in challenging real-world conditions is vital to the success of the overall collaboration with the Royal Botanic Gardens, Kew.

Robust Model-Based Policy Search – Kyriakos Polymenakos

During this research project we will combine machine learning (ML) techniques for constructing and tuning models and policies with formal methods and control theory. Our aim is, starting with an incomplete and uncertain model of the system dynamics, to design a controller which:

  • Refines the model by intelligently exploring the environment
  • Has verifiable properties such as safety and stability
  • Approximates or achieves optimal performance given the above constraints

ML approaches, for the most part, have been concerned with finding optimal policies rather than with guarantees about the properties of the system and its behaviour during training and in operation. On the other hand, system verification and robust control theory usually deal with model uncertainty by establishing desirable system properties and investigating whether a system respects them, but with less focus on performance.

Connections between Probabilistic Machine Learning and Systems Identification – Control Theory – Nikitas Rontsis

In recent years there has been a surge in Machine Learning techniques applied in a variety of areas, including Control Systems. These techniques require very little prior knowledge of the system under control and can adapt to changes in the system. However, they lack formal guarantees and interpretation of the resulting models and controllers, which is a well-explored topic in classical control theory and system identification. The research will focus on combining and finding connections between the two fields. We will begin by examining an industry-motivated example system for Schlumberger and apply both standard system identification and machine learning methods to derive a model. We will then try to show circumstances under which one could be a generalization of the other. Afterwards, the same will be done for the design of a controller for the identified plant model.

Distributed Model Learning with Guarantees for Dynamical Systems – Timothy Seabrook

I propose to conduct my DPhil research in the field of Distributed Learning, to take advantage of edge computing and decentralise the computational effort away from large data centres. I would like to explore the development of local dynamic models within a network of agents, to be aggregated into a global hierarchical set of models. This would accelerate learning not only by splitting work, but also by facilitating the transfer of knowledge between agents, from global sets to new local models.