# Prospective Students

Non-Pitt students: I have no influence on the graduate admissions process, for which information can be found here. Feel free to reach out, but I may not be able to help at this stage.

Pitt students: If you are interested in an undergraduate or graduate research project under my supervision, please reach out by email or in person (contact info at the bottom of the page). It is helpful if you include a brief description of your background and interests in your initial message. If you have a particular topic in mind (graph-based algorithms, deep neural networks, a specific data set that you find interesting, etc.), please mention that as well.

In general: Machine learning (ML) is a tool kit for solving problems involving big data, often referred to as Artificial Intelligence (AI). While promising, these new technologies face serious problems that must be addressed before they are ready for deployment, much as new technologies have historically. My interest is more in the underlying processes of machine learning than in enhancing the performance of AI on a specific problem.

## Possible topics

My research projects generally fall within the scope of machine learning (and most likely deep learning). My particular interests include the following:

Training of neural networks. How can we find good weights in our neural network, and if there are many good weights, which ones does a training algorithm find?

Depth separation. What kind of problems are better solved by deeper neural networks?

Convolutional neural networks. Why do CNNs outperform fully connected networks e.g. in image classification?

Generalization of neural networks. Does a neural network uncover meaningful structure in a data set, or merely memorize the particular examples we showed it during training?

Data pre-processing. Pixel-wise distances are terrible at measuring similarity between pictures, while better distances based on 'optimal transport theory' are hard to evaluate. Can we use clever pre-processing to get better performance cheaply?

Geometry of data. Often data are presented in a high-dimensional space, but have a hidden 'low-dimensional' structure. For example, English words form a small subset of all combinations of up to fifteen letters. What are good models to extract the low-dimensional structure of data and use it, if all we have is a finite set of data samples?
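To make the remark on pixel-wise distances under "Data pre-processing" concrete, here is a minimal sketch on one-dimensional toy 'images'. It assumes NumPy, and the helper names (`bump`, `pixel_distance`, `wasserstein1_1d`) are made up for this illustration. It relies on the fact that in one dimension the 1-Wasserstein distance between two histograms on a unit-spaced grid equals the L1 distance between their cumulative sums. In higher dimensions no such shortcut exists, which is where the computational cost comes from.

```python
import numpy as np

def bump(center, n=200, width=5):
    """Toy 1D 'image': a block of total mass 1 centered near `center`."""
    x = np.zeros(n)
    x[center - width:center + width] = 1.0
    return x / x.sum()

def pixel_distance(a, b):
    """Euclidean (pixel-wise) distance between two 1D images."""
    return float(np.linalg.norm(a - b))

def wasserstein1_1d(a, b):
    """1-Wasserstein distance between two histograms on a unit-spaced grid.

    Valid in one dimension only: it is the L1 distance between the
    cumulative sums of the two histograms.
    """
    return float(np.abs(np.cumsum(a) - np.cumsum(b)).sum())

reference = bump(50)
near = bump(65)    # same shape, shifted by 15 pixels
far = bump(150)    # same shape, shifted by 100 pixels

for name, img in [("near", near), ("far", far)]:
    print(name, pixel_distance(reference, img), wasserstein1_1d(reference, img))
```

Once the two blocks no longer overlap, the pixel-wise distance is the same for both shifts, whereas the transport distance grows with the size of the shift.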

If you have a particular topic in mind, I will be happy to see how it may be integrated. The range for undergraduate research projects is somewhat wider, including other topics in machine learning, but also the calculus of variations, partial differential equations, and geometric flows.

You can check out my work in the areas that you are interested in, arranged graphically in this image. For more details, please see the Publications.

## Prerequisites

The focus of my research is the foundations of deep learning. Strong analytic skills are required for most projects.

Real analysis. Most of my research involves analysis in some way, and familiarity with the results and techniques from real analysis in one and several dimensions is indispensable. This includes, for example, Sard's theorem, the regular value theorem, Lebesgue integration (including limit theorems), and basic topology.

Probability theory. Knowledge of measure-theoretic probability is preferred; familiarity with the basics (law of large numbers, central limit theorem) is required. Stochastic processes, conditional expectations, and high-dimensional statistics are useful.

For some undergraduate research projects, the requirements may be relaxed. Linear algebra and calculus are required here as well. A lot of useful resources specifically for machine learning and deep learning can be found on the course page for my class on the Principles of Deep Learning.

If you are looking for an application-oriented research project, it is useful if you already have an application in mind.

## Useful Qualifications

The following are not required, nor is it realistic to be an expert in all of them. In any research project, it is likely that one or several of them may pop up at some point. To see graphically how different areas connect to deep learning, see this image.

Programming experience. While I am happy to supervise theory-heavy projects, machine learning research often combines theory and practice. Knowledge of Python and specifically the TensorFlow or PyTorch libraries is particularly useful.

Functional analysis. Functional analysis is essentially the linear algebra of infinite-dimensional spaces. It pops up in many different places, often in connection with function spaces, and is very useful background.

Statistical learning theory. How do neural networks trained on finite data sets perform on unseen data? Particularly Rademacher complexities and concentration inequalities may be helpful.

Ordinary differential equations. Training algorithms for neural networks can often be modeled by ordinary differential equations, which are useful for gaining intuition.

Partial differential equations. The training of infinitely wide neural networks can be described by partial differential equations. Conversely, how the tools of machine learning can be used in scientific computing, which often involves solving certain PDEs, is an active area of research.

Differential geometry. Real data in applications seems to be concentrated on 'low-dimensional' subsets of a high-dimensional space. Understanding sub-manifolds of Euclidean spaces or even abstract Riemannian manifolds can be useful.

Measure theory. Measure theory is another topic that pops up in many different places, including function representation and e.g. the geometry of low-dimensional sets which are too 'rough' to be captured by manifolds. It is also foundational for probability theory.

Stochastic analysis. Stochastic gradient descent and related algorithms can be modeled by stochastic differential equations in the vanishing step-size limit.

Numerical analysis and linear algebra. This is relevant for machine learning beyond neural networks: Least squares fitting, singular value decomposition, Moore-Penrose pseudo-inverse, sparse and low-rank matrices, l1- and l2-norm, Runge phenomenon...
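As a small illustration of how several of the items just listed fit together, the sketch below solves a least squares problem by assembling the Moore-Penrose pseudo-inverse from a singular value decomposition. It assumes NumPy; the data are synthetic and the variable names are chosen for this example only. It also assumes the matrix has full column rank, so that no singular value is zero.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic overdetermined system: 20 equations, 3 unknowns.
A = rng.normal(size=(20, 3))
b = rng.normal(size=20)

# Thin singular value decomposition: A = U @ diag(s) @ Vt.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Pseudo-inverse built from the SVD. Inverting every singular value is
# only valid because A has full column rank in this example.
A_pinv = Vt.T @ np.diag(1.0 / s) @ U.T

# Least squares solution via the pseudo-inverse, and via NumPy's solver
# for comparison.
x_svd = A_pinv @ b
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

# At the minimizer, the residual is orthogonal to the columns of A.
residual = A @ x_svd - b

print(np.allclose(x_svd, x_lstsq))
print(np.allclose(A.T @ residual, 0.0, atol=1e-10))
```

For rank-deficient or badly conditioned matrices, small singular values would need to be truncated rather than inverted, which is what `np.linalg.pinv` does through its cutoff parameter.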

While knowledge of one or more of these topics may be useful, it is not required. A research project will be based on your knowledge and interests.