Date: July 18th, 2020. Time Zone: Eastern Daylight Time (GMT-4)
8:00 am
Opening Remarks
8:15 am
Invited Talk
Reinforcement learning has seen major success in games and other artificial environments, but its applications in industry and real life are still limited. This limited applicability is partly due to the large amounts of training data that must be collected through trial and error, as well as the difficulty of effectively dealing with multiple or many agents. Diversity and negative dependence offer a promising approach to resolving some of the major challenges in today’s reinforcement learning and have gained increasing attention in recent years. In this talk, we will briefly review some of the approaches to introducing diversity in reinforcement learning, with a focus on the use of determinantal point processes for effective multi-agent reinforcement learning.
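As a rough illustration of the determinantal diversity idea mentioned in the abstract (a sketch under assumed inputs, not the speaker's method), the snippet below scores a set of agent behaviour embeddings by the log-determinant of their Gram matrix; the score grows as the embeddings become closer to mutually orthogonal, i.e. more diverse.

```python
import numpy as np

def diversity_bonus(features, eps=1e-6):
    """Log-determinant diversity score for a set of agent feature vectors.

    features: (n_agents, d) array, one row per agent's behaviour embedding
    (a placeholder assumption; the real representation is task-specific).
    Returns log det(K + eps * I) with K the Gram matrix.
    """
    K = features @ features.T                        # Gram (kernel) matrix
    _, logdet = np.linalg.slogdet(K + eps * np.eye(len(features)))
    return logdet

# Toy check: near-identical agents score lower than diverse agents.
rng = np.random.default_rng(0)
similar = np.ones((3, 4)) + 0.01 * rng.standard_normal((3, 4))
diverse = np.eye(3, 4)
print(diversity_bonus(similar), diversity_bonus(diverse))
```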
8:50 am
Invited Talk
9:25 am
Contributed Talk
Joachim Schreurs, Michaël Fanuel, Johan Suykens
By using the framework of Determinantal Point Processes (DPPs), some theoretical results concerning the interplay between diversity and regularization can be obtained. In this paper we show that sampling subsets with k-DPPs results in implicit regularization in the context of ridgeless kernel regression. Furthermore, we leverage the common setup of state-of-the-art DPP algorithms to sample multiple small subsets and use them in an ensemble of ridgeless regressions. Our first empirical results indicate that an ensemble of ridgeless regressors can be advantageous for datasets containing redundant information.
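A minimal sketch of the ensemble construction described above, under assumptions not stated in the abstract: a Gaussian RBF kernel, pseudo-inverse solves for the ridgeless fits, and, for brevity, uniform subset sampling standing in for the k-DPP sampler used in the paper.

```python
import numpy as np

def rbf(X, Z, gamma=1.0):
    """Gaussian (RBF) kernel matrix between rows of X and Z."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def ensemble_ridgeless(X, y, X_test, k=20, n_subsets=10, rng=None):
    """Average of ridgeless (pseudo-inverse) kernel interpolators,
    each fit on a small subset of the training data."""
    rng = np.random.default_rng(rng)
    preds = []
    for _ in range(n_subsets):
        # Placeholder sampler: uniform without replacement. The paper
        # instead draws these subsets from a k-DPP to encourage diversity.
        S = rng.choice(len(X), size=k, replace=False)
        K_SS = rbf(X[S], X[S])
        alpha = np.linalg.pinv(K_SS) @ y[S]          # ridgeless fit on the subset
        preds.append(rbf(X_test, X[S]) @ alpha)
    return np.mean(preds, axis=0)

# Toy usage on a 1-D regression problem.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
X_test = np.linspace(-3, 3, 50)[:, None]
print(ensemble_ridgeless(X, y, X_test, rng=1)[:5])
```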
9:40 am
Break
10:10 am
Invited Talk
DPP MAP inference, the problem of finding the highest probability set under the distribution defined by a DPP, is of practical importance for a variety of areas such as recommender systems, active learning, and data compression. Unfortunately, finding the exact MAP solution is NP-hard. Often though, the standard greedy submodular maximization algorithm works well in practice for approximating the solution. In this talk, we discuss ways to speed up this simple greedy algorithm, as well as slower, but more accurate alternatives to it. We also discuss how to scale greedy for customized DPPs, where we want to solve the MAP problem multiple times with different weightings of item features. We conclude with a brief note on the complexity of MAP for nonsymmetric DPPs, where we show that greedy scales fairly well if we assume a particular kernel decomposition.
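For reference, a naive version of the greedy algorithm discussed in the talk, applied to the DPP MAP objective log det(L_S). This is the textbook baseline rather than the accelerated variants the speaker covers, and the random kernel is a placeholder.

```python
import numpy as np

def greedy_dpp_map(L, k):
    """Naive greedy approximation to DPP MAP: repeatedly add the item that
    most increases log det(L_S). Each step recomputes determinants from
    scratch; the talk discusses much faster incremental variants."""
    n = L.shape[0]
    S = []
    for _ in range(k):
        best_item, best_val = None, -np.inf
        for i in range(n):
            if i in S:
                continue
            idx = S + [i]
            sign, val = np.linalg.slogdet(L[np.ix_(idx, idx)])
            if sign > 0 and val > best_val:
                best_item, best_val = i, val
        if best_item is None:        # no item yields a positive-definite gain
            break
        S.append(best_item)
    return S

# Toy kernel: random low-rank PSD matrix plus jitter.
rng = np.random.default_rng(0)
B = rng.standard_normal((8, 30))
L = B.T @ B + 1e-3 * np.eye(30)
print(greedy_dpp_map(L, 5))
```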
10:45 am
Invited Talk
11:20 am
Contributed Talk
Ehsan Kazemi, Shervin Minaee, Moran Feldman, Amin Karbasi
In this paper, we propose scalable methods for maximizing a regularized submodular function $f = g - \ell$, expressed as the difference between a monotone submodular function $g$ and a modular function $\ell$. Indeed, submodularity is inherently related to the notions of diversity, coverage, and representativeness. In particular, finding the mode (i.e., the most likely configuration) of many popular probabilistic models of diversity, such as determinantal point processes, submodular probabilistic models, and strongly log-concave distributions, involves the maximization of (regularized) submodular functions. Since a regularized function $f$ can potentially take on negative values, the classic theory of submodular maximization, which heavily relies on the non-negativity assumption of submodular functions, may not be applicable. To circumvent this challenge, we develop the first streaming and distributed algorithms for maximizing regularized submodular functions. Furthermore, we show how to find the mode of a strongly log-concave (SLC) distribution via regularized submodular maximization.
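A hedged sketch of the regularized objective on a toy instance: plain greedy applied to f(S) = g(S) - cost(S), where g is a coverage function and cost is modular. This is only a baseline for intuition; it is not the streaming or distributed algorithms proposed in the paper.

```python
import random

def greedy_regularized(ground, g, cost, k):
    """Plain greedy for f(S) = g(S) - cost(S) under a cardinality constraint k.
    Illustrative baseline only; the paper develops streaming and distributed
    algorithms with guarantees for this regularized setting."""
    S = set()
    for _ in range(k):
        gains = {e: (g(S | {e}) - g(S)) - cost[e] for e in ground - S}
        best = max(gains, key=gains.get)
        if gains[best] <= 0:         # f can be negative, so stop when no gain
            break
        S.add(best)
    return S

# Toy instance: g is a coverage function over a small universe, cost is modular.
random.seed(0)
universe = list(range(50))
sets = {i: set(random.sample(universe, 8)) for i in range(20)}
g = lambda S: len(set().union(*(sets[i] for i in S)))
cost = {i: 0.5 for i in range(20)}
print(greedy_regularized(set(sets), g, cost, k=10))
```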
11:35 am
Lunch Break
12:30 pm
Room 1: Submodular Maximization
Omid Sadeghi, Reza Eghbali, Maryam Fazel. Online Algorithms for Budget-Constrained DR-Submodular Maximization [paper] [poster slides]
Aytunc Sahin, Joachim Buhmann, Andreas Krause. Constrained Maximization of Lattice Submodular Functions [paper] [poster slides]
Ehsan Kazemi, Shervin Minaee, Moran Feldman, Amin Karbasi. Mode Finding for SLC Distributions via Regularized Submodular Maximization [paper] [poster slides]
Room 2: Determinantal Point Processes
Wei Chen, Faez Ahmed. MO-PaDGAN: Generating Diverse Designs with Multivariate Performance Enhancement [paper] [poster slides]
Honghua Zhang, Steven J Holtzen, Guy Van den Broeck. On the Relationship Between Probabilistic Circuits and Determinantal Point Processes [paper] [poster slides]
Joachim Schreurs, Michaël Fanuel, Johan Suykens. Ensemble Kernel Methods, Implicit Regularization and Determinantal Point Processes [paper] [poster slides]
Room 3: Inference and Bipartite Matching
Zhe Dong, Andriy Mnih, George Tucker. DisARM: An Antithetic Gradient Estimator for Binary Latent Variables [paper] [poster slides]
Saba Ahmadi, Faez Ahmed, John P Dickerson, Mark Fuge, Samir Khuller. On Diverse Bipartite b-Matching [paper] [poster slides]
Pierre-Alexandre Mattei, Jes Frellsen. Negative Dependence Tightens Variational Bounds [paper] [poster slides]
1:15 pm
Invited Talk
In this talk I’ll describe a novel approach that yields algorithms whose parallel running time is exponentially faster than any algorithm previously known for a broad range of machine learning applications. The algorithms are designed for submodular function maximization, which is the algorithmic engine behind applications such as clustering, network analysis, feature selection, Bayesian inference, ranking, speech and document summarization, recommendation systems, hyperparameter tuning, and many others. Since applications of submodular functions are ubiquitous across machine learning and datasets continue to grow, there is consistent demand for accelerating submodular optimization. The approach we describe yields simple algorithms whose parallel runtime is logarithmic in the size of the data rather than linear. I’ll introduce the frameworks we recently developed and present experimental results from various application domains.
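A toy sketch of the low-adaptivity idea behind such parallel algorithms, under assumed details: in each round, all marginal gains with respect to the current solution are computed independently (hence parallelizable), and a whole batch of high-gain elements is added at once. This is not the speaker's algorithm and carries no approximation guarantee as written.

```python
import numpy as np

def parallel_style_greedy(ground, f, k, eps=0.1):
    """Batch-per-round selection: every round evaluates the marginal gain of
    each remaining element against the same current set S (independent, hence
    parallelizable) and adds all elements above a geometrically decreasing
    threshold. Sketch only, for intuition about few-round selection."""
    S = []
    threshold = max(f([e]) for e in ground)
    while len(S) < k and threshold > eps:
        gains = {e: f(S + [e]) - f(S) for e in ground if e not in S}
        batch = [e for e, gval in gains.items() if gval >= threshold]
        S.extend(batch[: k - len(S)])                # add a batch in one round
        threshold *= (1 - eps)                       # lower the bar
    return S

# Toy objective: coverage over a universe of 100 items.
rng = np.random.default_rng(0)
cover = {i: set(rng.choice(100, size=10, replace=False)) for i in range(40)}
f = lambda S: len(set().union(*(cover[i] for i in S))) if S else 0
print(parallel_style_greedy(list(cover), f, k=8))
```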
1:50 pm
Invited Talk
A central challenge in biotechnology is to be able to predict functional properties of a protein from its sequence, and thus (i) discover new proteins with specific functionality and (ii) better understand the functional effect of genomic mutations. Experimental breakthroughs in our ability to read and write DNA allow data on the relationship between sequence and function to be rapidly acquired. This data can be used to train and validate machine learning models that predict protein function from sequence. However, the cost and latency of wet-lab experiments require methods that find good sequences in few experimental rounds, where each round contains large batches of sequence designs. In this setting, model-based optimization allows us to take advantage of sample-inefficient methods to find diverse optimal sequence candidates to be tested in the wet lab. These requirements are illustrated by a collaboration that involves the design and experimental validation of AAV capsid protein variants that assemble integral capsids and package their genome, for use in gene therapy applications.
2:25 pm
Contributed Talk
Aytunc Sahin, Joachim Buhmann, Andreas Krause
Submodular optimization over the integer lattice has many applications in machine learning. Although the constrained maximization of submodular functions with coordinate-wise concavity (also called DR-submodular functions) is well studied, the maximization of general lattice submodular functions is considerably more challenging. In this work, we first show that we can optimize lattice submodular functions subject to a discrete (integer) polymatroid constraint using a recently proposed extension, called the Generalized Multilinear Extension. Then, we establish a bound on the rounding error for the discrete polymatroid constraint, which depends on the “distance” between the lattice submodular function and a DR-submodular function. Lastly, we demonstrate the effectiveness of our algorithm on a Bayesian experimental design problem with repetition and a concave cost.
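For intuition only, a coordinate-wise greedy baseline on the integer lattice under a total budget and per-coordinate caps; the objective is an assumed toy function, and this is not the Generalized Multilinear Extension approach from the paper.

```python
import numpy as np

def lattice_greedy(f, upper, budget):
    """Coordinate-wise greedy on the integer lattice: repeatedly increment the
    coordinate with the largest marginal gain, subject to x <= upper and
    sum(x) <= budget. Illustrative baseline only; the paper instead rounds a
    continuous solution of the Generalized Multilinear Extension under a
    polymatroid constraint."""
    x = np.zeros_like(upper)
    for _ in range(budget):
        gains = [f(x + np.eye(len(x), dtype=int)[i]) - f(x)
                 if x[i] < upper[i] else -np.inf
                 for i in range(len(x))]
        i = int(np.argmax(gains))
        if gains[i] <= 0:
            break
        x[i] += 1
    return x

# Assumed toy lattice objective: weighted concave gains per coordinate
# plus a concave function of the total.
w = np.array([3.0, 2.0, 1.0, 4.0])
f = lambda x: float(np.sum(w * np.sqrt(x)) + np.sqrt(np.sum(x)))
print(lattice_greedy(f, upper=np.array([5, 5, 5, 5]), budget=10))
```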
2:40 pm
Break
3:10 pm
Invited Talk
Randomized Numerical Linear Algebra (RandNLA) is an area which uses randomness, most notably random sampling and random projection methods, to develop improved algorithms for ubiquitous matrix problems, such as those that arise in scientific computing, data science, machine learning, etc. A seemingly different topic, but one which has a long history in pure and applied mathematics, is that of Determinantal Point Processes (DPPs), which are stochastic point processes, the probability distribution of which is characterized by sub-determinants of some matrix. Recent work has uncovered deep and fruitful connections between DPPs and RandNLA. For example, random sampling with a DPP leads to new kinds of unbiased estimators for classical RandNLA tasks, enabling more refined statistical and inferential understanding of RandNLA algorithms; a DPP is, in some sense, an optimal randomized method for many RandNLA problems; and a standard RandNLA technique, called leverage score sampling, can be derived as the marginal distribution of a DPP. This work will be reviewed, as will recent algorithmic developments, illustrating that, while not quite as efficient as simply applying a random projection, these DPP-based algorithms are only moderately more expensive.
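The leverage-score connection mentioned above can be checked numerically. The sketch below (illustrative, with an assumed Gaussian feature matrix and the lambda = 1 convention) verifies that the marginals K_ii of the L-ensemble DPP with L = X X^T equal the ridge leverage scores of X.

```python
import numpy as np

# Numerical check of one connection from the talk: for an L-ensemble DPP with
# kernel L = X X^T, the marginal inclusion probabilities K_ii, where
# K = L (L + I)^{-1}, coincide with the ridge leverage scores of X at
# regularization lambda = 1. (Sketch, not the speaker's code.)
rng = np.random.default_rng(0)
X = rng.standard_normal((30, 5))
L = X @ X.T

K = L @ np.linalg.inv(L + np.eye(30))            # DPP marginal kernel
dpp_marginals = np.diag(K)

# Ridge leverage scores: tau_i = x_i^T (X^T X + I)^{-1} x_i.
tau = np.einsum('ij,jk,ik->i', X, np.linalg.inv(X.T @ X + np.eye(5)), X)

print(np.allclose(dpp_marginals, tau))           # True
```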
3:45 pm
Contributed Talk
Honghua Zhang, Steven J Holtzen, Guy Van den Broeck
Scaling probabilistic models to large realistic problems and datasets is a key challenge in machine learning. Central to this effort is the development of tractable probabilistic models (TPMs): models whose structure guarantees efficient probabilistic inference algorithms. The current landscape of TPMs is fragmented: there exist various kinds of TPMs with different strengths and weaknesses. Two of the most prominent classes of TPMs are determinantal point processes (DPPs) and probabilistic circuits (PCs). This paper provides the first systematic study of their relationship. We propose a unified analysis and shared language for discussing DPPs and PCs. Then we establish theoretical barriers for the unification of these two families, and prove that there are cases where DPPs have no compact representation as a class of PCs. We close with a perspective on the central problem of unifying these models.
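As a reminder of the kind of exact inference that makes DPPs tractable probabilistic models, the sketch below computes an L-ensemble subset log-probability, P(S) = det(L_S) / det(L + I), on a random placeholder kernel; whether such distributions admit compact probabilistic-circuit representations is exactly the question the paper studies.

```python
import numpy as np

def dpp_log_prob(L, S):
    """Log-probability of subset S under the L-ensemble DPP:
    P(S) = det(L_S) / det(L + I). Exact likelihoods like this are the kind of
    tractable query that qualifies DPPs as TPMs."""
    n = L.shape[0]
    _, log_num = np.linalg.slogdet(L[np.ix_(S, S)]) if S else (1.0, 0.0)
    _, log_den = np.linalg.slogdet(L + np.eye(n))
    return log_num - log_den

# Random placeholder kernel: low-rank PSD matrix plus jitter.
rng = np.random.default_rng(0)
B = rng.standard_normal((4, 10))
L = B.T @ B + 1e-3 * np.eye(10)
print(dpp_log_prob(L, [0, 3, 7]))
```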
4:00 pm
Panel Discussion
4:45 pm
Closing Remarks