Below are the slides & videos for our previous seminars.
November 6th, 2024
Speaker: Kevin Pratt, NYU
Title: Combinatorial aspects of matrix multiplication
Abstract: One of the major open problems in computer science to determine the exponent of matrix multiplication — the smallest number \omega such that two n-by-n matrices can be multiplied using n^{\omega +o(1)} arithmetic operations. While it is popularly conjectured that \omega = 2, we seem to be far from a proof (or disproof) of this.
In this talk I will discuss this problem in the context of the group-theoretic approach of Cohn and Umans. After giving background on this approach, I will highlight some open problems in additive combinatorics which could potentially yield \omega = 2, and discuss some recent progress towards these problems.
October 30th, 2024
Speaker: Xiao Hu, U Waterloo
Title: What is New in Join-Aggregate Query Processing?
Abstract: Join-aggregate queries defined over commutative semirings subsume a wide variety of common algorithmic problems, such as graph pattern matching, graph colorability, matrix multiplication, and constraint satisfaction problems. Developing efficient algorithms for computing join-aggregate queries in the conventional RAM model has been a holy grail in database theory. One of the most celebrated results in this area is the Yannakakis algorithm dating back to 1981. Despite its prominence as a textbook solution, no improvements in its complexity have been made over the past 40 years. In this talk, I will present the first algorithm that improves upon Yannakakis for computing acyclic join-aggregate queries. Moreover, this algorithm is proven to be output-optimal among all combinatorial algorithms. One application is an output-optimal algorithm for chain matrix multiplication over sparse matrices. Beyond combinatorial algorithms, I will also show how fast matrix multiplication can further speed up the processing of conjunctive queries, a critical subclass of join-aggregate queries. Finally, I will highlight a few interesting open problems in this area.
October 23rd, 2024
Speaker: Srijita Kundu, U Waterloo
Title: Separations in query complexity for total search problems
Abstract: We study the query complexity analogue of the class TFNP of total search problems. We give a way to convert partial functions to total search problems under certain settings; we also give a way to convert search problems back into partial functions. Our results include exponential separations between quantum query complexity and approximate degree (and in fact non-negative approximate degree), between approximate degree and the positive-weights adversary bound, for TFNP problems. Using our conversion between TFNP problems and partial functions, we give a partial function separation between quantum query complexity and approximate degree, thus reproving a result of Ambainis and Belovs (2023). Finally, lifting some of our results, we prove new separations in communication complexity.
Based on joint work with Shalev Ben-David.
October 9th, 2024
Speaker: Argyris Mouzakis, U Waterloo
Title: Private Mean Estimation with Person-Level Differential Privacy
Abstract: We study person-level differentially private (DP) mean estimation in the case where each person holds multiple samples. DP here requires the usual notion of distributional stability when all of a person’s datapoints can be modified. Informally, if n people each have m samples from an unknown d-dimensional distribution with bounded k-th moments, we show that
n = \tilde{\Theta}(d/(α^2 m) + d/(α \sqrt{m} ε) +d/{α^{k / (k − 1)} m ε} +d/ε)
people are necessary and sufficient to estimate the mean up to distance α in ℓ2-norm under ε-differential privacy (and its common relaxations). In the multivariate setting, we give computationally efficient algorithms under approximate-DP and computationally inefficient algorithms under pure DP, and our nearly matching lower bounds hold for the most permissive case of approximate DP. Our computationally efficient estimators are based on the standard clip-and-noise framework, but the analysis for our setting requires both new algorithmic techniques and new analyses. In particular, our new bounds on the tails of sums of independent, vector-valued, bounded-moments random variables may be of interest.
The paper is joint work with Sushant Agarwal, Gautam Kamath, Mahbod Majid, Rose Silver, and Jon Ullman, and is set to appear in SODA 2025. It is also available on arxiv.
October 2nd, 2024
Speaker: Hanna Komlos, NYU
Title: Nearly Optimal List Labeling
Abstract: The list-labeling problem captures the basic task of storing a dynamically changing set of up to N elements in sorted order in an array of size M=cN where c is a constant. The goal is to support insertions and deletions while moving around elements within the array as little as possible. Until recently, the best known upper bound stood at O(log^2(N)) amortized cost. This bound, which was first established in 1981, was finally improved two years ago, when a randomized O(log^{1.5}(N)) expected-cost algorithm was discovered. The best randomized lower bound for this problem remains Ω(log(N)), and closing this gap is considered to be a major open problem in data structures.
We present the See-Saw Algorithm, a randomized list-labeling solution that achieves a nearly optimal bound of O(log(N) polyloglog(N)) amortized expected cost. This bound is achieved despite at least three lower bounds showing that this type of result is impossible for large classes of solutions. This work will appear at FOCS 2024.
September 25th, 2024
Speaker: Robert Andews, U Waterloo
Title: Algebraic Pseudorandomness in VNC^0
Abstract: Polynomial identity testing (PIT) is a central problem in theoretical computer science with numerous algorithmic applications, including applications to problems with no obvious algebraic character. Known polynomial-time algorithms for PIT crucially rely on the use of randomness. Designing a deterministic polynomial-time algorithm for PIT is a major goal of algebraic complexity theory.
Often, algorithms for PIT are obtained by constructing pseudorandom objects known as hitting set generators, which are analogous to the pseudorandom generators seen in boolean derandomization problems like P versus BPP. Recently, there has been interest in constructing such hitting set generators in a “cryptographic” regime of parameters, where the complexity of the generator is very small compared to the class of algebraic circuits it fools.
In this talk, I will describe a new construction of a generator where each output can be computed by a formula of constant size and the generator itself fools algebraic circuits of polynomial size and constant depth. Under strong (but reasonable) hardness assumptions, this generator also fools polynomial-size algebraic branching programs. No background in algebra or pseudorandomness is necessary!
September 11th, 2024
Speaker: Gary Hoppenworth, U Michigan
Title: New Reductions and Upper Bounds for Distance Preservers
Abstract: Given a graph G, a distance preserver is a sparse subgraph H of G that exactly preserves shortest path distances between a small set of demand pairs P \subseteq V(G) \times V(G) in G. The study of distance preservers is primarily focused on the following question: How many edges are needed to construct a distance preserver of an n-vertex graph G with respect to a set P of p=|P| demand pairs? Distance preservers were first introduced by Coppersmith and Elkin in [SODA'06] and have connections to extremal graph theory and incidence geometry.
In the first part of this talk, we will review what is known about distance preservers. Some highlights include: 1) We will introduce a property of shortest paths known as ‘consistency’, and we will use consistency to bound the size of distance preservers by the number of edges in a graph avoiding a certain forbidden subgraph. 2) We will discuss a conjecture on distance preservers that generalizes the classical Szemeredi-Trotter Theorem of discrete geometry to finite metrics.
In the second part of this talk, we will present two new results related to distance preservers. First, we will present an extremal reduction from distance preservers in directed graphs to distance preservers in undirected graphs. This reduction has the property that any improvement in undirected distance preserver upper bounds in the regime of p \leq n would imply improved directed distance preserver upper bounds as well. Along the way, we prove the following algorithmic result: All-Pairs Shortest Paths (APSP) in directed acyclic graphs can be reduced to APSP in undirected graphs in O(m) time. Second, we will present new upper bounds for distance preservers in unweighted directed graphs. Upper bounds for distance preservers in unweighted undirected graphs were previously studied by Bodwin and Vassilevska Williams in [SODA'20].
The second part of this talk is based on joint work with Yinzhan Xu and Zixuan Xu.
August 20th, 2024
Speaker: Mika Goos, EPFL
Title: Constant-Cost Communication
Abstract: Some of the most extreme examples of the power of randomness in computing come from communication complexity, where shared randomness can allow two parties to solve non-trivial problems with communication cost independent of the input size. The textbook example is the Equality problem, where two parties hold n-bit strings x and y, respectively, and they wish to decide whether x = y. While n bits of deterministic communication are necessary, there is a randomised protocol that communicates only 2 bits. Can we characterise all communication problems that admit such hyperefficient protocols? We discuss recent progress and open problems.
Based on joint work with Yuting Fang, Nathaniel Harms, and Pooya Hatami.
November 29th, 2023
Speaker: Rachel Zhang, MIT
Title: New Codes on High Dimensional Expanders
Abstract: A code, which is a set of strings called codewords, is locally testable if one can test whether a given word is close to a codeword by reading only a few bits. Locally testable codes have been studied since the 1990s as key ingredients in the construction of probabilistically checkable proofs. In this talk, I’ll describe a new construction of locally testable codes implemented on high dimensional expanders. It has been known for several years that high dimensional expanders give rise to locally testable codes assuming the existence of a compatible base code. Our main contribution is the construction and analysis of such a base code compatible with the coset complex of Kaufman and Oppenheim.
November 22, 2023
Speaker: David Wajc, Technion
Title: Dynamic matching with (2-eps) approximation in polylog time.
Abstract: In this talk I will present dynamic algorithms with polylogarithmic update time for estimating the size of the maximum matching of a graph undergoing edge insertions and deletions with approximation ratio strictly better than 2. These algorithms answer in the affirmative (the value version of) a major open question repeatedly asked in the dynamic graph algorithms literature. I will give a high-level idea of the key ingredients needed for this result, and discuss subsequent developments.
Based on a SODA 2023 best paper, joint with Sayan Bhattacharya, Peter Kiss and Thatchaphol Saranurak.
November 15, 2023
Speaker: Harold Nieuwboer, University of Amsterdam
Title: Interior-point methods on manifolds: theory and applications
Abstract: Interior-point methods offer a highly versatile framework for convex optimization that is effective in theory and practice. A key notion in their theory is that of a self-concordant barrier. We give a suitable generalization of self-concordance to Riemannian manifolds and show that it gives the same structural results and guarantees as in the Euclidean setting, in particular local quadratic convergence of Newton’s method. We analyze a path-following method for optimizing compatible objectives over a convex domain for which one has a self-concordant barrier, and obtain the standard complexity guarantees as in the Euclidean setting. We provide general constructions of barriers, and show that on the space of positive-definite matrices and other symmetric spaces, the squared distance to a point is self-concordant. To demonstrate the versatility of our framework, we give algorithms with state-of-the-art complexity guarantees for the general class of scaling and non-commutative optimization problems, which have been of much recent interest, and we provide the first algorithms for efficiently finding high-precision solutions for computing minimal enclosing balls and geometric medians in nonpositive curvature.
Joint work with Hiroshi Hirai and Michael Walter, based on https://arxiv.org/abs/2303.04771.
October 25, 2023
Speaker: Roei Tell, University of Toronto
Title: Fiat-Shamir in the Plain Model from Derandomization
Abstract: Do efficient algorithms believe that NP = PSPACE? Under strong complexity-theoretic assumptions, we prove that every problem in PSPACE can be solved by an NP-type verifier running in polynomial time that looks correct to any efficient observer (i.e., it has computational soundness). For example, the PSPACE-complete problem TQBF can be decided by a polytime NP-type verifier V, interacting with an honest prover whose runtime is T(n) = 2^O(n), such that no adversary running in time poly(T) can find an input and a proof that mislead V (except with probability negligible in T over the adversary’s coins).
The proof uses recent techniques from non-black-box derandomization in order to compile the interactive proof system underlying IP = PSPACE into a deterministic (non-interactive) argument system. This is an instance from a broader set of results applying derandomization-based techniques and assumptions to instantiate (versions of) the Fiat-Shamir heuristic in cryptography, in the plain model and without (or with weak) cryptographic assumptions.
From an upcoming joint work with Lijie Chen and Ron Rothblum.
October 18, 2023
Speaker: Cameron Seth, University of Waterloo
Title: Testing Graph Properties with the Container Method
Abstract: We establish nearly optimal sample complexity bounds for testing the clique property in the dense graph model. Specifically, we show that it is possible to distinguish graphs on n vertices that have a clique of size rho n from graphs for which at least epsilon n^2 edges must be added to form a rho n clique by sampling and inspecting a random subgraph on only O(rho^3/epsilon^2) vertices.
The new bounds for testing the clique property are obtained via a new extension of the graph container method. This method has been an effective tool for tackling various problems in graph theory and combinatorics. Our results demonstrate that it is also a powerful tool for the analysis of property testing algorithms.
This is joint work with Eric Blais.
October 13, 2023
Speaker: Mrinal Kumar, TIFR
Title: Fast multivariate multipoint evaluation
Abstract: Multipoint evaluation is the computational task of evaluating a polynomial given as a list of coefficients at a given set of inputs. A straightforward algorithm for this problem is to just iteratively evaluate the polynomial at each of the inputs. The question of obtaining faster-than-naive (and ideally, close to linear time) algorithms for this problem is a natural and basic question in computational algebra. In addition to its own inherent interest, faster algorithms for multipoint evaluation are closely related to fast algorithms for other natural algebraic questions like polynomial factorization and modular composition.
In this talk, I will briefly survey the state of art for this problem, and discuss some recent improvements and applications.
October 11, 2023
Speaker: Kam Chuen (Alex) Tung, University of Waterloo
Title: Cheeger Inequalities for Directed Graphs and Hypergraphs using Reweighted Eigenvalues
Abstract: We derive Cheeger inequalities for directed graphs and hypergraphs using the reweighted eigenvalue approach that was recently developed for vertex expansion in undirected graphs. The goal is to develop a new spectral theory for directed graphs and an alternative spectral theory for hypergraphs. The first main result is a Cheeger inequality relating the vertex expansion of a directed graph to the vertex-capacitated maximum reweighted second eigenvalue. This provides a combinatorial characterization of the fastest mixing time of a directed graph by vertex expansion, and builds a new connection between reweighted eigenvalues, vertex expansion, and fastest mixing time for directed graphs. The second main result is a stronger Cheeger inequality relating the edge conductance of a directed graph to the edge-capacitated maximum reweighted second eigenvalue. This provides a certificate for a directed graph to be an expander and a spectral algorithm to find a sparse cut in a directed graph, playing a similar role as Cheeger’s inequality in certifying graph expansion and in the spectral partitioning algorithm for undirected graphs. We also use this reweighted eigenvalue approach to derive the improved Cheeger inequality for directed graphs, and furthermore to derive several Cheeger inequalities for hypergraphs that match and improve the existing results. These are supporting results that this provides a unifying approach to lift the spectral theory for undirected graphs to more general settings.
Joint work with Lap Chi Lau and Robert Wang. arXiv version
October 4, 2023
Speaker: Kaave Hosseini, University of Rochester
Title: Notions of “rank” for boolean matrices
Abstract: In this talk I will discuss a few well-known complexity parameters for boolean matrices that are relaxations of rank (over the reals). These are approximate rank, sign-rank/dimension complexity, margin/discrepancy, gamma2 norm, and approximate-gamma2. The focus of this talk is to study the meta question: “what is the relationship between these parameters?”. Surprisingly, this meta question is not yet fully understood. I will try to answer some of these pairwise relations using different tools such Fourier analysis, topology, and also ideas from discrete geometry.
It turns out that study of this meta question connects many different branches of mathematics and equivalent stories are to be told in learning theory, communication complexity, convex geometry, theory of dimensionality reduction, etc. For example, in learning theory one related question is: if a binary concept class is learnable via large margin classifiers (such as SVM) does it imply that it is learnable via linear programming, and vice versa? In communication complexity, one question is whether one can turn a public-randomness protocol into a private-randomness protocol without a super constant overhead, if one allows the error probability to get arbitrarily close to 1/2.
September 27, 2023
Speaker: Sepehr Assadi, University of Waterloo
Title: Hidden Permutations to the Rescue: Multi-Pass Semi-Streaming Lower Bounds for Approximate Matchings
Abstract: A semi-streaming (graph) algorithm processes its input graph by making one or a few passes over its edges and using a space proportional to the number of vertices, hence, (potentially) quadratically smaller than the input size. Semi-streaming algorithms have been at the forefront of theoretical research on processing massive graphs in recent years. In this talk, we will consider the maximum (bipartite) matching problem in the semi-streaming model.
There is a large body of semi-streaming algorithms for the maximum matching problem, developed over the last two decades, that achieve a (1+eps)-approximation in progressively smaller number of passes as a function of eps and size of the graph. In sharp contrast to this progress however, there is scarcely any lower bounds known for this problem, beyond single-pass and very recently two-pass algorithms. For instance, it is not even ruled out whether one can achieve a (1+eps)-approximation in a fixed constant number of passes, say, even three passes.
In this talk, we present the first pass-approximation tradeoff for (small) constant-factor approximation of semi-streaming matchings: semi-streaming (1-ε)-approximation of bipartite matching (even its size) requires Omega(log(1/ε)) passes under a natural combinatorial hypothesis that moderately dense Ruzsa-Szemeredi graphs do exist. The key to our analysis is a novel way of analyzing communication complexity of hiding permutations directly, instead of relying on tools from Boolean function analysis of prior work in this context. This maybe of its own independent interest.
The talk will be self-contained and no prior background in streaming algorithms or communication complexity will be needed to enjoy this talk!
This is joint work with Janani Sundaresan (to appear in FOCS 2023).
September 25, 2023
Speaker: Akshay Ramachandran, CWI
Title: “A Tale of Two Subspaces” [Regev]
Abstract: Optimization problems with orthogonality constraints arise in many fields in science and engineering. For example, optimization over subspaces in physics and signal processing, and over rotations in computational geometry. The key step in solving these problems often boils down to understanding the relation between two subspaces. It turns out that this question has a surprisingly elegant answer given by the CS decomposition from numerical linear algebra. In this talk, we will discuss the CS decomposition in the context of the geodesic geometry of subspaces. This perspective further gives unifying framework for understanding the various numerical algorithms used to solve problems with orthogonality constraints. As our main illustration, we study the problem of computing eigenspaces of a matrix, giving a rigorous convergence analysis of the well-known power method and its subspace generalization.
If time permits, we also present some new results on tractable algorithms for constrained optimization over rotation matrices. This is joint work with Kevin Shu and Alex Wang.
December 06, 2022
Speaker: Robert Andrews, University of Illinois Urbana-Champaign
Title: On Matrix Multiplication and Polynomial Identity Testing
Abstract: Determining the complexity of matrix multiplication is a fundamental problem of theoretical computer science. It is popularly conjectured that ω, the matrix multiplication exponent, equals 2. If true, this conjecture would yield fast algorithms for a wide array of problems in linear algebra and beyond. If instead ω > 2, can we leverage the hardness of matrix multiplication to design algorithms for other problems?
In this talk, I will describe how lower bounds on ω can be used to make progress on derandomizing polynomial identity testing.
November 11, 2022
Speaker: Iden Kalemaj, Boston University
Title: Isoperimetric Inequalities for Real-Valued Functions with Applications to Monotonicity Testing
Abstract: We study sublinear algorithms for monotonicity of real-valued functions. An algorithm for testing monotonicity queries a function at a few locations and distinguishes between when the function is monotone and when it is far from monotone. An algorithm for approximating the distance to monotonicity returns the distance to monotonicity of the function with some additive and multiplicative error. We improve previous algorithms for these two tasks by showing that several isoperimetric inequalities about Boolean functions on the hypercube can be generalized to real-valued functions
A celebrated line of work [Margulis ‘74, Talagrand ‘93, Chakrabarty and Seshadhri ‘13, Khot Minzer Safra ‘15] studied the size of the ‘‘boundary’’ between the points on which a Boolean function takes value 0 and the points on which it takes value 1. This boundary is defined in terms of the edges of the d-dimensional hypercube, where the edges can be directed or undirected. In particular, the inequality of Khot, Minzer, and Safra ‘15 implies all previous inequalities. It bounds the average square root of the number of decreasing edges incident on a vertex of the hypercube by the distance to monotonicity of the function. We generalize all the inequalities in this line of work to real-valued functions. Our main contribution is a Boolean decomposition technique that represents a real-valued function as a collection of Boolean functions that roughly capture the distance to monotonicity of the original function. In this talk, we prove the Boolean decomposition theorem and touch on the improved algorithms for monotonicity testing and distance approximation.
This is joint work with Hadley Black and Sofya Raskhodnikova.
October 28, 2022
Speaker: Shubhangi Saraf, University of Toronto
Title: New Lower Bounds for Set Multilinear Formulas
Abstract: The recent exciting breakthrough by Limaye, Srinivasan, and Tavenas showing superpolynomial lower bounds for constant-depth algebraic circuits has underscored the importance of studying the complexity of set multilinear formulas. In this talk, along with discussing the impact of this work and several others in the literature, I will discuss some new results in this area, namely strong (and “sharp”) lower bounds for set-multilinear circuits/formulas in the constant (or low) depth setting.
Joint work with Deepanshu Kush.
October 21, 2022
Speaker: Akshay Ramachandran, CWI
Title: The Brascamp-Lieb polytope: matroid matching and rank of matrix spaces
Abstract: Given a linear subspace of matrices $\mathcal{A} := \langle A_{1}, …, A_{m} \rangle \subseteq \mathbb{F}^{n \times n}$, Edmond’s problem asks to decide whether there is a matrix in $\mathcal{A}$ of full rank. This can be rephrased as finding the rank of the matrix $A := \sum_{i=1}^{m} x_{i} A_{i}$ over the field of rational functions $\mathbb{F}(x_{1}, …, x_{m})$.
This linear algebraic problem turns out to connect to various problems in combinatorial optimization: maximum matching in graphs, linear matroid intersection, and their common generalization linear matroid matching [Lovasz, 1980].
By the Schwarz-Zippel lemma, there is a simple randomized algorithm to answer this question, but derandomizing this result is a major open question that would be an important breakthrough in complexity theory [KI04].
A related variant, known as the non-commutative Edmond’s problem, asks to compute the rank of $A$ over the “free skew field” in which the variables $x$ no longer commute. Surprisingly, recent work has shown that this seemingly more complicated problem can in fact be solved by a deterministic algorithm in polynomial time.
In this talk, we will discuss one of these algorithms and a recent result [Oki, Soma] relating this problem to the Brascamp-Lieb Theorem from functional analysis. In particular, they show that the algorithm for non-commutative rank can be used to optimize over the Brascamp-Lieb polytope for rank 2 instances, which gives the first truly polynomial time algorithm to maximize the fractional relaxation of linear matroid matching [GP08].
This problem can be considered as a natural “higher rank” generalization of optimization over linear matroid polytopes.
August 3, 2022
Speaker: Stephen Melczer, University of Waterloo
Title: Computability and Complexity in Analytic Combinatorics
Abstract: Enumerative combinatorics studies discrete objects by capturing aspects of their behaviour (such as the number of objects of a given size) using sequences. In this talk we explore how to combine pure mathematical tools with computational methods to answer questions about the computability and complexity of asymptotic behaviour for sequences under different encodings that arise frequently in combinatorics.
Applications discussed will include (time permitting) the analysis of classical algorithms, models predicting the shape of biomembranes, queuing theory, random walks, ratchet models for gene expression, transcendence of zeta values, restricted permutations, maximum likelihood degree in algebraic statistics, sampling algorithms for perfect matchings in bipartite graphs, and parallel synthesis for DNA storage.
March 23, 2022
Speaker: Sushant Sachdeva, University of Toronto
Title: Almost linear time algorithms for max-flow and more
Abstract: We give the first almost-linear time algorithm for computing exact maximum flows and minimum-cost flows on directed graphs. By well known reductions, this implies almost-linear time algorithms for several problems including bipartite matching, optimal transport, and undirected vertex connectivity.
Our algorithm is designed using a new Interior Point Method (IPM) that builds the flow as a sequence of almost-linear number of approximate undirected minimum-ratio cycles, each of which is computed and processed very efficiently using a new dynamic data structure.
Our framework extends to give an almost-linear time algorithm for computing flows that minimize general edge-separable convex functions to high accuracy. This gives the first almost-linear time algorithm for several problems including entropy-regularized optimal transport, matrix scaling, p-norm flows, and Isotonic regression.
Joint work with Li Chen, Rasmus Kyng, Yang Liu, Richard Peng, and Maximilian Probst Gutenberg.
February 24, 2022
Speaker: Vishwas Bhargava, Rutgers University
Title: A Tale of Two Reconstructions
Abstract: Arithmetic circuits are a natural model for computing polynomials using basic arithmetic operations like addition and multiplication. The problem of learning arithmetic circuits, a.k.a. reconstruction, is an important and well-studied problem. Here we are given a polynomial (either explicitly as a list of coefficients or using black-box access) and the goal is to find the smallest (or approximately smallest) arithmetic circuit computing it. In this talk, we will discuss two vastly different paradigms in learning arithmetic circuits, which are worst-case learning and average-case (non-degenerate) learning. We will elaborate on various techniques involved in these paradigms, their similarity, and their differences. Along the way, we will see a worst-case as well as a non-degenerate algorithm for decomposing tensors. Time permitting, we will also discuss some recent progress in non-degenerate learning of “generalized” depth 3 arithmetic circuits.
February 18, 2022
Speaker: Emmanouil Vasileios Vlatakis-Gkaragkounis, Columbia University
Title: Optimization Beyond Minimization
Abstract: Motivated by recent advances in both theoretical and applied aspects of multiplayer games, spanning from e-sports to multi-agent generative adversarial networks, a surge of different studies has been focused on the core problem of understanding the behavior of game dynamics in general N-player games. From the seminal settings of two competitive players and Min-Max Optimization to the complete understanding of how the day-to-day behavior of the dynamics correlates to the game’s different notion of equilibria is much more limited, and only partial results are known for certain classes of games (such as zero-sum or congestion games). In this talk, we study from two different perspectives arguably the most well-studied class of no-regret dynamics, “Follow-the-regularized-leader” (FTRL) and Discretizations of Gradient Flow (GDA/OGDA/EG), and we establish a sweeping negative result showing that the notion of mixed Nash equilibrium is antithetical to no-regret learning. Specifically, we show that any Nash equilibrium which is not strict (in that every player has a unique best response) cannot be stable and attracting under the dynamics of FTGL. This result has significant implications for predicting the outcome of a learning process as it shows unequivocally that only strict (and hence, pure) Nash equilibria can emerge as stable limit points thereof. For a final happy end story, we present either structural examples of families where convergence is possible providing the last-iterate convergence rates or even new methods inspired from other areas like control theory & planning.
April 14, 2021
Speaker: Pei Wu, UCLA
Title: An optimal separation of randomized and quantum query complexity
Abstract: We prove that for every decision tree, the absolute values of the Fourier coefficients of given order $\ell\geq1$ sum to at most $c^{\ell}\sqrt{\binom{d}{\ell}(1+\log n)^{\ell-1}},$ where $n$ is the number of variables, $d$ is the tree depth, and $c>0$ is an absolute constant. This bound is essentially tight and settles a conjecture due to Tal (arxiv 2019; FOCS 2020). The bounds prior to our work degraded rapidly with $\ell,$ becoming trivial already at $\ell=\sqrt{d}.$ As an application, we obtain, for every integer $k\geq1,$ a partial Boolean function on $n$ bits that has bounded-error quantum query complexity at most $\lceil k/2 rceil$ and randomized query complexity $\tilde{\Omega}(n^{1-1/k}).$ This separation of bounded-error quantum versus randomized query complexity is best possible, by the results of Aaronson and Ambainis (STOC 2015). Prior to our work, the best known separation was polynomially weaker: $O(1)$ versus $\Omega(n^{2/3-\epsilon})$ for any $\epsilon>0$ (Tal, FOCS 2020). As another application, we obtain an essentially optimal separation of $O(\log n)$ versus $\Omega(n^{1-\epsilon})$ for bounded-error quantum versus randomized communication complexity, for any $\epsilon>0.$ The best previous separation was polynomially weaker: $O(\log n)$ versus $\Omega(n^{2/3-\epsilon})$ (implicit in Tal, FOCS 2020).
February 24, 2021
Speaker: Vikrant Singhal, Northeastern University
Title: Private Mean Estimation of Heavy-Tailed Distributions
Abstract: We give new upper and lower bounds on the minimax sample complexity of differentially private mean estimation of distributions with bounded k-th moments. Roughly speaking, in the univariate case, we show that n = \Theta(1/\alpha^2 + 1/\epsilon\alpha^{k/(k-1)}) samples are necessary and sufficient to estimate the mean to \alpha-accuracy under \epsilon-differential privacy, or any of its common relaxations. This result demonstrates a qualitatively different behavior compared to estimation absent privacy constraints, for which the sample complexity is identical for all k >= 2. We also give algorithms for the multivariate setting whose sample complexity is a factor of O(d) larger than the univariate case.
February 19, 2021
Speaker: Jasper Lee, Brown University
Title: Optimal Sub-Gaussian Mean Estimation in R
Abstract: We revisit and settle one of the most fundamental problems in statistics: given access to independent samples from a 1D random variable (with finite but unknown mean and variance), what is the best way to estimate the mean in the high probability regime,in terms of error convergence with respect to sample size? The conventional wisdom is to use the empirical mean as our estimate. However, it is known that the empirical mean can in fact have exponentially sub-optimal convergence for certain heavy-tailed distributions.On the other hand, the median-of-means estimator (invented and reinvented in various literature) does have sub-Gaussian convergence for all finite-variance distributions, albeit only in the big-O sense with a sub-optimal multiplicative constant. The natural remaining question then, is whether it is possible to bridge the gap, to have an estimator that has optimal sub-Gaussian concentration with an optimal constant, for all finite-variance distributions.
In this talk, we answer the question affirmatively by giving an estimator that converges with the optimal constant inside the big-O, up to a 1+o(1) multiplicative factor. Our estimator is furthermore computable in time linear in the sample size. The convergence analysis involves deriving tail bounds using linear and convex-concave programming dualities, which may be of independent interest.
This is joint work with Paul Valiant.
October 21, 2020
Speaker: Bryce Sandlund, University of Waterloo
Title: Lazy Search Trees
Abstract: We introduce the lazy search tree data structure. The lazy search tree is a comparison-based data structure on the pointer machine that supports order-based operations such as rank, select, membership, predecessor, successor, minimum, and maximumwhile providing dynamic operations insert, delete, change-key, split, and merge. We analyze the performance of our data structure based on a partition of current elements into a set of gaps {Delta_i} based on rank. A query falls into a particular gap and splitsthe gap into two new gaps at a rank r associated with the query operation. If we define B = sum_i |Delta_i| log_2(n/|Delta_i|), our performance over a sequence of n insertions and q distinct queries is O(B + min(n log log n, n log q)). We show B is a lowerbound.
Effectively, we reduce the insertion time of binary search trees from Theta(log n) to O(min(log(n/|Delta_i|) + log log |Delta_i|, log q)), where Delta_i is the gap in which the inserted element falls. Over a sequence of n insertions and q queries, a time boundof O(n log q + q log n) holds; better bounds are possible when queries are non-uniformly distributed. As an extreme case of non-uniformity, if all queries are for the minimum element, the lazy search tree performs as a priority queue with O(log log n) timeinsert and decrease-key operations. The same data structure supports queries for any rank, interpolating between binary search trees and efficient priority queues.
Lazy search trees can be implemented to operate mostly on arrays, requiring only O(min(q, n)) pointers, suggesting smaller memory footprint, better constant factors, and better cache performance compared to many existing efficient priority queues or binarysearch trees. Via direct reduction, our data structure also supports the efficient access theorems of the splay tree, providing a powerful data structure for non-uniform element access, both when the number of accesses is small and large.
Joint work with Sebastian Wild. Accepted for publication in FOCS 2020; arXiv: https://arxiv.org/abs/2010.08840
September 30, 2020
Speaker: Aukosh Jagannath, University of Waterloo
Title: From high-dimensional random landscapes to statistical inference
Abstract: We study the algorithmic thresholds for principal component analysis of Gaussian k-tensors with a planted rank-one spike, via Langevin dynamics and gradient descent. In order to efficiently recover the spike from natural initializations, the signal-to-noise ratio must diverge in the dimension. Our proof shows that the mechanism for the success/failure of recovery is the strength of the “curvature” of the spike on the maximum entropy region of the initial data. To demonstrate this, we study the dynamics on a generalized family of high-dimensional landscapes with planted signals, containing the spiked tensor models as specific instances. We identify thresholds of signal-to-noise ratios above which order 1 time recovery succeeds; in the case of the spiked tensor model, these match the thresholds conjectured for algorithms such as approximate message passing. Below these thresholds, where the curvature of the signal on the maximal entropy region is weak, we show that recovery from certain natural initializations takes at least stretched exponential time. Our approach combines global regularity estimates for spin glasses with pointwise estimates to study the recovery problem by a perturbative approach.
September 9, 2020
Speaker: Wai Ming Tai, University of Chicago
Title: New Nearly-Optimal Coreset for Kernel Density Estimation
Abstract: Given a point set $P \subset \mathbb{R}^d$, kernel density estimation for Gaussian kernel is defined as $\overline{\mathcal{G}}P(x) = \frac{1}{\left|P\right|}\sum{p\inP}e^{-\left\lVert x-p \right\rVert^2}$ for any $x\in\mathbb{R}^d$. We study how to construct a small subset $Q$ of $P$ such that the kernel density estimation of $P$ can be approximated by the kernel density estimation of $Q$. This subset $Q$ is called \emph{coreset}. The primary technique in this work is to construct $\pm 1$ coloring on the point set $P$ by the discrepancy theory and apply this coloring algorithm recursively. Our result leverages Banaszczyk’s Theorem. When $d>1$ is constant, our construction gives a coreset of size $O\left(\frac{1}{\varepsilon}\sqrt{\log\log\frac{1}{\varepsilon}}\right)$as opposed to the best-known result of $O\left(\frac{1}{\varepsilon}\sqrt{\log\frac{1}{\varepsilon}}\right)$. It is the first to give a breakthrough on the barrier of $\sqrt{\log}$ factor even when $d=2$.
July 2, 2020
Speaker: Amit Sinhababu, Aalen University
Title: Factoring Polynomials given as Arithmetic Branching Programs (Joint work with Thomas Thierauf, to appear in CCC 20)
Abstract: A basic problem in computational algebra is polynomial factoring: Given a polynomial, compute all its irreducible factors. In a classic result, Kaltofen proved that if a polynomial P(x_1,…,x_n) of degree d is given as an arithmetic circuit of size s, then all its factors can be computed by arithmetic circuits of size poly(s,d). In other words, the algebraic complexity class VP is closed under factors. This result has applications in algebraic complexity, for example in derandomization of polynomial identity testing using explicit hard polynomials (lower bounds).
A natural direction to extend Kaltofen’s result is to prove analogous factor size bounds in more restricted / more general models than arithmetic circuits. Kaltofen’s proofs do not extend to the restricted models, as reusing previous computations is costly for these models. Recent works in this area focused on models like constant depth circuits, arithmetic formulas, arithmetic branching programs. In a beautiful work, Oliveira (CCC 2015) proved that factors of a low-depth circuit/formula can be computed by small low-depth circuits/formulas, if we assume that the individual degree of the given polynomial is bounded by constant. In a follow-up work, Dutta, Saxena and Sinhababu (DSS, STOC 18) showed poly (s^log d) factor size upper bound for arithmetic formulas and branching programs.
In this work, we show poly (s,d) factor size upper bound for the model of arithmetic branching programs (ABP). Whereas the works of Oliveira and DSS used different versions of Newton iteration, we use the classic technique of Hensel lifting. Although Newton iteration and Hensel lifting are related techniques, the latter gives better bound for ABPs. We also use that the determinant of a symbolic matrix can be computed by a small ABP. In this talk, we give a brief survey of the earlier works and present the main ideas of our proof.
Paper: https://image.informatik.htw-aalen.de/~thierauf/Papers/ABP-factors.pdf
June 24, 2020
Speaker: Anindya De, University of Pennsylvania
Title: Testing Noisy Linear Equations for Sparsity
Abstract: Consider the following basic problem in sparse linear regression – an algorithm gets labeled samples of the form (x, w.x + \eps) where w is an unknown n-dimensional vector, x is drawn from a background distribution D and \eps is some independent noise. Given the promise that w is k-sparse, the breakthrough work of Candes, Rhomberg and Tao (2005) shows that w can be recovered with samples and time which scales as O(k log n). This should be contrasted with general linear regression where O(n) samples are information theoretically necessary.
In this talk, we look at this question from the vantage point of property testing and study the decision variant of the following question – namely, what is the complexity of deciding if the unknown vector w is k-sparse (or at least say 0.01 far from k-sparse in \ell_2 distance). We show that the decision version of the problem can be solved with samples which are independent of n as long as the background distribution D is i.i.d. and the components are not Gaussian. We further show that weakening any of the conditions in this result necessarily makes the complexity scale as log n (thus showing our results are tight).
Joint work with Xue Chen (Northwestern) and Rocco Servedio (Columbia).
June 17, 2020
Speaker: Hong Zhou, University of Waterloo
Title: A Spectral Approach to Network Design
Abstract: In this talk, I will present a spectral approach to design approximation algorithms for network design problems. We observe that the underlying mathematical questions are the spectral rounding problems, which were studied in spectral sparsification and in discrepancy theory. We extend these results to incorporate additional non-negative linear constraints, and show that they can be used to significantly extend the scope of network design problems that can be solved. Our algorithm for spectral rounding is an iterative randomized rounding algorithm based on the regret minimization framework. In some settings, this provides an alternative spectral algorithm to achieve constant factor approximation for the classical survivable network design problem.
Joint work with Lap Chi Lau. Arxiv link of the manuscript: https://arxiv.org/abs/2003.07810
June 03, 2020
Speaker: Shalev Ben-David, University of Waterloo
Title: Forecasting algorithms: a new approach to randomized lower bounds
Abstract: In this talk, I’ll present a new approach to randomized lower bounds, particularly in a setting in which we wish to give a fine-grained analysis of randomized algorithms that achieve small bias. The approach is as follows: instead of considering ordinary randomized algorithms which give an output in {0,1} and may err, we switch models to look at “forecasting” randomized algorithms which output a confidence in [0,1] for whether they think the answer is 1. When scored by a proper scoring rule, the performance of the best forecasting algorithm is closely related to the bias of the best (ordinary) randomized algorithm; however, the forecasting algorithms (when analyzed using the right scoring rule) have elegant mathematical properties that make them more amenable to analysis.
As an application, I’ll present a new minimax theorem for randomized algorithms, which can be viewed as a strengthening of Yao’s minimax theorem. Yao’s minimax theorem guarantees the existence of a hard distribution for a function f such that solving f against this distribution (to a desired error level) is as hard as solving f in the worst case (to that same error level). However, the hard distribution provided by Yao’s theorem depends on the chosen error level. Our minimax theorem removes this dependence,giving a distribution which certifies the hardness of f against all bias levels at once. In recent work, we used this minimax theorem to give a tight composition theorem for randomized query complexity.
Joint work with Eric Blais.
May 27, 2020
Speaker: Anamay Tengse, Tata Institute of Fundamental Research (TIFR)
Title: Can algebraic circuit lower bounds have easy proofs?
Abstract: The field of Algebraic Circuit Complexity studies how succinctly one can represent explicit multivariate polynomials. Algebraic circuits (analogous to boolean circuits) are the most natural model for computing polynomials. A central question in this field istherefore to find explicit polynomials that require “large” algebraic circuits. The best known lower bounds for general circuits are due to Baur and Strassen (1983) and Smolenksy (1997), both of which give (tight) Omega(n log d) lower bounds for an explicitn-variate polynomial of degree d. Although there has been considerable progress in proving lower bounds for simpler models, there has not been much progress for more general computational classes. This lack of progress over the last few decades leads to thefollowing question. Are most of the current techniques incapable of proving a superpolynomial lower bound for general circuits?
Almost all the current techniques for showing a lower bound against a class of circuits C, can also be used to obtain what is called a “defining equation” for the class C that is also easy to compute. Thus, the above question is equivalent to asking whetherthe class of polynomials computable by polynomial-sized circuits has efficiently computable defining equations.
Based on the important work of Razborov and Rudich (1997) in the boolean world, Forbes, Shpilka, and Volk (2018), and Grochow, Kumar, Saks and Saraf (2017) introduced the framework of Algebraically Natural Proofs. They showed that under a certain derandomizationassumption, such defining equations do not exist; thereby hinting towards the existence of a barrier for proving lower bounds. We show that there \emph{are} efficiently computable defining equations for the class of polynomials with bounded coefficients thatare computable by polynomial-sized circuits. This provides evidence \emph{against} the existence of a barrier for proving lower bounds.
Joint work with Prerona Chatterjee (TIFR, Mumbai), Mrinal Kumar (IITB, Mumbai), C. Ramya (TIFR, Mumbai) and Ramprasad Saptharishi (TIFR, Mumbai).
May 13, 2020
Speaker: Gautam Kamath, University of Waterloo
Title: Robustness in unsupervised and supervised machine learning
Abstract: Recently, the need for robust machine learning algorithms has become apparent. Whether due to errors in data collection, model misspecification, or adversarial attacks, contaminated datasets arise in many areas. This is an issue, as existing methods appear to be quite brittle to small amounts of errors. Even more worryingly, these models are being deployed in many security-critical settings, such as self-driving cars, where reliability is an absolute must.
In this talk, I will describe a line of work in which we provide provable guarantees for robust machine learning in several fundamental settings. I’ll begin by discussing the problem of robust estimation of mean and covariance of a Gaussian distribution, and how to relax this to distributions with weaker assumptions on the moments. I will then describe how these methods can be used to “robustify” supervised learning algorithms by applying robust mean estimation algorithms to the gradients of the dataset. While theoretically sound, the algorithms are also realizable and efficient, and I will present experimental results on both synthetic and real-world data.
Based on joint works with Ilias Diakonikolas, Daniel M. Kane, Jerry Li, Ankur Moitra, Jacob Steinhardt, and Alistair Stewart.
April 29, 2020
Speaker: Kimon Fountoulakis, University of Waterloo
Title: Local Graph Clustering
Abstract: Graphs, long popular in computer science and discrete mathematics, have received renewed interest because they provide a useful way to model many types of relational data. In biology, e.g., graphs are routinely used to generate hypotheses for experimental validation; in neuroscience, they are used to study the networks and circuits in the brain; and in social networks, they are used to find common behaviors of users. These modern graph applications require the analysis of large graphs, and this can be computationally expensive. Graph algorithms have been developed to identify and interpret small-scale local structure in large-scale data without the requirement to access all the data.
In this talk, we will discuss state-of-the-art local spectral-, flow-, and Lp-norm-based algorithms that are specialized in finding small-scale clusters in large graphs without accessing the whole graph. We will discuss worst- and average-case theoretical results of these algorithms and we will demonstrate their empirical performance.
Based on joint work with Shenghao Yang, Di Wang, Wooseok Ha, Julian Shun, Farbod Roosta Khorasani, Xiang Cheng.