STOC 2019 - Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing

Full Citation in the ACM Digital Library

SESSION: Award Papers and More

Log-concave polynomials II: high-dimensional walks and an FPRAS for counting bases of a matroid

We design an FPRAS to count the number of bases of any matroid given by an independent set oracle, and to estimate the partition function of the random cluster model of any matroid in the regime where 0<q<1. Consequently, we can sample random spanning forests in a graph and estimate the reliability polynomial of any matroid. We also prove the thirty-year-old conjecture of Mihail and Vazirani that the bases exchange graph of any matroid has edge expansion at least 1.

Our algorithm and proof build on the recent results of Dinur, Kaufman, Mass and Oppenheim, who show that a high-dimensional walk on a weighted simplicial complex mixes rapidly if, for every link of the complex, the corresponding localized random walk on the 1-skeleton is a strong spectral expander. One of our key observations is that a weighted simplicial complex X is a 0-local spectral expander if and only if a naturally associated generating polynomial pX is strongly log-concave. More generally, to every pure simplicial complex X with positive weights on its maximal faces, we can associate a multiaffine homogeneous polynomial pX such that the eigenvalues of the localized random walks on X correspond to the eigenvalues of the Hessian of derivatives of pX.
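
Concretely, the associated polynomial is the weighted generating polynomial of the maximal faces (a sketch of the natural construction, in our notation, not a quote from the paper): for a pure (d−1)-dimensional complex X with weights w(S) > 0 on its maximal faces,

    p_X(x_1, \ldots, x_n) \;=\; \sum_{S \text{ maximal face of } X} w(S) \prod_{i \in S} x_i ,

which is multiaffine and homogeneous of degree d.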

Oracle separation of BQP and PH

We present a distribution D over inputs in {−1,1}^{2N}, such that: (1) There exists a quantum algorithm that makes one (quantum) query to the input, and runs in time O(log N), that distinguishes between D and the uniform distribution with advantage Ω(1/log N). (2) No Boolean circuit of quasi-polynomial size and constant depth distinguishes between D and the uniform distribution with advantage better than polylog(N)/√N.

By well-known reductions, this gives a separation of the classes Promise-BQP and Promise-PH in the black-box model and implies an oracle O relative to which BQP^O ⊄ PH^O.

The reachability problem for Petri nets is not elementary

Petri nets, also known as vector addition systems, are a long-established model of concurrency with extensive applications in modelling and analysis of hardware, software and database systems, as well as chemical, biological and business processes. The central algorithmic problem for Petri nets is reachability: whether from the given initial configuration there exists a sequence of valid execution steps that reaches the given final configuration. The complexity of the problem has remained unsettled since the 1960s, and it is one of the most prominent open questions in the theory of verification. Decidability was proved by Mayr in his seminal STOC 1981 work, and the currently best published upper bound, due to Leroux and Schmitz (LICS 2019), is non-primitive recursive (Ackermannian). We establish a non-elementary lower bound, i.e. that the reachability problem needs a tower of exponentials of time and space. Until this work, the best lower bound had been exponential space, due to Lipton in 1976. The new lower bound is a major breakthrough for several reasons. Firstly, it shows that the reachability problem is much harder than the coverability (i.e., state reachability) problem, which is also ubiquitous but has been known to be complete for exponential space since the late 1970s. Secondly, it implies that a plethora of problems from formal languages, logic, concurrent systems, process calculi and other areas, that are known to admit reductions from the Petri nets reachability problem, are also not elementary. Thirdly, it makes obsolete the currently best lower bounds for the reachability problems for two key extensions of Petri nets: with branching and with a pushdown stack.

At the heart of our proof is a novel gadget, the so-called factorial amplifier, which, assuming availability of counters that are zero testable and bounded by k, guarantees to produce arbitrarily large pairs of values whose ratio is exactly the factorial of k. We also develop a novel construction that uses arbitrarily large pairs of values with ratio R to provide zero testable counters that are bounded by R. Repeatedly composing the factorial amplifier with itself by means of this construction then enables us to compute in linear time Petri nets that simulate Minsky machines whose counters are bounded by a tower of exponentials, which yields the non-elementary lower bound. By refining this scheme further, we in fact establish hardness for h-exponential space already for Petri nets with h + 13 counters.

Bootstrapping results for threshold circuits “just beyond” known lower bounds

The best known lower bounds for the circuit class TC0 are only slightly super-linear. Similarly, the best known algorithm for derandomization of this class is an algorithm for quantified derandomization (i.e., a weak type of derandomization) of circuits of slightly super-linear size. In this paper we show that even very mild quantitative improvements of either of the two foregoing results would already imply super-polynomial lower bounds for TC0. Specifically:

(1) If for every c>1 and sufficiently large d∈ℕ it holds that n-bit TC0 circuits of depth d require n^{1+c^{−d}} wires to compute certain NC1-complete functions, then TC0 ≠ NC1. In fact, even lower bounds for TC0 circuits of size n^{1+c^{−d}} against these functions when c>1 is fixed and sufficiently small would yield lower bounds for polynomial-sized circuits. Lower bounds of the form n^{1+c^{−d}} against these functions are already known, but for a fixed c≈2.41 that is too large to yield new lower bounds via our results.

(2) If there exists a deterministic algorithm that gets as input an n-bit TC0 circuit of depth d and n^{1+(1.61)^{−d}} wires, runs in time 2^{n^{o(1)}}, and distinguishes circuits that accept at most B(n)=2^{n^{1−(1.61)^{−d}}} inputs from circuits that reject at most B(n) inputs, then NEXP ⊄ TC0. An algorithm for this “quantified derandomization” task is already known, but it works only when the number of wires is n^{1+c^{−d}}, for c>30, and with a smaller B(n)≈2^{n^{1−(30/c)^{d}}}.

Intuitively, the “takeaway” message from our work is that the gap between currently known results and results that would suffice to get super-polynomial lower bounds for TC0 boils down to the precise constant c>1 in the bound n^{1+c^{−d}} on the number of wires. Our results improve previous results of Allender and Koucký (2010) and of the second author (2018), respectively, whose hypotheses referred to circuits with n^{1+c/d} wires (rather than n^{1+c^{−d}} wires). We also prove results similar to the two results above for other circuit classes (namely, ACC0 and CC0).

The log-approximate-rank conjecture is false

We construct a simple and total XOR function F on 2n variables that has only O(√n) spectral norm, O(n^2) approximate rank and O(n^{2.5}) approximate nonnegative rank. We show it has polynomially large randomized bounded-error communication complexity of Ω(√n). This yields the first exponential gap between the logarithm of the approximate rank and randomized communication complexity for total functions. Thus F witnesses a refutation of the Log-Approximate-Rank Conjecture (LARC), which was posed by Lee and Shraibman as a very natural analogue for randomized communication of the still unresolved Log-Rank Conjecture for deterministic communication. The best known previous gap for any total function between the two measures is a recent fourth-power separation by Göös, Jayram, Pitassi and Watson.

Additionally, our function F refutes Grolmusz’s Conjecture and a variant of the Log-Approximate-Nonnegative-Rank Conjecture, suggested recently by Kol, Moran, Shpilka and Yehudayoff, both of which are implied by the LARC. The complement of F has exponentially large approximate nonnegative rank. This answers a question of Lee and Kol et al., showing that approximate nonnegative rank can be exponentially larger than approximate rank. The function F also falsifies a conjecture about parity measures of Boolean functions made by Tsang, Wong, Xie and Zhang. The latter conjecture implied the Log-Rank Conjecture for XOR functions.

We are pleased to note that shortly after we published our results, two independent groups of researchers, Anshu, Boddu and Touchette, and Sinha and de Wolf, used our function F to prove that the Quantum-Log-Rank Conjecture is also false by showing that F has Ω(n^{1/6}) quantum communication complexity.

A strongly polynomial algorithm for linear exchange markets

We present a strongly polynomial algorithm for computing an equilibrium in Arrow-Debreu exchange markets with linear utilities. Our algorithm is based on a variant of the weakly-polynomial Duan-Mehlhorn (DM) algorithm. We use the DM algorithm as a subroutine to identify revealed edges, i.e., pairs of agents and goods that must correspond to best bang-per-buck transactions in every equilibrium solution. Every time a new revealed edge is found, we use another subroutine that decides if there is an optimal solution using the current set of revealed edges, or if none exists, finds the solution that approximately minimizes the violation of the demand and supply constraints. This task can be reduced to solving a linear program (LP). Even though we are unable to solve this LP in strongly polynomial time, we show that it can be approximated by a simpler LP with two variables per inequality that is solvable in strongly polynomial time.
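
For reference, “two variables per inequality” (2VPI) means that every constraint of the approximating LP touches at most two variables; in standard form (our notation, not taken from the paper):

    a_k x_{i_k} + b_k x_{j_k} \le c_k, \qquad k = 1, \ldots, m .

Feasibility and optimization over such systems is a classic setting known to admit strongly polynomial algorithms, which is what the reduction described above exploits.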

SESSION: Discrete Optimization

An optimal approximation for submodular maximization under a matroid constraint in the adaptive complexity model

In this paper we study submodular maximization under a matroid constraint in the adaptive complexity model. This model was recently introduced in the context of submodular optimization to quantify the information theoretic complexity of black-box optimization in a parallel computation model. Informally, the adaptivity of an algorithm is the number of sequential rounds it makes when each round can execute polynomially-many function evaluations in parallel. Since submodular optimization is regularly applied on large datasets we seek algorithms with low adaptivity to enable speedups via parallelization. Consequently, a recent line of work has been devoted to designing constant factor approximation algorithms for maximizing submodular functions under various constraints in the adaptive complexity model.

Despite the burst in work on submodular maximization in the adaptive complexity model, the fundamental problem of maximizing a monotone submodular function under a matroid constraint has remained elusive. In particular, all known techniques fail for this problem and there are no known constant factor approximation algorithms whose adaptivity is sublinear in the rank k of the matroid or, in the worst case, sublinear in the size n of the ground set.

In this paper we present an approximation algorithm for the problem of maximizing a monotone submodular function under a matroid constraint in the adaptive complexity model. The approximation guarantee of the algorithm is arbitrarily close to the optimal 1−1/e and it has near-optimal adaptivity of O(log(n) log(k)). This result is obtained using a novel technique of adaptive sequencing which departs from previous techniques for submodular maximization in the adaptive complexity model. In addition to our main result we show how to use this technique to design other approximation algorithms with strong approximation guarantees and polylogarithmic adaptivity.

Parallelizing greedy for submodular set function maximization in matroids and beyond

We consider parallel, or low adaptivity, algorithms for submodular function maximization. This line of work was recently initiated by Balkanski and Singer and has already led to several interesting results on the cardinality constraint and explicit packing constraints. An important open problem is the classical setting of matroid constraint, which has been instrumental for developments in submodular function maximization. In this paper we develop a general strategy to parallelize the well-studied greedy algorithm and use it to obtain a randomized (1/2 − є)-approximation in O(log^2(n)/є^2) rounds of adaptivity. We rely on this algorithm, and an elegant amplification approach due to Badanidiyuru and Vondrák, to obtain a fractional solution that yields a near-optimal randomized (1 − 1/e − є)-approximation in O(log^2(n)/є^3) rounds of adaptivity. For non-negative functions we obtain a (3−2√2 − є)-approximation and a fractional solution that yields a (1/e − є)-approximation. Our approach for parallelizing greedy yields approximations for intersections of matroids and matchoids, and the approximation ratios are comparable to those known for sequential greedy.

Submodular maximization with matroid and packing constraints in parallel

We consider the problem of maximizing the multilinear extension of a submodular function subject to a single matroid constraint or multiple packing constraints with a small number of adaptive rounds of evaluation queries.

We obtain the first algorithms with low adaptivity for submodular maximization with a matroid constraint. Our algorithms achieve a 1−1/e−є approximation for monotone functions and a 1/e−є approximation for non-monotone functions, which nearly matches the best guarantees known in the fully adaptive setting. The number of rounds of adaptivity is O(log^2(n)/є^3), which is an exponential speedup over the existing algorithms.

We obtain the first parallel algorithm for non-monotone submodular maximization subject to packing constraints. Our algorithm achieves a 1/e−є approximation using O(log(n/є) log(1/є) log(n+m)/є^2) parallel rounds, which is again an exponential speedup in parallel time over the existing algorithms. For monotone functions, we obtain a 1−1/e−є approximation in O(log(n/є) log(m)/є^2) parallel rounds. The number of parallel rounds of our algorithm matches that of the state of the art algorithm for solving packing LPs with a linear objective (Mahoney et al., 2016).

Our results apply more generally to the problem of maximizing a diminishing returns submodular (DR-submodular) function.

Unconstrained submodular maximization with constant adaptive complexity

In this paper, we consider the unconstrained submodular maximization problem. We propose the first algorithm for this problem that achieves a tight (1/2−ε)-approximation guarantee using Õ(ε^{−1}) adaptive rounds and a linear number of function evaluations. No previously known algorithm for this problem achieves an approximation ratio better than 1/3 using less than Ω(n) rounds of adaptivity, where n is the size of the ground set. Moreover, our algorithm easily extends to the maximization of a non-negative continuous DR-submodular function subject to a box constraint, and achieves a tight (1/2−ε)-approximation guarantee for this problem while keeping the same adaptive and query complexities.

Dynamic set cover: improved algorithms and lower bounds

We give new upper and lower bounds for the dynamic set cover problem. First, we give a (1+є)f-approximation for fully dynamic set cover in O(f^2 log n/є^5) (amortized) update time, for any є > 0, where f is the maximum number of sets that an element belongs to. In the decremental setting, the update time can be improved to O(f^2/є^5), while still obtaining a (1+є)f-approximation. These are the first algorithms that obtain an approximation factor linear in f for dynamic set cover, thereby almost matching the best bounds known in the offline setting and improving upon the previous best approximation of O(f^2) in the dynamic setting.

To complement our upper bounds, we also show that a linear dependence of the update time on f is necessary unless we can tolerate much worse approximation factors. Using the recent distributed PCP-framework, we show that any dynamic set cover algorithm that has an amortized update time of O(f^{1−є}) must have an approximation factor that is Ω(n^δ) for some constant δ>0 under the Strong Exponential Time Hypothesis.

Approximation algorithms for minimum norm and ordered optimization problems

In many optimization problems, a feasible solution induces a multi-dimensional cost vector. For example, in load-balancing a schedule induces a load vector across the machines. In k-clustering, opening k facilities induces an assignment cost vector across the clients. Typically, one seeks a solution which minimizes either the sum or the max of this vector, and these problems (makespan minimization, k-median, and k-center) are classic NP-hard problems which have been extensively studied.

In this paper we consider the minimum-norm optimization problem. Given an arbitrary monotone, symmetric norm, the problem asks to find a solution which minimizes the norm of the induced cost vector. Such norms are versatile and include ℓp norms, Top-ℓ norms (sum of the ℓ largest coordinates in absolute value), and ordered norms (non-negative linear combinations of Top-ℓ norms); consequently, the minimum-norm problem models a wide variety of problems under one umbrella. We give a general framework to tackle the minimum-norm problem, and illustrate its efficacy in the unrelated machine load balancing and k-clustering setting. Our concrete results are the following.

(a) We give constant factor approximation algorithms for the minimum norm load balancing problem in unrelated machines, and the minimum norm k-clustering problem. To our knowledge, our results constitute the first constant-factor approximations for such a general suite of objectives.

(b) For load balancing on unrelated machines, we give a (2+ε)-approximation for ordered load balancing (i.e., min-norm load-balancing under an ordered norm).

(c) For k-clustering, we give a (5+ε)-approximation for the ordered k-median problem, which significantly improves upon the previous-best constant-factor approximation (Chakrabarty and Swamy (ICALP 2018); Byrka, Sornat, and Spoerhase (STOC 2018)).

(d) Our techniques also imply O(1) approximations to the instance-wise best simultaneous approximation factor for unrelated-machine load-balancing and k-clustering. To our knowledge, these are the first positive simultaneous approximation results in these settings.

At a technical level, one of our chief insights is that minimum-norm optimization can be reduced to a special case that we call min-max ordered optimization. Both the reduction, and the task of devising algorithms for the latter problem, require a sparsification idea that we develop, which is of interest for ordered optimization problems. The main ingredient in solving min-max ordered optimization is a deterministic, oblivious rounding procedure (that we devise) for suitable LP relaxations of the load-balancing and k-clustering problem; this may be of independent interest.

SESSION: Planar Graphs Algorithms

Almost optimal distance oracles for planar graphs

We present new tradeoffs between space and query-time for exact distance oracles in directed weighted planar graphs. These tradeoffs are almost optimal in the sense that they are within polylogarithmic, subpolynomial or arbitrarily small polynomial factors from the naïve linear space, constant query-time lower bound. These tradeoffs include: (i) an oracle with space O(n^{1+є}) and query-time Õ(1) for any constant є>0, (ii) an oracle with space Õ(n) and query-time O(n^є) for any constant є>0, and (iii) an oracle with space n^{1+o(1)} and query-time n^{o(1)}.

Planar diameter via metric compression

We develop a new approach for distributed distance computation in planar graphs that is based on a variant of the metric compression problem recently introduced by Abboud et al. [SODA’18]. In our variant of the Planar Graph Metric Compression Problem, one is given an n-vertex planar graph G=(V,E), a set S ⊆ V of source terminals lying on a single face, and a subset T ⊆ V of target terminals. The goal is to compactly encode the S × T distances.

One of our key technical contributions is in providing a compression scheme that encodes all S × T distances using O(|S|·poly(D)+|T|) bits, for unweighted graphs with diameter D. This significantly improves the state of the art of O(|S|·2^D+|T|·D) bits. We also consider an approximate version of the problem for weighted graphs, where the goal is to encode a (1+є) approximation of the S × T distances, for a given input parameter є ∈ (0,1]. Here, our compression scheme uses O(poly(|S|/є)+|T|) bits. In addition, we describe how these compression schemes can be computed in near-linear time. At the heart of this compact compression scheme lies a VC-dimension type argument on planar graphs, using the well-known Sauer’s lemma.
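
For reference, the Sauer–Shelah lemma invoked above is the standard bound (not specific to this paper): any set system \mathcal{F} over an n-element ground set with VC-dimension at most d satisfies

    |\mathcal{F}| \;\le\; \sum_{i=0}^{d} \binom{n}{i} \;=\; O(n^d).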

This efficient compression scheme leads to several improvements and simplifications in the setting of diameter computation, most notably in the distributed setting:

  • There is an O(D^5)-round randomized distributed algorithm for computing the diameter in planar graphs, w.h.p.
  • There is an O(D^3) + D^2·poly(log n/є)-round randomized distributed algorithm for computing a (1+є) approximation for the diameter in weighted planar graphs, with unweighted diameter D, w.h.p.

No sublinear round algorithms were known for these problems before. These distributed constructions are based on a new recursive graph decomposition that preserves the (unweighted) diameter of each of the subgraphs up to a logarithmic term. Using this decomposition, we also get an exact SSSP tree computation within O(D^2) rounds.

Polylogarithmic approximation for Euler genus on bounded degree graphs

Computing the Euler genus of a graph is a fundamental problem in algorithmic graph theory. It has been shown to be NP-hard by [Thomassen ’89, Thomassen ’97], even for cubic graphs, and a linear-time fixed-parameter algorithm has been obtained by [Mohar ’99]. Despite extensive study, the approximability of the Euler genus remains wide open. While the existence of an O(1)-approximation is not ruled out, the currently best-known upper bound is an O(n^{1−α})-approximation, for some universal constant α>0 [Kawarabayashi and Sidiropoulos 2017].

We present an O(log^{2.5} n)-approximation polynomial time algorithm for this problem on graphs of bounded degree. Prior to our work, the best known result on graphs of bounded degree was an n^{Ω(1)}-approximation [Chekuri and Sidiropoulos 2013].

As an immediate corollary, we also obtain improved approximation algorithms for the crossing number problem and for the minimum vertex planarization problem, on graphs of bounded degree. Specifically, we obtain a polynomial-time O(Δ^2 log^{3.5} n)-approximation algorithm for the minimum vertex planarization problem, on graphs of maximum degree Δ. Moreover we obtain an algorithm which, given a graph of crossing number k, computes a drawing with at most k^2 log^{O(1)} n crossings in polynomial time. This also implies an n^{1/2} log^{O(1)} n-approximation polynomial time algorithm. The previously best-known result is a polynomial time algorithm that computes a drawing with k^{10} log^{O(1)} n crossings, which implies an n^{9/10} log^{O(1)} n-approximation algorithm [Chuzhoy 2011].

Planar graphs of bounded degree have bounded queue number

A queue layout of a graph consists of a linear order of its vertices and a partition of its edges into queues, so that no two independent edges of the same queue are nested. The queue number of a graph is the minimum number of queues required by any of its queue layouts. A long-standing conjecture by Heath, Leighton and Rosenberg states that the queue number of planar graphs is bounded. This conjecture has been partially settled in the positive for several subfamilies of planar graphs (most of which have bounded treewidth).

In this paper, we make a further important step towards settling this conjecture. We prove that planar graphs of bounded degree (which may have unbounded treewidth) have bounded queue number. A notable implication of this result is that every planar graph of bounded degree admits a three-dimensional straight-line grid drawing in linear volume. Further implications are that every planar graph of bounded degree has bounded track number, and that every k-planar graph (i.e., every graph that can be drawn in the plane with at most k crossings per edge) of bounded degree has bounded queue number.

SESSION: Quantum Computing I

The parallel repetition of non-signaling games: counterexamples and dichotomy

Non-signaling games are an important object of study in the theory of computation, for their role both in quantum information and in (classical) cryptography. In this work, we study the behavior of these games under parallel repetition.

We show that, unlike the situation both for classical games and for two-player non-signaling games, there are k-player non-signaling games (for k ≥ 3) whose values do not tend to 0 with sufficient parallel repetition. In fact, parallel repetition sometimes does not decrease their value whatsoever.

We show that in general, every game’s non-signaling value under parallel repetition is either lower bounded by a positive constant or decreases exponentially with the number of repetitions. Furthermore, exponential decrease occurs if and only if the game’s sub-non-signaling value (Lancien and Winter, CJTCS ’16) is less than 1.

Quantum singular value transformation and beyond: exponential improvements for quantum matrix arithmetics

An n-qubit quantum circuit performs a unitary operation on an exponentially large, 2^n-dimensional, Hilbert space, which is a major source of quantum speed-ups. We develop a new “Quantum singular value transformation” algorithm that can directly harness the advantages of exponential dimensionality by applying polynomial transformations to the singular values of a block of a unitary operator. The transformations are realized by quantum circuits with a very simple structure - typically using only a constant number of ancilla qubits - leading to optimal algorithms with appealing constant factors. We show that our framework allows describing many quantum algorithms on a high level, and enables remarkably concise proofs for many prominent quantum algorithms, ranging from optimal Hamiltonian simulation to various quantum machine learning applications. We also devise a new singular vector transformation algorithm, describe how to exponentially improve the complexity of implementing fractional queries to unitaries with a gapped spectrum, and show how to efficiently implement principal component regression. Finally, we also prove a quantum lower bound on spectral transformations.
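
As a rough illustration of what “applying polynomial transformations to the singular values of a block of a unitary” means (standard block-encoding notation, ours, not a quote from the paper): if a unitary U contains A as its top-left block and A = W Σ V^† is a singular value decomposition, then for a suitable bounded polynomial p (say odd, of degree k) the framework builds a circuit that uses U and U^† roughly k times, plus a constant number of ancilla qubits, and whose top-left block is the singular value transform

    U = \begin{pmatrix} A & * \\ * & * \end{pmatrix}, \qquad
    p^{(SV)}(A) := W \, p(\Sigma) \, V^{\dagger} .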

Quantum weak coin flipping

We investigate weak coin flipping, a fundamental cryptographic primitive where two distrustful parties need to remotely establish a shared random bit. A cheating player can try to bias the output bit towards a preferred value. For weak coin flipping the players have known opposite preferred values. A weak coin-flipping protocol has a bias є if neither player can force the outcome towards their preferred value with probability more than 1/2+є. While it is known that all classical protocols have є=1/2, Mochon showed in 2007 that, quantumly, weak coin flipping can be achieved with arbitrarily small bias (near perfect), but the previously best known explicit protocol has bias 1/6 (also due to Mochon, 2005). We propose a framework to construct new explicit protocols achieving biases below 1/6. In particular, we construct explicit unitaries for protocols with bias down to 1/10. To go lower, we introduce what we call the Elliptic Monotone Align (EMA) algorithm which, together with the framework, allows us to construct protocols with arbitrarily small biases.

A quantum-inspired classical algorithm for recommendation systems

We give a classical analogue to Kerenidis and Prakash’s quantum recommendation system, previously believed to be one of the strongest candidates for provably exponential speedups in quantum machine learning. Our main result is an algorithm that, given an m × n matrix in a data structure supporting certain ℓ2-norm sampling operations, outputs an ℓ2-norm sample from a rank-k approximation of that matrix in time O(poly(k)log(mn)), only polynomially slower than the quantum algorithm. As a consequence, Kerenidis and Prakash’s algorithm does not in fact give an exponential speedup over classical algorithms. Further, under strong input assumptions, the classical recommendation system resulting from our algorithm produces recommendations exponentially faster than previous classical systems, which run in time linear in m and n.

The main insight of this work is the use of simple routines to manipulate ℓ2-norm sampling distributions, which play the role of quantum superpositions in the classical setting. This correspondence indicates a potentially fruitful framework for formally comparing quantum machine learning algorithms to classical machine learning algorithms.
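
A minimal illustration of the ℓ2-norm sampling primitive that the data structure is assumed to support (this shows only the sampling access pattern, not the recommendation algorithm itself; function names are ours):

    import numpy as np

    def sample_row(A, rng):
        # Sample a row index i with probability ||A_i||^2 / ||A||_F^2.
        row_norms_sq = (A ** 2).sum(axis=1)
        return rng.choice(A.shape[0], p=row_norms_sq / row_norms_sq.sum())

    def sample_entry_in_row(A, i, rng):
        # Sample a column index j with probability A[i, j]^2 / ||A_i||^2.
        row_sq = A[i] ** 2
        return rng.choice(A.shape[1], p=row_sq / row_sq.sum())

    # Together, the two calls produce an l2-norm sample of an entry of A.
    rng = np.random.default_rng(0)
    A = rng.standard_normal((6, 4))
    i = sample_row(A, rng)
    j = sample_entry_in_row(A, i, rng)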

SESSION: Graph Algorithms I

The number of minimum k-cuts: improving the Karger-Stein bound

Given an edge-weighted graph, how many minimum k-cuts can it have? This is a fundamental question in the intersection of algorithms, extremal combinatorics, and graph theory. It is particularly interesting in that the best known bounds are algorithmic: they stem from algorithms that compute the minimum k-cut.

In 1994, Karger and Stein obtained a randomized contraction algorithm that finds a minimum k-cut in O(n^{(2−o(1))k}) time. It can also enumerate all such k-cuts in the same running time, establishing a corresponding extremal bound of O(n^{(2−o(1))k}). Since then, the algorithmic side of the minimum k-cut problem has seen much progress, leading to a deterministic algorithm based on a tree packing result of Thorup, which enumerates all minimum k-cuts in the same asymptotic running time, and gives an alternate proof of the O(n^{(2−o(1))k}) bound. However, beating the Karger–Stein bound, even for computing a single minimum k-cut, has remained out of reach.
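
For intuition, the contraction approach referenced above repeatedly merges the endpoints of random edges; a minimal, unoptimized sketch of one contraction trial for k-cut is below (the full Karger–Stein algorithm adds recursion and many repetitions to boost the success probability; this sketch is illustrative, not the paper's algorithm):

    import random

    def random_contraction_kcut(edges, n, k, seed=None):
        """Contract uniformly random edges until k supernodes remain.

        edges: list of (u, v) pairs over vertices 0..n-1.
        Returns the edges crossing the resulting k-way partition,
        a candidate minimum k-cut.
        """
        rng = random.Random(seed)
        parent = list(range(n))

        def find(x):
            # Union-find with path halving to track supernodes.
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        components = n
        pool = edges[:]
        while components > k and pool:
            u, v = pool.pop(rng.randrange(len(pool)))
            ru, rv = find(u), find(v)
            if ru != rv:          # skip self-loops, contract otherwise
                parent[ru] = rv
                components -= 1
        return [(u, v) for (u, v) in edges if find(u) != find(v)]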

In this paper, we give an algorithm to enumerate all minimum k-cuts in O(n^{(1.981+o(1))k}) time, breaking the algorithmic and extremal barriers for enumerating minimum k-cuts. To obtain our result, we combine ideas from both the Karger–Stein and Thorup results, and draw a novel connection between minimum k-cut and extremal set theory. In particular, we give and use tighter bounds on the size of set systems with bounded dual VC-dimension, which may be of independent interest.

Breaking quadratic time for small vertex connectivity and an approximation scheme

Vertex connectivity is a classic, extensively studied problem. Given an integer k, its goal is to decide if an n-node m-edge graph can be disconnected by removing k vertices. Although a linear-time algorithm has been postulated since 1974 [Aho, Hopcroft and Ullman], and despite its sibling problem of edge connectivity being resolved over two decades ago [Karger STOC’96], so far no vertex connectivity algorithms are faster than O(n^2) time even for k=4 and m=O(n). In the simplest case where m=O(n) and k=O(1), the O(n^2) bound dates five decades back to [Kleitman IEEE Trans. Circuit Theory’69]. For higher m, O(m) time is known for k≤ 3 [Tarjan FOCS’71; Hopcroft, Tarjan SICOMP’73], and the first O(n^2) time bound is from [Kanevsky, Ramachandran, FOCS’87] for k=4 and from [Nagamochi, Ibaraki, Algorithmica’92] for k=O(1). For general k and m, the best bound is Õ(min(kn^2, n^ω + nk^ω)) [Henzinger, Rao, Gabow FOCS’96; Linial, Lovász, Wigderson FOCS’86], where Õ hides polylogarithmic terms and ω<2.38 is the matrix multiplication exponent.

In this paper, we present a randomized Monte Carlo algorithm with Õ(m + k^{7/3} n^{4/3}) time for any k=O(√n). This gives the first subquadratic time bound for any 4 ≤ k ≤ o(n^{2/7}) (subquadratic time refers to O(m)+o(n^2) time) and improves all above classic bounds for all k ≤ n^{0.44}. We also present a new randomized Monte Carlo (1+є)-approximation algorithm that is strictly faster than the previous 2-approximation algorithm of Henzinger [J. Algorithms’97] and all previous exact algorithms. The story is the same for the directed case, where our exact Õ(min{k m^{2/3} n, k m^{4/3}})-time algorithm for any k = O(√n) and our (1+є)-approximation algorithm improve the classic bounds for small and large k, respectively. Additionally, our algorithm is the first approximation algorithm on directed graphs.

The key to our results is to avoid computing single-source connectivity, which was needed by all previous exact algorithms and is not known to admit o(n^2) time. Instead, we design the first local algorithm for computing vertex connectivity; without reading the whole graph, our algorithm can find a separator of size at most k or certify that there is no separator of size at most k “near” a given seed node.

O(log^2 k / log log k)-approximation algorithm for directed Steiner tree: a tight quasi-polynomial-time algorithm

In the Directed Steiner Tree (DST) problem we are given an n-vertex directed edge-weighted graph, a root r, and a collection of k terminal nodes. Our goal is to find a minimum-cost subgraph that contains a directed path from r to every terminal. We present an O(log^2 k/log log k)-approximation algorithm for DST that runs in quasi-polynomial time, i.e., in time n^{polylog(k)}. Under standard complexity assumptions, we show a matching lower bound of Ω(log^2 k/log log k) for the class of quasi-polynomial-time algorithms, meaning that our approximation ratio is asymptotically the best possible. This is the first improvement on the DST problem since the classical quasi-polynomial-time O(log^3 k)-approximation algorithm by Charikar et al. [SODA’98; J. Algorithms’99]. (The paper erroneously claims an O(log^2 k) approximation due to a mistake in prior work.)

Our approach is based on two main ingredients. First, we derive an approximation-preserving reduction to the Group Steiner Tree on Trees with Dependency Constraint (GSTTD) problem. Compared to the classic Group Steiner Tree on Trees problem, in GSTTD we are additionally given some dependency constraints among the nodes in the output tree that must be satisfied. The GSTTD instance has quasi-polynomial size and logarithmic height. We remark that, in contrast, Zelikovsky’s height-reduction theorem [Algorithmica’97] used in all prior work on DST achieves a reduction to a tree instance of the related Group Steiner Tree (GST) problem of similar height, but loses a logarithmic factor in the approximation ratio.

Our second ingredient is an LP-rounding algorithm to approximately solve GSTTD instances, which is inspired by the framework developed by [Rothvoss, Preprint’11; Friggstad et al., IPCO’14]. We consider a Sherali-Adams lifting of a proper LP relaxation of GSTTD. Our rounding algorithm proceeds level by level from the root to the leaves, rounding and conditioning each time on a proper subset of label variables. The limited height of the tree and small number of labels on root-to-leaf paths guarantee that a small enough (namely, polylogarithmic) number of Sherali-Adams lifting levels is sufficient to condition up to the leaves. We believe that our basic strategy of combining label-based reductions with a round-and-condition type of LP-rounding over hierarchies might find applications to other related problems.

SESSION: Streaming

Polynomial pass lower bounds for graph streaming algorithms

We present new lower bounds that show that a polynomial number of passes are necessary for solving some fundamental graph problems in the streaming model of computation. For instance, we show that any streaming algorithm that finds a weighted minimum s-t cut in an n-vertex undirected graph requires n^{2−o(1)} space unless it makes n^{Ω(1)} passes over the stream.

To prove our lower bounds, we introduce and analyze a new four-player communication problem that we refer to as the hidden-pointer chasing problem. This is a problem in the spirit of the standard pointer chasing problem, with the key difference that the pointers in this problem are hidden from the players, and finding each one of them requires solving another communication problem, namely the set intersection problem. Our lower bounds for graph problems are then obtained by reductions from the hidden-pointer chasing problem.

Our hidden-pointer chasing problem appears flexible enough to find other applications and is therefore interesting in its own right. To showcase this, we further present an interesting application of this problem beyond streaming algorithms. Using a reduction from hidden-pointer chasing, we prove that any algorithm for submodular function minimization needs to make n^{2−o(1)} value queries to the function unless it has a polynomial degree of adaptivity.

An optimal space lower bound for approximating MAX-CUT

We consider the problem of estimating the value of MAX-CUT in a graph in the streaming model of computation. At one extreme, there is a trivial 2-approximation for this problem that uses only O(log n) space, namely, count the number of edges and output half of this value as the estimate for the size of the MAX-CUT. On the other extreme, for any fixed є > 0, if one allows Õ(n) space, a (1+є)-approximate solution to the MAX-CUT value can be obtained by storing an Õ(n)-size sparsifier that essentially preserves MAX-CUT value.
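
The trivial 2-approximation mentioned above is simple enough to state as a one-pass sketch (illustrative only, not taken from the paper):

    def maxcut_estimate(edge_stream):
        """One pass, O(log n) space: count the edges and return half.

        Every graph has a cut containing at least half of its edges, and no
        cut exceeds the total number of edges, so m/2 is within a factor 2
        of the MAX-CUT value.
        """
        m = 0
        for _ in edge_stream:
            m += 1
        return m / 2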

Our main result is that any (randomized) single pass streaming algorithm that breaks the 2-approximation barrier requires Ω(n) space, thus resolving the space complexity of any non-trivial approximations of the MAX-CUT value to within polylogarithmic factors in the single pass streaming model. We achieve the result by presenting a tight analysis of the Implicit Hidden Partition Problem introduced by Kapralov et al. [SODA’17] for an arbitrarily large number of players. In this problem a number of players receive random matchings of Ω(n) size together with random bits on the edges, and their task is to determine whether the bits correspond to parities of some hidden bipartition, or are just uniformly random.

Unlike all previous Fourier analytic communication lower bounds, our analysis does not directly use bounds on the ℓ2 norm of Fourier coefficients of a typical message at any given weight level that follow from hypercontractivity. Instead, we use the fact that graphs received by players are sparse (matchings) to obtain strong upper bounds on the ℓ1 norm of the Fourier coefficients of the messages of individual players using their special structure, and then argue, using the convolution theorem, that similar strong bounds on the ℓ1 norm are essentially preserved (up to an exponential loss in the number of players) once messages of different players are combined. We feel that our main technique is likely of independent interest.

Stronger l2/l2 compressed sensing; without iterating

We consider the extensively studied problem of ℓ2/ℓ2 compressed sensing. The main contribution of our work is an improvement over [Gilbert, Li, Porat and Strauss, STOC 2010] with faster decoding time and significantly smaller column sparsity, answering two open questions of the aforementioned work.

Previous work on sublinear-time compressed sensing employed an iterative procedure, recovering the heavy coordinates in phases. We completely depart from that framework, and give the first sublinear-time ℓ2/ℓ2 scheme which achieves the optimal number of measurements without iterating; this new approach is the key step to our progress. Towards that, we satisfy the ℓ2/ℓ2 guarantee by exploiting the heaviness of coordinates in a way that was not exploited in previous work. Via our techniques we obtain improved results for various sparse recovery tasks, and indicate possible further applications to problems in the field, to which the aforementioned iterative procedure creates significant obstructions.

SESSION: Privacy

Private selection from private candidates

Differentially private algorithms often need to select the best amongst many candidate options. Classical works on this selection problem require that the candidates’ goodness, measured as a real-valued score function, does not change by much when one person’s data changes. In many applications such as hyperparameter optimization, this stability assumption is much too strong. In this work, we consider the selection problem under a much weaker stability assumption on the candidates, namely that the score functions are differentially private. Under this assumption, we present algorithms that are near-optimal along the three relevant dimensions: privacy, utility and computational efficiency.

Our result can be seen as a generalization of the exponential mechanism and its existing generalizations. We also develop an online version of our algorithm, that can be seen as a generalization of the sparse vector technique to this weaker stability assumption. We show how our results imply better algorithms for hyperparameter selection in differentially private machine learning, as well as for adaptive data analysis.

The structure of optimal private tests for simple hypotheses

Hypothesis testing plays a central role in statistical inference, and is used in many settings where privacy concerns are paramount. This work answers a basic question about privately testing simple hypotheses: given two distributions P and Q, and a privacy level ε, how many i.i.d. samples are needed to distinguish P from Q subject to ε-differential privacy, and what sort of tests have optimal sample complexity? Specifically, we characterize this sample complexity up to constant factors in terms of the structure of P and Q and the privacy level ε, and show that this sample complexity is achieved by a certain randomized and clamped variant of the log-likelihood ratio test. Our result is an analogue of the classical Neyman-Pearson lemma in the setting of private hypothesis testing. We also give an application of our result to private change-point detection. Our characterization applies more generally to hypothesis tests satisfying essentially any notion of algorithmic stability, which is known to imply strong generalization bounds in adaptive data analysis, and thus our results have applications even when privacy is not a primary concern.

Gentle measurement of quantum states and differential privacy

In differential privacy (DP), we want to query a database about n users, in a way that “leaks at most ε about any individual user,” even conditioned on any outcome of the query. Meanwhile, in gentle measurement, we want to measure n quantum states, in a way that “damages the states by at most α,” even conditioned on any outcome of the measurement. In both cases, we can achieve the goal by techniques like deliberately adding noise to the outcome before returning it. This paper proves a new and general connection between the two subjects. Specifically, we show that on products of n quantum states, any measurement that is α-gentle for small α is also O(α)-DP, and any product measurement that is ε-DP is also O(ε√n)-gentle.

Illustrating the power of this connection, we apply it to the recently studied problem of shadow tomography. Given an unknown d-dimensional quantum state ρ, as well as known two-outcome measurements E_1,…,E_m, shadow tomography asks us to estimate Pr[E_i accepts ρ], for every i∈[m], by measuring few copies of ρ. Using our connection theorem, together with a quantum analog of the so-called private multiplicative weights algorithm of Hardt and Rothblum, we give a protocol to solve this problem using order (log m)^2 (log d)^2 copies of ρ, compared to Aaronson’s previous bound of O((log m)^4 (log d)). Our protocol has the advantages of being online (that is, the E_i’s are processed one at a time), gentle, and conceptually simple.

Other applications of our connection include new lower bounds for shadow tomography from lower bounds on DP, and a result on the safe use of estimation algorithms as subroutines inside larger quantum algorithms.

SESSION: Graph Algorithms II Distributed/Dynamic

Distributed exact weighted all-pairs shortest paths in near-linear time

In the distributed all-pairs shortest paths problem (APSP), every node in the weighted undirected distributed network (the CONGEST model) needs to know the distance from every other node using the least number of communication rounds (typically called time complexity). The problem admits a (1+o(1))-approximation Θ(n)-time algorithm and a nearly-tight Ω(n) lower bound [Nanongkai, STOC’14; Lenzen and Patt-Shamir PODC’15]. For the exact case, Elkin [STOC’17] presented an O(n^{5/3} log^{2/3} n) time bound, which was later improved to Õ(n^{5/4}) in [Huang, Nanongkai, Saranurak FOCS’17]. It was shown that any super-linear lower bound (in n) requires a new technique [Censor-Hillel, Khoury, Paz, DISC’17], but otherwise it remained widely open whether there exists an Õ(n)-time algorithm for the exact case, which would match the best possible approximation algorithm. This paper resolves this question positively: we present a randomized (Las Vegas) Õ(n)-time algorithm, matching the lower bound up to polylogarithmic factors. Like the previous Õ(n^{5/4}) bound, our result works for directed graphs with zero (and even negative) edge weights. In addition to the improved running time, our algorithm works in a more general setting than that required by the previous Õ(n^{5/4}) bound; in our setting (i) the communication is only along edge directions (as opposed to bidirectional), and (ii) edge weights are arbitrary (as opposed to integers in {1, 2, ..., poly(n)}). The previously best algorithm for this more difficult setting required Õ(n^{3/2}) time [Agarwal and Ramachandran, ArXiv’18] (this can be improved to Õ(n^{4/3}) if one allows bidirectional communication).

Our algorithm is extremely simple and relies on a new technique called Random Filtered Broadcast. Given any sets of nodes A, B ⊆ V, and assuming that every b ∈ B knows all distances from nodes in A, and every node v ∈ V knows all distances from nodes in B, we want every v ∈ V to know DistThrough_B(a,v) = min_{b∈B} dist(a,b) + dist(b,v) for every a ∈ A. Previous works typically solve this problem by broadcasting all knowledge of every b ∈ B, causing super-linear edge congestion and time. We show a randomized algorithm that can reduce edge congestions and thus solve this problem in Õ(n) expected time.
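
Concretely, the quantity every node must learn is the following (a centralized restatement in our notation; the paper's contribution is computing it distributedly with low congestion, which this sketch does not capture):

    def dist_through_B(dist_A_to_B, dist_B_to_v, A, B):
        """DistThrough_B(a, v) = min over b in B of dist(a, b) + dist(b, v).

        dist_A_to_B[b][a]: distance between a and b (known at every b in B).
        dist_B_to_v[b]:    distance between b and the node v (known at v).
        Returns a dict mapping each a in A to DistThrough_B(a, v).
        """
        return {
            a: min(dist_A_to_B[b][a] + dist_B_to_v[b] for b in B)
            for a in A
        }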

Distributed edge connectivity in sublinear time

We present the first sublinear-time algorithm that can compute the edge connectivity λ of a network exactly on distributed message-passing networks (the CONGEST model), as long as the network contains no multi-edge. Our algorithm takes Õ(n^{1−1/353} D^{1/353} + n^{1−1/706}) time to compute λ and a cut of cardinality λ with high probability, where n and D are the number of nodes and the diameter of the network, respectively, and Õ hides polylogarithmic factors. This running time is sublinear in n (i.e. Õ(n^{1−є})) whenever D is. Previous sublinear-time distributed algorithms can solve this problem either (i) exactly only when λ=O(n^{1/8−є}) [Thurimella PODC’95; Pritchard, Thurimella, ACM Trans. Algorithms’11; Nanongkai, Su, DISC’14] or (ii) approximately [Ghaffari, Kuhn, DISC’13; Nanongkai, Su, DISC’14]. To achieve this we develop and combine several new techniques. First, we design the first distributed algorithm that can compute a k-edge connectivity certificate for any k=O(n^{1−є}) in time Õ(√(nk) + D). The previous sublinear-time algorithm can do so only when k=o(√n) [Thurimella PODC’95]. In fact, our algorithm can be turned into the first parallel algorithm with polylogarithmic depth and near-linear work. Previous near-linear work algorithms are essentially sequential and previous polylogarithmic-depth algorithms require Ω(mk) work in the worst case (e.g. [Karger, Motwani, STOC’93]). Second, we show that by combining the recent distributed expander decomposition technique of [Chang, Pettie, Zhang, SODA’19] with techniques from the sequential deterministic edge connectivity algorithm of [Kawarabayashi, Thorup, STOC’15], we can decompose the network into a sublinear number of clusters with small average diameter and without any mincut separating a cluster (except the “trivial” ones). This leads to a simplification of the Kawarabayashi-Thorup framework (except that we are randomized while they are deterministic). This might make this framework more useful in other models of computation. Finally, by extending the tree packing technique from [Karger STOC’96], we can find the minimum cut in time proportional to the number of components. As a byproduct of this technique, we obtain an Õ(n)-time algorithm for computing exact minimum cut for weighted graphs.

Towards the locality of Vizing’s theorem

Vizing showed that it suffices to color the edges of a simple graph using Δ + 1 colors, where Δ is the maximum degree of the graph. However, up to this date, no efficient distributed edge-coloring algorithm is known for obtaining such a coloring, even for constant degree graphs. The current algorithms that get closest to this number of colors are the randomized (Δ + Θ(√Δ))-edge-coloring algorithm that runs in polylog(n) rounds by Chang et al. [SODA 2018] and the deterministic (Δ + polylog(n))-edge-coloring algorithm that runs in poly(Δ, log n) rounds by Ghaffari et al. [STOC 2018].

We present two distributed edge-coloring algorithms that run in poly(Δ, log n) rounds. The first algorithm, with randomization, uses only Δ+2 colors. The second algorithm is a deterministic algorithm that uses Δ + O(log n/log log n) colors. Our approach is to reduce the distributed edge-coloring problem into an online and restricted version of the balls-into-bins problem. If ℓ is the maximum load of the bins, our algorithm uses Δ + 2ℓ − 1 colors. We show how to achieve ℓ = 1 with randomization and ℓ = O(log n/log log n) without randomization.

Decremental strongly-connected components and single-source reachability in near-linear time

Computing the Strongly-Connected Components (SCCs) in a graph G=(V,E) is known to take only O(m+n) time using an algorithm by Tarjan from 1972 [SICOMP 72], where m = |E| and n = |V|. For fully-dynamic graphs, conditional lower bounds provide evidence that the update time cannot be improved by polynomial factors over recomputing the SCCs from scratch after every update. Nevertheless, substantial progress has been made to find algorithms with fast update time for decremental graphs, i.e. graphs that undergo edge deletions.

In this paper, we present the first algorithm for general decremental graphs that maintains the SCCs in total update time Õ(m), thus only a polylogarithmic factor from the optimal running time. Previously such a result was only known for the special case of planar graphs [Italiano et al., STOC 17]. Our result should be compared to the formerly best algorithm for general graphs achieving Õ(m√n) total update time by Chechik et al. [FOCS 16], which improved upon a breakthrough result leading to O(m n^{0.9+o(1)}) total update time by Henzinger, Krinninger and Nanongkai [STOC 14, ICALP 15]; these results in turn improved upon the longstanding bound of O(mn) by Roditty and Zwick [STOC 04].

All of the above results also apply to the decremental Single-Source Reachability (SSR) problem, which can be reduced to decrementally maintaining SCCs. A bound of O(mn) total update time for decremental SSR was established already in 1981 by Even and Shiloach [JACM 81].

Dynamic low-stretch trees via dynamic low-diameter decompositions

Spanning trees of low average stretch on the non-tree edges, as introduced by Alon et al. [SICOMP 1995], are a natural graph-theoretic object. In recent years, they have found significant applications in solvers for symmetric diagonally dominant (SDD) linear systems. In this work, we provide the first dynamic algorithm for maintaining such trees under edge insertions and deletions to the input graph. Our algorithm has update time n^{1/2+o(1)} and the average stretch of the maintained tree is n^{o(1)}, which matches the stretch in the seminal result of Alon et al.

Similar to Alon et al., our dynamic low-stretch tree algorithm employs a dynamic hierarchy of low-diameter decompositions (LDDs). As a major building block we use a dynamic LDD that we obtain by adapting the random-shift clustering of Miller et al. [SPAA 2013] to the dynamic setting.

The major technical challenge in our approach is to control the propagation of updates within our hierarchy of LDDs: each update to one level of the hierarchy could potentially induce several insertions and deletions to the next level of the hierarchy. We achieve this goal by a sophisticated amortization approach. In particular, we give a bound on the number of changes made to the LDD per update to the input graph that is significantly better than the trivial bound implied by the update time.

We believe that the dynamic random-shift clustering might be useful for independent applications. One of these applications is the dynamic spanner problem. By combining the random-shift clustering with the recent spanner construction of Elkin and Neiman [SODA 2017], we obtain a fully dynamic algorithm for maintaining a spanner of stretch 2k − 1 and size O(n^{1+1/k} log n) with amortized update time O(k log^2 n) for any integer 2 ≤ k ≤ log n. Compared to the state of the art in this regime [Baswana et al., TALG 2012], we improve upon the size of the spanner and the update time by a factor of k.

A new algorithm for decremental single-source shortest paths with applications to vertex-capacitated flow and cut problems

We study the vertex-decremental Single-Source Shortest Paths (SSSP) problem: given an undirected graph G=(V,E) with lengths ℓ(e)≥ 1 on its edges that undergoes vertex deletions, and a source vertex s, we need to support (approximate) shortest-path queries in G: given a vertex v, return a path connecting s to v, whose length is at most (1+є) times the length of the shortest such path, where є is a given accuracy parameter. The problem has many applications, for example to flow and cut problems in vertex-capacitated graphs. Decremental SSSP is a fundamental problem in dynamic algorithms that has been studied extensively, especially in the more standard edge-decremental setting, where the input graph G undergoes edge deletions. The classical algorithm of Even and Shiloach supports exact shortest-path queries in O(mn) total update time. A series of recent results have improved this bound to O(m^{1+o(1)} log L), where L is the largest length of any edge. However, these improved results are randomized algorithms that assume an oblivious adversary. To go beyond the oblivious adversary restriction, recently, Bernstein, and Bernstein and Chechik designed deterministic algorithms for the problem, with total update time Õ(n^2 log L), that by definition work against an adaptive adversary. Unfortunately, their algorithms introduce a new limitation, namely, they can only return the approximate length of a shortest path, and not the path itself. Many applications of the decremental SSSP problem, including the ones considered in this paper, crucially require both that the algorithm returns the approximate shortest paths themselves and not just their lengths, and that it works against an adaptive adversary. Our main result is a randomized algorithm for vertex-decremental SSSP with total expected update time O(n^{2+o(1)} log L), that responds to each shortest-path query in Õ(n log L) time in expectation, returning a (1+є)-approximate shortest path. The algorithm works against an adaptive adversary. The main technical ingredient of our algorithm is an Õ(|E(G)| + n^{1+o(1)})-time algorithm to compute a core decomposition of a given dense graph G, which allows us to compute short paths between pairs of query vertices in G efficiently. We use our result for vertex-decremental SSSP to obtain (1+є)-approximation algorithms for maximum s-t flow and minimum s-t cut in vertex-capacitated graphs, in expected time n^{2+o(1)}, and an O(log^4 n)-approximation algorithm for the vertex version of the sparsest cut problem with expected running time n^{2+o(1)}. These results improve upon the previous best known algorithms for these problems in the regime where m = ω(n^{1.5+o(1)}).

SESSION: Complexity Theory I Circuits

Near-optimal lower bounds on the threshold degree and sign-rank of AC0

The threshold degree of a Boolean function f: {0,1}^n → {0,1} is the minimum degree of a real polynomial p that represents f in sign: sgn p(x) = (−1)^{f(x)}. A related notion is sign-rank, defined for a Boolean matrix F = [F_{ij}] as the minimum rank of a real matrix M with sgn M_{ij} = (−1)^{F_{ij}}. Determining the maximum threshold degree and sign-rank achievable by constant-depth circuits (AC0) is a well-known and extensively studied open problem, with complexity-theoretic and algorithmic applications.
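
As a standard illustration of the definition (ours, not taken from the paper): the n-bit AND function has threshold degree 1, witnessed by the degree-1 polynomial

    p(x) \;=\; \Bigl(n - \tfrac{1}{2}\Bigr) - \sum_{i=1}^{n} x_i ,

since p(x) < 0 exactly on the unique input with AND(x) = 1 and p(x) > 0 on all other inputs, so sgn p(x) = (−1)^{AND(x)}.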

We give an essentially optimal solution to this problem. For any є>0, we construct an AC0 circuit in n variables that has threshold degree Ω(n^{1−є}) and sign-rank exp(Ω(n^{1−є})), improving on the previous best lower bounds of Ω(√n) and exp(Ω(√n)), respectively. Our results subsume all previous lower bounds on the threshold degree and sign-rank of AC0 circuits of any given depth, with a strict improvement starting at depth 4. As a corollary, we also obtain near-optimal bounds on the discrepancy, threshold weight, and threshold density of AC0, strictly subsuming previous work on these quantities. Our work gives some of the strongest lower bounds to date on the communication complexity of AC0.

Reconstruction of non-degenerate homogeneous depth three circuits

A homogeneous depth three circuit C computes a polynomial f = T_1 + T_2 + ⋯ + T_s, where each T_i is a product of d linear forms in n variables over some underlying field F. Given black-box access to f, can we efficiently reconstruct (i.e., properly learn) a homogeneous depth three circuit computing f? Learning various subclasses of circuits is natural and interesting from both theoretical and practical standpoints; in particular, properly learning homogeneous depth three circuits efficiently is stated as an open problem in a work by Klivans and Shpilka (COLT 2003) and is well-studied. Unfortunately, there is a substantial amount of evidence that this is a hard problem in the worst case.

We give a (randomized) poly(n,d,s)-time algorithm to reconstruct non-degenerate homogeneous depth three circuits for n = Ω(d^2) (with some additional mild requirements on s and the characteristic of F). We call a circuit C non-degenerate if the dimension of the partial derivative space of f equals the sum of the dimensions of the partial derivative spaces of the terms T_1, T_2, …, T_s. In this sense, the terms are “independent” of each other in a non-degenerate circuit. A random homogeneous depth three circuit (where the coefficients of the linear forms are chosen according to the uniform distribution or any other reasonable distribution) is almost surely non-degenerate. In comparison, previous learning algorithms for this circuit class were either improper (with an exponential dependence on d), or they only worked for s < n (with a doubly exponential dependence of the running time on s).

The main contribution of this work is to formulate the following paradigm for efficiently handling addition gates and to successfully implement it for the class of homogeneous depth three circuits. The problem of finding the children of an addition gate with large fan-in s is first reduced to the problem of decomposing a suitable vector space U into a (direct) sum of simpler subspaces U_1, U_2, …, U_s. One then constructs a suitable space of operators S consisting of linear maps acting on U such that analyzing the simultaneous global structure of S enables us to efficiently decompose U. In our case, we exploit the structure of the set of low-rank matrices in S and of the invariant subspaces of U induced by S. We feel that this paradigm is novel and powerful: it should lead to efficient reconstruction of many other subclasses of circuits for which the efficient reconstruction problem had hitherto looked unapproachable because of the presence of large fan-in addition gates.

Separating monotone VP and VNP

This work is about the monotone versions of the algebraic complexity classes VP and VNP. The main result is that monotone VNP is strictly stronger than monotone VP.

On the approximation resistance of balanced linear threshold functions

In this paper, we show that there exists a balanced linear threshold function (LTF) which is unique games hard to approximate, refuting a conjecture of Austrin, Benabbas, and Magen. We also show that the almost monarchy predicate P(x) = sign((k−4)x_1 + ∑_{i=2}^{k} x_i) is approximable for sufficiently large k.

A fixed-depth size-hierarchy theorem for AC0[⊕] via the coin problem

In this work we prove the first Fixed-depth Size-Hierarchy Theorem for uniform AC0[⊕]. In particular, we show that for any fixed d, the classes C_{d,k} of functions that have uniform AC0[⊕] formulas of depth d and size n^k form an infinite hierarchy. We show this by exhibiting the first class of explicit functions for which we have nearly (up to a polynomial factor) matching upper and lower bounds for the class of AC0[⊕] formulas.

The explicit functions are derived from the δ-Coin Problem, which is the computational problem of distinguishing between coins that are heads with probability (1+δ)/2 or (1−δ)/2, where δ is a parameter going to 0. We study the complexity of this problem and make progress on both the upper bound and lower bound fronts.
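For orientation, the statistical baseline (not the paper's formula constructions) already suggests why poly(1/δ) samples suffice information-theoretically: a majority vote over Θ(1/δ^2) flips distinguishes the two biases with constant advantage. A small simulation sketch:

```python
# Statistical baseline for the delta-coin problem: a majority vote over
# Theta(1/delta^2) flips distinguishes bias (1+delta)/2 from (1-delta)/2
# with constant advantage (a sample-complexity benchmark, not an AC0 formula).
import random

def majority_guess(delta, m):
    bias = random.choice([(1 + delta) / 2, (1 - delta) / 2])   # unknown coin
    heads = sum(random.random() < bias for _ in range(m))
    guess = (1 + delta) / 2 if heads >= m / 2 else (1 - delta) / 2
    return guess == bias

delta = 0.05
m = int(4 / delta ** 2)                                        # poly(1/delta) samples
trials = 2000
acc = sum(majority_guess(delta, m) for _ in range(trials)) / trials
print(f"samples = {m}, empirical success rate ~ {acc:.2f}")    # well above 1/2
```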

Upper bounds. For any constant d ≥ 2, we show that there are explicit monotone AC0 formulas (i.e., made up of AND and OR gates only) solving the δ-coin problem that have depth d, size exp(O(d(1/δ)^{1/(d−1)})), and sample complexity (i.e., number of inputs) poly(1/δ). This matches previous upper bounds of O’Donnell and Wimmer (ICALP 2007) and Amano (ICALP 2009) in terms of size (which is optimal) and improves the sample complexity from exp(O(d(1/δ)^{1/(d−1)})) to poly(1/δ).

Lower bounds. We show that the above upper bounds are nearly tight (in terms of size) even for the significantly stronger model of AC0[⊕] formulas (which are also allowed NOT and Parity gates): formally, we show that any AC0[⊕] formula solving the δ-coin problem must have size exp(Ω(d(1/δ)^{1/(d−1)})). This strengthens a result of Shaltiel and Viola (SICOMP 2010), who prove an exp(Ω((1/δ)^{1/(d+2)})) lower bound for AC0[⊕], and a lower bound of exp(Ω((1/δ)^{1/(d−1)})) shown by Cohen, Ganor and Raz (APPROX-RANDOM 2014) for the class AC0.

The upper bound is a derandomization involving a use of Janson’s inequality and classical combinatorial designs. The lower bound involves proving an optimal degree lower bound for polynomials over F_2 solving the δ-coin problem.

DNF sparsification beyond sunflowers

There are two natural complexity measures associated with DNFs: their size, which is the number of clauses; and their width, which is the maximal number of variables in a clause. It is a folklore result that DNFs of small size can be approximated by DNFs of small width (logarithmic in the size). The other direction is much less clear.

Gopalan, Meka and Reingold [Computational Complexity 2013] showed that the other direction – DNF sparsification – holds as well: any DNF of width w can be approximated to within error ε by a DNF of size (w log(1/ε))^{O(w)}. Our main interest in this work is the dependence on the width w. The same dependence of w^w appears in several other open problems in combinatorics and complexity, such as the Erdős–Rado sunflower conjecture and Mansour’s conjecture. In fact, there are deep connections between these three problems. Our main result is DNF compression with an improved dependence on the width, which overcomes the w^w barrier. Concretely, we show that any DNF of width w can be approximated to within error ε by a DNF of size (1/ε)^{O(w)}.

The proof centers around a new object which we call the DNF index function. Given a DNF, the DNF index function outputs, for an input, the first clause that the input satisfies (if one exists). Our proof has two parts: a combinatorial part, where we exhibit a switching lemma for the DNF index function; and an analytic part, where we use the switching lemma to bound the noise sensitivity of the DNF index function, and then use it to obtain our DNF compression result.
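Concretely, the DNF index function is just "first satisfied clause". A hypothetical minimal sketch, with a clause encoded as required (variable, value) pairs:

```python
# Sketch of the DNF index function: return the index of the first clause
# that the input satisfies, or None if the DNF evaluates to 0 on the input.
def dnf_index(clauses, x):
    # Each clause is a list of (variable, required value) pairs; x is a 0/1 tuple.
    for i, clause in enumerate(clauses):
        if all(x[var] == val for var, val in clause):
            return i
    return None

# Width-2 DNF: (x0 AND NOT x1) OR (x1 AND x2)
dnf = [[(0, 1), (1, 0)], [(1, 1), (2, 1)]]
print(dnf_index(dnf, (1, 0, 0)))   # 0
print(dnf_index(dnf, (0, 1, 1)))   # 1
print(dnf_index(dnf, (0, 0, 1)))   # None
```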

SESSION: Quantum Computation II

Quantum Lovász local lemma: Shearer’s bound is tight

The Lovász Local Lemma (LLL) is a very powerful tool in combinatorics and probability theory for showing the possibility of avoiding all “bad” events under some “weakly dependent” condition. Over the last decades, the algorithmic aspects of the LLL have also attracted a lot of attention in theoretical computer science. A tight criterion under which the abstract version of the LLL (ALLL) holds was given by Shearer. It turns out that Shearer’s bound is generally not tight for the variable version of the LLL (VLLL). Recently, Ambainis et al. introduced a quantum version of the LLL (QLLL), which was then shown to be powerful for the quantum satisfiability problem.

In this paper, we prove that Shearer’s bound is tight for QLLL, i.e., the relative dimension of the smallest satisfying subspace is completely characterized by the independent set polynomial, affirming a conjecture proposed by Sattath et al. Our result also shows the tightness of Gilyén and Sattath’s algorithm, and implies that the lattice gas partition function fully characterizes quantum satisfiability for almost all Hamiltonians with large enough qudits.

Commuting LLL (CLLL), LLL for commuting local Hamiltonians which are widely studied in the literature, is also investigated here. We prove that the tight regions of CLLL and QLLL are different in general. This result might imply that it is possible to design an algorithm for CLLL which is still efficient beyond Shearer’s bound.

In applications of LLLs, the symmetric case is most common, i.e., all events have the same probability and all Hamiltonians have the same relative dimension. We give the first lower bound on the gap between the symmetric VLLL and Shearer’s bound. Our result can be viewed as a quantitative study of the separation between quantum and classical constraint satisfaction problems. Additionally, we obtain similar results for the symmetric CLLL. As an application, we give lower bounds on the critical thresholds of VLLL and CLLL for several of the most common lattices.

Quantum proof systems for iterated exponential time, and beyond

We show that any language solvable in nondeterministic time exp(exp(⋯exp(n))), where the number of iterated exponentials is an arbitrary function R(n), can be decided by a multiprover interactive proof system with a classical polynomial-time verifier and a constant number of quantum entangled provers, with completeness 1 and soundness 1 − exp(−C exp(⋯exp(n))), where the number of iterated exponentials is R(n)−1 and C>0 is a universal constant. The result was previously known for R=1 and R=2; we obtain it for any time-constructible function R.

The result is based on a compression technique for interactive proof systems with entangled provers that significantly simplifies and strengthens a protocol compression result of Ji (STOC’17). As a separate consequence of this technique we obtain a different proof of Slofstra’s recent result on the uncomputability of the entangled value of multiprover games (Forum of Mathematics, Pi 2019).

Finally, we show that even minor improvements to our compression result would yield remarkable consequences in computational complexity theory and the foundations of quantum mechanics: first, it would imply that the class MIP* contains all computable languages; second, it would provide a negative resolution to a multipartite version of Tsirelson’s problem on the relation between the commuting operator and tensor product models for quantum correlations.

Good approximate quantum LDPC codes from spacetime circuit Hamiltonians

We study approximate quantum low-density parity-check (QLDPC) codes, which are approximate quantum error-correcting codes specified as the ground space of a frustration-free local Hamiltonian, whose terms do not necessarily commute.

Such codes generalize stabilizer QLDPC codes, which are exact quantum error-correcting codes with sparse, low-weight stabilizer generators (i.e. each stabilizer generator acts on a few qubits, and each qubit participates in a few stabilizer generators). Our investigation is motivated by an important question in Hamiltonian complexity and quantum coding theory: do stabilizer QLDPC codes with constant rate, linear distance, and constant-weight stabilizers exist?

We show that obtaining such optimal scaling of parameters (modulo polylogarithmic corrections) is possible if we go beyond stabilizer codes: we prove the existence of a family of [[N,k,d,ε]] approximate QLDPC codes that encode k = Ω(N) logical qubits into N physical qubits with distance d = Ω(N) and approximation infidelity ε = 1/polylog(N). The code space is stabilized by a set of 10-local noncommuting projectors, with each physical qubit participating in only polylog(N) projectors. We prove the existence of an efficient encoding map and show that the spectral gap of the code Hamiltonian scales as Ω(N^{−3.09}). We also show that arbitrary Pauli errors can be locally detected by circuits of polylogarithmic depth.

Our family of approximate QLDPC codes is based on applying a recent connection between circuit Hamiltonians and approximate quantum codes (Nirkhe, et al., ICALP 2018) to a result showing that random Clifford circuits of polylogarithmic depth yield asymptotically good quantum codes (Brown and Fawzi, ISIT 2013). Then, in order to obtain a code with sparse checks and strong detection of local errors, we use a spacetime circuit-to-Hamiltonian construction in order to take advantage of the parallelism of the Brown-Fawzi circuits. Because of this, we call our codes spacetime codes.

The analysis of the spectral gap of the code Hamiltonian is the main technical contribution of this work. We show that for any depth-D quantum circuit on n qubits there is an associated spacetime circuit-to-Hamiltonian construction with spectral gap Ω(n^{−3.09} D^{−2} log^{−6}(n)). To lower bound this gap we use a Markov chain decomposition method to divide the state space of partially completed circuit configurations into overlapping subsets corresponding to uniform circuit segments of depth log n, which are based on bitonic sorting circuits. We use the combinatorial properties of these circuit configurations to show rapid mixing between the subsets, and within the subsets we develop a novel isomorphism between the local update Markov chain on bitonic circuit configurations and the edge-flip Markov chain on equal-area dyadic tilings, whose mixing time was recently shown to be polynomial (Cannon, Levin, and Stauffer, RANDOM 2017). Previous lower bounds on the spectral gap of spacetime circuit Hamiltonians have all been based on a connection to exactly solvable quantum spin chains and applied only to 1+1 dimensional nearest-neighbor quantum circuits with at least linear depth.

Hamiltonian simulation with nearly optimal dependence on spectral norm

We present a quantum algorithm for approximating the real time evolution e^{iHt} of an arbitrary d-sparse Hamiltonian to error є, given black-box access to the positions and b-bit values of its non-zero matrix entries. The query complexity of our algorithm is O((t√d ||H||_{1→2})^{1+o(1)}/є^{o(1)}) with respect to the largest Euclidean row norm ||H||_{1→2}, which is shown to be optimal up to subpolynomial factors through a matching lower bound, and it uses a factor O(b) more gates. This provides a polynomial speedup in sparsity for the common case where the spectral norm is known, and generalizes previous approaches which achieve optimal scaling, but with respect to more restrictive parameters. By exploiting knowledge of the spectral norm, our algorithm solves the black-box unitary implementation problem – O(d^{1/2+o(1)}) queries suffice to approximate any d-sparse unitary in the black-box setting, which matches the quantum search lower bound of Ω(√d) queries and improves upon the prior art [Berry and Childs, QIP 2010] of Õ(d^{2/3}) queries. Combined with known techniques, we also solve systems of sparse linear equations with condition number κ using O((κ√d)^{1+o(1)}/є^{o(1)}) queries, which is a quadratic improvement in sparsity.

Quantum state certification

We consider the problem of quantum state certification, where one is given n copies of an unknown d-dimensional quantum mixed state ρ, and one wants to test whether ρ is equal to some known mixed state σ or else is є-far from σ. The goal is to use notably fewer copies than the Ω(d^2) needed for full tomography on ρ (i.e., density estimation). We give two robust state certification algorithms: one with respect to fidelity using n = O(d/є) copies, and one with respect to trace distance using n = O(d/є^2) copies. The latter algorithm also applies when σ is unknown. These copy complexities are optimal up to constant factors.

Exponential separation between shallow quantum circuits and unbounded fan-in shallow classical circuits

Recently, Bravyi, Gosset, and König (Science, 2018) exhibited a search problem called the 2D Hidden Linear Function (2D HLF) problem that can be solved exactly by a constant-depth quantum circuit using bounded fan-in gates (or QNC^0 circuits), but cannot be solved by any constant-depth classical circuit using bounded fan-in AND, OR, and NOT gates (or NC^0 circuits). In other words, they exhibited a search problem in QNC^0 that is not in NC^0.

We strengthen their result by proving that the 2D HLF problem is not contained in AC^0, the class of classical, polynomial-size, constant-depth circuits over the gate set of unbounded fan-in AND and OR gates, and NOT gates. We also supplement this worst-case lower bound with an average-case result: There exists a simple distribution under which any AC^0 circuit (even of nearly exponential size) has exponentially small correlation with the 2D HLF problem.

Our results are shown by constructing a new problem in QNC^0, which we call the Parity Halving Problem, which is easier to work with. We prove our AC^0 lower bounds for this problem, and then show that it reduces to the 2D HLF problem.

SESSION: Property Testing

Testing graphs in vertex-distribution-free models

Prior studies of testing graph properties presume that the tester can obtain uniformly distributed vertices in the tested graph (in addition to obtaining answers to some type of graph queries). Here we envision settings in which it is only feasible to obtain random vertices drawn according to an arbitrary distribution (and, in addition, obtain answers to the usual graph queries). We initiate a study of testing graph properties in such settings, while adapting the definition of distance between graphs so that it reflects the different probability weight of different vertices. Hence, the distance to the property represents the relative importance of the “part of the graph” that violates the property. We consider such “vertex-distribution free” (VDF) versions of the two most-studied models of testing graph properties (i.e., the dense graph model and the bounded-degree model).

In both cases, we show that VDF testing within complexity that is independent of the distribution on the vertex-set (of the tested graph) is possible only if the same property can be tested in the standard model with one-sided error and size-independent complexity. We also show that this necessary condition is not sufficient; yet, we present size-independent VDF testers for many of the natural properties that satisfy the necessary condition.

Testing graphs against an unknown distribution

The classical model of graph property testing, introduced by Goldreich, Goldwasser and Ron, assumes that the algorithm can obtain uniformly distributed vertices from the input graph. Goldreich introduced a more general model, called the Vertex-Distribution-Free model (or VDF for short) in which the testing algorithm obtains vertices drawn from an arbitrary and unknown distribution. The main motivation for this investigation is that it can allow one to give different weight/importance to different parts of the input graph, as well as handle situations where one cannot obtain uniformly selected vertices from the input. Goldreich proved that any property which is testable in this model must (essentially) be hereditary, and that several hereditary properties can indeed be tested in this model. He further asked which properties are testable in this model.

In this paper we completely solve Goldreich’s problem by giving a precise characterization of the graph properties that are testable in the VDF model. Somewhat surprisingly this characterization takes the following clean form: say that a graph property P is extendable if given any graph G satisfying P, one can add one more vertex to G, and connect it to some of the vertices of G in a way that the resulting graph satisfies P. Then a property P is testable in the VDF model if and only if P is hereditary and extendable.

Testing unateness nearly optimally

We present an Õ(n^{2/3}/є^2)-query algorithm that tests whether an unknown Boolean function f: {0,1}^n → {0,1} is unate (i.e., every variable is either non-decreasing or non-increasing) or є-far from unate. The upper bound is nearly optimal given the Ω(n^{2/3}) lower bound of Chen, Waingarten and Xie (2017). The algorithm builds on a novel use of the binary search procedure and its analysis over long random paths.
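To fix the definition being tested, here is a brute-force unateness check for a small, explicitly given function; it reads the whole truth table and is exponential in n, which is exactly what the query-efficient tester avoids:

```python
# Brute-force unateness check for a small Boolean function given as a callable
# on 0/1 tuples; reads the entire truth table (exponential in n), in contrast
# to the sublinear-query tester described above.
from itertools import product

def is_unate(f, n):
    for i in range(n):
        non_decreasing = non_increasing = True
        for x in product([0, 1], repeat=n):
            if x[i] == 0:
                y = x[:i] + (1,) + x[i + 1:]          # flip variable i from 0 to 1
                if f(x) > f(y):
                    non_decreasing = False
                if f(x) < f(y):
                    non_increasing = False
        if not (non_decreasing or non_increasing):
            return False                               # variable i is neither direction
    return True

print(is_unate(lambda x: x[0] & (1 - x[1]), 2))   # True (decreasing in x1)
print(is_unate(lambda x: x[0] ^ x[1], 2))         # False (parity is not unate)
```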

Random walks and forbidden minors II: a poly(dε^{−1})-query tester for minor-closed properties of bounded degree graphs

Let G be a graph with n vertices and maximum degree d. Fix some minor-closed property P (such as planarity). We say that G is ε-far from P if one has to remove εdn edges to make it have P. The problem of property testing P was introduced in the seminal work of Benjamini–Schramm–Shapira (STOC 2008), which gave a tester with query complexity triply exponential in ε^{−1}. Levi–Ron (TALG 2015) have given the best tester to date, with a quasipolynomial (in ε^{−1}) query complexity. It is an open problem to get property testers whose query complexity is poly(dε^{−1}), even for planarity.

In this paper, we resolve this open question. For any minor-closed property, we give a tester with query complexity d · poly(ε^{−1}). The previous line of work on (independent of n, two-sided) testers is primarily combinatorial. Our work, on the other hand, employs techniques from spectral graph theory. This paper is a continuation of recent work of the authors (FOCS 2018) analyzing random walk algorithms that find forbidden minors.

SESSION: Constraint Satisfaction

Bridging between 0/1 and linear programming via random walks

Under the Strong Exponential Time Hypothesis, an integer linear program with n Boolean-valued variables and m equations cannot be solved in c^n time for any constant c < 2. If the domain of the variables is relaxed to [0,1], the associated linear program can of course be solved in polynomial time. In this work, we give a natural algorithmic bridging between these extremes of 0-1 and linear programming. Specifically, for any subset (finite union of intervals) E ⊂ [0,1] containing {0,1}, we give a random-walk based algorithm with runtime O_E((2−measure(E))^n poly(n,m)) that finds a solution in E^n to any n-variable linear program with m constraints that is feasible over {0,1}^n. Note that as E expands from {0,1} to [0,1], the runtime improves smoothly from 2^n to polynomial.

Taking E = [0,1/k) ∪ (1−1/k,1] in our result yields as a corollary a randomized (2−2/k)^n poly(n) time algorithm for k-SAT. While our approach has some high-level resemblance to Schöning’s beautiful algorithm, our general algorithm is based on a more sophisticated random walk that incorporates several new ingredients, such as a multiplicative potential to measure progress, a judicious choice of starting distribution, and a time-varying distribution for the evolution of the random walk that is itself computed via an LP at each step (a solution to which is guaranteed based on the minimax theorem). Plugging the LP algorithm into our earlier polymorphic framework yields fast exponential algorithms for any CSP (like k-SAT, 1-in-3-SAT, NAE k-SAT) that admits so-called “threshold partial polymorphisms.”
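For comparison, the classical Schöning walk that the authors contrast with is only a few lines; the sketch below (for illustration only, and not the paper's LP-guided walk) restarts from a random assignment and repeatedly flips a random literal of an unsatisfied clause:

```python
# Sketch of Schoening's classical random walk for k-SAT (the algorithm the paper's
# LP-guided walk is contrasted with above; this is not the new algorithm).
import random

def schoening(clauses, n, restarts=200):
    # clauses: list of clauses; each clause is a list of nonzero signed literals,
    # +i meaning x_i = 1 and -i meaning x_i = 0 (variables are 1-indexed).
    for _ in range(restarts):
        x = [random.randint(0, 1) for _ in range(n + 1)]       # random start (index 0 unused)
        for _ in range(3 * n):                                 # walk for 3n flips
            unsat = [c for c in clauses
                     if not any((lit > 0) == bool(x[abs(lit)]) for lit in c)]
            if not unsat:
                return x[1:]                                   # satisfying assignment found
            lit = random.choice(random.choice(unsat))          # random literal of a
            x[abs(lit)] ^= 1                                   # random unsatisfied clause
    return None

# (x1 or x2 or x3) and (not x1 or x2) and (not x2 or not x3)
print(schoening([[1, 2, 3], [-1, 2], [-2, -3]], n=3))
```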

Faster k-SAT algorithms using biased-PPSZ

The PPSZ algorithm, due to Paturi, Pudlák, Saks and Zane, is currently the fastest known algorithm for the k-SAT problem for every k > 3. For 3-SAT, a tiny improvement over PPSZ was obtained by Hertli. We introduce a biased version of the PPSZ algorithm, using which we obtain an improvement over PPSZ for every k ≥ 3. For k = 3 we also improve on Hertli’s result and get a much more noticeable improvement over PPSZ, though still a relatively small one. In particular, for Unique 3-SAT, we improve the current bound from 1.308^n to 1.307^n.

CSPs with global modular constraints: algorithms and hardness via polynomial representations

We study the complexity of Boolean constraint satisfaction problems (CSPs) when the assignment must have Hamming weight in some congruence class modulo M, for various choices of the modulus M. Due to the known classification of tractable Boolean CSPs, this mainly reduces to the study of three cases: 2-SAT, HORN-SAT, and LIN-2 (linear equations mod 2). We classify the moduli M for which these respective problems are polynomial time solvable, and when they are not (assuming the ETH). Our study reveals that this modular constraint lends a surprising richness to these classic, well-studied problems, with interesting broader connections to complexity theory and coding theory. The HORN-SAT case is connected to the covering complexity of polynomials representing the NAND function mod M. The LIN-2 case is tied to the sparsity of polynomials representing the OR function mod M, which in turn has connections to modular weight distribution properties of linear codes and locally decodable codes. In both cases, the analysis of our algorithm as well as the hardness reduction rely on these polynomial representations, highlighting an interesting algebraic common ground between hard cases for our algorithms and the gadgets which show hardness. These new complexity measures of polynomial representations merit further study.

The inspiration for our study comes from a recent work by Nägele, Sudakov, and Zenklusen on submodular minimization with a global congruence constraint. Our algorithm for HORN-SAT has strong similarities to their algorithm, and in particular the same kind of set systems arises in both cases. Our connection to polynomial representations leads to a simpler analysis of such set systems, and also sheds light on (but does not resolve) the complexity of submodular minimization with a congruency requirement modulo a composite M.

Algebraic approach to promise constraint satisfaction

The complexity and approximability of the constraint satisfaction problem (CSP) has been actively studied over the last 20 years. A new version of the CSP, the promise CSP (PCSP) has recently been proposed, motivated by open questions about the approximability of variants of satisfiability and graph colouring. The PCSP significantly extends the standard decision CSP. The complexity of CSPs with a fixed constraint language on a finite domain has recently been fully classified, greatly guided by the algebraic approach, which uses polymorphisms — high-dimensional symmetries of solution spaces — to analyse the complexity of problems. The corresponding classification for PCSPs is wide open and includes some long-standing open questions, such as the complexity of approximate graph colouring, as special cases.

The basic algebraic approach to PCSP was initiated by Brakensiek and Guruswami, and in this paper we significantly extend it and lift it from concrete properties of polymorphisms to their abstract properties. We introduce a new class of problems that can be viewed as algebraic versions of the (Gap) Label Cover problem, and show that every PCSP with a fixed constraint language is equivalent to a problem of this form. This allows us to identify a “measure of symmetry” that is well suited for comparing and relating the complexity of different PCSPs via the algebraic approach. We demonstrate how our theory can be applied by improving the state of the art in approximate graph colouring: we show that, for any k ≥ 3, it is NP-hard to find a (2k−1)-colouring of a given k-colourable graph.

SESSION: Pseudorandomness / Algorithmic Game Theory

Fooling polytopes

We give a pseudorandom generator that fools m-facet polytopes over {0,1}^n with seed length polylog(m) · log(n). The previous best seed length had superlinear dependence on m. An immediate consequence is a deterministic quasipolynomial time algorithm for approximating the number of solutions to any {0,1}-integer program.

Pseudorandom generators for width-3 branching programs

We construct pseudorandom generators of seed length Õ(log(n)· log(1/є)) that є-fool ordered read-once branching programs (ROBPs) of width 3 and length n. For unordered ROBPs, we construct pseudorandom generators with seed length Õ(log(n) · poly(1/є)). This is the first improvement for pseudorandom generators fooling width 3 ROBPs since the work of Nisan [Combinatorica, 1992].

Our constructions are based on the “iterated milder restrictions” approach of Gopalan et al. [FOCS, 2012] (which further extends the Ajtai-Wigderson framework [FOCS, 1985]), combined with the INW-generator [STOC, 1994] at the last step (as analyzed by Braverman et al. [SICOMP, 2014]). For the unordered case, we combine iterated milder restrictions with the generator of Chattopadhyay et al. [CCC, 2018].

Two conceptual ideas that play an important role in our analysis are: (1) A relabeling technique allowing us to analyze a relabeled version of the given branching program, which turns out to be much easier. (2) Treating the number of colliding layers in a branching program as a progress measure and showing that it reduces significantly under pseudorandom restrictions.

In addition, we achieve nearly optimal seed-length Õ(log(n/є)) for the classes of: (1) read-once polynomials on n variables, (2) locally-monotone ROBPs of length n and width 3 (generalizing read-once CNFs and DNFs), and (3) constant-width ROBPs of length n having a layer of width 2 in every consecutive polylog(n) layers.

The complexity of splitting necklaces and bisecting ham sandwiches

We resolve the computational complexity of two problems known as Necklace Splitting and Discrete Ham Sandwich, showing that they are PPA-complete. For Necklace Splitting, this result is specific to the important special case in which two thieves share the necklace. We do this via a PPA-completeness result for an approximate version of the Consensus Halving problem, strengthening our recent result that the problem is PPA-complete for inverse-exponential precision. At the heart of our construction is a smooth embedding of the high-dimensional Möbius strip in the Consensus Halving problem. These results settle the status of PPA as a class that captures the complexity of “natural” problems whose definitions do not incorporate a circuit.

The communication complexity of local search

We study a communication variant of local search. There is some fixed, commonly known graph G. Alice holds f_A and Bob holds f_B, both are functions that specify a value for each vertex. The goal is to find a local maximum of f_A + f_B with respect to G, i.e., a vertex v for which (f_A + f_B)(v) ≥ (f_A + f_B)(u) for each neighbor u of v.

Our main result is that finding a local maximum requires polynomial (in the number of vertices) bits of communication. The result holds for the following families of graphs: three dimensional grids, hypercubes, odd graphs, and degree 4 graphs. Moreover, we prove an optimal communication bound of Ω(√N) for the hypercube, and for a constant dimension grid, where N is the number of vertices in the graph.

We provide applications of our main result in two domains, exact potential games and combinatorial auctions. Each one of the results demonstrates an exponential separation between the non-deterministic communication complexity and the randomized communication complexity of a total search problem. First, we show that finding a pure Nash equilibrium in 2-player N-action exact potential games requires poly(N) communication. We also show that finding a pure Nash equilibrium in n-player 2-action exact potential games requires exp(n) communication.

The second domain that we consider is combinatorial auctions, in which we prove that finding a local maximum in combinatorial auctions requires exponential (in the number of items) communication even when the valuations are submodular.

SESSION: EC Joint Session

Settling the sample complexity of single-parameter revenue maximization

This paper settles the sample complexity of single-parameter revenue maximization by showing matching upper and lower bounds, up to a poly-logarithmic factor, for all families of value distributions that have been considered in the literature. The upper bounds are unified under a novel framework, which builds on the strong revenue monotonicity of Devanur, Huang, and Psomas (STOC 2016) and an information-theoretic argument. This is fundamentally different from the previous approaches, which rely on either constructing an є-net of the mechanism space, explicitly or implicitly via statistical learning theory, or learning an approximately accurate version of the virtual values. To our knowledge, this is the first time information-theoretic arguments have been used to show sample complexity upper bounds, instead of lower bounds. Our lower bounds are also unified under a meta construction of hard instances.

Tight approximation ratio of anonymous pricing

This paper considers two canonical Bayesian mechanism design settings. In the single-item setting, the tight approximation ratio of Anonymous Pricing is obtained: (1) compared to Myerson Auction, Anonymous Pricing always generates at least a 1/2.62-fraction of the revenue; (2) there is a matching lower-bound instance.

In the unit-demand single-buyer setting, the tight approximation ratio between the simplest deterministic mechanism and the optimal deterministic mechanism is attained: in terms of revenue, (1) Uniform Pricing admits a 2.62-approximation to Item Pricing; (2) a matching lower-bound instance is presented also.

These results answer two open questions asked by Alaei et al. (FOCS’15) and Cai and Daskalakis (GEB’15). As an implication, in the single-item setting: the approximation ratio of Second-Price Auction with Anonymous Reserve (Hartline and Roughgarden EC’09) is improved to 2.62, which breaks the best known upper bound of e ≈ 2.72.

Optimal (and benchmark-optimal) competition complexity for additive buyers over independent items

The Competition Complexity of an auction setting refers to the number of additional bidders necessary in order for the (deterministic, prior-independent, dominant strategy truthful) Vickrey-Clarke-Groves mechanism to achieve greater revenue than the (randomized, prior-dependent, Bayesian-truthful) optimal mechanism without the additional bidders.

We prove that the competition complexity of n bidders with additive valuations over m independent items is at most n(ln(1+m/n)+2), and also at most 9√(nm). When n ≤ m, the first bound is optimal up to constant factors, even when the items are i.i.d. and regular. When n ≥ m, the second bound is optimal for the benchmark introduced by Eden et al. up to constant factors, even when the items are i.i.d. and regular. We further show that, while the Eden et al. benchmark is not necessarily tight in the n ≥ m regime, the competition complexity of n bidders with additive valuations over even 2 i.i.d. regular items is indeed ω(1).

Our main technical contribution is a reduction from analyzing the Eden et al. benchmark to proving stochastic dominance of certain random variables.

SESSION: Stringology

Near-linear time insertion-deletion codes and (1+ε)-approximating edit distance via indexing

We introduce fast-decodable indexing schemes for edit distance which can be used to speed up edit distance computations to near-linear time if one of the strings is indexed by an indexing string I. In particular, for every length n and every ε > 0, one can in near-linear time construct a string I ∈ Σ′^n with |Σ′| = O_ε(1), such that indexing any string S ∈ Σ^n, symbol-by-symbol, with I results in a string S′ ∈ Σ″^n, where Σ″ = Σ × Σ′, for which edit distance computations are easy: one can compute a (1+ε)-approximation of the edit distance between S′ and any other string in O(n polylog(n)) time.

Our indexing schemes can be used to improve the decoding complexity of state-of-the-art error correcting codes for insertions and deletions. In particular, they lead to near-linear time decoding algorithms for the insertion-deletion codes of [Haeupler, Shahrasbi; STOC '17] and faster decoding algorithms for list-decodable insertion-deletion codes of [Haeupler, Shahrasbi, Sudan; ICALP '18]. Interestingly, the latter codes are a crucial ingredient in the construction of fast-decodable indexing schemes.

1+ε approximation of tree edit distance in quadratic time

Edit distance is one of the most fundamental problems in computer science. Tree edit distance is a natural generalization of edit distance to ordered rooted trees. Such a generalization extends the applications of edit distance to areas such as computational biology, structured data analysis (e.g., XML), image analysis, and compiler optimization. Perhaps the most notable application of tree edit distance is in the analysis of RNA molecules in computational biology where the secondary structure of RNA is typically represented as a rooted tree.

The best-known solution for tree edit distance runs in cubic time. Recently, Bringmann et al. showed that an O(n^{2.99}) algorithm for weighted tree edit distance is unlikely, by proving a conditional lower bound on the computational complexity of tree edit distance. This shows a substantial gap between the computational complexity of tree edit distance and that of edit distance, for which a simple dynamic program solves the problem in quadratic time.
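The quadratic baseline for string edit distance mentioned above is the textbook dynamic program; a minimal sketch:

```python
# Textbook quadratic dynamic program for string edit distance (the baseline that
# the cited conditional lower bound rules out for weighted tree edit distance).
def edit_distance(a, b):
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i                                  # delete all of a[:i]
    for j in range(n + 1):
        dp[0][j] = j                                  # insert all of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            dp[i][j] = min(dp[i - 1][j] + 1,                              # deletion
                           dp[i][j - 1] + 1,                              # insertion
                           dp[i - 1][j - 1] + (a[i - 1] != b[j - 1]))     # substitution
    return dp[m][n]

print(edit_distance("kitten", "sitting"))   # 3
```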

In this work, we give the first non-trivial approximation algorithms for tree edit distance. Our main result is a quadratic time approximation scheme for tree edit distance that approximates the solution within a factor of 1+є for any constant є > 0.

Optimal sequence length requirements for phylogenetic tree reconstruction with indels

We consider the phylogenetic tree reconstruction problem with insertions and deletions (indels). Phylogenetic algorithms proceed under a model where sequences evolve down the model tree, and given sequences at the leaves, the problem is to reconstruct the model tree with high probability. Traditionally, sequences mutate by substitution-only processes, although some recent work considers evolutionary processes with insertions and deletions. In this paper, we improve on previous work by giving a reconstruction algorithm that simultaneously has O(poly log n) sequence length and tolerates constant indel probabilities on each edge. Our recursively-reconstructed distance-based technique provably outputs the model tree when the model tree has O(poly log n) diameter and discretized branch lengths, allowing the probability of insertion and deletion to be non-uniform and asymmetric on each edge. Our polylogarithmic sequence length bounds improve significantly over previous polynomial sequence length bounds and match the sequence length bounds in the substitution-only models of phylogenetic evolution, thereby challenging the idea that the many global misalignments caused by insertions and deletions when p_indel is large are a fundamental obstruction to reconstruction with short sequences.

We build upon a signature scheme for sequences, introduced by Daskalakis and Roch, that is robust to insertions and deletions. Our main contribution is to show that an averaging procedure gives an accurate reconstruction of signatures for ancestors, even though the explicit ancestral sequences cannot be reconstructed due to misalignments. Because these signatures are not as sensitive to indels, we can bound the noise that arises from indel-induced shifts and provide a novel analysis that provably reconstructs the model tree with O(poly log n) sequence length as long as the rate of mutation is less than the well-known Kesten–Stigum threshold. The upper bound on the rate of mutation is optimal, as beyond this threshold an information-theoretic lower bound of Ω(poly(n)) on the sequence length requirement exists.

Computing quartet distance is equivalent to counting 4-cycles

The quartet distance is a measure of similarity used to compare two unrooted phylogenetic trees on the same set of n leaves, defined as the number of subsets of four leaves related by a different topology in both trees. After a series of previous results, Brodal et al. [SODA 2013] presented an algorithm that computes this number in O(nd log n) time, where d is the maximum degree of a node. For the related triplet distance between rooted phylogenetic trees, the same authors were able to design an O(n log n) time algorithm, that is, with running time independent of d. This raises the question of achieving such complexity for computing the quartet distance, or at least improving the dependency on d.

Our main contribution is a two-way reduction establishing that the complexity of computing the quartet distance between two trees on n leaves is the same, up to polylogarithmic factors, as the complexity of counting 4-cycles in an undirected simple graph with m edges. The latter problem has been extensively studied, and the fastest known algorithm by Vassilevska Williams [SODA 2015] works in O(m^{1.48}) time. In fact, even for the seemingly simpler problem of detecting a 4-cycle, the best known algorithm works in O(m^{4/3}) time, and a conjecture of Yuster and Zwick implies that this might be optimal. In particular, an almost-linear time algorithm for computing the quartet distance would imply a surprisingly efficient algorithm for counting 4-cycles. In the other direction, by plugging in the state-of-the-art algorithms for counting 4-cycles, our reduction allows us to significantly decrease the complexity of computing the quartet distance. For trees with unbounded degrees we obtain an O(n^{1.48}) time algorithm, which is a substantial improvement on the previous bound of O(n^2 log n). For trees with degrees bounded by d, by analysing the reduction more carefully, we are able to obtain an Õ(nd^{0.77}) time algorithm, which is again a nontrivial improvement on the previous bound of O(nd log n).
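For reference, the counting problem on the other side of the reduction has a simple naive solution via common neighbours: every 4-cycle has exactly two diagonal pairs, so summing C(codeg(u,v), 2) over vertex pairs counts each cycle twice. The sketch below is this baseline, not the O(m^{1.48}) algorithm cited above:

```python
# Naive 4-cycle counting via common neighbours: every 4-cycle has exactly two
# "diagonal" pairs, so #C4 = (1/2) * sum over pairs {u,v} of C(codeg(u,v), 2).
# This is the simple baseline, not the O(m^1.48) algorithm cited above.
from collections import defaultdict
from itertools import combinations

def count_4cycles(edges):
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    total = 0
    for u, v in combinations(list(adj), 2):
        c = len(adj[u] & adj[v])                  # common neighbours of the pair
        total += c * (c - 1) // 2
    return total // 2

print(count_4cycles([(0, 1), (1, 2), (2, 3), (3, 0)]))              # 1
print(count_4cycles([(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]))      # still 1
```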

Local decodability of the Burrows-Wheeler transform

The Burrows-Wheeler Transform (BWT) is among the most influential discoveries in text compression and DNA storage. It is a reversible preprocessing step that rearranges an n-letter string into runs of identical characters (by exploiting context regularities), resulting in highly compressible strings, and is the basis of the bzip compression program. Alas, the decoding process of BWT is inherently sequential and requires Ω(n) time even to retrieve a single character.
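A minimal sketch of the forward transform via sorted rotations (with a sentinel end marker) may help fix ideas; it is quadratic and purely illustrative, not the paper's compressed data structure:

```python
# Minimal Burrows-Wheeler transform via sorted rotations (O(n^2 log n)); only to
# illustrate how identical characters cluster into runs, not the paper's structures.
def bwt(s, sentinel="$"):
    s += sentinel                                      # unique, lexicographically smallest end marker
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)       # last column of the sorted rotations

print(bwt("banana"))   # annb$aa
```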

We study the succinct data structure problem of locally decoding short substrings of a given text under its compressed BWT, i.e., with small additive redundancy r over the Move-To-Front (bzip) compression. The celebrated BWT-based FM-index (FOCS ’00), as well as other related literature, yield a trade-off of r = Õ(n/√t) bits when a single character is to be decoded in O(t) time. We give a near-quadratic improvement r = Õ(n·lg(t)/t). As a by-product, we obtain an exponential (in t) improvement on the redundancy of the FM-index for counting pattern-matches on compressed text. In the interesting regime where the text compresses to o(n) (say, n/polylg(n)) bits, these results provide an exp(t) overall space reduction. For the local decoding problem of BWT, we also prove an Ω(n/t^2) cell-probe lower bound for “symmetric” data structures.

We achieve our main result by designing a compressed partial-sums (Rank) data structure over the BWT. The key component is a locally-decodable Move-to-Front (MTF) code: with only O(1) extra bits per block of length n^{Ω(1)}, the decoding time of a single character can be decreased from Ω(n) to O(lg n). This result is of independent interest in algorithmic information theory.

String synchronizing sets: sublinear-time BWT construction and optimal LCE data structure

The Burrows–Wheeler transform (BWT) is an invertible text transformation that, given a text T of length n, permutes its symbols according to the lexicographic order of suffixes of T. BWT is one of the most heavily studied algorithms in data compression, with numerous applications in indexing, sequence analysis, and bioinformatics. Its construction is a bottleneck in many scenarios, and settling the complexity of this task is one of the most important unsolved problems in sequence analysis that has remained open for 25 years. Given a binary string of length n, occupying O(n/log n) machine words, the BWT construction algorithm due to Hon et al. (SIAM J. Comput., 2009) runs in O(n) time and O(n/log n) space. Recent advancements (Belazzougui, STOC 2014, and Munro et al., SODA 2017) focus on removing the alphabet-size dependency in the time complexity, but they still require Ω(n) time. Despite the clearly suboptimal running time, the existing techniques appear to have reached their limits.

In this paper, we propose the first algorithm that breaks the O(n)-time barrier for BWT construction. Given a binary string of length n, our procedure builds the Burrows–Wheeler transform in O(n/√(log n)) time and O(n/log n) space. We complement this result with a conditional lower bound proving that any further progress in the time complexity of BWT construction would yield faster algorithms for the very well studied problem of counting inversions: it would improve the state-of-the-art O(m√(log m))-time solution by Chan and Pătraşcu (SODA 2010). Our algorithm is based on a novel concept of string synchronizing sets, which is of independent interest. As one of the applications, we show that this technique lets us design a data structure of the optimal size O(n/log n) that answers Longest Common Extension queries (LCE queries) in O(1) time and, furthermore, can be deterministically constructed in the optimal O(n/log n) time.

SESSION: Algorithmic Statistics

Approximation algorithms for distributionally-robust stochastic optimization with black-box distributions

Two-stage stochastic optimization is a widely used framework for modeling uncertainty, where we have a probability distribution over possible realizations of the data, called scenarios, and decisions are taken in two stages: we make first-stage decisions knowing only the underlying distribution and before a scenario is realized, and may take additional second-stage recourse actions after a scenario is realized. The goal is typically to minimize the total expected cost. A common criticism levied at this model is that the underlying probability distribution is itself often imprecise! To address this, an approach that is quite versatile and has gained popularity in the stochastic-optimization literature is the distributionally robust 2-stage model: given a collection D of probability distributions, our goal now is to minimize the maximum expected total cost with respect to a distribution in D.

We provide a framework for designing approximation algorithms in such settings when the collection D is a ball around a central distribution and the central distribution is accessed only via a sampling black box. We first show that one can utilize the sample average approximation (SAA) method—solve the distributionally robust problem with an empirical estimate of the central distribution—to reduce the problem to the case where the central distribution has polynomial-size support. Complementing this, we show how to approximately solve a fractional relaxation of the SAA (i.e., polynomial-scenario central-distribution) problem. Unlike in 2-stage stochastic or robust optimization, this turns out to be quite challenging. We utilize the ellipsoid method in conjunction with several new ideas to show that this problem can be approximately solved provided that we have an (approximation) algorithm for a certain max-min problem that is akin to, and generalizes, the k-max-min problem—find the worst-case scenario consisting of at most k elements—encountered in 2-stage robust optimization. We obtain such a procedure for various discrete-optimization problems; by complementing this via LP-rounding algorithms that provide local (i.e., per-scenario) approximation guarantees, we obtain the first approximation algorithms for the distributionally robust versions of a variety of discrete-optimization problems including set cover, vertex cover, edge cover, facility location, and Steiner tree, with guarantees that are, except for set cover, within O(1)-factors of the guarantees known for the deterministic version of the problem.

Efficient profile maximum likelihood for universal symmetric property estimation

Estimating symmetric properties of a distribution, e.g., support size, coverage, entropy, and distance to uniformity, is among the most fundamental problems in algorithmic statistics. While these properties have been studied extensively and separate optimal estimators have been produced, in striking recent work Acharya et al. provided a single estimator that is competitive for each. They showed that the value of the property on the distribution that approximately maximizes profile likelihood (PML), i.e., the probability of the observed frequency of frequencies, is sample competitive with respect to a broad class of estimators. Unfortunately, prior to this work, there was no known polynomial time algorithm to compute such an approximation or to use PML to obtain a universal plug-in estimator.

In this paper we provide an algorithm that, given n samples from a distribution, computes an approximate PML distribution up to a multiplicative error of exp(n^{2/3} polylog(n)) in nearly linear time. Generalizing the work of Acharya et al., we show that our algorithm yields a universal plug-in estimator that is competitive with a broad range of estimators up to accuracy є = Ω(n^{−0.166}). Further, we provide efficient polynomial-time algorithms for computing a d-dimensional generalization of PML (for constant d) that allows for universal plug-in estimation of symmetric relationships between distributions.

Communication complexity of estimating correlations

We characterize the communication complexity of the following distributed estimation problem. Alice and Bob observe infinitely many i.i.d. copies of ρ-correlated unit-variance (Gaussian or ±1 binary) random variables, with unknown ρ ∈ [−1,1]. By interactively exchanging k bits, Bob wants to produce an estimate ρ̂ of ρ. We show that the best possible performance (optimized over the interaction protocol Π and the estimator ρ̂) satisfies inf_{Π, ρ̂} sup_ρ E[|ρ̂ − ρ|^2] = k^{−1}(1/(2 ln 2) + o(1)). Curiously, the number of samples in our achievability scheme is exponential in k; by contrast, a naive scheme exchanging k samples achieves the same Ω(1/k) rate but with a suboptimal prefactor. Our protocol achieving optimal performance is one-way (non-interactive). We also prove the Ω(1/k) bound even when ρ is restricted to any small open sub-interval of [−1,1] (i.e., a local minimax lower bound). Our proof techniques rely on symmetric strong data-processing inequalities and various tensorization techniques from information-theoretic interactive common-randomness extraction. Our results also imply an Ω(n) lower bound on the information complexity of the Gap-Hamming problem, for which we show a direct information-theoretic proof.
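The naive k-sample scheme mentioned above is easy to simulate in the ±1 binary case: Alice sends k of her bits, and Bob estimates ρ from the agreement rate, since P[X = Y] = (1+ρ)/2. A sketch illustrating this Θ(1/k)-risk baseline (not the optimal protocol):

```python
# The naive one-way scheme for +/-1 binary rho-correlated pairs: Alice sends k of
# her bits, Bob estimates rho from the agreement rate (P[X = Y] = (1 + rho)/2).
# This is the Theta(1/k)-risk baseline mentioned above, not the optimal protocol.
import random

def naive_estimate(rho, k):
    agree = 0
    for _ in range(k):
        x = random.choice([-1, 1])
        y = x if random.random() < (1 + rho) / 2 else -x    # rho-correlated copy
        agree += (x == y)
    return 2 * agree / k - 1                                # unbiased estimate of rho

rho, k, trials = 0.3, 1000, 500
mse = sum((naive_estimate(rho, k) - rho) ** 2 for _ in range(trials)) / trials
print(f"k = {k}: empirical squared error ~ {mse:.4f} (order 1/k)")
```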

Degree-d Chow parameters robustly determine degree-d PTFs (and algorithmic applications)

The degree-d Chow parameters of a Boolean function are its degree at most d Fourier coefficients. It is well-known that degree-d Chow parameters uniquely characterize degree-d polynomial threshold functions (PTFs) within the space of all bounded functions. In this paper, we prove a robust version of this theorem: for any Boolean degree-d PTF f and any bounded function g, if the degree-d Chow parameters of f are close to the degree-d Chow parameters of g in ℓ_2-norm, then f is close to g in ℓ_1-distance. Notably, our bound relating the two distances is independent of the dimension. That is, we show that Boolean degree-d PTFs are robustly identifiable from their degree-d Chow parameters. No non-trivial bound was previously known for d > 1.

Our robust identifiability result gives the following algorithmic applications: First, we show that Boolean degree-d PTFs can be efficiently approximately reconstructed from approximations to their degree-d Chow parameters. This immediately implies that degree-d PTFs are efficiently learnable in the uniform distribution d-RFA model. As a byproduct of our approach, we also obtain the first low integer-weight approximations of degree-d PTFs, for d>1. As our second application, our robust identifiability result gives the first efficient algorithm, with dimension-independent error guarantees, for malicious learning of Boolean degree-d PTFs under the uniform distribution.

The proof of our robust identifiability result involves several new technical ingredients, including the following structural result for degree-d multivariate polynomials with very poor anti-concentration: if p is a degree-d polynomial where p(x) is very close to 0 on a large number of points in {±1}^n, then there exists a degree-d hypersurface that passes exactly through almost all of these points. We leverage this structural result to show that if the degree-d Chow distance between f and g is small, then we can find many degree-d polynomials that vanish on their disagreement region, and in particular enough to force the ℓ_1-distance between f and g to also be small. To implement this proof strategy, we require additional technical ideas. In particular, in the d=2 case we show that for any large vector space of degree-2 polynomials with a large number of common zeroes, there exists a linear function that vanishes on almost all of these zeroes. The degree-d generalization of this statement is significantly more complex, and can be viewed as an effective version of Hilbert’s Basis Theorem for our setting.

Capacity lower bound for the Ising perceptron

We consider the Ising perceptron with Gaussian disorder, which is equivalent to the discrete cube {−1,+1}^N intersected by M random half-spaces. The perceptron’s capacity is the largest integer M_N for which the intersection is nonempty. It is conjectured by Krauth and Mézard (1989) that the (random) ratio M_N/N converges in probability to an explicit constant α ≐ 0.83. Kim and Roche (1998) proved the existence of a positive constant γ such that γ ≤ M_N/N ≤ 1−γ with high probability; see also Talagrand (1999). In this paper we show that the Krauth–Mézard conjecture α is a lower bound with positive probability, under the condition that an explicit univariate function S(λ) is maximized at λ=0. Our proof is an application of the second moment method to a certain slice of perceptron configurations, as selected by the so-called TAP (Thouless, Anderson, and Palmer, 1977) or AMP (approximate message passing) iteration, whose scaling limit has been characterized by Bayati and Montanari (2011) and Bolthausen (2012). For verifying the condition on S(λ) we outline one approach, which is implemented in the current version using (nonrigorous) numerical integration packages. In a future version of this paper we intend to complete the verification by implementing a rigorous numerical method.
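The quantity M_N can be explored by brute force for tiny N, under one natural reading of the model (half-spaces through the origin with independent Gaussian normals); this is purely illustrative, since the conjectured ratio ≈ 0.83 is an N → ∞ statement:

```python
# Brute-force illustration of the capacity M_N for tiny N, reading "random
# half-spaces" as half-spaces through the origin with independent Gaussian normals:
# add constraints until no +/-1 vector satisfies all of them.
import itertools
import random

def capacity(N):
    survivors = list(itertools.product([-1, 1], repeat=N))
    M = 0
    while True:
        g = [random.gauss(0, 1) for _ in range(N)]                      # random half-space
        survivors = [x for x in survivors
                     if sum(a * b for a, b in zip(g, x)) >= 0]
        if not survivors:
            return M                                                    # last M with nonempty intersection
        M += 1

N, trials = 12, 20
print(sum(capacity(N) for _ in range(trials)) / (trials * N))   # rough proxy for M_N / N at small N
```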

Learning restricted Boltzmann machines via influence maximization

Graphical models are a rich language for describing high-dimensional distributions in terms of their dependence structure. While there are algorithms with provable guarantees for learning undirected graphical models in a variety of settings, there has been much less progress in the important scenario when there are latent variables. Here we study Restricted Boltzmann Machines (or RBMs), which are a popular model with wide-ranging applications in dimensionality reduction, collaborative filtering, topic modeling, feature extraction and deep learning.

The main message of our paper is a strong dichotomy in the feasibility of learning RBMs, depending on the nature of the interactions between variables: ferromagnetic models can be learned efficiently, while general models cannot. In particular, we give a simple greedy algorithm based on influence maximization to learn ferromagnetic RBMs with bounded degree. In fact, we learn a description of the distribution on the observed variables as a Markov Random Field. Our analysis is based on tools from mathematical physics that were developed to show the concavity of magnetization. Our algorithm extends straightforwardly to general ferromagnetic Ising models with latent variables.

Conversely, we show that even for a constant number of latent variables with constant degree, without ferromagneticity the problem is as hard as sparse parity with noise. This hardness result is based on a sharp and surprising characterization of the representational power of bounded-degree RBMs: the distribution on their observed variables can simulate any bounded-order MRF. This result is of independent interest since RBMs are the building blocks of deep belief networks.

SESSION: COLT Sister Session ML Foundations

Non-Gaussian component analysis using entropy methods

Non-Gaussian component analysis (NGCA) is a problem in multidimensional data analysis which, since its formulation in 2006, has attracted considerable attention in statistics and machine learning. In this problem, we have a random variable X in n-dimensional Euclidean space. There is an unknown subspace Γ of the n-dimensional Euclidean space such that the orthogonal projection of X onto Γ is standard multidimensional Gaussian and the orthogonal projection of X onto Γ^⊥, the orthogonal complement of Γ, is non-Gaussian, in the sense that all its one-dimensional marginals are different from the Gaussian in a certain metric defined in terms of moments. The NGCA problem is to approximate the non-Gaussian subspace Γ^⊥ given samples of X.
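
To fix ideas, here is a minimal generative sketch of this model (ours, not the paper's): Gaussian coordinates along a random subspace Γ, and a simple non-Gaussian distribution (Rademacher signs) along its orthogonal complement. The dimensions and sample size below are arbitrary illustrative choices.

import numpy as np

rng = np.random.default_rng(1)
n, dim_nongauss, samples = 20, 3, 10_000
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))   # random orthonormal frame
U = Q[:, :dim_nongauss]                            # spans the non-Gaussian subspace
V = Q[:, dim_nongauss:]                            # spans the Gaussian subspace Gamma

Z = rng.choice([-1.0, 1.0], size=(samples, dim_nongauss))   # non-Gaussian coordinates
G = rng.standard_normal((samples, n - dim_nongauss))        # Gaussian coordinates
X = Z @ U.T + G @ V.T                              # samples of X; the task: recover span(U)
print(X.shape)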

Vectors in Γ^⊥ correspond to ‘interesting’ directions, whereas vectors in Γ correspond to the directions where the data is very noisy. The most interesting applications of the NGCA model are in the case when the magnitude of the noise is comparable to that of the true signal, a setting in which traditional noise reduction techniques such as PCA do not apply directly. NGCA is also related to dimension reduction and to other data analysis problems such as ICA. NGCA-like problems have been studied in statistics for a long time using techniques such as projection pursuit.

We give an algorithm that takes polynomial time in the dimension n and has an inverse polynomial dependence on the error parameter measuring the angle distance between the non-Gaussian subspace and the subspace output by the algorithm. Our algorithm is based on relative entropy as the contrast function and fits under the projection pursuit framework. The techniques we develop for analyzing our algorithm may be of use for other related problems.

Private PAC learning implies finite Littlestone dimension

We show that every approximately differentially private learning algorithm (possibly improper) for a class H with Littlestone dimension d requires Ω(log*(d)) examples. As a corollary it follows that the class of thresholds over ℕ cannot be learned in a private manner; this resolves open questions due to [Bun et al. 2015] and [Feldman and Xiao, 2015]. We leave as an open question whether every class with a finite Littlestone dimension can be learned by an approximately differentially private algorithm.

Competitively chasing convex bodies

Let F be a family of sets in some metric space. In the F-chasing problem, an online algorithm observes a request sequence of sets in F and responds (online) by giving a sequence of points in these sets. The movement cost is the distance between consecutive such points. The competitive ratio is the worst case ratio (over request sequences) between the total movement of the online algorithm and the smallest movement one could have achieved by knowing in advance the request sequence. The family F is said to be chaseable if there exists an online algorithm with finite competitive ratio. In 1991, Linial and Friedman conjectured that the family of convex sets in Euclidean space is chaseable. We prove this conjecture.

Beyond the low-degree algorithm: mixtures of subcubes and their applications

We introduce the problem of learning mixtures of k subcubes over {0,1}^n, which contains many classic learning theory problems as a special case (and is itself a special case of others). We give a surprising n^{O(log k)}-time learning algorithm based on higher-order multilinear moments. It is not possible to learn the parameters because the same distribution can be represented by quite different models. Instead, we develop a framework for reasoning about how multilinear moments can pinpoint essential features of the mixture, like the number of components.
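
For concreteness, the following small sketch (an illustration of the model, not of the paper's algorithm) samples from a mixture of k subcubes, where each component is a pattern in {0,1,*}^n and starred coordinates are uniform bits, and then estimates a few low-order multilinear moments of the kind the algorithm works with. All sizes are arbitrary.

import numpy as np

rng = np.random.default_rng(2)
n, k = 8, 3
components = [rng.choice(['0', '1', '*'], size=n) for _ in range(k)]
weights = rng.dirichlet(np.ones(k))

def sample(m):
    out = np.empty((m, n), dtype=int)
    which = rng.choice(k, size=m, p=weights)
    for t, c in enumerate(which):
        pattern = components[c]
        bits = rng.integers(0, 2, size=n)
        out[t] = [bits[i] if pattern[i] == '*' else int(pattern[i]) for i in range(n)]
    return out

X = sample(50_000)
for S in [(0,), (0, 1), (0, 1, 2)]:          # multilinear moments E[prod_{i in S} x_i]
    print(S, X[:, S].prod(axis=1).mean())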

We also give applications of our algorithm to learning decision trees with stochastic transitions (which also capture interesting scenarios where the transitions are deterministic but there are latent variables). Using our algorithm for learning mixtures of subcubes, we can approximate the Bayes optimal classifier within additive error є on k-leaf decision trees with at most s stochastic transitions on any root-to-leaf path in n^{O(s + log k)}·poly(1/є) time. In this stochastic setting, the classic n^{O(log k)}·poly(1/є)-time algorithms of Rivest, Blum, and Ehrenfeucht-Haussler for learning decision trees with zero stochastic transitions break down because they are fundamentally Occam algorithms. The low-degree algorithm of Linial-Mansour-Nisan is able to get a constant factor approximation to the optimal error (again within an additive є) and runs in time n^{O(s + log(k/є))}. The quasipolynomial dependence on 1/є is inherent to the low-degree approach because the degree needs to grow as the target accuracy decreases, which is undesirable when є is small.

In contrast, as we will show, mixtures of k subcubes are uniquely determined by their moments of order 2 log k and hence provide a useful abstraction for simultaneously achieving the polynomial dependence on 1/є of the classic Occam algorithms for decision trees and the flexibility of the low-degree algorithm in being able to accommodate stochastic transitions. Using our multilinear moment techniques, we also give the first improved upper and lower bounds since the work of Feldman-O’Donnell-Servedio for the related but harder problem of learning mixtures of binary product distributions.

Regression from dependent observations

The standard linear and logistic regression models assume that the response variables are independent, but share the same linear relationship to their corresponding vectors of covariates. The assumption that the response variables are independent is, however, too strong. In many applications, these responses are collected on nodes of a network, or some spatial or temporal domain, and are dependent. Examples abound in financial and meteorological applications, and dependencies naturally arise in social networks through peer effects. Regression with dependent responses has thus received a lot of attention in the Statistics and Economics literature, but there are no strong consistency results unless multiple independent samples of the vectors of dependent responses can be collected from these models. We present computationally and statistically efficient methods for linear and logistic regression models when the response variables are dependent on a network. Given one sample from a networked linear or logistic regression model and under mild assumptions, we prove strong consistency results for recovering the vector of coefficients and the strength of the dependencies, recovering the rates of standard regression under independent observations. We use projected gradient descent on the negative log-likelihood, or negative log-pseudolikelihood, and establish their strong convexity and consistency using concentration of measure for dependent random variables.

Memory-sample tradeoffs for linear regression with small error

We consider the problem of performing linear regression over a stream of d-dimensional examples, and show that any algorithm that uses a subquadratic amount of memory exhibits a slower rate of convergence than can be achieved without memory constraints. Specifically, consider a sequence of labeled examples (a_1,b_1), (a_2,b_2), …, with a_i drawn independently from a d-dimensional isotropic Gaussian, and where b_i = ⟨a_i, x⟩ + η_i, for a fixed x ∈ ℝ^d with ||x||_2 = 1 and with independent noise η_i drawn uniformly from the interval [−2^{−d/5}, 2^{−d/5}]. We show that any algorithm with at most d^2/4 bits of memory requires at least Ω(d log log(1/є)) samples to approximate x to ℓ2 error є with probability of success at least 2/3, for є sufficiently small as a function of d. In contrast, for such є, x can be recovered to error є with probability 1−o(1) with memory O(d^2 log(1/є)) using d examples. This represents the first nontrivial lower bound for regression with super-linear memory, and may open the door for strong memory/sample tradeoffs for continuous optimization.
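
The following sketch only instantiates the data model from the statement above and the memory-rich baseline mentioned at the end (storing all of A and solving a d × d system); it does not implement any memory-bounded algorithm. The dimension is an arbitrary illustrative choice.

import numpy as np

rng = np.random.default_rng(3)
d = 50
x = rng.standard_normal(d)
x /= np.linalg.norm(x)                      # ||x||_2 = 1

A = rng.standard_normal((d, d))             # d examples a_1, ..., a_d (isotropic Gaussian)
noise = rng.uniform(-2.0 ** (-d / 5), 2.0 ** (-d / 5), size=d)
b = A @ x + noise                           # b_i = <a_i, x> + eta_i

x_hat = np.linalg.solve(A, b)               # uses Theta(d^2) memory to store A
print("l2 recovery error:", np.linalg.norm(x_hat - x))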

SESSION: Matrix Methods

Flows in almost linear time via adaptive preconditioning

We present algorithms for solving a large class of flow and regression problems on unit weighted graphs to (1 + 1/poly(n)) accuracy in almost-linear time. These problems include ℓp-norm minimizing flow for p large (p ∈ [ω(1), o(log^{2/3} n)]), and their duals, ℓp-norm semi-supervised learning for p close to 1.

As p tends to infinity, p-norm flow and its dual tend to max-flow and min-cut respectively. Using this connection and our algorithms, we give an alternate approach for approximating undirected max-flow, and the first almost-linear time approximations of discretizations of total variation minimization objectives.

Our framework is inspired by the routing-based solver for Laplacian linear systems by Spielman and Teng (STOC ’04, SIMAX ’14), and is based on several new tools we develop, including adaptive non-linear preconditioning, tree-routings, and (ultra-)sparsification for mixed ℓ2 and ℓp norm objectives.

Fully dynamic spectral vertex sparsifiers and applications

We study dynamic algorithms for maintaining spectral vertex sparsifiers of graphs with respect to a set of terminals T of our choice. Such objects preserve pairwise resistances, solutions to systems of linear equations, and energy of electrical flows between the terminals in T. We give a data structure that supports insertions and deletions of edges, and terminal additions, all in sublinear time. We then show the applicability of our result to the following problems.

(1) A data structure for dynamically maintaining solutions to Laplacian systems L x = b, where L is the graph Laplacian matrix and b is a demand vector. For a bounded degree, unweighted graph, we support modifications to both L and b while providing access to є-approximations to the energy of routing an electrical flow with demand b, as well as query access to entries of a vector x such that ∥x − L^+ b∥_L ≤ є ∥L^+ b∥_L, in Õ(n^{11/12} є^{−5}) expected amortized update and query time.

(2) A data structure for maintaining fully dynamic All-Pairs Effective Resistance. For an intermixed sequence of edge insertions, deletions, and resistance queries, our data structure returns a (1 ± є)-approximation to all the resistance queries against an oblivious adversary with high probability. Its expected amortized update and query times are Õ(min(m^{3/4}, n^{5/6} є^{−2}) є^{−4}) on an unweighted graph, and Õ(n^{5/6} є^{−6}) on weighted graphs.

The key ingredients in these results are (1) the interpretation of the Schur complement as a sum of random walks, (2) a suitable choice of terminals, based on the behavior of these random walks, that ensures the majority of walks are local even when the graph itself is highly connected, and (3) maintenance of these local walks and numerical solutions using data structures.
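
To make ingredient (1) concrete, the following numpy sketch computes the Schur complement of a small graph Laplacian onto a terminal set and checks that effective resistances between terminals are preserved; this is only the static linear-algebra identity behind the sparsifier, not the paper's dynamic maintenance scheme.

import numpy as np

def laplacian(n, edges):
    L = np.zeros((n, n))
    for u, v, w in edges:
        L[u, u] += w; L[v, v] += w
        L[u, v] -= w; L[v, u] -= w
    return L

def eff_res(M, i, j):                       # effective resistance via the pseudoinverse
    e = np.zeros(M.shape[0]); e[i], e[j] = 1.0, -1.0
    return e @ np.linalg.pinv(M) @ e

L = laplacian(4, [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 1.0), (3, 0, 1.0), (1, 3, 0.5)])
T, F = [0, 2], [1, 3]                       # terminals and eliminated vertices
SC = L[np.ix_(T, T)] - L[np.ix_(T, F)] @ np.linalg.inv(L[np.ix_(F, F)]) @ L[np.ix_(F, T)]
print(eff_res(L, 0, 2), eff_res(SC, 0, 1))  # equal: resistances between terminals are preserved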

These results together represent the first data structures for maintaining key primitives from the Laplacian paradigm for graph algorithms in sublinear time, without assumptions on the underlying graph topology. The importance of routines such as effective resistance, electrical flows, and Laplacian solvers in the static setting makes us optimistic that some of our components can provide new building blocks for dynamic graph algorithms.

Spectral methods from tensor networks

A tensor network is a diagram that specifies a way to ``multiply'' a collection of tensors together to produce another tensor (or matrix). Many existing algorithms for tensor problems (such as tensor decomposition and tensor PCA), although they are not presented this way, can be viewed as spectral methods on matrices built from simple tensor networks. In this work we leverage the full power of this abstraction to design new algorithms for certain continuous tensor decomposition problems.

An important and challenging family of tensor problems comes from orbit recovery, a class of inference problems involving group actions (inspired by applications such as cryo-electron microscopy). Orbit recovery problems over finite groups can often be solved via standard tensor methods. However, for infinite groups, no general algorithms are known. We give a new spectral algorithm based on tensor networks for one such problem: continuous multi-reference alignment over the infinite group SO(2). Our algorithm extends to the more general heterogeneous case.

Solving linear programs in the current matrix multiplication time

This paper shows how to solve linear programs of the form min_{Ax=b, x≥0} c^⊤x with n variables in time O*((n^ω + n^{2.5−α/2} + n^{2+1/6}) log(n/δ)), where ω is the exponent of matrix multiplication, α is the dual exponent of matrix multiplication, and δ is the relative accuracy. For the current values ω ∼ 2.37 and α ∼ 0.31, our algorithm takes O*(n^ω log(n/δ)) time. When ω = 2, our algorithm takes O*(n^{2+1/6} log(n/δ)) time.

Our algorithm utilizes several new concepts that we believe may be of independent interest:

(1) We define a stochastic central path method.

(2) We show how to maintain a projection matrix √W A^⊤ (A W A^⊤)^{−1} A √W in sub-quadratic time under ℓ2 multiplicative changes in the diagonal matrix W.
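
As a small numerical sanity check of the object in item (2) (a sketch under the usual convention that A is a full-row-rank constraint matrix, not the paper's maintenance data structure), the matrix √W A^⊤(A W A^⊤)^{−1} A √W is an orthogonal projection, so squaring it changes nothing and its eigenvalues are 0 or 1.

import numpy as np

rng = np.random.default_rng(4)
m, n = 5, 12                                    # m constraints, n variables
A = rng.standard_normal((m, n))
w = rng.uniform(0.5, 2.0, size=n)               # diagonal of W (positive weights)
sqrtW = np.diag(np.sqrt(w))

P = sqrtW @ A.T @ np.linalg.inv(A @ np.diag(w) @ A.T) @ A @ sqrtW
print(np.allclose(P @ P, P))                    # True: P is a projection
print(np.round(np.linalg.eigvalsh(P), 6))       # eigenvalues are 0 or 1 (exactly m ones)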

Approximating APSP without scaling: equivalence of approximate min-plus and exact min-max

Zwick’s (1+ε)-approximation algorithm for the All Pairs Shortest Path (APSP) problem runs in time Õ((n^ω/ε) log W), where ω ≤ 2.373 is the exponent of matrix multiplication and W denotes the largest weight. This can be used to approximate several graph characteristics including the diameter, radius, median, minimum-weight triangle, and minimum-weight cycle in the same time bound.

Since Zwick’s algorithm uses the scaling technique, it has a factor logW in the running time. In this paper, we study whether APSP and related problems admit approximation schemes avoiding the scaling technique. That is, the number of arithmetic operations should be independent of W; this is called strongly polynomial. Our main results are as follows.

(1) We design approximation schemes in strongly polynomial time O(n^ω polylog(n/ε)) for APSP on undirected graphs as well as for the graph characteristics diameter, radius, median, minimum-weight triangle, and minimum-weight cycle on directed or undirected graphs.

(2) For APSP on directed graphs we design an approximation scheme in strongly polynomial time O(n^{(ω + 3)/2} ε^{−1} polylog(n/ε)). This is significantly faster than the best exact algorithm.

(3) We explain why our approximation scheme for APSP on directed graphs has a worse exponent than ω: any improvement over our exponent (ω + 3)/2 would improve the best known algorithm for Min-Max Product. In fact, we prove that approximating directed APSP and exactly computing the Min-Max Product are equivalent.

Our techniques yield a framework for approximation problems over the (min,+)-semiring that can be applied more generally. In particular, we obtain the first strongly polynomial approximation scheme for Min-Plus Convolution in strongly subquadratic time, and we prove an equivalence of approximate Min-Plus Convolution and exact Min-Max Convolution.
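
To make the two products in the equivalence concrete, here are naive cubic-time implementations of the (min,+) and (min,max) matrix products (definitional sketches only; the algorithms in the paper are far more involved).

import numpy as np

def min_plus(A, B):
    n, m = A.shape[0], B.shape[1]
    return np.array([[np.min(A[i, :] + B[:, j]) for j in range(m)] for i in range(n)])

def min_max(A, B):
    n, m = A.shape[0], B.shape[1]
    return np.array([[np.min(np.maximum(A[i, :], B[:, j])) for j in range(m)] for i in range(n)])

rng = np.random.default_rng(5)
A = rng.integers(1, 10, (4, 4)).astype(float)
B = rng.integers(1, 10, (4, 4)).astype(float)
print(min_plus(A, B))
print(min_max(A, B))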

Optimal succinct rank data structure via approximate nonnegative tensor decomposition

Given an n-bit array A, the succinct rank data structure problem asks to construct a data structure using space n + r bits, for r ≪ n, supporting rank queries of the form rank(u) = ∑_{i=0}^{u−1} A[i]. In this paper, we design a new succinct rank data structure with r = n/(log n)^{Ω(t)} + n^{1−c} and query time O(t) for some constant c > 0, improving the previous best known by Pǎtraşcu, which has r = n/(log n/t)^{Ω(t)} + Õ(n^{3/4}) bits of redundancy. For r > n^{1−c}, our space-time tradeoff matches the cell-probe lower bound by Pǎtraşcu and Viola, which asserts that r must be at least n/(log n)^{O(t)}. Moreover, one can avoid an n^{1−c}-bit lookup table when the data structure is implemented in the cell-probe model, achieving r = ⌈n/(log n)^{Ω(t)}⌉. This matches the lower bound for the full range of parameters.
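
For orientation, here is a textbook block-prefix-sum baseline for the rank query defined above (ours, for illustration only). It is far from succinct: it uses roughly (n/B)·log n bits of redundancy and O(B) query time, whereas the result above achieves redundancy n/(log n)^{Ω(t)} with O(t) query time. The block size below is arbitrary.

class BlockRank:
    def __init__(self, bits, block=64):
        self.bits = bits
        self.block = block
        self.prefix = [0]                       # ones seen before each full-block boundary
        for i in range(block, len(bits) + 1, block):
            self.prefix.append(self.prefix[-1] + sum(bits[i - block:i]))

    def rank(self, u):
        # Return sum_{i < u} A[i].
        q, r = divmod(u, self.block)
        return self.prefix[q] + sum(self.bits[q * self.block : q * self.block + r])

A = [1, 0, 1, 1, 0, 0, 1] * 100
rs = BlockRank(A, block=16)
assert all(rs.rank(u) == sum(A[:u]) for u in range(len(A) + 1))
print(rs.rank(500))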

En route to our new data structure design, we establish an interesting connection between succinct data structures and approximate nonnegative tensor decomposition. Our connection shows that for specific problems, to construct a space-efficient data structure, it suffices to approximate a particular tensor by a sum of (few) nonnegative rank-1 tensors. For the rank problem, we explicitly construct such an approximation, which yields an explicit construction of the data structure.

SESSION: Lower Bounds/Metric Algs

Static data structure lower bounds imply rigidity

We show that static data structure lower bounds in the group (linear) model imply semi-explicit lower bounds on matrix rigidity. In particular, we prove that an explicit lower bound of t ≥ ω(log^2 n) on the cell-probe complexity of linear data structures in the group model, even against arbitrarily small linear space (s = (1+ε)n), would already imply a semi-explicit (P^NP) construction of rigid matrices with significantly better parameters than the current state of the art (Alon, Panigrahy and Yekhanin, 2009). Our results further assert that polynomial (t ≥ n^δ) data structure lower bounds against near-optimal space would imply super-linear circuit lower bounds for log-depth linear circuits (a four-decade open question). In the succinct space regime (s = n+o(n)), we show that any improvement on current cell-probe lower bounds in the linear model would also imply new rigidity bounds. Our results rely on a new connection between the “inner” and “outer” dimensions of a matrix (Paturi and Pudlák, 2006), and on a new reduction from worst-case to average-case rigidity, which is of independent interest.

An exponential lower bound on the sub-packetization of MSR codes

An (n,k,ℓ)-vector MDS code is an F-linear subspace of (F^ℓ)^n (for some field F) of dimension kℓ, such that any k (vector) symbols of the codeword suffice to determine the remaining r = n−k (vector) symbols. The length ℓ of each codeword symbol is called the Sub-Packetization of the code. Such a code is called minimum storage regenerating (MSR) if any single symbol of a codeword can be recovered by downloading ℓ/r field elements (which is known to be the least possible) from each of the other symbols.

MSR codes are attractive for use in distributed storage systems, and by now a variety of ingenious constructions of MSR codes are available. However, they all suffer from exponentially large Sub-Packetization ℓ ≳ r^{k/r}. Our main result is an almost tight lower bound showing that for an MSR code, one must have ℓ ≥ exp(Ω(k/r)). Previously, a lower bound of ≈ exp(√(k/r)), and a tight lower bound for a restricted class of “optimal access” MSR codes, were known. Our work settles a central open question concerning MSR codes that has received much attention. Further, our proof is really short, hinging on one key definition that is somewhat inspired by Galois theory.

Why extension-based proofs fail

It is impossible to deterministically solve wait-free consensus in an asynchronous system. The classic proof uses a valency argument, which constructs an infinite execution by repeatedly extending a finite execution. We introduce extension-based proofs, a class of impossibility proofs that are modelled as an interaction between a prover and a protocol and that include valency arguments.

Using proofs based on combinatorial topology, it has been shown that it is impossible to deterministically solve k-set agreement among n > k ≥ 2 processes in a wait-free manner. However, it was unknown whether proofs based on simpler techniques were possible. We show that this impossibility result cannot be obtained by an extension-based proof and, hence, extension-based proofs are limited in power.

Lower bounds for external memory integer sorting via network coding

Sorting extremely large datasets is a frequently occurring task in practice. These datasets are usually much larger than the computer’s main memory; thus external memory sorting algorithms, first introduced by Aggarwal and Vitter (1988), are often used. The complexity of comparison based external memory sorting has been understood for decades by now; however, the situation remains elusive if we assume the keys to be sorted are integers. In internal memory, one can sort a set of n integer keys of Θ(lg n) bits each in O(n) time using the classic Radix Sort algorithm, however in external memory, there are no faster integer sorting algorithms known than the simple comparison based ones. Whether such algorithms exist has remained a central open problem in external memory algorithms for more than three decades.

In this paper, we present a tight conditional lower bound on the complexity of external memory sorting of integers. Our lower bound is based on a famous conjecture in network coding by Li and Li (2004), who conjectured that network coding cannot achieve any rate beyond the standard multicommodity flow rate in undirected graphs.

The only previous work connecting the Li and Li conjecture to lower bounds for algorithms is due to Adler et al. (2006). Adler et al. indeed obtain relatively simple lower bounds for oblivious algorithms (where the memory access pattern is fixed and independent of the input data). Unfortunately, obliviousness is a strong limitation, especially for integer sorting: we show that the Li and Li conjecture implies an Ω(n log n) lower bound for internal memory oblivious sorting when the keys are Θ(lg n) bits. This is in sharp contrast to the classic (non-oblivious) Radix Sort algorithm. Indeed, going beyond obliviousness is highly non-trivial; we need to introduce several new methods and involved techniques, which are of independent interest, to obtain our tight lower bound for external memory integer sorting.

Algorithmic Pirogov-Sinai theory

We develop an efficient algorithmic approach for approximate counting and sampling in the low-temperature regime of a broad class of statistical physics models on finite subsets of the lattice ℤd and on the torus (ℤ/n ℤ)d. Our approach is based on combining contour representations from Pirogov–Sinai theory with Barvinok’s approach to approximate counting using truncated Taylor series. Some consequences of our main results include an FPTAS for approximating the partition function of the hard-core model at sufficiently high fugacity on subsets of ℤd with appropriate boundary conditions and an efficient sampling algorithm for the ferromagnetic Potts model on the discrete torus (ℤ/n ℤ)d at sufficiently low temperature.

On approximating the covering radius and finding dense lattice subspaces

In this work, we give a novel algorithm for computing dense lattice subspaces, a conjecturally tight characterization of the lattice covering radius, and provide a bound on the slicing constant of lattice Voronoi cells. Our work is motivated by the pursuit of faster algorithms for integer programming, for which we give a conditional speedup based on the recent resolution of the ℓ2 Kannan-Lovász conjecture. Through these results, we hope to motivate further study of the interplay between the recently developed reverse Minkowski theory, lattice algorithms and convex geometry.

On the algorithmic side, our main contribution is a 2^{O(n)}-time algorithm for computing an O(C_η(n))-approximate sublattice of minimum normalized determinant on any n-dimensional lattice, where C_η(n) = O(log n) is the reverse Minkowski constant in dimension n. Our method for finding dense lattice subspaces is surprisingly simple: we iteratively descend to a random co-dimension 1 subspace chosen to be the orthogonal space to a discrete Gaussian sample from the dual lattice. Applying this algorithm within a “filtration reduction” scheme, we further show how to compute an O(C_η(n))-approximate canonical filtration of any lattice, which corresponds to a canonical way of decomposing a lattice into dense blocks. As a primary application, we get the first 2^{O(n)}-time algorithm for computing a sparse lattice projection whose “volume radius” provides a lower bound on the lattice covering radius that is tight within an O(log^{2.5} n) factor. This provides an efficient algorithmic version of the ℓ2 Kannan-Lovász conjecture, which was recently resolved by Regev and Stephens-Davidowitz (STOC 2017).

On the structural side, we prove a new lower bound on the covering radius which combines volumetric lower bounds across a chain of lattice projections. Assuming Bourgain’s slicing conjecture restricted to Voronoi cells of stable lattices, our lower bound implies (somewhat surprisingly) that the problem of approximating the lattice covering radius to within a constant factor is in coNP. Complementing this result, we show that the slicing constant of any n-dimensional Voronoi cell is bounded by O(C_{KL,2}(n)) = O(log^{1.5} n), the ℓ2 Kannan-Lovász constant, which complements the O(log n) bound of Regev and Stephens-Davidowitz for stable Voronoi cells.

SESSION: ML Foundations II

Performance of Johnson-Lindenstrauss transform for k-means and k-medians clustering

Consider an instance of Euclidean k-means or k-medians clustering. We show that the cost of the optimal solution is preserved up to a factor of (1+ε) under a projection onto a random O(log(k/ε)/ε^2)-dimensional subspace. Further, the cost of every clustering is preserved within (1+ε). More generally, our result applies to any dimension reduction map satisfying a mild sub-Gaussian-tail condition. Our bound on the dimension is nearly optimal. Additionally, our result applies to Euclidean k-clustering with the distances raised to the p-th power for any constant p.

For k-means, our result resolves an open problem posed by Cohen, Elder, Musco, Musco, and Persu (STOC 2015); for k-medians, it answers a question raised by Kannan.
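
As a quick empirical illustration of the dimension-reduction statement above (a sketch, not a proof), one can project a clustered point set with a scaled Gaussian map and compare the k-means cost of a fixed partition before and after projection; the ratio concentrates near 1. The target dimension below is an arbitrary illustrative choice rather than the O(log(k/ε)/ε^2) bound.

import numpy as np

def partition_cost(X, labels):
    return sum(((X[labels == c] - X[labels == c].mean(axis=0)) ** 2).sum()
               for c in np.unique(labels))

rng = np.random.default_rng(6)
n, d, k, m = 2000, 500, 5, 40
labels = rng.integers(0, k, size=n)
X = 3 * rng.standard_normal((k, d))[labels] + rng.standard_normal((n, d))  # clustered points

G = rng.standard_normal((d, m)) / np.sqrt(m)      # scaled Gaussian projection
print(partition_cost(X @ G, labels) / partition_cost(X, labels))  # close to 1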

Oblivious dimension reduction for k-means: beyond subspaces and the Johnson-Lindenstrauss lemma

We show that for n points in d-dimensional Euclidean space, a data oblivious random projection of the columns onto m = O((log k + log log n) · ε^{−6} log(1/ε)) dimensions is sufficient to approximate the cost of all k-means clusterings up to a multiplicative (1±ε) factor. The previous-best upper bounds on m are O(log n · ε^{−2}), given by a direct application of the Johnson-Lindenstrauss Lemma, and O(k ε^{−2}), given by [Cohen et al.-STOC’15].

A universal sampling method for reconstructing signals with simple Fourier transforms

Reconstructing continuous signals based on a small number of discrete samples is a fundamental problem across science and engineering. We are often interested in signals with "simple'' Fourier structure -- e.g., those involving frequencies within a bounded range, a small number of frequencies, or a few blocks of frequencies -- i.e., bandlimited, sparse, and multiband signals, respectively. More broadly, any prior knowledge on a signal's Fourier power spectrum can constrain its complexity. Intuitively, signals with more highly constrained Fourier structure require fewer samples to reconstruct.

We formalize this intuition by showing that, roughly, a continuous signal from a given class can be approximately reconstructed using a number of samples proportional to the statistical dimension of the allowed power spectrum of that class. We prove that, in nearly all settings, this natural measure tightly characterizes the sample complexity of signal reconstruction.

Surprisingly, we also show that, up to log factors, a universal non-uniform sampling strategy can achieve this optimal complexity for any class of signals. We present an efficient and general algorithm for recovering a signal from the samples taken. For bandlimited and sparse signals, our method matches the state-of-the-art, while providing the first computationally and sample efficient solution to a broader range of problems, including multiband signal reconstruction and Gaussian process regression tasks in one dimension.

Our work is based on a novel connection between randomized linear algebra and the problem of reconstructing signals with constrained Fourier structure. We extend tools based on statistical leverage score sampling and column-based matrix reconstruction to the approximation of continuous linear operators that arise in the signal reconstruction problem. We believe these extensions are of independent interest and serve as a foundation for tackling a broad range of continuous time problems using randomized methods.

Optimal terminal dimensionality reduction in Euclidean space

Let ε ∈ (0,1) and let X ⊂ ℝ^d be arbitrary with |X| having size n > 1. The Johnson-Lindenstrauss lemma states that there exists f : X → ℝ^m with m = O(ε^{−2} log n) such that ∀ x ∈ X ∀ y ∈ X, ||x−y||_2 ≤ ||f(x)−f(y)||_2 ≤ (1+ε)||x−y||_2. We show that a strictly stronger version of this statement holds, answering one of the main open questions posed by Mahabadi et al. in STOC 2018: “∀ y ∈ X” in the above statement may be replaced with “∀ y ∈ ℝ^d”, so that f not only preserves distances within X, but also distances to X from the rest of space. Previously this stronger version was only known with the worse bound m = O(ε^{−4} log n). Our proof is via a tighter analysis of (a specific instantiation of) the embedding recipe of Mahabadi et al.

Dynamic sampling from graphical models

In this paper, we study the problem of sampling from a graphical model when the model itself is changing dynamically with time. This problem derives its interest from a variety of inference, learning, and sampling settings in machine learning, computer vision, statistical physics, and theoretical computer science. While the problem of sampling from a static graphical model has received considerable attention, theoretical works for its dynamic variants have been largely lacking. The main contribution of this paper is an algorithm that can sample dynamically from a broad class of graphical models over discrete random variables. Our algorithm is parallel and Las Vegas: it knows when to stop and it outputs samples from the exact distribution. We also provide sufficient conditions under which this algorithm runs in time proportional to the size of the update, on general graphical models as well as well-studied specific spin systems. In particular we obtain, for the Ising model (ferromagnetic or anti-ferromagnetic) and for the hardcore model the first dynamic sampling algorithms that can handle both edge and vertex updates (addition, deletion, change of functions), both efficient within regimes that are close to the respective uniqueness regimes, beyond which, even for the static and approximate sampling, no local algorithms were known or the problem itself is intractable. Our dynamic sampling algorithm relies on a local resampling algorithm and a new ``equilibrium'' property that is shown to be satisfied by our algorithm at each step, and enables us to prove its correctness. This equilibrium property is robust enough to guarantee the correctness of our algorithm, helps us improve bounds on fast convergence on specific models, and should be of independent interest.

SESSION: Cryptography

Fiat-Shamir: from practice to theory

We give new instantiations of the Fiat-Shamir transform using explicit, efficiently computable hash functions. We improve over prior work by reducing the security of these protocols to qualitatively simpler and weaker computational hardness assumptions. As a consequence of our framework, we obtain the following concrete results.

1) There exists a succinct publicly verifiable non-interactive argument system for log-space uniform computations, under the assumption that any one of a broad class of fully homomorphic encryption (FHE) schemes has almost optimal security against polynomial-time adversaries. The class includes all FHE schemes in the literature that are based on the learning with errors (LWE) problem.

2) There exists a non-interactive zero-knowledge argument system for NP in the common reference string model, under either of the following two assumptions: (i) Almost optimal hardness of search-LWE against polynomial-time adversaries, or (ii) The existence of a circular-secure FHE scheme with a standard (polynomial time, negligible advantage) level of security.

3) The classic quadratic residuosity protocol of [Goldwasser, Micali, and Rackoff, SICOMP ’89] is not zero knowledge when repeated in parallel, under any of the hardness assumptions above.

Weak zero-knowledge beyond the black-box barrier

The round complexity of zero-knowledge protocols is a long-standing open question, yet to be settled under standard assumptions. So far, the question has appeared equally challenging for relaxations such as weak zero-knowledge and witness hiding. Protocols satisfying these relaxed notions under standard assumptions have at least four messages, just like full-fledged zero-knowledge. The difficulty in improving round complexity stems from a fundamental barrier: none of these notions can be achieved in three messages via reductions (or simulators) that treat the verifier as a black box.

We introduce a new non-black-box technique and use it to obtain the first protocols that cross this barrier under standard assumptions. We obtain weak zero-knowledge for NP in two messages, assuming the existence of quasipolynomially-secure fully-homomorphic encryption and other standard primitives (known based on the quasipolynomial hardness of Learning with Errors), and subexponentially-secure one-way functions. We also obtain weak zero-knowledge for NP in three messages under standard polynomial assumptions (following for example from fully homomorphic encryption and factoring).

We also give, under polynomial assumptions, a two-message witness-hiding protocol for any language L ∈ NP that has a witness encryption scheme. This protocol is publicly verifiable.

Our technique is based on a new homomorphic trapdoor paradigm, which can be seen as a non-black-box analog of the classic Feige-Lapidot-Shamir trapdoor paradigm.

Finding a Nash equilibrium is no easier than breaking Fiat-Shamir

The Fiat-Shamir heuristic transforms a public-coin interactive proof into a non-interactive argument, by replacing the verifier with a cryptographic hash function that is applied to the protocol’s transcript. Constructing hash functions for which this transformation is sound is a central and long-standing open question in cryptography.

We show that solving the ENDOFMETEREDLINE problem is no easier than breaking the soundness of the Fiat-Shamir transformation when applied to the sumcheck protocol. In particular, if the transformed protocol is sound, then any hard problem in #P gives rise to a hard distribution in the class CLS, which is contained in PPAD. Our result opens up the possibility of sampling moderately-sized games for which it is hard to find a Nash equilibrium, by reducing the inversion of appropriately chosen one-way functions to #SAT.

Our main technical contribution is a stateful incrementally verifiable procedure that, given a SAT instance over n variables, counts the number of satisfying assignments. This is accomplished via an exponential sequence of small steps, each computable in time poly(n). Incremental verifiability means that each intermediate state includes a sumcheck-based proof of its correctness, and the proof can be updated and verified in time poly(n).
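
The counting skeleton described in the last paragraph can be pictured as follows (a bare-bones sketch with the sumcheck-based proofs omitted entirely, which is of course the technical heart of the paper): the state after step i records how many of the first i assignments satisfy the instance, and each step costs poly(n).

from itertools import product

def count_sat_incrementally(clauses, n):
    # clauses: list of lists of nonzero ints, DIMACS-style, e.g. [[1, -2], [2, 3]].
    state = (0, 0)                                    # (assignments processed, count so far)
    for bits in product([False, True], repeat=n):     # an exponential sequence of small steps
        sat = all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses)
        state = (state[0] + 1, state[1] + int(sat))
        # ... the real procedure would also update a poly(n)-size sumcheck proof here ...
    return state[1]

print(count_sat_incrementally([[1, -2], [2, 3], [-1, 3]], n=3))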

How to delegate computations publicly

We construct a delegation scheme for all polynomial time computations. Our scheme is publicly verifiable and completely non-interactive in the common reference string (CRS) model.

Our scheme is based on an efficiently falsifiable decisional assumption on groups with bilinear maps. Prior to this work, publicly verifiable non-interactive delegation schemes were only known under knowledge assumptions (or in the Random Oracle model) or under non-standard assumptions related to obfuscation or multilinear maps.

We obtain our result in two steps. First, we construct a scheme with a long CRS (polynomial in the running time of the computation) by following the blueprint of Paneth and Rothblum (TCC 2017). Then we bootstrap this scheme to obtain a short CRS. Our bootstrapping theorem exploits the fact that our scheme can securely delegate certain non-deterministic computations.

SESSION: Algorithms

On a generalization of iterated and randomized rounding

We give a general method for rounding linear programs that combines the commonly used iterated rounding and randomized rounding techniques. In particular, we show that whenever iterated rounding can be applied to a problem with some slack, there is a randomized procedure that returns an integral solution that satisfies the guarantees of iterated rounding and also has concentration properties. We use this to give new results for several classic problems where iterated rounding has been useful.

The online 𝑘-taxi problem

We consider the online k-taxi problem, a generalization of the k-server problem, in which k taxis serve a sequence of requests in a metric space. A request consists of two points s and t, representing a passenger that wants to be carried by a taxi from s to t. The goal is to serve all requests while minimizing the total distance traveled by all taxis. The problem comes in two flavors, called the easy and the hard k-taxi problem: In the easy k-taxi problem, the cost is defined as the total distance traveled by the taxis; in the hard k-taxi problem, the cost is only the distance of empty runs.

The hard k-taxi problem is substantially more difficult than the easy version, with at least an exponential deterministic competitive ratio, Ω(2^k), admitting a reduction from the layered graph traversal problem. In contrast, the easy k-taxi problem has exactly the same competitive ratio as the k-server problem. We focus mainly on the hard version. For hierarchically separated trees (HSTs), we present a memoryless randomized algorithm with competitive ratio 2^k − 1 against adaptive online adversaries and provide two matching lower bounds: for arbitrary algorithms against adaptive adversaries and for memoryless algorithms against oblivious adversaries. Due to well-known HST embedding techniques, the algorithm implies a randomized O(2^k log n)-competitive algorithm for arbitrary n-point metrics. This is the first competitive algorithm for the hard k-taxi problem for general finite metric spaces and general k. For the special case of k=2, we obtain a precise answer of 9 for the competitive ratio in general metrics. With an algorithm based on growing, shrinking and shifting regions, we show that one can achieve a constant competitive ratio also for the hard 3-taxi problem on the line (abstracting the scheduling of three elevators).

Achieving optimal backlog in multi-processor cup games

Many problems in processor scheduling, deamortization, and buffer management can be modeled as single- and multi-processor cup games.

At the beginning of the single-processor n-cup game, all cups are empty. In each step of the game, a filler distributes 1−є units of water among the cups, and then an emptier selects a cup and removes up to 1 unit of water from it. The goal of the emptier is to minimize the amount of water in the fullest cup, also known as the backlog. The greedy algorithm (i.e., empty from the fullest cup) is known to achieve backlog O(logn), and no deterministic algorithm can do better.
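
As a toy illustration of the single-processor game just described (our own sketch, unrelated to the paper's smoothed analysis), the following simulation pits the greedy emptier against one classical filling strategy that spreads water over a shrinking set of untouched cups; the resulting backlog grows like the harmonic number, roughly ln n, consistent with the Θ(log n) behavior of deterministic greedy.

import numpy as np

n, eps = 256, 0.01
fill = np.zeros(n)
alive = list(range(n))                        # cups the filler still targets
while len(alive) > 1:
    for i in alive:                           # filler: (1 - eps)/|S| into each cup of S
        fill[i] += (1 - eps) / len(alive)
    j = max(alive, key=lambda i: fill[i])     # emptier (greedy): pick the fullest cup
    fill[j] = max(0.0, fill[j] - 1.0)         # ... and remove up to 1 unit from it
    alive.remove(j)                           # filler ignores this cup from now on

print("final backlog:", fill.max(), " ln(n):", np.log(n))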

We show that the performance of the greedy algorithm can be exponentially improved with a small amount of randomization: after each step and for any k ≥ Ω(log є^{−1}), the emptier achieves backlog at most O(k) with probability at least 1 − O(2^{−2^k}). We call our algorithm the smoothed greedy algorithm because it follows from a smoothed analysis of the (standard) greedy algorithm.

In each step of the p-processor n-cup game, the filler distributes p(1−є) units of water among the cups, with no cup receiving more than 1−δ units of water, and then the emptier selects p cups and removes 1 unit of water from each. Proving nontrivial bounds on the backlog for the multi-processor cup game has remained open for decades. We present a simple analysis of the greedy algorithm for the multi-processor cup game, establishing a backlog of O(є^{−1} log n), as long as δ > 1/poly(n).

Turning to randomized algorithms, we find that the backlog drops to constant. Specifically, we show that if є and δ satisfy reasonable constraints, then there exists an algorithm that bounds the backlog after a given step by 3 with probability at least 1 − O(exp(−Ω(є^2 p))).

We prove that our results are asymptotically optimal for constant є, in the sense that no algorithms can achieve better bounds, up to constant factors in the backlog and in p. Moreover, we prove robustness results, demonstrating that our randomized algorithms continue to behave well even when placed in bad starting states.

Planar point sets determine many pairwise crossing segments

We show that any set of n points in general position in the plane determines n1−o(1) pairwise crossing segments. The best previously known lower bound, Ω(√n), was proved more than 25 years ago by Aronov, Erdős, Goddard, Kleitman, Klugerman, Pach, and Schulman. Our proof is fully constructive, and extends to dense geometric graphs.

SESSION: Graph Algorithms III

Graph pattern detection: hardness for all induced patterns and faster non-induced cycles

We consider the pattern detection problem in graphs: given a constant size pattern graph H and a host graph G, determine whether G contains a subgraph isomorphic to H. We present the following new improved upper and lower bounds:

We prove that if a pattern H contains a k-clique subgraph, then detecting whether an n node host graph contains a not necessarily induced copy of H requires at least the time for detecting whether an n node graph contains a k-clique. The previous result of this nature required that H contains a k-clique which is disjoint from all other k-cliques of H.

We show that if the famous Hadwiger conjecture from graph theory is true, then detecting whether an n node host graph contains a not necessarily induced copy of a pattern with chromatic number t requires at least the time for detecting whether an n node graph contains a t-clique. This implies that: (a) under Hadwiger’s conjecture, for every k-node pattern H, finding an induced copy of H requires at least the time of √k-clique detection and size ω(n^{k/4}) for any constant depth circuit, and (b) unconditionally, detecting an induced copy of a random G(k,p) pattern w.h.p. requires at least the time of Θ(k/log k)-clique detection, and hence also at least size n^{Ω(k/log k)} for circuits of constant depth.

We show that for every k, there exists a k-node pattern that contains a (k−1)-clique and that can be detected as an induced subgraph in n node graphs in the best known running time for (k−1)-Clique detection. Previously such a result was only known for infinitely many k.

Finally, we consider the case when the pattern is a directed cycle on k nodes, and we would like to detect whether a directed m-edge graph G contains a k-Cycle as a not necessarily induced subgraph. We resolve a 14-year-old conjecture of [Yuster-Zwick SODA’04] on the complexity of k-Cycle detection by giving a tight analysis of their k-Cycle algorithm. Our analysis improves the best bounds for k-Cycle detection in directed graphs, for all k > 5.

New polynomial delay bounds for maximal subgraph enumeration by proximity search

In this paper we propose polynomial delay algorithms for several maximal subgraph listing problems, by means of a seemingly novel technique which we call proximity search. Our result involves modeling the space of solutions as an implicit directed graph called “solution graph”, a method common to other enumeration paradigms such as reverse search. Such methods, however, can become inefficient due to this graph having vertices with high (potentially exponential) degree. The novelty of our algorithm consists in providing a technique for generating better solution graphs, reducing the out-degree of its vertices with respect to existing approaches, and proving that it remains strongly connected. Applying this technique, we obtain polynomial delay listing algorithms for several problems for which output-sensitive results were, to the best of our knowledge, not known. These include Maximal Bipartite Subgraphs, Maximal k-Degenerate Subgraphs (for bounded k), Maximal Induced Chordal Subgraphs, and Maximal Induced Trees. We present these algorithms, and give insight on how this general technique can be applied to other problems.

Explicit N-vertex graphs with maximum degree K and diameter [1+o(1)] log_{K-1} N for each K-1 a prime power

Here we first present the solution of a long-standing open question: the explicit construction of an infinite family of N-vertex cubic graphs that have diameter [1+o(1)] log_2 N. We then extend the techniques to construct, for each K of the form 2^s+1 or K = p^s+1 (s an integer and p a prime), an infinite family of K-regular graphs on N vertices with diameter [1+o(1)] log_{K−1} N.

SESSION: Complexity Theory II

Sylvester-Gallai type theorems for quadratic polynomials

We prove Sylvester-Gallai type theorems for quadratic polynomials. Specifically, we prove that if a finite collection Q of irreducible polynomials of degree at most 2 satisfies that for every two polynomials Q_1, Q_2 ∈ Q there is a third polynomial Q_3 ∈ Q so that whenever Q_1 and Q_2 vanish then also Q_3 vanishes, then the linear span of the polynomials in Q has dimension O(1). We also prove a colored version of the theorem: if three finite sets of quadratic polynomials satisfy that for every two polynomials from distinct sets there is a polynomial in the third set satisfying the same vanishing condition, then all polynomials are contained in an O(1)-dimensional space.

This answers affirmatively two conjectures of Gupta [Electronic Colloquium on Computational Complexity (ECCC), 21:130, 2014] that were raised in the context of solving certain depth-4 polynomial identities.

To obtain our main theorems we prove a new result classifying the possible ways that a quadratic polynomial Q can vanish when two other quadratic polynomials vanish. Our proofs also require robust versions of a theorem of Edelstein and Kelly (that extends the Sylvester-Gallai theorem to colored sets).

Weak lower bounds on resource-bounded compression imply strong separations of complexity classes

The Minimum Circuit Size Problem (MCSP) asks to determine the minimum size of a circuit computing a given truth table. MCSP is a natural and powerful string compression problem using bounded-size circuits. Recently, Oliveira and Santhanam [FOCS 2018] and Oliveira, Pich, and Santhanam [ECCC 2018] demonstrated a “hardness magnification” phenomenon for MCSP in restricted settings. Letting MCSP[s(n)] be the problem of deciding if a truth table of length 2^n has circuit complexity at most s(n), they proved that small (fixed-polynomial) average case circuit/formula lower bounds for MCSP[2^{√n}], or lower bounds for approximating MCSP[2^{o(n)}], would imply major separations such as NP ⊄ BPP and NP ⊄ P/poly.

We strengthen their results in several directions, obtaining magnification results from worst-case lower bounds on exactly computing the search version of generalizations of MCSP[s(n)], which also extend to time-bounded Kolmogorov complexity. In particular, we show that search-MCSP[s(n)] (where we must output an s(n)-size circuit when it exists) admits extremely efficient AC^0 circuits and streaming algorithms using Σ_3 SAT oracle gates of small fan-in (related to the size s(n) we want to test).

For A : {0,1}* → {0,1}, let search-MCSP^A[s(n)] be the following problem: given a truth table T of size N = 2^n, output a Boolean circuit for T of size at most s(n) with AND, OR, NOT, and A-oracle gates (or report that no such circuit exists). Some consequences of our results are:

(1) For reasonable s(n) ≥ n and A ∈ PH, if search-MCSP^A[s(n)] does not have a 1-pass deterministic poly(s(n))-space streaming algorithm with poly(s(n)) update time, then P ≠ NP. For example, proving that it is impossible to synthesize SAT-oracle circuits of size 2^{n/log n} with a streaming algorithm on truth tables of length N = 2^n using N^ε update time and N^ε space on length-N inputs (for some ε > 0) would already separate P and NP. Note that some extremely simple functions, such as EQUALITY of two strings, already satisfy such lower bounds.

(2) If search-MCSP[n^c] lacks Õ(N)-size, Õ(1)-depth circuits for some c ≥ 1, then NP ⊄ P/poly.

(3) If search-MCSP[s(n)] does not have N · poly(s(n))-size, O(log N)-depth circuits, then NP ⊄ NC^1. Note it is known that MCSP[2^{√n}] does not have formulas of N^{1.99} size [Hirahara and Santhanam, CCC 2017].

(4) If there is an ε > 0 such that for all c ≥ 1, search-MCSP[2^{n/c}] does not have N^{1+ε}-size O(1/ε)-depth ACC^0 circuits, then NP ⊄ ACC^0. Thus the amplification results of Allender and Koucký [JACM 2010] can extend to problems in NP and beyond.

Furthermore, if we substitute ⊕ P, PP, PSPACE, or EXP-complete problems for the oracle A, we obtain separations for those corresponding complexity classes instead of NP. Analogues of the above results hold for time-bounded Kolmogorov complexity as well.

Mean-field approximation, convex hierarchies, and the optimality of correlation rounding: a unified perspective

The free energy is a key quantity of interest in Ising models, but unfortunately, computing it in general is computationally intractable. Two popular (variational) approximation schemes for estimating the free energy of general Ising models (in particular, even in regimes where correlation decay does not hold) are: (i) the mean-field approximation with roots in statistical physics, which estimates the free energy from below, and (ii) hierarchies of convex relaxations with roots in theoretical computer science, which estimate the free energy from above. We show, surprisingly, that the tight regime for both methods to compute the free energy to leading order is identical.

More precisely, we show that the mean-field approximation to the free energy is within O((n||J||_F)^{2/3}) of the true free energy, where ||J||_F denotes the Frobenius norm of the interaction matrix of the Ising model. This simultaneously subsumes both the breakthrough work of Basak and Mukherjee, who showed the tight result that the mean-field approximation is within o(n) whenever ||J||_F = o(√n), as well as the work of Jain, Koehler, and Mossel, who gave the previously best known non-asymptotic bound of O((n||J||_F)^{2/3} log^{1/3}(n||J||_F)). We give a simple, algorithmic proof of this result using a convex relaxation proposed by Risteski based on the Sherali-Adams hierarchy, automatically giving sub-exponential time approximation schemes for the free energy in this entire regime. Our algorithmic result is tight under Gap-ETH.
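
For a sense of the two quantities being compared, the following toy computation (a sketch with arbitrary parameters, not the paper's Sherali-Adams-based algorithm) evaluates the exact free energy of a small Ising model by enumeration and the naive mean-field lower bound obtained from a product measure found by damped fixed-point iteration; the gap is nonnegative for any product measure, whether or not the iteration has converged to the optimum.

import itertools
import numpy as np

rng = np.random.default_rng(7)
n = 10
J = rng.standard_normal((n, n)) / np.sqrt(n)
J = (J + J.T) / 2
np.fill_diagonal(J, 0.0)                      # symmetric interactions, zero diagonal
h = 0.1 * rng.standard_normal(n)

# Exact free energy F = log sum_x exp(x^T J x + h^T x) over x in {-1,+1}^n.
xs = np.array(list(itertools.product([-1, 1], repeat=n)))
F_exact = np.log(np.exp((xs * (xs @ J)).sum(axis=1) + xs @ h).sum())

# Mean-field lower bound: E_m[x^T J x + h^T x] + entropy, for a product measure with means m.
def entropy(m):
    p = np.clip((1 + m) / 2, 1e-12, 1 - 1e-12)
    return float(-(p * np.log(p) + (1 - p) * np.log(1 - p)).sum())

m = np.zeros(n)
for _ in range(2000):
    m = 0.5 * m + 0.5 * np.tanh(2 * J @ m + h)     # damped fixed-point iteration
F_mf = m @ J @ m + h @ m + entropy(m)

print(F_exact, F_mf, F_exact - F_mf)               # the gap is nonnegative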

We furthermore combine our techniques with spin glass theory to prove (in a strong sense) the optimality of correlation rounding, refuting a recent conjecture of Allen, O’Donnell, and Zhou. Finally, we give the tight generalization of all of these results to k-MRFs, capturing as a special case previous work on approximating MAX-k-CSP.

SESSION: Graph Canonization

Canonical form for graphs in quasipolynomial time: preliminary report

We outline how to turn the author's quasipolynomial-time graph isomorphism test into a construction of a canonical form within the same time bound. The proof involves a nontrivial modification of the central symmetry-breaking tool, the construction of a canonical relational structure of logarithmic arity on the ideal domain based on local certificates.

A unifying method for the design of algorithms canonizing combinatorial objects

We devise a unified framework for the design of canonization algorithms. Using hereditarily finite sets, we define a general notion of combinatorial objects that includes graphs, hypergraphs, relational structures, codes, permutation groups, tree decompositions, and so on.

Our approach allows for a systematic transfer of the techniques that have been developed for isomorphism testing to canonization. We use it to design a canonization algorithm for general combinatorial objects. This result gives new fastest canonization algorithms with an asymptotic running time matching the best known isomorphism algorithm for the following types of objects: hypergraphs, hypergraphs of bounded color class size, permutation groups (up to permutational isomorphism) and codes that are explicitly given (up to code equivalence).