We revisit the constant-factor approximation algorithm for the asymmetric traveling salesman problem by Svensson, Tarnawski, and Végh [STOC 2018]. We improve on each part of this algorithm. We avoid the reduction to irreducible instances and thus obtain a simpler and much better reduction to vertebrate pairs. We also show that a slight variant of their algorithm for vertebrate pairs has a much smaller approximation ratio. Overall we improve the approximation ratio from 506 to 22+ε for any ε > 0. This also improves the upper bound on the integrality ratio from 319 to 22.
We present a black-box reduction from the path version of the Traveling Salesman Problem (Path TSP) to the classical tour version (TSP). More precisely, we show that, given an α-approximation algorithm for TSP, for any ε > 0 there is an (α+ε)-approximation algorithm for the more general Path TSP. This reduction implies that the approximability of Path TSP is the same as that of TSP, up to an arbitrarily small error. It thus avoids future discrepancies between the best known approximation factors achievable for these two problems, such as those that existed until very recently.
A well-studied special case of TSP, Graph TSP, asks for tours in unit-weight graphs. Our reduction shows that any α-approximation algorithm for Graph TSP implies an (α+ε)-approximation algorithm for its path version. By applying our reduction to the 1.4-approximation algorithm for Graph TSP by Sebő and Vygen, we obtain a polynomial-time (1.4+ε)-approximation algorithm for Graph Path TSP, improving on a recent 1.497-approximation algorithm of Traub and Vygen.
We obtain our results through a variety of new techniques, including a novel way to set up a recursive dynamic program to guess significant parts of an optimal solution. At the core of our dynamic program we deal with instances of a new generalization of (Path) TSP which combines parity constraints with certain connectivity requirements. This problem, which we call Φ-TSP, has a constant-factor approximation algorithm and can be reduced to TSP in certain cases when the dynamic program would not make sufficient progress.
We design a 1.49993-approximation algorithm for the metric traveling salesperson problem (TSP) for instances in which an optimal solution to the subtour linear programming relaxation is half-integral. These instances received significant attention over the last decade due to a conjecture of Schalekamp, Williamson and van Zuylen stating that half-integral LP solutions have the largest integrality gap over all fractional solutions. So, if the conjecture of Schalekamp et al. holds true, our result shows that the integrality gap of the subtour polytope is bounded away from 3/2.
The symmetric traveling salesman problem (TSP) is the problem of finding the shortest Hamiltonian cycle in an edge-weighted undirected graph. In 1962 Bellman, and independently Held and Karp, showed that TSP instances with n cities can be solved in O(n^2 2^n) time. Since then it has been a notorious open problem to improve the runtime to O((2−ε)^n) for some constant ε > 0. In this work we establish the following progress: If (s × s)-matrices can be multiplied in s^{2+o(1)} time, then all instances of TSP in bipartite graphs can be solved in O(1.9999^n) time by a randomized algorithm with constant error probability. We also indicate how our methods may be useful to solve TSP in non-bipartite graphs.
On a high level, our approach is via a new problem called MinHamPair: Given two families of weighted perfect matchings, find a combination of minimum weight that forms a Hamiltonian cycle. As our main technical contribution, we give a fast algorithm for MinHamPair based on a new sparse cut-based factorization of the ‘matchings connectivity matrix’, introduced by Cygan et al. [JACM’18].
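For context, the O(n^2 2^n) bound above is the classical Bellman–Held–Karp dynamic program over subsets of cities. A minimal sketch of that baseline in Python (ours, with an arbitrary illustrative distance matrix; it is not the new matrix-multiplication-based algorithm for bipartite instances) is:

from itertools import combinations

def held_karp(dist):
    """Exact TSP via the Bellman-Held-Karp dynamic program, O(n^2 * 2^n) time.

    dist[i][j] is the cost of going from city i to city j; the tour starts
    and ends at city 0.  This illustrates only the classical baseline.
    """
    n = len(dist)
    # dp[(S, j)] = cheapest way to start at 0, visit exactly the set S (which contains 0 and j), and end at j.
    dp = {(1 << 0 | 1 << j, j): dist[0][j] for j in range(1, n)}
    for size in range(3, n + 1):
        for subset in combinations(range(1, n), size - 1):
            S = 1 << 0
            for v in subset:
                S |= 1 << v
            for j in subset:
                prev = S ^ (1 << j)
                dp[(S, j)] = min(dp[(prev, i)] + dist[i][j]
                                 for i in subset if i != j)
    full = (1 << n) - 1
    return min(dp[(full, j)] + dist[j][0] for j in range(1, n))

# Example: a 4-city instance with an arbitrary asymmetric-looking cost matrix.
print(held_karp([[0, 2, 9, 10], [1, 0, 6, 4], [15, 7, 0, 8], [6, 3, 12, 0]]))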
We introduce the binary value principle which is a simple subset-sum instance expressing that a natural number written in binary cannot be negative, relating it to central problems in proof and algebraic complexity. We prove conditional superpolynomial lower bounds on the Ideal Proof System (IPS) refutation size of this instance, based on a well-known hypothesis by Shub and Smale about the hardness of computing factorials, where IPS is the strong algebraic proof system introduced by Grochow and Pitassi (2018). Conversely, we show that short IPS refutations of this instance bridge the gap between sufficiently strong algebraic and semi-algebraic proof systems. Our results extend to full-fledged IPS the paradigm introduced in Forbes et al. (2016), whereby lower bounds against subsystems of IPS were obtained using restricted algebraic circuit lower bounds, and demonstrate that the binary value principle captures the advantage of semi-algebraic over algebraic reasoning, for sufficiently strong systems. Specifically, we show the following:
Conditional IPS lower bounds: The Shub-Smale hypothesis (1995) implies a superpolynomial lower bound on the size of IPS refutations of the binary value principle over the rationals, defined as the unsatisfiable linear equation ∑_{i=1}^{n} 2^{i−1} x_i = −1, for Boolean x_i's. Further, the related τ-conjecture (1995) implies a superpolynomial lower bound on the size of IPS refutations of a variant of the binary value principle over the ring of rational functions. No prior conditional lower bounds were known for IPS or for apparently much weaker propositional proof systems such as Frege.
Algebraic vs. semi-algebraic proofs: Admitting short refutations of the binary value principle is necessary for any algebraic proof system to fully simulate any known semi-algebraic proof system, and for strong enough algebraic proof systems it is also sufficient. In particular, we introduce a very strong proof system that simulates all known semi-algebraic proof systems (and most other known concrete propositional proof systems), under the name Cone Proof System (CPS), as a semi-algebraic analogue of the ideal proof system: CPS establishes the unsatisfiability of collections of polynomial equalities and inequalities over the reals, by representing sum-of-squares proofs (and extensions) as algebraic circuits. We prove that IPS is polynomially equivalent to CPS iff IPS admits polynomial-size refutations of the binary value principle (for the language of systems of equations that have no 0/1-solutions), over both ℤ and ℚ.
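As a concrete toy case of the principle above (our own illustration): for n = 2 the binary value principle is the equation

\[
x_1 + 2x_2 = -1, \qquad x_1, x_2 \in \{0,1\},
\]

which has no Boolean solution since the left-hand side only takes values in {0, 1, 2, 3}. For general n, the results above say that certifying this simple nonnegativity fact succinctly is conditionally hard for IPS, and that the ability to do so is precisely what separates sufficiently strong semi-algebraic systems, which can reason about nonnegativity directly, from algebraic ones.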
We show that Cutting Planes (CP) proofs are hard to find: Given an unsatisfiable formula F, it is NP-hard to find a CP refutation of F in time polynomial in the length of the shortest such refutation; and unless Gap-Hitting-Set admits a nontrivial algorithm, one cannot find a tree-like CP refutation of F in time polynomial in the length of the shortest such refutation.
The first result extends the recent breakthrough of Atserias and Müller (FOCS 2019) that established an analogous result for Resolution. Our proofs rely on two new lifting theorems: (1) Dag-like lifting for gadgets with many output bits. (2) Tree-like lifting that simulates an r-round protocol with gadgets of query complexity O(r).
One of the major open problems in proof complexity is to prove lower bounds on AC0[p]-Frege proof systems. As a step toward this goal, Impagliazzo, Mouli and Pitassi in a recent paper suggested proving lower bounds on the size of Polynomial Calculus refutations over the {±1} basis. In this paper we give a technique for proving such lower bounds, and we moreover give size lower bounds for Sum-of-Squares over the {±1} basis.
We prove lower bounds for random Δ-CNF formulas and for formulas composed with a gadget. As a byproduct, we establish a separation between Polynomial Calculus and Sum-of-Squares over the {±1} basis by proving a lower bound for the Pigeonhole Principle.
We give a surprising classification for the computational complexity of the Quantified Constraint Satisfaction Problem over a constraint language Γ, QCSP(Γ), where Γ is a finite language over 3 elements which contains all constants. In particular, such problems are either in P, NP-complete, co-NP-complete or PSpace-complete. Our classification refutes the hitherto widely-believed Chen Conjecture.
Additionally, we show that already on a 4-element domain there exists a constraint language Γ such that QCSP(Γ) is DP-complete (from the Boolean Hierarchy), and that on a 10-element domain there exists a constraint language giving the complexity class Θ_2^P.
Meanwhile, we prove the Chen Conjecture for finite conservative languages Γ. If the polymorphism clone of such Γ has the polynomially generated powers (PGP) property then QCSP(Γ) is in NP. Otherwise, the polymorphism clone of Γ has the exponentially generated powers (EGP) property and QCSP(Γ) is PSpace-complete.
This paper focuses on the contention resolution problem on a shared communication channel that does not support collision detection. A shared communication channel is a multiple access channel, which consists of a sequence of synchronized time slots. Players on the channel may attempt to broadcast a packet (message) in any time slot. A player's broadcast succeeds if no other player broadcasts during that slot. If two or more players broadcast in the same time slot, then the broadcasts collide and all of these broadcasts fail. The lack of collision detection means that a player monitoring the channel cannot differentiate between the case of two or more players broadcasting in the same slot (a collision) and zero players broadcasting. In the contention-resolution problem, players arrive on the channel over time, and each player has one packet to transmit. The goal is to coordinate the players so that each player is able to successfully transmit its packet within reasonable time. However, the players can only communicate via the shared channel by choosing to either broadcast or not. A contention-resolution protocol is measured in terms of its throughput (channel utilization). Previous work on contention resolution that achieved constant throughput assumed that either players could detect collisions, or the players' arrival pattern is generated by a memoryless (non-adversarial) process.
The foundational question answered by this paper is whether collision detection is a luxury or necessity when the objective is to achieve constant throughput. We show that even without collision detection, one can solve contention resolution, achieving constant throughput, with high probability.
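To make the channel model concrete, here is a toy simulation (ours, illustrating only the model and a naive strategy, not the paper's protocol): each of n players broadcasts independently with probability 1/n in every slot, a slot is useful only when exactly one player broadcasts, and a listener without collision detection cannot tell a collided slot from a silent one.

import random

def simulate_slots(n=100, slots=1000, p=None, seed=0):
    """Toy model of a multiple-access channel without collision detection.

    Each of n players broadcasts independently with probability p per slot.
    A broadcast succeeds only if it is the sole broadcast in that slot.
    Returns the fraction of slots with a successful transmission (the
    throughput of this naive strategy, which peaks near 1/e for p = 1/n).
    """
    rng = random.Random(seed)
    p = 1.0 / n if p is None else p
    successes = 0
    for _ in range(slots):
        broadcasters = sum(1 for _ in range(n) if rng.random() < p)
        if broadcasters == 1:
            successes += 1  # exactly one speaker: the packet gets through
        # broadcasters == 0 and broadcasters >= 2 look identical to a
        # listener on a channel without collision detection.
    return successes / slots

print(simulate_slots())  # roughly 1/e ≈ 0.37 for p = 1/n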
Population protocols are a model of distributed computing, where n agents with limited computational power and memory perform randomly scheduled pairwise interactions. A fundamental problem in this setting is that of leader election, where all agents start from the same state, and they seek to reach and maintain a global state where exactly one agent is in a dedicated leader state. A significant amount of work has been devoted to the study of the time and space complexity of this problem. Alistarh et al. (SODA’17) have shown that Ω(log log n) states per agent are needed in order to elect a leader in fewer than Θ(n^2) expected interactions. Moreover, Ω(n log n) expected interactions are required regardless of the number of states (Sudo and Masuzawa, 2019). On the upper bound side, Gasieniec and Stachowiak (SODA’18) have presented the first protocol that uses an optimal, Θ(log log n), number of states and elects a leader in O(n log^2 n) expected interactions. This running time was subsequently improved to O(n log n log log n) (Gasieniec et al., SPAA’19).
In this paper we provide the first leader election population protocol that is both time and space optimal: it uses Θ(log log n) states per agent, and elects a leader in O(n log n) interactions in expectation. A key novel component of our approach is a simple protocol that efficiently selects a small set of agents, of poly(log n) size, given an initial set of s = O(n^ε) selected agents. Unlike existing approaches, which proceed by shrinking the initial set monotonically over time, our protocol first increases the set in a controlled way to a specific size (which is independent of s), before it shrinks the set to a poly(log n) size.
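For intuition about the interaction model and the Θ(n^2) barrier quoted above, here is the folklore two-state protocol (a sketch of ours; it is not the paper's protocol and is far from optimal): every agent starts as a leader, and whenever the scheduler brings two leaders together one of them becomes a follower. It uses only 2 states but needs Θ(n^2) interactions in expectation, exactly the regime that the Θ(log log n)-state protocols above escape.

import random

def folklore_leader_election(n=500, seed=0):
    """Two-state leader election under random pairwise interactions.

    All agents start as leaders; when the scheduler picks two leaders,
    one of them becomes a follower.  Returns the number of interactions
    until a single leader remains (Theta(n^2) in expectation), illustrating
    the population-protocol model, not the paper's O(n log n) protocol.
    """
    rng = random.Random(seed)
    leader = [True] * n
    leaders_left = n
    interactions = 0
    while leaders_left > 1:
        a, b = rng.sample(range(n), 2)  # random scheduler picks a pair
        interactions += 1
        if leader[a] and leader[b]:
            leader[b] = False           # one leader survives the meeting
            leaders_left -= 1
    return interactions

print(folklore_leader_election())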
In this paper, we study submodular function minimization in the adaptive complexity model. Seminal work by Grötschel, Lovász, and Schrijver shows that with oracle access to a function f, the problem of submodular minimization can be solved exactly with poly(n) queries to f. A long line of work has since then been dedicated to the acceleration of submodular minimization. In particular, recent work obtains a (strongly) polynomial time algorithm with Õ(n^3) query complexity. A natural way to accelerate computation is via parallelization, though very little is known about the extent to which submodular minimization can be parallelized.
A natural measure for the parallel runtime of a black-box optimization algorithm is its adaptivity, as recently introduced in the context of submodular maximization. Informally, the adaptivity of an algorithm is the number of sequential rounds it makes when each round can execute polynomially many function evaluations in parallel. In the past two years there have been breakthroughs in the study of adaptivity for both submodular maximization and convex minimization; in particular, an exponential improvement in the parallel running time of submodular maximization was obtained with an O(log n)-adaptive algorithm. Whether submodular minimization can enjoy, thanks to parallelization, the same dramatic speedups as submodular maximization is unknown. To date, we do not know of any polynomial time algorithm for solving submodular minimization whose adaptivity is subquadratic in n.
We initiate the study of the adaptivity of submodular function minimization by giving the first non-trivial lower bound for the parallel runtime of submodular minimization. We show that there is no o(log n / log log n)-adaptive algorithm with poly(n) queries which solves the problem of submodular minimization. This is the first adaptivity lower bound for unconstrained submodular optimization (whether for maximization or minimization), and the analysis relies on a novel and intricate construction of submodular functions.
In large-data applications, it is desirable to design algorithms with a high degree of parallelization. In the context of submodular optimization, adaptive complexity has become a widely-used measure of an algorithm’s “sequentiality”. Algorithms in the adaptive model proceed in rounds, and can issue polynomially many queries to a function f in each round. The queries in each round must be independent, produced by a computation that depends only on query results obtained in previous rounds.
In this work, we examine two fundamental variants of submodular maximization in the adaptive complexity model: cardinality-constrained monotone maximization, and unconstrained non-monotone maximization. Our main result is that an r-round algorithm for cardinality-constrained monotone maximization cannot achieve an approximation factor better than 1 − 1/e − Ω(min{1/r, log^2 n / r^3}), for any r < n^c (where c > 0 is some constant). This is the first result showing that the number of rounds must blow up polynomially as we approach the optimal factor of 1 − 1/e.
For the unconstrained non-monotone maximization problem, we show a positive result: For every instance and every δ > 0, either we obtain a (1/2 − δ)-approximation in 1 round, or a (1/2 + Ω(δ^2))-approximation in O(1/δ^2) rounds. In particular (and in contrast to the cardinality-constrained case), there cannot be an instance where (i) it is impossible to achieve an approximation factor better than 1/2 regardless of the number of rounds, and (ii) it takes r rounds to achieve a factor of 1/2 − O(1/r).
In the dynamic Single-Source Shortest Paths (SSSP) problem, we are given a graph G=(V,E) subject to edge insertions and deletions and a source vertex s∈ V, and the goal is to maintain the distance d(s,t) for all t∈ V.
Fine-grained complexity has provided strong lower bounds for exact partially dynamic SSSP and approximate fully dynamic SSSP [ESA’04, FOCS’14, STOC’15]. Thus much focus has been directed towards finding efficient partially dynamic (1+ε)-approximate SSSP algorithms [STOC’14, ICALP’15, SODA’14, FOCS’14, STOC’16, SODA’17, ICALP’17, ICALP’19, STOC’19, SODA’20, SODA’20]. Despite this rich literature, for directed graphs there are no known deterministic algorithms for (1+ε)-approximate dynamic SSSP that perform better than the classic ES-tree [JACM’81]. We present the first such algorithm.
We present a deterministic data structure for incremental SSSP in weighted directed graphs with total update time Õ(n^2 log W / ε^{O(1)}), which is near-optimal for very dense graphs; here W is the ratio of the largest weight in the graph to the smallest. Our algorithm also improves over the best known partially dynamic randomized algorithm for directed SSSP by Henzinger et al. [STOC’14, ICALP’15] if m = ω(n^{1.1}).
Complementing our algorithm, we provide improved conditional lower bounds. Henzinger et al. [STOC’15] showed that under the OMv Hypothesis, the partially dynamic exact s-t Shortest Path problem in undirected graphs requires amortized update or query time m^{1/2−o(1)}, given polynomial preprocessing time. Under a new hypothesis about finding Cliques, we improve the update and query lower bound for algorithms with polynomial preprocessing time to m^{0.626−o(1)}. Further, under the k-Cycle hypothesis, we show that any partially dynamic SSSP algorithm with O(m^{2−ε}) preprocessing time requires amortized update or query time m^{1−o(1)}, which is essentially optimal. All previous conditional lower bounds that come close to our bound [ESA’04, FOCS’14] only held for “combinatorial” algorithms, while our new lower bound does not make such restrictions.
Given a dynamic graph subject to insertions and deletions of edges, a natural question is whether the graph presently admits a planar embedding. We give a deterministic fully-dynamic algorithm for general graphs, running in amortized O(log^3 n) time per edge insertion or deletion, that maintains a bit indicating whether or not the graph is presently planar. This is an exponential improvement over the previous best algorithm [Eppstein, Galil, Italiano, Spencer, 1996], which spends amortized O(√n) time per update.
We give the first fully dynamic algorithm which maintains a (1−ε)-approximate densest subgraph in worst-case time poly(log n, ε^{−1}) per update. Dense subgraph discovery is an important primitive for many real-world applications such as community detection, link spam detection, distance query indexing, and computational biology. We approach the densest subgraph problem by framing its dual as a graph orientation problem, which we solve using an augmenting-path-like adjustment technique. Our result improves upon the previous best approximation factor of (1/4 − ε) for fully dynamic densest subgraph [Bhattacharya et al., STOC ‘15]. We also extend our techniques to solving the problem on vertex-weighted graphs with similar runtimes.
Additionally, we reduce the (1−ε)-approximate densest subgraph problem on directed graphs to O(log n / ε) instances of (1−ε)-approximate densest subgraph on vertex-weighted graphs. This reduction, together with our algorithm for vertex-weighted graphs, gives the first fully-dynamic algorithm for directed densest subgraph in worst-case time poly(log n, ε^{−1}) per update. Moreover, combined with a near-linear time algorithm for densest subgraph [Bahmani et al., WAW ‘14], this gives the first near-linear time algorithm for directed densest subgraph.
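The orientation-based dual view used above rests on a classical correspondence between subgraph density and graph orientations, stated here as background under the standard density ρ(S) = |E(S)|/|S| (the paper's fractional, approximate, and vertex-weighted formulations refine this):

\[
\max_{\emptyset \neq S \subseteq V} \frac{|E(S)|}{|S|} \;=\; \min_{\text{fractional orientations}} \; \max_{v \in V} \; \mathrm{load}(v),
\]

where a fractional orientation splits each edge's unit mass between its two endpoints and load(v) is the total mass assigned to v. Maintaining an orientation of near-minimum maximum load therefore corresponds, up to the approximation factor, to maintaining a near-densest subgraph, which is how the augmenting-path-like adjustments above are used after each update.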
We present a new dynamic matching sparsification scheme. From this scheme we derive a framework for dynamically rounding fractional matchings against adaptive adversaries. Plugging known dynamic fractional matching algorithms into our framework, we obtain numerous randomized dynamic matching algorithms which work against adaptive adversaries. In contrast, all previous randomized algorithms for this problem assumed a weaker, oblivious, adversary. Our dynamic algorithms against adaptive adversaries include, for any constant ε > 0, a (2+ε)-approximate algorithm with constant update time or polylog worst-case update time, as well as (2−δ)-approximate algorithms in bipartite graphs with arbitrarily small polynomial update time. All these results achieve polynomially better update-time-to-approximation trade-offs than previously known to be achievable against adaptive adversaries.
We develop a new technique for proving concentration inequalities which relate between the variance and influences of Boolean functions.
Using this technique, we first settle a conjecture of Talagrand, proving that ∫_{{−1,1}^n} √(h_f(x)) dµ ≥ C · Var(f) · ( log( 1 / ∑_i Inf_i^2(f) ) )^{1/2}, where h_f(x) is the number of edges at x along which f changes its value, Var(f) is the variance of f, and Inf_i(f) is the influence of the i-th coordinate.
Second, we strengthen several classical inequalities concerning the influences of a Boolean function, showing that near-maximizers must have large vertex boundaries. An inequality due to Talagrand states that for a Boolean function f, Var(f) ≤ C ∑_{i=1}^{n} Inf_i(f) / (1 + log(1/Inf_i(f))). We give a lower bound for the size of the vertex boundary of functions saturating this inequality. As a corollary, we show that for sets that satisfy the edge-isoperimetric inequality or the Kahn-Kalai-Linial inequality up to a constant, a constant proportion of the mass is in the inner vertex boundary.
Lastly, we improve a quantitative relation between influences and noise stability given by Keller and Kindler.
Our proofs rely on techniques based on stochastic calculus, and bypass the use of hypercontractivity common to previous proofs.
A function f: {0,1}^n → {0,1} is called an approximate AND-homomorphism if, choosing x, y ∈ {0,1}^n uniformly at random, we have that f(x ∧ y) = f(x) ∧ f(y) with probability at least 1−ε, where x ∧ y = (x_1 ∧ y_1, …, x_n ∧ y_n). We prove that if f: {0,1}^n → {0,1} is an approximate AND-homomorphism, then f is δ-close to either a constant function or an AND function, where δ(ε) → 0 as ε → 0. This improves on a result of Nehama, who proved a similar statement in which δ depends on n.
Our theorem implies a strong result on judgement aggregation in computational social choice. In the language of social choice, our result shows that if f is ε-close to satisfying judgement aggregation, then it is δ(ε)-close to an oligarchy (the name for the AND function in social choice theory). This improves on Nehama’s result, in which δ decays polynomially with n.
Our result follows from a more general one, in which we characterize approximate solutions to the eigenvalue equation T f = λ g, where T is the downwards noise operator T f(x) = 𝔼_y[f(x ∧ y)], f is [0,1]-valued, and g is {0,1}-valued. We identify all exact solutions to this equation, and show that any approximate solution in which T f and λ g are close is close to an exact solution.
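As a sanity check of the exact case (our own illustration, with hypothetical function names): constants and AND functions over a subset of coordinates are AND-homomorphisms with ε = 0, which a brute-force verification over small n confirms, while a function such as majority is not; the theorem above says that every approximate AND-homomorphism must be close to one of these exact examples.

from itertools import product

def is_and_homomorphism(f, n):
    """Check f(x AND y) == f(x) AND f(y) for all x, y in {0,1}^n."""
    for x in product((0, 1), repeat=n):
        for y in product((0, 1), repeat=n):
            xy = tuple(a & b for a, b in zip(x, y))
            if f(xy) != (f(x) & f(y)):
                return False
    return True

n = 4
and_on_S = lambda x: x[0] & x[2]          # AND over the subset S = {0, 2}
constant_one = lambda x: 1
majority = lambda x: int(sum(x) > n / 2)  # not an AND-homomorphism

print(is_and_homomorphism(and_on_S, n))      # True
print(is_and_homomorphism(constant_one, n))  # True
print(is_and_homomorphism(majority, n))      # False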
A major challenge in complexity theory is to explicitly construct functions that have small correlation with low-degree polynomials over F2. We introduce a new technique to prove such correlation bounds with F2 polynomials. Using this technique, we bound the correlation of an XOR of Majorities with constant degree polynomials. In fact, we prove a more general XOR lemma that extends to arbitrary resilient functions. We conjecture that the technique generalizes to higher degree polynomials as well.
A key ingredient in our new approach is a structural result about the Fourier spectrum of low degree polynomials over F2. We show that for any n-variate polynomial p over F2 of degree at most d, there is a small set S ⊂ [n] of variables, such that almost all of the Fourier mass of p lies on Fourier coefficients that intersect with S. In fact our result is more general, and finds such a set S for any low-dimensional subspace of polynomials. This generality is crucial in deriving the new XOR lemmas.
A decision list is an ordered list of rules. Each rule is specified by a term, which is a conjunction of literals, and a value. Given an input, the output of a decision list is the value corresponding to the first rule whose term is satisfied by the input. Decision lists generalize both CNFs and DNFs, and have been studied both in complexity theory and in learning theory.
The size of a decision list is the number of rules, and its width is the maximal number of variables in a term. We prove that decision lists of small width can always be approximated by decision lists of small size, where we obtain sharp bounds. This in particular resolves a conjecture of Gopalan, Meka and Reingold (Computational Complexity, 2013) on DNF sparsification.
An ingredient in our proof is a new random restriction lemma, which allows us to analyze how DNFs (and more generally, decision lists) simplify if a small fraction of the variables are fixed. This is in contrast to the more commonly used switching lemma, which requires most of the variables to be fixed.
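To fix the notation used above, a decision list can be evaluated by a direct transcription of the definition (the helper below is ours; an explicit default value stands in for the customary final always-true rule):

def eval_decision_list(rules, default, x):
    """Evaluate a decision list on input x (a list of 0/1 values).

    rules is an ordered list of (term, value) pairs, where a term is a list
    of (variable_index, required_bit) literals; the output is the value of
    the first rule whose term is satisfied, and `default` if none is.
    The size of the list is len(rules); its width is the maximum term length.
    """
    for term, value in rules:
        if all(x[i] == bit for i, bit in term):
            return value
    return default

# A width-2, size-3 decision list over x0, x1, x2.
rules = [
    ([(0, 1), (1, 1)], 1),   # if x0 AND x1 then 1
    ([(2, 0)], 0),           # else if NOT x2 then 0
    ([(1, 0), (2, 1)], 1),   # else if NOT x1 AND x2 then 1
]
print(eval_decision_list(rules, default=0, x=[1, 0, 1]))  # third rule fires, outputs 1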
We define the notion of one-shot signatures, which are signatures where any secret key can be used to sign only a single message, after which it self-destructs. While such signatures are of course impossible classically, we construct one-shot signatures using quantum no-cloning. In particular, we show that such signatures exist relative to a classical oracle, which we can then heuristically obfuscate using known indistinguishability obfuscation schemes.
We show that one-shot signatures have numerous applications for hybrid quantum/classical cryptographic tasks, where all communication is required to be classical, but local quantum operations are allowed. Applications include one-time signature tokens, quantum money with classical communication, decentralized blockchain-less cryptocurrency, signature schemes with unclonable secret keys, non-interactive certifiable min-entropy, and more. We thus position one-shot signatures as a powerful new building block for novel quantum cryptographic protocols.
We construct a constant-round zero-knowledge classical argument for NP secure against quantum attacks. We assume the existence of Quantum Fully-Homomorphic Encryption and other standard primitives, known based on the Learning with Errors Assumption for quantum algorithms. As a corollary, we also obtain a constant-round zero-knowledge quantum argument for QMA.
At the heart of our protocol is a new no-cloning non-black-box simulation technique.
A secret-sharing scheme allows one to distribute a secret s among n parties such that only some predefined “authorized” sets of parties can reconstruct the secret, and all other “unauthorized” sets learn nothing about s. For over 30 years, it was known that any (monotone) collection of authorized sets can be realized by a secret-sharing scheme whose shares are of size 2^{n−o(n)}, and until recently no better scheme was known. In a recent breakthrough, Liu and Vaikuntanathan (STOC 2018) have reduced the share size to 2^{0.994n+o(n)}, which was later improved to 2^{0.892n+o(n)} by Applebaum et al. (EUROCRYPT 2019).
In this paper we improve the exponent of general secret-sharing down to 0.637. For the special case of linear secret-sharing schemes, we get an exponent of 0.762 (compared to 0.942 of Applebaum et al.). As our main building block, we introduce a new robust variant of conditional disclosure of secrets (robust CDS) that achieves unconditional security even under a bounded form of re-usability. We show that the problem of general secret-sharing reduces to robust CDS with sub-exponential overhead and derive our main result by implementing robust CDS with a non-trivial exponent. The latter construction follows by presenting a general immunization procedure that turns standard CDS into a robust CDS.
This paper shows several connections between data structure problems and cryptography against preprocessing attacks. Our results span data structure upper bounds, cryptographic applications, and data structure lower bounds, as summarized next.
First, we apply Fiat-Naor inversion, a technique with cryptographic origins, to obtain a data structure upper bound. In particular, our technique yields a suite of algorithms with space S and (online) time T for a preprocessing version of the N-input 3SUM problem where S^3 · T = O(N^6). This disproves a strong conjecture (Goldstein et al., WADS 2017) that there is no data structure that solves this problem for S = N^{2−δ} and T = N^{1−δ} for any constant δ > 0.
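To see why such a tradeoff contradicts the conjecture, a routine calculation (ours, spelled out for clarity) suffices: plugging the conjectured space bound into the tradeoff gives

\[
S = N^{2-\delta} \;\Longrightarrow\; T = O\!\left(\frac{N^6}{S^3}\right) = O\!\left(N^{3\delta}\right) \le N^{1-\delta} \quad \text{for any constant } 0 < \delta \le 1/4,
\]

so the data structure simultaneously meets both thresholds that the conjecture asserted to be unattainable.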
Secondly, we show equivalence between lower bounds for a broad class of (static) data structure problems and one-way functions in the random oracle model that resist a very strong form of preprocessing attack. Concretely, given a random function F: [N] → [N] (accessed as an oracle) we show how to compile it into a function G^F: [N^2] → [N^2] which resists S-bit preprocessing attacks that run in query time T where ST = O(N^{2−ε}) (assuming a corresponding data structure lower bound on 3SUM). In contrast, a classical result of Hellman tells us that F itself can be more easily inverted, say with N^{2/3}-bit preprocessing in N^{2/3} time. We also show that much stronger lower bounds follow from the hardness of kSUM. Our results can be equivalently interpreted as security against adversaries that are very non-uniform, or have large auxiliary input, or as security in the face of a powerfully backdoored random oracle.
Thirdly, we give non-adaptive lower bounds for 3SUM which match the best known lower bounds for static data structure problems. Moreover, we show that our lower bound generalizes to a range of geometric problems, such as three points on a line, polygon containment, and others.
We present the first m polylog(n) work, polylog(n) time algorithm in the PRAM model that computes (1+ε)-approximate single-source shortest paths on weighted, undirected graphs. This improves upon the breakthrough result of Cohen [JACM’00] that achieves O(m^{1+ε_0}) work and polylog(n) time. While most previous approaches, including Cohen’s, leveraged the power of hopsets, our algorithm builds upon the recent developments in continuous optimization, studying the shortest path problem from the lens of the closely-related minimum transshipment problem. To obtain our algorithm, we demonstrate a series of near-linear work, polylogarithmic-time reductions between the problems of approximate shortest path, approximate transshipment, and ℓ_1-embeddings, and establish a recursive algorithm that cycles through the three problems and reduces the graph size on each cycle. As a consequence, we also obtain faster parallel algorithms for approximate transshipment and ℓ_1-embeddings with polylogarithmic distortion. The minimum transshipment algorithm in particular improves upon the previous best m^{1+o(1)} work sequential algorithm of Sherman [SODA’17].
To improve readability, the paper is almost entirely self-contained, save for several staple theorems in algorithms and combinatorics.
We present a (1+ε)-approximate parallel algorithm for computing shortest paths in undirected graphs, achieving poly(log n) depth and m poly(log n) work for n-node, m-edge graphs. Although sequential algorithms with (nearly) optimal running time have been known for several decades, near-optimal parallel algorithms have turned out to be a much tougher challenge. For (1+ε)-approximation, all prior algorithms with poly(log n) depth perform at least Ω(m n^c) work for some constant c > 0. Improving this long-standing upper bound obtained by Cohen (STOC’94) has been open for 25 years.
We develop several new tools of independent interest. One of them is a new notion beyond hopsets — a low hop emulator — a poly(log n)-approximate emulator graph in which every shortest path has at most O(log log n) hops (edges). Direct applications of the low hop emulators are parallel algorithms for poly(log n)-approximate single source shortest path (SSSP), Bourgain’s embedding, metric tree embedding, and low diameter decomposition, all with poly(log n) depth and m poly(log n) work.
To boost the approximation ratio to (1+ε), we introduce compressible preconditioners and apply them inside Sherman’s framework (SODA’17) to solve the more general problem of uncapacitated minimum cost flow (a.k.a. the transshipment problem). Our algorithm computes a (1+ε)-approximate uncapacitated minimum cost flow in poly(log n) depth using m poly(log n) work. As a consequence, it also improves the state-of-the-art sequential running time from m · 2^{O(√(log n))} to m poly(log n).
The approximate single-source shortest-path problem is as follows: given a graph with nonnegative edge weights and a designated source vertex s, return estimates of the distances from s to each other vertex such that the estimate falls between the true distance and (1+ε) times the distance. This paper provides the first nearly work-efficient parallel algorithm with sublinear span (also called depth) for the approximate shortest-path problem on directed graphs. Specifically, for constant ε and polynomially-bounded edge weights, our algorithm has work Õ(m) and span n^{1/2+o(1)}. Several algorithms were previously known for the case of undirected graphs, but none of the techniques seem to translate to the directed setting.
The main technical contribution is the first nearly linear-work algorithm for constructing hopsets on directed graphs. A (β,ε)-hopset is a set of weighted edges (sometimes called shortcuts) which, when added to the graph, admit β-hop paths with weight no more than (1+ε) times the true shortest-path distances. There is a simple sequential algorithm that takes as input a directed graph and produces a linear-cardinality hopset with β = Õ(√n), but its running time is quite high—specifically Õ(m√n). Our algorithm is the first more efficient algorithm that produces a directed hopset with similar characteristics. Specifically, our sequential algorithm runs in Õ(m) time and constructs a hopset with Õ(n) edges and β = n^{1/2+o(1)}. A parallel version of the algorithm has work Õ(m) and span n^{1/2+o(1)}.
We present a simple polylogarithmic-time deterministic distributed algorithm for network decomposition. This improves on a celebrated 2^{O(√(log n))}-time algorithm of Panconesi and Srinivasan [STOC’92] and settles a central and long-standing question in distributed graph algorithms. It also leads to the first polylogarithmic-time deterministic distributed algorithms for numerous other problems, hence resolving several well-known and decades-old open problems, including Linial’s question about the deterministic complexity of maximal independent set [FOCS’87; SICOMP’92]—which had been called the most outstanding problem in the area.
The main implication is a more general distributed derandomization theorem: Put together with the results of Ghaffari, Kuhn, and Maus [STOC’17] and Ghaffari, Harris, and Kuhn [FOCS’18], our network decomposition implies that P-RLOCAL = P-LOCAL.
That is, for any problem whose solution can be checked deterministically in polylogarithmic-time, any polylogarithmic-time randomized algorithm can be derandomized to a polylogarithmic-time deterministic algorithm. Informally, for the standard first-order interpretation of efficiency as polylogarithmic-time, distributed algorithms do not need randomness for efficiency.
By known connections, our result leads also to substantially faster randomized distributed algorithms for a number of well-studied problems including (Δ+1)-coloring, maximal independent set, and Lovász Local Lemma, as well as massively parallel algorithms for (Δ+1)-coloring.
We introduce a set of techniques that allow for efficiently generating many independent random walks in the Massively Parallel Computation (MPC) model with space per machine strongly sublinear in the number of vertices. In this space-per-machine regime, many natural approaches to graph problems struggle to overcome the Θ(log n) MPC round complexity barrier, where n is the number of vertices. Our techniques enable achieving this for PageRank—one of the most important applications of random walks—even in more challenging directed graphs, as well as for approximate bipartiteness and expansion testing.
In the undirected case, we start our random walks from the stationary distribution, which implies that we approximately know the empirical distribution of their next steps. This allows for preparing continuations of random walks in advance and applying a doubling approach. As a result we can generate multiple random walks of length l in Θ(log l) rounds on MPC. Moreover, we show that under the popular 1-vs.-2-Cycles conjecture, this round complexity is asymptotically tight.
For directed graphs, our approach stems from our treatment of the PageRank Markov chain. We first compute the PageRank for the undirected version of the input graph and then slowly transition towards the directed case, considering convex combinations of the transition matrices in the process.
For PageRank, we achieve the following round complexities for damping factor equal to 1 − ε: in O(log log n + log(1/ε)) rounds for undirected graphs (with Õ(m/ε^2) total space), and in Õ(log^2 log n + log^2(1/ε)) rounds for directed graphs (with Õ((m + n^{1+o(1)}) / poly(ε)) total space). The round complexity of our result for computing PageRank has only logarithmic dependence on 1/ε. We use this to show that our PageRank algorithm can be used to construct directed length-l random walks in O(log^2 log n + log^2 l) rounds with Õ((m + n^{1+o(1)}) poly(l)) total space. More specifically, by setting ε = Θ(1/l), a length-l PageRank walk with constant probability contains no random jump, and hence is a directed random walk.
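The last claim above is a one-line calculation: with damping factor 1 − ε, each step of the walk performs a random jump independently with probability ε, so

\[
\Pr[\text{no jump in } l \text{ steps}] = (1-\varepsilon)^{l} \approx e^{-\varepsilon l} = \Omega(1) \quad \text{for } \varepsilon = \Theta(1/l),
\]

and conditioned on that event a length-l PageRank walk is simply a directed random walk of length l.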
We present a quasi-polynomial time classical algorithm that estimates the partition function of quantum many-body systems at temperatures above the thermal phase transition point. It is known that in the worst case, the same problem is NP-hard below this point. Together with our work, this shows that the transition in the phase of a quantum system is also accompanied by a transition in the hardness of approximation. We also show that in a system of n particles above the phase transition point, the correlation between two observables whose distance is at least Ω(log n) decays exponentially. We can improve the factor of log n to a constant when the Hamiltonian has commuting terms or is on a 1D chain. The key to our results is a characterization of the phase transition and the critical behavior of the system in terms of the complex zeros of the partition function. Our work extends a seminal work of Dobrushin and Shlosman on the equivalence between the decay of correlations and the analyticity of the free energy in classical spin models. On the algorithmic side, our result extends the scope of a recent approach due to Barvinok for solving classical counting problems to quantum many-body systems.
We present an algorithmic framework for quantum-inspired classical algorithms on close-to-low-rank matrices, generalizing the series of results started by Tang’s breakthrough quantum-inspired algorithm for recommendation systems [STOC’19]. Motivated by quantum linear algebra algorithms and the quantum singular value transformation (SVT) framework of Gilyén et al. [STOC’19], we develop classical algorithms for SVT that run in time independent of input dimension, under suitable quantum-inspired sampling assumptions. Our results give compelling evidence that in the corresponding QRAM data structure input model, quantum SVT does not yield exponential quantum speedups. Since the quantum SVT framework generalizes essentially all known techniques for quantum linear algebra, our results, combined with sampling lemmas from previous work, suffice to generalize all recent results about dequantizing quantum machine learning algorithms. In particular, our classical SVT framework recovers and often improves the dequantization results on recommendation systems, principal component analysis, supervised clustering, support vector machines, low-rank regression, and semidefinite program solving. We also give additional dequantization results on low-rank Hamiltonian simulation and discriminant analysis. Our improvements come from identifying the key feature of the quantum-inspired input model that is at the core of all prior quantum-inspired results: ℓ_2-norm sampling can approximate matrix products in time independent of their dimension. We reduce all our main results to this fact, making our exposition concise, self-contained, and intuitive.
A relational bipartite communication problem is presented that has an efficient quantum simultaneous-messages protocol, but no efficient classical two-way protocol.
A quantum walk algorithm can detect the presence of a marked vertex on a graph quadratically faster than the corresponding random walk algorithm (Szegedy, FOCS 2004). However, quantum algorithms that actually find a marked element quadratically faster than a classical random walk were only known for the special case when the marked set consists of just a single vertex, or in the case of some specific graphs. We present a new quantum algorithm for finding a marked vertex in any graph, with any set of marked vertices, that is (up to a log factor) quadratically faster than the corresponding classical random walk, resolving a question that had been open for 15 years.
We give new characterizations of the sample complexity of answering linear queries (statistical queries) in the local and central models of differential privacy:
(1) In the non-interactive local model, we give the first approximate characterization of the sample complexity. Informally our bounds are tight to within polylogarithmic factors in the number of queries and desired accuracy. Our characterization extends to agnostic learning in the local model.
(2) In the central model, we give a characterization of the sample complexity in the high-accuracy regime that is analogous to that of Nikolov, Talwar, and Zhang (STOC 2013), but is both quantitatively tighter and has a dramatically simpler proof.
Our lower bounds apply equally to the empirical and population estimation problems. In both cases, our characterizations show that a particular factorization mechanism is approximately optimal, and the optimal sample complexity is bounded from above and below by well studied factorization norms of a matrix associated with the queries.
We study differentially private (DP) algorithms for stochastic convex optimization: the problem of minimizing the population loss given i.i.d. samples from a distribution over convex loss functions. A recent work of Bassily et al. (2019) has established the optimal bound on the excess population loss achievable given n samples. Unfortunately, their algorithm achieving this bound is relatively inefficient: it requires O(min{n^{3/2}, n^{5/2}/d}) gradient computations, where d is the dimension of the optimization problem.
We describe two new techniques for deriving DP convex optimization algorithms both achieving the optimal bound on excess loss and using O(min{n, n^2/d}) gradient computations. In particular, the algorithms match the running time of the optimal non-private algorithms. The first approach relies on the use of variable batch sizes and is analyzed using the privacy amplification by iteration technique of Feldman et al. (2018). The second approach is based on a general reduction to the problem of localizing an approximately optimal solution with differential privacy. Such localization, in turn, can be achieved using existing (non-private) uniformly stable optimization algorithms. As in the earlier work, our algorithms require a mild smoothness assumption. We also give a linear-time optimal algorithm for the strongly convex case, as well as a faster algorithm for the non-smooth case.
Local differential privacy (LDP) is a model where users send privatized data to an untrusted central server whose goal is to solve some data analysis task. In the non-interactive version of this model, the protocol consists of a single round in which the server sends requests to all users and then receives their responses. This version is deployed in industry due to its practical advantages and has attracted significant research interest.
Our main result is an exponential lower bound on the number of samples necessary to solve the standard task of learning a large-margin linear separator in the non-interactive LDP model. Via a standard reduction this lower bound implies an exponential lower bound for stochastic convex optimization and specifically, for learning linear models with a convex, Lipschitz and smooth loss. These results answer the questions posed by Smith, Thakurta, and Upadhyay (IEEE Symposium on Security and Privacy 2017) and Daniely and Feldman (NeurIPS 2019). Our lower bound relies on a new technique for constructing pairs of distributions with nearly matching moments but whose supports can be nearly separated by a large margin hyperplane. These lower bounds also hold in the model where communication from each user is limited and follow from a lower bound on learning using non-adaptive statistical queries.
In the committee selection problem, we are given m candidates, and n voters. Candidates can have different weights. A committee is a subset of candidates, and its weight is the sum of weights of its candidates. Each voter expresses an ordinal ranking over all possible committees. The only assumption we make on preferences is monotonicity: If S ⊆ S′ are two committees, then any voter weakly prefers S′ to S.
We study a general notion of group fairness via stability: A committee of given total weight K is stable if no coalition of voters can deviate and choose a committee of proportional weight, so that all these voters strictly prefer the new committee to the existing one. Extending this notion to approximation, for parameter c ≥ 1, a committee S of weight K is said to be c-approximately stable if for any other committee S′ of weight K′, the fraction of voters that strictly prefer S′ to S is strictly less than c K′/K. When c = 1, this condition is equivalent to classical core stability.
The question we ask is: Does a c-approximately stable committee of weight at most any given value K always exist for constant c? It is relatively easy to show that there exist monotone preferences for which c ≥ 2. However, even for simple and widely studied preference structures, a non-trivial upper bound on c has been elusive.
In this paper, we show that c = O(1) for all monotone preference structures. Our proof proceeds via showing an existence result for a randomized notion of stability, and iteratively rounding the resulting fractional solution.
In the k-cut problem, we are given an edge-weighted graph and want to find the least-weight set of edges whose deletion breaks the graph into k connected components. Algorithms due to Karger-Stein and Thorup showed how to find such a minimum k-cut in time approximately O(n^{2k−2}). The best lower bounds come from conjectures about the solvability of the k-clique problem and a reduction from k-clique to k-cut, and show that solving k-cut is likely to require time Ω(n^k). Our recent results have given special-purpose algorithms that solve the problem in time n^{1.98k + O(1)}, and ones that have better performance for special classes of graphs (e.g., for small integer weights).
In this work, we resolve the problem for general graphs, by showing that for any fixed k ≥ 2, the Karger-Stein algorithm outputs any fixed minimum k-cut with probability at least Ô(n^{−k}), where Ô(·) hides a 2^{O(ln ln n)^2} factor. This also gives an extremal bound of Ô(n^k) on the number of minimum k-cuts in an n-vertex graph and an algorithm to compute a minimum k-cut in similar runtime. Both are tight up to Ô(1) factors.
The first main ingredient in our result is a fine-grained analysis of how the graph shrinks—and how the average degree evolves—under the Karger-Stein process. The second ingredient is an extremal result bounding the number of cuts of size at most (2−δ) OPT/k, using the Sunflower lemma.
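For reference, the contraction process analyzed above can be sketched as follows (the basic Karger-style random contraction adapted to k-cut, with edges sampled proportionally to weight; the full Karger-Stein algorithm adds recursion and repetition, and a single run succeeds only with small probability):

import random

def random_contraction_k_cut(edges, n, k, seed=0):
    """One run of random contraction for k-cut on a weighted multigraph.

    edges is a list of (u, v, w) with vertices 0..n-1.  Repeatedly contract a
    random edge chosen with probability proportional to its weight until only
    k super-vertices remain; the edges between different super-vertices then
    form a k-cut, whose total weight is returned.
    """
    rng = random.Random(seed)
    comp = list(range(n))                      # super-vertex of each vertex

    def find(v):                               # union-find with path compression
        while comp[v] != v:
            comp[v] = comp[comp[v]]
            v = comp[v]
        return v

    remaining = n
    while remaining > k:
        # sample a non-self-loop edge with probability proportional to its weight
        candidates = [(u, v, w) for (u, v, w) in edges if find(u) != find(v)]
        weights = [w for (_, _, w) in candidates]
        u, v, _ = rng.choices(candidates, weights=weights, k=1)[0]
        comp[find(u)] = find(v)                # contract: merge the two sides
        remaining -= 1
    return sum(w for (u, v, w) in edges if find(u) != find(v))

edges = [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 1.0), (3, 0, 2.0), (0, 2, 0.5)]
print(random_contraction_k_cut(edges, n=4, k=3))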
We improve the time for approximating network (un)reliability to Õ(n^2). We do so not with a new algorithm, but with a deeper analysis and tweaking of algorithms from our previous work. In particular, we show that once a graph’s failure probability shrinks below 1/2, the graph rapidly transitions to a regime where even the expected number of cut failures is small, and in fact is almost exactly the same as the graph’s failure probability. That is, we are very unlikely to ever see more than one cut fail. This lets us treat these cut failures as essentially independent, making it easier to estimate their likelihood. The contribution of this paper is not just the improved time bound, but also this clearer understanding of the evolution of a graph’s reliability. Our results rely on some new methods for analyzing the distribution of cut failures conditioned on the failure of a particular cut, as well as new insights into the evolution of a graph’s connectivity as edges are randomly added over time. Some of our results apply more broadly, to all monotone reliability systems.
Consider the following 2-respecting min-cut problem. Given any weighted graph G and its spanning tree T, find the minimum cut among the cuts that contain at most two edges in T. This problem is an important subroutine in Karger’s celebrated randomized near-linear-time min-cut algorithm [STOC’96]. We present a new approach for this problem which can be easily implemented in many settings, leading to the following randomized min-cut algorithms for weighted graphs.
– An O(m log^2 n / log log n + n log^6 n)-time sequential algorithm improving Karger’s long-standing O(m log^3 n) and O(n log^6 n + m log^2 n log(n^2/m) / log log n) bounds when the input graph is not extremely sparse or dense. Improvements over Karger’s bounds were previously known only under a rather strong assumption that the input graph is simple (unweighted without parallel edges) [Henzinger, Rao, Wang, SODA’17; Ghaffari, Nowicki, Thorup, SODA’20]. For unweighted graphs (possibly with parallel edges) and using bit operations, our bound can be further improved to O(m log^{1.5} n / log log n + n log^6 n).
– An algorithm that requires Õ(n) cut queries to compute the min-cut of a weighted graph: This answers an open problem by Rubinstein, Schramm, and Weinberg [ITCS’18], who obtained a similar bound for simple graphs. Our bound is tight up to polylogarithmic factors.
– A streaming algorithm that requires Õ(n) space and O(log n) passes to compute the min-cut: The only previous non-trivial exact min-cut algorithm in this setting is the 2-pass Õ(n)-space algorithm on simple graphs [Rubinstein et al., ITCS’18] (observed by Assadi, Chen, and Khanna [STOC’19]).
Our approach exploits some cute structural properties so that it only needs to compute the values of Õ(n) cuts corresponding to removing Õ(n) pairs of tree edges, an operation that can be done quickly in many settings. This is in contrast to the techniques used by Karger and Lovett-Sandlund to solve 2-respecting min-cut where information about many more cuts is computed, stored in and accessed from sophisticated data-structures.
For every constant d ≥ 3 and ε > 0, we give a deterministic poly(n)-time algorithm that outputs a d-regular graph on Θ(n) vertices that is ε-near-Ramanujan; i.e., its eigenvalues are bounded in magnitude by 2√(d−1) + ε (excluding the single trivial eigenvalue of d).
We give a complete answer to the following basic question: “What is the maximal fraction of deletions or insertions tolerable by q-ary list-decodable codes with non-vanishing information rate?”
This question has been open even for binary codes, including the restriction to the binary insertion-only setting, where the best-known result was that a γ≤ 0.707 fraction of insertions is tolerable by some binary code family.
For any desired ε > 0, we construct a family of binary codes of positive rate which can be efficiently list-decoded from any combination of γ fraction of insertions and δ fraction of deletions as long as γ + 2δ ≤ 1 − ε. On the other hand, for any γ, δ with γ + 2δ = 1, list-decoding is impossible. Our result thus precisely characterizes the feasibility region of binary list-decodable codes for insertions and deletions.
We further generalize our result to codes over any finite alphabet of size q. Surprisingly, our work reveals that the feasibility region for q>2 is not the natural generalization of the binary bound above. We provide tight upper and lower bounds that precisely pin down the feasibility region, which turns out to have a (q−1)-piece-wise linear boundary whose q corner-points lie on a quadratic curve.
The main technical work in our results is proving the existence of code families of sufficiently large size with good list-decoding properties for any combination of δ,γ within the claimed feasibility region. We achieve this via an intricate analysis of codes introduced by [Bukh, Ma; SIAM J. Discrete Math; 2014]. Finally, we give a simple yet powerful concatenation scheme for list-decodable insertion-deletion codes which transforms any such (non-efficient) code family (with information rate zero) into an efficiently decodable code family with constant rate.
List-decoding of Reed-Solomon (RS) codes beyond the so-called Johnson radius has been one of the main open questions in coding theory and theoretical computer science since the work of Guruswami and Sudan. It is now known, by the work of Rudra and Wootters, using techniques from high dimensional probability, that over large enough alphabets there exist RS codes that are indeed list-decodable beyond this radius.
In this paper we take a more combinatorial approach that allows us to determine the precise relation (up to the exact constant) between the decoding radius and the list size. We prove a generalized Singleton bound for a given list size, and conjecture that the bound is tight for most RS codes over large enough finite fields. We also show that the conjecture holds true for list sizes 2 and 3, and as a by-product show that most RS codes with a rate of at least 1/9 are list-decodable beyond the Johnson radius. Lastly, we give the first explicit construction of such RS codes. The main tools used in the proof are a new type of linear dependency between codewords of a code that are contained in a small Hamming ball, and the notion of cycle space from Graph Theory. Neither of them has been used before in the context of list-decoding.
Let W be a binary-input memoryless symmetric (BMS) channel with Shannon capacity I(W) and fix any α > 0. We construct, for any sufficiently small δ > 0, binary linear codes of block length O(1/δ^{2+α}) and rate I(W) − δ that enable reliable communication on W with quasi-linear time encoding and decoding. Shannon’s noisy coding theorem established the existence of such codes (without efficient constructions or decoding) with block length O(1/δ^2). This quadratic dependence on the gap δ to capacity is known to be the best possible. Our result thus yields a constructive version of Shannon’s theorem with near-optimal convergence to capacity as a function of the block length. This resolves a central theoretical challenge associated with the attainment of Shannon capacity. Previously such a result was only known for the binary erasure channel.
Our codes are a variant of Arikan’s polar codes based on multiple carefully constructed local kernels, one for each intermediate channel that arises in the decoding. A crucial ingredient in the analysis is a strong converse of the noisy coding theorem when communicating using random linear codes on arbitrary BMS channels. Our converse theorem shows extreme unpredictability of even a single message bit for random coding at rates slightly above capacity.
Interactive error correcting codes can protect interactive communication protocols against a constant fraction of adversarial errors, while incurring only a constant multiplicative overhead in the total communication. What is the maximum fraction of errors that such codes can protect against?
For the non-adaptive channel, where the parties must agree in advance on the order in which they communicate, Braverman and Rao prove that the maximum error resilience is 1/4 (STOC, 2011). Ghaffari, Haeupler, and Sudan (STOC, 2014) consider the adaptive channel, where the order in which the parties communicate may not be fixed, and give a clever protocol that is resilient to a 2/7 fraction of errors. This was believed to be optimal.
We revisit this result, and show how to overcome the 2/7 barrier. Specifically, we show that, over the adaptive channel, every two-party communication protocol can be converted to a protocol that is resilient to a 7/24 > 2/7 fraction of errors with only a constant multiplicative overhead to the total communication. The protocol is obtained by a novel implementation of a feedback mechanism over the adaptive channel.
Estimating the normalizing constant of an unnormalized probability distribution has important applications in computer science, statistical physics, machine learning, and statistics. In this work, we consider the problem of estimating the normalizing constant Z = ∫_{ℝ^d} e^{−f(x)} dx to within a multiplicative factor of 1 ± ε for a µ-strongly convex and L-smooth function f, given query access to f(x) and ∇f(x). We give both algorithms and lower bounds for this problem. Using an annealing algorithm combined with a multilevel Monte Carlo method based on underdamped Langevin dynamics, we show that O(d^{4/3}κ + d^{7/6}κ^{7/6}/ε^2) queries to ∇f are sufficient, where κ = L/µ is the condition number. Moreover, we provide an information-theoretic lower bound, showing that at least d^{1−o(1)}/ε^{2−o(1)} queries are necessary. This provides the first nontrivial lower bound for the problem.
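The annealing strategy mentioned above is typically organized around a telescoping product; the following standard formulation is given for orientation only (the paper's specific interpolation schedule and its multilevel Monte Carlo estimators differ in the details):

\[
Z = Z_0 \prod_{i=0}^{m-1} \frac{Z_{i+1}}{Z_i},
\qquad
\frac{Z_{i+1}}{Z_i} = \mathbb{E}_{x \sim p_i}\!\left[ e^{-(f_{i+1}(x) - f_i(x))} \right],
\qquad p_i(x) \propto e^{-f_i(x)},
\]

where f_0, …, f_m = f interpolates from a function whose normalizing constant Z_0 is known in closed form (e.g., a quadratic, giving a Gaussian) to the target f, and each ratio is estimated by sampling from p_i, for instance with Langevin dynamics.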
We consider the problem of learning a mixture of linear regressions (MLRs). An MLR is specified by k nonnegative mixing weights p_1, …, p_k summing to 1, and k unknown regressors w_1, ..., w_k ∈ ℝ^d. A sample from the MLR is drawn by sampling i with probability p_i, then outputting (x, y) where y = ⟨x, w_i⟩ + η, where η ∼ N(0, ς^2) for noise rate ς. Mixtures of linear regressions are a popular generative model and have been studied extensively in machine learning and theoretical computer science. However, all previous algorithms for learning the parameters of an MLR require running time and sample complexity scaling exponentially with k.
In this paper, we give the first algorithm for learning an MLR that runs in time which is sub-exponential in k. Specifically, we give an algorithm which runs in time O(d)·exp(O(√k)) and outputs the parameters of the MLR to high accuracy, even in the presence of nontrivial regression noise. We demonstrate a new method that we call “Fourier moment descent,” which uses univariate density estimation and low-degree moments of the Fourier transform of suitable univariate projections of the MLR to iteratively refine our estimate of the parameters. To the best of our knowledge, these techniques have never been used in the context of high dimensional distribution learning, and may be of independent interest. We also show that our techniques can be used to give a sub-exponential time algorithm for a natural hard instance of the subspace clustering problem, which we call learning mixtures of hyperplanes.
We study polynomial-time algorithms for linear regression and covariance estimation in the absence of strong (Gaussian) assumptions on the underlying distributions of samples, making assumptions instead about only finitely-many moments. We focus on how many samples are required to perform estimation and regression with high accuracy and exponentially-good success probability in the face of heavy-tailed data.
For covariance estimation, linear regression, and several other problems in high-dimensional statistics, estimators have recently been constructed whose sample complexities and rates of statistical error match what is possible when the underlying distribution is Gaussian, but known algorithms for these estimators require exponential time. We narrow the gap between the Gaussian and heavy-tailed settings for polynomial-time estimators with: (a) a polynomial-time estimator which takes n samples from a d-dimensional random vector X with covariance Σ and produces Σ̂ such that in spectral norm ||Σ̂ − Σ||_2 ≤ Õ(d^{3/4}/√n) w.p. 1−2^{−d}, where the information-theoretically optimal error bound is Õ(√(d/n)), while previous approaches to polynomial-time algorithms were stuck at Õ(d/√n), and (b) a polynomial-time algorithm which takes n samples (X_i, Y_i) where Y_i = ⟨u, X_i⟩ + ε_i and where both X and ε have a constant number of bounded moments, and produces û such that the loss ||u − û||^2 ≤ O(d/n) w.p. 1−2^{−d} for any n ≥ d^{3/2} log(d). This (information-theoretically optimal) error is achieved by inefficient algorithms for any n ≫ d, while previous approaches to polynomial-time algorithms suffer loss Ω(d^2/n) and require n ≫ d^2.
Our algorithms make crucial use of degree-8 sum-of-squares semidefinite programs. Both apply to any X which has constantly-many certifiably hypercontractive moments. We offer preliminary evidence that improving on these rates of error in polynomial time is not possible in the median of means framework our algorithms employ. Our work introduces new techniques to high-probability estimation, and suggests numerous new algorithmic questions in the following vein: when is it computationally feasible to do statistics in high dimensions with Gaussian-style errors when data is far from Gaussian?
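Since the abstract above appeals to the median-of-means framework, here is a minimal sketch of the classical one-dimensional median-of-means mean estimator for heavy-tailed data; the paper’s contribution is the high-dimensional, polynomial-time (sum-of-squares based) analogue, which this toy snippet does not attempt to reproduce.

import numpy as np

def median_of_means(samples, k, rng):
    # Split the data into k buckets, average each bucket, and return the median of the
    # bucket means; the median step is what yields exponentially good success probability.
    buckets = np.array_split(rng.permutation(samples), k)
    return np.median([b.mean() for b in buckets])

rng = np.random.default_rng(1)
data = rng.standard_t(df=3, size=100_000)      # heavy-tailed sample with true mean 0
print(median_of_means(data, k=50, rng=rng), data.mean())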
We consider the following basic inference problem: there is an unknown high-dimensional vector w ∈ ℝn, and an algorithm is given access to labeled pairs (x,y) where x ∈ ℝn is a measurement and y = w · x + noise. What is the complexity of deciding whether the target vector w is (approximately) k-sparse? The recovery analogue of this problem — given the promise that w is sparse, find or approximate the vector w — is the famous sparse recovery problem, with a rich body of work in signal processing, statistics, and computer science.
We study the decision version of this problem (i.e. deciding whether the unknown w is k-sparse) from the vantage point of property testing. Our focus is on answering the following high-level question: when is it possible to efficiently test whether the unknown target vector w is sparse versus far-from-sparse using a number of samples which is completely independent of the dimension n? We consider the natural setting in which x is drawn from an i.i.d. product distribution D over ℝn and the noise process is independent of the input x. As our main result, we give a general algorithm which solves the above-described testing problem using a number of samples which is completely independent of the ambient dimension n, as long as D is not a Gaussian. In fact, our algorithm is fully noise tolerant, in the sense that for an arbitrary w, it approximately computes the distance of w to the closest k-sparse vector. To complement this algorithmic result, we show that weakening any of our conditions makes it information-theoretically impossible for any algorithm to solve the testing problem with fewer than essentially logn samples. Thus our conditions essentially characterize when it is possible to test noisy linear functions for sparsity with constant sample complexity.
Our algorithmic approach is based on relating the cumulants of the output distribution (i.e. of y) with elementary power sum symmetric polynomials in w and using the latter to measure the sparsity of w. This approach crucially relies on a theorem of Marcinkiewicz from probability theory. In fact, to obtain effective sample complexity bounds with our approach, we prove a new finitary version of Marcinkiewicz’s theorem. This involves extending the complex analytic arguments used in the original proof with results about the distribution of zeros of entire functions.
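The cumulant/power-sum relationship driving the tester above can be checked numerically: for y = w · x + η with i.i.d. non-Gaussian coordinates x_i and independent Gaussian noise η, additivity and degree-m homogeneity of cumulants give κ_4(y) = κ_4(x_1) · Σ_i w_i^4. The toy Python check below (our own instance; it is not the paper’s tester and ignores the finitary Marcinkiewicz machinery) recovers the fourth power sum of w from samples of y alone.

import numpy as np

rng = np.random.default_rng(2)
d, n = 20, 300_000
w = np.zeros(d)
w[:3] = [0.7, -0.5, 0.4]                       # a 3-sparse target vector (toy choice)

x = rng.choice([-1.0, 1.0], size=(n, d))       # i.i.d. Rademacher coordinates (non-Gaussian)
y = x @ w + 0.1 * rng.standard_normal(n)       # Gaussian noise: cumulants of order >= 3 vanish

# Empirical 4th cumulant of y:  kappa_4 = m_4 - 3 m_2^2  (central moments).
c = y - y.mean()
kappa4_y = np.mean(c ** 4) - 3.0 * np.mean(c ** 2) ** 2
kappa4_rademacher = -2.0                       # E z^4 - 3 (E z^2)^2 = 1 - 3 for z = +-1

print(kappa4_y / kappa4_rademacher, np.sum(w ** 4))   # both approximately 0.328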
A sunflower with r petals is a collection of r sets so that the intersection of each pair is equal to the intersection of all. Erdős and Rado proved the sunflower lemma: for any fixed r, any family of sets of size w, with at least about w^w sets, must contain a sunflower. The famous sunflower conjecture is that the bound on the number of sets can be improved to c^w for some constant c. In this paper, we improve the bound to about (log w)^w. In fact, we prove the result for a robust notion of sunflowers, for which the bound we obtain is tight up to lower order terms.
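For readers new to the definition above, sunflowers are easy to verify and, inefficiently, to search for; the brute-force sketch below illustrates only the definition, not the improved (log w)^w bound.

from itertools import combinations

def is_sunflower(sets):
    # All pairwise intersections must equal the common intersection (the core).
    core = frozenset.intersection(*sets)
    return all((a & b) == core for a, b in combinations(sets, 2))

def find_sunflower(family, r):
    # Exponential-time brute force, for tiny inputs only.
    for combo in combinations(family, r):
        if is_sunflower(combo):
            return combo
    return None

family = [frozenset(s) for s in ({1, 2}, {1, 3}, {1, 4}, {2, 3})]
print(find_sunflower(family, 3))   # e.g. three petals {1,2}, {1,3}, {1,4} with core {1}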
We present a randomized algorithm that takes as input an undirected n-vertex graph G with maximum degree Δ and an integer k > 3Δ, and returns a random proper k-coloring of G. The distribution of the coloring is perfectly uniform over the set of all proper k-colorings; the expected running time of the algorithm is poly(k,n) = O(nΔ^2 · log(k)). This improves upon a result of Huber (STOC 1998) who obtained a polynomial time perfect sampling algorithm for k > Δ^2 + 2Δ. Prior to our work, no algorithm with expected running time poly(k,n) was known to guarantee perfect sampling with a sub-quadratic number of colors in general.
Our algorithm (like several other perfect sampling algorithms including Huber’s) is based on the Coupling from the Past method. Inspired by the bounding chain approach, pioneered independently by Huber (STOC 1998) and Häggström & Nelander (Scand. J. Statist., 1999), we employ a novel bounding chain to derive our result for the graph coloring problem.
We revisit a fundamental problem in string matching: given a pattern of length m and a text of length n, both over an alphabet of size σ, compute the Hamming distance (i.e., the number of mismatches) between the pattern and the text at every location. Several randomized (1+ε)-approximation algorithms have been proposed in the literature (e.g., by Karloff (Inf. Proc. Lett., 1993), Indyk (FOCS 1998), and Kopelowitz and Porat (SOSA 2018)), with running time of the form O(ε^{−O(1)} n log n log m), all using fast Fourier transform (FFT). We describe a simple randomized (1+ε)-approximation algorithm that is faster and does not need FFT. Combining our approach with additional ideas leads to numerous new results (all Monte-Carlo randomized) in different settings:
(1) We design the first truly linear-time approximation algorithm for constant ε; the running time is O(ε^{−2}n). In fact, the time bound can be made slightly sublinear in n if the alphabet size σ is small (by using bit packing tricks).
(2) We apply our approximation algorithms to design a faster exact algorithm computing all Hamming distances up to a threshold k; its runtime of O(n + min(nk√(log m)/√m, nk^2/m)) improves upon previous results by logarithmic factors and is linear for k ≤ √m.
(3) We alternatively design approximation algorithms with better ε-dependence, by using fast rectangular matrix multiplication. In fact, the time bound is O(n polylog n) when the pattern is sufficiently long, i.e., m ≥ ε^{−c} for a specific constant c. Previous algorithms with the best ε-dependence require O(ε^{−1} n polylog n) time.
(4) When k is not too small, we design a truly sublinear-time algorithm to find all locations with Hamming distance approximately (up to a constant factor) less than k, in O((n/k^{Ω(1)} + occ) n^{o(1)}) time, where occ is the output size. The algorithm leads to a property tester for pattern matching that costs O((δ^{−1/3} n^{2/3} + δ^{−1} n/m) n) time and, with high probability, returns true if an exact match exists and false if the Hamming distance is more than δm at every location.
(5) We design a streaming algorithm that approximately computes the Hamming distance for all locations with the distance approximately less than k, using O(ε^{−2}√k n) space. Previously, streaming algorithms were known for the exact problem with O(k n) space (which is tight up to the polylog n factor) or for the approximate problem with O(ε^{−O(1)}√m polylog n) space. For the special case of k=m, we improve the space usage to O(ε^{−1.5}√m polylog n).
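As a naive baseline for the text-to-pattern Hamming distance problem above (emphatically not one of the algorithms of the paper), one can estimate the distance at every alignment by sampling positions: with roughly ε^{-2} samples per location the estimate is unbiased with additive error on the order of εm at any fixed location, and a logarithmic factor more sampling handles all locations simultaneously with high probability.

import random

def approx_hamming_all(text, pattern, eps, rng=random.Random(0)):
    # For each alignment i, estimate Ham(text[i:i+m], pattern) by sampling s positions
    # uniformly with replacement and rescaling the mismatch count by m/s.
    n, m = len(text), len(pattern)
    s = max(1, int(round(4.0 / (eps * eps))))
    estimates = []
    for i in range(n - m + 1):
        mismatches = sum(text[i + j] != pattern[j] for j in (rng.randrange(m) for _ in range(s)))
        estimates.append(mismatches * m / s)
    return estimates

print(approx_hamming_all("abracadabra", "abr", eps=0.5))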
We study edit distance computation with preprocessing: the preprocessing algorithm acts on each string separately, and then the query algorithm takes as input the two preprocessed strings. This model is inspired by scenarios where we would like to compute edit distance between many pairs in the same pool of strings.
Our results include: Permutation-LCS: If the LCS between two permutations has length n−k, we can compute it exactly with O(n log(n)) preprocessing and O(k log(n)) query time. Small edit distance: For general strings, if their edit distance is at most k, we can compute it exactly with O(n log(n)) preprocessing and O(k^2 log(n)) query time. Approximate edit distance: For the most general input, we can approximate the edit distance to within factor (7+o(1)) with preprocessing time Õ(n^2) and query time Õ(n^{1.5+o(1)}).
All of these results significantly improve over the state of the art in edit distance computation without preprocessing. Interestingly, by combining ideas from our algorithms with preprocessing, we provide new improved results for approximating edit distance without preprocessing in subquadratic time.
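As background for the Permutation-LCS result above: even without preprocessing, the LCS of two permutations reduces to a longest increasing subsequence computation, solvable in O(n log n) time by patience sorting as in the sketch below (standard material; the paper’s contribution is the preprocessing/query trade-off, which this snippet does not implement).

from bisect import bisect_left

def lcs_of_permutations(p, q):
    # LCS of two permutations of 0..n-1 equals the LIS of q after relabelling each value
    # by its position in p.
    pos = {v: i for i, v in enumerate(p)}
    seq = [pos[v] for v in q]
    tails = []        # tails[k] = smallest possible tail of an increasing subsequence of length k+1
    for x in seq:
        k = bisect_left(tails, x)
        if k == len(tails):
            tails.append(x)
        else:
            tails[k] = x
    return len(tails)

print(lcs_of_permutations([2, 0, 1, 3], [0, 1, 2, 3]))   # 3, witnessed by (0, 1, 3)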
In this paper, we provide new approximation algorithms for dynamic variations of the longest increasing subsequence (LIS) problem, and the complementary distance to monotonicity (DTM) problem. In this setting, operations of the following form arrive sequentially: (i) add an element, (ii) remove an element, or (iii) substitute an element for another. At every point in time, the algorithm has an approximation to the longest increasing subsequence (or distance to monotonicity). We present a (1+є)-approximation algorithm for DTM with polylogarithmic worst-case update time and a constant factor approximation algorithm for LIS with worst-case update time Õ(n^є) for any constant є > 0.
We show that the edit distance between two strings of length n can be computed via a randomized algorithm within a factor of f(є) in n^{1+є} time as long as the edit distance is at least n^{1−δ} for some δ(є) > 0.
For any T ≥ 1, there are constants R=R(T) ≥ 1 and ζ=ζ(T)>0 and a randomized algorithm that takes as input an integer n and two strings x,y of length at most n, and runs in time O(n^{1+1/T}) and outputs an upper bound U on the edit distance edit(x,y) that, with high probability, satisfies U ≤ R(edit(x,y)+n^{1−ζ}). In particular, on any input with edit(x,y) ≥ n^{1−ζ} the algorithm outputs a constant factor approximation with high probability. A similar result has been proven independently by Brakensiek and Rubinstein (this proceedings).
Understanding the difference between group orbits and their closures is a key difficulty in geometric complexity theory (GCT): While the GCT program is set up to separate certain orbit closures (i.e., prove non-containment of one orbit closure in the other), many beautiful mathematical properties are only known for the group orbits, in particular close relations with symmetry groups and invariant spaces, while the orbit closures seem much more difficult to understand. However, in order to prove lower bounds in algebraic complexity theory, considering group orbits is not enough.
In this paper we tighten the relationship between the orbit of the power sum polynomial and its closure, so that we can separate this orbit closure from the orbit closure of the product of variables by just considering the symmetry groups of both polynomials and their representation theoretic decomposition coefficients. In a natural way our construction yields a multiplicity obstruction that is neither an occurrence obstruction, nor a so-called vanishing ideal occurrence obstruction. All multiplicity obstructions so far have been of one of these two types.
Our paper is the first implementation of the ambitious approach that was originally suggested in the first papers on geometric complexity theory by Mulmuley and Sohoni (SIAM J Comput 2001, 2008): Before our paper, all existence proofs of obstructions only took into account the symmetry group of one of the two polynomials (or tensors) that were to be separated. In our paper the multiplicity obstruction is obtained by comparing the representation theoretic decomposition coefficients of both symmetry groups.
Our proof uses a semi-explicit description of the coordinate ring of the orbit closure of the power sum polynomial in terms of Young tableaux, which enables its comparison to the coordinate ring of the orbit.
We prove a Pumping Lemma for the noncooperative abstract Tile Assembly Model, a model central to the theory of algorithmic self-assembly since the beginning of the field. This theory suggests, and our result proves, that small differences in the nature of adhesive bindings between abstract square molecules give rise to vastly different expressive capabilities. In the cooperative abstract Tile Assembly Model, square tiles attach to each other using multi-sided cooperation of one, two or more sides. This precise control of tile binding is directly exploited for algorithmic tasks including growth of specified shapes using very few tile types, as well as simulation of Turing machines and even self-simulation of self-assembly systems. But are cooperative bindings required for these computational tasks? The definitionally simpler noncooperative (or Temperature 1) model has poor control over local binding events: tiles stick if they bind on at least one side. This has led to the conjecture that it is impossible for it to exhibit precisely controlled growth of computationally-defined shapes. Here, we prove such an impossibility result. We show that any planar noncooperative system that attempts to grow large algorithmically-controlled tile-efficient assemblies must also grow infinite non-algorithmic (pumped) structures with a simple closed-form description, or else suffer blocking of intended algorithmic structures. Our result holds for both directed and nondirected systems, and gives an explicit upper bound of (8|T|)^{4|T|+1}(5|σ| + 6), where |T| is the size of the tileset and |σ| is the size of the seed assembly, beyond which any path of tiles is pumpable or blockable.
For d ≥ 2 and all q ≥ q_0(d) we give an efficient algorithm to approximately sample from the q-state ferromagnetic Potts and random cluster models on the torus (ℤ/nℤ)^d for any inverse temperature β ≥ 0. This stands in contrast to Markov chain mixing time results: the Glauber dynamics mix slowly at and below the critical temperature, and the Swendsen–Wang dynamics mix slowly at the critical temperature. We also provide an efficient algorithm (an FPRAS) for approximating the partition functions of these models.
Our algorithms are based on representing the random cluster model as a contour model using Pirogov–Sinai theory, and then computing an accurate approximation of the logarithm of the partition function by inductively truncating the resulting cluster expansion. One important innovation of our approach is an algorithmic treatment of unstable ground states; this is essential for our algorithms to apply to all inverse temperatures β. By treating unstable ground states our work gives a general template for converting probabilistic applications of Pirogov–Sinai theory to efficient algorithms.
The study of branching programs for the Tree Evaluation Problem (TreeEval), introduced by S. Cook et al. (TOCT 2012), remains one of the most promising approaches to separating L from P. Given a label in [k] at each leaf of a complete binary tree and an explicit function in [k]^2 → [k] for recursively computing the value of each internal node from its children, the problem is to compute the value at the root node. (While the original problem allows an arbitrary-degree tree, we focus on binary trees.) The problem is parameterized by the alphabet size k and the height h of the tree. A branching program implementing the straightforward recursive algorithm uses Θ((k + 1)^h) states, organized into 2^h−1 layers of width up to k^h. Until now no better deterministic algorithm was known.
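For concreteness, the straightforward recursive algorithm referred to above is just a post-order evaluation of the tree; a minimal sketch follows (heap-style node numbering and the convention that height h means h levels are our own), and it is the Θ((k+1)^h)-state baseline that the branching programs described next compete with.

def tree_eval(h, k, leaf, func):
    # leaf(v) gives the label in range(k) of leaf v; func(v, a, b) gives the value of
    # internal node v on children values a and b.  Nodes are heap-indexed: the root is 1
    # and the children of v are 2v and 2v+1; leaves sit at depth h-1.
    def value(v, depth):
        if depth == h - 1:
            return leaf(v)
        return func(v, value(2 * v, depth + 1), value(2 * v + 1, depth + 1))
    return value(1, 0)

# Toy instance: k = 5, height 3, leaf v labelled v mod 5, every internal function (a+b) mod 5.
print(tree_eval(3, 5, lambda v: v % 5, lambda v, a, b: (a + b) % 5))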
We present a series of three new algorithms solving TreeEval. They are inspired by the work of Buhrman et al. on catalytic space (STOC 2012), applied outside the catalytic-space setting. First we give a novel branching program with 2^{4h} poly(k) layers of width 2^{3k}, which beats the straightforward algorithm when h = ω(k/log k). Next we give a branching program with k^{2h} poly(k) layers of width k^3. This has total size comparable to the straightforward algorithm, but is implemented using the catalytic framework. Finally we interpolate between the two algorithms to give a branching program with (O(k/h))^{2h} poly(k) layers of width (O(k/h))^{єh} for any constant є > 0, which beats the straightforward algorithm for all h ≥ k^{1/2 + poly(є)}. These are the first deterministic branching programs to beat the straightforward algorithm, but more importantly this is the first non-trivial approach to proving deterministic upper bounds for TreeEval.
We also contribute new machinery to the catalytic computing program, which may be of independent interest to some readers.
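To give a flavor of the catalytic machinery invoked above, here is a toy Python rendering of the register-restoration trick in the spirit of Buhrman et al.: an accumulator register gains u·v mod p using only additions and subtractions of products, while the ‘full’ work registers are returned to their arbitrary initial contents. This illustrates the general idea only, not the constructions of the present paper.

def clean_mult_add(r, u, v, p):
    # Adds u*v (mod p) into r[3]; r[1] and r[2] may hold arbitrary data and are restored.
    def add_u_times_r2(sign):
        r[1] = (r[1] + u) % p
        r[3] = (r[3] + sign * r[1] * r[2]) % p
        r[1] = (r[1] - u) % p
        r[3] = (r[3] - sign * r[1] * r[2]) % p   # net effect: r[3] += sign * u * r[2]
    r[2] = (r[2] + v) % p
    add_u_times_r2(+1)                           # r[3] += u * (r2_initial + v)
    r[2] = (r[2] - v) % p
    add_u_times_r2(-1)                           # r[3] -= u * r2_initial
                                                 # net effect: r[3] += u * v

p, u, v = 101, 7, 9
r = {1: 42, 2: 13, 3: 5}                         # arbitrary pre-existing register contents
before = dict(r)
clean_mult_add(r, u, v, p)
assert r[1] == before[1] and r[2] == before[2] and r[3] == (before[3] + u * v) % p
print(r)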
Following the breakthrough work of Tardos (Oper. Res. ’86) in the bit-complexity model, Vavasis and Ye (Math. Prog. ’96) gave the first exact algorithm for linear programming in the real model of computation with running time depending only on the constraint matrix. For solving a linear program (LP) max c^⊤x, Ax = b, x ≥ 0, A ∈ ℝ^{m × n}, Vavasis and Ye developed a primal-dual interior point method using a ‘layered least squares’ (LLS) step, and showed that O(n^{3.5} log(χ_A + n)) iterations suffice to solve (LP) exactly, where χ_A is a condition measure controlling the size of solutions to linear systems related to A.
Monteiro and Tsuchiya (SIAM J. Optim. ’03), noting that the central path is invariant under rescalings of the columns of A and c, asked whether there exists an LP algorithm depending instead on the measure χ*_A, defined as the minimum χ_{AD} value achievable by a column rescaling AD of A, and gave strong evidence that this should be the case. We resolve this open question affirmatively.
Our first main contribution is an O(m^2 n^2 + n^3) time algorithm which works on the linear matroid of A to compute a nearly optimal diagonal rescaling D satisfying χ_{AD} ≤ n(χ*)^3. This algorithm also allows us to approximate the value of χ_A up to a factor n(χ*)^2. This result is in (surprising) contrast to that of Tunçel (Math. Prog. ’99), who showed NP-hardness for approximating χ_A to within 2^{poly(rank(A))}. The key insight for our algorithm is to work with ratios g_i/g_j of circuits of A—i.e., minimal linear dependencies Ag=0—which allow us to approximate the value of χ*_A by a maximum geometric mean cycle computation in what we call the ‘circuit ratio digraph’ of A.
While this resolves Monteiro and Tsuchiya’s question by appropriate preprocessing, it falls short of providing either a truly scaling invariant algorithm or an improvement upon the base LLS analysis. In this vein, as our second main contribution we develop a scaling invariant LLS algorithm, which uses and dynamically maintains improving estimates of the circuit ratio digraph, together with a refined potential function based analysis for LLS algorithms in general. With this analysis, we derive an improved O(n^{2.5} log n log(χ*_A + n)) iteration bound for optimally solving (LP) using our algorithm. The same argument also yields a factor n/log n improvement on the iteration complexity bound of the original Vavasis-Ye algorithm.
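The ‘maximum geometric mean cycle computation’ mentioned above reduces, by taking edge weights −log(ratio), to the classical minimum mean cycle problem, which Karp’s dynamic program solves exactly; the sketch below is generic graph machinery on made-up ratios, not the paper’s circuit-ratio construction.

import math

def min_mean_cycle(n, edges):
    # Karp's dynamic program: D[k][v] = minimum weight of a walk with exactly k edges
    # ending at v (starting anywhere); the minimum cycle mean equals
    # min over v of max over k < n of (D[n][v] - D[k][v]) / (n - k).
    INF = math.inf
    D = [[INF] * n for _ in range(n + 1)]
    for v in range(n):
        D[0][v] = 0.0
    for k in range(1, n + 1):
        for u, v, w in edges:
            if D[k - 1][u] + w < D[k][v]:
                D[k][v] = D[k - 1][u] + w
    best = INF
    for v in range(n):
        if D[n][v] < INF:
            best = min(best, max((D[n][v] - D[k][v]) / (n - k)
                                 for k in range(n) if D[k][v] < INF))
    return None if best == INF else best

# Maximum geometric-mean cycle over positive edge ratios, via -log weights.
ratios = [(0, 1, 4.0), (1, 0, 1.0), (1, 2, 2.0), (2, 0, 2.0)]
edges = [(u, v, -math.log(rho)) for u, v, rho in ratios]
print(math.exp(-min_mean_cycle(3, edges)))       # ~2.52 = (4*2*2)^(1/3)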
In this paper we provide an O(nd + d^3) time randomized algorithm for solving linear programs with d variables and n constraints with high probability. To obtain this result we provide a robust, primal-dual O(√d)-iteration interior point method inspired by the methods of Lee and Sidford (2014, 2019) and show how to efficiently implement this method using new data-structures based on heavy-hitters, the Johnson–Lindenstrauss lemma, and inverse maintenance. Interestingly, we obtain this running time without using fast matrix multiplication and consequently, barring a major advance in linear system solving, our running time is near optimal for solving dense linear programs among algorithms that do not use fast matrix multiplication.
We give the first approximation algorithm for mixed packing and covering semidefinite programs (SDPs) with polylogarithmic dependence on width. Mixed packing and covering SDPs constitute a fundamental algorithmic primitive with applications in combinatorial optimization, robust learning, and quantum complexity. The current approximate solvers for positive semidefinite programming can handle only pure packing instances, and technical hurdles prevent their generalization to a wider class of positive instances. For a given multiplicative accuracy of є, our algorithm takes O(log^3(ndρ) · є^{−3}) parallelizable iterations, where n, d are the problem dimensions and ρ is a width parameter of the instance, generalizing or improving all previous parallel algorithms in the positive linear and semidefinite programming literature. When specialized to pure packing SDPs, our algorithm’s iteration complexity is O(log^2(nd) · є^{−2}), a slight improvement and derandomization of the state-of-the-art due to Allen-Zhu et al. ’16, Peng et al. ’16, and Wang et al. ’15. For several common structured instances, the iterations of our algorithm run in nearly-linear time. In doing so, we give matrix analytic techniques to overcome obstacles that have stymied prior approaches to this problem, as stated in Peng et al. ’16 and Mahoney et al. ’16. Crucial to our analysis are a simplification of existing algorithms for mixed positive linear programs, achieved by removing an asymmetry from modifying covering constraints, and a suite of matrix inequalities with proofs based on analyzing the Schur complements of matrices in a higher dimension. We hope that our algorithm and techniques open the door to improved solvers for positive semidefinite programming and its applications.
In this paper we provide an algorithm which given any m-edge n-vertex directed graph with integer capacities at most U computes a maximum s-t flow for any vertices s and t in m^{11/8+o(1)} U^{1/4} time with high probability. This running time improves upon the previous best of Õ(m^{10/7} U^{1/7}) (Mądry 2016), Õ(m√n log U) (Lee Sidford 2014), and O(mn) (Orlin 2013) when the graph is not too dense or has large capacities.
We achieve this result by leveraging recent advances in solving undirected flow problems on graphs. We show that in the maximum flow framework of (Mądry 2016) the problem of optimizing the amount of perturbation of the central path needed to maximize energy and thereby reduce congestion can be efficiently reduced to a smoothed ℓ2-ℓp flow optimization problem, which can be solved approximately via recent work (Kyng, Peng, Sachdeva, Wang 2019). Leveraging this new primitive, we provide a new interior point method for maximum flow with faster convergence and simpler analysis that no longer needs global potential functions involving energy as in previous methods (Mądry 2013, Mądry 2016).
The basic goal of survivable network design is to build a cheap network that maintains the connectivity between given sets of nodes despite the failure of a few edges/nodes. The Connectivity Augmentation Problem (CAP) is arguably one of the most basic problems in this area: given a k(-edge)-connected graph G and a set of extra edges (links), select a minimum cardinality subset A of links such that adding A to G increases its edge connectivity to k+1. Intuitively, one wants to make an existing network more reliable by augmenting it with extra edges. The best known approximation factor for this NP-hard problem is 2, and this can be achieved with multiple approaches (the first such result is in [Frederickson and Jájá’81]).
It is known [Dinitz et al.’76] that CAP can be reduced to the case k=1, a.k.a. the Tree Augmentation Problem (TAP), for odd k, and to the case k=2, a.k.a. the Cactus Augmentation Problem (CacAP), for even k. Several better than 2 approximation algorithms are known for TAP, culminating with a recent 1.458 approximation [Grandoni et al.’18]. However, for CacAP the best known approximation is 2.
In this paper we breach the 2 approximation barrier for CacAP, hence for CAP, by presenting a polynomial-time 2ln(4)−967/1120+є<1.91 approximation. From a technical point of view, our approach deviates quite substantially from the current related literature. In particular, the better-than-2 approximation algorithms for TAP either exploit greedy-style algorithms or are based on rounding carefully-designed LPs. These approaches exploit properties of TAP that do not seem to generalize to CacAP. We instead use a reduction to the Steiner tree problem which was previously used in parameterized algorithms [Basavaraju et al.’14]. This reduction is not approximation preserving, and using the current best approximation factor for Steiner tree [Byrka et al.’13] as a black-box would not be good enough to improve on 2. To achieve the latter goal, we “open the box” and exploit the specific properties of the instances of Steiner tree arising from CacAP.
In our opinion this connection between approximation algorithms for survivable network design and Steiner-type problems is interesting, and it might lead to other results in the area.
We present a spectral approach to design approximation algorithms for network design problems. We observe that the underlying mathematical questions are the spectral rounding problems, which were studied in spectral sparsification and in discrepancy theory. We extend these results to incorporate additional linear constraints, and show that they can be used to significantly extend the scope of network design problems that can be solved. Our algorithm for spectral rounding is an iterative randomized rounding algorithm based on the regret minimization framework. In some settings, this provides an alternative spectral algorithm to achieve constant factor approximation for survivable network design, and partially answers a question of Bansal about survivable network design with concentration property. We also show that the spectral rounding results have many other applications, including weighted experimental design and additive spectral sparsification.
The degree-4 Sum-of-Squares (SoS) SDP relaxation is a powerful algorithm that captures the best known polynomial time algorithms for a broad range of problems including MaxCut, Sparsest Cut, all MaxCSPs and tensor PCA. Despite being an explicit algorithm with relatively low computational complexity, the limits of degree-4 SoS SDP are not well understood. For example, existing integrality gaps do not rule out a (2−ε)-algorithm for Vertex Cover or a (0.878+ε)-algorithm for MaxCut via degree-4 SoS SDPs, each of which would refute the notorious Unique Games Conjecture.
We exhibit an explicit mapping from solutions for degree-2 Sum-of-Squares SDP (Goemans-Williamson SDP) to solutions for the degree-4 Sum-of-Squares SDP relaxation on boolean variables. By virtue of this mapping, one can lift lower bounds for degree-2 SoS SDP relaxation to corresponding lower bounds for degree-4 SoS SDPs. We use this approach to obtain degree-4 SoS SDP lower bounds for MaxCut on random d-regular graphs, the Sherrington-Kirkpatrick model from statistical physics, and the PSD Grothendieck problem.
Our constructions use the idea of pseudocalibration towards candidate SDP vectors, while it was previously only used to produce the candidate matrix which one would show is PSD using much technical work. In addition, we develop a different technique to bound the spectral norms of graphical matrices that arise in the context of SoS SDPs. The technique is much simpler and yields better bounds in many cases than the trace method – which was the sole technique for this purpose.
We give new algorithms based on Markov chains to sample and approximately count satisfying assignments to k-uniform CNF formulas where each variable appears at most d times. For any k and d satisfying kd < n^{o(1)} and k ≥ 20 log k + 20 log d + 60, the new sampling algorithm runs in close to linear time, and the counting algorithm runs in close to quadratic time.
Our approach is inspired by Moitra (JACM, 2019) which remarkably utilizes the Lovász local lemma in approximate counting. Our main technical contribution is to use the local lemma to bypass the connectivity barrier in traditional Markov chain approaches, which makes the well developed MCMC method applicable on disconnected state spaces such as SAT solutions. The benefit of our approach is to avoid the enumeration of local structures and obtain fixed polynomial running times, even if k=ω(1) or d=ω(1).
Let H be a frustration-free Hamiltonian describing a 2D grid of qudits with local interactions, a unique ground state, and local spectral gap lower bounded by a positive constant. For any bipartition defined by a vertical cut of length L running from top to bottom of the grid, we prove that the corresponding entanglement entropy of the ground state of H is upper bounded by Õ(L^{5/3}). For the special case of a 1D chain, our result provides a new area law which improves upon prior work, in terms of the scaling with qudit dimension and spectral gap. In addition, for any bipartition of the grid into a rectangular region A and its complement, we show that the entanglement entropy is upper bounded as Õ(|∂A|^{5/3}) where ∂A is the boundary of A. This represents a subvolume bound on entanglement in frustration-free 2D systems. In contrast with previous work, our bounds depend on the local (rather than global) spectral gap of the Hamiltonian. We prove our results using a known method which bounds the entanglement entropy of the ground state in terms of certain properties of an approximate ground state projector (AGSP). To this end, we construct a new AGSP which is based on a robust polynomial approximation of the AND function and we show that it achieves an improved trade-off between approximation error and entanglement.
Recent work of Bravyi et al. and follow-up work by Bene Watts et al. demonstrates a quantum advantage for shallow circuits: constant-depth quantum circuits can perform a task which constant-depth classical (i.e., NC^0) circuits cannot. Their results have the advantage that the quantum circuit is fairly practical, and their proofs are free of hardness assumptions (e.g., factoring is classically hard, etc.). Unfortunately, constant-depth classical circuits are too weak to yield a convincing real-world demonstration of quantum advantage. We attempt to hold on to the advantages of the above results, while increasing the power of the classical model. Our main result is a two-round interactive task which is solved by a constant-depth quantum circuit (using only Clifford gates, between neighboring qubits of a 2D grid, with Pauli measurements), but such that any classical solution would necessarily solve ⊕L-hard problems. This implies that a more powerful class of constant-depth classical circuits (e.g., AC^0[p] for any prime p) unconditionally cannot perform the task. Furthermore, under standard complexity-theoretic conjectures, log-depth circuits and log-space Turing machines cannot perform the task either. Using the same techniques, we prove hardness results for weaker complexity classes under more restrictive circuit topologies. Specifically, we give QNC^0 interactive tasks on 2 × n and 1 × n grids which require classical simulations of power NC^1 and AC^0[6], respectively. Moreover, these hardness results are robust to a small constant fraction of error in the classical simulation.
We use ideas and techniques from the theory of branching programs, quantum contextuality, measurement-based quantum computation, and Kilian randomization.
A conjecture of Jozsa (arXiv:quant-ph/0508124) states that any polynomial-time quantum computation can be simulated by polylogarithmic-depth quantum computation interleaved with polynomial-depth classical computation. Separately, Aaronson conjectured that there exists an oracle O such that BQP^O ≠ (BPP^{BQNC})^O. These conjectures are intriguing allusions to the unresolved potential of combining classical and low-depth quantum computation. In this work we show that the Welded Tree Problem, which is an oracle problem that can be solved in quantum polynomial time as shown by Childs et al. (arXiv:quant-ph/0209131), cannot be solved in BPP^{BQNC}, nor can it be solved in the class that Jozsa describes. This proves Aaronson’s oracle separation conjecture and provides a counterpoint to Jozsa’s conjecture relative to the Welded Tree oracle problem. More precisely, we define two complexity classes, HQC and JC, whose languages are decided by two different families of interleaved quantum-classical circuits. HQC contains BPP^{BQNC} and is therefore relevant to Aaronson’s conjecture, while JC captures the model of computation that Jozsa considers. We show that the Welded Tree Problem gives an oracle separation between either of {JC, HQC} and BQP. Therefore, even when interleaved with arbitrary polynomial-time classical computation, greater “quantum depth” leads to strictly greater computational ability in this relativized setting.
Near-term quantum computers are likely to have small depths due to short coherence time and noisy gates. A natural approach to leverage these quantum computers is interleaving them with classical computers. Understanding the capabilities and limits of this hybrid approach is an essential topic in quantum computation. Most notably, the quantum Fourier transform can be implemented by a hybrid of logarithmic-depth quantum circuits and a classical polynomial-time algorithm. Therefore, it seems possible that quantum polylogarithmic depth is as powerful as quantum polynomial depth in the presence of classical computation.
Indeed, Jozsa conjectured that “Any quantum polynomial-time algorithm can be implemented with only O(log n) quantum depth interspersed with polynomial-time classical computations.” This can be formalized as asserting the equivalence of BQP and “BQNC^{BPP}”. On the other hand, Aaronson conjectured that “there exists an oracle separation between BQP and BPP^{BQNC}.” BQNC^{BPP} and BPP^{BQNC} are two natural and seemingly incomparable forms of hybrid classical-quantum computation.
In this work, we manage to prove Aaronson’s conjecture and disprove Jozsa’s conjecture relative to an oracle. In fact, we prove a stronger statement: for any depth parameter d, there exists an oracle that separates quantum depth d and 2d+1 in the presence of classical computation. Thus, our results show that relative to oracles, doubling the quantum circuit depth indeed gives the hybrid model more power, and this cannot be traded off against classical computation.
How can two parties with competing interests carry out a fair coin flip across a quantum communication channel? This problem (quantum weak coin-flipping) was formalized more than 15 years ago, and, despite some phenomenal theoretical progress, practical quantum coin-flipping protocols with vanishing bias have proved hard to find. In the current work we show that there is a reason that practical weak quantum coin-flipping is difficult: any quantum weak coin-flipping protocol with bias є must use at least exp(Ω(1/√є)) rounds of communication. This is a large improvement over the previous best known lower bound of Ω(log log(1/є)) due to Ambainis from 2004. Our proof is based on a theoretical construction (the two-variable profile function) which may find further applications.
We initiate a study of the following problem: Given a continuous domain Ω along with its convex hull K, a point A ∈ K and a prior measure µ on Ω, find the probability density over Ω whose marginal is A and that minimizes the KL-divergence to µ. This framework gives rise to several extremal distributions that arise in mathematics, quantum mechanics, statistics, and theoretical computer science. Our technical contributions include a polynomial bound on the norm of the optimizer of the dual problem that holds in a very general setting and relies on a “balance” property of the measure µ on Ω, and exact algorithms for evaluating the dual and its gradient for several interesting settings of Ω and µ. Together, along with the ellipsoid method, these results imply polynomial-time algorithms to compute such KL-divergence minimizing distributions in several cases. Applications of our results include: 1) an optimization characterization of the Goemans-Williamson measure that is used to round a positive semidefinite matrix to a vector, 2) the computability of the entropic barrier for polytopes studied by Bubeck and Eldan, and 3) a polynomial-time algorithm to compute the barycentric quantum entropy of a density matrix that was proposed as an alternative to von Neumann entropy by Band and Park in the 1970s: this corresponds to the case when Ω is the set of rank one projection matrices and µ corresponds to the Haar measure on the unit sphere. Our techniques generalize to the setting of rank k projections using the Harish-Chandra-Itzykson-Zuber formula, and are applicable even beyond, to adjoint orbits of compact Lie groups.
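To make the primal-dual structure above concrete in the simplest setting, the sketch below takes Ω to be a finite subset of ℝ^n, where the optimizer is an exponential-family reweighting of µ and the dual gradient is exactly the marginal mismatch; the continuous case, the balance property, and the ellipsoid-method machinery of the paper are deliberately not reflected here.

import numpy as np

def kl_projection(points, mu, A, steps=5000, lr=0.5):
    # Minimize KL(p || mu) over distributions p on the given points subject to E_p[x] = A.
    # The optimizer is p_theta(x) proportional to mu(x) * exp(<theta, x>); we run gradient
    # descent on the dual F(theta) = log sum_x mu(x) e^{<theta,x>} - <theta, A>,
    # whose gradient is E_{p_theta}[x] - A.
    theta = np.zeros(points.shape[1])
    for _ in range(steps):
        logits = np.log(mu) + points @ theta
        logits -= logits.max()
        p = np.exp(logits)
        p /= p.sum()
        theta -= lr * (points.T @ p - A)
    return p

pts = np.array([[0.0], [1.0], [2.0]])            # Omega = {0, 1, 2} on the real line
prior = np.array([1 / 3, 1 / 3, 1 / 3])          # uniform prior measure mu
p = kl_projection(pts, prior, A=np.array([1.5]))
print(p, pts.T @ p)                              # the marginal (mean) of p is ~1.5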
Given a separation oracle for a convex set K ⊂ ℝ^n that is contained in a box of radius R, the goal is to either compute a point in K or prove that K does not contain a ball of radius є. We propose a new cutting plane algorithm that uses an optimal O(n log(κ)) evaluations of the oracle and an additional O(n^2) time per evaluation, where κ = nR/є. This improves upon Vaidya’s O(SO · n log(κ) + n^{ω+1} log(κ)) time algorithm [Vaidya, FOCS 1989a] in terms of polynomial dependence on n, where ω < 2.373 is the exponent of matrix multiplication and SO is the time for oracle evaluation. This improves upon Lee-Sidford-Wong’s O(SO · n log(κ) + n^3 log^{O(1)}(κ)) time algorithm [Lee, Sidford and Wong, FOCS 2015] in terms of dependence on κ. For many important applications in economics, κ = Ω(exp(n)) and this leads to a significant difference between log(κ) and log^{O(1)}(κ). We also provide evidence that the n^2 time per evaluation cannot be improved and thus our running time is optimal.
A bottleneck of previous cutting plane methods is to compute leverage scores, a measure of the relative importance of past constraints. Our result is achieved by a novel multi-layered data structure for leverage score maintenance, which is a sophisticated combination of diverse techniques such as random projection, batched low-rank update, inverse maintenance, polynomial interpolation, and fast rectangular matrix multiplication. Interestingly, our method requires a combination of different fast rectangular matrix multiplication algorithms.
Our algorithm not only works for the classical convex optimization setting, but also generalizes to convex-concave games. We apply our algorithm to improve the runtimes of many interesting problems, e.g., Linear Arrow-Debreu Markets, Fisher Markets, and Walrasian equilibrium.
State-of-the-art results on image recognition tasks are achieved using over-parameterized learning algorithms that (nearly) perfectly fit the training set and are known to fit even random labels well. This tendency to memorize seemingly useless training data labels is not explained by existing theoretical analyses. Memorization of the training data also presents significant privacy risks when the training data contains sensitive personal information and thus it is important to understand whether such memorization is necessary for accurate learning.
We provide a simple conceptual explanation and a theoretical model demonstrating that for natural data distributions memorization of labels is necessary for achieving close-to-optimal generalization error. The model is motivated and supported by the results of several recent empirical works. In our model, data is sampled from a mixture of subpopulations and the frequencies of these subpopulations are chosen from some prior. The model allows us to quantify the effect of not fitting the training data on the generalization performance of the learned classifier and demonstrates that memorization is necessary whenever frequencies are long-tailed. Image and text data are known to follow such distributions and therefore our results establish a formal link between these empirical phenomena. Our results also have concrete implications for the cost of ensuring differential privacy in learning.
We study the problem, introduced by Qiao and Valiant, of learning from untrusted batches. Here, we assume m users, all of whom have samples from some underlying distribution over {1, …, n}. Each user sends a batch of k i.i.d. samples from this distribution; however an є-fraction of users are untrustworthy and can send adversarially chosen responses. The goal of the algorithm is to learn the distribution in total variation distance. When k = 1 this is the standard robust univariate density estimation setting and it is well-understood that Θ(є) error is unavoidable. Surprisingly, Qiao and Valiant gave an estimator which improves upon this rate when k is large. Unfortunately, their algorithms run in time which is exponential in either n or k.
We first give a sequence of polynomial time algorithms whose estimation error approaches the information-theoretically optimal bound for this problem. Our approach is based on recent algorithms derived from the sum-of-squares hierarchy, in the context of high-dimensional robust estimation. We show that algorithms for learning from untrusted batches can also be cast in this framework, but by working with a more complicated set of test functions.
It turns out that this abstraction is quite powerful, and can be generalized to incorporate additional problem specific constraints. Our second and main result is to show that this technology can be leveraged to build in prior knowledge about the shape of the distribution. Crucially, this allows us to reduce the sample complexity of learning from untrusted batches to polylogarithmic in n for most natural classes of distributions, which is important in many applications. To do so, we demonstrate that these sum-of-squares algorithms for robust mean estimation can be made to handle complex combinatorial constraints (e.g. those arising from VC theory), which may be of independent technical interest.
The popular 3-SUM conjecture states that there is no strongly subquadratic time algorithm for checking if a given set of integers contains three distinct elements that sum up to zero. A closely related problem is to check if a given set of integers contains distinct x1, x2, x3 such that x1+x2=2x3. This can be reduced to 3-SUM in almost-linear time, but surprisingly a reverse reduction establishing 3-SUM hardness was not known.
We provide such a reduction, thus resolving an open question of Erickson. In fact, we consider a more general problem called 3-LDT parameterized by integer parameters α1, α2, α3 and t. In this problem, we need to check if a given set of integers contains distinct elements x1, x2, x3 such that α1 x1+α2 x2 +α3 x3 = t. For some combinations of the parameters, every instance of this problem is a NO-instance or there exists a simple almost-linear time algorithm. We call such variants trivial. We prove that all non-trivial variants of 3-LDT are equivalent under subquadratic reductions. Our main technical contribution is an efficient deterministic procedure based on the famous Behrend’s construction that partitions a given set of integers into few subsets that avoid a chosen linear equation.
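For context on the conjecture above, the classical quadratic-time algorithm for 3-SUM is the sort-and-two-pointers routine sketched below; the paper’s contribution is the web of subquadratic reductions between 3-LDT variants, not this baseline.

def three_sum_zero(nums):
    # Classic O(n^2) search for three entries (at distinct positions) summing to zero.
    a = sorted(nums)
    n = len(a)
    for i in range(n - 2):
        lo, hi = i + 1, n - 1
        while lo < hi:
            s = a[i] + a[lo] + a[hi]
            if s == 0:
                return a[i], a[lo], a[hi]
            elif s < 0:
                lo += 1
            else:
                hi -= 1
    return None

print(three_sum_zero([-7, 3, 1, 4, -2, 9]))      # finds (-7, -2, 9)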
In the classical SubsetSum problem we are given a set X and a target t, and the task is to decide whether there exists a subset of X which sums to t. A recent line of research has resulted in O(t · poly(log t))-time algorithms, which are (near-)optimal under popular complexity-theoretic assumptions. On the other hand, the standard dynamic programming algorithm runs in time O(n · |S(X,t)|), where S(X,t) is the set of all subset sums of X that are smaller than t. All previous pseudopolynomial algorithms actually solve a stronger task, since they actually compute the whole set S(X,t).
As the aforementioned two running times are incomparable, in this paper we ask whether one can achieve the best of both worlds: running time |S(X,t)| · poly(log t). In particular, we ask whether S(X,t) can be computed in near-linear time in the output-size. Using a diverse toolkit containing techniques such as color coding, sparse recovery, and sumset estimates, we make considerable progress towards this question and design an algorithm running in time |S(X,t)|^{4/3} · poly(log t).
Central to our approach is the study of top-k-convolution, a natural problem of independent interest: given degree-d sparse polynomials with non-negative coefficients, compute the lowest k non-zero monomials of their product. We design an algorithm running in time k^{4/3} poly(log d), by a combination of sparse convolution and sumset estimates considered in Additive Combinatorics. Moreover, we provide evidence that going beyond some of the barriers we have faced requires either an algorithmic breakthrough or possibly new techniques from Additive Combinatorics on how to pass from information on restricted sumsets to information on unrestricted sumsets.
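The output-sensitive dynamic program mentioned above (the O(n · |S(X,t)|) baseline, not the new |S(X,t)|^{4/3}-time algorithm) fits in a few lines; for simplicity the sketch keeps sums equal to t as well, a harmless deviation from the strict ‘smaller than t’ convention.

def all_subset_sums(X, t):
    # Each round unions in a shifted copy of the current sum set, discarding sums above t,
    # so each round costs time proportional to the size of the current set.
    sums = {0}
    for x in X:
        sums |= {s + x for s in sums if s + x <= t}
    return sums

print(sorted(all_subset_sums([3, 5, 9], 12)))    # [0, 3, 5, 8, 9, 12]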
The Sparsest Cut is a fundamental optimization problem that has been extensively studied. For planar inputs the problem is in P and can be solved in Õ(n^3) time if all vertex weights are 1. Despite a significant amount of effort, the best algorithms date back to the early 90’s and can only achieve O(log n)-approximation in Õ(n) time or 3.5-approximation in Õ(n^2) time [Rao, STOC92]. Our main result is an Ω(n^{2−ε}) lower bound for Sparsest Cut even in planar graphs with unit vertex weights, under the (min, +)-Convolution conjecture, showing that approximations are inevitable in the near-linear time regime. To complement the lower bound, we provide a 3.3-approximation in near-linear time, improving upon the 25-year old result of Rao in both time and accuracy. We also show that our lower bound is not far from optimal by observing an exact algorithm with running time Õ(n^{5/2}), improving upon the Õ(n^3) algorithm of Park and Phillips [STOC93]. Our lower bound accomplishes a repeatedly raised challenge by being the first fine-grained lower bound for a natural planar graph problem in P. Building on our construction we prove near-quadratic lower bounds under SETH for variants of the closest pair problem in planar graphs, and use them to show that the popular Average-Linkage procedure for Hierarchical Clustering cannot be simulated in truly subquadratic time. At the core of our constructions is a diamond-like gadget that also settles the complexity of Diameter in distributed planar networks. We prove an Ω(n/log n) lower bound on the number of communication rounds required to compute the weighted diameter of a network in the CONGEST model, even when the underlying graph is planar and all nodes are D = 4 hops away from each other. This is the first poly(n) lower bound in the planar-distributed setting, and it complements the recent poly(D, log n) upper bounds of Li and Parter [STOC 2019] for (exact) unweighted diameter and for (1 + ε) approximate weighted diameter.
In this paper we provide an Õ(m√n) time algorithm that computes a 3-multiplicative approximation of the girth of an n-node m-edge directed graph with non-negative edge lengths. This is the first algorithm which approximates the girth of a directed graph up to a constant multiplicative factor faster than All-Pairs Shortest Paths (APSP) time, i.e. O(mn). Additionally, for any integer k ≥ 1, we provide a deterministic algorithm for an O(k log log n)-multiplicative approximation to the girth in directed graphs in Õ(m^{1+1/k}) time. Combining the techniques from these two results gives us an algorithm for an O(k log k)-multiplicative approximation to the girth in directed graphs in Õ(m^{1+1/k}) time. Our results naturally also provide algorithms for improved constructions of roundtrip spanners, the analog of spanners in directed graphs.
The previous fastest algorithms for these problems either ran in All-Pairs Shortest Paths (APSP) time, i.e. O(mn), or were due to Pachocki et al., who provided a randomized algorithm that for any integer k ≥ 1, in time Õ(m^{1+1/k}), computed with high probability an O(k log n) multiplicative approximation of the girth. Our first algorithm constitutes the first sub-APSP-time algorithm for approximating the girth to constant accuracy, our second removes the need for randomness and improves the approximation factor of Pachocki et al., and our third is the first time versus quality trade-off for obtaining constant approximations.
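As a reference point for the two paragraphs above, the APSP-time baseline they improve upon is simple to state: every shortest directed cycle consists of an edge (u, v) followed by a shortest path from v back to u, so running Dijkstra from every vertex and minimizing over edges returns the exact girth (non-negative lengths assumed, as above); a minimal sketch follows.

import heapq, math

def girth_via_dijkstra(n, edges):
    # edges: list of (u, v, w) directed edges with non-negative weights, vertices 0..n-1.
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))

    def dijkstra(src):
        dist = [math.inf] * n
        dist[src] = 0.0
        pq = [(0.0, src)]
        while pq:
            d, u = heapq.heappop(pq)
            if d > dist[u]:
                continue
            for v, w in adj[u]:
                if d + w < dist[v]:
                    dist[v] = d + w
                    heapq.heappush(pq, (dist[v], v))
        return dist

    dist_from = [dijkstra(s) for s in range(n)]
    # girth = min over edges (u, v) of w(u, v) + dist(v, u)
    return min((w + dist_from[v][u] for u, v, w in edges), default=math.inf)

print(girth_via_dijkstra(3, [(0, 1, 1.0), (1, 2, 2.0), (2, 0, 3.0), (1, 0, 10.0)]))  # 6.0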
Non-signaling proofs, motivated by quantum computation, have found applications in cryptography and hardness of approximation. An important open problem is characterizing the power of non-signaling proofs. It is known that non-signaling proofs with two provers are characterized by PSPACE and that non-signaling proofs with poly(n) provers are characterized by EXP. However, the power of k-prover non-signaling proofs, for 2 < k < poly(n), remained an open problem.
We show that k-prover non-signaling proofs (with negligible soundness) for k = O(√log n) are contained in PSPACE. We prove this via two different routes that are of independent interest. In both routes we consider a relaxation of non-signaling called sub-non-signaling. Our main technical contribution (which is used in both our proofs) is a reduction showing how to convert any sub-non-signaling strategy with value at least 1 − 2^{−Ω(k^2)} into a non-signaling one with value at least 2^{−O(k^2)}.
In the first route, we show that the classical prover reduction method for converting k-prover games into 2-prover games carries over to the non-signaling setting with the following loss in soundness: if a k-prover game has value less than 2^{−ck^2} (for some constant c > 0), then the corresponding 2-prover game has value less than 1 − 2^{−dk^2} (for some constant d > 0). In the second route we show that the value of a sub-non-signaling game can be approximated in space that is polynomial in the communication complexity and exponential in the number of provers.
Hardness of computing the Kolmogorov complexity of a given string is closely tied to a security proof of hitting set generators, and thus understanding hardness of Kolmogorov complexity is one of the central questions in complexity theory. In this paper, we develop new proof techniques for showing hardness of computing Kolmogorov complexity under surprisingly efficient reductions, which were previously conjectured to be impossible. It is known that the set R_K of Kolmogorov-random strings is PSPACE-hard under polynomial-time Turing reductions, i.e., PSPACE ⊂ P^{R_K}, and that NEXP ⊂ NP^{R_K}, which was conjectured to be tight by Allender (CiE 2012). We prove that EXP^{NP} ⊂ P^{R_K}, which simultaneously improves these hardness results and refutes the conjecture of Allender under the plausible assumption that EXP^{NP} ≠ NEXP. At the core of our results is a new security proof of a pseudorandom generator via a black-box uniform reduction, which overcomes an impossibility result of Gutfreund and Vadhan (RANDOM/APPROX 2008).
Our proof techniques have further consequences, including:
1. Applying our proof techniques to the case of resource-bounded Kolmogorov complexity, we obtain NP-hardness of the problem MINcKT^{SAT} of computing conditional polynomial-time-bounded SAT-oracle Kolmogorov complexity under polynomial-time deterministic reductions. In contrast, the Minimum SAT-Oracle Circuit Size Problem, which is a version of sublinear-time-bounded Kolmogorov complexity, cannot be NP-hard under polynomial-time deterministic reductions without resolving EXP ≠ ZPP. Our hardness result is the first result that overcomes the non-NP-hardness results of MCSP. We also prove DistNP-hardness of MINKT^{SAT}, which is a partial converse of the approach of Hirahara (FOCS 2018) for proving the equivalence between worst-case and average-case complexity of NP.
2. We prove S_2^p-hardness of Kolmogorov complexity under quasi-polynomial-time nonadaptive reductions. This is the first result that overcomes a P/poly barrier result of Allender, Buhrman, Friedman, and Loff (MFCS 2012).
We also establish a firm link between non-trivial satisfiability algorithms and immunity of random strings, and obtain the following unconditional lower bounds.
1. It has been a long-standing open question whether the set of subexponential-time-bounded Kolmogorov-random strings is decidable in P. We resolve this open question, by showing that the set of super-polynomial-time-bounded Kolmogorov-random strings is P-immune, which is a much stronger lower bound than an average-case lower bound.
2. The set of Levin’s Kolmogorov-random strings is (P-uniform ACC)-immune.
We show that the smoothed complexity of the FLIP algorithm for local Max-Cut is at most φ n^{O(√log n)}, where n is the number of nodes in the graph and φ is a parameter that measures the magnitude of perturbations applied on its edge weights. This improves the previously best upper bound of φ n^{O(log n)} by Etscheid and Röglin. Our result is based on an analysis of long sequences of flips, which shows that it is very unlikely for every flip in a long sequence to incur a positive but small improvement in the cut weight. We also extend the same upper bound on the smoothed complexity of FLIP to all binary Maximum Constraint Satisfaction Problems.
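For completeness, the FLIP dynamics whose smoothed complexity is bounded above is the elementary local search sketched below (the scan order for choosing an improving vertex is arbitrary here; the smoothed analysis concerns the number of improving flips, not the selection rule).

import random

def flip_local_max_cut(n, weights, rng=random.Random(0)):
    # weights[u][v] = weights[v][u] is the edge weight (0 if absent).
    side = [rng.choice([-1, 1]) for _ in range(n)]

    def gain(u):
        # Flipping u turns its same-side edges into cut edges and its cut edges into
        # same-side edges, so the change in cut weight is:
        return sum(weights[u][v] * (1 if side[v] == side[u] else -1)
                   for v in range(n) if v != u)

    improved = True
    while improved:                 # terminates: each flip strictly increases the cut weight
        improved = False
        for u in range(n):
            if gain(u) > 0:
                side[u] = -side[u]
                improved = True
    return side

W = [[0, 2, 1], [2, 0, 3], [1, 3, 0]]
print(flip_local_max_cut(3, W))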
We consider the simplest non-linear discrete dynamical systems, given by the logistic maps f_a(x) = ax(1−x) of the interval [0,1]. We show that there exist real parameters a ∈ (0,4) for which almost every orbit of f_a has the same statistical distribution in [0,1], but this limiting distribution is not Turing computable. In particular, the Monte Carlo method cannot be applied to study these dynamical systems.
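The statistical distribution of an orbit referred to above is exactly what one would try to estimate by simulation; the sketch below builds the empirical histogram of a single orbit of f_a(x) = a x(1−x), which is the Monte Carlo computation whose limits the theorem concerns (finite-precision caveats aside). For the sanity-check parameter a = 4 the limiting density is the arcsine law 1/(π√(x(1−x))); the theorem is about other, carefully constructed parameters.

import numpy as np

def empirical_orbit_distribution(a, x0=0.1234, burn_in=10_000, iters=500_000, bins=50):
    x = x0
    for _ in range(burn_in):
        x = a * x * (1.0 - x)
    counts = np.zeros(bins)
    for _ in range(iters):
        x = a * x * (1.0 - x)
        counts[min(int(x * bins), bins - 1)] += 1
    return counts / iters           # fraction of time the orbit spends in each bin

print(empirical_orbit_distribution(a=4.0)[:5])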
We prove the first separation in the approximation guarantee achievable by truthful and non-truthful combinatorial auctions with polynomial communication. Specifically, we prove that any truthful auction guaranteeing a (3/4 − 1/240 + ε)-approximation for two buyers with XOS valuations over m items requires exp(Ω(ε^2 · m)) communication, whereas a non-truthful auction by Feige [SIAM J. Comput. 2009] is already known to achieve a 3/4-approximation in poly(m) communication.
We obtain our lower bound for truthful auctions by proving that any simultaneous auction (not necessarily truthful) which guarantees a (3/4 − 1/240 + ε)-approximation requires communication exp(Ω(ε^2 · m)), and then apply the taxation complexity framework of Dobzinski [FOCS 2016] to extend the lower bound to all truthful auctions (including interactive truthful auctions).
We consider incentive compatible mechanisms for a domain that is very close to the domain of scheduling n unrelated machines: the single exception is that the valuation of just one machine is submodular. For the scheduling problem with such cost functions, we give a lower bound of Ω(√n) on the approximation ratio of incentive compatible deterministic mechanisms. This is a strong information-theoretic impossibility result on the approximation ratio of mechanisms, and it provides strong evidence for the Nisan-Ronen conjecture. It is the first non-constant lower bound that assumes no restriction on the mechanism side; in contrast, all previous general results hold only for special classes of mechanisms such as local, strongly monotone, and anonymous mechanisms. Our approach is based on a novel multi-player characterization of appropriately selected instances that allows us to focus on a particular class of algorithms, linear mechanisms, and it is a potential stepping stone towards the full resolution of the conjecture.
There has been a long history of studying randomized greedy matching algorithms since the work of Dyer and Frieze (RSA 1991). We follow this trend and consider the problem formulated in the oblivious setting, in which the algorithm makes (random) decisions that are essentially oblivious to the input graph. We revisit the Modified Randomized Greedy (MRG) algorithm of Aronson et al. (RSA 1995), which is proved to be (0.5+ε)-approximate. In particular, we study a weaker version of the algorithm named Random Decision Order (RDO) which, in each step, randomly picks an unmatched vertex and matches it to an arbitrary unmatched neighbor, if one exists. We prove that the RDO algorithm is 0.639-approximate for bipartite graphs and 0.531-approximate for general graphs. As a corollary, we substantially improve the approximation ratio of MRG. Furthermore, we generalize the RDO algorithm to the edge-weighted case and prove that it achieves a 0.501 approximation ratio. This result solves the open question of Chan et al. (SICOMP 2018) about the existence of an algorithm that beats greedy in this setting. As a corollary, it also solves the open questions of Gamlath et al. (SODA 2019) in the stochastic setting.
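For concreteness, here is a hedged sketch of the RDO rule described above; the adjacency-list input format and the "first still-unmatched neighbor" tie-breaking are illustrative assumptions (the analysis allows an arbitrary neighbor choice).

```python
# Hedged sketch of the Random Decision Order (RDO) matching rule.
import random

def rdo_matching(adj):
    """adj: dict mapping each vertex to a list of its neighbors."""
    undecided = set(adj)                        # vertices whose random step has not happened yet
    matching = set()
    while undecided:
        v = random.choice(list(undecided))      # randomly pick an unmatched vertex
        undecided.remove(v)
        for u in adj[v]:                        # "arbitrary" neighbor: first still-unmatched one
            if u in undecided:
                matching.add((min(u, v), max(u, v)))
                undecided.remove(u)
                break
    return matching
```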
Suppose that we are given an arbitrary graph G=(V, E) and know that each edge in E is going to be realized independently with some probability p. The goal in the stochastic matching problem is to pick a sparse subgraph Q of G such that the realized edges in Q, in expectation, include a matching that is approximately as large as the maximum matching among the realized edges of G. The maximum degree of Q can depend on p, but not on the size of G. This problem has been subject to extensive study over the years, and the approximation factor has been improved gradually from 0.5 to, eventually, 2/3, which is a known barrier. In this work, we analyze a natural sampling-based algorithm and show that it can obtain a (1−ε) approximation, for any constant ε > 0. A key component of our analysis, possibly of independent interest, is an algorithm that constructs a matching on a stochastic graph which, among other important properties, guarantees that each vertex is matched independently of the vertices that are sufficiently far away. This allows us to bypass a previously known barrier towards achieving a (1−ε) approximation based on the existence of dense Ruzsa-Szemerédi graphs.
We consider the (weighted) Paging with Time Windows problem, which is identical to the classical weighted paging problem but where each page request only needs to be served by a given deadline. This problem arises in many practical applications of online caching, such as the deadline I/O scheduler in the Linux kernel and video-on-demand streaming. From a theoretical perspective, this generalizes the caching problem to allow delayed service, a line of work that has recently gained traction in online algorithms (e.g., Emek et al. STOC '16, Azar et al. STOC '17, Azar and Touitou FOCS '19, etc.).
Our main result is an O(log k log n)-competitive algorithm for the Paging with Time Windows problem on n pages with a cache of size k. This significantly improves on the previous best bound of O(k) (Azar et al., STOC '17). We also consider the offline version of this problem, for which we give an O(1) approximation algorithm and prove APX-hardness. These are the first results for the offline problem; even NP-hardness was not known before our work.
At the heart of our algorithms is a novel hitting-set LP relaxation of this problem that overcomes the Ω(k) integrality gap of the natural LP for the problem. To the best of our knowledge, this is the first example of an LP-based algorithm for an online problem with delays/deadlines.
We consider an online vector balancing question where T vectors, chosen from an arbitrary distribution over [−1,1]^n, arrive one-by-one and must be immediately given a ± sign. The goal is to keep the discrepancy—the ℓ∞-norm of any signed prefix-sum—as small as possible. A concrete example of this question is the online interval discrepancy problem where T points are sampled one-by-one uniformly in the unit interval [0,1], and the goal is to immediately color them ± such that every sub-interval remains always nearly balanced. As random coloring incurs Ω(T^{1/2}) discrepancy, while the worst-case offline bounds are Θ(√(n log(T/n))) for vector balancing and 1 for interval balancing, a natural question is whether one can (nearly) match the offline bounds in the online setting for these problems. One must utilize the stochasticity, since in the worst case it is known that the discrepancy is Ω(T^{1/2}) for any online algorithm.
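As a concrete illustration of the setup (not of the framework developed in this paper), the following sketch implements a naive online signer that greedily picks the sign minimizing the current ℓ∞-norm of the prefix-sum, run on a small random instance; the input distribution and parameters are arbitrary assumptions.

```python
# Illustration of online vector balancing with a naive greedy signer (baseline only;
# this is not the algorithm from the paper). Vectors arrive one by one from [-1,1]^n
# and each must immediately receive a sign.
import random

def greedy_online_signs(vectors):
    n = len(vectors[0])
    prefix = [0.0] * n
    signs, disc = [], 0.0
    for v in vectors:
        best_sign, best_norm = +1, None
        for s in (+1, -1):                            # try both signs, keep the better one
            cand = max(abs(prefix[i] + s * v[i]) for i in range(n))
            if best_norm is None or cand < best_norm:
                best_sign, best_norm = s, cand
        signs.append(best_sign)
        prefix = [prefix[i] + best_sign * v[i] for i in range(n)]
        disc = max(disc, max(abs(x) for x in prefix))  # worst l_infinity norm over all prefixes
    return signs, disc

T, n = 1000, 5
vecs = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(T)]
print(greedy_online_signs(vecs)[1])
```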
In a special case of online vector balancing, Bansal and Spencer [BS19] recently showed an O(√n log T) bound when each coordinate is chosen independently. When there are dependencies among the coordinates, as in the interval discrepancy problem, the problem becomes much more challenging, as evidenced by a recent work of Jiang, Kulkarni, and Singla [JKS19] that gives a non-trivial O(T^{1/log log T}) bound for online interval discrepancy. Although this beats random coloring, it is still far from the offline bound.
In this work, we introduce a new framework that allows us to handle online vector balancing even when the input distribution has dependencies across coordinates. In particular, this lets us obtain a poly(n, log T) bound for online vector balancing under arbitrary input distributions, and a polylog(T) bound for online interval discrepancy. Our framework is powerful enough to capture other well-studied geometric discrepancy problems; e.g., we obtain a poly(log^d(T)) bound for the online d-dimensional Tusnády’s problem. All our bounds are tight up to polynomial factors.
A key new technical ingredient in our work is an anti-concentration inequality for sums of pairwise uncorrelated random variables, which might also be of independent interest.
Mehta and Panigrahi (FOCS 2012) introduced the problem of online matching with stochastic rewards, where edges are associated with success probabilities and a match succeeds with the probability of the corresponding edge. It is one of the few online matching problems that have defied the randomized online primal dual framework of Devanur, Jain, and Kleinberg (SODA 2013) thus far. This paper unlocks the power of randomized online primal dual in online matching with stochastic rewards by employing the configuration linear program rather than the standard matching linear program used in previous works. Our main result is a 0.572-competitive algorithm for the case of vanishing and unequal probabilities, improving the best previous bound of 0.534 by Mehta, Waggoner, and Zadimoghaddam (SODA 2015); in fact, it is even better than the best previous bound of 0.567 by Mehta and Panigrahi (FOCS 2012) for the more restricted case of vanishing and equal probabilities. For vanishing and equal probabilities, we get a better competitive ratio of 0.576. Our results further generalize to the vertex-weighted case due to the intrinsic robustness of the randomized online primal dual analysis.
We study the resource augmented version of the k-server problem, also known as the k-server problem against weak adversaries or the (h,k)-server problem. In this setting, an online algorithm using k servers is compared to an offline algorithm using h servers, where h ≤ k. For uniform metrics, it has been known since the seminal work of Sleator and Tarjan (1985) that for any ε > 0, the competitive ratio drops to a constant if k = (1+ε)·h. This result was later generalized to weighted stars (Young 1994) and trees of bounded depth (Bansal et al. 2017). The main open problem for this setting is whether a similar phenomenon occurs on general metrics. We resolve this question negatively. With a simple recursive construction, we show that the competitive ratio is at least Ω(log log h), even as k → ∞. Our lower bound holds for both deterministic and randomized algorithms. It also disproves the existence of a competitive algorithm for the infinite server problem on general metrics.
We give a pseudorandom generator that fools degree-d polynomial threshold functions over n-dimensional Gaussian space with seed length d^{O(log d)} · log n. All previous generators had a seed length with at least a 2^d dependence on d.
The key new ingredient is our Local Hyperconcentration Theorem, which shows that every degree-d Gaussian polynomial is hyperconcentrated almost everywhere at scale d^{−O(log d)}.
Randomness extraction is a fundamental problem that has been studied for over three decades. A well-studied setting assumes that one has access to multiple independent weak random sources, each with some entropy. However, this assumption is often unrealistic in practice. In real life, natural sources of randomness can produce samples with no entropy at all or with unwanted dependence. Motivated by this and applications from cryptography, we initiate a systematic study of randomness extraction for the class of adversarial sources defined as follows.
A weak source X of the form X_1, …, X_N, where each X_i is on n bits, is an (N,K,n,k)-source of locality d if the following hold: (1) Somewhere good sources: at least K of the X_i’s are independent, and each contains min-entropy at least k. We call these X_i’s good sources, and their locations are unknown. (2) Bounded dependence: each remaining (bad) source can depend arbitrarily on at most d good sources.
We focus on constructing extractors with negligible error, in the regime where most of the entropy is contained within a few sources instead of across many (i.e., k is at least polynomial in K). In this setting, even for the case of 0-locality, very little is known prior to our work. For d=1, essentially no previous results are known. We present various new extractors for adversarial sources in a wide range of parameters, and some of our constructions work for locality d = K^{Ω(1)}. As an application, we also give improved extractors for small-space sources.
The class of adversarial sources generalizes several previously studied classes of sources, and our explicit extractor constructions exploit tools from recent advances in extractor machinery, such as two-source non-malleable extractors and low-error condensers. Thus, our constructions can be viewed as a new application of non-malleable extractors. In addition, our constructions combine the tools from extractor theory in a novel way through various sorts of explicit extremal hypergraphs. These connections leverage recent progress in combinatorics, such as improved bounds on cap sets and explicit constructions of Ramsey graphs, and may be of independent interest.
The motivation of this work is to extend the techniques of higher order random walks on simplicial complexes to analyze mixing times of Markov chains for combinatorial problems. Our main result is a sharp upper bound on the second eigenvalue of the down-up walk on a pure simplicial complex, in terms of the second eigenvalues of its links. We show some applications of this result in analyzing mixing times of Markov chains, including sampling independent sets of a graph and sampling common independent sets of two partition matroids.
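As one concrete instance of the chain being analyzed, here is a hedged sketch of a single step of the down-up walk on size-k independent sets of a graph (one of the applications mentioned above): drop a uniformly random element of the current independent set, then add back a uniformly random vertex that keeps the set independent of size k. The graph representation is an assumption, and the spectral analysis itself is not reproduced.

```python
# One step of the down-up walk on size-k independent sets (illustration only).
import random

def down_up_step(adj, ind_set):
    """adj: dict vertex -> set of neighbors; ind_set: an independent set of some fixed size k."""
    S = set(ind_set)
    S.remove(random.choice(list(S)))              # "down": drop a uniformly random element
    candidates = [v for v in adj
                  if v not in S and not (adj[v] & S)]   # vertices that keep S independent
    S.add(random.choice(candidates))              # "up": add a uniformly random valid vertex
    return S
```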
Motivated by the Dikin walk, we develop aspects of the interior-point theory for sampling in high dimension. Specifically, we introduce the notions of strong self-concordance and symmetry for a barrier. These properties imply that the Dikin walk defined using a strongly self-concordant barrier with symmetry parameter ν̄ mixes in Õ(n·ν̄) steps from a warm start for a convex body in ℝ^n. For many natural barriers, ν̄ is roughly bounded by ν, the standard self-concordance parameter. We also show that these properties hold for the Lee-Sidford barrier. As a consequence, we obtain the first walk that mixes in Õ(n^2) steps for an arbitrary polytope in ℝ^n. Strong self-concordance for other barriers leads to an interesting (and unexpected) connection: for the universal and entropic barriers, it is implied by the KLS conjecture.
A longstanding observation, which was partially proven by Li, Nguyen, and Woodruff in 2014, and extended by Ai, Hu, Li, and Woodruff in 2016, is that any turnstile streaming algorithm can be implemented as a linear sketch (the reverse is trivially true). We study the relationship between turnstile streaming and linear sketching algorithms in more detail, giving both new separations and new equivalences between the two models.
It was shown by Li, Nguyen, and Woodruff in 2014 that, if a turnstile algorithm works for arbitrarily long streams with arbitrarily large coordinates at intermediate stages of the stream, then the turnstile algorithm is equivalent to a linear sketch. We show separations of the opposite form: if either the stream length or the maximum value of the stream are substantially restricted, there exist problems where linear sketching is exponentially harder than turnstile streaming.
A further limitation of the Li, Nguyen, and Woodruff equivalence is that the turnstile sketching algorithm is neither explicit nor uniform, but requires an exponentially long advice string. We show how to remove this limitation for deterministic streaming algorithms: we give an explicit small-space algorithm that takes the streaming algorithm and computes an equivalent module.
Consider the following abstract coin tossing problem: Given a set of n coins with unknown biases, find the most biased coin using a minimal number of coin tosses. This is a common abstraction of various exploration problems in theoretical computer science and machine learning and has been studied extensively over the years. In particular, algorithms with optimal sample complexity (number of coin tosses) have been known for this problem for quite some time.
Motivated by applications to processing massive datasets, we study the space complexity of solving this problem with optimal number of coin tosses in the streaming model. In this model, the coins are arriving one by one and the algorithm is only allowed to store a limited number of coins at any point – any coin not present in the memory is lost and can no longer be tossed or compared to arriving coins. Prior algorithms for the coin tossing problem with optimal sample complexity are based on iterative elimination of coins which inherently require storing all the coins, leading to memory-inefficient streaming algorithms.
We remedy this state of affairs by presenting a series of improved streaming algorithms for this problem: we start with a simple algorithm which requires storing only O(log n) coins and then iteratively refine it further and further, leading to algorithms with O(log log n) memory, O(log*(n)) memory, and finally one that only stores a single extra coin in memory – the same exact space needed to just store the best coin throughout the stream.
Furthermore, we extend our algorithms to the problem of finding the k most biased coins, as well as other exploration problems such as finding top-k elements using noisy comparisons or finding an ε-best arm in stochastic multi-armed bandits, and obtain efficient streaming algorithms for these problems.
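To illustrate the streaming model (coins arrive one at a time and only coins kept in memory can be tossed), here is a hedged "champion vs. challenger" baseline that stores a single extra coin; the hidden biases are used only to simulate tosses, the number of duel tosses is an arbitrary assumption, and this baseline is not claimed to achieve the paper's sample-optimal guarantees.

```python
# Hedged baseline for the streaming most-biased-coin problem (illustration of the
# model, not the paper's algorithm). Only the stored champion and the arriving coin
# are ever tossed.
import random

def most_biased_coin_stream(biases, duel_tosses=200):
    """biases: hidden heads-probabilities, observed by the algorithm only through tosses."""
    champion = None                         # (index, hidden bias) of the single stored coin
    for i, p in enumerate(biases):
        if champion is None:
            champion = (i, p)
            continue
        heads_new = sum(random.random() < p for _ in range(duel_tosses))
        heads_old = sum(random.random() < champion[1] for _ in range(duel_tosses))
        if heads_new > heads_old:           # keep whichever coin won the duel
            champion = (i, p)
    return champion[0]

print(most_biased_coin_stream([random.random() for _ in range(100)]))
```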
Adaptive sampling is a useful algorithmic tool for data summarization problems in the classical centralized setting, where the entire dataset is available to the single processor performing the computation. Adaptive sampling repeatedly selects rows of an underlying matrix A ∈ ℝ^{n×d}, where n ≫ d, with probabilities proportional to their distances to the subspace of the previously selected rows. Intuitively, adaptive sampling seems to be limited to trivial multi-pass algorithms in the streaming model of computation due to its inherently sequential nature of assigning sampling probabilities to each row only after the previous iteration is completed. Surprisingly, we show this is not the case by giving the first one-pass algorithms for adaptive sampling on turnstile streams, using space poly(d, k, log n), where k is the number of adaptive sampling rounds to be performed.
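For reference, here is a hedged sketch of the classical offline adaptive sampling primitive described above (not the one-pass turnstile implementation that is the paper's contribution). numpy is used for the linear algebra, and sampling proportional to the (unsquared) residual distances follows the phrasing above; squared distances are a common variant.

```python
# Offline adaptive sampling sketch: each round picks a row with probability
# proportional to its distance to the span of the previously selected rows.
import numpy as np

def adaptive_sampling(A, k, rng=np.random.default_rng(0)):
    """A: n x d matrix (n >> d); k: number of rounds. Returns indices of selected rows."""
    selected = []
    for _ in range(k):
        if selected:
            Q, _ = np.linalg.qr(A[selected].T)       # orthonormal basis of the selected rows' span
            residual = A - (A @ Q) @ Q.T             # component of each row outside that span
        else:
            residual = A
        dist = np.linalg.norm(residual, axis=1)
        if dist.sum() == 0:                          # every row already lies in the span
            break
        selected.append(int(rng.choice(len(A), p=dist / dist.sum())))
    return selected

A = np.random.default_rng(1).standard_normal((1000, 10))
print(adaptive_sampling(A, k=3))
```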
Our adaptive sampling procedure has a number of applications to various data summarization problems that either improve state-of-the-art or have only been previously studied in the more relaxed row-arrival model. We give the first relative-error algorithm for column subset selection on turnstile streams. We show our adaptive sampling algorithm also gives the first relative-error algorithm for subspace approximation on turnstile streams that returns k noisy rows of A. The quality of the output can be improved to a (1+ε)-approximation at the tradeoff of a bicriteria algorithm that outputs a larger number of rows. We then give the first algorithm for projective clustering on turnstile streams that uses space sublinear in n. In fact, we use space poly(d, k, s, 1/ε, log n) to output a (1+ε)-approximation, where s is the number of k-dimensional subspaces. Our adaptive sampling primitive also provides the first algorithm for volume maximization on turnstile streams. We complement our volume maximization algorithmic results with lower bounds that are tight up to lower order terms, even for multi-pass algorithms. By a similar construction, we also obtain lower bounds for volume maximization in the row-arrival model, which we match with competitive upper bounds.
Previous work on tabulation hashing by Pǎtraşcu and Thorup from STOC’11 on simple tabulation and from SODA’13 on twisted tabulation offered Chernoff-style concentration bounds on hash based sums, e.g., the number of balls/keys hashing to a given bin, but under some quite severe restrictions on the expected values of these sums. The basic idea in tabulation hashing is to view a key as consisting of c=O(1) characters, e.g., a 64-bit key as c=8 characters of 8-bits. The character domain Σ should be small enough that character tables of size |Σ| fit in fast cache. The schemes then use O(1) tables of this size, so the space of tabulation hashing is O(|Σ|). However, the concentration bounds by Pǎtraşcu and Thorup only apply if the expected sums are ≪ |Σ|.
To see the problem, consider the very simple case where we use tabulation hashing to throw n balls into m bins and want to analyse the number of balls in a given bin. With their concentration bounds, we are fine if n=m, for then the expected value is 1. However, if m=2, as when tossing n unbiased coins, the expected value n/2 is ≫ |Σ| for large data sets, e.g., data sets that do not fit in fast cache.
To handle expectations that go beyond the limits of our small space, we need a much more advanced analysis of simple tabulation, plus a new tabulation technique that we call tabulation-permutation hashing which is at most twice as slow as simple tabulation. No other hashing scheme of comparable speed offers similar Chernoff-style concentration bounds.
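For readers unfamiliar with the scheme, here is a minimal sketch of simple tabulation hashing with the parameters used in the example above (64-bit keys viewed as c = 8 characters of 8 bits); the table contents are random, and tabulation-permutation hashing itself is not reproduced here.

```python
# Simple tabulation hashing: split the key into c characters, look each up in its own
# random table, and XOR the results. Parameters follow the example in the text.
import random

C, CHAR_BITS = 8, 8                      # c = 8 characters of 8 bits (64-bit keys)
SIGMA = 1 << CHAR_BITS                   # character domain size |Sigma| = 256
TABLES = [[random.getrandbits(64) for _ in range(SIGMA)] for _ in range(C)]

def simple_tabulation(key):
    h = 0
    for i in range(C):
        char = (key >> (i * CHAR_BITS)) & (SIGMA - 1)   # extract the i-th character
        h ^= TABLES[i][char]                            # XOR of c table lookups
    return h

print(hex(simple_tabulation(0x0123456789ABCDEF)))
```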
The three-in-a-tree problem is to determine if a simple undirected graph contains an induced subgraph which is a tree connecting three given vertices. Based on a beautiful characterization that is proved in more than twenty pages, Chudnovsky and Seymour [Combinatorica 2010] gave the previously only known polynomial-time algorithm, running in O(mn^2) time, to solve the three-in-a-tree problem on an n-vertex m-edge graph. Their three-in-a-tree algorithm has become a critical subroutine in several state-of-the-art graph recognition and detection algorithms.
In this paper we solve the three-in-a-tree problem in O(m log^2 n) time, leading to improved algorithms for recognizing perfect graphs and detecting thetas, pyramids, beetles, and odd and even holes. Our result is based on a new and more constructive characterization than that of Chudnovsky and Seymour. Our new characterization is stronger than the original, and our proof implies a new, simpler proof of the original characterization. The improved characterization accounts for the first factor-n speedup; the remaining improvement is based on dynamic graph algorithms.
We resolve the fine-grained parameterized complexity of detecting and counting small patterns in planar graphs, assuming the Exponential Time Hypothesis (ETH). Given an n-vertex planar graph G and a k-vertex pattern graph P, we compute the number of (induced) copies of P in G in time 2^{Õ(√k)} n^{O(1)} if P is a matching, an independent set, or a connected graph of bounded maximum degree, and in time 2^{O(k/log k)} n^{O(1)} for any pattern P. Our results significantly advance the state of the art of the planar subgraph isomorphism problem: no 2^{o(k)} n^{O(1)}-time algorithms were previously known for counting patterns even in very restricted cases such as independent sets in subgraphs of grids; even for detection, no 2^{O(k/log k)} n^{O(1)}-time algorithms for unrestricted patterns P were previously known; and our running times cannot be improved assuming ETH. Our results are a corollary of the following general result: we compute the number of (induced) copies of P in G in 2^{Õ(√k)}·(σ(P)·n)^{O(1)} time, where σ(P) denotes the number of non-isomorphic separations of P of order Õ(√k).
To obtain the first sub-exponential time algorithms, we introduce a new general technique that we call efficient inclusion-exclusion. This technique allows us to efficiently use hierarchies of separators for counting problems. To resolve the optimal complexity of planar subgraph isomorphism, we provide another technique that we call balanced cycle sparsification. This technique allows us to obtain, from one balanced cycle separator in a planar graph, a family of balanced cycle separators with limited mutual overlap, implying that at least one separator in the family contains few vertices of the pattern copy.
In the Disjoint Paths problem, the input is an undirected graph G on n vertices and a set of k vertex pairs {(s_i, t_i)}_{i=1}^{k}, and the task is to find k pairwise vertex-disjoint paths such that the i-th path connects s_i to t_i. In this paper, we give a parameterized algorithm with running time 2^{O(k^2)} n^{O(1)} for Planar Disjoint Paths, the variant of the problem where the input graph is required to be planar. Our algorithm is based on the unique linkage/treewidth reduction theorem for planar graphs by Adler et al. [JCTB 2017], the algebraic cohomology-based technique developed by Schrijver [SICOMP 1994] for Disjoint Paths on directed planar graphs, and one of the key combinatorial insights developed by Cygan et al. [FOCS 2013] in their algorithm for Disjoint Paths on directed planar graphs. To the best of our knowledge, our algorithm is the first parameterized algorithm to exploit that the treewidth of the input graph is small in a way completely different from the use of dynamic programming.
In the Topological Minor Deletion (TM-Deletion) problem, the input consists of an undirected graph G, a family of undirected graphs F, and an integer k. The task is to determine whether G contains a set of vertices S of size at most k such that the graph G ∖ S, obtained from G by removing the vertices of S, contains no graph from F as a topological minor. We give an algorithm for TM-Deletion with running time f(h⋆, k)·|V(G)|^4, where h⋆ is the maximum size of a graph in F and f is a computable function of h⋆ and k. This is the first fixed-parameter tractable (FPT) algorithm for the problem. In fact, even for the restricted case of planar inputs the first FPT algorithm was found only recently by Golovach et al. [SODA 2020]. For this case we improve upon the algorithm of Golovach et al. [SODA 2020] by designing an FPT algorithm with explicit dependence on k and h⋆.
We prove that for all constants a, NQP = NTIME[n^{polylog(n)}] cannot be (1/2 + 2^{−log^a n})-approximated by 2^{log^a n}-size ACC^0 ∘ THR circuits (ACC^0 circuits with a bottom layer of THR gates). Previously, it was even open whether E^{NP} can be (1/2 + 1/√n)-approximated by AC^0[⊕] circuits. As a straightforward application, we obtain an infinitely often (NE ∩ coNE)/1-computable pseudorandom generator for poly-size ACC^0 circuits with seed length 2^{log^ε n}, for all ε > 0.
More generally, we establish a connection showing that, for a typical circuit class C, non-trivial nondeterministic algorithms estimating the acceptance probability of a given S-size C circuit with an additive error 1/S (we call this a CAPP algorithm) imply strong (1/2 + 1/n^{ω(1)}) average-case lower bounds for nondeterministic time classes against C circuits. Note that the existence of such (deterministic) algorithms is much weaker than the widely believed conjecture PromiseBPP = PromiseP.
We also apply our results to several sub-classes of TC^0 circuits. First, we show that for all k, NP cannot be (1/2 + n^{−k})-approximated by n^k-size Sum ∘ THR circuits (exact ℝ-linear combinations of threshold gates), improving the corresponding worst-case result in [Williams, CCC 2018]. Second, we establish strong average-case lower bounds and build (NE ∩ coNE)/1-computable PRGs for Sum ∘ PTF circuits, for various regimes of degrees. Third, we show that non-trivial CAPP algorithms for MAJ ∘ MAJ indeed already imply worst-case lower bounds for TC^0_3 (MAJ ∘ MAJ ∘ MAJ). Since exponential lower bounds for MAJ ∘ MAJ are already known, this suggests that TC^0_3 lower bounds are probably within reach.
Our new results build on a line of recent works, including [Murray and Williams, STOC 2018], [Chen and Williams, CCC 2019], and [Chen, FOCS 2019]. In particular, they strengthen the corresponding (1/2 + 1/polylog(n))-inapproximability average-case lower bounds of [Chen, FOCS 2019].
The two important technical ingredients are techniques from Cryptography in NC^0 [Applebaum et al., SICOMP 2006] and Probabilistically Checkable Proofs of Proximity with NC^1-computable proofs.
We establish several “sharp threshold” results for computational complexity. For certain tasks, we can prove a resource lower bound of n^c for c ≥ 1 (or obtain an efficient circuit-analysis algorithm for n^c size); there is strong intuition that a similar result can be proved for larger functions of n; yet we can also prove that replacing “n^c” with “n^{c+ε}” in our results, for any ε > 0, would imply a breakthrough n^{ω(1)} lower bound.
We first establish such a result for Hardness Magnification. We prove (among other results) that for some c, the Minimum Circuit Size Problem for (log n)^c-size circuits on length-n truth tables (MCSP[(log n)^c]) does not have n^{2−o(1)}-size probabilistic formulas. We also prove that an n^{2+ε} lower bound for MCSP[(log n)^c] (for any ε > 0 and c ≥ 1) would imply major lower bound results, such as NP does not have n^k-size formulas for all k, and #SAT does not have log-depth circuits. Similar results hold for time-bounded Kolmogorov complexity. Note that cubic size lower bounds are known for probabilistic De Morgan formulas (for other functions).
Next we show a sharp threshold for Quantified Derandomization (QD) of probabilistic formulas: (a) For all α, ε > 0, there is a deterministic polynomial-time algorithm that finds satisfying assignments to every probabilistic formula of n^{2−2α−ε} size with at most 2^{n^α} falsifying assignments. (b) If for some α, ε > 0, there is such an algorithm for probabilistic formulas of n^{2−α+ε} size and 2^{n^α} unsatisfying assignments, then a full derandomization of NC^1 follows: a deterministic poly-time algorithm additively approximating the acceptance probability of any polynomial-size formula. Consequently, NP does not have n^k-size formulas, for all k.
Finally we show a sharp threshold result for Explicit Obstructions, inspired by Mulmuley’s notion of explicit obstructions from GCT. An explicit obstruction against S(n)-size formulas is a poly-time algorithm A such that A(1^n) outputs a list {(x_i, f(x_i))}_{i ∈ [poly(n)]} ⊆ {0,1}^n × {0,1}, and every S(n)-size formula F is inconsistent with the (partially defined) function f. We prove that for all ε > 0, there is an explicit obstruction against n^{2−ε}-size formulas, and we prove that there is an explicit obstruction against n^{2+ε}-size formulas for some ε > 0 if and only if there is an explicit obstruction against all polynomial-size formulas. This in turn is equivalent to the statement that E does not have 2^{o(n)}-size formulas, a breakthrough in circuit complexity.
Hegedűs’s lemma is the following combinatorial statement regarding polynomials over finite fields. Over a field F of characteristic p > 0 and for q a power of p, the lemma says that any multilinear polynomial P ∈ F[x_1,…,x_n] of degree less than q that vanishes at all points in {0,1}^n of Hamming weight k ∈ [q, n−q] must also vanish at all points in {0,1}^n of weight k + q. This lemma was used by Hegedűs (2009) to give a solution to Galvin’s problem, an extremal problem about set systems; by Alon, Kumar and Volk (2018) to improve the best-known multilinear circuit lower bounds; and by Hrubeš, Ramamoorthy, Rao and Yehudayoff (2019) to prove optimal lower bounds against depth-2 threshold circuits for computing some symmetric functions. In this paper, we formulate a robust version of Hegedűs’s lemma. Informally, this version says that if a polynomial of degree o(q) vanishes at most points of weight k, then it vanishes at many points of weight k+q. We prove this lemma and give the following three different applications.
1. Degree lower bounds for the coin problem: The δ-Coin Problem is the problem of distinguishing between a coin that is heads with probability (1/2) + δ and a coin that is heads with probability 1/2. We show that over a field of positive (fixed) characteristic, any polynomial that solves the δ-coin problem with error ε must have degree Ω((1/δ)·log(1/ε)), which is tight up to constant factors.
2. Probabilistic degree lower bounds: The probabilistic degree of a Boolean function is the minimum d such that there is a random polynomial of degree d that agrees with the function at each point with high probability. We give tight lower bounds on the probabilistic degree of every symmetric Boolean function over positive (fixed) characteristic. As far as we know, this was not known even for some very simple functions such as unweighted Exact Threshold functions, and constant error.
3. A robust version of the combinatorial result of Hegedűs (2009) mentioned above.
We consider the classical problem of maximizing a monotone submodular function subject to a cardinality constraint, which, due to its numerous applications, has recently been studied in various computational models. We consider a clean multi-player model that lies between the offline and streaming models, and study it from the perspective of one-way communication complexity. Our model captures the streaming setting (by considering a large number of players), and, in addition, two-player approximation results for it translate into the robust setting. We present tight one-way communication complexity results for our model, which, due to the above-mentioned connections, have multiple implications in the data stream and robust settings.
Even for just two players, a prior information-theoretic hardness result implies that no approximation factor above 1/2 can be achieved in our model, if only queries to feasible sets, i.e., sets respecting the cardinality constraint, are allowed. We show that the possibility of querying infeasible sets can actually be exploited to beat this bound, by presenting a tight 2/3-approximation taking exponential time, and an efficient 0.514-approximation. To the best of our knowledge, this is the first example where querying a submodular function on infeasible sets leads to provably better results. Through the above-mentioned link to the robust setting, both of these algorithms improve on the current state-of-the-art for robust submodular maximization, showing that approximation factors beyond 1/2 are possible. Moreover, exploiting the link of our model to streaming, we settle the approximability for streaming algorithms by presenting a tight 1/2+ε hardness result, based on the construction of a new family of coverage functions. This improves on a prior 1−1/e+ε hardness and matches, up to an arbitrarily small margin, the best known approximation algorithm.
We present the first distance sensitivity oracle (DSO) with subcubic preprocessing time and poly-logarithmic query time for directed graphs with integer weights in the range [−M,M].
Weimann and Yuster [FOCS 10] presented a distance sensitivity oracle for a single vertex/edge failure with subcubic preprocessing time of O(Mn^{ω+1−α}) and subquadratic query time of Õ(n^{1+α}), where α is any parameter in [0,1], n is the number of vertices, m is the number of edges, the Õ(·) notation hides poly-logarithmic factors in n, and ω < 2.373 is the matrix multiplication exponent.
Later, Grandoni and Vassilevska Williams [FOCS 12] substantially improved the query time to sublinear in n. In particular, they presented a distance sensitivity oracle for a single vertex/edge failure with Õ(Mn^{ω+1/2} + Mn^{ω+α(4−ω)}) preprocessing time and Õ(n^{1−α}) query time.
Despite the substantial improvement in the query time, it still remains polynomial in the size of the graph, which may be undesirable in many settings where the graph is of large scale. A natural question is whether one can hope for a distance sensitivity oracle with subcubic preprocessing time and very fast (poly-logarithmic in n) query time.
In this paper we answer this question affirmatively by presenting a distance sensitivity oracle supporting a single vertex/edge failure with subcubic Õ(Mn^{2.873}) preprocessing time for ω = 2.373, Õ(n^{2.5}) space, and near-optimal query time of Õ(1).
For comparison, with the same Õ(Mn^{2.873}) preprocessing time, the DSO of Grandoni and Vassilevska Williams has Õ(n^{0.693}) query time. In fact, the best query time their algorithm can obtain is Õ(Mn^{0.385}) (with Õ(Mn^3) preprocessing time).
Given a set S of n (distinct) keys from key space [U], each associated with a value from Σ, the static dictionary problem asks to preprocess these (key, value) pairs into a data structure, supporting value-retrieval queries: for any given x ∈ [U], valRet(x) must return the value associated with x if x ∈ S, or return ⊥ if x ∉ S. The special case where |Σ|=1 is called the membership problem. The “textbook” solution is to use a hash table, which occupies linear space and answers each query in constant time. On the other hand, the minimum possible space to encode all (key, value) pairs is only OPT := ⌈log_2 (U choose n) + n·log_2|Σ|⌉ bits, which could be much less.
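As a small worked example of the bound OPT just defined (with U, n, and |Σ| chosen arbitrarily for illustration), the following snippet compares OPT to the naive encoding that stores every (key, value) pair verbatim.

```python
# Worked example for OPT = ceil(log2(binom(U, n)) + n*log2(|Sigma|)); the parameters
# below are illustrative assumptions.
from math import comb, log2, ceil

U, n, sigma = 2**32, 10**4, 2
opt = ceil(log2(comb(U, n)) + n * log2(sigma))
naive = n * (32 + 1)                     # 32 bits per key plus 1 bit per value
print(opt, naive)                        # OPT is noticeably smaller than the naive encoding
```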
In this paper, we design a randomized dictionary data structure using OPT + lg n + O(lg lg lg lg lg U) bits of space, and it has expected constant query time, assuming the query algorithm can access an external lookup table of size n^{0.001}. The lookup table depends only on U, n and |Σ|, and not the input. Previously, even for membership queries and U ≤ n^{O(1)}, the best known data structure with constant query time requires OPT + n/lg n bits of space by Pagh (SIAM J. Comput. 2001) and Pǎtraşcu (FOCS 2008); the best known using OPT + n^{0.999} space has query time O(lg n); the only known non-trivial data structure with OPT + n^{0.001} space has O(lg n) query time and requires a lookup table of size ≥ n^{2.99} (!). Our new data structure answers open questions by Pǎtraşcu and Thorup.
Given an integer array A[1..n], the Range Minimum Query problem (RMQ) asks to preprocess A into a data structure, supporting RMQ queries: given a, b ∈ [1,n], return the index i ∈ [a,b] that minimizes A[i], i.e., argmin_{i∈[a,b]} A[i]. This problem has a classic solution using O(n) space and O(1) query time by Gabow, Bentley, Tarjan (STOC, 1984) and Harel, Tarjan (SICOMP, 1984). The best known data structure by Fischer, Heun (SICOMP, 2011) and Navarro, Sadakane (TALG, 2014) uses 2n + n/((log n)/t)^t + Õ(n^{3/4}) bits and answers queries in O(t) time, assuming the word size is w = Θ(log n). In particular, it uses 2n + n/polylog n bits of space as long as the query time is a constant.
In this paper, we prove the first lower bound for this problem, showing that 2n + n/polylog n space is necessary for constant query time. In general, we show that if the data structure has query time O(t), then it must use at least 2n + n/(log n)^{Õ(t^2)} space, in the cell-probe model with word size w = Θ(log n).
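To make the query definition above concrete, here is a simple sparse-table RMQ sketch; it uses O(n log n) words rather than the succinct space discussed in the abstract, so it only illustrates the problem, not the data structures whose space is being bounded (indices below are 0-based for convenience).

```python
# Sparse-table RMQ: table[j][i] holds the argmin of A over the window [i, i + 2^j).
def build_sparse_table(A):
    n = len(A)
    table = [list(range(n))]
    j = 1
    while (1 << j) <= n:
        prev = table[j - 1]
        row = []
        for i in range(n - (1 << j) + 1):
            left, right = prev[i], prev[i + (1 << (j - 1))]
            row.append(left if A[left] <= A[right] else right)
        table.append(row)
        j += 1
    return table

def rmq(A, table, a, b):
    """Return the argmin of A over the inclusive range [a, b]."""
    j = (b - a + 1).bit_length() - 1
    left, right = table[j][a], table[j][b - (1 << j) + 1]
    return left if A[left] <= A[right] else right

A = [3, 1, 4, 1, 5, 9, 2, 6]
T = build_sparse_table(A)
print(rmq(A, T, 2, 6))                   # -> 3, since A[3] = 1 is the minimum of A[2..6]
```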
Given a collection of n points in ℝ^d, the goal of the (k,z)-clustering problem is to find a subset of k “centers” that minimizes the sum of the z-th powers of the Euclidean distance of each point to the closest center. Special cases of the (k,z)-clustering problem include the k-median and k-means problems. Our main result is a unified two-stage importance sampling framework that constructs an ε-coreset for the (k,z)-clustering problem. Compared to the results for (k,z)-clustering in [Feldman and Langberg, STOC 2011], our framework saves an ε^2 d factor in the coreset size. Compared to the results for (k,z)-clustering in [Sohler and Woodruff, FOCS 2018], our framework saves a poly(k) factor in the coreset size and avoids the exp(k/ε) term in the construction time. Specifically, our coreset for k-median (z=1) has size Õ(ε^{−4} k) which, when compared to the result in [Sohler and Woodruff, FOCS 2018], saves a k factor in the coreset size. Our algorithmic results rely on a new dimensionality reduction technique that connects two well-known shape-fitting problems, subspace approximation and clustering, and may be of independent interest. We also provide a size lower bound of Ω(k · min{2^{z/20}, d}) for a 0.01-coreset for (k,z)-clustering, which has a linear dependence on k and an exponential dependence on z, matching our algorithmic results.