We investigate the connection between the complexity of nonlocal games and the arithmetical hierarchy, a classification of languages according to the complexity of arithmetical formulas defining them. It was recently shown by Ji, Natarajan, Vidick, Wright and Yuen that deciding whether the (finite-dimensional) quantum value of a nonlocal game is 1 or at most 1/2 is complete for the class Σ1 (i.e., RE). A result of Slofstra implies that deciding whether the commuting operator value of a nonlocal game is equal to 1 is complete for the class Π1 (i.e., coRE).
We prove that deciding whether the quantum value of a two-player nonlocal game is exactly equal to 1 is complete for Π2; this class is in the second level of the arithmetical hierarchy and corresponds to formulas of the form “∀ x ∃ y φ(x,y)”. This shows that exactly computing the quantum value is strictly harder than approximating it, and also strictly harder than computing the commuting operator value (either exactly or approximately).
We explain how results about the complexity of nonlocal games all follow in a unified manner from a technique known as compression. At the core of our Π2-completeness result is a new “gapless” compression theorem that holds for both quantum and commuting operator strategies. All previous works only study the setting of finite-dimensional quantum strategies; ours is the first to study compression of games in the commuting operator setting. Our compression theorem yields as a byproduct an alternative proof of Slofstra’s result that the set of quantum correlations is not closed. We also show how a “gap-preserving” compression theorem for commuting operator strategies would imply that approximating the commuting operator value is complete for Π1.
We prove that the entanglement entropy of the ground state of a locally gapped frustration-free 2D lattice spin system satisfies an area law with respect to a vertical bipartition of the lattice into left and right regions. We first establish that the ground state projector of any locally gapped frustration-free 1D spin system can be approximated to within error є by a degree O(√nlog(є−1)) multivariate polynomial in the interaction terms of the Hamiltonian. This generalizes the optimal bound on the approximate degree of the boolean AND function, which corresponds to the special case of commuting Hamiltonian terms. For 2D spin systems we then construct an approximate ground state projector (AGSP) that employs the optimal 1D approximation in the vicinity of the boundary of the bipartition of interest. This AGSP has sufficiently low entanglement and error to establish the area law using a known technique.
The Quantum Singular Value Transformation (QSVT) is a recent technique that gives a unified framework to describe most quantum algorithms discovered so far, and may lead to the development of novel quantum algorithms. In this paper we investigate the hardness of classically simulating the QSVT. A recent result by Chia, Gilyén, Li, Lin, Tang and Wang (STOC 2020) showed that the QSVT can be efficiently “dequantized” for low-rank matrices, and discussed its implications for quantum machine learning. In this work, motivated by establishing the superiority of quantum algorithms for quantum chemistry and making progress on the quantum PCP conjecture, we focus on the other main class of matrices considered in applications of the QSVT, sparse matrices. We first show how to efficiently “dequantize”, with arbitrarily small constant precision, the QSVT associated with a low-degree polynomial. We apply this technique to design classical algorithms that estimate, with constant precision, the singular values of a sparse matrix. We show in particular that a central computational problem considered by quantum algorithms for quantum chemistry (estimating the ground state energy of a local Hamiltonian when given, as an additional input, a state sufficiently close to the ground state) can be solved efficiently with constant precision on a classical computer. As a complementary result, we prove that with inverse-polynomial precision, the same problem becomes BQP-complete. This gives theoretical evidence for the superiority of quantum algorithms for chemistry, and strongly suggests that said superiority stems from the improved precision achievable in the quantum setting. We also discuss how this dequantization technique may help make progress on the central quantum PCP conjecture.
We propose the first near-optimal quantum algorithm for estimating in Euclidean norm the mean of a vector-valued random variable with finite mean and covariance. Our result aims at extending the theory of multivariate sub-Gaussian estimators [Lugosi and Mendelson, 2019] to the quantum setting. Unlike in the classical setting, where any univariate estimator can be turned into a multivariate estimator with at most a logarithmic overhead in the dimension, no similar result can be proved in the quantum setting. Indeed, [Heinrich, 2004] ruled out the existence of a quantum advantage for the mean estimation problem when the sample complexity is smaller than the dimension. Our main result is to show that, outside this low-precision regime, there does exist a quantum estimator that outperforms any classical estimator. More precisely, we prove that the approximation error can be decreased by a factor of about the square root of the ratio between the dimension and the sample complexity. Our approach is substantially more involved than in the univariate setting, where most quantum estimators rely only on phase estimation. We exploit a variety of additional algorithmic techniques such as linear amplitude amplification, the Bernstein-Vazirani algorithm, and quantum singular value transformation. Our analysis is also deeply rooted in proving concentration inequalities for multivariate truncated statistics. Finally, we describe several applications of our algorithms, notably in measuring the expectation values of commuting observables and in the field of machine learning.
As small quantum computers are becoming available on different physical platforms, a benchmarking task known as cross-platform verification has been proposed that aims to estimate the fidelity of states prepared on two quantum computers. This task is fundamentally distributed, as no quantum communication can be performed between the two physical platforms due to hardware constraints, which prohibits a joint SWAP test. In this paper we settle the sample complexity of this task across all measurement and communication settings. The essence of the task, which we call distributed quantum inner product estimation, involves two players Alice and Bob who have k copies of unknown states ρ,σ (acting on ℂd) respectively. Their goal is to estimate Tr(ρσ) up to additive error ε∈(0,1), using local quantum operations and classical communication. In the weakest setting where only non-adaptive single-copy measurements and simultaneous message passing are allowed, we show that k=O(max{1/ε2,√d/ε}) copies suffice. This achieves a saving compared to full tomography, which takes Ω(d3) copies with single-copy measurements. Surprisingly, we also show that the sample complexity must be at least Ω(max{1/ε2,√d/ε}), even in the strongest setting where adaptive multi-copy measurements and arbitrary rounds of communication are allowed. This shows that the success achieved by shadow tomography, for sample-efficiently learning the properties of a single system, cannot be generalized to the distributed setting. Furthermore, the fact that the sample complexity remains the same with single and multi-copy measurements contrasts with single-system quantum property testing, which often demonstrates exponential separations in sample complexity between single-copy and multi-copy measurements.
The graphical balls-into-bins process is a generalization of the classical 2-choice balls-into-bins process, where the bins correspond to vertices of an arbitrary underlying graph G. At each time step an edge of G is chosen uniformly at random, and a ball must be assigned to either of the two endpoints of this edge. The standard 2-choice process corresponds to the case of G=Kn.
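For concreteness, here is a minimal Python sketch of the process under the classical greedy rule (place each ball on the lighter endpoint of the sampled edge); the allocation strategy of this paper is more global and is only described in prose below.

```python
import random

def graphical_balls_into_bins(edges, n, num_balls, rng=random.Random(0)):
    """Simulate the graphical 2-choice process on vertex set {0,...,n-1}:
    each ball is offered the two endpoints of a uniformly random edge and,
    under the classical greedy rule used here, goes to the lighter endpoint."""
    load = [0] * n
    for _ in range(num_balls):
        u, v = rng.choice(edges)                 # a uniformly random edge of G
        target = u if load[u] <= load[v] else v  # greedy: lighter endpoint
        load[target] += 1
    return max(load) - min(load)                 # the gap between bin loads

# The standard 2-choice process is recovered when G = K_n; on a cycle the
# greedy rule is conjectured to leave a polynomial gap.
n = 8
cycle = [(i, (i + 1) % n) for i in range(n)]
print(graphical_balls_into_bins(cycle, n, num_balls=1000))
```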
For any k(n)-edge-connected, d(n)-regular graph on n vertices, and any number of balls, we give an allocation strategy that, with high probability, ensures a gap of O((d/k)·log^4 n·loglogn) between the load of any two bins. In particular, this implies polylogarithmic bounds for natural graphs such as cycles and tori, for which the classical greedy allocation strategy is conjectured to have a polynomial gap between the bin loads. For every graph G, we also show an Ω((d/k) + logn) lower bound on the gap achievable by any allocation strategy. This implies that our strategy achieves the optimal gap, up to polylogarithmic factors, for every graph G.
Our allocation algorithm is simple to implement and requires only O(log(n)) time per allocation. It can be viewed as a more global version of the greedy strategy that compares average load on certain fixed sets of vertices, rather than on individual vertices. A key idea is to relate the problem of designing a good allocation strategy to that of finding suitable multi-commodity flows. To this end, we consider Räcke’s cut-based decomposition tree and define certain orthogonal flows on it.
Mahlmann and Schindelhauer (2005) proposed the following simple process, called the flip-chain, for transforming any given connected d-regular graph into a d-regular expander: In each step, a random 3-path abcd is selected, and edges ab and cd are replaced by two new edges ac and bd, provided that ac and bd do not exist already. A motivation for the study of the flip-chain arises in the design of overlay networks, where it is common practice that adjacent nodes periodically exchange random neighbors, to maintain good connectivity properties. It is known that the flip-chain converges to the uniform distribution over connected d-regular graphs, and it is conjectured that an expander graph is obtained after O(ndlogn) steps, w.h.p., where n is the number of vertices. However, the best known upper bound on the number of steps is O(n2d2√logn), and the best bound on the mixing time of the chain is O(n16d36logn).
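For concreteness, a minimal Python sketch of one step of a natural instantiation of the flip-chain, sampling the 3-path by picking a random edge bc and then random neighbors a of b and d of c; the instantiation analyzed below may differ in details.

```python
import random

def flip_step(adj, rng=random.Random(0)):
    """One step of (an instantiation of) the flip-chain on a d-regular graph
    given as a dict of adjacency sets, d >= 2. Sample a 3-path a-b-c-d and
    replace edges ab, cd by ac, bd if those are absent; degrees and
    connectivity are preserved, and an invalid sample is a lazy step."""
    edges = [(u, v) for u in adj for v in adj[u] if u < v]
    b, c = rng.choice(edges)
    if rng.random() < 0.5:                       # orient the middle edge uniformly
        b, c = c, b
    a = rng.choice([x for x in adj[b] if x != c])
    d = rng.choice([x for x in adj[c] if x != b])
    if len({a, b, c, d}) < 4 or c in adj[a] or d in adj[b]:
        return False                             # not a valid flip; do nothing
    adj[a].remove(b); adj[b].remove(a)
    adj[c].remove(d); adj[d].remove(c)
    adj[a].add(c); adj[c].add(a)
    adj[b].add(d); adj[d].add(b)
    return True

adj = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}  # the cycle C6
flip_step(adj)
```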
We provide a new analysis of a natural flip-chain instantiation, which shows that starting from any connected d-regular graph, for d = Ω(log2 n), an expander is obtained after O(ndlog2n) steps, w.h.p. This result is tight within logarithmic factors, and almost matches the conjectured bound. Moreover, it justifies the use of edge flip operations in practice: for any d-regular graph with d = poly(logn), an expander is reached after each vertex participates in at most poly(logn) operations, w.h.p. Our analysis is arguably more elementary than previous approaches. It uses the novel notion of the strain of a cut, a value that depends both on the crossing edges and their adjacent edges. By keeping track of the cut strains, we form a recursive argument that bounds the time before all sets of a given size have large expansion, after all smaller sets have already attained large expansion.
Motivated by display advertising on the internet, the online stochastic matching problem was proposed by Feldman, Mehta, Mirrokni, and Muthukrishnan (FOCS 2009). Consider a stochastic bipartite graph with offline vertices on one side and with i.i.d. online vertices on the other side. The algorithm knows the offline vertices and the distribution of the online vertices in advance. Upon the arrival of each online vertex, its type is realized and the algorithm immediately and irrevocably decides how to match it. In the vertex-weighted version of the problem, each offline vertex is associated with a weight and the goal is to maximize the total weight of the matching.
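To fix the model, here is a minimal Python sketch of the online process with a naive greedy rule (match each arriving vertex to its heaviest free neighbor); it only illustrates the arrival model and is not the algorithm of this paper.

```python
import random

def online_vertex_weighted_greedy(weights, types, type_probs, T, rng=random.Random(0)):
    """Simulate vertex-weighted online stochastic matching with T i.i.d. arrivals:
    weights[i] is the weight of offline vertex i, types[t] is the set of offline
    neighbors of an online vertex of type t (drawn with probability type_probs[t]),
    and each arrival is immediately and irrevocably matched to its heaviest free
    neighbor (or left unmatched)."""
    free = set(range(len(weights)))
    total = 0.0
    for _ in range(T):
        t = rng.choices(range(len(types)), weights=type_probs)[0]  # realized type
        candidates = [i for i in types[t] if i in free]
        if candidates:
            i = max(candidates, key=lambda j: weights[j])
            free.remove(i)
            total += weights[i]
    return total

weights = [3.0, 1.0, 2.0]
types = [{0, 1}, {1, 2}]          # two online types and their offline neighbors
print(online_vertex_weighted_greedy(weights, types, type_probs=[0.5, 0.5], T=4))
```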
In this paper, we generalize the model to allow non-identical online vertices and focus on the fractional version of the vertex-weighted stochastic matching. We design fractional algorithms that are 0.718-competitive and 0.731-competitive for non-i.i.d. arrivals and i.i.d. arrivals respectively. We also prove that no fractional algorithm can achieve a competitive ratio better than 0.75 for non-i.i.d. arrivals. Furthermore, we round our fractional algorithms by applying the recently developed multiway online correlated selection by Gao et al. (FOCS 2021) and achieve 0.666-competitive and 0.704-competitive integral algorithms for non-i.i.d. arrivals and i.i.d. arrivals. Our results for non-i.i.d. arrivals are the first algorithms beating the 1−1/e ≈ 0.632 barrier of the classical adversarial setting. Our 0.704-competitive integral algorithm for i.i.d. arrivals slightly improves the state-of-the-art 0.701-competitive ratio by Huang and Shu (STOC 2021).
We study the power of multiple choices in online stochastic matching. Despite a long line of research, existing algorithms still only consider two choices of offline neighbors for each online vertex because of the technical challenge in analyzing multiple choices. This paper introduces two approaches for designing and analyzing algorithms that use multiple choices. For unweighted and vertex-weighted matching, we adopt the online correlated selection (OCS) technique into the stochastic setting, and improve the competitive ratios to 0.716, from 0.711 and 0.7 respectively. For edge-weighted matching with free disposal, we propose the Top Half Sampling algorithm. We directly characterize the progress of the whole matching instead of individual vertices, through a differential inequality. This improves the competitive ratio to 0.706, breaking the 1−1/e barrier in this setting for the first time in the literature. Finally, for the harder edge-weighted problem without free disposal, we prove that no algorithm can be 0.703-competitive, separating this setting from the aforementioned three.
We give an online algorithm that with high probability computes a (e/(e−1) + o(1))Δ edge coloring on a graph G with maximum degree Δ = ω(logn) under online edge arrivals against oblivious adversaries, making the first progress on the conjecture of Bar-Noy, Motwani, and Naor in this general setting. Our algorithm is based on reducing to a matching problem on locally treelike graphs, and then applying a tree recurrences based approach for arguing correlation decay.
Given a directed graph as input, we show how to efficiently find a shortest (directed, simple) cycle on an even number of vertices. As far as we know, no polynomial-time algorithm was previously known for this problem. In fact, finding any even cycle in a directed graph in polynomial time was open for more than two decades until Robertson, Seymour, and Thomas (Ann. of Math. (2) 1999) and, independently, McCuaig (Electron. J. Combin. 2004; announced jointly at STOC 1997) gave an efficiently testable structural characterisation of even-cycle-free directed graphs.
Methodologically, our algorithm relies on the standard framework of algebraic fingerprinting and randomized polynomial identity testing over a finite field, and in fact relies on a generating polynomial implicit in a paper of Vazirani and Yannakakis (Discrete Appl. Math. 1989) that enumerates weighted cycle covers by the parity of their number of cycles as a difference of a permanent and a determinant polynomial. The need to work with the permanent—known to be #P-hard apart from a very restricted choice of coefficient rings (Valiant, Theoret. Comput. Sci. 1979)—is where our main technical contribution occurs. We design a family of finite commutative rings of characteristic 4 that simultaneously (i) give a nondegenerate representation for the generating polynomial identity via the permanent and the determinant, (ii) support efficient permanent computations by extension of Valiant’s techniques, and (iii) enable emulation of finite-field arithmetic in characteristic 2. Here our work is foreshadowed by that of Björklund and Husfeldt (SIAM J. Comput. 2019), who used a considerably less efficient commutative ring design—in particular, one lacking finite-field emulation—to obtain a polynomial-time algorithm for the shortest two disjoint paths problem in undirected graphs.
Building on work of Gilbert and Tarjan (Numer. Math. 1978) as well as Alon and Yuster (J. ACM 2013), we also show how ideas from the nested dissection technique for solving linear equation systems—introduced by George (SIAM J. Numer. Anal. 1973) for symmetric positive definite real matrices—lead to faster algorithm designs in our present finite-ring randomized context when we have control on the separator structure of the input graph; for example, this happens when the input has bounded genus.
In the minimum k-cut problem, we want to find the minimum number of edges whose deletion breaks the input graph into at least k connected components. The classic algorithm of Karger and Stein runs in Õ(n^{2k−2}) time, and recent, exciting developments have improved the running time to O(n^k). For general, weighted graphs, this is tight assuming popular hardness conjectures.
In this work, we show that perhaps surprisingly, O(n^k) is not the right answer for simple, unweighted graphs. We design an algorithm that runs in time O(n^{(1−є)k+O(1)}) where є>0 is an absolute constant, breaking the natural n^k barrier. This establishes a separation of the two problems in the unweighted and weighted cases.
We give an Õ(m)-time algorithm for the edge connectivity augmentation problem and the closely related edge splitting-off problem. This is optimal up to lower order terms and closes the long line of work on these problems.
A k-vertex connectivity oracle for undirected G is a data structure that, given u,v∈ V(G), reports min{k,κ(u,v)}, where κ(u,v) is the pairwise vertex connectivity between u,v. There are three main measures of efficiency: construction time, query time, and space. Prior work of Izsak and Nutov [Inf. Process. Lett. 2012] shows that a data structure of total size O(knlogn), which can even be encoded as a O(klog3 n)-bit labeling scheme, can answer vertex-connectivity queries in O(klogn) time. The construction time is polynomial, but unspecified.
In this paper we address all three of these complexity measures.
The first is the space consumption. We prove that any k-vertex connectivity oracle requires Ω(kn) bits of space. This answers a long-standing question on the structural complexity of vertex connectivity, and gives a strong separation between the complexity of vertex- and edge-connectivity. Both the data structure of Izsak and Nutov [Inf. Process. Lett. 2012] and the one we present in this work match this lower bound up to polylogarithmic factors.
The second is the query time. We answer queries in O(logn) time, independent of k, improving on the Ω(klogn) time of Izsak and Nutov [Inf. Process. Lett. 2012]. The main idea is to build instances of SetIntersection data structures, with additional structure based on affine planes. This structure allows for optimal query time that is linear in the output size (this evades the general k^{1/2−o(1)} and k^{1−o(1)} lower bounds on SetIntersection under the 3SUM and OMv hypotheses, respectively; Kopelowitz et al. [SODA 2016] and Henzinger et al. [STOC 2015]).
The third is the construction time. We build the data structure in time roughly that of a max-flow computation on a unit-capacity graph, which is m^{4/3+o(1)} using the state-of-the-art algorithm of Kathuria et al. [FOCS 2020]. Max-flow is a natural barrier for many problems that have an all-pairs-min-cut flavor. The main technical contribution here is a fast algorithm for computing a k-bounded version of a Gomory-Hu tree for element connectivity, a notion that generalizes edge and vertex connectivity.
We consider the problem of designing fundamental graph algorithms on the model of Massive Parallel Computation (MPC). The input to the problem is an undirected graph G with n vertices and m edges, and with D being the maximum diameter of any connected component in G. We consider the MPC with low local space, allowing each machine to store only Θ(nδ) words for an arbitrary constant δ>0, and with linear global space (which is the number of machines times the local space available), that is, with optimal utilization.
In a recent breakthrough, Andoni et al. (FOCS’18) and Behnezhad et al. (FOCS’19) designed parallel randomized algorithms that in O(logD + loglogn) rounds on an MPC with low local space determine all connected components of a graph, improving on the classic bound of O(logn) derived from earlier works on PRAM algorithms.
In this paper, we show that asymptotically identical bounds can also be achieved for deterministic algorithms: we present a deterministic MPC low local space algorithm that in O(logD + loglogn) rounds determines connected components of the input graph. Our result matches the complexity of state-of-the-art randomized algorithms for this task. The techniques developed in our paper can also be applied to several related problems, giving new deterministic MPC algorithms for problems like finding a spanning forest, minimum spanning forest, etc.
We complement our upper bounds by extending a recent lower bound for connectivity on an MPC conditioned on the 1-vs-2-cycles conjecture (which requires D ≥ log1+Ω(1)n), by showing a related conditional hardness of Ω(logD) MPC rounds for the entire spectrum of D, covering a particularly interesting range when D ≤ O(logn).
We prove hypercontractive inequalities on high dimensional expanders. As in the settings of the p-biased hypercube, the symmetric group, and the Grassmann scheme, our inequalities are effective for global functions, which are functions that are not significantly affected by a restriction of a small set of coordinates. As applications, we obtain Fourier concentration, small-set expansion, and Kruskal–Katona theorems for high dimensional expanders. Our techniques rely on a new approximate Efron–Stein decomposition for high dimensional link expanders.
Hypercontractivity is one of the most powerful tools in Boolean function analysis. Originally studied over the discrete hypercube, recent years have seen increasing interest in extensions to settings like the p-biased cube, slice, or Grassmannian, where variants of hypercontractivity have found a number of breakthrough applications including the resolution of Khot’s 2-2 Games Conjecture (Khot, Minzer, Safra FOCS 2018). In this work, we develop a new theory of hypercontractivity on high dimensional expanders (HDX), an important class of expanding complexes that has recently seen similarly impressive applications in both coding theory and approximate sampling. Our results lead to a new understanding of the structure of Boolean functions on HDX, including a tight analog of the KKL Theorem and a new characterization of non-expanding sets.
Unlike previous settings satisfying hypercontractivity, HDX can be asymmetric, sparse, and very far from products, which makes the application of traditional proof techniques challenging. We handle these barriers with the introduction of two new tools of independent interest: a new explicit combinatorial Fourier basis for HDX that behaves well under restriction, and a new local-to-global method for analyzing higher moments. Interestingly, unlike analogous second moment methods that apply equally across all types of expanding complexes, our tools rely inherently on simplicial structure. This suggests a new distinction among high dimensional expanders based upon their behavior beyond the second moment.
This is an extended abstract. The full paper may be found at https://arxiv.org/abs/2111.09444.
For a function g∶{0,1}m→{0,1}, a function f∶ {0,1}n→{0,1} is called a g-polymorphism if their actions commute: f(g(row1(Z)),…,g(rown(Z))) = g(f(col1(Z)),…,f(colm(Z))) for all Z∈{0,1}n× m. The function f is called an approximate g-polymorphism if this equality holds with probability close to 1, when Z is sampled uniformly. A pair of functions f0,f1∶ {0,1}n → {0,1} are called a skew g-polymorphism if f0(g(row1(Z)),…,g(rown(Z))) = g(f1(col1(Z)),…,f1(colm(Z))) for all Z∈{0,1}n× m.
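As an illustration of the definition, the following Python sketch empirically estimates the probability that the two sides of the commuting identity agree on a uniformly random matrix Z; the functions AND and XOR below are just sample choices of f and g.

```python
import random
from functools import reduce

def agreement_probability(f, g, n, m, trials=10000, rng=random.Random(0)):
    """Empirically estimate Pr_Z[ f(g(row_1(Z)),...,g(row_n(Z)))
                                  = g(f(col_1(Z)),...,f(col_m(Z))) ]
    for f: {0,1}^n -> {0,1} and g: {0,1}^m -> {0,1}. Probability 1 means f is
    an exact g-polymorphism; probability close to 1, an approximate one."""
    hits = 0
    for _ in range(trials):
        Z = [[rng.randint(0, 1) for _ in range(m)] for _ in range(n)]
        row_values = tuple(g(tuple(Z[i])) for i in range(n))
        col_values = tuple(f(tuple(Z[i][j] for i in range(n))) for j in range(m))
        hits += f(row_values) == g(col_values)
    return hits / trials

AND = lambda x: int(all(x))
XOR = lambda x: reduce(lambda a, b: a ^ b, x)

print(agreement_probability(XOR, XOR, n=5, m=3))  # 1.0: XOR is an exact XOR-polymorphism
print(agreement_probability(AND, AND, n=5, m=3))  # 1.0: AND is an exact AND-polymorphism
```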
We study the structure of exact polymorphisms as well as approximate polymorphisms. Our results include a proof that an approximate polymorphism f must be close to an exact skew polymorphism, and a characterization of exact skew polymorphisms, which shows that besides trivial cases, only the functions AND, XOR, OR, NAND, NOR, XNOR admit non-trivial exact skew polymorphisms.
We also study the approximate polymorphism problem in the list-decoding regime (i.e., when the probability that equality holds is not close to 1, but is only bounded away from some fixed value). We show that if f(x ∧ y) = f(x) ∧ f(y) with probability larger than s∧ ≈ 0.815 then f correlates with some junta, and s∧ is the optimal threshold for this property.
Our results generalize the classical linearity testing result of Blum, Luby and Rubinfeld, which in this language shows that the approximate polymorphisms of g = XOR are close to XORs, as well as a recent result of Filmus, Lifshitz, Minzer and Mossel, showing that the approximate polymorphisms of AND can only be close to AND functions.
We prove that every bounded function f:{−1,1}^n→[−1,1] of degree at most d can be learned with L2-accuracy ε and confidence 1−δ from log(n/δ) · ε^{−d−1} · C^{d^{3/2}√log d} random queries, where C>1 is a universal finite constant.
In a recent work, O’Donnell, Servedio and Tan (STOC 2019) gave explicit pseudorandom generators (PRGs) for arbitrary m-facet polytopes in n variables with seed length poly-logarithmic in m,n, concluding a sequence of works in the last decade that was started by Diakonikolas, Gopalan, Jaiswal, Servedio, Viola (SICOMP 2010) and Meka, Zuckerman (SICOMP 2013) for fooling linear and polynomial threshold functions, respectively. In this work, we consider a natural extension of PRGs for intersections of positive spectrahedra. A positive spectrahedron is a Boolean function f(x)=[x1A1+⋯ +xnAn ≼ B] where the Ais are k× k positive semidefinite matrices. We construct explicit PRGs that δ-fool “regular” width-M positive spectrahedra (i.e., when none of the Ais are dominant) over the Boolean space with seed length poly(logk, logn, M, 1/δ).
Our main technical contributions are the following: We first prove an invariance principle for positive spectrahedra via the well-known Lindeberg method. As far as we are aware, such a generalization of the Lindeberg method was previously unknown. Second, we prove an upper bound on noise sensitivity and a Littlewood-Offord theorem for positive spectrahedra. Using these results, we give applications to constructing PRGs for positive spectrahedra, learning theory, discrepancy sets for positive spectrahedra (over the Boolean cube), and PRGs for intersections of structured polynomial threshold functions.
We study streaming algorithms for two fundamental geometric problems: computing the cost of a Minimum Spanning Tree (MST) of an n-point set X ⊂ {1,2,…,Δ}d, and computing the Earth Mover Distance (EMD) between two multi-sets A,B ⊂ {1,2,…,Δ}d of size n. We consider the turnstile model, where points can be added and removed. We give a one-pass streaming algorithm for MST and a two-pass streaming algorithm for EMD, both achieving an approximation factor of Õ(logn) and using only polylog(n, d, Δ) space. Furthermore, our algorithm for EMD can be compressed to a single pass with a small additive error. Previously, the best known sublinear-space streaming algorithms for either problem achieved an approximation of O(min{logn, log(Δd)}·logn). For MST, we also prove that any constant-space streaming algorithm can only achieve an approximation of Ω(logn), analogous to the Ω(logn) lower bound for EMD.
Our algorithms are based on an improved analysis of a recursive space partitioning method known generically as the Quadtree. Specifically, we show that the Quadtree achieves an Õ(logn) approximation for both EMD and MST, improving on the O(min{ logn , log(Δ d)} logn) approximation.
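For intuition, here is a minimal offline Python sketch of the classical randomly-shifted Quadtree estimate for EMD, which charges every point left unmatched within its grid cell at a given scale roughly the cell side length; the improved Õ(logn) analysis and the small-space streaming implementation of this paper are not reflected in this sketch.

```python
import random
from collections import Counter

def quadtree_emd_estimate(A, B, Delta, d, rng=random.Random(0)):
    """Offline randomly-shifted Quadtree estimate of EMD(A, B) for equal-size
    multisets A, B in {0,...,Delta-1}^d: at each scale, points of A and B in
    the same grid cell are matched, and each point left unmatched at that
    scale is charged roughly the cell side length."""
    shift = [rng.randrange(Delta) for _ in range(d)]
    estimate, side = 0, 1
    while side < 2 * Delta:
        cell = lambda p: tuple((p[j] + shift[j]) // side for j in range(d))
        count_a, count_b = Counter(map(cell, A)), Counter(map(cell, B))
        unmatched = sum(abs(count_a[c] - count_b[c]) for c in count_a | count_b)
        estimate += side * unmatched
        side *= 2
    return estimate

A = [(0, 0), (7, 7)]
B = [(1, 0), (7, 6)]
print(quadtree_emd_estimate(A, B, Delta=8, d=2))  # a coarse estimate of the true EMD (here 2)
```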
Every graph with maximum degree Δ can be colored with (Δ+1) colors using a simple greedy algorithm. Remarkably, recent work has shown that one can find such a coloring even in the semi-streaming model: there exists a randomized algorithm that with high probability finds a (Δ+1)-coloring of the input graph in only O(n·logn) space assuming a single pass over the edges of the graph in any arbitrary order. But, in reality, one almost never needs (Δ+1) colors to properly color a graph. Indeed, the celebrated Brooks’ theorem states that every (connected) graph besides cliques and odd cycles can be colored with Δ colors. Can we find a Δ-coloring in the semi-streaming model as well?
We settle this key question in the affirmative by designing a randomized semi-streaming algorithm that given any graph, with high probability, either correctly declares that the graph is not Δ-colorable or outputs a Δ-coloring of the graph.
The proof of this result starts with a detour. We first (provably) identify the extent to which the previous approaches for streaming coloring fail for Δ-coloring: for instance, all these approaches can handle streams with repeated edges and they can run in o(n2) time – we prove that neither of these tasks is possible for Δ-coloring. These impossibility results however pinpoint exactly what is missing from prior approaches when it comes to Δ-coloring.
We then build on these insights to design a semi-streaming algorithm that uses (i) a novel sparse-recovery approach based on sparse-dense decompositions to (partially) recover the “problematic” subgraphs of the input—the ones that form the basis of our impossibility results—and (ii) a new coloring approach for these subgraphs that allows for recoloring of other vertices in a controlled way without relying on local explorations or finding “augmenting paths” that are generally impossible for semi-streaming algorithms. We believe both these techniques can be of independent interest.
We present a deterministic (1+ε)-approximate maximum matching algorithm in poly(1/ε) passes in the semi-streaming model, solving the long-standing open problem of breaking the exponential barrier in the dependence on 1/ε. Our algorithm exponentially improves on the well-known randomized (1/ε)^{O(1/ε)}-pass algorithm from the seminal work of McGregor [APPROX05], and on the recent deterministic algorithm of Tirodkar with the same pass complexity [FSTTCS18]. Up to polynomial factors in 1/ε, our work matches the state-of-the-art deterministic (logn / loglogn) · (1/ε)-pass algorithm by Ahn and Guha [TOPC18], which is allowed a dependence on the number of nodes n. Our result also makes progress on Open Problem 60 at sublinear.info.
Moreover, we design a general framework that simulates our approach for the streaming setting in other models of computation. This framework requires access to an algorithm computing a maximal matching and an algorithm for processing disjoint poly(1/ε)-size connected components. Instantiating our framework in CONGEST yields a poly(logn, 1/ε)-round algorithm for computing a (1+ε)-approximate maximum matching. In terms of the dependence on 1/ε, this result exponentially improves on the state-of-the-art result by Lotker, Patt-Shamir, and Pettie [LPSP15]. Our framework leads to the same quality of improvement in the context of the Massively Parallel Computation model as well.
Recent breakthroughs in graph streaming have led to the design of semi-streaming algorithms for various graph coloring problems such as (Δ+1)-coloring, degeneracy-coloring, coloring triangle-free graphs, and others. These algorithms are all randomized in crucial ways, and whether or not there is any deterministic analogue of them has remained an important open question in this line of work.
We settle this fundamental question by proving that there is no deterministic single-pass semi-streaming algorithm that given a graph G with maximum degree Δ, can output a proper coloring of G using any number of colors which is sub-exponential in Δ. Our proof is based on analyzing the multi-party communication complexity of a related communication game, using random graph theory type arguments that may be of independent interest.
We complement our lower bound by showing that one extra pass over the input allows one to recover an O(Δ2) coloring via a deterministic semi-streaming algorithm. This result is extended to an O(Δ) coloring in O(logΔ) passes even in dynamic streams.
We consider the approximability of constraint satisfaction problems in the streaming setting. For every constraint satisfaction problem (CSP) on n variables taking values in {0,…,q−1}, we prove that improving over the trivial approximability by a factor of q requires Ω(n) space even on instances with O(n) constraints. We also identify a broad subclass of problems for which any improvement over the trivial approximability requires Ω(n) space. The key technical core is an optimal q^{−(k−1)}-inapproximability for the Max k-LIN-mod q problem, which is the Max CSP problem where every constraint is given by a system of k−1 linear equations mod q over k variables.
Our work builds on and extends the breakthrough work of Kapralov and Krachun (Proc. STOC 2019) who showed a linear lower bound on any non-trivial approximation of the MaxCut problem in graphs. MaxCut corresponds roughly to the case of Max k-LIN-mod q with k=q=2. For general CSPs in the streaming setting, prior results only yielded Ω(√n) space bounds. In particular no linear space lower bound was known for an approximation factor less than 1/2 for any CSP. Extending the work of Kapralov and Krachun to Max k-LIN-mod q with k>2 and q>2 (while getting optimal hardness results) is the main technical contribution of this work. Each one of these extensions provides non-trivial technical challenges that we overcome in this work.
In the Unsplittable Flow on a Path problem (UFP) we are given a path with edge capacities, and a set of tasks where each task is characterized by a subpath, a demand, and a weight. The goal is to select a subset of tasks of maximum total weight such that the total demand of the selected tasks using each edge e is at most the capacity of e. The problem admits a QPTAS [Bansal, Chakrabarti, Epstein, Schieber, STOC'06; Batra, Garg, Kumar, Mömke, Wiese, SODA'15]. After a long sequence of improvements [Bansal, Friggstad, Khandekar, Salavatipour, SODA'09; Bonsma, Schulz, Wiese, FOCS'11; Anagnostopoulos, Grandoni, Leonardi, Wiese, SODA'14; Grandoni, Mömke, Wiese, Zhou, STOC'18], the best known polynomial time approximation algorithm for UFP has an approximation ratio of 1 + 1/(e+1) + ε < 1.269 [Grandoni, Mömke, Wiese, SODA'22]. It has been an open question whether this problem admits a PTAS. In this paper, we resolve this open question and present a polynomial-time (1+ε)-approximation algorithm for UFP.
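To make the constraint concrete, the following Python sketch merely checks feasibility of a candidate set of tasks and computes its weight; it is a statement of the problem, not of the approximation scheme.

```python
def ufp_value(capacities, tasks, selected):
    """capacities[e] is the capacity of edge e of the path (edges 0,...,m-1).
    Each task is (first_edge, last_edge, demand, weight), its subpath being the
    inclusive edge interval [first_edge, last_edge]. Returns the total weight
    of the selected tasks if every edge capacity is respected, else None."""
    used = [0] * len(capacities)
    for i in selected:
        first, last, demand, _ = tasks[i]
        for e in range(first, last + 1):
            used[e] += demand
    if any(u > c for u, c in zip(used, capacities)):
        return None
    return sum(tasks[i][3] for i in selected)

# Two edges with capacities 3 and 2; selecting tasks 0 and 1 is feasible.
tasks = [(0, 1, 2, 10.0), (0, 0, 1, 4.0), (1, 1, 1, 3.0)]
print(ufp_value([3, 2], tasks, [0, 1]))  # 14.0
```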
We consider the classical Minimum Crossing Number problem: given an n-vertex graph G, compute a drawing of G in the plane, while minimizing the number of crossings between the images of its edges. This is a fundamental and extensively studied problem, whose approximability status is widely open. In all currently known approximation algorithms, the approximation factor depends polynomially on Δ – the maximum vertex degree in G. The best current approximation algorithm achieves an O(n^{1/2−є}·poly(Δ·logn))-approximation, for a small fixed constant є>0, while the best negative result is APX-hardness, leaving a large gap in our understanding of this basic problem. In this paper we design a randomized O(2^{O((logn)^{7/8}·loglogn)}·poly(Δ))-approximation algorithm for Minimum Crossing Number. This is the first approximation algorithm for the problem that achieves a subpolynomial in n approximation factor (albeit only in graphs whose maximum vertex degree is subpolynomial in n).
In order to achieve this approximation factor, we design a new algorithm for a closely related problem called Crossing Number with Rotation System, in which, for every vertex v∈ V(G), the circular ordering in which the images of the edges incident to v must enter the image of v in the drawing is fixed as part of the input. Combining this result with the recent reduction of [Chuzhoy, Mahabadi, Tan ’20] immediately yields the improved approximation algorithm for Minimum Crossing Number.
In the Shortest Superstring problem, we are given a set of strings and we seek a common superstring of minimum length, i.e., a shortest string that contains each input string as a substring. The Shortest Superstring problem is NP-hard and several constant-factor approximation algorithms are known for it. Of particular interest is the GREEDY algorithm, which repeatedly merges two strings of maximum overlap until a single string remains. The GREEDY algorithm, being simpler than other well-performing approximation algorithms for this problem, has attracted attention since the 1980s and is commonly used in practical applications.
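A minimal Python sketch of GREEDY, assuming (as is standard) that no input string is a substring of another; ties in the maximum overlap are broken arbitrarily.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_superstring(strings):
    """Repeatedly merge the two distinct strings of maximum overlap until a
    single string remains."""
    S = list(strings)
    while len(S) > 1:
        best = None                      # (overlap length, index of a, index of b)
        for i, a in enumerate(S):
            for j, b in enumerate(S):
                if i != j:
                    k = overlap(a, b)
                    if best is None or k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        merged = S[i] + S[j][k:]
        S = [s for t, s in enumerate(S) if t not in (i, j)] + [merged]
    return S[0]

print(greedy_superstring(["cde", "abc", "eab"]))  # "cdeabc", a superstring of length 6
```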
Tarhio and Ukkonen (TCS 1988) conjectured that GREEDY gives a 2-approximation. In a seminal work, Blum, Jiang, Li, Tromp, and Yannakakis (STOC 1991) proved that the superstring computed by GREEDY is a 4-approximation, and this upper bound was improved to 3.5 by Kaplan and Shafrir (IPL 2005).
We show that the approximation guarantee of GREEDY is at most (13+√57)/6 ≈ 3.425. Furthermore, we prove that the Shortest Superstring can be approximated within a factor of (37+√57)/18≈ 2.475, improving slightly upon the currently best 2 11/23-approximation algorithm by Mucha (SODA 2013).
We relate discrepancy theory with the classic scheduling problems of minimizing max flow time and total flow time on unrelated machines. Specifically, we give a general reduction that allows us to transfer discrepancy bounds in the prefix Beck-Fiala (bounded ℓ1-norm) setting to bounds on the flow time of an optimal schedule. Combining our reduction with a deep result of Banaszczyk, proved via convex geometry, gives guarantees of O(√logn) and O(√logn·logP) for max flow time and total flow time, respectively, improving upon the previous best guarantees of O(logn) and O(logn·logP). Apart from the improved guarantees, the reduction motivates seemingly easy versions of prefix discrepancy questions: any constant bound on prefix Beck-Fiala where vectors have sparsity two (sparsity one being trivial) would already yield tight guarantees for both max flow time and total flow time. While known techniques solve this case when the entries take values in {−1,0,1}, we show that they are unlikely to transfer to the more general 2-sparse case of bounded ℓ1-norm.
Since the mid 90s, the study of the complexity of classic network design problems such as the traveling salesman problem (TSP), the Steiner tree problem (ST), or the k-MST problem on metric spaces such as low-dimensional Euclidean spaces, doubling metrics, planar or minor-free graphs, has led to major improvements of our understanding of the structure of both these important metric spaces, and the underlying problems.
In their celebrated works, Arora and Mitchell gave quasi-polynomial-time approximation schemes (QPTASs) for several network design problems in Euclidean space, which were later improved to polynomial and in some cases (near-)linear time. Arora and Mitchell’s QPTAS result was extended by Talwar [STOC’04] to doubling metrics, showing that a Euclidean embedding is not a necessary condition for designing approximation schemes for network design problems.
In the case of planar and, more broadly, minor-free graphs, the picture is much less satisfactory. For example, nothing better than the general case is known for k-MST, an open question that has thus been repeatedly asked. Even when approximation schemes are known for the planar case, generalizing them to the minor-free setting is often a major challenge because most of the topological properties are lost. Therefore, one of the most important questions in the field is whether we can bypass these topological arguments to obtain a general framework for network design problems in minor-free graphs, similar to that of Arora, Mitchell, and Talwar.
In this paper, we answer this question in the affirmative by proposing a framework for network design problems in minor-free graphs. We give the first approximation schemes, with quasi-polynomial running time, for k-MST, Steiner tree, and Steiner forest in minor-free graphs. The result on k-MST was not known before even for the planar case, and obtaining approximation schemes for Steiner tree on minor-free graphs has been a major open problem.
A locally testable code (LTC) is an error correcting code that has a property-tester. The tester reads q bits that are randomly chosen, and rejects words with probability proportional to their distance from the code. The parameter q is called the locality of the tester.
LTCs were initially studied as important components of probabilistically checkable proofs (PCP), and since then the topic has evolved on its own. High rate LTCs could be useful in practice: before attempting to decode a received word, one can save time by first quickly testing if it is close to the code.
An outstanding open question has been whether there exist “c^3-LTCs”, namely LTCs with constant rate, constant distance, and constant locality.
In this work we construct such codes based on a new two-dimensional complex which we call a left-right Cayley complex. This is essentially a graph which, in addition to vertices and edges, also has squares. Our codes can be viewed as a two-dimensional version of (the one-dimensional) expander codes, where the codewords are functions on the squares rather than on the edges.
We study classical and quantum LDPC codes of constant rate obtained by the lifted product construction over non-abelian groups. We show that the obtained families of quantum LDPC codes are asymptotically good, which proves the qLDPC conjecture. Moreover, we show that the produced classical LDPC codes are also asymptotically good and locally testable with constant query and soundness parameters, which proves a well-known conjecture in the field of locally testable codes.
We show that any nonzero polynomial in the ideal generated by the r × r minors of an n × n matrix X can be used to efficiently approximate the determinant. Specifically, for any nonzero polynomial f in this ideal, we construct a small depth-three f-oracle circuit that approximates the Θ(r1/3) × Θ(r1/3) determinant in the sense of border complexity. For many classes of algebraic circuits, this implies that every nonzero polynomial in the ideal generated by r × r minors is at least as hard to approximately compute as the Θ(r1/3) × Θ(r1/3) determinant. We also prove an analogous result for the Pfaffian of a 2n × 2n skew-symmetric matrix and the ideal generated by Pfaffians of 2r × 2r principal submatrices.
This answers a recent question of Grochow about complexity in polynomial ideals in the setting of border complexity. Leveraging connections between the complexity of polynomial ideals and other questions in algebraic complexity, our results provide a generic recipe that allows lower bounds for the determinant to be applied to other problems in algebraic complexity. We give several such applications, two of which are highlighted below.
We prove new lower bounds for the Ideal Proof System of Grochow and Pitassi. Specifically, we give super-polynomial lower bounds for refutations computed by low-depth circuits. This extends the recent breakthrough low-depth circuit lower bounds of Limaye et al. to the setting of proof complexity. Moreover, we show that for many natural circuit classes, the approximative proof complexity of our hard instance is governed by the approximative circuit complexity of the determinant.
We also construct new hitting set generators for the closure of low-depth circuits. For any ε > 0, we construct generators with seed length O(nε) that hit n-variate low-depth circuits. Our generators attain a near-optimal tradeoff between their seed length and degree, and are computable by low-depth circuits of near-linear size (with respect to the size of their output). This matches the seed length of the generators recently obtained by Limaye et al., but improves on the degree and circuit complexity of the generator.
Multipoint evaluation is the computational task of evaluating a polynomial given as a list of coefficients at a given set of inputs. Besides being a natural and fundamental question in computer algebra on its own, fast algorithms for this problem are also closely related to fast algorithms for other natural algebraic questions like polynomial factorization and modular composition. And while nearly linear time algorithms have been known for the univariate instance of multipoint evaluation for close to five decades due to a work of Borodin and Moenck, fast algorithms for the multivariate version have been much harder to come by. In a significant improvement to the state of the art for this problem, Umans and Kedlaya & Umans gave nearly linear time algorithms for this problem over fields of small characteristic and over all finite fields, respectively, provided that the number of variables n is at most d^{o(1)}, where the degree of the input polynomial in every variable is less than d. They also stated the question of designing fast algorithms for the large variable case (i.e., when n is not d^{o(1)}) as an open problem.
In this work, we show that there is a deterministic algorithm for multivariate multipoint evaluation over a field Fq of characteristic p which evaluates an n-variate polynomial of degree less than d in each variable on N inputs in time
(N + d^n)^{1 + o(1)} · poly(logq, d, n, p)
provided that p is at most d^{o(1)}, and q is at most exp(exp(⋯(exp(d)))), where the height of this tower of exponentials is any fixed constant. When the number of variables is large (e.g., when n is not d^{o(1)}), this is the first nearly linear time algorithm for this problem over any (large enough) field.
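For contrast, the trivial baseline below evaluates the polynomial term by term, costing on the order of N times the number of monomials in field operations, which is roughly quadratic in the input size when N ≈ d^n; the sketch only pins down the problem statement and is unrelated to the algebraic techniques of this paper.

```python
def multipoint_eval(coeffs, points, p):
    """Naively evaluate an n-variate polynomial over F_p at many points.
    coeffs maps an exponent tuple (e_1,...,e_n) to its coefficient; each point
    is an n-tuple of field elements. This costs O(N * #monomials * n) field
    operations, versus the nearly linear (N + d^n)^{1+o(1)} bound above."""
    results = []
    for point in points:
        value = 0
        for exps, c in coeffs.items():
            term = c
            for x, e in zip(point, exps):
                term = (term * pow(x, e, p)) % p
            value = (value + term) % p
        results.append(value)
    return results

# Example: f(x, y) = 3xy^2 + x + 7 over F_11, evaluated at two points.
f = {(1, 2): 3, (1, 0): 1, (0, 0): 7}
print(multipoint_eval(f, [(2, 5), (0, 0)], p=11))  # [5, 7]
```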
Our algorithm is based on elementary algebraic ideas and this algebraic structure naturally leads to the following two independently interesting applications.
We show that there is an algebraic data structure for univariate polynomial evaluation with nearly linear space complexity and sublinear time complexity over finite fields of small characteristic and quasipolynomially bounded size. This provides a counterexample to a conjecture of Miltersen, who conjectured that over small finite fields, any algebraic data structure for polynomial evaluation using polynomial space must have linear query complexity.
We also show that over finite fields of small characteristic and quasipolynomially bounded size, Vandermonde matrices are not rigid enough to yield size-depth tradeoffs for linear circuits via the current quantitative bounds in Valiant’s program. More precisely, for every fixed prime p, we show that for every constant є > 0, and large enough n, the rank of any n × n Vandermonde matrix V over the field F_{p^a} can be reduced to n/exp(Ω(poly(є)·√logn)) by changing at most n^{Θ(є)} entries in every row of V, provided a ≤ poly(logn). Prior to this work, similar upper bounds on rigidity were known only for special Vandermonde matrices, for instance, Discrete Fourier Transform matrices and Vandermonde matrices with generators in a geometric progression.
An Algebraic Formula for a polynomial P ∈ F[x1,…,xN] over a field F is an algebraic expression for P(x1,…,xN) using variables, field constants, additions and multiplications. Such formulas capture an algebraic analog of the Boolean complexity class NC1. Proving lower bounds against this model is thus an important problem.
It is known that, to prove superpolynomial lower bounds against algebraic formulas, it suffices to prove good enough lower bounds against restricted kinds of formulas known as Set-Multilinear formulas, for computing a polynomial P(x1,...,xN) of degree O(logN/loglogN). In the past, many superpolynomial lower bounds were found, but they are of the form Ω(f(d) poly(N)) (where f is typically a subexponential function) which is insufficient to get lower bounds for general formulas. Recently, the authors proved the first non-FPT lower bounds, i.e., a lower bound of the form NΩ(f(d)), against small-depth set-multilinear formulas (and also for circuits). In this work, we extend this result in two directions.
Large-depth set-multilinear formulas. In the setting of general set-multilinear formulas, we prove a lower bound of (logn)Ω(logd) for computing the Iterated Matrix Multiplication polynomial IMMn,d. In particular, this implies the first superpolynomial lower bound against unbounded-depth set-multilinear formulas computing IMMn,n.
As a corollary, this resolves the homogeneous version of a question of Nisan (asked in 1991) regarding the relative power of Algebraic formulas and Branching programs in the non-commutative setting.
Stronger bounds for homogeneous non-commutative small-depth circuits. In the small-depth homogeneous non-commutative case, we prove a lower bound of n^{d^{1/Δ}/2^{O(Δ)}}, which yields non-FPT bounds for depths up to o(√logd). In comparison, our previous bound works in the harder commutative set-multilinear setting, but only up to depths o(loglogd). Moreover, our lower bound holds for all values of d, as opposed to the previous set-multilinear lower bound, which holds as long as d is small, i.e., d = O(logn).
The question of finding expander graphs with strong vertex expansion properties such as unique neighbor expansion and lossless expansion is central to computer science. A barrier to constructing these is that strong notions of expansion could not be proven via the spectral expansion paradigm.
A very symmetric and structured family of optimal spectral expanders (i.e., Ramanujan graphs) was constructed using number theory by Lubotzky, Phillips and Sarnak, and was subsequently generalized by others. We call such graphs Number Theoretic Ramanujan Graphs. These graphs are not only spectrally optimal, but also possess strong symmetries and rich structure. Thus, it has been widely conjectured that number theoretic Ramanujan graphs are lossless expanders, or at least unique neighbor expanders.
In this work we disprove this conjecture, by showing that there are number theoretic Ramanujan graphs that are not even unique neighbor expanders. This is done by introducing a new combinatorial paradigm that we term the closed orbit method.
The closed orbit method allows one to construct finite combinatorial objects with extremal substructures. This is done by observing that there exist infinite combinatorial structures with extremal substructures, coming from an action of a subgroup of the automorphism group of the structure. The crux of our idea is a systematic way to construct a finite quotient of the infinite structure containing a simple shadow of the infinite substructure, which maintains its extremal combinatorial property.
Other applications of the method are to the edge expansion of number theoretic Ramanujan graphs and vertex expansion of Ramanujan complexes. Finally, in the field of graph quantum ergodicity we produce number theoretic Ramanujan graphs with an eigenfunction of small support that corresponds to the zero eigenvalue. This again contradicts common expectations.
The closed orbit method is based on the well-established idea from dynamics and number theory of studying closed orbits of subgroups. The novelty of this work is in applying this idea to combinatorial questions, and we hope that it will have other applications in the future.
In this paper we consider the Ideal Membership Problem (IMP for short), in which we are given polynomials f0,f1,…,fk and the question is to decide whether f0 belongs to the ideal generated by f1,…,fk. In the more stringent version the task is also to find a proof of this fact. The IMP underlies many proof systems based on polynomials such as Nullstellensatz, Polynomial Calculus, and Sum-of-Squares (SOS). In such applications the IMP usually involves so called combinatorial ideals that arise from a variety of discrete combinatorial problems. This restriction makes the IMP significantly easier and in some cases allows for an efficient solution algorithm.
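As a small illustration of the problem (not of the approach developed in this paper), the following sketch decides membership for a toy instance using SymPy's Gröbner basis routines, assuming its groebner/GroebnerBasis.contains API: f0 lies in the ideal iff its normal form modulo a Gröbner basis of the generators is zero.

```python
# Deciding membership of f0 in the ideal <f1, f2> via a Groebner basis:
# f0 is in the ideal iff its normal form modulo the basis is zero.
from sympy import symbols, groebner

x, y = symbols('x y')
f1 = x**2 + y**2 - 1          # generators of the ideal
f2 = x - y
f0 = 2*y**2 - 1               # candidate polynomial

G = groebner([f1, f2], x, y, order='grevlex')
print(G.contains(f0))          # True; a proof: f0 = f1 - (x + y)*f2
```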
The first part of this paper follows the work of Mastrolilli [SODA 2019] who initiated a systematic study of IMPs arising from Constraint Satisfaction Problems (CSP) of the form CSP(Γ), that is, CSPs in which the type of constraints is limited to relations from a set Γ.
We show that many CSP techniques can be translated to IMPs, thus allowing us to significantly improve the methods for studying the complexity of the IMP. We also develop, for the IMP, the universal-algebraic techniques that have been so useful in the study of the CSP. This allows us to prove a general necessary condition for the tractability of the IMP, and three sufficient ones. The sufficient conditions include IMPs arising from systems of linear equations over GF(p), p prime, and also some conditions defined through special kinds of polymorphisms.
Our work has several consequences and applications. First, we introduce a variation of the IMP and based on this propose a unified framework, different from the celebrated Buchberger’s algorithm, to construct a bounded degree Gröbner Basis. Our algorithm, combined with the universal algebraic techniques, leads to polynomial-time construction of Gröbner Basis for many combinatorial problems.
We present a new approach to randomized distributed graph coloring that is simpler and more efficient than previous ones. In particular, it allows us to tackle the (deg+1)-list-coloring (D1LC) problem, where each node v of degree dv is assigned a palette of dv+1 colors, and the objective is to find a proper coloring using these palettes. While for (Δ+1)-coloring (where Δ is the maximum degree) there is a fast randomized distributed O(log^3 logn)-round algorithm due to Chang, Li, and Pettie, no o(logn)-round algorithm was previously known for the D1LC problem.
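A sequential greedy pass, sketched below in Python, shows why a D1LC coloring always exists: when a vertex v is processed, at most dv of its palette colors are blocked by already-colored neighbors, so one of its dv+1 colors is free. The challenge addressed here is achieving this in few distributed rounds.

```python
def greedy_d1lc(adj, palette):
    """Sequentially color a graph given as adjacency lists, where palette[v]
    lists the deg(v)+1 allowed colors of v. A free color always exists because
    at most deg(v) neighbors of v are colored before v is processed."""
    color = {}
    for v in adj:
        used = {color[u] for u in adj[v] if u in color}
        color[v] = next(c for c in palette[v] if c not in used)
    return color

# A triangle where each vertex has its own palette of size deg + 1 = 3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
palette = {0: ['a', 'b', 'c'], 1: ['b', 'c', 'd'], 2: ['a', 'c', 'd']}
print(greedy_d1lc(adj, palette))  # {0: 'a', 1: 'b', 2: 'c'}
```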
We give a randomized distributed algorithm for D1LC that is optimal under plausible assumptions about the deterministic complexity of the problem. Using the recent deterministic algorithm of Ghaffari and Kuhn, our algorithm runs in O(log^3 logn) time, matching the best bound known for (Δ+1)-coloring. A key contribution is a subroutine to generate slack for D1LC, which almost immediately leads to a palette sparsification theorem for D1LC when placed into the frameworks of Assadi, Chen, and Khanna, and of Alon and Assadi. This gives fast algorithms for D1LC in three different models: an O(1)-round algorithm in the MPC model with Õ(n) memory per machine; a single-pass semi-streaming algorithm in dynamic streams; and an Õ(n√n)-time query algorithm.
We prove several new tight or near-tight distributed lower bounds for classic symmetry breaking problems in graphs. As a basic tool, we first provide a new insightful proof that any deterministic distributed algorithm that computes a Δ-coloring on Δ-regular trees requires Ω(logΔn) rounds and any randomized such algorithm requires Ω(logΔlogn) rounds. We prove this by showing that a natural relaxation of the Δ-coloring problem is a fixed point in the round elimination framework.
As a first application, we show that our Δ-coloring lower bound proof directly extends to arbdefective colorings. An arbdefective c-coloring of a graph G=(V,E) is given by a c-coloring of V and an orientation of E, where the arbdefect of a color i is the maximum number of monochromatic outgoing edges of any node of color i. We exactly characterize which variants of the arbdefective coloring problem can be solved in O(f(Δ) + log*n) rounds, for some function f, and which of them instead require Ω(logΔn) rounds for deterministic algorithms and Ω(logΔlogn) rounds for randomized ones.
As a second application, which we see as our main contribution, we use the structure of the fixed point as a building block to prove lower bounds as a function of Δ for problems that, in some sense, are much easier than Δ-coloring, as they can be solved in O(log* n) deterministic rounds in bounded-degree graphs. More specifically, we prove lower bounds as a function of Δ for a large class of distributed symmetry breaking problems, which can all be solved by a simple sequential greedy algorithm. For example, we obtain novel results for the fundamental problem of computing a (2,β)-ruling set, i.e., for computing an independent set S⊆ V such that every node v∈ V is within distance ≤ β of some node in S. We in particular show that Ω(βΔ1/β) rounds are needed even if initially an O(Δ)-coloring of the graph is given. With an initial O(Δ)-coloring, this lower bound is tight, and without one, it still nearly matches the existing O(βΔ2/(β+1)+log* n) upper bound. The new (2,β)-ruling set lower bound is an exponential improvement over the best existing lower bound for the problem, which was proven in [FOCS ’20]. As a special case of the lower bound, we also obtain a tight linear-in-Δ lower bound for computing a maximal independent set (MIS) in trees. While such an MIS lower bound was known for general graphs, the best previous MIS lower bound for trees was Ω(logΔ). Our lower bound even applies to a much more general family of problems that allows for almost arbitrary combinations of natural constraints from coloring problems, orientation problems, and independent set problems, and provides a single unified proof for known and new lower bound results for these types of problems.
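To illustrate the kind of problem covered by these bounds, here is a minimal Python sketch of the simple sequential greedy algorithm for a (2,β)-ruling set mentioned above: scan the vertices and add v to S whenever no current member of S lies within distance β of v; the resulting set is independent and β-dominating.

```python
from collections import deque

def within_distance(adj, v, S, beta):
    """Return True iff some vertex of S is within distance beta of v (BFS)."""
    seen, frontier = {v}, deque([(v, 0)])
    while frontier:
        u, dist = frontier.popleft()
        if u in S:
            return True
        if dist < beta:
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    frontier.append((w, dist + 1))
    return False

def greedy_ruling_set(adj, beta):
    """Sequential greedy (2, beta)-ruling set: v joins S iff no earlier member
    of S lies within distance beta of v; hence S is independent and every
    vertex ends up within distance beta of S."""
    S = set()
    for v in adj:
        if not within_distance(adj, v, S, beta):
            S.add(v)
    return S

path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 7] for i in range(7)}
print(greedy_ruling_set(path, beta=2))  # {0, 3, 6}
```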
All of our lower bounds as a function of Δ also imply substantial lower bounds as a function of n. For instance, we obtain that the maximal independent set problem, on trees, requires Ω(log n / log log n) rounds for deterministic algorithms, which is tight.
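To make the (2,β)-ruling set object from the previous paragraph concrete, here is a sequential greedy sketch (encoding and names are ours, and the construction says nothing about distributed round complexity, which is what the lower bounds above concern): repeatedly pick an uncovered node, add it to S, and mark its radius-β ball as covered. Any newly picked node is at distance > β ≥ 1 from S, so S stays independent.

```python
from collections import deque

# Sequential sketch of constructing a (2, beta)-ruling set: an independent set S
# such that every node is within distance beta of S.

def ball(adj, v, radius):
    """All nodes within the given hop distance of v (BFS)."""
    seen, frontier = {v}, deque([(v, 0)])
    while frontier:
        u, d = frontier.popleft()
        if d == radius:
            continue
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                frontier.append((w, d + 1))
    return seen

def greedy_ruling_set(adj, beta):
    assert beta >= 1
    uncovered, S = set(adj), set()
    while uncovered:
        v = next(iter(uncovered))        # any uncovered node is at distance > beta
        S.add(v)                         # from S, hence non-adjacent to S
        uncovered -= ball(adj, v, beta)  # its beta-ball is now dominated
    return S

path = {i: {j for j in (i - 1, i + 1) if 0 <= j < 7} for i in range(7)}
print(greedy_ruling_set(path, beta=2))   # e.g. {0, 3, 6} for a 7-node path
```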
This paper presents near-optimal deterministic parallel and distributed algorithms for computing (1+є)-approximate single-source shortest paths in any undirected weighted graph.
On a high level, we deterministically reduce this and other shortest-path problems to Õ(1) Minor-Aggregations. A Minor-Aggregation computes an aggregate (e.g., max or sum) of node-values for every connected component of some subgraph.
Our reduction immediately implies:
Optimal deterministic parallel (PRAM) algorithms with Õ(1) depth and near-linear work.
Universally-optimal deterministic distributed (CONGEST) algorithms, whenever deterministic Minor-Aggregate algorithms exist. For example, an optimal Õ(hopDiameterG)-round deterministic CONGEST algorithm for excluded-minor networks.
Several novel tools developed for the above results are interesting in their own right:
A local iterative approach for reducing shortest path computations “up to distance D” to computing low-diameter decompositions “up to distance D/2”. Compared to the recursive vertex-reduction approach of [Li20], our approach is simpler, suitable for distributed algorithms, and eliminates many derandomization barriers.
A simple graph-based Õ(1)-competitive ℓ1-oblivious routing based on low-diameter decompositions that can be evaluated in near-linear work. The previous such routing [ZGY+20] was n^{o(1)}-competitive and required n^{o(1)} more work.
A deterministic algorithm to round any fractional single-source transshipment flow into an integral tree solution.
The first distributed algorithms for computing Eulerian orientations.
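For readers unfamiliar with the primitive named above, the following sequential reference sketch (our own encoding, not the paper's distributed implementation) computes one Minor-Aggregation: an aggregate of node values over every connected component of a given subgraph.

```python
# Sequential reference semantics of a single Minor-Aggregation step: for a
# subgraph H of G (given by a subset of edges) and a value per node, return the
# aggregate (here: max, or any commutative/associative op) of the values in each
# connected component of H.  Names and encoding are illustrative only.

def minor_aggregate(nodes, h_edges, value, op=max):
    parent = {v: v for v in nodes}

    def find(v):                       # union-find over the subgraph H
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in h_edges:
        parent[find(u)] = find(v)

    agg = {}
    for v in nodes:
        r = find(v)
        agg[r] = value[v] if r not in agg else op(agg[r], value[v])
    return {v: agg[find(v)] for v in nodes}   # every node learns its component's aggregate

# Example: two components {0,1,2} and {3,4} of H.
print(minor_aggregate([0, 1, 2, 3, 4], [(0, 1), (1, 2), (3, 4)],
                      {0: 5, 1: 9, 2: 1, 3: 4, 4: 7}))
# {0: 9, 1: 9, 2: 9, 3: 7, 4: 7}
```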
Consensus is one of the most thoroughly studied problems in distributed computing, yet there are still complexity gaps that have not been bridged for decades. In particular, in the classical message-passing setting with processes’ crashes, since the seminal works of Bar-Joseph and Ben-Or [PODC 1998] and Aspnes and Waarts [SICOMP 1996, JACM 1998] in the previous century, there is still a fundamental unresolved question about the communication complexity of fast randomized Consensus against a (strong) adaptive adversary crashing processes arbitrarily online. The best known upper bound on the number of communication bits is Θ(n^{3/2}/√log n) per process, while the best lower bound is Ω(1). This is in contrast to randomized Consensus against a (weak) oblivious adversary, for which time-almost-optimal algorithms guarantee amortized O(1) communication bits per process. We design an algorithm against an adaptive adversary that reduces the communication gap by a nearly linear factor to O(√n·polylog n) bits per process, while keeping almost-optimal (up to a factor of O(log^3 n)) time complexity O(√n·log^{5/2} n).
More surprisingly, we show that this complexity can indeed be lowered further, but at the expense of increasing time complexity, i.e., there is a trade-off between communication complexity and time complexity. More specifically, our main Consensus algorithm allows us to reduce the communication complexity per process to any value from polylog n to O(√n·polylog n), as long as Time × Communication = O(n·polylog n). Similarly, reducing time complexity requires more random bits per process, i.e., Time × Randomness = O(n·polylog n).
Our parameterized consensus solutions are based on a few newly developed paradigms and algorithms for crash-resilient computing, which are interesting in their own right. The first one, called Fuzzy Counting, provides each process with a number that lies between the number of alive processes at the end of the counting and the number at its beginning. Our deterministic Fuzzy Counting algorithm works in O(log^3 n) rounds and uses only O(polylog n) amortized communication bits per process, unlike previous solutions to counting that required Ω(n) bits. This improvement is possible due to a new Fault-tolerant Gossip solution with O(log^3 n) rounds that uses only O(|R|·polylog n) communication bits per process, where |R| is the length of the rumor’s binary representation. It exploits a distributed fault-tolerant divide-and-conquer idea, in which processes run a Bipartite Gossip algorithm for a considered partition of processes. To avoid passing many long messages, processes use a family of small-degree compact expanders for local signaling to their overlay neighbors if they are in a compact (large and well-connected) party, and switch to a denser overlay graph whenever local signaling in the current one fails.
It has been known since the early 1980s that Byzantine Agreement in the full information, asynchronous model is impossible to solve deterministically against even one crash fault [FLP 1985], but that it can be solved with probability 1 [Ben-Or 1983], even against an adversary that controls the scheduling of all messages and corrupts up to f<n/3 players [Bracha 1987]. The main downside of [Ben-Or 1983, Bracha 1987] is that they terminate with 2^{Θ(n)} latency in expectation whenever f=Θ(n).
King and Saia [KS 2016, KS 2018] developed a polynomial protocol (polynomial latency, polynomial local computation) that is resilient to f < (1.14 × 10^{−9})·n Byzantine faults. The new idea in their protocol is to detect—and blacklist—coalitions of likely-bad players by analyzing the deviations of random variables generated by those players over many rounds.
In this work we design a simple collective coin-flipping protocol such that if any coalition of faulty players repeatedly does not follow protocol, then they will eventually be detected by one of two simple statistical tests. Using this coin-flipping protocol, we solve Byzantine Agreement in polynomial latency, even in the presence of up to f<n/4 Byzantine faults. This comes close to the f<n/3 upper bound on the maximum number of faults [LSP 1982, BT 1985, FLM 1986].
It is an open question to determine if the theory of self-concordant barriers can provide an interior point method with strongly polynomial complexity in linear programming. In the special case of the logarithmic barrier, it was shown in [Allamigeon, Benchimol, Gaubert and Joswig, SIAM J. on Applied Algebra and Geometry, 2018] that the answer is negative. In this paper, we show that none of the self-concordant barrier interior point methods is strongly polynomial. This result is obtained by establishing that, on parametric families of convex optimization problems, the log-limit of the central path degenerates to a piecewise linear curve, independently of the choice of the barrier function. We provide an explicit linear program that falls in the same class as the Klee–Minty counterexample for the simplex method, i.e., in which the feasible region is a combinatorial cube and the number of iterations is Ω(2^n).
In this paper we obtain improved iteration complexities for solving ℓp regression. We provide methods which, given any full-rank A ∈ ℝ^{n×d} with n ≥ d, b ∈ ℝ^n, and p ≥ 2, solve min_{x ∈ ℝ^d} ||Ax − b||_p to high precision in time dominated by that of solving O_p(d^{(p−2)/(3p−2)}) linear systems in A⊤DA for positive diagonal matrices D. This improves upon the previous best iteration complexity of O_p(n^{(p−2)/(3p−2)}) (Adil, Kyng, Peng, Sachdeva 2019). As a corollary, we obtain an O(d^{1/3}·є^{−2/3}) iteration complexity for approximate ℓ∞ regression. Further, for q ∈ (1, 2] with dual norm q = p/(p−1), we provide an algorithm that solves ℓq regression in O(d^{(p−2)/(2p−2)}) iterations.
To obtain this result we analyze row reweightings (closely inspired by ℓp-norm Lewis weights) which allow a closer connection between ℓ2 and ℓp regression. We provide adaptations of two different iterative optimization frameworks which leverage this connection and yield our results. The first framework is based on iterative refinement and multiplicative-weights-based width reduction, and the second framework is based on highly smooth acceleration. Both approaches yield O_p(d^{(p−2)/(3p−2)})-iteration methods, but the second has a polynomial dependence on p (as opposed to the exponential dependence of the first algorithm) and provides a new alternative to the previous state-of-the-art methods for ℓp regression for large p.
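As context for the ℓ2-to-ℓp connection above: the classical iteratively reweighted least squares (IRLS) heuristic also reduces ℓp regression to solving systems in A⊤DA for positive diagonal D, which is the primitive whose count the iteration bounds above measure. The sketch below shows only that generic reduction; it is not the paper's algorithm (which uses Lewis-weight-style reweightings with provable iteration bounds), and the damping step is a standard stabilization heuristic.

```python
import numpy as np

# Classical IRLS sketch: each iteration solves a weighted least-squares problem,
# i.e. a linear system in A^T D A with D a positive diagonal matrix.
# Illustrative only; plain (damped) IRLS is not what the paper analyzes.

def irls_lp(A, b, p=4, iters=50, eps=1e-9):
    x = np.linalg.lstsq(A, b, rcond=None)[0]       # start from the l2 solution
    for _ in range(iters):
        r = A @ x - b
        w = np.maximum(np.abs(r), eps) ** (p - 2)  # D = diag(|r_i|^{p-2}) > 0
        AtDA = A.T @ (w[:, None] * A)
        x_new = np.linalg.solve(AtDA, A.T @ (w * b))
        x = 0.5 * (x + x_new)                      # damping keeps IRLS stable for p > 2
    return x

rng = np.random.default_rng(0)
A, b = rng.standard_normal((100, 5)), rng.standard_normal(100)
x = irls_lp(A, b, p=4)
print(np.linalg.norm(A @ x - b, 4))
```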
We make several advances broadly related to the maintenance of electrical flows in weighted graphs undergoing dynamic resistance updates, including:
(1) More efficient dynamic spectral vertex sparsification, achieved by faster length estimation of random walks in weighted graphs using Morris counters [Morris 1978, Nelson-Yu 2020].
(2) A direct reduction from detecting edges with large energy in dynamic electric flows to dynamic spectral vertex sparsifiers.
(3) A procedure for turning algorithms that estimate a sequence of vectors under updates from an oblivious adversary into ones that tolerate adaptive adversaries, via the Gaussian mechanism from differential privacy.
Combining these pieces with modifications to prior robust interior point frameworks gives an algorithm that, on graphs with m edges, computes a mincost flow with edge costs and capacities in [1, U] in time O(m^{3/2−1/58} log^2 U). In prior and independent work, [Axiotis-Mądry-Vladu FOCS 2021] also obtained an improved algorithm for sparse mincost flows on capacitated graphs. Our algorithm implies an O(m^{3/2−1/58} log U) time maxflow algorithm, improving over the O(m^{3/2−1/328} log U) time maxflow algorithm of [Gao-Liu-Peng FOCS 2021].
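The Morris counter referenced in item (1) above is a classical primitive; a minimal sketch of the basic variant is below (the paper relies on the refined Nelson–Yu analysis and uses the counters for random-walk length estimation, neither of which this sketch reproduces).

```python
import random

# Basic Morris approximate counter [Morris 1978]: store only X ~ log2(count),
# incrementing X with probability 2^{-X}.  The estimator 2^X - 1 is unbiased,
# and the state uses O(log log n) bits for counts up to n.

class MorrisCounter:
    def __init__(self):
        self.x = 0

    def increment(self):
        if random.random() < 2.0 ** (-self.x):
            self.x += 1

    def estimate(self):
        return 2 ** self.x - 1

c = MorrisCounter()
for _ in range(10_000):
    c.increment()
print(c.estimate())   # around 10_000 in expectation
```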
We show that the sparsified block elimination algorithm for solving undirected Laplacian linear systems from [Kyng-Lee-Peng-Sachdeva-Spielman STOC’16] directly works for directed Laplacians. Given access to a sparsification algorithm that, on graphs with n vertices and m edges, takes time T_S(m) to output a sparsifier with N_S(n) edges, our algorithm solves a directed Eulerian system on n vertices and m edges to є relative accuracy in time O(T_S(m) + N_S(n)·log n·log(n/є)) + Õ(T_S(N_S(n))·log n), where the Õ(·) notation hides log log(n) factors. By previous results, this implies improved runtimes for linear systems in strongly connected directed graphs, PageRank matrices, and asymmetric M-matrices. When combined with slower constructions of smaller Eulerian sparsifiers based on short cycle decompositions, it also gives a solver that, after pre-processing the matrix in O(n^2 log^{O(1)} n) time, takes O(n log^5 n · log(n/є)) time per solve. At the core of our analyses are constructions of augmented matrices whose Schur complements encode error matrices.
We study m by m random matrices M with jointly Gaussian entries. Assuming a global small-ball probability bound
inf_{x,y ∈ S^{m−1}} ℙ( |x* M y| > m^{−O(1)} ) ≥ 1/2
and a polynomial bound on the norm of M, we show that the minimum singular value of M has a polynomial lower bound. We also consider the problem under the additional assumption that M is self-adjoint.
We establish two matrix anti-concentration inequalities, which lower bound the minimum singular values of the sum of independent positive semidefinite self-adjoint matrices and the linear combination of independent random matrices with independent Gaussian coefficients. Both are under a global small-ball probability assumption.
Two applications are discussed. First, we derive a better singular value bound for the Krylov space matrix. This leads to a faster and simpler algorithm for solving sparse linear systems. Our algorithm runs in Õ(n^{(3ω−4)/(ω−1)}) = O(n^{2.2716}) time, where ω < 2.37286 is the matrix multiplication exponent, improving on the previous fastest algorithm of Peng and Vempala, which runs in Õ(n^{(5ω−4)/(ω+1)}) = O(n^{2.33165}) time. Second, in compressed sensing, we relax certain restrictions on the construction of measurement matrices for the basis pursuit algorithm.
Given a Boolean circuit C, we wish to convert it to a circuit C′ that computes the same function as C even if some of its gates suffer from adversarial short circuit errors, i.e., their output is replaced by the value of one of their inputs. Can we design such a resilient circuit C′ whose size is roughly comparable to that of C? Prior work gave a positive answer for the special case where C is a formula.
We study the general case and show that any Boolean circuit C of size s can be converted to a new circuit C′ of quasi-polynomial size s^{O(log s)} that computes the same function as C even if a 1/51 fraction of the gates on any root-to-leaf path in C′ are short circuited. Moreover, if the original circuit C is a formula, the resilient circuit C′ is of near-linear size s^{1+є}. The construction of our resilient circuits utilizes the connection between circuits and DAG-like communication protocols, originally introduced in the context of proof complexity.
Since they were first introduced by Schulman (STOC 1993), the construction of tree codes remained an elusive open problem. The state-of-the-art construction by Cohen, Haeupler and Schulman (STOC 2018) has constant distance and (log n)^e colors for some constant e > 1 that depends on the distance, where n is the depth of the tree. Insisting on a constant number of colors at the expense of having vanishing distance, Gelles, Haeupler, Kol, Ron-Zewi, and Wigderson (SODA 2016) constructed a distance Ω(1/log n) tree code.
In this work we improve upon these prior works and construct a distance-δ tree code with (log n)^{O(√δ)} colors. This is the first construction of a constant distance tree code with sub-logarithmic number of colors. Moreover, as a direct corollary we obtain a tree code with a constant number of colors and distance Ω(1/(log log n)^2), exponentially improving upon the above-mentioned work by Gelles et al.
An error correcting code (ECC) allows a sender to send a message to a receiver such that even if a constant fraction of the communicated bits are corrupted, the receiver can still learn the message correctly. Due to their importance and fundamental nature, ECC’s have been extensively studied, one of the main goals being to maximize the fraction of errors that the ECC is resilient to.
For adversarial erasure errors (over a binary channel) the maximal error resilience of an ECC is 1/2 of the communicated bits. In this work, we break this 1/2 barrier by introducing the notion of an interactive error correcting code (iECC) and constructing an iECC that is resilient to adversarial erasure of 3/5 of the total communicated bits. We emphasize that the adversary can corrupt both the sending party and the receiving party, and that both parties’ rounds contribute to the adversary’s budget.
We also prove an impossibility (upper) bound of 2/3 on the maximal resilience of any binary iECC to adversarial erasures. In the bit flip setting, we prove an impossibility bound of 2/7.
We study the problem of certification: given queries to a function f : {0,1}n → {0,1} with certificate complexity ≤ k and an input x⋆, output a size-k certificate for f’s value on x⋆.
For monotone functions, a classic local search algorithm of Angluin accomplishes this task with n queries, which we show is optimal for local search algorithms. Our main result is a new algorithm for certifying monotone functions with O(k^8 log n) queries, which comes close to matching the information-theoretic lower bound of Ω(k log n). The design and analysis of our algorithm are based on a new connection to threshold phenomena in monotone functions.
We further prove exponential-in-k lower bounds when f is non-monotone, and when f is monotone but the algorithm is only given random examples of f. These lower bounds show that assumptions on the structure of f and query access to it are both necessary for the polynomial dependence on k that we achieve.
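The classic local search mentioned above can be sketched as follows for a monotone f and an input with f(x*) = 1 (the 0-case is symmetric); this is the n-query baseline, not the new O(k^8 log n)-query algorithm.

```python
# Angluin-style local search sketch for certifying a monotone f on a 1-input:
# drop 1-coordinates one at a time, keeping a drop only if f stays 1.  By
# monotonicity the surviving 1-coordinates force f = 1, and when f has
# certificate complexity <= k the output has size <= k.  Uses at most n+1 queries.
# (This is the baseline algorithm discussed above, not the paper's new one.)

def certify_monotone_one(f, x_star):
    y = list(x_star)
    assert f(y) == 1
    for i in range(len(y)):
        if y[i] == 1:
            y[i] = 0
            if f(y) == 0:      # coordinate i is necessary; restore it
                y[i] = 1
    return {i for i, bit in enumerate(y) if bit == 1}   # "set these to 1" certificate

# Example: f(x) = (x0 AND x1) OR x2 is monotone with certificate complexity 2.
f = lambda x: int((x[0] and x[1]) or x[2])
print(certify_monotone_one(f, [1, 1, 1]))   # {2}
```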
We develop a novel connection between discrepancy minimization and (quantum) communication complexity. As an application, we resolve a substantial special case of the Matrix Spencer conjecture. In particular, we show that for every collection of symmetric n × n matrices A_1,…,A_n with ||A_i|| ≤ 1 and ||A_i||_F ≤ n^{1/4} there exist signs x ∈ {±1}^n such that the maximum eigenvalue of ∑_{i ≤ n} x_i A_i is at most O(√n). We give a polynomial-time algorithm based on partial coloring and semidefinite programming to find such an x.
Our techniques open a new avenue to use tools from communication complexity and information theory to study discrepancy. The proof of our main result combines a simple compression scheme for transcripts of repeated (quantum) communication protocols with quantum state purification, the Holevo bound from quantum information, and tools from sketching and dimensionality reduction. Our approach also offers a promising avenue to resolve the Matrix Spencer conjecture completely – we show it is implied by a natural conjecture in quantum communication complexity.
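To make the discrepancy quantity in the theorem above concrete, the toy sketch below evaluates λ_max(∑ x_i A_i) over all sign vectors for a few small random matrices. It only illustrates the objective being bounded; it is not the partial-coloring/SDP algorithm nor the communication-based proof.

```python
import itertools
import numpy as np

# Toy illustration of the Matrix Spencer objective: for symmetric A_1,...,A_n
# with ||A_i|| <= 1, find signs x in {-1,+1}^n minimizing the largest eigenvalue
# of sum_i x_i A_i.  Brute force over signs for tiny n; not the paper's algorithm.

rng = np.random.default_rng(1)
n, m = 10, 8
mats = []
for _ in range(n):
    B = rng.standard_normal((m, m))
    A = (B + B.T) / 2
    mats.append(A / np.linalg.norm(A, 2))       # normalize operator norm to 1

best = min(
    (np.linalg.eigvalsh(sum(x * A for x, A in zip(signs, mats)))[-1], signs)
    for signs in itertools.product((-1, 1), repeat=n)
)
print(f"best lambda_max = {best[0]:.3f}  vs  sqrt(n) = {n ** 0.5:.3f}")
```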
Motivated by the Matrix Spencer conjecture, we study the problem of finding signed sums of matrices with a small matrix norm. A well-known strategy to obtain these signs is to prove, given matrices A_1, …, A_n ∈ ℝ^{m×m}, a Gaussian measure lower bound of 2^{−O(n)} for a scaling of the discrepancy body {x ∈ ℝ^n : ||∑_{i=1}^n x_i A_i|| ≤ 1}. We show this is equivalent to covering its polar with 2^{O(n)} translates of the cube (1/n)·B_∞^n, and construct such a cover via mirror descent. As applications of our framework, we show:
Matrix Spencer for Low-Rank Matrices. If the matrices satisfy ||A_i|| ≤ 1 and rank(A_i) ≤ r, we can efficiently find a coloring x ∈ {±1}^n with discrepancy ||∑_{i=1}^n x_i A_i|| ≲ √(n·log(min(rm/n, r))). This improves upon the naive O(√(n log r)) bound for random coloring and proves the matrix Spencer conjecture when rm ≤ n.
Matrix Spencer for Block Diagonal Matrices. For block diagonal matrices with ||A_i|| ≤ 1 and block size h, we can efficiently find a coloring x ∈ {±1}^n with ||∑_{i=1}^n x_i A_i|| ≲ √(n·log(hm/n)). This bound was previously shown in [Levy, Ramadas and Rothvoss, IPCO 2017] under the assumption h ≤ √n, which we remove. Using our proof, we reduce the matrix Spencer conjecture to the existence of an O(log(m/n)) quantum relative entropy net on the spectraplex.
Matrix Discrepancy for Schatten Norms. We generalize our discrepancy bound for matrix Spencer to Schatten norms 2 ≤ p ≤ q. Given ||A_i||_{S_p} ≤ 1 and rank(A_i) ≤ r, we can efficiently find a partial coloring x ∈ [−1,1]^n with |{i : |x_i| = 1}| ≥ n/2 and ||∑_{i=1}^n x_i A_i||_{S_q} ≲ √(n·min(p, log(rk))) · k^{1/p−1/q}, where k := min(1, m/n).
Our partial coloring bound is tight when m = Θ(√n). We also provide tight lower bounds of Ω(√n) for rank-1 matrix Spencer when m = n, and Ω(√min(m,n)) for S2 → S∞ discrepancy, precluding a matrix version of the Komlós conjecture.
Randomized Hadamard Transforms (RHTs) have emerged as a computationally efficient alternative to the use of dense unstructured random matrices across a range of domains in computer science and machine learning. For several applications such as dimensionality reduction and compressed sensing, the theoretical guarantees for methods based on RHTs are comparable to approaches using dense random matrices with i.i.d. entries. However, several such applications are in the low-dimensional regime where the number of rows sampled from the matrix is rather small. Prior arguments are not applicable to the high-dimensional regime often found in machine learning applications like kernel approximation. Given an ensemble of RHTs with Gaussian diagonals, {M_i}_{i=1}^m, and any 1-Lipschitz function f: ℝ → ℝ, we prove that the average of f over the entries of {M_i v}_{i=1}^m converges to its expectation uniformly over ||v|| ≤ 1 at a rate comparable to that obtained from using truly Gaussian matrices. We use our inequality to then derive improved guarantees for two applications in the high-dimensional regime: 1) kernel approximation and 2) distance estimation. For kernel approximation, we prove the first uniform approximation guarantees for random features constructed through RHTs, lending theoretical justification to their empirical success, while for distance estimation, our convergence result implies data structures with improved runtime guarantees over previous work by the authors. We believe our general inequality is likely to find use in other applications.
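A minimal sketch of the object in question, under our reading of "RHT with Gaussian diagonal" as M = (1/√d)·H·diag(g) with H a Hadamard matrix and g i.i.d. standard Gaussian (this convention is our assumption): it empirically compares the average of a 1-Lipschitz f over the entries of {M_i v} against the truly Gaussian case for one fixed unit vector v. The paper's contribution is uniformity over all such v, which the sketch does not demonstrate.

```python
import numpy as np
from scipy.linalg import hadamard

# Ensemble sketch under the assumed convention M_i = (1/sqrt(d)) * H * diag(g_i),
# with H a d x d Hadamard matrix and g_i i.i.d. standard Gaussian.  Compare the
# empirical average of a 1-Lipschitz f over the entries of {M_i v} against
# i.i.d. Gaussian matrices.  Illustrative only.

rng = np.random.default_rng(0)
d, m = 256, 50
H = hadamard(d) / np.sqrt(d)
v = rng.standard_normal(d)
v /= np.linalg.norm(v)
f = np.abs                                   # a 1-Lipschitz test function

rht_avg = np.mean([f(H @ (rng.standard_normal(d) * v)) for _ in range(m)])
gauss_avg = np.mean([f(rng.standard_normal((d, d)) / np.sqrt(d) @ v)
                     for _ in range(m)])
print(rht_avg, gauss_avg)   # both concentrate near E|N(0, 1/d)| = sqrt(2/(pi*d))
```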
The random geometric graph model GRGd(n,p) is a distribution over graphs in which the edges capture a latent geometry. To sample G ∼ GRGd(n,p), we identify each of our n vertices with an independently and uniformly sampled vector from the d-dimensional unit sphere Sd−1, and we connect pairs of vertices whose vectors are “sufficiently close,” such that the marginal probability of an edge is p. Because of the underlying geometry, this model is natural for applications in data science and beyond.
We investigate the problem of testing for this latent geometry, or in other words, distinguishing an Erdős–Rényi graph G(n, p) from a random geometric graph GRGd(n, p). It is not too difficult to show that if d → ∞ while n is held fixed, the two distributions become indistinguishable; we wish to understand how fast d must grow as a function of n for indistinguishability to occur.
When p = α/n for constant α, we prove that if d ≥ polylog(n), the total variation distance between the two distributions is close to 0; this improves upon the best previous bound of Brennan, Bresler, and Nagaraj (2020), which required d ≫ n^{3/2}, and further our result is nearly tight, resolving a conjecture of Bubeck, Ding, Eldan, & Rácz (2016) up to logarithmic factors. We also obtain improved upper bounds on the statistical indistinguishability thresholds in d for the full range of p satisfying 1/n ≤ p ≤ 1/2, improving upon the previous bounds by polynomial factors.
Our analysis uses the Belief Propagation algorithm to characterize the distributions of (subsets of) the random vectors conditioned on producing a particular graph. In this sense, our analysis is connected to the “cavity method” from statistical physics. To analyze this process, we rely on novel sharp estimates for the area of the intersection of a random sphere cap with an arbitrary subset of Sd−1, which we prove using optimal transport maps and entropy-transport inequalities on the unit sphere. We believe these techniques may be of independent interest.
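A small sampling sketch of the model defined above; the connection threshold is estimated empirically so that the marginal edge probability is roughly p (the exact threshold is the appropriate quantile of the inner-product distribution on the sphere).

```python
import numpy as np

# Sampling sketch for RGG_d(n, p): place n uniform points on the unit sphere
# S^{d-1} and connect pairs whose inner product exceeds a threshold t_{p,d}
# chosen so that the marginal edge probability is p.  Here t is estimated by
# Monte Carlo; illustrative only.

def sample_rgg(n, d, p, rng):
    X = rng.standard_normal((n, d))
    X /= np.linalg.norm(X, axis=1, keepdims=True)        # uniform on the sphere
    # The inner product of two independent uniform points has the same law as a
    # single coordinate of one uniform point, so estimate its (1-p)-quantile.
    U = rng.standard_normal((200_000, d))
    U /= np.linalg.norm(U, axis=1, keepdims=True)
    t = np.quantile(U[:, 0], 1 - p)
    G = X @ X.T
    return [(i, j) for i in range(n) for j in range(i + 1, n) if G[i, j] >= t]

rng = np.random.default_rng(0)
edges = sample_rgg(n=200, d=20, p=0.1, rng=rng)
print(len(edges), "edges; expected about", 0.1 * 200 * 199 / 2)
```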
We present an algorithm for strongly refuting smoothed instances of all Boolean CSPs. The smoothed model is a hybrid between worst- and average-case input models, where the input is an arbitrary instance of the CSP with only the negation patterns of the literals re-randomized with some small probability. For an n-variable smoothed instance of a k-arity CSP, our algorithm runs in n^{O(ℓ)} time, and succeeds with high probability in bounding the optimum fraction of satisfiable constraints away from 1, provided that the number of constraints is at least Õ(n)·(n/ℓ)^{k/2 − 1}. This matches, up to polylogarithmic factors in n, the trade-off between running time and the number of constraints of the state-of-the-art algorithms for refuting fully random instances of CSPs.
We also make a surprising connection between the analysis of our refutation algorithm in the significantly “randomness starved” setting of semi-random k-XOR and the existence of even covers in worst-case hypergraphs. We use this connection to positively resolve Feige’s 2008 conjecture – an extremal combinatorics conjecture on the existence of even covers in sufficiently dense hypergraphs that generalizes the well-known Moore bound for the girth of graphs. As a corollary, we show that polynomial-size refutation witnesses exist for arbitrary smoothed CSP instances with number of constraints a polynomial factor below the “spectral threshold” of n^{k/2}, extending the celebrated result for random 3-SAT of Feige, Kim and Ofek.
We study the communication complexity of dominant strategy implementations of combinatorial auctions. We start with two domains that are generally considered “easy”: multi-unit auctions with decreasing marginal values and combinatorial auctions with gross substitutes valuations. For both domains we have fast algorithms that find the welfare-maximizing allocation with communication complexity that is poly-logarithmic in the input size. This immediately implies that welfare maximization can be achieved in ex-post equilibrium with no significant communication cost, by using VCG payments. In contrast, we show that in both domains the communication complexity of any dominant strategy implementation that achieves the optimal welfare is polynomial in the input size.
We then move on to studying the approximation ratios achievable by dominant strategy mechanisms. For multi-unit auctions with decreasing marginal values, we provide a dominant-strategy communication FPTAS. For combinatorial auctions with general valuations, we show that there is no dominant strategy mechanism that achieves an approximation ratio better than m^{1−є} using poly(m,n) bits of communication, where m is the number of items and n is the number of bidders. In contrast, a randomized dominant strategy mechanism that achieves an O(√m) approximation with poly(m,n) communication is known. This establishes the first gap between computationally efficient deterministic dominant strategy mechanisms and randomized ones.
En route, we answer an open question on the communication cost of implementing dominant strategy mechanisms for more than two players, and also solve some open problems in the area of simultaneous combinatorial auctions.
We study revenue maximization in multi-item multi-bidder auctions under the natural item-independence assumption – a classical problem in Multi-Dimensional Bayesian Mechanism Design. One of the biggest challenges in this area is developing algorithms to compute (approximately) optimal mechanisms that are not brute-force in the size of the bidder type space, which is usually exponential in the number of items in multi-item auctions. Unfortunately, such algorithms were only known for basic settings of our problem when bidders have unit-demand or additive valuations.
In this paper, we significantly improve the previous results and design the first algorithm that runs in time polynomial in the number of items and the number of bidders to compute mechanisms that are O(1)-approximations to the optimal revenue when bidders have XOS valuations, resolving an open problem raised by Chawla and Miller, and by Cai and Zhao. Moreover, the computed mechanism has a simple structure: it is either a posted price mechanism or a two-part tariff mechanism. As a corollary of our result, we show how to compute an approximately optimal and simple mechanism efficiently using only sample access to the bidders’ value distributions. Our algorithm builds on two innovations that allow us to search over the space of mechanisms efficiently: (i) a new type of succinct representation of mechanisms – the marginal reduced forms, and (ii) a novel Lift-and-Round procedure that concavifies the problem.
We study bilateral trade between two strategic agents. The celebrated result of Myerson and Satterthwaite states that in general, no incentive-compatible, individually rational and weakly budget balanced mechanism can be efficient. I.e., no mechanism with these properties can guarantee a trade whenever buyer value exceeds seller cost. Given this, a natural question is whether there exists a mechanism with these properties that guarantees a constant fraction of the first-best gains-from-trade, namely a constant fraction of the gains-from-trade attainable whenever buyer’s value weakly exceeds seller’s cost. In this work, we positively resolve this long-standing open question on constant-factor approximation, mentioned in several previous works, using a simple mechanism that obtains a 1/8.23 ≈ 0.121 fraction of the first-best.
We study the revenue guarantees and approximability of item pricing. Recent work shows that with n heterogeneous items, item-pricing guarantees an O(logn) approximation to the optimal revenue achievable by any (buy-many) mechanism, even when buyers have arbitrarily combinatorial valuations. However, finding good item prices is challenging – it is known that even under unit-demand valuations, it is NP-hard to find item prices that approximate the revenue of the optimal item pricing better than O(√n).
Our work provides a more fine-grained analysis of the revenue guarantees and computational complexity in terms of the number of item “categories”, which may be significantly fewer than n. We assume the items are partitioned into k categories so that items within a category are totally ordered and a buyer’s value for a bundle depends only on the best item it contains from each category.
We show that item-pricing guarantees an O(logk) approximation to the optimal (buy-many) revenue and provide a PTAS for computing the optimal item-pricing when k is constant. We also provide a matching lower bound showing that the problem is (strongly) NP-hard even when k=1. Our results naturally extend to the case where items are only partially ordered, in which case the revenue guarantees and computational complexity depend on the width of the partial ordering, i.e. the largest set for which no two items are comparable.
Recently, Daskalakis, Fishelson, and Golowich (DFG) (NeurIPS ‘21) showed that if all agents in a multi-player general-sum normal-form game employ Optimistic Multiplicative Weights Update (OMWU), the external regret of every player is O(polylog(T)) after T repetitions of the game. In this paper we extend their result from external regret to internal and swap regret, thereby establishing uncoupled learning dynamics that converge to an approximate correlated equilibrium at the rate of O(T^{−1}). This substantially improves over the prior best rate of convergence of O(T^{−3/4}) due to Chen and Peng (NeurIPS ‘20), and it is optimal up to polylogarithmic factors.
To obtain these results, we develop new techniques for establishing higher-order smoothness for learning dynamics involving fixed point operations. Specifically, we first establish that the no-internal-regret learning dynamics of Stoltz and Lugosi (Mach Learn ‘05) are equivalently simulated by no-external-regret dynamics on a combinatorial space. This allows us to trade the computation of the stationary distribution on a polynomial-sized Markov chain for a (much more well-behaved) linear transformation on an exponential-sized set, enabling us to leverage similar techniques as DFG to near-optimally bound the internal regret.
Moreover, we establish an O(polylog(T)) no-swap-regret bound for the classic algorithm of Blum and Mansour (BM) (JMLR ‘07). We do so by introducing a technique based on the Cauchy Integral Formula that circumvents the more limited combinatorial arguments of DFG. In addition to shedding clarity on the near-optimal regret guarantees of BM, our arguments provide insights into the various ways in which the techniques by DFG can be extended and leveraged in the analysis of more involved learning algorithms.
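For reference, one standard form of the Optimistic Multiplicative Weights Update referred to above is optimistic Hedge with the most recent loss vector reused as the prediction of the next one; the sketch runs two players in self-play on a small zero-sum game. The step size and the game are illustrative, not the tuned parameters from the papers.

```python
import numpy as np

# Optimistic Hedge / OMWU sketch: x_{t+1} is proportional to
# exp(-eta * (cumulative loss + most recent loss)).  Self-play on a 2x2
# matching-pennies-style game whose unique equilibrium is (1/2, 1/2).

def omwu(cum_loss, last_loss, eta):
    w = np.exp(-eta * (cum_loss + last_loss))
    return w / w.sum()

A = np.array([[1.0, 0.0], [0.0, 1.0]])     # row player's loss; column gets 1 - A
eta, T, k = 0.1, 2000, 2
cum1 = cum2 = np.zeros(k)
x = y = np.ones(k) / k
for t in range(T):
    loss1, loss2 = A @ y, (1.0 - A).T @ x   # expected losses under opponent's mix
    cum1, cum2 = cum1 + loss1, cum2 + loss2
    x, y = omwu(cum1, loss1, eta), omwu(cum2, loss2, eta)
print(x, y)    # both mixes move toward the (1/2, 1/2) equilibrium
```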
Despite immense progress in quantum Hamiltonian complexity in the past decade, little is known about the computational complexity of quantum physics at the thermodynamic limit. In fact, even defining the problem properly is not straightforward. We study the complexity of estimating the ground energy of a fixed, translationally-invariant (TI) Hamiltonian in the thermodynamic limit, to within a given precision; this precision (given by n, the number of bits of the approximation) is the sole input to the problem. Understanding the complexity of this problem captures how difficult it is for a physicist to measure or compute another digit in the approximation of a physical quantity in the thermodynamic limit. We show that this problem is contained in FEXP^{QMA-EXP} and is hard for FEXP^{NEXP}. This means that the problem is doubly exponentially hard in the size of the input.
As an ingredient in our construction, we study the problem of computing the ground energy of translationally invariant finite 1D chains. A single Hamiltonian term, which is a fixed parameter of the problem, is applied to every pair of particles in a finite chain. In the finite case, the length of the chain is the sole input to the problem and the task is to compute an approximation of the ground energy. No thresholds are provided as in the standard formulation of the local Hamiltonian problem. We show that this problem is contained in FP^{QMA-EXP} and is hard for FP^{NEXP}. Our techniques employ a circular clock structure in which the ground energy is calibrated by the length of the cycle. This requires more precise expressions for the ground states of the resulting matrices than was required for previous QMA-completeness constructions and even exact analytical bounds for the infinite case which we derive using techniques from spectral graph theory. To our knowledge, this is the first use of the circuit-to-Hamiltonian construction which shows hardness for a function class.
We study the complexity of finding the ground state energy density of a local Hamiltonian on a lattice in the thermodynamic limit of infinite lattice size. We formulate this rigorously as a function problem, in which we request an estimate of the ground state energy density to some specified precision; and as an equivalent promise problem, GSED, in which we ask whether the ground state energy density is above or below specified thresholds.
The ground state energy density problem is unusual, in that it concerns a single, fixed Hamiltonian in the thermodynamic limit, whose ground state energy density is just some fixed, real number. The only input to the computational problem is the precision to which to estimate this fixed real number, corresponding to the ground state energy density. Hardness of this problem for a complexity class therefore implies that the solutions to all problems in the class are encoded in this single number (analogous to Chaitin’s constant in computability theory).
This captures computationally the type of question most commonly encountered in condensed matter physics, which is typically concerned with the physical properties of a single Hamiltonian in the thermodynamic limit. We show that for classical, translationally invariant, nearest neighbour Hamiltonians on a 2D square lattice, P^{NEEXP} ⊆ EXP^{GSED} ⊆ EXP^{NEXP}, and for quantum Hamiltonians P^{NEEXP} ⊆ EXP^{GSED} ⊆ EXP^{QMA_EXP}. With some technical caveats on the oracle definitions, the EXP in some of these results can be strengthened to PSPACE. We also give analogous complexity bounds for the function version of GSED.
The fundamental problem in much of physics and quantum chemistry is to optimize a low-degree polynomial in certain anticommuting variables. Being a quantum mechanical problem, in many cases we do not know an efficient classical witness to the optimum, or even to an approximation of the optimum. One prominent exception is when the optimum is described by a so-called “Gaussian state”, also called a free fermion state. In this work we are interested in the complexity of this optimization problem when no good Gaussian state exists. Our primary testbed is the Sachdev–Ye–Kitaev (SYK) model of random degree-q polynomials, a model of great current interest in condensed matter physics and string theory, and one which has remarkable properties from a computational complexity standpoint. Among other results, we give an efficient classical certification algorithm for upper-bounding the largest eigenvalue in the q=4 SYK model, and an efficient quantum certification algorithm for lower-bounding this largest eigenvalue; both algorithms achieve constant-factor approximations with high probability.
Quantum money is a central primitive in quantum cryptography that enables a bank to distribute to parties in the network, called wallets, unclonable quantum banknotes that serve as a medium of exchange between wallets. While quantum money suggests a theoretical solution to some of the fundamental problems in currency systems, it still requires a strong model to be implemented: quantum computation and a quantum communication infrastructure. A central open question in this context is whether we can have a quantum money scheme that uses “minimal quantumness”, namely, local quantum computation and only classical communication.
Public-key semi-quantum money (Radian and Sattath, AFT 2019) is a quantum money scheme in which the algorithm of the bank is completely classical and quantum banknotes are publicly verifiable on any quantum computer. In particular, such a scheme relies on local quantum computation and only classical communication. The only known construction of public-key semi-quantum money is based on quantum lightning (Zhandry, EUROCRYPT 2019), which in turn is based on a computational assumption that is now known to be broken.
In this work, we construct public-key semi-quantum money based on quantum-secure indistinguishability obfuscation and the sub-exponential hardness of the Learning With Errors problem. The technical centerpiece of our construction is a new 3-message protocol, where a classical computer can delegate to a quantum computer the generation of a quantum state that is both unclonable and publicly verifiable.
In classical computing, garbled circuits (and their generalization known as randomized encodings) are a versatile cryptographic tool with many applications such as secure multiparty computation, delegated computation, depth-reduction of cryptographic primitives, complexity lower-bounds, and more. Quantum analogues of garbled circuits were not known prior to this work.
In this work, we introduce a definition of quantum randomized encodings and present a construction which allows us to efficiently garble any quantum circuit, assuming the existence of quantum-secure one-way functions. Our construction has properties comparable to the best known classical garbling schemes. We can also achieve perfect information-theoretic security, albeit with a blowup in the size of the garbled circuits.
We believe that quantum garbled circuits and quantum randomized encodings can be an instrumental concept and building block for quantum computation and in particular quantum cryptography. We present some applications, including a conceptually-simple zero-knowledge proof system for QMA, a protocol for private simultaneous messages, functional encryption, and more.
We introduce the notion of a reproducible algorithm in the context of learning. A reproducible learning algorithm is resilient to variations in its samples — with high probability, it returns the exact same output when run on two samples from the same underlying distribution. We begin by unpacking the definition, clarifying how randomness is instrumental in balancing accuracy and reproducibility. We initiate a theory of reproducible algorithms, showing how reproducibility implies desirable properties such as data reuse and efficient testability. Despite the exceedingly strong demand of reproducibility, there are efficient reproducible algorithms for several fundamental problems in statistics and learning. First, we show that any statistical query algorithm can be made reproducible with a modest increase in sample complexity, and we use this to construct reproducible algorithms for finding approximate heavy-hitters and medians. Using these ideas, we give the first reproducible algorithm for learning halfspaces via a reproducible weak learner and a reproducible boosting algorithm. Interestingly, we utilize a connection to foams as a higher-dimension randomized rounding scheme. Finally, we initiate the study of lower bounds and inherent tradeoffs for reproducible algorithms, giving nearly tight sample complexity upper and lower bounds for reproducible versus nonreproducible SQ algorithms.
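One simple instance of the randomized-rounding flavor of idea behind such algorithms (not the paper's exact construction, and with hypothetical parameter choices): estimate the median and round it to a grid with a random offset drawn from the shared randomness, so two runs on fresh samples from the same distribution land on the same grid point with high probability.

```python
import numpy as np

# Hedged sketch of a reproducibility trick in the spirit described above (not
# the paper's exact algorithm): executions share a random grid offset, so as
# long as their empirical medians differ by much less than `width`, both output
# the exact same rounded value.

def reproducible_median(samples, shared_rng, width=0.1):
    offset = shared_rng.uniform(0, width)            # drawn from shared randomness
    med = np.median(samples)
    return offset + width * np.floor((med - offset) / width)

shared_seed = 1234                                    # shared across executions
data_rng = np.random.default_rng()                    # fresh data for each run
run1 = reproducible_median(data_rng.normal(size=100_000),
                           np.random.default_rng(shared_seed))
run2 = reproducible_median(data_rng.normal(size=100_000),
                           np.random.default_rng(shared_seed))
print(run1, run2, run1 == run2)   # equal on most runs; a wider grid (i.e. less
                                  # accuracy) makes agreement more likely
```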
Here we revisit the classic problem of linear quadratic estimation, i.e. estimating the trajectory of a linear dynamical system from noisy measurements. The celebrated Kalman filter gives an optimal estimator when the measurement noise is Gaussian, but is widely known to break down when one deviates from this assumption, e.g. when the noise is heavy-tailed. Many ad hoc heuristics have been employed in practice for dealing with outliers. In a pioneering work, Schick and Mitter gave provable guarantees when the measurement noise is a known infinitesimal perturbation of a Gaussian and raised the important question of whether one can get similar guarantees for large and unknown perturbations.
In this work we give a truly robust filter: we give the first strong provable guarantees for linear quadratic estimation when even a constant fraction of measurements have been adversarially corrupted. This framework can model heavy-tailed and even non-stationary noise processes. Our algorithm robustifies the Kalman filter in the sense that it competes with the optimal algorithm that knows the locations of the corruptions. Our work is in a challenging Bayesian setting where the number of measurements scales with the complexity of what we need to estimate. Moreover, in linear dynamical systems past information decays over time. We develop a suite of new techniques to robustly extract information across different time steps and over varying time scales.
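For context, the classical (non-robust) Kalman filter recursion that the work above robustifies is the standard predict/update pair below; this sketch is the textbook filter on a toy model with hypothetical noise levels, not the robust algorithm of the paper.

```python
import numpy as np

# Textbook Kalman filter for x_{t+1} = A x_t + w_t,  z_t = H x_t + v_t with
# Gaussian noise w_t ~ N(0, Q), v_t ~ N(0, R).  This is the classical baseline
# referred to above; the paper's contribution is robustness to adversarially
# corrupted measurements, which this filter does not provide.

def kalman_filter(zs, A, H, Q, R, x0, P0):
    x, P = x0, P0
    estimates = []
    for z in zs:
        x, P = A @ x, A @ P @ A.T + Q                      # predict
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)                     # Kalman gain
        x = x + K @ (z - H @ x)                            # measurement update
        P = (np.eye(len(x)) - K @ H) @ P
        estimates.append(x)
    return estimates

# Toy 1D constant-velocity example.
A = np.array([[1.0, 1.0], [0.0, 1.0]]); H = np.array([[1.0, 0.0]])
Q, R = 0.01 * np.eye(2), np.array([[1.0]])
rng = np.random.default_rng(0)
true_pos = np.cumsum(np.ones(50))
zs = [np.array([p + rng.normal(scale=1.0)]) for p in true_pos]
est = kalman_filter(zs, A, H, Q, R, x0=np.zeros(2), P0=np.eye(2))
print(est[-1])   # roughly [50, 1]: position ~ 50, velocity ~ 1
```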
We study fast rates of convergence in the setting of nonparametric online regression, namely where regret is defined with respect to an arbitrary function class which has bounded complexity. Our contributions are two-fold:
- In the realizable setting of nonparametric online regression with the absolute loss, we propose a randomized proper learning algorithm which gets a near-optimal cumulative loss in terms of the sequential fat-shattering dimension of the hypothesis class. In the setting of online classification with a class of Littlestone dimension d, our bound reduces to d · polylog T. This result answers a question as to whether proper learners could achieve near-optimal cumulative loss; previously, even for online classification, the best known cumulative loss was Õ(√(dT)). Further, for the real-valued (regression) setting, a cumulative loss bound with near-optimal scaling on sequential fat-shattering dimension was not even known for improper learners, prior to this work.
- Using the above result, we exhibit an independent learning algorithm for general-sum binary games of Littlestone dimension d, for which each player achieves regret Õ(d^{3/4} · T^{1/4}). This result generalizes analogous results of Syrgkanis et al. (2015) who showed that in finite games the optimal regret can be accelerated from O(√T) in the adversarial setting to O(T^{1/4}) in the game setting.
To establish the above results, we introduce several new techniques, including: a hierarchical aggregation rule to achieve the optimal cumulative loss for real-valued classes, a multi-scale extension of the proper online realizable learner of Hanneke et al. (2021), an approach to show that the output of such nonparametric learning algorithms is stable, and a proof that the minimax theorem holds in all online learnable games.
It was recently shown that almost all solutions in the symmetric binary perceptron are isolated, even at low constraint densities, suggesting that finding typical solutions is hard. In contrast, some algorithms have been shown empirically to succeed in finding solutions at low density. This phenomenon has been justified numerically by the existence of subdominant and dense connected regions of solutions, which are accessible by simple learning algorithms. In this paper, we establish formally such a phenomenon for both the symmetric and asymmetric binary perceptrons. We show that at low constraint density (equivalently for overparametrized perceptrons), there exists indeed a subdominant connected cluster of solutions with almost maximal diameter, and that an efficient multiscale majority algorithm can find solutions in such a cluster with high probability, settling in particular an open problem posed by Perkins-Xu in STOC'21. In addition, even close to the critical threshold, we show that there exist clusters of linear diameter for the symmetric perceptron, as well as for the asymmetric perceptron under additional assumptions.
We study the problem of PAC learning halfspaces on ℝd with Massart noise under the Gaussian distribution. In the Massart model, an adversary is allowed to flip the label of each point x with unknown probability η(x) ≤ η, for some parameter η ∈ [0,1/2]. The goal is to find a hypothesis with misclassification error of OPT + є, where OPT is the error of the target halfspace. This problem had been previously studied under two assumptions: (i) the target halfspace is homogeneous (i.e., the separating hyperplane goes through the origin), and (ii) the parameter η is strictly smaller than 1/2. Prior to this work, no nontrivial bounds were known when either of these assumptions is removed. We study the general problem and establish the following:
For η < 1/2, we give a learning algorithm for general halfspaces with sample and computational complexity d^{O_η(log(1/γ))}·poly(1/є), where γ = max{є, min{Pr[f(x) = 1], Pr[f(x) = −1]}} is the “bias” of the target halfspace f. Prior efficient algorithms could only handle the special case of γ = 1/2. Interestingly, we establish a qualitatively matching lower bound of d^{Ω(log(1/γ))} on the complexity of any Statistical Query (SQ) algorithm.
For η = 1/2, we give a learning algorithm for general halfspaces with sample and computational complexity O_є(1)·d^{O(log(1/є))}. This result is new even for the subclass of homogeneous halfspaces; prior algorithms for homogeneous Massart halfspaces provide vacuous guarantees for η = 1/2. We complement our upper bound with a nearly-matching SQ lower bound of d^{Ω(log(1/є))}, which holds even for the special case of homogeneous halfspaces.
Taken together, our results qualitatively characterize the complexity of learning general halfspaces with general Massart noise under Gaussian marginals. Our techniques rely on determining the existence (or non-existence) of low-degree polynomials whose expectations distinguish Massart halfspaces from random noise.
Branchwidth determines how graphs, and more generally, arbitrary connectivity (basically symmetric and submodular) functions could be decomposed into a tree-like structure by specific cuts. We develop a general framework for designing fixed-parameter tractable (FPT) 2-approximation algorithms for branchwidth of connectivity functions. The first ingredient of our framework is combinatorial. We prove a structural theorem establishing that either a sequence of particular refinement operations could decrease the width of a branch decomposition or that the width of the decomposition is already within a factor of 2 from the optimum. The second ingredient is an efficient implementation of the refinement operations for branch decompositions that support efficient dynamic programming. We present two concrete applications of our general framework.
An algorithm that for a given n-vertex graph G and integer k in time 2^{2^{O(k)}}·n^2 either constructs a rank decomposition of G of width at most 2k or concludes that the rankwidth of G is more than k. It also yields a (2^{2k+1}−1)-approximation algorithm for cliquewidth within the same time complexity, which, in turn, improves to f(k)·n^2 the running times of various algorithms on graphs of cliquewidth k. Breaking the “cubic barrier” for rankwidth and cliquewidth was an open problem in the area.
An algorithm that for a given n-vertex graph G and integer k in time 2^{O(k)}·n either constructs a branch decomposition of G of width at most 2k or concludes that the branchwidth of G is more than k. This improves over the 3-approximation that follows from the recent treewidth 2-approximation of Korhonen [FOCS 2021].
In the F-minor-free deletion problem we are given an undirected graph G and the goal is to find a minimum vertex set that intersects all minor models of graphs from the family F. This captures numerous important problems including Vertex cover, Feedback vertex set, Treewidth-η modulator, and Vertex planarization. In the latter one, we ask for a minimum vertex set whose removal makes the graph planar. This is a special case of F-minor-free deletion for the family F = {K_5, K_{3,3}}.
Whenever the family F contains at least one planar graph, then F-minor-free deletion is known to admit a constant-factor approximation algorithm and a polynomial kernelization [Fomin, Lokshtanov, Misra, and Saurabh, FOCS’12]. A polynomial kernelization is a polynomial-time algorithm that, given a graph G and integer k, outputs a graph G′ on poly(k) vertices and integer k′, so that OPT(G) ≤ k if and only if OPT(G′) ≤ k′. The Vertex planarization problem is arguably the simplest setting for which F does not contain a planar graph and the existence of a constant-factor approximation or a polynomial kernelization remains a major open problem.
In this work we show that Vertex planarization admits an algorithm which is a combination of both approaches. Namely, we present a polynomial α-approximate kernelization, for some constant α > 1, based on the framework of lossy kernelization [Lokshtanov, Panolan, Ramanujan, and Saurabh, STOC’17]. Simply speaking, when given a graph G and integer k, we show how to compute a graph G′ on poly(k) vertices so that any β-approximate solution to G′ can be lifted to an (α· β)-approximate solution to G, as long as OPT(G) ≤ k/α· β. In order to achieve this, we develop a framework for sparsification of planar graphs which approximately preserves all separators and near-separators between subsets of the given terminal set.
Our result yields an improvement over the state-of-the-art approximation algorithms for Vertex planarization. The problem admits a polynomial-time O(n^є)-approximation algorithm, for any є > 0, and a quasi-polynomial-time (log n)^{O(1)}-approximation algorithm, where n is the input size, both randomized [Kawarabayashi and Sidiropoulos, FOCS’17]. By pipelining these algorithms with our approximate kernelization, we improve the approximation factors to O(OPT^є) and (log OPT)^{O(1)}, respectively.
We prove that Graph Isomorphism and Canonization in graphs excluding a fixed graph H as a minor can be solved by an algorithm working in time f(H)·n^{O(1)}, where f is some function. In other words, we show that these problems are fixed-parameter tractable when parameterized by the size of the excluded minor, with the caveat that the bound on the running time is not necessarily computable. The underlying approach is based on decomposing the graph in a canonical way into unbreakable (intuitively, well-connected) parts, which essentially provides a reduction to the case where the given H-minor-free graph is unbreakable itself. This is complemented by an analysis of unbreakable H-minor-free graphs, which reveals that every such graph can be canonically decomposed into a part that admits few automorphisms and a part that has bounded treewidth.
We establish a list of characterizations of bounded twin-width for hereditary classes of totally ordered graphs: as classes of at most exponential growth studied in enumerative combinatorics, as monadically NIP classes studied in model theory, as classes that do not transduce the class of all graphs studied in finite model theory, and as classes for which model checking first-order logic is fixed-parameter tractable studied in algorithmic graph theory.
This has several consequences. First, it allows us to show that every hereditary class of ordered graphs either has at most exponential growth, or has at least factorial growth. This settles a question first asked by Balogh, Bollobás, and Morris [Eur. J. Comb. ’06] on the growth of hereditary classes of ordered graphs, generalizing the Stanley-Wilf conjecture/Marcus-Tardos theorem. Second, it gives a fixed-parameter approximation algorithm for twin-width on ordered graphs. Third, it yields a full classification of fixed-parameter tractable first-order model checking on hereditary classes of ordered binary structures. Fourth, it provides a model-theoretic characterization of classes with bounded twin-width. Finally, it settles our small conjecture [SODA ’21] in the case of ordered graphs.
We show a flow-augmentation algorithm in directed graphs: there exists a randomized polynomial-time algorithm that, given a directed graph G, two vertices s,t ∈ V(G), and an integer k, adds (randomly) to G a number of arcs such that for every minimal st-cut Z in G of size at most k, with probability 2^{−poly(k)} the set Z becomes a minimum st-cut in the resulting graph.
The directed flow-augmentation tool allows us to prove (randomized) fixed-parameter tractability of a number of problems parameterized by the cardinality of the deletion set, whose parameterized complexity status was repeatedly posed as open problems:
Chain SAT, defined by Chitnis, Egri, and Marx [ESA’13, Algorithmica’17], and a number of weighted variants of classic directed cut problems, such as Weighted st-Cut, Weighted Directed Feedback Vertex Set, and Weighted Almost 2-SAT.
By proving that Chain SAT is (randomized) FPT, we confirm a conjecture of Chitnis, Egri, and Marx that, for any graph H, if the List H-Coloring problem is polynomial-time solvable, then the corresponding vertex-deletion problem is fixed-parameter tractable (with the remark that our algorithms are randomized).
In interactive coding, Alice and Bob wish to compute some function f of their individual private inputs x and y. They do this by engaging in a non-adaptive (fixed order, fixed length) interactive protocol to jointly compute f(x,y). The goal is to do this in an error-resilient way, such that even given some fraction of adversarial corruptions to the protocol, both parties still learn f(x,y).
In this work, we study the optimal error resilience of such a protocol in the face of adversarial bit flip or erasures. While the optimal error resilience of such a protocol over a large alphabet is well understood, the situation over the binary alphabet has remained open. Firstly, we determine the optimal error resilience of an interactive coding scheme over the binary erasure channel to be 1/2, by constructing a protocol that achieves this previously known upper bound. The communication complexity of our binary erasure protocol is linear in the size of the minimal noiseless protocol computing f. Secondly, we determine the optimal error resilience over the binary bit flip channel for the message exchange problem (where f(x,y)=(x,y)) to be 1/6. The communication complexity of our protocol is polynomial in the size of the parties’ inputs. Note that this implies an interactive coding scheme for any f resilient to 1/6 errors with an exponential blowup in communication complexity.
Investigating the computational resources we need for cryptography is an essential task of both theoretical and practical interests. This paper provides answers to this problem on pseudorandom functions (PRFs). We resolve the exact complexity of PRFs by proving tight upper and lower bounds for various circuit models.
* PRFs can be constructed by general circuits of size 2n + o(n), assuming the existence of polynomial-size PRFs, simplifying and improving the O(n) upper bound by Ishai, Kushilevitz, Ostrovsky, and Sahai (STOC 2008). Moreover, if PRFs exist in NC1, we can further guarantee the depth of our construction to be (1+є)·log n. We show that our construction is almost optimal by giving an unconditional 2n − O(1) lower bound.
* PRFs can be constructed in AC0[2] circuits of o(n) gates and 2n+o(n) wires assuming the existence of polynomial-size AC0[2] PRFs. We show the optimality of our construction with a 2n+Ω(√n) wire complexity lower bound.
* PRFs can be constructed with wire complexity n^{1+O(1.61^{−d})} in depth-d TC0 circuits assuming the existence of polynomial-size TC0 PRFs. We also present an n^{1+Ω(c^{−d})} wire complexity lower bound against depth-d TC0 circuits for some c > 1.61.
As a byproduct, we prove unconditional tight upper and lower bounds for “almost” universal hash functions that are of independent interest.
Following the natural proof barrier of Razborov and Rudich (J. Comput. Syst. Sci. 1997), we observe that our exact complexity results are closely related to the “bootstrapping phenomena” in circuit complexity (such as hardness magnification and quantified derandomization). We introduce the black-box natural proof barrier and show that a large range of techniques for bootstrapping results cannot be combined with “black-box” lower bound proofs to obtain a breakthrough.
We consider the P-CSP problem for 3-ary predicates P on satisfiable instances. We show that under certain conditions on P and a (1,s) integrality gap instance of the P-CSP problem, it can be translated into a dictatorship vs. quasirandomness test with perfect completeness and soundness s+ε, for every constant ε>0. Compared to Raghavendra’s result [STOC, 2008], we do not lose perfect completeness. This is particularly interesting as this test implies new hardness results on satisfiable constraint satisfaction problems, assuming the Rich 2-to-1 Games Conjecture by Braverman, Khot, and Minzer [ITCS, 2021]. Our result can be seen as the first step of a potentially long-term challenging program of characterizing optimal inapproximability of every satisfiable k-ary CSP.
At the heart of the reduction is our main analytical lemma for a class of 3-ary predicates, which is a generalization of a lemma by Mossel [Geometric and Functional Analysis, 2010]. The lemma, and a further generalization of it that we conjecture, may be of independent interest.
A µ-biased Max-CSP instance with predicate ψ:{0,1}r → {0,1} is an instance of Constraint Satisfaction Problem (CSP) where the objective is to find a labeling of relative weight at most µ which satisfies the maximum fraction of constraints. Biased CSPs are versatile and express several well studied problems such as Densest-k-Sub(Hyper)graph and SmallSetExpansion.
In this work, we explore the role played by the bias parameter µ on the approximability of biased CSPs. We show that the approximability of such CSPs can be characterized (up to loss of factors in arity r) using the bias-approximation curve of Densest-k-SubHypergraph (DkSH). In particular, this gives a tight characterization of predicates which admit approximation guarantees that are independent of the bias parameter µ.
Motivated by the above, we give new approximation and hardness results for DkSH. In particular, assuming the Small Set Expansion Hypothesis (SSEH), we show that DkSH with arity r and k = µ n is NP-hard to approximate to a factor of Ω(r3µr−1log(1/µ)) for every r ≥ 2 and µ < 2−r. We also give an O(µr−1log(1/µ))-approximation algorithm for the same setting. Our upper and lower bounds are tight up to constant factors, when the arity r is a constant, and in particular, imply the first tight approximation bounds for the Densest-k-Subgraph problem in the linear bias regime. Furthermore, using the above characterization, our results also imply matching algorithms and hardness for every biased CSP of constant arity.
We prove that for every 3-player (3-prover) game, with binary questions and answers and value <1, the value of the n-fold parallel repetition of the game decays polynomially fast to 0. That is, for every such game, there exists a constant c>0, such that the value of the n-fold parallel repetition of the game is at most n−c.
Along the way to proving this theorem, we prove two additional parallel repetition theorems for multiplayer (multiprover) games, that may be of independent interest:
Playerwise Connected Games (with any number of players and any alphabet size): We identify a large class of multiplayer games and prove that for every game with value <1 in that class, the value of the n-fold parallel repetition of the game decays polynomially fast to 0.
More precisely, our result applies for playerwise connected games, with any number of players and any alphabet size: For each player i, we define the graph Gi, whose vertices are the possible questions for that player and two questions x,x′ are connected by an edge if there exists a vector y of questions for all other players, such that both (x,y) and (x′,y) are asked by the referee with non-zero probability. We say that the game is playerwise connected if for every i, the graph Gi is connected.
Our class of playerwise connected games is strictly larger than the class of connected games that was defined by Dinur, Harsha, Venkat and Yuen (ITCS 2017) and for which they proved exponential decay bounds on the value of parallel repetition. For playerwise connected games that are not connected, only inverse Ackermann decay bounds were previously known (Verbitsky 1996).
Exponential Bounds for the Anti-Correlation Game: In the 3-player anti-correlation game, two out of three players are given 1 as input, and the remaining player is given 0. The two players who were given 1 must produce different outputs in {0,1}. We prove that the value of the n-fold parallel repetition of that game decays exponentially fast to 0. That is, there exists a constant c>0, such that the value of the n-fold parallel repetition of the game is at most 2−c n. Only inverse Ackermann decay bounds were previously known (Verbitsky 1996).
The 3-player anti-correlation game was studied and motivated in several previous works. In particular, Holmgren and Yang (STOC 2019) gave it as an example for a 3-player game whose non-signaling value (is smaller than 1 and yet) does not decrease at all under parallel repetition.
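As a small concrete aside (not part of the paper’s contribution), the classical value of the anti-correlation game described above can be computed by brute force: shared randomness only yields convex combinations of deterministic strategies, so it suffices to maximize over the 4^3 deterministic strategies. A minimal Python sketch:

from itertools import product

def anti_correlation_classical_value():
    # One uniformly random player receives 0, the other two receive 1, and the
    # two players holding 1 must output different bits (the abstract places no
    # constraint on the player holding 0).  A deterministic strategy for each
    # player is a map {0,1} -> {0,1}.
    best = 0.0
    for strat in product(product((0, 1), repeat=2), repeat=3):
        wins = 0
        for zero_player in range(3):            # referee's uniformly random choice
            a, b = [p for p in range(3) if p != zero_player]
            if strat[a][1] != strat[b][1]:      # both a and b received input 1
                wins += 1
        best = max(best, wins / 3)
    return best

print(anti_correlation_classical_value())       # brute force reports 2/3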
In the ε-Consensus-Halving problem, we are given n probability measures v1, …, vn on the interval R = [0,1], and the goal is to partition R into two parts R+ and R− using at most n cuts, so that |vi(R+) − vi(R−)| ≤ ε for all i. This fundamental fair division problem was the first natural problem shown to be complete for the class PPA, and all subsequent PPA-completeness results for other natural problems have been obtained by reducing from it.
We show that ε-Consensus-Halving is PPA-complete even when the parameter ε is a constant. In fact, we prove that this holds for any constant ε < 1/5. As a result, we obtain constant inapproximability results for all known natural PPA-complete problems, including Necklace-Splitting, the Discrete-Ham-Sandwich problem, two variants of the pizza sharing problem, and for finding fair independent sets in cycles and paths.
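To make the problem statement concrete, the following sketch (an illustration of the definition only, not of the PPA-hardness machinery) takes the measures as interval-mass functions and a list of cuts, labels the induced intervals alternately R+ and R−, and reports max_i |vi(R+) − vi(R−)|. The toy instance with two measures and two cuts is ours.

def consensus_halving_error(measures, cuts):
    # measures: each entry is a function mass(a, b) returning v_i([a, b]).
    # cuts: positions in [0, 1]; the induced intervals are labeled +, -, +, ...
    points = [0.0] + sorted(cuts) + [1.0]
    worst = 0.0
    for mass in measures:
        diff, sign = 0.0, +1
        for a, b in zip(points, points[1:]):
            diff += sign * mass(a, b)
            sign = -sign
        worst = max(worst, abs(diff))
    return worst                                 # a valid solution makes this <= eps

uniform = lambda a, b: b - a                     # density 1 on [0, 1]
tilted  = lambda a, b: b ** 2 - a ** 2           # density 2x, also total mass 1
print(consensus_halving_error([uniform, tilted], cuts=[0.25, 0.75]))   # 0.0: both balanced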
Counting graph homomorphisms and its generalizations such as the Counting Constraint Satisfaction Problem (CSP), its variations, and counting problems in general have been intensively studied since the pioneering work of Valiant. While the complexity of exact counting of graph homomorphisms (Dyer and Greenhill, 2000) and the counting CSP (Bulatov, 2013, and Dyer and Richerby, 2013) is well understood, counting modulo some natural number has attracted considerable interest as well. In their 2015 paper Faben and Jerrum suggested a conjecture stating that counting homomorphisms to a fixed graph H modulo a prime number is hard whenever it is hard to count exactly, unless H has automorphisms of a certain kind. In this paper we confirm this conjecture. As a part of this investigation we develop techniques that widen the spectrum of reductions available for modular counting and apply to the general CSP rather than being limited to graph homomorphisms.
The (k,z)-clustering problem consists of finding a set of k points called centers, such that the sum of distances raised to the power of z of every data point to its closest center is minimized. Among the most commonly encountered special cases are the k-median problem (z=1) and the k-means problem (z=2). The k-median and k-means problems are at the heart of modern data analysis, and massive data applications have given rise to the notion of coreset: a small (weighted) subset of the input point set preserving the cost of any solution to the problem up to a multiplicative (1± ε) factor, hence reducing from large to small scale the input to the problem.
While there has been an intensive effort to understand what is the best coreset size possible for both problems in various metric spaces, there is still a significant gap between the state-of-the-art upper and lower bounds. In this paper, we make progress on both upper and lower bounds, obtaining tight bounds for several cases, namely:
• In finite n point general metrics, any coreset must consist of Ω(k logn / ε2) points. This improves on the Ω(k logn /ε) lower bound of Braverman, Jiang, Krauthgamer, and Wu [ICML’19] and matches the upper bounds proposed for k-median by Feldman and Langberg [STOC’11] and k-means by Cohen-Addad, Saulpic and Schwiegelshohn [STOC’21] up to polylog factors.
• For doubling metrics with doubling constant D, any coreset must consist of Ω(k D/ε2) points. This matches the k-median and k-means upper bounds by Cohen-Addad, Saulpic, and Schwiegelshohn [STOC’21] up to polylog factors.
• In d-dimensional Euclidean space, any coreset for (k,z) clustering requires Ω(k/ε2) points. This improves on the Ω(k/√ε) lower bound of Baker, Braverman, Huang, Jiang, Krauthgamer, and Wu [ICML’20] for k-median and complements the Ω(kmin(d,2z/20)) lower bound of Huang and Vishnoi [STOC’20].
We complement our lower bound for d-dimensional Euclidean space with the construction of a coreset of size Õ(k/ε2· min(ε−z,k)). This improves over the Õ(k2 ε−4) upper bound for general power of z proposed by Braverman, Jiang, Krauthgamer, and Wu [SODA’21] and over the Õ(k/ε4) upper bound for k-median by Huang and Vishnoi [STOC’20]. In fact, ours is the first construction breaking through the ε−2· min(d,ε−2) barrier inherent in all previous coreset constructions. To do this, we employ a novel chaining based analysis that may be of independent interest. Together, our upper and lower bounds for k-median in Euclidean spaces are tight up to a factor O(ε−1 polylog k/ε).
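As an illustration of the two definitions behind this abstract, the sketch below computes the (k,z)-clustering cost and spot-checks the multiplicative error of a weighted subset on a handful of candidate solutions. The uniform sample used here is only a stand-in for a coreset (real constructions use importance sampling), and all parameters are illustrative.

import numpy as np

def clustering_cost(points, centers, z, weights=None):
    # (k, z)-clustering cost: weighted sum of (distance to nearest center)^z
    # (z = 1 is k-median, z = 2 is k-means).
    d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2).min(axis=1)
    w = np.ones(len(points)) if weights is None else weights
    return float(w @ d ** z)

def coreset_distortion(points, core_pts, core_w, candidate_solutions, z):
    # Worst relative error over the given candidate center sets; a (1 +/- eps)-coreset
    # keeps this below eps for *every* set of k centers, not just a sample of them.
    return max(abs(clustering_cost(core_pts, C, z, core_w) /
                   clustering_cost(points, C, z) - 1.0) for C in candidate_solutions)

rng = np.random.default_rng(1)
X = rng.standard_normal((2000, 2))
idx = rng.choice(len(X), size=200, replace=False)            # naive uniform "coreset"
coreset, w = X[idx], np.full(200, len(X) / 200.0)
trials = [rng.standard_normal((3, 2)) for _ in range(5)]     # a few 3-center solutions
print(coreset_distortion(X, coreset, w, trials, z=2))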
Given two point sets A and B in ℝd of size n each, for some constant dimension d≥ 1, and a parameter ε>0, we present a deterministic algorithm that computes, in n·(ε−1 logn)O(d) time, a perfect matching between A and B whose cost is within a (1+ε) factor of the optimal matching under any ℓp-norm. Although a Monte-Carlo algorithm with a similar running time is proposed by Raghvendra and Agarwal [J. ACM 2020], the best-known deterministic ε-approximation algorithm takes Ω(n3/2) time. Our algorithm constructs a (refinement of a) tree cover of ℝd, and we develop several new tools to apply a tree-cover based approach to compute an ε-approximate perfect matching.
Chan, Har-Peled, and Jones [2020] recently developed locality-sensitive ordering (LSO), a new tool that allows one to reduce problems in the Euclidean space ℝd to the 1-dimensional line. They used LSO’s to solve a host of problems. Later, Buchin, Har-Peled, and Oláh [2019,2020] used the LSO of Chan et al. to construct very sparse reliable spanners for the Euclidean space. A highly desirable feature of a reliable spanner is its ability to withstand a massive failure: the network remains functioning even if 90% of the nodes fail. In a follow-up work, Har-Peled, Mendel, and Oláh [2021] constructed reliable spanners for general and topologically structured metrics. Their construction used a different approach, and is based on sparse covers.
In this paper, we develop the theory of LSO’s in non-Euclidean metrics by introducing new types of LSO’s suitable for general and topologically structured metrics. We then construct such LSO’s, as well as constructing considerably improved LSO’s for doubling metrics. Afterwards, we use our new LSO’s to construct reliable spanners with improved stretch and sparsity parameters. Most prominently, we construct Õ(n)-size reliable spanners for trees and planar graphs with the optimal stretch of 2. Along the way to the construction of LSO’s and reliable spanners, we introduce and construct ultrametric covers, and construct 2-hop reliable spanners for the line.
We (nearly) settle the time complexity for computing vertex fault-tolerant (VFT) spanners with optimal sparsity (up to polylogarithmic factors). VFT spanners are sparse subgraphs that preserve distance information, up to a small multiplicative stretch, in the presence of vertex failures. These structures were introduced by [Chechik et al., STOC 2009] and have received a lot of attention since then.
Recent work provided algorithms for computing VFT spanners with optimal sparsity but in exponential runtime. The first polynomial time algorithms for these structures have been given by [Bodwin, Dinitz and Robelle, SODA 2021]. Their algorithms, as all other prior algorithms, are greedy and thus inherently sequential. We provide algorithms for computing nearly optimal f-VFT spanners for any n-vertex m-edge graph, with near optimal running time in several computational models:
(i) A randomized sequential algorithm with a runtime of O(m) (i.e., independent of the number of faults f). The state-of-the-art time bound is O(f1−1/k· n2+1/k+f2 m) by [Bodwin, Dinitz and Robelle, SODA 2021].
(ii) A distributed CONGEST algorithm running in O(1) rounds, improving upon [Dinitz and Robelle, PODC 2020], who obtained FT spanners with near-optimal sparsity in O(f2) rounds.
(iii) A PRAM (CRCW) algorithm with O(m) work and O(1) depth. Prior bounds implied by [Dinitz and Krauthgamer, PODC 2011] obtained sub-optimal FT spanners using O(f3m) work and O(f3) depth.
An immediate corollary provides the first nearly-optimal PRAM algorithm for computing nearly optimal λ-vertex connectivity certificates using polylogarithmic depth and near-linear work. This improves the state-of-the-art parallel bounds of O(1) depth and O(λ m) work, by [Karger and Motwani, STOC’93].
We present the first compact distance oracle that tolerates multiple failures and maintains *exact* distances. Given an undirected weighted graph G = (V, E) and an arbitrarily large constant d, we construct an oracle that given vertices u, v ∈ V and a set of d edge failures D, outputs the *exact* distance between u and v in G − D (that is, G with edges in D removed). Our oracle has space complexity O(d n4) and query time dO(d). Previously, there were compact *approximate* distance oracles under multiple failures [Chechik, Cohen, Fiat, and Kaplan, SODA’17; Duan, Gu, and Ren, SODA’21], but the best exact distance oracles under d failures require essentially Ω(nd) space [Duan and Pettie, SODA’09]. Our distance oracle seems to require nΩ(d) time to preprocess; we leave it as an open question to improve this preprocessing time.
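For concreteness, the query semantics the oracle must reproduce is just a shortest-path computation in G − D; the naive baseline below answers each query by running Dijkstra from scratch while skipping the failed edges, which is exactly the per-query work the data structure is meant to avoid. The tiny graph and all names are ours.

import heapq

def distance_avoiding(adj, u, v, failed):
    # Exact u-v distance in G - D: Dijkstra that ignores the failed edges.
    failed = {frozenset(e) for e in failed}
    dist, heap = {u: 0.0}, [(0.0, u)]
    while heap:
        d, x = heapq.heappop(heap)
        if x == v:
            return d
        if d > dist.get(x, float("inf")):
            continue
        for y, w in adj[x]:
            if frozenset((x, y)) in failed:
                continue
            if d + w < dist.get(y, float("inf")):
                dist[y] = d + w
                heapq.heappush(heap, (d + w, y))
    return float("inf")

# a 4-cycle: failing edge (0, 1) forces the detour 0-3-2-1 of length 3
adj = {0: [(1, 1.0), (3, 1.0)], 1: [(0, 1.0), (2, 1.0)],
       2: [(1, 1.0), (3, 1.0)], 3: [(2, 1.0), (0, 1.0)]}
print(distance_avoiding(adj, 0, 1, failed=[(0, 1)]))   # 3.0 instead of 1.0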
We revisit the task of computing the edit distance in sublinear time. In the (k,K)-gap edit distance problem we are given oracle access to two strings of length n and the task is to distinguish whether their edit distance is at most k or at least K. It has been established by Goldenberg, Krauthgamer and Saha (FOCS ’19), with improvements by Kociumaka and Saha (FOCS ’20), that the (k,k2)-gap problem can be solved in time O(n/k + poly(k)). One of the most natural questions in this line of research is whether the (k,k2)-gap is best-possible for the running time O(n/k + poly(k)).
In this work we answer this question by significantly improving the gap. Specifically, we show that in time O(n/k + poly(k)) we can even solve the (k,k1+o(1))-gap problem. This is the first algorithm that breaks the (k,k2)-gap in this running time. Our algorithm is almost optimal in the following sense: In the low distance regime (k ≤ n0.19) our running time becomes O(n/k), which matches a known n/k1+o(1) lower bound for the (k,k1+o(1))-gap problem up to lower order factors.
Our result also reveals a surprising similarity of Hamming distance and edit distance in the low distance regime: For both, the (k,k1+o(1))-gap problem has time complexity n/k1± o(1) for small k.
In contrast to previous work, which employed a subsampled variant of the Landau-Vishkin algorithm, we instead build upon the algorithm of Andoni, Krauthgamer and Onak (FOCS ’10), which approximates the edit distance within a polylogarithmic factor in almost-linear time O(n1+ε). We first simplify their approach and then show how to effectively prune their computation tree in order to obtain a sublinear-time algorithm in the given time bound. Towards that, we use a variety of structural insights on the (local and global) patterns that can emerge during this process and design appropriate property testers to effectively detect these patterns.
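For reference, deciding ed(x,y) ≤ k exactly is easy if one may read the entire input: the classical banded dynamic program below runs in O((|x|+|y|)·k) time. This linear-in-n baseline is precisely what the sublinear-time gap algorithms above aim to beat; the code is a generic textbook sketch, not the pruning algorithm of this paper.

def edit_distance_at_most(x, y, k):
    # Banded (Ukkonen-style) DP restricted to the diagonals |i - j| <= k,
    # with values saturated at k + 1, deciding whether ed(x, y) <= k.
    n, m = len(x), len(y)
    if abs(n - m) > k:
        return False
    INF = k + 1
    prev = [j if j <= k else INF for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [INF] * (m + 1)
        lo, hi = max(0, i - k), min(m, i + k)
        if lo == 0:
            cur[0] = i if i <= k else INF
        for j in range(max(1, lo), hi + 1):
            sub = prev[j - 1] + (x[i - 1] != y[j - 1])
            ins = cur[j - 1] + 1 if j - 1 >= lo else INF
            dele = prev[j] + 1
            cur[j] = min(sub, ins, dele, INF)
        prev = cur
    return prev[m] <= k

print(edit_distance_at_most("kitten", "sitting", 3))   # True  (the distance is 3)
print(edit_distance_at_most("kitten", "sitting", 2))   # False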
In this paper, we consider the problems of edge sampling, edge counting, and triangle counting from the area of sublinear-time algorithms. Part of our contribution is that we consider three different settings, differing in the way in which one may access the neighborhood of a given vertex. Previous work considered indexed neighbor access, where a query returns the i-th neighbor of a given vertex. The full neighborhood access model, in which a query returns the entire neighborhood of a vertex at unit cost, has recently been considered in the applied community. Between these, we propose hash-ordered neighbor access, inspired by coordinated sampling, where we have a global fully random hash function, and can access neighbors in order of their hash values, paying a constant for each accessed neighbor.
For edge sampling and counting, our new lower bounds are in the most powerful full neighborhood access model. We provide matching upper bounds in the weaker hash-ordered neighbor access model. Our new faster algorithms can be provably implemented efficiently on massive graphs in external memory and with the current APIs for, e.g., Twitter or Wikipedia. For triangle counting, we provide a separation: a better upper bound with full neighborhood access than the known lower bounds with indexed neighbor access. The technical core of our paper is our edge-sampling algorithm on which the other results depend. We now describe our results on the classic problems of edge and triangle counting.
We give an algorithm that uses hash-ordered neighbor access to approximately count edges in time Õ(n/є √m + 1/є2) (compared to the state of the art without hash-ordered neighbor access of Õ(n/є2 √m) by Eden, Ron, and Seshadhri [ICALP 2017]). We present an Ω(n/є √m) lower bound for є ≥ √m/n in the full neighborhood access model. This improves the lower bound of Ω(n/√є m) by Goldreich and Ron [Rand. Struct. Alg. 2008] and it matches our new upper bound for є ≥ √m/n. We also show an algorithm that uses the more standard assumption of pair queries (“are the vertices u and v adjacent?”), with time complexity of Õ(n/є √m + 1/є4). This matches our lower bound for є ≥ m1/6/n1/3.
Finally, we focus on triangle counting. For this, we use the full power of full neighborhood access. In the indexed neighbor model, an algorithm making Õ(n/є10/3 T1/3 + min(m,m3/2/є3 T)) queries, where T is the number of triangles, is known, and this is known to be the best possible up to the dependency on є (Eden, Levi, Ron, and Seshadhri [FOCS 2015]). We improve this significantly to Õ(min(n,n/є T1/3 + √n m/є2 √T)) full neighborhood accesses, thus showing that full neighborhood access is fundamentally stronger for triangle counting than the weaker indexed neighbor model. We also give a lower bound, showing that this is the best possible with full neighborhood access, in terms of n,m,T.
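A toy rendering of the proposed access model may help fix ideas: the class below stores, for each vertex, its adjacency list sorted by a shared random hash, so that a query returns the i-th neighbor in hash order at unit cost. This is only an interface sketch with our own naming, not the edge-sampling algorithm built on top of it.

import random

class HashOrderedGraph:
    # Hash-ordered neighbor access: one global random hash h over the vertices;
    # neighbor(v, i) returns v's i-th neighbor in increasing order of h.
    def __init__(self, adj, seed=0):
        rnd = random.Random(seed)
        self.h = {v: rnd.random() for v in adj}
        self.sorted_adj = {v: sorted(nbrs, key=self.h.__getitem__)
                           for v, nbrs in adj.items()}
    def degree(self, v):
        return len(self.sorted_adj[v])
    def neighbor(self, v, i):
        return self.sorted_adj[v][i]
    def min_hash_neighbor(self, v):
        # the kind of coordinated-sampling primitive such algorithms build on
        return self.sorted_adj[v][0]

G = HashOrderedGraph({0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]})
print([G.neighbor(0, i) for i in range(G.degree(0))])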
We study iterative methods based on Krylov subspaces for low-rank approximation under any Schatten-p norm. Here, given access to a matrix A through matrix-vector products, an accuracy parameter є, and a target rank k, the goal is to find a rank-k matrix Z with orthonormal columns such that || A (I − Z Z⊤) ||Sp ≤ (1+є) · min{ || A (I − U U⊤) ||Sp : U⊤U = Ik }, where || M ||Sp denotes the ℓp norm of the singular values of M. For the special cases of p=2 (Frobenius norm) and p = ∞ (Spectral norm), Musco and Musco (NeurIPS 2015) obtained an algorithm based on Krylov methods that uses Õ(k/√є) matrix-vector products, improving on the naïve Õ(k/є) dependence obtainable by the power method, where Õ(·) suppresses poly(log(dk/є)) factors.
Our main result is an algorithm that uses only Õ(kp1/6/є1/3) matrix-vector products, and works for all, not necessarily constant, p ≥ 1. For p = 2 our bound improves the previous Õ(k/є1/2) bound to Õ(k/є1/3). Since the Schatten-p and Schatten-∞ norms of any matrix are the same up to a 1+ є factor when p ≥ (logd)/є, our bound recovers the result of Musco and Musco for p = ∞. Further, we prove a matrix-vector query lower bound of Ω(1/є1/3) for any fixed constant p ≥ 1, showing that surprisingly Θ(1/є1/3) is the optimal complexity for constant k.
To obtain our results, we introduce several new techniques, including optimizing over multiple Krylov subspaces simultaneously, and pinching inequalities for partitioned operators. Our lower bound for p ∈ [1,2] uses the Araki-Lieb-Thirring trace inequality, whereas for p>2, we appeal to a norm-compression inequality for aligned partitioned operators. As our algorithms only require matrix-vector product access, they can be applied in settings where alternative techniques such as sketching cannot, e.g., to covariance matrices, Hessians defined implicitly by a neural network, and arbitrary polynomials of a matrix.
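For intuition, here is a plain randomized block Krylov sketch in the spirit of Musco and Musco, the baseline against which the bounds above should be compared; it is not the multi-subspace method of this paper, and the test matrix and parameters are illustrative only (assuming NumPy).

import numpy as np

def schatten_error(A, Z, p):
    # || A (I - Z Z^T) ||_{S_p}, the quantity being minimized above
    s = np.linalg.svd(A - (A @ Z) @ Z.T, compute_uv=False)
    return np.sum(s ** p) ** (1.0 / p)

def block_krylov_lowrank(A, k, q, rng):
    # Build the block Krylov subspace [Om, (A^T A) Om, ..., (A^T A)^q Om],
    # orthonormalize it, and return the best rank-k Z inside that subspace;
    # this uses O(kq) matrix-vector products with A and A^T.
    B = rng.standard_normal((A.shape[1], k))
    blocks = [B]
    for _ in range(q):
        B = A.T @ (A @ B)
        blocks.append(B)
    Q, _ = np.linalg.qr(np.hstack(blocks))
    _, _, Vt = np.linalg.svd(A @ Q, full_matrices=False)
    return Q @ Vt[:k].T                          # d x k, orthonormal columns

rng = np.random.default_rng(0)
A = rng.standard_normal((400, 200)) * np.linspace(1.0, 0.01, 200)   # decaying column scales
Z = block_krylov_lowrank(A, k=10, q=6, rng=rng)
print(schatten_error(A, Z, p=3))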
We present a new sublinear time algorithm for approximating the spectral density (eigenvalue distribution) of an n× n normalized graph adjacency or Laplacian matrix. The algorithm recovers the spectrum up to є accuracy in the Wasserstein-1 distance in O(n· (1/є)) time given sample access to the graph. This result complements recent work by David Cohen-Steiner, Weihao Kong, Christian Sohler, and Gregory Valiant (2018), which obtains a solution with runtime independent of n, but exponential in 1/є. We conjecture that the trade-off between dimension dependence and accuracy is inherent.
Our method is simple and works well experimentally. It is based on a Chebyshev polynomial moment matching method that employs randomized estimators for the matrix trace. We prove that, for any Hermitian A, this moment matching method returns an є-approximation to the spectral density using just O(1/є) matrix-vector products with A. By leveraging stability properties of the Chebyshev polynomial three-term recurrence, we then prove that the method is amenable to the use of coarse approximate matrix-vector products. Our sublinear time algorithm follows from combining this result with a novel sampling algorithm for approximating matrix-vector products with a normalized graph adjacency matrix.
Of independent interest, we show a similar result for the widely used kernel polynomial method (KPM), proving that this practical algorithm nearly matches the theoretical guarantees of our moment matching method. Our analysis uses tools from Jackson’s seminal work on approximation with positive polynomial kernels.
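The moment-matching primitive is easy to sketch: Rademacher probe vectors (Hutchinson’s trace estimator) combined with the Chebyshev three-term recurrence estimate the normalized traces tr(T_j(A))/n with one matrix-vector product per degree per probe. The snippet below is a generic illustration of this standard estimator, which also underlies the kernel polynomial method; it is not the paper’s full sublinear-time algorithm, and the random graph is ours.

import numpy as np

def chebyshev_spectral_moments(matvec, n, num_moments, num_probes, rng):
    # Estimate tau_j = tr(T_j(A)) / n for symmetric A with spectrum in [-1, 1],
    # using T_{j+1}(A) v = 2 A T_j(A) v - T_{j-1}(A) v on random probe vectors.
    taus = np.zeros(num_moments)
    for _ in range(num_probes):
        g = rng.choice([-1.0, 1.0], size=n)       # Rademacher probe
        t_prev, t_cur = g, matvec(g)
        taus[0] += g @ t_prev
        taus[1] += g @ t_cur
        for j in range(2, num_moments):
            t_prev, t_cur = t_cur, 2.0 * matvec(t_cur) - t_prev
            taus[j] += g @ t_cur
    return taus / (num_probes * n)

# normalized adjacency D^{-1/2} A D^{-1/2} of a random graph (spectrum in [-1, 1])
rng = np.random.default_rng(0)
n = 500
A = np.triu((rng.random((n, n)) < 0.05).astype(float), 1)
A = A + A.T
deg = np.maximum(A.sum(axis=1), 1.0)
N = A / np.sqrt(np.outer(deg, deg))
print(chebyshev_spectral_moments(lambda v: N @ v, n, num_moments=20, num_probes=10, rng=rng))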
Online learning with expert advice is a fundamental problem of sequential prediction. In this problem, the algorithm has access to a set of n “experts” who make predictions on each day. The goal on each day is to process these predictions, and make a prediction with the minimum cost. After making a prediction, the algorithm sees the actual outcome on that day, updates its state, and then moves on to the next day. An algorithm is judged by how well it does compared to the best expert in the set.
The classical algorithm for this problem is the multiplicative weights algorithm, which has been well-studied in many fields since as early as the 1950s. Variations of this algorithm have been applied to and optimized for a broad range of problems, including boosting an ensemble of weak-learners in machine learning, and approximately solving linear and semi-definite programs. However, every application, to our knowledge, relies on storing weights for every expert, and uses Ω(n) memory. There is little work on understanding the memory required to solve the online learning with expert advice problem (or to run standard sequential prediction algorithms, such as multiplicative weights) in natural streaming models, which is especially important when the number of experts and number of days are both large.
We initiate the study of the learning with expert advice problem in the streaming setting, and show lower and upper bounds. Our lower bound for i.i.d., random order, and adversarial order streams uses a reduction to a custom-built problem with a novel masking technique, to show a smooth trade-off for regret versus memory. Our upper bounds show new ways to run standard sequential prediction algorithms in rounds on small “pools” of experts, thus reducing the necessary memory. For random-order streams, we show that our upper bound is tight up to low order terms. We hope that these results and techniques will have broad applications in online learning, and can inspire algorithms based on standard sequential prediction techniques, like multiplicative weights, for a wide range of other problems in the memory-constrained setting.
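For readers unfamiliar with the baseline, the classical multiplicative weights (Hedge) forecaster referenced above takes only a few lines; note that it stores one weight per expert, which is exactly the Ω(n) memory footprint the streaming question asks to reduce. The step size and toy losses are illustrative.

import numpy as np

def hedge(loss_stream, n, eta):
    # loss_stream yields, each day, an array of n expert losses in [0, 1].
    w = np.ones(n)                      # one weight per expert: Omega(n) memory
    alg_loss, cum = 0.0, np.zeros(n)
    for losses in loss_stream:
        p = w / w.sum()                 # today's distribution over experts
        alg_loss += float(p @ losses)   # (expected) loss of the forecaster
        w *= np.exp(-eta * losses)      # multiplicative update
        cum += losses
    return alg_loss, cum.min()          # regret alg_loss - cum.min() is O(sqrt(T log n))

rng = np.random.default_rng(0)
T, n = 2000, 50
days = (rng.random(n) for _ in range(T))
print(hedge(days, n, eta=np.sqrt(np.log(n) / T)))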
Let k,ℓ≥ 2 be two multiplicatively independent integers. Cobham’s famous theorem states that a set X⊆ ℕ is both k-recognizable and ℓ-recognizable if and only if it is definable in Presburger arithmetic. Here we show the following strengthening: let X⊆ ℕm be k-recognizable, let Y⊆ ℕn be ℓ-recognizable such that both X and Y are not definable in Presburger arithmetic. Then the first-order logical theory of (ℕ,+,X,Y) is undecidable. This is in contrast to a well-known theorem of Büchi that the first-order logical theory of (ℕ,+,X) is decidable.
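A classical textbook example (not taken from this paper) illustrates the notions involved: the set of powers of 2 is 2-recognizable, since its base-2 representations form the regular language 10*; it is not eventually periodic, hence not Presburger-definable, so Cobham’s theorem already implies it is not 3-recognizable.

import re

def in_powers_of_two(x: int) -> bool:
    # 2-recognizability made concrete: accept x iff its binary expansion
    # matches the regular expression 10* (a 3-state DFA would do the same).
    return re.fullmatch(r"10*", format(x, "b")) is not None

print([x for x in range(1, 70) if in_powers_of_two(x)])   # [1, 2, 4, 8, 16, 32, 64]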
Proving circuit lower bounds has been an important but extremely hard problem for decades. Although it can be shown that almost every function f:F2n→F2 requires circuits of size Ω(2n/n) by a simple counting argument, it remains unknown whether there is an explicit function (for example, a function in NP) not computable by circuits of size 10n. In fact, a 3n−o(n) explicit lower bound by Blum (TCS, 1984) was unbeaten for over 30 years until a recent breakthrough by Find, Golovnev, Hirsch, and Kulikov (FOCS, 2016), which proved a (3+1/86)n−o(n) lower bound for affine dispersers, a class of functions known to be constructible in P. To obtain this improvement, Find, Golovnev, Hirsch, and Kulikov (FOCS, 2016) generalized the classical gate elimination method by keeping track of a bottleneck structure called troubled gates.
In this paper, we prove a stronger lower bound 3.1n − o(n) for affine dispersers. To get this result, we strengthen the gate elimination approach for the (3+1/86)n−o(n) lower bound, by a more sophisticated case analysis that significantly decreases the number of troubled gates introduced during the elimination procedure. Intuitively, our improvement relies on three observations: adjacent troubled gates can be eliminated together; the gates eliminated are usually connected; and the hardest cases during gate elimination have nice local properties to prevent the introduction of new troubled gates.
The approximate degree of a Boolean function f∶{0,1}n→{0,1} is the minimum degree of a real polynomial p that approximates f pointwise: |f(x)−p(x)|≤1/3 for all x∈{0,1}n. For any δ>0, we construct DNF and CNF formulas of polynomial size with approximate degree Ω(n1−δ), essentially matching the trivial upper bound of n. This fully resolves the approximate degree of constant-depth circuits (AC0), a question that has seen extensive research over the past 10 years. Prior to our work, an Ω(n1−δ) lower bound was known only for AC0 circuits of depth that grows with 1/δ (Bun and Thaler, FOCS 2017). Furthermore, the DNF and CNF formulas that we construct are the simplest possible in that they have constant width.
Our result gives the first near-linear lower bounds on the bounded-error communication complexity of polynomial-size DNF and CNF formulas in the challenging k-party number-on-the-forehead model and the two-party quantum model: Ω((n/4kk2)1−δ) and Ω(n1−δ), respectively, where δ>0 is any constant. Our lower bounds are essentially optimal. Analogous to above, such lower bounds were previously known only for AC0 circuits of depth that grows with 1/δ.
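The definition of approximate degree can be explored numerically on small symmetric functions: by Minsky-Papert symmetrization, for a symmetric f the question reduces to approximating a univariate function on the Hamming weights 0,…,n, and the least achievable pointwise error at a given degree is a small linear program. The sketch below (assuming NumPy and SciPy) does this for ORn, the simplest one-term DNF; it only illustrates the definition and is unrelated to the Ω(n1−δ) construction above.

import numpy as np
from numpy.polynomial.chebyshev import chebvander
from scipy.optimize import linprog

def min_symmetric_approx_error(F_vals, degree):
    # Least max_w |p(w) - F(w)| over univariate polynomials p of the given
    # degree, evaluated at the Hamming weights w = 0..n (Chebyshev basis for
    # numerical stability).  For symmetric Boolean f, this drops below 1/3
    # exactly when the approximate degree of f is at most `degree`.
    n = len(F_vals) - 1
    t = 2.0 * np.arange(n + 1) / n - 1.0
    V = chebvander(t, degree)
    obj = np.zeros(degree + 2); obj[-1] = 1.0        # variables: coefficients, eps
    A_ub = np.vstack([np.hstack([V, -np.ones((n + 1, 1))]),
                      np.hstack([-V, -np.ones((n + 1, 1))])])
    b_ub = np.concatenate([F_vals, -F_vals])
    res = linprog(obj, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * (degree + 2))
    return res.fun

n = 32
OR_vals = np.array([0.0] + [1.0] * n)                # OR_n as a function of |x|
for d in range(1, 11):
    print(d, round(min_symmetric_approx_error(OR_vals, d), 3))   # dips below 1/3 near Theta(sqrt(n))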
Given i.i.d. samples from an unknown distribution over a large domain [N], approximating several basic quantities, including the distribution’s support size, its entropy, and its distance from the uniform distribution, requires Θ(N / logN) samples [Valiant and Valiant, STOC 2011].
Suppose, however, that we can interact with a powerful but untrusted prover, who knows the entire distribution (or a good approximation of it). Can we use such a prover to approximate (or rather, to approximately verify) such statistical quantities more efficiently? We show that this is indeed the case: the support size, the entropy, and the distance from the uniform distribution, can all be approximately verified via a 2-message interactive proof, where the communication complexity, the verifier’s running time, and the sample complexity are O(√N). For all these quantities, the sample complexity is tight up to polylog N factors (for any interactive proof, regardless of its communication complexity or verification time).
More generally, we give a tolerant interactive proof system with the above sample and communication complexities for verifying a distribution’s proximity to any label-invariant property (any property that is invariant to re-labeling of the elements in the distribution’s support). The verifier’s running time in this more general protocol is also O(√N), under a mild assumption about the complexity of deciding, given a compact representation of a distribution, whether it is in the property or far from it.
The most basic lower-bound question in randomized communication complexity is: Does a given problem have constant cost, or non-constant cost? We observe that this question has a deep connection to implicit graph representations in structural graph theory. Specifically, constant-cost communication problems correspond to hereditary graph families that admit constant-size adjacency sketches, or equivalently constant-size probabilistic universal graphs (PUGs), and these graph families are a subset of families that admit adjacency labeling schemes of size O(logn), which are the subject of the well-studied implicit graph question (IGQ).
We initiate the study of the hereditary graph families that admit constant-size PUGs, with the two (equivalent) goals of (1) giving a structural characterization of randomized constant-cost communication problems, and (2) resolving a probabilistic version of the IGQ. For each family F studied in this paper (including the monogenic bipartite families, product graphs, interval and permutation graphs, families of bounded twin-width, and others), it holds that the subfamilies H ⊆ F are either stable (in a sense relating to model theory), in which case they admit constant-size PUGs (i.e. adjacency sketches), or they are not stable, in which case they do not.
The correspondence between communication problems and hereditary graph families allows for a probabilistic method of constructing adjacency labeling schemes. By this method, we show that the induced subgraphs of any Cartesian products Gd are positive examples to the IGQ, also giving a bound on the number of unique induced subgraphs of any graph product. We prove that this probabilistic construction cannot be “naïvely derandomized” by using an Equality oracle, implying that the Equality oracle cannot simulate the k-Hamming Distance communication protocol.
As a consequence of our results, we obtain constant-size sketches for deciding dist(x,y) ≤ k for vertices x,y in any stable graph family with bounded twin-width, answering an open question about planar graphs from earlier work. This generalizes to constant-size sketches for deciding first-order formulas over the same graphs.
We give a polynomial-time algorithm for the problem of robustly estimating a mixture of k arbitrary Gaussians in ℝd, for any fixed k, in the presence of a constant fraction of arbitrary corruptions. This resolves the main open problem in several previous works on algorithmic robust statistics, which addressed the special cases of robustly estimating (a) a single Gaussian, (b) a mixture of TV-distance separated Gaussians, and (c) a uniform mixture of two Gaussians. Our main tools are an efficient partial clustering algorithm that relies on the sum-of-squares method, and a novel tensor decomposition algorithm that allows errors in both Frobenius norm and low-rank terms.
We consider the problem of clustering mixtures of mean-separated Gaussians in high dimensions. We are given samples from a mixture of k identity covariance Gaussians, so that the minimum pairwise distance between any two of the means is at least Δ, for some parameter Δ > 0, and the goal is to recover the ground truth clustering of these samples. It is folklore that separation Δ = Θ (√logk) is both necessary and sufficient to recover a good clustering (say with constant or 1/poly(k) error), at least information theoretically. However, the estimators which achieve this guarantee are inefficient. We give the first algorithm which runs in polynomial time, and which almost matches this guarantee. More precisely, we give an algorithm which takes polynomially many samples and time, and which can successfully recover a good clustering, so long as the separation is Δ = Ω (log1/2 + c k), for any c > 0. Previously, polynomial time algorithms were only known for this problem when the separation was polynomial in k, and all algorithms which could tolerate poly logk separation required quasipolynomial time. We also extend our result to mixtures of translations of a distribution which satisfies the Poincaré inequality, under additional mild assumptions.
Our main technical tool, which we believe is of independent interest, is a novel way to implicitly represent and estimate high degree moments of a distribution, which allows us to extract important information about high-degree moments without ever writing down the full moment tensors explicitly.
We study the problem of list-decodable mean estimation, where an adversary can corrupt a majority of the dataset. Specifically, we are given a set T of n points in ℝd and a parameter 0< α <1/2 such that an α-fraction of the points in T are i.i.d. samples from a well-behaved distribution D and the remaining (1−α)-fraction are arbitrary. The goal is to output a small list of vectors, at least one of which is close to the mean of D. We develop new algorithms for this problem achieving nearly-optimal statistical guarantees, with runtime O(n1 + є0 d), for any fixed є0 > 0. All prior algorithms for this problem had additional polynomial factors in 1/α. We leverage this result, together with additional techniques, to obtain the first almost-linear time algorithms for clustering mixtures of k separated well-behaved distributions, nearly-matching the statistical guarantees of spectral methods. Prior clustering algorithms inherently relied on an application of k-PCA, thereby incurring runtimes of Ω(n d k). This marks the first runtime improvement for this basic statistical problem in nearly two decades.
The starting point of our approach is a novel and simpler near-linear time robust mean estimation algorithm in the α → 1 regime, based on a one-shot matrix multiplicative weights-inspired potential decrease. We crucially leverage this new algorithmic framework in the context of the iterative multi-filtering technique of Diakonikolas et al. ’18, Diakonikolas et al. ’20, providing a method to simultaneously cluster and downsample points using one-dimensional projections — thus, bypassing the k-PCA subroutines required by prior algorithms.
We give the first polynomial time algorithm for list-decodable covariance estimation. For any α > 0, our algorithm takes as input a sample Y ⊆ ℝd of size n≥ dpoly(1/α) obtained by adversarially corrupting (1−α)n points in an i.i.d. sample X of size n from the Gaussian distribution with unknown mean µ* and covariance Σ*. In npoly(1/α) time, it outputs a constant-size list of k = k(α)= (1/α)poly(1/α) candidate parameters that, with high probability, contains a (µ,Σ) such that the total variation distance TV(N(µ*,Σ*),N(µ,Σ))<1−Oα(1). This is the statistically strongest notion of distance and implies multiplicative spectral and relative Frobenius distance approximation with dimension-independent error. Our algorithm works more generally for any distribution D that possesses low-degree sum-of-squares certificates of two natural analytic properties: 1) anti-concentration of one-dimensional marginals and 2) hypercontractivity of degree 2 polynomials.
Prior to our work, the only known results for estimating covariance in the list-decodable setting were for the special cases of list-decodable linear regression and subspace recovery [Karmalkar-Klivans-Kothari 2019, Bakshi-Kothari 2020, Raghavendra-Yau’19, 20]. Even for these special cases, the known error guarantees are weak and, in particular, the algorithms need super-polynomial time for any sub-constant (in dimension d) target error in natural norms. Our result, as a corollary, yields the first polynomial time exact algorithm for list-decodable linear regression and subspace recovery that, in particular, obtains 2−(d) error in polynomial time in the underlying dimension.
For nearly six decades, the central open question in the study of hash tables has been to determine the optimal achievable tradeoff curve between time and space. State-of-the-art hash tables offer the following guarantee: If keys/values are Θ(logn) bits each, then it is possible to achieve constant-time insertions/deletions/queries while wasting only O(loglogn) bits of space per key when compared to the information-theoretic optimum—this bound has been proven to be optimal for a number of closely related problems (e.g., stable hashing, dynamic retrieval, and dynamically-resized filters).
This paper shows that O(loglogn) wasted bits per key is not the end of the line for hashing. In fact, for any k ∈ [log* n], it is possible to achieve O(k)-time insertions/deletions, O(1)-time queries, and O(log(k) n) wasted bits per key, where log(k) denotes the k-th iterated logarithm (all with high probability in n), while also supporting dynamic resizing as the size of the table changes. We further show that this tradeoff curve is the best achievable by any of a large class of hash tables, including any hash table designed using the current framework for making constant-time hash tables succinct.
Our result holds for arbitrarily large keys/values, and in the case where keys/values are very small, we can tighten our bounds to o(1) wasted bits per key. Building on this, we obtain a constant-time dynamic filter that uses n ⌈logє−1 ⌉+ n loge + o(n) bits of space for a wide choice of false-positive rates є, resolving a long-standing open problem for the design of dynamic filters.
We consider the problem of dynamically assigning n elements unique indices, known as hashcodes, in the range [(1+o(1))n]. This problem is known as perfect hashing and is considered a fundamental building block in the design of more involved data structures. The challenge we address is that of designing a data structure that meets several, seemingly opposing, requirements: (1) the range and the space of the data structure must be, at all times, proportional to the current cardinality nt of the input set, and (2) the hashcodes it assigns must be stable in that the hashcode of an element must not change while the element is continuously in the set. A simple argument shows that these two desiderata are impossible to achieve when arbitrary deletions and insertions are allowed.
In this paper, we show that one can achieve them when only insertions occur and, more generally, when the hashcode range and the space are allowed to grow as a function of the maximum cardinality Nt of the set until time t. The data structure executes all operations in worst case constant time with high probability and requires space that is within a constant factor of the lower bound.
In particular, this leads to a hash table design that does not need to move elements as its size grows. More generally, we present, as an application, a cyclic sequence of reductions between data structures that lead to the following bootstrapping result. Let B(u,n) denote the lower bound on the space of a dictionary for n elements over a universe [u]. Given a compact dynamic dictionary (i.e., space O(B(u,n))), we can use it to build a dynamic dictionary with space B(u,n)+O(nloglogn). This reduction increases the time per operation by an additive constant and applies both to the extendable and non-extendable settings (failure probability is 1/poly(n) per insertion).
We study vertex sparsification for distances, in the setting of planar graphs with distortion: Given a planar graph G (with edge weights) and a subset of k terminal vertices, the goal is to construct an ε-emulator, which is a small planar graph G′ that contains the terminals and preserves the distances between the terminals up to factor 1+ε.
We design the first ε-emulators for planar graphs of almost-linear size k1+o(1)/ε. In terms of k, this is a dramatic improvement over the previous quadratic upper bound of Cheung, Goranci and Henzinger [ICALP 2016], and breaks below known quadratic lower bounds for exact emulators (the case when ε=0). Moreover, our emulators can be computed in near-linear time, with applications to fast (1+ε)-approximation algorithms for basic optimization problems on planar graphs such as minimum (s,t)-cut and diameter.
This paper studies the fundamental task of establishing routing paths in distributed networks. We prove the existence of compact routing tables that store in each network node a few simple forwarding rules tracing out hop-constrained and oblivious routing paths for any pair of nodes. For any collection of pairs, the congestion of these paths is almost-optimal, i.e., competitive with the globally optimal solution up to a sub-polynomial factor.
The key to this result is the algorithmic theory of hop-constrained expanders and expander decompositions developed in this paper. Informally, we say a graph is an h-hop φ-expander iff for any set of node-pairs, in which each node v is paired with O(deg(v)) nodes at hop-distance O(h), there exists an Õ(h)-hop path between each pair such that any edge is used at most Õ(1/φ) times. An h-hop φ-expander is basically a regular Õ(φ)-expander when h = Ω(logn/φ). For shorter path lengths h ≪ 1/φ, however, h-hop expanders become interesting, powerful, and fundamentally different. We prove that every graph can be decomposed into h-hop (boundary-linked) φ-expanders by cutting at most an O(φ logn)-fraction of its edges.
We also provide efficient constructions for almost-optimal h-hop expander decompositions and our compact routing tables. These are parallel algorithms with almost-linear work and sub-polynomial depth that also have highly-efficient implementations in CONGEST, the standard model for distributed message-passing algorithms. As a major implication, this gives novel CONGEST algorithms for a large class of important optimization problems, including minimum spanning tree, (1+є)-min-cut, and (1+є)-shortest paths. Our algorithms solve these problems in sub-polynomial rounds on any network, as long as a sub-polynomial-round distributed algorithm exists for this network. This strong optimality guarantee is closely related to the notion of universal optimality.
Oblivious routing has a long history in both the theory and practice of networking. In this work we initiate the formal study of oblivious routing in the context of reconfigurable networks, a new architecture that has recently come to the fore in datacenter networking. These networks allow a rapidly changing bounded-degree pattern of interconnections between nodes, but the network topology and the selection of routing paths must both be oblivious to the traffic demand matrix. Our focus is on the trade-off between maximizing throughput and minimizing latency in these networks. For every constant throughput rate, we characterize (up to a constant factor) the minimum latency achievable by an oblivious reconfigurable network design that satisfies the given throughput guarantee. The trade-off between these two objectives turns out to be surprisingly subtle: the curve depicting it has an unexpected scalloped shape reflecting the fact that load-balancing becomes more difficult when the average length of routing paths is not an integer because equalizing all the path lengths is not possible. The proof of our lower bound uses LP duality to verify that Valiant load balancing is the most efficient oblivious routing scheme when used in combination with an optimally-designed reconfigurable network topology. The proof of our upper bound uses an algebraic construction in which the network nodes are identified with vectors over a finite field, the network topology is described by either the elementary basis or a sequence of Vandermonde matrices, and routing paths are constructed by selecting columns of these matrices to yield the appropriate mixture of path lengths within the shortest possible time interval.
Succinct arguments are proof systems that allow a powerful, but untrusted, prover to convince a weak verifier that an input x belongs to a language L ∈ NP, with communication that is much shorter than the NP witness. Such arguments, which grew out of the theory literature, are now drawing immense interest also in practice, where a key bottleneck that has arisen is the high computational cost of proving correctness.
In this work we address this problem by constructing succinct arguments for general computations, expressed as Boolean circuits (of bounded fan-in), with a strictly linear size prover. The soundness error of the protocol is an arbitrarily small constant. Prior to this work, succinct arguments were known with a quasi-linear size prover for general Boolean circuits or with linear-size only for arithmetic circuits, defined over large finite fields.
In more detail, for every Boolean circuit C=C(x,w), we construct an O(log|C|)-round argument-system in which the prover can be implemented by a size O(|C|) Boolean circuit (given as input both the instance x and the witness w), with arbitrarily small constant soundness error and using poly(λ,log|C|) communication, where λ denotes the security parameter. The verifier can be implemented by a size O(|x|) + poly(λ, log|C|) circuit following a size O(|C|) private pre-processing step, or, alternatively, by using a purely public-coin protocol (with no pre-processing) with a size O(|C|) verifier. The protocol can be made zero-knowledge using standard techniques (and with similar parameters). The soundness of our protocol is computational and relies on the existence of collision resistant hash functions that can be computed by linear-size circuits, such as those proposed by Applebaum et al. (ITCS, 2017).
At the heart of our construction is a new information-theoretic interactive oracle proof (IOP), an interactive analog of a PCP, for circuit satisfiability, with constant prover overhead. The improved efficiency of our IOP is obtained by bypassing a barrier faced by prior IOP constructions, which needed to (either explicitly or implicitly) encode the entire computation using a multiplication code.
At ITCS 2010, Dziembowski, Pietrzak, and Wichs introduced Non-malleable Codes (NMCs) which protect against tampering of a codeword of a given message into the codeword of a related message. A well-studied model of tampering is the 2-split-state model where the codeword consists of two independently tamperable states. As with standard error-correcting codes, it is of great importance to build codes with high rates.
Following a long line of work, Aggarwal and Obremski (FOCS 2020) showed the first constant rate non-malleable code in the 2-split-state model; however, this constant was a minuscule 10−6! In this work, we build a Non-malleable Code with rate 1/3. This nearly matches the rate 1/2 upper bound for this model due to Cheraghchi and Guruswami (ITCS 2014). Our construction is simple, requiring just an inner-product extractor, a seeded extractor, and an affine-evasive function.
(Sender-)Deniable encryption provides a very strong privacy guarantee: a sender who is coerced by an attacker into “opening” their ciphertext after-the-fact is able to generate “fake” local random choices that are consistent with any plaintext of their choice. The only known fully-efficient constructions of public-key deniable encryption rely on indistinguishability obfuscation (iO) (which currently can only be based on sub-exponential hardness assumptions).
In this work, we study (sender-)deniable encryption in a setting where the encryption procedure is a quantum algorithm, but the ciphertext is classical. First, we propose a quantum analog of the classical definition in this setting. We give a fully efficient construction satisfying this definition, assuming the quantum hardness of the Learning with Errors (LWE) problem.
Second, we show that quantum computation unlocks a fundamentally stronger form of deniable encryption, which we call perfect unexplainability. The primitive at the heart of unexplainability is a quantum computation for which there is provably no efficient way, such as exhibiting the “history of the computation,” to establish that the output was indeed the result of the computation. We give a construction which is secure in the random oracle model, assuming the quantum hardness of LWE. Crucially, this notion implies a form of protection against coercion “before-the-fact”, a property that is impossible to achieve classically.
In distributed differential privacy, the parties perform analysis over their joint data while preserving the privacy of both datasets. Interestingly, for a few fundamental two-party functions such as inner product and Hamming distance, the accuracy of the distributed solution lags far behind what is achievable in the client-server setting. McGregor, Mironov, Pitassi, Reingold, Talwar, and Vadhan [FOCS ’10] proved that this gap is inherent, showing upper bounds on the accuracy of (any) distributed solution for these functions. These limitations can be bypassed when settling for computational differential privacy, where the data is differentially private only in the eyes of a computationally bounded observer, using oblivious transfer.
We prove that the use of public-key cryptography is necessary for bypassing the limitation of McGregor et al., showing that a non-trivial solution for the inner product, or the Hamming distance, implies the existence of a key-agreement protocol. Our bound implies a combinatorial proof for the fact that non-Boolean inner product of independent (strong) Santha-Vazirani sources is a good condenser. We obtain our main result by showing that the inner-product of a (single, strong) SV source with a uniformly random seed is a good condenser, even when the seed and source are dependent.
We give the first polynomial-time algorithm to estimate the mean of a d-variate probability distribution with bounded covariance from Õ(d) independent samples subject to pure differential privacy. Prior algorithms for this problem either incur exponential running time, require Ω(d1.5) samples, or satisfy only the weaker concentrated or approximate differential privacy conditions. In particular, all prior polynomial-time algorithms require d1+Ω(1) samples to guarantee small privacy loss with “cryptographically” high probability, 1−2−dΩ(1), while our algorithm retains Õ(d) sample complexity even in this stringent setting.
Our main technique is a new approach to use the powerful Sum of Squares method (SoS) to design differentially private algorithms. The “SoS proofs to algorithms” paradigm is a key theme in numerous recent works in high-dimensional algorithmic statistics – estimators which apparently require exponential running time but whose analysis can be captured by low-degree Sum of Squares proofs can be automatically turned into polynomial-time algorithms with the same provable guarantees. We demonstrate a similar “proofs to private algorithms” phenomenon: instances of the workhorse exponential mechanism which apparently require exponential time but which can be analyzed with low-degree SoS proofs can be automatically turned into polynomial-time differentially private algorithms. We prove a meta-theorem capturing this phenomenon, which we expect to be of broad use in private algorithm design.
Our techniques also draw new connections between differentially private and robust statistics in high dimensions. In particular, viewed through our proofs-to-private-algorithms lens, several well-studied SoS proofs from recent works in algorithmic robust statistics directly yield key components of our differentially private mean estimation algorithm.
We introduce a notion called entropic independence that is an entropic analog of spectral notions of high-dimensional expansion. Informally, entropic independence of a background distribution µ on k-sized subsets of a ground set of elements says that for any (possibly randomly chosen) set S, the relative entropy of a single element of S drawn uniformly at random carries at most an O(1/k) fraction of the relative entropy of S. Entropic independence is the analog of the notion of spectral independence, if one replaces variance by entropy. We use entropic independence to derive tight mixing time bounds, overcoming the lossy nature of spectral analysis of Markov chains on exponential-sized state spaces.
In our main technical result, we show a general way of deriving entropy contraction, a.k.a. modified log-Sobolev inequalities, for down-up random walks from spectral notions. We show that spectral independence of a distribution under arbitrary external fields automatically implies entropic independence. We furthermore extend our theory to the case where spectral independence does not hold under arbitrary external fields. To do this, we introduce a framework for obtaining tight mixing time bounds for Markov chains based on what we call restricted modified log-Sobolev inequalities, which guarantee entropy contraction not for all distributions, but for those in a sufficiently large neighborhood of the stationary distribution. To derive our results, we relate entropic independence to properties of polynomials: µ is entropically independent exactly when a transformed version of the generating polynomial of µ is upper bounded by its linear tangent; this property is implied by concavity of the said transformation, which was shown by prior work to be locally equivalent to spectral independence.
We apply our results to obtain (1) tight modified log-Sobolev inequalities and mixing times for multi-step down-up walks on fractionally log-concave distributions, (2) the tight mixing time of O(nlogn) for Glauber dynamics on Ising models whose interaction matrix has eigenspectrum lying within an interval of length smaller than 1, improving upon the prior quadratic dependence on n, and (3) nearly-linear time Oδ(n) samplers for the hardcore and Ising models on n-node graphs that have δ-relative gap to the tree-uniqueness threshold. In the last application, our bound on the running time does not depend on the maximum degree Δ of the graph, and is therefore optimal even for high-degree graphs, and in fact, is sublinear in the size of the graph for high-degree graphs.
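One way to make the informal definition above precise (our paraphrase of the abstract; the paper’s exact normalization may differ): writing D_{k→1} for the operator that maps a distribution on k-subsets to the law of a uniformly random element of the sampled subset, µ is entropically independent if, for every distribution ν on k-subsets,

\[
  D_{\mathrm{KL}}\bigl(\nu D_{k\to 1} \,\big\|\, \mu D_{k\to 1}\bigr) \;\le\; \frac{C}{k}\, D_{\mathrm{KL}}\bigl(\nu \,\big\|\, \mu\bigr)
\]

for an absolute constant C = O(1); spectral independence is the analogous statement with variance in place of relative entropy.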
The single-site dynamics are a canonical class of Markov chains for sampling from high-dimensional probability distributions, e.g. the ones represented by graphical models.
We give a simple and generic parallel algorithm that can faithfully simulate single-site dynamics. When the chain asymptotically satisfies the ℓp-Dobrushin’s condition, specifically, when the Dobrushin’s influence matrix has constantly bounded ℓp-induced operator norm for an arbitrary p∈[1, ∞], the parallel simulation of N steps of single-site updates succeeds within O(N/n+logn) depth of parallel computing using Õ(m) processors, where n is the number of sites and m is the size of the graphical model. Since the Dobrushin’s condition is almost always satisfied asymptotically by mixing chains, this parallel simulation algorithm essentially transforms single-site dynamics with optimal O(nlogn) mixing time into algorithms for sampling. In particular, we obtain samplers for the Ising model on general graphs in the uniqueness regime, and for satisfying solutions of CNF formulas in a local lemma regime. With non-adaptive simulated annealing, these samplers can be transformed routinely into algorithms for approximate counting.
A key step in our parallel simulation algorithm is a so-called “universal coupling” procedure, which tries to simultaneously couple all distributions over the same sample space. We construct such a universal coupling, in which for every pair of distributions the coupled probability is at least their Jaccard similarity. We also prove that this is optimal in the worst case. The universal coupling and its applications are of independent interest.
Glauber dynamics on spin systems are well known to suffer exponential slowdowns at low temperatures due to the emergence of multiple metastable phases, separated by narrow bottlenecks that are hard for the dynamics to cross. It is a folklore belief that if the dynamics is initialized from an appropriate random mixture of ground states, one for each phase, then convergence to the Gibbs distribution should be much faster. However, such phenomena have largely evaded rigorous analysis, as most tools in the study of Markov chain mixing times are tailored to worst-case initializations.
In this paper we develop a general framework towards establishing this conjectured behavior for the Ising model. In the classical setting of the Ising model on an N-vertex torus in ℤd, our framework implies that the mixing time for the Glauber dynamics, initialized from a 1/2-1/2 mixture of the all-plus and all-minus configurations, is N1+o(1) in dimension d=2, and at most quasi-polynomial in all dimensions d≥ 3, at all temperatures below the critical one. The key innovation in our analysis is the introduction of the notion of “weak spatial mixing within a phase”, a low-temperature adaptation of the classical concept of weak spatial mixing. We show both that this new notion is strong enough to control the mixing time from the above random initialization (by relating it to the mixing time with plus boundary condition at O(logN) scales), and that it holds at all low temperatures in all dimensions.
This framework naturally extends to more general families of graphs. To illustrate this, we use the same approach to establish optimal O(N log N) mixing for the Ising Glauber dynamics on random regular graphs at sufficiently low temperatures, when initialized from the same random mixture.
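As a concrete instance of the setup (our toy sketch with arbitrary parameters, not the analysis of the paper), the following code runs Ising Glauber dynamics on an L × L torus at a low temperature, initialized from the 1/2-1/2 mixture of the all-plus and all-minus configurations.

```python
import math
import random

L = 16          # side length of the L x L torus (toy size)
beta = 0.6      # inverse temperature; the 2D critical point is roughly 0.4407

def random_mixture_init():
    """Start from all-plus or all-minus with probability 1/2 each."""
    s = 1 if random.random() < 0.5 else -1
    return [[s] * L for _ in range(L)]

def glauber_step(spins):
    """One single-site heat-bath update at a uniformly random vertex."""
    i, j = random.randrange(L), random.randrange(L)
    m = (spins[(i - 1) % L][j] + spins[(i + 1) % L][j]
         + spins[i][(j - 1) % L] + spins[i][(j + 1) % L])
    p_plus = 1.0 / (1.0 + math.exp(-2.0 * beta * m))   # Gibbs conditional of +1
    spins[i][j] = 1 if random.random() < p_plus else -1

spins = random_mixture_init()
for _ in range(200 * L * L):   # a modest number of sweeps
    glauber_step(spins)
print("magnetization:", sum(map(sum, spins)) / (L * L))
```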
The ferromagnetic Ising model is a model of a magnetic material and a central topic in statistical physics. It also plays a starring role in the algorithmic study of approximate counting: approximating the partition function of the ferromagnetic Ising model with uniform external field is tractable at all temperatures and on all graphs, due to the randomized algorithm of Jerrum and Sinclair.
Here we show that hidden inside the model are hard computational problems. For the class of bounded-degree graphs we find computational thresholds for the approximate counting and sampling problems for the ferromagnetic Ising model at fixed magnetization (that is, fixing the number of +1 and −1 spins).
In particular, letting βc(Δ) denote the critical inverse temperature of the zero-field Ising model on the infinite Δ-regular tree, and ηΔ,β,1+ denote the mean magnetization of the zero-field + measure on the infinite Δ-regular tree at inverse temperature β, we prove, for the class of graphs of maximum degree Δ: (i) For β < βc(Δ), there is an FPRAS and efficient sampling scheme for the fixed-magnetization Ising model for all magnetizations η. (ii) For β > βc(Δ), there is an FPRAS and efficient sampling scheme for the fixed-magnetization Ising model for magnetizations η such that |η| > ηΔ,β,1+. (iii) For β > βc(Δ), there is no FPRAS for the fixed-magnetization Ising model for magnetizations η such that |η| < ηΔ,β,1+, unless NP=RP.
We give an FPTAS for computing the number of matchings of size k in a graph G of maximum degree Δ on n vertices, for all k ≤ (1−δ)m*(G), where δ>0 is fixed and m*(G) is the matching number of G, and an FPTAS for the number of independent sets of size k ≤ (1−δ) αc(Δ) n, where αc(Δ) is the NP-hardness threshold for this problem. We also provide quasi-linear time randomized algorithms to approximately sample from the uniform distribution on matchings of size k ≤ (1−δ)m*(G) and independent sets of size k ≤ (1−δ)αc(Δ)n.
Our results are based on a new framework for exploiting local central limit theorems as an algorithmic tool. We use a combination of Fourier inversion, probabilistic estimates, and the deterministic approximation of partition functions at complex activities to extract approximations of the coefficients of the partition function. For our results for independent sets, we prove a new local central limit theorem for the hard-core model that applies to all fugacities below λc(Δ), the uniqueness threshold on the infinite Δ-regular tree.
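The coefficient-extraction step can be illustrated at a toy scale (sketch ours; the paper replaces the brute-force oracle with deterministic approximation of the partition function at complex activities and controls the error via a local central limit theorem): the number of independent sets of size k is the k-th coefficient of the hard-core partition function Z(λ), and can be read off from evaluations of Z on a circle in the complex plane by discrete Fourier inversion.

```python
import cmath
from itertools import combinations

def hardcore_partition(graph, n, lam):
    """Brute-force Z(lam) = sum over independent sets S of lam^{|S|} (toy oracle).
    graph: set of frozenset edges; n: number of vertices."""
    total = 0.0 + 0.0j
    for size in range(n + 1):
        for S in combinations(range(n), size):
            if all(frozenset(e) not in graph for e in combinations(S, 2)):
                total += lam ** size
    return total

def count_independent_sets_of_size(graph, n, k, radius=1.0):
    """Extract the coefficient i_k of lambda^k in Z(lambda) by evaluating Z on
    n+1 equally spaced points of a circle of radius `radius` and inverting the DFT."""
    N = n + 1
    acc = 0.0 + 0.0j
    for t in range(N):
        lam = radius * cmath.exp(2j * cmath.pi * t / N)
        acc += hardcore_partition(graph, n, lam) * cmath.exp(-2j * cmath.pi * t * k / N)
    return (acc / (N * radius ** k)).real

# 4-cycle on vertices 0..3.
edges = {frozenset(e) for e in [(0, 1), (1, 2), (2, 3), (3, 0)]}
print(round(count_independent_sets_of_size(edges, 4, 2)))   # 2 independent sets of size 2
```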
We present a new technique for efficiently removing almost all short cycles in a graph without unintentionally removing its triangles. Consequently, triangle finding problems do not become easy even in almost k-cycle free graphs, for any constant k≥ 4.
Triangle finding is at the base of many conditional lower bounds in P, mainly for distance computation problems, and the existence of many 4- or 5-cycles in a worst-case instance had been the obstacle towards resolving major open questions.
Hardness of approximation: Are there distance oracles with m1+o(1) preprocessing time and mo(1) query time that achieve a constant approximation? Existing algorithms with such desirable time bounds only achieve super-constant approximation factors, while only approximation factors below 3 were conditionally ruled out (Pătraşcu, Roditty, and Thorup; FOCS 2012). We prove that no O(1) approximations are possible, assuming the 3-SUM or APSP conjectures. In particular, we prove that k-approximations require Ω(m1+1/ck) time, which is tight up to the constant c. The lower bound holds even for the offline version where we are given the queries in advance, and extends to other problems such as dynamic shortest paths.
The 4-Cycle problem: An infamous open question in fine-grained complexity is to establish any surprising consequences from a subquadratic or even linear-time algorithm for detecting a 4-cycle in a graph. This is arguably one of the simplest problems without a near-linear time algorithm nor a conditional lower bound. We prove that Ω(m1.1194) time is needed for k-cycle detection for all k≥ 4, unless we can detect a triangle in √n-degree graphs in O(n2−δ) time; a breakthrough that is not known to follow even from optimal matrix multiplication algorithms.
The 3SUM hypothesis, the All-Pairs Shortest Paths (APSP) hypothesis and the Strong Exponential Time Hypothesis are the three main hypotheses in the area of fine-grained complexity. So far, within the area, the first two hypotheses have mainly been about integer inputs in the Word RAM model of computation. The “Real APSP” and “Real 3SUM” hypotheses, which assert that the APSP and 3SUM hypotheses hold for real-valued inputs in a reasonable version of the Real RAM model, are even more believable than their integer counterparts.
Under the very believable hypothesis that at least one of the Integer 3SUM hypothesis, Integer APSP hypothesis or SETH is true, Abboud, Vassilevska W. and Yu [STOC 2015] showed that a problem called Triangle Collection requires n3−o(1) time on an n-node graph.
The main result of this paper is a nontrivial lower bound for a slight generalization of Triangle Collection, called All-Color-Pairs Triangle Collection, under the even more believable hypothesis that at least one of the Real 3SUM, the Real APSP, and the Orthogonal Vectors (OV) hypotheses is true. Combined with slight modifications of prior reductions from Triangle Collection, we obtain polynomial conditional lower bounds for problems such as the (static) ST-Max Flow problem and dynamic versions of Max Flow, Single-Source Reachability Count, and Counting Strongly Connected Components, now under the new weaker hypothesis.
Our main result is built on the following two lines of reductions. In the first line of reductions, we show Real APSP and Real 3SUM hardness for the All-Edges Sparse Triangle problem. Prior reductions only worked from the integer variants of these problems. In the second line of reductions, we show Real APSP and OV hardness for a variant of the Boolean Matrix Multiplication problem.
Along the way we show that Triangle Collection is equivalent to a simpler restricted version of the problem, simplifying prior work. Our techniques also have other interesting implications, such as a super-linear lower bound for Integer All-Numbers 3SUM based on the Real 3SUM hypothesis, and a tight lower bound for a string matching problem based on the OV hypothesis.
Popular fine-grained hypotheses have been successful in proving conditional lower bounds for many dynamic problems. Two of the most widely applicable hypotheses in this context are the combinatorial Boolean Matrix Multiplication (BMM) hypothesis and the closely-related Online Matrix Vector Multiplication (OMv) hypothesis. The main theme of this paper is using k-dimensional generalizations of these two hypotheses to prove new tight conditional lower bounds for dynamic problems.
The combinatorial k-Clique hypothesis, which is a standard hypothesis in the literature, naturally generalizes the combinatorial BMM hypothesis. In this paper, we prove tight lower bounds for several dynamic problems under the combinatorial k-Clique hypothesis. For instance, we show that the Dynamic Range Mode problem has no combinatorial algorithms with poly(n) pre-processing time, O(n2/3−є) update time and O(n2/3−є) query time for any є > 0, matching the known upper bounds for this problem. Previous lower bounds only ruled out algorithms with O(n1/2−є) update and query time under the OMv hypothesis. We also show that the Dynamic Subgraph Connectivity problem on undirected graphs with m edges has no combinatorial algorithms with poly(m) pre-processing time, O(m2/3−є) update time and O(m1−є) query time for є > 0, matching the upper bound given by Chan, Pătraşcu, and Roditty [SICOMP’11], and improving the previous update time lower bound (based on OMv) with exponent 1/2. Other examples include tight combinatorial lower bounds for Dynamic 2D Orthogonal Range Color Counting, Dynamic 2-Pattern Document Retrieval, and Dynamic Range Mode in higher dimensions.
Furthermore, we propose the OuMvk hypothesis as a natural generalization of the OMv hypothesis. Under this hypothesis, we prove tight lower bounds for various dynamic problems. For instance, we show that the Dynamic Skyline Points Counting problem in (2k−1)-dimensional space has no algorithm with poly(n) pre-processing time and O(n1−1/k−є) update and query time for є > 0, even if the updates are semi-online. Other examples include tight conditional lower bounds for (semi-online) Dynamic Klee’s measure for unit cubes, and high-dimensional generalizations of Erickson’s problem and Langerman’s problem.
In this paper, we show that the time complexity of the monotone min-plus product of two n × n matrices is Õ(n(3+ω)/2)=Õ(n2.687), where ω < 2.373 is the fast matrix multiplication exponent [Alman and Vassilevska Williams 2021]. That is, when A is an arbitrary integer matrix and B is either row-monotone or column-monotone with integer entries bounded by O(n), computing the min-plus product C where Ci,j=mink{Ai,k+Bk,j} takes Õ(n(3+ω)/2) time, greatly improving the previous bound of Õ(n(12+ω)/5)=Õ(n2.875) [Gu, Polak, Vassilevska Williams and Xu 2021]. By simple reductions, the case where A is arbitrary and the columns or rows of B are bounded-difference can also be solved in Õ(n(3+ω)/2) time, improving the previous bound of Õ(n2.922) [Bringmann, Grandoni, Saha and Vassilevska Williams 2016]. Consequently, the case where both A and B are bounded-difference also admits an Õ(n(3+ω)/2)-time algorithm, improving the previous bounds of Õ(n2.824) [Bringmann, Grandoni, Saha and Vassilevska Williams 2016] and Õ(n2.779) [Chi, Duan and Xie 2022]. Many problems reduce to these variants, such as language edit distance, RNA folding, and the scored parsing problem on BD grammars [Bringmann, Grandoni, Saha and Vassilevska Williams 2016]; their complexities are thus all improved.
Finally, we also consider the problem of min-plus convolution between two integer sequences that are monotone and bounded by O(n), and achieve a running time of Õ(n1.5). Previously, this task required Õ(n(9+√177)/12) = O(n1.859) time [Chan and Lewenstein 2015].
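For reference, here is the naive cubic-time min-plus product that the structured cases above speed up (sketch ours); the toy B is column-monotone with small entries.

```python
def min_plus_product(A, B):
    """Naive O(n^3) min-plus product: C[i][j] = min_k A[i][k] + B[k][j].
    The paper obtains Õ(n^{(3+ω)/2}) time when B is row- or column-monotone
    with entries bounded by O(n)."""
    n = len(A)
    return [[min(A[i][k] + B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[3, 1], [0, 5]]
B = [[1, 2], [2, 4]]   # column-monotone: entries are non-decreasing down each column
print(min_plus_product(A, B))   # [[3, 5], [1, 2]]
```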
We study the computational complexity of the problem #IndSub(Φ) of counting k-vertex induced subgraphs of a graph G that satisfy a graph property Φ. Our main result establishes an exhaustive and explicit classification for all hereditary properties, including tight conditional lower bounds under the Exponential Time Hypothesis (ETH): If a hereditary property Φ is true for all graphs, or if it is true only for finitely many graphs, then #IndSub(Φ) is solvable in polynomial time. Otherwise, #IndSub(Φ) is #W[1]-complete when parameterised by k, and, assuming ETH, it cannot be solved in time f(k)· |G|o(k) for any function f. This classification features a wide range of properties for which the corresponding detection problem (as classified by Khot and Raman [TCS 02]) is tractable but counting is hard. Moreover, even for properties which are already intractable in their decision version, our results yield significantly stronger lower bounds for the counting problem. As an additional result, we also present an exhaustive and explicit parameterised complexity classification for all properties that are invariant under homomorphic equivalence. By covering one of the most natural and general notions of closure, namely, closure under vertex-deletion (hereditary), we generalise some of the earlier results on this problem. For instance, our results fully subsume and strengthen the existing classification of #IndSub(Φ) for monotone (subgraph-closed) properties due to Roth, Schmitt, and Wellnitz [FOCS 20]. A full version of our paper, containing all proofs, is available at .
A probabilistic algorithm A is pseudodeterministic if, on every input, there exists a canonical value that is output with high probability. If the algorithm outputs one of k canonical values with high probability, then it is called a k-pseudodeterministic algorithm. In the study of pseudodeterminism, the Acceptance Probability Estimation Problem (APEP), which is to additively approximate the acceptance probability of a Boolean circuit, is emerging as a central computational problem. This problem admits a 2-pseudodeterministic algorithm. Recently, it was shown that a pseudodeterministic algorithm for this problem would imply that any multi-valued function that admits a k-pseudodeterministic algorithm for a constant k (including approximation algorithms) also admits a pseudodeterministic algorithm (Dixon, Pavan, Vinodchandran; ITCS 2021).
The contribution of the present work is two-fold. First, as our main conceptual contribution, we establish that the existence of a pseudodeterministic algorithm for APEP is fundamentally related to the gap between probabilistic promise classes and the corresponding standard complexity classes. In particular, we show the following equivalence: APEP has a pseudodeterministic approximation algorithm if and only if every promise problem in PromiseBPP has a solution in BPP. A conceptual interpretation of this equivalence is that the algorithmic gap between 2-pseudodeterminism and pseudodeterminism is equivalent to the gap between PromiseBPP and BPP. Based on this connection, we show that designing pseudodeterministic algorithms for APEP leads to the solution of some open problems in complexity theory, including new Boolean circuit lower bounds. This equivalence also explains how multi-pseudodeterminism is connected to problems in SearchBPP. In particular, we show that if APEP has a pseudodeterministic algorithm, then every problem that admits a k(n)-pseudodeterministic algorithm (for any polynomial k) is in SearchBPP and admits a pseudodeterministic algorithm. Motivated by this connection, we also explore its connection to probabilistic search problems and establish that APEP is complete for certain notions of search problems in the context of pseudodeterminism.
Our second contribution is establishing query complexity lower bounds for multi-pseudodeterministic computations. We prove that for every k ≥ 1, there exists a problem whose (k+1)-pseudodeterministic query complexity in the uniform query model is O(1), but whose k-pseudodeterministic query complexity is Ω(n) even in the more general nonadaptive query model. A key contribution of this part of the work is the utilization of Sperner’s lemma in establishing query complexity lower bounds.
We present a new framework for designing worst-case to average-case reductions. For a large class of problems, it provides an explicit transformation of algorithms running in time T that are only correct on a small (subconstant) fraction of their inputs into algorithms running in time O(T) that are correct on all inputs.
Using our framework, we obtain such efficient worst-case to average-case reductions for fundamental problems in a variety of computational models; namely, algorithms for matrix multiplication, streaming algorithms for the online matrix-vector multiplication problem, and static data structures for all linear problems as well as for the multivariate polynomial evaluation problem.
Our techniques crucially rely on additive combinatorics. In particular, we show a local correction lemma that relies on a new probabilistic version of the quasi-polynomial Bogolyubov-Ruzsa lemma.
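For intuition, the following sketch (ours) shows the textbook random self-reduction for matrix multiplication over a prime field: an algorithm that is correct on most uniformly random input pairs is turned into one that is correct on every input, since each oracle call sees fresh uniform matrices and the four answers recombine exactly. The paper's framework pushes this idea much further, to algorithms that are correct on only a subconstant fraction of inputs, which is where the additive-combinatorics machinery enters.

```python
import numpy as np

P = 101  # a small prime modulus

def faulty_mm(A, B, rng):
    """Stand-in for an average-case algorithm: correct matrix product mod P,
    except that it answers garbage on a small fraction of calls."""
    if rng.random() < 0.05:
        return rng.integers(0, P, A.shape)
    return (A @ B) % P

def worst_case_mm(A, B, oracle, rng):
    """Random self-reduction: every oracle call sees uniformly random matrices,
    and the four answers recombine to A @ B whenever all calls are correct."""
    n = A.shape[0]
    R = rng.integers(0, P, (n, n))
    S = rng.integers(0, P, (n, n))
    return (oracle((A + R) % P, (B + S) % P, rng)
            - oracle((A + R) % P, S, rng)
            - oracle(R, (B + S) % P, rng)
            + oracle(R, S, rng)) % P

rng = np.random.default_rng(0)
n = 8
A = rng.integers(0, P, (n, n))
B = rng.integers(0, P, (n, n))
# Repeat and take a majority vote to drive the error probability down.
answers = [tuple(worst_case_mm(A, B, faulty_mm, rng).ravel()) for _ in range(15)]
best = max(set(answers), key=answers.count)
print(np.array_equal(np.array(best).reshape(n, n), (A @ B) % P))  # True with high probability
```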
We show broad equivalences in the average-case complexity of many different meta-complexity problems, including Kolmogorov complexity, time-bounded Kolmogorov complexity, and the Minimum Circuit Size Problem. These results hold for a wide range of parameters (various thresholds, approximation gaps, weak or strong average-case hardness, etc.) and complexity notions, showing the theory of meta-complexity is very *robust* in the average-case setting.
Our results are shown by establishing new and generic connections between meta-complexity and the theory of pseudorandomness and one-way functions. Using these connections, we give the first unconditional characterization of one-way functions based on the average-case hardness of the Minimum Circuit Size Problem. We also give a surprising and clean characterization of one-way functions based on the average-case hardness of (the worst-case uncomputable) Kolmogorov complexity. Moreover, the latter is the first characterization of one-way functions based on the average-case hardness of a fixed problem on *any* samplable distribution.
We give various applications of these results to the foundations of cryptography and the theory of meta-complexity. For example, we show that the average-case hardness of deciding k-SAT or Clique on any samplable distribution of high enough entropy implies the existence of one-way functions. We also use our results to unconditionally solve various meta-complexity problems in CZK (computational zero-knowledge) on average, and give implications of our results for the classic question of proving NP-hardness for the Minimum Circuit Size Problem.
We consider the problem of extracting randomness from sumset sources, a general class of weak sources introduced by Chattopadhyay and Li (STOC, 2016). An (n,k,C)-sumset source X is a distribution on {0,1}n of the form X1 + X2 + … + XC, where the Xi’s are independent sources on n bits, each with min-entropy at least k. Prior extractors either required the number of sources C to be a large constant or the min-entropy k to be at least 0.51 n.
As our main result, we construct an explicit extractor for sumset sources in the setting of C=2 for min-entropy poly(log n) and polynomially small error. We can further improve the min-entropy requirement to (log n) · (log log n)1+o(1) at the expense of a worse error parameter for our extractor. We find applications of our sumset extractor for extracting randomness from other well-studied models of weak sources such as affine sources, small-space sources, and interleaved sources.
Interestingly, it is unknown if a random function is an extractor for sumset sources. We use techniques from additive combinatorics to show that it is a disperser, and further prove that an affine extractor works for an interesting subclass of sumset sources which informally corresponds to the “low doubling” case (i.e., the support of X1 + X2 is not much larger than 2k).
The basic goal of survivable network design is to build cheap networks that guarantee the connectivity of certain pairs of nodes despite the failure of a few edges or nodes. A celebrated result by Jain [Combinatorica'01] provides a 2-approximation for a wide class of these problems. However, nothing better is known even for very basic special cases, raising the natural question of whether any improved approximation factor is possible at all.
In this paper we address one of the most basic problems in this family for which 2 is still the best-known approximation factor, the Forest Augmentation Problem (FAP): given an undirected unweighted graph (that w.l.o.g. we can assume to be a forest) and a collection of extra edges (links), compute a minimum cardinality subset of links whose addition to the graph makes it 2-edge-connected. Several better-than-2 approximation algorithms are known for the special case where the input graph is a tree, a.k.a. the Tree Augmentation Problem (TAP), see e.g. [Grandoni, Kalaitzis, Zenklusen - STOC'18; Cecchetto, Traub, Zenklusen - STOC'21] and references therein. Recently this was achieved also for the weighted version of TAP [Traub, Zenklusen - FOCS'21], and for the k-connectivity generalization of TAP [Byrka, Grandoni, Jabal-Ameli - STOC'20; Cecchetto, Traub, Zenklusen - STOC'21]. These results heavily exploit the fact that the input graph is connected, a condition that does not hold in FAP.
In this paper we breach the 2-approximation barrier for FAP. Our result is based on two main ingredients. First, we describe a reduction to the Path Augmentation Problem (PAP), the special case of FAP where the input graph is a collection of disjoint paths. Our reduction is not approximation-preserving; however, it is sufficiently accurate to improve on a factor-2 approximation. Second, we present a better-than-2 approximation algorithm for PAP, an open problem on its own. Here we exploit a novel notion of implicit credits which might turn out to be helpful in future related work.
We give a randomized (1 + 5.06/√k)-approximation algorithm for the minimum k-edge-connected spanning multi-subgraph problem, k-ECSM.
Motivated by data analysis and machine learning applications, we consider the popular high-dimensional Euclidean k-median and k-means problems. We propose a new primal-dual algorithm, inspired by the classic algorithm of Jain and Vazirani and the recent algorithm of Ahmadian, Norouzi-Fard, Svensson, and Ward. Our algorithm achieves an approximation ratio of 2.406 and 5.912 for Euclidean k-median and k-means, respectively, improving upon the 2.633 approximation ratio of Ahmadian et al. and the 6.1291 approximation ratio of Grandoni, Ostrovsky, Rabani, Schulman, and Venkat.
Our techniques involve a much stronger exploitation of the Euclidean metric than previous work on Euclidean clustering. In addition, we introduce a new method of removing excess centers using a variant of independent sets over graphs that we dub a “nested quasi-independent set”. In turn, this technique may be of interest for other optimization problems in Euclidean and ℓp metric spaces.
We provide a new bi-criteria Õ(log2 k) competitive algorithm for explainable k-means clustering. Explainable k-means was recently introduced by Dasgupta, Frost, Moshkovitz, and Rashtchian (ICML 2020). It is described by a (threshold) decision tree or diagram that is easy to interpret and understand. The cost of an explainable k-means clustering equals the sum of the costs of its clusters; and the cost of each cluster equals the sum of squared distances from the points in the cluster to the center of that cluster. The best non-bi-criteria algorithm for explainable clustering is Õ(k) competitive, and this bound is tight.
Our randomized bi-criteria algorithm constructs a threshold decision tree that partitions the data set into (1+δ)k clusters (where δ∈ (0,1) is a parameter of the algorithm). The cost of this clustering is at most Õ(1/δ· log2 k) times the cost of the optimal unconstrained k-means clustering. We show that this bound is almost optimal.
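To make the constructed object concrete, the sketch below (ours, with a small hand-built tree) shows how a threshold decision tree assigns points to clusters by comparing one coordinate against a threshold at each node, and how the k-means cost of the induced clustering is computed.

```python
from collections import defaultdict

# A threshold tree node either splits on one coordinate or names a cluster (leaf).
def make_split(coord, threshold, left, right):
    return ("split", coord, threshold, left, right)

def make_leaf(cluster_id):
    return ("leaf", cluster_id)

def assign(tree, point):
    """Follow threshold comparisons down to a leaf; the leaf is the cluster."""
    while tree[0] == "split":
        _, coord, threshold, left, right = tree
        tree = left if point[coord] <= threshold else right
    return tree[1]

def kmeans_cost(points, tree):
    """Sum of squared distances of each point to the mean of its assigned cluster."""
    clusters = defaultdict(list)
    for p in points:
        clusters[assign(tree, p)].append(p)
    cost = 0.0
    for pts in clusters.values():
        d = len(pts[0])
        center = [sum(p[i] for p in pts) / len(pts) for i in range(d)]
        cost += sum(sum((p[i] - center[i]) ** 2 for i in range(d)) for p in pts)
    return cost

# Hypothetical tree: split on the x-coordinate at 0, then on the y-coordinate at 0.
tree = make_split(0, 0.0,
                  make_leaf("A"),
                  make_split(1, 0.0, make_leaf("B"), make_leaf("C")))
points = [(-1.0, 0.5), (-1.2, -0.3), (2.0, -1.0), (2.1, 1.0), (1.9, 1.2)]
print(kmeans_cost(points, tree))
```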
We study reachability and shortest paths problems in dynamic directed graphs. Whereas algebraic dynamic data structures supporting edge updates and reachability/distance queries have been known for quite a long time, they do not, in general, allow reporting the underlying paths within the same time bounds, especially against an adaptive adversary.
In this paper we develop the first known fully dynamic reachability data structures working against an adaptive adversary and supporting edge updates and path queries for two natural variants: (1) point-to-point path reporting, and (2) single-source reachability tree reporting. For point-to-point queries in DAGs, we achieve O(n1.529) worst-case update and query bounds, whereas for tree reporting in DAGs, the respective worst-case bounds are O(n1.765). More importantly, we show how to lift these algorithms to work on general graphs at the cost of increasing the bounds to n1+5/6+o(1) and making the update times amortized. On the way to accomplishing these goals, we obtain two interesting subresults. We give subquadratic fully dynamic algorithms for topological order (in a DAG), and strongly connected components. To the best of our knowledge, such algorithms have not been described before.
Additionally, we provide deterministic incremental data structures for (point-to-point or single-source) reachability and shortest paths that can handle edge insertions and report the respective paths within subquadratic worst-case time bounds. For reachability and (1+є)-approximate shortest paths in weighted directed graphs, these bounds match the best known dynamic matrix inverse-based randomized bounds for fully dynamic reachability [v.d.Brand, Nanongkai and Saranurak, FOCS’19]. For exact shortest paths in unweighted graphs, the obtained bounds in the incremental setting polynomially improve upon the respective best known randomized update/distance query bounds in the fully dynamic setting.
The suffix array SA[1..n] of a text T of length n is a permutation of {1, …, n} describing the lexicographical ordering of suffixes of T and is considered to be one of the most important data structures for string processing, with dozens of applications in data compression, bioinformatics, and information retrieval. One of the biggest drawbacks of the suffix array is that it is very difficult to maintain under text updates: even a single character substitution can completely change the contents of the suffix array. Thus, the suffix array of a dynamic text is modelled using suffix array queries, which return the value SA[i] given any i ∈ [1..n].
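As a minimal illustration of the object and the query model (sketch ours; it rebuilds the array from scratch in quadratic time and is nowhere near the data structure developed below):

```python
def suffix_array(T):
    """Return SA[1..n] (1-indexed) by sorting suffix start positions lexicographically.
    This naive construction is only meant to pin down the object being maintained."""
    n = len(T)
    return [i + 1 for i in sorted(range(n), key=lambda i: T[i:])]

def sa_query(T, i):
    """Answer the query SA[i]; a dynamic suffix array must answer this after text updates."""
    return suffix_array(T)[i - 1]

T = "banana"
print(suffix_array(T))   # [6, 4, 2, 1, 5, 3]
print(sa_query(T, 1))    # 6: the lexicographically smallest suffix is "a"
```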
Prior to this work, the fastest dynamic suffix array implementations were by Amir and Boneh, who showed how to answer suffix array queries in Õ(k) time, where k ∈ [1..n] is a trade-off parameter, with Õ(n/k)-time text updates [ISAAC 2020]. In a very recent preprint, they also provided a solution with O(log5 n)-time queries and Õ(n2/3)-time updates [arXiv 2021].
We propose the first data structure that supports both suffix array queries and text updates in O(polylog n) time (achieving O(log4 n) and O(log3+o(1) n) time, respectively). Our data structure is deterministic and the running times for all operations are worst-case. In addition to the standard single-character edits (character insertions, deletions, and substitutions), we support (also in O(log3+o(1) n) time) the “cut-paste” operation that moves any (arbitrarily long) substring of T to any place in T. To achieve our result, we develop a number of new techniques which are of independent interest. This includes a new flavor of dynamic locally consistent parsing, as well as a dynamic construction of string synchronizing sets with an extra local sparsity property; this significantly generalizes the sampling technique introduced at STOC 2019. We complement our structure by a hardness result: unless the Online Matrix-Vector Multiplication (OMv) Conjecture fails, no data structure with O(polylog n)-time suffix array queries can support the “copy-paste” operation in O(n1−є) time for any є > 0.
Given an input that undergoes a sequence of updates, a dynamic algorithm maintains a valid solution to some predefined problem at any point in time; the goal is to design an algorithm in which computing a solution to the updated input is done more efficiently than computing the solution from scratch. A dynamic algorithm against an adaptive adversary is required to be correct when the adversary chooses the next update after seeing the previous outputs of the algorithm. We obtain faster dynamic algorithms against an adaptive adversary and separation results between what is achievable in the oblivious vs. adaptive settings. To get these results we exploit techniques from differential privacy, cryptography, and adaptive data analysis. Our results are as follows.
1. We give a general reduction transforming a dynamic algorithm against an oblivious adversary to a dynamic algorithm robust against an adaptive adversary. This reduction maintains several copies of the oblivious algorithm and uses differential privacy to protect their random bits. Using this reduction we obtain dynamic algorithms against an adaptive adversary with improved update and query times for global minimum cut, all pairs distances, and all pairs effective resistance.
2. We further improve our update and query times by showing how to maintain a sparsifier over an expander decomposition that can be refreshed fast. This fast refresh enables it to be robust against what we call a blinking adversary that can observe the output of the algorithm only following refreshes. We believe that these techniques will prove useful for additional problems.
3. On the flip side, we specify dynamic problems that, assuming a random oracle, every dynamic algorithm that solves them against an adaptive adversary must be polynomially slower than a rather straightforward dynamic algorithm that solves them against an oblivious adversary. We first show a separation result for a search problem and then show a separation result for an estimation problem. In the latter case our separation result draws from lower bounds in adaptive data analysis.
We study dynamic algorithms for the problem of maximizing a monotone submodular function over a stream of n insertions and deletions. We show that any algorithm that maintains a (0.5+є)-approximate solution under a cardinality constraint, for any constant є>0, must have an amortized query complexity that is polynomial in n. Moreover, a linear amortized query complexity is needed in order to maintain a 0.584-approximate solution. This is in sharp contrast with recent dynamic algorithms of [LMN+20, Mon20] that achieve (0.5−є)-approximation with a polylog(n) amortized query complexity.
On the positive side, when the stream is insertion-only, we present efficient algorithms for the problem under a cardinality constraint and under a matroid constraint with approximation guarantee 1−1/e−є and amortized query complexities O(log(k/є)/є2) and kÕ(1/є2) log n, respectively, where k denotes the cardinality parameter or the rank of the matroid.
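For context, the classical offline greedy algorithm (sketch ours, on a toy coverage function) achieves the 1−1/e guarantee for monotone submodular maximization under a cardinality constraint; the insertion-only algorithms above approach this guarantee while processing a stream of element arrivals.

```python
def greedy_submodular(elements, f, k):
    """Classical greedy: repeatedly add the element with the largest marginal gain.
    For a monotone submodular f this achieves a (1 - 1/e)-approximation offline."""
    S = set()
    for _ in range(k):
        best = max((e for e in elements if e not in S),
                   key=lambda e: f(S | {e}) - f(S), default=None)
        if best is None or f(S | {best}) - f(S) <= 0:
            break
        S.add(best)
    return S

# Toy monotone submodular function: coverage of ground elements.
sets = {"a": {1, 2}, "b": {2, 3}, "c": {4}, "d": {1, 2, 3}}
f = lambda S: len(set().union(*(sets[e] for e in S))) if S else 0
print(greedy_submodular(sets.keys(), f, 2))   # e.g. {'d', 'c'}, covering {1, 2, 3, 4}
```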