Vizing’s theorem states that any n-vertex m-edge graph of maximum degree Δ can be edge colored using at most Δ + 1 different colors [Vizing, 1964]. Vizing’s original proof is algorithmic and shows that such an edge coloring can be found in O(mn) time. This was subsequently improved to Õ(m√n) time, independently by [Arjomandi, 1982] and by [Gabow et al., 1985]. Very recently, independently and concurrently, using randomization, this runtime bound was further improved to Õ(n^2) by [Assadi, 2024] and Õ(mn^{1/3}) by [Bhattacharya, Carmon, Costa, Solomon and Zhang, 2024] (and subsequently to Õ(mn^{1/4}) by [Bhattacharya, Costa, Solomon and Zhang, 2024]). In this paper, we present a randomized algorithm that computes a (Δ+1)-edge coloring in near-linear time—in fact, only O(m log Δ) time—with high probability, giving a near-optimal algorithm for this fundamental problem.
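For context on the baseline that this result improves, a single greedy pass (not the algorithm of the paper) already edge-colors a graph in near-linear time, but may use up to 2Δ − 1 colors rather than Δ + 1. A minimal Python sketch, with illustrative names only:

```python
from collections import defaultdict

def greedy_edge_coloring(edges):
    """Greedily color edges; uses at most 2*Delta - 1 colors.

    Baseline illustration of the problem only, not the (Delta+1)-coloring
    algorithm of the paper: each edge takes the smallest color unused at
    either endpoint, and at most 2*(Delta - 1) colors can be blocked.
    """
    used = defaultdict(set)   # vertex -> colors already on incident edges
    coloring = {}
    for (u, v) in edges:
        c = 0
        while c in used[u] or c in used[v]:
            c += 1
        coloring[(u, v)] = c
        used[u].add(c)
        used[v].add(c)
    return coloring

# A triangle (Delta = 2) gets colors 0, 1, 2; Vizing guarantees Delta + 1 = 3 suffice here.
print(greedy_edge_coloring([(0, 1), (1, 2), (2, 0)]))
```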
We give a deterministic O(m log^{2/3} n)-time algorithm for single-source shortest paths (SSSP) on directed graphs with real non-negative edge weights in the comparison-addition model. This is the first result to break the O(m + n log n) time bound of Dijkstra’s algorithm on sparse graphs, showing that Dijkstra’s algorithm is not optimal for SSSP.
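For reference, the Dijkstra baseline being broken is the textbook priority-queue algorithm; a binary-heap sketch in Python runs in O((m + n) log n) time, and a Fibonacci heap gives the classic O(m + n log n) bound:

```python
import heapq

def dijkstra(n, adj, source):
    """Single-source shortest paths with non-negative weights.

    adj[u] is a list of (v, w) pairs. Binary-heap implementation:
    O((m + n) log n); a Fibonacci heap yields O(m + n log n).
    """
    dist = [float("inf")] * n
    dist[source] = 0.0
    pq = [(0.0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist

# Example: shortest path 0 -> 1 -> 2 has length 3.
print(dijkstra(3, {0: [(1, 1.0)], 1: [(2, 2.0)], 2: []}, 0))
```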
We construct 2-query, quasi-linear size probabilistically checkable proofs (PCPs) with arbitrarily small constant soundness, improving upon Dinur’s 2-query quasi-linear size PCPs with soundness 1−Ω(1). As an immediate corollary, we get that under the exponential time hypothesis, for all ε>0 no approximation algorithm for 3-SAT can obtain an approximation ratio of 7/8+ε in time 2^{n/log^C n}, where C is a constant depending on ε. Our result builds on a recent line of independent works by Bafna, Lifshitz and Minzer, and Dikstein, Dinur and Lubotzky, that showed the existence of linear size direct product testers with small soundness.
The main new ingredient in our proof is a technique that embeds a given 2-CSP into a 2-CSP on a prescribed graph, provided that the latter is a graph underlying a sufficiently good high-dimensional expander (HDX). We achieve this by establishing a novel connection between PCPs and fault-tolerant distributed computing, more precisely, to the almost-everywhere reliable transmission problem introduced by Dwork, Peleg, Pippenger and Upfal (1986). We instantiate this connection by showing that graphs underlying HDXs admit routing protocols that are tolerant to adversarial edge corruptions, also improving upon the state-of-the-art constructions of sparse edge-fault-tolerant networks in the process.
Our PCP construction requires variants of the aforementioned direct product testers with poly-logarithmic degree. The existence and constructability of these variants is shown in the full version.
For any ε > 0, we prove that k-Dimensional Matching is hard to approximate within a factor of k/(12 + ε) for large k unless NP ⊆ BPP. Listed in Karp’s 21 NP-complete problems, k-Dimensional Matching is a benchmark computational complexity problem which arises as a special case of many constrained optimization problems over independence systems, including k-Set Packing, k-Matroid Intersection, and Matroid k-Parity.
For all the aforementioned problems, the best-known lower bound was an Ω(k/log k)-hardness by Hazan, Safra, and Schwartz. In contrast, state-of-the-art algorithms achieve an approximation ratio of O(k).
Our result narrows down this gap to a constant and thus provides a rationale for the observed algorithmic difficulties. The crux of our result hinges on a novel approximation preserving gadget from R-degree bounded k-CSPs over alphabet size R to kR-Dimensional Matching. Along the way, we prove that R-degree bounded k-CSPs over alphabet size R are hard to approximate within a factor Ω_k(R) using known randomised sparsification methods for CSPs.
Bourgain’s symmetrization theorem is a powerful technique reducing boolean analysis on product spaces to the cube. It states that for any product space ⊗_{i=1}^{d} Ω_i, function f: ⊗_{i=1}^{d} Ω_i → ℝ, and q > 1:
‖T_{1/2} f(x)‖_q ≤ ‖f̃(r,x)‖_q ≤ ‖T_{c_q} f(x)‖_q
where T_ρ f = ∑_S ρ^{|S|} f^{=S} is the noise operator and f̃(r,x) = ∑_S r_S f^{=S}(x) ‘symmetrizes’ f by convolving its Fourier components {f^{=S}}_{S ⊆ [d]} with a random boolean string r ∈ {±1}^d.
In this work, we extend the symmetrization theorem to high dimensional expanders (HDX). Building on (O’Donnell and Zhao 2021), we show this implies nearly-sharp (2→q)-hypercontractivity for partite HDX. This resolves the main open question of (Gur, Lifshitz, and Liu STOC 2022) and gives the first fully hypercontractive subsets X ⊂ [n]d of support n·exp((d)), an exponential improvement over Bafna, Hopkins, Kaufman, and Lovett’s n·exp(exp(d)) bound (BHKL STOC 2022). Adapting (Bourgain JAMS 1999), we also give the first booster theorem for HDX, resolving a main open question of BHKL.
Our proof is based on two elementary new ideas in the theory of high dimensional expansion. First we introduce ‘q-norm HDX’, generalizing standard spectral notions to higher moments, and observe every spectral HDX is a q-norm HDX. Second, we introduce a simple method of coordinate-wise analysis on HDX which breaks high dimensional random walks into coordinate-wise components and allows each component to be analyzed as a 1-dimensional operator locally within X. This allows for application of standard tricks such as the replacement method, greatly simplifying prior analytic techniques.
This is an extended abstract. The full paper may be found at https://arxiv.org/abs/2408.16687
In a 3-XOR game G, the verifier samples a challenge (x,y,z)∼ µ where µ is a probability distribution over Σ×Γ×Φ, and a map t: Σ×Γ×Φ→A for a finite Abelian group A defining a constraint. The verifier sends the questions x, y and z to the players Alice, Bob and Charlie respectively, receives answers f(x), g(y) and h(z) that are elements in A and accepts if f(x)+g(y)+h(z) = t(x,y,z). The value, val(G), of the game is defined to be the maximum probability the verifier accepts over all players’ strategies. We show that if G is a 3-XOR game with value strictly less than 1, whose underlying distribution over questions µ does not admit Abelian embeddings into (ℤ,+), then the value of the n-fold repetition of G is exponentially decaying. That is, there exists c=c(G)>0 such that val(G^{⊗n}) ≤ 2^{−cn}. This extends a previous result of [Braverman-Khot-Minzer, FOCS 2023] showing exponential decay for the GHZ game. Our proof combines tools from additive combinatorics and tools from discrete Fourier analysis.
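To make the definitions concrete, the value of a small 3-XOR game can be computed by brute force, since deterministic strategies attain the optimum; a Python sketch (assuming a uniform question distribution µ), using the GHZ game over A = ℤ_2, whose classical value is 3/4, as the example instance:

```python
import itertools

def game_value(questions, t, A_size=2):
    """Value of a 3-XOR game over Z_A, brute-forced over deterministic strategies.

    questions: list of (x, y, z) challenges, assumed uniformly distributed.
    t: dict mapping (x, y, z) -> target element of Z_A.
    """
    X = sorted({q[0] for q in questions})
    Y = sorted({q[1] for q in questions})
    Z = sorted({q[2] for q in questions})
    best = 0.0
    for f in itertools.product(range(A_size), repeat=len(X)):
        for g in itertools.product(range(A_size), repeat=len(Y)):
            for h in itertools.product(range(A_size), repeat=len(Z)):
                wins = sum(
                    (f[X.index(x)] + g[Y.index(y)] + h[Z.index(z)]) % A_size == t[(x, y, z)]
                    for (x, y, z) in questions
                )
                best = max(best, wins / len(questions))
    return best

# GHZ game: accept iff f(x)+g(y)+h(z) = OR(x,y,z) mod 2; classical value 3/4.
ghz_questions = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
ghz_t = {q: int(any(q)) for q in ghz_questions}
print(game_value(ghz_questions, ghz_t))  # 0.75
```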
The network unreliability problem asks for the probability that a given undirected graph gets disconnected when every edge independently fails with a given probability p. Valiant (1979) showed that this problem is #P-hard; therefore, the best we can hope for are approximation algorithms. In a classic result, Karger (1995) obtained the first FPRAS for this problem by leveraging the fact that when a graph disconnects, it almost always does so at a near-minimum cut, and there are only a small (polynomial) number of near-minimum cuts. Since then, a series of results have obtained progressively faster algorithms to the current bound of m^{1+o(1)} + Õ(n^{3/2}) (Cen, He, Li, and Panigrahi, 2024). In this paper, we obtain an m^{1+o(1)}-time algorithm for the network unreliability problem. This is essentially optimal, since we need O(m) time to read the input graph. Our main new ingredient is relating network unreliability to an ideal tree packing of spanning trees (Thorup, 2001).
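For comparison with the FPRAS line of work above, the naive Monte Carlo estimator is easy to write down but needs roughly 1/Pr[disconnect] samples for relative error, which is exactly what makes the problem hard when p is small; a Python sketch with union-find connectivity (illustrative baseline only):

```python
import random

def unreliability_monte_carlo(n, edges, p, samples=10000, seed=0):
    """Naively estimate Pr[graph disconnects] when each edge fails independently w.p. p."""
    rng = random.Random(seed)

    def connected(kept):
        parent = list(range(n))
        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path halving
                x = parent[x]
            return x
        comps = n
        for u, v in kept:
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                comps -= 1
        return comps == 1

    hits = 0
    for _ in range(samples):
        kept = [e for e in edges if rng.random() >= p]  # surviving edges
        if not connected(kept):
            hits += 1
    return hits / samples

# Example: a 4-cycle with p = 0.5.
print(unreliability_monte_carlo(4, [(0, 1), (1, 2), (2, 3), (3, 0)], 0.5))
```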
We consider the Global Minimum Vertex-Cut problem: given an undirected vertex-weighted graph G, compute a minimum-weight subset of its vertices whose removal disconnects G. The problem is closely related to Global Minimum Edge-Cut, where the weights are on the graph edges instead of vertices, and the goal is to compute a minimum-weight subset of edges whose removal disconnects the graph. Global Minimum Cut is one of the most basic and extensively studied problems in combinatorial optimization and graph theory. While an almost-linear time algorithm was known for the edge version of the problem for a while (Karger, STOC 1996 and J. ACM 2000), the fastest previous algorithm for the vertex version (Henzinger, Rao and Gabow, FOCS 1996 and J. Algorithms 2000) achieves a running time of Õ(mn), where m and n denote the number of edges and vertices in the input graph, respectively. For the special case of unit vertex weights, this bound was broken only recently (Li et al., STOC 2021); their result, combined with the recent breakthrough almost-linear time algorithm for Maximum s-t Flow (Chen et al., FOCS 2022, van den Brand et al., FOCS 2023), yields an almost-linear time algorithm for Global Minimum Vertex-Cut with unit vertex weights. In this paper we break the 28-year-old bound of Henzinger et al. for the general weighted Global Minimum Vertex-Cut, by providing a randomized algorithm for the problem with running time O(min{mn^{0.99+o(1)}, m^{1.5+o(1)}}).
We present a nearly linear work parallel algorithm for approximating the Held–Karp bound for Metric TSP. Given an edge-weighted undirected graph G=(V,E) on m edges and ε>0, it returns a (1+ε)-approximation to the Held–Karp bound with high probability, in Õ(m/ε^4) work and Õ(1/ε^4) depth. While a nearly linear time sequential algorithm was known for almost a decade (Chekuri and Quanrud ’17), it was not known how to simultaneously achieve nearly linear work alongside polylogarithmic depth. Using a reduction by Chalermsook et al. ’22, we also give a parallel algorithm for computing a (1+ε)-approximate fractional solution to the k-edge-connected spanning subgraph (k-ECSS) problem, with similar complexity.
To obtain these results, we introduce a notion of core-sequences for the parallel Multiplicative Weights Update (MWU) framework (Luby–Nisan ’93, Young ’01). For Metric TSP and k-ECSS, core-sequences enable us to exploit the structure of approximate minimum cuts to reduce the cost per iteration and/or the number of iterations. The acceleration technique via core-sequences is generic and of independent interest. In particular, it improves the best-known iteration complexity of MWU algorithms for packing/covering LPs from poly(log nnz(A)) to polylogarithmic in the product of cardinalities of the core-sequence sets, where A is the constraint matrix of the LP. For certain implicitly defined LPs such as the k-ECSS LP, this yields an exponential improvement in depth.
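The primitive being accelerated here is the standard multiplicative-weights update; a generic Hedge-style sketch in Python (illustrative of the MWU framework only, not of the core-sequence technique; the loss oracle is a hypothetical placeholder for the constraint feedback used in LP solvers):

```python
import math

def mwu(loss_oracle, n_experts, rounds, eta=0.1):
    """Generic multiplicative weights (Hedge) update over n_experts "experts".

    loss_oracle(t, p) returns a loss vector in [0, 1]^n for the current
    distribution p; the cumulative loss is within roughly eta*T + ln(n)/eta
    of the best fixed expert. In MWU-based LP solvers the experts play the
    role of constraints and the oracle encodes the current fractional solution.
    """
    w = [1.0] * n_experts
    for t in range(rounds):
        total = sum(w)
        p = [wi / total for wi in w]
        losses = loss_oracle(t, p)
        w = [wi * math.exp(-eta * li) for wi, li in zip(w, losses)]
    total = sum(w)
    return [wi / total for wi in w]

# Toy usage: expert 0 never incurs loss, so its weight comes to dominate.
print(mwu(lambda t, p: [0.0, 1.0, 1.0], n_experts=3, rounds=200))
```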
Recent oracle separations [Kretschmer, TQC’21, Kretschmer et al., STOC’23] have raised the tantalizing possibility of basing quantum cryptography on sources of hardness that persist even if the polynomial hierarchy collapses. We realize this possibility by building quantum bit commitments and secure computation from unrelativized, well-studied mathematical problems that are conjectured to be #P-hard – such as approximating the permanents of complex Gaussian matrices, or approximating the output probabilities of random quantum circuits. Indeed, we show that as long as any one of the conjectures underlying sampling-based quantum advantage (e.g., BosonSampling [Aaronson-Arkhipov, STOC’11], Random Circuit Sampling [Boixo et al., Nature Physics 2018], IQP [Bremner, Jozsa and Shepherd, Proc. Royal Society of London 2010]) is true, quantum cryptography can be based on the extremely mild (worst-case) complexity assumption that efficient quantum machines cannot decide P^#P.
Our techniques uncover strong connections between the hardness of approximating the probabilities of outcomes of quantum processes, the existence of “one-way” state synthesis problems, and the existence of useful cryptographic primitives such as one-way puzzles and quantum bit commitments. Specifically, we prove that the following hardness assumptions are equivalent under quantum polynomial time reductions.
- The hardness of approximating the probabilities of outcomes of quantumly efficiently sampleable distributions, up to inverse polynomial relative error.
- The existence of one-way puzzles, where a quantum sampler outputs a pair of classical strings – a puzzle and its key – and where the hardness lies in finding any key corresponding to a random puzzle. These are known to imply quantum bit commitments [Khurana and Tomer, STOC’24].
- The existence of state puzzles, or one-way state synthesis problems, where it is hard to synthesize a secret quantum state given a public classical identifier. These capture the hardness of search problems with quantum secrets and classical challenges.
These are the first constructions of quantum cryptographic primitives (one-way puzzles, quantum bit commitments, state puzzles) from mathematical assumptions that do not imply the existence of one-way functions/classical cryptography. Along the way, we also show that distributions that admit efficient quantum samplers but cannot be pseudo-deterministically efficiently sampled imply quantum commitments.
We construct a classical oracle relative to which P = NP but quantum-computable quantum-secure trapdoor one-way functions exist. This is a substantial strengthening of the result of Kretschmer, Qian, Sinha, and Tal (STOC 2023), which only achieved single-copy pseudorandom quantum states relative to an oracle that collapses NP to P. For example, our result implies multi-copy pseudorandom states and pseudorandom unitaries, but also classical-communication public-key encryption, signatures, and oblivious transfer schemes relative to an oracle on which P=NP. Hence, in our new relativized world, classical computers live in “Algorithmica” whereas quantum computers live in “Cryptomania,” using the language of Impagliazzo’s worlds. Our proof relies on a new distributional block-insensitivity lemma for AC^0 circuits, wherein a single block is resampled from an arbitrary distribution.
Aaronson, Atia, and Susskind (2020) established that efficiently mapping between quantum states |ψ⟩ and |φ⟩ is computationally equivalent to distinguishing their superpositions 1/√2(|ψ⟩ + |φ⟩) and 1/√2(|ψ⟩ − |φ⟩). We generalize this insight into a broader duality principle in quantum computation, wherein manipulating quantum states in one basis is equivalent to extracting their value in a complementary basis. In its most general form, this duality principle states that for a given group, the ability to implement a unitary representation of the group is computationally equivalent to the ability to perform a Fourier extraction from the invariant subspaces corresponding to its irreducible representations.
Building on our duality principle, we present the following applications: (1) Quantum money, which captures quantum states that are verifiable but unclonable, and its stronger variant, quantum lightning, have long resisted constructions based on concrete cryptographic assumptions. While (public-key) quantum money has been constructed from indistinguishability obfuscation (iO)—an assumption widely considered too strong—quantum lightning has not been constructed from any such assumptions, with previous attempts based on assumptions that were later broken. We present the first construction of quantum lightning with a rigorous security proof, grounded in a plausible and well-founded cryptographic assumption. We extend the construction of Zhandry (2024) from Abelian group actions to non-Abelian group actions, and eliminate Zhandry’s reliance on a black-box model for justifying security. Instead, we prove a direct reduction to a computational assumption – the pre-action security of cryptographic group actions. We show how these group actions can be realized with various instantiations, including with the group actions of the symmetric group implicit in the McEliece cryptosystem. (2) We provide an alternative quantum money and lightning construction from one-way homomorphisms, showing that security holds under specific conditions on the homomorphism. Notably, our scheme exhibits the remarkable property that four distinct security notions—quantum lightning security, security against both worst-case cloning and average-case cloning, and security against preparing a specific canonical state – are all equivalent. (3) Quantum fire captures the notion of a samplable distribution on quantum states that are efficiently clonable, but not efficiently telegraphable, meaning they cannot be efficiently encoded as classical information. These states can be spread like fire, provided they are kept alive quantumly and do not decohere. The only previously known construction relied on a unitary quantum oracle, whereas we present the first candidate construction of quantum fire using a classical oracle.
We define the notion of a classical commitment scheme to quantum states, which allows a quantum prover to compute a classical commitment to a quantum state, and later open each qubit of the state in either the standard or the Hadamard basis. Our notion is a strengthening of the measurement protocol from Mahadev (STOC 2018). We construct such a commitment scheme from the post-quantum Learning With Errors (LWE) assumption, and more generally from any noisy trapdoor claw-free function family that has the distributional strong adaptive hardcore bit property (a property that we define in this work).
Our scheme is succinct in the sense that the running time of the verifier in the commitment phase depends only on the security parameter (independent of the size of the committed state), and its running time in the opening phase grows only with the number of qubits that are being opened (and the security parameter). As a corollary we obtain a classical succinct argument system for QMA under the post-quantum LWE assumption. Previously, this was only known assuming post-quantum secure indistinguishability obfuscation. As an additional corollary we obtain a generic way of converting any X/Z quantum PCP into a succinct argument system under the quantum hardness of LWE.
We present a general framework for designing efficient data structures for high-dimensional pattern-matching problems (∃ i ∈ [n], f(x_i,y) = 1) through communication models in which f(x,y) admits sublinear communication protocols with exponentially-small error. Specifically, we reduce the data structure problem to the Unambiguous Arthur-Merlin (UAM) communication complexity of f(x,y) under product distributions. We apply our framework to the Partial Match problem (a.k.a. matching with wildcards), whose underlying communication problem is sparse set-disjointness. When the database consists of n points in dimension d, and the number of ⋆’s in the query is at most w = c log n (≪ d), the fastest known linear-space data structure (Cole, Gottlieb and Lewenstein, STOC’04) had query time t ≈ 2^w = n^c, which is nontrivial only when c<1. By contrast, our framework produces a data structure with query time n^{1−1/(c log^2 c)} and space close to linear. To achieve this, we develop a one-sided є-error communication protocol for set-disjointness under product distributions with Θ(√d log(1/є)) complexity, improving on the classical result of Babai, Frankl and Simon (FOCS’86). Building on this protocol, we show that the Unambiguous AM communication complexity of w-Sparse set-disjointness with є-error under product distributions is Õ(√w log(1/є)), independent of the ambient dimension d, which is crucial for the partial match result. Our framework sheds further light on the power of data-dependent data structures, which is instrumental for reducing to the (much easier) case of product distributions.
In this work, we study the relative hardness of fundamental problems with state-of-the-art word RAM algorithms taking O(n√log n) time for instances described in Θ(n) machine words (i.e., Θ(n log n) bits). The word RAM model nowadays serves as the default model of computation for sequential algorithms, and understanding its limitations lies at the core of theoretical computer science. The class of problems solvable in O(n√log n) time is one of the six levels of hardness listed in the seminal paper of Chan and Pǎtraşcu [SODA 2010]. According to the current state of knowledge, this class characterizes problems from several domains, including counting inversions, string processing problems (BWT Construction, LZ77 Factorization, Longest Common Substring, Batched Longest Previous Factor Queries, Batched Inverse Suffix Array Queries), and computational geometry problems (Orthogonal Range Counting, Orthogonal Segment Intersection). Our contribution is twofold:
We present several new connections between the aforementioned string problems and an old Dictionary Matching problem, which asks whether a given text contains (an exact occurrence of) at least one among the given patterns. This is a classical problem with a solution based on the Aho–Corasick automaton dating back to 1975. In this work, we restrict Dictionary Matching to instances with O(n) binary patterns of length m=O(log n), short enough to be stored using O(1) machine words each, and we prove that, unless this problem can be solved faster than the current bound of O(n√log n), most fundamental string problems cannot be solved faster either.
With further reductions, we extend this hierarchy beyond string problems, proving that computational tasks like counting inversions—a fundamental component in geometric algorithms—inherit this hardness. This, in turn, establishes the hardness of Orthogonal Range Counting and Orthogonal Segment Intersection. The key to extending our results to other domains is a surprising equivalent characterization of Dictionary Matching in terms of a new problem we call String Nesting, which, through a chain of three more reductions, can be solved by counting inversions.
Put together, our results unveil a single hard problem, with two different but equivalent formulations, that underlies the hardness of nearly all known major problems, coming from different domains, currently occupying the O(n√log n) level of hardness. These results drastically narrow the focus of further efforts to improve the complexity of these near-linear problems.
Many of our reductions hold even for simpler versions of basic problems, such as determining the parity of the number of phrases in the LZ77 factorization or the number of runs in the BWT. This yields stronger results that can be used to design future reductions more easily. As an auxiliary outcome of our framework, we also prove that several central string problems in the RAM model do not get easier when limited to strings over the binary alphabet. Our reductions to the binary case simplify the currently fastest algorithms for many classical problems, including LZ77 Factorization and Longest Common Substring.
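For concreteness, the counting-inversions primitive used above as the bridge to the geometric problems has a textbook O(n log n) merge-sort solution; the O(n√log n) word-RAM algorithms in question improve on this comparison-based baseline:

```python
def count_inversions(a):
    """Count pairs i < j with a[i] > a[j] via merge sort: O(n log n) comparisons.

    The word-RAM algorithms discussed above beat this bound using bit-level
    parallelism; this is only the classical comparison-based baseline.
    """
    if len(a) <= 1:
        return a, 0
    mid = len(a) // 2
    left, inv_l = count_inversions(a[:mid])
    right, inv_r = count_inversions(a[mid:])
    merged, inv = [], inv_l + inv_r
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
            inv += len(left) - i     # every remaining left element exceeds right[j]
    merged.extend(left[i:]); merged.extend(right[j:])
    return merged, inv

print(count_inversions([3, 1, 2])[1])  # 2 inversions: (3,1) and (3,2)
```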
A connected dominating set (CDS) in a graph is a dominating set of vertices that induces a connected subgraph. Having many disjoint CDSs in a graph can be considered as a measure of its connectivity, and has various graph-theoretic and algorithmic implications.
We show that d-regular (weakly) pseudorandom graphs contain (1+o(1))d/ln d disjoint CDSs, which is asymptotically best possible. In particular, this implies that random d-regular graphs typically contain (1+o(1))d/ln d disjoint CDSs.
In this paper we consider the problem of finding “as many edge-disjoint Hamilton cycles as possible” in the binomial random digraph Dn,p.
We show that a typical Dn,p contains precisely as many edge-disjoint Hamilton cycles as the minimum of its minimum out-degree and minimum in-degree, provided that p ≥ log^{15} n/n, which is optimal up to a factor of polylog n.
Our proof provides a randomized algorithm to generate the cycles. It relies on a novel idea of generating Dn,p in a sophisticated way that enables us to control some key properties, and on an “online sprinkling” technique introduced by Ferber and Vu.
Estimating the second frequency moment of a stream up to (1±ε) multiplicative error requires at most O(log n / ε^2) bits of space, due to a seminal result of Alon, Matias, and Szegedy.
It is also known that at least Ω(log n + 1/ε^2) space is needed.
We prove a tight lower bound of Ω(log(nε^2)/ε^2) for all ε = Ω(1/√n).
Note that when ε > n^{−1/2+c}, where c>0, our lower bound matches the classic upper bound of AMS. For smaller values of ε we also introduce a revised algorithm that improves the classic AMS bound and matches our lower bound.
Our lower bound holds also for the more general problem of p-th frequency moment estimation for the range of p∈ (1,2], giving a tight bound in the only remaining range to settle the optimal space complexity of estimating frequency moments.
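For reference, the classic AMS estimator underlying these bounds maintains random ±1 linear projections of the frequency vector and averages their squares; a simplified Python sketch (with fully independent signs standing in for the 4-wise independent hash functions used in the actual analysis):

```python
import random

def ams_f2_estimate(stream, num_counters=100, seed=0):
    """Estimate F2 = sum_i f_i^2 of a stream of items.

    Each counter maintains Z = sum_i s(i) * f_i for a random sign function s;
    E[Z^2] = F2, and averaging on the order of 1/eps^2 counters gives a
    (1 +- eps) estimate. (Full randomness here is a simplification.)
    """
    rng = random.Random(seed)
    signs = [dict() for _ in range(num_counters)]
    counters = [0] * num_counters
    for item in stream:
        for k in range(num_counters):
            s = signs[k].setdefault(item, rng.choice((-1, 1)))
            counters[k] += s
    return sum(z * z for z in counters) / num_counters

# Frequencies: a -> 3, b -> 2, c -> 1, so F2 = 9 + 4 + 1 = 14.
print(ams_f2_estimate(list("aaabbc"), num_counters=500))
```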
We consider the problems of distributed heavy hitters and frequency moments in both the coordinator model and the distributed tracking model (also known as the distributed functional monitoring model). We present simple and optimal (up to logarithmic factors) algorithms for ℓp heavy hitters and Fp estimation (p ≥ 2) in these distributed models.
For ℓp heavy hitters in the coordinator model, our algorithm requires only one round and uses Õ(k^{p−1}/ε^p) bits of communication. For p > 2, this is the first near-optimal result. By combining our algorithm with the standard recursive sketching technique, we obtain a near-optimal two-round algorithm for Fp in the coordinator model, matching a significant result from recent work by Esfandiari et al. (STOC 2024). Our algorithm and analysis are much simpler and have better costs with respect to logarithmic factors. Furthermore, our technique provides a one-round algorithm for Fp, which is a significant improvement over a result of Woodruff and Zhang (STOC 2012).
Thanks to the simplicity of our heavy hitter algorithms, we manage to adapt them to the distributed tracking model with only a (n) increase in communication. For ℓp heavy hitters, our algorithm has a communication cost of Õ(k^{p−1}/ε^p), representing the first near-optimal algorithm for all p ≥ 2. By applying the recursive sketching technique, we also provide the first near-optimal algorithm for Fp in the distributed tracking model, with a communication cost of Õ(k^{p−1}/ε^2) for all p ≥ 2. Even for F2, our result improves upon the bounds established by Cormode, Muthukrishnan, and Yi (SODA 2008) and Woodruff and Zhang (STOC 2012), nearly matching the existing lower bound for the first time.
Correlation clustering is a widely-used approach for clustering large data sets based only on pairwise similarity information. In recent years, there has been a steady stream of better and better classical algorithms for approximating this problem. Meanwhile, another line of research has focused on porting the classical advances to various sublinear algorithm models, including semi-streaming, Massively Parallel Computation (MPC), and distributed computing. Yet, these latter works typically rely on ad-hoc approaches that do not necessarily keep up with advances in approximation ratios achieved by classical algorithms.
Hence, the motivating question for our work is this: can the gains made by classical algorithms for correlation clustering be ported over to sublinear algorithms in a black-box manner? We answer this question in the affirmative by introducing the paradigm of graph de-sparsification.
A versatile approach for designing sublinear algorithms across various models is graph (linear) sketching. It is known that one can find a cut sparsifier of a given graph—which approximately preserves cut structures—via graph sketching, and that this is sufficient information-theoretically for recovering a near-optimal correlation clustering solution. However, no efficient algorithms are known for this task, as the resulting cut sparsifier is necessarily a weighted graph, and correlation clustering is known to be a distinctly harder problem on weighted graphs. Our main result is a randomized linear sketch of O(n) size for n-vertex graphs, from which one can recover with high probability an (α+o(1))-approximate correlation clustering in polynomial time, where α is the best approximation ratio of any polynomial time classical algorithm for (unweighted) correlation clustering. This is proved via our new de-sparsification result: we recover in polynomial time, from an O(n) size linear sketch of a graph G, an unweighted, simple graph that approximately preserves the cut structure of G. In fact, we show that under some mild conditions, any spectral sparsifier of a graph G can be de-sparsified into an unweighted simple graph with nearly the same spectrum. We believe the de-sparsification paradigm is interesting in its own right as a way of reducing graph complexity when the weighted version of a problem is harder than its unweighted version.
Finally, we use our techniques to get efficient algorithms for correlation clustering that match the performance of best classical algorithms, in a variety of different models, including dynamic streaming, MPC, and distributed communication models.
We study the task of agnostic tomography: given copies of an unknown n-qubit state ρ which has fidelity τ with some state in a given class C, find a state which has fidelity ≥ τ − є with ρ. We give a new framework, stabilizer bootstrapping, for designing computationally efficient protocols for this task, and use this to get new agnostic tomography protocols for the following classes:
Stabilizer states: We give a protocol that runs in time poly(n,1/є)·(1/τ)^{O(log(1/τ))}, answering an open question posed by Grewal, Iyer, Kretschmer, and Liang and by Anshu and Arunachalam. Previous protocols ran in time exp(Θ(n)) or required τ > cos^2(π/8).
States with stabilizer dimension n − t: We give a protocol that runs in time n^3·(2^t/τ)^{O(log(1/є))}, extending recent work on learning quantum states prepared by circuits with few non-Clifford gates, which only applied in the realizable setting where τ = 1.
Discrete product states: If C = K^{⊗n} for some µ-separated discrete set K of single-qubit states, we give a protocol that runs in time (n/µ)^{O((1 + log(1/τ))/µ)}/є^2. This strictly generalizes a prior guarantee which applied to stabilizer product states. For stabilizer product states, we give a further improved protocol that runs in time (n^2/є^2)·(1/τ)^{O(log(1/τ))}.
As a corollary, we give the first protocol for estimating stabilizer fidelity, a standard measure of magic for quantum states, to error є in n^3 · quasipoly(1/є) time.
We consider the problem of testing whether an unknown n-qubit quantum state is a stabilizer state, with only single-copy access. We give an algorithm solving this problem using O(n) copies, and conversely prove that Ω(√n) copies are required for any algorithm. The main observation behind our algorithm is that when repeatedly measuring in a randomly chosen stabilizer basis, stabilizer states are the most likely among the set of all pure states to exhibit linear dependencies in measurement outcomes. Our algorithm is designed to probe deviations from this extremal behavior. For the lower bound, we first reduce stabilizer testing to the task of distinguishing random stabilizer states from the maximally mixed state. We then argue that, without loss of generality, it is sufficient to consider measurement strategies that a) lie in the commutant of the tensor action of the Clifford group and b) satisfy a Positive Partial Transpose (PPT) condition. By leveraging these constraints, together with novel results on the partial transposes of the generators of the Clifford commutant, we derive the lower bound on the sample complexity.
We study functions f : [0, 1]d → [0, 1]d that are both monotone and contracting, and we consider the problem of finding an ε-approximate fixed point of f. We show that the problem lies in the complexity class UEOPL. We give an algorithm that finds an ε-approximate fixed point of a three-dimensional monotone contraction using O(log(1/ε)) queries to f. We also give a decomposition theorem that allows us to use this result to obtain an algorithm that finds an ε-approximate fixed point of a d-dimensional monotone contraction using O((c · log(1/ε))^{⌈d/3⌉}) queries to f for some constant c. Moreover, each step of both of our algorithms takes time that is polynomial in the representation of f. These results are strictly better than the best-known results for functions that are only monotone, or only contracting.
All of our results also apply to Shapley stochastic games, which are known to be reducible to the monotone contraction problem. Thus we put Shapley games in UEOPL, and we give a faster algorithm for approximating the value of a Shapley game.
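For intuition about the query measure, note that for a contraction whose factor is bounded away from 1 (and without using monotonicity), plain Banach iteration already finds an ε-approximate fixed point in O(log(1/ε)) queries; a Python sketch under that assumption:

```python
def approx_fixed_point(f, x0, eps, c=0.5):
    """Banach iteration for a c-contraction f (||f(x)-f(y)|| <= c*||x-y||, c < 1).

    If ||f(x) - x|| <= eps*(1-c), then x is within eps of the true fixed point,
    and the error shrinks by a factor c per query, so O(log(1/eps)) queries
    suffice for constant c. (The results above additionally exploit
    monotonicity, with query bounds depending on the dimension d.)
    """
    x = tuple(x0)
    queries = 0
    while True:
        fx = tuple(f(x))
        queries += 1
        if max(abs(a - b) for a, b in zip(fx, x)) <= eps * (1 - c):
            return fx, queries
        x = fx

# f(x) = x/2 + 1/4 is a 1/2-contraction on [0,1]^3 with fixed point (1/2, 1/2, 1/2).
print(approx_fixed_point(lambda x: [xi / 2 + 0.25 for xi in x], (0.0, 0.0, 0.0), 1e-6))
```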
We propose efficient no-regret learning dynamics and ellipsoid-based methods for computing linear correlated equilibria—a relaxation of correlated equilibria and a strengthening of coarse correlated equilibria—in general convex games. These are games where the number of pure strategies is potentially exponential in the natural representation of the game, such as extensive-form games. Our work identifies linear correlated equilibria as the tightest known notion of equilibrium that is computable in polynomial time and is efficiently learnable for general convex games. Our results are enabled by a generalization of the seminal framework of Gordon et al. for Φ-regret minimization, providing extensions to this framework that can be used even when the set of deviations Φ is intractable to separate/optimize over. Our polynomial-time algorithms are similarly enabled by extending the Ellipsoid-Against-Hope approach of Papadimitriou and Roughgarden and its generalization to games of non-polynomial type proposed by Farina and Pipis. We provide an extension to these approaches when we do not have access to the separation oracles required by these works for the dual player.
In communication complexity the input of a function f:X×Y→Z is distributed between two players Alice and Bob. If Alice knows only x∈X and Bob only y∈Y, how much information must Alice and Bob share to be able to elicit the value of f(x,y)? Do we need ℓ times more resources to solve ℓ instances of a problem? This question is the direct sum question and has been studied in many computational models. In this paper we focus on the case of 2-party deterministic communication complexity and give a counterexample to the direct sum conjecture in its strongest form. To do so we exhibit a family of functions for which the complexity of solving ℓ instances is less than (1−ϵ)ℓ times the complexity of solving one instance for some small enough ϵ>0. We use a customised method in the analysis of our family of total functions, showing that one can force the alternation of rounds between players. This idea allows us to exploit the integrality of the complexity measure to create an increasing gap between the complexity of solving the instances independently and that of solving them together.
Trevisan and Vadhan (FOCS 2000) introduced the notion of (seedless) extractors for samplable distributions. They showed that under a very strong complexity theoretic hardness assumption, there are extractors for samplable distributions with large min-entropy of k=(1−γ) · n, for some small constant γ>0. Recent work by Ball, Goldin, Dachman-Soled and Mutreja (FOCS 2023) weakened the hardness assumption. However, since the original paper by Trevisan and Vadhan, there has been no improvement in the min-entropy threshold k.
In this paper we give a construction of extractors for samplable distributions with low min-entropy of k=n^{1−γ} for some constant γ>0, and in particular we achieve k<n/2 (which is a barrier for the construction of Trevisan and Vadhan).
Our extractors are constructed under a hardness assumption that is weaker than the one used by Trevisan and Vadhan, and stronger than that used by Ball, Goldin, Dachman-Soled and Mutreja. Specifically, we assume that there exists a constant β>0, and a problem in E = DTIME(2^{O(n)}) that cannot be computed by size 2^{βn} circuits that have an oracle to Σ_5^P.
Our approach builds on the technique of Trevisan and Vadhan, while introducing new objects and ideas. We introduce and construct two objects: an errorless (seedless) condenser for samplable distributions, and functions that are hard to compute on every samplable distribution with sufficient min-entropy. We use techniques by Shaltiel and Silbak (STOC 2024), as well as additional tools and ideas, to construct the two new objects, under the hardness assumption. We then show how to modify the construction of Trevisan and Vadhan, using these new objects, so that the barrier of k=n/2 can be bypassed, and we can achieve an extractor for samplable distributions with low min-entropy.
In the classical NP-hard (metric) k-median problem, we are given a set of n clients and centers with metric distances between them, along with an integer parameter k ≥ 1. The objective is to select a subset of k open centers that minimizes the total distance from each client to its closest open center.
In their seminal work, Jain, Mahdian, Markakis, Saberi, and Vazirani presented the Greedy algorithm for facility location, which implies a 2-approximation algorithm for k-median that opens k centers in expectation.
Since then, substantial research has aimed at narrowing the gap between their algorithm and the best achievable approximation by an algorithm guaranteed to open exactly k centers,
as required in the k-median problem.
During the last decade, all improvements have been achieved by leveraging their algorithm (or a small improvement thereof), followed by a second step called bi-point rounding, which inherently adds an additional factor to the approximation guarantee.
Our main result closes this gap: for any ε > 0, we present a (2+ε)-approximation algorithm for the k-median problem, improving the previous best-known approximation factor of 2.613. Our approach builds on a combination of two key algorithms. First, we present a non-trivial modification of the Greedy algorithm that operates with only O(log n/ε^2) adaptive phases. Through a novel walk-between-solutions approach, this enables us to construct a (2+ε)-approximation algorithm for k-median that consistently opens at most k + O(log n/ε^2) centers: via known results, this already implies a (2+ε)-approximation algorithm that runs in quasi-polynomial time. Second, we develop a novel (2+ε)-approximation algorithm tailored for stable instances,
where removing any center from an optimal solution increases the cost by at least an Ω(ε^3/log n) fraction. Achieving this involves several ideas, including a sampling approach inspired by the k-means++ algorithm and a reduction to submodular optimization subject to a partition matroid.
This allows us to convert the previous result into a polynomial time algorithm that opens exactly k centers while maintaining the (2+ε)-approximation guarantee.
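One ingredient mentioned above, the sampling approach inspired by k-means++, amounts to choosing each new center with probability proportional to its distance to the already-chosen centers (distances rather than squared distances, as is natural for k-median-type costs); a generic Python sketch with a user-supplied metric, illustrative only and not the algorithm of the paper:

```python
import random

def distance_sampling_seed(points, k, dist, seed=0):
    """Pick k centers; each new one is sampled with probability proportional
    to its distance from the closest already-chosen center (k-means++ style,
    with plain distances for a k-median-type objective)."""
    rng = random.Random(seed)
    centers = [rng.choice(points)]
    while len(centers) < k:
        d = [min(dist(p, c) for c in centers) for p in points]
        total = sum(d)
        if total == 0:              # all points coincide with chosen centers
            centers.append(rng.choice(points))
            continue
        r, acc = rng.random() * total, 0.0
        for p, dp in zip(points, d):
            acc += dp
            if acc >= r:
                centers.append(p)
                break
    return centers

# Toy usage on the real line with |x - y| as the metric.
print(distance_sampling_seed([0.0, 0.1, 0.2, 5.0, 5.1, 10.0], 3, dist=lambda x, y: abs(x - y)))
```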
We prove that the integrality gap of the Goemans–Linial semidefinite program for the Sparsest Cut problem (with general capacities and demands) on inputs of size n ≥ 2 is Θ(√log n). We achieve this by establishing the following geometric/structural result. If (M,d) is an n-point metric space of negative type, then for every τ>0 there is a random subset Z of M such that for any pair of points x,y∈ M with d(x,y)≥ τ, the probability that both x∈ Z and d(y,Z) ≥ βτ/√(1+log(|B(y,κβτ)|/|B(y,βτ)|)) is Ω(1), where 0<β<1<κ are universal constants. The proof relies on a refinement of the Arora–Rao–Vazirani rounding technique.
We give an asymptotically good family of quantum CSS codes on qubits with a transversal CCZ gate, meaning that the parallel logical CCZ on all logical qubits is performed by parallel physical CCZs on (a subset of) physical qubits. The construction is based on the observation that any classical code satisfying a multiplication property can be used to construct a quantum CSS code with transversal (qudit) CCZ. To obtain a constant-rate and linear-distance family, we then instantiate this construction with a classical good family of algebraic-geometry codes on a non-binary, but constant-sized, alphabet. Finally, we use a technique from the arithmetic secret sharing literature to reduce the alphabet to binary. As a corollary, the constructed code family provides a magic state distillation scheme with constant space overhead.
In a model of fault-tolerant quantum computation with quick and noiseless polyloglog-time auxiliary classical computation, we construct a quantum fault tolerance protocol with constant-space and O(log N)-time overheads, where the notation O(·) hides sub-polylogarithmic factors. This significantly improves over the previous state-of-the-art protocol of Yamasaki and Koashi that achieved constant-space and quasi-polylogarithmic-time overhead. Our construction is obtained by using constant-rate quantum locally testable codes (qLTC) of appropriately chosen block size and developing new fault-tolerant gadgets on qLTCs and qLDPC codes. In particular, we obtain the following new technical results:
(1) We develop a magic state distillation protocol with (log(1/ε))^γ spacetime overhead, where γ → 0 as ε → 0. This result differs from recent works in that we use a simple and self-contained construction using Reed-Solomon codes to obtain low spacetime overhead (rather than just space overhead as in recent works). We show a similar result for stabilizer state distillation.
(2) We prove that quantum codes based on the cubical complex construction admit sequential and parallel single-shot decoders against adversarial errors of weight scaling with the code distance. In particular, our proof applies to a recent family of almost-good qLTCs of Dinur-Lin-Vidick and the good qLDPC codes of Dinur-Hsieh-Lin-Vidick.
(3) Using the introduced distillation schemes, we develop fault-tolerant logical state preparation procedures with O(1)-spacetime overhead on qLTCs. Here, the qLTC property is used to quickly test if a state is too far from the codespace before proceeding.
(4) We introduce the use of multiple resource states (from a small-sized set) to obtain addressable qLDPC logic that can be easily prepared using our state preparation schemes and transversal qLDPC gates.
We obtain the main result by combining the above results with carefully chosen parameters. In doing so, we introduce a new weight enumerator formalism to prove fault tolerance in a composable way, which is of independent interest. To our knowledge, this gives the lowest spacetime overhead to date in the considered model of quantum fault tolerance, which, for the first time, matches that of classical fault tolerance up to sub-polylogarithmic factors. We conjecture this is optimal up to sub-polylogarithmic factors.
In recent years, Regev’s reduction has been used as a quantum algorithmic tool for providing a quantum advantage for variants of the decoding problem. Following this line of work, there has recently been a new design for a quantum algorithm called “Decoded Quantum Interferometry” that is able to solve several optimization problems in polynomial time. That work studies in particular the Optimal Polynomial Interpolation (OPI) problem, which can be seen as a decoding problem on Reed-Solomon codes. In this work, we provide strong improvements for some instantiations of the OPI problem. The most notable improvements are for the ISIS^∞ problem (originating from lattice-based cryptography) on Reed-Solomon codes, but we also study different constraints for OPI. Our results provide natural and convincing decoding problems for which we believe to have a quantum advantage. Our proof techniques involve the use of a soft decoder for Reed-Solomon codes, namely the decoding algorithm of Koetter and Vardy. In order to be able to use this decoder in the setting of Regev’s reduction, we provide a novel generic reduction from a syndrome decoding problem to a coset sampling problem, yielding a powerful and simple-to-use theorem, which generalizes previous work and is of independent interest. We also provide an extensive study of OPI using the Koetter and Vardy algorithm.
By giving new reductions, we show the following algorithmic results and relations between various isomorphism problems:
If Graph Isomorphism is in P, then testing equivalence of cubic forms in n variables over a finite field Fq, and testing isomorphism of n-dimensional algebras over Fq, can both be solved in time q^{O(n)}, improving from the brute-force upper bound q^{O(n^2)} for both of these.
Polynomial-time search- and counting-to-decision reductions for testing isomorphism of p-groups of class 2 and exponent p in the Cayley table model. This answers questions of Arvind and Torán (Bull. EATCS, 2005) for this group class, thought to be one of the hardest cases of Group Isomorphism.
Combined with the |G|^{O((log|G|)^{1/2})}-time isomorphism test for p-groups of Frattini class 2 (Ivanyos, Mendoza, Qiao, Sun, and Zhang, FOCS ’24), our reductions extend this runtime to p-groups of exponent p and class c < p.
Our reductions show that several other Tensor Isomorphism-complete problems over a finite prime field Fq can be solved in time q^{Õ(n^{3/2})}, where n is the side length. This improves the previous state of the art bound, which was the brute force q^{O(n^2)}, for the isomorphism problems for cubic forms, algebras, tensors, and more.
The key to our reductions is to give new gadgets that improve the parameters of previous reductions around Tensor Isomorphism (Grochow and Qiao, ITCS ’21; SIAM J. Comp., ’23). In particular, several of these previous reductions incurred a quadratic increase in the length of the tensors involved. When the tensors represent p-groups, this corresponds to an increase in the order of the group of the form |G|^{Θ(log|G|)}, negating any asymptotic gains in the Cayley table model. We remedy this by presenting a new kind of tensor gadget that allows us to replace those quadratic-length reductions with linear-length ones, yielding the above consequences.
Tensors over commutative rings naturally appear in number theory, geometry, and group theory. For example, 2× 2× 2 tensors over ℤ form the starting point of Bhargava’s celebrated works generalising Gauss’s composition law (Bhargava, Ann. Math., 2004). Symmetric tensors over ℤ are central to the classification of Calabi–Yau threefolds (Yau, Commun. Pure Appl. Math., 1978; Wall, Invent. Math., 1966), geometric objects of significance in string theory. Additionally, tensors over finite commutative rings closely correspond to finite nilpotent groups of class 2 (Baer, Trans. Am. Math. Soc., 1938).
In these settings, testing isomorphism of tensors is of great interest. For example, in mathematical physics, several recent works apply machine learning techniques to distinguish symmetric tensors from Calabi–Yau threefolds. For group isomorphism, a recent breakthrough of Sun (STOC, 2023) gives the first N^{o(log N)}-time algorithm for testing isomorphism of p-groups of class 2 and exponent p of order N, using tensor-based techniques. Grunewald and Segal studied the computability of tensor isomorphism problems over ℤ, showing that they are computable in finite time (Ann. Math., 1980).
In this work, we study isomorphism testing of tensors over commutative rings from a complexity-theoretic viewpoint, and its applications. Some of our main results are as follows.
Let R be a commutative ring. We introduce two complexity classes: 3TI_R, consisting of problems that are polynomial-time reducible to isomorphism problems of tensor products of three modules over R, and 3FTI_R, consisting of problems that are polynomial-time reducible to isomorphism problems of tensor products of three free modules over R.
We show that some classical problems considered by Grunewald and Segal (ibid.), and the problem of classifying Calabi–Yau threefolds, are 3FTI_ℤ-complete. We also show that many natural problems are complete for 3TI_{ℤ/p^eℤ}.
We show that testing isomorphism of tensors in ℤ^2⊗ℤ^2⊗ℤ^2 is polynomial-time equivalent to the principal ideal problem in algorithmic number theory. The key to this reduction is Bhargava’s work (Ann. Math., 2004). Using our equivalence, a result of Hallgren (J. ACM, 2007) then implies that 2× 2× 2 tensor isomorphism over ℤ is in quantum polynomial time.
We present an N^{O((log N)^{8/9})}-time algorithm for testing isomorphism of finite nilpotent groups of class 2 and odd order N. This is achieved by considering tensor isomorphism over ℤ/p^eℤ. Following the strategy of (Sun, STOC, 2023), the algorithm is a reduction to testing the congruence of matrix tuples over ℤ/p^eℤ, for which we present a polynomial-time solution following and generalizing (Ivanyos–Qiao, SIAM J. Comput., 2019), who solved the analogous problem over finite fields of odd order.
Matrix concentration inequalities, commonly used in the forms of non-commutative Khintchine inequalities or matrix Chernoff bounds, are central to a wide range of applications in computer science and mathematics. However, they fall short in many applications where tensor versions of these inequalities are needed.
In this work, we study the ℓp injective norms of sums of independent tensors. We obtain the first non-trivial concentration inequalities in this setting, and our inequalities are nearly tight in certain regimes of p and the order of the tensors. Previously, tensor concentration inequalities were known only in the special cases of rank-1 tensors or p=2. Our results are obtained via a geometric argument based on estimating the covering numbers for the natural stochastic processes corresponding to tensor injective norms. Our approach is quite general and might be applicable to other settings of matrix and tensor concentration.
We discuss applications and connections of our inequalities to various other problems, including tensor principal component analysis, various models of random tensors and matrices, type-2 constants of certain Banach spaces, and locally decodable codes.
We construct a new family of explicit codes that are list decodable to capacity and achieve an optimal list size of O(1/є). In contrast to existing explicit constructions of codes achieving list decoding capacity, our arguments do not rely on algebraic structure but utilize simple combinatorial properties of expander graphs.
Our construction is based on a celebrated distance amplification procedure due to Alon, Edmonds, and Luby [FOCS’95], which transforms any high-rate code into one with near-optimal rate-distance tradeoff. We generalize it to show that the same procedure can be used to transform any high-rate code into one that achieves list decoding capacity. Our proof can be interpreted as a “local-to-global” phenomenon for (a slight strengthening of) the generalized Singleton bound.
Using this construction, for every R, є ∈ (0,1) and k ∈ ℕ+, we obtain an explicit family of rate R codes C ⊆ Σ^n that achieve the є-relaxed generalized Singleton bound. The alphabet size of these codes is a constant depending only on є and k, and they can be list decoded up to radius (k−1)/k · (1−R−є), in time n^{O_{k,є}(1)} with a list of size k−1.
As a corollary of our result, we also obtain the first explicit construction of LDPC codes achieving list decoding capacity, and in fact arbitrarily close to the generalized Singleton bound.
We present efficient counting and sampling algorithms for random k-SAT when the clause density satisfies
α ≤ 2^k/poly(k). In particular, the exponential term 2^k matches the satisfiability threshold Θ(2^k) for the existence of a solution and the (conjectured) algorithmic threshold 2^k (ln k)/k for efficiently finding a solution.
Previously, the best-known counting and sampling algorithms required far more restricted densities α ≲ 2^{k/3} [He, Wu, Yang, SODA ’23].
Notably, our result goes beyond the lower bound d ≳ 2^{k/2} for worst-case k-SAT with bounded degree d [Bezáková et al, SICOMP ’19],
showing that for counting and sampling, the average-case random k-SAT model is computationally much easier than the worst-case model.
At the heart of our approach is a new refined analysis of the recent novel coupling procedure by [Wang, Yin, FOCS ’24], utilizing the structural properties of random constraint satisfaction problems (CSPs).
Crucially, our analysis avoids reliance on the 2-tree structure used in prior works, which cannot extend beyond the worst-case threshold 2^{k/2}.
Instead, we employ a witness tree similar to that used in the analysis of the Moser-Tardos algorithm [Moser, Tardos, JACM ’10] for the Lovász Local Lemma, which may be of independent interest.
Our new analysis provides a universal framework for efficient counting and sampling for random atomic CSPs, including, for example, random hypergraph colorings. At the same time, it immediately implies as corollaries several structural and probabilistic properties of random CSPs that have been widely studied but rarely justified, including replica symmetry and non-reconstruction.
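The Moser-Tardos algorithm whose witness-tree analysis is adapted above is itself very short: while some clause is violated, resample its variables. A Python sketch for the search version of k-SAT (finding, not counting or sampling, a satisfying assignment):

```python
import random

def moser_tardos_ksat(n_vars, clauses, seed=0, max_steps=10**6):
    """Moser-Tardos resampling for k-SAT.

    clauses: list of clauses, each a list of nonzero ints; literal v means
    variable |v| must be True if v > 0, False if v < 0. While some clause is
    violated, resample its variables uniformly at random. Under Lovasz Local
    Lemma-type conditions this terminates quickly with high probability.
    """
    rng = random.Random(seed)
    assign = [rng.random() < 0.5 for _ in range(n_vars + 1)]  # 1-indexed

    def violated(clause):
        return all(assign[abs(v)] != (v > 0) for v in clause)

    for _ in range(max_steps):
        bad = next((c for c in clauses if violated(c)), None)
        if bad is None:
            return assign[1:]
        for v in bad:
            assign[abs(v)] = rng.random() < 0.5
    return None  # gave up

# (x1 or x2) and (not x1 or x3)
print(moser_tardos_ksat(3, [[1, 2], [-1, 3]]))
```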
Over the past decades, a fascinating computational phase transition has been identified in sampling from Gibbs distributions.
Specifically, for the hardcore model on graphs with n vertices and maximum degree Δ,
the computational complexity of sampling from the Gibbs distribution,
defined over the independent sets of the graph with vertex-weight λ>0,
undergoes a sharp transition at the critical threshold λc(Δ) := (Δ−1)^{Δ−1}/(Δ−2)^Δ, known as the tree-uniqueness threshold:
In the uniqueness regime where λ<λc(Δ), a local Markov chain for sampling from the Gibbs distribution known as Glauber dynamics has an optimal mixing time of O(n log n). In the non-uniqueness regime where λ>λc(Δ), the Glauber dynamics exhibits exponential mixing time; furthermore, the sampling problem becomes intractable unless RP=NP.
The computational complexity at the critical point λ = λc(Δ) remains poorly understood,
as previous algorithmic and hardness results all required a constant slack from this threshold.
In this paper, we resolve this open question at the critical phase transition threshold, thus completing the picture of the computational phase transition.
We show that for the hardcore model on graphs with maximum degree Δ≥ 3 at the uniqueness threshold λ = λc(Δ),
the mixing time of Glauber dynamics is upper bounded by a polynomial in n, but is not nearly linear in the worst case:
specifically, it falls between Õ(n^{(2+4e)+O(1/Δ)}) and Ω(n^{4/3}).
For the Ising model (either antiferromagnetic or ferromagnetic), we establish similar results.
For the Ising model on graphs with maximum degree Δ≥ 3 at the critical temperature β where |β| = βc(Δ),
with the tree-uniqueness threshold βc(Δ) defined by (Δ−1)tanh(βc(Δ)) = 1,
we show that the mixing time of Glauber dynamics is upper bounded by Õ(n^{2 + O(1/Δ)}) and lower bounded by Ω(n^{3/2}) in the worst case.
For the Ising model specified by a critical interaction matrix J with ‖J‖_2 = 1, we obtain an upper bound of Õ(n^{3/2}) for the mixing time, matching the lower bound Ω(n^{3/2}) on the complete graph up to a logarithmic factor.
Our mixing time upper bounds hold regardless of whether the maximum degree Δ is constant.
These bounds are derived from a new interpretation and analysis of the localization scheme method introduced by Chen and Eldan, applied to the field dynamics for the hardcore model and the proximal sampler for the Ising model.
As key steps in both our upper and lower bounds,
we establish sub-linear upper and lower bounds for spectral independence at the critical point for worst-case instances.
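For reference, the Glauber dynamics analyzed above is the following single-site chain for the hardcore model; a Python sketch:

```python
import random

def hardcore_glauber(adj, lam, steps, seed=0):
    """Single-site Glauber dynamics for the hardcore model with fugacity lam.

    adj: adjacency list of the graph; the state is always an independent set.
    Each step picks a uniform vertex v and resamples it from its conditional
    distribution: if some neighbor is occupied, v stays unoccupied; otherwise
    v becomes occupied with probability lam / (1 + lam). The stationary law
    gives an independent set I probability proportional to lam^{|I|}.
    """
    rng = random.Random(seed)
    n = len(adj)
    occupied = [False] * n
    for _ in range(steps):
        v = rng.randrange(n)
        if any(occupied[u] for u in adj[v]):
            occupied[v] = False
        else:
            occupied[v] = rng.random() < lam / (1.0 + lam)
    return [v for v in range(n) if occupied[v]]

# Toy usage: a 4-cycle with lam = 1 (uniform distribution over independent sets).
print(hardcore_glauber([[1, 3], [0, 2], [1, 3], [0, 2]], lam=1.0, steps=1000))
```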
High-dimensional planted problems, such as finding a hidden dense subgraph within a random graph, often exhibit a gap between statistical and computational feasibility. While recovering the hidden structure may be statistically possible, it is conjectured to be computationally intractable in certain parameter regimes. A powerful approach to understanding this hardness involves proving lower bounds on the efficacy of low-degree polynomial algorithms. We introduce new techniques for establishing such lower bounds, leading to novel results across diverse settings: planted submatrix, planted dense subgraph, the spiked Wigner model, and the stochastic block model. Notably, our results address the estimation task — whereas most prior work is limited to hypothesis testing — and capture sharp phase transitions such as the “BBP” transition in the spiked Wigner model (named for Baik, Ben Arous, and Péché) and the Kesten–Stigum threshold in the stochastic block model. Existing work on estimation either falls short of achieving these sharp thresholds or is limited to polynomials of very low (constant or logarithmic) degree. In contrast, our results rule out estimation with polynomials of degree n^δ, where n is the dimension and δ > 0 is a constant, and in some cases we pin down the optimal constant δ. Our work resolves open problems posed by Hopkins & Steurer (2017) and Schramm & Wein (2022), and provides rigorous support within the low-degree framework for conjectures by Abbe & Sandon (2018) and Lelarge & Miolane (2019).
We study succinct non-interactive arguments of proximity (SNAP), which allow a prover to convince a verifier that a statement is true through a short message. Moreover, the verifier reads only a sublinear number of bits of the statement, and soundness is required to hold against polynomial-time adversaries when the statement is ε-far from any true statement. SNAPs can be seen as the natural analog of property testing in the context of succinct non-interactive arguments (SNARGs). We obtain both positive and negative results for SNAPs. For any ε ∈ (0, 1), we construct the first adaptively sound SNAPs for P with ε-proximity based on standard assumptions: LWE or subexponential DDH or DLIN over bilinear maps. Our proof size, verifier’s query complexity, and verification time are n^{1/2+o(1)} · poly(λ), where n is the length of the statement and λ is the security parameter. By additionally assuming sub-exponentially secure indistinguishability obfuscation, we upgrade this result to SNAPs for NP with essentially the same parameters. Previously, only non-adaptively sound SNAPs for P in the designated-verifier setting were known, with O(n^{1−δ}) proof size, query complexity, and verification time for some constant δ > 0. We show that our parameters in the adaptive soundness setting are nearly optimal, up to an n^{o(1)} · poly(λ) factor: in any adaptive SNAP for P, the product of proof size and verifier query complexity must be Ω(n). Our lower bound is unconditional. For any constant ε ∈ (0, 1), we construct the first non-adaptively sound SNAPs for NP with ε-proximity, based on learning with errors and indistinguishability obfuscation. The proof size, verifier’s query complexity, and verification time in our constructions are fixed polynomials in the security parameter. We also show that restricting such SNAPs to just P would already imply non-adaptively sound SNARGs for NP. Central to our SNAP constructions is a new notion of commitment of proximity, which enables sublinear-time verification of the commitment. To derive our unconditional lower bound, we adopt and generalize theorems from oracle-presampling techniques in the random oracle literature. Both techniques may be of independent interest.
A secret-sharing scheme allows the distribution of a secret s among n parties, such that only certain predefined “authorized” sets of parties can reconstruct the secret, while all other “unauthorized” sets learn nothing about s. The collection of authorized/unauthorized sets is defined by a monotone function f: {0,1}^n → {0,1}. It is known that any monotone function can be realized by a secret-sharing scheme; thus, the smallest achievable total share size, S(f), serves as a natural complexity measure.
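As an illustration of the definition only (this is not one of the paper's constructions), here is a minimal sketch of the folklore DNF-based scheme: the secret is additively re-shared once per minimal authorized set, so the total share size is the sum of the sizes of those sets.

```python
import secrets

P = 2**61 - 1  # illustrative prime modulus

def share(secret: int, minterms, parties):
    """Additively share `secret` once for each minimal authorized set (minterm of the monotone f)."""
    shares = {p: [] for p in parties}
    for idx, T in enumerate(minterms):
        T = sorted(T)
        pieces = [secrets.randbelow(P) for _ in range(len(T) - 1)]
        pieces.append((secret - sum(pieces)) % P)  # the pieces sum to the secret mod P
        for party, piece in zip(T, pieces):
            shares[party].append((idx, piece))
    return shares

def reconstruct(subset_shares, minterms):
    """Any set of parties covering some minterm sums up that minterm's pieces."""
    for idx, T in enumerate(minterms):
        if all(p in subset_shares for p in T):
            return sum(piece for p in T for (i, piece) in subset_shares[p] if i == idx) % P
    raise ValueError("unauthorized set: its view is uniform, independent of the secret")

# f(x1, x2, x3) = x1x2 OR x2x3, i.e. minimal authorized sets {1,2} and {2,3}
sh = share(42, [{1, 2}, {2, 3}], parties=[1, 2, 3])
print(reconstruct({1: sh[1], 2: sh[2]}, [{1, 2}, {2, 3}]))  # 42
```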
In this paper, we initiate the study of the following meta-complexity question: Given a monotone function f, is it possible to efficiently distinguish between cases where the secret-sharing complexity of f is small versus large? We examine this question across several computational models, yielding the following main results.
(Hardness for formulas and circuits): Given a monotone formula f of size L, it is coNP-hard to distinguish between “cheap” functions, where the maximum share size is 1 bit and the total share size is O(L^{0.01}), and “expensive” functions, where the maximum share size is Ω(√L) and the total share size is Ω(L/log L).
This latter bound nearly matches known secret-sharing constructions yielding a total share size of L bits. For monotone circuits, we strengthen the bound on the expensive case to a maximum share size of Ω(L/log L) and a total share size of Ω(L^2/log L). These results rule out the existence of instance-optimal compilers that map a formula f to a secret-sharing scheme with complexity polynomially related to S(f).
(Hardness for truth tables): Under cryptographic assumptions, either (1) every n-bit slice function can be realized by a poly(n)-size secret-sharing scheme, or (2) given a truth-table representation of f of size N = 2^n, it is computationally infeasible to distinguish in time poly(N) between cases where S(f) = poly(n) and S(f) = n^{ω(1)}. Option (1) would be considered a breakthrough result, as the best-known construction for slices has a sub-exponential complexity of 2^{Õ(√n)} (Liu, Vaikuntanathan, and Wee; Eurocrypt 2018). Our proof introduces a new worst-case-to-average-case reduction for slices, which may be of independent interest.
(Hardness for graphs): We examine the simple case where f is given as a 2-DNF, represented by a graph G whose edges correspond to 2-terms, and ask whether it is possible to distinguish between cases where the share size is constant and those where the share size is large, say Ω(logn). We establish several connections between this question and questions in communication complexity. For instance, we show that graphs admitting constant-cost secret sharing form a subclass of graphs with constant randomized communication complexity and constant-size adjacency sketches (Harms, Wild, and Zamaraev; STOC 2022). We leverage these connections to establish new lower bounds for specific graph families, derive a combinatorial characterization of graphs with constant-size linear secret-sharing schemes, and show that a natural class of myopic algorithms fails to distinguish cheap graphs from expensive ones.
A classical challenge in complexity theory and cryptography is to simulate interactive proof systems by non-interactive proof systems. In this work we leverage approaches from recent works in derandomization to address this challenge, focusing on non-interactive simulations that are sound against uniform adversarial algorithms.
Our results concern fundamental questions in complexity theory, such as the NP vs PSPACE question, and also in cryptography, such as the question of constructing non-interactive zero-knowledge arguments for NP from unstructured assumptions. Relying on strong complexity-theoretic hardness assumptions (that will be described below):
1. *Complexity theory.* We prove that PSPACE is contained in the “computationally sound” version of NP. Specifically, for every L ∈ PSPACE, membership in L can be verified by an NP-type (deterministic, polynomial-time) verifier V with the following guarantee: The verifier accepts every x ∈ L when given a proof π from an honest prover that runs in fixed exponential time T_P; and every uniform adversary running in probabilistic time poly(T_P) cannot find x ∉ L and π such that V(x,π)=1, except with negligible probability in T_P. As a corollary in the area of bounded arithmetic, under the same assumptions, we deduce that NP ≠ PSPACE is not provable in the theory APC_1. This is a strong theory, which captures many of the major results in complexity.
2. *Cryptography.* We construct new cryptographic protocols, including succinct non-interactive arguments (SNARGs) for NC in the plain model, as well as non-interactive zero-knowledge and witness indistinguishable (NIZK and NIWI) proof systems for NP, all with computational soundness against uniform adversaries. The SNARG relies solely on the aforementioned complexity-theoretic assumption, whereas the NIZK and NIWI require also a sub-exponentially secure one-way function (which should be injective in the case of the NIWI). These are the first constructions of the above protocols that do not rely on highly structured cryptographic primitives.
Roughly speaking, following Chen and Tell (FOCS 2021, STOC 2023), the complexity-theoretic hardness assumptions throughout our paper assert the existence of functions f: {0,1}^n → {0,1}^k that are computable in polynomial time and hard for bounded-space machines (say, linear space) in a strong average-case sense: No efficient algorithm can find an input x on which the bounded-space machine computes f, except with negligible probability.
The Huge Object model of property testing [Goldreich and Ron, TheoretiCS 23] concerns properties of distributions supported on {0,1}^n, where n is so large that even reading a single sampled string is unrealistic. Instead, query access is provided to the samples, and the efficiency of the algorithm is measured by the total number of queries that were made to them.
Index-invariant properties under this model were defined in [Chakraborty et al., COLT 23], as a compromise between enduring the full intricacies of string testing when considering unconstrained properties, and giving up completely on the string structure when considering label-invariant properties. Index-invariant properties are those that are invariant through a consistent reordering of the bits of the involved strings.
Here we provide an adaptation of Szemerédi’s regularity method for this setting, and in particular show that if an index-invariant property admits an ε-test with a number of queries depending only on the proximity parameter ε, then it also admits a distance estimation algorithm whose number of queries depends only on the approximation parameter.
We study monotonicity testing of high-dimensional distributions on {−1,1}^n in the model of subcube conditioning, suggested and studied by Canonne, Ron, and Servedio and by Bhattacharyya and Chakraborty. Previous work shows that the sample complexity of monotonicity testing must be exponential in n (Rubinfeld and Vasilian; Aliakbarpour, Gouleakis, Peebles, Rubinfeld, and Yodpinyanee). We show that the subcube query complexity is Θ(n/ε^2), by proving nearly matching upper and lower bounds. Our work is the first to use directed isoperimetric inequalities (developed for function monotonicity testing) for analyzing a distribution testing algorithm. Along the way, we generalize an inequality of Khot, Minzer, and Safra to real-valued functions on {−1,1}^n.
We also study uniformity testing of distributions that are promised to be monotone, a problem introduced by Rubinfeld and Servedio, using subcube conditioning. We show that the query complexity is Θ(√n/ε^2). Our work proves the lower bound, which matches (up to poly-logarithmic factors) the uniformity testing upper bound for general distributions (Canonne, Chen, Kamath, Levi, Waingarten). Hence, we show that monotonicity does not help, beyond logarithmic factors, in testing uniformity of distributions with subcube conditional queries.
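To make the query model concrete: a subcube conditional query fixes some coordinates of {−1,1}^n and asks for a sample from the distribution conditioned on agreeing with that restriction. The sketch below only illustrates the model by simulating one such query via rejection sampling from ordinary samples (which can be exponentially wasteful, which is exactly why the model grants these queries directly); it is not the testing algorithm of the paper.

```python
import random

def subcube_conditional_sample(sampler, restriction, max_tries=10**6):
    """One subcube-conditioning query: `sampler()` draws a point of {-1,+1}^n from the
    unknown distribution; `restriction` maps a subset of coordinates to fixed values."""
    for _ in range(max_tries):
        x = sampler()
        if all(x[i] == v for i, v in restriction.items()):
            return x
    raise RuntimeError("conditioning event too rare to hit by rejection sampling")

# Example: a uniform product distribution on {-1,+1}^4, conditioned on x_0 = +1 and x_3 = -1.
def product_sampler():
    return [random.choice([-1, 1]) for _ in range(4)]

print(subcube_conditional_sample(product_sampler, {0: 1, 3: -1}))
```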
We give nearly optimal bounds on the sample complexity of (Ω(ε), ε)-tolerant testing of the ρ-independent set property in the dense graph setting. In particular, we give an algorithm that inspects a random subgraph on O(ρ^3/ε^2) vertices and, for some constant c, distinguishes graphs that have an induced subgraph of size ρn with fewer than ε n^2 / (c log^4(1/ε)) edges from graphs for which every induced subgraph of size ρn has at least ε n^2 edges. Our sample complexity bound matches, up to logarithmic factors, the recent upper bound by Blais and Seth (2023) for the non-tolerant testing problem, which is known to be optimal based on a lower bound by Feige, Langberg and Schechtman (2004). Our main technique is a new graph container lemma for sparse subgraphs instead of independent sets. We also show that our new lemma can be used to generalize one of the classic applications of the container method, that of counting independent sets in regular graphs, to counting sparse subgraphs in regular graphs.
Counting small subgraphs, referred to as motifs, in large graphs is a fundamental task in graph analysis, extensively studied across various contexts and computational models.
In the sublinear-time regime, the relaxed problem of approximate counting has been explored within two prominent query frameworks:
the standard model, which permits degree, neighbor, and pair queries, and the strictly more
powerful augmented model, which additionally allows for uniform edge sampling.
Currently, in the standard model, (optimal) results have been established only
for approximately counting edges, stars, and cliques, all of which have a radius of one.
This contrasts sharply with the state of affairs in the augmented model, where
algorithmic results (some of which are optimal) are known for any input motif, leading to a
disparity which we term the “scope gap” between the two models.
In this work, we make significant progress in bridging this gap.
Our approach draws inspiration from recent advancements in the augmented model
and utilizes a framework centered on counting by uniform sampling, thus
allowing us to establish new results in the standard model and simplify previous results.
In particular, our first, and main, contribution is a new algorithm in the standard model for approximately counting any Hamiltonian motif in sublinear time,
where the complexity of the algorithm is the sum of two terms. One term equals the complexity of the known algorithms by Assadi, Kapralov, and Khanna (ITCS 2019) and Fichtenberger and Peng (ICALP 2020) in the (strictly stronger) augmented model and the other is an additional, necessary, additive overhead.
Our second contribution is a variant of our algorithm that enables nearly uniform sampling of these motifs, a capability previously limited in the standard model to edges and cliques.
Our third contribution is to introduce even simpler algorithms for stars and cliques by exploiting their radius-one property. As a result, we simplify all previously known algorithms in the standard
model for stars (Gonen, Ron, Shavitt (SODA 2010)), triangles (Eden, Levi, Ron, Seshadhri (FOCS 2015)) and cliques (Eden, Ron, Seshadhri (STOC 2018)).
Two matrices are said to be principal minor equivalent if they have equal corresponding principal minors of all orders. We give a characterization of principal minor equivalence and a deterministic polynomial time algorithm to check if two given matrices are principal minor equivalent. Earlier such results were known for certain special cases like symmetric matrices, skew-symmetric matrices with 0, 1, -1-entries, and matrices with no cuts (i.e., for any non-trivial partition of the indices, the top right block or the bottom left block must have rank more than 1).
As an immediate application, we get an algorithm to check if the determinantal point processes corresponding to two given kernel matrices (not necessarily symmetric) are the same. As another application, we give a deterministic polynomial-time test to check the equality of two multivariate polynomials, each computed by a symbolic determinant with a rank 1 constraint on coefficient matrices.
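For illustration only, the definition can be checked by brute force on small matrices (exponential in the dimension, in contrast with the paper's deterministic polynomial-time test); diagonal conjugation and transposition are two standard operations that preserve all principal minors.

```python
from itertools import combinations
import numpy as np

def principal_minor_equivalent(A, B, tol=1e-8):
    """Brute-force check that A and B have equal corresponding principal minors of all orders."""
    n = A.shape[0]
    for k in range(1, n + 1):
        for S in combinations(range(n), k):
            idx = np.ix_(S, S)
            if abs(np.linalg.det(A[idx]) - np.linalg.det(B[idx])) > tol:
                return False
    return True

A = np.array([[1.0, 2.0, 0.5],
              [3.0, 1.0, 1.0],
              [0.0, 2.0, 4.0]])
D = np.diag([1.0, 2.0, 5.0])
print(principal_minor_equivalent(A, D @ A @ np.linalg.inv(D)))  # True: D A D^{-1} preserves principal minors
print(principal_minor_equivalent(A, A + np.eye(3)))             # False: the 1x1 minors already differ
```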
A central question in mathematics and computer science is that of determining whether a given ideal I, presented by a list of m generating polynomials, is prime, which geometrically corresponds to the zero set of I, denoted Z(I), being irreducible. The case of principal ideals (i.e., m = 1) corresponds to the more familiar absolute irreducibility testing of polynomials, where the seminal work of (Kaltofen 1995) yields a randomized, polynomial-time algorithm for this problem. However, when m > 1, the primality testing problem seems much harder. The current best algorithms for this problem are only known to be in EXPSPACE.
This drastic state of affairs has prompted research on the primality testing problem (and its more general variants, the primary decomposition problem and the problem of counting the number of irreducible components) for natural classes of ideals. Notable classes of ideals include radical ideals and complete intersections (and, more generally, Cohen-Macaulay ideals). For radical ideals, the current best upper bounds are given by (Bürgisser and Scheiblechner, 2007), putting the problem in PSPACE. For complete intersections, the primary decomposition algorithm of (Eisenbud, Huneke, Vasconcelos 1992), coupled with the degree bounds of (DFGS 1991), puts the ideal primality testing problem in EXP. In these situations, the only known complexity-theoretic lower bound for the ideal primality testing problem is that it is coNP-hard for the classes of radical ideals and equidimensional Cohen-Macaulay ideals.
In this work, we significantly reduce the complexity-theoretic gap for the ideal primality testing problem for the important families of ideals I (namely, radical ideals and equidimensional Cohen-Macaulay ideals). For these classes of ideals, assuming the Generalized Riemann Hypothesis, we show that primality testing lies in Σ_3^p ∩ Π_3^p. This significantly improves the upper bound for these classes, approaching their lower bound, as the primality testing problem is coNP-hard for these classes of ideals.
Another consequence of our results is that for equidimensional Cohen-Macaulay ideals, we get the first PSPACE algorithm for primality testing, exponentially improving the space and time complexity of prior known algorithms.
The celebrated result of Kabanets and Impagliazzo (Computational Complexity, 2004) showed that PIT algorithms imply circuit lower bounds, and vice versa. Since then it has been a major challenge to understand the precise connections between PIT and lower bounds. In particular, a main goal has been to understand which lower bounds suffice to obtain efficient PIT algorithms, and how close are they to lower bounds that are necessary for the conclusion.
We construct polynomial-time PIT algorithms from lower bounds that are, up to relatively minor remaining gaps, necessary for the existence of such algorithms. That is, we prove that these lower bounds are, up to the mentioned minor gaps, both sufficient and necessary for polynomial-time PIT, over fields of characteristic zero. Over sufficiently large finite fields, we show a similar result wherein the PIT algorithm runs in time n^{log^{(c)}(n)}, i.e., a power of a c-iterated logarithm, for an arbitrarily large constant c > 1.
The key to these improvements is studying PIT versus lower bounds in the uniform setting, in which we focus on proving lower bounds for uniform arithmetic circuits and their variants (and on deducing algorithms from such lower bounds). Indeed, by working in this setting we obtain results that are significantly tighter than previously known results concerning polynomial-time PIT vs lower bounds, and are in fact also tighter than known hardness-vs-randomness connections in the Boolean setting.
Our results are obtained by combining recent techniques from Boolean hardness vs randomness, and in particular the generator of Chen and Tell (FOCS 2021), with the algebraic hitting-set generator of Guo, Kumar, Saptharishi, and Solomon (SIAM J. Computing 2022) along with the bootstrapping ideas of Agrawal, Ghosh, and Saxena (STOC 2018) and of Kumar, Saptharishi, and Tengse (SODA 2019).
Two fundamental problems on directed graphs are to decide s-t connectivity, and to estimate the behavior of random walks. Currently, there is no known algorithm for s-t connectivity running in polynomial time and n^{o(1)} space, and no known algorithm for estimating the n-step random walk matrix running in non-deterministic logspace.
We show that for every directed graph, at least one of these problems is solvable in time and space that significantly improve on the respective state-of-the-art. In particular, there is a pair of algorithms A1 and A2 such that for every graph G, either:
A1(G) outputs the transitive closure of G in polynomial time and polylogarithmic space, or A2(G) outputs an approximation of the n-step random walk matrix of G in non-deterministic logspace.
As one application, we show surprisingly tight win-win results for space-bounded complexity. For example, for certain parameter regimes, either Savitch’s theorem can be non-trivially sped up, or randomized space can be almost completely derandomized.
We also apply our techniques to significantly weaken the assumptions required to derandomize space-bounded computation, and to make non-deterministic space-bounded computation unambiguous. Specifically, we deduce such conclusions from lower bounds against uniform circuits of polynomial size, which is an exponential improvement on the required hardness in previous works (Doron–Pyne–Tell STOC 2024, Li–Pyne–Tell FOCS 2024). We further show similar results for minimal-memory derandomization (Doron–Tell CCC 2024).
To prove these results, we substantially improve the array of technical tools introduced in recent years for studying hardness-vs.-randomness for bounded-space computation. In particular, we develop derandomized distinguish-to-predict transformations for new types of distinguishers (corresponding to compositions of PRGs with weak distinguishers), we construct a derandomized logspace reconstruction procedure for the Shaltiel–Umans generator (JACM 2005) that can compress hard truth-tables to polylogarithmic size, and we design a version of the Chen–Tell generator (FOCS 2021) that is particularly suitable for the space-bounded setting.
Schubert coefficients are nonnegative integers that arise in Algebraic Geometry and play a central role in Algebraic Combinatorics. Computing them is both difficult and mysterious. It is known that they are in GapP, but little else is known except in special cases. Notably, it is a major open problem to show that they are in #P in full generality.
We study the hardness of vanishing of Schubert coefficients, i.e. whether they are equal to zero. Until this work it was open whether the vanishing is in PH. In fact, it was believed to be not in PH. We prove that the vanishing problem is in coAM assuming the GRH (the Generalized Riemann Hypothesis). Our approach is based on a reduction to HNP (Parametric Hilbert’s Nullstellensatz) recently introduced by Ait El Manssour et al.
We then use a completely different approach to show that the non-vanishing of Schubert coefficients is in NP_ℂ ∩ P_ℝ in the Blum–Shub–Smale (BSS) model of computation. This result is incomparable to the inclusion in AM and underscores the algebraic nature of Schubert coefficients. We apply our approach to show that computing Schubert coefficients is in #P_ℂ. This is the first nontrivial upper bound for the problem.
We present our results in the generality of all series of classical reductive groups: general linear, special orthogonal, and symplectic groups of complex matrices, corresponding to root systems A,B,C, and D, respectively. With one notable exception, the above results extend to all series.
Correlation Clustering is a fundamental and widely-studied problem in unsupervised learning and data mining. The input is a graph and the goal is to construct a clustering minimizing the number of inter-cluster edges plus the number of missing intra-cluster edges.
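For concreteness, a minimal sketch of the objective just described (the number of "disagreements"); this is only the cost function, not the cluster LP or the rounding discussed below.

```python
from itertools import combinations

def correlation_clustering_cost(n, edges, cluster_of):
    """Count inter-cluster edges plus missing intra-cluster edges (non-edges inside a cluster)."""
    edges = {frozenset(e) for e in edges}
    cost = 0
    for u, v in combinations(range(n), 2):
        same_cluster = cluster_of[u] == cluster_of[v]
        is_edge = frozenset((u, v)) in edges
        if is_edge != same_cluster:   # a disagreement either way
            cost += 1
    return cost

# Triangle 0-1-2 plus pendant vertex 3 attached to 2; clustering {0,1,2}, {3} costs exactly 1.
print(correlation_clustering_cost(4, [(0, 1), (1, 2), (0, 2), (2, 3)],
                                  {0: 0, 1: 0, 2: 0, 3: 1}))  # 1
```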
Cao, Cohen-Addad, Lee, Li, Newman, and Vogl [STOC 2024] introduced the cluster LP for Correlation Clustering, which they argued captures the problem much more succinctly than previous linear programming formulations. However, the cluster LP has exponential size, with a variable for every possible set of vertices in the input graph. Nevertheless, they showed how to find a feasible solution for the cluster LP in time O(n^{poly(1/ε)}) with objective value at most (1+ε) times the value of an optimal solution for the respective Correlation Clustering instance. Furthermore, they showed how to round a solution to the cluster LP, yielding a (1.437+ε)-approximation algorithm for the Correlation Clustering problem.
The main technical result of this paper is a new approach to find a feasible solution for the cluster LP with objective value at most (1+ε) times the optimum in time O(2^{poly(1/ε)} n), where n is the number of vertices in the graph. We also show how to implement the rounding within the same time bounds, thus achieving a fast (1.437+ε)-approximation algorithm for the Correlation Clustering problem. This bridges the gap between state-of-the-art methods for approximating Correlation Clustering and the recent focus on fast algorithms.
In metric k-clustering, we are given as input a set of n points in a general metric space, and we have to pick k centers and cluster the input points around these chosen centers, so as to minimize an appropriate objective function. In recent years, significant effort has been devoted to the study of metric k-clustering problems in a dynamic setting, where the input keeps changing via updates (point insertions/deletions), and we have to maintain a good clustering throughout these updates [Fichtenberger, Lattanzi, Norouzi-Fard and Svensson, SODA’21; Bateni, Esfandiari, Fichtenberger, Henzinger, Jayaram, Mirrokni and Weise, SODA’23; Lacki, Haeupler, Grunau, Rozhon and Jayaram, SODA’24; Bhattacharya, Costa, Garg, Lattanzi and Parotsidis, FOCS’24; Forster and Skarlatos, SODA’25]. The performance of such a dynamic algorithm is measured in terms of three parameters: (i) Approximation ratio, which signifies the quality of the maintained solution, (ii) Recourse, which signifies how stable the maintained solution is, and (iii) Update time, which signifies the efficiency of the algorithm.
We consider a textbook metric k-clustering problem, metric k-median, where the objective is the sum of the distances of the points to their nearest centers. We design the first dynamic algorithm for this problem with near-optimal guarantees across all three performance measures (up to a constant factor in approximation ratio, and polylogarithmic factors in recourse and update time). Specifically, we obtain an O(1)-approximation algorithm for dynamic metric k-median with Õ(1) recourse and Õ(k) update time. Prior to our work, the state-of-the-art here was the recent result of [Bhattacharya, Costa, Garg, Lattanzi and Parotsidis, FOCS’24], who obtained an O(ε^{−1})-approximation ratio with Õ(k^ε) recourse and Õ(k^{1+ε}) update time.
We achieve our results by carefully synthesizing the concept of robust centers introduced in [Fichtenberger, Lattanzi, Norouzi-Fard and Svensson, SODA’21] along with the randomized local search subroutine from [Bhattacharya, Costa, Garg, Lattanzi and Parotsidis, FOCS’24], in addition to several key technical insights of our own.
We consider the problems of testing and learning an unknown n-qubit quantum Hamiltonian H = ∑_x λ_x σ_x expressed in its Pauli basis, from queries to its evolution operator e^{−iHt}, under the normalized Frobenius norm. To this end, we prove the following results (with and without quantum memory) for Hamiltonians whose Pauli spectrum involves only k-local terms or has sparsity at most s:
(1) Local Hamiltonians: We give a tolerant testing protocol to decide if a Hamiltonian is ε_1-close to k-local or ε_2-far from k-local, with O(1/(ε_2−ε_1)^4) queries, thereby solving two open questions posed in a recent work by Bluhm, Caro and Oufkir [BCO’24]. For learning a k-local Hamiltonian up to error ε, we give a protocol with query complexity and total time evolution exp(O(k^2 + k log(1/ε))). Our algorithm leverages the non-commutative Bohnenblust-Hille inequality in order to get a complexity independent of n.
(2) Sparse Hamiltonians: We give a protocol for testing whether a Hamiltonian is ε_1-close to being s-sparse or ε_2-far from being s-sparse, with O(s^6/(ε_2^2−ε_1^2)^6) queries. For learning up to error ε, we show that O(s^4/ε^8) queries suffice.
(3) Learning without quantum memory: The learning results stated above have no dependence on the system size n, but require n-qubit quantum memory.
We give subroutines that allow us to reproduce all the above learning results without quantum memory; increasing the query complexity by a (logn)-factor in the local case and an n-factor in the sparse case.
(4) Testing without quantum memory: We give a new subroutine called Pauli hashing, which allows one to tolerantly test s-sparse Hamiltonians using Õ(s^{14}/(ε_2^2−ε_1^2)^{18}) queries. A key ingredient is showing that s-sparse Pauli channels can be tested, in a tolerant fashion, as being ε_1-close to or ε_2-far from being s-sparse under the diamond norm, using Õ(s^2/(ε_2−ε_1)^6) queries via Pauli hashing.
In order to prove these results, we prove new structural theorems for local Hamiltonians, sparse Pauli channels and sparse Hamiltonians. We complement our learning algorithms with lower bounds that are polynomially weaker. Furthermore, our algorithms use short time evolutions and do not assume prior knowledge of the terms on which the Pauli spectrum is supported, i.e., we do not require prior knowledge about the support of the Hamiltonian terms.
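One standard way to make the setup explicit (the normalization convention below is our assumption, since the abstract does not fix it): writing σ_x for the n-qubit Pauli string indexed by x ∈ {I, X, Y, Z}^n,

H = ∑_x λ_x σ_x,   ‖H‖^2 = (1/2^n) · Tr(H^2) = ∑_x λ_x^2,

so the normalized Frobenius norm is simply the ℓ_2 norm of the Pauli coefficients; H is k-local if λ_x = 0 whenever σ_x acts non-trivially on more than k qubits, and s-sparse if at most s of the λ_x are nonzero.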
A history-independent data structure does not reveal the history of operations applied to it, only its current logical state, even if its internal state is examined. This paper studies history-independent concurrent dictionaries, in particular, hash tables, and establishes inherent bounds on their space requirements.
This paper shows that there is a lock-free history-independent concurrent hash table, in which each memory cell stores two elements and two bits, based on Robin Hood hashing. Our implementation is linearizable, and uses the shared memory primitive LL/SC. The expected amortized step complexity of the hash table is O(c), where c is an upper bound on the number of concurrent operations that access the same element, assuming the hash table is not overpopulated. We complement this positive result by showing that even if we have only two concurrent processes, no history-independent concurrent dictionary that supports sets of any size, with wait-free membership queries and obstruction-free insertions and deletions, can store only two elements of the set and a constant number of bits in each memory cell. This holds even if the step complexity of operations on the dictionary is unbounded.
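For background, a minimal sequential (non-concurrent) sketch of Robin Hood hashing with linear probing: on a collision, the element that currently sits closer to its home slot yields its position to the one that has probed farther, which (with a fixed tie-breaking rule) makes the final layout a canonical function of the stored set. The paper's lock-free, linearizable LL/SC-based table is substantially more involved than this sketch.

```python
def robin_hood_insert(table, key, hashfn):
    """Insert `key` into an open-addressing table whose slots hold (key, probe_distance) or None."""
    n = len(table)
    pos, dist = hashfn(key) % n, 0
    for _ in range(n):
        if table[pos] is None:
            table[pos] = (key, dist)
            return
        resident_key, resident_dist = table[pos]
        # "Rob the rich": evict the resident if it is closer to its home slot (ties broken by key).
        if (resident_dist, resident_key) < (dist, key):
            table[pos] = (key, dist)
            key, dist = resident_key, resident_dist
        pos, dist = (pos + 1) % n, dist + 1
    raise RuntimeError("table is full")

table = [None] * 8
for k in (5, 13, 21, 3):          # 5, 13 and 21 all hash to slot 5 mod 8
    robin_hood_insert(table, k, hashfn=lambda x: x)
print(table)
```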
In the almost-everywhere reliable message transmission problem, introduced by [Dwork, Peleg, Pippenger, Upfal’86], the goal is to design a sparse communication network G that supports efficient, fault-tolerant protocols for interactions between all node pairs. By fault-tolerant, we mean that even if an adversary corrupts a small fraction of vertices in G, then all but a small fraction of vertices can still communicate perfectly via the constructed protocols. Succeeding in this allows one to simulate, on a sparse graph, any fault-tolerant distributed computing task and secure multi-party computation protocols built for a complete network, with only minimal overhead in efficiency. Previous works on this problem achieved either constant-degree networks tolerating o(1) faults, constant-degree networks tolerating a constant fraction of faults via inefficient protocols (exponential work complexity), or poly-logarithmic degree networks tolerating a constant fraction of faults. We show a construction of constant-degree networks with efficient protocols (i.e., with polylogarithmic work complexity) that can tolerate a constant fraction of adversarial faults, thus solving the main open problem of Dwork et al. Our main contribution is a composition technique for communication networks, based on graph products. Our technique combines two networks tolerant to adversarial edge-faults to construct a network with a smaller degree while maintaining efficiency and fault-tolerance. We apply this composition result multiple times, first composing the polylogarithmic-degree edge-fault-tolerant networks constructed in a recent work of [Bafna, Minzer, Vyas’24] (which are based on high-dimensional expanders) with themselves, and then with the constant-degree networks (albeit with inefficient protocols) of [Upfal’92].
We provide the first evidence for the inherent difficulty of finding complex sets with optimal proof systems. For this, we construct oracles O1 and O2 with the following properties, where RE denotes the class of recursively enumerable sets and NQP the class of sets accepted in non-deterministic quasi-polynomial time.
O1: no set in PSPACE \ NP has optimal proof systems and PH is infinite
O2: no set in RE \ NQP has optimal proof systems and NP ≠ coNP
Oracle O2 is the first relative to which complex sets with optimal proof systems do not exist. By oracle O1, no relativizable proof can show that there exist sets in PSPACE \ NP with optimal proof systems, even when assuming an infinite PH. By oracle O2, no relativizable proof can show that there exist sets outside NQP with optimal proof systems, even when assuming NP ≠ coNP. This explains the difficulty of the following longstanding open questions raised by Krajíček and Pudlák in 1989, Sadowski in 1997, Köbler and Messner in 1998, and Messner in 2000.
Q1: Are there sets outside NP with optimal proof systems?
Q2: Are there arbitrarily complex sets outside NP with optimal proof systems?
Moreover, relative to O2, there exist arbitrarily complex sets L ∉ NQP with almost optimal algorithms, but none of them has optimal proof systems. This explains the difficulty of Messner’s approach to translate almost optimal algorithms into optimal proof systems.
For an N × N matrix A, its rank-r rigidity, denoted R_A(r), is the minimum number of entries of A that one must change to make its rank become at most r. Determining the rigidity of interesting explicit families of matrices remains a major open problem, and is central to understanding the complexities of these matrices in many different models of computation and communication. We focus in this paper on the Walsh-Hadamard transform and on the ‘distance matrix’, whose rows and columns correspond to binary vectors, and whose entries indicate whether the row and column are close in Hamming distance. Our results also generalize to other Kronecker powers and ‘Majority powers’ of fixed matrices. We prove two new results about such matrices.
First, we prove new rigidity lower bounds in the low-rank regime where r < log N. For instance, we prove that over any finite field, there are constants c_1, c_2 > 0 such that the N × N Walsh-Hadamard matrix H_n satisfies R_{H_n}(c_1 log N) ≥ N^2 (1/2 − N^{−c_2}), and a similar lower bound holds for the other aforementioned matrices. This is tight, and is the new best rigidity lower bound for an explicit matrix family at this rank; the previous best was R(c_1 log N) ≥ c_3 N^2 for a small constant c_3 > 0.
Second, we give new hardness amplification results, showing that rigidity lower bounds for these matrices at slightly higher rank would imply breakthrough rigidity lower bounds for much higher rank. For instance, if one could prove R_{H_n}(log^{1+ε} N) ≥ N^2 (1/2 − N^{−1/2^{(log log N)^{o(1)}}}) over any finite field for some ε > 0, this would imply that H_n is Razborov rigid, giving a breakthrough lower bound in communication complexity.
We study the online stochastic matching problem. Against the offline benchmark, Feldman, Gravin, and Lucier (SODA 2015) designed an optimal 0.5-competitive algorithm. A recent line of work, initiated by Papadimitriou, Pollner, Saberi, and Wajc (MOR 2024), focuses on designing approximation algorithms against the online optimum. The online benchmark allows positive results surpassing the 0.5 ratio.
In this work, adapting the order-competitive analysis by Ezra, Feldman, Gravin, and Tang (SODA 2023), we design a 0.5+Ω(1) order-competitive algorithm against the online benchmark with unknown arrival order. Our algorithm is significantly different from existing ones, as the known arrival order is crucial to the previous approximation algorithms.
We study a continuous-time, infinite-horizon dynamic bipartite matching problem. Suppliers arrive according to a Poisson process; while waiting, they may abandon the queue at a uniform rate. Customers on the other hand must be matched upon arrival. The objective is to minimize the expected long-term average cost subject to a throughput constraint on the total match rate.
Previous literature on dynamic matching focuses on “static” policies, where the matching decisions do not depend explicitly on the state of the supplier queues, achieving constant-factor approximations. By contrast, we design “adaptive” policies, which leverage queue-length information, and obtain near-optimal polynomial-time algorithms for several classes of instances.
First, we develop a bi-criteria fully polynomial-time approximation scheme for dynamic matching on networks with a constant number of queues, which computes a (1−ε)-approximation of the optimal policy in time polynomial in both the input size and 1/ε. A key new technique is a hybrid LP relaxation, which combines static and state-dependent LP approximations of the queue dynamics, after a decomposition of the network. Networks with a constant number of queues are motivated by deceased organ donation schemes, where the supply types can be divided according to blood and tissue types.
The above algorithm, combined with a careful cell decomposition, gives a polynomial-time approximation scheme for dynamic matching on Euclidean networks of fixed dimension. The Euclidean case is of interest in ride-hailing and spatial service platforms, where the goal is to fulfill as many trips as possible while minimizing driving distances.
QAC0 is the family of constant-depth polynomial-size quantum circuits consisting of arbitrary single-qubit unitaries and multi-qubit Toffoli gates. It was introduced by Moore as a quantum counterpart of AC0, along with the conjecture that QAC0 circuits cannot compute PARITY. In this work we make progress on this longstanding conjecture: we show that any depth-d QAC0 circuit requires n^{1+3^{−d}} ancillae to compute a function with approximate degree Θ(n), which includes PARITY, MAJORITY and MOD_k. We further establish superlinear lower bounds on quantum state synthesis and quantum channel synthesis. This is the first super-linear lower bound for super-linear-sized QAC0. Regarding PARITY, we show that any further improvement of the ancilla lower bound, to n^{1+exp(−o(d))}, would imply that PARITY ∉ QAC0. These lower bounds are derived by giving low-degree approximations to QAC0 circuits. We show that a depth-d QAC0 circuit with a ancillae, when applied to low-degree operators, has a degree-(n+a)^{1−3^{−d}} polynomial approximation in the spectral norm. This implies that the class QLC0, corresponding to linear-size QAC0 circuits, has approximate degree o(n). This is a quantum generalization of the result that LC0 circuits have approximate degree o(n), by Bun, Robin, and Thaler. Our result also implies that QLC0 ≠ NC1.
The preparation of quantum Gibbs states is a crucial task in quantum computing. In this work, we prove that a recently introduced, efficiently implementable dissipative evolution thermalizes to the Gibbs state in time scaling polynomially or even logarithmically with system size at high enough temperatures for any Hamiltonian that satisfies a Lieb-Robinson bound, such as local Hamiltonians on a lattice. Furthermore, we show the efficient adiabatic preparation of the associated purifications or "thermofield double" states. To the best of our knowledge, these are the first results rigorously establishing the efficient preparation of high-temperature Gibbs states and their purifications.
In the low-temperature regime, we show that implementing this family of dissipative evolutions for inverse temperatures polynomial in the system's size is computationally equivalent to standard quantum computations. On a technical level, for high temperatures, our proof makes use of the mapping of the generator of the evolution into a Hamiltonian, and then connecting its convergence to that of the infinite temperature limit. We further present an alternative proof that is based on showing the exponential decay of the so-called oscillator norm, yielding convergence in logarithmic times. For low temperature, we instead perform a perturbation at zero temperature and resort to circuit-to-Hamiltonian mappings akin to the proof of universality of quantum adiabatic computing.
Taken together, our results show that a family of quasi-local dissipative evolutions efficiently prepares a large class of quantum many-body states of interest, and has the potential to mirror the success of classical Monte Carlo methods for quantum many-body systems.
We propose a generalization of Zhandry’s compressed oracle method to random permutations, where an algorithm can query both the permutation and its inverse. We show how to use the resulting oracle simulation to bound the success probability of an algorithm for any predicate on input-output pairs, a key feature of Zhandry’s technique that had hitherto resisted attempts at generalization to random permutations. One key technical ingredient is to use the strictly monotone factorization of a permutation, which also underlies the well-known Fisher-Yates shuffle, to represent it in the oracle’s database. As an application of our framework, we show that the one-round sponge construction is unconditionally preimage resistant in the random permutation model, for all parameter choices. This proves a conjecture by Unruh.
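To illustrate the factorization alluded to above (our reading of it, not the paper's formal definition): the Fisher-Yates shuffle writes every permutation of {0,…,n−1} as a unique product of transpositions (i, j_i) with j_i ≤ i, applied for i = n−1 down to 1, and sampling each j_i uniformly yields a uniform permutation; a lazily sampled record of these choices is the kind of entry a compressed-oracle-style database can maintain.

```python
import random

def fisher_yates(n, rng=random):
    """Sample a uniform permutation of range(n), returning it together with the
    monotone sequence of swap choices (i, j_i), j_i <= i, that produced it."""
    perm = list(range(n))
    choices = []
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i + 1)            # uniform in {0, ..., i}
        perm[i], perm[j] = perm[j], perm[i]
        choices.append((i, j))
    return perm, choices

perm, choices = fisher_yates(6)
print(perm, choices)   # the choice sequence determines the permutation, and vice versa
```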
We study a longstanding question of Aaronson and Kuperberg on whether there exists a classical oracle separating QMA from QCMA. Settling this question in either direction would yield insight into the power of quantum proofs over classical proofs. We show that such an oracle exists if a certain quantum pseudorandomness conjecture holds. Roughly speaking, the conjecture posits that quantum algorithms cannot, by making few queries, distinguish between the uniform distribution over permutations versus permutations drawn from so-called “dense” distributions.
Our result can be viewed as establishing a “win-win” scenario: either there is a classical oracle separation of QMA from QCMA, or there is quantum advantage in distinguishing pseudorandom distributions on permutations.
We study the problem of fair allocation of chores among agents with additive preferences. In the discrete setting, envy-freeness up to any chore (EFX) has emerged as a compelling fairness criterion. However, establishing its (non-)existence or achieving a meaningful approximation remains a major open question in fair division. The current best guarantee is the existence of O(n^2)-EFX allocations, where n denotes the number of agents, obtained through a sophisticated algorithm. In this paper, we show the existence of 4-EFX allocations, providing the first constant-factor approximation of EFX.
We further investigate the existence of allocations that are both fair and efficient, using Pareto optimality (PO) as our efficiency criterion. For the special case of bivalued instances, we establish the existence of allocations that are both 3-EFX and PO, thereby improving upon the current best factor of O(n)-EFX without any efficiency guarantees. For general additive instances, the existence of allocations that are α-EFk and PO has remained open for any constant values of α and k, where EFk denotes envy-freeness up to k chores. We provide the first positive result in this direction by showing the existence of allocations that are 2-EF2 and PO.
Our results are obtained via a novel economic framework called earning restricted (ER) competitive equilibrium for fractional allocations, which imposes limits on the earnings of agents from each chore. We show the existence of ER equilibria by carefully formulating a linear complementarity problem (LCP) that captures all ER equilibria, and then prove that the classic complementary pivot algorithm applied to this LCP terminates at an ER equilibrium. By carefully setting earning limits and leveraging the properties of ER equilibria, we design algorithms that involve rounding the fractional solutions and then performing swaps and merges of bundles to meet the desired fairness and efficiency criteria. We expect that the concept of ER equilibrium will play a crucial role in deriving further results on related problems.
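For reference, the standard formalization of these notions for chores (stated here as background, since the abstract does not spell it out): with c_i denoting agent i's additive cost function, an allocation X = (X_1,…,X_n) is α-EFX if for all agents i, j and every chore e ∈ X_i,

c_i(X_i ∖ {e}) ≤ α · c_i(X_j),

and it is α-EF_k if for all i, j there is a set S ⊆ X_i of at most k chores with c_i(X_i ∖ S) ≤ α · c_i(X_j); taking α = 1 recovers exact EFX and EF_k.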
Pseudorandom codes are error-correcting codes with the property that no efficient adversary can distinguish encodings from uniformly random strings. They were recently introduced by Christ and Gunn [CRYPTO 2024] for the purpose of watermarking the outputs of randomized algorithms, such as generative AI models. Several constructions of pseudorandom codes have since been proposed, but none of them are robust to error channels that depend on previously seen codewords. This stronger kind of robustness is referred to as adaptive robustness, and it is important for meaningful applications to watermarking.
In this work, we show the following.
Adaptive robustness: We show that the pseudorandom codes of Christ and Gunn are adaptively robust, resolving a conjecture posed by Cohen, Hoover, and Schoenbach [S&P 2025].
Our proof involves several new ingredients, combining ideas from both cryptography and coding theory and taking hints from the analysis of Boolean functions.
Ideal security: We define an ideal pseudorandom code as one which is indistinguishable from the ideal functionality, capturing both the pseudorandomness and robustness properties in one simple definition.
We show that any adaptively robust pseudorandom code for single-bit messages can be bootstrapped to build an ideal pseudorandom code with linear information rate, under no additional assumptions.
CCA security: In the setting where the encoding key is made public, we define a CCA-secure pseudorandom code in analogy with CCA-secure encryption.
We show that any adaptively robust public-key pseudorandom code for single-bit messages can be used to build a CCA-secure pseudorandom code with linear information rate, in the random oracle model.
Together with the result of Christ and Gunn, it follows that there exist ideal pseudorandom codes assuming the 2^{O(√n)}-hardness of LPN. This extends to CCA security in the random oracle model. These results immediately imply stronger robustness guarantees for generative AI watermarking schemes, such as the practical quality-preserving image watermarks of Gunn, Zhao, and Song [ICLR 2025].
In this paper, we construct new t-server Private Information Retrieval (PIR) schemes with communication complexity subpolynomial in the previously best known, for all but finitely many t. Our results are based on combining derivatives (in the spirit of Woodruff-Yekhanin) with the Matching Vector based PIRs of Yekhanin and Efremenko. Previously such a combination was achieved in an ingenious way by Dvir and Gopi, using polynomials and derivatives over certain exotic rings, en route to their fundamental result giving the first 2-server PIR with subpolynomial communication. Our improved PIRs are based on two ingredients: We develop a new and direct approach to combine derivatives with Matching Vector based PIRs. This approach is much simpler than that of Dvir-Gopi: it works over the same field as the original PIRs, and only uses elementary properties of polynomials and derivatives. A key subproblem that arises in the above approach is a higher-order polynomial interpolation problem. We show how “sparse S-decoding polynomials”, a powerful tool from the original constructions of Matching Vector PIRs, can be used to solve this higher-order polynomial interpolation problem using surprisingly few higher-order evaluations. Using the known sparse S-decoding polynomials in combination with our ideas leads to our improved PIRs. Notably, we get a 3-server PIR scheme with communication 2^{Õ((log n)^{1/3})}, improving upon the previously best known communication of 2^{Õ(√(log n))} due to Efremenko.
We introduce the following natural generalization of trace reconstruction, parameterized by a deletion probability δ ∈ (0,1) and length n: There is a length-n string of probabilities, S = p_1,…,p_n, and each trace is obtained by 1) sampling a length-n binary string whose ith coordinate is independently set to 1 with probability p_i and 0 otherwise, and then 2) deleting each of the binary values independently with probability δ, and returning the corresponding binary string of length ≤ n. The goal is to recover an estimate of S from a set of independently drawn traces. In the case that all p_i ∈ {0,1}, this is the standard trace reconstruction problem. We show two complementary results. First, for worst-case strings S and any deletion probability at least of order 1/√n, no algorithm can approximate S to constant ℓ∞ distance or ℓ1 distance o(√n) using fewer than 2^{Ω(√n)} traces. Second, as in the case of standard trace reconstruction, reconstruction is easy for random S: for any sufficiently small constant deletion probability and any ε > 0, drawing each p_i independently from the uniform distribution over [0,1], with high probability S can be recovered to ℓ1 error ε using poly(n, 1/ε) traces and computation time.
We show indistinguishability in our lower bound by regarding a complicated alternating sum (comparing two distributions) as the Fourier transformation of some function evaluated at ± π, and then showing that the Fourier transform decays rapidly away from zero by analyzing its moment generating function.
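The generative process described above is straightforward to simulate; the following sketch only produces traces (it is not a reconstruction algorithm), and setting every p_i to 0 or 1 recovers the standard trace reconstruction model.

```python
import random

def sample_trace(S, delta, rng=random):
    """One trace: realize a Bernoulli(p_i) bit per coordinate, then delete each bit
    independently with probability delta and return the surviving bits in order."""
    bits = [1 if rng.random() < p else 0 for p in S]
    return [b for b in bits if rng.random() >= delta]

S = [0.2, 0.9, 0.5, 1.0, 0.0]
traces = [sample_trace(S, delta=0.1) for _ in range(3)]
print(traces)
```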
We study the task of high-dimensional entangled mean estimation in the subset-of-signals model. Specifically, given N independent random points x_1,…,x_N in ℝ^D and a parameter α ∈ (0, 1) such that each x_i is drawn from a Gaussian with mean µ and unknown covariance, and an unknown α-fraction of the points have identity-bounded covariances, the goal is to estimate the common mean µ. The one-dimensional version of this task has received significant attention in theoretical computer science and statistics over the past decades. Recent work has given near-optimal upper and lower bounds for the one-dimensional setting. On the other hand, our understanding of even the information-theoretic aspects of the multivariate setting has remained limited.
In this work, we design a computationally efficient algorithm achieving an information-theoretically near-optimal error. Specifically, we show that the optimal error (up to polylogarithmic factors) is f(α,N) + √(D/(α N)), where the term f(α,N) is the error of the one-dimensional problem and the second term is the sub-Gaussian error rate. Our algorithmic approach employs an iterative refinement strategy, whereby we progressively learn more accurate approximations µ̂ of µ. This is achieved via a novel rejection sampling procedure that removes points significantly deviating from µ̂, as an attempt to filter out unusually noisy samples. A complication that arises is that rejection sampling introduces bias in the distribution of the remaining points. To address this issue, we perform a careful analysis of the bias, develop an iterative dimension-reduction strategy, and employ a novel subroutine inspired by list-decodable learning that leverages the one-dimensional result.
We prove that there is a universal constant C > 0 so that for every d ∈ ℕ, every centered subgaussian distribution D on ℝ^d, and every even p ∈ ℕ, the d-variate polynomial (Cp)^{p/2} · ‖v‖_2^p − E_{X∼D} ⟨v,X⟩^p is a sum of squares of polynomials. This establishes that every subgaussian distribution is SoS-certifiably subgaussian, a condition that yields efficient learning algorithms for a wide variety of high-dimensional statistical tasks. As a direct corollary, we obtain computationally efficient algorithms with near-optimal guarantees for the following tasks, when given samples from an arbitrary subgaussian distribution: robust mean estimation, list-decodable mean estimation, clustering mean-separated mixture models, robust covariance-aware mean estimation, robust covariance estimation, and robust linear regression. Our proof makes essential use of Talagrand’s generic chaining/majorizing measures theorem.
We study sparse singular value certificates for random rectangular matrices. If M is a d × n matrix with independent Gaussian entries, we give a new family of polynomial-time algorithms which can certify upper bounds on the maximum of ‖M u‖, where u is a unit vector with at most η n nonzero entries, for a given η ∈ (0,1). This basic algorithmic primitive lies at the heart of a wide range of problems across algorithmic statistics and theoretical computer science, including robust mean and covariance estimation, certification of distortion of random subspaces of ℝ^n, certification of the 2→p norm of a random matrix, and sparse principal component analysis.
Our algorithms certify a bound which is asymptotically smaller than the naive one, given by the maximum singular value of M, for nearly the widest-possible range of n,d, and η. Efficiently certifying such a bound for a range of n,d and η which is larger by any polynomial factor than what is achieved by our algorithm would violate lower bounds in the statistical query and low-degree polynomials models. Our certification algorithm makes essential use of the Sum-of-Squares hierarchy. To prove the correctness of our algorithm, we develop a new combinatorial connection between the graph matrix approach to analyze random matrices with dependent entries, and the Efron-Stein decomposition of functions of independent random variables.
As applications of our certification algorithm, we obtain new efficient algorithms for a wide range of well-studied algorithmic tasks. In algorithmic robust statistics, we obtain new algorithms for robust mean and covariance estimation with tradeoffs between breakdown point and sample complexity, which are nearly matched by statistical query and low-degree polynomial lower bounds (that we establish). We also obtain new polynomial-time guarantees for certification of ℓ1/ℓ2 distortion of random subspaces of ℝ^n (also with nearly matching lower bounds), sparse principal component analysis, and certification of the 2→p norm of a random matrix.
Relaxations for the constraint satisfaction problem (CSP) include bounded width (BW), linear program (LP), semidefinite program (SDP), affine integer program (AIP), and their combinations. Tightening relaxations systematically leads to hierarchies and stronger algorithms. For LP+AIP and SDP+AIP hierarchies, various lower bounds were shown by Ciardo and Živný (STOC 2023, STOC 2024) and by Chan, Ng, and Peng (STOC 2024).
This paper continues on this line of work to show lower bounds for related hierarchies. We prove new lower bounds to BW and AIP hierarchies. We also show the first lower bounds to the cohomological consistency hierarchy of Ó Conghaile (MFCS 2022) and the C(BLP+AIP) hierarchy of Ciardo and Živný (SODA 2022). Our lower bounds are for linear level and optimal. They make partial progress towards an open question of Lichter and Pago (arXiv 2407.09097v1) concerning the power of these hierarchies.
We prove our results using new notions of closure and boundary, which we generalize in order to streamline proofs across hierarchies.
The multi-head attention layer is one of the key components of the transformer architecture that sets it apart from traditional feed-forward models. Given a sequence length k, attention matrices Σ_1,…,Σ_m ∈ ℝ^{d×d}, and projection matrices W_1,…,W_m ∈ ℝ^{d×d}, the corresponding multi-head attention layer F: ℝ^{k×d} → ℝ^{k×d} transforms length-k sequences of d-dimensional tokens X ∈ ℝ^{k×d} via F(X) ≜ ∑_{i=1}^m softmax(X Σ_i X^⊤) X W_i.
In this work, we initiate the study of provably learning a multi-head attention layer from random examples and give the first nontrivial upper and lower bounds for this problem. Provided {W_i, Σ_i} satisfy certain non-degeneracy conditions, we give a (dk)^{O(m^3)}-time algorithm that learns F to small error given random labeled examples drawn uniformly from {±1}^{k×d}. We also prove computational lower bounds showing that in the worst case, exponential dependence on the number of heads m is unavoidable.
We chose to focus on Boolean X to mimic the discrete nature of tokens in large language models, though our techniques naturally extend to standard continuous settings, e.g. Gaussian. Our algorithm, which is centered around using examples to sculpt a convex body containing the unknown parameters, is a significant departure from existing provable algorithms for learning feed-forward networks, which predominantly exploit fine-grained algebraic and rotation invariance properties of the Gaussian distribution. In contrast, our analysis is more flexible as it primarily relies on various upper and lower tail bounds for the input distribution and “slices” thereof.
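A direct sketch of the layer defined above, with the softmax applied row-wise to the k × k score matrix (the usual convention, which the abstract leaves implicit); the learning algorithm itself is not reproduced here.

```python
import numpy as np

def softmax_rows(A):
    A = A - A.max(axis=-1, keepdims=True)          # row-wise softmax, numerically stabilized
    E = np.exp(A)
    return E / E.sum(axis=-1, keepdims=True)

def multi_head_attention(X, Sigmas, Ws):
    """F(X) = sum_i softmax(X Sigma_i X^T) X W_i for X in R^{k x d}."""
    return sum(softmax_rows(X @ S @ X.T) @ X @ W for S, W in zip(Sigmas, Ws))

rng = np.random.default_rng(0)
k, d, m = 4, 3, 2
X = rng.choice([-1.0, 1.0], size=(k, d))           # Boolean tokens, as in the abstract
Sigmas = [rng.standard_normal((d, d)) for _ in range(m)]
Ws = [rng.standard_normal((d, d)) for _ in range(m)]
print(multi_head_attention(X, Sigmas, Ws).shape)   # (4, 3)
```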
We give a polynomial time algorithm that, given copies of an unknown quantum state |ψ⟩ = U|0^n⟩ that is prepared by an unknown constant-depth circuit U on a finite-dimensional lattice, learns a constant-depth quantum circuit that prepares |ψ⟩. The algorithm extends to the case when the depth of U is polylog(n), with a quasi-polynomial run-time. The key new idea is a simple and general procedure that efficiently reconstructs the global state |ψ⟩ from its local reduced density matrices. As an application, we give an efficient algorithm to test whether an unknown quantum state on a lattice has low or high quantum circuit complexity.
History-determinism is a restricted notion of nondeterminism in automata, where the nondeterminism can be successfully resolved based solely on the prefix read so far. History-deterministic automata still allow for exponential succinctness in automata over infinite words compared to deterministic automata (Kuperberg and Skrzypczak, 2015), allow for canonical forms unlike deterministic automata (Abu Radi and Kupferman, 2019 and 2020; Ehlers and Schewe, 2022), and retain some of the algorithmic properties of deterministic automata, for example for reactive synthesis (Henzinger and Piterman, 2006; Ehlers and Khalimov, 2024).
Despite the topic of history-determinism having received a lot of attention over the last decade, the complexity of deciding whether a parity automaton is history-deterministic has, until now, remained open. We show that history-determinism for a parity automaton with a fixed parity index can be checked in PTIME, thus improving upon the naive EXPTIME upper bound of Henzinger and Piterman that has stood since 2006. More precisely, we show that the so-called 2-token game, which can be solved in PTIME for parity automata with a fixed parity index, characterises history-determinism for parity automata. This game was introduced by Bagnol and Kuperberg in 2018, who showed that to decide whether a Büchi automaton is history-deterministic, it suffices to find the winner of the 2-token game on it. They conjectured that this 2-token game based characterisation extends to parity automata. Boker, Kuperberg, Lehtinen, and Skrzypczak showed in 2020 that this conjecture holds for coBüchi automata as well. We prove Bagnol and Kuperberg’s conjecture that the winner of the 2-token game characterises history-determinism on parity automata.
As consequences of our result, we also get the 2-token game based characterisation of history-determinism for alternating parity automata and ω-regular automata, thus yielding efficient algorithms for deciding their history-determinism. Additionally, we obtain an efficient algorithm for deciding good-enough synthesis (Almagor and Kupferman, 2020) for specifications given by deterministic parity automata.
We consider the model of one-dimensional Pushdown Vector Addition Systems (1-PVAS), a fundamental computational model simulating both recursive and concurrent behaviours. Our main result is decidability of the reachability problem for 1-PVAS, an important open problem investigated for at least a decade. In our algorithm we actually work with an equivalent model, Grammar Vector Addition Systems (GVAS). We prove the main result by showing that for every one-dimensional GVAS (1-GVAS) one can compute another 1-GVAS which has the same reachability relation as the original one and additionally has the so-called thin property. Due to the work of Atig and Ganty from 2011, thin 1-GVAS have a decidable reachability problem; therefore our construction implies decidability of the problem for all 1-GVAS. Moreover, we also show that if reachability in thin 1-GVAS can be decided in elementary time, then reachability in all 1-GVAS can also be decided in elementary time.
We propose the notion of succinct oblivious tensor evaluation (OTE), where two parties compute an additive secret sharing of a tensor product of two vectors x ⊗ y, exchanging two simultaneous messages. Crucially, the size of both messages and of the CRS is independent of the dimension of x. We present a construction of OTE with optimal complexity from the standard learning with errors (LWE) problem. Then we show how this new technical tool enables a host of cryptographic primitives, all with security reducible to LWE, such as: (a) Adaptively secure laconic function evaluation for depth-D functions f:{0, 1}m→{0, 1}ℓ with communication m+ℓ+D· poly(λ); (b) A trapdoor hash function for all functions; (c) An (optimally) succinct homomorphic secret sharing for all functions; (d) A rate-1/2 laconic oblivious transfer for batch messages, which is best possible.
In particular, we obtain the first laconic function evaluation scheme that is adaptively secure from the standard LWE assumption, improving upon Quach, Wee, and Wichs (FOCS 2018). As a key technical ingredient, we introduce a new notion of adaptive lattice encodings, which may be of independent interest.
We present a polynomial-time reduction from solving noisy linear equations over ℤ/qℤ in dimension Θ(k log n/(log k, log q, log log n)) with a uniformly random coefficient matrix to noisy linear equations over ℤ/qℤ in dimension n where each row of the coefficient matrix has uniformly random support of size k. This allows us to deduce the hardness of sparse problems from their dense counterparts. In particular, we derive hardness results in the following canonical settings:
• Assuming the ℓ-dimensional (dense) learning with errors (LWE) problem over a polynomial-size field takes time 2^{Ω(ℓ)}, k-sparse LWE in dimension n takes time n^{Ω(k/(log k · (log k + log log n)))}.
• Assuming the ℓ-dimensional (dense) learning parity with noise (LPN) problem over ℤ/2ℤ takes time 2^{Ω(ℓ/log ℓ)}, k-sparse LPN in dimension n takes time n^{Ω(k/(log k · (log k + log log n)^2))}.
These running time lower bounds are nearly tight as both sparse problems can be solved in time n^{O(k)}, given sufficiently many samples.
Our reduction allows us to derive several consequences in cryptography and the computational complexity of statistical problems. In addition, as a new application, we give a reduction from k-sparse LWE to noisy tensor completion. Concretely, composing the two reductions implies that order-k, rank-2k−1 noisy tensor completion in (ℝ^n)^{⊗k} takes time n^{Ω(k/(log k · (log k + log log n)))}, assuming the exponential hardness of standard worst-case lattice problems.
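To make the sparse regime concrete, the sketch below samples k-sparse noisy linear equations over ℤ/qℤ in the spirit of the definition above; the distribution of the nonzero coefficients and the simplistic noise model are our own placeholder assumptions rather than the ones fixed by the reduction.

    import numpy as np

    def k_sparse_lwe(n, k, q, num_samples, noise_rate, rng=None):
        # Each row of the coefficient matrix has a uniformly random support of
        # size k (as in the k-sparse problem above); labels are <a, s> + e mod q.
        rng = np.random.default_rng() if rng is None else rng
        s = rng.integers(q, size=n)
        A = np.zeros((num_samples, n), dtype=np.int64)
        for i in range(num_samples):
            support = rng.choice(n, size=k, replace=False)
            A[i, support] = rng.integers(1, q, size=k)      # assumed uniform nonzeros
        e = rng.integers(1, q, size=num_samples) * (rng.random(num_samples) < noise_rate)
        b = (A @ s + e) % q
        return A, b, s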
We give a public key encryption scheme that is provably secure against poly-size adversaries, assuming n^{log^α n} hardness of the standard planted clique conjecture, for any α ∈ (0,1), and a relatively mild hardness conjecture about noisy k-LIN over expanders that is not known to imply public-key encryption on its own. Both of our conjectures correspond to natural average-case variants of NP-complete problems and have been studied for multiple decades, with unconditional lower bounds supporting them in a variety of restricted models of computation. Our encryption scheme answers an open question in a seminal work by Applebaum, Barak, and Wigderson [STOC’10].
For a finite set F of graphs, the F-Hitting problem asks to compute, for a given graph G (taken from some graph class G) with n vertices (and m edges) and a parameter k ∈ ℕ, a set S of vertices in G such that |S| ≤ k and G−S does not contain any subgraph isomorphic to a graph in F. As a generic problem, F-Hitting subsumes many fundamental vertex-deletion problems that are well-studied in the literature. The F-Hitting problem admits a simple branching algorithm with running time 2^{O(k)} · n^{O(1)}, while it cannot be solved in 2^{o(k)} · n^{O(1)} time on general graphs assuming the ETH, as follows from the seminal work of Lewis and Yannakakis.
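As an illustration of the simple 2^{O(k)} branching algorithm mentioned above, the following sketch handles the special case F = {K_3} (triangle hitting): as long as a triangle survives, one of its three vertices must be deleted, giving at most 3^k branches. The graph representation and helper names are ours.

    from itertools import combinations

    def triangle_hitting(adj, k):
        # adj: dict mapping each vertex to its set of neighbours.
        # Returns a hitting set of size <= k, or None if none exists.
        tri = next((t for t in combinations(adj, 3)
                    if t[1] in adj[t[0]] and t[2] in adj[t[0]] and t[2] in adj[t[1]]),
                   None)
        if tri is None:
            return set()                    # no triangle left to hit
        if k == 0:
            return None                     # budget exhausted, but a triangle remains
        for v in tri:                       # some vertex of the triangle must be deleted
            rest = {u: nbrs - {v} for u, nbrs in adj.items() if u != v}
            sol = triangle_hitting(rest, k - 1)
            if sol is not None:
                return sol | {v}
        return None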
In this paper, we establish a general framework to design subexponential parameterized algorithms for the F-Hitting problem on a broad family of graph classes. Specifically, our framework yields algorithms that solve F-Hitting with running time 2^{O(k^c)} · n + O(m) for a constant c < 1 on any graph class G that admits balanced separators whose size is (strongly) sublinear in the number of vertices and polynomial in the size of a maximum clique. Examples include all graph classes of polynomial expansion (e.g., planar graphs, bounded-genus graphs, minor-free graphs, etc.) and many important classes of geometric intersection graphs (e.g., map graphs, intersection graphs of any fat geometric objects, pseudo-disks, etc.). Our algorithms also apply to the weighted version of F-Hitting, where each vertex of G has a weight and the goal is to compute the set S with a minimum weight that satisfies the desired conditions.
The core of our framework, which is our main technical contribution, is an intricate subexponential branching algorithm that reduces an instance of F-Hitting (on the aforementioned graph classes) to 2^{O(k^c)} general hitting-set instances, where the Gaifman graph of each instance has treewidth O(k^c), for some constant c < 1 depending on F and the graph class.
Dell, Lapinskas and Meeks [DLM SICOMP 2022] presented a general reduction from approximate counting to decision for a class of fine-grained problems that can be viewed as hyperedge counting or detection problems in an implicit hypergraph, thus obtaining tight equivalences between approximate counting and decision for many key problems such as k-clique, k-sum and more. Their result is a reduction from approximately counting the number of hyperedges in an implicit k-partite hypergraph to a polylogarithmic number of calls to a hyperedge oracle that returns whether a given subhypergraph contains an edge.
The main result of this paper is a generalization of the DLM result for output-sensitive approximate counting, where the running time of the desired counting algorithm is inversely proportional to the number of witnesses. Our theorem is a reduction from approximately counting the (unknown) number of hyperedges in an implicit k-partite hypergraph to a polylogarithmic number of calls to a hyperedge oracle called only on subhypergraphs with a small “measure”. If a subhypergraph has ui nodes in the ith node partition of the k-partite hypergraph, then its measure is ∏i ui.
Using the new general reduction and by efficiently implementing measure-bounded colorful independence oracles, we obtain new improved output-sensitive approximate counting algorithms for k-clique, k-dominating set and k-sum. In graphs with n^t k-cliques, for instance, our algorithm (1±ε)-approximates the k-clique count in time Õ_ε(n^{ω(k−t−1/3, k−t/3, k−t+2/3)} + n^2), where ω(a,b,c) is the exponent of n^a × n^b by n^b × n^c matrix multiplication. For large k and t>2, this is a substantial improvement over prior work, even if ω=2.
A set of probabilistic forecasts is calibrated if each prediction of the forecaster closely approximates the empirical distribution of outcomes on the subset of timesteps where that prediction was made. We study the fundamental problem of online calibrated forecasting of binary sequences, which was initially studied by Foster and Vohra. They derived an algorithm with O(T^{2/3}) calibration error after T time steps, and showed a lower bound of Ω(T^{1/2}). These bounds remained stagnant for two decades, until Qiao and Valiant improved the lower bound to Ω(T^{0.528}) by introducing a combinatorial game called sign preservation and showing that lower bounds for this game imply lower bounds for calibration.
In this paper, we give the first improvement to the O(T^{2/3}) upper bound on calibration error of Foster and Vohra.
We do this by introducing a variant of Qiao and Valiant’s game that we call sign preservation with reuse (SPR). We prove that the relationship between SPR and calibrated forecasting is bidirectional: not only do lower bounds for SPR translate into lower bounds for calibration, but algorithms for SPR also translate into new algorithms for calibrated forecasting. We then give an improved upper bound for the SPR game, which implies, via our equivalence, a forecasting algorithm with calibration error O(T^{2/3 − ε}) for some ε > 0, improving Foster and Vohra’s upper bound for the first time. Using similar ideas, we then prove a slightly stronger lower bound than that of Qiao and Valiant, namely Ω(T^{0.54389}). Our lower bound is obtained by an oblivious adversary, marking the first ω(T^{1/2}) calibration lower bound for oblivious adversaries.
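For concreteness, one standard way to instantiate the calibration error of binary forecasts described above is the ℓ1 (bucket-weighted) error sketched below; the precise error functional analysed in the paper may differ.

    from collections import defaultdict

    def l1_calibration_error(predictions, outcomes):
        # predictions[t] in [0, 1] is the forecast at step t; outcomes[t] in {0, 1}.
        # For every distinct prediction value p, compare p with the empirical
        # frequency of 1s on the timesteps where p was predicted.
        buckets = defaultdict(list)
        for p, y in zip(predictions, outcomes):
            buckets[p].append(y)
        return sum(len(ys) * abs(p - sum(ys) / len(ys)) for p, ys in buckets.items())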
Given a set of points, the k-means clustering problem consists of finding a partition of the points into k clusters such that the sum of squared Euclidean distances between the points and their assigned centers is minimized. In this paper, we consider learning bounds for this problem. That is, given a set of n samples P drawn independently from some unknown but fixed distribution D, how quickly does a solution computed on P converge to the optimal clustering of D? The currently fastest provable rate of convergence, of the order √(k/n) · min(k, log k · log^2(n/k)), is due to [Appert, Catoni, 2021], with the best known lower bound being of the order √(k/n), due to [Bartlett, Linder, and Lugosi, 1998].
We give learning bounds with both optimal dependency on the sample size n and nearly optimal dependency on k by proving a convergence rate of the order √(k log k/n).
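For reference, the empirical quantity underlying these learning bounds, the k-means cost of a set of centers on a sample, is sketched below; the learning question compares the distributional cost of centers fitted on P against the optimal distributional cost.

    import numpy as np

    def kmeans_cost(points, centers):
        # Sum over points of the squared Euclidean distance to the nearest center.
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return d2.min(axis=1).sum()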
We resolve a fundamental question about the ability to perform a statistical task, such as learning, when an adversary corrupts the sample. Such adversaries are specified by the types of corruption they can make and their level of knowledge about the sample. The latter distinguishes between sample-adaptive adversaries which know the contents of the sample when choosing the corruption, and sample-oblivious adversaries, which do not. We prove that for all types of corruptions, sample-adaptive and sample-oblivious adversaries are equivalent up to polynomial factors in the sample size. This resolves the main open question introduced by Blanc et al. (COLT, 2022) and further explored in Canonne et al. (FOCS, 2023).
Specifically, consider any algorithm A that solves a statistical task even when a sample-oblivious adversary corrupts its input. We show that there is an algorithm A′ that solves the same task when the corresponding sample-adaptive adversary corrupts its input. The construction of A′ is simple and maintains the computational efficiency of A: It requests a polynomially larger sample than A uses and then runs A on a uniformly random subsample.
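The wrapper A′ described above is simple enough to sketch directly; the choice of the polynomially larger request size is left to the caller, since the required blow-up depends on the task and the corruption budget.

    import random

    def robustify(A, corrupted_sample, original_sample_size):
        # A' requests a polynomially larger sample than A and then runs A on a
        # uniformly random subsample of the size A expects.
        subsample = random.sample(corrupted_sample, original_sample_size)
        return A(subsample)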
One of our main technical tools is a new structural result relating two distributions defined on sunflowers which may be of independent interest.
We give two results on PAC learning DNF formulas using membership queries in the challenging “distribution-free” learning framework, where learning algorithms must succeed for an arbitrary and unknown distribution over {0,1}n.
(1) We first give a quasi-polynomial time “list-decoding” algorithm for learning a single term of an unknown DNF formula. More precisely, for any target s-term DNF formula f = T1 ∨ ⋯ ∨ Ts over {0,1}n and any unknown distribution D over {0,1}n, our algorithm, which uses membership queries and random examples from D, runs in quasipoly(n,s) time and outputs a list L of candidate terms such that with high probability some term Ti of f belongs to L.
(2) We then use result (1) to give a quasipoly(n,s)-time algorithm, in the distribution-free PAC learning model with membership queries, for learning the class of size-s DNFs in which all terms have the same size. Our algorithm learns using a DNF hypothesis.
The key tool used to establish result (1) is a new result on “locally mixing random walks,” which, roughly speaking, shows that a random walk on a graph that is covered by a small number of expanders has a non-negligible probability of mixing quickly in a subset of these expanders.
The stochastic block model is a canonical model of communities in random graphs. It was introduced in the social sciences and statistics as a model of communities, and in theoretical computer science as an average case model for graph partitioning problems under the name of the “planted partition model.” Given a sparse stochastic block model, the two standard inference tasks are: (i) Weak recovery: can we estimate the communities with non-trivial overlap with the true communities? (ii) Detection/Hypothesis testing: can we distinguish if the sample was drawn from the block model or from a random graph with no community structure with probability tending to 1 as the graph size tends to infinity? In this work, we show that for sparse stochastic block models, the two inference tasks are equivalent except at a critical point. That is, weak recovery is information theoretically possible if and only if detection is possible. We thus find a strong connection between these two notions of inference for the model. We further prove that when detection is impossible, an explicit hypothesis test based on low-degree polynomials in the adjacency matrix of the observed graph achieves the optimal statistical power. This low-degree test is efficient as opposed to the likelihood ratio test, which is not known to be efficient. Moreover, we prove that the asymptotic mutual information between the observed network and the community structure exhibits a phase transition at the weak recovery threshold. Our results are proven in much broader settings including the hypergraph stochastic block models and general planted factor graphs. In these settings, we prove that the impossibility of weak recovery implies contiguity and provide a condition that guarantees the equivalence of weak recovery and detection.
Existence of long arithmetic progressions in sumsets and subset sums has been studied extensively in additive combinatorics. These results play a central role in the recent progress on fundamental problems in theoretical computer science, including Knapsack and Subset Sum. The non-constructiveness of the relevant additive combinatorics results limits their applicability in algorithms. In particular, additive combinatorics-based algorithms for Subset Sum, including the Õ(n)-time algorithm for dense subset sum [Bringmann and Wellnitz ’21] and the Õ(n + √(a_max·t))-time algorithm [Chen, Lian, Mao, and Zhang ’24], work only for the decision version of the problem, but not for the search version. To find a solution, one has to spend significantly more time.
In this paper, we provide constructive proofs for finite addition theorems [Sárközy’89 ’94], which are fundamental results in additive combinatorics concerning the existence of long arithmetic progression in sumsets and subset sums. Our constructive proofs yield a near-linear time algorithm that returns an arithmetic progression explicitly, and moreover, for each term in the arithmetic progression, it also returns its representation as the sum of elements in the base set.
As an application, we can now obtain an Õ(n)-time algorithm for the search version of dense subset sum. Another application of our result is Unbounded Subset Sum, where each input integer can be used an infinite number of times. A classic result on the Frobenius problem [Erdős and Graham ’72] implies that for all t ≥ 2a_max^2/n, the decision version can be solved trivially in linear time. It remains unknown whether the search version can be solved in the same time. Our result implies that for all t ≥ c·a_max^2/n for some constant c, a solution for Unbounded Subset Sum can be obtained in O(n log a_max) time.
The major technical challenge is that the original proofs for the above-mentioned additive combinatorics results heavily rely on two fundamental theorems, Mann’s theorem and Kneser’s theorem. These two theorems constitute the main non-constructive part. To bypass these two obstacles, we introduce two techniques.
(i) A new set operation called the greedy sumset. The greedy sumset computes a moderately large subset of the traditional sumset, but has the advantage that a representation of any element of the greedy sumset can be found efficiently.
(ii) A framework for iteratively augmenting an arithmetic progression. It plays the role of Kneser’s theorem in the proof, but has the advantage that representations of elements of the arithmetic progression can be traced efficiently.
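As a toy illustration of returning representations along with sums (see the discussion of the search version above), the following pseudo-polynomial dynamic program records one witness per reachable subset sum; it is emphatically not the paper's near-linear greedy-sumset machinery, only a reminder of what a "representation" is here.

    def subset_sums_with_witnesses(base, bound):
        # Maps each reachable subset sum s <= bound to one list of base elements
        # summing to s.
        reachable = {0: []}
        for a in base:
            for s, rep in list(reachable.items()):
                t = s + a
                if t <= bound and t not in reachable:
                    reachable[t] = rep + [a]
        return reachable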
There is no known polynomial-time algorithm for graph isomorphism testing, but elementary combinatorial “refinement” algorithms seem to be very efficient in practice. Some philosophical justification for this phenomenon is provided by a classical theorem of Babai, Erdős and Selkow: an extremely simple polynomial-time combinatorial algorithm (variously known as “naïve refinement”, “naïve vertex classification”, “colour refinement” or the “1-dimensional Weisfeiler–Leman algorithm”) yields a so-called canonical labelling scheme for “almost all graphs”. More precisely, for a typical outcome of a random graph G(n,1/2), this simple combinatorial algorithm assigns labels to vertices in a way that easily permits isomorphism-testing against any other graph.
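The naïve refinement procedure referred to above is short enough to state in full; the dictionary-based encoding below is our own.

    def color_refinement(adj):
        # 1-dimensional Weisfeiler-Leman: refine vertex colours by the multiset of
        # neighbour colours until the colouring stabilises.
        colors = {v: 0 for v in adj}
        while True:
            sigs = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v]))) for v in adj}
            relabel = {sig: i for i, sig in enumerate(sorted(set(sigs.values())))}
            new_colors = {v: relabel[sigs[v]] for v in adj}
            if new_colors == colors:
                return colors
            colors = new_colors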
We study the problem of detecting or recovering a planted ranked subgraph from a directed graph, an analog for directed graphs of the well-studied planted dense subgraph model.
We suppose that, among a set of n items, there is a subset S of k items having a latent ranking in the form of a permutation π of S, and that we observe a fraction p of pairwise orderings between elements of S which agree with π with probability 1/2 + q and otherwise are uniformly random.
Unlike in the planted dense subgraph and planted clique problems where the community S is distinguished by its unusual density of edges, here the community is only distinguished by the unusual consistency of its pairwise orderings.
We establish computational and statistical thresholds for both detecting and recovering such a ranked community.
In the log-density setting where k, p, and q all scale as powers of n, we establish the exact thresholds in the associated exponents at which detection and recovery become statistically and computationally feasible.
These regimes include a rich variety of behaviors, exhibiting both statistical-computational and detection-recovery gaps.
We also give finer-grained results for two extreme cases: (1) p = 1, k = n, and q small, where a full tournament is observed that is weakly correlated with a global ranking, and (2) p = 1, q = 1/2, and k small, where a small “ordered clique” (totally ordered directed subgraph) is planted in a random tournament.
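The sketch below samples an instance under one natural reading of the observation model above: every pair is observed independently with probability p, pairs inside S are oriented consistently with π with probability 1/2 + q, and all other observed pairs are oriented uniformly at random. The paper's exact conventions may differ in such details.

    import random

    def planted_ranking_instance(n, k, p, q):
        S = random.sample(range(n), k)
        random.shuffle(S)
        rank = {v: r for r, v in enumerate(S)}   # hidden ranking pi on S
        edges = []                               # (u, v) means "u precedes v"
        for i in range(n):
            for j in range(i + 1, n):
                if random.random() >= p:
                    continue                     # pair not observed
                if i in rank and j in rank:
                    lo, hi = (i, j) if rank[i] < rank[j] else (j, i)
                    edges.append((lo, hi) if random.random() < 0.5 + q else (hi, lo))
                else:
                    edges.append((i, j) if random.random() < 0.5 else (j, i))
        return set(S), rank, edges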
In an m-edge host graph G, all triangles can be listed in time O(m^{1.5}) [Itai, Rodeh ’78], and all k-cycles can be listed in time O(m^{2−1/⌈k/2⌉} + t), where t is the output size [Alon, Yuster, Zwick ’97]. These classic results also hold for the colored problem variant, where the nodes of the host graph G are colored by nodes in the pattern graph H, and we are only interested in subgraphs of G that are isomorphic to the pattern H and respect the colors. We study the problem of listing all H-subgraphs in the colored setting, for fixed pattern graphs H.
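For reference, the classic triangle-listing bound cited above can be achieved by orienting every edge towards its higher-degree endpoint and intersecting out-neighbourhoods; this degree-splitting idea is the one-dimensional ancestor of the hyper-degree splitting introduced later in this abstract. A sketch (with our own representation) follows; its running time is O(m^{1.5}) plus the output size, up to the cost of the set intersections.

    def list_triangles(adj):
        # Orient each edge towards the endpoint of higher degree (ties broken by
        # vertex id) and report common out-neighbours; each triangle is listed once.
        def key(v):
            return (len(adj[v]), v)
        out = {v: {u for u in adj[v] if key(u) > key(v)} for v in adj}
        for u in adj:
            for v in out[u]:
                for w in out[u] & out[v]:
                    yield (u, v, w)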
As our main result, we determine all pattern graphs H such that all H-subgraphs can be listed in subquadratic time O(m^{2−ε} + t), where t is the output size. Moreover, for each such subquadratic pattern H we determine the smallest exponent c(H) such that all H-subgraphs can be listed in time O(m^{c(H)} + t). This is a vast generalization of the classic results on triangles and cycles. In particular, it answers an open problem from the database community [Joglekar, Ré ’18].
We also show the same results for two related problems: finding an H-subgraph of minimum total edge-weight in time O(m^{c(H)}), and enumerating all H-subgraphs in O(m^{c(H)}) preprocessing time and constant delay. Again we determine all pattern graphs H that have complexity c(H) < 2, and for each such subquadratic pattern we determine the optimal complexity c(H).
To prove these results, we design new algorithms and prove conditional lower bounds based on standard hypotheses from fine-grained complexity theory. For our algorithms we introduce a new ingredient that we call hyper-degree splitting, where we split tuples of nodes into high degree and low degree depending on their number of common neighbors. This technique generalizes the classic high-degree-low-degree approach (which only splits along degrees of single nodes) and is crucial to obtain optimal algorithms. Furthermore, we contribute the simple but fundamental insight that to determine the optimal complexity of some natural families of algorithms and lower bounds it suffices to study graphs without clique separators (i.e., cliques whose removal disconnects the pattern graph). This insight immediately implies some existing results in the area and is crucial to obtain a complete classification.
The edit distance (also known as the Levenshtein distance) of two strings is the minimum number of character insertions, deletions, and substitutions needed to transform one string into the other. The textbook algorithm determines the edit distance of two length-n strings in O(n^2) time, and one of the foundational results of fine-grained complexity is that any polynomial-factor improvement upon this quadratic runtime would violate the Orthogonal Vectors Hypothesis. In the bounded version of the problem, where the complexity is parameterized by the value k of the edit distance, the classic algorithm of Landau and Vishkin [JCSS’88] achieves O(n+k^2) time, which is optimal (up to sub-polynomial factors and conditioned on OVH) as a function of n and k.
While the Levenshtein distance is a fundamental theoretical notion, most practical applications use weighted edit distance, where the weight (cost) of each edit can be an arbitrary real number in [1,∞) that may depend on the edit type and the characters involved. Unfortunately, the Landau–Vishkin algorithm does not generalize to the weighted setting and, for many decades, a simple O(nk)-time dynamic programming procedure remained the state of the art for bounded weighted edit distance. Only recently, Das, Gilbert, Hajiaghayi, Kociumaka, and Saha [STOC’23] provided an O(n+k^5)-time algorithm; shortly afterward, Cassis, Kociumaka, and Wellnitz [FOCS’23] presented an Õ(n+√(nk^3))-time solution (where Õ(·) hides polylog n factors) and proved this runtime optimal for √n ≤ k ≤ n (up to sub-polynomial factors and conditioned on the All-Pairs Shortest Paths Hypothesis).
Notably, the hard instances constructed to establish the underlying conditional lower bound use fractional weights with large denominators (growing with n), which stands in contrast to weight functions used in practice (e.g., in bioinformatics) that, after normalization, typically attain small integer values. Our first main contribution is a surprising discovery that the Õ(n+k^2) running time of the Landau–Vishkin algorithm can be recovered if the weights are small integers (e.g., if some specific weight function is fixed in a given application). In general, our solution takes Õ(n + min{W,√k}·k^2) time for integer weights not exceeding a threshold W. Despite matching time complexities, we do not use the Landau–Vishkin framework; instead, we build upon the recent techniques for arbitrary weights. For this, we exploit extra structure following from integer weights and avoid further bottlenecks using several novel ideas to give faster and more robust implementations of multiple steps of the previous approach.
Next, we shift focus to the dynamic version of the unweighted edit distance problem, which asks to maintain the edit distance of two strings that change dynamically, with each update modeled as a single edit (character insertion, deletion, or substitution). For many years, the best approach for dynamic edit distance combined the Landau–Vishkin algorithm with a dynamic strings implementation supporting efficient substring equality queries, such as one by Mehlhorn, Sundar, and Uhrig [SODA’94]; the resulting solution supports updates in Õ(k^2) time. Recently, Charalampopoulos, Kociumaka, and Mozes [CPM’20] observed that a framework of Tiskin [SODA’10] yields a dynamic algorithm with an update time of Õ(n). This is optimal in terms of n: significantly faster updates would improve upon the static O(n^2)-time algorithm and violate OVH. With the state-of-the-art update time at Õ(min{n,k^2}), an exciting open question is whether Õ(k) update time is possible. Our second main contribution is an affirmative answer to this question: we present a deterministic dynamic algorithm that maintains the edit distance in Õ(k) worst-case time per update. Surprisingly, this result builds upon our new static solution and hence natively supports small integer weights: if they do not exceed a threshold W, the update time becomes Õ(W^2 k).
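For orientation, the simple O(nk)-time dynamic program mentioned earlier restricts the textbook table to the band |i − j| ≤ k, which is valid because all weights are at least 1; a sketch follows, with the per-edit weight functions sub, ins, and dele supplied by the caller (names ours).

    def banded_weighted_edit_distance(a, b, k, sub, ins, dele):
        # Weighted edit distance capped at k: cells outside the band |i - j| <= k
        # cannot lie on an alignment of cost <= k when every weight is >= 1.
        INF = float("inf")
        n, m = len(a), len(b)
        if abs(n - m) > k:
            return None
        prev = {0: 0.0}
        for j in range(1, min(m, k) + 1):
            prev[j] = prev[j - 1] + ins(b[j - 1])
        for i in range(1, n + 1):
            cur = {}
            for j in range(max(0, i - k), min(m, i + k) + 1):
                best = INF
                if j in prev:
                    best = min(best, prev[j] + dele(a[i - 1]))
                if j > 0 and j - 1 in prev:
                    step = 0.0 if a[i - 1] == b[j - 1] else sub(a[i - 1], b[j - 1])
                    best = min(best, prev[j - 1] + step)
                if j - 1 in cur:
                    best = min(best, cur[j - 1] + ins(b[j - 1]))
                cur[j] = best
            prev = cur
        d = prev.get(m, INF)
        return d if d <= k else None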
The symmetric binary perceptron (SBP_κ) problem with parameter κ : ℝ_{≥1} → [0,1] is an average-case search problem defined as follows: given a random Gaussian matrix A ∼ N(0,1)^{n×m} as input, where m ≥ n, output a vector x ∈ {−1,1}^m such that ‖Ax‖_∞ ≤ κ(m/n) · √m.
The number partitioning problem (NPPκ) corresponds to the special case of setting n=1. There is considerable evidence that both problems exhibit large computational-statistical gaps.
In this work, we show (nearly) tight average-case hardness for these problems, assuming the worst-case hardness of standard approximate shortest vector problems on lattices.
• For SBP_κ, statistically, solutions exist with κ(x) = 2^{−Θ(x)} (Aubin, Perkins and Zdeborová, Journal of Physics 2019). For large n, the best that efficient algorithms have been able to achieve is a far cry from the statistical bound, namely κ(x) = Θ(1/√x) (Bansal and Spencer, Random Structures and Algorithms 2020). The problem has been extensively studied in the TCS and statistics communities, and Gamarnik, Kızıldağ, Perkins and Xu (FOCS 2022) conjecture that Bansal-Spencer is tight: namely, κ(x) = Θ(1/√x) is the optimal value achieved by computationally efficient algorithms. We prove their conjecture assuming the worst-case hardness of approximating the shortest vector problem on lattices.
• For NPP_κ, statistically, solutions exist with κ(m) = Θ(2^{−m}) (Karmarkar, Karp, Lueker and Odlyzko, Journal of Applied Probability 1986). Karmarkar and Karp’s classical differencing algorithm achieves κ(m) = 2^{−O(log^2 m)}. We prove that Karmarkar-Karp is nearly tight: namely, no polynomial-time algorithm can achieve κ(m) = 2^{−Ω(log^3 m)}, once again assuming the worst-case subexponential hardness of approximating the shortest vector problem on lattices to within a subexponential factor.
Our hardness results are versatile, and hold with respect to different distributions of the matrix A (e.g., i.i.d. uniform entries from [0,1]) and weaker requirements on the solution vector x.
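For reference, the Karmarkar–Karp differencing heuristic mentioned above takes only a few lines; it returns the achieved discrepancy, which for random inputs is of order 2^{−O(log^2 m)} relative to the input scale.

    import heapq

    def karmarkar_karp_discrepancy(xs):
        # Largest-differencing heuristic: repeatedly replace the two largest
        # remaining numbers by their difference; the last survivor equals the
        # discrepancy |sum over S - sum over complement| of an implicit partition.
        heap = [-x for x in xs]
        heapq.heapify(heap)
        while len(heap) > 1:
            a, b = -heapq.heappop(heap), -heapq.heappop(heap)
            heapq.heappush(heap, -(a - b))
        return -heap[0] if heap else 0.0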
We provide m^{1+o(1)} k ε^{−1}-time algorithms for computing multiplicative (1 − ε)-approximate solutions to multi-commodity flow problems with k commodities on m-edge directed graphs, including concurrent multi-commodity flow and maximum multi-commodity flow.
To obtain our results, we provide new optimization tools of potential independent interest.
First, we provide an improved optimization method for solving ℓ_{q,p}-regression problems to high accuracy. This method makes Õ_{q,p}(k) queries to a high accuracy convex minimization oracle, where Õ_{q,p}(·) hides factors depending only on q, p, or poly(log m), improving upon the Õ_{q,p}(k^2) bound of [Chen-Ye, ICALP 2024].
As a result, we obtain the first almost-linear time algorithm that solves ℓ_{q,p} flows on directed graphs to high accuracy.
Second, we present optimization tools to reduce approximately solving composite ℓ_{1,∞}-regression problems to solving m^{o(1)} ε^{−1} instances of the composite ℓ_{q,p}-regression problem.
The method builds upon recent advances in solving box-simplex games [Jambulapati-Tian, NeurIPS 2023] and the area convex regularizer introduced in [Sherman, STOC 2017] to obtain faster rates for constrained versions of the problem. Carefully combining these techniques yields our directed multi-commodity flow algorithm.
We study the problem of finding an ε-first-order stationary point (FOSP) of a smooth function, given access only to gradient information. The best-known gradient query complexity for this task, assuming both the gradient and Hessian of the objective function are Lipschitz continuous, is O(ε^{−7/4}). In this work, we propose a method with a gradient complexity of O(d^{1/4} ε^{−13/8}), where d is the problem dimension, leading to an improved complexity when d = O(ε^{−1/2}). To achieve this result, we design an optimization algorithm that, underneath, involves solving two online learning problems. Specifically, we first reformulate the task of finding a stationary point for a nonconvex problem as minimizing the regret in an online convex optimization problem, where the loss is determined by the gradient of the objective function. Then, we introduce a novel optimistic quasi-Newton method to solve this online learning problem, with the Hessian approximation update itself framed as an online learning problem in the space of matrices. Beyond improving the complexity bound for achieving an ε-FOSP using a gradient oracle, our result provides the first guarantee suggesting that quasi-Newton methods can potentially outperform gradient descent-type methods in nonconvex settings.
We give a fast, spectral procedure for implementing approximate-message passing (AMP) algorithms robustly. For any quadratic optimization problem over symmetric matrices X with independent subgaussian entries, and any separable AMP algorithm A, our algorithm performs a spectral pre-processing step and then mildly modifies the iterates of A. If given the perturbed input X + E ∈ ℝ^{n×n} for any E supported on an ε n × ε n principal minor, our algorithm outputs a solution v which is guaranteed to be close to the output of A on the uncorrupted X, with ‖A(X) − v‖_2 ≤ f(ε)·‖A(X)‖_2, where f(ε) → 0 as ε → 0.
A (1+ε)-stretch tree cover of an edge-weighted n-vertex graph G is a collection of trees, where every pair of vertices has a (1+ε)-stretch path in one of the trees. The celebrated Dumbbell Theorem by Arya et al. [STOC’95] states that any set of n points in d-dimensional Euclidean space admits a (1+ε)-stretch tree cover with a constant number of trees, where the constant depends on ε and the dimension d. This result was generalized to arbitrary doubling metrics by Bartal et al. [ICALP’19]. While the total number of edges in the tree covers of Arya et al. and Bartal et al. is O(n), all known tree cover constructions incur a total lightness of Ω(log n); whether one can get a tree cover of constant lightness has remained a longstanding open question, even for 2-dimensional point sets.
In this work we resolve this fundamental question in the affirmative, as a direct corollary of a new construction of (1+ε)-stretch spanning tree cover for doubling graphs; in a spanning tree cover, every tree may only use edges of the input graph rather than the corresponding metric. To the best of our knowledge, this is the first constant-stretch spanning tree cover construction (let alone for (1+ε)-stretch) with a constant number of trees, for any nontrivial family of graphs.
Concrete applications of our spanning tree cover include:
- A (1+ε)-stretch tree cover construction, where both the number of trees and lightness are bounded by O(1), for doubling graphs. In doubling metrics, we can also bound the maximum degree of each vertex by O(1) (which is impossible in doubling graphs).
- A compact (1+ε)-stretch routing scheme in the labeled model for doubling graphs, which uses the asymptotically optimal (up to the dependencies on ε and d) bound of O(log n) bits on all the involved measures (label, header, and routing table sizes). This is a significant improvement over the works of Chan et al. [SODA’05], Abraham et al. [ICDCS’06], and Konjevod et al. [SODA’07], where the local memory usage either depends on the aspect ratio of the graph or is Ω(log^3 n).
- The first path-reporting distance oracle for doubling graphs achieving optimal bounds for all important parameters: O(n) space, (1+ε)-stretch, and O(1) query time for constant d and ε.
We define and study analogs of probabilistic tree embedding and tree cover for directed graphs. We define the notion of a DAG cover of a general directed graph G: a small collection D1,…, Dg of DAGs so that for all pairs of vertices s,t, some DAG Di provides low distortion for (s,t); i.e. Di(s, t) ≤ α · G(s, t), where α is the distortion. As a trivial upper bound, there is a DAG cover with n DAGs and α=1 by taking the shortest-paths tree from each vertex. When each DAG is restricted to be a subgraph of G, there is a simple matching lower bound (via a directed cycle) that n DAGs are necessary, even to preserve reachability. Thus, we allow the DAGs to include a limited number of additional edges not from the original graph.
When n^2 additional edges are allowed, there is a simple upper bound of two DAGs and α=1. Our first result is an almost-matching lower bound: even with n^{2−o(1)} additional edges, at least n^{1−o(1)} DAGs are needed, even to preserve reachability. However, the story is different when the number of additional edges is Õ(m), a natural setting where the sparsity of the DAG collection asymptotically matches that of the original graph. Our main upper bound is that there is a near-linear time algorithm to construct a DAG cover with Õ(m) additional edges, polylogarithmic distortion, and only O(log n) DAGs. This is similar to known results for undirected graphs: the well-known FRT probabilistic tree embedding implies a tree cover where both the number of trees and the distortion are logarithmic. Our algorithm also extends to a certain probabilistic embedding guarantee. Lastly, we complement our upper bound with a lower bound showing that achieving a DAG cover with no distortion and Õ(m) additional edges requires a polynomial number of DAGs.
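The trivial upper bound mentioned above (n DAGs, distortion α = 1) simply takes one shortest-paths tree per source; a sketch for the unweighted case is below, with BFS standing in for Dijkstra.

    from collections import deque

    def trivial_dag_cover(adj):
        # One BFS shortest-path tree per source vertex: n DAGs, distortion 1.
        covers = []
        for s in adj:
            dist, tree_edges = {s: 0}, []
            queue = deque([s])
            while queue:
                u = queue.popleft()
                for v in adj[u]:
                    if v not in dist:
                        dist[v] = dist[u] + 1
                        tree_edges.append((u, v))
                        queue.append(v)
            covers.append(tree_edges)
        return covers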
We give a deterministic algorithm for computing a global minimum vertex cut in a vertex-weighted graph with n vertices and m edges in Ô(mn) time. We use Õ(·) and Ô(·) to hide log(n) and n^{o(1)} factors, respectively. This breaks the long-standing Ω(n^4)-time barrier in dense graphs, achievable by trivially computing all-pairs maximum flows. Up to subpolynomial factors, we match the fastest randomized Õ(mn)-time algorithm by [Henzinger, Rao and Gabow FOCS’96], and affirmatively answer the question by [Gabow FOCS’00] whether deterministic O(mn)-time algorithms exist even for unweighted graphs. Our algorithm works in directed graphs, too.
In unweighted undirected graphs, we present a faster deterministic Ô(mκ)-time algorithm where κ ≤ n is the size of the global minimum vertex cut. For a moderate value of κ, this strictly improves upon all previous deterministic algorithms in unweighted graphs with running time Ô(m(n+κ^2)) [Even’75], Ô(m(n+κ√n)) [Gabow FOCS’00], and Ô(m·2^{O(κ^2)}) [Saranurak and Yingchareonthawornchai FOCS’22]. Recently, a linear-time algorithm has been shown by [Korhonen’24] for small κ.
Our approach applies the common-neighborhood clustering, recently introduced by [Blikstad, Jiang, Mukhopadhyay and Yingchareonthawornchai STOC’25], in novel ways, e.g., on top of weighted graphs and on top of vertex-expander decomposition. We also exploit pseudorandom objects often used in the computational complexity community, including crossing families based on dispersers from [Ta-Shma, Umans and Zuckerman STOC’01; Wigderson and Zuckerman Combinatorica’99] and selectors based on linear lossless condensers [Cheraghchi’11; Guruswami, Umans and Vadhan JACM’09]. To our knowledge, this is the first application of selectors in graph algorithms.
A recent breakthrough by [LNPSY STOC’21] showed that solving s-t vertex connectivity is sufficient (up to polylogarithmic factors) to solve (global) vertex connectivity in the sequential model. This raises a natural question: What is the relationship between s-t and global vertex connectivity in other computational models? In this paper, we demonstrate that the connection between global and s-t variants behaves very differently across computational models.
In parallel and distributed models, we obtain almost tight reductions from global to s-t vertex connectivity. In PRAM, this leads to an n^{ω+o(1)}-work and n^{o(1)}-depth algorithm for vertex connectivity, improving over the 35-year-old Õ(n^{ω+1})-work, O(log^2 n)-depth algorithm by [LLW FOCS’86], where ω is the matrix multiplication exponent and n is the number of vertices. In CONGEST, the reduction implies the first sublinear-round vertex connectivity algorithm when the diameter is moderately small. This answers an open question in [JM STOC’23].
In contrast, we show that global vertex connectivity is strictly harder than s-t vertex connectivity in the two-party communication setting, requiring n^{1.5} bits of communication. The s-t variant was known to be solvable in Õ(n) communication [BvdBEMN FOCS’22]. Our results resolve open problems raised by [MN STOC’20, BvdBEMN FOCS’22, AS SOSA’23].
At the heart of our results is a new graph decomposition framework we call common-neighborhood clustering, which can be applied in multiple models. Finally, we observe that global vertex connectivity cannot be solved without using s-t vertex connectivity by proving an s-t to global reduction in dense graphs in the PRAM and communication models.
Let S and T be two point sets in a metric space (M,d) with |S|+|T|=n, and let b: S∪T → ℕ be a function satisfying b(s) ≤ |T| for s∈S and b(t) ≤ |S| for t∈T. We call a function µ: S×T → {0,1} a multimatching between S and T with respect to b if for each s∈S (resp. t∈T), the number of points t∈T (resp. s∈S) with µ(s,t)=1 is at least b(s) (resp. b(t)). The cost of a multimatching µ is defined as cost(µ) = ∑_{(s,t)} µ(s,t)·d(s,t), where d(s,t) is the distance between s and t in M. The geometric multimatching problem asks for a multimatching of minimum cost. The special case in which b(v)=1 for all v∈S∪T is known as the geometric many-to-many matching problem.
We present two results for the geometric multimatching problem and the geometric many-to-many matching problem when the metric space M has doubling dimension ddim. Notably, we present the first near-linear-time approximation scheme for the geometric multimatching problem with respect to the output size. Furthermore, our second result improves the best-known (1+ε)-approximation algorithm for the geometric many-to-many matching problem, presented by Bandyapadhyay and Xue [SoCG’24], which won the best paper award at SoCG’24. More specifically, we present the following two algorithms:
1. A (1/ε)^{O(ddim)} · B log^2 n-time deterministic algorithm that returns a multimatching of cost at most (1+ε) times the cost of an optimal multimatching for any constant ε>0, where B denotes the summation of b(v) for all v∈S∪T.
2. A (1/ε)^{O(ddim)} · n log n-time deterministic algorithm that returns a many-to-many matching of cost at most (1+ε) times the cost of an optimal many-to-many matching for any constant ε>0.
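The feasibility and cost conditions from the definition above translate directly into code; representing µ as the set of matched pairs is our own choice.

    def is_multimatching(pairs, S, T, b):
        # pairs: set of (s, t) with mu(s, t) = 1.  Check that every point v is
        # matched to at least b[v] points on the other side.
        deg = {v: 0 for v in list(S) + list(T)}
        for s, t in pairs:
            deg[s] += 1
            deg[t] += 1
        return all(deg[v] >= b[v] for v in deg)

    def multimatching_cost(pairs, d):
        # cost(mu) = sum of d(s, t) over matched pairs.
        return sum(d(s, t) for s, t in pairs)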
Fingerprinting codes are a crucial tool for proving lower bounds in differential privacy. They have been used to prove tight lower bounds for several fundamental questions, especially in the “low accuracy” regime. Unlike reconstruction/discrepancy approaches, however, they are better suited to query sets that arise naturally from the fingerprinting codes construction. In this work, we propose a general framework for proving fingerprinting-type lower bounds that allows us to tailor the technique to the geometry of the query set. Our approach allows us to prove several new results, including the following.
• We show that any (sample- and population-)accurate algorithm for answering Q arbitrary adaptive counting queries over a universe X to accuracy α needs Ω(√(log|X|) · log Q/α^3) samples, matching known upper bounds. This shows that the approaches based on differential privacy are optimal for this question, and improves significantly on the previously known lower bounds of log Q/α^2 and min(√Q, √(log|X|))/α^2.
• We show that any (ε,δ)-DP algorithm for answering Q counting queries to accuracy α needs Ω(√(log|X| · log(1/δ)) · log Q/α^2) samples, matching known upper bounds up to constants. Our framework allows for proving this bound via a direct correlation analysis and improves the prior bound by a √(log(1/δ)) factor.
• For privately releasing a set of random 0-1 queries, we show tight sample complexity lower bounds in the high accuracy regime.
In the low accuracy regime, the picture is more complex. For random queries, we show that there is a discontinuity in the sample complexity: for Q random queries over a universe X, the sample complexity grows as Θ_{ε,δ}(1/α^2), with no dependence on Q or |X|. This new sample complexity bound, based on sparse histograms, is asymptotically better than known lower bounds for CDP. However, at α ≈ √(log|X|)/√Q, the sample complexity jumps to Θ_{ε,δ}(√Q/α).