We revisit the constant-factor approximation algorithm for the asymmetric traveling salesman problem by Svensson, Tarnawski, and Végh [STOC 2018]. We improve on each part of this algorithm. We avoid the reduction to irreducible instances and thus obtain a simpler and much better reduction to vertebrate pairs. We also show that a slight variant of their algorithm for vertebrate pairs has a much smaller approximation ratio. Overall we improve the approximation ratio from 506 to 22+ε for any ε > 0. This also improves the upper bound on the integrality ratio from 319 to 22.

We present a black-box reduction from the path version of the Traveling Salesman Problem (Path TSP) to the classical tour version (TSP). More precisely, we show that given an α-approximation algorithm for TSP, then, for any ε >0, there is an (α+ε)-approximation algorithm for the more general Path TSP. This reduction implies that the approximability of Path TSP is the same as for TSP, up to an arbitrarily small error. This avoids future discrepancies between the best known approximation factors achievable for these two problems, as they have existed until very recently.

A well-studied special case of TSP, Graph TSP, asks for tours in unit-weight graphs. Our reduction shows that any α-approximation algorithm for Graph TSP implies an (α+ε)-approximation algorithm for its path version. By applying our reduction to the 1.4-approximation algorithm for Graph TSP by Sebő and Vygen, we obtain a polynomial-time (1.4+ε)-approximation algorithm for Graph Path TSP, improving on a recent 1.497-approximation algorithm of Traub and Vygen.

We obtain our results through a variety of new techniques, including a novel way to set up a recursive dynamic program to guess significant parts of an optimal solution. At the core of our dynamic program we deal with instances of a new generalization of (Path) TSP which combines parity constraints with certain connectivity requirements. This problem, which we call Φ-TSP, has a constant-factor approximation algorithm and can be reduced to TSP in certain cases when the dynamic program would not make sufficient progress.

We design a 1.49993-approximation algorithm for the metric traveling salesperson problem (TSP) for instances in which an optimal solution to the subtour linear programming relaxation is half-integral. These instances received significant attention over the last decade due to a conjecture of Schalekamp, Williamson and van Zuylen stating that half-integral LP solutions have the largest integrality gap over all fractional solutions. So, if the conjecture of Schalekamp et al. holds true, our result shows that the integrality gap of the subtour polytope is bounded away from 3/2.

The symmetric traveling salesman problem (TSP) is the problem of finding the shortest
Hamiltonian cycle in an edge-weighted undirected graph. In 1962 Bellman, and independently
Held and Karp, showed that TSP instances with *n* cities can be solved in *O*(*n*^{2}2^{n}) time. Since then it has been a notorious problem to improve the runtime to *O*((2−є)^{n}) for some constant є>0. In this work we establish the following progress: If (*s*× *s*)-matrices can be multiplied in *s*^{2+o(1)} time, than all instances of TSP in *bipartite* graphs can be solved in *O*(1.9999^{n}) time by a randomized algorithm with constant error probability. We also indicate
how our methods may be useful to solve TSP in non-bipartite graphs.

On a high level, our approach is via a new problem called MinHamPair: Given two families of weighted perfect matchings, find a combination of minimum weight that forms a Hamiltonian cycle. As our main technical contribution, we give a fast algorithm for MinHamPair based on a new sparse cut-based factorization of the ‘matchings connectivity matrix’, introduced by Cygan et al. [JACM’18].

We introduce the binary value principle which is a simple subset-sum instance expressing that a natural number written in binary cannot be negative, relating it to central problems in proof and algebraic complexity. We prove conditional superpolynomial lower bounds on the Ideal Proof System (IPS) refutation size of this instance, based on a well-known hypothesis by Shub and Smale about the hardness of computing factorials, where IPS is the strong algebraic proof system introduced by Grochow and Pitassi (2018). Conversely, we show that short IPS refutations of this instance bridge the gap between sufficiently strong algebraic and semi-algebraic proof systems. Our results extend to full-fledged IPS the paradigm introduced in Forbes et al. (2016), whereby lower bounds against subsystems of IPS were obtained using restricted algebraic circuit lower bounds, and demonstrate that the binary value principle captures the advantage of semi-algebraic over algebraic reasoning, for sufficiently strong systems. Specifically, we show the following:

*Conditional IPS lower bounds:* The Shub-Smale hypothesis (1995) implies a superpolynomial lower bound on the size
of IPS refutations of the binary value principle over the rationals defined as the
unsatisfiable linear equation ∑_{i=1}^{n} 2^{i−1}*x*_{i} = −1, for boolean *x*_{i}’s. Further, the related τ-conjecture (1995) implies a superpolynomial lower bound
on the size of IPS refutations of a variant of the binary value principle over the
ring of rational functions. No prior conditional lower bounds were known for IPS or
for apparently much weaker propositional proof systems such as Frege.

*Algebraic vs. semi-algebraic proofs:* Admitting short refutations of the binary value principle is necessary for any algebraic
proof system to fully simulate any known semi-algebraic proof system, and for strong
enough algebraic proof systems it is also sufficient. In particular, we introduce
a very strong proof system that simulates all known semi-algebraic proof systems (and
most other known concrete propositional proof systems), under the name Cone Proof
System (CPS), as a semi-algebraic analogue of the ideal proof system: CPS establishes
the unsatisfiability of collections of polynomial equalities and inequalities over
the reals, by representing sum-of-squares proofs (and extensions) as algebraic circuits.
We prove that IPS is polynomially equivalent to CPS iff IPS admits polynomial-size
refutations of the binary value principle (for the language of systems of equations
that have no 0/1-solutions), over both ℤ and ℚ.

We show that Cutting Planes (CP) proofs are hard to find: Given an unsatisfiable formula *F*, It is -hard to find a CP refutation of *F* in time polynomial in the length of the shortest such refutation; and unless Gap-Hitting-Set
admits a nontrivial algorithm, one cannot find a *tree-like* CP refutation of *F* in time polynomial in the length of the shortest such refutation.

The first result extends the recent breakthrough of Atserias and M'uller (FOCS 2019)
that established an analogous result for Resolution. Our proofs rely on two new lifting
theorems: (1) Dag-like lifting for gadgets with *many output bits*. (2) Tree-like lifting that simulates an *r*-round protocol with gadgets of query complexity *O*(*r*).

One of the major open problems in proof complexity is to prove lower bounds on *AC*_{0}[*p*]-Frege proof systems. As a step toward this goal Impagliazzo, Mouli and Pitassi in
a recent paper suggested to prove lower bounds on the size for Polynomial Calculus
over the {± 1} basis. In this paper we show a technique for proving such lower bounds
and moreover we also give lower bounds on the size for Sum-of-Squares over the {±
1} basis.

We show lower bounds on random Δ-CNF formulas and formulas composed with a gadget. As a byproduct, we establish a separation between Polynomial Calculus and Sum-of-Squares over the {± 1} basis by proving a lower bound on the Pigeonhole Principle.

We give a surprising classification for the computational complexity of the Quantified Constraint Satisfaction Problem over a constraint language Γ, QCSP(Γ), where Γ is a finite language over 3 elements which contains all constants. In particular, such problems are either in P, NP-complete, co-NP-complete or PSpace-complete. Our classification refutes the hitherto widely-believed Chen Conjecture.

Additionally, we show that already on a 4-element domain there exists a constraint
language Γ such that (Γ) is DP-complete (from Boolean Hierarchy), and on a 10-element
domain there exists a constraint language giving the complexity class Θ_{2}^{P}.

Meanwhile, we prove the Chen Conjecture for finite conservative languages Γ. If the polymorphism clone of such Γ has the polynomially generated powers (PGP) property then QCSP(Γ) is in NP. Otherwise, the polymorphism clone of Γ has the exponentially generated powers (EGP) property and QCSP(Γ) is PSpace-complete.

This paper focuses on the contention resolution problem on a shared communication channel that does not support collision detection. A shared communication channel is a multiple access channel, which consists of a sequence of synchronized time slots. Players on the channel may attempt to broadcast a packet (message) in any time slot. A player's broadcast succeeds if no other player broadcasts during that slot. If two or more players broadcast in the same time slot, then the broadcasts collide and both broadcasts fail. The lack of collision detection means that a player monitoring the channel cannot differentiate between the case of two or more players broadcasting in the same slot (a collision) and zero players broadcasting. In the contention-resolution problem, players arrive on the channel over time, and each player has one packet to transmit. The goal is to coordinate the players so that each player is able to successfully transmit its packet within reasonable time. However, the players can only communicate via the shared channel by choosing to either broadcast or not. A contention-resolution protocol is measured in terms of its throughput (channel utilization). Previous work on contention resolution that achieved constant throughput assumed that either players could detect collisions, or the players' arrival pattern is generated by a memoryless (non-adversarial) process.

The foundational question answered by this paper is whether collision detection is a luxury or necessity when the objective is to achieve constant throughput. We show that even without collision detection, one can solve contention resolution, achieving constant throughput, with high probability.

Population protocols are a model of distributed computing, where *n* agents with limited computational power and memory perform randomly scheduled pairwise
interactions. A fundamental problem in this setting is that of *leader election*, where all agents start from the same state, and they seek to reach and maintain
a global state where exactly one agent is in a dedicated leader state. A significant
amount of work has been devoted to the study of the time and space complexity of this
problem. Alistarh et al. (SODA’17) have shown that Ω(loglog*n*) states per agent are needed in order to elect a leader in fewer than Θ(*n*^{2}) expected interactions. Moreover, Ω(*n*log*n*) expected interactions are required regardless of the number of states (Sudo and
Masuzawa, 2019). On the upper bound side, Gasieniec and Stachowiak (SODA’18) have
presented the first protocol that uses an optimal, Θ(loglog*n*), number or states and elects a leader in *O*(*n* log^{2} *n*) expected interactions. This running time was subsequently improved to *O*(*n* log*n*loglog*n*) (Gasieniec et al., SPAA’19).

In this paper we provide the first leader election population protocol that is both
time and space optimal: it uses Θ(loglog*n*) states per agent, and elects a leader in *O*(*n*log*n*) interactions in expectation. A key novel component of our approach is a simple protocol
that efficiently selects a small set of agents, of *poly*(log*n*) size, given an initial set of *s* = *O*(*n*^{є}) selected agents. Unlike existing approaches, which proceed by shrinking the initial
set monotonically over time, our protocol first increases the set in a controlled
way to a specific size (which is independent of *s*), before it shrinks the set to a *poly*(log*n*) size.

In this paper, we study submodular function minimization in the adaptive complexity
model. Seminal work by Gr'otschel, Lovász, and Schrijver shows that with oracle access
to a function *f*, the problem of submodular minimization can be solved exactly with *poly*(*n*) queries to *f*. A long line of work has since then been dedicated to the acceleration of submodular
minimization. In particular, recent work obtains a (strongly) polynomial time algorithm
with Õ(*n*^{3}) query complexity. A natural way to accelerate computation is via parallelization,
though very little is known about the extent to which submodular minimization can
be parallelized.

A natural measure for the parallel runtime of a black-box optimization algorithm is
its adaptivity, as recently introduced in the context of submodular maximization.
Informally, the adaptivity of an algorithm is the number of sequential rounds it makes
when each round can execute polynomially-many function evaluations in parallel. In
the past two years there have been breakthroughs in the study of adaptivity for both
submodular maximization and convex minimization, in particular an exponential improvement
in the parallel running time of submodular maximization was obtained with a *O*(log*n*)-adaptive algorithm. Whether submodular minimization can enjoy, thanks to parallelization,
the same dramatic speedups as submodular maximization is unknown. To date, we do not
know of any polynomial time algorithm for solving submodular minimization whose adaptivity
is subquadratic in *n*.

We initiate the study of the adaptivity of submodular function minimization by giving
the first non-trivial lower bound for the parallel runtime of submodular minimization.
We show that there is no *o*(log*n*/loglog*n*)-adaptive algorithm with *poly*(*n*) queries which solves the problem of submodular minimization. This is the first adaptivity
lower bound for unconstrained submodular optimization (whether for maximization or
minimization) and the analysis relies on a novel and intricate construction of submodular
functions.

In large-data applications, it is desirable to design algorithms with a high degree
of parallelization. In the context of submodular optimization, adaptive complexity
has become a widely-used measure of an algorithm’s “sequentiality”. Algorithms in
the adaptive model proceed in rounds, and can issue polynomially many queries to a
function *f* in each round. The queries in each round must be independent, produced by a computation
that depends only on query results obtained in previous rounds.

In this work, we examine two fundamental variants of submodular maximization in the
adaptive complexity model: cardinality-constrained monotone maximization, and unconstrained
non-mono-tone maximization. Our main result is that an *r*-round algorithm for cardinality-constrained monotone maximization cannot achieve
an approximation factor better than 1 − 1/*e* − Ω(min{ 1/*r*, log^{2} *n*/*r*^{3} }), for any *r* < *n*^{c} (where *c*>0 is some constant). This is the first result showing that the number of rounds must
blow up polynomially large as we approach the optimal factor of 1−1/*e*.

For the unconstrained non-monotone maximization problem, we show a positive result:
For every instance, and every δ>0, either we obtain a (1/2−δ)-approximation in 1 round,
or a (1/2+Ω(δ^{2}))-approximation in *O*(1/δ^{2}) rounds. In particular (and in contrast to the cardinality-constrained case), there
cannot be an instance where (i) it is impossible to achieve an approximation factor
better than 1/2 regardless of the number of rounds, and (ii) it takes *r* rounds to achieve a factor of 1/2−*O*(1/*r*).

In the dynamic Single-Source Shortest Paths (SSSP) problem, we are given a graph *G*=(*V*,*E*) subject to edge insertions and deletions and a source vertex *s*∈ *V*, and the goal is to maintain the distance *d*(*s*,*t*) for all *t*∈ *V*.

Fine-grained complexity has provided strong lower bounds for exact partially dynamic SSSP and approximate fully dynamic SSSP [ESA’04, FOCS’14, STOC’15]. Thus much focus has been directed towards finding efficient partially dynamic (1+є)-approximate SSSP algorithms [STOC’14, ICALP’15, SODA’14, FOCS’14, STOC’16, SODA’17, ICALP’17, ICALP’19, STOC’19, SODA’20, SODA’20]. Despite this rich literature, for directed graphs there are no known deterministic algorithms for (1+є)-approximate dynamic SSSP that perform better than the classic ES-tree [JACM’81]. We present the first such algorithm.

We present a *deterministic* data structure for incremental SSSP in weighted directed graphs with total update
time Õ(*n*^{2} log*W*/є^{O(1)}) which is near-optimal for very dense graphs; here *W* is the ratio of the largest weight in the graph to the smallest. Our algorithm also
improves over the best known partially dynamic *randomized* algorithm for directed SSSP by Henzinger et al. [STOC’14, ICALP’15] if *m*=ω(*n*^{1.1}).

Complementing our algorithm, we provide improved conditional lower bounds. Henzinger
et al. [STOC’15] showed that under the OMv Hypothesis, the partially dynamic exact
*s*-*t* Shortest Path problem in undirected graphs requires amortized update or query time
*m*^{1/2−o(1)}, given polynomial preprocessing time. Under a new hypothesis about finding Cliques,
we improve the update and query lower bound for algorithms with polynomial preprocessing
time to *m*^{0.626−o(1)}. Further, under the *k*-Cycle hypothesis, we show that any partially dynamic SSSP algorithm with *O*(*m*^{2−є}) preprocessing time requires amortized update or query time *m*^{1−o(1)}, which is essentially optimal. All previous conditional lower bounds that come close
to our bound [ESA’04,FOCS’14] only held for “combinatorial” algorithms, while our
new lower bound does not make such restrictions.

Given a dynamic graph subject to insertions and deletions of edges, a natural question
is whether the graph presently admits a planar embedding. We give a deterministic
fully-dynamic algorithm for general graphs, running in amortized *O*(log^{3} *n*) time per edge insertion or deletion, that maintains a bit indicating whether or
not the graph is presently planar. This is an exponential improvement over the previous
best algorithm [Eppstein, Galil, Italiano, Spencer, 1996] which spends amortized *O*(√*n*) time per update.

We give the first fully dynamic algorithm which maintains a (1−є)-approximate densest
subgraph in worst-case time poly(log*n*, є^{−1}) per update. Dense subgraph discovery is an important primitive for many real-world
applications such as community detection, link spam detection, distance query indexing,
and computational biology. We approach the densest subgraph problem by framing its
dual as a graph orientation problem, which we solve using an augmenting path-like
adjustment technique. Our result improves upon the previous best approximation factor
of (1/4 − є) for fully dynamic densest subgraph [Bhattacharya et. al., STOC ‘15].
We also extend our techniques to solving the problem on vertex-weighted graphs with
similar runtimes.

Additionally, we reduce the (1−є)-approximate densest subgraph problem on directed
graphs to *O*(log*n*/є) instances of (1−є)-approximate densest subgraph on vertex-weighted graphs. This
reduction, together with our algorithm for vertex-weighted graphs, gives the first
fully-dynamic algorithm for directed densest subgraph in worst-case time poly(log*n*, є^{−1}) per update. Moreover, combined with a near-linear time algorithm for densest subgraph
[Bahmani et. al., WAW ‘14], this gives the first near-linear time algorithm for directed
densest subgraph.

We present a new dynamic matching sparsification scheme. From this scheme we derive
a framework for dynamically rounding fractional matchings against *adaptive adversaries*. Plugging in known dynamic fractional matching algorithms into our framework, we
obtain numerous randomized dynamic matching algorithms which work against adaptive
adversaries. In contrast, all previous randomized algorithms for this problem assumed
a weaker, *oblivious*, adversary. Our dynamic algorithms against adaptive adversaries include, for any
constant є >0, a (2+є)-approximate algorithm with constant update time or polylog
worst-case update time, as well as (2−δ)-approximate algorithms in bipartite graphs
with arbitrarily-small polynomial update time. All these results achieve *polynomially* better update time to approximation trade-offs than previously known to be achievable
against adaptive adversaries.

We develop a new technique for proving concentration inequalities which relate between the variance and influences of Boolean functions.

Using this technique, we first settle a conjecture of Talagrand, proving that ∫ ⎧
⎨ ⎩−1,1⎫ ⎬ ⎭^{n}√*h*_{f}⎛ ⎝*x*⎞ ⎠*d*µ≥ *C*·⎛ ⎝*f*⎞ ⎠·⎛ ⎜ ⎜ ⎜ ⎝log⎛ ⎜ ⎜ ⎜ ⎝1∑_{i}^{2}⎛ ⎝*f*⎞ ⎠⎞ ⎟ ⎟ ⎟ ⎠⎞ ⎟ ⎟ ⎟ ⎠1/2 , where *h*_{f}(*x*) is the number of edges at *x* along which *f* changes its value, and _{i}(*f*) is the influence of the *i*-th coordinate.

Second, we strengthen several classical inequalities concerning the influences of
a Boolean function, showing that near-maximizers must have large vertex boundaries.
An inequality due to Talagrand states that for a Boolean function *f*, (*f*)≤ *C*∑_{i=1}^{n}_{i}(*f*)/1+log(1/_{i}(*f*)). We give a lower bound for the size of the vertex boundary of functions saturating
this inequality. As a corollary, we show that for sets that satisfy the edge-isoperimetric
inequality or the Kahn-Kalai-Linial inequality up to a constant, a constant proportion
of the mass is in the inner vertex boundary.

Lastly, we improve a quantitative relation between influences and noise stability given by Keller and Kindler.

Our proofs rely on techniques based on stochastic calculus, and bypass the use of hypercontractivity common to previous proofs.

A function *f*∶{0,1}^{n}→ {0,1} is called an approximate AND-homomorphism if choosing *x*,*y*∈*n* uniformly at random, we have that *f*(*x*∧ *y*) = *f*(*x*)∧ *f*(*y*) with probability at least 1−ε, where *x*∧ *y* = (*x*_{1}∧ *y*_{1},…,*x*_{n}∧ *y*_{n}). We prove that if *f*∶ {0,1}^{n} → {0,1} is an approximate AND-homomorphism, then *f* is δ-close to either a constant function or an AND function, where δ(ε) → 0 as ε→
0. This improves on a result of Nehama, who proved a similar statement in which δ
depends on *n*.

Our theorem implies a strong result on judgement aggregation in computational social
choice. In the language of social choice, our result shows that if *f* is ε-close to satisfying judgement aggregation, then it is δ(ε)-close to an oligarchy
(the name for the AND function in social choice theory). This improves on Nehama’s
result, in which δ decays polynomially with *n*.

Our result follows from a more general one, in which we characterize approximate solutions
to the eigenvalue equation *f* = λ *g*, where is the downwards noise operator *f*(*x*) = _{y}[*f*(*x* ∧ *y*)], *f* is [0,1]-valued, and *g* is {0,1}-valued. We identify all exact solutions to this equation, and show that
any approximate solution in which *f* and λ *g* are close is close to an exact solution.

A major challenge in complexity theory is to explicitly construct functions that have
small correlation with low-degree polynomials over F_{2}. We introduce a new technique to prove such correlation bounds with F_{2} polynomials. Using this technique, we bound the correlation of an XOR of Majorities
with constant degree polynomials. In fact, we prove a more general XOR lemma that
extends to arbitrary resilient functions. We conjecture that the technique generalizes
to higher degree polynomials as well.

A key ingredient in our new approach is a structural result about the Fourier spectrum
of low degree polynomials over F_{2}. We show that for any *n*-variate polynomial *p* over F_{2} of degree at most *d*, there is a small set *S* ⊂ [*n*] of variables, such that almost all of the Fourier mass of *p* lies on Fourier coefficients that intersect with *S*. In fact our result is more general, and finds such a set *S* for any low-dimensional subspace of polynomials. This generality is crucial in deriving
the new XOR lemmas.

A decision list is an ordered list of rules. Each rule is specified by a term, which is a conjunction of literals, and a value. Given an input, the output of a decision list is the value corresponding to the first rule whose term is satisfied by the input. Decision lists generalize both CNFs and DNFs, and have been studied both in complexity theory and in learning theory.

The size of a decision list is the number of rules, and its width is the maximal number of variables in a term. We prove that decision lists of small width can always be approximated by decision lists of small size, where we obtain sharp bounds. This in particular resolves a conjecture of Gopalan, Meka and Reingold (Computational Complexity, 2013) on DNF sparsification.

An ingredient in our proof is a new random restriction lemma, which allows to analyze how DNFs (and more generally, decision lists) simplify if a small fraction of the variables are fixed. This is in contrast to the more commonly used switching lemma, which requires most of the variables to be fixed.

We define the notion of *one-shot signatures*, which are signatures where any secret key can be used to sign only a single message,
and then self-destructs. While such signatures are of course impossible classically,
we construct one-shot signatures using *quantum no-cloning*. In particular, we show that such signatures exist relative to a classical oracle,
which we can then heuristically obfuscate using known indistinguishability obfuscation
schemes.

We show that one-shot signatures have numerous applications for hybrid quantum/classical cryptographic tasks, where all communication is required to be classical, but local quantum operations are allowed. Applications include one-time signature tokens, quantum money with classical communication, decentralized blockchain-less cryptocurrency, signature schemes with unclonable secret keys, non-interactive certifiable min-entropy, and more. We thus position one-shot signatures as a powerful new building block for novel quantum cryptographic protocols.

We construct a constant-round zero-knowledge classical argument for NP secure against quantum attacks. We assume the existence of Quantum Fully-Homomorphic Encryption and other standard primitives, known based on the Learning with Errors Assumption for quantum algorithms. As a corollary, we also obtain a constant-round zero-knowledge quantum argument for QMA.

At the heart of our protocol is a new *no-cloning* non-black-box simulation technique.

A secret-sharing scheme allows to distribute a secret *s* among *n* parties such that only some predefined “authorized” sets of parties can reconstruct
the secret, and all other “unauthorized” sets learn nothing about *s*. For over 30 years, it was known that any (monotone) collection of authorized sets
can be realized by a secret-sharing scheme whose shares are of size 2^{n−o(n)} and until recently no better scheme was known. In a recent breakthrough, Liu and
Vaikuntanathan (STOC 2018) have reduced the share size to 2^{0.994n+o(n)}, which was later improved to 2^{0.892n+o(n)} by Applebaum et al. (EUROCRYPT 2019).

In this paper we improve the exponent of general secret-sharing down to 0.637. For the special case of linear secret-sharing schemes, we get an exponent of 0.762 (compared to 0.942 of Applebaum et al.). As our main building block, we introduce a new robust variant of conditional disclosure of secrets (robust CDS) that achieves unconditional security even under bounded form of re-usability. We show that the problem of general secret-sharing reduces to robust CDS with sub-exponential overhead and derive our main result by implementing robust CDS with a non-trivial exponent. The latter construction follows by presenting a general immunization procedure that turns standard CDS into a robust CDS.

This paper shows several connections between data structure problems and cryptography against preprocessing attacks. Our results span data structure upper bounds, cryptographic applications, and data structure lower bounds, as summarized next.

First, we apply Fiat-Naor inversion, a technique with cryptographic origins, to obtain
a data structure upper bound. In particular, our technique yields a suite of algorithms
with space *S* and (online) time *T* for a preprocessing version of the *N*-input 3SUM problem where *S*^{3}· *T* = *O*(*N*^{6}). This disproves a strong conjecture (Goldstein et al., WADS 2017) that there is
no data structure that solves this problem for *S*=*N*^{2−δ} and *T* = *N*^{1−δ} for any constant δ>0.

Secondly, we show equivalence between lower bounds for a broad class of (static) data
structure problems and one-way functions in the random oracle model that resist a
very strong form of preprocessing attack. Concretely, given a random function *F*: [*N*] → [*N*] (accessed as an oracle) we show how to compile it into a function *G*^{F}: [*N*^{2}] → [*N*^{2}] which resists *S*-bit preprocessing attacks that run in query time *T* where *ST*=*O*(*N*^{2−ε}) (assuming a corresponding data structure lower bound on 3SUM). In contrast, a classical
result of Hellman tells us that *F* itself can be more easily inverted, say with *N*^{2/3}-bit preprocessing in *N*^{2/3} time. We also show that much stronger lower bounds follow from the hardness of kSUM.
Our results can be equivalently interpreted as security against adversaries that are
very non-uniform, or have large auxiliary input, or as security in the face of a powerfully
backdoored random oracle.

Thirdly, we give non-adaptive lower bounds for 3SUM which match the best known lower bounds for static data structure problems. Moreover, we show that our lower bound generalizes to a range of geometric problems, such as three points on a line, polygon containment, and others.

We present the first *m* polylog(*n*) work, polylog(*n*) time algorithm in the PRAM model that computes (1+є)-approximate single-source shortest
paths on weighted, undirected graphs. This improves upon the breakthrough result of
Cohen [JACM’00] that achieves *O*(*m*^{1+є0}) work and polylog(*n*) time. While most previous approaches, including Cohen’s, leveraged the power of
hopsets, our algorithm builds upon the recent developments in *continuous optimization*, studying the shortest path problem from the lens of the closely-related *minimum transshipment* problem. To obtain our algorithm, we demonstrate a series of near-linear work, polylogarithmic-time
reductions between the problems of approximate shortest path, approximate transshipment,
and ℓ_{1}-embeddings, and establish a recursive algorithm that cycles through the three problems
and reduces the graph size on each cycle. As a consequence, we also obtain faster
parallel algorithms for approximate transshipment and ℓ_{1}-embeddings with polylogarithmic distortion. The minimum transshipment algorithm in
particular improves upon the previous best *m*^{1+o(1)} work sequential algorithm of Sherman [SODA’17].

To improve readability, the paper is almost entirely self-contained, save for several staple theorems in algorithms and combinatorics.

We present a (1+ε)-approximate parallel algorithm for computing shortest paths in
undirected graphs, achieving *poly*(log*n*) depth and *m**poly*(log*n*) work for *n*-nodes *m*-edges graphs. Although sequential algorithms with (nearly) optimal running time have
been known for several decades, near-optimal parallel algorithms have turned out to
be a much tougher challenge. For (1+ε)-approximation, all prior algorithms with *poly*(log*n*) depth perform at least Ω(*mn*^{c}) work for some constant *c*>0. Improving this long-standing upper bound obtained by Cohen (STOC’94) has been
open for 25 years.

We develop several new tools of independent interest. One of them is a new notion
beyond hopsets — low hop emulator — a *poly*(log*n*)-approximate emulator graph in which every shortest path has at most *O*(loglog*n*) hops (edges). Direct applications of the low hop emulators are parallel algorithms
for *poly*(log*n*)-approximate single source shortest path (SSSP), Bourgain’s embedding, metric tree
embedding, and low diameter decomposition, all with *poly*(log*n*) depth and *m**poly*(log*n*) work.

To boost the approximation ratio to (1+ε), we introduce compressible preconditioners
and apply it inside Sherman’s framework (SODA’17) to solve the more general problem
of uncapacitated minimum cost flow (a.k.a., transshipment problem). Our algorithm
computes a (1+ε)-approximate uncapacitated minimum cost flow in *poly*(log*n*) depth using *m**poly*(log*n*) work. As a consequence, it also improves the state-of-the-art sequential running
time from *m*· 2^{O(√logn)} to *m**poly*(log*n*).

The approximate single-source shortest-path problem is as follows: given a graph with
nonnegative edge weights and a designated source vertex *s*, return estimates of the distances from *s* to each other vertex such that the estimate falls between the true distance and (1+є)
times the distance. This paper provides the first nearly work-efficient parallel algorithm
with sublinear span (also called depth) for the approximate shortest-path problem
on *directed* graphs. Specifically, for constant є and polynomially-bounded edge weights, our algorithm
has work Õ(*m*) and span *n*^{1/2+o(1)}. Several algorithms were previously known for the case of *undirected* graphs, but none of the techniques seem to translate to the directed setting.

The main technical contribution is the first nearly linear-work algorithm for constructing
hopsets on directed graphs. A (β,є)-hopset is a set of weighted edges (sometimes called
shortcuts) which, when added to the graph, admit β-hop paths with weight no more than
(1+є) times the true shortest-path distances. There is a simple sequential algorithm
that takes as input a directed graph and produces a linear-cardinality hopset with
β=Õ(√*n*), but its running time is quite high—specifically Õ(*m*√*n*). Our algorithm is the first more efficient algorithm that produces a directed hopset
with similar characteristics. Specifically, our sequential algorithm runs in Õ(*m*) time and constructs a hopset with Õ(*n*) edges and β = *n*^{1/2+o(1)}. A parallel version of the algorithm has work Õ(*m*) and span *n*^{1/2+o(1)}.

We present a simple polylogarithmic-time deterministic distributed algorithm for network
decomposition. This improves on a celebrated 2^{O(√logn)}-time algorithm of Panconesi and Srinivasan [STOC’92] and settles a central and long-standing
question in distributed graph algorithms. It also leads to the first polylogarithmic-time
deterministic distributed algorithms for numerous other problems, hence resolving
several well-known and decades-old open problems, including Linial’s question about
the deterministic complexity of maximal independent set [FOCS’87; SICOMP’92]—which
had been called the most outstanding problem in the area.

The main implication is a more general distributed derandomization theorem: Put together
with the results of Ghaffari, Kuhn, and Maus [STOC’17] and Ghaffari, Harris, and Kuhn
[FOCS’18], our network decomposition implies that *P*-*RLOCAL* = *P*-*LOCAL*.

That is, for any problem whose solution can be checked deterministically in polylogarithmic-time, any polylogarithmic-time randomized algorithm can be derandomized to a polylogarithmic-time deterministic algorithm. Informally, for the standard first-order interpretation of efficiency as polylogarithmic-time, distributed algorithms do not need randomness for efficiency.

By known connections, our result leads also to substantially faster randomized distributed algorithms for a number of well-studied problems including (Δ+1)-coloring, maximal independent set, and Lovász Local Lemma, as well as massively parallel algorithms for (Δ+1)-coloring.

We introduce a set of techniques that allow for efficiently generating many independent
random walks in the Massively Parallel Computation (MPC) model with space per machine
strongly sublinear in the number of vertices. In this space-per-machine regime, many
natural approaches to graph problems struggle to overcome the Θ(log *n*) MPC round complexity barrier, where *n* is the number of vertices. Our techniques enable achieving this for PageRank—one
of the most important applications of random walks—even in more challenging directed
graphs, as well as for approximate bipartiteness and expansion testing.

In the undirected case, we start our random walks from the stationary distribution,
which implies that we approximately know the empirical distribution of their next
steps. This allows for preparing continuations of random walks in advance and applying
a doubling approach. As a result we can generate multiple random walks of length *l* in Θ(log *l*) rounds on MPC. Moreover, we show that under the popular 1-vs.-2-Cycles conjecture,
this round complexity is asymptotically tight.

For directed graphs, our approach stems from our treatment of the PageRank Markov chain. We first compute the PageRank for the undirected version of the input graph and then slowly transition towards the directed case, considering convex combinations of the transition matrices in the process.

For PageRank, we achieve the following round complexities for damping factor equal
to 1 − є: in *O*(log log *n* + log 1 / є) rounds for undirected graphs (with Õ(*m* / є^{2}) total space), in Õ(log^{2} log *n* + log^{2} 1/є) rounds for directed graphs (with Õ((*m*+*n*^{1+o(1)}) / *poly*(є)) total space). The round complexity of our result for computing PageRank has only
logarithmic dependence on 1/є. We use this to show that our PageRank algorithm can
be used to construct directed length-*l* random walks in *O*(log^{2} log *n* + log^{2} *l*) rounds with Õ((*m*+*n*^{1+o(1)}) *poly*(*l*)) total space. More specifically, by setting є = Θ(1 / *l*), a length-*l* PageRank walk with constant probability contains no random jump, and hence is a directed
random walk.

We present a quasi-polynomial time classical algorithm that estimates the partition
function of quantum many-body systems at temperatures above the thermal phase transition
point. It is known that in the worst case, the same problem is NP-hard below this
point. Together with our work, this shows that the transition in the phase of a quantum
system is also accompanied by a transition in the hardness of approximation. We also
show that in a system of *n* particles above the phase transition point, the correlation between two observables
whose distance is at least Ω(log*n*) decays exponentially. We can improve the factor of log*n* to a constant when the Hamiltonian has commuting terms or is on a 1D chain. The key
to our results is a characterization of the phase transition and the critical behavior
of the system in terms of the complex zeros of the partition function. Our work extends
a seminal work of Dobrushin and Shlosman on the equivalence between the decay of correlations
and the analyticity of the free energy in classical spin models. On the algorithmic
side, our result extends the scope of a recent approach due to Barvinok for solving
classical counting problems to quantum many-body systems.

We present an algorithmic framework for quantum-inspired classical algorithms on close-to-low-rank
matrices, generalizing the series of results started by Tang’s breakthrough quantum-inspired
algorithm for recommendation systems [STOC’19]. Motivated by quantum linear algebra
algorithms and the quantum singular value transformation (SVT) framework of Gilyén
et al. [STOC’19], we develop classical algorithms for SVT that run in time independent
of input dimension, under suitable quantum-inspired sampling assumptions. Our results
give compelling evidence that in the corresponding QRAM data structure input model,
quantum SVT does not yield exponential quantum speedups. Since the quantum SVT framework
generalizes essentially all known techniques for quantum linear algebra, our results,
combined with sampling lemmas from previous work, suffices to generalize all recent
results about dequantizing quantum machine learning algorithms. In particular, our
classical SVT framework recovers and often improves the dequantization results on
recommendation systems, principal component analysis, supervised clustering, support
vector machines, low-rank regression, and semidefinite program solving. We also give
additional dequantization results on low-rank Hamiltonian simulation and discriminant
analysis. Our improvements come from identifying the key feature of the quantum-inspired
input model that is at the core of all prior quantum-inspired results: ℓ^{2}-norm sampling can approximate matrix products in time independent of their dimension.
We reduce all our main results to this fact, making our exposition concise, self-contained,
and intuitive.

A relational bipartite communication problem is presented that has an efficient quantum simultaneous-messages protocol, but no efficient classical two-way protocol.

A quantum walk algorithm can detect the presence of a marked vertex on a graph quadratically faster than the corresponding random walk algorithm (Szegedy, FOCS 2004). However, quantum algorithms that actually find a marked element quadratically faster than a classical random walk were only known for the special case when the marked set consists of just a single vertex, or in the case of some specific graphs. We present a new quantum algorithm for finding a marked vertex in any graph, with any set of marked vertices, that is (up to a log factor) quadratically faster than the corresponding classical random walk, resolving a question that had been open for 15 years.

We give new characterizations of the sample complexity of answering linear queries (statistical queries) in the local and central models of differential privacy:

(1) In the non-interactive local model, we give the first approximate characterization of the sample complexity. Informally our bounds are tight to within polylogarithmic factors in the number of queries and desired accuracy. Our characterization extends to agnostic learning in the local model.

(2) In the central model, we give a characterization of the sample complexity in the high-accuracy regime that is analogous to that of Nikolov, Talwar, and Zhang (STOC 2013), but is both quantitatively tighter and has a dramatically simpler proof.

Our lower bounds apply equally to the empirical and population estimation problems. In both cases, our characterizations show that a particular factorization mechanism is approximately optimal, and the optimal sample complexity is bounded from above and below by well studied factorization norms of a matrix associated with the queries.

We study differentially private (DP) algorithms for *stochastic convex optimization*: the problem of minimizing the population loss given i.i.d. samples from a distribution
over convex loss functions. A recent work of Bassily et al. (2019) has established
the optimal bound on the excess population loss achievable given *n* samples. Unfortunately, their algorithm achieving this bound is relatively inefficient:
it requires *O*(min{*n*^{3/2}, *n*^{5/2}/*d*}) gradient computations, where *d* is the dimension of the optimization problem.

We describe two new techniques for deriving DP convex optimization algorithms both
achieving the optimal bound on excess loss and using *O*(min{*n*, *n*^{2}/*d*}) gradient computations. In particular, the algorithms match the running time of
the optimal non-private algorithms. The first approach relies on the use of variable
batch sizes and is analyzed using the privacy amplification by iteration technique
of Feldman et al. (2018). The second approach is based on a general reduction to the
problem of localizing an approximately optimal solution with differential privacy.
Such localization, in turn, can be achieved using existing (non-private) uniformly
stable optimization algorithms. As in the earlier work, our algorithms require a mild
smoothness assumption. We also give a linear-time optimal algorithm for the strongly
convex case, as well as a faster algorithm for the non-smooth case.

Local differential privacy (LDP) is a model where users send privatized data to an untrusted central server whose goal it to solve some data analysis task. In the non-interactive version of this model the protocol consists of a single round in which a server sends requests to all users then receives their responses. This version is deployed in industry due to its practical advantages and has attracted significant research interest.

Our main result is an exponential lower bound on the number of samples necessary to solve the standard task of learning a large-margin linear separator in the non-interactive LDP model. Via a standard reduction this lower bound implies an exponential lower bound for stochastic convex optimization and specifically, for learning linear models with a convex, Lipschitz and smooth loss. These results answer the questions posed by Smith, Thakurta, and Upadhyay (IEEE Symposium on Security and Privacy 2017) and Daniely and Feldman (NeurIPS 2019). Our lower bound relies on a new technique for constructing pairs of distributions with nearly matching moments but whose supports can be nearly separated by a large margin hyperplane. These lower bounds also hold in the model where communication from each user is limited and follow from a lower bound on learning using non-adaptive statistical queries.

In the committee selection problem, we are given *m* candidates, and *n* voters. Candidates can have different weights. A committee is a subset of candidates,
and its weight is the sum of weights of its candidates. Each voter expresses an ordinal
ranking over all possible committees. The only assumption we make on preferences is
monotonicity: If *S* ⊆ *S*′ are two committees, then any voter weakly prefers *S*′ to *S*.

We study a general notion of group fairness via stability: A committee of given total
weight *K* is stable if no coalition of voters can deviate and choose a committee of proportional
weight, so that all these voters strictly prefer the new committee to the existing
one. Extending this notion to approximation, for parameter *c* ≥ 1, a committee *S* of weight *K* is said to be *c*-approximately stable if for any other committee *S*′ of weight *K*′, the fraction of voters that strictly prefer *S*′ to *S* is strictly less than *c* *K*′/*K*. When *c* = 1, this condition is equivalent to classical core stability.

The question we ask is: Does a *c*-approximately stable committee of weight at most any given value *K* always exist for constant *c*? It is relatively easy to show that there exist monotone preferences for which *c* ≥ 2. However, even for simple and widely studied preference structures, a non-trivial
upper bound on *c* has been elusive.

In this paper, we show that *c* = *O*(1) for all monotone preference structures. Our proof proceeds via showing an existence
result for a randomized notion of stability, and iteratively rounding the resulting
fractional solution.

In the *k*-cut problem, we are given an edge-weighted graph and want to find the least-weight
set of edges whose deletion breaks the graph into *k* connected components. Algorithms due to Karger-Stein and Thorup showed how to find
such a minimum *k*-cut in time approximately *O*(*n*^{2k−2}). The best lower bounds come from conjectures about the solvability of the *k*-clique problem and a reduction from *k*-clique to *k*-cut, and show that solving *k*-cut is likely to require time Ω(*n*^{k}). Our recent results have given special-purpose algorithms that solve the problem
in time *n*^{1.98k + O(1)}, and ones that have better performance for special classes of graphs (e.g., for small
integer weights).

In this work, we resolve the problem for general graphs, by showing that for any fixed
*k* ≥ 2, the Karger-Stein algorithm outputs any fixed minimum *k*-cut with probability at least *O*(*n*^{−k}), where *O*(·) hides a 2^{O(lnlnn)2} factor. This also gives an extremal bound of *O*(*n*^{k}) on the number of minimum *k*-cuts in an *n*-vertex graph and an algorithm to compute a minimum *k*-cut in similar runtime. Both are tight up to *O*(1) factors.

The first main ingredient in our result is a fine-grained analysis of how the graph
shrinks—and how the average degree evolves—under the Karger-Stein process. The second
ingredient is an extremal result bounding the number of cuts of size at most (2−δ)
*OPT*/*k*, using the Sunflower lemma.

We improve the time for approximating network (un)reliability to (*n*^{2}). We do so not with a new algorithm, but with a deeper analysis and tweaking of algorithms
from our previous work. In particular, we show that once a graph’s failure probability
shrinks below 1/2, the graph rapidly transitions to a regime where even the *expected number* of cut failures is small, and in fact is almost exactly the same as the graph’s failure
probability. That is, we are very unlikely to ever see more than one cut fail. This
lets us treat these cut failures as essentially independent, making it easier to estimate
their likelihood. The contribution of this paper is not just the improved time bound,
but also this clearer understanding of the evolution of a graph’s reliability. Our
results rely on some new methods for analyzing the distribution of cut failures conditioned
on the failure of a particular cut, as well as new insights into the evolution of
a graph’s connectivity as edges are randomly added over time. Some of our results
apply more broadly, to all monotone reliability systems.

Consider the following *2-respecting min-cut* problem. Given any weighted graph *G* and its spanning tree *T*, find the minimum cut among the cuts that contain at most two edges in *T*. This problem is an important subroutine in Karger’s celebrated randomized near-linear-time
min-cut algorithm [STOC’96]. We present a new approach for this problem which can
be easily implemented in many settings, leading to the following randomized min-cut
algorithms for weighted graphs.

– An *O*(*m*log^{2} *n*/loglog*n* + *n*log^{6} *n*)-time sequential algorithm improving Karger’s long-standing *O*(*m* log^{3} *n*) and *O*(*n*log^{6} *n* + *m*log^{2} *n*log(*n*^{2}/*m*)/loglog*n*) bounds when the input graph is not extremely sparse or dense. Improvements over
Karger’s bounds were previously known only under a rather strong assumption that the
input graph is *simple* (unweighted without parallel edges) [Henzinger, Rao, Wang, SODA’17; Ghaffari, Nowicki,
Thorup, SODA’20]. For unweighted graphs (possibly with parallel edges) and using bit
operations, our bound can be further improved to *O*(*m*log^{1.5} *n*/loglog*n* + *n*log^{6} *n*).

– An algorithm that requires Õ(*n*) *cut queries* to compute the min-cut of a weighted graph: This answers an open problem by Rubinstein,
Schramm, and Weinberg [ITCS’18], who obtained a similar bound for simple graphs. Our
bound is tight up to polylogarithmic factors. – A *streaming* algorithm that requires Õ(*n*) space and *O*(log*n*) passes to compute the min-cut: The only previous non-trivial exact min-cut algorithm
in this setting is the 2-pass Õ(*n*)-space algorithm on simple graphs [Rubinstein et al., ITCS’18] (observed by Assadi,
Chen, and Khanna [STOC’19]).

Our approach exploits some cute structural properties so that it only needs to compute
the values of Õ(*n*) cuts corresponding to removing Õ(*n*) pairs of tree edges, an operation that can be done quickly in many settings. This
is in contrast to the techniques used by Karger and Lovett-Sandlund to solve 2-respecting
min-cut where information about many more cuts is computed, stored in and accessed
from sophisticated data-structures.

For every constant *d* ≥ 3 and є > 0, we give a deterministic poly(*n*)-time algorithm that outputs a *d*-regular graph on Θ(*n*) vertices that is є-near-Ramanujan; i.e., its eigenvalues are bounded in magnitude
by 2√*d*−1 + є (excluding the single trivial eigenvalue of *d*).

We give a complete answer to the following basic question: ”What is the maximal fraction
of deletions or insertions tolerable by *q*-ary list-decodable codes with non-vanishing information rate?”

This question has been open even for binary codes, including the restriction to the binary insertion-only setting, where the best-known result was that a γ≤ 0.707 fraction of insertions is tolerable by some binary code family.

For any desired є>0, we construct a family of binary codes of positive rate which can be efficiently list-decoded from any combination of γ fraction of insertions and δ fraction of deletions as long as γ+2δ≤ 1−ε. On the other hand, for any γ, δ with γ+2δ=1 list-decoding is impossible. Our result thus precisely characterizes the feasibility region of binary list-decodable codes for insertions and deletions.

We further generalize our result to codes over any finite alphabet of size *q*. Surprisingly, our work reveals that the feasibility region for *q*>2 is not the natural generalization of the binary bound above. We provide tight upper
and lower bounds that precisely pin down the feasibility region, which turns out to
have a (*q*−1)-piece-wise linear boundary whose *q* corner-points lie on a quadratic curve.

The main technical work in our results is proving the existence of code families of sufficiently large size with good list-decoding properties for any combination of δ,γ within the claimed feasibility region. We achieve this via an intricate analysis of codes introduced by [Bukh, Ma; SIAM J. Discrete Math; 2014]. Finally, we give a simple yet powerful concatenation scheme for list-decodable insertion-deletion codes which transforms any such (non-efficient) code family (with information rate zero) into an efficiently decodable code family with constant rate.

List-decoding of Reed-Solomon (RS) codes beyond the so called Johnson radius has been one of the main open questions in coding theory and theoretical computer science since the work of Guruswami and Sudan. It is now known by the work of Rudra and Wootters, using techniques from high dimensional probability, that over large enough alphabets there exist RS codes that are indeed list-decodable beyond this radius.

In this paper we take a more combinatorial approach that allows us to determine the precise relation (up to the exact constant) between the decoding radius and the list size. We prove a generalized Singleton bound for a given list size, and conjecture that the bound is tight for most RS codes over large enough finite fields. We also show that the conjecture holds true for list sizes 2 and 3, and as a by product show that most RS codes with a rate of at least 1/9 are list-decodable beyond the Johnson radius. Lastly, we give the first explicit construction of such RS codes. The main tools used in the proof are a new type of linear dependency between codewords of a code that are contained in a small Hamming ball, and the notion of cycle space from Graph Theory. Both of them have not been used before in the context of list-decoding.

Let *W* be a binary-input memoryless symmetric (BMS) channel with Shannon capacity *I*(*W*) and fix any α > 0. We construct, for any sufficiently small δ > 0, binary linear
codes of block length *O*(1/δ^{2+α}) and rate *I*(*W*)−δ that enable reliable communication on *W* with quasi-linear time encoding and decoding. Shannon’s noisy coding theorem established
the *existence* of such codes (without efficient constructions or decoding) with block length *O*(1/δ^{2}). This quadratic dependence on the gap δ to capacity is known to be the best possible.
Our result thus yields a constructive version of Shannon’s theorem with near-optimal
convergence to capacity as a function of the block length. This resolves a central
theoretical challenge associated with the attainment of Shannon capacity. Previously
such a result was only known for the binary erasure channel.

Our codes are a variant of Arikan’s polar codes based on multiple carefully constructed local kernels, one for each intermediate channel that arises in the decoding. A crucial ingredient in the analysis is a strong converse of the noisy coding theorem when communicating using random linear codes on arbitrary BMS channels. Our converse theorem shows extreme unpredictability of even a single message bit for random coding at rates slightly above capacity.

Interactive error correcting codes can protect interactive communication protocols against a constant fraction of adversarial errors, while incurring only a constant multiplicative overhead in the total communication. What is the maximum fraction of errors that such codes can protect against?

For the non-adaptive channel, where the parties must agree in advance on the order
in which they communicate, Braverman and Rao prove that the maximum error resilience
is 1/4 (STOC, 2011). Ghaffari, Haeupler, and Sudan (STOC, 2014) consider the *adaptive* channel, where the order in which the parties communicate may not be fixed, and give
a clever protocol that is resilient to a 2/7 fraction of errors. This was believed
to be optimal.

We revisit this result, and show how to overcome the 2/7 barrier. Specifically, we show that, over the adaptive channel, every two-party communication protocol can be converted to a protocol that is resilient to 7/24 > 2/7 fraction of errors with only a constant multiplicative overhead to the total communication. The protocol is obtained by a novel implementation of a feedback mechanism over the adaptive channel.

Estimating the normalizing constant of an unnormalized probability distribution has
important applications in computer science, statistical physics, machine learning,
and statistics. In this work, we consider the problem of estimating the normalizing
constant *Z*=∫_{ℝd} *e*^{−f(x)} *d**x* to within a multiplication factor of 1 ± ε for a µ-strongly convex and *L*-smooth function *f*, given query access to *f*(*x*) and ∇ *f*(*x*). We give both algorithms and lowerbounds for this problem. Using an annealing algorithm
combined with a multilevel Monte Carlo method based on underdamped Langevin dynamics,
we show that *O*(*d*^{4/3}κ + *d*^{7/6}κ^{7/6}/ε^{2}) queries to ∇ *f* are sufficient, where κ= *L* / µ is the condition number. Moreover, we provide an information theoretic lowerbound,
showing that at least *d*^{1−o(1)}/ε^{2−o(1)} queries are necessary. This provides a first nontrivial lowerbound for the problem.

We consider the problem of learning a mixture of linear regressions (MLRs). An MLR
is specified by *k* nonnegative mixing weights *p*_{1}, …, *p*_{k} summing to 1, and *k* unknown regressors *w*_{1},...,*w*_{k}∈ℝ^{d}. A sample from the MLR is drawn by sampling *i* with probability *p*_{i}, then outputting (*x*, *y*) where *y* = ⟨ *x*, *w*_{i} ⟩ + η, where η∼(0,ς^{2}) for noise rate ς. Mixtures of linear regressions are a popular generative model
and have been studied extensively in machine learning and theoretical computer science.
However, all previous algorithms for learning the parameters of an MLR require running
time and sample complexity scaling exponentially with *k*.

In this paper, we give the first algorithm for learning an MLR that runs in time which
is sub-exponential in *k*. Specifically, we give an algorithm which runs in time *O*(*d*)·exp(*O*(√*k*)) and outputs the parameters of the MLR to high accuracy, even in the presence of
nontrivial regression noise. We demonstrate a new method that we call “Fourier moment
descent,” which uses univariate density estimation and low-degree moments of the Fourier
transform of suitable univariate projections of the MLR to iteratively refine our
estimate of the parameters. To the best of our knowledge, these techniques have never
been used in the context of high dimensional distribution learning, and may be of
independent interest. We also show that our techniques can be used to give a sub-exponential
time algorithm for a natural hard instance of the subspace clustering problem, which
we call learning mixtures of hyperplanes.

We study polynomial-time algorithms for linear regression and covariance estimation in the absence of strong (Gaussian) assumptions on the underlying distributions of samples, making assumptions instead about only finitely-many moments. We focus on how many samples are required to perform estimation and regression with high accuracy and exponentially-good success probability in the face of heavy-tailed data.

For covariance estimation, linear regression, and several other problems in high-dimensional
statistics, estimators have recently been constructed whose sample complexities and
rates of statistical error match what is possible when the underlying distribution
is Gaussian, but known algorithms for these estimators require exponential time. We
narrow the gap between the Gaussian and heavy-tailed settings for polynomial-time
estimators with: (a) a polynomial-time estimator which takes *n* samples from a *d*-dimensional random vector *X* with covariance Σ and produces Σ such that in spectral norm ||Σ − Σ ||_{2} ≤ Õ(*d*^{3/4}/√*n*) w.p. 1−2^{−d} where the information-theoretically optimal error bound is Õ(√*d*/*n*), while previous approaches to polynomial-time algorithms were stuck at Õ(*d*/√*n*) and (b) a polynomial-time algorithm which takes *n* samples (*X*_{i},*Y*_{i}) where *Y*_{i} = ⟨ *u*,*X*_{i} ⟩ + _{i} where both *X* and have a constant number of bounded moments and produces û such that the loss ||*u* − û||^{2} ≤ *O*(*d*/*n*) w.p. 1−2^{−d} for any *n* ≥ *d*^{3/2} log(*d*). This (information-theoretically optimal) error is achieved by inefficient algorithms
for any *n* ≫ *d*, while previous approaches to polynomial-time algorithms suffer loss Ω(*d*^{2}/*n*) and require *n* ≫ *d*^{2}.

Our algorithms make crucial use of degree-8 sum-of-squares semidefinite programs.
Both apply to any *X* which has constantly-many *certifiably hypercontractive moments*. We offer preliminary evidence that improving on these rates of error in polynomial
time is not possible in the *median of means* framework our algorithms employ. Our work introduces new techniques to high-probability
estimation, and suggests numerous new algorithmic questions in the following vein:
*when is it computationally feasible to do statistics in high dimensions with Gaussian-style
errors when data is far from Gaussian*?

We consider the following basic inference problem: there is an unknown high-dimensional
vector *w* ∈ ℝ^{n}, and an algorithm is given access to labeled pairs (*x*,*y*) where *x* ∈ ℝ^{n} is a measurement and *y* = *w* · *x* + *noise*. What is the complexity of deciding whether the target vector *w* is (approximately) *k*-sparse? The recovery analogue of this problem — given the promise that *w* is sparse, find or approximate the vector *w* — is the famous *sparse recovery* problem, with a rich body of work in signal processing, statistics, and computer
science.

We study the decision version of this problem (i.e. deciding whether the unknown *w* is *k*-sparse) from the vantage point of *property testing*. Our focus is on answering the following high-level question: when is it possible
to efficiently *test* whether the unknown target vector *w* is sparse versus far-from-sparse using a number of samples which is *completely independent* of the dimension *n*? We consider the natural setting in which *x* is drawn from an i.i.d. product distribution *D* over ℝ^{n} and the *noise* process is independent of the input *x*. As our main result, we give a general algorithm which solves the above-described
testing problem using a number of samples which is completely independent of the ambient
dimension *n*, as long as *D* is not a Gaussian. In fact, our algorithm is *fully noise tolerant*, in the sense that for an arbitrary *w*, it approximately computes the distance of *w* to the closest *k*-sparse vector. To complement this algorithmic result, we show that weakening any
of our conditions makes it information-theoretically impossible for *any* algorithm to solve the testing problem with fewer than essentially log*n* samples. Thus our conditions essentially characterize when it is possible to test
noisy linear functions for sparsity with constant sample complexity.

Our algorithmic approach is based on relating the cumulants of the output distribution
(i.e. of *y*) with elementary power sum symmetric polynomials in *w* and using the latter to measure the sparsity of *w*. This approach crucially relies on a theorem of Marcinkiewicz from probability theory.
In fact, to obtain effective sample complexity bounds with our approach, we prove
a new finitary version of Marcinkiewicz’s theorem. This involves extending the complex
analytic arguments used in the original proof with results about the distribution
of zeros of entire functions.

A sunflower with *r* petals is a collection of *r* sets so that the intersection of each pair is equal to the intersection of all. Erdős
and Rado proved the sunflower lemma: for any fixed *r*, any family of sets of size *w*, with at least about *w*^{w} sets, must contain a sunflower. The famous sunflower conjecture is that the bound
on the number of sets can be improved to *c*^{w} for some constant *c*. In this paper, we improve the bound to about (log*w*)^{w}. In fact, we prove the result for a robust notion of sunflowers, for which the bound
we obtain is tight up to lower order terms.

We present a randomized algorithm that takes as input an undirected *n*-vertex graph *G* with maximum degree Δ and an integer *k* > 3Δ, and returns a random proper *k*-coloring of *G*. The distribution of the coloring is *perfectly* uniform over the set of all proper *k*-colorings; the expected running time of the algorithm is *poly*(*k*,*n*)=*O*(*n*Δ^{2}· log(*k*)). This improves upon a result of Huber (STOC 1998) who obtained a polynomial time
perfect sampling algorithm for *k*>Δ^{2}+2Δ. Prior to our work, no algorithm with expected running time *poly*(*k*,*n*) was known to guarantee perfectly sampling with sub-quadratic number of colors in
general.

Our algorithm (like several other perfect sampling algorithms including Huber’s) is
based on the Coupling from the Past method. Inspired by the *bounding chain* approach, pioneered independently by Huber (STOC 1998) and H'aggstr'om & Nelander (Scand.
J. Statist., 1999), we employ a novel bounding chain to derive our result for the
graph coloring problem.

We revisit a fundamental problem in string matching: given a pattern of length *m* and a text of length *n*, both over an alphabet of size σ, compute the Hamming distance (i.e., the number
of mismatches) between the pattern and the text at every location. Several randomized
(1+ε)-approximation algorithms have been proposed in the literature (e.g., by Karloff
(Inf. Proc. Lett., 1993), Indyk (FOCS 1998), and Kopelowitz and Porat (SOSA 2018)),
with running time of the form *O*(ε^{−O(1)}*n*log*n*log*m*), all using fast Fourier transform (FFT). We describe a simple randomized (1+ε)-approximation
algorithm that is faster and does not need FFT. Combining our approach with additional
ideas leads to numerous new results (all Monte-Carlo randomized) in different settings:

(1) We design the first truly *linear-time* approximation algorithm for constant ; the running time is *O*(ε^{−2}*n*). In fact, the time bound can be made slightly sublinear in *n* if the alphabet size σ is small (by using bit packing tricks).

(2) We apply our approximation algorithms to design a faster *exact* algorithm computing all Hamming distances up to a threshold *k*; its runtime of *O*(*n* + min(*nk*√log*m*/√*m*,*nk*^{2}/*m*)) improves upon previous results by logarithmic factors and is linear for *k*≤ √*m*.

(3) We alternatively design approximation algorithms with better ε*-dependence*, by using fast rectangular matrix multiplication. In fact, the time bound is *O*(*n* *polylog* *n*) when the pattern is sufficiently long, i.e., *m*≥ ε^{−c} for a specific constant *c*. Previous algorithms with the best ε-dependence require *O*(ε^{−1}*n* *polylog* *n*) time.

(4) When *k* is not too small, we design a truly *sublinear-time* algorithm to find all locations with Hamming distance approximately (up to a constant
factor) less than *k*, in time *O*((*n*/*k*^{Ω(1)}+*occ*)*n*^{o(1)}) time, where *occ* is the output size. The algorithm leads to a *property tester* for pattern matching that costs *O*((δ^{−1/3}*n*^{2/3} + δ^{−1}*n*/*m*) *n*) time and, with high probability, returns true if an exact match exists and false
if the Hamming distance is more than δ *m* at every location.

(5) We design a *streaming* algorithm that approximately computes the Hamming distance for all locations with
the distance approximately less than *k*, using *O*(ε^{−2}√*k* *n*) space. Previously, streaming algorithms were known for the exact problem with *O*(*k* *n*) space (which is tight up to the *polylog**n* factor) or for the approximate problem with *O*(ε^{−O(1)}√*m**polylog**n*) space. For the special case of *k*=*m*, we improve the space usage to *O*(ε^{−1.5}√*m**polylog**n*).

We study edit distance computation with preprocessing: the preprocessing algorithm acts on each string separately, and then the query algorithm takes as input the two preprocessed strings. This model is inspired by scenarios where we would like to compute edit distance between many pairs in the same pool of strings.

Our results include: Permutation-LCS: If the LCS between two permutations has length
*n*−*k*, we can compute it *exactly* with *O*(*n* log(*n*)) preprocessing and *O*(*k* log(*n*)) query time. Small edit distance: For general strings, if their edit distance is
at most *k*, we can compute it *exactly* with *O*(*n*log(*n*)) preprocessing and *O*(*k*^{2} log(*n*)) query time. Approximate edit distance: For the most general input, we can approximate
the edit distance to within factor (7+*o*(1)) with preprocessing time Õ(*n*^{2}) and query time Õ(*n*^{1.5+o(1)}).

All of these results significantly improve over the state of the art in edit distance computation without preprocessing. Interestingly, by combining ideas from our algorithms with preprocessing, we provide new improved results for approximating edit distance without preprocessing in subquadratic time.

In this paper, we provide new approximation algorithms for dynamic variations of the
longest increasing subsequence (LIS) problem, and the complementary distance to monotonicity
(DTM) problem. In this setting, operations of the following form arrive sequentially:
(i) add an element, (ii) remove an element, or (iii) substitute an element for another.
At every point in time, the algorithm has an approximation to the longest increasing
subsequence (or distance to monotonicity). We present a (1+є)-approximation algorithm
for DTM with polylogarithmic worst-case update time and a constant factor approximation
algorithm for LIS with worst-case update time Õ(*n*^{є}) for any constant є > 0.

We show that the edit distance between two strings of length *n* can be computed via a randomized algorithm within a factor of *f*(є) in *n*^{1+є} time as long as the edit distance is at least *n*^{1−δ} for some δ(є) > 0.

For any *T* ≥ 1, there are constants *R*=*R*(*T*) ≥ 1 and ζ=ζ(*T*)>0 and a randomized algorithm that takes as input an integer *n* and two strings *x*,*y* of length at most *n*, and runs in time *O*(*n*^{1+1/T}) and outputs an upper bound *U* on the edit distance of *edit*(*x*,*y*) that with high probability, satisfies *U* ≤ *R*(*edit*(*x*,*y*)+*n*^{1−ζ}). In particular, on any input with *edit*(*x*,*y*) ≥ *n*^{1−ζ} the algorithm outputs a constant factor approximation with high probability. A similar
result has been proven independently by Brakensiek and Rubinstein (this proceedings).

Understanding the difference between group orbits and their closures is a key difficulty in geometric complexity theory (GCT): While the GCT program is set up to separate certain orbit closures (i.e., prove non-containment of one orbit closure in the other), many beautiful mathematical properties are only known for the group orbits, in particular close relations with symmetry groups and invariant spaces, while the orbit closures seem much more difficult to understand. However, in order to prove lower bounds in algebraic complexity theory, considering group orbits is not enough.

In this paper we tighten the relationship between the orbit of the power sum polynomial and its closure, so that we can separate this orbit closure from the orbit closure of the product of variables by just considering the symmetry groups of both polynomials and their representation theoretic decomposition coefficients. In a natural way our construction yields a multiplicity obstruction that is neither an occurrence obstruction, nor a so-called vanishing ideal occurrence obstruction. All multiplicity obstructions so far have been of one of these two types.

Our paper is the first implementation of the ambitious approach that was originally suggested in the first papers on geometric complexity theory by Mulmuley and Sohoni (SIAM J Comput 2001, 2008): Before our paper, all existence proofs of obstructions only took into account the symmetry group of one of the two polynomials (or tensors) that were to be separated. In our paper the multiplicity obstruction is obtained by comparing the representation theoretic decomposition coefficients of both symmetry groups.

Our proof uses a semi-explicit description of the coordinate ring of the orbit closure of the power sum polynomial in terms of Young tableaux, which enables its comparison to the coordinate ring of the orbit.

We prove a Pumping Lemma for the noncooperative abstract Tile Assembly Model, a model
central to the theory of algorithmic self-assembly since the beginning of the field.
This theory suggests, and our result proves, that small differences in the nature
of adhesive bindings between abstract square molecules gives rise to vastly different
expressive capabilities. In the cooperative abstract Tile Assembly Model, square tiles
attach to each other using multi-sided cooperation of one, two or more sides. This
precise control of tile binding is directly exploited for algorithmic tasks including
growth of specified shapes using very few tile types, as well as simulation of Turing
machines and even self-simulation of self-assembly systems. But are cooperative bindings
required for these computational tasks? The definitionally simpler noncooperative
(or Temperature 1) model has poor control over local binding events: tiles stick if
they bind on at least one side. This has led to the conjecture that it is impossible
for it to exhibit precisely controlled growth of computationally-defined shapes. Here,
we prove such an impossibility result. We show that any planar noncooperative system
that attempts to grow large algorithmically-controlled tile-efficient assemblies must
also grow infinite non-algorithmic (pumped) structures with a simple closed-form description,
or else suffer blocking of intended algorithmic structures. Our result holds for both
directed and nondirected systems, and gives an explicit upper bound of (8|*T*|)^{4|T|+1}(5|σ| + 6), where |*T*| is the size of the tileset and |σ| is the size of the seed assembly, beyond which
any path of tiles is pumpable or blockable.

For *d* ≥ 2 and all *q*≥ *q*_{0}(*d*) we give an efficient algorithm to approximately sample from the *q*-state ferromagnetic Potts and random cluster models on the torus (ℤ / *n* ℤ )^{d} for any inverse temperature β≥ 0. This stands in contrast to Markov chain mixing
time results: the Glauber dynamics mix slowly at and below the critical temperature,
and the Swendsen–Wang dynamics mix slowly at the critical temperature. We also provide
an efficient algorithm (an FPRAS) for approximating the partition functions of these
models.

Our algorithms are based on representing the random cluster model as a contour model using Pirogov–Sinai theory, and then computing an accurate approximation of the logarithm of the partition function by inductively truncating the resulting cluster expansion. One important innovation of our approach is an algorithmic treatment of unstable ground states; this is essential for our algorithms to apply to all inverse temperatures β. By treating unstable ground states our work gives a general template for converting probabilistic applications of Pirogov–Sinai theory to efficient algorithms.

The study of branching programs for the Tree Evaluation Problem (TreeEval), introduced
by S. Cook et al. (TOCT 2012), remains one of the most promising approaches to separating
L from P. Given a label in [*k*] at each leaf of a complete binary tree and an explicit function in [*k*]^{2} → [*k*] for recursively computing the value of each internal node from its children, the
problem is to compute the value at the root node. (While the original problem allows
an arbitrary-degree tree, we focus on binary trees.) The problem is parameterized
by the alphabet size *k* and the height *h* of the tree. A branching program implementing the straightforward recursive algorithm
uses Θ((*k* + 1)^{h}) states, organized into 2^{h}−1 layers of width up to *k*^{h}. Until now no better deterministic algorithm was known.

We present a series of three new algorithms solving TreeEval. They are inspired by
the work of Buhrman et al. on *catalytic space* (STOC 2012), applied outside the catalytic-space setting. First we give a novel branching
program with 2^{4h} poly(*k*) layers of width 2^{3k}, which beats the straightforward algorithm when *h* = ω(*k* / log*k*). Next we give a branching program with *k*^{2h} poly(*k*) layers of width *k*^{3}. This has total size comparable to the straightforward algorithm, but is implemented
using the catalytic framework. Finally we interpolate between the two algorithms to
give a branching program with (*O*(*k*/*h*))^{2h} *textnormal**poly*(*k*) layers of width (*O*(*k*/*h*))^{є h} for any constant є > 0, which beats the straightforward algorithm for all *h* ≥ *k*^{1/2 + poly є}. These are the first deterministic branching programs to beat the straightforward
algorithm, but more importantly this is the first non-trivial approach to proving
deterministic upper bounds for TreeEval.

We also contribute new machinery to the catalytic computing program, which may be of independent interest to some readers.

Following the breakthrough work of Tardos (Oper. Res. ’86) in the bit-complexity model,
Vavasis and Ye (Math. Prog. ’96) gave the first exact algorithm for linear programming
in the real model of computation with running time depending only on the constraint
matrix. For solving a linear program (LP) max *c**x*, *Ax* = *b*, *x* ≥ 0, *A* ∈ ^{m × n}, Vavasis and Ye developed a primal-dual interior point method using a *‘layered least squares’* (LLS) step, and showed that *O*(*n*^{3.5} log(χ_{A}+*n*)) iterations suffice to solve (LP) exactly, where χ_{A} is a condition measure controlling the size of solutions to linear systems related
to *A*.

Monteiro and Tsuchiya (SIAM J. Optim. ’03), noting that the central path is invariant
under rescalings of the columns of *A* and *c*, asked whether there exists an LP algorithm depending instead on the measure χ_{A}^{*}, defined as the minimum χ_{AD} value achievable by a column rescaling *AD* of *A*, and gave strong evidence that this should be the case. We resolve this open question
affirmatively.

Our first main contribution is an *O*(*m*^{2} *n*^{2} + *n*^{3}) time algorithm which works on the linear matroid of *A* to compute a nearly optimal diagonal rescaling *D* satisfying χ_{AD} ≤ *n*(χ^{*})^{3}. This algorithm also allows us to approximate the value of χ_{A} up to a factor *n* (χ^{*})^{2}. This result is in (surprising) contrast to that of Tunçel (Math. Prog. ’99), who
showed NP-hardness for approximating χ_{A} to within 2^{poly(rank(A))}. The key insight for our algorithm is to work with ratios *g*_{i}/*g*_{j} of circuits of *A*—i.e., minimal linear dependencies *Ag*=0—which allow us to approximate the value of χ_{A}^{*} by a maximum geometric mean cycle computation in what we call the *‘circuit ratio digraph’* of *A*.

While this resolves Monteiro and Tsuchiya’s question by appropriate preprocessing,
it falls short of providing either a truly scaling invariant algorithm or an improvement
upon the base LLS analysis. In this vein, as our second main contribution we develop
a *scaling invariant* LLS algorithm, which uses and dynamically maintains improving estimates of the circuit
ratio digraph, together with a refined potential function based analysis for LLS algorithms
in general. With this analysis, we derive an improved *O*(*n*^{2.5} log*n*log(χ_{A}^{*}+*n*)) iteration bound for optimally solving (LP) using our algorithm. The same argument
also yields a factor *n*/log*n* improvement on the iteration complexity bound of the original Vavasis-Ye algorithm.

In this paper we provide an *O*(*nd*+*d*^{3}) time randomized algorithm for solving linear programs with *d* variables and *n* constraints with high probability. To obtain this result we provide a robust, primal-dual
*O*(√*d*)-iteration interior point method inspired by the methods of Lee and Sidford (2014,
2019) and show how to efficiently implement this method using new data-structures
based on heavy-hitters, the Johnson–Lindenstrauss lemma, and inverse maintenance.
Interestingly, we obtain this running time without using fast matrix multiplication
and consequently, barring a major advance in linear system solving, our running time
is near optimal for solving dense linear programs among algorithms that do not use
fast matrix multiplication.

We give the first approximation algorithm for mixed packing and covering semidefinite
programs (SDPs) with polylogarithmic dependence on width. Mixed packing and covering
SDPs constitute a fundamental algorithmic primitive with applications in combinatorial
optimization, robust learning, and quantum complexity. The current approximate solvers
for positive semidefinite programming can handle only pure packing instances, and
technical hurdles prevent their generalization to a wider class of positive instances.
For a given multiplicative accuracy of є, our algorithm takes *O*(log^{3}(*nd*ρ) · є^{−3}) parallelizable iterations, where *n*, *d* are the problem dimensions and ρ is a width parameter of the instance, generalizing
or improving all previous parallel algorithms in the positive linear and semidefinite
programming literature. When specialized to pure packing SDPs, our algorithm’s iteration
complexity is *O*(log^{2} (*nd*) · є^{−2}), a slight improvement and derandomization of the state-of-the-art due to Allen-Zhu
et. al. ’16, Peng et. al. ’16, and Wang et. al. ’15. For several common structured
instances, the iterations of our algorithm run in nearly-linear time. In doing so,
we give matrix analytic techniques to overcome obstacles that have stymied prior approaches
to this problem, as stated in Peng et. al. ’16 and Mahoney et. al. ’16. Crucial to
our analysis are a simplification of existing algorithms for mixed positive linear
programs, achieved by removing an asymmetry from modifying covering constraints, and
a suite of matrix inequalities with proofs based on analyzing the Schur complements
of matrices in a higher dimension. We hope that our algorithm and techniques open
the door to improved solvers for positive semidefinite programming and its applications.

In this paper we provide an algorithm which given any *m*-edge *n*-vertex directed graph with integer capacities at most *U* computes a maximum *s*-*t* flow for any vertices *s* and *t* in *m*^{11/8+o(1)}*U*^{1/4} time with high probability. This running time improves upon the previous best of
Õ(*m*^{10/7} *U*^{1/7}) (Mądry 2016), Õ(*m* √*n* log*U*) (Lee Sidford 2014), and *O*(*mn*) (Orlin 2013) when the graph is not too dense or has large capacities.

We achieve this result by leveraging recent advances in solving undirected flow problems
on graphs. We show that in the maximum flow framework of (Mądry 2016) the problem
of optimizing the amount of perturbation of the central path needed to maximize energy
and thereby reduce congestion can be efficiently reduced to a smoothed ℓ_{2}-ℓ_{p} flow optimization problem, which can be solved approximately via recent work (Kyng,
Peng, Sachdeva, Wang 2019). Leveraging this new primitive, we provide a new interior
point method for maximum flow with faster convergence and simpler analysis that no
longer needs global potential functions involving energy as in previous methods (Mądry
2013, Mądry 2016).

The basic goal of survivable network design is to build a cheap network that maintains
the connectivity between given sets of nodes despite the failure of a few edges/nodes.
The *Connectivity Augmentation* Problem (CAP) is arguably one of the most basic problems in this area: given a *k*(-edge)-connected graph *G* and a set of extra edges (*links*), select a minimum cardinality subset *A* of links such that adding *A* to *G* increases its edge connectivity to *k*+1. Intuitively, one wants to make an *existing* network more reliable by *augmenting* it with extra edges. The best known approximation factor for this NP-hard problem
is 2, and this can be achieved with multiple approaches (the first such result is
in [Frederickson and Jájá’81]).

It is known [Dinitz et al.’76] that CAP can be reduced to the case *k*=1, a.k.a. the *Tree Augmentation* Problem (TAP), for odd *k*, and to the case *k*=2, a.k.a. the *Cactus Augmentation* Problem (CacAP), for even *k*. Several better than 2 approximation algorithms are known for TAP, culminating with
a recent 1.458 approximation [Grandoni et al.’18]. However, for CacAP the best known
approximation is 2.

In this paper we breach the 2 approximation barrier for CacAP, hence for CAP, by presenting a polynomial-time 2ln(4)−967/1120+є<1.91 approximation. From a technical point of view, our approach deviates quite substantially from the current related literature. In particular, the better-than-2 approximation algorithms for TAP either exploit greedy-style algorithms or are based on rounding carefully-designed LPs. These approaches exploit properties of TAP that do not seem to generalize to CacAP. We instead use a reduction to the Steiner tree problem which was previously used in parameterized algorithms [Basavaraju et al.’14]. This reduction is not approximation preserving, and using the current best approximation factor for Steiner tree [Byrka et al.’13] as a black-box would not be good enough to improve on 2. To achieve the latter goal, we “open the box” and exploit the specific properties of the instances of Steiner tree arising from CacAP.

In our opinion this connection between approximation algorithms for survivable network design and Steiner-type problems is interesting, and it might lead to other results in the area.

We present a spectral approach to design approximation algorithms for network design problems. We observe that the underlying mathematical questions are the spectral rounding problems, which were studied in spectral sparsification and in discrepancy theory. We extend these results to incorporate additional linear constraints, and show that they can be used to significantly extend the scope of network design problems that can be solved. Our algorithm for spectral rounding is an iterative randomized rounding algorithm based on the regret minimization framework. In some settings, this provides an alternative spectral algorithm to achieve constant factor approximation for survivable network design, and partially answers a question of Bansal about survivable network design with concentration property. We also show that the spectral rounding results have many other applications, including weighted experimental design and additive spectral sparsification.

The degree-4 Sum-of-Squares (SoS) SDP relaxation is a powerful algorithm that captures the best known polynomial time algorithms for a broad range of problems including MaxCut, Sparsest Cut, all MaxCSPs and tensor PCA. Despite being an explicit algorithm with relatively low computational complexity, the limits of degree-4 SoS SDP are not well understood. For example, existing integrality gaps do not rule out a (2−)-algorithm for Vertex Cover or a (0.878+)-algorithm for MaxCut via degree-4 SoS SDPs, each of which would refute the notorious Unique Games Conjecture.

We exhibit an explicit mapping from solutions for degree-2 Sum-of-Squares SDP (Goemans-Williamson
SDP) to solutions for the degree-4 Sum-of-Squares SDP relaxation on boolean variables.
By virtue of this mapping, one can lift lower bounds for degree-2 SoS SDP relaxation
to corresponding lower bounds for degree-4 SoS SDPs. We use this approach to obtain
degree-4 SoS SDP lower bounds for MaxCut on random *d*-regular graphs, Sherington-Kirkpatrick model from statistical physics and PSD Grothendieck
problem.

Our constructions use the idea of pseudocalibration towards candidate SDP vectors,
while it was previously only used to produce the candidate matrix which one would
show is PSD using much technical work. In addition, we develop a different technique
to bound the spectral norms of *graphical* matrices that arise in the context of SoS SDPs. The technique is much simpler and
yields better bounds in many cases than the *trace method* – which was the sole technique for this purpose.

We give new algorithms based on Markov chains to sample and approximately count satisfying
assignments to *k*-uniform CNF formulas where each variable appears at most *d* times. For any *k* and *d* satisfying *kd*<*n*^{o(1)} and *k*≥ 20log*k* + 20log*d* + 60, the new sampling algorithm runs in close to linear time, and the counting algorithm
runs in close to quadratic time.

Our approach is inspired by Moitra (JACM, 2019) which remarkably utilizes the Lovász
local lemma in approximate counting. Our main technical contribution is to use the
local lemma to bypass the connectivity barrier in traditional Markov chain approaches,
which makes the well developed MCMC method applicable on disconnected state spaces
such as SAT solutions. The benefit of our approach is to avoid the enumeration of
local structures and obtain fixed polynomial running times, even if *k*=ω(1) or *d*=ω(1).

Let *H* be a frustration-free Hamiltonian describing a 2D grid of qudits with local interactions,
a unique ground state, and local spectral gap lower bounded by a positive constant.
For any bipartition defined by a vertical cut of length *L* running from top to bottom of the grid, we prove that the corresponding entanglement
entropy of the ground state of *H* is upper bounded by Õ(*L*^{5/3}). For the special case of a 1D chain, our result provides a new area law which improves
upon prior work, in terms of the scaling with qudit dimension and spectral gap. In
addition, for any bipartition of the grid into a rectangular region *A* and its complement, we show that the entanglement entropy is upper bounded as Õ(|∂
*A*|^{5/3}) where ∂ *A* is the boundary of *A*. This represents a subvolume bound on entanglement in frustration-free 2D systems.
In contrast with previous work, our bounds depend on the local (rather than global)
spectral gap of the Hamiltonian. We prove our results using a known method which bounds
the entanglement entropy of the ground state in terms of certain properties of an
approximate ground state projector (AGSP). To this end, we construct a new AGSP which
is based on a robust polynomial approximation of the AND function and we show that
it achieves an improved trade-off between approximation error and entanglement.

Recent work of Bravyi et al. and follow-up work by Bene Watts et al. demonstrates
a quantum advantage for shallow circuits: constant-depth quantum circuits can perform
a task which constant-depth classical (i.e., ^{0}) circuits cannot. Their results have the advantage that the quantum circuit is fairly
practical, and their proofs are free of hardness assumptions (e.g., factoring is classically
hard, etc.). Unfortunately, constant-depth classical circuits are too weak to yield
a convincing real-world demonstration of quantum advantage. We attempt to hold on
to the advantages of the above results, while increasing the power of the classical
model. Our main result is a two-round interactive task which is solved by a constant-depth
quantum circuit (using only Clifford gates, between neighboring qubits of a 2D grid,
with Pauli measurements), but such that any classical solution would necessarily solve
-hard problems. This implies a more powerful class of constant-depth classical circuits
(e.g., ^{0}[*p*] for any prime *p*) unconditionally cannot perform the task. Furthermore, under standard complexity-theoretic
conjectures, log-depth circuits and log-space Turing machines cannot perform the task
either. Using the same techniques, we prove hardness results for weaker complexity
classes under more restrictive circuit topologies. Specifically, we give ^{0} interactive tasks on 2 × *n* and 1 × *n* grids which require classical simulations of power ^{1} and ^{0}[6], respectively. Moreover, these hardness results are robust to a small constant
fraction of error in the classical simulation.

We use ideas and techniques from the theory of branching programs, quantum contextuality, measurement-based quantum computation, and Kilian randomization.

A conjecture of Jozsa (arXiv:quant-ph/0508124) states that any polynomial-time quantum
computation can be simulated by polylogarithmic-depth quantum computation interleaved
with polynomial-depth classical computation. Separately, Aaronson conjectured that
there exists an oracle *O* such that BQP^{O} ≠ (BPP^{BQNC})^{O}. These conjectures are intriguing allusions to the unresolved potential of combining
classical and low-depth quantum computation. In this work we show that the Welded
Tree Problem, which is an oracle problem that can be solved in quantum polynomial
time as shown by Childs et al. (arXiv:quant-ph/0209131), cannot be solved in BPP^{BQNC}, nor can it be solved in the class that Jozsa describes. This proves Aaronson’s oracle
separation conjecture and provides a counterpoint to Jozsa’s conjecture relative to
the Welded Tree oracle problem. More precisely, we define two complexity classes,
HQC and JC whose languages are decided by two different families of interleaved quantum-classical
circuits. HQC contains BPP^{BQNC} and is therefore relevant to Aaronson’s conjecture, while JC captures the model of
computation that Jozsa considers. We show that the Welded Tree Problem gives an oracle
separation between either of {JC, HQC} and BQP. Therefore, even when interleaved with
arbitrary polynomial-time classical computation, greater ”quantum depth” leads to
strictly greater computational ability in this relativized setting.

Near-term quantum computers are likely to have small depths due to short coherence time and noisy gates. A natural approach to leverage these quantum computers is interleaving them with classical computers. Understanding the capabilities and limits of this hybrid approach is an essential topic in quantum computation. Most notably, the quantum Fourier transform can be implemented by a hybrid of logarithmic-depth quantum circuits and a classical polynomial-time algorithm. Therefore, it seems possible that quantum polylogarithmic depth is as powerful as quantum polynomial depth in the presence of classical computation.

Indeed, Jozsa conjectured that “*Any quantum polynomial-time algorithm can be implemented with only **O*(log*n*)* quantum depth interspersed with polynomial-time classical computations.*” This can be formalized as asserting the equivalence of *BQP* and “*BQNC*^{BPP}”. On the other hand, Aaronson conjectured that “*there exists an oracle separation between **BQP** and **BPP*^{BQNC}*.*” *BQNC*^{BPP} and *BPP*^{BQNC} are two natural and seeming incomparable ways of hybrid classical-quantum computation.

In this work, we manage to prove Aaronson’s conjecture and disproves Jozsa’s conjecture
relative to an oracle. In fact, we prove a stronger statement that for any depth parameter
*d*, there exists an oracle that separates quantum depth *d* and 2*d*+1 in the presence of classical computation. Thus, our results show that relative
to oracles, doubling the quantum circuit depth indeed gives the hybrid model more
power, and this cannot be traded by classical computation.

How can two parties with competing interests carry out a fair coin flip across a quantum communication channel? This problem (quantum weak coin-flipping) was formalized more than 15 years ago, and, despite some phenomenal theoretical progress, practical quantum coin-flipping protocols with vanishing bias have proved hard to find. In the current work we show that there is a reason that practical weak quantum coin-flipping is difficult: any quantum weak coin-flipping protocol with bias є must use at least exp( Ω (1/√є )) rounds of communication. This is a large improvement over the previous best known lower bound of Ω ( log log(1/є )) due to Ambainis from 2004. Our proof is based on a theoretical construction (the two-variable profile function) which may find further applications.

We initiate a study of the following problem: Given a continuous domain Ω along with
its convex hull *K*, a point *A* ∈ *K* and a prior measure µ on Ω, find the probability density over Ω whose marginal is
*A* and that minimizes the KL-divergence to µ. This framework gives rise to several extremal
distributions that arise in mathematics, quantum mechanics, statistics, and theoretical
computer science. Our technical contributions include a polynomial bound on the norm
of the optimizer of the dual problem that holds in a very general setting and relies
on a “balance” property of the measure µ on Ω, and exact algorithms for evaluating
the dual and its gradient for several interesting settings of Ω and µ. Together, along
with the ellipsoid method, these results imply polynomial-time algorithms to compute
such KL-divergence minimizing distributions in several cases. Applications of our
results include: 1) an optimization characterization of the Goemans-Williamson measure
that is used to round a positive semidefinite matrix to a vector, 2) the computability
of the entropic barrier for polytopes studied by Bubeck and Eldan, and 3) a polynomial-time
algorithm to compute the barycentric quantum entropy of a density matrix that was
proposed as an alternative to von Neumann entropy by Band and Park in the 1970s: this
corresponds to the case when Ω is the set of rank one projection matrices and µ corresponds
to the Haar measure on the unit sphere. Our techniques generalize to the setting of
rank *k* projections using the Harish-Chandra-Itzykson-Zuber formula, and are applicable even
beyond, to adjoint orbits of compact Lie groups.

Given a separation oracle for a convex set *K* ⊂ ℝ^{n} that is contained in a box of radius *R*, the goal is to either compute a point in *K* or prove that *K* does not contain a ball of radius є. We propose a new cutting plane algorithm that
uses an optimal *O*(*n* log(κ)) evaluations of the oracle and an additional *O*(*n*^{2}) time per evaluation, where κ = *nR*/є. This improves upon Vaidya’s *O*( SO · *n* log(κ) + *n*^{ω+1} log(κ)) time algorithm [Vaidya, FOCS 1989a] in terms of polynomial dependence on
*n*, where ω < 2.373 is the exponent of matrix multiplication and SO is the time for
oracle evaluation. This improves upon Lee-Sidford-Wong’s *O*( SO · *n* log(κ) + *n*^{3} log^{O(1)} (κ)) time algorithm [Lee, Sidford and Wong, FOCS 2015] in terms of dependence on
κ. For many important applications in economics, κ = Ω(exp(*n*)) and this leads to a significant difference between log(κ) and (log(κ)). We also
provide evidence that the *n*^{2} time per evaluation cannot be improved and thus our running time is optimal.

A bottleneck of previous cutting plane methods is to compute *leverage scores*, a measure of the relative importance of past constraints. Our result is achieved
by a novel multi-layered data structure for leverage score maintenance, which is a
sophisticated combination of diverse techniques such as random projection, batched
low-rank update, inverse maintenance, polynomial interpolation, and fast rectangular
matrix multiplication. Interestingly, our method requires a combination of different
fast rectangular matrix multiplication algorithms.

Our algorithm not only works for the classical convex optimization setting, but also generalizes to convex-concave games. We apply our algorithm to improve the runtimes of many interesting problems, e.g., Linear Arrow-Debreu Markets, Fisher Markets, and Walrasian equilibrium.

State-of-the-art results on image recognition tasks are achieved using over-parameterized learning algorithms that (nearly) perfectly fit the training set and are known to fit well even random labels. This tendency to memorize seemingly useless training data labels is not explained by existing theoretical analyses. Memorization of the training data also presents significant privacy risks when the training data contains sensitive personal information and thus it is important to understand whether such memorization is necessary for accurate learning.

We provide a simple conceptual explanation and a theoretical model demonstrating that for natural data distributions memorization of labels is necessary for achieving close-to-optimal generalization error. The model is motivated and supported by the results of several recent empirical works. In our model, data is sampled from a mixture of subpopulations and the frequencies of these subpopulations are chosen from some prior. The model allows to quantify the effect of not fitting the training data on the generalization performance of the learned classifier and demonstrates that memorization is necessary whenever frequencies are long-tailed. Image and text data are known to follow such distributions and therefore our results establish a formal link between these empirical phenomena. Our results also have concrete implications for the cost of ensuring differential privacy in learning.

We study the problem, introduced by Qiao and Valiant, of learning from untrusted batches.
Here, we assume *m* users, all of whom have samples from some underlying distribution over 1, …, *n*. Each user sends a batch of *k* i.i.d. samples from this distribution; however an є-fraction of users are untrustworthy
and can send adversarially chosen responses. The goal of the algorithm is to learn
in total variation distance. When *k* = 1 this is the standard robust univariate density estimation setting and it is well-understood
that (є) error is unavoidable. Suprisingly, Qiao and Valiant gave an estimator which
improves upon this rate when *k* is large. Unfortunately, their algorithms run in time which is exponential in either
*n* or *k*.

We first give a sequence of polynomial time algorithms whose estimation error approaches the information-theoretically optimal bound for this problem. Our approach is based on recent algorithms derived from the sum-of-squares hierarchy, in the context of high-dimensional robust estimation. We show that algorithms for learning from untrusted batches can also be cast in this framework, but by working with a more complicated set of test functions.

It turns out that this abstraction is quite powerful, and can be generalized to incorporate
additional problem specific constraints. Our second and main result is to show that
this technology can be leveraged to build in prior knowledge about the shape of the
distribution. Crucially, this allows us to reduce the sample complexity of learning
from untrusted batches to polylogarithmic in *n* for most natural classes of distributions, which is important in many applications.
To do so, we demonstrate that these sum-of-squares algorithms for robust mean estimation
can be made to handle complex combinatorial constraints (e.g. those arising from VC
theory), which may be of independent technical interest.

The popular 3-SUM conjecture states that there is no strongly subquadratic time algorithm
for checking if a given set of integers contains three distinct elements that sum
up to zero. A closely related problem is to check if a given set of integers contains
distinct *x*_{1}, *x*_{2}, *x*_{3} such that *x*_{1}+*x*_{2}=2*x*_{3}. This can be reduced to 3-SUM in almost-linear time, but surprisingly a reverse reduction
establishing 3-SUM hardness was not known.

We provide such a reduction, thus resolving an open question of Erickson. In fact,
we consider a more general problem called 3-LDT parameterized by integer parameters
α_{1}, α_{2}, α_{3} and *t*. In this problem, we need to check if a given set of integers contains distinct elements
*x*_{1}, *x*_{2}, *x*_{3} such that α_{1} *x*_{1}+α_{2} *x*_{2} +α_{3} *x*_{3} = *t*. For some combinations of the parameters, every instance of this problem is a NO-instance
or there exists a simple almost-linear time algorithm. We call such variants trivial.
We prove that all non-trivial variants of 3-LDT are equivalent under subquadratic
reductions. Our main technical contribution is an efficient deterministic procedure
based on the famous Behrend’s construction that partitions a given set of integers
into few subsets that avoid a chosen linear equation.

In the classical SubsetSum problem we are given a set *X* and a target *t*, and the task is to decide whether there exists a subset of *X* which sums to *t*. A recent line of research has resulted in (*t* · *poly* (log*t*))-time algorithms, which are (near-)optimal under popular complexity-theoretic assumptions.
On the other hand, the standard dynamic programming algorithm runs in time *O*(*n* · |*S*(*X*,*t*)|), where *S*(*X*,*t*) is the set of all subset sums of *X* that are smaller than *t*. All previous pseudopolynomial algorithms actually solve a stronger task, since they
actually compute the whole set *S*(*X*,*t*).

As the aforementioned two running times are incomparable, in this paper we ask whether
one can achieve the best of both worlds: running time |*S*(*X*,*t*)|·*poly*(log*t*). In particular, we ask whether *S*(*X*,*t*) can be computed in near-linear time in the output-size. Using a diverse toolkit
containing techniques such as color coding, sparse recovery, and sumset estimates,
we make considerable progress towards this question and design an algorithm running
in time |*S*(*X*,*t*)|^{4/3} · *poly*(log*t*).

Central to our approach is the study of *top-**k**-convolution*, a natural problem of independent interest: given degree-*d* sparse polynomials with non-negative coefficients, compute the lowest *k* non-zero monomials of their product. We design an algorithm running in time *k*^{4/3} *poly*(log*d*), by a combination of sparse convolution and sumset estimates considered in Additive
Combinatorics. Moreover, we provide evidence that going beyond some of the barriers
we have faced requires either an algorithmic breakthrough or possibly new techniques
from Additive Combinatorics on how to pass from information on restricted sumsets
to information on unrestricted sumsets.

The Sparsest Cut is a fundamental optimization problem that have been extensively studied. For planar inputs the problem is in P and can be solved in Õ(n 3 ) time if all vertex weights are 1. Despite a significant amount of effort, the best algorithms date back to the early 90’s and can only achieve O(log n)-approximation in Õ(n) time or 3.5-approximation in Õ(n 2 ) time [Rao, STOC92]. Our main result is an Ω(n 2−ε ) lower bound for Sparsest Cut even in planar graphs with unit vertex weights, under the (min, +)-Convolution conjecture, showing that approxima- tions are inevitable in the near-linear time regime. To complement the lower bound, we provide a 3.3-approximation in near-linear time, improving upon the 25-year old result of Rao in both time and accuracy. We also show that our lower bound is not far from optimal by observing an exact algorithm with running time Õ(n 5/2 ) improving upon the Õ(n 3 ) algorithm of Park and Phillips [STOC93]. Our lower bound accomplishes a repeatedly raised challenge by being the first fine-grained lower bound for a natural planar graph problem in P. Building on our construction we prove near-quadratic lower bounds under SETH for variants of the closest pair problem in planar graphs, and use them to show that the popular Average-Linkage procedure for Hierarchical Clustering cannot be simulated in truly subquadratic time. At the core of our constructions is a diamond-like gadget that also settles the complexity of Diameter in distributed planar networks. We prove an Ω(n/ log n) lower bound on the number of communication rounds required to compute the weighted diameter of a network in the CONGET model, even when the underlying graph is planar and all nodes are D = 4 hops away from each other. This is the first poly(n) lower bound in the planar-distributed setting, and it complements the recent poly(D, log n) upper bounds of Li and Parter [STOC 2019] for (exact) unweighted diameter and for (1 + ε) approximate weighted diameter.

In this paper we provide a Õ(*m*√*n*) time algorithm that computes a 3-multiplicative approximation of the girth of a
*n*-node *m*-edge directed graph with non-negative edge lengths. This is the first algorithm which
approximates the girth of a directed graph up to a constant multiplicative factor
faster than All-Pairs Shortest Paths (APSP) time, i.e. *O*(*mn*). Additionally, for any integer *k* ≥ 1, we provide a deterministic algorithm for a *O*(*k*loglog*n*)-multiplicative approximation to the girth in directed graphs in Õ(*m*^{1+1/k}) time. Combining the techniques from these two results gives us an algorithm for
a *O*(*k*log*k*)-multiplicative approximation to the girth in directed graphs in Õ(*m*^{1+1/k}) time. Our results naturally also provide algorithms for improved constructions of
roundtrip spanners, the analog of spanners in directed graphs.

The previous fastest algorithms for these problems either ran in All-Pairs Shortest
Paths (APSP) time, i.e. *O*(*mn*), or were due Pachocki which provided a randomized algorithm that for any integer
*k* ≥ 1 in time Õ(*m*^{1+1/k}) computed with high probability a *O*(*k*log*n*) multiplicative approximation of the girth. Our first algorithm constitutes the first
sub-APSP-time algorithm for approximating the girth to constant accuracy, our second
removes the need for randomness and improves the approximation factor in Pachocki ,
and our third is the first time versus quality trade-off for obtaining constant approximations.

Non-signaling proofs, motivated by quantum computation, have found applications in
cryptography and hardness of approximation. An important open problem is characterizing
the power of non-signaling proofs. It is known that non-signaling proofs with two
provers are characterized by PSPACE and that non-signaling proofs with poly(n)-provers
are characterized by EXP. However, the power of *k*-prover non-signaling proofs, for 2<*k*<(*n*) remained an open problem.

We show that *k*-prover non-signaling proofs (with negligible soundness) for *k*=*O*(√log*n*) are contained in PSPACE. We prove this via two different routes that are of independent
interest. In both routes we consider a relaxation of non-signaling called sub-non-signaling.
Our main technical contribution (which is used in both our proofs) is a reduction
showing how to convert any sub-non-signaling strategy with value at least 1−2^{−Ω(k2)} into a non-signaling one with value at least 2^{−O(k2)}.

In the first route, we show that the classical prover reduction method for converting
*k*-prover games into 2-prover games carries over to the non-signaling setting with the
following loss in soundness: if a *k*-prover game has value less than 2^{−ck2} (for some constant *c*>0), then the corresponding 2-prover game has value less than 1 − 2^{dk2} (for some constant *d*>0). In the second route we show that the value of a sub-non-signaling game can be
approximated in space that is polynomial in the communication complexity and exponential
in the number of provers.

Hardness of computing the Kolmogorov complexity of a given string is closely tied
to a security proof of hitting set generators, and thus understanding hardness of
Kolmogorov complexity is one of the central questions in complexity theory. In this
paper, we develop new proof techniques for showing hardness of computing Kolmogorov
complexity under *surprisingly efficient reductions*, which were previously conjectured to be impossible. It is known that the set *R*_{K} of Kolmogorov-random strings is PSPACE-hard under polynomial-time Turing reductions,
i.e., *PSPACE* ⊂ *P*^{RK}, and that *NEXP* ⊂ *NP*^{RK}, which was conjectured to be tight by Allender (CiE 2012). We prove that *EXP*^{NP} ⊂ *P*^{RK}, which simultaneously improves these hardness results and refutes the conjecture
of Allender under the plausible assumption that *EXP*^{NP} ≠ *NEXP*. At the core of our results is a new security proof of a pseudorandom generator via
a black-box uniform reduction, which overcomes an impossibility result of Gutfreund
and Vadhan (RANDOM/APPROX 2008).

Our proof techniques have further consequences, including:

1. Applying our proof techniques to the case of resource-bounded Kolmogorov complexity,
we obtain NP-hardness of the problem *MINcKT*^{SAT} of computing conditional polynomial-time-bounded SAT-oracle Kolmogorov complexity
under polynomial-time deterministic reductions. In contrast, the Minimum SAT-Oracle
Circuit Size Problem, which is a version of sublinear-time-bounded Kolmogorov complexity,
cannot be NP-hard under polynomial-time deterministic reductions without resolving
*EXP* ≠ *ZPP*. Our hardness result is the first result that overcomes the non-NP-hardness results
of MCSP. We also prove DistNP-hardness of *MINKT*^{SAT}, which is a partial converse of the approach of Hirahara (FOCS 2018) for proving
the equivalence between worst-case and average-case complexity of NP.

2. We prove *S*_{2}^{p}-hardness of Kolmogorov complexity under quasi-polynomial-time *nonadaptive* reductions. This is the first result that overcomes a P/poly barrier result of Allender,
Buhrman, Friedman, and Loff (MFCS 2012).

We also establish a firm link between non-trivial satisfiability algorithms and immunity of random strings, and obtain the following unconditional lower bounds.

1. It has been a long-standing open question whether the set of subexponential-time-bounded Kolmogorov-random strings is decidable in P. We resolve this open question, by showing that the set of super-polynomial-time-bounded Kolmogorov-random strings is P-immune, which is a much stronger lower bound than an average-case lower bound.

2. The set of Levin’s Kolmogorov-random strings is (P-uniform ACC)-immune.

We show that the smoothed complexity of the FLIP algorithm for local Max-Cut is at
most φ *n*^{O(√logn)}, where *n* is the number of nodes in the graph and φ is a parameter that measures the magnitude
of perturbations applied on its edge weights. This improves the previously best upper
bound of φ *n*^{O(logn)} by Etscheid and Roglin. Our result is based on an analysis of long sequences of flips,
which shows that it is very unlikely for every flip in a long sequence to incur a
positive but small improvement in the cut weight. We also extend the same upper bound
on the smoothed complexity of FLIP to all binary Maximum Constraint Satisfaction Problems.

We consider the simplest non-linear discrete dynamical systems, given by the logistic
maps *f*_{a}(*x*)=*ax*(1−*x*) of the interval [0,1]. We show that there exist real parameters *a*∈ (0,4) for which almost every orbit of *f*_{a} has the same statistical distribution in [0,1], but this limiting distribution is
not Turing computable. In particular, the Monte Carlo method cannot be applied to
study these dynamical systems.

We prove the first separation in the approximation guarantee achievable by truthful
and non-truthful combinatorial auctions with polynomial communication. Specifically,
we prove that any truthful auction guaranteeing a (34−1240+є)-approximation for two
buyers with XOS valuations over *m* items requires exp(Ω(ε^{2} · *m*)) communication whereas a non-truthful auction by Feige [*J. Comput.* 2009] is already known to achieve a 34-approximation in (*m*) communication.

We obtain our lower bound for truthful auctions by proving that any simultaneous auction
(not necessarily truthful) which guarantees a (34−1240+ε)-approximation requires communication
exp(Ω(ε^{2} · *m*)), and then apply the taxation complexity framework of Dobzinski [FOCS 2016] to extend
the lower bound to all truthful auctions (including interactive truthful auctions).

We consider incentive compatible mechanisms for a domain that is very close to the
domain of scheduling *n* unrelated machines: the single exception is that the valuation of just one machine
is submodular. For the scheduling problem with such cost functions, we give a lower
bound of Ω(√*n*) on the approximation ratio of incentive compatible deterministic mechanisms. This
is a strong information-theoretic impossibility result on the approximation ratio
of mechanisms that provides strong evidence for the Nisan-Ronen conjecture. This is
the first non-constant lower bound that assumes no restriction on the mechanism side;
in contrast, all previous general results hold for only special classes of mechanisms
such as local, strongly monotone, and anonymous mechanisms. Our approach is based
on a novel multi-player characterization of appropriately selected instances that
allows us to focus on particular type of algorithms, linear mechanisms, and it is
a potential stepping stone towards the full resolution of the conjecture.

There has been a long history for studying randomized greedy matching algorithms since
the work by Dyer and Frieze(RSA 1991). We follow this trend and consider the problem
formulated in the oblivious setting, in which the algorithm makes (random) decisions
that are essentially oblivious to the input graph. We revisit the Modified Randomized
Greedy (MRG) algorithm by Aronson et al.(RSA 1995) which is proved to be (0.5+*epsilon*)-approximate. In particular, we study a weaker version of the algorithm named Random
Decision Order (RDO) that in each step, randomly picks an unmatched vertex and matches
it to an arbitrary neighbor if exists. We prove the RDO algorithm is 0.639-approximate
and 0.531-approximate for bipartite graphs and general graphs respectively. As a corollary,
we substantially improve the approximation ratio of MRG. Furthermore, we generalize
the RDO algorithm to the edge-weighted case and prove that it achieves a 0.501 approximation
ratio. This result solves the open question by Chan et al.(SICOMP 2018) about the
existence of an algorithm that beats greedy in this setting. As a corollary, it also
solves the open questions by Gamlath et al.(SODA 2019) in the stochastic setting.

Suppose that we are given an arbitrary graph *G*=(*V*, *E*) and know that each edge in *E* is going to be *realized* independently with some probability *p*. The goal in the *stochastic matching* problem is to pick a sparse subgraph *Q* of *G* such that the realized edges in *Q*, in expectation, include a matching that is approximately as large as the maximum
matching among the realized edges of *G*. The maximum degree of *Q* can depend on *p*, but not on the size of *G*. This problem has been subject to extensive studies over the years and the approximation
factor has been improved gradually from 0.5 to eventually 2/3 which is a known barrier.
In this work, we analyze a natural sampling-based algorithm and show that it can obtain
a (1−є) approximation, for any constant є > 0. A key and of possible independent interest
component of our analysis is an algorithm that constructs a matching on a *stochastic* graph, which among some other important properties, guarantees that each vertex is
matched *independently* from the vertices that are sufficiently far. This allows us to bypass a previously
known barrier towards achieving (1−є) approximation based on existence of dense Ruzsa-Szemerédi
graphs.

We consider the (weighted) Paging with Time Windows problem, which is identical to the classical weighted paging problem but where each page request only needs to be served by a given deadline. This problem arises in many practical applications of online caching, such as the deadline I/O scheduler in the Linux kernel and video-on-demand streaming. From a theoretical perspective, this generalizes the caching problem to allow delayed service, a line of work that has recently gained traction in online algorithms (e.g., Emek et al. STOC '16, Azar et al. STOC '17, Azar and Touitou FOCS '19, etc.).

Our main result is an O(log k log n)-competitive algorithm for the Paging with Time Windows problem on n pages with a cache of size k. This significantly improves on the previous best bound of O(k) (Azar et al. (STOC '17). We also consider the offline version of this problem, for which we give an O(1) approximation algorithm and prove APX-hardness. These are the first results for the offline problem; even NP-hardness was not known before our work.

At the heart of our algorithms is a novel hitting-set LP relaxation of this problem that overcomes the Omega(k) integrality gap of the natural LP for the problem. To the best of our knowledge, this is the first example of an LP-based algorithm for an online algorithm with delays/deadlines.

We consider an online vector balancing question where *T* vectors, chosen from an arbitrary distribution over [−1,1]^{n}, arrive one-by-one and must be immediately given a ± sign. The goal is to keep the
discrepancy—the ℓ_{∞}-norm of any signed prefix-sum—as small as possible. A concrete example of this question
is the online interval discrepancy problem where *T* points are sampled one-by-one uniformly in the unit interval [0,1], and the goal
is to immediately color them ± such that every sub-interval remains always nearly
balanced. As random coloring incurs Ω(*T*^{1/2}) discrepancy, while the worst-case offline bounds are Θ(√*n* log(*T*/*n*)) for vector balancing and 1 for interval balancing, a natural question is whether
one can (nearly) match the offline bounds in the online setting for these problems.
One must utilize the stochasticity as in the worst-case scenario it is known that
discrepancy is Ω(*T*^{1/2}) for any online algorithm.

In a special case of online vector balancing, Bansal and Spencer [BS19] recently show
an *O*(√*n*log*T*) bound when each coordinate is *independently* chosen. When there are *dependencies* among the coordinates, as in the interval discrepancy problem, the problem becomes
much more challenging, as evidenced by a recent work of Jiang, Kulkarni, and Singla
[JKS19] that gives a non-trivial *O*(*T*^{1/loglogT}) bound for online interval discrepancy. Although this beats random coloring, it is
still far from the offline bound.

In this work, we introduce a new framework that allows us to handle online vector
balancing even when the input distribution has dependencies across coordinates. In
particular, this lets us obtain a *poly*(*n*, log*T*) bound for online vector balancing under arbitrary input distributions, and a *polylog* (*T*) bound for online interval discrepancy. Our framework is powerful enough to capture
other well-studied geometric discrepancy problems; e.g., we obtain a *poly*(log^{d} (*T*)) bound for the online *d*-dimensional Tusnády’s problem. All our bounds are tight up to polynomial factors.

A key new technical ingredient in our work is an *anti-concentration* inequality for sums of pairwise uncorrelated random variables, which might also be
of independent interest.

Mehta and Panigrahi (FOCS 2012) introduce the problem of online matching with stochastic rewards, where edges are associated with success probabilities and a match succeeds with the probability of the corresponding edge. It is one of the few online matching problems that have defied the randomized online primal dual framework by Devanur, Jain, and Kleinberg (SODA 2013) thus far. This paper unlocks the power of randomized online primal dual in online matching with stochastic rewards by employing the configuration linear program rather than the standard matching linear program used in previous works. Our main result is a 0.572 competitive algorithm for the case of vanishing and unequal probabilities, improving the best previous bound of 0.534 by Mehta, Waggoner, and Zadimoghaddam (SODA 2015) and, in fact, is even better than the best previous bound of 0.567 by Mehta and Panigrahi (FOCS 2012) for the more restricted case of vanishing and equal probabilities. For vanishing and equal probabilities, we get a better competitive ratio of 0.576. Our results further generalize to the vertex-weighted case due to the intrinsic robustness of the randomized online primal dual analysis.

We study the resource augmented version of the *k*-server problem, also known as the *k*-server problem against weak adversaries or the (*h*,*k*)-server problem. In this setting, an online algorithm using *k* servers is compared to an offline algorithm using *h* servers, where *h* ≤ *k*. For uniform metrics, it has been known since the seminal work of Sleator and Tarjan
(1985) that for any є>0, the competitive ratio drops to a constant if *k*=(1+є) · *h*. This result was later generalized to weighted stars (Young 1994) and trees of bounded
depth (Bansal et al. 2017). The main open problem for this setting is whether a similar
phenomenon occurs on general metrics. We resolve this question negatively. With a
simple recursive construction, we show that the competitive ratio is at least Ω(loglog*h*), even as *k*→∞. Our lower bound holds for both deterministic and randomized algorithms. It also
disproves the existence of a competitive algorithm for the infinite server problem
on general metrics.

We give a pseudorandom generator that fools degree-*d* polynomial threshold functions over *n*-dimensional Gaussian space with seed length *d*^{O(logd)} · log*n*. All previous generators had a seed length with at least a 2^{d} dependence on *d*.

The key new ingredient is our *Local Hyperconcentration Theorem*, which shows that every degree-*d* Gaussian polynomial is hyperconcentrated almost everywhere at scale *d*^{−O(logd)}.

Randomness extraction is a fundamental problem that has been studied for over three decades. A well-studied setting assumes that one has access to multiple independent weak random sources, each with some entropy. However, this assumption is often unrealistic in practice. In real life, natural sources of randomness can produce samples with no entropy at all or with unwanted dependence. Motivated by this and applications from cryptography, we initiate a systematic study of randomness extraction for the class of adversarial sources defined as follows.

A weak source *X* of the form *X*_{1}, …, *X*_{N}, where each *X*_{i} is on n bits, is an (*N*,*K*,*n*,*k*)-source of locality *d* if the following hold: (1) Somewhere good sources: at least *K* of the *X*_{i}’s are independent, and each contains min-entropy at least *k*. We call these *X*_{i}’s good sources, and their locations are unknown. (2) Bounded dependence: each remaining
(bad) source can depend arbitrarily on at most *d* good sources.

We focus on constructing extractors with negligible error, in the regime where most
of the entropy is contained within a few sources instead of across many (i.e., *k* is at least polynomial in *K*). In this setting, even for the case of 0-locality, very little is known prior to
our work. For *d*=1, essentially no previous results are known. We present various new extractors for
adversarial sources in a wide range of parameters, and some of our constructions work
for locality *d*=*K*^{Ω(1)}. As an application, we also give improved extractors for small-space sources.

The class of adversarial sources generalizes several previously studied classes of sources, and our explicit extractor constructions exploit tools from recent advances in extractor machinery, such as two-source non-malleable extractors and low-error condensers. Thus, our constructions can be viewed as a new application of non-malleable extractors. In addition, our constructions combine the tools from extractor theory in a novel way through various sorts of explicit extremal hypergraphs. These connections leverage recent progress in combinatorics, such as improved bounds on cap sets and explicit constructions of Ramsey graphs, and may be of independent interest.

The motivation of this work is to extend the techniques of higher order random walks on simplicial complexes to analyze mixing times of Markov chains for combinatorial problems. Our main result is a sharp upper bound on the second eigenvalue of the down-up walk on a pure simplicial complex, in terms of the second eigenvalues of its links. We show some applications of this result in analyzing mixing times of Markov chains, including sampling independent sets of a graph and sampling common independent sets of two partition matroids.

Motivated by the Dikin walk, we develop aspects of the interior-point theory for sampling
in high dimension. Specifically, we introduce the notions of strong self-concordance
and symmetry for a barrier. These properties imply that the Dikin walk defined using
a strongly self-concordant barrier with symmetry parameter ν mixes in Õ(*n*ν) steps from a warm start for a convex body in ℝ^{n}. For many natural barriers, ν is roughly bounded by ν, the standard self-concordance
parameter. We also show that these properties hold for the Lee-Sidford barrier. As
a consequence, we obtain the first walk that mixes in Õ(*n*^{2}) steps for an arbitrary polytope in ℝ^{n}. Strong self-concordance for other barriers leads to an interesting (and unexpected)
connection — for the universal and entropic barriers, it is implied by the KLS conjecture.

A longstanding observation, which was partially proven by Li, Nguyen, and Woodruff in 2014, and extended by Ai, Hu, Li, and Woodruff in 2016, is that any turnstile streaming algorithm can be implemented as a linear sketch (the reverse is trivially true). We study the relationship between turnstile streaming and linear sketching algorithms in more detail, giving both new separations and new equivalences between the two models.

It was shown by Li, Nguyen, and Woodruff in 2014 that, if a turnstile algorithm works for arbitrarily long streams with arbitrarily large coordinates at intermediate stages of the stream, then the turnstile algorithm is equivalent to a linear sketch. We show separations of the opposite form: if either the stream length or the maximum value of the stream are substantially restricted, there exist problems where linear sketching is exponentially harder than turnstile streaming.

A further limitation of the Li, Nguyen, and Woodruff equivalence is that the turnstile sketching algorithm is neither explicit nor uniform, but requires an exponentially long advice string. We show how to remove this limitation for deterministic streaming algorithms: we give an explicit small-space algorithm that takes the streaming algorithm and computes an equivalent module.

Consider the following abstract coin tossing problem: Given a set of *n* coins with unknown biases, find the most biased coin using a minimal number of coin
tosses. This is a common abstraction of various exploration problems in theoretical
computer science and machine learning and has been studied extensively over the years.
In particular, algorithms with optimal sample complexity (number of coin tosses) have
been known for this problem for quite some time.

Motivated by applications to processing massive datasets, we study the space complexity of solving this problem with optimal number of coin tosses in the streaming model. In this model, the coins are arriving one by one and the algorithm is only allowed to store a limited number of coins at any point – any coin not present in the memory is lost and can no longer be tossed or compared to arriving coins. Prior algorithms for the coin tossing problem with optimal sample complexity are based on iterative elimination of coins which inherently require storing all the coins, leading to memory-inefficient streaming algorithms.

We remedy this state-of-affairs by presenting a series of improved streaming algorithms
for this problem: we start with a simple algorithm which require storing only *O*(log*n*) coins and then iteratively refine it further and further, leading to algorithms
with *O*(loglog(*n*)) memory, *O*((*n*)) memory, and finally a one that only stores a single extra coin in memory – the
same exact space needed to just store the best coin throughout the stream.

Furthermore, we extend our algorithms to the problem of finding the *k* most biased coins as well as other exploration problems such as finding top-*k* elements using noisy comparisons or finding an -best arm in stochastic multi-armed
bandits, and obtain efficient streaming algorithms for these problems.

Adaptive sampling is a useful algorithmic tool for data summarization problems in
the classical centralized setting, where the entire dataset is available to the single
processor performing the computation. Adaptive sampling repeatedly selects rows of
an underlying matrix *A*∈ℝ^{n× d}, where *n*≫ *d*, with probabilities proportional to their distances to the subspace of the previously
selected rows. Intuitively, adaptive sampling seems to be limited to trivial multi-pass
algorithms in the streaming model of computation due to its inherently sequential
nature of assigning sampling probabilities to each row only after the previous iteration
is completed. Surprisingly, we show this is not the case by giving the first one-pass
algorithms for adaptive sampling on turnstile streams and using space poly(*d*,*k*,log*n*), where *k* is the number of adaptive sampling rounds to be performed.

Our adaptive sampling procedure has a number of applications to various data summarization
problems that either improve state-of-the-art or have only been previously studied
in the more relaxed row-arrival model. We give the first relative-error algorithm
for column subset selection on turnstile streams. We show our adaptive sampling algorithm
also gives the first relative-error algorithm for subspace approximation on turnstile
streams that returns *k* noisy rows of *A*. The quality of the output can be improved to a (1+є)-approximation at the tradeoff
of a bicriteria algorithm that outputs a larger number of rows. We then give the first
algorithm for projective clustering on turnstile streams that uses space sublinear
in *n*. In fact, we use space poly(*d*,*k*,*s*,1/є,log*n*) to output a (1+є)-approximation, where *s* is the number of *k*-dimensional subspaces. Our adaptive sampling primitive also provides the first algorithm
for volume maximization on turnstile streams. We complement our volume maximization
algorithmic results with lower bounds that are tight up to lower order terms, even
for multi-pass algorithms. By a similar construction, we also obtain lower bounds
for volume maximization in the row-arrival model, which we match with competitive
upper bounds.

Previous work on tabulation hashing by Pǎtraşcu and Thorup from STOC’11 on simple
tabulation and from SODA’13 on twisted tabulation offered Chernoff-style concentration
bounds on hash based sums, e.g., the number of balls/keys hashing to a given bin,
but under some quite severe restrictions on the expected values of these sums. The
basic idea in tabulation hashing is to view a key as consisting of *c*=*O*(1) characters, e.g., a 64-bit key as *c*=8 characters of 8-bits. The character domain Σ should be small enough that character
tables of size |Σ| fit in fast cache. The schemes then use *O*(1) tables of this size, so the space of tabulation hashing is *O*(|Σ|). However, the concentration bounds by Pǎtraşcu and Thorup only apply if the
expected sums are ≪ |Σ|.

To see the problem, consider the very simple case where we use tabulation hashing
to throw *n* balls into *m* bins and want to analyse the number of balls in a given bin. With their concentration
bounds, we are fine if *n*=*m*, for then the expected value is 1. However, if *m*=2, as when tossing *n* unbiased coins, the expected value *n*/2 is ≫ |Σ| for large data sets, e.g., data sets that do not fit in fast cache.

To handle expectations that go beyond the limits of our small space, we need a much
more advanced analysis of simple tabulation, plus a new tabulation technique that
we call *tabulation-permutation* hashing which is at most twice as slow as simple tabulation. No other hashing scheme
of comparable speed offers similar Chernoff-style concentration bounds.

The three-in-a-tree problem is to determine if a simple undirected graph contains
an induced subgraph which is a tree connecting three given vertices. Based on a beautiful
characterization that is proved in more than twenty pages, Chudnovsky and Seymour
[*Combinatorica* 2010] gave the previously only known polynomial-time algorithm, running in *O*(*mn*^{2}) time, to solve the three-in-a-tree problem on an *n*-vertex *m*-edge graph. Their three-in-a-tree algorithm has become a critical subroutine in several
state-of-the-art graph recognition and detection algorithms.

In this paper we solve the three-in-a-tree problem in *O*(*m*log^{2} *n*) time, leading to improved algorithms for recognizing perfect graphs and detecting
thetas, pyramids, beetles, and odd and even holes. Our result is based on a new and
more constructive characterization than that of Chudnovsky and Seymour. Our new characterization
is stronger than the original, and our proof implies a new simpler proof for the original
characterization. The improved characterization gains the first factor *n* in speed. The remaining improvement is based on dynamic graph algorithms.

We resolve the fine-grained parameterized complexity of detecting and counting small
patterns in planar graphs, assuming the Exponential Time Hypothesis. Given an *n*-vertex planar graph *G* and a *k*-vertex pattern graph *P*, we compute the number of (induced) copies of *P* in *G* in time 2^{Õ(√k)}*n*^{O(1)}, if *P* is a matching, independent set, or connected bounded maximum degree graph, 2^{O(k/logk)}*n*^{O(1)}, for any pattern *P*. Our results significantly advance the state of the art of the planar graph isomorphism
problem: No 2^{o(k)}*n*^{O(1)} time algorithms where previously known for counting patterns even in very restricted
cases such as independent sets in subgraphs of grids, Even for detection, no 2^{O(k/logk)}*n*^{O(1)} time algorithms for unrestricted patterns *P* were previously known, Our run times cannot be improved assuming the Exponential
Time Hypothesis (ETH). Our results are a corollary of the following general result:
We compute the number of (induced) copies of *P* in *G* in 2^{Õ(√k)}(σ(*P*)*n*)^{O(1)} time, where σ(*P*) denotes the number of non-isomorphic separations of *P* of order Õ(√*k*).

To obtain the first sub-exponential time algorithms, we introduce a new general technique
that we call *efficient inclusion-exclusion*. This technique allows us to efficiently use hierarchies of separators for counting
problems. To resolve the optimal complexity of planar subgraph isomorphism we provide
another technique that we call *balanced cycle sparsification*. This technique allows us to obtain from one balanced cycle separator in a planar
graphs a family of balanced cycle separators that have limited mutual overlap, implying
one separator has few vertices of the pattern copy.

In the Disjoint Paths problem, the input is an undirected graph *G* on *n* vertices and a set of *k* vertex pairs, {*s*_{i},*t*_{i}}_{i=1}^{k}, and the task is to find *k* pairwise vertex-disjoint paths such that the *i*’th path connects *s*_{i} to *t*_{i}. In this paper, we give a parameterized algorithm with running time 2^{O(k2)}*n*^{O(1)} for Planar Disjoint Paths, the variant of the problem where the input graph is required
to be planar. Our algorithm is based on the unique linkage/treewidth reduction theorem
for planar graphs by Adler et al. [JCTB 2017], the algebraic co-homology based technique
developed by Schrijver [SICOMP 1994] for Disjoint Paths on directed planar graphs,
and one of the key combinatorial insights developed by Cygan et al. [FOCS 2013] in
their algorithm for Disjoint Paths on directed planar graphs. To the best of our knowledge
our algorithm is the first parameterized algorithm to exploit that the treewidth of
the input graph is small in a way completely different from the use of dynamic programming.

In the Topological Minor Deletion (TM-Deletion) problem, the input consists of an
undirected graph *G*, a family of undirected graphs *F* and an integer *k*. The task is to determine whether *G* contains a set of vertices *S* of size at most *k*, such that the graph *G*∖ *S* obtained from *G* by removing the vertices of *S*, contains no graph from *F* as a topological minor. We give an algorithm forTM-Deletion with running time *f*(*h*^{⋆},*k*)· |*V*(*G*)|^{4}. Here *h*^{⋆} is the maximum size of a graph in *F* and *f* is a computable function of *h*^{⋆} and *k*. This is the first fixed parameter tractable algorithm (FPT) for the problem. In
fact, even for the restricted case of planar inputs the first FPT algorithm was found
only recently by Golovach et al. [SODA 2020]. For this case we improve upon the algorithm
of Golovach et al. [SODA 2020] by designing an FPT algorithm with explicit dependence
on *k* and *h*^{⋆}.

We prove that for all constants *a*, *NQP* = *NTIME*[*n*^{polylog(n)}] cannot be (1/2 + 2^{−loga n})-approximated by 2^{loga n}-size *ACC*^{0} ∘ *THR* circuits ( *ACC*^{0} circuits with a bottom layer of *THR* gates). Previously, it was even open whether *E*^{ NP} can be (1/2+1/√*n*)-approximated by *AC*^{0}[⊕] circuits. As a straightforward application, we obtain an infinitely often ( *NE* ∩ *coNE*)_{/1}-computable pseudorandom generator for poly-size *ACC*^{0} circuits with seed length 2^{logєn}, for all є > 0.

More generally, we establish a connection showing that, for a typical circuit class
*C*, non-trivial nondeterministic algorithms estimating the acceptance probability of
a given *S*-size *C* circuit with an additive error 1/*S* (we call it a *CAPP* algorithm) imply strong (1/2 + 1/*n*^{ω(1)}) average-case lower bounds for nondeterministic time classes against *C* circuits. Note that the existence of such (deterministic) algorithms is much weaker
than the widely believed conjecture *PromiseBPP* = *PromiseP*.

We also apply our results to several sub-classes of *TC*^{0} circuits. First, we show that for all *k*, *NP* cannot be (1/2 + *n*^{−k})-approximated by *n*^{k}-size *Sum*∘ *THR* circuits (exact ℝ-linear combination of threshold gates), improving the corresponding
worst-case result in [Williams, CCC 2018]. Second, we establish strong average-case
lower bounds and build ( *NE* ∩ *coNE*)_{/1}-computable PRGs for *Sum* ∘ *PTF* circuits, for various regimes of degrees. Third, we show that non-trivial *CAPP* algorithms for *MAJ* ∘ *MAJ* indeed already imply worst-case lower bounds for *TC*_{3}^{0} ( *MAJ* ∘ *MAJ* ∘ *MAJ*). Since exponential lower bounds for *MAJ* ∘ *MAJ* are already known, this suggests *TC*_{3}^{0} lower bounds are probably within reach.

Our new results build on a line of recent works, including [Murray and Williams, STOC
2018], [Chen and Williams, CCC 2019], and [Chen, FOCS 2019]. In particular, it strengthens
the corresponding (1/2 + 1/*polylog*(*n*))-inapproximability average-case lower bounds in [Chen, FOCS 2019].

The two important technical ingredients are techniques from Cryptography in *NC*^{0} [Applebaum et al., SICOMP 2006], and Probabilistic Checkable Proofs of Proximity
with *NC*^{1}-computable proofs.

We establish several “sharp threshold” results for computational complexity. For certain
tasks, we can prove a resource lower bound of *n*^{c} for *c* ≥ 1 (or obtain an efficient circuit-analysis algorithm for *n*^{c} size), there is strong intuition that a similar result can be proved for larger functions
of *n*, yet we can also prove that replacing “*n*^{c}” with “*n*^{c+ε}” in our results, for any ε > 0, would imply a breakthrough *n*^{ω(1)} lower bound. We first establish such a result for Hardness Magnification. We prove (among other results) that for some *c*, the Minimum Circuit Size Problem for (log*n*)^{c}-size circuits on length-*n* truth tables (*MCSP*[(log*n*)^{c}]) does not have *n*^{2−o(1)}-size probabilistic formulas. We also prove that an *n*^{2+ε} lower bound for *MCSP*[(log*n*)^{c}] (for any ε > 0 and *c* ≥ 1) would imply major lower bound results, such as *NP* does not have *n*^{k}-size formulas for all *k*, and #*SAT* does not have log-depth circuits. Similar results hold for time-bounded Kolmogorov
complexity. Note that *cubic* size lower bounds are known for probabilistic De Morgan formulas (for other functions).
Next we show a sharp threshold for Quantified Derandomization (QD) of probabilistic formulas: (a) For all α, ε > 0, there is a deterministic polynomial-time
algorithm that finds satisfying assignments to every probabilistic formula of *n*^{2−2α−ε} size with at most 2^{nα} falsifying assignments. (b) If for some α, ε > 0, there is such an algorithm for
probabilistic formulas of *n*^{2−α+ε}-size and 2^{nα} unsatisfying assignments, then a *full derandomization of **NC*^{1} follows: a deterministic poly-time algorithm additively approximating the acceptance
probability of *any polynomial-size formula*. Consequently, *NP* does not have *n*^{k}-size formulas, for all *k*. Finally we show a sharp threshold result for Explicit Obstructions, inspired by Mulmuley’s notion of explicit obstructions from GCT. An *explicit obstruction against **S*(*n*)*-size formulas* is a poly-time algorithm *A* such that *A*(1^{n}) outputs a list {(*x*_{i},*f*(*x*_{i}))}_{i ∈ [poly(n)]} ⊆ {0,1}^{n} × {0,1}, and every *S*(*n*)-size formula *F* is inconsistent with the (partially defined) function *f*. We prove that for all ε > 0, there is an explicit obstruction against *n*^{2−ε}-size formulas, and prove that there is an explicit obstruction against *n*^{2+ε}-size formulas for some ε > 0 if and only if there is an explicit obstruction against
*all polynomial-size* formulas. This in turn is equivalent to the statement that *E* does not have 2^{o(n)}-size formulas, a breakthrough in circuit complexity.

Hegedűs’s lemma is the following combinatorial statement regarding polynomials over
finite fields. Over a field F of characteristic *p* > 0 and for *q* a power of *p*, the lemma says that any multilinear polynomial *P*∈ F[*x*_{1},…,*x*_{n}] of degree less than *q* that vanishes at all points in {0,1}^{n} of Hamming weight *k*∈ [*q*,*n*−*q*] must also vanish at all points in {0,1}^{n} of weight *k* + *q*. This lemma was used by Hegedűs (2009) to give a solution to *Galvin’s problem*, an extremal problem about set systems; by Alon, Kumar and Volk (2018) to improve
the best-known multilinear circuit lower bounds; and by Hrubeš, Ramamoorthy, Rao and
Yehudayoff (2019) to prove optimal lower bounds against depth-2 threshold circuits
for computing some symmetric functions. In this paper, we formulate a robust version
of Hegedűs’s lemma. Informally, this version says that if a polynomial of degree *o*(*q*) vanishes at most points of weight *k*, then it vanishes at many points of weight *k*+*q*. We prove this lemma and give the following three different applications.

1. Degree lower bounds for the coin problem: The δ*-Coin Problem* is the problem of distinguishing between a coin that is heads with probability ((1/2)
+ δ) and a coin that is heads with probability 1/2. We show that over a field of positive
(fixed) characteristic, any polynomial that solves the δ-coin problem with error ε
must have degree Ω(1/δlog(1/ε)), which is tight up to constant factors. 2. Probabilistic
degree lower bounds: The *Probabilistic degree* of a Boolean function is the minimum *d* such that there is a random polynomial of degree *d* that agrees with the function at each point with high probability. We give tight
lower bounds on the probabilistic degree of *every* symmetric Boolean function over positive (fixed) characteristic. As far as we know,
this was not known even for some very simple functions such as unweighted Exact Threshold
functions, and constant error. 3. A robust version of the combinatorial result of
Hegedűs (2009) mentioned above.

We consider the classical problem of maximizing a monotone submodular function subject to a cardinality constraint, which, due to its numerous applications, has recently been studied in various computational models. We consider a clean multi-player model that lies between the offline and streaming model, and study it under the aspect of one-way communication complexity. Our model captures the streaming setting (by considering a large number of players), and, in addition, two player approximation results for it translate into the robust setting. We present tight one-way communication complexity results for our model, which, due to the above-mentioned connections, have multiple implications in the data stream and robust setting.

Even for just two players, a prior information-theoretic hardness result implies that
no approximation factor above 1/2 can be achieved in our model, if only queries to
feasible sets, i.e., sets respecting the cardinality constraint, are allowed. We show
that the possibility of querying infeasible sets can actually be exploited to beat
this bound, by presenting a tight 2/3-approximation taking exponential time, and an
efficient 0.514-approximation. To the best of our knowledge, this is the first example
where querying a submodular function on infeasible sets leads to provably better results.
Through the above-mentioned link to the robust setting, both of these algorithms improve
on the current state-of-the-art for robust submodular maximization, showing that approximation
factors beyond 1/2 are possible. Moreover, exploiting the link of our model to streaming,
we settle the approximability for streaming algorithms by presenting a tight 1/2+ε
hardness result, based on the construction of a new family of coverage functions.
This improves on a prior 1−1/*e*+ε hardness and matches, up to an arbitrarily small margin, the best known approximation
algorithm.

We present the first distance sensitivity oracle (DSO) with subcubic preprocessing
time and poly-logarithmic query time for directed graphs with integer weights in the
range [−*M*,*M*].

Weimann and Yuster [FOCS 10] presented a distance sensitivity oracle for a single
vertex/edge failure with subcubic preprocessing time of *O*(*Mn*^{ω+1−α}) and subquadratic query time of Õ(*n*^{1+α}), where α is any parameter in [0,1], *n* is the number of vertices, *m* is the number of edges, the Õ(·) notation hides poly-logarithmic factors in *n* and ω<2.373 is the matrix multiplication exponent.

Later, Grandoni and Vassilevska Williams [FOCS 12] substantially improved the query
time to sublinear in *n*. In particular, they presented a distance sensitivity oracle for a single vertex/edge
failure with Õ(*Mn*^{ω+1/2}+ *Mn*^{ω+α(4−ω)}) preprocessing time and Õ(*n*^{1−α}) query time.

Despite the substantial improvement in the query time, it still remains polynomial
in the size of the graph, which may be undesirable in many settings where the graph
is of large scale. A natural question is whether one can hope for a distance sensitivity
oracle with subcubic preprocessing time and very fast query time (of poly-logarithmic
in *n*).

In this paper we answer this question affirmatively by presenting a distance sensitive
oracle supporting a single vertex/edge failure in subcubic Õ(*Mn*^{2.873}) preprocessing time for ω=2.373, Õ(*n*^{2.5}) space and near optimal query time of Õ(1).

For comparison, with the same Õ(*Mn*^{2.873}) preprocessing time the DSO of Grandoni and Vassilevska Williams has Õ(*n*^{0.693}) query time. In fact, the best query time their algorithm can obtain is (*Mn*^{0.385}) (with (*Mn*^{3}) preprocessing time).

Given a set *S* of *n* (distinct) keys from key space [*U*], each associated with a value from Σ, the *static dictionary* problem asks to preprocess these (key, value) pairs into a data structure, supporting
value-retrieval queries: for any given *x*∈ [*U*], *valRet*(*x*) must return the value associated with *x* if *x*∈ *S*, or return ⊥ if *x*∉ *S*. The special case where |Σ|=1 is called the *membership* problem. The “textbook” solution is to use a hash table, which occupies linear space
and answers each query in constant time. On the other hand, the minimum possible space
to encode all (key, value) pairs is only *OPT*:= ⌈lg_{2}(*U**n*)+*n*lg_{2}|Σ|⌉ bits, which could be much less.

In this paper, we design a randomized dictionary data structure using *OPT*+lg*n*+*O*(lglglglglg*U*) bits of space, and it has *expected constant* query time, assuming the query algorithm can access an external lookup table of size
*n*^{0.001}. The lookup table depends only on *U*, *n* and |Σ|, and not the input. Previously, even for membership queries and *U*≤ *n*^{O(1)}, the best known data structure with constant query time requires *OPT*+*n*/lg*n* bits of space by Pagh (SIAM J. Comput. 2001) and Pǎtraşcu (FOCS 2008); the best known
using *OPT*+*n*^{0.999} space has query time *O*(lg*n*); the only known non-trivial data structure with *OPT*+*n*^{0.001} space has *O*(lg*n*) query time and requires a lookup table of size ≥ *n*^{2.99} (!). Our new data structure answers open questions by Pǎtraşcu and Thorup.

Given an integer array *A*[1..*n*], the Range Minimum Query problem (RMQ) asks to preprocess *A* into a data structure, supporting RMQ queries: given *a*,*b*∈ [1,*n*], return the index *i*∈[*a*,*b*] that minimizes *A*[*i*], i.e., *argmin*_{i∈[a,b]} *A*[*i*]. This problem has a classic solution using *O*(*n*) space and *O*(1) query time by Gabow, Bentley, Tarjan (STOC, 1984) and Harel, Tarjan (SICOMP, 1984).
The best known data structure by Fischer, Heun (SICOMP, 2011) and Navarro, Sadakane
(TALG, 2014) uses 2*n*+*n*/(log*n*/*t*)^{t}+Õ(*n*^{3/4}) bits and answers queries in *O*(*t*) time, assuming the word-size is *w*=Θ(log*n*). In particular, it uses 2*n*+*n*/*poly*log*n* bits of space as long as the query time is a constant.

In this paper, we prove the first lower bound for this problem, showing that 2*n*+*n*/*poly*log*n* space is necessary for constant query time. In general, we show that if the data
structure has query time *O*(*t*), then it must use at least 2*n*+*n*/(log*n*)^{Õ(t2)} space, in the cell-probe model with word-size *w*=Θ(log*n*).

Given a collection of *n* points in ℝ^{d}, the goal of the (*k*,*z*)-clustering problem is to find a subset of *k* “centers” that minimizes the sum of the *z*-th powers of the Euclidean distance of each point to the closest center. Special
cases of the (*k*,*z*)-clustering problem include the *k*-median and *k*-means problems. Our main result is a unified two-stage importance sampling framework
that constructs an ε-coreset for the (*k*,*z*)-clustering problem. Compared to the results for (*k*,*z*)-clustering in [Feldman and Langberg, STOC 2011], our framework saves a ε^{2} *d* factor in the coreset size. Compared to the results for (*k*,*z*)-clustering in [Sohler and Woodruff, FOCS 2018], our framework saves a poly(*k*) factor in the coreset size and avoids the exp(*k*/ε) term in the construction time. Specifically, our coreset for *k*-median (*z*=1) has size Õ(ε^{−4} *k*) which, when compared to the result in [Sohler and Woodruff, STOC 2018], saves a
*k* factor in the coreset size. Our algorithmic results rely on a new dimensionality
reduction technique that connects two well-known shape fitting problems: subspace
approximation and clustering, and may be of independent interest. We also provide
a size lower bound of Ω(*k*· min{2^{z/20},*d* }) for a 0.01-coreset for (*k*,*z*)-clustering, which has a linear dependence of size on *k* and an exponential dependence on *z* that matches our algorithmic results.