Abstractions and algorithms are at the heart of computational thinking. In this talk I will discuss the evolution of the theory and practice of programming language and compiler design through the lens of computational thinking. Many of the key concepts in this area were introduced at the ACM Symposium on the Theory of Computing.

NP (search) problems admit easy correctness tests for solutions. Climbing algorithms additionally allow easy assessment of how close the configuration at any stage of a run is to yielding the correct answer. This offers great flexibility, since the soundness of any deviation from the standard procedure can be assessed instantly.

An example is the Dual Matrix Algorithm (DMA) for linear programming, variations of which were considered by A. Y. Levin in 1965 and by Yamnitsky and myself in 1982. It has little sensitivity to numerical errors and to the number of inequalities. It offers substantial flexibility and, thus, potential for further developments.

Counting the number of words of a certain length accepted by a non-deterministic finite automaton (NFA) is a fundamental problem, which has many applications in different areas such as graph databases, knowledge compilation, and information extraction. Along with this, generating such words uniformly at random is also a relevant problem, particularly in scenarios where returning varied outputs is a desirable feature.

The previous problems are formalized as follows. The input of #NFA is an NFA N and a length k given in unary (that is, given as a string 0^k), and the task is to compute the number of strings of length k accepted by N. The input of GEN-NFA is the same as for #NFA, but now the task is to generate, uniformly at random, a string of length k accepted by N.
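As a concrete (exponential-time) illustration of the two tasks, the following sketch counts the accepted length-k strings over {0,1} by direct enumeration; the transition-function encoding and state names are illustrative choices, and this is emphatically not the FPRAS of the talk.

```python
from itertools import product

def count_nfa(delta, start, accept, k):
    """Brute-force #NFA: count (and collect) the length-k binary strings
    accepted by an NFA. Exponential in k -- an illustration of the task,
    not the efficient approximation scheme.

    delta: dict (state, symbol) -> set of successor states
    start, accept: sets of initial / accepting states
    """
    accepted = []
    for w in product("01", repeat=k):
        states = set(start)
        for c in w:
            # one nondeterministic step: union of successors
            states = set().union(*(delta.get((q, c), set()) for q in states))
        if states & accept:
            accepted.append("".join(w))
    return len(accepted), accepted
```

With the list in hand, GEN-NFA's goal corresponds to `random.choice(accepted)`; the point of the results above is to achieve both tasks without this enumeration.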

It is known that #NFA is #P-complete, so an efficient algorithm to compute this function exactly is not expected to exist. However, this does not preclude the existence of an efficient approximation algorithm for it. In this talk, we will show that #NFA admits a fully polynomial-time randomized approximation scheme (FPRAS). Prior to our work, it was open whether #NFA admits an FPRAS; in fact, the best randomized approximation scheme known for #NFA ran in time n^O(log(n)).

In addition, we will mention some consequences and applications of our results. In particular, from well-known results on counting and uniform generation, we obtain that GEN-NFA admits a fully polynomial-time almost uniform generator. Moreover, as #NFA is SpanL-complete under polynomial-time parsimonious reductions, we obtain that every function in the complexity class SpanL admits an FPRAS.

The problem of chasing convex functions is easy to state: faced with a sequence of convex functions *f*_{t} over *d*-dimensional Euclidean space, the goal of the algorithm is to output a point *x*_{t} at each time, so that the sum of the function costs *f*_{t}(*x*_{t}) plus the movement costs ||*x*_{t} − *x*_{t−1}|| is minimized. This problem generalizes questions in online algorithms such as caching and the *k*-server problem. In 1994, Friedman and Linial posed the question of obtaining an algorithm with a competitive ratio that depends only on the dimension *d*. In this talk we give an *O*(*d*)-competitive algorithm, based on the notion of the Steiner point of a convex body.
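The Steiner point that drives the algorithm admits a simple Monte Carlo estimate: it equals the expectation, over a uniformly random unit direction, of the point of the body maximizing that direction. A hedged numpy sketch for a polytope given by its vertices (the sampling scheme and sample count are illustrative choices, not the paper's algorithm):

```python
import numpy as np

def steiner_point(vertices, n_samples=20000, seed=0):
    """Monte Carlo estimate of the Steiner point of a polytope.

    Uses the identity st(K) = E_theta[argmax_{x in K} <theta, x>]
    for theta uniform on the unit sphere; for a polytope the argmax
    is attained at a vertex (ties occur with probability zero).
    """
    rng = np.random.default_rng(seed)
    V = np.asarray(vertices, dtype=float)
    d = V.shape[1]
    thetas = rng.normal(size=(n_samples, d))
    thetas /= np.linalg.norm(thetas, axis=1, keepdims=True)
    # maximizing vertex for each sampled direction, then average
    best = V[np.argmax(thetas @ V.T, axis=1)]
    return best.mean(axis=0)
```

For a centrally symmetric body such as a square, the estimate concentrates around the center, as the definition predicts.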

The Neural Tangent Kernel is a new way to understand gradient descent in deep neural networks, connecting them with kernel methods. In this talk, I'll introduce this formalism, give a number of results on the Neural Tangent Kernel, and explain how they give us insight into the dynamics of neural networks during training and into their generalization properties.
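One way to make the formalism concrete is to compute the empirical NTK of a finite-width network at initialization, K(x, x′) = ⟨∇_θ f(x), ∇_θ f(x′)⟩, which concentrates around the deterministic NTK as the width grows. A sketch for a one-hidden-layer ReLU network (the architecture, width, and initialization scheme are illustrative assumptions):

```python
import numpy as np

def empirical_ntk(X, m=5000, seed=0):
    """Empirical NTK of f(x) = (1/sqrt(m)) sum_j a_j relu(w_j . x)
    at random initialization, computed from the analytic gradients
    with respect to both layers."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(size=(m, d))           # hidden-layer weights
    a = rng.choice([-1.0, 1.0], size=m)   # output weights
    pre = X @ W.T                         # (n, m) pre-activations
    act = np.maximum(pre, 0.0)            # d f / d a_j, up to 1/sqrt(m)
    ind = (pre > 0).astype(float)         # ReLU derivative
    # hidden-layer contribution: sum_j a_j^2 1[.] 1[.] <x, x'>
    K_w = ((ind * a) @ (ind * a).T) * (X @ X.T)
    K_a = act @ act.T                     # output-layer contribution
    return (K_w + K_a) / m
```

For unit inputs, K(x, x) ≈ E[relu(g)²] + 1/2 = 1 in the infinite-width limit, and the finite-width estimate should be close to that value.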

Algorithms are increasingly used to aid, or in some cases supplant, human decision-making, particularly for decisions that hinge on predictions. As a result, two features beyond prediction quality have generated interest: (i) to facilitate human interaction with and understanding of these algorithms, we desire prediction functions that are in some fashion simple or interpretable; and (ii) because they influence consequential decisions, we also want them to produce equitable allocations. We develop a formal model to explore the relationship between the demands of simplicity and equity. Although the two concepts appear to be motivated by qualitatively distinct goals, we show a fundamental inconsistency between them. Specifically, we formalize a general framework for producing simple prediction functions, and in this framework we establish two basic results. First, every simple prediction function is strictly improvable: there exists a more complex prediction function that is both strictly more efficient and also strictly more equitable. Put another way, using a simple prediction function both reduces utility for disadvantaged groups and reduces overall welfare relative to other options. Second, we show that simple prediction functions necessarily create incentives to use information about individuals' membership in a disadvantaged group --- incentives that were not present before simplification, and that work against these individuals. Thus, simplicity transforms disadvantage into bias against the disadvantaged group. Our results are not only about algorithms but about any process that produces simple models, and as such they connect to the psychology of stereotypes and to an earlier economics literature on statistical discrimination.

We reconsider the black hole firewall puzzle, emphasizing that quantum error-correction, computational complexity, and pseudorandomness are crucial concepts for understanding the black hole interior. We assume that the Hawking radiation emitted by an old black hole is pseudorandom, meaning that it cannot be distinguished from a perfectly thermal state by any efficient quantum computation acting on the radiation alone. We then infer the existence of a subspace of the radiation system which we interpret as an encoding of the black hole interior. This encoded interior is entangled with the late outgoing Hawking quanta emitted by the old black hole, and is inaccessible to computationally bounded observers who are outside the black hole. Specifically, efficient operations acting on the radiation, those with quantum computational complexity polynomial in the entropy of the remaining black hole, commute with a complete set of logical operators acting on the encoded interior, up to corrections which are exponentially small in the entropy. Thus, under our pseudorandomness assumption, the black hole interior is well protected from exterior observers as long as the remaining black hole is macroscopic. On the other hand, if the radiation is not pseudorandom, an exterior observer may be able to create a firewall by applying a polynomial-time quantum computation to the radiation.

We give a new proof of the "transfer theorem" underlying adaptive data analysis: that any mechanism for answering adaptively chosen statistical queries that is differentially private and sample-accurate is also accurate out-of-sample. Our new proof is elementary and gives structural insights that we expect will be useful elsewhere. We show: 1) that differential privacy ensures that the expectation of any query on the conditional distribution on datasets induced by the transcript of the interaction is close to its true value on the data distribution, and 2) sample accuracy on its own ensures that any query answer produced by the mechanism is close to its conditional expectation with high probability. This second claim follows from a thought experiment in which we imagine that the dataset is resampled from the conditional distribution after the mechanism has committed to its answers. The transfer theorem then follows by summing these two bounds. An upshot of our new proof technique is that the concrete bounds we obtain are substantially better than the best previously known bounds.

This talk is about scheduling and load balancing in a multi-server system, with the goal of minimizing mean response time in a general stochastic setting. We will specifically concentrate on the common case of a load balancing system, where a front-end load balancer (a.k.a. dispatcher) dispatches requests to multiple back-end servers, each with its own queue. Much is known about load balancing in the case where the scheduling at the servers is First-Come-First-Served (FCFS). However, to minimize mean response time, we need to use Shortest-Remaining-Processing-Time (SRPT) scheduling at the servers. Unfortunately, almost nothing is known about optimal dispatching when SRPT scheduling is used at the servers. To make things worse, it turns out that the traditional dispatching policies that are used in practice with FCFS servers often have poor performance in systems with SRPT servers. In this talk, we devise a simple fix that can be applied to any dispatching policy. This fix, called "guardrails," ensures that the dispatching policy yields optimal mean response time under heavy traffic when used in a system with SRPT servers. Any dispatching policy, when augmented with guardrails, becomes heavy-traffic optimal. Our results also yield the first analytical bounds on mean response time for load balancing systems with SRPT scheduling at the servers. Load balancing and scheduling are highly studied in both the stochastic and the worst-case scheduling communities. One aim of this talk is to contrast some differences in the approaches of the two communities when tackling multi-server scheduling problems.
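For reference, here is a minimal event-driven simulation of SRPT at a single server, the per-server policy the talk builds on; the job list is an illustrative toy, and the guardrails themselves are not reproduced here.

```python
import heapq

def srpt_mean_response(jobs):
    """Mean response time of a single server under SRPT scheduling.

    jobs: list of (arrival_time, size) pairs. The job with the shortest
    remaining processing time always runs, preempting on arrivals.
    """
    jobs = sorted(jobs)
    t, i, responses = 0.0, 0, []
    active = []  # min-heap of [remaining_work, arrival_time]
    while i < len(jobs) or active:
        if not active:                 # server idle: jump to next arrival
            t = max(t, jobs[i][0])
        while i < len(jobs) and jobs[i][0] <= t:
            heapq.heappush(active, [jobs[i][1], jobs[i][0]])
            i += 1
        rem, arrival = active[0]       # shortest remaining time first
        next_arrival = jobs[i][0] if i < len(jobs) else float("inf")
        run = min(rem, next_arrival - t)   # run until done or preempted
        t += run
        active[0][0] -= run            # decrease-key at the root is safe
        if active[0][0] <= 1e-12:
            heapq.heappop(active)
            responses.append(t - arrival)
    return sum(responses) / len(responses)
```

On the instance [(0, 3), (1, 1)] the size-1 job preempts at time 1, so the responses are 1 and 4, illustrating how SRPT favors short jobs to shrink the mean.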

A fundamental result in statistical learning theory is the equivalence of PAC learnability of a class with the finiteness of its Vapnik-Chervonenkis dimension. However, this clean result applies only to binary classification problems. In search of a similar combinatorial characterization of learnability in a more general setting, we discovered a surprising independence from set theory for a basic general notion of learnability. Consider the following statistical estimation problem: given a family F of real-valued random variables over some domain X and an i.i.d. sample drawn from an unknown distribution P over X, find f in F such that its expectation w.r.t. P is close to the supremum expectation over all members of F. This Expectation Maximization (EMX) problem captures many well-studied learning problems. Surprisingly, we show that the EMX learnability of some simple classes depends on the cardinality of the continuum and is therefore independent of the ZFC axioms of set theory. Our results imply that there exists no "finitary" combinatorial parameter that characterizes EMX learnability in a way similar to the VC-dimension characterization of binary classification learnability.

Log-concave polynomials give rise to discrete probability distributions with several nice properties. In particular, log-concavity of the generating polynomial guarantees the existence of efficient algorithms for approximately sampling from a distribution and finding the size of its support. This class of distributions contains several important examples, including uniform measures over bases or independent sets of matroids, determinantal point processes and strongly Rayleigh measures, measures defined by mixed volumes in Minkowski sums, the random cluster model in certain regimes, and more. In this tutorial, we will introduce the theory and applications of log-concave polynomials and survey some of the recent developments in this area.

I will describe recent progress in determining the asymptotic behavior of random constraint satisfaction problems, including the independent set problem on random graphs, random regular NAE-SAT, and random SAT. The results include sharp phase transitions and some understanding of solution geometry, particularly in the setting of the random regular NAE-SAT problem. In this lecture I will survey the physics heuristics and explain how they lead to combinatorial models for the solution geometry, which form the basis of mathematical approaches to these problems. As time allows, I will discuss some of the mathematical techniques that have been introduced, particularly with regard to solving certain non-convex optimization problems that arise in moment method calculations.

We study discrepancy minimization for vectors in ℝ^{n} under various settings. The main result is the analysis of a new simple random process
in high dimensions through a comparison argument. As corollaries, we obtain bounds
which are tight up to logarithmic factors for online vector balancing against oblivious
adversaries, resolving several questions posed by Bansal, Jiang, Singla, and Sinha
(STOC 2020), as well as a linear time algorithm for logarithmic bounds for the Komlós conjecture.
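To make the online vector balancing setting concrete, here is the classical greedy baseline that signs each arriving vector so as to keep the running signed sum short. This is shown only as an illustration of the problem; the paper analyzes a different, randomized process and proves much stronger guarantees.

```python
import numpy as np

def greedy_signs(vectors):
    """Online vector balancing via the greedy rule: choose the sign that
    minimizes the norm of the new prefix sum (equivalently, the sign
    s = -sign(<prefix, v>))."""
    prefix = np.zeros(len(vectors[0]))
    signs = []
    for v in vectors:
        v = np.asarray(v, dtype=float)
        s = 1 if np.linalg.norm(prefix + v) <= np.linalg.norm(prefix - v) else -1
        signs.append(s)
        prefix += s * v
    return signs, prefix
```

Against oblivious adversaries the interesting question, addressed by the paper, is how large the prefix sums must grow for any online rule.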

We prove that for any distinct *x*,*y* ∈ {0,1}^{n}, there is a deterministic finite automaton with *O*(*n*^{1/3}) states that accepts *x* but not *y*. This improves Robson’s 1989 bound of *O*(*n*^{2/5}). Using a similar complex analytic technique, we improve the upper bound on worst
case trace reconstruction, showing that any unknown string *x* ∈ {0,1}^{n} can be reconstructed with high probability from exp(*O*(*n*^{1/5})) independently generated traces.
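The trace reconstruction model referenced here is easy to state: each trace is an independent pass of *x* through the deletion channel. A sketch of the channel (the deletion probability q is a model parameter; the exp(*O*(*n*^{1/5}))-trace reconstruction algorithm itself is far more involved and not reproduced here):

```python
import random

def trace(x, q, rng=random):
    """One trace of the string x through the deletion channel: each bit
    is deleted independently with probability q, and the survivors are
    concatenated in order."""
    return "".join(b for b in x if rng.random() >= q)
```

The reconstruction task is to recover *x* with high probability from many independent such traces.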

For some є > 10^{−36} we give a randomized (3/2−є)-approximation algorithm for metric TSP.

We study search problems that can be solved by performing Gradient Descent on a bounded
convex polytopal domain and show that this class is equal to the intersection of two
well-known classes: PPAD and PLS. As our main underlying technical contribution, we
show that computing a Karush-Kuhn-Tucker (KKT) point of a continuously differentiable
function over the domain [0,1]^{2} is PPAD ∩ PLS-complete. This is the first natural problem to be shown complete for
this class. Our results also imply that the class CLS (Continuous Local Search) -
which was defined by Daskalakis and Papadimitriou as a more “natural” counterpart
to PPAD ∩ PLS and contains many interesting problems - is itself equal to PPAD ∩ PLS.

Indistinguishability obfuscation, introduced by [Barak et al., Crypto 2001], aims to compile programs into unintelligible ones while preserving functionality. It is a fascinating and powerful object that has been shown to enable a host of new cryptographic goals and beyond. However, constructions of indistinguishability obfuscation have remained elusive, with all other proposals relying on heuristics or newly conjectured hardness assumptions. In this work, we show how to construct indistinguishability obfuscation from the subexponential hardness of four well-founded assumptions. We prove:

Informal Theorem: Let *τ* ∈ (0,∞), *δ* ∈ (0,1), *є* ∈ (0,1) be arbitrary constants. Assume sub-exponential security of the following assumptions:

- the Learning With Errors (**LWE**) assumption with subexponential modulus-to-noise ratio 2^{k^{є}} and noises of magnitude polynomial in *k*, where *k* is the dimension of the **LWE** secret,

- the Learning Parity with Noise (**LPN**) assumption over general prime fields ℤ_{p} with polynomially many **LPN** samples and error rate 1/ℓ^{δ}, where *ℓ* is the dimension of the **LPN** secret,

- the existence of a Boolean Pseudo-Random Generator (**PRG**) in **NC**^{0} with stretch *n*^{1+τ}, where *n* is the length of the **PRG** seed,

- the Decision Linear (**DLIN**) assumption on symmetric bilinear groups of prime order.

Then, (subexponentially secure) indistinguishability obfuscation for all polynomial-size circuits exists. Further, assuming only polynomial security of the aforementioned assumptions, there exists collusion-resistant public-key functional encryption for all polynomial-size circuits.

Motivated by practical needs such as large-scale learning, we study the impact of adaptivity constraints on linear contextual bandits, a central problem in online learning and decision making. We consider two popular limited-adaptivity models from the literature:
batch learning and rare policy switches. We show that, when the context vectors are
adversarially chosen in *d*-dimensional linear contextual bandits, the learner needs *O*(*d* log*d* log*T*) policy switches to achieve the minimax-optimal regret, and this is optimal up to
*poly*(log*d*, loglog*T*) factors; for stochastic context vectors, even in the more restricted batch learning
model, only *O*(loglog*T*) batches are needed to achieve the optimal regret. Together with known results in the literature, our results present a complete picture of the adaptivity constraints in linear contextual bandits. Along the way, we propose the distributional optimal design, a natural extension of optimal experiment design, and provide a statistically and computationally efficient learning algorithm for the problem, which may be of independent interest.
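A standard low-switching device in linear bandits, which conveys the flavor of rare policy switches, is to recompute the policy only when the determinant of the design matrix has grown by a constant factor. A hedged sketch (the trigger constant, the identity regularizer, and the placeholder for the policy update are illustrative; this is not the paper's exact algorithm):

```python
import numpy as np

def rare_switch_count(contexts, c=2.0):
    """Determinant-doubling trigger for rare policy switches.

    Recompute the (here hypothetical) exploration policy only when
    det(V_t) has grown by a factor c since the last switch, where
    V_t = I + sum of x x^T over contexts played so far."""
    d = len(contexts[0])
    V = np.eye(d)
    last_det, switches = 1.0, 0
    for x in contexts:
        x = np.asarray(x, dtype=float)
        V += np.outer(x, x)
        det = np.linalg.det(V)
        if det >= c * last_det:
            switches += 1          # <- policy would be recomputed here
            last_det = det
    return switches
```

Since det(V_T) grows at most polynomially in *T*, this trigger fires only a logarithmic number of times, which is the source of the low switching cost.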

We study the problem of PAC learning homogeneous halfspaces with Tsybakov noise. In the Tsybakov noise model, the label of every example is independently flipped with an adversarially controlled probability that can be arbitrarily close to 1/2 for a fraction of the examples. We give the first polynomial-time algorithm for this fundamental learning problem. Our algorithm learns the true halfspace within any desired accuracy and succeeds under a broad family of well-behaved distributions including log-concave distributions.

This extended abstract is a merge of two papers. In an earlier work, a subset of the authors developed an efficient reduction from learning to certifying the non-optimality of a candidate halfspace and gave a quasi-polynomial-time certificate algorithm. In a subsequent work, the authors of this paper developed a polynomial-time certificate algorithm.

We obtain robust and computationally efficient estimators for learning several linear models that achieve a statistically optimal convergence rate under minimal distributional
assumptions. Concretely, we assume our data is drawn from a *k*-hypercontractive distribution and an є-fraction is adversarially corrupted. We then
describe an estimator that converges to the optimal least-squares minimizer for the
true distribution at a rate proportional to є^{2−2/k}, when the noise is independent of the covariates. We note that no such estimator
was known prior to our work, even with access to unbounded computation. The rate we
achieve is information-theoretically optimal and thus we resolve the main open question
in Klivans, Kothari and Meka [COLT’18].

Our key insight is to identify an analytic condition that serves as a polynomial relaxation
of independence of random variables. In particular, we show that when the moments
of the noise and covariates are negatively-correlated, we obtain the same rate as
independent noise. Further, when the condition is not satisfied, we obtain a rate
proportional to є^{2−4/k}, and again match the information-theoretic lower bound. Our central technical contribution
is to algorithmically exploit independence of random variables in the ”sum-of-squares”
framework by formulating it as the aforementioned polynomial inequality.

This paper studies the statistical query (SQ) complexity of estimating *d*-dimensional submanifolds in ℝ^{n}. We propose a purely geometric algorithm called Manifold Propagation that reduces
the problem to three natural geometric routines: projection, tangent space estimation,
and point detection. We then provide constructions of these geometric routines in
the SQ framework. Given an adversarial *STAT*(τ) oracle and a target Hausdorff distance precision ε = Ω(τ^{2/(d+1)}), the resulting SQ manifold reconstruction algorithm has query complexity *O*(*n* polylog(*n*) ε^{−d/2}), which is proved to be nearly optimal. In the process, we establish low-rank matrix
completion results for SQ’s and lower bounds for randomized SQ estimators in general
metric spaces.

Modern machine learning models are complex and frequently encode surprising amounts of information about individual inputs. In extreme cases, complex models appear to memorize entire input examples, including seemingly irrelevant information (social security numbers from text, for example). In this paper, we aim to understand whether this sort of memorization is necessary for accurate learning. We describe natural prediction problems in which every sufficiently accurate training algorithm must encode, in the prediction model, essentially all the information about a large subset of its training examples. This remains true even when the examples are high-dimensional and have entropy much higher than the sample size, and even when most of that information is ultimately irrelevant to the task at hand. Further, our results do not depend on the training algorithm or the class of models used for learning.

Our problems are simple and fairly natural variants of the next-symbol prediction and the cluster labeling tasks. These tasks can be seen as abstractions of text- and image-related prediction problems. To establish our results, we reduce from a family of one-way communication problems for which we prove new information complexity lower bounds.

We show that *n*-variable tree-structured Ising models can be learned computationally efficiently to within total variation distance є from an optimal number of *O*(*n* ln*n*/є^{2}) samples, where *O*(·) hides an absolute constant which, importantly, does not depend on the model being learned—neither its tree structure nor the magnitude of its edge strengths, on which we place
no assumptions. Our guarantees hold, in fact, for the celebrated Chow-Liu algorithm
[1968], using the plug-in estimator for estimating mutual information. While this
(or any other) algorithm may fail to identify the structure of the underlying model
correctly from a finite sample, we show that it will still learn a tree-structured
model that is є-close to the true one in total variation distance, a guarantee called
“proper learning.”

Our guarantees do not follow from known results for the Chow-Liu algorithm and the ensuing literature on learning graphical models, including the very recent renaissance of algorithms for this learning challenge, which only yield asymptotic consistency results, or sample-suboptimal and/or time-inefficient algorithms, unless further assumptions are placed on the model, such as bounds on the “strengths” of the model's edges. While we establish guarantees for a widely known and simple algorithm, the analysis showing that this algorithm succeeds and is sample-optimal is quite complex, requiring a hierarchical classification of the edges into layers with different reconstruction guarantees, depending on their strength, combined with delicate uses of the subadditivity of the squared Hellinger distance over graphical models to control the error accumulation.
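The Chow-Liu algorithm analyzed in these works is short enough to state in code: estimate all pairwise mutual informations with the plug-in estimator, then return a maximum-weight spanning tree. A minimal sketch (Prim's algorithm; the alphabet handling and tie-breaking are illustrative choices):

```python
import numpy as np
from itertools import combinations

def chow_liu_tree(samples):
    """Chow-Liu with the plug-in mutual-information estimator.

    samples: (m, n) integer array, one row per sample, one column per
    variable. Returns the n-1 edges of a maximum-weight spanning tree
    under the estimated pairwise mutual informations."""
    m, n = samples.shape

    def plug_in_mi(a, b):
        mi = 0.0
        for va in np.unique(a):
            for vb in np.unique(b):
                pab = np.mean((a == va) & (b == vb))
                pa, pb = np.mean(a == va), np.mean(b == vb)
                if pab > 0:
                    mi += pab * np.log(pab / (pa * pb))
        return mi

    w = {(i, j): plug_in_mi(samples[:, i], samples[:, j])
         for i, j in combinations(range(n), 2)}
    in_tree, edges = {0}, []
    while len(in_tree) < n:  # Prim: repeatedly add the heaviest crossing edge
        i, j = max(((i, j) for i, j in w
                    if (i in in_tree) != (j in in_tree)),
                   key=lambda e: w[e])
        edges.append((i, j))
        in_tree |= {i, j}
    return edges
```

On data where one pair of variables is perfectly correlated and a third is independent, the tree correctly picks up the correlated pair as an edge.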

We provide finite sample guarantees for the classical Chow-Liu algorithm (IEEE Trans. Inform. Theory,
1968) to learn a tree-structured graphical model of a distribution. For a distribution
*P* on Σ^{n} and a tree *T* on *n* nodes, we say *T* is an ε-approximate tree for *P* if there is a *T*-structured distribution *Q* such that *D*(*P* || *Q*) is at most ε more than that of the best possible tree-structured distribution for *P*. We show that if *P* itself is tree-structured, then the Chow-Liu algorithm with the plug-in estimator
for mutual information with *O*(|Σ|^{3} *n*ε^{−1}) i.i.d. samples outputs an ε-approximate tree for *P* with constant probability. In contrast, for a general *P* (which may not be tree-structured), Ω(*n*^{2}ε^{−2}) samples are necessary to find an ε-approximate tree. Our upper bound is based on
a new conditional independence tester that addresses an open problem posed by Canonne,
Diakonikolas, Kane, and Stewart (STOC, 2018): we prove that for three random variables
*X*,*Y*,*Z* each over Σ, testing if *I*(*X*; *Y* ∣ *Z*) is 0 or ≥ ε is possible with *O*(|Σ|^{3}/ε) samples. Finally, we show that for a specific tree *T*, with *O*(|Σ|^{2}*n*ε^{−1}) samples from a distribution *P* over Σ^{n}, one can efficiently learn the closest *T*-structured distribution in KL divergence by applying the add-1 estimator at each
node.

There have been two main lines of work on estimating Ising models: (1) estimating them from multiple independent samples under minimal assumptions about the model's interaction matrix; and (2) estimating them from one sample in restrictive settings. We propose a unified framework that smoothly interpolates between these two settings, enabling significantly richer estimation guarantees from one, a few, or many samples.

Our main theorem provides guarantees for one-sample estimation, quantifying the estimation error in terms of the metric entropy of a family of interaction matrices. As corollaries of our main theorem, we derive bounds when the model's interaction matrix is a (sparse) linear combination of known matrices, or it belongs to a finite set, or to a high-dimensional manifold. In fact, our main result handles multiple independent samples by viewing them as one sample from a larger model, and can be used to derive estimation bounds that are qualitatively similar to those obtained in the afore-described multiple-sample literature. Our technical approach benefits from sparsifying a model's interaction network, conditioning on subsets of variables that make the dependencies in the resulting conditional distribution sufficiently weak. We use this sparsification technique to prove strong concentration and anti-concentration results for the Ising model, which we believe have applications beyond the scope of this paper.

Given a metric space, the (*k*,*z*)-clustering problem consists of finding *k* centers such that the sum, over all points, of the distance of each point to its closest center raised to the power *z* is minimized. This encapsulates the famous *k*-median (*z*=1) and *k*-means (*z*=2) clustering problems. Designing small-space sketches of the data that approximately preserve the cost of solutions, also known as *coresets*, has been an important research direction over the last 15 years. In this paper, we present a new, simple coreset framework that simultaneously improves upon the best known bounds for a large variety of settings, ranging from Euclidean spaces and doubling metrics to minor-free and general metrics.
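For concreteness, the (*k*,*z*)-clustering objective in the Euclidean case is a one-liner; a coreset is then a small weighted point set whose weighted version of this cost approximates the full cost simultaneously for every candidate set of *k* centers.

```python
import numpy as np

def kz_cost(points, centers, z):
    """(k,z)-clustering objective: sum over points of (distance to the
    nearest center)^z. z=1 gives k-median, z=2 gives k-means."""
    P = np.asarray(points, dtype=float)
    C = np.asarray(centers, dtype=float)
    # pairwise distances, then nearest center per point
    d = np.linalg.norm(P[:, None, :] - C[None, :, :], axis=2)
    return float((d.min(axis=1) ** z).sum())
```

The coreset guarantee asks that this quantity be preserved up to a (1±ε) factor for all center sets at once, which is what makes constructing small coresets nontrivial.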

In this paper we prove that the sample complexity of properly learning a class of
Littlestone dimension *d* with approximate differential privacy is Õ(*d*^{6}), ignoring privacy and accuracy parameters. This result answers a question of Bun
et al. (FOCS 2020) by improving upon their upper bound of 2^{O(d)} on the sample complexity. Prior to our work, finiteness of the sample complexity
for privately learning a class of finite Littlestone dimension was only known for
improper private learners, and the fact that our learner is proper answers another
question of Bun et al., which was also asked by Bousquet et al. (NeurIPS 2020). Using
machinery developed by Bousquet et al., we then show that the sample complexity of
sanitizing a binary hypothesis class is at most polynomial in its Littlestone dimension
and dual Littlestone dimension. This implies that a class is sanitizable if and only
if it has finite Littlestone dimension. An important ingredient of our proofs is a
new property of binary hypothesis classes that we call irreducibility, which may be
of independent interest.

Let *f*: {0, 1}^{n} → {0, 1} be a boolean function, and let *f*_{∧}(*x*, *y*) = *f*(*x* ∧ *y*) denote the *AND-function* of *f*, where *x* ∧ *y* denotes bit-wise AND. We study the deterministic communication complexity of *f*_{∧} and show that, up to a log*n* factor, it is bounded by a polynomial in the logarithm of the real rank of the communication
matrix of *f*_{∧}. This comes within a log*n* factor of establishing the log-rank conjecture for AND-functions with *no assumptions* on *f*. Our result stands in contrast with previous results on special cases of the log-rank
conjecture, which needed significant restrictions on *f* such as monotonicity or low F_{2}-degree. Our techniques can also be used to prove (within a log*n* factor) a *lifting theorem* for AND-functions, stating that the deterministic communication complexity of *f*_{∧} is polynomially related to the *AND-decision tree complexity* of *f*.

The results rely on a new structural result regarding boolean functions *f*: {0, 1}^{n} → {0, 1} with a sparse polynomial representation, which may be of independent interest.
We show that if the polynomial computing *f* has few monomials then the set system of the monomials has a small hitting set, of
size poly-logarithmic in its sparsity. We also establish extensions of this result to multi-linear polynomials *f* over {0, 1}^{n} with a larger range.

We show that algebraic proofs are hard to find: Given an unsatisfiable CNF formula *F*, it is *NP*-hard to find a refutation of *F* in the Nullstellensatz, Polynomial Calculus, or Sherali–Adams proof systems in time
polynomial in the size of the shortest such refutation. Our work extends, and gives
a simplified proof of, the recent breakthrough of Atserias and Müller (JACM 2020)
that established an analogous result for Resolution.

We show unconditionally that Cook’s theory PV formalizing poly-time reasoning cannot
prove, for any non-deterministic poly-time machine *M* defining a language *L*(*M*), that *L*(*M*) is inapproximable by co-nondeterministic circuits of sub-exponential size. In fact,
our unprovability result holds also for a theory which supports a fragment of Jeřábek’s
theory of approximate counting *APC*_{1}. We also show similar unconditional unprovability results for the conjecture of Rudich
about the existence of super-bits.

We propose a diagonalization-based approach to several important questions in proof complexity. We illustrate this approach in the context of the algebraic proof system IPS and in the context of propositional proof systems more generally.

We use the approach to give an explicit sequence of CNF formulas {φ_{n}} such that *VNP* ≠ *VP* iff there are no polynomial-size IPS proofs for the formulas φ_{n}. This provides a natural equivalence between proof complexity lower bounds and standard
algebraic complexity lower bounds. Our proof of this fact uses the implication from
IPS lower bounds to algebraic complexity lower bounds due to Grochow and Pitassi together
with a diagonalization argument: the formulas φ_{n} themselves assert the non-existence of short IPS proofs for formulas encoding *VNP* ≠ *VP* at a different input length. Our result also has meta-mathematical implications:
it gives evidence for the difficulty of proving strong lower bounds for IPS within
IPS.

For any strong enough propositional proof system *R*, we define the *iterated *R*-lower bound formulas*, which inductively assert the non-existence of short *R* proofs for formulas encoding the same statement at a different input length, and
propose them as explicit hard candidates for the proof system *R*. We observe that this hypothesis holds for Resolution following recent results of
Atserias and Müller and of Garlík, and give evidence in favour of it for other proof
systems.

We obtain new separation results for the two-party external information complexity of Boolean functions. The external information complexity of a function f(x,y) is the minimum amount of information a two-party protocol computing f must reveal to an outside observer about the input. We prove an exponential separation between external and internal information complexity, which is the best possible; previously no separation was known. We then use this result to prove a near-quadratic separation between amortized zero-error communication complexity and external information complexity for total functions, disproving a conjecture of the first author. Finally, we prove a matching upper bound showing that our separation result is tight.

In this work we resolve conjectures of Beecken, Mittmann and Saxena [BMS13] and Gupta [Gupta14], by proving an analog of a theorem of Edelstein and Kelly for quadratic polynomials. As an immediate corollary we obtain the first deterministic polynomial-time black-box algorithm for testing zeroness of Σ^{[3]}ΠΣΠ^{[2]} circuits.

We prove a new derandomization of Håstad’s switching lemma, showing how to efficiently
generate restrictions satisfying the switching lemma for DNF or CNF formulas of size
*m* using only *O*(log*m*) random bits. Derandomizations of the switching lemma have been useful in many works
as a key building-block for constructing objects which are in some way provably-pseudorandom
with respect to AC^{0}-circuits.

Here, we use our new derandomization to give an improved analysis of the pseudorandom
generator of Trevisan and Xue for AC^{0}-circuits (CCC’13): we show that the generator ε-fools size-*m*, depth-*D* circuits with *n*-bit inputs using only *O*(log(*m*/ε)^{D} · log*n*) random bits. In particular, we obtain (modulo the loglog-factors hidden in the *O*-notation) a dependence on *m*/ε which is best-possible with respect to currently-known AC^{0}-circuit lower bounds.

Extending the classical “hardness-to-randomness” line-of-works, Doron, Moshkovitz,
Oh, and Zuckerman (FOCS 2020) recently proved that derandomization with *near-quadratic time overhead* is possible, under the assumption that there exists a function in *DTIME*[2^{n}] that cannot be computed by randomized SVN circuits of size 2^{(1−є)· n} for a small є.

In this work we extend their inquiry and answer several open questions that arose
from their work. For a time function *T*(*n*), consider the following assumption: Non-uniformly secure one-way functions exist,
and for δ=δ(є) and *k*=*k*_{T}(є) there exists a problem in *DTIME*[2^{k· n}] that is hard for algorithms that run in time 2^{(k−δ)· n} and use 2^{(1−δ)· n} bits of advice. Under this assumption, we show that:

1. (Worst-case derandomization.) Probabilistic algorithms that run in time *T*(*n*) can be deterministically simulated in time *n*· *T*(*n*)^{1+є}.

2. (Average-case derandomization.) For polynomial time functions *T*(*n*)=*poly*(*n*), we can improve the derandomization time to *n*^{є}· *T*(*n*) if we allow the derandomization to succeed only on average, rather than in the worst-case.

3. (Conditional optimality.) For worst-case derandomization, the multiplicative time overhead of *n* is essentially optimal, conditioned on a counting version of the non-deterministic
strong exponential-time hypothesis (i.e., on #*NSETH*).

Lastly, we present an alternative proof for the result of Doron, Moshkovitz, Oh, and
Zuckerman that is simpler and more versatile. In fact, we show how to simplify the
analysis not only of their construction, but of *any construction* that “extracts randomness from a pseudoentropic string”.

A long-standing and central open question in the theory of average-case complexity is to base average-case hardness of NP on worst-case hardness of NP. A frontier question along this line is to prove that PH is hard on average if UP requires (sub-)exponential worst-case complexity. The difficulty of resolving this question has been discussed from various perspectives based on technical barrier results, such as the limits of black-box reductions and the non-existence of worst-case hardness amplification procedures in PH.

In this paper, we overcome these barriers and resolve the open question by presenting the following main results:

1. *UP* ⊈*DTIME*(2^{O(n / logn)}) implies *DistNP* ⊈*AvgP*.

2. *PH* ⊈*DTIME*(2^{O(n / logn)}) implies *DistPH* ⊈*AvgP*.

3. *NP* ⊈*DTIME*(2^{O(n / logn)}) implies *DistNP* ⊈*Avg*_{P} *P*. Here, *Avg*_{P} *P* denotes P-computable average-case polynomial time, which interpolates average-case
polynomial-time and worst-case polynomial-time. We complement this result by showing
that *DistPH* ⊈*AvgP* if and only if *DistPH* ⊈*Avg*_{P} *P*.

At the core of all of our results is a new notion of universal heuristic scheme, whose running time is P-computable average-case polynomial time under every polynomial-time samplable distribution. Our proofs are based on the meta-complexity of time-bounded Kolmogorov complexity: We analyze average-case complexity through the lens of worst-case meta-complexity using a new “algorithmic” proof of language compression and weak symmetry of information for time-bounded Kolmogorov complexity.

We connect the study of pseudodeterministic algorithms to two major open problems about the structural complexity of BPTIME: proving hierarchy theorems and showing the existence of complete problems. Our main contributions can be summarised as follows.

A new pseudorandom generator and its consequences. We build on techniques developed
to prove hierarchy theorems for probabilistic time with advice (Fortnow and Santhanam,
FOCS 2004) to construct the first unconditional pseudorandom generator of polynomial
stretch computable in pseudodeterministic polynomial time (with one bit of advice)
that is secure infinitely often against polynomial-time computations. As an application
of this construction, we obtain new results about the complexity of generating and
representing prime numbers. For instance, we show unconditionally for each ε > 0 that
infinitely many primes *p*_{n} have a succinct representation in the following sense: there is a fixed probabilistic
polynomial time algorithm that generates *p*_{n} with high probability from its succinct representation of size *O*(|*p*_{n}|^{ε}). This offers an exponential improvement over the running time of previous results,
and shows that infinitely many primes have succinct and efficient representations.

Structural results for probabilistic time from pseudodeterministic algorithms. Oliveira and Santhanam (STOC 2017) established unconditionally that there is a pseudodeterministic algorithm for the Circuit Acceptance Probability Problem (CAPP) that runs in sub-exponential time and is correct with high probability over any samplable distribution on circuits on infinitely many input lengths. We show that improving this running time or obtaining a result that holds for every large input length would imply new time hierarchy theorems for probabilistic time. In addition, we prove that a worst-case polynomial-time pseudodeterministic algorithm for CAPP would imply that BPP has complete problems.

Equivalence between pseudodeterministic constructions and hierarchies. We establish
an equivalence between a certain explicit pseudodeterministic construction problem
and the existence of strong hierarchy theorems for probabilistic time. More precisely,
we show that pseudodeterministically constructing in exponential time strings of large
rKt complexity (Oliveira, ICALP 2019) is possible if and only if for every constructive
function *T*(*n*) ≤ exp(*o*(exp(*n*))) we have BPTIME[poly(*T*)] ⊈ i.o.BPTIME[*T*]/log*T*.

More generally, these results suggest new approaches for designing pseudodeterministic algorithms for search problems and for unveiling the structure of probabilistic time.

The vertex connectivity of an *m*-edge *n*-vertex undirected graph is the smallest number of vertices whose removal disconnects
the graph, or leaves only a singleton vertex. In this paper, we give a reduction from
the vertex connectivity problem to a set of maxflow instances. Using this reduction,
we can solve vertex connectivity in Õ(*m*^{α}) time for any α ≥ 1, if there is an *m*^{α}-time maxflow algorithm. Using the current best maxflow algorithm that runs in *m*^{4/3+o(1)} time (Kathuria, Liu and Sidford, FOCS 2020), this yields a *m*^{4/3+o(1)}-time vertex connectivity algorithm. This is the first improvement in the running
time of the vertex connectivity problem in over 20 years, the previous best being
an Õ(*mn*)-time algorithm due to Henzinger, Rao, and Gabow (FOCS 1996). Indeed, no algorithm
with an *o*(*mn*) running time was known before our work, even if we assume an Õ(*m*)-time maxflow algorithm.

Our new technique is robust enough to also improve the best Õ(*mn*)-time bound for directed vertex connectivity to *mn*^{1−1/12+o(1)} time.
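The reduction to maxflow can be made concrete via the classical vertex-splitting construction (often credited to Even and Tarjan): split each vertex into an in-copy and an out-copy joined by a unit-capacity arc, so that an s-t maxflow equals the number of internally vertex-disjoint s-t paths. The sketch below brute-forces over non-adjacent pairs and is purely illustrative; it is not the paper's new reduction.

```python
from collections import deque

def maxflow(cap, s, t):
    """Edmonds-Karp maxflow; cap is a dict-of-dicts residual graph, mutated in place."""
    total = 0
    while True:
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return total
        # trace the augmenting path and push the bottleneck amount
        b, v = float("inf"), t
        while parent[v] is not None:
            b = min(b, cap[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= b
            cap[v][u] += b
            v = u
        total += b

def vertex_connectivity(n, edges):
    """Brute-force global vertex connectivity: one maxflow per non-adjacent pair."""
    adj = {u: set() for u in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    def add(cap, a, b, c):
        cap.setdefault(a, {})
        cap.setdefault(b, {})
        cap[a][b] = cap[a].get(b, 0) + c
        cap[b].setdefault(a, 0)  # residual reverse arc

    best = n - 1  # attained by the complete graph
    for s in range(n):
        for t in range(n):
            if s == t or t in adj[s]:
                continue
            cap = {}
            for v in range(n):   # split v into (v,'in') -> (v,'out'), capacity 1
                add(cap, (v, "in"), (v, "out"), 1)
            for u, v in edges:   # original edges get effectively infinite capacity
                add(cap, (u, "out"), (v, "in"), n)
                add(cap, (v, "out"), (u, "in"), n)
            best = min(best, maxflow(cap, (s, "out"), (t, "in")))
    return best
```

On a 4-cycle this returns 2, and on a path it finds the cut vertex.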

For an integer *t*, a graph *G* is called *C*_{>t}*-free* if *G* does not contain any induced cycle on more than *t* vertices. We prove the following statement: for every pair of integers *d* and *t* and a statement φ, there exists an algorithm that, given an *n*-vertex *C*_{>t}-free graph *G* with weights on vertices, finds in time *n*^{O(log^{3} n)} a maximum-weight vertex subset *S* such that *G*[*S*] has degeneracy at most *d* and satisfies φ. The running time can be improved to *n*^{O(log^{2} n)} assuming *G* is *P*_{t}-free, that is, *G* does not contain an induced path on *t* vertices. This expands the recent results of the authors [FOCS 2020 and SOSA 2021]
on the Maximum Weight Independent Set problem on *P*_{t}-free graphs in two directions: by encompassing the more general setting of *C*_{>t}-free graphs, and by being applicable to a much wider variety of problems, such as
Maximum Weight Induced Forest or Maximum Weight Induced Planar Graph.

In low distortion metric embeddings, the goal is to embed a “hard” host metric space
into a “simpler” target space while approximately preserving pairwise distances. A
highly desirable target space is that of a tree metric. Unfortunately, such an embedding
must in general incur huge distortion. A celebrated bypass to this problem is stochastic
embedding with logarithmic expected distortion. Another bypass is Ramsey-type embedding,
where the distortion guarantee applies only to a subset of the points. However, both
these solutions fail to provide an embedding into a single tree with a worst-case
distortion guarantee on all pairs. In this paper, we propose a novel third bypass
called clan embedding. Here each point *x* is mapped to a subset of points *f*(*x*), called a clan, with a special chief point χ(*x*)∈ *f*(*x*). The clan embedding has multiplicative distortion *t* if for every pair (*x*,*y*) some copy *y*′∈ *f*(*y*) in the clan of *y* is close to the chief of *x*: min_{y′∈ f(y)}*d*(*y*′,χ(*x*))≤ *t*· *d*(*x*,*y*). Our first result is a clan embedding into a tree with multiplicative distortion
*O*(log*n*/є) such that each point has 1+є copies (in expectation). In addition, we provide
a “spanning” version of this theorem for graphs and use it to devise the first compact
routing scheme with constant size routing tables.

We then focus on minor-free graphs of diameter *D*, which were known to be stochastically embeddable into bounded treewidth graphs with
expected additive distortion є *D*. We devise Ramsey-type embedding and clan embedding analogs of the stochastic embedding.
We use these embeddings to construct the first (bicriteria quasi-polynomial time)
approximation scheme for the metric ρ-dominating set and metric ρ-independent set
problems in minor-free graphs.

Network design problems aim to compute low-cost structures such as routes, trees and subgraphs. Often, it is natural and desirable to require that these structures have small hop length or hop diameter. Unfortunately, optimization problems with hop constraints are much harder and less well understood than their hop-unconstrained counterparts. A significant algorithmic barrier in this setting is the fact that hop-constrained distances in graphs are very far from being a metric.
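To make the non-metric behavior concrete: hop-constrained distances can be computed with a standard hop-bounded Bellman-Ford dynamic program, and already a three-vertex path exhibits an unbounded triangle-inequality violation. This is a toy illustration, not taken from the paper.

```python
def hop_distances(n, edges, h):
    """d[s][v] = minimum weight of an s-v path using at most h edges (Bellman-Ford DP)."""
    INF = float("inf")
    d = [[INF] * n for _ in range(n)]
    for v in range(n):
        d[v][v] = 0
    for _ in range(h):
        nd = [row[:] for row in d]  # allow one more hop per round
        for u, v, w in edges:
            for s in range(n):
                nd[s][v] = min(nd[s][v], d[s][u] + w)
                nd[s][u] = min(nd[s][u], d[s][v] + w)
        d = nd
    return d

# A path u - v - w with unit weights: with hop bound 1, d(u,v) = d(v,w) = 1
# but d(u,w) is infinite, so the triangle inequality fails as badly as possible.
d1 = hop_distances(3, [(0, 1, 1), (1, 2, 1)], h=1)
```

Concatenating two h-hop paths needs 2h hops, which is exactly why these distances resist metric techniques.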

We show that, nonetheless, hop-constrained distances can be approximated by distributions over “partial tree metrics.” We build this result into a powerful and versatile algorithmic tool which, similarly to classic probabilistic tree embeddings, reduces hop-constrained problems in general graphs to hop-unconstrained problems on trees. We then use this tool to give the first poly-logarithmic bicriteria approximations for the hop-constrained variants of many classic network design problems. These include Steiner forest, group Steiner tree, group Steiner forest, buy-at-bulk network design as well as online and oblivious versions of many of these problems.

We consider the Connectivity Augmentation Problem (CAP), a classical problem in the
area of Survivable Network Design. It is about increasing the edge-connectivity of
a graph by one unit in the cheapest possible way. More precisely, given a *k*-edge-connected graph *G*=(*V*,*E*) and a set of extra edges, the task is to find a minimum cardinality subset of extra
edges whose addition to *G* makes the graph (*k*+1)-edge-connected. If *k* is odd, the problem is known to reduce to the Tree Augmentation Problem (TAP)—i.e.,
*G* is a spanning tree—for which significant progress has been achieved recently, leading
to approximation factors below 1.5 (the currently best factor is 1.458). However,
advances on TAP did not carry over to CAP so far. Indeed, only very recently, Byrka,
Grandoni, and Ameli (STOC 2020) managed to obtain the first approximation factor below
2 for CAP by presenting a 1.91-approximation algorithm based on a method that is disjoint
from recent advances for TAP.

We first bridge the gap between TAP and CAP, by presenting techniques that allow for leveraging insights and methods from TAP to approach CAP. We then introduce a new way to get approximation factors below 1.5, based on a new analysis technique. Through these ingredients, we obtain a 1.393-approximation algorithm for CAP, and therefore also TAP. This leads to the currently best approximation result for both problems in a unified way, by significantly improving on the above-mentioned 1.91-approximation for CAP and also the previously best approximation factor of 1.458 for TAP by Grandoni, Kalaitzis, and Zenklusen (STOC 2018). Additionally, a feature we inherit from recent TAP advances is that our approach can deal with the weighted setting when the ratio between the largest to smallest cost on extra links is bounded, in which case we obtain approximation factors below 1.5.
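Stated directly in code, CAP is easy to specify even though approximating it well is hard. The exhaustive sketch below (exponential time, tiny instances only, and unrelated to the approximation techniques discussed above) makes the problem statement concrete by checking edge-connectivity over all cuts.

```python
from itertools import combinations

def edge_connectivity(n, edges):
    """Minimum number of edges crossing any bipartition (exhaustive, small n only)."""
    best = len(edges)
    for r in range(1, n // 2 + 1):
        for side in combinations(range(n), r):
            s = set(side)
            best = min(best, sum((u in s) != (v in s) for u, v in edges))
    return best

def augment_connectivity(n, edges, extra):
    """CAP by brute force: smallest subset of extra links raising edge-connectivity by one."""
    k = edge_connectivity(n, edges)
    for r in range(len(extra) + 1):
        for links in combinations(extra, r):
            if edge_connectivity(n, edges + list(links)) >= k + 1:
                return list(links)
```

For a 4-cycle with the two diagonals as extra links, both diagonals are needed to reach 3-edge-connectivity; for a path (the TAP case, k = 1), a single link closing a cycle suffices.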

We present a deterministic (global) mincut algorithm for weighted, undirected graphs
that runs in *m*^{1+o(1)} time, answering an open question of Karger from the 1990s. To obtain our result,
we de-randomize the construction of the *skeleton* graph in Karger’s near-linear time mincut algorithm, which is its only randomized
component. In particular, we partially de-randomize the well-known Benczúr-Karger
graph sparsification technique by random sampling, which we accomplish by the method
of pessimistic estimators. Our main technical component is designing an efficient
pessimistic estimator to capture the cuts of a graph, which involves harnessing the
expander decomposition framework introduced in recent work by Goranci et al. (SODA 2021).
As a side-effect, we obtain a structural representation of all approximate mincuts
in a graph, which may have future applications.

We show that the multiplicity of the second normalized adjacency matrix eigenvalue
of any connected graph of maximum degree Δ is bounded by *O*(*n* Δ^{7/5}/log^{1/5−o(1)}*n*) for any Δ, and improve this to *O*(*n*log^{1/2}*d*/log^{1/4−o(1)}*n*) for simple *d*-regular graphs when *d*≥ log^{1/4}*n*. In fact, the same bounds hold for the number of eigenvalues in any interval of width
λ_{2}/log_{Δ}^{1−o(1)}*n* containing the second eigenvalue λ_{2}. The main ingredient in the proof is a polynomial (in *k*) lower bound on the typical support of a closed random walk of length 2*k* in any connected graph, which in turn relies on new lower bounds for the entries
of the Perron eigenvector of submatrices of the normalized adjacency matrix.

We prove tight mixing time bounds for natural random walks on bases of matroids, determinantal
distributions, and more generally distributions associated with log-concave polynomials.
For a matroid of rank *k* on a ground set of *n* elements, or more generally distributions associated with log-concave polynomials
of homogeneous degree *k* on *n* variables, we show that the down-up random walk, started from an arbitrary point
in the support, mixes in time *O*(*k*log*k*). Our bound has no dependence on *n* or the starting point, unlike the previous analyses of Anari et al. (STOC 2019)
and Cryan et al. (FOCS 2019), and is tight up to constant factors. The main new ingredient
is a property we call approximate exchange, a generalization of well-studied exchange
properties for matroids and valuated matroids, which may be of independent interest.
In particular, given a distribution µ over size-*k* subsets of [*n*], our approximate exchange property implies that a simple local search algorithm
gives a *k*^{O(k)}-approximation of max_{S} µ(*S*) when µ is generated by a log-concave polynomial, and that greedy gives the same
approximation ratio when µ is strongly Rayleigh. As an application, we show how to
leverage down-up random walks to approximately sample random forests or random spanning
trees in a graph with *n* edges in time *O*(*n*log^{2} *n*). The best known result for sampling random forests was an FPAUS with high polynomial
runtime recently found by Anari et al. (STOC 2019), Cryan et al. (FOCS 2019). For
spanning trees, we improve on the almost-linear time algorithm by Schild (STOC 2018).
Our analysis works on weighted graphs too, and is the first to achieve nearly-linear
running time for these problems. Our algorithms can be naturally extended to support
approximately sampling from random forests of size between *k*_{1} and *k*_{2} in time *O*(*n* log^{2} *n*), for fixed parameters *k*_{1}, *k*_{2}.
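For the graphic matroid (spanning trees with uniform weights), the down-up walk described above has a very short implementation: drop a uniformly random tree edge, then reconnect the two resulting components with a uniformly random crossing edge. This is a toy sketch of the chain itself, not the paper's optimized nearly-linear-time sampler.

```python
import random

def down_up_step(tree, edges, n, rng):
    """One down-up step on spanning trees (bases of the graphic matroid)."""
    dropped = rng.choice(sorted(tree))
    forest = tree - {dropped}
    # find the component of one endpoint of the dropped edge
    adj = {v: set() for v in range(n)}
    for u, v in forest:
        adj[u].add(v)
        adj[v].add(u)
    comp, stack = {dropped[0]}, [dropped[0]]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in comp:
                comp.add(v)
                stack.append(v)
    # up-step: uniform over all graph edges crossing the cut (incl. the dropped one)
    crossing = [e for e in edges if (e[0] in comp) != (e[1] in comp)]
    return forest | {rng.choice(crossing)}

rng = random.Random(0)
n, edges = 4, [(0, 1), (1, 2), (2, 3), (3, 0)]  # 4-cycle: exactly 4 spanning trees
tree = frozenset([(0, 1), (1, 2), (2, 3)])
counts = {}
for _ in range(20000):
    tree = frozenset(down_up_step(set(tree), edges, n, rng))
    counts[tree] = counts.get(tree, 0) + 1
```

Since the stationary distribution of the down-up walk for the uniform distribution over bases is uniform, the visit frequencies of the four trees all converge to 1/4.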

The matroid intersection problem is a fundamental problem that has been extensively
studied for half a century. In the classic version of this problem, we are given two
matroids *M*_{1} = (*V*, *I*_{1}) and *M*_{2} = (*V*, *I*_{2}) on a common ground set *V* of *n* elements, and then we have to find the largest common independent set *S* ∈ *I*_{1} ∩ *I*_{2} by making independence oracle queries of the form “Is *S* ∈ *I*_{1}?” or “Is *S* ∈ *I*_{2}?” for *S* ⊆ *V*. The goal is to minimize the number of queries.

Beating the existing Õ(*n*^{2}) bound, known as the quadratic barrier, is an open problem that captures the limits
of techniques from two lines of work. The first one is the classic Cunningham’s algorithm
[SICOMP 1986], whose Õ(*n*^{2})-query implementations were shown by CLS+ [FOCS 2019] and Nguyen [2019] (more generally,
these algorithms take Õ(*nr*) queries where *r* denotes the rank which can be as big as *n*). The other one is the general cutting plane method of Lee, Sidford, and Wong [FOCS
2015]. The only progress towards breaking the quadratic barrier requires either approximation
algorithms or a more powerful rank oracle query [CLS+ FOCS 2019]. No exact algorithm
with *o*(*n*^{2}) independence queries was known.

In this work, we break the quadratic barrier with a randomized algorithm guaranteeing
Õ(*n*^{9/5}) independence queries with high probability, and a deterministic algorithm guaranteeing
Õ(*n*^{11/6}) independence queries. Our key insight is simple and fast algorithms to solve a graph
reachability problem that arose in the standard augmenting path framework [Edmonds
1968]. Combining this with previous exact and approximation algorithms leads to our
results.
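For reference, the standard augmenting path framework [Edmonds 1968] mentioned above can be sketched as follows: build the exchange graph of the current common independent set, find a shortest source-to-sink path by BFS, and augment along it. This is the textbook Õ(*nr*)-query approach, not the new sub-quadratic algorithm; the partition-matroid instance below (encoding bipartite matching) is purely illustrative.

```python
from collections import deque

def matroid_intersection(ground, ind1, ind2):
    """Largest common independent set via shortest augmenting paths in the exchange graph."""
    S = set()
    while True:
        X1 = {y for y in ground if y not in S and ind1(S | {y})}  # sources
        X2 = {y for y in ground if y not in S and ind2(S | {y})}  # sinks
        # exchange arcs: x -> y (x in S, y outside) if S - x + y is independent in M1;
        #                y -> x (y outside, x in S) if S - x + y is independent in M2
        parent = {y: None for y in X1}
        q = deque(X1)
        goal = None
        while q and goal is None:
            a = q.popleft()
            if a in X2:
                goal = a
                break
            if a in S:
                nbrs = [y for y in ground
                        if y not in S and y not in parent and ind1((S - {a}) | {y})]
            else:
                nbrs = [x for x in S
                        if x not in parent and ind2((S - {x}) | {a})]
            for b in nbrs:
                parent[b] = a
                q.append(b)
        if goal is None:
            return S
        while goal is not None:  # augment: toggle every element on the path
            S ^= {goal}
            goal = parent[goal]

# toy instance: bipartite matching as the intersection of two partition matroids
edges = [("a", "c"), ("a", "d"), ("b", "c")]
def distinct(S, i):  # at most one edge per left (i=0) / right (i=1) vertex
    chosen = [e[i] for e in S]
    return len(chosen) == len(set(chosen))
M = matroid_intersection(edges, lambda S: distinct(S, 0), lambda S: distinct(S, 1))
```

Augmenting along a shortest path is what guarantees the new set is again a common independent set, one element larger.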

We show fully polynomial time randomized approximation schemes (FPRAS) for counting matchings of a given size, or more generally sampling/counting monomer-dimer systems in planar, not-necessarily-bipartite, graphs. While perfect matchings on planar graphs can be counted exactly in polynomial time, counting non-perfect matchings was shown to be #P-hard by Jerrum (J Stat Phys 1987), who also raised the question of whether efficient approximate counting is possible. We answer this affirmatively by showing that the multi-site Glauber dynamics on the set of monomers in a monomer-dimer system always mixes rapidly, and that this dynamics can be implemented efficiently on downward-closed families of graphs where counting perfect matchings is tractable. As further applications of our results, we show how to sample efficiently using multi-site Glauber dynamics from partition-constrained strongly Rayleigh distributions, and from nonsymmetric determinantal point processes. In order to analyze mixing properties of the multi-site Glauber dynamics, we establish two notions for generating polynomials of discrete set-valued distributions: sector-stability and fractional log-concavity. These notions generalize well-studied properties like real-stability and log-concavity, but unlike them, robustly degrade under useful transformations applied to the distribution. We relate these notions to pairwise correlations in the underlying distribution and the notion of spectral independence introduced by Anari et al. (FOCS 2020), providing a new tool for establishing spectral independence based on the geometry of polynomials. As a byproduct of our techniques, we show that polynomials avoiding roots in a sector of the complex plane must satisfy what we call fractional log-concavity; this generalizes a classic result established by Gårding (J Math Mech 1959), who showed that homogeneous polynomials with no roots in a half-plane must be log-concave over the positive orthant.
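For intuition, the classical single-edge Glauber chain on matchings (a simplified relative of the multi-site dynamics analyzed in this work) fits in a few lines: toggling a uniformly random edge, whenever the result is still a matching, gives a symmetric chain whose stationary distribution is uniform over all matchings.

```python
import random

def glauber_matchings(edges, steps, seed=0):
    """Single-edge Glauber dynamics on matchings; returns visit counts per state.
    Each move toggles a uniformly random edge when the result is still a matching,
    so transitions are symmetric and the stationary distribution is uniform."""
    rng = random.Random(seed)
    M = frozenset()
    counts = {}
    for _ in range(steps):
        e = edges[rng.randrange(len(edges))]
        if e in M:
            M = M - {e}
        else:
            used = {v for f in M for v in f}
            if e[0] not in used and e[1] not in used:
                M = M | {e}
        counts[M] = counts.get(M, 0) + 1
    return counts

# the triangle has 4 matchings: the empty one and each single edge
counts = glauber_matchings([(0, 1), (1, 2), (0, 2)], steps=30000)
```

On the triangle the chain quickly equidistributes over its four matchings, in line with the rapid mixing phenomenon the abstract describes.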

Laws of large numbers guarantee that given a large enough sample from some population,
the measure of any fixed sub-population is well-estimated by its frequency in the
sample. We study laws of large numbers in sampling processes that can affect the environment
they are acting upon and interact with it. Specifically, we consider the sequential
sampling model proposed by Ben-Eliezer and Yogev (2020), and characterize the classes
which admit a uniform law of large numbers in this model: these are exactly the classes
that are *online learnable*. Our characterization may be interpreted as an online analogue to the equivalence
between learnability and uniform convergence in statistical (PAC) learning. The sample-complexity
bounds we obtain are tight for many parameter regimes, and as an application, we determine
the optimal regret bounds in online learning, stated in terms of *Littlestone’s dimension*, thus resolving the main open question from Ben-David, Pál, and Shalev-Shwartz (2009),
which was also posed by Rakhlin, Sridharan, and Tewari (2015).

We consider an online binary prediction setting where a forecaster observes a sequence
of *T* bits one by one. Before each bit is revealed, the forecaster predicts the probability
that the bit is 1. The forecaster is called well-calibrated if for each *p* ∈ [0, 1], among the *n*_{p} bits for which the forecaster predicts probability *p*, the actual number of ones, *m*_{p}, is indeed equal to *p* · *n*_{p}. The calibration error, defined as ∑_{p} |*m*_{p} − *p* *n*_{p}|, quantifies the extent to which the forecaster deviates from being well-calibrated.
It has long been known that an *O*(*T*^{2/3}) calibration error is achievable even when the bits are chosen adversarially, and
possibly based on the previous predictions. However, little is known on the lower
bound side, except an Ω(√*T*) bound that follows from the trivial example of independent fair coin flips.
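The calibration error defined above is a simple function of the prediction transcript; the following is a direct transcription of the definition into code.

```python
from collections import defaultdict

def calibration_error(predictions, outcomes):
    """Sum over predicted values p of |m_p - p * n_p|, where n_p counts the bits
    predicted with probability p and m_p counts how many of those bits were 1."""
    n = defaultdict(int)  # n_p
    m = defaultdict(int)  # m_p
    for p, bit in zip(predictions, outcomes):
        n[p] += 1
        m[p] += bit
    return sum(abs(m[p] - p * n[p]) for p in n)
```

A perfectly calibrated transcript scores 0; predicting certainty and being wrong every time scores the maximum.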

In this paper, we prove an Ω(*T*^{0.528}) bound on the calibration error, which, to the best of our knowledge, is the first super-√*T* lower bound for this setting. The technical contributions
of our work include two lower bound techniques, early stopping and sidestepping, which
circumvent the obstacles that have previously hindered strong calibration lower bounds.
We also propose an abstraction of the prediction setting, termed the Sign-Preservation
game, which may be of independent interest. This game has a much smaller state space
than the full prediction setting and allows simpler analyses. The Ω(*T*^{0.528}) lower bound follows from a general reduction theorem that translates lower bounds
on the game value of Sign-Preservation into lower bounds on the calibration error.

The problem of learning *t*-term DNF formulas (for *t* = *O*(1)) has been studied extensively in the PAC model since its introduction by Valiant
(STOC 1984). A *t*-term DNF can be efficiently learnt using a *t*-term DNF only if *t* = 1, i.e., when it is an AND, while even *weakly* learning a 2-term DNF using a constant term DNF was shown to be NP-hard by Khot and
Saket (FOCS 2008). On the other hand, Feldman, Guruswami, Raghavendra and Wu (FOCS
2009) showed the hardness of weakly learning a noisy AND using a *halfspace* – the latter being a generalization of an AND, while Khot and Saket (STOC 2008) showed
that an intersection of two halfspaces is hard to weakly learn using any function
of constantly many halfspaces. The question of whether a 2-term DNF is efficiently
learnable using 2 or constantly many halfspaces remained open.

In this work we answer this question in the negative by showing the hardness of weakly
learning a 2-term DNF as well as a noisy AND using any function of a constant number
of halfspaces. In particular we prove the following. For any constants ν, ζ > 0 and
ℓ ∈ *N*, given a distribution over point-value pairs {0,1}^{n} × {0,1}, it is NP-hard to decide whether, (i) YES Case. There is a 2-term DNF that
classifies all the points of the distribution, and an AND that classifies at least
1−ζ fraction of the points correctly. (ii) NO Case. Any boolean function depending
on at most ℓ halfspaces classifies at most 1/2 + ν fraction of the points of the distribution
correctly.

Our result generalizes and strengthens the previous best results mentioned above on the hardness of learning a 2-term DNF, learning an intersection of two halfspaces, and learning a noisy AND.

Boosting is a celebrated machine learning approach which is based on the idea of combining
weak and moderately inaccurate hypotheses to a strong and accurate one. We study boosting
under the assumption that the weak hypotheses belong to a class of bounded capacity.
This assumption is inspired by the common convention that weak hypotheses are “rules-of-thumb”
from an “easy-to-learn class” (Schapire and Freund ’12; Shalev-Shwartz and Ben-David ’14).
Formally, we assume the class of weak hypotheses has a bounded VC dimension.
We focus on two main questions: (i) Oracle Complexity: How many weak hypotheses are
needed in order to produce an accurate hypothesis? We design a novel boosting algorithm
and demonstrate that it circumvents a classical lower bound by Freund and Schapire
(’95, ’12). Whereas the lower bound shows that Ω(1/γ^{2}) weak hypotheses with γ-margin are sometimes necessary, our new method requires only
Õ(1/γ) weak hypotheses, provided that they belong to a class of bounded VC dimension.
Unlike previous boosting algorithms which aggregate the weak hypotheses by majority
votes, the new boosting algorithm uses more complex (“deeper”) aggregation rules.
We complement this result by showing that complex aggregation rules are in fact necessary
to circumvent the aforementioned lower bound. (ii) Expressivity: Which tasks can be
learned by boosting weak hypotheses from a bounded VC class? Can complex concepts
that are “far away” from the class be learned? Towards answering the first question
we identify a combinatorial-geometric parameter which captures the expressivity of
base-classes in boosting. As a corollary we provide an affirmative answer to the second
question for many well-studied classes, including half-spaces and decision stumps.
Along the way, we establish and exploit connections with Discrepancy Theory.
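For contrast with the deeper aggregation rules developed here, classical majority-vote boosting (AdaBoost with one-dimensional threshold stumps; a textbook sketch, not this paper's algorithm) looks as follows. The interval-labeled data at the end is a hypothetical example where no single stump is consistent but a weighted majority of stumps is.

```python
import math

def stumps(xs):
    """All 1-D threshold classifiers x >= t -> +/-1 over the observed thresholds."""
    H = []
    for t in sorted(set(xs)):
        H.append(lambda x, t=t: 1 if x >= t else -1)
        H.append(lambda x, t=t: -1 if x >= t else 1)
    return H

def adaboost(xs, ys, rounds):
    """AdaBoost: reweight examples, pick the best stump, aggregate by weighted majority."""
    n = len(xs)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        best, best_err = None, 1.0
        for h in stumps(xs):
            err = sum(wi for wi, x, y in zip(w, xs, ys) if h(x) != y)
            if err < best_err:
                best, best_err = h, err
        if best_err >= 0.5:  # no weak learner with an edge; stop
            break
        alpha = 0.5 * math.log((1 - best_err) / max(best_err, 1e-12))
        ensemble.append((alpha, best))
        # exponential reweighting toward misclassified examples
        w = [wi * math.exp(-alpha * y * best(x)) for wi, x, y in zip(w, xs, ys)]
        total = sum(w)
        w = [wi / total for wi in w]
    return lambda x: 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1

# an interval concept: no single stump fits it, but a majority vote of stumps does
xs = list(range(10))
ys = [1 if 3 <= x <= 6 else -1 for x in xs]
F = adaboost(xs, ys, rounds=60)
```

Since on this labeling some stump always has weighted error at most 1/3, the standard AdaBoost bound drives the training error to zero within the given number of rounds.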

For more than a century and a half it has been widely believed (but never rigorously shown) that the physics of diffraction imposes certain fundamental limits on the resolution of an optical system. However, our understanding of what exactly can and cannot be resolved has never risen above heuristic arguments which, even worse, appear contradictory. In this work we remedy this gap by studying the diffraction limit as a statistical inverse problem and, based on connections to provable algorithms for learning mixture models, we rigorously prove upper and lower bounds on the statistical and algorithmic complexity needed to resolve closely spaced point sources. In particular we show that there is a phase transition where the sample complexity goes from polynomial to exponential. Surprisingly, we show that this does not occur at the Abbe limit, which has long been presumed to be the true diffraction limit.

We consider the problem of determining which classes of functions can be tested more efficiently than they can be learned, in the distribution-free sample-based model that corresponds to the standard PAC learning setting. Our main result shows that while VC dimension by itself does not always provide tight bounds on the number of samples required to test a class of functions in this model, it can be combined with a closely-related variant that we call “lower VC” (or LVC) dimension to obtain strong lower bounds on this sample complexity.

We use this result to obtain strong and in many cases nearly optimal bounds on the sample complexity for testing unions of intervals, halfspaces, intersections of halfspaces, polynomial threshold functions, and decision trees. Conversely, we show that two natural classes of functions, juntas and monotone functions, can be tested with a number of samples that is polynomially smaller than the number of samples required for PAC learning.

Finally, we also use the connection between VC dimension and property testing to establish new lower bounds for testing radius clusterability and testing feasibility of linear constraint systems.

This work represents a natural coalescence of two important lines of work – learning mixtures of Gaussians and algorithmic robust statistics. In particular we give the first provably robust algorithm for learning mixtures of any constant number of Gaussians. We require only mild assumptions on the mixing weights (bounded fractionality) and that the total variation distance between components is bounded away from zero. At the heart of our algorithm is a new method for proving dimension-independent polynomial identifiability through applying a carefully chosen sequence of differential operations to certain generating functions that not only encode the parameters we would like to learn but also the system of polynomial equations we would like to solve. We show how the symbolic identities we derive can be directly used to analyze a natural sum-of-squares relaxation.

How quickly can a given class of concepts be learned from examples? It is common to measure the performance of a supervised machine learning algorithm by plotting its “learning curve”, that is, the decay of the error rate as a function of the number of training examples. However, the classical theoretical framework for understanding learnability, the PAC model of Vapnik-Chervonenkis and Valiant, does not explain the behavior of learning curves: the distribution-free PAC model of learning can only bound the upper envelope of the learning curves over all possible data distributions. This does not match the practice of machine learning, where the data source is typically fixed in any given scenario, while the learner may choose the number of training examples on the basis of factors such as computational resources and desired accuracy.

In this paper, we study an alternative learning model that better captures such practical
aspects of machine learning, but still gives rise to a complete theory of the learnable
in the spirit of the PAC model. More precisely, we consider the problem of *universal* learning, which aims to understand the performance of learning algorithms on *every* data distribution, but without requiring uniformity over the distribution. The main
result of this paper is a remarkable trichotomy: there are only three possible rates
of universal learning. More precisely, we show that the learning curve of any given
concept class decays at either an exponential, a linear, or an arbitrarily slow rate. Moreover,
each of these cases is completely characterized by appropriate combinatorial parameters,
and we exhibit optimal learning algorithms that achieve the best possible rate in
each case.

For concreteness, we consider in this paper only the realizable case, though analogous results are expected to extend to more general learning scenarios.

We study the problem of testing discrete distributions with a focus on the high probability
regime. Specifically, given samples from one or more discrete distributions, a property
*P*, and parameters 0< є, δ <1, we want to distinguish, *with probability at least* 1−δ, whether these distributions satisfy *P* or are є-far from *P* in total variation distance. Most prior work in distribution testing studied the
constant confidence case (corresponding to δ = Ω(1)), and provided sample-optimal
testers for a range of properties. While one can always boost the confidence probability
of any such tester by black-box amplification, this generic boosting method typically
leads to sub-optimal sample bounds.
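A minimal sketch of the generic boosting method mentioned above (our illustration, not one of the paper's testers): wrap any constant-confidence tester in a majority vote over independent repetitions.

```python
import math
from collections import Counter

def boost_confidence(tester, delta):
    """Generic black-box amplification (illustrative): run a tester that
    succeeds with probability >= 2/3 independently O(log(1/delta)) times
    and take the majority vote. A Chernoff bound drives the failure
    probability below delta -- but each repetition draws fresh samples,
    which is exactly the multiplicative sample-cost overhead that direct
    high-confidence testers can avoid."""
    reps = max(1, math.ceil(18 * math.log(1 / delta)))  # crude Chernoff constant
    def boosted():
        votes = Counter(tester() for _ in range(reps))
        return votes.most_common(1)[0][0]
    return boosted
```

For δ = 0.01 this multiplies the sample complexity by roughly 83, which is why the generic method is typically sub-optimal compared with testers whose sample bound depends on δ directly.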

Here we study the following broad question: For a given property *P*, can we *characterize* the sample complexity of testing *P* as a function of all relevant problem parameters, including the error probability
δ? Prior to this work, uniformity testing was the only statistical task whose sample
complexity had been characterized in this setting. As our main results, we provide
the first algorithms for closeness and independence testing that are sample-optimal,
within constant factors, as a function of all relevant parameters. We also show matching
information-theoretic lower bounds on the sample complexity of these problems. Our
techniques naturally extend to give optimal testers for related problems. To illustrate
the generality of our methods, we give optimal algorithms for testing collections
of distributions and testing closeness with unequal sized samples.

Estimating the *cardinality* (number of distinct elements) of a large multiset is a classic problem in streaming
and sketching, dating back to Flajolet and Martin’s classic Probabilistic Counting
(*PCSA*) algorithm from 1983.
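As background, a minimal base-2 PCSA sketch looks as follows (our simplified rendering of Flajolet-Martin's idea, not the entropy-compressed Fishmonger variant discussed below):

```python
import hashlib

PHI = 0.77351  # Flajolet-Martin bias-correction constant

def pcsa_estimate(items, m=64):
    """Simplified base-2 PCSA: m bitmaps; each item is hashed, routed to
    one bitmap, and sets the bit at rho = number of trailing zeros of the
    rest of its hash. The estimate uses the average position R_j of the
    lowest unset bit across bitmaps; the standard error is roughly
    0.78/sqrt(m), and duplicates never change the sketch."""
    bitmaps = [0] * m
    for x in items:
        h = int.from_bytes(hashlib.sha256(str(x).encode()).digest()[:8], "big")
        j, w = h % m, h // m          # bitmap index, remaining hash bits
        rho = (w & -w).bit_length() - 1 if w else 63
        bitmaps[j] |= 1 << rho
    total = 0
    for b in bitmaps:
        r = 0
        while b & 1:                  # index of the lowest zero bit
            b >>= 1
            r += 1
        total += r
    return (m / PHI) * 2 ** (total / m)
```

Because the sketch is a fixed-size bit array determined only by the set of hashed items, it is mergeable by bitwise OR, the property the *Fish*-number lower bounds below are concerned with.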

In this paper we study the intrinsic tradeoff between the *space complexity* of the sketch and its *estimation error* in the random oracle model. We define a new measure of efficiency for cardinality
estimators called the *Fisher*-*Shannon* (*Fish*) number *H*/*I*. It captures the tension between the limiting Shannon entropy (*H*) of the sketch and its normalized Fisher information (*I*), which characterizes the variance of a statistically efficient, asymptotically unbiased
estimator.

Our results are as follows.

(i) We prove that all base-*q* variants of Flajolet and Martin’s *PCSA* sketch have *Fish*-number *H*_{0}/*I*_{0} ≈ 1.98016 and that every base-*q* variant of (Hyper)LogLog has *Fish*-number worse than *H*_{0}/*I*_{0}, but that they tend to *H*_{0}/*I*_{0} in the limit as *q*→ ∞. Here *H*_{0},*I*_{0} are precisely defined constants.

(ii) We describe a sketch called Fishmonger that is based on a *smoothed*, entropy-compressed variant of *PCSA* with a different estimator function. It is proved that with high probability, Fishmonger
processes a multiset of [*U*] such that *at all times*, its space is *O*(log^{2}log*U*) + (1+*o*(1))(*H*_{0}/*I*_{0})*b* ≈ 1.98*b* bits and its standard error is 1/√*b*.

(iii) We give circumstantial evidence that *H*_{0}/*I*_{0} is the optimum *Fish*-number of mergeable sketches for Cardinality Estimation. We define a class of *linearizable* sketches and prove that no member of this class can beat *H*_{0}/*I*_{0}. The popular mergeable sketches are, in fact, also linearizable.

We give an almost quadratic *n*^{2−o(1)} lower bound on the space consumption of any *o*(√log*n*)-pass streaming algorithm solving the (directed) *s*-*t* reachability problem. This means that any such algorithm must essentially store the
entire graph. As corollaries, we obtain almost quadratic space lower bounds for additional
fundamental problems, including maximum matching, shortest path, matrix rank, and
linear programming.

Our main technical contribution is the definition and construction of set hiding graphs,
that may be of independent interest: we give a general way of encoding a set *S* ⊆ [*k*] as a directed graph with *n* = *k*^{ 1 + o( 1 ) } vertices, such that deciding whether *i* ∈ *S* boils down to deciding if *t*_{i} is reachable from *s*_{i}, for a specific pair of vertices (*s*_{i},*t*_{i}) in the graph. Furthermore, we prove that our graph “hides” *S*, in the sense that no low-space streaming algorithm with a small number of passes
can learn (almost) anything about *S*.
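To make the interface concrete, here is a toy sketch (ours, purely illustrative: it realizes the reachability encoding but has none of the hiding property that is the paper's actual contribution):

```python
from collections import defaultdict, deque

def encode_set_as_reachability(S, k):
    """Toy illustration of the interface only: a digraph with designated
    pairs (s_i, t_i) such that t_i is reachable from s_i iff i is in S.
    The paper's construction additionally 'hides' S from low-space,
    few-pass streaming algorithms; this sketch makes no such guarantee."""
    edges = [(f"s{i}", f"t{i}") for i in S]
    pairs = [(f"s{i}", f"t{i}") for i in range(k)]
    return edges, pairs

def reachable(edges, s, t):
    """Plain BFS reachability check."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
    seen, queue = {s}, deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return True
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False
```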

A natural problem in high-dimensional inference is to decide if a classifier *f*:ℝ^{n} → {−1,1} depends on a small number of linear directions of its input data. Call a
function *g*: ℝ^{n} → {−1,1} a linear *k*-junta if it is completely determined by some *k*-dimensional subspace of the input space. A recent work of the authors showed that
linear *k*-juntas are testable. Thus there exists an algorithm to distinguish between: (1)
*f*: ℝ^{n} → {−1,1} which is a linear *k*-junta with surface area *s*. (2) *f* is є-far from any linear *k*-junta with surface area (1+є)*s*. The query complexity of the algorithm is independent of the ambient dimension
*n*.

Following the surge of interest in noise-tolerant property testing, in this paper
we prove a noise-tolerant (or robust) version of this result. Namely, we give an algorithm
which given any *c*>0, є>0, distinguishes between: (1) *f*: ℝ^{n} → {−1,1} has correlation at least *c* with some linear *k*-junta with surface area *s*. (2) *f* has correlation at most *c*−є with any linear *k*-junta with surface area at most *s*. The query complexity of our tester is *k*^{poly(s/є)}. Using our techniques, we also obtain a fully noise tolerant tester with the same
query complexity for any class *C* of linear *k*-juntas with surface area bounded by *s*. As a consequence, we obtain a fully noise tolerant tester with query complexity
*k*^{O(poly(logk/є))} for the class of intersection of *k*-halfspaces (for constant *k*) over the Gaussian space. Our query complexity is independent of the ambient dimension
*n*. Previously, no non-trivial noise tolerant testers were known even for a single halfspace.

Cut and spectral sparsification of graphs have numerous applications, including speeding up algorithms for cuts and Laplacian solvers. These powerful notions have recently been extended to hypergraphs, which are much richer and may offer new applications. However, the current bounds on the size of hypergraph sparsifiers are not as tight as the corresponding bounds for graphs.

Our first result is a polynomial-time algorithm that, given a hypergraph on *n* vertices with maximum hyperedge size *r*, outputs an є-spectral sparsifier with *O*^{*}(*nr*) hyperedges, where *O*^{*} suppresses (є^{−1} log*n*)^{O(1)} factors. This size bound improves the two previous bounds: *O*^{*}(*n*^{3}) [Soma and Yoshida, SODA’19] and *O*^{*}(*nr*^{3}) [Bansal, Svensson and Trevisan, FOCS’19]. Our main technical tool is a new method
for proving concentration of the nonlinear analogue of the quadratic form of the Laplacians
for hypergraph expanders.
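Concretely, the nonlinear analogue in question is the standard hypergraph energy (our gloss, following the usual definition in this line of work): for a weighted hypergraph *H* = (*V*, *E*, *w*),

```latex
Q_H(x) \;=\; \sum_{e \in E} w_e \,\max_{u,v \in e} (x_u - x_v)^2,
\qquad x \in \mathbb{R}^V,
```

which reduces to the Laplacian quadratic form *x*^{T}*Lx* for ordinary graphs (*r* = 2); an є-spectral sparsifier must preserve *Q*_{H}(*x*) up to a 1±є factor for all *x*.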

We complement this with lower bounds on the bit complexity of any compression scheme
that (1+є)-approximates all the cuts in a given hypergraph, and hence also on the
bit complexity of every є-cut/spectral sparsifier. These lower bounds are based on
Ruzsa-Szemerédi graphs, and a particular instantiation yields an Ω(*nr*) lower bound on the bit complexity even for fixed constant є. In the case of hypergraph
cut sparsifiers, this is tight up to polylogarithmic factors in *n*, due to a recent result of [Chen, Khanna and Nagda, FOCS’20]. For spectral sparsifiers
it narrows the gap to *O*^{*}(*r*).

Finally, for directed hypergraphs, we present an algorithm that computes an є-spectral
sparsifier with *O*^{*}(*n*^{2}*r*^{3}) hyperarcs, where *r* is the maximum size of a hyperarc. For small *r*, this improves over the *O*^{*}(*n*^{3}) bound known from [Soma and Yoshida, SODA’19], and approaches the trivial lower
bound of Ω(*n*^{2}) hyperarcs.

We study space-pass tradeoffs in graph streaming algorithms for parameter estimation and property testing problems
such as estimating the size of maximum matchings and maximum cuts, weight of minimum
spanning trees, or testing if a graph is connected or cycle-free versus being far
from these properties. We develop a new lower bound technique that proves that for
many problems of interest, including all the above, obtaining a (1+є)-approximation
requires either *n*^{Ω(1)} space or Ω(1/є) passes, even on highly restricted families of graphs such as bounded-degree
planar graphs. For several of these problems, this bound matches the guarantees of existing
algorithms and is thus (asymptotically) optimal. Our results considerably strengthen
prior lower bounds even for arbitrary graphs: starting from the influential work of
[Verbin, Yu; SODA 2011], there has been a plethora of lower bounds for single-pass
algorithms for these problems; however, the only multi-pass lower bounds, proven very
recently in [Assadi, Kol, Saxena, Yu; FOCS 2020], rule out sublinear-space algorithms
with exponentially smaller *o*(log(1/є)) passes for these problems. One key ingredient of our proofs is a simple
streaming XOR lemma that we prove, a generic hardness-amplification result: informally speaking, if
a *p*-pass *s*-space streaming algorithm can only solve a decision problem with advantage δ > 0
over random guessing, then it cannot solve XOR of ℓ independent copies of the problem
with advantage much better than δ^{ℓ}. This result can be of independent interest and useful for other streaming lower
bounds as well.
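Schematically, writing adv_{p,s}(*f*) for the best advantage over random guessing achievable by any *p*-pass, *s*-space streaming algorithm for a decision problem *f*, the lemma says (quantifiers and constants glossed; see the paper for the exact statement):

```latex
\mathrm{adv}_{p,s}(f) \le \delta
\quad\Longrightarrow\quad
\mathrm{adv}_{p,s}\bigl(f^{\oplus \ell}\bigr) \le O(\delta)^{\ell},
```

where *f*^{⊕ℓ} asks for the XOR of the answers on ℓ independent copies of the input.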

We study the decremental All-Pairs Shortest Paths (APSP) problem in undirected edge-weighted
graphs. The input to the problem is an undirected *n*-vertex *m*-edge graph *G* with non-negative lengths on edges, that undergoes an online sequence of edge deletions.
The goal is to support approximate shortest-paths queries: given a pair *x*,*y* of vertices of *G*, return a path *P* connecting *x* to *y*, whose length is within factor α of the length of the shortest *x*-*y* path, in time Õ(|*E*(*P*)|), where α is the approximation factor of the algorithm. APSP is one of the most
basic and extensively studied dynamic graph problems.

A long line of work culminated in the algorithm of [Chechik, FOCS 2018] with near
optimal guarantees: for any constant 0<є≤ 1 and parameter *k*≥ 1, the algorithm achieves approximation factor (2+є)*k*−1, and total update time *O*(*mn*^{1/k+o(1)}log(*nL*)), where *L* is the ratio of longest to shortest edge lengths. Unfortunately, like much prior
work, the algorithm is randomized and needs to assume an oblivious adversary; that
is, the input edge-deletion sequence is fixed in advance and may not depend on the
algorithm’s behavior. In many real-world scenarios, and in applications of APSP to
static graph problems, it is crucial that the algorithm works against an adaptive
adversary, where the edge deletion sequence may depend on the algorithm’s past behavior
arbitrarily; ideally, such an algorithm should be deterministic. Unfortunately, unlike
the oblivious-adversary setting, its adaptive-adversary counterpart is still poorly
understood. For unweighted graphs, the algorithm of [Henzinger, Krinninger and Nanongkai,
FOCS ’13, SICOMP ’16] achieves a (1+є)-approximation with total update time Õ(*mn*/є); the best current total update time guarantee of *n*^{2.5+O(є)} is achieved by the recent deterministic algorithm of [Chuzhoy, Saranurak, SODA’21],
with 2^{O(1/є)}-multiplicative and 2^{O(log^{3/4}n/є)}-additive approximation. To the best of our knowledge, for arbitrary non-negative
edge weights, the fastest current adaptive-update algorithm has total update time
*O*(*n*^{3}log*L*/є), achieving a (1+є)-approximation. Even if we are willing to settle for any *o*(*n*)-approximation factor, no currently known algorithm has a better than Θ(*n*^{3}) total update time in weighted graphs and better than Θ(*n*^{2.5}) total update time in unweighted graphs. Several conditional lower bounds suggest
that no algorithm with a sufficiently small approximation factor can achieve an *o*(*n*^{3}) total update time. Our main result is a deterministic algorithm for decremental
APSP in undirected edge-weighted graphs, that, for any Ω(1/loglog*m*)≤ є< 1, achieves approximation factor (log*m*)^{2^{O(1/є)}}, with total update time *O*(*m*^{1+O(є)}· (log*m*)^{O(1/є^{2})}· log*L*). In particular, we obtain a (*poly*log*m*)-approximation in time Õ(*m*^{1+є}) for any constant є, and, for any slowly growing function *f*(*m*), we obtain a (log*m*)^{f(m)}-approximation in time *m*^{1+o(1)}. We also provide an algorithm with similar guarantees for decremental Sparse Neighborhood
Covers.

We study dynamic algorithms for the longest increasing subsequence (LIS) problem. A dynamic LIS algorithm maintains a sequence subject to operations of the following form arriving one by one: insert an element, delete an element, or substitute an element for another. After each update, the algorithm must report the length of the longest increasing subsequence of the current sequence.
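As a point of reference, the static problem is solved by classical patience sorting; recomputing it from scratch after every update is the Õ(*n*)-per-operation baseline that sublinear dynamic algorithms must beat. A minimal sketch:

```python
from bisect import bisect_left

def lis_length(seq):
    """Classic O(n log n) static LIS via patience sorting: tails[i] holds
    the smallest possible tail of a strictly increasing subsequence of
    length i+1. Rerunning this after every insert/delete/substitute costs
    O(n log n) per update, which the dynamic algorithms below improve on."""
    tails = []
    for x in seq:
        i = bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)
        else:
            tails[i] = x
    return len(tails)
```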

Our main contribution is the first exact dynamic LIS algorithm with sublinear update
time. More precisely, we present a randomized algorithm that performs each operation
in time Õ(*n*^{4/5}) and, after each update, reports the answer to the LIS problem correctly with high
probability. We use several novel techniques and observations for this algorithm that
may find applications in future work.

In the second part of the paper, we study approximate dynamic LIS algorithms, which
are allowed to underestimate the solution size within a bounded multiplicative factor.
In this setting, we give a deterministic (1−*o*(1))-approximation algorithm with update time *O*(*n*^{o(1)}). This result improves upon the previous work of Mitzenmacher and Seddighin (STOC’20)
that provides an Ω(є^{O(1/є)})-approximation algorithm with update time Õ(*n*^{є}) for any є > 0.

We revisit the problem of maintaining the longest increasing subsequence (LIS) of
an array under (i) inserting an element, and (ii) deleting an element of an array.
In a recent breakthrough, Mitzenmacher and Seddighin [STOC 2020] designed an algorithm
that maintains an *O*((1/є)^{O(1/є)})-approximation of LIS under both operations with worst-case update time Õ(*n*^{є}), for any constant є>0 (Õ hides factors polynomial in log*n*, where *n* is the length of the input). We exponentially improve on their result by designing
an algorithm that maintains a (1+є)-approximation of LIS under both operations with
worst-case update time Õ(є^{−5}). Instead of working with the grid packing technique introduced by Mitzenmacher and
Seddighin, we take a different approach building on a new tool that might be of independent
interest: LIS sparsification.

A particularly interesting consequence of our result is an improved solution for the
so-called Erdős-Szekeres partitioning, in which we seek a partition of a given permutation
of {1,2,…,*n*} into *O*(√*n*) monotone subsequences. This problem has been repeatedly stated as one of the natural
examples in which we see a large gap between the decision-tree complexity and algorithmic
complexity. The result of Mitzenmacher and Seddighin implies an *O*(*n*^{1+є}) time solution for this problem, for any є>0. Our algorithm (in fact, its simpler
decremental version) further improves this to Õ(*n*).

We introduce a new framework for computing approximate maximum weight matchings. Our primary focus is on the fully dynamic setting, where there is a large gap between the guarantees of the best known algorithms for computing weighted and unweighted matchings. Indeed, almost all current weighted matching algorithms that reduce to the unweighted problem lose a factor of two in the approximation ratio. In contrast, in other sublinear models such as the distributed and streaming models, recent work has largely closed this weighted/unweighted gap.

For bipartite graphs, we almost completely settle the gap with a general reduction
that converts *any* algorithm for α-approximate unweighted matching to an algorithm for (1−є)α-approximate
weighted matching, while only increasing the update time by an *O*(log*n*) factor for constant є. We also show that our framework leads to significant improvements
for non-bipartite graphs, though not in the form of a universal reduction. In particular,
we give two algorithms for weighted non-bipartite matching:

1. A randomized (Las Vegas) fully dynamic algorithm that maintains a (1/2−є)-approximate maximum weight matching in worst-case update time *O*(polylog *n*) with high probability against an adaptive adversary. Our bounds are essentially the same as those of the unweighted algorithm of Wajc [STOC 2020].

2. A deterministic fully dynamic algorithm that maintains a (2/3−є)-approximate maximum weight matching in amortized update time *O*(*m*^{1/4}). Our bounds are essentially the same as those of the unweighted algorithm of Bernstein and Stein [SODA 2016].

A key feature of our framework is that it uses existing algorithms for unweighted matching as black-boxes. As a result, our framework is simple and versatile. Moreover, our framework easily translates to other models, and we use it to derive new results for the weighted matching problem in streaming and communication complexity models.

We study the online stochastic matching problem. Consider a bipartite graph with offline vertices on one side, and with i.i.d. online vertices on the other side. The offline vertices and the distribution of online vertices are known to the algorithm beforehand. The realization of the online vertices, however, is revealed one at a time, upon which the algorithm immediately decides how to match it. For maximizing the cardinality of the matching, we give a 0.711-competitive online algorithm, which improves the best previous ratio of 0.706. When the offline vertices are weighted, we introduce a 0.7009-competitive online algorithm for maximizing the total weight of the matched offline vertices, which improves the best previous ratio of 0.662.

Conceptually, we find that the analysis of online algorithms simplifies if the online vertices follow a Poisson process, and establish an approximate equivalence between this Poisson arrival model and online stochastic matching. Technically, we propose a natural linear program for the Poisson arrival model, and demonstrate how to exploit its structure by introducing a converse of Jensen’s inequality. Moreover, we design an algorithmic amortization to replace the analytic one in previous work, and as a result get the first vertex-weighted online stochastic matching algorithm that improves the results in the weaker random arrival model.

We introduce a continuous analogue of the Learning with Errors (LWE) problem, which we name CLWE. We give a polynomial-time quantum reduction from worst-case lattice problems to CLWE, showing that CLWE enjoys similar hardness guarantees to those of LWE. Alternatively, our result can also be seen as opening new avenues of (quantum) attacks on lattice problems. Our work resolves an open problem regarding the computational complexity of learning mixtures of Gaussians without separability assumptions (Diakonikolas 2016, Moitra 2018). As an additional motivation, (a slight variant of) CLWE was considered in the context of robust machine learning (Diakonikolas et al., FOCS 2017), where hardness in the statistical query (SQ) model was shown; our work addresses the open question regarding its computational hardness (Bubeck et al., ICML 2019).

We construct a succinct non-interactive publicly-verifiable delegation scheme for
any log-space uniform circuit under the sub-exponential Learning With Errors (LWE)
assumption. For a circuit *C*:{0,1}^{N}→{0,1} of size *S* and depth *D*, the prover runs in time poly(*S*), the communication complexity is *D* · polylog(*S*), and the verifier runs in time (*D*+*N*) ·polylog(*S*). To obtain this result, we introduce a new cryptographic primitive: *a lossy correlation-intractable hash function family*. We use this primitive to soundly instantiate the Fiat-Shamir transform for a large
class of interactive proofs, including the interactive sum-check protocol and the
GKR protocol, assuming the sub-exponential hardness of LWE.

Additionally, by relying on the result of Choudhuri et al. (STOC 2019), we establish (sub-exponential) average-case hardness of PPAD, assuming the sub-exponential hardness of LWE.

Let MK^{t}P[*s*] be the set of strings *x* such that *K*^{t}(*x*) ≤ *s*(|*x*|), where *K*^{t}(*x*) denotes the *t*-bounded Kolmogorov complexity of the truth table described by *x*. Our main theorem shows that for an appropriate notion of mild average-case hardness,
for every ε>0, polynomial *t*(*n*) ≥ (1+ε)*n*, and every “nice” class *F* of super-polynomial functions, the following are equivalent: (i) the existence of
some function *T* ∈ *F* such that *T*-hard one-way functions (OWFs) exist (with non-uniform security); (ii) the existence
of some function *T* ∈ *F* such that MK^{t}P[*T*^{−1}] is mildly average-case hard with respect to sublinear-time non-uniform algorithms
(with running-time *n*^{δ} for some 0<δ<1). For instance, the existence of subexponentially-hard (resp. quasi-polynomially-hard)
OWFs is equivalent to mild average-case hardness of MK^{t}P[*poly* log*n*] (resp. MK^{t}P[2^{O(√log n)}]) w.r.t. sublinear-time non-uniform algorithms. We additionally note that if we
want to deduce *T*-hard OWFs where security holds w.r.t. uniform *T*-time probabilistic attackers (i.e., uniformly-secure OWFs), it suffices to assume
sublinear time hardness of MK^{t}P w.r.t. uniform probabilistic sublinear-time attackers. We complement this result
by proving lower bounds that come surprisingly close to what is required to unconditionally
deduce the existence of (uniformly-secure) OWFs: MK^{t}P[*poly*log*n*] is worst-case hard w.r.t. uniform probabilistic sublinear-time algorithms, and MK^{t}P[*n*−log*n*] is mildly average-case hard for all *O*(*t*(*n*)/*n*^{3})-time deterministic algorithms.
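For orientation, *K*^{t} here is the usual *t*-time-bounded Kolmogorov complexity (our gloss; the paper fixes the precise details): for a fixed universal machine *U*,

```latex
K^t(x) \;=\; \min\bigl\{\, |\Pi| \;:\; U(\Pi) \text{ outputs } x
\text{ within } t(|x|) \text{ steps} \,\bigr\},
```

so MK^{t}P[*s*] collects exactly those strings whose *t*-bounded complexity is at most *s*(|*x*|).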

We show the existence of indistinguishability obfuscators (iO) for general circuits assuming subexponential security of: (a) the Learning with Errors (LWE) assumption (with subexponential modulus-to-noise ratio); (b) a circular security conjecture regarding the Gentry-Sahai-Waters' (GSW) encryption scheme and a Packed version of Regev's encryption scheme. The circular security conjecture states that a notion of leakage-resilient security, that we prove is satisfied by GSW assuming LWE, is retained in the presence of an encrypted key-cycle involving GSW and Packed Regev.

In a seminal work, Goldreich, Micali and Wigderson (CRYPTO ’86) demonstrated the wide
applicability of zero-knowledge proofs by constructing such a proof system for the
NP-complete problem of graph 3-coloring. A long-standing open question has been whether
parallel repetition of their protocol preserves zero knowledge. In this work, we answer
this question in the negative, assuming a standard cryptographic assumption (i.e.,
the hardness of learning with errors (*LWE*)).

Leveraging a connection observed by Dwork, Naor, Reingold, and Stockmeyer (FOCS ’99), our negative result is obtained by making positive progress on a related fundamental problem in cryptography: securely instantiating the Fiat-Shamir heuristic for eliminating interaction in public-coin interactive protocols. A recent line of work has shown how to instantiate the heuristic securely, albeit only for a limited class of protocols.
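To illustrate the heuristic itself (not the paper's hash families), here is a toy non-interactive Schnorr-style proof of knowledge over a deliberately tiny group, where the verifier's public-coin challenge is replaced by a hash of the transcript:

```python
import hashlib

P, G = 101, 2        # toy group Z_101^*, generator 2 of order Q = 100
Q = P - 1            # (illustration only -- never use such small parameters)

def H(*vals):
    # Fiat-Shamir: the public-coin challenge is derived by hashing
    # the transcript instead of being sent by a verifier.
    data = ",".join(map(str, vals)).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def prove(x, r=7):
    """Non-interactive proof of knowledge of x with y = g^x (mod p).
    The nonce r is fixed only to keep this demo deterministic; in
    practice it must be fresh and random for every proof."""
    y = pow(G, x, P)
    a = pow(G, r, P)           # prover's first message
    c = H(G, y, a)             # hashed challenge
    z = (r + c * x) % Q        # response
    return y, (a, z)

def verify(y, proof):
    a, z = proof
    c = H(G, y, a)
    return pow(G, z, P) == (a * pow(y, c, P)) % P
```

Soundness of this transform hinges on the hash behaving like a random oracle; the results above show how to instantiate such hashes from LWE for broad protocol classes.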

Our main result shows how to instantiate Fiat-Shamir for parallel repetitions of much
more general interactive proofs. In particular, we construct hash functions that,
assuming *LWE*, securely realize the Fiat-Shamir transform for the following rich classes of protocols:

1) The parallel repetition of any “commit-and-open” protocol (such as the GMW protocol mentioned above), when a specific (natural) commitment scheme is used. Commit-and-open protocols are a ubiquitous paradigm for constructing general purpose public-coin zero knowledge proofs.

2) The parallel repetition of any base protocol that (1) satisfies a stronger notion of soundness called round-by-round soundness, and (2) has an efficient procedure, using a suitable trapdoor, for recognizing “bad verifier randomness” that would allow the prover to cheat.

Our results are obtained by establishing a new connection between the Fiat-Shamir transform and list-recoverable codes. In contrast to the usual focus in coding theory, we focus on a parameter regime in which the input lists are extremely large, but the rate can be small. We give a (probabilistic) construction based on Parvaresh-Vardy codes (FOCS ’05) that suffices for our applications.

In this work we prove that there is a function *f* ∈ *E*^{ NP} such that, for every sufficiently large *n* and *d* = √*n*/log*n*, *f*_{n} (*f* restricted to *n*-bit inputs) cannot be (1/2 + 2^{−d})-approximated by *F*_{2}-polynomials of degree *d*. We also observe that a minor improvement (e.g., improving *d* to *n*^{1/2+ε} for any ε > 0) over our result would imply *E*^{ NP} cannot be computed by depth-3 *AC*^{0}-circuits of 2^{n^{1/2 + ε}} size, which is a notoriously hard open question in complexity theory.

Using the same proof techniques, we are also able to construct *extremely rigid* matrices over *F*_{2} in *P*^{ NP}. More specifically, we show that for every constant ε ∈ (0,1), there is a *P*^{ NP} algorithm which on input 1^{n} outputs an *n*× *n* *F*_{2}-matrix *H*_{n} satisfying *R*_{H_{n}}(2^{log^{1 − ε} n}) ≥ (1/2 − exp(−log^{2/3 · ε} *n*) ) · *n*^{2}, for every sufficiently large *n*. This improves the recent *P*^{ NP} constructions of rigid matrices in [Alman and Chen, FOCS 2019] and [Bhangale et al.,
FOCS 2020], which only give Ω(*n*^{2}) rigidity.

The key ingredient in the proof of our new results is a new *derandomized XOR lemma* based on *approximate linear sums*, which roughly says that given an *n*-input function *f* which cannot be 0.99-approximated by certain linear sum of *s* many functions in *F* within ℓ_{1}-distance, one can construct a new function *Amp*^{f} with *O*(*n*) input bits, which cannot be (1/2+*s*^{Ω(1)})-approximated by *F*-functions. Taking *F* to be a function collection containing low-degree *F*_{2}-polynomials or low-rank *F*_{2}-matrices, our results are then obtained by first using the algorithmic method to
construct a function which is weakly hard against linear sums of *F* in the above sense, and then applying the derandomized XOR lemma to *f*.

We obtain our new derandomized XOR lemma by giving a generalization of the famous
hardcore lemma by Impagliazzo. Our generalization in some sense constructs a *non-Boolean* hardcore of a weakly hard function *f* with respect to *F*-functions, from the weak inapproximability of *f* by any linear sum of *F* with bounded ℓ_{p}-norm. This generalization recovers the original hardcore lemma by considering the
ℓ_{∞}-norm. Surprisingly, when we switch to the ℓ_{1}-norm, we immediately rediscover Levin’s proof of Yao’s XOR Lemma. That is, these
first two proofs of Yao’s XOR Lemma can be unified with our new perspective. For proving
the correlation bounds, our new derandomized XOR lemma indeed works with the ℓ_{4/3}-norm.
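For reference, the classical Yao XOR lemma being generalized reads roughly as follows (size loss and constants glossed): if every size-*s* circuit errs on at least a δ fraction of inputs to *f*, then for every circuit *C* of size *s* · poly(ε, 1/*n*),

```latex
\Pr_{x_1,\dots,x_k}\bigl[\, C(x_1,\dots,x_k)
= f(x_1) \oplus \cdots \oplus f(x_k) \,\bigr]
\;\le\; \frac{1}{2} + (1-\delta)^{k} + \varepsilon .
```

The derandomized version developed here achieves a comparable advantage decay while the amplified function *Amp*^{f} consumes only *O*(*n*) input bits, rather than the *kn* bits required by independent copies.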

For a matrix *M* and a positive integer *r*, the rank *r* rigidity of *M* is the smallest number of entries of *M* which one must change to make its rank at most *r*. There are many known applications of rigidity lower bounds to a variety of areas
in complexity theory, but fewer known applications of rigidity upper bounds. In this
paper, we use rigidity upper bounds to prove new upper bounds in a few different models
of computation. Our results include:

- For any *d*>1, and over any field F, the *N* × *N* Walsh-Hadamard transform has a depth-*d* linear circuit of size *O*(*d* · *N*^{1 + 0.96/d}). This circumvents a known lower bound of Ω(*d* · *N*^{1 + 1/d}) for circuits with bounded coefficients over ℂ by Pudlák (2000), by using coefficients
of magnitude polynomial in *N*. Our construction also generalizes to linear transformations given by a Kronecker
power of any fixed 2 × 2 matrix.

- The *N* × *N* Walsh-Hadamard transform has a linear circuit of size ≤ (1.81 + *o*(1)) *N* log_{2} *N*, improving on the bound of ≈ 1.88 *N* log_{2} *N* which one obtains from the standard fast Walsh-Hadamard transform.

- A new rigidity upper bound, showing that the following classes of matrices are not
rigid enough to prove circuit lower bounds using Valiant’s approach: (1) for any field
F and any function *f* : {0,1}^{n} → F, the matrix *V*_{f} ∈ F^{2n × 2n} given by, for any *x*,*y* ∈ {0,1}^{n}, *V*_{f}[*x*,*y*] = *f*(*x* ∧ *y*), and (2) for any field F and any fixed-size matrices *M*_{1}, …, *M*_{n} ∈ F^{q × q}, the Kronecker product *M*_{1} ⊗ *M*_{2} ⊗ ⋯ ⊗ *M*_{n}. This generalizes recent results on non-rigidity, using a simpler approach which
avoids needing the polynomial method.

- New connections between recursive linear transformations like Fourier and Walsh-Hadamard transforms, and circuits for matrix multiplication.
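For context, the baseline circuit mentioned in the second bullet comes from the classical butterfly recursion; a minimal sketch follows (gate-count constants depend on the circuit model, which we do not reproduce here):

```python
def fwht(a):
    """Standard in-place fast Walsh-Hadamard transform: log2(N) stages of
    N/2 add/subtract butterflies, i.e. the classical linear circuit that
    the size bounds above improve upon. Applying it twice multiplies the
    input by N, since H^2 = N*I."""
    a = list(a)
    n = len(a)
    assert n and n & (n - 1) == 0, "length must be a power of two"
    h = 1
    while h < n:
        for i in range(0, n, h * 2):
            for j in range(i, i + h):
                x, y = a[j], a[j + h]
                a[j], a[j + h] = x + y, x - y
        h *= 2
    return a
```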

Valiant (1980) showed that general arithmetic circuits with negation can be exponentially
more powerful than monotone ones. We give the first improvement to this classical
result: we construct a family of polynomials *P*_{n} in *n* variables, all of whose monomials have non-negative coefficients, such that *P*_{n} can be computed by a polynomial-size *depth-three formula* but every monotone circuit computing it has size 2^{Ω(n^{1/4}/log(n))}.

The polynomial *P*_{n} embeds the SINK ∘ XOR function devised recently by Chattopadhyay, Mande and Sherif
(2020) to refute the Log-Approximate-Rank Conjecture in communication complexity.
To prove our lower bound for *P*_{n}, we develop a general connection between corruption of combinatorial rectangles by
any function *f* ∘ XOR and corruption of product polynomials by a certain polynomial *P*^{f} that is an arithmetic embedding of *f*. This connection should be of independent interest.

Using further ideas from communication complexity, we construct another family of
set-multilinear polynomials *f*_{n,m} such that both *F*_{n,m} − є· *f*_{n,m} and *F*_{n,m} + є· *f*_{n,m} have monotone circuit complexity 2^{Ω(n/log(n))} if є ≥ 2^{− Ω( m )}, where *F*_{n,m} = ∏_{i=1}^{n} (*x*_{i,1} +⋯+*x*_{i,m}) and *m* = *O*( *n*/log*n* ). The polynomials *f*_{n,m} have 0/1 coefficients and are in VNP. Proving such lower bounds for monotone circuits
has been advocated recently by Hrubeš (2020) as a first step towards proving lower
bounds against *general circuits* via his new approach.

We prove that the slice rank of a 3-tensor (a combinatorial notion introduced by Tao in the context of the cap-set problem), the analytic rank (a Fourier-theoretic notion introduced by Gowers and Wolf), and the geometric rank (a recently introduced algebro-geometric notion) are all equivalent up to an absolute constant. As a corollary, we obtain strong trade-offs on the arithmetic complexity of a biased bilinear map, and on the separation between computing a bilinear map exactly and on average. Our result settles open questions of Haramaty and Shpilka [STOC 2010], and of Lovett [Discrete Anal., 2019] for 3-tensors.

We give new and efficient black-box reconstruction algorithms for some classes of
depth-3 arithmetic circuits. As a consequence, we obtain the first efficient algorithm
for computing the tensor rank and for finding the optimal tensor decomposition as
a sum of rank-one tensors when the input is a *constant-rank* tensor. More specifically, we provide efficient learning algorithms that run in randomized polynomial time over general fields, and in deterministic polynomial time over certain fields, for the following classes: 1) *Set-multilinear depth-3 circuits of constant top fan-in (set-multilinear ΣΠΣ(k) circuits)*. As a consequence of our algorithm, we obtain the first polynomial-time algorithm
for tensor rank computation and optimal tensor decomposition of constant-rank tensors.
This result holds for *d*-dimensional tensors for any *d*, but is interesting even for *d*=3. 2) *Sums of powers of constantly many linear forms (Σ∧Σ(k) circuits)*. As a consequence we obtain the first polynomial-time algorithm for tensor rank computation and optimal tensor decomposition of constant-rank symmetric tensors. 3) *Multilinear depth-3 circuits of constant top fan-in (multilinear ΣΠΣ(k) circuits)*. Our algorithm works over all fields of characteristic 0 or large enough characteristic.
Prior to our work, the only efficient algorithms known worked over polynomially-sized finite fields (see Karnin–Shpilka ’09), and the only polynomial-time or even subexponential-time algorithms known (deterministic or randomized) for subclasses of ΣΠΣ(k) circuits that also work over large/infinite fields were for the setting where the top fan-in *k* is at most 2 (see Sinha ’16 and Sinha ’20).

The fastest known LP solver for general (dense) linear programs is due to [Cohen,
Lee and Song’19] and runs in *O*^{*}(*n*^{ω} +*n*^{2.5−α/2} + *n*^{2+1/6}) time. A number of follow-up works [Lee, Song and Zhang’19, Brand’20, Song and Yu’20]
obtain the same complexity through different techniques, but none of them can go below
*n*^{2+1/6}, even if ω=2. This leaves a polynomial gap between the cost of solving linear systems
(*n*^{ω}) and the cost of solving linear programs, and as such, improving the *n*^{2+1/6} term is crucial toward establishing an equivalence between these two fundamental
problems. In this paper, we reduce the running time to *O*^{*}(*n*^{ω} +*n*^{2.5−α/2} + *n*^{2+1/18}) where ω and α are the fast matrix multiplication exponent and its dual. Hence, under
the common belief that ω ≈ 2 and α ≈ 1, our LP solver runs in *O*^{*}(*n*^{2.055}) time instead of *O*^{*}(*n*^{2.16}).
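The dominance relation among the three terms is easy to check numerically; a quick plain-Python sketch (the specific values of ω and α below are illustrative, not sharp current bounds):

```python
# Running-time exponent of the LP solver O*(n^omega + n^(2.5 - alpha/2) + n^(2 + 1/18)),
# evaluated for illustrative (omega, alpha) pairs.

def lp_exponent(omega: float, alpha: float) -> float:
    """Largest of the three exponents in the stated running time."""
    return max(omega, 2.5 - alpha / 2, 2 + 1 / 18)

# Under the common belief omega ~ 2 and alpha ~ 1, the n^(2 + 1/18) term dominates,
# giving the O*(n^2.055...) bound:
print(lp_exponent(2.0, 1.0))
# With larger omega (e.g. ~2.37) the n^omega term dominates instead:
print(lp_exponent(2.37, 0.32))
```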

A Bernoulli factory is an algorithmic procedure for exact sampling of certain random
variables having only Bernoulli access to their parameters. Bernoulli access to a
parameter *p* ∈ [0,1] means the algorithm does not know *p*, but has sample access to independent draws of a Bernoulli random variable with mean
equal to *p*. In this paper, we study the problem of Bernoulli factories for polytopes: given
Bernoulli access to a vector *x*∈ *P* for a given polytope *P*⊂ [0,1]^{n}, output a randomized vertex such that the expected value of the *i*-th coordinate is *exactly* equal to *x*_{i}. For example, for the special case of the perfect matching polytope, one is given
Bernoulli access to the entries of a doubly stochastic matrix [*x*_{ij}] and asked to sample a matching such that the probability that each edge (*i*,*j*) is present in the matching is exactly equal to *x*_{ij}.
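A toy instance makes the definition concrete (this is our own illustration, not the paper's general construction): for the polytope *P* = {x ∈ [0,1]² : x₁ + x₂ = 1}, with vertices e₁ and e₂, a single coin flip already gives an exact factory.

```python
import random

def bernoulli_oracle(x):
    """Bernoulli access to x in [0,1]^2: one independent coin per coordinate."""
    return lambda i: random.random() < x[i]

def factory_vertex(coin):
    """Exact Bernoulli factory for P = {x in [0,1]^2 : x1 + x2 = 1}.

    Output vertex (1,0) iff one flip of coordinate-0's coin is heads.
    Then E[output] = (x1, 1 - x1) = (x1, x2) exactly, using x1 + x2 = 1.
    """
    return (1, 0) if coin(0) else (0, 1)

# Empirical check: the mean of each output coordinate converges to x_i.
x = (0.3, 0.7)
coin = bernoulli_oracle(x)
samples = [factory_vertex(coin) for _ in range(100_000)]
print(sum(s[0] for s in samples) / len(samples))  # close to 0.3
```

Note that *P* is indeed the intersection of [0,1]² with an affine subspace, consistent with the characterization below.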

We show that a polytope *P* admits a Bernoulli factory if and only if *P* is the intersection of [0,1]^{n} with an affine subspace. Our construction is based on an algebraic formulation of
the problem, involving identifying a family of Bernstein polynomials (one per vertex)
that satisfy a certain algebraic identity on *P*. The main technical tool behind our construction is a connection between these polynomials
and the geometry of zonotope tilings.

We apply these results to construct an explicit factory for the perfect matching polytope.
The resulting factory is deeply connected to the combinatorial enumeration of arborescences
and may be of independent interest. For the *k*-uniform matroid polytope, we recover a sampling procedure known in statistics as
Sampford sampling.

We give a sharp lower bound on the capacity of a real stable polynomial, depending
only on the value of its gradient at *x* = 1. This result implies a sharp improvement to a similar inequality proved by Linial-Samorodnitsky-Wigderson
in 2000, which was crucial to the analysis of their permanent approximation algorithm.
Such inequalities have played an important role in the recent work on operator scaling
and its generalizations and applications, and in fact we use our bound to construct
a new scaling algorithm for real stable polynomials. Our bound is also quite similar
to one used very recently by Karlin-Klein-Oveis Gharan to give an improved approximation
factor for metric TSP.

The new technique we develop to prove this bound is productization, which says that any real stable polynomial can be approximated at any point in the positive orthant by a product of linear forms. Beyond the results of this paper, our main hope is that this new technique will allow us to avoid “frightening technicalities”, in the words of Laurent and Schrijver, that often accompany combinatorial lower bounds. In particular, we believe that this technique will be useful towards simplifying and improving further the approximation factor given in the fantastic work of Karlin-Klein-Oveis Gharan on metric TSP.

In this paper we provide new randomized algorithms with improved runtimes for solving
linear programs with two-sided constraints. In the special case of the minimum cost
flow problem on *n*-vertex *m*-edge graphs with integer polynomially-bounded costs and capacities we obtain a randomized
method which solves the problem in Õ(*m* + *n*^{1.5}) time. This improves upon the previous best runtime of Õ(*m* √*n*) [Lee-Sidford’14] and, in the special case of unit-capacity maximum flow, improves
upon the previous best runtimes of *m*^{4/3 + o(1)} [Liu-Sidford’20, Kathuria’20] and Õ(*m* √*n*) [Lee-Sidford’14] for sufficiently dense graphs.

In the case of ℓ_{1}-regression in a matrix with *n* columns and *m* rows, we obtain a randomized method which computes an є-approximate solution in Õ(*mn* + *n*^{2.5}) time. This yields a randomized method which computes an є-optimal policy of a discounted Markov Decision Process with *S* states and *A* actions per state in time Õ(*S*^{2} *A* + *S*^{2.5}). These methods improve upon the previous best runtimes of methods which depend polylogarithmically
on problem parameters, which were Õ(*mn*^{1.5}) [Lee-Sidford’15] and Õ(*S*^{2.5} *A*) [Lee-Sidford’14, Sidford-Wang-Wu-Ye’18] respectively.

To obtain this result we introduce two new algorithmic tools of possible independent interest. First, we design a new general interior point method for solving linear programs with two sided constraints which combines techniques from [Lee-Song-Zhang’19, Brand et al.’20] to obtain a robust stochastic method with iteration count nearly the square root of the smaller dimension. Second, to implement this method we provide dynamic data structures for efficiently maintaining approximations to variants of Lewis-weights, a fundamental importance measure for matrices which generalize leverage scores and effective resistances.

We investigate the approximability of the following optimization problem. The input
is an *n*× *n* matrix *A*=(*A*_{ij}) with real entries and an origin-symmetric convex body *K*⊂ ℝ^{n} that is given by a membership oracle. The task is to compute (or approximate) the
maximum of the quadratic form ∑_{i=1}^{n}∑_{j=1}^{n} *A*_{ij} *x*_{i}*x*_{j}=⟨ *x*,*Ax*⟩ as *x* ranges over *K*. This is a rich and expressive family of optimization problems; for different choices
of matrices *A* and convex bodies *K* it includes a diverse range of optimization problems like max-cut, Grothendieck/non-commutative
Grothendieck inequalities, small set expansion and more. While the literature studied
these special cases using case-specific reasoning, here we develop a general methodology
for treating the approximability and inapproximability aspects of these questions.

The underlying geometry of *K* plays a critical role; we show under commonly used complexity assumptions that polytime
constant-approximability necessitates that *K* has type-2 constant that grows slowly with *n*. However, we show that even when the type-2 constant is bounded, this problem sometimes
exhibits strong hardness of approximation. Thus, even within the realm of type-2 bodies,
the approximability landscape is nuanced and subtle.

However, the link that we establish between optimization and geometry of Banach spaces
allows us to devise a generic algorithmic approach to the above problem. We associate
to each convex body a new (higher dimensional) auxiliary set that is not convex, but
is approximately convex when *K* has a bounded type-2 constant. If our auxiliary set has an approximate separation
oracle, then we design an approximation algorithm for the original quadratic optimization
problem, using an approximate version of the ellipsoid method. Even though our hardness
result implies that such an oracle does not exist in general, this new question can
be solved in specific cases of interest by implementing a range of classical tools
from functional analysis, most notably the deep factorization theory of linear operators.

Beyond encompassing the scenarios in the literature for which constant-factor approximation algorithms were found, our generic framework implies that for convex sets with bounded type-2 constant, constant factor approximability is preserved under the following basic operations: (a) Subspaces, (b) Quotients, (c) Minkowski Sums, (d) Complex Interpolation. This yields a rich family of new examples where constant factor approximations are possible, which were beyond the reach of previous methods. We also show (under commonly used complexity assumptions) that for symmetric norms and unitarily invariant matrix norms the type-2 constant nearly characterizes the approximability of quadratic maximization.

We study the communication complexity of incentive compatible auction-protocols between a monopolist seller and a single buyer with a combinatorial valuation function over n items. Motivated by the fact that revenue-optimal auctions are randomized (as well as by an open problem of Babaioff, Gonczarowski, and Nisan), we focus on the randomized communication complexity of this problem (in contrast to most prior work on deterministic communication). We design simple, incentive compatible, and revenue-optimal auction-protocols whose expected communication complexity is much (in fact infinitely) more efficient than their deterministic counterparts. We also give nearly matching lower bounds on the expected communication complexity of approximately-revenue-optimal auctions. These results follow from a simple characterization of incentive compatible auction-protocols that allows us to prove lower bounds against randomized auction-protocols. In particular, our lower bounds give the first approximation-resistant, exponential separation between communication complexity of incentivizing vs implementing a Bayesian incentive compatible social choice rule, settling an open question of Fadel and Segal.

Min-max optimization of an objective function *f*: ℝ^{d} × ℝ^{d} → ℝ is an important model for robustness in an adversarial setting, with applications
to many areas including optimization, economics, and deep learning. In many applications
*f* may be nonconvex-nonconcave, and finding a global min-max point may be computationally
intractable. There is a long line of work that seeks computationally tractable algorithms
for alternatives to the min-max optimization model. However, many of the alternative
models have solution points which are only guaranteed to exist under strong assumptions
on *f*, such as convexity, monotonicity, or special properties of the starting point. We
propose an optimization model, the ε-greedy adversarial equilibrium, and show that
it can serve as a computationally tractable alternative to the min-max optimization
model. Roughly, we say that a point (*x*^{⋆}, *y*^{⋆}) is an ε-greedy adversarial equilibrium if *y*^{⋆} is an ε-approximate local maximum for *f*(*x*^{⋆},·), and *x*^{⋆} is an ε-approximate local minimum for a “greedy approximation” to the function max_{z} *f*(*x*, *z*) which can be efficiently estimated using second-order optimization algorithms. We
prove the existence of such a point for any smooth function which is bounded and has
Lipschitz Hessian. To prove existence, we introduce an algorithm that converges from
any starting point to an ε-greedy adversarial equilibrium in a number of evaluations
of the function *f*, the max-player’s gradient ∇_{y} *f*(*x*,*y*), and its Hessian ∇_{y}^{2} *f*(*x*,*y*), that is polynomial in the dimension *d*, 1/ε, and the bounds on *f* and its Lipschitz constant.

We study contextual search, a generalization of binary search in higher dimensions,
which captures settings such as feature-based dynamic pricing. Standard game-theoretic
formulations of this problem assume that agents act in accordance with a specific
behavioral model. In practice, some agents may not subscribe to the dominant behavioral
model or may act in ways that are seemingly *arbitrarily irrational*. Existing algorithms heavily depend on the behavioral model being (approximately)
accurate for all agents and have poor performance even with a few arbitrarily irrational
agents.

We initiate the study of contextual search when some of the agents can behave in ways inconsistent with the underlying behavioral model. In particular, we provide two algorithms, one based on multidimensional binary search methods and one based on gradient descent. Our techniques draw inspiration from learning theory, game theory, high-dimensional geometry, and convex analysis.
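In the one-dimensional, context-free special case, the pricing problem reduces to ordinary binary search against an agent with an unknown valuation; a minimal sketch (it assumes a perfectly rational agent, which is exactly the assumption the robust algorithms above relax):

```python
def search_valuation(buys, lo=0.0, hi=1.0, rounds=30):
    """Bisect on an unknown valuation v using buy/no-buy feedback.

    `buys(price)` returns True iff the agent accepts the posted price.
    A rational agent buys iff price <= v; an 'arbitrarily irrational'
    agent may answer inconsistently, which breaks this vanilla procedure
    and motivates the robust algorithms discussed above.
    """
    for _ in range(rounds):
        mid = (lo + hi) / 2
        if buys(mid):
            lo = mid   # accepted: valuation is at least mid
        else:
            hi = mid   # rejected: valuation is below mid
    return (lo + hi) / 2

v = 0.6180339887
estimate = search_valuation(lambda p: p <= v)
print(abs(estimate - v) < 1e-6)  # True for a rational agent
```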

Algorithms often have tunable parameters that impact performance metrics such as runtime
and solution quality. For many algorithms used in practice, no parameter settings
admit meaningful worst-case bounds, so the parameters are made available for the user
to tune. Alternatively, parameters may be tuned implicitly within the proof of a worst-case
guarantee. Worst-case instances, however, may be rare or nonexistent in practice.
A growing body of research has demonstrated that *data-driven algorithm design* can lead to significant improvements in performance. This approach uses a *training set* of problem instances sampled from an unknown, application-specific distribution and
returns a parameter setting with strong average performance on the training set.

We provide a broadly applicable theory for deriving *generalization guarantees* that bound the difference between the algorithm’s average performance over the training
set and its expected performance on the unknown distribution. Our results apply no
matter how the parameters are tuned, be it via an automated or manual approach. The
challenge is that for many types of algorithms, performance is a volatile function
of the parameters: slightly perturbing the parameters can cause a large change in
behavior. Prior research (e.g., Gupta and Roughgarden, SICOMP’17; Balcan et al., COLT’17,
ICML’18, EC’18) has proved generalization bounds by employing case-by-case analyses
of greedy algorithms, clustering algorithms, integer programming algorithms, and selling
mechanisms. We uncover a unifying structure which we use to prove extremely general
guarantees, yet we recover the bounds from prior research. Our guarantees, which are
tight up to logarithmic factors in the worst case, apply whenever an algorithm’s performance
is a piecewise-constant, -linear, or—more generally—*piecewise-structured* function of its parameters. Our theory also implies novel bounds for voting mechanisms
and dynamic programming algorithms from computational biology.

Let (*f*,*P*) be an incentive compatible mechanism where *f* is the social choice function and *P* is the payment function. In many important settings, *f* uniquely determines *P* (up to a constant) and therefore a common approach is to focus on the design of *f* and neglect the role of the payment function. Fadel and Segal [JET, 2009] question
this approach through the lens of communication complexity: can it be that the
communication complexity of an incentive compatible mechanism that implements *f* (that is, computes both the output and the payments) is much larger than the communication
complexity of computing the output? I.e., can it be that *cc*_{IC}(*f*)>>*cc*(*f*)? Fadel and Segal show that for every *f*, *cc*_{IC}(*f*)≤ *exp*(*cc*(*f*)). They also show that fully computing the incentive compatible mechanism is strictly
harder than computing only the output: there exists a social choice function *f* such that *cc*_{IC}(*f*)=*cc*(*f*)+1. In a follow-up work, Babaioff, Blumrosen, Naor, and Schapira [EC’08] provide
a social choice function *f* such that *cc*_{IC}(*f*)=Θ(*n*· *cc*(*f*)), where *n* is the number of players. The question of whether the exponential upper bound of
Fadel and Segal is tight remained wide open. In this paper we solve this question
by explicitly providing a function *f* such that *cc*_{IC}(*f*)= *exp*(*cc*(*f*)). In fact, we establish this via two very different proofs. In contrast, we show
that if the players are risk-neutral, then compromising on a randomized truthful-in-expectation implementation (rather than a deterministic ex-post implementation) gives *cc*_{TIE}(*f*)=*poly*(*n*,*cc*(*f*)) for every function *f*, as long as the domain of *f* is single-parameter or a convex multi-parameter domain. We also provide efficient
algorithms for deterministic computation of payments in several important domains.

We consider the problem of implementing a fixed social choice function between multiple
players (which takes as input a type *t*_{i} from each player *i* and outputs an outcome *f*(*t*_{1},…, *t*_{n})), in which each player must be incentivized to follow the protocol. In particular,
we study the communication requirements of a protocol which: (a) implements *f*, (b) implements *f* and computes payments that make it ex-post incentive compatible (EPIC) to follow
the protocol, and (c) implements *f* and computes payments in a way that makes it dominant-strategy incentive compatible
(DSIC) to follow the protocol.

We show exponential separations between all three of these quantities, already for
just two players. That is, we first construct an *f* such that *f* can be implemented in communication *c*, but any EPIC implementation of *f* (with any choice of payments) requires communication exp(*c*). This answers an open question of [Fadel and Segal, 2009; Babaioff et. al., 2013].
Second, we construct an *f* such that an EPIC protocol implements *f* with communication *C*, but all DSIC implementations of *f* require communication exp(*C*).

We show that the volume of a convex body in *R*^{n} in the general membership oracle model can be computed to within relative error ε
using *O*(*n*^{3}ψ^{2}/ε^{2}) oracle queries, where ψ is the KLS constant. With the current bound of ψ=*O*(*n*^{o(1)}), this gives an *O*(*n*^{3+o(1)}/ε^{2}) algorithm, the first improvement on the Lovász-Vempala *O*(*n*^{4}/ε^{2}) algorithm from 2003. The main new ingredient is an *O*(*n*^{3}ψ^{2}) algorithm for isotropic transformation, following which we can apply the *O*(*n*^{3}/ε^{2}) volume algorithm of Cousins and Vempala for well-rounded convex bodies. A positive
resolution of the KLS conjecture would imply an *O*(*n*^{3}/ε^{2}) volume algorithm. We also give an efficient implementation of the new algorithm
for convex polytopes defined by *m* inequalities in *R*^{n}: polytope volume can be estimated in time *O*(*mn*^{c}/ε^{2}) where *c*<3.2 depends on the current matrix multiplication exponent and improves on the previous
best bound.
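The membership-oracle model itself is easy to illustrate; below is a naive Monte Carlo baseline (our own sketch, not the algorithm above: rejection sampling from a bounding cube needs exponentially many oracle queries as the dimension grows, which is precisely the blow-up the algorithms above avoid):

```python
import random

def estimate_volume(member, dim, radius, n_samples=200_000, seed=0):
    """Naive volume estimate in the membership-oracle model.

    Samples uniformly from the bounding cube [-radius, radius]^dim and
    queries the oracle; the hit fraction times the cube volume estimates
    vol(K). Only practical in very low dimension.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        p = [rng.uniform(-radius, radius) for _ in range(dim)]
        if member(p):
            hits += 1
    return (2 * radius) ** dim * hits / n_samples

# Unit disk in R^2: true volume is pi.
in_disk = lambda p: p[0] ** 2 + p[1] ** 2 <= 1
print(estimate_volume(in_disk, dim=2, radius=1.0))  # ~3.14
```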

Given a set of pairwise disjoint polygonal obstacles in the plane, finding an obstacle-avoiding
Euclidean shortest path between two points is a classical problem in computational
geometry and has been studied extensively. Previously, Hershberger and Suri [SIAM
J. Comput. 1999] gave an algorithm with *O*(*n*log*n*) time and *O*(*n*log*n*) space, where *n* is the total number of vertices of all obstacles. Recently, by modifying Hershberger and Suri’s algorithm, Wang [SODA 2021] reduced the space to *O*(*n*) while keeping the *O*(*n*log*n*) runtime. In this paper, we present a new algorithm with *O*(*n*+*h*log*h*) time and *O*(*n*) space, provided that a triangulation of the free space is given, where *h* is the number of obstacles. Our algorithm builds a shortest path map for a source
point *s*, so that given any query point *t*, the shortest path length from *s* to *t* can be computed in *O*(log*n*) time and a shortest *s*-*t* path can be produced in additional time linear in the number of edges of the path.

Given a finite point set *P* in ℝ^{d} and є>0, we say that *N*⊆ ℝ^{d} is a weak є-net if it pierces every convex set *K* with |*K*∩ *P*|≥ є |*P*|.

Let *d*≥ 3. We show that for any finite point set in ℝ^{d} and any є>0, there exists a weak є-net of cardinality *O*(1/є^{d−1/2+γ}), where γ>0 is an arbitrarily small constant.

This is the first improvement of the bound of *O*^{*}(1/є^{d}) that was obtained in 1993 by Chazelle, Edelsbrunner, Grigni, Guibas, Sharir, and
Welzl for general point sets in dimension *d*≥ 3.

In this paper we describe a fully-dynamic data structure that supports point location
queries in a connected planar subdivision with *n* edges. Our data structure uses *O*(*n*) space, answers queries in *O*(log*n*) time, and supports updates in *O*(log*n*) time. Our solution is based on a data structure for vertical ray shooting queries
that supports queries and updates in *O*(log*n*) time.

Conjunctive queries are one of the most common classes of queries used in database systems, and among the best studied in the literature. A seminal result of Grohe, Schwentick, and Segoufin (STOC 2001) demonstrates that for every class G of graphs, the evaluation of all conjunctive queries whose underlying graph is in G is tractable if, and only if, G has bounded treewidth. In this work, we extend this characterization to the counting problem for conjunctive queries. Specifically, for every class C of conjunctive queries with bounded treewidth, we introduce the first fully polynomial-time randomized approximation scheme (FPRAS) for counting answers to a query in C, and the first polynomial-time algorithm for sampling answers uniformly from a query in C. As a corollary, it follows that for every class G of graphs, the counting problem for conjunctive queries whose underlying graph is in G admits an FPRAS if, and only if, G has bounded treewidth (unless BPP is different from P). In fact, our FPRAS is more general, and also applies to conjunctive queries with bounded hypertree width, as well as unions of such queries.

The key ingredient in our proof is the resolution of a fundamental counting problem from automata theory. Specifically, we demonstrate the first FPRAS and polynomial-time sampler for the set of trees of size n accepted by a tree automaton, which improves upon the prior quasi-polynomial time randomized approximation scheme (QPRAS) and sampling algorithm of Gore, Jerrum, Kannan, Sweedyk, and Mahaney ’97. We demonstrate how this algorithm can be used to obtain an FPRAS for many open problems, such as counting solutions to constraint satisfaction problems (CSP) with bounded hypertree width, counting the number of error threads in programs with nested call subroutines, and counting valid assignments to structured DNNF circuits.
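The word-counting problem underlying these results (the #NFA problem described earlier) has a simple exact baseline: determinize on the fly and count paths of state-subsets. A sketch (our own illustration; it is worst-case exponential in the number of NFA states, which is precisely why only approximation is tractable in general):

```python
from collections import defaultdict

def count_accepted(delta, start, accepting, alphabet, k):
    """Number of length-k strings accepted by the NFA (delta, start, accepting).

    delta[q][a] is the set of successor states. We run the subset
    construction on the fly: each string maps to a *unique* sequence of
    state-subsets, so counting subset paths counts strings (path counting
    on the raw NFA would overcount ambiguous strings).
    """
    counts = {frozenset(start): 1}
    for _ in range(k):
        nxt = defaultdict(int)
        for subset, c in counts.items():
            for a in alphabet:
                succ = frozenset(q2 for q in subset for q2 in delta.get(q, {}).get(a, ()))
                nxt[succ] += c
        counts = dict(nxt)
    return sum(c for s, c in counts.items() if s & accepting)

# NFA over {0,1} accepting strings containing at least one '1'
# (deliberately ambiguous: state 0 loops on '1' while also moving to 1).
delta = {0: {'0': {0}, '1': {0, 1}}, 1: {'0': {1}, '1': {1}}}
print(count_accepted(delta, {0}, {1}, '01', 2))  # 3 strings: 01, 10, 11
```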

We give an algorithm that computes a (1+є)-approximate Steiner forest in near-linear time *n* · 2^{(1/є)^{O(ddim^2)} (loglog*n*)^{2}}, where *ddim* is the doubling dimension of the metric space. This improves upon the best previous result due to Chan et al. (SIAM J. Comput. 4 (2018)), who gave a runtime of about *n*^{2^{O(ddim)}} · 2^{(ddim/є)^{O(ddim)} √log*n*}. For Steiner tree our methods achieve an even better runtime *n* (log*n*)^{(1/є)^{O(ddim^2)}}.

Weighted flow time is a fundamental and very well-studied objective function in scheduling. In this paper, we study the setting of a single machine with preemptions.

The input consists of a set of jobs, characterized by their processing times, release times, and weights, and we want to compute a (possibly preemptive) schedule for them. The objective is to minimize the sum of the weighted flow times of the jobs, where the flow time of a job is the time between its release time and its completion time.
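To make the objective concrete, here is a small simulator (our own illustration; the preemptive highest-density-first rule used here is a simple heuristic, not the approximation algorithm of this paper):

```python
def weighted_flow_time(jobs):
    """Total weighted flow time of preemptive highest-density-first.

    jobs: list of (release, processing, weight) with positive processing
    times. At every moment we run the available job maximizing
    weight / remaining work. Flow time of a job = completion - release.
    """
    jobs = sorted(jobs)                      # by release time
    t, i, total = 0.0, 0, 0.0
    active = []                              # [remaining, release, weight]
    while i < len(jobs) or active:
        if not active:                       # idle until the next release
            t = max(t, jobs[i][0])
        while i < len(jobs) and jobs[i][0] <= t:
            r, p, w = jobs[i]
            active.append([p, r, w])
            i += 1
        job = max(active, key=lambda j: j[2] / j[0])   # highest density
        horizon = jobs[i][0] if i < len(jobs) else float('inf')
        run = min(job[0], horizon - t)       # run until done or next release
        job[0] -= run
        t += run
        if job[0] == 0:
            total += job[2] * (t - job[1])   # weight * flow time
            active.remove(job)
    return total

# Heavy short job released at time 1 preempts the light long job.
print(weighted_flow_time([(0, 3, 1), (1, 1, 10)]))  # 14.0
```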

It had been a long-standing open problem to find a polynomial-time *O*(1)-approximation algorithm for this setting. In a recent breakthrough result, Batra,
Garg, and Kumar (FOCS 2018) found such an algorithm if the input data are polynomially
bounded integers, and Feige, Kulkarni, and Li (SODA 2019) presented a black-box reduction
to this setting. The resulting approximation ratio is a (not explicitly stated) constant
which is at least 10000. In this paper we improve this ratio to 2+ε. The algorithm
by Batra, Garg, and Kumar (FOCS 2018) reduces the problem to Demand MultiCut on trees
and solves the resulting instances via LP-rounding and a dynamic program. Instead,
we first reduce the problem to a (different) geometric problem while losing only a
factor 1+ε, and then solve its resulting instances up to a factor of 2+ε by a dynamic
program. In particular, our reduction ensures certain structural properties, thanks
to which we do not need LP-rounding methods.

We believe that our result makes substantial progress towards finding a PTAS for weighted flow time on a single machine.

The (non-uniform) sparsest cut problem is the following graph-partitioning problem: given a “supply” graph, and demands on pairs of vertices, delete some subset of supply edges to minimize the ratio of the supply edges cut to the total demand of the pairs separated by this deletion. Despite much effort, there are only a handful of nontrivial classes of supply graphs for which constant-factor approximations are known.
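The ratio objective can be stated exhaustively; a brute-force sketch for tiny instances (exponential in the number of vertices, purely to make the definition concrete, with unit-weight supply edges):

```python
from itertools import combinations

def sparsest_cut(n, supply, demand):
    """Non-uniform sparsest cut by exhaustive search over vertex subsets.

    supply: list of (u, v) unit-weight supply edges.
    demand: dict {(u, v): amount}. Deleting the supply edges crossing
    (S, V - S) separates exactly the demand pairs split by S; we
    minimize cut-weight / separated-demand over all subsets S.
    """
    best = float('inf')
    for r in range(1, n):
        for S in combinations(range(n), r):
            S = set(S)
            cut = sum(1 for u, v in supply if (u in S) != (v in S))
            dem = sum(d for (u, v), d in demand.items() if (u in S) != (v in S))
            if dem > 0:
                best = min(best, cut / dem)
    return best

# Path 0-1-2 with unit demand between the endpoints: one cut edge suffices.
print(sparsest_cut(3, [(0, 1), (1, 2)], {(0, 2): 1.0}))  # 1.0
```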

We consider the problem for planar graphs, and give a (2+є)-approximation algorithm
that runs in quasipolynomial time. Our approach defines a new structural decomposition
of an optimal solution using a “patching” primitive. We combine this decomposition
with a Sherali-Adams-style linear programming relaxation of the problem, which we
then round. This should be compared with the polynomial-time approximation algorithm
of Rao (1999), which uses the metric linear programming relaxation and ℓ_{1}-embeddings, and achieves an *O*(√log*n*)-approximation in polynomial time.

We consider the problem of online scheduling on a single machine in order to minimize weighted flow time. The existing algorithms for this problem (STOC ’01, SODA ’03, FOCS ’18) all require exact knowledge of the processing time of each job. This assumption is crucial, as even a slight perturbation of the processing time would lead to a polynomial competitive ratio. However, this assumption very rarely holds in real-life scenarios.

In this paper, we present the first algorithm for weighted flow time which does *not* require exact knowledge of the processing times of jobs. Specifically, we introduce
the Scheduling with Predicted Processing Time (SPPT) problem, where the algorithm
is given a prediction for the processing time of each job, instead of its real processing
time. For the case of a constant factor distortion between the predictions and the
real processing time, our algorithms match *all* the best known competitiveness bounds for weighted flow time – namely *O*(log*P*), *O*(log*D*) and *O*(log*W*), where *P*,*D*,*W* are the maximum ratios of processing times, densities, and weights, respectively.
For larger errors, the competitiveness of our algorithms degrades gracefully.

There has been a recent wave of interest in intermediate trust models for differential privacy that eliminate the need for a fully trusted central data collector, but overcome the limitations of local differential privacy. This interest has led to the introduction of the shuffle model (Cheu et al., EUROCRYPT 2019; Erlingsson et al., SODA 2019) and revisiting the pan-private model (Dwork et al., ITCS 2010). The message of this line of work is that, for a variety of low-dimensional problems—such as counts, means, and histograms—these intermediate models offer nearly as much power as central differential privacy. However, there has been considerably less success using these models for high-dimensional learning and estimation problems.

In this work we prove the first non-trivial lower bounds for high-dimensional learning
and estimation in both the pan-private model and the general multi-message shuffle
model. Our lower bounds apply to a variety of problems—for example, we show that
private agnostic learning of parity functions over *d* bits requires Ω(2^{d/2}) samples in these models, and privately selecting the most common attribute from
a set of *d* choices requires Ω(*d*^{1/2}) samples, both of which are exponential separations from the central model.

Prediction algorithms assign numbers to individuals that are popularly understood as individual “probabilities”—what is the probability of 5-year survival after cancer diagnosis?—and which increasingly form the basis for life-altering decisions. Drawing on an understanding of computational indistinguishability developed in complexity theory and cryptography, we introduce Outcome Indistinguishability. Predictors that are Outcome Indistinguishable (OI) yield a generative model for outcomes that cannot be efficiently refuted on the basis of the real-life observations produced by Nature.

We investigate a hierarchy of OI definitions, whose stringency increases with the degree to which distinguishers may access the predictor in question. Our findings reveal that OI behaves qualitatively differently than previously studied notions of indistinguishability. First, we provide constructions at all levels of the hierarchy. Then, leveraging recently-developed machinery for proving average-case fine-grained hardness, we obtain lower bounds on the complexity of the more stringent forms of OI. This hardness result provides the first scientific grounds for the political argument that, when inspecting algorithmic risk prediction instruments, auditors should be granted oracle access to the algorithm, not simply historical predictions.

We construct asymptotically optimal adjacency labelling schemes for every hereditary
class containing 2^{Ω(n^2)} *n*-vertex graphs as *n*→ ∞. This regime contains many classes of interest, for instance perfect graphs or
comparability graphs, for which we obtain an adjacency labelling scheme with labels
of *n*/4+*o*(*n*) bits per vertex. This implies the existence of a reachability labelling scheme for
digraphs with labels of *n*/4+*o*(*n*) bits per vertex and a comparability labelling scheme for posets with labels of *n*/4+*o*(*n*) bits per element. All these results are best possible, up to the lower-order term.

We obtain a description of the Bipartite Perfect Matching decision problem as a multilinear polynomial over the reals. We show that it has full degree and (1−*o*_{n}(1))· 2^{n^2} monomials with non-zero coefficients. In contrast, we show that in the dual representation
(switching the roles of 0 and 1) the number of monomials is *only* exponential in Θ(*n* log*n*). Our proof relies heavily on the fact that the lattice of graphs which are “matching-covered”
is Eulerian.

We study the graphlet sampling problem: given an integer *k* ≥ 3 and a graph *G*=(*V*,*E*), sample a connected induced *k*-node subgraph of *G* (also called *k*-graphlet) uniformly at random. This is a fundamental graph mining primitive, with
applications in social network analysis and bioinformatics. The two state-of-the-art
techniques are random walks and color coding. The random walk is elegant, but the
current upper bounds and lower bounds on its mixing time suffer a gap of Δ^{k−1} where Δ is the maximum degree of *G*. Color coding is better understood, but requires a 2^{O(k)} *m*-time preprocessing over the entire graph. Moreover, no efficient algorithm is known
for sampling graphlets uniformly — random walks and color coding yield only є-uniform
samples. In this work, we provide the following results:

(i) A near-optimal mixing time bound for the classic *k*-graphlet random walk, as a function of the mixing time of *G*. In particular, ignoring *k*^{O(k)} factors, we show that the *k*-graphlet random walk mixes in Θ(*t*(*G*) · ρ(*G*)^{k−1}) steps, where *t*(*G*) is the mixing time of *G* and ρ(*G*) is the ratio between its maximum and minimum degree, and on some graphs this is
tight up to lg*n* factors.

(ii) The first efficient algorithm for uniform graphlet sampling. The algorithm has
a preprocessing phase that uses time *O*(*n* *k*^{2} lg*k* + *m*) and space *O*(*n*), and a sampling phase that takes *k*^{O(k)} lgΔ time per sample. It is based on ordering *G* in a simple way, so as to virtually partition the graphlets into buckets, and then sampling
from those buckets using rejection sampling. The algorithm can be used also for counting,
with additive guarantees.

(iii) A near-optimal algorithm for є-uniform graphlet sampling, with a preprocessing
phase that runs in time *O*(*k*^{6} є^{−1} *n* lg*n*) and space *O*(*n*), and a sampling phase that takes *k*^{O(k)}(1/є)^{10} lg1/є expected time per sample. The algorithm is based on a nontrivial sketching
of the ordering of *G*, followed by emulating uniform sampling through coupling arguments.
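As a point of reference for the uniform-sampling guarantee in (ii), here is a brute-force sketch (not the paper's bucket-based algorithm): enumerating all connected induced *k*-node subgraphs gives exactly uniform samples, at an exponential cost that is only viable for tiny graphs. The 5-cycle example is illustrative.

```python
import random
from itertools import combinations

def induced_connected(nodes, adj):
    """Is the subgraph induced by `nodes` connected? (DFS inside the set)"""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        for v in adj[stack.pop()]:
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes

def uniform_graphlet(adj, k, rng=random):
    """Exactly uniform k-graphlet sampling by brute-force enumeration."""
    graphlets = [c for c in combinations(adj, k) if induced_connected(c, adj)]
    return rng.choice(graphlets)

# 5-cycle: its 3-graphlets are exactly the five paths on consecutive vertices.
cycle = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
print(uniform_graphlet(cycle, 3))
```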

Minimum-weight cut (min-cut) is a basic measure of a network’s connectivity strength.
While the min-cut can be computed efficiently in the sequential setting [Karger STOC’96],
there was no efficient way for a distributed network to compute its own min-cut without
limiting the input structure or dropping the output quality: In the standard CONGEST
model, existing algorithms with nearly-optimal time (e.g. [Ghaffari, Kuhn, DISC’13;
Nanongkai, Su, DISC’14]) can guarantee at best a (1+є)-approximate solution, while the exact
while the exact Õ(*n*^{0.8}*D*^{0.2} + *n*^{0.9})-time algorithm [Ghaffari, Nowicki, Thorup, SODA’20] works only on simple networks
(no weights and no parallel edges). Throughout, *n* and *D* denote the network’s number of vertices and hop-diameter, respectively. For the weighted
case, the best bound was Õ(*n*) [Daga, Henzinger, Nanongkai, Saranurak, STOC’19]. In this paper, we provide an exact
Õ(√*n* + *D*)-time algorithm for computing min-cut on weighted networks. Our result improves even
the previous algorithm that works only on simple networks. Its time complexity matches
the known lower bound up to polylogarithmic factors. At the heart of our algorithm
are a routing trick and two structural lemmas regarding the structure of a minimum
cut of a graph. These two structural lemmas considerably strengthen and generalize
the framework of Mukhopadhyay-Nanongkai [STOC’20] and can be of independent interest.

In this paper we show that the Minimum Spanning Tree problem (MST) can be solved *deterministically* in *O*(1) rounds of the *Congested* *Clique* model.

In the *Congested* *Clique* model there are *n* players that perform computation in synchronous rounds. Each round consists of a phase
of local computation and a phase of communication, in which each pair of players is
allowed to exchange *O*(log*n*) bit messages. The study of this model began with the MST problem: in the paper
by Lotker, Pavlov, Patt-Shamir, and Peleg [SPAA’03, SICOMP’05], which defines the *Congested* *Clique* model, the authors give a deterministic *O*(loglog*n*) round algorithm that improves over a trivial *O*(log*n*) round adaptation of Borůvka’s algorithm.
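For context, the trivial adaptation mentioned above starts from sequential Borůvka, sketched below under the standard assumption of distinct edge weights: in each phase every component hooks its lightest incident edge, so the number of components at least halves and *O*(log*n*) phases suffice.

```python
def boruvka_mst(n, edges):
    """Sequential Borůvka on edges given as (weight, u, v) triples
    with distinct weights; returns (MST edge list, number of phases)."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    mst, phases = [], 0
    while True:
        best = {}  # component root -> lightest edge leaving it
        for w, u, v in edges:
            ru, rv = find(u), find(v)
            if ru != rv:
                for r in (ru, rv):
                    if r not in best or (w, u, v) < best[r]:
                        best[r] = (w, u, v)
        if not best:
            return mst, phases
        phases += 1
        for w, u, v in best.values():
            ru, rv = find(u), find(v)
            if ru != rv:  # recheck to avoid cycles within a phase
                parent[ru] = rv
                mst.append((w, u, v))

edges = [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 0, 3), (5, 0, 2)]
tree, phases = boruvka_mst(4, edges)
print(sum(w for w, _, _ in tree))  # total MST weight: 6
```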

There was a sequence of gradual improvements to this result: an *O*(logloglog*n*) round algorithm by Hegeman, Pandurangan, Pemmaraju, Sardeshmukh, and Scquizzato
[PODC’15], an *O*(log^{*} *n*) round algorithm by Ghaffari and Parter [PODC’16], and an *O*(1) round algorithm by Jurdziński and Nowicki [SODA’18], but all those algorithms
were randomized. Therefore, the question about the existence of any deterministic
*o*(loglog*n*) round algorithm for the Minimum Spanning Tree problem remained open since the seminal
paper by Lotker, Pavlov, Patt-Shamir, and Peleg [SPAA’03, SICOMP’05].

Our result resolves this question and establishes that *O*(1) rounds is enough to solve the MST problem in the *Congested* *Clique* model, even if we are not allowed to use any randomness. Furthermore, the amount
of communication needed by the algorithm makes it applicable to a variant of the *MPC* model using machines with local memory of size *O*(*n*).

Many distributed optimization algorithms achieve *existentially-optimal* running times, meaning that there exists *some* pathological worst-case topology on which no algorithm can do better. Still, most
networks of interest allow for exponentially faster algorithms. This motivates two
questions:

(i) What network topology parameters determine the complexity of distributed optimization?

(ii) Are there *universally-optimal* algorithms that are as fast as possible on *every* topology?

We resolve these 25-year-old open problems in the known-topology setting (i.e., supported CONGEST) for a wide class of global network optimization problems including MST, (1+є)-min cut, various approximate shortest paths problems, sub-graph connectivity, etc.

In particular, we provide several (equivalent) graph parameters and show they are
tight *universal lower bounds* for the above problems, fully characterizing their inherent complexity. Our results
also imply that algorithms based on the low-congestion shortcut framework match the
above lower bound, making them universally optimal if shortcuts are efficiently approximable.

Distributed vertex coloring is one of the classic and probably most
widely studied problems in the area of distributed graph algorithms. We present a
new randomized distributed vertex coloring algorithm for the standard CONGEST model,
where the network is modeled as an *n*-node graph *G*, and where the nodes of *G* operate in synchronous communication rounds in which they can exchange *O*(log*n*)-bit messages over all the edges of *G*. For graphs with maximum degree Δ, we show that the (Δ+1)-list coloring problem (and
therefore also the standard (Δ+1)-coloring problem) can be solved in *O*(log^{5}log*n*) rounds. Previously such a result was only known for the significantly more powerful
LOCAL model, where in each round, neighboring nodes can exchange messages of arbitrary
size. The best previous (Δ+1)-coloring algorithm in the CONGEST model had a running
time of *O*(logΔ + log^{6}log*n*) rounds. As a function of *n* alone, the best previous algorithm therefore had a round complexity of *O*(log*n*), which is a bound that can also be achieved by a naïve folklore algorithm. For
large maximum degree Δ, our algorithm hence is an exponential improvement over the
previous state of the art.

In the multiparty number-in-hand set disjointness problem, we have *k* players, with private inputs *X*_{1},…,*X*_{k} ⊆ [*n*]. The players’ goal is to check whether ∩_{ℓ=1}^{k} *X*_{ℓ} = ∅. It is known that in the shared blackboard model of communication, set disjointness
requires Ω(*n* log*k* + *k*) bits of communication, and in the coordinator model, it requires Ω(*kn*) bits. However, these two lower bounds require that the players’ inputs can be highly
correlated. We study the communication complexity of multiparty set disjointness under
product distributions, and ask whether the problem becomes significantly easier, as
it is known to become in the two-party case. Our main result is a nearly-tight bound
of Θ̃(*n*^{1−1/k} + *k*) for both the shared blackboard model and the coordinator model. This shows that
in the shared blackboard model, as the number of players grows, having independent
inputs helps less and less; but in the coordinator model, when *k* is very large, having independent inputs makes the problem much easier. Both our
upper and our lower bounds use new ideas, as the original techniques developed for
the two-party case do not scale to more than two players.
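A minimal reference sketch of the task itself (the protocols in the paper are of course far more subtle): centrally, disjointness is a *k*-wise intersection test, and the trivial coordinator protocol, in which each player ships its *n*-bit characteristic vector, costs *kn* bits — the regime of the Ω(*kn*) correlated-inputs lower bound.

```python
def disjoint(sets):
    """Reference check: is the k-wise intersection of the inputs empty?"""
    inter = set(sets[0])
    for s in sets[1:]:
        inter &= set(s)
        if not inter:       # early exit once the intersection is empty
            return True
    return not inter

# Trivial coordinator protocol cost: k players * n bits each = k*n bits.
X = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]
print(disjoint(X))  # False: element 3 lies in every set
```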

We prove the existence of an oblivious routing scheme that is *poly*(log*n*)-competitive in terms of (*congestion* + *dilation*), thus resolving a well-known question in oblivious routing.

Concretely, consider an undirected network and a set of packets each with its own
source and destination. The objective is to choose a path for each packet, from its
source to its destination, so as to minimize (*congestion* + *dilation*), defined as follows: The dilation is the maximum path hop-length, and the congestion
is the maximum number of paths that include any single edge. The routing scheme obliviously
and randomly selects a path for each packet independent of (the existence of) the
other packets. Despite this obliviousness, the selected paths have (*congestion* + *dilation*) within a *poly*(log*n*) factor of the best possible value. More precisely, for any integer hop-bound *h*, this oblivious routing scheme selects paths of length at most *h* · *poly*(log*n*) and is *poly*(log*n*)-competitive in terms of *congestion* in comparison to the best possible *congestion* achievable via paths of length at most *h* hops. These paths can be sampled in polynomial time.

This result can be viewed as an analogue of the celebrated oblivious routing results
of Räcke [FOCS 2002, STOC 2008], which are *O*(log*n*)-competitive in terms of *congestion*, but are not competitive in terms of *dilation*.

Double Compare-And-Swap (DCAS) is a tremendously useful synchronization primitive,
which is also notoriously difficult to implement efficiently from objects that are
provided by hardware. We present a randomized implementation of DCAS with *O*(log*n*) expected amortized step complexity against the oblivious adversary, where *n* is the number of processes in the system. This is the only algorithm to date that
achieves sub-linear step complexity. We achieve that by first implementing two novel
algorithms as building blocks. One is a mechanism that allows processes to repeatedly
agree on a random value among multiple proposed ones, and the other one is a restricted
bipartite version of DCAS.

We study the error resilience of the message exchange task: Two parties, each holding a private input, want to exchange their inputs. However, the channel connecting them is governed by an adversary that may corrupt a constant fraction of the transmissions. What is the maximum fraction of corruptions that still allows the parties to exchange their inputs?

For the non-adaptive channel, where the parties must agree in advance on the order in which they communicate, the maximum error resilience was shown to be 1/4 (see Braverman and Rao, STOC 2011). The problem was also studied over the adaptive channel, where the order in which the parties communicate may not be predetermined (Ghaffari, Haeupler, and Sudan, STOC 2014; Efremenko, Kol, and Saxena, STOC 2020). These works show that the adaptive channel admits a much richer set of protocols but leave open the question of finding its maximum error resilience.

In this work, we show that the maximum error resilience of a protocol for message exchange over the adaptive channel is 5/16, thereby settling the above question. Our result requires improving both the known upper bounds and the known lower bounds for the problem.

The cup game on *n* cups is a multi-step game with two players, a filler and an emptier. At each step,
the filler distributes 1 unit of water among the cups, and then the emptier selects
a single cup to remove (up to) 1 unit of water from.
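A small simulation can make the definitions concrete. The sketch below uses an assumed random filler and a greedy deterministic emptier, so it only illustrates the game dynamics; the paper's guarantees concern randomized emptiers against worst-case fillers.

```python
import random

def cup_game(n, steps, rng):
    """One run of the single-processor cup game: the (random) filler
    splits 1 unit of water over two random cups, then a greedy emptier
    removes up to 1 unit from the fullest cup."""
    fill = [0.0] * n
    for _ in range(steps):
        i, j = rng.randrange(n), rng.randrange(n)
        fill[i] += 0.5
        fill[j] += 0.5
        fullest = max(range(n), key=fill.__getitem__)
        fill[fullest] = max(0.0, fill[fullest] - 1.0)
    tail = sum(1 for f in fill if f >= 2.0)  # cups with fill 2 or greater
    return tail, max(fill)                   # (tail size, backlog)

print(cup_game(16, 10_000, random.Random(0)))
```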

There are several objective functions that the emptier might wish to minimize. One
of the strongest guarantees would be to minimize *tail size*, which is defined to be the number of cups with fill 2 or greater. However, a simple lower-bound
construction shows that the optimal tail size for deterministic emptying algorithms
is Θ(*n*).

We present a simple randomized emptying algorithm that achieves tail size Õ(log*n*) with high probability in *n* for poly *n* steps. Moreover, we show that this is tight up to doubly logarithmic factors. We
also extend our results to the *multi-processor cup game*, achieving tail size Õ(log*n* + *p*) on *p* processors with high probability in *n*. We show that the dependence on *p* is near optimal for any emptying algorithm that achieves polynomial-bounded backlog.

A natural question is whether our results can be extended to give *unending guarantees*, which apply to arbitrarily long games. We give a lower bound construction showing
that no monotone memoryless emptying algorithm can achieve an unending guarantee on
either tail size or the related objective function of backlog. On the other hand,
we show that even a very small (i.e., 1 / poly *n*) amount of resource augmentation is sufficient to overcome this barrier.

In dynamic load balancing, we wish to distribute balls into bins in an environment where both balls and bins can be added and removed. We want to minimize the maximum load of any bin, but we also want to minimize the number of balls and bins that are affected when adding or removing a ball or a bin. We want a hashing-style solution where, given the ID of a ball, we can find its bin efficiently.

We are given a user-specified balancing parameter *c*=1+ε, where ε∈ (0,1). Let *n* and *m* be the current number of balls and bins. Then we want no bin with load above *C*=⌈ *c* *n*/*m*⌉, referred to as the *capacity* of the bins.

We present a scheme where we can locate a ball by checking 1+*O*(log1/ε) bins in expectation. When inserting or deleting a ball, we expect to move
*O*(1/ε) balls, and when inserting or deleting a bin, we expect to move *O*(*C*/ε) balls. Previous bounds were off by a factor 1/ε.

The above bounds are best possible when *C*=*O*(1) but for larger *C*, we can do much better: We define *f*=ε *C* when *C*≤ log1/ε, *f*=ε√*C*· √log(1/(ε√*C*)) when log1/ε≤ *C*<1/2ε^{2}, and *f*=1 when *C*≥ 1/2ε^{2}. We show that we expect to move *O*(1/*f*) balls when inserting or deleting a ball, and *O*(*C*/*f*) balls when inserting or deleting a bin. Moreover, when *C*≥ log1/ε, we can locate a ball by checking only *O*(1) bins in expectation.

For the bounds with larger *C*, we first have to resolve a much simpler probabilistic problem. Place *n* balls in *m* bins of capacity *C*, one ball at a time. Each ball picks a uniformly random non-full bin. We show that
in expectation and with high probability, the fraction of non-full bins is Θ(*f*). Then the expected number of bins that a new ball would have to visit to find one
that is not full is Θ(1/*f*). As it turns out, this is also the complexity of an insertion in our more complicated
scheme where both balls and bins can be added and removed.
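The simpler probabilistic process is easy to simulate directly; the sketch below just runs the process and reports the fraction of non-full bins (it illustrates the setup, rather than verifying the Θ(*f*) claim).

```python
import random

def nonfull_fraction(n, m, C, rng):
    """Throw n balls, one at a time, each into a uniformly random
    non-full bin among m bins of capacity C; return the fraction of
    bins that are still non-full at the end."""
    assert n <= m * C, "not enough capacity"
    load = [0] * m
    for _ in range(n):
        nonfull = [i for i in range(m) if load[i] < C]
        load[rng.choice(nonfull)] += 1
    return sum(1 for l in load if l < C) / m

print(nonfull_fraction(900, 100, 10, random.Random(0)))
```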

We present a quantum LDPC code family that has distance Ω(*N*^{3/5}/*polylog*(*N*)) and Θ(*N*^{3/5}) logical qubits, where *N* is the code length. This is the first quantum LDPC code construction that achieves
distance greater than *N*^{1/2} *polylog*(*N*). The construction is based on generalizing the homological product of codes to a
fiber bundle.

We prove that for every decision tree, the absolute values of the Fourier coefficients
of given order *t*≥1 sum to at most (*cd*/*t*)^{t/2}(1+log*n*)^{(t−1)/2}, where *n* is the number of variables, *d* is the tree depth, and *c*>0 is an absolute constant. This bound is essentially tight and settles a conjecture
due to Tal (arxiv 2019; FOCS 2020). The bounds prior to our work degraded rapidly
with *t*, becoming trivial already at *t*=√*d*.
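For tiny parameters, the quantity being bounded — the level-*t* sum of absolute Fourier coefficients of a decision tree — can be computed by brute force. The sketch below does so for an assumed depth-2 tree on 3 variables (query *x*_{0}; if 1 output *x*_{1}, else output *x*_{2}).

```python
import math
from itertools import combinations, product

def fourier_level_weights(f, n):
    """Brute-force Fourier analysis of f: {-1,1}^n -> {-1,1}.
    Returns, for each level t, the sum of |f-hat(S)| over all |S| = t."""
    pts = list(product((-1, 1), repeat=n))
    weights = []
    for t in range(n + 1):
        level = 0.0
        for S in combinations(range(n), t):
            # f-hat(S) = E_x[ f(x) * prod_{i in S} x_i ]
            coeff = sum(f(x) * math.prod(x[i] for i in S) for x in pts) / len(pts)
            level += abs(coeff)
        weights.append(level)
    return weights

# Depth-2 decision tree: f = x1/2 + x2/2 + x0*x1/2 - x0*x2/2.
tree = lambda x: x[1] if x[0] == 1 else x[2]
print(fourier_level_weights(tree, 3))  # → [0.0, 1.0, 1.0, 0.0]
```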

As an application, we obtain, for every integer *k*≥1, a partial Boolean function on *n* bits that has bounded-error quantum query complexity at most ⌈ *k*/2⌉ and randomized query complexity Ω(*n*^{1−1/k}). This separation of bounded-error quantum versus randomized query complexity is
best possible, by the results of Aaronson and Ambainis (STOC 2015). Prior to our work,
the best known separation was polynomially weaker: *O*(1) versus Ω(*n*^{2/3−є}) for any є>0 (Tal, FOCS 2020).

As another application, we obtain an essentially optimal separation of *O*(log*n*) versus Ω(*n*^{1−є}) for bounded-error quantum versus randomized communication complexity, for any є>0.
The best previous separation was polynomially weaker: *O*(log*n*) versus Ω(*n*^{2/3−є}) (implicit in Tal, FOCS 2020).

Aaronson and Ambainis (SICOMP ‘18) showed that any partial function on *N* bits that can be computed with an advantage δ over a random guess by making *q* quantum queries, can also be computed classically with an advantage δ/2 by a randomized
decision tree making *O*_{q}(*N*^{1−1/2q}δ^{−2}) queries. Moreover, they conjectured the *k*-Forrelation problem — a partial function that can be computed with *q* = ⌈ *k*/2 ⌉ quantum queries — to be a suitable candidate for exhibiting such an extremal
separation.

We prove their conjecture by showing a tight lower bound of Ω(*N*^{1−1/k}) for the randomized query complexity of *k*-Forrelation, where δ = 2^{−O(k)}. By standard amplification arguments, this gives an explicit partial function that
exhibits an *O*_{є}(1) vs Ω(*N*^{1−є}) separation between bounded-error quantum and randomized query complexities, where
є>0 can be made arbitrarily small. Our proof also gives the same bound for the closely
related but non-explicit *k*-Rorrelation function introduced by Tal (FOCS ‘20).

Our techniques rely on classical Gaussian tools, in particular, Gaussian interpolation
and Gaussian integration by parts, and in fact, give a more general statement. We
show that to prove lower bounds for *k*-Forrelation against a family of functions, it suffices to bound the ℓ_{1}-weight of the Fourier coefficients between levels *k* and (*k*−1)*k*. We also prove new interpolation and integration by parts identities that might be
of independent interest in the context of rounding high-dimensional Gaussian vectors.

In this work we introduce a new notion of expansion in higher dimensions that is stronger
than the well-studied notion of cosystolic expansion, termed *collective cosystolic expansion*.

We show that tensoring two cosystolic expanders yields a new cosystolic expander, assuming one of the complexes in the product is not merely a cosystolic expander but a collective cosystolic expander.

We then show that the well-known bounded-degree cosystolic expanders, the Ramanujan complexes, are in fact collective cosystolic expanders. This enables us to construct new bounded-degree cosystolic expanders by tensoring Ramanujan complexes.

Using our newly constructed bounded-degree cosystolic expanders we construct *explicit* quantum LDPC codes of distance √*n* log^{k} *n* for any *k*, improving a recent result of Evra et al. [FOCS, 2020], and setting a new record
for the distance of explicit quantum LDPC codes.

The work of Evra et al. [FOCS, 2020] took advantage of the high-dimensional expansion notion known as cosystolic expansion, which occurs in Ramanujan complexes. Our improvement is achieved by considering the tensor product of Ramanujan complexes and using their newly derived property, collective cosystolic expansion.

Based on the recent breakthrough of Huang (2019), we show that for any total Boolean function *f*:

deg(*f*) = *O*(adeg(*f*)^{2}): The degree of *f* is at most quadratic in the approximate degree of *f*. This is optimal, as witnessed by the OR function.

D(*f*) = *O*(Q(*f*)^{4}): The deterministic query complexity of *f* is at most quartic in its quantum query complexity. This matches the known separation (up to log factors) due to Ambainis, Balodis, Belovs, Lee, Santha, and Smotrovs (2017).

We apply these results to resolve the quantum analogue of the Aanderaa–Karp–Rosenberg
conjecture. We show that if *f* is a nontrivial monotone graph property of an *n*-vertex graph specified by its adjacency matrix, then Q(*f*)=Ω(*n*), which is also optimal.
We also show that the approximate degree of any read-once formula on *n* variables is Θ(√*n*).

A foundational result in the theory of quantum computation, known as the "principle of safe storage," shows that it is always possible to take a quantum circuit and produce an equivalent circuit that makes all measurements at the end of the computation. While this procedure is time efficient, meaning that it does not introduce a large overhead in the number of gates, it uses extra ancillary qubits, and so is not generally space efficient. It is quite natural to ask whether it is possible to eliminate intermediate measurements without increasing the number of ancillary qubits. We give an affirmative answer to this question by exhibiting a procedure to eliminate all intermediate measurements that is simultaneously space efficient and time efficient. In particular, this shows that the definition of a space-bounded quantum complexity class is robust to allowing or forbidding intermediate measurements. A key component of our approach, which may be of independent interest, involves showing that the well-conditioned versions of many standard linear-algebraic problems may be solved by a quantum computer in less space than seems possible by a classical computer.

We demonstrate the possibility of (sub)exponential quantum speedup via a quantum algorithm
that follows an adiabatic path of a gapped Hamiltonian with no sign problem. The Hamiltonian
that exhibits this speed-up comes from the adjacency matrix of an undirected graph
whose vertices are labeled by *n*-bit strings, and we can view the adiabatic evolution as an efficient *O*(*poly*(*n*))-time quantum algorithm for finding a specific “EXIT” vertex in the graph given
the “ENTRANCE” vertex. On the other hand we show that if the graph is given via an
adjacency-list oracle, there is no classical algorithm that finds the “EXIT” with
probability greater than exp(−*n*^{δ}) using at most exp(*n*^{δ}) queries for δ= 1/5 − *o*(1). Our construction of the graph is somewhat similar to the “welded-trees” construction
of Childs et al., but uses additional ideas of Hastings for achieving a spectral gap
and a short adiabatic path.

In the universal blind quantum computation problem, a client wants to make use of
a single quantum server to evaluate *C*|0⟩ where *C* is an arbitrary quantum circuit while keeping *C* secret. The client’s goal is to use as few resources as possible. This problem, first
raised by Broadbent, Fitzsimons and Kashefi [FOCS 2009], has become fundamental to
the study of quantum cryptography, not only because of its own importance, but also
because it provides a testbed for new techniques that can be later applied to related
problems (for example, quantum computation verification). Known protocols on this
problem are mainly either information-theoretically (IT) secure or based on trapdoor
assumptions (public key encryptions). In this paper we study how the availability
of symmetric-key primitives, modeled by a random oracle, changes the complexity of
universal blind quantum computation. We give a new universal blind quantum computation
protocol. Similar to previous works on IT-secure protocols (for example, Broadbent-Fitzsimons-Kashefi),
our protocol can be divided into two phases. In the first phase the client prepares
some quantum gadgets with relatively simple quantum gates and sends them to the server,
and in the second phase the client is entirely classical — it does not even need quantum
storage. Crucially, the protocol’s first phase is succinct, that is, its complexity
is independent of the circuit size. Given the security parameter κ, its complexity
is only a fixed polynomial of κ, and can be used to evaluate any circuit (or several
circuits) of size up to a subexponential of κ. In contrast, known schemes either require
the client to perform quantum computations that scale with the size of the circuit,
or require trapdoor assumptions.

Given two Hermitian matrices *Y* and Λ, the Harish-Chandra–Itzykson–Zuber (HCIZ) distribution is given by the density
*e*^{Tr(*U* Λ *U*^{*} *Y*)} with respect to the Haar measure on the unitary group. Random unitary matrices distributed
according to the HCIZ distribution are important in various settings in physics and
random matrix theory, but the problem of sampling efficiently from this distribution
has remained open. We present two algorithms to sample matrices from distributions
that are close to the HCIZ distribution. The first produces samples that are ξ-close
in total variation distance, and the number of arithmetic operations required depends
on *poly*(log1/ξ). The second produces samples that are ξ-close in infinity divergence, but
with a *poly*(1/ξ) dependence. Our results have the following applications: 1) an efficient algorithm
to sample from complex versions of matrix Langevin distributions studied in statistics,
2) an efficient algorithm to sample from continuous maximum entropy distributions
over unitary orbits, which in turn implies an efficient algorithm to sample a pure
quantum state from the entropy-maximizing ensemble representing a given density matrix,
and 3) an efficient algorithm for differentially private rank-*k* approximation that comes with improved utility bounds for *k*>1.

We provide more sample-efficient versions of some basic routines in quantum data analysis,
along with simpler proofs. In particular, we give a quantum “Threshold Search” algorithm
that requires only *O*((log^{2} *m*)/є^{2}) samples of a *d*-dimensional state ρ. That is, given observables 0 ≤ *A*_{1}, *A*_{2}, …, *A*_{m} ≤ 1 such that Tr(ρ *A*_{i}) ≥ 1/2 for at least one *i*, the algorithm finds *j* with Tr(ρ *A*_{j}) ≥ 1/2−є. As a consequence, we obtain a Shadow Tomography algorithm requiring only
*O*((log^{2} *m*)(log*d*)/є^{4}) samples, which simultaneously achieves the best known dependence on each parameter
*m*, *d*, є. This yields the same sample complexity for quantum Hypothesis Selection among
*m* states; we also give an alternative Hypothesis Selection method using *O*((log^{3} *m*)/є^{2}) samples.

We consider the problem of approximating maximum Nash social welfare (NSW) while allocating
a set of indivisible items to *n* agents. The NSW, defined as the weighted
geometric mean of the agents’ valuations, is a popular objective that provides a balanced tradeoff between
the often conflicting requirements of fairness and efficiency. For the symmetric additive case of the problem,
where agents have the same weight with additive valuations, the first constant-factor
approximation algorithm was obtained in 2015. Subsequent work has obtained constant-factor
approximation algorithms for the symmetric case under mild generalizations of additive,
and *O*(*n*)-approximation algorithms for subadditive valuations and for the asymmetric case.
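The objective itself is simple to state in code. The sketch below computes the (weighted) NSW of a valuation profile and illustrates how the geometric mean favors balanced allocations over ones with the same total value.

```python
import math

def nsw(valuations, weights=None):
    """Nash social welfare: the weighted geometric mean of the agents'
    valuations for their bundles (equal weights = symmetric case)."""
    if weights is None:
        weights = [1.0] * len(valuations)
    total = sum(weights)
    log_mean = sum(w * math.log(v) for w, v in zip(weights, valuations)) / total
    return math.exp(log_mean)

# Same total value, different balance: NSW prefers the balanced allocation.
print(nsw([4.0, 1.0]))   # geometric mean sqrt(4*1) = 2.0
print(nsw([2.5, 2.5]))   # 2.5, strictly higher NSW
```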

In this paper, we make significant progress towards both symmetric and asymmetric
NSW problems. We present the first constant-factor approximation algorithm for the
symmetric case under *Rado* valuations. Rado valuations form a general class of valuation functions that arise
from maximum cost independent matching problems, including as special cases assignment
(OXS) valuations and weighted matroid rank functions. Furthermore, our approach also
gives the first constant-factor approximation algorithm for the asymmetric case under
Rado valuations, provided that the maximum ratio between the weights is bounded by
a constant.

We consider (i) the problem of finding a (possibly mixed) Nash equilibrium in congestion
games, and (ii) the problem of finding an (exponential precision) fixed point of the
gradient descent dynamics of a smooth function *f*:[0,1]^{n} → ℝ. We prove that these problems are equivalent. Our result holds for various explicit
descriptions of *f*, ranging from (almost general) arithmetic circuits, to degree-5 polynomials. By a
very recent result of [Fearnley et al., STOC 2021], this implies that these problems
are *PPAD* ∩ *PLS*-complete. As a corollary, we also obtain the following equivalence of complexity
classes:

*CCLS* = *PPAD* ∩ *PLS*.
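To make the problem in (i) concrete, here is a toy singleton congestion game solved by best-response dynamics. On such instances Rosenthal's potential strictly decreases with every improving move, so the dynamics terminate in a pure Nash equilibrium; in general this local search can take exponentially many steps (the PLS aspect), and the paper's result concerns mixed equilibria.

```python
def br_dynamics(n_players, delay, start):
    """Best-response dynamics in a singleton congestion game: a player
    on resource r pays delay[r] * (number of players on r)."""
    choice = list(start)
    def cost(p, r):
        load = 1 + sum(1 for q in range(n_players) if q != p and choice[q] == r)
        return delay[r] * load
    improved = True
    while improved:
        improved = False
        for p in range(n_players):
            best = min(range(len(delay)), key=lambda r: cost(p, r))
            if cost(p, best) < cost(p, choice[p]):
                choice[p] = best
                improved = True
    return choice

# Two players, two identical roads: at equilibrium they split up.
print(br_dynamics(2, [1.0, 1.0], [0, 0]))  # → [1, 0]
```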

This paper considers prior-independent mechanism design, in which a single mechanism is designed to achieve approximately optimal performance on every prior distribution from a given class. Most results in this literature focus on mechanisms with truthtelling equilibria, a.k.a., truthful mechanisms. Feng and Hartline [FOCS 2018] introduce the revelation gap to quantify the loss of the restriction to truthful mechanisms. We solve a main open question left in Feng and Hartline [FOCS 2018]; namely, we identify a non-trivial revelation gap for revenue maximization.

Our analysis focuses on the canonical problem of selling a single item to a single agent with only access to a single sample from the agent's valuation distribution. We identify the sample-bid mechanism (a simple non-truthful mechanism) and upper-bound its prior-independent approximation ratio by 1.835 (resp. 1.296) for regular (resp. MHR) distributions. We further prove that no truthful mechanism can achieve prior-independent approximation ratio better than 1.957 (resp. 1.543) for regular (resp. MHR) distributions. Thus, a non-trivial revelation gap is shown as the sample-bid mechanism outperforms the optimal prior-independent truthful mechanism. On the hardness side, we prove that no (possibly non-truthful) mechanism can achieve prior-independent approximation ratio better than 1.073 even for uniform distributions.

A celebrated impossibility result by Myerson and Satterthwaite (1983) shows that any truthful mechanism for two-sided markets that maximizes social welfare must run a deficit, resulting in a necessity to relax welfare efficiency and the use of approximation mechanisms. Such mechanisms in general make extensive use of the Bayesian priors. In this work, we investigate a question of increasing theoretical and practical importance: how much prior information is required to design mechanisms with near-optimal approximations?

Our first contribution is a more general impossibility result stating that no meaningful approximation is possible without any prior information, expanding the famous impossibility result of Myerson and Satterthwaite.

Our second contribution is that one single sample (one number per item), arguably a minimum-possible amount of prior information, from each seller distribution is sufficient for a large class of two-sided markets. We prove matching upper and lower bounds on the best approximation that can be obtained with one single sample for subadditive buyers and additive sellers, regardless of computational considerations.

Our third contribution is the design of computationally efficient blackbox reductions that turn any one-sided mechanism into a two-sided mechanism with a small loss in the approximation, while using only one single sample from each seller. On the way, our blackbox-type mechanisms deliver several interesting positive results in their own right, often beating even the state of the art that uses full prior information.

Despite its important applications in Machine Learning, min-max optimization of objective
functions that are *nonconvex-nonconcave* remains elusive. Not only are there no known first-order methods converging to even
approximate local min-max equilibria (a.k.a. approximate saddle points), but the computational
complexity of identifying them is also poorly understood. In this paper, we provide
a characterization of the computational complexity as well as of the limitations of
first-order methods in this problem. Specifically, we show that in linearly constrained
min-max optimization problems with nonconvex-nonconcave objectives an approximate
local min-max equilibrium of large enough approximation is guaranteed to exist, but
computing such a point is PPAD-complete. The same is true of computing an approximate
fixed point of the (Projected) Gradient Descent/Ascent update dynamics, which is computationally
equivalent to computing approximate local min-max equilibria. An important byproduct
of our proof is to establish an unconditional hardness result in the Nemirovsky-Yudin
1983 oracle optimization model, where we are given oracle access to the values of
some function *f* : *P* → [−1, 1] and its gradient ∇ *f*, where *P* ⊆ [0, 1]^{d} is a known convex polytope. We show that any algorithm that uses such first-order
oracle access to *f* and finds an ε-approximate local min-max equilibrium needs to make a number of oracle
queries that is exponential in at least one of 1/ε, *L*, *G*, or *d*, where *L* and *G* are respectively the smoothness and Lipschitzness of *f*. This comes in sharp contrast to minimization problems, where finding approximate
local minima in the same setting can be done with Projected Gradient Descent using
*O*(*L*/ε) many queries. Our result is the first to show an exponential separation between
these two fundamental optimization problems in the oracle model.
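
To fix ideas, the (Projected) Gradient Descent/Ascent update dynamics referenced above take a simultaneous descent step in *x* and ascent step in *y*, then project back onto the polytope *P*. A minimal sketch over the box [0,1]^{d}, with a hypothetical toy objective that is not from the paper:

```python
import numpy as np

def project(z, lo=0.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^d (a simple polytope P)."""
    return np.clip(z, lo, hi)

def projected_gda(grad_x, grad_y, x, y, eta=0.05, steps=2000):
    """Simultaneous Projected Gradient Descent/Ascent:
    x takes a descent step on f, y takes an ascent step on f,
    and both are projected back onto the feasible box."""
    for _ in range(steps):
        gx, gy = grad_x(x, y), grad_y(x, y)
        x = project(x - eta * gx)
        y = project(y + eta * gy)
    return x, y

# Hypothetical toy objective (not from the paper): f(x, y) = (x - 0.5) * (y - 0.5)
grad_x = lambda x, y: (y - 0.5)   # df/dx
grad_y = lambda x, y: (x - 0.5)   # df/dy
x, y = projected_gda(grad_x, grad_y, 0.9, 0.1)
print(x, y)  # iterates stay in [0, 1]; convergence is not guaranteed in general
```

The PPAD-completeness result above concerns approximate fixed points of exactly this kind of update map; the sketch makes no convergence claim.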

We strengthen the results from a recent work by the second author, achieving bounds
on the weight distribution of binary linear codes that are successful under block-MAP
(as well as bit-MAP) decoding on the BEC. We conclude that a linear code that is successful
on the BEC can also decode over a range of binary memoryless symmetric (BMS) channels.
In particular, applying the result of Kudekar, Kumar, Mondelli, Pfister, Şaşoğlu and Urbanke
from STOC 2016, we prove that a Reed–Muller code of positive rate *R* decodes errors on BSC(*p*) with high probability if *p* < 1/2 − √(2^{−R}(1−2^{−R})).

The multiplicity Schwartz-Zippel lemma bounds the total multiplicity of zeroes of a multivariate polynomial on a product set. This lemma motivates the multiplicity codes of Kopparty, Saraf and Yekhanin [J. ACM, 2014], who showed how to use this lemma to construct high-rate locally-decodable codes. However, the algorithmic results about these codes crucially rely on the fact that the polynomials are evaluated on a vector space and not an arbitrary product set.

In this work, we show how to decode multivariate multiplicity codes of large multiplicities in polynomial time over finite product sets (over fields of large characteristic and zero characteristic). Previously such decoding algorithms were not known even for a positive fraction of errors. In contrast, our work goes all the way to the distance of the code and in particular exceeds both the unique decoding bound and the Johnson radius. For errors exceeding the Johnson radius, even combinatorial list-decodablity of these codes was not known.

Our algorithm is an application of the classical polynomial method directly to the multivariate setting. In particular, we do not rely on a reduction from the multivariate to the univariate case as is typical of many of the existing results on decoding codes based on multivariate polynomials. However, a vanilla application of the polynomial method in the multivariate setting does not yield a polynomial upper bound on the list size. We obtain a polynomial bound on the list size by taking an alternative view of multivariate multiplicity codes. In this view, we glue all the partial derivatives of the same order together using a fresh set of variables. We then apply the polynomial method by viewing this as a problem over the field of rational functions in these fresh variables.

We present an explicit and efficient algebraic construction of capacity-achieving
list decodable codes with both constant alphabet and constant list sizes. More specifically,
for any *R* ∈ (0,1) and є>0, we give an algebraic construction of an infinite family of error-correcting
codes of rate *R*, over an alphabet of size (1/є)^{O(1/є^{2})}, that can be list decoded from a (1−*R*−є)-fraction of errors with list size at most exp(*poly*(1/є)). Moreover, the codes can be encoded in time *poly*(1/є, *n*), the output list is contained in a linear subspace of dimension at most *poly*(1/є), and a basis for this subspace can be found in time *poly*(1/є, *n*). Thus, both encoding and list decoding can be performed in fully polynomial-time
*poly*(1/є, *n*), except for pruning the subspace and outputting the final list which takes time
exp(*poly*(1/є)) · *poly*(*n*). In contrast, prior explicit and efficient constructions of capacity-achieving list
decodable codes either required a much higher complexity in terms of 1/є (and were
additionally much less structured), or had super-constant alphabet or list sizes.

Our codes are quite natural and structured. Specifically, we use algebraic-geometric (AG) codes with evaluation points restricted to a subfield, and with the message space restricted to a (carefully chosen) linear subspace. Our main observation is that the output list of AG codes with subfield evaluation points is contained in an affine shift of the image of a block-triangular-Toeplitz (BTT) matrix, and that the list size can potentially be reduced to a constant by restricting the message space to a BTT evasive subspace, which is a large subspace that intersects the image of any BTT matrix in a constant number of points. We further show how to explicitly construct such BTT evasive subspaces, based on the explicit subspace designs of Guruswami and Kopparty (Combinatorica, 2016), and composition.

We consider codes for space bounded channels. This is a model for communication under
noise that was introduced by Guruswami and Smith (J. ACM 2016) and lies between the
Shannon (random) and Hamming (adversarial) models. In this model, a channel is a space
bounded procedure that reads the codeword in one pass, and modifies at most a *p* fraction of the bits of the codeword.

(1) Explicit uniquely decodable codes for space bounded channels: Our main result
is that for every 0 ≤ *p* < 1/4, there exists a constant δ>0 and a uniquely decodable code with rate 1−*H*(*p*) for channels with space *n*^{δ}. This code is explicit (meaning that encoding and decoding are in poly-time). This
improves upon previous explicit codes by Guruswami and Smith, and Kopparty, Shaltiel
and Silbak (FOCS 2019). Specifically, we obtain the same space and rate as earlier
works, even though prior work gave only list-decodable codes (rather than uniquely
decodable codes).

(2) Complete characterization of the capacity of space bounded channels: Together
with a result by Guruswami and Smith showing the impossibility of unique decoding
for *p* ≥ 1/4, our techniques also give a complete characterization of the capacity *R*(*p*) of space *n*^{1−o(1)} channels, specifically: for 0 ≤ *p* < 1/4, *R*(*p*) = 1−*H*(*p*), and for *p* ≥ 1/4, *R*(*p*) = 0. This capacity
is strictly larger than the capacity of Hamming channels for every 0 < *p* < 1/4, and matches the capacity of list decoding, and binary symmetric channels in
this range. Curiously, this shows that *R*(·) is not continuous at *p*=1/4.

Our results are incomparable to recent work on causal channels (these are stronger channels that read the codeword in one pass, but there is no space restriction). The best known codes for causal channels, due to Chen, Jaggi and Langberg (STOC 2015), are shown to exist by the probabilistic method, and no explicit codes are known.

A key new ingredient in our construction is a new notion of “evasiveness” of codes, which is concerned with whether a decoding algorithm rejects a word that is obtained when a channel induces few errors to a uniformly chosen (or pseudorandom) string. We use evasiveness (as well as several additional new ideas related to coding theory and pseudorandomness) to identify the “correct” message in the list obtained by previous list-decoding algorithms.

The Gilbert–Varshamov bound non-constructively establishes the existence of binary
codes of distance 1/2−є/2 and rate Ω(є^{2}). In a breakthrough result, Ta-Shma [STOC 2017] constructed the first *explicit* family of nearly optimal binary codes with distance 1/2−є/2 and rate Ω(є^{2+α}), where α → 0 as є → 0. Moreover, the codes in Ta-Shma’s construction are є-balanced,
where the distance between distinct codewords is not only bounded from below by 1/2−є/2,
but also from above by 1/2+є/2.

Polynomial time decoding algorithms for (a slight modification of) Ta-Shma’s codes
appeared in [FOCS 2020], and were based on the Sum-of-Squares (SoS) semidefinite programming
hierarchy. The running times for these algorithms were of the form *N*^{O_{α}(1)} for unique decoding, and *N*^{O_{є,α}(1)} for the setting of “gentle list decoding”, with large exponents of *N* even when α is a fixed constant. We derive new algorithms for both these tasks, running
in time Õ_{є}(*N*). Our algorithms also apply to the general setting of decoding direct-sum codes.

Our algorithms follow from new structural and algorithmic results for collections
of *k*-tuples (ordered hypergraphs) possessing a “structured expansion” property, which
we call *splittability*. This property was previously identified and used in the analysis of SoS-based decoding
and constraint satisfaction algorithms, and is also known to be satisfied by Ta-Shma’s
code construction. We obtain a new weak regularity decomposition for (possibly sparse)
splittable collections *W* ⊆ [*n*]^{k}, similar to the regularity decomposition for dense structures by Frieze and Kannan
[FOCS 1996]. These decompositions are also computable in near-linear time Õ(|*W* |), and form a key component of our algorithmic results.

We prove an optimal mixing time bound for the single-site update Markov chain known
as the Glauber dynamics or Gibbs sampling in a variety of settings. Our work presents
an improved version of the spectral independence approach of Anari et al. (2020) and
shows *O*(*n*log*n*) mixing time on any *n*-vertex graph of bounded degree when the maximum eigenvalue of an associated influence
matrix is bounded. As an application of our results, for the hard-core model on independent
sets weighted by a fugacity λ, we establish *O*(*n*log*n*) mixing time for the Glauber dynamics on any *n*-vertex graph of constant maximum degree Δ when λ<λ_{c}(Δ) where λ_{c}(Δ) is the critical point for the uniqueness/non-uniqueness phase transition on the
Δ-regular tree. More generally, for any antiferromagnetic 2-spin system we prove *O*(*n*log*n*) mixing time of the Glauber dynamics on any bounded degree graph in the corresponding
tree uniqueness region. Our results apply more broadly; for example, we also obtain
*O*(*n*log*n*) mixing for *q*-colorings of triangle-free graphs of maximum degree Δ when the number of colors satisfies
*q* > α Δ where α ≈ 1.763, and *O*(*m*log*n*) mixing for generating random matchings of any graph with bounded degree and *m* edges.
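
As a concrete illustration, the single-site Glauber dynamics for the hard-core model performs the following update at each step; a minimal sketch, where the graph and fugacity are toy choices and not from the paper:

```python
import random

def glauber_hardcore(adj, lam, steps, seed=0):
    """Single-site Glauber dynamics for the hard-core model at fugacity lam:
    pick a uniform random vertex v; if some neighbor is occupied, set v empty;
    otherwise occupy v with probability lam / (1 + lam)."""
    rng = random.Random(seed)
    occupied = {v: False for v in adj}      # start from the empty independent set
    for _ in range(steps):
        v = rng.choice(list(adj))
        if any(occupied[u] for u in adj[v]):
            occupied[v] = False             # v is blocked by an occupied neighbor
        else:
            occupied[v] = rng.random() < lam / (1 + lam)
    return occupied

# Toy run on a 5-cycle with fugacity 0.5
adj = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
conf = glauber_hardcore(adj, lam=0.5, steps=10_000)
print(conf)  # the configuration is always an independent set of the 5-cycle
```

The mixing-time results above bound how many such steps are needed before the occupied set is close to its stationary distribution.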

Our approach is based on two steps. First, we show that the approximate tensorization of entropy (i.e., factorizing entropy into single vertices), which is a key step for establishing the modified log-Sobolev inequality in many previous works, can be deduced from entropy factorization into blocks of fixed linear size. Second, we adapt the local-to-global scheme of Alev and Lau (2020) to establish such block factorization of entropy in a more general setting of pure weighted simplicial complexes satisfying local spectral expansion; this also substantially generalizes the result of Cryan et al. (2019).

We study the mixing time of the *Swendsen-Wang* dynamics for the ferromagnetic Ising and Potts models on the integer lattice ℤ^{d}. This dynamics is a widely used Markov chain that has largely resisted sharp analysis
because it is *non-local*, i.e., it changes the entire configuration in one step. We prove that, whenever *strong spatial mixing (SSM)* holds, the mixing time on any *n*-vertex cube in ℤ^{d} is *O*(log*n*), and we prove this is tight by establishing a matching lower bound. The previous
best known bound was *O*(*n*). SSM is a standard condition corresponding to exponential decay of correlations
with distance between spins on the lattice and is known to hold in *d*=2 dimensions throughout the high-temperature (single phase) region. Our result follows
from a *modified log-Sobolev inequality*, which expresses the fact that the dynamics contracts relative entropy at a constant
rate at each step. The proof of this fact utilizes a new factorization of the entropy
in the joint probability space over spins and edges that underlies the Swendsen-Wang
dynamics, which extends to general bipartite graphs of bounded degree. This factorization
leads to several additional results, including mixing time bounds for a number of
natural local and non-local Markov chains on the joint space, as well as for the standard
random-cluster dynamics.

We give a Markov chain based algorithm for sampling almost uniform solutions of constraint
satisfaction problems (CSPs). Assuming a canonical setting for the Lovász local lemma,
where each constraint is violated by a small number of forbidden local configurations,
our sampling algorithm is accurate in a local lemma regime, and the running time is
a fixed polynomial whose dependency on *n* is close to linear, where *n* is the number of variables. Our main approach is a new technique called *state compression*, which generalizes the “mark/unmark” paradigm of Moitra, and can give fast local-lemma-based
sampling algorithms. As concrete applications of our technique, we give the current
best almost-uniform samplers for hypergraph colorings and for CNF solutions.

We prove, under an assumption on the critical points of a real-valued function, that the symmetric Ising perceptron exhibits the `frozen 1-RSB' structure conjectured by Krauth and Mezard in the physics literature; that is, typical solutions of the model lie in clusters of vanishing entropy density. Moreover, we prove this in a very strong form conjectured by Huang, Wong, and Kabashima: a typical solution of the model is isolated with high probability and the Hamming distance to all other solutions is linear in the dimension. The frozen 1-RSB scenario is part of a recent and intriguing explanation of the performance of learning algorithms by Baldassi, Ingrosso, Lucibello, Saglietti, and Zecchina. We prove this structural result by comparing the symmetric Ising perceptron model to a planted model and proving a comparison result between the two models. Our main technical tool towards this comparison is an inductive argument for the concentration of the logarithm of number of solutions in the model.

We present a randomized algorithm which takes as input an undirected graph *G* on *n* vertices with maximum degree Δ, and a number of colors *k* ≥ (8/3 + *o*_{Δ}(1))Δ, and returns – in expected time Õ(*n*Δ^{2}log*k*) – a proper *k*-coloring of *G* distributed *perfectly* uniformly on the set of all proper *k*-colorings of *G*. Notably, our sampler breaks the barrier at *k* = 3Δ encountered in recent work of Bhandari and Chakraborty [STOC 2020]. We also
discuss how our methods may be modified to relax the restriction on *k* to *k* ≥ (8/3 − є_{0})Δ for an absolute constant є_{0} > 0.

As in the work of Bhandari and Chakraborty, and the pioneering work of Huber [STOC 1998], our sampler is based on Coupling from the Past [Propp & Wilson, Random Struct. Algorithms, 1995] and the bounding chain method [Huber, STOC 1998; Häggström & Nelander, Scand. J. Statist., 1999]. Our innovations include a novel bounding chain routine inspired by Jerrum’s analysis of the Glauber dynamics [Random Struct. Algorithms, 1995], as well as a preconditioning routine for bounding chains which uses the algorithmic Lovász Local Lemma [Moser & Tardos, J. ACM, 2010].

We consider the 0-Extension problem, where we are given an undirected graph *G*=(*V*,*E*) equipped with non-negative edge weights *w*:*E*→ ℝ^{+}, a collection *T*={ *t*_{1},…,*t*_{k}}⊆ *V* of *k* special vertices called terminals, and a semi-metric *D* over *T*. The goal is to assign every non-terminal vertex to a terminal while minimizing the
sum over all edges of the weight of the edge multiplied by the distance in *D* between the terminals to which the endpoints of the edge are assigned. 0-Extension
admits two known algorithms, achieving approximations of *O*(log*k*) [Călinescu-Karloff-Rabani *SICOMP* ’05] and *O*(log*k*/loglog*k*) [Fakcharoenphol-Harrelson-Rao-Talwar *SODA* ’03]. Both known algorithms are based on rounding a natural linear programming relaxation
called the metric relaxation, in which *D* is extended from *T* to all of *V*. The current best known integrality gap for the metric relaxation is Ω(√(log*k*)). In this work we present an improved integrality gap of Ω(log^{2/3}*k*) for the metric relaxation. Our construction is based on the randomized extension
of one graph by another, a notion that captures lifts of graphs as a special case
and might be of independent interest. Inspired by algebraic topology, our analysis
of the gap instance is based on proving no continuous section (in the topological
sense) exists in the randomized extension.

A seminal result of Håstad (2001) shows that it is NP-hard to find an assignment that
satisfies 1/|*G*|+ε fraction of the constraints of a given *k*-LIN instance over an abelian group, even if there is an assignment that satisfies
(1−ε) fraction of the constraints, for any constant ε>0. Engebretsen, Holmerin and
Russell (2004) later showed that the same hardness result holds for *k*-LIN instances over any finite non-abelian group.

Unlike the abelian case, where we can efficiently find a solution if the instance is satisfiable, in the non-abelian case, it is NP-complete to decide if a given system of linear equations is satisfiable or not, as shown by Goldmann and Russell (1999).

Surprisingly, for certain non-abelian groups *G*, given a satisfiable *k*-LIN instance over *G*, one can in fact do better than just outputting a random assignment using a simple
but clever algorithm. The approximation factor achieved by this algorithm varies with
the underlying group. In this paper, we show that this algorithm is *optimal* by proving a tight hardness of approximation for satisfiable *k*-LIN instances over *any* non-abelian group *G*, assuming *P*≠ *NP*. As a corollary, we also get 3-query probabilistically checkable proofs with perfect
completeness over large alphabets with improved soundness.

Our proof crucially uses the *quasi-random* properties of the non-abelian groups defined by Gowers (2008).

We give an algorithm for solving unique games (UG) instances whenever low-degree sum-of-squares
proofs certify good bounds on the small-set-expansion of the underlying constraint
graph via a hypercontractive inequality. Our algorithm is in fact more versatile,
and succeeds even when the constraint graph is not a small-set expander as long as
the structure of non-expanding small sets is (informally speaking) “characterized”
by a low-degree sum-of-squares proof. Our results are obtained by rounding *low-entropy* solutions — measured via a new global potential function — to sum-of-squares (SoS)
semidefinite programs. This technique adds to the (currently short) list of general
tools for analyzing SoS relaxations for *worst-case* optimization problems.

As corollaries, we obtain the first polynomial-time algorithms for solving any UG
instance where the constraint graph is either the *noisy hypercube*, the *short code* or the *Johnson* graph. The prior best algorithm for such instances was the eigenvalue enumeration
algorithm of Arora, Barak, and Steurer (2010) which requires quasi-polynomial time
for the noisy hypercube and nearly-exponential time for the short code and Johnson
graphs. All of our results achieve an approximation of 1−є vs δ for UG instances,
where є>0 and δ > 0 depend on the expansion parameters of the graph but are independent
of the alphabet size.

In this work we ask the following basic question: assume the vertices of an expander
graph are labelled by 0,1. What “test” functions *f* : { 0,1}^{t} → {0,1} cannot distinguish *t* independent samples from those obtained by a random walk? The expander hitting property
due to Ajtai, Komlós and Szemerédi (STOC 1987) is captured by the *AND* test function, whereas the fundamental expander Chernoff bound due to Gillman (SICOMP
1998) and Healy (Computational Complexity 2008) is about test functions indicating whether
the weight is close to the mean. In fact, it is known that all threshold functions
are fooled by a random walk (Kipnis and Varadhan, Communications in Mathematical Physics
1986). Recently, it was shown that even the highly sensitive *PARITY* function is fooled by a random walk (Ta-Shma, STOC 2017).

We focus on balanced labels. Our first main result is proving that *all* symmetric functions are fooled by a random walk. Put differently, we prove a central
limit theorem (CLT) for expander random walks *with respect to the total variation distance*, significantly strengthening the classic CLT for Markov Chains that is established
with respect to the Kolmogorov distance (Kipnis and Varadhan, Communications in Mathematical
Physics 1986). Our approach significantly deviates from prior works. We first study
how well a Fourier character χ_{S} is fooled by a random walk as a function of *S*. Then, given a test function *f*, we expand *f* in the Fourier basis and combine the above with known results on the Fourier spectrum
of *f*.

We also proceed further and consider general test functions - not necessarily symmetric.
As our approach is Fourier analytic, it is general enough to analyze such versatile
test functions. For our second result, we prove that random walks on sufficiently
good expander graphs fool test functions computed by *AC*^{0} circuits, read-once branching programs, and functions with bounded query complexity.

We prove Tomaszewski’s conjecture (1986): Let *f*:{−1,1}^{n} → ℝ be of the form *f*(*x*)= ∑_{i=1}^{n} *a*_{i} *x*_{i}. Then Pr[|*f*(*x*)| ≤ √*Var*[*f*]] ≥ 1/2. Our main novel tools are local concentration inequalities and an improved
Berry-Esseen inequality for first-degree functions on the discrete cube. These tools
are of independent interest, and may be useful in the study of linear threshold functions
and of low degree Boolean functions.
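
Since the probability in question is over the finite cube {−1,1}^n, the inequality can be verified exactly for any small coefficient vector by brute force; a minimal sketch:

```python
from itertools import product
from math import sqrt

def tomaszewski_probability(a):
    """Exact Pr[|f(x)| <= sqrt(Var[f])] for f(x) = sum_i a_i * x_i,
    with x uniform in {-1, 1}^n and Var[f] = sum_i a_i^2."""
    sigma = sqrt(sum(ai * ai for ai in a))
    hits = sum(1 for x in product((-1, 1), repeat=len(a))
               if abs(sum(ai * xi for ai, xi in zip(a, x))) <= sigma)
    return hits / 2 ** len(a)

# The theorem asserts the value is at least 1/2 for every coefficient vector.
print(tomaszewski_probability([1] * 9))    # 420/512 = 0.8203125
print(tomaszewski_probability([1, 2, 2]))  # 6/8 = 0.75
```

The two examples are chosen so that √(Var[f]) is an exact integer, keeping the comparison free of floating-point issues.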

We present an *O*^{∗}(2^{0.5n}) time and *O*^{∗}(2^{0.249999n}) space randomized algorithm for solving worst-case Subset Sum instances with *n* integers. This is the first improvement over the long-standing *O*^{∗}(2^{n/2}) time and *O*^{∗}(2^{n/4}) space algorithm due to Schroeppel and Shamir (FOCS 1979).
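
For context, the classical Horowitz–Sahni meet-in-the-middle baseline achieves *O*^{∗}(2^{n/2}) time but also *O*^{∗}(2^{n/2}) space; Schroeppel and Shamir attain the same time with *O*^{∗}(2^{n/4}) space via priority queues. A minimal sketch of the baseline:

```python
from bisect import bisect_left

def subset_sum_mitm(nums, target):
    """Meet-in-the-middle Subset Sum: enumerate subset sums of each half,
    sort one side, and look up complements; O*(2^{n/2}) time and space."""
    half = len(nums) // 2

    def all_sums(arr):
        sums = [0]
        for x in arr:                       # double the list with each element
            sums += [s + x for s in sums]
        return sums

    left = all_sums(nums[:half])
    right = sorted(all_sums(nums[half:]))
    for s in left:
        i = bisect_left(right, target - s)
        if i < len(right) and right[i] == target - s:
            return True
    return False

print(subset_sum_mitm([3, 34, 4, 12, 5, 2], 9))   # True  (4 + 5)
print(subset_sum_mitm([3, 34, 4, 12, 5, 2], 61))  # False (all elements sum to 60)
```

The paper's improvement keeps the *O*^{∗}(2^{0.5n}) time of this splitting idea while reducing the space below *O*^{∗}(2^{n/4}).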

We breach this gap in two steps: (1) We present a space efficient reduction to the
Orthogonal Vectors Problem (OV), one of the most central problems in Fine-Grained Complexity.
The reduction is established via an intricate combination of the method of Schroeppel
and Shamir, and the representation technique introduced by Howgrave-Graham and Joux
(EUROCRYPT 2010) for designing Subset Sum algorithms for the average case regime.
(2) We provide an algorithm for OV that detects an orthogonal pair among *N* given vectors in {0,1}^{d} with support size *d*/4 in time Õ(*N*· 2^{d}/\binom{d}{d/4}). Our algorithm for OV is based on and refines the representative families framework
developed by Fomin, Lokshtanov, Panolan and Saurabh (J. ACM 2016).

Our reduction uncovers a curious tight relation between Subset Sum and OV, because any improvement of our algorithm for OV would imply an improvement over the runtime of Schroeppel and Shamir, which is also a long standing open problem.

We prove several tight results on the fine-grained complexity of approximating the diameter of a graph. First, we prove that, for any ε>0, assuming the Strong Exponential Time Hypothesis (SETH), there are no near-linear time (2−ε)-approximation algorithms for the Diameter of a sparse directed graph, even in the unweighted case. This result shows that a simple near-linear time 2-approximation algorithm for Diameter is optimal under SETH, answering a question from a survey of Rubinstein and Vassilevska-Williams (SIGACT ’19) for the case of directed graphs.

In the same survey, Rubinstein and Vassilevska-Williams also asked if it is possible
to show that there are no 2−ε approximation algorithms for Diameter in a directed
graph in *O*(*n*^{1.499}) time. We show that, assuming a hypothesis called NSETH, one cannot use a deterministic
SETH-based reduction to rule out the existence of such algorithms.

Extending the techniques in these two results, we characterize whether a 2−ε approximation
algorithm running in time *O*(*n*^{1+δ}) for the Diameter of a sparse directed unweighted graph can be ruled out by a deterministic
SETH-based reduction for every δ∈(0,1) and essentially every ε∈(0,1), assuming NSETH.
This settles the SETH-hardness of approximating the diameter of sparse directed unweighted
graphs for deterministic reductions, up to NSETH. We make the same characterization
for randomized SETH-based reductions, assuming another hypothesis called NUNSETH.

We prove additional hardness and non-reducibility results for undirected graphs.

Among the most fundamental graph parameters is the Diameter, the largest distance
between any pair of vertices in a graph. Computing the Diameter of a graph with *m* edges requires *m*^{2−o(1)} time under the Strong Exponential Time Hypothesis (SETH), which can be prohibitive
for very large graphs, so efficient *approximation* algorithms for Diameter are desired.

There is a folklore algorithm that gives a 2-approximation for Diameter in Õ(*m*) time (where Õ notation suppresses logarithmic factors). Additionally, a line of
work [SODA’96, STOC’13, SODA’14] concludes with a 3/2-approximation algorithm for
Diameter in weighted directed graphs that runs in Õ(*m*^{3/2}) time. For directed graphs, these are the only known approximation algorithms for
Diameter.
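
The folklore 2-approximation amounts to a single BFS: the eccentricity of any vertex is at least half the diameter by the triangle inequality. A minimal sketch for the undirected unweighted case (the directed variant additionally runs BFS along reversed edges):

```python
from collections import deque

def eccentricity(adj, src):
    """Largest BFS distance from src to any vertex (graph assumed connected)."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return max(dist.values())

def diameter_2approx(adj):
    """Folklore Õ(m) 2-approximation: one BFS from an arbitrary vertex v.
    The triangle inequality gives ecc(v) <= Diameter <= 2 * ecc(v)."""
    v = next(iter(adj))
    return eccentricity(adj, v)

path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(diameter_2approx(path))  # 4 (here the estimate happens to be exact)
```

The lower-bound results above show that, under SETH, no near-linear time algorithm for directed graphs can beat the factor 2 achieved by this one-BFS estimate.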

The 3/2-approximation algorithm is known to be tight under SETH: Roditty and Vassilevska
W. [STOC’13] proved that under SETH any 3/2−ε approximation algorithm for Diameter
in undirected unweighted graphs requires *m*^{2−o(1)} time, and then Backurs, Roditty, Segal, Vassilevska W., and Wein [STOC’18] and the
follow-up work of Li proved that under SETH any 5/3−ε approximation algorithm for
Diameter in undirected unweighted graphs requires *m*^{3/2−o(1)} time.

Whether or not the folklore 2-approximation algorithm is tight, however, is unknown,
and has been explicitly posed as an open problem in numerous papers. Towards this
question, Bonnet recently proved that under SETH, any 7/4−ε approximation requires
*m*^{4/3−o(1)} time, but only for directed weighted graphs.

We completely resolve this question for directed graphs by proving that the folklore
2-approximation algorithm is conditionally optimal. In doing so, we obtain a series
of conditional lower bounds that together with prior work, give a complete time-accuracy
trade-off that is tight with the three known algorithms for directed graphs. Specifically,
we prove that under SETH for any δ>0, a ((2*k*−1)/*k*−δ)-approximation algorithm for Diameter on directed unweighted graphs requires *m*^{k/(k−1)−o(1)} time.

Computing the convolution *A* ⋆ *B* of two length-*n* vectors *A*,*B* is a ubiquitous computational primitive, with applications in a variety of disciplines.
Within theoretical computer science, applications range from string problems to Knapsack-type
problems, and from 3SUM to All-Pairs Shortest Paths. These applications often come
in the form of *nonnegative* convolution, where the entries of *A*,*B* are nonnegative integers. The classical algorithm to compute *A*⋆ *B* uses the Fast Fourier Transform (FFT) and runs in time *O*(*n* log*n*).
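
The classical dense algorithm that the paper's reduction targets fits in a few lines; a sketch using NumPy's FFT, where the final rounding is exact for moderate nonnegative integer entries:

```python
import numpy as np

def convolve_fft(A, B):
    """Dense nonnegative convolution via FFT in O(n log n) time.
    Rounding at the end is exact for moderate nonnegative integer entries."""
    n = len(A) + len(B) - 1
    m = 1 << max(1, (n - 1).bit_length())   # pad to a power of two >= n
    C = np.fft.irfft(np.fft.rfft(A, m) * np.fft.rfft(B, m), m)[:n]
    return np.rint(C).astype(np.int64)

print(convolve_fft([1, 0, 2], [3, 1]))  # [3 1 6 2], i.e. (1 + 2x^2)(3 + x)
```

In the paper's notation this is a *D*(*n*) = *O*(*n* log *n*) dense subroutine; the reduction shows the sparse problem costs only *S*(*k*) = *O*(*D*(*k*) + *k*(loglog *k*)^{2}) on top of it.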

However, in many cases *A* and *B* might satisfy sparsity conditions, and hence one could hope for significant gains
compared to the standard FFT algorithm. The ideal goal would be an *O*(*k* log*k*)-time algorithm, where *k* is the number of non-zero elements in the output, i.e., the size of the support of
*A* ⋆ *B*. This problem is referred to as *sparse* nonnegative convolution, and has received a considerable amount of attention in the
literature; the fastest algorithms to date run in time *O*(*k* log^{2} *n*).

The main result of this paper is the first *O*(*k* log*k*)-time algorithm for sparse nonnegative convolution. Our algorithm is randomized and
assumes that the length *n* and the largest entry of *A* and *B* are subexponential in *k*. Surprisingly, we can phrase our algorithm as a reduction from the sparse case to
the dense case of nonnegative convolution, showing that, under some mild assumptions,
sparse nonnegative convolution is equivalent to dense nonnegative convolution for
constant-error randomized algorithms. Specifically, if *D*(*n*) is the time to convolve two nonnegative length-*n* vectors with success probability 2/3, and *S*(*k*) is the time to convolve two nonnegative vectors with output size *k* with success probability 2/3, then *S*(*k*) = *O*(*D*(*k*) + *k* (loglog*k*)^{2}).

Our approach uses a variety of new techniques in combination with some old machinery from linear sketching and structured linear algebra, as well as new insights on linear hashing, the most classical hash function.

Every undirected graph *G* has a (weighted) *cut-equivalent tree* *T*, commonly named after Gomory and Hu who discovered it in 1961. Both *T* and *G* have the same node set, and for every node pair *s*,*t*, the minimum (*s*,*t*)-cut in *T* is also an exact minimum (*s*,*t*)-cut in *G*.

We give the first subcubic-time algorithm that constructs such a tree for a simple
graph *G* (unweighted with no parallel edges). Its time complexity is Õ(*n*^{2.5}), for *n*=|*V*(*G*)|; previously, only Õ(*n*^{3}) was known, except for restricted cases like sparse graphs. Consequently, we obtain
the first algorithm for All-Pairs Max-Flow in simple graphs that breaks the cubic-time
barrier.

Gomory and Hu compute this tree using *n*−1 queries to (single-pair) Max-Flow; the new algorithm can be viewed as a fine-grained
reduction to Õ(√*n*) Max-Flow computations on *n*-node graphs.
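
The "*n*−1 Max-Flow queries" pattern can be illustrated with Gusfield's simplification of the Gomory–Hu procedure (a textbook variant that avoids graph contractions, not the paper's subcubic algorithm); a minimal sketch with a small Edmonds–Karp max-flow as the query oracle:

```python
from collections import deque

def min_cut(cap, s, t):
    """Edmonds-Karp on a dense capacity matrix; returns
    (max-flow value, source side of a minimum s-t cut)."""
    n = len(cap)
    flow = [[0] * n for _ in range(n)]
    total = 0
    while True:
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q:
            u = q.popleft()
            for v in range(n):
                if parent[v] == -1 and cap[u][v] - flow[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:                 # no augmenting path: done
            return total, {v for v in range(n) if parent[v] != -1}
        b, v = float('inf'), t              # bottleneck along the path
        while v != s:
            u = parent[v]
            b = min(b, cap[u][v] - flow[u][v])
            v = u
        v = t
        while v != s:                       # push b units of flow
            u = parent[v]
            flow[u][v] += b
            flow[v][u] -= b
            v = u
        total += b

def gomory_hu(cap):
    """Gusfield's variant: n-1 min-cut queries produce a flow-equivalent tree,
    given as parent pointers (rooted at vertex 0) and edge weights."""
    n = len(cap)
    parent, weight = [0] * n, [0] * n
    for i in range(1, n):
        weight[i], side = min_cut(cap, i, parent[i])
        for j in range(i + 1, n):
            if j in side and parent[j] == parent[i]:
                parent[j] = i               # j hangs below i in the tree
    return parent, weight

cap = [[0, 3, 1, 0],
       [3, 0, 2, 4],
       [1, 2, 0, 1],
       [0, 4, 1, 0]]
print(gomory_hu(cap))  # tree edge (v, parent[v]) has min-cut value weight[v]
```

For every pair (*s*,*t*), the minimum edge weight on the *s*–*t* path of the returned tree equals the (*s*,*t*) min-cut value in the input graph.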

The Gomory-Hu tree or cut tree (Gomory and Hu, 1961) is a classic data structure for
reporting *s*−*t* mincuts (and by duality, the values of *s*−*t* maxflows) for all pairs of vertices *s* and *t* in an undirected graph. Gomory and Hu showed that it can be computed using *n*−1 exact maxflow computations. Surprisingly, this remains the best algorithm for Gomory-Hu
trees more than 50 years later, even for approximate mincuts. In this paper, we break
this longstanding barrier and give an algorithm for computing a (1+є)-approximate
Gomory-Hu tree using log(*n*) maxflow computations. Specifically, we obtain the runtime bounds we describe below.

We obtain a randomized (Monte Carlo) algorithm for undirected, weighted graphs that
runs in Õ(*m* + *n*^{3/2}) time and returns a (1+є)-approximate Gomory-Hu tree whp. Previously, the
best running time known was Õ(*n*^{5/2}), which is obtained by running Gomory and Hu’s original algorithm on a cut sparsifier
of the graph.

Next, we obtain a randomized (Monte Carlo) algorithm for undirected, unweighted graphs
that runs in *m*^{4/3+o(1)} time and returns a (1+є)-approximate Gomory-Hu tree whp. This improves
on our first result for sparse graphs, namely when *m* = *o*(*n*^{9/8}). Previously, the best running time known for unweighted graphs was Õ(*mn*) for an exact Gomory-Hu tree (Bhalgat *et al.*, STOC 2007); no better result was known if approximations are allowed.

As a consequence of our Gomory-Hu tree algorithms, we also solve the (1+є)-approximate
all pairs mincut and single source mincut problems in the same time bounds. (These
problems are simpler in that the goal is to only return the *s*−*t* mincut values, and not the mincuts.) This improves on the recent algorithm for these
problems in Õ(*n*^{2}) time due to Abboud et al. (FOCS 2020).

For every graph *G*, let ω(*G*) be the largest size of a complete subgraph in *G*. This paper presents a simple algorithm which, on input a graph *G*, a positive integer *k* and a small constant є>0, outputs a graph *G*′ and an integer *k*′ in 2^{Θ(k^{5})}· |*G*|^{O(1)}-time such that (1) *k*′≤ 2^{Θ(k^{5})}, (2) if ω(*G*)≥ *k*, then ω(*G*′)≥ *k*′, (3) if ω(*G*)<*k*, then ω(*G*′)< (1−є)*k*′. This implies that no *f*(*k*)· |*G*|^{O(1)}-time algorithm can distinguish between the cases ω(*G*)≥ *k* and ω(*G*)<*k*/*c* for any constant *c*≥ 1 and computable function *f*, unless *FPT*= *W*[1].

We study the parameterized complexity of various classic vertex-deletion problems
such as Odd cycle transversal, Vertex planarization, and Chordal vertex deletion under
hybrid parameterizations. Existing FPT algorithms for these problems either focus
on the parameterization by solution size, detecting solutions of size *k* in time *f*(*k*) · *n*^{O(1)}, or width parameterizations, finding arbitrarily large optimal solutions in time
*f*(*w*) · *n*^{O(1)} for some width measure *w* like treewidth. We unify these lines of research by presenting FPT algorithms for
parameterizations that can simultaneously be arbitrarily much smaller than the solution
size and the treewidth.

The first class of parameterizations is based on the notion of *elimination distance* of the input graph to the target graph class, which intuitively measures the number
of rounds needed to obtain a graph in the target class by removing one vertex from each connected component
in each round. The second class of parameterizations consists of a relaxation of the
notion of treewidth, allowing arbitrarily large bags that induce subgraphs belonging
to the target class of the deletion problem as long as these subgraphs have small
neighborhoods. Both kinds of parameterizations have been introduced recently and have
already spawned several independent results.
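
The recursive definition of elimination distance is short enough to state as (exponential-time) code. The sketch below, with bipartite graphs as the target class to match the Odd cycle transversal setting, is purely illustrative; all names are ours. Note that for two disjoint triangles the elimination distance is 1 while the minimum odd cycle transversal has size 2, illustrating how this parameter can be smaller than the solution size.

```python
def _adj(u, edges):
    """Neighbors of u, with edges given as frozensets of endpoint pairs."""
    return {v for e in edges if u in e for v in e if v != u}

def is_bipartite(verts, edges):
    """2-colorability check by DFS."""
    color = {}
    for s in verts:
        if s in color:
            continue
        color[s] = 0
        stack = [s]
        while stack:
            u = stack.pop()
            for v in _adj(u, edges):
                if v not in color:
                    color[v] = 1 - color[u]
                    stack.append(v)
                elif color[v] == color[u]:
                    return False
    return True

def components(verts, edges):
    """Connected components, each returned as a frozenset of vertices."""
    rem, comps = set(verts), []
    while rem:
        s = rem.pop()
        comp, stack = {s}, [s]
        while stack:
            u = stack.pop()
            for v in _adj(u, edges):
                if v in rem:
                    rem.remove(v)
                    comp.add(v)
                    stack.append(v)
        comps.append(frozenset(comp))
    return comps

def elim_dist(verts, edges):
    """Elimination distance to the class of bipartite graphs:
    0 if the graph is bipartite; the max over connected components if
    disconnected; otherwise 1 + the best single-vertex deletion."""
    verts = frozenset(verts)
    edges = frozenset(e for e in edges if e <= verts)
    if is_bipartite(verts, edges):
        return 0
    comps = components(verts, edges)
    if len(comps) > 1:
        return max(elim_dist(c, edges) for c in comps)
    return 1 + min(elim_dist(verts - {v}, edges) for v in verts)
```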

Our contribution is twofold. First, we present a framework for computing approximately
optimal decompositions related to these graph measures. Namely, if the cost of an
optimal decomposition is *k*, we show how to find a decomposition of cost *k*^{O(1)} in time *f*(*k*) · *n*^{O(1)}. This is applicable to any class for which we can solve the so-called separation
problem. Secondly, we exploit the constructed decompositions for solving vertex-deletion
problems by extending ideas from algorithms using iterative compression and the finite
state property. For the three mentioned vertex-deletion problems, and all problems
which can be formulated as hitting a finite set of connected forbidden (a) minors
or (b) (induced) subgraphs, we obtain FPT algorithms with respect to both studied
parameterizations. For example, we present an algorithm running in time *n*^{O(1)} + 2^{k^{O(1)}}·(*n*+*m*) and polynomial space for Odd cycle transversal parameterized by the elimination
distance *k* to the class of bipartite graphs.

Given an integer *n* ≥ 1 and an irreducible character χ_{λ} of *S*_{n} for some partition λ of *n*, the immanant *imm*_{λ}:ℂ^{n× n}→ℂ maps matrices *A*∈ℂ^{n× n} to *imm*_{λ}(*A*)=∑_{π∈ Sn}χ_{λ}(π)∏_{i=1}^{n}*A*_{i,π(i)}. Important special cases include the *determinant* and *permanent*, which are obtained from the *sign* and *trivial* character, respectively.
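
The definition translates directly into (factorial-time) code; the special cases below recover the determinant and permanent by instantiating χ with the sign and trivial characters. Evaluating a general χ_{λ} would require a character algorithm such as the Murnaghan–Nakayama rule, which we omit; this brute-force sketch is ours and only illustrates the definition.

```python
from itertools import permutations

def sign(pi):
    """Sign of a permutation in one-line notation, via its cycle type."""
    seen, s = [False] * len(pi), 1
    for i in range(len(pi)):
        if not seen[i]:
            j, length = i, 0
            while not seen[j]:
                seen[j] = True
                j = pi[j]
                length += 1
            if length % 2 == 0:   # each even-length cycle flips the sign
                s = -s
    return s

def immanant(A, chi):
    """imm_lambda(A) = sum over pi in S_n of chi(pi) * prod_i A[i][pi(i)]."""
    n = len(A)
    total = 0
    for pi in permutations(range(n)):
        prod = 1
        for i in range(n):
            prod *= A[i][pi[i]]
        total += chi(pi) * prod
    return total

def determinant(A):               # chi = sign character
    return immanant(A, sign)

def permanent(A):                 # chi = trivial character
    return immanant(A, lambda pi: 1)
```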

It is known that immanants can be evaluated in polynomial time for characters that
are “close” to the sign character: Given a partition λ of *n* with *s* parts, let *b*(λ):=*n*−*s* count the boxes to the right of the first column in the Young diagram of λ. For a
family of partitions Λ, let *b*(Λ) := max_{λ∈Λ}*b*(λ) and write *Imm*(Λ) for the problem of evaluating *imm*_{λ}(*A*) on input *A* and λ∈Λ. On the positive side, if *b*(Λ)<∞, then *Imm*(Λ) is known to be polynomial-time computable. This subsumes the case of the determinant.
Conversely, if *b*(Λ)=∞, then previously known hardness results suggest that *Imm*(Λ) cannot be solved in polynomial time. However, these results only address certain
restricted classes of families Λ.

In this paper, we show that the assumption *FPT*≠ #*W*[1] from parameterized complexity rules out polynomial-time algorithms for *Imm*(Λ) for any computationally reasonable family of partitions Λ with *b*(Λ)=∞. We give an analogous result in algebraic complexity under the assumption *VFPT*≠ *VW*[1]. Furthermore, if *b*(λ) grows polynomially along Λ, we show that *Imm*(Λ) is hard for #*P* and *VNP*. This concludes a series of partial results on the complexity of immanants obtained
over the last 35 years.

Arising from structural graph theory, treewidth has become a focus of study in fixed-parameter
tractable algorithms. Many NP-hard problems are known to be solvable in *O*(*n* · 2^{O(τ)}) time, where τ is the treewidth of the input graph. Analogously, many problems in
P should be solvable in *O*(*n* · τ^{O(1)}) time; however, due to the lack of appropriate tools, only a few such results are
currently known. In our paper, we show this holds for linear programs: Given a linear
program of the form min_{Ax=b,ℓ ≤ x≤ u} *c*^{⊤} *x* whose dual graph *G*_{A} has treewidth τ, and a corresponding width-τ tree decomposition, we show how to solve
it in time

*O*(*n* · τ^{2} log(1/ε)),

where *n* is the number of variables and ε is the relative accuracy. When a tree decomposition
is not given, we use existing techniques in vertex separators to obtain algorithms
with *O*(*n* · τ^{4} log(1/ε)) and *O*(*n* · τ^{2} log(1/ε) + *n*^{1.5}) run-times. Besides being the first of its kind, our algorithm has run-time nearly
matching the fastest run-time for solving the sub-problem *Ax*=*b* (under the assumption that no fast matrix multiplication is used). We obtain these
results by combining recent techniques in interior-point methods (IPMs), sketching,
and a novel representation of the solution under a multiscale basis similar to the
wavelet basis. This representation further yields the first IPM with *o*(rank(*A*)) time per iteration when the treewidth is small.
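
To make the parameter concrete: one common convention (the paper's precise definition may differ in details) takes the dual graph *G*_{A} to have one vertex per constraint, with an edge between two constraints that share a variable with nonzero coefficients in both. Under that assumed convention, an LP whose constraints form a chain has a path as its dual graph, so τ = 1 and the bound above is nearly linear. A sketch:

```python
def dual_graph(A, tol=1e-12):
    """Dual graph of a constraint matrix A (list of rows): one vertex per row,
    an edge (i, j) whenever rows i and j both have a nonzero coefficient in
    some common column. (Assumed convention; see the paper for the exact one.)"""
    m, n = len(A), len(A[0])
    edges = set()
    for j in range(n):
        rows = [i for i in range(m) if abs(A[i][j]) > tol]
        for a in range(len(rows)):
            for b in range(a + 1, len(rows)):
                edges.add((rows[a], rows[b]))
    return edges
```

For the chain-structured matrix in the test below, the dual graph is a path on the three constraints, i.e., a graph of treewidth 1.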