Tutorials and workshops
Tutorials and workshops will be held on Saturday, May 31, 2014 at Columbia University.
There will be three full-day tutorials focusing on the following subjects:
- Recent Advances on the Lovász Local Lemma
- Efficient Distribution Estimation
- Overcoming the intractability bottleneck in unsupervised machine learning
Location: The workshops are in Mudd Hall and the Shapiro Center at Columbia University. The easiest way to reach this location is to take the 1 subway to 116th St. and enter campus through the main gate. Head to the northeast corner of campus. A map can be found here.
Lunch: You are on your own for lunch. There are many good and reasonably priced lunch options within walking distance. If you walk south on Broadway or north on Amsterdam, you will find many restaurants including Italian, Ethiopian, Thai, Indian, Mexican, Korean, Japanese, Chinese and even American. Here is a recent guide to some of the restaurants in the area.
Recent Advances on the Lovász Local Lemma
Organizers: David Harris, Aravind Srinivasan, Mario Szegedy, Gábor Tardos
Synopsis: The Lovász Local Lemma (LLL) is a fundamental probabilistic tool, and there has been much recent progress related to it, especially on various algorithmic versions and generalizations. In this workshop, we will cover classical and modern material. The "classical" material includes powerful extensions, such as iterated applications of the LLL, that are not widely known; the modern material includes very recent information-theoretic insights into the LLL and its Moser/Moser-Tardos algorithmic versions. Various applications, connections to physics, and derandomization will be among the further aspects covered.
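For reference, the symmetric form of the lemma reads as follows (this is the standard textbook statement, not material specific to any talk):

```latex
% Symmetric Lovász Local Lemma (standard statement).
% Let A_1, ..., A_n be events, each with Pr[A_i] <= p, and each
% mutually independent of all but at most d of the others. Then
\[
  e \, p \, (d+1) \le 1
  \quad\Longrightarrow\quad
  \Pr\Bigl[\,\bigcap_{i=1}^{n} \overline{A_i}\,\Bigr] > 0 ,
\]
% i.e., with positive probability none of the "bad" events occurs.
```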
Location: 303 Mudd Hall
Schedule:
9:15-9:45 | Coffee (outside 303 Mudd Hall)
9:45-10:35 | Aravind Srinivasan: A crown jewel of the Probabilistic Method
- The basic Lovász Local Lemma (LLL) and its proof
- Simple applications
- Iterated LLL (a surprisingly powerful version) and applications to routing and coloring
10:45-11:35 | Mario Szegedy: What are the limits?
- Shearer's precise formulas
- Connections to statistical physics
- A counterexample to Shearer's bound in the variable setting
11:35-1:30 | Lunch (on your own)
1:30-2:20 | Bernhard Haeupler: Constructive aspects
- The Moser-Tardos algorithm, with a proof using recurrences
- The LLL distribution of Haeupler et al. and applications (e.g., to the Santa Claus problem)
- Derandomization
2:30-3:20 | David Harris: Beyond the ordinary LLL
- Lopsided LLL revisited, with applications
- Partial resampling
- Permutation LLL
3:20-3:45 | Coffee (outside 303 Mudd Hall)
3:45-4:35 | Dimitris Achlioptas: Algorithmic progress as compression
- Moser's information-theoretic/compression argument
- Very recent information-theoretic insights, including a local progress lemma
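As background for the constructive talks above: at its core, the Moser-Tardos algorithm is "resample a violated constraint until none remain." A minimal sketch for CNF satisfiability (the encoding and function names below are illustrative, not taken from any talk's materials):

```python
import random

def moser_tardos_sat(clauses, n_vars, rng=random.Random(0)):
    """Moser-Tardos resampling for CNF satisfiability.

    `clauses` is a list of clauses; each clause is a list of nonzero
    ints, where literal k denotes variable |k| (1-indexed), negated
    if k < 0. Under the LLL condition (each width-w clause sharing
    variables with at most 2^w / e - 1 others), the expected number
    of resamplings is linear in the number of clauses.
    """
    # Random initial assignment; index 0 is unused.
    assignment = [rng.random() < 0.5 for _ in range(n_vars + 1)]

    def violated(clause):
        # A clause is violated iff every one of its literals is false.
        return all(assignment[abs(k)] != (k > 0) for k in clause)

    while True:
        bad = [c for c in clauses if violated(c)]
        if not bad:
            return assignment[1:]  # satisfying assignment for vars 1..n
        # Pick a violated clause and resample all its variables uniformly.
        for k in rng.choice(bad):
            assignment[abs(k)] = rng.random() < 0.5

# Example: (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
sol = moser_tardos_sat([[1, 2], [-1, 3], [-2, -3]], n_vars=3)
```

Moser's compression argument, the topic of the last talk, bounds exactly this resampling count by showing that a long run of resamplings would compress the algorithm's random bits.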
Efficient Distribution Estimation
Organizers: Ilias Diakonikolas, Gregory Valiant
Synopsis: The broad question of how to infer information or
accurately estimate properties of a distribution based on random samples is of
fundamental importance across the sciences. Additionally, questions in this vein
have rich mathematical structure, requiring tools and intuitions from
probability, combinatorics, analysis, and information theory. The question is
also distinctly computational in nature: ultimately, the goal is to describe
algorithms that are efficient from both a computational and an
information-theoretic standpoint, and to understand how these algorithms behave. The
recent flood of data, especially from the biological sciences, has provided
fresh perspective on some of these problems, and offers the tantalizing
possibility that new theoretical developments may very quickly find their way
into meaningful practical applications.
The goal of this workshop is to introduce the broad CS theory community to
central problems and techniques in this area of distributional property
estimation, and discuss some new directions and opportunities for future work
that lie between computation and statistics.
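As a concrete instance of the problems above, consider estimating the Shannon entropy of an unknown distribution from samples. The naive "plug-in" estimator below (purely illustrative, not drawn from any talk) makes the sample-efficiency questions these talks address tangible: it is consistent, but biased downward for small samples, and beating that bias is a central theme of the area.

```python
import math
from collections import Counter

def plugin_entropy(samples):
    """Naive plug-in estimate of Shannon entropy (in bits).

    Form the empirical distribution from the samples and apply the
    entropy functional H(p) = -sum p log2 p to it. Consistent as the
    sample size grows, but systematically biased low when the domain
    is large relative to the number of samples.
    """
    n = len(samples)
    counts = Counter(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A fair coin has entropy 1 bit; with a balanced sample the
# plug-in estimate is exactly 1.
est = plugin_entropy([0, 1] * 5000)
```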
Location: 833 Mudd Hall
Schedule:
9:15-9:45 | Coffee (outside 303 Mudd Hall) |
9:45-10:35 | Ronitt Rubinfeld (MIT and Tel Aviv University): Taming distributions over big domains |
10:45-11:35 | Rocco Servedio (Columbia): A complexity theoretic perspective on unsupervised learning |
11:35-1:30 | Lunch (on your own) |
1:30-2:20 | Gregory Valiant (Stanford): Distributional Property Testing and Estimation: Past, Present, and Future |
2:30-3:20 | Paul Valiant (Brown): Lower Bound Techniques for Statistical Estimation Tasks |
3:20-3:45 | Coffee (outside 303 Mudd Hall) |
3:45-4:35 | Ilias Diakonikolas (Edinburgh): Beyond Histograms: Exploiting Structure in Distribution Estimation |
4:45-5:30 | Costis Daskalakis (MIT): Beyond Berry Esseen: Structure and Learning of Sums of Random Variables |
Overcoming the intractability bottleneck in unsupervised machine learning
Organizers: Sanjeev Arora, Rong Ge, Ankur Moitra
Synopsis:
Unsupervised learning — i.e., learning with unlabeled data — is
increasingly important given today's data deluge. Most natural problems in this
domain — e.g., for models such as mixture models, HMMs, graphical models,
topic models and sparse coding/dictionary learning — are NP-hard. In practice,
researchers therefore fall back on heuristics or on convex relaxations with no
concrete approximation bounds. Why several nonconvex heuristics nonetheless work
well in practice is itself a mystery.
Recently, a sequence of results has shown that rigorous approaches leading to
polynomial running time are possible for several of these problems. These
involve sidestepping worst-case complexity via special assumptions on the input.
Some of this work — e.g., for topic models — even leads to practical
running times (50x faster than previous approaches). It has even become possible
to analyze nonconvex optimization heuristics such as alternating minimization and
k-SVD.
This day of talks is meant to introduce the TCS audience to this area, the
progress that has been made, and the interesting open problems that remain. Yann
LeCun, Director of AI Research at Facebook Labs, will tell us about the
surprising power of heuristics in practice.
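To give a flavor of the nonconvex heuristics mentioned above, here is a minimal alternating-minimization sketch for low-rank matrix factorization (entirely illustrative; the talks analyze far more refined variants of this idea):

```python
import numpy as np

def alternating_minimization(M, rank, iters=50, rng=np.random.default_rng(0)):
    """Fit M ~= U @ V.T by alternating least squares.

    Each step fixes one factor and solves the resulting (convex)
    least-squares problem for the other. The joint problem is
    nonconvex, yet this heuristic converges in practice -- exactly
    the phenomenon the workshop's rigorous analyses aim to explain.
    """
    m, n = M.shape
    U = rng.standard_normal((m, rank))
    V = rng.standard_normal((n, rank))
    for _ in range(iters):
        # Fix U, solve min_V ||M - U V^T||_F (least squares).
        V = np.linalg.lstsq(U, M, rcond=None)[0].T
        # Fix V, solve for U symmetrically.
        U = np.linalg.lstsq(V, M.T, rcond=None)[0].T
    return U, V

# Recover an exactly rank-2 matrix from a random initialization.
rng = np.random.default_rng(1)
A = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 15))
U, V = alternating_minimization(A, rank=2)
err = np.linalg.norm(A - U @ V.T) / np.linalg.norm(A)
```

On noiseless exactly low-rank input like this, the residual drops to machine precision; the hard theoretical questions begin when the data is noisy or the objective has spurious critical points.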
Location: 417 Shapiro Center
Schedule:
9:15-9:45 | Coffee (outside 303 Mudd Hall) |
9:45-10:30 | Sanjeev Arora (Princeton): Intractable problems in unsupervised learning: a new frontier for theoretical CS |
10:45-11:30 | Yann LeCun (NYU, Director of AI Research, Facebook Labs): Nonconvex optimization techniques in ML (including deep learning) and their surprising power |
11:35-1:30 | Lunch (on your own) |
1:30-2:15 | Michael Jordan (UC Berkeley): Lower bounds on the performance of polynomial-time algorithms for sparse linear regression |
2:30-3:15 | Daniel Hsu (Columbia): Tensor-decomposition based approaches for model fitting |
3:15-3:45 | Coffee (outside 303 Mudd Hall) |
3:45-4:30 | Ankur Moitra (MIT): Provable algorithms for dictionary learning/sparse coding (and analysis of nonconvex heuristics) |
4:45-5:30 | Rong Ge (MSR New England): Provable algorithms for learning some deep nets |