A lecture exploring alternatives to using labeled training data. Labeled training data is often scarce, unavailable, or very costly to obtain. To circumvent this problem, there is growing interest in developing methods that can exploit sources of information other than labeled data, such as weak supervision and zero-shot learning. While these techniques have achieved impressive accuracy in practice, in both vision and language domains, they come with no theoretical characterization of their accuracy. In a sequence of recent works, we develop a rigorous mathematical framework for constructing and analyzing algorithms that combine multiple sources of related data to solve a new learning task. Our learning algorithms provably converge to models that have minimum empirical risk with respect to an adversarial choice over feasible labelings for a set of unlabeled data, where the feasibility of a labeling is computed through constraints defined by estimated statistics of the sources. Notably, these methods do not require the related sources to have the same labeling space as the multiclass classification task. We demonstrate the effectiveness of our approach with experiments on various image classification tasks. Creative Commons Attribution-Non-Commercial-Share Alike 2.0 UK: England & Wales; http://creativecommons.org/licenses/by-nc-sa/2.0/uk/
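The idea of minimizing risk against an adversarial choice of feasible labelings can be illustrated with a toy sketch. This is not the algorithm from the lecture: it assumes binary labels, weak sources that share the target label space, and a brute-force enumeration that only works for a handful of points. All names (`feasible_labelings`, `worst_case_risk`, the vote vectors, and the estimated error rates) are hypothetical.

```python
from itertools import product

def error_rate(votes, labels):
    """Fraction of points on which a vote vector disagrees with a labeling."""
    return sum(v != y for v, y in zip(votes, labels)) / len(labels)

def feasible_labelings(sources, est_errors, tol):
    """Enumerate every binary labeling whose error against each weak source
    lies within `tol` of that source's estimated error rate (assumed given)."""
    n = len(sources[0])
    return [y for y in product((0, 1), repeat=n)
            if all(abs(error_rate(s, y) - e) <= tol
                   for s, e in zip(sources, est_errors))]

def worst_case_risk(predictions, feasible):
    """Empirical risk under the most adversarial feasible labeling."""
    return max(error_rate(predictions, y) for y in feasible)

# Two hypothetical weak sources voting on six unlabeled points.
source_a = (1, 1, 1, 0, 0, 0)
source_b = (1, 1, 0, 0, 0, 1)
feasible = feasible_labelings([source_a, source_b],
                              est_errors=[1 / 6, 1 / 6], tol=0.05)

# Candidate classifiers, represented only by their predictions on the six points.
candidates = {"copy_a": source_a, "copy_b": source_b, "other": (1, 1, 0, 0, 0, 0)}
best = min(candidates, key=lambda name: worst_case_risk(candidates[name], feasible))
print(len(feasible), best)
```

The selected candidate is the one whose risk is smallest under the least favorable labeling that is still consistent with the source statistics, which is the minimax principle the description refers to.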
Complexity of local MCMC methods for high-dimensional model selection
Department of Statistics
1 hour 1 minute
4 years ago
Quan Zhou, Texas A&M University, gives an OxCSML Seminar on Friday 25th June 2021. Abstract:
In a model selection problem, the size of the state space typically grows exponentially (or even faster) with p (the number of variables). But MCMC methods for model selection usually rely on local moves, which only look at a neighborhood of size polynomial in p. Naturally, one may wonder how efficient these sampling methods are at exploring the posterior distribution. Consider variable selection first. Yang, Wainwright and Jordan (2016) proved that the random-walk add-delete-swap sampler is rapidly mixing under mild high-dimensional assumptions. By using an informed proposal scheme, we obtain a new MCMC sampler that achieves a much faster mixing time, independent of p, under the same assumptions. The mixing time proof relies on a novel approach called the "two-stage drift condition", which can be useful for obtaining tight complexity bounds. This result shows that the mixing rate of locally informed MCMC methods can be fast enough to offset the computational cost of local posterior evaluation, and thus such methods scale well to high-dimensional data. Next, we generalize this result to other model selection problems. It turns out that locally informed samplers attain a dimension-free mixing time if the posterior distribution satisfies a unimodal condition. We show that this condition can be established for the high-dimensional structure learning problem even when the ordering of variables is unknown.
This talk is based on joint work with H. Chang, J. Yang, D. Vats, G. Roberts and J. Rosenthal.
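To make the notion of a locally informed proposal concrete, here is a minimal sketch of a Metropolis-Hastings sampler over variable-inclusion vectors. It is not the sampler analyzed in the talk: the toy unimodal log-posterior, the restriction to add and delete moves (no swaps), and the square-root weighting of neighbors are all assumptions made for illustration, and every function name is hypothetical.

```python
import math
import random

def log_post(model, true_model, kappa=3.0):
    """Toy unimodal log-posterior (assumed): penalize each wrong inclusion flag."""
    return -kappa * sum(a != b for a, b in zip(model, true_model))

def flip(model, j):
    """Neighbor obtained by adding or deleting variable j."""
    return model[:j] + (1 - model[j],) + model[j + 1:]

def proposal_weights(model, true_model):
    """Weight each single-flip neighbor by the square root of its posterior ratio."""
    base = log_post(model, true_model)
    return [math.exp(0.5 * (log_post(flip(model, j), true_model) - base))
            for j in range(len(model))]

def informed_step(model, true_model, rng):
    """One Metropolis-Hastings step with a locally informed proposal."""
    w = proposal_weights(model, true_model)
    j = rng.choices(range(len(model)), weights=w)[0]
    cand = flip(model, j)
    w_back = proposal_weights(cand, true_model)
    # The Hastings ratio needs both the forward and the reverse proposal probability.
    log_q_fwd = math.log(w[j] / sum(w))
    log_q_back = math.log(w_back[j] / sum(w_back))
    log_alpha = (log_post(cand, true_model) - log_post(model, true_model)
                 + log_q_back - log_q_fwd)
    if rng.random() < math.exp(min(0.0, log_alpha)):
        return cand
    return model

def run_chain(p, true_model, n_steps, seed):
    """Start from the empty model; return the first hitting time of true_model
    (or None if never reached) and the visited states."""
    rng = random.Random(seed)
    state = (0,) * p
    chain, hit = [state], None
    for t in range(1, n_steps + 1):
        state = informed_step(state, true_model, rng)
        chain.append(state)
        if hit is None and state == true_model:
            hit = t
    return hit, chain

hit_time, _ = run_chain(p=20, true_model=(1,) * 3 + (0,) * 17, n_steps=500, seed=0)
print("first visit to the true model at step:", hit_time)
```

Each step evaluates the posterior at all p single-flip neighbors, which is the local evaluation cost the abstract refers to. A random-walk sampler would instead pick the coordinate uniformly and pay only one evaluation per step.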
Bio: Quan Zhou is an assistant professor in the Department of Statistics at Texas A&M University (TAMU). Before joining TAMU, he was a postdoctoral research fellow at Rice University. He did his PhD at Baylor College of Medicine.