Department of Statistics
Oxford University
35 episodes
8 months ago
A lecture exploring alternatives to using labeled training data. Labeled training data is often scarce, unavailable, or very costly to obtain. To circumvent this problem, there is growing interest in developing methods that can exploit sources of information other than labeled data, such as weak supervision and zero-shot learning. While these techniques have achieved impressive accuracy in practice, in both vision and language domains, they come with no theoretical characterization of their accuracy. In a sequence of recent works, we develop a rigorous mathematical framework for constructing and analyzing algorithms that combine multiple sources of related data to solve a new learning task. Our learning algorithms provably converge to models that have minimum empirical risk with respect to an adversarial choice over feasible labelings of a set of unlabeled data, where the feasibility of a labeling is determined by constraints defined by estimated statistics of the sources. Notably, these methods do not require the related sources to share the labeling space of the multiclass classification task. We demonstrate the effectiveness of our approach with experiments on various image classification tasks. Creative Commons Attribution-Non-Commercial-Share Alike 2.0 UK: England & Wales; http://creativecommons.org/licenses/by-nc-sa/2.0/uk/
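
Stated compactly, the objective described above is a minimax problem. The formulation below is a sketch with notation invented here for illustration (the lecture's own definitions may differ): f_theta is the model, x_1, ..., x_n the unlabeled set, s_j the j-th weak source, and epsilon-hat_j its estimated error rate with tolerance delta_j:

\min_{\theta} \max_{y \in \mathcal{C}} \; \frac{1}{n} \sum_{i=1}^{n} \ell\bigl(f_\theta(x_i),\, y_i\bigr),
\qquad
\mathcal{C} = \Bigl\{\, y \;:\; \Bigl|\, \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\{s_j(x_i) \neq y_i\} - \hat{\varepsilon}_j \Bigr| \le \delta_j \ \text{for every source } j \,\Bigr\}.

For simplicity this sketch assumes each source predicts in the task's own label space; as the abstract notes, the actual framework does not require that.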
Education
Practical pre-asymptotic diagnostic of Monte Carlo estimates in Bayesian inference and machine learning
57 minutes
4 years ago
Aki Vehtari (Aalto University) gives the OxCSML Seminar on Friday 7th May 2021. Abstract: I discuss the use of the Pareto-k diagnostic as a simple and practical approach for estimating both the required minimum sample size and the empirical pre-asymptotic convergence rate of Monte Carlo estimates. Even when a Monte Carlo estimate has finite variance by construction, its pre-asymptotic behaviour and convergence rate can be very different from the asymptotic behaviour implied by the central limit theorem. I demonstrate with practical examples from importance sampling, stochastic optimization, and variational inference, which are commonly used in Bayesian inference and machine learning.
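
The diagnostic itself is easy to sketch: fit a generalized Pareto distribution to the largest importance ratios and read off the shape parameter k-hat. The following is a minimal illustration, assuming the approach of Vehtari et al.'s Pareto smoothed importance sampling (PSIS) work; the toy target/proposal pair, the 20% tail fraction, and the variable names are choices made here, not taken from the talk.

import numpy as np
from scipy.stats import genpareto, norm

rng = np.random.default_rng(0)

# Toy importance sampling problem: target N(0, 1), proposal N(0, 0.8^2).
# The proposal is narrower than the target, so the importance ratios
# have a heavy right tail even though their variance is finite.
x = rng.normal(0.0, 0.8, size=4000)
log_w = norm.logpdf(x, 0.0, 1.0) - norm.logpdf(x, 0.0, 0.8)
w = np.exp(log_w - log_w.max())  # rescale for numerical stability

# Fit a generalized Pareto distribution to the exceedances of the
# largest 20% of the ratios; the fitted shape parameter is k-hat.
tail = np.sort(w)[-len(w) // 5:]
k_hat, _, _ = genpareto.fit(tail - tail[0], floc=0.0)
print(f"Pareto k-hat: {k_hat:.2f}")

# Rule of thumb from the PSIS paper: k-hat < 0.5 suggests fast, reliable
# convergence; 0.5 <= k-hat < 0.7 slower but usable convergence;
# k-hat >= 0.7 means the estimate should not be trusted.

In practice one would reach for an established implementation such as arviz.psislw, which returns Pareto smoothed log-weights together with the k-hat values, rather than the hand-rolled fit above.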