Ep. 262 - June 12, 2024

TechcraftingAI NLP

Brad Edwards

271 episodes

TechcraftingAI NLP brings you daily summaries of the latest arXiv Computation and Language research.

Technology

54 minutes 55 seconds

ArXiv NLP research for Wednesday, June 12, 2024.

00:19: VALL-E R: Robust and Efficient Zero-Shot Text-to-Speech Synthesis via Monotonic Alignment

02:05: BookSQL: A Large Scale Text-to-SQL Dataset for Accounting Domain

03:15: Designing a Dashboard for Transparency and Control of Conversational AI

04:46: Label-aware Hard Negative Sampling Strategies with Momentum Contrastive Learning for Implicit Hate Speech Detection

05:51: Exploring Speech Foundation Models for Speaker Diarization in Child-Adult Dyadic Interactions

06:53: Exploring Self-Supervised Multi-view Contrastive Learning for Speech Emotion Recognition with Limited Annotations

07:52: Guiding Frame-Level CTC Alignments Using Self-knowledge Distillation

08:55: DeTriever: Decoder-representation-based Retriever for Improving NL2SQL In-Context Learning

10:20: Automated Information Extraction from Thyroid Operation Narrative: A Comparative Study of GPT-4 and Fine-tuned KoELECTRA

11:35: Large Language Model Unlearning via Embedding-Corrupted Prompts

13:17: Defining and Detecting Vulnerability in Human Evaluation Guidelines: A Preliminary Study Towards Reliable NLG Evaluation

14:46: Better than Random: Reliable NLG Human Evaluation with Constrained Active Sampling

16:02: LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning

17:18: Guiding In-Context Learning of LLMs through Quality Estimation for Machine Translation

18:37: It Takes Two: On the Seamlessness between Reward and Policy Model in RLHF

20:02: Adversarial Evasion Attack Efficiency against Large Language Models

21:06: Learning Job Title Representation from Job Description Aggregation Network

21:59: Large Language Models Meet Text-Centric Multimodal Sentiment Analysis: A Survey

23:35: AustroTox: A Dataset for Target-Based Austrian German Offensive Language Detection

24:38: Languages Transferred Within the Encoder: On Representation Transfer in Zero-Shot Multilingual Translation

25:56: Multimodal Table Understanding

27:20: CoXQL: A Dataset for Parsing Explanation Requests in Conversational XAI Systems

28:51: Supportiveness-based Knowledge Rewriting for Retrieval-augmented Language Modeling

30:36: Legend: Leveraging Representation Engineering to Annotate Safety Margin for Preference Datasets

31:57: Semi-Supervised Spoken Language Glossification

33:16: Underneath the Numbers: Quantitative and Qualitative Gender Fairness in LLMs for Depression Prediction

34:37: A Dialogue Game for Eliciting Balanced Collaboration

35:23: Transformer-based Model for ASR N-Best Rescoring and Rewriting

36:16: SumHiS: Extractive Summarization Exploiting Hidden Structure

36:53: Figuratively Speaking: Authorship Attribution via Multi-Task Figurative Language Modeling

38:08: Leveraging Large Language Models for Web Scraping

39:51: M3T: A New Benchmark Dataset for Multi-Modal Document-Level Machine Translation

41:15: Is Programming by Example solved by LLMs?

42:29: Speech Emotion Recognition with ASR Transcripts: A Comprehensive Study on Word Error Rate and Fusion Techniques

43:42: Towards Unsupervised Speech Recognition Without Pronunciation Models

44:50: cPAPERS: A Dataset of Situated and Multimodal Interactive Conversations in Scientific Papers

45:57: Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models

47:02: Tailoring Generative AI Chatbots for Multiethnic Communities in Disaster Preparedness Communication: Extending the CASA Paradigm

48:12: Next-Generation Database Interfaces: A Survey of LLM-based Text-to-SQL

49:56: TasTe: Teaching Large Language Models to Translate through Self-Reflection

51:28: OLMES: A Standard for Language Model Evaluations

52:47: Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing