TechcraftingAI NLP
Brad Edwards
271 episodes
4 days ago
TechcraftingAI NLP brings you daily summaries of the latest arXiv Computation and Language research.
Technology
All content for TechcraftingAI NLP is the property of Brad Edwards and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
Ep. 259 - June 9, 2024
TechcraftingAI NLP
37 minutes 33 seconds
1 year ago

ArXiv NLP research for Sunday, June 09, 2024.


00:19: How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States

01:40: DomainRAG: A Chinese Benchmark for Evaluating Domain-specific Retrieval-Augmented Generation

03:25: Do LLMs Exhibit Human-Like Reasoning? Evaluating Theory of Mind in LLMs for Open-Ended Responses

05:08: MS-HuBERT: Mitigating Pre-training and Inference Mismatch in Masked Language Modelling methods for learning Speech Representations

06:17: SinkLoRA: Enhanced Efficiency and Chat Capabilities for Long-Context Large Language Models

08:11: Peer Review as A Multi-Turn and Long-Context Dialogue with Role-Based Interactions

09:54: MoPS: Modular Story Premise Synthesis for Open-Ended Automatic Story Generation

11:20: QGEval: A Benchmark for Question Generation Evaluation

12:44: MrRank: Improving Question Answering Retrieval System through Multi-Result Ranking Model

13:43: Arabic Diacritics in the Wild: Exploiting Opportunities for Improved Diacritization

14:46: The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models

16:30: RE-RAG: Improving Open-Domain QA Performance and Interpretability with Relevance Estimator in Retrieval-Augmented Generation

18:14: Hidden Holes: topological aspects of language models

19:46: Do Prompts Really Prompt? Exploring the Prompt Understanding Capability of Whisper

20:40: Seventeenth-Century Spanish American Notary Records for Fine-Tuning Spanish Large Language Models

22:02: MedREQAL: Examining Medical Knowledge Recall of Large Language Models via Question Answering

23:12: II-Bench: An Image Implication Understanding Benchmark for Multimodal Large Language Models

25:17: Zero-Shot End-To-End Spoken Question Answering In Medical Domain

26:27: Are Large Language Models Actually Good at Text Style Transfer?

27:32: Feriji: A French-Zarma Parallel Corpus, Glossary & Translator

28:56: TTM-RE: Memory-Augmented Document-Level Relation Extraction

30:12: Why Don't Prompt-Based Fairness Metrics Correlate?

31:27: Hello Again! LLM-powered Personalized Agent for Long-term Dialogue

33:12: Semisupervised Neural Proto-Language Reconstruction

34:12: Prompting Large Language Models with Audio for General-Purpose Speech Summarization

35:14: A Dual-View Approach to Classifying Radiology Reports by Co-Training

36:07: ThaiCoref: Thai Coreference Resolution Dataset
