July 18, 2022

Who: Ido Dagan (Bar-Ilan University)
When: July 21st, 10am–11am
Where: CSE 305
Title: Beyond End-to-end: Decomposed Modeling and Representations in NLP
Abstract: Deep learning has pushed us to develop models and representations which we understand and control to a much lesser degree than earlier methods. In this talk I suggest revisiting the pursuit of decomposed models and representations, aiming to regain better understanding and control of our systems while still leveraging deep learning power. With respect to modeling, I will discuss an alternative to the common end-to-end approach, where a complex task is decomposed into its inherent subtasks, each addressed by a targeted model, illustrating it for the challenging use case of multi-document summarization. Notably, such decomposition becomes particularly appealing when targeted training data for the subtasks can be derived automatically from the originally available “end-to-end” training data. With respect to representations, I will advocate decomposing textual information into a systematic and comprehensive set of “minimal” question-answer pairs. This provides an appealing semi-structured, natural-language-based representation, which may be viewed as a midpoint between traditional formal semantic representations and opaque neural representations. I will describe our recent unified QA-Sem parser, which extracts from a sentence a systematic set of QAs, covering predications by verbs, nominalizations and informational discourse relations, readily available for downstream use.

Lead students on the described projects are Ori Ernst and Ayal Klein.

Bio: Ido Dagan is a Professor at the Department of Computer Science at Bar-Ilan University, Israel, the founder of the Natural Language Processing (NLP) Lab at Bar-Ilan, the founding Director of the nationally funded Bar-Ilan University Data Science Institute, and a Fellow of the Association for Computational Linguistics (ACL). His interests are in applied semantic processing, focusing on textual inference, natural open semantic representations, consolidation and summarization of multi-text information, and interactive text summarization and exploration. Dagan and colleagues initiated and promoted textual entailment recognition (RTE, later also known as NLI) as a generic empirical task. He was the President of the ACL in 2010 and served on its Executive Committee during 2008-2011. In that capacity, he led the establishment of the journal Transactions of the Association for Computational Linguistics, which became one of two premier journals in NLP. Dagan received his B.A. summa cum laude and his Ph.D. (1992) in Computer Science from the Technion. He was a research fellow at the IBM Haifa Scientific Center (1991) and a Member of Technical Staff at AT&T Bell Laboratories (1992-1994). During 1998-2003 he was co-founder and CTO of FocusEngine and VP of Technology of LingoMotors, and he has regularly consulted for industry. His academic research has involved extensive industrial collaboration, including funds from IBM, Google, Thomson-Reuters, Bloomberg, Intel and Facebook, as well as collaboration with local companies under funded projects of the Israel Innovation Authority.