Description

Recent progress in automated text generation relies predominantly on large datasets, sometimes requiring millions of examples for each application setting. In the first part of the talk, we'll develop novel text generation methods that balance the goals of fluency, consistency, and relevance without requiring any training data. We focus on text summarization and simplification, directly defining a multi-component reward and training text generators to optimize this objective. The novel approaches we introduce perform better than all existing unsupervised approaches and in many cases outperform those that rely on large datasets, showing that high-performing NLP models are possible when little data is available.
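To make the idea of a multi-component reward concrete, here is a minimal sketch in Python. The function names, component scorers, and weighted-sum combination are illustrative assumptions for exposition, not the talk's actual implementation (which would use learned models for each component):

```python
# Hypothetical sketch: combine several quality components (each scored
# in [0, 1]) into a single scalar reward that a text generator can be
# trained to optimize. All names and scorers here are toy stand-ins.

def multi_component_reward(source, output, scorers, weights):
    """Weighted sum of per-component scores for a generated output."""
    return sum(weights[name] * score(source, output)
               for name, score in scorers.items())

# Toy component scorers standing in for learned models.
def fluency(source, output):
    # A real system might use a language-model probability here.
    return 1.0 if output.endswith(".") else 0.5

def relevance(source, output):
    # A real system might compare semantic representations; this is
    # simple word overlap for illustration.
    src, out = set(source.lower().split()), set(output.lower().split())
    return len(src & out) / max(len(out), 1)

scorers = {"fluency": fluency, "relevance": relevance}
weights = {"fluency": 0.5, "relevance": 0.5}

reward = multi_component_reward("The cat sat on the mat.",
                                "The cat sat.", scorers, weights)
```

A reward shaped this way lets each desirable property (fluency, consistency, relevance) be scored independently and traded off explicitly via the weights, which is what makes training without reference summaries possible.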

In the second part of the talk, we incorporate text generation into interfaces to help news readers navigate complex, unfolding news topics. We build a novel representation of news stories at scale and integrate new summarization, question generation and question answering modules into a chatbot and an automated interactive podcast. Human evaluations confirm that even though imperfect systems introduce friction for the user, they can serve as powerful tools to stimulate reader curiosity and help readers dive deeper into unfolding topics.

Three main learning points:

(1) How to approach text generation tasks such as summarization and simplification without any data

(2) Concrete examples of text-centric interfaces that integrate modern NLP to help users navigate complex text collections

(3) How to run human evaluation on complex interfaces to confirm which NLP features add value for a user

Local ODSC chapter in NYC, USA


Instructor's Bio

Philippe Laban 

Research Scientist at Salesforce Research

Philippe Laban is a Research Scientist at Salesforce Research, where he works on text generation projects, including summarization and interactive question answering. Previously, he obtained his Ph.D. from UC Berkeley, where he was advised by Marti Hearst and John Canny. His work at Berkeley focused on designing unsupervised methods for text generation and on building and adapting NLP techniques for a very large, noisy, and evolving news dataset. He completed his undergraduate education at Georgia Tech, where he did research in signal processing and discrete mathematics.


Webinar

    ON-DEMAND WEBINAR: Unsupervised Text Generation and its Application to News Interfaces

    • Ai+ Training

    • Webinar recording

    • Join ODSC West 2021 Training Conference