What Will You Learn?

In this talk, I discuss four tactics that enable successful enterprise analytics efforts. The first concerns data integration. Because essentially all enterprise data resides in silos, an integration effort is required before meaningful cross-silo analysis is possible. Data science practitioners routinely report spending at least 80% of their time on “data preparation” (also known as data munging). I describe why this activity is hard and the tactics that can make it less costly. Once one has clean cross-silo data, two further tactics apply: using an analytics suite and using an information discovery tool. The first is required to do data analytics, while the second is necessary when one doesn’t know what analysis to perform. I discuss the desired features of each tool and comment on machine learning. The fourth tactic concerns data lakes and lakehouses. My advice is to put everything in a DBMS, so that the integration challenge of data lakes is as manageable as possible.

Who Will Be Teaching You?


 Michael Stonebraker, PhD, Adjunct Professor @ MIT

Dr. Stonebraker has been a pioneer of database research and technology for more than forty years. He was the main architect of the INGRES relational DBMS and the POSTGRES object-relational DBMS. These prototypes were developed at the University of California at Berkeley, where Stonebraker was a Professor of Computer Science for twenty-five years. More recently, at M.I.T., he was a co-architect of the Aurora/Borealis stream processing engine, the C-Store column-oriented DBMS, the H-Store transaction processing engine, the SciDB array DBMS, and the Data Tamer data curation system. Presently he serves as Chief Technology Officer of Hopara and Tamr, Inc.

Professor Stonebraker was awarded the ACM System Software Award in 1992 for his work on INGRES. Additionally, he was awarded the first annual SIGMOD Innovation award in 1994, and was elected to the National Academy of Engineering in 1997. He was awarded the IEEE John von Neumann Medal in 2005 and the 2014 Turing Award, and is presently an Adjunct Professor of Computer Science at M.I.T.

Course curriculum

  • 1

    Data Analytics at Scale: A Four-legged Stool

    • Data Analytics at Scale: A Four-legged Stool