Daniel Abadi

Associate Professor of Computer Science
Yale University
51 Prospect Street
New Haven, CT 06511

dna AT cs DOT yale DOT edu

Bio

Prof. Abadi's research interests are in database system architecture and implementation, cloud computing, and the Semantic Web. Before joining the Yale computer science faculty, he spent four years at the Massachusetts Institute of Technology where he received his Ph.D. Abadi has been a recipient of a Churchill Scholarship, an NSF CAREER Award, a Sloan Research Fellowship, the 2008 SIGMOD Jim Gray Doctoral Dissertation Award, and the 2007 VLDB best paper award. His research on HadoopDB (see below) is currently being commercialized by Hadapt, where Abadi also serves as chief scientist. He blogs at DBMS Musings and tweets at @daniel_abadi.

News

Blog

Research

Current Projects

  • Petascale Parallel Database Systems (HadoopDB)

    HadoopDB is:
    1. A hybrid of DBMS and MapReduce technologies targeting analytical query workloads
    2. Designed to run on a shared-nothing cluster of commodity machines, or in the cloud
    3. An attempt to fill the gap in the market for a free and open source parallel DBMS
    4. Much more scalable than currently available parallel database systems and DBMS/MapReduce hybrid systems
    5. As scalable as Hadoop, while achieving superior performance on structured data analysis workloads
    This project builds on our paper in VLDB 2009.

  • Data Management for Graph Data and the Semantic Web (SW-Store)

    The goal of SW-Store is to manage and query graph data, with a particular focus on Semantic Web data. We are starting from a clean slate and designing a DBMS specifically for data stored in a vertex-edge-vertex model (e.g., the prevalent Semantic Web data model, the Resource Description Framework). We explore how common graph queries and applications, such as subgraph pattern matching, can be built into the database. This work builds on a publication that won "Best Paper" at a recent VLDB.

  • High-Performance OLTP Databases (H-Store)

    Current OLTP database designs, which date largely from the 1970s, are based on several assumptions about the architecture of database applications and hardware that are less true today than they were 30 years ago. Some examples include:
    1. OLTP used to be dominated by disk-based systems. Today, most OLTP applications can fit completely in the main memory of a cluster of machines arranged in a shared-nothing architecture
    2. Many locking-based pessimistic concurrency control schemes designed to keep the CPUs busy during disk and user stalls are no longer necessary
    3. The number of CPU cores available to process transactions is rapidly expanding, and legacy DBMS code is struggling to keep up (i.e., it does not scale)
    The goal of the H-Store project is to investigate how these architectural and application shifts affect the performance of OLTP databases, and to study what performance benefits would be possible with a complete redesign of OLTP systems in light of these trends. Our early results show that a simple prototype built from scratch using modern assumptions can outperform current commercial DBMS offerings by around a factor of 80 on OLTP workloads. We are currently working to build a full-featured system that demonstrates these performance wins in a more robust prototype. This work is a collaboration between MIT, Yale, and Brown.

Publications

Recent Talks

Teaching

Service