Daniel Abadi

Assistant Professor of Computer Science
Yale University
51 Prospect Street
New Haven, CT 06511

dna AT cs DOT yale DOT edu

Bio

Prof. Abadi's research interests are in database system architecture and implementation, cloud computing, and the Semantic Web. Before joining the Yale computer science faculty, he spent four years at the Massachusetts Institute of Technology where he received his Ph.D. Abadi has been a recipient of a Churchill Scholarship, an NSF CAREER Award, the 2008 SIGMOD Jim Gray Doctoral Dissertation Award, and the 2007 VLDB best paper award.

News

Blog

Research

Current Projects

  • Petascale Parallel Database Systems (HadoopDB)

    HadoopDB is:
    1. A hybrid of DBMS and MapReduce technologies targeting analytical query workloads
    2. Designed to run on a shared-nothing cluster of commodity machines, or in the cloud
    3. An attempt to fill the gap in the market for a free and open source parallel DBMS
    4. Much more scalable than currently available parallel database systems and DBMS/MapReduce hybrid systems
    5. As scalable as Hadoop, while achieving superior performance on structured data analysis workloads
    This project builds on our paper in VLDB 2009.

  • Data management for the Semantic Web (SW-Store)

    SW-Store is a recently launched project whose goal is to manage and query Semantic Web data. We are starting from a clean slate and designing a DBMS specifically for this type of data and for the prevalent Semantic Web data model, the Resource Description Framework (RDF). We explore how common Semantic Web queries and applications, such as reasoning and biological data integration, can be built into the database. This work builds on our publication that won the Best Paper award at VLDB 2007.

  • Column-Oriented Database Systems (C-Store)

    As companies increasingly use analytic data marts and data warehouses for their customer relationship management and business intelligence applications, the use of column-oriented DBMS technology is growing. Column-oriented databases store tables column-by-column (instead of row-by-row) and tend to perform better on analytical applications: because these applications typically read only a subset of table attributes at a time, a column-store can skip the columns a query does not need and is thus more I/O efficient. Consequently, column-stores have recently seen great momentum in industry, with at least a half-dozen new start-up companies, and in the research community, with a rapidly increasing number of recent publications. At Yale, we're interested in a variety of research topics within the context of column-store database systems, including:
    1. How to build a massively parallel, shared-nothing column-store database?
    2. How to make column-stores elastically expand or contract across resources in the cloud?
    3. How to maximize column-store write performance?
    4. How to turn PostgreSQL into a hybrid row-store/column-store DBMS?
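
    The I/O argument above can be illustrated with a minimal sketch. This is not C-Store code; the table, the column names, and the "values touched" counter are hypothetical stand-ins for data read from disk. It only shows that an aggregate over one attribute touches every field in a row layout but a single column in a column layout.

```python
from array import array

# Hypothetical example table: (customer_id, region, revenue).
ROWS = [
    (1, "east", 120.0),
    (2, "west", 75.5),
    (3, "east", 42.0),
    (4, "north", 310.25),
]


def to_columns(rows):
    """Pivot row tuples into one contiguous sequence per attribute."""
    ids, regions, revenue = zip(*rows)
    return {
        "customer_id": array("q", ids),
        "region": list(regions),
        "revenue": array("d", revenue),
    }


def sum_revenue_row_store(rows):
    """Row layout: each whole tuple is read to reach one attribute."""
    total = 0.0
    values_touched = 0
    for row in rows:
        values_touched += len(row)  # all fields of the row are fetched
        total += row[2]
    return total, values_touched


def sum_revenue_column_store(columns):
    """Column layout: only the 'revenue' column is read."""
    col = columns["revenue"]
    return sum(col), len(col)


if __name__ == "__main__":
    print(sum_revenue_row_store(ROWS))
    print(sum_revenue_column_store(to_columns(ROWS)))
```

    On this toy table both functions return the same total, but the row layout touches three values per row while the column layout touches one. A real system measures the gap in disk pages rather than values, and it grows with the number of attributes the query does not use.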


  • High-Performance OLTP Databases (H-Store)

    Current OLTP database designs, which date largely from the 1970s, are based on several assumptions about the architecture of database applications and hardware that are less true today than they were 30 years ago. Some examples include:
    1. OLTP used to be dominated by disk-based systems. Today, most OLTP applications can fit completely in the main memory of a cluster of machines arranged in a shared-nothing architecture
    2. Many locking-based pessimistic concurrency control schemes designed to keep the CPUs busy during disk and user stalls are no longer necessary
    3. The number of CPU cores available to process transactions is rapidly expanding, and legacy DBMS code is struggling to keep up (i.e., it does not scale)
    The goal of the H-Store project is to investigate how these architectural and application shifts affect the performance of OLTP databases, and to study what performance benefits would be possible with a complete redesign of OLTP systems in light of these trends. Our early results show that a simple prototype built from scratch using modern assumptions can outperform current commercial DBMS offerings by around a factor of 80 on OLTP workloads. We are currently working to build a full-featured system that demonstrates these performance wins in a more robust prototype. This work is a collaboration between MIT, Yale, and Brown.

Publications

Recent Talks

Teaching

Service