Bio
Prof. Abadi's research interests are in database system architecture and
implementation, cloud computing, and the Semantic Web. Before joining the Yale
computer science faculty, he spent four years at the Massachusetts Institute of
Technology where he received his Ph.D. Abadi has been a recipient of a
Churchill Scholarship, an NSF CAREER Award, a Sloan Research Fellowship,
the 2008 SIGMOD Jim Gray Doctoral Dissertation Award, and the 2007 VLDB
best paper award. His research on HadoopDB (see below) is currently being
commercialized by Hadapt, where Abadi also serves
as chief scientist. He blogs at DBMS Musings and tweets at @daniel_abadi.
Current Projects
- Petascale Parallel Database Systems (HadoopDB)
HadoopDB is:
- A hybrid of DBMS and MapReduce technologies targeting analytical query workloads
- Designed to run on a shared-nothing cluster of commodity machines, or in the cloud
- An attempt to fill the gap in the market for a free and open source parallel DBMS
- Much more scalable than currently available parallel database systems and DBMS/MapReduce hybrid systems
- As scalable as Hadoop, while achieving superior performance on structured data analysis workloads
This project builds on our paper in VLDB 2009.
- Data Management for Graph Data and the Semantic Web (SW-Store)
The goal of SW-Store is to manage and query
graph data, with a particular focus on Semantic Web data. We are starting from a clean slate and designing a DBMS
specifically for data stored in a vertex-edge-vertex model (e.g. the prevalent Semantic Web data
model, the Resource Description Framework). We explore how common graph
queries and applications such as subgraph pattern matching
can be built into the database. This work builds on a publication that won the best paper award at
VLDB 2007.
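To make the vertex-edge-vertex model and subgraph pattern matching concrete, here is a minimal sketch in Python. It is not SW-Store's design or storage layout: it keeps triples in a plain list and matches patterns with a naive nested scan. The data, the "?name" variable convention, and the helper names (is_var, unify, match) are all assumptions made for illustration.

```python
from typing import Dict, Iterator, List, Optional, Tuple

# A triple is (subject vertex, edge label, object vertex), as in RDF.
Triple = Tuple[str, str, str]
Binding = Dict[str, str]


def is_var(term: str) -> bool:
    """Hypothetical convention: terms starting with '?' are variables."""
    return term.startswith("?")


def unify(pattern: Triple, triple: Triple, binding: Binding) -> Optional[Binding]:
    """Extend `binding` so that `pattern` equals `triple`, or return None."""
    result = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # Bind the variable, or check it against its existing value.
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result


def match(patterns: List[Triple], triples: List[Triple],
          binding: Optional[Binding] = None) -> Iterator[Binding]:
    """Yield every variable binding that satisfies all patterns (naive scan)."""
    binding = {} if binding is None else binding
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        extended = unify(first, triple, binding)
        if extended is not None:
            yield from match(rest, triples, extended)


# Hypothetical example graph.
TRIPLES = [
    ("alice", "advises", "bob"),
    ("alice", "worksAt", "yale"),
    ("carol", "advises", "dave"),
    ("carol", "worksAt", "mit"),
]

# Subgraph pattern: who advises whom, where the advisor works at yale?
QUERY = [("?prof", "advises", "?student"), ("?prof", "worksAt", "yale")]
RESULTS = list(match(QUERY, TRIPLES))
```

Each pattern in QUERY is one edge of the subgraph being searched for, and a variable shared between patterns (here "?prof") forces the matched edges to meet at the same vertex. A real system would replace the linear scan with indexes or a specialized storage layout.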
- High-Performance OLTP Databases (H-Store)
Current OLTP database designs, which date largely from the 1970s, are based on
several assumptions about the architecture of database
applications and hardware that are less true today than they were 30 years
ago. Some examples include:
- OLTP used to be dominated by disk-based systems. Today, most OLTP
applications can fit completely in the main memory of a cluster of
machines arranged in a shared-nothing architecture
- Many locking-based pessimistic concurrency control schemes designed to keep
the CPUs busy during disk and user stalls are no longer necessary
- The number of CPU cores available to process transactions is rapidly
expanding, and legacy DBMS code is struggling to keep up (i.e., it does
not scale)
The goal of the H-Store project is to investigate how these architectural and
application shifts affect the performance of OLTP databases,
and to study what performance benefits would be possible
with a complete redesign of OLTP systems in light of these trends. Our early
results show that a simple prototype built from scratch using
modern assumptions can outperform current commercial DBMS offerings by around a
factor of 80 on OLTP workloads. We are currently working
to build a full-featured system that demonstrates these performance wins in a
more robust prototype. This work is a collaboration among MIT, Yale, and Brown.