Microsoft Gray Systems Lab
David DeWitt

David J. DeWitt

Educational Background

Ph.D., University of Michigan, 1976
M.S.E., University of Michigan, 1971
A.B., Colgate University, 1970

Employment

April 2008 to present, Technical Fellow
Microsoft Corporation, Madison, WI

April 2008 to present, John P. Morgridge Professor, Emeritus
Computer Sciences Department, University of Wisconsin-Madison

1999 to March 2008, John P. Morgridge Professor
Computer Sciences Department, University of Wisconsin-Madison

1999-2004, Chair
Computer Sciences Department, University of Wisconsin-Madison

1984-1999, Professor and Romnes Fellow
Computer Sciences Department, University of Wisconsin-Madison

1982-1984, Associate Professor
Computer Sciences Department, University of Wisconsin-Madison

1976-1982, Assistant Professor
Computer Sciences Department, University of Wisconsin-Madison

Professional Societies & Honors

National Academy of Engineering, 1998-
Fellow, American Academy of Arts and Sciences, 2007-
ACM Fellow, 1995-
ACM Software Systems Award, 2009
IEEE Emanuel R. Piore Award, 2009
ACM SIGMOD Innovations Award, 1995

Talks

Polybase

One of the newest projects at GSL is Polybase, which tightly integrates SQL Server PDW with Hadoop. This talk provides a good overview of the goals of the project. At this point no decision has been made on whether to commercialize the Polybase prototype.

PASS Keynotes

Since I joined Microsoft in 2008, I have been asked each year to give a keynote talk at PASS, the annual SQL Server users group meeting. The talks have covered the following topics:
 
2011 – Big Data, What's the Big Deal, A talk on big data and Hadoop (file is 20MB)
2010 – SQL Query Optimization: Why is it So Hard to Get Right, Introduction to relational query optimization (file is 15MB)
2009 – From 1 to 1000 MIPS, A general talk on how technology trends have impacted the design of DB systems
2008 – Parallel Database Systems 101, An introduction to the key techniques used by parallel database systems

Feel free to use these PASS talks, in whole or in part. The talks are each about 75 minutes long. There is no copyright on the decks. All I ask is that if you “lift” slides for a talk, you acknowledge where you got them.

Publications

Below are a number of selected papers organized by research area and project. Additional papers can be found on the DBLP web site and the Wisconsin Database Group Web Site.

Hadoop and Big Data

A Comparison of Approaches to Large-Scale Data Analysis, (with Pavlo, A., Paulson, E., Rasin, A., Abadi, D., Madden, S., and M. Stonebraker), Proceedings of the 2009 SIGMOD Conference, Providence, R.I., May 2009.

MapReduce and Parallel DBMSs: Friends or Foes, (with Stonebraker, M., Abadi, D., Madden, S., Paulson, E., Pavlo, A., and A. Rasin), CACM, January 2010, Vol. 53, No. 1.

Clustera:  An Integrated Computation and Data Management System (with Robinson, E., Shankar, S., Paulson, E., Naughton, J., Krioukov, A. and J. Royalty), Proceedings of the 2008 VLDB Conference, Auckland, NZ, August 2008.

Parallel Database Systems

Over the 32-year period I was a professor at Wisconsin, we implemented three parallel database systems: DIRECT (1977-1984), Gamma (1984-1992), and Paradise (1993-1997). Unfortunately, I no longer have copies of the papers that are listed without links.

The following paper presents a high-level overview of the mechanisms used by today's commercial parallel database products.

Parallel Database Systems: The Future of Database Processing or a Passing Fad? (with J. Gray), Communications of the ACM, June, 1992. 
 
DIRECT

The DIRECT project ran from 1977 until 1984. It was one of the first operational parallel database systems. Several versions of the system were built, starting with PDP 11/03s and ending with PDP 11/23 processors connected by a 1 megabit token ring for passing messages and a shared memory constructed using CCD chips.

DIRECT - A Multiprocessor Organization for Supporting Relational Data Base Management Systems, IEEE Transactions on Computers, Vol. C-28, No. 6, June 1979.

Query Execution in Direct,  Proceedings of the 1979 SIGMOD Conference, Boston, MA, May 1979.

Implementation of the Database Machine DIRECT (with H. Boral, D. Friedland, N. Jarrell, and W. K. Wilkinson), IEEE Transactions on Software Engineering, Vol. SE-8, No. 6, November, 1982.

Gamma

The GAMMA project began in January 1984 and ran until late 1992, at which point the code was so broken from years of patching that we gave up. The first version of GAMMA became operational in the fall of 1985 on a collection of 20 VAX 11/750s connected by a 100 Mbit/second token ring that Proteon constructed for us. Later the system was ported to a 32-processor Intel iPSC-2 hypercube configured with one disk per processor.

GAMMA - A High Performance Dataflow Database Machine (with B. Gerber, G. Graefe, M. Heytens, K. Kumar, and M. Muralikrishna), Proceedings of the 1986 VLDB Conference, Japan, August 1986.

The GAMMA Database Machine Project (with S. Ghandeharizadeh, D. Schneider, H. Hsiao, A. Bricker, R. Rasmussen), IEEE Transactions on Knowledge and Data Engineering, Vol. 2, No. 1, March, 1990.

A Performance Analysis of the Gamma Database Machine (with S. Ghandeharizadeh and D. Schneider), Proceedings of the 1988 SIGMOD Conference, Chicago, Ill., June, 1988.

Multiprocessor Hash-Based Join Algorithms (with Bob Gerber), Proceedings of the 1985 VLDB Conference, Stockholm, Sweden, August, 1985.

A Performance Evaluation of Four Parallel Join Algorithms in a Shared-Nothing Multiprocessor Environment (with D. Schneider), Proceedings of the 1989 SIGMOD Conference, Portland, Oregon, May 1989.

A Comparison of Non-Equijoin Algorithms (with J. Naughton, and D. Schneider), Proceedings of the 15th International VLDB Conference, Barcelona, Spain, August, 1991.

Parallel Sorting on a Shared-Nothing Architecture using Probabilistic Splitting (with J. Naughton and D. Schneider), Proceedings of the Parallel and Distributed Information Systems Conference, Miami Beach, Florida, December, 1991.

Practical Skew Handling in Parallel Joins (with J. Naughton, D. Schneider, and S. Seshadri), Proceedings of the 1992 Very Large Data Base Conference, Vancouver, Canada, August 1992.

Nested Loops Revisited (with J. Naughton and J. Burger), Proceedings of the Second International Conference on Parallel and Distributed Information Systems, San Diego, CA, January, 1993.

Tradeoffs in Processing Multi-Way Join Queries via Hashing in Multiprocessor Database Machines (with D. Schneider), Proceedings of the 1990 VLDB Conference, Brisbane, Australia, August, 1990.

Dynamic Memory Allocation for Multiple Query Workloads (with M. Mehta), Proceedings of the 1993 Very Large Data Base Conference, Dublin, Ireland, August 1993.

Managing Intra-Operator Parallelism in Parallel Database Systems (with M. Mehta), Proceedings of the 1995 VLDB Conference, Zurich, September 1995.

Chained Declustering: A New Availability Strategy for Multiprocessor Database Machines (with H. Hsiao), Proceedings of the 6th International Conference on Data Engineering, Los Angeles, CA, February 1990.

A Performance Study of Three High Availability Data Replication Strategies (with Hui-I Hsiao), Proceedings of the Parallel and Distributed Information Systems Conference, Miami Beach, Florida, December, 1991.

Paradise

Client-Server Paradise (with J. Patel, J. Luo, and J. Yu), Proceedings of the 1994 VLDB Conference, Chile, August 1994.

Building A Scalable GeoSpatial Database System: Technology, Implementation, and Evaluation (with J. Naughton, J. Patel, J. Yu, N. Kabra and a cast of dozens), Proceedings of the 1997 SIGMOD Conference, Tucson, Arizona, May, 1997.

Query Pre-Execution and Batching in Paradise: A Two-Pronged Approach to the Efficient Processing of Queries in Tape-Resident Data Sets (with JieBing Yu), Proceedings of the 9th International Conference on Scientific and Statistical Database Management, Olympia, Washington, August 1997.

Processing Satellite Images on Tertiary Storage: A Study of the Impact of Tile Size on Performance (with JieBing Yu), Proceedings of the 1996 NASA Conference on Mass Storage Systems, College Park, MD, Sept. 1996.

Partition Based Spatial Merge Join (with Jignesh Patel), Proceedings of the 1996 SIGMOD Conference, Montreal, Canada, June, 1996.

Benchmarking

Benchmarking Database Systems - A Systematic Approach (with D. Bitton and C. Turbyfill), Proceedings of the 1983 Very Large Database Conference, October 1983. Here is a link to a tar file that contains the benchmark queries and generator.

A Methodology for Database System Performance Evaluation (with H. Boral), Proceedings of the 1984 SIGMOD Conference, June, 1984.

The OO7 Benchmark (with M. Carey and J. Naughton), Proceedings of the 1993 SIGMOD Conference, Washington, D.C., May 1993.

The Bucky Object Relational Benchmark (with M. Carey, J. Naughton, M. Asgarian, J. Gehrke, D. Shah), Proceedings of the 1997 SIGMOD Conference, Tucson, Arizona, May, 1997.

Query Optimization

The EXODUS Optimizer Generator (with G. Graefe), Proceedings of the 1987 SIGMOD Conference, San Francisco, CA, May 1987.

Opt++ - an Object Oriented Approach to Query Optimization (with N. Kabra), VLDB Journal, November 1997.

Efficient Mid-Query Re-Optimization of SubOptimal Query Execution Plans (with N. Kabra), Proceedings of the 1998 SIGMOD Conference, Seattle, WA, June, 1998.

Buffer Pool Aware Query Optimization (with R. Ramamurthy), Proceedings of the 2005 CIDR Conference, Asilomar, CA, January 2005.

Proactive Re-Optimization (with Babu, S. and P. Bizarro), Proceedings of the SIGMOD 2005 Conference, Baltimore, MD, June 2005.

Object-Oriented Database Systems

Of Objects and Databases: A Decade of Turmoil (with M. Carey), Invited Paper, Proceedings of the 1996 VLDB Conference, Bombay, India, August, 1996.

Shoring Up Persistent Applications (with M. Carey, J. Naughton, M. Solomon, ...), Proceedings of the 1994 SIGMOD Conference, Minneapolis, Minn, May 1994.

QuickStore: A High Performance Mapped Object Store (with S. White), Proceedings of the 1994 SIGMOD Conference, Minneapolis, Minn, May 1994. Also in the "Best of SIGMOD 1994" issue of the VLDB Journal, Vol. 4, No. 4, October 1995.

Implementing Crash Recovery in QuickStore: A Performance Study (with S. White), Proceedings of the 1995 SIGMOD Conference, San Francisco, CA, May 1995.

A Performance Study of Alternative Object Faulting and Pointer Swizzling Strategies (with Seth White), Proceedings of the 1992 Very Large Data Base Conference, Vancouver, Canada, August 1992.

A Study of Three Alternative Workstation-Server Architectures for Object Oriented Database Systems (with P. Futtersack, D. Maier, and F. Velez), Proceedings of the 1990 VLDB Conference, Brisbane, Australia, August, 1990.

The Architecture of the EXODUS Extensible DBMS (with M. Carey, D. Frank, G. Graefe, J. E. Richardson, E. J. Shekita and M. Muralikrishna), Proceedings of the International Workshop on Object Oriented Database Systems, Asilomar, CA. September, 1986.

The EXODUS Extensible DBMS Project: An Overview (with M. Carey, Graefe, G., Haight, D., Richardson, J., Schuh, D., Shekita, E., and Vandenberg, S.), in Readings in Object-Oriented Database Systems, S. Zdonik and D. Maier, eds., Morgan Kaufmann Publ. Co., 1989.

Object and File Management in the EXODUS Extensible Database System (with M. Carey, J. Richardson, and E. Shekita), Proceedings of the 1986 VLDB Conference, Japan, August 1986.

Storage Management for Objects in EXODUS (with Carey, M., Richardson, J., and Shekita, E.), in Object-Oriented Concepts, Applications, and Databases, W. Kim and F. Lochovsky, eds., Addison-Wesley Publishing Co., 1988.

Crash Recovery in Client-Server EXODUS (with M. Franklin, M. Zwilling, C. Tan, and M. Carey), Proceedings of the 1992 SIGMOD Conference, San Diego, CA, June 1992.

© 2013 Microsoft