Jeffrey D. Ullman

Jeff Ullman is the Stanford W. Ascherman Professor of Computer Science (Emeritus). His interests include database theory, database integration, data mining, and education using the information infrastructure.

What's New | Polemics | Books | Biographical Information

What's New

Workshop on Algorithms for MapReduce and Beyond

I have gotten involved in a Workshop "Algorithms for MapReduce and Beyond" that will be held March 28, 2014 in Athens, in conjunction with EDBT/ICDT. Here is the link for information. The submission deadline is Dec. 7, 2013. The Workshop Home Page. If you are working on algorithms for MapReduce, graph-oriented systems like Pregel, or any of the various extensions of MapReduce, we'd like to hear from you.

New Polemic

Experiments as Research Validation -- Have We Gone too Far?.

Follow Me on Google+

I refuse to get involved with Facebook, or Twitter, or LinkedIn, or any of the old social-network sites, but I have started to post observations and reports of my trips and such on Google+. I'm ullmanAtGmailDotCom (you can figure it out!).

Map-Reduce Algorithms

Along with a number of colleagues, I have been looking at the question of algorithm design for Hadoop (MapReduce). Foto Afrati and I started with a technique for optimal implementation of multiway joins using a single MR job: Optimizing Joins in a Map-Reduce Environment. This idea was applied to the problem of finding triangles and more complex structures in graphs using MR: Enumerating Subgraph Instances Using Map-Reduce. We then started looking at the problem of similarity joins using MR: Fuzzy Joins Using MapReduce.

The outgrowth of these studies was something that I believe is really important: a model of MapReduce computation that gets at the heart of what makes it different from plain old parallel computing. In Upper and Lower Bounds on the Cost of a Map-Reduce Computation. The theory explained here lets you get lower bounds on the amount of communication needed, as a function of how much work each reducer is assigned. That, in turn, lets you know when your algorithm can be improved and when it cannot. We already have a number of surprises in this regard, and we invite interested researchers to help explore the many problems and algorithms that can be analyzed this way. There is a Tutorial at the SoCC on Oct. 17, 2012. The tutorial covers these developments and a number of others related to MapReduce and its extensions.

Gradiance News

Gradiance is a system for creating and administering class exercises. These homeworks and progamming labs are designed to teach students, rather than merely to test. Through the concept of a "root question," students can repeat the same work several times, and are given advice when they make an error.

The Gradiance contract with Pearson (Addison-Wesley + Prentice-Hall) has terminated, and we have decided to turn Gradiance into a FREE service. If you are an instructor who wants to use the system, start by creating an account for yourself at www.gradiance.com/services NOT at the Pearson site. Note: passwords are >= 10 letters+digits, with at least one of each. Also, we cannot make an account be an instructor account for a book if the same account has registered as a student for a course using the same materials.

Then, email your chosen login, with the book whose materials you want, to support@gradiance.com We'll enable you to create a class using those materials. There are manuals at www.gradiance.com/info.html that should enable you to use the system without problems, but feel free to email support@gradiance.com if you encounter difficulties.

In addition, we have created eleven free "omnibus classes" covering Databases, Automata, Compilers, Operating Systems, Introductory Java, Data Structures, and Data Mining. Students wishing to join either one of these classes will find the Student Directions useful.

Free Book: Mining of Massive Datasets

The book Mining of Massive Dataasets by Anand Rajaraman and me, based on the CS345A course we have been teaching (now CS246 in the Winter and CS341 in the Spring), Has just been published in hardcopy by Cambridge University Press. It can be purchased in hardcopy Here. By agreement with Cambridge Press, it may still be downloaded gratis Here.

Another Free Book: Foundations of Computer Science

In 1992, Al Aho and I published a book called Foundations of Computer Science, whose goal was to present CS theory as something integrally connected to CS practice. For example, we viewed recursive programs, recursive definitions, and inductive proofs as the same thing. We believe the book deserved a better fate, but the publisher took it out of print years ago. Having received back the rights to the book, we are happy to make it freely available Here.

I Would Like to Hear From You, But...

I generally enjoy getting emails, even if it is to tell me of a mistake in a book. In fact, those notes are particularly important; they help not only me and my coauthors, but more importantly, later readers of the book. I try to respond helpfully and in a timely manner to email that isn't spam. However, there are two classes of emails that I think should not be written and that get a form response. Read More.

Polemics

As part of my role as old curmudgeon (replacing my previous role as young curmudgeon), I have been writing a few articles about things I think especially stupid. I hope to write more.


Books --- Past and Future

For some sets of notes and materials for supplementing current books, click here:

Biographical Information


Jeffrey D. Ullman
ullman @ cs.stanford.edu
650-494-8016 (home)
650-725-2588 (FAX)