
Independent Study Opportunities

with Andreas Paepcke

1. DB2Viz

Specify massive-data visualizations as part of SQL queries.
(Independent study)

Prerequisites: SQL, JavaScript. D3 a plus.


D3 and Google visualizations are the modern tools for delivering visualizations on the Web. But creating charts or creative new visualizations is procedural and hard. While we cannot make the creation of entirely new visualizations easy, we can make it much easier to connect data sources to the visualizations that make the data understandable.

We will make visualizations declarative by extending SQL. The same query that generates the data result specifies the intended visualization. A compilation process then issues the query to the data source, creates the visualization frame, and connects that frame to the result stream. Here is an example query in one possible extended syntax:

SELECT count(age) AS age_distrib FROM students WHERE country = 'US' GROUP BY age DISPLAY age_distrib AS histogram;

We will not need to write a full SQL parser, but our software will need to determine which data types to expect in the SQL results.
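The following is a minimal sketch of that idea, not a design decision. It assumes the DISPLAY clause always comes last, as in the example query above, so a regular expression can strip it off and hand the remaining plain SQL to the database. The function names, the use of Python, and the use of SQLite as a stand-in data source are all assumptions made for illustration.

```python
import re
import sqlite3

# Assumed clause shape, taken from the example query:
#   ... DISPLAY <column> AS <chart_type>;
DISPLAY_RE = re.compile(
    r"\s+DISPLAY\s+(?P<column>\w+)\s+AS\s+(?P<chart>\w+)\s*;?\s*$",
    re.IGNORECASE,
)


def split_display(query):
    """Return (plain_sql, display_spec); display_spec is None if no DISPLAY clause."""
    match = DISPLAY_RE.search(query)
    if match is None:
        return query.strip().rstrip(";"), None
    plain_sql = query[:match.start()].strip()
    spec = {"column": match.group("column"), "chart": match.group("chart").lower()}
    return plain_sql, spec


def run_with_types(conn, plain_sql):
    """Run the plain SQL and report the Python types seen in each result column."""
    cursor = conn.execute(plain_sql)
    names = [description[0] for description in cursor.description]
    rows = cursor.fetchall()
    types = {}
    for i, name in enumerate(names):
        seen = {type(row[i]).__name__ for row in rows if row[i] is not None}
        types[name] = sorted(seen)
    return rows, types


# Usage with a toy in-memory table (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (age INTEGER, country TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [(19, "US"), (19, "US"), (21, "US"), (20, "DE")],
)
query = (
    "SELECT count(age) AS age_distrib FROM students "
    "WHERE country = 'US' GROUP BY age DISPLAY age_distrib AS histogram;"
)
plain_sql, spec = split_display(query)
rows, types = run_with_types(conn, plain_sql)
```

Inspecting the returned rows is only one way to learn the result types. A real implementation would more likely consult the schema of the data source.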

2. Retrieving Data from the Canvas Learning Platform


Prerequisites: SQL, Python.


Stanford is switching to the Canvas learning management system (LMS). An LMS manages student enrollments, course material delivery, and grade records. The system also tracks interactions of learners with the learning material. These interactions can tell us what works and what doesn't in online courses. This project will retrieve those data from Canvas and turn them into a form suitable for analytics. We will also examine how to augment this platform with novel education tools, such as simulations.

The first step will be to install a toy instance of the LMS on some computer, and learn how it works. Of interest will be how courses are created, and where the log records of students interacting with the LMS are stored.
We will work with technical people in the Graduate School of Business who are running an instance of Canvas, and therefore have experience with its operation. We will also work with Canvas creator Instructure to ensure that we use the best paths to the interaction data.
We already store a database of about one billion learner interactions from three other learning platforms from which Stanford offers its MOOCs. The data from Canvas will join that existing data.
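As a rough sketch of what "a form suitable for analytics" could mean, the code below flattens JSON log lines into one relational table that also records the source platform, so that Canvas rows could sit next to rows from the other platforms. The record layout, the field names, and the table schema are all hypothetical. Finding out how Canvas actually stores its interaction logs is part of the project.

```python
import json
import sqlite3

# Hypothetical raw log lines. The real Canvas record layout is an assumption here.
RAW_LINES = [
    '{"user_id": "u1", "course_id": "c1", "action": "video_play", "timestamp": "T1"}',
    '{"user_id": "u2", "course_id": "c1", "action": "video_play", "timestamp": "T2"}',
    '{"user_id": "u1", "course_id": "c1", "action": "forum_post", "timestamp": "T3"}',
]


def to_row(line, platform="canvas"):
    """Turn one JSON log line into a flat tuple for the interactions table."""
    event = json.loads(line)
    return (
        platform,
        event["user_id"],
        event["course_id"],
        event["action"],
        event["timestamp"],
    )


def load_events(conn, lines, platform="canvas"):
    """Create the (hypothetical) interactions table if needed and append the events."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS interactions "
        "(platform TEXT, learner TEXT, course TEXT, action TEXT, time TEXT)"
    )
    conn.executemany(
        "INSERT INTO interactions VALUES (?, ?, ?, ?, ?)",
        [to_row(line, platform) for line in lines],
    )
    conn.commit()


# Usage: load the toy events, then count interactions per action type.
conn = sqlite3.connect(":memory:")
load_events(conn, RAW_LINES)
counts = dict(
    conn.execute("SELECT action, count(*) FROM interactions GROUP BY action")
)
```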

Specific Projects

3. Automatic Study Guides for MOOCs

Prerequisites: Python, CS221N and/or CS229.

Given video closed caption files of instructional videos, student forum posts, the Web, and derived resources, create personal study guides for students. (Independent study)


We will take the view that many online courses will be modular, like Stanford's self-paced database course. We will extract word clusters from closed caption files of course videos to identify topics. We will then attach learning resources to each topic. Resources are relevant course forum question/answer pairs, video snippets, Wikipedia search results, and student-identified entities. We will use these resources to automatically create study guides and learning hints.
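To make the topic extraction step concrete, here is a small sketch that ranks the most distinctive words in each video's caption text using TF-IDF. This is a simple stand-in for real word clustering, not the method the project will necessarily use. The stopword list, the function names, and the sample captions are assumptions for illustration.

```python
import math
import re
from collections import Counter

# A tiny, assumed stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "we", "that",
             "this", "for", "on", "with"}


def tokenize(text):
    """Lowercase the text and keep alphabetic words that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def top_terms(captions, k=3):
    """Map each video id to its k highest-scoring TF-IDF terms.

    captions: dict of video id -> caption text.
    """
    counts = {vid: Counter(tokenize(text)) for vid, text in captions.items()}
    n_docs = len(counts)
    doc_freq = Counter()
    for term_counts in counts.values():
        doc_freq.update(term_counts.keys())
    result = {}
    for vid, term_counts in counts.items():
        total = sum(term_counts.values()) or 1
        scores = {
            word: (n / total) * math.log(n_docs / doc_freq[word])
            for word, n in term_counts.items()
        }
        # Sort by descending score; break ties alphabetically for stable output.
        ranked = sorted(scores, key=lambda word: (-scores[word], word))
        result[vid] = ranked[:k]
    return result


# Usage with two made-up caption snippets.
captions = {
    "v1": "A join combines rows. The join uses a join key.",
    "v2": "An index speeds lookup. The index is a tree index.",
}
topics = top_terms(captions, k=1)
```

Words that appear in every video get a score of zero, which is the behavior we want from a topic label: it should distinguish one module from another.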

We can obtain closed caption files for a number of Stanford's online courses. These will be the source of word clusters that each define a topic. We also have a half billion individual 'events' of learners interacting with Stanford's open online courses. Events include starting or rewinding videos, forum posts, and assignment submissions. Forum posts identified by an existing poster-confusion classifier, as well as repeated incorrect assignment submissions, will serve as triggers to offer topic- and student-specific study resources. We will need to identify those resources automatically.
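The trigger logic described above might look roughly like the sketch below. It assumes events arrive in time order as dictionaries, that forum posts already carry a `confused` flag produced by the classifier, and that two incorrect submissions in a row count as "repeated". The field names and the threshold are hypothetical.

```python
from collections import defaultdict


def find_triggers(events, max_wrong=2):
    """Scan time-ordered events and return (learner, reason, item) triggers."""
    wrong_streak = defaultdict(int)
    triggers = []
    for event in events:
        learner = event["learner"]
        if event["type"] == "forum_post" and event.get("confused", False):
            # The flag is assumed to come from the existing confusion classifier.
            triggers.append((learner, "confused_post", event["topic"]))
        elif event["type"] == "submission":
            key = (learner, event["assignment"])
            if event["correct"]:
                wrong_streak[key] = 0
            else:
                wrong_streak[key] += 1
                # Fire once, when the streak first reaches the threshold.
                if wrong_streak[key] == max_wrong:
                    triggers.append((learner, "repeated_wrong", event["assignment"]))
    return triggers


# Usage with made-up events.
events = [
    {"type": "submission", "learner": "s1", "assignment": "hw1", "correct": False},
    {"type": "submission", "learner": "s1", "assignment": "hw1", "correct": False},
    {"type": "forum_post", "learner": "s2", "topic": "joins", "confused": True},
    {"type": "submission", "learner": "s1", "assignment": "hw1", "correct": False},
    {"type": "submission", "learner": "s2", "assignment": "hw1", "correct": True},
]
triggers = find_triggers(events)
```

Each trigger names the learner and the item involved, which is the information needed to look up topic- and student-specific resources.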

Specific Projects

4. Authoring Tools for Interactive Simulations

Prerequisites: JavaScript. Python a plus. D3 extra plus.


We want to make it easy for instructors to build small, interactive simulations: simulets. Subject matters could be physics, electrical engineering, chemistry, music, literature, or any other area of interest. The results will be made available to learners in Stanford's online courses, but the goal is that others will build simulets themselves. We will want to design a number of diverse examples to identify widely useful building blocks.

Simulets may be self-contained JavaScript that runs entirely in the browser. Alternatively, we are constructing a software bus, called the SchoolBus, which allows simulets to draw on powerful backend computational resources or to query data sources. The figure shows examples of both.
The top shows a browser-based simulation that shows the connection between regression error lines, several error functions, and the corresponding two-dimensional search surface. All views are linked.
The picture below shows an engine with numbers on it. Clicking on a part sends a message to the Wikipedia service, which in turn contacts Wikipedia for an answer. The result travels back through the bus to the Web client.

We have both of these examples running, but the point of this project is that constructing such simulets should be easy.
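The round trip in the engine example (click, request to a service, reply to the client) can be illustrated with a toy publish/subscribe bus. This is an in-process sketch only and does not reflect the actual SchoolBus interface. The class, the topic names, and the message fields are hypothetical, and a stub dictionary replaces the real Wikipedia lookup.

```python
from collections import defaultdict


class ToyBus:
    """A minimal in-process publish/subscribe bus (not the real SchoolBus)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # Copy the list so handlers may subscribe while we iterate.
        for handler in list(self._subscribers[topic]):
            handler(message)


def make_lookup_service(bus, lookup):
    """Attach a service that answers 'lookup_request' messages via lookup(part)."""

    def on_request(message):
        answer = lookup(message["part"])
        bus.publish(message["reply_to"], {"part": message["part"], "answer": answer})

    bus.subscribe("lookup_request", on_request)


# Usage: a stub knowledge source stands in for the Wikipedia service.
stub_articles = {"piston": "A piston is a moving component of an engine."}
bus = ToyBus()
make_lookup_service(bus, lambda part: stub_articles.get(part, "No entry found."))

received = []
bus.subscribe("client_1", received.append)  # the Web client's reply channel
bus.publish("lookup_request", {"part": "piston", "reply_to": "client_1"})
```

The client never calls the service directly. It only knows the topic names, which is what would let backend services be added or replaced without changing the simulet.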

Specific Projects

5. Teaching Choreography Online

Prerequisites: Python. Experience with either HCI or distributed systems.


We will develop infrastructure and (with help) pedagogy for teaching choreography entirely online. Choreography is the activity of designing dances. Geographically distant students will be able to work on dance design exercises together. The 'performers' will be avatars of any shape. They will operate in a 3D robotics simulation environment. Students will continuously be able to observe their teammates' work.

We will try to use Gazebo, an existing high-fidelity robot simulation environment. The software was developed for the DARPA robotics challenges, and can take into account mass distributions of simulated avatars. Gazebo is by nature distributed, but we may additionally need to use a high-function distributed messaging system.
Three main elements are involved in this work: development of a Web-based UI for easily manipulating avatars, distributed messaging that allows distant Gazebo instances to be coupled, and some choreography pedagogy. We will consult with a professional choreographer.

Specific Projects

Andreas Paepcke
Home Page