Independent Study Opportunities
1. Specify massive-data visualizations as part of SQL
D3 and Google visualizations are the modern tools for
delivering visualizations on the Web. But creating the
charts or creative new visualizations is procedural and
hard. While we cannot make the creation of entirely new
visualizations easy, we can make it much easier to connect
data sources to the visualizations that make the data
understandable.
We will make visualizations declarative by extending SQL. The
same query that generates the data result specifies the intended
visualization. A compilation process then issues the query to the data
source, creates the visualization frame, and connects that frame to
the result stream. Here is one possible SQL extension grammar:
SELECT count(age) AS age_distrib
WHERE country = 'US'
GROUP BY age
DISPLAY age_distrib AS histogram;
We will not need to write a full SQL parser. But our software will
need to figure out data types to expect from SQL results.
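As a rough sketch of why a full parser is unnecessary, the trailing DISPLAY clause could be peeled off with a regular expression and the remaining text passed to the data source unchanged. The clause shape (DISPLAY column AS chart type) is taken from the example above. The names split_display_clause and infer_column_types are hypothetical, and guessing types from the first non-NULL value in each column is only one assumed way to learn what the result stream contains. The sample query is copied as written above, including its omitted FROM clause.

```python
import re

# Assumed clause shape: DISPLAY <column> AS <chart_type>, at the end of the query.
DISPLAY_CLAUSE = re.compile(
    r"\bDISPLAY\s+(?P<column>\w+)\s+AS\s+(?P<chart>\w+)\s*;?\s*$",
    re.IGNORECASE,
)


def split_display_clause(extended_sql):
    """Return (plain_sql, column, chart); column and chart are None without a clause."""
    match = DISPLAY_CLAUSE.search(extended_sql)
    if match is None:
        return extended_sql.strip(), None, None
    plain_sql = extended_sql[:match.start()].rstrip() + ";"
    return plain_sql, match.group("column"), match.group("chart").lower()


def infer_column_types(rows):
    """Guess one Python type name per column from its first non-NULL value."""
    types = []
    for column_values in zip(*rows):
        first = next((v for v in column_values if v is not None), None)
        types.append(type(first).__name__ if first is not None else "unknown")
    return types


query = """SELECT count(age) AS age_distrib
WHERE country = 'US'
GROUP BY age
DISPLAY age_distrib AS histogram;"""

plain_sql, column, chart = split_display_clause(query)
```

Here plain_sql holds the ordinary query to send to the data source, while column and chart tell the compilation step which result column feeds which visualization frame.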
2. Retrieving Data from the Canvas Learning Platform
Prerequisites: SQL, Python.
Stanford is switching to the Canvas
learning management system
(LMS). An LMS manages student
enrollments, course material delivery, and grade records. The
system also tracks interactions of learners with the learning
material. These interactions can tell us what works and
what doesn't in online courses. This project will
retrieve those data from Canvas, and turn them into a form
suitable for analytics. We will also examine how to augment
this platform with novel education tools, such as simulations.
The first step will be to install a toy instance of the LMS
on some computer, and learn how it works. Of interest will
be how courses are created, and where the log records of
students interacting with the LMS are stored.
We will work with technical people in the Graduate School
of Business who are running an instance of Canvas, and
therefore have experience with its operation. We will also
work with Canvas creator Instructure to ensure that we use
the best paths to the interaction data.
We already store a database of about one billion learner
interactions from three other learning platforms from which
Stanford offers its MOOCs. The data from Canvas will join
that existing data.
- Understand how Canvas exports its data
We will get together with the GSB personnel to learn the
data layout. From there we will find the schemas that
will create a best fit with the existing learning
data. The plan is to enable queries that reach across
data from several learning platforms.
- Build the bridge to Canvas
We will write some Python code that will retrieve Canvas
interaction data, and store it on a data analytics
- Create data analytics dashboards
Once the data bridge is built, you can go to town
writing visualizations both for instructors and students.
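One way to picture the storage side of the bridge is a single table that holds events from every platform, so that one query can reach across them. The sketch below uses SQLite only so that it runs anywhere. The table layout, the field names, the platform labels, and the sample events are all invented for illustration; the real schema depends on the Canvas data layout, and the retrieval step itself is left out because the export path is not yet known.

```python
import sqlite3

# Hypothetical shared schema; the real one depends on the Canvas data layout.
SCHEMA = """
CREATE TABLE IF NOT EXISTS interactions (
    platform   TEXT NOT NULL,
    learner_id TEXT NOT NULL,
    event_type TEXT NOT NULL,
    event_time TEXT NOT NULL
)
"""


def store_events(conn, platform, events):
    """Insert already-retrieved event dicts into the shared table."""
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO interactions VALUES (?, ?, ?, ?)",
        [(platform, e["learner_id"], e["event_type"], e["event_time"])
         for e in events],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")

# Invented sample events standing in for retrieved data.
store_events(conn, "canvas", [
    {"learner_id": "s1", "event_type": "page_view",
     "event_time": "2015-01-05T10:00:00"},
])
store_events(conn, "other_platform", [
    {"learner_id": "s1", "event_type": "video_play",
     "event_time": "2015-01-05T10:05:00"},
    {"learner_id": "s2", "event_type": "video_play",
     "event_time": "2015-01-05T10:06:00"},
])

# One query that reaches across platforms.
counts = conn.execute(
    "SELECT platform, COUNT(*) FROM interactions "
    "GROUP BY platform ORDER BY platform"
).fetchall()
```

Keeping the platform name as a column, rather than one table per platform, is a design choice made here for brevity; it makes cross-platform queries a plain GROUP BY.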
3. Automatic Study Guides for MOOCs
Given video closed caption files of instructional videos,
student forum posts, the Web, and derived resources, create
personal study guides for students. (Independent study)
We will take the view that many online courses
will be modular. We will extract word clusters from
closed caption files of course videos to identify topics. We
will then attach learning resources to each topic. Resources
are relevant course forum question/answer pairs, video
snippets, Wikipedia search results, and student-identified
entities. We will use these resources to automatically
create study guides and learning hints.
We can obtain closed caption files for a number of
Stanford's online courses. These will be the source of word
clusters that each define a topic. We also have a half
billion individual 'events' of learners interacting with
Stanford's open online courses. Events include starting or
rewinding videos, forum posts, and assignment
submissions. Forum posts identified by an existing
poster-confusion classifier, as well as repeated incorrect
assignment submissions, will serve as triggers to offer
topic- and student-specific study resources. We will need to
identify those resources automatically.
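The repeated-incorrect-submission trigger might look roughly like the sketch below. The event fields (student, assignment, correct), the function name confusion_triggers, and the threshold of three wrong attempts are assumptions made for illustration, not properties of the existing event data.

```python
from collections import Counter


def confusion_triggers(submissions, threshold=3):
    """Return (student, assignment) pairs with at least `threshold` wrong tries."""
    wrong = Counter(
        (s["student"], s["assignment"])
        for s in submissions
        if not s["correct"]
    )
    return sorted(pair for pair, count in wrong.items() if count >= threshold)


# Invented sample events.
sample = [
    {"student": "s1", "assignment": "hw1", "correct": False},
    {"student": "s1", "assignment": "hw1", "correct": False},
    {"student": "s1", "assignment": "hw1", "correct": False},
    {"student": "s2", "assignment": "hw1", "correct": False},
    {"student": "s2", "assignment": "hw1", "correct": True},
]

triggers = confusion_triggers(sample)
```

Each returned pair would then be matched against the topic that the assignment belongs to in order to pick study resources.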
- Partition all video text into clusters.
- Identify 'good' question/answer
pairs among course forum pairs.
- Create UI for attaching help
resources to topic clusters.
- Given the topic clusters, identify
relevant Web based resources.
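To make the first task concrete, here is a deliberately simple stand-in for clustering caption text: each caption segment is reduced to a set of content words, and segments are grouped greedily by Jaccard word overlap. This is not the method the project will necessarily use. The stopword list, the 0.15 threshold, the function names, and the sample captions are all assumptions chosen to keep the sketch short and runnable without extra libraries.

```python
import re

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "is", "in",
             "we", "this", "that", "for"}


def content_words(text):
    """Lowercase the text and keep alphabetic words that are not stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_segments(segments, threshold=0.15):
    """Greedily assign each segment to its most similar cluster, or start a new one."""
    clusters = []  # each cluster: {"words": set of words, "segments": list of indices}
    for index, text in enumerate(segments):
        words = content_words(text)
        best = max(clusters, key=lambda c: jaccard(words, c["words"]), default=None)
        if best is not None and jaccard(words, best["words"]) >= threshold:
            best["segments"].append(index)
            best["words"] |= words
        else:
            clusters.append({"words": set(words), "segments": [index]})
    return clusters


# Invented caption segments.
captions = [
    "Gradient descent minimizes the loss function",
    "The loss function measures prediction error",
    "A hash table maps keys to buckets",
    "Hash collisions put two keys in one bucket",
]

clusters = cluster_segments(captions)
```

The word set of each cluster doubles as a crude topic label, which is where forum question/answer pairs and Web resources could later be attached.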
4. Authoring Tools for Interactive Simulations
plus. D3 extra plus.
We want to make it easy for instructors to build small,
interactive simulations: simulets. Subject matters
could be physics, electrical engineering, chemistry, music,
literature, or any other area of interest. The results will
be made available to learners in Stanford's online courses,
but the goal is that others will build simulets themselves.
We will want to design a number of diverse examples to identify
widely useful building blocks.
on the browser. Alternatively, we are constructing a software
bus, called the SchoolBus, which allows us to build powerful
backend computational resources, or to query data
sources. The figure shows examples of both.
The top shows a browser-based simulation that shows the
connection between regression error lines, several error
functions, and the corresponding two dimensional search
surface. All views are linked.
The picture below shows an engine with numbers on it.
Clicking on a part sends a message to the Wikipedia service,
which in turn contacts Wikipedia for an answer. The result
travels back through the bus to the Web client.
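The round trip just described can be sketched with a tiny in-process publish/subscribe bus. This is not the SchoolBus API: the MessageBus class, the topic names, and the make_lookup_service helper are hypothetical, and the Wikipedia call is replaced by a stub so the sketch runs offline. The part name is an invented example.

```python
from collections import defaultdict


class MessageBus:
    """In-process stand-in for a software bus: topics map to subscriber callbacks."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, payload):
        # Copy the list so callbacks may subscribe during delivery.
        for callback in list(self._subscribers[topic]):
            callback(payload)


def make_lookup_service(bus, lookup):
    """Answer each 'lookup.request' message with a 'lookup.response' message."""
    def handle(request):
        bus.publish(
            "lookup.response",
            {"part": request["part"], "summary": lookup(request["part"])},
        )
    bus.subscribe("lookup.request", handle)


bus = MessageBus()

# Stub in place of a real Wikipedia call.
make_lookup_service(bus, lambda part: f"(summary for {part})")

received = []
bus.subscribe("lookup.response", received.append)   # the Web client's side
bus.publish("lookup.request", {"part": "piston"})   # a click on a numbered part
```

Because the client and the service only share topic names, either side can be replaced, for example by a more powerful backend, without changing the other.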
We have both of these examples running. The goal is to make
constructing such simulets easy.
- Identify worthy simulet examples.
- Construct several prototypes.
- Identify useful building blocks.
- Build the authoring tool.
5. Teaching Choreography Online
Prerequisites: Python. Experience
with either HCI or distributed systems.
We will develop infrastructure and (with help) pedagogy for
teaching choreography entirely online. Choreography is the
activity of designing dances. Geographically distant
students will be able to work on dance design exercises
together. The 'performers' will be avatars of any
shape. They will operate in a 3D robotics simulation
environment. Students will continuously be able to observe
their teammates' work.
We will try to use
Gazebo, an existing high fidelity robot simulation
environment. The software was developed for the DARPA
robotics challenges, and can take into account mass
distributions of simulated avatars. Gazebo is by nature
distributed, but we may additionally need to use a
high-function distributed messaging system.
Three main elements are involved in this work: development
of a Web-based UI for easily manipulating avatars,
distributed messaging for coupling distant Gazebo instances,
and some choreography pedagogy. We will
consult with a professional choreographer.
- Develop cheap, efficient motion entry methods.
- Ensure performance for the
- Build prototypes.