Explore what vector space embeddings of courses taken by
students can reveal about pathways through college.
Prerequisites: CS224W preferred; OK to take simultaneously. Alternatively: CS224N, CS246. Light-weight SQL familiarity, knowledge of PyTorch/TensorFlow a plus.
We will use historical enrollment data, and possibly other
data. Vector embeddings of courses taken by past students are
already available, and exploring the corresponding clusters
is a first step. We then plan to generate a network of course
sequences and their frequencies, and to apply network
analytics to these structures. Our hope is to find
course-taking patterns, and instances of unusual, innovative
course choice behaviors.
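As a rough illustration of what a network of course sequences and their frequencies could look like, the sketch below counts how often one course is taken in the term directly before another. The data layout (a dict of per-student term lists), the function name transition_counts, and the toy enrollment records are all assumptions made for this example; they do not describe the actual enrollment data.

```python
from collections import Counter


def transition_counts(histories):
    """Count directed course-to-course transitions between adjacent terms.

    `histories` maps a student id to a chronological list of terms, where
    each term is a list of course codes. This layout is hypothetical.
    """
    counts = Counter()
    for terms in histories.values():
        # Pair each term with the term that follows it.
        for earlier, later in zip(terms, terms[1:]):
            for a in earlier:
                for b in later:
                    counts[(a, b)] += 1
    return counts


# Toy records, invented for illustration only.
histories = {
    "s1": [["CS106A"], ["CS106B"], ["CS107"]],
    "s2": [["CS106A", "MATH51"], ["CS106B"]],
}

edges = transition_counts(histories)
for (src, dst), freq in sorted(edges.items()):
    print(f"{src} -> {dst}: {freq}")
```

The resulting weighted edge list could then be loaded into a graph library for network analytics; the weights are the sequence frequencies.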
Scatter plot of course enrollments since 2000.
Each color corresponds to a discipline:
engineering, law, H&S, etc.
Prerequisites: Any neural networks course.
We will use analytic tools, such as neural networks, to derive every department's master's degree requirements at Stanford, and develop a unified form to describe them. Using eighteen years of enrollment history, and the majors of the respective students, we hope to learn the sometimes widely branching alternative paths to the degrees. We will also attempt to describe the undergraduate requirements from observed data, and express them in the form we develop.
Example descriptions of master's
degrees in CS and English.
We will use vector embeddings of course choices to compute the intellectual spread of student choices. Student choices are motivated by requirements and background. Given these embeddings, we will analyze how the resulting per-year distributions of spread have changed during the past n years.
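One possible way to make "spread" concrete is the mean pairwise cosine distance between the embeddings of the courses a student chose. This is only a sketch of one candidate measure, not the project's definition; the function names and the two-dimensional toy vectors are assumptions, and the code assumes no embedding is the zero vector.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def spread(vectors):
    """Mean pairwise cosine distance; 0.0 when fewer than two vectors."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return 0.0
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


# Toy embeddings, invented for illustration only.
narrow = [[1.0, 0.0], [1.0, 0.0]]  # two courses pointing the same way
broad = [[1.0, 0.0], [0.0, 1.0]]   # two orthogonal courses

print(spread(narrow), spread(broad))
```

Computing this value per student and grouping by year would give the per-year distributions mentioned above.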
Deploy NLP on course evaluation answers to the question "What would you like to say about this course to a student who is considering taking it in the future?"
Prerequisites: Some NLP class.
The first idea that comes to mind when applying NLP to opinions tends to be 'sentiment analysis.' We can of course run such techniques over evaluations, particularly because the domain of discourse is narrow: the content is always about Stanford courses.
But more interesting will be the subtler gems. Hints such as "Definitely do the reading every week." Or "Problem sets are only every other week." Or "Find your project partner early, because you will need all the time you can get for completing the project." These hints will be harder to isolate, but could be extremely useful as a potential addition to Carta one day.
Given video closed caption files of instructional videos, student forum posts, the Web, and derived resources, create personal study guides for students.
We can obtain closed caption files for a number of Stanford's online courses. These will be the source of word clusters that each define a topic. We also have a half billion individual 'events' of learners interacting with Stanford's open online courses. Events include starting or rewinding videos, forum posts, and assignment submissions. Forum posts identified by an existing poster-confusion classifier, as well as repeated incorrect assignment submissions, will serve as triggers to offer topic- and student-specific study resources. We will need to identify those resources automatically.
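The trigger on repeated incorrect assignment submissions might be sketched as follows. The event fields (learner, assignment, correct), the threshold of three, and the choice to count consecutive rather than total incorrect attempts are all assumptions for illustration; the actual event schema is not described here.

```python
from collections import defaultdict


def repeated_failures(events, threshold=3):
    """Return (learner, assignment) pairs with `threshold` or more
    consecutive incorrect submissions.

    `events` is a chronological list of dicts with the hypothetical
    keys "learner", "assignment", and "correct".
    """
    streak = defaultdict(int)
    triggered = set()
    for event in events:
        key = (event["learner"], event["assignment"])
        if event["correct"]:
            streak[key] = 0  # a correct answer resets the streak
        else:
            streak[key] += 1
            if streak[key] >= threshold:
                triggered.add(key)
    return triggered


# Toy events, invented for illustration only.
events = [
    {"learner": "l1", "assignment": "hw1", "correct": False},
    {"learner": "l2", "assignment": "hw1", "correct": False},
    {"learner": "l1", "assignment": "hw1", "correct": False},
    {"learner": "l2", "assignment": "hw1", "correct": True},
    {"learner": "l1", "assignment": "hw1", "correct": False},
    {"learner": "l2", "assignment": "hw1", "correct": False},
]

triggers = repeated_failures(events)
print(triggers)
```

Each triggered pair would then be matched against the topic clusters to select study resources for that learner.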
We will try to use
Gazebo, an existing high fidelity robot simulation
environment. The software was developed for the DARPA
robotics challenges, and can take into account mass
distributions of simulated avatars. Gazebo is by nature
distributed, but we may additionally need to use a
high-function distributed messaging system.
Three main elements are involved in this work: development of a Web-based UI for easily manipulating avatars, distributed messaging for allowing distant Gazebo instances to be coupled, and some choreography pedagogy. We will consult with a professional choreographer.