Independent Study Opportunities

with Andreas Paepcke

Soundscapes for Ecology Monitoring

Train algorithms to interpret audio recordings of ecosystems.

  •  Prerequisites: Practical experience with CNNs and RNNs through any of the applicable classes.

We will start with data sets from the bird audio detection challenge and classify 10-second periods as present or absent for particular bird species. We will then use 24/7 audio recordings from Stanford's Jasper Ridge Biological Preserve, and possibly from African sources. The plan is to use two approaches, and likely to combine them eventually.
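
As a rough illustration of the present/absent framing, the sketch below turns a continuous recording into per-window labels. It assumes a hypothetical input format that the project has not specified: a list of (start, end) times, in seconds, of annotated vocalizations for one species. The function name and the overlap rule are illustrative choices, not part of the project's code.

```python
from typing import List, Tuple


def window_labels(
    duration_sec: float,
    calls: List[Tuple[float, float]],
    window_sec: float = 10.0,
) -> List[int]:
    """Return 1 (present) or 0 (absent) for each full window of a recording.

    `calls` holds hypothetical (start, end) annotation times in seconds for
    one species. A window counts as 'present' if any call overlaps it.
    Any trailing partial window is dropped.
    """
    n_windows = int(duration_sec // window_sec)
    labels = []
    for i in range(n_windows):
        w_start, w_end = i * window_sec, (i + 1) * window_sec
        present = any(start < w_end and end > w_start for start, end in calls)
        labels.append(1 if present else 0)
    return labels


# A 45-second recording with two annotated calls (made-up times).
print(window_labels(45.0, [(3.0, 4.5), (19.0, 21.0)]))  # [1, 1, 1, 0]
```

The second call straddles the 20-second boundary, so it marks both the second and the third window as present. Labels of this kind could then serve as training targets for a CNN or RNN operating on each window.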


Audio recordings of ecosystems offer an inexpensive method for monitoring changes in the presence of animals and annual environmental rhythms. In fact, groups of species are specialized to use particular parts of the audible and inaudible spectra. These recordings must be analyzed by algorithms, either to detect unusual patterns or to monitor particular species.

Envisioned architecture for analyzing soundscapes. Other options are wide open. 

Which Courses, and Why?

Explore what vector space embeddings of courses taken by students can reveal about pathways through college.

  • Prerequisites: CS224W preferred; OK to take simultaneously. Alternatively: CS224N, CS246. Light-weight SQL familiarity, knowledge of PyTorch/TensorFlow a plus.

We will use historic enrollment data, and possibly other data. Vector embeddings of courses taken by past students are already available, and exploring the corresponding clusters is a first step. We also plan to generate a network of course sequences and their frequencies, and then apply network analytics to these structures. Our hope is to find course-taking patterns and instances of unusual, innovative course-choice behavior.
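
One way to picture the network of course sequences is as a weighted, directed edge list: an edge from course A to course B counts how often B was taken in the term right after A. The sketch below is a minimal version under assumed inputs. The per-student list of terms, the student ids, and the course names are all made up for illustration and do not reflect the actual enrollment schema.

```python
from collections import Counter
from typing import Dict, List, Tuple


def transition_counts(histories: Dict[str, List[List[str]]]) -> Counter:
    """Count (course, next-term course) pairs across all students.

    `histories` maps a hypothetical student id to that student's terms in
    chronological order; each term is a list of course names.
    """
    counts: Counter = Counter()
    for terms in histories.values():
        for current, following in zip(terms, terms[1:]):
            for a in current:
                for b in following:
                    counts[(a, b)] += 1
    return counts


# Toy data, invented for this example only.
histories = {
    "s1": [["COURSE_A"], ["COURSE_B"], ["COURSE_C"]],
    "s2": [["COURSE_A"], ["COURSE_B", "COURSE_D"]],
}
edges = transition_counts(histories)
print(edges.most_common(1))  # [(('COURSE_A', 'COURSE_B'), 2)]
```

The resulting edge weights could be loaded into a graph library or into Cytoscape for network analysis. Rare edges are candidates for the unusual course choices mentioned above.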

A partially finished application based on Cytoscape was created by students, and there is plenty to do towards making that tool enormously useful (see bottom of figure).


If we understand how students make choices on their way through college, we can improve decision support for these important pathways. Courses chosen today determine which other courses become options down the road, and which remain closed for lack of prerequisite knowledge. Overly narrow course choices leave on the table important contributions that college can make to students' lives. A first goal is to understand what Stanford students have chosen over the past 18 years. That understanding can inform both future students and university policy.

A number of novel charts derived from enrollment data.

The Gist of Course Evaluations

Deploy NLP on course evaluation answers to the question "What would you like to say about this course to a student who is considering taking it in the future?"

  • Prerequisites: Some NLP class.

The first thought when thinking of applying NLP to opinions tends to be 'sentiment analysis.' We can of course run such techniques over evaluations, particularly because the domain of discourse is narrow: The content is always about Stanford courses. But more interesting will be the subtler gems. Hints such as "Definitely do the reading every week." Or "Problem sets are only every other week." Or "Find your project partner early, because you will need all the time you can get for completing the project." These hints will be harder to isolate, but could be extremely useful as a potential addition to Carta one day.
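
To make the idea of isolating hints concrete, here is a deliberately naive baseline: split each evaluation into sentences and keep those that match a few hand-written cue patterns. This is not the project's method, and the cue list is a hypothetical starting point. A real system would more likely learn such patterns from labeled examples.

```python
import re
from typing import List

# Hypothetical cue patterns, chosen by hand for illustration only.
ADVICE_CUES = re.compile(
    r"^(definitely|always|never|make sure|be sure|find|start|do|go to|take)\b"
    r"|\b(every (other )?week|weekly|office hours|problem sets?|project partner)\b",
    re.IGNORECASE,
)


def split_sentences(text: str) -> List[str]:
    """Split on whitespace that follows sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def advice_like(text: str) -> List[str]:
    """Return the sentences that match at least one advice cue."""
    return [s for s in split_sentences(text) if ADVICE_CUES.search(s)]


# The first and last sentences are invented filler around two quoted hints.
review = (
    "Great class overall. Definitely do the reading every week. "
    "Problem sets are only every other week. I loved the professor."
)
for sentence in advice_like(review):
    print(sentence)
```

On this input the filter keeps the two hint sentences and drops the general praise. Its obvious blind spots, such as advice phrased without any listed cue, are the reason the harder NLP work is needed.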

We have started this project; there is a code base, numbers to crunch, and several experiments to accomplish.


When students use Carta, they often glean information from the textual course evaluation part. We aim to extract salient course information from the text. If successful, the results of this work may migrate into Carta to help future students. 

Determine the top 10 course reviews, for some notion of 'top'.

Predicting Sensitivity of Coral Reefs to Heat Stress

Analyze underwater photos of coral reefs to help learn their reaction to warming oceans.

  • Prerequisites: Python, CS231N.

An infrastructure has been constructed for labeling underwater photos that biologists have shot in the Pacific Ocean. We have labeled many photos, but a few more need to be done. After that effort, we are ready for machine learning. We are currently focusing on two coral guilds, aiming to train an algorithmic classifier. The central remaining work is that algorithm. If successful, our biology partners in the Hopkins Field Station will be able to transfer what they learn from their local corals to corals elsewhere in the oceans.


An existing biology project is researching the impact of artificially introduced heat stress on coral bleaching. The 400 colonies under investigation are surrounded by sand, other types of corals, and algae. Given the heat stimuli and coral response data, can we create a predictor of coral response from photos taken around the corals? For example, can we predict a colony's response to heat from surroundings of 25% sand, 30% branching corals, 35% encrusting algae, and 10% mounding corals?
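
The percentages in this example suggest one possible input representation for such a predictor: the share of each category among the labeled patches around a colony. The sketch below computes that composition and turns it into a fixed-order feature vector. The category names and the flat list of patch labels are assumptions made for illustration, not the format of the project's labeling infrastructure.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical category names, in a fixed order for the feature vector.
CATEGORIES = ["sand", "branching_coral", "encrusting_algae", "mounding_coral"]


def surroundings_composition(patch_labels: List[str]) -> Dict[str, float]:
    """Fraction of labeled patches per category around one colony."""
    if not patch_labels:
        raise ValueError("need at least one labeled patch")
    counts = Counter(patch_labels)
    total = len(patch_labels)
    return {label: n / total for label, n in counts.items()}


def to_feature_vector(composition: Dict[str, float]) -> List[float]:
    """Order the fractions by CATEGORIES; missing categories become 0.0."""
    return [composition.get(category, 0.0) for category in CATEGORIES]


# Twenty made-up patch labels that reproduce the 25/30/35/10 split above.
labels = (
    ["sand"] * 5
    + ["branching_coral"] * 6
    + ["encrusting_algae"] * 7
    + ["mounding_coral"] * 2
)
print(to_feature_vector(surroundings_composition(labels)))
```

A vector like this, paired with the measured heat response of each colony, could feed a simple regression or classification model before moving on to image-based approaches.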

The goal is to identify these species (guilds) of corals

Teaching Choreography Online

A distributed system for teaching choreography remotely.

  • Prerequisites: Python. Experience with distributed systems.

We will try to use any appropriate distributed robotics simulator, such as Gazebo, to prototype an infrastructure that enables team-based teaching of choreography remotely. Choreography is not dancing. It is the design and creation of dance steps that artists will then dance.

Using robotic creatures as artists would introduce new challenges into choreography. We could create modified gravity worlds, or animal artists with their non-human constraints. We would like students to collaborate remotely through the infrastructure as they design dances together.

The infrastructure should also enable an instructor to monitor and support students in their assignments.


We will develop infrastructure and (with help) pedagogy for teaching choreography entirely online. Choreography is the activity of designing dances. Geographically distant students will be able to work on dance design exercises together. The 'performers' will be avatars of any shape. They will operate in a 3D robotics simulation environment. Students will continuously be able to observe their teammates' work.

An online, collaborative choreography design infrastructure would provide freedom from the constraints that limit choreographers who work with human artists.