CS486: Advanced Large Language Models Research Seminar (Spring 2026)

Announcements

3.2026 Prerequisite: CS372 is strongly recommended. Students who have not taken CS372 may be admitted by exception with instructor approval, typically only if they are exceptionally strong coders and have substantial OpenClaw experience. Also, please check out this free online textbook for the required background.

3.2026 The main goal of the seminar is to mentor student teams toward one or more NeurIPS-style submissions, with the paper ready and submitted by the tentative deadline of May 11, 2026.

Course Overview

This seminar studies how scientists and mathematicians can conduct exploratory research through disciplined human-LLM collaboration, and how to build a platform that makes such collaboration trustworthy, productive, and publishable.

The course mentors students in building a platform for scientific discovery in human-LLM collaboration settings and in writing a paper for submission to NeurIPS. Teams should plan to have the paper ready and submitted by the tentative deadline of May 11, 2026. The course builds directly on ideas covered in CS372, but shifts from conceptual foundations to implementation, integration, evaluation, and research writing.

The main theme is AGI-oriented scientific discovery. We will design and implement a platform that helps scientists and mathematicians search literature, organize memory, generate hypotheses, run controlled exploration, track provenance, validate intermediate results, and produce research artifacts under human supervision. The platform itself is a research contribution. Case studies that apply the platform to mathematical or scientific problems may also lead to research papers.
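To make the ideas of provenance, validation, rollback, and multi-path exploration concrete, here is a minimal Python sketch. It is an illustration only, not the course platform's design: the names `Step` and `ExplorationLog`, the field layout, and the example claims are all hypothetical. It assumes that each step records its parent, that only an explicit sign-off marks a step as validated, and that rollback moves the working position without deleting history.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Step:
    """One recorded exploration step (hypothetical schema)."""
    step_id: int
    parent: Optional[int]    # step this one branched from; None for the root
    author: str              # e.g. "human" or "llm"
    claim: str               # hypothesis or intermediate result
    validated: bool = False  # changed only by an explicit sign-off


@dataclass
class ExplorationLog:
    """Append-only log of steps with a movable head (hypothetical design)."""
    steps: Dict[int, Step] = field(default_factory=dict)
    head: Optional[int] = None

    def record(self, author: str, claim: str) -> int:
        """Append a new step as a child of the current head."""
        step_id = len(self.steps)
        self.steps[step_id] = Step(step_id, self.head, author, claim)
        self.head = step_id
        return step_id

    def validate(self, step_id: int) -> None:
        """Mark a step as checked; the caller is assumed to be the human."""
        self.steps[step_id].validated = True

    def rollback(self, step_id: int) -> None:
        """Move the head back; abandoned steps stay stored for audit."""
        if step_id not in self.steps:
            raise KeyError(step_id)
        self.head = step_id

    def provenance(self, step_id: Optional[int] = None) -> List[Step]:
        """Return the chain of steps from the root to the given step."""
        chain = []
        cur = self.head if step_id is None else step_id
        while cur is not None:
            chain.append(self.steps[cur])
            cur = self.steps[cur].parent
        return list(reversed(chain))


# Usage: explore one branch, roll back, then explore a second branch.
log = ExplorationLog()
root = log.record("human", "Problem statement")
branch_a = log.record("llm", "Conjecture A")
log.validate(root)       # the human signs off on the problem statement
log.rollback(root)       # abandon branch A without erasing it
branch_b = log.record("llm", "Conjecture B")

for step in log.provenance():
    print(step.step_id, step.author, step.claim, step.validated)
```

Keeping abandoned branches in the log, rather than deleting them on rollback, is one possible way to preserve an audit trail across several exploration paths.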

The course is motivated in part by recent remarks from Terence Tao, who has argued that AI is becoming a practical research assistant in mathematics and theoretical physics, especially for literature search, coding, calculation, and rapid exploration of candidate ideas, while the human remains responsible for selecting problems, designing workflows, and verifying correctness.

Core Questions

What are the essential components of a trustworthy human-LLM research platform?
How should such a system support memory, provenance, validation, rollback, and multi-path exploration?
How do we keep the human in control of significance, correctness, and scientific judgment?
How can we evaluate whether a human-LLM platform truly improves exploratory scientific work?
How do we turn both the platform and its usage into publishable research papers?

What the Course Examines

Human-LLM collaboration protocols for scientific and mathematical discovery
Platform design for memory, validation, provenance, and research orchestration
Exploratory workflows for conjecture generation, branching search, and verification
Failure modes such as hallucination, sycophancy, context drift, and brittle overconfidence
Evaluation of scientific usefulness, controllability, and reproducibility
Paper writing and revision discipline for NeurIPS-style submissions

Expected Deliverables

A platform module or subsystem
An integrated prototype by mid-May
A submission-quality paper on the platform and/or its use
A final presentation and demonstration

Tentative Grading

Weekly milestone reports and participation: 20%
Engineering contribution to the platform: 30%
Paper draft quality and revision discipline: 25%
Final presentation and demo: 15%
Final report or submission-ready paper: 10%
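
As a worked example of how the tentative weights above could combine, here is a short Python sketch. The weights come from the list above. The component key names, the 0-100 scoring scale, the linear aggregation, and the sample scores are assumptions made for illustration; the instructor's actual grading procedure may differ.

```python
# Tentative weights in percent, taken from the grading list above.
WEIGHTS = {
    "milestones_and_participation": 20,
    "engineering_contribution": 30,
    "paper_draft": 25,
    "presentation_and_demo": 15,
    "final_report": 10,
}

# The five components should account for the whole grade.
assert sum(WEIGHTS.values()) == 100


def course_score(scores):
    """Weighted average of component scores, each on a 0-100 scale."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly the graded components")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical component scores for one student.
example = {
    "milestones_and_participation": 90,
    "engineering_contribution": 80,
    "paper_draft": 85,
    "presentation_and_demo": 95,
    "final_report": 70,
}
print(course_score(example))  # prints 84.5
```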

Resources

Textbooks

Multi-LLM Collaborative Intelligence (MACI), The Path to AGI Volume 1, ACM Books, Edward Y. Chang.

System-2 Reasoning: From Semantic Anchoring to Causal Intelligence, The Path to AGI Volume 2, Socrasynth, Edward Y. Chang.

Anchor Direction

The seminar is centered on building a platform for exploratory research in human-LLM collaboration settings, motivated by recent work on mathematical exploration and broader questions about AI-assisted scientific discovery.

Meeting Structure

Monday: main seminar meeting, lecture, design review, milestone planning, and paper discussion.

Wednesday: research clinic, implementation support, office hours, debugging, and team check-ins.

Schedule

The schedule below is a draft and will be refined.

#	Date	Topic	Focus	Milestone
1	3/30/2026	Kickoff: Course aims, platform vision, and team formation	Scientific discovery as human-LLM collaboration; course roadmap; paper targets	Team formation begins
2	4/6/2026	Systems #1: Requirements for a scientific discovery platform	User stories, provenance, memory, validation, rollback, trust	2-page design brief
3	4/13/2026	Systems #2: Search, synthesis, and research memory	Literature workflows, persistent state, versioning, state management	Module ownership and interface spec
4	4/20/2026	Systems #3: Conjecture generation, branching exploration, and verification	Multi-path reasoning, validator roles, computational checks, audit trails	Prototype checkpoint #1
5	4/27/2026	Paper #1: Human-in-the-loop research workflows	Moderator roles, refusal, escalation, significance judgments, writing plan	Evaluation plan and paper outline
6	5/4/2026	Build #1: Integrated prototype and internal review	End-to-end prototype, first case studies, debugging, interface refinement	Integrated prototype v1
7	5/11/2026	Paper #2: Writing sprint and launch week	Experiments, figures, system diagrams, submission packaging	Paper ready and submitted by the tentative 5/11 deadline
8	5/18/2026	Build #2: Post-submission refinement and usage studies	Additional experiments, failure analysis, case-study extension	Usage-paper or extension draft
9	5/25/2026	Generalization: Platform portability across domains	Mathematics, science, causal discovery, long-term roadmap	Final demo preparation
10	6/1/2026	Finale: Final presentations, demos, and next steps	Lessons learned, summer continuation plans, release discussion	Final presentation and report