News
Symposium on Humanoid Robotics & Sovereign AI for Future Living — Keynote speaker alongside robotics pioneers Oussama Khatib (Director, Stanford Robotics Center) and Hiroshi Ishiguro (Director, Intelligent Robotics Lab, Osaka University)
Multi-LLM Agent Collaborative Intelligence: The Path to AGI — First edition published by SocraSynth, March 2024; acquired and published by ACM Books, December 2025
Two Paradigm Bridges to AGI — Presented to Stanford PhD students, November 2025
Pioneering Data-Centric AI 2007–2012
Between NeurIPS 2007 and 2012, while I served as Director of Google Research (Beijing), our team built the scalable infrastructure and large-scale datasets that would become foundational to modern data-centric AI, years before the term was coined. We produced one of the first web-scale annotated image datasets (30,000+ real web images with multimodal signals), sponsored Fei-Fei Li's ImageNet project at Google, and published a series of parallel machine learning algorithms on MapReduce that enabled training at unprecedented scale. This body of work was consolidated in the Springer book Foundations of Large-Scale Multimedia Information Management and Retrieval (2011), whose Chapter 2 explicitly formulated a data-driven + model-based hybrid architecture (DMD) and asked "Can more data help a model?", a decade before "data-centric AI" became a recognized paradigm.
Research
My research focuses on building the theoretical and practical foundations for safe, reliable AGI systems.
Developing System-2 on LLMs for AGI
Enabling multiple LLM agents to collaborate through structured debate, perspective synthesis, and consensus-building. Includes SocraSynth, CRIT, EVINCE, SagaLLM, and the UCCT theoretical foundation.
RAudit: Enhancing Reasoning Capability of LLMs
Auditing and strengthening LLM reasoning through blind verification protocols, structured probes, and consistency checks — enabling third-party evaluation of black-box model reasoning.
Transactional Swarm Orchestration (TSO)
Enabling robots to discover causal relationships through physical intervention, with transactional guarantees and epistemic regret minimization.
AI Safety & Alignment
Checks-and-balances frameworks for ethical AI, including RAudit for real-time verification and multi-branch governance architectures.
Recent Publications
View full publication list at Google Scholar →
Working Papers & Preprints
-
arXiv 2026
CausalT5K: An Extensive Benchmark for Conducting Causal Reasoning Research
TL;DR: A large-scale benchmark (5,000+ samples) for evaluating causal reasoning in LLMs, covering intervention queries, counterfactual reasoning, and causal graph discovery across multiple domains.
-
arXiv 2026
RAudit: A Blind Auditing Protocol for Large Language Model Reasoning
TL;DR: A protocol for verifying LLM reasoning correctness without access to the reasoning trace, enabling third-party auditing of black-box models through structured probes and consistency checks.
-
arXiv 2025
UCCT: The Unified Cognitive Consciousness Theory for Language Models: Anchoring Semantics, Thresholds of Activation, and Emergent Reasoning
TL;DR: A unified theory explaining how LLMs turn pretrained capacity into goal-directed behavior via semantic anchoring. Formalizes anchoring strength S = ρ_d − d_r − log k, predicting threshold-like performance flips and generalizing ICL, retrieval, and fine-tuning as anchoring variants.
-
arXiv 2024
EVINCE: Optimizing Adversarial LLM Dialogues via Conditional Statistics and Information Theory
TL;DR: Uses information-theoretic metrics to optimize multi-agent debates, measuring when additional dialogue rounds yield diminishing returns.
Recent Papers (2023–2026)
-
KDD 2026
REALM-Bench: A Real-World Planning Benchmark for LLMs and Multi-Agent Systems
TL;DR: A benchmark of real-world planning tasks (travel, scheduling, logistics) that exposes the gap between LLM reasoning capabilities and practical deployment.
-
VLDB 2025
SagaLLM: Context Management, Validation, and Transaction Guarantees for Multi-Agent LLM Planning
TL;DR: Brings database-style ACID guarantees to multi-agent LLM systems, ensuring plans are atomic, consistent, and recoverable through compensating transactions.
-
ICML 2025
A Checks-and-Balances Framework for Ethical AI Alignment
TL;DR: A three-branch governance architecture (Executive, Legislative, Judicial) for AI systems that prevents any single component from taking unilateral harmful actions.
-
NeurIPS AI Safety 2024
A Three-Branch Checks-and-Balances Framework for Context-Aware Ethical Alignment of Large Language Models
TL;DR: An early version of the checks-and-balances framework, demonstrating how separation of powers prevents a single point of failure in AI alignment.
-
IEEE MIPR 2024
Behavioral Emotion Analysis Model for Large Language Models
TL;DR: A framework for analyzing and modeling emotional behaviors in LLM responses, enabling more nuanced human-AI interaction.
-
IEEE CCWC 2023 (100+ citations)
Prompting Large Language Models With the Socratic Method
TL;DR: Introduces SocraSynth, which uses Socratic questioning to elicit deeper reasoning from LLMs through structured multi-turn dialogue and adversarial probing.
-
IEEE CSCI 2023 (100+ citations)
Examining GPT-4's Capabilities and Enhancement with SocraSynth (CRIT)
TL;DR: A systematic evaluation of GPT-4's reasoning capabilities and an introduction of CRIT, a critique-based method that improves accuracy through iterative refinement and self-correction.
Books
Teaching (Stanford)
-
Spring 2026
CS486 — Advanced Large Language Models Research Seminar
-
Winter 2026
CS372 — Artificial General Intelligence for Reasoning, Planning, and Decision Making
-
Spring 2025
CS372 — Artificial Intelligence for Reasoning, Planning, and Decision Making
-
2023–2024
CS372 — Artificial Intelligence for Precision Medicine and Psychiatric Disorders
-
2019–2022
CS372 — Artificial Intelligence for Disease Diagnosis and Information Recommendations
Background
Education
- Ph.D., Electrical Engineering
Stanford University
- M.S., Computer Science
Stanford University
- M.S., IEOR
University of California, Berkeley
Industry Experience
- Director of Research
Google, 2006–2012
- President
HTC Healthcare, 2012–2021
Selected Honors
- XPRIZE Tricorder
$1M Award for AI Medical Diagnosis, 2017
- ACM Fellow, IEEE Fellow
Citations: for contributions in scalable machine learning and healthcare
Previous Academic Position
- Professor (tenured)
UC Santa Barbara, 1999–2006