Edward Y. Chang (張智威)

Adjunct Professor, Stanford University

ACM Fellow | IEEE Fellow

 

Edward Y. Chang has been an adjunct professor of Computer Science at Stanford University since 2019. His current research interests are consciousness modeling, meta learning, and healthcare. Chang received his MS in CS and PhD in EE, both from Stanford University. He joined the ECE department of UC Santa Barbara in September 1999, was tenured in 2003, and was promoted to full professor in 2006. From 2006 to 2012, Chang served at Google as a director of research, leading research and development in areas including scalable machine learning, indoor localization, Google QA, and recommendation systems. In subsequent years, Chang served as the president of HTC Healthcare (2012-2021) and as a visiting professor at the UC Berkeley AR/VR center (2017-2021), working on healthcare projects including VR surgery planning, AI-powered medical IoT devices, and disease diagnosis. Chang is an ACM Fellow and IEEE Fellow, recognized for his contributions to scalable machine learning and healthcare.

Chang has made foundational technical contributions to scalable machine learning in two areas: speeding up machine learning algorithms through parallelization and improving training data effectiveness through active learning. His work has also had broader societal impact by improving healthcare accessibility and quality, and it was recognized with both the Tricorder XPRIZE and the Taiwan Presidential Award.

 

(1) Parallelizing machine learning algorithms.

In 2005, Chang pioneered the data-centric approach to machine learning, long before mass interest in the subject. Between 2006 and 2011, Chang led his teams at Google to develop parallel versions of five widely used machine learning algorithms for training on big data: PSVM for Support Vector Machines, PFP for Frequent Itemset Mining, PLDA for Latent Dirichlet Allocation, PSC for Spectral Clustering, and SPeeDO for Parallel Convolutional Neural Networks [1-6]. PSVM applies approximate matrix factorization to the kernel matrix so that the Interior Point Method solver can be distributed across multiple machines. The novel contribution of PSVM lies in its use of a row-based Incomplete Cholesky Factorization, which reduces both the memory and the computation needed to invert an n-by-n matrix and allows that work to be parallelized. Given n training instances, PSVM can reduce the memory requirement from O(n^2) to O(n) and the computational complexity from O(n^3) to O(n) on each of the sqrt(n) parallel computation units (CPUs/GPUs). Subsequently, Chang parallelized the other four algorithms by combining algorithmic and system techniques, including 1) matrix sparsification, 2) batching and reordering computation units, 3) pipelining computation and I/O, 4) preserving working-set locality, and 5) load-balancing distributed parameter servers.
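
The factorization idea behind PSVM can be illustrated with a small single-machine sketch. This is not the PSVM code: the RBF kernel, the function names `rbf_kernel` and `incomplete_cholesky`, and the parameters `gamma` and `tol` are assumptions made for illustration. The sketch builds a tall, thin matrix H (n rows, at most `rank` columns) such that the kernel matrix K is approximated by H times its transpose, without ever storing the full n-by-n matrix.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    # Hypothetical kernel choice for the illustration.
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def incomplete_cholesky(points, rank, kernel=rbf_kernel, tol=1e-10):
    """Pivoted incomplete Cholesky: returns rows of H with K ~= H * H^T.

    Only n * rank numbers are stored; kernel entries are computed on
    demand, one column of K per iteration.
    """
    n = len(points)
    # Residual diagonal of K - H * H^T; initially the diagonal of K.
    diag = [kernel(p, p) for p in points]
    H = [[] for _ in range(n)]  # H[i] is row i of the factor
    for _ in range(rank):
        # Pick the instance with the largest remaining residual.
        pivot = max(range(n), key=lambda i: diag[i])
        if diag[pivot] <= tol:
            break  # K is already approximated to within tol
        root = math.sqrt(diag[pivot])
        pivot_row = list(H[pivot])  # copy before rows are extended
        for i in range(n):
            # Updating row i needs only row i and the pivot row.
            dot = sum(a * b for a, b in zip(H[i], pivot_row))
            h = (kernel(points[i], points[pivot]) - dot) / root
            H[i].append(h)
            diag[i] -= h * h
    return H

# Usage: factor a tiny kernel matrix at reduced rank.
pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 2.0), (3.0, 1.0), (1.0, 3.0)]
H = incomplete_cholesky(pts, rank=3)
print(len(H), len(H[0]))
```

Because each row update depends only on that row and the pivot row, the rows of H can in principle be partitioned across machines, with the pivot row broadcast at every step; this is the sense in which a row-based layout lends itself to distribution.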

Before 2012, Google largely dismissed any algorithm with computational complexity higher than O(n). The reason was that with, e.g., one billion training instances, a job that takes an O(n) algorithm 1 minute to complete would take an O(n^2) algorithm roughly 2,000 years. To show that the data-centric approach was viable, Chang endorsed Stanford’s ImageNet project with a substantial Google grant. After AlexNet demonstrated that scale of data matters, the industry was convinced to invest massive budgets in parallelizing algorithms. Berkeley Spark and Microsoft ADAM adopted Chang’s working-set and parameter-server techniques. Berkeley further included Chang’s PFP algorithm in its Spark open source release. In 2014, Google finally filed two of Chang’s data-centric ML patents, which were submitted with 2011 priority dates [6.b]. Chang’s combined system and algorithmic approach is foundational to scalable machine learning.
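
The runtime comparison above can be checked with a few lines of arithmetic. The sketch assumes the O(n) and O(n^2) algorithms have the same constant factor, so the quadratic one does n times more work; the function name `quadratic_runtime_years` is hypothetical.

```python
def quadratic_runtime_years(n, linear_minutes):
    """Years an O(n^2) algorithm needs if an O(n) one needs `linear_minutes`.

    Assumes identical constant factors, so the ratio of work is n^2 / n = n.
    """
    quadratic_minutes = linear_minutes * n
    minutes_per_year = 60 * 24 * 365
    return quadratic_minutes / minutes_per_year

years = quadratic_runtime_years(n=1_000_000_000, linear_minutes=1.0)
print(f"{years:.0f} years")
```

With one billion instances the result is a little over 1,900 years, which the text rounds to 2,000.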

 

(2) Improving training data effectiveness via active learning.

In many applications, the amount of available labeled data is too small to train a classifier effectively. Healthcare, where Chang has done substantial work, is a prominent example. His SVMActive work with Simon Tong [7] applied active learning to identify the most ambiguous, and hence most useful, unlabeled instances and to ask an expert (e.g., a physician) to label them, so as to maximize information gain. This work was first applied to improve relevance feedback in an image-query refinement setting, and it received the SIGMM Test of Time honor. In the healthcare domain, Chang integrates sparse-space active learning with reinforcement learning to decide a doctor-agent’s next symptom query to a patient, optimizing diagnosis accuracy with a minimal number of symptom-probing iterations. The REFUEL algorithm [8] iteratively probes a patient’s symptoms based on the patient’s previous responses. The challenge is that a disease typically has fewer than five symptoms, which is under 0.5% of a symptom space of over a thousand. REFUEL devises two strategies to address this sparsity: reward shaping guides the symptom-space search, and feature rebuilding iteratively eliminates conditionally correlated symptoms. REFUEL has been successfully deployed by two hospital chains in Taiwan to perform remote diagnosis and triage, and by Taiwan CDC to combat the COVID-19 pandemic.
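
The selection step of margin-based active learning can be sketched as follows. This is a minimal illustration, not the SVMActive implementation: it assumes a linear decision function with given weights `w` and bias `b`, and it takes "most ambiguous" to mean closest to the decision boundary. The function names are hypothetical.

```python
def decision_value(w, b, x):
    # Signed score of x under a linear classifier; the sign gives the
    # predicted class and the magnitude reflects confidence.
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def most_ambiguous(w, b, unlabeled, k=1):
    """Indices of the k unlabeled instances closest to the boundary."""
    order = sorted(
        range(len(unlabeled)),
        key=lambda i: abs(decision_value(w, b, unlabeled[i])),
    )
    return order[:k]

# Usage: choose two instances to send to the expert for labeling.
w, b = [1.0, -1.0], 0.0
pool = [[3.0, 0.0], [1.0, 0.8], [-2.0, 2.0], [0.5, 0.6]]
print(most_ambiguous(w, b, pool, k=2))
```

In a full loop, the expert's labels for the selected instances would be added to the training set, the classifier retrained, and the selection repeated.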

Between 2012 and 2017, Chang co-led the DBG team in the Qualcomm Tricorder XPRIZE competition [9], the biggest biomedical prize in history. Chang’s team produced a mobile device with a REFUEL-powered symptom checker that conducts various lab tests for diagnosing 12 common diseases. DBG was one of the top two teams (out of 300 entries) to advance to the final round of the competition, and it received a $1-million prize for its achievement in 2017. This low-cost (US$300), lightweight (5 lb), low-power diagnostic device brings realistic hope of resolving the age-old healthcare deadlock among affordability, accessibility, and quality. Besides the XPRIZE, Chang’s symptom-checking chatbot, launched with Taiwan CDC, received the Presidential Award in 2020 for effectively containing the COVID-19 outbreak.

 

(3) Other notable contributions.

Chang is credited as the inventor of the Digital VCR (DVR) [10], which was designed in 1997 and developed in 1998 as a chapter of his PhD dissertation, supervised by Profs. Hector Garcia-Molina and Pat Hanrahan. The DVR provides interactive features for streaming videos and replaced the traditional tape-based VCR in 1999.

 

References:

  1. Parallelizing Support Vector Machines on Distributed Computers, EY Chang, et al., NeurIPS 2007.
  2. PFP: Parallel FP-Growth for Query Recommendation, H. Li, Y. Wang, D. Zhang, M. Zhang, and EY Chang, ACM Recommender Systems, 2008.
  3. PLDA: Parallel Latent Dirichlet Allocation for Large-scale Applications, Y. Wang, H. Bai, M. Stanton, WY Chen, and EY Chang, International Conference on Algorithmic Applications in Management, June 2009.
  4. PLDA+: Parallel Latent Dirichlet Allocation with Data Placement and Pipeline Processing, Z. Liu, Y. Zhang, EY Chang, and M. Sun, ACM Transactions on Intelligent Systems and Technology 2(3), 2011.
  5. Parallel Spectral Clustering in Distributed Systems, WY Chen, Y. Song, H. Bai, CJ Lin, and EY Chang, IEEE PAMI, 2011.
  6. Scalable ML open source, patents, and Google QA product are listed below:

a.     Parallel ML Open Source: http://openbigdatagroup.github.io/ (50,000+ downloads since 2007).  

b.     Parallel ML Patents on data-driven and model-based hybrid machine learning methods for Image Feature Extraction and Object Recognition. Two US patents filed at Google, US8798375B1 (priority date: 9/14/2011) and US9547914B2 (priority date: 8/01/2011).

c.     Parallel ML powered product: Google QA, developed by Chang’s team and launched in 68 developing and underdeveloped countries.

  7. SVMActive: Support Vector Machine Active Learning for Image Retrieval, S. Tong and EY Chang, ACM Multimedia, 2001. (SIGMM Test of Time Honor)
  8. REFUEL: Exploring Sparse Features in Deep Reinforcement Learning for Fast Disease Diagnosis, YS Peng, KF Tang, HT Lin, and EY Chang, NeurIPS, 2018.

a.     REFUEL product sample: Symptom Checking Chatbot, launched with Changhua Christian Hospital, Taiwan. Link: https://www.cch.org.tw/about_page.aspx?Id=92

b.     Taiwan COVID-19 CDC Chatbot (疾管家) https://www.digitimes.com.tw/iot/article.asp?cat=130&cat1=40&cat2=15&id=0000593424_v565v0tz174j3j5ho9d5c

  9. Tricorder XPRIZE specifications, keynote, and press release:

a.     Specifications: Artificial Intelligence in XPRIZE DeepQ Tricorder (XPRIZE Award), EY Chang, et al., Proceedings of the 2nd International Workshop on Multimedia for Personal Health and Health Care, 2017.

b.     Award press link:  https://www.mobihealthnews.com/content/qualcomm-tricorder-x-prize-has-its-winner-work-tricorders-will-continue

c.     ACM Multimedia Conference keynote 2017. Keynote link: https://www.youtube.com/watch?v=qgtdMNJSc1U

  10. Digital Video Recorder, Wikipedia link: https://en.wikipedia.org/wiki/Digital_video_recorder