Siheng Zhao

I am a first-year CS Ph.D. student at the University of Southern California, advised by Prof. Yue Wang. I received my Bachelor's degree with the highest honors from the School of Artificial Intelligence, Nanjing University.

Previously, I was honoured to work with Prof. Tao Yu, Prof. Yanchao Yang, Prof. Lin Shao, and Dr. Jiangmiao Pang at the University of Hong Kong, National University of Singapore, and Shanghai AI Lab.

Email  /  Google Scholar  /  Github  /  Twitter

News
  • [Apr, 2025] I'll join the Amazon Frontier AI for Robotics team as an Applied Scientist Intern in SF this summer!
  • [Apr, 2025] RoboVerse is accepted by RSS 2025.
  • [Sep, 2024] OSWorld is accepted by NeurIPS 2024.
  • [Sep, 2024] TieBot is accepted by CoRL 2024 as Oral.
  • [Aug, 2024] Formally joined USC as a Ph.D. student.
  • [Jan, 2024] Text2Reward and Lemur are accepted by ICLR 2024 as Spotlight.
  • [Dec, 2023] Awarded the SenseTime Scholarship (given to 30 undergraduates in the field of AI in China).
  • [Dec, 2023] Awarded the Nanjing University Top-Grade Scholarship (the highest honor at Nanjing University).
  • [Jul, 2023] ClothesNet is accepted by ICCV 2023.
  • [Jun, 2023] DiffClothAI is accepted by IROS 2023.
Research

My current research focuses on learning language-conditioned embodied control from scalable data. This involves two key aspects:

  • Integrating language into robotic control by leveraging language-conditioned generative models such as LLMs, VLMs, VLAs, and T2V models: Text2Reward (ICLR'24 Spotlight), UH-1 (arXiv'24), GRUtopia (arXiv'24).
  • Scaling up robotic training data by utilizing synthetic data, in-the-wild images and videos, and human demonstration data: UH-1 (arXiv'24), RoboVerse (RSS'25 Oral), TieBot (CoRL'24 Oral).

Previous topics:

  • Language and multimodal digital agents: Lemur (ICLR'24 Spotlight), OSWorld (NeurIPS'24).
  • Simulation, perception and manipulation of deformable objects: DiffClothAI (IROS'23), ClothesNet (ICCV'23), TieBot (CoRL'24 Oral).

Selected Publications

* denotes equal contribution. For the full publication list, please refer to my Google Scholar.

Learning from Massive Human Videos for Universal Humanoid Pose Control
Siheng Zhao*, Jiageng Mao*, Siqi Song*, Tianheng Shi, Junjie Ye, Mingtong Zhang, Haoran Geng, Jitendra Malik, Vitor Guizilini, Yue Wang
arXiv 2024
[arxiv] [project] [dataset🤗]

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
Tianbao Xie, Danyang Zhang, Jixuan Chen, Xiaochuan Li, Siheng Zhao, Ruisheng Cao, Toh Jing Hua, Zhoujun Cheng, Dongchan Shin, Fangyu Lei, Yitao Liu, Yiheng Xu, Shuyan Zhou, Silvio Savarese, Caiming Xiong, Victor Zhong, Tao Yu
Advances in Neural Information Processing Systems (NeurIPS) 2024
[arxiv] [project]

Text2Reward: Reward Shaping with Language Models for Reinforcement Learning
Siheng Zhao*, Tianbao Xie*, Chen Henry Wu, Yitao Liu, Qian Luo, Victor Zhong, Yanchao Yang, Tao Yu
International Conference on Learning Representations (ICLR) 2024 Spotlight
[arxiv] [project]

DiffClothAI: Differentiable Cloth Simulation with Intersection-free Frictional Contact and Differentiable Two-Way Coupling with Articulated Rigid Bodies
Siheng Zhao*, Xinyuan Yu*, Siyuan Luo, Gang Yang, Lin Shao
International Conference on Intelligent Robots and Systems (IROS) 2023
[paper] [project]

Education
University of Southern California
Ph.D. in Computer Science (2024 - )
Advisor: Prof. Yue Wang
Nanjing University
B.Eng. in Artificial Intelligence (2020 - 2024)
Overall GPA: 94.4/100, Ranking: 1/97
National University of Singapore
Exchange in Computer Science (2023)
Overall GPA: 4.0/4.0
Work Experience
Amazon, FAR (Frontier AI for Robotics) Team
Applied Scientist Intern (2025)
Advisor: Dr. Rocky Duan
Shanghai AI Lab, OpenRobot Group
Research Intern (2024)
Advisor: Dr. Jiangmiao Pang
The University of Hong Kong, NLP Group
Research Assistant (2023)
Advisor: Prof. Tao Yu
Services
  • Conference Reviewer:
    • Neural Information Processing Systems (NeurIPS) 2024, 2025
    • Conference on Robot Learning (CoRL) 2025
    • International Conference on Computer Vision (ICCV) 2025
    • IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2025
    • IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2025
    • International Conference on Learning Representations (ICLR) 2025
    • European Conference on Computer Vision (ECCV) 2024
    • IEEE International Conference on Robotics and Automation (ICRA) 2024
  • Journal Reviewer:
    • IEEE Robotics and Automation Letters (RA-L) 2025
Talks
  • Text2Reward: Reward Shaping with Language Models for Reinforcement Learning, SenseTime, 2024
  • Text2Reward: Reward Shaping with Language Models for Reinforcement Learning, Shanghai AI Lab, 2023, 2024
Honors & Awards
  • University of Southern California PhD Fellowship, 2024
  • Outstanding Graduate of Nanjing University, 2024
  • Travel Award of the International Conference on Learning Representations, 2024
  • Jiangsu Province Study Abroad Scholarship, 2024
  • SenseTime Scholarship, 2023, awarded to 30 undergraduates in the field of AI in China
  • Nanjing University Top-Grade Scholarship, 2023, the highest honor at Nanjing University
  • Bao Gang Scholarship & Special Prize Nomination, 2023
  • Heng Fang Scholarship, 2022
  • National Scholarship, 2021, the highest honor in China's universities
  • China Telecom Scholarship, 2021

This homepage is based on the design of Jon Barron's website and is deployed on GitHub Pages.

Copyright 2024 © Siheng Zhao