Basics
Name | Joseph Liu |
Email | joseph@liu.us |
Phone | +1 (650) 276-8035 |
Url | https://joseph.liu.us/ |
Education
-
2022.08 - 2025.05 Los Angeles, CA
University of Southern California (USC)
Bachelor of Science in Computer Science (GPA: 4.0/4.0)
- Artificial Intelligence
- Machine Learning
- LLMs in Natural Language Processing
- Foundations of Multi-Agent Systems
- Probability
- Statistical Inference and Data Analysis
- Discrete Math
- Operating Systems
- Computer Systems
Publications
-
2025.10 Evaluation Under Imperfect Benchmarks and Ratings: A Case Study in Text Simplification
Joseph Liu, Yoonsoo Nam, Xinyue Cui, Swabha Swayamdipta
(submitted to COLM 2025)
-
2025.06 Symbolic Representation for Any-to-Any Generative Tasks
Jiaqi Chen, Xiaoye Zhu, Yue Wang, Tianyang Liu, Xinhui Chen, Ying Chen, Chak Tou Leong, Yifei Ke, Joseph Liu, Yiwen Yuan, Julian McAuley, Li-jia Li
CVPR 2025
-
2024.11 Generative Models in Protein Engineering: A Comprehensive Survey
Xinhui Chen*, Yiwen Yuan*, Joseph Liu*, Chak Tou Leong, Xiaoye Zhu, and Jiaqi Chen
NeurIPS 2024 Foundation Models for Science Workshop (poster)
-
2024.10 Variable Stars in M31 Stellar Clusters from the Panchromatic Hubble Andromeda Treasury
Richard Smith, Avi Patel, Monika D. Soraisam, Puragra Guhathakurta, Pranav Tadepalli, Sally Zhu, Joseph Liu, Léo Girardi, L. Clifton Johnson, Sagnick Mukherjee, Knut A. G. Olsen, and Benjamin F. Williams
The Astrophysical Journal
-
2023.12 Enhancing Debugging Skills of LLMs with Prompt Engineering
Keyu He*, Max Li*, and Joseph Liu*
Technical Report
-
2023.05 -
2022.06 Variable Stars in M31 Stellar Clusters using the Panchromatic Hubble Andromeda
Avi Patel, Sagnick Mukherjee, Monika Soraisam, Puragra Guhathakurta, Joseph Liu, and Pranav Tadepalli
Bulletin of the AAS
Research Experience
-
2024.09 - 2024.12 Symbolic Representation for Any-to-Any Generative Tasks
- Symbolic Any-to-any Paradigm: Introduced a symbolic language with functions, parameters, and topologies, enabling flexible representation of any-to-any generative tasks (e.g., image-to-video, image-to-3D, image merging).
- Training-free Inference: Developed a training-free inference engine that transforms natural language task descriptions into executable symbolic flows, allowing seamless task execution as a program.
-
2024.08 - 2024.12 Generative Models in Protein Engineering
- Protein Model Classification: Systematically categorized protein generative models through a multi-dimensional framework, encompassing inference methodologies (diffusion-based/autoregressive) and modeling targets (sequence/structure), establishing a structured overview of this emerging field’s technical landscape.
- Protein Diffusion Model Comparison: Established a comparison framework for protein diffusion models across two fundamental dimensions: the mathematical representation level and the structural invariance level, revealing how modeling choices affect protein structure design.
- Future Directions in Protein Modeling: Identified critical challenges and future opportunities in protein generative models, emphasizing the transition from data limitations to large-scale datasets and hybrid modeling approaches.
-
2024.05 - Present Los Angeles, CA
Learning Heuristics for Multi-Agent Pathfinding
IDM Lab, USC
Mentored by Yimin Tang, advised by Prof. Sven Koenig
- Trainable Heuristic Environment: Developed an RL environment to train heuristics for multi-robot path planning, leveraging 4D representations to capture spatial-temporal relationships between robot paths and environmental constraints.
- Two-Phase Training Strategy: Crafted a two-phase training strategy, initially replicating traditional heuristics and subsequently enhancing search efficiency with a node expansion reward system.
- Search Efficiency Assessment Tool: Implementing a quantitative evaluation system based on node expansion metrics, enabling direct measurement of search efficiency improvements for the learned heuristic function.
-
2024.01 - Present Los Angeles, CA
LLM-based Text Simplification Evaluation System
Data, Interpretability, Language, and Learning (DILL) Lab, USC
Mentored by Xinyue Cui and Yoonsoo Nam, advised by Prof. Swabha Swayamdipta
- Text Simplification Metrics: Designed a novel reference-free metric for text simplification by introducing LLM judges, eliminating the need for specialized training data.
- Model Architecture Design: Developed an efficient evaluation framework utilizing pre-trained models such as Llama 3 without fine-tuning, enabling broad domain coverage and robust simplification assessment.
- Evaluation: Achieved a correlation of 0.54 with human judgment when evaluating simplifications, outperforming both traditional metrics such as FKGL and SARI and trained metrics such as LENS.
-
2023.09 - 2024.01 Los Angeles, CA
Enhancing Debugging Skills of LLMs with Prompt Engineering
Advised by Prof. Swabha Swayamdipta
- Debugging Prompt Engineering: Used prompt engineering with pretrained LLMs to boost performance in debugging tasks through few-shot learning and chain-of-thought prompting.
- Multidimensional Evaluation Metrics: Developed and implemented a comprehensive set of evaluation metrics, both similarity-based and executable, to quantitatively assess LLM debugging performance.
- Real-World Error Dataset Construction: Constructed a dataset of Java LeetCode solutions to replicate real-world programming bugs for dynamic analysis.
-
2023.08 - 2023.12 Los Angeles, CA
Wildfire Spread Prediction
Computation and Data Driven Discovery Group, USC
Mentored by Bryan Shaddy, advised by Prof. Assad Oberai
- Worked on physics-informed machine learning techniques to model wildfire spread using diffusion and GAN models.
-
2020.06 - 2021.08 Santa Cruz, CA
Variable Stars in Andromeda Galaxy
UC Santa Cruz
Mentored by Sagnick Mukherjee, advised by Prof. Puragra Guhathakurta
- Hybrid Variable Star Detection Strategy: Combined statistical analysis of PHAT survey light curves with difference imaging to identify variable stars in M31 stellar clusters using HST observations.
- Data Cleaning and Collection: Organized, filtered, and cleaned data points for millions of stars, including work in database query optimization, parallelization, and computational geometry.
- Variable Star Census and Classification: Established a catalog of 86 luminous variables (F814W < 19) in M31 clusters, with comprehensive characterization of their evolutionary phases and initial masses (0.8-67 M⊙) based on theoretical isochrones.
Teaching Experience
-
2024.05 - 2024.07 Los Angeles, CA
Teaching Assistant
University of Southern California
- Teaching Assistant for CSCI-201: Principles of Software Development for Prof. Victor Adamchik
- Helped the professor prepare computer lab exercises and coached students in the lab on their coding assignments
-
2022.03 - 2022.06 Santa Clara, CA
Grader
Santa Clara University
- Grader for CSCI 163: Theory of Algorithms for Prof. Nicholas Tran
- As a freshman, graded homework and exams for a course primarily taken by upperclassmen
Industry Experience
-
2023.05 - 2023.08 Auburn Hills, MI (Remote)
Data Science Intern
Stellantis N.V.
- Pipeline Optimization: Led end-to-end optimization of an ML sales prediction pipeline, achieving an 86% reduction in interruptions, 30% faster runtime, and 25% cost savings while improving data quality by fixing critical bugs affecting 60% of the dataset.
- Research Leadership: Spearheaded feature engineering initiatives and performance optimization research, presenting findings to 80+ stakeholders including directors and VPs.
- Performance Recognition: Demonstrated exceptional performance, resulting in a return offer for Summer 2024.
-
2022.06 - 2022.08 Taipei, Taiwan
Machine Learning Intern
iKala Interactive Media Inc.
- Video Analysis Research: Researched state-of-the-art methodologies in Computer Vision (CV) and Natural Language Processing (NLP) for video analysis.
- Audio-Video Embedding: Designed and implemented a Transformer-based model for multimodal (video and audio) embedding generation with PyTorch, achieving 60% precision on the AudioSet dataset.
Awards
- 2024.09
USC Provost’s Undergrad Research Fellowship
The Office of the Provost, University of Southern California
The USC Office of the Provost awards this competitive undergraduate research fellowship to select students. Recipients receive a $1,000 stipend for conducting research for a minimum of ten weeks at 10 hours per week.
- 2023.09
USC Center for Undergraduate Research in Viterbi Engineering Fellowship
Viterbi School of Engineering, University of Southern California
The USC Center for Undergraduate Research in Viterbi Engineering (CURVE) Program awards this competitive undergraduate research fellowship to select students. Recipients receive a $1,250 ($3,000 in summer) stipend for conducting research for a minimum of 10 hours per week throughout the semester. $5,500 total award received.
- 2023.09
USC Viterbi Dean's List
Viterbi School of Engineering, University of Southern California
Certificate awarded to students with a GPA of 3.5 or higher.
- 2021.03.18
SCU Dean’s Scholarship
Financial Aid Office, Santa Clara University
Santa Clara University's Financial Aid Office awards this competitive merit-based Dean's Scholarship. Recipients receive $8,100, distributed as $2,700 per quarter. The scholarship is renewable for up to twelve consecutive quarters.
Skills
Languages | Python, Java, C++, C#, SQL, JavaScript, x86-64 Assembly |
Frameworks/Tools | PyTorch, Pandas, NumPy, Git, AWS |
Environments | Unix/Linux, Windows, macOS |
Areas of Expertise | Machine Learning, Natural Language Processing (NLP), Large Language Models (LLMs), Data Structures, Algorithms, Probability, Statistics |