Publications
Publications by category in reverse chronological order.
2025
- [Under Review] Evaluation Under Imperfect Benchmarks and Ratings: A Case Study in Text Simplification. Joseph Liu, Yoonsoo Nam, Xinyue Cui, and Swabha Swayamdipta. In COLM 2025 (under review), Oct 2025.
Existing text simplification metrics face challenges ranging from limited datasets to a reliance on references. To address this, we propose a Panel of Language Models as a reference-free metric that removes the need for extensive training data. The panel uses exclusively pretrained language models and therefore benefits from their extensive pretraining data, deeper understanding of language, and potential future advances. We show that this metric is competitive with, and in some cases outperforms, existing metrics in correlation with human judgments. Furthermore, we find that it is more consistent than even human annotators in scoring simplification quality on certain dimensions. A hedged code sketch of this panel-scoring idea follows the BibTeX entry below.
@inproceedings{j2025text, title = {Evaluation Under Imperfect Benchmarks and Ratings: A Case Study in Text Simplification}, author = {Liu, Joseph and Nam, Yoonsoo and Cui, Xinyue and Swayamdipta, Swabha}, booktitle = {COLM 2025 (under review)}, year = {2025}, month = oct, url = {https://joseph.liu.us/r/text}, }
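The abstract does not pin down how the panel is implemented, so the following is a minimal sketch, assuming the panel prompts a few pretrained LMs for a 1-5 rating on one quality dimension and averages the results. The PANEL names, the prompt wording, and query_model are hypothetical placeholders, not the paper's method.

```python
from statistics import mean

# Hypothetical panel members; the paper does not name specific models here.
PANEL = ["model-a", "model-b", "model-c"]

RATING_PROMPT = (
    "Original: {src}\n"
    "Simplification: {simp}\n"
    "Rate the simplification's fluency from 1 (worst) to 5 (best). "
    "Answer with a single integer."
)

def query_model(model_name: str, prompt: str) -> int:
    """Placeholder: swap in a real call to a pretrained language model.

    Should return a 1-5 rating parsed from the model's reply.
    """
    return 3  # dummy score so the sketch runs end to end

def panel_score(src: str, simp: str) -> float:
    """Reference-free score: average the panel's independent ratings."""
    prompt = RATING_PROMPT.format(src=src, simp=simp)
    ratings = [query_model(m, prompt) for m in PANEL]
    return mean(ratings)

if __name__ == "__main__":
    score = panel_score(
        "The committee deliberated at considerable length before adjourning.",
        "The committee talked for a long time before ending the meeting.",
    )
    print(f"panel score: {score:.2f}")
```

Because each panel member scores independently, the aggregate needs no reference simplification, which is the property the abstract emphasizes.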
- [CVPR] Symbolic Representation for Any-to-Any Generative Tasks. Jiaqi Chen, Xiaoye Zhu, Yue Wang, Tianyang Liu, Xinhui Chen, Ying Chen, Chak Tou Leong, Yifei Ke, Joseph Liu, and 3 more authors. In CVPR 2025, Nashville, TN, USA, Jun 2025.
We propose a symbolic generative task descriptive language and inference engine, capable of representing arbitrary multimodal tasks as symbolic flows. The inference engine maps natural language instructions to symbolic flow, eliminating the need for task-specific training. Conventional generative models rely heavily on large-scale training and implicit neural representation to learn cross-modal mappings, which demands extensive computational resources and restricts expandability. In this paper, we propose an explicit symbolic task descriptive language, comprising three types of primitives: functions, parameters, and topological logic. Using a pre-trained language model to infer symbolic workflows in a training-free manner, our framework successfully performs over 12 multimodal generative tasks based on user instructions, demonstrating enhanced efficiency and flexibility. Extensive experiments demonstrate that our approach can generate multimodal content competitive with, and often surpassing, that of previous state-of-the-art unified models, while offering robust interruptibility and editability. We believe that symbolic task representations are capable of cost-effectively expanding the boundaries of generative AI capabilities. All code and results are available in the Supplementary Materials. A hypothetical sketch of such a symbolic flow follows the BibTeX entry below.
@inproceedings{jiaqi2025symbolic, title = {Symbolic Representation for Any-to-Any Generative Tasks}, author = {Chen, Jiaqi and Zhu, Xiaoye and Wang, Yue and Liu, Tianyang and Chen, Xinhui and Chen, Ying and Leong, Chak Tou and Ke, Yifei and Liu, Joseph and Yuan, Yiwen and McAuley, Julian and Li, Li-jia}, booktitle = {CVPR 2025}, year = {2025}, month = jun, url = {https://joseph.liu.us/r/cvpr}, location = {Nashville, TN, USA}, }
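The abstract names three primitive types (functions, parameters, topological logic) but not their concrete encoding, so the dataclasses below are a hypothetical sketch of how a symbolic flow might be represented and executed in dependency order; the node and function names are invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One 'function' primitive together with its 'parameter' primitives."""
    name: str                 # e.g. "text_to_image" (hypothetical function name)
    params: dict = field(default_factory=dict)

@dataclass
class Flow:
    """'Topological logic': edges fix the order in which nodes run."""
    nodes: dict               # node id -> Node
    edges: list               # (upstream id, downstream id) pairs

    def run(self):
        # Kahn-style topological execution over the dependency edges.
        indeg = {nid: 0 for nid in self.nodes}
        for _, dst in self.edges:
            indeg[dst] += 1
        ready = [nid for nid, d in indeg.items() if d == 0]
        order = []
        while ready:
            nid = ready.pop()
            order.append(nid)
            for src, dst in self.edges:
                if src == nid:
                    indeg[dst] -= 1
                    if indeg[dst] == 0:
                        ready.append(dst)
        for nid in order:
            node = self.nodes[nid]
            print(f"run {node.name} with {node.params}")  # stand-in for real dispatch

# A toy two-step flow: generate an image, then caption it.
flow = Flow(
    nodes={
        "n1": Node("text_to_image", {"prompt": "a red bicycle"}),
        "n2": Node("image_to_text", {"style": "caption"}),
    },
    edges=[("n1", "n2")],
)
flow.run()
```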
2024
- [NeurIPS Workshop] Generative Models in Protein Engineering: A Comprehensive Survey. Xinhui Chen*, Yiwen Yuan*, Joseph Liu*, Chak Tou Leong, Xiaoye Zhu, and Jiaqi Chen. In NeurIPS 2024 Workshop on Foundation Models for Science: Progress, Opportunities, and Challenges, Vancouver, BC, Canada, Dec 2024.
Poster presentation
Proteins are fundamental molecules performing diverse functions in living organisms. Protein engineering, the process of designing or modifying proteins to enhance or create new functions, has therefore become a research focus in the fields of biotechnology and medicine. A primary challenge in protein engineering is to efficiently discover and design new proteins with desired functions. Traditional approaches like directed evolution and rational design, though widely used, are limited by high computational costs and restricted exploration of potential protein structures. The recent success of generative models in efficiently synthesizing high-quality data across various domains has inspired researchers to investigate their potential applications in protein engineering. In this survey, we systematically summarize recent works on generative models for protein engineering, with a particular focus on protein design. Specifically, we categorize three main frameworks in existing generative protein design methods: sequence-based, structure-based, and joint sequence-structure generation. In addition, we provide a detailed review of representative generative models, including autoregressive models and diffusion models, and their application in protein sequence prediction and structure generation. Finally, we pinpoint existing challenges and propose future directions, such as leveraging large datasets, improving complex structure validation, and integrating advanced modeling techniques. A toy sketch of the autoregressive, sequence-based framework follows the BibTeX entry below.
@inproceedings{xinhui2024generative, title = {Generative Models in Protein Engineering: A Comprehensive Survey}, author = {Chen, Xinhui and Yuan, Yiwen and Liu, Joseph and Leong, Chak Tou and Zhu, Xiaoye and Chen, Jiaqi}, booktitle = {NeurIPS 2024 Workshop on Foundation Models for Science: Progress, Opportunities, and Challenges}, year = {2024}, month = dec, url = {https://openreview.net/forum?id=Xc7l84S0Ao}, location = {Vancouver, BC, Canada}, }
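As a toy illustration of the sequence-based, autoregressive framework the survey categorizes (not any specific model from the paper), this sketch samples amino acids one at a time; the uniform stand-in distribution is where a learned model p(x_t | x_<t) would plug in.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def next_residue_distribution(prefix: str) -> dict:
    """Placeholder for a learned autoregressive model p(x_t | x_<t).

    A real sequence model would condition on the prefix; here we return
    a uniform distribution so the sketch runs on its own.
    """
    p = 1.0 / len(AMINO_ACIDS)
    return {aa: p for aa in AMINO_ACIDS}

def sample_sequence(length: int, seed: int = 0) -> str:
    """Autoregressive generation: draw one residue at a time."""
    rng = random.Random(seed)
    seq = ""
    for _ in range(length):
        dist = next_residue_distribution(seq)
        residues, weights = zip(*dist.items())
        seq += rng.choices(residues, weights=weights, k=1)[0]
    return seq

print(sample_sequence(30))
```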
- Variable Stars in M31 Stellar Clusters from the Panchromatic Hubble Andromeda Treasury. Richard Smith, Avi Patel, Monika D. Soraisam, Puragra Guhathakurta, Pranav Tadepalli, Sally Zhu, Joseph Liu, Léo Girardi, L. Clifton Johnson, and 3 more authors. The Astrophysical Journal, Oct 2024.
Variable stars in stellar clusters can offer key constraints on stellar evolution and pulsation models, utilizing estimates of host cluster properties to constrain stellar physical parameters. We present a catalog of 86 luminous (F814W < 19) variable stars in M31 clusters identified by mining the archival Panchromatic Hubble Andromeda Treasury (PHAT) survey using a combination of statistical analysis of sparse PHAT light curves and difference imaging. We determine the evolutionary phases and initial masses of these variable stars by matching them with theoretical isochrones generated using host cluster properties from the literature. We calculate the probability of PHAT photometry being blended due to the highly crowded nature of cluster environments for each cluster-variable star, using these probabilities to inform our level of confidence in the derived properties of each star. Our 86 cluster-variable stars have initial masses between 0.8 and 67 M☉. Their evolutionary phases span the main sequence, more evolved hydrogen- and helium-burning phases, and the post-asymptotic giant branch. We identify numerous candidate variable star types: RV Tauri variables, red supergiants, and slowly pulsating B-type supergiants, along with Wolf-Rayet stars, α Cygni and Mira variables, a classical Cepheid, and a possible super-asymptotic giant. We characterize 12 cluster-variable stars at higher confidence based on their difference image quality and lower blending probability. Ours is the first systematic study of variable stars in extragalactic stellar clusters leveraging the superior resolution of the Hubble Space Telescope and demonstrating the unique power of stellar clusters in constraining the fundamental properties of variable stars.
@article{Smith_2024, doi = {10.3847/1538-4357/ad6eff}, year = {2024}, month = oct, url = {https://dx.doi.org/10.3847/1538-4357/ad6eff}, publisher = {The American Astronomical Society}, volume = {974}, number = {2}, pages = {292}, author = {Smith, Richard and Patel, Avi and Soraisam, Monika D. and Guhathakurta, Puragra and Tadepalli, Pranav and Zhu, Sally and Liu, Joseph and Girardi, Léo and Johnson, L. Clifton and Mukherjee, Sagnick and Olsen, Knut A. G. and Williams, Benjamin F.}, title = {Variable Stars in M31 Stellar Clusters from the Panchromatic Hubble Andromeda Treasury}, journal = {The Astrophysical Journal} }
2023
- [Tech Report] Enhancing Debugging Skills of LLMs with Prompt Engineering. Keyu He*, Max Li*, and Joseph Liu*. Dec 2023.
This paper presents a comprehensive study on improving the debugging capabilities of Large Language Models (LLMs) like GPT-3.5, focusing on the application of prompt engineering techniques. We explore the efficacy of few-shot learning, chain-of-thought prompting, and a baseline zero-shot model in enhancing LLMs' ability to debug code. Utilizing static and dynamic evaluation metrics, the study rigorously assesses the debugging proficiency of these models. By introducing different types of bugs, including procedural and language model-generated errors, and applying varied prompting strategies, we provide a deeper understanding of LLMs' debugging capabilities. The results provide insights into the limitations of the debugging capabilities of GPT-3.5 Turbo, even with the assistance of various prompting techniques. Source code for our evaluation method and bug-generation techniques is available in our GitHub repository. Hypothetical reconstructions of the three prompting styles follow the BibTeX entry below.
@techreport{he_2023_enhancing, author = {He, Keyu and Li, Max and Liu, Joseph}, year = {2023}, month = dec, title = {Enhancing Debugging Skills of LLMs with Prompt Engineering}, url = {https://joseph.liu.us/assets/pdf/2023-1211-Enhancing-Debugging.pdf}, urldate = {2023-12-11} }
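The report's exact prompts live in the linked PDF and repository; the templates below are hypothetical reconstructions of the three strategies it compares (zero-shot, few-shot, chain-of-thought), with a seeded bug of the kind the abstract describes.

```python
BUGGY_CODE = '''\
def average(xs):
    return sum(xs) / len(xs) + 1   # seeded off-by-one bug
'''

# Zero-shot baseline: the task statement alone.
zero_shot = f"Fix the bug in this Python function:\n{BUGGY_CODE}"

# Few-shot: prepend worked (buggy, fixed) pairs before the target.
few_shot = (
    "Buggy: def square(x): return x * 2\n"
    "Fixed: def square(x): return x * x\n\n"
    f"Buggy:\n{BUGGY_CODE}\nFixed:"
)

# Chain-of-thought: ask the model to reason before patching.
chain_of_thought = (
    f"{BUGGY_CODE}\n"
    "Step by step, explain what this function should compute, "
    "find the bug, then output the corrected function."
)

for name, prompt in [("zero-shot", zero_shot),
                     ("few-shot", few_shot),
                     ("chain-of-thought", chain_of_thought)]:
    print(f"--- {name} ---\n{prompt}\n")
```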
- [Tech Report] Predicting Game Popularity from Steam Descriptions. Joseph Liu. May 2023.
In many cases, game descriptions are some of the first places where potential players learn about games. Therefore, it is imperative that publishers and developers write interesting descriptions that positively impact sales. In this project, we investigate the correlation between game descriptions and game popularity, independent of gameplay, using various models. We begin with a classification problem, including our baseline (softmax regression) and our best model (a bidirectional RNN), and also experiment with different data representations and eventually regression. We conclude that while much of a game's popularity is associated with gameplay, the description also has a non-negligible impact on popularity. Source code is available in our GitHub repository. A minimal sketch of a bidirectional RNN classifier follows the BibTeX entry below.
@techreport{liu_2023_predicting, author = {Liu, Joseph}, year = {2023}, month = may, title = {Predicting Game Popularity from Steam Descriptions}, url = {https://joseph.liu.us/assets/pdf/2023-0501-Predicting-Game-Popularity.pdf}, urldate = {2023-05-01} }
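The report's architecture details aren't reproduced here, so this is a minimal PyTorch sketch of a bidirectional RNN text classifier of the kind the abstract names; the vocabulary size, embedding and hidden dimensions, and three-way class count are all made-up placeholders.

```python
import torch
import torch.nn as nn

class BiRNNClassifier(nn.Module):
    """Bidirectional GRU over token ids -> popularity-class logits."""

    def __init__(self, vocab_size=10_000, embed_dim=64,
                 hidden_dim=128, num_classes=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_dim,
                          batch_first=True, bidirectional=True)
        # 2 * hidden_dim: forward and backward final states, concatenated.
        self.head = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids):
        x = self.embed(token_ids)              # (batch, seq, embed)
        _, h = self.rnn(x)                     # h: (2, batch, hidden)
        h = torch.cat([h[0], h[1]], dim=-1)    # (batch, 2 * hidden)
        return self.head(h)

model = BiRNNClassifier()
fake_batch = torch.randint(0, 10_000, (4, 50))  # 4 descriptions, 50 tokens each
print(model(fake_batch).shape)                  # torch.Size([4, 3])
```

Reading the description in both directions lets the final state summarize context on each side of every token, which is the usual motivation for a bidirectional encoder over a softmax-regression bag-of-words baseline.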
2022
- [Bulletin of the AAS] Variable Stars in M31 Stellar Clusters using the Panchromatic Hubble Andromeda Treasury. Avi Patel, Sagnick Mukherjee, Monika Soraisam, Puragra Guhathakurta, Joseph Liu, and Pranav Tadepalli. Bulletin of the AAS, Jun 2022.
@article{Patel2022Variable, author = {Patel, Avi and Mukherjee, Sagnick and Soraisam, Monika and Guhathakurta, Puragra and Liu, Joseph and Tadepalli, Pranav}, journal = {Bulletin of the AAS}, number = {6}, year = {2022}, month = jun, url = {https://baas.aas.org/pub/2022n6i201p02}, title = {Variable {Stars} in {M31} {Stellar} {Clusters} using the {Panchromatic} {Hubble} {Andromeda} {Treasury}}, volume = {54} }