Hello there!
I'm Tianyi (Lorena) Yan. Contact: tianyiy `at` usc `dot` edu

I’m currently a fourth-year undergraduate student (Computer Science B.S.) at USC Viterbi.

My broad research interests lie in natural language processing. I am motivated to develop more robust, generalizable, and controllable large language models. I am grateful to be advised by Professor Muhao Chen at LUKA Lab. I have also had the pleasure of working with Professor Hao Zhou at the Institute for AI Industry Research (AIR), Tsinghua University, on target-aware molecule generation.

With persistence and self-motivation, I am eager to enhance the reliability and efficiency of human-machine interaction and bring vision to reality.

🗃 Research

Contrastive Instruction Tuning (Paper under submission)

Research Assistant @ LUKA lab, Supervisor: Prof. Muhao Chen

  • Tianyi Yan, Fei Wang, James Y. Huang, Wenxuan Zhou, Fan Yin, Aram Galstyan, Wenpeng Yin, Muhao Chen
  • Leveraged contrastive learning to make large language models (LLMs) more robust to instruction perturbation by maximizing the similarity between hidden representations of semantically equivalent instruction-input pairs
  • Consistently improved LLMs’ robustness to variations in instructions across character, word, sentence, and semantic levels, with an average gain of +2.5% accuracy
  • Paper link: https://arxiv.org/abs/2402.11138

Robust Natural Language Understanding with Residual Attention Debiasing

Research Assistant @ LUKA lab, Supervisor: Prof. Muhao Chen

  • Fei Wang*, James Y. Huang*, Tianyi Yan, Wenxuan Zhou, Muhao Chen
  • Developed one-stage product-of-experts and residual attention learning techniques to mitigate biases in NLU models by assembling predictions and low-level attention scores
  • Employed PyTorch and Hugging Face to adapt the BERT model framework, visualized attention score distributions, and achieved a 0.25% reduction in attention to potentially biased tokens
  • Significantly enhanced the model’s performance on out-of-distribution datasets (HANS, FEVER-Symmetric, PAWS) with improvements of 12.9%, 11.0%, and 2.7%, respectively
  • Accepted at ACL 2023 [Paper] [Code]

Latent Diffusion for Target-Specific Molecule Generation (Paper in preparation)

Research Assistant @ AIR, Tsinghua University, Supervisor: Prof. Hao Zhou

  • Implemented an end-to-end pipeline that jointly trained an EGNN-based variational autoencoder (VAE) and diffusion models to enhance the affinity of generated ligands given protein pockets
  • Pretrained an unconditional VAE with reconstruction and KL divergence losses on a large-scale ligand-only generation dataset to address the scarcity of paired pocket-ligand data

Target-specific Molecule Generation and Docking (Paper in preparation)

Research Assistant @ AIR, Tsinghua University, Supervisor: Prof. Hao Zhou

  • Used PyTorch to implement a diffusion-based model that iteratively generates ligands based on the current docking position and uses grid search to identify optimal docking positions
  • Conducted pilot studies for sampling and finding better docking positions in given proteins for docking 250k ZINC molecules

💻 Work Experience

June 2022 – August 2022
SWE Student Explore Internship
Microsoft M365 Deployment
  • Applied React to add a dashboard page for centralizing all global deployment issues from scattered alert emails and visualizing them in multiple dimensions
  • Monitored and collected real-time issue data from the deployment workflow into a Cosmos NoSQL database
  • Used C# and ASP.NET to create an endpoint that filters issues based on threshold values at runtime and forwards the data to the frontend
May 2021 – January 2022
Full Stack Software Engineering Intern
Talkilla Education
  • Used Vue.js and JSP to add a frontend module for employed teachers to create student reports
  • Added automatic email notifications and English-to-Chinese translation for sending and editing reports
  • Added a module for sending automatic notifications via WeChat Service Account when student reports are available
  • Applied Java Spring Framework, Maven, MySQL, and MyBatis to develop the backend
  • Used Jenkins and GitLab for deployment

🗃 Projects

April 2023

Textual Dialog-Based Meme Recommendation

2022-2023 Google Research ExploreCSR

  • Implemented contrastive learning using PyTorch and Hugging Face to align hidden representations of text and image inputs from fine-tuned ViLT and EmoRoBERTa models based on emotion
  • Jointly trained models with cross-entropy loss for sentiment classification and contrastive loss
  • Used Pandas to create a multimodal dataset containing 8,000 tweets with emotion keywords and associated images
  • Fine-tuned the ViLT model on the Memotion dataset for meme sentiment classification to gain domain knowledge

Github
August 2021 – August 2022

Ctrl-F

Role: Cofounder, Leader, Developer

  • Developed an online platform for USC that lets students and professors centralize on-campus research and internship opportunities from different departments
  • Incorporated Java multithreading to enable real-time tracking and updating of position application status
  • Utilized Vue.js, Webpack, and Axios for frontend; Java Spring, Maven, and Firebase for backend
  • Led the team, orchestrated weekly meetings, and won the $1,000 USC ABC Innovation First Prize

Github

🏫 Teaching and Volunteer

January 2022 – Present
Teaching Assistant at USC Viterbi School of Engineering
Viterbi School of Engineering @ University of Southern California
  • Teaching Assistant for CSCI467 (Intro to Machine Learning) taught by Prof. Robin Jia
  • Teaching Assistant for CSCI270 (Intro to Algorithms and Theory of Computations) for Prof. David Kempe and Prof. Shahriar Shamsian
  • Held office hours and discussion sections, and monitored the online forum to answer students’ questions
  • Assisted professors in designing and grading homework and exams
  • Explained challenging topics in machine learning (transformers, expectation maximization, etc.) and algorithms (DP, NP-hardness reductions, etc.), and guided students to correct answers
February 2021 – May 2023
Leader of Mentorship Program
USC Chinese Students and Scholars Association (CSSA) Career Development
  • Organized a resume and interview workshop that helps USC undergraduates find juniors, seniors, or graduates as tutors for advice
  • Led the team and coordinated various resources: planning proposals, promotional articles, information sessions, mixers, etc.
  • Helped organize large-scale activities such as the 2022 California Chinese Entrepreneurship Conference and the Mega Hackathon Web 3 competition

📚 Education

August 2020 – Present (Exp. May 2024)
University of Southern California
- Cumulative GPA: 3.98/4.00
- Major: Computer Science B.S.
- Coursework: Natural Language Processing, Machine Learning, Probability Theory, Statistics, Calculus, Computing Algorithms, Data Structures, Internetworking

🎼 Personal

  • Languages: Mandarin, English
  • Interests: Cooking, Flute (played for 8 years!), Piano, Volleyball, Gaming

Contact

Email

Work-related: tianyiy `at` usc `dot` edu
Others: lorena `dot` yantianyi1020 `at` gmail `dot` com

Location

Los Angeles, California