Research Projects

My research journey spans both robotics and AI security, exploring the intersection of intelligent systems, security, and privacy. Currently, I'm focusing on making robotic systems more capable through advanced imitation learning techniques while also investigating security vulnerabilities in AI systems.

UCLA Robot Intelligence Lab (URIL)

Directed by Professor Yuchen Cui, URIL focuses on imitation learning, reinforcement learning, and human-robot interaction for complex robotic systems.

Learning where to look: Gaze-Guided Active Perception for Long-Horizon Imitation Learning

Imitation Learning Gaze Tracking Computer Vision Robotics

Developing gaze-guided active perception methods to improve long-horizon imitation learning by inferring human visual intent and dynamically decomposing complex tasks.

Key Contributions:

  • Utilized Meta Aria as a gaze tracking system and collaborated on homography-based and LightGlue+RNN transformer methods for translating human perspective to machine perspective
  • Architected a probabilistic gaze prediction model using MDN, GMM, and UNet to infer human visual intent
  • Trained a gaze-centric skill selector using LSTM that dynamically chooses robot skills at inference time
  • Improved long-horizon tasks' success rates by 16% through foveated vision model implementation
  • Presented research at Meta's Research Summit for Egocentric Perception 2025 and UCLA Research Symposium

Technical Details:

  • Foveated Vision Model: Applies Gaussian noise and reduces intensity to visual data outside human gaze point to simulate attention
  • System Integration: Assembled URIL ACT ALOHA bimanual teleoperation hardware system with ROS+OpenCV real-time data pipeline
  • Architecture Adaptation: Adapted ALOHA's ACT Architecture and benchmarked SOTA IL policies (Vanilla BC, Diffusion Policy, ACT)

Multi-sensory Electronics-integrated Robotic Limb for Intelligent manipulation (MERLIN)

Dexterous Manipulation Sensor Integration Collaborative Robotics Data Collection

Co-initiating a collaboration between URIL and RoMeLa to develop a high-quality dexterous-hand data-collection system with joint sensors, cameras, and tactile sensing.

Key Contributions:

  • Co-initiated MERLIN as a collaboration between URIL and RoMeLa (Robotics Mechanism Laboratory)
  • Co-designed a high-quality dexterous-hand data-collection glove with joint sensors, cameras, and tactile sensing
  • Integrated and extended Python APIs on ROHand platform to replay and visualize data from the prototype glove
  • Collecting dexterous hand demonstrations with MERLIN and DexUMI and training imitation policies to compare performance across data-collection modalities

Project Goals:

  • Create comprehensive dataset of dexterous manipulation tasks
  • Compare performance of different data-collection modalities (MERLIN vs DexUMI)
  • Develop robust imitation learning policies for complex manipulation tasks

UCLA Security and Privacy Lab

Directed by Prof. Yuan Tian Focusing on developing secure intelligent systems and investigating vulnerabilities in AI/ML systems, particularly in the context of adversarial attacks and LLM security.

Adversarial Attack and Defense on Images

Adversarial ML Computer Vision Security Defense Mechanisms

Investigating adversarial attacks on image classification and text-to-image generation models, and developing defense mechanisms to protect against malicious manipulations.

Key Contributions:

  • Tested gradient matching & ascent algorithms to introduce unwanted results for image classification models
  • Researched the effects of Encoder and Diffusion Attack on text-to-image generation models
  • Trained a protection model adding 1.6% perturbation to an image and prevented malicious generation on the targeted image

Research Focus:

  • Attack Methods: Gradient-based attacks on classification models, encoder/diffusion attacks on generative models
  • Defense Strategies: Adversarial training, input perturbation for protection
  • Impact: Preventing malicious content generation in text-to-image systems

Security Flaws in Translation: A Study of Prompt Injection in LLM Translators

LLM Security Prompt Injection Vulnerability Analysis Defense Mechanisms

Conducting security evaluation of prompt injection vulnerabilities in LLM-based translation systems and developing detection/defense mechanisms.

Key Contributions:

  • Conducted a security evaluation of prompt injection vulnerabilities in RedNote's LLM-based translation mode
  • Engineered automated data collection scrapers to gather injection prompts from comments on test posts
  • Developed a Gemini-based analysis pipeline utilizing 3/4 of manual data for few-shot prompt analysis
  • Investigated detection and defense mechanisms by performing attention analysis on malicious prompts

Methodology:

  • Data Collection: Automated scraping of injection prompts from social media/test platforms
  • Analysis Pipeline: Gemini-based system for few-shot prompt analysis and classification
  • Defense Research: Attention analysis to understand and mitigate prompt injection attacks

Publications

2023

A robust and novel semantic segmentation deep neural network for robotic surgery vision with a single RGB camera

Shi, Y.

In Third International Conference on Artificial Intelligence and Computer Engineering (ICAICE 2022) (Vol. 12610, pp. 73-77). SPIE.