Research Projects
My research journey spans both robotics and AI security, exploring the intersection of intelligent systems, security, and privacy. Currently, I'm focusing on making robotic systems more capable through advanced imitation learning techniques while also investigating security vulnerabilities in AI systems.
UCLA Robot Intelligence Lab (URIL)
Directed by Professor Yuchen Cui, URIL focuses on imitation learning, reinforcement learning, and human-robot interaction for complex robotic systems.
Learning Where to Look: Gaze-Guided Active Perception for Long-Horizon Imitation Learning
Developing gaze-guided active perception methods to improve long-horizon imitation learning by inferring human visual intent and dynamically decomposing complex tasks.
Key Contributions:
- Utilized Meta's Project Aria glasses as a gaze-tracking system and collaborated on homography-based and LightGlue+RNN transformer methods for mapping the human's egocentric perspective to the robot's perspective
- Architected a probabilistic gaze prediction model using MDN, GMM, and UNet to infer human visual intent
- Trained a gaze-centric skill selector using LSTM that dynamically chooses robot skills at inference time
- Improved success rates on long-horizon tasks by 16% by implementing a foveated vision model
- Presented research at Meta's Research Summit for Egocentric Perception 2025 and UCLA Research Symposium
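To make the probabilistic gaze head concrete: a mixture-density output parameterizes a Gaussian mixture over image coordinates, from which candidate gaze points can be sampled. The sketch below is illustrative only; `sample_gaze`, its signature, and the isotropic-component assumption are mine, not the lab's implementation.

```python
import numpy as np

def sample_gaze(pi, mu, sigma, n=1, rng=None):
    """Sample predicted gaze points from a mixture-density head.

    pi    : (K,) mixture weights, summing to 1
    mu    : (K, 2) component means (x, y) in normalized image coords
    sigma : (K,) isotropic standard deviation per component
    """
    rng = np.random.default_rng() if rng is None else rng
    # Pick a mixture component per sample, then draw from its Gaussian.
    k = rng.choice(len(pi), size=n, p=pi)
    return mu[k] + rng.normal(size=(n, 2)) * sigma[k, None]
```

At inference time one can take either the mode of the dominant component or a set of samples like these as the inferred visual-intent targets.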
Technical Details:
- Foveated Vision Model: Applies Gaussian noise and reduced intensity to visual data outside the human gaze point, simulating foveated attention
- System Integration: Assembled URIL ACT ALOHA bimanual teleoperation hardware system with ROS+OpenCV real-time data pipeline
- Architecture Adaptation: Adapted ALOHA's ACT architecture and benchmarked SOTA IL policies (vanilla BC, Diffusion Policy, ACT)
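The foveated vision model above can be viewed as an image transform that leaves a region around the gaze point intact while dimming the periphery and corrupting it with Gaussian noise. Here is a minimal numpy sketch under those assumptions; the `foveate` function, radius, and linear falloff are illustrative choices, not the lab's exact implementation.

```python
import numpy as np

def foveate(image, gaze_xy, radius=40.0, noise_std=0.1, dim=0.5, rng=None):
    """Simulate foveated attention: pixels far from the gaze point are
    dimmed and corrupted with Gaussian noise; pixels near it are untouched.

    image   : float array (H, W, C) with values in [0, 1]
    gaze_xy : (x, y) gaze point in pixel coordinates
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    dist = np.sqrt((xs - gaze_xy[0]) ** 2 + (ys - gaze_xy[1]) ** 2)
    # Smooth periphery mask: 0 inside the fovea, ramping up to 1 outside.
    periphery = np.clip((dist - radius) / radius, 0.0, 1.0)[..., None]
    noise = rng.normal(0.0, noise_std, size=image.shape)
    out = image * (1.0 - (1.0 - dim) * periphery) + noise * periphery
    return np.clip(out, 0.0, 1.0)
```

Feeding such foveated frames to the policy concentrates its visual capacity on the gaze region, which is the intuition behind the 16% success-rate gain reported above.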
Multi-sensory Electronics-integrated Robotic Limb for Intelligent manipulation (MERLIN)
Co-initiating a collaboration between URIL and RoMeLa to develop a high-quality dexterous-hand data-collection system with joint sensors, cameras, and tactile sensing.
Key Contributions:
- Co-initiated MERLIN as a collaboration between URIL and RoMeLa (Robotics Mechanism Laboratory)
- Co-designed a high-quality dexterous-hand data-collection glove with joint sensors, cameras, and tactile sensing
- Integrated and extended Python APIs on the ROHand platform to replay and visualize data from the prototype glove
- Collecting dexterous hand demonstrations with MERLIN and DexUMI and training imitation policies to compare performance across data-collection modalities
Project Goals:
- Create comprehensive dataset of dexterous manipulation tasks
- Compare performance of different data-collection modalities (MERLIN vs DexUMI)
- Develop robust imitation learning policies for complex manipulation tasks
UCLA Security and Privacy Lab
Directed by Professor Yuan Tian, the lab focuses on developing secure intelligent systems and investigating vulnerabilities in AI/ML systems, particularly adversarial attacks and LLM security.
Adversarial Attack and Defense on Images
Investigating adversarial attacks on image classification and text-to-image generation models, and developing defense mechanisms to protect against malicious manipulations.
Key Contributions:
- Tested gradient-matching and gradient-ascent algorithms to induce incorrect predictions from image classification models
- Researched the effects of Encoder and Diffusion Attack on text-to-image generation models
- Trained a protection model that adds a 1.6% perturbation to an image, preventing malicious generation based on the protected image
Research Focus:
- Attack Methods: Gradient-based attacks on classification models, encoder/diffusion attacks on generative models
- Defense Strategies: Adversarial training, input perturbation for protection
- Impact: Preventing malicious content generation in text-to-image systems
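To make the gradient-ascent idea concrete, here is a minimal fast-gradient-sign-style attack on a toy logistic-regression classifier: one ascent step on the loss with respect to the input is enough to flip the prediction. This is an illustrative sketch of the general technique, not the attack code used in the project.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_attack(x, y, w, b, eps):
    """One fast-gradient-sign step on a logistic-regression classifier:
    perturb the input in the direction that increases the loss.

    x : input features, y : true label in {0, 1}
    w, b : model weights and bias, eps : perturbation budget
    """
    p = sigmoid(x @ w + b)            # predicted P(y=1 | x)
    grad_x = (p - y) * w              # d(binary cross-entropy)/dx
    return x + eps * np.sign(grad_x)  # ascend the loss
```

The same principle scales to deep classifiers by backpropagating the loss gradient to the input pixels; the defense direction inverts it, adding a small optimized perturbation that degrades an attacker's objective instead.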
Security Flaws in Translation: A Study of Prompt Injection in LLM Translators
Conducting security evaluation of prompt injection vulnerabilities in LLM-based translation systems and developing detection/defense mechanisms.
Key Contributions:
- Conducted a security evaluation of prompt injection vulnerabilities in RedNote's LLM-based translation mode
- Engineered automated data collection scrapers to gather injection prompts from comments on test posts
- Developed a Gemini-based analysis pipeline that uses three-quarters of the manually labeled data as few-shot examples for prompt analysis
- Investigated detection and defense mechanisms by performing attention analysis on malicious prompts
Methodology:
- Data Collection: Automated scraping of injection prompts from social media/test platforms
- Analysis Pipeline: Gemini-based system for few-shot prompt analysis and classification
- Defense Research: Attention analysis to understand and mitigate prompt injection attacks
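The few-shot setup above (three-quarters of the manually labeled comments as demonstrations, the rest held out for evaluation) can be sketched as a prompt-construction step. The actual Gemini API call is omitted, and `build_fewshot_prompt`, the label names, and the prompt wording are all hypothetical.

```python
import random

def build_fewshot_prompt(labeled, query, demo_frac=0.75, seed=0):
    """Split manually labeled comments into few-shot demonstrations
    (demo_frac of the data) and a held-out evaluation set, then format
    a classification prompt for an LLM such as Gemini.

    labeled : list of (comment_text, label) pairs,
              with label in {"benign", "injection"}
    query   : the comment to classify
    """
    rng = random.Random(seed)
    shuffled = labeled[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * demo_frac)
    demos, held_out = shuffled[:cut], shuffled[cut:]
    lines = ["Classify each comment as 'benign' or 'injection'.", ""]
    for text, label in demos:
        lines.append(f"Comment: {text}\nLabel: {label}\n")
    lines.append(f"Comment: {query}\nLabel:")
    return "\n".join(lines), held_out
```

Keeping the held-out quarter separate from the demonstrations is what allows the pipeline's classification accuracy to be measured without contaminating the few-shot context.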
Publications
A robust and novel semantic segmentation deep neural network for robotic surgery vision with a single RGB camera
In Third International Conference on Artificial Intelligence and Computer Engineering (ICAICE 2022) (Vol. 12610, pp. 73-77). SPIE.