Research
Research Focus
My research aims to enhance reasoning reliability and interpretability in AI systems by integrating symbolic methods with neural learning. I explore how formal reasoning, structured knowledge, and interactive verification can address fundamental limitations in language models and autonomous agents.
Core Interests: Neuro-Symbolic AI · LLM Reasoning · Reinforcement Learning for Reasoning · Formal Verification · Interactive Agents
Ongoing Research
Reasoning-Based WebAgents with Fine-Grained Reinforcement Learning
Ludwig Maximilian University of Munich · Advisor: Dr. Yao Zhang · Nov 2025 - Present
- Problem: Web agents struggle with multi-step reasoning tasks requiring perception, planning, and grounded decision-making
- Approach: Designing fine-grained RL frameworks that decompose complex web tasks into learnable sub-goals with structured reward signals (see the sketch after this list)
- Methods: Novel reward modeling mechanisms for training stability; image-grounded reasoning modules for multimodal perception
- Deliverables: Prototype agent system integrating vision-language models with RL-based task decomposition
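A minimal sketch of the sub-goal decomposition idea, assuming a dictionary-style page state; the names (`SubGoal`, `step_reward`) and the example task are illustrative, not the project's actual code:

```python
# Hypothetical sketch: per-step reward shaping over decomposed sub-goals.
from dataclasses import dataclass


@dataclass
class SubGoal:
    """One learnable unit of a decomposed web task."""
    name: str
    check: callable        # predicate over the current page state
    reward: float = 1.0    # credit granted once, on first completion
    done: bool = False


def step_reward(subgoals, page_state, step_penalty=-0.01):
    """Dense per-step reward: small time penalty plus credit for newly met sub-goals."""
    r = step_penalty
    for g in subgoals:
        if not g.done and g.check(page_state):
            g.done = True
            r += g.reward
    return r


# Example decomposition of a shopping task:
subgoals = [
    SubGoal("search_issued", lambda s: s.get("query") == "usb cable"),
    SubGoal("sorted_by_price", lambda s: s.get("sort") == "price_asc"),
    SubGoal("item_in_cart", lambda s: s.get("cart_size", 0) > 0, reward=2.0),
]
print(step_reward(subgoals, {"query": "usb cable"}))  # -0.01 + 1.0 = 0.99
```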
Neuro-Symbolic Framework for Adaptive Logical Reasoning
KRistal Group, Nanjing University · Advisor: Prof. YiZheng Zhao · Oct 2025 - Present
- Problem: LLMs perform poorly on tasks requiring strict logical consistency and compositional generalization
- Approach: Building a framework that combines logical semantics with differentiable neural modules to enable end-to-end learning while preserving logical structure
- Methods: Meta-learning-based reasoning strategies; fuzzy logic engine balancing faithfulness with computational efficiency (illustrated after this list)
- Deliverables: Modular reasoning architecture supporting both neural adaptation and symbolic constraints
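To make the neural-symbolic combination concrete, here is one standard construction for a differentiable fuzzy-logic layer using the product t-norm; it illustrates the general technique, not the group's actual engine:

```python
# Differentiable fuzzy logic under the product t-norm: logical operators
# become smooth functions, so rule satisfaction can serve as a loss term.
import torch


def f_and(a, b):
    return a * b                  # t-norm: fuzzy conjunction

def f_or(a, b):
    return a + b - a * b          # t-conorm: fuzzy disjunction

def f_not(a):
    return 1.0 - a                # fuzzy negation

def implies(a, b):
    return f_or(f_not(a), b)      # a -> b  as  (not a) or b


# Truth values come from a neural module, so gradients flow through the rule.
logits = torch.randn(2, requires_grad=True)
p, q = torch.sigmoid(logits)         # soft truth values in (0, 1)
rule_satisfaction = implies(p, q)    # differentiable logical constraint
loss = 1.0 - rule_satisfaction       # penalize violated rules
loss.backward()
```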
Reasoning-Enhanced Reward Models for Preference Alignment
Independent Research · Advisor: Dr. Zhen Han · Jul 2025 - Present
- Problem: Standard reward models for RLHF struggle to capture reasoning quality beyond surface-level coherence
- Approach: Integrating reasoning-specific signals into reward modeling to improve alignment on complex reasoning tasks
- Methods: Pipeline combining rejection sampling, supervised fine-tuning (SFT), and reinforcement learning for scalable preference-data generation (sketched below)
- Status: Conducting experiments on reasoning-guided reward modeling; manuscript in preparation
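A minimal sketch of the data-generation stage, where `generate` and `score` are placeholders for the policy LLM and the reward model rather than real APIs:

```python
# Rejection sampling for preference data: sample N candidates per prompt,
# score them, keep the best for SFT and (best, worst) pairs for RL.
import random


def generate(prompt, n=8):
    """Placeholder for LLM sampling."""
    return [f"{prompt} :: candidate {i}" for i in range(n)]

def score(prompt, response):
    """Placeholder for a learned reward model."""
    return random.random()

def build_data(prompts, n=8, sft_threshold=0.8):
    sft_data, pref_pairs = [], []
    for p in prompts:
        scored = sorted(((score(p, c), c) for c in generate(p, n)), reverse=True)
        (best_score, best), (_, worst) = scored[0], scored[-1]
        if best_score >= sft_threshold:
            sft_data.append((p, best))         # keep only high-reward responses
        pref_pairs.append((p, best, worst))    # (prompt, chosen, rejected)
    return sft_data, pref_pairs
```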
Past Research Experience
Interactive Theorem Proving with LLMs and Lean4
ScaleML Lab, UIUC · Advisor: Prof. Tong Zhang · Apr - Jun 2025
- Problem: LLMs can propose plausible proof steps but lack formal verification, limiting their reliability for mathematical reasoning
- Approach: Built a prototype integrating Lean4 formal proof assistant with LLMs for interactive theorem proving on MiniF2F benchmark
- Methods: Designed bidirectional communication pipeline (LLM ↔ Lean4) with proof-state serialization; implemented closed-loop refinement where Lean4 verifies LLM-proposed tactics (skeleton shown after this list)
- Artifacts: Working prototype system; analysis of common failure modes (context violations, invalid step proposals) informing interface design
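The core refinement loop, in skeleton form; `propose_tactic` and `check_tactic` are placeholders for the actual LLM call and Lean4 interaction (e.g. over a REPL subprocess):

```python
# Closed-loop refinement: the LLM proposes a tactic, Lean4 verifies it,
# and failures are fed back into the next prompt as context.

def propose_tactic(goal, history):
    """Placeholder: prompt the LLM with the serialized goal plus past failures."""
    raise NotImplementedError

def check_tactic(goal, tactic):
    """Placeholder: run Lean4 on the tactic; return (remaining_goal, error).

    remaining_goal is None once the proof is closed; error is None on success.
    """
    raise NotImplementedError

def prove(goal, max_steps=32):
    history = []
    for _ in range(max_steps):
        tactic = propose_tactic(goal, history)
        remaining, error = check_tactic(goal, tactic)
        if error is not None:            # invalid step: record feedback, retry
            history.append(f"{tactic} failed: {error}")
            continue
        if remaining is None:            # all goals closed: proof found
            return True
        goal, history = remaining, []    # verified step: advance proof state
    return False
```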
Benchmarking System for HarmonyOS Intelligent Agents
Huawei 2012 Labs · Supervisor: JianFeng Gui · Jul - Sep 2025
- Problem: Mobile OS agents lacked systematic evaluation of reasoning and adaptability across diverse tasks
- Contributions: Co-developed benchmarking infrastructure from scratch (evaluation loop sketched after this list); contributed to the IntelliOS-agent pipeline
- Methods: Integrated HDC debugging tools with LLM-based reasoning modules; ported Python dependency libraries to HarmonyOS environment
- Deployment: System deployed in Huawei's internal IntelliOS project for agent evaluation
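A generic sketch of the task-level evaluation loop behind such a benchmark; device control (via HDC or otherwise) is hidden behind `run_agent`, a placeholder rather than Huawei's actual interface:

```python
# Task-suite runner: execute each task, record success and latency,
# and aggregate suite-level metrics.
import time
from statistics import mean


def run_agent(task):
    """Placeholder: drive the on-device agent and report task success."""
    raise NotImplementedError

def evaluate(tasks):
    results = []
    for task in tasks:
        start = time.monotonic()
        ok = bool(run_agent(task))
        results.append({"task": task["name"], "success": ok,
                        "latency_s": time.monotonic() - start})
    return {"success_rate": mean(r["success"] for r in results),
            "avg_latency_s": mean(r["latency_s"] for r in results),
            "per_task": results}
```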
Quantum Memory Architectures for Machine Learning
QUEST Lab, NC State University · Advisor: Prof. Yuan Liu · Jul - Nov 2024
- Problem: Quantum computing hardware for ML workloads lacks optimized memory architectures tailored to quantum-classical hybrid execution
- Approach: Explored quantum memory designs specifically for quantum machine learning algorithms
- Contributions: Proposed optimized computational architecture for ML workloads on quantum systems
- Status: Co-first author manuscript submitted to ISCA 2025
Adversarial Backdoors in Machine Learning Models
COSEC Research Group, Nanjing University · Advisors: Prof. Yuan Zhang, Prof. Sheng Zhong · Jul 2023 - Dec 2024
- Problem: Understanding and defending against backdoor attacks in neural network training pipelines
- Contributions: Proposed a novel exploit mechanism for backdoor injection; designed attack experiments simulating malicious training scenarios (a textbook-style illustration follows this list)
- Impact: Work contributed to group's broader research on ML robustness and trustworthiness
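For context, a textbook-style (BadNets-like) data-poisoning example of how a backdoor can be injected; the project's own exploit mechanism is not reproduced here, and the array shapes assume grayscale images scaled to [0, 1]:

```python
# Classic data-poisoning backdoor: stamp a trigger patch on a small
# fraction of training images and relabel them to an attacker-chosen class.
import numpy as np


def poison(images, labels, target_class, rate=0.05, seed=0):
    """images: (N, H, W) floats in [0, 1]; labels: (N,) ints."""
    rng = np.random.default_rng(seed)
    x, y = images.copy(), labels.copy()
    idx = rng.choice(len(x), size=int(rate * len(x)), replace=False)
    x[idx, -3:, -3:] = 1.0        # 3x3 white trigger in the bottom-right corner
    y[idx] = target_class         # attacker-chosen label
    return x, y

# A model trained on the poisoned set behaves normally on clean inputs but
# predicts target_class whenever the trigger patch is present.
```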