Research & Engineering Projects
Selected projects demonstrating systems-building, algorithm implementation, and research prototyping. Listed in reverse chronological order. Full project list available on GitHub.
VocoType-linux · Dec 2025 – Present
High-performance offline Chinese speech recognition system with ML-powered post-processing and visual configuration
Key Features:
- End-to-end pipeline: audio capture, VAD, ASR model inference, and system integration
- Lightweight on-device post-processing models for punctuation restoration and text formatting
- Speech editing mode: real-time correction and insertion with voice commands
- Visual configuration interface for model selection, hotkey binding, and input method integration
- Optimized for low-latency offline operation without cloud dependencies
Stack: Python, PyTorch, ONNX Runtime, ALSA, Linux input methods
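As an illustration of the VAD stage in the pipeline above, here is a minimal energy-threshold sketch in Python. It is not the detector used in VocoType, which the description does not specify; the function names, the 160-sample frame length, and the threshold are assumptions chosen for the example.

```python
import math

def frame_rms(samples, frame_len):
    """Split samples into non-overlapping frames and return the RMS energy of each."""
    rms = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return rms

def speech_segments(samples, frame_len=160, threshold=0.1):
    """Return (start_sample, end_sample) spans whose frame RMS exceeds the threshold."""
    energies = frame_rms(samples, frame_len)
    segments = []
    seg_start = None
    for i, energy in enumerate(energies):
        if energy > threshold and seg_start is None:
            seg_start = i * frame_len          # speech begins at this frame
        elif energy <= threshold and seg_start is not None:
            segments.append((seg_start, i * frame_len))  # speech ended before this frame
            seg_start = None
    if seg_start is not None:                  # audio ended while still in speech
        segments.append((seg_start, len(energies) * frame_len))
    return segments
```

Only the spans returned by `speech_segments` would be forwarded to the ASR model, which is what keeps inference off silent audio.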

Gaussian SPH Fluid: Physics-integrated 3D Gaussians for SPH Fluid Dynamics · Winter 2025
Unified simulation–rendering pipeline that advances 3D Gaussians with a divergence-free SPH solver and renders them directly
My Contributions:
- Co-designed the DFSPH coupling that enforces incompressibility (constant-density & divergence-free constraints) on Gaussian particles each step
- Implemented uniform internal filling, which converts surface-biased 3DGS reconstructions into SPH-ready volumes via a smoothed opacity field over a uniform 3D grid
- Implemented an SPH ∇v–based implicit covariance update so anisotropic Gaussian shapes align with local flow during simulation
- Compared against PhysGaussian (MPM) on Synthetic-NeRF scenes (Materials, Hotdog, Ficus); the SPH-based pipeline preserves liquid-like coherence and density with a single point-based representation
Stack: C++, CUDA, OpenGL, 3D Gaussian Splatting, DFSPH
Course project: Computer 3D Graphics and Deep Learning · BlendED × NVIDIA
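To make the covariance update concrete, here is a simplified sketch in Python. It uses an explicit first-order step, Σ' = F Σ Fᵀ with F = I + Δt ∇v, rather than the implicit scheme the project implements, and it takes the velocity gradient as given instead of estimating it from SPH neighbors. All function names are hypothetical.

```python
def matmul(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    """Transpose a 3x3 matrix stored as nested lists."""
    return [[a[j][i] for j in range(3)] for i in range(3)]

def advect_covariance(cov, grad_v, dt):
    """Explicit first-order update: cov' = F cov F^T with F = I + dt * grad_v."""
    f = [[(1.0 if i == j else 0.0) + dt * grad_v[i][j] for j in range(3)]
         for i in range(3)]
    return matmul(matmul(f, cov), transpose(f))
```

Under this update a Gaussian stretches along directions where the flow is extending and stays symmetric by construction, which is the qualitative behavior the bullet above describes.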
Optimizing Transfer Learning for High-Accuracy Healthcare AI Under Data Scarcity · Summer 2025
Skin-lesion classifier on the ISIC dermatoscopic dataset, built on a self-supervised ViT-Base/16 (MAE) backbone with two-stage fine-tuning and post-hoc decision calibration
My Contributions:
- Two-stage training: linear probing on frozen MAE features (8 epochs), followed by end-to-end fine-tuning at a reduced learning rate (25 epochs)
- Class-imbalance handling via weighted random sampling (inverse class frequency) and AdamW with weight decay; automatic mixed precision for stability
- Post-hoc decision calibration: prior adjustment (logits − τ·log prior) plus class-specific bias to suppress systematic nevus over-prediction
- Standard dermatology augmentations (random crops, ±30° rotation, color jitter) on 224×224 normalized inputs
Stack: PyTorch, Vision Transformer (ViT-B/16), Masked Autoencoder, AdamW, AMP
Course project: BlendED
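The post-hoc calibration step can be sketched in a few lines of Python. This follows the formula stated above (logits − τ·log prior, plus a class-specific bias); the function names, the example priors, and the default τ are assumptions for illustration, not the values used in the project.

```python
import math

def calibrate_logits(logits, priors, tau=1.0, class_bias=None):
    """Subtract tau * log(prior) from each logit, then add an optional per-class bias."""
    if class_bias is None:
        class_bias = [0.0] * len(logits)
    return [z - tau * math.log(p) + b
            for z, p, b in zip(logits, priors, class_bias)]

def predict(logits, priors, tau=1.0, class_bias=None):
    """Return the index of the largest calibrated logit."""
    adjusted = calibrate_logits(logits, priors, tau, class_bias)
    return max(range(len(adjusted)), key=adjusted.__getitem__)
```

With τ = 0 the prediction is the plain argmax; raising τ shifts decisions away from frequent classes (such as nevus) toward rare ones, and a negative entry in `class_bias` suppresses a single class further.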
Mobile Robot Pose Estimation · Mar – Apr 2025
Point cloud-based localization module for robotic manipulation tasks
My Contributions:
- Implemented a multi-method point cloud denoising pipeline (statistical outlier removal, voxel downsampling)
- Integrated ICP-based pose refinement for robust localization under sensor noise
- Improved robustness for mobile manipulation in dynamic environments
Stack: Python, PCL, ROS
Course project: AI and Robotics for Mobile Robot Manipulation · Boston Dynamics collaboration
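The two denoising steps named above can be sketched in plain Python. The project itself uses PCL; this brute-force version is only meant to show the idea, and the function names and default parameters (k, std_ratio) are assumptions.

```python
import math
from collections import defaultdict

def voxel_downsample(points, voxel_size):
    """Replace all points that fall in the same voxel with their centroid."""
    buckets = defaultdict(list)
    for p in points:
        key = tuple(math.floor(c / voxel_size) for c in p)
        buckets[key].append(p)
    return [tuple(sum(axis) / len(pts) for axis in zip(*pts))
            for pts in buckets.values()]

def remove_statistical_outliers(points, k=3, std_ratio=1.0):
    """Drop points whose mean distance to their k nearest neighbors exceeds
    the global mean of that quantity by more than std_ratio standard deviations."""
    mean_dists = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        mean_dists.append(sum(dists[:k]) / k)
    mu = sum(mean_dists) / len(mean_dists)
    sigma = math.sqrt(sum((d - mu) ** 2 for d in mean_dists) / len(mean_dists))
    limit = mu + std_ratio * sigma
    return [p for p, d in zip(points, mean_dists) if d <= limit]
```

The neighbor search here is O(n²); a real pipeline would rely on a k-d tree, as PCL does, before handing the cleaned cloud to ICP.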
Autograd Engine & Neural Network in C++
From-scratch implementation of automatic differentiation and a neural network library
My Contributions:
- Reimplemented PyTorch-style autograd with computational graph construction and reverse-mode differentiation
- Built fully connected neural network layers, activation functions, and optimizers (SGD, Adam)
- Validated against standard benchmarks to ensure numerical correctness
Stack: C++17, modern template metaprogramming
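For readers unfamiliar with reverse-mode differentiation, the core idea fits in a short scalar sketch. It is written in Python for brevity and is not the project's C++ code; the `Value` class and its attribute names are hypothetical.

```python
class Value:
    """A scalar node in a computational graph that records how it was produced."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaf nodes have nothing to propagate

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        """Topologically sort the graph, then apply the chain rule from output to inputs."""
        order, seen = [], set()
        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)
        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()
```

For z = x·y + x, calling `z.backward()` accumulates dz/dx = y + 1 into `x.grad` and dz/dy = x into `y.grad`; gradients are summed with `+=` because a node such as x can feed several operations.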
Operating System Development · Spring 2023 & Summer 2024
Teaching-oriented OS kernel work across two course series, implementing core subsystems from scratch
My Contributions:
- NJUOS lab series (Spring 2023): process scheduling (round-robin, priority-based), virtual memory management (paging, demand paging)
- MIT xv6 lab series (Summer 2024): inode-based file system, basic POSIX system calls, additional kernel subsystems
- Hands-on practice across boot, scheduler, memory, and storage layers of a teaching kernel
Stack: C, x86 Assembly, QEMU
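The round-robin policy mentioned above can be shown with a small user-space simulation in Python. This is a sketch of the scheduling policy only, not kernel code from either lab; it assumes all processes arrive at time 0 and ignores context-switch cost, and the function name is hypothetical.

```python
from collections import deque

def round_robin(burst_times, quantum):
    """Return each process's completion time under round-robin scheduling.

    burst_times maps a process name to its total CPU demand in ticks.
    """
    remaining = dict(burst_times)
    queue = deque(burst_times)          # ready queue, in insertion order
    clock = 0
    finished = {}
    while queue:
        name = queue.popleft()
        run = min(quantum, remaining[name])
        clock += run
        remaining[name] -= run
        if remaining[name] == 0:
            finished[name] = clock      # process is done
        else:
            queue.append(name)          # preempted: back to the tail of the queue
    return finished
```

A priority-based variant would replace the single queue with one queue per priority level and always serve the highest non-empty level first.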
Retrieval-Augmented Generation (RAG) System · June 2023
Full-stack question-answering and document summarization system with a fine-tuned LLaMA backend
My Contributions:
- Fine-tuned a LLaMA model for domain-specific Q&A tasks
- Implemented a vector-database indexing pipeline for efficient document retrieval
- Built a Vue.js frontend and a Spring Boot backend with RESTful APIs
Stack: LLaMA, FAISS, Vue.js, Spring Boot, Python
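The retrieval step of a RAG system reduces to nearest-neighbor search over embeddings. The sketch below uses brute-force cosine similarity in plain Python as a stand-in for the FAISS index; the function names and the toy two-dimensional vectors are assumptions, and a real system would take its vectors from an embedding model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, doc_vecs, k=2):
    """Return the ids of the k documents most similar to the query."""
    scored = sorted(doc_vecs.items(),
                    key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]
```

The texts of the returned documents would then be placed in the LLaMA prompt as context for answering or summarizing.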