Research & Engineering Projects

Selected projects demonstrating systems-building, algorithm implementation, and research prototyping. Listed in reverse chronological order. Full project list available on GitHub.


VocoType-linux · Dec 2025 – Present

High-performance offline Chinese speech recognition system with ML-powered post-processing and visual configuration

Key Features:

  • End-to-end pipeline: audio capture, VAD, ASR model inference, and system integration
  • Lightweight on-device post-processing models for punctuation restoration and text formatting
  • Speech editing mode: real-time correction and insertion with voice commands
  • Visual configuration interface for model selection, hotkey binding, and input method integration
  • Optimized for low-latency offline operation without cloud dependencies
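
The VAD stage of the pipeline above can be illustrated with a simple energy threshold. This is a minimal sketch, not the project's actual detector (the source does not say which VAD is used); the function names, frame length, threshold, and hangover values are all assumptions chosen for illustration.

```python
import math

def frame_rms(samples, frame_len):
    """Split samples into non-overlapping frames and return the RMS of each."""
    rms = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms.append(math.sqrt(sum(s * s for s in frame) / frame_len))
    return rms

def detect_speech(samples, frame_len=160, threshold=0.1, hangover=2):
    """Return (start_frame, end_frame) pairs, end exclusive, flagged as speech.

    A segment closes once more than `hangover` consecutive frames fall
    below `threshold`. All defaults are illustrative, not tuned values.
    """
    energies = frame_rms(samples, frame_len)
    segments = []
    start = None
    silent_run = 0
    for i, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run > hangover:
                # End the segment at the first quiet frame of this run.
                segments.append((start, i - silent_run + 1))
                start = None
                silent_run = 0
    if start is not None:
        segments.append((start, len(energies) - silent_run))
    return segments

# Synthetic input: 5 silent frames, 4 frames of a 440 Hz tone, 5 silent frames.
sample_rate = 16000  # assumed capture rate
silence = [0.0] * (160 * 5)
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(160 * 4)]
segments = detect_speech(silence + tone + silence)
print(segments)
```

In a real pipeline the detected frame ranges would be converted back to sample offsets and handed to the ASR model, so that inference only runs on audio that likely contains speech.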

Stack: Python, PyTorch, ONNX Runtime, ALSA, Linux input methods

Figure: VocoType-linux demo screenshot

Gaussian SPH Fluid: Physics-integrated 3D Gaussians for SPH Fluid Dynamics · Winter 2025

Unified simulation–rendering pipeline that advances 3D Gaussians with a divergence-free SPH solver and renders them directly

My Contributions:

  • Co-designed the DFSPH coupling that enforces incompressibility (constant-density & divergence-free constraints) on Gaussian particles each step
  • Implemented uniform internal filling — converts surface-biased 3DGS into SPH-ready volumes via a smoothed opacity field over a uniform 3D grid
  • Implemented an SPH ∇v–based implicit covariance update so anisotropic Gaussian shapes align with local flow during simulation
  • Compared against PhysGaussian (MPM) on Synthetic-NeRF scenes (Materials, Hotdog, Ficus) — preserves liquid-like coherence and density with a single point-based representation
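
The idea behind the ∇v-based covariance update can be sketched as follows. Note the assumptions: the project describes an implicit update, while this sketch uses the simpler explicit first-order form F = I + dt·∇v and Σ' = F Σ Fᵀ; the helper names and the sample velocity gradient are hypothetical, and the project itself is written in C++/CUDA rather than Python.

```python
def matmul(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    """Transpose a 3x3 matrix."""
    return [[a[j][i] for j in range(3)] for i in range(3)]

def advance_covariance(cov, grad_v, dt):
    """Explicit first-order covariance update.

    F = I + dt * grad_v approximates the local deformation over one step,
    and the Gaussian covariance is transported as F @ cov @ F^T.
    """
    F = [[(1.0 if i == j else 0.0) + dt * grad_v[i][j] for j in range(3)]
         for i in range(3)]
    return matmul(matmul(F, cov), transpose(F))

# An isotropic Gaussian in a flow that stretches x and compresses y.
# The velocity gradient is trace-free, matching a divergence-free field.
cov = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
grad_v = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 0.0]]
new_cov = advance_covariance(cov, grad_v, dt=0.1)
print(new_cov)
```

Because the update has the form F Σ Fᵀ, a symmetric positive semi-definite covariance stays symmetric positive semi-definite, which keeps the Gaussian renderable after each step.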

Stack: C++, CUDA, OpenGL, 3D Gaussian Splatting, DFSPH

Course project: Computer 3D Graphics and Deep Learning · BlendED × NVIDIA

Optimizing Transfer Learning for High-Accuracy Healthcare AI Under Data Scarcity · Summer 2025

Skin-lesion classifier on the ISIC dermatoscopic dataset, built on a self-supervised ViT-Base/16 (MAE) backbone with two-stage fine-tuning and post-hoc decision calibration

My Contributions:

  • Two-stage training: linear probing on frozen MAE features (8 epochs) followed by end-to-end fine-tuning at a reduced learning rate (25 epochs)
  • Class-imbalance handling via weighted random sampling (inverse class frequency); optimization with AdamW (with weight decay) and automatic mixed precision (AMP)
  • Post-hoc decision calibration: prior adjustment (logits − τ·log prior) plus class-specific bias to suppress systematic nevus over-prediction
  • Standard dermatology augmentations (random crops, ±30° rotation, color jitter) on 224×224 normalized inputs
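
The sampling weights and the post-hoc calibration rule above can be sketched in plain Python. This is a minimal illustration under assumptions: the two-class setup, the priors, the logits, and the function names are hypothetical, and the real pipeline operates on PyTorch tensors over the full ISIC label set.

```python
import math
from collections import Counter

def inverse_frequency_weights(labels):
    """Per-sample sampling weight equal to 1 / (count of that sample's class)."""
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]

def adjust_logits(logits, priors, tau=1.0, bias=None):
    """Apply logits - tau * log(prior) plus an optional per-class bias."""
    bias = bias if bias is not None else [0.0] * len(logits)
    return [z - tau * math.log(p) + b for z, p, b in zip(logits, priors, bias)]

def predict(logits, priors, tau=1.0, bias=None):
    """Return the index of the largest adjusted logit."""
    adjusted = adjust_logits(logits, priors, tau, bias)
    return max(range(len(adjusted)), key=adjusted.__getitem__)

# Hypothetical two-class case: index 0 = nevus (common), index 1 = melanoma (rare).
priors = [0.8, 0.2]
logits = [2.0, 1.5]

raw_prediction = predict(logits, priors, tau=0.0)       # no adjustment
adjusted_prediction = predict(logits, priors, tau=1.0)  # prior-corrected
weights = inverse_frequency_weights([0, 0, 0, 1])
print(raw_prediction, adjusted_prediction, weights)
```

Subtracting τ·log prior raises the score of rare classes relative to common ones, so a borderline case that the raw model assigns to the majority class can flip to the minority class. A negative bias entry on the majority class pushes further in the same direction.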

Stack: PyTorch, Vision Transformer (ViT-B/16), Masked Autoencoder, AdamW, AMP

Course project: BlendED

Mobile Robot Pose Estimation · Mar – Apr 2025

Point cloud-based localization module for robotic manipulation tasks

My Contributions:

  • Implemented a point cloud denoising pipeline combining statistical outlier removal and voxel downsampling
  • Integrated ICP-based pose refinement for robust localization under sensor noise
  • Improved robustness for mobile manipulation in dynamic environments
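
The two denoising steps named above can be sketched without PCL. This is a brute-force illustration, not the project's code: the function names, the `k`, `std_ratio`, and `voxel_size` values, and the toy point cloud are assumptions, and a real pipeline would use PCL's filters with a spatial index instead of the O(n²) neighbour search here.

```python
import math
import statistics
from collections import defaultdict

def statistical_outlier_removal(points, k=3, std_ratio=1.0):
    """Drop points whose mean distance to their k nearest neighbours exceeds
    the global mean of that quantity by more than std_ratio standard deviations."""
    mean_dists = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        mean_dists.append(sum(dists[:k]) / k)
    limit = statistics.mean(mean_dists) + std_ratio * statistics.pstdev(mean_dists)
    return [p for p, d in zip(points, mean_dists) if d <= limit]

def voxel_downsample(points, voxel_size):
    """Replace all points that fall in the same voxel with their centroid."""
    buckets = defaultdict(list)
    for p in points:
        key = tuple(math.floor(c / voxel_size) for c in p)
        buckets[key].append(p)
    return [tuple(sum(axis) / len(pts) for axis in zip(*pts))
            for pts in buckets.values()]

# Toy cloud: the 8 corners of a unit cube plus one far-away outlier.
cloud = [(x, y, z) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
cloud.append((100.0, 100.0, 100.0))

clean = statistical_outlier_removal(cloud, k=3, std_ratio=1.0)
reduced = voxel_downsample(clean, voxel_size=2.0)
print(len(clean), reduced)
```

Running outlier removal before downsampling keeps stray points from shifting voxel centroids, which matters because ICP refinement is sensitive to spurious correspondences.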

Stack: Python, PCL, ROS

Course project: AI and Robotics for Mobile Robot Manipulation · Boston Dynamics collaboration

Autograd Engine & Neural Network in C++

From-scratch implementation of automatic differentiation and a neural network library

My Contributions:

  • Reimplemented PyTorch-style autograd with computational graph construction and reverse-mode differentiation
  • Built fully connected neural network layers, activation functions, and optimizers (SGD, Adam)
  • Validated against standard benchmarks to ensure numerical correctness
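
The core of a PyTorch-style autograd (graph construction during the forward pass, then reverse-mode differentiation in topological order) fits in a few dozen lines. The sketch below is in Python for brevity, whereas the project is C++17; the `Value` class and its method names are hypothetical and only cover scalars.

```python
class Value:
    """A scalar node in a computational graph with reverse-mode gradients."""

    def __init__(self, data, parents=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaf nodes have nothing to propagate

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))

        def _backward():
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))

        def _backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def relu(self):
        out = Value(max(0.0, self.data), (self,))

        def _backward():
            self.grad += (1.0 if self.data > 0 else 0.0) * out.grad

        out._backward = _backward
        return out

    def backward(self):
        """Topologically sort the graph, then apply the chain rule in reverse."""
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()

# z = relu(x * y + x), so dz/dx = y + 1 and dz/dy = x when the input to relu is positive.
x, y = Value(2.0), Value(3.0)
z = (x * y + x).relu()
z.backward()
print(z.data, x.grad, y.grad)
```

Gradients are accumulated with `+=` rather than assigned, which is what makes a node that feeds several downstream operations (here `x`) receive the sum of its contributions.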

Stack: C++17, modern template metaprogramming

Operating System Development · Spring 2023 & Summer 2024

Teaching-oriented OS kernel work across two course series — implementing core subsystems from scratch

My Contributions:

  • NJUOS lab series (Spring 2023): process scheduling (round-robin, priority-based), virtual memory management (paging, demand paging)
  • MIT xv6 lab series (Summer 2024): inode-based file system, basic POSIX system calls, additional kernel subsystems
  • Hands-on practice across boot, scheduler, memory, and storage layers of a teaching kernel
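
Round-robin scheduling, one of the policies from the NJUOS labs above, can be simulated in user space to show the queueing behaviour. This is an illustrative sketch only: the lab kernels are written in C, and the function name, process names, burst times, and quantum here are hypothetical.

```python
from collections import deque

def round_robin(bursts, quantum):
    """Simulate round-robin scheduling.

    bursts maps process name -> total CPU time required.
    Returns (timeline, finish), where timeline is a list of (name, time_run)
    slices in execution order and finish maps name -> completion time.
    """
    queue = deque(bursts.items())
    clock = 0
    timeline = []
    finish = {}
    while queue:
        name, remaining = queue.popleft()
        run = min(quantum, remaining)
        clock += run
        timeline.append((name, run))
        if remaining > run:
            # Not finished: go to the back of the ready queue.
            queue.append((name, remaining - run))
        else:
            finish[name] = clock
    return timeline, finish

timeline, finish = round_robin({"A": 5, "B": 3, "C": 1}, quantum=2)
print(timeline)
print(finish)
```

In a kernel the same rotation is driven by the timer interrupt, and each slice boundary is a context switch rather than a loop iteration.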

Stack: C, x86 Assembly, QEMU

Retrieval-Augmented Generation (RAG) System · June 2023

Full-stack question-answering and document summarization system with a fine-tuned LLaMA backend

Built in mid-2023, before “RAG” became a household term, before GPT-4's tooling ecosystem matured, and before LangChain and LlamaIndex were the obvious starting points. This freshman-year course project explored the same retrieve-then-generate pattern that the field converged on about a year later.

My Contributions:

  • Fine-tuned a LLaMA model for domain-specific Q&A tasks
  • Implemented a vector indexing pipeline (FAISS) for efficient document retrieval
  • Built Vue.js frontend and Spring Boot backend with RESTful APIs
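
The retrieve-then-generate pattern can be sketched end to end in a few functions. This is a toy stand-in, not the project's pipeline: word-count vectors replace learned embeddings, a brute-force cosine search replaces the FAISS index, and the function names, documents, and prompt template are hypothetical. The generation step (passing the prompt to the fine-tuned model) is left out.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, docs, k=2):
    """Return the k documents most similar to the query (brute-force search)."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query, docs, k=2):
    """Assemble retrieved passages and the question into one prompt string."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

docs = [
    "paging maps virtual pages to physical frames",
    "round robin scheduling uses a fixed time quantum",
    "inodes store file metadata and block pointers",
]
query = "how does round robin scheduling work"
top = retrieve(query, docs, k=1)
prompt = build_prompt(query, docs, k=1)
print(prompt)
```

Swapping `embed` for a sentence-embedding model and `retrieve` for an approximate nearest-neighbour index changes the quality and speed of retrieval but not the overall shape of the flow.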

Stack: LLaMA, FAISS, Vue.js, Spring Boot, Python