I enjoy making things. Here is a selection of projects I have worked on over the years.
RL-Project: Comprehensive Reinforcement Learning Framework for Atari and MuJoCo

This project implements a reinforcement learning framework that solves both discrete control tasks (Atari games, using DQN) and continuous control tasks (MuJoCo robotics, using PPO). It is designed for modularity, scalability, and ease of experimentation, with automated parallel training, configuration-driven evaluation, and headless visualization support.

Key Features:
- DQN (Deep Q-Network): Supports Vanilla, Double, Dueling, and Rainbow variants.
- PPO (Proximal Policy Optimization): Optimized for continuous control with observation normalization and reward clipping.
- Parallel Training: Efficient data collection using vectorized environments.
- Automated Evaluation: run.py for rendering, video recording, and performance metrics.
- Configuration Registry: Centralized management of best model checkpoints via configs/best_models.py.

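To illustrate how the Vanilla and Double DQN variants listed above differ, here is a minimal sketch of the two bootstrap targets for a single transition. It is not taken from the project's code: the function names, argument names, and the plain-list representation of Q-values are assumptions made for illustration.

```python
from typing import Sequence


def vanilla_dqn_target(reward: float, next_q_target: Sequence[float],
                       gamma: float = 0.99, done: bool = False) -> float:
    """Vanilla DQN: the target network both selects and evaluates the next action."""
    if done:
        return reward  # no bootstrapping past a terminal state
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward: float, next_q_online: Sequence[float],
                      next_q_target: Sequence[float],
                      gamma: float = 0.99, done: bool = False) -> float:
    """Double DQN: the online network selects the action, the target network evaluates it."""
    if done:
        return reward
    # Greedy action according to the online network (hypothetical plain-list Q-values).
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


if __name__ == "__main__":
    online = [0.2, 0.9, 0.1]   # online-network Q-values for the next state
    target = [2.0, 0.5, 1.0]   # target-network Q-values for the next state
    print(vanilla_dqn_target(1.0, target, gamma=0.5))
    print(double_dqn_target(1.0, online, target, gamma=0.5))
```

Decoupling action selection from action evaluation is what reduces the overestimation bias of the plain max operator; in the example, the Double DQN target is lower because the online network's preferred action has a modest value under the target network.
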
RAG-TA: RAG-based Intelligent Teaching Assistant System

RAG-TA is a teaching assistance platform built on Multimodal Retrieval-Augmented Generation (Multimodal RAG) and designed specifically for educational scenarios. The system integrates the following core functions:

Intelligent Question Answering System
- Multimodal Understanding: Supports complex question answering over text and images, including charts, formulas, and image content in course materials.
- Contextual Retrieval: Precise semantic retrieval based on the ChromaDB vector database.
- Hybrid Retrieval: Combines dense vector retrieval and sparse retrieval (BM25) to improve retrieval accuracy.
- Source Tracing: Automatically annotates each answer with its source file and page number, so the information stays traceable.

Knowledge Base Management
- Multi-format Support: Supports PDF, PPTX, DOCX, TXT, MD, and image files.
- Intelligent Indexing: Automatically extracts text and image content to build a multimodal vector index.
- Incremental Updates: Detects file changes and re-indexes only the changed parts, improving efficiency.
- Folder Management: Supports a hierarchical directory structure for organizing course materials.

Conversation Management
- History: Saves and manages the complete conversation history.
- Multiple Answer Options: Supports multiple answer versions for the same question and lets users switch between them.
- Folder Classification: Supports conversation folders for organizing conversations by course.
- Thinking Mode: Visualizes the AI reasoning process to help users understand the logic behind an answer.

Multimodal Interaction
- Image Understanding: Automatically extracts and describes image content in PDF and PPTX files.
- Real-time Upload: Lets users upload images and documents for instant question answering.
- Visual Question Answering: Provides comprehensive answers that combine image content and text knowledge.

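One common way to combine a dense ranking with a BM25 ranking, as in the hybrid retrieval feature above, is reciprocal rank fusion (RRF). The description does not say which fusion method RAG-TA uses, so the sketch below is an assumption: the function name, the document IDs, and the constant k = 60 are illustrative only.

```python
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in
    (rank starts at 1), and documents are sorted by their summed score.
    """
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    dense_hits = ["a", "b", "c"]    # hypothetical result of a vector search
    sparse_hits = ["b", "d", "a"]   # hypothetical result of a BM25 search
    print(reciprocal_rank_fusion([dense_hits, sparse_hits]))
```

RRF uses only rank positions, so it avoids having to put cosine similarities and BM25 scores on a common scale. A weighted sum of normalized scores is a reasonable alternative when the scores themselves carry useful information.
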
TripDataset Machine Learning Project

This project is a complete implementation of machine learning pipelines applied to the TripDataset, focusing on data preprocessing, classification, and regression tasks, including:
- Data preprocessing and cleaning (handling missing values, outlier detection, normalization, and feature engineering)
- Model training for classification and regression (various ML algorithms for categorical and continuous prediction tasks)
- Performance evaluation and metrics (accuracy, F1-score, RMSE, and other evaluation techniques)
- Exploratory data analysis and visualization (insightful plots for feature relationships, distributions, and model performance)

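For reference, the three metrics named above can be written out in a few lines. This is a dependency-free sketch, not the project's evaluation code; the function names are hypothetical, and the F1 shown is the binary version with label 1 as the positive class.

```python
import math
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_binary(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Binary F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when the denominator is zero."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error for regression targets."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    labels, preds = [1, 0, 1, 1], [1, 0, 0, 1]
    print(accuracy(labels, preds), f1_binary(labels, preds))
    print(rmse([0.0, 0.0], [3.0, -3.0]))
```

Accuracy and F1 apply to the classification tasks, while RMSE applies to the regression tasks; F1 is the more informative of the two classification metrics when the classes are imbalanced.
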
ViT-torch: Vision Transformer on CIFAR-10 (PyTorch)

This project is a complete implementation of Vision Transformer (ViT) applied to small-scale datasets (especially CIFAR-10), including:
- Model implementations with various configurations (native ViT, ResNet+ViT hybrid, different patch/heads/blocks setups, Stochastic Depth/DropPath, etc.)
- Training and evaluation scripts (with learning rate schedulers: Warmup/Linear/Cosine/Constant-Cosine/Warmup-Constant-Cosine)
- Data augmentation (RandomCrop+Paste, MixUp, CutMix, RandAugment, and batch random augmentation)
- Visualization and analysis (attention maps, attention distance, gradient rollout, feature maps, positional embedding similarity)
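As an illustration of the scheduler family listed above, here is a minimal sketch of a linear warmup followed by cosine decay, written as a plain function of the training step. It is not the project's scheduler: the function name, parameters, and the exact warmup formula are assumptions.

```python
import math


def warmup_cosine_lr(step: int, total_steps: int, warmup_steps: int,
                     base_lr: float, min_lr: float = 0.0) -> float:
    """Learning rate at a given step: linear warmup, then cosine decay to min_lr."""
    if step < warmup_steps:
        # Ramp linearly so that the last warmup step reaches base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for s in (0, 9, 10, 60, 110):
        print(s, warmup_cosine_lr(s, total_steps=110, warmup_steps=10, base_lr=1.0))
```

In PyTorch, a function like this can be passed to torch.optim.lr_scheduler.LambdaLR with base_lr set to 1.0, since LambdaLR multiplies the optimizer's initial learning rate by the returned factor.
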