About

I am an AI engineer and Master of Computer Science candidate at Rice University. My work sits between applied AI research and production engineering: agent systems, model orchestration, long-context memory, and multimodal, knowledge-centric workflows.

Recently, I worked as a Research Intern at the Beijing Academy of Artificial Intelligence (BAAI), where I built scalable data and model experimentation pipelines with vLLM and FlagScale, deployed MCP services, and fine-tuned Qwen-family models, all aimed at faster ML iteration.

My current interests are agentic workflows, multimodal knowledge graphs, retrieval systems, and the engineering practices that make AI systems observable, reproducible, and useful beyond demos. I also keep notes on algorithms, interviews, deployment, and the small engineering decisions behind this website.

Focus

Agent systems with memory, tool use, and evaluation
Multimodal knowledge graphs and retrieval workflows
AI infrastructure for experimentation and deployment
Full-stack demos that make system behavior easier to inspect

Zhiheng Wang
