Hi, I’m River / Yuhe Liu, an M.Eng. student in Computer Science and Technology at Tsinghua University.
My recent work focuses on LLM evaluation, AIOps benchmarks, RAG & Agent pipelines, and domain-specific AI systems. I care about turning messy real-world operations knowledge into evaluation data, benchmarks, and deployed systems that engineers can use.
Research & Projects
- AIOps and LLM evaluation. I work on benchmarks and evaluation methods for IT operations, including OpsEval and Eagle.
- Applied LLM systems. I have built or contributed to RAG/Agent-based benchmark generation, AIOps model evaluation, log-analysis evaluation, and telecom-domain model adaptation.
- Earlier research. I also worked on skeleton-based action recognition and domain adaptation, including Skeleton-CutMix.
Selected Publications
- Eagle: Leveraging Operations Documents for Comprehensive Benchmark Question Generation - FSE 2026 Industry.
- OpsEval: A Comprehensive Benchmark Suite for Evaluating Large Language Models’ Capability in IT Operations Domain - FSE 2025.
- TechSupportEval: An Automated Evaluation Framework for Technical Support Question Answering - IJCNN 2025.
- Skeleton-CutMix: Mixing Up Skeleton with Probabilistic Bone Exchange for Supervised Domain Adaptation - IEEE Transactions on Image Processing, 2023.
For a fuller academic profile and CV, see lyh.river9.top. This blog is where I keep technical notes, project logs, and occasional experiments outside the formal CV.