Research Areas
From Operating Systems to Generative AI, we build practical systems that bridge the two fields.
Operating Systems
We reduce latency and improve reliability at the kernel and runtime levels, instrumenting and validating our work in real environments.
Topics
- Scheduling, memory, I/O stack optimizations
- Containers/virtualization, eBPF observability
- Filesystem/storage consistency & reliability
Approach
- Profiling/tracing for bottleneck hunting
- Experimental design, A/B testing, micro-benchmarks (see the sketch below)
- Fail-safe & recovery scenario validation
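To make the micro-benchmark bullet concrete, here is a minimal sketch of a syscall latency benchmark, assuming a Linux-like host; the target call (`os.write` to `/dev/null`), iteration count, and reported percentiles are illustrative choices rather than a specific harness of ours.

```python
# Minimal sketch: per-call latency of os.write() to /dev/null, the kind
# of distribution we compare before/after a kernel or runtime change.
import os
import statistics
import time

def bench_write(iterations: int = 100_000) -> list[int]:
    fd = os.open("/dev/null", os.O_WRONLY)  # cheap, side-effect-free sink
    buf = b"x" * 64
    samples = []
    try:
        for _ in range(iterations):
            t0 = time.perf_counter_ns()
            os.write(fd, buf)
            samples.append(time.perf_counter_ns() - t0)
    finally:
        os.close(fd)
    return samples

if __name__ == "__main__":
    s = sorted(bench_write())
    for label, q in (("p50", 0.50), ("p99", 0.99), ("p99.9", 0.999)):
        print(label, s[int(q * (len(s) - 1))], "ns")
    print("mean", int(statistics.fmean(s)), "ns")
```

Comparing tail percentiles (p99, p99.9) before and after a change, rather than means alone, is what surfaces tail-latency regressions.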
LLM
We study model compression/tuning and resilient inference pipelines, balancing efficiency and quality under resource constraints.
Topics
- Instruction tuning & domain adaptation
- Hallucination reduction & safety
- KV cache management, quantization, serving optimizations (quantization sketched below)
Approach
- Data curriculum & priority sampling
- Prompt/chain design
- Quantitative eval (accuracy/latency/cost)
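As a concrete instance of the quantization topic above, a minimal sketch of symmetric per-tensor int8 weight quantization; the shape and random weights are placeholders, and production serving typically uses finer-grained (per-channel or per-group) schemes.

```python
# Minimal sketch: symmetric per-tensor int8 quantization of a weight
# matrix. The 4096x4096 random tensor is a placeholder, not a real model.
import numpy as np

def quantize_int8(w: np.ndarray) -> tuple[np.ndarray, float]:
    scale = float(np.abs(w).max()) / 127.0   # map max magnitude onto int8 range
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(4096, 4096).astype(np.float32)
q, scale = quantize_int8(w)
err = float(np.abs(w - dequantize(q, scale)).mean())
print(f"mean abs error {err:.5f}, {w.nbytes // q.nbytes}x smaller")
```

The measured round-trip error is the efficiency/quality trade-off the description above refers to: smaller formats cut memory and bandwidth, and the evaluation pipeline decides whether the accuracy loss is acceptable.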
RAG (Retrieval-Augmented Generation)
We combine retrieval and generation for accuracy and freshness, tuning indexing, re-ranking, and context building end-to-end.
Topics
- Chunking/sliding windows & hybrid indexes (chunking sketched below)
- BM25 + dense retrieval, cross-encoder re-ranking (see the fusion sketch below)
- Context fusion, citation & evaluation
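A minimal sketch of the sliding-window chunking step, assuming whitespace tokenization as a stand-in for the embedding model's tokenizer; window and stride sizes are illustrative and tuned per corpus.

```python
# Minimal sketch: sliding-window chunking with overlap. Whitespace words
# stand in for tokens; real pipelines use the embedding model's tokenizer.
def chunk(text: str, window: int = 200, stride: int = 150) -> list[str]:
    words = text.split()
    chunks, start = [], 0
    while True:
        chunks.append(" ".join(words[start:start + window]))
        if start + window >= len(words):
            break                      # tail is covered, stop
        start += stride                # overlap = window - stride words
    return chunks

doc = "lorem ipsum dolor sit amet " * 200   # placeholder document, 1000 words
pieces = chunk(doc)
print(len(pieces), "chunks,", 200 - 150, "words of overlap each")
```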
Approach
- KB schema design & versioning
- Groundedness/answerability metrics
- Agent tool-use integration
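For the hybrid retrieval bullet, a minimal sketch of reciprocal rank fusion (RRF), one standard way to merge a BM25 ranking with a dense ranking before cross-encoder re-ranking; the document IDs are placeholders.

```python
# Minimal sketch: reciprocal rank fusion of a sparse (BM25) ranking and a
# dense ranking. Document IDs are placeholders; k=60 is the usual constant.
def rrf(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores.items(), key=lambda kv: -kv[1])

bm25_hits = ["doc3", "doc1", "doc7"]    # lexical retrieval order
dense_hits = ["doc1", "doc9", "doc3"]   # embedding retrieval order
print(rrf([bm25_hits, dense_hits]))     # doc1/doc3 rise; top-k goes to the re-ranker
```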
Cloud Computing
We focus on automation, reliability, and cost efficiency in large-scale distributed systems, across multi/hybrid clouds.
Topics
- K8s scheduling & autoscaling
- Serverless & event-driven architectures
- Observability (tracing/logging/profiling)
Approach
- SLO-driven capacity/cost modeling (sketched below)
- Canary/blue-green/resilience testing
- Edge/hybrid routing & data governance
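A minimal sketch of the SLO-driven sizing arithmetic, assuming each replica holds its latency SLO only below a measured throughput ceiling; the peak load, per-replica ceiling, and headroom factor are illustrative numbers.

```python
# Minimal sketch: size a deployment so each replica stays under the load
# at which it still meets its latency SLO. All numbers are illustrative.
import math

def required_replicas(peak_rps: float, per_replica_rps: float,
                      headroom: float = 0.7) -> int:
    # Run each replica at no more than `headroom` of its measured ceiling
    # so the SLO survives bursts and a single-replica failure.
    return max(2, math.ceil(peak_rps / (per_replica_rps * headroom)))

print(required_replicas(peak_rps=1800, per_replica_rps=120))  # -> 22
```

Multiplying the replica count by per-replica cost turns the same model into the cost side of that bullet.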
Real-time Avatar
We enable ultra-low-latency interaction across voice, vision, and motion, robust to jitter and packet loss.
Topics
- WebRTC pipeline & adaptive bitrate
- Streaming ASR/VAD/TTS/voice conversion
- Lip-sync/facial tracking & lightweight rendering
Approach
- A/V buffering & latency prediction (sketched below)
- GPU scheduling & multi-stream optimization
- QoE (fluency/sync/latency/distortion)
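A minimal sketch of the buffering/latency-prediction idea: track inter-arrival jitter with an exponentially weighted mean and variance, and set the playout delay a few deviations above the mean; the smoothing constants are illustrative, not tuned values.

```python
# Minimal sketch: adaptive playout delay from an EWMA of inter-arrival
# jitter. alpha and k are illustrative smoothing/safety constants.
class JitterEstimator:
    def __init__(self, alpha: float = 0.05, k: float = 4.0):
        self.alpha, self.k = alpha, k
        self.mean = 0.0   # EWMA of inter-arrival deviation (ms)
        self.var = 0.0    # EWMA of its squared deviation

    def on_packet(self, jitter_ms: float) -> float:
        d = jitter_ms - self.mean
        self.mean += self.alpha * d
        self.var += self.alpha * (d * d - self.var)
        # Buffer a few deviations above the mean: enough to absorb bursts,
        # small enough to keep interaction latency low.
        return self.mean + self.k * self.var ** 0.5

est = JitterEstimator()
for j in [5.0, 7.0, 6.0, 30.0, 8.0, 6.0]:   # ms of jitter, with one burst
    delay = est.on_packet(j)
print(f"target playout delay: {delay:.1f} ms")
```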
AI
We integrate models, data, and infrastructure into end-to-end AI systems, prioritizing reproducibility and operability.
Topics
- Model compression, latency optimization
- MLOps & data quality/bias management
- Multimodal pipeline design
Approach
- Experiment tracking & model registry (sketched below)
- Data cataloging/versioning
- Risk/ethics/security guidelines
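Finally, a minimal sketch of the reproducibility angle: an append-only run log that records each experiment's config and metrics under a content-derived ID; the JSONL path and fields are illustrative stand-ins for what a tracking server or model registry provides.

```python
# Minimal sketch: append-only experiment log with content-derived run IDs.
# File path and fields are illustrative, not a specific MLOps tool.
import hashlib
import json
import time
from pathlib import Path

def log_run(config: dict, metrics: dict, log: Path = Path("runs.jsonl")) -> str:
    record = {"ts": time.time(), "config": config, "metrics": metrics}
    payload = json.dumps(record, sort_keys=True)
    record["run_id"] = hashlib.sha256(payload.encode()).hexdigest()[:12]
    with log.open("a") as f:
        f.write(json.dumps(record) + "\n")
    return record["run_id"]

run_id = log_run({"lr": 3e-4, "quant": "int8"}, {"accuracy": 0.912, "p95_ms": 41})
print("logged run", run_id)
```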