ShortSWA Is the Next-Generation N-gram Embedding
Yifan's Blog, January 12, 2026
Revisiting Variance Reduction in Policy Gradients for LLM Reinforcement Learning
Yifan's Blog, December 27, 2025
Rethinking SWA: Why Short Sliding Window Attention Will Replace ShortConv
Yifan's Blog, December 16, 2025
Matrix Exponential Attention
Yifan's Blog, December 15, 2025
More posts coming soon.