Hi, I'm Kingsley Kim. I'm a UVA alum interested in kernel optimization and LLM research. I've worked under Chen-Yu Wei on RL and LLM post-training, as well as on training more efficient Process Reward Models.
I was previously a Forward Deployed Engineer Intern at Palantir, and during that time I was also an 8VC Fellow.
Things I've worked on:
- An ICLR Workshop paper on training data-efficient Process Reward Models for LLM post-training
- A more applied paper on Transformers for time series (AAAI 2024 Workshop)
Code:
- Open-source H100 CUDA kernels for the Gated Delta Net architecture (repo); see the blog for more details
- A minimal-dependency CUDA inference engine for H100s (WIP; repo)
- A minimal repo for learning FP8 and BF16 MoE optimizations (WIP, worklog soon; repo)
- RL applied to control problems
- Research on mini-VLMs for 'social navigation' in robots (repo)
I've also written a few posts on what I've learned from CUDA kernel engineering, and some notes on textbooks I've read:
Contact: