
Zeyneb N. Kaya

Hi! I'm Zeyneb. I study CS+Math at Stanford University. I'm broadly interested in understanding and pushing the limits of ML, working on robustness/adaptation, efficient learning from data, statistics, and physics, among other things.

Recent projects include reasoning/IOL @ OpenAI; models for physics+optimization as co-founder @ Topological; decentralized AI, synthetic data, and midtraining @ Dria; and RL/self-improvement/diffusion @ SAIL.


I'm always eager to discuss interesting ideas, so please reach out! When I'm not reading papers, I'll geek out over poetry/art, GeoGuessr/linguistics, my collections, cats, and whatever topic I've spiraled into.

zeynebnk [at] stanford [dot] edu

x / linkedin / github / curiuswriting


Research.

My interests are in studying and building systems that learn, with work in machine learning and statistics broadly spanning self-improvement/continual learning, data/robustness, and learning dynamics.

 

Selected research is listed below.

Anchored Self-Play for Code Repair
Caroline Choi, Zeyneb N. Kaya, Shirley Wu, Tengyu Ma, Tatsunori Hashimoto, Ludwig Schmidt

preprint, 2026

Test-Time Meta-Adaptation with Self-Synthesis

Zeyneb N. Kaya, Nick Rui

ICLR RSI, DATA-FM, & LIT Workshops, 2026

Optimizing Remasking Schedules for Reasoning in Discrete Diffusion Models

Zeyneb N. Kaya, Radostin Cholakov, Nicole H. Ma

ICLR ReALM-GEN & NFAM Workshops, 2026

Semantic Anchoring in Large Language Models: Thresholds, Transfer, and Geometry

Edward Y. Chang, Ethan Chang, Zeyneb N. Kaya

arXiv preprint, 2026

Measuring the Impact of Data Augmentation Methods for Extremely Low-Resource NMT  

Zeyneb N. Kaya, Annie K. Lamar

LoResMT Workshop @ EACL, 2023

MADLIBS: A Novel Multilingual Data Augmentation Algorithm for Low-Resource Neural Machine Translation 

Zeyneb N. Kaya

Regeneron Science Talent Search, 2024

Full Scope Word Embedding Variability for Low-Resource Languages

Zeyneb N. Kaya, Annie K. Lamar

IEEE MIT URTC, 2023


The Pervasiveness of Language Contact: Evidence from Negative Existentials in Romeyka/Turkish Code-Switching

Zeyneb N. Kaya
Proceedings of the Linguistic Society of America (PLSA), 2023 

Decoding Large-Language Models: A Systematic Overview of Socio-Technical Impacts, Constraints, and Emerging Questions

Zeyneb N. Kaya, Souvick Ghosh

arXiv preprint


Awards & Recognition.

Etched x Mercor x Cognition Hackathon – 1st Place/$40K Winner, 2025

Regeneron Science Talent Search – 5th Place/$90K Winner, 2024

Coca-Cola Scholar – 2024

PearVC x Anthropic Hackathon – 1st Place/Most Technical Winner, 2025

TreeHacks Scrapybara Prize – 1st Place/$16K Winner, 2025

GeoGuessr – Master Tier Player, 2025

National Junior Science and Humanities Symposium (NJSHS) National HM, 2nd Math/CS, 2023

Congressional App Challenge – 1st Place Winner, 2021

Olympiad in Linguistics (Onling) – 10th Place / 1st in USA, 2023

International Olympiad in Artificial Intelligence – Team USA invited representative (did not attend due to conflicts)

Projects & Blogs.

to be included
Spin Glasses and the Statistical Mechanics of Transformers 
Topological and Algebraic Connectivity in Random Graphs

Spiky Smooth Shapes in High Dimensions

Language Models (can be) Few-Shot Fakers

The Shape of Data & Learning
