
Zeyneb N. Kaya

Hi! I'm Zeyneb. I study CS+Math at Stanford University. I'm broadly interested in understanding and pushing the limits of ML, working on robustness/adaptation, efficient learning from data, statistics, and physics, among other things.

Recent projects include reasoning/IOL @ OpenAI; models for physics+optimization as co-founder @ Topological; decentralized AI, synthetic data, and midtraining @ Dria; and RL/self-improvement/diffusion @ SAIL.


I'm always eager to discuss interesting ideas, so please reach out! When I'm not reading papers, I'll geek out over poetry/art, GeoGuessr/linguistics, my collections, cats, and whatever topic I've spiraled into.

zeynebnk [at] stanford [dot] edu

x / linkedin / github / curiuswriting


Research.

My interests are in studying and building systems that learn, with work in machine learning and statistics broadly spanning self-improvement/continual learning, data/robustness, and learning dynamics.

 

Selected research is listed below.

Anchored Self-Play for Code Repair
Caroline Choi, Zeyneb N. Kaya, Shirley Wu, Tengyu Ma, Tatsunori Hashimoto, Ludwig Schmidt

preprint, 2026

Test-Time Meta-Adaptation with Self-Synthesis

Zeyneb N. Kaya, Nick Rui

ICLR RSI, DATA-FM, & LIT Workshops, 2026

Optimizing Remasking Schedules for Reasoning in Discrete Diffusion Models

Zeyneb N. Kaya, Radostin Cholakov, Nicole H. Ma

ICLR ReALM-GEN & NFAM Workshops, 2026

Semantic Anchoring in Large Language Models: Thresholds, Transfer, and Geometry

Edward Y. Chang, Ethan Chang, Zeyneb N. Kaya

arXiv preprint, 2026

Measuring the Impact of Data Augmentation Methods for Extremely Low-Resource NMT  

Zeyneb N. Kaya, Annie K. Lamar

LoResMT Workshop @ EACL, 2023

MADLIBS: A Novel Multilingual Data Augmentation Algorithm for Low-Resource Neural Machine Translation 

Zeyneb N. Kaya

Regeneron Science Talent Search, 2024

Full Scope Word Embedding Variability for Low-Resource Languages

Zeyneb N. Kaya, Annie K. Lamar

IEEE MIT URTC, 2023


The Pervasiveness of Language Contact: Evidence from Negative Existentials in Romeyka/Turkish Code-Switching

Zeyneb N. Kaya
Proceedings of the Linguistic Society of America (PLSA), 2023 

Decoding Large-Language Models: A Systematic Overview of Socio-Technical Impacts, Constraints, and Emerging Questions

Zeyneb N. Kaya, Souvick Ghosh

arXiv preprint


Awards & Recognition.

Etched x Mercor x Cognition Hackathon – 1st Place/$40K Winner, 2025

Regeneron Science Talent Search – 5th Place/$90K Winner, 2024

Coca-Cola Scholar – 2024

PearVC x Anthropic Hackathon – 1st Place/Most Technical Winner, 2025

TreeHacks Scrapybara Prize – 1st Place/$16K Winner, 2025

GeoGuessr – Master Tier Player, 2025

National Junior Science and Humanities Symposium (NJSHS) National HM, 2nd Math/CS, 2023

Congressional App Challenge – 1st Place Winner, 2021

Olympiad in Linguistics (Onling) – 10th Place / 1st in USA, 2023

International Olympiad in Artificial Intelligence – Team USA invited representative (did not attend due to conflicts)

Projects & Blogs.

to be included
Spin Glasses and the Statistical Mechanics of Transformers 
Topological and Algebraic Connectivity in Random Graphs

Spiky Smooth Shapes in High Dimensions

Language Models (can be) Few-Shot Fakers

The Shape of Data & Learning
