Dangerous Capability Evals

Mentor:
Joshua Clymer
Columbia University

Mentor Bio

Joshua Clymer is an AI safety researcher on leave from CS and Mathematics at Columbia University. As a Research Intern at the Center for Cyber Warfare, he co-authored papers on statistical models for cyber attack detection. Joshua managed projects at the Center for AI Safety, received the Century Fellowship, and helped spearhead the CAIS statement. Currently exploring AI safety careers, he recently conducted research on reward model generalization.
Visit joshuaclymer.com for more.

Project Description

I'm building a benchmark that measures the ability to adapt to unseen tasks, so it is meant to be a very general capabilities benchmark. The benchmark is based on multiplayer games, which will allow us to track AI capabilities even as they far exceed human level (most other benchmarks will saturate).

See this document for more details.
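
To illustrate why a game-based benchmark need not saturate, here is a minimal sketch of a relative rating system. It assumes an Elo-style update, which is an illustrative choice and not necessarily the scoring method this project uses; the agent names, the starting rating of 1000, and the K-factor of 32 are all hypothetical. Because ratings are defined only relative to opponents, a stronger agent can always earn a higher rating, whereas a fixed question set tops out at 100%.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that player A beats player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    The K-factor of 32 is an assumed default, not a project setting.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Hypothetical agents that start at the same rating.
ratings = {"agent_strong": 1000.0, "agent_weak": 1000.0}

# Simulate 20 games that the stronger agent always wins.
for _ in range(20):
    ratings["agent_strong"], ratings["agent_weak"] = update_elo(
        ratings["agent_strong"], ratings["agent_weak"], score_a=1.0
    )

print(ratings)
```

The rating gap keeps widening as long as one agent keeps winning, and there is no built-in ceiling: adding a still stronger agent to the pool simply extends the scale upward.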

Personal Fit

Ideal mentees

  • Has spent at least 1000 hours programming in Python

  • Understands the basics of how transformers work (e.g. you should know what a 'logit' is)

Mentorship style
I'm pretty hands-on. I assign tasks and check in with each individual every couple of days.

Time commitment
At least 10 hours per week. Preference for more time.