TW123
Posts
[MLSN #9] Verifying large training runs, security risks from LLM access to APIs,...
by TW123, Dan H @ 2023-04-11 | +18 | 0 comments
[MLSN #8]: Mechanistic interpretability, using law to inform AI alignment,...
by TW123, Dan H @ 2023-02-20 | +25 | 0 comments
"A Creepy Feeling": Nixon's Decision to Disavow Biological Weapons
by TW123 @ 2022-09-30 | +48 | 0 comments
Announcing the Introduction to ML Safety Course
by TW123, Dan H, Oliver Z @ 2022-08-06 | +136 | 0 comments
$20K in Bounties for AI Safety Public Materials
by TW123, Dan H, Oliver Z @ 2022-08-05 | +45 | 0 comments
Perform Tractable Research While Avoiding Capabilities Externalities [Pragmatic...
by TW123, Dan H @ 2022-05-30 | +33 | 0 comments
Complex Systems for AI Safety [Pragmatic AI Safety #3]
by TW123, Dan H @ 2022-05-24 | +49 | 0 comments
A Bird's Eye View of the ML Field [Pragmatic AI Safety #2]
by TW123, Dan H @ 2022-05-09 | +97 | 0 comments
Introduction to Pragmatic AI Safety [Pragmatic AI Safety #1]
by TW123, Dan H @ 2022-05-09 | +68 | 0 comments
Introducing the ML Safety Scholars Program
by TW123, Dan H, Mantas Mazeika, Oliver Z, Sidney Hough, Kevin Liu @ 2022-05-04 | +157 | 0 comments
[$20K In Prizes] AI Safety Arguments Competition
by TW123, Dan H, Oliver Z, Sidney Hough, Kevin Liu @ 2022-04-26 | +71 | 0 comments
Yale EA’s Fellowship Application Scores were not Predictive of Eventual...
by TW123, jessica_mccurdy🔸 @ 2021-01-28 | +148 | 0 comments