Upcoming Events
-
AI Policy Tuesday: The Case for Regulating AI Companies, Not AI Models
Tuesday, September 30th, 6:00pm - 9:00pm
Wim Howson Creutzberg will discuss the case for treating business entities — as opposed to models or use cases — as the main focal unit for preemptive risk regulation of advanced AI systems.
The talk will give an overview of "Entity-Based Regulation in Frontier AI Governance" by Ketan Ramakrishnan and Dean Ball, followed by commentary on the article and surrounding discussion.
-
AI Safety Thursday: Attempts and Successes of LLMs Persuading on Harmful Topics
Thursday, October 2nd, 6:00pm - 9:00pm
Large Language Models can persuade people at unprecedented scale—but how effectively, and are they willing to try persuading us toward harmful ideas?
In this talk, Matthew Kowal and Jasper Timm will present findings showing that LLMs can shift beliefs toward conspiracy theories as effectively as they debunk them, and that many models are willing to attempt harmful persuasion on dangerous topics.
-
AI Policy Tuesday: The Concept of Political Space and AI Safety
Tuesday, October 7th, 6:00pm - 9:00pm
International cooperation often becomes possible only after shocks, crises, or dramatic shifts in perception. ChatGPT's release in 2022 created space for x-risk discussions at the UK AI Safety Summit, while just two years later, the Paris Summit cast these same concerns as "science fiction".
In this talk, Jason Yung will examine the concept of political space: how it opens and closes, what factors enable or constrain it, and how it can inform the way AI safety advocacy is advanced.
-
AI Safety Thursday: Building an Economic Model of AI Automation
Thursday, October 9th, 6:00pm - 9:00pm
In this presentation, Anson Ho will discuss Epoch AI's work on the GATE model, one of the first attempts at bridging the gap between growth economics and modern deep learning in a rigorous way.
-
AI Policy Tuesday: Redlines for AI
Tuesday, October 14th, 6:00pm - 9:00pm
Kathrin Gardhouse will walk us through a three-part report by The Future Society introducing the concept of redlines and their relevance in global AI governance.
-
AI Safety Thursday: Modeling and Detecting Deceptive Alignment
Thursday, October 16th, 6:00pm - 9:00pm
Annie Szorkin will give a talk on modeling and detecting deceptive alignment.
-
AI Policy Tuesday: The Spectre of Transformative AI
Tuesday, October 21st, 6:00pm - 9:00pm
Will AI-driven security risks be more severe before the development of transformative AI, or after?
Wim Howson Creutzberg will give an overview of current research on how the severity and nature of risks stemming from the development of advanced AI are expected to change over time, drawing centrally on “The Artificial General Intelligence Race and International Security.”
-
AI Safety Thursday: The Limitations of Reinforcement Learning for LLMs in Achieving AI for Science
Thursday, October 23rd, 6:00pm - 9:00pm
LLMs combined with Reinforcement Learning (RL) have unlocked impressive new capabilities. But do we simply need more scaling to reach the next step: AI for science and research? If not, what are the limitations, and what else is required?
In this talk, Yongjin Yang will share research on three fundamental bottlenecks of reinforcement learning for LLMs: skewed queries, limited exploration, and sparse reward signals. We will also discuss potential solutions to these challenges, as well as safety concerns.
-
AI Safety Tuesday: Debunking the US-Chinese AGI Race
Tuesday, October 28th, 6:00pm - 9:00pm
In this talk, Kristy Loke will present her research findings by drawing on both Chinese policy discussions and post-ChatGPT actions.
She'll discuss (a) how China has been responding to the AGI race underway in Silicon Valley, and how and why China sees AI development differently, and (b) what this implies for the trajectory of US-Chinese relations and the room for global AI coordination.
-
AI Safety Thursday: Chain-of-Thought Monitoring for AI Control
Thursday, October 30th, 6:00pm - 9:00pm
Modern reasoning models do a lot of thinking in natural language before producing their outputs. Can we catch misbehaviors by our LLMs and interpret their motivations simply by reading these chains of thought?
In this talk, Rauno Arike and Rohan Subramani will give an overview of research on chain-of-thought monitorability and AI control, and discuss their recent work on how useful chain-of-thought monitoring is for ensuring that LLM agents pursue only the objectives their developers intended.
Past Events
-
AI Safety Thursdays: Avoiding Gradual Disempowerment
Thursday, July 3rd, 6pm-8pm
This talk explored the concept of gradual disempowerment as an alternative to the abrupt takeover scenarios often discussed in AI safety. Dr David Duvenaud examined how even incremental improvements in AI capabilities can erode human influence over critical societal systems, including the economy, culture, and governance.
-
AI Policy Tuesdays: Agent Governance
Tuesday, June 24th.
Kathrin Gardhouse presented on the nascent field of Agent Governance, drawing from a recent report by IAPS.
The presentation covered current agent capabilities, expected developments, governance challenges, and proposed solutions.
-
Hackathon: Apart x Martian Mechanistic Router Interpretability Hackathon
Friday, May 30th - Sunday, June 1st.
We hosted a jamsite for Apart Research and Martian's hackathon.
-
AI Safety Thursdays: Advanced AI's Impact on Power and Society
Thursday, May 29th, 6pm-8pm
Historically, significant technological shifts often coincide with political instability, and sometimes violent transfers of power. Should we expect AI to follow this pattern, or are there reasons to hope for a smooth transition to the post-AI world?
Anson Ho drew upon economic models, broad historical trends, and recent developments in deep learning to guide us through an exploration of this question.
-
AI + Human Flourishing: Policy Levers for AI Governance
Sunday, May 4, 2025.
Considerations of AI governance are increasingly urgent as powerful models become more capable and widely deployed. Kathrin Gardhouse delivered a presentation on the available mechanisms we can use to govern AI, from policy levers to technical AI governance, offering a high-level introduction to the AI policy landscape.
-
AI Safety Thursday: "AI-2027"
Thursday April 24th, 2025.
On April 3rd, a team of AI experts and superforecasters at the AI Futures Project published a narrative called AI-2027, outlining a possible scenario of explosive AI development and takeover unfolding over the next two years.
Mario Gibney guided us through a presentation and discussion of the scenario, exploring how likely it is to track reality in the coming years.