
Sander Schulhoff is an AI researcher specializing in AI security, prompt injection, and red teaming.
He wrote the first comprehensive guide on prompt engineering and ran the first-ever prompt injection competition, working with top AI labs and companies.
His dataset is now used by Fortune 500 companies to benchmark the security of their AI systems. He has spent more time than anyone alive studying how attackers break AI systems, and what he’s found isn’t reassuring: the guardrails companies are buying don’t actually work, and the only reason we haven’t seen more harm so far is that AI agents aren’t yet capable enough to do real damage.
We discuss:

1. The difference between jailbreaking and prompt injection attacks on AI systems
2. Why AI guardrails don’t work
3. Why we haven’t seen major AI security incidents yet (but soon will)
4. Why AI browser agents are vulnerable to hidden attacks embedded in webpages
5. The practical steps organizations should take instead of buying ineffective security tools
6. Why solving this requires merging classical cybersecurity expertise with AI knowledge

—

Brought to you by:

• Datadog—Now home to Eppo, the leading experimentation and feature flagging platform: https://www.datadoghq.com/lenny
• Metronome—Monetization infrastructure for modern software companies: https://metronome.com/
• GoFundMe Giving Funds—Make year-end giving easy: http://gofundme.com/lenny

—

Transcript: https://www.lennysnewsletter.com/p/the-coming-ai-security-crisis

My biggest takeaways (for paid newsletter subscribers): https://www.lennysnewsletter.com/i/181089452/my-biggest-takeaways-from-this-conversation

—

Where to find Sander Schulhoff:
• X: https://x.com/sanderschulhoff
• LinkedIn: https://www.linkedin.com/in/sander-schulhoff
• Website: https://sanderschulhoff.com
• AI Red Teaming and AI Security Masterclass on Maven: https://bit.ly/44lLSbC

Where to find Lenny:
• Newsletter: https://www.lennysnewsletter.com
• X: https://twitter.com/lennysan
• LinkedIn: https://www.linkedin.com/in/lennyrac