The coming AI security crisis (and what to do about it) | Sander Schulhoff

Lenny's Podcast: Product | Career | Growth

6 months ago, 1h 32m

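Sander Schulhoff is an AI researcher specializing in AI security, prompt injection, and red teaming. He wrote the first comprehensive guide on prompt engineering and ran the first-ever prompt injection competition, working with top AI labs and companies. His dataset is now used by Fortune 500 companies to benchmark the security of their AI systems. He has studied how attackers break AI systems more deeply than anyone, and what he has found isn’t reassuring: the guardrails companies are buying don’t actually work, and we have been lucky not to see more harm so far, only because AI agents aren’t yet capable enough to do real damage.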

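We discuss:

The difference between jailbreaking and prompt injection attacks on AI systems
Why AI guardrails don’t work
Why we haven’t seen major AI security incidents yet (but soon will)
Why AI browser agents are vulnerable to hidden attacks embedded in webpages
The practical steps organizations should take instead of buying ineffective security tools
Why solving this requires merging classical cybersecurity expertise with AI knowledge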

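Brought to you by:
Datadog: Now home to Eppo, the leading experimentation and feature flagging platform. https://www.datadoghq.com/lenny
Metronome: Monetization infrastructure for modern software companies. https://metronome.com/
GoFundMe Giving Funds: Make year-end giving easy. http://gofundme.com/lenny

Transcript: https://www.lennysnewsletter.com/p/the-coming-ai-security-crisis

My biggest takeaways (for paid newsletter subscribers): https://www.lennysnewsletter.com/i/181089452/my-biggest-takeaways-from-this-conversation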

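Where to find Sander Schulhoff:
• X: https://x.com/sanderschulhoff
• LinkedIn: https://www.linkedin.com/in/sander-schulhoff
• Website: https://sanderschulhoff.com
• AI Red Teaming and AI Security Masterclass on Maven: https://bit.ly/44lLSbC

Where to find Lenny:
• Newsletter: https://www.lennysnewsletter.com
• X: https://twitter.com/lennysan
• LinkedIn: https://www.linkedin.com/in/lennyrac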

Episode content

Original audio

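Sander Schulhoff is an AI researcher specializing in AI security, prompt injection, and red teaming. He wrote the first comprehensive guide on prompt engineering and, working with top AI labs and companies, ran the first prompt injection competition. His dataset is now used by Fortune 500 companies to evaluate the security of their AI systems. He has spent more time than anyone studying how attackers break AI systems, and what he has found isn’t reassuring: the guardrails companies are buying don’t actually work, and we have been lucky not to see more harm so far, only because AI agents aren’t yet powerful enough to do real damage.

We discuss:
1. The difference between jailbreaking and prompt injection attacks on AI systems
2. Why AI guardrails don’t work
3. Why we haven’t seen major AI security incidents yet (but soon will)
4. Why AI browser agents are vulnerable to hidden attacks embedded in webpages
5. The practical steps organizations should take instead of buying ineffective security tools
6. Why solving this requires merging classical cybersecurity expertise with AI knowledge

Brought to you by:
Datadog: Now home to Eppo, the leading experimentation and feature flagging platform. https://www.datadoghq.com/lenny
Metronome: Monetization infrastructure for modern software companies. https://metronome.com/
GoFundMe Giving Funds: Make year-end giving easy. http://gofundme.com/lenny

Transcript: https://www.lennysnewsletter.com/p/the-coming-ai-security-crisis

My biggest takeaways (for paid newsletter subscribers): https://www.lennysnewsletter.com/i/181089452/my-biggest-takeaways-from-this-conversation

Where to find Sander Schulhoff:
• X: https://x.com/sanderschulhoff
• LinkedIn: https://www.linkedin.com/in/sander-schulhoff
• Website: https://sanderschulhoff.com
• AI Red Teaming and AI Security Masterclass on Maven: https://bit.ly/44lLSbC

Where to find Lenny:
• Newsletter: https://www.lennysnewsletter.com
• X: https://twitter.com/lennysan
• LinkedIn: https://www.linkedin.com/in/lennyrac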

Original description

Sander Schulhoff is an AI researcher specializing in AI security, prompt injection, and red teaming.

He wrote the first comprehensive guide on prompt engineering and ran the first-ever prompt injection competition, working with top AI labs and companies.

His dataset is now used by Fortune 500 companies to benchmark the security of their AI systems. He’s spent more time than anyone alive studying how attackers break AI systems, and what he’s found isn’t reassuring: the guardrails companies are buying don’t actually work, and we’ve been lucky not to see more harm so far, only because AI agents aren’t yet capable enough to do real damage.

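We discuss:
1. The difference between jailbreaking and prompt injection attacks on AI systems
2. Why AI guardrails don’t work
3. Why we haven’t seen major AI security incidents yet (but soon will)
4. Why AI browser agents are vulnerable to hidden attacks embedded in webpages
5. The practical steps organizations should take instead of buying ineffective security tools
6. Why solving this requires merging classical cybersecurity expertise with AI knowledge

Brought to you by:
Datadog: Now home to Eppo, the leading experimentation and feature flagging platform. https://www.datadoghq.com/lenny
Metronome: Monetization infrastructure for modern software companies. https://metronome.com/
GoFundMe Giving Funds: Make year-end giving easy. http://gofundme.com/lenny

Transcript: https://www.lennysnewsletter.com/p/the-coming-ai-security-crisis

My biggest takeaways (for paid newsletter subscribers): https://www.lennysnewsletter.com/i/181089452/my-biggest-takeaways-from-this-conversation

Where to find Sander Schulhoff:
• X: https://x.com/sanderschulhoff
• LinkedIn: https://www.linkedin.com/in/sander-schulhoff
• Website: https://sanderschulhoff.com
• AI Red Teaming and AI Security Masterclass on Maven: https://bit.ly/44lLSbC

Where to find Lenny:
• Newsletter: https://www.lennysnewsletter.com
• X: https://twitter.com/lennysan
• LinkedIn: https://www.linkedin.com/in/lennyrac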