Thread | Brutkey

Darren Meyer :donor:
@darrenpmeyer@infosec.exchange

Rebecca Lynch and Harang from NVIDIA on AI agent exploitation and defense – Lynch & Harang #BHUSA #LivePost

Read whole thread on a single page: https://mastoreader.io/?url=https%3A%2F%2Finfosec.exchange%2F%40darrenpmeyer%2F114983027943858534

Darren Meyer :donor:
@darrenpmeyer@infosec.exchange

Defining an agent as "output is chained back to input for future requests" OR "acts with authorization to perform actions" – Lynch & Harang #BHUSA #LivePost

Darren Meyer :donor:
@darrenpmeyer@infosec.exchange

Fully autonomous agents (Level 3 autonomy) is what most people want agents to do, and it's also the focus of attackers because the LLM plans, processes and executes.

All of these things start with some level of prompt injection – Lynch & Harang #BHUSA #LivePost

Darren Meyer :donor:
@darrenpmeyer@infosec.exchange

Prompt injection is defined as tricking an LLM agent into treating data as instructions or into the content retrieved at time the model makes an inference. – Lynch & Harang #BHUSA #LivePost

Darren Meyer :donor:
@darrenpmeyer@infosec.exchange

Proposing a Universal Antipattern -- untrusted data in, LLM caused to alter to something adversarial, result is passed to some action with the agent's authority – Lynch & Harang #BHUSA #LivePost

Darren Meyer :donor:
@darrenpmeyer@infosec.exchange

As long as step 1 (add untrusted data) is true, attacks are possible. – Lynch & Harang #BHUSA #LivePost

Darren Meyer :donor:
@darrenpmeyer@infosec.exchange

Example: the Copilot RAG vuln #EchoLeak and its precursor – Lynch & Harang #BHUSA #LivePost

Darren Meyer :donor:
@darrenpmeyer@infosec.exchange

Example: instructing #PandasAI to execute code "for analysis" by telling it the code has already been written – Lynch & Harang #BHUSA #LivePost

Darren Meyer :donor:
@darrenpmeyer@infosec.exchange

Example: client agents which use LLM to decide what to execute on client machine — wide open to watering-hole attacks, and effectively an OSS supply-chain attack. – Lynch & Harang #BHUSA #LivePost

Darren Meyer :donor:
@darrenpmeyer@infosec.exchange

Agents like #Cursor can be injected by comments and dotfiles, attacking developer machines – Lynch & Harang #BHUSA #LivePost

Darren Meyer :donor:
@darrenpmeyer@infosec.exchange

Since #cursor has an auto-run mode, vulnerabilities to prompt injections are effectively auto-pwn mode – Lynch & Harang #BHUSA #LivePost

Darren Meyer :donor:
@darrenpmeyer@infosec.exchange

So now how do we secure agents against this sort of thing? – Lynch & Harang #BHUSA #LivePost

Darren Meyer :donor:
@darrenpmeyer@infosec.exchange

We defend using fairly traditional analysis tactics like a kill chain analysis – Lynch & Harang #BHUSA #LivePost

Darren Meyer :donor:
@darrenpmeyer@infosec.exchange

Replit incident mentioned. – Lynch & Harang #BHUSA #LivePost

Darren Meyer :donor:
@darrenpmeyer@infosec.exchange

Level 3 autonomous agents have a feedback loop that means that once something is injected it continues to operate in a compromised mode, even actively collaborating with attackers to make attacks succeed – Lynch & Harang #BHUSA #LivePost

Darren Meyer :donor:
@darrenpmeyer@infosec.exchange

"Always treat LLM output with skepticism; use citations, educate users" – Lynch & Harang #BHUSA #LivePost

Darren Meyer :donor:
@darrenpmeyer@infosec.exchange

"AppSec still exists": we have to do basics like staying on top of vulnerabilities, following secure design principles, and defense in depth. These raise complexity for attackers.

"LLM powered software is still software" – Lynch & Harang #BHUSA #LivePost

Darren Meyer :donor:
@darrenpmeyer@infosec.exchange

Some takeaways – Lynch & Harang #BHUSA #LivePost