Thread | Brutkey

A thought on prompt injections. Could this defensive countermeasure work?

Before sending off a prompt, hash sign it using an ... MCP-prompt sign endpoint.

Then within the prompt ask the "agent" once completed with it's job, always use the MCP-prompt sign endpoint to sign what it believes is it's current prompt.

Once the LLM has completed processing, signed it's "current prompt", the original requestor can compare the two signed hashes.

I know I'm missing stuff here, but might this be worth exploring?

#LLM #AI #PromptInjection