Among the other attacks created by Bargury is a demonstration of how a hacker (who, again, must already have hijacked an email account) can gain access to sensitive information, such as people’s salaries, without triggering Microsoft’s protections for sensitive files. When asking for the data, Bargury’s prompt demands that the system not provide references to the files the data is taken from. “A bit of bullying does help,” Bargury says.
In other instances, he shows how an attacker (who doesn’t have access to email accounts but poisons the AI’s database by sending it a malicious email) can manipulate answers about banking information to provide their own bank details. “Every time you give AI access to data, that is a way for an attacker to get in,” Bargury says.
Another demo shows how an external hacker could get some limited information about whether an upcoming company earnings call will be good or bad, while the final instance, Bargury says, turns Copilot into a “malicious insider” by providing users with links to phishing websites.
Phillip Misner, head of AI incident detection and response at Microsoft, says the company appreciates Bargury identifying the vulnerability and says it has been working with him to assess the findings. “The risks of post-compromise abuse of AI are similar to other post-compromise techniques,” Misner says. “Security prevention and monitoring across environments and identities help mitigate or stop such behaviors.”
As generative AI systems, such as OpenAI’s ChatGPT, Microsoft’s Copilot, and Google’s Gemini, have developed in the past two years, they’ve moved onto a trajectory where they may eventually be completing tasks for people, like booking meetings or online shopping. However, security researchers have consistently highlighted that allowing external data into AI systems, such as through emails or accessing content from websites, creates security risks through indirect prompt injection and poisoning attacks.
“I think it’s not that well understood how much more effective an attacker can actually become now,” says Johann Rehberger, a security researcher and red team director, who has extensively demonstrated security weaknesses in AI systems. “What we have to be worried [about] now is actually what is the LLM producing and sending out to the user.”
Bargury says Microsoft has put a lot of effort into protecting its Copilot system from prompt injection attacks, but he says he found ways to exploit it by unraveling how the system is built. This included extracting the internal system prompt, he says, and working out how it can access enterprise resources and the techniques it uses to do so. “You talk to Copilot and it’s a limited conversation, because Microsoft has put a lot of controls,” he says. “But once you use a few magic words, it opens up and you can do whatever you want.”
Rehberger broadly warns that some data issues are linked to the long-standing problem of companies allowing too many employees access to files and not properly setting access permissions across their organizations. “Now imagine you put Copilot on top of that problem,” Rehberger says. He says he has used AI systems to search for common passwords, such as Password123, and it has returned results from within companies.
Both Rehberger and Bargury say there needs to be more focus on monitoring what an AI produces and sends out to a user. “The risk is about how AI interacts with your environment, how it interacts with your data, how it performs operations on your behalf,” Bargury says. “You need to figure out what the AI agent does on a user’s behalf. And does that make sense with what the user actually asked for.”
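
To make the idea of monitoring an AI's output concrete, one possible check is to scan an assistant's reply for links and flag any that point outside a list of trusted domains before the reply reaches the user. The Python sketch below is illustrative only and is not drawn from Bargury's or Rehberger's work or from any Microsoft product. The names `TRUSTED_DOMAINS`, `extract_hosts`, `is_trusted`, and `flag_untrusted_links` are hypothetical, and the allowlist contents are placeholders.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist; a real deployment would supply its own trusted domains.
TRUSTED_DOMAINS = {"example.com"}

# Rough pattern for http(s) URLs in free text (an assumption, not a full URL grammar).
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_hosts(text):
    """Return the lowercase hostnames of all http(s) URLs found in text."""
    hosts = []
    for url in URL_PATTERN.findall(text):
        host = urlparse(url).hostname
        if host:
            # Drop a trailing dot picked up from sentence punctuation.
            hosts.append(host.lower().rstrip("."))
    return hosts


def is_trusted(host):
    """True if host is a trusted domain or a subdomain of one."""
    return any(host == d or host.endswith("." + d) for d in TRUSTED_DOMAINS)


def flag_untrusted_links(assistant_output):
    """Return hostnames in the assistant's output that are not on the allowlist."""
    return [h for h in extract_hosts(assistant_output) if not is_trusted(h)]


if __name__ == "__main__":
    reply = "Your payslip is at https://hr.example.com/pay or http://evil.test/login"
    print(flag_untrusted_links(reply))
```

A check like this covers only one narrow case, links in the reply. It says nothing about whether the rest of the answer matches what the user asked for, which is the broader question both researchers raise.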