Microsoft’s own evil team ‘attacked’ over 100 generative AI products: here’s what they learned
- Microsoft created an AI Red Team in 2018 because it foresaw the rise of AI
- A red team represents the enemy; and takes on the hostile personality.
- The team’s latest whitepaper hopes to address common vulnerabilities in AI systems and LLMs
For the past seven years, Microsoft has been tackling risks in artificial intelligence systems through its dedicated AI “red team.”
Created to anticipate and address the growing challenges of advanced AI systems, this team takes on the role of threat actors, with the ultimate goal of identifying vulnerabilities before they can be exploited in the real world.
Now, after years of work, Microsoft has one white paper from the team, presenting some of the key findings from the work.
The findings of the white paper from Microsoft’s red team
Over the years, the focus of Microsoft’s red teaming has expanded beyond traditional vulnerabilities to address new risks unique to AI, using both Microsoft’s own Copilot and open-source AI models.
The whitepaper highlights the importance of combining human expertise with automation to effectively detect and mitigate risks.
A key lesson we learned is that integrating generative AI into modern applications has not only expanded the cyber attack surface, but also presented unique challenges.
Techniques such as prompt injections exploit the inability of models to distinguish between system-level instructions and user input, allowing attackers to manipulate the outcomes.
Meanwhile, traditional risks, such as dependencies on outdated software or improper security technology, remain significant, and Microsoft considers human expertise essential to counter them.
The team found that effectively understanding the risks surrounding automation often requires subject matter experts who can review content in specialized areas such as medicine or cybersecurity.
Additionally, cultural competency and emotional intelligence were highlighted as essential cybersecurity skills.
Microsoft also emphasized the need for continuous testing, updated practices and “break-fix” cycles, a process of identifying vulnerabilities and implementing fixes alongside additional testing.