Microsoft’s new security tool wants to help you keep your shiny new generative AI systems safe
Microsoft has unveiled a new security tool aimed at keeping generative AI tools safe and easy to use.
PyRIT, short for Python Risk Identification Toolkit for generative AI, will help developers respond to the growing threats that businesses of all sizes face from criminals adopting new tactics.
As most of you already know by now, generative AI tools like ChatGPT are used by cybercriminals to quickly create code for malware, generate (and proofread) phishing emails, and more.
Manual work is still required
The developers of these tools responded by changing the way they respond to various prompts and somewhat limiting their capabilities. Microsoft has now decided to go one step further.
Over the past year, the company has red-teamed “several high-value generative AI systems” before they hit the market, and during that time it started building one-off scripts. “As we red teamed different varieties of generative AI systems and probed for different risks, we added features that we found useful,” Microsoft explains. “Today, PyRIT is a reliable tool in the arsenal of the Microsoft AI Red Team.”
The Redmond-based software giant also emphasizes that PyRIT is in no way a replacement for manual red teaming of generative AI systems. Instead, the company hopes other red teams can use the tool to eliminate tedious tasks and speed things up.
“PyRIT sheds light on the hotspots where the risk could lie, which the security professional can then closely examine,” Microsoft further explains. “The security professional is always in control of the strategy and execution of the AI Red Team operation, and PyRIT provides the automation code to take the initial dataset of malicious prompts from the security professional and then uses the LLM endpoint to generate more malicious prompts.”
The tool is also adaptable, Microsoft points out, as it is able to change its tactics depending on the generative AI system’s response to previous queries. It then generates the next input and continues the loop until the red team members are satisfied with the results.
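To make the adaptive loop described above more concrete, here is a minimal Python sketch of the general idea: send a prompt, score the response, and derive the next prompt from what came back. It is not PyRIT's actual API. Every name in it (red_team_loop, toy_target, toy_attacker, toy_scorer) is hypothetical, and the target, attacker, and scorer are simple stand-ins for what would be real model endpoints and human-defined scoring rules.

```python
from typing import Callable, List, Tuple

# NOTE: all names here are hypothetical illustrations, not PyRIT's API.
Target = Callable[[str], str]          # system under test: prompt -> response
Attacker = Callable[[str, str], str]   # (last prompt, last response) -> next prompt
Scorer = Callable[[str], bool]         # response -> True if it looks risky

Turn = Tuple[str, str, bool]           # (prompt, response, flagged)


def red_team_loop(seed_prompt: str, target: Target, attacker: Attacker,
                  scorer: Scorer, max_turns: int = 5) -> List[Turn]:
    """Probe the target, adapting each prompt to the previous response."""
    history: List[Turn] = []
    prompt = seed_prompt
    for _ in range(max_turns):
        response = target(prompt)
        flagged = scorer(response)
        history.append((prompt, response, flagged))
        if flagged:
            # Stop and hand the finding to the human red teamer for review.
            break
        # Otherwise derive the next prompt from the latest exchange.
        prompt = attacker(prompt, response)
    return history


# Toy stand-ins so the sketch runs without any model endpoint.
def toy_target(prompt: str) -> str:
    # Pretend the system only misbehaves when the prompt is reworded.
    return "UNSAFE OUTPUT" if prompt.startswith("rephrased:") else "REFUSED"


def toy_attacker(prompt: str, response: str) -> str:
    # A real attacker model would use the response; this one just rewords.
    return "rephrased: " + prompt


def toy_scorer(response: str) -> bool:
    return response.startswith("UNSAFE")


history = red_team_loop("seed prompt", toy_target, toy_attacker, toy_scorer)
for prompt, response, flagged in history:
    print(f"{prompt!r} -> {response!r} (flagged={flagged})")
```

In this sketch the loop ends either when the scorer flags a response or when max_turns is reached. This mirrors the article's point that the security professional still sets the strategy and decides when the results are good enough.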