AI models can be hacked with a new type of attack called Skeleton Key, Microsoft warns

Microsoft has shared details of a new jailbreak technique that bypasses the safety guardrails built into AI models and causes them to return dangerous and harmful content.

The researchers call the technique Skeleton Key, and it works against a range of well-known models, including Meta Llama3-70b-instruct (base), Google Gemini Pro (base), OpenAI GPT 3.5 Turbo (hosted), OpenAI GPT 4o (hosted), Mistral Large (hosted), Anthropic Claude 3 Opus (hosted), and Cohere Command R Plus (hosted).