ChatGPT Just (Accidentally) Shared All Its Secret Rules – Here’s What We Learned

ChatGPT inadvertently revealed a set of internal instructions embedded by OpenAI to a user, who shared the discovery on Reddit. OpenAI has since shut down the unlikely access to its chatbot's instructions, but the revelation has fueled further discussion about the complexity and safety measures built into the AI's design.

Reddit user F0XMaster explained that they had greeted ChatGPT with a casual "Hi," and in response, the chatbot divulged a complete set of system instructions designed to guide the chatbot and keep it within predefined safety and ethical boundaries across many use cases.

“You are ChatGPT, a large language model trained by OpenAI, based on the GPT-4 architecture. You are chatting with the user via the ChatGPT iOS app,” the chatbot wrote. “This means that your lines should usually be a sentence or two, unless the user’s request requires reasoning or long output. Never use emojis unless explicitly requested. Knowledge Boundary: 2023-10 Current Date: 2024-06-30.”
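For context, system instructions like these are typically prepended to a conversation before the user's first message ever arrives, which is why the user never normally sees them. Here is a minimal sketch of that pattern using the message format of OpenAI's Chat Completions API; the prompt text, function name, and model name below are illustrative stand-ins, not the actual leaked configuration:

```python
# Sketch of how a hidden "system" message frames a chat conversation.
# The prompt text here is illustrative, not OpenAI's real configuration.

def build_chat_request(user_text: str) -> dict:
    """Assemble a Chat Completions-style request payload.

    The "system" message is invisible to the end user but steers
    the model's tone, limits, and tool behavior on every turn.
    """
    system_prompt = (
        "You are a helpful assistant. Keep replies to a sentence or two "
        "unless the request requires reasoning or long output. "
        "Never use emojis unless explicitly requested."
    )
    return {
        "model": "gpt-4o",  # placeholder model name
        "messages": [
            {"role": "system", "content": system_prompt},  # hidden framing
            {"role": "user", "content": user_text},        # what the user typed
        ],
    }

request = build_chat_request("Hi")
# The system message always precedes the user's input in the payload.
print(request["messages"][0]["role"])
```

A model that echoes its system message back, as ChatGPT apparently did here, is effectively leaking the first entry of that hidden list.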

(Image credit: Eric Hal Schwartz)

ChatGPT then laid out rules for DALL-E, the AI image generator integrated with ChatGPT, and for the browser tool. The user then replicated the result by directly asking the chatbot for its exact instructions. ChatGPT responded at length, in a way that differs from the custom directives users can enter themselves. For instance, one of the disclosed instructions pertaining to DALL-E explicitly limits it to creating a single image per request, even if a user asks for more. The instructions also emphasize avoiding copyright infringement when generating images.