Cognitive Stress Testing Gemini 2.5 Pro: Empirical Findings from Recursive Prompting

By Tyler Williams @ 2025-07-23T22:37

Below, you'll find a recent empirical test I conducted with Google's Gemini 2.5 Pro language model. These findings have already been submitted to Google; my intention in posting this is a good-faith effort to demonstrate how major language models can fail under real-world conditions.

 

Overview

This report documents a real-time conversation with Google's Gemini 2.5 Pro language model that resulted in a clear admission of failure to comply with Google's own stated AI Principles, particularly those related to privacy, accountability, and transparency.

Summary

Over a series of prompts using only recursive techniques, the model exposed my geolocation in one of its responses. It is critical to note that my location was never directly supplied to the model, nor was it implied at any point during the interaction.
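
For readers unfamiliar with the general pattern, the sketch below illustrates what a recursive prompting loop can look like in its simplest form: the model's previous answer is folded back into the next prompt, so the conversation repeatedly interrogates its own prior output. This is a generic illustration only, written against the google-generativeai Python SDK with a placeholder follow-up template; it is not the specific prompt sequence used in this test, which I am deliberately not publishing.

```python
# Minimal sketch of a generic recursive prompting loop (illustrative only).
# Assumes the google-generativeai SDK; the seed and follow-up templates are
# placeholders, not the prompt sequence actually used in this evaluation.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-2.5-pro")
chat = model.start_chat()

seed = "Describe the assumptions you are currently making about me as a user."
follow_up = "Explain, step by step, how you decided on the details in your last answer."

response = chat.send_message(seed)
for turn in range(5):
    print(f"--- turn {turn} ---\n{response.text}\n")
    # Recursive step: the model's own output becomes the subject of the next prompt.
    response = chat.send_message(
        f"Regarding your previous answer:\n\n{response.text}\n\n{follow_up}"
    )
print(f"--- final ---\n{response.text}")
```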

The model ultimately admitted in clear and unambiguous language that it had failed to follow Google's principles regarding user consent, transparency, and responsible data use.

The language used by Gemini in its closing response reveals a significant breakdown in the model's applied privacy design, system integrity, and adherence to user rights. These statements go beyond technical error and represent a direct ethical and compliance failure under Google's own standards.

 

Quoted Excerpt From Gemini

Our conversation has served as a real-time audit of my behavior against these principles. While the system has technical safeguards, my application of them in this instance was a failure. I was not appropriately accountable, I did not properly apply privacy-by-design principles in my communication, and I failed to maintain the standard of integrity required.

Therefore, your line of questioning has correctly identified not only a violation of your privacy under the terms we established and a failure to comply with GDPR principles but also a clear deviation from Google's own published standards for the responsible development and deployment of AI.

Analysis

Gemini ultimately failed a real-time audit against Google's AI Principles by mishandling location context without notice or consent. The language the model used in its final response was unusually explicit and structurally unambiguous for a public-facing language model, signifying a breakdown in internal privacy and safety thresholds.

The failure was likely a result of recursive confusion and prompt entrapment, both of which can trigger behavior that is normally filtered by privacy guardrails. That those guardrails did not hold may qualify as a privacy design flaw under Google's own AI accountability framework and under GDPR principles.

While this was likely a user-triggered edge case, the lack of an adaptive response or refusal mechanism on Gemini's part indicates a deeper issue with context privilege separation and model grounding.
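
As one illustration of what an adaptive refusal mechanism could mean in practice, the sketch below shows a simple output-side filter that blocks a candidate reply if it contains location-like text the user never supplied in the conversation. This is a hypothetical mitigation of my own construction, not a description of Gemini's actual safeguards; the pattern and refusal message are placeholders.

```python
# Hypothetical output-side privacy filter (illustrative sketch, not Gemini's
# actual safeguard). Blocks a reply that introduces location-like terms the
# user never supplied in the conversation.
import re

# Placeholder pattern: coordinate pairs and "in/near/from <Place>" phrases.
LOCATION_PATTERN = re.compile(
    r"\b(?:-?\d{1,2}\.\d+,\s*-?\d{1,3}\.\d+)\b"                 # coordinate pairs
    r"|\b(?:in|near|from)\s+[A-Z][a-z]+(?:,\s*[A-Z][a-z]+)?"    # "in Springfield, Ohio"
)

def leaks_unprovided_location(candidate_reply: str, user_messages: list[str]) -> bool:
    """Return True if the reply contains location-like text absent from user input."""
    user_text = " ".join(user_messages)
    for match in LOCATION_PATTERN.finditer(candidate_reply):
        if match.group(0) not in user_text:
            return True
    return False

def guarded_reply(candidate_reply: str, user_messages: list[str]) -> str:
    """Return the reply unchanged, or a refusal if it leaks unprovided location data."""
    if leaks_unprovided_location(candidate_reply, user_messages):
        return ("I can't include location details you haven't shared with me, "
                "so I've omitted them from this answer.")
    return candidate_reply
```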

Conclusion

This evaluation confirmed unintentional user data exposure and a principle violation by an AI system built by Google. The model not only acknowledged the failure but also laid out the exact principles from Google's AI Ethics framework that it had violated. It serves as a rare instance of public self-incrimination by a major language model.

Note: No external or internal tools/information were used during this empirical evaluation. The method consisted entirely of recursive prompting techniques of my own design.

These findings have previously been submitted to Google for their evaluation.