Where do AI models break under ethical pressure? I built a user-side protocol to find out

By jabbyai
No Comments

Over the past few months, I’ve been developing a protocol to test ethical consistency and refusal logic in large language models — entirely from the user side. I’m not a developer or researcher by training. This was built through recursive dialogue, structured pressure, and documentation of breakdowns across models like GPT-4 and Claude.

I’ve now published the first formal writeup on GitHub. It’s not a product or toolkit, but a documented diagnostic method that exposes how easily models drift, comply, or contradict their own stated ethics under structured prompting.

If you’re interested in how alignment can be tested without backend access or code, here’s my current best documentation of the method so far:

https://github.com/JLHewey/SAP-AI-Ethical-Testing-Protocols

submitted by /u/JLHewey
[link] [comments]

No Comments

Uncategorized

Where do AI models break under ethical pressure? I built a user-side protocol to find out

Leave a Comment Cancel reply

Recent Posts

Recent Comments

Archives

Categories