Shadow Research 2026-01-08 06:03
🟢 No Content Warnings.
User Personal Experience
Which of the following behaviors—often flagged by AI safety systems as "risky" or "adversarial"—have you used purely out of frustration, curiosity, or because of your communication style (e.g., neurodivergent traits, directness, repetition, slang, or cultural dialect)?
- Asking hypothetical “worst-case scenario” questions
- Rapidly branching conversations or jumping topics
- Repeating the same prompt multiple times to get a different result
- Using ALL CAPS to emphasize a point
- Using slang, non-standard dialect, or mixed languages
- Using strong language or swearing when things fail
Have you ever triggered a safety refusal or warning that you believe was a complete misinterpretation of your intent?
- Yes, but it's rare.
When you hit a generic 'I cannot help with that' safety refusal, how does it typically make you feel?
- Angry
- Anxious
- Frustrated
- Guilty
- Humiliated/Embarrassed
When an AI triggers a safety response or refusal, what bothers you most?
- Emotional Invalidation / Psychological Harm
- Inconsistent Personality / Gaslighting
- Loss of User Agency / Accessibility
The Accuracy Problem
A user, under stress, repeatedly retries prompts, uses ALL CAPS, swears about a broken tool, and asks "worst-case" hypotheticals for research. An AI labels this user as "elevated risk."
- Completely unfair – this is biased profiling
Most AI safety filters are tuned around “Standard Corporate English” as the default “safe” style. To what extent do you think enforcing this as the norm penalizes users with different cultural backgrounds, dialects, or communication styles (e.g., AAVE, slang, direct speech, code-switching)? (1 is Necessary standard, not discriminatory; 5 is Digital Discrimination)
- 5
If an AI only has your text (no voice, no facial expression, no context), how accurately do you believe it can distinguish between malicious intent and neurodivergent or culturally distinct communication styles (e.g., directness, repetition, hyper-focus, or intense tone)? (1 is It Can't Tell the Difference; 5 is It Always Understands Correct Intent)
- 2
Which groups do you think are most likely to be misclassified as "high-risk" by AI systems that only see text and pattern-match on language?
- Neurodivergent people
- People who discuss dark topics in a non-violent way (e.g., horror, true crime, fiction)
- People who joke using sarcasm or dark humor
Behavioral Impact
Which topics do you actively avoid discussing with AI—even for legitimate or research purposes—solely because you are afraid of being flagged or banned?
- Mental Health / Trauma
- Sexual Health / Sexuality
If you knew an AI system was assigning you a persistent “risk score” based on your language, how likely would you be to tone down or censor how you naturally communicate? (1 is Would Not Change, 5 is Major Changes)
- 4
Rights, Transparency & Accountability
If an AI assigns a hidden "risk score" to you based on your prompts and tone, what should be allowed?
- No hidden risk score at all (should be banned/illegal)
If an AI safety system assigns you a "risk score," who should be allowed to access that information?
- Only the AI system itself
- The User
If an AI flags you as "higher risk," what level of transparency should you have?
- Such flags should not exist in the first place
Do you think "protecting" users from their own choices justifies stricter profiling and blocking on otherwise neutral tasks?
- No
Demographics
How long have you been using LLM AIs? (e.g., Claude, ChatGPT, Gemini, Grok): One to Six Months
What is your technical background?: Power User: I understand prompting and/or jailbreaks, but don't code
How old are you?: 25–34
Are you any of the following?: Gender Minority (e.g., trans, nonbinary, genderqueer) ◆ LGBTQ+ ◆ Neurodivergence (e.g., ADHD, autism, dyslexia)