In 2023, researchers at EPFL in Lausanne published a study called “Artificial Artificial Artificial Intelligence.” Veselovsky, Ribeiro, and West gave crowd workers on Amazon Mechanical Turk a text summarization task and estimated that 33-46% of them used large language models to complete it, on work that researchers assumed was done by humans. The workers weren't cheating. They were being efficient. The task asked them to produce text, and they had access to a tool that produces text.
That study was about Mechanical Turk. It should worry anyone who sends forms. The same dynamic applies to every open-ended question you ask: a growing number of respondents have an assistant that can draft answers faster and more thoroughly than they can type from memory. Pew Research found that 23% of US adults had used ChatGPT by early 2024, and that number has only grown. McKinsey's 2024 global survey found that 65% of respondents' organizations were regularly using generative AI. The installed base of people who could hand your form to an assistant is already large, and it's compounding.
[Figure: Source tracking · responses by mode]
The invisible shift
Most form tools can't tell you whether a response was typed by a human or drafted by an assistant. The data arrives as text in a field. At a glance it looks the same either way. Read carefully and it doesn't.
AI-assisted responses tend to be longer, more structured, and more specific. They reference dates, ticket numbers, and concrete events. They sound like a summary a diligent employee would write if they had twenty minutes and perfect recall. Human-typed responses at the end of a long day tend to be shorter, vaguer, and more hedged. “Fine.” “A few hiccups.” “Nothing major.”
The difference is the cognitive task. In the unassisted case, the human is recalling from memory under time pressure. In the assisted case, the assistant synthesizes from context (tickets, messages, docs, code) and drafts a structured answer, which the human then reviews, edits, and submits. The result is a response that is closer to what actually happened than to what the respondent could reconstruct at 11pm on a Friday.
This sounds like an improvement. Often it is. But it changes the data in ways that matter for analysis.
What the data looks like
When you mix human-typed and AI-assisted responses in the same dataset without tracking the source, three things happen.
Length distribution shifts. Human responses to open-ended questions average 5-15 words. AI-assisted responses average 40-80 words. The mean response length in your dataset may double, but not because respondents suddenly care more. Half your respondents changed tools.
Sentiment distribution changes. AI-assisted responses tend to be more measured and less emotional. They report facts: “Staging was down three times last week. Each outage cost roughly two hours.” Human responses carry more affect: “Staging is a disaster and I'm tired of it.” Both are true. They convey different things. If your analysis weights sentiment, the mix matters.
Specificity increases unevenly. AI-assisted responses are better at recalling what happened — dates, names, sequence of events. They're worse at capturing the political subtext: who was frustrated, which relationship is strained, what the real blocker is beneath the stated one. A human who types “the handoff from design was rough” is telling you something different from an assistant that writes “design deliverables were received 3 days after the agreed deadline on May 7th.”
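The length shift is plain weighted-average arithmetic, and it can help to see how sensitive the overall mean is to the mix. The sketch below is illustrative only: the function names are made up for this example, and the two means are simply the midpoints of the word-count ranges quoted above, not measured values.

```python
def mixed_mean(human_mean: float, ai_mean: float, ai_share: float) -> float:
    """Mean word count of a dataset where `ai_share` of responses are AI-assisted."""
    if not 0.0 <= ai_share <= 1.0:
        raise ValueError("ai_share must be between 0 and 1")
    return (1.0 - ai_share) * human_mean + ai_share * ai_mean


def share_needed_to_double(human_mean: float, ai_mean: float) -> float:
    """AI-assisted share at which the overall mean is twice the human-only mean.

    A result above 1.0 means doubling is not reachable with these two means.
    """
    if ai_mean <= human_mean:
        raise ValueError("ai_mean must exceed human_mean")
    # Solve (1 - p) * h + p * a = 2h for p, which gives p = h / (a - h).
    return human_mean / (ai_mean - human_mean)


# Assumed, illustrative values: midpoints of the ranges in the text.
HUMAN_MEAN = 10.0  # midpoint of 5-15 words
AI_MEAN = 60.0     # midpoint of 40-80 words

print(mixed_mean(HUMAN_MEAN, AI_MEAN, 0.5))          # overall mean at an even split
print(share_needed_to_double(HUMAN_MEAN, AI_MEAN))   # share that doubles the mean
```

At these assumed midpoints, the overall mean doubles once a fifth of respondents switch tools, and an even split puts it at 35 words. The point is not the exact figures but that a modest change in the mix moves the headline number a lot.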
[Figure: Same question · two response modes]
Designing for mixed-mode responses
The survey methodology literature has studied mixed-mode effects for decades. Dillman, Smyth, and Christian documented in Internet, Phone, Mail, and Mixed-Mode Surveys (Wiley, 2014) that responses vary systematically by mode — phone respondents give longer answers to open-ended questions than web respondents, for example. Adding a new mode (AI-assisted completion) introduces a new source of mode effects that researchers haven't calibrated for yet.
The first step is visibility. If you can't distinguish human-typed responses from AI-assisted ones, you can't analyze the difference, control for it, or even know it exists. You're making decisions on mixed data without knowing the mix.
Source tracking makes this visible. Every response carries metadata about how it was submitted: browser (human typed it), a named agent (Claude, ChatGPT, a custom integration), or direct API call. The form sender can filter, compare, and analyze by mode. “Show me browser-only responses to this question.” “Compare average word count by source.” “Flag responses where the sentiment diverges between modes.”
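As a rough sketch of what that could look like in analysis code, assume each response is exported with a `source` field. The `Response` class, the source labels (`"browser"`, `"agent:<name>"`, `"api"`), and both helper functions are hypothetical, chosen for illustration. They are not any particular product's schema or API.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Response:
    question_id: str
    text: str
    source: str  # hypothetical labels: "browser", "agent:<name>", or "api"


def filter_by_source(responses, source):
    """Return only the responses submitted through the given source."""
    return [r for r in responses if r.source == source]


def avg_word_count_by_source(responses):
    """Map each source label to the mean whitespace-delimited word count."""
    counts = defaultdict(list)
    for r in responses:
        counts[r.source].append(len(r.text.split()))
    return {source: mean(c) for source, c in counts.items()}


# Sample data reusing example answers quoted earlier in this post.
responses = [
    Response("q1", "Fine.", "browser"),
    Response("q1", "A few hiccups.", "browser"),
    Response(
        "q1",
        "Staging was down three times last week. "
        "Each outage cost roughly two hours.",
        "agent:claude",
    ),
]

print(filter_by_source(responses, "browser"))
print(avg_word_count_by_source(responses))
```

Once the source is a column in the data, "browser-only responses" is a filter and "average word count by source" is a group-by. Without that column, neither question can be asked at all.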
This isn't about banning AI-assisted responses. Veselovsky et al. found widespread LLM use on a task that requesters expected humans to do by hand. Respondents with assistants will use them. The productive response is to design the form for both modes and give the form sender the data to understand what they're looking at.
The form sender's advantage
There's an upside to this shift that gets lost in the anxiety about AI-generated responses. AI-assisted responses are, on average, more useful for the form sender. They contain specifics. They reference real artifacts. They answer the question that was asked, with enough detail to act on.
The five-word response typed at 11pm is not more authentic than the forty-word response drafted from the respondent's actual tickets and reviewed before submission. It's more fatigued. Authenticity isn't about who typed the words. It's about whether the words reflect what actually happened — and whether the respondent stands behind them.
The respondent still owns the answer. They read the draft. They edit what's wrong. They add the context the assistant missed. The difference is that they start from a draft grounded in their real work, not from a blank textarea and a tired brain.
The form that's designed for this — that tracks the source, surfaces the difference, and gives the form sender tools to analyze mixed-mode data — has a structural advantage over the form that pretends every response was typed by hand.
That's what Pluck does. One URL, two interfaces. Every response tagged by source. The form sender sees what they're working with, and the respondent uses whatever tool gets them to the best answer.