Ironwood Insights Blog

Addressing the Growing Threat of Online Data Fraud

Written by Ironwood Insights | Apr 17, 2025 8:13:54 PM

Survey fraud has plagued the quality of online research for many years. However, the recent rise of AI, particularly ChatGPT, has undeniably and dramatically accelerated the decline of online sample quality, online data integrity and, ultimately, the reliability of insights and business decisions.

While AI has accelerated these quality issues, human-led fraud remains a major driver of this multi-faceted and widespread problem. The primary types of fraud we see include:

Bots & Scripts: automated survey-takers, increasingly powered by AI
AI-Assisted Humans: real respondents leaning on tools like ChatGPT to generate answers
Serial Human Fraudsters: professional survey-takers who cheat their way through study after study
Lazy/Inattentive Respondents: unserious humans whose garbage answers degrade data quality

The “Seen & Unseen” Impact of Survey Fraud

Fraud has become an existential threat to online research, affecting projects in both obvious and subtle ways:
Timing/Study Delays: replacing bad surveys extends field time and overall project duration
Cost Uncertainty/Overruns: time spent cleaning data and replacing bad surveys adds unforeseen costs to the project
Data/Insights Quality: unchecked fraud erodes sample quality, compromises data integrity and undermines business decisions


Illustrating the Impact of Fraud on Data Quality

To illustrate the most subtle point above, the impact of fraud on the quality of data, insights and business decisions, Ironwood conducted a “live fire” experiment in conjunction with a recent brand health study. Using a split-cell design, we captured fraudsters as usual, but instead of blocking or terminating them, we allowed them to flow through and complete the survey so that we could analyze their responses against clean survey data.

This experiment revealed that most fraudsters are serial offenders! Of the seven fraud countermeasures we employed, fraudsters tripped an average of FOUR within a single survey. Examples of the countermeasures, the types of fraud they caught and the decisions they impacted are summarized below:

1. Write-ins: Our first example, brand awareness write-ins, requires the respondent to work a little harder, like an open end. And some respondents don’t like that! Humans can respond with garbage answers, and homing in on category-relevant brands seems to give bots trouble.

Now, this data quality issue is pretty easy to spot, so it's not the most harmful to our data. But write-ins do help get fraudsters and bots to show their hand early... the behavior often continues and resurfaces throughout the survey, so it’s an early tell of inattention or a lack of seriousness.
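To make the idea concrete, here is a minimal sketch of how write-in screening might work, assuming a hypothetical category brand list and Python's standard-library fuzzy matcher; a production system would use a curated brand taxonomy and far more robust matching.

```python
# A minimal sketch of write-in screening. CATEGORY_BRANDS is a
# hypothetical, illustrative list, not a real client taxonomy.
from difflib import get_close_matches

CATEGORY_BRANDS = ["acme bank", "first national", "union trust", "harbor credit"]

def score_write_ins(write_ins: list[str], cutoff: float = 0.75) -> dict:
    """Count how many write-ins resolve to a known category brand."""
    matched = 0
    for answer in write_ins:
        candidate = answer.strip().lower()
        if get_close_matches(candidate, CATEGORY_BRANDS, n=1, cutoff=cutoff):
            matched += 1
    return {
        "total": len(write_ins),
        "matched": matched,
        # Respondents whose write-ins never resolve to a real category
        # brand are flagged for review rather than auto-terminated.
        "flag": matched == 0 and len(write_ins) > 0,
    }

# Garbage answers never resolve, so the respondent is flagged.
print(score_write_ins(["asdf", "good brand", "qwerty"]))  # flag: True
print(score_write_ins(["Acme Bank", "Union Trst"]))       # flag: False
```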

2. Brand Attribute Ratings: Nowadays, fraudsters and bots are less likely to speed through by opting out of questions or answering “Don’t Know.” Bots have evolved and rarely do that anymore. They tend to rate brands and attributes HIGH, in part to evade the follow-up questions that sometimes accompany more negative responses and to improve their survey-taking throughput. Pure straight-lining is less common as well. Bots, in particular, tend to interrupt the pattern once or twice per attribute battery to cover their tracks.

“Speed bumps” and other attention-testing countermeasures (e.g., directing respondents to provide a specific rating) can be effective in catching bots and inattentive respondents and in mitigating the risks that flow from this type of fraud: distortion of a brand’s market presence, brand image and brand equity.
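As an illustration, the sketch below flags a rating battery on two of the signals just described: a failed “speed bump” (a planted instruction to select a specific rating) and near-straight-lining, measured by the longest unbroken run of identical ratings. The thresholds and the planted-item index are assumptions for the example, not Ironwood's production rules.

```python
# A minimal sketch of speed-bump and near-straight-lining checks.
# Thresholds and the planted-item mechanic are illustrative assumptions.

def longest_run(ratings: list[int]) -> int:
    """Length of the longest run of identical consecutive ratings."""
    best = run = 1
    for prev, cur in zip(ratings, ratings[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best

def flag_battery(ratings: list[int], speed_bump_index: int,
                 required_value: int, run_threshold: float = 0.8) -> bool:
    """Flag if the respondent failed the planted instruction, or if most
    of the battery is one unbroken run (bots often break the pattern
    once or twice, so we look at run length, not strict uniformity)."""
    failed_bump = ratings[speed_bump_index] != required_value
    near_straight = longest_run(ratings) >= run_threshold * len(ratings)
    return failed_bump or near_straight

# 12-item battery; item 5 is a planted "select 2" instruction.
battery = [5, 5, 5, 5, 5, 2, 5, 4, 5, 5, 5, 5]   # passed the bump
print(flag_battery(battery, speed_bump_index=5, required_value=2))   # False
bot_like = [5] * 12                               # ignored the bump, one long run
print(flag_battery(bot_like, speed_bump_index=5, required_value=2))  # True
```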

3. Concept Review: Fraudsters tend not to read concept stimuli, or are inattentive to them. They respond to evaluative follow-up questions by defaulting to what is familiar and easy rather than what is logical and thoughtful. Fraudsters also overuse pre-listed responses and pick less carefully when providing feedback on a concept.

Imposing a pre-set timer on the concept screen can improve concept attentiveness and comprehension, and including “red herring” questions in response lists can ferret out unserious or inattentive respondents. Errant concept feedback is one of the more damaging types of fraud, as it can point companies toward the wrong go/no-go decisions and, consequently, poor capital expenditure choices.
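A minimal sketch of these two countermeasures, a minimum-dwell timer on the concept screen and a planted “red herring” item in the response list, might look like this (the threshold, field names and the fictitious claim are all illustrative assumptions):

```python
# A minimal sketch of concept-screen countermeasures: a dwell timer
# and a red-herring item. Values below are assumptions, not real rules.

MIN_CONCEPT_SECONDS = 20                         # assumed pre-set timer
RED_HERRINGS = {"Made with real moon rock"}      # fictitious claim, not in concept

def flag_concept_response(dwell_seconds: float, selected: set[str]) -> list[str]:
    """Return the countermeasures this respondent tripped."""
    tripped = []
    if dwell_seconds < MIN_CONCEPT_SECONDS:
        tripped.append("sped past concept stimulus")
    if selected & RED_HERRINGS:
        tripped.append("endorsed red-herring claim")
    return tripped

print(flag_concept_response(6.2, {"Convenient", "Made with real moon rock"}))
# ['sped past concept stimulus', 'endorsed red-herring claim']
```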

4. Open Ends: While not a data or business decision risk per se, open ends are a huge tell about overall data quality. We triangulate and analyze OEs in conjunction with other countermeasures...fraudsters with bad open ends usually cheat somewhere else.

We use “regular” and “specialized” open ends (a thoughtful answer, a feeling/emotion, or real-time, timed responses, e.g., Eden AI) to trip up bots/AI and to flag lazy, unserious or inattentive human respondents as well. Humans tend to be vague, while bots and humans using AI tend to be overly detailed and/or miss the mark in addressing the question...their response is just out of left field.
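For illustration, the sketch below triages open ends using two crude heuristics that mirror the pattern above: very short, stock answers suggest a lazy human, while long answers that never touch the question's terms suggest AI assistance. The word list and thresholds are assumptions; real review (and tools like Eden AI) goes well beyond this.

```python
# A minimal open-end triage sketch. VAGUE_STOCK and the length/overlap
# thresholds are illustrative assumptions.

VAGUE_STOCK = {"good", "nice", "idk", "n/a", "fine", "ok"}

def triage_open_end(answer: str, question_terms: set[str]) -> str:
    words = answer.lower().split()
    # How many words in the answer actually reference the question?
    overlap = sum(1 for w in words if w.strip(".,!?") in question_terms)
    if len(words) <= 3 and set(words) <= VAGUE_STOCK:
        return "vague (lazy human?)"
    if len(words) > 60 and overlap == 0:
        return "verbose but off-topic (AI-assisted?)"
    return "pass to triangulation with other flags"

print(triage_open_end("good", {"checking", "account", "fees"}))
# vague (lazy human?)
long_answer = "The paradigm of modern synergy " * 15
print(triage_open_end(long_answer, {"checking", "account", "fees"}))
# verbose but off-topic (AI-assisted?)
```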

Summarizing Fraud-Driven Business Decision Risks

Our study was designed not only to understand the impact of fraudsters on research outcomes, but also to explore their impact on data quality and the business decisions that flow from these studies. If fraud goes unchecked, you run the risk of capturing an inaccurate view of competitors, a distorted view of your market presence/brand image/brand equity and suspect concept feedback. This, in turn, can misinform strategic business decisions, such as:

Competitive Strategy: an inaccurate read on competitors skews how you position against them
Brand Investment: a distorted view of market presence, brand image and brand equity misdirects brand spend
Go/No-Go & Capital Expenditure Decisions: suspect concept feedback can greenlight the wrong concepts and kill the right ones

Accurate Data & Reliable Insights Require a Multi-Dimensional Quality Strategy

As a leader in the industry-wide movement to improve online data quality, Ironwood has developed a customized, rigorous, multi-layered process for detecting and mitigating fraud: prevention, real-time and near-time countermeasures, and back-end data analysis.

Prevention
Ironwood vets sample partners using a triangulated set of six key quality indicators and actively evaluates study-to-study performance, providing performance feedback to online vendor partners.

Real-time Mitigation
We optimize our screener designs to include key detection and elimination countermeasures, such as IP duplication detection and visual traps.

Near-time Mitigation
In-survey detection and elimination based on traps and triggers, e.g., red herring questions, attention testing, speed traps/timers and response-consistency checks.

Back-end Analysis
Human analytical protocols, such as open-end review, speeding/straight-lining thresholds and analysis of demographics submitted from the same IP address. A simplified sketch of how these back-end checks can combine appears below.
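To show how these layers can combine, here is a minimal back-end sketch: it checks for duplicate IPs, a speeding threshold relative to the median interview length, and a straight-lining flag, then removes respondents who trip a cutoff number of countermeasures (recall that fraudsters in our experiment tripped an average of four of seven). Record fields, thresholds and the cutoff are hypothetical.

```python
# A minimal back-end scoring sketch combining several checks.
# Field names, the speed ratio and the flag cutoff are assumptions.
from collections import Counter

def backend_flags(records: list[dict], median_seconds: float,
                  speed_ratio: float = 0.4, flag_cutoff: int = 2) -> list[dict]:
    ip_counts = Counter(r["ip"] for r in records)
    reviewed = []
    for r in records:
        flags = []
        if ip_counts[r["ip"]] > 1:                      # duplicate IP check
            flags.append("duplicate IP")
        if r["seconds"] < speed_ratio * median_seconds:  # speeding threshold
            flags.append("speeder")
        if r.get("straight_lined"):                      # from battery analysis
            flags.append("straight-liner")
        reviewed.append({**r, "flags": flags,
                         "remove": len(flags) >= flag_cutoff})
    return reviewed

records = [
    {"id": 1, "ip": "203.0.113.7",  "seconds": 110, "straight_lined": False},
    {"id": 2, "ip": "203.0.113.7",  "seconds": 95,  "straight_lined": True},
    {"id": 3, "ip": "198.51.100.4", "seconds": 640, "straight_lined": False},
]
for row in backend_flags(records, median_seconds=600):
    print(row["id"], row["flags"], row["remove"])
# 1 ['duplicate IP', 'speeder'] True
# 2 ['duplicate IP', 'speeder', 'straight-liner'] True
# 3 [] False
```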


Consider Broadening your Arsenal of Sampling & Data Collection Methods

Online research can be overused by researchers and marketers compulsively chasing fast-turn, lowest-cost research solutions. As a result, we sometimes see the limits of online sampling stretched and the integrity of sampling and data compromised.

Ironwood has found that alternative or mixed methodologies, such as telephone, text-to-web (TTW), phone-to-web and mail-to-web, can be used effectively in place of, or in conjunction with, online sampling to capture more comprehensive and representative samples. This is particularly true for studies that focus on limited markets/geographies or narrowly defined target segments. These alternatives may, in the long run, be more cost-effective, and they can certainly be more quality-effective.


To get a copy of this CSE study, or to find out more about Ironwood's industry-leading fraud-prevention techniques and mixed methodology solutions, visit our website: Fighting Fire with Fire: Risks and Benefits of AI to Enhance Data Quality