OpenAI’s launch of ChatGPT in late 2022 revolutionized how the world creates content. Hundreds of artificial intelligence (AI) generation tools have flooded the market, allowing people to streamline or fully automate the creation of text, images, videos, audio, programming code, etc. Users simply need to enter a few guidelines (prompts) about what they want, and seconds later, they have an AI-generated result.
The rapid rise in these tools' popularity and sophistication has created an equally urgent need to detect whether content was generated using AI. For example:
The tech market has responded with a flurry of AI-detection solutions. However, a key question remains: Are AI detectors reliable enough to be widely adopted across different industries and content types? This article explores the accuracy, limitations, and practical applications of AI checkers, offering key insights to ensure fair and effective use.
AI detectors are tools designed to identify whether content (e.g., text, images, video, audio, and code) was generated using artificial intelligence. For writing, these solutions analyze syntax, word usage, and language patterns to determine whether an AI model, such as a GPT-based system, created the text. For other media, AI checkers assess visual or auditory cues, such as pixel patterns, speech intonation, or frame inconsistencies, to detect AI involvement. Similarly, they can analyze code for recognizable AI-generated patterns.
AI detectors have become increasingly popular where originality, authenticity, and authorship are essential. They come in various forms, from standalone platforms to features integrated within proctoring software, plagiarism checkers, content management systems, and security protocols.
Numerous AI detectors are available on the market. Some are designed for specific contexts, such as academic settings, while others are more general-purpose. Here are a few examples:
In short, no. AI checkers are not always accurate. They can produce false positives (i.e., flag human-written text as AI-generated) and false negatives (i.e., fail to detect AI content).
Understanding how AI content checkers work can explain their inconsistencies. Let’s focus on text detectors to keep it as simple as possible.
Most AI detectors rely on complex algorithms that analyze text patterns and compare them against known characteristics of AI-generated content. These patterns include sentence structure, word repetition, and the complexity of ideas presented.
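To make "text patterns" a little more concrete, here is a minimal Python sketch of the kind of surface signals a text detector might look at. It is a toy illustration, not how any commercial detector works: the function name `text_features` and the specific features are assumptions chosen for this example, and real tools rely on trained models rather than a handful of hand-picked statistics.

```python
import re
from statistics import mean, pstdev


def text_features(text: str) -> dict:
    """Compute a few toy surface features of a text (illustrative only)."""
    # Split into sentences wherever ., ! or ? is followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Lowercase word tokens for the whole text.
    words = re.findall(r"[a-z']+", text.lower())
    # Number of words in each sentence.
    lengths = [len(re.findall(r"[a-z']+", s.lower())) for s in sentences]
    return {
        "sentence_count": len(sentences),
        "mean_sentence_length": mean(lengths) if lengths else 0.0,
        # Spread of sentence lengths; very uniform sentences give a value near 0.
        "sentence_length_stdev": pstdev(lengths) if lengths else 0.0,
        # Share of distinct words; lower values mean more word repetition.
        "distinct_word_ratio": len(set(words)) / len(words) if words else 0.0,
    }


if __name__ == "__main__":
    sample = "Savannah cats are tall. They are also energetic! Do they like water?"
    for name, value in text_features(sample).items():
        print(f"{name}: {value}")
```

None of these numbers identifies AI writing on its own. Many human writers also produce uniform sentences or repeat words, which is one way false positives arise.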
Meanwhile, AI-generated content is designed to simulate human writing closely. Through machine learning and other innovations, the content these tools create keeps improving, making it increasingly difficult to detect consistently.
Think of it this way. If AI-content generators are in the infancy stage, then AI-content detectors are in the zygote stage. Both technologies are evolving quickly, but “detectors” are likely to keep playing catch-up to their “generator” brethren.
Experts agree AI detector accuracy depends on many factors, such as the model used, the version of AI it’s built to detect, the volume and quality of data used to train the AI, and the type and complexity of the content being analyzed.
Three current common reasons for false positives by AI text detectors include:
Given the current limitations of AI detectors, it’s easy to see the potential risks of using them for decision-making. That’s particularly true in high-stakes scenarios, such as passing or failing a student or hiring or disqualifying a job seeker.
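A short calculation shows why false positives matter so much in these scenarios. The Python sketch below applies Bayes' rule to estimate what share of flagged documents are actually AI-generated. Every number in it is hypothetical and chosen only for illustration; none is a measured rate for any real detector, and the function name is an assumption made for this example.

```python
def share_of_flags_that_are_ai(prevalence: float,
                               true_positive_rate: float,
                               false_positive_rate: float) -> float:
    """Return P(AI-generated | flagged) using Bayes' rule.

    prevalence: share of all submissions that really are AI-generated.
    true_positive_rate: share of AI-generated submissions the tool flags.
    false_positive_rate: share of human-written submissions the tool flags.
    """
    flagged_ai = prevalence * true_positive_rate
    flagged_human = (1 - prevalence) * false_positive_rate
    total_flagged = flagged_ai + flagged_human
    return flagged_ai / total_flagged if total_flagged else 0.0


if __name__ == "__main__":
    # Hypothetical scenario: 10% of submissions are AI-generated, the tool
    # catches 90% of them, and it wrongly flags 5% of human-written ones.
    share = share_of_flags_that_are_ai(0.10, 0.90, 0.05)
    print(f"Share of flagged submissions that are really AI: {share:.2f}")
```

Under these assumed numbers, roughly one in three flagged submissions would be human-written, even though a 5% false positive rate sounds small. When most submissions are written by humans, even a small error rate can produce many wrongly flagged people.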
Organizations considering using AI detectors should ask themselves important questions:
Some AI checkers present results as a confidence score (typically a percentage) indicating the likelihood of the content being AI-generated. Originality.ai uses this approach. If it scores a document as “90% AI, 10% Original,” the tool is 90% confident that the text was written by AI. It doesn’t mean that 90% of the content was AI-generated and only 10% was created by humans.
Other AI checkers score content based on the percentage of text in a document they think might have been AI-generated. Grammarly takes this approach: “50% of your document appears to be AI-generated (contains patterns often found in AI text).”
Anyone using an AI detector must understand how to interpret scores, that confidence thresholds vary across tools, and that ratings aren’t always reliable.
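The difference between the two scoring styles is easy to lose when results from several tools sit side by side. One way to keep them apart is to record what kind of score each number is. The Python sketch below does that; the `ScoreKind` and `DetectorScore` names and the `explain` function are hypothetical and do not correspond to any vendor's API.

```python
from dataclasses import dataclass
from enum import Enum


class ScoreKind(Enum):
    CONFIDENCE = "confidence"  # likelihood that the document is AI-generated
    PROPORTION = "proportion"  # share of the text that looks AI-generated


@dataclass
class DetectorScore:
    tool: str
    kind: ScoreKind
    value: float  # between 0.0 and 1.0


def explain(score: DetectorScore) -> str:
    """Spell out in plain language what a score does and does not mean."""
    pct = round(score.value * 100)
    if score.kind is ScoreKind.CONFIDENCE:
        return (f"{score.tool}: {pct}% confident the document is AI-generated "
                f"(this does not mean {pct}% of the text is AI-written).")
    return (f"{score.tool}: about {pct}% of the text appears AI-generated "
            f"(this is not a {pct}% chance that the whole document is AI-written).")


if __name__ == "__main__":
    print(explain(DetectorScore("Tool A", ScoreKind.CONFIDENCE, 0.90)))
    print(explain(DetectorScore("Tool B", ScoreKind.PROPORTION, 0.50)))
```

Storing the score type next to the number makes it harder to compare a 90% confidence rating with a 50% proportion rating as if they measured the same thing.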
Real-world examples are often the best way to understand a technology’s capabilities and limitations. Here’s a simple test we ran using two AI-detection tools to evaluate text we created.
Step 1: Created four summaries of roughly 130 words each about Savannah cats:
Step 2: Ran all four examples through Originality.ai and Grammarly to detect AI usage.
Step 3: Compared the results, which varied wildly between the two platforms and even within the same solution.
| Summary Authorship | Originality.ai | Grammarly |
| --- | --- | --- |
| 1. Human written | Likely Original (99% confidence) | 0% AI-generated text detected |
| 2. AI-generated | Likely AI (100% confidence) | 33% AI-generated text detected |
| 3. 50/50 w/ human intro | Likely AI (100% confidence) | 50% AI-generated text detected |
| 4. 50/50 w/ AI intro | Likely AI (97% confidence) | 0% AI-generated text detected |
Beyond its overall score, Originality.ai offers additional line-by-line confidence insights, using a color scale that ranges from dark green (likely human) to dark red (likely AI), with lighter shades for results that fall in between.
Summary 1 (100% human written)
Summary 2 (AI-generated)
Summary 3 (50/50 w/ human intro)
Summary 4 (50/50 w/ AI intro)
This basic test wasn’t designed to compare AI-detector solutions or show the strengths or weaknesses within an individual platform. Instead, it was meant to highlight some current limitations of AI-detector technology and why organizations should consider it as a single data point/perspective, not as the final arbiter in decisions.
Many organizations are using proctoring tools as an alternative (or complementary solution) to support academic integrity, employee competency testing, and similar needs.
Proctoring involves real-time monitoring of candidates or students during tests, interviews, or assessments to ensure they don’t use unauthorized assistance. For example, proctoring software like ExamSoft or ProctorU helps verify test-takers' authenticity by tracking behavior, monitoring screens, and flagging suspicious activity.
While proctoring does not explicitly target AI-generated content, it can help maintain fairness by ensuring individuals complete specific tasks (e.g., writing assessments) in a controlled environment.
While AI detectors provide some value, their potential for false positives and negatives means organizations should be cautious when relying on them for critical recruitment, education, and business decisions. The same is true for alternatives like proctoring.
Both approaches require human oversight and judgment that simply can’t be replaced by AI…at least for now.
Accuracy in AI detection refers to how well an AI checker can correctly distinguish between human and AI-generated content. Higher accuracy means the tool makes fewer mistakes and delivers more reliable results.
Employers use these tools to verify the authenticity of written content, such as resumes or work samples, and in pre-employment screenings. Inaccurate results could lead to a misjudgment of a candidate’s abilities, unfair hiring decisions, or legal risks.
The accuracy of AI text checkers can be influenced by several factors, including the quality of the training data, the complexity of the content being analyzed, and whether the AI detector can adapt to new language patterns. Tools also perform better when regularly updated with new datasets and algorithms.
AI detectors can be tested by running various human-generated and AI-generated content through them. Comparing the tools’ results with known data can help assess accuracy.
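As a rough sketch of what such a test could look like, the Python example below compares a detector's verdicts against samples whose origin is already known and reports accuracy along with false positive and false negative rates. The `evaluate` function and the sample data are hypothetical; in practice, the verdicts would come from whichever tool is being assessed.

```python
def evaluate(labels: list[bool], predictions: list[bool]) -> dict:
    """Compare detector verdicts with known authorship.

    In both lists, True means "AI-generated" and False means "human-written".
    """
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")

    pairs = list(zip(labels, predictions))
    true_pos = sum(1 for actual, pred in pairs if actual and pred)
    true_neg = sum(1 for actual, pred in pairs if not actual and not pred)
    false_pos = sum(1 for actual, pred in pairs if not actual and pred)
    false_neg = sum(1 for actual, pred in pairs if actual and not pred)

    human_total = true_neg + false_pos
    ai_total = true_pos + false_neg
    return {
        "accuracy": (true_pos + true_neg) / len(pairs) if pairs else 0.0,
        # Human-written samples wrongly flagged as AI.
        "false_positive_rate": false_pos / human_total if human_total else 0.0,
        # AI-generated samples the detector missed.
        "false_negative_rate": false_neg / ai_total if ai_total else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical test set: four AI-generated and four human-written samples.
    known = [True, True, True, True, False, False, False, False]
    verdicts = [True, True, True, False, True, False, False, False]
    print(evaluate(known, verdicts))
```

Reporting the two error rates separately matters because a single accuracy figure can hide which kind of mistake a tool tends to make. A larger and more varied sample set than this one would be needed before drawing conclusions about any tool.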
No, AI detection should not replace human judgment. While AI checkers provide valuable insights, human review remains essential for assessing context, nuances, and overall fairness. AI detectors should be used as support tools rather than the ultimate decision-maker.