Bot Scanner
What Is a Bot Scanner?
A bot scanner is a JavaScript-based detection system that runs in visitors’ browsers. It analyzes browser characteristics and behavior to identify automated tools, headless browsers, and bots.
How Bot Scanners Work
Bot Scanner uses a two-phase approach to maximize detection:

Page Load (0ms)
│
├── Phase 1: Immediate Detection
│   ├── Basic bot signals (webdriver, headless)
│   ├── CDP artifact detection
│   ├── Canvas fingerprint hash
│   ├── WebGL deep parameters
│   ├── Audio context fingerprint
│   └── Send detection if score > 20
│
▼
User Interaction (5 seconds)
│
├── Phase 2: Behavioral Analysis
│   ├── Mouse movement patterns
│   ├── Click timing analysis
│   ├── Scroll velocity patterns
│   ├── Keystroke dynamics
│   └── Send behavioral update
│
▼
Final Score Calculated

Why two phases?
- Phase 1 catches bots that leave immediately
- Phase 2 provides deep analysis for bots that stay
- Combined data gives the most accurate detection
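The two-phase flow above can be sketched as a pair of pure functions. This is only an illustration, not the scanner's actual code: the function names, the `{ name, points }` signal shape, and the cap at 100 are assumptions. The "send if score > 20" rule comes from the diagram.

```javascript
// Threshold taken from the diagram above: Phase 1 reports only if score > 20.
const PHASE1_SEND_THRESHOLD = 20;

// Sum the points of every signal that fired. Each signal is { name, points }.
function sumPoints(signals) {
  return signals.reduce((total, signal) => total + signal.points, 0);
}

// Phase 1: evaluated at page load on the immediate signals.
function runPhase1(immediateSignals) {
  const score = sumPoints(immediateSignals);
  return { score, shouldSend: score > PHASE1_SEND_THRESHOLD };
}

// Phase 2: evaluated after the behavioral window; adds to the Phase 1 score.
// Capping at 100 is an assumption based on the 0-100 ranges used elsewhere.
function runPhase2(phase1Result, behavioralSignals) {
  const combined = phase1Result.score + sumPoints(behavioralSignals);
  return { score: Math.min(combined, 100) };
}

// Example: a software renderer at load time, then instant clicks later.
const phase1 = runPhase1([{ name: "swiftshader_renderer", points: 30 }]);
const finalResult = runPhase2(phase1, [{ name: "instant_clicks", points: 30 }]);
console.log(phase1.shouldSend, finalResult.score);
```

Keeping the phases separate means a visitor who leaves before any interaction still produces a Phase 1 report.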
What Bot Scanners Detect
| Detection Type | What It Catches |
|---|---|
| WebDriver | Selenium, Puppeteer, Playwright automation |
| Headless Browsers | Chrome Headless, PhantomJS, Firefox Headless |
| AI Crawlers | GPTBot, ClaudeBot, ChatGPT-User |
| Browser Anomalies | Inconsistent navigator properties |
| Behavioral Patterns | Non-human mouse movements, instant form fills |
| Fingerprint Mismatches | Canvas, WebGL, font rendering inconsistencies |
| CDP Artifacts | ChromeDriver, Selenium, Puppeteer markers |
| Environment Integrity | MessageChannel, API mocking detection |
Bot Scanner vs. Server-Side Detection
| Aspect | Bot Scanner (Client) | Server-Side Detection |
|---|---|---|
| Where it runs | Visitor’s browser | Your server |
| What it sees | Browser internals | HTTP requests |
| Detection depth | Deep browser analysis | Headers, IP, patterns |
| Bypass difficulty | Harder to evade | Easier to spoof |
| Best for | Headless browsers, automation | Scanners, scrapers |
Expected Detection Rates
| AI Browser Type | Detection Rate |
|---|---|
| Stagehand + Browserbase | ~60-70% |
| Playwright + Stealth | ~75% |
| Basic Puppeteer | ~90% |
| Commercial anti-detect | ~40% |
Creating a Bot Scanner
Step-by-Step Guide

1. Navigate to Bot Scanners
   - Click Bot Scanners in the sidebar
2. Click “New Bot Scanner”
   - The create scanner dialog opens
3. Configure the Scanner

   | Field | Description | Example |
   |---|---|---|
   | Name | Internal identifier | “Main Website Scanner” |
   | Property | Associate with property | “Production Website” |
   | Sensitivity | Detection strictness | Medium |
   | Enabled | Active/inactive | Yes |

4. Click “Create”
   - Scanner is created with a unique ID
5. Install the Snippet
   - Copy the provided JavaScript snippet
   - Add it to your website (see Installing the Snippet)
Scanner Configuration Example
Name: Main Website Scanner
Property: Production Website
Sensitivity: Medium
Enabled: Yes

Detection Options:
✓ Detect automation
✓ Detect headless browsers
✓ Detect AI crawlers
✓ Detect behavioral anomalies
✓ Browser fingerprinting

Honeypot Options:
✓ Inject form honeypot
✓ Inject link honeypot

Detection Signals
Automation Detection
Detects browser automation frameworks:
| Framework | Detection Method |
|---|---|
| Selenium WebDriver | navigator.webdriver property |
| Puppeteer | Chrome DevTools Protocol traces |
| Playwright | Browser-specific markers |
| Cypress | Test runner indicators |
Enable when: You want to catch automated testing tools and bots.
Headless Browser Detection
Identifies browsers running without a visible UI:
| Signal | Description |
|---|---|
| Missing plugins | Headless browsers often have no plugins |
| Canvas fingerprint | Rendering differences |
| WebGL anomalies | Graphics processing inconsistencies |
| User agent hints | Client hints mismatches |
Enable when: Attackers use headless Chrome, PhantomJS, etc.
AI Crawler Detection
Identifies AI/LLM training crawlers:
| Bot | User Agent Pattern |
|---|---|
| GPTBot | GPTBot |
| ClaudeBot | ClaudeBot |
| Google-Extended | Google-Extended |
| PerplexityBot | PerplexityBot |
| CCBot | CCBot |
Enable when: You want to detect AI training data collection.
Behavioral Analysis
Bot Scanner tracks mouse movements, clicks, scrolls, and keystrokes to detect non-human patterns:
Mouse Movement Signals
| Signal | Points | What It Detects |
|---|---|---|
| Low mousemove count | +25 | Fewer than 10 mouse events (bots often skip mouse simulation) |
| Linear paths | +20 | Mouse moves in perfectly straight lines (humans curve) |
| Constant velocity | +15 | No speed variation (humans accelerate/decelerate) |
| Grid-aligned moves | +15 | Positions on exact coordinates (automation artifacts) |
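As a rough illustration of two of the rows above, the sketch below scores a list of recorded mouse events for "low mousemove count" and "linear paths". The point values come from the table. The event shape `{ x, y }`, the function names, and the 0.5px straightness tolerance are assumptions made for this example.

```javascript
// Points copied from the Mouse Movement Signals table.
const LOW_COUNT_POINTS = 25;
const LINEAR_PATH_POINTS = 20;

// True when every point lies (within tolerance) on the straight line
// running from the first recorded point to the last one.
function isLinearPath(events, tolerancePx = 0.5) {
  if (events.length < 3) return false; // too few points to judge
  const first = events[0];
  const last = events[events.length - 1];
  const dx = last.x - first.x;
  const dy = last.y - first.y;
  const length = Math.hypot(dx, dy);
  if (length === 0) return false; // cursor did not travel
  return events.every((p) => {
    // Perpendicular distance from p to the first->last line.
    const distance = Math.abs(dy * (p.x - first.x) - dx * (p.y - first.y)) / length;
    return distance <= tolerancePx;
  });
}

// Add up the points for the two signals this sketch covers.
function scoreMouseMovement(events) {
  let score = 0;
  if (events.length < 10) score += LOW_COUNT_POINTS;
  if (isLinearPath(events)) score += LINEAR_PATH_POINTS;
  return score;
}
```

Constant velocity and grid alignment could be checked the same way by adding timestamps to each event and comparing successive deltas.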
Click Signals
| Signal | Points | What It Detects |
|---|---|---|
| Instant clicks | +30 | No delay between mouse stop and click (humans have reaction time) |
| No pre-movement | +25 | Clicks without preceding mouse movement (teleporting cursor) |
Scroll Signals
| Signal | Points | What It Detects |
|---|---|---|
| Constant scroll velocity | +10 | Same speed throughout (humans vary) |
| Perfect scroll intervals | +10 | Exact timing between scroll events |
Keystroke Signals
| Signal | Points | What It Detects |
|---|---|---|
| Constant typing rhythm | +15 | No variation in keystroke timing |
| Superhuman typing speed | +20 | Less than 30ms between keystrokes |
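The keystroke rows above depend only on the timing between key events. A minimal sketch, assuming timestamps in milliseconds: "constant rhythm" is approximated here as a standard deviation below 5ms and "superhuman speed" as a mean interval below 30ms. The 30ms figure and the point values come from the table; the function names, the 5ms cutoff, and the minimum sample size are assumptions.

```javascript
// Convert a list of key event timestamps (ms) into gaps between keystrokes.
function keystrokeIntervals(timestamps) {
  const intervals = [];
  for (let i = 1; i < timestamps.length; i++) {
    intervals.push(timestamps[i] - timestamps[i - 1]);
  }
  return intervals;
}

// Score typing rhythm using only timing, never the keys themselves.
function scoreKeystrokes(timestamps, { minSamples = 5, maxStdDevMs = 5 } = {}) {
  const intervals = keystrokeIntervals(timestamps);
  if (intervals.length < minSamples) return 0; // not enough data to judge

  const mean = intervals.reduce((a, b) => a + b, 0) / intervals.length;
  const variance =
    intervals.reduce((acc, v) => acc + (v - mean) ** 2, 0) / intervals.length;
  const stdDev = Math.sqrt(variance);

  let score = 0;
  if (stdDev < maxStdDevMs) score += 15; // constant typing rhythm
  if (mean < 30) score += 20; // superhuman typing speed
  return score;
}
```

Because only intervals are stored, this matches the privacy note later in the page that no actual keystrokes are captured.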
Browser Fingerprinting
Builds a fingerprint from browser characteristics:
- Canvas rendering
- WebGL renderer
- Audio context
- Font enumeration
- Screen properties
- Timezone/language
Enable when: You want to track returning visitors and detect fingerprint anomalies.
WebGL Deep Fingerprinting
| Signal | Points | What It Detects |
|---|---|---|
| SwiftShader renderer | +30 | Software rendering (common in headless Chrome) |
| Mesa LLVMpipe renderer | +25 | Software rendering on Linux |
| No unmasked renderer | +15 | GPU info hidden (real browsers expose this) |
| Low extension count | +10 | Fewer than 10 WebGL extensions |
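The table above maps onto a simple scoring function. In a browser, the inputs would typically come from the `WEBGL_debug_renderer_info` extension (`UNMASKED_RENDERER_WEBGL`) and `gl.getSupportedExtensions()`. Here they are passed in as plain values so the logic can be read on its own. The function name and input shape are assumptions; the point values are from the table.

```javascript
// Score WebGL characteristics that have already been collected.
// unmaskedRenderer: string or null, extensionCount: number.
function scoreWebGL({ unmaskedRenderer, extensionCount }) {
  let score = 0;
  const renderer = (unmaskedRenderer || "").toLowerCase();

  if (renderer === "") {
    score += 15; // no unmasked renderer exposed
  } else if (renderer.includes("swiftshader")) {
    score += 30; // software rendering, common in headless Chrome
  } else if (renderer.includes("llvmpipe")) {
    score += 25; // Mesa software rendering on Linux
  }

  if (extensionCount < 10) score += 10; // low extension count
  return score;
}

console.log(scoreWebGL({ unmaskedRenderer: "Google SwiftShader", extensionCount: 25 }));
```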
Audio Fingerprinting
| Signal | Points | What It Detects |
|---|---|---|
| AudioContext unavailable | +15 | API missing or blocked |
| Zero audio fingerprint | +25 | Mocked AudioContext returns zero |
| Missing baseLatency | +10 | Chrome 74+ should have this property |
| Unusual sample rate | +10 | Not 44100 or 48000 Hz |
| Zero channel count | +15 | Invalid audio configuration |
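A sketch of how the audio rows above might be combined, assuming the audio properties have already been read into a plain object. The field names (`available`, `fingerprint`, `baseLatency`, `sampleRate`, `channelCount`) are hypothetical; the point values are from the table.

```javascript
// Sample rates the table treats as normal.
const COMMON_SAMPLE_RATES = [44100, 48000];

// Score audio characteristics collected elsewhere (hypothetical input shape).
function scoreAudio(audio) {
  // If the API is missing or blocked, no other property can be evaluated.
  if (!audio.available) return 15;

  let score = 0;
  if (audio.fingerprint === 0) score += 25; // mocked AudioContext returns zero
  if (audio.baseLatency === undefined) score += 10; // property missing
  if (!COMMON_SAMPLE_RATES.includes(audio.sampleRate)) score += 10; // unusual rate
  if (audio.channelCount === 0) score += 15; // invalid configuration
  return score;
}
```

Returning early for an unavailable API avoids also penalizing the same visitor for every property that could not be read.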
Canvas Fingerprinting
Bot Scanner generates a unique hash from canvas rendering:
- Draws specific shapes and text
- Uses specific fonts and colors
- Generates hash from the rendered output
- Compares against known patterns
Headless browsers often have distinct canvas fingerprints due to software rendering.
CDP Artifact Detection
Chrome DevTools Protocol (CDP) is the automation protocol used by ChromeDriver, Puppeteer, Playwright, and other browser automation tools. These tools inject identifiable artifacts that are extremely difficult to hide.
| Signal | Points | What It Detects |
|---|---|---|
| `cdc_` properties | +40 | ChromeDriver injects `cdc_*` prefixed global variables |
| `$cdc_` properties | +40 | Older ChromeDriver variants with `$cdc_*` prefix |
| `__webdriver_evaluate` | +30 | Selenium WebDriver evaluation artifacts |
| `__selenium_evaluate` | +30 | Direct Selenium markers |
| `__puppeteer_evaluation_script__` | +35 | Puppeteer script injection markers |
| `__fxdriver_evaluate` | +25 | Firefox WebDriver (Geckodriver) artifacts |
| `__cdp_binding__` | +40 | CDP runtime binding artifacts |
| `__chromium_protocol__` | +40 | Chromium protocol handler markers |
| Modified webdriver getter | +35 | Attempts to hide navigator.webdriver leave traces |
| CDP script injection | +30 | Scripts injected via Runtime.evaluate protocol |
Why CDP detection is highly reliable:
- Protocol-level injection - These artifacts are injected by the automation framework itself, not the browser
- Hard to remove - Removing them requires patching the automation tool’s source code
- Near-zero false positives - Normal browsers never have these properties
- Catches stealth attempts - Tools that try to hide `navigator.webdriver` often leave other CDP traces
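The property checks in the table above amount to scanning a global object for known names and prefixes. The sketch below takes the object to scan as a parameter (in a browser this would be `window`). The marker names and points are from the table. The function name and the choice to count each prefix family once are assumptions.

```javascript
// Exact property names and their points, from the CDP table.
const EXACT_MARKERS = {
  __webdriver_evaluate: 30,
  __selenium_evaluate: 30,
  __puppeteer_evaluation_script__: 35,
  __fxdriver_evaluate: 25,
  __cdp_binding__: 40,
  __chromium_protocol__: 40,
};

// Prefix-based markers: the suffix is random, only the prefix is stable.
const PREFIX_MARKERS = [
  { prefix: "cdc_", points: 40 },
  { prefix: "$cdc_", points: 40 },
];

// Return the list of artifact signals found on the given object.
function findCdpArtifacts(globalObject) {
  const names = Object.getOwnPropertyNames(globalObject);
  const hits = [];

  for (const [name, points] of Object.entries(EXACT_MARKERS)) {
    if (names.includes(name)) hits.push({ signal: name, points });
  }
  for (const { prefix, points } of PREFIX_MARKERS) {
    // Assumption: count each prefix family once, however many properties match.
    if (names.some((n) => n.startsWith(prefix))) {
      hits.push({ signal: `${prefix}*`, points });
    }
  }
  return hits;
}
```

Passing the object in, rather than reading `window` directly, keeps the function testable outside a browser.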
Example detection:
// ChromeDriver leaves these artifacts:
window.cdc_adoQpoasnfa76pfcZLmcfl_Array   // Random but always cdc_ prefixed
window.cdc_adoQpoasnfa76pfcZLmcfl_Promise // Multiple cdc_ properties

// Selenium WebDriver leaves:
window.__webdriver_evaluate // Evaluation function
window.__driver_unwrapped   // Unwrapped driver reference

// Puppeteer leaves:
window.__puppeteer_evaluation_script__ // Script injection marker

Environment Integrity Checks
Bot Scanner performs additional environment integrity checks that verify browser APIs behave correctly. Automation tools sometimes incorrectly mock or break these APIs.
MessageChannel Communication Test
The MessageChannel API enables communication between different browsing contexts. Some automation frameworks incorrectly implement or break this API.
| Signal | Points | What It Detects |
|---|---|---|
| MessageChannel timeout | +10* | Message not received within 100ms |
| MessageChannel error | +10* | API throws error or is unavailable |
| MessageChannel exception | +10* | Cannot create MessageChannel |
*Weak signal: Only counted when combined with other strong bot indicators. This prevents false positives from legitimate environments that restrict MessageChannel (some sandboxed iframes, older browsers).
Why this detection works:
- Real browsers have full MessageChannel support
- Automation frameworks sometimes mock MessageChannel incorrectly
- The test is fast (100ms timeout) and non-blocking
- Very low performance impact
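One way to picture this check: send a message through a `MessageChannel`, wait up to 100ms for it to arrive, and count the result as a weak signal only when a strong signal is also present. The sketch below is illustrative rather than the scanner's actual code. The function names, the result strings, and treating every non-weak signal as "strong" are assumptions. The 100ms timeout and the +10 weak-signal rule come from the table and footnote above.

```javascript
// Resolve with "ok", "timeout", "error", or "exception" after a round trip.
function messageChannelRoundTrip(timeoutMs = 100) {
  return new Promise((resolve) => {
    let channel;
    try {
      channel = new MessageChannel();
    } catch (err) {
      resolve("exception"); // cannot create MessageChannel
      return;
    }
    const finish = (result) => {
      clearTimeout(timer);
      channel.port1.close();
      channel.port2.close();
      resolve(result);
    };
    const timer = setTimeout(() => finish("timeout"), timeoutMs);
    channel.port1.onmessage = () => finish("ok");
    try {
      channel.port2.postMessage("ping");
    } catch (err) {
      finish("error"); // API threw while sending
    }
  });
}

// Signals are { name, points, weak }. Weak signals only count when at
// least one strong (non-weak) signal also fired.
function totalScore(signals) {
  const sum = (list) => list.reduce((total, s) => total + s.points, 0);
  const strong = signals.filter((s) => !s.weak);
  const weak = signals.filter((s) => s.weak);
  return sum(strong) + (strong.length > 0 ? sum(weak) : 0);
}
```

With this gating, a sandboxed iframe that fails the round trip but shows no other signals still scores 0.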
Honeypot Injection
Bot scanners can automatically inject honeypot elements into your pages.
Form Honeypot
Adds hidden form fields that humans can’t see or fill:

<!-- Injected automatically by bot scanner -->
<input type="text" name="website_url" style="position:absolute;left:-9999px" tabindex="-1" autocomplete="off">

| Behavior | Result |
|---|---|
| Field is empty | Likely human |
| Field has value | Almost certainly a bot (browser autofill can occasionally fill hidden fields) |
Best for: Contact forms, signup forms, comment sections.
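If you also want to check the honeypot field in your own form handler, a minimal sketch could look like the following. The field name `website_url` is taken from the injected markup above. The function name and the plain-object form representation are assumptions, and nothing here is part of the scanner's API.

```javascript
// Name of the injected honeypot field, as shown in the markup above.
const HONEYPOT_FIELD = "website_url";

// fields: plain object of submitted form values, e.g. a parsed request body.
function isHoneypotTripped(fields) {
  const value = fields[HONEYPOT_FIELD];
  // A human never sees the field, so any non-blank value is suspicious.
  return typeof value === "string" && value.trim() !== "";
}

console.log(isHoneypotTripped({ email: "ada@example.com", website_url: "" }));
```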
Link Honeypot
Adds hidden links that only bots follow:

<!-- Injected automatically by bot scanner -->
<a href="/trap-path-abc123" style="display:none;visibility:hidden">
  Secret Link
</a>

| Behavior | Result |
|---|---|
| Link not clicked | Normal user |
| Link is followed | Bot or crawler |
Best for: Any page where you want crawler detection.
Honeypot Configuration
| Option | Description |
|---|---|
| Inject into forms | Add hidden fields to all forms |
| Inject links | Add hidden links to page footer |
| Custom field names | Use realistic-looking field names |
| Injection frequency | Every page, random pages, specific pages |
Sensitivity Levels
The sensitivity level determines how strictly the scanner scores visitors.
Low Sensitivity
Score threshold: 70+ to flag as bot
False positives: Very rare
Detection rate: Catches obvious bots
Best for:
- Sites with privacy-conscious users
- When false positives are unacceptable
- Initial testing
Detects:
- Obvious automation (WebDriver present)
- Known headless browsers
- Honeypot interactions
Medium Sensitivity (Recommended)
Score threshold: 50+ to flag as suspicious
False positives: Rare
Detection rate: Good balance
Best for:
- Most websites
- Production environments
- General protection
Detects:
- Everything in Low, plus:
- Browser inconsistencies
- Behavioral anomalies
- Fingerprint mismatches
High Sensitivity
Score threshold: 30+ to flag as suspicious
False positives: Possible
Detection rate: Maximum detection
Best for:
- High-security applications
- Financial services
- When false positives are acceptable
Detects:
- Everything in Medium, plus:
- Subtle automation indicators
- Minor behavioral differences
- Edge-case browser configurations
Sensitivity Comparison
| Sensitivity | Score Range Flagged | False Positive Risk | Bot Detection |
|---|---|---|---|
| Low | 70-100 | Very Low | Basic |
| Medium | 50-100 | Low | Good |
| High | 30-100 | Medium | Maximum |
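The comparison table reduces to one threshold per sensitivity level. A small sketch, using the 70/50/30 thresholds from the table; the function name and the lowercase level keys are assumptions.

```javascript
// Minimum score that gets flagged at each sensitivity level (from the table).
const THRESHOLDS = { low: 70, medium: 50, high: 30 };

// True when a score falls inside the flagged range for the given level.
function isFlagged(score, sensitivity = "medium") {
  const threshold = THRESHOLDS[sensitivity];
  if (threshold === undefined) {
    throw new RangeError(`Unknown sensitivity: ${sensitivity}`);
  }
  return score >= threshold;
}

// The same visitor score can be flagged or not depending on the setting.
console.log(isFlagged(55, "low"), isFlagged(55, "medium"), isFlagged(55, "high"));
```

A score of 55, for instance, passes at Low but is flagged at Medium and High, which is why changing sensitivity is worth testing on staging first.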
Installing the Bot Scanner Snippet
Getting Your Snippet
- Go to Bot Scanners
- Find your scanner in the list
- Click Copy Snippet (or the copy icon)
Snippet Format
<script async
  src="https://cdn.webdecoy.com/bot-detection/v1/pro/bot-detection-pro.min.js"
  data-aid="your-organization-uuid"
  data-sid="your-scanner-uuid">
</script>

Snippet Attributes
| Attribute | Required | Description |
|---|---|---|
| `src` | Yes | CDN URL for scanner |
| `data-aid` | Yes | Your organization UUID |
| `data-sid` | Yes | Your bot scanner UUID |
| `data-endpoint` | No | Custom ingest endpoint (default: https://ingest.webdecoy.com/api/v1/detect) |
| `data-exclude-paths` | No | Paths to skip (comma-separated) |
| `data-sample-rate` | No | Percentage of visitors to scan (1-100) |
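To make the two optional attributes concrete, here is one plausible reading of how `data-exclude-paths` and `data-sample-rate` could be evaluated. The exact matching rules are not documented above, so prefix matching, the clamping to 1-100, and the function names are all assumptions for illustration.

```javascript
// Split a comma-separated attribute value into trimmed, non-empty paths.
function parseExcludePaths(attributeValue) {
  return (attributeValue || "")
    .split(",")
    .map((path) => path.trim())
    .filter(Boolean);
}

// Decide whether a page view should be scanned.
// `random` is injectable so the sampling decision can be tested.
function shouldScan(pathname, { excludePaths = "", sampleRate = "100" } = {}, random = Math.random) {
  // Assumption: a path is excluded when it starts with any listed prefix.
  const excluded = parseExcludePaths(excludePaths).some((p) => pathname.startsWith(p));
  if (excluded) return false;

  const rate = Number(sampleRate);
  // Assumption: invalid values fall back to 100, others are clamped to 1-100.
  const percent = Number.isFinite(rate) ? Math.min(Math.max(rate, 1), 100) : 100;
  return random() * 100 < percent;
}

console.log(shouldScan("/admin/users", { excludePaths: "/admin, /health" }));
```

Attribute values are passed as strings because that is how `data-*` attributes arrive from the DOM.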
CDN URLs
| Version | URL |
|---|---|
| Minified | https://cdn.webdecoy.com/bot-detection/v1/pro/bot-detection-pro.min.js |
| Source | https://cdn.webdecoy.com/bot-detection/v1/pro/bot-detection-pro.js |
Installation Methods
Method 1: Direct HTML
Add the snippet before the closing </body> tag:

<!DOCTYPE html>
<html>
<head>
  <title>Your Site</title>
</head>
<body>
  <!-- Your content -->

  <!-- WebDecoy Bot Scanner -->
  <script async
    src="https://cdn.webdecoy.com/bot-detection/v1/pro/bot-detection-pro.min.js"
    data-aid="your-organization-uuid"
    data-sid="your-scanner-uuid">
  </script>
</body>
</html>

Method 2: Google Tag Manager
- Create a new Custom HTML tag
- Paste the snippet
- Set trigger to All Pages
- Publish the container
Method 3: WordPress (Manual)
Add to your theme’s footer.php:

<?php if (!is_admin()) : ?>
<script async
  src="https://cdn.webdecoy.com/bot-detection/v1/pro/bot-detection-pro.min.js"
  data-aid="your-organization-uuid"
  data-sid="your-scanner-uuid">
</script>
<?php endif; ?>

Method 4: React/Next.js

// _app.js or layout.js
import Script from 'next/script';

export default function App({ Component, pageProps }) {
  return (
    <>
      <Component {...pageProps} />
      <Script
        src="https://cdn.webdecoy.com/bot-detection/v1/pro/bot-detection-pro.min.js"
        data-aid="your-organization-uuid"
        data-sid="your-scanner-uuid"
        strategy="afterInteractive"
      />
    </>
  );
}

Method 5: Vue.js

<template>
  <div id="app">
    <router-view />
  </div>
</template>

<script>
export default {
  mounted() {
    const script = document.createElement('script');
    script.src = 'https://cdn.webdecoy.com/bot-detection/v1/pro/bot-detection-pro.min.js';
    script.setAttribute('data-aid', 'your-organization-uuid');
    script.setAttribute('data-sid', 'your-scanner-uuid');
    script.async = true;
    document.body.appendChild(script);
  }
};
</script>

Verifying Installation
- Load your website in a browser
- Open Developer Tools (F12)
- Go to the Network tab
- Look for the `bot-detection-pro.min.js` request
- Check Console for `[WebDecoy]` messages
Managing Bot Scanners
Viewing Scanner List
Go to Bot Scanners to see all scanners:
| Column | Description |
|---|---|
| Name | Scanner identifier |
| Enabled | Active status toggle |
| Methods | HTTP methods monitored |
| Created | Creation date |
| Actions | Edit, delete, copy snippet |
Enabling/Disabling a Scanner
- Find the scanner in the list
- Toggle the Enabled switch
- Scanner is immediately active/inactive
Editing a Scanner
- Click the menu (three dots)
- Select Edit
- Modify settings
- Click Save
Deleting a Scanner
- Click the menu (three dots)
- Select Delete
- Confirm deletion
- Scanner and snippet stop working immediately
Best Practices
- ✅ Start with Medium sensitivity
- ✅ Enable honeypot injection
- ✅ Test on staging before production
- ✅ Monitor false positive rates
- ✅ Combine with server-side detection
Don’ts
- ❌ Use High sensitivity without testing
- ❌ Block users based solely on scanner results
- ❌ Install multiple scanners on the same page
- ❌ Forget to update snippet when changing scanners
Recommended Configuration
Detection Options:
✓ Detect automation - Essential
✓ Detect headless - Essential
✓ Detect AI crawlers - Recommended
✓ Behavioral analysis - Recommended
✓ Fingerprinting - Optional (privacy considerations)

Honeypot Options:
✓ Form honeypot - Highly recommended
✓ Link honeypot - Recommended

Privacy Considerations
Bot Scanner collects data for detection purposes:
| Data Type | Collected | Purpose |
|---|---|---|
| Browser properties | ✅ | Basic detection |
| Mouse coordinates | ✅ | Movement pattern analysis |
| Click positions | ✅ | Click behavior analysis |
| Scroll positions | ✅ | Scroll pattern analysis |
| Keystroke timing | ✅ | Typing rhythm analysis |
| Canvas fingerprint | ✅ | Rendering consistency |
| WebGL parameters | ✅ | Hardware fingerprinting |
| Audio fingerprint | ✅ | Audio context verification |
Important:
- Data is used solely for bot detection scoring
- No actual keystrokes are captured (only timing intervals)
- Data is not used for user tracking or advertising
- Consider your privacy policy when deploying
Troubleshooting
Detection not sending
- Check browser console for `[WebDecoy]` messages
- Verify `data-aid` and `data-sid` attributes are set correctly
- Ensure score threshold (20) is exceeded
- Check network tab for requests to ingest endpoint
Low detection rate
- Ensure behavioral phase has time to collect data (users need 5+ seconds on page)
- Check that users interact with page (mouse movement, scrolling)
- Review detection metadata to see which signals are triggering
- Consider if bots are leaving before Phase 2 completes
CORS errors
Scripts are served with permissive CORS headers. If you see CORS errors:
- Ensure you’re loading from `cdn.webdecoy.com`
- Check if a proxy or CDN is stripping headers
- Verify no browser extensions are blocking requests
Next Steps
- Integrations - Connect to Cloudflare, Slack, webhooks
- Threat Scoring - Understand how scores are calculated
- Decoy Links - Set up server-side detection