Threat Scoring

WebDecoy uses a deception-first scoring system to evaluate every visitor and request. This page explains how scores are calculated, what each category detects, and how to interpret the results.

Overview

The Threat Score is a number from 0-100 that measures the likelihood a visitor is automated or malicious. It combines the scores from 8 detection categories as a weighted sum, so the most reliable categories contribute the most to the final score.

┌─────────────────────────────────────────────────────────────────┐
│                    THREAT SCORING PIPELINE                       │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│   Visitor Request                                                │
│         │                                                        │
│         ▼                                                        │
│   ┌───────────────────────────────────────────────────────┐     │
│   │              8 DETECTION CATEGORIES                    │     │
│   │  ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐      │     │
│   │  │Honeypot │ │ Attack  │ │Fingerprint│ │Behavior │     │     │
│   │  │  40%    │ │  25%    │ │   12%    │ │  10%    │      │     │
│   │  └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘      │     │
│   │       │           │           │           │            │     │
│   │  ┌────┴────┐ ┌────┴────┐ ┌────┴────┐ ┌────┴────┐      │     │
│   │  │  TLS   │ │   IP    │ │ Headers │ │  User   │      │     │
│   │  │   7%   │ │   3%    │ │   2%    │ │ Agent 1%│      │     │
│   │  └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘      │     │
│   └───────┼───────────┼───────────┼───────────┼───────────┘     │
│           │           │           │           │                  │
│           └───────────┴─────┬─────┴───────────┘                  │
│                             │                                    │
│                             ▼                                    │
│                   ┌─────────────────┐                            │
│                   │  WEIGHTED SUM   │                            │
│                   │   = Final Score │                            │
│                   └────────┬────────┘                            │
│                            │                                     │
│                            ▼                                     │
│            ┌───────────────────────────────┐                     │
│            │  Threat Score: 0-100          │                     │
│            │  + Category Classification    │                     │
│            │  + Confidence Percentage      │                     │
│            └───────────────────────────────┘                     │
└─────────────────────────────────────────────────────────────────┘

Key Principles

Principle	Description
Deception-First	Honeypot signals are weighted highest because legitimate users never interact with hidden traps
Weighted Averages	Each category score is multiplied by its weight before summing, so categories do not count equally
Defense in Depth	Multiple signals provide stronger evidence than any single indicator
Low False Positives	Easily-spoofed signals (User-Agent, headers) have minimal impact

How Scoring Works

The scoring calculation follows a simple three-step process:

Step 1: Category Scoring

Each detection category analyzes incoming requests and produces a score from 0-100:

Category Score = Analysis of signals within that category
                 (0 = no suspicious signals, 100 = maximum suspicion)

Step 2: Weight Application

Each category score is multiplied by its weight percentage:

Weighted Score = Category Score × Weight Percentage

Step 3: Final Calculation

All weighted scores are summed to produce the final Threat Score:

Final Score = Sum of all Weighted Scores

Example Calculation

Consider a visitor who triggered some honeypot signals and has suspicious headers:

Category	Raw Score	Weight	Weighted Score
Honeypot Signals	30	× 40%	= 12.0
Attack Signatures	0	× 25%	= 0
Browser Fingerprint	0	× 12%	= 0
Behavioral Analysis	0	× 10%	= 0
TLS Fingerprint	0	× 7%	= 0
IP Reputation	0	× 3%	= 0
HTTP Headers	15	× 2%	= 0.3
User Agent	0	× 1%	= 0
		Total:	12.3

Result: This visitor receives a Threat Score of 12.3, or about 12 as a whole number (MINIMAL level).
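
The three steps and the example above can be sketched in a few lines of JavaScript. This is a minimal illustration, not WebDecoy's actual implementation: the function name `computeThreatScore` and the category keys are hypothetical, and weights are kept as whole percentages so the arithmetic stays exact.

```javascript
// Weights in whole percent, as listed in this document (they sum to 100).
// The key names are hypothetical, chosen for this sketch.
const WEIGHTS = {
  honeypot: 40,
  attack: 25,
  fingerprint: 12,
  behavior: 10,
  tls: 7,
  ip: 3,
  headers: 2,
  userAgent: 1,
};

// categoryScores maps a category key to its 0-100 score (Step 1).
// Categories that are not present are treated as 0.
function computeThreatScore(categoryScores) {
  let weightedTotal = 0;
  for (const [category, weight] of Object.entries(WEIGHTS)) {
    const raw = categoryScores[category] ?? 0;
    const clamped = Math.min(100, Math.max(0, raw));
    weightedTotal += clamped * weight; // Step 2: apply the weight
  }
  return weightedTotal / 100; // Step 3: sum, converted back from percent
}

// The example visitor: honeypot score 30, header score 15.
console.log(computeThreatScore({ honeypot: 30, headers: 15 })); // 12.3
```

The two non-zero contributions are 12.0 (honeypot) and 0.3 (headers), which is why a low-weight category barely moves the result.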

Detection Categories & Weights

Categories are weighted by reliability. High-confidence signals like honeypot triggers contribute more than easily-spoofed signals like User-Agent strings.

Weight Distribution

Priority	Category	Weight	Why This Weight
Highest	Honeypot Signals	40%	Core deception signal - legitimate users never trigger these
High	Attack Signatures	25%	Active exploitation attempts are clear malicious intent
Medium	Browser Fingerprint	12%	Automation tools have detectable anomalies
Medium	Behavioral Analysis	10%	Non-human patterns are reliable indicators
Medium	TLS Fingerprint	7%	JA3/JA4 fingerprints are hard to spoof
Low	IP Reputation	3%	High false-positive potential (VPN users)
Low	HTTP Headers	2%	Easily spoofed by sophisticated bots
Low	User Agent	1%	Trivially spoofed, catches only obvious cases

Visual Weight Comparison

Honeypot Signals    ████████████████████████████████████████  40%
Attack Signatures   █████████████████████████                  25%
Browser Fingerprint ████████████                               12%
Behavioral Analysis ██████████                                 10%
TLS Fingerprint     ███████                                     7%
IP Reputation       ███                                         3%
HTTP Headers        ██                                          2%
User Agent          █                                           1%
                    ─────────────────────────────────────────
                    0%       25%       50%       75%      100%

Category Details

Honeypot Signals (40% Weight)

Priority: Highest

Detects visitors who access hidden decoy links, fill invisible form fields, or interact with trap endpoints.

Signal Type	Description	Score Impact
Decoy link access	Hidden link followed	+60-90
Hidden field filled	Invisible form field populated	+50-80
Fake API endpoint hit	Trap endpoint accessed	+70-95
Trap path accessed	Honeypot URL visited	+65-90
Multiple honeypots triggered	Several traps hit	+85-100

Why it matters: Legitimate users never see or interact with these hidden elements. A trigger here is strong evidence of automated scanning or malicious reconnaissance. This is the cornerstone of WebDecoy’s deception-first approach.
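
As an illustration of the hidden-field signal, the sketch below checks a submitted form for values in fields that a human visitor would never see. The field names and the `filledTrapFields` helper are hypothetical and are not part of WebDecoy's API; a real deployment would hide these inputs with CSS and choose its own names.

```javascript
// Hypothetical trap field names, assumed to be hidden from human visitors.
const TRAP_FIELDS = ["website_url", "fax_number"];

// Returns the names of trap fields that arrived with a non-empty value.
function filledTrapFields(formData, trapFields = TRAP_FIELDS) {
  return trapFields.filter((name) => {
    const value = formData[name];
    return typeof value === "string" && value.trim() !== "";
  });
}

// A bot that fills every input it finds trips the trap.
const submission = { email: "visitor@example.com", website_url: "http://spam.example" };
console.log(filledTrapFields(submission)); // [ 'website_url' ]
```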

Attack Signatures (25% Weight)

Priority: High

Identifies known attack patterns in request payloads including injection attempts and exploitation techniques.

Attack Type	Pattern Examples	Score Impact
SQL Injection	`' OR '1'='1`, `UNION SELECT`, `; DROP TABLE`	+70-90
Cross-site Scripting (XSS)	`<script>`, `javascript:`, `onerror=`	+60-85
Command Injection	`; cat /etc/passwd`, `| ls -la`, `` `whoami` ``
Path Traversal	`../../../etc/passwd`, `....//....//`	+55-75
XXE	`<!ENTITY xxe SYSTEM`, `file:///etc/`	+70-90
LDAP Injection	`)(cn=`, `)(uid=*))(|(uid=*`
NoSQL Injection	`{"$gt": ""}`, `{"$ne": null}`	+60-80

Why it matters: These are direct indicators of malicious intent. These patterns are rarely seen in legitimate traffic and represent active exploitation attempts.

Example Detection:

POST /api/users/login HTTP/1.1
Content-Type: application/json

{"email": "[email protected]", "password": "' OR '1'='1' --"}

→ Attack Signature Score: 85 (SQL Injection detected)
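
A rough sketch of pattern-based detection follows, using a handful of the patterns from the table above. The `SIGNATURES` list and `matchAttackTypes` function are assumptions made for illustration only. A production detector would also decode and normalize the payload first, and this sketch does not show how WebDecoy turns matches into a score.

```javascript
// A small subset of the patterns from the table, expressed as regular expressions.
// No "g" flag is used, so RegExp.prototype.test keeps no state between calls.
const SIGNATURES = [
  { type: "SQL Injection", pattern: /'\s*OR\s*'1'\s*=\s*'1|UNION\s+SELECT|;\s*DROP\s+TABLE/i },
  { type: "Cross-site Scripting (XSS)", pattern: /<script|javascript:|onerror\s*=/i },
  { type: "Path Traversal", pattern: /\.\.\// },
  { type: "XXE", pattern: /<!ENTITY\s+\w+\s+SYSTEM|file:\/\/\/etc\//i },
];

// Returns the attack types whose pattern appears anywhere in the payload.
function matchAttackTypes(payload) {
  return SIGNATURES
    .filter(({ pattern }) => pattern.test(payload))
    .map(({ type }) => type);
}

console.log(matchAttackTypes(`{"password": "' OR '1'='1' --"}`)); // [ 'SQL Injection' ]
```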

Browser Fingerprint (12% Weight)

Priority: Medium

Analyzes client-side fingerprinting data to detect headless browsers, automation tools, or spoofed environments.

Signal	Detection Method	Score Impact
WebDriver detected	`navigator.webdriver = true`	+60-80
Missing plugins	No plugins array or empty	+30-50
Canvas anomaly	Canvas fingerprint doesn’t match browser	+40-60
Headless browser markers	Chrome headless signatures	+55-75
WebGL inconsistency	GPU fingerprint mismatch	+35-55
Timezone mismatch	Browser timezone vs IP geolocation	+25-40
Language mismatch	Browser language vs expected	+20-35
CDP artifacts	ChromeDriver `cdc_` properties, Selenium evaluation markers	+70-90
Lie/tampering	Native function `toString()` modified (stealth plugins)	+45 max
Worker mismatch	Main thread vs Web Worker navigator differs	+40 max
Canvas pixel noise	Anti-fingerprinting noise in canvas output	+30

Why it matters: Automation tools often have telltale fingerprint anomalies that are difficult to fake convincingly. While sophisticated bots can spoof some signals, maintaining consistent fingerprints across all dimensions is challenging.

Example Detection:

// Detected anomalies:
navigator.webdriver = true           // WebDriver flag set
navigator.plugins.length = 0         // No plugins (unusual)
canvas.toDataURL() = [headless hash] // Known headless signature
window.cdc_adoQpoasnfa76pfcZLmcfl_Array // ChromeDriver CDP artifact

→ Fingerprint Score: 82
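
Three of the signals above can be checked with a few lines of client-side JavaScript. The sketch below is illustrative only: `collectAutomationMarkers` is a hypothetical helper, and it takes navigator-like and window-like objects as arguments so that it can be exercised outside a browser. In a page you would pass `navigator` and `window`.

```javascript
// Collects a few well-known automation markers from the given objects.
function collectAutomationMarkers(nav, win) {
  const markers = [];
  // navigator.webdriver is true when the browser is under WebDriver control.
  if (nav.webdriver === true) markers.push("webdriver");
  // A missing or empty plugins list is unusual for a desktop browser.
  if (!nav.plugins || nav.plugins.length === 0) markers.push("no-plugins");
  // ChromeDriver has historically injected properties prefixed with "cdc_".
  if (Object.keys(win).some((key) => key.startsWith("cdc_"))) markers.push("cdc-artifact");
  return markers;
}

// Mock objects standing in for an automated browser session.
const mockNavigator = { webdriver: true, plugins: [] };
const mockWindow = { cdc_adoQpoasnfa76pfcZLmcfl_Array: [] };
console.log(collectAutomationMarkers(mockNavigator, mockWindow));
// [ 'webdriver', 'no-plugins', 'cdc-artifact' ]
```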

Behavioral Analysis (10% Weight)

Priority: Medium

Examines interaction patterns including mouse movements, keyboard input, scroll behavior, and navigation timing.

Signal	What It Detects	Score Impact
No mouse movement	Zero cursor events recorded	+40-60
Impossible timing	Actions faster than human capability	+50-70
Linear navigation	Perfectly straight mouse paths	+35-50
Missing scroll events	No scrolling on long pages	+25-40
Instant form submission	Form submitted in <500ms	+45-65
No keyboard patterns	Keys pressed without natural rhythm	+30-45
Robotic click patterns	Clicks at exact same coordinates	+40-55

Why it matters: Bots typically exhibit non-human behavior patterns—too fast, too uniform, or missing expected interactions. Real humans have micro-movements, variable timing, and natural browsing patterns that are difficult to simulate perfectly.

Example Detection:

Session Analysis:
├── Mouse events: 0 (expected: 50-200 for page complexity)
├── Time on page: 0.3s (submitted form)
├── Scroll depth: 0% (form below fold)
└── Keyboard rhythm: N/A (no typing detected)

→ Behavior Score: 68
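
The session above can be expressed as a small set of checks. This sketch assumes a hypothetical `session` object whose field names are invented here, and it only flags signals; how each flag contributes to the Behavior Score is not shown.

```javascript
// Flags a few of the behavioral signals from the table above.
function behaviorFlags(session) {
  const flags = [];
  if (session.mouseEvents === 0) flags.push("no-mouse-movement");
  // The table treats a form submitted in under 500 ms as instant.
  if (session.formSubmitted && session.timeOnPageMs < 500) flags.push("instant-form-submission");
  if (session.pageIsLong && session.scrollDepthPercent === 0) flags.push("missing-scroll-events");
  return flags;
}

// The example session: no mouse events, form submitted after 0.3 s, no scrolling.
console.log(behaviorFlags({
  mouseEvents: 0,
  formSubmitted: true,
  timeOnPageMs: 300,
  pageIsLong: true,
  scrollDepthPercent: 0,
}));
// [ 'no-mouse-movement', 'instant-form-submission', 'missing-scroll-events' ]
```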

TLS Fingerprint (7% Weight)

Priority: Medium

Uses JA3/JA4 fingerprinting to identify the TLS client implementation and match against known automation tools.

TLS Signature	What It Indicates	Score Impact
Known bot signature	Matches Scrapy, Selenium, etc.	+50-70
curl fingerprint	Request from curl library	+45-65
Python requests	Common in scripts/bots	+40-60
Headless Chrome TLS	Differs from regular Chrome	+35-55
Go HTTP client	Often used in scanners	+40-55
Node.js fetch	Server-side requests	+30-45
Mismatched TLS/UA	TLS says Python, UA says Chrome	+55-75

Why it matters: TLS fingerprints are hard to spoof because they’re generated at the protocol level before any application code runs. They reliably identify curl, wget, Python requests, and headless browsers even when User-Agent strings are spoofed.

JA3 Fingerprint Example:

TLS Handshake Analysis:
├── Cipher Suites: [specific order unique to client]
├── Extensions: [TLS extensions and order]
├── Curves: [supported elliptic curves]
└── Point Formats: [EC point formats]

JA3 String: 769,47-53-5-10-49161-49162-49171-49172...
Match: Python/requests 2.28.x

→ TLS Score: 58
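
For reference, a JA3 string joins five handshake fields with commas (TLS version, cipher suites, extensions, elliptic curves, point formats), with the values inside each field joined by hyphens; the JA3 hash is the MD5 digest of that string. The sketch below builds the string and adds a simple TLS/User-Agent mismatch check. Both function names are hypothetical, and `tlsClient` is assumed to be a label that some fingerprint lookup has already produced.

```javascript
// Builds a JA3 string from numeric handshake fields.
function buildJa3String({ version, ciphers, extensions, curves, pointFormats }) {
  return [
    String(version),
    ciphers.join("-"),
    extensions.join("-"),
    curves.join("-"),
    pointFormats.join("-"),
  ].join(",");
}

// Returns true when the User-Agent claims a browser but the TLS label does not.
function tlsContradictsUserAgent(tlsClient, userAgent) {
  const uaClaimsBrowser = /Chrome|Firefox|Safari|Edg/i.test(userAgent);
  const tlsLooksLikeBrowser = /chrome|firefox|safari|edge/i.test(tlsClient);
  return uaClaimsBrowser && !tlsLooksLikeBrowser;
}

console.log(buildJa3String({
  version: 769,
  ciphers: [47, 53, 5],
  extensions: [0, 10, 11],
  curves: [23, 24],
  pointFormats: [0],
})); // 769,47-53-5,0-10-11,23-24,0

console.log(tlsContradictsUserAgent("python-requests", "Mozilla/5.0 Chrome/120.0")); // true
```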

IP Reputation (3% Weight)

Priority: Low

Checks IP addresses against threat intelligence feeds and identifies datacenter/proxy/VPN connections.

Signal	Source	Score Impact
Known malicious IP	Threat intelligence feeds	+60-80
Datacenter hosting	IP belongs to cloud provider	+20-35
Tor exit node	IP is a Tor network exit point	+40-55
High abuse reports	Many reports on AbuseIPDB	+35-50
Open proxy	IP listed as open proxy	+30-45
VPN service	Known VPN provider IP	+15-30
Residential proxy	Suspicious residential IP	+25-40

Why it matters: Provides useful context but has high false-positive potential. Many legitimate users use VPNs for privacy, and datacenter IPs might be corporate proxies. That’s why this category only contributes 3% to the final score.

HTTP Headers (2% Weight)

Priority: Low

Analyzes HTTP request headers for missing standard headers or patterns associated with automated tools.

Signal	What’s Detected	Score Impact
Missing Accept header	No content type preference	+25-40
No Referer	Direct access to deep pages	+15-25
Unusual header order	Non-browser header ordering	+20-35
Missing cookies	No cookie support	+15-25
Missing Accept-Language	No language preference	+20-30
Missing Accept-Encoding	No compression support	+15-25

Why it matters: Simple bots often omit headers that real browsers include automatically. However, this is easily spoofed by adding the expected headers, which is why the weight is low.

Example:

GET /api/data HTTP/1.1
Host: example.com
User-Agent: Mozilla/5.0...
# Missing: Accept, Accept-Language, Accept-Encoding, Cookie

→ Header Score: 45
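
A missing-header check like the one in this example is straightforward to sketch. The `EXPECTED_HEADERS` list and the `missingBrowserHeaders` function are assumptions made for illustration; header names are compared case-insensitively because HTTP header names are case-insensitive.

```javascript
// Headers that browsers normally send, per the table above.
const EXPECTED_HEADERS = ["accept", "accept-language", "accept-encoding", "cookie"];

// Returns the expected headers that are absent from the request.
function missingBrowserHeaders(headers) {
  const present = new Set(Object.keys(headers).map((name) => name.toLowerCase()));
  return EXPECTED_HEADERS.filter((name) => !present.has(name));
}

// The example request carries only Host and User-Agent.
console.log(missingBrowserHeaders({ Host: "example.com", "User-Agent": "Mozilla/5.0..." }));
// [ 'accept', 'accept-language', 'accept-encoding', 'cookie' ]
```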

User Agent (1% Weight)

Priority: Low

Examines User-Agent strings for bot signatures, outdated versions, or obvious automation markers.

Signal	Examples	Score Impact
Known bot string	“Googlebot”, “Bingbot” (verified)	0 (legitimate)
curl/wget UA	“curl/7.68.0”, “Wget/1.20”	+50-70
Empty User-Agent	No UA provided	+60-80
Scripting library UA	“python-requests”, “axios”	+40-60
Outdated browser	Chrome 60 when current is 120	+20-35
Malformed UA	Syntax errors or truncated	+35-50

Why it matters: User-Agent is trivially spoofed by any bot, so it receives minimal weight (1%). It only catches the most obvious, unsophisticated bots that don’t bother to set a realistic User-Agent.

Threat Levels

Scores are grouped into five threat levels for easier interpretation and action.

Level Overview

Score Range	Level	Color	Description	Recommended Action
0-20	MINIMAL	Green	Very low risk, likely legitimate	Allow
21-40	LOW	Lime	Some signals, probably benign	Log
41-60	MEDIUM	Yellow	Suspicious activity detected	Monitor/Challenge
61-80	HIGH	Orange	Strong bot/malicious indicators	Challenge/Block
81-100	CRITICAL	Red	Almost certainly automated/malicious	Block
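
The table maps directly onto a small lookup function. In this sketch the name `threatLevel` is hypothetical, and rounding fractional scores to the nearest whole number before the lookup is an assumption, since the ranges are given as integers.

```javascript
// Maps a 0-100 Threat Score to its level name.
function threatLevel(score) {
  if (typeof score !== "number" || Number.isNaN(score) || score < 0 || score > 100) {
    throw new RangeError("score must be a number from 0 to 100");
  }
  const rounded = Math.round(score); // assumption: fractional scores are rounded
  if (rounded <= 20) return "MINIMAL";
  if (rounded <= 40) return "LOW";
  if (rounded <= 60) return "MEDIUM";
  if (rounded <= 80) return "HIGH";
  return "CRITICAL";
}

console.log(threatLevel(12)); // MINIMAL
console.log(threatLevel(73)); // HIGH
```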

MINIMAL (0-20)

Score: 12 | Level: MINIMAL | ● Green

Characteristics:

Few or no suspicious signals detected
Normal browsing patterns observed
Standard browser fingerprint
Good IP reputation

Interpretation: This is normal, legitimate traffic. The visitor behaves like a real human using a standard browser.

Recommended Actions:

Allow the request
No special logging required
No alerts needed

LOW (21-40)

Score: 35 | Level: LOW | ● Lime

Characteristics:

Minor signals present (e.g., VPN usage, missing header)
No honeypot triggers
Mostly normal behavior patterns
Could be privacy-conscious user or minor automation

Interpretation: Some signals are present but likely benign. Could be a legitimate user with a VPN or unusual browser configuration.

Recommended Actions:

Allow the request
Log for pattern analysis
Monitor for escalation

MEDIUM (41-60)

Score: 52 | Level: MEDIUM | ● Yellow

Characteristics:

Multiple suspicious signals
Possible fingerprint anomalies
Unusual behavior patterns
May have triggered low-weight honeypots

Interpretation: Suspicious activity that warrants attention. Could be a scanner, scraper, or unsophisticated bot.

Recommended Actions:

Consider CAPTCHA challenge
Log with high priority
Alert on repeated occurrences
Manual review recommended

HIGH (61-80)

Score: 73 | Level: HIGH | ● Orange

Characteristics:

Strong indicators of automation
Likely honeypot triggers
Fingerprint clearly indicates bot/headless browser
Suspicious or malicious patterns

Interpretation: Very likely not a legitimate human user. This is probably a bot, scanner, or attacker.

Recommended Actions:

Challenge or block
Alert security team
Add IP to watchlist
Investigate the attack pattern

CRITICAL (81-100)

Score: 92 | Level: CRITICAL | ● Red

Characteristics:

Multiple high-confidence indicators
Attack signatures detected
Honeypot(s) triggered
Clear malicious intent

Interpretation: This is almost certainly automated or malicious traffic. Active attack or aggressive scanning in progress.

Recommended Actions:

Block immediately
Alert security team
Consider IP blocking at edge
Investigate and document
Report to threat intelligence if applicable

Threat Categories

Beyond the numeric score, visitors are classified into categories based on which signals triggered.

Category Classification Logic

Categories are determined by evaluating signals in priority order:

1. Attack Signatures ≥ 30 → ATTACKER
2. Honeypot ≥ 40 AND Fingerprint ≥ 30 → BOT
3. Honeypot ≥ 20 → SCANNER
4. User Agent matches crawler pattern → CRAWLER
5. Fingerprint ≥ 40 (without honeypot) → SCRAPER
6. User Agent ≥ 30 OR Headers ≥ 25 → SCRAPER
7. Default → LEGITIMATE
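
Because the rules are evaluated in priority order, the first match wins. A minimal sketch of that ordering is shown here; `classifyVisitor` and its input shape are hypothetical, and `isKnownCrawlerUA` stands in for whatever crawler-pattern match the platform performs.

```javascript
// Applies the seven rules in order and returns the first category that matches.
function classifyVisitor(scores, isKnownCrawlerUA = false) {
  const { attack = 0, honeypot = 0, fingerprint = 0, userAgent = 0, headers = 0 } = scores;

  if (attack >= 30) return "ATTACKER";
  if (honeypot >= 40 && fingerprint >= 30) return "BOT";
  if (honeypot >= 20) return "SCANNER";
  if (isKnownCrawlerUA) return "CRAWLER";
  if (fingerprint >= 40) return "SCRAPER";
  if (userAgent >= 30 || headers >= 25) return "SCRAPER";
  return "LEGITIMATE";
}

console.log(classifyVisitor({ honeypot: 75, fingerprint: 68 })); // BOT
console.log(classifyVisitor({ honeypot: 45 }));                  // SCANNER
```

Note that a visitor with both a high attack score and honeypot triggers is still classified as ATTACKER, because rule 1 is checked first.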

Category Descriptions

Category	Icon	Triggers	Description	Typical Score
Attacker	⚠️	Attack signatures (SQLi, XSS, etc.)	Active exploitation attempts	70-100
Bot	🤖	Honeypot + fingerprint anomalies	Automated traffic with technical non-human signs	60-90
Scanner	📡	Any honeypot/decoy trigger	Reconnaissance or vulnerability scanning	40-80
Crawler	🕷️	Known crawler User-Agent	Web crawlers and indexing bots	20-50
Scraper	📋	Fingerprint anomalies (no honeypot)	Content scraping or data harvesting	35-65
Legitimate	✓	All signals below threshold	Normal human visitor	0-25

Category Examples

Attacker:

Request: POST /api/login
Body: {"password": "' OR '1'='1"}
Attack Signature Score: 85
→ Category: ATTACKER

Bot:

Accessed: /admin/backup.zip (honeypot)
Fingerprint: WebDriver=true, no plugins
Honeypot Score: 75, Fingerprint Score: 68
→ Category: BOT

Scanner:

Accessed: /.git/config (decoy)
Normal fingerprint otherwise
Honeypot Score: 45
→ Category: SCANNER

Legitimate:

Normal browsing pattern
No honeypots triggered
All signals < thresholds
→ Category: LEGITIMATE

Confidence Score

The Confidence percentage indicates how certain WebDecoy is about the threat assessment.

How Confidence is Calculated

Confidence is based on the number and quality of active signals:

Factor	Impact
More signal categories active	Higher confidence
Honeypot triggered	+25% confidence boost
Strong fingerprint match	+20% confidence boost
Only 1-2 weak signals	Lower confidence
Conflicting signals	Lower confidence

Confidence Formula

Base Confidence = (Active Categories / Total Categories) × 100

Adjustments:
+ 25% if honeypot triggered
+ 20% if strong fingerprint anomaly
- 15% if signals conflict
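
The formula can be sketched as follows. Two details are assumptions rather than documented behavior: the adjustments are treated as percentage points added to the base, and the result is clamped to the 0-100 range. The function name and parameter names are hypothetical.

```javascript
// Computes a confidence percentage from the factors in the formula above.
function computeConfidence({
  activeCategories,
  totalCategories = 8,
  honeypotTriggered = false,
  strongFingerprintAnomaly = false,
  signalsConflict = false,
}) {
  let confidence = (activeCategories / totalCategories) * 100; // base confidence
  if (honeypotTriggered) confidence += 25;
  if (strongFingerprintAnomaly) confidence += 20;
  if (signalsConflict) confidence -= 15;
  return Math.min(100, Math.max(0, confidence)); // assumption: clamp to 0-100
}

// Two of eight categories active, one of them a honeypot trigger: 25 + 25.
console.log(computeConfidence({ activeCategories: 2, honeypotTriggered: true })); // 50
```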

Interpreting Confidence

Confidence	Meaning	Action
80-100%	Very certain	Act on the score
60-79%	Reasonably certain	Act with monitoring
40-59%	Moderate certainty	Consider challenging
20-39%	Low certainty	Log and observe
0-19%	Very uncertain	Collect more data

Using Scores Effectively

Recommended Thresholds

Use Case	Block Threshold	Challenge Threshold
Financial / Banking	55	40
E-commerce	65	50
Standard Websites	75	60
Public Content / Blogs	85	70
Monitoring Only	N/A (log only)	N/A
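
One way to apply these thresholds is a per-use-case lookup. The sketch below is an illustration, not a WebDecoy API: the `THRESHOLDS` keys and `decideAction` are invented names, thresholds are assumed to be inclusive lower bounds, and any unlisted use case falls back to log-only behavior, mirroring the Monitoring Only row.

```javascript
// Block and challenge thresholds from the table above, keyed by hypothetical names.
const THRESHOLDS = {
  financial: { block: 55, challenge: 40 },
  ecommerce: { block: 65, challenge: 50 },
  standard: { block: 75, challenge: 60 },
  publicContent: { block: 85, challenge: 70 },
};

// Returns "block", "challenge", "allow", or "log" for monitoring-only setups.
function decideAction(score, useCase) {
  const thresholds = THRESHOLDS[useCase];
  if (!thresholds) return "log"; // monitoring only: never block or challenge
  if (score >= thresholds.block) return "block";
  if (score >= thresholds.challenge) return "challenge";
  return "allow";
}

// The same score leads to different actions depending on the site's risk tolerance.
console.log(decideAction(62, "financial"));     // block
console.log(decideAction(62, "standard"));      // challenge
console.log(decideAction(62, "publicContent")); // allow
```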

Score-Based Decision Logic

// blockRequest, challengeWithCaptcha, logHighPriority, allowWithMonitoring,
// and allowRequest are placeholders for your own response handlers.
function handleRequest(detection) {
  const { score, confidence, category } = detection;

  // CRITICAL (81-100) with sufficient confidence - block immediately
  if (score >= 81 && confidence >= 60) {
    return blockRequest();
  }

  // HIGH (61-80), or CRITICAL with low confidence - block attackers, challenge the rest
  if (score >= 61) {
    if (category === 'attacker') {
      return blockRequest();
    }
    return challengeWithCaptcha();
  }

  // MEDIUM (41-60) - log and monitor
  if (score >= 41) {
    logHighPriority(detection);
    return allowWithMonitoring();
  }

  // LOW/MINIMAL (0-40) - allow
  return allowRequest();
}

Filtering by Category

Use category filters in the Detections table to focus on specific threat types:

Attackers first: Filter to category = attacker to investigate active exploitation attempts
Scanner review: Filter to category = scanner to see what reconnaissance activity is happening
Legitimate verification: Filter to category = legitimate with high scores to find potential false positives

Tuning for Your Environment

Reducing False Positives:

Increase block threshold (e.g., 75 → 85)
Add known good IPs to allowlist
Verify good bots (Googlebot, Bingbot) by IP
Review medium-score detections manually

Catching More Threats:

Lower block threshold (e.g., 75 → 65)
Add more honeypot links to pages
Enable Detection Script Pro for JavaScript analysis
Monitor category distribution for patterns

Score Explanation Dialog

In the WebDecoy dashboard, click on any threat score to open the Score Explanation Dialog, which shows:

Overview - What the unified score means
How Scoring Works - The three-step calculation process
Category Weights - Visual breakdown of all 8 categories
Category Details - What each category detects and why it matters
Threat Levels - The 5 risk levels and their meaning
Threat Categories - How visitors are classified
Confidence Factors - What affects certainty

This dialog helps you understand exactly why a visitor received their score and make informed decisions about your security policies.

Related Documentation

AI Scraper Scoring - Separate scoring dimension for AI training crawlers
Response Actions - Automated responses to threats
Detection Script - JavaScript-based bot detection
