Reversing the Spur.us Monocle Captcha

2025-08-26 • #captcha, #fingerprinting, #automation, #reverse-engineering

Deconstructing Spur.us's Monocle Captcha

Spur.us operates one of the more interesting captcha systems I've encountered recently, which they call "Monocle". Unlike traditional image-based captchas, Monocle relies on browser fingerprinting and IP reputation rather than asking users to identify traffic lights.

The Architecture: More Than Meets the Eye

Spur.us's infrastructure is built around four key domains that work together in an interesting way:

spur.us: The main platform serving IP data
mcl.spur.us: The Monocle captcha endpoint
app.spur.us: The authenticated dashboard for customers
<randomuuid>.verify-euw2.spur.us: Fingerprinting domain

Unintended IPv6 Connectivity

While mcl.spur.us has no IPv6 (AAAA) records of its own, it can still be reached over IPv6 through the DNS records of <randomuuid>.verify-euw2.spur.us, which sits in the same Google Cloud scope. This oversight has unfortunate consequences for Spur's proxy detection capabilities.

Similarly, app.spur.us used to publish IPv6 records, which allowed IPv6 connectivity to spur.us because both sat in the same Fastly scope.
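One way to check a hostname for this kind of IPv6 exposure is to resolve it and split the answers by address family. The sketch below is illustrative only: splitFamilies and lookupFamilies are made-up helper names, and example.com stands in for whichever host you want to inspect.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"net/netip"
	"time"
)

// splitFamilies partitions textual IP addresses into IPv4 and IPv6 lists.
// Entries that do not parse as IP addresses are skipped.
func splitFamilies(addrs []string) (v4, v6 []string) {
	for _, s := range addrs {
		a, err := netip.ParseAddr(s)
		if err != nil {
			continue
		}
		if a.Unmap().Is4() {
			v4 = append(v4, s)
		} else {
			v6 = append(v6, s)
		}
	}
	return v4, v6
}

// lookupFamilies resolves host and reports the addresses found per family.
func lookupFamilies(host string) (v4, v6 []string, err error) {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	addrs, err := net.DefaultResolver.LookupHost(ctx, host)
	if err != nil {
		return nil, nil, err
	}
	v4, v6 = splitFamilies(addrs)
	return v4, v6, nil
}

func main() {
	v4, v6, err := lookupFamilies("example.com") // placeholder hostname
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Printf("IPv4 answers: %d, IPv6 answers: %d\n", len(v4), len(v6))
}
```

A host with zero IPv6 answers can still be reachable over IPv6 if another name on the same front end publishes AAAA records, which is the situation described above.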

Monocle: The "Captcha" That Isn't Really a Captcha

Instead of challenging users with visual puzzles, Monocle performs minimal fingerprinting while relying heavily on Spur's own IP reputation database. The system attempts to determine whether a request is coming from a legitimate browser environment or an automated script.

Once past the captcha, Spur.us serves HTML pages with IP data embedded in the DOM structure. The data includes infrastructure classifications, device counts, observed risks, organization details, and detected proxy providers.

Here's what a structured extraction of that data looks like after parsing the HTML:

{
  "infra_type": "Unknown",
  "device_count": "2", 
  "observed_risks": "Callback Proxy",
  "org": "[REDACTED]",
  "anon_status": "Not Anonymous",
  "providers": [
    "[REDACTED]_PROXY",
    "[REDACTED]_PROXY", 
    "[REDACTED]_PROXY"
  ]
}
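For readers who want to consume that output in Go, a struct with matching JSON tags is enough. This is a minimal sketch: the type name ipContext and the function parseContext are hypothetical, and device_count stays a string because that is how the sample above encodes it.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ipContext mirrors the JSON keys in the sample above (hypothetical type name).
type ipContext struct {
	InfraType     string   `json:"infra_type"`
	DeviceCount   string   `json:"device_count"` // a string in the sample, not a number
	ObservedRisks string   `json:"observed_risks"`
	Org           string   `json:"org"`
	AnonStatus    string   `json:"anon_status"`
	Providers     []string `json:"providers"`
}

// parseContext decodes one extracted record.
func parseContext(raw string) (ipContext, error) {
	var c ipContext
	err := json.Unmarshal([]byte(raw), &c)
	return c, err
}

func main() {
	sample := `{
	  "infra_type": "Unknown",
	  "device_count": "2",
	  "observed_risks": "Callback Proxy",
	  "org": "[REDACTED]",
	  "anon_status": "Not Anonymous",
	  "providers": ["[REDACTED]_PROXY"]
	}`
	c, err := parseContext(sample)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(c.ObservedRisks, len(c.Providers))
}
```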

The Browser Fingerprinting Dance

The heart of Monocle lies in its browser environment simulation requirements. The system expects specific plugin data that mimics a real browser session. Here's how the fingerprinting process works:

Plugin System Architecture

The captcha validates browser capabilities through a series of plugin checks:

type pluginData struct {
    PID   string `json:"pid"`
    V     int    `json:"v"`
    Start string `json:"start"`
    Data  string `json:"data"`
    End   string `json:"end"`
}

Each plugin result contains timing information and specific capability data. The system checks for:

Browser Info (p/bi): Location, user agent, CPU count, timezone, and session identifiers
Drag Detection (p/dr): Whether the browser supports drag operations
WebSocket Support (p/ws): WebSocket API availability
WebGL Capabilities (p/wgl): Graphics rendering support
Version Check (p/v): Monocle version verification
Window Properties (p/win): Screen dimensions and touch capabilities
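To make the wire shape concrete, here is a small sketch that serializes one pluginData record using the struct tags shown above. The field values are placeholders: this post does not show how the start, data, and end strings are actually encoded, so nothing here should be read as the real format. encodeResult is a hypothetical helper name.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pluginData repeats the struct from the post so the example compiles on its own.
type pluginData struct {
	PID   string `json:"pid"`
	V     int    `json:"v"`
	Start string `json:"start"`
	Data  string `json:"data"`
	End   string `json:"end"`
}

// encodeResult serializes one plugin result to JSON.
func encodeResult(p pluginData) (string, error) {
	b, err := json.Marshal(p)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// START, DATA and END are placeholders, not real encodings.
	out, err := encodeResult(pluginData{PID: "p/ws", V: 1, Start: "START", Data: "DATA", End: "END"})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(out)
}
```

Each entry in the plugin history is one such object, keyed by its plugin ID.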

The Assessment Request Process

The system builds a comprehensive browser profile before making the assessment request:

func fetchAssessment(client *http.Client, ua string) (string, error) {
    // p/bi payload: page location, locale, user agent and timing fields.
    browserInfo := map[string]any{
        "loc": SPUR_BASE + "/captcha?redirect=%2Fcontext%2F[REDACTED_IP]",
        "host": "spur.us",
        "ref": "",
        "lang": []string{"en-GB", "en"},
        "ua": ua,
        "cpu": 8,
        "lt": 0,
        "tz": "UTC",
        "time": map[string]int{
            "redr": 0, "dns": 0, "tcp": 0, "ssl": 0,
            "resp": 0, "ftch": 0, "rqst": 0, "strt": 0, "drtn": 0,
        },
        "sess": genHex(40),
        "rid": genHex(40),
        "rte": "Cannot read properties of null (reading 'rte')",
    }

    // One entry per plugin check, in the order listed above.
    pluginHistory := []pluginData{
        makePluginData("p/bi", 1, browserInfo),
        makePluginData("p/dr", 1, map[string]any{"ok": false}),
        makePluginData("p/ws", 1, map[string]any{"ok": false}),
        makePluginData("p/wgl", 1, map[string]any{"ec": 0, "r": "null", "umv": "null", "umr": "null"}),
        makePluginData("p/v", 1, map[string]any{"ok": true, "version": VERSION}),
        makePluginData("p/win", 1, map[string]any{
            "webd": false, "cprop": false, "plen": 0, "tch": false,
            "ptr": false, "idb": false, "outW": 0, "outH": 0, "sw": 0, "sh": 0,
        }),
    }

    // Excerpt ends here: the request that submits pluginHistory is shown in Phase 1 below.
}
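The genHex helper used for the sess and rid fields is not shown in this excerpt. A plausible sketch, assuming its argument is the number of hexadecimal characters to return and that the values only need to be random, could look like this:

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// genHex returns n random hexadecimal characters.
// Assumption: n counts characters, not bytes; the original helper may differ.
func genHex(n int) string {
	buf := make([]byte, (n+1)/2) // two hex characters per byte, rounded up
	if _, err := rand.Read(buf); err != nil {
		panic(err) // crypto/rand failing is not recoverable here
	}
	return hex.EncodeToString(buf)[:n]
}

func main() {
	id := genHex(40)
	fmt.Println(len(id), id)
}
```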

The Two-Step Authentication Flow

Monocle uses a two-phase approach to validate requests:

Phase 1: Assessment Generation

The first request goes to the MCL endpoint with the browser fingerprint data:

params := url.Values{}
params.Set("v", VERSION)
params.Set("t", TYPE) 
params.Set("s", SESSION_ID)
apiUrl := MCL_BASE + "/r/bundle?" + params.Encode()

reqData := reqPayload{B: "", H: pluginHistory}
jsonBody, _ := json.Marshal(reqData)

req, _ := http.NewRequest("POST", apiUrl, bytes.NewReader(jsonBody))
req.Header.Set("Content-Type", "application/json")
req.Header.Set("X-Mcl-Tk", TOKEN)
req.Header.Set("User-Agent", ua)

The system validates the plugin data and returns an assessment token if the fingerprint appears legitimate.
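The snippet above discards the errors from json.Marshal and http.NewRequest. A more defensive version of the same request construction might look like the sketch below. The bundleParams type, both function names, and the example.test base URL are assumptions made for illustration. The payload is passed as any because the reqPayload definition is not shown in this post.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
)

// bundleParams groups the values used to build the request (hypothetical type).
type bundleParams struct {
	Version, Type, Session, Token, UserAgent string
}

// bundleURL builds the /r/bundle URL. url.Values.Encode sorts keys,
// so the query string comes out in the order s, t, v.
func bundleURL(base string, p bundleParams) string {
	q := url.Values{}
	q.Set("v", p.Version)
	q.Set("t", p.Type)
	q.Set("s", p.Session)
	return base + "/r/bundle?" + q.Encode()
}

// newBundleRequest mirrors the request construction above, returning errors
// instead of discarding them.
func newBundleRequest(base string, p bundleParams, payload any) (*http.Request, error) {
	body, err := json.Marshal(payload)
	if err != nil {
		return nil, fmt.Errorf("encode payload: %w", err)
	}
	req, err := http.NewRequest(http.MethodPost, bundleURL(base, p), bytes.NewReader(body))
	if err != nil {
		return nil, fmt.Errorf("build request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("X-Mcl-Tk", p.Token)
	req.Header.Set("User-Agent", p.UserAgent)
	return req, nil
}

func main() {
	p := bundleParams{Version: "1", Type: "x", Session: "abc", Token: "tok", UserAgent: "ua"}
	req, err := newBundleRequest("https://mcl.example.test", p, map[string]any{"b": "", "h": []any{}})
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
}
```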

Phase 2: Captcha Validation

The assessment token is then submitted to validate the captcha:

func submitCaptcha(client *http.Client, assessment, redirectPath, ua string) (*http.Response, error) {
    apiUrl := SPUR_BASE + "/api/validate-captcha"
    reqHeaders := map[string]string{
        "accept": "*/*",
        "accept-language": "en-GB,en;q=0.9",
        "cache-control": "no-cache",
        "content-type": "application/json",
        "dnt": "1",
        "origin": SPUR_BASE,
        "pragma": "no-cache",
        "priority": "u=1, i",
        "referer": SPUR_BASE + "/captcha?redirect=" + url.QueryEscape(redirectPath),
        "sec-ch-ua": "\"Not;A=Brand\";v=\"99\", \"Google Chrome\";v=\"139\", \"Chromium\";v=\"139\"",
        "sec-ch-ua-mobile": "?0",
        "sec-ch-ua-platform": "\"Windows\"",
        "sec-fetch-dest": "empty",
        "sec-fetch-mode": "cors",
        "sec-fetch-site": "same-origin",
        "user-agent": ua,
    }
    reqData := map[string]any{"assessment": assessment, "redirect": redirectPath}

    // Excerpt ends here: the POST that sends reqData with reqHeaders is omitted.
}

The IPv6 Achilles' Heel

Here's where Spur.us's architecture becomes its own weakness. The company built their business around maintaining a database of IPv4 addresses used by residential proxy networks. However, an unintended consequence of their infrastructure setup enables IPv6 connectivity across all their services.

Since IPv6 addresses are vastly more numerous and less tracked than IPv4, the proxy detection database becomes essentially useless against IPv6-enabled automation. An attacker using IPv6 connectivity can bypass the very IP reputation system that Monocle relies on for detection.
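The scale difference is easy to quantify. The sketch below counts the addresses in a prefix: 0.0.0.0/0 is the whole IPv4 space, and 2001:db8::/64 is a documentation prefix standing in for a single ordinary IPv6 subnet. The function name addressCount is made up for this example.

```go
package main

import (
	"fmt"
	"math/big"
	"net/netip"
)

// addressCount returns the number of addresses in a CIDR prefix as a decimal string.
func addressCount(cidr string) (string, error) {
	p, err := netip.ParsePrefix(cidr)
	if err != nil {
		return "", err
	}
	hostBits := p.Addr().BitLen() - p.Bits() // bits left for hosts
	n := new(big.Int).Lsh(big.NewInt(1), uint(hostBits))
	return n.String(), nil
}

func main() {
	for _, c := range []string{"0.0.0.0/0", "2001:db8::/64"} {
		n, err := addressCount(c)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("%-14s %s addresses\n", c, n)
	}
}
```

A single /64 holds 2^64 addresses, which is 2^32 times the entire IPv4 space, so a reputation list keyed on individual addresses cannot keep up.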

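This creates an ironic situation: a proxy detection company whose own infrastructure has an unintended vulnerability to proxy-based automation through IPv6.

Tying the two phases together, the whole flow reduces to a single function that returns the session cookie: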

func GetSessionCookie(proxy string, ip string, userAgent string) (string, error) {
    client := newClient(proxy)
    assessment, err := fetchAssessment(client, userAgent)
    if err != nil {
        return "", err
    }
    resp, err := submitCaptcha(client, assessment, "/context/"+ip, userAgent)
    if err != nil {
        return "", err
    }
    defer resp.Body.Close()

    for _, cookie := range resp.Cookies() {
        if cookie.Name == "__session" {
            return cookie.Value, nil
        }
    }
    return "", fmt.Errorf("__session cookie not found")
}

The __session cookie becomes the key to accessing Spur.us's protected endpoints without further captcha challenges.
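The cookie lookup itself can be exercised without touching any real service. This sketch uses net/http/httptest to stand up a local server that sets a fake __session cookie, then reads it back with the same resp.Cookies() loop. The helper names and the fake-token value are invented for the example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// cookieValue returns the value of the named cookie set by resp, if present.
func cookieValue(resp *http.Response, name string) (string, bool) {
	for _, c := range resp.Cookies() {
		if c.Name == name {
			return c.Value, true
		}
	}
	return "", false
}

// fetchCookie requests target and returns the named Set-Cookie value.
func fetchCookie(client *http.Client, target, name string) (string, error) {
	resp, err := client.Get(target)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	v, ok := cookieValue(resp, name)
	if !ok {
		return "", fmt.Errorf("%s cookie not found", name)
	}
	return v, nil
}

// demo runs fetchCookie against a local stand-in server.
func demo(name string) (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.SetCookie(w, &http.Cookie{Name: "__session", Value: "fake-token"})
	}))
	defer srv.Close()
	return fetchCookie(srv.Client(), srv.URL, name)
}

func main() {
	v, err := demo("__session")
	fmt.Println(v, err)
}
```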

The Data Extraction Process

Once authenticated, the system can scrape valuable IP intelligence data:

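// setupClient returns an HTTP client that routes through proxyURL when one is given.
func setupClient(proxyURL string) *http.Client {
    transport := &http.Transport{}
    if proxyURL != "" {
        // Note: the parse error is discarded, so a malformed proxy URL
        // silently results in a direct (unproxied) connection.
        u, _ := url.Parse(proxyURL)
        transport.Proxy = http.ProxyURL(u)
    }
    return &http.Client{Transport: transport}
}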

// Later in main():
req, _ := http.NewRequest("GET", "https://spur.us/context/"+*ip, nil)
req.Header.Set("user-agent", userAgent)
req.AddCookie(&http.Cookie{Name: "__session", Value: sessionToken})

resp, err := client.Do(req)

The scraping process extracts structured data about IP addresses, including infrastructure type, device counts, observed risks, and associated proxy providers.

HTML Parsing and Data Structure

The system parses the response HTML to extract structured intelligence:

findValue := func(label string) string {
    result := "Unknown"
    doc.Find("dt").Each(func(_ int, s *goquery.Selection) {
        if strings.TrimSpace(s.Text()) == label {
            // goquery's Next() never returns nil; an empty selection means no sibling.
            if nextElement := s.Next(); nextElement.Length() > 0 {
                result = strings.TrimSpace(nextElement.Text())
            }
        }
    })
    return result
}

This extracts key-value pairs from the HTML structure, looking for specific labels like "Observed risks" and "Registered to."

Security Implications and Defensive Measures

From a defensive perspective, Monocle's approach has both strengths and weaknesses:

Strengths:

Minimal user friction: No visual challenges for legitimate users
Comprehensive fingerprinting: Checks multiple browser capabilities
IP reputation integration: Leverages existing threat intelligence

Weaknesses:

IPv6 blind spot: The core IP reputation system doesn't cover IPv6
Static fingerprinting: The expected plugin responses are predictable
Unintended infrastructure bypass: Shared cloud scopes accidentally enable IPv6 connectivity

Conclusion: The Evolution of Bot Detection

The unintended IPv6 connectivity and predictable fingerprinting requirements create easy opportunities for automation. This code shows that even well-designed anti-bot systems can have architectural oversights that anyone can exploit.

The code is open-sourced on GitHub.