Benchmark Results

Aggregate findings from pressure testing on win-target (Windows Server 2022).

All tests use 50 inventory entries pointing at a single host via WinRM HTTPS through an SSH tunnel (localhost:15986 -> win-target:5986).
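The 50-entry inventory described above could be generated along these lines. This is a minimal sketch, not the inventory used in the tests: the group name `win_pressure` and the `win-target-NN` aliases are hypothetical, and only `ansible_host` and `ansible_port` are standard Ansible variables.

```python
def build_inventory(entries=50, host="localhost", port=15986, group="win_pressure"):
    """Return INI inventory text whose aliases all resolve to one tunnel endpoint."""
    lines = [f"[{group}]"]  # hypothetical group name
    for i in range(1, entries + 1):
        # Each alias is a distinct inventory host, so Ansible opens one
        # connection per alias even though the target is the same machine.
        lines.append(f"win-target-{i:02d} ansible_host={host} ansible_port={port}")
    return "\n".join(lines) + "\n"


inventory_text = build_inventory()
print(inventory_text.splitlines()[1])
```

Because every alias maps to the same address, the number of forks directly controls how many simultaneous sessions hit win-target.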

Test Matrix Summary

| Test | Forks | Plugin | Quotas | Success | Fail | Auth Failures | AD Risk |
|---|---|---|---|---|---|---|---|
| Safe baseline | 5 | pywinrm | default | 5/5 | 0 | 0 | None |
| Forkbomb | 50 | pywinrm | default | 9/50 | 41 | 41 | High |
| Quota elevation | 50 | pywinrm | elevated | 20/50 | 30 | 30 | High |
| Reduced forks | 25 | pywinrm | elevated | 20/25 | 5 | 5 | Moderate |
| PSRP comparison | 50 | pypsrp | elevated | 24/50 | 26 | 0 | None |

Baseline Audit

Pre-existing state on win-target before testing:

| Quota | Found Value | Windows Default |
|---|---|---|
| MaxShellsPerUser | 2,147,483,647 | 30 |
| MaxProcessesPerShell | 2,147,483,647 | 25 |
| MaxMemoryPerShellMB | 2,147,483,647 | 1024 |
| MaxConnections | 300 | 25 |
| MaxConcurrentOperationsPerUser | 1,500 | 1,500 |
| IdleTimeout | 7,200,000 ms | 7,200,000 ms |

All shell quotas had been set to 2,147,483,647 (the 32-bit signed integer maximum) by a prior configuration management run, which masked the forkbomb issue entirely. Quotas were reset to Windows defaults before reproduction testing.
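A pre-test audit of this kind amounts to comparing found values against the expected defaults. The sketch below uses the numbers from the audit table; the function name and dictionary layout are illustrative assumptions, and collecting the values from the host is out of scope here.

```python
# Values copied from the baseline audit table.
WINDOWS_DEFAULTS = {
    "MaxShellsPerUser": 30,
    "MaxProcessesPerShell": 25,
    "MaxMemoryPerShellMB": 1024,
    "MaxConnections": 25,
    "MaxConcurrentOperationsPerUser": 1500,
    "IdleTimeout": 7_200_000,  # milliseconds
}

FOUND = {
    "MaxShellsPerUser": 2_147_483_647,
    "MaxProcessesPerShell": 2_147_483_647,
    "MaxMemoryPerShellMB": 2_147_483_647,
    "MaxConnections": 300,
    "MaxConcurrentOperationsPerUser": 1500,
    "IdleTimeout": 7_200_000,
}


def quota_drift(found, defaults):
    """Return {name: (found, default)} for every quota that differs from its default."""
    return {
        name: (found[name], default)
        for name, default in defaults.items()
        if found[name] != default
    }


drift = quota_drift(FOUND, WINDOWS_DEFAULTS)
for name, (found, default) in drift.items():
    print(f"{name}: found {found}, default {default}")
```

Running a check like this before a pressure test makes it obvious when an earlier run has left quotas in a state that would hide the problem being reproduced.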

Forkbomb Reproduction (9/50 pass)

With Windows default quotas (MaxShellsPerUser=30, MaxConcurrentUsers=10):

  • 9 of 50 connections succeeded, in line with the MaxConcurrentUsers=10 limit (one below it)
  • 41 connections received NTLM auth failure errors
  • Error messages carry the ntlm: prefix, so quota exhaustion is disguised as credential rejection
  • Each of the 41 failures counts as a failed AD authentication attempt

41 failures in ~5 seconds
AD lockout threshold = 5 failures in 15 minutes
41 / 5 = ~8x the lockout threshold in a single burst
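The arithmetic above can be checked with a few lines. This is only a sketch of the calculation; the function names are illustrative, and the threshold and window are the values quoted in this report, not a universal AD setting.

```python
def lockout_multiple(failed_logons, threshold):
    """How many times over the lockout threshold a burst of failures lands."""
    return failed_logons / threshold


def triggers_lockout(failed_logons, burst_seconds, threshold, window_seconds):
    """True if the whole burst fits inside the observation window and meets the threshold."""
    return burst_seconds <= window_seconds and failed_logons >= threshold


FAILED_LOGONS = 41          # auth failures in the forkbomb run
BURST_SECONDS = 5           # approximate duration of the burst
THRESHOLD = 5               # lockout threshold used in this report
WINDOW_SECONDS = 15 * 60    # 15-minute observation window

multiple = lockout_multiple(FAILED_LOGONS, THRESHOLD)
locked = triggers_lockout(FAILED_LOGONS, BURST_SECONDS, THRESHOLD, WINDOW_SECONDS)
print(f"{multiple:.1f}x threshold, lockout: {locked}")
```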

AD Lockout Risk

With a shared service account (like svc-ansible), a single forkbomb run locks out the account across all managed Windows hosts that share it.

Quota Elevation Fix (20/50 pass)

After raising quotas (MaxShellsPerUser=100, MaxConcurrentUsers=25):

  • 20 of 50 connections succeeded (up from 9)
  • The improvement tracks the MaxConcurrentUsers increase (10 -> 25)
  • The remaining 30 failures were caused by SSH tunnel TCP connection limits, not WinRM quotas

A follow-up test with forks=25 (within the MaxConcurrentUsers=25 limit) still showed 5 failures, confirming the SSH tunnel as a secondary bottleneck at ~20 concurrent connections.

PSRP Comparison (24/50 pass, 0 auth failures)

Using ansible_connection=psrp with ansible_psrp_auth=ntlm:

  • 24 of 50 succeeded
  • 0 UNREACHABLE (zero authentication failures)
  • 26 FAILED with HTTPSConnectionPool: Read timed out (SSH tunnel bottleneck)
---
config:
  theme: default
---
block-beta
    columns 3
    space header1["pywinrm"] header2["pypsrp"]
    r1["Successes"] r1a["9"] r1b["24"]
    r2["Error type"] r2a["UNREACHABLE"] r2b["FAILED (timeout)"]
    r3["Auth failures"] r3a["41"] r3b["0"]
    r4["AD lockout risk"] r4a["HIGH"] r4b["NONE"]

    style r1a fill:#f66,stroke:#333
    style r2a fill:#f66,stroke:#333
    style r3a fill:#f66,stroke:#333
    style r4a fill:#f66,stroke:#333
    style r1b fill:#9f9,stroke:#333
    style r2b fill:#ff9,stroke:#333
    style r3b fill:#9f9,stroke:#333
    style r4b fill:#9f9,stroke:#333

Key Finding

In these tests, pypsrp's connection pooling eliminated authentication failures entirely. All 26 remaining failures were SSH tunnel TCP timeouts, not auth problems, so pypsrp generated no failed AD logons and therefore no lockout risk, even under pressure.
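Separating the two failure modes when reviewing a run might look like the sketch below. It assumes, based only on the messages quoted in this report, that lockout-relevant failures are UNREACHABLE results containing `ntlm:` and that tunnel failures are FAILED results containing `Read timed out`. The function name, category labels, and sample tuples are hypothetical.

```python
from collections import Counter


def classify_failure(status, message):
    """Map one Ansible host result to a coarse failure category."""
    text = message.lower()
    if status == "UNREACHABLE" and "ntlm:" in text:
        return "auth_failure"      # counts as a failed AD logon
    if status == "FAILED" and "read timed out" in text:
        return "tunnel_timeout"    # infrastructure limit, not a failed logon
    return "other"


# Abridged sample results; the real messages contain more detail.
sample_results = [
    ("UNREACHABLE", "ntlm: <error text elided>"),
    ("FAILED", "HTTPSConnectionPool: Read timed out"),
    ("FAILED", "HTTPSConnectionPool: Read timed out"),
]

counts = Counter(classify_failure(status, msg) for status, msg in sample_results)
print(dict(counts))
```

Only the `auth_failure` count matters for lockout exposure; the other categories point at capacity problems instead.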

WinRM Restart Finding

Operational Hazard

Restarting the WinRM service from within a WinRM session (Restart-Service WinRM -Force) can corrupt the service state, leaving it stopped and unrecoverable without a full OS reboot.

Key details:

  • Restart-Service WinRM -Force kills the service process without graceful shutdown
  • The service shows as "Stopped" but Start-Service fails with "Cannot open WinRM service"
  • The HTTPS listener binding is lost
  • Only an OS reboot recovers the service

Mitigation: WSMan quota changes take effect immediately without a service restart. The restart handler should be removed from quota configuration roles.
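A quota role that follows this mitigation only needs to emit the quota changes themselves. The sketch below builds the PowerShell commands as strings and deliberately includes no restart step. It assumes the quotas live under the standard `WSMan:\localhost\Shell` provider path, uses the elevated values from this report, and leaves out how the commands are delivered to the host.

```python
# Elevated values used in the quota elevation tests.
ELEVATED_QUOTAS = {
    "MaxConcurrentUsers": 25,
    "MaxShellsPerUser": 100,
    "MaxProcessesPerShell": 50,
}


def quota_commands(quotas, base=r"WSMan:\localhost\Shell"):
    """Build one Set-Item command per quota; no Restart-Service is generated."""
    return [
        f"Set-Item -Path {base}\\{name} -Value {value}"
        for name, value in quotas.items()
    ]


commands = quota_commands(ELEVATED_QUOTAS)
for command in commands:
    print(command)
```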

Bottleneck Hierarchy

flowchart TD
    L1["Layer 1: MaxConcurrentUsers<br/><b>Primary gate</b>"]
    L1a["Determines how many sessions<br/>can authenticate simultaneously"]
    L1b["Failures appear as NTLM<br/>auth errors (misleading)"]

    L2["Layer 2: MaxShellsPerUser<br/><b>Secondary gate</b>"]
    L2a["Not reached if<br/>MaxConcurrentUsers is hit first"]
    L2b["Only relevant for<br/>long-lived parallel sessions"]

    L3["Layer 3: SSH tunnel TCP pool<br/><b>~20 connections</b>"]
    L3a["Infrastructure bottleneck,<br/>not WinRM"]
    L3b["Causes timeouts,<br/>not auth failures"]
    L3c["Does NOT trigger<br/>AD lockout"]

    L1 --> L1a & L1b
    L1 ==> L2
    L2 --> L2a & L2b
    L2 ==> L3
    L3 --> L3a & L3b & L3c

    style L1 fill:#f66,stroke:#333,stroke-width:2px
    style L2 fill:#f96,stroke:#333,stroke-width:2px
    style L3 fill:#ff9,stroke:#333,stroke-width:2px

Quota State Comparison

| Setting | Default | Elevated | Effect on 50-fork test |
|---|---|---|---|
| MaxConcurrentUsers | 10 | 25 | 9 -> 20 successes |
| MaxShellsPerUser | 30 | 100 | Not the bottleneck |
| MaxProcessesPerShell | 25 | 50 | Not tested in isolation |
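The bottleneck hierarchy can be approximated as "the smallest gate wins". The sketch below is a crude model rather than a measured law: the tunnel cap of 20 is the approximate figure from this report, the function name is illustrative, and only the pywinrm runs are included because the PSRP run (24 successes) exceeded that estimate.

```python
TUNNEL_CAP = 20  # approximate SSH tunnel limit observed in these tests


def predicted_successes(forks, max_concurrent_users, tunnel_cap=TUNNEL_CAP):
    """Estimate successes as the tightest of the three gates."""
    return min(forks, max_concurrent_users, tunnel_cap)


# (test name, forks, MaxConcurrentUsers, observed successes) from the test matrix.
PYWINRM_RUNS = [
    ("Forkbomb", 50, 10, 9),
    ("Quota elevation", 50, 25, 20),
    ("Reduced forks", 25, 25, 20),
]

for name, forks, max_users, observed in PYWINRM_RUNS:
    predicted = predicted_successes(forks, max_users)
    print(f"{name}: predicted {predicted}, observed {observed}")
```

The model lands within one connection of each pywinrm result, which supports reading MaxConcurrentUsers as the first gate and the tunnel as the second.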