Benchmark Results¶
Aggregate findings from pressure testing on win-target (Windows Server 2022).
All tests use 50 inventory entries pointing at a single host via WinRM HTTPS
through an SSH tunnel (localhost:15986 -> win-target:5986).
Test Matrix Summary¶
| Test | Forks | Plugin | Quotas | Success | Fail | Auth Failures | AD Risk |
|---|---|---|---|---|---|---|---|
| Safe baseline | 5 | pywinrm | default | 5/5 | 0 | 0 | None |
| Forkbomb | 50 | pywinrm | default | 9/50 | 41 | 41 | HIGH |
| Quota elevation | 50 | pywinrm | elevated | 20/50 | 30 | 30 | High |
| Reduced forks | 25 | pywinrm | elevated | 20/25 | 5 | 5 | Moderate |
| PSRP comparison | 50 | pypsrp | elevated | 24/50 | 26 | 0 | None |
Baseline Audit¶
Pre-existing state on win-target before testing:
| Quota | Found Value | Windows Default |
|---|---|---|
| MaxShellsPerUser | 2,147,483,647 | 30 |
| MaxProcessesPerShell | 2,147,483,647 | 25 |
| MaxMemoryPerShellMB | 2,147,483,647 | 1024 |
| MaxConnections | 300 | 25 |
| MaxConcurrentOperationsPerUser | 1,500 | 1,500 |
| IdleTimeout | 7,200,000ms | 7,200,000ms |
All shell quotas had been set to MAX INT by a prior configuration management run, masking the forkbomb issue entirely. Quotas were reset to Windows defaults before reproduction testing.
Forkbomb Reproduction (9/50 pass)¶
With Windows default quotas (MaxShellsPerUser=30, MaxConcurrentUsers=10):
- 9 of 50 connections succeeded, matching the
MaxConcurrentUsers=10limit - 41 connections received NTLM auth failure errors
- Error messages say
ntlm:-- quota exhaustion is disguised as credential rejection - Each of the 41 failures counts as a failed AD authentication attempt
41 failures in ~5 seconds
AD lockout threshold = 5 failures in 15 minutes
= 8x the lockout threshold in a single burst
AD Lockout Risk
With a shared service account (like svc-ansible), a single forkbomb run locks out
the account across all managed Windows hosts that share it.
Quota Elevation Fix (20/50 pass)¶
After raising quotas (MaxShellsPerUser=100, MaxConcurrentUsers=25):
- 20 of 50 connections succeeded (up from 9)
- Improvement directly correlates with
MaxConcurrentUsersincrease (10 -> 25) - Remaining 30 failures caused by SSH tunnel TCP connection limits, not WinRM quotas
A follow-up test with forks=25 (within the MaxConcurrentUsers=25 limit) still
showed 5 failures, confirming the SSH tunnel as a secondary bottleneck at ~20
concurrent connections.
PSRP Comparison (24/50 pass, 0 auth failures)¶
Using ansible_connection=psrp with ansible_psrp_auth=ntlm:
- 24 of 50 succeeded
- 0 UNREACHABLE (zero authentication failures)
- 26 FAILED with
HTTPSConnectionPool: Read timed out(SSH tunnel bottleneck)
---
config:
theme: default
---
block-beta
columns 3
space header1["pywinrm"] header2["pypsrp"]
r1["Successes"] r1a["9"] r1b["24"]
r2["Error type"] r2a["UNREACHABLE"] r2b["FAILED (timeout)"]
r3["Auth failures"] r3a["41"] r3b["0"]
r4["AD lockout risk"] r4a["HIGH"] r4b["NONE"]
style r1a fill:#f66,stroke:#333
style r2a fill:#f66,stroke:#333
style r3a fill:#f66,stroke:#333
style r4a fill:#f66,stroke:#333
style r1b fill:#9f9,stroke:#333
style r2b fill:#ff9,stroke:#333
style r3b fill:#9f9,stroke:#333
style r4b fill:#9f9,stroke:#333
Key Finding
pypsrp's connection pooling eliminates authentication failures entirely. All remaining failures are SSH tunnel TCP timeouts, not auth problems. This means pypsrp produces zero AD lockout risk even under pressure.
WinRM Restart Finding¶
Operational Hazard
Restarting the WinRM service from within a WinRM session (Restart-Service WinRM -Force)
can corrupt the service state, leaving it stopped and unrecoverable without a full
OS reboot.
Key details:
Restart-Service WinRM -Forcekills the service process without graceful shutdown- The service shows as "Stopped" but
Start-Servicefails with "Cannot open WinRM service" - The HTTPS listener binding is lost
- Only an OS reboot recovers the service
Mitigation: WSMan quota changes take effect immediately without a service restart. The restart handler should be removed from quota configuration roles.
Bottleneck Hierarchy¶
flowchart TD
L1["Layer 1: MaxConcurrentUsers<br/><b>Primary gate</b>"]
L1a["Determines how many sessions<br/>can authenticate simultaneously"]
L1b["Failures appear as NTLM<br/>auth errors (misleading)"]
L2["Layer 2: MaxShellsPerUser<br/><b>Secondary gate</b>"]
L2a["Not reached if<br/>MaxConcurrentUsers is hit first"]
L2b["Only relevant for<br/>long-lived parallel sessions"]
L3["Layer 3: SSH tunnel TCP pool<br/><b>~20 connections</b>"]
L3a["Infrastructure bottleneck,<br/>not WinRM"]
L3b["Causes timeouts,<br/>not auth failures"]
L3c["Does NOT trigger<br/>AD lockout"]
L1 --> L1a & L1b
L1 ==> L2
L2 --> L2a & L2b
L2 ==> L3
L3 --> L3a & L3b & L3c
style L1 fill:#f66,stroke:#333,stroke-width:2px
style L2 fill:#f96,stroke:#333,stroke-width:2px
style L3 fill:#ff9,stroke:#333,stroke-width:2px
Quota State Comparison¶
| Setting | Default | Elevated | Effect on 50-fork test |
|---|---|---|---|
| MaxConcurrentUsers | 10 | 25 | 9 -> 20 successes |
| MaxShellsPerUser | 30 | 100 | Not the bottleneck |
| MaxProcessesPerShell | 25 | 50 | Not tested in isolation |