TOOLS — the RE trifecta that cracked the Canon G6020 5B00 reset

This is the complete, reproducible enumeration of the reverse-engineering tooling used to recover the Canon MegaTank G6020 waste-ink ("5B00 ink absorber full") reset protocol. It is the workbench inventory — the capture rig, the decompiler rig, the instrumentation, and the offline analysis utilities — kept deliberately separate from the shippable native tool.

The two halves of this repo (read this first)

| Half | What it is | Where it lives | Ships? |
|---|---|---|---|
| Native reset tool | Linux-first pyusb resetter + safety gates + fleet Ansible | src/canon_megatank/, printers/, host/roles/canon_tool_reset/ | Yes: this is the product |
| RE / reference rig | The trifecta + utils below: captures, decompiles, instruments, analyzes the proprietary oracles | host/vm-capture/, scripts/, ghidra/, host/roles/canon_tool_dev/ | No: RE oracle / reproducibility only |

The proprietary tools (Canon Service Tool, WICReset/PrinterPotty, Canon firmware) are RE oracles only — never redistributed, never a production dependency (see AGENTS.md and docs/adr/0007-canon-tool-reverse-engineering.md). Every binary and capture they touch is gitignored; only the scripts that drive them and the curated findings (docs/research/) are tracked.

The validated result. The write cipher reproduces WICReset's genuine reset frame 23/23 byte-exact, the reset is cloud-independent, and it cleared 5B00 on real hardware (docs/runbook/g6020-native-reset.md). The trifecta below is how that ground truth was obtained and cross-checked.


The TRIFECTA at a glance

Three independent evidence sources, cross-correlated by wall-clock timestamp and the deterministic 20-byte payload envelope. No single lane is sufficient; each anchors the others. The loop is drawn in docs/diagrams/methodology-trifecta.mmd (render with just diagrams).

  LANE 1 — usbmon        LANE 2 — Frida              LANE 3 — Ghidra
  (host wire truth)      (Win11 VM IOCTL/DRM)        (offline decompile)
  dumpcap -i usbmonN     hook DeviceIoControl,       usbprint.sys IOCTL→URB,
  over 04a9:12fe         patch 3 cloud-DRM gates,    printerpotty.exe netfree
  passthrough            read live keyword           proof, devices.xml cipher
        │                       │                            │
        └──────────────► CORRELATE by timestamp ◄────────────┘
                    + deterministic 20-byte envelope
                    VALIDATED NATIVE RESET (pyusb)

The substrate under all three lanes is the Win11 capture VM with real USB passthrough of the printer, plus the analysis utilities that turn raw captures and decompiles into the verified protocol. Both are enumerated below.
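
The correlation step can be sketched roughly as follows. This is a minimal illustration, assuming each lane has already been reduced to timestamped payload events; the `Event` type, the `correlate` function, and the 0.5 s tolerance are hypothetical names and values, not taken from the repo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    lane: str       # illustrative label, e.g. "usbmon" or "frida"
    ts: float       # wall-clock seconds since the epoch
    payload: bytes  # the payload envelope carried by this event


def correlate(wire, app, tolerance_s=0.5):
    """Pair each wire event with the closest-in-time app event carrying the same payload."""
    pairs = []
    for w in wire:
        candidates = [
            a for a in app
            if a.payload == w.payload and abs(a.ts - w.ts) <= tolerance_s
        ]
        if candidates:
            best = min(candidates, key=lambda a: abs(a.ts - w.ts))
            pairs.append((w, best))
    return pairs
```

Requiring both a timestamp match and payload equality is what lets one lane anchor another: a timestamp alone can collide with unrelated traffic, and a payload alone can repeat across sessions.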


0. Substrate — the Win11 capture VM (libvirt / qemu IaC)

What it is. A throwaway Win11 guest under session-mode libvirt (qemu:///session, no root) on the capture host mbp-13 (Rocky Linux 10). Its entire reason to exist is the <hostdev> USB passthrough: Wine cannot surface USB to the closed Windows tools, so a real Win11 guest drives the printer natively while host-side usbmon records the bus. Three IaC layers make it the Windows equivalent of cloud-init.

Where it lives. host/vm-capture/ (see its README.md for the operator walk-through).

| Layer | File(s) | Role |
|---|---|---|
| Domain (interactive/SPICE) | host/vm-capture/canon-capture-win11.xml | q35 + OVMF UEFI; <hostdev> passes 04a9:1865/04a9:12fe to the guest; SPICE for hand-driving. HOME_ABS is substituted to the capture-user home at define time. |
| 1 · Unattended install | host/vm-capture/unattend/autounattend.xml | Zero-click Win11 install: TPM/SecureBoot bypass, disk, local admin cap, autologon (LogonCount 99). |
| 1 · WinRM bootstrap | host/vm-capture/unattend/SetupComplete.cmd, ConfigureRemotingForAnsible.ps1 | Enables WinRM in SYSTEM context before first logon (the FirstLogon attempt ran on the Public NAT profile and never opened 5985). |
| 2 · Provision (cloud-init) | host/vm-capture/ansible/provision.yml + inventory.yml | Ansible over WinRM/NTLM installs the Canon driver + maintenance tool + stages the reset driver. NTLM seals WS-Man so plain HTTP:5985 works without AllowUnencrypted. |
| 3 · Reset driver | host/vm-capture/win/drive-reset.ps1 | UIAutomation drives the reset GUI by control name (discovery-first: -Dump prints the control tree), not by pixel. |

How it is invoked. All via the Justfile (the single entrypoint):

just vm-capture-headless all      # build-iso → define → install → wait-winrm → provision
just vm-capture-headless capture  # host usbmon + drive ONE reset via Ansible/PS
# interactive/SPICE fallback:
just vm-capture setup|install|snapshot|capture|start|stop|status|detach

The driver scripts: scripts/vm-capture-headless.sh (headless: xorriso autounattend ISO → headless domain with a WinRM hostfwd 55985→5985 → unattended install → WinRM wait → ansible-playbook) and scripts/vm-capture.sh (SPICE lifecycle). The headless define-step rewrites the domain XML in place (drops <graphics>/<video>, adds the unattend CD + the qemu:commandline hostfwd) so one source XML serves both modes.
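
The in-place rewrite of the domain XML can be pictured with the sketch below. The real define step lives in the shell script, so this Python version is only an illustration of the transformation: `to_headless` is a hypothetical helper, the caller supplies the QEMU arguments (the actual hostfwd arguments are not reproduced here), and the unattend CD step is omitted.

```python
import xml.etree.ElementTree as ET

# libvirt's namespace for QEMU command-line passthrough
QEMU_NS = "http://libvirt.org/schemas/domain/qemu/1.0"
ET.register_namespace("qemu", QEMU_NS)


def to_headless(domain_xml: str, qemu_args: list[str]) -> str:
    """Drop <graphics>/<video> and append a <qemu:commandline> holding the given args."""
    root = ET.fromstring(domain_xml)
    devices = root.find("devices")
    if devices is not None:
        for tag in ("graphics", "video"):
            # findall returns a list, so removing while looping is safe
            for node in devices.findall(tag):
                devices.remove(node)
    cmdline = ET.SubElement(root, f"{{{QEMU_NS}}}commandline")
    for arg in qemu_args:
        ET.SubElement(cmdline, f"{{{QEMU_NS}}}arg", {"value": arg})
    return ET.tostring(root, encoding="unicode")
```

Because the function leaves every other device (including the <hostdev> passthrough) untouched, the same source XML can serve both the SPICE and headless modes.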

Reproduce from scratch (on mbp-13). Host prerequisites are provisioned by the canon_tool_dev role (§5); the manual one-time gates are documented in the vm-capture README:

  1. sudo setenforce 0 + persist SELINUX=permissive — the passt WinRM port-forward is blocked by SELinux Enforcing on EL10 (exit 126); permissive keeps AVC logging and also clears the USB-passthrough path.
  2. Stage (never committed) under ~/canon-tool-staging/: the Win11 ISO at iso/Win11_25H2_English_x64_v2.iso, and the Windows payload under win-payload/ (Canon G6020 driver EXE + ServiceTool/WICReset).
  3. just vm-capture-headless all then ... capture.

The managed <hostdev> grabs the G6020 from the host when the VM starts and hands it back on stop; while held, host CUPS/ipp-usb cannot use it (expected).

Background on why a VM and not Wine, and the spike history, is consolidated in the field guide (docs/research/canon-service-mode-field-guide.md).


1. Wire capture — usbmon / dumpcap / tshark (Lane 1, host wire ground truth)

What it is. The host-side USB bus tap. The Linux usbmon kernel module exposes /dev/usbmonN; dumpcap records it to .pcapng; tshark's USB dissectors decode URBs. This is the ground truth: every claim is checked against it, and when the static model and the wire disagree, the wire wins.

Where it lives.

  • Kernel + access (provisioned by canon_tool_dev): usbmon autoload (host/roles/canon_tool_dev/files/usbmon.modules.conf), the usbmon/wireshark groups + the udev rule (files/50-canon-g6020.rules), and dumpcap file-capabilities so capture is unprivileged.
  • Orchestration scripts: scripts/wicreset-capture.sh (free read, no key), scripts/r1-capture.sh (Wine + Service Tool capture with ipp-usb toggle), and the wire layer of scripts/wicreset-instrumented-capture.sh (the 3-layer run; §6 below).

How it is invoked.

just capture-read [label]     # free WICReset "Read waste counters" on mbp-13 (no key)
just capture-sync             # rsync the capture-host pcaps into ./captures/incoming/
# under the hood, on mbp-13:
dumpcap -i usbmon1 -w captures/<label>.pcapng -q
tshark -r captures/<label>.pcapng \
  -Y 'usb.transfer_type==0x03 and usb.endpoint_address in {0x03 0x86}' \
  -T fields -e frame.number -e usb.endpoint_address -e usb.capdata
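
The three fields emitted by the tshark command above can be turned into records with a few lines of Python. This is a sketch only: `BulkFrame` and `parse_fields` are hypothetical names, and it assumes tshark's default tab separator and a hex `usb.capdata` column (with or without colons, depending on the tshark version).

```python
from typing import NamedTuple


class BulkFrame(NamedTuple):
    frame: int
    endpoint: int
    data: bytes


def parse_fields(text: str) -> list[BulkFrame]:
    """Parse tab-separated rows of frame.number, usb.endpoint_address, usb.capdata."""
    frames = []
    for line in text.splitlines():
        cols = line.split("\t")
        if len(cols) != 3 or not cols[2]:
            continue  # skip rows that carry no captured data
        number, endpoint, capdata = cols
        frames.append(BulkFrame(
            frame=int(number),
            endpoint=int(endpoint, 0),                     # accepts "0x03" or "3"
            data=bytes.fromhex(capdata.replace(":", "")),  # tolerate colon-separated hex
        ))
    return frames
```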

Device identities to filter on: 04a9:1865 (normal mode), 04a9:12fe ("Printer in service mode"). The maintenance transport is usbprint VENDOR control on EP0 (VENDOR_SET IOCTL 0x220038 = bmRequestType 0x41 OUT; VENDOR_GET 0x22003c = 0xC1 IN).
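
To see how the IOCTL codes and bmRequestType bytes quoted above decompose, the following sketch splits them into their standard bit fields (the Windows CTL_CODE layout and the USB bmRequestType layout). The function names are illustrative, not from the repo.

```python
def decode_ioctl(code: int) -> dict:
    """Split a Windows IOCTL code into its CTL_CODE fields."""
    return {
        "device_type": (code >> 16) & 0xFFFF,
        "access": (code >> 14) & 0x3,
        "function": (code >> 2) & 0xFFF,
        "method": code & 0x3,
    }


def decode_bm_request_type(value: int) -> dict:
    """Split a USB bmRequestType byte into direction, type, and recipient."""
    types = ("standard", "class", "vendor", "reserved")
    recipients = {0: "device", 1: "interface", 2: "endpoint", 3: "other"}
    return {
        "direction": "IN" if value & 0x80 else "OUT",
        "type": types[(value >> 5) & 0x3],
        "recipient": recipients.get(value & 0x1F, "reserved"),
    }


if __name__ == "__main__":
    for code in (0x220038, 0x22003C):
        print(hex(code), decode_ioctl(code))
    for bm in (0x41, 0xC1):
        print(hex(bm), decode_bm_request_type(bm))
```

Under this decoding, 0x220038 and 0x22003c are function numbers 14 and 15 on device type 0x22, and 0x41 / 0xC1 are the vendor-type, interface-recipient request bytes in the OUT and IN directions respectively.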

Reproduce. Apply just host-apply (the canon_tool_dev role) to mbp-13, log out/in for group membership, then run a capture script. There is also a guest-side pktmon layer (pktmon start --capture, then etl2pcap) used only to answer the local-vs-cloud question; see §6.


2. Decompile rig — Ghidra + pyghidra (Lane 3, offline static RE)

What it is. Static reverse engineering of the proprietary binaries to recover structure the wire can never show: the IOCTL→URB field map, the net-free proof of the reset subtree, and the write cipher tables. Two engines are used: Ghidra analyzeHeadless (Jython 2.7 post-scripts) and pyghidra (CPython driving a saved, analyzed project headless on newer Ghidra).

Where it lives. Tracked scripts in ghidra/ (see its README.md); the binaries + project DB are gitignored under .ghidra-work/ (no redistribution). Curated findings are consolidated in the field guide (docs/research/canon-service-mode-field-guide.md).

The scripts (what each does):

| Family | Scripts | Purpose |
|---|---|---|
| Jython post-scripts (Service Tool) | dump_canon.py, dump_strings.py, trace_usb.py, trace_callers.py, vtable_probe.py, dump_named_vtable.py, find_and_decomp.py, parse_dialogs.py, find_msgmap.py, peek_obj.py | metadata/RTTI/string dumps; rank+decompile I/O-touching funcs; resolve C++ vtables (defeat virtual dispatch); RT_DIALOG control-ID → MFC AFX_MSGMAP_ENTRY handler → wire (the button→wire recipe). |
| v5103 deep-dives | v5103_servicemode_probe.py, v5103_absorber_extract.py, v5103_wireresolve.py, v5103_read_extract.py, v5103_readbody_extract.py, v5103_writers.py, v5103_innerchain.py, v5103_* | reconstruct the Service Tool service-mode reset handshake + readback codec. |
| pyghidra (CPython) | pyghidra_xref_decompile.py, pyghidra_decompile_xrefs.py, and the standalone runners under .ghidra-work/ (decomp_standalone.py, ioctl_scan.py, lane2_*) | byte-search exact strings in the stripped printerpotty.exe, resolve referencing functions, decompile; recovers clearCounters, the cipher chain, the IOCTL sites. |
| WICReset / APP.BIN | wicreset_resetflow.py, wicreset_netmap.py, wicreset_decrypt_trace.py, wicreset_template_*, wicreset_tmplsrc_*, wicreset_archive_des.py, appbin_extract.py, appbin_entropy.py | trace the reset orchestrator + cloud gates, the APP.BIN decrypt/mount chain, and the embedded devices.xml template DB (the cipher tables). |
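
The control-ID → handler step of the button→wire recipe amounts to walking an MFC message map. The sketch below is an illustration, not the repo's find_msgmap.py: it assumes the 32-bit AFX_MSGMAP_ENTRY layout (six 4-byte fields) and that the map bytes have already been extracted from the binary; `find_click_handler` and all IDs and addresses in the usage are hypothetical.

```python
import struct

# 32-bit AFX_MSGMAP_ENTRY: nMessage, nCode, nID, nLastID, nSig, pfn
ENTRY = struct.Struct("<6I")
WM_COMMAND = 0x0111
BN_CLICKED = 0


def find_click_handler(msgmap: bytes, control_id: int):
    """Return the handler address of the BN_CLICKED entry covering control_id, or None."""
    for off in range(0, len(msgmap) - ENTRY.size + 1, ENTRY.size):
        n_message, n_code, n_id, n_last_id, n_sig, pfn = ENTRY.unpack_from(msgmap, off)
        if n_sig == 0 and pfn == 0:
            break  # the all-zero terminator entry ends the map
        if n_message == WM_COMMAND and n_code == BN_CLICKED and n_id <= control_id <= n_last_id:
            return pfn
    return None


if __name__ == "__main__":
    # Synthetic two-entry map; IDs, signature value, and addresses are made up.
    demo = (
        ENTRY.pack(WM_COMMAND, BN_CLICKED, 1001, 1001, 0x3A, 0x00401230)
        + ENTRY.pack(WM_COMMAND, BN_CLICKED, 1002, 1002, 0x3A, 0x00404560)
        + ENTRY.pack(0, 0, 0, 0, 0, 0)
    )
    print(hex(find_click_handler(demo, 1002)))
```

The returned address is the function to decompile next; following its calls down to the I/O layer is what ties a GUI button to a wire frame.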

How it is invoked.

# Ghidra headless harness (binary + project DB gitignored under .ghidra-work/):
WORK=.ghidra-work
HEADLESS="$(dirname "$(readlink -f "$(command -v ghidra)")")/support/analyzeHeadless"
"$HEADLESS" "$WORK/project" canon-servicetool-v5103 \
  -process ServiceTool_v5103.exe -noanalysis \
  -scriptPath ghidra -postScript dump_canon.py "$WORK/out/v5103-report.md"

# pyghidra standalone (reuses the saved analyzed program — analyze=False):
GHIDRA_INSTALL_DIR=<ghidra> CMR_PROJ=.ghidra-work/project \
  CMR_PROG_NAME=printerpotty.exe \
  uv run --no-project --with pyghidra python ghidra/pyghidra_xref_decompile.py out.c "clearCounters,service.sendcmd"

The Justfile placeholder just ghidra <script> <args> points at this harness (ghidra/README.md); the harness intentionally lives outside nix develop (Ghidra 11.4.2 + JDK 21 + Wine are capture-host tooling, not the dev devShell).

Reproduce. Install Ghidra (nix, JDK 21), rsync the never-committed binary from mbp-13:canon-tool-staging/extracted/, one-time -import for full auto-analysis (PE + RTTI + decompiler param-id), then re-run any script with -process … -noanalysis against the saved program. Jython 2.7 gotchas (utf-8 header, getDefinedData walk) are in ghidra/README.md.


3. Dynamic instrumentation — Frida (Lane 2, Win11 VM IOCTL + DRM)

What it is. Runtime hooking of the Windows tools inside the capture VM, to see the plaintext command frame before it hits the wire, read the live 3-byte keyword, and (for the ground-truth capture) force WICReset to emit its own genuine reset with the cloud licensing gates neutralized. Frida bridges what the static decompile predicts and what the wire records.

Where it lives. host/vm-capture/win/ (the hooks + launchers).

The tool. frida-inject: the standalone CLI binary, no guest Python needed. The proven pin is frida-inject-x86-16.exe v16.5.9 (32-bit, matching the 32-bit printerpotty.exe; image base 0x400000); the headless staging script also fetches frida-inject 17.x for the 1284/session hooks. Launch pattern (from the runbooks + script headers):

frida-inject-x86-16.exe -f "C:\PROGRA~2\PRINTE~1\PRINTE~1.EXE" \
    -s C:\canon\<hook>.js -R v8 > C:\canon\<log>.log 2>&1
# v16: do NOT use -o; do NOT combine -e with stdout redirect.
# -p <pid> to attach; -R qjs|v8 runtime; -e eternalize (keep script after injector exits).

Session-0 vs session-1 matters: WinRM is session 0, but Frida's agent bootstrap stalls there, so the target is spawned in interactive session 1 via a Scheduled Task running as the autologon cap user (see sess1-appbin-dump.ps1).

The hooks (host/vm-capture/win/):

| Hook | What it does |
|---|---|
| frida-1284clamp-hook.js | clamp nOutBufferSize 5000→4096 for IOCTL 0x220034 (GET_1284_ID). Root cause: usbprint.sys (Win11 26100.8328) caps the GET_1284_ID OUT buffer at one page; WICReset's deep read asks 5000 → ERROR_CRC. Pure app→kernel arg fix, no driver bind, no key. |
| frida-session-capture-hook.js | session-capture: merges the clamp (extended to 0x22003c VENDOR_GET) + a full in/out hex trace of every DeviceIoControl (raised cap for the tiny vendor frames 0x220038/0x22003c/0x16000c). Captures the encrypted series-name read before the key field. NO key spent. |
| frida-drm-reset-hook.js | DRM-bypass: patch 3 cloud gates JZ(0x74)→JMP(0xEB) (0x44012d RESET_GUID, 0x44054a QUERY_KEYS, 0x440563 valid-bit; exact for sha256 a199447db…564b3e8, with a 0x74 guard that aborts on version drift) so net-free clearCounters runs; also connect()-replace for instant cloud fast-fail + full VENDOR IOCTL trace. |
| frida-usbprint-driver.js | usbprint-driver: open the {28d78fad} GUID_DEVINTERFACE_USBPRINT handle ourselves and issue our derived functor-3 frames straight through VENDOR_SET/VENDOR_GET, bypassing WICReset entirely. usbmon records the exact URB for the native Linux tool to replicate. |
| frida-wicreset-hook.js | base app-layer capture: CreateFile/DeviceIoControl (minidriver path) + WinUsb_* (if a build talks WinUSB directly) + wininet/winhttp/ws2_32 connect tracing (corroborate local-vs-cloud). |
| appbin-dump.js | in-memory cleartext dump of printerpotty.exe's APP.BIN decrypt/mount/inflate path (hooks the decrypt orchestrator, header-strip, buffer-append, inflate, dotted-path accessor) → raw .bin via the Frida File API. NO key, NO device, NO cloud. |
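
The guarded JZ→JMP patch described for frida-drm-reset-hook.js follows a verify-then-write pattern. The real hook is Frida JavaScript operating on live process memory; the Python model below only illustrates the logic over a flat buffer that stands in for the mapped image (so offset = address minus image base, an assumption of this sketch). `patch_gates` is a hypothetical name.

```python
IMAGE_BASE = 0x400000                   # image base stated for the 32-bit printerpotty.exe
GATES = (0x44012D, 0x44054A, 0x440563)  # addresses of the three cloud gates
JZ_SHORT, JMP_SHORT = 0x74, 0xEB


def patch_gates(image: bytearray, gates=GATES, base=IMAGE_BASE) -> None:
    """Rewrite JZ rel8 to JMP rel8 at each gate, touching nothing if any site mismatches."""
    offsets = [va - base for va in gates]
    # Guard first: verify every site before modifying any of them.
    for va, off in zip(gates, offsets):
        if image[off] != JZ_SHORT:
            raise ValueError(
                f"version drift at {va:#x}: found {image[off]:#04x}, expected 0x74"
            )
    for off in offsets:
        image[off] = JMP_SHORT
```

Checking all sites before writing any of them means a build with shifted code is left fully unmodified rather than half-patched.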

Launchers / orchestration: run-frida-capture.ps1 (guest-side -Setup stage / -Launch spawn-under-Frida, writes a wall-clock anchor for pcap correlation; observational only — the operator enters the key over VNC) and sess1-appbin-dump.ps1 (the session-1 Scheduled-Task launcher for the appbin dump).

Reproduce. Bring up the VM (§0), stage frida-inject + the hook (the instrumented-capture script base64-chunks files into the guest over WinRM), then launch per the pattern above. Evidence trail is consolidated in the field guide (docs/research/canon-service-mode-field-guide.md).


4. GUI drive — VNC + UIAutomation (the human-in-the-loop click)

What it is. The reset click (and key entry, when WICReset is used) is driven by the operator, deliberately, so the human controls the single-use key. Two mechanisms:

  • UIAutomation (host/vm-capture/win/drive-reset.ps1) — invokes controls by Name/AutomationId (robust, scriptable). Discovery-first: -Dump prints the live control tree so selectors can be pinned for the closed-source GUI.
  • VNC / SPICE — interactive fallback for the one click that UIAutomation can't yet pin, and for entering the OctoInkjet key during a WICReset ground-truth run. The interactive domain (canon-capture-win11.xml) exposes SPICE on 127.0.0.1; on mbp-13 the capture host's headless GUI tooling (Xvfb, xdotool, scrot, installed by the canon_tool_dev role into the nix profile) supports screenshot/automation of the Wine GUI for the free-read path.

This lane is intentionally the smallest: Frida never enters the key; the human does, once, over VNC.


5. Host provisioning — Ansible role canon_tool_dev (the capture/RE env)

What it is. The idempotent Ansible role that turns a bare mbp-13 into the capture + RE workbench. It is not the fleet-deploy role (that is canon_tool_reset, which installs the shippable native tool — kept separate).

Where it lives. host/roles/canon_tool_dev/ (tasks, defaults, files, templates, handlers, README.md); driven by host/playbooks/canon-tool-dev.yml.

What it provisions: Wireshark/libpcap + python3-pyusb/libusb; QEMU + libvirt + edk2-ovmf (the VM substrate); Wine via Flatpak with --device=all USB + staging-dir access (WineHQ dropped EL RPMs); usbmon autoload + the Canon udev rule; the printstack/wireshark/usbmon group membership + dumpcap caps that make capture unprivileged; a scoped NOPASSWD sudoers drop-in (ipp-usb toggle only); and Xvfb/xdotool/scrot into the nix profile for headless GUI automation.

How it is invoked.

just host-check                 # ansible-playbook --syntax-check (no host contact)
just host-dry                   # --check --diff (shows changes, applies nothing)
just host-apply ['--tags sudo,groups']   # apply to mbp-13 (become pw via $BECOME_PASSWORD_FILE)

Reproduce. just host-apply against mbp-13, then log out/in (or newgrp wireshark; newgrp printstack) for group membership.


6. Analysis utilities (turn raw captures + decompiles into the verified protocol)

These are offline, hardware-free Python utilities (in the dev devShell / a local .venv) that consume the trifecta's raw output and produce the verified protocol. Each cites its ground truth in docs/research/.

| Util | What it does | Ground truth |
|---|---|---|
| scripts/appbin_decrypt.py | Decrypt printerpotty.exe's APP.BIN container: strip 4-byte footer → 3DES-EDE3-CBC (zero key, zero IV; empty-wxString construction) → strip PKCS pad → zlib inflate → the devices.xml template DB. Self-contained DES (no OpenSSL dep). | docs/research/canon-service-mode-field-guide.md |
| scripts/appbin_oracle.py | Validation oracle + container model for APP.BIN (PE resource offset/size, entropy, block-alignment, the FUN_00530ae0 mount pipeline). Confirms the decrypt against the static model. | docs/research/canon-service-mode-field-guide.md |
| scripts/canon_sr5_cipher.py | The CANON-SR5 reference cipher + encoder: reproduces the maintenance command-frame transform for the G6000/G6020 family. Reads all substitution tables directly from the decrypted devices.xml (never hard-codes them); functor-2/3 with the validated SUBJECT/SEED role swap that yields the 23-byte set_command. | docs/research/canon-service-mode-field-guide.md |
| scripts/g6020_wire_codec_crack.py | Offline crack of the service-mode readback wire codec from a 40-session dataset: 0x84 fully recovered (XOR stream over a constant 20-byte plaintext with a fixed keyword-byte selection table; 40/40 byte-exact); 0x8c documented as an open nonlinear item. NO device touched. | docs/research/canon-service-mode-field-guide.md |
| scripts/parse-wicreset-capture.py | The turnkey pcap extractor: pull EVERY EP0 control transfer to/from the service-mode device (bmRequestType/bRequest/wValue/wIndex/data + responses) + bulk frames, flag the absorber-reset frame, emit ordered/annotated/--json/--replay-snippet. Thin wrapper around tshark. | docs/research/canon-service-mode-field-guide.md |
| scripts/safe-ping-probe.py | Read documented-safe baseline (IEEE-1284 device-id, USB descriptors) and emit YAML for maintenance.yaml::ping_suite_baseline. No bulk-OUT, no vendor commands, no EEPROM. | scripts/AGENTS.md |
| scripts/experiment-handshake-reset.py | Live discriminator (debug unit only) for Lane A's recovered handshake: sends the candidate session-open→preamble→payload with a few GUESSED runtime bytes. Not production code. | docs/research/canon-service-mode-field-guide.md |
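
The shape of the 0x84 result (an XOR stream over a constant plaintext, keyed by a fixed keyword-byte selection table) can be illustrated with a toy model. This is not the method used in g6020_wire_codec_crack.py, whose internals are not shown here; it is a sketch of one way such a table could be recovered when each session's keyword and ciphertext are both known. The function names and all data in the usage are synthetic.

```python
def xor_stream(data: bytes, keyword: bytes, selection: list[int]) -> bytes:
    """XOR each byte with the keyword byte picked by the selection table (self-inverse)."""
    return bytes(b ^ keyword[selection[i]] for i, b in enumerate(data))


def recover_selection(sessions: list[tuple[bytes, bytes]]) -> list[int]:
    """Recover the table from (keyword, ciphertext) pairs over one constant plaintext.

    For each position, keep the keyword index whose XOR gives the same byte in every session.
    """
    length = len(sessions[0][1])
    table = []
    for i in range(length):
        fits = [
            j for j in range(len(sessions[0][0]))
            if len({ct[i] ^ kw[j] for kw, ct in sessions}) == 1
        ]
        if len(fits) != 1:
            raise ValueError(f"position {i} is ambiguous or inconsistent: {fits}")
        table.append(fits[0])
    return table


if __name__ == "__main__":
    plain = bytes(range(20))              # synthetic constant plaintext
    table = [i % 3 for i in range(20)]    # synthetic selection table
    keywords = [bytes([1, 2, 4]), bytes([8, 16, 32]), bytes([3, 5, 9])]
    sessions = [(kw, xor_stream(plain, kw, table)) for kw in keywords]
    print(recover_selection(sessions) == table)
```

More sessions make the recovery more robust: a wrong index only survives if the XOR of two keyword bytes happens to be identical in every session.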

The 3-layer instrumented capture that fuses the lanes is scripts/wicreset-instrumented-capture.sh (preflight | stage | rehearse | capture | anchor | stop): WIRE (host usbmon/dumpcap) + APP (guest Frida) + NET (guest pktmon) on one reset, with a shared wall-clock anchor so all three streams correlate to the exact transfer before the EEPROM commit. The operator drives the key over VNC; the script only instruments, captures, and pulls artifacts.

How they are invoked.

just analyze <pcap>                  # canon_megatank.pcap model over a capture
just parse-capture <pcap> [--json]   # the annotated control-transfer extractor
just model                           # the formal protocol model + Hypothesis property tests
# offline cipher utils run directly (no hardware), e.g.:
python3 scripts/appbin_decrypt.py <APP.BIN> ; python3 scripts/canon_sr5_cipher.py

Reproduce. just setup (the dev devShell via direnv/nix develop), then run the utils; only the pcap extractor (parse-wicreset-capture.py) and the live experiment need the capture host (tshark / the printer).
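
The stage order listed for scripts/appbin_decrypt.py (footer strip → block decrypt → pad strip → inflate) can be sketched with the cipher injected as a callable, which keeps the example standard-library only. `unpack_container` and `strip_pkcs_pad` are hypothetical names, the 8-byte block size follows from 3DES, and the use of `zlib.decompress` assumes a zlib-wrapped stream rather than raw deflate.

```python
import zlib
from typing import Callable


def strip_pkcs_pad(data: bytes, block_size: int = 8) -> bytes:
    """Remove PKCS#5/#7 padding after validating it."""
    if not data or len(data) % block_size:
        raise ValueError("length is not a positive multiple of the block size")
    n = data[-1]
    if not 1 <= n <= block_size or data[-n:] != bytes([n]) * n:
        raise ValueError("bad PKCS padding")
    return data[:-n]


def unpack_container(blob: bytes, decrypt_cbc: Callable[[bytes], bytes]) -> bytes:
    """Footer strip -> block decrypt (injected) -> pad strip -> zlib inflate."""
    body = blob[:-4]            # drop the 4-byte footer
    plain = decrypt_cbc(body)   # stands in for the 3DES-EDE3-CBC stage
    return zlib.decompress(strip_pkcs_pad(plain))
```

Passing the decrypt stage in as a function makes each surrounding stage testable on its own, for example with an identity function in place of the cipher.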


Cross-references

  • RE findings (per protocol claim): the consolidated field guide docs/research/canon-service-mode-field-guide.md — transport, write cipher, cloud-independence, readback codec, and the button→wire decompile recipe.
  • Runbooks (operational evidence): docs/runbook/g6020-native-reset.md (the validated native reset, 23/23 byte-exact); the capture-pipeline, encrypted-session, and rig-spike evidence is consolidated in the field guide docs/research/canon-service-mode-field-guide.md.
  • Formal protocol: docs/spec/megatank-maintenance-protocol.md + src/canon_megatank/protocol/model.py (run just model).
  • Diagrams: docs/diagrams/methodology-trifecta.mmd (this loop), exploit-dataflow.dot, drm-bypass-controlflow.dot, lifecycle.mmd, maintenance-state-machine.mmd.
  • Operating contract / ethics: AGENTS.md, docs/adr/0007-canon-tool-reverse-engineering.md, ETHICS/, INTEROP.md.
  • Capture-host provisioning: host/roles/canon_tool_dev/README.md; VM rig: host/vm-capture/README.md.