Scanning Engine v2 - Quick Start Guide

Note: The SEv2 glossary may help explain the terms used in this document.

Credential Management

SEv2 supports more detailed logging and convenient impersonation for individual clients. Credentials can be created that operate as themselves, or always operate as a shared client ID but with user-based logging. This allows you to set up your users in one of three ways:

  1. Shared Credentials: a group of people share a set of credentials, allowing them all to perform the same actions. There is no way to determine which person performed which action. A credential breach requires everyone to update the shared credentials, and a member leaving the group may require the same.

  2. Fully Individual Credentials: each person has their own account with access to a shared topic, allowing them to submit jobs to that topic and to view and control their own jobs. The individual client ID for the credentials will appear in all results.

  3. Individual Credentials with Auto-Impersonation: each person has their own account, but all operations occur as if they were done by a shared client ID. Logging will note which real client ID performed an operation, but in all other ways this style of credential will operate as shared credentials.

To facilitate debugging, we now provide the /krang/v1/client endpoint, which displays the details of the credentials used to communicate with it.
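
For example, a quick credentials check might look like the following sketch. The base URL, token, and Bearer authentication scheme are placeholders; substitute whatever your account actually uses:

import requests

# Placeholder API host and token -- substitute your own values and
# authentication scheme.
BASE_URL = "https://krang.example.com"
TOKEN = "YOUR-AUTH-TOKEN"

# Ask Krang to describe the credentials used to make this request.
resp = requests.get(
    f"{BASE_URL}/krang/v1/client",
    headers={"Authorization": f"Bearer {TOKEN}"},
)
resp.raise_for_status()
print(resp.json())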

Abuse Complaints

As an enterprise customer you have been assigned dedicated metaclusters of Minions that will execute the jobs you submit. These machines run HTTP servers on port TCP/80 that redirect to this page. You can view the Minions that comprise your metaclusters at the /krang/v1/minions endpoint. When that endpoint is accessed without authentication, it provides the addresses of all Minions under Krang's control, but no other details; your Minions will be a subset of those addresses. If we receive an abuse complaint or a request to be added to a blocklist, we will honor it across all Minions, including yours.
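
For example (a sketch, using the same placeholder host as above; without authentication the response contains addresses only):

import requests

BASE_URL = "https://krang.example.com"  # placeholder host

# Unauthenticated: returns the addresses of all Minions under Krang's
# control, but no other details.
resp = requests.get(f"{BASE_URL}/krang/v1/minions")
resp.raise_for_status()
print(resp.json())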

Job Lifecycle

Submission

Jobs can be submitted to the /krang/v1/job endpoint in JSON format using an authentication token. As described below, and in the API documentation, the JSON format of jobs has changed. Depending on the permissions associated with the submitting credentials, a job may specify the client_id and topic_id to associate with it. The client_id of a job determines who can control it, while the topic_id determines the Minions on which it runs.

We have added the ability to add comments in job definitions via the $comment key. This allows you to explain the source of a job template, what it's useful for, or why a module invocation uses a specific configuration. This is especially useful in large scan templates. Comments are stripped from job definitions on submission, and not even logged.
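
For example (a sketch of a job definition as a Python dict, ready to be submitted as JSON; the comments themselves are illustrative):

job = {
    "$comment": "Template from the quarterly exposure review.",
    "targets": ["example.com"],
    "modules": [
        {
            # Comments can also document individual module invocations.
            "$comment": "noop echoes its configuration; used for dry runs.",
            "name": "noop",
            "ports": ["TCP/443"],
        }
    ],
}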

When you submit a job you may include a client_ref key. This can be used as an alternative to the job's randomly-assigned ID when querying the job's status or definition. Note that client references must be unique per client. For example, client 1 and client 2 can both submit jobs with "client_ref": "Alpha-Alpha 3-0-5", but neither could then reuse that reference for any other job. Attempting to re-use a reference will result in a 409 Conflict HTTP response.

A successful job submission will result in a 200 OK HTTP response, and the HTTP response body will contain:

{
  "job_id": <str:uuid>,
  "bootstrap_task_id": <str:uuid>,
  "bootstrap_shard_id": <str:uuid>
}

Typically only the job_id will be of interest to you; the other keys exist for debugging the job's task tree.
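
Putting the above together, a submission might look like this sketch. It assumes client_ref is a top-level key of the job document and a Bearer-token Authorization header; the host and token are placeholders:

import requests

BASE_URL = "https://krang.example.com"  # placeholder
TOKEN = "YOUR-AUTH-TOKEN"               # placeholder

job = {
    "client_ref": "Alpha-Alpha 3-0-5",  # optional; must be unique per client
    "targets": ["example.com"],
    "modules": [{"name": "noop", "ports": ["TCP/443"]}],
}

resp = requests.post(
    f"{BASE_URL}/krang/v1/job",
    json=job,
    headers={"Authorization": f"Bearer {TOKEN}"},
)
if resp.status_code == 409:
    raise SystemExit("client_ref has already been used by this client")
resp.raise_for_status()
job_id = resp.json()["job_id"]  # bootstrap_*_id keys are for debugging only
print(job_id)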

Status

To view the status of a job and determine whether it is running, has completed, or has failed, use the /krang/v1/job/{job_id} endpoint. This endpoint is currently updated asynchronously, once per minute. We intend to provide an HTTP(S)-based callback feature at a later date, in the hope of encouraging a callback workflow instead of a polling workflow.
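
Until the callback feature ships, polling might look like the sketch below. The name and values of the status field are assumptions for illustration (check the API documentation), and since the endpoint is refreshed only once a minute there is no point polling faster than that:

import time

import requests

BASE_URL = "https://krang.example.com"  # placeholder
TOKEN = "YOUR-AUTH-TOKEN"               # placeholder

def wait_for_job(job_id: str) -> dict:
    while True:
        resp = requests.get(
            f"{BASE_URL}/krang/v1/job/{job_id}",
            headers={"Authorization": f"Bearer {TOKEN}"},
        )
        resp.raise_for_status()
        job = resp.json()
        # "status" and its values are assumed field names for illustration.
        if job.get("status") in ("completed", "failed"):
            return job
        time.sleep(60)  # the endpoint only updates once per minute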

Results

As part of the change from v1 to v2, we have updated the format of our results, shown below. For a transition period we plan to make both v1 and v2 formats available simultaneously, performing some imperfect v2-to-v1 translations in our data pipeline. Currently, only v1 format data is available via the Stream API, using the exact same API endpoints as with SEv1.

The main differences in the result formats are:

  1. A JSON schema is defined and available for the v2 result format, and for every module's .body in the result.
  2. .body and .error have been promoted out of the defunct .result object, to prevent confusion between multiple uses of "result".
  3. .origin now only explains the environment in which a result was produced.
  4. .task explains the reason a result was produced.
  5. ip could previously contain either IP addresses or hostnames, so it has been replaced with address and hostname, either or both of which may be set, depending on a module's targeting method.
  6. Additional execution information for debugging is contained in the .task.id and .task.module_index fields.

v1 Format

{
  "origin": {
    "ts": [int:epoch-milliseconds],
    "provider": [str:provider-name],
    "region": [str:provider-specific-region],
    "country": [str:2-digit-country-code],
    "ip": [str:source-address],
    "port": [int:source-port],
    "client_id": [str:client-id],
    "job_id": [str:uuid],
    "type": [str:module-name],
    "module": [str:grabber-or-scanner],
    "minion": [str:truncated-hostname]
  },
  "target": {
    "ip": [str:destination-address],
    "port": [int:destination-port],
    "protocol": [str:transport-protocol-name]
  },
  "result": {
    "error": [obj:module-error],
    "data": [obj:module-output]
  }
}

v2 Format

{
  "task": {
    "id": [str:uuid],
    "topic_id": [str:topic-id],
    "client_id": [str:client-id],
    "job_id": [str:uuid],
    "module_name": [str:module-name],
    "module_index": [int:module-index]
  },
  "origin": {
    "ts": [int:epoch-milliseconds],
    "provider": [str:provider-name],
    "region": [str:provider-specific-region],
    "country": [str:2-digit-country-code],
    "metacluster": [str:metacluster-of-minion],
    "address": [str:source-address],
    "port": [int:source-port]
  },
  "target": {
    "hostname": [str:destination-hostname],
    "address": [str:destination-address],
    "port": [int:destination-port],
    "protocol": [str:transport-protocol-name]
  },
  "error": [obj:module-error],
  "body": [obj:module-output]
}
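
To make the relationship between the two formats concrete, the following sketch translates a v2 result into the v1 shape. It only illustrates the field mappings above; it is not the translation our pipeline performs, which is noted as imperfect (for example, v1 has a single target ip field, while v2 may carry an address, a hostname, or both):

def v2_to_v1(r: dict) -> dict:
    """Illustrative v2-to-v1 result translation (not the pipeline's code)."""
    task, origin, target = r["task"], r["origin"], r["target"]
    return {
        "origin": {
            "ts": origin["ts"],
            "provider": origin["provider"],
            "region": origin["region"],
            "country": origin["country"],
            "ip": origin["address"],         # v1 "ip" held the source address
            "port": origin["port"],
            "client_id": task["client_id"],  # moved into .task in v2
            "job_id": task["job_id"],
            "type": task["module_name"],
            # v1 "module" (grabber-or-scanner) and "minion" have no exact v2
            # counterpart; v2 "metacluster", "topic_id", "task.id", and
            # "task.module_index" are dropped -- the translation is lossy.
        },
        "target": {
            # v1 "ip" could hold an address or a hostname; prefer the address.
            "ip": target.get("address") or target.get("hostname"),
            "port": target["port"],
            "protocol": target["protocol"],
        },
        "result": {
            "error": r.get("error"),
            "data": r.get("body"),
        },
    }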

Job Definitions

Portscanning

In SEv1, jobs were one of two types: scan or grab. scan jobs would run the previously-hidden portscan scanning module against the targets and then run the requested scanning modules¹. grab jobs would skip portscan, running the scanning modules against all specified targets and ports.

SEv2 does not differentiate job types, but instead gives access to the portscan module directly, and offers a flag on the new bootstrap module that controls whether portscan module invocations are implicitly added to a job's task tree.

In SEv2, if we wanted to run the noop module (which echoes its configuration parameters as a result without performing any scanning) and nonetheless pointed it at port TCP/443, the job would appear as follows:

{
  "targets": [
    "example.com"
  ],
  "modules": [
    {
      "name": "noop",
      "ports": [
        "TCP/443"
      ]
    }
  ]
}

This would cause the bootstrap module, which evaluates the job's structure, to add an implicit module invocation of the portscan scanning module on port TCP/443. The implicit job that would be run is:

{
  "targets": [
    "example.com"
  ],
  "modules": [
    {
      "name": "bootstrap",
      "config": {
        "portscan": true
      }
    },
    {
      "name": "portscan",
      "ports": [
        "TCP/443"
      ]
    },
    {
      "name": "noop",
      "ports": [
        "TCP/443"
      ]
    }
  ]
}

The above SEv2 job is equivalent to an SEv1 scan job. It is also possible to disable portscan by passing "portscan": false in the bootstrap module's configuration. If only a portscan is desired, it can be invoked directly:

{
  "targets": [
    "8.8.8.8"
  ],
  "modules": [
    {
      "name": "portscan",
      "ports": [
        "udp/53"
      ],
      "config": {
        "probe": "WOULDN'T YOU LIKE TO BE A PEPPER TOO?"
      }
    }
  ]
}
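
Conversely, a job that disables the implicit portscan invokes bootstrap explicitly with the flag set to false (shown here as a Python dict, with noop so that nothing is actually scanned):

job = {
    "targets": ["example.com"],
    "modules": [
        # Explicit bootstrap invocation suppresses the implicit portscan.
        {"name": "bootstrap", "config": {"portscan": False}},
        {"name": "noop", "ports": ["TCP/443"]},
    ],
}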

Ports

SEv1 permitted ports and port ranges to be specified in its jobs, but treated them as top-level keys below which modules were listed and configured. SEv2 reverses this convention by treating modules as top-level entities to be invoked with different configurations. In SEv1, a job to run the snmp module on several ports with different protocols would require separate declarations:

{
  "type": "scan",
  "options": [
    {
      "targets": [
        "example.com"
      ],
      "ports": [
        {
          "port": "161-162,10161-10162",
          "protocol": "tcp",
          "modules": [
            "snmp"
          ],
          "config": {
            "connect-timeout": "10s"
          }
        },
        {
          "port": "161-162,10161-10162",
          "protocol": "udp",
          "modules": [
            "snmp"
          ],
          "config": {
            "connect-timeout": "10s"
          }
        }
      ]
    }
  ]
}

SEv2 combines everything into a single module invocation:

{
  "targets": [
    "example.com"
  ],
  "modules": [
    {
      "name": "snmp",
      "ports": [
        "TCP/161-162",
        "TCP/10161-10162",
        "UDP/161-162",
        "UDP/10161-10162"
      ],
      "config": {
        "connect-timeout": "10s"
      }
    }
  ]
}

Protocols are always included when referencing a port in SEv2, because ports are aspects of transport-layer protocols and make no sense in isolation. Ports can be specified using either / or : as a separator, using either the full protocol name (e.g., TCP) or the first letter (e.g., t), and using uppercase or lowercase letters.
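
As a sketch, the accepted forms can be normalized like this (the grammar is inferred from the rules just described, not taken from SEv2's parser):

import re

# Matches forms like "TCP/443", "tcp:443", "t/443", or "U:161-162" and
# normalizes them to the "TCP/..." / "UDP/..." style used in this guide.
_PORT_SPEC = re.compile(r"^(tcp|udp|t|u)[/:](.+)$", re.IGNORECASE)

def normalize_port_spec(spec: str) -> str:
    m = _PORT_SPEC.match(spec)
    if m is None:
        raise ValueError(f"bad port spec: {spec!r}")
    protocol = {"t": "TCP", "u": "UDP"}[m.group(1)[0].lower()]
    return f"{protocol}/{m.group(2)}"  # port number, range, or keyword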

Two keywords are allowed to take the place of port numbers: all and any. all is currently only supported by the portscan module and indicates that it should scan ports 1-65535 of that protocol; scanning TCP/0 and UDP/0 is not supported. any is usable by all other scanning modules and indicates that the module accepts targets that produced positive results in certain other modules, regardless of the port number. Currently only portscan can cause another module to be executed. For example, if you wanted to scan every TCP port on a host, identify the services listening on those ports, and try to get any TLS and X.509 information available, you could write a job as follows:

{
  "targets": [
    "example.com"
  ],
  "modules": [
    {
      "name": "portscan",
      "ports": [
        "TCP/all"
      ]
    },
    {
      "name": "service-simple",
      "ports": [
        "TCP/any"
      ]
    },
    {
      "name": "ssl-simple",
      "ports": [
        "TCP/any"
      ]
    }
  ]
}

Scanning Modules

While we have tried to point out the most important changes to the scanning modules here, this list is incomplete and evolving. Please check the SEv2 Module Documentation for details on how each module can be configured.

New Modules

Many new modules have been added; the three most important are described here.

The bootstrap module acts as the Unix init process for a job's task tree. The module provides some job validation and creates the initial tasks to be executed. By explicitly invoking this module it is possible to disable implicit portscans.

The portscan module is now exposed and can be invoked directly. Note that both open source and proprietary portscanners are wrapped by this module and chosen based upon the configuration and targets.

The noop module is useful for both debugging and testing jobs. It is incapable of generating network traffic and is intended exclusively for ad hoc jobs run manually. Do not include it in any job templates.

Name Changes

The doublepulsar module has been removed as it was deemed no longer useful.

The rdpeudp module has been renamed to rdp-udp. Microsoft appears to use both names in their documentation, but the latter is much clearer about what it refers to.

The malware-simple module was previously an alias for the service-simple module, but with the configuration option alternative-probes set to malware. We have removed this alias in SEv2, while keeping the malware probes file available.

Configuration Changes

Note that this list of changes does not include changes to undocumented options.

The most common changes are type changes -- string to integer, comma-delimited string to list -- and name changes (_ -> -). Arguments have also been renamed where appropriate to provide consistency across all modules.
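
For example, using the service-simple changes listed below, a v1 configuration and its v2 equivalent would look like this (probe names and values are illustrative):

# SEv1: underscore names, comma-delimited strings.
v1_config = {"custom_probes": "probe-a,probe-b", "probe_rarity": 6}

# SEv2: hyphenated names, arrays of strings.
v2_config = {"custom-probes": ["probe-a", "probe-b"], "probe-rarity": 6}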

elasticsearch

  • ipv6 will now be set appropriately by the Minion and should never be set in a job. IPv4 and IPv6 addresses passed to this module will be grouped separately.

ftp

  • ftps was a string, but is now ssl and a boolean to be consistent with other modules and prevent confusion.
  • max_recursion was a string, but is now max-depth and an integer.

kubernetes

  • full_mode is now full-mode.
  • ipv6 will now be set appropriately by the Minion and should never be set in a job. IPv4 and IPv6 addresses passed to this module will be grouped separately.

mqtt

  • mqtt_mode is now version, because it controls the version of the MQTT protocol used.
  • mqtts is now ssl, to be consistent with other modules.

mqttinfo

  • ipv6 will now be set appropriately by the Minion and should never be set in a job. IPv4 and IPv6 addresses passed to this module will be grouped separately.
  • mqtts is now ssl, to be consistent with other modules.

rsync

  • full_mode is now full-mode.

service

  • user_agent was removed because it was only used in grab mode. The user agent for HTTP requests issued by Nmap scripts can still be set by configuring script-args to contain a list with an element like "http.useragent=TtlyLegitBrowser".

service-simple

  • custom_probes was a comma-delimited string, but is now custom-probes and an array of strings.
  • prioritize_probes was a comma-delimited string, but is now named probe-order and an array of strings.
  • probe_rarity is now probe-rarity.

socks

  • ipv6 will now be set appropriately by the Minion and should never be set in a job. IPv4 and IPv6 addresses passed to this module will be grouped separately.

sslv2

  • cert_only is now cert-only.
  • check_crl was removed due to lack of use and its performance impact when used.
  • cypher_only was a comma-delimited string, but is now named cipher-only and an array of strings.
  • exclude_cyphers was a comma-delimited string, but is now exclude-ciphers and an array of strings.
  • robot_only is now robot-only.

websocket

  • ipv6 will now be set appropriately by the Minion and should never be set in a job. IPv4 and IPv6 addresses passed to this module will be grouped separately.
  • http_path is now path.
  • https is now ssl, to be consistent with other modules.

webv2

API Endpoints

While we have tried to point out the most important changes to the API here, the API is ever-evolving. Please check the SEv2 Documentation for details on its usage, or check out the dynamically-generated SEv2 API documentation.


  1. The reality was slightly more complicated. portscan tasks, like most tasks in SEv1 and SEv2, were broken up into shards across the list of targets. In SEv1, all shards for child tasks were registered with the system after the completion of each task shard. In SEv2, a shard for a child task is registered as soon as a scanning module has emitted enough positive results to fill the maximum number of targets that a shard of the child task may contain.