API endpoints
In some software development ecosystems, the data acquisition process must be kept separate from the import control. This is usually achieved with an agent and controller setup, in which the two tasks run in isolated networks. For agents to collect data at scheduled intervals and hand it to the controller server for further processing and import, the two sides need to exchange status and registration data. Two APIs handle this exchange. This document gives an overview of both APIs and can serve as a starting point for setting up agent-based networking in a virtualized ecosystem.
Scraper agent web API
When the Docker instance of the agent runs in ‘Daemon’ mode, it provides a web API to collect status information about the agent and to immediately start a scrape operation. By default, the web API server runs on port 7070. The API uses JSON as its output format. The following endpoints are provided:
`/status`
: Check the status of the scrape process. Returns a body containing a JSON object with keys `ok` and `message`. If a scrape is in operation, then a `200` status code is returned and `ok` is set to `true`. Otherwise, a `503` status code is returned and `ok` is set to `false`. `message` provides a human-readable description of the status.
`/scrape`
: Request a scrape operation. This request must be POSTed, otherwise a `400` error is returned. If a scrape is in operation, then a `503` error is returned. If the scrape cannot be started, then a `500` error is returned. If the scrape can be started but immediately provides an error code, then a `503` error is returned. Otherwise, a `201` status code is returned with a body containing a JSON object with key `ok` and value `true`.
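
As a rough sketch of how a client might use these two endpoints, the following Python snippet checks `/status` and then requests a scrape, using only the standard library. The host name `localhost`, the function names and the wording of the outcome descriptions are assumptions made for illustration; only the default port, the paths and the status codes come from the description above. The sketch also assumes that every response carries a JSON body.

```python
import json
import urllib.error
import urllib.request

# Assumed base URL; only the default port 7070 comes from the documentation.
BASE_URL = "http://localhost:7070"

# Outcomes of POST /scrape as described above, keyed by HTTP status code.
SCRAPE_OUTCOMES = {
    201: "scrape started",
    400: "request was not a POST",
    500: "scrape could not be started",
    503: "scrape already in operation or failed immediately",
}


def call(path, method="GET", base_url=BASE_URL, timeout=10):
    """Send a request and return (status code, decoded JSON body)."""
    request = urllib.request.Request(base_url + path, method=method)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as error:
        # Error responses are assumed to carry a JSON body as well.
        return error.code, json.loads(error.read().decode("utf-8"))


def describe_scrape(code):
    """Translate a /scrape status code into a short description."""
    return SCRAPE_OUTCOMES.get(code, "unexpected status code %d" % code)


def start_scrape_if_idle(base_url=BASE_URL):
    """Request a scrape unless /status reports one in operation."""
    code, body = call("/status", base_url=base_url)
    if code == 200 and body.get("ok"):
        return "scrape already in operation: %s" % body.get("message")
    code, _ = call("/scrape", method="POST", base_url=base_url)
    return describe_scrape(code)
```

Calling `start_scrape_if_idle()` against a running agent would return one of the short descriptions above. Checking `/status` first is optional, since `/scrape` itself answers with `503` when a scrape is already in operation.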
When any error is returned, then a JSON body is provided with a JSON object containing details regarding the error. The object has a key `ok` with the value `false`, a key `version` with a JSON object containing names of components and libraries as keys and version strings as values, and a key `error` with a JSON object containing the following keys and values:
`status`
: The error status code.
`message`
: The message provided with the error.
`traceback`
: If display of tracebacks is enabled, then the error traceback is provided as a string. Otherwise, the value is `null`.
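
The error format above could be handled on the client side roughly as follows. This is a minimal sketch: the `AgentError` class, the `parse_error` function and the example body (including the component name and version string) are hypothetical; only the key names `ok`, `version`, `error`, `status`, `message` and `traceback` come from the description.

```python
import json
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class AgentError:
    """Error details extracted from an error response body."""
    status: int
    message: str
    traceback: Optional[str]
    versions: Dict[str, str]


def parse_error(body: str) -> Optional[AgentError]:
    """Return the error details, or None if the body reports success."""
    payload = json.loads(body)
    if payload.get("ok"):
        return None
    error = payload["error"]
    return AgentError(
        status=int(error["status"]),
        message=error["message"],
        traceback=error.get("traceback"),  # None when tracebacks are disabled
        versions=payload.get("version", {}),
    )


# Hypothetical response body; the component name and version are made up.
example = json.dumps({
    "ok": False,
    "version": {"example-component": "1.0.0"},
    "error": {"status": 503, "message": "Example message", "traceback": None},
})
parsed = parse_error(example)
print(parsed.status, parsed.message, parsed.traceback)
```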
More details on the scraper API are found in the schemas or in the Swagger UI.
Controller API
The controller is meant to run on a host that is accessible by the scraper
agents in order to exchange information with the agents, databases and
Jenkins-style scrape jobs. Setting up this host requires extensive
configuration of directories, users and permissions in order to keep data
secure during the scrape process while allowing administration of the agent
users. The `controller` directory provides a few services which play a role in
setting up all the backend services.
The controller exposes a web API, provided from the `controller/auth`
directory. The API is meant to run on HTTPS port 443, with a certificate
provided in the `certs` directory. The certificate may be self-signed, in which
case the agents must have the public part as well. The following endpoints exist:
`access.py`
: Check a list of networks to determine if a user should be shown exported data from the projects (one, multiple or all of them).
`agent.py`
: Set up an agent to allow access to update trackers and project salts using an SSH key, updating the permissions of relevant directories.
`encrypt.py`
: Use the project salts to provide an encrypted version of a provided piece of text.
`export.py`
: Update status of an agent, start a Jenkins scrape job and import the agent’s scrape data into the database.
`log.py`
: Write logging from the agent to a central location for debugging.
`status.py`
: Check if the agent should be allowed to collect new scrape data based on environment conditions (accessibility of services, allowed networks, correct configuration and directory permissions, and a tracker-based timer). If the agent is POSTing data to this endpoint, then instead store status information in a database or other centralized location.
`version.py`
: Check whether a provided version is up to date.
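
To illustrate how an agent might address these endpoints over HTTPS with a self-signed certificate, here is a minimal sketch using the Python standard library. The `/auth` URL prefix, the host name `controller.example`, the function names and any query parameter names are assumptions; the actual URL layout and parameters are defined by the schemas, not by this example.

```python
import ssl
import urllib.parse
import urllib.request

# Endpoint scripts listed above, without the .py extension.
ENDPOINTS = ("access", "agent", "encrypt", "export", "log", "status", "version")


def endpoint_url(host, endpoint, base_path="/auth", **query):
    """Build the HTTPS URL of a controller endpoint (base_path is assumed)."""
    if endpoint not in ENDPOINTS:
        raise ValueError("unknown endpoint: %s" % endpoint)
    url = "https://%s%s/%s.py" % (host, base_path, endpoint)
    if query:
        url += "?" + urllib.parse.urlencode(query)
    return url


def controller_context(cafile=None):
    """Create a verifying TLS context.

    For a self-signed controller certificate, pass the path to its public
    part as cafile so that the agent can still verify the connection.
    """
    return ssl.create_default_context(cafile=cafile)


def fetch(host, endpoint, cafile=None, timeout=10, **query):
    """Perform a GET request and return (status code, raw response body)."""
    url = endpoint_url(host, endpoint, **query)
    context = controller_context(cafile)
    with urllib.request.urlopen(url, timeout=timeout, context=context) as response:
        return response.status, response.read()
```

Trusting the public part of the certificate explicitly keeps certificate verification enabled, which is generally preferable to disabling verification for self-signed setups.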
More details on the controller API are found in the schemas or in the Swagger UI.
Note that in addition to using the controller API, the agents also connect to
the controller via SSH port 22 after registration in order to upload batches of
collected data before signalling the new upload on the `export.py` endpoint.
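
The ordering described here (upload first, signal afterwards) might be sketched as follows. The use of `scp` as the transfer tool, all function names, and the user, host and path values are assumptions for illustration; the source only states that batches are uploaded over SSH port 22 before `export.py` is notified.

```python
import subprocess


def build_upload_command(key_path, files, user, host, remote_dir, port=22):
    """Build an scp command that uploads files over SSH (values are placeholders)."""
    command = ["scp", "-P", str(port), "-i", key_path]
    command.extend(files)
    command.append("%s@%s:%s" % (user, host, remote_dir))
    return command


def upload_and_signal(command, signal_export, run=subprocess.run):
    """Upload the batch first; signal export.py only if the upload succeeded."""
    result = run(command)
    if result.returncode != 0:
        # Do not announce an upload that did not complete.
        return False
    signal_export()
    return True
```

Here `signal_export` stands for whatever callable performs the request to the `export.py` endpoint. Passing `run` as a parameter makes the ordering easy to exercise without a real SSH connection.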