API endpoints
In some software development ecosystems, data acquisition must be separated from import control. This is usually achieved with an agent and controller setup that performs the two tasks in isolated networks. For agents to collect data at scheduled intervals and hand it to the controller server for further processing and import, the two sides need to exchange status and registration data. Two APIs serve this purpose. This document gives an overview of both APIs and can serve as a starting point for setting up this agent-based networking in a virtualized ecosystem.
Scraper agent web API
When the agent's Docker instance runs in ‘Daemon’ mode, it provides a web API for collecting status information about the agent and for immediately starting a scrape operation. By default, the web API server runs on port 7070. The API uses JSON as its output format. The following endpoints are provided:
- /status: Check the status of the scrape process. Returns a body containing a JSON object with keys ok and message. If a scrape is in operation, then a 200 status code is returned and ok is set to true. Otherwise, a 503 status code is returned and ok is set to false. The message key provides a human-readable description of the status.
- /scrape: Request a scrape operation. This request must be POSTed, otherwise a 400 error is returned. If a scrape is in operation, then a 503 error is returned. If the scrape cannot be started, then a 500 error is returned. If the scrape can be started but immediately provides an error code, then a 503 error is returned. Otherwise, a 201 status code is returned with a body containing a JSON object with key ok and value true.
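The two endpoints above can be exercised with a small client. The following is a minimal sketch using only the Python standard library. It assumes the agent is reachable on localhost at the default port 7070; the names call_agent, describe_scrape_response and scrape_if_idle are hypothetical helpers and not part of the agent itself.

```python
import json
import urllib.error
import urllib.request

# Assumption: the agent runs on localhost; 7070 is the documented default port.
AGENT_URL = "http://localhost:7070"


def call_agent(path, method="GET", base_url=AGENT_URL, timeout=10):
    """Call an agent endpoint and return (status_code, parsed_json_body)."""
    request = urllib.request.Request(base_url + path, method=method)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as error:
        # Error responses (400, 500, 503) are documented to carry a JSON body.
        return error.code, json.loads(error.read().decode("utf-8"))


def describe_scrape_response(status_code):
    """Map the documented /scrape status codes to a short description."""
    meanings = {
        201: "scrape started",
        400: "request was not a POST",
        500: "scrape could not be started",
        503: "scrape already running, or it failed immediately",
    }
    return meanings.get(status_code, "undocumented status code")


def scrape_if_idle(call=call_agent):
    """Request a scrape unless /status reports one in operation."""
    code, body = call("/status")
    if code == 200 and body.get("ok"):
        return "scrape already in operation"
    code, body = call("/scrape", method="POST")
    return describe_scrape_response(code)
```

Calling scrape_if_idle() against a running agent checks /status first and only POSTs to /scrape when no scrape is in operation. The call parameter exists so that the logic can be tried without a live agent by passing a stand-in function.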
Every error response has a JSON body with a JSON object containing details
regarding the error. The object has a key ok with the value false, a key
version with a JSON object containing names of components and libraries as
keys and version strings as values, and a key error with a JSON object
containing the following keys and values:
- status: The error status code.
- message: The message provided with the error.
- traceback: If display of tracebacks is enabled, then the error traceback is provided as a string. Otherwise, the value is null.
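The error body described above can be unpacked into a small structure on the client side. This is a sketch under stated assumptions: AgentError and parse_error_body are hypothetical names, the sample values are made up for illustration, and the status value is converted with int() because this overview does not state whether it is sent as a number or as a string.

```python
import json
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class AgentError:
    status: int                # the error status code
    message: str               # the message provided with the error
    traceback: Optional[str]   # None when traceback display is disabled
    versions: Dict[str, str]   # component or library name -> version string


def parse_error_body(text):
    """Return an AgentError for an error body, or None if ok is true."""
    body = json.loads(text)
    if body.get("ok"):
        return None
    error = body["error"]
    return AgentError(
        status=int(error["status"]),
        message=error["message"],
        traceback=error.get("traceback"),
        versions=body.get("version", {}),
    )


# Sample body with made-up values, shaped as described above.
sample = json.dumps({
    "ok": False,
    "version": {"example-component": "1.0.0"},
    "error": {"status": 503, "message": "Scrape in progress", "traceback": None},
})
parsed = parse_error_body(sample)
print(parsed.status, parsed.message, parsed.traceback)
```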
More details on the scraper API are found in the schemas or in the Swagger UI.
Controller API
The controller is meant to run on a host that is accessible by the scraper
agents in order to exchange information with the agents, databases and
Jenkins-style scrape jobs. Setup of this host requires some extensive
configuration of directories and users/permissions in order to keep data secure
during the scrape process while allowing administration of the agent users. The
controller directory provides a few services which play a role in setting up
all the backend services.
The controller exposes a web API, served from the controller/auth directory.
The API is meant to run on HTTPS port 443, with a certificate provided in the
certs directory. The certificate may be self-signed, in which case the agents
must also have its public part. The following endpoints exist:
- access.py: Check a list of networks to determine if a user should be shown exported data from the projects (one, multiple or all of them).
- agent.py: Set up an agent to allow access to update trackers and project salts using an SSH key, updating the permissions of relevant directories.
- encrypt.py: Use the project salts to provide an encrypted version of a provided piece of text.
- export.py: Update status of an agent, start a Jenkins scrape job and import the agent’s scrape data into the database.
- log.py: Write logging from the agent to a central location for debugging.
- status.py: Check if the agent should be allowed to collect new scrape data based on environment conditions (accessibility of services, allowed networks, correct configuration and directory permissions, and a tracker-based timer). If the agent is POSTing data to this endpoint, then instead store status information in a database or other centralized location.
- version.py: Check whether a provided version is up to date.
More details on the controller API are found in the schemas or in the Swagger UI.
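As an illustration of how an agent-side client might reach these endpoints over HTTPS with a self-signed certificate, consider the sketch below. It rests on several assumptions that are not confirmed by this overview: the /auth/ URL prefix, the hostname, and the helper names controller_url, make_context and get are all hypothetical. The request parameters each endpoint expects are defined in the schemas and are therefore left to the caller.

```python
import ssl
import urllib.parse
import urllib.request

# Endpoint names as listed in this document.
ENDPOINTS = {
    "access.py", "agent.py", "encrypt.py", "export.py",
    "log.py", "status.py", "version.py",
}


def controller_url(host, endpoint, params=None, prefix="/auth/"):
    """Build an HTTPS URL for a controller endpoint.

    Assumption: endpoints are served under `prefix`; adjust to the real layout.
    """
    if endpoint not in ENDPOINTS:
        raise ValueError(f"unknown controller endpoint: {endpoint}")
    query = "?" + urllib.parse.urlencode(params) if params else ""
    return f"https://{host}{prefix}{endpoint}{query}"


def make_context(cafile=None):
    """Create a verifying TLS context.

    For a self-signed controller certificate, pass the path to its public
    part as `cafile` so that verification still succeeds.
    """
    return ssl.create_default_context(cafile=cafile)


def get(host, endpoint, params=None, cafile=None, timeout=10):
    """Send a GET request and return (status_code, raw_body_bytes)."""
    url = controller_url(host, endpoint, params)
    context = make_context(cafile)
    with urllib.request.urlopen(url, timeout=timeout, context=context) as response:
        return response.status, response.read()
```

Trusting the controller's public certificate through cafile keeps certificate verification enabled, which is generally preferable to disabling verification when a self-signed certificate is in use.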
Note that in addition to using the controller API, the agents also connect to
the controller via SSH port 22 after registration in order to upload batches of
collected data before signalling the new upload on the export.py endpoint.