Data analysis¶
These schemas specify how the data analysis component functions, by defining configuration, fields, definitions, queries and source tracking information.
Data analysis BigBoat status fields¶
https://gros.liacs.nl/schema/data-analysis/bigboat_status.json |
bigboat_status¶
type |
object |
|||
properties |
||||
|
type |
object |
||
patternProperties |
||||
|
type |
object |
||
properties |
||||
|
Localization titles for a subgraph of a BigBoat performance status field. |
|||
|
Localization texts for a subgraph of a BigBoat performance status field. |
|||
|
Units used by the BigBoat performance status field values. |
|||
type |
string |
|||
enum |
bytes, seconds |
|||
|
Regular expressions that match portions of BigBoat status field names and their normalized field names. |
|||
type |
object |
|||
patternProperties |
||||
|
Normalized field name for the fields that match the regular expression. |
|||
type |
string |
|||
pattern |
^.+$ |
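The mapping from regular expressions to normalized field names can be pictured with a small sketch. The patterns and field names below are hypothetical, since the actual configuration values are not shown in this schema; the sketch also assumes that the first matching pattern wins.

```python
import re

# Hypothetical patterns; the real BigBoat status patterns are not listed here.
FIELD_PATTERNS = {
    r"^disk\..*\.used$": "disk_used",
    r"^memory\.": "memory",
}


def normalize_field(name, patterns=FIELD_PATTERNS):
    """Return the normalized name for the first pattern matching part of `name`."""
    for pattern, normalized in patterns.items():
        # re.search matches a portion of the field name, not the whole string.
        if re.search(pattern, name):
            return normalized
    return None
```

A field name that matches no pattern yields `None` in this sketch; how the component itself treats unmatched fields is not stated in the schema.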
BigBoat status field locales¶
type |
object |
|
patternProperties |
||
|
Localization item for a specific language. Valid languages use two-letter ISO 639-1 language codes plus optional BCP 47 subtags, so only a subset of languages is recognized. |
|
type |
string |
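As a rough illustration of the language-key restriction described above, the following sketch checks a locale object in Python. The regular expression is an assumed approximation (a two-letter code plus optional hyphen-separated subtags), not the pattern used by the schema itself.

```python
import re

# Assumed approximation of "two-letter ISO 639-1 code plus optional BCP 47 subtags".
LANGUAGE_KEY = re.compile(r"^[a-z]{2}(-[A-Za-z0-9]{1,8})*$")


def valid_locales(locales):
    """Check that every key looks like a language code and every value is a string."""
    return all(
        LANGUAGE_KEY.match(key) is not None and isinstance(text, str)
        for key, text in locales.items()
    )
```

For example, `valid_locales({"en": "Memory", "nl": "Geheugen"})` passes the check, while a key such as `"english"` does not.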
Data analysis configuration¶
data-analysis-config¶
anyOf |
||
type |
object |
|
patternProperties |
||
|
config-org¶
type |
object |
||||
properties |
|||||
|
type |
object |
|||
properties |
|||||
|
type |
string |
|||
format |
hostname |
||||
|
type |
string |
|||
|
type |
string |
|||
|
type |
string |
|||
|
type |
string |
|||
enum |
jira, jira_version, jira_component_version, tfs |
||||
|
type |
object |
|||
properties |
|||||
|
type |
string |
|||
format |
uri-reference |
||||
|
type |
string |
|||
format |
uri-reference |
||||
|
type |
string |
|||
format |
uri-reference |
||||
|
type |
string |
|||
format |
uri-reference |
||||
|
type |
string |
|||
format |
uri-reference |
||||
|
type |
string |
|||
pattern |
^[a-zA-Z0-9_.-]+$ |
||||
|
type |
string |
|||
pattern |
^[a-zA-Z0-9_.-]+$ |
||||
|
oneOf |
||||
type |
boolean |
||||
enum |
false |
||||
|
type |
string |
|||
format |
uri |
||||
|
type |
string |
|||
format |
uri-reference |
||||
|
type |
string |
|||
|
properties |
||||
|
type |
string |
|||
format |
uri |
||||
|
type |
string |
|||
|
type |
number |
|||
minimum |
-90.0 |
||||
|
type |
number |
|||
minimum |
-180.0 |
||||
|
type |
string |
|||
format |
date |
||||
|
type |
array |
|||
items |
type |
string |
|||
|
type |
array |
|||
items |
type |
object |
|||
properties |
|||||
|
type |
string |
|||
|
type |
string |
|||
|
|||||
|
type |
boolean |
|||
|
type |
boolean |
|||
|
type |
integer |
|||
|
type |
array |
|||
items |
oneOf |
||||
|
|||||
|
type |
array |
|||
items |
type |
string |
board_id¶
type |
integer |
minimum |
1 |
project_key¶
Project key from JIRA or team name from TFS. |
|
type |
string |
project_board¶
type |
object |
||
properties |
|||
|
|||
|
oneOf |
||
type |
array |
||
items |
|||
|
type |
boolean |
|
|
type |
boolean |
|
|
type |
string |
|
format |
date |
||
|
type |
string |
|
format |
date |
||
|
type |
string |
|
|
type |
string |
prediction_combine¶
type |
string |
enum |
mean, median, mode, sum, min, max |
components¶
type |
array |
||
items |
type |
object |
|
properties |
|||
|
type |
string |
|
|
type |
string |
|
|
|||
patternProperties |
|||
|
component_filter¶
type |
object |
|||
properties |
||||
|
oneOf |
type |
string |
|
type |
array |
|||
items |
type |
string |
||
|
oneOf |
type |
string |
|
type |
array |
|||
items |
type |
string |
Data analysis definitions for queries¶
definitions¶
type |
object |
|||
properties |
||||
|
type |
object |
||
patternProperties |
||||
|
type |
object |
||
properties |
||||
|
Description of the definition. |
|||
type |
string |
|||
|
||||
patternProperties |
||||
|
Definition for a specific primary source. |
|||
type |
object |
|||
properties |
||||
|
||||
anyOf |
||||
anyOf |
||||
|
Query filter conditions that may be used in WHERE clauses. |
|||
type |
object |
|||
patternProperties |
||||
|
type |
object |
||
properties |
||||
|
Description of the condition. |
|||
type |
string |
|||
|
||||
patternProperties |
||||
|
Condition for a specific primary source. |
|||
type |
object |
|||
properties |
||||
|
||||
anyOf |
||||
anyOf |
field¶
SQL template of the definition, which may contain ${…} for nested expansions. |
|
type |
string |
condition¶
SQL template of the condition, which may contain ${…} for nested expansions. |
|
type |
string |
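The nested `${…}` expansion used by the field and condition templates can be sketched as a recursive substitution. This is a minimal illustration under stated assumptions: the function name, the variable name syntax, the depth limit and the example definitions are all hypothetical, and unknown names are simply left in place.

```python
import re

# Assumed variable syntax: ${name} with lowercase letters and underscores.
VARIABLE = re.compile(r"\$\{([a-z_]+)\}")


def expand(template, definitions, depth=0, max_depth=10):
    """Expand ${name} references, including references nested in definitions."""
    if depth > max_depth:
        raise ValueError("expansion nested too deeply (possible cycle)")

    def replace(match):
        name = match.group(1)
        if name not in definitions:
            return match.group(0)  # leave unknown references untouched
        return expand(definitions[name], definitions, depth + 1, max_depth)

    return VARIABLE.sub(replace, template)


# Hypothetical definitions, for illustration only.
definitions = {
    "sprint_open": "${sprint_start} <= CURRENT_TIMESTAMP",
    "sprint_start": "sprint.start_date",
}
query = expand("WHERE ${sprint_open}", definitions)
```

The depth limit is a design choice of this sketch to guard against definitions that refer to themselves.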
references¶
type |
object |
||
properties |
|||
|
Table(s) involved in the query, which would need to be in the FROM clause or JOIN clauses for a successful query. |
||
oneOf |
|||
type |
array |
||
items |
|||
|
Fields involved in the query, which may be used in, e.g., SELECT or GROUP BY clauses to ensure a successful query. |
||
type |
array |
||
items |
type |
string |
|
pattern |
^[a-z_]+$ |
table_name¶
type |
string |
pattern |
^[a-z_]+$ |
Data analysis performance¶
performance¶
type |
object |
|
patternProperties |
||
|
type |
object |
properties |
||
|
||
|
performance_query¶
Performance metrics for a query. |
||
oneOf |
type |
array |
items |
||
performance_result¶
type |
object |
|
properties |
||
|
Compiled SQL query used during the performance test. |
|
type |
string |
|
|
Number of columns in the query result. |
|
type |
integer |
|
minimum |
0 |
|
|
Number of rows in the query result. |
|
type |
integer |
|
minimum |
0 |
|
|
Average number of microseconds spent in the optimizer pipeline of the database. |
|
type |
number |
|
minimum |
0.0 |
|
|
Standard deviation of microseconds spent in the optimizer pipeline of the database across runs. |
|
type |
number |
|
minimum |
0.0 |
|
|
Average number of seconds between the start and the end of the query, based on wall clock time. |
|
type |
number |
|
minimum |
0.0 |
|
|
Standard deviation of seconds between the start and the end of the query, based on wall clock time across runs. |
|
type |
number |
|
minimum |
0.0 |
|
|
Average number of microseconds spent on the query before the result could be exported. |
|
type |
number |
|
minimum |
0.0 |
|
|
Standard deviation of microseconds spent on the query before the result could be exported across runs. |
|
type |
number |
|
minimum |
0.0 |
|
|
Average number of microseconds spent on exporting the result. |
|
type |
number |
|
minimum |
0.0 |
|
|
Standard deviation of microseconds spent on exporting the result across runs. |
|
type |
number |
|
minimum |
0.0 |
|
|
Average CPU load percentage during query execution. |
|
type |
number |
|
maximum |
100.0 |
|
minimum |
0.0 |
|
|
Standard deviation of CPU load percentage during query execution across runs. |
|
type |
number |
|
maximum |
100.0 |
|
minimum |
0.0 |
|
|
Average percentage of time waiting for I/O. |
|
type |
number |
|
maximum |
100.0 |
|
minimum |
0.0 |
|
|
Standard deviation of percentage of time waiting for I/O across runs. |
|
type |
number |
|
maximum |
100.0 |
|
minimum |
0.0 |
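Each metric in the performance result comes as an average and a standard deviation across runs. The sketch below shows how such a pair might be computed from raw measurements. It assumes the sample standard deviation; the schema does not say whether the sample or population variant is used, and the function name is hypothetical.

```python
from statistics import mean, stdev


def summarize_runs(samples):
    """Return (average, standard deviation) for measurements of repeated runs."""
    average = mean(samples)
    # A single run has no spread; stdev() requires at least two data points.
    deviation = stdev(samples) if len(samples) > 1 else 0.0
    return average, deviation
```

Both values are non-negative for non-negative input, which agrees with the `minimum` of 0.0 on these fields.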
Data analysis query index¶
queries¶
type |
object |
||||
properties |
|||||
|
Directory in which the query files are stored. |
||||
type |
string |
||||
|
Queries known to this index. |
||||
type |
array |
||||
items |
oneOf |
||||
|
Category names that could be used by the queries in order to group them together, with localization for the category. |
||||
type |
object |
||||
patternProperties |
|||||
|
type |
object |
|||
properties |
|||||
|
Portions of a FontAwesome icon class that indicates the category. |
||||
type |
array |
||||
items |
type |
string |
|||
minItems |
2 |
||||
patternProperties |
|||||
|
Localization item for the category name in a specific language. Valid languages use two-letter ISO 639-1 language codes plus optional BCP 47 subtags, so only a subset of languages is recognized. |
||||
type |
string |
Analysis report query¶
Query for an analysis report. |
||
type |
object |
|
properties |
||
|
||
|
Name of the report. |
|
type |
string |
|
|
Sprint event query¶
Query for a sprint event. |
||
type |
object |
|
properties |
||
|
||
|
Name of the event. |
|
type |
string |
|
|
Whether to show the event in a timeline chart by default. |
|
type |
boolean |
|
|
Whether to show the event in a separate subchart of a timeline chart. |
|
type |
boolean |
|
|
Descriptions of the event. |
|
Feature query¶
Query for one or more features. |
||||
type |
object |
|||
properties |
||||
|
Name that could be used as a table name if the query were to be placed in a subquery, and in general as a normalized identifier for the query. |
|||
type |
string |
|||
pattern |
^[a-z_]+$ |
|||
|
Name(s) of the feature(s) that the query provides. |
|||
oneOf |
||||
type |
array |
|||
items |
||||
minItems |
1 |
|||
|
Whether to use an earlier occurring sample’s value(s) for the feature(s) when the current sample has no result in the query. |
|||
type |
boolean |
|||
|
Default value for the feature(s) of samples that had no result in the query. |
|||
type |
number |
|||
|
||||
|
Indication of how to summarize values of the feature when the query result provides multiple rows per sample. |
|||
oneOf |
||||
type |
array |
|||
items |
||||
minItems |
1 |
|||
|
Operations to use to combine values of the feature(s), either when multiple projects are combined for a team or when concurrent sprints are combined into one. |
|||
oneOf |
||||
Operations to use to combine values of the features when combining multiple projects for a team. If summarize is an array, then the number of combine operations must match that length in order to combine for each summarizing operation. |
||||
type |
array |
|||
items |
||||
minItems |
1 |
|||
type |
object |
|||
properties |
||||
|
Operation to use to combine values of the feature when combining multiple projects for a team. |
|||
|
Operation to use to combine values of the feature when combining multiple concurrent sprints. |
|||
|
Different methods of predicting the feature. |
|||
type |
array |
|||
items |
type |
object |
||
properties |
||||
|
URL template to retrieve prediction data values for this feature. The template may contain ${…} for variable expansions. |
|||
type |
string |
|||
format |
uri-reference |
|||
|
Feature that can be used as a linear regression over sprints to predict the overall change of the feature. |
|||
|
||||
minItems |
1 |
|||
|
type |
object |
||
properties |
||||
|
Format type of the feature. |
|||
type |
string |
|||
enum |
fraction, duration, icon |
|||
|
Maximum denominator to use when formatting the value of the feature as a common fraction when it has the fraction type. |
|||
type |
integer |
|||
minimum |
1 |
|||
|
type |
array |
||
items |
||||
|
Icons to use in place of the value of the feature when it has the icon type. |
|||
type |
object |
|||
patternProperties |
||||
|
||||
|
Descriptions for the feature(s). |
|||
oneOf |
||||
|
Longer descriptions for the feature(s). |
|||
oneOf |
||||
|
Sprintf-compatible format strings that indicate the feature(s) along with longer descriptions of unit(s). |
|||
oneOf |
||||
|
Sprintf-compatible format strings that indicate the feature(s) along with unit(s). |
|||
oneOf |
||||
|
Shorter descriptions for the feature(s) that indicate their presence in or statefulness of the sample. |
|||
oneOf |
||||
|
Shorter descriptions for the feature(s) when used as factors of another feature’s prediction, in order to differentiate different prediction strategies. |
|||
oneOf |
||||
|
Metadata of the feature(s) regarding the type of units, relations to other features and moments when the query result is (available to be) collected. |
|||
oneOf |
||||
type |
array |
|||
items |
||||
minItems |
1 |
|||
|
Whether the feature(s) should be prominently displayed in reports. If this is false, then the feature(s) may be hidden behind expandable sections. |
|||
type |
boolean |
|||
|
URL template(s) to human-readable websites that should roughly display the same information as the query result. The template may contain ${…} for variable expansions. |
|||
oneOf |
||||
|
Whether the result of the query should be cached in a database table for reuse. |
|||
type |
boolean |
|||
|
Category to group the feature(s) in. |
|||
type |
string |
|||
pattern |
^[a-z]+$ |
|||
|
Feature to use by default as a divisor of the feature, in order to display a normalized value in a report. |
|||
|
type |
array |
||
items |
Categories to group the feature in when displaying in a card-based report. The group normalize makes the feature available to act as a divisor for other features. |
|||
type |
string |
|||
enum |
project, metric_history, metric_options, quality_time, quality, sonar, jenkins, jira, vcs, git, gitlab, github, tfs, subversion, prediction, normalize |
|||
minItems |
1 |
|||
oneOf |
||||
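For the fraction format type, the maximum denominator bounds how a value is written as a common fraction. A minimal sketch of that idea, using Python's standard `fractions` module, follows. The function name is hypothetical and the component's own formatting may differ.

```python
from fractions import Fraction


def format_fraction(value, max_denominator):
    """Format a number as the closest common fraction with a bounded denominator."""
    fraction = Fraction(value).limit_denominator(max_denominator)
    return f"{fraction.numerator}/{fraction.denominator}"
```

With a maximum denominator of 10, a value of 0.3333 is written as `1/3` in this sketch.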
Definition feature¶
properties |
||
|
Name of the query definition to use to calculate the feature. |
|
type |
string |
|
pattern |
^[a-z_]+$ |
Expression feature¶
properties |
||
|
R code to generate the feature, using an environment where names of at least certain requested non-expression features are available. |
|
type |
string |
|
|
Whether to perform the feature generation based on the expression during the selection of non-expression features instead of during other expression features, such that it is available in the environment of other expression features and in summarizing operations. |
|
type |
boolean |
|
|
Parameters for calculating the expression based on samples occurring before the generated sample. |
|
type |
object |
Query file feature¶
properties |
|
|
Metric feature¶
properties |
||||
|
Name of the metric(s) to use to calculate the feature. |
|||
oneOf |
type |
string |
||
type |
array |
|||
items |
type |
string |
||
minItems |
1 |
|||
|
When the metric is measured more than once in the time span that each sample represents, perform an aggregation query to calculate a proper value for each sample: end selects the last measured value, max selects the highest value, min selects the lowest value, and avg calculates the average value. |
|||
type |
string |
|||
enum |
end, max, min, avg |
|||
|
Source types that should not be included when providing a source for the feature. |
|||
type |
array |
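The four aggregation methods for metric features can be illustrated outside of SQL with a small Python sketch. The schema describes an aggregation query, so this only mirrors the semantics; the function name and the `(timestamp, value)` input shape are assumptions.

```python
from statistics import mean

AGGREGATES = {
    "end": lambda values: values[-1],  # last measured value
    "max": max,
    "min": min,
    "avg": mean,
}


def aggregate(measurements, method):
    """Aggregate (timestamp, value) pairs that fall within one sample's time span."""
    # Sort by timestamp so that "end" picks the most recent measurement.
    ordered = [value for _, value in sorted(measurements)]
    return AGGREGATES[method](ordered)
```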
filename¶
Name of the file stored in the path of the query index where the query is stored. |
|
type |
string |
pattern |
^[^/]+$ |
Feature locales¶
type |
object |
|
patternProperties |
||
|
Localization item for a specific language. Valid languages use two-letter ISO 639-1 language codes plus optional BCP 47 subtags, so only a subset of languages is recognized. |
|
type |
string |
Multi-feature locales¶
type |
object |
||
patternProperties |
|||
|
Localization items for the features in a specific language. Valid languages use two-letter ISO 639-1 language codes plus optional BCP 47 subtags, so only a subset of languages is recognized. |
||
type |
array |
||
items |
type |
string |
|
minItems |
1 |
patterns_fields¶
type |
object |
||||
patternProperties |
|||||
|
type |
object |
|||
patternProperties |
|||||
|
Column name(s) to use in the field in the query for the given table or primary source. |
||||
oneOf |
|||||
type |
array |
||||
items |
type |
string |
|||
pattern |
^[a-z_]+$ |
patterns_conditions¶
type |
object |
|||
patternProperties |
||||
|
oneOf |
SQL template to use in the field in the query. |
||
type |
string |
|||
type |
object |
|||
patternProperties |
||||
|
SQL template to use in the field in the query for the given primary source. The template may contain ${…} for nested expansions. |
|||
type |
string |
summarize¶
Summarization operation. Can be one of the combine operations or one of:
|
||
oneOf |
||
type |
string |
|
enum |
count, count_unique, end, sum_of_na_avg, sum_of_na_diff |
summarize_params¶
type |
object |
||
properties |
|||
|
|||
|
Include missing values in summarizing operation, for example with count. |
||
type |
boolean |
||
|
Field(s) from the query result to provide to the summarizing operation. Multiple fields can be provided to sum_of_na_diff. |
||
oneOf |
|||
type |
array |
||
items |
|||
minItems |
1 |
||
|
Feature(s) whose detailed values should be used to detect and remove overlapping values when using sum or sum_of_na_diff operation. |
||
type |
array |
||
items |
|||
minItems |
1 |
||
|
Feature whose field from its details, or the value itself (when expression is true), can be used as an additional parameter of the summarizing operation when using sum_of_na_avg (use referenced values for non-empty divisor) or sum_of_na_diff (use as default value for old values). |
||
|
Field(s) from the query result to retain the values for each sample for, which may be used for tracking detailed information on how the feature was calculated. |
||
type |
array |
||
items |
|||
minItems |
1 |
||
|
R code using an environment of the query result fields to filter which rows are retained for the details. |
||
type |
string |
||
|
When using reference, whether to use the referenced feature’s value itself instead of the field from its details in the summarizing operation. |
||
type |
boolean |
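To make the simpler summarizing operations concrete, here is a hedged sketch of `count`, `count_unique` and `end` applied to the rows of one sample, with an option to include missing values. The parameter name `include_missing` and the use of `None` for missing values are assumptions. The `sum_of_na_avg` and `sum_of_na_diff` operations are left out because their exact behavior is not spelled out in this schema.

```python
def summarize(values, operation, include_missing=False):
    """Summarize multiple row values of one sample into a single feature value."""
    if not include_missing:
        # Assumed representation: missing values are None.
        values = [value for value in values if value is not None]
    if operation == "count":
        return len(values)
    if operation == "count_unique":
        return len(set(values))
    if operation == "end":
        return values[-1] if values else None
    raise ValueError(f"unsupported operation in this sketch: {operation}")
```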
Feature combining¶
Combining operation.
|
|
type |
string |
enum |
mean, median, mode, sum, min, max |
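The six combining operations map directly onto standard aggregate functions. The following sketch, with a hypothetical `combine` helper, shows one way to apply them to the values of several projects or concurrent sprints.

```python
import statistics

COMBINE = {
    "mean": statistics.mean,
    "median": statistics.median,
    "mode": statistics.mode,
    "sum": sum,
    "min": min,
    "max": max,
}


def combine(values, operation):
    """Combine feature values from multiple projects or sprints into one value."""
    return COMBINE[operation](values)
```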
Monte Carlo parameters¶
Parameters for a Monte Carlo simulation to predict the feature. |
|||||
type |
object |
||||
properties |
|||||
|
Name of the simulation. |
||||
type |
string |
||||
|
type |
array |
|||
items |
type |
object |
|||
properties |
|||||
|
Feature to use for the base factor. |
||||
|
Feature to use as multiplication of the column feature. |
||||
|
Weight to apply to this factor. |
||||
type |
number |
||||
|
Probability density function. |
||||
type |
string |
||||
|
Parameters for the probability density function. |
||||
type |
array |
||||
items |
type |
number / string |
|||
|
Whether to use the probability function to select new random data. When this is missing or false, the actual data from the column feature is selected instead. |
||||
type |
boolean |
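The factor parameters above can be read as the inputs of a simple simulation loop. The sketch below is only one possible interpretation. The dictionary keys (`column`, `multiplier`, `weight`, `distribution`, `params`, `sample`), the distribution names and the way factors are summed are all assumptions made for illustration, not the component's actual implementation.

```python
import random

# Hypothetical distribution names mapped to random generators.
DISTRIBUTIONS = {
    "norm": lambda rng, mu, sigma: rng.gauss(mu, sigma),
    "unif": lambda rng, low, high: rng.uniform(low, high),
}


def simulate(factors, history, runs=1000, seed=0):
    """Return one simulated total per run, summing weighted draws of each factor."""
    rng = random.Random(seed)  # seeded for reproducible results
    totals = []
    for _ in range(runs):
        total = 0.0
        for factor in factors:
            if factor.get("sample"):
                # Draw new random data from the probability function.
                draw = DISTRIBUTIONS[factor["distribution"]](rng, *factor["params"])
            else:
                # Otherwise select actual data of the column feature.
                draw = rng.choice(history[factor["column"]])
            if "multiplier" in factor:
                draw *= rng.choice(history[factor["multiplier"]])
            total += factor.get("weight", 1.0) * draw
        totals.append(total)
    return totals
```

The distribution of the returned totals could then be inspected, for instance by taking percentiles, to obtain a prediction range.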
Interval specification¶
Interval specification for the feature when it has the duration type. When formatting the value of the feature, each interval specifies the divisor value to apply until the value is small enough. |
||
type |
object |
|
properties |
||
|
Interval unit. |
|
|
Shorthand key for the interval unit. |
|
type |
string |
|
enum |
s, m, h, d, w, M, y |
|
|
Interval size. |
|
type |
integer |
|
minimum |
1 |
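The repeated division described for duration formatting can be sketched as follows. The sketch assumes that intervals are ordered from the smallest unit upward and that each interval size is the divisor needed to reach the next larger unit; the function name and the `(key, size)` pair representation are hypothetical.

```python
def format_duration(value, intervals):
    """Divide `value` by successive interval sizes until it is small enough."""
    unit = intervals[0][0]
    for (_, size), (next_key, _) in zip(intervals, intervals[1:]):
        if value < size:
            break  # small enough to display in the current unit
        value /= size
        unit = next_key
    return f"{value:g}{unit}"


# Assumed interval specification: 60 s per minute, 60 min per hour, 24 h per day.
INTERVALS = [("s", 60), ("m", 60), ("h", 24), ("d", 7)]
```

For example, a value of 5400 seconds is divided by 60 twice and displayed as `1.5h` under these assumptions.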
Icon value¶
FontAwesome specification for an icon that represents the value. |
||
type |
array |
|
items |
type |
string |
minItems |
2 |
time_unit¶
type |
string |
enum |
seconds, minutes, hours, days, weeks, months, years |
Feature unit measurement metadata¶
type |
object |
||||
properties |
|||||
|
Unit(s) of the feature. Either a singular unit or a fractional unit, where the divisor may be a fraction itself. |
||||
oneOf |
|||||
|
Feature that corresponds to the dividend of the feature. |
||||
|
Feature(s) corresponding to the divisor of the feature. In the case of a fractional divisor, the feature of the leading unit may be described, or the related features may be described as far as they can. |
||||
oneOf |
|||||
|
Feature that corresponds to a larger superset of the feature. |
||||
|
Indicator of when the feature is measured compared to the current sample. post indicates that the value is only complete once the time span of the sample has ended. |
||||
oneOf |
|||||
type |
string |
||||
enum |
post |
||||
type |
array |
||||
items |
oneOf |
type |
number |
||
maxItems |
2 |
||||
minItems |
2 |
||||
|
Feature that corresponds to the feature’s value at the start of the current sample, which could be used to compare progress. |
||||
|
Feature that corresponds to the feature’s value at the end of the current sample, which could be used to compare progress. |
||||
unit¶
Unit of a feature.
|
|
type |
string |
enum |
change, commit, issue, item, file, line, byte, metric, person, point, sprint, time, meta |
fractional_unit¶
type |
array |
|
items |
oneOf |
|
maxItems |
2 |
|
minItems |
2 |
fractional_divisor¶
type |
array |
|
items |
oneOf |
|
maxItems |
2 |
|
minItems |
1 |
source_url¶
oneOf |
Human-readable website at the source. |
|
type |
string |
|
format |
uri-reference |
|
Indication that there is no usable human-readable website at the source. |
||
type |
null |
source_type_urls¶
type |
object |
patternProperties |
|
|
column_name¶
type |
string |
pattern |
^[a-z_]+$ |
Data analysis source types¶
https://gros.liacs.nl/schema/data-analysis/source_types.json |
source_types¶
type |
object |
|||
patternProperties |
||||
|
Localization for a data source type. |
|||
type |
object |
|||
properties |
||||
|
Portions of a FontAwesome icon class that indicates the source type. |
|||
type |
array |
|||
items |
type |
string |
||
minItems |
2 |
|||
patternProperties |
||||
|
Localization title for a data source type in a specific language. Valid languages use two-letter ISO 639-1 language codes plus optional BCP 47 subtags, so only a subset of languages is recognized. |
|||
type |
string |
Data analysis source update trackers¶
trackers¶
type |
object |
|||
patternProperties |
||||
|
Trackers for a data source type. |
|||
type |
array |
|||
items |
type |
object |
||
properties |
||||
|
Filename of the tracker as stored in the database. |
|||
type |
string |
|||
pattern |
^[a-zA-Z0-9_.-]+$ |
|||
|
Sprintf-compatible format string that indicates the contents of the tracker file, which consists only of the timestamp parseable with the format string. If neither format nor json is provided, then the contents are ignored. |
|||
type |
string |
|||
|
Indication that the tracker file is stored as a JSON structure, and the means to parse the contents. This is ignored if format is provided.
|
|||
type |
string |
|||
enum |
object |