Large-scale data-driven analysis of
Scrum software development
practices
Research
Overview
The Grip on Software (GROS) project performs
scientifically grounded research into the effects
and effectiveness of various practices within
software development processes. Specifically, we
focus on investigating the use of
Scrum, an Agile software
development method. We study the efficacy of events
and of actions that take place during the
development cycle, including code changes and team
meetings. Finally, we forecast the outcome of the
process or detailed divisions of it, including the
successful release of the product.
Research in large-scale data analytics of software
development is increasingly important because such
projects are often high-risk: they are frequently
funded with public money and apply novel
technologies. Furthermore, many end users or
citizens may be impacted by the success or failure
of a project.
Many factors play a role in the quality,
reliability, usability, security and effectiveness
of the developed product, and these aspects are
measurable as the product increments advance.
Models
We first model the software development ecosystem
as a whole, including actors, information sources,
stages and transitions. This flexible yet specific
model explains the behavior we observe in the
process and allows for comparisons with existing
frameworks, as well as for model validation
using models from other applications that share
some intrinsic similarities, such as predator-prey
models, arrangements of factory pipelines and
(self-)learning environments.
These models provide us with insights and different
viewpoints of the inner workings of agile software
development practices, such as
Scrum, Lean and
Extreme Programming. This forms the basis for
extracting information from the process.
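As a purely illustrative sketch of such a cross-domain comparison, the following simulates a classical predator-prey (Lotka-Volterra) model of the kind mentioned above; the analogy of open work items as "prey" and review capacity as "predator" is an assumption made for illustration, not the GROS model itself.

```python
# Illustrative only: a minimal Euler integration of the Lotka-Volterra
# equations, one of the model families the process model is compared
# against. The analogy to software work items is a hypothetical mapping.

def simulate(prey, predator, steps, dt=0.01,
             alpha=1.0, beta=0.5, delta=0.2, gamma=0.6):
    """Integrate dx/dt = ax - bxy, dy/dt = dxy - gy with Euler steps."""
    history = []
    for _ in range(steps):
        dx = (alpha * prey - beta * prey * predator) * dt
        dy = (delta * prey * predator - gamma * predator) * dt
        prey, predator = prey + dx, predator + dy
        history.append((prey, predator))
    return history

# With abundant "predators" (e.g. review capacity), "prey" (open work)
# initially shrinks while predator numbers briefly grow.
trajectory = simulate(prey=10.0, predator=5.0, steps=1000)
```

Such toy simulations only serve to check whether the process model exhibits qualitatively similar oscillating behavior, not to fit the software data directly.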
Data
Software development projects each have their own
unique set of tools used during the process,
although there is usually a core selection of
systems that are essential, regardless of
methodology. These tools include version control
systems, e.g.,
Git or
Subversion, potentially combined with
a built-in code review and tracking utility
provided by services like
GitLab,
GitHub and
Azure DevOps Server
(formerly TFS
and VSTS).
User stories, bug reports and technical backlogs
are often managed separately using project tracking
software such as
Jira. Inspection of software quality is
provided by platforms such as
SonarQube.
Automated build and continuous integration
pipelines are served by systems like
Jenkins.
Additional project registration systems for
authorization control and meeting reservations also
play a role.
We consider all of these tools as viable sources
of information for our research, because each
discloses specific parts of the development
process. Many tools store the complete
record of actions taken within the timeline of the
process for the purpose of referencing or auditing
these actions at a later moment. Some tools provide
limited aggregation of their own data, such as burn
down charts, but these are always restricted to the
data available within the tool itself.
We construct the GROS data acquisition pipeline,
an automated pipeline that extracts metrics,
events and metadata from these systems and tools,
combining the data sources and linking relevant
events where possible. The resulting database forms
the basis for a data set of features, which we use
for machine learning and information visualization.
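A minimal sketch of how such a pipeline might link events from two sources is given below; the field names and the idea of matching on an (already obfuscated) developer identifier are illustrative assumptions, not the actual GROS schema.

```python
# Hypothetical sketch: merge version-control and issue-tracker events
# into one per-developer timeline. The "developer"/"timestamp" fields
# are assumed names for illustration.

from collections import defaultdict

def link_events(commits, issues):
    """Group commits and issue events by the shared developer token."""
    timeline = defaultdict(list)
    for event in commits:
        timeline[event["developer"]].append(("commit", event["timestamp"]))
    for event in issues:
        timeline[event["developer"]].append(("issue", event["timestamp"]))
    for events in timeline.values():
        events.sort(key=lambda item: item[1])  # chronological order
    return dict(timeline)

merged = link_events(
    commits=[{"developer": "a1b2", "timestamp": "2024-03-01T10:00"}],
    issues=[{"developer": "a1b2", "timestamp": "2024-03-01T09:00"}],
)
```

Sorting ISO 8601 timestamps lexicographically keeps the linked events in chronological order without parsing dates.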
Security
Privacy and security play an important role during
the entire research project. All data is stored on
secured servers with full disk encryption. The
transfer of data only occurs in secure networks or
through the use of HTTPS and GPG encryption.
Personal information is treated specially. The
names of the people that are involved in a software
development process, including the project leaders,
clients and developers themselves, are obfuscated
using one-way encryption immediately when the data
is collected. This means that an attacker can only
determine whether someone is involved in a project
if they already have both the personal information
and the original encryption key, which always
remains at the source location.
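A minimal sketch of such keyed one-way obfuscation, using HMAC-SHA256 from the Python standard library; the key value and names used here are illustrative assumptions, not the project's actual scheme.

```python
# Sketch of one-way obfuscation with a secret key that stays at the
# source: the same name always maps to the same token, but reversing
# it requires both the key and a correct guess of the name.

import hashlib
import hmac

def obfuscate(name: str, key: bytes) -> str:
    """Return a keyed one-way hash (HMAC-SHA256 hex digest) of a name."""
    return hmac.new(key, name.encode("utf-8"), hashlib.sha256).hexdigest()

key = b"example-key-held-only-at-the-source"  # illustrative, never shared
token = obfuscate("Jane Developer", key)      # stable token per person
same = obfuscate("Jane Developer", key)       # identical across platforms
```

Because the token is deterministic for a given key, the same person can be recognized across development platforms without their name ever leaving the source.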
We collect personal information only to uniquely
identify the same actor across development
platforms used within the same project. Standard
obfuscation techniques would be insufficient for
this purpose. Our intention is to provide feedback
of results at the project level, although details
about actor roles may still be helpful in this
respect.
The raw data is only handled by researchers of the
GROS project after certificates of good conduct,
non-disclosure agreements and privacy impact
assessments have been produced or signed. Personal
data and project-sensitive information is never
shared with anyone outside of the project.
Prediction
Our goal is to determine how we can effectively and
accurately make use of the factors that strongly
determine the success or failure of a software
development process, using multiple means to measure
these factors and the outcome itself. Ultimately, the
aim is to design and implement an automated system
that provides a recommendation regarding the risk,
explains how the risk was measured and provides
recommendations on how to reduce this risk.
Since there exist different kinds of development
processes with different means of determining
the risk during the lifespan of a project, we focus
on a group of methodologies known as Agile software
development. In particular, we look at the
Scrum framework, which
makes use of short time frames known as sprints,
where developers commit to a number of goals that
they think are reachable during this time and then
work on implementing the desired
features corresponding to those goals. This
provides us with a large set of historical
information of measurable events at various time
ranges, as well as some feedback about whether the
developers and the client were satisfied with the
process and the delivered product increment.
We focus on determining whether the developers reach
all of their goals completely within the allotted
time, through the use of various machine learning
algorithms which learn from past sprints and provide
an indication of the risk. We also estimate the
effort required to undertake the user stories and
other goals at hand. This allows us to find what
kinds of properties are relevant to reaching the
goals with more certainty.
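As a hedged sketch of learning from past sprints, the following uses a simple nearest-neighbour vote; the actual GROS models are not specified here, so the features (committed story points, team size) and the algorithm choice are illustrative assumptions.

```python
# Hypothetical sketch: predict whether a sprint will reach its goals
# by a majority vote among the k most similar past sprints. Features
# and data are made up for illustration.

import math

def predict_success(history, sprint, k=3):
    """Majority vote among the k nearest past sprints.

    `history` is a list of (features, reached_goals) pairs, where
    features is a tuple of numbers and reached_goals is a boolean.
    """
    nearest = sorted(history, key=lambda item: math.dist(item[0], sprint))[:k]
    votes = sum(1 for _, reached in nearest if reached)
    return votes > k // 2

past_sprints = [
    ((20, 5), True),   # 20 points committed, 5 developers, goals reached
    ((35, 5), False),
    ((22, 4), True),
    ((40, 6), False),
]
likely = predict_success(past_sprints, sprint=(38, 6))  # a heavy sprint
```

In this toy data, heavily loaded sprints resemble past failures, so the vote flags them as unlikely to reach all goals.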
Detailed information regarding the predictions for
upcoming sprints is provided in an overview
dashboard, through the use of
information visualizations. The data set used
for training, testing and validating the models
that provide the risk classification and
estimations based on similar sprints is also made
available as an
open data set with accompanying documentation
for further scientific inquiry.
Visualization
Aside from the prediction result, we provide
distinct but intertwined visualizations which allow
people within the organization to inspect the
collected data set and view all kinds of events,
statistics and properties in ways that
existing tools do not make available.
We specifically make it easy to compare projects
from the same organization, so that teams can
learn from each other, provide context for their
practical methodologies and think about
improvements to the entire process.
The visualizations that we offer include:
a timeline of events occurring in the
lifespan of a project as well as of
individual or multiple sprints, including a
linked burn down chart;
a heat map showing a calendar of code
commit volumes per day while ensuring that
projects of different sizes are comparable
using color map scaling, with a mode to find
extraordinary changes, as well as
integration with external calendar data such
as weather and traffic;
a collaboration graph exhibiting the
tight network within an organization where
teams help each other at various times of
their project lifespans, including a time
lapse mode;
a leaderboard where statistics about
different projects are presented in an
interactive fashion, allowing users to delve
through and combine different attributes,
while still providing measures relevant to
the attributes;
a status dashboard showing the health
and resource usage of projects in virtual
networks where containers are deployed during
development, with support for historical data
comparisons;
a visualization that displays the
workings of various parts of the process, such
as the flow that issues follow;
more formats that provide options to generate
customized reports, including details,
attributes and results from our predictive
pipeline.
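As an illustration of the data behind one of these views, the following sketches how a linked burn down chart could be computed; the sprint length and story point values are made-up example data, not taken from the GROS data set.

```python
# Hypothetical sketch: pair each day's remaining story points with the
# ideal linear burn toward zero, as shown in a burn down chart.

def burn_down(total_points, remaining_per_day):
    """Return (actual, ideal) pairs for each day of the sprint."""
    days = len(remaining_per_day)
    ideal = [total_points * (1 - day / days) for day in range(1, days + 1)]
    return list(zip(remaining_per_day, ideal))

# Example: a 30-point sprint observed over eight days.
chart = burn_down(30, [28, 25, 25, 18, 12, 9, 4, 0])
behind = [actual > planned for actual, planned in chart]
```

Comparing the actual curve against the ideal line per day is what lets the timeline visualization highlight sprints that fall behind schedule.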