Here’s how you can apply Predictive Analytics to your testing.
Every day, the world produces 2.5 exabytes of data - a huge amount that will only keep growing. Some of the largest contributors are companies, but only the most advanced among them make systematic use of it to improve their work.
The software testing industry is no exception; hundreds of thousands of testers create reams of data as a byproduct of their day-to-day activities, most of it never to find any productive use.
The most advanced among them, however, harness the deluge of data and direct it towards the power station of Predictive Analytics: a discipline that helps predict future application performance based on its past behavior.
To adapt to an increasingly agile development landscape, QA must change from a reactive process to a preventive one. This means integrating more deeply into the software development lifecycle (SDLC). The following areas are where Predictive Analytics will have the greatest impact.
Predictive models can generate tests and test data using methods such as search-based software testing (SBST). For example, when testing a function call with SBST, search algorithms (such as genetic algorithms) look for the arguments that provide the maximum test coverage in the fewest test cases. The result is better coverage, fewer redundant tests, and faster execution.
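To make the idea concrete, here is a minimal sketch of a genetic algorithm that evolves a small test suite for one function. Everything in it is an assumption made for illustration: the function under test (`classify_triangle`), the input range, the fitness weights, and the population settings. Real SBST tools measure coverage through instrumentation rather than the hand-placed branch markers used here.

```python
import random

# Hypothetical function under test. `hits` records which branch was taken,
# standing in for the coverage instrumentation a real tool would provide.
BRANCHES = {"invalid", "not_triangle", "equilateral", "isosceles", "scalene"}
MAX_CASES = 10  # assumed upper bound on suite size


def classify_triangle(a, b, c, hits):
    if a <= 0 or b <= 0 or c <= 0:
        hits.add("invalid")
    elif a + b <= c or a + c <= b or b + c <= a:
        hits.add("not_triangle")
    elif a == b == c:
        hits.add("equilateral")
    elif a == b or b == c or a == c:
        hits.add("isosceles")
    else:
        hits.add("scalene")


def coverage(suite):
    """Return the set of branches exercised by a suite of argument triples."""
    hits = set()
    for args in suite:
        classify_triangle(*args, hits)
    return hits


def fitness(suite):
    # Reward covered branches first, then prefer smaller suites.
    return len(coverage(suite)) * 100 - len(suite)


def random_case(rng):
    return tuple(rng.randint(-1, 6) for _ in range(3))


def mutate(suite, rng):
    suite = list(suite)
    choice = rng.random()
    if choice < 0.3 and len(suite) > 1:
        suite.pop(rng.randrange(len(suite)))                  # drop a case
    elif choice < 0.6 and len(suite) < MAX_CASES:
        suite.append(random_case(rng))                        # add a case
    else:
        suite[rng.randrange(len(suite))] = random_case(rng)   # replace a case
    return suite


def crossover(a, b, rng):
    # One-point crossover: the head of one parent plus the tail of the other.
    child = a[: rng.randint(0, len(a))] + b[rng.randint(0, len(b)):]
    return child[:MAX_CASES] or [random_case(rng)]


def evolve(generations=80, pop_size=30, seed=0):
    rng = random.Random(seed)
    population = [
        [random_case(rng) for _ in range(rng.randint(1, 8))]
        for _ in range(pop_size)
    ]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        elite = population[: pop_size // 3]  # the best suites survive unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(elite, 2)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = elite + children
    return max(population, key=fitness), history


best_suite, history = evolve()
print("cases:", best_suite)
print("branches covered:", sorted(coverage(best_suite)))
```

Because the best suites are carried over unchanged, the best fitness never drops from one generation to the next. Whether all five branches end up covered depends on the seed and the settings.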
Techniques such as Rayleigh prediction can anticipate the number of defects for each phase in the SDLC. Rayleigh requires three data points from previous releases: the number of phases, the number of defects found in each phase, and the number of defects originating in the phase they were found in (such as a coding defect created and reported in the coding phase). Together, the defect discovery and defect injection rates determine your defect detection efficiency, which in turn determines your expected defect count for each phase. Both QA and development can then focus more resources on the phases likely to produce the most defects.
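As a rough illustration, the sketch below fits a Rayleigh curve to the defects found per phase in a previous release and uses it to project the expected count for each phase. It is a simplification: it fits the curve directly to per-phase counts with a grid search and does not model defect injection or detection efficiency. The function names and the sample counts are made up for the example. The curve uses the standard cumulative form F(t) = K(1 - exp(-t^2 / (2c^2))), where K is the total number of defects and c is the point at which discovery peaks.

```python
import math


def rayleigh_shares(n_phases, c):
    """Fraction of all defects expected in each phase 1..n_phases."""
    def cdf(t):
        return 1.0 - math.exp(-(t * t) / (2.0 * c * c))
    return [cdf(t) - cdf(t - 1) for t in range(1, n_phases + 1)]


def fit_rayleigh(observed):
    """Estimate total defects K and peak location c from per-phase counts.

    Grid search over c; for each candidate c, the least-squares K has a
    closed form, so no external optimizer is needed.
    """
    best = None
    for step in range(10, 201):  # c from 0.5 to 10.0 in steps of 0.05
        c = step / 20.0
        shares = rayleigh_shares(len(observed), c)
        k = sum(d * s for d, s in zip(observed, shares)) / sum(s * s for s in shares)
        err = sum((d - k * s) ** 2 for d, s in zip(observed, shares))
        if best is None or err < best[0]:
            best = (err, k, c)
    _, k, c = best
    return k, c


def predict_defects(observed, n_phases):
    """Project expected defects per phase for a release with n_phases phases."""
    k, c = fit_rayleigh(observed)
    return [k * s for s in rayleigh_shares(n_phases, c)]


# Hypothetical defect counts per phase from a previous release.
previous_release = [40, 95, 110, 70, 30, 10]
for phase, expected in enumerate(predict_defects(previous_release, 6), start=1):
    print(f"phase {phase}: about {expected:.0f} defects expected")
```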
Predictive Analytics can also predict the severity of new defects. One study used a multinomial Naive Bayes algorithm to classify a set of defects using their description, severity, and affected project component. After training on data from public bug trackers, the model was able to categorize defect severity with an accuracy of 76–90%. This meant potentially critical defects could be identified immediately without having to be triaged first.
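For a feel of how such a classifier works, here is a small multinomial Naive Bayes written from scratch with Laplace smoothing. It is not the model from the study: the six training defects, the `features` helper, and the two severity labels are invented for this sketch, and a real model would train on thousands of tracker entries.

```python
import math
from collections import Counter, defaultdict


def features(component, description):
    """Turn a defect into tokens: description words plus a component tag."""
    return description.lower().split() + ["component=" + component.lower()]


class MultinomialNB:
    """Minimal multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, token_lists, labels):
        self.word_counts = defaultdict(Counter)
        self.class_counts = Counter(labels)
        for tokens, label in zip(token_lists, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, tokens):
        known = [t for t in tokens if t in self.vocab]  # ignore unseen words
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for token in known:
                score += math.log((counts[token] + 1) / denom)  # log likelihood
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical training data: (component, description, severity).
training = [
    ("auth", "app crash on login", "critical"),
    ("storage", "data loss after crash", "critical"),
    ("storage", "server crash corrupts data", "critical"),
    ("ui", "typo in settings label", "minor"),
    ("ui", "button color misaligned", "minor"),
    ("ui", "label text misaligned in settings", "minor"),
]
model = MultinomialNB().fit(
    [features(component, text) for component, text, _ in training],
    [severity for _, _, severity in training],
)

print(model.predict(features("storage", "crash corrupts user data")))
print(model.predict(features("ui", "misaligned label in settings")))
```

Working in log space avoids numeric underflow when descriptions are long, and the add-one smoothing keeps a single unseen word from zeroing out a class.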
Deep learning models have been used to estimate story points. To find the total testing time, we can combine these story points with QA's average throughput and factor in risks such as development delays. This gives us the likelihood of completing a test plan before its target date. Focused Objective, an agile consulting company, demonstrates this in their forecasting spreadsheets, which generate a range of dates and estimate the probability of meeting test goals before those dates.
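One common way to turn throughput and risk into a probability is Monte Carlo simulation. The sketch below is an assumption about how such a forecast could be built, not a reproduction of any particular spreadsheet: the function names, the single "development delay" risk, and the sample throughput figures are all hypothetical.

```python
import random


def simulate_weeks(remaining_points, weekly_throughputs, delay_prob, delay_weeks, rng):
    """Simulate one possible future and return how many weeks testing takes."""
    weeks = 0
    done = 0.0
    while done < remaining_points:
        done += rng.choice(weekly_throughputs)  # resample a past week's throughput
        weeks += 1
    if rng.random() < delay_prob:  # the risk (e.g. a development delay) occurs
        weeks += delay_weeks
    return weeks


def probability_of_meeting(deadline_weeks, remaining_points, weekly_throughputs,
                           delay_prob=0.0, delay_weeks=0, trials=10_000, seed=0):
    """Estimate the chance of finishing within deadline_weeks."""
    if max(weekly_throughputs) <= 0:
        raise ValueError("at least one historical throughput must be positive")
    rng = random.Random(seed)
    on_time = sum(
        simulate_weeks(remaining_points, weekly_throughputs,
                       delay_prob, delay_weeks, rng) <= deadline_weeks
        for _ in range(trials)
    )
    return on_time / trials


# Hypothetical inputs: 120 story points left to test, six weeks of past
# throughput, and a 30% chance that a development delay costs two weeks.
history = [18, 22, 15, 25, 20, 17]
for deadline in range(5, 11):
    p = probability_of_meeting(deadline, 120, history, delay_prob=0.3, delay_weeks=2)
    print(f"finish within {deadline} weeks: {p:.0%}")
```

Resampling real past weeks instead of using a single average keeps the natural variation in QA's throughput, which is what produces a range of dates rather than a single estimate.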
Test times can also predict performance problems. A simple linear regression model can show how execution times have changed for individual test cases and the plan as a whole. This highlights performance anomalies that would otherwise impact customers.
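A trend check like this needs very little code. The sketch below fits an ordinary least-squares line to each test's execution times across consecutive runs and flags tests whose time is climbing. The test names, the timings, and the 5% growth threshold are assumptions chosen for the example.

```python
def fit_line(ys):
    """Least-squares slope and intercept of ys against run index 0, 1, 2, ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two runs to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def slowing_tests(history, min_relative_growth=0.05):
    """Return {test name: slope} for tests slowing by more than the threshold.

    Growth is measured per run, relative to the test's mean execution time.
    """
    flagged = {}
    for name, times in history.items():
        slope, _ = fit_line(times)
        mean_time = sum(times) / len(times)
        if mean_time > 0 and slope / mean_time > min_relative_growth:
            flagged[name] = slope
    return flagged


# Hypothetical execution times in seconds over the last five runs.
runs = {
    "test_login": [1.20, 1.19, 1.21, 1.20, 1.22],
    "test_checkout": [3.0, 3.4, 3.9, 4.5, 5.2],
    "test_search": [0.80, 0.82, 0.79, 0.81, 0.80],
}
for name, slope in slowing_tests(runs).items():
    print(f"{name}: slowing by about {slope:.2f}s per run")
```

The same `fit_line` function can be applied to the total time of the whole plan by summing the per-test times for each run first.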
Feeding insights back into the SDLC helps prevent new defects by identifying components that are the most at risk. This can influence the direction of new features, the allocation of development resources, and the scope of new changes. The result is fewer defects created during development and even fewer defects reaching the customer.
A 2020 study conducted by Synopsys found that poor-quality software cost US companies alone $2.08 trillion. Software failures accounted for 75% of that cost, a 22% increase from 2018.
And those are only the failures that were reported, clearly just the tip of the iceberg. In this context, it is vital that QA departments deploy every tool at their disposal to minimize the devastating impact that software bugs can have on their company's customers and the public at large.
As computing power continues to increase and costs continue to fall, the real potential for Predictive Analytics to make a significant impact on the testing industry is yet to be fully realized. With the adoption of Predictive Analytics expected to grow by 21.9% from 2020 to 2027, the opportunity to supercharge your QA has never been greater.