The New G2 Crowd Algorithm

G2 Crowd’s primary goal is to help buyers make better software buying decisions. It was a founding principle for the company and is woven throughout the website, blog and research reports. As part of that buyer focus, the first product comparison we developed from the site’s review data was the Grid℠, which remains our flagship for comparing software products in a category. It provides a comparison matrix based on two factors, user satisfaction and market presence, and is continuously updated in real time on the category page. An example of a Grid can be found here. Grids are also released periodically in a Grid Report.

Both axes of the Grid℠ are based on reviews, but each axis combines the review data using its own proprietary algorithm, and the market presence score also draws on external data. The components of each score are publicly available on the G2 Crowd website, but the weightings of the factors are not released, to prevent “gaming” of the score. The two algorithms were developed, tested, and put into production a few years ago, and have continued to function as designed. However, in a review earlier this year, we identified a few areas of the algorithms that could be improved. With those areas in mind, we made plans to build a version two that would:

  • Incorporate better data sources.
  • Give us the opportunity to survey our community, determine which factors drive most software buying decisions, and adjust the factors and their weightings to better represent what buyers are actually looking for.
  • Add factors to the market presence algorithm to increase its accuracy.

With those goals set, we formed a project team that included the research team, the development team and an outside expert on modeling and game theory, Christof Schlindwein. We kicked off the project to build version two last March.

To begin the project, we conducted a survey of 651 individuals drawn from our community and from an outside survey house. The survey focused on buyer behavior and tested the different factors that influence buying decisions. The data from that survey formed the foundation for the modifications to the satisfaction algorithm and for the weighting of factors in both algorithms. While this list doesn’t represent every change, some important adjustments we made include:

  • Reducing the impact of the Net Promoter Score (NPS) on the satisfaction algorithm (the survey showed that it was the least impactful factor by quite a margin).
  • Making the number of reviews counted dynamic, based on the category, so that in effect there is no cap on the reviews counted in the satisfaction algorithm (in other words, the more reviews a product gets the better, with no fixed upper limit).
  • Adding a factor that rewards higher quality reviews by making them worth more in the calculations (quality is measured in our review evaluation algorithm and validated by our QA team).
  • Adding several new data elements to the market presence algorithm.
  • Revising the data sources for many of the data elements in the market presence algorithm.
  • Changing how missing or “backup” data points influence the algorithm to increase accuracy.
  • Refining how reviews “age” to make it more important for products to have current reviews that more accurately reflect the product releases currently available.
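
To make the ideas of review “aging” and quality weighting more concrete, here is a minimal sketch of how such factors could be combined into one weighted average. It is not the G2 Crowd algorithm, whose factors and weightings are proprietary. The exponential half-life decay, the 0-10 rating scale, the quality score between 0 and 1, and every name in the code are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Review:
    rating: float    # satisfaction rating on an assumed 0-10 scale
    age_days: float  # days since the review was submitted
    quality: float   # assumed review-quality score between 0 and 1


def recency_weight(age_days: float, half_life_days: float = 365.0) -> float:
    """Hypothetical aging rule: a review loses half its weight every half-life."""
    return 0.5 ** (age_days / half_life_days)


def weighted_satisfaction(
    reviews: Iterable[Review], half_life_days: float = 365.0
) -> Optional[float]:
    """Average the ratings, weighting each review by recency and quality.

    Returns None when no review carries any weight.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for review in reviews:
        weight = recency_weight(review.age_days, half_life_days) * review.quality
        weighted_sum += weight * review.rating
        total_weight += weight
    if total_weight == 0.0:
        return None
    return weighted_sum / total_weight


if __name__ == "__main__":
    sample = [
        Review(rating=9.0, age_days=30, quality=0.9),   # recent, detailed review
        Review(rating=5.0, age_days=900, quality=0.4),  # old, thin review
    ]
    print(weighted_satisfaction(sample))
```

In this sketch a recent, high-quality review pulls the score toward its rating far more than an old, low-quality one, and every review contributes something, so there is no fixed cap on the number of reviews counted.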

And that’s just a quick summary; there are many more small changes that increase accuracy. As you can see, quite a lot of effort went into this new version so that it represents the review data as accurately as possible. After many weeks of testing, the new algorithm is now live on the site across all categories. If you have any questions about version two, email us at research@g2crowd.com.