About the data

Transforming the raw data requires significant effort

spotlightonspend is an online platform that seeks to deliver meaningful visibility of public sector spending on goods & services. To that end, significant effort is required to improve the original raw financial data so that it is accessible, relevant and of value to the general public. This data improvement work is undertaken by Spikes Cavell, a private sector organization that transforms and analyzes spend and related data for more than 1,000 public sector bodies worldwide to help them save money, address important policy-related questions and, as a by-product of those efforts, deliver transparency.

A quick, cost-effective and pragmatic methodology

Spikes Cavell's methodology has been developed over the past nine years and is a quick, cost-effective and pragmatic approach to turning raw financial data from a public body's financial management systems into actionable business intelligence designed to support the delivery of better value for money in public spending. The methodology comprises the following broad steps:

  • Extraction - to support rapid extraction of the raw financial data with minimum effort, the public body is provided with a documented data extraction specification and an experienced project manager provides practical advice, guidance and support to ensure that the raw data is properly and accurately extracted.
  • Standardization - the validated data extract is processed using a specially developed engine to standardize the data, remove duplicates, identify and fix errors and prepare the file for subsequent processing.
  • Redaction - to minimize the risk of inadvertent breach of personal privacy laws, the validated and standardized data extract is further processed to identify payments made to individuals (for example, expense payments made directly to staff). Spikes Cavell's redaction algorithms leverage unique reference datasets designed to ensure that bona fide sole traders are not inadvertently obscured. Once a payee has been identified as an individual and validated by a data analyst, any identifying information is overwritten so that the individual cannot be identified.
  • Classification - public bodies' financial management systems are broadly similar, but they differ enough in how spend is coded (for example by department, cost center or subjective code) that meaningful like-for-like comparisons of spend on goods & services are not possible from the raw data alone. To overcome the absence of uniform and reliable classification, a matching & inference engine is used to match the supplier record or item description to Spikes Cavell and licensed 3rd party reference datasets and append classifications derived from Spikes Cavell's "vCode". The vCode is a hierarchical classification system specific to the public sector that facilitates the classification of suppliers of goods & services in such a way that spend data can be compared across public sector organizations. Classifications accounting for 97% of spend by value are validated by classification experts. It is this classification and validation effort that is used to provide analysis of "Spend by Category" in spotlightonspend.
  • Enrichment - the matching & inference engine is also used to match the supplier record to the Spikes Cavell and licensed 3rd party reference datasets and append a range of attributes to each supplier record. Standard attributes include: the Number of Employees, Annual Revenue (Actual or Modelled), Date of Incorporation (Birth Year), Geographic Location and Risk Classification (Modelled). Several of these attributes are used to deliver "Spend in Summary" in spotlightonspend.
  • Aggregation - we bring the standardized, classified and enriched records together and link the supplier records to each supplier’s master record in the Spikes Cavell reference datasets. The aggregated datasets are used to calculate national averages that use all of the datasets processed by Spikes Cavell irrespective of whether the public body has elected to have their enhanced spend data published to the spotlightonspend platform. The unique "national averages" allow for effective comparison by entity type and are used in the "Spend in Summary" section of the spotlightonspend application.
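
As an illustration only, the standardization and redaction steps above might be sketched as follows. This is not Spikes Cavell's engine: the Payment record, its field names, the is_individual flag (standing in for the analyst-validated identification of an individual) and the "REDACTED" placeholder are all hypothetical, and real standardization involves far more than normalizing supplier names.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Payment:
    # Hypothetical record layout; the real extract specification is not public here.
    supplier: str
    amount: float
    invoice_ref: str
    is_individual: bool = False  # assumed to be set once an analyst has validated the match


def standardize(payments):
    """Normalize supplier names and drop exact duplicates, keeping the first occurrence."""
    seen = set()
    result = []
    for p in payments:
        # Collapse runs of whitespace and upper-case the name so variants compare equal.
        cleaned = replace(p, supplier=" ".join(p.supplier.split()).upper())
        key = (cleaned.supplier, cleaned.amount, cleaned.invoice_ref)
        if key not in seen:
            seen.add(key)
            result.append(cleaned)
    return result


def redact(payments, placeholder="REDACTED"):
    """Overwrite the identifying name on payments flagged as made to individuals."""
    return [replace(p, supplier=placeholder) if p.is_individual else p for p in payments]


raw = [
    Payment("Acme  Ltd", 100.0, "INV1"),
    Payment("ACME LTD", 100.0, "INV1"),        # duplicate once names are normalized
    Payment("Jane Doe", 50.0, "EXP7", True),   # expense payment to a member of staff
]
cleaned = redact(standardize(raw))
```

In this sketch the duplicate invoice collapses to a single record and the staff expense payment keeps its amount but loses the payee's name.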

Who supplies the data to spotlightonspend

spotlightonspend uses invoice data supplied by public bodies that have elected to have their spend data published to spotlightonspend.org. spotlightonspend is powered by Spikes Cavell, a private company that classifies the invoice data into clear and consistent categories of goods and services that people will recognize, such as "Computer Hardware" and "Offices of Lawyers". It also provides information on overall spend against useful and standardized metrics such as average spend per creditor, spend with small and medium-sized enterprises (SMEs) and spend with local suppliers.

How the values are calculated

You can view data for those public bodies that have elected to publish their data to spotlightonspend. Comparisons and national averages for public bodies are calculated using all the available data supplied to Spikes Cavell, including data from bodies that have not elected to publish.
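
The distinction between what is viewable and what feeds the averages can be sketched as below. The exact formula behind a "national average" is not stated on this page, so the sketch assumes one plausible reading: the mean, across all bodies of a given entity type, of each body's average spend per creditor. The record fields (body, entity_type, supplier, amount, published) are hypothetical.

```python
from collections import defaultdict


def average_spend_per_creditor(records):
    """Return {body: total spend / number of distinct suppliers} for the given records."""
    spend = defaultdict(float)
    suppliers = defaultdict(set)
    for r in records:
        spend[r["body"]] += r["amount"]
        suppliers[r["body"]].add(r["supplier"])
    return {body: spend[body] / len(suppliers[body]) for body in spend}


def national_average(records, entity_type):
    """Mean of the per-body metric over ALL bodies of this type, published or not (assumed formula)."""
    per_body = average_spend_per_creditor(
        [r for r in records if r["entity_type"] == entity_type]
    )
    return sum(per_body.values()) / len(per_body) if per_body else None


def viewable(records):
    """Per-body metric restricted to bodies that elected to publish."""
    return average_spend_per_creditor([r for r in records if r["published"]])


records = [
    {"body": "A", "entity_type": "county", "supplier": "X", "amount": 100.0, "published": True},
    {"body": "A", "entity_type": "county", "supplier": "Y", "amount": 300.0, "published": True},
    {"body": "B", "entity_type": "county", "supplier": "Z", "amount": 400.0, "published": False},
]
county_average = national_average(records, "county")
published_view = viewable(records)
```

Here body B contributes to the county average even though its own figures never appear in the viewable results.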

Suppliers with whom total spend was less than $1,000 over a 12-month period are excluded from the charts and graphs, as experience has shown that such spend usually represents less than 1% of total spend on goods and services. Transaction detail, which may include payments to vendors with whom less than $1,000 was spent over a 12-month period, depending on the threshold set by the organization, is available in the raw data download section of this site.
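
A minimal sketch of this exclusion rule, assuming the transactions passed in have already been limited to the relevant 12-month period. The function names and the (supplier, amount) input shape are hypothetical and chosen for illustration.

```python
from collections import defaultdict

THRESHOLD = 1000.0  # suppliers with total spend below this are left out of the charts


def supplier_totals(transactions):
    """Sum spend per supplier from (supplier, amount) pairs."""
    totals = defaultdict(float)
    for supplier, amount in transactions:
        totals[supplier] += amount
    return dict(totals)


def chart_suppliers(transactions, threshold=THRESHOLD):
    """Keep only suppliers whose total spend is at least the threshold."""
    return {s: t for s, t in supplier_totals(transactions).items() if t >= threshold}


def excluded_share(transactions, threshold=THRESHOLD):
    """Fraction of total spend that falls with suppliers below the threshold."""
    totals = supplier_totals(transactions)
    grand_total = sum(totals.values())
    excluded = sum(t for t in totals.values() if t < threshold)
    return excluded / grand_total if grand_total else 0.0


transactions = [("A", 600.0), ("A", 500.0), ("B", 900.0), ("C", 1000.0)]
charted = chart_suppliers(transactions)
share = excluded_share(transactions)
```

The threshold applies to a supplier's total over the period, not to individual transactions, so supplier A is charted even though neither of its payments reaches $1,000 on its own. The toy figures are not representative: in real data the excluded share is reported to be usually under 1%.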