Evaluating ACR: Introducing Pex’s audio fingerprinting benchmark toolkit

Jakub Gałka

When social media platforms first emerged, they were about connection and sharing your life with friends through words or images. As these platforms grew and changed, so did the type of content they hosted. Suddenly you could share music and videos that could be viewed by millions of people. Then came monetization, and with revenue on the line, it became increasingly important to establish who owns each piece of content so that the right people could be paid. Since the early 2000s, technologies have been developed to identify, analyze, and track digital content, especially music. One of those technologies is Automated Content Recognition (ACR), which is primarily used to track audio but can also be used for video, images, and text. It’s used by major platforms and rightsholders to identify and license content, and it becomes more necessary every day. Yet it also becomes less adequate nearly every day: this kind of technology must innovate at the same rate as digital content, or it becomes obsolete.

Content has evolved, but not all ACR technologies have 

There are numerous ACR technology providers, and some content platforms have even created their own. And yet, we don’t see all content properly identified and attributed. New developments in media and tech, such as AI-generated content, speed- and pitch-modified audio, NFTs, and social media trends, make content more and more difficult to identify. For example, Pex found hundreds of millions of modified audio tracks across social media platforms that were not identified by leading ACR solutions. In Q1 of 2023 alone, nearly 64% of identified matches on SoundCloud and 30% of matches on TikTok were modified audio. Clearly, not all ACR technologies have risen to meet the challenges these developments have created.

How do you know which ACR solutions are keeping up? How do you know if content is being missed or what an adequate identification rate is? ACR solutions must be tested and compared to other solutions – that is the only way to set standards for identification and accuracy. To date, this has largely not been done. Why? Because developing a fair test is complicated, and it can be even more challenging to find willing participants. 

In order to properly credit creators, pay rightsholders, and uphold the value of copyright, ACR technology needs to evolve with content. In order to do that, the technology must be tested against real-world use cases and constantly improved to meet new challenges. 

Introducing the first toolkit for benchmarking audio ACR

Pex is the global leader in digital rights technology, and our ACR technology has been trained and tested on digital content’s most challenging use cases. We are always evolving our solutions to identify more content with greater accuracy, and we believe we are best suited to propose a first-of-its-kind Audio Fingerprinting Benchmark Toolkit: an open software library for designing and running benchmarks that measure ACR solutions.

With this toolkit, there is finally a framework for testing ACR that is fair, challenging, and repeatable. The kit provides tools to easily generate challenging datasets with several real-world audio transformations applied, including audio with modified speed or pitch, echo and reverberation, filtering, noise, and cropping. Such transformations are typically found in user-generated content, audio mixes, and other modern content.
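To make the transformations listed above concrete, here is a minimal sketch in plain Python. It is not the toolkit’s own code: the function names (change_speed, add_echo, add_noise, crop) and their parameters are hypothetical, and audio is assumed to be a list of mono float samples.

```python
import math
import random

def change_speed(samples, factor):
    """Resample by linear interpolation. factor > 1 plays faster and raises pitch."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    n_out = int(len(samples) / factor)
    out = []
    for i in range(n_out):
        pos = i * factor                      # fractional position in the input
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append((1 - frac) * samples[lo] + frac * samples[hi])
    return out

def add_echo(samples, sample_rate, delay_s, decay):
    """Mix in a single delayed, attenuated copy of the signal."""
    delay = round(delay_s * sample_rate)
    out = list(samples)
    for i in range(delay, len(samples)):
        out[i] += decay * samples[i - delay]
    return out

def add_noise(samples, snr_db, seed=0):
    """Add white Gaussian noise at the requested signal-to-noise ratio (dB)."""
    if not samples:
        return []
    rng = random.Random(seed)                 # seeded so datasets are repeatable
    signal_power = sum(s * s for s in samples) / len(samples)
    sigma = math.sqrt(signal_power / (10 ** (snr_db / 10)))
    return [s + rng.gauss(0.0, sigma) for s in samples]

def crop(samples, sample_rate, start_s, duration_s):
    """Keep only the excerpt that starts at start_s and lasts duration_s seconds."""
    start = round(start_s * sample_rate)
    return samples[start:start + round(duration_s * sample_rate)]

# Example: one second of a 440 Hz tone, transformed four ways.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
faster = change_speed(tone, 1.25)
echoed = add_echo(tone, sample_rate, delay_s=0.1, decay=0.5)
noisy = add_noise(tone, snr_db=10)
clip = crop(tone, sample_rate, start_s=0.25, duration_s=0.5)
print(len(faster), len(echoed), len(noisy), len(clip))
```

Seeding the noise generator is a deliberate choice here: a benchmark is only repeatable if the same inputs always produce the same transformed files.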

Overview of the toolkit

The purpose of the Audio Fingerprinting Benchmark Toolkit is to evaluate the accuracy of different audio fingerprinting and matching solutions. The aim of audio matching is to identify which pieces of the reference audio files are present in the query audio files. This is usually achieved by first creating unique audio fingerprints for both the reference and query audio files, and then using those fingerprints to compute a similarity confidence score. To make evaluation easy, the toolkit comes with a dataset of carefully crafted and processed audio files, which can be used for benchmarks at various difficulty levels: easy, medium, or hard. By opening the software, we encourage transparency and set fair and just conditions for anyone interested in using the toolkit.
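The fingerprint-then-match flow described in this overview can be illustrated with a deliberately simple stand-in. The sketch below is an assumption for illustration only: it uses per-frame zero-crossing counts as a toy fingerprint, which is not how Pex or any production ACR system works, and the names fingerprint and match are hypothetical.

```python
import math

def fingerprint(samples, frame_size=256):
    """Toy fingerprint: the number of sign changes in each non-overlapping frame."""
    prints = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        prints.append(crossings)
    return prints

def match(reference_fp, query_fp, tolerance=1):
    """Slide the query over the reference.

    Returns (best_offset_in_frames, confidence), where confidence is the
    fraction of query frames that agree with the reference within tolerance.
    """
    best_offset, best_conf = None, 0.0
    if not query_fp or len(query_fp) > len(reference_fp):
        return best_offset, best_conf
    for offset in range(len(reference_fp) - len(query_fp) + 1):
        window = reference_fp[offset:offset + len(query_fp)]
        hits = sum(1 for r, q in zip(window, query_fp) if abs(r - q) <= tolerance)
        conf = hits / len(query_fp)
        if conf > best_conf:
            best_offset, best_conf = offset, conf
    return best_offset, best_conf

sample_rate = 8000
# Reference: a 2-second sweep from 200 Hz up to 2000 Hz, so frames differ over time.
reference = [math.sin(2 * math.pi * (200 * t + 450 * t * t))
             for t in (n / sample_rate for n in range(2 * sample_rate))]
query = reference[10 * 256:30 * 256]          # excerpt starting at frame 10
unrelated = [math.sin(2 * math.pi * 50 * n / sample_rate) for n in range(20 * 256)]

ref_fp = fingerprint(reference)
print(match(ref_fp, fingerprint(query)))      # offset and confidence for the excerpt
print(match(ref_fp, fingerprint(unrelated)))  # offset and confidence for unrelated audio
```

A benchmark treats the system under test as a black box of this shape: audio goes in, and (reference, query, confidence) matches come out to be scored.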

A concise implementation of accuracy metrics, which ensures fair comparison between different tests and systems, is also provided as an essential part of the toolkit. The evaluation considers:

  • True Positive count (TPs): the number of correctly recognized (correctly matched) audio pairs. The higher the value, the higher the accuracy.
  • False Negative count (FNs): the number of non-recognized (missed) audio pairs. The lower the value, the higher the accuracy.
  • False Positive count (FPs): the number of falsely recognized (wrongly matched) audio pairs. The lower the value, the higher the accuracy.
  • Cumulative recall and precision metrics are also computed for each experiment for easy comparison between systems.
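As a rough illustration of how the counts above turn into precision and recall, here is a minimal sketch. It assumes matches are represented as (query, reference) pairs and uses the standard definitions precision = TP / (TP + FP) and recall = TP / (TP + FN). The function name evaluate and the pair representation are hypothetical; the toolkit’s own implementation may differ, for example in how results are accumulated across experiments.

```python
def evaluate(expected_pairs, reported_pairs):
    """Score reported (query, reference) matches against the ground truth."""
    expected = set(expected_pairs)
    reported = set(reported_pairs)

    tp = len(expected & reported)   # correctly matched pairs
    fn = len(expected - reported)   # pairs the system missed
    fp = len(reported - expected)   # pairs the system matched wrongly

    # Guard against division by zero when nothing was reported or expected.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"tp": tp, "fn": fn, "fp": fp, "precision": precision, "recall": recall}

# Ground truth: four query files, each containing one known reference.
expected = [("q1", "r1"), ("q2", "r2"), ("q3", "r3"), ("q4", "r1")]
# Hypothetical system output: two correct matches, one wrong, two missed.
reported = [("q1", "r1"), ("q2", "r2"), ("q3", "r9")]

result = evaluate(expected, reported)
print(result)
```

In this example the system finds two of the four true pairs (recall 0.5), and two of its three reported matches are correct (precision of about 0.67).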

By publishing this toolkit, Pex presents what we believe industry best practices should be and encourages all interested parties to adopt them for ACR technology evaluation, regardless of the vendor type or scale of implementation.

Rightsholders and platforms: how does your ACR provider compare?

At Pex, we have found many instances of content not being identified by legacy ACR technologies, especially when it comes to modified audio. We invite all audio and music rightsholders to test their providers using the Pex toolkit and determine the accuracy of their current solution. Rightsholders or digital content platforms in need of ACR technology can also use the toolkit to assess vendors. A provider with high accuracy will reduce infringement liability for platforms and distributors, and increase revenue for rightsholders. Researchers can use the toolkit to define and generate more challenging experiments and accelerate technology development.

Interested in seeing Pex technology in action or comparing our solutions to your provider? Reach out to schedule a demo with our team.