Several weeks ago, Lyor Cohen, YouTube’s music ambassador, published a blog post on the role of YouTube in the music industry. He pointed out that “there’s still a disconnect between YouTube and the rest of the industry” and outlined 5 factors that, in his opinion, are responsible for the current state of affairs.
Just a day later, Cary Sherman, the CEO of the RIAA, wrote a reply that questioned many of Cohen’s statements and backed some of the counterarguments with data.
No matter who you side with, we all can agree that the lack of transparency is not helping the debate. So instead of picking sides, we will try to answer a question: How big is music on YouTube?
But before we do so, you need to understand where the data is coming from. Pex is a search engine not dissimilar to Google or Bing, with a primary focus on video and music.
At its core, Pex crawls the web to identify audio-visual content. When it does so, it fingerprints the multimedia files and also extracts the surrounding metadata, which is then constantly updated (on average every couple of hours). That is why we have clear visibility, not only into the performance of selected content, but also into whole platforms, including YouTube. At the time of writing, Pex has indexed over 6.1B videos and songs across 20+ sites, including YouTube, Facebook, Instagram, Soundcloud, Vk, Youku/Tudou, Twitch, and more.
So, how big is music on YouTube?
One way to answer this question is to count all of the videos uploaded to YouTube that are registered in the Music category. However, this approach is flawed, as the categories are self-reported by the uploader. Users can choose one of 15 or so categories to best describe their content, with People & Blogs being the default selection.
In the table above, you can see the breakdown of all the videos uploaded to YouTube by August 22, 2017, sorted by views. Music is by far the most viewed category, accounting for over 27% of overall traffic.
A better way to answer the question above is to look at how many videos contain any music, regardless of the user’s categorization of the video. Thankfully we can do so, because we run a classifier on every single ingested video, which annotates the audio track with labels like “music”, “human speech”, and more.
Based on our calculations, more than 84% of videos contain at least 10 seconds of music.
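To make the calculation concrete, here is a minimal sketch of how such a share could be computed from per-video classifier labels. It is an illustration only, not Pex’s actual pipeline: the segment layout (label, start, end), the function names, and the sample videos are all hypothetical, and it assumes the 10 seconds are counted cumulatively across a video rather than as one continuous stretch.

```python
from typing import Dict, List, Tuple

# Hypothetical classifier output: (label, start_seconds, end_seconds).
Segment = Tuple[str, float, float]

MIN_MUSIC_SECONDS = 10.0  # threshold quoted in the article


def music_seconds(segments: List[Segment]) -> float:
    """Total duration of segments labeled "music" (assumes no overlaps)."""
    return sum(end - start for label, start, end in segments if label == "music")


def share_with_music(
    videos: Dict[str, List[Segment]],
    min_seconds: float = MIN_MUSIC_SECONDS,
) -> float:
    """Fraction of videos with at least `min_seconds` of music in total."""
    if not videos:
        return 0.0
    hits = sum(1 for segs in videos.values() if music_seconds(segs) >= min_seconds)
    return hits / len(videos)


# Made-up sample data, only to show the shape of the computation.
sample = {
    "video_a": [("music", 0.0, 42.0), ("human speech", 42.0, 60.0)],
    "video_b": [("human speech", 0.0, 30.0)],
    "video_c": [("music", 5.0, 9.0), ("music", 20.0, 28.0)],  # 12 s in total
    "video_d": [("music", 0.0, 4.0)],  # below the threshold
}

share = share_with_music(sample)
print(f"{share:.0%}")  # 2 of the 4 sample videos qualify
```

Counting music cumulatively is a design choice; requiring one continuous 10-second stretch would give a lower share for the same data.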
These results say nothing about who owns the music or about its legal status. At the moment, that question is far too complicated to answer.
What we can try to answer instead is: How much of the content containing music is being monetized on YouTube? As Mr. Cohen stated in his article, “as of 2016, 99.5 percent of music claims on YouTube are matched automatically by Content ID and are either removed or monetized.”
However, our data shows that almost 65% of these videos are not claimed, and thus generate no revenue.
What the data doesn’t show is the correctness of the claims. It’s not easy to tell whether the videos are being claimed by the true rightsholders or by users who may be claiming ownership improperly.
If we plot the data over time, the trend seems to be getting worse, not better. Here is a graph showing the difference between “videos not claimed” vs. “claimed videos”, over time.
When we speculated about why this could be, we thought that maybe Content ID doesn’t have enough reference files to claim more content. To test this idea, we picked one of the most popular songs on YouTube: “Gangnam Style”.
Of all 891,685 copies we found, 182,220 weren’t claimed (~20%), representing an accumulated 0.5B views.
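The figures above can be checked with a few lines of arithmetic. This is only a sketch: the variable names are ours, and the per-copy view average is approximate because the 0.5B view total is a rounded number.

```python
# Figures quoted in the article for "Gangnam Style" copies on YouTube.
total_copies = 891_685
unclaimed_copies = 182_220
unclaimed_views = 0.5e9  # rounded total from the article

unclaimed_share = unclaimed_copies / total_copies
claimed_copies = total_copies - unclaimed_copies
# Rough average only, since the view total is rounded to 0.5B.
avg_views_per_unclaimed = unclaimed_views / unclaimed_copies

print(f"unclaimed share: {unclaimed_share:.1%}")  # 20.4%
print(f"claimed copies:  {claimed_copies:,}")     # 709,465
print(f"avg views per unclaimed copy: roughly {avg_views_per_unclaimed:,.0f}")
```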
Perhaps this performance is because the segments containing Gangnam Style were under 20 seconds? With the recent boom of short-form content, this would make sense. However, when we consulted the data, the results painted a very different picture.
The average length of a segment containing Gangnam Style’s music is 46.6 seconds for content that was not claimed. That is roughly a third of the average segment length for claimed content (131.7 seconds), but still long enough for Content ID to identify these videos.
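A quick check of that comparison, using only the averages quoted in this article. The 20-second value is the short-segment threshold floated earlier, not a documented Content ID limit, and the variable names are ours.

```python
# Average matched-segment lengths quoted in the article, in seconds.
unclaimed_avg_s = 46.6
claimed_avg_s = 131.7
short_segment_threshold_s = 20.0  # hypothesis from the article, not a Content ID spec

ratio = claimed_avg_s / unclaimed_avg_s
above_threshold = unclaimed_avg_s > short_segment_threshold_s

print(f"claimed segments are about {ratio:.1f}x longer on average")
print(f"unclaimed average exceeds {short_segment_threshold_s:.0f} s: {above_threshold}")
```

An average above the threshold does not rule out many individual segments below it; the full distribution of segment lengths would be needed to settle that.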
In fact, the average duration of all videos uploaded to YouTube surpassed 14 minutes (851 seconds) at the end of 2016, and is currently approaching the 15-minute mark. It seems that truly short content is more commonly uploaded to other platforms, like Giphy or Instagram. If Pex can match segments of videos down to 0.5 seconds, across all platforms, the length of matches and videos on YouTube should be no excuse for Content ID.