Did You Know Spotify Uses Peer-to-Peer Networking?

[Illustration from The Pansentient League showing how Spotify streams songs.]

Spotify just hit 2.5M paying subscribers and is poised to announce a "new direction" on Wednesday, four months after launching in the United States. But few people probably give much consideration to how the U.K.-based music streaming service that started in Sweden actually works.

That's why a post on the Spotify-tracking blog The Pansentient League -- based on a 2010 academic paper by Gunnar Kreitz and Fredrik Niemela of Sweden's Royal Institute of Technology and Spotify -- caught our attention.

Unlike Pandora and Spotify's other competitors, Spotify's desktop app for Windows and Mac combines a peer-to-peer network with server-based streaming. Why use P2P? Quite simply, to make the service more scalable and lighten the load on Spotify's servers.

Spotify manages to do this without making users wait too long for a requested song to start playing. The paper reports that the median latency before a track starts playing is 265 milliseconds, which is faster than the average blink of an eye (300 to 400 ms). Any Spotify user can tell you that songs seem to start instantly.

Note: Spotify's mobile apps stream only from Spotify's servers. (The paper doesn't address mobile apps.)

Among the paper's interesting data points: about 61% of playbacks happen in a predictable order (one song after the other in an album, for example) and less than 1% of all playbacks "stutter."

During one measurement period, just 8.8% of data came from the servers, with 35.8% from the P2P network and 55.4% from cached data. Kreitz and Niemela noticed that more music is played from the cache at night, although this is less true on weekend nights, when people play more new music. Also, the P2P network gets more of a workout on weekends than on weekdays.

In the Hacker News discussion of the Pansentient post, a user who goes by Sudonim says he canceled his Spotify subscription partly because they use P2P networking. "If I'm paying for it, I don't want to be a node in their network. Most users are probably clueless about that because they aren't up front at all."

The paper points out early on that Spotify uses "similar mechanisms" for locating peers as file-sharing systems like BitTorrent and Gnutella, which are widely associated with illegal downloads. Coupling this with Sudonim's reasoning makes us wonder if Spotify's system might become a PR issue as the company grows.

How the P2P Network on Spotify Works

When you request a song from Spotify, your desktop app, which is called a "client" in the Kreitz and Niemela paper, decides where to stream from based on how much data is in your play-out buffer.

"The connection to the server is assumed to be more reliable than peer-connections, so if a client’s buffer levels are low, it requests data from the server," they write. "As long as the client’s buffers are sufficiently full and there are peers to stream from, the client only streams from the peer-to-peer network."

When you request music popular with other Spotify users, you're more likely to get the track from the P2P network than the server, Kreitz and Niemela point out.
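
For readers who think in code, the buffer rule Kreitz and Niemela describe might look roughly like the sketch below. It is only an illustration: the function name, the `Source` enum, and the 3-second threshold are our own assumptions, since the paper as quoted here gives no concrete numbers, and the real client may well mix server and peer requests in ways this does not capture.

```python
from enum import Enum


class Source(Enum):
    SERVER = "server"
    PEERS = "peers"


# Hypothetical threshold: the article does not say what counts as a "low" buffer.
LOW_BUFFER_SECONDS = 3.0


def choose_source(buffered_seconds: float, peers_available: int,
                  low_threshold: float = LOW_BUFFER_SECONDS) -> Source:
    """Decide where to request the next piece of a track from.

    Falls back to the server when the play-out buffer is low or when
    no peers are known; otherwise relies on the peer-to-peer network.
    """
    if buffered_seconds < low_threshold or peers_available == 0:
        return Source.SERVER
    return Source.PEERS
```

With these made-up numbers, `choose_source(1.0, 5)` picks the server because the buffer is nearly empty, while `choose_source(10.0, 5)` leaves the servers alone and streams from peers.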

So how is Spotify's P2P network set up? Kreitz and Niemela explain that it's an unstructured network where every peer is equal. Trackers help construct and maintain the network; a tracker keeps a list of 20 of the most recent peers for each song. There are no supernodes performing maintenance functions.
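
A tracker of this kind can be pictured as a small table that remembers only the latest peers seen for each track. The sketch below is our own guess at the idea, not Spotify's code: the `Tracker` class, its method names, and the eviction policy (drop the oldest peer once the list is full) are assumptions. Only the limit of 20 peers per track comes from the paper.

```python
from collections import OrderedDict, defaultdict

MAX_PEERS_PER_TRACK = 20  # figure reported in the paper


class Tracker:
    """Remembers the most recently seen peers for each track (hypothetical design)."""

    def __init__(self, max_peers: int = MAX_PEERS_PER_TRACK):
        self.max_peers = max_peers
        # track_id -> peers in order of last announcement (oldest first)
        self._peers = defaultdict(OrderedDict)

    def announce(self, track_id: str, peer_id: str) -> None:
        """Record that peer_id has track_id, evicting the oldest entry if full."""
        peers = self._peers[track_id]
        peers.pop(peer_id, None)      # re-announcing moves the peer to the end
        peers[peer_id] = True
        while len(peers) > self.max_peers:
            peers.popitem(last=False)  # drop the least recently seen peer

    def lookup(self, track_id: str) -> list:
        """Return known peers for track_id, oldest first."""
        return list(self._peers.get(track_id, ()))
```

A client looking for a song would call something like `lookup("some-track-id")` and then try connecting to the peers it gets back. Because every entry is just a peer, no node needs special supernode duties.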

Your Spotify client stores local caches of tracks you've downloaded. "The content of these caches are also what the clients offer to serve to their peers," write Kreitz and Niemela. The protocol is set up so that your client will only offer tracks it has completely cached.
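
The "completely cached" rule is easy to express as a filter over the local cache. The sketch below assumes a cache entry records how many bytes of a track are stored and how many the full track needs. The `CachedTrack` type and its field names are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class CachedTrack:
    track_id: str
    bytes_cached: int   # how much of the track is stored locally
    total_bytes: int    # size of the full track

    @property
    def complete(self) -> bool:
        """True only when the whole track is in the cache."""
        return self.total_bytes > 0 and self.bytes_cached >= self.total_bytes


def offerable_tracks(cache: Iterable[CachedTrack]) -> List[str]:
    """Return the IDs of tracks this client could offer to peers."""
    return [t.track_id for t in cache if t.complete]
```

Under this rule, a track you skipped halfway through would stay in your cache for your own replays but would not be advertised to other clients.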

Read section III of the paper if you want the nitty-gritty details of how the trackers find peers and how clients find songs.

We like the authors' succinct explanation of how your Spotify client gets connected to the P2P network when you fire it up:

"If it was still listed in the tracker for some tracks then it is possible that other clients will connect to it asking for those tracks. If the user starts streaming a track, it will search the peer-to-peer network and connect to peers who have the track, thus becoming a part of the overlay."