Of course, most of us have already heard of Spotify: it provides a more or less scalable music streaming service offering low-latency access to a library of over 8 million tracks in a “non delay-preserving” way. I’ve been reading about how it works and felt like sharing some of the things I found interesting.
So Spotify’s proposed solution consists of offloading some of the workload onto a P2P overlay while keeping latency low.
The P2P network is unstructured, and its construction and maintenance are assisted by trackers, much like the ones in BitTorrent. Nodes connect to each other when they download a track, and there is a limit on the number of established connections, so a heuristic chooses which connections to keep based on their usefulness to the peer and to the overlay network.
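The paper doesn’t spell out the exact scoring, but the idea of keeping only the most useful connections can be sketched like this (the usefulness signals and the cap of 60 connections are my own made-up illustrations):

```python
import heapq

def prune_connections(peers, max_peers=60):
    """Keep only the most useful peer connections.

    `peers` maps peer id -> dict of hypothetical usefulness signals:
    bytes the peer recently served to us and bytes we served to it.
    The scoring is an illustration, not Spotify's actual heuristic.
    """
    def score(stats):
        # Favour peers that were recently useful in either direction.
        return stats["bytes_downloaded"] + stats["bytes_uploaded"]

    # Rank all connected peers and keep the top `max_peers` of them.
    keep = heapq.nlargest(max_peers, peers, key=lambda p: score(peers[p]))
    return set(keep)
```

Everything below the cut-off would then be disconnected the next time the client runs its periodic pruning pass.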
Other interesting details:
- Clients only serve tracks that they have completed downloading.
- There is a separate P2P overlay network for each server, which means there can be two or more disjoint networks.
- There are two mechanisms for finding peers: asking the tracker directly, or querying the overlay network (similarly to Gnutella, a client contacts all its neighbours, which in turn do the same, reaching a depth of 2).
- As tracks are small files, downloading new tracks is a very frequent operation, so lookup time becomes a big issue, which is one of the reasons Spotify does not use a DHT (another being reduced overhead).
- It maintains fairly large caches (10% of free disk space) to keep tracks that were fully downloaded, in order to allow future replays or to serve them on the P2P overlay.
- TCP connections are used and kept established during execution, both to avoid repeated handshakes and to know whether nodes are still alive. Some changes were also made to the Linux kernel to avoid reductions of the congestion window.
- It prefetches the next track before the current one has finished: about 30 seconds before the end it starts searching for the next track in the overlay network, and when only 10 seconds are left it contacts the server directly.
- Generally, nodes decide whether to download from the overlay or from the server depending on the client’s buffer levels. If these are low, they download from the server; otherwise they download only from the overlay network. If the buffer level becomes critically low, the node cancels all of its uploads (emergency mode).
- Files are divided into 16 kB chunks, and nodes decide which other nodes to download from using a greedy algorithm based on expected download times (number of pending requests vs. download speed).
- Before playback commences, the client periodically uses a Markov-chain model to simulate the playback, considering a simulation as failed if an underrun occurs. It performs 100 simulations, and if more than 1 fails, it delays the start of playback.
- It uses urgency schemas to prioritize requests and achieve better quality of service.
- Connections between two peers are attempted in both directions to work around NAT traversal issues, and clients use the UPnP protocol to ask home routers for a port on which to accept incoming connections.
- Searches over the P2P overlay make the system more robust to tracker failures.
- The peer insertion mechanism creates some clustering by interest, and the tracker search allows jumping between different areas of interest (more of a future-work perspective).
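The depth-2 overlay search mentioned above is essentially a bounded flood. Here is a toy sketch of it, with peers as hypothetical in-memory objects (the real protocol of course runs over the network and the `Peer` class is mine, not Spotify’s):

```python
class Peer:
    """Minimal stand-in for an overlay node."""
    def __init__(self, name, tracks=()):
        self.name = name
        self.tracks = set(tracks)
        self.neighbours = []

    def has_track(self, track_id):
        return track_id in self.tracks


def overlay_search(start, track_id, max_depth=2):
    """Gnutella-style limited flood: ask all neighbours, which in turn
    ask theirs, down to `max_depth` hops (2 in Spotify's case)."""
    results, seen, frontier = [], {start}, [start]
    for _ in range(max_depth):
        next_frontier = []
        for peer in frontier:
            for n in peer.neighbours:
                if n in seen:  # avoid asking the same peer twice
                    continue
                seen.add(n)
                if n.has_track(track_id):
                    results.append(n)
                next_frontier.append(n)
        frontier = next_frontier
    return results
```

The depth cap is what keeps the query cheap: the number of contacted peers is bounded by the square of a node’s neighbour count rather than by the size of the overlay.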
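The buffer-driven source selection (including emergency mode) boils down to a few thresholds. The paper only says the decision depends on buffer levels, so the cut-off values below are invented for illustration:

```python
def choose_sources(buffer_seconds, low_watermark=10.0, emergency=3.0):
    """Decide where to fetch from based on the playback buffer level.

    Returns (download_from_server, download_from_overlay, keep_uploading).
    The watermark values are assumptions, not Spotify's real thresholds.
    """
    if buffer_seconds < emergency:
        # Emergency mode: fetch from the server and cancel all uploads.
        return True, False, False
    if buffer_seconds < low_watermark:
        # Low buffer: fall back to the server, but keep serving peers.
        return True, False, True
    # Healthy buffer: rely on the P2P overlay only.
    return False, True, True
```

The interesting design point is the asymmetry: the server is treated as the reliable fallback, and P2P traffic (in both directions) is shed first whenever playback quality is at risk.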
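The greedy chunk-scheduling rule ("number of pending requests vs. download speed") can be read as picking the peer with the lowest expected time to serve one more 16 kB chunk. That reading is my interpretation, sketched as:

```python
CHUNK_BYTES = 16 * 1024  # tracks are split into 16 kB chunks

def pick_peer(peers):
    """Greedily pick the peer expected to deliver the next chunk soonest.

    `peers` maps peer id -> (pending_requests, bytes_per_second).
    Expected time = (pending_requests + 1) * CHUNK_BYTES / speed,
    a plausible reading of the paper's description, not its formula.
    """
    return min(peers,
               key=lambda p: (peers[p][0] + 1) * CHUNK_BYTES / peers[p][1])
```

Counting the requests already queued at a peer is what stops the client from piling every chunk onto the single fastest peer.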
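The "simulate playback 100 times, delay if more than 1 run fails" rule can be illustrated with a crude Monte Carlo stand-in for the paper’s Markov-chain model (the Gaussian download-rate assumption and step size below are mine, not the paper’s):

```python
import random

def playback_will_start(buffered_s, track_s, mean_rate,
                        runs=100, max_failures=1, seed=None):
    """Simulate playback `runs` times; start only if at most
    `max_failures` runs hit a buffer underrun.

    Each run advances in 1-second steps: the buffer gains seconds of
    audio at a random rate around `mean_rate` (a crude stand-in for
    the paper's Markov-chain model) while playback drains 1 s/step.
    """
    rng = random.Random(seed)
    failures = 0
    for _ in range(runs):
        buffered, played = buffered_s, 0.0
        while played < track_s:
            # Fill: clipped Gaussian download rate, capped at track length.
            fill = max(0.0, rng.gauss(mean_rate, mean_rate / 4))
            buffered = min(track_s, buffered + fill)
            # Drain: one second of audio consumed per step.
            played += 1.0
            if played > buffered:  # underrun: this run fails
                failures += 1
                break
    return failures <= max_failures
```

The nice property of this scheme is that the start-up decision is probabilistic rather than a fixed buffer threshold: a fast connection can start playback with a small buffer, while a flaky one is made to wait.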
I summarized the experimental results in this document, though I recommend reading the paper itself. For now I’m just left with the feeling that, since both authors of the Spotify paper are from KTH, I would definitely not mind working with them when I move to Stockholm next semester.
All presentations and reviews so far can be found in this folder.