Explorify

Web

An elegant way to make sense of your Spotify extended streaming history

Explorify lets you answer (almost) any question you want about your lifetime listening history. What songs do you listen to the most? What month of the year do you listen to the most music? What artists do you skip the highest percent of the time? All 100% local, so your data never leaves your device.

Gallery

The Story

I perceive time through music. I organize my memories around what music I was listening to at the time. When I learned Spotify lets you access a file of every one of your lifetime streams, it was like Christmas morning.

When I received a JSON file so big that formatting it crashed my text editor, I realized there might be a little more work involved. If it was too big for my editor, it was too big for me to understand anything by reading the raw file. I wrote some aggregation scripts, but I got tired of remembering exactly which lines to uncomment when I wanted to see my top songs vs. my top artists. So I envisioned an app: one that let you play around with lots of different filters (plus grouping, sorting, etc.) and then save sets of those filters for future use.

The first draft of the design focused almost exclusively on the filtering UI.

The initial Figma design of Explorify

It was way too complicated. I added explanations and reorganized the filters, but nothing worked. The learning curve was too steep. The breakthrough finally came from a part of the app I'd neglected: the "saved presets" (called "views" in the final product). Instead of having the home page be the filters, it would be a grid showing all the views plus a preview of their contents, immediately showing a new user interesting information about their listening.

The first Figma design of the new homepage

Each view would get its own page, and the filtering UI would only be shown when customizing a view, gradually introducing the complexity. It seemed right, so I started working on the website.

Building the first draft of the site didn't take all that long. A month in, I texted my friend and beta-tester Tristan, "it’s pretty close to release". Of course, I was wrong. From rewriting the entire filtering algorithm to the nightmare that was adding a percent view, the classic 80/20 rule reared its ugly head. Here are some highlights of the challenges I faced along the way.

Sharing views

I wanted a way to share if you'd come up with a cool set of filters. You'd send a link like https://explorify.link/a1b2c3 to another user, and it'd load that view on their computer. Since the app never stores any data on servers, I needed an algorithm that could encode every possible permutation of filters as a URL (and vice versa).

The initial idea was (pretty) simple: for every individual filter, give it a binary number based on its current state. Then concatenate those binary segments to get one number representing all the filters in the view, and convert it to a base-64 hash to shorten it. This diagram might help illustrate the process:

When it's time to parse a URL, the hash gets converted back into binary, then that number is split into segments (at the same points as concatenation). Each binary segment is used to assign a value back to its filter, and we've completed the conversion back.
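The round trip described above can be sketched in a few lines of Python. This is a minimal illustration, not Explorify's actual code: the filter names and bit widths in `LAYOUT` are made up for the example, and the URL-safe base-64 alphabet is an assumption.

```python
import base64

# Hypothetical filter layout: (filter name, number of bits it needs).
# These names and widths are assumptions, not Explorify's real schema.
LAYOUT = [("rank_by", 2), ("min_plays", 10), ("skipped_only", 1), ("weekday", 3)]
TOTAL_BITS = sum(width for _, width in LAYOUT)
N_BYTES = (TOTAL_BITS + 7) // 8


def encode(values: dict[str, int]) -> str:
    """Concatenate each filter's bits into one number, then base-64 it."""
    number = 0
    for name, width in LAYOUT:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        number = (number << width) | value
    raw = number.to_bytes(N_BYTES, "big")
    # URL-safe alphabet, with the "=" padding stripped to keep links short.
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def decode(token: str) -> dict[str, int]:
    """Reverse of encode: split the number at the same segment boundaries."""
    padded = token + "=" * (-len(token) % 4)
    number = int.from_bytes(base64.urlsafe_b64decode(padded), "big")
    values = {}
    # The last filter written sits in the lowest bits, so peel off in reverse.
    for name, width in reversed(LAYOUT):
        values[name] = number & ((1 << width) - 1)
        number >>= width
    return values


view = {"rank_by": 2, "min_plays": 500, "skipped_only": 1, "weekday": 6}
token = encode(view)
restored = decode(token)
```

Because every segment has a fixed width, the decoder needs no separators: it only has to know the same layout the encoder used.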

Some filters were more complicated than that (I'm still not sure encoding the group sort order isn't some form of witchcraft),¹ but the resulting algorithm is pretty cool. It's even used to encode the views stored on your device so they take up less space.

The two searches

Taking a break from the technical stuff, I also had to solve some tricky UX challenges. One of the biggest was something I ended up calling the "two search problem". If a user had their songs ranked by plays, and searched up "Taylor Swift", did they want to see their 1st, 2nd, 3rd, etc. most played Taylor Swift songs? Or did they want to see that their 2nd, 5th, and 9th most played songs were by Taylor Swift?

As you might be able to tell, this was a hard problem to even explain to users. The Yes/No switch for the "Rerank items on search" filter (my original solution) was easily the thing my beta testers were most confused about.

A seemingly separate problem I was facing was the fact that ⌘F was broken. Hundreds of thousands of song names is too much for a browser to show at once, so I added a way to only show the items currently on screen based on the scroll position. Unfortunately, that meant that anything off-screen was lost to ⌘F. It struck me that if I made a "Find in page" replacement, it would also solve the two search problem. Searching in the filters could always rerank to 1st-2nd-3rd, and the find in page replacement (now called "Jump to") could show the original ranks (e.g. 2nd-5th-9th).

The "Jump to" UI
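The difference between the two searches comes down to whether you filter before or after ranking. Here is a small sketch of that idea in Python. The `Song` type, the function names, and the sample data are all hypothetical and only meant to show the two behaviors side by side.

```python
from dataclasses import dataclass


@dataclass
class Song:
    title: str
    artist: str
    plays: int


def ranked(songs: list[Song]) -> list[tuple[int, Song]]:
    """Sort by plays (highest first) and attach 1-based ranks."""
    ordered = sorted(songs, key=lambda s: s.plays, reverse=True)
    return [(i + 1, song) for i, song in enumerate(ordered)]


def filter_search(songs: list[Song], query: str) -> list[tuple[int, Song]]:
    """Filter first, then rank: matches are reranked 1st, 2nd, 3rd, ..."""
    matches = [s for s in songs if query.lower() in s.artist.lower()]
    return ranked(matches)


def jump_to(songs: list[Song], query: str) -> list[tuple[int, Song]]:
    """Rank first, then find: matches keep their original ranks."""
    return [(rank, s) for rank, s in ranked(songs)
            if query.lower() in s.artist.lower()]


# Made-up sample data for illustration only.
songs = [
    Song("Song 1", "Artist A", 50),
    Song("Song 2", "Artist B", 40),
    Song("Song 3", "Artist A", 30),
    Song("Song 4", "Artist B", 20),
    Song("Song 5", "Artist A", 10),
]

filter_ranks = [rank for rank, _ in filter_search(songs, "Artist B")]
jump_ranks = [rank for rank, _ in jump_to(songs, "Artist B")]
```

Both functions return the same songs. Only the rank numbers attached to them differ, which is why splitting them into two separate pieces of UI removes the need for a Yes/No switch.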

¹ For those curious, the witchcraft works like this: all 11 possible sort criteria (Plays, Day of week, Artist name, etc.) have an index 0-10 assigned to them that determines their place in the sort priority. The naive storage mechanism would be to store each index back-to-back, which requires 11 criteria * 4 bits = 44 bits of storage space. That's inefficient, because it's enough to store any 11 four-bit numbers (could be 11 thirteens, for example), but we only need to store 11 unique numbers.

So we turn to a different algorithm. We assign each criterion a letter (Plays = a, Day of week = b, Artist name = c, etc.), and arrange the letters into a string based on their place in the sort order. So an order of (Plays tiebroken by Artist name tiebroken by Day of week tiebroken by ...) would result in a string of acb.... Finally, we take a list of every possible permutation of abcdefghijk in alphabetical order, and the index of our string within that list is the number of our segment.

Using this technique, we only need to store 11! possibilities (the actual number of possible sort order permutations), which cuts the sort order storage to 26 bits, a marked improvement over the naive 44 bits.
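For anyone who wants to try this, here is a hedged Python sketch. It does not build the full alphabetical list of 11! permutations. Instead it computes the same index directly by counting how many permutations come before the given string (the factorial number system). That is a swapped-in shortcut for illustration, and the function names are my own.

```python
from math import factorial

LETTERS = "abcdefghijk"  # one letter per sort criterion, as described above


def permutation_index(order: str) -> int:
    """Index of `order` in the alphabetical list of all permutations."""
    remaining = sorted(LETTERS)
    index = 0
    for letter in order:
        smaller = remaining.index(letter)  # unused letters that sort earlier
        # Each earlier letter accounts for a full block of permutations.
        index += smaller * factorial(len(remaining) - 1)
        remaining.pop(smaller)
    return index


def permutation_from_index(index: int) -> str:
    """Inverse of permutation_index: rebuild the string from its index."""
    remaining = sorted(LETTERS)
    out = []
    for slots_left in range(len(LETTERS), 0, -1):
        pick, index = divmod(index, factorial(slots_left - 1))
        out.append(remaining.pop(pick))
    return "".join(out)


example = "acbdefghijk"  # Plays, then Artist name, then Day of week, ...
example_index = permutation_index(example)

naive_bits = len(LETTERS) * 4                             # 44
packed_bits = (factorial(len(LETTERS)) - 1).bit_length()  # 26
```

The largest possible index is 11! - 1 = 39,916,799, which sits between 2^25 and 2^26, so 26 bits are enough to hold any sort order.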