Retrieving multiple historical copies of an iTunes file from Time Machine

In order to build a full history of all songs I've played in iTunes over time, I must first retrieve relevant copies of my iTunes Library files. I do not want to do this manually as this would take hours. Instead I will use Python to locate and restore previous versions.

I know that I can use the command line to retrieve a list of all backups stored in Time Machine using the tmutil listbackups command. This gives me output similar to:

$ tmutil listbackups
/Volumes/Time Machine Backups 1/Backups.backupdb/mymac/2013-06-04-063902
/Volumes/Time Machine Backups 1/Backups.backupdb/mymac/2013-06-11-053216
/Volumes/Time Machine Backups 1/Backups.backupdb/mymac/2013-06-18-153848
/Volumes/Time Machine Backups 1/Backups.backupdb/mymac/2013-06-26-095305
/Volumes/Time Machine Backups 1/Backups.backupdb/mymac/2013-07-03-105231
/Volumes/Time Machine Backups 1/Backups.backupdb/mymac/2013-07-12-115817
...

I also know that within each backup, the file I want will be located at ./mymac/Users/myuser/Music/iTunes/iTunes Library.itl

This is enough to build the complete list of copy commands. For each directory in the output of the tmutil command, I want to copy the file at the specified location and put the result (renamed to include the date) in my Previous iTunes Libraries directory (who would ever have guessed this directory might be useful someday?).

I will start with some imports to make sure I have all the packages I will need for this to work:

import datetime, getpass, os, shutil, subprocess

Next, I like to specify strings as constants early on in the script. This way I can easily change them later if I want to re-use the script for a similar purpose. For this script the constants are the username, the command to execute, the date formats used for parsing input and controlling output later, the path within each backup to the desired file, and the destination directory:

username = getpass.getuser()

list_backups = [ "/usr/bin/tmutil", "listbackups" ]

short_date_format = "%Y-%m-%d"
date_format = "{}-%H%M%S".format(short_date_format)

file_of_interest_path_format = "{}/{}/Users/{}/Music/iTunes/iTunes Library.itl"
dest_directory = "/Users/{}/Music/iTunes/Previous iTunes Libraries".format(username)
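To check that the two date formats behave as intended, here is a minimal sketch that parses one timestamp and prints it back in the short form. The sample string is just the last path component of one of the tmutil entries shown earlier; nothing here touches Time Machine.

```python
import datetime

short_date_format = "%Y-%m-%d"
date_format = "{}-%H%M%S".format(short_date_format)

# Sample value: the final path component of one backup entry.
sample = "2013-06-04-063902"

# Parse the full timestamp, then render only the date part.
parsed = datetime.datetime.strptime(sample, date_format)
short = parsed.strftime(short_date_format)

print(parsed)
print(short)
```

The long format is used for reading backup directory names, and the short format is what ends up in the copied filenames.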

Now the real processing begins by retrieving the list of all backups and turning it into a native Python list. Note that the command returns a single block of text with linefeeds, so we must split it into lines. Using splitlines() rather than split('\n') avoids a blank final element, and decoding first keeps this working on Python 3, where check_output returns bytes:

result = subprocess.check_output(list_backups).decode("utf-8")
backups = result.splitlines()

I could simply use this list as-is but this would actually retrieve more entries than I care about. All the old entries are relevant, but newer entries are stored daily (within the last month) and hourly (within the last 24 hours). I only need a weekly copy as I don't listen to songs more than once every two weeks (I have a smart playlist that ensures anything in "Recently Played" is not in the list). I also happen to have some gaps in my history where I was traveling and did not have access to Time Machine, so I want to make sure I minimize gaps between backups.

I am using the following logic to decide which backups to retrieve as I iterate over the list of backups:

  1. if the last backup I kept is more than 6 days older than this one, keep this one
  2. if it is more than 9 days older, also keep the backup immediately before this one (even though it wouldn't otherwise have been kept), then keep this one
  3. ... otherwise, do not keep this one

In this loop I am also building a map of backup timestamps to paths in Time Machine for quick lookups later.

Here's the loop:

data = {}          # maps each backup's datetime to its Time Machine path
to_retrieve = []   # datetimes of the backups worth copying
# Sentinel older than any real backup, so the first backup is always kept.
last_seen = datetime.datetime.strptime("1984-01-24", short_date_format)
last_to_retrieve = last_seen
for backup in backups:
    date_raw = backup.split("/")[-1]
    date = datetime.datetime.strptime(date_raw, date_format)
    data[date] = backup
    delta = date - last_to_retrieve
    print("{} (delta={})".format(date_raw, delta.days))
    if delta.days > 6:
        # After a long gap, also keep the backup just before this one.
        if delta.days > 9 and last_seen != last_to_retrieve:
            to_retrieve.append(last_seen)
        to_retrieve.append(date)
        last_to_retrieve = date
    last_seen = date
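The selection rules are easier to check when separated from the Time Machine paths. Below is a rough sketch of the same logic as a standalone function run against made-up dates. The function name select_backups and its parameters keep_after_days and gap_days are my own inventions for illustration, and it uses None instead of the 1984 sentinel date, which should behave the same way for the first backup.

```python
import datetime

def select_backups(dates, keep_after_days=6, gap_days=9):
    """Return the datetimes to retrieve from a chronologically sorted list."""
    selected = []
    last_seen = None   # the backup examined just before the current one
    last_kept = None   # the most recent backup added to selected
    for date in dates:
        if last_kept is None:
            # The very first backup is always kept.
            selected.append(date)
            last_kept = date
        else:
            delta_days = (date - last_kept).days
            if delta_days > keep_after_days:
                # Long gap: also keep the backup just before this one.
                if delta_days > gap_days and last_seen != last_kept:
                    selected.append(last_seen)
                selected.append(date)
                last_kept = date
        last_seen = date
    return selected

# Made-up June 2013 backups, one per listed day of the month.
days = [1, 3, 5, 9, 10, 12, 20]
sample_dates = [datetime.datetime(2013, 6, d) for d in days]
chosen = select_backups(sample_dates)
print([d.day for d in chosen])
```

With these sample dates, June 9 is kept because it is 8 days after June 1. June 20 is 11 days after June 9, so both June 12 (the backup just before it) and June 20 are kept.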

Now data contains the map of datetime objects to pathnames and to_retrieve contains a list of the backups I want to retrieve.

The next step actually retrieves the desired backups. You could just iterate over the list and grab them, but I have one additional complication: I changed the name of my computer a while back. This affects the path within the backup that I need to use to find the right file.

The overall path looks like this:

/Volumes/<...>/mymac/<date string>/<computer name>/Users/<username>/<...>

I will store the name of the computer from the path given in the results from the tmutil command ('mymac' above) and use it if a directory with that name exists in each backup. If it doesn't exist, then I will list the contents of the backup's date directory and grab the first (only) name that doesn't start with '.'; this will be the no-longer-used computer name.

Here's the loop:

expected_machine_name = backups[0].split('/')[-2]
for s in to_retrieve:
    path = data[s]
    machine_names = os.listdir(path)
    if expected_machine_name in machine_names:
        machine_name = expected_machine_name
    else:
        # Fall back to the first entry that is not a hidden dot-file.
        machine_name = [x for x in machine_names if not x.startswith('.')][0]
    src_filename = file_of_interest_path_format.format(path, machine_name, username)
    dest_filename = "{}/iTunes Library {}.itl".format(dest_directory, s.strftime(short_date_format))
    print("copy from {} to {}".format(src_filename, dest_filename))
    shutil.copyfile(src_filename, dest_filename)
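The machine-name fallback can be tried out without a Time Machine volume. This is a small sketch that builds a throwaway directory tree and runs the same lookup against it. The helper name resolve_machine_name and the directory names 'oldmac' and '.hidden' are hypothetical and only there for the demonstration.

```python
import os
import shutil
import tempfile

def resolve_machine_name(backup_path, expected_name):
    """Pick the machine directory inside one backup's date directory."""
    names = os.listdir(backup_path)
    if expected_name in names:
        return expected_name
    # Otherwise take the first entry that is not a hidden dot-file.
    visible = sorted(n for n in names if not n.startswith("."))
    if not visible:
        raise ValueError("no machine directory in {}".format(backup_path))
    return visible[0]

# Fake backups: one with the current name, one with a made-up old name.
root = tempfile.mkdtemp()
new_style = os.path.join(root, "2013-07-03-105231")
old_style = os.path.join(root, "2013-06-04-063902")
os.makedirs(os.path.join(new_style, "mymac"))
os.makedirs(os.path.join(old_style, "oldmac"))
os.makedirs(os.path.join(old_style, ".hidden"))

found_new = resolve_machine_name(new_style, "mymac")
found_old = resolve_machine_name(old_style, "mymac")
print("{} {}".format(found_new, found_old))

# Remove the throwaway tree again.
shutil.rmtree(root)
```

Raising an error when no visible directory exists is a choice I made for the sketch; the loop above would instead fail with an IndexError in that situation.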

This program now retrieves all the desired copies of the .itl file and places them in Previous iTunes Libraries. Each one has been renamed to include the date of the backup in the filename.