Is there an interface for automated systems to access comics and metadata?
Yes. You can get comics through the JSON interface, at URLs like https://xkcd.com/info.0.json (current comic) and https://xkcd.com/614/info.0.json (comic #614).
It also says:
xkcd.com updates without fail every Monday, Wednesday and Friday.
It’s always seemed so to me, but I felt like checking.
First, let’s get the data. Friday’s comic was number 2799, and we can fetch the JSON data for it with curl and format it with jq:
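For anyone who would rather stay in Python than reach for curl and jq, here is a minimal sketch of the same fetch-and-format step. It assumes only the standard library; the names `comic_url`, `fetch_comic` and `pretty` are made up for illustration and are not part of any xkcd API.

```python
import json
import urllib.request


def comic_url(num=None):
    """Return the JSON URL for comic `num`, or for the current comic if None."""
    if num is None:
        return "https://xkcd.com/info.0.json"
    return f"https://xkcd.com/{num}/info.0.json"


def fetch_comic(num=None):
    """Download and decode one comic's metadata (needs network access)."""
    with urllib.request.urlopen(comic_url(num)) as response:
        return json.load(response)


def pretty(comic):
    """Indent the decoded JSON, roughly what piping through `jq .` does."""
    return json.dumps(comic, indent=2)
```

With a network connection, `print(pretty(fetch_comic(2799)))` should print the metadata for comic 2799 as indented JSON.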
jq is built for picking information out of JSON, like so:
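A rough Python counterpart to picking fields out with jq, as a sketch only. It assumes the comic has already been decoded into a dict and that the number and the date parts are the fields of interest. The keys `num`, `year`, `month` and `day` are the ones the JSON files use; `pick_fields` is a hypothetical helper name.

```python
def pick_fields(comic):
    """Keep only the comic number and the date parts of one comic's metadata.

    The date parts arrive as strings in the JSON, so they are converted
    to integers here along with the number.
    """
    return {
        "num": int(comic["num"]),
        "year": int(comic["year"]),
        "month": int(comic["month"]),
        "day": int(comic["day"]),
    }
```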
Better yet, we can format it as CSV:
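The CSV step can be sketched in Python with the standard `csv` module. The column choice (`num`, `year`, `month`, `day`) is an assumption about which fields are worth keeping, and `to_csv_line` is a hypothetical helper.

```python
import csv
import io

FIELDS = ("num", "year", "month", "day")  # assumed columns, in output order


def to_csv_line(comic):
    """Format the chosen fields of one decoded comic as a single CSV line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([comic[key] for key in FIELDS])
    return buffer.getvalue()
```

Letting `csv.writer` build the line, rather than joining with commas by hand, means any field that happens to contain a comma or a quote is escaped properly.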
The comics are numbered consecutively from 1 to 2799 (apart from 404, which deliberately doesn’t exist), so we can fetch them with a looping shell script that gets each JSON file, picks out the data, and appends it to a file:
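Here is the same loop as a Python sketch, under a few assumptions: the kept columns are `num`, `year`, `month` and `day`, the output file name is up to the caller, and any number the server refuses is simply skipped (there is no comic 404, for instance). The names `fetch_comic` and `download_range` are hypothetical.

```python
import csv
import json
import sys
import urllib.error
import urllib.request

FIELDS = ("num", "year", "month", "day")  # assumed columns, not confirmed


def fetch_comic(num):
    """Download and decode the metadata for one comic (needs network access)."""
    with urllib.request.urlopen(f"https://xkcd.com/{num}/info.0.json") as response:
        return json.load(response)


def download_range(first, last, out_path, fetch=fetch_comic):
    """Append one CSV row per comic numbered first..last (inclusive) to out_path."""
    with open(out_path, "a", newline="") as handle:
        writer = csv.writer(handle, lineterminator="\n")
        for num in range(first, last + 1):
            # "\r" rewrites the same terminal line, so the counter stays tidy.
            print(f"\r{num}", end="", file=sys.stderr, flush=True)
            try:
                comic = fetch(num)
            except urllib.error.HTTPError:
                continue  # no JSON served for this number: skip it
            writer.writerow([comic[key] for key in FIELDS])
    print(file=sys.stderr)  # finish the progress line
```

Calling `download_range(1, 2799, "xkcd.csv")` would make one request per comic, so it takes a while. The `fetch` parameter is there so the loop can be tried out with a stand-in function before hitting the real site.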
(The echo parameters show the number of the file being downloaded in a tidy way.)
When that’s done, we can fire up R. (Of course, we could have fetched the data in R, but a shell script is faster and easier for me.) Load the tidyverse packages, then read the data and turn the raw date information into something more useful:
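The same reshaping can be sketched in Python rather than R. This assumes the file holds header-less rows of `num,year,month,day`; `parse_rows` and the derived column names other than `day_of_year` are hypothetical.

```python
import csv
from datetime import date


def parse_rows(lines):
    """Turn CSV lines of num,year,month,day into dicts with a real date."""
    comics = []
    for num, year, month, day in csv.reader(lines):
        published = date(int(year), int(month), int(day))
        comics.append({
            "num": int(num),
            "date": published,
            "weekday": published.weekday(),        # Monday is 0, Sunday is 6
            "week": published.isocalendar()[1],    # ISO week number
            "day_of_year": published.timetuple().tm_yday,
        })
    return comics
```

A typical call would be `with open("xkcd.csv", newline="") as handle: comics = parse_rows(handle)`, where the file name is whatever the download step wrote to.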
I don’t use day_of_year but I’ll leave it in there just in case.
The first 44 comics are dated 2006-01-01, but we’ll ignore that. Try a quick chart (the image is a link that will show the image on its own, probably much larger):
Looks very regular overall, with a few odd weeks here and there. Let’s tweak a few things:
The days of the week aren’t plotting nicely, so we need to make the week start on Monday. Then fiddle a few more options to make a more finished chart.
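The chart itself needs a plotting library, but the Monday-first ordering is easy to illustrate with a plain tally. This is a minimal Python sketch, assuming the dates are already `datetime.date` objects; `weekday_tally` is a hypothetical helper. It leans on `date.weekday()`, which numbers Monday as 0 and Sunday as 6.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def weekday_tally(dates):
    """Count comics per weekday, returned in Monday-to-Sunday order."""
    counts = Counter(d.weekday() for d in dates)
    # Counter returns 0 for missing keys, so quiet days still get a row.
    return [(WEEKDAYS[i], counts[i]) for i in range(7)]
```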
At a glance it’s easy to see that (setting aside early 2006) Munroe overwhelmingly does post every Monday, Wednesday and Friday. There have been four weeks where he also posted on Tuesday and Thursday. It looks like there have been a number of weeks where he posted Wednesday’s comic on Tuesday. Maybe that’s a time zone thing. How many weeks have there been where he didn’t post three comics?
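One way to answer that question in Python, as a sketch. It assumes weeks are labelled by calendar year plus ISO week number, which is only a guess at a comparable numbering, so the labels may not match the ones from R exactly. The function names are hypothetical.

```python
from collections import Counter
from datetime import date


def comics_per_week(dates):
    """Count comics per (calendar year, ISO week number) pair.

    Mixing the calendar year with the ISO week number misfiles days near
    1 January, so weeks at the edges of a year can show odd counts.
    """
    return Counter((d.year, d.isocalendar()[1]) for d in dates)


def weeks_without_three(dates, expected=3):
    """List ((year, week), count) for weeks whose count differs from `expected`.

    Weeks with no comics at all never appear in the counter, so they are
    not reported here.
    """
    counts = comics_per_week(dates)
    return sorted((key, n) for key, n in counts.items() if n != expected)
```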
Look at all those weeks numbered 01 or 53: those are partial weeks at the beginning or end of a year, so ignore them. (To be sure about the counts, I should handle these weeks specially and sum across the entire week containing 01 January, but I can’t be bothered right now. By eye it looks right.)
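For what it’s worth, one way to sum across the whole week containing 01 January is to group by ISO year and ISO week together, so a week that straddles New Year is counted once. This is a Python sketch of that idea, not what the analysis here does; the function names are hypothetical.

```python
from collections import Counter
from datetime import date


def comics_per_iso_week(dates):
    """Count comics per (ISO year, ISO week) pair.

    ISO weeks always run Monday to Sunday and belong to exactly one ISO
    year, so there are no partial weeks at the turn of the year.
    """
    return Counter(tuple(d.isocalendar()[:2]) for d in dates)


def off_schedule_iso_weeks(dates, expected=3):
    """List ((iso_year, iso_week), count) where the count is not `expected`."""
    counts = comics_per_iso_week(dates)
    return sorted((key, n) for key, n in counts.items() if n != expected)
```

The very first and very last weeks in the data can still come out short, simply because the data starts or stops partway through them.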
So a week without three comics happened once each in 2006, 2012, 2016 and 2018. That’s dedication!
(I could have used the R package xkcd to make the charts look like an xkcd comic, but it requires installing a special font, and I couldn’t be bothered to do that either.)
Many thanks to Randall Munroe for providing the JSON files as well as his great comic.