Casual Gaming Revolution posted a great list of all the major American gaming conventions happening in 2016. You should go check out their original post here. Looking at the spreadsheet, I had trouble parsing out some of the information I wanted to glean — which cons were closest to me? Which ones would dovetail best with my schedule this year? Could I take a list of the websites and try and run it through some of the data gathering tools leftover from my failed startup to find more information than what’s in the list?
Answering these questions was tough, because the data were spread across several sheets tabs, one for each month, plus a summary sheet. I can understand why they did it — the most common question is probably “Hey I wonder what’s going on in June, anything interesting?”, and I guess I could see this format being slightly easier for the public to add new items to the list (slightly). As I was digging, I felt like this issue was something akin to a premise I’ve held for a long time “be strict when sending, and tolerant when receiving”, in this case, having a rigorous, complete dataset, and presenting 12 separate ‘Views’ into that dataset would accomplish the same job of showing Cons by month, while simultaneously open the full data set up to more interesting queries and manipulation.
I spent an hour or so copying/aggregating/refactoring a copy of the original spreadsheet, and then another few minutes cleaning up some of the data that didn’t fit exactly fit my new ‘schema’.
Here’s the refactored spreadsheet.
After that, I was off to the races. I popped over to FusionTables, an oft-overlooked but quite awesome sidealong product to Google Drive Sheets. I imported the sheet from Drive, and plotted the new aggregate “Location” column to a map, which gave a really cool view:
Discoving immediately only two cons in my home state of Colorado :(. The fusion table can do more cool shit as well; The map can be filtered by the start date of the convention, so if you know you’ve got some vacation coming up, and want to see what’s close and convenient timing-wise, you can do that.
Moral of the story is, when you’re organizing data (and this includes building a card or component list for a prototype), the most important thing is to consider what’s going to provide the most flexibility in answering questions. Most likely, that’s going to be served best by a full data set, with each individual data element stored in their most granular form, and then fields aggregated together via functions, and records split by criteria, rather than trying to merge post-data entry.
I reached out to Casual Gaming Revolution and I’m currently working with them on a way to get these refactors merged back upstream into their primary document. Look for an update there real soon!