Category Archives: modelling & sci/tech

From Data Science & Transportation to Software

As of early September, I’ve shifted to a new job in software development / 3D graphics with Mental Canvas. This comes after eight years as a data scientist / transport modeller at Metrolinx. I’m very excited to take on this new position, and I’m particularly keen to shift to a job where the product is working software instead of analysis and advice.

While the transportation modellers I’ve talked to understand the close relationship between software and modelling, for others in transportation this seems like a drastic shift in direction. It’s not; the two jobs have quite a bit in common, and I’d like to explain the shared ground. To me, there are three basic stories here:

  • Similar Data: both transportation and graphics have a lot of similarly structured data. Both jobs involve a lot of time organizing, structuring and automating the flow of data.
  • Data Science & Software: a software person would use the term “data science” to describe transport modelling; it’s a hybrid of software and statistics, applied to the problems in the transportation field. Data Science and Software Development are closely related career paths.
  • Software and Transportation are Converging: or perhaps more boldly, software is in the early stages of disrupting the transportation sector.

Continue reading From Data Science & Transportation to Software

Communicating Visually, Using Data

I’ve long been a fan of data visualization, dating back to my days in the Imager lab at UBC, which had a research area in that subject. I’ve realized that my approach to “telling stories visually with data” includes a lot of knowledge that isn’t common in the transportation world, and I decided to share what I know.

Drawing from the Internet, here’s a basic collection of content that gives a good introduction to how to communicate visually, using data. It’s even more compelling if you have my running commentary alongside… batteries not included.

  1. Telling Compelling Stories with Numbers
  2. Graphical Integrity
    • Edward Tufte slides (selected slides; original here)
  3. The Human Visual System
  4. Colour
  5. Tabular data
  6. Maps
  7. Closing Thoughts & References
    • Examples
    • Tufte, “Graphical Displays Should…” & Pantoliano
    • Tufte, “Principles of Graphical Excellence”
    • Kelleher & Wagener – Ten Guidelines

I haven’t covered one area in here: the basic principles of graphic design. But these are more widely known and can be learned in normal courses.

Scrums outside the Software world

The agile process (or “scrum”) is very successful in the software world, but little known outside. My team has been doing scrums at a transit agency for 3-4 years, and I get asked about it regularly. I’ve assembled a few links that are useful for explaining it without being too jargony or software-specific.

I’ll talk more about:

  1. Visual Management
  2. Agile Product Development
  3. Scrum Process

Continue reading Scrums outside the Software world

Toronto transit map updated

A very quick update – I’ve finally updated my Toronto transit map to use more recent map information. As of March 2016, the maps now show data from roughly Dec. 2015 for all GTHA transit systems. (I hadn’t had time to update them since originally building the map in 2011.)

The main visible change is that the TTC map is now simpler and shows route frequency with the thickness of the lines. Unfortunately, I may have difficulty updating the TTC map going into the future – as of 2016, their map is now more “conceptual” and is not geographically accurate; I can’t readily warp it to sit on a geographically-accurate map.

Federal Transit Administration (FTA) forecasting workshops

Many years ago, I found an excellent resource for transit modelling: slides from a series of 2006-2009 workshops held by the US Federal Transit Administration (FTA) advising agencies applying for federal funding for rapid transit construction under the “New Starts” funding program. It’s very deeply buried on their website, and since then I’ve seen very few people reference this material, nor have I seen it assembled into a formal report.

So, for those interested – I’ve pulled together an easier-to-use table of contents to the three separate workshops, and tried to “deep link” into them to make it easier to browse and find the material. Enjoy!

UPDATE April 2016 – FTA has reorganized their website and the reports are no longer available there. I’ve mirrored everything here on my website.
Continue reading Federal Transit Administration (FTA) forecasting workshops

Consistency in Time Weights

Modellers all learn about the different components of transit trip travel time, and the “perceived” weights that people put on them. It’s a useful insight into how transit works, and I find it’s a great exercise for testing how “useful” a new transit service is. The trouble is, after learning about weights, everyone wants to customize them – for their economic analysis, for one component of their model, etc.  And analysis quickly gets inconsistent. Here’s why I think that’s often a bad idea – and why I think the weights used in transit assignment should be applied, unchanged, for all other parts of analysis.  (And it’s not just me – the US Federal Transit Administration made this exact point in a 2006 discussion.)
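
To make the consistency problem concrete, here is a minimal Python sketch of how perceived weights turn the components of one transit trip into a single weighted time. The component names and all weight values are illustrative assumptions, not figures from any particular model or from the FTA discussion.

```python
# Illustrative perceived-time weights used in transit assignment.
# These values are assumptions for the example only.
ASSIGNMENT_WEIGHTS = {"in_vehicle": 1.0, "walk": 2.0, "wait": 2.0}

# A second, customized set of weights of the kind that creeps into
# economic analysis (also assumed values).
ECONOMIC_WEIGHTS = {"in_vehicle": 1.0, "walk": 1.5, "wait": 2.5}


def perceived_minutes(components, weights):
    """Sum each trip component's minutes multiplied by its perceived weight."""
    return sum(weights[name] * minutes for name, minutes in components.items())


# One hypothetical trip: 20 min riding, 5 min walking, 4 min waiting.
trip = {"in_vehicle": 20.0, "walk": 5.0, "wait": 4.0}

assignment_time = perceived_minutes(trip, ASSIGNMENT_WEIGHTS)  # 20 + 10 + 8
economic_time = perceived_minutes(trip, ECONOMIC_WEIGHTS)      # 20 + 7.5 + 10

# The same trip is valued differently depending on which stage looks at it.
print(assignment_time, economic_time)
```

With these assumed numbers the assignment step sees a 38-minute trip while the economic step sees 37.5 minutes, so the two parts of the analysis disagree about how good the same service is.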

The scenario

Suppose that we have a four-stage model with different transit time weights:

Continue reading Consistency in Time Weights

Transit Map Mashup (Tech Talk)

A few years ago, I built a Google Maps app that combined the maps from several Toronto transit agencies all in one mashup map. I never got around to discussing the technical issues associated with that effort, and thought it might be worth writing up. This is an extra-technical post, covering the GIS / raster graphics / GDAL programming techniques I used to make the mashup work, for anyone else interested in trying a similar exercise.

Continue reading Transit Map Mashup (Tech Talk)

Represent Data

I recently received an NGO request for the House of Commons members’ contact information in Comma Separated Value (CSV) format. I quickly found the helpful Represent website by Open North. This small group has scraped government websites and created machine-readable versions of MP contact information, plus a minimal maps interface layered on top.

However, the Represent effort leans a little too “techno-elite.” The results are provided in the JSON format, a hyper-current, versatile and Web-buzzword-compliant format. But most NGOs I know are still struggling to figure out Excel, never mind anything more recent; and Excel can’t handle much more than CSV files (if even that – modern web-friendly text encodings like UTF-8 don’t even work properly).

So, I put together a quick bit of JavaScript to convert the Represent JSON format to CSV format, ready for Excel to use. This should be “live” – pulling the latest version from Represent each time. Here are links to a few of the Represent datasets:

Federal

Provincial
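
For anyone who wants to try the same kind of conversion offline, here is a minimal sketch of the JSON-to-CSV idea. It is written in Python rather than the JavaScript used for the live converter, and it works on an inline sample instead of fetching anything. The sample records, the top-level "objects" key, and the field names are all assumptions made for illustration; check the real feed before relying on them.

```python
import csv
import io
import json

# Hypothetical sample shaped like a list of representatives. The "objects"
# key and the field names are assumptions for this example.
SAMPLE_JSON = """
{"objects": [
  {"name": "Jane Doe", "district_name": "Example North", "email": "jane@example.org"},
  {"name": "Jean Tremblay", "district_name": "Exemple, Québec", "email": "jean@example.org"}
]}
"""


def json_to_csv(text, fields):
    """Convert a JSON document holding an "objects" list into CSV text."""
    records = json.loads(text)["objects"]
    out = io.StringIO(newline="")
    writer = csv.DictWriter(out, fieldnames=fields)
    writer.writeheader()
    for record in records:
        # Missing keys become empty cells rather than raising an error.
        writer.writerow({field: record.get(field, "") for field in fields})
    return out.getvalue()


csv_text = json_to_csv(SAMPLE_JSON, ["name", "district_name", "email"])

# Encoding with "utf-8-sig" prepends a byte-order mark, which helps Excel
# recognise the file as UTF-8 so accented names display correctly.
csv_bytes = csv_text.encode("utf-8-sig")
print(csv_text)
```

The `csv` module takes care of quoting, so a district name containing a comma stays in a single cell when Excel opens the file.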

Advances in Population Synthesis, the journal article

I’ve finally published my M.A.Sc. thesis as a journal article, under the title Advances in Population Synthesis: fitting many attributes per agent and fitting to household and person margins simultaneously.

This article is the preferred citation going forward; I think it tells the story best:

  • A brief summary of the key contributions described in detail in my thesis
  • A better explanation of the U.S. context and the applicability of this work outside Canada. Statistics Canada goes to great lengths to protect Canadian privacy, and some of my work was motivated by the particular difficulties associated with Canadian census data.

My thesis is still a good source for anyone wanting greater detail, or anyone interested in a clear explanation of some of the Canadian data sources I used.

Continue reading Advances in Population Synthesis, the journal article

Greater Toronto Area transit map

Many years ago, when Google first released Google Maps and revolutionized online mapping from the stagnant MapQuest era, I put together a few quick demos showing the Vancouver and Toronto transit maps. I’ve made a few updates over the years since then, but not much more.  The Vancouver one is still quite popular – more popular than TransLink’s own map, to be honest – but other web gurus made better Toronto maps, such as the excellent one by Ian Stevens.

I’ve noticed that Google has revamped the mapping APIs and is preparing to eliminate version 2.0. The whole treatment of online mapping is changing rapidly, as the mobile market takes off. I was thinking of just scrapping the Toronto map since it’s not well-used – but then I thought a little further. What if I could make a proper map of the Greater Toronto Area? Ian’s map doesn’t cover that – in fact, there isn’t even a good print product covering the full area. Perhaps I could make something useful for the “regional traveller” using GO, and also help mobile users who have trouble with Ian’s site.

I set to work, borrowing liberally from others. It’s a patchwork by nature, since each agency has its own colour and line conventions, but hopefully still useful. When Ian made his map, taking a bitmap image and turning it into tiles was a bleeding-edge endeavour and required painstaking effort – but the tools have improved a lot. Even so, a bitmap this size (35,000 pixels square) takes some horsepower. I didn’t have the energy to do everything Ian did (like removing the background); his map will still probably work better for most TTC riders. I also couldn’t figure out what map projection Oakville Transit used, and couldn’t get it to line up nicely with the other data.
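
To give a feel for why a bitmap of that size takes some horsepower, here is a rough back-of-envelope sketch in Python of the number of tiles in an image pyramid. It assumes 256-pixel square tiles and a simple pyramid that halves the image at each level. A real web map aligns tiles to a global grid after reprojection, so actual counts will differ; treat this as an estimate only.

```python
import math

TILE_SIZE = 256       # assumed tile edge in pixels
IMAGE_SIZE = 35_000   # bitmap edge in pixels, from the text above


def tiles_per_level(image_size, tile_size=TILE_SIZE):
    """Tile counts from full resolution down to one tile, halving each level."""
    counts = []
    size = image_size
    while True:
        per_side = math.ceil(size / tile_size)
        counts.append(per_side * per_side)
        if per_side == 1:
            break
        size = math.ceil(size / 2)  # next level is half the resolution
    return counts


counts = tiles_per_level(IMAGE_SIZE)
total_tiles = sum(counts)
print(len(counts), counts[0], total_tiles)
```

Under these assumptions the full-resolution level alone needs 137 × 137 = 18,769 tiles, and the whole pyramid has 9 levels.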

I’ll probably do a few more revisions on this in the next few months – an adjustable opacity slider would be nice, a legend for each operator, and higher zoom levels. But I thought I’d release a beta version and see if anyone likes it, and see how expensive the bandwidth is.

Version 3 of my map is now up (and version 2 is still around for anyone who wants it). New in this version:

  • Local transit operator maps
  • More mobile friendly: full-screen view by default, location-aware (uses GPS to detect your current location, if available)
  • May seem slower, unless you have a new browser, like Google Chrome or Firefox 4
  • Graphics updates: labels cleaner, interchange stations cleaner, labels always visible instead of showing on hover (for touchscreen users)
  • Search tries to find a transit station first, otherwise tries other non-transit locations
  • No legend… yet
  • Added “Get directions to here” link to each station
  • API version 3
  • Fixes: added Lincolnville station, fixed broken links