Imagine there are no variable names. Imagine working – in 2016 – with registers. Imagine one minute file load times. Imagine that all commands are just numbers. Imagine there’s no usable string processing.
Welcome to Emme 3. During the years that I worked in travel demand forecasting, this was the main tool available to me.
Emme was undoubtedly a trailblazing innovator when it first came out in 1982, and it remained a power user’s dream through to the early 90s. But it clearly missed the Windows boat; the software seems to have stagnated until a revival began in the late 00s.
As of early September, I’ve shifted to a new job in software development / 3D graphics with Mental Canvas. This comes after eight years as a data scientist / transport modeller at Metrolinx. I’m very excited to take on this new position, and I’m particularly keen to shift to a job where the product is working software instead of analysis and advice.
While the transportation modellers I’ve talked to understand the close relationship between software and modelling, for others in transportation this seems like a drastic shift in direction. It’s not; the two jobs have quite a bit in common, and I’d like to explain the shared ground. To me, there are three basic stories here:
Similar Data: both transportation and graphics have a lot of similarly structured data. Both jobs involve a lot of time organizing, structuring and automating the flow of data.
Data Science & Software: a software person would use the term “data science” to describe transport modelling; it’s a hybrid of software and statistics, applied to the problems in the transportation field. Data Science and Software Development are closely related career paths.
Software and Transportation are Converging: or perhaps more boldly, software is in the early stages of disrupting the transportation sector.
I’ve long been a fan of data visualization, dating back to my days in the Imager lab at UBC, which had a research area in that subject. I’ve realized that my approach to “telling stories visually with data” includes a lot of knowledge that isn’t common in the transportation world, and I decided to share what I know.
Drawing from the Internet, here’s a basic collection of content that gives a good introduction to how to communicate visually, using data. It’s even more compelling if you have my running commentary alongside… batteries not included.
The agile process (or “scrum”) is very successful in the software world, but little known outside. My team has been doing scrums at a transit agency for 3-4 years, and I get asked about it regularly. I’ve assembled a few links that are useful for explaining it without being too jargony or software-specific.
I’ll talk more about:
A very quick update – I’ve finally updated my Toronto transit map to use more recent map information. As of March 2016, the maps now show data from roughly Dec. 2015 for all GTHA transit systems. (I hadn’t had time to update them since originally building the map in 2011.)
The main visible change is that the TTC map is now simpler and shows route frequency with the thickness of the lines. Unfortunately, I may have difficulty updating the TTC map going into the future – as of 2016, their map is now more “conceptual” and is not geographically accurate; I can’t readily warp it to sit on a geographically-accurate map.
Many years ago, I found an excellent resource for transit modelling: slides from a series of 2006-2009 workshops held by the US Federal Transit Administration (FTA), which advised agencies applying for federal funding to build rapid transit under the “New Starts” funding program. It’s very deeply buried on their website, and since then I’ve seen very few people reference this material, nor have I seen it assembled into a formal report.
So, for those interested – I’ve pulled together an easier-to-use table of contents to the three separate workshops, and tried to “deep link” into them to make it easier to browse and find the material. Enjoy! UPDATE April 2016 – FTA has reorganized their website and the reports are no longer available there. I’ve mirrored everything here on my website. Continue reading Federal Transit Administration (FTA) forecasting workshops→
Modellers all learn about the different components of transit trip travel time, and the “perceived” weights that people put on them. It’s a useful insight into how transit works, and I find it’s a great exercise for testing how “useful” a new transit service is. The trouble is, after learning about weights, everyone wants to customize them – for their economic analysis, for one component of their model, etc. And analysis quickly gets inconsistent. Here’s why I think that’s often a bad idea – and why I think the weights used in transit assignment should be applied, unchanged, for all other parts of analysis. (And it’s not just me – the US Federal Transit Administration made this exact point in a 2006 discussion.)
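To make the idea concrete, here is a minimal sketch of how perceived travel time is typically computed as a weighted sum of trip components. The weights and transfer penalty below are illustrative placeholders only — they are not taken from any particular model or from the FTA discussion mentioned above.

```python
# Illustrative perceived travel time calculation.
# The weights below are hypothetical placeholders, not values from any
# specific transit assignment model.
WEIGHTS = {
    "in_vehicle": 1.0,  # base unit: one minute in-vehicle = one perceived minute
    "wait": 2.0,        # waiting is commonly perceived as more onerous
    "walk": 2.0,        # access/egress walking, similarly weighted
}
TRANSFER_PENALTY = 5.0  # perceived minutes added per transfer (also illustrative)

def perceived_time(components, transfers=0):
    """Weighted sum of trip time components, in perceived minutes."""
    total = sum(WEIGHTS[name] * minutes for name, minutes in components.items())
    return total + TRANSFER_PENALTY * transfers

trip = {"in_vehicle": 20, "wait": 5, "walk": 8}
print(perceived_time(trip, transfers=1))  # 20*1 + 5*2 + 8*2 + 5 = 51.0
```

The point of keeping one set of weights everywhere is that this same function — with the same coefficients — would then feed the assignment, the economic analysis, and any service comparisons, so a “useful” service ranks consistently across all of them.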
A few years ago, I built a Google Maps app that combined the maps from several Toronto transit agencies all in one mashup map. I never got around to discussing the technical issues associated with that effort, and thought it might be worth writing up. This is an extra-technical post, covering the GIS / raster graphics / GDAL programming techniques I used to make the mashup work, for anyone else interested in trying a similar exercise. Continue reading Transit Map Mashup (Tech Talk)→
I recently received an NGO request for the House of Commons members’ contact information in Comma Separated Value (CSV) format. I quickly found the helpful Represent website by Open North. This small group has scraped government websites and created machine-readable versions of MP contact information, plus a minimal maps interface layered on top.
However, the Represent effort leans a little too “techno-elite.” The results are provided in the JSON format, a hyper-current, versatile and Web-buzzword-compliant format. But most NGOs I know are still struggling to figure out Excel, never mind anything more recent; and Excel can’t handle much more than CSV files (if even that – modern web-friendly text encodings like UTF-8 don’t even work properly).
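Bridging that gap is a short exercise: converting JSON records to a CSV that Excel will open cleanly. A sketch of the idea is below — the field names and sample record are hypothetical, not the actual Represent API schema. The one non-obvious trick is writing the file as `utf-8-sig`, which prepends a byte-order mark so Excel detects the encoding instead of mangling accented names.

```python
import csv
import json

# Hypothetical records in the spirit of the Represent data; the real
# API's field names and structure may differ.
records = json.loads("""[
  {"name": "Jane Doe", "riding": "Example Centre", "email": "jane.doe@parl.gc.ca"}
]""")

# "utf-8-sig" writes a BOM so Excel recognizes UTF-8; newline="" is
# required by the csv module to avoid blank rows on Windows.
with open("mps.csv", "w", encoding="utf-8-sig", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "riding", "email"])
    writer.writeheader()
    writer.writerows(records)
```

With that encoding choice, the resulting file opens directly in Excel with accents intact — no import wizard required.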
I’ve finally published my M.A.Sc. thesis as a journal article, under the title Advances in Population Synthesis: fitting many attributes per agent and fitting to household and person margins simultaneously.
This article is the preferred citation going forward; I think it tells the story best:
- A brief summary of the key contributions described in detail in my thesis
- A better explanation of the U.S. context and the applicability of this work outside Canada. Statistics Canada goes to great lengths to protect Canadian privacy, and some of my work was motivated by the particular difficulties associated with Canadian census data.