From Data Science & Transportation to Software

As of early September, I’ve shifted to a new job in software development / 3D graphics with Mental Canvas. This comes after eight years as a data scientist / transport modeller at Metrolinx. I’m very excited to take on this new position, and I’m particularly keen to shift to a job where the product is working software instead of analysis and advice.

While the transportation modellers I’ve talked to understand the close relationship between software and modelling, for others in transportation this seems like a drastic shift in direction. It’s not; the two jobs have quite a bit in common, and I’d like to explain the shared ground. To me, there are three basic stories here:

  • Similar Data: both transportation and graphics have a lot of similarly structured data. Both jobs involve a lot of time organizing, structuring and automating the flow of data.
  • Data Science & Software: a software person would use the term “data science” to describe transport modelling; it’s a hybrid of software and statistics, applied to the problems in the transportation field. Data Science and Software Development are closely related career paths.
  • Software and Transportation are Converging: or perhaps more boldly, software is in the early stages of disrupting the transportation sector.

Similar Data

What’s the primary data I worked with as a transportation modeller?

  1. 2D Networks: nodes, links, intersections/turns, transit lines.
  2. Matrices (origin-destination): flows of demand, benefits, fares, costs, etc.
  3. 2D Polygons: zone boundaries (nodes, edges, faces)

What’s the primary data I used as a computer graphics programmer?

  1. 3D Polygons: vertices (nodes/points), edges (links), faces
  2. 2D Rasters: bitmaps, textures
  3. Matrices: transformations/projections, Jacobians, Laplacian operator

Most transportation planners do not look “under the hood” of transportation models. But for those who do, the models borrow a lot of ideas from computer science, and from Geographic Information Systems (GIS). If you read the EMME manual from cover to cover (and who doesn’t enjoy that?), you can’t help but see the fundamental connections with computer science.
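To make the overlap concrete, here is a minimal Python sketch. Everything in it is made up for illustration (the toy network, the single triangle, and the helper names `adjacency` and `mat_vec`); it isn't taken from EMME or from any graphics library. The point is only that the same two small routines serve both fields.

```python
from collections import defaultdict


def adjacency(edges):
    """Build an undirected adjacency list from (a, b) index pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return {node: sorted(neighbours) for node, neighbours in adj.items()}


def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(row[i] * vector[i] for i in range(len(vector))) for row in matrix]


# Hypothetical toy road network: 2D node coordinates plus links.
road_nodes = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
road_links = [(0, 1), (1, 2)]

# Hypothetical single triangle in 3D: vertices plus edges.
mesh_vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
mesh_edges = [(0, 1), (1, 2), (2, 0)]

# The same routine answers "what connects to what?" in both cases.
road_adj = adjacency(road_links)
mesh_adj = adjacency(mesh_edges)

# Transportation: od[i][j] is the number of trips from zone i to zone j.
# Multiplying by a vector of ones gives total trips leaving each zone.
od = [[0, 10], [5, 0]]
trips_from_each_zone = mat_vec(od, [1, 1])

# Graphics: a 2D scaling transformation applied to a point.
scale = [[2, 0], [0, 2]]
scaled_point = mat_vec(scale, [1.0, 1.0])
```

The coordinates gain a third dimension and the matrices mean different things, but the plumbing is the same.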

Software & Data Science

A software person would use the term “data science” to describe transportation modelling. That label is utterly foreign to the transportation world, so allow me to quote a bit of Wikipedia:

Data scientists use their data and analytical ability to find and interpret rich data sources; manage large amounts of data despite hardware, software, and bandwidth constraints; merge data sources; ensure consistency of datasets; create visualizations to aid in understanding data; build mathematical models using the data; and present and communicate the data insights/findings.

Does that sound a bit like transportation modelling? Sure, but without any explicit understanding of transportation. In the software world, every job combines a certain amount of software knowledge with details of the particular “domain” the software is applied to (like “transportation”). Software jobs are usually 90% software and maybe 10% domain. To do data science well requires a much greater immersion in the domain: maybe 70% software/stats and 30% domain knowledge.

I’d say that’s not far from what makes a good transportation modelling specialist: 70% technical skills, 30% understanding of transportation such as traffic theory, transit demand, etc. Some choose to go further and have more domain knowledge – that is, to be part “modeller” and part “transportation planner”, but that’s a very rare mix of skills.

Data science is one of the hot areas within computer science. You see headlines like “Data Scientist: the Sexiest Job of the 21st Century” (Harvard Business Review), questions like “Which career is more promising: data scientist or software developer?” (Quora), and salary surveys showing Data Scientist as the #4 best-paid category within software (Stack Overflow). Again, it can be seen as an area related to software development.

I’m originally a software developer, I did several years of work in data science / transportation modelling, and now I’m returning to regular software development.

Software and Transportation are Converging

The software industry has clearly set its sights on the transportation sector. To me, one of the recent signs was in May 2016 when tech news site Re/Code dedicated one of its nine major topic areas to transportation. Uber and its Chinese competitor Didi Chuxing are the first breakout multibillion dollar hits (unless you count Google Maps), and Tesla is Silicon Valley’s first real car company. Many industry giants are investing heavily, particularly in the self-driving car and ridesharing spaces: Google, Tesla, Apple and Uber, with substantial competition from the traditional car companies. In-city goods movement and retailing patterns are also starting to shift due to the growth of e-commerce and especially the growth of Amazon as a real competitor to the “big box” / “power centre” format of the 1990s. Long-haul goods movement hasn’t seen any real change yet, but self-driving truck companies like Otto are emerging.

Of course, the Silicon Valley hype machine is also in full effect. I don’t think software is going to change everything instantly. But I do think that having both skillsets will be a valuable combination, and that a future job may bring me back to tackling transportation problems with my software skills.

We shall see.
