Story of how we implemented bike distance restrictions

 Hey folks!

Roberta tasked me to write a post about how we changed the restrictions for bikes on Generate Match (GM) from filtering based on estimated time of arrival (ETA) to filtering based on distance. I hope you will enjoy it.

I think it is always cool to share what we do and why we do it. The following article explains what we did to improve the assignment of jobs to our bike Quiqees. This has already been working well in production for some time; to tell the truth, I implemented the solution in early November 2017. As they say, better late than never to share it with you.

 

Problem

Some time ago (in a galaxy far, far away…) CSRs received complaints from bike Quiqees who were checking estimated travel times on Google Maps. The couriers knew we blocked assignment of jobs with an estimated pickup or dropoff time higher than 20 minutes, so they checked on their mobile phones; if Google Maps said more than 20 minutes, some of them would complain about the job and ask to be unassigned.

We use neural network models to predict times that are valid for our business (see previous Time Team posts), but the couriers compared them with Google, and obviously they were seeing different numbers from what we predicted.

Indeed, in the past we checked whether we could use Google to predict our jobs, but found that it wasn't an option. This isn't a problem with Google or with our models; it's just that our couriers' behaviour can't be predicted from the data Google uses.

The time you need to go from one place to another depends on many variables: the same path may take different lengths of time due to weather conditions, traffic, transport mode, etc. Something more stable was needed. Our solution was therefore to use distance instead of time to restrict bikes from long-distance jobs, because the distance between two points doesn't change from day to day, and this would reduce the number of complaints from drivers. Whether we used distance or time, we knew couriers were going to check jobs with mapping applications, so we had to make sure we calculated something close to the on-the-road distance predicted by, e.g., The Almighty Google Maps.

We had a few options to choose from: a) querying Google Maps itself (or a similar paid third-party mapping service), b) resurrecting the open source mapping service OSRM for GM purposes and retrieving the distance from this internal service, or c) modelling the actual on-the-road distance with simple mathematical formulas. Guess what we chose… option C, of course, as it was the easiest, cheapest and fastest action to take. After a few days of research we had a no-cost, no-latency solution that was good enough for our purposes. Further down you'll find more details about the modelling, the evaluation process and the results.

 

Modelling

As mentioned above, we chose 'the cheapest' way to solve the business problem: we aimed to experiment with different distance metrics and algorithms to come up with the most accurate calculation of the on-the-road distance, knowing only the current position of the courier and the coordinates of the destination locations.

 

Definitions of distance metrics from A to B

Using the straight-line distance between two coordinates isn't a good option (you can try, but I don't recommend it!) because, as we all know, in a city movement is constrained by roads, buildings, bridges, etc.

Mathematically, there are a few different ways to calculate the distance between two points. In the data team we use the haversine distance; for simplicity we can say the haversine distance is the same as a straight line, but over the surface of the earth (you know the earth isn't flat, right?).
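For reference, the haversine distance can be computed with the standard formula; here is a minimal, self-contained Python sketch (the function name is mine, not our production code):

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle ("straight line over the earth") distance in meters."""
    r = 6371000.0  # mean earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))
```

One degree of arc along a meridian comes out at roughly 111 km, which is a quick sanity check for any implementation.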

Another way to calculate the distance is the so-called Manhattan distance (or taxicab distance). This looks more like what you might actually do while driving in a city.

Manhattan distances

 

Blue and yellow lines are what you might do while driving: you can't go straight from one point to another, but have to go around buildings. The same goes for the red line, though it is a bit less realistic. Note that all those lines have the same length, although their trajectories are different. Below is an example of the Manhattan vs. haversine vs. actual on-the-road distance on a real map.
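On coordinates, one simple way to get a Manhattan-style distance is to move "along the grid": first change latitude, then longitude, measuring each leg with the haversine formula. A sketch (the helper is repeated here so the snippet is self-contained; names are mine):

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters."""
    r = 6371000.0  # mean earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def manhattan_m(lat1, lon1, lat2, lon2):
    """Taxicab distance: one leg along latitude, then one along longitude."""
    return (haversine_m(lat1, lon1, lat2, lon1)
            + haversine_m(lat2, lon1, lat2, lon2))
```

By construction the Manhattan distance is always at least as long as the haversine distance between the same two points.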
 

 

 

Dataset and evaluation of success

Firstly, I used our internal dataset. It simply included pickup coordinates, dropoff coordinates and the couriers' current locations at the moment they accepted jobs. This was the input data; the output (Ground Truth) was the on-the-road distance, which I retrieved by querying the Google Maps API for a limited number of our observations.

Secondly, every modelling attempt requires a way to measure success, or in other words to quantify how well your predicted values correspond to reality (the Ground Truth). For this purpose I used the average error measured in meters. For each observation the error is calculated as abs(predicted distance in meters - actual distance in meters), where abs is the absolute value, without regard to sign. I took the average of this error over the thousands of observations we tested each new model against to get the final evaluation metric, and repeated this procedure for every model. The larger the error, the worse the model represents reality.
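This metric is just the mean absolute error; a minimal sketch of the computation described above:

```python
def mean_abs_error_m(predicted, actual):
    """Average of abs(predicted - actual) over all observations, in meters."""
    assert len(predicted) == len(actual) and predicted, "need paired observations"
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)
```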

 

Trialed approaches and their results

Raw haversine distance

Using the straight-line (haversine) distance between the two points directly, we have:

Avg. error: 876 meters

As we can see, an average error of 876 meters means the prediction isn't doing a very good job. This was our benchmark model; all the subsequent models aimed to bring the average error down.

 

Linear regression using haversine distance

Next, I tried a linear regression model with the haversine distance as a feature. If we call haversine distance “X” then we have the formula:

A*X + B

Where we have to calculate A and B, to fit the data we have from Google Maps.
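Fitting A and B is ordinary least squares with a single feature, which even has a closed-form solution. A pure-Python sketch (the helper name and sample numbers are illustrative, not our real coefficients or data):

```python
def fit_line(xs, ys):
    """Ordinary least squares for y ~ A*x + B, one feature, closed form."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    a = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    b = mean_y - a * mean_x  # intercept follows from the means
    return a, b
```

In practice any regression library does the same job; the point is that the fitted model is just two numbers.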

Avg. error: 470 meters

The model works much better than using just the straight line.

 

Linear regression using haversine distance + Manhattan distance

We often have hypotheses that we wish to try in order to improve our models. The Manhattan distance is not as direct as the straight-line distance, so we hoped it would account for the obstacles drivers have to navigate around. So we experimented with it as a feature. If we call the haversine distance "X" and the Manhattan distance "Y", then we have the following formula:

A*X + B*Y + C

Where we have to calculate A, B and C (the machine does this for us, easy peasy!)
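The "machine does this for us" step is again ordinary least squares, now with two features, solved via the normal equations. A pure-Python sketch with made-up data (everything here is illustrative):

```python
def fit_two_features(xs, ys, zs):
    """OLS for z ~ A*x + B*y + C, solved via the 3x3 normal equations."""
    n = len(xs)
    cols = (xs, ys, [1.0] * n)  # design matrix columns: x, y, constant
    m = [[sum(a * b for a, b in zip(u, v)) for v in cols] for u in cols]
    rhs = [sum(a * b for a, b in zip(u, zs)) for u in cols]
    # Gauss-Jordan elimination on the small 3x3 system.
    for i in range(3):
        piv = m[i][i]
        m[i] = [v / piv for v in m[i]]
        rhs[i] /= piv
        for j in range(3):
            if j != i:
                f = m[j][i]
                m[j] = [vj - f * vi for vj, vi in zip(m[j], m[i])]
                rhs[j] -= f * rhs[i]
    return tuple(rhs)  # (A, B, C)
```

When Y is almost a constant multiple of X, as happens with haversine vs. Manhattan in practice, the extra feature adds nothing, which matches the result below.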

Avg. error: 470 meters

That's interesting! As we can see, adding the Manhattan distance isn't useful at all: there is no significant improvement from including it.

 

Gradient boosting algorithm

I also tried gradient boosting, one of the more advanced methods, to push the boundaries of our knowledge. However, using more advanced and more accurate methods isn't always the best option: there is a trade-off with complexity, which comes with a cost, and sometimes such a model is not possible (or is really hard) to implement in production.

Avg. error: 393 meters

This is about 100 meters better on average. But as I said, complexity comes with a cost, and an improvement of only 100 meters on average is not significant enough to warrant implementing gradient boosting in production.
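To give a feel for the idea (this is a toy sketch with decision stumps, not our production code or library): gradient boosting for squared loss repeatedly fits a weak learner to the current residuals and adds a damped copy of it to the ensemble.

```python
def fit_stump(xs, residuals):
    """Best single-split stump on one feature, minimising squared error."""
    best = None
    for thr in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        if not left or not right:
            continue
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        err = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, thr, lmean, rmean)
    return best[1:]  # (threshold, left value, right value)

def gradient_boost(xs, ys, rounds=300, lr=0.1):
    """Each round, fit a stump to the residuals and add lr * stump."""
    base = sum(ys) / len(ys)
    stumps = []
    pred = [base] * len(xs)
    for _ in range(rounds):
        resid = [y - p for y, p in zip(ys, pred)]
        thr, lval, rval = fit_stump(xs, resid)
        stumps.append((thr, lval, rval))
        pred = [p + lr * (lval if x <= thr else rval)
                for p, x in zip(pred, xs)]
    def predict(x):
        return base + sum(lr * (l if x <= t else r) for t, l, r in stumps)
    return predict
```

Even this toy version hints at the complexity cost: the final model is hundreds of stumps instead of two coefficients.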

 

Choosing the model

After considering all the data we had to choose a model, and I chose the linear regression with the haversine distance as a feature. With an average error of 470 meters, the model is good enough while keeping complexity to a minimum. Another advantage is that it is easy to understand what is happening. Consider our formula (remember, it was calculated by the computer!), where X is the straight-line distance:

1.27 * X + 306

What does it mean? It means that we multiply the straight-line distance by 1.27, then add 306 meters, and that gives us a distance close to what Google Maps would return.
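The whole production model therefore fits in one line (the function name is mine):

```python
def predicted_road_distance_m(straight_line_m):
    """Final model: on-the-road distance ~ 1.27 * haversine + 306 meters."""
    return 1.27 * straight_line_m + 306
```

For example, a 1 km straight-line distance maps to a predicted 1,576 m on the road.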

 

Impact

We are nearly finished! We just need to measure the impact we expect.

Let's make some plots:

 

 

This plot shows the error (the distance calculated by our model compared with Google's). We can see that on average the error is 119 meters and 80% of the data lies between -776 meters and +931 meters of error.

Positive values mean we overpredicted and negative values mean we underpredicted.
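The numbers on the plot are just the mean signed error plus the 10th and 90th percentiles, which bound the middle 80% of the data. A sketch of that summary (the sample data is made up):

```python
def percentile(values, p):
    """p-th percentile with linear interpolation between sorted samples."""
    s = sorted(values)
    k = (len(s) - 1) * p / 100.0
    lo, hi = int(k), min(int(k) + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (k - lo)

def error_summary(predicted, actual):
    errors = [p - a for p, a in zip(predicted, actual)]  # +: over, -: under
    return (sum(errors) / len(errors),  # mean signed error
            percentile(errors, 10),     # lower bound of the middle 80%
            percentile(errors, 90))     # upper bound of the middle 80%
```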

And the last one:

It's not an easy one, but we can see that our predictions (red) are pretty close to Google's (green), meaning we don't have to query Google the way our Quiqees do (reducing costs). Well done!

Blue is the performance of the previous prediction, which was based on time, not distance. Those predictions were calculated using neural networks. That sounds cool, but neural networks aren't always the best option; the best option is whatever best solves the problem, in this case reducing the number of jobs rejected by couriers because of their length.

This graph tells us how many jobs GM would reject for bikes. Remember, too low a rejection rate is not desired either, because couriers will complain about inaccurate predictions!

 

Conclusion

In data science we have to find the most parsimonious model that accurately predicts our target variable. As previously discussed, we achieved this with something as simple as multiplying the straight-line distance by a coefficient. We prefer simple solutions over complex ones: if a simple solution provides good enough results, we shouldn't spend 80% of our time chasing the last 20% of improvement. In our case, for example, we didn't need accuracy down to the meter.

Some time later we confirmed that the number of complaints from bike Quiqees about long runs went down, so we can call this project a success.

Bye!