1

I'm trying to create an offline estimator for how long it would take to get from one lat/long to another. Two approaches I have come across are the Haversine distance and the Manhattan distance. What I'm thinking of doing is calculating both, using their average as the distance, and then dividing by some average speed to get the travel time.

Since this value will be used as an estimator for drivers in a city, a straight-line distance alone will grossly underestimate the actual distance drivers will have to travel.
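
A rough sketch of the idea in R, to make it concrete. The function names and the 30 km/h default speed are placeholders for illustration, not measured values, and the Manhattan distance here is simply a north-south leg plus an east-west leg, each measured as a great-circle distance:

```r
# Haversine (great-circle) distance in km between two points given in degrees.
haversine_km <- function(lat1, lon1, lat2, lon2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin guards against rounding pushing sqrt(a) slightly above 1
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Manhattan-style distance: one north-south leg plus one east-west leg.
# The east-west leg is taken at the starting latitude (an arbitrary choice).
manhattan_km <- function(lat1, lon1, lat2, lon2) {
  haversine_km(lat1, lon1, lat2, lon1) + haversine_km(lat1, lon1, lat1, lon2)
}

# Travel time in minutes from the average of the two distances.
# avg_speed_kmh = 30 is a placeholder and would need tuning per city.
estimate_minutes <- function(lat1, lon1, lat2, lon2, avg_speed_kmh = 30) {
  d <- (haversine_km(lat1, lon1, lat2, lon2) +
          manhattan_km(lat1, lon1, lat2, lon2)) / 2
  60 * d / avg_speed_kmh
}
```

Calling `estimate_minutes(lat1, lon1, lat2, lon2)` then returns the estimate in minutes, and the average speed can be overridden per call.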

Any thoughts would be greatly appreciated.

C Murphy

4 Answers

1

Build a model! Get your training/test data by querying a routing API (such as the Google Maps API mentioned in another answer) to gather travel times for given start/stop lat/lon pairs.

If you want to run with your idea, make the distance between the two locations the only feature; your model would then just be a simple product of the distance and a constant.
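
A minimal sketch of that single-feature model in R, fitted as a regression through the origin. The `trips` data frame and its values are made up purely for illustration; real data would come from observed travel times:

```r
# Hypothetical training data: distance (km) and observed travel time (minutes).
# These numbers are invented for illustration only.
trips <- data.frame(
  distance_km = c(1.2, 2.5, 3.1, 4.8, 6.0, 7.4),
  minutes     = c(5.0, 9.5, 12.0, 18.5, 22.0, 29.0)
)

# Regression through the origin: minutes = k * distance_km
fit <- lm(minutes ~ 0 + distance_km, data = trips)
k <- unname(coef(fit)["distance_km"])  # fitted minutes per km

# Predicted travel time (minutes) for a new 5 km trip
pred <- predict(fit, newdata = data.frame(distance_km = 5))
```

The fitted constant `k` plays the role of an inverse average speed, so it can be learned from data instead of guessed.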

Michael Higgins
  • Yeah agree, it depends on the sample and travel mode of course, but across quite a few scenarios I have dealt with a linear regression does a decent job, https://andrewpwheeler.com/2014/11/19/using-the-google-distance-api-in-spss-plus-some-eda-of-travel-time-versus-geographic-distance/. – Andy W Oct 05 '22 at 12:24
  • Nice link. I would love to see a version of this that is world wide and includes all modes of transport! – Michael Higgins Oct 08 '22 at 06:51
0

This example is carried out using R.

Suppose there exists a data frame with latitude and longitude coordinates for two locations:

coors <- data.frame(
  lat = c(48.8566,52.5200),
  long = c(2.3522,13.4050),
  stringsAsFactors = FALSE
)

The distance between the two points is calculated as follows:

p <- pi / 180  # degrees to radians
a <- 0.5 - cos((coors$lat[2] - coors$lat[1]) * p) / 2 +
  cos(coors$lat[1] * p) * cos(coors$lat[2] * p) *
  (1 - cos((coors$long[2] - coors$long[1]) * p)) / 2
# Haversine formula: 12742 km is the Earth's mean diameter (2 * 6371 km)
distance <- 12742 * asin(sqrt(a))
distance  # distance in kilometers, roughly 877.5 for these coordinates
Michael Grogan
0

You may also look at external APIs to get the actual travel time between two locations, for example the Google Maps API.
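
As a hedged sketch in R, a request to Google's Distance Matrix web service could be built like this. The helper name `build_distance_url` and the `"YOUR_API_KEY"` value are placeholders, and the response layout in the trailing comments is an assumption that should be checked against the current documentation:

```r
# Build a Distance Matrix request URL for one origin and one destination.
# api_key is a placeholder; a real key from Google is required.
build_distance_url <- function(lat1, lon1, lat2, lon2, api_key) {
  paste0(
    "https://maps.googleapis.com/maps/api/distancematrix/json",
    "?origins=", lat1, ",", lon1,
    "&destinations=", lat2, ",", lon2,
    "&mode=driving",
    "&key=", api_key
  )
}

url <- build_distance_url(48.8566, 2.3522, 52.52, 13.405, "YOUR_API_KEY")

# With the jsonlite package installed and a valid key, the response could
# be read roughly like this (assumed layout, not verified here):
# res <- jsonlite::fromJSON(url)
# res$rows$elements[[1]]$duration$value  # travel time in seconds
```

Since the estimator itself has to work offline, calls like this would only be used ahead of time to collect travel times for training.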

fhaase
0

Apart from distance, which is the most important feature in your model, you might want to add other information such as neighborhood, traffic, time of departure, type of car, or driver details (if you have any information about them).

Example of a feature: starting from the departure lat/long, you could extract nearby points that are likely to delay arrival at the destination. Driver experience might also be a contributing factor to the ETA. You could also use Uber's data to add average travel times for the physical zones you're working in.

Distance is an obvious, linear feature that will be the centerpiece of your model, but adding data that breaks down that linearity is what will make your model good enough to be put to the test.
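
To illustrate with a sketch in R, here is a model where a rush-hour flag changes the minutes-per-km slope. The `trips` data and the `rush_hour` feature are invented for illustration; any of the features above could take its place:

```r
# Hypothetical trips: all values are made up for illustration only.
trips <- data.frame(
  distance_km = c(1.2, 2.5, 3.1, 4.8, 6.0, 7.4, 2.0, 5.5),
  rush_hour   = factor(c("no", "no", "yes", "no", "yes", "yes", "yes", "no")),
  minutes     = c(5.0, 9.0, 15.5, 17.0, 30.0, 36.5, 10.5, 19.0)
)

# The interaction term lets rush hour change both intercept and slope
fit <- lm(minutes ~ distance_km * rush_hour, data = trips)

# Compare the same 5 km trip outside and inside rush hour
new_trips <- data.frame(
  distance_km = c(5, 5),
  rush_hour   = factor(c("no", "yes"), levels = levels(trips$rush_hour))
)
preds <- predict(fit, newdata = new_trips)
```

With real data, the gap between the two predictions shows how much the extra feature matters beyond distance alone.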

Blenz