
I’m working on road and street safety analysis. I have a dataset of dangerous points in four regions of a city. The available variables include latitude, longitude, and the number of injuries and deaths over the past two years, plus several dummy variables (road lighting status, ITS, longitudinal protection status, type of point control). Can I use machine learning methods to rank these points from most to least dangerous based on their potential danger? If so, which methods would help? I would appreciate any relevant links or references. Thanks in advance.

sahar
  • 4
    Hi and welcome to the site! Can you explain why you want to address this as an ML problem instead of, for example, simply defining a "danger score" for all data points and thereby derive the ranking explicitly? – Jonathan Jan 01 '20 at 19:47
  • 2
    Please add some more info: What is your dependent variable (y)? Is it accidents per year? – Peter Jan 01 '20 at 22:21

1 Answer


Yes, you can rank those data points.

First, you need to define a quantitative measure of "potential danger" to serve as the target variable. That could be a single variable (e.g., number of accidents) or a weighted combination of several variables, frequently called an index.
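As a minimal sketch of such an index, assuming it is built from the injury and death counts mentioned in the question: the weights, point IDs, and counts below are made up for illustration, and in practice the weights would come from domain knowledge.

```python
# Hypothetical weights: here a death counts five times as much as an injury.
WEIGHTS = {"injuries": 1.0, "deaths": 5.0}

# Hypothetical points with two-year injury and death counts.
points = [
    {"id": "P1", "injuries": 12, "deaths": 0},
    {"id": "P2", "injuries": 3, "deaths": 2},
    {"id": "P3", "injuries": 4, "deaths": 1},
]

def danger_index(point):
    """Weighted sum of the variables that make up the index."""
    return sum(weight * point[name] for name, weight in WEIGHTS.items())

# Sort from most to least dangerous according to the index.
ranking = sorted(points, key=danger_index, reverse=True)
ranked_ids = [p["id"] for p in ranking]
print(ranked_ids)
```

Changing the weights changes the ranking, so the choice of weights is itself a modelling decision.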

Then a regression model can be trained to predict that continuous target variable from the other variables (e.g., lighting, ITS, protection, type of control), and the data points can be rank-ordered by their predicted values.
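Here is a rough sketch of the regress-then-rank step, using ordinary least squares via NumPy as one possible regression model. The point names, the two dummy features (lit, protected), and the target values are all hypothetical; a count model or a tree-based regressor may suit real accident data better.

```python
import numpy as np

# Hypothetical data: one row per point, columns are (lit, protected) dummies.
names = ["A", "B", "C", "D", "E", "F"]
features = np.array([
    [1, 1],
    [1, 0],
    [0, 1],
    [0, 0],
    [1, 1],
    [0, 0],
], dtype=float)

# Hypothetical target: a danger index (or accident count) per point.
target = np.array([1.5, 5.0, 6.0, 9.5, 2.5, 8.5])

# Add an intercept column and fit by ordinary least squares.
X = np.column_stack([np.ones(len(features)), features])
coef, *_ = np.linalg.lstsq(X, target, rcond=None)

# Rank points from highest to lowest predicted danger.
predicted = X @ coef
order = np.argsort(-predicted)
ranked = [names[i] for i in order]
print(ranked)
```

Ranking by predicted rather than observed values smooths out noise in two years of counts, and the fitted model can also score points that have no accident history yet.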

Brian Spiering