Uber AI, Data / ML

Inside Uber ATG’s Data Mining Operation: Identifying Real Road Scenarios at Scale for Machine Learning

June 2, 2020 / Global
Figure 1. In this graphic, bar height indicates the number of times the SDV drove on a specific lane.
Figure 2. This heatmap of driving volume by hour and day shows that more miles in this sample data set are from the mornings and afternoons, predominantly between Monday and Wednesday.
Figure 3. We can visually validate the quality of data-mined scenarios by animating the individual detections collected by our SDV perception system on a map.
Figure 4. The “start” and “end” points of pedestrians crossing the street on a map provide important information about pedestrian crossing locations.
Figure 5. A scatter plot of average crossing speeds highlights the fastest and slowest average speeds observed by our SDV sensing and perception system.
Figure 6. A distribution of the average pedestrian crossing speeds, classified as “moderate” or “extreme,” highlights that, on average, pedestrians were most likely to cross the street at a velocity of around 1.39 m/s.
Figure 7. A two-dimensional scatter plot of average walking speeds and distance traveled reveals that most pedestrians in our analysis traveled ~18 meters to cross the street, at a velocity of 1.4 m/s.
Figure 8. A three-dimensional scatter plot including the crossing time duration tells us the pedestrian crossing speed, the amount of time it took for the pedestrian to cross the street, and the width of the road the pedestrian crossed.
Figure 9. The blue, “common and moderate” scenarios above enable teams like simulation to create “common and moderate” test sets that mirror real pedestrian crossings.
Figure 10. This seven-dimensional scenario visualization is a window into generating increasingly fine-grained scenario “fingerprints” and into the process of surfacing unique observations.
Figure 11. A three-plot progression charts observations, confidence intervals, and the margin of error, giving us richer, more nuanced statistical insights about our “pedestrian crossing the street” scenario.
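
The captions above refer to average crossing speeds, confidence intervals, and a margin of error, but the page does not show how these were computed. As a rough sketch only, the Python below shows one common way to derive them: speed as distance over duration, and a 95% interval from a normal approximation (z = 1.96). The function names, the choice of z, and the sample values are all assumptions made for illustration; none of them come from Uber ATG's pipeline or data.

```python
import math
import statistics


def crossing_speed(distance_m, duration_s):
    """Average crossing speed in m/s for one pedestrian crossing."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return distance_m / duration_s


def mean_with_margin(speeds, z=1.96):
    """Return (mean, margin of error) using a normal approximation.

    margin = z * s / sqrt(n), where s is the sample standard deviation.
    z = 1.96 corresponds to a 95% confidence level (an assumption here).
    """
    n = len(speeds)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.fmean(speeds)
    standard_error = statistics.stdev(speeds) / math.sqrt(n)
    return mean, z * standard_error


# Hypothetical (distance in meters, duration in seconds) pairs, invented
# purely for illustration; they are not observations from the article.
crossings = [(18.0, 12.0), (18.0, 13.5), (16.0, 11.0), (20.0, 15.0), (18.0, 12.9)]
speeds = [crossing_speed(d, t) for d, t in crossings]
mean, margin = mean_with_margin(speeds)
print(f"mean = {mean:.2f} m/s, 95% CI = [{mean - margin:.2f}, {mean + margin:.2f}]")
```

With this formula the margin of error shrinks roughly with the square root of the number of observations, which is one reason more mined examples of a scenario give a tighter estimate.
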
Steffon Davis

Steffon Davis is a product manager with Uber's Advanced Technologies Group, working on the development of self-driving vehicles.

Shouheng Yi

Shouheng Yi is a senior software engineer at Uber ATG.

Andy Li

Andy Li is a software engineer on the Data Engineering team at Uber ATG. In his free time, he enjoys making squash brownies.

Mallika Chawda

Mallika Chawda is a software engineer on the Data Engineering team at Uber ATG.

Posted by Steffon Davis, Shouheng Yi, Andy Li, Mallika Chawda