By: Camilla Mahon
We just purchased 8.2 million km² of high-resolution satellite imagery from DigitalGlobe: enough to cover 10,000 metropolitan areas the size of New York City all around the world. And it looks beautiful!
Behind the scenes
This new imagery is from DigitalGlobe’s ultra-high-res satellites. They collect data in state-of-the-art sub–50 cm pixels. At that sharpness, people can see details of buildings, pathways, and parks, making it easy to orient themselves in the landscape and find everything from shops to mailboxes. It also allows autonomous vehicles and machine learning algorithms to see lane markings, speed bumps, and other ingredients of HD maps.
How much of a difference does sharpness make for algorithmic work? Consider that a compact car covers fewer than 8 pixels in one-meter imagery, barely enough to see which way it’s pointing. In the highest-resolution satellite imagery, the same car is represented by more than 10 times as many pixels.
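The arithmetic behind that comparison is easy to check. The sketch below is illustrative only: the car footprint (4.2 m by 1.8 m) and the 30 cm pixel size are assumptions chosen for the example, not figures from this post, and `pixels_covered` is a hypothetical helper.

```python
def pixels_covered(length_m, width_m, pixel_size_m):
    """Approximate number of pixels a rectangular footprint covers.

    pixel_size_m is the ground distance spanned by one pixel edge.
    """
    return (length_m * width_m) / (pixel_size_m ** 2)


# Assumed compact-car footprint; real cars vary.
CAR_LENGTH_M, CAR_WIDTH_M = 4.2, 1.8

coarse = pixels_covered(CAR_LENGTH_M, CAR_WIDTH_M, 1.0)  # one-meter imagery
sharp = pixels_covered(CAR_LENGTH_M, CAR_WIDTH_M, 0.3)   # assumed 30 cm imagery

print(f"1.0 m pixels: {coarse:.1f}")
print(f"0.3 m pixels: {sharp:.1f}")
print(f"ratio: {sharp / coarse:.1f}x")
```

Pixel count grows with the inverse square of pixel size, so the ratio doesn’t depend on the car at all: going from 1 m to 30 cm pixels gives roughly 11 times as many pixels for any object.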
Combining human and data insights
Knowing where to update imagery at this scale is hard. We have users all over the world, and they have many different coverage priorities for their satellite views: roadways, new developments, commercial districts, farms, national parks, waterfronts, you name it. Even for the same point on the ground, people have different requirements. For example, recency is the main concern for some, while others would gladly see last year’s images if they’re a little clearer than this year’s. To balance everyone’s needs, we’ve learned to take a “measure twice, cut once” approach.
First, we analyze our existing imagery to learn where it would benefit most from improvement. We have fine-tuned algorithms that measure haze, poor lighting, and low resolution, among other potential problems. As they run across all our imagery, they create a multi-layer “map of the map” that represents quality on several axes. We collect these metrics into an aggregate that we call the Satellite Health Index (SHIdx), a single indicator of how healthy the imagery is in any given area.
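To make the idea of an aggregate concrete, here is a minimal sketch of how several per-area quality layers could be folded into one number. It is not our production formula: the `TileQuality` fields, the 0-to-1 scales, the weights, and the weighted-average approach are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class TileQuality:
    """Hypothetical per-area problem scores, each from 0 (no problem) to 1 (severe)."""
    haze: float
    poor_lighting: float
    low_resolution: float


# Assumed weights for illustration; they sum to 1.
WEIGHTS = {"haze": 0.4, "poor_lighting": 0.3, "low_resolution": 0.3}


def health_index(quality: TileQuality) -> float:
    """Return 1.0 for pristine imagery, falling toward 0.0 as problems accumulate."""
    penalty = sum(weight * getattr(quality, name) for name, weight in WEIGHTS.items())
    return 1.0 - penalty


hazy_tile = TileQuality(haze=0.8, poor_lighting=0.2, low_resolution=0.1)
print(round(health_index(hazy_tile), 2))
```

A single weighted average is the simplest possible choice. It hides which axis caused a low score, which is one reason to keep the underlying layers around alongside the index.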
Our anonymized telemetry data shapes the other half of the formula. It lets us spot shifts in user demand; for example, we can identify a quickly developing city, a newly opened bridge, or a suddenly fashionable neighborhood. We combine this data with explicit user requests.
We like user requests because they’re products of human judgment. However, they aren’t as spatially dense as anonymized telemetry, and they can be biased by cultural, economic, and technical factors. Combined, user requests and telemetry density tell us where our users need us to update imagery.
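As a rough sketch of how the two demand signals might be blended and then weighed against imagery health, consider the example below. Everything in it is hypothetical: the function names, the equal weighting of requests and telemetry, the sample numbers, and the rule that priority equals demand times how unhealthy the imagery is.

```python
def normalize(counts):
    """Scale non-negative counts to the 0..1 range relative to the largest value."""
    peak = max(counts)
    return [count / peak if peak else 0.0 for count in counts]


def update_priority(telemetry, requests, health, request_weight=0.5):
    """Blend two demand signals, then favor areas whose imagery is least healthy.

    health values run from 0 (poor imagery) to 1 (pristine imagery).
    """
    scores = []
    for t, r, h in zip(normalize(telemetry), normalize(requests), health):
        demand = (1 - request_weight) * t + request_weight * r
        scores.append(demand * (1 - h))
    return scores


areas = ["downtown", "suburb", "farmland"]  # made-up sample areas
scores = update_priority(
    telemetry=[900, 600, 50],  # made-up usage counts
    requests=[10, 2, 1],       # made-up explicit request counts
    health=[0.9, 0.3, 0.4],    # made-up health index values
)
ranked = sorted(zip(areas, scores), key=lambda pair: pair[1], reverse=True)
for area, score in ranked:
    print(f"{area}: {score:.2f}")
```

With these sample numbers the suburb outranks downtown: downtown has the most demand, but its imagery is already healthy, so a refresh buys less there.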
There’s still a healthy proportion of common sense in our update allocations, but it’s always built on hard data. And what the data tells us is sometimes surprising, which is why it’s valuable. For example, going into this round of updates, we knew we wanted to refresh Shanghai. But when we referred to the SHIdx and user demand, we saw that the high-priority area extended far beyond Shanghai proper, across many surrounding cities. If we hadn’t looked at that data, we would have refreshed only the downtown core. With the data, we decided to refresh well over 150,000 km², an area bigger than Wisconsin.
Explore our imagery
Ultimately, our goal is the most beautiful and accurate map possible. That means it has to be a living map, continually growing and updating, essentially learning from itself. Combining human and data insights to prioritize which regions to update is part of that complex process.