Batch Geocoding and Batch Reverse-Geocoding with Bing Maps

NOTE: An updated version of this article can be found in the Microsoft Maps Blog.
https://www.microsoft.com/en-us/maps/news/batch-geocoding-and-batch-reverse-geocoding-with-bing-maps

Updated blog!

This blog now reflects that batch geocoding is supported by the Geocode Dataflow API, documented at https://docs.microsoft.com/bingmaps/spatial-data-services/geocode-dataflow-api.


Introduction

Geocoding and reverse geocoding are services that Bing Maps provides in its SDKs (the Web Control, Windows, iOS and Android) as well as in its REST API services. Geocoding takes a text description or address and returns the geographic coordinates of the corresponding physical location; reverse geocoding takes coordinates and returns the matching address. What happens if you are about to start a project and you have to geocode thousands of addresses? Or what if you have a requirement to batch-process data updates as a recurring task?

Of course, you could just call the geocoder again and again, but that is not a very efficient approach. With our June release we also launched a new batch geocoder and batch reverse geocoder as part of the Bing Spatial Data API to address just these scenarios. Chris Pendleton briefly touched on it in his blog post here.

Today, we would like to go into a bit more detail and build a little application that leverages the Bing Spatial Data Services. During this walkthrough we will follow the process below.

As a prerequisite you will need a Bing Maps Key which you can create yourself at the Bing Maps Portal.

Using Batch Geocoding

Batch geocoding is a useful feature to have for a business operating in almost any industry. For logistics and delivery businesses, the ability to quickly geocode addresses into location coordinates and vice versa can be incredibly useful, particularly for compression and porting over location data between different devices.

Important geocoding updates

Updated 2022:

Before you create a job to geocode data, it’s worth pointing out that there have been significant updates to the batch-geocoders and batch reverse geocoders, which are now referred to as the Geocode Dataflow API. As a developer you’ll have the option to choose between staying with the older data schema (version 1.0) or porting over to the updated one (version 2.0). 

The new data schema provides developers with much more useful information and hence can save large amounts of time otherwise spent on tasks that can now be automated. For example, version 2.0 includes different points for routing and display, as well as an easier method of creating location bounds.

As an overview, using the Geocode Dataflow API involves the following steps:

  1. Format your location data according to the data schema you have chosen. The input can be an XML file or a text file of delimited values.
  2. Create a geocode job. This essentially just involves uploading address data for batch geocoding or point data for batch reverse geocoding.
  3. Monitor the status of the job you created. This step is simple: it takes two required parameters and an optional third. Make sure you use the same Bing Maps key that you used to create the geocoding job.
  4. Download the geocode job results. The results are ready for download once the value ‘Completed’ shows up in the job status field.

These four steps are all developers need to geocode and reverse geocode data. Now let’s have a look at how the batch geocoding process works step-by-step with a few examples.

Format Data

Your data can be in either XML or text files. In text files you can separate values with a comma, tab or pipe (|). The data can be:

  • latitudes and longitudes which would be reverse geocoded

  • query-strings such as place-names, postcodes or unformatted addresses

  • formatted addresses with separate attributes for each address-part

You will find a full description of the data schema here and some sample data here. An interesting aspect of the service is that we can mix different types of information.

The sample data set below contains examples for batch geocoding, including a formatted address, well-known places and a UK postcode, as well as a latitude and longitude for reverse geocoding. There is also an empty entry, which we intentionally included to demonstrate what happens if a record cannot be resolved.

<GeocodeFeed>
  <GeocodeEntity Id="1" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
    <GeocodeRequest Culture="de-DE">
      <Address AddressLine="Konrad-Zuse-Str. 1"
               Locality="Unterschleißheim"
               PostalCode="85716" />
    </GeocodeRequest>
  </GeocodeEntity>
  <GeocodeEntity Id="4" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
    <GeocodeRequest Culture="en-GB" Query="Tower of London">
    </GeocodeRequest>
  </GeocodeEntity>
  <GeocodeEntity Id="5" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
    <GeocodeRequest Culture="en-GB" Query="Angel of the North">
    </GeocodeRequest>
  </GeocodeEntity>
  <GeocodeEntity Id="6" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
    <GeocodeRequest Culture="en-GB" Query="RG6 1WG">
    </GeocodeRequest>
  </GeocodeEntity>
  <GeocodeEntity Id="7" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
    <ReverseGeocodeRequest Culture="fr-FR">
      <Location Longitude="2.265087118043766" Latitude="48.83431718199653" />
    </ReverseGeocodeRequest>
  </GeocodeEntity>
  <GeocodeEntity Id="8" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
    <GeocodeRequest Culture="en-US" Query="">
      <Address AddressLine="" AdminDistrict="" />
    </GeocodeRequest>
  </GeocodeEntity>
</GeocodeFeed>
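
A feed of query-style entities like the ones above could also be generated from a list of rows rather than written by hand. The following is a minimal Python sketch, not part of the original walkthrough: the helper names query_entity and build_feed are hypothetical, and only the element names, attribute names and namespace that appear in the sample are used.

```python
from xml.sax.saxutils import quoteattr

# Namespace taken from the sample feed shown in this article.
GEOCODE_NS = "http://schemas.microsoft.com/search/local/2010/5/geocode"


def query_entity(entity_id, culture, query):
    """Return one GeocodeEntity with a query-style GeocodeRequest (hypothetical helper)."""
    # quoteattr escapes the value and adds the surrounding quotes.
    return (
        f'  <GeocodeEntity Id={quoteattr(str(entity_id))} xmlns="{GEOCODE_NS}">\n'
        f'    <GeocodeRequest Culture={quoteattr(culture)} Query={quoteattr(query)} />\n'
        f'  </GeocodeEntity>'
    )


def build_feed(rows):
    """Wrap (id, culture, query) tuples in a GeocodeFeed root element."""
    body = "\n".join(query_entity(*row) for row in rows)
    return f"<GeocodeFeed>\n{body}\n</GeocodeFeed>"


if __name__ == "__main__":
    print(build_feed([
        (4, "en-GB", "Tower of London"),
        (5, "en-GB", "Angel of the North"),
    ]))
```

Building the attributes with quoteattr keeps place names that contain quotes or ampersands from producing malformed XML.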

There is a size limitation to consider, though: the file to upload must not exceed 100 MB. You can have up to 10 jobs at a time, but if you really need to go to the limits you should consider using a more efficient file format such as a pipe (|)-delimited text file. In this format the sample data above looks as shown below and is only a quarter of the size of the XML file:

1|de-DE||Konrad-Zuse-Str. 1|||||Unterschleißheim|85716||||||||||||||||||||||
4|en-GB|Tower of London||||||||||||||||||||||||||||
5|en-GB|Angel of the North||||||||||||||||||||||||||||
6|en-GB|RG6 1WG||||||||||||||||||||||||||||
7|fr-FR||||||||||||||||||||||||||||48.83431718199653|2.265087118043766
8|en-US|||||||||||||||||||||||||||||
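
To see which column positions actually carry data in a pipe-delimited row, a small Python sketch such as the one below can help. It is illustrative only: the function name non_empty_fields is hypothetical, and the meaning of each column position is defined by the data schema, not inferred here.

```python
def non_empty_fields(line, delimiter="|"):
    """Map column index -> value for every non-empty column in one row."""
    return {
        index: value
        for index, value in enumerate(line.split(delimiter))
        if value
    }


if __name__ == "__main__":
    # One of the sample rows shown in this article.
    row = "4|en-GB|Tower of London||||||||||||||||||||||||||||"
    print(non_empty_fields(row))
    # prints {0: '4', 1: 'en-GB', 2: 'Tower of London'}
```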

Create a Job

In the SDK you will find sample code for a console application in C#. In this walkthrough we will build a WinForms application in VB.NET. The final application will look as shown below:

Once we have selected our source data file, we first set the content type.

' The 'Content-Type' header must be "text/plain" or "application/xml"
' depending on the input data format.

Dim contentType As String = "text/plain"
If Microsoft.VisualBasic.Right(txtSelectedFile.Text, 3).ToLower = "xml" Then
  contentType = "application/xml"
End If

Next we build our HTTP-POST-request adding parameters for the source-data-format and the Bing Maps key. We also add our source-data-file as bytes from a file-stream. If the job was successfully submitted, we receive a job-ID as part of the response-header. Together with a desired output-format (JSON or XML) and the Bing Maps key we can use this job-ID to monitor the batch geocoding or batch reverse geocoding job status. We will start a timer to do just that every 30 seconds (or whatever time interval you think is appropriate).

Dim queryStringBuilder As New StringBuilder()

' The 'input' and 'key' parameters are required.
' Uri.EscapeDataString is used for the values because, unlike
' Uri.EscapeUriString, it also escapes reserved characters such as '&' and '='.
queryStringBuilder.Append("input=").Append(Uri.EscapeDataString(cbInputFormat.Text))
queryStringBuilder.Append("&")
queryStringBuilder.Append("key=").Append(Uri.EscapeDataString(txtBMKey.Text))

' The 'description' parameter is optional.
If Not String.IsNullOrEmpty(txtDescription.Text) Then
  queryStringBuilder.Append("&")
  queryStringBuilder.Append("description=").Append(Uri.EscapeDataString(txtDescription.Text))
End If

Dim uriBuilder As New UriBuilder("http://spatial.virtualearth.net")
uriBuilder.Path = "/REST/v1/dataflows/geocode"
uriBuilder.Query = queryStringBuilder.ToString()

Using dataStream As FileStream = File.OpenRead(txtSelectedFile.Text)
  Dim request As HttpWebRequest = DirectCast(WebRequest.Create(uriBuilder.Uri), HttpWebRequest)

  ' The method must be 'POST'.
  request.Method = "POST"
  request.ContentType = contentType

  Using requestStream As Stream = request.GetRequestStream()
    Dim buffer As Byte() = New Byte(16383) {}
    Dim bytesRead As Integer = dataStream.Read(buffer, 0, buffer.Length)
    While bytesRead > 0
      requestStream.Write(buffer, 0, bytesRead)

      bytesRead = dataStream.Read(buffer, 0, buffer.Length)
    End While
  End Using

  Try
    Using response As HttpWebResponse = DirectCast(request.GetResponse(), HttpWebResponse)
      ' If the job was created successfully, the status code should be
      ' 201 (Created) and the 'Location' header should contain the
      ' location of the new dataflow job.
      If response.StatusCode <> HttpStatusCode.Created Then
        lblStatus.Text = "Unexpected status code."
      End If

      Dim dataflowJobLocation As String = response.GetResponseHeader("Location")
      If String.IsNullOrEmpty(dataflowJobLocation) Then
        lblStatus.Text = "Expected the 'Location' header."
      End If

      myStatusUrl = dataflowJobLocation & "?output=" + cbOutputFormat.Text + "&key=" + txtBMKey.Text
      lblStatusUrl.Visible = True

      ' Start a timer to monitor the status.
      ' In this sample the timer ticks every 30 seconds.
      myTimer.Start()
    End Using
  Catch ex As Exception
    lblStatus.Text = ex.Message
  End Try
End Using
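
For readers who do not work in VB.NET, the same job submission can be sketched in Python with only the standard library. This is a rough sketch under stated assumptions, not an official client: the function names are hypothetical, the endpoint, the 'input', 'key' and 'description' parameters, the content types, the 201 status and the 'Location' header are taken from the walkthrough above, and the example values "xml" and "pipe" for the input format are assumptions.

```python
import urllib.parse
import urllib.request

# Endpoint as used in the VB.NET sample in this article.
DATAFLOW_URL = "http://spatial.virtualearth.net/REST/v1/dataflows/geocode"


def build_create_job_url(input_format, key, description=None):
    """Build the job-creation URL; 'description' is optional."""
    params = {"input": input_format, "key": key}
    if description:
        params["description"] = description
    # urlencode percent-escapes reserved characters in every value.
    return DATAFLOW_URL + "?" + urllib.parse.urlencode(params)


def content_type_for(path):
    """XML files are sent as application/xml, everything else as text/plain."""
    return "application/xml" if path.lower().endswith(".xml") else "text/plain"


def create_job(path, input_format, key, description=None):
    """POST the data file and return the job location from the response header."""
    with open(path, "rb") as source:
        data = source.read()
    request = urllib.request.Request(
        build_create_job_url(input_format, key, description),
        data=data,
        method="POST",
        headers={"Content-Type": content_type_for(path)},
    )
    with urllib.request.urlopen(request) as response:
        if response.status != 201:
            raise RuntimeError(f"Unexpected status code: {response.status}")
        location = response.headers.get("Location")
    if not location:
        raise RuntimeError("Expected the 'Location' header.")
    return location

# Usage (requires a valid key and network access; values are placeholders):
# job_url = create_job("sample.xml", "xml", "YOUR_BING_MAPS_KEY", "My Batch Job")
```

Unlike the VB.NET sample, this sketch stops with an error when the status code or the 'Location' header is not what is expected, instead of continuing to the monitoring step.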

Monitor Status

In the previous section we created our batch job, retrieved the job ID and started a timer that periodically checks the status of the batch geocoding job. The job status can be returned in either XML or JSON format and looks as shown below. As you can see, we can retrieve the status of the job as well as the URLs from which we can download the successfully geocoded data and the records that failed to geocode.

<?xml version="1.0" encoding="utf-8"?>
<Response xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xmlns:xsd="http://www.w3.org/2001/XMLSchema"
          xmlns="http://schemas.microsoft.com/search/local/ws/rest/v1">
  <Copyright>Copyright © 2010 Microsoft and its suppliers. All rights reserved. ...</Copyright>
  <BrandLogoUri>http://spatial.virtualearth.net/Branding/logo_powered_by.png</BrandLogoUri>
  <StatusCode>200</StatusCode>
  <StatusDescription>OK</StatusDescription>
  <AuthenticationResultCode>ValidCredentials</AuthenticationResultCode>
  <TraceId>0508be9c784f4a9c898003942643b7f2|LTSM001003|02.00.136.1000|</TraceId>
  <ResourceSets>
    <ResourceSet>
      <EstimatedTotal>1</EstimatedTotal>
      <Resources>
        <DataflowJob>
          <Id>ce3548b360ca42d3adac0f7c4a26f392</Id>
          <Link role="self">https://spatial.virtualearth.net/REST/...</Link>
          <Link role="output" name="succeeded">https://...</Link>
          <Link role="output" name="failed">https://...</Link>
          <Description>My Batch Job 31/08/2010 00:00:00</Description>
          <Status>Completed</Status>
          <CreatedDate>2010-08-31T02:46:47.1744785-07:00</CreatedDate>
          <CompletedDate>2010-08-31T02:47:36.7504986-07:00</CompletedDate>
          <TotalEntityCount>12</TotalEntityCount>
          <ProcessedEntityCount>12</ProcessedEntityCount>
          <FailedEntityCount>1</FailedEntityCount>
        </DataflowJob>
      </Resources>
    </ResourceSet>
  </ResourceSets>
</Response>
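
As an illustration of how such a status response can be read, here is a minimal Python sketch using the standard library's ElementTree. The function name parse_job_status and the shape of the returned dictionary are assumptions made for this example; the element names and the namespace come from the response above, and the embedded sample is a trimmed copy of it.

```python
import xml.etree.ElementTree as ET

# Default namespace of the status response shown in this article.
NS = {"r": "http://schemas.microsoft.com/search/local/ws/rest/v1"}

# Trimmed copy of the status response, with the elided URLs left as they are.
SAMPLE_STATUS = """<Response xmlns="http://schemas.microsoft.com/search/local/ws/rest/v1">
  <ResourceSets><ResourceSet><Resources>
    <DataflowJob>
      <Link role="self">https://spatial.virtualearth.net/REST/...</Link>
      <Link role="output" name="succeeded">https://...</Link>
      <Link role="output" name="failed">https://...</Link>
      <Status>Completed</Status>
      <TotalEntityCount>12</TotalEntityCount>
      <ProcessedEntityCount>12</ProcessedEntityCount>
      <FailedEntityCount>1</FailedEntityCount>
    </DataflowJob>
  </Resources></ResourceSet></ResourceSets>
</Response>"""


def parse_job_status(xml_text):
    """Extract status, output links and entity counts from a status response."""
    root = ET.fromstring(xml_text)
    job = root.find(".//r:DataflowJob", NS)
    if job is None:
        raise ValueError("No DataflowJob element in the response")

    def count(name):
        # Missing counters are treated as zero.
        return int(job.findtext(f"r:{name}", default="0", namespaces=NS))

    # Only links with role="output" point at downloadable results.
    links = {
        link.get("name"): link.text
        for link in job.findall("r:Link", NS)
        if link.get("role") == "output"
    }
    return {
        "status": job.findtext("r:Status", default="", namespaces=NS),
        "links": links,
        "total": count("TotalEntityCount"),
        "processed": count("ProcessedEntityCount"),
        "failed": count("FailedEntityCount"),
    }


if __name__ == "__main__":
    print(parse_job_status(SAMPLE_STATUS))
```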

In the procedure that is being executed when the timer ticks we evaluate the job status. If the job has been completed we update our user interface with statistical information and download links.

Dim myXmlDocument As New XmlDocument
Dim numTotal As Integer = 0
Dim numProcessed As Integer = 0
Dim numFailed As Integer = 0

myXmlDocument.Load(myStatusUrl)

Dim myJobStatus As String = myXmlDocument.Item("Response").Item("ResourceSets") _
  .Item("ResourceSet").Item("Resources").Item("DataflowJob").Item("Status").InnerText

If myJobStatus = "Completed" Then
    lblStatus.Text = "Job Complete"
    Dim myXmlNode As XmlNode = myXmlDocument.Item("Response").Item("ResourceSets") _
      .Item("ResourceSet").Item("Resources").Item("DataflowJob")
    For i = 0 To myXmlNode.ChildNodes.Count - 1
        Select Case myXmlNode.ChildNodes(i).Name
            Case "Link"
                ' Only the output links carry both a 'role' and a 'name' attribute.
                If myXmlNode.ChildNodes(i).Attributes.Count > 1 Then
                    If (myXmlNode.ChildNodes(i).Attributes("role").Value = "output" And _
                        myXmlNode.ChildNodes(i).Attributes("name").Value = "succeeded") Then
                        mySucessUrl = myXmlNode.ChildNodes(i).InnerText + "?key=" + txtBMKey.Text
                    ElseIf (myXmlNode.ChildNodes(i).Attributes("role").Value = "output" And _
                        myXmlNode.ChildNodes(i).Attributes("name").Value = "failed") Then
                        myFailedUrl = myXmlNode.ChildNodes(i).InnerText + "?key=" + txtBMKey.Text
                    End If
                End If
            Case "TotalEntityCount"
                numTotal = CInt(myXmlNode.ChildNodes(i).InnerText)
            Case "ProcessedEntityCount"
                numProcessed = CInt(myXmlNode.ChildNodes(i).InnerText)
            Case "FailedEntityCount"
                numFailed = CInt(myXmlNode.ChildNodes(i).InnerText)
        End Select
    Next

    lblSummary.Text = "Summary" + vbCrLf _
        + "Total Entities: " + numTotal.ToString + vbCrLf _
        + "Processed Entities: " + numProcessed.ToString + vbCrLf _
        + "Failed Entities: " + numFailed.ToString
End If
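
Outside a WinForms application there is no timer control, so the periodic check can be written as a plain polling loop instead. The Python sketch below is an assumption-laden illustration: wait_for_completion is a hypothetical name, fetch_status stands for any callable that returns the current value of the Status field, and only the value "Completed" is taken from this article.

```python
import time


def wait_for_completion(fetch_status, interval_seconds=30, max_checks=120,
                        sleep=time.sleep):
    """Call fetch_status() until it returns "Completed".

    Returns the number of checks that were needed. Raises TimeoutError if
    the job has not completed after max_checks checks.
    """
    for check in range(1, max_checks + 1):
        if fetch_status() == "Completed":
            return check
        if check < max_checks:
            # Wait before asking again; 30 seconds mirrors the timer above.
            sleep(interval_seconds)
    raise TimeoutError(f"Job not completed after {max_checks} checks")


if __name__ == "__main__":
    # Simulated status values; "Pending" is a placeholder, not a documented value.
    statuses = iter(["Pending", "Pending", "Completed"])
    checks = wait_for_completion(lambda: next(statuses), sleep=lambda _: None)
    print(checks)  # prints 3
```

Passing the sleep function as a parameter keeps the loop easy to try out without actually waiting 30 seconds between checks.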

Download Results

Batch geocoding and batch reverse geocoding results will remain available for download for up to 14 days. Again, a detailed description of the data schema is available here in the SDK but let’s have a quick look at our sample data in XML-format:

<?xml version="1.0"?>
<GeocodeFeed>
  <GeocodeEntity Id="1" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
    <GeocodeRequest Culture="de-DE">
      <Address AddressLine="Konrad-Zuse-Str. 1"
               Locality="Unterschleißheim"
               PostalCode="85716" />
    </GeocodeRequest>
    <GeocodeResponse DisplayName="Konrad-Zuse-straße 1, 85716 Unterschleißheim"
                     EntityType="Address"
                     Confidence="Medium"
                     StatusCode="Success">
      <Address AddressLine="Konrad-Zuse-straße 1"
               AdminDistrict="BY"
               CountryRegion="Germany"
               FormattedAddress="Konrad-Zuse-straße 1, 85716 Unterschleißheim"
               Locality="Unterschleißheim"
               PostalCode="85716" />
      <RooftopLocation Latitude="48.290643" Longitude="11.581654" />
      <InterpolatedLocation Latitude="48.290542" Longitude="11.581076" />
    </GeocodeResponse>
  </GeocodeEntity>
  <GeocodeEntity Id="4" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
    <GeocodeRequest Culture="en-GB" Query="Tower of London" />
    <GeocodeResponse DisplayName="Tower of London, United Kingdom"
                     EntityType="HistoricalSite"
                     Confidence="High"
                     StatusCode="Success">
      <Address AdminDistrict="England"
               CountryRegion="United Kingdom"
               FormattedAddress="Tower of London, United Kingdom"
               Locality="London" />
      <RooftopLocation Latitude="51.5081448107958" Longitude="-0.0762598961591721" />
    </GeocodeResponse>
  </GeocodeEntity>
  <GeocodeEntity Id="5" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
    <GeocodeRequest Culture="en-GB" Query="Angel of the North" />
    <GeocodeResponse DisplayName="Angel of the North, United Kingdom"
                     EntityType="LandmarkBuilding"
                     Confidence="High"
                     StatusCode="Success">
      <Address AdminDistrict="England"
               CountryRegion="United Kingdom"
               FormattedAddress="Angel of the North, United Kingdom"
               Locality="Gateshead" />
      <RooftopLocation Latitude="54.9144704639912" Longitude="-1.58999472856522" />
    </GeocodeResponse>
  </GeocodeEntity>
  <GeocodeEntity Id="6" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
    <GeocodeRequest Culture="en-GB" Query="RG6 1WG" />
    <GeocodeResponse DisplayName="RG6 1WG, Wokingham, United Kingdom"
                     EntityType="Postcode1"
                     Confidence="High"
                     StatusCode="Success">
      <Address AdminDistrict="England"
               CountryRegion="United Kingdom"
               FormattedAddress="RG6 1WG, Wokingham, United Kingdom"
               PostalCode="RG6 1WG" />
      <RooftopLocation Latitude="51.461179330945" Longitude="-0.925943478941917" />
    </GeocodeResponse>
  </GeocodeEntity>
  <GeocodeEntity Id="7" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
    <ReverseGeocodeRequest Culture="fr-FR">
      <Location Latitude="48.8343171819965" Longitude="2.26508711804377" />
    </ReverseGeocodeRequest>
    <GeocodeResponse DisplayName="Quai du Président Roosevelt, 92130 Issy-les-Moulineaux"
                     EntityType="Address"
                     Confidence="Medium"
                     StatusCode="Success">
      <Address AddressLine="Quai du Président Roosevelt"
               AdminDistrict="IdF"
               CountryRegion="France"
               FormattedAddress="Quai du Président Roosevelt, 92130 Issy-les-Moulineaux"
               Locality="Issy-les-Moulineaux"
               PostalCode="92130" />
      <InterpolatedLocation Latitude="48.8343036174774" Longitude="2.26509869098663" />
    </GeocodeResponse>
  </GeocodeEntity>
</GeocodeFeed>
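
Once downloaded, a result file like the one above can be flattened into simple records. The Python sketch below is one possible way to do that, not the method used by the original application: read_results is a hypothetical name, and preferring the rooftop point over the interpolated one is a design choice made for this example only.

```python
import xml.etree.ElementTree as ET

# Namespace declared on each GeocodeEntity in the result feed.
NS = {"g": "http://schemas.microsoft.com/search/local/2010/5/geocode"}

# Trimmed copy of one entity from the results shown in this article.
SAMPLE_RESULTS = """<GeocodeFeed>
  <GeocodeEntity Id="4" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
    <GeocodeRequest Culture="en-GB" Query="Tower of London" />
    <GeocodeResponse DisplayName="Tower of London, United Kingdom"
                     EntityType="HistoricalSite" Confidence="High" StatusCode="Success">
      <RooftopLocation Latitude="51.5081448107958" Longitude="-0.0762598961591721" />
    </GeocodeResponse>
  </GeocodeEntity>
</GeocodeFeed>"""


def read_results(xml_text):
    """Return one dict per entity that has a GeocodeResponse."""
    feed = ET.fromstring(xml_text)
    results = []
    for entity in feed.findall("g:GeocodeEntity", NS):
        response = entity.find("g:GeocodeResponse", NS)
        if response is None:
            continue
        # Use the rooftop point when present, else the interpolated one.
        point = response.find("g:RooftopLocation", NS)
        if point is None:
            point = response.find("g:InterpolatedLocation", NS)
        has_point = point is not None
        results.append({
            "id": entity.get("Id"),
            "name": response.get("DisplayName"),
            "confidence": response.get("Confidence"),
            "latitude": float(point.get("Latitude")) if has_point else None,
            "longitude": float(point.get("Longitude")) if has_point else None,
        })
    return results


if __name__ == "__main__":
    for record in read_results(SAMPLE_RESULTS):
        print(record)
```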

Our customers continue to use geocoding and location intelligence to create powerful all-in-one geospatial mapping experiences. Workers benefit from Bing Maps API’s geocoding capabilities in their everyday activities, quickly receiving geographic coordinates by entering text-based addresses for locations of interest.

As mentioned previously in this article, the latest updates on using the Geocode Dataflow API can be found here. The above code samples are now out of date and are for informational purposes only.

And that’s it! Beginner or expert, these 4 steps are all any developer needs to geocode large batches of location data. Create your own free Bing Maps API key to get started with geocoding today. Happy coding!
