A Starter’s Guide to International Address Validation


Not much gets done without a physical address, even on the internet. Whether for an e-commerce platform or a supply chain management system, businesses need physical addresses for their operations.

Verifying the accuracy of an address’s components (house number, street name, postal code, city, and state) is critical for businesses to reach their customers and deliver products and services efficiently. Different countries, and sometimes even states, have their own address formats, making international address validation a complex task. Moreover, street names, postal codes, and city names change more often than you might guess. Because it’s difficult to keep up with all those rules and changes internally, many businesses opt for an address validation service.

This article takes you through a few use cases for international address validation and shows you how international address validation works.

Why Do You Need International Address Validation?

Businesses need accurate customer addresses, but ensuring that accuracy gets tricky when dealing with international customers.

A country’s address format is usually standardized by several federal and state organizations, along with the country’s largest postal services. Different locations require different information in their addresses. In Japan, for example, a person’s postal address is written in order from the largest administrative or geographical unit to the smallest, with the name of the customer at the very end. However, Japan Post recommends that you write the address in reverse order if you’re using Roman script or English.

Address normalization across countries is another challenge in address validation. In Belgium, for example, the postal service recommends using abbreviations for multi-flat buildings and only if the line is longer than fifty characters. In France, the postal service recommends writing the last three lines in uppercase. Preferences like these are different in almost every country and postal system.

Deviating formats, incorrect mandatory fields like postal codes, outdated or incomplete addresses, and issues arising from translating one language to another are all inaccuracies that can lead to delays in service, incorrect billing, and other negative impacts on customer experience.

Additionally, the costs of being unable to reach your customers with service or billing updates, and of shipping something to an incorrect address, can add up. Back in 2015, it was already estimated that the cost of a bad address is between $35 and $70 per package, and that’s just for domestic mail. It’s easy to imagine how much higher that cost could be when including international costs like tariffs or overseas shipping.

Validated addresses also enable geospatial analysis, which can make your business more efficient in various ways. For instance, you can optimize delivery routes and timing as well as accurately calculate how many customers live in a specific area or within a certain distance of a location.

How Do You Validate an International Address?

First, acknowledge that there are many ways to write the same address. There are some commonalities, yes. For example, most address forms require a country. But as the example of Japan shows, even a single country can have more than one accepted way of writing an address.

To tackle the problem of validation, you need to understand a country’s address formats, along with their variations and exceptions.

The Components of an International Address

An international address has several components managed by different administrative bodies. The tool you use for address validation needs to understand:

  • Which components make up the address of a particular country
  • Which components are optional, or optional but recommended
  • Which components are mandatory

It’s tempting to think that catering for these variances is easy. You would simply add some optional fields and make them mandatory for certain countries, as Amazon.com did for an address in Japan in the example below:

[Image: Amazon.com address form for a delivery address in Japan]

This common solution hides the complexity of different address formats on the front end, but it doesn’t resolve the complexity of having to validate these addresses. It also introduces further challenges. For instance, if users don’t see an address in the format they are used to, they will not know where to enter which values.

So even though Amazon.com uses the format above when you want to deliver something to Japan, Amazon.co.jp uses the following format, which will make sense to someone from Japan:

[Image: Amazon.co.jp address form in the native Japanese format]

The Amazon.com example raises several questions. It’s not clear where to add the mandatory prefecture field, the 000-0000 postal code format isn’t enforced, and it’s unclear whether Chome, Banchi, and Go belong in the first or second street address line. Also, the native Japanese format moves from general to specific, like zooming in on a map, until you get to the address. Someone who is used to this format could easily get confused when they have to start at the street level and zoom out to the country level.
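The idea of per-country mandatory and optional components can be sketched as a small rule table. This is a minimal illustration, not an authoritative description of Japanese addressing: the field names, the `CountryRule` class, and the `check_address` function are assumptions made for this example, and only the 000-0000 postal code pattern and the mandatory prefecture come from the text above.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class CountryRule:
    """Illustrative description of one country's address form."""
    mandatory: tuple               # fields that must be filled in
    optional: tuple = ()           # fields that may be left empty
    postal_code_pattern: str = ""  # regex the postal code must match, if any


# Hypothetical rule table; real rules should come from a maintained data source.
RULES = {
    "JP": CountryRule(
        mandatory=("postal_code", "prefecture", "city", "street_address"),
        optional=("building",),
        postal_code_pattern=r"\d{3}-\d{4}",  # the 000-0000 format
    ),
}


def check_address(country: str, address: dict) -> list:
    """Return a list of problems; an empty list means the form passed."""
    rule = RULES.get(country)
    if rule is None:
        return [f"no rule defined for country {country!r}"]
    problems = [
        f"missing mandatory field: {name}"
        for name in rule.mandatory
        if not address.get(name, "").strip()
    ]
    code = address.get("postal_code", "").strip()
    if code and rule.postal_code_pattern and not re.fullmatch(rule.postal_code_pattern, code):
        problems.append(f"postal code {code!r} does not match the expected format")
    return problems
```

A check like this only confirms that the form is filled in plausibly. It says nothing about whether the address exists, which is where reference data comes in.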

Address Lookup by Country

Broadly speaking, there are two main methods for validating international addresses. Both require that you first look up the address format of the country in question. You can then either use vendor-provided APIs to validate an address or your own custom validators.

Let us examine both these methods.

Look Up Formatting through Third-Party APIs

Third-party vendors can validate your international address list, usually by making an API call. Some of these vendors respond with formatted and enriched data, while others use machine learning and artificial intelligence to correct potential errors within an address. Regardless of the methodology, the result you get is a trustworthy validation of an address.

A downside of this method is that making external API calls to validate addresses can significantly slow down your website, affecting the user’s experience. External APIs can also impose rate limits and come at a cost. This cost is sometimes hard to predict because it’s directly related to the number of calls you make.

Look Up Formatting through Raw Data

The alternative to using a third-party API is to write custom address validators. To do this, you need to take into account different geographies, exceptions to addressing rules, language barriers, and so on. You can then run your addresses through those validators and check each one against your raw reference data to see whether it is valid.

However, only using static, raw data to create the validators is problematic. Names of cities, streets, and counties keep changing. Their boundaries might also be reorganized every so often. If you use a static formatting data lookup method, you must keep the raw data updated to ensure you can validate addresses accurately.

You also need to keep accurate historical records since place names can change. In February 2021, the city of Port Elizabeth in South Africa was renamed Gqeberha. However, people who have lived there for years are not necessarily going to change their habits overnight. It might take them years to get used to using the new name. And if someone is shipping to them from abroad, they are unlikely to know about the name change. Your validation therefore needs records of both the current name and previous ones to validate addresses accurately.
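Keeping previous names alongside current ones can be as simple as an alias table that resolves either form to the current official name. The sketch below is illustrative: the table layout and the `resolve_city` function are assumptions, and only the Port Elizabeth entry comes from the text.

```python
# Former or alternative names mapped to the current official name.
# Only the Port Elizabeth entry comes from the text; a real table would be
# loaded from a maintained data source.
CITY_ALIASES = {
    ("ZA", "port elizabeth"): "Gqeberha",
}
CURRENT_CITIES = {
    ("ZA", "gqeberha"): "Gqeberha",
}


def resolve_city(country: str, city: str):
    """Return the current official city name, or None if the name is unknown."""
    key = (country.upper(), " ".join(city.split()).casefold())
    if key in CURRENT_CITIES:
        return CURRENT_CITIES[key]
    return CITY_ALIASES.get(key)
```

With this in place, an address that still says Port Elizabeth validates and can be stored under the current name.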

Because of these challenges, it’s best to use vendor-provided raw data. However, remember to refresh the data frequently to incorporate changes like renaming, new suburbs, and so forth.

What Do You Need to Build Your Own International Address Validator?

Creating an address validator is like creating an advanced pattern-matching program.

First off, there are many techniques you can use. For instance, deterministic matching involves exactly matching the different parts of an address against an up-to-date city and street-level database. You could also use Levenshtein distance calculation for partial or fuzzy matching, which tolerates small mistakes in the addresses, such as an extra blank space or a minor spelling mistake.
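A minimal sketch of both techniques follows, assuming the list of known streets has already been loaded from a reference data set. The function names and the `max_distance` threshold of 2 are choices made for this example. A distance of 0 corresponds to a deterministic (exact) match after whitespace and case are normalized, and a small positive distance is a fuzzy match.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # substitute or keep
            ))
        previous = current
    return previous[-1]


def match_street(user_input: str, known_streets: list, max_distance: int = 2):
    """Return (street, distance) for the closest known street, or None."""
    cleaned = " ".join(user_input.split()).casefold()  # drop extra blanks, ignore case
    best = None
    for street in known_streets:
        distance = levenshtein(cleaned, street.casefold())
        if best is None or distance < best[1]:
            best = (street, distance)
    if best is not None and best[1] <= max_distance:
        return best
    return None
```

For example, with the known streets `["Amoy Street", "Ann Siang Road", "Ann Siang Hill"]`, the input `"Amoy  Stret"` resolves to `("Amoy Street", 1)`. Scanning every street is fine for a sketch, but a large database would need an index or a pre-filter.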

Building a custom validator involves coding one or more of these matching techniques into a library. Different applications, services, and users can then use the library to validate addresses based on their own use cases.

Another non-negotiable component of an international address validator is a trustworthy, accurate data source, such as GeoPostcodes. Its postal database contains all the administrative divisions and postal codes for every country in the world, and its street database is a georeferenced and structured database that can help you validate street-level addresses.

To illustrate how you can use this data for your address validator, let’s look at some sample data for street addresses in Singapore. This data comes from the GeoPostcodes Download Center. The specific file this demo uses is GPC-STRT-GEO-SAMPLE-SG.csv.

Singapore download - GeoPostcodes

After importing this data into a Postgres database on db<>fiddle, you can run a couple of queries. For example, you can use this data to populate a drop-down list on your web frontend:

-- Getting all the valid street numbers for a specific street
SELECT "range"
  FROM Singapore
 WHERE "street" = 'Amoy Street'
 ORDER BY "range";
range
1
2
3
109
110
112
115
130
131
132
133
134
135
137

Alternatively, you could use a numeric edit box:

-- Getting the minimum and maximum street numbers for a numeric edit box
SELECT MIN("range") AS first_number
      ,MAX("range") AS last_number
  FROM Singapore
 WHERE "street" = 'Amoy Street';
first_number | last_number
1 | 137

You can also use this data to help autocomplete an address:

-- Getting street names for an auto-complete
SELECT DISTINCT "street"
  FROM Singapore
 WHERE "street" LIKE 'A%'
   AND "range" = 11
 ORDER BY "street";

In this example, when someone has entered a street number of 11 and starts typing A, they will get Amoy Street and Ann Siang Road but not Ann Siang Hill as that doesn’t have a number 11:

street
Amoy Street
Ann Siang Road

You can also use this data to validate a specific address, for example:

-- Validating a specific street address
SELECT "range"
  FROM Singapore
 WHERE "street" = 'Amoy Street'
   AND "range" = 15
 ORDER BY "range";

This query returns no rows (db<>fiddle reports “There are no results to be displayed.”) because there is no number 15 on Amoy Street in the data.
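In application code, the same lookup would typically be wrapped in a function that takes the street and number as parameters. The sketch below uses SQLite from the Python standard library instead of Postgres purely so that it runs without a database server. The table holds only a handful of the Amoy Street numbers shown earlier, not the full sample file, and `street_number_exists` is a name made up for this example.

```python
import sqlite3

# Illustrative in-memory table; the numbers are a subset of the Amoy Street
# output shown earlier, not the full GeoPostcodes sample file.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Singapore ("street" TEXT, "range" INTEGER)')
conn.executemany(
    'INSERT INTO Singapore ("street", "range") VALUES (?, ?)',
    [("Amoy Street", n) for n in (1, 2, 3, 109, 110, 112)],
)


def street_number_exists(connection, street: str, number: int) -> bool:
    """True if the street/number pair is present in the reference table."""
    row = connection.execute(
        'SELECT 1 FROM Singapore WHERE "street" = ? AND "range" = ? LIMIT 1',
        (street, number),
    ).fetchone()
    return row is not None
```

Passing the values as query parameters, rather than building the SQL string by hand, keeps user-entered street names from being interpreted as SQL.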

These are only some simple, illustrative examples. As you can see, having trustworthy data in the correct format reduces the amount of effort needed to validate international addresses.

Conclusion

Physical addresses aren’t going anywhere, and their impact on business is real, from logistics to marketing to sales. Ensuring the accuracy of the addresses you’re working with is critical to your business success, which is why address validation can’t be an afterthought.

This guide has shown how complex it can be to cater to addresses in different countries, and it highlighted some of the key considerations for ensuring you have accurate data. We also showcased how reliable data helps address some of the challenges with validation.

Instead of maintaining all that data, let GeoPostcodes handle it for you. Consider its highly maintained data sets so you can best serve your international customers.
