Example: Country and Country Subdivision with ISO code values.
Please download here the source spreadsheet files for import.
The following data is from the ISO 3166-1 country code standard. This information can be found on Wikipedia at https://en.wikipedia.org/wiki/ISO_3166-1. The state codes are from the ISO 3166-2:US standard found at https://en.wikipedia.org/wiki/ISO_3166-2:US.
Reference data must relate to a business term. This ensures that the technical information (the Code Values) relate back to business information (a Business Term). For example, a business user would never say “Good news everyone! We just opened up a new account in ISO-3166 US-NJ!” She would simply say “We opened a new account in New Jersey!”, because we all agree what “New Jersey” means. By creating relationships between the Business Terms and Code Values, the business user can say New Jersey and the developer can know to use Code Value “US-NJ”. Therefore, This exercise starts by creating glossaries for the countries and states.
- Create a new asset type to denote business terms for countries, named “Country” and one for states called “Country Subdivision.” There are various attributes that are unique to a country and country subdivision such as latitude and longitude. These special requirements means that Country will need be a specific type of business term. The operating model must be changed to support this new requirement. Go to Settings > Types > Asset Types > + Add Asset Type, to create the new asset types.
Country and Country Subdivision will be a special type of Business Term. Therefore, make the parent type as Business Term. This will result in two new Asset types under Business Term:
The new attribute types must be added to the new asset types of Country and Country Subdivision. Go to Settings -> Attributes ->Attributes -> +Add Attribute Type to create new attributes.
Always assign the new attribute type to an asset after its creation. Go to Settings > Attributes > Assign. The group name provides a label for all the attribute types assigned to the asset type. The attribute types are the newly created and existing attribute types Latitude, Longitude, Source System, etc. Type refers to the Asset Type to which you are assigning attributes.
Repeat these steps to create to assign all the newly created and existing attribute types to the Country and Country Subdivision asset types.
Repeating this process will result in the following new asset type assignments:
Next, assign these new asset types (Country and Country Subdivision) to a new domain type named Country Glossary. Country Glossary will be a special type of Business Asset Domain. Create the new domain types. Go to Settings > Types > +Add Domain Type.
Assign the new asset types (Country and Country Subdivision) to the new Domain Type (Country Glossary). Go to Settings-> Types > Assignment> Assign
This results in a new Domain Type (Country Glossary) with two new asset types assigned:
Finally, the two new domains can be created to contain the countries and states. Note this is not necessary but we do this to allow for a subdivision and a country having the same name.
Import the Country using the country tables shown above.
This is the result:
Import the Country Subdivision. This is the state information found in the ISO reference tables.
Here is the result:
All the necessary business metadata is now created. Now the reference data can be imported and related to the business terms.
Create two domains for the ISO country code values. In this example, there are two codes that can represent a country (2-digit,3-digit)
It is easiest to import the code values from within the Codelist domain. This is your importing domain. From within the ISO 3-digit Country Code domain, select the import button to start the import process.
Anytime you import an asset that is related to an asset contained in a different domain, you should explicitly declare the domain and asset type of the related asset. In this example, Country (eg. Algeria) is outside of the new codelist domain (ISO 3-digit Country Code). Therefore, the domain and type of the country must be provided in a column. The highlighted columns are only to give context to the import.
There are two different code lists, so this import will have to be repeated two times (once for for Alpha-2 code, once for Alpha-3 code).
Use the “Code Value Business Term Business Term” relation to map the code value to a business term.
This is the result for the two character country code.
Repeat this process for the ISO 3-digit Country Codes. Now the page for Afghanistan shows the two different ISO code styles.
It is now possible to build a hierarchy. Hierarchy views can be saved for quick reference in the future.
Next import the Country Subdivisions by following the same process. This is a good opportunity to create a relationship between the Country Subdivision and the Country. A Groups/Grouped By relationship exists between the Country and the State. The yellow columns provide instructions to Collibra for the import. They are not directly imported.
For the import, use the “Business Asset Grouped By Business Asset” relationship. This means that the country groups the state. For example, state Alabama is grouped by country United States.
Here is the result:
- Import the ISO reference data for state by following the same process as above.
Adding this grouping in the import makes the following hierarchy view possible.
This real world example results in a cohesive and traceable model that is suitable for data stewardship. Every information asset can be independently managed and provides context to its relationship with other assets. This approach to metadata management is also freely scalable.
The traceability diagram for Alabama shows that it has a code US-AL and is part of the United States.
One graph can provide the entire context of New York.
If a future system uses an ISO numeric identifier for countries rather than a 2-digit code, a third domain named ISO Numeric Country Code can be created and related to the Country business term. Technical assets could also be created to show where these codes live within systems, databases, tables and columns. Business rules can be created like “All countries must be represented by an ISO value” , and they can be enforced with data quality rules that test for representational consistency. The Collibra framework offers unlimited flexibility to operationalize data governance.
You have to login to comment.