Five indicators you should consider Azure Data Catalog to manage your metadata
Metadata is the neglected child of information management. It just doesn’t generate the same level of interest and executive buy-in as aspects like data quality and master data. Yet most organisations are quick to rattle off the ‘self-service’ mantra when it comes to their information management capabilities.
Surely self-service access to a centralised repository of trusted metadata should go hand-in-hand with access to the actual data?
What triggers demand for metadata?
There are many situations where business users may be seeking to interrogate your organisation’s metadata. When a new initiative is being launched, analysts are often looking to understand to what extent the organisation has the required data to support it. Other situations relate to executive-level triggers such as data audits or establishing the data currency of your organisation.
Yet the supply of metadata information can be far from optimal. Here are five scenarios that indicate you should consider modern tooling for managing your organisation’s metadata:
Indicator #1
When attempting to discover the data sets available within your organisation you are bounced between several different teams. Each team only offers a partial view of your organisation’s metadata or, more interestingly, two or more teams provide conflicting views.
Indicator #2
You obtain a metadata report but it is highly technical and lacks any business context. This is the opposite of self-service, as it requires the users who are seeking data to sit with the individuals who published the metadata report and have them translate the information.
Indicator #3
A very high-level metadata report is obtained but it doesn’t allow field-level analysis. Users who are seeking data often know which fields they want, so it is important to publish metadata at field-level granularity.
Indicator #4
You are provided with a metadata report, but it is nine months out of date. The most recently loaded data sets are often the ones users are seeking to interrogate, so stale metadata can cause confusion and erode trust in the teams that publish it.
Indicator #5
The management and publication of your organisation’s metadata is too labour-intensive to be sustainable given the targeted growth of data within your organisation.
Azure Data Catalog
Azure Data Catalog is a cloud-based service for managing organisational metadata. It is one of many data-related services on offer through Microsoft’s Azure platform and it appeals to both data engineers and business users.
Data engineers use Data Catalog because it can be rapidly provisioned, it has ‘out-of-the-box’ functionality for accelerating the registration of metadata and it allows business users to take responsibility for their metadata.
Data engineers can rapidly establish a metadata repository associated with their trusted data stores (both cloud and ‘on-prem’) through the data registration tool, which automatically extracts structural metadata (table definitions, field definitions, data types, etc.). Data Catalog also has REST API support for more sophisticated interfaces.
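To make ‘structural metadata’ concrete, here is a minimal sketch of the kind of information an automated registration step harvests. It is not the Data Catalog registration tool or its REST API; it simply reads table and field definitions from an in-memory SQLite database using Python’s standard library. The function name extract_structural_metadata and the customer table are assumptions made purely for illustration.

```python
import sqlite3


def extract_structural_metadata(conn):
    """Illustrative helper (hypothetical name): return a dict mapping each
    table name to a list of field definitions read from the database."""
    catalog = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table_name,) in tables:
        # PRAGMA table_info yields: cid, name, type, notnull, dflt_value, pk
        columns = conn.execute(f'PRAGMA table_info("{table_name}")').fetchall()
        catalog[table_name] = [
            {"field": name, "type": col_type, "nullable": not notnull}
            for _cid, name, col_type, notnull, _default, _pk in columns
        ]
    return catalog


# Example source: a stand-in for a trusted data store (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "customer_id INTEGER NOT NULL, full_name TEXT, signup_date TEXT)"
)

metadata = extract_structural_metadata(conn)
for table, fields in metadata.items():
    for f in fields:
        print(table, f["field"], f["type"], "nullable" if f["nullable"] else "required")
```

The point of the sketch is that table names, field names and data types can be read directly from the source system, so nobody has to type them by hand. That is what keeps registration fast and the published metadata in step with the data store.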
Business users perform the functions of discovery, enrichment and consumption of metadata. Enrichment is performed through annotations and the application of custom tags against your organisation’s data assets. Data Catalog also offers a business glossary where business terms and their associated definitions are mastered.
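The discover, enrich and consume cycle described above can be sketched as a toy in-memory model. This is an assumption-laden illustration, not Data Catalog’s actual data model or API: the DataAsset and Catalog classes, their method names and the sample asset are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataAsset:
    """A registered data asset with its structural and business metadata."""
    name: str
    columns: list
    description: str = ""                      # business-friendly annotation
    tags: set = field(default_factory=set)     # custom tags added by users


@dataclass
class Catalog:
    """Hypothetical toy catalog: assets plus a business glossary."""
    assets: dict = field(default_factory=dict)
    glossary: dict = field(default_factory=dict)

    def register(self, asset):
        self.assets[asset.name] = asset

    def annotate(self, name, description=None, tags=()):
        # Enrichment: business users add context to a registered asset.
        asset = self.assets[name]
        if description is not None:
            asset.description = description
        asset.tags.update(t.lower() for t in tags)

    def define_term(self, term, definition):
        # The glossary is where business terms are mastered.
        self.glossary[term.lower()] = definition

    def discover(self, keyword):
        # Discovery: match on name, description, tags or field names.
        kw = keyword.lower()
        return sorted(
            a.name
            for a in self.assets.values()
            if kw in a.name.lower()
            or kw in a.description.lower()
            or kw in a.tags
            or any(kw in c.lower() for c in a.columns)
        )


catalog = Catalog()
catalog.register(DataAsset("dbo.customer", ["customer_id", "full_name", "signup_date"]))

before = catalog.discover("churn")             # nothing matches yet
catalog.annotate(
    "dbo.customer",
    description="One row per customer, used for retention reporting.",
    tags=["Churn", "Marketing"],
)
catalog.define_term("Churn", "A customer who has not purchased in 12 months.")
after = catalog.discover("churn")              # now found via the custom tag

print(before, after)
```

In this sketch the asset only becomes discoverable under a business term once a business user has tagged it. That is the practical argument for letting business users own enrichment rather than leaving it to the team that registered the source.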