The need for Data Governance (DG) is well established: it has become one of the key initiatives for organizations that manage data. This blog compares Data Governance in the digital era with traditional Data Governance practices.
DG Process and Outcome
Traditional: Business stakeholders for the Data Governance organization are clearly identified. DG financial benefits are quantified based on the overall data program. The focus of efforts is based on the attributes and classification of the data.
Digital: With the growing variety and volume of data, it is very difficult to identify all business stakeholders for big data governance processes. Identifying and managing the attributes of the data also becomes challenging, so DG processes may need to shift their focus from attributes to sources and let the source of data generation guide them.
Organizational Structures and Stewardship
Traditional: The organizational structure for DG is defined with representation from both business and IT once the organization's overall data strategy is in place. Stewardship is relatively simple, with stewards assigned based on the system of record.
Digital: Organizations struggle to define and prioritize DG initiatives in the big data/digital world. The DG council requires additional roles, such as the Chief Data Officer, Data Scientists, and the Legal team. Stewardship is extended at the enterprise level to cover machine/sensor, bio, geospatial, and other newer data sources.
Policy
Traditional: Policies are limited to onboarding and consumption of internal/external structured data stored in operational data stores and data warehouses.
Digital: Standards and policies should be defined to evaluate data use cases for business value realization and to govern the onboarding and consumption of semi-structured and unstructured data. Policies need to evaluate and define “Fit for Use” data.
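One way to picture a "Fit for Use" policy is as a per-use-case check rather than a single enterprise-wide rule. The sketch below is only an illustration under assumptions: the names `DatasetProfile`, `UseCasePolicy`, and `fit_for_use`, and the thresholds used, are hypothetical and do not come from any particular DG tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetProfile:
    """Hypothetical summary of a data set being onboarded."""
    name: str
    data_format: str        # "structured", "semi-structured", or "unstructured"
    completeness: float     # share of populated fields, between 0 and 1
    has_known_source: bool  # whether the generating source is identified


@dataclass(frozen=True)
class UseCasePolicy:
    """Hypothetical policy: what one use case needs from its data."""
    use_case: str
    allowed_formats: frozenset
    min_completeness: float
    requires_known_source: bool = True


def fit_for_use(profile, policy):
    """Return (is_fit, reasons); the same data can be fit for one use case and not another."""
    reasons = []
    if profile.data_format not in policy.allowed_formats:
        reasons.append(f"format '{profile.data_format}' not allowed")
    if profile.completeness < policy.min_completeness:
        reasons.append("completeness below policy minimum")
    if policy.requires_known_source and not profile.has_known_source:
        reasons.append("source of data is unknown")
    return (not reasons, reasons)


# Illustrative values only.
clickstream = DatasetProfile("clickstream", "semi-structured", 0.7, True)
trend_analysis = UseCasePolicy(
    "trend_analysis", frozenset({"semi-structured", "unstructured"}), 0.6
)
regulatory_reporting = UseCasePolicy(
    "regulatory_reporting", frozenset({"structured"}), 0.99
)

print(fit_for_use(clickstream, trend_analysis))
print(fit_for_use(clickstream, regulatory_reporting))
```

The point of the design is that fitness is judged against the use case: the same clickstream data passes the trend analysis policy and fails the stricter regulatory one.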
Data Quality Management
Traditional: Data quality business/technical rules can be easily defined, analyzed/corrected, and measured. Data Quality rules are deterministic in nature.
Digital: Data quality processes need to match the speed of the data, which requires quality tollgates in both the cold path (batch) and the hot path (streaming). Data quality is no longer restrictive: it becomes a qualitative assessment rather than rule-based filtering.
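The difference between deterministic filtering and qualitative qualification can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical sensor record with `device_id`, `timestamp`, and `reading` fields; the function names, the penalty of 0.25, and the review threshold are all made up for the example.

```python
from dataclasses import dataclass, field

# Hypothetical required fields for an illustrative sensor record.
REQUIRED_FIELDS = ("device_id", "timestamp", "reading")


@dataclass
class Record:
    payload: dict
    quality_score: float = 1.0
    quality_notes: list = field(default_factory=list)


def traditional_filter(payloads):
    """Deterministic rule: reject any record that misses a required field."""
    return [
        p for p in payloads
        if all(p.get(name) is not None for name in REQUIRED_FIELDS)
    ]


def hot_path_tollgate(payload):
    """Cheap per-record check for streaming data: annotate, never drop."""
    record = Record(payload)
    for name in REQUIRED_FIELDS:
        if payload.get(name) is None:
            record.quality_score -= 0.25  # illustrative penalty
            record.quality_notes.append(f"missing {name}")
    return record


def cold_path_tollgate(records, min_score=0.5):
    """Batch check: separate records that are usable from those needing review."""
    fit = [r for r in records if r.quality_score >= min_score]
    review = [r for r in records if r.quality_score < min_score]
    return fit, review


payloads = [
    {"device_id": "a", "timestamp": 1, "reading": 2.0},
    {"device_id": "b", "timestamp": None, "reading": 3.0},
    {"device_id": None, "timestamp": None, "reading": None},
]

kept = traditional_filter(payloads)
scored = [hot_path_tollgate(p) for p in payloads]
fit, review = cold_path_tollgate(scored)
print(len(kept), len(fit), len(review))
```

In this sketch the traditional filter keeps only the fully populated record, while the tollgates keep every record and attach a score, so a partially populated reading remains available to consumers that can tolerate it.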
Ownership
Traditional: Format and schema of the data are defined, and the ownership of the data is confined to the enterprise as most data is generated internally.
Digital: Data sources such as social data (Facebook, Twitter, ecommerce sites), GPS/mobile data, RFID, sensors, and devices require a change in the ownership structure of data within the enterprise.
Operational Metadata and Other Attributes
Traditional: Operational metadata such as timeliness and velocity is less important because data ultimately resides in data warehouses. The focus is on completeness and consistency.
Digital: Timeliness and velocity of data become important because data that is not captured in time may be lost forever. There is less focus on completeness and consistency, as the source and the information in the data are more important than any particular record.
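Timeliness and velocity can be captured as operational metadata alongside the data itself. The sketch below is one possible way to do that, assuming each event carries an event time and an ingestion time; the function names and the 30-second lag tolerance are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def timeliness(event_time, ingested_at, max_lag=timedelta(seconds=30)):
    """Describe how late a record arrived relative to when it was generated."""
    lag = ingested_at - event_time
    return {"lag_seconds": lag.total_seconds(), "timely": lag <= max_lag}


def velocity(event_times):
    """Events per second over the observed window (0.0 if the window is empty)."""
    if len(event_times) < 2:
        return 0.0
    span = (max(event_times) - min(event_times)).total_seconds()
    if span <= 0:
        return 0.0
    return len(event_times) / span


# Illustrative timestamps only.
start = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
late = timeliness(start, start + timedelta(seconds=45))
on_time = timeliness(start, start + timedelta(seconds=10))
rate = velocity([start + timedelta(seconds=s) for s in (0, 1, 2, 4)])
print(late, on_time, rate)
```

Recording the lag per record, rather than only checking completeness afterwards, makes it possible to spot a source that is falling behind before its data is lost.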
These are the key points organizations need to consider when approaching data governance in the digital era. If you have other ideas, feel free to reach out.