In modern, data-driven busienss, it’s integral that organizations understand the reasons for bad data and how best to address them. Data has revolutionized how organizations operate, from customer relationships to strategic decision-making and everything in between. And with more emphasis on automation and artificial intelligence, the need for data/digital trust also has risen. Even minor errors in an organization’s data can cause massive headaches because the inaccuracies don’t involve just one corrupt data unit.
Inaccurate or “bad” data also affects relationships to other units of data, making the business context difficult or impossible to determine. For example, are data units tagged according to their sensitivity [i.e., personally identifiable information subject to the General Data Protection Regulation (GDPR)], and is data ownership and lineage discernable (i.e., who has access, where did it originate)?
Relying on inaccurate data will hamper decisions, decrease productivity, and yield suboptimal results. Given these risks, organizations must increase their data’s integrity. But how?
Integrated Data Governance
Modern, data-driven organizations are essentially data production lines. And like physical production lines, their associated systems and processes must run smoothly to produce the desired results. Sound data governance provides the framework to address data quality at its source, ensuring any data recorded and stored is done so correctly, securely and in line with organizational requirements. But it needs to integrate all the data disciplines.
By integrating data governance with enterprise architecture, businesses can define application capabilities and interdependencies within the context of their connection to enterprise strategy to prioritize technology investments so they align with business goals and strategies to produce the desired outcomes. A business process and analysis component enables an organization to clearly define, map and analyze workflows and build models to drive process improvement, as well as identify business practices susceptible to the greatest security, compliance or other risks and where controls are most needed to mitigate exposures.
And data modeling remains the best way to design and deploy new relational databases with high-quality data sources and support application development. Being able to cost-effectively and efficiently discover, visualize and analyze “any data” from “anywhere” underpins large-scale data integration, master data management, Big Data and business intelligence/analytics with the ability to synthesize, standardize and store data sources from a single design, as well as reuse artifacts across projects.
Let’s look at some of the main reasons for bad data and how data governance helps confront these issues …
Reasons for Bad Data: Data Entry
The concept of “garbage in, garbage out” explains the most common cause of inaccurate data: mistakes made at data entry. While this concept is easy to understand, totally eliminating errors isn’t feasible so organizations need standards and systems to limit the extent of their damage.
With the right data governance approach, organizations can ensure the right people aren’t left out of the cataloging process, so the right context is applied. Plus you can ensure critical fields are not left blank, so data is recorded with as much context as possible.
With the business process integration discussed above, you’ll also have a single metadata repository.
All of this ensures sensitive data doesn’t fall through the cracks.
Reasons for Bad Data: Data Migration
Data migration is another key reason for bad data. Modern organizations often juggle a plethora of data systems that process data from an abundance of disparate sources, creating a melting pot for potential issues as data moves through the pipeline, from tool to tool and system to system.
The solution is to introduce a predetermined standard of accuracy through a centralized metadata repository with data governance at the helm. In essence, metadata describes data about data, ensuring that no matter where data is in relation to the pipeline, it still has the necessary context to be deciphered, analyzed and then used strategically.
The potential fallout of using inaccurate data has become even more severe with the GDPR’s implementation. A simple case of tagging and subsequently storing personally identifiable information incorrectly could lead to a serious breach in compliance and significant fines.
Such fines must be considered along with the costs resulting from any PR fallout.
Reasons for Bad Data: Data Integration
The proliferation of data sources, types, and stores increases the challenge of combining data into meaningful, valuable information. While companies are investing heavily in initiatives to increase the amount of data at their disposal, most information workers are spending more time finding the data they need rather than putting it to work, according to Database Trends and Applications (DBTA). erwin is co-sponsoring a DBTA webinar on this topic on July 17. To register, click here.
The need for faster and smarter data integration capabilities is growing. At the same time, to deliver business value, people need information they can trust to act on, so balancing governance is absolutely critical, especially with new regulations.
Organizations often invest heavily in individual software development tools for managing projects, requirements, designs, development, testing, deployment, releases, etc. Tools lacking inter-operability often result in cumbersome manual processes and heavy time investments to synchronize data or processes between these disparate tools.
Data integration combines data from several various sources into a unified view, making it more actionable and valuable to those accessing it.
Getting the Data Governance “EDGE”
The benefits of integrated data governance discussed above won’t be realized if it is isolated within IT with no input from other stakeholders, the day-to-day data users – from sales and customer service to the C-suite. Every data citizen has DG roles and responsibilities to ensure data units have context, meaning they are labeled, cataloged and secured correctly so they can be analyzed and used properly. In other words, the data can be trusted.
Once an organization understands that IT and the business are both responsible for data, it can develop comprehensive, holistic data governance capable of:
- Reaching every stakeholder in the process
- Providing a platform for understanding and governing trusted data assets
- Delivering the greatest benefit from data wherever it lives, while minimizing risk
- Helping users understand the impact of changes made to a specific data element across the enterprise.
To reduce the risks of and tackle the reasons for bad data and realize larger organizational objectives, organizations must make data governance everyone’s business.
To learn more about the collaborative approach to data governance and how it helps compliance in addition to adding value and reducing costs, get the free e-book here.