Categories
erwin Expert Blog Data Governance Data Intelligence

Demystifying Data Lineage: Tracking Your Data’s DNA

Getting the most out of your data requires getting a handle on data lineage. That’s knowing what data you have, where it is, and where it came from – plus understanding its quality and value to the organization.

But you can’t understand your data in a business context, much less track its lineage and physical existence or maximize its security, quality and value, if it’s scattered across different silos in numerous applications.

Data lineage provides a way of tracking data from its origin to destination across its lifespan and all the processes it’s involved in. It also plays a vital role in data governance. Beyond the simple ability to know where the data came from and whether or not it can be trusted, there’s an element of statutory reporting and compliance that often requires a knowledge of how that same data (known or unknown, governed or not) has changed over time.

A platform that provides insights like data lineage, impact analysis, full-history capture, and other data management features serves as a central hub from which everything can be learned and discovered about the data – whether a data lake, a data vault or a traditional data warehouse.

In a traditional data management organization, Excel spreadsheets are used to manage the incoming data design, what’s known as the “pre-ETL” mapping documentation, but this does not provide any sort of visibility or auditability. In fact, each unit of work represented in these ‘mapping documents’ becomes an independent variable in the overall system development lifecycle, and is therefore nearly impossible to learn from, much less standardize.

The key to accuracy and integrity in any exercise is to eliminate the opportunity for human error – which does not mean eliminating humans from the process but incorporating the right tools to reduce the likelihood of error as the human beings apply their thought processes to the work.


Data Lineage: A Crucial First Step for Data Governance

Knowing what data you have, where it lives and where it came from is complicated. The lack of visibility and control around “data at rest” combined with “data in motion,” as well as difficulties with legacy architectures, means organizations spend more time finding the data they need than using it to produce meaningful business outcomes.

Organizations need to create and sustain an enterprise-wide view of and easy access to underlying metadata, but that’s a tall order with numerous data types and data sources that were never designed to work together and data infrastructures that have been cobbled together over time with disparate technologies, poor documentation and little thought for downstream integration. So the applications and initiatives that depend on a solid data infrastructure may be compromised, resulting in faulty analyses.

These issues can be addressed with a strong data management strategy underpinned by technology that enables the data quality the business requires, which encompasses data cataloging (integration of data sets from various sources), mapping, versioning, business rules and glossaries maintenance and metadata management (associations and lineage).

An automated, metadata-driven framework for cataloging data assets and their flows across the business provides an efficient, agile and dynamic way to generate data lineage from operational source systems (databases, data models, file-based systems, unstructured files and more) across the information management architecture; construct business glossaries; assess what data aligns with specific business rules and policies; and inform how that data is transformed, integrated and federated throughout business processes – complete with full documentation.

Centralized design, immediate lineage and impact analysis, and change-activity logging mean you will always have answers readily available, or just a few clicks away. Subsets of data can be identified and generated via predefined templates, generic designs can be generated from standard mapping documents, and both can be pushed via the ETL process for faster processing via automation templates.
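To make lineage and impact analysis concrete, here is a minimal Python sketch. It assumes the mappings are available as simple source-to-target pairs; the element names and the `build_index` and `trace` helpers are hypothetical illustrations, not part of any erwin product.

```python
from collections import defaultdict

# Hypothetical source-to-target mappings: (source element, target element).
MAPPINGS = [
    ("crm.customer.email", "staging.customer.email"),
    ("staging.customer.email", "warehouse.dim_customer.email"),
    ("erp.orders.amount", "warehouse.fact_sales.amount"),
    ("warehouse.dim_customer.email", "reports.churn.contact"),
]

def build_index(mappings):
    """Index the mappings in both directions for traversal."""
    upstream, downstream = defaultdict(set), defaultdict(set)
    for source, target in mappings:
        upstream[target].add(source)
        downstream[source].add(target)
    return upstream, downstream

def trace(element, index):
    """Return every element reachable from `element` through `index`."""
    seen, stack = set(), [element]
    while stack:
        for neighbor in index[stack.pop()]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen

upstream, downstream = build_index(MAPPINGS)
lineage = trace("reports.churn.contact", upstream)   # where did it come from?
impact = trace("crm.customer.email", downstream)     # what would a change affect?
```

Walking the same mapping records in one direction answers the lineage question and in the other answers the impact question, which is why keeping the mappings in a single repository matters.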

With automation, data quality is systemically assured and the data pipeline is seamlessly governed and operationalized to the benefit of all stakeholders. Without such automation, business transformation will be stymied. Companies, especially large ones with thousands of systems, files and processes, will be particularly challenged by a manual approach. And outsourcing these data management efforts to professional services firms only increases costs and schedule delays.

With erwin Mapping Manager, organizations can automate enterprise data mapping and code generation for faster time-to-value and greater accuracy when it comes to data movement projects, as well as synchronize “data in motion” with data management and governance efforts.

Map data elements to their sources within a single repository to determine data lineage, deploy data warehouses and other Big Data solutions, and harmonize data integration across platforms. The web-based solution reduces the need for specialized, technical resources with knowledge of ETL and database procedural code, while making it easy for business analysts, data architects, ETL developers, testers and project managers to collaborate for faster decision-making.


Six Reasons Business Glossary Management Is Crucial to Data Governance

A business glossary is crucial to any data governance strategy, yet it is often overlooked.

Consider this – no one likes unpleasant surprises, especially in business. So when it comes to objectively understanding what’s happening from the top of the sales funnel to the bottom line of finance, everyone wants – and needs – to trust the data they have.

That’s why you can’t underestimate the importance of a business glossary. Sometimes the business folks say IT or marketing speaks a different language. Or in the case of mergers and acquisitions, different companies call the same thing something else.

A business glossary solves this complexity by creating a common business vocabulary. Regardless of the industry you’re in or the type of data initiative you’re undertaking, the ability for an organization to have a unified, common language is a key component of data governance, ensuring you can trust your data.

Are we speaking the same language?

How can two reports show different results for the same region? A quick analysis of invoices will likely reveal that some of the data fed into the report wasn’t based on a clear understanding of business terms.


In such embarrassing scenarios, a business glossary and its ongoing management have obvious significance. And with the complexity of today’s business environment, organizations need the right solution to make sense out of their data and govern it properly.

Here are six reasons a business glossary is vital to data governance:

  1. Bridging the gap between Business & IT

A sound data governance initiative bridges the gap between the business and IT. By understanding the underlying metadata associated with business terms and the associated data lineage, a business glossary helps bridge this gap to deliver greater value to the organization.

  2. Integrated search

The biggest appeal of business glossary management is that it helps establish relationships between business terms to drive data governance across the entire organization. A good business glossary should provide an integrated search feature that can find context-specific results, such as business terms, definitions, technical metadata, KPIs and process areas.

  3. The ability to capture business terms and all associated artifacts

What good is a business term if it can’t be associated with other business terms and KPIs? Capturing relationships between business terms as well as between technical and business entities is essential in today’s regulatory and compliance-conscious environment. A business glossary defines the relationship between the business terms and their underlying metadata for faster analysis and enhanced decision-making.

  4. Integrated project management and workflow

When the business and cross-functional teams operate in silos, users start defining business terms according to their own preferences rather than following standard policies and best practices. To be effective, a business glossary should enable a collaborative workflow management and approval process so stakeholders have visibility with established data governance roles and responsibilities. With this ability, business glossary users can provide input during the entire data definition process prior to publication.

  5. The ability to publish business terms

Successful businesses not only capture business terms and their definitions, they also publish them so that the business-at-large can access them. Business glossary users, who are typically members of the data governance team, should be assigned roles for creating, editing, approving and publishing business glossary content. A workflow feature will show which users are assigned which roles, including those with publishing permissions.

After initial publication, business glossary content can be revised and republished on an ongoing basis, based on the needs of your enterprise.

  6. End-to-end traceability

Capturing business terms and establishing relationships are key to glossary management. However, it is far from a complete solution without traceability. A good business glossary can help generate enterprise-level traceability in the form of mind maps or tabular reports to the business community once relationships have been established.
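The role-based workflow described in the list above (create, edit, approve, publish, then revise and republish) can be sketched as a small state machine. This is an illustrative assumption only: the status names, role names and the `GlossaryTerm` class are hypothetical and do not describe any specific glossary tool.

```python
from dataclasses import dataclass, field

# Hypothetical status changes and the role allowed to perform each one.
TRANSITIONS = {
    ("draft", "in_review"): "editor",
    ("in_review", "approved"): "approver",
    ("in_review", "draft"): "approver",      # send back for rework
    ("approved", "published"): "publisher",
    ("published", "draft"): "editor",        # reopen for revision
}

@dataclass
class GlossaryTerm:
    name: str
    definition: str
    status: str = "draft"
    history: list = field(default_factory=list)

    def move_to(self, new_status, user, role):
        """Apply a status change if the user's role permits it."""
        required = TRANSITIONS.get((self.status, new_status))
        if required is None:
            raise ValueError(f"{self.status} -> {new_status} is not a valid step")
        if role != required:
            raise PermissionError(f"{new_status} requires the {required} role")
        self.history.append((self.status, new_status, user))
        self.status = new_status

term = GlossaryTerm("Net Revenue", "Gross revenue minus returns and discounts.")
term.move_to("in_review", "ana", "editor")
term.move_to("approved", "raj", "approver")
term.move_to("published", "lee", "publisher")
```

The `history` list doubles as a simple audit trail, showing who moved the term through each step before publication.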

Business Glossary, the Heart of Data Governance

With a business glossary at the heart of your regulatory compliance and data governance initiatives, you can help break down organizational and technical silos for data visibility, context, control and collaboration across domains. It ensures that you can trust your data.

Plus, you can unify the people, processes and systems that manage and protect data through consistent exchange, understanding and processing to increase quality and trust.

By building a glossary of business terms in taxonomies with synonyms, acronyms and relationships, and publishing approved standards and prioritizing them, you can map data in all its forms to the central catalog of data elements.

That answers the vital question of “where is our data?” Then you can understand who and what is using your data to ensure adherence to usage standards and rules.
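As a rough illustration of how a glossary with synonyms can answer “where is our data?”, consider the following Python sketch. The `Term` and `Glossary` classes, the sample term and the column names are all hypothetical; a real catalog would hold far richer metadata.

```python
from dataclasses import dataclass, field

@dataclass
class Term:
    name: str
    definition: str
    synonyms: set = field(default_factory=set)
    data_elements: set = field(default_factory=set)  # mapped physical columns

class Glossary:
    def __init__(self):
        self._by_label = {}

    def add(self, term):
        # Register the term under its name and every synonym or acronym.
        for label in {term.name, *term.synonyms}:
            self._by_label[label.lower()] = term

    def where_is(self, label):
        """Resolve a name, synonym or acronym to its mapped data elements."""
        term = self._by_label.get(label.lower())
        return sorted(term.data_elements) if term else []

glossary = Glossary()
glossary.add(Term(
    name="Customer",
    definition="A party that has purchased at least one product.",
    synonyms={"Client", "Account Holder"},
    data_elements={"crm.customer.id", "warehouse.dim_customer.customer_key"},
))
locations = glossary.where_is("client")
```

Because every synonym resolves to the same term, two teams using different words for the same concept end up at the same data elements.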


Why Data Governance is the Key to Better Decision-Making

The ability to quickly collect vast amounts of data, analyze it, and then use what you’ve learned to help foster better decision-making is the dream of many a business executive. But like any number of things that can be summarized in a single sentence, it’s much harder to execute on such a vision than it might first appear.

According to Forrester, 74 percent of firms say they want to be “data-driven,” but only 29 percent say they are good at connecting analytics to action. Consider this: Forrester found that business satisfaction with analytics dropped by 21 percent between 2014 and 2015 – a period of great promise and great investment in Big Data. In other words, the more data businesses were collecting and mining, the less happy they were with their analytics.

A number of factors are potentially at play here, including the analytics software, the culture of the business, and the skill sets of the people using the data. But your analytics applications and the conclusions you draw from your analysis are only as good as the data that is collected and analyzed. Collecting, safeguarding and mining large amounts of data isn’t an inexpensive exercise, and as the saying goes, “garbage in, garbage out.”

“It’s a big investment and if people don’t trust data, they won’t use things like business intelligence tools because they won’t have faith in what they tell them,” says Danny Sandwell, director of product marketing at erwin, Inc.

Using data to inform business decisions is hardly new, of course. The modern idea of market research dates back to the 1920s, and ever since businesses have collected, analyzed and drawn conclusions from information they draw from customers or prospective customers.

The difference today, as you might expect, is the amount of data and how it’s collected. Data is generated by machines large and small, by people, and by old-fashioned market research. It enters today’s businesses from all angles, at lightning speed, and can, in many cases, be available for instant analysis.

As the volume and velocity of data increase, overload becomes a potential problem. Unless the business has a strategic plan for data governance, decisions around where the data is stored, who and what can access it, and how it can be used become increasingly difficult to understand.

Not every business collects massive amounts of data like Facebook and Yahoo, but recent headlines demonstrate how those companies’ inability to govern data is harming their reputations and bottom lines. For Facebook, it was the revelation that the data of 87 million users was improperly obtained to influence the 2016 U.S. presidential election. For Yahoo, the U.S. Securities and Exchange Commission (SEC) levied a $35 million fine for failure to disclose a data breach in a timely manner.

In both the Facebook and Yahoo cases, the misuse or failure to protect data was one problem. Their inability to quickly quantify the scope of the problem and disclose the details made a big issue even worse – and kept it in the headlines even longer.

The issues of data security, data privacy and data governance may not be top of mind for some business users, but these issues manifest themselves in a number of ways that affect what they do on a daily basis. Think of it this way: somewhere in all of the data your organization collects, a piece of information that can support or refute a decision you’re about to make is likely there. Can you find it? Can you trust it?

If the answer to these questions is “no,” then it won’t be easy for your organization to make data-driven decisions.


Powering Better Decision-Making with Data Governance

Nearly half (45 percent) of the respondents to a November 2017 survey by erwin and UBM said better decision-making was one of the factors driving their data governance initiatives.

Data governance helps businesses understand what data they have, how good it is, where it is, and how it’s used. A lot of people are talking about data governance today, and some are putting that talk into action. The erwin/UBM survey found that 52 percent of respondents say data is critically important to their organization and they have a formal data governance strategy in place. But almost as many respondents (46 percent) say they recognize the value of data to their organization but don’t have a formal governance strategy.

Many early attempts at instituting data governance failed to deliver results. They were narrowly focused, and their proponents often had difficulty articulating the value of data governance to the organization, making it difficult to secure budget. Some organizations even understood data governance as a type of data security, locking up data so tightly that the people who wanted to use it to foster better decision-making had trouble getting access.

Issues of ownership also stymied early data governance efforts, as IT and the business couldn’t agree on which side was responsible for a process that affects both on a regular basis. Today, organizations are better equipped to resolve issues of ownership, thanks in large part to a new corporate structure that recognizes how important data is to modern businesses. Roles like chief data officer (CDO), which increasingly sits on the business side, and the data protection officer (DPO), are more common than they were a few years ago.

A modern data governance strategy works a lot like data itself – it permeates the business and its infrastructure. It is part of the enterprise architecture and the business processes, and it helps organizations better understand the relationships between data assets using techniques like visualization. Perhaps most important, a modern approach to data governance is ongoing, because organizations and their data are constantly changing and transforming, so their approach to data governance can’t sit still.

As you might expect, better visibility into your data goes a long way toward using that data to make more informed decisions. There is, however, another advantage to the visibility offered by a holistic data governance strategy: it helps you better understand what you don’t know.

By helping businesses understand the areas where they can improve their data collection, data governance helps organizations continually work to create better data, which manifests itself in real business advantages, like better decision-making and top-notch customer experiences, all of which will help grow the business.

Michael Pastore is the Director, Content Services at QuinStreet B2B Tech. This content originally appeared as a sponsored post on http://www.eweek.com/.


You can determine how effective your current data governance initiative is by taking erwin’s DG RediChek.


Data Governance Tackles the Top Three Reasons for Bad Data

In modern, data-driven business, it’s essential that organizations understand the reasons for bad data and how best to address them. Data has revolutionized how organizations operate, from customer relationships to strategic decision-making and everything in between. And with more emphasis on automation and artificial intelligence, the need for data/digital trust also has risen. Even minor errors in an organization’s data can cause massive headaches because the inaccuracies don’t involve just one corrupt data unit.

Inaccurate or “bad” data also affects relationships to other units of data, making the business context difficult or impossible to determine. For example, are data units tagged according to their sensitivity [i.e., personally identifiable information subject to the General Data Protection Regulation (GDPR)], and is data ownership and lineage discernable (i.e., who has access, where did it originate)?

Relying on inaccurate data will hamper decisions, decrease productivity, and yield suboptimal results. Given these risks, organizations must increase their data’s integrity. But how?

Integrated Data Governance

Modern, data-driven organizations are essentially data production lines. And like physical production lines, their associated systems and processes must run smoothly to produce the desired results. Sound data governance provides the framework to address data quality at its source, ensuring any data is recorded and stored correctly, securely and in line with organizational requirements. But it needs to integrate all the data disciplines.

By integrating data governance with enterprise architecture, businesses can define application capabilities and interdependencies within the context of their connection to enterprise strategy to prioritize technology investments so they align with business goals and strategies to produce the desired outcomes. A business process and analysis component enables an organization to clearly define, map and analyze workflows and build models to drive process improvement, as well as identify business practices susceptible to the greatest security, compliance or other risks and where controls are most needed to mitigate exposures.

And data modeling remains the best way to design and deploy new relational databases with high-quality data sources and support application development. Being able to cost-effectively and efficiently discover, visualize and analyze “any data” from “anywhere” underpins large-scale data integration, master data management, Big Data and business intelligence/analytics with the ability to synthesize, standardize and store data sources from a single design, as well as reuse artifacts across projects.

Let’s look at some of the main reasons for bad data and how data governance helps confront these issues …


Reasons for Bad Data: Data Entry

The concept of “garbage in, garbage out” explains the most common cause of inaccurate data: mistakes made at data entry. While this concept is easy to understand, totally eliminating errors isn’t feasible so organizations need standards and systems to limit the extent of their damage.

With the right data governance approach, organizations can ensure the right people aren’t left out of the cataloging process, so the right context is applied. Plus you can ensure critical fields are not left blank, so data is recorded with as much context as possible.
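A check for blank critical fields can be as simple as the sketch below. The field names in `REQUIRED_FIELDS` and the `missing_fields` helper are hypothetical; which fields count as critical is a decision for each organization’s governance policy.

```python
# Hypothetical list of fields a governance policy marks as critical.
REQUIRED_FIELDS = ("customer_id", "email", "country", "data_owner")

def missing_fields(record, required=REQUIRED_FIELDS):
    """List required fields that are absent, None, or blank."""
    missing = []
    for name in required:
        value = record.get(name)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(name)
    return missing

record = {"customer_id": 1042, "email": "  ", "country": "DE"}
problems = missing_fields(record)  # fields to fix before the record is accepted
```

Running such a check at the point of entry, rather than downstream, keeps incomplete records from spreading through the pipeline.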

With the business process integration discussed above, you’ll also have a single metadata repository.

All of this ensures sensitive data doesn’t fall through the cracks.

Reasons for Bad Data: Data Migration

Data migration is another key reason for bad data. Modern organizations often juggle a plethora of data systems that process data from an abundance of disparate sources, creating a melting pot for potential issues as data moves through the pipeline, from tool to tool and system to system.

The solution is to introduce a predetermined standard of accuracy through a centralized metadata repository with data governance at the helm. In essence, metadata is data about data, ensuring that no matter where data is in the pipeline, it still has the necessary context to be deciphered, analyzed and then used strategically.

The potential fallout of using inaccurate data has become even more severe with the GDPR’s implementation. A simple case of tagging and subsequently storing personally identifiable information incorrectly could lead to a serious breach in compliance and significant fines.

Such fines must be considered along with the costs resulting from any PR fallout.

Reasons for Bad Data: Data Integration

The proliferation of data sources, types, and stores increases the challenge of combining data into meaningful, valuable information. While companies are investing heavily in initiatives to increase the amount of data at their disposal, most information workers are spending more time finding the data they need than putting it to work, according to Database Trends and Applications (DBTA). erwin is co-sponsoring a DBTA webinar on this topic on July 17.

The need for faster and smarter data integration capabilities is growing. At the same time, to deliver business value, people need information they can trust to act on, so balancing governance is absolutely critical, especially with new regulations.

Organizations often invest heavily in individual software development tools for managing projects, requirements, designs, development, testing, deployment, releases, etc. Tools lacking inter-operability often result in cumbersome manual processes and heavy time investments to synchronize data or processes between these disparate tools.

Data integration combines data from various sources into a unified view, making it more actionable and valuable to those accessing it.
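As a minimal sketch of what a unified view means in practice, the Python below merges records from two hypothetical sources on a shared key. The `unify` function, the source names and the sample records are assumptions for illustration; real integration also has to handle conflicting values and mismatched keys.

```python
def unify(sources, key="customer_id"):
    """Merge records from several sources into one view per key.

    Later sources fill in fields that earlier ones lack; each unified
    record keeps a list of the sources that contributed to it.
    """
    unified = {}
    for source_name, records in sources.items():
        for record in records:
            merged = unified.setdefault(record[key], {"_sources": []})
            merged["_sources"].append(source_name)
            for field_name, value in record.items():
                if value is not None:
                    merged.setdefault(field_name, value)
    return unified

sources = {
    "crm": [{"customer_id": 7, "name": "Acme Ltd", "email": None}],
    "billing": [{"customer_id": 7, "email": "ap@acme.example", "balance": 120.0}],
}
view = unify(sources)
```

Recording the contributing sources alongside each unified record preserves a basic form of lineage, so users can see where each view came from.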

Getting the Data Governance “EDGE”

The benefits of integrated data governance discussed above won’t be realized if it is isolated within IT with no input from other stakeholders, the day-to-day data users – from sales and customer service to the C-suite. Every data citizen has DG roles and responsibilities to ensure data units have context, meaning they are labeled, cataloged and secured correctly so they can be analyzed and used properly. In other words, the data can be trusted.

Once an organization understands that IT and the business are both responsible for data, it can develop comprehensive, holistic data governance capable of:

  • Reaching every stakeholder in the process
  • Providing a platform for understanding and governing trusted data assets
  • Delivering the greatest benefit from data wherever it lives, while minimizing risk
  • Helping users understand the impact of changes made to a specific data element across the enterprise.

To reduce the risks of and tackle the reasons for bad data and realize larger organizational objectives, organizations must make data governance everyone’s business.

To learn more about the collaborative approach to data governance and how it helps compliance in addition to adding value and reducing costs, get the free e-book here.


Data Modeling is Changing – Time to Make NoSQL Technology a Priority

As the amount of data enterprises are tasked with managing increases, the benefits of NoSQL technology are becoming more apparent. 


Data Modeling in a Jargon-filled World – Big Data & MPP

By now, you’ve likely heard a lot about Big Data. You may have even heard about “the three Vs” of Big Data. Originally defined by Gartner, Big Data is “high-volume, high-velocity, and/or high-variety information assets that require new forms of processing to enable enhanced decision-making, insight discovery and process optimization.”


Data-Driven Business Transformation: the Data Foundation

In light of data’s prominence in modern business, organizations need to ensure they have a strong data foundation in place.

The ascent of data’s value has been as steep as it is staggering. In 2016, it was suggested that more data would be created in 2017 than in the previous 5000 years of humanity.

But what’s even more shocking is that the peak may not even be in sight yet.

To put its value into context, the five most valuable businesses in the world all deal in data (Alphabet/Google, Amazon, Apple, Facebook and Microsoft). It’s even overtaken oil as the world’s most valuable resource.

Yet, even with data’s value being as high as it is, there’s still a long way to go. Many businesses are still getting to grips with data storage, management and analysis.

Fortune 1000 companies, for example, could earn another $65 million in net income, with access to just 10 percent more of their data (from Data-Driven Business Transformation 2017).

We’re already witnessing the beginnings of this increased potential across various industries. Data-driven businesses such as Airbnb, Uber and Netflix are all dominating, disrupting and revolutionizing their respective sectors.

Interestingly, although they provide very different services for the consumer, the organizations themselves all identify as data companies. This simple change in perception and outlook stresses the importance of data to their business models. For them, data analysis isn’t just an arm of the business… It’s the core.

Data foundation

The dominating data-driven businesses use data to influence almost everything: how decisions are made, how processes could be improved, and where the business should focus its innovation efforts.

However, simply establishing that your business could (and should) be getting more out of data, doesn’t necessarily mean you’re ready to reap the rewards.

In fact, a pre-emptive dive into a data strategy could slow your digital transformation efforts down. Hurried software investments in response to disruption can lead to teething problems in your strategy’s adoption, and to shelfware, wasting time and money.

Additionally, oversights in the strategy’s implementation will stifle the very potential effectiveness you’re hoping to benefit from.

Therefore, when deciding to bolster your data efforts, a great place to start is to consider the ‘three Vs’.

The three Vs

The three Vs of data are volume, variety and velocity. Volume references the amount of data; variety, its different sources; and velocity, the speed at which it must be processed.

When you’re ready to start focusing on the business outcomes that you hope data will provide, you can also stretch those three Vs, to five. The five Vs include the aforementioned, and also acknowledge veracity (confidence in the data’s accuracy) and value, but for now we’ll stick to three.

As discussed, the total amount of data in the world is staggering. But the total data available to any one business can be huge in its own right (depending on the extent of your data strategy).

Unsurprisingly, vast volumes of data come from a vast number of potential sources, and it takes dedicated tools to process them. Even then, the sources are often disparate and very unlikely to offer worthwhile insight in a vacuum.

This is why it’s so important to have an assured data foundation upon which to build a data platform.

A solid data foundation

The Any² approach is a strategy for housing, sorting and analysing data that aims to be that very foundation on which you build your data strategy.

Shorthand for Any Data, Anywhere, Any² can help clean up the disparate noise and let businesses drill down on and effectively analyze the data in order to yield more reliable and informative results.

It’s especially important today, as data sources are becoming increasingly unstructured, and so more difficult to manage.

Big Data, for example, can consist of clickstream data, Internet of Things data, machine data and social media data. The sources need to be rationalized and correlated so they can be analyzed more effectively.

When it comes to actioning an Any² approach, a fluid relationship between the various data initiatives involved is essential: Data Modeling, Enterprise Architecture, Business Process and Data Governance.

It also requires collaboration, both in between the aforementioned initiatives, and with the wider business to ensure everybody is working towards the same goal.

With a solid data foundation platform in place, your business can really begin to start realizing data’s potential for itself. You also ensure you’re not left behind as new disruptors enter the market, and your competition continues to evolve.

For more data advice and best practices, follow us on Twitter, and LinkedIn to stay up to date with the blog.

For a deeper dive into best practices for data, its benefits, and its applications, get the FREE whitepaper below.
