Categories
erwin Expert Blog Data Governance Enterprise Architecture Data Intelligence

Integrating Data Governance and Enterprise Architecture

Aligning these practices for regulatory compliance and other benefits

Why should you integrate data governance (DG) and enterprise architecture (EA)? It’s time to think about EA beyond IT.

Two of the biggest challenges in creating a successful enterprise architecture initiative are collecting accurate information on application ecosystems and maintaining that information as application ecosystems change.

Data governance provides current-state architecture information that is timely and of high quality. It documents your data assets from end to end for business understanding and clear data lineage with traceability.

In the context of EA, data governance helps you understand what information you have, where it came from, whether it's secure, who's accountable for it, who has accessed it, and which systems and applications it's located in and moves between.

You can collect complete application ecosystem information; objectively identify connections/interfaces between applications, using data; provide accurate compliance assessments; and quickly identify security risks and other issues.

Data governance also provides many of the same benefits as enterprise architecture or business process modeling projects: reducing risk, optimizing operations, and increasing the use of trusted data.

To better understand and align data governance and enterprise architecture, let’s look at data at rest and data in motion and why they both have to be documented.

  1. Documenting data at rest involves looking at where data is stored, such as in databases, data lakes, data warehouses and flat files. You must capture all of this information from the columns, fields and tables – and all the data overlaid on top of that. This means understanding not just the technical aspects of a data asset but also how the business uses that data asset.
  2. Documenting data in motion looks at how data flows between source and target systems and not just the data flows themselves but also how those data flows are structured in terms of metadata. We have to document how our systems interact, including the logical and physical data assets that flow into, out of and between them.
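The distinction above can be made concrete with a small sketch. The record shapes below are illustrative assumptions for this post, not the schema of any actual governance tool:

```python
from dataclasses import dataclass

@dataclass
class DataAsset:
    """Data at rest: where a data element lives and what it means to the business."""
    system: str            # e.g., a database, data lake or warehouse
    table: str
    column: str
    business_meaning: str  # how the business uses this asset

@dataclass
class DataFlow:
    """Data in motion: how an asset moves between source and target systems."""
    source: DataAsset
    target: DataAsset
    transformation: str    # rule applied as the data moves

# Hypothetical assets and flow, for illustration only
crm_email = DataAsset("CRM", "customers", "email", "primary contact address")
dw_email = DataAsset("Warehouse", "dim_customer", "email_addr", "reporting copy")
flow = DataFlow(crm_email, dw_email, "lowercase and trim")

print(flow.source.system, "->", flow.target.system)  # CRM -> Warehouse
```

Capturing both record types together is what lets a governance repository answer lineage questions later: the assets describe the endpoints, and the flows describe the hops between them.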


Automating Data Governance and Enterprise Architecture

If you have a data governance program and tooling in place, you’re able to document a lot of information that enterprise architects and process modelers usually spend months, if not years, collecting and keeping up to date.

So within a data governance repository, you’re capturing systems, environments, databases and data — both logical and physical. You’re also collecting information about how those systems are interconnected.

With all this information about the data landscape and the systems that use and store it, you're automatically collecting your organization's application architecture. Because the associated data is managed properly, your enterprise architecture stays up to date, drastically reducing the time to value.

If your organization also has an enterprise architecture practice and tooling, you can automate the current-state architecture, which is arguably the most expensive and time-intensive aspect of enterprise architecture, and have it at your fingertips.

In erwin’s 2020 State of Data Governance and Automation report, close to 70 percent of respondents said they spend an average of 10 or more hours per week on data-related activities, and most of that time is spent searching for and preparing data.

At the same time, it's also critical to answer executives' questions. You can't do impact analysis if you don't understand the current-state architecture, and answers won't be delivered quickly enough if it isn't documented.

Data Governance and Enterprise Architecture for Regulatory Compliance

First and foremost, we can start to document the application inventory automatically because we are scanning systems and understanding the architecture itself. When you pre-populate your interface inventory, application lineage and data flows, you see clear-cut dependencies.

That makes regulatory compliance a fantastic use case for both data governance and EA. You can factor this use case into process and application architecture diagrams, looking at where this type of data goes and what sort of systems it touches.

With that information, you can start to classify information for such regulations as the European Union’s General Data Protection Regulation (GDPR), the California Consumer Privacy Act (CCPA) or any type of compliance data for an up-to-date regulatory compliance repository. Then all this information flows into processing controls and will ultimately deliver real-time, true impact analysis and traceability.
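As a rough illustration of that classification step, here is a naive rule-based tagger that flags columns likely to hold personal data for GDPR/CCPA review. The keyword list, column names and tags are assumptions for this sketch, not erwin functionality:

```python
# Keywords that suggest a column holds personal data (illustrative, incomplete)
PII_KEYWORDS = {"email", "name", "phone", "ssn", "address", "dob"}

def classify_column(column_name: str) -> list:
    """Return regulation tags if the column name suggests personal data."""
    lowered = column_name.lower()
    if any(keyword in lowered for keyword in PII_KEYWORDS):
        return ["GDPR", "CCPA"]
    return []

# A made-up slice of an application's column inventory
inventory = ["customer_email", "order_total", "billing_address"]
flagged = {col: classify_column(col) for col in inventory}
print(flagged)
```

A real compliance repository would classify on much richer signals than column names (data profiling, lineage, stewards' annotations), but the principle is the same: tag the data once, then let every downstream control and report reuse the tags.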

erwin for Data Governance and Enterprise Architecture

Using data governance and enterprise architecture in tandem will give you a data-driven architecture, reduce time to value and show true results to your executives.

You can better manage risk because of real-time data coming into the EA space. You can react quicker, answering questions for stakeholders that will ultimately drive business transformation. And you can reinforce the value of your role as an enterprise architect.

erwin Evolve is a full-featured, configurable set of enterprise architecture and business process modeling and analysis tools. It integrates with erwin’s data governance software, the erwin Data Intelligence Suite.

With these unified capabilities, every enterprise stakeholder – enterprise architect, business analyst, developer, chief data officer, risk manager, and CEO – can discover, understand, govern and socialize data assets to realize greater value while mitigating data-related risks.

You can start a free trial of erwin Evolve here.



Data Governance Definition, Best Practices and Benefits

Any organization with a data-driven strategy should understand the definition of data governance. In fact, in light of increasingly stringent data regulations, any organization that uses or even stores data should understand the definition of data governance.

Organizations with a solid understanding of data governance (DG) are better equipped to keep pace with the speed of modern business.

In this post, the erwin Experts address:

  • The definition of data governance
  • Why data governance is important
  • What good data governance looks like
  • The key benefits of data governance
  • The best data governance solution
Data Governance Definition

Data governance's definition is broad because it describes a process rather than a predetermined method. So an understanding of the process and the best practices associated with it is key to a successful data governance strategy.

Data governance is best defined as the strategic, ongoing and collaborative processes involved in managing data’s access, availability, usability, quality and security in line with established internal policies and relevant data regulations.

It’s often said that when we work together, we can achieve things greater than the sum of our parts. Collective, societal efforts have seen mankind move metaphorical mountains and land on the literal moon.

Such feats were made possible through effective government – or governance.

The same applies to data. A single unit of data in isolation can’t do much, but the sum of an organization’s data can prove invaluable.

Put simply, DG is about maximizing the potential of an organization’s data and minimizing the risk. In today’s data-driven climate, this dynamic is more important than ever.

That’s because data’s value depends on the context in which it exists: too much unstructured or poor-quality data and meaning is lost in a fog; too little insight into data’s lineage, where it is stored, or who has access and the organization becomes an easy target for cybercriminals and/or non-compliance penalties.

So DG is, quite simply, about how an organization uses its data. That includes how it creates or collects data, as well as how its data is stored and accessed. It ensures that the right data of the right quality, regardless of where it is stored or what format it is stored in, is available for use – but only by the right people and for the right purpose.

With well governed data, organizations can get more out of their data by making it easier to manage, interpret and use.

Why Is Data Governance Important?

Although governing data is not a new practice, treating it as a strategic program is, and so are the expectations about who is responsible for it.

Historically, governing data has been IT’s business because it primarily involved cataloging data to support search and discovery.

But now, governing data is everyone’s business. Both the data “keepers” in IT and the data users everywhere else within the organization have a role to play.

That makes sense, too. The sheer volume and importance of data the average organization now processes are too great to be effectively governed by a siloed IT department.

Think about it. If all the data you access as an employee of your organization had to be vetted by IT first, could you get anything done?

While the exponential increase in the volume and variety of data has provided unparalleled insights for some businesses, only those with the means to deal with the velocity of data have reaped the rewards.

By velocity, we mean the speed at which data can be processed and made useful. More on “The Three Vs of Data” here.

Data giants like Amazon, Netflix and Uber have reshaped whole industries, turning smart, proactive data governance into actionable and profitable insights.

And then, of course, there's the regulatory side of things. The European Union's General Data Protection Regulation (GDPR) mandates that organizations govern their data.

Poor data governance doesn't just lead to breaches – although of course it does. Passing a compliance audit also requires an effective data governance initiative.

Since non-compliance can be costly, good data governance not only helps organizations make money, it helps them save it too. And organizations are recognizing this fact.

In the lead-up to GDPR, studies found that the biggest driver for data governance initiatives was regulatory compliance. Since GDPR's implementation, however, better decision-making and analytics have become the top drivers for investing in data governance.

Other areas where well governed data plays an important role include digital transformation, data standards and uniformity, self-service, and customer trust and satisfaction.

For the full list of drivers and deeper insight into the state of data governance, get the free 2020 State of DGA report here.

What Is Good Data Governance?

We’re constantly creating new data whether we’re aware of it or not. Every new sale, every new inquiry, every website interaction, every swipe on social media generates data.

This means the work of governing data is ongoing, and organizations without it can become overwhelmed quickly.

Therefore, good data governance is proactive, not reactive.

In addition, good data governance requires organizations to encourage a culture that stresses the importance of data with effective policies for its use.

An organization must know who should have access to what, both internally and externally, before any technical solutions can effectively compartmentalize the data.

So good data governance requires both technical solutions and policies to ensure organizations stay in control of their data.

But culture isn’t built on policies alone. An often-overlooked element of good data governance is arguably philosophical. Effectively communicating the benefits of well governed data to employees – like improving the discoverability of data – is just as important as any policy or technology.

And it shouldn’t be difficult. In fact, it should make data-oriented employees’ jobs easier, not harder.

What Are the Key Benefits of Data Governance?

Organizations with effectively governed data enjoy:

  • Better alignment with data regulations: Get a more holistic understanding of your data and any associated risks, plus improve data privacy and security through better data cataloging.
  • A greater ability to respond to compliance audits: Take the pain out of preparing reports and respond more quickly to audits with better documentation of data lineage.
  • Increased operational efficiency: Identify and eliminate redundancies and streamline operations.
  • Increased revenue: Uncover opportunities to both reduce expenses and discover/access new revenue streams.
  • More accurate analytics and improved decision-making: Be more confident in the quality of your data and the decisions you make based on it.
  • Improved employee data literacy: Consistent data standards help ensure employees are more data literate, and they reduce the risk of semantic misinterpretations of data.
  • Better customer satisfaction/trust and reputation management: Use data to provide a consistent, efficient and personalized customer experience, while avoiding the pitfalls and scandals of breaches and non-compliance.

For a more in-depth assessment of data governance benefits, check out The Top 6 Benefits of Data Governance.

The Best Data Governance Solution

Data has always been important to erwin; we’ve been a trusted data modeling brand for more than 30 years. But we’ve expanded our product portfolio to reflect customer needs and give them an edge, literally.

The erwin EDGE platform delivers an “enterprise data governance experience.” And at the heart of the erwin EDGE is the erwin Data Intelligence Suite (erwin DI).

erwin DI provides all the tools you need for the effective governance of your data. These include data catalog, data literacy and a host of built-in automation capabilities that take the pain out of data preparation.

With erwin DI, you can automatically harvest, transform and feed metadata from a wide array of data sources, operational processes, business applications and data models into a central data catalog and then make it accessible and understandable via role-based, contextual views.

With the broadest set of metadata connectors, erwin DI combines data management and DG processes to fuel an automated, real-time, high-quality data pipeline.

See for yourself why erwin DI is a DBTA 2020 Readers’ Choice Award winner for best data governance solution with your very own, very free demo of erwin DI.



Do I Need a Data Catalog?

If you’re serious about a data-driven strategy, you’re going to need a data catalog.

Organizations need a data catalog because it enables them to create a seamless way for employees to access and consume data and business assets in an organized manner.

Given the value this sort of data-driven insight can provide, the reason organizations need a data catalog should become clearer.

It's no surprise that most organizations' data is often fragmented and siloed across numerous sources (e.g., legacy systems, data warehouses, flat files stored on individual desktops and laptops, and modern, cloud-based repositories).

These fragmented data environments make data governance a challenge since business stakeholders, data analysts and other users are unable to discover data or run queries across an entire data set. This also diminishes the value of data as an asset.

In certain circumstances, data catalogs also combine physical system catalogs, critical data elements and key performance measures with clearly defined product and sales goals.

You also can manage the effectiveness of your business and ensure you understand what critical systems are for business continuity and measuring corporate performance.

The data catalog is a searchable asset that enables all data – including even formerly siloed tribal knowledge – to be cataloged and more quickly exposed to users for analysis.

Organizations with particularly deep data stores might need a data catalog with advanced capabilities, such as automated metadata harvesting to speed up the data preparation process.

For example, before users can effectively and meaningfully engage with robust business intelligence (BI) platforms, they must have a way to ensure that the most relevant, important and valuable data sets are included in analysis.

The most optimal and streamlined way to achieve this is by using a data catalog, which can provide a first stop for users ahead of working in BI platforms.

As a collective intelligent asset, a data catalog should include capabilities for collecting and continually enriching or curating the metadata associated with each data asset to make them easier to identify, evaluate and use properly.


Three Types of Metadata in a Data Catalog

A data catalog uses metadata, data that describes or summarizes data, to create an informative and searchable inventory of all data assets in an organization.

These assets can include but are not limited to structured data, unstructured data (including documents, web pages, email, social media content, mobile data, images, audio, video and reports) and query results, etc. The metadata provides information about the asset that makes it easier to locate, understand and evaluate.

For example, Amazon handles millions of different products, and yet we, as consumers, can find almost anything about everything very quickly.

Beyond Amazon’s advanced search capabilities, the company also provides detailed information about each product, the seller’s information, shipping times, reviews, and a list of companion products. Sales are measured down to a zip code territory level across product categories.

Another classic example is the online or card catalog at a library. Each card or listing contains information about a book or publication (e.g., title, author, subject, publication date, edition, location) that makes the publication easier for a reader to find and to evaluate.

There are many types of metadata, but a data catalog deals primarily with three: technical metadata, operational or “process” metadata, and business metadata.

Technical Metadata

Technical metadata describes how the data is organized and stored, as well as its transformation and lineage. It is structural and describes data objects such as tables, columns, rows, indexes and connections.

This aspect of the metadata guides data experts on how to work with the data (e.g. for analysis and integration purposes).

Operational Metadata

Operational metadata describes the systems that process data, the applications in those systems, and the rules in those applications. Also called "process" metadata, it describes the data asset's creation and when, how and by whom it has been accessed, used, updated or changed.

Operational metadata provides information about the asset’s history and lineage, which can help an analyst decide if the asset is recent enough for the task at hand, if it comes from a reliable source, if it has been updated by trustworthy individuals, and so on.

As illustrated above, a data catalog is essential to business users because it synthesizes all the details about an organization's data assets across multiple data sources. It organizes them into a simple, easy-to-digest format and then publishes them to data communities for knowledge-sharing and collaboration.

Business Metadata

Business metadata, sometimes referred to as external metadata, captures the business aspects of a data asset. It defines the functionality of the data captured, the definition of the data and its elements, and how the data is used within the business.

This is the area that binds all users together in terms of consistency and usage of cataloged data assets.

Tools should be provided that enable data experts to explore the data catalogs, curate and enrich the metadata with tags, associations, ratings, annotations, and any other information and context that helps users find data faster and use it with confidence.
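To tie the three metadata types together, here is a toy catalog entry plus a search function over name, tags and business definition. The field names and sample values are illustrative assumptions, not a real catalog schema:

```python
from dataclasses import dataclass, field

@dataclass
class CatalogEntry:
    name: str
    technical: dict    # structure: tables, columns, types
    operational: dict  # history: who created/updated it and when
    business: dict     # meaning: definition, owner, usage
    tags: set = field(default_factory=set)

def search(catalog, term):
    """Find entries whose name, tags or business definition mention the term."""
    term = term.lower()
    return [e for e in catalog
            if term in e.name.lower()
            or any(term in t.lower() for t in e.tags)
            or term in e.business.get("definition", "").lower()]

# One made-up entry; a real catalog would hold thousands, harvested automatically
catalog = [
    CatalogEntry("dim_customer",
                 technical={"columns": ["id", "email", "region"]},
                 operational={"updated_by": "etl_job", "updated": "2020-05-01"},
                 business={"definition": "One row per active customer",
                           "owner": "Sales Ops"},
                 tags={"customer", "pii"}),
]
print([e.name for e in search(catalog, "customer")])  # ['dim_customer']
```

Note that the search leans on business metadata and tags, not just technical names: that is what lets a non-technical user find `dim_customer` by asking for "customer" rather than knowing the table name.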

Why You Need a Data Catalog – Three Business Benefits of Data Catalogs

When data professionals can help themselves to the data they need—without IT intervention and having to rely on finding experts or colleagues for advice, limiting themselves to only the assets they know about, and having to worry about governance and compliance—the entire organization benefits.

A data catalog lets you catalog critical systems and data elements, plus enable the calculation and evaluation of key performance measures. It is also important to understand data lineage and be able to analyze the impacts to critical systems and essential business processes if a change occurs.

  1. Makes data accessible and usable, reducing operational costs while accelerating time to value

Open your organization’s data door, making it easier to access, search and understand information assets. A data catalog is the core of data analysis for decision-making, so automating its curation and access with the associated business context will enable stakeholders to spend more time analyzing it for meaningful insights they can put into action.

Data assets need to be properly scanned, documented, tagged and annotated with their definitions, ownership, lineage and usage. Automating the cataloging of data assets saves initial development time and streamlines its ongoing maintenance and governance.

Automating the curation of data assets also accelerates the time to value for analytics/insights reporting and significantly reduces operational costs.

  2. Ensures regulatory compliance

Regulations like the California Consumer Privacy Act (CCPA) and the European Union's General Data Protection Regulation (GDPR) require organizations to know where all their customer, prospect and employee data resides to ensure its security and privacy.

A fine for noncompliance or reputational damage is the last thing you need to worry about, so use a data catalog to centralize data management and the associated usage policies and guardrails.

See a Data Catalog in Action

The erwin Data Intelligence Suite (erwin DI) provides data catalog and data literacy capabilities with built-in automation so you can accomplish all the above and much more.

Request your own demo of erwin DI.



Overcoming the 80/20 Rule – Finding More Time with Data Intelligence

The 80/20 rule is well known. It describes an unfortunate reality for many data stewards, who spend 80 percent of their time finding, cleaning and reorganizing huge amounts of data, and only 20 percent of their time on actual data analysis.

That's a lot of wasted time.

Earlier this year, erwin released its 2020 State of Data Governance and Automation (DGA) report. About 70 percent of the DGA report respondents – a combination of roles from data architects to executive managers – say they spend an average of 10 or more hours per week on data-related activities.

COVID-19 has changed the way we work – essentially overnight – and may change how companies work moving forward. Companies like Twitter, Shopify and Box have announced that they are moving to a permanent work-from-home status as their new normal.

For much of our time as data stewards, collecting, revising and building consensus around our metadata has meant balancing free time on multiple calendars against multiple competing priorities so that we can pull the appropriate data stakeholders into a room to discuss term definitions, the rules for measuring "clean" data, and the processes and applications that use the data.


This style of data governance most often presents us with eight one-hour opportunities per day (40 one-hour opportunities per week) to meet.

As the 80/20 rule suggests, getting through hundreds, or perhaps thousands of individual business terms using this one-hour meeting model can take … a … long … time.

Now that pulling stakeholders into a room has been disrupted …  what if we could use this as 40 opportunities to update the metadata PER DAY?

What if we could buck the trend, and overcome the 80/20 rule?

Overcoming the 80/20 Rule with Micro Governance for Metadata

Micro governance is a strategy that leverages the native functionality around workflows.

erwin Data Intelligence (DI) offers a Workflow Manager that creates a persistent, reusable, role-based workflow so that edits to the metadata for any term can move from, for example, draft to under review to approved to published.

A defined workflow can eliminate the need for hour-long meetings with multiple stakeholders in a room. Now users can suggest edits, review changes and approve changes on their own schedule! Using micro governance, these steps should take less than 10 minutes per term:

  • Log on to the DI Suite
  • Open your work queue to see items requiring your attention
  • Review and/or approve changes
  • Log out

That’s it!
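The draft-to-published progression described above can be sketched as a simple state machine. The transition rules here are assumptions for illustration; erwin's actual Workflow Manager is configurable and role-based:

```python
# Linear workflow: each state has exactly one next state (an assumption
# made for this sketch; real workflows can branch, e.g. reject back to draft)
TRANSITIONS = {
    "draft": "under_review",
    "under_review": "approved",
    "approved": "published",
}

class TermWorkflow:
    """Tracks one business term's journey through the review workflow."""

    def __init__(self, term):
        self.term = term
        self.state = "draft"
        self.history = ["draft"]

    def advance(self):
        """Move the term to its next state, recording the change."""
        if self.state == "published":
            raise ValueError(f"{self.term} is already published")
        self.state = TRANSITIONS[self.state]
        self.history.append(self.state)
        return self.state

wf = TermWorkflow("Customer Lifetime Value")
wf.advance()                  # draft -> under_review
wf.advance()                  # under_review -> approved
wf.advance()                  # approved -> published
print(wf.state, wf.history)
```

Because each advance is a small, self-contained step, it maps naturally onto the 10-minute micro-governance sessions described above: one stakeholder, one term, one transition.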

And as a bonus, when stakeholders need to discuss edits to achieve consensus, the Collaboration Center within the Business Glossary Manager facilitates conversations that persist and are attached directly to the business term. No more searching through months of email conversations or forgetting to cc a key stakeholder.

Using the DI Suite Workflow Manager and the Collaboration Center, and assuming an 8-hour workday, we should each have 48 opportunities for 10 minutes of micro-governance stewardship each day.

A Culture of Micro Governance

In these days when we are all working at home, and face-to-face meetings are all but impossible, we should see this time as an opportunity to develop a culture of micro governance around our metadata.

This new way of thinking and acting will help us continuously improve our transparency and semantic understanding of our data while staying connected and collaborating with each other.

When we finally get back into the office, the micro governance ethos we’ve built while at home will help make our data governance programs more flexible, responsive and agile. And ultimately, we’ll take up less of our colleagues’ precious time.

Request a free demo of erwin DI.



What is Data Lineage? Top 5 Benefits of Data Lineage

What is Data Lineage and Why is it Important?

Data lineage is the journey data takes from its creation through its transformations over time. It describes a certain dataset’s origin, movement, characteristics and quality.

Tracing the source of data is an arduous task.

Many large organizations, in their desire to modernize with technology, have acquired several different systems with various data entry points and transformation rules for data as it moves into and across the organization.


These tools include enterprise service bus (ESB) products; data integration tools; extract, transform and load (ETL) tools; procedural code; application programming interfaces (APIs); file transfer protocol (FTP) processes; and even business intelligence (BI) reports that further aggregate and transform data.

With all these diverse data sources, and as systems are integrated, it is difficult to understand the complicated data web they form, much less produce a simple visual flow. This is why data lineage must be tracked and why its role is so vital to business operations, providing the ability to understand where data originates, how it is transformed, and how it moves into, across and outside a given organization.

Data Lineage Use Case: From Tracing COVID-19’s Origins to Data-Driven Business

A lot of theories have emerged about the origin of the coronavirus. A recent University of California San Francisco (UCSF) study conducted a genetic analysis of COVID-19 to determine how the virus was introduced specifically to California’s Bay Area.

It detected at least eight different viral lineages in 29 patients in February and early March, suggesting no regional patient zero but rather multiple independent introductions of the pathogen. The professor who directed the study said, “it’s like sparks entering California from various sources, causing multiple wildfires.”

Much like understanding viral lineage is key to stopping this and other potential pandemics, understanding the origin of data is key to a successful data-driven business.

Top Five Data Lineage Benefits

From my perspective in working with customers of various sizes across multiple industries, I’d like to highlight five data lineage benefits:

1. Business Impact

Data is crucial to every organization’s survival. For that reason, businesses must think about the flow of data across multiple systems that fuel organizational decision-making.

For example, the marketing department uses demographics and customer behavior to forecast sales. The CEO also makes decisions based on performance and growth statistics. An understanding of the data's origins and history helps answer questions about the origin of data in a key performance indicator (KPI) report, including:

  • How are the report tables and columns defined in the metadata?
  • Who are the data owners?
  • What are the transformation rules?

Without data lineage, these questions can't be answered reliably, so it makes sense for a business to have a clear understanding of where data comes from, who uses it, and how it transforms. Also, when there is a change to the environment, it is valuable to assess the impacts to the enterprise application landscape.

In the event of a change in data expectations, data lineage provides a way to determine which downstream applications and processes are affected by the change and helps in planning for application updates.
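That kind of impact analysis can be sketched with a tiny lineage graph. The system names and edges below are made up for illustration; in practice the graph is harvested automatically from metadata:

```python
from collections import defaultdict

# Lineage edges point from source to target system (all names hypothetical)
edges = [
    ("crm", "staging"),
    ("erp", "staging"),
    ("staging", "warehouse"),
    ("warehouse", "kpi_report"),
]

downstream = defaultdict(set)  # forward edges: impact analysis
upstream = defaultdict(set)    # reverse edges: root-cause tracing
for src, tgt in edges:
    downstream[src].add(tgt)
    upstream[tgt].add(src)

def reachable(start, graph):
    """All nodes reachable from start via the given edge direction."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

# If staging changes, which downstream assets are affected?
print(sorted(reachable("staging", downstream)))    # ['kpi_report', 'warehouse']
# Where does the KPI report's data ultimately come from?
print(sorted(reachable("kpi_report", upstream)))   # ['crm', 'erp', 'staging', 'warehouse']
```

The same traversal serves two of the benefits discussed in this post: walking forward answers "what breaks if this changes?" and walking backward answers "where did this flawed number come from?"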

2. Compliance & Auditability

Business terms and data policies should be implemented through standardized and documented business rules. Compliance with these business rules can be tracked through data lineage, incorporating auditability and validation controls across data transformations and pipelines to generate alerts when there are non-compliant data instances.

Regulatory compliance places greater transparency demands on firms when it comes to tracing and auditing data. For example, capital markets trading firms must understand their data’s origins and history to support risk management, data governance and reporting for various regulations such as BCBS 239 and MiFID II.

Also, different organizational stakeholders (customers, employees and auditors) need to be able to understand and trust reported data. Data lineage offers proof that the data provided is reflected accurately.

3. Data Governance

An automated data lineage solution stitches together metadata for understanding and validating data usage, as well as mitigating the associated risks.

It can auto-document end-to-end upstream and downstream data lineage, revealing any changes that have been made, by whom and when.

This data ownership, accountability and traceability is foundational to a sound data governance program.

See: The Benefits of Data Governance

4. Collaboration

Analytics and reporting are data-dependent, making collaboration among different business groups and/or departments crucial.

The visualization of data lineage can help business users spot the inherent connections of data flows and thus provide greater transparency and auditability.

Seeing data pipelines and information flows further supports compliance efforts.

5. Data Quality

Data quality is affected by data’s movement, transformation, interpretation and selection through people, process and technology.

Root-cause analysis is the first step in repairing data quality. Once a data steward determines where a data flaw was introduced, the reason for the error can be determined.

With data lineage and mapping, the data steward can trace the information flow backward to examine the standardizations and transformations applied to confirm whether they were performed correctly.
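That backward trace can be pictured as a walk over recorded lineage edges. Here is a simplified sketch, with hypothetical system names and transformations:

```python
# Sketch: trace a data element backward through recorded lineage to find
# where a flaw may have been introduced. Edges and names are hypothetical.

# upstream[X] lists the (source, transformation) pairs that feed X
upstream = {
    "report.revenue": [("warehouse.fact_sales", "SUM(amount)")],
    "warehouse.fact_sales": [("staging.sales", "currency conversion")],
    "staging.sales": [("crm.orders", "deduplication")],
}

def trace_back(element):
    """Return each upstream hop (target, source, transformation) back to origin."""
    path = []
    current = element
    while current in upstream:
        source, transform = upstream[current][0]  # follow first feed for brevity
        path.append((current, source, transform))
        current = source
    return path

for target, source, transform in trace_back("report.revenue"):
    print(f"{target} <- {source} via '{transform}'")
```

Each hop shows the transformation the steward would inspect to confirm it was performed correctly.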

See Data Lineage in Action

Data lineage tools document the flow of data into and out of an organization’s systems. They capture end-to-end lineage and ensure proper impact analysis can be performed in the event of problems or changes to data assets as they move across pipelines.

The erwin Data Intelligence Suite (erwin DI) automatically generates end-to-end data lineage, down to the column level and between repositories. You can view data flows from source systems to the reporting layers, including intermediate transformation and business logic.

Join us for the next live demo of erwin Data Intelligence (DI) to see metadata-driven, automated data lineage in action.



Categories
erwin Expert Blog Data Governance

Data Governance Frameworks: The Key to Successful Data Governance Implementation

A strong data governance framework is central to successful data governance implementation in any data-driven organization because it ensures that data is properly maintained, protected and maximized.

But despite this fact, enterprises often face push back when implementing a new data governance initiative or trying to mature an existing one.

Let’s assume you have some form of informal data governance operation with some strengths to build on and some weaknesses to correct. Some parts of the organization are engaged and behind the initiative, while others are skeptical about its relevance or benefits.

Some other common data governance implementation obstacles include:

  • Questions about where to begin and how to prioritize which data streams to govern first
  • Issues regarding data quality and ownership
  • Concerns about data lineage
  • Competing projects and resources (time, people and funding)

By using a data governance framework, organizations can formalize their data governance implementation and ensure ongoing adherence to it. This addresses common concerns, including data quality and data lineage, and provides a clear path to successful data governance implementation.

In this blog, we will cover three key steps to successful data governance implementation. We will also look into how we can expand the scope and depth of a data governance framework to ensure data governance standards remain high.

Data Governance Implementation in 3 Steps

When maturing or implementing data governance and/or a data governance framework, an accurate assessment of the ‘here and now’ is key. Then you can rethink the path forward, identifying any current policies or business processes that should be incorporated and being careful to avoid repeating the mistakes of prior iterations.

With this in mind, here are three steps we recommend for implementing data governance and a data governance framework.


Step 1: Shift the culture toward data governance

Data governance isn’t something to set and forget; it’s a strategic approach that needs to evolve over time in response to new opportunities and challenges. Therefore, a successful data governance framework has to become part of the organization’s culture, but such a shift requires listening – and remembering that it’s about people, empowerment and accountability.

In most cases, a new data governance framework requires people – those in IT and across the business, including risk management and information security – to change how they work. Any concerns they raise or recommendations they make should be considered. You can encourage feedback through surveys, workshops and open dialog.

Once input has been discussed and a plan agreed upon, it is critical to update roles and responsibilities, provide training and ensure ongoing communication. Many organizations now have internal certifications for different data governance roles, and those who earn them wear their badges with pride.

A top-down management approach will get a data governance initiative off the ground, but only bottom-up cultural adoption will sustain it.

Step 2: Refine the data governance framework

The right capabilities and tools are important for fueling an accurate, real-time data pipeline and governing it for maximum security, quality and value. For example:

Data cataloging: Organizations implementing a data governance framework will benefit from automated metadata harvesting, data mapping, code generation and data lineage with reference data management, lifecycle management and data quality. With these capabilities, you can efficiently integrate and activate enterprise data within a single, unified catalog in accordance with business requirements.

Data literacy: Being able to discover what data is available and understand what it means in common, standardized terms is important because data elements may mean different things to different parts of the organization. A business glossary answers this need, as does the ability for stakeholders to view data relevant to their roles and understand it within a business context through a role-based portal.

Such tools are further enhanced if they can be integrated across data and business architectures and when they promote self-service and collaboration, which also are important to the cultural shift.


Step 3: Prioritize then scale the data governance framework

Because data governance is ongoing, it’s important to prioritize the initial areas of focus and scale from there. Organizations that start with 30 to 50 data items are generally more successful than those that attempt more than 1,000 in the early stages.

Find some representative (familiar) data items and create examples for data ownership, quality, lineage and definition so stakeholders can see real examples of the data governance framework in action. For example:

  • Data ownership model showing a data item, its definition, producers, consumers, stewards and quality rules (for profiling)
  • Workflow showing the creation, enrichment and approval of the above data item to demonstrate collaboration

Whether your organization is just adopting data governance or the goal is to refine an existing data governance framework, the erwin DG RediChek will provide helpful insights to guide you in the journey.

Categories
erwin Expert Blog Data Governance

For Pharmaceutical Companies, Data Governance Shouldn’t Be a Hard Pill to Swallow

Using data governance in the pharmaceutical industry is a critical piece of the data management puzzle.

Pharmaceutical and life sciences companies face many of the same digital transformation pressures as other industries, such as financial services and healthcare, which we have explored previously.

In response, they are turning to technologies like advanced analytics platforms and cloud-based resources to help better inform their decision-making and create new efficiencies and better processes.

Among the conditions that set digital transformation in pharmaceuticals and life sciences apart from other sectors are the regulatory environment and the high incidence of mergers and acquisitions (M&A).

Data Governance, GDPR and Your Business

Protecting sensitive data in these industries is a matter of survival, both in terms of the potential penalties for failing to comply with any number of industry and government regulations and because of the near-priceless value of data around research and development (R&D).

The high costs and huge potential of R&D is one of the driving factors of M&A activity in the pharmaceutical and life sciences space. With roughly $156 billion in M&A deals in healthcare in the first quarter of 2018 alone – many involving drug companies – the market is the hottest it’s been in more than a decade. Much of the M&A activity is being driven by companies looking to buy competitors, acquire R&D, and offset losses from expiring drug patents.

 


 

With M&A activity comes the challenge of integrating two formerly separate companies into one. That means integrating technology platforms, business processes, and, of course, the data each organization brings to the deal.

Data Integrity for Risk Management and More

As in virtually every other industry, data is quickly becoming one of the most valuable assets within pharmaceutical and life sciences companies. In its 2018 Global Life Sciences Outlook, Deloitte speaks to the importance of “data integrity,” which it defines as data that is complete, consistent and accurate throughout the data lifecycle.

Data integrity helps manage risk in pharmaceutical and life sciences by making it easier to comply with a complex web of regulations that touch many different parts of these organizations, from finance to the supply chain and beyond. Linking these cross-functional teams to data they can trust eases the burden of compliance by supplying team members with what many industries now refer to as “a single version of truth” – which is to say, data with integrity.

Data integrity also helps deliver insights for important initiatives in the pharmaceutical and life sciences industries like value-based pricing and market access.

Developing data integrity and taking advantage of it to reduce risk and identify opportunities in pharmaceuticals and life sciences isn’t possible without a holistic approach to data governance that permeates every part of these companies, including business processes and enterprise architecture.


Data Governance in the Pharmaceutical Industry Maximizes Value

Data governance gives businesses the visibility they need to understand where their data is, where it came from, its value, its quality and how it can be used by people and software applications. This type of understanding of your data is, of course, essential to compliance. In fact, according to a 2017 survey by erwin, Inc. and UBM, 60 percent of organizations said compliance is driving their data governance initiatives.

Using data governance in the pharmaceutical industry helps organizations contemplating M&A, not only by helping them understand the data they are acquiring, but also by informing decisions around complex IT infrastructures and applications that need to be integrated. Decisions about application rationalization and business processes are easier to make when they are viewed through the lens of a pervasive data governance strategy.

Data governance in the pharmaceutical industry can be leveraged to hone data integrity and move toward what Deloitte refers to as end-to-end evidence management (E2E), which unifies the data in pharmaceuticals and life sciences from R&D to clinical trials and through commercialization.

Once implemented, Deloitte predicts E2E will help organizations maximize the value of their data by:

  • Providing a better understanding of emerging risks
  • Enabling collaboration with health systems, patient advocacy groups, and other constituents
  • Streamlining the development of new therapies
  • Driving down costs

If that list of benefits sounds familiar, it’s because it matches up nicely with the goals of digital transformation at many organizations – more efficient processes, better collaboration, improved visibility and better cost management. And it’s all built on a foundation of data and data governance.

To learn more, download our free whitepaper on the Regulatory Rationale for Integrating Data Management & Data Governance.


Categories
erwin Expert Blog Data Governance Data Intelligence

Demystifying Data Lineage: Tracking Your Data’s DNA

Getting the most out of your data requires getting a handle on data lineage. That’s knowing what data you have, where it is, and where it came from – plus understanding its quality and value to the organization.

But you can’t understand your data in a business context, much less track its lineage and physical existence or maximize its security, quality and value, if it’s scattered across different silos in numerous applications.

Data lineage provides a way of tracking data from its origin to destination across its lifespan and all the processes it’s involved in. It also plays a vital role in data governance. Beyond the simple ability to know where the data came from and whether or not it can be trusted, there’s an element of statutory reporting and compliance that often requires a knowledge of how that same data (known or unknown, governed or not) has changed over time.

A platform that provides insights like data lineage, impact analysis, full-history capture, and other data management features serves as a central hub from which everything can be learned and discovered about the data – whether a data lake, a data vault or a traditional data warehouse.

In a traditional data management organization, Excel spreadsheets are used to manage the incoming data design, what’s known as the “pre-ETL” mapping documentation, but this does not provide any sort of visibility or auditability. In fact, each unit of work represented in these “mapping documents” becomes an independent variable in the overall system development lifecycle, and therefore nearly impossible to learn from, much less standardize.

The key to accuracy and integrity in any exercise is to eliminate the opportunity for human error – which does not mean eliminating humans from the process but incorporating the right tools to reduce the likelihood of error as the human beings apply their thought processes to the work.


Data Lineage: A Crucial First Step for Data Governance

Knowing what data you have, where it lives and where it came from is complicated. The lack of visibility and control around “data at rest” combined with “data in motion,” as well as difficulties with legacy architectures, means organizations spend more time finding the data they need than using it to produce meaningful business outcomes.

Organizations need to create and sustain an enterprise-wide view of and easy access to underlying metadata, but that’s a tall order with numerous data types and data sources that were never designed to work together and data infrastructures that have been cobbled together over time with disparate technologies, poor documentation and little thought for downstream integration. So the applications and initiatives that depend on a solid data infrastructure may be compromised, resulting in faulty analyses.

These issues can be addressed with a strong data management strategy underpinned by technology that enables the data quality the business requires, which encompasses data cataloging (integration of data sets from various sources), mapping, versioning, business rules and glossaries maintenance and metadata management (associations and lineage).

An automated, metadata-driven framework for cataloging data assets and their flows across the business provides an efficient, agile and dynamic way to generate data lineage from operational source systems (databases, data models, file-based systems, unstructured files and more) across the information management architecture; construct business glossaries; assess what data aligns with specific business rules and policies; and inform how that data is transformed, integrated and federated throughout business processes – complete with full documentation.
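As a rough illustration of the idea – with hypothetical systems, tables and mappings, not a real erwin interface – such a metadata-driven catalog boils down to harvesting asset metadata from source systems and recording mapping definitions as lineage edges:

```python
# Sketch of a metadata-driven catalog: harvest column metadata from source
# systems and derive lineage edges from mapping definitions. The system,
# table and column names here are hypothetical examples.

catalog = {}          # asset name -> harvested metadata
lineage_edges = []    # (source asset, target asset) pairs

def harvest(system, tables):
    """Register each table's metadata in the central catalog."""
    for table, columns in tables.items():
        catalog[f"{system}.{table}"] = {"system": system, "columns": columns}

def register_mapping(source, target):
    """Record a source-to-target mapping as a lineage edge."""
    lineage_edges.append((source, target))

harvest("crm", {"orders": ["order_id", "customer_id", "amount"]})
harvest("warehouse", {"fact_sales": ["order_id", "amount_usd"]})
register_mapping("crm.orders", "warehouse.fact_sales")

print(sorted(catalog))
print(lineage_edges)
```

A production framework would populate both structures automatically from database schemas, models and mapping documents rather than by hand.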

Centralized design, immediate lineage and impact analysis, and change-activity logging mean you will always have answers readily available, or just a few clicks away. Subsets of data can be identified and generated via predefined templates, and generic designs generated from standard mapping documents can be pushed through the ETL process via automation templates for faster processing.

With automation, data quality is systemically assured and the data pipeline is seamlessly governed and operationalized to the benefit of all stakeholders. Without such automation, business transformation will be stymied. Companies, especially large ones with thousands of systems, files and processes, will be particularly challenged by a manual approach. And outsourcing these data management efforts to professional services firms only increases costs and schedule delays.

With erwin Mapping Manager, organizations can automate enterprise data mapping and code generation for faster time-to-value and greater accuracy when it comes to data movement projects, as well as synchronize “data in motion” with data management and governance efforts.

Map data elements to their sources within a single repository to determine data lineage, deploy data warehouses and other Big Data solutions, and harmonize data integration across platforms. The web-based solution reduces the need for specialized, technical resources with knowledge of ETL and database procedural code, while making it easy for business analysts, data architects, ETL developers, testers and project managers to collaborate for faster decision-making.


Categories
erwin Expert Blog Data Governance

Data Governance & GDPR: How it Will Affect Your Business

If you’re a data professional, data governance and GDPR are likely at the top of your agenda right now.

Because if your organization exists within the European Union (EU) or trades with the EU, the General Data Protection Regulation (GDPR) will affect your operations.

Despite this fact, only 6% of organizations say they are “completely prepared” ahead of the mandate’s May 25 effective date, according to the 2018 State of Data Governance Report.

Perhaps some solace can be found in that 39% of those surveyed for the report indicate they are “somewhat prepared,” with 27% starting preparations.

But 11% indicate they are “not prepared at all,” and the most damning of revelations is that 17% of organizations believe GDPR does not affect them.

I’m afraid these folks and their organizations are misguided because any company in any industry is within GDPR’s reach. Even if only one EU citizen’s data is included within an organization’s database(s), compliance is mandatory.

So it’s important for organizations to understand exactly what they need to do before the deadline and the potential fines of up to €20 million or 4% of annual turnover, whichever is greater.

How Does GDPR Affect My Business?

With the advent of any new regulation, it’s crucial that organizations know which elements of their organization are affected and what they need to do to stay compliant. Regarding the latter, the GDPR requires organizations to have a comprehensive and effective data governance strategy. In terms of the areas affected, organizations need to be aware of the following:

Personally Identifiable Information (PII)

GDPR introduces tighter regulations around the storage, management and transfer of PII. According to the GDPR, personal data is any information related to a person such as a name, a photo, an email address, bank details, updates on social networking websites, location details, medical information, or a computer IP address.

Personal data also comes in many forms and extends to the combination of different data elements that individually are not PII but contribute to PII status when consolidated.

Data governance allows organizations to more easily identify and classify PII and in turn, introduce appropriate measures to keep it safe.

Therefore, a good data governance solution should enable organizations to add and manage metadata – the data about data – regarding a unit of data’s sensitivity. It should also have strong data discoverability capabilities, and the ability to control access to data through user-based permissions.
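As an illustration of how such metadata can drive user-based permissions, here is a minimal sketch; the sensitivity classifications and role permissions below are illustrative assumptions, not a specific product’s model:

```python
# Sketch of metadata-driven access control: tag data elements with a
# sensitivity classification and filter what each role may see.
# Classifications, elements and roles are hypothetical.

metadata = {
    "customer.email":  {"sensitivity": "PII"},
    "customer.region": {"sensitivity": "internal"},
    "sales.total":     {"sensitivity": "public"},
}

# role -> sensitivities that role is permitted to view
permissions = {
    "analyst": {"public", "internal"},
    "steward": {"public", "internal", "PII"},
}

def visible_elements(role):
    """Return the data elements a given role may access."""
    allowed = permissions.get(role, set())
    return sorted(k for k, v in metadata.items() if v["sensitivity"] in allowed)

print(visible_elements("analyst"))
```

Here the analyst sees only non-PII elements, while the steward role is trusted with the full set.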

Active Consent, Data Processing and the Right to Be Forgotten

GDPR also strengthens the conditions for consent, which must be clear and distinguishable from other matters and provided in an intelligible and easily accessible form, using clear and plain language. It must be as easy to withdraw consent as it is to give it.​

Data subjects also have the right to obtain confirmation as to whether their personal data is being processed, where and for what purpose. The data controller must provide a copy of said personal data in an electronic format – free of charge. This change is a dramatic shift in data transparency and consumer empowerment.

The right to be forgotten entitles the data subject to have the data controller erase his/her personal data, cease further dissemination of the data, and potentially have third parties halt processing of the data.

The information and processes required to address these restrictions can be found in the metadata and managed via metadata management tools – a key facet of data governance. Better management of such metadata is key to optimizing an organization’s data processing capabilities. Without such optimization, compliance with the GDPR-granted “right to be forgotten” can become too complex to effectively manage.


Documenting Compliance and Data Breaches

GDPR also looks to curb data breaches that have become more extensive and frequent in recent years. Data’s value has sky-rocketed, making data-driven businesses targets of cyber threats.

Organizations must document what data they have, where it resides, the controls in place to protect it, and the measures that will be taken to address mistakes/breaches. In fact, data breach notification is mandatory within 72 hours if that breach is likely to “result in risk for the rights and freedoms of individuals.”

A comprehensive data governance strategy encompasses and enables the documentation process outlined above. Moreover, a data governance strategy decreases the likelihood of such breaches occurring, as it provides organizations with greater insight into which data should be more closely guarded.

Data Governance and GDPR Compliance

Based on the results of the State of DG Report referenced at the beginning of this post, organizations aren’t as GDPR-ready as they should be. But there’s still time to act.

Data governance and GDPR go hand in hand. A strong data governance program is critical to the data visibility and categorization needed for GDPR compliance. And it will help in assessing and prioritizing data risks and enable easier verification of compliance with GDPR auditors.

Data governance enables an organization to discover, understand, govern and socialize its data assets – not just within IT but across the entire organization. Not only does it encompass data’s current iteration but also its entire lineage and connections through the data ecosystem.

Understanding data lineage is absolutely necessary in the context of GDPR. Take the right to be forgotten, for example. Such compliance requires an organization to locate all of an individual’s PII and any information that can be cross-referenced with other data points to become PII.
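As a simplified sketch of that lookup – with hypothetical systems and fields – a metadata catalog that flags PII and linkable identifiers can enumerate every store that must be checked for an erasure request:

```python
# Sketch: use a metadata catalog to locate every store holding a data
# subject's PII for a "right to be forgotten" request. Systems and
# column names are hypothetical.

# asset -> columns flagged as PII or as linkable identifiers
pii_catalog = {
    "crm.customers":    ["email", "name"],
    "billing.invoices": ["email"],
    "web.analytics":    ["ip_address"],  # linkable to a person when combined
    "sales.totals":     [],              # no personal data
}

def erasure_targets(catalog):
    """List every asset that must be checked for the subject's data."""
    return sorted(asset for asset, pii_cols in catalog.items() if pii_cols)

print(erasure_targets(pii_catalog))
```

Without this kind of catalog, the same search would require manually auditing every system that might hold personal data.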

With the right data governance approach and supporting technology, organizations can ensure GDPR compliance with their current, as-is architecture and data assets – and ensure new data sources and/or changes to the to-be architecture incorporate the appropriate controls.

Stakeholders across the enterprise need to be GDPR aware and enabled so that compliance is built in at a cultural level.

For more information about increasing your expertise in relation to data governance and GDPR, download our guide to managing GDPR with data governance.


Categories
erwin Expert Blog Data Governance

The Top Five Data Governance Use Cases and Drivers

As the applications for data have grown, so too have the data governance use cases. And the legacy, IT-only approach to data governance, Data Governance 1.0, has made way for the collaborative, enterprise-wide Data Governance 2.0.

In addition to increasing data applications, Data Governance 1.0’s decline is being hastened by recurrent failings in its implementation. Leaving it to IT, with no input from the wider business, ignores the desired business outcomes and the opportunities to contribute to and speed their accomplishment. Lack of input from the departments that use the data also causes data quality and completeness to suffer.

So Data Governance 1.0 was destined to fail in yielding a significant return. But changing regulatory requirements and mega-disruptors effectively leveraging data have spawned new interest in making data governance work.

The 2018 State of Data Governance Report indicates that 98% of organizations consider data governance important. Furthermore, 66% of respondents say that understanding and governing enterprise assets has become more or very important for their executives.

Below, we consider the primary data governance use cases and drivers as outlined in this report.

The Top 5 Data Governance Use Cases

1. Changing Regulatory Requirements

Changing regulations are undoubtedly the biggest driver for data governance. The European Union’s General Data Protection Regulation (GDPR) will soon take effect, and it’s the first attempt at a near-global, uniform approach to regulating the way organizations use and store data.

Data governance is mandatory under the new law, and failure to comply will leave organizations liable for huge fines – up to €20 million or 4% of the company’s global annual turnover. For context, GDPR fines could wipe off two percentage points of revenue from Google parent company, Alphabet.

Although 60% of the organizations surveyed for the State of DG Report indicate that regulatory compliance is the key driver for implementing data governance, only 6% of enterprises are prepared for GDPR with less than four months to go.

But data governance use cases go beyond just compliance.

2. Customer Satisfaction

Another primary driver for data governance is improving customer satisfaction, with 49% of our survey respondents citing it.

A Data Governance 2.0 approach is paramount to this use case and should be strong justification to secure C-level buy-in. In fact, the correlation between effective data governance and customer satisfaction is clear. A 2017 report from Aberdeen Group shows that users at organizations with more effective data governance programs are far happier with:

  • The business’ ability to share data (66% of Data Governance Leaders vs. 21% of Data Governance Followers)
  • Data systems’ ease of use (64% vs. 24%)
  • Speed of information delivery (61% vs. 18%)

3. Decision-Making

Another data governance use case as indicated by the State of DG Report is improved decision-making. Forty-five percent of respondents identify it as the third key driver, and for good reason.

Data governance success manifests itself as well-defined data that is consistent throughout the business, understood across departments, and used to pull the business in the desired direction. It also improves the quality of the data.

By moving data governance out of its IT silo, the employees responsible for business outcomes are part of its governance. This collaboration makes data more discoverable, more insightful and more contextual.

The decision-making process becomes more efficient as the velocity at which data can be interpreted increases. The organization can also better interpret and trust the information it is using to determine its course.

4. Reputation Management

In the survey behind the State of DG Report, 30% of respondents name reputation management as a driver for DG’s implementation.

We’ve seen it time and time again with high-profile data breaches afflicting the likes of Equifax, Uber and Yahoo. All were met with costly PR fallout. For example, Equifax’s breach had a price tag of $90 million, as of November 2017.

So the discrepancy between the 60% who cite regulatory compliance and the 30% who cite reputation management as DG drivers is interesting. One could argue they are the same; both call for data governance to help prevent, or at least limit, damaging breaches.

The difference might come down to smaller businesses that believe they have less brand equity to maintain. They, as well as some of their larger counterparts, have taken a reactionary approach to data governance. But GDPR should now encourage more proactive data governance across the board.

In terms of data governance use cases for managing the risk of data breaches, consider that data governance, at a fundamental level, is about knowing where your data is, who’s responsible for it, and what it is supposed to be used for.

This understanding enables organizations to focus security spending on the areas of highest risk. Thus, they can take a more cost-effective but thorough approach to risk management.

5. Analytics and Big Data

Analytics and Big Data also were identified as key drivers for data governance, by 27% and 20% of respondents, respectively.

The need for data governance in these cases is largely driven by the amount of data businesses are now tasked with overseeing. In terms of volume, Big Data speaks for itself. Twenty-two percent of respondents in the State of DG Report manage more than 10 petabytes of data, which lines up closely with those who identify Big Data as a key driver.

However, the amount of data the average organization without a Big Data strategy consumes, stores and processes has climbed considerably in recent years.

Research indicates that 90% of the world’s data has been created just in the last two years. Globally, we generate 2.5 quintillion bytes a day. Other studies equate data’s value to that of oil, so clearly there’s a lot of value to be found.

However, the “three Vs of data” (volume, velocity, variety) tend to be positively correlated. When one increases, so do the other two. Higher volumes of data mean higher velocities of data that must be processed faster for worthwhile, valuable insights. It also means an increase in the data types – both structured and unstructured – which makes processing more difficult.

A Strong DG Foundation

A strong data governance foundation ensures data is more manageable, and therefore more valuable.

With Data Governance 2.0, data governance use cases shift from reactionary to proactive with a clear focus on business outcomes.

Although new regulations can be seen as bureaucratic and cumbersome, GDPR actually presents organizations with great opportunity – at least for those that choose to take the evolved Data Governance 2.0 path. They will benefit from an outcome-focused DG initiative that adds value beyond just regulatory compliance.

To learn more, download the complete State of Data Governance Report.
