
Top 5 Data Catalog Benefits

A data catalog benefits organizations in myriad ways. With the right data catalog tool, organizations can automate enterprise metadata management – including data cataloging, data mapping, data quality and code generation – for faster time to value and greater accuracy in data movement and deployment projects.

Data cataloging helps curate internal and external datasets for a range of content authors. Gartner says this doubles business benefits and ensures effective management and monetization of data assets in the long-term if linked to broader data governance, data quality and metadata management initiatives.

Beyond those broader initiatives, the importance of data cataloging is growing. In the regulated data world (GDPR, HIPAA, etc.), organizations need a good understanding of their data lineage – and the data catalog benefits for data lineage are substantial.

Data lineage is a core operational business component of data governance technology architecture, encompassing the processes and technology to provide full-spectrum visibility into the ways data flows across an enterprise.

There are a number of different approaches to data lineage. Here, I outline the common approach, and the approach incorporating data cataloging – including the top 5 data catalog benefits for understanding your organization’s data lineage.


Data Lineage – The Common Approach

The most common approach for assembling a collection of data lineage mappings traces data flows in reverse. The process begins with the target or data end-point and then traverses the processes, applications and ETL tasks in reverse order back from that target.

For example, to determine the mappings for the data pipelines populating a data warehouse, a data lineage tool might begin with the data warehouse and examine the ETL tasks that immediately precede the loading of the data into the target warehouse.

The data sources that feed the ETL process are added to a “task list,” and the process is repeated for each of those sources. At each stage, the discovered pieces of lineage are documented. At the end of the sequence, the process will have reverse-mapped the pipelines for populating that warehouse.
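To make the mechanics concrete, this reverse-mapping process can be pictured as a work queue that starts at the target and walks backward through whatever ETL task definitions are available. The sketch below is a minimal, hypothetical Python version; the `etl_tasks` inventory and its names are invented for illustration and stand in for whatever a real lineage tool would discover by scanning the environment.

```python
from collections import deque

# Hypothetical inventory of ETL tasks, keyed by the data set each task loads.
# A real tool would discover these by scanning ETL jobs, scripts and schedules.
etl_tasks = {
    "warehouse.sales_fact": {"task": "load_sales_fact", "sources": ["staging.sales"]},
    "staging.sales":        {"task": "stage_sales",     "sources": ["crm.orders", "erp.invoices"]},
}

def reverse_map(target):
    """Walk backward from the target, documenting each discovered mapping."""
    lineage, queue, seen = [], deque([target]), set()
    while queue:
        dataset = queue.popleft()
        if dataset in seen:
            continue
        seen.add(dataset)
        task = etl_tasks.get(dataset)
        if task is None:          # reached an origination point
            continue
        for source in task["sources"]:
            lineage.append((source, task["task"], dataset))  # source -> task -> target
            queue.append(source)  # repeat the process for each source
    return lineage

for edge in reverse_map("warehouse.sales_fact"):
    print(edge)
```

Note that the output is a point-in-time snapshot: the walk has to be rerun whenever the environment changes, which leads to the drawbacks discussed next.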

While this approach does produce a collection of data lineage maps for selected target systems, there are some drawbacks.

  • First, this approach focuses only on assembling the data pipelines populating the selected target system but does not necessarily provide a comprehensive view of all the information flows and how they interact.
  • Second, this process produces the information that can be used for a static view of the data pipelines, but the process needs to be executed on a regular basis to account for changes to the environment or data sources.
  • Third, and probably most important, this process produces a technical view of the information flow, but it does not necessarily provide any deeper insights into the semantic lineage, or how the data assets map to the corresponding business usage models.

A Data Catalog Offers an Alternate Data Lineage Approach

An alternate approach to data lineage combines data discovery and the use of a data catalog that captures data asset metadata with a data mapping framework that documents connections between the data assets.

This data catalog approach also takes advantage of automation, but in a different way: using platform-specific data connectors, the tool scans the environments where each data asset is stored and imports that asset’s metadata into the data catalog.

When data asset structures are similar, the tool can compare data element domains and value sets, and automatically create the data mapping.
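As a rough sketch of that comparison step, assume the catalog has already imported column-level metadata (names, data types and sampled value sets) for two similar assets. A minimal, hypothetical matcher might propose a mapping wherever types and value domains line up; the asset structures below are invented for illustration, and a real tool applies far richer matching logic.

```python
# Hypothetical column metadata imported into the catalog by platform connectors.
source_asset = {
    "cust_id":   {"type": "int",    "values": {1, 2, 3}},
    "cust_name": {"type": "string", "values": {"Acme", "Globex"}},
}
target_asset = {
    "customer_id":   {"type": "int",    "values": {1, 2, 3}},
    "customer_name": {"type": "string", "values": {"Acme", "Globex"}},
}

def propose_mappings(source, target):
    """Suggest source-to-target column mappings where type and value domain agree."""
    proposals = []
    for s_col, s_meta in source.items():
        for t_col, t_meta in target.items():
            same_type = s_meta["type"] == t_meta["type"]
            overlap = s_meta["values"] & t_meta["values"]
            if same_type and overlap:
                proposals.append((s_col, t_col))
    return proposals

print(propose_mappings(source_asset, target_asset))
```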

In turn, the data catalog approach performs data discovery using the same data connectors to parse the code involved in data movement, such as major ETL environments and procedural code – basically any executable task that moves data.

The information collected through this process is reverse engineered to create mappings from source data sets to target data sets based on what was discovered.

For example, you can map the databases used for transaction processing, determine that subsets of the transaction processing database are extracted and moved to a staging area, and then parse the ETL code to infer the mappings.

These direct mappings also are documented in the data catalog. In cases where the mappings are not obvious, a tool can help a data steward manually map data assets into the catalog.

The result is a data catalog that incorporates the structural and semantic metadata associated with each data asset as well as the direct mappings for how that data set is populated.


And this is a powerful representational paradigm – instead of capturing a static view of specific data pipelines, it allows a data consumer to request a dynamically assembled lineage from the documented mappings.

By interrogating the catalog, the current view of any specific data lineage can be rendered on the fly, showing all points of the lineage: the origination points, the processing stages, the sequences of transformations and the final destination.

Materializing the “current active lineage” dynamically reduces the risk of relying on an older version of the lineage that is no longer relevant or correct. When new information is added to the data catalog (such as a newly added data source or a modification to the ETL code), dynamically generated views of the lineage are kept up to date automatically.
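A minimal sketch of what materializing the current active lineage could look like, assuming the catalog stores each documented mapping as a (source, transformation, target) record; the records and names here are hypothetical.

```python
# Hypothetical mapping records as documented in the data catalog.
catalog_mappings = [
    ("crm.orders",    "stage_orders",    "staging.orders"),
    ("erp.invoices",  "stage_invoices",  "staging.invoices"),
    ("staging.orders",   "load_revenue_fact", "warehouse.revenue_fact"),
    ("staging.invoices", "load_revenue_fact", "warehouse.revenue_fact"),
]

def render_lineage(target, mappings):
    """Assemble, on demand, every path from an origination point to the target."""
    upstream = [(src, xform) for src, xform, tgt in mappings if tgt == target]
    if not upstream:                      # origination point
        return [[target]]
    paths = []
    for src, xform in upstream:
        for path in render_lineage(src, mappings):
            paths.append(path + [f"--{xform}-->", target])
    return paths

# The view is rebuilt from whatever is in the catalog right now,
# so newly registered sources or ETL changes show up automatically.
for path in render_lineage("warehouse.revenue_fact", catalog_mappings):
    print(" ".join(path))
```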

Top 5 Data Catalog Benefits for Understanding Data Lineage

A data catalog benefits data lineage in the following five distinct ways:

1. Accessibility

The data catalog approach allows the data consumer to query the tool to materialize specific data lineage mappings on demand.

2. Currency

The data lineage is rendered from the most current data in the data catalog.

3. Breadth

As the number of data assets documented in the data catalog increases, the scope of the materializable lineage expands accordingly. With all corporate data assets cataloged, any (or all!) data lineage mappings can be produced on demand.

4. Maintainability and Sustainability

Since the data lineage mappings are not managed as distinct artifacts, there are no additional requirements for maintenance. As long as the data catalog is kept up to date, the data lineage mappings can be materialized.

5. Semantic Visibility

In addition to visualizing the physical movement of data across the enterprise, the data catalog approach allows the data steward to associate business glossary terms, data element definitions, data models, and other semantic details with the different mappings. Additional visualization methods can demonstrate where business terms are used, how they are mapped to different data elements in different systems, and the relationships among these different usage points.

Additional data governance controls can be imposed via project management oversight, allowing you to designate data lineage mappings in terms of the project life cycle (such as development, test or production).

Aside from these data catalog benefits, this approach reduces the manual effort required to accumulate data lineage information and to continually review the data landscape for consistency, providing a greater return on investment for your data intelligence budget.

Learn more about data cataloging.


Using Strategic Data Governance to Manage GDPR/CCPA Complexity

In light of recent, high-profile data breaches, it’s past time we re-examined strategic data governance and its role in managing regulatory requirements.

News broke earlier this week of British Airways being fined 183 million pounds – or $228 million – by the U.K. for alleged violations of the European Union’s General Data Protection Regulation (GDPR). While not the first, it is the largest penalty levied since the GDPR went into effect in May 2018.

Given this, Oppenheimer & Co. cautions:

“European regulators could accelerate the crackdown on GDPR violators, which in turn could accelerate demand for GDPR readiness. Although the CCPA [California Consumer Privacy Act, the U.S. equivalent of GDPR] will not become effective until 2020, we believe that new developments in GDPR enforcement may influence the regulatory framework of the still fluid CCPA.”

With all the advance notice and significant chatter about GDPR/CCPA, why aren’t organizations more prepared to deal with data regulations?

In a word? Complexity.

The complexity of regulatory requirements in and of themselves is aggravated by the complexity of the business and data landscapes within most enterprises.

So it’s important to understand how to use strategic data governance to manage the complexity of regulatory compliance and other business objectives …

Designing and Operationalizing Regulatory Compliance Strategy

It’s not easy to design and deploy compliance in an environment that’s not well understood and difficult to maneuver in. First you need to analyze and design your compliance strategy and tactics, and then you need to operationalize them.

Modern, strategic data governance, which involves both IT and the business, enables organizations to plan and document how they will discover and understand their data within context, track its physical existence and lineage, and maximize its security, quality and value. It also helps enterprises put these strategic capabilities into action by:

  • Understanding their business, technology and data architectures and their inter-relationships, aligning them with their goals and defining the people, processes and technologies required to achieve compliance.
  • Creating and automating a curated enterprise data catalog, complete with physical assets, data models, data movement, data quality and on-demand lineage.
  • Activating their metadata to drive agile data preparation and governance through integrated data glossaries and dictionaries that associate policies to enable stakeholder data literacy.


Five Steps to GDPR/CCPA Compliance

With the right technology, GDPR/CCPA compliance can be automated and accelerated in these five steps:

  1. Catalog systems

Harvest, enrich/transform and catalog data from a wide array of sources to enable any stakeholder to see the interrelationships of data assets across the organization.

  2. Govern PII “at rest”

Classify, flag and socialize the use and governance of personally identifiable information regardless of where it is stored.

  3. Govern PII “in motion”

Scan, catalog and map personally identifiable information to understand how it moves inside and outside the organization and how it changes along the way.

  4. Manage policies and rules

Govern business terminology in addition to data policies and rules, depicting relationships to physical data catalogs and the applications that use them with lineage and impact analysis views.

  5. Strengthen data security

Identify regulatory risks and guide the fortification of network and encryption security standards and policies by understanding where all personally identifiable information is stored, processed and used.

How erwin Can Help

erwin is the only software provider with a complete, metadata-driven approach to data governance through our integrated enterprise modeling and data intelligence suites. We help customers overcome their data governance challenges, with risk management and regulatory compliance being primary concerns.

However, the erwin EDGE also delivers an “enterprise data governance experience” in terms of agile innovation and business transformation – from creating new products and services to keeping customers happy to generating more revenue.

Whatever your organization’s key drivers are, a strategic data governance approach – through  business process, enterprise architecture and data modeling combined with data cataloging and data literacy – is key to success in our modern, digital world.

If you’d like to get a handle on handling your data, you can sign up for a free, one-on-one demo of erwin Data Intelligence.

For more information on GDPR/CCPA, we’ve also published a white paper on the Regulatory Rationale for Integrating Data Management and Data Governance.



Choosing the Right Data Modeling Tool

The need for an effective data modeling tool is more significant than ever.

For decades, data modeling has provided the optimal way to design and deploy new relational databases with high-quality data sources and support application development. But it provides even greater value for modern enterprises where critical data exists in both structured and unstructured formats and lives both on premise and in the cloud.

In today’s hyper-competitive, data-driven business landscape, organizations are awash with data and the applications, databases and schema required to manage it.

For example, an organization may have 300 applications, with 50 different databases and a different schema for each. Additional challenges, such as increasing regulatory pressures – from the General Data Protection Regulation (GDPR) to the Health Insurance Portability and Accountability Act (HIPAA) – and growing stores of unstructured data also underscore the increasing importance of a data modeling tool.

Data modeling, quite simply, describes the process of discovering, analyzing, representing and communicating data requirements in a precise form called the data model. There’s an expression: measure twice, cut once. Data modeling is the upfront “measuring tool” that helps organizations reduce time and avoid guesswork in a low-cost environment.

From a business-outcome perspective, a data modeling tool is used to help organizations:

  • Effectively manage and govern massive volumes of data
  • Consolidate and build applications with hybrid architectures, including traditional, Big Data, cloud and on premise
  • Support expanding regulatory requirements, such as GDPR and the California Consumer Privacy Act (CCPA)
  • Simplify collaboration across key roles and improve information alignment
  • Improve business processes for operational efficiency and compliance
  • Empower employees with self-service access for enterprise data capability, fluency and accountability


Evaluating a Data Modeling Tool – Key Features

Organizations seeking to invest in a new data modeling tool should consider these four key features.

  1. Ability to visualize business and technical database structures through an integrated, graphical model.

Due to the number of database platforms available, it’s important that an organization’s data modeling tool supports a sufficiently broad array of platforms for its needs. The chosen data modeling tool should be able to read the technical formats of each of these platforms and translate them into highly graphical models rich in metadata. Schema can be deployed from models in an automated fashion and iteratively updated so that new development can take place via model-driven design.
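As a simplified illustration of deploying schema from a model, the hypothetical sketch below forward-engineers DDL from a small logical model; the model structure and type mapping are invented for illustration, and a real data modeling tool handles far more (keys, indexes, platform dialects and incremental alters).

```python
# Hypothetical logical model: entity name -> attribute name -> logical type.
model = {
    "Customer": {"customer_id": "integer", "name": "text", "signup_date": "date"},
    "Order":    {"order_id": "integer", "customer_id": "integer", "total": "decimal"},
}

# Simple logical-to-physical type mapping for one assumed target platform.
TYPE_MAP = {"integer": "INT", "text": "VARCHAR(255)", "date": "DATE", "decimal": "DECIMAL(18,2)"}

def generate_ddl(model):
    """Forward-engineer CREATE TABLE statements from the logical model."""
    statements = []
    for entity, attributes in model.items():
        cols = ",\n  ".join(f"{name} {TYPE_MAP[ltype]}" for name, ltype in attributes.items())
        statements.append(f"CREATE TABLE {entity.lower()} (\n  {cols}\n);")
    return "\n\n".join(statements)

print(generate_ddl(model))
```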

  2. Empowering end-user BI/analytics through data source discovery, analysis and integration.

A data modeling tool should give business users confidence in the information they use to make decisions. Such confidence comes from the ability to provide a common, contextual, easily accessible source of data element definitions to ensure they are able to draw upon the correct data; understand what it represents, including where it comes from; and know how it’s connected to other entities.

A data modeling tool can also be used to pull in data sources via self-service BI and analytics dashboards. The data modeling tool should also have the ability to integrate its models into whatever format is required for downstream consumption.

  3. The ability to store business definitions and data-centric business rules in the model along with technical database schemas, procedures and other information.

With business definitions and rules on board, technical implementations can be better aligned with the needs of the organization. Using an advanced design layer architecture, model “layers” can be created with one or more models focused on the business requirements that then can be linked to one or more database implementations. Design-layer metadata can also be connected from conceptual through logical to physical data models.

  4. The ability to rationalize platform inconsistencies and deliver a single source of truth for all enterprise business data.

Many organizations struggle to break down data silos and unify data into a single source of truth, due in large part to varying data sources and the difficulty of managing unstructured data. Being able to model any data from anywhere accounts for this, with on-demand modeling for non-relational databases that offer speed, horizontal scalability and other real-time application advantages.

With NoSQL support, model structures from non-relational databases, such as Couchbase and MongoDB, can be created automatically. Existing Couchbase and MongoDB data sources can be easily discovered, understood and documented through modeling and visualization. Existing entity-relationship diagrams and SQL databases can be migrated to Couchbase and MongoDB too, and relational schema can be transformed into query-optimized NoSQL constructs.
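As a rough, hypothetical illustration of transforming relational structures into query-optimized NoSQL constructs, the sketch below denormalizes a parent/child pair of relational rows into the kind of nested document MongoDB or Couchbase would store; the tables and join key are invented for illustration.

```python
# Hypothetical relational rows: orders and their line items, joined by order_id.
orders = [{"order_id": 1, "customer": "Acme"}]
order_lines = [
    {"order_id": 1, "sku": "A-100", "qty": 2},
    {"order_id": 1, "sku": "B-200", "qty": 1},
]

def to_documents(orders, order_lines):
    """Embed child rows inside the parent to form query-optimized documents."""
    docs = []
    for order in orders:
        lines = [
            {"sku": l["sku"], "qty": l["qty"]}
            for l in order_lines
            if l["order_id"] == order["order_id"]
        ]
        docs.append({**order, "lines": lines})  # one read fetches the whole order
    return docs

print(to_documents(orders, order_lines))
```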

Other considerations include the ability to:

  • Compare models and databases.
  • Increase enterprise collaboration.
  • Perform impact analysis.
  • Enable business and IT infrastructure interoperability.

When it comes to data modeling, no one knows it better. For more than 30 years, erwin Data Modeler has been the market leader. It is built on the vision and experience of data modelers worldwide and is the de-facto standard in data model integration.

You can learn more about driving business value and underpinning governance with erwin DM in this free white paper.



The Importance of EA/BP for Mergers and Acquisitions

Over the past few weeks several huge mergers and acquisitions (M&A) have been announced, including Raytheon and United Technologies, the Salesforce acquisition of Tableau and the Merck acquisition of Tilos Therapeutics.

According to collated research and a Harvard Business Review report, the M&A failure rate sits between 70 and 90 percent. Additionally, McKinsey estimates that around 70 percent of mergers do not achieve their expected “revenue synergies.”

Combining two organizations into one is complicated. And following a merger or acquisition, businesses typically find themselves with duplicate applications and business capabilities that are costly and obviously redundant, making alignment difficult.

Enterprise architecture is essential to successful mergers and acquisitions. It helps alignment by providing a business-outcome perspective for IT and guiding transformation. It also helps define strategy and models, improving interdepartmental cohesion and communication. Roadmaps can be used to provide a common focus throughout the new company, and if existing roadmaps are in place, they can be modified to fit the new landscape.

Additionally, an organization must understand both sets of processes being brought to the table. Without business process modeling, this is near impossible.

In an M&A scenario, businesses need to ensure their systems are fully documented and rationalized. This way, they can comb through their inventories to make more informed decisions about which systems to cut or phase out to operate more efficiently and then deliver the roadmap to enable those changes.


Getting Rid of Duplications

Mergers and acquisitions are daunting. Depending on the size of the businesses, hundreds of systems and processes need to be accounted for, which can be difficult, and even impossible to do in advance.

Enterprise architecture aids in rooting out process and operational duplications, making the new entity more cost efficient. Needless to say, the behind-the-scenes complexities are many and can include discovering that the merging enterprises use the same solution but under different names in different parts of the organizations, for example.

Determinations may also need to be made about whether particular functions that are expected to become business-critical have a solid, scalable base to build upon. If an existing application won’t be able to handle the increased data load and processing, then previously planned investments in it may no longer make sense.

Gaining business-wide visibility of data and enterprise architecture all within a central repository enables relevant parties across merging companies to work from a single source of information. This provides insights to help determine whether, for example, two equally adept applications of the same nature can continue to be used as the companies merge, because they share common underlying data infrastructures that make it possible for them to interoperate across a single source of synched information.

Or, in another scenario, it may be obvious that it is better to keep only one of the applications because it alone serves as the system of record for what the organization has determined are valuable conceptual data entities in its data model.

At the same time, it can reveal the location of data that might otherwise have been unwittingly discharged with the elimination of an application, enabling it to be moved to a lower-cost storage tier for potential future use.

Knowledge Retention – Avoiding Brain Drain

When employees come and go, as they tend to during mergers and acquisitions, they take critical institutional knowledge with them.

Unlocking knowledge and then putting systems in place to retain that knowledge is one key benefit of business process modeling. Knowledge retention and training has become a pivotal area in which businesses will either succeed or fail.

Different organizations tend to speak different languages. For instance, one company might refer to a customer as “customer,” while another might refer to them as a “client.” Business process modeling is a great way to get everybody in the organization using the same language, referring to things in the same way.

Drawing out this knowledge then allows a centralized and uniform process to be adopted across the company. In any department within any company, individuals and teams develop processes for doing things. Business process modeling extracts all these pieces of information from individuals and teams so they can be turned into centrally adopted processes.

 

[FREE EBOOK] Application Portfolio Management For Mergers & Acquisitions

Ensuring Compliance

Industry and government regulations affect businesses that work in or do business with any number of industries or in specific geographies. Industry-specific regulations in areas like healthcare, pharmaceuticals and financial services have been in place for some time.

Now, broader mandates like the European Union’s General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) require businesses across industries to think about their compliance efforts. Business process modeling helps organizations prove what they are doing to meet compliance requirements and understand how changes to their processes impact compliance efforts (and vice versa).

In highly regulated industries like financial services and pharmaceuticals, where mergers and acquisitions activity is frequent, identifying and standardizing business processes helps withstand the scrutiny of regulatory compliance.

Business process modeling makes it easier to document processes, align documentation within document control and learning management systems, and give R&D employees easy access and intuitive navigation so they can find the information they need.

Introducing Business Architecture

Organizations often interchange the terms “business process” and “enterprise architecture” because both are strategic functions with many interdependencies.

However, business process architecture defines the elements of a business and how they interact with the aim of aligning people, processes, data, technologies and applications. Enterprise architecture defines the structure and operation of an organization with the purpose of determining how it can achieve its current and future objectives most effectively, translating those goals into a blueprint of IT capabilities.

Although both disciplines seek to achieve the organization’s desired outcomes, both have largely operated in silos.

To learn more about how erwin provides modeling and analysis software to support both business process and enterprise architecture practices and enable their broader collaboration, click here.



A Guide to CCPA Compliance and How the California Consumer Privacy Act Compares to GDPR

California Consumer Privacy Act (CCPA) compliance shares many of the same requirements as the European Union’s General Data Protection Regulation (GDPR).

While the CCPA has been signed into law, organizations have until Jan. 1, 2020, to enact its mandates. Luckily, many organizations have already laid the regulatory groundwork for it because of their efforts to comply with GDPR.

However, there are some key differences that we’ll explore in the Q&A below.

Data governance, thankfully, provides a framework for compliance with either or both – in addition to other regulatory mandates your organization may be subject to.

CCPA Compliance Requirements vs. GDPR FAQ

Does CCPA apply to not-for-profit organizations? 

No, CCPA compliance only applies to for-profit organizations. GDPR compliance is required for any organization, public or private (including not-for-profit).

What for-profit businesses does CCPA apply to?

The mandate for CCPA compliance only applies if a for-profit organization:

  • Has an annual gross revenue exceeding $25 million
  • Collects, sells or shares the personal data of 50,000 or more consumers, households or devices
  • Earns 50% or more of its annual revenue by selling consumers’ personal information

Does the CCPA apply outside of California?

As the name suggests, the legislation is designed to protect the personal data of consumers who reside in the state of California.

But like GDPR, CCPA compliance has impacts outside the area of origin. This means businesses located outside of California, but selling to (or collecting the data of) California residents must also comply.

Does the CCPA exclude anything that GDPR doesn’t? 

GDPR encompasses all categories of “personal data,” with no distinctions.

CCPA does make distinctions, particularly when other regulations may overlap. These include:

  • Medical information covered by the Confidentiality of Medical Information Act (CMIA) and the Health Insurance Portability and Accountability Act (HIPAA)
  • Personal information covered by the Gramm-Leach-Bliley Act (GLBA)
  • Personal information covered by the Driver’s Privacy Protection Act (DPPA)
  • Clinical trial data
  • Information sold to or by consumer reporting agencies
  • Publicly available personal information (federal, state and local government records)

What about access requests? 

Under the GDPR, organizations must make any personal data collected from an EU citizen available upon request.

CCPA compliance only requires data collected within the last 12 months to be shared upon request.

Does the CCPA include the right to opt out?

CCPA, like GDPR, gives consumers/citizens the right to opt out in regard to the processing of their personal data.

However, CCPA compliance only requires an organization to observe an opt-out request when it comes to the sale of personal data. GDPR does not make any distinctions between “selling” personal data and any other kind of data processing.

To meet CCPA compliance opt-out standards, organizations must provide a “Do Not Sell My Personal Information” link on their home pages.

Does the CCPA require individuals to willingly opt in?

No. Whereas the GDPR requires informed consent before an organization sells an individual’s information, organizations under the scope of the CCPA can still assume consent. The only exception involves the personal information of children (under 16). Children between 13 and 16 can consent themselves, but if the consumer is a child under 13, a parent or guardian must authorize the sale of said child’s personal data.

What about fines for CCPA non-compliance? 

In theory, fines for CCPA non-compliance are potentially more far reaching than those of GDPR because there is no ceiling for CCPA penalties. Under GDPR, penalties have a ceiling of 4% of global annual revenue or €20 million, whichever is greater. GDPR recently resulted in a record fine for Google.

Organizations outside of CCPA compliance can be fined up to $7,500 per violation, and because there is no cap on the number of violations, there is no upper ceiling on total penalties.


Data Governance for Regulatory Compliance

While CCPA has a narrower geography and focus than GDPR, compliance is still a serious effort for organizations under its scope. And as data-driven business continues to expand, so too will the pressure on lawmakers to regulate how organizations process data. Remember the Facebook hearings and now the inquiries into Google and Twitter, for example?

Regulatory compliance remains a key driver for data governance. After all, to understand how to meet data regulations, an organization must first understand its data.

An effective data governance initiative should enable just that, by giving an organization the tools to:

  • Discover data: Identify and interrogate metadata from various data management silos
  • Harvest data: Automate the collection of metadata from various data management silos and consolidate it into a single source
  • Structure data: Connect physical metadata to specific business terms and definitions and reusable design standards
  • Analyze data: Understand how data relates to the business and what attributes it has
  • Map data flows: Identify where to integrate data and track how it moves and transforms
  • Govern data: Develop a governance model to manage standards and policies and set best practices
  • Socialize data: Enable all stakeholders to see data in one place in their own context

A Regulatory EDGE

The erwin EDGE software platform creates an “enterprise data governance experience” to transform how all stakeholders discover, understand, govern and socialize data assets. It includes enterprise modeling, data cataloging and data literacy capabilities, giving organizations visibility and control over their disparate architectures and all the supporting data.

Both IT and business stakeholders have role-based, self-service access to the information they need to collaborate in making strategic decisions. And because many of the associated processes can be automated, you reduce errors and increase the speed and quality of your data pipeline. This data intelligence unlocks knowledge and value.

The erwin EDGE provides the most agile, efficient and cost-effective means of launching and sustaining a strategic and comprehensive data governance initiative, whether you wish to deploy on premise or in the cloud. But you don’t have to implement every component of the erwin EDGE all at once to see strategic value.

Because of the platform’s federated design, you can address your organization’s most urgent needs, such as regulatory compliance, first. Then you can proactively address other organization objectives, such as operational efficiency, revenue growth, increasing customer satisfaction and improving overall decision-making.

You can learn more about leveraging data governance to navigate the changing tide of data regulations here.



Keeping Up with New Data Protection Regulations

Keeping up with new data protection regulations can be difficult, and the latest – the General Data Protection Regulation (GDPR) – isn’t the only new data protection regulation organizations should be aware of.

California recently passed a law that gives residents the right to control the data companies collect about them. Some suggest the California Consumer Privacy Act (CCPA), which takes effect January 1, 2020, sets a precedent other states will follow by empowering consumers to set limits on how companies can use their personal information.

In fact, organizations should expect increasing pressure on lawmakers to introduce new data protection regulations. A number of high-profile data breaches and scandals have increased public awareness of the issue.

Facebook was in the news again last week for another major problem around the transparency of its user data, and the tech-giant also is reportedly facing 10 GDPR investigations in Ireland – along with Apple, LinkedIn and Twitter.

Some industries, such as healthcare and financial services, have been subject to stringent data regulations for years: GDPR now joins the Health Insurance Portability and Accountability Act (HIPAA), the Payment Card Industry Data Security Standard (PCI DSS) and the Basel Committee on Banking Supervision (BCBS).

Due to these pre-existing regulations, organizations operating within these sectors, as well as insurance, had some of the GDPR compliance bases covered in advance.

Other industries had their own levels of preparedness, based on the nature of their operations. For example, many retailers have robust, data-driven e-commerce operations that are international. Such businesses are bound to comply with varying local standards, especially when dealing with personally identifiable information (PII).

Smaller, more brick-and-mortar-focused retailers may have had to start from scratch.

But starting position aside, every data-driven organization should strive for a better standard of data management – and not just for compliance’s sake. After all, organizations are now realizing that data is one of their most valuable assets.

New Data Protection Regulations – Always Be Prepared

When it comes to new data protection regulations in the face of constant data-driven change, it’s a matter of when, not if.

As they say, the best defense is a good offense. Fortunately, whenever the time comes, the first port of call will always be data governance, so organizations can prepare.

Effective compliance with new data protection regulations requires a robust understanding of the “what, where and who” in terms of data and the stakeholders with access to it (i.e., employees).


This is also true for existing data regulations. Compliance is an on-going requirement, so efforts to become compliant should not be treated as static events.

Less than four months before GDPR came into effect, only 6 percent of enterprises claimed they were prepared for it. Many of these organizations will recall a number of stressful weeks – or even months – tidying up their databases and their data management processes and policies.

This time and money was spent reactively, at the expense of proactive efforts to grow the business.

The implementation and subsequent observation of a strong data governance initiative ensures organizations won’t be put on the spot going forward. Should an audit come up, current projects won’t suddenly be derailed by a reenactment of pre-GDPR panic.


Data Governance: The Foundation for Compliance

The first step to compliance with new – or old – data protection regulations is data governance.

A robust and effective data governance initiative ensures an organization understands where security should be focused.

By adopting a data governance platform that enables you to automatically tag sensitive data and track its lineage, you can ensure nothing falls through the cracks.

Your chosen data governance solution should enable you to automate the scanning, detection and tagging of sensitive data by:

  • Monitoring and controlling sensitive data – Gain better visibility and control across the enterprise to identify data security threats and reduce associated risks.
  • Enriching business data elements for sensitive data discovery – By leveraging a comprehensive mechanism to define business data elements for PII, PHI and PCI across database systems, cloud and Big Data stores, you can easily identify sensitive data based on a set of algorithms and data patterns (a simplified example follows this list).
  • Providing metadata and value-based analysis – Simplify the discovery and classification of sensitive data based on metadata and data value patterns and algorithms. Organizations can define business data elements and rules to identify and locate sensitive data, including PII, PHI and PCI.
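Here is a simplified, hypothetical example of that pattern-based detection, assuming column names and sampled values are available from the catalog; the rules and data are illustrative only, and production classifiers combine many more patterns, algorithms and validation steps.

```python
import re

# Illustrative detection rules: business data element -> name hints and value pattern.
PII_RULES = {
    "email":  {"name_hint": re.compile(r"mail", re.I),
               "value": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")},
    "us_ssn": {"name_hint": re.compile(r"ssn|social", re.I),
               "value": re.compile(r"^\d{3}-\d{2}-\d{4}$")},
}

def classify_column(column_name, sample_values):
    """Tag a column as sensitive if its name or sampled values match a rule."""
    tags = set()
    for tag, rule in PII_RULES.items():
        name_match = bool(rule["name_hint"].search(column_name))
        value_hits = sum(bool(rule["value"].match(v)) for v in sample_values)
        if name_match or (sample_values and value_hits / len(sample_values) > 0.8):
            tags.add(tag)
    return tags

print(classify_column("contact_email", ["jane@example.com", "joe@example.org"]))
print(classify_column("tax_ref", ["123-45-6789", "987-65-4321"]))
```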

With these precautionary steps, organizations are primed to respond if a data breach occurs. Having a well governed data ecosystem with data lineage capabilities means issues can be quickly identified.

Additionally, if any follow-up is necessary – such as with GDPR’s data breach reporting time requirements – it can be handled swiftly and in accordance with regulations.

It’s also important to understand that the benefits of data governance don’t stop with regulatory compliance.

A better understanding of what data you have, where it’s stored and the history of its use and access isn’t only beneficial in fending off non-compliance repercussions. In fact, such an understanding is arguably better put to use proactively.

Data governance improves data quality standards, enables better decision-making and ensures businesses can have more confidence in the data informing those decisions.

The same mechanisms that protect data by controlling its access also can be leveraged to make data more easily discoverable to approved parties – improving operational efficiency.

All in all, the cumulative result of data governance’s influence on data-driven businesses both drives revenue (through greater efficiency) and reduces costs (fewer errors, false starts, etc.).

To learn more about data governance and the regulatory rationale for its implementation, get our free guide here.



Data Mapping Tools: What Are the Key Differentiators?

The need for data mapping tools in light of increasing volumes and varieties of data – as well as the velocity at which it must be processed – is growing.

It’s not difficult to see why either. Data mapping tools have always been a key asset for any organization looking to leverage data for insights.

Isolated units of data are essentially meaningless. By linking data and enabling its categorization in relation to other data units, data mapping provides the context vital for actionable information.

Now with the General Data Protection Regulation (GDPR) in effect, data mapping has become even more significant.

The scale of GDPR’s reach has set a new precedent and is the closest we’ve come to a global standard in terms of data regulations. The repercussions can be huge – just ask Google.

Data mapping tools are paramount in charting a path to compliance with this new, near-global standard and avoiding hefty fines.

Because of GDPR, organizations that may not have fully leveraged data mapping for proactive data-driven initiatives (e.g., analysis) are now adopting data mapping tools with compliance in mind.

Arguably, GDPR’s implementation can be viewed as an opportunity – a catalyst for digital transformation.

Those organizations investing in data mapping tools with compliance as the main driver will definitely want to consider this opportunity and have it influence their decision as to which data mapping tool to adopt.

With that in mind, it’s important to understand the key differentiators in data mapping tools and the associated benefits.


Data Mapping Tools: Automated or Manual?

In terms of differentiators for data mapping tools, perhaps the most distinct is automated data mapping versus data mapping via manual processes.

Data mapping tools that allow for automation mean organizations can benefit from in-depth, quality-assured data mapping, without the significant allocations of resources typically associated with such projects.

Eighty percent of data scientists’ and other data professionals’ time is spent on manual data maintenance. That’s anything and everything from addressing errors and inconsistencies to trying to understand source data or track its lineage. This doesn’t even account for the time lost to missed errors that contribute to inherently flawed endeavors.

Automated data mapping tools render such issues and concerns void. In turn, data professionals’ time can be put to much better, proactive use, rather than being bogged down with reactive housekeeping tasks.

FOUR INDUSTRY FOCUSSED CASE STUDIES FOR AUTOMATED METADATA-DRIVEN AUTOMATION
(BFSI, PHARMA, INSURANCE AND NON-PROFIT)

As well as introducing greater efficiency to the data governance process, automated data mapping tools enable data to be auto-documented from XML, building the mappings for the target repository or reporting structure.

Additionally, a tool that leverages and draws from a single metadata repository means that mappings are dynamically linked with underlying metadata to render automated lineage views, including full transformation logic in real time.

Therefore, changes (e.g., in the data catalog) will be reflected across data governance domains (business process, enterprise architecture and data modeling) as and when they’re made – no more juggling and maintaining multiple, out-of-date versions.

It also enables automatic impact analysis at the table and column level – even for business/transformation rules.
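As a hypothetical sketch of column-level impact analysis over such a mappings repository, the snippet below walks forward from a changed source column to every downstream column that depends on it; the mapping records and rules are invented for illustration.

```python
# Hypothetical column-level mappings: (source column, transformation rule, target column).
column_mappings = [
    ("crm.orders.amount",      "CAST(amount AS DECIMAL)", "staging.orders.amount"),
    ("staging.orders.amount",  "SUM(amount)",             "warehouse.sales.total_amount"),
    ("warehouse.sales.total_amount", "passthrough",       "report.revenue.total"),
]

def impact_of(changed_column, mappings):
    """Return every downstream column (and the rule involved) affected by a change."""
    affected, frontier, seen = [], [changed_column], set()
    while frontier:
        current = frontier.pop()
        if current in seen:
            continue
        seen.add(current)
        for src, rule, tgt in mappings:
            if src == current:
                affected.append((tgt, rule))
                frontier.append(tgt)
    return affected

for target, rule in impact_of("crm.orders.amount", column_mappings):
    print(f"{target}  (via {rule})")
```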

For organizations looking to free themselves from the burden of juggling multiple versions, siloed business processes and a disconnect between interdepartmental collaboration, this feature is a key benefit to consider.

Data Mapping Tools: Other Differentiators

In light of the aforementioned changes to data regulations, many organizations will need to consider the extent of a data mapping tool’s data lineage capabilities.

The ability to reverse-engineer and document the business logic from your reporting structures for true source-to-report lineage is key because it makes analysis (and the trust in said analysis) easier. And should a data breach occur, affected data/persons can be more quickly identified in accordance with GDPR.

Article 33 of GDPR requires organizations to notify the appropriate supervisory authority “without undue delay and, where feasible, not later than 72 hours” after discovering a breach.

As stated above, a data governance platform that draws from a single metadata source is even more advantageous here.

Mappings can be synchronized with metadata so that source or target metadata changes can be automatically pushed into the mappings – so your mappings stay up to date with little or no effort.

The Data Mapping Tool For Data-Driven Businesses

Nobody likes manual documentation. It’s arduous, error-prone and a waste of resources. Quite frankly, it’s dated.

Any organization looking to invest in data mapping, data preparation and/or data cataloging needs to make automation a priority.

With automated data mapping, organizations can achieve “true data intelligence” – the ability to tell the story of how data enters the organization and changes throughout the entire lifecycle, right up to the consumption/reporting layer. If you’re working harder than your tool, you have the wrong tool.

The manual tools of old do not have auto documentation capabilities, cannot produce outbound code for multiple ETL or script types, and are a liability in terms of accuracy and GDPR.

Automated data mapping is the only path to true GDPR compliance, and erwin Mapping Manager can get you there in a matter of weeks thanks to our robust reverse-engineering technology. 

Learn more about erwin’s automation framework for data governance here.



Data Governance Stock Check: Using Data Governance to Take Stock of Your Data Assets

For regulatory compliance (e.g., GDPR) and to ensure peak business performance, organizations often bring consultants on board to help take stock of their data assets. This sort of data governance “stock check” is important but can be arduous without the right approach and technology. That’s where data governance comes in …

While most companies hold the lion’s share of operational data within relational databases, it also can live in many other places and various other formats. Therefore, organizations need the ability to manage any data from anywhere, what we call our “any-squared” (Any2) approach to data governance.

Any2 first requires an understanding of the ‘3Vs’ of data – volume, variety and velocity – especially in the context of the data lifecycle, as well as knowing how to leverage the key capabilities of data governance – data cataloging, data literacy, business process, enterprise architecture and data modeling – that enable data to be leveraged at different stages for optimum security, quality and value.

Following are two examples that illustrate the data governance stock check, including the Any2 approach in action, based on real consulting engagements.


Data Governance “Stock Check” Case 1: The Data Broker

This client trades in information. Therefore, the organization needed to catalog the data it acquires from suppliers, ensure its quality, classify it, and then sell it to customers. The company wanted to assemble the data in a data warehouse and then provide controlled access to it.

The first step in helping this client involved taking stock of its existing data. We set up a portal so data assets could be registered via a form with basic questions, and then a central team received the registrations, reviewed and prioritized them. Entitlement attributes also were set up to identify and profile high-priority assets.

A number of best practices and technology solutions were used to establish the data required for managing the registration and classification of data feeds:

1. The underlying metadata is harvested, followed by an initial quality check. Then the metadata is classified against a semantic model held in a business glossary (a step sketched in code after this list).

2. After this classification, a second data quality check is performed based on the best-practice rules associated with the semantic model.

3. Profiled assets are loaded into a historical data store within the warehouse, with data governance tools generating its structure and data movement operations for data loading.

4. We developed a change management program to make all staff aware of the information brokerage portal and the importance of using it. The portal uses a catalog of data assets, all classified against a semantic model with data quality metrics, making it easy to understand where data assets are located within the data warehouse.

5. Adopting this portal, where data is registered and classified against an ontology, enables the client’s customers to shop for data by asset or by meaning (e.g., “what data do you have on X topic?”) and then drill down through the taxonomy or across an ontology. Next, they raise a request to purchase the desired data.
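For illustration, the classification performed in steps 1 and 2 can be thought of as matching harvested column metadata against the glossary’s semantic model and then applying its best-practice rules. The sketch below is a hypothetical, heavily simplified version; the glossary terms, synonyms and quality rule are invented stand-ins for real business glossary content.

```python
# Hypothetical business glossary: term -> synonyms used to classify harvested metadata.
glossary = {
    "Customer Identifier": {"cust_id", "customer_id", "client_id"},
    "Email Address":       {"email", "email_addr", "contact_email"},
}

# Harvested metadata for one registered data feed (column name -> sampled values).
harvested = {
    "client_id":     ["C-001", "C-002"],
    "contact_email": ["jane@example.com", "not-an-email"],
}

def classify_and_check(harvested, glossary):
    """Classify each column against the glossary, then run a simple quality rule."""
    results = {}
    for column, samples in harvested.items():
        term = next((t for t, names in glossary.items() if column in names), "UNCLASSIFIED")
        # Illustrative best-practice rule tied to the semantic model:
        # email addresses must contain '@'.
        quality_ok = all("@" in v for v in samples) if term == "Email Address" else True
        results[column] = {"term": term, "quality_ok": quality_ok}
    return results

print(classify_and_check(harvested, glossary))
```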

This consulting engagement and technology implementation increased data accessibility and capitalization. Information is registered within a central portal through an approved workflow, and then customers shop for data either from a list of physical assets or by information content, with purchase requests also going through an approval workflow. This, among other safeguards, ensures data quality.


Data Governance “Stock Check” Case 2: Tracking Rogue Data

This client has a geographically dispersed organization that stored many of its key processes in Microsoft Excel spreadsheets. They were planning to move to Office 365 and were concerned about regulatory compliance, including GDPR mandates.

Knowing that electronic documents are heavily used in key business processes and distributed across the organization, this company needed to replace risky manual processes with centralized, automated systems.

A key part of the consulting engagement was to understand what data assets were in circulation and how they were used by the organization. Then process chains could be prioritized for automation, and specifications outlined for the systems to replace them.

This organization also adopted a central portal that allowed employees to register data assets. The associated change management program raised awareness of data governance across the organization and the importance of data registration.

For each asset, information was captured and reviewed as part of a workflow. Prioritized assets were then chosen for profiling, enabling metadata to be reverse-engineered before being classified against the business glossary.

Additionally, assets that were part of a process chain were gathered and modeled with enterprise architecture (EA) and business process (BP) modeling tools for impact analysis.

High-level requirements for new systems then could be defined again in the EA/BP tools and prioritized on a project list. For the others, decisions could be made on whether they could safely be placed in the cloud and whether macros would be required.

In this case, the adoption of purpose-built data governance solutions helped build an understanding of the data assets in play, including information about their usage and content to aid in decision-making.

This client then had a good handle on the “what” and “where” in terms of sensitive data stored in their systems. They also better understood how this sensitive data was being used and by whom, helping reduce regulatory risks like those associated with GDPR.

In both scenarios, we cataloged data assets and mapped them to a business glossary, which acts as a classification scheme to help govern and locate data, making it both more accessible and valuable. This governance framework reduces risk and protects an organization’s most valuable or sensitive data assets.

Focused on producing meaningful business outcomes, the erwin EDGE platform was pivotal in achieving these two clients’ data governance goals – including the infrastructure to undertake a data governance stock check. They used it to create an “enterprise data governance experience” not just for cataloging data and other foundational tasks, but also for a competitive “EDGE” in maximizing the value of their data while reducing data-related risks.

To learn more about the erwin EDGE data governance platform and how it aids in undertaking a data governance stock check, register for our free, 30-minute demonstration here.


Google’s Record GDPR Fine: Avoiding This Fate with Data Governance

The General Data Protection Regulation (GDPR) made its first real impact as Google’s record GDPR fine dominated news cycles.

Historically, fines had peaked at six figures with the U.K.’s Information Commissioner’s Office (ICO) fines of 500,000 pounds ($650,000 USD) against both Facebook and Equifax for their data protection breaches.

Experts predicted an uptick in GDPR enforcement in 2019, and Google’s recent record GDPR fine has brought that to fruition. France’s data privacy enforcement agency hit the tech giant with a $57 million penalty – more than 80 times the steepest ICO fine.

If it can happen to Google, no organization is safe. Many in fact still lag in the GDPR compliance department. Cisco’s 2019 Data Privacy Benchmark Study reveals that only 59 percent of organizations are meeting “all or most” of GDPR’s requirements.

So many more GDPR violations are likely to come to light. And even organizations that are currently compliant can’t afford to let their data governance standards slip.

Data Governance for GDPR

Google’s record GDPR fine makes the rationale for better data governance clear enough. However, the Cisco report offers even more insight into the value of achieving and maintaining compliance.

Organizations with GDPR-compliant security measures are not only less likely to suffer a breach (74 percent vs. 89 percent), but the breaches suffered are less costly too, with fewer records affected.

However, applying such GDPR-compliant provisions can’t be done on a whim; organizations must expand their data governance practices to include compliance.


A robust data governance initiative provides a comprehensive picture of an organization’s systems and the units of data contained or used within them. This understanding encompasses not only the original instance of a data unit but also its lineage and how it has been handled and processed across an organization’s ecosystem.

With this information, organizations can apply the relevant degrees of security where necessary, ensuring expansive and efficient protection from external (i.e., breaches) and internal (i.e., mismanaged permissions) data security threats.

Although data security cannot be wholly guaranteed, these measures can help identify and contain breaches to minimize the fallout.

Looking at Google’s Record GDPR Fine as An Opportunity

The tertiary benefits of GDPR compliance include greater agility and innovation and better data discovery and management. So arguably, the “tertiary” benefits of data governance should take center stage.

While once exploited by such innovators as Amazon and Netflix, data optimization and governance is now on everyone’s radar.

So organizations need another competitive differentiator.

An enterprise data governance experience (EDGE) provides just that.


This approach unifies data management and data governance, ensuring that the data landscape, policies, procedures and metrics stem from a central source of truth so data can be trusted at any point throughout its enterprise journey.

With an EDGE, the Any2 (any data from anywhere) data management philosophy applies – whether structured or unstructured, in the cloud or on premise. An organization’s data preparation (data mapping), enterprise modeling (business, enterprise and data) and data governance practices all draw from a single metadata repository.

In fact, metadata from a multitude of enterprise systems can be harvested and cataloged automatically. And with intelligent data discovery, sensitive data can be tagged and governed automatically as well – think GDPR as well as HIPAA, BCBS and CCPA.

Organizations without an EDGE can still achieve regulatory compliance, but data silos and the associated bottlenecks are unavoidable without integration and automation – not to mention longer timeframes and higher costs.

To get an “edge” on your competition, consider the erwin EDGE platform for greater control over and value from your data assets.

Data preparation/mapping is a great starting point and a key component of the software portfolio. Join us for a weekly demo.



erwin Automation Framework: Achieving Faster Time-to-Value in Data Preparation, Deployment and Governance

Data governance is more important to the enterprise than ever before. It ensures everyone in the organization can discover and analyze high-quality data to quickly deliver business value.

It assists in successfully meeting increasingly strict compliance requirements, such as those in the General Data Protection Regulation (GDPR). And it provides a clear gauge on business performance.

A mature and sustainable data governance initiative must include data integration.

This often requires reconciling two groups of individuals within the organization: 1) those who care about governance and the meaningful use of data and 2) those who care about getting and transforming the data from source to target for actionable insights.

Both ends of the data value chain are covered when governance is coupled programmatically with IT’s integration practices.

The tools and processes for this should automatically generate “pre-ETL” source-to-target mapping to minimize human errors that can occur while manually compiling and interpreting a multitude of Excel-based data mappings that exist across the organization.

In addition to reducing errors and improving data quality, the efficiencies gained through automation, including minimizing rework, can help cut system development lifecycle costs in half.

In fact, being able to rely on automated and repeatable processes can result in up to 50 percent in design savings, up to 70 percent conversion savings, and up to 70 percent acceleration in total project delivery.

Data Governance and the System Development Lifecycle

Boosting data governance maturity starts with a central metadata repository (data dictionary) for version-controlling metadata imported from a broad array of file and database types to inform data mappings. It can be used to automatically generate governed design mappings and code in the design phase of the system development lifecycle.

The right toolset – one that supports a unifying and underlying metadata model – will be a design and code-generation platform that introduces efficiency, visibility and governance principles while reducing the opportunity for human error.

Automatically generating ETL/ELT jobs for leading ETL tools based on best design practices accommodates those principles; it functions according to approved corporate and industry standards.

Automatically importing mappings from developers’ Excel sheets, flat files, access and ETL tools into a comprehensive mappings inventory, complete with automatically generated and meaningful documentation of the mappings, is a powerful way to support governance while providing real insight into data movement – for lineage and impact analysis – without interrupting system developers’ normal work methods.
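A toy sketch of that consolidation step, assuming each developer sheet yields rows of source and target columns plus a transformation rule. The sheet contents are inlined here for illustration (in practice they would be read from the actual workbooks and tools), and the generated documentation string is deliberately simplistic.

```python
# Rows as they might be read from two developers' mapping spreadsheets
# (inlined here; a real import would read the workbooks themselves).
sheet_finance = [
    {"source": "erp.gl.amount", "rule": "ROUND(amount, 2)", "target": "dw.finance.amount"},
]
sheet_sales = [
    {"source": "crm.opportunity.value", "rule": "value * fx_rate", "target": "dw.sales.value_usd"},
]

def build_inventory(*sheets):
    """Consolidate per-team mapping sheets into one documented mappings inventory."""
    inventory = []
    for owner, rows in sheets:
        for row in rows:
            inventory.append({
                **row,
                "owner": owner,
                "doc": f"{row['source']} is transformed by '{row['rule']}' into {row['target']}",
            })
    return inventory

for entry in build_inventory(("finance", sheet_finance), ("sales", sheet_sales)):
    print(entry["doc"], f"[owner: {entry['owner']}]")
```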

GDPR compliance, for example, requires a business to discover source-to-target mappings with all accompanying transactions – such as which business rules in the repository are applied to them – to comply with audits.


When data movement has been tracked and version-controlled, it’s possible to conduct data archeology – that is, reverse-engineering code from existing XML within the ETL layer – to uncover what has happened in the past and incorporate it into a mapping manager for fast and accurate recovery.
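As a small, hypothetical illustration of that data archeology, the snippet below parses a simplified ETL job definition to recover source-to-target mappings that could then be loaded into a mapping manager; the XML layout is invented, since real ETL tools each have their own export formats.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for XML exported from an ETL tool (real formats vary by vendor).
job_xml = """
<job name="load_customer_dim">
  <mapping source="staging.customers.cust_name" target="dw.dim_customer.customer_name"
           rule="TRIM(cust_name)"/>
  <mapping source="staging.customers.cust_id" target="dw.dim_customer.customer_key"
           rule="passthrough"/>
</job>
"""

def recover_mappings(xml_text):
    """Reverse-engineer source-to-target mappings from an ETL job definition."""
    job = ET.fromstring(xml_text)
    return [
        (m.get("source"), m.get("rule"), m.get("target"))
        for m in job.findall("mapping")
    ]

for source, rule, target in recover_mappings(job_xml):
    print(f"{source} --[{rule}]--> {target}")
```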

This is one example of how to meet data governance demands with more agility and accuracy at high speed.

Faster Time-to-Value with the erwin Automation Framework

The erwin Automation Framework is a metadata-driven universal code generator that works hand in hand with erwin Mapping Manager (MM) for:

  • Pre-ETL enterprise data mapping
  • Governing metadata
  • Governing and versioning source-to-target mappings throughout the lifecycle
  • Data lineage, impact analysis and business rules repositories
  • Automated code generation

If you’d like to save time and money in preparing, deploying and governing your organization’s data, please join us for a demo of erwin MM.
