Tag: data discovery

Data Governance Makes Data Security Less Scary

Post author By Mariann McDonagh
Post date October 31, 2019
No Comments on Data Governance Makes Data Security Less Scary

Happy Halloween!

Do you know where your data is? What data you have? Who has had access to it?

These can be frightening questions for an organization to answer.

Add to the mix the potential for a data breach followed by non-compliance, reputational damage and financial penalties and a real horror story could unfold.

In fact, we’ve seen some frightening ones play out already:

Google’s record GDPR fine – France’s data privacy enforcement agency hit the tech giant with a $57 million penalty in early 2019 – more than 80 times the steepest fine the U.K.’s Information Commissioner’s Office had levied against both Facebook and Equifax for their data breaches.
In July 2019, British Airways received the biggest GDPR fine to date ($229 million) because the data of more than 500,000 customers was compromised.
Marriot International was fined $123 million, or 1.5 percent of its global annual revenue, because 330 million hotel guests were affected by a breach in 2018.

The Regulatory Rationale for Integrating Data Management & Data Governance

Now, as Cybersecurity Awareness Month comes to a close – and ghosts and goblins roam the streets – we thought it a good time to resurrect some guidance on how data governance can make data security less scary.

We don’t want you to be caught off guard when it comes to protecting sensitive data and staying compliant with data regulations.

Don’t Scream; You Can Protect Your Sensitive Data

It’s easier to protect sensitive data when you know what it is, where it’s stored and how it needs to be governed.

Data security incidents may be the result of not having a true data governance foundation that makes it possible to understand the context of data – what assets exist and where, the relationship between them and enterprise systems and processes, and how and by what authorized parties data is used.

That knowledge is critical to supporting efforts to keep relevant data secure and private.

Without data governance, organizations don’t have visibility of the full data landscape – linkages, processes, people and so on – to propel more context-sensitive security architectures that can better assure expectations around user and corporate data privacy. In sum, they lack the ability to connect the dots across governance, security and privacy – and to act accordingly.

This addresses these fundamental questions:

What private data do we store and how is it used?
Who has access and permissions to the data?
What data do we have and where is it?

Where Are the Skeletons?

Data is a critical asset used to operate, manage and grow a business. While sometimes at rest in databases, data lakes and data warehouses; a large percentage is federated and integrated across the enterprise, introducing governance, manageability and risk issues that must be managed.

Knowing where sensitive data is located and properly governing it with policy rules, impact analysis and lineage views is critical for risk management, data audits and regulatory compliance.

However, when key data isn’t discovered, harvested, cataloged, defined and standardized as part of integration processes, audits may be flawed and therefore your organization is at risk.

Sensitive data – at rest or in motion – that exists in various forms across multiple systems must be automatically tagged, its lineage automatically documented, and its flows depicted so that it is easily found and its usage across workflows easily traced.

Thankfully, tools are available to help automate the scanning, detection and tagging of sensitive data by:

Monitoring and controlling sensitive data: Better visibility and control across the enterprise to identify data security threats and reduce associated risks
Enriching business data elements for sensitive data discovery: Comprehensively defining business data element for PII, PHI and PCI across database systems, cloud and Big Data stores to easily identify sensitive data based on a set of algorithms and data patterns
Providing metadata and value-based analysis: Discovery and classification of sensitive data based on metadata and data value patterns and algorithms. Organizations can define business data elements and rules to identify and locate sensitive data including PII, PHI, PCI and other sensitive information.

No Hocus Pocus

Truly understanding an organization’s data, including its value and quality, requires a harmonized approach embedded in business processes and enterprise architecture.

Such an integrated enterprise data governance experience helps organizations understand what data they have, where it is, where it came from, its value, its quality and how it’s used and accessed by people and applications.

An ounce of prevention is worth a pound of cure – from the painstaking process of identifying what happened and why to notifying customers their data and thus their trust in your organization has been compromised.

A well-formed security architecture that is driven by and aligned by data intelligence is your best defense. However, if there is nefarious intent, a hacker will find a way. So being prepared means you can minimize your risk exposure and the damage to your reputation.

Multiple components must be considered to effectively support a data governance, security and privacy trinity. They are:

Data models
Enterprise architecture
Business process models

Creating policies for data handling and accountability and driving culture change so people understand how to properly work with data are two important components of a data governance initiative, as is the technology for proactively managing data assets.

Without the ability to harvest metadata schemas and business terms; analyze data attributes and relationships; impose structure on definitions; and view all data in one place according to each user’s role within the enterprise, businesses will be hard pressed to stay in step with governance standards and best practices around security and privacy.

As a consequence, the private information held within organizations will continue to be at risk.

Organizations suffering data breaches will be deprived of the benefits they had hoped to realize from the money spent on security technologies and the time invested in developing data privacy classifications.

They also may face heavy fines and other financial, not to mention PR, penalties.

You can learn more by reading our whitepaper: Examining the Data Trinity: Governance, Security and Privacy.

erwin Expert Blog

Very Meta … Unlocking Data’s Potential with Metadata Management Solutions

Post author By Danny Sandwell
Post date October 24, 2019
No Comments on Very Meta … Unlocking Data’s Potential with Metadata Management Solutions

Untapped data, if mined, represents tremendous potential for your organization. While there has been a lot of talk about big data over the years, the real hero in unlocking the value of enterprise data is metadata, or the data about the data.

However, most organizations don’t use all the data they’re flooded with to reach deeper conclusions about how to drive revenue, achieve regulatory compliance or make other strategic decisions. They don’t know exactly what data they have or even where some of it is.

Quite honestly, knowing what data you have and where it lives is complicated. And to truly understand it, you need to be able to create and sustain an enterprise-wide view of and easy access to underlying metadata.

This isn’t an easy task. Organizations are dealing with numerous data types and data sources that were never designed to work together and data infrastructures that have been cobbled together over time with disparate technologies, poor documentation and with little thought for downstream integration.

As a result, the applications and initiatives that depend on a solid data infrastructure may be compromised, leading to faulty analysis and insights.

Metadata Is the Heart of Data Intelligence

A recent IDC Innovators: Data Intelligence Report says that getting answers to such questions as “where is my data, where has it been, and who has access to it” requires harnessing the power of metadata.

Metadata is generated every time data is captured at a source, accessed by users, moves through an organization, and then is profiled, cleansed, aggregated, augmented and used for analytics to guide operational or strategic decision-making.

In fact, data professionals spend 80 percent of their time looking for and preparing data and only 20 percent of their time on analysis, according to IDC.

To flip this 80/20 rule, they need an automated metadata management solution for:

• Discovering data – Identify and interrogate metadata from various data management silos.
• Harvesting data – Automate the collection of metadata from various data management silos and consolidate it into a single source.
• Structuring and deploying data sources – Connect physical metadata to specific data models, business terms, definitions and reusable design standards.
• Analyzing metadata – Understand how data relates to the business and what attributes it has.
• Mapping data flows – Identify where to integrate data and track how it moves and transforms.
• Governing data – Develop a governance model to manage standards, policies and best practices and associate them with physical assets.
• Socializing data – Empower stakeholders to see data in one place and in the context of their roles.

Addressing the Complexities of Metadata Management

The complexities of metadata management can be addressed with a strong data management strategy coupled with metadata management software to enable the data quality the business requires.

This encompasses data cataloging (integration of data sets from various sources), mapping, versioning, business rules and glossary maintenance, and metadata management (associations and lineage).

erwin has developed the only data intelligence platform that provides organizations with a complete and contextual depiction of the entire metadata landscape.

It is the only solution that can automatically harvest, transform and feed metadata from operational processes, business applications and data models into a central data catalog and then made accessible and understandable within the context of role-based views.

erwin’s ability to integrate and continuously refresh metadata from an organization’s entire data ecosystem, including business processes, enterprise architecture and data architecture, forms the foundation for enterprise-wide data discovery, literacy, governance and strategic usage.

Organizations then can take a data-driven approach to business transformation, speed to insights, and risk management.
With erwin, organizations can:

1. Deliver a trusted metadata foundation through automated metadata harvesting and cataloging
2. Standardize data management processes through a metadata-driven approach
3. Centralize data-driven projects around centralized metadata for planning and visibility
4. Accelerate data preparation and delivery through metadata-driven automation
5. Master data management platforms through metadata abstraction
6. Accelerate data literacy through contextual metadata enrichment and integration
7. Leverage a metadata repository to derive lineage, impact analysis and enable audit/oversight ability

With erwin Data Intelligence as part of the erwin EDGE platform, you know what data you have, where it is, where it’s been and how it transformed along the way, plus you can understand sensitivities and risks.

With an automated, real-time, high-quality data pipeline, enterprise stakeholders can base strategic decisions on a full inventory of reliable information.

Many of our customers are hard at work addressing metadata management challenges, and that’s why erwin was Named a Leader in Gartner’s “2019 Magic Quadrant for Metadata Management Solutions.”

Tags data modeling, enterprise architecture, business process, GDPR, data management, big data, data governance, data architecture, gartner, metadata, metadata management, data quality, IDC, data discovery, metadata repository, structure data, data sources, data infrastructure, automated metadata management, harvest data, govern data, socialize data, regulatory compliance, data models, data intelligence, CCPA, data professionals, data literacy, IDC report, untapped data, data type, downstream integration, metadata data intelligence, data intelligence report, harvesting data, deploy data, analyze metadata, mapping data flows, data management strategy, data ecosystem, gartner metadata management solutions, magic quadrant, magic quadrant metadata, gartner magic quadrant for metadata management

erwin Expert Blog

Top 5 Data Catalog Benefits

Post author By David Loshin
Post date August 7, 2019
No Comments on Top 5 Data Catalog Benefits

A data catalog benefits organizations in a myriad of ways. With the right data catalog tool, organizations can automate enterprise metadata management – including data cataloging, data mapping, data quality and code generation for faster time to value and greater accuracy for data movement and/or deployment projects.

Data cataloging helps curate internal and external datasets for a range of content authors. Gartner says this doubles business benefits and ensures effective management and monetization of data assets in the long-term if linked to broader data governance, data quality and metadata management initiatives.

But even with this in mind, the importance of data cataloging is growing. In the regulated data world (GDPR, HIPAA etc) organizations need to have a good understanding of their data lineage – and the data catalog benefits to data lineage are substantial.

Data lineage is a core operational business component of data governance technology architecture, encompassing the processes and technology to provide full-spectrum visibility into the ways data flows across an enterprise.

There are a number of different approaches to data lineage. Here, I outline the common approach, and the approach incorporating data cataloging – including the top 5 data catalog benefits for understanding your organization’s data lineage.

Data Lineage – The Common Approach

The most common approach for assembling a collection of data lineage mappings traces data flows in a reverse manner. The process begins with the target or data end-point, and then traversing the processes, applications, and ETL tasks in reverse from the target.

For example, to determine the mappings for the data pipelines populating a data warehouse, a data lineage tool might begin with the data warehouse and examine the ETL tasks that immediately proceed the loading of the data into the target warehouse.

The data sources that feed the ETL process are added to a “task list,” and the process is repeated for each of those sources. At each stage, the discovered pieces of lineage are documented. At the end of the sequence, the process will have reverse-mapped the pipelines for populating that warehouse.

While this approach does produce a collection of data lineage maps for selected target systems, there are some drawbacks.

First, this approach focuses only on assembling the data pipelines populating the selected target system but does not necessarily provide a comprehensive view of all the information flows and how they interact.
Second, this process produces the information that can be used for a static view of the data pipelines, but the process needs to be executed on a regular basis to account for changes to the environment or data sources.
Third, and probably most important, this process produces a technical view of the information flow, but it does not necessarily provide any deeper insights into the semantic lineage, or how the data assets map to the corresponding business usage models.

A Data Catalog Offers an Alternate Data Lineage Approach

An alternate approach to data lineage combines data discovery and the use of a data catalog that captures data asset metadata with a data mapping framework that documents connections between the data assets.

This data catalog approach also takes advantage of automation, but in a different way: using platform-specific data connectors, the tool scans the environment for storing each data asset and imports data asset metadata into the data catalog.

When data asset structures are similar, the tool can compare data element domains and value sets, and automatically create the data mapping.

In turn, the data catalog approach performs data discovery using the same data connectors to parse the code involved in data movement, such as major ETL environments and procedural code – basically any executable task that moves data.

The information collected through this process is reverse engineered to create mappings from source data sets to target data sets based on what was discovered.

For example, you can map the databases used for transaction processing, determine that subsets of the transaction processing database are extracted and moved to a staging area, and then parse the ETL code to infer the mappings.

These direct mappings also are documented in the data catalog. In cases where the mappings are not obvious, a tool can help a data steward manually map data assets into the catalog.

The result is a data catalog that incorporates the structural and semantic metadata associated with each data asset as well as the direct mappings for how that data set is populated.

Learn more about data cataloging.

And this is a powerful representative paradigm – instead of capturing a static view of specific data pipelines, it allows a data consumer to request a dynamically-assembled lineage from the documented mappings.

By interrogating the catalog, the current view of any specific data lineage can be rendered on the fly that shows all points of the data lineage: the origination points, the processing stages, the sequences of transformations, and the final destination.

Materializing the “current active lineage” dynamically reduces the risk of having an older version of the lineage that is no longer relevant or correct. When new information is added to the data catalog (such as a newly-added data source of a modification to the ETL code), dynamically-generated views of the lineage will be kept up-to-date automatically.

Top 5 Data Catalog Benefits for Understanding Data Lineage

A data catalog benefits data lineage in the following five distinct ways:

1. Accessibility

The data catalog approach allows the data consumer to query the tool to materialize specific data lineage mappings on demand.

2. Currency

The data lineage is rendered from the most current data in the data catalog.

3. Breadth

As the number of data assets documented in the data catalog increases, the scope of the materializable lineage expands accordingly. With all corporate data assets cataloged, any (or all!) data lineage mappings can be produced on demand.

4. Maintainability and Sustainability

Since the data lineage mappings are not managed as distinct artifacts, there are no additional requirements for maintenance. As long as the data catalog is kept up to date, the data lineage mappings can be materialized.

5. Semantic Visibility

In addition to visualizing the physical movement of data across the enterprise, the data catalog approach allows the data steward to associate business glossary terms, data element definitions, data models, and other semantic details with the different mappings. Additional visualization methods can demonstrate where business terms are used, how they are mapped to different data elements in different systems, and the relationships among these different usage points.

One can impose additional data governance controls with project management oversight, which allows you to designate data lineage mappings in terms of the project life cycle (such as development, test or production).

Aside from these data catalog benefits, this approach allows you to reduce the amount of manual effort for accumulating the information for data lineage and continually reviewing the data landscape to maintain consistency, thus providing a greater return on investment for your data intelligence budget.

Learn more about data cataloging.

erwin Expert Blog

Constructing a Digital Transformation Strategy: Putting the Data in Digital Transformation

Post author By Mariann McDonagh
Post date July 17, 2019
1 Comment on Constructing a Digital Transformation Strategy: Putting the Data in Digital Transformation

Having a clearly defined digital transformation strategy is an essential best practice for successful digital transformation. But what makes a digital transformation strategy viable?

Part Two of the Digital Transformation Journey …

Part one

In our last blog on driving digital transformation, we explored how business architecture and process (BP) modeling are pivotal factors in a viable digital transformation strategy.

EA and BP modeling squeeze risk out of the digital transformation process by helping organizations really understand their businesses as they are today. It gives them the ability to identify what challenges and opportunities exist, and provides a low-cost, low-risk environment to model new options and collaborate with key stakeholders to figure out what needs to change, what shouldn’t change, and what’s the most important changes are.

Once you’ve determined what part(s) of your business you’ll be innovating — the next step in a digital transformation strategy is using data to get there.

Constructing a Digital Transformation Strategy: Data Enablement

Many organizations prioritize data collection as part of their digital transformation strategy. However, few organizations truly understand their data or know how to consistently maximize its value.

If your business is like most, you collect and analyze some data from a subset of sources to make product improvements, enhance customer service, reduce expenses and inform other, mostly tactical decisions.

The real question is: are you reaping all the value you can from all your data? Probably not.

Most organizations don’t use all the data they’re flooded with to reach deeper conclusions or make other strategic decisions. They don’t know exactly what data they have or even where some of it is, and they struggle to integrate known data in various formats and from numerous systems—especially if they don’t have a way to automate those processes.

How does your business become more adept at wringing all the value it can from its data?

The reality is there’s not enough time, people and money for true data management using manual processes. Therefore, an automation framework for data management has to be part of the foundations of a digital transformation strategy.

Your organization won’t be able to take complete advantage of analytics tools to become data-driven unless you establish a foundation for agile and complete data management.

You need automated data mapping and cataloging through the integration lifecycle process, inclusive of data at rest and data in motion.

An automated, metadata-driven framework for cataloging data assets and their flows across the business provides an efficient, agile and dynamic way to generate data lineage from operational source systems (databases, data models, file-based systems, unstructured files and more) across the information management architecture; construct business glossaries; assess what data aligns with specific business rules and policies; and inform how that data is transformed, integrated and federated throughout business processes—complete with full documentation.

Without this framework and the ability to automate many of its processes, business transformation will be stymied. Companies, especially large ones with thousands of systems, files and processes, will be particularly challenged by taking a manual approach. Outsourcing these data management efforts to professional services firms only delays schedules and increases costs.

With automation, data quality is systemically assured. The data pipeline is seamlessly governed and operationalized to the benefit of all stakeholders.

Constructing a Digital Transformation Strategy: Smarter Data

Ultimately, data is the foundation of the new digital business model. Companies that have the ability to harness, secure and leverage information effectively may be better equipped than others to promote digital transformation and gain a competitive advantage.

While data collection and storage continues to happen at a dramatic clip, organizations typically analyze and use less than 0.5 percent of the information they take in – that’s a huge loss of potential. Companies have to know what data they have and understand what it means in common, standardized terms so they can act on it to the benefit of the organization.

Unfortunately, organizations spend a lot more time searching for data rather than actually putting it to work. In fact, data professionals spend 80 percent of their time looking for and preparing data and only 20 percent of their time on analysis, according to IDC.

The solution is data intelligence. It improves IT and business data literacy and knowledge, supporting enterprise data governance and business enablement.

It helps solve the lack of visibility and control over “data at rest” in databases, data lakes and data warehouses and “data in motion” as it is integrated with and used by key applications.

Organizations need a real-time, accurate picture of the metadata landscape to:

Discover data – Identify and interrogate metadata from various data management silos.
Harvest data – Automate metadata collection from various data management silos and consolidate it into a single source.
Structure and deploy data sources – Connect physical metadata to specific data models, business terms, definitions and reusable design standards.
Analyze metadata – Understand how data relates to the business and what attributes it has.
Map data flows – Identify where to integrate data and track how it moves and transforms.
Govern data – Develop a governance model to manage standards, policies and best practices and associate them with physical assets.
Socialize data – Empower stakeholders to see data in one place and in the context of their roles.

The Right Tools

When it comes to digital transformation (like most things), organizations want to do it right. Do it faster. Do it cheaper. And do it without the risk of breaking everything. To accomplish all of this, you need the right tools.

The erwin Data Intelligence (DI) Suite is the heart of the erwin EDGE platform for creating an “enterprise data governance experience.” erwin DI combines data cataloging and data literacy capabilities to provide greater awareness of and access to available data assets, guidance on how to use them, and guardrails to ensure data policies and best practices are followed.

erwin Data Catalog automates enterprise metadata management, data mapping, reference data management, code generation, data lineage and impact analysis. It efficiently integrates and activates data in a single, unified catalog in accordance with business requirements. With it, you can:

Schedule ongoing scans of metadata from the widest array of data sources.
Keep metadata current with full versioning and change management.
Easily map data elements from source to target, including data in motion, and harmonize data integration across platforms.

erwin Data Literacy provides self-service, role-based, contextual data views. It also provides a business glossary for the collaborative definition of enterprise data in business terms, complete with built-in accountability and workflows. With it, you can:

Enable data consumers to define and discover data relevant to their roles.
Facilitate the understanding and use of data within a business context.
Ensure the organization is fluent in the language of data.

With data governance and intelligence, enterprises can discover, understand, govern and socialize mission-critical information. And because many of the associated processes can be automated, you reduce errors and reliance on technical resources while increasing the speed and quality of your data pipeline to accomplish whatever your strategic objectives are, including digital transformation.

Check out our latest whitepaper, Data Intelligence: Empowering the Citizen Analyst with Democratized Data.

erwin Expert Blog

Keeping Up with New Data Protection Regulations

Post author By Bunny Tharpe
Post date April 11, 2019
No Comments on Keeping Up with New Data Protection Regulations

Keeping up with new data protection regulations can be difficult, and the latest – the General Data Protection Regulation (GDPR) – isn’t the only new data protection regulation organizations should be aware of.

California recently passed a law that gives residents the right to control the data companies collect about them. Some suggest the California Consumer Privacy Act (CCPA), which takes effect January 1, 2020, sets a precedent other states will follow by empowering consumers to set limits on how companies can use their personal information.

In fact, organizations should expect increasing pressure on lawmakers to introduce new data protection regulations. A number of high-profile data breaches and scandals have increased public awareness of the issue.

Facebook was in the news again last week for another major problem around the transparency of its user data, and the tech-giant also is reportedly facing 10 GDPR investigations in Ireland – along with Apple, LinkedIn and Twitter.

Some industries, such as healthcare and financial services, have been subject to stringent data regulations for years: GDPR now joins the Health Insurance Portability and Accountability Act (HIPAA), the Payment Card Industry Data Security Standard (PCI DSS) and the Basel Committee on Banking Supervision (BCBS).

Due to these pre-existing regulations, organizations operating within these sectors, as well as insurance, had some of the GDPR compliance bases covered in advance.

Other industries had their own levels of preparedness, based on the nature of their operations. For example, many retailers have robust, data-driven e-commerce operations that are international. Such businesses are bound to comply with varying local standards, especially when dealing with personally identifiable information (PII).

Smaller, more brick-and-mortar-focussed retailers may have had to start from scratch.

But starting position aside, every data-driven organization should strive for a better standard of data management — and not just for compliance sake. After all, organizations are now realizing that data is one of their most valuable assets.

New Data Protection Regulations – Always Be Prepared

When it comes to new data protection regulations in the face of constant data-driven change, it’s a matter of when, not if.

As they say, the best defense is a good offense. Fortunately, whenever the time comes, the first point of call will always be data governance, so organizations can prepare.

Effective compliance with new data protection regulations requires a robust understanding of the “what, where and who” in terms of data and the stakeholders with access to it (i.e., employees).

The Regulatory Rationale for Integrating Data Management & Data Governance

This is also true for existing data regulations. Compliance is an on-going requirement, so efforts to become compliant should not be treated as static events.

Less than four months before GDPR came into effect, only 6 percent of enterprises claimed they were prepared for it. Many of these organizations will recall a number of stressful weeks – or even months – tidying up their databases and their data management processes and policies.

This time and money was spent reactionarily, at the behest of proactive efforts to grow the business.

The implementation and subsequent observation of a strong data governance initiative ensures organizations won’t be put on the spot going forward. Should an audit come up, current projects aren’t suddenly derailed as they reenact pre-GDPR panic.

Data Governance: The Foundation for Compliance

The first step to compliance with new – or old – data protection regulations is data governance.

A robust and effective data governance initiative ensures an organization understands where security should be focussed.

By adopting a data governance platform that enables you to automatically tag sensitive data and track its lineage, you can ensure nothing falls through the cracks.

Your chosen data governance solution should enable you to automate the scanning, detection and tagging of sensitive data by:

Monitoring and controlling sensitive data – Gain better visibility and control across the enterprise to identify data security threats and reduce associated risks.
Enriching business data elements for sensitive data discovery – By leveraging a comprehensive mechanism to define business data elements for PII, PHI and PCI across database systems, cloud and Big Data stores, you can easily identify sensitive data based on a set of algorithms and data patterns.
Providing metadata and value-based analysis – Simplify the discovery and classification of sensitive data based on metadata and data value patterns and algorithms. Organizations can define business data elements and rules to identify and locate sensitive data, including PII, PHI and PCI.

With these precautionary steps, organizations are primed to respond if a data breach occurs. Having a well governed data ecosystem with data lineage capabilities means issues can be quickly identified.

Additionally, if any follow-up is necessary – such as with GDPR’s data breach reporting time requirements – it can be handles swiftly and in accordance with regulations.

It’s also important to understand that the benefits of data governance don’t stop with regulatory compliance.

A better understanding of what data you have, where it’s stored and the history of its use and access isn’t only beneficial in fending off non-compliance repercussions. In fact, such an understanding is arguably better put to use proactively.

Data governance improves data quality standards, it enables better decision-making and ensures businesses can have more confidence in the data informing those decisions.

The same mechanisms that protect data by controlling its access also can be leveraged to make data more easily discoverable to approved parties – improving operational efficiency.

All in all, the cumulative result of data governance’s influence on data-driven businesses both drives revenue (through greater efficiency) and reduces costs (less errors, false starts, etc.).

To learn more about data governance and the regulatory rationale for its implementation, get our free guide here.

erwin Expert Blog

Data Preparation and Data Mapping: The Glue Between Data Management and Data Governance to Accelerate Insights and Reduce Risks

Organizations have spent a lot of time and money trying to harmonize data across diverse platforms, including cleansing, uploading metadata, converting code, defining business glossaries, tracking data transformations and so on. But the attempts to standardize data across the entire enterprise haven’t produced the desired results.

A company can’t effectively implement data governance – documenting and applying business rules and processes, analyzing the impact of changes and conducting audits – if it fails at data management.

The problem usually starts by relying on manual integration methods for data preparation and mapping. It’s only when companies take their first stab at manually cataloging and documenting operational systems, processes and the associated data, both at rest and in motion, that they realize how time-consuming the entire data prepping and mapping effort is, and why that work is sure to be compounded by human error and data quality issues.

To effectively promote business transformation, as well as fulfil regulatory and compliance mandates, there can’t be any mishaps.

It’s obvious that the manual road is very challenging to discover and synthesize data that resides in different formats in thousands of unharvested, undocumented databases, applications, ETL processes and procedural code.

Consider the problematic issue of manually mapping source system fields (typically source files or database tables) to target system fields (such as different tables in target data warehouses or data marts).

These source mappings generally are documented across a slew of unwieldy spreadsheets in their “pre-ETL” stage as the input for ETL development and testing. However, the ETL design process often suffers as it evolves because spreadsheet mapping data isn’t updated or may be incorrectly updated thanks to human error. So questions linger about whether transformed data can be trusted.

Data Quality Obstacles

The sad truth is that high-paid knowledge workers like data scientists spend up to 80 percent of their time finding and understanding source data and resolving errors or inconsistencies, rather than analyzing it for real value.

Statistics are similar when looking at major data integration projects, such as data warehousing and master data management with data stewards challenged to identify and document data lineage and sensitive data elements.

So how can businesses produce value from their data when errors are introduced through manual integration processes? How can enterprise stakeholders gain accurate and actionable insights when data can’t be easily and correctly translated into business-friendly terms?

How can organizations master seamless data discovery, movement, transformation and IT and business collaboration to reverse the ratio of preparation to value delivered.

What’s needed to overcome these obstacles is establishing an automated, real-time, high-quality and metadata- driven pipeline useful for everyone, from data scientists to enterprise architects to business analysts to C-level execs.

Doing so will require a hearty data management strategy and technology for automating the timely delivery of quality data that measures up to business demands.

From there, they need a sturdy data governance strategy and technology to automatically link and sync well-managed data with core capabilities for auditing, statutory reporting and compliance requirements as well as to drive business insights.

Creating a High-Quality Data Pipeline

Working hand-in-hand, data management and data governance provide a real-time, accurate picture of the data landscape, including “data at rest” in databases, data lakes and data warehouses and “data in motion” as it is integrated with and used by key applications. And there’s control of that landscape to facilitate insight and collaboration and limit risk.

With a metadata-driven, automated, real-time, high-quality data pipeline, all stakeholders can access data that they now are able to understand and trust and which they are authorized to use. At last they can base strategic decisions on what is a full inventory of reliable information.

The integration of data management and governance also supports industry needs to fulfill regulatory and compliance mandates, ensuring that audits are not compromised by the inability to discover key data or by failing to tag sensitive data as part of integration processes.

Data-driven insights, agile innovation, business transformation and regulatory compliance are the fruits of data preparation/mapping and enterprise modeling (business process, enterprise architecture and data modeling) that revolves around a data governance hub.

erwin Mapping Manager (MM) combines data management and data governance processes in an automated flow through the integration lifecycle from data mapping for harmonization and aggregation to generating the physical embodiment of data lineage – that is the creation, movement and transformation of transactional and operational data.

Its hallmark is a consistent approach to data delivery (business glossaries connect physical metadata to specific business terms and definitions) and metadata management (via data mappings).

erwin Expert Blog

Six Reasons Business Glossary Management Is Crucial to Data Governance

Post author By Bunny Tharpe
Post date October 18, 2018
3 Comments on Six Reasons Business Glossary Management Is Crucial to Data Governance

A business glossary is crucial to any data governance strategy, yet it is often overlooked.

Consider this – no one likes unpleasant surprises, especially in business. So when it comes to objectively understanding what’s happening from the top of the sales funnel to the bottom line of finance, everyone wants – and needs – to trust the data they have.

That’s why you can’t underestimate the importance of a business glossary. Sometimes the business folks say IT or marketing speaks a different language. Or in the case of mergers and acquisitions, different companies call the same thing something else.

A business glossary solves this complexity by creating a common business vocabulary. Regardless of the industry you’re in or the type of data initiative you’re undertaking, the ability for an organization to have a unified, common language is a key component of data governance, ensuring you can trust your data.

Are we speaking the same language?

How can two reports show different results for the same region? A quick analysis of invoices will likely reveal that some of the data fed into the report wasn’t based on a clear understanding of business terms.

In such embarrassing scenarios, a business glossary and its ongoing management has obvious significance. And with the complexity of today’s business environment, organizations need the right solution to make sense out of their data and govern it properly.

Here are six reasons a business glossary is vital to data governance:

Bridging the gap between Business & IT

A sound data governance initiative bridges the gap between the business and IT. By understanding the underlying metadata associated with business terms and the associated data lineage, a business glossary helps bridge this gap to deliver greater value to the organization.

Integrated search

The biggest appeal of business glossary management is that it helps establish relationships between business terms to drive data governance across the entire organization. A good business glossary should provide an integrated search feature that can find context-specific results, such as business terms, definitions, technical metadata, KPIs and process areas.

The ability to capture business terms and all associated artifacts

What good is a business term if it can’t be associated with other business terms and KPIs? Capturing relationships between business terms as well as between technical and business entities is essential in today’s regulatory and compliance-conscious environment. A business glossary defines the relationship between the business terms and their underlying metadata for faster analysis and enhanced decision-making.

Integrated project management and workflow

When the business and cross-functional teams operate in silos, users start defining business terms according to their own preferences rather than following standard policies and best practices. To be effective, a business glossary should enable a collaborative workflow management and approval process so stakeholders have visibility with established data governance roles and responsibilities. With this ability, business glossary users can provide input during the entire data definition process prior to publication.

The ability to publish business terms

Successful businesses not only capture business terms and their definitions, they also publish them so that the business-at-large can access it. Business glossary users, who are typically members of the data governance team, should be assigned roles for creating, editing, approving and publishing business glossary content. A workflow feature will show which users are assigned which roles, including those with publishing permissions.

After initial publication, business glossary content can be revised and republished on an ongoing basis, based on the needs of your enterprise.

End-to-end traceability

Capturing business terms and establishing relationships are key to glossary management. However, it is far from a complete solution without traceability. A good business glossary can help generate enterprise-level traceability in the form of mind maps or tabular reports to the business community once relationships have been established.

Business Glossary, the Heart of Data Governance

With a business glossary at the heart of your regulatory compliance and data governance initiatives, you can help break down organizational and technical silos for data visibility, context, control and collaboration across domains. It ensures that you can trust your data.

Plus, you can unify the people, processes and systems that manage and protect data through consistent exchange, understanding and processing to increase quality and trust.

By building a glossary of business terms in taxonomies with synonyms, acronyms and relationships, and publishing approved standards and prioritizing them, you can map data in all its forms to the central catalog of data elements.

That answers the vital question of “where is our data?” Then you can understand who and what is using your data to ensure adherence to usage standards and rules.

Tags data management, data governance, data, mergers and acquisitions, collaborative data governance, data governance solutions, data discovery, trust data, business glossary, business IT alignment, search and discovery, data capture, project management, workflow, end-to-end traceability

erwin Expert Blog

Solving the Enterprise Data Dilemma

Post author By Mariann McDonagh
Post date August 30, 2018
No Comments on Solving the Enterprise Data Dilemma

Due to the adoption of data-driven business, organizations across the board are facing their own enterprise data dilemmas.

This week erwin announced its acquisition of metadata management and data governance provider AnalytiX DS. The combined company touches every piece of the data management and governance lifecycle, enabling enterprises to fuel automated, high-quality data pipelines for faster speed to accurate, actionable insights.

Why Is This a Big Deal?

From digital transformation to AI, and everything in between, organizations are flooded with data. So, companies are investing heavily in initiatives to use all the data at their disposal, but they face some challenges. Chiefly, deriving meaningful insights from their data – and turning them into actions that improve the bottom line.

This enterprise data dilemma stems from three important but difficult questions to answer: What data do we have? Where is it? And how do we get value from it?

Large enterprises use thousands of unharvested, undocumented databases, applications, ETL processes and procedural code that make it difficult to gather business intelligence, conduct IT audits, and ensure regulatory compliance – not to mention accomplish other objectives around customer satisfaction, revenue growth and overall efficiency and decision-making.

The lack of visibility and control around “data at rest” combined with “data in motion”, as well as difficulties with legacy architectures, means these organizations spend more time finding the data they need rather than using it to produce meaningful business outcomes.

To remedy this, enterprises need smarter and faster data management and data governance capabilities, including the ability to efficiently catalog and document their systems, processes and the associated data without errors. In addition, business and IT must collaborate outside their traditional operational silos.

But this coveted state of data nirvana isn’t possible without the right approach and technology platform.

Enterprise Data: Making the Data Management-Data Governance Love Connection

Bringing together data management and data governance delivers greater efficiencies to technical users and better analytics to business users. It’s like two sides of the same coin:

Data management drives the design, deployment and operation of systems that deliver operational and analytical data assets.
Data governance delivers these data assets within a business context, tracks their physical existence and lineage, and maximizes their security, quality and value.

Although these disciplines approach data from different perspectives and are used to produce different outcomes, they have a lot in common. Both require a real-time, accurate picture of an organization’s data landscape, including data at rest in data warehouses and data lakes and data in motion as it is integrated with and used by key applications.

However, creating and maintaining this metadata landscape is challenging because this data in its various forms and from numerous sources was never designed to work in concert. Data infrastructures have been cobbled together over time with disparate technologies, poor documentation and little thought for downstream integration, so the applications and initiatives that depend on data infrastructure are often out-of-date and inaccurate, rendering faulty insights and analyses.

Organizations need to know what data they have and where it’s located, where it came from and how it got there, what it means in common business terms [or standardized business terms] and be able to transform it into useful information they can act on – all while controlling its access.

To support the total enterprise data management and governance lifecycle, they need an automated, real-time, high-quality data pipeline. Then every stakeholder – data scientist, ETL developer, enterprise architect, business analyst, compliance officer, CDO and CEO – can fuel the desired outcomes with reliable information on which to base strategic decisions.

Enterprise Data: Creating Your “EDGE”

At the end of the day, all industries are in the data business and all employees are data people. The success of an organization is not measured by how much data it has, but by how well it’s used.

Data governance enables organizations to use their data to fuel compliance, innovation and transformation initiatives with greater agility, efficiency and cost-effectiveness.

Organizations need to understand their data from different perspectives, identify how it flows through and impacts the business, aligns this business view with a technical view of the data management infrastructure, and synchronizes efforts across both disciplines for accuracy, agility and efficiency in building a data capability that impacts the business in a meaningful and sustainable fashion.

The persona-based erwin EDGE creates an “enterprise data governance experience” that facilitates collaboration between both IT and the business to discover, understand and unlock the value of data both at rest and in motion.

By bringing together enterprise architecture, business process, data mapping and data modeling, erwin’s approach to data governance enables organizations to get a handle on how they handle their data. With the broadest set of metadata connectors and automated code generation, data mapping and cataloging tools, the erwin EDGE Platform simplifies the total data management and data governance lifecycle.

This single, integrated solution makes it possible to gather business intelligence, conduct IT audits, ensure regulatory compliance and accomplish any other organizational objective by fueling an automated, high-quality and real-time data pipeline.

With the erwin EDGE, data management and data governance are unified and mutually supportive, with one hand aware and informed by the efforts of the other to:

Discover data: Identify and integrate metadata from various data management silos.
Harvest data: Automate the collection of metadata from various data management silos and consolidate it into a single source.
Structure data: Connect physical metadata to specific business terms and definitions and reusable design standards.
Analyze data: Understand how data relates to the business and what attributes it has.
Map data flows: Identify where to integrate data and track how it moves and transforms.
Govern data: Develop a governance model to manage standards and policies and set best practices.
Socialize data: Enable stakeholders to see data in one place and in the context of their roles.

An integrated solution with data preparation, modeling and governance helps businesses reach data governance maturity – which equals a role-based, collaborative data governance system that serves both IT and business users equally. Such maturity may not happen overnight, but it will ultimately deliver the accurate and actionable insights your organization needs to compete and win.

Your journey to data nirvana begins with a demo of the enhanced erwin Data Governance solution. Register now.

erwin Expert Blog

Data Discovery Fire Drill: Why Isn’t My Executive Business Intelligence Report Correct?

Post author By Robert Lutton
Post date May 24, 2018
No Comments on Data Discovery Fire Drill: Why Isn’t My Executive Business Intelligence Report Correct?

Executive business intelligence (BI) reporting can be incomplete, inconsistent and/or inaccurate, becoming a critical concern for the executive management team trying to make informed business decisions. When issues arise, it is up to the IT department to figure out what the problem is, where it occurred, and how to fix it. This is not a trivial task.

Take the following scenario in which a CEO receives two reports supposedly from the same set of data, but each report shows different results. Which report is correct? If this is something your organization has experienced, then you know what happens next – the data discovery fire drill.

A flurry of activities take place, suspending all other top priorities. A special team is quickly assembled to delve into each report. They review the data sources, ETL processes and data marts in an effort to trace the events that affected the data. Fire drills like the above can consume days if not weeks of effort to locate the error.

In the above situation it turns out there was a new update to one ETL process that was implemented in only one report. When you multiply the number of data discovery fire drills by the number of data quality concerns for any executive business intelligence report, the costs continue to mount.

Data can arrive from multiple systems at the same time, often occurring rapidly and in parallel. In some cases, the ETL load itself may generate new data. Through all of this, IT still has to answer two fundamental questions: where did this data come from, and how did it get here?

Accurate Executive Business Intelligence Reporting Requires Data Governance

As the volume of data rapidly increases, BI data environments are becoming more complex. To manage this complexity, organizations invest in a multitude of elaborate and expensive tools. But despite this investment, IT is still overwhelmed trying to track the vast collection of data within their BI environment. Is more technology the answer?

Perhaps the better question we should look to answer is: how can we avoid these data discovery fires in the future?

We believe it’s possible to prevent data discovery fires, and that starts with proper data governance and a strong data lineage capability.

Why is data governance important?

Governed data promotes data sharing.
Data standards make data more reusable.
Greater context in data definitions assist in more accurate analytics.
A clear set of data policies and procedures support data security.

Why is data lineage important?

Data trust is built by establishing its origins.
The troubleshooting process is simplified by enabling data to be traced.
The risk of ETL data loss is reduced by exposing potential problems in the process.
Business rules, which otherwise would be buried in an ETL process, are visible.

Data Governance Enables Data-Driven Business

In the context of modern, data-driven business in which organizations are essentially production lines of information – data governance is responsible for the health and maintenance of said production line.

It’s the enabling factor of the enterprise data management suite that ensures data quality, so organizations can have greater trust in their data. It ensures that any data created is properly stored, tagged and assigned the context needed to prevent corruption or loss as it moves through the production line – greatly enhancing data discovery.

Alongside improving data quality, aiding in regulatory compliance, and making practices like tracing data lineage easier, sound data governance also helps organizations be proactive with their data, using it to drive revenue. They can make better decisions faster and negate the likelihood of costly mistakes and data breaches that would eat into their bottom lines.

For more information about how data governance supports executive business intelligence and the rest of the enterprise data management suite, click here.

Tags data governance, business intelligence, data quality, data lineage, BI, BI report, data discovery, data volume, data-driven business data governance, enterprise data management