Data Mapping Tools: What Are the Key Differentiators?

The need for data mapping tools is growing in light of increasing volumes and varieties of data – as well as the velocity at which it must be processed.

It’s not difficult to see why either. Data mapping tools have always been a key asset for any organization looking to leverage data for insights.

Isolated units of data are essentially meaningless. By linking data and enabling its categorization in relation to other data units, data mapping provides the context vital for actionable information.

Now with the General Data Protection Regulation (GDPR) in effect, data mapping has become even more significant.

The scale of GDPR’s reach has set a new precedent and is the closest we’ve come to a global standard in terms of data regulations. The repercussions can be huge – just ask Google.

Data mapping tools are paramount in charting a path to compliance with this new, near-global standard and avoiding hefty fines.

Because of GDPR, organizations that may not have fully leveraged data mapping for proactive data-driven initiatives (e.g., analysis) are now adopting data mapping tools with compliance in mind.

Arguably, GDPR’s implementation can be viewed as an opportunity – a catalyst for digital transformation.

Those organizations investing in data mapping tools with compliance as the main driver will want to weigh this opportunity and let it influence which data mapping tool they adopt.

With that in mind, it’s important to understand the key differentiators in data mapping tools and the associated benefits.

Data Mapping Tools: erwin Mapping Manager

Data Mapping Tools: Automated or Manual?

In terms of differentiators for data mapping tools, perhaps the most distinct is automated data mapping versus data mapping via manual processes.

Data mapping tools that allow for automation mean organizations can benefit from in-depth, quality-assured data mapping, without the significant allocations of resources typically associated with such projects.

Eighty percent of data scientists’ and other data professionals’ time is spent on manual data maintenance – everything from addressing errors and inconsistencies to trying to understand source data or track its lineage. This doesn’t even account for the time lost to missed errors that render such endeavors inherently flawed.

Automated data mapping tools render such issues and concerns moot. In turn, data professionals’ time can be put to much better, proactive use rather than being bogged down with reactive housekeeping tasks.

Four Industry-Focused Case Studies for Metadata-Driven Automation (BFSI, Pharma, Insurance and Non-Profit)

Beyond introducing greater efficiency to the data governance process, automated data mapping tools enable data to be auto-documented from XML, building mappings for the target repository or reporting structure.
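
For illustration, here’s a minimal sketch of what auto-documenting mappings from an XML export might look like. The XML structure, attribute names and record layout are hypothetical placeholders, not an actual erwin (or any vendor) format.

# Hypothetical sketch: auto-documenting mappings from an XML export.
# The element and attribute names below are illustrative only.
import xml.etree.ElementTree as ET

SAMPLE = """
<mappings>
  <mapping source="crm.customer.email" target="dwh.dim_customer.email_address"
           transformation="LOWER(email)"/>
  <mapping source="crm.customer.id" target="dwh.dim_customer.customer_key"
           transformation="direct"/>
</mappings>
"""

def document_mappings(xml_text):
    """Turn an XML mapping export into plain mapping records for the target repository."""
    root = ET.fromstring(xml_text)
    return [
        {"source": m.get("source"),
         "target": m.get("target"),
         "transformation": m.get("transformation")}
        for m in root.findall("mapping")
    ]

for record in document_mappings(SAMPLE):
    print(record)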

Additionally, a tool that draws from a single metadata repository dynamically links mappings with the underlying metadata to render automated lineage views, including full transformation logic, in real time.

Therefore, changes (e.g., in the data catalog) will be reflected across data governance domains (business process, enterprise architecture and data modeling) as and when they’re made – no more juggling and maintaining multiple, out-of-date versions.

It also enables automatic impact analysis at the table and column level – even for business/transformation rules.

For organizations looking to free themselves from the burden of juggling multiple versions, siloed business processes and a disconnect between interdepartmental collaboration, this feature is a key benefit to consider.
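
To make repository-driven lineage and impact analysis concrete, here’s a minimal sketch that treats mappings as simple source-to-target column records held in one place. Every table, column and transformation name is hypothetical.

# Hypothetical sketch: column-level lineage and impact analysis derived from
# mappings held in a single metadata repository. All names are illustrative.
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class ColumnMapping:
    source: str          # e.g. "crm.customer.email"
    target: str          # e.g. "dwh.dim_customer.email_address"
    transformation: str  # transformation logic captured with the mapping

MAPPINGS = [
    ColumnMapping("crm.customer.email", "dwh.dim_customer.email_address", "LOWER(email)"),
    ColumnMapping("crm.customer.id", "dwh.dim_customer.customer_key", "direct"),
    ColumnMapping("dwh.dim_customer.email_address", "rpt.marketing.contact_email", "direct"),
]

def build_lineage(mappings):
    """Index mappings so each source column points at its downstream mappings."""
    downstream = defaultdict(list)
    for m in mappings:
        downstream[m.source].append(m)
    return downstream

def impact_of_change(column, downstream):
    """Walk downstream to list every column (and transformation) affected by a change."""
    affected, stack = [], [column]
    while stack:
        current = stack.pop()
        for m in downstream.get(current, []):
            affected.append((m.target, m.transformation))
            stack.append(m.target)
    return affected

lineage = build_lineage(MAPPINGS)
for target, rule in impact_of_change("crm.customer.email", lineage):
    print(f"changing crm.customer.email affects {target} via {rule}")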

Data Mapping Tools: Other Differentiators

In light of the aforementioned changes to data regulations, many organizations will need to consider the extent of a data mapping tool’s data lineage capabilities.

The ability to reverse-engineer and document the business logic from your reporting structures for true source-to-report lineage is key because it makes analysis (and trust in that analysis) easier. And should a data breach occur, affected data and persons can be identified more quickly in accordance with GDPR.

Article 33 of GDPR requires organizations to notify the appropriate supervisory authority “without undue delay and, where feasible, not later than 72 hours” after discovering a breach.

As stated above, a data governance platform that draws from a single metadata source is even more advantageous here.

Mappings can be synchronized with metadata so that source or target metadata changes are automatically pushed into the mappings, keeping them up to date with little or no effort.
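
As a rough sketch of that synchronization, the snippet below assumes mappings are stored as (source, target, transformation) records and that the metadata repository supplies a log of renamed columns; all names are hypothetical.

# Hypothetical sketch: pushing metadata changes into mappings so they stay current.
def push_metadata_changes(mappings, renamed_columns):
    """renamed_columns maps old fully qualified column names to new ones."""
    updated = []
    for source, target, rule in mappings:
        source = renamed_columns.get(source, source)  # swap in the new name if renamed
        target = renamed_columns.get(target, target)
        updated.append((source, target, rule))
    return updated

mappings = [("crm.customer.email", "dwh.dim_customer.email_address", "LOWER(email)")]
renames = {"crm.customer.email": "crm.customer.email_primary"}
print(push_metadata_changes(mappings, renames))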

The Data Mapping Tool For Data-Driven Businesses

Nobody likes manual documentation. It’s arduous, error-prone and a waste of resources. Quite frankly, it’s dated.

Any organization looking to invest in data mapping, data preparation and/or data cataloging needs to make automation a priority.

With automated data mapping, organizations can achieve “true data intelligence” – the ability to tell the story of how data enters the organization and changes throughout the entire lifecycle, up to the consumption/reporting layer. If you’re working harder than your tool, you have the wrong tool.

The manual tools of old do not have auto documentation capabilities, cannot produce outbound code for multiple ETL or script types, and are a liability in terms of accuracy and GDPR.

Automated data mapping is the only path to true GDPR compliance, and erwin Mapping Manager can get you there in a matter of weeks thanks to our robust reverse-engineering technology. 

Learn more about erwin’s automation framework for data governance here.

Automate Data Mapping

Data Preparation and Data Mapping: The Glue Between Data Management and Data Governance to Accelerate Insights and Reduce Risks

Organizations have spent a lot of time and money trying to harmonize data across diverse platforms, including cleansing, uploading metadata, converting code, defining business glossaries, tracking data transformations and so on. But the attempts to standardize data across the entire enterprise haven’t produced the desired results.

A company can’t effectively implement data governance – documenting and applying business rules and processes, analyzing the impact of changes and conducting audits – if it fails at data management.

The problem usually starts by relying on manual integration methods for data preparation and mapping. It’s only when companies take their first stab at manually cataloging and documenting operational systems, processes and the associated data, both at rest and in motion, that they realize how time-consuming the entire data prepping and mapping effort is, and why that work is sure to be compounded by human error and data quality issues.

To effectively promote business transformation, as well as fulfill regulatory and compliance mandates, there can’t be any mishaps.

Obviously, the manual road makes it very challenging to discover and synthesize data that resides in different formats across thousands of unharvested, undocumented databases, applications, ETL processes and procedural code.

Consider the problematic issue of manually mapping source system fields (typically source files or database tables) to target system fields (such as different tables in target data warehouses or data marts).

These source mappings generally are documented across a slew of unwieldy spreadsheets in their “pre-ETL” stage as the input for ETL development and testing. However, the ETL design process often suffers as it evolves because spreadsheet mapping data isn’t updated or may be incorrectly updated thanks to human error. So questions linger about whether transformed data can be trusted.
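
To illustrate how error-prone those spreadsheets are – and how easily a tool can automate the checks – here’s a minimal sketch that scans a mapping sheet exported to CSV for the most common slips. The column names are hypothetical, not a real pre-ETL template.

# Hypothetical sketch: basic sanity checks on a "pre-ETL" mapping spreadsheet
# exported to CSV. Column names are illustrative only.
import csv

def check_mapping_sheet(path):
    problems, seen_targets = [], set()
    with open(path, newline="") as f:
        # start=2 because row 1 of the sheet holds the headers
        for row_num, row in enumerate(csv.DictReader(f), start=2):
            source = (row.get("source_field") or "").strip()
            target = (row.get("target_field") or "").strip()
            if not source or not target:
                problems.append(f"row {row_num}: missing source or target")
            elif target in seen_targets:
                problems.append(f"row {row_num}: target {target} mapped more than once")
            seen_targets.add(target)
    return problems

# Usage: print(check_mapping_sheet("customer_mappings.csv"))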

Data Quality Obstacles

The sad truth is that high-paid knowledge workers like data scientists spend up to 80 percent of their time finding and understanding source data and resolving errors or inconsistencies, rather than analyzing it for real value.

Statistics are similar for major data integration projects, such as data warehousing and master data management, with data stewards challenged to identify and document data lineage and sensitive data elements.

So how can businesses produce value from their data when errors are introduced through manual integration processes? How can enterprise stakeholders gain accurate and actionable insights when data can’t be easily and correctly translated into business-friendly terms?

How can organizations master seamless data discovery, movement, transformation and IT/business collaboration to reverse the ratio of preparation to value delivered?

What’s needed to overcome these obstacles is an automated, real-time, high-quality and metadata-driven pipeline useful for everyone, from data scientists to enterprise architects to business analysts to C-level execs.

Doing so will require a hearty data management strategy and technology for automating the timely delivery of quality data that measures up to business demands.

From there, organizations need a sturdy data governance strategy and technology to automatically link and sync well-managed data with core capabilities for auditing, statutory reporting and compliance requirements, as well as to drive business insights.

Creating a High-Quality Data Pipeline

Working hand-in-hand, data management and data governance provide a real-time, accurate picture of the data landscape, including “data at rest” in databases, data lakes and data warehouses and “data in motion” as it is integrated with and used by key applications. And there’s control of that landscape to facilitate insight and collaboration and limit risk.

With a metadata-driven, automated, real-time, high-quality data pipeline, all stakeholders can access data they understand, trust and are authorized to use. At last, they can base strategic decisions on a full inventory of reliable information.

The integration of data management and governance also supports industry needs to fulfill regulatory and compliance mandates, ensuring that audits are not compromised by the inability to discover key data or by failing to tag sensitive data as part of integration processes.

Data-driven insights, agile innovation, business transformation and regulatory compliance are the fruits of data preparation/mapping and enterprise modeling (business process, enterprise architecture and data modeling) that revolves around a data governance hub.

erwin Mapping Manager (MM) combines data management and data governance processes in an automated flow through the integration lifecycle, from data mapping for harmonization and aggregation to generating the physical embodiment of data lineage – that is, the creation, movement and transformation of transactional and operational data.

Its hallmark is a consistent approach to data delivery (business glossaries connect physical metadata to specific business terms and definitions) and metadata management (via data mappings).

Automate Data Mapping

Top 10 Data Governance Predictions for 2019

This past year witnessed a data governance awakening – or as the Wall Street Journal called it, a “global data governance reckoning.” There was tremendous data drama and resulting trauma – from Facebook to Equifax and from Yahoo to Marriott. The list goes on and on. And then, the European Union’s General Data Protection Regulation (GDPR) took effect, with many organizations scrambling to become compliant.

So what’s on the horizon for data governance in the year ahead? We’re making the following data governance predictions for 2019:

Data Governance Predictions

1. GDPR-esque regulation for the United States:

GDPR has set the bar and will become the de facto standard across geographies. Look at California, with the California Consumer Privacy Act (CCPA) going into effect in 2020. Even big technology companies like Apple, Google, Amazon and Twitter are encouraging more regulation, in part because they realize that companies that don’t put data privacy at the forefront will feel the wrath of both the government and the consumer.

2. GDPR fines are coming and they will be massive:

Perhaps one of the safest data governance predictions for 2019 is the coming clampdown on GDPR enforcement. The regulations weren’t brought in for show, so it’s likely the fine-free streak for GDPR will be ending … and soon. The headlines will resemble those for data breaches or for hospitals with Health Insurance Portability and Accountability Act (HIPAA) violations in the U.S. healthcare sector. Lots of companies will have an “oh crap” moment and realize they have a lot more to do to get their compliance house in order.

3. Data policies as consumer buying criteria:

The threat of “data trauma” will continue to drive visibility for enterprise data in the C-suite. How they respond will be the key to their long-term success in transforming data into a true enterprise asset. We will start to see a clear delineation between organizations that maintain a reactive and defensive stance (pain avoidance) versus those that leverage this negative driver as an impetus to increase overall data visibility and fluency across the enterprise with a focus on opportunity enablement. The latter will drive the emergence of true data-driven entities versus those that continue to try to plug the holes in the boat.

4. CDOs will rise, with a better-defined role within the organization:

We will see the chief data officer (CDO) role elevated from being a lieutenant of the CIO to taking a proper seat at the table beside the CIO, CMO and CFO.  This will give them the juice needed to create a sustainable vision and roadmap for data. So far, there’s been a profound lack of consensus on the nature of the role and responsibilities, mandate and background that qualifies a CDO. As data becomes increasingly more vital to an organization’s success from a compliance and business perspective, the role of the CDO will become more defined.

5. Data operations (DataOps) gains traction and becomes fully optimized:

Much like how DevOps has taken hold over the past decade, 2019 will see a similar push for DataOps. Data is no longer just an IT issue. As organizations become data-driven and awash in an overwhelming amount of data from multiple sources (AI, IoT, ML, etc.), they will need to get a better handle on data quality and focus on data management processes and practices. DataOps will enable organizations to better democratize their data and ensure that all business stakeholders work together to deliver quality, data-driven insights.

Data Management and Data Governance

6. Business process will move from back office to center stage:

Business process management will make its way out of the back office and emerge as a key component to digital transformation. The ability for an organization to model, build and test automated business processes is a gamechanger. Enterprises can clearly define, map and analyze workflows and build models to drive process improvement as well as identify business practices susceptible to the greatest security, compliance or other risks and where controls are most needed to mitigate exposures.

7. Turning bad AI/ML data good:

Artificial Intelligence (AI) and Machine Learning (ML) are consumers of data. The risk of training AI and ML applications with bad data will initially drive the need for data governance to properly govern the training data sets. Once trained, the data they produce should be well defined, consistent and of high quality. The data needs to be continuously governed for assurance purposes.

8. Managing data from going over the edge:

Edge computing will continue to take hold. And while speed of data is driving its adoption, organizations will also need to view, manage and secure this data and bring it into an automated pipeline. The internet of things (IoT) is all about new data sources (device data) that often have opaque data structures. This data is often integrated and aggregated with other enterprise data sources and needs to be governed like any other data. The challenge is documenting all the different device management information bases (MIBs) and mapping them into the data lake or integration hub.

9. Organizations that don’t have good data harvesting are doomed to fail:

Research shows that data scientists and analysts spend 80 percent of their time preparing data for use and only 20 percent of their time actually analyzing it for business value. Without automated data harvesting and ingesting data from all enterprise sources (not just those that are convenient to access), data moving through the pipeline won’t be the highest quality and the “freshest” it can be. The result will be faulty intelligence driving potentially disastrous decisions for the business.

10. Data governance evolves to data intelligence:

Regulations like GDPR are driving most large enterprises to address their data challenges. But data governance is more than compliance. “Best-in-breed” enterprises are looking at how their data can be used as a competitive advantage. These organizations are evolving their data governance practices to data intelligence – connecting all of the pieces of their data management and data governance lifecycles to create actionable insights. Data intelligence can help improve customer experiences and enable innovation of products and services.

The erwin Expert Blog will continue to follow data governance trends and provide best practice advice in the New Year so you can see how our data governance predictions pan out for yourself. To stay up to date, click here to subscribe.

Data Management and Data Governance: Solving the Enterprise Data Dilemma

The Unified Data Platform – Connecting Everything That Matters

Businesses stand to gain a lot from a unified data platform.

This decade has seen data-driven leaders dominate their respective markets and inspire other organizations across the board to use data to fuel their businesses, leveraging this strategic asset to create more value below the surface. It’s even been dubbed “the new oil,” but data is arguably more valuable than the analogy suggests.

Data governance (DG) is a key component of the data value chain because it connects people, processes and technology as they relate to the creation and use of data. It equips organizations to better deal with increasing data volumes, the variety of data sources, and the speed at which data is processed.

But for an organization to realize and maximize its true data-driven potential, a unified data platform is required. Only then can all data assets be discovered, understood, governed and socialized to produce the desired business outcomes while also reducing data-related risks.

Benefits of a Unified Data Platform

Data governance can’t succeed in a bubble; it has to be connected to the rest of the enterprise. Whether strategic, such as risk and compliance management, or operational, like a centralized help desk, your data governance framework should span and support the entire enterprise and its objectives, which it can’t do from a silo.

Let’s look at some of the benefits of a unified data platform with data governance as the key connection point.

Understand current and future state architecture with business-focused outcomes:

A unified data platform with a single metadata repository connects data governance to the roles, goals, strategies and KPIs of the enterprise. Through integrated enterprise architecture modeling, organizations can capture, analyze and incorporate the structure and priorities of the enterprise and related initiatives.

This capability allows you to plan, align, deploy and communicate a high-impact data governance framework and roadmap that sets manageable expectations and measures success with metrics important to the business.

Document capabilities and processes and understand critical paths:

A unified data platform connects data governance to what you do as a business and the details of how you do it. It enables organizations to document and integrate their business capabilities and operational processes with the critical data that serves them.

It also provides visibility and control by identifying the critical paths that will have the greatest impacts on the business.

Realize the value of your organization’s data:

A unified data platform connects data governance to specific business use cases. The value of data is realized by combining different elements to answer a business question or meet a specific requirement. Conceptual and logical schemas and models provide a much richer understanding of how data is related and combined to drive business value.

2020 Data Governance and Automation Report

Harmonize data governance and data management to drive high-quality deliverables:

A unified data platform connects data governance to the orchestration and preparation of data to drive the business, governing data throughout the entire lifecycle – from creation to consumption.

Governing the data management processes that make data available is of equal importance. By harmonizing the data governance and data management lifecycles, organizations can drive high-quality deliverables that are governed from day one.

Promote a business glossary for a common understanding of data terminology:

A unified data platform connects data governance to the language of the business when discussing and describing data. Understanding the terminology and semantic meaning of data from a business perspective is imperative, but most business consumers of data don’t have technical backgrounds.

A business glossary promotes data fluency across the organization and vital collaboration between different stakeholders within the data value chain, ensuring all data-related initiatives are aligned and business-driven.

Instill a culture of personal responsibility for data governance:

A unified data platform is inherently connected to the policies, procedures and business rules that inform and govern the data lifecycle. The centralized management and visibility afforded by linking policies and business rules at every level of the data lifecycle will improve data quality, reduce expensive re-work, and improve the ideation and consumption of data by the business.

Business users will know how to use (and how not to use) data, while technical practitioners will have a clear view of the controls and mechanisms required when building the infrastructure that serves up that data.

Better understand the impact of change:

Data governance should be connected to the use of data across roles, organizations, processes, capabilities, dashboards and applications. Proactive impact analysis is key to efficient and effective data strategy. However, most solutions don’t tell the whole story when it comes to data’s business impact.

By adopting a unified data platform, organizations can extend impact analysis well beyond data stores and data lineage for true visibility into who, what, where and how the impact will be felt, breaking down organizational silos.

Getting the Competitive “EDGE”

The erwin EDGE delivers an “enterprise data governance experience” in which every component of the data value chain is connected.

Now with data mapping, it unifies data preparation, enterprise modeling and data governance to simplify the entire data management and governance lifecycle.

Both IT and the business have access to an accurate, high-quality and real-time data pipeline that fuels regulatory compliance, innovation and transformation initiatives with accurate and actionable insights.

Top 10 Reasons to Automate Data Mapping and Data Preparation

Data preparation is notorious for being the most time-consuming area of data management. It’s also expensive.

“Surveys show the vast majority of time is spent on this repetitive task, with some estimates showing it takes up as much as 80% of a data professional’s time,” according to Information Week. And a Trifacta study notes that overreliance on IT resources for data preparation costs organizations billions.

Collected data can come in a variety of forms, but most often in IT shops around the world, it comes in a spreadsheet – or rather a collection of spreadsheets, often numbering in the hundreds or thousands.

Most organizations, especially those competing in the digital economy, don’t have enough time or money for data management using manual processes. And outsourcing is also expensive, with inevitable delays because these vendors are dependent on manual processes too.

Automate Data Mapping

Taking the Time and Pain Out of Data Preparation: 10 Reasons to Automate Data Preparation/Data Mapping

1. Governance and Infrastructure

Data governance and a strong IT infrastructure are critical in the valuation, creation, storage, use, archival and deletion of data. Beyond the simple ability to know where the data came from and whether or not it can be trusted, there is an element of statutory reporting and compliance that often requires a knowledge of how that same data (known or unknown, governed or not) has changed over time.

A design platform that allows for insights like data lineage, impact analysis, full history capture, and other data management features can provide a central hub from which everything can be learned and discovered about the data – whether a data lake, a data vault, or a traditional warehouse.

2. Eliminating Human Error

In the traditional data management organization, Excel spreadsheets are used to manage the incoming data design, or what is known as the “pre-ETL” mapping documentation – which does not lend itself to any sort of visibility or auditability. In fact, each unit of work represented in these ‘mapping documents’ becomes an independent variable in the overall system development lifecycle, and therefore nearly impossible to learn from, much less standardize.

The key to creating accuracy and integrity in any exercise is to eliminate the opportunity for human error – which does not mean eliminating humans from the process but incorporating the right tools to reduce the likelihood of error as the human beings apply their thought processes to the work.  

3. Completeness

The ability to scan and import from a broad range of sources and formats, as well as automated change tracking, means that you will always be able to import your data from wherever it lives and track all of the changes to that data over time.

4. Adaptability

Centralized design, immediate lineage and impact analysis, and change activity logging mean you will always have the answer readily available, or just a few clicks away. Subsets of data can be identified and generated via predefined templates, and generic designs generated from standard mapping documents can be pushed through the ETL process via automation templates for faster processing.

5. Accuracy

Out-of-the-box capabilities to map your data from source to report make reconciliation and validation a snap, with auditability and traceability built in. Build a full array of validation rules that can be cross-checked against the design mappings in a centralized repository.
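
Here’s a minimal sketch of such a cross-check, assuming mappings and validation rules are both keyed by target column in the central repository; the column names and rules are hypothetical.

# Hypothetical sketch: cross-checking validation rules against design mappings.
def cross_check(mappings, validation_rules):
    """mappings: {target_column: source_column}; validation_rules: {target_column: rule}."""
    unvalidated = [t for t in mappings if t not in validation_rules]
    orphan_rules = [t for t in validation_rules if t not in mappings]
    return unvalidated, orphan_rules

mappings = {"dwh.dim_customer.email_address": "crm.customer.email"}
rules = {"dwh.dim_customer.email_address": "must match email pattern",
         "dwh.dim_customer.birth_date": "must not be in the future"}
print(cross_check(mappings, rules))  # ([], ['dwh.dim_customer.birth_date'])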

6. Timeliness

The ability to be agile and reactive is important – being good at being reactive doesn’t sound like a quality that deserves a pat on the back, but in the case of regulatory requirements, it is paramount.

7. Comprehensiveness

With access to all of the underlying metadata, source-to-report design mappings, and source and target repositories, you have the power to create reports within your reporting layer that have a traceable origin and can be easily explained to IT, business and regulatory stakeholders.

8. Clarity

The requirements inform the design, the design platform puts those to action, and the reporting structures are fed the right data to create the right information at the right time via nearly any reporting platform, whether mainstream commercial or homegrown.

9. Frequency

Adaptation is the key to meeting any frequency interval. Centralized designs and automated ETL patterns that feed your database schemas and reporting structures allow cyclical changes to be made and implemented in half the time of conventional means. Getting beyond the spreadsheet, enabling pattern-based ETL, and automating schema population are ways to ensure you will be ready whenever the need arises to show an audit trail of the change process and clearly articulate who did what and when throughout the system development lifecycle.
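
As a minimal sketch of pattern-based ETL, the snippet below renders a simple INSERT ... SELECT from a centralized mapping design; the table and column names are hypothetical, and real automation templates would be far richer.

# Hypothetical sketch: rendering a simple INSERT ... SELECT from a mapping design.
def render_etl(source_table, target_table, column_mappings):
    """column_mappings: list of (target_column, source_expression) pairs."""
    targets = ", ".join(col for col, _ in column_mappings)
    sources = ", ".join(expr for _, expr in column_mappings)
    return (f"INSERT INTO {target_table} ({targets})\n"
            f"SELECT {sources}\n"
            f"FROM {source_table};")

print(render_etl("staging.customer", "dwh.dim_customer",
                 [("customer_key", "id"), ("email_address", "LOWER(email)")]))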

10. Business-Friendly

A user interface designed to be business-friendly means there’s no need to be a data integration specialist to review the common practices outlined and “passively enforced” throughout the tool. Once a process is defined, rules implemented, and templates established, there is little opportunity for error or deviation from the overall process. A diverse set of role-based security options means that everyone can collaborate, learn and audit while maintaining the integrity of the underlying process components.

Faster, More Accurate Analysis with Fewer People

What if you could get more accurate data preparation 50% faster and double your analysis with fewer people?

erwin Mapping Manager (MM) is a patented solution that automates data mapping throughout the enterprise data integration lifecycle, providing data visibility, lineage and governance – freeing up that 80% of a data professional’s time to put that data to work.

With erwin MM, data integration engineers can design and reverse-engineer the movement of data implemented as ETL/ELT operations and stored procedures, building mappings between source and target data assets and designing the transformation logic between them. These designs then can be exported to most ETL and data asset technologies for implementation.

erwin MM is 100% metadata-driven and used to define and drive standards across enterprise integration projects, enable data and process audits, improve data quality, streamline downstream workflows, increase productivity (especially across geographically dispersed teams) and give project teams, IT leadership and management visibility into the ‘real’ status of integration and ETL migration projects.

If an automated data preparation/mapping solution sounds good to you, please check out erwin MM here.

Solving the Enterprise Data Dilemma