Tag: data scientist

Four Use Cases Proving the Benefits of Metadata-Driven Automation

Post author By Bunny Tharpe
Post date February 7, 2019
1 Comment on Four Use Cases Proving the Benefits of Metadata-Driven Automation

Organization’s cannot hope to make the most out of a data-driven strategy, without at least some degree of metadata-driven automation.

The volume and variety of data has snowballed, and so has its velocity. As such, traditional – and mostly manual – processes associated with data management and data governance have broken down. They are time-consuming and prone to human error, making compliance, innovation and transformation initiatives more complicated, which is less than ideal in the information age.

So it’s safe to say that organizations can’t reap the rewards of their data without automation.

Data scientists and other data professionals can spend up to 80 percent of their time bogged down trying to understand source data or addressing errors and inconsistencies.

That’s time needed and better used for data analysis.

By implementing metadata-driven automation, organizations across industry can unleash the talents of their highly skilled, well paid data pros to focus on finding the goods: actionable insights that will fuel the business.

Metadata-Driven Automation in the BFSI Industry

The banking, financial services and insurance industry typically deals with higher data velocity and tighter regulations than most. This bureaucracy is rife with data management bottlenecks.

These bottlenecks are only made worse when organizations attempt to get by with systems and tools that are not purpose-built.

For example, manually managing data mappings for the enterprise data warehouse via MS Excel spreadsheets had become cumbersome and unsustainable for one BSFI company.

After embracing metadata-driven automation and custom code automation templates, it saved hundreds of thousands of dollars in code generation and development costs and achieved more work in less time with fewer resources. ROI on the automation solutions was realized within the first year.

Metadata-Driven Automation in the Pharmaceutical Industry

Despite its shortcomings, the Excel spreadsheet method for managing data mappings is common within many industries.

But with the amount of data organizations need to process in today’s business climate, this manual approach makes change management and determining end-to-end lineage a significant and time-consuming challenge.

One global pharmaceutical giant headquartered in the United States experienced such issues until it adopted metadata-driven automation. Then the pharma company was able to scan in all source and target system metadata and maintain it within a single repository. Users now view end-to-end data lineage from the source layer to the reporting layer within seconds.

On the whole, the implementation resulted in extraordinary time savings and a total cost reduction of 60 percent.

Metadata-Driven Automation in the Insurance Industry

Insurance is another industry that has to cope with high data velocity and stringent data regulations. Plus many organizations in this sector find that they’ve outgrown their systems.

For example, an insurance company using a CDMA product to centralize data mappings is probably missing certain critical features, such as versioning, impact analysis and lineage, which adds to costs, times to market and errors.

By adopting metadata-driven automation, organizations can standardize the pre-ETL data mapping process and better manage data integration through the change and release process. As a result, both internal data mapping and cross functional teams now have easy and fast web-based access to data mappings and valuable information like impact analysis and lineage.

Here is the story of a business that adopted such an approach and achieved operational excellence and a delivery time reduction by 80 percent, as well as achieving ROI within 12 months.

Metadata-Driven Automation for a Non-Profit

Another common issue cited by organizations using manual data mapping is ballooning complexity and subsequent confusion.

Any organization expanding its data-driven focus without sufficiently maturing data management initiative(s) will experience this at some point.

One of the world’s largest humanitarian organizations, with millions of members and volunteers operating all over the world, was confronted with this exact issue.

It recognized the need for a solution to standardize the pre-ETL data mapping process to make data integration more efficient and cost-effective.

With metadata-driven automation, the organization would be able to scan and store metadata and data dictionaries in a central repository, as well as manage the business definitions and data dictionary for legacy systems contributing data to the enterprise data warehouse.

By adopting such an approach, the organization realized time savings across all IT development and cross-functional testing teams. Additionally, they were able to more easily manage mappings, code sets, reference data and data validation rules.

Again, ROI was achieved within a year.

A Universal Solution for Metadata-Driven Automation

Metadata-driven automation is a capability any organization can benefit from – regardless of industry, as demonstrated by the various real-world use cases chronicled here.

The erwin Automation Framework is a key component of the erwin EDGE platform for comprehensive data management and data governance.

With it, data professionals realize these industry-agnostic benefits:

Centralized and standardized code management with all automation templates stored in a governed repository
Better quality code and minimized rework
Business-driven data movement and transformation specifications
Superior data movement job designs based on best practices
Greater agility and faster time-to-value in data preparation, deployment and governance
Cross-platform support of scripting languages and data movement technologies

Learn more about metadata-driven automation as it relates to data preparation and enterprise data mapping.

Join one our weekly erwin Mapping Manager demos.

erwin Expert Blog

Data Preparation and Data Mapping: The Glue Between Data Management and Data Governance to Accelerate Insights and Reduce Risks

Organizations have spent a lot of time and money trying to harmonize data across diverse platforms, including cleansing, uploading metadata, converting code, defining business glossaries, tracking data transformations and so on. But the attempts to standardize data across the entire enterprise haven’t produced the desired results.

A company can’t effectively implement data governance – documenting and applying business rules and processes, analyzing the impact of changes and conducting audits – if it fails at data management.

The problem usually starts by relying on manual integration methods for data preparation and mapping. It’s only when companies take their first stab at manually cataloging and documenting operational systems, processes and the associated data, both at rest and in motion, that they realize how time-consuming the entire data prepping and mapping effort is, and why that work is sure to be compounded by human error and data quality issues.

To effectively promote business transformation, as well as fulfil regulatory and compliance mandates, there can’t be any mishaps.

It’s obvious that the manual road is very challenging to discover and synthesize data that resides in different formats in thousands of unharvested, undocumented databases, applications, ETL processes and procedural code.

Consider the problematic issue of manually mapping source system fields (typically source files or database tables) to target system fields (such as different tables in target data warehouses or data marts).

These source mappings generally are documented across a slew of unwieldy spreadsheets in their “pre-ETL” stage as the input for ETL development and testing. However, the ETL design process often suffers as it evolves because spreadsheet mapping data isn’t updated or may be incorrectly updated thanks to human error. So questions linger about whether transformed data can be trusted.

Data Quality Obstacles

The sad truth is that high-paid knowledge workers like data scientists spend up to 80 percent of their time finding and understanding source data and resolving errors or inconsistencies, rather than analyzing it for real value.

Statistics are similar when looking at major data integration projects, such as data warehousing and master data management with data stewards challenged to identify and document data lineage and sensitive data elements.

So how can businesses produce value from their data when errors are introduced through manual integration processes? How can enterprise stakeholders gain accurate and actionable insights when data can’t be easily and correctly translated into business-friendly terms?

How can organizations master seamless data discovery, movement, transformation and IT and business collaboration to reverse the ratio of preparation to value delivered.

What’s needed to overcome these obstacles is establishing an automated, real-time, high-quality and metadata- driven pipeline useful for everyone, from data scientists to enterprise architects to business analysts to C-level execs.

Doing so will require a hearty data management strategy and technology for automating the timely delivery of quality data that measures up to business demands.

From there, they need a sturdy data governance strategy and technology to automatically link and sync well-managed data with core capabilities for auditing, statutory reporting and compliance requirements as well as to drive business insights.

Creating a High-Quality Data Pipeline

Working hand-in-hand, data management and data governance provide a real-time, accurate picture of the data landscape, including “data at rest” in databases, data lakes and data warehouses and “data in motion” as it is integrated with and used by key applications. And there’s control of that landscape to facilitate insight and collaboration and limit risk.

With a metadata-driven, automated, real-time, high-quality data pipeline, all stakeholders can access data that they now are able to understand and trust and which they are authorized to use. At last they can base strategic decisions on what is a full inventory of reliable information.

The integration of data management and governance also supports industry needs to fulfill regulatory and compliance mandates, ensuring that audits are not compromised by the inability to discover key data or by failing to tag sensitive data as part of integration processes.

Data-driven insights, agile innovation, business transformation and regulatory compliance are the fruits of data preparation/mapping and enterprise modeling (business process, enterprise architecture and data modeling) that revolves around a data governance hub.

erwin Mapping Manager (MM) combines data management and data governance processes in an automated flow through the integration lifecycle from data mapping for harmonization and aggregation to generating the physical embodiment of data lineage – that is the creation, movement and transformation of transactional and operational data.

Its hallmark is a consistent approach to data delivery (business glossaries connect physical metadata to specific business terms and definitions) and metadata management (via data mappings).

erwin Expert Blog

The Data Governance (R)Evolution

Post author By Mariann McDonagh
Post date September 28, 2018
No Comments on The Data Governance (R)Evolution

Data governance continues to evolve – and quickly.

Historically, Data Governance 1.0 was siloed within IT and mainly concerned with cataloging data to support search and discovery. However, it fell short in adding value because it neglected the meaning of data assets and their relationships within the wider data landscape.

Then the push for digital transformation and Big Data created the need for DG to come out of IT’s shadows – Data Governance 2.0 was ushered in with principles designed for modern, data-driven business. This approach acknowledged the demand for collaborative data governance, the tearing down of organizational silos, and spreading responsibilities across more roles.

But this past year we all witnessed a data governance awakening – or as the Wall Street Journal called it, a “global data governance reckoning.” There was tremendous data drama and resulting trauma – from Facebook to Equifax and from Yahoo to Aetna. The list goes on and on. And then, the European Union’s General Data Protection Regulation (GDPR) took effect, with many organizations scrambling to become compliant.

So where are we today?

Simply put, data governance needs to be a ubiquitous part of your company’s culture. Your stakeholders encompass both IT and business users in collaborative relationships, so that makes data governance everyone’s business.

Data governance underpins data privacy, security and compliance. Additionally, most organizations don’t use all the data they’re flooded with to reach deeper conclusions about how to grow revenue, achieve regulatory compliance, or make strategic decisions. They face a data dilemma: not knowing what data they have or where some of it is—plus integrating known data in various formats from numerous systems without a way to automate that process.

To accelerate the transformation of business-critical information into accurate and actionable insights, organizations need an automated, real-time, high-quality data pipeline. Then every stakeholder—data scientist, ETL developer, enterprise architect, business analyst, compliance officer, CDO and CEO—can fuel the desired outcomes based on reliable information.

Connecting Data Governance to Your Organization

Data Mapping & Data Governance

The automated generation of the physical embodiment of data lineage—the creation, movement and transformation of transactional and operational data for harmonization and aggregation—provides the best route for enabling stakeholders to understand their data, trust it as a well-governed asset and use it effectively. Being able to quickly document lineage for a standardized, non-technical environment brings business alignment and agility to the task of building and maintaining analytics platforms.

Data Modeling & Data Governance

Data modeling discovers and harvests data schema, and analyzes, represents and communicates data requirements. It synthesizes and standardizes data sources for clarity and consistency to back up governance requirements to use only controlled data. It benefits from the ability to automatically map integrated and cataloged data to and from models, where they can be stored in a central repository for re-use across the organization.

Business Process Modeling & Data Governance

Business process modeling reveals the workflows, business capabilities and applications that use particular data elements. That requires that these assets be appropriately governed components of an integrated data pipeline that rests on automated data lineage and business glossary creation.

Enterprise Architecture & Data Governance

Data flows and architectural diagrams within enterprise architecture benefit from the ability to automatically assess and document the current data architecture. Automatically providing and continuously maintaining business glossary ontologies and integrated data catalogs inform a key part of the governance process.

The EDGE Revolution

By bringing together enterprise architecture, business process, data mapping and data modeling, erwin’s approach to data governance enables organizations to get a handle on how they handle their data and realize its maximum value. With the broadest set of metadata connectors and automated code generation, data mapping and cataloging tools, the erwin EDGE Platform simplifies the total data management and data governance lifecycle.

This single, integrated solution makes it possible to gather business intelligence, conduct IT audits, ensure regulatory compliance and accomplish any other organizational objective by fueling an automated, high-quality and real-time data pipeline.

The erwin EDGE creates an “enterprise data governance experience” that facilitates collaboration between both IT and the business to discover, understand and unlock the value of data both at rest and in motion.

With the erwin EDGE, data management and data governance are unified and mutually supportive of business stakeholders and IT to:

Discover data: Identify and integrate metadata from various data management silos.
Harvest data: Automate the collection of metadata from various data management silos and consolidate it into a single source.
Structure data: Connect physical metadata to specific business terms and definitions and reusable design standards.
Analyze data: Understand how data relates to the business and what attributes it has.
Map data flows: Identify where to integrate data and track how it moves and transforms.
Govern data: Develop a governance model to manage standards and policies and set best practices.
Socialize data: Enable stakeholders to see data in one place and in the context of their roles.

If you’ve enjoyed this latest blog series, then you’ll want to request a copy of Solving the Enterprise Data Dilemma, our new e-book that highlights how to answer the three most important data management and data governance questions: What data do we have? Where is it? And how do we get value from it?