
erwin, Microsoft and the Power of the Common Data Model

What is Microsoft’s Common Data Model (CDM), and why is it so powerful?

Imagine if every person in your organization spoke a different language and you had no simple way to translate what they were saying. Your work would be frustrating, complicated and slow.

The same is true for data, with a number of vendors creating data models by vertical industry (financial services, healthcare, etc.) and making them commercially available to improve how organizations understand and work with their data assets. The CDM takes this concept to the next level.

Microsoft has delivered a critical building block for the data-driven enterprise by capturing proven business data constructs and semantic descriptors for data across a wide range of business domains in the CDM and providing the contents in an open-source format for consumption and integration. The CDM provides a best-practices approach to defining data to accelerate data literacy, automation, integration and governance across the enterprise.

Why Is the CDM Such a Big Deal?

The value of the CDM shows up in multiple ways. One is enabling data to be unified. Another is reducing the time and effort spent on manual mapping – ultimately saving the organization money.

With a single definition of something in place, complex ETL doesn’t have to be performed repeatedly. Once something is defined, everyone can map to that standard definition of what the data means.
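To make the idea concrete, here is a minimal, hypothetical Python sketch: a simplified, CDM-style entity definition and a one-time mapping from a source system’s local field names to the shared attributes. The entity, attribute and field names are illustrative only; the actual CDM is published as open, machine-readable entity schemas.

    # A hypothetical, simplified rendering of a CDM-style entity definition and a source mapping.
    # Names are illustrative; the real CDM publishes its entity schemas in an open format.
    cdm_account = {
        "entity": "Account",
        "attributes": {
            "accountId": {"dataType": "guid", "description": "Unique identifier of the account."},
            "name": {"dataType": "string", "description": "Company or organization name."},
            "annualRevenue": {"dataType": "currency", "description": "Annual revenue of the account."},
        },
    }

    # Each source system maps its local fields to the shared definition once,
    # instead of re-deriving the meaning in every ETL job.
    crm_to_cdm = {"acct_no": "accountId", "acct_nm": "name", "annual_rev": "annualRevenue"}

    def to_cdm(record, mapping):
        """Rename a source record's fields to the standard, CDM-style attribute names."""
        return {mapping[field]: value for field, value in record.items() if field in mapping}

    print(to_cdm({"acct_no": "A-1001", "acct_nm": "Contoso", "annual_rev": 25000000}, crm_to_cdm))

Once that mapping to the standard definition exists, downstream consumers work against the common attribute names rather than each source’s local ones.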

Beyond saving time, effort and money, CDM can help transform your business in even more ways, including:

  • Innovation: With data having a common meaning, the business can unlock new scenarios, like modern and advanced analytics, experiential analytics, AI, email, etc.
  • Insights: Given the meaning of the data is the same, regardless of the domain it came from, an organization can use its data to power business insights.
  • Compliance: It improves data governance to comply with such regulations as the General Data Protection Regulation (GDPR).
  • Cloud migration and other data platform modernization efforts: A common definition of data reduces the re-mapping effort required when data moves to new platforms.

Once something is understood, and understood the same way across the enterprise, anyone can build semantically aware reporting and analytics and deliver a uniform view, because there is a common understanding of the data.

Data Modeling Tool

erwin Expands Collaboration with Microsoft

The combination of Microsoft’s CDM with erwin’s industry-leading data modeling, governance and automation solutions can optimize an organization’s data capability and accelerate the impact and business value of enterprise data.

erwin recently announced its expanded collaboration with Microsoft. By working together, the companies will help organizations get a handle on disparate data, put it in one place, and then determine how to do something meaningful with it.

The erwin solutions that use Microsoft’s CDM are:

erwin Data Modeler: erwin DM automatically transforms the CDM into a graphical model, complete with business-data constructs and semantic metadata, to feed your existing data-source models and new database designs – regardless of the technology upon which these structures are deployed.

erwin DM’s reusable model templates, design layer and model compare/synchronization capabilities, combined with our design lifecycle and modeler collaboration services, enable organizations to capture and use CDM contents and best practices to optimize enterprise data definition, design and deployment.

erwin DM also enables the reuse of the CDM in the design and maintenance of enterprise data sources. It automatically consumes, integrates and maintains CDM metadata in a standardized, reusable design and supports logical and physical modeling and integration with all major DBMS technologies.

The erwin Data Intelligence Suite: erwin DI automatically scans, captures and activates metadata from the CDM into a central business glossary. Here, it is intelligently integrated and connected to the metadata from the data sources that feed enterprise applications.

Your comprehensive metadata landscape, including CDM metadata, is governed with the appropriate terminology, policies, rules and other business classifications you decide to build into your framework.

The resulting data intelligence is then discoverable via a self-service business user portal that provides role-based, contextual views. All this metadata-driven automation is possible thanks to erwin DI’s ability to consume and associate CDM metadata to create a data intelligence framework.

erwin and Microsoft recently co-presented a session on the power of the CDM that included a demonstration of how to create a data lake for disparate data sources, migrate all that data to it, and then provide business users with contextual views of the underlying metadata, based on a CDM-enriched business glossary.

The demonstration also covered the automatic generation of scripts for ETL tools, as well as the auto-generation of data lineage diagrams and impact analysis, so data governance is built in and continuous.

You can watch the full erwin/Microsoft session here.



How to Do Data Modeling the Right Way

Data modeling supports collaboration among business stakeholders with different job roles and skills, coordinating their work with business objectives.

Data resides everywhere in a business, on premises and in private or public clouds. And it exists across these hybrid architectures in different formats: big, unstructured data and traditional, structured business data may physically sit in different places.

What’s desperately needed is a way to understand the relationships and interconnections between so many entities in data sets in detail.

Visualizing data from anywhere defined by its context and definition in a central model repository, as well as the rules for governing the use of those data elements, unifies enterprise data management. A single source of data truth helps companies begin to leverage data as a strategic asset.

What, then, should users look for in a data modeling product to support their governance/intelligence requirements in the data-driven enterprise?

Data Modeling

Nine Steps to Data Modeling

  1. Provide metadata and schema visualization regardless of where data is stored

Data modeling solutions need to account for metadata and schema visualization to mitigate complexity and increase collaboration and literacy across a broad range of data stakeholders. They should automatically generate data models, providing a simple, graphical display to visualize a wide range of enterprise data sources based on a common repository of standard data assets through a single interface.

  2. Have a process and mechanism to capture, document and integrate business and semantic metadata for data sources

As the best way to view metadata to support data governance and intelligence, data models can depict the metadata content for a data catalog. A data modeling solution should make it possible for business and semantic metadata to be created to augment physical data for ingestion into a data catalog, which provides a mechanism for IT and business users to make use of the metadata or data structures underpinning source systems.

High-functioning data catalogs will provide a technical view of information flow as well as deeper insights into semantic lineage – that is, how the data asset metadata maps to corresponding business usage tables.

Data stewards can associate business glossary terms, data element definitions, data models and other semantic details with different mappings, drawing upon visualizations that demonstrate where business terms are in use, how they are mapped to different data elements in different systems and the relationships among these different usage points.
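As a minimal illustration of the “where is this term in use?” question a steward might ask, the following Python sketch associates a business glossary term with the columns mapped to it across systems. The term, definition and column names are hypothetical.

    # Hypothetical business-glossary entry; the term, definition and mapped columns are illustrative.
    glossary = {
        "Customer Lifetime Value": {
            "definition": "Projected net revenue attributed to the full customer relationship.",
            "mapped_columns": ["crm.customer.clv_score", "dw.fact_customer.lifetime_value"],
        },
    }

    def where_used(term):
        """Answer the steward's question: in which systems and columns is this business term in use?"""
        return glossary.get(term, {}).get("mapped_columns", [])

    print(where_used("Customer Lifetime Value"))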

  3. Create database designs from visual models

Time is saved and errors are reduced when visual data models are available for use in translating the high-quality data sources that populate them into new relational and non-relational database design, standardization, deployment and maintenance.

  4. Reverse engineer databases into data models

Ideally a solution will let users create a logical and physical data model by adroitly extracting information from an existing data source – ERP, CRM or other enterprise application — and choosing the objects to use in the model.

This can be employed to translate the technical formats of the major database platforms into detailed physical entity-relationship models, rich in business and semantic metadata, that visualize and diagram complex database objects.

Database code reverse-engineering, integrated development environment connections and model exchange will ensure efficiency, effectiveness and consistency in the design, standardization, documentation and deployment of data structures for comprehensive enterprise database management. Also helpful is if the offline reverse-engineering process is automated so that modelers can focus on other high-value tasks.
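For a sense of what automated reverse engineering involves, here is a small sketch using SQLAlchemy’s reflection API to pull tables, columns and foreign keys out of an existing database into a simple model structure. The connection string is a placeholder, and a dedicated modeling tool would capture far more, such as indexes, constraints, comments and semantic metadata.

    from sqlalchemy import create_engine, inspect

    # Placeholder connection string; point it at the database you want to reverse engineer.
    engine = create_engine("postgresql://user:password@host/dbname")
    inspector = inspect(engine)

    model = {}
    for table_name in inspector.get_table_names():
        model[table_name] = {
            "columns": [
                {"name": col["name"], "type": str(col["type"]), "nullable": col["nullable"]}
                for col in inspector.get_columns(table_name)
            ],
            "foreign_keys": [
                {"columns": fk["constrained_columns"], "references": fk["referred_table"]}
                for fk in inspector.get_foreign_keys(table_name)
            ],
        }

    # 'model' now holds a minimal logical picture of the physical schema,
    # ready to be diagrammed or enriched with business metadata.
    print(model)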

  5. Harness model reusability and design standards

When data modelers can take advantage of intuitive graphical interfaces, they’ll have an easier time viewing data from anywhere in terms of its context, meaning and relationships, supporting artifact reuse for large-scale data integration, master data management, big data and business intelligence/analytics initiatives.

It’s typically the case that modelers will want to create models containing reusable objects such as modeling templates, entities, tables, domains, automation macros, naming and database standards, formatting options, and so on.

The ability to modify the way data types are mapped for specific DBMS data types and to create reusable design standards across the business should be fostered through customizable functionality. Reuse serves to help lower the costs of development and maintenance and ensure data quality for governance requirements.

Additionally, templates should be available to help enable standardization and reuse while accelerating the development and maintenance of models. Standardization and reuse of models across data management environments will be possible when there is support for model exchange.

Consistency and reuse are more efficient when model development and assets are centralized. That makes it easier to publish models across various stakeholders and incorporate comments and changes from them as necessary.
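Returning to the data-type mapping capability described above, a reusable design standard can be as simple as a shared lookup that resolves a logical domain to the physical type for each target DBMS. The Python sketch below is illustrative only; the domains and type choices are assumptions, not a prescribed standard.

    # Illustrative mapping of reusable logical domains to physical types per DBMS.
    TYPE_MAP = {
        "money":      {"sql_server": "DECIMAL(19,4)", "postgresql": "NUMERIC(19,4)", "oracle": "NUMBER(19,4)"},
        "short_text": {"sql_server": "NVARCHAR(100)", "postgresql": "VARCHAR(100)",  "oracle": "VARCHAR2(100)"},
        "timestamp":  {"sql_server": "DATETIME2",      "postgresql": "TIMESTAMP",     "oracle": "TIMESTAMP"},
    }

    def physical_type(domain, dbms):
        """Resolve a logical domain used in the model to the physical type for a target platform."""
        return TYPE_MAP[domain][dbms]

    print(physical_type("money", "postgresql"))  # NUMERIC(19,4)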

  6. Enable user configuration and point-and-click report interfaces

A key part of data modeling is creating text-based reports of diagrams and metadata in a number of formats – HTML, PDF and CSV. By using point-and-click interfaces, a solution can make it easier to create detailed metadata reports of models and drill down into granular graphical views of reports covering all object types (tables, UDPs and more).

The process is made even simpler when users can take advantage of out-of-the-box reports that are pertinent to their needs, as well as create them for individual models or across multiple models.

When generic ODBC interfaces are included, options grow for querying metadata, regardless of where it is sourced, from a variety of tools and interfaces.
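As a rough sketch of that kind of metadata query, the snippet below uses a generic ODBC connection (via pyodbc) to pull model metadata and write it to a CSV report. The DSN, table and column names are hypothetical placeholders for whatever interface your modeling repository exposes.

    import csv
    import pyodbc  # requires an ODBC driver/DSN configured for your metadata source

    # The DSN, table and column names below are placeholders, not a documented schema.
    connection = pyodbc.connect("DSN=model_metadata")
    cursor = connection.cursor()
    cursor.execute("SELECT table_name, column_name, data_type FROM model_columns")

    with open("model_metadata_report.csv", "w", newline="") as report:
        writer = csv.writer(report)
        writer.writerow([column[0] for column in cursor.description])  # header row from the result set
        writer.writerows(cursor.fetchall())

    connection.close()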

  7. Support an all-inclusive environment of collaboration

When solutions focus on model management in a centralized repository, modular and bidirectional collaboration services are empowered across all data generators – human or machine – and stewards and consumers across the enterprise.

Data siloes, of course, are the enemies of data governance. They make it difficult to have a clear understanding of where information resides and how data is commonly defined.

It’s far better to centralize and manage access to ordered assets – whether by particular internal staff roles or to business partners granted role-based and read-only access – to maintain security.

Such an approach supports coordinated version control, model change management and conflict resolution and seeds cross-model impact analysis across stakeholders. Modeler productivity and independence can be enhanced, too.

  8. Promote data literacy

Stakeholder collaboration, in fact, depends on and is optimized by data literacy, the key to creating an organization that is fluent in the language of data. Everyone in the enterprise – from data scientists to ETL developers to compliance officers to C-level executives – ought to be assured of having a dynamic view of high-quality data pipelines operating on common and standardized terms.

So, it is critical that solutions focus on making the pipeline data available and discoverable in such a way that it reflects different user roles. When consumers can view data relevant to their roles and understand its definition within the business context in which they operate, their ability to produce accurate, actionable insights and collaborate across the enterprise to enact them for the desired outcomes is enhanced.

Data literacy built on business glossaries, which enable the collaborative definition of enterprise data in business terms and rules for built-in accountability and workflow, promotes adherence to governance requirements.

  9. Embed data governance constructs within data models

Data governance should be integrated throughout the data modeling process. It manifests in a solution’s ability to adroitly discover and document any data from anywhere for consistency, clarity and artifact reuse across large-scale data integration, master data management, metadata management and big data requirements.

Data catalogs and business glossaries with properly defined data definitions in a controlled central repository are the result of ingesting metadata from data models for business intelligence and analytics initiatives.

You Don’t Know What You’ve Got

Bottom line, without centralized data models and a metadata hub, there is no efficient means to comply with industry regulations and business standards regarding security and privacy; set permissions for access controls; and consolidate information in easy-to-understand reports for business analysts.

Participating in data modeling to classify the data that is most important to the business, in terms that are meaningful to the business, and to break down complex data organization scenarios supports critical business reporting, intelligence and analytics tasks. That’s a clear need, as organizations today analyze and use less than 0.5 percent of the information they take in – a huge loss of potential value in the age of data-driven business.

Without illustrative data models, businesses may not even realize that they already have the data needed for a new report, and time is lost and costs increase as data is gathered and interfaces are rebuilt.

To learn more about data modeling and its role in the enterprise, join us for our upcoming webinar, Data Modeling Is Cool, Seriously.

Data Modeling Is Cool, Seriously


Modern Data Modeling: The Foundation of Enterprise Data Management and Data Governance

The role of data modeling (DM) has expanded to support enterprise data management, including data governance and intelligence efforts. After all, you can’t manage or govern what you can’t see, much less use it to make smart decisions.

Metadata management is the key to managing and governing your data and drawing intelligence from it. Beyond harvesting and cataloging metadata, it also must be visualized to break down the complexity of how data is organized and what data relationships there are so that meaning is explicit to all stakeholders in the data value chain.

Data Governance and Automation

Data models provide this visualization capability, create additional metadata and standardize the data design across the enterprise.

While modeling has always been the best way to understand complex data sources and automate design standards, modern data modeling goes well beyond these domains to ensure and accelerate the overall success of data governance in any organization.

It’s hard to overestimate the importance of that success, as data governance keeps the business in line with privacy mandates such as the General Data Protection Regulation (GDPR). It drives innovation too. Companies that want to advance AI initiatives, for instance, won’t get very far without quality data and well-defined data models.

Why Is Data Modeling the Building Block of Enterprise Data Management?

DM mitigates complexity and increases collaboration and literacy across a broad range of data stakeholders.

  • DM uncovers the connections between disparate data elements.
  • The DM process enables the creation and integration of business and semantic metadata to augment and accelerate data governance and intelligence efforts.
  • DM captures and shares how the business describes and uses data.
  • DM delivers design task automation and enforcement to ensure data integrity.
  • DM builds higher quality data sources with the appropriate structural veracity.
  • DM delivers design task standardization to improve business alignment and simplify integration.
  • DM builds a more agile and governable data architecture.
  • The DM process manages the design and maintenance lifecycle for data sources.
  • DM governs the design and deployment of data across the enterprise.
  • DM documents, standardizes and aligns any type of data no matter where it lives.

Realizing the Data Governance Value from Data Modeling

Modeling becomes the point of true collaboration within an organization because it delivers a visual source of truth for everyone to follow – data management and business professionals alike – in conforming to governance requirements.

Information is readily available within intuitive business glossaries, accessible to user roles according to parameters set by the business. The metadata repository behind these glossaries, populated by information stored in data models, serves up the key terms that are understandable and meaningful to every party in the enterprise.

The stage, then, is equally set for improved data intelligence, because stakeholders now can use, understand and trust relevant data to enhance decision-making across the enterprise.

The enterprise is coming to the point where both business and IT co-own data modeling processes and data models. Business analysts and other power users start to understand data complexities because they can grasp terms and contribute to making the data in their organization accurate and complete, and modeling grows in importance in the eyes of business users.

Bringing data to the business and making it easy to access and understand increases the value of data assets, providing a return on investment and a return on opportunity. But neither would be possible without data modeling providing the backbone for metadata management and proper data governance.

For more information, check out our whitepaper, Drive Business Value and Underpin Data Governance with an Enterprise Data Model.

You also can take erwin DM, the world’s No. 1 data modeling software, for a free spin.

erwin Data Modeler Free Trial - Data Modeling


Talk Data to Me: Why Employee Data Literacy Matters  

Organizations are flooded with data, so they’re scrambling to find ways to derive meaningful insights from it – and then act on them to improve the bottom line.

In today’s data-driven business, enabling employees to access and understand the data that’s relevant to their roles allows them to use data and put those insights into action. To do this, employees need to “talk data,” aka data literacy.

However, Gartner predicts that this year 50 percent of organizations will lack sufficient AI and data literacy skills to achieve business value. This requires organizations to invest in ensuring their employees are data literate.

Data Literacy & the Rise of the Citizen Analyst

According to Gartner, “data literacy is the ability to read, write and communicate data in context, including an understanding of data sources and constructs, analytical methods and techniques applied — and the ability to describe the use case, application and resulting value.”

Today, your employees are essentially data consumers. Three technological advances are driving this data consumption and, in turn, the ability for employees to leverage this data to deliver business value: 1) exploding data production, 2) scalable big data computation, and 3) the accessibility of advanced analytics, machine learning (ML) and artificial intelligence (AI).

The confluence of these advances has created a fertile environment for data innovation and transformation. As a result, we’re seeing the rise of the “citizen analyst,” who brings business knowledge and subject-matter expertise to data-driven insights.

Examples of citizen analysts include the VP of finance looking for opportunities to optimize top- and bottom-line results for growth and profitability, or the product line manager who wants to understand the enterprise impact of pricing changes.

David Loshin explores this concept in an erwin-sponsored whitepaper, Data Intelligence: Empowering the Citizen Analyst with Democratized Data.

In the whitepaper, he states that the priority of the citizen analyst is straightforward: find the right data to develop reports and analyses that support a larger business case. However, some practical data management issues contribute to a growing need for enterprise data governance, including:

  • Increasing data volumes that challenge the traditional enterprise’s ability to store, manage and ultimately find data
  • Increased data variety, balancing structured, semi-structured and unstructured data, as well as data originating from a widening array of external sources
  • Reducing the IT bottleneck that creates barriers to data accessibility
  • Desire for self-service to free the data consumers from strict predefined data transformations and organizations
  • Hybrid on-premises/cloud environments that complicate data integration and preparation
  • Privacy and data protection laws from many countries that influence the ways data assets may be accessed and used

Data Democratization Requires Data Intelligence

According to Loshin, organizations need to empower their citizen analysts. A fundamental component of data literacy involves data democratization, sharing data assets with a broad set of data consumer communities in a governed way.

The objectives of governed data democratization include:
  • Raising data awareness
  • Improving data literacy
  • Supporting observance of data policies to support regulatory compliance
  • Simplifying data accessibility and use

Effective data democratization requires data intelligence. This is dependent on accumulating, documenting and publishing information about the data assets used across the entire enterprise data landscape.

Here are the steps to effective data intelligence:

  • Reconnaissance: Understanding the data environment and the corresponding business contexts and collecting as much information as possible
  • Surveillance: Monitoring the environment for changes to data sources
  • Logistics and Planning: Mapping the collected information production flows and mapping how data moves across the enterprise
  • Impact Assessment: Using what you have learned to assess how external changes impact the environment
  • Synthesis: Empowering data consumers by providing a holistic perspective associated with specific business terms
  • Sustainability: Embracing automation to always provide up-to-date and correct intelligence
  • Auditability: Providing oversight and being able to explain what you have learned and why

Data Literacy: The Heart of Data-Driven Innovation

Data literacy is at the heart of successful data-driven innovation and accelerating the realization of actionable data-driven insights.

It can reduce data source discovery and analysis cycles, improve the accuracy of results, reduce reliance on expensive technical resources, and ensure the “right” data is used the first time, reducing deployment errors and the need for expensive re-work.

Ultimately, a successful data literacy program will empower your employees to:

  • Better understand and identify the data they require
  • Be more self-sufficient in accessing and preparing the data they require
  • Better articulate the gaps that exist in the data landscape when it comes to fulfilling their data needs
  • Share their knowledge and experience with data with other consumers to contribute to the greater good
  • Collaborate more effectively with their partners in data (management and governance) for greater efficiency and higher quality outcomes

erwin offers a data intelligence software suite combining the capabilities of erwin Data Catalog with erwin Data Literacy to fuel an automated, real-time, high-quality data pipeline.

Then all enterprise stakeholders – data scientists, data stewards, ETL developers, enterprise architects, business analysts, compliance officers, citizen analysts, CDOs and CEOs – can access data relevant to their roles for insights they can put into action.

Click here to request a demo of erwin Data Intelligence.

erwin Data Intelligence


Data Governance and Metadata Management: You Can’t Have One Without the Other

When an organization’s data governance and metadata management programs work in harmony, then everything is easier.

Data governance is a complex but critical practice. There’s always more data to handle, much of it unstructured; more data sources, like IoT; more points of integration; and more regulatory compliance requirements.

Creating and sustaining an enterprise-wide view of and easy access to underlying metadata is also a tall order.

The numerous data types and data sources that exist today weren’t designed to work together, and data infrastructures have been cobbled together over time with disparate technologies, poor documentation and little thought for downstream integration.

Therefore, most enterprises have encountered difficulty trying to master data governance and metadata management, but they need a solid data infrastructure on which to build their applications and initiatives.

Without it, they risk faulty analyses and insights that affect not only revenue generation but regulatory compliance and any number of other organizational objectives.

Data Governance Predictions

Data Governance Attitudes Are Shifting

The 2020 State of Data Governance and Automation (DGA) report shows that attitudes about data governance and the drivers behind it are changing – arguably for the better.

Regulatory compliance was the biggest driver for data governance implementation, according to the 2018 report. That’s not surprising given the General Data Protection Regulation (GDPR) was going into effect just six months after the survey.

Now better decision-making is the primary reason to implement data governance, cited by 60 percent of survey participants. This shift suggests organizations are using data to improve their overall performance, rather than just trying to tick off a compliance checkbox.

We’re pleased to see this because we’ve always believed that IT-siloed data governance has limited value. Instead, data governance has to be an enterprise initiative with IT and the wider business collaborating to limit data-related risks and determine where greater potential and value can be unleashed.

Metadata Management Takes Time

About 70 percent of DGA report respondents – a combination of roles from data architects to executive managers – say they spend an average of 10 or more hours per week on data-related activities.

Most of that time is spent on data analysis – but only after searching for and preparing data.

A separate study by IDC indicates data professionals actually spend 80 percent of their time on data discovery, preparation and protection and only 20 percent on analysis.

Why such a heavy lift? Finding metadata, “the data about the data,” isn’t easy.

When asked about the most significant bottlenecks in the data value chain, documenting complete data lineage leads with 62 percent, followed by understanding the quality of the source data (58 percent), discovery, identification and harvesting (55 percent), and curating data assets with business context (52 percent).

So it makes sense that the data operations deemed most valuable in terms of automation are:

  • Data Lineage (65%)
  • Data Cataloging (61%)
  • Data Mapping (53%)
  • Impact Analysis (48%)
  • Data Harvesting (38%)
  • Code Generation (21%)

But as suspected, most data operations are still manual and largely dependent on technical resources. They aren’t taking advantage of repeatable, sustainable practices – also known as automation.

The Benefits of Automating Data Governance and Metadata Management Processes

Availability, quality, consistency, usability and reduced latency are requirements at the heart of successful data governance.

And with a solid framework for automation, organizations can generate metadata every time data is captured at a source, accessed by users, moved through an organization, integrated or augmented with other data from other sources, profiled, cleansed and analyzed.
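One way to picture such a framework: every capture, movement or transformation step emits a small metadata record to a central log or catalog. The Python sketch below is a minimal, assumed implementation using a JSON-lines file as the sink; a production pipeline would publish to a data catalog or message bus instead, and the event names are illustrative.

    import json
    from datetime import datetime, timezone

    def emit_metadata(event, dataset, **details):
        """Append a metadata record each time data is captured, moved, profiled or transformed.
        The event names and the JSON-lines sink are assumptions for illustration."""
        record = {
            "event": event,
            "dataset": dataset,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            **details,
        }
        with open("metadata_log.jsonl", "a") as log:
            log.write(json.dumps(record) + "\n")

    emit_metadata("ingested", "sales_orders", source="erp_extract", rows=10432)
    emit_metadata("profiled", "sales_orders", null_rate={"order_total": 0.002})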

Other benefits of automating data governance and metadata management processes include:

  • Better Data Quality – Identification and repair of data issues and inconsistencies within integrated data sources in real time
  • Quicker Project Delivery – Acceleration of Big Data deployments, Data Vaults, data warehouse modernization, cloud migration, etc.
  • Faster Speed to Insights – Reversing the 80/20 rule that keeps high-paid knowledge workers too busy finding, understanding and resolving errors or inconsistencies to actually analyze source data
  • Greater Productivity & Reduced Costs – Use of automated, repeatable processes for metadata discovery, data design, data conversion, data mapping and code generation
  • Digital Transformation – Better understanding of what data exists and its potential value to improve digital experiences, enhance digital operations, drive digital innovation and build digital ecosystems
  • Enterprise Collaboration – The ability for IT and the wider business to find, trust and use data to effectively meet organizational objectives

To learn more about the information we’ve covered in today’s blog, please join us for our webinar with Dataversity on Feb. 18.

Data Governance Webinar


5 Ways Data Modeling Is Critical to Data Governance

Enterprises are trying to manage data chaos. They might have 300 applications, with 50 different databases and a different schema for each one.

They also face increasing regulatory pressure because of global data regulations, such as the European Union’s General Data Protection Regulation (GDPR) and the new California Consumer Privacy Act (CCPA), which went into effect last week, on Jan. 1.

Then there’s unstructured data with no contextual framework to govern data flows across the enterprise, not to mention time-consuming manual data preparation and limited views of data lineage.

For decades, data modeling has been the optimal way to design and deploy new relational databases with high-quality data sources and support application development. It is a tried-and-true practice for lowering data management costs, reducing data-related risks, and improving the quality and agility of an organization’s overall data capability.

And the good news is that it just keeps getting better. Today’s data modeling is not your father’s data modeling software.

While it’s always been the best way to understand complex data sources and automate design standards and integrity rules, the role of data modeling continues to expand as the fulcrum of collaboration between data generators, stewards and consumers.

That’s because it’s the best way to visualize metadata, and metadata is now the heart of enterprise data management and data governance/intelligence efforts.

So here’s why data modeling is so critical to data governance.

1. Uncovering the connections between disparate data elements: Visualize metadata and schema to mitigate complexity and increase data literacy and collaboration across a broad range of data stakeholders. Because data modeling reduces complexity, all members of the team can work around a data model to better understand and contribute to the project.

2. Capturing and sharing how the business describes and uses data: Create and integrate business and semantic metadata to augment and accelerate data intelligence and governance efforts. Data modeling captures how the business uses data and provides context to the data source.

3. Deploying higher quality data sources with the appropriate structural veracity: Automate and enforce data model design tasks to ensure data integrity. From regulatory compliance and business intelligence to target marketing, data modeling maintains an automated connection back to the source.

4. Building a more agile and governable data architecture: Create and implement common data design standards from the start. Data modeling standardizes design tasks to improve business alignment and simplify integration.

5. Governing the design and deployment of data across the enterprise: Manage the design and maintenance lifecycle for data sources. Data modeling provides visibility, management and full version control over the lifecycle for data design, definition and deployment.

Data Modeling Tool

erwin Data Modeler: Where the Magic Happens

erwin has just released a new version of erwin DM, the world’s No. 1 data modeling software for designing, deploying and understanding data sources to meet modern business needs. erwin DM 2020 is an essential source of metadata and a critical enabler of data governance and intelligence efforts.

The new version of erwin DM includes these features:

  • A modern, configurable workspace so users can customize the modeling canvas and optimize access to features and functionality that best support their workflows
  • Support for and model integration from major databases to work effectively across platforms and reuse work product, including native support for Amazon Redshift and updated support for the latest DB2 releases and certification for the latest MS SQL Server releases
  • Model exchange (import/export) to/from a wide variety of data management environments
  • Modeling task automation that saves modelers time, reduces errors and increases work product quality and speed, including a new scheduler to automate the offline reverse-engineering of databases into data models
  • New Quick Compare templates as part of the Complete Compare feature to compare and synchronize data models and sources
  • New ODBC query tool for creating and running custom model and metadata reports
  • Design transformations to customize and automate super-type/sub-type relationships between logical and physical models

erwin DM also integrates with the erwin Data Intelligence Suite (erwin DI) to automatically harvest the metadata in erwin data models for ingestion into the data catalog for better analytics, governance and overall data intelligence.

The role of data modeling in the modern data-driven business continues to expand with the benefits long-realized by database professionals and developers now experienced by a wider range of architects, business analysts and data administrators in a variety of data-centric initiatives.

Click here to take a test drive of the new erwin DM.


Very Meta … Unlocking Data’s Potential with Metadata Management Solutions

Untapped data, if mined, represents tremendous potential for your organization. While there has been a lot of talk about big data over the years, the real hero in unlocking the value of enterprise data is metadata, or the data about the data.

However, most organizations don’t use all the data they’re flooded with to reach deeper conclusions about how to drive revenue, achieve regulatory compliance or make other strategic decisions. They don’t know exactly what data they have or even where some of it is.

Quite honestly, knowing what data you have and where it lives is complicated. And to truly understand it, you need to be able to create and sustain an enterprise-wide view of and easy access to underlying metadata.

This isn’t an easy task. Organizations are dealing with numerous data types and data sources that were never designed to work together, and data infrastructures that have been cobbled together over time with disparate technologies, poor documentation and little thought for downstream integration.

As a result, the applications and initiatives that depend on a solid data infrastructure may be compromised, leading to faulty analysis and insights.

Metadata Is the Heart of Data Intelligence

A recent IDC Innovators: Data Intelligence Report says that getting answers to such questions as “where is my data, where has it been, and who has access to it” requires harnessing the power of metadata.

Metadata is generated every time data is captured at a source, accessed by users, moved through an organization, and then profiled, cleansed, aggregated, augmented and used for analytics to guide operational or strategic decision-making.

In fact, data professionals spend 80 percent of their time looking for and preparing data and only 20 percent of their time on analysis, according to IDC.

To flip this 80/20 rule, they need an automated metadata management solution for:

• Discovering data – Identify and interrogate metadata from various data management silos.
• Harvesting data – Automate the collection of metadata from various data management silos and consolidate it into a single source.
• Structuring and deploying data sources – Connect physical metadata to specific data models, business terms, definitions and reusable design standards.
• Analyzing metadata – Understand how data relates to the business and what attributes it has.
• Mapping data flows – Identify where to integrate data and track how it moves and transforms.
• Governing data – Develop a governance model to manage standards, policies and best practices and associate them with physical assets.
• Socializing data – Empower stakeholders to see data in one place and in the context of their roles.

Addressing the Complexities of Metadata Management

The complexities of metadata management can be addressed with a strong data management strategy coupled with metadata management software to enable the data quality the business requires.

This encompasses data cataloging (integration of data sets from various sources), mapping, versioning, business rules and glossary maintenance, and metadata management (associations and lineage).

erwin has developed the only data intelligence platform that provides organizations with a complete and contextual depiction of the entire metadata landscape.

It is the only solution that can automatically harvest, transform and feed metadata from operational processes, business applications and data models into a central data catalog and then make it accessible and understandable within the context of role-based views.

erwin’s ability to integrate and continuously refresh metadata from an organization’s entire data ecosystem, including business processes, enterprise architecture and data architecture, forms the foundation for enterprise-wide data discovery, literacy, governance and strategic usage.

Organizations then can take a data-driven approach to business transformation, speed to insights, and risk management.

With erwin, organizations can:

1. Deliver a trusted metadata foundation through automated metadata harvesting and cataloging
2. Standardize data management processes through a metadata-driven approach
3. Centralize data-driven projects around centralized metadata for planning and visibility
4. Accelerate data preparation and delivery through metadata-driven automation
5. Master data management platforms through metadata abstraction
6. Accelerate data literacy through contextual metadata enrichment and integration
7. Leverage a metadata repository to derive lineage, impact analysis and enable audit/oversight ability

With erwin Data Intelligence as part of the erwin EDGE platform, you know what data you have, where it is, where it’s been and how it transformed along the way, plus you can understand sensitivities and risks.

With an automated, real-time, high-quality data pipeline, enterprise stakeholders can base strategic decisions on a full inventory of reliable information.

Many of our customers are hard at work addressing metadata management challenges, and that’s why erwin was named a Leader in Gartner’s “2019 Magic Quadrant for Metadata Management Solutions.”

Gartner Magic Quadrant Metadata Management


Benefits of Data Vault Automation

The benefits of Data Vault automation range from the more abstract – like improving data integrity – to the tangible – such as clearly identifiable savings in cost and time.

So Seriously … You Should Automate Your Data Vault

 By Danny Sandwell

Data Vault is a methodology for architecting and managing data warehouses in complex data environments where new data types and structures are constantly introduced.

Without Data Vault, data warehouses are difficult and time-consuming to change, causing latency issues and slowing time to value. In addition, the queries required to maintain historical integrity are complex to design and slow to run, causing performance issues and potentially incorrect results, because the ability to understand relationships between historical snapshots of data is lacking.

In his blog, Dan Linstedt, the creator of Data Vault methodology, explains that Data Vaults “are extremely scalable, flexible architectures” enabling the business to grow and change without “the agony and pain of high costs, long implementation and test cycles, and long lists of impacts across the enterprise warehouse.”

With a Data Vault, new functional areas typically are added quickly and easily, with changes to existing architecture taking less than half the traditional time with much less impact on the downstream systems, he notes.

Astonishingly, nearly 20 years since the methodology’s creation, most Data Vault design, development and deployment phases are still handled manually. But why?

Traditional manual efforts to define the Data Vault population and create ETL code from scratch can take weeks or even months. The entire process is time-consuming, slowing down the data pipeline, and often riddled with human error.

On the flip side, automating the development and deployment of design changes and the resulting data movement processing code ensures companies can accelerate development and deployment in a timely and cost-effective manner.
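To illustrate what that generated output can look like, here is a simplified Python sketch that derives a Data Vault hub and satellite definition, plus a deterministic hash key, from a small source-table specification. The table, key and attribute names are hypothetical, and real Data Vault automation also handles links, staging structures and the loading code itself.

    import hashlib

    # Hypothetical source table specification; the table, key and attribute names are illustrative.
    source = {
        "table": "customer",
        "business_key": "customer_id",
        "attributes": ["name", "email", "country"],
    }

    def hub_ddl(spec):
        """Generate a simple Data Vault hub holding the business key and a surrogate hash key."""
        return (
            f"CREATE TABLE hub_{spec['table']} (\n"
            f"  {spec['table']}_hk CHAR(32) PRIMARY KEY,  -- MD5 of the business key\n"
            f"  {spec['business_key']} VARCHAR(100) NOT NULL,\n"
            f"  load_dts TIMESTAMP NOT NULL,\n"
            f"  record_source VARCHAR(50) NOT NULL\n);"
        )

    def satellite_ddl(spec):
        """Generate a satellite holding the descriptive attributes and their history."""
        attribute_columns = ",\n".join(f"  {attr} VARCHAR(255)" for attr in spec["attributes"])
        return (
            f"CREATE TABLE sat_{spec['table']} (\n"
            f"  {spec['table']}_hk CHAR(32) NOT NULL,\n"
            f"  load_dts TIMESTAMP NOT NULL,\n"
            f"  hash_diff CHAR(32) NOT NULL,  -- detects attribute changes between loads\n"
            f"{attribute_columns},\n"
            f"  PRIMARY KEY ({spec['table']}_hk, load_dts)\n);"
        )

    def hub_key(business_key_value):
        """Deterministic surrogate key, as commonly used when loading Data Vault structures."""
        return hashlib.md5(business_key_value.strip().upper().encode()).hexdigest()

    print(hub_ddl(source))
    print(satellite_ddl(source))
    print(hub_key("CUST-0001"))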

Benefits of Data Vault Automation

Benefits of Data Vault Automation – A Case Study …

Global Pharma Company Saves Considerable Time and Money with Data Vault Automation

Let’s take a look at a large global pharmaceutical company that switched to Data Vault automation with staggering results.

Like many pharmaceutical companies, it manages a massive data warehouse combining clinical trial, supply chain and other mission-critical data. The company had chosen a Data Vault schema for its flexibility in handling change but found creating the hub and satellite structures incredibly laborious.

They needed to accelerate development, as well as aggregate data from different systems for internal customers to access and share. Additionally, the company needed lineage and traceability for regulatory compliance efforts.

With this ability, they can identify data sources, transformations and usage to safeguard protected health information (PHI) for clinical trials.

After an initial proof of concept, they deployed erwin Data Vault Automation and generated more than 200 tables, jobs and processes with 10 to 12 scripts. The highly schematic structure of the models enabled large portions of the modeling process to be automated, dramatically accelerating Data Vault projects and optimizing data warehouse management.

erwin Data Vault Automation helped this pharma customer automate the complete lifecycle – accelerating development while increasing consistency, simplicity and flexibility – to save considerable time and money.

For this customer, the benefits of Data Vault automation were as follows:

  • Saving an estimated 70% of the costs of manual development
  • Generating 95% of the production code with “zero touch,” improving time to business value and significantly reducing the costly re-work associated with error-prone manual processes
  • Increasing data integrity, including for new requirements and use cases, regardless of changes to the warehouse structure, because legacy source data doesn’t degrade
  • Creating a sustainable approach to Data Vault deployment, ensuring the agile, adaptable and timely delivery of actionable insights to the business in a well-governed facility for regulatory compliance, including full transparency and ease of auditability

Homegrown Tools Never Provide True Data Vault Automation

Many organizations use some form of homegrown tool or standalone applications. However, they don’t integrate with other tools and components of the architecture, they’re expensive, and quite frankly, they make it difficult to derive any meaningful results.

erwin Data Vault Automation centralizes the specification and deployment of Data Vault architectures for better control and visibility of the software development lifecycle. erwin Data Catalog makes it easy to discover, organize, curate and govern data being sourced for and managed in the warehouse.

With this solution, users select data sets to be included in the warehouse and fully automate the loading of Data Vault structures and ETL operations.

erwin Data Vault Smart Connectors eliminate the need for a business analyst and ETL developers to repeat mundane tasks, so they can focus on choosing and using the desired data instead. This saves considerable development time and effort plus delivers a high level of standardization and reuse.

After the Data Vault processes have been automated, the warehouse is well documented with traceability from the marts back to the operational data to speed the investigation of issues and analyze the impact of changes.

Bottom line: if your Data Vault integration is not automated, you’re already behind.

If you’d like to get started with erwin Data Vault Automation or request a quote, you can email consulting@erwin.com.

Data Modeling Drives Business Value


Using Strategic Data Governance to Manage GDPR/CCPA Complexity

In light of recent, high-profile data breaches, it’s past time we re-examined strategic data governance and its role in managing regulatory requirements.

News broke earlier this week of British Airways being fined 183 million pounds – or $228 million – by the U.K. for alleged violations of the European Union’s General Data Protection Regulation (GDPR). While not the first, it is the largest penalty levied since the GDPR went into effect in May 2018.

Given this, Oppenheimer & Co. cautions:

“European regulators could accelerate the crackdown on GDPR violators, which in turn could accelerate demand for GDPR readiness. Although the CCPA [California Consumer Privacy Act, the U.S. equivalent of GDPR] will not become effective until 2020, we believe that new developments in GDPR enforcement may influence the regulatory framework of the still fluid CCPA.”

With all the advance notice and significant chatter about GDPR/CCPA, why aren’t organizations more prepared to deal with data regulations?

In a word? Complexity.

The complexity of regulatory requirements in and of themselves is aggravated by the complexity of the business and data landscapes within most enterprises.

So it’s important to understand how to use strategic data governance to manage the complexity of regulatory compliance and other business objectives …

Designing and Operationalizing Regulatory Compliance Strategy

It’s not easy to design and deploy compliance in an environment that’s not well understood and is difficult to maneuver in. First you need to analyze and design your compliance strategy and tactics, and then you need to operationalize them.

Modern, strategic data governance, which involves both IT and the business, enables organizations to plan and document how they will discover and understand their data within context, track its physical existence and lineage, and maximize its security, quality and value. It also helps enterprises put these strategic capabilities into action by:

  • Understanding their business, technology and data architectures and their inter-relationships, aligning them with their goals and defining the people, processes and technologies required to achieve compliance.
  • Creating and automating a curated enterprise data catalog, complete with physical assets, data models, data movement, data quality and on-demand lineage.
  • Activating their metadata to drive agile data preparation and governance through integrated data glossaries and dictionaries that associate policies to enable stakeholder data literacy.

Strategic Data Governance for GDPR/CCPA

Five Steps to GDPR/CCPA Compliance

With the right technology, GDPR/CCPA compliance can be automated and accelerated in these five steps:

  1. Catalog systems

Harvest, enrich/transform and catalog data from a wide array of sources to enable any stakeholder to see the interrelationships of data assets across the organization.

  2. Govern PII “at rest”

Classify, flag and socialize the use and governance of personally identifiable information regardless of where it is stored (see the sketch following these five steps).

  3. Govern PII “in motion”

Scan, catalog and map personally identifiable information to understand how it moves inside and outside the organization and how it changes along the way.

  4. Manage policies and rules

Govern business terminology in addition to data policies and rules, depicting relationships to physical data catalogs and the applications that use them with lineage and impact analysis views.

  5. Strengthen data security

Identify regulatory risks and guide the fortification of network and encryption security standards and policies by understanding where all personally identifiable information is stored, processed and used.
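As promised above, here is a minimal sketch of step 2, governing PII “at rest”: a name-based pass over a harvested column inventory that flags likely personally identifiable information. The column names and patterns are illustrative assumptions; a real classifier would also profile data values and apply policy from the business glossary.

    import re

    # Hypothetical column inventory harvested from a data catalog; names are illustrative.
    columns = [
        {"table": "customers", "column": "email_address"},
        {"table": "customers", "column": "date_of_birth"},
        {"table": "orders", "column": "order_total"},
    ]

    # Simple name-based heuristics; a real classifier would also profile values and apply policy.
    PII_PATTERNS = {
        "email": re.compile(r"e[-_]?mail", re.I),
        "birth_date": re.compile(r"birth|dob", re.I),
        "ssn": re.compile(r"ssn|social[-_]?security", re.I),
        "phone": re.compile(r"phone", re.I),
    }

    def classify(column_info):
        """Tag a column with the PII categories its name suggests."""
        tags = [tag for tag, pattern in PII_PATTERNS.items() if pattern.search(column_info["column"])]
        return {**column_info, "pii_tags": tags, "is_pii": bool(tags)}

    for flagged in map(classify, columns):
        print(flagged)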

How erwin Can Help

erwin is the only software provider with a complete, metadata-driven approach to data governance through our integrated enterprise modeling and data intelligence suites. We help customers overcome their data governance challenges, with risk management and regulatory compliance being primary concerns.

However, the erwin EDGE also delivers an “enterprise data governance experience” in terms of agile innovation and business transformation – from creating new products and services to keeping customers happy to generating more revenue.

Whatever your organization’s key drivers are, a strategic data governance approach – through business process, enterprise architecture and data modeling combined with data cataloging and data literacy – is key to success in our modern, digital world.

If you’d like to get a handle on handling your data, you can sign up for a free, one-on-one demo of erwin Data Intelligence.

For more information on GDPR/CCPA, we’ve also published a white paper on the Regulatory Rationale for Integrating Data Management and Data Governance.

GDPR White Paper


Data Preparation and Data Mapping: The Glue Between Data Management and Data Governance to Accelerate Insights and Reduce Risks

Organizations have spent a lot of time and money trying to harmonize data across diverse platforms, including cleansing, uploading metadata, converting code, defining business glossaries, tracking data transformations and so on. But the attempts to standardize data across the entire enterprise haven’t produced the desired results.

A company can’t effectively implement data governance – documenting and applying business rules and processes, analyzing the impact of changes and conducting audits – if it fails at data management.

The problem usually starts by relying on manual integration methods for data preparation and mapping. It’s only when companies take their first stab at manually cataloging and documenting operational systems, processes and the associated data, both at rest and in motion, that they realize how time-consuming the entire data prepping and mapping effort is, and why that work is sure to be compounded by human error and data quality issues.

To effectively promote business transformation, as well as fulfill regulatory and compliance mandates, there can’t be any mishaps.

Taking the manual road makes it very challenging to discover and synthesize data that resides in different formats across thousands of unharvested, undocumented databases, applications, ETL processes and procedural code.

Consider the problematic issue of manually mapping source system fields (typically source files or database tables) to target system fields (such as different tables in target data warehouses or data marts).

These source mappings generally are documented across a slew of unwieldy spreadsheets in their “pre-ETL” stage as the input for ETL development and testing. However, the ETL design process often suffers as it evolves because spreadsheet mapping data isn’t updated or may be incorrectly updated thanks to human error. So questions linger about whether transformed data can be trusted.
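One small step away from spreadsheet-based mappings is keeping them as structured, validated data. The Python sketch below (with hypothetical source and target names) represents source-to-target mappings as records and flags any mapping whose target column no longer exists, exactly the kind of drift that goes unnoticed in “pre-ETL” spreadsheets.

    # Hypothetical source-to-target mappings kept as structured data instead of a spreadsheet.
    mappings = [
        {"source": "crm.customer.cust_nm", "target": "dw.dim_customer.customer_name", "rule": "TRIM"},
        {"source": "crm.customer.cust_id", "target": "dw.dim_customer.customer_key", "rule": "DIRECT"},
    ]

    # Columns that actually exist in the target warehouse (normally harvested, hard-coded here).
    target_columns = {"dw.dim_customer.customer_name", "dw.dim_customer.customer_key"}

    def broken_mappings(mapping_list, known_targets):
        """Flag mappings whose target column no longer exists -- the drift that spreadsheets hide."""
        return [m for m in mapping_list if m["target"] not in known_targets]

    print(broken_mappings(mappings, target_columns))  # an empty list means every mapping still resolves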

Data Quality Obstacles

The sad truth is that high-paid knowledge workers like data scientists spend up to 80 percent of their time finding and understanding source data and resolving errors or inconsistencies, rather than analyzing it for real value.

Statistics are similar when looking at major data integration projects, such as data warehousing and master data management with data stewards challenged to identify and document data lineage and sensitive data elements.

So how can businesses produce value from their data when errors are introduced through manual integration processes? How can enterprise stakeholders gain accurate and actionable insights when data can’t be easily and correctly translated into business-friendly terms?

How can organizations master seamless data discovery, movement, transformation and IT and business collaboration to reverse the ratio of preparation to value delivered?

What’s needed to overcome these obstacles is establishing an automated, real-time, high-quality and metadata-driven pipeline useful for everyone, from data scientists to enterprise architects to business analysts to C-level execs.

Doing so will require a hearty data management strategy and technology for automating the timely delivery of quality data that measures up to business demands.

From there, they need a sturdy data governance strategy and technology to automatically link and sync well-managed data with core capabilities for auditing, statutory reporting and compliance requirements as well as to drive business insights.

Creating a High-Quality Data Pipeline

Working hand-in-hand, data management and data governance provide a real-time, accurate picture of the data landscape, including “data at rest” in databases, data lakes and data warehouses and “data in motion” as it is integrated with and used by key applications. And there’s control of that landscape to facilitate insight and collaboration and limit risk.

With a metadata-driven, automated, real-time, high-quality data pipeline, all stakeholders can access data that they now are able to understand and trust and which they are authorized to use. At last they can base strategic decisions on what is a full inventory of reliable information.

The integration of data management and governance also supports industry needs to fulfill regulatory and compliance mandates, ensuring that audits are not compromised by the inability to discover key data or by failing to tag sensitive data as part of integration processes.

Data-driven insights, agile innovation, business transformation and regulatory compliance are the fruits of data preparation/mapping and enterprise modeling (business process, enterprise architecture and data modeling) that revolves around a data governance hub.

erwin Mapping Manager (MM) combines data management and data governance processes in an automated flow through the integration lifecycle, from data mapping for harmonization and aggregation to generating the physical embodiment of data lineage – that is, the creation, movement and transformation of transactional and operational data.

Its hallmark is a consistent approach to data delivery (business glossaries connect physical metadata to specific business terms and definitions) and metadata management (via data mappings).

Automate Data Mapping