
The What & Why of Data Governance

Modern data governance is a strategic, ongoing and collaborative practice that enables organizations to discover and track their data, understand what it means within a business context, and maximize its security, quality and value.

It is the foundation for regulatory compliance, for de-risking operations, and for competitive differentiation and growth.

However, while digital transformation and other data-driven initiatives are desired outcomes, few organizations know what data they have or where it is, and they struggle to integrate known data in various formats across numerous systems – especially if they don’t have a way to automate those processes.

But when IT-driven data management and business-oriented data governance work together in terms of personnel, processes and technology, decisions can be made and their impacts determined based on a full inventory of reliable information.

Recently, erwin held the first in a six-part webinar series on the practice of data governance and how to proactively deal with its complexities. Led by Frank Pörschmann of iDIGMA GmbH, an IT industry veteran and data governance strategist, it examined “The What & Why of Data Governance.”

The What: Data Governance Defined

Data governance has no standard definition. However, Dataversity defines it as “the practices and processes which help to ensure the formal management of data assets within an organization.”

At erwin by Quest, we further break down this definition by viewing data governance as a strategic, continuous commitment to ensuring organizations are able to discover and track data, accurately place it within the appropriate business context(s), and maximize its security, quality and value.

Mr. Pörschmann asked webinar attendees to stop trying to explain what data governance is to executives and clients. Instead, he suggested they put data governance into real-world scenarios by answering these questions: “What is the problem you believe data governance is the answer to?” or “How would you recognize having effective data governance in place?”

In essence, Mr. Pörschmann laid out the “enterprise data dilemma,” which stems from three important but difficult questions for an enterprise to answer: What data do we have? Where is it? And how do we get value from it?

Asking how you recognize having effective data governance in place is quite helpful in executive discussions, according to Mr. Pörschmann. And when you talk about that question at a high level, he says, you get a very simple answer: “The only thing we want is the right data with the right quality to the right person at the right time at the right cost.”

The Why: Data Governance Drivers

Why should companies care about data governance?

erwin’s 2020 State of Data Governance and Automation report found that better decision-making is the primary driver for data governance (62 percent), with analytics secondary (51 percent), and regulatory compliance coming in third (48 percent).

In the webinar, Mr. Pörschmann called out that the drivers of data governance are the same as those for digital transformation initiatives. “This is not surprising at all,” he said. “Because data is one of the success elements of a digital agenda or digital transformation agenda. So without having data governance and data management in place, no full digital transformation will be possible.”

Drivers of data governance

Data Privacy Regulations

While compliance is not the No. 1 driver for data governance, it’s still a major factor – especially since the rollout of the European Union’s General Data Protection Regulation (GDPR) in 2018.

According to Mr. Pörschmann, many decision-makers believe that if they get GDPR right, they’ll be fine and can move onto other projects. But he cautions “this [notion] is something which is not really likely to happen.”

For the EU, he warned, organizations need to prepare for the Digital Single Market, agreed on last year by the European Parliament and Commission. With it come clear definitions and rules on data access and exchange, especially across digital platforms, as well as regulations and instruments to enforce data ownership. He noted, “Companies will be forced to share some specific data which is relevant for public security, i.e., reduction of carbon dioxide. So companies will be forced to classify their data and to find mechanisms to share it with such platforms.”

GDPR is also proving to be the de facto model for data privacy across the United States. The new Virginia Consumer Data Protection Act, which was modeled on the California Consumer Privacy Act (CCPA), and the California Privacy Rights Act (CPRA) share many of the same requirements as GDPR.

Like CCPA, the Virginia bill would give consumers the right to access their data, correct inaccuracies, and request the deletion of information. Virginia residents also would be able to opt out of data collection.

Nevada, Vermont, Maine, New York, Washington, Oklahoma and Utah also are leading the way with some type of consumer privacy regulation. Several other bills are on the legislative docket in Alabama, Arizona, Florida, Connecticut and Kentucky, all of which follow a similar format to the CCPA.

Stop Wasting Time

In addition to drivers like digital transformation and compliance, it’s important to consider the effect of poor data on enterprise efficiency and productivity.

Respondents to McKinsey’s 2019 Global Data Transformation Survey reported that an average of 30 percent of their total enterprise time was spent on non-value-added tasks because of poor data quality and availability.

Wasted time is also an unfortunate reality for many data stewards, who spend 80 percent of their time finding, cleaning and reorganizing huge amounts of data, and only 20 percent of their time on actual data analysis.

According to erwin’s 2020 report, about 70 percent of respondents – a combination of roles from data architects to executive managers – said they spent an average of 10 or more hours per week on data-related activities.

The Benefits of erwin Data Intelligence

erwin Data Intelligence by Quest supports enterprise data governance, digital transformation and any effort that relies on data for favorable outcomes.

The software suite combines data catalog and data literacy capabilities for greater awareness of and access to available data assets, guidance on their use, and guardrails to ensure data policies and best practices are followed.

erwin Data Intelligence automatically harvests, transforms and feeds metadata from a wide array of data sources, operational processes, business applications and data models into a central catalog. Then it is accessible and understandable via role-based, contextual views so stakeholders can make strategic decisions based on accurate insights.

You can request a demo of erwin Data Intelligence here.

Webinar: The Value of Data Governance & How to Quantify It – Join us March 15 at 10 a.m. ET for the second webinar in this series, in which Mr. Pörschmann will discuss how justifying a data governance program requires building a solid business case in which you can prove its value. Register now: https://attendee.gotowebinar.com/register/5489626673791671307


Documenting and Managing Governance, Risk and Compliance with Business Process

Managing an organization’s governance, risk and compliance (GRC) via its enterprise and business architectures means managing them against business processes (BP).

Shockingly, even today a lot of organizations manage this through homemade tools – documents, checklists, Excel files, custom-made databases and so on. Organizations tend to still operate in this manual and disparate way for three main reasons:

  1. Cost
  2. Governance, risk and compliance are treated as isolated bubbles.
  3. Data-related risks are not connected with the data architects/data scientists.

If we look at this past year, COVID-19 fundamentally changed everything overnight – and it was something that nobody could have anticipated. However, only organizations that had their risks mapped at the process level could see their operational risk profiles and also see what processes needed adjustments – quickly.

Furthermore, by linking compliance with process, those organizations were prepared to answer very specific compliance questions. For example, if a customer asked, “Since most of your employees are working from home now, how can you ensure that my data is not shared with their kids?” organizations with business process models could respond, “We have anticipated these kinds of risks and implemented the following controls, and this is how we protect you in different layers.”

Every company must understand its business processes, particularly those in industries in which quality, regulatory, health, safety or environmental standards are serious considerations. BP modeling and analysis shows process flows, system interactions and organizational hierarchies to identify areas for improvement as well as practices susceptible to the greatest security, compliance or other risks so controls and audits can be implemented to mitigate exposures.

Connecting the GRC, Data and Process Layers

The GRC layer comprises mandatory components like risks, controls and compliance elements. Traditionally, these are manually documented, monitored and managed.

For example, if tomorrow you decide you want ISO (International Organization for Standardization) 27001 compliance for your information security management system, you can go to the appropriate ISO site and download the entire standard, with all the assessments, descriptions, mandates, questions and documents that you will need to provide. All of these items would comprise the GRC layer.

However, many organizations maintain Excel files with risk and control information, and other Office files with compliance information, all in isolation. Or some of these files are uploaded to various systems, but they don’t talk to each other or to any other enterprise systems for that matter. This is the data layer, which is factual, objective and, as opposed to the GRC layer, can be either fully or partly automated.

Now, let’s add the process layer to the equation. Why? Because that is where the GRC and data layers meet. How? Processes produce, process and consume data – information captured in the metadata layer. By following the process sequence, I can actually trace the data lineage as it flows across the entire business ecosystem, beyond the application layer.

Taking it further, from processes, I can look at how the data is being managed by my capabilities. In other words, if I do have a data breach, how do I mitigate it? What impact will it have on my organization? And what are the necessary controls to manage it? Looking at them from right to left, I can identify the affected systems, and I can identify the interfaces between systems.
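erwin doesn’t publish the internals of its impact analysis, but the underlying idea – walking the interface graph downstream from a breached asset to the systems and controls it touches – can be sketched in a few lines of Python. All system, interface and control names below are hypothetical:

```python
from collections import deque

# Hypothetical metadata links: which system feeds which, and the controls
# attached to each system. In a real repository these links are harvested
# automatically rather than hand-coded.
feeds = {
    "crm_db": ["billing_api", "marketing_dw"],
    "billing_api": ["finance_reports"],
    "marketing_dw": ["campaign_dashboard"],
}
controls = {
    "billing_api": ["encrypt-in-transit", "access-review"],
    "finance_reports": ["sox-audit-trail"],
}

def impact_of_breach(asset: str) -> None:
    """Breadth-first walk downstream from a breached asset, listing
    affected systems and the controls mapped to each."""
    seen, queue = set(), deque([asset])
    while queue:
        for downstream in feeds.get(queue.popleft(), []):
            if downstream not in seen:
                seen.add(downstream)
                queue.append(downstream)
    for system in sorted(seen):
        print(system, "->", controls.get(system, ["no controls mapped"]))

impact_of_breach("crm_db")  # billing_api, campaign_dashboard, finance_reports, ...
```

The value of connecting the GRC, data and process layers is precisely that links like these stay current automatically instead of living in hand-maintained Excel files.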

Mitigating Data Breaches

Most data breaches happen either at the database or interface level. Interfaces are how applications talk to each other.

Organizations are showing immense interest in expanding the development of risk profiles, not only for isolated layers but also in how those layers interact – how applications talk to each other, how processes use data, how data is stored, and how infrastructure is managed. Understanding these profiles allows for more targeted and even preemptive risk mitigation, enabling organizations to fortify their weak points with sufficient controls but also practical and effective processes.

We’re moving from a world in which everything is performed manually and in isolation to one that is fully automated and integrated.

erwin shows how to document and manage governance, risk and compliance using its business process modeling and enterprise architecture solution, erwin Evolve.

The C-Level Demands GRC Real-Time Impact Analysis

Impact analysis is critical. Everything needs to be clearly documented, covering all important and relevant aspects. No service, capability or delivery process is considered complete unless the risks and controls that affect it, or are implemented through it, are mapped and that assessment is used to generate risk profiles for the process, service or capability. And the demand for this to happen automatically increases daily.

This is now one of the key mandates across many organizations. C-level executives now demand risk profile dashboards at the process, organizational and local levels.

For example, an executive travelling from one country to another, or from one continent to another, can make a query: “I’m traveling to X, so what is the country’s risk profile and how is it being managed? What do I need to be aware of or address while I’m there?” Or when new legislation is introduced affecting multiple countries, the impact of that legislation on those countries’ risk profiles can be quickly and accurately calculated and actions planned accordingly.

erwin Evolve

GRC is more critical than ever. Organizations, and specifically the C-suite, are demanding to see risk profiles sliced and diced across particular processes. But this is impossible without automation.

erwin Evolve is a full-featured, configurable enterprise architecture (EA) and BP modeling and analysis software suite that aids regulatory and industry compliance and maps business systems that support the enterprise. Its automated visualization, documentation and enterprise collaboration capabilities turn EA and BP artifacts into insights both IT and business users can access in a central location for making strategic decisions and managing GRC.

Please click here to start your free trial of erwin Evolve.


Are Data Governance Bottlenecks Holding You Back?

Better decision-making has now topped compliance as the primary driver of data governance. However, organizations still encounter a number of bottlenecks that may hold them back from fully realizing the value of their data in producing timely and relevant business insights.

While acknowledging that data governance is about more than risk management and regulatory compliance may indicate that companies are more confident in their data, the data governance practice is nonetheless growing in complexity because of more:

  • Data to handle, much of it unstructured
  • Sources, like IoT
  • Points of integration
  • Regulations

Without an accurate, high-quality, real-time enterprise data pipeline, it will be difficult to uncover the necessary intelligence to make optimal business decisions.

So what’s holding organizations back from fully using their data to make better, smarter business decisions?

Data Governance Bottlenecks

erwin’s 2020 State of Data Governance and Automation report, based on a survey of business and technology professionals at organizations of various sizes and across numerous industries, examined the role of automation in data governance and intelligence efforts. It uncovered a number of obstacles that organizations have to overcome to improve their data operations.

The No.1 bottleneck, according to 62 percent of respondents, was documenting complete data lineage. Understanding the quality of source data is the next most serious bottleneck (58 percent); followed by finding, identifying, and harvesting data (55 percent); and curating assets with business context (52 percent).

The report revealed that all but two of the possible bottlenecks were marked by more than 50 percent of respondents. Clearly, there’s a massive need for a data governance framework to keep these obstacles from stymying enterprise innovation.

As we zeroed in on the bottlenecks of day-to-day operations, 25 percent of respondents said length of project/delivery time was the most significant challenge, followed by data quality/accuracy at 24 percent, time to value at 16 percent, and reliance on developer and other technical resources at 13 percent.


Overcoming Data Governance Bottlenecks

The 80/20 rule describes the unfortunate reality for many data stewards: they spend 80 percent of their time finding, cleaning and reorganizing huge amounts of data and only 20 percent on actual data analysis.

In fact, we found that close to 70 percent of our survey respondents spent an average of 10 or more hours per week on data-related activities, most of it searching for and preparing data.

What can you do to reverse the 80/20 rule and subsequently overcome data governance bottlenecks?

1. Don’t ignore the complexity of data lineage: It’s a risky endeavor to support data lineage using a manual approach, and businesses that attempt it that way will find it’s not sustainable given data’s constant movement from one place to another via multiple routes – especially if lineage must be captured correctly down to the column level. Adopting automated end-to-end lineage makes it possible to view data movement from the source to reporting structures, providing a comprehensive and detailed view of data in motion.

2. Automate code generation: Alleviate the need for developers to hand-code connections from data sources to target schema (see the sketch after this list). Mapping data elements to their sources within a single repository to determine data lineage and harmonize data integration across platforms reduces the need for specialized, technical resources with knowledge of ETL and database procedural code. It also makes it easier for business analysts, data architects, ETL developers, testers and project managers to collaborate for faster decision-making.

3. Use an integrated impact analysis solution: By automating data due diligence for IT you can deliver operational intelligence to the business. Business users benefit from automating impact analysis to better examine value and prioritize individual data sets. Impact analysis has equal importance to IT for automatically tracking changes and understanding how data from one system feeds other systems and reports. This is an aspect of data lineage, created from technical metadata, ensuring nothing “breaks” along the change train.

4. Put data quality first: Users must have confidence in the data they use for analytics. Automating and matching business terms with data assets and documenting lineage down to the column level are critical to good decision-making. If this approach hasn’t been the case to date, enterprises should take a few steps back to review data quality measures before jumping into automating data analytics.

5. Catalog data using a solution with a broad set of metadata connectors: All data sources will be leveraged, including big data, ETL platforms, BI reports, modeling tools, mainframe, and relational data as well as data from many other types of systems. Don’t settle for a data catalog from an emerging vendor that only supports a narrow swath of newer technologies, and don’t rely on a catalog from a legacy provider that may supply only connectors for standard, more mature data sources.

6. Stress data literacy: You want to ensure that data assets are used strategically. Automation expedites the benefits of data cataloging. Curated internal and external datasets for a range of content authors doubles business benefits and ensures effective management and monetization of data assets in the long-term if linked to broader data governance, data quality and metadata management initiatives. There’s a clear connection to data literacy here because of its foundation in business glossaries and socializing data so all stakeholders can view and understand it within the context of their roles.

7. Make automation the norm across all data governance processes: Too many companies still live in a world where data governance is a high-level mandate, not practically implemented. To fully realize the advantages of data governance and the power of data intelligence, data operations must be automated across the board. Without automated data management, the governance housekeeping load on the business will be so great that data quality will inevitably suffer. Being able to account for all enterprise data and resolve disparity in data sources and silos using manual approaches is wishful thinking.

8. Craft your data governance strategy before making any investments: Gather multiple stakeholders—both business and IT— with multiple viewpoints to discover where their needs mesh and where they diverge and what represents the greatest pain points to the business. Solve for these first, but build buy-in by creating a layered, comprehensive strategy that ultimately will address most issues. From there, it’s on to matching your needs to an automated data governance solution that squares with business and IT – both for immediate requirements and future plans.
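To make point 2 above concrete, here is a minimal sketch of generating data-movement SQL from a source-to-target mapping specification. The table names, columns and transformation strings are hypothetical and not any particular tool’s format:

```python
# One mapping record: where each target column comes from and how it is
# transformed along the way. "{src}" marks where the source column goes.
mapping = {
    "target_table": "dw.customer_dim",
    "source_table": "crm.customers",
    "columns": [
        {"target": "customer_key", "source": "id",         "transform": None},
        {"target": "full_name",    "source": "first_name", "transform": "CONCAT({src}, ' ', last_name)"},
        {"target": "email",        "source": "email",      "transform": "LOWER({src})"},
    ],
}

def generate_insert(spec: dict) -> str:
    """Render an INSERT ... SELECT statement from the mapping spec."""
    targets = ", ".join(c["target"] for c in spec["columns"])
    selects = ", ".join(
        (c["transform"] or "{src}").format(src=c["source"])
        for c in spec["columns"]
    )
    return (f"INSERT INTO {spec['target_table']} ({targets})\n"
            f"SELECT {selects}\nFROM {spec['source_table']};")

print(generate_insert(mapping))
```

Because the SQL is rendered from the same specification that documents lineage, the generated code and the documentation cannot drift apart.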

Register now for “The What & Why of Data Governance,” the first in a new six-part webinar series on the practice of data governance and how to proactively deal with its complexities, on Tuesday, Feb. 23 at 3 p.m. GMT/10 a.m. ET.


Cloud Migration and the Importance of Data Governance

Tackling data-related challenges to keep cloud migration projects on track and optimized

By Wendy Petty 

The cloud has many operational and competitive advantages, so cloud-first and other cloud transformation initiatives continue to be among the top data projects organizations are pursuing.

For many of those yet to adopt and adapt, it is a case of “when” not “if” the enterprise will undergo a form of digital transformation requiring data migration to the cloud.

Due to today’s prevalence of internal and external market disruptors, many organizations are aligning their digital transformation and cloud migration efforts with other strategic requirements (e.g., compliance with the General Data Protection Regulation).

And now organizations must navigate a post-COVID world, which is forcing them to fast-track their cloud migrations to become more agile, lean and focused on business outcomes that will enable the business to survive and then thrive amid new market dynamics.

However, cloud migration is not just a lift and shift, a one-off or a silver bullet. Usually when organizations go from an on-premises environment to a cloud environment, they are actually converting between two different technologies. And as you migrate to the cloud, you need to keep in mind some data-related challenges.

cloud migration data governance

Dollars and Cents

For 47 percent of enterprise companies, cost optimization is the main reason they migrate to the cloud. However, cloud migrations can be expensive, with costs piling up the longer a migration takes to complete.

Not only are cloud migrations generally expensive, but many companies don’t budget for them appropriately. In 2020, companies went over their public cloud spend budget by an average of 23 percent. Most likely, this comes down to a lack of planning, leading to long, drawn-out migrations and ill-informed product decisions. Additionally, completely manual migrations generally take longer and cost more than those that employ automation.

In terms of budget and cost, automated tools that scan the repositories in your environment help by adding structure and business context (where data is, who can access it, etc.) to the transformation of legacy structures. New structures will enable new capabilities for your data and business processes.

Automated tools can help you lower risks and costs and reduce the time it takes to realize value. Automated software handles data cataloging and locates, models and governs cloud data assets.

Tools that help IT organizations plan and execute their cloud migrations aren’t difficult to find. Many large cloud providers offer tools to help ease the migration to their platform. But a technology-agnostic approach to such tools adds value to cloud migration projects.

Proprietary tools from cloud vendors funnel clients into a single preferred environment. Agnostic tools, on the other hand, help organizations understand which cloud environment is best for them. Their goal is to identify the cloud platform and strategy that will deliver the most value after taking budget and feature requirements into account.

Institutional Knowledge

Institutional knowledge is another obstacle many companies face when exploring cloud migrations. People leave the organization and take with them an understanding of how and why things are done. Because of this, you may not know what data you have or how you should be using it.

The challenge comes when it’s time to migrate; you need to understand what you have, how it’s used, what its value is, and what should be migrated. Otherwise, you may spend time and money migrating data, only to discover that no one has touched it in several years and it wasn’t necessary for you to retain it.

In addition, if you’re planning to use a multi-cloud approach, you need to ensure that the clouds you work with are compatible. Only 24 percent of IT organizations have a high degree of interoperability between their cloud environments. This means that more than three-quarters suffer from inefficient cloud setups and can’t readily combine or analyze data from multiple cloud environments.

Data Governance

Migrating enterprise data to the cloud is only half the story – once there, it has to be governed. That means your cloud data assets must be available for use by the right people for the right purposes to maximize their security, quality and value.

Around 60 percent of enterprises worry about regulatory issues, governance and compliance with cloud services. The difficulty comes with creating good governance around data while avoiding risk and getting more out of that data. More than three-quarters (79 percent) of businesses are looking for better integrated security and governance for the data they put in the cloud.

Cloud migration provides a unique opportunity not simply to move things as they are to the cloud but also to make strategic changes. Companies are using the move to the cloud to make data governance a priority and show their customers they are good data stewards.

Unfortunately, 72 percent of companies state that deciding which workloads they should migrate to the cloud is one of their top four hurdles to cloud implementation. However, cloud migration is not an endpoint; it’s just the next step in making your business flexible and agile for the long term.

Determining which data sets need to be migrated can help you prepare for growth in the long run. The degree of governance each set of data needs will help determine what you should migrate and what you should keep in place.

Automated Cloud Migration and Data Governance

The preceding list of cloud migration challenges might seem daunting, especially for an organization that collects and manages a great deal of data. When enterprises face the prospect of manual, cumbersome work related to their business processes, IT infrastructure, and more, they often turn to automation.

You can apply the same idea to your cloud migration strategy because automated software tools can aid in the planning and heavy lifting of cloud migrations. As such, they should be considered when it comes to choosing platforms, forecasting costs, and understanding the value of the data being considered for migration.

erwin Cloud Catalyst is a suite of automated cloud migration and data governance software and services to simplify and accelerate the move to cloud platforms and govern those data assets throughout their lifecycle. Automation is a critical differentiator for erwin’s cloud migration and data governance tools.

Key Benefits of erwin Cloud Catalyst:

  • Cost Mitigation: Automated tools scan repositories in your environment and add structure and business context (where it is, who can access it, etc.) in the transformation of legacy structures.
  • Reduced Risk and Faster Time to Value: Automated tools can help you reduce risks, costs and the time it takes to realize value.
  • Tech-Agnostic: Technology-agnostic approach adds value to cloud migration projects.
  • Any Cloud to Any Cloud: Automatically gathering the abstracted essence of the data will make it easier to point that information at another cloud platform or technology if, or likely when, you migrate again.
  • Institutional Knowledge Retention: Collect and retain institutional knowledge around data and enable transparency.
  • Continuous Data Governance: Automation helps IT organizations address data governance during cloud migrations and then for the rest of the cloud data lifecycle and minimizes human intervention.

Every customer’s environment and data is unique. That’s why the first step is working with you to assess your cloud migration strategy. Then we deliver an automation roadmap and design the appropriate smart data connectors to help your IT services team achieve your future-state architecture, including accelerating data ingestion and ETL conversion.

To get started, request your cloud-readiness assessment.

And here’s a video with some more information about our approach to cloud migration and data governance.


erwin Positioned as a Leader in Gartner’s 2020 Magic Quadrant for Metadata Management Solutions for Second Year in a Row

erwin has once again been positioned as a Leader in the Gartner “2020 Magic Quadrant for Metadata Management Solutions.”

This year, erwin had the largest move of any player on the Quadrant, moving up significantly in terms of both “Ability to Execute” and “Vision.”

This recognition affirms our efforts in developing an integrated platform for enterprise modeling and data intelligence to support data governance, digital transformation and any other effort that relies on data for favorable outcomes.

erwin’s metadata management offering, the erwin Data Intelligence Suite (erwin DI), includes data catalog, data literacy and automation capabilities for greater awareness of and access to data assets, guidance on their use, and guardrails to ensure data policies and best practices are followed.

With erwin DI’s automated, metadata-driven framework, organizations have visibility and control over their disparate data streams – from harvesting to aggregation and integration, including transformation with complete upstream and downstream lineage and all the associated documentation.

We’re super proud of this achievement and the value erwin DI provides.

We invite you to download the report and quadrant graphic.


There’s More to erwin Data Governance Automation Than Meets the AI

Prashant Parikh, erwin’s Senior Vice President of Software Engineering, talks about erwin’s vision to automate every aspect of the data governance journey to increase speed to insights. The clear benefit is that data stewards spend less time building and populating the data governance framework and more time realizing value and ROI from it. 

Industry analysts and other people who write about data governance and automation define it narrowly, with an emphasis on artificial intelligence (AI) and machine learning (ML). Although AI and ML are massive fields with tremendous value, erwin’s approach to data governance automation is much broader.

Automation adds a lot of value by making processes more effective and efficient. For data governance, automation ensures the framework is always accurate and up to date; otherwise the data governance initiative itself falls apart.

From our perspective, the key to data governance success is meeting the needs of both IT and business users in the discovery and application of enterprise “data truths.” We do this through an open, configurable and flexible metamodel across data catalog, business glossary, and self-service data discovery capabilities with built-in automation.

To better explain our vision for automating data governance, let’s look at some of the different aspects of how the erwin Data Intelligence Suite (erwin DI) incorporates automation.

Metadata Harvesting and Ingestion: Automatically harvest, transform and feed metadata from virtually any source to any target to activate it within the erwin Data Catalog (erwin DC). erwin provides this metadata-driven automation through two types of data connectors: 1) erwin Standard Data Connectors for data at rest or JDBC-compliant data sources and 2) optional erwin Smart Data Connectors for data in motion or a broad variety of code types and industry-standard languages, including ELT/ETL platforms, business intelligence reports, database procedural code, testing automation tools, ecosystem utilities and ERP environments.
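erwin’s connectors are proprietary, but a rough illustration of what JDBC-style harvesting of data at rest involves can be sketched with SQLAlchemy’s inspector. The connection string is a placeholder:

```python
from sqlalchemy import create_engine, inspect

# Placeholder connection string; any SQLAlchemy-supported source works.
engine = create_engine("postgresql://user:pass@host/sales_db")
inspector = inspect(engine)

catalog = {}  # qualified table name -> list of column facts
for schema in inspector.get_schema_names():
    for table in inspector.get_table_names(schema=schema):
        catalog[f"{schema}.{table}"] = [
            {"name": c["name"], "type": str(c["type"]), "nullable": c["nullable"]}
            for c in inspector.get_columns(table, schema=schema)
        ]
# 'catalog' now holds the kind of technical metadata a data catalog ingests.
```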

Data Cataloging: Catalog and sync metadata with data management and governance artifacts according to business requirements in real time. erwin DC helps organizations learn what data they have and where it’s located, including data at rest and in motion. It’s an inventory of the entire metadata universe, able to tell you the data and metadata available for a certain topic so those particular sources and assets can be found quickly for analysis and decision-making.

Data Mapping: erwin DI’s Mapping Manager provides an integrated development environment for creating and maintaining source-to-target mapping and transformation specifications to centrally version control data movement, integration and transformation. Import existing Excel or CSV files, use the drag-and-drop feature to extract the mappings from your ETL scripts, or manually populate the inventory to then be visualized with the lineage analyzer.

Code Generation: Generate ETL/ELT, Data Vault and code for other data integration components with plug-in SDKs to accelerate project delivery and reduce rework.

Data Lineage: Document and visualize how data moves and transforms across your enterprise. erwin DC generates end-to-end data lineage, down to the column level, between repositories and shows data flows from source systems to reporting layers, including intermediate transformation and business logic. Whether you’re a business user or a technical user, you can understand how data travels and transforms from point A to point B.
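Conceptually, column-level lineage is a graph of column-to-column edges that can be walked from source to report. A minimal sketch with hypothetical column names:

```python
# Each edge records that the source column feeds the derived column(s).
lineage = {
    "crm.customers.email":      ["staging.cust.email_clean"],
    "staging.cust.email_clean": ["dw.customer_dim.email"],
    "dw.customer_dim.email":    ["reports.churn_kpi.contact"],
}

def trace(column: str, path=None) -> list[list[str]]:
    """Return every source-to-report path starting at `column`."""
    path = (path or []) + [column]
    targets = lineage.get(column)
    if not targets:          # reached a reporting-layer column
        return [path]
    return [p for t in targets for p in trace(t, path)]

for p in trace("crm.customers.email"):
    print(" -> ".join(p))
```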

Data Profiling: Easily assess the contents and quality of registered data sets and associate these metrics with harvested metadata as part of ongoing data curation. erwin DI finds hidden inconsistencies and highlights other potential problems using intelligent statistical algorithms, and it provides robust validation scores to help correct errors.
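A commercial profiler’s statistical algorithms are richer, but the basic metrics – null rates, distinct counts and a naive validation score – can be sketched in pandas. The file name is a placeholder:

```python
import pandas as pd

# Hypothetical registered data set; in practice the location comes from
# the catalog rather than a hard-coded path.
df = pd.read_csv("customers.csv")

profile = pd.DataFrame({
    "nulls_pct": df.isna().mean() * 100,  # share of missing values per column
    "distinct":  df.nunique(),            # cardinality per column
})
# Naive score: columns riddled with nulls score low and warrant curation.
profile["score"] = (100 - profile["nulls_pct"]).round(1)
print(profile.sort_values("score"))
```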

Business Glossary Management: Curate, associate and govern data assets so all stakeholders can find data relevant to their roles and understand it within a business context. erwin DI’s Business Glossary Manager is a central repository for all terms, policies and rules with out-of-the-box, industry-specific business glossaries with best-practice taxonomies and ontologies.

Semantic and Metadata Associations: erwin AIMatch automatically discovers and suggests relationships and associations between business terms and technical metadata to accelerate the creation and maintenance of governance frameworks.

Sensitive Data Discovery + Mind Mapping: Identify, document and prioritize sensitive data elements, flagging sensitive information to accelerate compliance efforts and reduce data-related risks. For example, we ship out-of-the-box General Data Protection Regulation (GDPR) policies and critical data elements that make up the GDPR policy. 

Additionally, the mind map automatically connects technical and business objects so both sets of stakeholders can easily visualize the organization’s most valuable data assets. It provides a current, holistic and enterprise-wide view of risks, enabling compliance and regulatory managers to quickly update classifications at one level or at higher levels, if necessary. The mind map also shows the sensitivity indicator and allows you to propagate sensitivity across related objects to ensure compliance.
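As a rough sketch of the pattern-matching half of sensitive data discovery – real GDPR policy packs ship far richer rules than these two hypothetical regexes:

```python
import re

# Illustrative detection patterns keyed by sensitivity tag.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "iban":  re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
}

def flag_sensitive(sample_values: list[str]) -> set[str]:
    """Return the sensitivity tags whose pattern matches any sampled value."""
    return {
        tag for tag, rx in PATTERNS.items()
        if any(rx.search(v or "") for v in sample_values)
    }

print(flag_sensitive(["jane@example.com", "n/a"]))  # {'email'}
```

Flags like these are what get propagated across related objects in the mind map so the classification follows the data.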

Self-Service Data Discovery: With an easy-to-use UI and flexible search mechanisms, business users can look up information and then perform the required analysis for quick and accurate decision-making. It further enables data socialization and collaboration between data functions within the organization.

Data Modeling Integration: By automatically harvesting your models from erwin Data Modeler, along with all the associated metadata, for ingestion into the data catalog, you ensure a single source of truth. Then you can associate metadata with physical assets, develop a business glossary with model-driven naming standards, and socialize data models with a wider range of stakeholders. This integration also helps business stewards: if your data model has your naming standard convention filled in, it is used to populate the business glossary.

Enterprise Architecture Integration: erwin DI Harvester for Evolve systemically harvests data assets via smart data connectors for a wide range of data sources, both data at rest and data in motion. The harvested metadata integrates with enterprise architecture providing an accurate picture of the processes, applications and data within an organization.

Why Automating Everything Matters

The bottom line is you do not need to waste precious time, energy and resources to search, manage, analyze, prepare or protect data manually. And unless your data is well-governed, downstream data analysts and data scientists will not be able to generate significant value from it.

erwin DI provides you with the ability to populate your system with the metadata from your enterprise. We help you at every step with built-in, out-of-the-box solutions and automation for every aspect of your data governance journey.

By ensuring your environment always stays controlled, you stay on top of compliance and the tagging of sensitive data, satisfying your unique governance needs with flexibility built into the product and automation guiding you each step of the way.

erwin DI also enables and encourages collaboration and democratization of the data collected in the system, letting business users mine the data sets – because that is the ultimate value of your data governance solution.

With software-based automation and guidance from humans, the information in your data governance framework will never be outdated or out of sync with your IT and business functions. Stale data can’t fuel a successful data governance program.

Learn more about erwin automation, including what’s on the technology roadmap, by watching “Our Vision to Automate Everything” from the first day of erwin Insights 2020.

Or you can request your own demo of erwin DI.


Automating Data Governance


Automating data governance is key to addressing the exponentially growing volume and variety of data.

erwin CMO Mariann McDonagh recounted erwin’s vision to automate everything on day one of erwin Insights 2020.

Data readiness is everything. Whether driving digital experiences, mapping customer journeys, enhancing digital operations, developing digital innovations, finding new ways to interact with customers, or building digital ecosystems or marketplaces – all of this digital transformation is powered by data.

In a COVID and post-COVID world, organizations need to radically change as we look to reimagine business models and reform the way we approach almost everything.

The State of Data Automation

Data readiness depends on automation to create the data pipeline. Earlier this year, erwin conducted a research project in partnership with Dataversity, the 2020 State of Data Governance and Automation.

We asked participants to “talk to us about data value chain bottlenecks.” They told us their number one challenge is documenting complete data lineage (62%), followed by understanding the quality of the data source (58%).

Two other significant bottlenecks are finding, identifying and harvesting data (55%) and curating data assets with business context and semantics (52%). Every item mentioned here is a recurring theme we hear from our customers in terms of what led them to erwin.

We also looked at data preparation, governance and intelligence to see where organizations might be getting stuck and spending lots of time. We found that project length – slow delivery time – is one of the biggest inhibitors. Data quality and accuracy are recurring themes as well.

Reliance on developers and technical resources is another barrier to productivity. Even with data scientists in the front office, the lack of people in the back office to harvest and prepare the data means time to value is prolonged.

Last but not least, we looked at the amount of time spent on data activities. The great news is that most organizations spend more than 10 hours a week on data-related activities. But the problem is that not enough of that time is spent on analysis because teams are stuck in data prep.

IDC talks about this reverse 80/20 rule: 80% of time and effort is spent on data preparation, with only 20% focused on data analysis. This means 80% of your time is left on the cutting-room floor and can’t be used to drive your business forward.


Data Automation Adds Value

Automating data operations adds a lot of value by making a solution more effective and more powerful. Consider a smart home’s thermostat, smoke detectors, lights, doorbell, etc. You have centralized access and control – from anywhere.

At erwin, our goal is to automate the entire data governance journey, whether top down or bottom up. We’re on a mission to automate all the tasks data stewards typically perform so they spend less time building and populating the data governance framework and more time using the framework to realize value and ROI.

Automation also ensures that the data governance framework is always up to date and never stale. Because without current and accurate data, a data governance initiative will fall apart.

Here are some ways erwin adds value by automating the data governance journey:

  • Metadata ingestion into the erwin Data Intelligence Suite (erwin DI) through our standard data connectors. And you can schedule metadata scans to ensure it’s always refreshed and up to date.
  • erwin Smart Data Connectors address data in motion – how it travels and transforms across the enterprise. These custom software solutions document all the traversing and transformation of data and populate erwin DI’s Metadata Manager with the technical metadata. erwin Smart Data Connectors also document ETL scripts and work with the tool of your choice.
  • erwin Lineage Analyzer puts everything together in an easy-to-understand format, making it easy for both business and technical users to visualize how data is traversing the enterprise, how it is getting transformed and the different hops it is taking along the way.
  • erwin DM Connect for DI makes it easy for metadata to be ingested from erwin Data Modeler to erwin DI. erwin DM customers can take advantage of all the rich metadata created and stored in their erwin data models. With just a couple of clicks, some or all data models can be configured and pushed to erwin DI’s Metadata Manager.

The automation and integration of erwin DM and erwin DI ensures that your data models are always updated and uploaded, providing a single source of truth for your data governance journey.

This is part one of a two-part series on how erwin is automating data governance. Learn more by watching this session from erwin Insights 2020, which now is available on demand.


Doing Cloud Migration and Data Governance Right the First Time

More and more companies are looking at cloud migration.

Migrating legacy data to public, private or hybrid clouds provides creative and sustainable ways for organizations to increase their speed to insights for digital transformation, modernize and scale their processing and storage capabilities, better manage and reduce costs, encourage remote collaboration, and enhance security, support and disaster recovery.

But let’s be honest – no one likes to move. So if you’re going to move your data from on-premises legacy data stores and warehouse systems to the cloud, you should do it right the first time. And as you make this transition, you need to understand what data you have, know where it is located, and govern it along the way.

cloud migration

Automated Cloud Migration

Historically, moving legacy data to the cloud hasn’t been easy or fast.

As organizations look to migrate their data from legacy on-prem systems to cloud platforms, they want to do so quickly and precisely while ensuring the quality and overall governance of that data.

The first step in this process is converting the physical table structures themselves. Then you must bulk load the legacy data. No less daunting, your next step is to re-point or even re-platform your data movement processes.
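To give a feel for that first step, here is a toy sketch of converting legacy column definitions to a cloud warehouse’s types. The mappings are illustrative and deliberately simplified – not erwin’s actual conversion logic:

```python
# Illustrative legacy-to-cloud type mappings (Oracle-ish -> Snowflake-ish).
TYPE_MAP = {
    "NUMBER":   "NUMERIC",
    "VARCHAR2": "VARCHAR",
    "CLOB":     "VARCHAR",        # no direct CLOB equivalent
    "DATE":     "TIMESTAMP_NTZ",
}

def convert_column(col_def: str) -> str:
    """Rewrite 'name TYPE(args)' using the target platform's type."""
    name, _, rest = col_def.strip().partition(" ")
    base, _, args = rest.partition("(")
    target = TYPE_MAP.get(base.upper(), base)
    return f"{name} {target}{'(' + args if args else ''}"

print(convert_column("customer_name VARCHAR2(120)"))  # customer_name VARCHAR(120)
print(convert_column("created DATE"))                 # created TIMESTAMP_NTZ
```

Real conversions also have to handle constraints, character sets, sequences and platform-specific defaults, which is why automation pays off at scale.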

Without automation, this is a time-consuming and expensive undertaking. And you can’t risk false starts or delayed ROI that reduce the confidence of the business and taint this transformational initiative.

By using automated and repeatable capabilities, you can quickly and safely migrate data to the cloud and govern it along the way.

But transforming and migrating enterprise data to the cloud is only half the story – once there, it needs to be governed for completeness and compliance. That means your cloud data assets must be available for use by the right people for the right purposes to maximize their security, quality and value.

Why You Need Cloud Data Governance

Companies everywhere are building innovative business applications to support their customers, partners and employees and are increasingly migrating from legacy to cloud environments. But even with the “need for speed” to market, new applications must be modeled and documented for compliance, transparency and stakeholder literacy.

The desire to modernize technology, over time, leads to acquiring many different systems with various data entry points and transformation rules for data as it moves into and across the organization.

These tools include enterprise service bus (ESB) products; data integration tools; extract, transform and load (ETL) tools; procedural code; application programming interfaces (APIs); file transfer protocol (FTP) processes; and even business intelligence (BI) reports that further aggregate and transform data.

With all these diverse metadata sources, it is difficult to understand the complicated web they form, much less get a simple visual flow of data lineage and impact analysis.

Regulatory compliance is also a major driver of data governance (e.g., GDPR, CCPA, HIPAA, SOX, PCI DSS). While progress has been made, enterprises are still grappling with the challenges of deploying comprehensive and sustainable data governance, including reliance on mostly manual processes for data mapping, data cataloging and data lineage.

Introducing erwin Cloud Catalyst

erwin just announced the release of erwin Cloud Catalyst, a suite of automated cloud migration and data governance software and services. It helps organizations quickly and precisely migrate their data from legacy, on-premise databases to the cloud and then govern those data assets throughout their lifecycle.

Only erwin provides software and services that automate the complete cloud migration and data governance lifecycle – from the reverse-engineering and transformation of legacy systems and ETL/ELT code, to moving bulk data, to cataloging and auto-generating lineage. The metadata-driven suite automatically finds, models, ingests, catalogs and governs cloud data assets.

erwin Cloud Catalyst comprises erwin Data Modeler (erwin DM), erwin Data Intelligence (erwin DI) and erwin Smart Data Connectors, working together to simplify and accelerate cloud migration by removing barriers, reducing risks and decreasing time to value for your investments in modern systems such as Snowflake, Microsoft Azure and Google Cloud.

We start with an assessment of your cloud migration strategy to determine what automation and optimization opportunities exist. Then we deliver an automation roadmap and design the appropriate smart data connectors to help your IT services team achieve your future-state cloud architecture, including accelerating data ingestion and ETL conversion.

Once your data reaches the cloud, you’ll have deep and detailed metadata management with full data governance, data lineage and impact analysis. With erwin Cloud Catalyst, you automate these data governance steps:

  • Harvest and catalog cloud data: erwin DM and erwin DI’s Metadata Manager natively scans RDBMS sources to catalog/document data assets.
  • Model cloud data structures: erwin DM converts, modifies and models the new cloud data structures.
  • Map data movement: erwin DI’s Mapping Manager defines data movement and transformation requirements via drag-and-drop functionality.
  • Generate source code: erwin DI’s automation framework generates data migration source code for any ETL/ELT SDK.
  • Test migrated data: erwin DI’s automation framework generates test cases and validation source code to test migrated data.
  • Govern cloud data: erwin DI gives cloud data assets business context and meaning through the Business Glossary Manager, as well as policies and rules for use.
  • Distribute cloud data: erwin DI’s Business User Portal provides self-service access to cloud data asset discovery and reporting tools.

Request an erwin Cloud Catalyst assessment.

And don’t forget to register for erwin Insights 2020 on October 13-14, with sessions on Snowflake, Microsoft and data lake initiatives powered by erwin Cloud Catalyst.



Why You Need End-to-End Data Lineage

Not Documenting End-to-End Data Lineage Is Risky Business – Understanding your data’s origins is key to successful data governance.

Not everyone understands what end-to-end data lineage is or why it is important. In a previous blog, I explained that data lineage is basically the history of data, including a data set’s origin, characteristics, quality and movement over time.

This information is critical to regulatory compliance, change management and data governance, not to mention delivering an optimal customer experience. But given the volume, velocity and variety of data (the three Vs of data) we generate today, producing and keeping up with end-to-end data lineage is complex and time-consuming.

Yet given this era of digital transformation and fierce competition, understanding what data you have, where it came from, how it’s changed since creation or acquisition, and whether it poses any risks is paramount to optimizing its value. Furthermore, faulty decision-making based on inconsistent analytics and inaccurate reporting can cost millions.

Data Lineage

Data Lineage Tells an Important Origin Story

End-to-end data lineage explains how information flows into, across and outside an organization. And knowing how information was created, its origin and quality may have greater value than a given data set’s current state.

For example, data lineage provides a way to determine which downstream applications and processes are affected by a change in data expectations and helps in planning for application updates.

As I mentioned above, the three Vs of data and the integration of systems make it difficult to understand the resulting data web, much less capture a simple visual of that flow. Yet a consistent view of data and how it flows is paramount to the success of enterprise data governance and any data-driven initiative.

Whether you need to drill down for a granular view of a particular data set or create a high-level summary to describe a particular system and the data it relies on, end-to-end data lineage must be documented and tracked, with an emphasis on the dynamics of data processing and movement as opposed to data structures. Data lineage helps answer questions about the origin of data in key performance indicator (KPI) reports, including:

  • How are the report tables and columns defined in the metadata?
  • Who are the data owners?
  • What are the transformation rules?

Five Consequences of Ignoring Data Lineage

Why do so many organizations struggle with end-to-end data lineage?

The struggle is real for a number of reasons. At the top of the list, organizations are dealing with more data than ever before using systems that weren’t designed to communicate effectively with one another.

Next, their IT and business stakeholders have a difficult time collaborating. And, for a lot of organizations, they’ve relied mostly on manual processes – if data lineage documentation has been attempted at all.

The risks of ignoring end-to-end data lineage are just too great. Let’s look at some of those consequences:

  1. Derailed Projects

Effectively managing business operations is a key factor in success – especially for organizations that are in the midst of digital transformation. Failures in business processes attributed to errors can be a big problem.

For example, in a typical business scenario where an incorrect data set is discovered within a report, it can take a team days or sometimes weeks, on average, to find the source of the error – derailing the project and costing time and money.

  2. Policy Bloat and Unruly Rules

The business glossary environment must represent the actual environment, i.e., be refreshed and synched; otherwise it becomes obsolete. You need real collaboration.

Data dictionaries, glossaries and policies can’t live in different formats and in different places. It is common for these to be expressed in different ways, depending on the database and underlying storage technology, but this causes policy bloat and rules that no organization, team or employee will understand, let alone realistically manage.

Effective data governance requires that business glossaries, data dictionaries and data privacy policies live in one central location, so they can be easily tracked, monitored and updated over time.

  3. Major Inefficiencies

Successful data migration and upgrades rely on seamless integration of tools and processes with coordinated efforts of people/resources. A passive approach frequently relies on creating new copies of data, usually with sensitive identifiers removed or obscured.

Not only does this passive approach create inefficiencies between determining what data to copy, how to copy it, and where to store the copy, it also creates new volumes of data that become harder to track over time. Yet again, a passive approach to data cannot scale. Direct access to the same live data across the organization is required.

  4. Not Knowing Where Your Data Is

Metadata management and manual mapping are a challenge to most organizations. Data comes in all shapes, sizes and formats, and there is no way to know what type of data a project will need – or even where that data will sit.

Some data might be in the cloud, some on premise, and sometimes projects will require a hybrid approach. All data must be governed, regardless of where it is located.

  5. Privacy and Compliance Challenges

Privacy and compliance personnel know the rules that must be applied to data, but may not necessarily know the technology. Instead, automated data governance requires that anyone, with any level of expertise, can understand what rules (e.g., privacy policies) are applied to enterprise data.

Organizations with established data governance must empower both those with technical skill sets and those with privacy and compliance knowledge, so all teams can play a meaningful role controlling how data is used.

For more information on data lineage, get the free white paper, Tech Brief: Data Lineage.


Integrating Data Governance and Enterprise Architecture

Aligning these practices for regulatory compliance and other benefits

Why should you integrate data governance (DG) and enterprise architecture (EA)? It’s time to think about EA beyond IT.

Two of the biggest challenges in creating a successful enterprise architecture initiative are collecting accurate information on application ecosystems and maintaining that information as application ecosystems change.

Data governance provides time-sensitive, current-state architecture information with a high level of quality. It documents your data assets from end to end for business understanding and clear data lineage with traceability.

In the context of EA, data governance helps you understand what information you have; where it came from; if it’s secure; who’s accountable for it; who accessed it and in which systems and applications it’s located and moves between.

You can collect complete application ecosystem information; objectively identify connections/interfaces between applications, using data; provide accurate compliance assessments; and quickly identify security risks and other issues.

Integrating data governance and EA also delivers many of the same benefits as standalone enterprise architecture or business process modeling projects: reducing risk, optimizing operations, and increasing the use of trusted data.

To better understand and align data governance and enterprise architecture, let’s look at data at rest and data in motion and why they both have to be documented.

  1. Documenting data at rest involves looking at where data is stored, such as in databases, data lakes, data warehouses and flat files. You must capture all of this information from the columns, fields and tables – and all the data overlaid on top of that. This means understanding not just the technical aspects of a data asset but also how the business uses that data asset.
  2. Documenting data in motion looks at how data flows between source and target systems – not just the data flows themselves but also how those flows are structured in terms of metadata. We have to document how our systems interact, including the logical and physical data assets that flow into, out of and between them (a minimal sketch of both record types follows this list).
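As a minimal sketch of what documenting both sides might look like as records (all names hypothetical):

```python
from dataclasses import dataclass

@dataclass
class DataAsset:       # data at rest: where it is stored and why
    system: str        # e.g., "sales_dw"
    table: str
    business_use: str  # how the business uses this asset

@dataclass
class DataFlow:        # data in motion: how systems interact
    source: DataAsset
    target: DataAsset
    transport: str     # e.g., "nightly ETL", "REST API"

orders = DataAsset("order_db", "orders", "order capture")
flow = DataFlow(orders,
                DataAsset("sales_dw", "fact_orders", "revenue reporting"),
                "nightly ETL")
# An EA repository built on records like these stays current as the
# data governance tooling refreshes them automatically.
```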

data governance and enterprise architecture

Automating Data Governance and Enterprise Architecture

If you have a data governance program and tooling in place, you’re able to document a lot of information that enterprise architects and process modelers usually spend months, if not years, collecting and keeping up to date.

So within a data governance repository, you’re capturing systems, environments, databases and data — both logical and physical. You’re also collecting information about how those systems are interconnected.

With all this information about the data landscape and the systems that use and store it, you’re automatically collecting your organization’s application architecture. Therefore you can drastically reduce the time to value, because your enterprise architecture is always up to date as long as the associated data is managed properly.

If your organization also has an enterprise architecture practice and tooling, you can automate the current-state architecture, which is arguably the most expensive and time-intensive aspect of enterprise architecture to have at your fingertips.

In erwin’s 2020 State of Data Governance and Automation report, close to 70 percent of respondents said they spend an average of 10 or more hours per week on data-related activities, and most of that time is spent searching for and preparing data.

At the same time, it’s also critical to answer executives’ questions. You can’t do impact analysis if you don’t understand the current-state architecture, and it won’t be delivered quickly enough if it isn’t documented.

Data Governance and Enterprise Architecture for Regulatory Compliance

First and foremost, we can start to document the application inventory automatically because we are scanning systems and understanding the architecture itself. When you pre-populate your interface inventory, application lineage and data flows, you see clear-cut dependencies.

That makes regulatory compliance a fantastic use case for both data governance and EA. You can factor this use case into process and application architecture diagrams, looking at where this type of data goes and what sort of systems it touches.

With that information, you can start to classify information for such regulations as the European Union’s General Data Protection Regulation (GDPR), the California Consumer Privacy Act (CCPA) or any type of compliance data for an up-to-date regulatory compliance repository. Then all this information flows into processing controls and will ultimately deliver real-time, true impact analysis and traceability.

erwin for Data Governance and Enterprise Architecture

Using data governance and enterprise architecture in tandem will give you a data-driven architecture, reducing time to value and showing true results to your executives.

You can better manage risk because of real-time data coming into the EA space. You can react quicker, answering questions for stakeholders that will ultimately drive business transformation. And you can reinforce the value of your role as an enterprise architect.

erwin Evolve is a full-featured, configurable set of enterprise architecture and business process modeling and analysis tools. It integrates with erwin’s data governance software, the erwin Data Intelligence Suite.

With these unified capabilities, every enterprise stakeholder – enterprise architect, business analyst, developer, chief data officer, risk manager, and CEO – can discover, understand, govern and socialize data assets to realize greater value while mitigating data-related risks.

You can start a free trial of erwin Evolve here.
