
Data Governance Makes Data Security Less Scary

Happy Halloween!

Do you know where your data is? What data you have? Who has had access to it?

These can be frightening questions for an organization to answer.

Add to the mix the potential for a data breach followed by non-compliance, reputational damage and financial penalties and a real horror story could unfold.

In fact, we’ve seen some frightening ones play out already:

  1. Google’s record GDPR fine – France’s data privacy enforcement agency hit the tech giant with a $57 million penalty in early 2019 – more than 80 times the steepest fine the U.K.’s Information Commissioner’s Office had levied against both Facebook and Equifax for their data breaches.
  2. In July 2019, British Airways received the biggest GDPR fine to date ($229 million) because the data of more than 500,000 customers was compromised.
  3. Marriott International was fined $123 million, or 1.5 percent of its global annual revenue, because 330 million hotel guests were affected by a breach in 2018.

Now, as Cybersecurity Awareness Month comes to a close – and ghosts and goblins roam the streets – we thought it a good time to resurrect some guidance on how data governance can make data security less scary.

We don’t want you to be caught off guard when it comes to protecting sensitive data and staying compliant with data regulations.


Don’t Scream; You Can Protect Your Sensitive Data

It’s easier to protect sensitive data when you know what it is, where it’s stored and how it needs to be governed.

Data security incidents may be the result of not having a true data governance foundation that makes it possible to understand the context of data – what assets exist and where, the relationship between them and enterprise systems and processes, and how and by what authorized parties data is used.

That knowledge is critical to supporting efforts to keep relevant data secure and private.

Without data governance, organizations don’t have visibility of the full data landscape – linkages, processes, people and so on – to propel more context-sensitive security architectures that can better assure expectations around user and corporate data privacy. In sum, they lack the ability to connect the dots across governance, security and privacy – and to act accordingly.

Data governance addresses these fundamental questions:

  1. What private data do we store and how is it used?
  2. Who has access and permissions to the data?
  3. What data do we have and where is it?

Where Are the Skeletons?

Data is a critical asset used to operate, manage and grow a business. While some of it rests in databases, data lakes and data warehouses, a large percentage is federated and integrated across the enterprise, introducing governance, manageability and risk issues that must be addressed.

Knowing where sensitive data is located and properly governing it with policy rules, impact analysis and lineage views is critical for risk management, data audits and regulatory compliance.

However, when key data isn’t discovered, harvested, cataloged, defined and standardized as part of integration processes, audits may be flawed and therefore your organization is at risk.

Sensitive data – at rest or in motion – that exists in various forms across multiple systems must be automatically tagged, its lineage automatically documented, and its flows depicted so that it is easily found and its usage across workflows easily traced.

Thankfully, tools are available to help automate the scanning, detection and tagging of sensitive data by:

  • Monitoring and controlling sensitive data: Better visibility and control across the enterprise to identify data security threats and reduce associated risks
  • Enriching business data elements for sensitive data discovery: Comprehensively defining business data elements for PII, PHI and PCI across database systems, cloud and Big Data stores to easily identify sensitive data based on a set of algorithms and data patterns
  • Providing metadata and value-based analysis: Discovery and classification of sensitive data based on metadata and data value patterns and algorithms. Organizations can define business data elements and rules to identify and locate sensitive data including PII, PHI, PCI and other sensitive information.
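To make the last bullet concrete, here is a deliberately minimal Python sketch of metadata- and value-based classification: it flags columns as sensitive using naming hints plus value patterns. The patterns, column names and sample values are all hypothetical, and a commercial tool applies far richer algorithms, but the principle is the same.

```python
import re

# Hypothetical, minimal patterns; real tools ship far richer rule sets
# covering PII, PHI and PCI elements across many locales.
VALUE_PATTERNS = {
    "ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "credit_card": re.compile(r"^\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}$"),
}

# Hypothetical column-name hints (metadata-based analysis).
NAME_HINTS = {"ssn": "ssn", "social_security": "ssn", "email": "email",
              "card_number": "credit_card"}

def classify_column(column_name: str, sample_values: list[str]) -> set[str]:
    """Tag a column as sensitive based on its metadata (name) and value patterns."""
    tags = set()
    lowered = column_name.lower()
    for hint, tag in NAME_HINTS.items():
        if hint in lowered:
            tags.add(tag)
    for value in sample_values:
        for tag, pattern in VALUE_PATTERNS.items():
            if pattern.match(value.strip()):
                tags.add(tag)
    return tags

# Example: a column whose name says nothing, but whose values give it away.
print(classify_column("cust_ref", ["123-45-6789", "987-65-4321"]))  # {'ssn'}
print(classify_column("contact_email", ["a@example.com"]))          # {'email'}
```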

No Hocus Pocus

Truly understanding an organization’s data, including its value and quality, requires a harmonized approach embedded in business processes and enterprise architecture.

Such an integrated enterprise data governance experience helps organizations understand what data they have, where it is, where it came from, its value, its quality and how it’s used and accessed by people and applications.

An ounce of prevention is worth a pound of cure – a cure that ranges from the painstaking process of identifying what happened and why to notifying customers that their data, and thus their trust in your organization, has been compromised.

A well-formed security architecture that is driven by and aligned with data intelligence is your best defense. However, where there is nefarious intent, a hacker will find a way. So being prepared means you can minimize your risk exposure and the damage to your reputation.

Multiple components must be considered to effectively support a data governance, security and privacy trinity. They are:

  1. Data models
  2. Enterprise architecture
  3. Business process models

Creating policies for data handling and accountability and driving culture change so people understand how to properly work with data are two important components of a data governance initiative, as is the technology for proactively managing data assets.

Without the ability to harvest metadata schemas and business terms; analyze data attributes and relationships; impose structure on definitions; and view all data in one place according to each user’s role within the enterprise, businesses will be hard pressed to stay in step with governance standards and best practices around security and privacy.

As a consequence, the private information held within organizations will continue to be at risk.

Organizations suffering data breaches will be deprived of the benefits they had hoped to realize from the money spent on security technologies and the time invested in developing data privacy classifications.

They also may face heavy fines and other financial, not to mention PR, penalties.



Very Meta … Unlocking Data’s Potential with Metadata Management Solutions

Untapped data, if mined, represents tremendous potential for your organization. While there has been a lot of talk about big data over the years, the real hero in unlocking the value of enterprise data is metadata, or the data about the data.

However, most organizations don’t use all the data they’re flooded with to reach deeper conclusions about how to drive revenue, achieve regulatory compliance or make other strategic decisions. They don’t know exactly what data they have or even where some of it is.

Quite honestly, knowing what data you have and where it lives is complicated. And to truly understand it, you need to be able to create and sustain an enterprise-wide view of and easy access to underlying metadata.

This isn’t an easy task. Organizations are dealing with numerous data types and data sources that were never designed to work together and data infrastructures that have been cobbled together over time with disparate technologies, poor documentation and with little thought for downstream integration.

As a result, the applications and initiatives that depend on a solid data infrastructure may be compromised, leading to faulty analysis and insights.

Metadata Is the Heart of Data Intelligence

A recent IDC Innovators: Data Intelligence Report says that getting answers to such questions as “where is my data, where has it been, and who has access to it” requires harnessing the power of metadata.

Metadata is generated every time data is captured at a source, accessed by users, moves through an organization, and then is profiled, cleansed, aggregated, augmented and used for analytics to guide operational or strategic decision-making.

In fact, data professionals spend 80 percent of their time looking for and preparing data and only 20 percent of their time on analysis, according to IDC.

To flip this 80/20 rule, they need an automated metadata management solution for:

• Discovering data – Identify and interrogate metadata from various data management silos.
• Harvesting data – Automate the collection of metadata from various data management silos and consolidate it into a single source.
• Structuring and deploying data sources – Connect physical metadata to specific data models, business terms, definitions and reusable design standards.
• Analyzing metadata – Understand how data relates to the business and what attributes it has.
• Mapping data flows – Identify where to integrate data and track how it moves and transforms.
• Governing data – Develop a governance model to manage standards, policies and best practices and associate them with physical assets.
• Socializing data – Empower stakeholders to see data in one place and in the context of their roles.
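To picture the first two items on this list (discovering and harvesting metadata), the short Python sketch below pulls table and column metadata from a single source into an in-memory catalog. SQLite is used only as a stand-in for one of many data management silos; an enterprise solution does this automatically across databases, files, BI tools and cloud stores and consolidates the results.

```python
import sqlite3

def harvest_metadata(conn: sqlite3.Connection) -> dict[str, list[dict]]:
    """Collect table and column metadata from one data source into a simple catalog."""
    catalog: dict[str, list[dict]] = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    for (table,) in tables:
        columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
        catalog[table] = [
            {"column": col[1], "type": col[2], "nullable": not col[3]}
            for col in columns
        ]
    return catalog

# Stand-in for one of many data management silos.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")

print(harvest_metadata(conn))
# {'customer': [{'column': 'id', 'type': 'INTEGER', 'nullable': True},
#               {'column': 'email', 'type': 'TEXT', 'nullable': False}]}
```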

Addressing the Complexities of Metadata Management

The complexities of metadata management can be addressed with a strong data management strategy coupled with metadata management software to enable the data quality the business requires.

This encompasses data cataloging (integration of data sets from various sources), mapping, versioning, business rules and glossary maintenance, and metadata management (associations and lineage).

erwin has developed the only data intelligence platform that provides organizations with a complete and contextual depiction of the entire metadata landscape.

It is the only solution that can automatically harvest, transform and feed metadata from operational processes, business applications and data models into a central data catalog, making it accessible and understandable within the context of role-based views.

erwin’s ability to integrate and continuously refresh metadata from an organization’s entire data ecosystem, including business processes, enterprise architecture and data architecture, forms the foundation for enterprise-wide data discovery, literacy, governance and strategic usage.

Organizations then can take a data-driven approach to business transformation, speed to insights, and risk management.
With erwin, organizations can:

1. Deliver a trusted metadata foundation through automated metadata harvesting and cataloging
2. Standardize data management processes through a metadata-driven approach
3. Centralize data-driven projects around centralized metadata for planning and visibility
4. Accelerate data preparation and delivery through metadata-driven automation
5. Master data management platforms through metadata abstraction
6. Accelerate data literacy through contextual metadata enrichment and integration
7. Leverage a metadata repository to derive lineage, impact analysis and enable audit/oversight ability

With erwin Data Intelligence as part of the erwin EDGE platform, you know what data you have, where it is, where it’s been and how it transformed along the way, plus you can understand sensitivities and risks.

With an automated, real-time, high-quality data pipeline, enterprise stakeholders can base strategic decisions on a full inventory of reliable information.

Many of our customers are hard at work addressing metadata management challenges, and that’s why erwin was named a Leader in Gartner’s “2019 Magic Quadrant for Metadata Management Solutions.”



Benefits of Data Vault Automation

The benefits of Data Vault automation range from the more abstract – like improving data integrity – to the tangible – such as clearly identifiable savings in cost and time.

So Seriously … You Should Automate Your Data Vault

 By Danny Sandwell

Data Vault is a methodology for architecting and managing data warehouses in complex data environments where new data types and structures are constantly introduced.

Without Data Vault, data warehouses are difficult and time-consuming to change, causing latency issues and slowing time to value. In addition, the queries required to maintain historical integrity are complex to design and slow to run, causing performance issues and potentially incorrect results because the ability to understand relationships between historical snapshots of data is lacking.

In his blog, Dan Linstedt, the creator of Data Vault methodology, explains that Data Vaults “are extremely scalable, flexible architectures” enabling the business to grow and change without “the agony and pain of high costs, long implementation and test cycles, and long lists of impacts across the enterprise warehouse.”

With a Data Vault, new functional areas typically are added quickly and easily, with changes to existing architecture taking less than half the traditional time with much less impact on the downstream systems, he notes.

Astonishingly, nearly 20 years since the methodology’s creation, most Data Vault design, development and deployment phases are still handled manually. But why?

Traditional manual efforts to define the Data Vault population and create ETL code from scratch can take weeks or even months. The entire process is time-consuming, slows down the data pipeline and is often riddled with human errors.

On the flip side, automating the development and deployment of design changes and the resulting data movement processing code means companies can deliver those changes in a timely and cost-effective manner.
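To illustrate why Data Vault lends itself to automation, here is a toy Python sketch that generates hub and satellite DDL from a small metadata description of a source entity. It is not erwin's generator and the entity is hypothetical; it simply shows the underlying idea that, because the structures are highly schematic, they can be produced from harvested metadata instead of being hand-coded.

```python
def generate_hub_and_satellite(entity: str, business_key: str, attributes: list[str]) -> str:
    """Emit hub and satellite DDL for one source entity from its metadata."""
    hub = (
        f"CREATE TABLE hub_{entity} (\n"
        f"  {entity}_hk CHAR(32) PRIMARY KEY,  -- hash of the business key\n"
        f"  {business_key} VARCHAR(100) NOT NULL,\n"
        f"  load_date TIMESTAMP NOT NULL,\n"
        f"  record_source VARCHAR(100) NOT NULL\n"
        f");"
    )
    sat_cols = ",\n".join(f"  {attr} VARCHAR(255)" for attr in attributes)
    satellite = (
        f"CREATE TABLE sat_{entity} (\n"
        f"  {entity}_hk CHAR(32) NOT NULL REFERENCES hub_{entity},\n"
        f"  load_date TIMESTAMP NOT NULL,\n"
        f"{sat_cols},\n"
        f"  PRIMARY KEY ({entity}_hk, load_date)\n"
        f");"
    )
    return hub + "\n\n" + satellite

# Hypothetical source entity harvested from an operational system.
print(generate_hub_and_satellite("patient", "patient_number",
                                 ["first_name", "last_name", "date_of_birth"]))
```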


Benefits of Data Vault Automation – A Case Study …

Global Pharma Company Saves Considerable Time and Money with Data Vault Automation

Let’s take a look at a large global pharmaceutical company that switched to Data Vault automation with staggering results.

Like many pharmaceutical companies, it manages a massive data warehouse combining clinical trial, supply chain and other mission-critical data. The company had chosen a Data Vault schema for its flexibility in handling change but found creating the hub and satellite structures incredibly laborious.

They needed to accelerate development, as well as aggregate data from different systems for internal customers to access and share. Additionally, the company needed lineage and traceability for regulatory compliance efforts.

With this ability, they can identify data sources, transformations and usage to safeguard protected health information (PHI) for clinical trials.

After an initial proof of concept, they deployed erwin Data Vault Automation and generated more than 200 tables, jobs and processes with 10 to 12 scripts. The highly schematic structure of the models enabled large portions of the modeling process to be automated, dramatically accelerating Data Vault projects and optimizing data warehouse management.

erwin Data Vault Automation helped this pharma customer automate the complete lifecycle – accelerating development while increasing consistency, simplicity and flexibility – to save considerable time and money.

For this customer, the benefits of Data Vault automation were as follows:

  • Saving an estimated 70% of the costs of manual development
  • Generating 95% of the production code with “zero touch,” improving time to business value and significantly reducing the costly rework associated with error-prone manual processes
  • Increasing data integrity, including for new requirements and use cases regardless of changes to the warehouse structure because legacy source data doesn’t degrade
  • Creating a sustainable approach to Data Vault deployment, ensuring the agile, adaptable and timely delivery of actionable insights to the business in a well-governed facility for regulatory compliance, including full transparency and ease of auditability

Homegrown Tools Never Provide True Data Vault Automation

Many organizations use some form of homegrown tool or standalone applications. However, they don’t integrate with other tools and components of the architecture, they’re expensive, and quite frankly, they make it difficult to derive any meaningful results.

erwin Data Vault Automation centralizes the specification and deployment of Data Vault architectures for better control and visibility of the software development lifecycle. erwin Data Catalog makes it easy to discover, organize, curate and govern data being sourced for and managed in the warehouse.

With this solution, users select data sets to be included in the warehouse and fully automate the loading of Data Vault structures and ETL operations.

erwin Data Vault Smart Connectors eliminate the need for a business analyst and ETL developers to repeat mundane tasks, so they can focus on choosing and using the desired data instead. This saves considerable development time and effort plus delivers a high level of standardization and reuse.

After the Data Vault processes have been automated, the warehouse is well documented with traceability from the marts back to the operational data to speed the investigation of issues and analyze the impact of changes.

Bottom line: if your Data Vault integration is not automated, you’re already behind.

If you’d like to get started with erwin Data Vault Automation or request a quote, you can email consulting@erwin.com.



Using Strategic Data Governance to Manage GDPR/CCPA Complexity

In light of recent, high-profile data breaches, it’s past time we re-examine strategic data governance and its role in managing regulatory requirements.

News broke earlier this week of British Airways being fined 183 million pounds – or $228 million – by the U.K. for alleged violations of the European Union’s General Data Protection Regulation (GDPR). While not the first, it is the largest penalty levied since the GDPR went into effect in May 2018.

Given this, Oppenheimer & Co. cautions:

“European regulators could accelerate the crackdown on GDPR violators, which in turn could accelerate demand for GDPR readiness. Although the CCPA [California Consumer Privacy Act, the U.S. equivalent of GDPR] will not become effective until 2020, we believe that new developments in GDPR enforcement may influence the regulatory framework of the still fluid CCPA.”

With all the advance notice and significant chatter for GDPR/CCPA, why aren’t organizations more prepared to deal with data regulations?

In a word? Complexity.

The complexity of regulatory requirements in and of themselves is aggravated by the complexity of the business and data landscapes within most enterprises.

So it’s important to understand how to use strategic data governance to manage the complexity of regulatory compliance and other business objectives …

Designing and Operationalizing Regulatory Compliance Strategy

It’s not easy to design and deploy compliance in an environment that’s not well understood and difficult to maneuver in. First you need to analyze and design your compliance strategy and tactics, and then you need to operationalize them.

Modern, strategic data governance, which involves both IT and the business, enables organizations to plan and document how they will discover and understand their data within context, track its physical existence and lineage, and maximize its security, quality and value. It also helps enterprises put these strategic capabilities into action by:

  • Understanding their business, technology and data architectures and their inter-relationships, aligning them with their goals and defining the people, processes and technologies required to achieve compliance.
  • Creating and automating a curated enterprise data catalog, complete with physical assets, data models, data movement, data quality and on-demand lineage.
  • Activating their metadata to drive agile data preparation and governance through integrated data glossaries and dictionaries that associate policies to enable stakeholder data literacy.


Five Steps to GDPR/CCPA Compliance

With the right technology, GDPR/CCPA compliance can be automated and accelerated in these five steps:

  1. Catalog systems

Harvest, enrich/transform and catalog data from a wide array of sources to enable any stakeholder to see the interrelationships of data assets across the organization.

  2. Govern PII “at rest”

Classify, flag and socialize the use and governance of personally identifiable information regardless of where it is stored.

  3. Govern PII “in motion”

Scan, catalog and map personally identifiable information to understand how it moves inside and outside the organization and how it changes along the way.

  4. Manage policies and rules

Govern business terminology in addition to data policies and rules, depicting relationships to physical data catalogs and the applications that use them with lineage and impact analysis views.

  5. Strengthen data security

Identify regulatory risks and guide the fortification of network and encryption security standards and policies by understanding where all personally identifiable information is stored, processed and used.
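A minimal Python sketch of how steps 1 through 3 fit together: a catalog records where personally identifiable information rests, harvested mappings record how it moves, and a single query answers “where is this class of data stored and where does it flow?” All table and column names below are hypothetical.

```python
from collections import defaultdict

# Hypothetical catalog entries: which columns hold which classification.
pii_at_rest = {
    ("crm.customer", "email"): "PII",
    ("crm.customer", "ssn"): "PII",
    ("billing.invoice", "card_number"): "PCI",
}

# Hypothetical data movement harvested from ETL mappings: source -> target.
data_flows = [
    (("crm.customer", "email"), ("marketing.campaign_list", "email_address")),
    (("crm.customer", "ssn"), ("edw.customer_dim", "ssn")),
]

def where_is(classification: str):
    """List where a class of personal data rests and every place it flows to."""
    stored = [col for col, cls in pii_at_rest.items() if cls == classification]
    in_motion = defaultdict(list)
    for source, target in data_flows:
        if source in stored:
            in_motion[source].append(target)
    return stored, dict(in_motion)

stored, moved = where_is("PII")
print("At rest:", stored)
print("In motion:", moved)
```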

How erwin Can Help

erwin is the only software provider with a complete, metadata-driven approach to data governance through our integrated enterprise modeling and data intelligence suites. We help customers overcome their data governance challenges, with risk management and regulatory compliance being primary concerns.

However, the erwin EDGE also delivers an “enterprise data governance experience” in terms of agile innovation and business transformation – from creating new products and services to keeping customers happy to generating more revenue.

Whatever your organization’s key drivers are, a strategic data governance approach – through  business process, enterprise architecture and data modeling combined with data cataloging and data literacy – is key to success in our modern, digital world.

If you’d like to get a handle on handling your data, you can sign up for a free, one-on-one demo of erwin Data Intelligence.

For more information on GDPR/CCPA, we’ve also published a white paper on the Regulatory Rationale for Integrating Data Management and Data Governance.



Choosing the Right Data Modeling Tool

The need for an effective data modeling tool is more significant than ever.

For decades, data modeling has provided the optimal way to design and deploy new relational databases with high-quality data sources and support application development. But it provides even greater value for modern enterprises where critical data exists in both structured and unstructured formats and lives both on premise and in the cloud.

In today’s hyper-competitive, data-driven business landscape, organizations are awash with data and the applications, databases and schema required to manage it.

For example, an organization may have 300 applications, with 50 different databases and a different schema for each. Additional challenges, such as increasing regulatory pressures – from the General Data Protection Regulation (GDPR) to the Health Insurance Portability and Accountability Act (HIPAA) – and growing stores of unstructured data also underscore the increasing importance of a data modeling tool.

Data modeling, quite simply, describes the process of discovering, analyzing, representing and communicating data requirements in a precise form called the data model. There’s an expression: measure twice, cut once. Data modeling is the upfront “measuring tool” that helps organizations reduce time and avoid guesswork in a low-cost environment.
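As a miniature illustration of “measure twice, cut once,” the Python sketch below takes a tiny, hypothetical logical model (two entities, one relationship) and forward-engineers it into physical DDL. It is not any particular tool's internals; it simply shows how a precise model becomes the blueprint for the database.

```python
# A hypothetical, minimal logical model: entities, attributes and one relationship.
model = {
    "Customer": {"customer_id": "INT", "name": "VARCHAR(100)", "email": "VARCHAR(255)"},
    "SalesOrder": {"order_id": "INT", "customer_id": "INT", "order_date": "DATE"},
}
relationships = [("SalesOrder", "customer_id", "Customer", "customer_id")]

def forward_engineer(model: dict, relationships: list) -> str:
    """Turn the logical model into physical DDL, the 'cut once' step."""
    statements = []
    for entity, attrs in model.items():
        cols = [f"  {name} {dtype}" for name, dtype in attrs.items()]
        pk = next(iter(attrs))  # convention for this sketch: first attribute is the key
        cols.append(f"  PRIMARY KEY ({pk})")
        statements.append(f"CREATE TABLE {entity.lower()} (\n" + ",\n".join(cols) + "\n);")
    for child, fk, parent, parent_key in relationships:
        statements.append(
            f"ALTER TABLE {child.lower()} ADD FOREIGN KEY ({fk}) "
            f"REFERENCES {parent.lower()} ({parent_key});"
        )
    return "\n\n".join(statements)

print(forward_engineer(model, relationships))
```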

From a business-outcome perspective, a data modeling tool is used to help organizations:

  • Effectively manage and govern massive volumes of data
  • Consolidate and build applications with hybrid architectures, including traditional, Big Data, cloud and on premise
  • Support expanding regulatory requirements, such as GDPR and the California Consumer Privacy Act (CCPA)
  • Simplify collaboration across key roles and improve information alignment
  • Improve business processes for operational efficiency and compliance
  • Empower employees with self-service access for enterprise data capability, fluency and accountability


Evaluating a Data Modeling Tool – Key Features

Organizations seeking to invest in a new data modeling tool should consider these four key features.

  1. Ability to visualize business and technical database structures through an integrated, graphical model.

Due to the number of database platforms available, it’s important that an organization’s data modeling tool supports a sufficient (to your organization) array of platforms. The chosen data modeling tool should be able to read the technical formats of each of these platforms and translate them into highly graphical models rich in metadata. Schema can be deployed from models in an automated fashion and iteratively updated so that new development can take place via model-driven design.

  2. Empowerment of end-user BI/analytics through data source discovery, analysis and integration.

A data modeling tool should give business users confidence in the information they use to make decisions. Such confidence comes from the ability to provide a common, contextual, easily accessible source of data element definitions to ensure they are able to draw upon the correct data; understand what it represents, including where it comes from; and know how it’s connected to other entities.

A data modeling tool can also be used to pull in data sources via self-service BI and analytics dashboards. The data modeling tool should also have the ability to integrate its models into whatever format is required for downstream consumption.

  3. The ability to store business definitions and data-centric business rules in the model along with technical database schemas, procedures and other information.

With business definitions and rules on board, technical implementations can be better aligned with the needs of the organization. Using an advanced design layer architecture, model “layers” can be created with one or more models focused on the business requirements that then can be linked to one or more database implementations. Design-layer metadata can also be connected from conceptual through logical to physical data models.

  4. The ability to rationalize platform inconsistencies and deliver a single source of truth for all enterprise business data.

Many organizations struggle to break down data silos and unify data into a single source of truth, due in large part to varying data sources and the difficulty of managing unstructured data. Being able to model any data from anywhere accounts for this with on-demand modeling for non-relational databases that offer speed, horizontal scalability and other real-time application advantages.

With NoSQL support, model structures from non-relational databases, such as Couchbase and MongoDB, can be created automatically. Existing Couchbase and MongoDB data sources can be easily discovered, understood and documented through modeling and visualization. Existing entity-relationship diagrams and SQL databases can be migrated to Couchbase and MongoDB too. Relational schemas also can be transformed into query-optimized NoSQL constructs.
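The schema transformation described above essentially decides which related relational tables become embedded documents. A deliberately simplified Python sketch with hypothetical customer and order rows (not tied to Couchbase, MongoDB or any specific tool):

```python
# Hypothetical rows from a relational source.
customers = [{"customer_id": 1, "name": "Acme Corp"}]
orders = [
    {"order_id": 10, "customer_id": 1, "total": 250.0},
    {"order_id": 11, "customer_id": 1, "total": 80.5},
]

def to_documents(customers: list[dict], orders: list[dict]) -> list[dict]:
    """Denormalize a parent/child relational structure into query-optimized documents."""
    docs = []
    for customer in customers:
        child_orders = [
            {k: v for k, v in order.items() if k != "customer_id"}
            for order in orders if order["customer_id"] == customer["customer_id"]
        ]
        docs.append({**customer, "orders": child_orders})
    return docs

print(to_documents(customers, orders))
# [{'customer_id': 1, 'name': 'Acme Corp',
#   'orders': [{'order_id': 10, 'total': 250.0}, {'order_id': 11, 'total': 80.5}]}]
```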

Other considerations include the ability to:

  • Compare models and databases.
  • Increase enterprise collaboration.
  • Perform impact analysis.
  • Enable business and IT infrastructure interoperability.

When it comes to data modeling, no one knows it better. For more than 30 years, erwin Data Modeler has been the market leader. It is built on the vision and experience of data modelers worldwide and is the de-facto standard in data model integration.

You can learn more about driving business value and underpinning governance with erwin DM in this free white paper.



What’s Business Process Modeling Got to Do with It? – Choosing A BPM Tool

With business process modeling (BPM) being a key component of data governance, choosing a BPM tool is a decision many businesses either face now or will face soon.

Historically, BPM didn’t necessarily have to be tied to an organization’s data governance initiative.

However, data-driven business and the regulations that oversee it are becoming increasingly extensive, so the need to view data governance as a collective effort – in terms of personnel and the tools that make up the strategy – is becoming harder to ignore.

Data governance also relies on business process modeling and analysis to drive improvement, including identifying business practices susceptible to security, compliance or other risks and adding controls to mitigate exposures.

Choosing a BPM Tool: An Overview

As part of a data governance strategy, a BPM tool aids organizations in visualizing their business processes, system interactions and organizational hierarchies to ensure elements are aligned and core operations are optimized.

The right BPM tool also helps organizations increase productivity, reduce errors and mitigate risks to achieve strategic objectives.

With insights from the BPM tool, you can clarify roles and responsibilities – which in turn should influence an organization’s policies about data ownership and make data lineage easier to manage.

Organizations also can use a BPM tool to identify the staff who function as “unofficial data repositories.” This has both a primary and secondary function:

1. Organizations can document employee processes to ensure vital information isn’t lost should an employee choose to leave.

2. It is easier to identify areas where expertise may need to be bolstered.

Organizations that adopt a BPM tool also enjoy greater process efficiency. This is through a combination of improving existing processes or designing new process flows, eliminating unnecessary or contradictory steps, and documenting results in a shareable format that is easy to understand so the organization is pulling in one direction.


Silo Buster

Understanding the typical use cases for business process modeling is the first step. As with any tech investment, it’s important to understand how the technology will work in the context of your organization/business.

For example, it’s counter-productive to invest in a solution that reduces informational silos only to introduce a new technological silo through a lack of integration.

Ideally, organizations want a BPM tool that works in conjunction with the wider data management platform and data governance initiative – not one that works against them.

That means it must support data imports and integrations from/with external sources, enable in-tool collaboration to reduce departmental silos and, most crucially, tap into a central metadata repository to ensure consistency across the whole data management and governance initiative.

The lack of a central metadata repository is a far too common thorn in an organization’s side. Without it, they have to juggle multiple versions as changes to the underlying data aren’t automatically updated across the platform.

It also means organizations waste crucial time manually manufacturing and maintaining data quality, when an automation framework could achieve the same goal instantaneously, without human error and with greater consistency.

A central metadata repository ensures an organization can acknowledge and get behind a single source of truth. This has a wealth of favorable consequences, including greater cohesion across the organization, better data quality and trust, and faster decision-making with fewer false starts due to plans based on misleading information.

Three Key Questions to Ask When Choosing a BPM Tool

Organizations in the market for a BPM tool should also consider the following:

1. Configurability: Does the tool support the ability to model and analyze business processes with links to data, applications and other aspects of your organization? And how easy is this to achieve?

2. Role-based views: Can the tool develop integrated business models for a single source of truth but with different views for different stakeholders based on their needs – making regulatory compliance more manageable? Does it enable cross-functional and enterprise collaboration through discussion threads, surveys and other social features?

3. Business and IT infrastructure interoperability: How well does the tool integrate with other key components of data governance including enterprise architecture, data modeling, data cataloging and data literacy? Can it aid in providing data intelligence to connect all the pieces of the data management and governance lifecycles?

For more information and to find out how such a solution can integrate with your organization and current data management and data governance initiatives, click here.



Data Mapping Tools: What Are the Key Differentiators?

The need for data mapping tools in light of increasing volumes and varieties of data – as well as the velocity at which it must be processed – is growing.

It’s not difficult to see why either. Data mapping tools have always been a key asset for any organization looking to leverage data for insights.

Isolated units of data are essentially meaningless. By linking data and enabling its categorization in relation to other data units, data mapping provides the context vital for actionable information.

Now with the General Data Protection Regulation (GDPR) in effect, data mapping has become even more significant.

The scale of GDPR’s reach has set a new precedent and is the closest we’ve come to a global standard in terms of data regulations. The repercussions can be huge – just ask Google.

Data mapping tools are paramount in charting a path to compliance for said new, near-global standard and avoiding the hefty fines.

Because of GDPR, organizations that may not have fully leveraged data mapping for proactive data-driven initiatives (e.g., analysis) are now adopting data mapping tools with compliance in mind.

Arguably, GDPR’s implementation can be viewed as an opportunity – a catalyst for digital transformation.

Those organizations investing in data mapping tools with compliance as the main driver will definitely want to consider this opportunity and have it influence their decision as to which data mapping tool to adopt.

With that in mind, it’s important to understand the key differentiators in data mapping tools and the associated benefits.


Data Mapping Tools: Automated or Manual?

In terms of differentiators for data mapping tools, perhaps the most distinct is automated data mapping versus data mapping via manual processes.

Data mapping tools that allow for automation mean organizations can benefit from in-depth, quality-assured data mapping, without the significant allocations of resources typically associated with such projects.

Eighty percent of data scientists’ and other data professionals’ time is spent on manual data maintenance. That’s anything and everything from addressing errors and inconsistencies to trying to understand source data or track its lineage. This doesn’t even account for the time lost due to missed errors that contribute to inherently flawed endeavors.

Automated data mapping tools render such issues and concerns void. In turn, data professionals’ time can be put to much better, proactive use, rather than them being bogged down with reactive, house-keeping tasks.

As well as introducing greater efficiency to the data governance process, automated data mapping tools enable data to be auto-documented from XML, building mappings for the target repository or reporting structure.

Additionally, a tool that leverages and draws from a single metadata repository means that mappings are dynamically linked with underlying metadata to render automated lineage views, including full transformation logic in real time.

Therefore, changes (e.g., in the data catalog) will be reflected across data governance domains (business process, enterprise architecture and data modeling) as and when they’re made – no more juggling and maintaining multiple, out-of-date versions.

It also enables automatic impact analysis at the table and column level – even for business/transformation rules.

For organizations looking to free themselves from the burden of juggling multiple versions, siloed business processes and a disconnect between interdepartmental collaboration, this feature is a key benefit to consider.
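For a sense of what table- and column-level impact analysis looks like mechanically, the Python sketch below walks a handful of hypothetical source-to-target mappings downstream from one column. In a real platform the mappings and transformation rules come from the shared metadata repository rather than being hand-written.

```python
# Hypothetical source-to-target mappings drawn from a metadata repository:
# (source column, target column, transformation rule)
mappings = [
    ("crm.customer.email", "stage.customer.email", "trim + lowercase"),
    ("stage.customer.email", "edw.customer_dim.email_address", "direct"),
    ("edw.customer_dim.email_address", "mart.campaign.contact_email", "direct"),
]

def impact_of(column: str, mappings: list[tuple]) -> list[tuple]:
    """Walk the mappings downstream to find every column and rule a change would touch."""
    impacted, frontier = [], [column]
    while frontier:
        current = frontier.pop()
        for source, target, rule in mappings:
            if source == current:
                impacted.append((target, rule))
                frontier.append(target)
    return impacted

for target, rule in impact_of("crm.customer.email", mappings):
    print(f"{target}  (via: {rule})")
```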

Data Mapping Tools: Other Differentiators

In light of the aforementioned changes to data regulations, many organizations will need to consider the extent of a data mapping tool’s data lineage capabilities.

The ability to reverse-engineer and document the business logic from your reporting structures for true source-to-report lineage is key because it makes analysis (and the trust in said analysis) easier. And should a data breach occur, affected data/persons can be more quickly identified in accordance with GDPR.

Article 33 of GDPR requires organizations to notify the appropriate supervisory authority “without undue delay and, where feasible, not later than 72 hours” after discovering a breach.

As stated above, a data governance platform that draws from a single metadata source is even more advantageous here.

Mappings can be synchronized with metadata so that source or target metadata changes can be automatically pushed into the mappings – so your mappings stay up to date with little or no effort.
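A toy Python sketch of the synchronization idea: compare two successive metadata harvests and flag any mapping whose source column has disappeared or been renamed. The tables, columns and mappings are hypothetical, and a real tool would push the change into the mapping automatically rather than merely flagging it.

```python
# Yesterday's harvested source metadata versus today's (hypothetical).
previous = {"customer": ["id", "email", "phone"]}
current = {"customer": ["id", "email_address", "phone", "opt_in"]}

# Hypothetical mappings keyed by source column.
mappings = {"customer.email": "stage.customer.email",
            "customer.phone": "stage.customer.phone"}

def detect_drift(previous: dict, current: dict, mappings: dict) -> dict:
    """Flag mappings whose source columns were dropped or renamed since the last harvest."""
    flagged = {}
    for table, old_cols in previous.items():
        missing = set(old_cols) - set(current.get(table, []))
        for col in missing:
            key = f"{table}.{col}"
            if key in mappings:
                flagged[key] = mappings[key]
    return flagged

print(detect_drift(previous, current, mappings))
# {'customer.email': 'stage.customer.email'} -> this mapping needs attention
```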

The Data Mapping Tool For Data-Driven Businesses

Nobody likes manual documentation. It’s arduous, error-prone and a waste of resources. Quite frankly, it’s dated.

Any organization looking to invest in data mapping, data preparation and/or data cataloging needs to make automation a priority.

With automated data mapping, organizations can achieve “true data intelligence” – the ability to tell the story of how data enters the organization and changes throughout the entire lifecycle, up to the consumption/reporting layer. If you’re working harder than your tool, you have the wrong tool.

The manual tools of old do not have auto documentation capabilities, cannot produce outbound code for multiple ETL or script types, and are a liability in terms of accuracy and GDPR.

Automated data mapping is the only path to true GDPR compliance, and erwin Mapping Manager can get you there in a matter of weeks thanks to our robust reverse-engineering technology. 

Learn more about erwin’s automation framework for data governance here.



Data Governance Stock Check: Using Data Governance to Take Stock of Your Data Assets

For regulatory compliance (e.g., GDPR) and to ensure peak business performance, organizations often bring consultants on board to help take stock of their data assets. This sort of “stock check” is important but can be arduous without the right approach and technology. That’s where data governance comes in …

While most companies hold the lion’s share of operational data within relational databases, it also can live in many other places and various other formats. Therefore, organizations need the ability to manage any data from anywhere, what we call our “any-squared” (Any2) approach to data governance.

Any2 first requires an understanding of the ‘3Vs’ of data – volume, variety and velocity – especially in the context of the data lifecycle, as well as knowing how to leverage the key capabilities of data governance – data cataloging, data literacy, business process, enterprise architecture and data modeling – that enable data to be leveraged at different stages for optimum security, quality and value.

Following are two examples that illustrate the data governance stock check, including the Any2 approach in action, based on real consulting engagements.


Data Governance “Stock Check” Case 1: The Data Broker

This client trades in information. Therefore, the organization needed to catalog the data it acquires from suppliers, ensure its quality, classify it, and then sell it to customers. The company wanted to assemble the data in a data warehouse and then provide controlled access to it.

The first step in helping this client involved taking stock of its existing data. We set up a portal so data assets could be registered via a form with basic questions, and then a central team received the registrations, reviewed and prioritized them. Entitlement attributes also were set up to identify and profile high-priority assets.

A number of best practices and technology solutions were used to establish the data required for managing the registration and classification of data feeds:

1. The underlying metadata is harvested followed by an initial quality check. Then the metadata is classified against a semantic model held in a business glossary.

2. After this classification, a second data quality check is performed based on the best-practice rules associated with the semantic model.

3. Profiled assets are loaded into a historical data store within the warehouse, with data governance tools generating its structure and data movement operations for data loading.

4. We developed a change management program to make all staff aware of the information brokerage portal and the importance of using it. It uses a catalog of data assets, all classified against a semantic model with data quality metrics to easily understand where data assets are located within the data warehouse.

5. Adopting this portal, where data is registered and classified against an ontology, enables the client’s customers to shop for data by asset or by meaning (e.g., “what data do you have on X topic?”) and then drill down through the taxonomy or across an ontology. Next, they raise a request to purchase the desired data.
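Classification against a business glossary (step 1 above) can be pictured with a small Python sketch that suggests the closest glossary term for each harvested column name using fuzzy matching. The glossary and column names are hypothetical; a production classifier would also draw on data profiling and quality rules, not names alone.

```python
import difflib

# Hypothetical business glossary terms from the semantic model.
glossary = ["Customer Email", "Customer Name", "Order Total", "Patient Number"]

def classify(column_name: str, glossary: list[str], cutoff: float = 0.5) -> str | None:
    """Suggest the closest glossary term for a harvested column name."""
    candidate = column_name.replace("_", " ").title()
    matches = difflib.get_close_matches(candidate, glossary, n=1, cutoff=cutoff)
    return matches[0] if matches else None

for column in ["cust_email", "order_total_amt", "shoe_size"]:
    print(column, "->", classify(column, glossary))
# expected: cust_email -> Customer Email, order_total_amt -> Order Total, shoe_size -> None
```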

This consulting engagement and technology implementation increased data accessibility and capitalization. Information is registered within a central portal through an approved workflow, and then customers shop for data either from a list of physical assets or by information content, with purchase requests also going through an approval workflow. This, among other safeguards, ensures data quality.


Data Governance “Stock Check” Case 2: Tracking Rogue Data

This client is a geographically dispersed organization that stored many of its key processes in Microsoft Excel™ spreadsheets. It was planning to move to Office 365™ and was concerned about regulatory compliance, including GDPR mandates.

Knowing that electronic documents are heavily used in key business processes and distributed across the organization, this company needed to replace risky manual processes with centralized, automated systems.

A key part of the consulting engagement was to understand what data assets were in circulation and how they were used by the organization. Then process chains could be prioritized for automation, with specifications outlined for the systems that would replace them.

This organization also adopted a central portal that allowed employees to register data assets. The associated change management program raised awareness of data governance across the organization and the importance of data registration.

For each asset, information was captured and reviewed as part of a workflow. Prioritized assets were then chosen for profiling, enabling metadata to be reverse-engineered before being classified against the business glossary.

Additionally, assets that were part of a process chain were gathered and modeled with enterprise architecture (EA) and business process (BP) modeling tools for impact analysis.

High-level requirements for new systems then could be defined again in the EA/BP tools and prioritized on a project list. For the others, decisions could be made on whether they could safely be placed in the cloud and whether macros would be required.

In this case, the adoption of purpose-built data governance solutions helped build an understanding of the data assets in play, including information about their usage and content to aid in decision-making.

This client then had a good handle of the “what” and “where” in terms of sensitive data stored in their systems. They also better understood how this sensitive data was being used and by whom, helping reduce regulatory risks like those associated with GDPR.

In both scenarios, we cataloged data assets and mapped them to a business glossary, which acts as a classification scheme to help govern and locate data, making it both more accessible and more valuable. This governance framework reduces risk and protects an organization’s most valuable and sensitive data assets.

Focused on producing meaningful business outcomes, the erwin EDGE platform was pivotal in achieving these two clients’ data governance goals – including the infrastructure to undertake a data governance stock check. They used it to create an “enterprise data governance experience” not just for cataloging data and other foundational tasks, but also for a competitive “EDGE” in maximizing the value of their data while reducing data-related risks.

To learn more about the erwin EDGE data governance platform and how it aids in undertaking a data governance stock check, register for our free, 30-minute demonstration here.


Google’s Record GDPR Fine: Avoiding This Fate with Data Governance

The General Data Protection Regulation (GDPR) made its first real impact as Google’s record GDPR fine dominated news cycles.

Historically, fines had peaked at six figures with the U.K.’s Information Commissioner’s Office (ICO) fines of 500,000 pounds ($650,000 USD) against both Facebook and Equifax for their data protection breaches.

Experts predicted an uptick in GDPR enforcement in 2019, and Google’s recent record GDPR fine has brought that to fruition. France’s data privacy enforcement agency hit the tech giant with a $57 million penalty – more than 80 times the steepest ICO fine.

If it can happen to Google, no organization is safe. Many in fact still lag in the GDPR compliance department. Cisco’s 2019 Data Privacy Benchmark Study reveals that only 59 percent of organizations are meeting “all or most” of GDPR’s requirements.

So many more GDPR violations are likely to come to light. And even organizations that are currently compliant can’t afford to let their data governance standards slip.

Data Governance for GDPR

Google’s record GDPR fine makes the rationale for better data governance clear enough. However, the Cisco report offers even more insight into the value of achieving and maintaining compliance.

Organizations with GDPR-compliant security measures are not only less likely to suffer a breach (74 percent vs. 89 percent), but the breaches suffered are less costly too, with fewer records affected.

However, applying such GDPR-compliant provisions can’t be done on a whim; organizations must expand their data governance practices to include compliance.


A robust data governance initiative provides a comprehensive picture of an organization’s systems and the units of data contained or used within them. This understanding encompasses not only the original instance of a data unit but also its lineage and how it has been handled and processed across an organization’s ecosystem.

With this information, organizations can apply the relevant degrees of security where necessary, ensuring expansive and efficient protection from external (i.e., breaches) and internal (i.e., mismanaged permissions) data security threats.

Although data security cannot be wholly guaranteed, these measures can help identify and contain breaches to minimize the fallout.

Looking at Google’s Record GDPR Fine as An Opportunity

The tertiary benefits of GDPR compliance include greater agility and innovation and better data discovery and management. So arguably, the “tertiary” benefits of data governance should take center stage.

While once exploited by such innovators as Amazon and Netflix, data optimization and governance is now on everyone’s radar.

So organizations need another competitive differentiator.

An enterprise data governance experience (EDGE) provides just that.


This approach unifies data management and data governance, ensuring that the data landscape, policies, procedures and metrics stem from a central source of truth so data can be trusted at any point throughout its enterprise journey.

With an EDGE, the Any2 (any data from anywhere) data management philosophy applies – whether structured or unstructured, in the cloud or on premise. An organization’s data preparation (data mapping), enterprise modeling (business, enterprise and data) and data governance practices all draw from a single metadata repository.

In fact, metadata from a multitude of enterprise systems can be harvested and cataloged automatically. And with intelligent data discovery, sensitive data can be tagged and governed automatically as well – think GDPR as well as HIPAA, BCBS and CCPA.

Organizations without an EDGE can still achieve regulatory compliance, but data silos and the associated bottlenecks are unavoidable without integration and automation – not to mention longer timeframes and higher costs.

To get an “edge” on your competition, consider the erwin EDGE platform for greater control over and value from your data assets.

Data preparation/mapping is a great starting point and a key component of the software portfolio. Join us for a weekly demo.



Five Benefits of an Automation Framework for Data Governance

Organizations are responsible for governing more data than ever before, making a strong automation framework a necessity. But what exactly is an automation framework and why does it matter?

In most companies, an incredible amount of data flows from multiple sources in a variety of formats and is constantly being moved and federated across a changing system landscape.

Often these enterprises are heavily regulated, so they need a well-defined data integration model that helps avoid data discrepancies and removes barriers to enterprise business intelligence and other meaningful use.

IT teams need the ability to smoothly generate hundreds of mappings and ETL jobs. They need their data mappings to fall under governance and audit controls, with instant access to dynamic impact analysis and lineage.

With an automation framework, data professionals can meet these needs at a fraction of the cost of the traditional manual way.

In data governance terms, an automation framework refers to a metadata-driven universal code generator that works hand in hand with enterprise data mapping for:

  • Pre-ETL enterprise data mapping
  • Governing metadata
  • Governing and versioning source-to-target mappings throughout the lifecycle
  • Data lineage, impact analysis and business rules repositories
  • Automated code generation
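In spirit, such a metadata-driven code generator takes a governed mapping specification and emits the data movement code, so the mapping remains the single, versioned artifact. A minimal, hypothetical Python sketch of that principle (not any vendor's actual template mechanism):

```python
# A hypothetical source-to-target mapping specification (normally harvested
# and versioned in the metadata repository, not hand-written).
mapping = {
    "target": "edw.customer_dim",
    "source": "stage.customer",
    "columns": [
        {"target": "customer_key", "expression": "id"},
        {"target": "email_address", "expression": "LOWER(TRIM(email))"},
        {"target": "load_date", "expression": "CURRENT_TIMESTAMP"},
    ],
}

def generate_load_sql(mapping: dict) -> str:
    """Turn a mapping specification into executable load code."""
    target_cols = ", ".join(col["target"] for col in mapping["columns"])
    select_exprs = ",\n  ".join(col["expression"] for col in mapping["columns"])
    return (
        f"INSERT INTO {mapping['target']} ({target_cols})\n"
        f"SELECT\n  {select_exprs}\nFROM {mapping['source']};"
    )

print(generate_load_sql(mapping))
```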

Such automation enables organizations to bypass bottlenecks, including human error and the time required to complete these tasks manually.

In fact, being able to rely on automated and repeatable processes can result in up to 50 percent in design savings, up to 70 percent conversion savings and up to 70 percent acceleration in total project delivery.

So without further ado, here are the five key benefits of an automation framework for data governance.


Benefits of an Automation Framework for Data Governance

  1. Creates simplicity, reliability, consistency and customization for the integrated development environment.

Code automation templates (CATs) can be created – for virtually any process and any tech platform – using the SDK scripting language or the solution’s published libraries to completely automate common, manual data integration tasks.

CATs are designed and developed by senior automation experts to ensure they are compliant with industry or corporate standards as well as with an organization’s best practice and design standards.

The 100-percent metadata-driven approach is critical to creating reliable and consistent CATs.

It is possible to scan, pull in and configure metadata sources and targets using standard or custom adapters and connectors for databases, ERP, cloud environments, files, data modeling, BI reports and Big Data to document data catalogs, data mappings, ETL (XML code) and even SQL procedures of any type.

  2. Provides blueprints anyone in the organization can use.

Stage DDL from source metadata for the target DBMS; profile and test SQL for test automation of data integration projects; generate source-to-target mappings and ETL jobs for leading ETL tools, among other capabilities.

It also can populate and maintain Big Data sets by generating Pig, Sqoop, MapReduce, Spark, Python scripts and more.

  3. Incorporates data governance into the system development process.

An organization can achieve a more comprehensive and sustainable data governance initiative than it ever could with a homegrown solution.

An automation framework’s ability to automatically create, version, manage and document source-to-target mappings greatly matters to both data governance maturity and a shorter time to value.

This eliminates duplication that occurs when project teams are siloed, as well as prevents the loss of knowledge capital due to employee attrition.

Another value capability is coordination between data governance and SDLC, including automated metadata harvesting and cataloging from a wide array of sources for real-time metadata synchronization with core data governance capabilities and artifacts.

  4. Proves the value of data lineage and impact analysis for governance and risk assessment.

Automated reverse-engineering of ETL code into natural language enables a more intuitive lineage view for data governance.

With end-to-end lineage, it is possible to view data movement from source to stage, stage to EDW, and on to a federation of marts and reporting structures, providing a comprehensive and detailed view of data in motion.

The process includes leveraging existing mapping documentation and auto-documented mappings to quickly render graphical source-to-target lineage views including transformation logic that can be shared across the enterprise.
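As a toy example of reverse-engineering ETL code into natural language, the Python sketch below picks apart one simple, hypothetical INSERT…SELECT statement and emits a plain-language lineage statement per target column. Real ETL parsing must handle joins, lookups and procedural code, which is precisely why automating it matters.

```python
import re

etl_sql = """
INSERT INTO edw.customer_dim (customer_key, email_address)
SELECT id, LOWER(TRIM(email)) FROM stage.customer
"""

def describe_lineage(sql: str) -> list[str]:
    """Reverse-engineer one simple INSERT...SELECT into plain-language lineage statements."""
    insert = re.search(r"INSERT INTO\s+(\S+)\s*\(([^)]*)\)", sql, re.IGNORECASE)
    select = re.search(r"SELECT\s+(.*?)\s+FROM\s+(\S+)", sql, re.IGNORECASE | re.DOTALL)
    target_table = insert.group(1)
    target_cols = [c.strip() for c in insert.group(2).split(",")]
    source_table = select.group(2)
    expressions = [e.strip() for e in select.group(1).split(",")]
    return [
        f"{target_table}.{col} is loaded from {source_table} using: {expr}"
        for col, expr in zip(target_cols, expressions)
    ]

for line in describe_lineage(etl_sql):
    print(line)
# edw.customer_dim.customer_key is loaded from stage.customer using: id
# edw.customer_dim.email_address is loaded from stage.customer using: LOWER(TRIM(email))
```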

Similarly, impact analysis – which involves data mapping and lineage across tables, columns, systems, business rules, projects, mappings and ETL processes – provides insight into potential data risks and enables fast and thorough remediation when needed.

Impact analysis across the organization, as well as compliance with industry regulators, requires detailed data mapping and lineage.


  5. Supports a wide spectrum of business needs.

Intelligent automation delivers enhanced capability, increased efficiency and effective collaboration to every stakeholder in the data value chain: data stewards, architects, scientists, analysts; business intelligence developers, IT professionals and business consumers.

It makes it easier for them to handle jobs such as data warehousing by leveraging source-to-target mapping and ETL code generation and job standardization.

It’s easier to map, move and test data for regular maintenance of existing structures, movement from legacy systems to new systems during a merger or acquisition, or a modernization effort.

erwin’s Approach to Automation for Data Governance: The erwin Automation Framework

Mature and sustainable data governance requires collaboration from both IT and the business, backed by a technology platform that accelerates the time to data intelligence.

Part of the erwin EDGE portfolio for an “enterprise data governance experience,” the erwin Automation Framework transforms enterprise data into accurate and actionable insights by connecting all the pieces of the data management and data governance lifecycle.

As with all erwin solutions, it embraces any data from anywhere (Any2), with automation for relational, unstructured, on-premise and cloud-based data assets, and with data movement specifications harvested and coupled with CATs.

If your organization would like to realize all the benefits explained above – and gain an “edge” in how it approaches data governance, you can start by joining one of our weekly demos for erwin Mapping Manager.
