Categories
erwin Expert Blog

Data Governance Makes Data Security Less Scary

Happy Halloween!

Do you know where your data is? What data you have? Who has had access to it?

These can be frightening questions for an organization to answer.

Add to the mix the potential for a data breach followed by non-compliance, reputational damage and financial penalties and a real horror story could unfold.

In fact, we’ve seen some frightening ones play out already:

  1. Google’s record GDPR fine – France’s data privacy enforcement agency hit the tech giant with a $57 million penalty in early 2019 – more than 80 times the steepest fine the U.K.’s Information Commissioner’s Office had levied against both Facebook and Equifax for their data breaches.
  2. In July 2019, British Airways received the biggest GDPR fine to date ($229 million) because the data of more than 500,000 customers was compromised.
  3. Marriott International was fined $123 million, or 1.5 percent of its global annual revenue, because 330 million hotel guests were affected by a breach in 2018.

Now, as Cybersecurity Awareness Month comes to a close – and ghosts and goblins roam the streets – we thought it a good time to resurrect some guidance on how data governance can make data security less scary.

We don’t want you to be caught off guard when it comes to protecting sensitive data and staying compliant with data regulations.


Don’t Scream; You Can Protect Your Sensitive Data

It’s easier to protect sensitive data when you know what it is, where it’s stored and how it needs to be governed.

Data security incidents may be the result of not having a true data governance foundation that makes it possible to understand the context of data – what assets exist and where, the relationship between them and enterprise systems and processes, and how and by what authorized parties data is used.

That knowledge is critical to supporting efforts to keep relevant data secure and private.

Without data governance, organizations don’t have visibility of the full data landscape – linkages, processes, people and so on – to propel more context-sensitive security architectures that can better assure expectations around user and corporate data privacy. In sum, they lack the ability to connect the dots across governance, security and privacy – and to act accordingly.

A data governance foundation addresses these fundamental questions:

  1. What private data do we store and how is it used?
  2. Who has access and permissions to the data?
  3. What data do we have and where is it?

Where Are the Skeletons?

Data is a critical asset used to operate, manage and grow a business. While it is sometimes at rest in databases, data lakes and data warehouses, a large percentage is federated and integrated across the enterprise, introducing governance, manageability and risk issues that must be managed.

Knowing where sensitive data is located and properly governing it with policy rules, impact analysis and lineage views is critical for risk management, data audits and regulatory compliance.

However, when key data isn’t discovered, harvested, cataloged, defined and standardized as part of integration processes, audits may be flawed, putting your organization at risk.

Sensitive data – at rest or in motion – that exists in various forms across multiple systems must be automatically tagged, its lineage automatically documented, and its flows depicted so that it is easily found and its usage across workflows easily traced.

Thankfully, tools are available to help automate the scanning, detection and tagging of sensitive data by:

  • Monitoring and controlling sensitive data: Better visibility and control across the enterprise to identify data security threats and reduce associated risks
  • Enriching business data elements for sensitive data discovery: Comprehensively defining business data elements for PII, PHI and PCI across database systems, cloud and Big Data stores to easily identify sensitive data based on a set of algorithms and data patterns
  • Providing metadata and value-based analysis: Discovery and classification of sensitive data based on metadata and data value patterns and algorithms. Organizations can define business data elements and rules to identify and locate sensitive data including PII, PHI, PCI and other sensitive information.
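
As a rough illustration of metadata- and value-based discovery, the following Python sketch tags a column as sensitive when either its name or most of its sampled values match a known pattern. The hint words, the value patterns, the 0.8 threshold and the function name `classify_column` are all assumptions made for this example; they do not describe how any particular product works.

```python
import re

# Hypothetical metadata hints: column-name tokens that suggest a class.
NAME_HINTS = {
    "PII": {"ssn", "email", "phone", "birthdate"},
    "PCI": {"card", "cvv"},
    "PHI": {"diagnosis", "patient"},
}

# Hypothetical value patterns; deliberately simplistic.
VALUE_PATTERNS = {
    "PII": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),  # rough email shape
    "PCI": re.compile(r"^\d{13,19}$"),                 # digits, card-number length
}


def classify_column(name, sample_values, threshold=0.8):
    """Return the set of sensitive-data tags suggested for one column."""
    tags = set()

    # Metadata-based check: split the column name into tokens and compare.
    tokens = set(re.split(r"[^a-z0-9]+", name.lower()))
    for tag, hints in NAME_HINTS.items():
        if tokens & hints:
            tags.add(tag)

    # Value-based check: tag the column if enough sampled values match.
    values = [str(v).strip() for v in sample_values if v is not None]
    if values:
        for tag, pattern in VALUE_PATTERNS.items():
            hits = sum(1 for v in values if pattern.match(v))
            if hits / len(values) >= threshold:
                tags.add(tag)
    return tags
```

Matching on whole name tokens rather than substrings avoids false positives such as finding "ssn" inside "classname". A production scanner would need far richer patterns and validation (for example, checksum tests on card numbers).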

No Hocus Pocus

Truly understanding an organization’s data, including its value and quality, requires a harmonized approach embedded in business processes and enterprise architecture.

Such an integrated enterprise data governance experience helps organizations understand what data they have, where it is, where it came from, its value, its quality and how it’s used and accessed by people and applications.

An ounce of prevention is worth a pound of cure – from the painstaking process of identifying what happened and why to notifying customers that their data, and thus their trust in your organization, has been compromised.

A well-formed security architecture that is driven by and aligned with data intelligence is your best defense. However, if there is nefarious intent, a hacker will find a way. So being prepared means you can minimize your risk exposure and the damage to your reputation.

Multiple components must be considered to effectively support a data governance, security and privacy trinity. They are:

  1. Data models
  2. Enterprise architecture
  3. Business process models

Creating policies for data handling and accountability and driving culture change so people understand how to properly work with data are two important components of a data governance initiative, as is the technology for proactively managing data assets.

Without the ability to harvest metadata schemas and business terms; analyze data attributes and relationships; impose structure on definitions; and view all data in one place according to each user’s role within the enterprise, businesses will be hard pressed to stay in step with governance standards and best practices around security and privacy.

As a consequence, the private information held within organizations will continue to be at risk.

Organizations suffering data breaches will be deprived of the benefits they had hoped to realize from the money spent on security technologies and the time invested in developing data privacy classifications.

They also may face heavy fines and other financial, not to mention PR, penalties.


Benefits of Data Vault Automation

The benefits of Data Vault automation range from the more abstract – like improving data integrity – to the tangible – such as clearly identifiable savings in cost and time.

So Seriously … You Should Automate Your Data Vault

 By Danny Sandwell

Data Vault is a methodology for architecting and managing data warehouses in complex data environments where new data types and structures are constantly introduced.

Without Data Vault, data warehouses are difficult and time-consuming to change, causing latency issues and slowing time to value. In addition, the queries required to maintain historical integrity are complex to design and run slowly, causing performance issues and potentially incorrect results because the ability to understand relationships between historical snapshots of data is lacking.

In his blog, Dan Linstedt, the creator of Data Vault methodology, explains that Data Vaults “are extremely scalable, flexible architectures” enabling the business to grow and change without “the agony and pain of high costs, long implementation and test cycles, and long lists of impacts across the enterprise warehouse.”

With a Data Vault, new functional areas typically are added quickly and easily, with changes to existing architecture taking less than half the traditional time with much less impact on the downstream systems, he notes.

Astonishingly, nearly 20 years since the methodology’s creation, most Data Vault design, development and deployment phases are still handled manually. But why?

Traditional manual efforts to define the Data Vault population and create ETL code from scratch can take weeks or even months. The entire process is time-consuming, slows down the data pipeline and is often riddled with human errors.

On the flip side, automating the development and deployment of design changes and the resulting data movement processing code ensures companies can accelerate development and deployment in a timely and cost-effective manner.


Benefits of Data Vault Automation – A Case Study …

Global Pharma Company Saves Considerable Time and Money with Data Vault Automation

Let’s take a look at a large global pharmaceutical company that switched to Data Vault automation with staggering results.

Like many pharmaceutical companies, it manages a massive data warehouse combining clinical trial, supply chain and other mission-critical data. They had chosen a Data Vault schema for its flexibility in handling change but found creating the hubs and satellite structure incredibly laborious.

They needed to accelerate development, as well as aggregate data from different systems for internal customers to access and share. Additionally, the company needed lineage and traceability for regulatory compliance efforts.

With this ability, they can identify data sources, transformations and usage to safeguard protected health information (PHI) for clinical trials.

After an initial proof of concept, they deployed erwin Data Vault Automation and generated more than 200 tables, jobs and processes with 10 to 12 scripts. The highly schematic structure of the models enabled large portions of the modeling process to be automated, dramatically accelerating Data Vault projects and optimizing data warehouse management.
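
To see why a highly schematic model lends itself to automation, consider this minimal Python sketch that generates hub and satellite table definitions from a short specification. The naming conventions (`hub_`, `sat_`, `_hk`, `load_dts`, `record_source`), the column types, the sample entities and the function names are assumptions chosen for illustration; they do not describe the scripts erwin Data Vault Automation produces.

```python
def hub_ddl(entity, business_key):
    """Build a CREATE TABLE statement for a hub keyed on one business key."""
    return (
        f"CREATE TABLE hub_{entity} (\n"
        f"  {entity}_hk CHAR(32) PRIMARY KEY,\n"
        f"  {business_key} VARCHAR(100) NOT NULL,\n"
        "  load_dts TIMESTAMP NOT NULL,\n"
        "  record_source VARCHAR(50) NOT NULL\n"
        ");"
    )


def satellite_ddl(entity, attributes):
    """Build a CREATE TABLE statement for a satellite of descriptive attributes."""
    cols = "".join(f"  {name} {sql_type},\n" for name, sql_type in attributes.items())
    return (
        f"CREATE TABLE sat_{entity} (\n"
        f"  {entity}_hk CHAR(32) NOT NULL REFERENCES hub_{entity},\n"
        "  load_dts TIMESTAMP NOT NULL,\n"
        f"{cols}"
        "  record_source VARCHAR(50) NOT NULL,\n"
        f"  PRIMARY KEY ({entity}_hk, load_dts)\n"
        ");"
    )


def generate(model):
    """Return one hub and one satellite statement per entity in the model."""
    statements = []
    for entity, spec in model.items():
        statements.append(hub_ddl(entity, spec["business_key"]))
        statements.append(satellite_ddl(entity, spec["attributes"]))
    return statements


# Hypothetical two-entity model; real models would be harvested, not typed in.
model = {
    "trial": {
        "business_key": "trial_id",
        "attributes": {"phase": "VARCHAR(10)", "start_date": "DATE"},
    },
    "site": {
        "business_key": "site_code",
        "attributes": {"country": "VARCHAR(50)"},
    },
}
ddl = generate(model)
```

Because every hub and satellite follows the same template, adding an entity to `model` yields its tables with no hand-written SQL, which is the property that makes this kind of generation repeatable.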

erwin Data Vault Automation helped this pharma customer automate the complete lifecycle – accelerating development while increasing consistency, simplicity and flexibility – to save considerable time and money.

For this customer, the benefits of Data Vault automation were as follows:

  • Saving an estimated 70% of the costs of manual development
  • Generating 95% of the production code with “zero touch,” improving the time to business value and significantly reducing costly rework associated with error-prone manual processes
  • Increasing data integrity, including for new requirements and use cases regardless of changes to the warehouse structure because legacy source data doesn’t degrade
  • Creating a sustainable approach to Data Vault deployment, ensuring the agile, adaptable and timely delivery of actionable insights to the business in a well-governed facility for regulatory compliance, including full transparency and ease of auditability

Homegrown Tools Never Provide True Data Vault Automation

Many organizations use some form of homegrown tool or standalone applications. However, they don’t integrate with other tools and components of the architecture, they’re expensive, and quite frankly, they make it difficult to derive any meaningful results.

erwin Data Vault Automation centralizes the specification and deployment of Data Vault architectures for better control and visibility of the software development lifecycle. erwin Data Catalog makes it easy to discover, organize, curate and govern data being sourced for and managed in the warehouse.

With this solution, users select data sets to be included in the warehouse and fully automate the loading of Data Vault structures and ETL operations.

erwin Data Vault Smart Connectors eliminate the need for a business analyst and ETL developers to repeat mundane tasks, so they can focus on choosing and using the desired data instead. This saves considerable development time and effort plus delivers a high level of standardization and reuse.

After the Data Vault processes have been automated, the warehouse is well documented with traceability from the marts back to the operational data to speed the investigation of issues and analyze the impact of changes.

Bottom line: if your Data Vault integration is not automated, you’re already behind.

If you’d like to get started with erwin Data Vault Automation or request a quote, you can email consulting@erwin.com.


Constructing a Digital Transformation Strategy: Putting the Data in Digital Transformation

Having a clearly defined digital transformation strategy is an essential best practice for successful digital transformation. But what makes a digital transformation strategy viable?

Part Two of the Digital Transformation Journey …

In our last blog on driving digital transformation, we explored how enterprise architecture (EA) and business process (BP) modeling are pivotal factors in a viable digital transformation strategy.

EA and BP modeling squeeze risk out of the digital transformation process by helping organizations really understand their businesses as they are today. They give organizations the ability to identify what challenges and opportunities exist, and provide a low-cost, low-risk environment to model new options and collaborate with key stakeholders to figure out what needs to change, what shouldn’t change, and what the most important changes are.

Once you’ve determined what part(s) of your business you’ll be innovating, the next step in a digital transformation strategy is using data to get there.


Constructing a Digital Transformation Strategy: Data Enablement

Many organizations prioritize data collection as part of their digital transformation strategy. However, few organizations truly understand their data or know how to consistently maximize its value.

If your business is like most, you collect and analyze some data from a subset of sources to make product improvements, enhance customer service, reduce expenses and inform other, mostly tactical decisions.

The real question is: are you reaping all the value you can from all your data? Probably not.

Most organizations don’t use all the data they’re flooded with to reach deeper conclusions or make other strategic decisions. They don’t know exactly what data they have or even where some of it is, and they struggle to integrate known data in various formats and from numerous systems—especially if they don’t have a way to automate those processes.

How does your business become more adept at wringing all the value it can from its data?

The reality is there’s not enough time, people and money for true data management using manual processes. Therefore, an automation framework for data management has to be part of the foundations of a digital transformation strategy.

Your organization won’t be able to take complete advantage of analytics tools to become data-driven unless you establish a foundation for agile and complete data management.

You need automated data mapping and cataloging through the integration lifecycle process, inclusive of data at rest and data in motion.

An automated, metadata-driven framework for cataloging data assets and their flows across the business provides an efficient, agile and dynamic way to generate data lineage from operational source systems (databases, data models, file-based systems, unstructured files and more) across the information management architecture; construct business glossaries; assess what data aligns with specific business rules and policies; and inform how that data is transformed, integrated and federated throughout business processes—complete with full documentation.

Without this framework and the ability to automate many of its processes, business transformation will be stymied. Companies, especially large ones with thousands of systems, files and processes, will be particularly challenged by taking a manual approach. Outsourcing these data management efforts to professional services firms only delays schedules and increases costs.

With automation, data quality is systemically assured. The data pipeline is seamlessly governed and operationalized to the benefit of all stakeholders.

Constructing a Digital Transformation Strategy: Smarter Data

Ultimately, data is the foundation of the new digital business model. Companies that have the ability to harness, secure and leverage information effectively may be better equipped than others to promote digital transformation and gain a competitive advantage.

While data collection and storage continue to happen at a dramatic clip, organizations typically analyze and use less than 0.5 percent of the information they take in – that’s a huge loss of potential. Companies have to know what data they have and understand what it means in common, standardized terms so they can act on it to the benefit of the organization.

Unfortunately, organizations spend a lot more time searching for data than actually putting it to work. In fact, data professionals spend 80 percent of their time looking for and preparing data and only 20 percent of their time on analysis, according to IDC.

The solution is data intelligence. It improves IT and business data literacy and knowledge, supporting enterprise data governance and business enablement.

It helps solve the lack of visibility and control over “data at rest” in databases, data lakes and data warehouses and “data in motion” as it is integrated with and used by key applications.

Organizations need a real-time, accurate picture of the metadata landscape to:

  • Discover data – Identify and interrogate metadata from various data management silos.
  • Harvest data – Automate metadata collection from various data management silos and consolidate it into a single source.
  • Structure and deploy data sources – Connect physical metadata to specific data models, business terms, definitions and reusable design standards.
  • Analyze metadata – Understand how data relates to the business and what attributes it has.
  • Map data flows – Identify where to integrate data and track how it moves and transforms.
  • Govern data – Develop a governance model to manage standards, policies and best practices and associate them with physical assets.
  • Socialize data – Empower stakeholders to see data in one place and in the context of their roles.
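
To make the "map data flows" step concrete, here is a small Python sketch that records source-to-target mappings and walks them to answer lineage and impact questions. The asset names and the `upstream` and `downstream` helpers are hypothetical, and a real catalog would harvest these mappings automatically rather than hard-code them.

```python
from collections import defaultdict

# Hypothetical source -> target mappings, as a catalog might record them.
FLOWS = [
    ("crm.customers", "staging.customers"),
    ("erp.orders", "staging.orders"),
    ("staging.customers", "warehouse.sales_fact"),
    ("staging.orders", "warehouse.sales_fact"),
    ("warehouse.sales_fact", "mart.revenue_report"),
]


def build_index(flows):
    """Index the mappings in both directions for fast traversal."""
    forward, backward = defaultdict(set), defaultdict(set)
    for source, target in flows:
        forward[source].add(target)
        backward[target].add(source)
    return forward, backward


FORWARD, BACKWARD = build_index(FLOWS)


def _walk(start, edges):
    """Collect every asset reachable from start by following edges."""
    seen, stack = set(), [start]
    while stack:
        for nxt in edges[stack.pop()]:
            if nxt not in seen:  # guards against cycles in the mappings
                seen.add(nxt)
                stack.append(nxt)
    return seen


def upstream(asset):
    """Lineage: all assets that feed into the given asset."""
    return _walk(asset, BACKWARD)


def downstream(asset):
    """Impact analysis: all assets affected by a change to the given asset."""
    return _walk(asset, FORWARD)
```

Calling `upstream("mart.revenue_report")` traces the report back to its operational sources, while `downstream("crm.customers")` lists everything a change to that table could affect.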

The Right Tools

When it comes to digital transformation (like most things), organizations want to do it right. Do it faster. Do it cheaper. And do it without the risk of breaking everything. To accomplish all of this, you need the right tools.

The erwin Data Intelligence (DI) Suite is the heart of the erwin EDGE platform for creating an “enterprise data governance experience.” erwin DI combines data cataloging and data literacy capabilities to provide greater awareness of and access to available data assets, guidance on how to use them, and guardrails to ensure data policies and best practices are followed.

erwin Data Catalog automates enterprise metadata management, data mapping, reference data management, code generation, data lineage and impact analysis. It efficiently integrates and activates data in a single, unified catalog in accordance with business requirements. With it, you can:

  • Schedule ongoing scans of metadata from the widest array of data sources.
  • Keep metadata current with full versioning and change management.
  • Easily map data elements from source to target, including data in motion, and harmonize data integration across platforms.

erwin Data Literacy provides self-service, role-based, contextual data views. It also provides a business glossary for the collaborative definition of enterprise data in business terms, complete with built-in accountability and workflows. With it, you can:

  • Enable data consumers to define and discover data relevant to their roles.
  • Facilitate the understanding and use of data within a business context.
  • Ensure the organization is fluent in the language of data.

With data governance and intelligence, enterprises can discover, understand, govern and socialize mission-critical information. And because many of the associated processes can be automated, you reduce errors and reliance on technical resources while increasing the speed and quality of your data pipeline to accomplish whatever your strategic objectives are, including digital transformation.

Check out our latest whitepaper, Data Intelligence: Empowering the Citizen Analyst with Democratized Data.


Data Governance Stock Check: Using Data Governance to Take Stock of Your Data Assets

For regulatory compliance (e.g., GDPR) and to ensure peak business performance, organizations often bring consultants on board to help take stock of their data assets. This sort of data governance “stock check” is important but can be arduous without the right approach and technology. That’s where data governance comes in …

While most companies hold the lion’s share of operational data within relational databases, it also can live in many other places and various other formats. Therefore, organizations need the ability to manage any data from anywhere, what we call our “any-squared” (Any2) approach to data governance.

Any2 first requires an understanding of the ‘3Vs’ of data – volume, variety and velocity – especially in context of the data lifecycle, as well as knowing how to leverage the key capabilities of data governance – data cataloging, data literacy, business process, enterprise architecture and data modeling – that enable data to be leveraged at different stages for optimum security, quality and value.

Following are two examples that illustrate the data governance stock check, including the Any2 approach in action, based on real consulting engagements.


Data Governance “Stock Check” Case 1: The Data Broker

This client trades in information. Therefore, the organization needed to catalog the data it acquires from suppliers, ensure its quality, classify it, and then sell it to customers. The company wanted to assemble the data in a data warehouse and then provide controlled access to it.

The first step in helping this client involved taking stock of its existing data. We set up a portal so data assets could be registered via a form with basic questions, and then a central team received the registrations, reviewed and prioritized them. Entitlement attributes also were set up to identify and profile high-priority assets.

A number of best practices and technology solutions were used to establish the data required for managing the registration and classification of data feeds:

1. The underlying metadata is harvested followed by an initial quality check. Then the metadata is classified against a semantic model held in a business glossary.

2. After this classification, a second data quality check is performed based on the best-practice rules associated with the semantic model.

3. Profiled assets are loaded into a historical data store within the warehouse, with data governance tools generating its structure and data movement operations for data loading.

4. We developed a change management program to make all staff aware of the information brokerage portal and the importance of using it. It uses a catalog of data assets, all classified against a semantic model with data quality metrics to easily understand where data assets are located within the data warehouse.

5. Adopting this portal, where data is registered and classified against an ontology, enables the client’s customers to shop for data by asset or by meaning (e.g., “what data do you have on X topic?”) and then drill down through the taxonomy or across an ontology. Next, they raise a request to purchase the desired data.
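
The classification in step 1 and the rule-based check in step 2 can be sketched in a few lines of Python: harvested column names are matched to glossary terms, and each term carries a rule that drives the second quality check. The glossary entries, synonym lists, rules and function names here are invented for illustration and are not taken from the engagement.

```python
import re

# Hypothetical glossary: each term lists synonyms seen in source systems
# and a best-practice rule that its values should satisfy.
GLOSSARY = {
    "Postal Code": {
        "synonyms": {"zip", "zipcode", "postcode", "postal_code"},
        "rule": lambda v: bool(re.fullmatch(r"[A-Za-z0-9 -]{3,10}", v)),
    },
    "Country": {
        "synonyms": {"country", "country_code"},
        "rule": lambda v: len(v) == 2 and v.isalpha() and v.isupper(),
    },
}


def classify(column_name):
    """Return the glossary term for a harvested column name, or None."""
    key = column_name.strip().lower()
    for term, entry in GLOSSARY.items():
        if key in entry["synonyms"]:
            return term
    return None


def quality_score(term, values):
    """Share of values (0.0 to 1.0) that pass the term's rule."""
    if not values:
        return 0.0
    rule = GLOSSARY[term]["rule"]
    return sum(1 for v in values if rule(v)) / len(values)


def register(feed):
    """Classify each column in a feed and score it against its term's rule."""
    report = {}
    for column, values in feed.items():
        term = classify(column)
        score = quality_score(term, values) if term else None
        report[column] = (term, score)
    return report
```

Columns that match no glossary term come back as `(None, None)`, which flags them for manual review instead of silently passing them through.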

This consulting engagement and technology implementation increased data accessibility and capitalization. Information is registered within a central portal through an approved workflow, and then customers shop for data either from a list of physical assets or by information content, with purchase requests also going through an approval workflow. This, among other safeguards, ensures data quality.


Data Governance “Stock Check” Case 2: Tracking Rogue Data

This client has a geographically dispersed organization that stored many of its key processes in Microsoft Excel™ spreadsheets. They were planning to move to Office 365™ and were concerned about regulatory compliance, including GDPR mandates.

Knowing that electronic documents are heavily used in key business processes and distributed across the organization, this company needed to replace risky manual processes with centralized, automated systems.

A key part of the consulting engagement was to understand what data assets were in circulation and how they were used by the organization. Then process chains could be prioritized to automate and outline specifications for the system to replace them.

This organization also adopted a central portal that allowed employees to register data assets. The associated change management program raised awareness of data governance across the organization and the importance of data registration.

For each asset, information was captured and reviewed as part of a workflow. Prioritized assets were then chosen for profiling, enabling metadata to be reverse-engineered before being classified against the business glossary.

Additionally, assets that were part of a process chain were gathered and modeled with enterprise architecture (EA) and business process (BP) modeling tools for impact analysis.

High-level requirements for new systems then could be defined again in the EA/BP tools and prioritized on a project list. For the others, decisions could be made on whether they could safely be placed in the cloud and whether macros would be required.

In this case, the adoption of purpose-built data governance solutions helped build an understanding of the data assets in play, including information about their usage and content to aid in decision-making.

This client then had a good handle on the “what” and “where” in terms of sensitive data stored in their systems. They also better understood how this sensitive data was being used and by whom, helping reduce regulatory risks like those associated with GDPR.

In both scenarios, we cataloged data assets and mapped them to a business glossary. The glossary acts as a classification scheme to help govern and locate data, making it both more accessible and more valuable. This governance framework reduces risk and protects an organization’s most valuable or sensitive data assets.

Focused on producing meaningful business outcomes, the erwin EDGE platform was pivotal in achieving these two clients’ data governance goals – including the infrastructure to undertake a data governance stock check. They used it to create an “enterprise data governance experience” not just for cataloging data and other foundational tasks, but also for a competitive “EDGE” in maximizing the value of their data while reducing data-related risks.

To learn more about the erwin EDGE data governance platform and how it aids in undertaking a data governance stock check, register for our free, 30-minute demonstration here.


Digital Transformation in Municipal Government: The Hidden Force Powering Smart Cities

Smart cities are changing the world.

When you think of real-time, data-driven experiences and modern applications to accomplish tasks faster and easier, your local town or city government probably doesn’t come to mind. But municipal government is starting to embrace digital transformation and therefore data governance.

Municipal government has never been an area in which to look for tech innovation. Perpetually strapped for resources and budget, often relying on legacy applications and infrastructure, and perfectly happy being available during regular business hours (save for emergency responders), most municipal governments lacked the ability and motivation to (as they say in the private sector) digitally transform. Then an odd thing happened – the rest of the world started transforming.

If you shop at a retailer that doesn’t deliver a modern, personalized experience, thousands more retailers are just a click away. But people rarely pick up and move to a new city because the new city offers a better website or mobile app. The motivation for municipal governments to transform simply isn’t there in the same way it is for the private sector.

But there are some things many city residents care about deeply: public safety, quality of life, how their tax dollars are spent, and the ability to do business with their local government when they want, not when it’s convenient for the municipality. And much like the private sector, better decisions around all of these concerns can be made when accurate, timely data is available to help inform them.

Digital transformation in municipal government is taking place in two main areas today: constituent services and the “smart cities” movement.

Digital Transformation in Municipal Government: Being “Smart” About It

The ability to serve constituents easily and efficiently is of increasing importance and a key objective of digital transformation in municipal government. It’s a direct result of the data-driven customer experiences that are increasingly the norm in the private sector.

Residents want the ability to pay their taxes online, report a pothole from their phone, and generally interact more easily with their local officials and services. This can be accomplished with dashboards and constituent portals.

The smart cities movement refers to the broad effort of municipal governments to incorporate sensors, data collection and analysis to improve responses to everything from rush-hour traffic to air quality to crime prevention. When the McKinsey Global Institute examined smart technologies that could be deployed by cities, it found that the public sector would be the natural owner of 70 percent of the applications it reviewed.

“Cities are getting in on the data game,” says Danny Sandwell, product marketing director at erwin, Inc. And with information serving as the lifeblood of many of these projects, the effectiveness of the services offered, the return on the investments in hardware and software, and the happiness of the users all depend on timely, accurate and effective data.

These initiatives present a pretty radical departure from the way cities have traditionally been managed.

A constituent portal, for example, requires that users can be identified, authenticated and then have access to information that resides in various departments, such as the tax collector to view and pay taxes, the building department to view a building permit, and the parking authority to manage public parking permits.

For many municipalities, this is uncharted territory.


Data Governance: The Force Powering Smart Cities

The efficiencies offered by smart city technologies only exist if the data leads to a proper allocation of resources.

If you can identify an increase in crime in a certain neighborhood, for example, you can increase police patrols in response. But if the data is inaccurate, those patrols are wasted while other neighborhoods experience a rise in crime.

Now that they’re in the data game, it’s time for municipal governments to understand data governance – the driving force behind any successful data-driven operation. When you have the ability to understand all of the information related to a piece of data, you have more confidence in how it is analyzed, used and protected.

Data governance doesn’t take place at a single application or in the data warehouse. It needs to be woven into the enterprise architecture and processes of the municipality to ensure data is accurate, timely and accessible to those who need it (and inaccessible to everyone else).

When this all comes together – good data, solid analytics and improved services for residents – the results can be quite striking. New efficiencies will make municipal governments better stewards of tax dollars. An improved quality of life can lift tax revenue by making the city more appealing to citizens and developers.

There’s a lot for cities to gain if they get in the data game. And truly smart cities will make sure they play the game right with effective data governance.


Four Use Cases Proving the Benefits of Metadata-Driven Automation

Organizations cannot hope to make the most of a data-driven strategy without at least some degree of metadata-driven automation.

The volume and variety of data have snowballed, and so has its velocity. As such, traditional – and mostly manual – processes associated with data management and data governance have broken down. They are time-consuming and prone to human error, making compliance, innovation and transformation initiatives more complicated, which is less than ideal in the information age.

So it’s safe to say that organizations can’t reap the rewards of their data without automation.

Data scientists and other data professionals can spend up to 80 percent of their time bogged down trying to understand source data or addressing errors and inconsistencies.

That’s time needed and better used for data analysis.

By implementing metadata-driven automation, organizations across industries can unleash the talents of their highly skilled, well-paid data pros to focus on finding the goods: actionable insights that will fuel the business.


Metadata-Driven Automation in the BFSI Industry

The banking, financial services and insurance industry typically deals with higher data velocity and tighter regulations than most. This bureaucracy is rife with data management bottlenecks.

These bottlenecks are only made worse when organizations attempt to get by with systems and tools that are not purpose-built.

For example, manually managing data mappings for the enterprise data warehouse via MS Excel spreadsheets had become cumbersome and unsustainable for one BFSI company.

After embracing metadata-driven automation and custom code automation templates, it saved hundreds of thousands of dollars in code generation and development costs and achieved more work in less time with fewer resources. ROI on the automation solutions was realized within the first year.

Metadata-Driven Automation in the Pharmaceutical Industry

Despite its shortcomings, the Excel spreadsheet method for managing data mappings is common within many industries.

But with the amount of data organizations need to process in today’s business climate, this manual approach makes change management and determining end-to-end lineage a significant and time-consuming challenge.

One global pharmaceutical giant headquartered in the United States experienced such issues until it adopted metadata-driven automation. Then the pharma company was able to scan in all source and target system metadata and maintain it within a single repository. Users now view end-to-end data lineage from the source layer to the reporting layer within seconds.

On the whole, the implementation resulted in extraordinary time savings and a total cost reduction of 60 percent.

Metadata-Driven Automation in the Insurance Industry

Insurance is another industry that has to cope with high data velocity and stringent data regulations. Plus many organizations in this sector find that they’ve outgrown their systems.

For example, an insurance company using a CDMA product to centralize data mappings is probably missing certain critical features, such as versioning, impact analysis and lineage, which adds to costs, time to market and errors.

By adopting metadata-driven automation, organizations can standardize the pre-ETL data mapping process and better manage data integration through the change and release process. As a result, both internal data mapping and cross-functional teams have easy and fast web-based access to data mappings and valuable information like impact analysis and lineage.
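To make the idea of a standardized pre-ETL mapping concrete, here is a minimal Python sketch of how mapping metadata could drive code generation. It is an illustration only, not erwin's implementation: the `ColumnMapping` record, the `generate_insert_select` helper and the table and column names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMapping:
    """One row of a hypothetical pre-ETL mapping document."""
    source_table: str
    source_column: str
    target_table: str
    target_column: str
    transform: str = "{col}"  # SQL expression template; {col} is the source column


def generate_insert_select(mappings: list[ColumnMapping]) -> str:
    """Render one INSERT ... SELECT statement from the mapping rows.

    This sketch assumes all rows share one source table and one target table.
    """
    if not mappings:
        raise ValueError("no mappings supplied")
    sources = {m.source_table for m in mappings}
    targets = {m.target_table for m in mappings}
    if len(sources) != 1 or len(targets) != 1:
        raise ValueError("sketch supports one source and one target table")
    target_cols = ", ".join(m.target_column for m in mappings)
    select_exprs = ", ".join(
        m.transform.format(col=m.source_column) for m in mappings
    )
    return (
        f"INSERT INTO {targets.pop()} ({target_cols}) "
        f"SELECT {select_exprs} FROM {sources.pop()}"
    )


# Hypothetical mapping rows for an insurance staging-to-warehouse load.
MAPPINGS = [
    ColumnMapping("stg_policy", "policy_no", "dw_policy", "policy_id"),
    ColumnMapping("stg_policy", "holder_nm", "dw_policy", "holder_name", "TRIM({col})"),
]
GENERATED_SQL = generate_insert_select(MAPPINGS)
```

Because the SQL is derived from the mapping rows rather than typed by hand, a change to a mapping only has to be made in one place, which is the point of keeping mappings in a governed repository instead of scattered spreadsheets.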

One business that adopted such an approach achieved operational excellence, cut delivery time by 80 percent and realized ROI within 12 months.

Metadata-Driven Automation for a Non-Profit

Another common issue cited by organizations using manual data mapping is ballooning complexity and subsequent confusion.

Any organization expanding its data-driven focus without sufficiently maturing data management initiative(s) will experience this at some point.

One of the world’s largest humanitarian organizations, with millions of members and volunteers operating all over the world, was confronted with this exact issue.

It recognized the need for a solution to standardize the pre-ETL data mapping process to make data integration more efficient and cost-effective.

With metadata-driven automation, the organization would be able to scan and store metadata and data dictionaries in a central repository, as well as manage the business definitions and data dictionary for legacy systems contributing data to the enterprise data warehouse.

By adopting such an approach, the organization realized time savings across all IT development and cross-functional testing teams. Additionally, they were able to more easily manage mappings, code sets, reference data and data validation rules.

Again, ROI was achieved within a year.

A Universal Solution for Metadata-Driven Automation

Metadata-driven automation is a capability any organization can benefit from – regardless of industry, as demonstrated by the various real-world use cases chronicled here.

The erwin Automation Framework is a key component of the erwin EDGE platform for comprehensive data management and data governance.

With it, data professionals realize these industry-agnostic benefits:

  • Centralized and standardized code management with all automation templates stored in a governed repository
  • Better quality code and minimized rework
  • Business-driven data movement and transformation specifications
  • Superior data movement job designs based on best practices
  • Greater agility and faster time-to-value in data preparation, deployment and governance
  • Cross-platform support of scripting languages and data movement technologies

Learn more about metadata-driven automation as it relates to data preparation and enterprise data mapping.

Join one of our weekly erwin Mapping Manager demos.


Demystifying Data Lineage: Tracking Your Data’s DNA

Getting the most out of your data requires getting a handle on data lineage. That’s knowing what data you have, where it is, and where it came from – plus understanding its quality and value to the organization.

But you can’t understand your data in a business context, much less track its lineage and physical existence or maximize its security, quality and value, if it’s scattered across different silos in numerous applications.

Data lineage provides a way of tracking data from its origin to destination across its lifespan and all the processes it’s involved in. It also plays a vital role in data governance. Beyond the simple ability to know where the data came from and whether or not it can be trusted, there’s an element of statutory reporting and compliance that often requires a knowledge of how that same data (known or unknown, governed or not) has changed over time.
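As a rough illustration of tracking data from origin to destination, lineage can be modeled as a graph in which each asset records what it was derived from. The Python sketch below walks that graph upstream. The asset names and the `trace_origins` helper are hypothetical, and a real lineage tool would harvest these edges from metadata rather than hard-code them.

```python
from collections import deque

# Hypothetical lineage edges: each asset lists the assets it is derived from.
DERIVED_FROM = {
    "report.monthly_revenue": ["dw.fact_sales"],
    "dw.fact_sales": ["stg.orders", "stg.customers"],
    "stg.orders": ["crm.orders"],
    "stg.customers": ["crm.customers"],
}


def trace_origins(asset: str, derived_from: dict[str, list[str]]) -> list[str]:
    """Return every upstream asset of `asset`, nearest first (breadth-first)."""
    seen: set[str] = set()
    order: list[str] = []
    queue = deque(derived_from.get(asset, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # an asset reachable by two paths is listed once
        seen.add(current)
        order.append(current)
        queue.extend(derived_from.get(current, []))
    return order


ORIGINS = trace_origins("report.monthly_revenue", DERIVED_FROM)
```

Here `ORIGINS` lists the warehouse table first, then the staging tables, then the source-system tables, which is the "where did this number come from" question a compliance report typically has to answer.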

A platform that provides insights like data lineage, impact analysis, full-history capture, and other data management features serves as a central hub from which everything can be learned and discovered about the data – whether a data lake, a data vault or a traditional data warehouse.

In a traditional data management organization, Excel spreadsheets are used to manage the incoming data design, what’s known as the “pre-ETL” mapping documentation, but this does not provide any sort of visibility or auditability. In fact, each unit of work represented in these “mapping documents” becomes an independent variable in the overall system development lifecycle, and is therefore nearly impossible to learn from, much less standardize.

The key to accuracy and integrity in any exercise is to eliminate the opportunity for human error – which does not mean eliminating humans from the process but incorporating the right tools to reduce the likelihood of error as the human beings apply their thought processes to the work.


Data Lineage: A Crucial First Step for Data Governance

Knowing what data you have and where it lives and where it came from is complicated. The lack of visibility and control around “data at rest” combined with “data in motion,” as well as difficulties with legacy architectures, means organizations spend more time finding the data they need rather than using it to produce meaningful business outcomes.

Organizations need to create and sustain an enterprise-wide view of and easy access to underlying metadata, but that’s a tall order with numerous data types and data sources that were never designed to work together and data infrastructures that have been cobbled together over time with disparate technologies, poor documentation and little thought for downstream integration. So the applications and initiatives that depend on a solid data infrastructure may be compromised, resulting in faulty analyses.

These issues can be addressed with a strong data management strategy underpinned by technology that enables the data quality the business requires, which encompasses data cataloging (integration of data sets from various sources), mapping, versioning, business rules and glossaries maintenance and metadata management (associations and lineage).

An automated, metadata-driven framework for cataloging data assets and their flows across the business provides an efficient, agile and dynamic way to generate data lineage from operational source systems (databases, data models, file-based systems, unstructured files and more) across the information management architecture; construct business glossaries; assess what data aligns with specific business rules and policies; and inform how that data is transformed, integrated and federated throughout business processes – complete with full documentation.

Centralized design, immediate lineage and impact analysis, and change-activity logging mean you will always have answers readily available, or just a few clicks away. Subsets of data can be identified and generated via predefined templates, generic designs generated from standard mapping documents, and pushed via ETL process for faster processing via automation templates.
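Impact analysis is the downstream counterpart of lineage: given a change to one asset, which others are affected? A minimal Python sketch, assuming data flows are recorded as (source, target) pairs, might look like this. The flow list and the `impacted_assets` helper are hypothetical examples, not part of any product.

```python
from collections import defaultdict

# Hypothetical data flows recorded as (source, target) pairs.
FLOWS = [
    ("crm.customers", "stg.customers"),
    ("stg.customers", "dw.dim_customer"),
    ("dw.dim_customer", "report.churn"),
    ("dw.dim_customer", "report.monthly_revenue"),
    ("crm.orders", "stg.orders"),
]


def impacted_assets(changed: str, flows: list[tuple[str, str]]) -> set[str]:
    """Return every asset downstream of `changed` (depth-first traversal)."""
    downstream = defaultdict(list)
    for source, target in flows:
        downstream[source].append(target)
    impacted: set[str] = set()
    stack = [changed]
    while stack:
        node = stack.pop()
        for target in downstream.get(node, []):
            if target not in impacted:
                impacted.add(target)
                stack.append(target)
    return impacted


IMPACTED = impacted_assets("crm.customers", FLOWS)
```

In this example a change to `crm.customers` touches the staging table, the customer dimension and both reports, while the unrelated orders flow is left out.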

With automation, data quality is systemically assured and the data pipeline is seamlessly governed and operationalized to the benefit of all stakeholders. Without such automation, business transformation will be stymied. Companies, especially large ones with thousands of systems, files and processes, will be particularly challenged by a manual approach. And outsourcing these data management efforts to professional services firms only increases costs and schedule delays.

With erwin Mapping Manager, organizations can automate enterprise data mapping and code generation for faster time-to-value and greater accuracy when it comes to data movement projects, as well as synchronize “data in motion” with data management and governance efforts.

Map data elements to their sources within a single repository to determine data lineage, deploy data warehouses and other Big Data solutions, and harmonize data integration across platforms. The web-based solution reduces the need for specialized, technical resources with knowledge of ETL and database procedural code, while making it easy for business analysts, data architects, ETL developers, testers and project managers to collaborate for faster decision-making.


Top 10 Reasons to Automate Data Mapping and Data Preparation

Data preparation is notorious for being the most time-consuming area of data management. It’s also expensive.

“Surveys show the vast majority of time is spent on this repetitive task, with some estimates showing it takes up as much as 80% of a data professional’s time,” according to Information Week. And a Trifacta study notes that overreliance on IT resources for data preparation costs organizations billions.

The power of collecting your data can come in a variety of forms, but most often in IT shops around the world, it comes in a spreadsheet, or rather a collection of spreadsheets often numbering in the hundreds or thousands.

Most organizations, especially those competing in the digital economy, don’t have enough time or money for data management using manual processes. And outsourcing is also expensive, with inevitable delays because these vendors are dependent on manual processes too.


Taking the Time and Pain Out of Data Preparation: 10 Reasons to Automate Data Preparation/Data Mapping

  1. Governance and Infrastructure

Data governance and a strong IT infrastructure are critical in the valuation, creation, storage, use, archival and deletion of data. Beyond the simple ability to know where the data came from and whether or not it can be trusted, there is an element of statutory reporting and compliance that often requires a knowledge of how that same data (known or unknown, governed or not) has changed over time.

A design platform that allows for insights like data lineage, impact analysis, full history capture, and other data management features can provide a central hub from which everything can be learned and discovered about the data – whether a data lake, a data vault, or a traditional warehouse.

  2. Eliminating Human Error

In the traditional data management organization, Excel spreadsheets are used to manage the incoming data design, or what is known as the “pre-ETL” mapping documentation – this does not lend itself to any sort of visibility or auditability. In fact, each unit of work represented in these “mapping documents” becomes an independent variable in the overall system development lifecycle, and is therefore nearly impossible to learn from, much less standardize.

The key to creating accuracy and integrity in any exercise is to eliminate the opportunity for human error – which does not mean eliminating humans from the process but incorporating the right tools to reduce the likelihood of error as the human beings apply their thought processes to the work.  

  3. Completeness

The ability to scan and import from a broad range of sources and formats, as well as automated change tracking, means that you will always be able to import your data from wherever it lives and track all of the changes to that data over time.

  4. Adaptability

Centralized design, immediate lineage and impact analysis, and change activity logging mean that you will always have the answer readily available, or a few clicks away. Subsets of data can be identified and generated via predefined templates, generic designs generated from standard mapping documents, and pushed via ETL process for faster processing via automation templates.

  5. Accuracy

Out-of-the-box capabilities to map your data from source to report make reconciliation and validation a snap, with auditability and traceability built in. Build a full array of validation rules that can be cross-checked with the design mappings in a centralized repository.

  6. Timeliness

The ability to be agile and reactive is important – being good at being reactive doesn’t sound like a quality that deserves a pat on the back, but in the case of regulatory requirements, it is paramount.

  7. Comprehensiveness

With access to all of the underlying metadata, source-to-report design mappings, and source and target repositories, you have the power to create reports within your reporting layer that have a traceable origin and can be easily explained to IT, business and regulatory stakeholders alike.

  8. Clarity

The requirements inform the design, the design platform puts them into action, and the reporting structures are fed the right data to create the right information at the right time via nearly any reporting platform, whether mainstream commercial or homegrown.

  9. Frequency

Adaptation is the key to meeting any frequency interval. Centralized designs and automated ETL patterns that feed your database schemas and reporting structures will allow cyclical changes to be made and implemented in half the time of conventional means. Getting beyond the spreadsheet, enabling pattern-based ETL, and schema population are ways to ensure you will be ready whenever the need arises to show an audit trail of the change process and clearly articulate who did what and when through the system development lifecycle.

  10. Business-Friendly

A user interface designed to be business-friendly means there’s no need to be a data integration specialist to review the common practices outlined and “passively enforced” throughout the tool. Once a process is defined, rules implemented, and templates established, there is little opportunity for error or deviation from the overall process. A diverse set of role-based security options means that everyone can collaborate, learn and audit while maintaining the integrity of the underlying process components.
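The "Accuracy" item in the list above mentions validation rules that can be cross-checked with design mappings. As a hedged illustration of what such a cross-check could involve, the Python sketch below compares the set of mapped target columns with the set of columns that have a validation rule. The column names, the rule strings and the `cross_check` helper are all hypothetical.

```python
def cross_check(mapped_targets: set[str], rules: dict[str, str]) -> dict[str, list[str]]:
    """Compare mapped target columns with the columns that have a rule.

    Returns two sorted lists: columns that are mapped but unvalidated,
    and rules that point at a column no mapping populates.
    """
    ruled = set(rules)
    return {
        "mapped_without_rule": sorted(mapped_targets - ruled),
        "rule_without_mapping": sorted(ruled - mapped_targets),
    }


# Hypothetical repository contents.
MAPPED_TARGETS = {"dw_policy.policy_id", "dw_policy.holder_name", "dw_policy.premium"}
RULES = {
    "dw_policy.policy_id": "NOT NULL",
    "dw_policy.premium": ">= 0",
    "dw_policy.expiry_date": "NOT NULL",
}
REPORT = cross_check(MAPPED_TARGETS, RULES)
```

Either list being non-empty is a prompt for review: an unvalidated column may carry bad data into a report, and an orphaned rule may signal a mapping that was dropped without updating the rules.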

Faster, More Accurate Analysis with Fewer People

What if you could get more accurate data preparation 50% faster and double your analysis with fewer people?

erwin Mapping Manager (MM) is a patented solution that automates data mapping throughout the enterprise data integration lifecycle, providing data visibility, lineage and governance – freeing up that 80% of a data professional’s time to put that data to work.

With erwin MM, data integration engineers can design and reverse-engineer the movement of data implemented as ETL/ELT operations and stored procedures, building mappings between source and target data assets and designing the transformation logic between them. These designs then can be exported to most ETL and data asset technologies for implementation.

erwin MM is 100% metadata-driven and used to define and drive standards across enterprise integration projects, enable data and process audits, improve data quality, streamline downstream work flows, increase productivity (especially over geographically dispersed teams) and give project teams, IT leadership and management visibility into the ‘real’ status of integration and ETL migration projects.

If an automated data preparation/mapping solution sounds good to you, please check out erwin MM here.


Healthy Co-Dependency: Data Management and Data Governance

Data management and data governance are now more important than ever. The hypercompetitive nature of data-driven business means organizations need to get more out of their data – and fast.

A few data-driven exemplars have led the way, turning data into actionable insights that influence everything from corporate structure to new products and pricing. “Few” being the operative word.

It’s true, data-driven business is big business. Huge actually. But it’s dominated by a handful of organizations that realized early on what a powerful and disruptive force data can be.

The benefits of such data-driven strategies speak for themselves: Netflix has replaced Blockbuster, and Uber continues to shake up the taxi business. Organizations in every industry are following suit, fighting to become the next big, disruptive players.

But in many cases, these attempts have failed or are on the verge of doing so.

Now with the General Data Protection Regulation (GDPR) in effect, data that is unaccounted for is a potential data disaster waiting to happen.

So organizations need to understand that getting more out of their data isn’t necessarily about collecting more data. It’s about unlocking the value of the data they already have.


The Enterprise Data Dilemma

However, most organizations don’t know exactly what data they have or even where some of it is. And some of the data they can account for is going to waste because they don’t have the means to process it. This is especially true of unstructured data types, which organizations are collecting more frequently.

Considering that 73 percent of company data goes unused, it’s safe to assume your organization is dealing with some if not all of these issues.

Big picture, this means your enterprise is missing out on thousands, perhaps millions in revenue.

The smaller picture? You’re struggling to establish a single source of data truth, which contributes to a host of problems:

  • Inaccurate analysis and discrepancies in departmental reporting
  • Inability to manage the amount and variety of data your organization collects
  • Duplications and redundancies in processes
  • Issues determining data ownership, lineage and access
  • Difficulty achieving and sustaining compliance

To avoid such circumstances and get more value out of data, organizations need to harmonize their approach to data management and data governance, using a platform of established tools that work in tandem while also enabling collaboration across the enterprise.

Data management drives the design, deployment and operation of systems that deliver operational data assets for analytics purposes.

Data governance delivers these data assets within a business context, tracking their physical existence and lineage, and maximizing their security, quality and value.

Although these two disciplines approach data from different perspectives (IT-driven and business-oriented), they depend on each other. And this co-dependency helps an organization make the most of its data.

The P-M-G Hub

Together, data management and data governance form a critical hub for data preparation, modeling and governance. How?

It starts with a real-time, accurate picture of the data landscape, including “data at rest” in databases, data warehouses and data lakes and “data in motion” as it is integrated with and used by key applications. That landscape also must be controlled to facilitate collaboration and limit risk.

But knowing what data you have and where it lives is complicated, so you need to create and sustain an enterprise-wide view of and easy access to underlying metadata. That’s a tall order with numerous data types and data sources that were never designed to work together and data infrastructures that have been cobbled together over time with disparate technologies, poor documentation and little thought for downstream integration. So the applications and initiatives that depend on a solid data infrastructure may be compromised, and data analysis may rest on faulty insights.

However, these issues can be addressed with a strong data management strategy and technology to enable the data quality required by the business, which encompasses data cataloging (integration of data sets from various sources), mapping, versioning, business rules and glossaries maintenance and metadata management (associations and lineage).

Being able to pinpoint what data exists and where must be accompanied by an agreed-upon business understanding of what it all means in common terms that are adopted across the enterprise. Having that consistency is the only way to assure that insights generated by analyses are useful and actionable, regardless of business department or user exploring a question. Additionally, policies, processes and tools that define and control access to data by roles and across workflows are critical for security purposes.
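A policy that controls access to data by role can be expressed very simply. The following Python sketch is only meant to illustrate the idea: the roles, dataset names, `ROLE_GRANTS` table and `is_allowed` helper are hypothetical, and a production system would also need authentication, auditing and workflow-specific rules.

```python
# Hypothetical grants: role -> dataset -> set of permitted actions.
ROLE_GRANTS = {
    "data_steward": {"glossary": {"read", "write"}, "customer_pii": {"read"}},
    "business_analyst": {"glossary": {"read"}, "sales_summary": {"read"}},
}


def is_allowed(role: str, dataset: str, action: str, grants=ROLE_GRANTS) -> bool:
    """Deny by default: allow only actions explicitly granted to the role."""
    return action in grants.get(role, {}).get(dataset, set())


STEWARD_CAN_READ_PII = is_allowed("data_steward", "customer_pii", "read")
ANALYST_CAN_READ_PII = is_allowed("business_analyst", "customer_pii", "read")
```

The deny-by-default choice matters: an unknown role or an unlisted dataset yields no access, so data that has not yet been cataloged is not exposed by accident.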

These issues can be addressed with a comprehensive data governance strategy and technology to determine master data sets, discover the impact of potential glossary changes across the enterprise, audit and score adherence to rules, discover risks, and appropriately and cost-effectively apply security to data flows, as well as publish data to people/roles in ways that are meaningful to them.

Data Management and Data Governance: Play Together, Stay Together

When data management and data governance work in concert, empowered by the right technology, they inform, guide and optimize each other. The result for an organization that takes such a harmonized approach is an automated, real-time, high-quality data pipeline.

Then all stakeholders – data scientists, data stewards, ETL developers, enterprise architects, business analysts, compliance officers, CDOs and CEOs – can access the data they’re authorized to use and base strategic decisions on what is now a full inventory of reliable information.

The erwin EDGE creates an “enterprise data governance experience” through integrated data mapping, business process modeling, enterprise architecture modeling, data modeling and data governance. No other software platform on the market touches every aspect of the data management and data governance lifecycle to automate and accelerate the speed to actionable business insights.


Solving the Enterprise Data Dilemma

Due to the adoption of data-driven business, organizations across the board are facing their own enterprise data dilemmas.

This week erwin announced its acquisition of metadata management and data governance provider AnalytiX DS. The combined company touches every piece of the data management and governance lifecycle, enabling enterprises to fuel automated, high-quality data pipelines for faster speed to accurate, actionable insights.

Why Is This a Big Deal?

From digital transformation to AI, and everything in between, organizations are flooded with data. So, companies are investing heavily in initiatives to use all the data at their disposal, but they face some challenges. Chiefly, deriving meaningful insights from their data – and turning them into actions that improve the bottom line.

This enterprise data dilemma stems from three important but difficult questions to answer: What data do we have? Where is it? And how do we get value from it?

Large enterprises use thousands of unharvested, undocumented databases, applications, ETL processes and procedural code that make it difficult to gather business intelligence, conduct IT audits, and ensure regulatory compliance – not to mention accomplish other objectives around customer satisfaction, revenue growth and overall efficiency and decision-making.

The lack of visibility and control around “data at rest” combined with “data in motion”, as well as difficulties with legacy architectures, means these organizations spend more time finding the data they need rather than using it to produce meaningful business outcomes.

To remedy this, enterprises need smarter and faster data management and data governance capabilities, including the ability to efficiently catalog and document their systems, processes and the associated data without errors. In addition, business and IT must collaborate outside their traditional operational silos.

But this coveted state of data nirvana isn’t possible without the right approach and technology platform.

Enterprise Data: Making the Data Management-Data Governance Love Connection

Bringing together data management and data governance delivers greater efficiencies to technical users and better analytics to business users. It’s like two sides of the same coin:

  • Data management drives the design, deployment and operation of systems that deliver operational and analytical data assets.
  • Data governance delivers these data assets within a business context, tracks their physical existence and lineage, and maximizes their security, quality and value.

Although these disciplines approach data from different perspectives and are used to produce different outcomes, they have a lot in common. Both require a real-time, accurate picture of an organization’s data landscape, including data at rest in data warehouses and data lakes and data in motion as it is integrated with and used by key applications.

However, creating and maintaining this metadata landscape is challenging because this data in its various forms and from numerous sources was never designed to work in concert. Data infrastructures have been cobbled together over time with disparate technologies, poor documentation and little thought for downstream integration, so the applications and initiatives that depend on data infrastructure are often out-of-date and inaccurate, rendering faulty insights and analyses.

Organizations need to know what data they have and where it’s located, where it came from and how it got there, and what it means in common business terms. They also need to be able to transform it into useful information they can act on – all while controlling access to it.

To support the total enterprise data management and governance lifecycle, they need an automated, real-time, high-quality data pipeline. Then every stakeholder – data scientist, ETL developer, enterprise architect, business analyst, compliance officer, CDO and CEO – can fuel the desired outcomes with reliable information on which to base strategic decisions.

Enterprise Data: Creating Your “EDGE”

At the end of the day, all industries are in the data business and all employees are data people. The success of an organization is not measured by how much data it has, but by how well it’s used.

Data governance enables organizations to use their data to fuel compliance, innovation and transformation initiatives with greater agility, efficiency and cost-effectiveness.

Organizations need to understand their data from different perspectives, identify how it flows through and impacts the business, align this business view with a technical view of the data management infrastructure, and synchronize efforts across both disciplines for accuracy, agility and efficiency in building a data capability that impacts the business in a meaningful and sustainable fashion.

The persona-based erwin EDGE creates an “enterprise data governance experience” that facilitates collaboration between both IT and the business to discover, understand and unlock the value of data both at rest and in motion.

By bringing together enterprise architecture, business process, data mapping and data modeling, erwin’s approach to data governance enables organizations to get a handle on how they handle their data. With the broadest set of metadata connectors and automated code generation, data mapping and cataloging tools, the erwin EDGE Platform simplifies the total data management and data governance lifecycle.

This single, integrated solution makes it possible to gather business intelligence, conduct IT audits, ensure regulatory compliance and accomplish any other organizational objective by fueling an automated, high-quality and real-time data pipeline.

With the erwin EDGE, data management and data governance are unified and mutually supportive, with each discipline aware of and informed by the efforts of the other to:

  • Discover data: Identify and integrate metadata from various data management silos.
  • Harvest data: Automate the collection of metadata from various data management silos and consolidate it into a single source.
  • Structure data: Connect physical metadata to specific business terms and definitions and reusable design standards.
  • Analyze data: Understand how data relates to the business and what attributes it has.
  • Map data flows: Identify where to integrate data and track how it moves and transforms.
  • Govern data: Develop a governance model to manage standards and policies and set best practices.
  • Socialize data: Enable stakeholders to see data in one place and in the context of their roles.

An integrated solution with data preparation, modeling and governance helps businesses reach data governance maturity – which equals a role-based, collaborative data governance system that serves both IT and business users equally. Such maturity may not happen overnight, but it will ultimately deliver the accurate and actionable insights your organization needs to compete and win.

Your journey to data nirvana begins with a demo of the enhanced erwin Data Governance solution. Register now.
