
Choosing the Right Data Modeling Tool

The need for an effective data modeling tool is more significant than ever.

For decades, data modeling has provided the optimal way to design and deploy new relational databases with high-quality data sources and support application development. But it provides even greater value for modern enterprises where critical data exists in both structured and unstructured formats and lives both on premise and in the cloud.

In today’s hyper-competitive, data-driven business landscape, organizations are awash with data and the applications, databases and schema required to manage it.

For example, an organization may have 300 applications, with 50 different databases and a different schema for each. Additional challenges, such as increasing regulatory pressures – from the General Data Protection Regulation (GDPR) to the Health Insurance Portability and Accountability Act (HIPAA) – and growing stores of unstructured data also underscore the increasing importance of a data modeling tool.

Data modeling, quite simply, describes the process of discovering, analyzing, representing and communicating data requirements in a precise form called the data model. There’s an expression: measure twice, cut once. Data modeling is the upfront “measuring tool” that helps organizations reduce time and avoid guesswork in a low-cost environment.

From a business-outcome perspective, a data modeling tool is used to help organizations:

  • Effectively manage and govern massive volumes of data
  • Consolidate and build applications with hybrid architectures, including traditional, Big Data, cloud and on premise
  • Support expanding regulatory requirements, such as GDPR and the California Consumer Privacy Act (CCPA)
  • Simplify collaboration across key roles and improve information alignment
  • Improve business processes for operational efficiency and compliance
  • Empower employees with self-service access for enterprise data capability, fluency and accountability

Data Modeling Tool

Evaluating a Data Modeling Tool – Key Features

Organizations seeking to invest in a new data modeling tool should consider these four key features.

  1. Ability to visualize business and technical database structures through an integrated, graphical model.

Given the number of database platforms available, it’s important that an organization’s data modeling tool supports an array of platforms sufficient for your organization. The chosen data modeling tool should be able to read the technical formats of each of these platforms and translate them into highly graphical models rich in metadata. Schemas can then be deployed from models in an automated fashion and iteratively updated so that new development can take place via model-driven design.
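
As a language-level illustration only – not erwin’s own toolchain – here is a minimal sketch of the model-driven idea using SQLAlchemy (an assumed, hypothetical choice): the structure is declared once as a model, and the physical schema is generated from it for whichever platform the engine targets.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Customer(Base):
    """Logical entity declared once; the physical table is derived from it."""
    __tablename__ = "customer"
    id = Column(Integer, primary_key=True)
    name = Column(String(100), nullable=False)

class Order(Base):
    __tablename__ = "order_header"
    id = Column(Integer, primary_key=True)
    customer_id = Column(Integer, ForeignKey("customer.id"), nullable=False)

# Deploy the schema from the model; swapping the engine URL retargets the
# same model at a different database platform instead of hand-writing DDL.
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
```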

  2. Empowering end-user BI/analytics through data source discovery, analysis and integration.

A data modeling tool should give business users confidence in the information they use to make decisions. Such confidence comes from the ability to provide a common, contextual, easily accessible source of data element definitions to ensure they are able to draw upon the correct data; understand what it represents, including where it comes from; and know how it’s connected to other entities.

A data modeling tool can also be used to pull in data sources via self-service BI and analytics dashboards. The data modeling tool should also have the ability to integrate its models into whatever format is required for downstream consumption.

  3. Ability to store business definitions and data-centric business rules in the model along with technical database schemas, procedures and other information.

With business definitions and rules on board, technical implementations can be better aligned with the needs of the organization. Using an advanced design-layer architecture, model “layers” can be created with one or more models focused on the business requirements that can then be linked to one or more database implementations. Design-layer metadata can also be connected from conceptual through logical to physical data models.
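
To make the layering concrete, here is a minimal, hypothetical sketch – plain Python data classes, not erwin’s metamodel – of how conceptual, logical and physical design-layer metadata might be linked so a physical table can always be traced back to its business definition.

```python
from dataclasses import dataclass

@dataclass
class ConceptualTerm:      # business layer: what the data means
    name: str
    definition: str

@dataclass
class LogicalEntity:       # logical layer: how the data is structured
    name: str
    concept: ConceptualTerm

@dataclass
class PhysicalTable:       # physical layer: where the data is implemented
    schema: str
    table: str
    entity: LogicalEntity

term = ConceptualTerm("Customer", "A party that purchases goods or services")
entity = LogicalEntity("Customer", term)
table = PhysicalTable("sales", "CUSTOMER", entity)

# Traversing the links answers "which business definition backs this table?"
print(table.entity.concept.definition)
```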

  4. Ability to rationalize platform inconsistencies and deliver a single source of truth for all enterprise business data.

Many organizations struggle to break down data silos and unify data into a single source of truth, due in large part to varying data sources and difficulty managing unstructured data. Being able to model any data from anywhere addresses this, with on-demand modeling for non-relational databases that offer speed, horizontal scalability and other real-time application advantages.

With NoSQL support, model structures from non-relational databases, such as Couchbase and MongoDB, can be created automatically. Existing Couchbase and MongoDB data sources can be easily discovered, understood and documented through modeling and visualization. Existing entity-relationship diagrams and SQL databases can be migrated to Couchbase and MongoDB too, with relational schemas transformed into query-optimized NoSQL constructs.
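
The product performs that transformation automatically; purely to illustrate what “query-optimized NoSQL constructs” means, here is a minimal, hand-rolled sketch that denormalizes a normalized customer/orders pair into a single embedded document of the kind MongoDB favors (the field names are hypothetical).

```python
# Normalized relational rows (customer 1:N orders, joined by customer_id).
customers = [{"customer_id": 1, "name": "Acme Corp"}]
orders = [
    {"order_id": 101, "customer_id": 1, "total": 250.0},
    {"order_id": 102, "customer_id": 1, "total": 75.5},
]

def to_document(customer, all_orders):
    """Embed the child rows so a single read answers 'customer with orders'."""
    return {
        "_id": customer["customer_id"],
        "name": customer["name"],
        "orders": [
            {"order_id": o["order_id"], "total": o["total"]}
            for o in all_orders
            if o["customer_id"] == customer["customer_id"]
        ],
    }

docs = [to_document(c, orders) for c in customers]
print(docs[0])
# With pymongo, this list could then be written with collection.insert_many(docs).
```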

Other considerations include the ability to:

  • Compare models and databases.
  • Increase enterprise collaboration.
  • Perform impact analysis.
  • Enable business and IT infrastructure interoperability.

When it comes to data modeling, no one knows it better. For more than 30 years, erwin Data Modeler has been the market leader. It is built on the vision and experience of data modelers worldwide and is the de facto standard in data model integration.

You can learn more about driving business value and underpinning governance with erwin DM in this free white paper.

Data Modeling Drives Business Value


Data Governance Stock Check: Using Data Governance to Take Stock of Your Data Assets

For regulatory compliance (e.g., GDPR) and to ensure peak business performance, organizations often bring consultants on board to help take stock of their data assets. This sort of data governance “stock check” is important but can be arduous without the right approach and technology. That’s where data governance comes in …

While most companies hold the lion’s share of operational data within relational databases, it also can live in many other places and various other formats. Therefore, organizations need the ability to manage any data from anywhere, what we call our “any-squared” (Any2) approach to data governance.

Any2 first requires an understanding of the ‘3Vs’ of data – volume, variety and velocity – especially in the context of the data lifecycle, as well as knowing how to leverage the key capabilities of data governance – data cataloging, data literacy, business process, enterprise architecture and data modeling – that enable data to be leveraged at different stages for optimum security, quality and value.

Following are two examples that illustrate the data governance stock check, including the Any2 approach in action, based on real consulting engagements.

Data Governance Stock Check

Data Governance “Stock Check” Case 1: The Data Broker

This client trades in information. Therefore, the organization needed to catalog the data it acquires from suppliers, ensure its quality, classify it, and then sell it to customers. The company wanted to assemble the data in a data warehouse and then provide controlled access to it.

The first step in helping this client involved taking stock of its existing data. We set up a portal so data assets could be registered via a form with basic questions, and then a central team received the registrations, reviewed and prioritized them. Entitlement attributes also were set up to identify and profile high-priority assets.

A number of best practices and technology solutions were used to establish the data required for managing the registration and classification of data feeds:

1. The underlying metadata is harvested, followed by an initial quality check. Then the metadata is classified against a semantic model held in a business glossary. (A minimal sketch of these first two steps appears after this list.)

2. After this classification, a second data quality check is performed based on the best-practice rules associated with the semantic model.

3. Profiled assets are loaded into a historical data store within the warehouse, with data governance tools generating its structure and data movement operations for data loading.

4. We developed a change management program to make all staff aware of the information brokerage portal and the importance of using it. The portal uses a catalog of data assets, all classified against a semantic model with data quality metrics, making it easy to understand where data assets are located within the data warehouse.

5. Adopting this portal, where data is registered and classified against an ontology, enables the client’s customers to shop for data by asset or by meaning (e.g., “what data do you have on X topic?”) and then drill down through the taxonomy or across an ontology. Next, they raise a request to purchase the desired data.
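
As referenced above, here is a minimal sketch of steps 1 and 2: harvesting metadata, running a quality check and classifying columns against a glossary. The glossary terms, patterns and threshold are hypothetical placeholders, not the client’s actual semantic model.

```python
import re

# Hypothetical business glossary: semantic terms with naming patterns.
GLOSSARY = {
    "Email Address": re.compile(r"email", re.IGNORECASE),
    "Customer Name": re.compile(r"(cust(omer)?_)?name", re.IGNORECASE),
}

def harvest_metadata(feed):
    """Step 1: capture basic metadata for each supplied column."""
    return [{"column": col, "non_null_ratio": ratio} for col, ratio in feed]

def classify(asset):
    """Step 2: match harvested columns against the semantic model."""
    for term, pattern in GLOSSARY.items():
        if pattern.search(asset["column"]):
            return term
    return "Unclassified"

def quality_check(asset, threshold=0.9):
    """Flag columns whose completeness falls below a best-practice rule."""
    return asset["non_null_ratio"] >= threshold

feed = [("cust_name", 0.99), ("email", 0.72), ("misc_blob", 1.0)]
for asset in harvest_metadata(feed):
    status = "pass" if quality_check(asset) else "fail"
    print(asset["column"], classify(asset), status)
```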

This consulting engagement and technology implementation increased data accessibility and capitalization. Information is registered within a central portal through an approved workflow, and then customers shop for data either from a list of physical assets or by information content, with purchase requests also going through an approval workflow. This, among other safeguards, ensures data quality.

Benefits of Data Governance

Data Governance “Stock Check” Case 2: Tracking Rogue Data

This client, a geographically dispersed organization, stored many of its key processes in Microsoft Excel™ spreadsheets. It was planning a move to Office 365™ and was concerned about regulatory compliance, including GDPR mandates.

Knowing that electronic documents are heavily used in key business processes and distributed across the organization, this company needed to replace risky manual processes with centralized, automated systems.

A key part of the consulting engagement was to understand what data assets were in circulation and how they were used by the organization. Then process chains could be prioritized for automation, with specifications outlined for the systems that would replace them.

This organization also adopted a central portal that allowed employees to register data assets. The associated change management program raised awareness of data governance across the organization and the importance of data registration.

For each asset, information was captured and reviewed as part of a workflow. Prioritized assets were then chosen for profiling, enabling metadata to be reverse-engineered before being classified against the business glossary.
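
As an illustration of what reverse-engineering metadata from a prioritized asset can look like – using SQLAlchemy’s inspector as an assumed stand-in for the actual tooling used in the engagement – the sketch below pulls table and column metadata from a database so it can then be classified against the business glossary.

```python
from sqlalchemy import create_engine, inspect, text

# An in-memory database stands in for a registered, prioritized data asset.
engine = create_engine("sqlite:///:memory:")
with engine.begin() as conn:
    conn.execute(text(
        "CREATE TABLE employee (id INTEGER PRIMARY KEY, email TEXT, hire_date TEXT)"
    ))

inspector = inspect(engine)
for table in inspector.get_table_names():
    for column in inspector.get_columns(table):
        # Harvested name/type pairs are what get classified against the glossary.
        print(table, column["name"], str(column["type"]))
```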

Additionally, assets that were part of a process chain were gathered and modeled with enterprise architecture (EA) and business process (BP) modeling tools for impact analysis.

High-level requirements for new systems then could be defined again in the EA/BP tools and prioritized on a project list. For the others, decisions could be made on whether they could safely be placed in the cloud and whether macros would be required.

In this case, the adoption of purpose-built data governance solutions helped build an understanding of the data assets in play, including information about their usage and content to aid in decision-making.

This client then had a good handle of the “what” and “where” in terms of sensitive data stored in their systems. They also better understood how this sensitive data was being used and by whom, helping reduce regulatory risks like those associated with GDPR.

In both scenarios, we cataloged data assets and mapped them to a business glossary, which acts as a classification scheme to help govern and locate data, making it both more accessible and valuable. This governance framework reduces risk and protects an organization’s most valuable and sensitive data assets.

Focused on producing meaningful business outcomes, the erwin EDGE platform was pivotal in achieving these two clients’ data governance goals – including the infrastructure to undertake a data governance stock check. They used it to create an “enterprise data governance experience” not just for cataloging data and other foundational tasks, but also for a competitive “EDGE” in maximizing the value of their data while reducing data-related risks.

To learn more about the erwin EDGE data governance platform and how it aids in undertaking a data governance stock check, register for our free, 30-minute demonstration here.


Google’s Record GDPR Fine: Avoiding This Fate with Data Governance

The General Data Protection Regulation (GDPR) made its first real impact as Google’s record GDPR fine dominated news cycles.

Historically, fines had peaked at six figures with the U.K.’s Information Commissioner’s Office (ICO) fines of 500,000 pounds ($650,000 USD) against both Facebook and Equifax for their data protection breaches.

Experts predicted an uptick in GDPR enforcement in 2019, and Google’s recent record GDPR fine has brought that to fruition. France’s data privacy enforcement agency hit the tech giant with a $57 million penalty – more than 80 times the steepest ICO fine.

If it can happen to Google, no organization is safe. Many in fact still lag in the GDPR compliance department. Cisco’s 2019 Data Privacy Benchmark Study reveals that only 59 percent of organizations are meeting “all or most” of GDPR’s requirements.

So many more GDPR violations are likely to come to light. And even organizations that are currently compliant can’t afford to let their data governance standards slip.

Data Governance for GDPR

Google’s record GDPR fine makes the rationale for better data governance clear enough. However, the Cisco report offers even more insight into the value of achieving and maintaining compliance.

Organizations with GDPR-compliant security measures are not only less likely to suffer a breach (74 percent vs. 89 percent), but the breaches suffered are less costly too, with fewer records affected.

However, applying such GDPR-compliant provisions can’t be done on a whim; organizations must expand their data governance practices to include compliance.

GDPR White Paper

A robust data governance initiative provides a comprehensive picture of an organization’s systems and the units of data contained or used within them. This understanding encompasses not only the original instance of a data unit but also its lineage and how it has been handled and processed across an organization’s ecosystem.

With this information, organizations can apply the relevant degrees of security where necessary, ensuring expansive and efficient protection from external (i.e., breaches) and internal (i.e., mismanaged permissions) data security threats.

Although data security cannot be wholly guaranteed, these measures can help identify and contain breaches to minimize the fallout.

Looking at Google’s Record GDPR Fine as An Opportunity

GDPR compliance also brings tertiary benefits, including greater agility and innovation and better data discovery and management. So arguably, these “tertiary” benefits of data governance should take center stage.

While once exploited by such innovators as Amazon and Netflix, data optimization and governance is now on everyone’s radar.

So organizations need another competitive differentiator.

An enterprise data governance experience (EDGE) provides just that.

The Regulatory Rationale for Integrating Data Management & Data Governance

This approach unifies data management and data governance, ensuring that the data landscape, policies, procedures and metrics stem from a central source of truth so data can be trusted at any point throughout its enterprise journey.

With an EDGE, the Any2 (any data from anywhere) data management philosophy applies – whether structured or unstructured, in the cloud or on premise. An organization’s data preparation (data mapping), enterprise modeling (business, enterprise and data) and data governance practices all draw from a single metadata repository.

In fact, metadata from a multitude of enterprise systems can be harvested and cataloged automatically. And with intelligent data discovery, sensitive data can be tagged and governed automatically as well – think GDPR as well as HIPAA, BCBS and CCPA.
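
Intelligent data discovery is product functionality; to show only the underlying idea, here is a minimal, rule-based sketch (hypothetical patterns, not erwin’s classifiers) that tags columns whose values look like personal data so governance policies can be attached to them.

```python
import re

PII_PATTERNS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def tag_sensitive(columns):
    """Return a governance tag per column based on sampled values."""
    tags = {}
    for name, samples in columns.items():
        hits = {label for label, pat in PII_PATTERNS.items()
                for value in samples if pat.search(value)}
        tags[name] = sorted(hits) or ["non-sensitive"]
    return tags

sampled = {
    "contact": ["alice@example.com", "bob@example.org"],
    "notes": ["renewal due in Q3"],
}
print(tag_sensitive(sampled))   # {'contact': ['email'], 'notes': ['non-sensitive']}
```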

Organizations without an EDGE can still achieve regulatory compliance, but data silos and the associated bottlenecks are unavoidable without integration and automation – not to mention longer timeframes and higher costs.

To get an “edge” on your competition, consider the erwin EDGE platform for greater control over and value from your data assets.

Data preparation/mapping is a great starting point and a key component of the software portfolio. Join us for a weekly demo.

Automate Data Mapping


Data Governance Tackles the Top Three Reasons for Bad Data

In modern, data-driven business, it’s essential that organizations understand the reasons for bad data and how best to address them. Data has revolutionized how organizations operate, from customer relationships to strategic decision-making and everything in between. And with more emphasis on automation and artificial intelligence, the need for data/digital trust also has risen. Even minor errors in an organization’s data can cause massive headaches because the inaccuracies don’t involve just one corrupt data unit.

Inaccurate or “bad” data also affects relationships to other units of data, making the business context difficult or impossible to determine. For example, are data units tagged according to their sensitivity [i.e., personally identifiable information subject to the General Data Protection Regulation (GDPR)], and is data ownership and lineage discernable (i.e., who has access, where did it originate)?

Relying on inaccurate data will hamper decisions, decrease productivity, and yield suboptimal results. Given these risks, organizations must increase their data’s integrity. But how?

Integrated Data Governance

Modern, data-driven organizations are essentially data production lines. And like physical production lines, their associated systems and processes must run smoothly to produce the desired results. Sound data governance provides the framework to address data quality at its source, ensuring any data recorded and stored is done so correctly, securely and in line with organizational requirements. But it needs to integrate all the data disciplines.

By integrating data governance with enterprise architecture, businesses can define application capabilities and interdependencies within the context of their connection to enterprise strategy. That makes it possible to prioritize technology investments so they align with business goals and strategies to produce the desired outcomes. A business process and analysis component enables an organization to clearly define, map and analyze workflows and build models to drive process improvement, as well as identify business practices susceptible to the greatest security, compliance or other risks and where controls are most needed to mitigate exposures.

And data modeling remains the best way to design and deploy new relational databases with high-quality data sources and support application development. Being able to cost-effectively and efficiently discover, visualize and analyze “any data” from “anywhere” underpins large-scale data integration, master data management, Big Data and business intelligence/analytics with the ability to synthesize, standardize and store data sources from a single design, as well as reuse artifacts across projects.

Let’s look at some of the main reasons for bad data and how data governance helps confront these issues …

Reasons for Bad Data

Reasons for Bad Data: Data Entry

The concept of “garbage in, garbage out” explains the most common cause of inaccurate data: mistakes made at data entry. While the concept is easy to understand, totally eliminating errors isn’t feasible, so organizations need standards and systems to limit the extent of the damage.

With the right data governance approach, organizations can ensure the right people aren’t left out of the cataloging process, so the right context is applied. You can also ensure critical fields are not left blank, so data is recorded with as much context as possible.
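
A minimal sketch of that second point – rejecting entries whose critical context fields are blank – is shown below; the field names are hypothetical.

```python
REQUIRED_FIELDS = ("customer_id", "country", "consent_given")

def validate_entry(record):
    """Reject records whose critical context fields are missing or blank."""
    missing = [f for f in REQUIRED_FIELDS if not str(record.get(f, "")).strip()]
    if missing:
        raise ValueError(f"Cannot record entry, missing fields: {missing}")
    return record

# A complete record passes; one with a blank country would raise an error.
validate_entry({"customer_id": "C-1001", "country": "DE", "consent_given": "yes"})
```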

With the business process integration discussed above, you’ll also have a single metadata repository.

All of this ensures sensitive data doesn’t fall through the cracks.

Reasons for Bad Data: Data Migration

Data migration is another key reason for bad data. Modern organizations often juggle a plethora of data systems that process data from an abundance of disparate sources, creating a melting pot for potential issues as data moves through the pipeline, from tool to tool and system to system.

The solution is to introduce a predetermined standard of accuracy through a centralized metadata repository with data governance at the helm. In essence, metadata is data about data, ensuring that no matter where data sits in the pipeline, it still has the necessary context to be deciphered, analyzed and then used strategically.
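
As a rough illustration of that principle – not a description of any particular repository product – the sketch below wraps each record in a metadata envelope that records its source, classification and lineage, so a downstream system can still interpret it after several migration hops.

```python
from datetime import datetime, timezone

def wrap_with_metadata(payload, source, classification):
    """Attach descriptive context so the record stays interpretable downstream."""
    return {
        "payload": payload,
        "metadata": {
            "source_system": source,
            "classification": classification,
            "extracted_at": datetime.now(timezone.utc).isoformat(),
            "lineage": [source],
        },
    }

def migrate(record, target_system):
    """A migration hop appends to lineage instead of discarding context."""
    record["metadata"]["lineage"].append(target_system)
    return record

record = wrap_with_metadata({"email": "alice@example.com"}, "crm", "personal-data")
record = migrate(record, "marketing-warehouse")
print(record["metadata"]["lineage"])   # ['crm', 'marketing-warehouse']
```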

The potential fallout of using inaccurate data has become even more severe with the GDPR’s implementation. A simple case of tagging and subsequently storing personally identifiable information incorrectly could lead to a serious breach in compliance and significant fines.

Such fines must be considered along with the costs resulting from any PR fallout.

Reasons for Bad Data: Data Integration

The proliferation of data sources, types and stores increases the challenge of combining data into meaningful, valuable information. While companies are investing heavily in initiatives to increase the amount of data at their disposal, most information workers are spending more time finding the data they need than putting it to work, according to Database Trends and Applications (DBTA). erwin is co-sponsoring a DBTA webinar on this topic on July 17. To register, click here.

The need for faster and smarter data integration capabilities is growing. At the same time, to deliver business value, people need information they can trust to act on, so balancing governance is absolutely critical, especially with new regulations.

Organizations often invest heavily in individual software development tools for managing projects, requirements, designs, development, testing, deployment, releases, etc. Tools lacking interoperability often result in cumbersome manual processes and heavy time investments to synchronize data or processes between these disparate tools.

Data integration combines data from various sources into a unified view, making it more actionable and valuable to those accessing it.

Getting the Data Governance “EDGE”

The benefits of integrated data governance discussed above won’t be realized if it is isolated within IT with no input from other stakeholders, the day-to-day data users – from sales and customer service to the C-suite. Every data citizen has DG roles and responsibilities to ensure data units have context, meaning they are labeled, cataloged and secured correctly so they can be analyzed and used properly. In other words, the data can be trusted.

Once an organization understands that IT and the business are both responsible for data, it can develop comprehensive, holistic data governance capable of:

  • Reaching every stakeholder in the process
  • Providing a platform for understanding and governing trusted data assets
  • Delivering the greatest benefit from data wherever it lives, while minimizing risk
  • Helping users understand the impact of changes made to a specific data element across the enterprise.

To reduce the risks of and tackle the reasons for bad data and realize larger organizational objectives, organizations must make data governance everyone’s business.

To learn more about the collaborative approach to data governance and how it helps compliance in addition to adding value and reducing costs, get the free e-book here.

Data governance is everyone's business


The Role of An Effective Data Governance Initiative in Customer Purchase Decisions

A data governance initiative will maximize the security, quality and value of data, all of which build customer trust.

Without data, modern business would cease to function. Data helps guide decisions about products and services, makes it easier to identify customers, and serves as the foundation for everything businesses do today. The problem for many organizations is that data enters from any number of angles and gets stored in different places by different people and different applications.

Getting the most out of your data requires that you know what you have, where you have it, and that you understand its quality and value to the organization. This is where data governance comes into play. You can’t optimize your data if it’s scattered across different silos and lurking in various applications.

For about 150 years, manufacturers relied on their machinery and its ability to run reliably, properly and safely, to keep customers happy and revenue flowing. A data governance initiative has a similar role today, except its aim is to maximize the security, quality and value of data instead of machinery.

Customers are increasingly concerned about the safety and privacy of their data. According to a survey by Research+Data Insights, 85 percent of respondents worry about technology compromising their personal privacy. In a survey of 2,000 U.S. adults in 2016, researchers from Vanson Bourne found that 76 percent of respondents said they would move away from companies with a high record of data breaches.

For years, buying decisions were driven mainly by cost and quality, says Danny Sandwell, director of product marketing at erwin, Inc. But today’s businesses must consider their reputations in terms of both cost/quality and how well they protect their customers’ data when trying to win business.

Once the reputation is tarnished because of a breach or misuse of data, customers will question those relationships.

Unfortunately for consumers, examples of companies failing to properly govern their data aren’t difficult to find. Look no further than Under Armour, which announced this spring that 150 million accounts at its MyFitnessPal diet and exercise tracking app were breached, and Facebook, where the data of millions of users was harvested by third parties hoping to influence the 2016 presidential election in the United States.

Customers Hate Breaches, But They Love Data

While consumers are quick to report concerns about data privacy, customers also yearn for (and increasingly expect) efficient, personalized and relevant experiences when they interact with businesses. These experiences are, of course, built on data.

In this area, customers and businesses are on the same page. Businesses want to collect data that helps them build the omnichannel, 360-degree customer views that make their customers happy.

These experiences allow businesses to connect with their customers and demonstrate how well they understand them and know their preferences, likes and dislikes – essentially taking the personalized service of the neighborhood market to the internet.

The only way to manage that effectively at scale is to properly govern your data.

Delivering personalized service is also valuable to businesses because it helps turn customers into brand ambassadors, and it’s much easier to build on existing customer relationships than to find new customers.

Here’s the upshot: If your organization is doing data governance right, it’s helping create happy, loyal customers, while at the same time avoiding the bad press and financial penalties associated with poor data practices.

Putting A Data Governance Initiative Into Action

The good news is that 76 percent of respondents to a November 2017 survey we conducted with UBM said understanding and governing the data assets in the organization was either important or very important to the executives in their organization. Nearly half (49 percent) of respondents said that customer trust/satisfaction was driving their data governance initiatives.

Importance of a data governance initiative

What stops organizations from creating an effective data governance initiative? At some businesses, it’s a cultural issue. Both the business and IT sides of the organization play important roles in data, with the IT side storing and protecting it, and the business side consuming data and analyzing it.

For years, however, data governance was the volleyball passed back and forth over the net between IT and the business, with neither side truly owning it. Our study found signs this is changing. More than half (57 percent) of the respondents said both IT and the business/corporate teams were responsible for data in their organization.

Who's responsible for a data governance initiative

Once an organization understands that IT and the business are both responsible for data, it still needs to develop a comprehensive, holistic strategy for data governance that is capable of:

  • Reaching every stakeholder in the process
  • Providing a platform for understanding and governing trusted data assets
  • Delivering the greatest benefit from data wherever it lives, while minimizing risk
  • Helping users understand the impact of changes made to a specific data element across the enterprise.

To accomplish this, a modern data governance initiative needs to be interdisciplinary. It should include not only data governance, which is ongoing because organizations are constantly changing and transforming, but other disciplines as well.

Enterprise architecture is important because it aligns IT and the business, mapping a company’s applications and the associated technologies and data to the business functions they enable.

By integrating data governance with enterprise architecture, businesses can define application capabilities and interdependencies within the context of their connection to enterprise strategy. That makes it possible to prioritize technology investments so they align with business goals and strategies to produce the desired outcomes.

A business process and analysis component is also vital to modern data governance. It defines how the business operates and ensures employees understand and are accountable for carrying out the processes for which they are responsible.

Enterprises can clearly define, map and analyze workflows and build models to drive process improvement, as well as identify business practices susceptible to the greatest security, compliance or other risks and where controls are most needed to mitigate exposures.

Finally, data modeling remains the best way to design and deploy new relational databases with high-quality data sources and support application development.

Being able to cost-effectively and efficiently discover, visualize and analyze “any data” from “anywhere” underpins large-scale data integration, master data management, Big Data and business intelligence/analytics with the ability to synthesize, standardize and store data sources from a single design, as well as reuse artifacts across projects.

Michael Pastore is the Director, Content Services at QuinStreet B2B Tech. This content originally appeared as a sponsored post on http://www.eweek.com/.

Read the previous post on how compliance concerns and the EU’s GDPR are driving businesses to implement data governance.

Determine how effective your current data governance initiative is by taking our DG RediChek.

Take the DG RediChek


Defining Data Governance: What Is Data Governance?

Data governance (DG) is one of the fastest growing disciplines, yet when it comes to defining data governance many organizations struggle.

Dataversity says DG is “the practices and processes which help to ensure the formal management of data assets within an organization.” These practices and processes can vary, depending on an organization’s needs. Therefore, when defining data governance for your organization, it’s important to consider the factors driving its adoption.

The General Data Protection Regulation (GDPR) has contributed significantly to data governance’s escalating prominence. In fact, erwin’s 2018 State of Data Governance Report found that 60% of organizations consider regulatory compliance to be their biggest driver of data governance.

Defining data governance: DG Drivers

Other significant drivers include improving customer trust/satisfaction and encouraging better decision-making, but they trail behind regulatory compliance at 49% and 45% respectively. Reputation management (30%), analytics (27%) and Big Data (21%) also are factors.

But data governance’s adoption is of little benefit without understanding how DG should be applied within these contexts. This is arguably one of the issues that’s held data governance back in the past.

With no set definition, and the historical practice of isolating data governance within IT, organizations often have had different ideas of what data governance is, even between departments. With this inter-departmental disconnect, it’s not hard to imagine why data governance has historically left a lot to be desired.

However, with the mandate for DG within GDPR, organizations must work on defining data governance organization-wide to manage its successful implementation, or face GDPR’s penalties.

Defining Data Governance: Desired Outcomes

A great place to start when defining an organization-wide DG initiative is to consider the desired business outcomes. This approach ensures that all parties involved have a common goal.

Past examples of Data Governance 1.0 were mainly concerned with cataloging data to support search and discovery. The nature of this approach, coupled with the fact that DG initiatives were typically siloed within IT departments without input from the wider business, meant the practice often struggled to add value.

Without input from the wider business, the data cataloging process suffered from a lack of context. By neglecting to include the organization’s primary data citizens – those who manage and/or leverage data on a day-to-day basis for analysis and insight – organizations left their data plagued by duplications, inconsistencies and poor quality.

The nature of modern data-driven business means that such data citizens are spread throughout the organization. Furthermore, many of the key data citizens (think value-adding approaches to data use such as data-driven marketing) aren’t actively involved with IT departments.

Because of this, Data Governance 1.0 initiatives fizzled out with discouraging frequency.

This is, of course, problematic for organizations that identify regulatory compliance as a driver of data governance. Considering the nature of data-driven business – with new data being constantly captured, stored and leveraged – meeting compliance standards can’t be viewed as a one-time fix, so data governance can’t be de-prioritized and left to fizzle out.

Even those businesses that manage to maintain the level of input data governance needs on an indefinite basis will find the Data Governance 1.0 approach wanting. In terms of regulatory compliance, the lack of context associated with Data Governance 1.0, and the inaccuracies it leads to, mean that potentially serious data governance issues could go undiscovered and result in repercussions for non-compliance.

We recommend organizations look beyond just data cataloging and compliance as desired outcomes when implementing DG. In the data-driven business landscape, data governance finds its true potential as a value-added initiative.

Organizations that identify the desired business outcome of data governance as a value-added initiative should also consider Data Governance 1.0’s shortcomings. And any organization that hasn’t identified value-adding as a business outcome should ask itself, “why?”

Many of the biggest market disruptors of the 21st century have been digitally savvy start-ups with robust data strategies – think Airbnb, Amazon and Netflix. Without high data governance standards, such companies would not have the level of trust in their data needed to confidently act on such digital-first strategies, which would otherwise be difficult to manage.

Therefore, in the data-driven business era, organizations should consider a Data Governance 2.0 strategy, with DG becoming an organization-wide, strategic initiative that de-silos the practice from the confines of IT.

This collaborative take on data governance intrinsically involves data’s biggest beneficiaries and users in the governance process, meaning functions like data cataloging benefit from greater context, accuracy and consistency.

It also means that organizations can have greater trust in their data and be more assured of meeting the standards set for regulatory compliance. It means that organizations can better respond to customer needs through more accurate methods of profiling and analysis, improving rates of satisfaction. And it means that organizations are less likely to suffer data breaches and their associated damages.

Defining Data Governance: The Enterprise Data Governance Experience (EDGE)

The EDGE is the erwin approach to Data Governance 2.0, empowering an organization to:

  • Manage any data, anywhere (Any2)
  • Instill a culture of collaboration and organizational empowerment
  • Introduce an integrated ecosystem for data management that draws from one central repository and ensures data (including real-time changes) is consistent throughout the organization
  • Have visibility across domains by breaking down silos between business and IT and introducing a common data vocabulary
  • Have regulatory peace of mind through mitigation of a wide range of risks, from GDPR to cybersecurity. 

To learn more about implementing data governance, click here.

Take the DG RediChek


Data Modeling in a Jargon-filled World – The Logical Data Warehouse

There’s debate surrounding the term “logical data warehouse.” Some argue that it is a new concept, while others argue that all well-designed data warehouses are logical and so the term is meaningless. This is a key point I’ll address in this post.

I’ll also discuss data warehousing that incorporates some of the technologies and approaches we’ve covered in previous installments of this series (1, 2, 3, 4, 5, 6 ) but with a different architecture that embraces “any data, anywhere.”

So what is a “logical data warehouse?”

Bill Inmon and Barry Devlin provide two oft-quoted definitions of a “data warehouse.” Inmon says “a data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management’s decision-making process.”

Devlin stripped down the definition, saying “a data warehouse is simply a single, complete and consistent store of data obtained from a variety of sources and made available to end users in a way they can understand and use in a business context.”

Although these definitions are widely adopted, there is some disparity in their interpretation. Some insist that such definitions imply a single repository, and thus a limitation.

On the other hand, some argue that a “collection of data” or a “single, complete and consistent store” could just as easily be virtual and therefore not inherently singular. They argue that the wording simply reflects most early implementations being single, physical data stores due to technology limitations.

Mark Beyer of Gartner is a prominent name in the former, singular-repository camp. In 2011, he said “the logical data warehouse (LDW) is a new data management architecture for analytics which combines the strengths of traditional repository warehouses with alternative data management and access strategy,” and the work has since been widely circulated.

So proponents of the “logical data warehouse,” as defined by Mark Beyer, don’t disagree with the value of an integrated collection of data. They just feel that if said collection is managed and accessed as something other than a monolithic, single physical database, then it is something different and should be called a “logical data warehouse” instead of just a “data warehouse.”

As the author of a series of posts about a jargon-filled [data] world, who am I to argue with the introduction of more new jargon?

In fact, I’d be remiss if I didn’t point out that the notion of a logical data warehouse has numerous jargon-rich enabling technologies and synonyms, including Service Oriented Architecture (SOA), Enterprise Services Bus (ESB), Virtualization Layer, and Data Fabric, though the latter term also has other unrelated uses.

So the essence of a logical data warehouse approach is to integrate diverse data assets into a single, integrated virtual data warehouse, without the traditional batch ETL or ELT processes required to copy data into a single, integrated physical data warehouse.

One of the key attractions to proponents of the approach is the avoidance of recurring batch extraction, transformation and loading activities that, it is typically argued, cause delays and lead to decisions being made on data that is not as current as it could be.

The idea is to use caching and other technologies to create a virtualization layer that enables information consumers to ask a question as though they were interrogating a single, integrated physical data warehouse. The virtualization layer – which, together with the data resident in some combination of underlying application systems, IoT data streams, external data sources, blockchains, data lakes, data warehouses and data marts, constitutes the logical data warehouse – responds correctly with more current data and without having to extract, transform and load data into a centralized physical store.
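
As a toy sketch of that virtualization idea – with hypothetical in-memory lists standing in for real source systems, and nothing resembling an actual vendor implementation – one query interface below decomposes a single question into per-source queries and synthesizes the result without any central physical copy.

```python
# Two "source systems" stand in for an application database and a data lake.
ORDERS_SOURCE = [
    {"customer_id": 1, "amount": 250.0},
    {"customer_id": 2, "amount": 90.0},
]
CUSTOMER_SOURCE = [
    {"customer_id": 1, "region": "EMEA"},
    {"customer_id": 2, "region": "APAC"},
]

def query_logical_warehouse(region):
    """Decompose one question into per-source queries, then synthesize."""
    customers = [c for c in CUSTOMER_SOURCE if c["region"] == region]
    ids = {c["customer_id"] for c in customers}
    orders = [o for o in ORDERS_SOURCE if o["customer_id"] in ids]
    return sum(o["amount"] for o in orders)

# The consumer asks one question as if against a single warehouse.
print(query_logical_warehouse("EMEA"))   # 250.0
```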

Logical Data Warehouse

While the moniker may be new, bringing the query to the data set(s) and then assembling an integrated result is not a new idea. There have been numerous successful implementations in the past, though they often required custom coding and rigorous governance to ensure response times and logical correctness.

Some would argue that such previous implementations were also not at the leading edge of data warehousing in terms of data volume or scope.

What is generating renewed interest in this approach is the continued frustration on the part of numerous stakeholders with delays attributed to ETL/ELT in traditional data warehouse implementations.

When you compound this with the often high costs of large (physical) data warehouse implementations – especially those based on MPP hardware – and juxtapose that against the promise of new solutions from vendors like Denodo and Cisco that capitalize on the increasing prevalence of technologies such as the cloud and in-memory computing, it’s not hard to see why.

One topic that quickly becomes clear as one learns more about the various logical data warehouse vendor solutions is that metadata is a very important component. However, this shouldn’t be a surprise, as the objective is still to present a single, integrated view to the information consumer.

So a well-architected, comprehensive and easily understood data model is as important as ever, both to ensure that information consumers can easily access properly integrated data and because the virtualization technology itself depends on a properly architected data model to accurately transform an information request into queries against multiple data sources and then correctly synthesize the result sets into an appropriate response to the original information request.

We hope you’ve enjoyed our series, Data Modeling in a Jargon-filled World, learning something from this post or one of the previous posts in the series (1, 2, 3, 4, 5, 6 ).

The underlying theme, as you’ve probably deduced, is that data modeling remains critical in a world in which the volume, variety and velocity of data continue to grow while information consumers find it difficult to synthesize the right data in the right context to help them draw the right conclusions.

We encourage you to read other blog posts on this site by erwin staff members and other guest bloggers and to participate in ongoing events and webinars.

If you’d like to know more about accelerating your data modeling efforts for specific industries, while reducing risk and benefiting from best practices and lessons learned by other similar organizations in your industry, please visit erwin partner ADRM Software.

Data-Driven Business Transformation


NoSQL Database Adoption Is Poised to Explode

NoSQL database technology is gaining a lot of traction across industry. So what is it, and why is it increasing in use?

Techopedia defines NoSQL as “a class of database management systems (DBMS) that do not follow all of the rules of a relational DBMS and cannot use traditional SQL to query data.”

The rise of the NoSQL database

The rise of NoSQL can be attributed to the limitations of its predecessor. SQL databases were not conceived with today’s vast amount of data and storage requirements in mind.

Businesses, especially those with digital business models, are choosing to adopt NoSQL to help manage “the three Vs” of Big Data: increased volume, variety and velocity. Velocity in particular is driving NoSQL adoption because of the inevitable bottlenecks of SQL’s sequential data processing.

MongoDB, the fastest-growing supplier of NoSQL databases, notes this when comparing the traditional SQL relational database with the NoSQL database, saying “relational databases were not designed to cope with the scale and agility challenges that face modern applications, nor were they built to take advantage of the commodity storage and processing power available today.”

With all this in mind, we can see why the NoSQL database market is expected to reach $4.2 billion in value by 2020.

What’s next and why?

We can expect the adoption of NoSQL databases to continue growing, in large part because of Big Data’s continued growth.

And analysis indicates that data-driven decision-making improves productivity and profitability by 6%.

Businesses across industry appear to be picking up on this fact. An EY/Nimbus Ninety study found that 81% of companies understand the importance of data for improving efficiency and business performance.

However, understanding the importance of data to modern business isn’t enough. What 100% of organizations need to grasp is that strategic data analysis that produces useful insights has to start from a stable data management platform.

Gartner indicates that 90% of all data is unstructured, highlighting the need for dedicated data modeling efforts, and at a wider level, data management. Businesses can’t leave that 90% on the table because they don’t have the tools to properly manage it.

This is the crux of the Any2 data management approach – being able to manage “any data” from “anywhere.” NoSQL plays an important role in end-to-end data management by helping to accelerate the retrieval and analysis of Big Data.

The improved handling of data velocity is vital to becoming a successful digital business, one that can effectively respond in real time to customers, partners, suppliers and other parties, and profit from these efforts.

In fact, the velocity with which businesses are able to harness and query large volumes of unstructured, structured and semi-structured data in NoSQL databases makes them a critical asset for supporting modern cloud applications and their scale, speed and agile development demands.

For more data advice and best practices, follow us on Twitter and LinkedIn to stay up to date with the blog.

For a deeper dive into Taking Control of NoSQL Databases, get the FREE eBook below.

Benefits of NoSQL


Enterprise Architecture vs. Data Architecture vs. Business Process Architecture

Despite the similar nomenclature, enterprise architecture, data architecture and business process architecture are very different disciplines. Yet organizations that combine all three enjoy much greater success in data management.

Understanding both the differences between the three and how they work together starts with understanding each discipline individually:

What is Enterprise Architecture?

Enterprise architecture defines the structure and operation of an organization. Its desired outcome is to determine current and future objectives and translate those goals into a blueprint of IT capabilities.

A useful analogy for understanding enterprise architecture is city planning. A city planner devises the blueprint for how a city will come together and how it will be interacted with. They need to be cognizant of regulations (zoning laws) and understand the current state of the city and its infrastructure.

A good city planner means fewer false starts, less waste and a faster, more efficient project.

In this respect, a good enterprise architect is a lot like a good city planner.

What is Data Architecture?

The Data Management Body of Knowledge (DMBOK) defines data architecture as “specifications used to describe existing state, define data requirements, guide data integration, and control data assets as put forth in a data strategy.”

So data architecture involves models, policy rules or standards that govern what data is collected and how it is stored, arranged, integrated and used within an organization and its various systems. The desired outcome is enabling stakeholders to see business-critical information regardless of its source and relate to it from their unique perspectives.

There is some crossover between enterprise and data architecture. This is because data architecture is inherently an offshoot of enterprise architecture. Where enterprise architects take a holistic, enterprise-wide view in their duties, data architects’ tasks are much more refined and focused. If an enterprise architect is the city planner, then a data architect is an infrastructure specialist – think plumbers, electricians, etc.

For a more in-depth look at enterprise architecture vs. data architecture, see: The Difference Between Data Architecture and Enterprise Architecture

What is Business Process Architecture?

Business process architecture describes an organization’s business model, strategy, goals and performance metrics.

It provides organizations with a method of representing the elements of their business and how they interact with the aim of aligning people, processes, data, technologies and applications to meet organizational objectives. With it, organizations can paint a real-world picture of how they function, including opportunities to create, improve, harmonize or eliminate processes to improve overall performance and profitability.

Enterprise, Data and Business Process Architecture in Action

A successful data-driven business combines enterprise architecture, data architecture and business process architecture. Integrating these disciplines from the ground up ensures a solid digital foundation on which to build. A strong foundation is necessary because of the amount of data businesses already have to manage. In the last two years, more data has been created than in all of humanity’s history.

And it’s still soaring. Analysts predict that by 2020, we’ll create about 1.7 megabytes of new information every second for every human being on the planet.

While it’s a lot to manage, the potential gains of becoming a data-driven enterprise are too high to ignore. Fortune 1000 companies could potentially net an additional $65 million in income with access to just 10 percent more of their data.

To effectively employ enterprise architecture, data architecture and business process architecture, it’s important to know the differences in how they operate and their desired business outcomes.

Enterprise Architecture, Data Architecture and Business Process Architecture

Combining Enterprise, Data and Business Process Architecture for Better Data Management

Historically, these three disciplines have been siloed, without an inherent means of sharing information. Therefore, collaboration between the tools and relevant stakeholders has been difficult.

To truly power a data-driven business, removing these silos is paramount, so as not to limit the potential analysis your organization can carry out. Businesses that understand and adopt this approach will benefit from much better data management when it comes to the ‘3 Vs.’

They’ll be better able to cope with the massive volumes of data a data-driven business will introduce; be better equipped to handle the increased velocity of data, processing data accurately and quickly to keep time to market low; and be able to effectively manage data from a growing variety of sources.

In essence, enabling collaboration between enterprise architecture, data architecture and business process architecture helps an organization manage “any data, anywhere” – or Any2. This all-encompassing view provides the potential for deeper data analysis.

However, attempting to manage all your data without all the necessary tools is like trying to read a book without all the chapters. And trying to manage data with a host of uncollaborative, disparate tools is like trying to read a story with chapters from different books. Clearly neither approach is ideal.

Unifying the disciplines as the foundation for data management provides organizations with the whole ‘data story.’

The importance of getting the whole data story should be very clear considering the aforementioned statistic – Fortune 1000 companies could potentially net an additional $65 million in income with access to just 10 percent more of their data.

Download our eBook, Solving the Enterprise Data Dilemma to learn more about data management tools, particularly enterprise architecture, data architecture and business process architecture, working in tandem.


Data-Driven Business Transformation: the Data Foundation

In light of data’s prominence in modern business, organizations need to ensure they have a strong data foundation in place.

The ascent of data’s value has been as steep as it is staggering. In 2016, it was suggested that more data would be created in 2017 than in the previous 5000 years of humanity.

But what’s even more shocking is that the peak may not even be in sight.

To put its value into context, the five most valuable businesses in the world all deal in data (Alphabet/Google, Amazon, Apple, Facebook and Microsoft). It’s even overtaken oil as the world’s most valuable resource.

Yet, even with data’s value being as high as it is, there’s still a long way to go. Many businesses are still getting to grips with data storage, management and analysis.

Fortune 1000 companies, for example, could earn another $65 million in net income, with access to just 10 percent more of their data (from Data-Driven Business Transformation 2017).

We’re already witnessing the beginnings of this increased potential across various industries. Data-driven businesses such as Airbnb, Uber and Netflix are all dominating, disrupting and revolutionizing their respective sectors.

Interestingly, although they provide very different services for the consumer, the organizations themselves all identify as data companies. This simple change in perception and outlook stresses the importance of data to their business models. For them, data analysis isn’t just an arm of the business… It’s the core.

Data foundation

The dominating data-driven businesses use data to influence almost everything. How decisions are made, how processes could be improved, and where the business should focus its innovation efforts.

However, simply establishing that your business could (and should) be getting more out of data, doesn’t necessarily mean you’re ready to reap the rewards.

In fact, a pre-emptive dive into a data strategy could slow your digital transformation efforts down. Hurried software investments in response to disruption can lead to teething problems in your strategy’s adoption and to shelfware, wasting time and money.

Additionally, oversights in the strategy’s implementation will stifle the very potential effectiveness you’re hoping to benefit from.

Therefore, when deciding to bolster your data efforts, a great place to start is to consider the ‘three Vs’.

The three Vs

The three Vs of data are volume, variety and velocity. Volume references the amount of data; variety, its different sources; and velocity, the speed in which it must be processed.

When you’re ready to start focusing on the business outcomes that you hope data will provide, you can also stretch those three Vs, to five. The five Vs include the aforementioned, and also acknowledge veracity (confidence in the data’s accuracy) and value, but for now we’ll stick to three.

As discussed, the total amount of data in the world is staggering. But the total data available to any one business can be huge in its own right (depending on the extent of your data strategy).

Unsurprisingly, vast volumes of data come from a vast number of potential sources, and it takes dedicated tools to process them. Even then, the sources are often disparate and very unlikely to offer worthwhile insight in a vacuum.

This is why it’s so important to have an assured data foundation upon which to build a data platform.

A solid data foundation

The Any2 approach is a strategy for housing, sorting and analyzing data that aims to be that very foundation on which you build your data strategy.

Shorthand for “any data, anywhere,” Any2 can help clean up the disparate noise and let businesses drill down on, and effectively analyze, the data in order to yield more reliable and informative results.

It’s especially important today, as data sources are becoming increasingly unstructured, and so more difficult to manage.

Big data for example, can consist of click stream data, Internet of Things data, machine data and social media data. The sources need to be rationalized and correlated so they can be analyzed more effectively.
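
As a minimal, purely illustrative sketch of that rationalization step – with made-up clickstream and IoT records rather than real feeds – the code below brings two disparate sources under one common key so they can be analyzed together.

```python
from collections import defaultdict

clickstream = [
    {"user": "u1", "page": "/pricing", "ts": "2019-05-01T10:00:00"},
    {"user": "u2", "page": "/docs", "ts": "2019-05-01T10:05:00"},
]
iot_events = [
    {"device_owner": "u1", "reading": 42.0, "ts": "2019-05-01T10:01:00"},
]

def correlate(clicks, events):
    """Rationalize both feeds under one key so they can be analyzed together."""
    combined = defaultdict(lambda: {"clicks": [], "iot": []})
    for c in clicks:
        combined[c["user"]]["clicks"].append(c)
    for e in events:
        combined[e["device_owner"]]["iot"].append(e)
    return dict(combined)

print(correlate(clickstream, iot_events)["u1"])
```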

When it comes to actioning an Any2 approach, a fluid relationship between the various data initiatives involved is essential – namely Data Modeling, Enterprise Architecture, Business Process and Data Governance.

It also requires collaboration, both between the aforementioned initiatives and with the wider business, to ensure everybody is working towards the same goal.

With a solid data foundation platform in place, your business can really begin to start realizing data’s potential for itself. You also ensure you’re not left behind as new disruptors enter the market, and your competition continues to evolve.

For more data advice and best practices, follow us on Twitter and LinkedIn to stay up to date with the blog.

For a deeper dive into best practices for data, its benefits, and its applications, get the FREE whitepaper below.

Data-Driven Business Transformation