
Why Data Governance is the Key to Better Decision-Making

The ability to quickly collect vast amounts of data, analyze it, and then use what you’ve learned to help foster better decision-making is the dream of many a business executive. But like any number of things that can be summarized in a single sentence, it’s much harder to execute on such a vision than it might first appear.

According to Forrester, 74 percent of firms say they want to be “data-driven,” but only 29 percent say they are good at connecting analytics to action. Consider this: Forrester found that business satisfaction with analytics dropped by 21 percent between 2014 and 2015 – a period of great promise and great investment in Big Data. In other words, the more data businesses were collecting and mining, the less happy they were with their analytics.

A number of factors are potentially at play here, including the analytics software, the culture of the business, and the skill sets of the people using the data. But your analytics applications and the conclusions you draw from your analysis are only as good as the data that is collected and analyzed. Collecting, safeguarding and mining large amounts of data isn’t an inexpensive exercise, and as the saying goes, “garbage in, garbage out.”

“It’s a big investment and if people don’t trust data, they won’t use things like business intelligence tools because they won’t have faith in what they tell them,” says Danny Sandwell, director of product marketing at erwin, Inc.

Using data to inform business decisions is hardly new, of course. The modern idea of market research dates back to the 1920s, and ever since, businesses have collected, analyzed and drawn conclusions from information gathered from customers or prospective customers.

The difference today, as you might expect, is the amount of data and how it’s collected. Data is generated by machines large and small, by people, and by old-fashioned market research. It enters today’s businesses from all angles, at lightning speed, and can, in many cases, be available for instant analysis.

As the volume and velocity of data increase, overload becomes a potential problem. Unless the business has a strategic plan for data governance, decisions about where the data is stored, who and what can access it, and how it can be used become increasingly difficult to make.

Not every business collects massive amounts of data like Facebook and Yahoo, but recent headlines demonstrate how those companies’ inability to govern data is harming their reputations and bottom lines. For Facebook, it was the revelation that the data of 87 million users was improperly obtained to influence the 2016 U.S. presidential election. For Yahoo, it was the $35 million fine the U.S. Securities and Exchange Commission (SEC) levied for failure to disclose a data breach in a timely manner.

In both the Facebook and Yahoo cases, the misuse or failure to protect data was one problem. Their inability to quickly quantify the scope of the problem and disclose the details made a big issue even worse – and kept it in the headlines even longer.

The issues of data security, data privacy and data governance may not be top of mind for some business users, but they manifest themselves in a number of ways that affect what those users do on a daily basis. Think of it this way: somewhere in all the data your organization collects, there is likely a piece of information that can support or refute a decision you’re about to make. Can you find it? Can you trust it?

If the answer to these questions is “no,” then it won’t be easy for your organization to make data-driven decisions.


Powering Better Decision-Making with Data Governance

Nearly half (45 percent) of the respondents to a November 2017 survey by erwin and UBM said better decision-making was one of the factors driving their data governance initiatives.

Data governance helps businesses understand what data they have, how good it is, where it is, and how it’s used. A lot of people are talking about data governance today, and some are putting that talk into action. The erwin/UBM survey found that 52 percent of respondents say data is critically important to their organization and they have a formal data governance strategy in place. But almost as many respondents (46 percent) say they recognize the value of data to their organization but don’t have a formal governance strategy.

Many early attempts at instituting data governance failed to deliver results. They were narrowly focused, and their proponents often had difficulty articulating the value of data governance to the organization, making it difficult to secure budget. Some organizations even treated data governance as a form of data security, locking up data so tightly that the people who wanted to use it to foster better decision-making had trouble getting access.

Issues of ownership also stymied early data governance efforts, as IT and the business couldn’t agree on which side was responsible for a process that affects both on a regular basis. Today, organizations are better equipped to resolve issues of ownership, thanks in large part to a new corporate structure that recognizes how important data is to modern businesses. Roles like chief data officer (CDO), which increasingly sits on the business side, and the data protection officer (DPO), are more common than they were a few years ago.

A modern data governance strategy works a lot like data itself – it permeates the business and its infrastructure. It is part of the enterprise architecture and the business processes, and it helps organizations better understand the relationships between data assets using techniques like visualization. Perhaps most important, a modern approach to data governance is ongoing; because organizations and their data are constantly changing and transforming, their approach to data governance can’t sit still.

As you might expect, better visibility into your data goes a long way toward using that data to make more informed decisions. There is, however, another advantage to the visibility offered by a holistic data governance strategy: it helps you better understand what you don’t know.

By helping businesses understand the areas where they can improve their data collection, data governance helps organizations continually work to create better data, which manifests itself in real business advantages, like better decision-making and top-notch customer experiences, all of which will help grow the business.

Michael Pastore is the Director, Content Services at QuinStreet B2B Tech. This content originally appeared as a sponsored post on http://www.eweek.com/.


You can determine how effective your current data governance initiative is by taking erwin’s DG RediChek.

Take the DG RediChek


Digital Trust: Enterprise Architecture and the Farm Analogy

With the General Data Protection Regulation (GDPR) taking effect soon, organizations can use it as a catalyst in developing digital trust.

Data breaches are increasing in scope and frequency, creating PR nightmares for the organizations affected. The more breaches occur, the more news coverage they generate, and the longer they stay in consumers’ minds.

The Equifax breach and subsequent stock price fall were well documented and should serve as a warning to businesses about how they manage their data. Large or small, organizations have lessons to learn when it comes to building and maintaining digital trust, especially with GDPR looming ever closer.

Previously, we discussed the importance of fostering a relationship of trust between business and consumer.  Here, we focus more specifically on data keepers and the public.


Digital Trust and The Farm Analogy

Any approach to mitigating the risks associated with data management needs to consider the ‘three Vs’: variety, velocity and volume.

In describing best practices for handling data, let’s imagine data as an asset on a farm. The typical farm’s wide span makes constant surveillance impossible, similar in principle to data security.

With a farm, you can’t just put a fence around the perimeter and then leave it alone. The same is true of data: you need a security approach that makes dealing with volume and variety easier.

On a farm, that means separating crops and different types of animals. For data, segregation serves to stop those without permissions from accessing sensitive information.

And as with a farm and its seeds, livestock and other assets, data doesn’t just come in; you also must manage what goes out.

A farm has several gates allowing people, animals and equipment to pass through, pending approval. With data, gates need to make sure only the intended information filters out and that it is secure when doing so. Failure to correctly manage data transfer will leave your business in breach of GDPR and liable for a hefty fine.

Furthermore, when looking at the gates through which data enters and leaves an organization, we must also consider the third ‘V’ – velocity, the amount of data an organization’s systems can process at any given time.

Of course, the velocity of data an organization can handle is most often tied to how efficiently the business operates. Effectively dealing with high velocities of data requires faster analysis and faster time to market.

However, it’s arguably a matter of security too. Although not breaches in themselves, DDoS attacks are one such vulnerability associated with data velocity.

DDoS attacks are designed to put the aforementioned data gates under pressure, ramping up the amount of data that passes through them at any one time. Organizations with the infrastructure to deal with such an attack, especially infrastructure capable of scaling to demand, will suffer less avoidable downtime.

Enterprise Architecture and Harvesting the Farm

Making sure you can access, understand and use your data for strategic benefit – including fostering digital trust – comes down to effective data management and governance. And enterprise architecture is a great starting point because it provides a holistic view of an organization’s capabilities, applications and systems including how they all connect.

Enterprise architecture at the core of any data-driven business will serve to identify what parts of the farm need extra protections – those fences and gates mentioned earlier.

It also makes GDPR compliance and overall data governance easier, as the first step for both is knowing where all your data is.

For more data management best practices, click here. And you can subscribe to our blog posts here.



Data Modeling in a Jargon-filled World – In-memory Databases

With the volume and velocity of data increasing, in-memory databases provide a way to keep query response times low.

Traditionally, databases have stored their data on mechanical storage media such as hard disks. While this has contributed to durability, it’s also constrained attainable query speeds. Database and software designers have long realized this limitation and sought ways to harness the faster speeds of in-memory processing.

The traditional approach to database design – and to the analytics solutions that access those databases – includes in-memory caching, which retains a subset of recently accessed data in memory for fast access. While caching often worked well for online transaction processing (OLTP), it was not optimal for analytics and business intelligence. In these cases, the most frequently accessed information – rather than the most recently accessed information – is typically of most interest.
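
To make the recency-versus-frequency distinction concrete, here is a minimal Python sketch of a recency-based (LRU) cache sitting in front of a stubbed disk lookup; the function name and data are hypothetical and not tied to any particular vendor’s product.

```python
from functools import lru_cache

@lru_cache(maxsize=10_000)
def get_customer(customer_id: int) -> tuple:
    # Hypothetical stand-in for a read against a disk-based RDBMS;
    # the decorator keeps the most *recently* requested rows in memory,
    # which suits OLTP-style point lookups.
    return ("customer", customer_id)

get_customer(42)                  # miss: goes to "disk"
get_customer(42)                  # hit: served from memory
print(get_customer.cache_info())  # CacheInfo(hits=1, misses=1, ...)
```

An analytics workload, by contrast, benefits more from keeping the most frequently scanned data resident in memory, which is what the in-memory engines discussed below set out to do.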

That said, loading an entire data warehouse or even a large data mart into memory has been challenging until recent years.


A few key factors have made in-memory databases and analytics offerings relevant for more and more use cases. One has been the shift to 64-bit operating systems, which makes much more addressable memory available. And as one might assume, the availability of increasingly large and affordable memory has also played a part.

Database and software developers have begun to take advantage of in-memory databases in a myriad of ways. These include the many key-value stores such as Amazon DynamoDB, which provide very low latency for IoT and a host of other use cases.
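
As a rough illustration of that key-value access pattern, here is a minimal boto3 sketch of a single-key read; the table name, key and region are hypothetical, and this is a sketch rather than a complete, production-ready example.

```python
import boto3

# Hypothetical table ("devices") with partition key "device_id".
dynamodb = boto3.resource("dynamodb", region_name="us-east-1")
table = dynamodb.Table("devices")

# A simple primary-key lookup; DynamoDB is designed to serve reads
# like this with consistently low latency.
response = table.get_item(Key={"device_id": "sensor-0042"})
print(response.get("Item"))
```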

Businesses are also taking advantage of in-memory technology across a spectrum that ranges from distributed in-memory NoSQL databases such as Aerospike to in-memory NewSQL databases such as VoltDB. However, for the remainder of this post, we’ll touch in more detail on several solutions with which you might be more familiar.

Some database vendors have chosen to build hybrid solutions that incorporate in-memory technologies. They aim to bridge in-memory with solutions based on tried-and-true, disk-based RDBMS technologies. Such vendors include Microsoft with its incorporation of xVelocity into SQL Server, Analysis Services and PowerPivot, and Teradata with its Intelligent Memory.

Other vendors, like IBM with its dashDB database, have chosen to deploy in-memory technology in the cloud, while capitalizing on previously developed or acquired technologies (in-database analytics from Netezza in the case of dashDB).

However, probably the most high-profile application of in-memory technology has been SAP’s significant bet on its HANA in-memory database, which first shipped in late 2010. SAP has since made it available in the cloud through its SAP HANA Cloud Platform and on Microsoft Azure, and it has released a comprehensive application suite called S/4HANA.

Like most of the analytics-focused in-memory databases and analytics tools, HANA stores data in a column-oriented, in-memory database. The primary rationale for taking a column-oriented approach to storing data in memory is that in analytic use cases, where data is queried but not updated, it often allows for very impressive compression of the data values in each column. This means much less memory is used, resulting in even higher throughput and less need for expensive memory.
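
That compression typically comes from per-column techniques such as dictionary and run-length encoding, which work well because a single column tends to contain many repeated values. The snippet below is a simplified Python sketch of the general idea, not the actual encoding used by HANA or any other specific product.

```python
from itertools import groupby

# A toy column of low-cardinality values, as is common in analytic tables.
country_column = ["US", "US", "US", "DE", "DE", "FR", "FR", "FR", "FR"]

# Dictionary encoding: store each distinct value once, plus small integer codes.
dictionary = {value: code for code, value in enumerate(sorted(set(country_column)))}
encoded = [dictionary[value] for value in country_column]

# Run-length encoding on top of the codes compresses repeated runs further.
run_length = [(code, sum(1 for _ in run)) for code, run in groupby(encoded)]

print(dictionary)   # {'DE': 0, 'FR': 1, 'US': 2}
print(encoded)      # [2, 2, 2, 0, 0, 1, 1, 1, 1]
print(run_length)   # [(2, 3), (0, 2), (1, 4)]
```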

So what approach should a data architect adopt? Are Microsoft, Teradata and other “traditional” RDBMS vendors correct with their hybrid approach?

As memory gets cheaper by the day, and the value of rapid insights increases by the minute, should we host the whole data warehouse or data mart in memory, as vendors like SAP and IBM suggest?

It depends on the specific use case, data volumes, business requirements, budget, etc. One thing that is not in dispute is that all the major vendors recognize that in-memory technology adds value to their solutions. And that extends beyond the database vendors to analytics tool stalwarts like Tableau and newer arrivals like Yellowfin.

It is incumbent upon enterprise architects to learn about the relative merits of the different approaches championed by the various vendors and to select the best fit for their specific situation. This is, admittedly, not easy given the pace of adoption of in-memory databases and the variety of approaches being taken.

But there’s a silver lining to the creative disruption caused by the increasing adoption of in-memory technologies. Because of the sheer speed these solutions offer, many organizations are finding that the need to pre-aggregate data to achieve certain performance targets for specific analytics workloads is disappearing. The same goes for the need to de-normalize database designs to achieve specific analytics performance targets.

Instead, organizations are finding that it’s more important to create comprehensive atomic data models that are flexible and independent of any assumed analytics workload.

Perhaps surprisingly to some, third normal form (3NF) is once again not an unreasonable standard of data modeling for modelers who plan to deploy to a pure in-memory or in-memory-augmented platform.

Organizations can forgo the time-consuming effort to model and transform data to support specific analytics workloads, which are likely to change over time anyway. They can also stop worrying about de-normalizing and tuning an RDBMS for those same fickle and variable analytics workloads, and focus instead on creating a logical data model of the business that reflects the business information requirements and relationships in a flexible, detailed format that doesn’t assume specific aggregations and transformations.
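
As a loose illustration of that kind of workload-neutral, 3NF-style model (the entities and attributes here are hypothetical), each fact lives in exactly one place and relationships are carried by identifiers rather than repeated, denormalized columns:

```python
from dataclasses import dataclass

@dataclass
class Customer:
    customer_id: int
    name: str
    country: str

@dataclass
class Product:
    product_id: int
    description: str
    list_price: float

@dataclass
class Order:
    order_id: int
    customer_id: int   # references Customer
    order_date: str

@dataclass
class OrderLine:
    order_id: int      # references Order
    line_number: int
    product_id: int    # references Product
    quantity: int
```

Any revenue-by-country or product-mix view can then be computed on the fly by the in-memory engine rather than being frozen into the schema as a pre-aggregated or denormalized structure.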

The blinding speed of in-memory technologies provides the aggregations, joins and other transformations on the fly, without the onerous performance penalties we have historically experienced with very large data volumes on disk-only solutions. As a long-time data modeler, I like the sound of that. And so far in my experience with many of the solutions mentioned in this post, business people like the blinding speed and flexibility of these new in-memory technologies, too!

Please join us next time for the final installment of our series, Data Modeling in a Jargon-filled World – The Logical Data Warehouse. We’ll discuss an approach to data warehousing that uses some of the technologies and approaches we’ve discussed in the previous six installments while embracing “any data, anywhere.”


Every Company Requires Data Governance and Here’s Why

With the GDPR taking effect imminently, businesses need to ensure they have a handle on data governance.