
Digital Trust: Earning It and Keeping It with Data Governance

Digital trust can make or break a brand.

Amazon understood this concept early on. When the company first launched as an online bookseller in 1994, consumer confidence in online shopping was low, to say the least.

Competing almost exclusively with local bookstores, Amazon and many other e-tailers throughout the ’90s and early 2000s had to work to create trust in online shopping. Their efforts paid off, ushering in a new era and transforming the way we all shop today.

Amazon is a good example of digital trust making a brand. But data breaches are a telling metric of how lack of digital trust can break a brand.

Frequency of Data Breaches and Its Impact on Consumer Trust

Since Privacy Rights Clearinghouse began tracking data breaches in 2005, 7,731 have been reported, with an estimated 1 billion individual records breached. And that estimate is conservative. While a data breach may have been reported, the number of individual records involved isn’t always known.

The Ponemon Institute’s 2017 Cost of Data Breach Study suggests the odds of suffering a data breach within the year are as high as one in four. As if the growing number of data breaches isn’t enough to contend with, considerable evidence suggests their impact is increasing too.

Although the Ponemon Institute study found the financial cost of a data breach fell by 10 percent between 2016 and 2017, the “financial cost” doesn’t account for the various intangible effects of a data breach that can, and do, add up.

For example, the reputational cost more than likely outweighs the clean-up costs of a high-profile data breach like the one Equifax suffered recently. That incident is believed to have reduced Equifax’s market value by $3 billion, as share prices tumbled by as much as 17 percent.

In fact, companies disclosing a data breach saw their stock prices fall by an average of 5 percent, according to Ponemon. And 21 percent of consumers included in its study reported ending their relationships with a company that had been breached. Why? They lost trust in those businesses.

Perhaps the most relevant finding here is that “organizations with a poor security posture experienced an increase of up to 7 percent customer churn, which can amount to millions in lost revenue.” This clearly shows the correlation between digital trust and customer retention – and that consumers are paying attention.

That’s why digital trust poses an opportunity. Yes, consumer trust is declining. Yes, high-profile breaches are increasing. But these are alarm bells, not death knells.

Businesses can use the issue of digital trust to their advantage. By making it a unique value proposition reinforced by a solid data governance (DG) program, you can set yourself apart from the competition – not to mention avoid GDPR penalties.


Building Digital Trust Through Data Governance

In today’s digital economy, the consumer holds the power with more avenues of research and reviews to inform purchase decisions. Even in the B2B world, studies indicate that 47 percent of buyers view three to five pieces of content before engaging with a sales rep.

In other words, the consumer is clued in. But if a data breach occurs, it doesn’t have to lead to customer losses. It could actually reinforce customer loyalty and produce an uptick in new customers – if you are proactive in your response and transparent about your procedures for data governance.

Of course, consumer trust isn’t built overnight. It’s a process, influenced by sound data governance practices and routine demonstrations of said practices so trust becomes part of your brand.

While considering the long-term payoff, it’s also worth noting the advantages a data governance program delivers in the short term. For better or worse, short-term positive outcomes are what business leaders and decision-makers want to see.

When it comes to both digital trust and business outcomes, DG’s biggest advantage is ensuring an organization can first trust its own data.

In addition to helping an organization discover, understand and then socialize its mission-critical information for greater visibility, data governance improves the enterprise’s ability to govern and control data. You literally get a handle on how you handle your data – and not just to help prevent breaches.

Greater certainty around the quality of data leads to faster and more productive decision-making. It reduces the risk of misleading models, analysis and prediction, meaning less time, money and other resources are wasted.

Additionally, the very data used in such models and analysis benefits from improved clarity, meaning what’s relevant is more readily discoverable, which speeds up the entire strategic planning and decision-making process.

So, proactive and proficient data governance doesn’t just mitigate risk, it fundamentally improves operational performance and accelerates growth.

For more data best practices click here, and you can stay up to date with our latest posts here.



Using Enterprise Architecture to Improve Security

The personal data of more than 143 million people – nearly half the United States’ entire population – may have been compromised in the recent Equifax data breach. With every major data breach come post-mortems and lessons learned, but one area we haven’t seen discussed is how enterprise architecture might aid in the prevention of data breaches.

For Equifax, the reputational hit, loss of profits/market value, and potential lawsuits are really bad news. For other organizations that have yet to suffer a breach, be warned: the clock is ticking for the General Data Protection Regulation (GDPR) to take effect in May 2018. GDPR changes everything, and it’s just around the corner.

Organizations of all sizes must take greater steps to protect consumer data or pay significant penalties. Negligent data governance and data management could cost up to 4 percent of an organization’s annual worldwide turnover or up to 20 million euros, whichever is greater.
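
To put that penalty formula in concrete terms, here is a minimal sketch (the turnover figures are hypothetical, and an actual fine is set by the supervisory authority, so treat this as an illustration of the cap only):

```python
def max_gdpr_penalty(annual_worldwide_turnover_eur: float) -> float:
    """Upper bound of a GDPR fine for the most serious infringements:
    the greater of 4% of annual worldwide turnover or 20 million euros."""
    return max(0.04 * annual_worldwide_turnover_eur, 20_000_000.0)

# A company with EUR 2 billion in turnover faces a cap of EUR 80 million,
# while a EUR 100 million company is still exposed to the EUR 20 million cap.
print(max_gdpr_penalty(2_000_000_000))  # 80000000.0
print(max_gdpr_penalty(100_000_000))    # 20000000.0
```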

With this in mind, the Equifax data breach – and subsequent lessons – is a discussion potentially worth millions.


Proactive Data Protection and Cybersecurity

Given that data security has long been considered paramount, it’s surprising that enterprise architecture – one approach to improving data protection – has been largely overlooked.

It’s a surprise because when you consider enterprise architecture use cases and just how much of an organization it permeates (which is really all of it), EA should be commonplace in data security planning.

So, the Equifax breach provides a great opportunity to explore how enterprise architecture could be used for improving cybersecurity.

Security should be proactive, not reactive, which is why EA should be a huge part of security planning. And while we hope the Equifax incident isn’t the catalyst for an initial security assessment and improvements, it certainly should prompt a re-evaluation of data security policies, procedures and technologies.

By using well-built enterprise architecture for the foundation of data security, organizations can help mitigate risk. EA’s comprehensive view of the organization means security can be involved in the planning stages, reducing risks involved in new implementations. When it comes to security, EA should get a seat at the table.

Enterprise architecture also goes a long way in nullifying threats born of shadow IT, outdated applications, and other IT faux pas. Well-documented, well-maintained EA gives an organization the best possible view of its current tech assets.

This is especially relevant in Equifax’s case, as the breach has been attributed to the company’s failure to update a web application despite having had sufficient warning to do so.

By leveraging EA, organizations can shore up data security by ensuring updates and patches are implemented proactively.

Enterprise Architecture, Security and Risk Management

But what about existing security flaws? Implementing enterprise architecture in security planning now won’t solve them.

An organization can never eliminate security risks completely. The constantly evolving IT landscape would require businesses to spend an infinite amount of time, resources and money to achieve zero risk. Instead, businesses must opt to mitigate and manage risk to the best of their abilities.

Therefore, EA has a role in risk management too.

In fact, EA’s risk management applications are more widely appreciated than its role in security, yet effective risk management is a fundamental part of how EA supports security in the first place.

Enterprise architecture’s comprehensive accounting of business assets (both technological and human) means it’s best placed to align security and risk management with business goals and objectives. This can give an organization insight into where time and money can best be spent in improving security, as well as the resources available to do so.

This is because of the objective view enterprise architecture analysis provides for an organization.

To use a crude but applicable analogy, consider the risks of travel. A fear of flying is more common than a fear of driving. In a business sense, this could unjustifiably encourage more spending on mitigating the risks of flying. However, an objective enterprise architecture analysis would reveal that, despite the fear, the risk of traveling by car is much greater.

Applying the same logic to security spending, enterprise architecture analysis would give an organization an indication of how to prioritize security improvements.


Data Modeling in a Jargon-filled World – The Cloud

There’s no escaping data’s role in the cloud, and so it’s crucial that we analyze the cloud’s impact on data modeling. 


Every Company Requires Data Governance and Here’s Why

With GDPR regulations imminent, businesses need to ensure they have a handle on data governance.


Data Modeling in a Jargon-filled World – Managed Data Lakes

More and more businesses are adopting managed data lakes.

Earlier in this blog series, we established that leading organizations are adopting a variety of approaches to manage data, including data that may be sourced from a wide range of NoSQL, NewSQL, RDBMS and unstructured sources.

In this post, we’ll discuss managed data lakes and their applications as a hybrid of less structured data and more traditionally structured relational data. We’ll also talk about whether there’s still a need for data modeling and metadata management.

The term “data lake” was coined by James Dixon of Pentaho in a blog entry in which he said:

“If you think of a data mart as a store of bottled water – cleansed and packaged and structured for easy consumption – the data lake is a large body of water in a more natural state. The contents of the lake stream in from a source to fill the lake, and various users of the lake can come to examine, dive in, or take samples.”

Use of the term quickly took on a life of its own with often divergent meanings. So much so that four years later Mr. Dixon felt compelled to refute some criticisms by the analyst community by pointing out that they were objecting to things he actually never said about data lakes.

However, in my experience and despite Mr. Dixon’s objections, the notion that a data lake can contain data from more than one source is now widely accepted.

Similarly, while most early data lake implementations used Hadoop with many vendors pitching the idea that a data lake had to be implemented as a Hadoop data store, the notion that data lakes can be implemented on non-Hadoop platforms, such as Azure Blob storage or Amazon S3, has become increasingly widespread.

So a data lake – as the term is widely used in 2017 – is a detailed (non-aggregated) data store that can contain structured and/or unstructured data from more than one source, implemented on some kind of inexpensive, massively scalable storage platform.

But what are “managed data lakes?”

To answer that question, let’s first touch on why many early data lake projects failed or significantly missed expectations. Criticisms were quick to arise, many of them aimed at data lake implementations that strayed from the original vision described above.

Vendors seized on data lakes as a marketing tool, and as often happens in our industry, they promised it could do almost anything. As long as you poured your data into the lake, people in the organization would somehow magically find exactly the data they needed just when they needed it. As is usually the case, it turned out that for most organizations, their reality was quite different. And for three important reasons:

  1. Most analysts in large organizations didn’t have the skillsets to wade through the rapidly accumulating pool of information in Hadoop – or whichever new platform had been chosen to implement the data lake – to locate the data they needed.
  2. Not enough attention was paid to the need to provide metadata to help people find the data they needed.
  3. Most interesting analytics are a result of integrating disparate data points to draw conclusions, and integration had not been an area of focus in most data lake implementations.

In the face of growing disenchantment with data lake implementations, some organizations and vendors pivoted to address these drawbacks. They did so by embracing what is most commonly called a managed data lake, though some prefer the label “curated data lake” or “modern data warehouse.”

The idea is to address the three criticisms mentioned above by developing an architectural approach that allows for the use of SQL, making data more accessible and providing more metadata about the data available in the data lake. It also takes on some of the challenging work of integration and transformation that earlier data lake implementations had hoped to kick down the road or avoid entirely.

The result in most implementations of a managed data lake is a hybrid that tries to blend the strengths of the original data lake concept with the strengths of traditional large-scale data warehousing (as opposed to the narrow data mart approach Mr. Dixon used as a foil when originally describing data lakes).

Incoming data, either structured or unstructured, can be easily and quickly loaded from many different sources (e.g., applications, IoT, third parties, etc.). The data can be accumulated with minimal processing at reasonable cost using a bulk storage platform such as Hadoop, Azure Blob storage or Amazon S3.
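
As a minimal sketch of that ingest step (assuming the boto3 library and a hypothetical S3 bucket and key layout), raw events can be landed in the lake exactly as they arrive, with no upfront modeling:

```python
import json
import boto3  # AWS SDK for Python

s3 = boto3.client("s3")

# A hypothetical raw IoT reading, stored exactly as it arrives
event = {"device_id": "sensor-42", "temp_c": 21.7, "ts": "2017-10-03T14:05:00Z"}

# Land the record in the bulk-storage layer with minimal processing
s3.put_object(
    Bucket="example-data-lake",               # hypothetical bucket name
    Key="raw/iot/2017/10/03/sensor-42.json",  # date-partitioned key
    Body=json.dumps(event).encode("utf-8"),
)
```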

Then the data that is widely used within the organization can be integrated and made available through a SQL or SQL-like interface, ranging from Hive to Postgres to a tried-and-true commercial relational database such as SQL Server (or its cloud-based cousin, Azure SQL Data Warehouse).

In this scenario, a handful of self-sufficient data scientists may wade (or swim or dive) in the surrounding data lake. However, most analysts in most organizations still spend most of their time using familiar SQL-capable tools to analyze data stored in the core of the managed data lake – an island in the lake if we really want to torture the analogy – which is typically implemented either using an RDBMS or a relational layer like Hive on top of the bulk-storage layer.
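
To illustrate how a SQL-like layer can sit on top of the bulk-storage layer, here is a minimal sketch using Apache Spark’s Python API against a hypothetical S3 path (the same idea applies to Hive or a relational core):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("managed-data-lake-sketch").getOrCreate()

# Read semi-structured JSON straight from the bulk-storage layer
readings = spark.read.json("s3a://example-data-lake/raw/iot/")

# Register the data as a table so analysts can work in plain SQL
readings.createOrReplaceTempView("iot_readings")

avg_by_device = spark.sql("""
    SELECT device_id, AVG(temp_c) AS avg_temp_c
    FROM iot_readings
    GROUP BY device_id
""")
avg_by_device.show()
```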

It’s important to note that these are not two discrete silos. Most major vendors have added capabilities to their database and BI offerings to enable analysis of both RDBMS-based and bulk-storage layer data through a familiar SQL interface.

This enables a much larger percentage of an organization’s analysts to access data both in the core and the less structured surrounding lake, using tools with which they’re already familiar.

As this hybrid managed data lake approach incorporates a relational core, robust data modeling capabilities are as important as ever. The same goes for data governance and a thorough focus on metadata to provide clear naming and definitions to assist in finding and linking with the most appropriate data.

This is true whether inside the structured relational core of the managed data lake or in the surrounding, more fluid data lake.

As you probably guessed from some of the links in this post, more and more managed data lakes are being implemented in the cloud. Please join us next time for the fifth installment in our series: Data Modeling in a Jargon-filled World – The Cloud.


NoSQL Database Adoption Is Poised to Explode

NoSQL database technology is gaining a lot of traction across industry. So what is it, and why is it increasing in use?

Techopedia defines NoSQL as “a class of database management systems (DBMS) that do not follow all of the rules of a relational DBMS and cannot use traditional SQL to query data.”

The rise of the NoSQL database

The rise of NoSQL can be attributed to the limitations of its predecessor: SQL databases were not conceived with today’s vast data volumes and storage requirements in mind.

Businesses, especially those with digital business models, are choosing to adopt NoSQL to help manage “the three Vs” of Big Data: increased volume, variety and velocity. Velocity in particular is driving NoSQL adoption because of the inevitable bottlenecks of SQL’s sequential data processing.

MongoDB, the fastest-growing supplier of NoSQL databases, notes this when comparing the traditional SQL relational database with the NoSQL database, saying “relational databases were not designed to cope with the scale and agility challenges that face modern applications, nor were they built to take advantage of the commodity storage and processing power available today.”

With all this in mind, we can see why the NoSQL database market is expected to reach $4.2 billion in value by 2020.

What’s next and why?

We can expect the adoption of NoSQL databases to continue growing, in large part because of Big Data’s continued growth.

And analysis indicates that data-driven decision-making improves productivity and profitability by 6%.

Businesses across industry appear to be picking up on this fact. An EY/Nimbus Ninety study found that 81% of companies understand the importance of data for improving efficiency and business performance.

However, understanding the importance of data to modern business isn’t enough. What 100% of organizations need to grasp is that strategic data analysis that produces useful insights has to start from a stable data management platform.

Gartner indicates that 90% of all data is unstructured, highlighting the need for dedicated data modeling efforts, and at a wider level, data management. Businesses can’t leave that 90% on the table because they don’t have the tools to properly manage it.

This is the crux of the Any2 data management approach – being able to manage “any data” from “anywhere.” NoSQL plays an important role in end-to-end data management by helping to accelerate the retrieval and analysis of Big Data.

The improved handling of data velocity is vital to becoming a successful digital business, one that can effectively respond in real time to customers, partners, suppliers and other parties, and profit from these efforts.

In fact, the velocity with which businesses are able to harness and query large volumes of unstructured, structured and semi-structured data in NoSQL databases makes them a critical asset for supporting modern cloud applications and their scale, speed and agile development demands.

For more data advice and best practices, follow us on Twitter and LinkedIn to stay up to date with the blog.

For a deeper dive into Taking Control of NoSQL Databases, get the FREE eBook below.



Data Modeling in a Jargon-filled World – NoSQL/NewSQL

In the first two posts of this series, we focused on the “volume” and “velocity” of Big Data, respectively.  In this post, we’ll cover “variety,” the third of Big Data’s “three Vs.” In particular, I plan to discuss NoSQL and NewSQL databases and their implications for data modeling.

As the volume and velocity of data available to organizations continues to rapidly increase, developers have chafed under the performance shackles of traditional relational databases and SQL.

An astonishing array of database solutions has arisen during the past decade to provide developers with higher-performance options for various aspects of managing their application data. These have been collectively labeled NoSQL databases.

Originally NoSQL meant that “no SQL” was required to interface with the database. In many cases, developers viewed this as a positive characteristic.

However, SQL is very useful for some tasks, with many organizations having rich SQL skillsets. Consequently, as more organizations demanded SQL as an option to complement some of the new NoSQL databases, the term NoSQL evolved to mean “not only SQL.” This way, SQL capabilities can be leveraged alongside other non-traditional characteristics.

Among the most popular of these new NoSQL options are document databases like MongoDB. MongoDB offers the flexibility to vary fields from document to document and change structure over time. Document databases typically store data in JSON-like documents, making it easy to map to objects in application code.
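
To make that flexibility concrete, here is a minimal sketch using PyMongo (the database, collection and field names are hypothetical) in which two documents in the same collection carry different fields, with no schema migration required:

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
customers = client["shop"]["customers"]

# Two documents with different shapes in the same collection
customers.insert_many([
    {"name": "Ada Lovelace", "email": "ada@example.com"},
    {"name": "Alan Turing", "email": "alan@example.com", "loyalty_tier": "gold"},
])

# Query on a field that only some documents contain
for doc in customers.find({"loyalty_tier": "gold"}):
    print(doc["name"])
```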

As the scale of NoSQL deployments in some organizations has rapidly grown, it has become increasingly important to have access to enterprise-grade tools to support modeling and management of NoSQL databases and to incorporate such databases into the broader enterprise data modeling and governance fold.

While document databases, key-value databases, graph databases and other types of NoSQL databases have added valuable options for developers to address various challenges posed by the “three Vs,” they did so largely by compromising consistency in favor of availability and speed, instead offering “eventual consistency.” Consequently, most NoSQL stores lack true ACID transactions, though there are exceptions, such as Aerospike and MarkLogic.

But some organizations are unwilling or unable to forgo consistency and transactional requirements, giving rise to a new class of modern relational database management systems (RDBMS) that aim to guarantee ACIDity while also providing the same level of scalability and performance offered by NoSQL databases.

NewSQL databases are typically designed to operate using a shared-nothing architecture. VoltDB is one prominent example of this emerging class of ACID-compliant NewSQL RDBMS. The logical design of NewSQL database schemas is similar to traditional RDBMS schema design, and thus they are well supported by popular enterprise-grade data modeling tools such as erwin DM.

Whatever mixture of databases your organization chooses to deploy for your OLTP requirements on premises and in the cloud – RDBMS, NoSQL and/or NewSQL – it’s as important as ever for data-driven organizations to be able to model their data and incorporate it into an overall architecture.

When it comes to organizations’ analytics requirements, including data that may be sourced from a wide range of NoSQL, NewSQL, RDBMS and unstructured sources, leading organizations are adopting a variety of approaches, including a hybrid approach that many refer to as Managed Data Lakes.

Please join us next time for the fourth installment in our series: Data Modeling in a Jargon-filled World – Managed Data Lakes.



erwin Brings NoSQL into the Enterprise Data Modeling and Governance Fold

“NoSQL is not an option — it has become a necessity to support next-generation applications.”


Data-Driven Business Transformation: the Data Foundation

In light of data’s prominence in modern business, organizations need to ensure they have a strong data foundation in place.

The ascent of data’s value has been as steep as it is staggering. In 2016, it was suggested that more data would be created in 2017 than in the previous 5,000 years of human history.

But what’s even more shocking is that the peak may still not even be in sight.

To put its value into context, the five most valuable businesses in the world all deal in data (Alphabet/Google, Amazon, Apple, Facebook and Microsoft). It’s even overtaken oil as the world’s most valuable resource.

Yet, even with data’s value being as high as it is, there’s still a long way to go. Many businesses are still getting to grips with data storage, management and analysis.

Fortune 1000 companies, for example, could earn another $65 million in net income with access to just 10 percent more of their data (from Data-Driven Business Transformation, 2017).

We’re already witnessing the beginnings of this increased potential across various industries. Data-driven businesses such as Airbnb, Uber and Netflix are all dominating, disrupting and revolutionizing their respective sectors.

Interestingly, although they provide very different services for the consumer, the organizations themselves all identify as data companies. This simple change in perception and outlook stresses the importance of data to their business models. For them, data analysis isn’t just an arm of the business… It’s the core.

Data foundation

The dominating data-driven businesses use data to influence almost everything: how decisions are made, how processes could be improved, and where the business should focus its innovation efforts.

However, simply establishing that your business could (and should) be getting more out of data doesn’t necessarily mean you’re ready to reap the rewards.

In fact, a hasty dive into a data strategy could actually slow your digital transformation efforts down. Hurried software investments made in response to disruption can lead to teething problems in the strategy’s adoption – and to shelfware – wasting time and money.

Additionally, oversights in the strategy’s implementation will stifle the very effectiveness you’re hoping to benefit from.

Therefore, when deciding to bolster your data efforts, a great place to start is to consider the ‘three Vs’.

The three Vs

The three Vs of data are volume, variety and velocity. Volume references the amount of data; variety, its different sources; and velocity, the speed at which it must be processed.

When you’re ready to start focusing on the business outcomes you hope data will provide, you can also stretch those three Vs to five. The five Vs add veracity (confidence in the data’s accuracy) and value to the aforementioned three, but for now we’ll stick to three.

As discussed, the total amount of data in the world is staggering. But the total data available to any one business can be huge in its own right (depending on the extent of your data strategy).

Unsurprisingly, vast volumes of data are sourced from a vast number of potential sources, and processing them takes dedicated tools. Even then, the sources are often disparate and very unlikely to offer worthwhile insight in a vacuum.

This is why it’s so important to have an assured data foundation on which to build a data platform.

A solid data foundation

The Any2 approach is a strategy for housing, sorting and analyzing data that aims to be that very foundation on which you build your data strategy.

Shorthand for “any data, anywhere,” Any2 can help clean up the disparate noise and let businesses drill down on and effectively analyze the data in order to yield more reliable and informative results.

It’s especially important today, as data sources are becoming increasingly unstructured, and so more difficult to manage.

Big data, for example, can consist of clickstream data, Internet of Things data, machine data and social media data. The sources need to be rationalized and correlated so they can be analyzed more effectively.

When it comes to actioning an Any2 approach, a fluid relationship between the various data initiatives involved is essential – those being data modeling, enterprise architecture, business process, and data governance.

It also requires collaboration, both between the aforementioned initiatives and with the wider business, to ensure everybody is working toward the same goal.

With a solid data foundation in place, your business can really begin realizing data’s potential for itself. You also ensure you’re not left behind as new disruptors enter the market and your competition continues to evolve.

For more data advice and best practices, follow us on Twitter and LinkedIn to stay up to date with the blog.

For a deeper dive into best practices for data, its benefits, and its applications, get the FREE whitepaper below.



GDPR guide: The role of the Data Protection Officer

Over the past few weeks, we’ve been exploring aspects of the new EU data protection law (GDPR), which will come into effect in 2018.