Categories
erwin Expert Blog Data Modeling

erwin, Microsoft and the Power of the Common Data Model

What is Microsoft’s Common Data Model (CDM), and why is it so powerful?

Imagine if every person in your organization spoke a different language, and you had no simple way to translate what they were saying? It would make your work frustrating, complicated and slow.

The same is true for data, with a number of vendors creating data models by vertical industry (financial services, healthcare, etc.) and making them commercially available to improve how organizations understand and work with their data assets. The CDM takes this concept to the next level.

Microsoft has delivered a critical building block for the data-driven enterprise by capturing proven business data constructs and semantic descriptors for data across a wide range of business domains in the CDM and providing the contents in an open-source format for consumption and integration. The CDM provides a best-practices approach to defining data to accelerate data literacy, automation, integration and governance across the enterprise.

Why Is the CDM Such a Big Deal?

The value of the CDM shows up in multiple ways. One is enabling data to be unified. Another is reducing the time and effort spent on manual mapping – ultimately saving the organization money.

By having a single definition of something, complex ETL doesn’t have to be performed repeatedly. Once something is defined, everyone can map to the standard definition of what the data means.
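The "define once, map everyone to it" idea can be sketched in a few lines of Python. This is only an illustration: the canonical field names, the two source systems and their column names are all made up for the example and are not taken from the actual CDM schema.

```python
# Hypothetical shared definition of a "contact"; NOT the real CDM entity.
CANONICAL_FIELDS = ("full_name", "email")

# One mapping per source system: source column -> canonical field.
# Both system names and column names are assumptions for this sketch.
SOURCE_MAPPINGS = {
    "crm": {"ContactName": "full_name", "EmailAddr": "email"},
    "billing": {"cust_nm": "full_name", "cust_email": "email"},
}


def to_canonical(source, record):
    """Rename a source record's fields to the shared definition."""
    mapping = SOURCE_MAPPINGS[source]
    out = {field: None for field in CANONICAL_FIELDS}
    for src_key, value in record.items():
        if src_key in mapping:  # columns outside the definition are dropped
            out[mapping[src_key]] = value
    return out


crm_row = {"ContactName": "Ada Lovelace", "EmailAddr": "ada@example.com"}
billing_row = {"cust_nm": "Ada Lovelace", "cust_email": "ada@example.com", "balance": 10}

unified = [to_canonical("crm", crm_row), to_canonical("billing", billing_row)]
```

Each source needs exactly one mapping to the shared definition, rather than a separate translation for every pair of systems that exchange data.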

Beyond saving time, effort and money, CDM can help transform your business in even more ways, including:

  • Innovation: With data having a common meaning, the business can unlock new scenarios, like modern and advanced analytics, experiential analytics, AI, email, etc.
  • Insights: Given the meaning of the data is the same, regardless of the domain it came from, an organization can use its data to power business insights.
  • Compliance: It improves data governance to comply with such regulations as the General Data Protection Regulation (GDPR).
  • Cloud migration and other data platform modernization efforts.

Once the organization understands what something is, and it is commonly understood across the enterprise, anyone can build semantically aware reporting and analytical requirements plus deliver a uniform view because there is a common understanding of data.


erwin Expands Collaboration with Microsoft

The combination of Microsoft’s CDM with erwin’s industry-leading data modeling, governance and automation solutions can optimize an organization’s data capability and accelerate the impact and business value of enterprise data.

erwin recently announced its expanded collaboration with Microsoft. By working together, the companies will help organizations get a handle on disparate data, put it in one place, and then determine how to do something meaningful with it.

The erwin solutions that use Microsoft’s CDM are:

erwin Data Modeler: erwin DM automatically transforms the CDM into a graphical model, complete with business-data constructs and semantic metadata, to feed your existing data-source models and new database designs – regardless of the technology upon which these structures are deployed.

erwin DM’s reusable model templates, design layer and model compare/synchronization capabilities, combined with our design lifecycle and modeler collaboration services, enable organizations to capture and use CDM contents and best practices to optimize enterprise data definition, design and deployment.

erwin DM also enables the reuse of the CDM in the design and maintenance of enterprise data sources. It automatically consumes, integrates and maintains CDM metadata in a standardized, reusable design and supports logical and physical modeling and integration with all major DBMS technologies.

The erwin Data Intelligence Suite: erwin DI automatically scans, captures and activates metadata from the CDM into a central business glossary. Here, it is intelligently integrated and connected to the metadata from the data sources that feed enterprise applications.

Your comprehensive metadata landscape, including CDM metadata, is governed with the appropriate terminology, policies, rules and other business classifications you decide to build into your framework.

The resulting data intelligence is then discoverable via a self-service business user portal that provides role-based, contextual views. All this metadata-driven automation is possible thanks to erwin DI’s ability to consume and associate CDM metadata to create a data intelligence framework.

erwin and Microsoft recently co-presented a session on the power of the CDM that included a demonstration of how to create a data lake for disparate data sources, migrate all that data to it, and then provide business users with contextual views of the underlying metadata, based on a CDM-enriched business glossary.

The session also covered the automatic generation of scripts for ETL tools, as well as the automatic generation of data lineage diagrams and impact analysis, so data governance is built in and continuous.

You can watch the full erwin/Microsoft session here.



Data Modeling Best Practices for Data-Driven Organizations

As data-driven business becomes increasingly prominent, an understanding of data modeling and data modeling best practices is crucial. This post outlines just that, and other key questions related to data modeling such as “SQL vs. NoSQL.”

What is Data Modeling?

Data modeling is a process that enables organizations to discover, design, visualize, standardize and deploy high-quality data assets through an intuitive, graphical interface.

Data models provide visualization, create additional metadata and standardize data design across the enterprise.

As the value of data and the way it is used by organizations has changed over the years, so too has data modeling.

In the modern context, data modeling is a function of data governance.

While data modeling has always been the best way to understand complex data sources and automate design standards, modern data modeling goes well beyond these domains to accelerate and ensure the overall success of data governance in any organization.

As well as keeping the business in compliance with data regulations, data governance – and data modeling – also drive innovation.

Companies that want to advance artificial intelligence (AI) initiatives, for instance, won’t get very far without quality data and well-defined data models.

With the right approach, data modeling promotes greater cohesion and success in organizations’ data strategies.

But what is the right data modeling approach?


Data Modeling Best Practices

The right approach to data modeling is one in which organizations can make the right data available at the right time to the right people. Otherwise, data-driven initiatives can stall.

Thanks to organizations like Amazon, Netflix and Uber, businesses have changed how they leverage their data and are transforming their business models to innovate – or risk becoming obsolete.

According to a 2018 survey by Tech Pro Research, 70 percent of respondents said their companies either have a digital transformation strategy in place or are working on one. And 60 percent of companies that have undertaken digital transformation have created new business models.

But data-driven business success doesn’t happen by accident. Organizations that adopt that strategy without the necessary processes, platforms and solutions quickly realize that data creates a lot of noise but not necessarily the right insights.

This phenomenon is perhaps best articulated through the lens of the “three Vs” of data: volume, variety and velocity.


Any2 Data Modeling and Navigating Data Chaos

The three Vs describe the volume (amount), variety (type) and velocity (speed at which it must be processed) of data.

Data’s value grows with context, and such context is found within data. That means there’s an incentive to generate and store higher volumes of data.

Typically, an increase in the volume of data leads to more data sources and types. And higher volumes and varieties of data become increasingly difficult to manage in a way that provides insight.

Without due diligence, the above factors can lead to a chaotic environment for data-driven organizations.

Therefore, the data modeling best practice is one that allows users to view any data from anywhere – a data governance and management best practice we dub “any-squared” (Any2).

Organizations that adopt the Any2 approach can expect greater consistency, clarity and artifact reuse across large-scale data integration, master data management, metadata management, Big Data and business intelligence/analytics initiatives.

SQL or NoSQL? The Advantages of NoSQL Data Modeling

For the most part, databases use “structured query language” (SQL) for maintaining and manipulating data. This structured approach and its proficiency in handling complex queries have led to its widespread use.

But despite the advantages of such structure, its inherent sequential nature (“this, then this”) means it can be hard to operate holistically and deal with large amounts of data at once.

Additionally, as alluded to earlier, the nature of modern, data-driven business and the three Vs means organizations are dealing with increasing amounts of unstructured data.

As such, in a modern business context, the three Vs have become somewhat of an Achilles’ heel for SQL databases.

The sheer rate at which businesses collect and store data – as well as the various types of data stored – means organizations have to adapt and adopt databases that can be maintained with greater agility.

That’s where NoSQL comes in.

Benefits of NoSQL

Despite what many might assume, adopting a NoSQL database doesn’t mean abandoning SQL databases altogether. In fact, NoSQL is actually a contraction of “not only SQL.”

The NoSQL approach builds on the traditional SQL approach, bringing old (but still relevant) ideas in line with modern needs.

NoSQL databases are scalable, promote greater agility, and handle changes to data and the storing of new data more easily.

They’re better at dealing with other non-relational data too. NoSQL supports JavaScript Object Notation (JSON), log messages, XML and unstructured documents.

Data Modeling Is Different for Every Organization

It perhaps goes without saying, but different organizations have different needs.

For some, the legacy approach to databases meets the needs of their current data strategy and maturity level.

For others, the greater flexibility offered by NoSQL databases makes NoSQL databases, and by extension NoSQL data modeling, a necessity.

Some organizations may require an approach to data modeling that promotes collaboration.

Bringing data to the business and making it easy to access and understand increases the value of data assets, providing a return-on-investment and a return-on-opportunity. But neither would be possible without data modeling providing the backbone for metadata management and proper data governance.

Whatever the data modeling need, erwin can help you address it.

erwin DM is available in several versions, including erwin DM NoSQL, with additional options to improve the quality and agility of data capabilities.

And we just announced a new version of erwin DM, with a modern and customizable modeling environment; support for Amazon Redshift; updated support for the latest DB2 releases; time-saving modeling task automation; and more.

New to erwin DM? You can try the new erwin Data Modeler for yourself for free!



Data Modeling in a Jargon-filled World – NoSQL/NewSQL

In the first two posts of this series, we focused on the “volume” and “velocity” of Big Data, respectively. In this post, we’ll cover “variety,” the third of Big Data’s “three Vs.” In particular, I plan to discuss NoSQL and NewSQL databases and their implications for data modeling.

As the volume and velocity of data available to organizations continues to rapidly increase, developers have chafed under the performance shackles of traditional relational databases and SQL.

An astonishing array of database solutions have arisen during the past decade to provide developers with higher performance solutions for various aspects of managing their application data. These have been collectively labeled as NoSQL databases.

Originally NoSQL meant that “no SQL” was required to interface with the database. In many cases, developers viewed this as a positive characteristic.

However, SQL is very useful for some tasks, with many organizations having rich SQL skillsets. Consequently, as more organizations demanded SQL as an option to complement some of the new NoSQL databases, the term NoSQL evolved to mean “not only SQL.” This way, SQL capabilities can be leveraged alongside other non-traditional characteristics.

Among the most popular of these new NoSQL options are document databases like MongoDB. MongoDB offers the flexibility to vary fields from document to document and change structure over time. Document databases typically store data in JSON-like documents, making it easy to map to objects in application code.
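Here is a minimal sketch of what "fields vary from document to document" and "easy to map to objects" look like in practice. It uses plain JSON strings and Python's standard library rather than a MongoDB driver, and the `Customer` class and its fields are invented for the example.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Customer:
    """Hypothetical application object; optional fields get defaults."""
    name: str
    email: Optional[str] = None
    tags: List[str] = field(default_factory=list)


# Two JSON documents from the same (imaginary) collection with different fields.
raw_documents = [
    '{"name": "Ada", "email": "ada@example.com"}',
    '{"name": "Grace", "tags": ["vip", "beta"]}',
]

# Each document maps directly onto the object; missing fields fall back to defaults.
customers = [Customer(**json.loads(doc)) for doc in raw_documents]
```

The flexibility cuts both ways: nothing in the store forces every document to carry an `email`, so that rule has to live in the application or in a governed data model.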

As the scale of NoSQL deployments in some organizations has rapidly grown, it has become increasingly important to have access to enterprise-grade tools to support modeling and management of NoSQL databases and to incorporate such databases into the broader enterprise data modeling and governance fold.

While document databases, key-value databases, graph databases and other types of NoSQL databases have added valuable options for developers to address various challenges posed by the “three Vs,” they did so largely by compromising consistency in favor of availability and speed, instead offering “eventual consistency.” Consequently, most NoSQL stores lack true ACID transactions, though there are exceptions, such as Aerospike and MarkLogic.
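To illustrate what "eventual consistency" means for a reader, here is a toy model in Python. It is a deliberately simplified sketch with one primary and one secondary replica; the class and method names are made up, and real NoSQL stores use far more sophisticated replication.

```python
class Replica:
    """A single copy of the data."""

    def __init__(self):
        self.data = {}


class EventuallyConsistentStore:
    """Toy model: writes land on a primary and reach the secondary later."""

    def __init__(self):
        self.primary = Replica()
        self.secondary = Replica()
        self.pending = []  # writes accepted but not yet replicated

    def write(self, key, value):
        # The write is acknowledged as soon as the primary has it.
        self.primary.data[key] = value
        self.pending.append((key, value))

    def read_from_secondary(self, key):
        return self.secondary.data.get(key)

    def replicate(self):
        # Apply queued writes to the secondary, in order.
        for key, value in self.pending:
            self.secondary.data[key] = value
        self.pending.clear()


store = EventuallyConsistentStore()
store.write("stock", 5)
stale = store.read_from_secondary("stock")  # replication has not run yet
store.replicate()
fresh = store.read_from_secondary("stock")  # the replicas have converged
```

Between the write and the replication step, a reader on the secondary sees no value at all. That window is the consistency being traded for availability and speed.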

But some organizations are unwilling or unable to forgo consistency and transactional requirements, giving rise to a new class of modern relational database management systems (RDBMS) that aim to guarantee ACIDity while also providing the same level of scalability and performance offered by NoSQL databases.
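For readers who want a concrete picture of the transactional guarantee at stake, the sketch below uses Python's built-in `sqlite3` module as a stand-in for any ACID-compliant database (it is not a NewSQL system). The `account` table and the `transfer` helper are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('a', 100), ('b', 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "a", "b", 40)         # succeeds
rejected = transfer(conn, "a", "b", 500)  # would overdraw 'a', so it is rolled back
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

In the rejected transfer, the credit to `b` has already been applied when the debit fails, yet the rollback undoes it. That all-or-nothing behavior is what these organizations are unwilling to give up.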

NewSQL databases are typically designed to operate using a shared-nothing architecture. VoltDB is one prominent example of this emerging class of ACID-compliant NewSQL RDBMS. The logical design for NewSQL database schemas is similar to traditional RDBMS schema design, and thus, they are well supported by popular enterprise-grade data modeling tools such as erwin DM.
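To make "shared nothing" a little more concrete, the following Python sketch routes each row to exactly one node by hashing its key, so no node shares storage with another. It is a generic illustration of hash partitioning, not a description of how VoltDB or any specific product works; the class names and node names are assumptions.

```python
import zlib


class Node:
    """One independent node with its own private storage."""

    def __init__(self, name):
        self.name = name
        self.rows = {}


class ShardedTable:
    """Routes every key to exactly one node; nodes share no data."""

    def __init__(self, node_names):
        self.nodes = [Node(n) for n in node_names]

    def _owner(self, key):
        # crc32 gives a stable hash, so a key always maps to the same node.
        return self.nodes[zlib.crc32(key.encode("utf-8")) % len(self.nodes)]

    def put(self, key, row):
        self._owner(key).rows[key] = row

    def get(self, key):
        return self._owner(key).rows.get(key)


table = ShardedTable(["node-a", "node-b", "node-c"])
for customer_id in ("c1", "c2", "c3", "c4"):
    table.put(customer_id, {"id": customer_id})
```

Because each node owns its slice of the data outright, single-key work on different nodes does not contend for shared disks or memory, which is where much of the scalability comes from.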

Whatever mixture of databases your organization chooses to deploy for your OLTP requirements on premises and in the cloud – RDBMS, NoSQL and/or NewSQL – it’s as important as ever for data-driven organizations to be able to model their data and incorporate it into an overall architecture.

When it comes to organizations’ analytics requirements, including data that may be sourced from a wide range of NoSQL, NewSQL RDBMS and unstructured sources, leading organizations are adopting a variety of approaches, including a hybrid approach that many refer to as Managed Data Lakes.

Please join us next time for the fourth installment in our series: Data Modeling in a Jargon-filled World – Managed Data Lakes.



Why the NoSQL Database is a Necessary Step

The NoSQL database is gaining huge traction and for good reason.

Traditionally, most organizations have leveraged relational databases to manage their data. Relational databases ensure the referential integrity, constraints, normalization and structured access for data across disparate tools, which is why they’re so widely used.
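As a small illustration of the referential integrity and constraints mentioned above, this sketch uses Python's built-in `sqlite3` module as a convenient stand-in for a relational database. The `customer` and `purchase` tables are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE purchase ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(id),"
    " amount INTEGER NOT NULL CHECK (amount > 0))"
)
conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO purchase (customer_id, amount) VALUES (1, 25)")


def try_insert(customer_id, amount):
    """Attempt an insert and report whether the database accepted it."""
    try:
        conn.execute(
            "INSERT INTO purchase (customer_id, amount) VALUES (?, ?)",
            (customer_id, amount),
        )
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


orphan = try_insert(99, 10)   # no customer 99 exists: foreign key violation
negative = try_insert(1, -5)  # breaks the CHECK constraint on amount
purchase_count = conn.execute("SELECT COUNT(*) FROM purchase").fetchone()[0]
```

The database itself refuses both bad rows, so every tool that writes to it gets the same rules without reimplementing them.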

But as with any technology, evolving trends and requirements eventually push the limits of capability and suitability for emerging business use cases.

New data sources, characterized by increased volume, variety and velocity, have exposed limitations in the strict relational approach to managing data. These characteristics require a more flexible approach to the storage and provisioning of data assets that can support these new forms of data with the agility and scalability they demand.

Technology – specifically data – has changed the way organizations operate. Lower development costs are allowing startups and smaller businesses to grow far quicker. In turn, this leads to less stable markets and more frequent disruptions.

As more and more organizations look to cut their own slice of the data pie, businesses are more focused on in-house development than ever.

This is where relational data modeling becomes somewhat of a stumbling block.

Rise of the NoSQL Database

More and more, application developers are turning to the NoSQL database.

The NoSQL database is a more flexible approach that enables increased agility in development teams. Data models can be evolved on the fly to account for changing application requirements.
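One common way to evolve a document model "on the fly" is to upgrade older documents as they are read, instead of migrating everything up front. The Python sketch below shows that pattern with plain dictionaries; the `schema_version` field and the name-splitting change are hypothetical choices for the example, not a feature of any particular database.

```python
def upgrade(doc):
    """Bring an older document shape up to the current one on read."""
    doc = dict(doc)  # work on a copy so the stored document is untouched
    if doc.get("schema_version", 1) == 1:
        # Version 1 stored a single "name"; version 2 splits it in two.
        first, _, last = doc.pop("name").partition(" ")
        doc.update(first_name=first, last_name=last, schema_version=2)
    return doc


# Old and new document shapes living side by side in the same collection.
stored = [
    {"name": "Ada Lovelace"},
    {"schema_version": 2, "first_name": "Grace", "last_name": "Hopper"},
]

current = [upgrade(d) for d in stored]
```

The application can ship the new shape immediately while older documents keep working, which is the agility described above. It also means the set of shapes in play needs to be tracked somewhere, which is where modeling and governance come back in.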

This enables businesses to adopt an agile approach to releasing new iterations and code. NoSQL databases are scalable and object oriented, and can also handle large volumes of structured, semi-structured and unstructured data.

Due to the growing deployment of NoSQL and the fact that our customers need the same tools to manage them as their relational databases, erwin is excited to announce the availability of a beta program for our new erwin DM for NoSQL product.

With our new erwin DM NoSQL option, we’re the only provider to help you model, govern and manage your unstructured cloud data just like any other traditional database in your business.

  • Building new cloud-based apps running on MongoDB?
  • Migrating from a relational database to MongoDB or the reverse?
  • Want to ensure that all your data is governed by a logical enterprise model, no matter where it’s located?

Then erwin DM NoSQL is the right solution for you. Click here to apply for our erwin DM NoSQL/MongoDB beta program now.

And look for more info here on the power and potential of NoSQL databases in the coming weeks.
