Categories
erwin Expert Blog Data Modeling

Integrating SQL and NoSQL into Data Modeling for Greater Business Value: The Latest Release of erwin Data Modeler


Due to the prevalence of internal and external market disruptors, many organizations are aligning their digital transformation and cloud migration efforts with other strategic requirements (e.g., compliance with the General Data Protection Regulation).

Accelerating the retrieval and analysis of data – so much of it unstructured – is vital to becoming a data-driven business that can effectively respond in real time to customers, partners, suppliers and other parties, and profit from these efforts. But even though speed is critical, businesses must take the time to model and document new applications for compliance and transparency.

For decades, data modeling has been the optimal way to design and deploy new relational databases with high-quality data sources and support application development. It facilitates communication between the business and system developers so stakeholders can understand the structure and meaning of enterprise data within a given context. Today, it provides even greater value because critical data exists in both structured and unstructured formats and lives both on premises and in the cloud.

Comparing SQL and NoSQL

While it may not be the most exciting matchup, there’s much to be said when comparing SQL vs. NoSQL databases. SQL databases use schemas and pre-defined tables, while NoSQL databases take the opposite approach: instead of schemas and tables, they store data in ways that depend on which kind of NoSQL database is being used.
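To make the contrast concrete, here is a minimal sketch in Python, using the standard-library sqlite3 module to stand in for a SQL database and plain dictionaries to stand in for documents. The book records are invented for illustration:

```python
import sqlite3

# SQL side: the schema must be declared before any rows can be stored.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, price REAL)")
conn.execute("INSERT INTO books (title, price) VALUES (?, ?)", ("Dune", 9.99))

# Document side: each record is self-describing, so two "books" can carry
# different fields without any schema change, as in a document database.
documents = [
    {"title": "Dune", "price": 9.99},
    {"title": "Neuromancer", "price": 7.49, "awards": ["Hugo", "Nebula"]},
]

row = conn.execute("SELECT title, price FROM books").fetchone()
print(row)  # ('Dune', 9.99)
```

Note that the second document carries an `awards` field the first one lacks; in the SQL table, adding that field would require an `ALTER TABLE` first.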

While the SQL and NoSQL worlds can complement each other in today’s data ecosystem, most enterprises need to focus on building expertise and processes for the latter format.

After all, they’ve already had decades of practice designing and managing SQL databases that emphasize storage efficiency and referential integrity rather than fast data access, which is so important to building cloud applications that deliver real-time value to staff, customers and other parties. Query-optimized modeling is the new watchword when it comes to supporting today’s fast-delivery, iterative and real-time applications.

DBMS products based on rigid schema requirements impede our ability to fully realize business opportunities that can expand the depth and breadth of relevant data streams for conversion into actionable information. New, business-transforming use cases often involve variable data feeds, real-time or near-time processing and analytics requirements, and the scale to process large volumes of data.

NoSQL databases, such as Couchbase and MongoDB, are purpose-built to handle the variety, velocity and volume of these new data use cases. Schema-less or dynamic schema capabilities, combined with increased processing speed and built-in scalability, make NoSQL the ideal platform.

Making the Move to NoSQL

Now the hard part. Once we’ve agreed to make the move to NoSQL, the next step is to identify the architectural and technological implications facing the folks tasked with building and maintaining these new mission-critical data sources and the applications they feed.

As the data modeling industry leader, erwin has identified a critical success factor for the majority of organizations adopting a NoSQL platform such as Couchbase, Cassandra or MongoDB. Successfully leveraging such a platform requires a significant paradigm shift in how we design NoSQL data structures and deploy the databases that manage them.

But as with most technology requirements, we need to shield the business from the complexity and risk associated with this new approach. The business cares little for the technical distinctions of the underlying data management “black box.”

Business data is business data, with the main concerns being its veracity and value. Accountability, transparency, quality and reusability are required, regardless. Data needs to be trusted, so decisions can be made with confidence, based on facts. We need to embrace this paradigm shift, while ensuring it fits seamlessly into our existing data management practices as well as interactions with our partners within the business. Therefore, the challenge of adopting NoSQL in an organization is two-fold: 1) mastering and managing this new technology and 2) integrating it into an expansive and complex infrastructure.

The Newest Release of erwin Data Modeler

There’s a reason erwin Data Modeler is the No.1 data modeling solution in the world.

And the newest release delivers all-in-one SQL and NoSQL data modeling, guided denormalization and model-driven engineering support for Couchbase, Cassandra, MongoDB, JSON and Avro. NoSQL users get all of the great capabilities inherent in erwin Data Modeler. It also provides Data Vault modeling, enhanced productivity, and simplified administration of the data modeling repository.

Now you can rely on one solution for all your enterprise data modeling needs, working across DBMS platforms, using modern modeling techniques for faster data value, and centrally governing all data definition, data modeling and database design initiatives.

erwin data models reduce complexity, making it easier to design, deploy and understand data sources to meet business needs. erwin Data Modeler also automates and standardizes model design tasks, including complex queries, to improve business alignment, ensure data integrity and simplify integration.

In addition to the above, the newest release of erwin Data Modeler by Quest also provides:

  • Updated support and certifications for the latest versions of Oracle, MS SQL Server, MS Azure SQL and MS Azure SQL Synapse
  • JDBC-connectivity options for Oracle, MS SQL Server, MS Azure SQL, Snowflake, Couchbase, Cassandra and MongoDB
  • Enhanced administration capabilities to simplify and accelerate data model access, collaboration, governance and reuse
  • New automation, connectivity, UI and workflow optimization to enhance data modeler productivity by reducing onerous manual tasks

erwin Data Modeler is a proven technology for improving the quality and agility of an organization’s overall data capability – and that includes data governance and data intelligence.

Click here for your free trial of erwin Data Modeler.


Data Modeling Best Practices for Data-Driven Organizations

As data-driven business becomes increasingly prominent, an understanding of data modeling and data modeling best practices is crucial. This post outlines just that, as well as other key questions related to data modeling, such as “SQL vs. NoSQL.”

What is Data Modeling?

Data modeling is a process that enables organizations to discover, design, visualize, standardize and deploy high-quality data assets through an intuitive, graphical interface.

Data models provide visualization, create additional metadata and standardize data design across the enterprise.

As the value of data and the way it is used by organizations has changed over the years, so too has data modeling.

In the modern context, data modeling is a function of data governance.

While data modeling has always been the best way to understand complex data sources and automate design standards, modern data modeling goes well beyond these domains to accelerate and ensure the overall success of data governance in any organization.

 

 

As well as keeping the business in compliance with data regulations, data governance – and data modeling – also drive innovation.

Companies that want to advance artificial intelligence (AI) initiatives, for instance, won’t get very far without quality data and well-defined data models.

With the right approach, data modeling promotes greater cohesion and success in organizations’ data strategies.

But what is the right data modeling approach?


Data Modeling Best Practices

The right approach to data modeling is one in which organizations can make the right data available at the right time to the right people. Otherwise, data-driven initiatives can stall.

Thanks to organizations like Amazon, Netflix and Uber, businesses have changed how they leverage their data and are transforming their business models to innovate – or risk becoming obsolete.

According to a 2018 survey by Tech Pro Research, 70 percent of respondents said their companies either have a digital transformation strategy in place or are working on one. And 60 percent of companies that have undertaken digital transformation have created new business models.

But data-driven business success doesn’t happen by accident. Organizations that adopt that strategy without the necessary processes, platforms and solutions quickly realize that data creates a lot of noise but not necessarily the right insights.

This phenomenon is perhaps best articulated through the lens of the “three Vs” of data: volume, variety and velocity.


Any2 Data Modeling and Navigating Data Chaos

The three Vs describe the volume (amount), variety (type) and velocity (speed at which it must be processed) of data.

Data’s value grows with context, and that context is itself found in data. That means there’s an incentive to generate and store higher volumes of data.

Typically, an increase in the volume of data leads to more data sources and types. And higher volumes and varieties of data become increasingly difficult to manage in a way that provides insight.

Without due diligence, the above factors can lead to a chaotic environment for data-driven organizations.

Therefore, the data modeling best practice is one that allows users to view any data from anywhere – a data governance and management best practice we dub “any-squared” (Any2).

Organizations that adopt the Any2 approach can expect greater consistency, clarity and artifact reuse across large-scale data integration, master data management, metadata management, Big Data and business intelligence/analytics initiatives.

SQL or NoSQL? The Advantages of NoSQL Data Modeling

For the most part, databases use Structured Query Language (SQL) for maintaining and manipulating data. This structured approach and its proficiency in handling complex queries have led to its widespread use.

But despite the advantages of such structure, its inherently sequential nature (“this,” then “this”) means it can be hard to operate holistically and to deal with large amounts of data at once.

Additionally, as alluded to earlier, the nature of modern, data-driven business and the three Vs mean organizations are dealing with increasing amounts of unstructured data.

As such, in a modern business context, the three Vs have become something of an Achilles’ heel for SQL databases.

The sheer rate at which businesses collect and store data – as well as the various types of data stored – mean organizations have to adapt and adopt databases that can be maintained with greater agility.

That’s where NoSQL comes in.

Benefits of NoSQL

Despite what many might assume, adopting a NoSQL database doesn’t mean abandoning SQL databases altogether. In fact, NoSQL is actually a contraction of “not only SQL.”

The NoSQL approach builds on the traditional SQL approach, bringing old (but still relevant) ideas in line with modern needs.

NoSQL databases are scalable, promote greater agility, and handle changes to data and the storing of new data more easily.

They’re better at dealing with other non-relational data too. NoSQL supports JavaScript Object Notation (JSON), log messages, XML and unstructured documents.

Data Modeling Is Different for Every Organization

It perhaps goes without saying, but different organizations have different needs.

For some, the legacy approach to databases meets the needs of their current data strategy and maturity level.

For others, the greater flexibility offered by NoSQL databases – and by extension NoSQL data modeling – makes them a necessity.

Some organizations may require an approach to data modeling that promotes collaboration.

Bringing data to the business and making it easy to access and understand increases the value of data assets, providing a return-on-investment and a return-on-opportunity. But neither would be possible without data modeling providing the backbone for metadata management and proper data governance.

Whatever the data modeling need, erwin can help you address it.

erwin DM is available in several versions, including erwin DM NoSQL, with additional options to improve the quality and agility of data capabilities.

And we just announced a new version of erwin DM, with a modern and customizable modeling environment, support for Amazon Redshift, updated support for the latest DB2 releases, time-saving modeling task automation, and more.

New to erwin DM? You can try the new erwin Data Modeler for yourself for free!



Five Benefits of an Automation Framework for Data Governance

Organizations are responsible for governing more data than ever before, making a strong automation framework a necessity. But what exactly is an automation framework and why does it matter?

In most companies, an incredible amount of data flows from multiple sources in a variety of formats and is constantly being moved and federated across a changing system landscape.

Often these enterprises are heavily regulated, so they need a well-defined data integration model that helps avoid data discrepancies and removes barriers to enterprise business intelligence and other meaningful use.

IT teams need the ability to smoothly generate hundreds of mappings and ETL jobs. They need their data mappings to fall under governance and audit controls, with instant access to dynamic impact analysis and lineage.

With an automation framework, data professionals can meet these needs at a fraction of the cost of the traditional manual way.

In data governance terms, an automation framework refers to a metadata-driven universal code generator that works hand in hand with enterprise data mapping for:

  • Pre-ETL enterprise data mapping
  • Governing metadata
  • Governing and versioning source-to-target mappings throughout the lifecycle
  • Data lineage, impact analysis and business rules repositories
  • Automated code generation

Such automation enables organizations to bypass bottlenecks, including human error and the time required to complete these tasks manually.

In fact, being able to rely on automated and repeatable processes can result in up to 50 percent in design savings, up to 70 percent conversion savings and up to 70 percent acceleration in total project delivery.

So without further ado, here are the five key benefits of an automation framework for data governance.


Benefits of an Automation Framework for Data Governance

  1. Creates simplicity, reliability, consistency and customization for the integrated development environment.

Code automation templates (CATs) can be created – for virtually any process and any tech platform – using the SDK scripting language or the solution’s published libraries to completely automate common, manual data integration tasks.

CATs are designed and developed by senior automation experts to ensure they are compliant with industry or corporate standards as well as with an organization’s best practice and design standards.

The 100-percent metadata-driven approach is critical to creating reliable and consistent CATs.

It is possible to scan, pull in and configure metadata sources and targets using standard or custom adapters and connectors for databases, ERP, cloud environments, files, data modeling, BI reports and Big Data to document data catalogs, data mappings, ETL (XML code) and even SQL procedures of any type.
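To illustrate the metadata-driven idea in the simplest possible terms, here is a hypothetical Python sketch that renders DDL from a metadata record. The table and column names are invented, and erwin’s actual CAT/SDK template format is not shown here:

```python
# Hypothetical illustration only: a toy metadata-driven code generator.
def generate_stage_ddl(table: dict) -> str:
    """Render a CREATE TABLE statement from a metadata record."""
    cols = ",\n  ".join(f'{c["name"]} {c["type"]}' for c in table["columns"])
    return f'CREATE TABLE {table["name"]} (\n  {cols}\n);'

# Metadata harvested (in a real framework) from a scanned source system.
customer = {
    "name": "stg_customer",
    "columns": [
        {"name": "customer_id", "type": "INTEGER"},
        {"name": "full_name", "type": "VARCHAR(100)"},
    ],
}

print(generate_stage_ddl(customer))
```

Because the generator only ever reads metadata, regenerating the DDL after a source change is a re-run rather than a manual edit, which is the core of the consistency argument above.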

  2. Provides blueprints anyone in the organization can use.

Stage DDL from source metadata for the target DBMS; profile and test SQL for test automation of data integration projects; generate source-to-target mappings and ETL jobs for leading ETL tools, among other capabilities.

It also can populate and maintain Big Data sets by generating Pig, Sqoop, MapReduce, Spark and Python scripts, and more.

  3. Incorporates data governance into the system development process.

An organization can achieve a more comprehensive and sustainable data governance initiative than it ever could with a homegrown solution.

An automation framework’s ability to automatically create, version, manage and document source-to-target mappings greatly matters both to data governance maturity and to a shorter time to value.

This eliminates duplication that occurs when project teams are siloed, as well as prevents the loss of knowledge capital due to employee attrition.

Another valuable capability is coordination between data governance and the SDLC, including automated metadata harvesting and cataloging from a wide array of sources for real-time metadata synchronization with core data governance capabilities and artifacts.

  4. Proves the value of data lineage and impact analysis for governance and risk assessment.

Automated reverse-engineering of ETL code into natural language enables a more intuitive lineage view for data governance.

With end-to-end lineage, it is possible to view data movement from source to stage, stage to EDW, and on to a federation of marts and reporting structures, providing a comprehensive and detailed view of data in motion.

The process includes leveraging existing mapping documentation and auto-documented mappings to quickly render graphical source-to-target lineage views including transformation logic that can be shared across the enterprise.

Similarly, impact analysis – which involves data mapping and lineage across tables, columns, systems, business rules, projects, mappings and ETL processes – provides insight into potential data risks and enables fast and thorough remediation when needed.

Achieving that impact analysis across the organization while meeting regulatory compliance requires detailed data mapping and lineage.


  5. Supports a wide spectrum of business needs.

Intelligent automation delivers enhanced capability, increased efficiency and effective collaboration to every stakeholder in the data value chain: data stewards, architects, scientists, analysts, business intelligence developers, IT professionals and business consumers.

It makes it easier for them to handle jobs such as data warehousing by leveraging source-to-target mapping and ETL code generation and job standardization.

It’s easier to map, move and test data for regular maintenance of existing structures, movement from legacy systems to new systems during a merger or acquisition, or a modernization effort.

erwin’s Approach to Automation for Data Governance: The erwin Automation Framework

Mature and sustainable data governance requires collaboration from both IT and the business, backed by a technology platform that accelerates the time to data intelligence.

Part of the erwin EDGE portfolio for an “enterprise data governance experience,” the erwin Automation Framework transforms enterprise data into accurate and actionable insights by connecting all the pieces of the data management and data governance lifecycle.

As with all erwin solutions, it embraces any data from anywhere (Any2) with automation for relational, unstructured, on-premise and cloud-based data assets and data movement specifications harvested and coupled with CATs.

If your organization would like to realize all the benefits explained above – and gain an “edge” in how it approaches data governance, you can start by joining one of our weekly demos for erwin Mapping Manager.



SQL, NoSQL or NewSQL: Evaluating Your Database Options

A common question in the modern data management space involves database technology: SQL, NoSQL or NewSQL?

But there isn’t a one-size-fits-all answer. What’s “right” must be evaluated on a case-by-case basis and is dependent on data maturity.

For example, a large bookstore chain with a big-data initiative would be stifled by a SQL database. The advantages that could be gained from analyzing social media data (for popular books, consumer buying habits) couldn’t be realized effectively through sequential analysis. There’s too much data involved in this approach, with too many threads to follow.

However, an independent bookstore isn’t necessarily bound to a big-data approach because it may not have a mature data strategy. It might not have ventured beyond digitizing customer records, and a SQL database is sufficient for that work.

Having said that, the “SQL, NoSQL or NewSQL” question is gaining prominence because businesses are becoming increasingly data-driven.

A Progress study found that 85 percent of enterprise decision-makers feel they have only two years to make significant inroads into digital transformation before they fall behind their competitors and suffer financially.

Considering these statistics, what better time than now to evaluate your database technology? The “SQL, NoSQL or NewSQL” question is especially important if you intend to become more data-driven.

SQL, NoSQL or NewSQL: Advantages and Disadvantages

SQL

SQL databases are tried and tested, proven to work on disks using interfaces with which businesses are already familiar.

As the longest-standing type of database, plenty of SQL options are available. This competitive market means you’ll likely find what you’re looking for at affordable prices.

Additionally, businesses in the earlier stages of data maturity are more likely to have a SQL database at work already, meaning no new investments need to be made.

However, in the modern digital business context, SQL databases weren’t made to support the three Vs of data. The volume is too high, the variety of sources is too vast, and the velocity (the speed at which the data must be processed) is too great for it all to be analyzed in sequence.

Furthermore, the foundational, legacy IT world they were purpose-built to serve has evolved. Now, corporate IT departments must be agile, and their databases must be agile and scalable to match.

NoSQL

Despite its name, “NoSQL” doesn’t mean the complete absence of the SQL database approach. Rather, it works as more of a hybrid. The term is a contraction of “not only SQL.”

So, in addition to the advantage of continuity that staying with SQL offers, NoSQL enjoys many of the benefits of SQL databases.

The key difference is that NoSQL databases were developed with modern IT in mind. They are scalable, agile and purpose-built to deal with disparate, high-volume data.

Hence, data is typically more readily available, and changes, storage and the insertion of new data are all handled more easily.

For example, MongoDB, one of the key players in the NoSQL world, uses JavaScript Object Notation (JSON). As the company explains, “A JSON database returns query results that can be easily parsed, with little or no transformation.” The open, human- and machine-readable standard facilitates data interchange and can store records, “just as tables and rows store records in a relational database.”

Generally, NoSQL databases are better equipped to deal with other non-relational data too. As well as JSON, NoSQL supports log messages, XML and unstructured documents. This support avoids the lethargic “schema-on-write” approach, opting for “schema-on-read” instead.
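The schema-on-write vs. schema-on-read distinction can be sketched in a few lines of Python. This is an illustrative toy, not any particular database’s API; the record shapes are invented:

```python
import json

# Schema-on-write: validate a record against a declared shape before storing it.
SCHEMA = {"title": str, "price": float}

def write_checked(store, record):
    for field_name, field_type in SCHEMA.items():
        if not isinstance(record.get(field_name), field_type):
            raise ValueError(f"schema violation on {field_name!r}")
    store.append(record)

# Schema-on-read: store raw JSON as-is and interpret fields only at query time.
raw_store = [
    '{"title": "Dune", "price": 9.99}',
    '{"title": "Neuromancer", "format": "paperback"}',  # no price; accepted anyway
]

def read_price(doc_text, default=0.0):
    return json.loads(doc_text).get("price", default)

prices = [read_price(d) for d in raw_store]
print(prices)  # [9.99, 0.0]
```

The write-side check rejects malformed data up front (at the cost of upfront schema design), while the read-side approach accepts anything and pushes interpretation to query time.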

NewSQL

NewSQL refers to databases based on the relational (SQL) database and SQL query language. In an attempt to solve some of the problems of SQL, the likes of VoltDB and others take a best-of-both-worlds approach, marrying the familiarity of SQL with the scalability and agile enablement of NoSQL.

However, as with most seemingly win-win opportunities, NewSQL isn’t without its caveats. These vary from vendor to vendor, but in essence, you have to sacrifice either some familiarity or some scalability.

If you’d like to speak with someone at erwin about SQL, NoSQL or NewSQL in more detail, click here.

For more industry advice, subscribe to the erwin Expert Blog.



Data Modeling is Changing – Time to Make NoSQL Technology a Priority

As the amount of data enterprises are tasked with managing increases, the benefits of NoSQL technology are becoming more apparent. 


Data Modeling in a Jargon-filled World – NoSQL/NewSQL

In the first two posts of this series, we focused on the “volume” and “velocity” of Big Data, respectively.  In this post, we’ll cover “variety,” the third of Big Data’s “three Vs.” In particular, I plan to discuss NoSQL and NewSQL databases and their implications for data modeling.

As the volume and velocity of data available to organizations continues to rapidly increase, developers have chafed under the performance shackles of traditional relational databases and SQL.

An astonishing array of database solutions has arisen during the past decade to provide developers with higher-performance solutions for various aspects of managing their application data. These have been collectively labeled NoSQL databases.

Originally NoSQL meant that “no SQL” was required to interface with the database. In many cases, developers viewed this as a positive characteristic.

However, SQL is very useful for some tasks, with many organizations having rich SQL skillsets. Consequently, as more organizations demanded SQL as an option to complement some of the new NoSQL databases, the term NoSQL evolved to mean “not only SQL.” This way, SQL capabilities can be leveraged alongside other non-traditional characteristics.

Among the most popular of these new NoSQL options are document databases like MongoDB. MongoDB offers the flexibility to vary fields from document to document and change structure over time. Document databases typically store data in JSON-like documents, making it easy to map to objects in application code.
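As a small illustration of that flexibility, here is a hedged Python sketch showing two differently shaped documents mapping onto the same application object. The Book class and its fields are invented for this example and are not a MongoDB API:

```python
from dataclasses import dataclass, field

# Illustrative only: an invented application object, not a real database schema.
@dataclass
class Book:
    title: str
    price: float = 0.0
    tags: list = field(default_factory=list)

# Two documents with different shapes map onto the same object type.
docs = [
    {"title": "Dune", "price": 9.99},
    {"title": "Neuromancer", "tags": ["cyberpunk"]},  # no price field: still valid
]

books = [Book(**d) for d in docs]
print(books[1].tags)  # ['cyberpunk']
```

Because JSON-like documents are essentially nested dictionaries, the mapping to objects is a direct keyword-argument expansion, which is one reason document databases feel natural from application code.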

As the scale of NoSQL deployments in some organizations has rapidly grown, it has become increasingly important to have access to enterprise-grade tools to support modeling and management of NoSQL databases and to incorporate such databases into the broader enterprise data modeling and governance fold.

While document databases, key-value databases, graph databases and other types of NoSQL databases have added valuable options for developers to address various challenges posed by the “three Vs,” they did so largely by compromising consistency in favor of availability and speed, instead offering “eventual consistency.” Consequently, most NoSQL stores lack true ACID transactions, though there are exceptions, such as Aerospike and MarkLogic.

But some organizations are unwilling or unable to forgo consistency and transactional requirements, giving rise to a new class of modern relational database management systems (RDBMS) that aim to guarantee ACIDity while also providing the same level of scalability and performance offered by NoSQL databases.

NewSQL databases are typically designed to operate using a shared nothing architecture. VoltDB is one prominent example of this emerging class of ACID-compliant NewSQL RDBMS. The logical design for NewSQL database schemas is similar to traditional RDBMS schema design, and thus, they are well supported by popular enterprise-grade data modeling tools such as erwin DM.

Whatever mixture of databases your organization chooses to deploy for your OLTP requirements on premises and in the cloud – RDBMS, NoSQL and/or NewSQL – it’s as important as ever for data-driven organizations to be able to model their data and incorporate it into an overall architecture.

When it comes to organizations’ analytics requirements, including data that may be sourced from a wide range of NoSQL, NewSQL RDBMS and unstructured sources, leading organizations are adopting a variety of approaches, including a hybrid approach that many refer to as Managed Data Lakes.

Please join us next time for the fourth installment in our series: Data Modeling in a Jargon-filled World – Managed Data Lakes.



Data Modeling in a Jargon-filled World – Big Data & MPP

By now, you’ve likely heard a lot about Big Data. You may have even heard about “the three Vs” of Big Data. As originally defined by Gartner, Big Data is “high-volume, high-velocity, and/or high-variety information assets that require new forms of processing to enable enhanced decision-making, insight discovery and process optimization.”


The Rise of NoSQL and NoSQL Data Modeling

With NoSQL data modeling gaining traction, data governance isn’t the only data shakeup organizations are currently facing.