Categories
erwin Expert Blog

Data Intelligence in the Next Normal: Why, Who and When?

While many believe that the dawn of a new year represents a clean slate or a blank canvas, we simply don’t leave the past behind by merely flipping over a page in the calendar.

As we enter 2021, we will also be building off the events of 2020 – both positive and negative – including the acceleration of digital transformation as the next normal begins to be defined.


As the pandemic took hold, IDC surveyed technology users and decision makers around the globe, reaching out every two weeks until September, when the survey frequency shifted to monthly. These surveys helped IDC develop a model that describes the five stages of enterprise recovery, aligning business focus with the economic situation:

  • When the COVID-19 crisis hit, organizations focused on business continuity.
  • As the economy slowed, they focused on cost optimization.
  • In the recession period, their focus turned to business resiliency.
  • As the economy returns to growth, organizations are making targeted investments.
  • When we enter into the next normal, the future enterprise will emerge.

The IDC surveys explored how the crisis impacted budgets across different areas of IT, from hardware and networking, to software and professional services. When the pandemic first hit, there was some negative impact on big data and analytics spending.

However, the economic situation changed as time went on. Digital transformation was accelerated, and budgets for spending on big data and analytics increased. This spending has continued during the return to growth, with more organizations moving toward becoming the future enterprise.

I have long stated that data is the lifeblood of digital transformation, and if the pandemic really has accelerated digital transformation, then the trends reported in IDC’s worldwide surveys make sense.

But data without intelligence is just data, and this is WHY data intelligence is required.

Data intelligence is a key input to data enablement in the digital enterprise, both by improving data literacy among data-native workers and by assuring the right data is being used at the right time, and for the right reason(s).

WHO needs to be involved in implementing and using data intelligence in the digital enterprise?

There is an ever-growing number of roles that work with data daily to complete tasks, make decisions, and affect business outcomes. These roles range from technical to business, from operations to strategy, and from the back office to the front office.

IDC has defined people in these roles as a generation: “Generation Data,” or “Gen-D” for short. Gen-D workers are data-natives — data is what they work in and work with to complete their tasks, tactical and/or strategic.

You may be part of Gen-D if “data” is in your job title, you are expected to make data-driven decisions, and you are able to use data to communicate with others. Gen-D workers also contribute to the overall data knowledge in the organization by participating in data intelligence and data literacy efforts and promoting good data culture.

WHEN do you need to gather intelligence about your data?

Now is the time.

The next or new normal has already begun and the more you know about your data, the better your digital business outcomes will be. It has been said that while it can take a long time to gain a customer’s trust, it only takes one bad experience to lose it.

Personally, I have had several instances of poor digital experiences such as items sent to the wrong address or orders (including mobile food orders) being fulfilled incorrectly.

Each represents a data problem: incorrect data, incorrect data interpretation, or a complete disconnect between the virtual and physical world. In these cases, better data intelligence could have helped in assuring the correct address, enabling correct order fulfillment, and assisting with interpretation through better data definition and description.

Even if you don’t have a formal data intelligence program in place, there is a good possibility your organization has intelligence about its data, because it is difficult for data to exist without some form of associated metadata.

Technical metadata is what makes up database schemas and table definitions. Logical and physical data models may exist in data modeling or general-purpose diagramming software.
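To make the idea of technical metadata concrete, here is a minimal sketch that reads column definitions back out of a database. It uses Python's built-in sqlite3 module; the `customer` table and the `describe_table` helper are hypothetical names chosen for illustration, not part of any product mentioned here.

```python
import sqlite3

# Hypothetical in-memory database with one illustrative table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "  id INTEGER PRIMARY KEY,"
    "  name TEXT NOT NULL,"
    "  shipping_address TEXT"
    ")"
)


def describe_table(connection, table):
    """Return technical metadata for each column of a table."""
    rows = connection.execute(f"PRAGMA table_info({table})").fetchall()
    # PRAGMA table_info yields: cid, name, type, notnull, dflt_value, pk
    return [
        {
            "column": r[1],
            "type": r[2],
            "nullable": not r[3],  # reflects the declared NOT NULL constraint
            "primary_key": bool(r[5]),
        }
        for r in rows
    ]


customer_metadata = describe_table(conn, "customer")
for column in customer_metadata:
    print(column)
```

Even this small example shows that the schema already carries intelligence about the data (names, types, constraints) before any formal program exists.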

There is also a high likelihood that data models, data dictionaries, and data catalogs exist in the ubiquitous spreadsheet, or in centralized document repositories. However, just having metadata isn’t the same as managing and leveraging it as intelligence. Data in modern business environments is very dynamic, constantly moving, drifting, and shifting – requiring automated collection, management, and analytics to extract and leverage intelligence about it.

In many English-speaking countries, “Auld Lang Syne,” a Scots-language poem written by Robert Burns and set to a common folk song tune, is often sung as the clock strikes midnight on the first day of the new year.

The phrase “auld lang syne” has several interpretations, but it can loosely be translated as “for the sake of old times.” As we move into 2021, we need to forget the negatives of 2020, and build on the positives to help define the next normal.


How Metadata Makes Data Meaningful

Metadata is an important part of data governance, and as a result, most nascent data governance programs are rife with project plans for assessing and documenting metadata. But in many scenarios, it seems that the underlying driver of metadata collection projects is that it’s just something you do for data governance.

So most early-stage data governance managers kick off a series of projects to profile data, make inferences about data element structure and format, and store the presumptive metadata in some metadata repository. But are these rampant and often uncontrolled projects to collect metadata properly motivated?

There is rarely a clear directive about how metadata is used. Therefore, prior to launching metadata collection tasks, it is important to specify how the knowledge embedded within the corporate metadata should be used.

Managing metadata should not be a sub-goal of data governance. Today, metadata is the heart of enterprise data management and governance/intelligence efforts, and it should have a clear strategy rather than being just something you do.


What Is Metadata?

Quite simply, metadata is data about data. It’s generated every time data is captured at a source, accessed by users, moved through an organization, integrated or augmented with other data from other sources, profiled, cleansed and analyzed. Metadata is valuable because it provides information about the attributes of data elements that can be used to guide strategic and operational decision-making. It answers these important questions:

  • What data do we have?
  • Where did it come from?
  • Where is it now?
  • How has it changed since it was originally created or captured?
  • Who is authorized to use it and how?
  • Is it sensitive or are there any risks associated with it?
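As a rough illustration, the questions above can be read as the fields of a metadata record. The sketch below is an assumption about how such a record might be modeled in Python; the class name `DatasetMetadata`, its fields, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetMetadata:
    name: str                 # What data do we have?
    source_system: str        # Where did it come from?
    current_location: str     # Where is it now?
    change_log: list = field(default_factory=list)       # How has it changed?
    authorized_roles: set = field(default_factory=set)   # Who may use it?
    sensitive: bool = False   # Is it sensitive or risky?

    def record_change(self, when, description):
        """Append a (date, description) entry to the change history."""
        self.change_log.append((when, description))

    def can_access(self, role):
        """Check whether a role is authorized to use this dataset."""
        return role in self.authorized_roles


# Hypothetical example record.
orders = DatasetMetadata(
    name="customer_orders",
    source_system="web_shop",
    current_location="warehouse.fact_order",
    authorized_roles={"analyst", "data_steward"},
    sensitive=True,
)
orders.record_change("2020-11-02", "address fields standardized")
print(orders.can_access("analyst"), orders.can_access("intern"))
```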

The Role of Metadata in Data Governance

Organizations don’t know what they don’t know, and this problem is only getting worse. As data continues to proliferate, so does the need for data and analytics initiatives to make sense of it all. Here are some benefits of metadata management for data governance use cases:

  • Better Data Quality: Data issues and inconsistencies within integrated data sources or targets are identified in real time to improve overall data quality by reducing time to insights and/or repair.
  • Quicker Project Delivery: Accelerate Big Data deployments, Data Vaults, data warehouse modernization, cloud migration, etc., by up to 70 percent.
  • Faster Speed to Insights: Reverse the current 80/20 rule that keeps high-paid knowledge workers too busy finding, understanding and resolving errors or inconsistencies to actually analyze source data.
  • Greater Productivity & Reduced Costs: Being able to rely on automated and repeatable metadata management processes results in greater productivity. Some erwin customers report productivity gains of 85+% for coding, 70+% for metadata discovery, up to 50% for data design, up to 70% for data conversion, and up to 80% for data mapping.
  • Regulatory Compliance: Regulations such as GDPR, HIPAA, BCBS and CCPA have data privacy and security mandates, so sensitive data, including personally identifiable information (PII), needs to be tagged, its lineage documented, and its flows depicted for traceability.
  • Digital Transformation: Knowing what data exists and its value potential promotes digital transformation by improving digital experiences, enhancing digital operations, driving digital innovation and building digital ecosystems.
  • Enterprise Collaboration: With the business driving alignment between data governance and strategic enterprise goals and IT handling the technical mechanics of data management, the door opens to finding, trusting and using data to effectively meet organizational objectives.

Giving Metadata Meaning

So how do you give metadata meaning? While this sounds like a deep philosophical question, the reality is the right tools can make all the difference.

erwin Data Intelligence (erwin DI) combines data management and data governance processes in an automated flow.

It’s unique in its ability to automatically harvest, transform and feed metadata from a wide array of data sources, operational processes, business applications and data models into a central data catalog and then make it accessible and understandable within the context of role-based views.

erwin DI sits on a common metamodel that is open, extensible and comes with a full set of APIs. A comprehensive list of erwin-owned standard data connectors is included for automated harvesting, refreshing and version-controlled metadata management. Optional erwin Smart Data Connectors reverse-engineer ETL code of all types and connect bi-directionally with reporting and other ecosystem tools. These connectors offer the fastest and most accurate path to data lineage, impact analysis and other detailed graphical relationships.

Additionally, erwin DI is part of the larger erwin EDGE platform that integrates data modeling, enterprise architecture, business process modeling, data cataloging and data literacy. We know our customers need an active metadata-driven approach to:

  • Understand their business, technology and data architectures and the relationships between them
  • Create and automate a curated enterprise data catalog, complete with physical assets, data models, data movement, data quality and on-demand lineage
  • Activate their metadata to drive agile and well-governed data preparation with integrated business glossaries and data dictionaries that provide business context for stakeholder data literacy

erwin was named a Leader in Gartner’s “2019 Magic Quadrant for Metadata Management Solutions.”



Metadata Management, Data Governance and Automation

Can the 80/20 Rule Be Reversed?

erwin released its State of Data Governance Report in February 2018, just a few months before the General Data Protection Regulation (GDPR) took effect.

This research showed that the majority of responding organizations weren’t actually prepared for GDPR, nor did they have the understanding, executive support and budget for data governance – although they recognized the importance of it.

Of course, data governance has evolved with astonishing speed, both in response to data privacy and security regulations and because organizations see the potential for using it to accomplish other organizational objectives.

But many of the world’s top brands still seem to be challenged in implementing and sustaining effective data governance programs (hello, Facebook).

We wonder why.

Too Much Time, Too Few Insights

According to IDC’s “Data Intelligence in Context” Technology Spotlight sponsored by erwin, “professionals who work with data spend 80 percent of their time looking for and preparing data and only 20 percent of their time on analytics.”

Specifically, 80 percent of data professionals’ time is spent on data discovery, preparation and protection, and only 20 percent on analysis leading to insights.

In most companies, an incredible amount of data flows from multiple sources in a variety of formats and is constantly being moved and federated across a changing system landscape.

Often these enterprises are heavily regulated, so they need a well-defined data integration model that will help avoid data discrepancies and remove barriers to enterprise business intelligence and other meaningful use.

IT teams need the ability to smoothly generate hundreds of mappings and ETL jobs. They need their data mappings to fall under governance and audit controls, with instant access to dynamic impact analysis and data lineage.

But most organizations, especially those competing in the digital economy, don’t have enough time or money for data management using manual processes. Outsourcing is also expensive, with inevitable delays because these vendors are dependent on manual processes too.

The Role of Data Automation

Data governance maturity includes the ability to rely on automated and repeatable processes.

For example, automatically importing mappings from developers’ Excel sheets, flat files, Access and ETL tools into a comprehensive mappings inventory, complete with automatically generated and meaningful documentation of the mappings, is a powerful way to support governance while providing real insight into data movement — for data lineage and impact analysis — without interrupting system developers’ normal work methods.
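A minimal sketch of the idea, assuming the mappings arrive as a flat file with `source`, `target` and `rule` columns (the column names, the sample rows and the helper functions are all hypothetical): once mappings are loaded into an inventory, lineage and impact analysis become simple traversals in opposite directions.

```python
import csv
import io
from collections import defaultdict

# Hypothetical mapping sheet exported as CSV; column names are assumptions.
MAPPING_SHEET = """source,target,rule
crm.customer.addr,staging.customer.address,trim whitespace
staging.customer.address,warehouse.dim_customer.address,standardize format
erp.orders.cust_id,warehouse.fact_order.customer_key,lookup surrogate key
"""


def load_mappings(text):
    """Parse the flat file into a list of mapping records (the inventory)."""
    return list(csv.DictReader(io.StringIO(text)))


def build_index(mappings):
    """Index the inventory by target (upstream) and by source (downstream)."""
    upstream, downstream = defaultdict(list), defaultdict(list)
    for m in mappings:
        downstream[m["source"]].append(m["target"])
        upstream[m["target"]].append(m["source"])
    return upstream, downstream


def trace(start, edges):
    """Follow edges transitively from start; return reachable nodes."""
    seen, stack = [], [start]
    while stack:
        node = stack.pop()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.append(nxt)
                stack.append(nxt)
    return seen


inventory = load_mappings(MAPPING_SHEET)
upstream, downstream = build_index(inventory)

# Impact analysis: what is affected if the CRM address field changes?
impact = trace("crm.customer.addr", downstream)
# Lineage: where did the warehouse address come from?
lineage = trace("warehouse.dim_customer.address", upstream)
print(impact)
print(lineage)
```

A real mapping inventory would also version each record and keep the transformation rule with it; this sketch only shows why a consolidated inventory makes both questions answerable.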

GDPR compliance, for instance, requires a business to discover source-to-target mappings with all accompanying transactions, such as which business rules in the repository are applied to them, to comply with audits.

When data movement has been tracked and version-controlled, it’s possible to conduct data archeology (that is, reverse-engineering code from existing XML within the ETL layer) to uncover what has happened in the past and incorporate it into a mapping manager for fast and accurate recovery.

With automation, data professionals can meet the above needs at a fraction of the cost of the traditional, manual way. To summarize, just some of the benefits of data automation are:

• Centralized and standardized code management with all automation templates stored in a governed repository
• Better quality code and minimized rework
• Business-driven data movement and transformation specifications
• Superior data movement job designs based on best practices
• Greater agility and faster time-to-value in data preparation, deployment and governance
• Cross-platform support of scripting languages and data movement technologies

One global pharmaceutical giant reduced costs by 70 percent and generated 95 percent of production code with “zero touch.” With automation, the company improved the time to business value and significantly reduced the costly re-work associated with error-prone manual processes.


Help Us Help You by Taking a Brief Survey

With 2020 just around the corner and another data regulation about to take effect, the California Consumer Privacy Act (CCPA), we’re working with Dataversity on another research project.

And this time, you guessed it – we’re focusing on data automation and how it could impact metadata management and data governance.

We would appreciate your input and will release the findings in January 2020.



Very Meta … Unlocking Data’s Potential with Metadata Management Solutions

Untapped data, if mined, represents tremendous potential for your organization. While there has been a lot of talk about big data over the years, the real hero in unlocking the value of enterprise data is metadata, or the data about the data.

However, most organizations don’t use all the data they’re flooded with to reach deeper conclusions about how to drive revenue, achieve regulatory compliance or make other strategic decisions. They don’t know exactly what data they have or even where some of it is.

Quite honestly, knowing what data you have and where it lives is complicated. And to truly understand it, you need to be able to create and sustain an enterprise-wide view of and easy access to underlying metadata.

This isn’t an easy task. Organizations are dealing with numerous data types and data sources that were never designed to work together and data infrastructures that have been cobbled together over time with disparate technologies, poor documentation and with little thought for downstream integration.

As a result, the applications and initiatives that depend on a solid data infrastructure may be compromised, leading to faulty analysis and insights.

Metadata Is the Heart of Data Intelligence

A recent IDC Innovators: Data Intelligence Report says that getting answers to such questions as “where is my data, where has it been, and who has access to it” requires harnessing the power of metadata.

Metadata is generated every time data is captured at a source, accessed by users, moved through an organization, and then profiled, cleansed, aggregated, augmented and used for analytics to guide operational or strategic decision-making.

In fact, data professionals spend 80 percent of their time looking for and preparing data and only 20 percent of their time on analysis, according to IDC.

To flip this 80/20 rule, they need an automated metadata management solution for:

• Discovering data – Identify and interrogate metadata from various data management silos.
• Harvesting data – Automate the collection of metadata from various data management silos and consolidate it into a single source.
• Structuring and deploying data sources – Connect physical metadata to specific data models, business terms, definitions and reusable design standards.
• Analyzing metadata – Understand how data relates to the business and what attributes it has.
• Mapping data flows – Identify where to integrate data and track how it moves and transforms.
• Governing data – Develop a governance model to manage standards, policies and best practices and associate them with physical assets.
• Socializing data – Empower stakeholders to see data in one place and in the context of their roles.
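The harvesting and socializing steps in the list above can be sketched in a few lines. This is only an illustration under stated assumptions: the two "silos", their record shapes, and the `harvest` and `view_for` helpers are hypothetical and do not represent any vendor's API.

```python
# Hypothetical metadata exported from two silos, each with its own shape.
warehouse_silo = [
    {"table": "dim_customer", "columns": ["customer_key", "address"]},
]
glossary_silo = [
    {"asset": "dim_customer", "business_term": "Customer", "steward": "sales-ops"},
    {"asset": "fact_order", "business_term": "Order", "steward": "finance"},
]


def harvest(warehouse, glossary):
    """Consolidate metadata from both silos into one catalog keyed by asset."""
    catalog = {}
    for entry in warehouse:
        catalog.setdefault(entry["table"], {})["columns"] = entry["columns"]
    for entry in glossary:
        record = catalog.setdefault(entry["asset"], {})
        record["business_term"] = entry["business_term"]
        record["steward"] = entry["steward"]
    return catalog


def view_for(catalog, steward):
    """Role-based view: only the assets a given steward is responsible for."""
    return {name: rec for name, rec in catalog.items() if rec.get("steward") == steward}


catalog = harvest(warehouse_silo, glossary_silo)
print(catalog["dim_customer"])
print(list(view_for(catalog, "finance")))
```

The point of the sketch is the single source: physical columns and business terms end up on the same record, so each stakeholder can see them together in the context of a role.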

Addressing the Complexities of Metadata Management

The complexities of metadata management can be addressed with a strong data management strategy coupled with metadata management software to enable the data quality the business requires.

This encompasses data cataloging (integration of data sets from various sources), mapping, versioning, business rules and glossary maintenance, and metadata management (associations and lineage).

erwin has developed the only data intelligence platform that provides organizations with a complete and contextual depiction of the entire metadata landscape.

It is the only solution that can automatically harvest, transform and feed metadata from operational processes, business applications and data models into a central data catalog and then make it accessible and understandable within the context of role-based views.

erwin’s ability to integrate and continuously refresh metadata from an organization’s entire data ecosystem, including business processes, enterprise architecture and data architecture, forms the foundation for enterprise-wide data discovery, literacy, governance and strategic usage.

Organizations then can take a data-driven approach to business transformation, speed to insights, and risk management.

With erwin, organizations can:

1. Deliver a trusted metadata foundation through automated metadata harvesting and cataloging
2. Standardize data management processes through a metadata-driven approach
3. Centralize data-driven projects around centralized metadata for planning and visibility
4. Accelerate data preparation and delivery through metadata-driven automation
5. Master data management platforms through metadata abstraction
6. Accelerate data literacy through contextual metadata enrichment and integration
7. Leverage a metadata repository to derive lineage, impact analysis and enable audit/oversight ability

With erwin Data Intelligence as part of the erwin EDGE platform, you know what data you have, where it is, where it’s been and how it transformed along the way, plus you can understand sensitivities and risks.

With an automated, real-time, high-quality data pipeline, enterprise stakeholders can base strategic decisions on a full inventory of reliable information.

Many of our customers are hard at work addressing metadata management challenges, and that’s why erwin was named a Leader in Gartner’s “2019 Magic Quadrant for Metadata Management Solutions.”



SQL, NoSQL or NewSQL: Evaluating Your Database Options

A common question in the modern data management space involves database technology: SQL, NoSQL or NewSQL?

But there isn’t a one-size-fits-all answer. What’s “right” must be evaluated on a case-by-case basis and is dependent on data maturity.

For example, a large bookstore chain with a big-data initiative would be stifled by a SQL database. The advantages that could be gained from analyzing social media data (for popular books, consumer buying habits) couldn’t be realized effectively through sequential analysis. There’s too much data involved in this approach, with too many threads to follow.

However, an independent bookstore isn’t necessarily bound to a big-data approach because it may not have a mature data strategy. It might not have ventured beyond digitizing customer records, and a SQL database is sufficient for that work.

Having said that, the “SQL, NoSQL or NewSQL” question is gaining prominence because businesses are becoming increasingly data-driven.

In 2019, an IDC study found 85% of enterprise decision-makers said they had a time frame of two years to make significant inroads into digital transformation or they would fall behind their competitors and suffer financially. Furthermore, a Progress study showed that 85% of enterprise decision-makers feel they only have two years to make significant digital-transformation progress before suffering financially and/or falling behind competitors.

Considering these statistics, what better time than now to evaluate your database technology? The “SQL, NoSQL or NewSQL” question is especially important if you intend to become more data-driven.

SQL, NoSQL or NewSQL: Advantages and Disadvantages

SQL

SQL databases are tried and tested, proven to work on disks using interfaces with which businesses are already familiar.

As the longest-standing type of database, plenty of SQL options are available. This competitive market means you’ll likely find what you’re looking for at affordable prices.

Additionally, businesses in the earlier stages of data maturity are more likely to have a SQL database at work already, meaning no new investments need to be made.

However, in the modern digital business context, SQL databases weren’t made to support the three Vs of data. The volume is too high, the variety of sources is too vast, and the velocity (speed at which the data must be processed) is too great to be analyzed in sequence.

Furthermore, the foundational, legacy IT world they were purpose-built to serve has evolved. Now, corporate IT departments must be agile, and their databases must be agile and scalable to match.

NoSQL

Despite its name, “NoSQL” doesn’t mean the complete absence of the SQL database approach. Rather, it works as more of a hybrid. The term is a contraction of “not only SQL.”

So, in addition to the advantage of continuity that staying with SQL offers, NoSQL enjoys many of the benefits of SQL databases.

The key difference is that NoSQL databases were developed with modern IT in mind. They are scalable, agile and purpose-built to deal with disparate, high-volume data.

Hence, data is typically more readily available, and it can be changed, stored or inserted more easily.

For example, MongoDB, one of the key players in the NoSQL world, uses JavaScript Object Notation (JSON). As the company explains, “A JSON database returns query results that can be easily parsed, with little or no transformation.” The open, human- and machine-readable standard facilitates data interchange and can store records, “just as tables and rows store records in a relational database.”
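To illustrate the "little or no transformation" point in general terms, here is a small sketch using only Python's standard json module. It does not use MongoDB itself, and the book record is a made-up example: the nested document parses directly into native dictionaries and lists that application code can use as they are.

```python
import json

# Hypothetical JSON document, as a document store might return it.
raw = (
    '{"title": "Example Book", "authors": ["A. Writer"], '
    '"price": {"amount": 12.5, "currency": "USD"}}'
)

# One call turns the text into native dicts and lists; no mapping layer needed.
doc = json.loads(raw)

print(doc["authors"][0])
print(doc["price"]["amount"])
```

In a relational design, the same record would typically be spread across several tables (books, authors, prices) and reassembled with joins.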

Generally, NoSQL databases are better equipped to deal with other non-relational data too. As well as JSON, NoSQL supports log messages, XML and unstructured documents. This support avoids the lethargic “schema-on-write,” opting for “schema-on-read” instead.
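The difference between the two approaches can be sketched as follows. This is a simplified illustration, not how any particular database implements it; the schema, the function names and the log records are assumptions made for the example.

```python
import json

# Hypothetical schema enforced at write time.
REQUIRED_ON_WRITE = {"id": int, "message": str}


def write_with_schema(store, record):
    """Schema-on-write: reject non-conforming records before storing them."""
    for key, expected in REQUIRED_ON_WRITE.items():
        if not isinstance(record.get(key), expected):
            raise ValueError(f"field {key!r} must be {expected.__name__}")
    store.append(record)


def read_with_schema(raw_lines):
    """Schema-on-read: keep raw text as-is, apply structure only when queried."""
    for line in raw_lines:
        doc = json.loads(line)
        yield {"id": doc.get("id"), "message": doc.get("message", "")}


# Schema-on-read accepts both log lines, even though the second lacks "message".
raw_store = ['{"id": 1, "message": "login ok"}', '{"id": 2, "level": "warn"}']
parsed = list(read_with_schema(raw_store))

# Schema-on-write stores the first record and rejects the second.
strict_store = []
write_with_schema(strict_store, {"id": 1, "message": "login ok"})
try:
    write_with_schema(strict_store, {"id": 2, "level": "warn"})
    rejected = False
except ValueError:
    rejected = True

print(parsed)
print(len(strict_store), rejected)
```

The trade-off is where the cost lands: schema-on-write pays up front for consistency, while schema-on-read ingests quickly and defers interpretation to each query.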

NewSQL

NewSQL refers to databases based on the relational (SQL) database and SQL query language. In an attempt to solve some of the problems of SQL, the likes of VoltDB and others take a best-of-both-worlds approach, marrying the familiarity of SQL with the scalability and agile enablement of NoSQL.

However, as with most seemingly win-win opportunities, NewSQL isn’t without its caveats. These vary from vendor to vendor, but in essence, you have to sacrifice some of either the familiarity or the scalability.

If you’d like to speak with someone at erwin about SQL, NoSQL or NewSQL in more detail, click here.

For more industry advice, subscribe to the erwin Expert Blog.
