Categories
erwin Expert Blog

Talk Data to Me: Why Employee Data Literacy Matters  

Organizations are flooded with data, so they’re scrambling to find ways to derive meaningful insights from it – and then act on them to improve the bottom line.

In today’s data-driven business, employees who can access and understand the data relevant to their roles can turn that data into insights and put those insights into action. To do this, employees need to “talk data”; in other words, they need data literacy.

However, Gartner predicts that this year 50 percent of organizations will lack sufficient AI and data literacy skills to achieve business value. Closing that gap requires organizations to invest in making their employees data literate.

Data Literacy & the Rise of the Citizen Analyst

According to Gartner, “data literacy is the ability to read, write and communicate data in context, including an understanding of data sources and constructs, analytical methods and techniques applied — and the ability to describe the use case, application and resulting value.”

Today, your employees are essentially data consumers. Three technological advances are driving this data consumption and, in turn, employees’ ability to leverage data to deliver business value: 1) exploding data production, 2) scalable big data computation, and 3) the accessibility of advanced analytics, machine learning (ML) and artificial intelligence (AI).

The confluence of these advances has created a fertile environment for data innovation and transformation. As a result, we’re seeing the rise of the “citizen analyst,” who brings business knowledge and subject-matter expertise to data-driven insights.

Examples of citizen analysts include the VP of finance looking for opportunities to optimize top- and bottom-line results for growth and profitability, or the product line manager who wants to understand the enterprise impact of pricing changes.

David Loshin explores this concept in an erwin-sponsored whitepaper, Data Intelligence: Empowering the Citizen Analyst with Democratized Data.

In the whitepaper, he states that the priority of the citizen analyst is straightforward: find the right data to develop reports and analyses that support a larger business case. However, some practical data management issues contribute to a growing need for enterprise data governance, including:

  • Increasing data volumes that challenge the traditional enterprise’s ability to store, manage and ultimately find data
  • Increased data variety, balancing structured, semi-structured and unstructured data, as well as data originating from a widening array of external sources
  • The need to reduce the IT bottleneck that creates barriers to data accessibility
  • A desire for self-service to free data consumers from strict predefined data transformations and organizations
  • Hybrid on-premises/cloud environments that complicate data integration and preparation
  • Privacy and data protection laws from many countries that influence the ways data assets may be accessed and used

Data Democratization Requires Data Intelligence

According to Loshin, organizations need to empower their citizen analysts. A fundamental component of data literacy involves data democratization, sharing data assets with a broad set of data consumer communities in a governed way.

The objectives of governed data democratization include:

  • Raising data awareness
  • Improving data literacy
  • Supporting observance of data policies to support regulatory compliance
  • Simplifying data accessibility and use

Effective data democratization requires data intelligence, which depends on accumulating, documenting and publishing information about the data assets used across the entire enterprise data landscape.

Here are the steps to effective data intelligence:

  • Reconnaissance: Understanding the data environment and the corresponding business contexts and collecting as much information as possible
  • Surveillance: Monitoring the environment for changes to data sources
  • Logistics and Planning: Mapping the collected information production flows and mapping how data moves across the enterprise
  • Impact Assessment: Using what you have learned to assess how external changes impact the environment
  • Synthesis: Empowering data consumers by providing a holistic perspective associated with specific business terms
  • Sustainability: Embracing automation to always provide up-to-date and correct intelligence
  • Auditability: Providing oversight and being able to explain what you have learned and why

Data Literacy: The Heart of Data-Driven Innovation

Data literacy is at the heart of successful data-driven innovation and accelerating the realization of actionable data-driven insights.

It can shorten data source discovery and analysis cycles, improve the accuracy of results, and reduce reliance on expensive technical resources. It also helps assure the “right” data is used the first time, reducing deployed errors and the need for expensive rework.

Ultimately, a successful data literacy program will empower your employees to:

  • Better understand and identify the data they require
  • Be more self-sufficient in accessing and preparing the data they require
  • Better articulate the gaps that exist in the data landscape when it comes to fulfilling their data needs
  • Share their knowledge and experience with data with other consumers to contribute to the greater good
  • Collaborate more effectively with their partners in data (management and governance) for greater efficiency and higher quality outcomes

erwin offers a data intelligence software suite combining the capabilities of erwin Data Catalog with erwin Data Literacy to fuel an automated, real-time, high-quality data pipeline.

Then all enterprise stakeholders – data scientists, data stewards, ETL developers, enterprise architects, business analysts, compliance officers, citizen analysts, CDOs and CEOs – can access data relevant to their roles for insights they can put into action.



The Top 8 Benefits of Data Lineage

It’s important we recognize the benefits of data lineage.

As corporate data governance programs have matured, the inventory of agreed-to data policies has grown rapidly. These include guidelines for data quality assurance, regulatory compliance and data democratization, among other information utilization initiatives.

Organizations that struggle to translate their defined data policies into implemented processes and procedures are starting to identify tools and technologies that can help put those policies into practice.

One such technique, data lineage, is gaining prominence as a core operational business component of the data governance technology architecture. Data lineage encompasses processes and technology to provide full-spectrum visibility into the ways that data flow across the enterprise.

To data-driven businesses, the benefits of data lineage are significant. Data lineage tools survey and document the end-to-end flow of information units, from their origination points through the series of transformation and processing stages to their final destination, and enable data stewards to query and visualize that flow.

The Benefits of Data Lineage

Data stewards are attracted to data lineage because it helps in a number of different governance practices, including:

1. Operational intelligence

At its core, data lineage captures the mappings of the rapidly growing number of data pipelines in the organization. Visualizing the information flow landscape provides insight into the “demographics” of data consumption and use, answering questions such as “what data sources feed the greatest number of downstream sources” or “which data analysts use data that is ingested from a specific data source.” Collecting this intelligence about the data landscape better positions the data stewards for enforcing governance policies.
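As a rough illustration of the first question above, the sketch below ranks data sources by how many downstream assets they feed. It assumes lineage is available as a plain list of (upstream, downstream) pairs; the asset names and the `rank_sources` helper are hypothetical and do not reflect any particular lineage tool.

```python
from collections import defaultdict

# Hypothetical lineage edges: (upstream asset, downstream asset).
EDGES = [
    ("crm_db", "customer_staging"),
    ("erp_db", "orders_staging"),
    ("customer_staging", "sales_mart"),
    ("orders_staging", "sales_mart"),
    ("sales_mart", "revenue_report"),
    ("customer_staging", "churn_model"),
]

def build_graph(edges):
    """Map each asset to the set of assets it feeds directly."""
    graph = defaultdict(set)
    for upstream, downstream in edges:
        graph[upstream].add(downstream)
    return graph

def downstream_of(graph, node):
    """Return every asset reachable from node, directly or transitively."""
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        for child in graph.get(current, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen

def rank_sources(edges):
    """List origin sources (no incoming edges) by downstream reach, largest first."""
    graph = build_graph(edges)
    targets = {downstream for _, downstream in edges}
    sources = {upstream for upstream, _ in edges} - targets
    ranked = [(s, len(downstream_of(graph, s))) for s in sources]
    return sorted(ranked, key=lambda pair: (-pair[1], pair[0]))

print(rank_sources(EDGES))
```

The same traversal, run against the reversed edges, would answer the second kind of question: which consumers depend on a specific source.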

2. Business terminology consistency

One of the most confounding data governance challenges is understanding the semantics of business terminology within data management contexts. Because application development was traditionally isolated within each business function, the same (or similar) terms are used in different data models, often without the designers having taken the time to align definitions and meanings. Data lineage allows the data stewards to find common business terms, review their definitions, and determine where there are inconsistencies in the ways the terms are used.

3. Data incident root cause analysis

It has long been asserted that when a data consumer finds a data error, the error most likely was introduced into the environment at an earlier stage of processing. Yet without a “roadmap” that indicates the stages through which the data were processed, it is difficult to determine where the error was actually introduced. Using data lineage, though, a data steward can insert validation probes within the information flow to validate data values and determine the stage in the data pipeline where an error originated.
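To make the idea of validation probes concrete, here is a minimal sketch that assumes a pipeline can be modeled as an ordered list of named transformation functions. The stage names, the deliberately buggy stage, and the `locate_error` helper are all invented for illustration.

```python
def locate_error(stages, record, is_valid):
    """Run record through the stages in order, probing after each one.

    Returns the name of the first stage whose output fails validation,
    or None if every stage's output passes.
    """
    for name, transform in stages:
        record = transform(record)
        if not is_valid(record):
            return name
    return None

# Hypothetical three-stage pipeline; the second stage flips the sign by mistake.
STAGES = [
    ("ingest", lambda r: dict(r)),
    ("convert_currency", lambda r: {**r, "amount": r["amount"] * -1.1}),
    ("aggregate", lambda r: {**r, "total": r["amount"]}),
]

def amount_is_non_negative(record):
    """Validation probe: amounts in this flow should never be negative."""
    return record["amount"] >= 0

print(locate_error(STAGES, {"amount": 100.0}, amount_is_non_negative))
```

Because the probe runs after every stage, the function reports the earliest stage that produced an invalid value rather than the stage where a consumer happened to notice it.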

4. Data quality remediation assessment

Root cause analysis is just the first part of the data quality process. Once the data steward has determined where the data flaw was introduced, the next step is to determine why the error occurred. Again, using a data lineage mapping, the steward can trace backward through the information flow to examine the standardizations and transformations applied to the data, validate that transformations were correctly performed, or identify one (or more) performed incorrectly, resulting in the data flaw.

5. Impact analysis

The enterprise is always subject to change: externally imposed requirements (such as regulatory compliance) evolve, internal business directives may affect user expectations, and ingested data source models may change unexpectedly. When there is a change to the environment, it is valuable to assess the impacts to the enterprise application landscape. In the event of a change in data expectations, data lineage provides a way to determine which downstream applications and processes are affected by the change and helps in planning for application updates.
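Impact analysis of this kind amounts to a downstream traversal of the lineage graph. The sketch below assumes lineage is stored as a dictionary mapping each asset to the assets it feeds directly; the asset names and the `impacted_assets` helper are hypothetical.

```python
from collections import deque

# Hypothetical lineage: each asset maps to the assets it feeds directly.
LINEAGE = {
    "vendor_feed": ["staging_orders"],
    "staging_orders": ["orders_warehouse"],
    "orders_warehouse": ["finance_dashboard", "forecast_model"],
    "hr_system": ["headcount_report"],
}

def impacted_assets(lineage, changed):
    """Breadth-first search from the changed asset.

    Returns a dict of affected downstream asset -> number of hops
    from the change, which can help order the update work.
    """
    hops = {}
    queue = deque([(changed, 0)])
    while queue:
        node, distance = queue.popleft()
        for child in lineage.get(node, []):
            if child not in hops:
                hops[child] = distance + 1
                queue.append((child, distance + 1))
    return hops

print(impacted_assets(LINEAGE, "vendor_feed"))
```

Assets that do not appear in the result, such as the headcount report here, are untouched by the change and can be left out of the update plan.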

6. Performance assessment

Not only does lineage provide a collection of mappings of data pipelines, it allows for the identification of potential performance bottlenecks. Data pipeline stages with many incoming paths are candidate bottlenecks. Using a set of data lineage mappings, the performance analyst can profile execution times across different pipelines and redistribute processing to eliminate bottlenecks.
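One simple way to flag such candidates is to count incoming paths per stage. This is only a sketch: the edge list, the threshold of three incoming paths, and the `candidate_bottlenecks` helper are assumptions made for illustration, and a real assessment would also weigh measured execution times.

```python
from collections import Counter

# Hypothetical pipeline edges: (upstream stage, downstream stage).
PIPELINE_EDGES = [
    ("web_logs", "merge_events"),
    ("app_logs", "merge_events"),
    ("partner_feed", "merge_events"),
    ("merge_events", "daily_report"),
    ("reference_data", "daily_report"),
]

def candidate_bottlenecks(edges, min_incoming=3):
    """Return stages with at least min_incoming distinct incoming paths."""
    incoming = Counter(downstream for _, downstream in set(edges))
    return sorted(stage for stage, count in incoming.items() if count >= min_incoming)

print(candidate_bottlenecks(PIPELINE_EDGES))
```

The threshold is arbitrary; lowering it widens the list of stages worth profiling.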

7. Policy compliance

Data policies can be implemented through the specification of business rules. Compliance with these business rules can be facilitated using data lineage by embedding business rule validation controls across the data pipelines. These controls can generate alerts when there are noncompliant data instances.
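A minimal sketch of such a control follows, assuming business rules can be written as named predicates over a record. The two rules, the sample records, and the `check_records` helper are hypothetical examples rather than actual policy definitions.

```python
# Hypothetical business rules: rule name -> predicate that a compliant record satisfies.
RULES = {
    "email_present": lambda record: bool(record.get("email")),
    "age_is_adult": lambda record: record.get("age", 0) >= 18,
}

def check_records(records, rules):
    """Return an alert (record index, rule name) for every rule a record violates."""
    alerts = []
    for index, record in enumerate(records):
        for rule_name, rule in rules.items():
            if not rule(record):
                alerts.append((index, rule_name))
    return alerts

records = [
    {"email": "a@example.com", "age": 30},
    {"email": "", "age": 17},
]
print(check_records(records, RULES))
```

In practice, a check like this would sit at a chosen point in the pipeline and route its alerts to the responsible data steward.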

8. Auditability of data pipelines

In many cases, regulatory compliance is a combination of enforcing a set of defined data policies along with a capability for demonstrating that the overall process is compliant. Data lineage provides visibility into the data pipelines and information flows that can be audited, thereby supporting the compliance process.

Evaluating Enterprise Data Lineage Tools

While data lineage benefits are obvious, large organizations with complex data pipelines and data flows do face challenges in embracing the technology to document the enterprise data pipelines. These include:

  • Surveying the enterprise – Gathering information about the sources, flows and configurations of data pipelines.
  • Maintenance – Configuring a means to maintain an up-to-date view of the data pipelines.
  • Deliverability – Providing a way to give data consumers visibility to the lineage maps.
  • Sustainability – Ensuring sustainability of the processes for producing data lineage mappings.

Producing a collection of up-to-date data lineage mappings that are easily reviewed by different data consumers depends on addressing these challenges. Keep these issues in mind when evaluating how well data lineage tools can meet your data governance needs.

erwin Data Intelligence (erwin DI) helps organizations automate their data lineage initiatives. Learn more about data lineage with erwin DI.


Democratizing Data and the Rise of the Citizen Analyst

Data innovation is flourishing, driven by the confluence of exploding data production, a lowered barrier to entry for big data, as well as advanced analytics, artificial intelligence and machine learning.

Additionally, the ability to access and analyze all of this information has given rise to the “citizen analyst” – a business-oriented problem-solver with enough technical knowledge to understand how to apply analytical techniques to collections of massive data sets to identify business opportunities.

Empowering the citizen analyst relies on, or rather demands, data democratization – making shared enterprise assets available to a set of data consumer communities in a governed way.

This idea of democratizing data has become increasingly popular as more organizations realize that data is everyone’s business in a data-driven organization. Those that embrace digital transformation, regardless of industry, experience new levels of relevance and success.

Securing the Asset

Consumers and businesses alike have started to view data as an asset they must take steps to secure. It’s both a lucrative target for cyber criminals and a combustible spark for PR fires.

However, siloing data can be just as costly.

For some perspective, we can draw parallels between a data pipeline and a factory production line.

In the latter example, not being able to get the right parts to the right people at the right time leads to bottlenecks that stall both production and potential profits.

The exact same logic can be applied to data. To ensure efficient processes, organizations need to make the right data available to the right people at the right time.

In essence, this is data democratization. And the importance of democratized data governance cannot be stressed enough. Data security is imperative, so organizations need both technology and personnel to achieve it.

As for the human element, organizations need to ensure the relevant parties understand which data assets can be used and for what purposes. Simply assuming that employees know when, what and how to use data can render otherwise extremely valuable data resources useless, because their potential goes unrecognized.

The objectives of governed data democratization include:

  • Raising awareness among the different data consumer communities of the data assets that can be used for reporting and analysis,
  • Improving data literacy so that individuals will understand how the different data assets can be used,
  • Supporting observance of data policies to support regulatory compliance, and
  • Simplifying data accessibility and use to support citizen analysts’ needs.

Democratizing Data: Introducing Democratized Data

To successfully introduce and oversee the idea of democratized data, organizations must ensure that information about data assets is accumulated, documented and published for context-rich use across the organization.

This knowledge and understanding are a huge part of data intelligence.

Data intelligence is produced by coordinated processes to survey the data landscape to collect, collate and publish critical information, namely:

  • Reconnaissance: Understanding the data environment and the corresponding business contexts and collecting as much information as possible
  • Surveillance: Monitoring the environment for changes to data sources
  • Logistics and Planning: Mapping the collected information production flows and mapping how data moves across the enterprise
  • Impact Assessment: Using what you have learned to assess how external changes impact the environment
  • Synthesis: Empowering data consumers by providing a holistic perspective associated with specific business terms
  • Sustainability: Embracing automation to always provide up-to-date and correct intelligence
  • Auditability: Providing oversight and being able to explain what you have learned and why

erwin recently sponsored a white paper about data intelligence and democratizing data.

Written by David Loshin of Knowledge Integrity, Inc., it takes a deep dive into this topic and includes crucial advice on how organizations should evaluate data intelligence software prior to investment.
