Tag: data lake

Constructing a Digital Transformation Strategy: Putting the Data in Digital Transformation

Post author By Mariann McDonagh
Post date July 17, 2019

Having a clearly defined digital transformation strategy is an essential best practice for successful digital transformation. But what makes a digital transformation strategy viable?

Part Two of the Digital Transformation Journey …

Part one

In our last blog on driving digital transformation, we explored how enterprise architecture (EA) and business process (BP) modeling are pivotal factors in a viable digital transformation strategy.

EA and BP modeling squeeze risk out of the digital transformation process by helping organizations really understand their businesses as they are today. They give organizations the ability to identify what challenges and opportunities exist, and provide a low-cost, low-risk environment to model new options and collaborate with key stakeholders to figure out what needs to change, what shouldn’t change, and what the most important changes are.

Once you’ve determined what part(s) of your business you’ll be innovating, the next step in a digital transformation strategy is using data to get there.

Constructing a Digital Transformation Strategy: Data Enablement

Many organizations prioritize data collection as part of their digital transformation strategy. However, few organizations truly understand their data or know how to consistently maximize its value.

If your business is like most, you collect and analyze some data from a subset of sources to make product improvements, enhance customer service, reduce expenses and inform other, mostly tactical decisions.

The real question is: are you reaping all the value you can from all your data? Probably not.

Most organizations don’t use all the data they’re flooded with to reach deeper conclusions or make other strategic decisions. They don’t know exactly what data they have or even where some of it is, and they struggle to integrate known data in various formats and from numerous systems—especially if they don’t have a way to automate those processes.

How does your business become more adept at wringing all the value it can from its data?

The reality is there’s not enough time, people and money for true data management using manual processes. Therefore, an automation framework for data management has to be part of the foundations of a digital transformation strategy.

Your organization won’t be able to take complete advantage of analytics tools to become data-driven unless you establish a foundation for agile and complete data management.

You need automated data mapping and cataloging through the integration lifecycle process, inclusive of data at rest and data in motion.

An automated, metadata-driven framework for cataloging data assets and their flows across the business provides an efficient, agile and dynamic way to generate data lineage from operational source systems (databases, data models, file-based systems, unstructured files and more) across the information management architecture; construct business glossaries; assess what data aligns with specific business rules and policies; and inform how that data is transformed, integrated and federated throughout business processes—complete with full documentation.

Without this framework and the ability to automate many of its processes, business transformation will be stymied. Companies, especially large ones with thousands of systems, files and processes, will be particularly challenged by taking a manual approach. Outsourcing these data management efforts to professional services firms only delays schedules and increases costs.

With automation, data quality is systemically assured. The data pipeline is seamlessly governed and operationalized to the benefit of all stakeholders.

Constructing a Digital Transformation Strategy: Smarter Data

Ultimately, data is the foundation of the new digital business model. Companies that have the ability to harness, secure and leverage information effectively may be better equipped than others to promote digital transformation and gain a competitive advantage.

While data collection and storage continue to happen at a dramatic clip, organizations typically analyze and use less than 0.5 percent of the information they take in – that’s a huge loss of potential. Companies have to know what data they have and understand what it means in common, standardized terms so they can act on it to the benefit of the organization.

Unfortunately, organizations spend a lot more time searching for data than actually putting it to work. In fact, data professionals spend 80 percent of their time looking for and preparing data and only 20 percent of their time on analysis, according to IDC.

The solution is data intelligence. It improves IT and business data literacy and knowledge, supporting enterprise data governance and business enablement.

It helps solve the lack of visibility and control over “data at rest” in databases, data lakes and data warehouses and “data in motion” as it is integrated with and used by key applications.

Organizations need a real-time, accurate picture of the metadata landscape to:

Discover data – Identify and interrogate metadata from various data management silos.
Harvest data – Automate metadata collection from various data management silos and consolidate it into a single source.
Structure and deploy data sources – Connect physical metadata to specific data models, business terms, definitions and reusable design standards.
Analyze metadata – Understand how data relates to the business and what attributes it has.
Map data flows – Identify where to integrate data and track how it moves and transforms.
Govern data – Develop a governance model to manage standards, policies and best practices and associate them with physical assets.
Socialize data – Empower stakeholders to see data in one place and in the context of their roles.
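The first two steps in that list, discovering and harvesting metadata, can be illustrated with a small sketch. It is not how any erwin product works internally; it assumes two hypothetical silos (`crm` and `billing`) modeled as in-memory SQLite databases, and the `harvest` and `discover` helpers are names invented for this example.

```python
import sqlite3

def harvest(silo_name, conn):
    """Collect table and column metadata from one SQLite database."""
    rows = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        for _cid, column, col_type, *_ in conn.execute(f"PRAGMA table_info({table})"):
            rows.append({"silo": silo_name, "table": table,
                         "column": column, "type": col_type})
    return rows

def discover(catalog, term):
    """Find catalog entries whose column name contains the search term."""
    return [f"{e['silo']}.{e['table']}.{e['column']}"
            for e in catalog if term in e["column"]]

# Two hypothetical silos standing in for separate operational systems.
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customer (id INTEGER, email TEXT)")
billing = sqlite3.connect(":memory:")
billing.execute("CREATE TABLE invoice (id INTEGER, customer_id INTEGER, total REAL)")

# Consolidate the harvested metadata into a single source.
catalog = harvest("crm", crm) + harvest("billing", billing)
print(discover(catalog, "customer"))
```

A real catalog would also record owners, business terms and policies against each entry; the point here is only that harvesting can be scheduled and repeated once it is expressed as code rather than as a manual inventory.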

The Right Tools

When it comes to digital transformation (like most things), organizations want to do it right. Do it faster. Do it cheaper. And do it without the risk of breaking everything. To accomplish all of this, you need the right tools.

The erwin Data Intelligence (DI) Suite is the heart of the erwin EDGE platform for creating an “enterprise data governance experience.” erwin DI combines data cataloging and data literacy capabilities to provide greater awareness of and access to available data assets, guidance on how to use them, and guardrails to ensure data policies and best practices are followed.

erwin Data Catalog automates enterprise metadata management, data mapping, reference data management, code generation, data lineage and impact analysis. It efficiently integrates and activates data in a single, unified catalog in accordance with business requirements. With it, you can:

Schedule ongoing scans of metadata from the widest array of data sources.
Keep metadata current with full versioning and change management.
Easily map data elements from source to target, including data in motion, and harmonize data integration across platforms.

erwin Data Literacy provides self-service, role-based, contextual data views. It also provides a business glossary for the collaborative definition of enterprise data in business terms, complete with built-in accountability and workflows. With it, you can:

Enable data consumers to define and discover data relevant to their roles.
Facilitate the understanding and use of data within a business context.
Ensure the organization is fluent in the language of data.

With data governance and intelligence, enterprises can discover, understand, govern and socialize mission-critical information. And because many of the associated processes can be automated, you reduce errors and reliance on technical resources while increasing the speed and quality of your data pipeline to accomplish whatever your strategic objectives are, including digital transformation.

Check out our latest whitepaper, Data Intelligence: Empowering the Citizen Analyst with Democratized Data.

erwin Expert Blog Data Governance Data Intelligence

Demystifying Data Lineage: Tracking Your Data’s DNA

Post author By Danny Sandwell
Post date November 1, 2018

Getting the most out of your data requires getting a handle on data lineage. That’s knowing what data you have, where it is, and where it came from – plus understanding its quality and value to the organization.

But you can’t understand your data in a business context, much less track its lineage and physical existence or maximize its security, quality and value, if it’s scattered across different silos in numerous applications.

Data lineage provides a way of tracking data from its origin to destination across its lifespan and all the processes it’s involved in. It also plays a vital role in data governance. Beyond the simple ability to know where the data came from and whether or not it can be trusted, there’s an element of statutory reporting and compliance that often requires a knowledge of how that same data (known or unknown, governed or not) has changed over time.

A platform that provides insights like data lineage, impact analysis, full-history capture, and other data management features serves as a central hub from which everything can be learned and discovered about the data – whether a data lake, a data vault or a traditional data warehouse.
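As a rough illustration of what lineage and impact analysis mean in practice, data flows can be modeled as a directed graph: lineage walks the graph backward from an asset to its origins, and impact analysis walks it forward to everything that depends on the asset. The sketch below is a minimal, assumed model; the `LineageGraph` class and the asset names are hypothetical and not taken from any product.

```python
from collections import defaultdict

class LineageGraph:
    """Directed graph of data flows between named assets."""

    def __init__(self):
        self.upstream = defaultdict(set)    # target -> direct sources
        self.downstream = defaultdict(set)  # source -> direct targets

    def add_flow(self, source, target):
        self.upstream[target].add(source)
        self.downstream[source].add(target)

    def _walk(self, start, edges):
        # Depth-first traversal that tolerates cycles.
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in edges.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    def lineage(self, asset):
        """Everything the asset was derived from."""
        return self._walk(asset, self.upstream)

    def impact(self, asset):
        """Everything that would be affected if the asset changed."""
        return self._walk(asset, self.downstream)

g = LineageGraph()
g.add_flow("crm.customer.email", "staging.customer.email")
g.add_flow("staging.customer.email", "warehouse.dim_customer.email")
g.add_flow("warehouse.dim_customer.email", "report.churn.contact")
g.add_flow("billing.invoice.total", "report.churn.revenue")

print(sorted(g.lineage("report.churn.contact")))
print(sorted(g.impact("crm.customer.email")))
```

The same structure answers both the compliance question (where did this reported value come from?) and the change-management question (what breaks if this source column changes?).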

In a traditional data management organization, Excel spreadsheets are used to manage the incoming data design, what’s known as the “pre-ETL” mapping documentation, but this does not provide any sort of visibility or auditability. In fact, each unit of work represented in these “mapping documents” becomes an independent variable in the overall system development lifecycle, and is therefore nearly impossible to learn from, much less standardize.

The key to accuracy and integrity in any exercise is to eliminate the opportunity for human error – which does not mean eliminating humans from the process but incorporating the right tools to reduce the likelihood of error as the human beings apply their thought processes to the work.

Data Lineage: A Crucial First Step for Data Governance

Knowing what data you have, where it lives and where it came from is complicated. The lack of visibility and control around “data at rest” combined with “data in motion,” as well as difficulties with legacy architectures, means organizations spend more time finding the data they need than using it to produce meaningful business outcomes.

Organizations need to create and sustain an enterprise-wide view of and easy access to underlying metadata, but that’s a tall order with numerous data types and data sources that were never designed to work together and data infrastructures that have been cobbled together over time with disparate technologies, poor documentation and little thought for downstream integration. So the applications and initiatives that depend on a solid data infrastructure may be compromised, resulting in faulty analyses.

These issues can be addressed with a strong data management strategy underpinned by technology that enables the data quality the business requires, which encompasses data cataloging (integration of data sets from various sources), mapping, versioning, business rules and glossaries maintenance and metadata management (associations and lineage).

Centralized design, immediate lineage and impact analysis, and change-activity logging mean you will always have answers readily available, or just a few clicks away. Subsets of data can be identified and generated via predefined templates, generic designs can be generated from standard mapping documents, and both can be pushed through the ETL process via automation templates for faster processing.

With automation, data quality is systemically assured and the data pipeline is seamlessly governed and operationalized to the benefit of all stakeholders. Without such automation, business transformation will be stymied. Companies, especially large ones with thousands of systems, files and processes, will be particularly challenged by a manual approach. And outsourcing these data management efforts to professional services firms only increases costs and schedule delays.

With erwin Mapping Manager, organizations can automate enterprise data mapping and code generation for faster time-to-value and greater accuracy when it comes to data movement projects, as well as synchronize “data in motion” with data management and governance efforts.

Map data elements to their sources within a single repository to determine data lineage, deploy data warehouses and other Big Data solutions, and harmonize data integration across platforms. The web-based solution reduces the need for specialized, technical resources with knowledge of ETL and database procedural code, while making it easy for business analysts, data architects, ETL developers, testers and project managers to collaborate for faster decision-making.


Top 10 Reasons to Automate Data Mapping and Data Preparation

Post author By Mariann McDonagh
Post date October 11, 2018

Data preparation is notorious for being the most time-consuming area of data management. It’s also expensive.

“Surveys show the vast majority of time is spent on this repetitive task, with some estimates showing it takes up as much as 80% of a data professional’s time,” according to Information Week. And a Trifacta study notes that overreliance on IT resources for data preparation costs organizations billions.

The power of collecting your data can come in a variety of forms, but most often in IT shops around the world, it comes in a spreadsheet, or rather a collection of spreadsheets often numbering in the hundreds or thousands.

Most organizations, especially those competing in the digital economy, don’t have enough time or money for data management using manual processes. And outsourcing is also expensive, with inevitable delays because these vendors are dependent on manual processes too.

Taking the Time and Pain Out of Data Preparation: 10 Reasons to Automate Data Preparation/Data Mapping

Governance and Infrastructure

Data governance and a strong IT infrastructure are critical in the valuation, creation, storage, use, archival and deletion of data. Beyond the simple ability to know where the data came from and whether or not it can be trusted, there is an element of statutory reporting and compliance that often requires a knowledge of how that same data (known or unknown, governed or not) has changed over time.

A design platform that allows for insights like data lineage, impact analysis, full history capture, and other data management features can provide a central hub from which everything can be learned and discovered about the data – whether a data lake, a data vault, or a traditional warehouse.

Eliminating Human Error

In the traditional data management organization, Excel spreadsheets are used to manage the incoming data design, or what is known as the “pre-ETL” mapping documentation – this does not lend itself to any sort of visibility or auditability. In fact, each unit of work represented in these “mapping documents” becomes an independent variable in the overall system development lifecycle, and is therefore nearly impossible to learn from, much less standardize.

The key to creating accuracy and integrity in any exercise is to eliminate the opportunity for human error – which does not mean eliminating humans from the process but incorporating the right tools to reduce the likelihood of error as the human beings apply their thought processes to the work.

Completeness

The ability to scan and import from a broad range of sources and formats, as well as automated change tracking, means that you will always be able to import your data from wherever it lives and track all of the changes to that data over time.
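Automated change tracking can be pictured as comparing successive metadata snapshots. The following sketch assumes each snapshot is a plain dictionary mapping a column name to its type; the `diff_snapshots` function and the sample snapshots are hypothetical.

```python
def diff_snapshots(old, new):
    """Compare two {column_name: type} snapshots of the same source."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(c for c in set(old) & set(new) if old[c] != new[c])
    return {"added": added, "removed": removed, "changed": changed}

# Hypothetical scans of the same table taken a few days apart.
monday = {"customer.id": "INTEGER", "customer.email": "TEXT", "customer.fax": "TEXT"}
friday = {"customer.id": "BIGINT", "customer.email": "TEXT", "customer.phone": "TEXT"}

changes = diff_snapshots(monday, friday)
print(changes)
```

Storing each diff alongside a timestamp and the identity of whoever made the change is what turns a series of scans into an audit trail.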

Adaptability

Centralized design, immediate lineage and impact analysis, and change activity logging mean that you will always have the answer readily available, or a few clicks away. Subsets of data can be identified and generated via predefined templates, generic designs can be generated from standard mapping documents, and both can be pushed through the ETL process via automation templates for faster processing.

Accuracy

Out-of-the-box capabilities to map your data from source to report make reconciliation and validation a snap, with auditability and traceability built in. Build a full array of validation rules that can be cross-checked with the design mappings in a centralized repository.
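To make the idea of cross-checking validation rules against design mappings concrete, here is a minimal sketch. It assumes mappings are stored as structured records rather than spreadsheet rows; the `Mapping` record, the three rules and all column names are illustrative assumptions, not a description of any specific tool.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Mapping:
    source: str          # fully qualified source column
    target: str          # fully qualified target column
    transform: str = ""  # optional transformation note

def validate(mappings, source_columns, required_targets):
    """Return a list of human-readable problems found in the mappings."""
    problems = []
    seen = set()
    for m in mappings:
        if m.source not in source_columns:
            problems.append(f"unknown source: {m.source}")
        if m.target in seen:
            problems.append(f"duplicate target: {m.target}")
        seen.add(m.target)
    for target in sorted(required_targets - seen):
        problems.append(f"unmapped target: {target}")
    return problems

source_columns = {"crm.customer.email", "billing.invoice.total"}
required_targets = {"report.contact", "report.revenue", "report.region"}
mappings = [
    Mapping("crm.customer.email", "report.contact", "LOWER"),
    Mapping("billing.invoice.amount", "report.revenue"),  # source does not exist
]

for problem in validate(mappings, source_columns, required_targets):
    print(problem)
```

Because the rules run against one repository of mappings, the same checks apply to every project, which is difficult to guarantee when each mapping lives in its own spreadsheet.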

Timeliness

The ability to be agile and reactive is important – being good at being reactive doesn’t sound like a quality that deserves a pat on the back, but in the case of regulatory requirements, it is paramount.

Comprehensiveness

With access to all of the underlying metadata, source-to-report design mappings, and source and target repositories, you have the power to create reports within your reporting layer that have a traceable origin and can be easily explained to IT, business, and regulatory stakeholders alike.

Clarity

The requirements inform the design, the design platform puts them into action, and the reporting structures are fed the right data to create the right information at the right time via nearly any reporting platform, whether mainstream commercial or homegrown.

Frequency

Adaptation is the key to meeting any frequency interval. Centralized designs and automated ETL patterns that feed your database schemas and reporting structures allow cyclical changes to be made and implemented in half the time required by conventional means. Getting beyond the spreadsheet, enabling pattern-based ETL, and automating schema population are ways to ensure you will be ready, whenever the need arises, to show an audit trail of the change process and clearly articulate who did what and when through the system development lifecycle.

Business-Friendly

A user interface designed to be business-friendly means there’s no need to be a data integration specialist to review the common practices outlined and “passively enforced” throughout the tool. Once a process is defined, rules implemented, and templates established, there is little opportunity for error or deviation from the overall process. A diverse set of role-based security options means that everyone can collaborate, learn and audit while maintaining the integrity of the underlying process components.

Faster, More Accurate Analysis with Fewer People

What if you could get more accurate data preparation 50% faster and double your analysis with fewer people?

erwin Mapping Manager (MM) is a patented solution that automates data mapping throughout the enterprise data integration lifecycle, providing data visibility, lineage and governance – freeing up that 80% of a data professional’s time to put that data to work.

With erwin MM, data integration engineers can design and reverse-engineer the movement of data implemented as ETL/ELT operations and stored procedures, building mappings between source and target data assets and designing the transformation logic between them. These designs then can be exported to most ETL and data asset technologies for implementation.
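The general idea of generating load code from a source-to-target mapping can be sketched in a few lines. This is a deliberately simplified illustration and does not reflect erwin MM’s actual export formats; the `generate_insert_select` function, the table names and the transformation expressions are all assumptions made for the example, which runs against an in-memory SQLite database.

```python
import sqlite3

def generate_insert_select(target_table, source_table, column_map):
    """Build an INSERT ... SELECT statement from a mapping.

    column_map is a list of (target_column, source_expression) pairs.
    """
    targets = ", ".join(t for t, _ in column_map)
    exprs = ", ".join(e for _, e in column_map)
    return f"INSERT INTO {target_table} ({targets}) SELECT {exprs} FROM {source_table}"

# Hypothetical source and target tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src_customer (id INTEGER, email TEXT)")
conn.execute("CREATE TABLE dim_customer (customer_key INTEGER, email TEXT)")
conn.execute("INSERT INTO src_customer VALUES (1, ' Ada@Example.COM ')")

# The mapping carries the transformation logic for each target column.
sql = generate_insert_select(
    "dim_customer",
    "src_customer",
    [("customer_key", "id"), ("email", "LOWER(TRIM(email))")],
)
conn.execute(sql)
loaded = conn.execute("SELECT customer_key, email FROM dim_customer").fetchall()
print(sql)
print(loaded)
```

Keeping the mapping as data and deriving the code from it means a change to the design is made once and regenerated, rather than hand-edited in every script that implements it.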

erwin MM is 100% metadata-driven and used to define and drive standards across enterprise integration projects, enable data and process audits, improve data quality, streamline downstream workflows, increase productivity (especially over geographically dispersed teams) and give project teams, IT leadership and management visibility into the “real” status of integration and ETL migration projects.

If an automated data preparation/mapping solution sounds good to you, please check out erwin MM here.


Healthy Co-Dependency: Data Management and Data Governance

Post author By Bunny Tharpe
Post date September 7, 2018

Data management and data governance are now more important than ever. The hypercompetitive nature of data-driven business means organizations need to get more out of their data than before – and fast.

A few data-driven exemplars have led the way, turning data into actionable insights that influence everything from corporate structure to new products and pricing. “Few” being the operative word.

It’s true, data-driven business is big business. Huge actually. But it’s dominated by a handful of organizations that realized early on what a powerful and disruptive force data can be.

The benefits of such data-driven strategies speak for themselves: Netflix has replaced Blockbuster, and Uber continues to shake up the taxi business. Organizations in every industry are following suit, fighting to become the next big, disruptive players.

But in many cases, these attempts have failed or are on the verge of doing so.

Now with the General Data Protection Regulation (GDPR) in effect, data that is unaccounted for is a potential data disaster waiting to happen.

So organizations need to understand that getting more out of their data isn’t necessarily about collecting more data. It’s about unlocking the value of the data they already have.

The Enterprise Data Dilemma

However, most organizations don’t know exactly what data they have or even where some of it is. And some of the data they can account for is going to waste because they don’t have the means to process it. This is especially true of unstructured data types, which organizations are collecting more frequently.

Considering that 73 percent of company data goes unused, it’s safe to assume your organization is dealing with some if not all of these issues.

Big picture, this means your enterprise is missing out on thousands, perhaps millions in revenue.

The smaller picture? You’re struggling to establish a single source of data truth, which contributes to a host of problems:

Inaccurate analysis and discrepancies in departmental reporting
Inability to manage the amount and variety of data your organization collects
Duplications and redundancies in processes
Issues determining data ownership, lineage and access
Difficulty achieving and sustaining compliance

To avoid such circumstances and get more value out of data, organizations need to harmonize their approach to data management and data governance, using a platform of established tools that work in tandem while also enabling collaboration across the enterprise.

Data management drives the design, deployment and operation of systems that deliver operational data assets for analytics purposes.

Data governance delivers these data assets within a business context, tracking their physical existence and lineage, and maximizing their security, quality and value.

Although these two disciplines approach data from different perspectives (IT-driven and business-oriented), they depend on each other. And this co-dependency helps an organization make the most of its data.

The P-M-G Hub

Together, data management and data governance form a critical hub for data preparation, modeling and data governance. How?

It starts with a real-time, accurate picture of the data landscape, including “data at rest” in databases, data warehouses and data lakes and “data in motion” as it is integrated with and used by key applications. That landscape also must be controlled to facilitate collaboration and limit risk.

But knowing what data you have and where it lives is complicated, so you need to create and sustain an enterprise-wide view of and easy access to underlying metadata. That’s a tall order with numerous data types and data sources that were never designed to work together and data infrastructures that have been cobbled together over time with disparate technologies, poor documentation and little thought for downstream integration. So the applications and initiatives that depend on a solid data infrastructure may be compromised, and data analysis may be based on faulty insights.

However, these issues can be addressed with a strong data management strategy and technology to enable the data quality required by the business, which encompasses data cataloging (integration of data sets from various sources), mapping, versioning, business rules and glossaries maintenance and metadata management (associations and lineage).

Being able to pinpoint what data exists and where must be accompanied by an agreed-upon business understanding of what it all means in common terms that are adopted across the enterprise. Having that consistency is the only way to assure that insights generated by analyses are useful and actionable, regardless of business department or user exploring a question. Additionally, policies, processes and tools that define and control access to data by roles and across workflows are critical for security purposes.

These issues can be addressed with a comprehensive data governance strategy and technology to determine master data sets, discover the impact of potential glossary changes across the enterprise, audit and score adherence to rules, discover risks, and appropriately and cost-effectively apply security to data flows, as well as publish data to people/roles in ways that are meaningful to them.

Data Management and Data Governance: Play Together, Stay Together

When data management and data governance work in concert, empowered by the right technology, they inform, guide and optimize each other. The result for an organization that takes such a harmonized approach is an automated, real-time, high-quality data pipeline.

Then all stakeholders – data scientists, data stewards, ETL developers, enterprise architects, business analysts, compliance officers, CDOs and CEOs – can access the data they’re authorized to use and base strategic decisions on what is now a full inventory of reliable information.

The erwin EDGE creates an “enterprise data governance experience” through integrated data mapping, business process modeling, enterprise architecture modeling, data modeling and data governance. No other software platform on the market touches every aspect of the data management and data governance lifecycle to automate and accelerate the speed to actionable business insights.

Tags data modeling, enterprise architecture, GDPR, data management, data warehouse, data governance, business process modeling, data-driven business, metadata management, collaborative data governance, enterprise data governance experience, data lineage, data analysis, data lake, data mapping, data mangement, data sources, disparate data, data infrastructure


Solving the Enterprise Data Dilemma

Post author By Mariann McDonagh
Post date August 30, 2018

Due to the adoption of data-driven business, organizations across the board are facing their own enterprise data dilemmas.

This week erwin announced its acquisition of metadata management and data governance provider AnalytiX DS. The combined company touches every piece of the data management and governance lifecycle, enabling enterprises to fuel automated, high-quality data pipelines for faster speed to accurate, actionable insights.

Why Is This a Big Deal?

From digital transformation to AI, and everything in between, organizations are flooded with data. So, companies are investing heavily in initiatives to use all the data at their disposal, but they face some challenges, chiefly deriving meaningful insights from their data and turning them into actions that improve the bottom line.

This enterprise data dilemma stems from three important but difficult questions to answer: What data do we have? Where is it? And how do we get value from it?

Large enterprises use thousands of unharvested, undocumented databases, applications, ETL processes and procedural code that make it difficult to gather business intelligence, conduct IT audits, and ensure regulatory compliance – not to mention accomplish other objectives around customer satisfaction, revenue growth and overall efficiency and decision-making.

The lack of visibility and control around “data at rest” combined with “data in motion”, as well as difficulties with legacy architectures, means these organizations spend more time finding the data they need than using it to produce meaningful business outcomes.

To remedy this, enterprises need smarter and faster data management and data governance capabilities, including the ability to efficiently catalog and document their systems, processes and the associated data without errors. In addition, business and IT must collaborate outside their traditional operational silos.

But this coveted state of data nirvana isn’t possible without the right approach and technology platform.

Enterprise Data: Making the Data Management-Data Governance Love Connection

Bringing together data management and data governance delivers greater efficiencies to technical users and better analytics to business users. It’s like two sides of the same coin:

Data management drives the design, deployment and operation of systems that deliver operational and analytical data assets.
Data governance delivers these data assets within a business context, tracks their physical existence and lineage, and maximizes their security, quality and value.

Although these disciplines approach data from different perspectives and are used to produce different outcomes, they have a lot in common. Both require a real-time, accurate picture of an organization’s data landscape, including data at rest in data warehouses and data lakes and data in motion as it is integrated with and used by key applications.

However, creating and maintaining this metadata landscape is challenging because this data in its various forms and from numerous sources was never designed to work in concert. Data infrastructures have been cobbled together over time with disparate technologies, poor documentation and little thought for downstream integration, so the applications and initiatives that depend on data infrastructure are often out-of-date and inaccurate, rendering faulty insights and analyses.

Organizations need to know what data they have and where it’s located, where it came from and how it got there, and what it means in common, standardized business terms. They also need to be able to transform it into useful information they can act on – all while controlling access to it.

To support the total enterprise data management and governance lifecycle, they need an automated, real-time, high-quality data pipeline. Then every stakeholder – data scientist, ETL developer, enterprise architect, business analyst, compliance officer, CDO and CEO – can fuel the desired outcomes with reliable information on which to base strategic decisions.

Enterprise Data: Creating Your “EDGE”

At the end of the day, all industries are in the data business and all employees are data people. The success of an organization is not measured by how much data it has, but by how well it’s used.

Data governance enables organizations to use their data to fuel compliance, innovation and transformation initiatives with greater agility, efficiency and cost-effectiveness.

Organizations need to understand their data from different perspectives, identify how it flows through and impacts the business, align this business view with a technical view of the data management infrastructure, and synchronize efforts across both disciplines for accuracy, agility and efficiency in building a data capability that impacts the business in a meaningful and sustainable fashion.

The persona-based erwin EDGE creates an “enterprise data governance experience” that facilitates collaboration between IT and the business to discover, understand and unlock the value of data both at rest and in motion.

By bringing together enterprise architecture, business process, data mapping and data modeling, erwin’s approach to data governance enables organizations to get a handle on how they handle their data. With the broadest set of metadata connectors and automated code generation, data mapping and cataloging tools, the erwin EDGE Platform simplifies the total data management and data governance lifecycle.

This single, integrated solution makes it possible to gather business intelligence, conduct IT audits, ensure regulatory compliance and accomplish any other organizational objective by fueling an automated, high-quality and real-time data pipeline.

With the erwin EDGE, data management and data governance are unified and mutually supportive, with one hand aware of and informed by the efforts of the other to:

Discover data: Identify and integrate metadata from various data management silos.
Harvest data: Automate the collection of metadata from various data management silos and consolidate it into a single source.
Structure data: Connect physical metadata to specific business terms and definitions and reusable design standards.
Analyze data: Understand how data relates to the business and what attributes it has.
Map data flows: Identify where to integrate data and track how it moves and transforms.
Govern data: Develop a governance model to manage standards and policies and set best practices.
Socialize data: Enable stakeholders to see data in one place and in the context of their roles.
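
The first few steps above (harvesting metadata from separate silos into a single source, then connecting physical metadata to business terms) can be illustrated with a small sketch. This is not erwin's API or implementation; the `Catalog` and `ColumnMeta` classes, the silo names and the schemas are all hypothetical, chosen only to show the idea of one consolidated catalog that can answer where a business concept physically lives.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ColumnMeta:
    """Physical metadata for one column, as harvested from a source system."""
    source: str   # hypothetical silo name, e.g. "crm_db"
    table: str
    column: str
    dtype: str


@dataclass
class Catalog:
    """A single consolidated store of harvested metadata (hypothetical)."""
    columns: list = field(default_factory=list)
    glossary: dict = field(default_factory=dict)  # business term -> linked columns

    def harvest(self, source, schema):
        """Collect column metadata from one silo's schema description."""
        for table, cols in schema.items():
            for column, dtype in cols.items():
                self.columns.append(ColumnMeta(source, table, column, dtype))

    def link_term(self, term, predicate):
        """Connect a business term to every harvested column matching predicate."""
        matches = [c for c in self.columns if predicate(c)]
        self.glossary.setdefault(term, []).extend(matches)
        return matches

    def sources_for(self, term):
        """List the silos in which a business term is physically stored."""
        return sorted({c.source for c in self.glossary.get(term, [])})


# Two made-up silos that store the same concept under different column names.
catalog = Catalog()
catalog.harvest("crm_db", {"customers": {"cust_email": "varchar", "cust_id": "int"}})
catalog.harvest("billing_dw", {"invoices": {"email_addr": "text", "amount": "decimal"}})

# Structure: tie the physical columns to one shared business term.
catalog.link_term("Customer Email", lambda c: "email" in c.column)
print(catalog.sources_for("Customer Email"))  # ['billing_dw', 'crm_db']
```

A real deployment would rely on automated connectors and curated mappings rather than a simple name match, but the shape is the same: harvest once, link to shared terms, and let every stakeholder query a single view.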

An integrated solution with data preparation, modeling and governance helps businesses reach data governance maturity: a role-based, collaborative data governance system that serves both IT and business users equally. Such maturity may not happen overnight, but it will ultimately deliver the accurate and actionable insights your organization needs to compete and win.

Your journey to data nirvana begins with a demo of the enhanced erwin Data Governance solution. Register now.

erwin Expert Blog

Big Data Posing Challenges? Data Governance Offers Solutions

Post author By Michael Pastore
Post date August 17, 2018

Big Data is causing complexity for many organizations, not just because of the volume of data they’re collecting, but also because of its variety.

Big Data often consists of unstructured data that streams into businesses from social media networks, internet-connected sensors, and more. But the data operations at many organizations were not designed to handle this flood of unstructured data.

Dealing with the volume, velocity and variety of Big Data is causing many organizations to re-think how they store and govern their data. A perfect example is the data warehouse. The people who built and manage the data warehouse at your organization built something that made sense to them at the time. They understood what data was stored where and why, as well as how it was used by business units and applications.

The era of Big Data introduced inexpensive data lakes to some organizations’ data operations, but as vast amounts of data poured into these lakes, many IT departments found themselves managing a data swamp instead.

In a perfect world, your organization would treat Big Data like any other type of data. But, alas, the world is not perfect. In reality, practicality and human nature intervene. Many new technologies, when first adopted, are separated from the rest of the infrastructure.

“New technologies are often looked at in a vacuum, and then built in a silo,” says Danny Sandwell, director of product marketing for erwin, Inc.

That leaves many organizations with parallel collections of data: one for so-called “traditional” data and one for the Big Data.

There are a few problems with this outcome. For one, silos in IT have a long history of keeping organizations from understanding what they have, where it is, why they need it, and whether it’s of any value. They also have a tendency to increase costs because they don’t share common IT resources, leading to redundant infrastructure and complexity. Finally, silos usually mean increased risk.

But there’s another reason why parallel operations for Big Data and traditional data don’t make much sense: The users simply don’t care.

At the end of the day, your users want access to the data they need to do their jobs, and whether IT considers it Big Data, little data, or medium-sized data isn’t important. What’s most important is that the data is the right data – meaning it’s accurate, relevant and can be used to support or oppose a decision.

How Data Governance Turns Big Data into Just Plain Data

According to a November 2017 survey by erwin and UBM, 21 percent of respondents cited Big Data as a driver of their data governance initiatives.

In today’s data-driven world, data governance can help your business understand what data it has, how good it is, where it is, and how it’s used. The erwin/UBM survey found that 52 percent of respondents said data is critically important to their organization and they have a formal data governance strategy in place. But almost as many respondents (46 percent) said they recognize the value of data to their organization but don’t have a formal governance strategy.

A holistic approach to data governance includes these key components:

An enterprise architecture component is important because it aligns IT and the business, mapping a company’s applications and the associated technologies and data to the business functions they enable. By integrating data governance with enterprise architecture, businesses can define application capabilities and interdependencies in the context of enterprise strategy, then prioritize technology investments so they align with business goals and produce the desired outcomes.
A business process and analysis component defines how the business operates and ensures employees understand and are accountable for carrying out the processes for which they are responsible. Enterprises can clearly define, map and analyze workflows and build models to drive process improvements, as well as identify business practices susceptible to the greatest security, compliance or other risks and where controls are most needed to mitigate exposures.
A data modeling component is the best way to design and deploy new databases with high-quality data sources and support application development. Being able to cost-effectively and efficiently discover, visualize and analyze “any data” from “anywhere” underpins large-scale data integration, master data management, Big Data and business intelligence/analytics with the ability to synthesize, standardize and store data sources from a single design, as well as reuse artifacts across projects.

When data governance is done right, and it’s woven into the structure and architecture of your business, it helps your organization accept new technologies and the new sources of data they provide as they come along. This makes it easier to see ROI and ROO from your Big Data initiatives by managing Big Data in the same manner your organization treats all of its data – by understanding its metadata, defining its relationships, and defining its quality.

Furthermore, businesses that apply sound data governance will find themselves with a template or roadmap they can use to integrate Big Data throughout their organizations.

If your business isn’t capitalizing on the Big Data it’s collecting, then it’s throwing away dollars spent on data collection, storage and analysis. Just as bad, however, is a situation where all of that data and analysis is leading to the wrong decisions and poor business outcomes because the data isn’t properly governed.

You can determine how effective your current data governance initiative is by taking erwin’s DG RediChek.

Tags data modeling, enterprise architecture, business process, big data, unstructured data, data warehouse, data governance drivers, state of data governance report, state of DG, three vs three, volume variety, data lake, data swamp, type of data, traditional data, Big Data data governance
