
Are Data Governance Bottlenecks Holding You Back?

Better decision-making has now topped compliance as the primary driver of data governance. However, organizations still encounter a number of bottlenecks that may hold them back from fully realizing the value of their data in producing timely and relevant business insights.

While acknowledging that data governance is about more than risk management and regulatory compliance may indicate that companies are more confident in their data, the data governance practice is nonetheless growing in complexity because of more:

  • Data to handle, much of it unstructured
  • Sources, like IoT
  • Points of integration
  • Regulations

Without an accurate, high-quality, real-time enterprise data pipeline, it will be difficult to uncover the necessary intelligence to make optimal business decisions.

So what’s holding organizations back from fully using their data to make better, smarter business decisions?

Data Governance Bottlenecks

erwin’s 2020 State of Data Governance and Automation report, based on a survey of business and technology professionals at organizations of various sizes and across numerous industries, examined the role of automation in data governance and intelligence efforts. It uncovered a number of obstacles that organizations have to overcome to improve their data operations.

The No.1 bottleneck, according to 62 percent of respondents, was documenting complete data lineage. Understanding the quality of source data is the next most serious bottleneck (58 percent); followed by finding, identifying, and harvesting data (55 percent); and curating assets with business context (52 percent).

The report revealed that all but two of the possible bottlenecks were marked by more than 50 percent of respondents. Clearly, there’s a massive need for a data governance framework to keep these obstacles from stymying enterprise innovation.

As we zeroed in on the bottlenecks of day-to-day operations, 25 percent of respondents said length of project/delivery time was the most significant challenge, followed by data quality/accuracy at 24 percent, time to value at 16 percent, and reliance on developer and other technical resources at 13 percent.


Overcoming Data Governance Bottlenecks

The 80/20 rule describes the unfortunate reality for many data stewards: they spend 80 percent of their time finding, cleaning and reorganizing huge amounts of data and only 20 percent on actual data analysis.

In fact, we found that close to 70 percent of our survey respondents spent an average of 10 or more hours per week on data-related activities, most of it searching for and preparing data.

What can you do to reverse the 80/20 rule and subsequently overcome data governance bottlenecks?

1. Don’t ignore the complexity of data lineage: Supporting data lineage with a manual approach is a risky endeavor, and businesses that attempt it will find it isn’t sustainable given data’s constant movement from one place to another via multiple routes, especially if lineage is to be documented correctly down to the column level. Adopting automated end-to-end lineage makes it possible to view data movement from the source to reporting structures, providing a comprehensive and detailed view of data in motion.

2. Automate code generation: Alleviate the need for developers to hand-code connections from data sources to target schemas (see the code-generation sketch following this list). Mapping data elements to their sources within a single repository to determine data lineage and harmonize data integration across platforms reduces the need for specialized, technical resources with knowledge of ETL and database procedural code. It also makes it easier for business analysts, data architects, ETL developers, testers and project managers to collaborate for faster decision-making.

3. Use an integrated impact analysis solution: By automating data due diligence for IT, you can deliver operational intelligence to the business. Business users benefit from automated impact analysis to better examine value and prioritize individual data sets. Impact analysis is equally important to IT for automatically tracking changes and understanding how data from one system feeds other systems and reports (see the impact-analysis sketch following this list). This is an aspect of data lineage, created from technical metadata, that ensures nothing “breaks” along the change train.

4. Put data quality first: Users must have confidence in the data they use for analytics. Automatically matching business terms with data assets and documenting lineage down to the column level are critical to good decision-making. If that hasn’t been the practice to date, enterprises should take a few steps back to review data quality measures before jumping into automating data analytics.

5. Catalog data using a solution with a broad set of metadata connectors: Ensure all data sources can be leveraged, including big data, ETL platforms, BI reports, modeling tools, mainframe and relational data, as well as data from many other types of systems. Don’t settle for a data catalog from an emerging vendor that supports only a narrow swath of newer technologies, and don’t rely on a catalog from a legacy provider that may supply connectors only for standard, more mature data sources.

6. Stress data literacy: You want to ensure that data assets are used strategically. Automation expedites the benefits of data cataloging. Curating internal and external datasets for a range of content authors multiplies business benefits and ensures effective management and monetization of data assets in the long term, provided it is linked to broader data governance, data quality and metadata management initiatives. There’s a clear connection to data literacy here because of its foundation in business glossaries and in socializing data so all stakeholders can view and understand it within the context of their roles.

7. Make automation the norm across all data governance processes: Too many companies still live in a world where data governance is a high-level mandate, not practically implemented. To fully realize the advantages of data governance and the power of data intelligence, data operations must be automated across the board. Without automated data management, the governance housekeeping load on the business will be so great that data quality will inevitably suffer. Being able to account for all enterprise data and resolve disparity in data sources and silos using manual approaches is wishful thinking.

8. Craft your data governance strategy before making any investments: Gather multiple stakeholders, both business and IT, with multiple viewpoints to discover where their needs mesh, where they diverge, and what represents the greatest pain points to the business. Solve for these first, but build buy-in by creating a layered, comprehensive strategy that ultimately will address most issues. From there, it’s on to matching your needs to an automated data governance solution that squares with business and IT, both for immediate requirements and future plans.
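
To illustrate point 2 above, here’s a minimal, hypothetical sketch of metadata-driven code generation: a source-to-target mapping record is rendered into load SQL so nobody has to hand-code the connection. The table and column names are invented, and commercial tools generate far richer ETL logic than this.

```python
# Hypothetical sketch: render load SQL from a source-to-target mapping record
# so the connection doesn't have to be hand-coded. Names are illustrative.
mapping = {
    "target_table": "dw.customer_dim",
    "source_table": "crm.customers",
    "columns": {                      # target column -> source expression
        "customer_key": "id",
        "full_name": "first_name || ' ' || last_name",
        "created_date": "CAST(created_at AS DATE)",
    },
}

def generate_insert_select(m: dict) -> str:
    """Turn one mapping record into an INSERT ... SELECT statement."""
    targets = ", ".join(m["columns"].keys())
    sources = ",\n    ".join(m["columns"].values())
    return (
        f"INSERT INTO {m['target_table']} ({targets})\n"
        f"SELECT\n    {sources}\nFROM {m['source_table']};"
    )

print(generate_insert_select(mapping))
```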
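
And to illustrate point 3, here’s a small sketch of impact analysis over lineage captured as technical metadata: a directed “feeds” graph is walked to find every downstream asset a change would touch. The asset names are invented.

```python
# Sketch: walk a lineage graph (asset -> assets it feeds) to find everything
# downstream of a change. Asset names are invented for illustration.
from collections import deque

lineage = {
    "crm.customers": ["staging.customers"],
    "staging.customers": ["dw.customer_dim"],
    "dw.customer_dim": ["bi.revenue_report", "bi.churn_dashboard"],
}

def downstream_impact(asset: str) -> set:
    """Breadth-first search for every asset fed, directly or indirectly."""
    impacted, queue = set(), deque([asset])
    while queue:
        for child in lineage.get(queue.popleft(), []):
            if child not in impacted:
                impacted.add(child)
                queue.append(child)
    return impacted

print(sorted(downstream_impact("crm.customers")))
# ['bi.churn_dashboard', 'bi.revenue_report', 'dw.customer_dim', 'staging.customers']
```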

Register now for “The What & Why of Data Governance,” the first in a new, six-part webinar series on the practice of data governance and how to proactively deal with its complexities, on Tuesday, Feb. 23 at 3 p.m. GMT/10 a.m. ET.


Managing Emerging Technology Disruption with Enterprise Architecture

Emerging technology has always played an important role in business transformation. In the race to collect and analyze data, provide superior customer experiences, and manage resources, new technologies always interest IT and business leaders.

KPMG’s The Changing Landscape of Disruptive Technologies found that today’s businesses are showing the most interest in emerging technology like the Internet of Things (IoT), artificial intelligence (AI) and robotics. Other emerging technologies that are making headlines include natural language processing (NLP) and blockchain.

In many cases, emerging technologies such as these are not fully embedded into business environments. Before they enter production, organizations need to test and pilot their projects to help answer some important questions:

  • How do these technologies disrupt?
  • How do they provide value?

Enterprise Architecture’s Role in Managing Emerging Technology

Pilot projects that take a small number of incremental steps, with small funding increases along the way, help provide answers to these questions. If the pilot proves successful, it’s then up to the enterprise architecture team to explore what it takes to integrate these technologies into the IT environment.

This is the point where new technologies go from “emerging technologies” to becoming another solution in the stack the organization relies on to create the business outcomes it’s seeking.

One of the easiest, quickest ways to try to pilot and put new technologies into production is to use cloud-based services. All of the major public cloud platform providers have AI and machine learning capabilities.

Integrating new technologies based in the cloud will change the way the enterprise architecture team models the IT environment, but that’s actually a good thing.

Modeling can help organizations understand the complex integrations that bring cloud services into the organization, and help them better understand the service level agreements (SLAs), security requirements and contracts with cloud partners.

When done right, enterprise architecture modeling also will help the organization better understand the value of emerging technology and even cloud migrations that increasingly accompany them. Once again, modeling helps answer important questions, such as:

  • Does the model demonstrate the benefits that the business expects from the cloud?
  • Do the benefits remain even if some legacy apps and infrastructure need to remain on premises?
  • What type of savings do you see if you can’t consolidate enough to close an entire data center?
  • How does the risk change?

Many of the emerging technologies garnering attention today are on their way to becoming a standard part of the technology stack. But just as the web came before mobility, and mobility came before AI, other technologies will soon follow in their footsteps.

To most efficiently evaluate these technologies and decide if they are right for the business, organizations need to provide visibility to both their enterprise architecture and business process teams so everyone understands how their environment and outcomes will change.

When the enterprise architecture and business process teams use a common platform and model the same data, their results will be more accurate and their collaboration seamless. This will cut significant time off the process of piloting, deploying and seeing results.

Outcomes like more profitable products and better customer experiences are the ultimate business goals. Getting there first is important, but only if everything runs smoothly on the customer side. The disruption of new technologies should take place behind the scenes, after all.

And that’s where investing in pilot programs and enterprise architecture modeling demonstrate value as you put emerging technology to work.



Data Governance 2.0: Biggest Data Shakeups to Watch in 2018

This year we’ll see some huge changes in how we collect, store and use data, with Data Governance 2.0 at the epicenter. For many organizations, these changes will be reactive, as they have to adapt to new regulations. Others will use regulatory change as a catalyst to be proactive with their data. Ideally, you’ll want to be in the latter category.

Data-driven businesses and their relevant industries are experiencing unprecedented rates of change.

Not only has the amount of data exploded in recent years, but the insights data provides have increased too. In essence, we’re finding smaller units of data more useful while also collecting more than ever before.

At present, data opportunities are seemingly boundless, and we’ve barely begun to scratch the surface. So here are some of the biggest data shakeups to expect in 2018.


GDPR

The General Data Protection Regulation (GDPR) has organizations scrambling. Penalties for non-compliance go into immediate effect on May 25, with hefty fines – up to €20 million or 4 percent of the company’s global annual turnover, whichever is greater.

Although it’s a European mandate, the fact is that all organizations trading with Europe, not just those based within the continent, must comply. Because of this, we’re seeing a global effort to introduce new policies, procedures and systems to prepare on a scale we haven’t seen since Y2K.

It’s easy to view mandated change of this nature as a burden. But the change is well overdue – both from a regulatory and commercial point of view.

In terms of regulation, a globalized approach had to be introduced. Data doesn’t adhere to borders in the same way as physical materials, and conflicting standards within different states, countries and continents have made sufficient regulation difficult.

In terms of business, many organizations have stifled their digital transformation efforts to become data-driven, neglecting to properly govern the data that would enable it. GDPR requires a collaborative approach to data governance (DG), and when done right, will add value as well as achieve compliance.

Rise of Data Governance 2.0

Data Governance 1.0 has failed to gain a foothold because of its siloed, un-collaborative nature. It lacks focus on business outcomes, so business leaders have struggled to see the value in it. As a result, IT has been responsible for cataloging data elements to support search and discovery, yet it rarely understands the data’s context because it is removed from the operational side of the business. This means data is often incomplete and of poor quality, making effective data-driven business impossible.

Company-wide responsibility for data governance, encouraged by the new standards of regulation, stands to fundamentally change the way businesses view data governance. Data Governance 2.0 and its collaborative approach will become the new normal, meaning those with the most to gain from data and its insights will be directly involved in its governance.

This means more buy-in from C-level executives, line managers, etc. It means greater accountability, as well as improved discoverability and traceability. Most of all, it means better data quality that leads to faster, better decisions made with more confidence.

Escalated Digital Transformation

Digital transformation and its prominence won’t diminish this year. In fact, thanks to Data Governance 2.0, digital transformation is poised to accelerate – not slow down.

Organizations that commit to data governance beyond just compliance will reap the rewards. With a stronger data governance foundation, organizations undergoing digital transformation will enjoy a number of significant benefits, including better decision making, greater operational efficiency, improved data understanding and lineage, greater data quality, and increased revenue.

Data-driven exemplars, such as Amazon, Airbnb and Uber, have enjoyed these benefits, using them to disrupt and then dominate their respective industries. But you don’t have to be Amazon-sized to achieve them. De-siloing DG and treating it as a strategic initiative is the first step to data-driven success.

Data as Valuable Asset

Data became more valuable than oil in 2017. Yet despite this assessment, many businesses neglect to treat their data as a prized asset. For context, the Industrial Revolution was powered by machinery that had to be well-maintained to function properly, as downtime would result in loss. Such machinery adds value to a business, so it is inherently valuable.

Fast forward to 2018 with data at center stage. Because data is the value driver, the data itself is valuable. Just because it doesn’t have a physical presence doesn’t mean it is any less important than physical assets. So businesses will need to change how they perceive their data, and this is the year in which this thinking is likely to change.

DG-Enabled AI and IoT

Artificial Intelligence (AI) and the Internet of Things (IoT) aren’t new concepts. However, they’re yet to be fully realized with businesses still competing to carve a slice out of these markets.

As the two continue to expand, they will hypercharge the already accelerating volume of data – specifically unstructured data – to almost unfathomable levels. The three Vs of data tend to escalate in unison: as the volume increases, so does the velocity, the speed at which data must be processed. The variety of data – mostly unstructured in these cases – also increases, so to manage it, businesses will need to put effective data governance in place.

Alongside strong data governance practices, more and more businesses will turn to NoSQL databases to manage diverse data types.

For more best practices in business and IT alignment, and successfully implementing Data Governance 2.0, click here.



Data Modeling in a Jargon-filled World – In-memory Databases

With the volume and velocity of data increasing, in-memory databases provide a way to keep processing times low.

Traditionally, databases have stored their data on mechanical storage media such as hard disks. While this has contributed to durability, it’s also constrained attainable query speeds. Database and software designers have long realized this limitation and sought ways to harness the faster speeds of in-memory processing.

The traditional approach to database design – and analytics solutions to access them – includes in-memory caching, which retains a subset of recently accessed data in memory for fast access. While caching often worked well for online transaction processing (OLTP), it was not optimal for analytics and business intelligence. In these cases, the most frequently accessed information – rather than the most recently accessed information – is typically of most interest.
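
The distinction between recency and frequency is easy to see in a toy example. The sketch below (with invented table names) shows a small least-recently-used cache evicting exactly the data an analytics workload touches most often, which is why recency-based caching alone wasn’t optimal for BI.

```python
# Toy illustration of recency vs. frequency. An LRU cache keeps what was
# touched last; analytics workloads usually care about what is touched most.
from collections import Counter, OrderedDict

class LRUCache:
    def __init__(self, capacity: int):
        self.capacity, self.data = capacity, OrderedDict()

    def get(self, key, loader):
        if key in self.data:
            self.data.move_to_end(key)          # refresh recency
        else:
            self.data[key] = loader(key)        # simulate a read from disk
            if len(self.data) > self.capacity:
                self.data.popitem(last=False)   # evict least recently used
        return self.data[key]

accesses = ["sales", "sales", "sales", "inventory", "hr"]   # 'sales' is hot
cache, frequency = LRUCache(capacity=2), Counter(accesses)

for table in accesses:
    cache.get(table, loader=lambda name: f"<{name} rows>")

print(list(cache.data))          # ['inventory', 'hr'] -- the hot table was evicted
print(frequency.most_common(1))  # [('sales', 3)]      -- frequency says keep 'sales'
```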

That said, loading an entire data warehouse or even a large data mart into memory has been challenging until recent years.

In-Memory

A few key factors have made in-memory databases and analytics offerings relevant for more and more use cases. One has been the shift to 64-bit operating systems, which makes much more addressable memory available. And as one might assume, the availability of increasingly large and affordable memory solutions has also played a part.

Database and software developers have begun to take advantage of in-memory databases in a myriad of ways. These include the many key-value stores such as Amazon DynamoDB, which provide very low latency for IoT and a host of other use cases.
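
As a hedged example of that key-value pattern, the sketch below writes and reads one IoT reading through boto3. It assumes AWS credentials are configured and that a hypothetical table named SensorReadings already exists with device_id as the partition key and ts as the sort key.

```python
# Hedged sketch of a key-value access pattern on DynamoDB. Assumes boto3 is
# installed, credentials are configured, and a hypothetical "SensorReadings"
# table exists with device_id (partition key) and ts (sort key).
from decimal import Decimal
import boto3

table = boto3.resource("dynamodb").Table("SensorReadings")

# Write one reading; DynamoDB numbers are passed as Decimal rather than float.
table.put_item(Item={
    "device_id": "transformer-0042",
    "ts": "2021-01-01T00:00:04Z",
    "voltage": Decimal("239.7"),
})

# Point read by full key -- the low-latency lookup that makes key-value stores
# attractive for IoT-scale workloads.
item = table.get_item(
    Key={"device_id": "transformer-0042", "ts": "2021-01-01T00:00:04Z"}
).get("Item")
print(item)
```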

Businesses are also taking advantage of other in-memory options, from distributed in-memory NoSQL databases such as Aerospike to in-memory NewSQL databases such as VoltDB. However, for the remainder of this post, we’ll touch in more detail on several solutions with which you might be more familiar.

Some database vendors have chosen to build hybrid solutions that incorporate in-memory technologies. They aim to bridge in-memory with solutions based on tried-and-true, disk-based RDBMS technologies. Such vendors include Microsoft with its incorporation of xVelocity into SQL Server, Analysis Services and PowerPivot, and Teradata with its Intelligent Memory.

Other vendors, like IBM with its dashDB database, have chosen to deploy in-memory technology in the cloud, while capitalizing on previously developed or acquired technologies (in-database analytics from Netezza in the case of dashDB).

However, probably the most high-profile application of in-memory technology has been SAP’s significant bet on its HANA in-memory database, which first shipped in late 2010. SAP has since made it available in the cloud through its SAP HANA Cloud Platform and on Microsoft Azure, and it has released a comprehensive application suite called S/4HANA.

Like most of the analytics-focused in-memory databases and analytics tools, HANA stores data in a column-oriented, in-memory database. The primary rationale for taking a column-oriented approach to storing data in memory is that in analytic use cases, where data is queried but not updated, it allows for often very impressive compression of data values in each column. This means much less memory is used, resulting in even higher throughput and less need for expensive memory.
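
A toy example shows why column orientation compresses so well. The dictionary encoding below is a simplified stand-in for the techniques engines like HANA actually use, with invented data.

```python
# Simplified illustration of columnar compression: a column with few distinct
# values is stored as a small dictionary plus compact integer codes.
region_column = ["EMEA", "EMEA", "APAC", "EMEA", "AMER", "APAC"] * 100_000

def dictionary_encode(column):
    """Replace repeated values with indexes into a small dictionary."""
    dictionary, codes, positions = [], [], {}
    for value in column:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        codes.append(positions[value])
    return dictionary, codes

dictionary, codes = dictionary_encode(region_column)
print(len(dictionary), "distinct values for", len(codes), "rows")
# A real engine would bit-pack the codes (2 bits per row here) and often
# run-length encode them too, which is why whole columns fit in memory.
```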

So what approach should a data architect adopt? Are Microsoft, Teradata and other “traditional” RDBMS vendors correct with their hybrid approach?

As memory gets cheaper by the day, and the value of rapid insights increases by the minute, should we host the whole data warehouse or data mart in-memory as with vendors SAP and IBM?

It depends on the specific use case, data volumes, business requirements, budget, etc. One thing that is not in dispute is that all the major vendors recognize that in-memory technology adds value to their solutions. And that extends beyond the database vendors to analytics tool stalwarts like Tableau and newer arrivals like Yellowfin.

It is incumbent upon enterprise architects to learn about the relative merits of the different approaches championed by the various vendors and to select the best fit for their specific situation. This is, admittedly, not easy given the pace of adoption of in-memory databases and the variety of approaches being taken.

But there’s a silver lining to the creative disruption caused by the increasing adoption of in-memory technologies. Because of the sheer speed the various solutions offer, many organizations are finding that the need to pre-aggregate data to achieve certain performance targets for specific analytics workloads is disappearing. The same goes for the need to de-normalize database designs to achieve specific analytics performance targets.

Instead, organizations are finding that it’s more important to create comprehensive atomic data models that are flexible and independent of any assumed analytics workload.

Perhaps surprisingly to some, third normal form (3NF) is once again not an unreasonable standard of data modeling for modelers who plan to deploy to a pure in-memory or in-memory-augmented platform.

Organizations can forgo the time-consuming effort to model and transform data to support specific analytics workloads, which are likely to change over time anyway. They also can stop worrying about de-normalizing and tuning an RDBMS for those same fickle and variable analytics workloads, and instead focus on creating a logical data model of the business that reflects the business information requirements and relationships in a flexible and detailed format that doesn’t assume specific aggregations and transformations.
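
To make that concrete, here’s a minimal sketch using Python’s built-in in-memory SQLite database as a stand-in for any in-memory platform: the model stays normalized and atomic, and the joins and aggregations are computed at query time. The tables and figures are invented.

```python
# Sketch: keep the model normalized (no pre-aggregated tables) and compute
# aggregates on the fly at query time. SQLite's in-memory mode stands in for
# a real in-memory analytics engine; names and figures are illustrative.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE sale (sale_id INTEGER PRIMARY KEY,
                       customer_id INTEGER REFERENCES customer(customer_id),
                       amount REAL);
    INSERT INTO customer VALUES (1, 'EMEA'), (2, 'APAC');
    INSERT INTO sale VALUES (1, 1, 100.0), (2, 1, 250.0), (3, 2, 75.0);
""")

# The join and aggregation happen at read time, against the atomic model.
for region, total in con.execute("""
        SELECT c.region, SUM(s.amount)
        FROM sale s JOIN customer c ON c.customer_id = s.customer_id
        GROUP BY c.region"""):
    print(region, total)   # APAC 75.0, then EMEA 350.0
```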

The blinding speed of in-memory technologies provides the aggregations, joins and other transformations on the fly, without the onerous performance penalties we have historically experienced with very large data volumes on disk-only-based solutions. As a long-time data modeler, I like the sound of that. And so far in my experience with many of the solutions mentioned in this post, the business people like the blinding speed and flexibility of these new in-memory technologies!

Please join us next time for the final installment of our series, Data Modeling in a Jargon-filled World – The Logical Data Warehouse. We’ll discuss an approach to data warehousing that uses some of the technologies and approaches we’ve discussed in the previous six installments while embracing “any data, anywhere.”


Data Modeling in a Jargon-filled World – Internet of Things (IoT)

In the first post of this blog series, we focused on jargon related to the “volume” aspect of Big Data and its impact on data modeling and data-driven organizations. In this post, we’ll focus on “velocity,” the second of Big Data’s “three Vs.”

In particular, we’re going to explore the Internet of Things (IoT), the constellation of web-connected devices, vehicles, buildings and related sensors and software. It’s a great time for this discussion too, as IoT devices are proliferating at a dizzying pace in both number and variety.

Though IoT devices typically generate small “chunks” of data, they often do so at a rapid pace, hence the term “velocity.” Some of these devices generate data from multiple sensors for each time increment. For example, we recently worked with a utility that embedded sensors in each transformer in its electric network and then generated readings every 4 seconds for voltage, oil pressure and ambient temperature, among others.

While the transformer example is just one of many, we can quickly see two key issues that arise when IoT devices are generating data at high velocity. First, organizations need to be able to process this data at high speed. Second, organizations need a strategy to manage and integrate this never-ending data stream. Even small chunks of data will accumulate into large volumes if they arrive fast enough, which is why it’s so important for businesses to have a strong data management platform.

It’s worth noting that the idea of managing readings from network-connected devices is not new. In industries like utilities, petroleum and manufacturing, organizations have used SCADA systems for years, both to receive data from instrumented devices to help control processes and to provide graphical representations and some limited reporting.

More recently, many utilities have introduced smart meters in their electricity, gas and/or water networks to make the collection of meter data easier and more efficient for a utility company, as well as to make the information more readily available to customers and other stakeholders.

For example, you may have seen an energy usage dashboard provided by your local electric utility, allowing customers to view graphs depicting their electricity consumption by month, day or hour, enabling each customer to make informed decisions about overall energy use.

Seems simple and useful, but have you stopped to think about the volume of data underlying this feature? Even if your utility only presents information on an hourly basis, if you consider that it’s helpful to see trends over time and you assume that a utility with 1.5 million customers decides to keep these individual hourly readings for 13 months for each customer, then we’re already talking about over 14 billion individual readings for this simple example (1.5 million customers x 13 months x over 30 days/month x 24 hours/day).
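
If you want to check that arithmetic, a few lines reproduce it using the same rounded assumptions (30 days per month).

```python
# Reproduce the smart-meter estimate above with its rounded assumptions.
customers = 1_500_000
months, days_per_month, hours_per_day = 13, 30, 24

readings = customers * months * days_per_month * hours_per_day
print(f"{readings:,}")   # 14,040,000,000 -- over 14 billion hourly readings
```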

Now consider the earlier example I mentioned of each transformer in an electrical grid with sensors generating multiple readings every 4 seconds. You can get a sense of the cumulative volume impact of even very small chunks of data arriving at high speed.

With experts estimating the IoT will consist of almost 50 billion devices by 2020, businesses across every industry must prepare to deal with IoT data.

But I have good news because IoT data is generally very simple and easy to model. Each connected device typically sends one or more data streams with each having a value for the type of reading and the time at which it occurred. Historically, large volumes of simple sensor data like this were best stored in time-series databases like the very popular PI System from OSIsoft.
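
As a rough illustration of how simple that shape is, here’s a minimal sketch of a reading record. The field and device names are invented, and a production system would land records like this in a time-series database or data lake rather than in-process objects.

```python
# Each IoT reading is essentially: which device, what kind of reading,
# when it was observed, and the value. Names here are illustrative.
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class Reading:
    device_id: str        # e.g. a transformer on the electric network
    reading_type: str     # e.g. "voltage", "oil_pressure", "ambient_temp"
    observed_at: datetime
    value: float

stream = [
    Reading("transformer-0042", "voltage",
            datetime(2021, 1, 1, 0, 0, 4, tzinfo=timezone.utc), 239.7),
    Reading("transformer-0042", "oil_pressure",
            datetime(2021, 1, 1, 0, 0, 4, tzinfo=timezone.utc), 12.1),
]

# A time-series store is essentially an append-only log of records shaped
# like this, keyed by device, reading type and time.
print(stream[0])
```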

While this continues to be true for many applications, alternative architectures, such as storing the raw sensor readings in a data lake, are also being successfully implemented, though organizations need to carefully consider the pros and cons of home-grown infrastructure versus time-tested, industrial-grade solutions like the PI System.

Regardless of how raw IoT data is stored once captured, the real value of IoT for most organizations is only realized when IoT data is “contextualized,” meaning it is modeled in the context of the broader organization.

The value of modeled data eclipses that of “edge analytics” (where the value is inspected by a software program while inflight from the sensor, typically to see if it falls within an expected range, and either acted upon if required or allowed simply to pass through) or simple reporting like that in the energy usage dashboard example.

It is straightforward to represent a reading of a particular type from a particular sensor or device in a data model or process model. It starts to get interesting when we take it to the next step and incorporate entities into the data model to represent expected ranges, both for readings under various conditions and for how the devices relate to one another.

If the utility in the transformer example has modeled that IoT data well, it might be able to prevent a developing problem with a transformer and also possibly identify alternate electricity paths to isolate the problem before it has an impact on network stability and customer service.
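
As a hypothetical sketch of what that contextualization might look like, the snippet below keeps expected ranges and device relationships alongside the devices themselves, so an out-of-range reading can be flagged together with the downstream paths it affects. All names and thresholds are invented.

```python
# Sketch: expected ranges and device relationships live in the model, so a
# reading can be assessed in context. Names and thresholds are invented.
expected_ranges = {
    # (device_type, reading_type) -> (low, high)
    ("transformer", "voltage"): (220.0, 250.0),
    ("transformer", "oil_pressure"): (8.0, 15.0),
}

devices = {
    "transformer-0042": {"type": "transformer", "feeds": ["feeder-7", "feeder-9"]},
}

def assess(device_id: str, reading_type: str, value: float) -> str:
    device = devices[device_id]
    low, high = expected_ranges[(device["type"], reading_type)]
    if low <= value <= high:
        return "ok"
    # Because the model also knows what the device feeds, the alert can name
    # the downstream paths to isolate or re-route.
    return (f"ALERT: {device_id} {reading_type}={value}; "
            f"check {', '.join(device['feeds'])}")

print(assess("transformer-0042", "voltage", 212.0))
```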

Hopefully this overview of IoT in the utility industry helps you see how your organization can incorporate high-velocity IoT data to become more data-driven and therefore more successful in achieving larger corporate objectives.

Subscribe and join us next time for Data Modeling in a Jargon-filled World – NoSQL/NewSQL.



Data-Driven Business Transformation: the Data Foundation

In light of data’s prominence in modern business, organizations need to ensure they have a strong data foundation in place.

The ascent of data’s value has been as steep as it is staggering. In 2016, it was suggested that more data would be created in 2017 than in the previous 5000 years of humanity.

But what’s even more shocking is that the peak may not even be in sight yet.

To put its value into context, the five most valuable businesses in the world all deal in data (Alphabet/Google, Amazon, Apple, Facebook and Microsoft). It’s even overtaken oil as the world’s most valuable resource.

Yet, even with data’s value being as high as it is, there’s still a long way to go. Many businesses are still getting to grips with data storage, management and analysis.

Fortune 1000 companies, for example, could earn another $65 million in net income, with access to just 10 percent more of their data (from Data-Driven Business Transformation 2017).

We’re already witnessing the beginnings of this increased potential across various industries. Data-driven businesses such as Airbnb, Uber and Netflix are all dominating, disrupting and revolutionizing their respective sectors.

Interestingly, although they provide very different services for the consumer, the organizations themselves all identify as data companies. This simple change in perception and outlook stresses the importance of data to their business models. For them, data analysis isn’t just an arm of the business… It’s the core.

Data foundation

The dominating data-driven businesses use data to influence almost everything: how decisions are made, how processes could be improved, and where the business should focus its innovation efforts.

However, simply establishing that your business could (and should) be getting more out of data, doesn’t necessarily mean you’re ready to reap the rewards.

In fact, a pre-emptive dive into a data strategy could actually slow your digital transformation efforts down. Hurried software investments in response to disruption can lead to teething problems in your strategy’s adoption, and to shelfware, wasting time and money.

Additionally, oversights in the strategy’s implementation will stifle the very potential effectiveness you’re hoping to benefit from.

Therefore, when deciding to bolster your data efforts, a great place to start is to consider the ‘three Vs’.

The three Vs

The three Vs of data are volume, variety and velocity. Volume references the amount of data; variety, its different sources; and velocity, the speed in which it must be processed.

When you’re ready to start focusing on the business outcomes that you hope data will provide, you can also stretch those three Vs, to five. The five Vs include the aforementioned, and also acknowledge veracity (confidence in the data’s accuracy) and value, but for now we’ll stick to three.

As discussed, the total amount of data in the world is staggering. But the total data available to any one business can be huge in its own right (depending on the extent of your data strategy).

Unsurprisingly, vast volumes of data come from a vast number of potential sources, and it takes dedicated tools to process them. Even then, the sources are often disparate and very unlikely to offer worthwhile insight in a vacuum.

This is why it’s so important to have an assured data foundation on which to build a data platform.

A solid data foundation

The Any2 approach is a strategy for housing, sorting and analysing data that aims to be that very foundation on which you build your data strategy.

Shorthand for “any data, anywhere,” Any2 can help clean up the disparate noise and let businesses drill down on, and effectively analyze, the data in order to yield more reliable and informative results.

It’s especially important today, as data sources are becoming increasingly unstructured, and so more difficult to manage.

Big data for example, can consist of click stream data, Internet of Things data, machine data and social media data. The sources need to be rationalized and correlated so they can be analyzed more effectively.

When it comes to actioning an Any2 approach, a fluid relationship between the various data initiatives involved is essential. Those initiatives are Data Modeling, Enterprise Architecture, Business Process and Data Governance.

It also requires collaboration, both between the aforementioned initiatives and with the wider business, to ensure everybody is working toward the same goal.

With a solid data foundation in place, your business can really begin to realize data’s potential for itself. You also ensure you’re not left behind as new disruptors enter the market and your competition continues to evolve.

For more data advice and best practices, follow us on Twitter, and LinkedIn to stay up to date with the blog.

For a deeper dive into best practices for data, its benefits, and its applications, get the FREE whitepaper below.



Why Data vs. Process is dead, and why we should look at the two working together

Whether a collection of data is useful to a business is all just a matter of perspective. Data in its raw form is like a tangled set of wires: for the wires to be useful again, they need to be separated.

We’ve talked before about how Data Modeling and Enterprise Architecture can make data easier to manage and decipher, but arguably, there’s still a piece of the equation missing.

To make the most of Big Data, the data must also be rationalized in the context of the business’ processes: where the data is used, by whom, and how. This is what process modeling aims to achieve. Without process modeling, businesses will find it difficult to quantify and/or prioritize the data from a business perspective, making a truly business outcome-focused approach harder to realize.

So What is Process Modeling?

“Process modeling is the documentation of an organization’s processes designed to enhance company performance,” said Martin Owen, erwin’s VP of Product Management.

It does this by enabling a business to understand what it does and how it does it.

As is commonplace for disciplines of this nature, there are multiple industry standards that provide the basis of the approach to how this documentation is handled.

The most common of these is the Business Process Model and Notation (BPMN) standard. With BPMN, businesses can analyze their processes from different perspectives, such as a human capital perspective, shining a light on the roles and competencies required to perform a process.

Where does Data Modeling tie in with Process Modeling?

Historically, industry analysts have viewed Data and Process Modeling as two competing approaches. However, it’s time that notion was cast aside, as the benefits of the two working in tandem are too great to just ignore.

The secret behind making the most of data is being able to see the full picture, as well as drill down – or rather, zoom in – on what’s important in the given context.

From a process perspective, you will be able to see what data is used in the process and architecture models. And from a data perspective, users can see the context of the data and the impact of all the places it is used in processes across the enterprise. This provides a more well-rounded view of the organization and the data. Data modelers will benefit from this, enabling them to create and manage better data models, as well as implement more context specific data deployments.
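
To make that idea concrete, here’s a minimal, hypothetical sketch of the cross-referencing a connected approach enables: record which data entities each process step reads or writes, then answer questions from either perspective. The process and entity names are invented; a real platform holds these links in its models rather than in a script.

```python
# Sketch: link process steps to the data entities they read or write, then
# query from the process side or the data side. All names are invented.
processes = {
    "Onboard customer": {"reads": ["Credit score"], "writes": ["Customer", "Account"]},
    "Issue invoice":    {"reads": ["Customer", "Order"], "writes": ["Invoice"]},
    "Handle complaint": {"reads": ["Customer", "Invoice"], "writes": ["Case"]},
}

def data_used_by(process: str) -> set:
    """Process perspective: which data entities does this process touch?"""
    usage = processes[process]
    return set(usage["reads"]) | set(usage["writes"])

def processes_using(entity: str) -> list:
    """Data perspective: where is this entity used across the enterprise?"""
    return [name for name, usage in processes.items()
            if entity in usage["reads"] or entity in usage["writes"]]

print(data_used_by("Issue invoice"))   # {'Customer', 'Order', 'Invoice'}
print(processes_using("Customer"))     # all three processes above
```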

It could be that the former approach to Data and Process Modeling was born out of the cost of investing in both being too high for some businesses, the difficulty of aligning the two approaches, or a cocktail of both.

The latter is perhaps the more common culprit, though. This is evident when we consider the many companies already modeling both their data and processes. The problem with the current approach is that the two model types are siloed, severing the valuable connections between them and making alignment difficult to achieve. Additionally, although all the data is there, those severed connections are just as useful as the data itself, so losing them means a business isn’t seeing the full picture.

However, there are now examples of both Data and Process Modeling being united under one banner.

“By bringing both data and process together, we are delivering more value to different stakeholders in the organization by providing more visibility of each domain,” suggested Martin. “Data isn’t locked into the database administrator or architect, it’s now expressed to the business by connections to process models.”

The added visibility provided by a connected data and process modeling approach is essential to a Big Data strategy. And there are further indications this approach will soon be (or already is), more crucial than ever before. The Internet of Things (IoT), for example, continues to gain momentum, and with it will come more data, at quicker speeds, from more disparate sources. Businesses will need to adopt this sort of approach to govern how this data is moved and united, and to identify/tackle any security issues that arise.
