The benefits of Data Vault automation from the more abstract – like improving data integrity – to the tangible – such as clearly identifiable savings in cost and time.
So Seriously … You Should Automate Your Data Vault
By Danny Sandwell
Data Vault is a methodology for architecting and managing data warehouses in complex data environments where new data types and structures are constantly introduced.
Without Data Vault, data warehouses are difficult and time consuming to change causing latency issues and slowing time to value. In addition, the queries required to maintain historical integrity are complex to design and run slow causing performance issues and potentially incorrect results because the ability to understand relationships between historical snap shots of data is lacking.
In his blog, Dan Linstedt, the creator of Data Vault methodology, explains that Data Vaults “are extremely scalable, flexible architectures” enabling the business to grow and change without “the agony and pain of high costs, long implementation and test cycles, and long lists of impacts across the enterprise warehouse.”
With a Data Vault, new functional areas typically are added quickly and easily, with changes to existing architecture taking less than half the traditional time with much less impact on the downstream systems, he notes.
Astonishingly, nearly 20 years since the methodology’s creation, most Data Vault design, development and deployment phases are still handled manually. But why?
Traditional manual efforts to define the Data Vault population and create ETL code from scratch can take weeks or even months. The entire process is time consuming slowing down the data pipeline and often riddled with human errors.
On the flipside, automating the development and deployment of design changes and the resulting data movement processing code ensures companies can accelerate dev and deployment in a timely and cost-effective manner.
Benefits of Data Vault Automation – A Case Study …
Global Pharma Company Saves Considerable Time and Money with Data Vault Automation
Let’s take a look at a large global pharmaceutical company that switched to Data Vault automation with staggering results.
Like many pharmaceutical companies, it manages a massive data warehouse combining clinical trial, supply chain and other mission-critical data. They had chosen a Data Vault schema for its flexibility in handling change but found creating the hubs and satellite structure incredibly laborious.
They needed to accelerate development, as well as aggregate data from different systems for internal customers to access and share. Additionally, the company needed lineage and traceability for regulatory compliance efforts.
With this ability, they can identify data sources, transformations and usage to safeguard protected health information (PHI) for clinical trials.
After an initial proof of concept, they deployed erwin Data Vault Automation and generated more than 200 tables, jobs and processes with 10 to 12 scripts. The highly schematic structure of the models enabled large portions of the modeling process to be automated, dramatically accelerating Data Vault projects and optimizing data warehouse management.
erwin Data Vault Automation helped this pharma customer automate the complete lifecycle – accelerating development while increasing consistency, simplicity and flexibility – to save considerable time and money.
For this customer the benefits of data vault automation were as such:
- Saving an estimated 70% of the costs of manual development
- Generating 95% of the production code with “zero touch,” improving the time to business value and significantly reduced costly re-work associated with error-prone manual processes
- Increasing data integrity, including for new requirements and use cases regardless of changes to the warehouse structure because legacy source data doesn’t degrade
- Creating a sustainable approach to Data Vault deployment, ensuring the agile, adaptable and timely delivery of actionable insights to the business in a well-governed facility for regulatory compliance, including full transparency and ease of auditability
Homegrown Tools Never Provide True Data Vault Automation
Many organizations use some form of homegrown tool or standalone applications. However, they don’t integrate with other tools and components of the architecture, they’re expensive, and quite frankly, they make it difficult to derive any meaningful results.
erwin Data Vault Automation centralizes the specification and deployment of Data Vault architectures for better control and visibility of the software development lifecycle. erwin Data Catalog makes it easy to discover, organize, curate and govern data being sourced for and managed in the warehouse.
With this solution, users select data sets to be included in the warehouse and fully automate the loading of Data Vault structures and ETL operations.
erwin Data Vault Smart Connectors eliminate the need for a business analyst and ETL developers to repeat mundane tasks, so they can focus on choosing and using the desired data instead. This saves considerable development time and effort plus delivers a high level of standardization and reuse.
After the Data Vault processes have been automated, the warehouse is well documented with traceability from the marts back to the operational data to speed the investigation of issues and analyze the impact of changes.
Bottom line: if your Data Vault integration is not automated, you’re already behind.
If you’d like to get started with erwin Data Vault Automation or request a quote, you can email email@example.com.