8 Essential Data Warehouse Best Practices

WHY DATA WAREHOUSES STILL MATTER

A data warehouse can be a powerful business intelligence tool that supports your reports, monitors the health of your company, and guides your plans for the future. 

But the information your data warehouse provides is only as good as the data warehouse itself. That means how it’s created and organized is tremendously important. The set-up of your data warehouse should be methodical and take into account your company’s needs. Here at Xerva, we build world-class data warehouses for our clients on a regular basis. Here are the best practices we use to consistently get quality results.

  1. Determine your methodology
  2. Focus on business processes
  3. Standardized coding practices
  4. Agile development
  5. A roadmap for development
  6. Evangelizing
  7. Executive sponsorship
  8. Pick the right software/hardware stack

 

1. DETERMINE YOUR METHODOLOGY

Some businesses simply “dump” all their data into their warehouse. But having a strict methodology that guides how the data is organized is vital. Otherwise, the data you do have is basically worthless. Your data warehouse is actually just a data dump. 

For our methodology, we follow a five-step process: 

  1. Discovery – You must first understand your data sources. What do they look like? Are you using an API or on-premise database? Interview all the relevant team members at your business so you know what their needs are. What sort of information do they need to complete their analysis? Use their answers to determine what data to extract. 
  1. Extract – There are many ways to pull data into a warehouse. You should pick the one that’s most appropriate to your data sources. Make sure it has the right amount of “latency” (does the business need a five-minute pull, daily pull, or monthly pull?). 
  1. Modeling – This step deals with organizing your data effectively. You can’t simply put a bunch of data in one place and call it a data warehouse. It must be organized and accessible, per the needs of the business. 
  1. Reporting – Never lose sight of how the data is going to be used. A data warehouse by itself doesn’t add value. You have to connect it to a visualization tool and create meaningful dashboards and reports that are actually used by people. Make sure you deliver Key Performance Indicators (KPIs) that meet the business’s specific needs. 
  1. Monitoring – After you’ve implemented a data warehouse, your work is not done. You must also deploy regular audits to catch outliers and ensure that your data is being updated and staying accurate.  

Remember that data warehousing more “technical debt” than typical BI projects. People are often surprised by the amount of maintenance required to keep warehouses fine-tuned. Every time new data is brought into a warehouse there is an opportunity for something to go wrong.  

For example, someone could introduce faulty data on the front end or a new SKU that is being sold. To catch these problems, set up exception-based reporting that can alert you to irregularities.  

 

 2. FOCUS ON BUSINESS PROCESSES

When you have Dynamic CRM or use another system regularly in your business, it’s tempting to simply funnel the data it collects into your warehouse. But one system doesn’t give you the full story of your customer, especially if you haven’t done a full implementation. 

Build your data warehouse around your business process, not your data sources. Consider what your full sales process looks like. That might mean expanding your analysis to include other sources in your data warehouse.  

For example, if you want to know how your sales process is performing, you’ll need cost information like commissions and marketing campaign spend. That information is probably in other systems like Marketo or QuickBooks.

 

 3. STANDARDIZED CODING PRACTICES

The backbone of your data warehouse is the coding your developers craft to collect your data, organize it, and produce reports. 

But there can be major issues with the coding process that are often overlooked, especially when you have several different developers working on the same warehouse who use different naming systems and coding styles. These issues arise when coders fail to standardize or even document their process. 

You’ve probably seen it before. The code looks like Frankenstein’s monster with different ways to name tables, different joins, and different CTEs. Some like temp tables; some like to name columns with underscores. There’s a lot of personal preference in coding, such as writing a SQL query for an important business metric in a way not understandable to other developers. 

The code might work (you can code in different ways to get the “right” answer) but when it comes time to make changes or troubleshoot, things start to go very badly.  

The solution is to set down standards from the beginning. Enshrine your naming and other process standards in a document then give it to every developer on the project. This process can take months of meetings to get right – but it’s key, and well worth the effort. Standards are like the foundation of a house–you want to get it right before you do anything else. 

Create your standards for coding your data warehouse as if the developer building it isn’t going to be the one to maintain it. Many members of a team can come and go during the life of a project. We want to believe everyone is going to stick around forever, but it’s far wiser to be prepared for turnover. 

You know you’ve done things right if anyone can go into the code, understand it, and be able to alter it as needed.   

Failing to create standards creates massive technical debt by forcing developers to re-write code when staff or business needs change. 

 

 4. AGILE DEVELOPMENT

When it comes to building data warehouses, the worst thing developers can do is hide away for six months while coding and then come back to the business with a “finished” system. Isolation is almost guaranteed to result in a data warehouse that fails to meet the needs of your business. The result will be low adoption and a lot of wasted effort.  

A better method is to form a close relationship with the business decision-makers early on and frequently deploy quick releases. Include the business in every step of the development process and give them a chance to make changes to the specifications along the way. That way you can include those changes in the data warehouse as you go. This ensures the business is happy with the finished product. 

Using the iteration process, you also get buy-in from executives in the business as they see progress and results more quickly. They also have a sense of ownership. Buy-in and ownership from executives is vital to the success of your data warehouse project, so don’t skimp on it. 

For instance, we were once brought into a project to support a struggling development team. They had spent a year building a data warehouse and had done a pretty good job, but the warehouse had almost zero adoption. Analysts were still using their previous data sources to create their reports. 

The developers hadn’t included the executives in the creation of the warehouse. As a result, they didn’t know what it could do, and most of the data warehouse’s potential was untapped. 

 

 5. A ROADMAP FOR DEVELOPMENT

 The attitude of executives more often than not is, “What have you done for me lately?” Consequently, maintaining buy-in and support for your data warehouse includes sharing the progress you’ve made and reminding executives how your project is adding value. 

A good way to do this is to create a project roadmap that outlines the major milestones you’re planning to reach with your data warehouse. Provide updates on “wins” you’ve achieved as you hit those milestones. The roadmap doesn’t need to be perfect. It should be a living document that evolves as you go along. Its accuracy is not as important as how frequently it is updated and communicated.  

Executives should have a thorough understanding of what you have on your plate and what you’ve accomplished so far. They’ll likely be fine with not hitting deadlines if they are in the loop. Let them know why the timeline alteration is necessary and what the new deadline will be. Perhaps, they asked you to work on X, which means Y and Z have to be delayed. With communication, they’ll understand the delay. 

 

6. EVANGELIZING

Building a successful data warehouse is a lot more of a sales job than most people realize. 

A developer’s natural tendency is to stand back and out of the limelight. They fall in love with the solution they’ve created and sometimes ignore what the business actually needs the data warehouse for.  

But in order to get buy-in and widespread adoption, you must be an evangelist. Tell all the key players about your work and how it will help to do their jobs better. Collect and provide crucial business intelligence to grow the business as quickly as possible. 

Start during the development process – celebrate your wins along the way. Don’t wait until the project is complete to start delivering value. This method is technically and “politically” advantageous to the developer as you don’t have to worry about the business losing interest and scrapping your project mid-development. 

Remember: it doesn’t matter how “beautiful” your data warehouse is. To be successful, it must be used. 

 

 7. EXECUTIVE SPONSHIP

When it comes to data warehouses, not all users are created equal. When developing the system, it’s crucial that executives – who control the money and strategy – get what they need. You want them to be happy with the data warehouse and any other Business Intelligence project and see it as a key tool for doing their job.  

Focus on the executives who have the clout to give you enough time to get widespread buy-in.  

During the development process, it’s important to hold regular recurring meetings with these executives to keep them updated on your progress. If they send you an email asking for something, you better do it ASAP. They are probably trying to make a case, at the executive level, for your project, so your response is critical. 

 

 8. PICK THE RIGHT HARDWARE/SOFTWARE STACK

 It’s tempting to get caught up in the excitement around new technology for your data warehouse – the thinking is that the latest must be the greatest. But in fact, in most cases, it’s best to stay off the leading edge when it comes to data warehouse tech. 

New tech is often over-hyped and misunderstood. According to The Gartner Hype cycle, it takes years for businesses to figure out how to properly apply new innovations (for example, read this piece we wrote about data lakes). It’s far better to build your data warehouse with tried-and-true software and hardware that has been well-vetted by BI professionals. 

Instead of jumping on trends or being sold by a vendor on fancy visualization tools, make sure you pick software and hardware that actually does the job you need it to do. The fanciest bells and whistles won’t help you at all if you use the wrong technology and your data is a mess as a result. Be smart and get the right tool for the right problem. 

After you have a solid and reliable tech stack in place, you can gradually introduce appropriate cutting-edge technology as needed.  

 

YOUR NEXT STEPS

Armed with these best practices, you can guide the creation of an effective data warehouse.   

Implementing these best practices can be challenging, that’s why we offer “Done-For-You” Data Warehouse solutions. We take care of our clients’ data, so they can focus on their core competencies.

Nate McMurtrey loves using data to solve business problems. He is a Business Intelligence expert, data warehouse wizard, and pioneer in Done-For-You data warehouse solutions. He has worked as a BI manager and freelance BI consultant. In 2014, he co-founded Xerva with Nate Allphin.