How to Test Your Data Warehouse

To start with, we need a test schedule, which is created in the process of developing the test plan. In this schedule we estimate the time required to test the entire Data Warehouse system. There are different methodologies available for creating a test schedule, and none of them is perfect, because the data warehouse ecosystem is large, complex, and constantly evolving. The most important takeaway from this article is that DW testing is data-centric, while software testing is code-centric. The connections between the DW components are groups of transformations that take place over data, and these transformation processes should be tested as well to ensure that data quality is preserved. The DW testing and validation techniques I introduce here are broken into four well-defined processes, namely:

1. Integration testing
2. System testing
3. Data validation
4. Acceptance testing

Each of these types of testing is further divided into the following sub-phases:

  • Understanding of Requirements
  • Development of Test Plan and Test Design
  • Preparation of Test Cases
  • Execution of Test Cases
  • Reporting of the Test results

As you can see, the first major phase is Requirements Definition. Data used in a Data Warehouse originates from multiple source systems, so defining good requirements can be extremely challenging. Successful requirements are structured closely around business rules and at the same time address functionality and performance. These business rules and requirements provide a solid foundation for the data architects. Requirements are one of the keys to success.

The whole team should share the same tools from the project toolbox. It’s important that everyone works from the same set of documents and requirements (i.e. the same versions of files). We have to create a rich set of test data while avoiding a combinatorial explosion of data sets. The automation approach should be a combination of task automation and test automation. An alternative strategy is to use the source systems to create the artificial test data; that way, realistic but artificial source data is produced. The testers should be skilled as well.
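As a simple illustration of keeping the test data rich but bounded, the sketch below draws a fixed-size, seeded sample from the full Cartesian product of a few equivalence classes. The dimension values and class names are hypothetical, not taken from this article.

```python
# A minimal sketch of bounded test-data generation (dimension values are
# illustrative, not drawn from any real source system).
import itertools
import random

# Equivalence classes for a hypothetical "customer" dimension.
countries = ["US", "DE", "JP"]
segments = ["retail", "corporate", "vip"]
statuses = ["active", "suspended", "closed"]

# The full Cartesian product grows quickly; sample a fixed-size subset
# instead of testing every combination.
all_combinations = list(itertools.product(countries, segments, statuses))

random.seed(42)  # record the seed so the same data set can be rebuilt later
sample = random.sample(all_combinations, k=10)

for country, segment, status in sample:
    print({"country": country, "segment": segment, "status": status})
```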

Working with the concept of the Prototype model, the testing activities can be summarized in the form of the following checklist:

1. Multidimensional Schema

  • Workload Test
  • Hierarchy Test
  • Conformity Test
  • Nomenclature check
  • Performance Test
  • Early loading Test
  • Security Test
  • Maintainability Test

2. ETL Procedures

  • Code Test
  • Integrity Test
  • Integration Test
  • Administrability Test
  • Performance/Stress Test
  • Recovery Test
  • Security Test
  • Maintainability Test

3. Physical Schema

  • Performance/Stress Test
  • Recovery Test
  • Security Test

4. Front-end

  • Balancing Test
  • Usability Test
  • Performance/Stress Test
  • Security Test

In order to achieve our testing goals, we could focus on the following two approaches:

1. Manual Sampling – The testing activities covered under this approach are:

  • End-to-End Testing – Checks that data is properly loaded into the systems from which the data warehouse will extract data to generate reports.
  • Row count Testing – To avoid any loss of data, all rows of data are counted after the ETL process to ensure that all the data is properly loaded (a sketch of this check, together with the field size check, follows this list).
  • Field size Testing – Checks that each data warehouse field is at least as large as the source field for the data being loaded; if this is not verified, it will lead to data truncation.
  • Sampling – The sample used for testing must be a good representation of the whole data set.

2. Reporting – The testing activities covered under this approach are:

  • Report Testing – The reports are checked to verify that the data displayed is correct and can be used for decision-making purposes.
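Below is a minimal sketch of the row count and field size checks described above, using an in-memory SQLite database as a stand-in for the real source and warehouse connections. The table and column names, and the 50-character limit, are assumptions for illustration only.

```python
# A minimal sketch of row-count and field-size checks after an ETL run.
# Table and column names are hypothetical; in practice the queries would run
# against the actual source system and warehouse connections.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_orders (order_id INTEGER, customer_name TEXT);
    CREATE TABLE dw_orders  (order_id INTEGER, customer_name TEXT);
    INSERT INTO src_orders VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO dw_orders  VALUES (1, 'Alice'), (2, 'Bob');
""")

# Row count test: every source row must arrive in the warehouse.
src_count = conn.execute("SELECT COUNT(*) FROM src_orders").fetchone()[0]
dw_count = conn.execute("SELECT COUNT(*) FROM dw_orders").fetchone()[0]
assert src_count == dw_count, f"row counts differ: {src_count} vs {dw_count}"

# Field size test: no loaded value may be longer than the warehouse column
# allows (a fixed limit of 50 characters is assumed here for illustration).
max_len = conn.execute(
    "SELECT MAX(LENGTH(customer_name)) FROM dw_orders"
).fetchone()[0]
assert max_len <= 50, f"customer_name exceeds field size: {max_len}"

print("row count and field size checks passed")
```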
The next major part of our testing is choosing a test fixture strategy. I prefer to use a Fresh Fixture, ensuring that tests do not depend on anything they did not set up themselves, combined with Lazy Setup (lazy initialization) and a Shared Fixture that partitions the fixture required by the tests into two logical parts. Assuming we are happy with the idea of creating the test fixture the first time any test needs it, we can use Lazy Setup in the setUp() method of the corresponding Testcase Class to create it as part of running the first test. Subsequent tests will then see that the fixture already exists and reuse it.

The first part is the material every test needs to have present but never modifies; that is the Immutable Shared Fixture. The second part is the set of objects that a test needs to modify or delete; these objects should be built by each test as Fresh Fixtures. Most commonly, the Immutable Shared Fixture consists of reference data that is needed by the actual per-test fixtures, and the per-test fixtures can then be built as Fresh Fixtures on top of the Immutable Shared Fixture. We also have the option of a Minimal Fixture (i.e. using the smallest and simplest fixture possible for each test). For completeness, the last option I would mention, although it is considered an anti-pattern, is Chained Tests (i.e. letting the other tests in a test suite set up the test fixture).
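A minimal sketch of this combination using Python's unittest is shown below: the setUp() hook plays the role of Lazy Setup, the class-level connection is the Immutable Shared Fixture, and the per-test table is a Fresh Fixture. All table names are hypothetical.

```python
# A minimal sketch of Lazy Setup with an Immutable Shared Fixture, using
# Python's unittest; the setUp()/Testcase Class terminology comes from the
# xUnit patterns vocabulary. Table names are hypothetical.
import sqlite3
import unittest


class OrderLoadTest(unittest.TestCase):
    _shared_conn = None  # Immutable Shared Fixture: built once, never modified

    def setUp(self):
        # Lazy Setup: the first test to run builds the shared fixture;
        # subsequent tests see that it already exists and reuse it.
        if OrderLoadTest._shared_conn is None:
            conn = sqlite3.connect(":memory:")
            conn.execute("CREATE TABLE dim_country (code TEXT, name TEXT)")
            conn.execute("INSERT INTO dim_country VALUES ('US', 'United States')")
            OrderLoadTest._shared_conn = conn
        self.conn = OrderLoadTest._shared_conn
        # Fresh Fixture: data this test may modify is rebuilt for every test.
        self.conn.execute("CREATE TABLE IF NOT EXISTS fact_orders (id INTEGER)")
        self.conn.execute("DELETE FROM fact_orders")

    def test_reference_data_present(self):
        rows = self.conn.execute("SELECT COUNT(*) FROM dim_country").fetchone()[0]
        self.assertEqual(rows, 1)

    def test_fresh_fact_table_starts_empty(self):
        rows = self.conn.execute("SELECT COUNT(*) FROM fact_orders").fetchone()[0]
        self.assertEqual(rows, 0)


if __name__ == "__main__":
    unittest.main()
```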

It is important to consider the Pesticide Paradox: test automation often suffers from static test data. Without variation in the data or execution path, bugs are found with decreasing frequency. Automation should be designed so that a test-case writer can provide a common equivalence class or data type as a parameter, and the automation framework will pick data randomly within that class and apply it on each test run. The automation framework should record the seed values, or the actual data that was used, to allow reruns and retesting with the same data for debugging purposes.
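One possible shape of such a mechanism is sketched below: the framework picks a value from an equivalence class at random, but records the seed so a failing run can be replayed with exactly the same data. The class names and values are illustrative assumptions.

```python
# A minimal sketch of seeded data variation: pick a random value from an
# equivalence class on each run, but record the seed so the exact data can
# be reproduced for debugging. Class names and values are illustrative.
import random
import time

EQUIVALENCE_CLASSES = {
    "currency": ["USD", "EUR", "GBP", "JPY"],
    "amount": range(1, 10_000),
}

def pick_test_data(class_name, seed=None):
    """Return (seed, value) so a failing run can be replayed with the same data."""
    if seed is None:
        seed = int(time.time())
    rng = random.Random(seed)
    value = rng.choice(list(EQUIVALENCE_CLASSES[class_name]))
    return seed, value

seed, amount = pick_test_data("amount")
print(f"seed={seed} amount={amount}")          # log the seed with the test result
_, replayed = pick_test_data("amount", seed)   # rerun with the recorded seed
assert replayed == amount
```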
The framework can utilize a TestFixtureRegistry backed by a dedicated table: it exposes the various parts of a fixture needed by the suites, either through discrete fixture-holding class variables or through Finder Methods. Finder Methods help us avoid hard-coded values in database lookups when accessing the fixture objects. They are very similar to Creation Methods, but they return references to existing fixture objects rather than building brand-new ones. We should make those objects immutable.
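A minimal sketch of such a registry with one Finder Method is given below. The registry table, the fixture name, and the row id are hypothetical; the point is that tests look fixtures up by intent-revealing name rather than by hard-coded database keys.

```python
# A minimal sketch of a TestFixtureRegistry backed by a dedicated table,
# exposing fixtures through a Finder Method. All names are hypothetical.
import sqlite3


class TestFixtureRegistry:
    def __init__(self, conn):
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS fixture_registry "
            "(name TEXT PRIMARY KEY, row_id INTEGER)"
        )

    def register(self, name, row_id):
        self.conn.execute("INSERT INTO fixture_registry VALUES (?, ?)", (name, row_id))

    # Finder Method: returns a reference to an existing fixture object,
    # unlike a Creation Method, which would build a brand-new one.
    def find_customer(self, name):
        row = self.conn.execute(
            "SELECT row_id FROM fixture_registry WHERE name = ?", (name,)
        ).fetchone()
        return row[0] if row else None


conn = sqlite3.connect(":memory:")
registry = TestFixtureRegistry(conn)
registry.register("default_us_customer", 101)
assert registry.find_customer("default_us_customer") == 101
```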

NOTE: usage of Intent-Revealing Names for the fixture objects should be enforced in order to support the framework’s lookup functionality and improve readability. To keep the current design, the implementation can first check whether such a registry already exists and, if so, remove it.

NOTE: take into consideration the Data Sensitivity and Context Sensitivity of the tests that’ll rely on this module.

Test Approach Phases

The suggested test phases are based on the development schedule per project, along with the need to comply with data requirements that need to be in place when the new DWH goes live.

Phase 1: Business processes

  •  Data Validation
  •  Performance Test
  •  Functional Test
  •  Data Warehouse (internal testing within ETL validating data stage jobs)

Data validation should start early in the test process and be completed before Phase 2 testing begins. Some data validation testing should occur in the remaining test phases, but to a much lesser extent.

Important business processes where performance is critical should be identified and tested (when available) in Phase 1. Performance testing should be continued in the later test phases, as the application will be continuously enhanced throughout the project. In addition to Phase 1 testing, there will also be unit and functional testing. As unit testing is completed for a program, the tester will perform functional tests on that program. While functional testing takes place on one program, the developer continues with developing and unit testing the next program.

Towards the end of Phase 1, the data warehouse team will be testing the data stage jobs. Unit testing should be completed first, with functional testing finishing a couple of weeks afterwards. A final formal test will cap the end of Phase 1 testing.

Phase 2: Performance tests 

  •   Cross-functional process
  •   Security Test
  •   Data Warehouse (Repository testing and validation)

In addition to the previous tests, Phase 2 should also cover any remaining test items that may not have been tested in Phase 1.

Phase 2 testing will be important because it is the final testing opportunity that the functional area testers will have to make sure the DW load works as expected before moving to regression testing in Phase 3. Some performance tests and data validation should be included in this phase.

Phase 3: Regression tests

Phase 3 testing consists of regression test periods to test updates that are required as part of the Company gaming platform. The functional area testers should have sufficient time to test in each regression test period.

Phase 4: Acceptance tests

Phase 4 testing is limited. In addition to the functional area testers, end users will probably be involved in this final test before the new system goes live. No new tests should be introduced at this point in customer acceptance testing; the customer acceptance scenarios should have already been covered in prior test phases.

 

Evgeni Kostadinov

Evgeni Kostadinov prefers the challenges associated with testing of various technologies. He has extensive experience with UI, API, DW, Performance and Mobile. He has worked mainly in projects with Java, .Net and NodeJS environments for Telecom, Financial, Marketing and Banking Institutions. He actively participates in the development of the company’s QA and CI processes and infrastructure. Evgeni currently works as a QA Manager, technical trainer, as well as a QA Challenge Accepted lecturer.
