Deploying and Executing
Before a Discovery Hub project is deployed and executed, it is simply a metadata model of your data warehouse. Deployment and execution generate and run the code for extracting, transferring and loading your data, and create any OLAP cubes in the project. During development, it can also be a good idea to deploy and execute the project to check that everything works as expected.
Deploying a Project
Deploying a project, or a part of a project, is the process of generating the SQL code and the structure of the staging database, the data warehouse and any OLAP cubes.
No data is loaded into the staging database or the data warehouse, and no cubes are processed at this time. When you successfully deploy a project, the project is automatically saved in the project repository.
Deployment in Discovery Hub is optimized in two ways. It is managed, that is, objects are deployed after any objects they depend on, and it is differential, that is, only the steps that have changed since the last deployment are deployed again.
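The two optimizations can be illustrated with a minimal Python sketch. It is not Discovery Hub's implementation: the function names, the object names and the idea of detecting changes by hashing each object's definition are all assumptions made for illustration. The sketch orders objects so that each one comes after the objects it depends on (managed), and keeps only those whose definition differs from what was last deployed (differential). It requires Python 3.9 or later for `graphlib`.

```python
import hashlib
from graphlib import TopologicalSorter


def definition_hash(definition: str) -> str:
    """Fingerprint an object's definition (hypothetical change-detection scheme)."""
    return hashlib.sha256(definition.encode("utf-8")).hexdigest()


def plan_deployment(definitions, depends_on, deployed_hashes):
    """Return the objects to deploy, in dependency order, skipping unchanged ones.

    definitions:     {object name: definition text}
    depends_on:      {object name: names of objects it depends on}
    deployed_hashes: {object name: hash recorded at the last deployment}
    Assumes every dependency is itself a key of `definitions`.
    """
    graph = {name: depends_on.get(name, ()) for name in definitions}
    # Managed: static_order() yields each object after everything it depends on.
    order = TopologicalSorter(graph).static_order()
    # Differential: keep only objects whose own definition has changed.
    return [
        name
        for name in order
        if deployed_hashes.get(name) != definition_hash(definitions[name])
    ]
```

On a first deployment, with nothing recorded in `deployed_hashes`, every object is returned in dependency order. On later runs only the edited objects are returned. The sketch compares each object's own definition only; whether dependent objects also need redeploying after a change is a design decision it leaves open.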
Executing a Project
Executing a project is the process of loading data into the staging database and the data warehouse, and then processing any OLAP cubes.
Executing a project involves the following steps:
- Transferring data: The process of transferring data from the data source to the raw table of the staging database.
- Processing data: The process of cleansing data; that is, validating the data against the business rules, and moving the validated data to the valid table. Status information is also generated at this point.
- Verifying data against checkpoints: The process of checking the data that is being processed against the checkpoints you have specified. You can specify rules that end the execution process if they are not met. This way, you avoid overwriting the data in your data warehouse with invalid data.
- Moving data: The process of moving data from a business unit to a data warehouse, or from a data warehouse to a cube.
- Processing cubes: The process of creating dimension hierarchies and retrieving values from the fact tables to populate the cubes with measures, including derived and calculated measures.
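The first four steps above can be sketched for a single table in Python. This is an illustration under stated assumptions, not how Discovery Hub generates or runs its code: `execute_table`, the in-memory lists standing in for the raw, valid and warehouse tables, and the shape of the status information are all hypothetical. Cube processing is left out.

```python
def execute_table(source_rows, is_valid, checkpoint, warehouse):
    """Run transfer, cleanse, checkpoint and move for one table (illustrative only).

    source_rows: iterable of rows from the data source
    is_valid:    business rule, returns True for rows that pass validation
    checkpoint:  rule on the status information, returns False to stop execution
    warehouse:   list standing in for the target data warehouse table
    """
    # Transferring data: copy source rows into the raw table.
    raw = list(source_rows)

    # Processing data: move rows that pass the business rules to the valid table
    # and generate status information.
    valid = [row for row in raw if is_valid(row)]
    status = {"raw": len(raw), "valid": len(valid), "rejected": len(raw) - len(valid)}

    # Verifying data against checkpoints: stop before the warehouse is touched.
    if not checkpoint(status):
        raise RuntimeError(f"checkpoint failed: {status}")

    # Moving data: replace the warehouse contents with the validated rows.
    warehouse[:] = valid
    return status
```

Because the checkpoint is evaluated before the move, a failed load raises an error and leaves the existing warehouse rows in place, which is the protection the checkpoint step describes.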
Discovery Hub supports managed and threaded execution. This means that Discovery Hub can execute a project in multiple threads while managing dependencies between objects and optimizing the execution to take the shortest amount of time.
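As a rough illustration of managed, threaded execution, the following Python sketch runs objects in a thread pool and starts each one only after the objects it depends on have finished. The function `execute_managed`, its parameters and the object names in the usage example are assumptions, and the sketch makes no attempt to reproduce whatever scheduling optimizations Discovery Hub applies; it only shows dependency-aware parallelism. It requires Python 3.9 or later for `graphlib`.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def execute_managed(depends_on, run, max_threads=4):
    """Run every object in worker threads, respecting dependencies.

    depends_on: {object name: set of object names it depends on}
    run:        callable taking an object name and executing that object
    Returns the object names in the order they finished.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    finished = []
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        running = {}  # future -> object name
        while sorter.is_active():
            # Start every object whose dependencies have all completed.
            for name in sorter.get_ready():
                running[pool.submit(run, name)] = name
            # Block until at least one running object finishes.
            done, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in done:
                future.result()  # re-raise any error from the worker thread
                name = running.pop(future)
                finished.append(name)
                sorter.done(name)  # may unblock dependent objects
    return finished
```

For example, with `depends_on = {"dw.Sales": {"stage.Orders", "stage.Customers"}, "cube.Sales": {"dw.Sales"}}`, the two staging tables can load in parallel, `dw.Sales` starts only once both are done, and `cube.Sales` runs last.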