Solved

Solutions for different schemas in a data lake?

Forum|Forum|3 years ago
June 27, 2022
6 replies
173 views

B

brandon.spencer
Contributor

I am new to TX and data lakes. I have a common problem that others have commented on and would like to know if:

1) The functionality now exists, or if it does not

2) How are others getting around the issue?

The issue is that I need to be able to segregate data sources in the data lake. I am pulling in data from multiple SQL sources where the source schema and table names are the same, and I need to differentiate those tables in the data lake. By schema would be best, but if that is not possible, I need some other ideas. Thanks!

Best answer by brandon.spencer

Thanks, Rory. In case somebody comes across this post in the future who is new to TX like me, I found this TX link:

Database Schemas – TimeXtender Support

After reading the information in the link above, it jogged my memory that the training did address this. And it can be found under TimeXtender - Optimization: Schemas, Views, and Simple Mode

+8

rory.smith
TimeXtender Xpert
Forum|Forum|3 years ago
June 28, 2022

Hi Brandon,

the ODX Server will store data pulled from a source in a folder inside your storage container with the source's name. I.e. for an AdventureWorks2014 SQL database loaded through a SQL source named AW2014 you would get: data lake container > AW2014 > production_product for the folder containing data from the AdventureWorks2014.production.product table

Like

B

brandon.spencer
Author
Contributor
Forum|Forum|3 years ago
June 28, 2022

Rory, thanks for the comment. That directory structure within the data lake appears to be sufficient for differentiating sources. It appears to me that the ODX copies over whatever the source schema is to whatever the storage option is. At any point along the way, whether it be from the ODX to storage, or from storage to a staging or MDW table, is there a way to create/map a table to a different schema?

Like

+8

rory.smith
TimeXtender Xpert
Forum|Forum|3 years ago
June 28, 2022

The ODX is meant to mirror your source system closely, custom schema assignments break from that structure.

Once you pull data into TimeXtender Datawarehouse layers, you can fully control how schemas are assigned. This can be based on the type of table (raw/valid/transformation/etc.), per table, etc.

Like

B

brandon.spencer
Author
Contributor
Forum|Forum|3 years ago
June 28, 2022

Okay, that's good news. Do I create the schemas via SQL server and map them in TX, or is there a way to create the schema from within TX? I presume the former, but since I am new to this meta-data repository idea, I don't want to do things out of order.

Like

+8

rory.smith
TimeXtender Xpert
Forum|Forum|3 years ago
June 28, 2022

The latter is prefereable: you want to orchestrate as much as possible from one place, and that is what TimeXtender is there for. I am pretty sure this is covered in the online training at https://learn.timextender.com/ - that should give you some context as well

Like

B

brandon.spencer
Author
Contributor
Answer
Forum|Forum|3 years ago
June 28, 2022

Thanks, Rory. In case somebody comes across this post in the future who is new to TX like me, I found this TX link:

Database Schemas – TimeXtender Support

After reading the information in the link above, it jogged my memory that the training did address this. And it can be found under TimeXtender - Optimization: Schemas, Views, and Simple Mode

Like

Sign up

Login with SSO

Login to the community

Login with SSO

Scanning file for viruses.

This file cannot be downloaded