Hello @Gardar ,
if neither option really works for you, the thing I would think of is using a Query Table to load the data from the source with a query, and then put the transformation in there.
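For example, a rough sketch of what the query behind such a Query Table could look like; the table name [SourceSyncTable] and the [Id] column are just placeholders, and the exact syntax depends on the SQL dialect of the source provider:

-- Hedged sketch: push a small transformation into the extraction query of a Query Table.
SELECT
    [Id],
    CASE WHEN [lastSync] = 'never' THEN NULL ELSE [lastSync] END AS [lastSync]
FROM [SourceSyncTable]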
Hope this helps
= Daniel
I’m currently using views in the source to publish the data to TX. If you have access to the source to create views it’s the more flexible option.
@rvgfox I understand where you are coming from, but I disagree. Creating it in the source means you need access to the source to create it. When you leave and someone takes over, it isn't always the case that they will get source rights, so if something goes wrong no one can fix it, whereas a Query Table is in your controlled environment.
Dear @rvgfox ,
I would strongly recommend not using views in the source
Now you will have to maintain the views in the source, which are outside the influence of TimeXtender and easily forgotten when it comes to keeping them up to date.
Using a Query Table is like using a view, but from the destination side, and it is under the control and maintenance of the data platform team instead of the DBA team.
Both @Bernarddb and @daniel are right, but you have to take into account that in the real world of a company, if you are developing a DW for an ERP like SAP R/3 (as in my case), you have thousands of tables with millions of records and no "LastChangeDatetime" columns.
With TX it is crazy to transfer all the tables you are going to use directly to TX (for example, to create the customer dimension I need 40 tables). That's why, I insist, in my case it is better to create views in the source. Of course you must have access to the source and be able to maintain the views in the future.
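To give an idea, a simplified sketch of the kind of source view I mean, using the standard SAP customer master tables KNA1 (general data) and KNB1 (company code data); a real view joins many more tables, the column list is illustrative only, and the exact DDL depends on the database underneath SAP:

-- Hedged sketch of a source-side view consolidating customer master data.
CREATE VIEW V_CustomerDimension AS
SELECT
    k.KUNNR AS CustomerNumber,
    k.NAME1 AS CustomerName,
    k.LAND1 AS CountryKey,
    b.BUKRS AS CompanyCode
FROM KNA1 AS k
JOIN KNB1 AS b
    ON b.KUNNR = k.KUNNR;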
Dear @rvgfox ,
I will stick with my standpoint on the use of the Query Table. Members of the data team should not have rights to create or alter views and tables in the production database. I know that Query Tables can be a bit hard to use, because you have to use the query language of the source database in combination with how TX will translate this to a SQL table or Parquet file. That still does not change the fact that making these kinds of changes to the production database is not a valid solution.
It might be the easiest solution, but it is not the best solution.
I will not and have never transferred all tables from the source; unless I am going to use all the tables, I only transfer the tables that I'm going to use. Besides this, it also does not validate the use of views in the source, as you have the option to use a Query Table. The 'danger' and 'ease' of joining tables and making some transformations in the query is tempting, but not how I believe a DWH / ETL should be set up. My Extraction layer only contains 1:1 copies, because transformations, renames and the like only happen in my Transformation layer.

Of course it might be easy to have a view with the joins in the source and get, for example, the customer table as one table instead of 40, but it defeats the purpose of having a Transformation layer. This sounds to me like a TETL structure, where one has to maintain a T layer in the source and a T layer in TX. How does one keep track of which transformations are where, and how does one keep the data lineage intact?
Plus, using the views in the source puts extra strain on the production database, where otherwise the strain is on the DWH database. One small mistake might bring the production database down, or strain it so much that it eventually goes down. These are not risks I'm willing to take.
We are getting a bit off topic here though.
sorry for the long read
= Daniel
Obviously @daniel I agree with you; in fact I'm using that approach with my other ERP (Dynamics 365 F&O). But I would like to know who has worked with SAP R/3, transferring the tables directly to the staging area without using views or SAP APIs, and still thinks that using views isn't a good option.
Hello @rvgfox
That is a really good question! Maybe start a new topic on this; perhaps we can derive a best practice out of it for working with SAP R/3.
= Daniel
@daniel, I had thought of that, but TRY_CAST is not supported in the query table. The source is using a CData JSON 22.0.8257.0 Provider.
I tried using CAST and IIF:
CAST(IIF([lastSync] = 'never', NULL, [lastSync]) as DATETIME)
but this assumes that ‘never’ is the only non-null, non-datetime value that lastSync would take.
So option 1 is still the only valid option. That, or creating a view of the simple table in the data source. But what I would like is to transform it "on-the-fly".
Edit:
This works:
CAST(IIF(ISDATE([lastSync]) = 1, [lastSync], NULL) AS DATETIME) AS lastSync
but it seems hacky, instead of just using TRY_CAST. I can also imagine other reasons that I would like to be able to transform data “on-the-fly” from source to a staging history table.
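For reference, this is what the TRY_CAST version would look like if the provider supported it; TRY_CAST simply returns NULL for any value that cannot be converted:

-- Equivalent intent with TRY_CAST, where the SQL dialect supports it.
TRY_CAST([lastSync] AS DATETIME) AS lastSync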
Hello @Gardar ,
Then option 1 is the best option indeed.
But do check out @Bernarddb's option, because the raw field will reduce the redundancy in the columns, and in the settings of the 'normal' table you can truncate the raw table so you will not keep any redundant data there.
= Daniel
Hi @Gardar
If your source is a CData data source, you can change the data type in an RSD file to be a datetime instead of a string.
@Thomas Lind, it was originally set up like that, but an error occurred when the data source changed the column to also include the value 'never'. In my opinion the publisher should have used NULL instead, but I figured I could get past these changes (and hopefully future changes too) by taking in all columns as strings and converting them myself between the Business Unit and the Data Warehouse.
Hello @Gardar ,
Unfortunate to hear that the solution provided by @Thomas Lind did not work; I would have thought it would fix it.
What solution will you be implementing?
@daniel, I went with option 1 in this case.
@Gardar ,
Cool. Thanks for letting me know.
Take care