Skip to main content
Solved

Why do datatypes in the Raw table change with the Valid?

  • 9 August 2024
  • 2 replies
  • 31 views

Hello,

I have a question about the RAW tables in TimeXtender. I noticed that the data types of the fields in the RAW table change when I modify them in the tool. I was under the impression that RAW tables should keep the original data types from the source, since this would be “RAW”.

I ran into trouble when loading transferring from the ODX. I changed the datatype of a field and even added a transformation that resulted in a CAST in the transformation view. However, I got an error message saying: The given value of type String from the data source cannot be converted to type float of the specified target column.
This should not be, because the source contains only data that is/should be convertible to a float. The Raw table has no data, which I find strange. Changing the fields data type to float has also changed the datatype of the Raw table. I would expect the Raw table to always have the same datatype as the source and that the changing of datatypes is handled in the transformation view. 

 

My questions:

  1. What's the reason for allowing the Raw table datatypes change? This makes the data only raw-ish.
  2. Is there a way to ensure that the RAW table fields retain the same data types as the source? Or could this perhaps become new functionality. 

To be clear, I'm not looking for fixes/workarounds. I'm specifically interested in why datatypes in the raw table can differ from their source. 

 

Kind regards,
Benny Aalders

Hi,

your ODX is where your data is raw - why would you want another raw copy in your DWH if you are going to change types (or do other things like add lookups)? Type changes between ODX and DWH happen in the data movement from ODX to _R table (if you are not using Simple mode). There might be a debugging reason to do this, but you are then always paying extra storage and compute resources for something you may only need in specific circumstances.

If you want your _R table to match more closely with ODX, don't change any types but create new fields with the target type and use transformations to populate those. Then leave the source fields as raw only so you save as much space but retain the possibility of seeing the source data converted to the original SQL type. The field you add for the transformation will also exist in the raw table, but not be populated. You will also notice that conditional lookups are populated in the _r table. One of the reasons for changes in the raw table is that this makes inserts from raw to valid possible for new records instead of mandating all transformations being implemented as UPDATEs in historical/incremental context.

Your raw table may be empty if you have already executed a table in DWH and added fields later. Tables are automatically incremental between ODX and DWH (expand the mapping and see if there is an I in the icon) and there may not have been new records. You can force a full load or set the table to load full instead of automatic to overcome this.

Note that ADLS storage for the ODX means you are converting your source types to parquet types, and subsequently converting those to SQL types. You never have perfectly “raw” data other than in the source system. This is also where conversion errors from string to float/numeric/date types come in: collation can play a role.


Hi @Benny please let us know if the issue is resolved, and if so please help us by marking a best answer above. If you have any follow up questions please let us know


Reply