Solved

Read Excel decimal data with CDATA


I have an Excel sheet with percentages (11 decimals). The decimal sign is comma. See example below.

Year    Period    CostPercentage
2020    1    0,01800188
2020    2    0,016852519

But when I read the data in TX then I see the below result: 
 

 

I have tried to use the culture feature see below but no success. Find a screen dump below

Why does TX / Cdata not read the data as is? The “,” is interpreted as a Thousand separator.

Any help will be appreciated.

When I have a file with a “.” (dot) as a decimal sign I convert the data to numeric precision 38,scale 11 and that works.

 

icon

Best answer by Lorenzo 19 April 2023, 14:16

View original

11 replies

Userlevel 3
Badge +2

You can try to use the Data Type Override function and set the data type to float (data type Numeric, check ‘Is Float’).

In most cases this is how I solve issues with thousand/decimal separators as well as scientific notation.

https://legacysupport.timextender.com/hc/en-us/articles/115002775923-Using-the-Override-Data-Type-feature

Userlevel 5
Badge +7

@Lorenzo ,

is your CData Excel version the most recent for the version of TimeXtender you are running? There have been quite some improvements lately.

What region / locale settings does your server or user have? Have you tried experimenting with the TypeDetectionScheme  and EnableOleDbCompatibility settings to see if that detects the column types differently?

Your culture and client-culture settings should perform the conversion you expect, so that is worth diving into.

Unfortunately nothing works using CDTA Excel when you have comma as decimal and the server is en-US. When you use a CSV file however (and CDATA csv) then it works without any problems.

Does this means that TX (CDATA) is not able to process an Excel file with comma as decimal?

Userlevel 6
Badge +5

Hi @Lorenzo 

With the attached excel file, I am able to return the decimal value despite a comma as the decimal sign and my PC being en-US culture.

I connect to the file using the Cdata Excel provider using all the default settings.

Then I create a Query Table by right clicking on my data source and selecting Manage Query Tables using the following SQL statement, which replaces the comma with a dot, and then casts as float.

SELECT [RowId], [Year], [Period], 
CAST(REPLACE([CostPercentage],',','.') AS FLOAT) AS [CostPercentage] FROM [Excel].[Sheet1]

Then I execute the synchronize and transfer task for my data source. 

When I right click on my data source and select Data Source Explorer to view the data, it appears correct.

Then I bring in my table into my DW and edit my field to include 11 decimals

Alternatively you can bring the data in as text into the ODX storage (skip the query table) and then edit the field in the DW to include 11 decimals, this seems to work as well.

 

Userlevel 5
Badge +7

Hi Christian,

the workaround is nice - but you would expect it should not be required (given certain conditions).

 

I guess if your data is Dutch culture in Excel and your server uses the American culture settings, Excel will not recognize 1000,2 as a number representing 1000 and two tenths. Then the culture settings will not allow for a conversion. I think if your data culture and server culture match, you should be able to modify the data going into TX.

This is partially an Excel issue, but quite thorny if you are an international enterprise without a centralized culture/regional setting.

It would be good to verify that when the Excel file's culture and the server culture match, you can successfully convert going into TX.

 

Userlevel 6
Badge +5

@Lorenzo @rory.smith 

It works for me when I select the whole column (i.e. click on column C) and then change the format to number. It is important to select and format the whole column and not just the values / individual cells. After I executed the synchronize task again, the data type was returned as float.

 

Userlevel 6
Badge +5

Hi @Lorenzo have you tried the solution above, did it work? If so please help us by marking the best answer. Please let me know if you have any further questions :)

Hi @Christian Hauggaard My apologies for this late response, I was a few days off. 
It still does not work for me. Strange thing is that when I read your file with Costpercentage as text and no culture manipulation I see it as decimal comma (,) but when I use number it is read as a whole value.
My file I need to read as format number else I don't see decimals and in my case the decimals are read as dot (.) although there is a comma.

I have attached an Excel sheet. I would like to know what you see.

Userlevel 6
Badge +5

Hi @Lorenzo 

Can you please confirm if you see the same as I do below, or if you see it differently on your end?

When I open the file you attached, it looks like this. Column C is formatted as number and it displays with a dot, rather than a comma. My machine also has en-US culture, do you know what might be the reason for you seeing comma and I see the dot? Has special culture settings been applied in Excel? Where is your excel file created/located, is it on the same server as the ODX server with en-US culture?

Also which version of excel has been used to create the file? You can check this in the Excel file under File > Account > About Excel

I use the following provider and settings. No culture settings are applied. 

I synchronize the data source and this results in the correct values and float data type in the data source explorer.

If I edit the data source and add the culture nl-NL to the culture property, and synchronize I see the following incorrect values, however the data type is still recognized as float. 

 

Hi @Christiaan,

I see the comma, see screen print below. The Excel sheet is not created on the same VM as TX is running (we don't use ODX but business unit). We don't have Excel installed on the VM's TX is installed.

We use the same version of Excel, see below.

 

See the screen dump of the  provider, I have removed all the culture settings.

Probably you use the ODX version, I don't see a data source explorer, I only have a preview data. So the data type I can only see to hover over the field or use for example the data flow option of visualization 

You can see the data is now with a decimal dot “.” 

 I also see that the field is a float.

 

Userlevel 6
Badge +5

Hi @Lorenzo yes I was using ODX for my testing. Based on the below screenshot you sent, the issue seem to be resolved after removing the culture settings? Is that correctly understood?

 

Reply