Solved

Correct charset for special characters in CSV data source?

  • 20 August 2024
  • 9 replies
  • 68 views

Hi TXD community:

 

In the ODX Server we have built a CData data source connection with the CSV 2023 provider. Kindly see the connection details below.

 

The synchronization went successfully but the transfer task completed with one error and one warning.

The error is: “System.Data.CData.CSV.CSVException (0x80004005): [500] Could not execute the specified command: Unable to translate bytes [E1] at index 3790 from specified code page to Unicode.” This error appears to be caused by the special character in the string “Nicolás” in the source file, which is shown as UTF-8, as below.

As you can see from the connection details, we have set the charset to UTF-8. I then tried the ANSI charset but got the error “System.Data.CData.CSV.CSVException (0x80004005): [500] Could not execute the specified command: 500 -- 'ANSI' is not a supported encoding name. For information on defining a custom encoding, see the documentation for the Encoding.RegisterProvider method.” Do you know the valid encoding in this case, or where the documentation it refers to can be found?
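For anyone diagnosing the same error, a minimal Python sketch (not part of the original thread) can confirm where a file stops being valid UTF-8. It assumes the file was actually saved in a single-byte code page such as Windows-1252, where á is stored as the single byte 0xE1. The sample data and the helper name `find_invalid_utf8` are made up for illustration.

```python
def find_invalid_utf8(data: bytes):
    """Return (index, byte value) of the first byte that is not valid UTF-8, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first offending byte
        return exc.start, data[exc.start]
    return None

# Hypothetical sample: the same text saved as Windows-1252 instead of UTF-8.
sample = "id,name\n1,Nicolás\n".encode("cp1252")
result = find_invalid_utf8(sample)
if result is not None:
    index, value = result
    print(f"Invalid UTF-8 byte {value:02X} at index {index}")
```

To check a real file, read it in binary mode (`open(path, "rb").read()`) and pass the bytes to the helper. A byte value of E1 at the reported index would be consistent with the error message above.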

And the warning is: “The table folder 'csv_kronos.csv' is used by another table with the id faebce67-3f54-456c-b1d1-43a692487a58”. This warning always appears, even when there are no special characters in the data source.

By any chance have you encountered similar issues or know how to fix them?

Thank you in advance!

Hi @Xiaoqing Hu 

Can you please share a simple csv dummy file with the Nicolás value?

I tried creating the following file, and the sync and transfer task completed successfully using the default settings for the CSV 2023 provider.

 


Hi @Christian Hauggaard, that is good news. This is a dummy file: https://drive.google.com/file/d/15CssBbDXXuBjPHROl8QvzvLbkW799SR3/view?usp=sharing


@Xiaoqing Hu 

Can you please try changing the charset property to CP1252, and then execute the sync and transfer task?
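A note on why CP1252 may help here: the byte E1 from the error message is not a complete UTF-8 sequence on its own, but in Windows-1252 it maps directly to á. The Python sketch below illustrates this, assuming the source file really is Windows-1252 encoded. The helper name `reencode_to_utf8` is hypothetical and not part of any TimeXtender or CData API.

```python
def reencode_to_utf8(data: bytes, source_encoding: str = "cp1252") -> bytes:
    """Decode bytes from the assumed source encoding and re-encode them as UTF-8."""
    return data.decode(source_encoding).encode("utf-8")

raw = b"Nicol\xe1s"            # bytes as they would appear in a Windows-1252 file
print(raw.decode("cp1252"))    # Nicolás
print(reencode_to_utf8(raw))   # b'Nicol\xc3\xa1s'
```

Re-encoding the file to UTF-8 before loading would be one alternative if the charset property should stay at UTF-8. The fix used in this thread was simply to set the charset property to match the file.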

 


Hi @Christian Hauggaard, thank you very much for the advice. After changing the charset the error is gone, but the warning remains.

And when I try to preview the data from the ODX Server, I get the error below.

 


@Xiaoqing Hu please go to the Azure portal and manually delete the data source table in the ODX storage blob. Then run the sync and transfer task again.

Please see this for more info:

https://support.timextender.com/odx%2D89/azure%2Ddata%2Dstorage%2Dtable%2Dfolder%2Dhas%2Dinvalid%2Dmeta%2Ddata%2D1577?tid=1577


Hi @Christian Hauggaard, yes, it works now! Thank you very much. May I ask what all the possible/supported encodings are for charset?


@Xiaoqing Hu I do not have the full list. It is not listed in CData’s documentation either. Would you like me to contact CData and request the full list?


@Christian Hauggaard if that is feasible, it would be much appreciated.


@Xiaoqing Hu please see the response from CData below:

“The driver is built on a core component UTF-8 that supports a wide range of charset values, including the ones like Windows-1252 (CP1252), and Shift-JIS. While we do not list every charset value explicitly in the documentation, the driver should handle most of the charset values that the customer's environment supports.

You can list all supported charset values by running the following code:

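using System;
using System.Text;
// Print the name of every encoding available in this .NET environment.
foreach (var encodingInfo in Encoding.GetEncodings())
{
    Console.WriteLine(encodingInfo.Name);
}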

This will provide the customer with a complete list of supported encodings on their .NET environment.”

