
This data source can be used to connect to text or CSV files in many locations. It can also find files across folders and subfolders and merge them into one or more specified tables.

If your setup has specific requirements, various methods are available to handle them.

Configuration manual

The following settings can be used with the TimeXtender CSV Data Source connector.

General notes on setting comma-separated parameters

For every setting where a comma-separated list can be passed as a parameter, you can wrap individual items in square brackets; within the brackets, any special character can be used.

E.g. the following date format list setting will be parsed as a three-item list (notice the comma within the brackets):

yyyyMMdd,[ddd,MM,yyyy],yyyy-MM-dd
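
As a minimal sketch of this parsing rule (illustration only, not the connector's actual implementation):

```python
def split_bracketed(value: str) -> list[str]:
    """Split a comma-separated list, treating [...] groups as single items."""
    items, current, in_brackets = [], [], False
    for ch in value:
        if ch == "[" and not in_brackets:
            in_brackets = True            # start of a bracketed item
        elif ch == "]" and in_brackets:
            in_brackets = False           # end of a bracketed item
        elif ch == "," and not in_brackets:
            items.append("".join(current))
            current = []
        else:
            current.append(ch)            # commas inside brackets land here
    items.append("".join(current))
    return items

# The date format list above parses into three items:
print(split_bracketed("yyyyMMdd,[ddd,MM,yyyy],yyyy-MM-dd"))
# ['yyyyMMdd', 'ddd,MM,yyyy', 'yyyy-MM-dd']
```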

Connection settings

In this section, you specify the location of the file(s), along with additional settings for how to access them.


Location

The following location types are supported:

  • Local file or folder
  • Azure Blob Storage
  • AWS S3 Bucket
  • SharePoint or OneDrive
  • Google Cloud Storage
  • SFTP

Local file or folder

  • Select this option when the file is stored locally on a drive.

Azure Blob Storage settings

AWS S3 Bucket settings

Currently, the connector only supports connecting with an access key ID. You can learn more in the AWS documentation.

  • AWS region: the related region string (e.g. eu-west-2)
  • AWS access key ID: the access key ID for the S3 bucket
  • AWS secret access key: the secret access key for the S3 bucket
  • AWS bucket: the bucket name
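
To illustrate how these four settings map onto an S3 connection, here is a sketch using boto3 (placeholder credentials and bucket name; this is not the connector's internal code):

```python
import boto3

# Each connector setting corresponds to one client/bucket parameter.
s3 = boto3.client(
    "s3",
    region_name="eu-west-2",           # AWS region
    aws_access_key_id="AKIA...",       # AWS access key ID (placeholder)
    aws_secret_access_key="...",       # AWS secret access key (placeholder)
)
response = s3.list_objects_v2(Bucket="my-bucket", Prefix="data/")
for obj in response.get("Contents", []):
    print(obj["Key"])
```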

SharePoint or OneDrive settings

  • SharePoint/OneDrive client ID: The client ID of the app you use for access. The app needs the following application permissions granted: Files.Read.All and Sites.Read.All. To be sure you have sufficient rights, you may also add Group.Read.All and User.Read.All.
  • SharePoint/OneDrive client secret: The secret that is set for the app.
  • SharePoint/OneDrive Tenant ID: The tenant ID of the company that hosts the SharePoint site.
  • SharePoint/OneDrive Drive ID: The drive ID pointing to the document library you connect to. You can find it using Postman and the Graph API collection Graph Fork: connect with your app, then locate the SharePoint folder under Application. There you have two requests, https://graph.microsoft.com/v1.0/sites and https://graph.microsoft.com/v1.0/sites/{{SiteID}}/drives. Use the first to find the site ID, then use that site ID to find the drive ID, as in the sketch below.
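
If you prefer a script over Postman, the same two Graph requests can be issued with any HTTP client. A sketch in Python (tenant ID, client ID, and secret are placeholders; enumerating all sites may require a search query):

```python
import requests

TENANT_ID = "<tenant-id>"          # placeholders
CLIENT_ID = "<client-id>"
CLIENT_SECRET = "<client-secret>"

# Acquire an app-only token via the client credentials flow
token = requests.post(
    f"https://login.microsoftonline.com/{TENANT_ID}/oauth2/v2.0/token",
    data={
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
        "scope": "https://graph.microsoft.com/.default",
        "grant_type": "client_credentials",
    },
).json()["access_token"]
headers = {"Authorization": f"Bearer {token}"}

# Step 1: list sites to find the site ID ('search=*' enumerates all sites)
sites = requests.get(
    "https://graph.microsoft.com/v1.0/sites?search=*", headers=headers
).json()
site_id = sites["value"][0]["id"]  # pick the site you need

# Step 2: use the site ID to find the drive ID of the document library
drives = requests.get(
    f"https://graph.microsoft.com/v1.0/sites/{site_id}/drives", headers=headers
).json()
for drive in drives["value"]:
    print(drive["name"], drive["id"])
```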

Google Cloud Storage settings

The connector supports GCS authentication with service account keys. You can learn more about them here: Create and delete service account keys | IAM Documentation | Google Cloud

  • Google credential file: Path to the Service Account private key (JSON or P12)
  • Google storage bucket name: the bucket name
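
As an illustration of how these two settings are used, here is a sketch with the google-cloud-storage client library (assuming a JSON key file; names are placeholders):

```python
from google.cloud import storage

# The credential file and bucket name correspond to the two settings above.
client = storage.Client.from_service_account_json("service-account.json")
for blob in client.list_blobs("my-bucket", prefix="data/"):
    print(blob.name)
```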

SFTP settings

The connector supports SFTP authentication with a password or a private key file. It is mandatory to use one of these methods.

  • SFTP host
  • SFTP port
  • SFTP user name
  • SFTP key path
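
A sketch of how these settings map to an SFTP connection, using paramiko for illustration (placeholder values; authenticate with either key_filename or password):

```python
import paramiko

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(
    hostname="sftp.example.com",    # SFTP host
    port=22,                        # SFTP port
    username="csv-reader",          # SFTP user name
    key_filename="/path/to/key",    # SFTP key path (or password="...")
)
sftp = client.open_sftp()
print(sftp.listdir("/data"))        # list the remote CSV folder
sftp.close()
client.close()
```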


Path

The actual path to the CSV file(s). You can only specify one path per data connector.

Include sub-folders

Specify whether CSV files should be collected from subdirectories too.

Note that this setting is ignored for AWS, Azure and Google Cloud locations.

Included file types

Comma-separated list of file extensions to read. Any file with an extension not specified in this field will be ignored.

File aggregation pattern

Comma-separated list of file patterns, each defining a group of files that should be treated as one table. Supported wildcard characters are '*' and '?'.

When set, all CSV files belonging to a pattern will be treated as one table. Note that grouped files are expected to share the same schema; otherwise, the process will break during a full data load.
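
For illustration, this is how such patterns group files (a sketch using Python's fnmatch; file names and patterns are hypothetical):

```python
from fnmatch import fnmatch

files = ["sales_2023.csv", "sales_2024.csv", "inventory_01.csv", "notes.txt"]
patterns = ["sales_*.csv", "inventory_??.csv"]

# All files matching the same pattern are merged into one table.
for pattern in patterns:
    group = [f for f in files if fnmatch(f, pattern)]
    print(pattern, "->", group)
# sales_*.csv -> ['sales_2023.csv', 'sales_2024.csv']
# inventory_??.csv -> ['inventory_01.csv']
```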

Location specific fields

The rest of the connection settings differ based on the selected location type; only the relevant fields will be editable, but those fields are mandatory.

Culture setup

In this section you can find settings related to culture-specific formatting and control symbols.

Culture and client culture

The specific culture to use for metadata parsing. These settings are optional; the built-in invariant culture is used by default.

Delimiter

The delimiter character (defaults to comma)

Line ending

The specific character or character sequence separating lines (by default, all new-line characters are supported)

Quote character

Character to wrap around a string containing control character(s)
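
To see how the delimiter and quote character interact during parsing, here is a sketch using Python's csv module (made-up sample data; the connector's internals may differ):

```python
import csv
import io

# A semicolon delimiter with '"' as the quote character: quoted fields
# may contain the delimiter itself and escaped quotes.
raw = 'id;name;comment\n1;"Smith; John";"said ""hi"""\n'
for row in csv.reader(io.StringIO(raw), delimiter=";", quotechar='"'):
    print(row)
# ['id', 'name', 'comment']
# ['1', 'Smith; John', 'said "hi"']
```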

Header setup

In this section you find settings for file headers and comment rows.


Has header record

Specify if there is a header row. Note that any skipped comment rows are excluded before the header record is parsed.

Include empty headers

If there is a header record with empty column names, you can specify whether the related values should be ignored during parsing.

Skip top

The number of rows skipped during parsing, counted from the top (excluding the header row, if it exists)

Skip comment rows

Specify if there are comment rows in the file which should be ignored. This must be set if there are comment rows; not doing so can result in processing errors.

Comment start character(s)

The character or character sequence that identifies a comment row
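
A sketch of the order of operations described above: comment rows are dropped first, then the header record is read (made-up sample data):

```python
import csv
import io

raw = "# generated 2024-01-01\nid,name\n1,Alice\n2,Bob\n"

# Drop comment rows first ('Comment start character(s)' = '#'),
# then read the header record ('Has header record' = true).
lines = [line for line in io.StringIO(raw) if not line.startswith("#")]
reader = csv.reader(lines)
header = next(reader)
for row in reader:
    print(dict(zip(header, row)))
# {'id': '1', 'name': 'Alice'}
# {'id': '2', 'name': 'Bob'}
```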

Incomplete row handling

This section contains settings about empty or incomplete rows and values

Ignore blank rows

Specify whether empty rows should be ignored during parsing. If not set, an empty row will break the processing.

Ignore incomplete rows

The same setting for incomplete rows (rows with fewer fields than expected). If not set, an incomplete row will break the processing.

Trim spaces

Whether to trim whitespace characters around a field.

Empty fields equal to null

Whether to process empty fields as null in the resulting data tables

Default value if null

Default value applied instead of null. Only applies if 'Empty fields equal to null' is set to true
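
The settings in this section can be summarized with a small sketch (hypothetical data and default value; not the connector's actual logic):

```python
header = ["id", "name", "city"]
rows = [["1", " Alice ", ""], [], ["2", "Bob"]]

DEFAULT_IF_NULL = "n/a"   # 'Default value if null' (hypothetical)

for row in rows:
    if not row:
        continue                          # 'Ignore blank rows'
    if len(row) < len(header):
        continue                          # 'Ignore incomplete rows'
    cleaned = [f.strip() for f in row]    # 'Trim spaces'
    cleaned = [f if f else DEFAULT_IF_NULL for f in cleaned]
    print(dict(zip(header, cleaned)))     # empty -> null -> default value
# {'id': '1', 'name': 'Alice', 'city': 'n/a'}
```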

Datatype handling

In this section you can find settings related to parsing certain data types.

Metadata URI

If you want to designate one specific file whose metadata all other files are based on, specify that file in this field.

Max cell length

The maximum length to be used for text columns. If not set, the column length will be dynamically guessed from the rows parsed during the metadata parse (i.e. the maximum observed length plus an extra 25%; e.g. a longest value of 80 characters yields a column length of 100). Note that if a text longer than the set cell length is found during a full data load, the process will break.

Date format

Comma-separated list of supported custom date formats following .NET custom date and time format strings. The default value is derived from the culture settings. Please note that if set, the culture's date formatting configuration will be ignored.

True values

Comma-separated list of case-insensitive values representing a boolean True

False values

Comma-separated list of case-insensitive values representing a boolean False

Parsed row limit

The number of rows parsed to extract metadata. For values equal to or lower than 0, all rows will be parsed. Defaults to 1000 rows.
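
As an illustration of how the date format and True/False value lists might be applied when interpreting cell values (a hedged sketch; .NET format strings are shown as their Python strptime equivalents):

```python
from datetime import datetime

date_formats = ["%Y%m%d", "%Y-%m-%d"]   # e.g. yyyyMMdd, yyyy-MM-dd
true_values = {"true", "yes", "1"}      # case-insensitive True values

def parse_cell(value: str):
    # Try each configured date format in order...
    for fmt in date_formats:
        try:
            return datetime.strptime(value, fmt).date()
        except ValueError:
            pass
    # ...then the boolean lists, case-insensitively...
    if value.lower() in true_values:
        return True
    return value                        # ...otherwise fall back to text

print(parse_cell("20240101"))  # datetime.date(2024, 1, 1)
print(parse_cell("YES"))       # True
print(parse_cell("hello"))     # 'hello'
```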

Hi,
Is it possible to get the URI of each file added to the rows upon ingestion from this data connection, as is possible with the CData CSV connector?



Hi @oscarcleve 

Not currently, but it could be an idea. I am pretty sure it is a known request and that adding it has been discussed.


Okay, I found an idea suggesting this improvement so I’ll upvote and cross my fingers for implementation. Would be great :)



Hi,
My header row in the CSV file I’m working on looks like this and it gives an “Incomplete” error:
[screenshot of the tab-separated header row]

Since it is tab separated, I guess it is technically a TSV file, but anyway.
The line ending might not be a problem, since it’s kind of standard, but how do I set “TAB” as my delimiter?
Would it be possible to add a drop-down for supported delimiters and line endings?


Hi @thzman

I am not sure exactly how to do it, but it used to be possible.

So it should be added as an option if it does not exist.

Try using TabDelimited.



Hi Thomas,
I have a CSV source with 19 different types of files.
To avoid having to specify all the individual fields' data types in the override data type feature, I would like to use the Metadata URI feature you mention on this page.
Should I be using a mixed file here with all the possible field names and an example which shows the maximum length and the correct type? Like a 500-character field, a date example...?


Hi @Dirk Meert

It works as a way to make sure a certain set of field names gets the same data types regardless of what the other files you want to aggregate contain.

I don’t imagine these 19 different types of files are to be merged into one.

If you want to use the Metadata URI for these different types, you would point at the folder where a file for each of the 19 different types is stored, and each individual file would be used as a guide for what fits this setup. What maps the data type to the field is the name and the setup, so the delimiter settings are also important.

