
This data source can be used to connect to text or CSV files in many locations. It can also find files across folders and subfolders and merge them into one or more specified tables.

If your setup has specific requirements, various methods are available to handle them.

Configuration manual

The following settings can be used with the TimeXtender CSV Data Source connector.

General notes on setting comma-separated parameters

For every setting where a comma-separated list can be passed as a parameter, you can wrap individual items in square brackets; within the brackets, any special character can be used.

E.g. the following date format list setting will be parsed as a three-item list (notice the comma within the brackets):

yyyyMMdd,[ddd,MM,yyyy],yyyy-MM-dd
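
As a minimal sketch of this parsing rule (illustration only, not the connector's actual implementation):

```python
def split_bracketed(value: str) -> list[str]:
    """Split a comma-separated list, treating [...] groups as single items."""
    items, current, in_brackets = [], [], False
    for ch in value:
        if ch == "[" and not in_brackets:
            in_brackets = True            # start of a bracketed item
        elif ch == "]" and in_brackets:
            in_brackets = False           # end of a bracketed item
        elif ch == "," and not in_brackets:
            items.append("".join(current))
            current = []
        else:
            current.append(ch)            # commas inside brackets land here
    items.append("".join(current))
    return items

# The date format list above parses into three items:
print(split_bracketed("yyyyMMdd,[ddd,MM,yyyy],yyyy-MM-dd"))
# ['yyyyMMdd', 'ddd,MM,yyyy', 'yyyy-MM-dd']
```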

Connection settings

In this section, you specify the location of the file(s), along with additional settings for how to access them.


Location

The following location types are supported:

  • Local file or folder
  • Azure Blob Storage
  • AWS S3 Bucket
  • SharePoint or OneDrive
  • Google Cloud Storage
  • SFTP

Local file or folder

  • Select this option when the file is stored locally on a drive.

Azure Blob Storage settings

AWS S3 Bucket settings

Currently, the connector only supports connecting with an access key ID. You can learn more in the AWS documentation.

  • AWS region: the related region string (e.g. eu-west-2)
  • AWS access key ID: the access key ID for the S3 bucket
  • AWS secret access key: the secret access key for the S3 bucket
  • AWS bucket: the bucket name
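
To illustrate how these four settings map onto an S3 connection, here is a sketch using boto3 (placeholder credentials and bucket name; this is not the connector's internal code):

```python
import boto3

# Each connector setting corresponds to one client/bucket parameter.
s3 = boto3.client(
    "s3",
    region_name="eu-west-2",           # AWS region
    aws_access_key_id="AKIA...",       # AWS access key ID (placeholder)
    aws_secret_access_key="...",       # AWS secret access key (placeholder)
)
response = s3.list_objects_v2(Bucket="my-bucket", Prefix="data/")
for obj in response.get("Contents", []):
    print(obj["Key"])
```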

SharePoint or OneDrive settings

  • SharePoint/OneDrive client ID: The client ID of the app you use for access. The app needs the following application permissions granted: Files.Read.All and Sites.Read.All. To be sure you have sufficient rights, you may also add Group.Read.All and User.Read.All.
  • SharePoint/OneDrive client secret: The secret that is set for the app.
  • SharePoint/OneDrive Tenant ID: The tenant ID of the company that hosts the SharePoint site.
  • SharePoint/OneDrive Drive ID: The drive ID pointing to the document library you connect to. You can find it using Postman and the Graph API collection Graph Fork: connect with your app, then locate the SharePoint folder under Application. There you have two requests, https://graph.microsoft.com/v1.0/sites and https://graph.microsoft.com/v1.0/sites/{{SiteID}}/drives. Use the first to find the site ID, then use that site ID to find the drive ID, as in the sketch below.
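
If you prefer a script over Postman, the same two Graph requests can be issued with any HTTP client. A sketch in Python (tenant ID, client ID, and secret are placeholders; enumerating all sites may require a search query):

```python
import requests

TENANT_ID = "<tenant-id>"          # placeholders
CLIENT_ID = "<client-id>"
CLIENT_SECRET = "<client-secret>"

# Acquire an app-only token via the client credentials flow
token = requests.post(
    f"https://login.microsoftonline.com/{TENANT_ID}/oauth2/v2.0/token",
    data={
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
        "scope": "https://graph.microsoft.com/.default",
        "grant_type": "client_credentials",
    },
).json()["access_token"]
headers = {"Authorization": f"Bearer {token}"}

# Step 1: list sites to find the site ID ('search=*' enumerates all sites)
sites = requests.get(
    "https://graph.microsoft.com/v1.0/sites?search=*", headers=headers
).json()
site_id = sites["value"][0]["id"]  # pick the site you need

# Step 2: use the site ID to find the drive ID of the document library
drives = requests.get(
    f"https://graph.microsoft.com/v1.0/sites/{site_id}/drives", headers=headers
).json()
for drive in drives["value"]:
    print(drive["name"], drive["id"])
```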

Google Cloud Storage settings

The connector supports GCS authentication with service account keys. You can learn more about them here: Create and delete service account keys | IAM Documentation | Google Cloud

  • Google credential file: Path to the Service Account private key (JSON or P12)
  • Google storage bucket name: the bucket name
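
As an illustration of how these two settings are used, here is a sketch with the google-cloud-storage client library (assuming a JSON key file; names are placeholders):

```python
from google.cloud import storage

# The credential file and bucket name correspond to the two settings above.
client = storage.Client.from_service_account_json("service-account.json")
for blob in client.list_blobs("my-bucket", prefix="data/"):
    print(blob.name)
```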

SFTP settings

The connector supports SFTP authentication with a password or a private key file. It is mandatory to use one of these methods.

  • SFTP host
  • SFTP port
  • SFTP user name
  • SFTP key path
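
A sketch of how these settings map to an SFTP connection, using paramiko for illustration (placeholder values; authenticate with either key_filename or password):

```python
import paramiko

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(
    hostname="sftp.example.com",    # SFTP host
    port=22,                        # SFTP port
    username="csv-reader",          # SFTP user name
    key_filename="/path/to/key",    # SFTP key path (or password="...")
)
sftp = client.open_sftp()
print(sftp.listdir("/data"))        # list the remote CSV folder
sftp.close()
client.close()
```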


Path

The actual path to the CSV file(s). You can only specify one path per data connector.

Include sub-folders

Specify whether CSV files should be collected from subdirectories too.

Note that this setting is ignored for AWS, Azure and Google Cloud locations.

Included file types

Comma-separated list of file extensions to read. Any file with an extension not specified in this field will be ignored.

File aggregation pattern

Comma-separated list of file patterns, each defining a group of files that should be treated as one table. Supported wildcard characters are '*' and '?'.

When set, all CSV files belonging to a pattern will be treated as one table. Note that grouped files are expected to share the same schema; otherwise, the process will break during a full data load.
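
For illustration, this is how such patterns group files (a sketch using Python's fnmatch; file names and patterns are hypothetical):

```python
from fnmatch import fnmatch

files = ["sales_2023.csv", "sales_2024.csv", "inventory_01.csv", "notes.txt"]
patterns = ["sales_*.csv", "inventory_??.csv"]

# All files matching the same pattern are merged into one table.
for pattern in patterns:
    group = [f for f in files if fnmatch(f, pattern)]
    print(pattern, "->", group)
# sales_*.csv -> ['sales_2023.csv', 'sales_2024.csv']
# inventory_??.csv -> ['inventory_01.csv']
```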

Location specific fields

The rest of the connection settings differ based on the selected location type; only the relevant fields will be editable, but those fields are mandatory.

Culture setup

In this section you can find settings related to culture-specific formatting and control symbols.

Culture and client culture

The specific culture to use for metadata parsing. These settings are optional; the built-in invariant culture is used by default.

Delimiter

The delimiter character (defaults to comma)

Line ending

The specific character or character sequence separating lines (by default, all new-line characters are supported)

Quote character

Character to wrap around a string containing control character(s)
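
To see how the delimiter and quote character interact during parsing, here is a sketch using Python's csv module (made-up sample data; the connector's internals may differ):

```python
import csv
import io

# A semicolon delimiter with '"' as the quote character: quoted fields
# may contain the delimiter itself and escaped quotes.
raw = 'id;name;comment\n1;"Smith; John";"said ""hi"""\n'
for row in csv.reader(io.StringIO(raw), delimiter=";", quotechar='"'):
    print(row)
# ['id', 'name', 'comment']
# ['1', 'Smith; John', 'said "hi"']
```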

Header setup

In this section you find settings for file headers and comment rows.


Has header record

Specify if there is a header row. Note that any skipped comment rows are excluded before the header record is parsed.

Include empty headers

If there is a header record with empty column names, you can specify whether the related values should be ignored during parsing.

Skip top

The number of rows skipped during parsing, counted from the top (excluding the header row, if it exists)

Skip comment rows

Specify if there are comment rows in the file which should be ignored. This must be set if there are comment rows; not doing so can result in processing errors.

Comment start character(s)

The character or character sequence that identifies a comment row
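
A sketch of the order of operations described above: comment rows are dropped first, then the header record is read (made-up sample data):

```python
import csv
import io

raw = "# generated 2024-01-01\nid,name\n1,Alice\n2,Bob\n"

# Drop comment rows first ('Comment start character(s)' = '#'),
# then read the header record ('Has header record' = true).
lines = [line for line in io.StringIO(raw) if not line.startswith("#")]
reader = csv.reader(lines)
header = next(reader)
for row in reader:
    print(dict(zip(header, row)))
# {'id': '1', 'name': 'Alice'}
# {'id': '2', 'name': 'Bob'}
```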

Incomplete row handling

This section contains settings about empty or incomplete rows and values

Ignore blank rows

Specify whether empty rows should be ignored during parsing. If not set, an empty row will break the processing.

Ignore incomplete rows

The same setting for incomplete rows (rows with fewer fields than expected). If not set, an incomplete row will break the processing.

Trim spaces

Whether to trim whitespace characters around a field.

Empty fields equal to null

Whether to process empty fields as null in the resulting data tables

Default value if null

Default value applied instead of null. Only applies if 'Empty fields equal to null' is set to true
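
The settings in this section can be summarized with a small sketch (hypothetical data and default value; not the connector's actual logic):

```python
header = ["id", "name", "city"]
rows = [["1", " Alice ", ""], [], ["2", "Bob"]]

DEFAULT_IF_NULL = "n/a"   # 'Default value if null' (hypothetical)

for row in rows:
    if not row:
        continue                          # 'Ignore blank rows'
    if len(row) < len(header):
        continue                          # 'Ignore incomplete rows'
    cleaned = [f.strip() for f in row]    # 'Trim spaces'
    cleaned = [f if f else DEFAULT_IF_NULL for f in cleaned]
    print(dict(zip(header, cleaned)))     # empty -> null -> default value
# {'id': '1', 'name': 'Alice', 'city': 'n/a'}
```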

Datatype handling

In this section you can find settings related to parsing certain data types.

Metadata URI

If you want to designate one specific file whose metadata all other files are based on, specify that file in this field.

Max cell length

The maximum length to be used for text columns. If not set, the column length will be dynamically guessed from the rows parsed during the metadata parse (i.e. the maximum observed length plus an extra 25%; e.g. a longest value of 80 characters yields a column length of 100). Note that if a text longer than the set cell length is found during a full data load, the process will break.

Date format

Comma-separated list of supported custom date formats following .NET custom date and time format strings. The default value is derived from the culture settings. Please note that if set, the culture's date formatting configuration will be ignored.

True values

Comma-separated list of case-insensitive values representing a boolean True

False values

Comma-separated list of case-insensitive values representing a boolean False

Parsed row limit

The number of rows parsed to extract metadata. For values equal to or lower than 0, all rows will be parsed. Defaults to 1000 rows.
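
As an illustration of how the date format and True/False value lists might be applied when interpreting cell values (a hedged sketch; .NET format strings are shown as their Python strptime equivalents):

```python
from datetime import datetime

date_formats = ["%Y%m%d", "%Y-%m-%d"]   # e.g. yyyyMMdd, yyyy-MM-dd
true_values = {"true", "yes", "1"}      # case-insensitive True values

def parse_cell(value: str):
    # Try each configured date format in order...
    for fmt in date_formats:
        try:
            return datetime.strptime(value, fmt).date()
        except ValueError:
            pass
    # ...then the boolean lists, case-insensitively...
    if value.lower() in true_values:
        return True
    return value                        # ...otherwise fall back to text

print(parse_cell("20240101"))  # datetime.date(2024, 1, 1)
print(parse_cell("YES"))       # True
print(parse_cell("hello"))     # 'hello'
```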

Hi,
Is it possible to get the URI of each file added to the rows upon ingestion from this data connection, as is possible with the CData CSV connector?



Hi @oscarcleve 

Not currently, but it could be an idea. I am pretty sure it is a known request and that adding it has been discussed.


Okay, I found an idea suggesting this improvement so I’ll upvote and cross my fingers for implementation. Would be great :)



Hi,
My header row in the CSV file I’m working on looks like this and it gives an “Incomplete” error:
[screenshot of the tab-separated header row]

Since it is tab separated, I guess it is technically a TSV file, but anyway.
The line ending might not be a problem, since it’s kind of standard, but how do I set “TAB” as my delimiter?
Would it be possible to add a drop-down for supported delimiters and line endings?


Hi @thzman

I am not sure exactly how to do it, but it used to be possible.

So it should be added as an option if it does not exist.

Try using TabDelimited.



Hi Thomas,
I have a CSV source with 19 different types of files.
To avoid having to specify all the individual fields' data types in the override data type feature, I would like to use the Metadata URI feature you mention on this page.
Should I be using a mixed file here with all the possible field names and an example which shows the maximum length and the correct type? Like a 500-character field, a date example...?


Hi @Dirk Meert

It works as a way to make sure a certain set of field names gets the same data types regardless of what the other files you want to aggregate contain.

I don’t imagine these 19 different types of files are to be merged into one.

If you want to use the Metadata URI for these different types, you would point at the folder where a file for each of the 19 different types is stored, and each individual file would be used as a guide for what fits this setup. What maps the data type to the field is the name and the setup, so the delimiter settings are also important.

