Tutorial

TimeXtender Excel data source

Forum|Forum|1 year ago
June 26, 2024
17 replies
1242 views

Thomas Lind
Community Manager

You can use this provider to connect to Excel files. It can connect to multiple files at once and merge them based on specific options such as files, sheets or both. There also is options making it possible to connect to setups that is not straight forward, using table definitions, culture settings and more.

Configuration manual

The following settings can be used with the TimeXtender Excel data source connector.

Connection settings

These settings relate to the connection to where the Excel file is located.

Path

When “Location” is set to “Local file or folder” you can here provide the path to where the Excel file(s) are located. If a path to a folder is provided all files in the folder and if the option “Include sub-folders” is set then all files in the subfolders as well are ingested.

For alternate location such as SharePoint or Azure, you have other options for pointing at a location, such as setting the drive id and a container. If the files are stored on that level you can add a / as the path.

Include sub-folders

As mentioned above this option defines if sub-folders are also traversed to find Excel files to ingest.

Metadata URI

(Optional) Specify the path for a single file as a template to define the metadata to be used across the files defined by the path property. This can be used to reduce the execution time required for the import metadata task by limiting the number of rows that are scanned to infer the data type (i.e. only the rows in the metadata file are scanned instead of rows across multiple files). E.g. C:\FlatFiles\Sales.xlsx

Note: If this setting is used then the other properties to define data types such as Infer datatypes and Number of rows to infer data types will apply to the metadata file instead of the files specified by the path property

Location

Drop down box where you can choose what type of location your Excel file is located at.

The following location types are supported:

Local file or folder
Azure Blob Storage
AWS S3 Bucket
SharePoint or OneDrive
Google Cloud Storage
SFTP

Local file or folder

When the file is locally stored on some drive.

Azure Blob Storage settings

Azure blob connection string: the complete connection string to the container. You can learn more about configuring connection strings here : Configure a connection string - Azure Storage | Microsoft Learn
Azure tenant ID: The tenant id the storage account is located in. Is needed when doing OAuth 2.0 authentication using the client grant type.
Azure client ID: The client id for the application that is used for doing OAuth 2.0 client authentication.
Azure client secret: The secret value that is set up for the application
Azure storage account name: The storage account name that hosts the container.
Azure blob container: the container name

AWS S3 Bucket settings

Currently the connector only supports connecting with Access Key Id. You can learn more in AWS documentation.

AWS region: the related region string (eg. eu-west-2)
AWS access key ID: access key id to S3 bucket
AWS secret access key: secret access key to the S3 bucket
AWS bucket: the bucket name

SharePoint or OneDrive settings

Note: If pointing to a specific Excel file in SharePoint or OneDrive site root folder then enter path as filename.xlsx. If you would like to extract all the files in the SharePoint site then enter path as /. If you would like to extract excel files from a specific folder in the SharePoint site then enter the folder name. If it is a nested folder then enter the following path: foldername1/foldername2. If you would like to point to a specific file in a nested folder in a SharePoint site then enter the following path: foldername1/foldername2/filename.xlsx

SharePoint/OneDrive client ID: The Client ID of the app you use for access. The app used needs the following granted Application permissions Files.Read.All and Sites.Read.All. To be sure you have enough rights you may also add Group.Read.All and User.Read.All.
SharePoint/OneDrive client secret: The secret code that is set for the app used.
SharePoint/OneDrive Tenant ID: The tenant id of the company that hosts the SharePoint site.
SharePoint/OneDrive Drive ID: The drive id pointing to the Document Library you connect to. You can find these by using Postman and by looking at the Graph API collection Graph Fork. You can use the app to connect to this and locate the SharePoint folder under Application. In there you have two requests, https://graph.microsoft.com/v1.0/sites and https://graph.microsoft.com/v1.0/sites/{{SiteID}}/drives. Use sites to find the site ID and use the SiteID to find the Drive ID.

Google Cloud Storage settings

The connector supports GCM authentication with service account keys. You can learn more about them here: Create and delete service account keys | IAM Documentation | Google Cloud

Google credential file: Path to the Service Account private key (JSON or P12)
Google storage bucket name: the bucket name

SFTP settings

The connector supports SFTP authentication with password or a public key file. It is mandatory to use one of these methods.

SFTP host
SFTP port
SFTP user name
SFTP key path

Excel settings

These settings relate to the behavior of the Excel reader and how to treat the content of the Excel documents.

Set first non-empty row as column names

Check if your Excel documents have header records. This will then use the first non-empty record to set the names of your resulting columns. If unchecked generic column names will be generated. E.g. “Column_1, Column_2, …”

Table definitions

Here you can provide a comma-separated list of table definitions to ingest from your Excel document(s). A table definition has the following structure:

<Table name>=<Sheet name>!<Starting cell>:<Ending cell>

E.g. “Table1=Sheet1!E6:H9”

The table will then be ingested with data from within the range provided in the definition. The definition is also dependent on the option “Has header record”, if checked the first record is used for column names. Aggregation options will also affect defined tables.

It is also possible to provide wildcards for some of the parts in the definition. A wildcard is defined by using a “*” star symbol. You can provide wildcards for “Sheet name”, this will then ingest data from all sheets within the range defined. This data is not aggregated unless aggregation options are selected. A wildcard can also be provided for the row part of the “Ending cell” part. E.g. “H*”. The reader will then read all rows within the defined columns.

Example of a table definition using wildcards: “Table1=*!E6:H*”

Only use defined tables

When checked only tables defined using table definitions will be ingested. If not checked all tables defined using table definitions will be ingested along with the whole sheet(s) tables from the file(s).

Aggregation

Here you can choose the type of aggregation you want to apply when ingesting Excel document(s). The options are:

None (All files & sheets in the path will be loaded as separate tables)
Files (Matching sheetnames and/or table definitions will be merged across files)
Sheets (All sheets and/or table definitions within files are merged)
Files & sheets (A combination of Files and Sheet aggregation)

None – No aggregation is applied and all tables are returned with unique names using the following naming structure:

<filename>_<sheet name> or <filename>_<sheet name>_<table definition name>

Files – Tables within files are named using the following naming structure:

<sheet name> and/or <sheet name>_<table definition name>

The data is then aggregated over files by table name.

Sheets – Data is aggregated over sheets and returned as a single table per file plus any table definition tables. Table definitions are also aggregated.

Files & sheets – Same as above, but in the order that Sheets are aggregated first for each file and then files are aggregated, will return a single table plus a single table for any table definitions.

Included sheets

Here you can specify a list of sheet names to select, by applying the names in a comma-separated bracketed list.

Infer datatypes

Check this to infer data types from your Excel documents. If unchecked, all data types will be strings.

Number of rows to infer data types

If you have checked “Infer datatypes”, you can here provide the number of rows to use for inferring the datatypes. This is because there is a small overhead of doing the inference. Setting the number to 0 or a negative number will result in all rows being used for inference.

Culture

Here you can specify the culture of the read Excel files. For example, if the source is different than yours or does not match InvariantCulture.

Number decimal separator

Here you can provide the number decimal separator used when inferring numeric data types.

Date formats

Here you can specify a custom date format. You can apply more than one by applying them in a comma-separated format within brackets.

Misc behavior

Test connection

Clicking the test connection button has the following behavior:

If the path is empty or not set will return a failed connection
If a path is provided and the file can not be opened will return a failed connection
If a path is provided and the file or folder does not exist will return a failed connection

If none of the above a connection is returned as a success.

This means that we do not check whether the file or folder contains files that are Excel documents or if they are corrupted or contain any data.

Erik de Vries
Starter
Forum|Forum|1 year ago
July 3, 2024

Seems that path cannot be a UNC pathname? A drive letter does work. I’m I right?

If yes something for improvement?

Erik de Vries
Starter
Forum|Forum|1 year ago
July 3, 2024

Another question. It looks like that meta data is required. Is it?

In general it would be good to know what is required and what is optional.

Christian Hauggaard
Community Manager
Forum|Forum|1 year ago
July 3, 2024

@Erik de Vries please submit a product idea for UNC path name

the metadata URI is not required. I am able to connect to a local workbook simply by specifying the path property

DylanW.
Contributor
Forum|Forum|1 year ago
December 3, 2024

I am connecting to a SharePoint environment that contains a folder with multiple Excel files in various structures. According to the documentation, I should be able to connect to this folder:

"Note: If pointing to a specific Excel file in SharePoint or OneDrive site root folder then enter path as filename.xlsx. If you would like to extract all the files in the SharePoint site then enter path as /. If you would like to extract excel files from a specific folder in the SharePoint site then enter the folder name. If it is a nested folder then enter the following path: foldername1/foldername2. If you would like to point to a specific file in a nested folder in a SharePoint site then enter the following path: foldername1/foldername2/filename.xlsx”

Currently, I can only connect to a specific file within the folder using the path format:
foldername1/foldername2/filename.xlsx.
However, all other options fail, and I receive an error message:

Exception Type: System.IO.InvalidDataException
Message: End of Central Directory record could not be found.
Stack Trace: at System.IO.Compression.ZipArchive.ReadEndOfCentralDirectory()
at System.IO.Compression.ZipArchive.Init(Stream stream, ZipArchiveMode mode, Boolean leaveOpen)
at Sylvan.Data.Excel.XlsxWorkbookReader..ctor(Stream iStream, ExcelDataReaderOptions opts)
at Sylvan.Data.Excel.ExcelDataReader.Create(Stream stream, ExcelWorkbookType fileType, ExcelDataReaderOptions options)
at TimeXtender.ODX.ExcelCustomDataSource.ExcelEngine.GetDataTables()
at TimeXtender.ODX.ExcelCustomDataSource.ExcelDataSource.GetTables(DataSourceQueryTableSettings _, Func`3 tableHasFilterMatch, Func`4 columnHasFilterMatch)
at DataSourceEngine.Custom.CustomSourceEngine.SynchronizeSource(List`1 existingTableModels, IDataStorageEngine dataStorageEngine)
at ExecutionEngine.Action.SynchronizeAction.Execute()
at ExecutionEngine.Action.ExecutionAction.<.ctor>b__11_0()
at System.Threading.Tasks.Task.Execute()

The documentation indicates that connecting in other ways should be possible. Could you provide more detailed examples or clarify the expected outcomes for these scenarios?

Christian Hauggaard
Community Manager
Forum|Forum|1 year ago
December 3, 2024

@DylanW. Which version of TDI and the data source are you using?

I am using TimeXtender Excel data source 22.0.0.0 and TDI 6822.1 and as shown below I can see multiple excel files when specifying my nested sharepoint folder (I tried both with “CSV/MyXL” and “CSV/MyXL/” and both worked) and then executing the import metadata task

Do you notice any differences between our setups? Did you execute the import metadata task?

DylanW.
Contributor
Forum|Forum|1 year ago
December 3, 2024

@Christian Hauggaard TimeXtender Excel data source 22.0.0.0 and TDI 6766.1.
my settings:

Do you mean synchronize metadata with import metadata task?

Only difference is, the Victa folder is somehow locked in sharepoint:

Christian Hauggaard
Community Manager
Forum|Forum|1 year ago
December 3, 2024

@DylanW. first you must execute the import metadata task to extract the latest metadata (e.g. new column info)

Then you synchronize (in the metadata manager) this will compare the new metadata that was imported, to the metadata currently committed. Once you have reviewed/made changes click OK to commit the changes based on the new metadata.

Therefore please run the import metadata task

Then synchronize metadata

Do you see the files now?

I have tried to reproduce the lock icon on my sharepoint folder by stopping the inheritence of permissions under manage access advanced settings for the folder, which search results online suggest but I still do not see the lock icon. Can you please test with a folder without the lock icon?

DylanW.
Contributor
Forum|Forum|1 year ago
December 3, 2024

@Christian Hauggaard Should this option be available in 6766.1?

Isn't the synchronize task doing exactly the same?

Christian Hauggaard
Community Manager
Forum|Forum|1 year ago
December 3, 2024

@DylanW. No please see the release notes

“Redesigned metadata synchronization and table selection: We've completely reimagined how you manage metadata and select tables in the Ingest layer. With these changes, we aim to make it easier to combat schema drifts, e.g. when you change data source providers, and put you in firm control of what goes into your Ingest storage. 'Synchronize Tasks' are now known as 'Metadata Import Tasks' and will no longer do a full synchronization of the data source. Rather, it will import the metadata from the data source and store it in the data storage of the Ingest. The Data Source Explorer has become the Metadata Manager and is now the place for synchronizing data sources - selecting and mapping tables in the data source to tables in the Ingest storage - all based on the metadata imported by the Metadata Import Tasks.“

Therefore the functionality for data source ingestion is different now. Please let me know if you have any follow up questions

Thomas Lind
Author
Community Manager
Forum|Forum|1 year ago
December 3, 2024

@Christian Hauggaard TimeXtender Excel data source 22.0.0.0 and TDI 6766.1.
my settings:

Do you mean synchronize metadata with import metadata task?

Only difference is, the Victa folder is somehow locked in sharepoint:

Hi @DylanW.

Are you adding Documenten as a folder?
It is not a folder it is the Document Library. You can see the document libraries in the top.

The Document Library is the Drive ID. The folders is the level below that.

The path would be:

iDas Volvo Dealers\Dealers\Victa

DylanW.
Contributor
Forum|Forum|1 year ago
December 3, 2024

@Thomas Lind Hi Thomas,

No I am not adding Documenten as a folder.

My path:

iDas Volvo Dealers\Dealers\Victa(same as yours and Christian's)

If I add the filename to the path it works fine.

Extraction of multiple excel files does not work.

I will update timextender as @Christian Hauggaard suggested and try again!

Or do you have a different solution?

Thomas Lind
Author
Community Manager
Forum|Forum|11 months ago
December 4, 2024

Hi @DylanW.

I see in the image that it is set to

iDas Volvo Dealers/Dealers/Victa

Try to change the / to \ if it isn’t already changed.

If you still use 6766.1 it may be due to you checking the Include sub-Folders part. I can’t replicate the issue in my 6823.1 release, but I think it may have been a bug earlier if there were no sub level in the folder where you were looking.
I have these files

I made this setup to get these files.

jasperp
Starter
Forum|Forum|7 months ago
April 28, 2025

Hello,

Is it possible to skip the first few rows when using the Timextender Excel Data Source? It has always been possible with the CData Excel provider, but I can't seem to find the functionality anymore.

Thanks in advance!

Thomas Lind
Author
Community Manager
Forum|Forum|6 months ago
May 13, 2025

Hi @jasperp

I will add this here as we already had our conversation about this elsewhere.

The way to skip the first rows is only possible by using the Table Definitions method.

Once you have applied that it will only choose this. There is no skip number option for excel as there is for CSV.

oscarcleve
Contributor
Forum|Forum|6 months ago
May 14, 2025

Upvote this idea if you want the Excel sheet name as a column:

KajEmergo
Contributor
Forum|Forum|1 month ago
October 29, 2025

You mention the following settings are required for an App Registration to fetch data from Sharepoint:

SharePoint/OneDrive client ID: The Client ID of the app you use for access. The app used needs the following granted Application permissions Files.Read.All and Sites.Read.All. To be sure you have enough rights you may also add Group.Read.All and User.Read.All.

Certain security officers may find these settings are not restricted enough. Essentially, you could fetch all Excel-files stored in your entire Sharepoint tenant.

With CData, you could use an App Registration that has delegated permissions (using the Sharepoint permissions of the user authenticating the data source). You would use the OAuth Grant Type ‘CODE’ setting for that. Is there going to be any feature like this available for the TX Excel data source?

Thomas Lind
Author
Community Manager
Forum|Forum|7 days ago
November 26, 2025

Hi @KajEmergo

The CODE authorization method hasn’t been implemented, but it has been discussed in regards to REST for example.

I would say that it could be a good new feature, but supposedly it may be hard to implement. I would make a feature request for it in the ideas section.

Configuration manual

Connection settings

Path

Include sub-folders

Metadata URI

Location

Local file or folder

Azure Blob Storage settings

AWS S3 Bucket settings

SharePoint or OneDrive settings

Google Cloud Storage settings

SFTP settings

Excel settings

Set first non-empty row as column names

Table definitions

Only use defined tables

Aggregation

Included sheets

Infer datatypes

Number of rows to infer data types

Culture

Number decimal separator

Date formats

Misc behavior

Test connection

Sign up

Login with SSO

Login to the community

Login with SSO

Scanning file for viruses.

This file cannot be downloaded