Tutorial

TimeXtender Excel data source

  • 26 June 2024
  • 3 replies
  • 65 views

Userlevel 6
Badge +5

You can use this provider to connect to Excel files. It can connect to multiple files at once and merge them based on specific options such as files, sheets or both. There also is options making it possible to connect to setups that is not straight forward, using table definitions, culture settings and more.

Configuration manual

The following settings can be used with the TimeXtender Excel data source connector.

Connection settings

These settings relate to the connection to where the Excel file is located.

Location

Drop down box where you can choose what type of location your Excel file is located at. Currently, only local files are available. In the future, this will have options like DataLake, Excel Online, etc.

Path

When “Location” is set to “Local file or folder” you can here provide the path to where the Excel file(s) are located. If a path to a folder is provided all files in the folder and if the option “Include sub-folders” is set then all files in the subfolders as well are ingested.

Include sub-folders

As mentioned above this option defines if sub-folders are also traversed to find Excel files to ingest.

Metadata URI

Here you can provide a separate path to where to read metadata from. It has the same behavior as Path and will read single or multiple files and also include/exclude subfolders, but is only used for metadata calls.

Excel settings

These settings relate to the behavior of the Excel reader and how to treat the content of the Excel documents.

Set first non-empty row as column names

Check if your Excel documents have header records. This will then use the first non-empty record to set the names of your resulting columns. If unchecked generic column names will be generated. E.g. “Column_1, Column_2, …”

Table definitions

Here you can provide a comma-separated list of table definitions to ingest from your Excel document(s). A table definition has the following structure:

<Table name>=<Sheet name>!<Starting cell>:<Ending cell>

E.g. “Table1=Sheet1!E6:H9”

The table will then be ingested with data from within the range provided in the definition. The definition is also dependent on the option “Has header record”, if checked the first record is used for column names. Aggregation options will also affect defined tables.

It is also possible to provide wildcards for some of the parts in the definition. A wildcard is defined by using a “*” star symbol. You can provide wildcards for “Sheet name”, this will then ingest data from all sheets within the range defined. This data is not aggregated unless aggregation options are selected. A wildcard can also be provided for the row part of the “Ending cell” part. E.g. “H*”. The reader will then read all rows within the defined columns.

Example of a table definition using wildcards: “Table1=*!E6:H*”

Only use defined tables

When checked only tables defined using table definitions will be ingested. If not checked all tables defined using table definitions will be ingested along with the whole sheet(s) tables from the file(s).

Aggregation

Here you can choose the type of aggregation you want to apply when ingesting Excel document(s). The options are:

  • None (All files & sheets in the path will be loaded as separate tables)
  • Files (Matching sheetnames and/or table definitions will be merged across files)
  • Sheets (All sheets and/or table definitions within files are merged)
  • Files & sheets (A combination of Files and Sheet aggregation)

None – No aggregation is applied and all tables are returned with unique names using the following naming structure:

<filename>_<sheet name> or <filename>_<sheet name>_<table definition name>

Files – Tables within files are named using the following naming structure:

<sheet name> and/or <sheet name>_<table definition name>

The data is then aggregated over files by table name.

Sheets – Data is aggregated over sheets and returned as a single table per file plus any table definition tables. Table definitions are also aggregated.

Files & sheets – Same as above, but in the order that Sheets are aggregated first for each file and then files are aggregated, will return a single table plus a single table for any table definitions.

Included sheets

Here you can specify a list of sheet names to select, by applying the names in a comma-separated bracketed list.

Infer datatypes

Check this to infer data types from your Excel documents. If unchecked, all data types will be strings.

Number of rows to infer data types

If you have checked “Infer datatypes”, you can here provide the number of rows to use for inferring the datatypes. This is because there is a small overhead of doing the inference. Setting the number to 0 or a negative number will result in all rows being used for inference.

Culture

Here you can specify the culture of the read Excel files. For example, if the source is different than yours or does not match InvariantCulture. 

Number decimal separator

Here you can provide the number decimal separator used when inferring numeric data types.

Date formats

Here you can specify a custom date format. You can apply more than one by applying them in a comma-separated format within brackets.

Misc behavior

Test connection

Clicking the test connection button has the following behavior:

  • If the path is empty or not set will return a failed connection
  • If a path is provided and the file can not be opened will return a failed connection
  • If a path is provided and the file or folder does not exist will return a failed connection

If none of the above a connection is returned as a success.

This means that we do not check whether the file or folder contains files that are Excel documents or if they are corrupted or contain any data.


3 replies

Badge

Seems that path cannot be a UNC pathname? A drive letter does work. I’m I right?

If yes something for improvement?

Badge

Another question. It looks like that meta data is required. Is it?

In general it would be good to know what is required and what is optional.

Userlevel 6
Badge +5

@Erik de Vries please submit a product idea for UNC path name

the metadata URI is not required. I am able to connect to a local workbook simply by specifying the path property

 

Reply