CData Performance

CData Providers - Performance Optimization and Troubleshooting

2 years ago
February 16, 2023
0 replies
1041 views

Syed Yousuf
Contributor
82 replies

Warning: CData data source providers are no longer supported. We strongly recommend that you use or upgrade to TimeXtender-branded providers.

This article describes general techniques to optimize performance of CData Providers and to troubleshoot errors or unexpected behavior during data extraction. For specific provider-related techniques, refer to the provider documentation.

Performance

A CData provider’s performance is largely dependent on the data source and the amount of data.

Decreasing the amount of data retrieved will improve performance.

Queries with multiple joins, many columns, large number of rows, etc. take longer.

Using server-side filters to decrease the data set can improve performance. These filters are passed through the API request to the actual data source

Properties related to performance:

MaxRows - limits the number of rows returned when no aggregation or group by is used in the query
Verbosity - decreasing the verbosity level will reduce the amount of log writing
PageSize - affects the maximum number of results to return per page; If you get an error asking to request less data, set PageSize to a lower value.
BatchSize - useful when the server has limitations on the size of the request that can be submitted
Disable PseudoColumns / CustomFields to decrease the request size

Generating Logs

Logfiles can be generated by setting two properties:

Logfile – set to a file path and file name (for example, Logfile=‘C:\Logs\REST_Log.txt’)
Verbosity – set to a number between 1 and 5 (for example, Verbosity=3)

A Verbosity 3 log is usually sufficient for troubleshooting a technical issue. However, if logging is always on, the verbosity should not be set higher than 1 to avoid performance loss.

Verbosity Levels:

1 Logs the query, the number of rows returned by it, the start of execution and the time taken, and any errors.
2 Everything above + cache queries and additional information about the request
3 Everything above + HTTP headers and body of the request and response
4 Everything above + transport-level communication with the data source, including SSL negotiation
5 Everything above + communication with the data source and additional details that may be helpful in troubleshooting problems, including interface commands.

Note: Verbosity levels 3 and higher can capture sensitive client data, including:

3 Full body of the request and response, this will include all data returned by the provider
4 SSL certificates
5 Any other transfer data, including binary transfer data

The provider masks sensitive values, such as passwords, in the connection string and requests. However, it is best practice to review a log for sensitive data before sharing outside of your organization.

Verify the provider version

The provider version can be obtained from the log file. It is printed out whenever a connection is opened. For example:

Opened REST connection. Version: 22.0.8354.0. Edition: [ADO.NET Provider .NET 4.0].

The version will help identify known issues (if any) that have been resolved in later builds, so you may update the provider as needed.

Identify whether the issue originates from Product/Application or the Provider

Review the error message to identify the source of the problem (e.g. an error message you are familiar with, or know that it stems from a product).

View the stack trace for the exception. If it contains the CData class, the error is most likely coming from the provider.

Verify the query being pushed down to the provider. Depending on the issue, you can distinguish whether the problem is specific to the product or the provider.

Identify whether the issue originates from the Provider or the Datasource

The log displays the order of operations performed (such as retrieving metadata, executing queries, etc.) along with the requests and responses sent to the server. The error message will be displayed in the log as well and will help identify at what point/call/request the error occurred.

The request and response sequence can be reviewed to identify if there are any issues. The request can be verified to ensure that the format and data is correct based on the query being issued. The request may vary between providers. Reviewing the data source specifications may help identify issues, e.g., malformed request, for further troubleshooting.

The response can also be reviewed as it may contain additional information on errors (For example, the SharePoint providers may just return an HTTP response that needs to be looked at). The response can also be reviewed for the data returned. This is useful for cases where data is not being returned when you expect it to. This may imply the data most likely doesn’t exist on the server, does not match the query, or a filtering issue in the provider. To troubleshoot further, you may remove filters (e.g., WHERE clause) and test if the data is returned. If it does, then this narrows down the issue to the filter. If the filter is pushed down to the provider (e.g., there isn’t an extra layer performing the filter), an issue can be raised with the support team to investigate further.

Reviewing Logs

[INFO|Connec] lines contain information on the provider used, the version, the edition, and the connection string

Every CData provider has a major and minor version: i.e., 22.0.8413

- 22 – Major version, updated every year
- 8413 – Minor version, updated with daily builds

[EXEC|Parsed] will contain the query executed by the user

[HTTP|Req: 0] contains the request sent by the provider including the URL and headers; the integer will increment to track different requests

[HTTP|Res: 0] contains the headers and response from the API; integer will correspond to the Request.

The response can be used to determine if missing data was returned by the API.

It may also contain a more detailed error message.

OAuth Troubleshooting

The provider will automatically refresh the OAuth token (where applicable, based on support from the data source) when it expires. This is done when the InitiateOAuth property is set to GETANDREFRESH or REFRESH. In this scenario, upon making a connection, the provider will use the refresh token to obtain a new Access Token. By default, the Access Token is saved to disk via the OAuthSettingsLocation property. This file is unique per user/credentials. If there are multiple data sources with different credentials/accounts being used, then the OAuthSettingsLocation value will need to be set to unique files. If those are set to the same file, there will be authentication errors and conflicts as the access token stored in the file will not be correct for one of the data sources and may be overwritten when a refresh occurs.

For REST API, use Postman to verify connection

Sometimes, the server cannot perform the request due to a client side problem, and returns a 400 (Bad Request) status code. To cross-check whether the connection string options are working OK, test the same request in an external tool (like Postman) with the URL and Header.
If you are able to see the result in Postman, then test again with the CData provider in TimeXtender Data Integration and generate a log in order to troubleshoot.

Timeout

Timeouts can occur because of the Timeout connection property or due to the data source.

The Timeout connection property can be set to 0, which means the provider will never time out.

An easy way to determine if the timeout is caused by the provider or the data source is to compare the timestamps of the initial request and timeout in a verbosity 2 log. If the timestamps are exactly the length that Timeout is set to, it is caused by the provider.

Review Timeouts on databases for more details on timeout issues.

Did this topic help you find an answer to your question?

Be the first to reply!

Reply

Rich Text Editor, editor1

Cookie policy

We use cookies to enhance and personalize your experience. If you accept you agree to our full cookie policy. Learn more about our cookies.

Cookie settings

We use 3 different kinds of cookies. You can choose which cookies you want to accept. We need basic cookies to make this site work, therefore these are the minimum you can select. Learn more about our cookies.

Basic
Functional

Normal
Functional + analytics

Complete
Functional + analytics + social media + embedded videos + marketing

Performance

Properties related to performance:

Generating Logs

Verify the provider version

Identify whether the issue originates from Product/Application or the Provider

Identify whether the issue originates from the Provider or the Datasource

Reviewing Logs

OAuth Troubleshooting

For REST API, use Postman to verify connection

Timeout

Reply

Related topics

DW_ID not updated in unique key field in hgistoric tableicon

Use History to implement Slowly Changing Dimensions

Spatial Joins with TimeXtender

How to add a default value to a unique identifier field

Table relations

Most helpful members this week

Sign up

Login with SSO

Login to the community

Login with SSO

Scanning file for viruses.

This file cannot be downloaded

Cookie policy

Cookie settings