Solved

JSON/REST Provider Ignores Filters in CBS OData URI


Badge

Hi Community,

I'm trying to connect the database from CBS (in both old and new version of TX). The original request URI gives an error response that the query exceeds 10.000 rows at test connection. This is a well-known issue with data from CBS, so I adapted the URI to include a filter. I tested this new query both in Postman and Qlik Sense Desktop and there the filter works and I retrieve aprox. 2.500 rows, from 2020 until 2022. 

Using this query in TX does not work unfortunately. The error that normally encounters when testing connection does not pop-up, however, the table still retrieves 10.000 rows. Probably just random from all available years (1995 t/m 2022).

I am using below request URI:

https://opendata.cbs.nl/ODataFeed/OData/70072NED/TypedDataSet?$filter=substring(Perioden,0,4) ge '2020' and substring(Perioden,4,2) eq 'JJ'

with custom header: accept:application/json

Any ideas what causes this? Thanks!

icon

Best answer by Christian Hauggaard 28 March 2023, 09:00

View original

11 replies

Userlevel 5
Badge +5

Hi Remy
I can see that it is free data, so I tried to connect to it as well. I tried to do so in Postman and other programs. It gives this error in postman

 

Badge

Hi Thomas,

The URI should also include the last part that is not underlined in the original post, due to the space before 'ge’. This is the full URI that you will have to use:

https://opendata.cbs.nl/ODataFeed/OData/70072NED/TypedDataSet?$filter=substring(Perioden,0,4) ge '2020' and substring(Perioden,4,2) eq 'JJ'

Or you could use the base URI https://opendata.cbs.nl/ODataFeed/OData/70072NED/TypedDataSet and set the following query parameter:

$filter = substring(Perioden,0,4) ge '2020' and substring(Perioden,4,2) eq 'JJ'

Userlevel 5
Badge +5

Hi Remy

I was sure the issue was related to encoding, but opening the URI in a web browser added the missing bits automatically.

This connection string worked for me.

Format=XML;URI="https://opendata.cbs.nl/ODataFeed/OData/70072NED/TypedDataSet?$filter=substring(Perioden,0,4)%20ge%20%272020%27and%20substring(Perioden,4,2)%20eq%20%27JJ%27"

 

I also tried with JSON and it worked as well, probably a little bit better.

Custom Headers=Accept:application/json;URI="https://opendata.cbs.nl/ODataFeed/OData/70072NED/TypedDataSet?$filter=substring(Perioden,0,4)%20ge%20%272020%27and%20substring(Perioden,4,2)%20eq%20%27JJ%27"

Badge

Hi Thomas,

I am not sure if that obtains the result I am looking for. Using the new URI does not resolve my issue, since I still retrieve 10.000 records belonging to all the years in de dataset. The URI specifies that I only want to retrieve these records that belong to the years between 2020 and 2022. 

Does your connection string result in the number of lines specified above (around 2.500) for the periods 2020 till 2022? You can find these in the field ‘Perioden’ and the values should only contain 2020JJ00, 2021JJ00 and 2022JJ00.

Userlevel 5
Badge +7

It might be useful to grab some verbosity level 3 logging and compare the calls TX does (and the results) to the Console logging from Postman. Given that the Dutch locale switches . and , it might also be useful to set the correct Culture and/or regional settings on the virtual machine you are using.

Userlevel 3
Badge +2

Hi Remy,

Grabbing data from CBS should be quite easy with OData. Most datasets can be retrieved with this provider.

In your case, use this connection string and you should be able to download the full dataset without facing the 10K limit.

Pagesize=10000;Timeout=600;URL=https://datasets.cbs.nl/odata/v1/CBS/70072ned/$metadata;

 

Userlevel 6
Badge +5

Hi @Remy Kamphuis | Victa did you find a solution for this? If so can you please help by marking a best answer above. Otherwise please let us know if we can provide more assistance

Badge

Hi @Christian Hauggaard,
The link send by Erik retrieved a few tables with metadata, not the results I was looking for. I have not been able to check the logs. Both vm's I have tested had Dutch locale and server settings. I have now adapted the RSD to allow for paging which retrieves the correct amount of lines. However, when testing with an URI that wants to retrieve data from 2009 and greater, it does not work, unfortunately.

So, I did find a solution that allows us to work with the data. However, in a few years this will probably stop working.  

Userlevel 6
Badge +5

Hi @Remy Kamphuis | Victa I have reproduced the issue regarding filtering - as you mention it appears to work in postman/browser, but not using the provider. I have forwarded this issue to Cdata and will let you know once I receive a response.

Userlevel 6
Badge +5

Hi @Remy Kamphuis | Victa Cdata have gotten back to me. Can you please try changing the “URIseparator” property under the “Misc” section of the data source settings to semi-colon instead of comma? It appears that the URI contains commas, as a result the provider is splitting the URI up by the comma and attempts to process each part separately.

 

Badge

Perfect! I used the REST connector, but changing the same property under the “JSON and XML” section as well works like a charm!

Reply