Solved

Using a nested REST API in combination with paging

  • 15 February 2023
  • 17 replies
  • 847 views

Hi,

 

I am having some trouble in using the nested REST API in combination with paging. The extensive description on how to use the nested call in the RSD-file was very helpfull. The only thing I cannot get working is the paging at the first level.

Two different cases, but I think the same underlying issue. In the first case the pageoffset is used as a parameter, in the other one it uses a pagetoken (the nextmarker in a blobstorage). In the output, I do see that paging is applied, but the problem is that it starts at the first page again. I seems that when using the input- and output files the correct information about the next page is not being picked up by the api-call.

I have tried the call without nesting (so only on the first level) and without qualifiying the parameters and then the paging is working correctly, but when I switch to in and out again, I get false results.

Does anybody recognize this isse and solved it somehow?

 

Kind regards,

Jacqueline Hofmeijer

icon

Best answer by jacqueline.hofmeijer 24 October 2023, 11:50

View original

17 replies

Userlevel 6
Badge +5

Hi Jacqueline

We had something regarding this @Jan Stroeve may know about it.

In essence this is how they did it. They added the two rows along with the pagination option.

  <api:set attr="EnablePaging" value="true" />
  <api:set attr="pagetokenpath" value="/json/pagination/cursor" />
  <api:set attr="pagetokenparam" value="cursor" />

Do you have the field with the token as a field in the response, or is it in a header value?

Hi Thomas,

I am also using these parameters, for the blob storage it is defined like this:

  <api:set attr="EnablePaging"     value="TRUE" />
  <api:set attr="pagetokenpath" value="/EnumerationResults/NextMarker" />
  <api:set attr="pagetokenparam" value="marker" />

The value NextMarker is returned as a field in the response body.

When defined like this it is working fine, and I do not have to script anything to have the paging working.

 

I get problems with this when I set it up like this:

  <api:set attr="filein.EnablePaging"     value="TRUE" />
  <api:set attr="filein.pagetokenpath" value="/EnumerationResults/NextMarker" />
  <api:set attr="filein.pagetokenparam" value="marker" />

  <api:script method="GET">
    <api:set attr="method" value="GET"/>
    <api:call op="xmlproviderGet" in="filein" out="fileout">
      <api:push"/>
    </api:call>
  </api:script>

 

Userlevel 6
Badge +5

Hi Jacqueline

That was also the issue for Jan.
What CData suggested was this.

Create two individual RSD files one that is doing the nesting call like this:

<api:script xmlns:api="http://apiscript.com/ns?v1" xmlns:xs="http://www.w3.org/2001/XMLSchema">

<!-- See Column Definitions to specify column behavior and use XPaths to extract column values from JSON. -->
<api:info title="detailedInfo" desc="Generated schema file." xmlns:other="http://apiscript.com/ns?v1" other:queryslicercolumn="nested_id">
<attr name="ID" xs:type="string" readonly="false" other:xPath="/json/data/name" />
<!-- You can modify the name, type, and column size here. -->
<attr name="nested_id" xs:type="string" readonly="false" other:filter="{nested_id}" other:xPath="/json/data/nested_id" />
<attr name="Name" xs:type="string" readonly="false" other:xPath="/json/data/name" />
<attr name="Date" xs:type="datetime" readonly="false" other:xPath="/json/data/date" />
<attr name="Number" xs:type="decimal" readonly="false" other:xPath="/json/data/number" />
</api:info>

<api:set attr="DataModel" value="DOCUMENT" />
<api:set attr="URI" value="https://api.rest.com/v2/detailedInfo?nested={nested_id}" />
<api:match pattern="*nested_id*" type="glob" value="[_query.criteria]">
<api:else>
<api:set attr="URI" value="https://api.rest.com/v2/detailedInfo?nested=001" />
</api:else>
</api:match>
<api:set attr="JSONPath" value="$.data" />

<api:set attr="EnablePaging" value="TRUE" />
<api:set attr="pagetokenpath" value="/EnumerationResults/NextMarker" />
<api:set attr="pagetokenparam" value="marker" />

<!-- The GET method corresponds to SELECT. Here you can override the default processing of the SELECT statement. The results of processing are pushed to the schema's output. See SELECT Execution for more information. -->
<api:script method="GET">
<api:set attr="method" value="GET"/>
<api:call op="jsonproviderGet">
<api:push/>
</api:call>
</api:script>

</api:script>

And one that has a list of the nested values could look like this.

<api:script xmlns:api="http://apiscript.com/ns?v1" xmlns:xs="http://www.w3.org/2001/XMLSchema">

<!-- See Column Definitions to specify column behavior and use XPaths to extract column values from JSON. -->
<api:info title="values" desc="Generated schema file." xmlns:other="http://apiscript.com/ns?v1">
<attr name="id" xs:type="string" readonly="false" other:xPath="/json/data/id" />
<attr name="name" xs:type="string" readonly="false" other:xPath="/json/data/name" />
<attr name="status" xs:type="string" readonly="false" other:xPath="/json/data/status" />
</api:info>

<api:set attr="DataModel" value="DOCUMENT" />
<api:set attr="URI" value="https://api.rest.com/v2/values" />
<api:set attr="JSONPath" value="$.data" />

<api:set attr="EnablePaging" value="TRUE" />
<api:set attr="pagetokenpath" value="/EnumerationResults/NextMarker" />
<api:set attr="pagetokenparam" value="marker" />

<!-- The GET method corresponds to SELECT. Here you can override the default processing of the SELECT statement. The results of processing are pushed to the schema's output. See SELECT Execution for more information. -->
<api:script method="GET">
<api:set attr="method" value="GET"/>
<api:call op="jsonproviderGet">
<api:push/>
</api:call>
</api:script>

</api:script>

Then you have a REST data source that looks at a folder where these two RSD files are located and you create a Query table to do the nesting with. Like so.

SELECT ID, Nested_Id, Name, Date, Number FROM REST.detailedInfo
WHERE Nested_Id IN (SELECT ID FROM REST.values)

This should make it work despite having to do a nested call alongside this method.

Notice that I use a Query Slicer column and a nested field to make it able to be used for this.

Okay, hoped that it would be a bit more straightforward, but I will give it a try in my rsd-files.

Userlevel 6
Badge +5

Well they also provided some message about that.

If the customer absolutely needs the logic in one RSD file (i.e. so they can run a simple unfiltered SELECT *), then that will require manual logic using rows@next. In that case, please let me know and I can try to customize the RSD file accordingly.

So I think that may be more difficult actually.

Let me know if you get stuck.

I have tried to get this method working in my rsd-files but unfortunately it has no result in both my cases. Could it be that it has to do with the fact that the variables are part of the URI and not just parameters on the query side of the request?

Also not sure if I understand how to use the “nested_id” in the script. When executing the datasource that is using a json connection the log is showing the right (nested) api-call of that specific line, but the data is just not transferred to a table in the sql-database.

The other datasource is using a csv-provider and in that case the parameter remain NULL and thus giving a error code.

Regarding the rows@next, I had already tried that method but then I still have the same issue that the  mechanism of filein/fileout does not seem to be able to transfer the right values.

Userlevel 6
Badge +5

Hi Jacqueline

We should maybe do a session where you show me how far you are and what issues you get. I may have some ideas and if not I can gather as much necessary data to ask for help with this.

When would it fit you?

I am available most days from 09-15 UTC +01:00

Hi Thomas,

That would be great. Tomorrow or Thursday around 11:00? 

Userlevel 6
Badge +5

Hi Jacqueline

I have sent an invite.

Userlevel 6
Badge +5

Hi @jacqueline.hofmeijer did you manage to resolve the issue? If so we would very much appreciate it if you could please describe the solution. If not, please let us know if we can help further

Hi, I've had a similar use case when calling a nested API call in a different ETL-tool. The following rsd script first reads Azure AD groups via the Graph API and then retrieves the group members via a nested Graph API call. Hopefully this example will help you.
<api:script xmlns:api="http://apiscript.com/ns?v1" xmlns:xs="http://www.w3.org/2001/XMLSchema">

<api:info title="Members" desc="Generated schema file." xmlns:other="http://apiscript.com/ns?v1">
<!-- You can modify the name, type, and column size here. -->
<attr name="GroupId" xs:type="string" readonly="false" other:xPath="groupId" />
<attr name="GroupDisplayName" xs:type="string" readonly="false" other:xPath="groupDisplayName" />
<attr name="GroupDescription" xs:type="string" readonly="false" other:xPath="groupDescription" />
<attr name="UserId" xs:type="string" readonly="false" other:xPath="userId" />
<attr name="UserDisplayName" xs:type="string" readonly="false" other:xPath="userDisplayName" />
<attr name="UserPrincipalName" xs:type="string" readonly="false" other:xPath="userPrincipalName" />
<!-- Attributes are used for passing to child API call -->
<attr name="Id" xs:type="string" readonly="false" other:xPath="id" />
<attr name="DisplayName" xs:type="string" readonly="false" other:xPath="displayName" />
<attr name="Description" xs:type="string" readonly="false" other:xPath="description" />
</api:info>

<api:set attr="groupin.DataModel" value="DOCUMENT" />
<api:set attr="groupin.JSONPath" value="/value" />
<api:set attr="groupin.URITemplate" value="https://graph.microsoft.com/v1.0/groups?$select=id,displayName,description" />
<api:set attr="groupin.ElementMapPath#" value="/value/id" />
<api:set attr="groupin.ElementMapName#" value="group_id" />
<api:set attr="groupin.EnablePaging" value="TRUE" />
<api:set attr="memberin.DataModel" value="DOCUMENT" />
<api:set attr="memberin.EnablePaging" value="TRUE" />
<api:set attr="memberin.JSONPath" value="/value" />
<api:set attr="memberin.URITemplate" value="https://graph.microsoft.com/v1.0/groups/{groupId}/members?$select=id,displayName,userPrincipalName" />

<api:script method="GET">
<api:set attr="groupin.URI" value="[groupin.URITemplate]" />
<!-- Get the groups -->
<api:call op="jsonproviderGet" in="groupin" out="groupout">
<api:set attr="memberin.groupId" value="[groupout.id]" />
<api:set attr="memberin.groupDisplayName" value="[groupout.displayName | allownull()]" />
<api:set attr="memberin.groupDescription" value="[groupout.description | allownull()]" />
<api:set attr="memberin.URI" value="[memberin.URITemplate | replace('{groupId}', [memberin.groupId])]" />
<!-- Get the members of the group -->
<api:call op="jsonproviderGet" in="memberin" out="memberout">
<api:set attr="out.userId" value="[memberout.id]" />
<api:set attr="out.userDisplayName" value="[memberout.displayName]" />
<api:set attr="out.userPrincipalName" value="[memberout.userPrincipalName]" />
<api:set attr="out.groupId" value="[memberin.groupId | allownull()]" />
<api:set attr="out.groupDisplayName" value="[memberin.groupDisplayName | allownull()]" />
<api:set attr="out.groupDescription" value="[memberin.groupDescription | allownull()]" />
<api:push item="out"/>
</api:call>
</api:call>
</api:script>
</api:script>

 

Userlevel 6
Badge +5
Hi, I've had a similar use case when calling a nested API call in a different ETL-tool. The following rsd script first reads Azure AD groups via the Graph API and then retrieves the group members via a nested Graph API call. Hopefully this example will help you.
<api:script xmlns:api="http://apiscript.com/ns?v1" xmlns:xs="http://www.w3.org/2001/XMLSchema">

<api:info title="Members" desc="Generated schema file." xmlns:other="http://apiscript.com/ns?v1">
<!-- You can modify the name, type, and column size here. -->
<attr name="GroupId" xs:type="string" readonly="false" other:xPath="groupId" />
<attr name="GroupDisplayName" xs:type="string" readonly="false" other:xPath="groupDisplayName" />
<attr name="GroupDescription" xs:type="string" readonly="false" other:xPath="groupDescription" />
<attr name="UserId" xs:type="string" readonly="false" other:xPath="userId" />
<attr name="UserDisplayName" xs:type="string" readonly="false" other:xPath="userDisplayName" />
<attr name="UserPrincipalName" xs:type="string" readonly="false" other:xPath="userPrincipalName" />
<!-- Attributes are used for passing to child API call -->
<attr name="Id" xs:type="string" readonly="false" other:xPath="id" />
<attr name="DisplayName" xs:type="string" readonly="false" other:xPath="displayName" />
<attr name="Description" xs:type="string" readonly="false" other:xPath="description" />
</api:info>

<api:set attr="groupin.DataModel" value="DOCUMENT" />
<api:set attr="groupin.JSONPath" value="/value" />
<api:set attr="groupin.URITemplate" value="https://graph.microsoft.com/v1.0/groups?$select=id,displayName,description" />
<api:set attr="groupin.ElementMapPath#" value="/value/id" />
<api:set attr="groupin.ElementMapName#" value="group_id" />
<api:set attr="groupin.EnablePaging" value="TRUE" />
<api:set attr="memberin.DataModel" value="DOCUMENT" />
<api:set attr="memberin.EnablePaging" value="TRUE" />
<api:set attr="memberin.JSONPath" value="/value" />
<api:set attr="memberin.URITemplate" value="https://graph.microsoft.com/v1.0/groups/{groupId}/members?$select=id,displayName,userPrincipalName" />

<api:script method="GET">
<api:set attr="groupin.URI" value="[groupin.URITemplate]" />
<!-- Get the groups -->
<api:call op="jsonproviderGet" in="groupin" out="groupout">
<api:set attr="memberin.groupId" value="[groupout.id]" />
<api:set attr="memberin.groupDisplayName" value="[groupout.displayName | allownull()]" />
<api:set attr="memberin.groupDescription" value="[groupout.description | allownull()]" />
<api:set attr="memberin.URI" value="[memberin.URITemplate | replace('{groupId}', [memberin.groupId])]" />
<!-- Get the members of the group -->
<api:call op="jsonproviderGet" in="memberin" out="memberout">
<api:set attr="out.userId" value="[memberout.id]" />
<api:set attr="out.userDisplayName" value="[memberout.displayName]" />
<api:set attr="out.userPrincipalName" value="[memberout.userPrincipalName]" />
<api:set attr="out.groupId" value="[memberin.groupId | allownull()]" />
<api:set attr="out.groupDisplayName" value="[memberin.groupDisplayName | allownull()]" />
<api:set attr="out.groupDescription" value="[memberin.groupDescription | allownull()]" />
<api:push item="out"/>
</api:call>
</api:call>
</api:script>
</api:script>

 

@gijs 
Nice one, I use it as well.

Hi @gijs,

Thanks for your reaction but unfortunately that does not solve my problem. 

I already got some connections working in the way you described, but the challenge for me lies in the combination of:

  • nested calls
  • paging
  • incremental loading.

The option Thomas provided in using a query slicer does work for using nested call in combination with paging (after a bit of customizing) but not for the incremental loading at top level. And I need this for performance issues.

Userlevel 6
Badge +5

Hi @jacqueline.hofmeijer 

Have you resolved this issue currently?

Otherwise I may have some new suggestions for you to try.

Hi @Thomas Lind,

For now I have worked around the issue by using the incremental loading but accept the limitations of the paging option. I have just created extra execution steps to load the data multiple times, but this way I won't be sure the data is fully loaded.

So new suggestions are very welcome.

Kind regards,

Jacqueline

Userlevel 6
Badge +5

Can you send the files to me and how the incremental procedure currently looks?

Don’t add them here unless they are without private data.

Userlevel 6
Badge +5

Hi @jacqueline.hofmeijer do you have an update for Thomas’ questions above? Thanks!

Reply