
Hello,

 

I want to get metadata from my Azure data lake using the Azure Blob API. 

I wasn't seeing any data in the Ingest storage, so I turned on caching to file to see what's happening. 

 

There are three files in my caching folder: 

  • Data_.raw: The return of the call, i.e. my actual data. This looks excellent, except that it's a .raw file. Contents:
    <?xml version="1.0" encoding="utf-8"?>
    <EnumerationResults ServiceEndpoint="https://xxxx.blob.core.windows.net/" ContainerName="datalake">
    <Prefix>my_prefix</Prefix>
    <Blobs>
    <Blob>
    ....
    </Blob>
    </Blobs>
    <NextMarker/>
    </EnumerationResults>

     

  • Data_.xml: Basically the same as Data_.raw, but with the content of Data_.raw as the data of a value element. The data also contains the XML header (so now the document has two headers), and the angle brackets have been escaped (i.e. every `<` is now `&lt;` and every `>` is now `&gt;`). 
    <?xml version="1.0" encoding="utf-8"?>
    <Table_flattening_name>
    <value>
    &lt;?xml version="1.0" encoding="utf-8"?&gt;
    &lt;EnumerationResults
    ServiceEndpoint="https://xxxx.blob.core.windows.net/"
    ContainerName="datalake"&gt;
    &lt;Prefix&gt;my_prefix&lt;/Prefix&gt;
    &lt;Blobs&gt;
    &lt;Blob&gt;
    ...
    &lt;/Blob&gt;
    &lt;/Blobs&gt;
    &lt;NextMarker /&gt;
    &lt;/EnumerationResults&gt;
    </value>
    </Table_flattening_name>

     

  • Data_transformed_1.xml: The result of my XSLT on Data_.xml

Data_transformed_1.xml contains one empty element, which is caused by Data_.xml being malformed. 

I can't really figure out what's going on. In other APIs I only had two files. Not sure what the Data_.raw file is doing, but everything would work if that file were Data_.xml. 

 

What could be causing this? Why is there a Data_.raw file? How can I fix this?

 

Hi @Benny 

It seems there is no data in either example, unless the "…." is meant to stand in for the actual data.

The .raw file is the raw response from the source, and the .xml file is that data after it has been converted to XML.

What are you connecting to? It looks like some sort of Microsoft service.


The ellipsis is indeed meant to replace actual data. That all looks fine. 

I figured that the .raw represents raw data. But in the caching of REST endpoints that do work, I don't see this file. There's just an XML with the raw data and a transformed.xml containing the transformed data. 

So that leaves me puzzled as to what's happening, especially since the .xml is malformed. 

 

I’m trying to get data from Azure Storage, specifically to list the blobs in a certain container: https://learn.microsoft.com/en-us/rest/api/storageservices/list-blobs?tabs=microsoft-entra-id.

 

From the other REST endpoints I have connected to, I expected to see something like:

Data_.xml --> Data_transformed.xml

Instead, the caching folder implies

Data_.raw --> Data_.xml --> Data_transformed.xml

Data_.raw contains exactly the XML I need. Some weird transformation is applied that turns it into the malformed Data_.xml. Then my table flattening XSLT is applied, which can't make sense of the malformed XML. 
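
To check what actually ended up in Data_.xml outside the tool, something like the Python sketch below might help. It assumes the wrapper looks like the sample above, with a single value element whose text is the escaped response. The function name `extract_inner_xml` and the blob name `example.txt` are made up for illustration, and the sample string is a shortened stand-in for the cached file. It only inspects the cache; it does not change how the tool writes Data_.xml.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the cached Data_.xml. The blob name is a
# hypothetical placeholder, not taken from the real response.
SAMPLE_WRAPPER = """<?xml version="1.0" encoding="utf-8"?>
<Table_flattening_name>
<value>
&lt;?xml version="1.0" encoding="utf-8"?&gt;
&lt;EnumerationResults ServiceEndpoint="https://xxxx.blob.core.windows.net/" ContainerName="datalake"&gt;
&lt;Prefix&gt;my_prefix&lt;/Prefix&gt;
&lt;Blobs&gt;
&lt;Blob&gt;&lt;Name&gt;example.txt&lt;/Name&gt;&lt;/Blob&gt;
&lt;/Blobs&gt;
&lt;NextMarker /&gt;
&lt;/EnumerationResults&gt;
</value>
</Table_flattening_name>"""


def extract_inner_xml(wrapper_xml: str, value_tag: str = "value") -> ET.Element:
    """Parse the wrapper, then parse the escaped document held in <value>."""
    outer = ET.fromstring(wrapper_xml.encode("utf-8"))
    value = outer.find(value_tag)
    if value is None or not (value.text or "").strip():
        raise ValueError(f"no <{value_tag}> element with text content found")
    # The parser has already turned &lt; and &gt; back into < and >.
    # Strip surrounding whitespace so the inner XML declaration is the
    # very first thing in the string, as the XML parser requires.
    inner_text = value.text.strip()
    return ET.fromstring(inner_text.encode("utf-8"))


root = extract_inner_xml(SAMPLE_WRAPPER)
blob_names = [blob.findtext("Name") for blob in root.iter("Blob")]
print(root.tag, root.get("ContainerName"), blob_names)
```

If this parses the real cached file, the wrapper is being read as a single text node rather than as nested elements, which would explain why an XSLT that matches on EnumerationResults or Blob finds nothing to select.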

