How to import an Excel or CSV through the Collibra API
22 October 2020
A thriving Collibra instance is populated with a wide spectrum of data such as business terms, policies, code values, metadata of schemas, tables and columns, …. To keep Collibra relevant within an organization, this data needs to be accurate and up-to-date. Collibra provides tools like Collibra Catalog and the Collibra API to automatically import data with a set frequency.
Why use the Collibra Import API
When importing the technical metadata of a database, Collibra’s Catalog is your best option. When updating Collibra with information from different platforms however, the Collibra API is better suited. Organizations often fall back on Excel or CSV imports through the user interface. These imports, however, are prone to errors and take some manual intervention. Doing these imports through Collibra’s Import API will not only lower the possibility of errors, but also streamline the process and make it faster. This guide explains how to use this Import API.
Step 1: Accessing the Import API
To use the Import API, we first need to access it. In order to start, log in to the Collibra instance where you want to load your data. In the top right corner click on the question mark and select “API documentation”.
Clicking this will redirect you to the developer documentation of Collibra. Next click on “Rest Import API”.
This will direct you to the swagger interface for the Import API, which allows you to use the different methods or endpoints the API offers.
Now that we’ve reached the Import API it’s important to understand what you’re looking at. Click on "Import" to unfold the different API endpoints offered by the Collibra Import API. These endpoints allow you to do an operation to interact with Collibra. For instance, a POST endpoint allows you to upload data to Collibra while a GET endpoint will return you information based on what you requested.
Using an endpoint through this interface will allow you to make an API call. Effectively interacting with Collibra through the API. The 3 first endpoints are used to import data through the data formats CSV, Excel or JSON. Clicking an endpoint will cause it to unfold. It will show you a series of parameters and responses. The parameters are what we will use to properly configure the API call.
Step 2: Configuring the Import API
In the next step you need to choose between importing a CSV or an Excel file and select a corresponding endpoint “/import/csv-job” or “/import/excel-job”. In this example, we will upload a CSV file, so we select the “/import/csv-job” endpoint. To fill in the parameters that belong to this endpoint click the “try it out” button.
Clicking the “try it out” button unlocks the parameters of the endpoint. The parameters we are looking for are:
To select the file we want to upload click on “browse” next to the file parameter. After you’ve selected the file from your computer, we fill in the separator, quote and escape character used in your CSV file. Collibra fills in default values for these parameters but you can change them.
Next up is the headerRow parameter which can only be set to true or false. If your CSV file has a header row set this to true. It will cause Collibra to ignore the first row of your CSV. Finally, paste the template JSON in the template bar under the example of a correct JSON template.
The template is a mapping between the columns in your CSV file and what they represent in Collibra. For more information on importing JSON templates visit our how-to guide on this topic.
Step 3: Interpreting the results
Once everything is filled in, click the “execute” button. The tab “Responses” contains 3 different elements: a Curl, Request URL and Server response. The latter shows us the response we got from Collibra. In an ideal scenario, the “Code” is equal to “200”. This tells us Collibra has received our request and has started processing it. If you see a “500” code it means something went wrong.
To be 100% sure that the import has succeeded, check the “Activities” tab in Collibra. You can find it in Settings > General > Activities. If you see a line “Import” with the status “Completed”, the import succeeded.
By using this guide, you’ll be able to import CSVs or Excel files faster and with fewer possibilities of errors than you would through the user interface. It’s also the first step to automating data imports. When completely automating the import you can re-use the JSON template and know which endpoint to call.
In case you have any additional questions feel free to reach out at any moment.