Best pattern for periodic bulk export of ZenDesk objects

February 5, 2022

Todd16

Much like the thread here (https://support.zendesk.com/hc/en-us/community/posts/4415990396186-When-to-use-which-retrival-API), I have experimented with various ZenDesk APIs to retrieve information. My singular purpose is to copy information out of ZenDesk into my own organization's assets to support analytics in combination with many other data sources.

Since I wish to retrieve all information initially, and then keep up with updates, we attempted to use the incremental API (https://developer.zendesk.com/api-reference/ticketing/ticket-management/incremental_exports/). Unfortunately, an undocumented limit exists - apparently issuing the first API call starts a "session" that expires after one hour. Unfortunately, exporting all tickets takes multiple hours. If that initial backlog can be downloaded, the incremental API would likely work just fine - it's unlikely we would generate sufficient ticket updates to exceed the hour limit between downloads. (Although this is entirely possible - I don't know if my org performs bulk updates on a substantial number of tickets.)

We eventually settled on the "search/export" API. This is the only API that permits specification of a known-finite result set. All others are open-ended and either run into the time limit or are otherwise artificially capped. Our use case is simply to extract "all data" once, and then incrementally (daily/hourly) extract updates by requesting objects that were updated within a "window" of time. My current solution uses the API like so:

https://{org}.zendesk.com/api/v2/search/export.json?filter[type]={object type}&query=updated>={window start} updated<={window end}
A more explicit example that retrieves one day of updates:
https://{org}.zendesk.com/api/v2/search/export.json?filter[type]=ticket&query=updated>=2022-01-01T00:00:00Z updated<=2022-01-02T00:00:00Z
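
Since the query contains spaces and comparison operators, it is probably safer to percent-encode it than to paste it raw. A minimal sketch in Python, assuming the window bounds are UTC datetimes; `build_export_url` and the `example` subdomain are hypothetical names used only for illustration:

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

def build_export_url(org, object_type, window_start, window_end):
    """Build a search/export URL for objects updated inside the given window.

    Assumes both datetimes are already in UTC (hence the literal 'Z' suffix).
    """
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    query = f"updated>={window_start.strftime(fmt)} updated<={window_end.strftime(fmt)}"
    # urlencode escapes the brackets, '>', '<', '=', ':' and the space in the query.
    params = urlencode({"filter[type]": object_type, "query": query})
    return f"https://{org}.zendesk.com/api/v2/search/export.json?{params}"

if __name__ == "__main__":
    print(build_export_url(
        "example", "ticket",
        datetime(2022, 1, 1, tzinfo=timezone.utc),
        datetime(2022, 1, 2, tzinfo=timezone.utc),
    ))
```

Decoding the resulting query string gives back the same `filter[type]` and `query` values as the hand-written URL above.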

Essentially, I step through time periods since the "beginning of time" (our first use of ZenDesk), picking up all objects changed in whatever size window I choose. Once I arrive at "the present", I record that time. On the next interaction, I request information since the last interaction up to "now".

This solution is not "perfect". For the initial download of history, I have to choose a window size small enough that no single query takes longer than one hour to completely fill. That took some trial and error, with an engineering margin included. The resulting window size is almost always "too small" to be efficient, and can still suffer from a timeout, requiring adjustment (downward). And yes, over time, I will accumulate multiple "copies" (versions) of the same ticket as it is updated repeatedly. (That's totally OK, I can use only the most recent version I have retrieved.)
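
The window stepping and the "keep only the most recent version" rule can be sketched roughly as follows. This is an illustration under assumptions, not the exact code in use: `iter_windows` and `keep_latest` are hypothetical helper names, and each retrieved record is assumed to be a dict carrying `id` and an ISO 8601 `updated_at` string in a uniform format, so string comparison orders them correctly.

```python
from datetime import datetime, timedelta, timezone

def iter_windows(start, end, step):
    """Yield consecutive (window_start, window_end) pairs covering start..end."""
    cursor = start
    while cursor < end:
        window_end = min(cursor + step, end)  # the last window may be shorter
        yield cursor, window_end
        cursor = window_end

def keep_latest(records):
    """Collapse several retrieved versions of one object to the newest.

    Assumes each record has 'id' and a uniformly formatted 'updated_at' string.
    """
    latest = {}
    for record in records:
        seen = latest.get(record["id"])
        if seen is None or record["updated_at"] >= seen["updated_at"]:
            latest[record["id"]] = record
    return latest

if __name__ == "__main__":
    first_use = datetime(2022, 1, 1, tzinfo=timezone.utc)
    now = datetime(2022, 1, 3, 12, tzinfo=timezone.utc)
    for window_start, window_end in iter_windows(first_use, now, timedelta(days=1)):
        print(window_start.isoformat(), "->", window_end.isoformat())
```

Because the query uses `>=` and `<=`, an object updated exactly on a window boundary is returned by two adjacent windows; `keep_latest` absorbs that duplicate the same way it absorbs older versions.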

All that said, it worked.

Until we were asked to retrieve ticket metrics. Apparently "ticket_metrics" is not an "object type" in ZenDesk's search/export API (nor is "ticket metrics" or "ticketmetrics").

So - how can I retrieve ticket metrics such that:
* I can retrieve them in bulk - not ticket by ticket
* I can retrieve them in "update windows" (i.e. "all ticket metrics updated between X and Y") to avoid the one-hour limit
* I am not limited to a certain "count" of responses (such as 1000)
And ideally:
* Through exactly the same mechanism as the other ZenDesk object types such as "tickets" and "groups"...
Or alternatively:
* What general pattern can I use to download ALL ZenDesk data (all objects), and keep up with updates?

Any suggestions are welcome - TIA.

Guided
February 7, 2022

Hello Todd,

I think you would benefit from using the Incremental Export API (even though it's causing you trouble), if that gets you the information you're after. The Search Export API has other limitations (this is the one where I run into limits instead 😅) and is less suitable for large amounts of data.

Unfortunately, an undocumented limit exists - apparently issuing the first API call starts a "session" that expires after one hour. Unfortunately, exporting all tickets takes multiple hours.

What is expiring after 1 hour? Is it the authentication, does the cursor expire too soon, or is it something else? I think in this scenario you want to use the time-based export anyway. For the incremental ticket metric export there is only the time-based flavor. In that case you can store the "end_time" or "next_page" of your last successful call and continue where you left off, whether that's a normal next iteration (hourly/daily) or a retry attempt (when catching up). Even if you run into something every hour, you will get there eventually.
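
A rough sketch of that "store end_time and continue" idea, in Python. Everything here is an assumption for illustration: `resume_export` and `fetch_page` are hypothetical names, `fetch_page(start_time)` stands in for whatever HTTP call you make to the time-based incremental endpoint, and it is assumed to return a dict with a `records` list and an `end_time` value. The stop rule (no newer `end_time`) is a simplification, so check the documented end-of-export condition for your endpoint.

```python
import json
from pathlib import Path

def resume_export(fetch_page, checkpoint_file, first_start_time, handle, max_pages=10_000):
    """Page through a time-based export, saving end_time after each good page.

    fetch_page(start_time) is a hypothetical callable returning
    {"records": [...], "end_time": <int>}; it may raise on timeouts or expiry.
    """
    checkpoint = Path(checkpoint_file)
    if checkpoint.exists():
        # Continue from the last successful call instead of starting over.
        start_time = json.loads(checkpoint.read_text())["end_time"]
    else:
        start_time = first_start_time

    for _ in range(max_pages):
        page = fetch_page(start_time)  # on failure the checkpoint is left untouched
        handle(page["records"])
        end_time = page.get("end_time")
        if end_time is None or end_time <= start_time:
            break  # nothing newer to fetch right now
        checkpoint.write_text(json.dumps({"end_time": end_time}))
        start_time = end_time
    return start_time
```

If a run dies partway (for example after an hour), calling `resume_export` again with the same checkpoint file picks up at the last saved `end_time`, so the same function serves both the catch-up retries and the regular hourly/daily iteration.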

Todd16
Author
February 8, 2022

Hey Sebastiaan,

Thanks for commenting. I meant to get back to you earlier, but got sidetracked. I'm trying some other experiments with the incremental API. I'll get back to you...