IGDB: Store all games in local database

I am working on a project where I need all games stored on my own local server for performance reasons. So far, my approach has been to write a cron job that calls the IGDB API once a day and stores the results in my local database. The code that pulls the data works (this is a Rails app, so I wrote a rake task and scheduled it with the whenever gem).
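For illustration, here's a simplified sketch of that setup (the task name and the `Igdb::Importer` class are placeholders, not my exact code):

```ruby
# lib/tasks/igdb.rake -- simplified sketch; names are placeholders
namespace :igdb do
  desc "Pull games from the IGDB API into the local games table"
  task sync_games: :environment do
    Igdb::Importer.new.run # stands in for the code that pages through /v4/games
  end
end
```

```ruby
# config/schedule.rb (whenever gem) -- run the sync once a day
every 1.day, at: "4:30 am" do
  rake "igdb:sync_games"
end
```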

What I'd like to confirm is my plan to avoid pulling "everything" every day. Since the IGDB games DB only grows over time, I thought I could instead start from index 0, fetch 500 items at a time, and loop until there were no more items to fetch. Then every day, I could start from an index equal to the length of my games table and pass it as the offset to the endpoint.
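Sketched out, the incremental pull looks roughly like this (again a simplified sketch, not my exact code; I'm assuming the v4 `/games` endpoint with Twitch credentials in env vars, and the `sort id asc;` clause is the part my question below is about):

```ruby
require "net/http"
require "json"

IGDB_GAMES_URL = URI("https://api.igdb.com/v4/games")
PAGE_SIZE = 500

# Fetch one page of games starting at the given offset (Apicalypse query syntax).
def fetch_page(offset)
  req = Net::HTTP::Post.new(IGDB_GAMES_URL)
  req["Client-ID"]     = ENV.fetch("IGDB_CLIENT_ID")
  req["Authorization"] = "Bearer #{ENV.fetch('IGDB_ACCESS_TOKEN')}"
  req.body = "fields id, name; limit #{PAGE_SIZE}; offset #{offset}; sort id asc;"
  res = Net::HTTP.start(IGDB_GAMES_URL.host, IGDB_GAMES_URL.port, use_ssl: true) do |http|
    http.request(req)
  end
  JSON.parse(res.body)
end

# Resume from the size of the local table; stop when a page comes back empty.
offset = Game.count
loop do
  page = fetch_page(offset)
  break if page.empty?
  page.each do |attrs|
    Game.find_or_create_by!(igdb_id: attrs["id"]) { |g| g.name = attrs["name"] }
  end
  offset += page.size
end
```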

It seems to work, but I need to confirm one crucial assumption of mine: will IGDB calls always return the same values for the same offsets? Or do I need to add some sort of sorting to guarantee it?

Thank you!

Rather than doing what you are doing, extract once and then use IGDB webhooks to stay up to date.

You also need to account for deletions, which will throw off your restart point.

Here's what the IGDBot says about this on the IGDB Discord:

?api_scrape

IGDBot

In most apis, “scrape” is a four-letter word. However, we’ve made it a primary use case. Simply page through our data, 500 records at a time, and save them to your own database! Then setup webhooks and we’ll send you new data as it’s updated.
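Registering the webhooks is just one POST per method; roughly like this (a sketch from memory of the v4 docs, so double-check the endpoint and parameter names):

```ruby
require "net/http"

# Register one webhook per method (create/update/delete) for the games endpoint.
%w[create update delete].each do |method|
  uri = URI("https://api.igdb.com/v4/games/webhooks/")
  req = Net::HTTP::Post.new(uri)
  req["Client-ID"]     = ENV.fetch("IGDB_CLIENT_ID")
  req["Authorization"] = "Bearer #{ENV.fetch('IGDB_ACCESS_TOKEN')}"
  req.set_form_data(
    "url"    => "https://example.com/igdb/webhooks/games/#{method}", # your receiving endpoint
    "secret" => ENV.fetch("IGDB_WEBHOOK_SECRET"),
    "method" => method
  )
  Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(req) }
end
```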

My approach is to do caching. I only store what a user calls for and cache it for 24 hours, locally.

That way the records I do have are always up to date (well, I use webhooks to invalidate my cache if a cached record is updated).
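In Rails terms it's basically this (a sketch, not my actual code; `Igdb::Client` is just a placeholder wrapper around the API):

```ruby
# Read a game through a 24-hour local cache.
def cached_game(igdb_id)
  Rails.cache.fetch("igdb/games/#{igdb_id}", expires_in: 24.hours) do
    Igdb::Client.new.game(igdb_id) # placeholder wrapper around /v4/games
  end
end

# Called from my webhook endpoint when IGDB reports the record changed.
def invalidate_game(igdb_id)
  Rails.cache.delete("igdb/games/#{igdb_id}")
end
```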

Rather than doing what you are doing, extract once and then use IGDB webhooks to stay up to date.
You also need to account for deletions, which will throw off your restart point.

Thanks for the reply! Assuming I go this route, how would I know which action to perform based on what the webhooks return? Is it a list of IDs and operations? Since I will be holding the whole list of games in my DB, I am guessing I would only receive "the diff", if I can call it that, or am I getting a snapshot of the entire new list? :thinking:

My approach is to do caching. I only store what a user calls for and cache it for 24 hours, locally.
That way the records I do have are always up to date (well, I use webhooks to invalidate my cache if a cached record is updated).

Right, this was my initial thought, but I wanted to be able to create relationships between models in my DB that were easier to work with, and having part of my data local and part in the third-party API made those relations trickier to manage.

But I am curious whether this might actually be easier to manage, even with rate limiting? I was foreseeing rate limiting being a problem if I had a lot of DB models tied to the IGDB API, whereas if I hosted my own copy and pulled once a day, I would have control over it. It also meant that if IGDB had maintenance or downtime, I would still have a working version.

Was I over-engineering this or do you think my concerns were valid? Thanks again!

IGDB's rate limit is 4 concurrent requests, so you might not hit it depending on how you operate.
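If you do end up hammering it from a background job, a simple counting semaphore keeps you under that limit, something like this sketch:

```ruby
# Cap in-flight IGDB calls at 4 by using SizedQueue as a counting semaphore.
IGDB_SLOTS = SizedQueue.new(4)

def with_igdb_slot
  IGDB_SLOTS.push(true) # blocks while 4 requests are already running
  yield
ensure
  IGDB_SLOTS.pop
end

# usage inside worker threads:
# with_igdb_slot { fetch_page(offset) }
```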

Logically, though, it sure seems you'll just end up with a lot of data on hand that you might never need.

IGDB webhooks are split by record type into three subtypes: create/update/delete.

The webhooks are somewhat dry on record details, but they will not give you a diff.
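So on your end the receiving controller ends up looking something like this (illustrative only; I'm assuming each method was registered to its own URL, and that the payload is a JSON body carrying at least the record's `id`, so you re-fetch the full record yourself):

```ruby
# app/controllers/igdb_webhooks_controller.rb -- illustrative sketch
class IgdbWebhooksController < ApplicationController
  skip_before_action :verify_authenticity_token, raise: false

  # Routed as POST /igdb/webhooks/games/:event (event = create/update/delete),
  # matching the URLs the webhooks were registered with.
  def games
    payload = JSON.parse(request.raw_post)
    igdb_id = payload["id"]

    case params[:event]
    when "create", "update"
      # The payload is sparse, so re-fetch the full record instead of trusting it.
      attrs = Igdb::Client.new.game(igdb_id) # placeholder wrapper
      Game.find_or_initialize_by(igdb_id: igdb_id).update!(name: attrs["name"])
    when "delete"
      Game.where(igdb_id: igdb_id).destroy_all
    end

    head :ok
  end
end
```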

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.