by Nick Gambino, Software Engineer
This is part of a 2-part series. You can find Part 1 here.
While reviewing our initial data pipeline, we realized that to make it more flexible, data needed to travel to multiple destinations in a distributed way rather than directly populating the database. The best way to do that was to further segment the microservices and narrow the scope of each one's job.
Here’s where we ended up:
Specifically, we converted the single data queue into a “data exchange” that fed three separate microservice queues. The exchange's only job was to take the raw data payloads and drop them onto its various “child” data queues. The exchange was kept as simple as possible by design, while the child services carried the complexity of transforming the data as needed.
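To make the topology concrete, here's a minimal sketch of that fan-out pattern, assuming a RabbitMQ-style broker and the pika client; the exchange and queue names are illustrative, not our exact configuration:

```python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# A fanout exchange copies every payload to all bound queues, which keeps
# the exchange itself as simple as possible.
channel.exchange_declare(
    exchange="data-exchange", exchange_type="fanout", durable=True
)

# Bind one queue per child microservice.
for queue in ("backfill-queue", "monitoring-queue", "data-queue"):
    channel.queue_declare(queue=queue, durable=True)
    channel.queue_bind(exchange="data-exchange", queue=queue)

# Producers publish raw payloads to the exchange; a fanout exchange
# ignores the routing key entirely.
channel.basic_publish(
    exchange="data-exchange", routing_key="", body=b'{"raw": "payload"}'
)
connection.close()
```

With this shape, adding a fourth consumer later is just another queue binding; no producer or existing consumer has to change.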
It’s important to note that these microservices should be completely decoupled from the database service. This is by design and makes swaps and upgrades within the system relatively easy to accomplish. It also provides a higher level of reliability to the system as a whole.
We prioritized setting up a “backfill-queue” that read data from the exchange and backed it up into S3. This eliminated the need for the database to handle its own backups, and it was critical when we made the transition to InfluxDB: it allowed us to backfill data missed during the cutover, and it continued to provide backfills after the migration.
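As a rough illustration, a consumer on that queue could look something like the sketch below, assuming pika and boto3; the bucket name and key scheme are hypothetical:

```python
import time

import boto3
import pika

s3 = boto3.client("s3")

def on_message(channel, method, properties, body):
    # Persist the raw payload untouched so it can be replayed through the
    # pipeline later (e.g., to backfill the InfluxDB migration).
    key = f"backfill/{time.time_ns()}.json"
    s3.put_object(Bucket="pipeline-backfill", Key=key, Body=body)
    channel.basic_ack(delivery_tag=method.delivery_tag)

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.basic_consume(queue="backfill-queue", on_message_callback=on_message)
channel.start_consuming()
```

Because the payload is stored exactly as it arrived, replaying it through the data-queue later reproduces the same writes.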
We also set up a “monitoring-queue” that took data from the exchange, extracted the key bits (like status, uptime, and health), and populated a Redis data store with them. Our monitoring services could then read the latest BSU values from the Redis cache rather than querying the database. This reduced the number of queries to the database, which ultimately led to more responsive applications and a better user experience.
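Here's a sketch of that consumer, assuming redis-py and JSON payloads; the field names and key layout are illustrative rather than our production schema:

```python
import json

import pika
import redis

cache = redis.Redis(host="localhost", port=6379)

def on_message(channel, method, properties, body):
    payload = json.loads(body)
    # Extract only the fields the monitoring services care about and cache
    # them in Redis, so dashboards never need to hit the database.
    cache.hset(
        f"bsu:{payload['id']}",
        mapping={
            "status": payload["status"],
            "uptime": payload["uptime"],
            "health": payload["health"],
        },
    )
    channel.basic_ack(delivery_tag=method.delivery_tag)

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.basic_consume(queue="monitoring-queue", on_message_callback=on_message)
channel.start_consuming()
```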
Lastly, the “data-queue” in the diagram serves essentially the same purpose as in the previous diagram: it feeds the data into the database. However, we needed to refactor some of the logic in this service to transform the data into formats that InfluxDB could consume. Having broken out the other two services, we could scope this microservice to that data transformation alone, rather than have it juggle multiple responsibilities (like backfill and monitoring). This separation of concerns lowers the risk of failure. For instance, if we got a data payload that crashed the data transformation queue, we knew that the backfill-queue would be unaffected. Similarly, if the backfill-queue crashed, we could be confident that data was still getting into the database.
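To illustrate the kind of transformation this service owns, here's a sketch that reshapes a JSON payload into InfluxDB's line protocol; the measurement, tag, and field names here are assumptions, not our actual schema:

```python
import json

def to_line_protocol(body: bytes) -> str:
    """Reshape a raw payload into InfluxDB line protocol:
    measurement,tag_set field_set timestamp
    """
    payload = json.loads(body)
    return (
        f"bsu_metrics,bsu_id={payload['id']} "
        f"status=\"{payload['status']}\",uptime={payload['uptime']} "
        f"{payload['timestamp']}"
    )

# Example: a payload with id "bsu-42" becomes
#   bsu_metrics,bsu_id=bsu-42 status="ok",uptime=3600 1546300800000000000
```

Keeping this reshaping logic in one place meant a malformed payload could only ever take down this one consumer.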
Now that we had broken our data pipeline out into several tightly scoped microservices, we knew that when it came time to actually make the database swap, we could do it with a high level of confidence and a low risk of data loss.