Logging is a critical part of any application. Having solid logs significantly decreases the amount of time you spend on debugging problems, therefore improving the quality and reliability of your service. So let's take a look at how to use DynamoDB for your logging infrastructure.
DynamoDB is a great choice for logging due to the flexibility and scalability it provides. However, it does take a bit of thought about how you are going to model your data to meet your logging needs. As you've likely heard before, it is critical to think about your usage patterns in advance, before you get started.
One of the great benefits of DynamoDB is that, outside of your key attributes, you don't have a defined schema. If you want to add more properties for one type of log, or for a different service, you can simply add that extra property to your log item. No need to update or define your schema.
Let's look at an example log here:
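As a minimal sketch, a log item might look like the following. Every attribute name here (`id`, `ts`, `level`, `service`, `message`) is illustrative; your own shape may differ:

```python
import json
import uuid
from datetime import datetime, timezone

# A minimal log item; all attribute names here are illustrative.
log_item = {
    "id": str(uuid.uuid4()),                       # partition key
    "ts": datetime.now(timezone.utc).isoformat(),  # ISO-8601 timestamp
    "level": "ERROR",                              # DEBUG | INFO | WARN | ERROR
    "service": "checkout-api",
    "message": "Payment provider returned a 500 response",
}

print(json.dumps(log_item, indent=2))
```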
Now of course, you could easily expand this to include more properties, such as user information, IP address, server configuration, etc. But the premise is pretty simple. The goal here is to easily get logs into DynamoDB so that they are usable.
Read Usage Patterns
Logging is an interesting case because of how unusual its usage patterns are. For example, if you have a User database, your application needs to be able to read and write to that database very frequently, and even users who haven't used your application in a long time expect a fast experience. Logging is different. Although writes need to be extremely fast and reliable, reads are typically only done by engineers at the company, not end-users. Additionally, as time goes on, older logs become less and less relevant and valuable.
For this reason, it isn't normally recommended to keep raw logs in your system long-term. Being able to take those logs that have been written and analyze them, or move them to cold storage is critical.
For example, let's try to answer the question: how many error logs were there on a given day? How could we do that with the logs we wrote above? We'd likely have to add a secondary index on the `level` property with a sort key of `ts`. We then run a Query command to get all items where `level = ERROR` and where the `ts` property is between the values we want for that given day. Count all of those items up, and you have your answer.
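Sketched as low-level DynamoDB Query parameters, that count query could look like this. The table name and index name are assumptions; `Select: "COUNT"` asks DynamoDB to return only the count rather than the items themselves, and `level` must go through `ExpressionAttributeNames` because it is a DynamoDB reserved word:

```python
def error_count_query_params(day: str) -> dict:
    """Build Query parameters that count ERROR logs for one UTC day.

    Assumes a GSI named "level-ts-index" with partition key `level`
    and sort key `ts` (ISO-8601 strings).
    """
    return {
        "TableName": "logs",            # assumed table name
        "IndexName": "level-ts-index",  # assumed GSI name
        "KeyConditionExpression": "#lvl = :lvl AND #ts BETWEEN :start AND :end",
        "ExpressionAttributeNames": {"#lvl": "level", "#ts": "ts"},
        "ExpressionAttributeValues": {
            ":lvl": {"S": "ERROR"},
            ":start": {"S": f"{day}T00:00:00Z"},
            ":end": {"S": f"{day}T23:59:59Z"},
        },
        # Return only the matching item count, not the items.
        "Select": "COUNT",
    }

params = error_count_query_params("2022-09-01")
# e.g. boto3.client("dynamodb").query(**params)["Count"]
```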
However, at a certain point, that process becomes highly inefficient. Why should we have to query every log item when we just want to know the count for a particular day? You could write a Lambda function that runs on a cron schedule, queries your log database, and generates that count just like before, but then saves the result to a different table. For example:
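A sketch of that scheduled aggregation, with the actual DynamoDB calls stubbed out in comments. The counts table name (`log-counts`) and item shape (`id`, `count`) are assumptions:

```python
def daily_count_item(day: str, level: str, count: int) -> dict:
    """Build the aggregate item a scheduled Lambda would write
    to a separate counts table (names are illustrative)."""
    return {
        "id": {"S": f"{level}#{day}"},  # e.g. "ERROR#2022-09-01"
        "count": {"N": str(count)},
    }

def handler(event, context):
    # 1. Run the Select=COUNT query against the logs table for
    #    yesterday (stubbed here with fixed values).
    day, level, count = "2022-09-01", "ERROR", 42
    item = daily_count_item(day, level, count)
    # 2. boto3.client("dynamodb").put_item(TableName="log-counts", Item=item)
    return item
```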
Then, from there, you can easily query that other table for that given `id`, and you get the count for that day. You save bandwidth, time, and cost using this method. You get exactly the data you are looking for, all by building services that take your raw log messages and generate the data you actually care about.
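Reading that aggregate back is then a single GetItem rather than a Query over every raw log (again, the table and key names are assumptions):

```python
def count_lookup_params(day: str, level: str = "ERROR") -> dict:
    """Build GetItem parameters for the pre-aggregated counts table."""
    return {
        "TableName": "log-counts",  # assumed table name
        "Key": {"id": {"S": f"{level}#{day}"}},
    }

params = count_lookup_params("2022-09-01")
# count = int(client.get_item(**params)["Item"]["count"]["N"])
```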
This is why it is critical to think about your usage patterns when using DynamoDB for logging. It can be a highly efficient way to handle logging; however, it does take some thought about how you are going to transform your raw logs into usable data that you care about.
Now you might be asking yourself, why can't I just use DynamoDB Transactions and increase the count for that given day whenever a new log comes in? This is completely possible, and it truly depends on your use case. If getting live counts for that particular metric is important to you, that would be a valid approach. However, it means you are paying for two write commands for every log message: one for the raw log message we discussed earlier, and another to modify the count item. Because write commands are more expensive, using a cron job and ensuring you only run that count write once can lead to drastic cost savings.
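If you do want live counts, note that a plain UpdateItem with an `ADD` action gives you an atomic counter without needing a full transaction. Here's a sketch against the same assumed counts table (`count` goes through `ExpressionAttributeNames` because it is a DynamoDB reserved word); remember this still doubles your writes as described above:

```python
def increment_count_params(day: str, level: str) -> dict:
    """Build UpdateItem parameters that atomically bump the day's
    counter each time a log of this level is written."""
    return {
        "TableName": "log-counts",  # assumed table name
        "Key": {"id": {"S": f"{level}#{day}"}},
        # ADD creates the attribute (starting from 0) if it doesn't exist yet.
        "UpdateExpression": "ADD #c :one",
        "ExpressionAttributeNames": {"#c": "count"},
        "ExpressionAttributeValues": {":one": {"N": "1"}},
    }
```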
One final thing to consider is taking advantage of DynamoDB Streams for certain types of processing. Maybe once that cron job runs and writes something to your other table, you send yourself a notification if the value is higher than expected. Moving that kind of reaction logic into stream consumers separates these concerns and keeps things segmented.
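As a sketch, a stream consumer on the counts table could watch new writes and flag anything above a threshold. The threshold value is an assumption; the record shape (`eventName`, `dynamodb.NewImage`) follows the standard DynamoDB Streams event format:

```python
ALERT_THRESHOLD = 100  # assumed; tune to your normal daily baseline

def should_alert(record: dict) -> bool:
    """Inspect one DynamoDB Streams record and decide whether the
    newly written daily count warrants a notification."""
    if record.get("eventName") not in ("INSERT", "MODIFY"):
        return False
    new_image = record.get("dynamodb", {}).get("NewImage", {})
    count = int(new_image.get("count", {}).get("N", "0"))
    return count > ALERT_THRESHOLD

def handler(event, context):
    for record in event["Records"]:
        if should_alert(record):
            # e.g. publish to an SNS topic / page yourself here
            pass
```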
The important thing here is to consider your use cases and what type of information you want to get out of your logging system. Remember that you can build Lambda functions to modify, aggregate, and transform your logs into usable data as you see fit.
We mentioned previously moving logs to cold storage, and this is an important consideration. For logs that are years old, is it necessary to keep them around at all? If so, is it really critical to keep them in your DynamoDB table? Chances are the answer to at least one of those questions is no. Therefore, it is important to come up with a strategy for handling old logs.
The easiest method for this is to use DynamoDB's TTL feature. This allows you to set a time for items to be automatically deleted from your DynamoDB table. You can then use DynamoDB Streams to capture the deleted items and migrate them to a long-term storage solution such as S3 (if you so desire).
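TTL works off a Number attribute holding an epoch-seconds expiry that you enable TTL on. The attribute name (`expires_at`) and the 30-day retention window below are assumptions you'd configure yourself:

```python
import time

RETENTION_SECONDS = 30 * 24 * 60 * 60  # assumed 30-day retention window

def with_ttl(log_item: dict, now: int = None) -> dict:
    """Return a copy of the log item with the epoch-seconds expiry
    attribute that DynamoDB TTL checks (enable TTL on `expires_at`)."""
    now = int(time.time()) if now is None else now
    return {**log_item, "expires_at": now + RETENTION_SECONDS}
```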
Another option is to write your own Lambda function to handle the deletion and migration of old logs to cold storage. This allows for more flexibility and customization, but it does come at the expense of writing that function yourself instead of taking advantage of the managed nature of DynamoDB TTL. Again, you can choose to migrate to S3 or another storage solution, or just delete the logs altogether.
At some point, you are unlikely to need to query against that data as frequently. So moving that data to another system, such as Amazon S3, makes a lot of sense in that scenario. At the time of writing this, S3 storage costs are considerably lower than DynamoDB storage costs, so this is a great option to ensure your costs stay reasonable over time.
Of course, another consideration here is the possibility of only storing certain logs. Maybe debug messages can be disregarded after a week, whereas warnings get persisted for a year, and error messages get persisted for two years then migrated to S3. Again, it all depends on your use case and your interaction with old logs.
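One way to sketch that tiering is a per-level retention map feeding the TTL attribute. The specific windows mirror the example above and are assumptions, not recommendations:

```python
from datetime import timedelta

# Assumed retention windows per log level, per the example above.
RETENTION = {
    "DEBUG": timedelta(days=7),
    "WARN": timedelta(days=365),
    "ERROR": timedelta(days=730),
}

def expires_at(level: str, written_at: int) -> int:
    """Epoch-seconds TTL expiry for a log written at `written_at`,
    falling back to the shortest (DEBUG) window for unknown levels."""
    window = RETENTION.get(level, RETENTION["DEBUG"])
    return written_at + int(window.total_seconds())
```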
One of the major benefits of DynamoDB is the scalability of the platform. We aren't going to go too deep into this topic in this post, but we wanted to give some specific insight on how some of DynamoDB's scaling functionality can be applied in the context of using DynamoDB for logging.
Logs tend to be highly variable. If a problem occurs, the volume of your log messages is likely to spike; when a key system goes down, the frequency of logs tends to increase. Because of this, DynamoDB's On-Demand capacity mode tends to work extremely well for logging use cases. If you suddenly get a massive spike in DB traffic, you don't have to worry about scaling at all; AWS takes care of ramping up your capacity to handle that surge in logging traffic.
For more information on this topic, check out the Dynobase post on DynamoDB On-Demand Scaling vs Provisioned with Auto-Scaling.
Logging makes a lot of sense in DynamoDB due to the flexible and scalable nature of the database. There are many different ways you can set it up, but think about your read patterns and use cases, and take advantage of the scalability and flexibility DynamoDB offers. And most importantly, log so you can debug problems faster! Good luck.
© 2022 Dynobase