Using a VPC with DynamoDB - The Why & How-To
Written by Kavindu Gunathilake
Published on March 8th, 2022
What is VPC?
Virtual Private Cloud (VPC) is a virtual network that we can create in AWS. It provides complete control over the network, including the following:
- The IP address range.
- The creation of subnets.
- Control of inbound and outbound network traffic.
This article mainly talks about accessing DynamoDB from the resources placed inside a VPC.
Why Do We Use VPC with DynamoDB?
Let's take a web application hosted in AWS that uses DynamoDB as its database. A VPC can span multiple Availability Zones, and each Availability Zone can contain one or more subnets. Resources placed inside a subnet get an IP address from the IP range allocated to that subnet. A subnet can be public or private, depending on whether its resources are exposed to the internet. The main purpose is to define a network boundary around the resources for security.
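To make the relationship between the VPC range and subnet ranges concrete, here is a small sketch using Python's standard `ipaddress` module. The CIDR blocks and the split into /24 subnets are assumptions chosen for illustration, not values from a real VPC.

```python
import ipaddress

# Hypothetical VPC address range, chosen only for illustration.
vpc_cidr = ipaddress.ip_network("10.0.0.0/16")

# Carve the VPC range into /24 subnets and take the first two.
subnets = list(vpc_cidr.subnets(new_prefix=24))
public_subnet, private_subnet = subnets[0], subnets[1]


def subnet_for(address: str):
    """Return the subnet that contains the given IP address, or None."""
    ip = ipaddress.ip_address(address)
    for subnet in (public_subnet, private_subnet):
        if ip in subnet:
            return subnet
    return None


print(public_subnet)            # 10.0.0.0/24
print(private_subnet)           # 10.0.1.0/24
print(subnet_for("10.0.1.25"))  # 10.0.1.0/24
```

An instance launched in the second subnet would receive a private address such as `10.0.1.25`, which falls inside that subnet's range and therefore inside the VPC's range as well.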
Suppose that we place an EC2 instance within the subnet and it needs to access DynamoDB. Traditionally, the database call will route through the DynamoDB public IP range. Even for a private subnet, we need to attach a NAT Gateway for that to work. That means the database request will cross the subnet and VPC boundary to the internet.
Still, with DynamoDB access control in place and SSL/TLS encrypting the traffic, this poses no major security issue. However, for compliance reasons or stricter security requirements, you might need to avoid crossing the VPC boundary altogether.
But is there a way we can still keep the traffic routed securely to DynamoDB at the network level?
DynamoDB VPC Endpoint
VPC endpoints for DynamoDB make it possible to define a secure path to access DynamoDB from a VPC. It even enables Amazon EC2 instances in a VPC to use their private IP addresses to access DynamoDB with no exposure to the public internet.
DynamoDB VPC Endpoint Performance
Using a VPC endpoint for DynamoDB can also improve application performance, since it allows a direct connection from the VPC to DynamoDB. For instance, database traffic from private subnets no longer has to pass through a NAT Gateway, which could otherwise become a performance bottleneck. This setup can also reduce latency by minimizing the number of network hops required to reach DynamoDB.
VPC Endpoint for DynamoDB - Policies
An endpoint policy is an IAM policy that you attach to a VPC endpoint to control access to some or all of the associated service through that endpoint. By default, the policy allows any user or service within the VPC, using credentials from any AWS account, to access any DynamoDB resource.
You can restrict this, for example by creating an endpoint that allows only read operations on a single database table.
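As a minimal sketch of a read-only endpoint policy, the snippet below builds the policy document as a Python dictionary and prints it as JSON. The table ARN, the account ID, the `Sid`, and the exact list of read actions are assumptions for illustration; adjust them to the operations your application actually performs.

```python
import json

# DynamoDB read actions (an illustrative selection, not an exhaustive list).
READ_ONLY_ACTIONS = [
    "dynamodb:BatchGetItem",
    "dynamodb:DescribeTable",
    "dynamodb:GetItem",
    "dynamodb:Query",
    "dynamodb:Scan",
]


def build_read_only_endpoint_policy(table_arn: str) -> dict:
    """Build an endpoint policy that allows only read actions on one table."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadOnlyTableAccess",  # hypothetical statement ID
                "Effect": "Allow",
                "Principal": "*",
                "Action": READ_ONLY_ACTIONS,
                "Resource": table_arn,
            }
        ],
    }


# Hypothetical table ARN, used only for illustration.
policy = build_read_only_endpoint_policy(
    "arn:aws:dynamodb:us-east-1:123456789012:table/Orders"
)
print(json.dumps(policy, indent=2))
```

The resulting JSON can be supplied as the policy document when the endpoint is created or modified. Requests that pass through the endpoint must still be allowed by the caller's own IAM permissions; the endpoint policy only narrows what is reachable through that endpoint.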
How-To Add a VPC Endpoint for DynamoDB?
The following steps connect an EC2 instance in the VPC to DynamoDB through a VPC endpoint.
Prerequisite steps:
- Launch the EC2 instance (make the EC2 instance available) and ensure that its state is running. This link provides a step-by-step guide.
- Configure the Amazon EC2 Instance to set a VPC endpoint for the database. This link will provide more details.
Step 01
- To make sure communication with DynamoDB works, first test access through the public endpoint with the following command.
aws dynamodb list-tables
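The same check can be sketched in Python. The function below takes a DynamoDB client as a parameter (for example one created with `boto3.client("dynamodb")`, which is an assumption about your setup) and follows the `LastEvaluatedTableName` pagination marker so that accounts with many tables are listed completely.

```python
def list_all_tables(dynamodb_client) -> list:
    """Return every table name visible to the client, following pagination."""
    names = []
    kwargs = {}
    while True:
        page = dynamodb_client.list_tables(**kwargs)
        names.extend(page.get("TableNames", []))
        last = page.get("LastEvaluatedTableName")
        if not last:
            return names
        # Continue from the last table name returned by the previous page.
        kwargs["ExclusiveStartTableName"] = last


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   print(list_all_tables(boto3.client("dynamodb", region_name="us-east-1")))
```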
Step 02
- Check that DynamoDB is available as a VPC endpoint service in the region.
aws ec2 describe-vpc-endpoint-services
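The CLI output lists many services, so it can help to search it programmatically. The sketch below assumes an EC2 client object is passed in (for example from `boto3.client("ec2")`) and looks for the DynamoDB service name in the `ServiceNames` list of the response. The helper names are hypothetical.

```python
def dynamodb_service_name(region: str) -> str:
    """Build the endpoint service name for DynamoDB in a region."""
    return f"com.amazonaws.{region}.dynamodb"


def is_dynamodb_endpoint_available(ec2_client, region: str) -> bool:
    """Check whether DynamoDB appears among the VPC endpoint services."""
    target = dynamodb_service_name(region)
    kwargs = {}
    while True:
        response = ec2_client.describe_vpc_endpoint_services(**kwargs)
        if target in response.get("ServiceNames", []):
            return True
        token = response.get("NextToken")
        if not token:
            return False
        # Fetch the next page of services.
        kwargs["NextToken"] = token


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   ec2 = boto3.client("ec2", region_name="us-east-1")
#   print(is_dynamodb_endpoint_available(ec2, "us-east-1"))
```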
Step 03
- Find the VPC identifier
aws ec2 describe-vpcs
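If the account has several VPCs, a short summary makes it easier to pick the right identifier. This sketch assumes an EC2 client is passed in and reads the `VpcId`, `CidrBlock`, and `IsDefault` fields from the `describe_vpcs` response. The function name is hypothetical.

```python
def summarize_vpcs(ec2_client) -> list:
    """Return (VpcId, CidrBlock, IsDefault) tuples for the VPCs in the response."""
    response = ec2_client.describe_vpcs()
    return [
        (vpc["VpcId"], vpc.get("CidrBlock"), vpc.get("IsDefault", False))
        for vpc in response.get("Vpcs", [])
    ]


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   for vpc_id, cidr, is_default in summarize_vpcs(boto3.client("ec2")):
#       print(vpc_id, cidr, "default" if is_default else "")
```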
Step 04
- Create the VPC endpoint for the VPC ID you found. Here, the VPC ID is vpc-0bbc736e.
aws ec2 create-vpc-endpoint --vpc-id vpc-0bbc736e --service-name com.amazonaws.us-east-1.dynamodb --route-table-ids rtb-11aa22bb
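For scripted setups, the same call can be sketched with the AWS SDK for Python. The function below assumes an EC2 client is passed in (for example from `boto3.client("ec2")`) and mirrors the CLI arguments above. The function name and the optional `policy` argument are hypothetical conveniences.

```python
import json


def create_dynamodb_gateway_endpoint(
    ec2_client, vpc_id, route_table_ids, region, policy=None
):
    """Create a gateway endpoint for DynamoDB and return its endpoint ID."""
    params = {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.dynamodb",
        "RouteTableIds": list(route_table_ids),
    }
    if policy is not None:
        # The API expects the policy as a JSON string.
        params["PolicyDocument"] = json.dumps(policy)
    response = ec2_client.create_vpc_endpoint(**params)
    return response["VpcEndpoint"]["VpcEndpointId"]


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   ec2 = boto3.client("ec2", region_name="us-east-1")
#   print(create_dynamodb_gateway_endpoint(
#       ec2, "vpc-0bbc736e", ["rtb-11aa22bb"], "us-east-1"))
```

When no policy is passed, the endpoint is created with the default policy, which allows full access to the service.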
Step 05
- Verify the access using the VPC endpoint.
aws dynamodb list-tables
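A successful `list-tables` call does not by itself show which path the request took. As a complementary check, the sketch below inspects the endpoint state and looks for a route in the route table that targets the endpoint. It assumes an EC2 client is passed in; the endpoint ID in the usage comment is a placeholder, and the function names are hypothetical.

```python
def endpoint_is_available(ec2_client, endpoint_id: str) -> bool:
    """Return True if the endpoint exists and reports an available state."""
    response = ec2_client.describe_vpc_endpoints(VpcEndpointIds=[endpoint_id])
    endpoints = response.get("VpcEndpoints", [])
    if not endpoints:
        return False
    # Compare case-insensitively to be tolerant of how the state is spelled.
    return endpoints[0].get("State", "").lower() == "available"


def route_table_uses_endpoint(ec2_client, route_table_id: str, endpoint_id: str) -> bool:
    """Return True if the route table has a prefix-list route to the endpoint."""
    response = ec2_client.describe_route_tables(RouteTableIds=[route_table_id])
    for table in response.get("RouteTables", []):
        for route in table.get("Routes", []):
            if (
                route.get("GatewayId") == endpoint_id
                and route.get("DestinationPrefixListId")
            ):
                return True
    return False


# Example usage (requires boto3 and AWS credentials; IDs are placeholders):
#   import boto3
#   ec2 = boto3.client("ec2", region_name="us-east-1")
#   print(endpoint_is_available(ec2, "vpce-0123456789abcdef0"))
#   print(route_table_uses_endpoint(ec2, "rtb-11aa22bb", "vpce-0123456789abcdef0"))
```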
AppSync - DynamoDB - VPC endpoint
AWS AppSync is a fully managed service that facilitates the development of GraphQL APIs. Front-end developers mainly benefit from this service, as it can handle heavy queries by securely connecting to data sources such as Amazon DynamoDB. The main application can be hosted on an EC2 instance inside the VPC and connect to DynamoDB through the VPC endpoint, while AppSync provides the GraphQL API that the front end uses on top of the same DynamoDB data.
AWS Lambda with VPC for DynamoDB
Lambda and DynamoDB both run in the AWS Cloud, and both can work outside a VPC. However, if a Lambda function needs to access resources in a VPC, we might need to place the function in that VPC. If the same function also needs to access DynamoDB, it then faces the same challenge as an EC2 instance.
We can use VPC endpoints to solve this challenge as well. By configuring a VPC endpoint, the Lambda function can securely and efficiently access DynamoDB without traversing the public internet, thus maintaining compliance and enhancing security.
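From the function's point of view, nothing changes in the code: the same SDK call is used whether the request leaves through a NAT Gateway or a VPC endpoint. The sketch below illustrates this with a handler that reads one item. The table name `Orders`, the key attribute `pk`, and the `make_handler` helper are assumptions for illustration.

```python
def make_handler(table):
    """Build a Lambda-style handler that reads one item from the given table."""

    def handler(event, context):
        # "pk" is a hypothetical partition key attribute name.
        response = table.get_item(Key={"pk": event["pk"]})
        # "Item" is absent from the response when no matching item exists.
        return response.get("Item")

    return handler


# In a real Lambda function (requires boto3, which the Lambda Python runtime includes):
#   import boto3
#   table = boto3.resource("dynamodb").Table("Orders")  # hypothetical table name
#   handler = make_handler(table)
```

Passing the table object in, rather than creating it inside the handler, also makes the function easy to exercise locally with a stand-in object.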
AWS Data Pipeline and Usage of VPC
To better understand the topic, you need to understand what is meant by a data pipeline. AWS Data Pipeline is a web service that you can use to automate data transformations. With AWS Data Pipeline, you can also define data-driven workflows in which tasks depend on the successful completion of previous tasks.
This link will guide you through the simple steps to set up AWS Data Pipeline. For AWS accounts created after 2013-12-04, AWS creates a default VPC in each region. The default configuration of this default VPC supports AWS Data Pipeline resources.
Use Case:
- A customer runs a web server on an EC2 instance inside a VPC. The web server tracks day-to-day data and stores it in Amazon DynamoDB. A pipeline that runs as a scheduled, automated daily task can then process the data stored over a certain period and give insights into it.
- Since we are dealing with large amounts of data that require security and performance at the highest possible level, we need DynamoDB VPC endpoints to make the setup efficient and secure.
Note: AWS recommends that you create resources for all your pipelines in VPCs.
Security Considerations
When using VPC endpoints with DynamoDB, follow security best practices. Make your VPC endpoint policies as restrictive as possible, granting only the necessary permissions on the required resources. Regularly review and update these policies to adapt to changing security requirements. Additionally, monitor the traffic through your VPC endpoints using Amazon CloudWatch to detect any unusual activity or potential security threats.
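A policy review can be partly automated. The following sketch is a simple check, not a complete policy analyzer: it flags `Allow` statements in an endpoint policy that use a wildcard action or a wildcard resource. The function names and the wording of the findings are hypothetical.

```python
def _as_list(value):
    """Policy fields may be a single value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def find_broad_statements(policy: dict) -> list:
    """Return findings for Allow statements with wildcard actions or resources."""
    findings = []
    for index, statement in enumerate(_as_list(policy.get("Statement", []))):
        if statement.get("Effect") != "Allow":
            continue
        label = statement.get("Sid", f"statement[{index}]")
        actions = _as_list(statement.get("Action", []))
        resources = _as_list(statement.get("Resource", []))
        if any(action in ("*", "dynamodb:*") for action in actions):
            findings.append(f"{label}: wildcard action")
        if "*" in resources:
            findings.append(f"{label}: wildcard resource")
    return findings


# A full-access policy, which allows every action on every resource.
full_access_policy = {
    "Statement": [
        {"Effect": "Allow", "Principal": "*", "Action": "*", "Resource": "*"}
    ]
}
print(find_broad_statements(full_access_policy))
# ['statement[0]: wildcard action', 'statement[0]: wildcard resource']
```

A check like this could run as part of a periodic review, so that an endpoint left on a full-access policy is noticed early.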
Frequently Asked Questions
My Lambda in a VPC can't connect to DynamoDB - what should I do?
DynamoDB is located outside the VPC, and AWS fully manages it. Therefore, to connect to DynamoDB from a Lambda function within a VPC, you need to connect it through a VPC endpoint.
Another alternative, though not optimal, is to use a NAT Gateway if your Lambda is in a private subnet. With the NAT Gateway in place, your Lambda can connect to DynamoDB. This link refers to an article that will help to set up the NAT Gateway with a VPC.
Does DynamoDB require VPC?
No. You can retrieve or send data to DynamoDB over the public network. By default, communications to and from DynamoDB use the HTTPS protocol, protecting network traffic using SSL/TLS encryption.
A VPC endpoint for DynamoDB enables your VPC's EC2 instances to use their private IP addresses to access DynamoDB with increased privacy, security, and greater efficiency.