Routing Lambda traffic out of VPC using Egress-only Internet Gateway

Ivan Barlog
AWS Solutions Architect @ BeeSolve

Last year AWS announced that you can use an Egress-only Internet Gateway to route traffic out of a VPC without needing a NAT Gateway. For free. The only gotcha is that it has to be IPv6 traffic.

This way you can finally stop paying for NAT Gateways, as long as you communicate over IPv6, of course.

TL;DR

Here is all you need to know:

To route traffic out of private (egress-only) VPC subnets, you need to enable dual-stack on your VPC. With the CDK Vpc construct used in this article, that also deploys an Egress-only Internet Gateway automatically.

You will also need to set up DynamoDB and S3 VPC Gateway endpoints in order to communicate with these services.

Lastly, since you need to communicate over IPv6, you can make Node.js prefer IPv6 addresses when resolving hostnames with this line of code:

setDefaultResultOrder("ipv6first");

If you want to push messages to SQS, your client needs to be set up like this:

const sqs = new SQSClient({
  useDualstackEndpoint: true,
  useQueueUrlAsEndpoint: false,
});

Happy building!

The full story

To find out how this works, I created a very simple example that tests two Lambda functions: one inside the VPC and one outside of it.

Both functions perform these tasks:

  1. read/write DynamoDB table
  2. send messages to SQS
  3. read/write S3 bucket
  4. fetch public URL

I repeat each task several times and measure how long it takes. I wouldn't necessarily call it a benchmark, though.
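Every task in the handler is timed the same way, so the pattern can be captured in a single helper. This is only a minimal sketch: the `measure` function is hypothetical and is not part of the handler shown in this article.

```typescript
// Hypothetical helper (not in the original handler): runs `task` sequentially
// `iterations` times and logs the total and average duration in milliseconds.
async function measure(
  label: string,
  iterations: number,
  task: (i: number) => Promise<unknown>,
): Promise<{ totalMs: number; avgMs: number }> {
  const start = performance.now();

  for (let i = 0; i < iterations; i++) {
    await task(i);
  }

  const totalMs = performance.now() - start;
  const avgMs = totalMs / iterations;
  console.info(`${label} took: ${totalMs}ms`);
  console.info(`${label} avg took: ${avgMs}ms`);
  return { totalMs, avgMs };
}
```

With a helper like this, a step such as fetching the public URL could be written as `await measure("fetch public api", size, () => fetch(apiUri))`. The requests still run one after another, so the average includes the full round trip of each call.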

Full code of Lambda handler

import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import {
  ListObjectsV2Command,
  PutObjectCommand,
  S3Client,
} from "@aws-sdk/client-s3";
import { SendMessageCommand, SQSClient } from "@aws-sdk/client-sqs";
import {
  DynamoDBDocumentClient,
  PutCommand,
  QueryCommand,
} from "@aws-sdk/lib-dynamodb";
import { randomUUID } from "node:crypto";
import { setDefaultResultOrder } from "node:dns";

setDefaultResultOrder("ipv6first");

const dynamo = DynamoDBDocumentClient.from(new DynamoDBClient(), {
  marshallOptions: {
    removeUndefinedValues: true,
    convertEmptyValues: false,
  },
});

const sqs = new SQSClient({
  useDualstackEndpoint: true,
  useQueueUrlAsEndpoint: false,
});

const s3 = new S3Client();

const size = 100;
const pk = "test1";
const apiUri = "https://jsonplaceholder.typicode.com/todos/1";

export const handler = async () => {
  console.time();
  await fillDatabase();
  await readDatabase();
  await sendToQueue();
  await writeToS3();
  await readFromS3();
  await fetchPublicApi().catch(console.error);
  console.timeEnd();
};

async function fillDatabase() {
  const start = performance.now();

  for (let i = 0; i < size; i++) {
    await dynamo.send(
      new PutCommand({
        TableName: process.env.TABLE_NAME!,
        Item: {
          pk,
          sk: Date.now().toString(),
        },
      }),
    );
  }

  const diff = performance.now() - start;
  console.info(`fill database took: ${diff}ms`);
  console.info(`fill database avg took: ${diff / size}ms`);
}

async function readDatabase() {
  const start = performance.now();

  for (let i = 0; i < size; i++) {
    await dynamo.send(
      new QueryCommand({
        TableName: process.env.TABLE_NAME!,
        KeyConditionExpression: "#pk = :pk",
        ExpressionAttributeNames: {
          "#pk": "pk",
        },
        ExpressionAttributeValues: {
          ":pk": pk,
        },
      }),
    );
  }

  const diff = performance.now() - start;
  console.info(`read database took: ${diff}ms`);
  console.info(`read database avg took: ${diff / size}ms`);
}

async function fetchPublicApi() {
  const start = performance.now();

  for (let i = 0; i < size; i++) {
    await fetch(apiUri);
  }

  const diff = performance.now() - start;
  console.info(`fetch public api took: ${diff}ms`);
  console.info(`fetch public api avg took: ${diff / size}ms`);
}

async function sendToQueue() {
  const start = performance.now();

  for (let i = 0; i < size; i++) {
    await sqs.send(
      new SendMessageCommand({
        QueueUrl: process.env.QUEUE_URL!,
        MessageBody: JSON.stringify({ createdAt: new Date() }, null, 2),
      }),
    );
  }

  const diff = performance.now() - start;
  console.info(`send message to queue took: ${diff}ms`);
  console.info(`send message to queue avg took: ${diff / size}ms`);
}

async function writeToS3() {
  const start = performance.now();

  for (let i = 0; i < size; i++) {
    await s3.send(
      new PutObjectCommand({
        Bucket: process.env.BUCKET_NAME!,
        Key: randomUUID(),
        Body: JSON.stringify({ createdAt: new Date() }),
      }),
    );
  }

  const diff = performance.now() - start;
  console.info(`write to s3 took: ${diff}ms`);
  console.info(`write to s3 avg took: ${diff / size}ms`);
}

async function readFromS3() {
  const start = performance.now();

  for (let i = 0; i < size; i++) {
    await s3.send(
      new ListObjectsV2Command({
        Bucket: process.env.BUCKET_NAME!,
        MaxKeys: 10,
      }),
    );
  }

  const diff = performance.now() - start;
  console.info(`read from s3 took: ${diff}ms`);
  console.info(`read from s3 avg took: ${diff / size}ms`);
}

Code for setting up infrastructure using CDK

import { App, Duration, Stack, type Environment } from "aws-cdk-lib";
import {
  AttributeType,
  Billing,
  TableEncryptionV2,
  TableV2,
} from "aws-cdk-lib/aws-dynamodb";
import {
  GatewayVpcEndpointAwsService,
  IpProtocol,
  SubnetType,
  Vpc,
} from "aws-cdk-lib/aws-ec2";
import { Architecture, Runtime } from "aws-cdk-lib/aws-lambda";
import {
  NodejsFunction,
  type BundlingOptions,
} from "aws-cdk-lib/aws-lambda-nodejs";
import { Bucket } from "aws-cdk-lib/aws-s3";
import { Queue } from "aws-cdk-lib/aws-sqs";
import { resolve } from "path";

const env: Environment = {
  account: "your-account-id",
  region: "your-region",
};

const app = new App();
const stack = new Stack(app, "EgressOnlyInternetGatewayTest", { env });

const vpc = new Vpc(stack, "Vpc", {
  subnetConfiguration: [
    {
      name: "Public",
      subnetType: SubnetType.PUBLIC,
      cidrMask: 19,
    },
    {
      name: "Private",
      subnetType: SubnetType.PRIVATE_WITH_EGRESS,
      cidrMask: 19,
    },
    {
      name: "Isolated",
      subnetType: SubnetType.PRIVATE_ISOLATED,
      cidrMask: 19,
    },
  ],
  natGateways: 0,
  maxAzs: 2,
  ipProtocol: IpProtocol.DUAL_STACK,
});

vpc.addGatewayEndpoint(`DynamoDbEndpoint`, {
  service: GatewayVpcEndpointAwsService.DYNAMODB,
});
vpc.addGatewayEndpoint(`S3Endpoint`, {
  service: GatewayVpcEndpointAwsService.S3,
});

const table = new TableV2(stack, "Data", {
  partitionKey: {
    name: "pk",
    type: AttributeType.STRING,
  },
  sortKey: {
    name: "sk",
    type: AttributeType.STRING,
  },
  billing: Billing.onDemand(),
  encryption: TableEncryptionV2.awsManagedKey(),
});

const bundling: BundlingOptions = {
  minify: true,
  sourceMap: false,
  target: "es2022",
};

const queue = new Queue(stack, "Queue");
const bucket = new Bucket(stack, "bucket");

const memorySize = 1024;
const timeout = Duration.seconds(10);

const vpcHandler = new NodejsFunction(stack, "VpcHandler", {
  entry: resolve(__dirname, "./handler.ts"),
  handler: "handler",
  bundling,
  memorySize,
  timeout,
  environment: {
    TABLE_NAME: table.tableName,
    QUEUE_URL: queue.queueUrl,
    BUCKET_NAME: bucket.bucketName,
  },
  runtime: Runtime.NODEJS_24_X,
  architecture: Architecture.ARM_64,
  depsLockFilePath: resolve(`${__dirname}/bun.lock`),
  vpc,
  allowAllIpv6Outbound: true,
  ipv6AllowedForDualStack: true,
  vpcSubnets: {
    subnetType: SubnetType.PRIVATE_WITH_EGRESS,
  },
});

const publicHandler = new NodejsFunction(stack, "PublicHandler", {
  entry: resolve(__dirname, "./handler.ts"),
  handler: "handler",
  bundling,
  memorySize,
  timeout,
  environment: {
    TABLE_NAME: table.tableName,
    QUEUE_URL: queue.queueUrl,
    BUCKET_NAME: bucket.bucketName,
  },
  runtime: Runtime.NODEJS_24_X,
  architecture: Architecture.ARM_64,
  depsLockFilePath: resolve(`${__dirname}/bun.lock`),
});

table.grantReadWriteData(vpcHandler);
table.grantReadWriteData(publicHandler);
queue.grantSendMessages(vpcHandler);
queue.grantSendMessages(publicHandler);
bucket.grantReadWrite(vpcHandler);
bucket.grantReadWrite(publicHandler);

With CDK I've described a simple architecture with two Lambda functions: one deployed inside the VPC and one outside of it. I've also added a sample SQS queue, a DynamoDB table and an S3 bucket.

The VPC is described as follows:

const vpc = new Vpc(stack, "Vpc", {
  subnetConfiguration: [
    {
      name: "Public",
      subnetType: SubnetType.PUBLIC,
      cidrMask: 19,
    },
    {
      name: "Private",
      subnetType: SubnetType.PRIVATE_WITH_EGRESS,
      cidrMask: 19,
    },
    {
      name: "Isolated",
      subnetType: SubnetType.PRIVATE_ISOLATED,
      cidrMask: 19,
    },
  ],
  natGateways: 0,
  maxAzs: 2,
  ipProtocol: IpProtocol.DUAL_STACK,
});

vpc.addGatewayEndpoint(`DynamoDbEndpoint`, {
  service: GatewayVpcEndpointAwsService.DYNAMODB,
});
vpc.addGatewayEndpoint(`S3Endpoint`, {
  service: GatewayVpcEndpointAwsService.S3,
});

The important things to note above are that there are no NAT Gateways (natGateways: 0) and that the dual-stack protocol is enabled so IPv6 can be used (ipProtocol: IpProtocol.DUAL_STACK). The latter also adds the Egress-only Internet Gateway, which is crucial for this architecture.

I've also enabled VPC Gateway endpoints for both DynamoDB and S3. They let the functions reach these services without going over the public network, so the communication stays entirely within the internal AWS network. Using them is also free of charge.

The first Lambda is deployed outside of the VPC. This means that it communicates with services like S3, SQS and DynamoDB through their public endpoints. This is perfectly safe, as all traffic is encrypted in transit by default and all of these services use IAM for authentication and authorization.

The second Lambda is deployed to a private subnet (SubnetType.PRIVATE_WITH_EGRESS), which means that it communicates with S3 and DynamoDB through the VPC Gateway endpoints. To reach other services we can either create additional VPC endpoints (~$10/month per endpoint) or route the traffic through the Egress-only Internet Gateway. To keep the cost down, I chose the latter.

My example uses TypeScript and the AWS SDK for JavaScript, but I guess any other AWS SDK can be set up in a similar way.

To access SQS over IPv6 we need to tell the SQS client not to use the IPv4 endpoint, which it uses by default. The queue URL used for pushing messages to SQS looks something like https://sqs.eu-central-1.amazonaws.com/1234567890/EgressOnlyInternetGatewayTest-Queue4A7E3555-BJOpI2ObN8Iz, and its hostname has to be resolved to an IP address. By default SQSClient does not use the dual-stack endpoint, so we need to enable it like this:

const sqs = new SQSClient({
  useDualstackEndpoint: true, // Enables IPv6/IPv4 dualstack endpoint.
  useQueueUrlAsEndpoint: false, // Set this value to false to ignore the QueueUrl and use the client’s resolved endpoint, which may be a custom endpoint.
});
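To make it clearer why both options are needed, here is a small sketch that compares the hostname inside the queue URL with the dual-stack hostname. It assumes the standard `aws` partition, where dual-stack endpoints follow the `<service>.<region>.api.aws` naming scheme. The two helper functions are hypothetical and only illustrate the naming; the SDK resolves the endpoint on its own.

```typescript
// Assumption: standard "aws" partition, where dual-stack endpoints are named
// <service>.<region>.api.aws. Other partitions use different suffixes.
function sqsDualStackEndpoint(region: string): string {
  return `https://sqs.${region}.api.aws`;
}

// Hypothetical helper: extracts the region from a classic queue URL such as
// https://sqs.eu-central-1.amazonaws.com/<account>/<queue-name>.
function regionFromQueueUrl(queueUrl: string): string {
  const host = new URL(queueUrl).hostname;
  const region = /^sqs\.([a-z0-9-]+)\.amazonaws\.com$/.exec(host)?.[1];
  if (!region) {
    throw new Error(`Unexpected SQS queue URL host: ${host}`);
  }
  return region;
}

const exampleQueueUrl =
  "https://sqs.eu-central-1.amazonaws.com/1234567890/EgressOnlyInternetGatewayTest-Queue4A7E3555-BJOpI2ObN8Iz";

// Host the client would talk to if it used the queue URL as its endpoint.
console.info(new URL(exampleQueueUrl).hostname);
// Dual-stack endpoint for the same region.
console.info(sqsDualStackEndpoint(regionFromQueueUrl(exampleQueueUrl)));
```

With useQueueUrlAsEndpoint left at its default, the client would keep sending requests to the host from the queue URL, so enabling useDualstackEndpoint alone would not be enough.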

When I did this I thought I was done, but unfortunately another part of my code, which uses fetch, still resolved hostnames to IPv4 instead of IPv6. After what felt like 20 minutes I realized that Node.js needs to be hinted to prefer IPv6 when resolving as well.

I tried different approaches, such as resolving the hostname through node:dns first and then passing the resolved address to fetch, but I ended up with this single line of code, which I put at the top of my TypeScript file:

setDefaultResultOrder("ipv6first");

Super simple and elegant, once you have found the needle in the haystack 🙃
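If you want to double-check that the setting took effect, Node.js can report the order it currently uses. This is a minimal sketch and assumes a Node.js version recent enough to support both the "ipv6first" value and getDefaultResultOrder (the Node.js 24 runtime used in this article should qualify):

```typescript
import { getDefaultResultOrder, setDefaultResultOrder } from "node:dns";

// Prefer IPv6 addresses in the results of dns.lookup(), as in the handler.
setDefaultResultOrder("ipv6first");

// Logs the order that is now in effect for this process.
console.info(`DNS result order: ${getDefaultResultOrder()}`);
```

The setting applies to the whole process, which is why one call at the top of the file was enough for the fetch calls in my handler.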

After setting these two things I was able to run the same code inside and outside of the VPC. I know that I didn't do a real benchmark, but from what I've seen the code above performed very similarly in both environments.

Here is sample output from both handlers (some repetitive information has been removed for clarity). Both results show numbers without a cold start.

VpcHandler:

START RequestId: abcacdc9-1363-4cbf-9d47-a8c3d570eefa Version: $LATEST
fill database took: 632.5130489999974ms
fill database avg took: 6.3251304899999745ms
read database took: 744.5211479999998ms
read database avg took: 7.445211479999998ms
send message to queue took: 660.9118600000002ms
send message to queue avg took: 6.609118600000001ms
write to s3 took: 2391.3652559999973ms
write to s3 avg took: 23.913652559999974ms
read from s3 took: 1804.0117310000023ms
read from s3 avg took: 18.040117310000024ms
fetch public api took: 777.1355969999968ms
fetch public api avg took: 7.771355969999968ms
default: 7.011s
END RequestId: abcacdc9-1363-4cbf-9d47-a8c3d570eefa
REPORT RequestId: abcacdc9-1363-4cbf-9d47-a8c3d570eefa Duration: 7013.91 ms Billed Duration: 7014 ms Memory Size: 1024 MB Max Memory Used: 171 MB

PublicHandler:

START RequestId: 2d50f0e9-9e80-4721-8be2-5a690b5502d0 Version: $LATEST
fill database took: 633.4711789999965ms
fill database avg took: 6.3347117899999645ms
read database took: 737.5715189999974ms
read database avg took: 7.3757151899999736ms
send message to queue took: 587.4475230000025ms
send message to queue avg took: 5.874475230000026ms
write to s3 took: 2231.286675000003ms
write to s3 avg took: 22.31286675000003ms
read from s3 took: 1750.1437510000032ms
read from s3 avg took: 17.50143751000003ms
fetch public api took: 828.715667000004ms
fetch public api avg took: 8.28715667000004ms
default: 6.769s
END RequestId: 2d50f0e9-9e80-4721-8be2-5a690b5502d0
REPORT RequestId: 2d50f0e9-9e80-4721-8be2-5a690b5502d0 Duration: 6772.01 ms Billed Duration: 6773 ms Memory Size: 1024 MB Max Memory Used: 176 MB
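To put a number on "very similarly", the relative difference per task can be computed from the logs above. This is a small sketch: the durations are copied from the two outputs and rounded to two decimals, and the `relativeDiffPercent` helper is hypothetical.

```typescript
// Durations in milliseconds, copied (rounded) from the two log excerpts above.
const timings: Record<string, { vpc: number; pub: number }> = {
  "fill database": { vpc: 632.51, pub: 633.47 },
  "read database": { vpc: 744.52, pub: 737.57 },
  "send message to queue": { vpc: 660.91, pub: 587.45 },
  "write to s3": { vpc: 2391.37, pub: 2231.29 },
  "read from s3": { vpc: 1804.01, pub: 1750.14 },
  "fetch public api": { vpc: 777.14, pub: 828.72 },
  "total duration": { vpc: 7013.91, pub: 6772.01 },
};

// Positive result: the VPC handler was slower; negative: it was faster.
function relativeDiffPercent(vpcMs: number, publicMs: number): number {
  return ((vpcMs - publicMs) / publicMs) * 100;
}

for (const [task, t] of Object.entries(timings)) {
  console.info(`${task}: ${relativeDiffPercent(t.vpc, t.pub).toFixed(1)}%`);
}
```

For this single run the total duration of the VPC handler comes out roughly 3.6% higher, with the SQS step showing the largest gap at about 12.5%. One run per handler is far too little to draw conclusions from, so treat these numbers as a rough indication only.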

Conclusion

If you need to run your AWS Lambda in private VPC subnets to access services like EFS or a private RDS, or if you simply want to strengthen your security posture, the Egress-only Internet Gateway may be a good, cost-effective solution for you.

For accessing services and external APIs that support IPv6, using an Egress-only Internet Gateway is the way to go.

However, if you need to reach destinations that are only available over IPv4, you will still need to rely on a NAT Gateway for now.

As always, I wish you happy coding! Now go build something. 🙂