The hub-and-spoke topology comes up in almost every discussion of cloud infrastructure design. It appears in both AWS and Azure design papers, and it has long been an important option in physical network design.
The AWS whitepaper Building a Scalable and Secure Multi-VPC AWS Network Infrastructure discusses the topology thoroughly. It often features a Transit Gateway as the hub. In addition to workload VPCs, the topology often includes special-purpose VPCs, such as an interface endpoints VPC or shared tooling VPCs. One of these special-purpose VPCs is the inspection VPC. It is a key design area because it has to meet the business's needs for inspection and traffic management, and the design can vary a lot depending on the inspection tools available, such as a firewall appliance.
Inspection Requirements
The most common situation in an enterprise is connectivity with the on-prem network. Options include Direct Connect, site-to-site IPsec VPN, or an SD-WAN overlay. The business decides whether, and at what level, to inspect the traffic between on-prem and its VPCs. Here is an example.
Connectivity | Inspection Requirement |
Between Workload VPCs (East-West) | No inspection |
Between a workload VPC and a special-purpose VPC | Normal Inspection |
Ingress Traffic from Internet to Workload VPC | Deep Packet Inspection |
Egress Traffic from Workload VPC to Internet | Deep Packet Inspection |
Between Workload VPC and on-prem networking over Direct Connect | Normal Inspection |
…… |
With normal inspection, the firewall appliance only checks the information in the packet's header, such as the source and destination IP addresses and port numbers. With deep packet inspection (DPI), the appliance examines a wider range of metadata as well as the data in each packet. DPI filters packets more effectively and can find otherwise hidden threats. It is, however, an expensive operation from a performance standpoint. Ultimately the business makes the call, but it is important to identify ALL connectivity scenarios in this phase and explicitly document each decision and its rationale. For the inspection tool, the business can choose an NGFW product or the AWS Network Firewall service, depending on the capabilities required.
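One way to keep these decisions explicit is to record the matrix as data and make any undocumented path fail loudly. The following is a minimal sketch of that idea, not a prescribed tool: the zone names (`workload`, `special-purpose`, `internet`, `on-prem`) and the function name are made up for illustration, and the entries simply mirror the example table above.

```python
from enum import Enum


class Inspection(Enum):
    NONE = "No inspection"
    NORMAL = "Normal Inspection"     # header-only checks
    DEEP = "Deep Packet Inspection"  # headers plus payload


# Keys are (source zone, destination zone). Zone names are hypothetical.
REQUIREMENTS = {
    ("workload", "workload"): Inspection.NONE,
    ("workload", "special-purpose"): Inspection.NORMAL,
    ("internet", "workload"): Inspection.DEEP,
    ("workload", "internet"): Inspection.DEEP,
    ("workload", "on-prem"): Inspection.NORMAL,
}

# Rows written as "Between ..." in the table apply in both directions.
BIDIRECTIONAL = {("workload", "special-purpose"), ("workload", "on-prem")}


def required_inspection(src: str, dst: str) -> Inspection:
    """Return the documented decision, or raise for an undocumented path."""
    if (src, dst) in REQUIREMENTS:
        return REQUIREMENTS[(src, dst)]
    if (dst, src) in BIDIRECTIONAL:
        return REQUIREMENTS[(dst, src)]
    raise LookupError(f"No documented inspection decision for {src} -> {dst}")
```

Calling `required_inspection("on-prem", "workload")` returns the normal inspection level, while a path nobody has decided on, such as `("on-prem", "internet")`, raises an error instead of silently defaulting.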
Inspection Architecture
At a minimum, inspection is required for ingress and egress traffic to and from workload VPCs. The design must account for both routing and inspection. Many designs use the same VPC for ingress/egress traffic and for inspection.
It is also possible to separate these two purposes into two dedicated VPCs: an inspection VPC that hosts the firewall services or appliances, and an ingress/egress VPC that carries traffic to and from the Internet. In that case we must still route the traffic to the inspection appliance. If all traffic to be inspected has to traverse the Transit Gateway in both directions, the cost would be high. In 2020 AWS introduced Gateway Load Balancer (GWLB) to address this use case.
The recommended pattern with GWLB lets you place the firewall appliances and a GWLB in one VPC, and place the GWLB endpoint (GWLBE) in a different VPC. The connectivity between GWLBE and GWLB is backed by Hyperplane, a technology that also enables other endpoint services such as PrivateLink. Traffic between the GWLB and the appliance is carried with Geneve encapsulation. This pattern can place any appliance behind an endpoint, so long as the appliance supports the Geneve protocol.
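To make "supports the Geneve protocol" a little more concrete, here is a minimal sketch of the 8-byte Geneve base header as laid out in RFC 8926, which is what an appliance has to parse before it reaches the original packet. The function names are my own, and treating the inner payload as an IPv4 packet (protocol type `0x0800`) is an assumption for illustration. The metadata options that a GWLB attaches are not modelled here.

```python
import struct

GENEVE_UDP_PORT = 6081   # IANA-assigned UDP port for Geneve
ETHERTYPE_IPV4 = 0x0800  # assumption: the encapsulated payload is an IPv4 packet


def pack_geneve_header(vni: int, protocol_type: int = ETHERTYPE_IPV4,
                       options: bytes = b"") -> bytes:
    """Build the 8-byte Geneve base header followed by any option bytes."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    if len(options) % 4 or len(options) > 63 * 4:
        raise ValueError("options must be a multiple of 4 bytes, at most 252")
    version = 0
    opt_len_words = len(options) // 4          # option length in 4-byte words
    first_byte = (version << 6) | opt_len_words
    flags = 0                                  # O and C bits left unset
    # The VNI fills the top 24 bits of the last word; the low 8 bits are reserved.
    return struct.pack("!BBHI", first_byte, flags, protocol_type, vni << 8) + options


def unpack_geneve_header(data: bytes) -> dict:
    """Parse the base header and report where the inner packet starts."""
    if len(data) < 8:
        raise ValueError("Geneve base header is 8 bytes")
    first_byte, _flags, protocol_type, vni_word = struct.unpack("!BBHI", data[:8])
    options_length = (first_byte & 0x3F) * 4
    return {
        "version": first_byte >> 6,
        "options_length": options_length,
        "protocol_type": protocol_type,
        "vni": vni_word >> 8,
        "payload_offset": 8 + options_length,
    }
```

An appliance that cannot strip this header, inspect the inner packet, and return the traffic in the same encapsulation cannot sit behind a GWLB, which is why Geneve support is the gating requirement for this pattern.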
The GWLB technology enables a number of inspection patterns based on distributed ingress paths, as summarized in this document. Distributed ingress/egress means each workload VPC can have its own Internet Gateway and NAT gateways. Each VPC must configure its route tables to send traffic through GWLBEs to the inspection appliances. In general, I recommend this pattern over the centralized ingress/egress patterns, where only the inspection VPC takes ingress traffic from an Internet Gateway. The Network architectures for ingress traffic inspection presentation from re:Invent 2021 covered this topic as well, especially the scaling benefit of distributed ingress.
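The route table configuration for distributed ingress can be sketched as a small model. This is only an illustration under assumptions: the CIDRs, table names, and target IDs (`gwlbe-1`, `igw`) are hypothetical, the workload is assumed to sit in a public subnet without a NAT gateway, and the lookup simply applies longest-prefix matching, which is how VPC route tables choose between overlapping routes.

```python
import ipaddress

# All names and CIDRs below are hypothetical.
ROUTE_TABLES = {
    # Edge route table associated with the Internet Gateway: inbound traffic
    # for the application subnet is diverted to the GWLB endpoint first.
    "igw-edge": {"10.0.1.0/24": "gwlbe-1"},
    # GWLB endpoint subnet: inspected traffic continues to the IGW, and
    # traffic for the VPC CIDR is delivered locally.
    "gwlbe-subnet": {"10.0.0.0/16": "local", "0.0.0.0/0": "igw"},
    # Application subnet: outbound traffic goes to the GWLB endpoint.
    "app-subnet": {"10.0.0.0/16": "local", "0.0.0.0/0": "gwlbe-1"},
}


def next_hop(table: str, destination: str) -> str:
    """Return the target of the most specific route matching the destination."""
    addr = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in ROUTE_TABLES[table].items()
        if addr in ipaddress.ip_network(cidr)
    ]
    if not matches:
        raise LookupError(f"{table} has no route to {destination}")
    return max(matches, key=lambda match: match[0].prefixlen)[1]
```

Tracing an inbound packet for `10.0.1.25`, the edge table sends it to `gwlbe-1`, and after inspection the endpoint subnet's table delivers it locally. A reply to `203.0.113.10` leaves the application subnet via `gwlbe-1` and only then reaches `igw`, so both directions pass the appliance without crossing the Transit Gateway.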
Firewall Deployment Patterns
The firewall deployment patterns available differ by vendor and by requirement. Since the GWLB pattern places appliances behind the GWLB, the appliances only see the Geneve traffic that the GWLB forwards to them. Some vendors may argue that this pattern keeps the NGFW product from performing tasks that do not support Geneve traffic. One example is Network Address Translation. The native NAT gateway service is very expensive (consider fck-nat as an alternative for NAT), so many clients want to use the NAT feature of the NGFW product. The architecture therefore has to be adjusted in favour of centralized egress. Review this post about one-arm mode and two-arm mode.
If we have to go with centralized ingress/egress anyway, there are still numerous options. Take FortiGate for example: while the GWLB deployment pattern is supported, other available options include:
- Traditional pattern with multiple interfaces across different subnets in the inspection VPC (L3 mode)
- Integration with Transit Gateway using Transit Gateway Connect Attachment
- Integration with Transit Gateway using Transit Gateway VPN Attachment
I regard the first option as traditional because it does not integrate directly with Transit Gateway and it closely resembles how we deploy firewalls in a physical networking environment. FortiGate refers to it as L3 (NAT/route) mode. In this mode the firewall appliance can also influence network routing.
The second and third options are similar, except that they use different types of Transit Gateway attachments. The reason to integrate directly with Transit Gateway is so that the Transit Gateway itself can route traffic to the appliance for inspection. There is then no need for a Gateway Load Balancer, and thus no dependency on the firewall features supporting Geneve.
The second option builds a GRE (Generic Routing Encapsulation) tunnel over a Transit Gateway Connect attachment as the transport tunnel, and uses BGP to exchange routes between the Transit Gateway and the appliance. It treats the firewall instances as SD-WAN appliances and has a performance benefit over the VPN attachment. The third option uses a VPN attachment, whose main benefit is encryption when that is part of a compliance requirement.
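The trade-offs between these options can be condensed into a small decision helper. Treat it as a rough sketch of the reasoning in this post rather than vendor guidance: the function name, its three yes/no inputs, and the order of the checks are my own simplification, and a real design would weigh more factors.

```python
def suggest_deployment(features_work_over_geneve: bool,
                       integrate_with_tgw: bool,
                       encryption_required: bool) -> str:
    """Map three simplified requirements to one of the patterns discussed."""
    if features_work_over_geneve:
        # Every firewall feature we need (NAT included) works behind a GWLB.
        return "GWLB with distributed ingress/egress"
    if not integrate_with_tgw:
        # Deploy the appliance much like in a physical network.
        return "L3 (NAT/route) mode with multiple interfaces"
    if encryption_required:
        # Compliance asks for encryption between Transit Gateway and appliance.
        return "Transit Gateway VPN attachment"
    # GRE tunnel plus BGP, without the IPsec overhead.
    return "Transit Gateway Connect attachment"
```

For example, a client that needs NAT on the firewall, wants the Transit Gateway to steer traffic, and has no encryption mandate would call `suggest_deployment(False, True, False)` and land on the Connect attachment.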
Summary
In network infrastructure design, ingress and egress routing is among the most critical one-way door decisions. This decision must account for both routing and inspection. While there are many options, we usually start by capturing the key requirements. In this post we reviewed how to approach the requirements, a key technology (Gateway Load Balancer), and some firewall deployment patterns with FortiGate as an example. The approach is similar for other NGFW vendors, such as Palo Alto, Check Point or Cisco Secure Firewall.