All you need to know to connect your On-Premises with AWS

Hemant Rawat
13 min read · Jan 15, 2024


On-premises to Cloud (Image courtesy: Unsplash)

Abstract

With an increasing number of enterprises expressing interest in deploying their applications on public cloud platforms such as Amazon Web Services (AWS), it becomes imperative for enterprises to gain a comprehensive understanding of the journey that networking, traffic, and packet processing undergo as they traverse through the AWS public cloud. This article aims to furnish insights into the network connectivity to the AWS ecosystem, shedding light on the architecture and design principles underpinning AWS-based deployments.

1.0 Background

Some workloads come with stringent demands in terms of latency and jitter, and need specific capabilities such as seamless failover to ensure uninterrupted service. Below is a summary of the key requirements for enabling hybrid clouds:

Requirements:

1. Unified Routing Domain: Customers seek a unified routing domain built on industry-standard routing technologies, so that connectivity works regardless of whether a workload runs on-premises or in the cloud. This eliminates cloud-specific dependencies and complexities while providing a consistent network environment. Many customers have adopted Segment Routing over MPLS (SR-MPLS) on-premises and expect cloud support with Multiprotocol BGP (MP-BGP) spanning end to end.

2. Centralized Intelligent Network Controller: There’s a need for a centralized intelligent network controller that can orchestrate, automate, and enable services from a single management interface in a hybrid cloud environment. This controller should provide complete visibility into the network from end to end.

3. High-Performance Transport Fabric: Workloads require a cohesive, transparent, and high-performance transport fabric to facilitate seamless communication within the strict latency constraints imposed by applications.

4. DevOps Adoption: Customers aim to use DevOps methodologies for workloads to enable agile feature development and rollout. The ability to develop, test, and deploy features rapidly is crucial to harness the potential of next-generation networks.

5. Automation and Portability: Enterprises require automation and portability of their technology stack through independent continuous integration/continuous delivery (CI/CD) mechanisms. This ensures the flexibility to deploy applications and services in any desired environment, whether public or private cloud, centralized or edge locations.

6. Monitoring: Effective monitoring capabilities are essential to gain insights into network performance and ensure optimal service quality and availability.

Customers expect these capabilities to mirror those found in on-premises service provider transport networks, including granular service separation, advanced packet queuing, deterministic routing of data flows, rapid failure detection and failover in the transport network, the ability to establish and measure meaningful Service Level Agreements (SLAs), network-based security, and control over IP routing decisions. All these requirements must be met while delivering high performance and scalability on the cloud infrastructure.

Let’s explore the connectivity patterns that can be used to achieve these objectives:

2.0 AWS VPC and AWS Direct Connect

Let’s dive into the networking options available to customers for establishing connectivity between their on-premises networks and the AWS Cloud. It is assumed that most large enterprises require BGP-based connectivity with guaranteed bandwidth, so this discussion focuses primarily on Direct Connect-based connectivity patterns and omits in-depth exploration of alternatives such as AWS Site-to-Site VPN or Cloud WAN.

Amazon Virtual Private Cloud (Amazon VPC) serves as a logically isolated virtual network dedicated to your AWS account. Within a VPC, you can launch AWS resources and define your IP addressing scheme. This includes configuring subnet ranges, routing tables, network gateways, and security settings. VPC operates as a security boundary within AWS, ensuring that communication remains restricted exclusively to the resources under your deployment and management.

Amazon VPC lets you divide a VPC’s IP address space into subnets, each with its own CIDR range. IPv4 CIDR blocks can range in size from /16 to /28. Assigning distinct blocks to different segments gives you finer control over how external and internal traffic is routed.
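As a minimal sketch of this segmentation, the snippet below uses Python’s standard `ipaddress` module to carve a VPC CIDR into equal subnets while enforcing the /16 to /28 bounds mentioned above. The `10.0.0.0/16` block and the `carve_subnets` helper are hypothetical and chosen only for illustration.

```python
import ipaddress

# Hypothetical VPC CIDR, used for illustration only.
vpc = ipaddress.ip_network("10.0.0.0/16")


def carve_subnets(vpc_cidr, new_prefix):
    """Split a VPC CIDR into equal subnets, enforcing the /16 to /28 range."""
    if not 16 <= vpc_cidr.prefixlen <= 28:
        raise ValueError("VPC prefix must be between /16 and /28")
    if not 16 <= new_prefix <= 28:
        raise ValueError("subnet prefix must be between /16 and /28")
    # ipaddress raises ValueError itself if new_prefix is shorter than the VPC prefix.
    return list(vpc_cidr.subnets(new_prefix=new_prefix))


subnets = carve_subnets(vpc, 24)
print(len(subnets), "subnets, first:", subnets[0], "last:", subnets[-1])
```

Each resulting block could then be assigned to a separate tier (for example public, private, or management) with its own route table.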

The AWS Direct Connect service offers the most direct and efficient route to access AWS resources. Traffic over AWS Direct Connect remains within the AWS backbone, bypassing the public internet entirely. This provides a dedicated private network connection at 1 Gbps, 10 Gbps, or 100 Gbps, catering to the low-latency demands of applications.

To set up Direct Connect, customers must first select the nearest AWS Direct Connect location (AWS currently has over 100 Direct Connect locations worldwide). Using the AWS Console, customers can order a Direct Connect connection, specifying the desired bandwidth, the number of links, the number of locations, and their chosen colocation (CoLo) provider. Once the connection is established, you can configure one or more virtual interfaces to establish network connectivity.

Direct Connect facilitates the creation of logical virtual connections within the physical Direct Connect circuit, leveraging 802.1Q Layer 2 VLANs for this purpose. By using VLANs, Direct Connect ensures network isolation and enables the creation of virtual circuits to different network types. These logical virtual connections are subsequently associated with a virtual interface within the AWS environment.

There are three types of VIFs (Virtual Interfaces) available for customers to establish connectivity between their data centers and AWS Regions:

Figure 1: Three types of Virtual Interfaces

1. Private Virtual Interfaces (VIFs): Private VIFs are used to access Amazon VPCs using private IP addresses. Each VIF is associated with a distinct virtual local area network (VLAN) tag. This means customers can segment traffic effectively by utilizing one VIF per virtual routing and forwarding (VRF) in their data center. It’s worth noting that the number of VIFs may be subject to limitations depending on the type of Direct Connect connection in use.

2. Public Virtual Interface: A Public Virtual Interface enables access to all AWS public services using public IP addresses.

3. Transit Virtual Interface: A Transit Virtual Interface serves as a VLAN that facilitates the transport of traffic from a Direct Connect Gateway to one or more Transit Gateways.

Once the Layer 2 VLAN link is established, the subsequent step involves assigning IP addresses and establishing BGP connectivity.

When establishing one or more Direct Connect connections with AWS, it is strongly recommended to use Border Gateway Protocol (BGP) as the routing protocol. BGP is a path-vector protocol: it uses Autonomous System Numbers (ASNs) to build a view of the network topology from the prefixes exchanged between your router and the corresponding AWS gateway. Each advertised prefix carries the sequence of ASNs it has traversed (the AS_PATH), and that sequence describes the path used to reach the destination.

BGP Configuration:

For each virtual interface, you will need to specify a VLAN ID and BGP ASN. Optionally, customers have the flexibility to configure Router Peer IPs, MTU, and a BGP Authentication key. In cases where no values are entered, AWS will generate these fields automatically.

Sample Router BGP configuration:

Figure 2: Sample BGP Configuration
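To make the parameters above concrete, here is a rough sketch that validates a virtual interface’s settings and renders an IOS-style router snippet from them. It is not the configuration from the figure: the class, the function, the interface name, and every value (VLAN 101, ASNs 65000 and 64512, the 169.254.100.0/30 peering subnet, the key) are hypothetical, and the exact syntax depends on your router vendor.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class VifBgpConfig:
    vlan_id: int
    customer_asn: int
    amazon_asn: int
    customer_peer_ip: str  # address/prefix on the customer router
    amazon_peer_ip: str    # address/prefix on the AWS side
    auth_key: str

    def validate(self):
        # 802.1Q VLAN IDs are 12 bits; 0 and 4095 are reserved.
        if not 1 <= self.vlan_id <= 4094:
            raise ValueError("802.1Q VLAN ID must be in 1..4094")
        local = ipaddress.ip_interface(self.customer_peer_ip)
        remote = ipaddress.ip_interface(self.amazon_peer_ip)
        if local.network != remote.network:
            raise ValueError("peer IPs must be in the same subnet")


def render_router_config(cfg, parent_if="GigabitEthernet0/1"):
    """Render an illustrative IOS-style snippet (assumed syntax, not vendor-verified)."""
    cfg.validate()
    local = ipaddress.ip_interface(cfg.customer_peer_ip)
    remote = ipaddress.ip_interface(cfg.amazon_peer_ip)
    return "\n".join([
        f"interface {parent_if}.{cfg.vlan_id}",
        f" encapsulation dot1Q {cfg.vlan_id}",
        f" ip address {local.ip} {local.netmask}",
        f"router bgp {cfg.customer_asn}",
        f" neighbor {remote.ip} remote-as {cfg.amazon_asn}",
        f" neighbor {remote.ip} password {cfg.auth_key}",
    ])


example = VifBgpConfig(101, 65000, 64512, "169.254.100.2/30", "169.254.100.1/30", "example-key")
config_text = render_router_config(example)
print(config_text)
```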

After establishing BGP connectivity, customers have several ways to expose their Virtual Private Cloud (VPC) to the on-premises network:

1. Virtual Private Gateway (VGW): Customers can provision a VGW and associate it with the VPC. Subsequently, the VGW can be linked to a Private Virtual Interface to establish connectivity with the data center. VGW facilitates connection to a single VPC within the same AWS Region.

2. Direct Connect Gateway: Customers can use a Direct Connect Gateway to interconnect multiple VPCs spanning various AWS Regions. The use of Direct Connect Gateway is highly recommended.

3. Transit Gateway (TGW): VPCs can be connected to a Transit Gateway, which, in turn, can be attached to a Direct Connect Gateway.

To establish a BGP connection, your router and the corresponding AWS Gateway (VGW/DXGW) must be directly linked. All BGP neighbor connections must terminate at the VGW/DXGW. Without the successful establishment of this neighbor relationship, BGP updates will not be exchanged.

The VGW/DXGW receives routing information from your routers and utilizes the BGP best path selection algorithm to determine the preferred paths. The rules governing this algorithm in the context of VPCs include:

· Most Specific IP Prefix: Preference is given to the most specific IP prefix.

· Local Preference: A prefix with a higher local preference is favored.

· Shortest AS_PATH: Among matching prefixes, the one with the shortest AS_PATH is prioritized.

· MED (Metric): Lower MED metric values are preferred (optional).

· Origin Code: The route advertised via the “network” command takes precedence over the one advertised through “redistributed” commands.

In cases where all the above criteria are equal, Equal-Cost Multi-Pathing (ECMP) is used across the connections. Direct Connect also supports Bidirectional Forwarding Detection (BFD), a lightweight heartbeat mechanism, for fast fault detection.
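The selection order above can be sketched as a sort key. This is a simplified model that follows the order of the list in this article rather than any vendor’s full BGP decision process; the `Route` class, the link names, and the sample routes are hypothetical. The "network" versus "redistributed" distinction is modeled with the BGP origin codes `igp` and `incomplete`.

```python
import ipaddress
from dataclasses import dataclass

# Lower rank is preferred: routes from a "network" statement carry origin IGP,
# redistributed routes carry origin "incomplete".
ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass(frozen=True)
class Route:
    prefix: str
    local_pref: int
    as_path: tuple
    med: int
    origin: str
    via: str  # hypothetical label for the connection the route was learned on


def preference_key(route):
    """Smaller tuples win; fields follow the order of the list above."""
    return (
        -ipaddress.ip_network(route.prefix).prefixlen,  # most specific prefix
        -route.local_pref,                              # higher local preference
        len(route.as_path),                             # shorter AS_PATH
        route.med,                                      # lower MED
        ORIGIN_RANK[route.origin],                      # origin code
    )


def best_paths(routes, destination):
    """Return every route tied for best; more than one means ECMP."""
    dst = ipaddress.ip_address(destination)
    candidates = [r for r in routes if dst in ipaddress.ip_network(r.prefix)]
    if not candidates:
        return []
    best = min(preference_key(r) for r in candidates)
    return [r for r in candidates if preference_key(r) == best]


routes = [
    Route("10.0.0.0/8", 100, (65000,), 0, "igp", "dx-a"),
    Route("10.1.0.0/16", 100, (65000, 65000), 0, "igp", "dx-b"),
    Route("10.1.0.0/16", 100, (65000,), 0, "igp", "dx-c"),
    Route("10.1.0.0/16", 100, (65000,), 0, "igp", "dx-d"),
]
print([r.via for r in best_paths(routes, "10.1.2.3")])
```

In this sample, `dx-b` loses on AS_PATH length, while `dx-c` and `dx-d` tie on every criterion and would share traffic via ECMP.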

Customers often maintain multiple paths for redundancy when connecting to AWS. The most common ways to influence path selection in such scenarios include:

AS-PATH Prepending: Customers can advertise prefixes with varying lengths of AS paths to steer traffic within a region, with the shortest AS path being preferred.

Use of BGP Communities: Customers can tag advertised prefixes with local preference BGP communities to signal which path AWS should prefer:

· 7224:7100 — Low Preference: 780

· 7224:7200 — Medium Preference: 790

· 7224:7300 — High Preference: 800

When no community is attached, the default local preference is 790.

A Direct Connect Connection within the same home region as the source VPC is inherently favored by default. It’s essential to note that AS PATH alone may not suffice for cross-region connections; in such cases, local preference should be used.
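As an illustration of how the community tags translate into a path choice, the sketch below maps the three communities to the local preference values listed above and picks the link with the highest result. The link names are hypothetical, and rejecting a prefix that carries more than one of these tags is an assumption of this sketch.

```python
# Values taken from the list above.
LOCAL_PREF_BY_COMMUNITY = {
    "7224:7100": 780,  # low
    "7224:7200": 790,  # medium
    "7224:7300": 800,  # high
}
DEFAULT_LOCAL_PREF = 790


def local_pref_for(communities):
    """Return the local preference implied by a prefix's community tags."""
    tags = [c for c in communities if c in LOCAL_PREF_BY_COMMUNITY]
    if len(tags) > 1:
        # Assumption of this sketch: treat conflicting tags as a configuration error.
        raise ValueError("only one local preference community may be set")
    return LOCAL_PREF_BY_COMMUNITY[tags[0]] if tags else DEFAULT_LOCAL_PREF


# Hypothetical advertisements of the same prefix over two links.
advertisements = {
    "dx-primary": ["7224:7300"],
    "dx-backup": ["7224:7100"],
}
preferred = max(advertisements, key=lambda link: local_pref_for(advertisements[link]))
print(preferred)
```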

3.0 Network Segmentation

One critical requirement for customers is the segregation of network traffic. Let’s explore five methods to fulfill this requirement. It is recommended that the customer’s IP Networking team oversee AWS Region connectivity and use a Network Operations (NetOps) account, owned by the Networks team, to ensure robust security, compliance, and routing controls for multiple tenants and workloads.

3.1 Using Virtual Private Gateway (VGW)

The AWS Virtual Private Gateway (VGW) facilitates the connection of your VPC to an on-premises network. By attaching your VGW to your VPC on one end and utilizing a Private Virtual Interface (VIF) on the Direct Connect connection to advertise VPC routes using BGP, you can establish this connectivity.

Here’s how traffic from the data center VRF, headed for the application within a specific VPC (let’s call it RD1), is routed:

1. Initially, the traffic from the customer’s data center VRF is directed to the AWS Router located at a Colo (Colocation) facility.

2. The customer’s router uses the private VIF over the Direct Connect link to transmit this traffic to the Virtual Private Gateway linked to VPC RD1.

3. Based on the VPC Route Table, the traffic is then forwarded to its destination within VPC RD1.

Figure 3: VGW based connectivity

This sequence ensures the efficient routing of traffic from the data center to the designated VPC.

Note: The Virtual Private Gateway (VGW) does not support transitive routing, meaning it permits incoming packets destined for IPs within the VPC but does not facilitate routing between VPCs.
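The non-transitive behavior in the note can be pictured as a simple membership check: a packet arriving through the VGW is only deliverable if its destination lies inside the attached VPC’s CIDR. The function name and addresses below are hypothetical, and the check is a conceptual model rather than how AWS implements it.

```python
import ipaddress


def vgw_delivers(dst_ip, vpc_cidr):
    """True if the destination falls inside the VPC attached to the VGW."""
    return ipaddress.ip_address(dst_ip) in ipaddress.ip_network(vpc_cidr)


RD1_CIDR = "10.1.0.0/16"  # hypothetical CIDR for VPC RD1
print(vgw_delivers("10.1.20.5", RD1_CIDR))  # destination inside RD1
print(vgw_delivers("10.2.0.9", RD1_CIDR))   # destination in some other VPC
```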

3.2 Using Direct Connect Gateway (DX-GW)

The AWS Direct Connect Gateway (DXGW) simplifies connectivity with on-premises networks by allowing multiple Virtual Private Gateways (VGWs) to be attached to the same DXGW. Notably, a Direct Connect Gateway is a globally accessible resource, allowing connections to be established in any AWS Region worldwide. Importantly, the Direct Connect Gateway operates transparently and does not affect the path of data traffic. By associating multiple VGWs with a Direct Connect Gateway, it becomes possible to use a single Private Virtual Interface (VIF) for communication with the on-premises network, thereby eliminating the need for multiple BGP sessions.

Figure 4: DX-GW based connectivity

Please note that a single DXGW can accommodate associations with up to 10 VPCs/VGWs, and each Direct Connect (DX) connection can support the use of up to 50 DXGWs, resulting in a total capacity for connecting 500 VPCs. This connectivity approach offers a straightforward means of linking on-premises data centers with AWS, with all VPC CIDRs being consolidated into a single BGP Route Summary.

3.3 Using Transit Gateway (TGW)

The Transit Gateway (TGW) can be understood as a regional router, serving as a key component for interconnecting VPCs and facilitating connectivity with on-premises networks. Understanding how TGW operates involves grasping four key concepts:

1. TGW Attachment: Attachments are the interfaces through which packets enter and leave the TGW. When you create a VPC attachment, an Elastic Network Interface (ENI) is created within the subnet selected for that attachment. This TGW ENI serves as the conduit for ingress and egress traffic for the VPC. There are various types of attachments available, including VPC, Direct Connect Gateway (DXGW), TGW Peering, Connect, and VPN.

2. TGW Route Table (RT) Association: Each attachment can be associated with exactly one TGW Route Table, while a single Route Table can have multiple attachments associated with it.

3. TGW Propagation: Attachments can share their routes with a Route Table.

4. TGW Route Table (RT): This defines the criteria for routing packets:

a. Packets exiting an attachment will be directed to the RT associated with that attachment.

b. The next hop could potentially be another attachment.

c. A TGW can support a substantial 10,000 routes.

It’s important to note that TGW Routes are not automatically propagated to VPC Route Tables (unlike Virtual Private Gateways); they must be added manually.
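The relationship between attachments, associations, propagations, and route tables can be sketched with a small in-memory model. This is a conceptual toy, not the AWS API: the class, the attachment and table names, and the CIDRs are hypothetical, and longest-prefix matching is assumed for the lookup.

```python
import ipaddress


class TransitGatewayModel:
    def __init__(self):
        self.association = {}  # attachment -> the one route table it is associated with
        self.tables = {}       # route table -> {network: next-hop attachment}

    def associate(self, attachment, table):
        # Re-associating overwrites, so an attachment always has exactly one table.
        self.association[attachment] = table
        self.tables.setdefault(table, {})

    def propagate(self, attachment, cidrs, table):
        """Share the attachment's routes with a route table."""
        routes = self.tables.setdefault(table, {})
        for cidr in cidrs:
            routes[ipaddress.ip_network(cidr)] = attachment

    def next_hop(self, ingress_attachment, dst_ip):
        """Look up the destination in the table associated with the ingress attachment."""
        table = self.tables[self.association[ingress_attachment]]
        dst = ipaddress.ip_address(dst_ip)
        matches = [net for net in table if dst in net]
        if not matches:
            return None  # no route: the packet is dropped
        return table[max(matches, key=lambda net: net.prefixlen)]


tgw = TransitGatewayModel()
tgw.associate("vpc-a", "rt-spokes")
tgw.associate("dxgw", "rt-onprem")
tgw.propagate("dxgw", ["192.168.0.0/16"], "rt-spokes")  # on-premises routes visible to spokes
tgw.propagate("vpc-a", ["10.1.0.0/16"], "rt-onprem")    # VPC routes visible to on-premises
print(tgw.next_hop("vpc-a", "192.168.5.9"), tgw.next_hop("dxgw", "10.1.4.4"))
```

Keeping association (which table an attachment reads from) separate from propagation (which tables learn an attachment’s routes) is what makes it possible to isolate spokes from each other while still letting all of them reach on-premises.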

Here are some important considerations when working with TGWs:

i. TGW does not honor the local preference community advertised via Direct Connect.

ii. TGW still respects AS Path prepending, giving preference to prefixes with shorter AS Paths.

iii. If the on-premises network advertises the same prefix with an identical AS Path (every ASN in the path matches, including the DXGW’s ASN), TGW will perform Equal-Cost Multi-Pathing (ECMP) across two DXGW attachments.

iv. If the on-premises network advertises the same prefix with the same AS Path length but different ASNs, TGW will not use ECMP. Instead, the DXGW attachment with the routes learned first will be added to the TGW route table. The geographical region or location is not a factor in this determination.
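Considerations ii to iv can be expressed as a small decision function. This is a sketch of the behavior described above, not AWS code; the attachment names and ASNs are hypothetical, and falling back to the first-learned attachment whenever the shortest paths are not all identical is an assumption of this sketch.

```python
def tgw_select(paths):
    """Pick DXGW attachments for one prefix.

    `paths` is a list of (attachment, as_path) tuples in the order the
    routes were learned.
    """
    shortest = min(len(as_path) for _, as_path in paths)
    candidates = [(att, p) for att, p in paths if len(p) == shortest]
    first_attachment, first_path = candidates[0]
    if all(p == first_path for _, p in candidates):
        # Identical AS paths: ECMP across every candidate attachment.
        return [att for att, _ in candidates]
    # Same length but different ASNs: the route learned first wins.
    return [first_attachment]


same_asns = [("dxgw-1", (64512, 65000)), ("dxgw-2", (64512, 65000))]
different_asns = [("dxgw-1", (64512, 65000)), ("dxgw-2", (64513, 65001))]
prepended = [("dxgw-1", (64512, 65000, 65000)), ("dxgw-2", (64512, 65000))]
print(tgw_select(same_asns), tgw_select(different_asns), tgw_select(prepended))
```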

Figure 5: Transit Gateway connectivity

Note: TGW enables transitive routing, meaning that the destination IP address may fall outside the VPC CIDR range.

3.4 Using TGW Connect

Customers often need to advertise a multitude of IP prefixes to AWS, where each IP prefix may correspond to a specific application, potentially numbering in the thousands. The implementation of this requirement is streamlined with TGW Connect.

AWS Transit Gateway Connect attachments, established through GRE tunnels over the transit VIF (Virtual Interface), offer a means to segregate traffic effectively. This involves creating a dedicated Connect attachment for each Virtual Routing and Forwarding (VRF) instance. GRE (Generic Routing Encapsulation) is a protocol for encapsulating data packets, allowing packets of one network protocol to be carried inside packets of another.

A TGW Connect attachment leverages a transport TGW attachment, which can be an existing VPC or DXGW attachment, to establish a connection with a third-party appliance, thus linking it to the TGW. Notably, the TGW Connect attachment extends support for eBGP, iBGP, MP-BGP, and ECMP routing mechanisms, while static routes are not accommodated.
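To show what GRE encapsulation amounts to at the byte level, the sketch below prepends the 4-byte base GRE header (RFC 2784, no optional fields) to an inner packet and strips it again. It is a simplified illustration: the outer IP header (IP protocol 47) that would carry the GRE packet is not built here, and the payload is a placeholder rather than a real IP packet.

```python
import struct

GRE_PROTO_IPV4 = 0x0800  # EtherType of the encapsulated protocol


def gre_encapsulate(inner_packet, protocol_type=GRE_PROTO_IPV4):
    """Prepend a base GRE header: 16 bits of flags/version, 16 bits of protocol type."""
    # Flags and version are all zero: no checksum, version 0.
    header = struct.pack("!HH", 0, protocol_type)
    return header + inner_packet


def gre_decapsulate(frame):
    """Strip the base GRE header and return (protocol_type, inner_packet)."""
    flags_version, protocol_type = struct.unpack("!HH", frame[:4])
    if flags_version != 0:
        raise ValueError("optional GRE fields are not handled in this sketch")
    return protocol_type, frame[4:]


inner = b"placeholder-inner-packet"
frame = gre_encapsulate(inner)
print(frame[:4].hex(), gre_decapsulate(frame)[1] == inner)
```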

Figure 6: Transit Gateway Connect

3.5 Using Transit VPC

Customers with network segregation needs can also use a virtual router (VM or container-based) to achieve VRF-based network separation. It is recommended to deploy multiple instances of such a virtual router, configured in either an Active/Active or Active/Passive mode. Each workload then advertises its Service IP (VIP) via BGP to the virtual router, which in turn advertises it to the on-premises network.

Figure 7: Transit VPC

4.0 High Availability

Direct Connect High Availability Architecture

Location Costing:

Active/Active and Active/Passive routing designs are important considerations when setting up network connections to AWS, especially for scenarios where high availability and failover capabilities are crucial. Here’s a breakdown of the key strategies for each design:

Active/Active Routing Design:

· Local Preference BGP Communities: Local preference values can be assigned to BGP communities to influence the routing decisions within your network. In an Active/Active setup, you can assign equal local preference values to routes received from both AWS Direct Connect connections, ensuring that outbound traffic is evenly distributed across both connections.

· Equal AS-Path Lengths: To balance outbound traffic, you can advertise routes to AWS using equal AS-Path lengths. This approach ensures that AWS receives the routes with the same AS-Path length, making it treat both Direct Connect connections equally for outbound traffic.

· Per-Destination Routing: For redundancy and load balancing, you can implement per-destination routing. This means that for each destination network or prefix, you decide which Direct Connect connection to use. Half of your routes can be sent over one link, and the other half over the second link. This setup provides redundancy for non-primary destinations and ensures efficient load balancing.

Active/Passive Routing Design:

· Local Preference BGP Communities: Similar to Active/Active, local preference values can be used to influence routing decisions. In this case, you would designate one Direct Connect connection as the primary path by assigning it a higher local preference value. The secondary path would have a lower local preference value, making it the backup or passive path.

· AS PATH Prepend: AS-PATH prepend is a technique where you add your AS number multiple times to the AS-Path attribute of BGP routes when advertising them. For Active/Passive routing, you can prepend your AS number more times for the secondary path. This makes the primary path more attractive to incoming traffic and the secondary path less attractive, serving as a backup.

· Advertise More Specific Routes: To control traffic failover, you can advertise more specific routes for the primary path. More specific routes are preferred in BGP routing. In the event of a failure on the primary path, you withdraw the more specific routes, causing traffic to flow through the secondary path.

These strategies allow you to implement Active/Active or Active/Passive routing designs that suit your network’s requirements for high availability, load balancing, and failover. Properly configuring BGP and utilizing BGP communities can significantly enhance the control and flexibility of your network’s routing policies in AWS Direct Connect scenarios.
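Two of the strategies above lend themselves to a short sketch: AS-PATH prepending for Active/Passive, and per-destination routing for Active/Active. The helper names, ASN 65000, link names, and prefixes are hypothetical, and the round-robin split is just one possible way to divide prefixes between links.

```python
LOCAL_ASN = 65000  # hypothetical customer ASN


def prepend(as_path, local_asn, times):
    """Return the AS path with the local ASN prepended `times` extra times."""
    return (local_asn,) * times + tuple(as_path)


def split_per_destination(prefixes, links=("dx-1", "dx-2")):
    """Assign each prefix a primary link, alternating between the links."""
    plan = {link: [] for link in links}
    for index, prefix in enumerate(sorted(prefixes)):
        plan[links[index % len(links)]].append(prefix)
    return plan


# Active/Passive: the secondary advertisement is made less attractive.
advertised = {
    "dx-1": prepend((LOCAL_ASN,), LOCAL_ASN, 0),
    "dx-2": prepend((LOCAL_ASN,), LOCAL_ASN, 2),
}
primary = min(advertised, key=lambda link: len(advertised[link]))

# Active/Active: half of the prefixes prefer each link.
plan = split_per_destination(
    ["10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
)
print(primary, plan)
```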

5.0 Summary

AWS Direct Connect can be used to establish a dedicated and reliable network link between data centers and AWS services, facilitating hybrid cloud deployments and meeting specific performance and security requirements. It bypasses the public internet, offering more consistent network performance with lower latency and higher throughput. Direct Connect can also help reduce data transfer costs by providing more predictable and potentially lower-cost data transfer compared to internet-based connections.

Written by Hemant Rawat

Product Management & Solutions Engineering.