#CiscoLive
BRKDCN-2663
Network Designs for the Modern Data Centre
Kubernetes Infrastructure Connectivity for ACI
Camillo Rossi, Technical Marketing Engineer
© 2023 Cisco and/or its affiliates. All rights reserved. Cisco Public

Questions?
Use the Cisco Webex App to chat with the speaker after the session:
1. Find this session in the Cisco Live Mobile App
2. Click "Join the Discussion"
3. Install the Webex App or go directly to the Webex space
4. Enter messages/questions in the Webex space
Webex spaces will be moderated by the speaker until June 9, 2023.
https:/

Agenda
- Kubernetes Refresh
- Kubernetes Network Challenges
- ACI-CNI
- BGP Based Architecture
- Calico, Cilium and Kube-Router
- Automation and Visibility
- Which solution is right for me?
- Q&A

Kubernetes Refresh

Kubernetes - Pod
A pod is the scheduling unit in Kubernetes. It is a logical collection of one or more containers which are always scheduled together. The set of containers composed together in a pod share an IP.

root@k8s-01-p1# kubectl get pod --namespace=kube-system
NAME                                        READY  STATUS   RESTARTS  AGE
aci-containers-controller-1201600828-qsw5g  1/1    Running  1         69d
aci-containers-host-lt9kl                   3/3    Running  0         72d
aci-containers-host-xnwkr                   3/3    Running  0         58d
aci-containers-openvswitch-0rjbw            1/1    Running  0         58d
aci-containers-openvswitch-7j1h5            1/1    Running  0         72d

Kubernetes Deployment
Deployments are a collection of pods providing the same service. You describe the desired state in a Deployment object, and the Deployment controller changes the actual state to the desired state at a controlled rate for you. For example, you can create a Deployment that declares you need to have 2 copies of your front-end pod.

root@k8s-01-p1# kubectl get deployment --namespace=kube-system
NAME                       DESIRED  CURRENT  UP-TO-DATE  AVAILABLE  AGE
aci-containers-controller  1        1        1           1          72d

Kubernetes Services
A service tells the rest of the Kubernetes environment (including other pods and Deployments) what services your application provides. While pods come and go, the service IP addresses and ports remain the same. Kubernetes automatically load-balances across the replicas in the Deployment that you expose through a Service. Other applications can find your service through Kubernetes service discovery: every time a service is created, a DNS entry is added to kube-dns.

root@k8s-01-p1# kubectl get svc --namespace=kube-system
NAME      CLUSTER-IP  EXTERNAL-IP  PORT(S)        AGE
kube-dns  11.96.0.10               53/UDP,53/TCP  72d

Kubernetes External Services
If there are external IPs that route to one or more cluster nodes, Kubernetes services can be exposed on those external IPs. Traffic that ingresses into the cluster with the external IP (as destination IP), on the service port, will be routed to one of the service endpoints. External IPs are not managed by Kubernetes and are the responsibility of the cluster administrator.

root@k8s-01-p1# kubectl get svc front-end --namespace=guest-book
NAME       CLUSTER-IP  EXTERNAL-IP  PORT(S)       AGE
front-end  11.96.0.33  11.3.0.2     80:30002/TCP  3m

Kubernetes - Annotations
Similar to labels, but NOT used to identify and select objects.

root@k8s-01-p1# kubectl describe node k8s-01-p1 | more
Name:        k8s-01-p1
Role:
Labels:      beta.kubernetes.io/arch=amd64
             beta.kubernetes.io/os=linux
             kubernetes.io/hostname=k8s-01-p1
             node-role.kubernetes.io/master=
Annotations: node.alpha.kubernetes.io/ttl=
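The objects in the refresher above fit together in a single manifest. A minimal sketch (the names, labels and image are illustrative, not taken from the session): a Deployment declaring 2 copies of a front-end pod, and a Service giving them a stable IP and a kube-dns entry:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: front-end
spec:
  replicas: 2                # desired state: 2 copies of the front-end pod
  selector:
    matchLabels:
      app: front-end
  template:
    metadata:
      labels:
        app: front-end
    spec:
      containers:
      - name: web
        image: nginx:1.25
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: front-end            # service creation adds a "front-end" DNS entry to kube-dns
spec:
  selector:
    app: front-end           # load-balances across the Deployment's replicas
  ports:
  - port: 80
    targetPort: 80
```

While individual pods are rescheduled and change IP, clients keep using the Service's stable ClusterIP and DNS name.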
Kubernetes Namespace
Groups everything together: Pods, Deployments, Volumes, Services, etc.

All Together: A K8S Cluster
[Figure: a Namespace spanning Node1, Node2 ... NodeN; each node runs application containers in pods (pod1, pod2 ... podn) belonging to Deployment1, exposed through a Service at 1.1.1.1:80. A node can be part of several Namespaces.]

Kubernetes Network Challenges
Segmentation
Secure K8s infrastructure: network isolation for kube-system and other infrastructure-related objects (i.e. heapster, hawkular, etc.). Network isolation between namespaces.

[Figure: Frontend-EPG, API-Gateway-EPG, Backend-EPG and Monitoring-EPG, each containing PODs, with policies applied between the EPGs]

Communications outside of the Cluster
Non-cluster endpoints communicating with the cluster:
- Exposing external services - how? NodePort? LoadBalancer? Scaling out ingress controllers?
Cluster endpoints communicating with non-cluster endpoints:
- POD access to external services and endpoints
- Cluster access to shared resources like storage

ACI-CNI

ACI-CNI Solution Overview

Why ACI-CNI for Application Container Platforms
Fast, easy, secure and scalable networking for your Application Container Platform:
- Visibility: live statistics in APIC per container, and health metrics
- Hardware-accelerated: integrated load balancing and source NAT
- Enhanced multitenancy and unified networking for containers, VMs and bare metal
- Flexible policy: native platform policy API and ACI policies
- Turnkey solution for node and container connectivity

Cisco ACI CNI plugin features
- IP Address Management for Pods and Services
- Distributed routing and switching, with integrated VXLAN overlays implemented fabric-wide and on Open vSwitch
- Distributed firewall for implementing Network Policies
- EPG-level segmentation for K8s objects using annotations
- Consolidated visibility of K8s networking via VMM integration

[Figure: Kubernetes Network Policy and ACI policies pushed to each node over OpFlex and rendered into OVS]

ACI-CNI Configuration

Kubernetes nodes will require the following interfaces:
- InfraVLAN sub-interface, over which we build the OpFlex channel
- Node IP sub-interface, used for the Kubernetes API host IP address
- (Optional) OOB management sub-interface or physical interface, used optionally for OOB access

acc-provision configuration file (1)

# The following resources must already exist on the APIC;
# they are used, but not created, by the provisioning tool.
aep: ACI_AttEntityP          # The AEP for ports/VPCs used by this cluster
vrf:                         # The VRF can be placed in the same Tenant or in common
  name: vrf1
  tenant: KubeSpray          # This can be the system-id or common
l3out:
  name: l3out                # Used to provision external IPs
  external_networks:
  - default_extepg           # Default Ext EPG, used for PBR redirection
aci_config:
  system_id: Kubernetes      # Tenant Name and Controller Domain Name
  apic_hosts:                # List of APIC hosts to connect to for the APIC API
  - 10.67.185.102
  vmm_domain:                # Kubernetes VMM domain configuration
    encap_type: vxlan        # Encap mode: vxlan or vlan
    mcast_range:             # mcast range for BUM replication
      start: 225.22.1.1
      end: 225.22.255.255
    mcast_fabric: 225.1.2.4
    nested_inside:           # (OPTIONAL) If running k8s nodes as VMs, specify the VMM type and name
      type: vmware
      name: ACI

acc-provision configuration file (2)

# Networks used by Kubernetes
net_config:
  node_subnet: 10.32.0.1/16      # Subnet to use for nodes
  pod_subnet: 10.33.0.1/16       # Subnet to use for Kubernetes Pods
  extern_dynamic: 10.34.0.1/24   # Subnet to use for dynamic external IPs
  extern_static: 10.35.0.1/24    # Subnet to use for static external IPs
  node_svc_subnet: 10.36.0.1/24  # Subnet to use for the service graph
  kubeapi_vlan: 4011             # The VLAN used by the nodes for node-to-API communications
  service_vlan: 4013             # The VLAN used by LoadBalancer services
  infra_vlan: 3456               # The ACI infra VLAN used to establish the OpFlex tunnel with the leaf

acc-provision
ACI Container Controller Provision:
- Takes a YAML file containing the parameters of your configuration
- Generates and pushes most of the ACI config
- Generates the Kubernetes ACI CNI container configuration

acc-provision --flavor=kubernetes-1.25 -a -u admin -p pass -c config.yml -o cni_conf.yml

- --flavor selects whether we are deploying Kubernetes 1.x or OpenShift 3.x
- -u / -p: APIC user and password
- -c: configuration file
- -o: output file for the ACI CNI config

acc-provision will now create:

EPGs for nodes and Pods
Within the selected tenant, the provisioning tool creates a Kubernetes Application Profile with three EPGs:
- one for the node interfaces
- one for the system PODs
- a default EPG for all containers in any namespace

BDs and Contracts
A minimum set of contracts to ensure basic cluster functionality and security.

L4-L7 Devices
A Service Graph Template and its L4-L7 devices, dynamically updated as nodes are added to or removed from the K8s cluster.

ACI CNI Plugin components (For your reference)

ACI Containers Controller (ACC, aci-containers-controller)
A Kubernetes Deployment running one POD instance. It:
- Handles IPAM
- Manages endpoint state
- Manages SNAT policies
- Maps policies (annotations)
- Controls load balancing
- Pushes configurations into the APIC

root@dom-master1# oc get deployments --namespace=aci-containers-system
NAME                       DESIRED  CURRENT  UP-TO-DATE  AVAILABLE  AGE
aci-containers-controller  1        1        1           1          223d

ACI Containers Host (ACH, aci-containers-host)
A DaemonSet composed of 3 containers running on every node:
- mcast-daemon: handles broadcast, unknown-unicast and multicast replication
- aci-containers-host: endpoint metadata, Pod IPAM, container interface configuration
- opflex-agent: support for stateful security groups, manages the OVS configuration, renders policy into OpenFlow rules to program OVS, handles load-balancer services

root@dom-master1# oc get daemonsets --namespace=aci-containers-system | grep host
NAME                 DESIRED  CURRENT  READY  UP-TO-DATE  AVAILABLE  NODE SELECTOR  AGE
aci-containers-host  2        2        2      2           2                         223d

ACI Containers Openvswitch (ACO, aci-containers-openvswitch)
A DaemonSet running on every node. It is the Open vSwitch enforcing the required networking and security policies provisioned through the OpFlex agent.

root@dom-master1# oc get daemonsets --namespace=aci-containers-system | grep vswitch
NAME                        DESIRED  CURRENT  READY  UP-TO-DATE  AVAILABLE  NODE SELECTOR  AGE
aci-containers-openvswitch  2        2        2      2           2                         223d
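The EPG-level segmentation mentioned above is driven by annotations on Kubernetes objects. As a sketch (the annotation key and JSON shape follow the ACI-CNI documentation at the time of writing; the tenant, app-profile and EPG names here are hypothetical, so verify against your plugin release):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
  annotations:
    # Hypothetical tenant/app-profile/EPG names; this annotation is how the
    # ACI-CNI plugin maps a K8s object into a specific ACI EPG.
    opflex.cisco.com/endpoint-group: '{"tenant":"Kubernetes","app-profile":"kubernetes","name":"frontend-epg"}'
spec:
  replicas: 2
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
      - name: web
        image: nginx:1.25
        ports:
        - containerPort: 80
```

All pods of this Deployment are then classified into frontend-epg, and traffic to and from them is governed by that EPG's contracts in addition to any Kubernetes Network Policies.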
ACI-CNI: K8s Security Model

Support for Network Policy in ACI
A NetworkPolicy is a specification of how selections of pods are allowed to communicate with each other and with other network endpoints:
- Network namespace isolation using defined labels
- Directional: allowed ingress pod-to-pod traffic
- Filters traffic from pods in other projects
- Can specify protocol and ports (e.g. tcp/80)

Policy applied to namespace: namespace-a

kind: NetworkPolicy
apiVersion: extensions/v1beta1
metadata:
  name: allow-red-to-blue-same-ns
spec:
  podSelector:
    matchLabels:
      type: blue
  ingress:
  - from:
    - podSelector:
        matchLabels:
          type: red

Mapping Network Policy and EPGs
Key map: NetworkPolicy -> Contract.
- Cluster isolation: a single EPG for the entire cluster (default behavior); no need for any internal contracts.
- Namespace isolation: each namespace is mapped to its own EPG; contracts govern inter-namespace traffic.
- Deployment isolation: each deployment is mapped to an EPG; contracts tightly control service traffic.

Dual-level Policy Enforcement by ACI
Containers are mapped to EPGs, and contracts between EPGs are enforced on all switches in the fabric where applicable. Both Kubernetes Network Policy and ACI contracts are also enforced in the Linux kernel of every server node that containers run on. Both policy mechanisms can be used in conjunction.

Native API - default deny all traffic:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress

Exposing Services

Automated Load Balancing
Create a service of type "LoadBalancer" (as per the K8s standard). The ACI CNI will:
- Allocate an external IP from a user-defined subnet
- Deploy a Service Graph with PBR redirection to load-balance the traffic between any K8s nodes that have PODs for the exposed service

cisco@k8s-01:~/demo/guestbook1$ kubectl --namespace=guestbook get svc frontend
NAME      CLUSTER-IP   EXTERNAL-IP  PORT(S)       AGE
frontend  10.37.0.124  10.34.0.5    80:32677/TCP  5h

(The EXTERNAL-IP is drawn from the extern_dynamic subnet defined in the acc-provision config file.)

Service Graphs and PBR
Every time a service is exposed, the ACI CNI controller deploys:
- An External EPG with a /32 match for the Service IP (Svc_x_ExtEPG, 10.34.0.5/32)
- A new contract between the svc_ExtEPG and the default_ExtEPG (the latter is defined in the acc-provision config file)
- A Service Graph with PBR redirection containing every node where an exposed POD is running

[Figure: L3Out with default_ExtEpg 0.0.0.0/0 (consumer) and Svc_x_ExtEPG 10.34.0.5/32 (provider), joined by a contract with a PBR Service Graph; an external router and client reach Node1..NodeN, each running OVS and PODs]

Service Graphs and PBR - packet walk (For your reference)
1. The client (192.168.1.100) sends a request to 10.34.0.5. ACI performs a policy lookup (longest prefix match) on the source IP and classifies the traffic into default_extEPG.
2. ACI performs a lookup for 10.34.0.5. The IP does not exist in the fabric, so it would normally be routed out; however, LPM classification matches it in Svc_x_ExtEPG, activating the contract between the two EPGs.
3. PBR redirection is triggered, and the traffic is load-balanced by the fabric to one of the nodes.
4. The K8s node is not expecting any traffic directed to the external service IP, so OVS performs NAT as required (DIP 10.34.0.5 -> PodX IP).
5. If there are multiple PODs on a single node, OVS performs a second-stage LB to distribute the load between the PODs running on that node.
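The trigger for all of this machinery is an ordinary LoadBalancer Service. A minimal sketch (namespace, labels and ports are illustrative): no ACI-specific fields are needed, because the ACI CNI controller reacts to the object and provisions the ExtEPG, contract and PBR service graph on the APIC:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: frontend
  namespace: guestbook
spec:
  type: LoadBalancer          # standard K8s type; the ACI CNI allocates the external IP
  selector:
    app: guestbook
    tier: frontend            # the PBR graph targets nodes running these pods
  ports:
  - port: 80                  # external service port
    targetPort: 80            # container port on the pods
```

After creation, kubectl get svc frontend shows an EXTERNAL-IP drawn from the extern_dynamic subnet configured via acc-provision.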
Service Graphs and PBR - packet walk, return traffic (For your reference)
1. PodX replies to the client (SIP PodX IP, DIP 192.168.1.100).
2. OVS restores the original external Service IP as the source (SIP 10.34.0.5).
3. PBR redirection is not triggered, since the source EPG is the shadow EPG of the PBR node.
4. The traffic is routed back to the client (and is permitted by the contract).

POD SNAT

POD Networking - Recap
During the ACI CNI installation, a Bridge Domain with a dedicated subnet is created for your POD networking. Every POD created in the Kubernetes cluster is assigned an IP address from the POD subnet. In most cases this is an advantage over other CNI implementations:
- The POD IP/subnet is equivalent to any other IP/subnet in ACI
- The POD subnet can communicate directly with any other subnet directly connected to ACI
- The POD subnet can communicate directly with any subnet outside of ACI via a standard L3OUT
Your PODs are first-class citizens in ACI!

Some challenges

Challenge 1: External firewall configuration
The POD IP is ephemeral:
- It is not possible to predict which IP address a POD will be assigned
- It is not possible to manually assign an IP to a POD
This is standard Kubernetes behavior, but it does not work well with firewalls:
- It is not possible to configure firewall ACLs based on POD IPs, as a POD IP can change at any time
- The same Kubernetes cluster can host multiple applications, so it is not possible to use the POD subnet as a security boundary

Challenge 2: POD subnet routing
The POD subnet is, most likely, a private subnet. In certain scenarios a POD might need to communicate with an external environment (i.e. the internet), and the POD IP address needs to be NATted.

ACI CNI SNAT to the rescue!
POD-initiated traffic can be NATted to an IP address selected by the user:
- SNAT IP: a single IP or a range
- Ability to apply a SNAT policy at different levels:
  - Cluster level: connections initiated by any POD in any namespace are NATted to the selected SNAT IP
  - Namespace: connections initiated by any POD in the selected namespaces are NATted to the selected SNAT IP
  - Deployment: connections initiated by any POD in the selected Deployment are NATted to the selected SNAT IP
  - LoadBalanced Service: connections initiated by any POD mapped to an external Service of type LoadBalancer are NATted to the external Service IP

ACI CNI SNAT - Design Overview (For your reference)

ACI CNI SNAT architecture overview
- Kubernetes SNAT configuration is done via Custom Resource Definitions (CRD)
- Connection tracking and source NAT are performed by Open vSwitch (OVS), distributed across all the cluster nodes
- The ACI CNI's OpFlex agent programs the OVS flows (SNAT rules) on each host
- The ACI CNI controller programs an ACI Service Graph to send return traffic back to the cluster (more on this later)
- SNAT is applied only to traffic exiting the cluster via an L3OUT; not supported inside the same

ACI CNI SNAT IP allocation
SNAT IPs are allocated per scope:
- Kubernetes cluster
- Namespace
- Deployment
- Service (re-using the external Service IP)
A "scope" can exist on multiple Kubernetes nodes, and the same SNAT IP can be used on multiple nodes. This requires some clever thinking for the return traffic (DST IP = SNAT-IP1, SRC IP = server): which was the originating POD, and to which node should the fabric send the traffic?

[Figure: Node1 (POD1), Node2 (POD2, POD3) and Node3 (POD4) all using SNAT-IP1]

ACI CNI SNAT IP allocation (cont.)
To solve this, a unique range of TCP and UDP ports for the SNAT IP is allocated to each node. This allows the ACI CNI to know which node is hosting the POD that initiated the connection: for a packet destined to SNAT-IP1:PortA, port A falls in the range A-to-B assigned to Node1, so the traffic goes to Node1.

The port range is user-configurable. By default:
- A SNAT IP uses ports in the range 5000 to 65000
- The per-node SNAT-IP port range size is 3000 ports
- With 3000 ports per node, one SNAT IP can support up to (65000 - 5000) / 3000 = 20 nodes in a cluster
- If multiple IPs are allocated to the same SNAT policy, a new SNAT IP is allocated once the port range (5000 to 65000) of the current SNAT IP is exhausted
It is important to size this correctly: if you expect a single SnatPolicy to have more than 3000 POD-initiated connections per node, you should increase this range and allocate a sufficient number of SNAT IPs in the SnatPolicy.
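A SnatPolicy is expressed as a custom resource. As a sketch (the API group/version and field names follow the ACI-CNI documentation at the time of writing, and the IP, namespace and labels are illustrative — verify the schema against your plugin release):

```yaml
apiVersion: aci.snat/v1
kind: SnatPolicy
metadata:
  name: snat-for-guestbook
spec:
  snatIp:
  - 10.20.30.1            # SNAT IP (or range); must be reachable via the L3OUT
  selector:
    namespace: guestbook  # namespace-level scope
    labels:
      app: guestbook      # optionally narrow the scope to a deployment's pods
```

Omitting the selector would make the policy cluster-wide; adding labels narrows it toward the deployment level described above.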
Data Path - egress traffic (For your reference)
When traffic is initiated from a POD with SNAT enabled, the source NATting happens on the node on which the POD resides. The following happens:
1. Verify that the ACI contracts allow the egress traffic.
2. If the contract allows it, and SNAT is configured for the POD, determine whether the destination is cluster-local or not.
3. If it belongs to the cluster, the traffic follows the normal east-west data path and no SNAT is applied.
4. If the destination does not belong to the cluster, the OVS SNAT flow rules map the POD's Source-IP:Port to a SNAT-IP:SNAT-Port on the host on which that POD runs. The SNAT port is chosen dynamically by OVS from the port range allocated to that node for the SNAT IP.
5. The packet is forwarded from the node on the service VLAN and makes its way through the ACI fabric to the external network.

Data Path - return traffic (For your reference)
- The return traffic uses the SNAT IP as its destination IP.
- Since multiple nodes could map to the same SNAT IP, an ACI service graph with PBR redirection is leveraged to send the return packet back to the cluster.
- A new external EPG called _snat_svcgraph is created and configured with the subnets used for SNAT. The only exception is when the external Service IP is used as the SNAT IP: in that case the same PBR policy/ExtEPG/contract of the exposed service is also used for the return SNAT traffic.
- A new contract called _snat_svcgraph is created and configured as a provider of your default external EPG and as a consumer of the _snat_svcgraph external EPG.
- Note: the contract scope is "Global", and the ExtEPG subnet is configured as "Shared Security Import Subnet" to allow route leaking across VRFs. This is non-configurable; a future ACI CNI release will allow the user to override this behavior.

Data Path - return traffic, cont. (For your reference)
The ACI PBR redirection feature uses, as its load-balancing mechanism, a hash based on source IP, destination IP and protocol type. The return traffic therefore may or may not land on the node where the source POD resides. To handle this:
- If the destination POD resides on the node, OVS connection-tracking rules translate SNAT-IP:SNAT-Port back to the POD's Source-IP:Port.
- If the destination POD resides on a different node, OVS rules bounce the traffic to the actual destination node by replacing the destination MAC address with that of the correct node.

Note: ICMP is not supported (For your reference)
Since ICMP has no ports, it is not possible to distinguish between different PODs. ICMP (ping) is hence not supported.

ACI CNI SNAT Packet Walk (For your reference)

Service Graphs for SNAT
The first time a SNAT policy is configured, the aci-containers-controller will:
1. Create a _snat_svcgraph external EPG (e.g. matching 10.20.30.1/32)
2. Add the SNAT IP under the _snat_svcgraph ExtEPG
3. Create a new _snat_svcgraph contract between the _snat_svcgraph ExtEPG and the default_ExtEPG
4. Create a Service Graph with PBR redirection containing every node of the cluster
5. Allocate a port range to each node for the SNAT IP (e.g. 10.20.30.1 ports A-to-B for Node1, C-to-D for Node2)
Service Graphs for SNAT - packet walk, egress (For your reference)
1. POD1 initiates a flow to 1.1.1.1.
2. OVS checks that the ACI contracts / K8s policies allow the communication.
3. OVS allocates a SNAT-IP:Port from the node's range (let's assume it picks A as the source port).
4. OVS NATs the POD IP to the SNAT IP and tracks the mapping between POD-IP:Port and SNAT-IP:SNAT-Port (Pod1:Port1 -> 10.20.30.1:PortA).
5. Normal ACI routing takes place (SIP 10.20.30.1:PortA, DIP 1.1.1.1).

Service Graphs for SNAT - packet walk, ingress (For your reference)
1. The server sends a reply to 10.20.30.1. A policy lookup on the SIP (1.1.1.1) classifies the traffic into default_extEPG.
2. The destination IP 10.20.30.1 matches the _snat_svcgraph ExtEPG, resulting in the contract between these groups being applied.
3. PBR redirection is triggered, and the traffic is load-balanced by the fabric to one of the nodes.
4. The destination POD might not be on the node selected by the PBR hashing algorithm. If the POD does not reside on that node, OVS rules bounce the traffic to the actual destination node by rewriting the destination MAC address to point to the POD's node (e.g. Node2 sees that PortA belongs to Node1's range and redirects to Node1).
5. Once the traffic reaches the destination node, OVS connection-tracking rules translate SNAT-IP:SNAT-Port back to the POD's Source-IP:Port (10.20.30.1:PortA -> Pod1:Port1).

ACI CNI Demo
96、Network ChallengesACI-CNIBGP Based ArchitectureBGP Based ArchitectureCalico,Cilium and Kube-RouterAutomation and Visibility Which solution is right for me?Q&A65BRKDCN-2663BGP Based Architecture 2023 Cisco and/or its affiliates.All rights reserved.Cisco Public#CiscoLiveBGP Based Integration Benefits
97、why?1.1.Relies on a wellRelies on a well-established protocol(BGP)established protocol(BGP)2.2.Unified networkingUnified networking:Node,Pod and Service endpoints are accessible from an L3OUT providing easy connectivity across and outside the fabric3.3.(Limited)ACI Security:(Limited)ACI Security:abi
98、lity to use external EPG classification to secure communications to Node/Pod/Service Subnets(no/32 granularity)4.4.High performanceHigh performance:low-latency connectivity without egress routers if no Overlay are used5.5.HardwareHardware-assisted load balancingassisted load balancing:ECMP up to 64
paths/Nodes
6. Any Hypervisor/Bare Metal: allows mixing form factors together

Architecture and Configuration

Architecture
- Each K8s node will peer with a pair of border leaves
- Single AS for the whole cluster
- Simpler ACI config (a subnet can be used for passive peering)
- The CNI advertises all the K8s subnets to ACI, as well as host routes for exposed services, leveraging ECMP for load balancing

L3OUT Design
- K8s nodes are connected to an L3OUT via vPC
- External EPGs can be used to classify the traffic coming from the cluster
- Floating L3OUT: VM mobility; ability to mix bare metal and VMs running on any hypervisor

ACI Best Practice: Peer to the local ToR
If some K8s nodes are connected to Anchor leaves and some to Non-Anchor leaves, and they advertise the same service IP, only the nodes connected to the Anchor leaves are selected as valid next hops. This happens because the next-hop cost is higher for K8s nodes connected to Non-Anchor leaves. We are working to address this in an upcoming ACI release.
[Diagram: K8s-1 and K8s-2 both advertise SVC 1.1.1.1; Leaf-1 shows "show ip route 1.1.1.1 → *via K8s-1", while Leaf-4 shows "show ip route 1.1.1.1 → *via Leaf-1"]

ACI BGP Tuning
- AS Override and Disable Peer AS Check: to support a single AS per cluster without Route Reflectors or a full mesh inside the cluster
- BGP Graceful Restart
- BGP timers tuned to 1s/3s for quick eBGP node-down detection
- Relaxed AS-path policy, to allow installing more than one ECMP path for the same route
- Max BGP ECMP paths increased to 64 for better load balancing

ACI BGP Hardening (Optional)
- Enable BGP password authentication
- Set the maximum AS limit to one
- Configure BGP import route control to accept only the expected subnets from the Kubernetes cluster: pod subnet(s), node subnet(s), service subnet(s)
- Set a limit on the number of prefixes received from the nodes

Expected Routing Behaviour
- Node, pod and service IP subnets will be advertised to the ACI fabric
- Every K8s node is allocated one or more subnets from the pod supernet; each of these subnets is advertised to ACI as well
- Exposed services will be advertised to ACI as host routes from every node that has a running pod associated with the service
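To make the last point concrete, here is a minimal sketch of a Service whose external IP would be advertised as a /32 host route from each node running one of its pods; the name, selector and IP below are hypothetical, not from the session.

```yaml
# Illustrative only: name, selector and externalIP are hypothetical.
apiVersion: v1
kind: Service
metadata:
  name: frontend
spec:
  selector:
    app: frontend
  ports:
    - port: 80
      targetPort: 8080
  externalIPs:
    - 192.168.5.10   # advertised to ACI as 192.168.5.10/32 by every node
                     # hosting a frontend pod, so the fabric can ECMP across them
```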
Calico, Cilium and Kube-Router

What is Calico?
A Kubernetes CNI plugin.

Calico
[Diagram: two K8s nodes, each running BIRD and Felix; pods attach to the Linux kernel via calicX interfaces, with iptables and routes programmed on the node, and eth0/eth1 uplinks to the ToRs]
- BIRD: a routing daemon responsible for peering with other K8s nodes and exchanging pod-network and service-network routes for inter-node communication.
- Felix: runs in the same pod as BIRD; it programs routes and ACLs (iptables) and anything else required on a Calico node to provide connectivity for the pods scheduled on that node.

Calico BGP Config
- One or more IPPools with all overlays disabled
- A BGPConfiguration with: nodeToNodeMeshEnabled set to "false"; the list of serviceClusterIPs and serviceExternalIPs subnets, to enable host-route advertisement for those subnets
- A BGPPeer to define the BGP peer the K8s nodes connect to
- A Secret, Role and RoleBinding to pass the BGP password to the Calico BGP process

Calico IPPool Config

apiVersion: crd.projectcalico.org/v1
kind: IPPool
metadata:
  name: default-ipv4-ippool
spec:
  blockSize: 26          # how to split the pod subnet between nodes
  cidr: 192.168.3.0/24   # the pod subnet
  ipipMode: Never        # disable IP-in-IP
  nodeSelector: all()    # allocate this subnet to all the nodes
  vxlanMode: Never       # disable the VXLAN overlay
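With blockSize 26, each node draws /26 blocks of 64 pod addresses from the pool, so the 192.168.3.0/24 cidr above yields four such blocks. The same mechanism also allows more than one IPPool; as a sketch (the CIDR and label below are assumptions, not from the session), a pool can be restricted to the nodes of one rack:

```yaml
# Hypothetical per-rack pool; overlays stay disabled as in the default pool.
apiVersion: crd.projectcalico.org/v1
kind: IPPool
metadata:
  name: rack1-ipv4-ippool
spec:
  blockSize: 26
  cidr: 192.168.8.0/24           # hypothetical pod subnet for rack 1
  ipipMode: Never
  vxlanMode: Never
  nodeSelector: rack_id == '1'   # only nodes labelled rack_id=1 draw from this pool
```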
Calico BGPConfiguration

apiVersion: crd.projectcalico.org/v1
kind: BGPConfiguration
metadata:
  name: default
spec:
  asNumber: 65003               # K8s cluster BGP AS
  listenPort: 179               # BGP port
  logSeverityScreen: Info
  nodeToNodeMeshEnabled: false  # disable iBGP full-mesh peering
  serviceClusterIPs:            # allow Calico to advertise the cluster
    - cidr: 192.168.4.0/24      # and external service subnets
  serviceExternalIPs:
    - cidr: 192.168.5.0/24

How do I peer with the "local" ToRs?
- Use a node label to identify the location of the K8s node, for example the rack ID
- Configure the BGPPeer resource with a nodeSelector matching the label of the K8s nodes
- The result is that peering happens only between K8s nodes and leaves with a matching rack ID

[Diagram: K8s-node-1 (rack_id=1) peers with Leaf-11 in Rack 1; K8s-node-2 (rack_id=2) peers with Leaf-21 in Rack 2]

apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: "11"
spec:
  peerIP: a.b.c.d
  asNumber: 65002
  nodeSelector: rack_id == '1'

apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: "21"
spec:
  peerIP: a.b.c.d
  asNumber: 65002
  nodeSelector: rack_id == '2'
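The Secret, Role and RoleBinding used to pass the BGP password could look roughly like this; the names, namespace and password are assumptions, and the pattern follows Calico's password.secretKeyRef mechanism on the BGPPeer:

```yaml
# Hypothetical names/namespace: calico-node reads the password via RBAC.
apiVersion: v1
kind: Secret
metadata:
  name: bgp-secrets
  namespace: calico-system
type: Opaque
stringData:
  leaf-password: supersecret            # hypothetical BGP password
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: bgp-secret-access
  namespace: calico-system
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["bgp-secrets"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: bgp-secret-access
  namespace: calico-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: bgp-secret-access
subjects:
  - kind: ServiceAccount
    name: calico-node
    namespace: calico-system
```

A BGPPeer can then reference the key with `password: {secretKeyRef: {name: bgp-secrets, key: leaf-password}}` under its spec.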
What is Kube-Router?
A Kubernetes CNI plugin.

Kube-Router
- Kube-router is built around the concept of watchers and controllers
- GoBGP, an open-source BGP implementation designed from scratch, runs inside the kube-router pod
- It injects learned routes into the local node routing table

Kube-Router eBGP Config
Most of the configuration is applied at the Kube-Router DaemonSet level; for our design we need the following options:

-run-router=true
-run-firewall=true
-run-service-proxy=true
-bgp-graceful-restart=true
-bgp-holdtime=3s
-kubeconfig=/var/lib/kube-router/kubeconfig
-cluster-asn=
-advertise-external-ip
-advertise-loadbalancer-ip
-advertise-pod-cidr=true
-enable-ibgp=false
-enable-overlay=false
-enable-pod-egress=false
-override-nexthop=true

How do I peer with the "local" ToRs?
Kube-Router expects the K8s nodes to be annotated with the peer (leaf) IP, AS and password, so we simply need to annotate the K8s nodes accordingly, for example:

kube-router.io/peer.ips=
kube-router.io/peer.asns=
kube-router.io/peer.passwords=

Annotating any node with the above config will result in that node peering with "leaf-1" and "leaf-2".

BGP CNI Demo
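As a sketch of the annotation scheme described above, with hypothetical leaf IPs, AS numbers and passwords (one comma-separated entry per peer; the assumption here, worth verifying against the kube-router docs, is that the passwords annotation carries base64-encoded strings):

```yaml
# Hypothetical values: this node would peer with two leaves.
apiVersion: v1
kind: Node
metadata:
  name: k8s-node-1
  annotations:
    kube-router.io/peer.ips: "10.0.0.1,10.0.0.2"        # leaf-1, leaf-2
    kube-router.io/peer.asns: "65002,65002"
    kube-router.io/peer.passwords: "c2VjcmV0,c2VjcmV0"  # base64 of "secret"
```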
Automation and Visibility

Automation and Visibility Demo

Which solution is right for me?
This is a hard question: all the supported CNI plugins are enterprise grade and provide a rich feature set. Some questions you might ask yourself:
- Is running the same CNI plugin on ANY network infrastructure important?
- Is the K8s team already using a specific CNI?
- Are the advanced ACI-CNI features (ability to place pods into EPGs, SNAT, PBR-based load balancing) important?
- Is having a single vendor for networking-stack support important?

CNI Comparison

|                        | ACI CNI                                 | Calico                  | Kube-Router                     | Cilium                        |
| Support                | TAC                                     | Open source / paid      | Open source                     | Open source / paid            |
| Network infra          | ACI only                                | Any                     | Any                             | Any                           |
| Network config         | Automated                               | Automated               | Automated                       | Automated                     |
| Linux OS support       | Ubuntu/CoreOS                           | Any                     | Any                             | Any                           |
| Service load balancing | ACI PBR                                 | BGP ECMP                | BGP ECMP                        | BGP ECMP                      |
| Pod SNAT               | NAT pools                               | NAT to node IP          | NAT to node IP                  | NAT to node IP                |
| Security model         | ACI policy model + K8s network policies | Calico network policies | K8s network policies            | Extended K8s network policies |
| End-to-end visibility  | Yes                                     | Via open-source tools   | Via open-source tools           | Hubble                        |
| Data plane             | OVS                                     | Linux or eBPF           | Linux + IPVS for load balancing | eBPF                          |