There has been a lot of discussion about the Enterprise Stack and if any one vendor will ever own the whole thing. While that is a grand vision for any of the companies named below (and sure to bring applause from their shareholders on “analyst” day), I simply don’t see customers ever allowing it to become reality. Customers don’t want to be locked in. In my opinion, the virtual enterprise stack is what will win, but there will be lots of different vendor choices up and down that stack….and by the way, it probably won’t actually be hosted in the Enterprise DC (but that is another discussion).
For the first part of this year, customers are voting with their wallets and they are choosing…..drum roll…..everyone (see How Much Integration Is Too Much in the Cloud?). Unlike the Internet bubble burst back in 2000/2001, where companies like Cisco stole market and revenue share from the rest of the networking industry during the recovery, this recovery looks like it might be shaping up to be a little different. Back in 2001, the technology didn’t really evolve very much between the time of the burst and the point where the recovery really kicked in. Sure, it got faster and cheaper, but architectures fundamentally stayed the same (they just got some new bells and whistles). Blade servers started to emerge and networks saw a lot of movement from 100M to 1G at the access layer, but virtualization hadn’t really kicked in yet. Since November of 2008, customers have had a lot of time to reevaluate their entire IT stack AND a lot of new architectural solutions have emerged. Amazon’s EC2 and Rackspace’s Cloud Hosting have given customers direct access to more cost-effective data center resources that they can access on demand. Google Apps have given companies big and small complete business solutions (email, docs, spreadsheets, sites, etc.) that can be spun up and online the same hour the company opens its doors. VMware’s vNetwork Distributed Switch/Cisco’s Nexus 1000V, OpenFlow & Open vSwitch, HP’s Virtual Connect, Palo Alto Networks NG Enterprise Firewalls, Cisco’s Nexus 5000/2000 combination (foundational to Cisco UCS) and Arista’s 7×00 w/ vEOS are all examples of fundamentally new capabilities introduced since the downturn which customers can now leverage to harness their increasingly complex and highly virtualized data centers.
The point is that customers today are faced with much greater IT challenges than they were in 2008 and the technologies are dramatically different…not necessarily all of them better, but definitely different. And, there are a lot of new IT solutions warming up their engines to go out on track for the first time and see what types of lap times they can turn in. Should be fun!
Back in July, Edward L. Haletky (aka Texiwill on the VMware Communities site) hosted a Virtualization Security Round Table Podcast talking about the Cisco Nexus 1000V (one in a series of podcasts Edward hosts). During that podcast, Edward and I explored some of the security aspects of the Cisco Nexus 1000V, answered some common questions and cleared up some misconceptions about the product.
At the beginning of the podcast, Edward asked me to provide a quick intro for the Nexus 1000V. The Nexus 1000V is an advanced switching architecture for VMW vSphere, which allows the virtual network infrastructure to be managed consistently with the physical network infrastructure. The Nexus 1000V puts responsibility & control of the access layer (virtual access layer) in the hands of the network administrator, while maintaining the existing workflow for the server or virtualization administrator. The goal is to remove barriers to adoption for server virtualization by providing the operational and deployment tools necessary to virtualize 100% of the data center.
The Nexus 1000V is made up of 2 primary components and leverages the vSphere vNetwork switch API interfaces to integrate with the VMW product:
The first component is called a Virtual Ethernet Module (or VEM). The VEM is Cisco’s own developed vSwitch, which runs as a loadable kernel module in the vSphere hypervisor. The VEM runs at the same level in the hypervisor as VMW’s own vSwitch, but is a completely separate module – a different set of bits. You can have both the N1K VEM and a VMW vSwitch running at the same time in the same vSphere host, but each physical NIC needs to be assigned to one or the other. The VEM is supported in both ESX and ESXi implementations and can be installed manually (just like any other ESX patch/update) or automatically using VUM. Just like with the vSwitch, the VEM is the L2 data plane of the solution, but it gets its configuration from and shares its statistics with the 2nd component of the solution.
The 2nd component is called a Virtual Supervisor Module (or VSM). The VSM is a standalone instance of our Nexus Operating System (or NX-OS) running as a virtual machine that acts as the control and management plane for the solution. A single VSM (or a pair for HA deployments running Active/Standby) is used to manage and program up to 64 separate VEMs. This creates the logical equivalent of a large modular switch with 64 linecards. Only in the case of the N1KV, the system is all software.
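To make the control plane / data plane split concrete, here is a minimal Python sketch of one VSM programming its VEMs. It is only an illustration, not Cisco code: the class names, the `add_vem` and `push_config` methods, and the slot numbering are all hypothetical. The only detail taken from the description above is the limit of 64 VEMs per VSM.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

MAX_VEMS = 64  # from the text: one VSM (or HA pair) manages up to 64 VEMs


@dataclass
class VEM:
    """Data plane on one vSphere host (hypothetical model)."""
    host: str
    config: Dict[str, Any] = field(default_factory=dict)


@dataclass
class VSM:
    """Control and management plane that programs every attached VEM."""
    vems: Dict[int, VEM] = field(default_factory=dict)  # slot -> VEM

    def add_vem(self, host: str) -> int:
        """Attach a host as the next 'linecard'; slot numbers are illustrative."""
        if len(self.vems) >= MAX_VEMS:
            raise RuntimeError("VSM already manages the maximum number of VEMs")
        slot = len(self.vems) + 1
        self.vems[slot] = VEM(host)
        return slot

    def push_config(self, key: str, value: Any) -> None:
        """Send one setting to all VEMs, like linecards in a single chassis."""
        for vem in self.vems.values():
            vem.config[key] = value
```

For example, after `vsm = VSM()` and `vsm.add_vem("esx-01")`, a call to `vsm.push_config("mtu", 1500)` leaves the same setting on every attached VEM, which is the "one large modular switch" behavior described above.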
The VSM is accessible via telnet or SSH (with full AAA, TACACS+ and RADIUS support built in) just like any other NX-OS or IOS network device (SNMP is also available for device management, configuration and monitoring). Once logged in, the user is presented with a familiar network CLI, consistent with our physical network devices. The user can then use the CLI to configure the N1KV just like any other switch.
The VSM communicates with the VEMs over standards based Ethernet networks (L2 today, L3 in the near future); nothing special is required for this. VSM to VEM communication leverages an integrity check with a unique hash value for each message to secure the communication channel and protect against “man in the middle” attacks. This communication leverages different VLANs for internal management and control traffic, so restricting access to these VLANs is also recommended.
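The general idea of a per-message integrity check can be sketched in a few lines of Python. To be clear about the assumptions: the actual algorithm and message format used between the VSM and VEM are not described here, so the HMAC-SHA256 tag, the shared key, the 8-byte sequence number and the function names `seal` and `open_sealed` below are all stand-ins chosen for illustration.

```python
import hashlib
import hmac
from typing import Tuple

TAG_LEN = 32  # size of a SHA-256 digest in bytes
SEQ_LEN = 8   # hypothetical sequence number, makes every tag unique


def seal(key: bytes, seq: int, payload: bytes) -> bytes:
    """Prefix a sequence number and append a keyed hash over both."""
    header = seq.to_bytes(SEQ_LEN, "big")
    tag = hmac.new(key, header + payload, hashlib.sha256).digest()
    return header + payload + tag


def open_sealed(key: bytes, message: bytes) -> Tuple[int, bytes]:
    """Verify the tag and return (sequence number, payload)."""
    if len(message) < SEQ_LEN + TAG_LEN:
        raise ValueError("message too short")
    header = message[:SEQ_LEN]
    payload = message[SEQ_LEN:-TAG_LEN]
    tag = message[-TAG_LEN:]
    expected = hmac.new(key, header + payload, hashlib.sha256).digest()
    # compare_digest avoids leaking how many leading bytes matched
    if not hmac.compare_digest(tag, expected):
        raise ValueError("integrity check failed")
    return int.from_bytes(header, "big"), payload
```

A receiver that also rejects sequence numbers it has already seen would add replay protection on top of this. The sketch only shows why a message altered in transit fails verification.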
The VSM communicates with vCenter, the VMW management console, by leveraging a certificate based authentication mechanism. Once authenticated, vCenter can tell the VSM about what ESX hosts require network connectivity and the VSM can tell vCenter what network services (AKA Port Groups or Network Labels) are available on a particular cluster. We call these network services Port Profiles.
A VM has a VNIC, to which the VI admin assigns a Port Group (aka Network Label). With the Nexus 1000V, that Port Group is defined by a Port Profile which is a user configurable set of interface parameters. A Port Profile is defined on the Nexus 1000V VSM by the Network Admin.
A Port Profile is a collection of network configuration attributes (VLAN, security rule set – AKA L2/3/4 ACLs, QoS marking & classification, Rate Limiting, monitoring level, etc) that is defined at a global configuration level (as opposed to an interface configuration level) and then exposed by name to vCenter for the virtualization admin to use for a VNIC on a VM.
The Port Profile creation and application is a key part of our solution which allows not only the correct policy to be assigned to a VNIC when a VM is created, but once assigned, the profile and the associated network interface statistics are moved with the VM as a migration event takes place.
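As a rough mental model, a Port Profile can be thought of as a named, reusable bundle of settings that stays attached to the VNIC when the VM moves. The Python sketch below is an assumption-laden illustration, not the product's data model: the `PortProfile`, `Vnic` and `Host` classes, their fields and the `migrate` function are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class PortProfile:
    """Defined once by the network admin, exposed to vCenter by name."""
    name: str
    vlan: int
    acl: Tuple[str, ...] = ()   # names of ACLs to apply (illustrative)
    rate_limit_mbps: int = 0    # 0 means unlimited in this sketch


@dataclass
class Vnic:
    """A VM's virtual NIC with its assigned profile and counters."""
    vm: str
    profile: PortProfile
    stats: Dict[str, int] = field(default_factory=lambda: {"rx": 0, "tx": 0})


@dataclass
class Host:
    name: str
    ports: Dict[str, Vnic] = field(default_factory=dict)  # VM name -> VNIC


def migrate(vm: str, src: Host, dst: Host) -> None:
    """Move the VNIC object itself, so policy and statistics travel with it."""
    dst.ports[vm] = src.ports.pop(vm)
```

Because `migrate` moves the whole `Vnic` object rather than creating a new one on the destination host, the profile and the counters arrive unchanged, which mirrors the behavior described above for a migration event.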
For those who would like to see a better visual explanation of the Nexus 1000V components, you can check out the Nexus 1000V demonstration video:
After the introduction, Edward asked some questions about the product and asked me to clear up some confusion about how the product can be deployed (confusion caused by other vendors not happy they don’t yet have a solution that can compete with the Nexus 1000V). The first set of questions was about deployment scenarios:
Does the N1KV require a Nexus switch upstream to work?
NO!
Does the N1KV require a Cisco switch upstream to work?
NO! It works with any standards based upstream switch
It works at both 1G and 10G speeds
It works with any server/NIC on the VMware HW compatibility list
It works with various blade switch/network devices like HP Virtual Connect and Flex-10 (and adds advanced virtual networking awareness and security to both of those solutions)
The VSM can run on either ESX 3.5 or ESX 4.0
The VEM can only run on ESX/ESXi 4.0 with an Enterprise Plus license
What are the benefits of having a Nexus or Catalyst device upstream?
Consistent network operational model
Customers spend 10% of their money to deploy network infrastructure and 90% on operating it. Introducing tools that provide operational consistency for the virtual network infrastructure has a huge ROI. Cisco and VMW will be releasing an ROI calculator shortly that shows how customers can virtualize 30% more applications and save 30% in the process when they deploy the Nexus 1000V with vSphere Enterprise Plus.
Why should my company care about the virtual access layer?
Access layer makes up the volume of the data center ports, and those are moving to virtual ports
This is the single largest area of vulnerability in customer data centers today
Virtualization networking rules and governance/change control are much more relaxed than for physical infrastructure….how can that be secure?
Nexus 1000V brings consistency and operational safeguards to your virtual access layer
Next, Edward and I discussed many of the security focused aspects of the Nexus 1000V. Since security means different things to different people, the Nexus 1000V has to address a broad number of security concerns. Here is a look at some of the security related features with a description of what each one does:
Network Infrastructure Security:
Authentication, Authorization, Accounting with either RADIUS or TACACS+ support (authenticate user login, authorize them to do certain things, account for the things that they configure)
SSH for secure and encrypted access to the VSM
VM Security:
Port Profiles controlled by NW infrastructure security allow for specific and consistent network policy to be applied to VM VNIC interfaces…eliminate “fat finger” holes and inconsistencies.
ACLs (IP & MAC) allow L2, L3, L4 traffic to be filtered at the VM VNIC level, can be applied manually or via a Port Profile
Private VLANs control communication between hosts on the same L2 subnet (without ACLs), support PVLAN communities, isolated and promiscuous ports and save IP addresses (less sub-netting required).
Port Security controls which MAC(s) can be learned through a VETH port and disables the port if the MAC changes or the allowed number of MACs is exceeded.
DHCP Snooping acts like a firewall between untrusted hosts and trusted DHCP servers. DHCP snooping performs the following activities:
Validates DHCP messages received from untrusted sources and filters out invalid messages.
Builds and maintains the DHCP snooping binding database, which contains information about untrusted hosts with leased IP addresses.
Uses the DHCP snooping binding database to validate subsequent requests from untrusted hosts.
Dynamic ARP inspection (DAI) and IP Source Guard also use information stored in the DHCP snooping binding database.
IP Source Guard – IP Source Guard is a per-interface traffic filter that permits IP traffic only when the IP address and MAC address of each packet matches one of two sources of IP and MAC address bindings:
Entries in the Dynamic Host Configuration Protocol (DHCP) snooping binding table.
Static IP source entries that you configure.
Dynamic ARP Inspection (DAI) ensures that only valid ARP requests and responses are relayed. When DAI is enabled and properly configured, an NX-OS device performs these activities:
Intercepts all ARP requests and responses on untrusted ports.
Verifies that each of these intercepted packets has a valid IP-to-MAC address binding before updating the local ARP cache or before forwarding the packet to the appropriate destination.
Drops invalid ARP packets.
DAI can determine the validity of an ARP packet based on valid IP-to-MAC address bindings stored in a DHCP snooping binding database.
This database is built by DHCP snooping if DHCP snooping is enabled on the VLANs and on the device.
If the ARP packet is received on a trusted interface, the device forwards the packet without any checks.
On untrusted interfaces, the device forwards the packet only if it is valid.
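The way DHCP Snooping, IP Source Guard and DAI share one binding table can be sketched as follows. This is a simplified Python model under stated assumptions, not NX-OS behavior in detail: the `AccessSwitch` class and its method names are hypothetical, bindings are reduced to (IP, MAC) pairs per port, and VLANs and lease timers are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

Binding = Tuple[str, str]  # (IP address, MAC address)


@dataclass
class AccessSwitch:
    trusted_ports: Set[str] = field(default_factory=set)
    dhcp_bindings: Dict[str, Set[Binding]] = field(default_factory=dict)
    static_entries: Dict[str, Set[Binding]] = field(default_factory=dict)

    def learn_lease(self, port: str, ip: str, mac: str) -> None:
        """DHCP snooping: record a lease handed to a host on an untrusted port."""
        self.dhcp_bindings.setdefault(port, set()).add((ip, mac))

    def add_static(self, port: str, ip: str, mac: str) -> None:
        """Manually configured IP source entry."""
        self.static_entries.setdefault(port, set()).add((ip, mac))

    def permit_ip(self, port: str, src_ip: str, src_mac: str) -> bool:
        """IP Source Guard: per-port check against both binding sources."""
        pair = (src_ip, src_mac)
        return (pair in self.dhcp_bindings.get(port, set())
                or pair in self.static_entries.get(port, set()))

    def permit_arp(self, port: str, sender_ip: str, sender_mac: str) -> bool:
        """DAI: trusted ports skip the check, others need a snooped binding."""
        if port in self.trusted_ports:
            return True
        pair = (sender_ip, sender_mac)
        return any(pair in pairs for pairs in self.dhcp_bindings.values())
```

With a lease for 10.0.0.5 learned on `veth1`, an ARP reply from that VM claiming to be 10.0.0.1 is dropped, while the same claim arriving on a trusted uplink is forwarded without any check.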
For more information on how to leverage some of these features, check out Cisco’s DMZ Whitepaper discussing how to leverage vSphere and the Nexus 1000V to support DMZ environments.
Also, here is some additional Q/A that folks may find helpful:
How much of the VMware vSwitch does the Nexus use?
The N1KV is a Cisco developed vSwitch (all aspects). We do use the VMW vNetwork API (which we developed with VMW), but the actual data plane component in the ESX server is all Cisco.
Does the N1KV use the VMsafe APIs?
The VMsafe APIs are now called the vNetwork Appliance APIs according to some documentation I received on this from VMware. I had my engineering team check and can confirm that we are not using any of these APIs for the Nexus 1000V today. These are additional APIs that sit on top of the switch in the vmkernel. There is no overlap with these APIs and the vNetwork Distributed Switch APIs, which is what allows these to work with either the VMW vSwitch or the N1KV.
How does the Nexus 1000V play with VMsafe products?
Several security vendors with VMsafe implementations have demonstrated interoperability with the N1KV (Altor Networks & Reflex). The 2 solutions do not have any dependency on one another.
How is the security of the virtual appliance? Did we test it?
Yes, it stood up to everything we threw at it, similar to our Nexus switches.
Is the N1K susceptible to L2 attacks?
N1KV does not support STP, and is not susceptible to STP attacks. It does not forward BPDUs and cannot be used as a transit node by the physical infrastructure.
We have a software implementation of a CAM table. It is automatically programmed with VNIC MACs from the local VMs present on each VEM (just like the vSwitch) but it can also learn additional MACs from the VMs if that is required. With this in mind, we recommend using a feature called Port Security that limits a malicious VM from filling up a MAC table.
In its most basic form, the Port Security feature remembers the Ethernet MAC address connected to the switch port and allows only that MAC address to communicate on that port. If any other MAC address tries to communicate through the port, port security will disable the port. Network admins often configure the switch to send an SNMP trap to their network monitoring solution reporting that the port has been disabled for security reasons.
With the Port security feature, you can also set the maximum number of addresses which can be learned on each virtual Ethernet interface if you expect a particular type of VM to announce more than 1 MAC per VNIC. The feature forces the port to be disabled if it learns 1 more than that maximum number set on a given interface.
Port security and each of its options can be defined as part of a Port Profile and applied consistently to a VM VNIC each time a new VM is deployed.
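The Port Security behavior described above boils down to a small state machine per virtual Ethernet port. Here is a minimal Python sketch of that logic. The `SecurePort` class and its `frame_in` method are hypothetical names, and real implementations offer more options (other violation actions, aging, sticky MACs) that are not modeled here.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class SecurePort:
    max_macs: int = 1                      # basic form: a single MAC per port
    learned: Set[str] = field(default_factory=set)
    disabled: bool = False

    def frame_in(self, src_mac: str) -> bool:
        """Return True if a frame from src_mac is forwarded."""
        if self.disabled:
            return False
        if src_mac in self.learned:
            return True
        if len(self.learned) < self.max_macs:
            self.learned.add(src_mac)      # still room: learn the new MAC
            return True
        # One MAC more than the configured maximum: shut the port down.
        self.disabled = True
        return False
```

With `SecurePort(max_macs=2)`, the first two source MACs are learned and forwarded, and a third one disables the port, so a VM that floods random source MACs loses connectivity instead of filling up the MAC table.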
What are the Layer 3 attributes that can be leveraged to protect against attacks?
Access control lists (ACLs) are a commonly used network feature and can be used to define what L2, L3 and L4 traffic will or won’t be allowed to a particular VM VNIC. This feature can be used to control who can talk to a VM and who a VM can talk to.
Additional features discussed above that can be leveraged include Dynamic ARP Inspection, DHCP Snooping and IP Source Guard.
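For readers less familiar with ACLs, the filtering described in this answer can be sketched as a first-match rule list with an implicit deny at the end, which is how Cisco ACLs are evaluated. The Python below is only an illustration under assumptions: the `Rule` fields and the `evaluate` function are hypothetical, and it covers L3/L4 matching only (MAC ACLs are left out).

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    action: str                      # "permit" or "deny"
    src: str                         # source network in CIDR form
    dst: str                         # destination network in CIDR form
    protocol: Optional[str] = None   # None matches any protocol
    dst_port: Optional[int] = None   # None matches any port


def evaluate(rules: List[Rule], src_ip: str, dst_ip: str,
             protocol: str, dst_port: int) -> bool:
    """Return True if the packet is permitted; the first matching rule wins."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    for rule in rules:
        if (src in ipaddress.ip_network(rule.src)
                and dst in ipaddress.ip_network(rule.dst)
                and rule.protocol in (None, protocol)
                and rule.dst_port in (None, dst_port)):
            return rule.action == "permit"
    return False  # implicit deny when nothing matches
```

A rule list such as `[Rule("permit", "10.0.0.0/24", "10.0.1.10/32", "tcp", 443)]` lets the 10.0.0.0/24 subnet reach HTTPS on one VM and drops everything else sent to it, which is the "who can talk to a VM" control described above.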
When we started submitting abstracts for VMworld 2009 earlier this year, I didn’t really think it would involve more work than what we had put into the show last year when we announced the Cisco Nexus 1000V. Boy, was I wrong.
Just for the N1K alone (not to mention all the UCS related activities at the show), we need a small army of people to develop and support the N1K self-paced lab SPL23:
Han is working on developing an all new breakout presentation (TA2384) which will be run on Wednesday (3:30PM) and Thursday afternoon. You can watch a short intro for the presentation here:
We are also working on new content for a bunch of stuff going on in the Cisco booth including a preview of some new features of the Nexus 1000V (ask us about Virtual Service Domains – VSD, a very cool feature), detailed demos of the currently available product and lots of short presentations covering different technical and business aspects of the product including:
Introduction to the Nexus 1000V
Securing the Virtualized Data Center with the Nexus 1000V
My current project is the VN-Link Evolution story. I have been working on this project for so long that I was starting to forget details about how it got started, so I decided to go back to the beginning and recount how VN-Link and the Swordfish (Nexus 1000V) came to be.
Chapter 1: June 2006 – Two informal exploratory programs are merged
Chapter 2: January 2008 – Swordfish Proof of Concept Delivered (Sailfish)
Chapter 3: July 2008 – Swordfish Beta 1 Delivered
Chapter 4: August 2008 – Swordfish officially named the Nexus 1000V