I have been a little quiet here for the past 2 months since VMworld in San Francisco in September. The Nexus 1000V team has been very busy this past quarter working with customers (we have added over 400 new customers in this period) and preparing for our next release, which is going to post to Cisco CCO in the first part of December. I am happy to report that HP got its facts straight about how the Nexus 1000V really does work with their Virtual Connect and Flex-10 solutions (it might have been the Cisco video we posted showing the solutions working together that helped things along – see below).
Also, I have been surprised that there has not been much noise following all the announcements made at VMworld 2009 about an “open” vswitch or any of the management veneers that promised to make a standard VMware vSwitch support all of the features of the Nexus 1000V. Maybe Santa will bring us a gift and deliver some specific product details this holiday season so we can understand what these solutions really can or can’t do.
Back in July, Edward L. Haletky (aka Texiwill on the VMware Communities site) hosted a Virtualization Security Round Table Podcast talking about the Cisco Nexus 1000V (one in a series of podcasts Edward hosts). During that podcast, Edward and I explored some of the security aspects of the Cisco Nexus 1000V, answered some common questions and cleared up some misconceptions about the product.
At the beginning of the podcast, Edward asked me to provide a quick intro for the Nexus 1000V. The Nexus 1000V is an advanced switching architecture for VMW vSphere, which allows the virtual network infrastructure to be managed consistently with the physical network infrastructure. The Nexus 1000V puts responsibility & control of the access layer (virtual access layer) in the hands of the network administrator, while maintaining the existing workflow for the server or virtualization administrator. The goal is to remove barriers to adoption for server virtualization by providing the operational and deployment tools necessary to virtualize 100% of the data center.
The Nexus 1000V is made up of 2 primary components and leverages the vSphere vNetwork switch API interfaces to integrate with the VMW product:
The first component is called a Virtual Ethernet Module (or VEM). The VEM is Cisco’s own developed vSwitch, which runs as a loadable kernel module in the vSphere hypervisor. The VEM runs at the same level in the hypervisor as VMW’s own vSwitch, but is a completely separate module – a different set of bits. You can have both the N1K VEM and a VMW vSwitch running at the same time in the same vSphere host, but a physical NIC needs to be assigned to one or the other. The VEM is supported in both ESX and ESXi implementations and can be installed manually (just like any other ESX patch/update) or automatically using VUM. Just like with the vSwitch, the VEM is the L2 data plane of the solution, but it gets its configuration from and shares its statistics with the 2nd component of the solution.
The 2nd component is called a Virtual Supervisor Module (or VSM). The VSM is a standalone instance of our Nexus Operating System (or NX-OS) running as a virtual machine that acts as the control and management plane for the solution. A single VSM (or a pair for HA deployments running Active/Standby) is used to manage and program up to 64 separate VEMs. This creates the logical equivalent of a large modular switch with 64 linecards. Only in the case of the N1KV, the system is all software.
The VSM is accessible via telnet or SSH (with full AAA support, including TACACS+ and RADIUS, built in) just like any other NX-OS or IOS network device (SNMP is also available for device management, configuration and monitoring). Once logged in, the user is presented with a familiar network CLI, consistent with our physical network devices. The user can then use the CLI to configure the N1KV just like any other switch.
The VSM communicates with the VEMs over standards-based Ethernet networks (L2 today, L3 in the near future); nothing special is required for this. VSM to VEM communication leverages an integrity check with a unique hash value for each message to secure the communication channel and protect against “man in the middle” attacks. This communication uses separate VLANs for internal management and control traffic, so restricting access to these VLANs is also recommended.
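To make the per-message integrity check a little more concrete, here is a minimal Python sketch. It tags each message with a keyed hash and rejects anything whose tag does not verify. The choice of HMAC-SHA256, the shared key, the sequence number and the function names are all my assumptions for illustration; nothing here describes the actual algorithm or message format used between the VSM and the VEM.

```python
import hashlib
import hmac

# Hypothetical shared secret, for illustration only.
SHARED_KEY = b"example-key-not-from-the-product"
TAG_LEN = 32  # size of a SHA-256 digest in bytes
SEQ_LEN = 8   # bytes used for the (assumed) sequence number


def sign_message(seq: int, payload: bytes, key: bytes = SHARED_KEY) -> bytes:
    """Return sequence number + payload + keyed hash over both."""
    header = seq.to_bytes(SEQ_LEN, "big")
    tag = hmac.new(key, header + payload, hashlib.sha256).digest()
    return header + payload + tag


def verify_message(message: bytes, key: bytes = SHARED_KEY):
    """Return (seq, payload) if the tag verifies, otherwise None."""
    if len(message) < SEQ_LEN + TAG_LEN:
        return None
    header = message[:SEQ_LEN]
    payload = message[SEQ_LEN:-TAG_LEN]
    tag = message[-TAG_LEN:]
    expected = hmac.new(key, header + payload, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(tag, expected):
        return None
    return int.from_bytes(header, "big"), payload
```

Because the sequence number is covered by the hash, every message carries a different tag, and a message altered in transit fails verification instead of being applied.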
The VSM communicates with vCenter, the VMW management console, by leveraging a certificate based authentication mechanism. Once authenticated, vCenter can tell the VSM about what ESX hosts require network connectivity and the VSM can tell vCenter what network services (AKA Port Groups or Network Labels) are available on a particular cluster. We call these network services Port Profiles.
A VM has a VNIC, to which the VI admin assigns a Port Group (aka Network Label). With the Nexus 1000V, that Port Group is defined by a Port Profile which is a user configurable set of interface parameters. A Port Profile is defined on the Nexus 1000V VSM by the Network Admin.
A Port Profile is a collection of network configuration attributes (VLAN, security rule set – AKA L2/3/4 ACLs, QoS marking & classification, Rate Limiting, monitoring level, etc) that is defined at a global configuration level (as opposed to an interface configuration level) and then exposed by name to vCenter for the virtualization admin to use for a VNIC on a VM.
The Port Profile creation and application is a key part of our solution which allows not only the correct policy to be assigned to a VNIC when a VM is created, but once assigned, the profile and the associated network interface statistics are moved with the VM as a migration event takes place.
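For readers who think better in code, here is a small Python sketch of the Port Profile idea. It is only a toy model under my own assumptions: the class names, fields and methods (`PortProfile`, `Veth`, `Vsm`, `attach`, `migrate`) are hypothetical and are not the product’s API or CLI. It shows a profile defined once by the network admin, exposed by name, attached to a VNIC, and then following the VM (with its counters) when the VM moves to another host.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortProfile:
    """Network policy defined once, globally, by the network admin."""
    name: str
    vlan: int
    acl: str = ""             # name of an ACL, empty means none
    rate_limit_mbps: int = 0  # 0 means no rate limit


@dataclass
class Veth:
    """A virtual Ethernet interface bound to one VM VNIC."""
    vm: str
    vnic: str
    profile: PortProfile
    host: str
    rx_packets: int = 0


class Vsm:
    """Toy control plane: owns profiles and the interfaces that use them."""

    def __init__(self):
        self.profiles = {}
        self.veths = {}

    def define_profile(self, profile: PortProfile) -> None:
        self.profiles[profile.name] = profile

    def port_groups(self):
        """Names the virtualization admin would see as Port Groups."""
        return sorted(self.profiles)

    def attach(self, vm: str, vnic: str, profile_name: str, host: str) -> Veth:
        veth = Veth(vm, vnic, self.profiles[profile_name], host)
        self.veths[(vm, vnic)] = veth
        return veth

    def migrate(self, vm: str, vnic: str, new_host: str) -> Veth:
        """Only the host changes; policy and statistics stay with the VNIC."""
        veth = self.veths[(vm, vnic)]
        veth.host = new_host
        return veth
```

A short usage example: define a `web` profile on VLAN 100, attach it to `vm1`/`eth0` on one host, then call `migrate` and note that the profile and `rx_packets` counter are unchanged on the new host.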
For those who would like to see a better visual explanation of the Nexus 1000V components, you can check out the Nexus 1000V demonstration video:
After the introduction, Edward asked some questions about the product and asked me to clear up some confusion about how the product can be deployed (confusion caused by other vendors not happy they don’t yet have a solution that can compete with the Nexus 1000V). The first set of questions was about deployment scenarios:
Does the N1KV require a Nexus switch upstream to work?
NO!
Does the N1KV require a Cisco switch upstream to work?
NO! It works with any standards based upstream switch
It works at both 1G and 10G speeds
It works with any server/NIC on the VMware HW compatibility list
It works with various blade switch/network devices like HP Virtual Connect and Flex-10 (and adds advanced virtual networking awareness and security to both of those solutions)
The VSM can run on either ESX 3.5 or ESX 4.0
The VEM can only run on ESX/ESXi 4.0 with an Enterprise Plus license
What are the benefits of having a Nexus or Catalyst device upstream?
Consistent network operational model
Customers spend 10% of their money to deploy network infrastructure and 90% on operating it. Introducing tools that provide operational consistency for the virtual network infrastructure has a huge ROI. Cisco and VMW will be releasing an ROI calculator shortly that shows how customers can virtualize 30% more applications and save 30% in the process when they deploy the Nexus 1000V with vSphere Enterprise Plus.
Why should my company care about the virtual access layer?
Access layer makes up the volume of the data center ports, and those are moving to virtual ports
This is the single largest area of vulnerability in customer data centers today
Virtualization networking rules and governance/change control are much more relaxed than for physical infrastructure….how can that be secure?
Nexus 1000V brings consistency and operational safeguards to your virtual access layer
Next, Edward and I discussed many of the security focused aspects of the Nexus 1000V. Since security means different things to different people, the Nexus 1000V has to address a broad number of security concerns. Here is a look at some of the security related features with a description of what each one does:
Network Infrastructure Security:
Authentication, Authorization, Accounting with either RADIUS or TACACS+ support (authenticate user login, authorize them to do certain things, account for the things that they configure)
SSH for secure and encrypted access to the VSM
VM Security:
Port Profiles controlled by NW infrastructure security allow for specific and consistent network policy to be applied to VM VNIC interfaces…eliminate “fat finger” holes and inconsistencies.
ACLs (IP & MAC) allow L2, L3, L4 traffic to be filtered at the VM VNIC level, can be applied manually or via a Port Profile
Private VLANs control communication between hosts on the same L2 subnet (without ACLs), support PVLAN communities, isolated and promiscuous ports and save IP addresses (less sub-netting required).
Port Security controls what MAC(s) can be learned through a VETH port and disables the port if the MAC changes or the configured maximum number of MACs is exceeded.
DHCP Snooping acts like a firewall between untrusted hosts and trusted DHCP servers. DHCP snooping performs the following activities:
Validates DHCP messages received from untrusted sources and filters out invalid messages.
Builds and maintains the DHCP snooping binding database, which contains information about untrusted hosts with leased IP addresses.
Uses the DHCP snooping binding database to validate subsequent requests from untrusted hosts.
Dynamic ARP inspection (DAI) and IP Source Guard also use information stored in the DHCP snooping binding database.
IP Source Guard – IP Source Guard is a per-interface traffic filter that permits IP traffic only when the IP address and MAC address of each packet match one of two sources of IP and MAC address bindings:
Entries in the Dynamic Host Configuration Protocol (DHCP) snooping binding table.
Static IP source entries that you configure.
Dynamic ARP Inspection (DAI) ensures that only valid ARP requests and responses are relayed. When DAI is enabled and properly configured, an NX-OS device performs these activities:
Intercepts all ARP requests and responses on untrusted ports.
Verifies that each of these intercepted packets has a valid IP-to-MAC address binding before updating the local ARP cache or before forwarding the packet to the appropriate destination.
Drops invalid ARP packets.
DAI can determine the validity of an ARP packet based on valid IP-to-MAC address bindings stored in a DHCP snooping binding database.
This database is built by DHCP snooping if DHCP snooping is enabled on the VLANs and on the device.
If the ARP packet is received on a trusted interface, the device forwards the packet without any checks.
On untrusted interfaces, the device forwards the packet only if it is valid.
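The way DHCP snooping, IP Source Guard and DAI share one binding table can be sketched in a few lines of Python. This is a simplified model, not the NX-OS implementation: the class and function names are hypothetical, the DHCP validation is reduced to one rule (server-originated messages are dropped on untrusted ports), and lease timers are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binding:
    ip: str
    mac: str
    vlan: int
    port: str


class BindingTable:
    """IP-to-MAC bindings learned by DHCP snooping or configured statically."""

    def __init__(self):
        self._by_key = {}  # (vlan, ip) -> Binding

    def learn_from_dhcp_ack(self, ip, mac, vlan, port):
        self._by_key[(vlan, ip)] = Binding(ip, mac.lower(), vlan, port)

    # In this sketch a static entry is stored the same way as a learned one.
    add_static = learn_from_dhcp_ack

    def lookup(self, vlan, ip):
        return self._by_key.get((vlan, ip))


SERVER_MESSAGES = {"DHCPOFFER", "DHCPACK", "DHCPNAK"}


def dhcp_message_allowed(msg_type: str, port_trusted: bool) -> bool:
    """Drop DHCP server messages that arrive on untrusted ports."""
    return port_trusted or msg_type not in SERVER_MESSAGES


def ip_source_guard_permits(table, port, vlan, src_ip, src_mac) -> bool:
    """Permit IP traffic only if (IP, MAC) matches a binding on this port."""
    b = table.lookup(vlan, src_ip)
    return b is not None and b.mac == src_mac.lower() and b.port == port


def dai_permits(table, port_trusted, vlan, sender_ip, sender_mac) -> bool:
    """Trusted ports bypass checks; untrusted ports need a valid binding."""
    if port_trusted:
        return True
    b = table.lookup(vlan, sender_ip)
    return b is not None and b.mac == sender_mac.lower()
```

The point of the sketch is the dependency: DHCP snooping populates the table, and the other two features are just lookups against it, which is why they rely on snooping (or static entries) being in place first.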
For more information on how to leverage some of these features, check out Cisco’s DMZ Whitepaper discussing how to leverage vSphere and the Nexus 1000V to support DMZ environments.
Also, here is some additional Q/A that folks may find helpful:
How much of the VMware vSwitch does the Nexus use?
The N1KV is a Cisco developed vSwitch (all aspects). We do use the VMW vNetwork API (which we developed with VMW), but the actual data plane component in the ESX server is all Cisco.
Does the N1KV use the VMsafe APIs?
The VMsafe APIs are now called the vNetwork Appliance APIs according to some documentation I received on this from VMware. I had my engineering team check and can confirm that we are not using any of these APIs for the Nexus 1000V today. These are additional APIs that sit on top of the switch in the vmkernel. There is no overlap with these APIs and the vNetwork Distributed Switch APIs, which is what allows these to work with either the VMW vSwitch or the N1KV.
How does the Nexus 1000V play with VMsafe products?
Several security vendors with VMsafe implementations have demonstrated interoperability with the N1KV (Altor Networks & Reflex). The 2 solutions do not have any dependency on one another.
How is the security of the virtual appliance? Did we test it?
Yes, it stood up to everything we threw at it, similar to our Nexus switches.
Is the N1K susceptible to L2 attacks?
N1KV does not support STP, and is not susceptible to STP attacks. It does not forward BPDUs and cannot be used as a transit node by the physical infrastructure.
We have a software implementation of a CAM table. It is automatically programmed with VNIC MACs from the local VMs present on each VEM (just like the vSwitch) but it can also learn additional MACs from the VMs if that is required. With this in mind, we recommend using a feature called Port Security that limits a malicious VM from filling up a MAC table.
In its most basic form, the Port Security feature remembers the Ethernet MAC address connected to the switch port and allows only that MAC address to communicate on that port. If any other MAC address tries to communicate through the port, port security will disable the port. Network admins often configure the switch to send an SNMP trap to their network monitoring solution reporting that the port has been disabled for security reasons.
With the Port security feature, you can also set the maximum number of addresses which can be learned on each virtual Ethernet interface if you expect a particular type of VM to announce more than 1 MAC per VNIC. The feature forces the port to be disabled if it learns 1 more than that maximum number set on a given interface.
Port security and each of its options can be defined as part of a Port Profile and applied consistently to a VM VNIC each time a new VM is deployed.
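The Port Security behavior described above (learn up to a maximum number of MACs, then disable the port on the first extra one) is simple enough to sketch in Python. The class and method names are hypothetical and the model is deliberately minimal: it ignores aging, sticky addresses and SNMP traps.

```python
class PortSecurity:
    """Toy model of port security on one virtual Ethernet interface."""

    def __init__(self, max_macs: int = 1):
        self.max_macs = max_macs
        self.learned = set()
        self.disabled = False

    def frame_allowed(self, src_mac: str) -> bool:
        """Return True if a frame from src_mac may pass through the port."""
        if self.disabled:
            return False
        mac = src_mac.lower()
        if mac in self.learned:
            return True
        if len(self.learned) < self.max_macs:
            self.learned.add(mac)  # still room: learn the new address
            return True
        # One MAC more than the configured maximum: shut the port down.
        self.disabled = True
        return False
```

With `max_macs=2`, the first two source MACs are learned and allowed; a third one disables the port, after which even the previously learned MACs are blocked until an admin re-enables it.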
What are the Layer 3 attributes that can be leveraged to protect against attacks?
Access control lists (ACLs) are a commonly used network feature and can be used to define what L2, L3 and L4 traffic will or won’t be allowed to a particular VM VNIC. This feature can be used to control who can talk to a VM and who a VM can talk to.
Additional features discussed above that can be leveraged include Dynamic ARP Inspection, DHCP Snooping and IP Source Guard.
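As a rough illustration of how an ACL applied to a VM VNIC decides what traffic gets through, here is a Python sketch of first-match rule evaluation with an implicit deny at the end. The `Rule` fields and the `evaluate` function are hypothetical names for this example, and only L3/L4 matching on protocol, addresses and destination port is modeled.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    action: str                     # "permit" or "deny"
    protocol: str                   # "tcp", "udp", or "ip" for any protocol
    src: str                        # source network in CIDR form
    dst: str                        # destination network in CIDR form
    dst_port: Optional[int] = None  # None matches any port


def evaluate(rules, protocol, src_ip, dst_ip, dst_port=None) -> str:
    """Return the action of the first matching rule, or "deny" if none match."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    for rule in rules:
        if rule.protocol not in ("ip", protocol):
            continue
        if src not in ipaddress.ip_network(rule.src):
            continue
        if dst not in ipaddress.ip_network(rule.dst):
            continue
        if rule.dst_port is not None and rule.dst_port != dst_port:
            continue
        return rule.action
    return "deny"  # implicit deny when nothing matched


# Example policy: only HTTPS from anywhere is allowed to reach the web VM.
WEB_VM_IN = [
    Rule("permit", "tcp", "0.0.0.0/0", "10.1.1.10/32", 443),
]
```

Here `evaluate(WEB_VM_IN, "tcp", "192.0.2.7", "10.1.1.10", 443)` returns `"permit"`, while the same flow to port 22 falls through to the implicit deny.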
When we started submitting abstracts for VMworld 2009 earlier this year, I didn’t really think it would involve more work than what we had put into the show last year when we announced the Cisco Nexus 1000V. Boy, was I wrong.
Just for the N1K alone (not to mention all the UCS related activities at the show), we need a small army of people to develop and support the N1K self-paced lab SPL23:
Han is working on developing an all new breakout presentation which will be run on Wednesday (3:30PM) and Thursday afternoon (TA2384). You can watch a short intro for the presentation here:
We are also working on new content for a bunch of stuff going on in the Cisco booth including a preview of some new features of the Nexus 1000V (ask us about Virtual Service Domains – VSD, a very cool feature), detailed demos of the currently available product and lots of short presentations covering different technical and business aspects of the product including:
Introduction to the Nexus 1000V
Securing the Virtualized Data Center with the Nexus 1000V
August 2008 – Swordfish officially named the Nexus 1000V
VN-Link Chain
With VMworld 2009 just around the corner (August 31-Sept 3 at the Moscone Center in San Francisco) and our entire team scrambling to get ready for the show (Platinum sponsor keynote, product breakout presentations, Nexus 1000V self-paced lab, demo of upcoming Nexus 1000V release, etc), I was reminded of our preparation just 1 year ago for VMworld 2008 in Las Vegas.
We had started the Swordfish beta program in conjunction with VMW and had made a decision to publicly introduce the new product and some of its capabilities at the upcoming show. From a marketing perspective, we knew it was going to be part of the Cisco Nexus family, but we didn’t know if we should give it a number or not. We had originally toyed with the idea of just calling the product the Cisco Nexus Virtual Switch, but it wasn’t in keeping with Cisco’s model for naming/numbering new network infrastructure solutions. After a lot of back and forth with the different teams (product, engineering, marketing, partner, executive, etc), we settled on calling Swordfish the Nexus 1000V, with the “V” standing for “virtual” to indicate it was a software product. At the same time, we also came up with the name VN-Link, although settling on that name was quite a bit more complicated.
We needed to come up with a simple marketing term that could be used to define this new area of advanced virtual machine networking capabilities, and it had to apply to both the Nexus 1000V and the optional Network Interface Virtualization (aka VN-Tag) model we were developing for our forthcoming Cisco Unified Computing System (aka UCS), which was to be launched in the spring of 2009. The Nexus 1000V was a drop-in replacement for the VMware vSwitch in the vSphere solution; it maintained the existing location of the first hop L2 switch (in the hypervisor) and did not require any special hardware or switches upstream to function properly. Once installed in a VMware vSphere Enterprise Plus environment, the solution provided an advanced network feature set, a familiar and consistent network management and operations model and support for virtualization aware networking services. These services include policy based connectivity, mobility of network & security services and a non-disruptive operations model for the server administrator. Now the Cisco UCS with support for Network Interface Virtualization (NIV) will also support the same features described above. The main difference between the Nexus 1000V solution and the Cisco UCS NIV solution is that the NIV model extracts the first hop L2 switch from the hypervisor and performs this function in the upstream switch (in the case of the Cisco UCS, this would be the 6100 series device). Brad Hedlund has done a great job of illustrating this on his blog here. The other features and benefits are consistent between the 2 solutions, something we were shooting for from a customer perspective. The benefit of the NIV model is that the server CPU can be offloaded from having to perform any of the network functions, leaving more cycles for additional server workloads.
With this model, all packets are tagged and sent to the upstream switch taking advantage of the hardware ASICs to perform network policy enforcement and first hop L2 forwarding decisions.
Again, from a customer facing perspective, the Nexus 1000V model and the NIV model are consistent. They both leverage port profiles to assign network and security policy to the VM VNIC (assigned as a Port Group in VMW’s vCenter). They both leverage the vSphere vNetwork distributed switch from VMware and support mobility of network and security properties and interface/flow state. They both maintain the regular VM creation and operational workflow for the server administrator, offloading the vSwitch and Port Group configuration and management to the networking team. And they both leverage the concept of a virtual ethernet interface (veth for short) to associate a unique network interface with a specific VM VNIC. Effectively, both of these solutions create a logical equivalent of what we see in the physical world every day: Server NIC, Switch Port and the RJ-45 cable that ties them together. It’s “virtual”. It’s in the “network”. It’s a logical “link”. It’s a virtual network link. It’s a VN-Link. A great paper on VN-Link can be found here in case you want to know more.
Before this program, I never realized how much work is involved in getting customers to actually test and provide feedback on a new product, especially a software add-on to another company’s product. Swordfish’s success was dependent on a successful beta cycle of VMware’s next generation product in the 2nd half of 2008 and 1st part of 2009 (to be known as vSphere) and customers’ desire to evaluate the new features. To make matters even more difficult, because of customer confidentiality agreements already in place, the 2 companies could not share beta customer information. Since the Swordfish beta was to be run as an “add-on” beta program to the vSphere beta, we (Cisco) needed to approach customers one by one who we knew were interested in Swordfish and possibly participating in the vSphere beta program already. Sometimes we got lucky, sometimes we struck out, but overall, this “beta on a beta” proved to be very difficult from a logistics perspective. Looking forward, my goal would be to find a way to run subsequent beta programs on a version of vSphere that has already GA’d to dramatically simplify the customer engagement process. This wasn’t an option with Virtual Infrastructure 3 because of the lack of the vNetwork Distributed Switch functionality (which was known at the time as “distributed virtual switch”).
In parallel to figuring out the beta process with VMware, the team was busy nailing down the business arrangement to ensure that 1) both Cisco and VMware could embrace this co-development effort as a win-win (not very easy when you have two 800 lb gorillas sitting in the same room, neither wanting to move away or give on the terms they normally require in any agreement) and 2) the solution would result in something easily deployed and adopted by our mutual customers. Flexibility, lots of hard work, many late nights and some ingenious engineering efforts on both sides went into pulling this off, and it was certainly aided (from my perspective) by the initial feedback we were starting to get from joint customers.
So what was the initial feedback from customers? It was extremely positive. There were only a few features supported in the beta 1 program, but one of the key innovations that the Swordfish engineering team had developed (a feature called Port Profiles) was starting to catch on with customers. This feature allows network administrators to configure/set network policy for virtual machine environments and then expose a collection of these network policies to the server administrator through VMware’s vCenter tool. Port Profiles would enable network, security and server admins to embrace server virtualization like never before, allowing the functional data center teams to respect each other’s configuration and operational boundaries. This is a critical capability if customers are going to attempt to virtualize 100% of their applications in the next few years.
As the positive customer feedback continued to come in, we worked with VMware to figure out if we wanted to introduce this new concept at the upcoming VMworld show in Las Vegas in September of 2008. The goal was to create awareness for the new technology and pave the way for an even broader phase 2 beta program (the biggest in Cisco’s history) and successful product launch. The 2 companies agreed that this would be a great step to take and the planning began for a September technology launch.
The next chapter
Chapter 4: August 2008 – Swordfish officially named the Nexus 1000V