Since upgrading to NSX 6.4, I have been getting warnings for certain ESXi hosts about a VLAN and MTU mismatch. So I checked the vDS healthchecks and found the following error: VLAN 0 is not supported. Since we hadn’t removed or added any VLANs on the physical switches, I was very surprised. And VLAN 0 shouldn’t exist anyway. So what is this about, and why does the error suddenly occur?
First of all, I would like to talk about VLAN 0. With most network vendors, VLAN 0 either does not exist or is reserved and cannot be used. It therefore cannot be configured on our physical Cisco switches. So I checked all portgroups to see whether VLAN ID 0 is set anywhere in the virtual infrastructure. For tasks like this I have a PowerCLI script that writes the settings of all portgroups to a CSV file.
With that I could see that VLAN ID 0 is only displayed if the portgroup has the VLAN type set to “none”.
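To illustrate the kind of check done on that export, here is a minimal Python sketch that filters a portgroup CSV for entries reporting VLAN ID 0. The column names (`Name`, `VlanType`, `VlanId`) and the sample rows are assumptions made up for this example; they are not taken from the actual script or its output.

```python
import csv
import io


def portgroups_reporting_vlan_zero(csv_text):
    """Return the names of portgroups whose exported VLAN ID is 0.

    The column names "Name" and "VlanId" are hypothetical; adjust them
    to whatever your own export uses.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["Name"] for row in reader if row["VlanId"].strip() == "0"]


# Hypothetical sample export, for illustration only.
sample = """Name,VlanType,VlanId
PG-Management,VLAN,10
vxw-dvs-example-ls-01,None,0
PG-vMotion,VLAN,20
"""

print(portgroups_reporting_vlan_zero(sample))
```

With the sample data, only the portgroup whose VLAN type is "None" is listed, which matches what I saw in the real export.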
Aha. This takes me a little further, because I had already noticed that under NSX 6.4.x all portgroups of newly created logical switches have VLAN configured as “none” (this was different before NSX 6.4).
I also found a KB article about this: https://kb.vmware.com/s/article/53724
Now it made sense to me.
Why does this “VLAN 0 not supported” problem occur?
With the setting “VLAN: None”, the traffic of this portgroup is sent untagged to the physical switch. Since the VXLAN traffic is encapsulated via the VTEP in the VLAN used for NSX, this setting is no problem for the logical switch portgroups.
But the Distributed Switch healthcheck also checks these portgroups and tries to send untagged frames. Since the physical switch ports are configured in trunk mode, untagged frames are dropped if there is no native VLAN configured. We didn’t need one until now, because there should be no untagged traffic in our virtual infrastructure.
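The behaviour described above can be modelled in a few lines of Python. This is a deliberately simplified sketch of how a trunk port decides what to do with an incoming frame, following the description in this post; the function name and parameters are invented for illustration, and real switches have platform-specific details (such as a default native VLAN) that this model ignores.

```python
from typing import Optional, Set


def ingress_vlan(tag: Optional[int], allowed: Set[int],
                 native: Optional[int]) -> Optional[int]:
    """Return the VLAN a frame is placed in, or None if it is dropped.

    tag     -- 802.1Q VLAN ID of the frame, or None for an untagged frame
    allowed -- VLANs allowed on the trunk
    native  -- native VLAN of the trunk, or None if there is none (assumption:
               this models the "no native VLAN configured" case from the post)
    """
    # Untagged frames are assigned to the native VLAN, if there is one.
    vlan = native if tag is None else tag
    if vlan is None or vlan not in allowed:
        return None  # dropped
    return vlan


# Before the fix: untagged healthcheck frames have nowhere to go.
print(ingress_vlan(None, {10, 20}, None))
# After the fix: they land in the dummy native VLAN 123.
print(ingress_vlan(None, {10, 20, 123}, 123))
```

The second call also shows why the native VLAN has to be in the allowed list: if 123 were the native VLAN but not allowed on the trunk, the untagged frame would still be dropped.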
What can be done to correct this error?
In our case, we only need to configure a native VLAN on each physical switchport that an ESXi host uplink is connected to. Since we don’t want any untagged traffic in production, creating a “dummy VLAN” that exists only on the physical switches is enough.
So I created a VLAN on all switches and configured it as the native VLAN on each switchport of an ESXi host. It is important that this VLAN is also allowed on the trunk port, because the Distributed Switch healthchecks have to send untagged frames that reach the other hosts.
    switch# conf t
    switch(config)# vlan 123
    switch(config-vlan)# name VMWARE-NATIVE-DUMMY
    switch(config-vlan)# exit
    switch(config)# interface Ethernet1/35
    switch(config-if)# switchport trunk allowed vlan add 123
    switch(config-if)# switchport trunk native vlan 123
    switch(config-if)# interface Ethernet1/36
    switch(config-if)# switchport trunk allowed vlan add 123
    switch(config-if)# switchport trunk native vlan 123
    ...
    switch(config-if)# end
    switch# wr
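Since the same three lines repeat for every ESXi-facing interface, the command list can also be generated instead of typed by hand. The following Python sketch does only that; the function name is made up for this example, it does not talk to any switch, and it leaves out saving the configuration, so review the output before pasting it into a session.

```python
def native_vlan_commands(vlan_id, vlan_name, interfaces):
    """Build the list of CLI commands that create the dummy VLAN and set it
    as allowed and native VLAN on each given trunk interface."""
    commands = ["conf t", f"vlan {vlan_id}", f"name {vlan_name}", "exit"]
    for interface in interfaces:
        commands += [
            f"interface {interface}",
            f"switchport trunk allowed vlan add {vlan_id}",
            f"switchport trunk native vlan {vlan_id}",
        ]
    commands.append("end")
    return commands


# The two interfaces shown in the post; extend the list for further uplinks.
print("\n".join(native_vlan_commands(
    123, "VMWARE-NATIVE-DUMMY", ["Ethernet1/35", "Ethernet1/36"])))
```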
That’s it. After a few minutes, all vDS healthchecks turned green again, and the warnings about VLAN and MTU mismatches caused by the unsupported VLAN 0 disappeared.