Troubleshooting Tips For a Failed Site-To-Site VPN Tunnel


  • Created: Feb 7, 2013 2:15 PM

Troubleshooting a site-to-site VPN tunnel that is not working can be difficult. Luckily, most VPN appliances provide ample debugging information to help you diagnose the issue. Working through that information in a consistent order lets you isolate the exact problem without wasting time. The steps below can help streamline the troubleshooting process.

We use Juniper VPN hardware on our side here at Nexcess and have successfully created tunnels to just about everything, including Cisco ASA and PIX, Check Point, SonicWall, Netgear, and Zyxel, to name a few. From a troubleshooting standpoint, it doesn’t really matter what device you are using at each end of the tunnel, as long as there are no known interoperability issues between the two. As a general guideline, there are three things to look for when a tunnel fails to work as expected:

  1. Phase 1 negotiations fail
  2. Phase 2 negotiations fail
  3. The tunnel is established but not passing traffic

The first step when a tunnel fails in any of the three ways above is to verify that both ends of the tunnel have exactly the same information. The remote end is often configured by a different person, so accurate communication of all necessary information between both parties is key:

  1. Make sure the encryption settings, hashes, lifetimes, etc. are exactly the same on both ends of the tunnel for both phase 1 and phase 2 negotiations. Verify that the gateway addresses agree: the remote gateway configured on each end should be the other end's public address. Make sure the internal subnets on the two sides are different (they should not overlap) and are entered correctly, including masks.
  2. Verify that the pre-shared key on each end is the same. Watch out for extra whitespace at the beginning or end of the key that was inadvertently pasted in.
  3. Also verify that each end is using the same type of tunnel. If one side is configured as a route-based tunnel while the other is policy-based, you will run into issues.
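If both parties can export their settings into a simple key/value form, the checklist above can be compared mechanically. The following is only a rough Python sketch: the dictionary keys (`p1_encryption`, `remote_gateway`, `psk`, and so on), the addresses, and the key value are made up for illustration and do not correspond to any particular vendor's configuration format.

```python
import ipaddress

# Hypothetical setting names; adapt them to whatever both parties exchange.
MUST_MATCH = ("p1_encryption", "p1_hash", "p1_lifetime",
              "p2_encryption", "p2_hash", "p2_lifetime", "tunnel_type")

def compare_tunnel_settings(local, remote):
    """Return a list of human-readable mismatches between the two ends."""
    problems = []

    # Proposal values and tunnel type must be identical on both ends.
    for key in MUST_MATCH:
        if local.get(key) != remote.get(key):
            problems.append(f"{key}: {local.get(key)!r} != {remote.get(key)!r}")

    # Each side's remote gateway should be the other side's public address.
    if local["remote_gateway"] != remote["public_ip"]:
        problems.append("local remote_gateway does not match remote public_ip")
    if remote["remote_gateway"] != local["public_ip"]:
        problems.append("remote remote_gateway does not match local public_ip")

    # Stray whitespace pasted around the pre-shared key is easy to miss.
    for name, side in (("local", local), ("remote", remote)):
        if side["psk"] != side["psk"].strip():
            problems.append(f"{name} psk has leading or trailing whitespace")
    if local["psk"] != remote["psk"]:
        problems.append("psk values differ")

    # ip_network() raises ValueError if the mask leaves host bits set,
    # which catches a mistyped subnet or mask early.
    a = ipaddress.ip_network(local["subnet"])
    b = ipaddress.ip_network(remote["subnet"])
    if a.overlaps(b):
        problems.append(f"internal subnets overlap: {a} and {b}")

    return problems

# Example data using documentation-reserved public addresses.
site_a = {
    "p1_encryption": "aes256", "p1_hash": "sha1", "p1_lifetime": 28800,
    "p2_encryption": "aes256", "p2_hash": "sha1", "p2_lifetime": 3600,
    "tunnel_type": "route-based",
    "public_ip": "203.0.113.1", "remote_gateway": "198.51.100.1",
    "subnet": "10.1.0.0/24", "psk": "s3cret",
}
site_b = dict(site_a, public_ip="198.51.100.1", remote_gateway="203.0.113.1",
              subnet="10.2.0.0/24", psk="s3cret ")  # note the trailing space

for problem in compare_tunnel_settings(site_a, site_b):
    print(problem)
```

In this example the only findings are the trailing space in the second key and the resulting key mismatch, which is exactly the kind of difference that is hard to spot by eye.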

Once you have confirmed that the information on both sides of the tunnel is correct, the next step is to enable debug logging for the tunnel you are attempting to establish. The steps vary from device to device, but make sure your logging is verbose enough to show the actual handshake as the connection attempts to establish.

If phase 1 negotiations are failing, check that the encryption algorithm, authentication method, hash algorithm, and lifetime are exactly the same on both sides for the phase 1 proposal. Once those are verified, look at the gateway configuration. The initiator mode (main or aggressive) should be the same on both sides, and the remote gateway IP on each side should be correct. If you see nothing in the logs, there is a good chance the remote gateway IP is incorrect or a firewall somewhere between the two devices is blocking connectivity.

If phase 1 has established and you are failing on phase 2, there are a few items to check. Just as with the phase 1 settings, verify that the phase 2 proposal's encryption algorithm, authentication algorithm or hash, and lifetime are the same on both sides. If you are using perfect forward secrecy, check that its setting matches as well. Also check the tunnel type: both ends should be policy-based or both route-based. For a route-based tunnel, the proxy-id settings should be correct on each side. Most policy-based tunnels generate the proxy-id settings automatically, so the proxy-id does not need to be set by hand in that case. Finally, check the interface binding. Make sure the tunnel is bound to the public-facing interface (or whichever interface the tunnel should be established over).
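To make "correct on each side" concrete for proxy-ids: the usual expectation is that the two ends mirror each other, so the local subnet on one side is the remote subnet on the other. The short Python sketch below checks that under this assumption; the `ProxyId` type and its field names are hypothetical and are not taken from any device's configuration syntax.

```python
import ipaddress
from typing import NamedTuple

class ProxyId(NamedTuple):
    """Hypothetical container for one side's proxy-id settings."""
    local: str             # local subnet in CIDR form
    remote: str            # remote subnet in CIDR form
    service: str = "any"   # service/protocol selector

def proxy_ids_mirror(ours, theirs):
    """True if our local/remote subnets are the peer's remote/local subnets."""
    return (
        ipaddress.ip_network(ours.local) == ipaddress.ip_network(theirs.remote)
        and ipaddress.ip_network(ours.remote) == ipaddress.ip_network(theirs.local)
        and ours.service == theirs.service
    )

ours = ProxyId(local="10.1.0.0/24", remote="10.2.0.0/24")
peer_ok = ProxyId(local="10.2.0.0/24", remote="10.1.0.0/24")
peer_bad = ProxyId(local="10.2.0.0/16", remote="10.1.0.0/24")  # wrong mask

print(proxy_ids_mirror(ours, peer_ok))
print(proxy_ids_mirror(ours, peer_bad))
```

The second peer differs only in its mask (/16 instead of /24), which is enough for the check to fail.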

If both phase 1 and phase 2 negotiations succeed and the tunnel is reported as up but you cannot pass traffic, focus on firewall policies and routing. Routing is the first and easiest thing to check. If you can get a console on your VPN appliance, send a ping to the remote private gateway and verify connectivity over the tunnel. If the ping fails, check your policies (in either a route-based or policy-based configuration) and make sure that ICMP is allowed to pass and that both source and destination networks are set correctly. If the ping succeeds, next check the layer 3 routing table on your primary gateway (if it is not the VPN appliance itself). The route to the destination network should use the VPN appliance's private IP as its gateway. Alternatively, you could add local static routes on each machine and device that will be traversing the tunnel, though this becomes tedious with more than a handful of machines. Finally, verify that policies are in place for both directions of the tunnel for proper network security. Allow only the protocols that need to pass each way over the tunnel. Taking this a step further, you can lock down the policies by source and destination IP address if necessary.
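The routing check on the primary gateway amounts to asking which next hop the most specific matching route gives for an address in the remote network. Here is a minimal Python sketch of that longest-prefix lookup. The routing table, the addresses, and the `VPN_APPLIANCE` value are invented for illustration; a real gateway's table would have to be exported and parsed first.

```python
import ipaddress

def next_hop_for(destination, routes):
    """Return the next hop of the most specific route matching destination.

    routes is a list of (prefix, next_hop) string pairs. Returns None if
    no route matches.
    """
    addr = ipaddress.ip_address(destination)
    best = None
    for prefix, hop in routes:
        net = ipaddress.ip_network(prefix)
        # Keep the matching route with the longest prefix seen so far.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, hop)
    return best[1] if best else None

# Hypothetical table on the primary gateway.
routes = [
    ("0.0.0.0/0", "10.1.0.1"),      # default route toward the internet
    ("10.2.0.0/24", "10.1.0.254"),  # remote subnet via the VPN appliance
]
VPN_APPLIANCE = "10.1.0.254"  # assumed private IP of the VPN appliance

# A host in the remote subnet should be routed to the VPN appliance.
print(next_hop_for("10.2.0.15", routes) == VPN_APPLIANCE)  # True
```

If the second route were missing, the lookup would fall through to the default route and traffic for the remote subnet would head out to the internet instead of into the tunnel.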

The steps above should resolve most tunnel issues. If you are still unable to establish the tunnel, try a different set of encryption settings, as there may be some strange incompatibility with one or more of the devices. Also check the release notes for the latest firmware version of your VPN appliance (on the assumption that you have already upgraded to it). They may offer some hints about your continuing issue. Finally, consult the knowledge base for your specific appliance, as it may offer further interoperability suggestions if the two devices are different.

  • http://www.cloudstaff.com/ Dagmar Garrison

    This is very intensive! Thanks for sharing this. This is indeed a good reference!