Precise Time Sync & iSCSI

It’s a good feeling to kill two birds with one stone (figuratively) when delivering an IT project. I had the opportunity to do just that recently when two separate business units offered up two very different problems that I could solve at the same time. The first requirement was time synchronisation using PTPv2, or IEEE1588-2008 Precision Time Protocol, to synchronise devices in multiple buildings within our campus (approximately 2000 fibre route meters from end to end). The second requirement was an iSCSI network between the same locations for SAN replication between multiple pairs of disparate buildings. 

Our current LAN network at the time just didn’t support PTP. I’m sure we could have made it work somehow by using multicast routing, but the sub 50 microsecond precision required for the application wouldn’t have been achievable. It was obvious that we needed some new switches. Initially I was looking at Cisco’s industrial switches since we already had some deployed in the network.

As for the iSCSI requirement, we needed multiple 10Gig uplinks to every SAN. Again, our current LAN network at the time didn’t have the capacity to accommodate that many 10Gig ports, so it was also apparent that new switches were required to run a separate SAN network. 

After a lot of Googling for high density 10Gig switches, I came across some Nexus switches that also happened to support PTP. They had 32 x 10Gig SFP+ ports each. My thought process ended up something like: Ideal. WOW Expensive! Let’s try Ebay. Ideal.

We ended up buying 8 x Nexus 5548up switches from Ebay, with the L3 daughter board and L3 license (i’ll explain why later). The switches ended up costing around £1,500 each, instead of £25k+ per switch new.

To provide an accurate time source we also purchased a Meinberg GPS / GNSS PTP grandmaster clock. This also turned out to provide NTP as well, so our domain has a pretty precise time source now, but that’s a different story.

Physical Topology

Now the topology of the switches is as follows. topology_1.jpg We decided to go with this topology to allow us to add an additional grandmaster clock at a later date in the second “hub” location. Since both hubs are on different substations etc, this would provide some resilience if and when required. Wed also have an abundance of disparate fibre between all of the facilities and the two hubs, as these are are main datacenter locations. 

PTP

The PTP logical topology looks a little something like this.topology_2.jpg
Each switch is configured to be a boundary clock in the PTP domain. The PTP priorities of the switches mean that if the grandmaster goes offline for whatever reason, the switches will all synchronise to the lowest priority switch. This was done to ensure the equipment in all facilities were in sync with each other, even if they weren’t in sync with actual time. If a facility becomes isolated from he rest of the network, then at least its clocks will all remain in sync with each other since the switch in that facility has a lower priority than the default PTP priority.

The PTP ports are on a vlan local to the switch. Since the switches are boundary clock, there doesn’t need to be layer 2 adjacency between the grandmaster clock and the clocks (devices) for PTP to work. The port configuration is simple and is as follows.

You don’t even need to configure ip addresses on the local vlan. The devices will use link local addresses and multicast to communicate with the switch. No pesky VRF’s required!

iSCSI

The iSCSI topology ended up being Layer 3 between the buildings. We didn’t want to stretch Vlan’s between the buildings to help contain outages. The operational facilities are test & validation facilities with a lot of moving mass or high voltage, so the likelihood of a switch being taken out is pretty high. To propagate routes we used OSPF, with the hubs being area 0 and each facility having it’s own area. The logical iSCSI topology looks like this.topology_1.jpg I’ve only included the information for two of the four facilities to make it more legible.

I’m aware that some people don’t like using layer 3 in iSCSI networks. In our situation, the replication is asynchronous. In each building the iSCSI initiator and target are both on the same subnet / vlan. Only the replication traverses layer 3 boundaries. We haven’t seen any performance or reliability issues with this topology yet so all is good. 

Final Observations / Notes

One caveat of the Nexus 5548up switches is that the ports cannot run at 100Mbit/s. This is an issue for some PTP devices as they only have 100Mbit/s ports. To get around this we used cheap media converters from FS.com. These media converters will happily have a 1Gbit/s SFP in the SFP port, and a 100Mbit/s connection in the RJ45 port. 

Another caveat of the Nexus 5548up switches is that if you want to run a port at 1Gbit/s with an RJ45 SFP, you must force the port to 1000Mbit before it will come up. Just add speed 1000 to the port configuration.

Buying Cisco SFP’s isn’t really worth it when using used switches. Even on Ebay a used one is around £35-60 and you don’t know what sort of environment it’s been in. FS.com sell compatible optics for £19 with a 5 year warranty.

We also purchased spare switches for a quick response to a failure. We could have doubled down and deployed two per facility, but didn’t think it was necessary. The clock devices only have a single port at the end of the day, and SAN replication can catch up.

That’s it. I just thought I’d share one of those little last minute projects that needs to be delivered on a  tight budget.

Cisco Router Dual WAN Uplinks with NAT

Dual WAN uplinks for resilience are a common request when configuring small business routers. I’ll work with the topology below and go through the configuration.

Screenshot 2020-02-13 at 20.56.40.png

The only devices I’ll be going over are R1 and PC1. In the real world, the rest would be out of your control anyway. I’ll include the GNS3 project file at the end of this post if you would like to play with it.

Configuration

The first thing we need to do is configure the interfaces of the two ISP connections.

interface GigabitEthernet0/0
  description ISP1
  ip address 100.1.1.2 255.255.255.252
  ip nat outside
!
interface GigabitEthernet1/0
  description ISP2
  ip address 200.2.2.2 255.255.255.252
  ip nat outside
!

And the LAN interface

interface GigabitEthernet6/0
  description LAN
  ip address 192.168.1.1 255.255.255.0
  ip nat inside
!

Next we’ll configure our routes via both ISP’s. To make the failover work we need to track some objects on the primary connection. This will make failover occur if the internet connection goes down. I would advise you track reachability of a couple of hosts to avoid failing over if somebody else is having an issue. I would also advise against tracking the next hop, as an issue within the ISP network wouldn’t cause failover to occur but may prevent you from reaching the internet. We do this using ip sla to two different known hosts (I used Google’s public DNS servers for this demo).

ip sla 100
icmp-echo 8.8.8.8 source-interface GigabitEthernet0/0
threshold 1000
timeout 1000
frequency 10
ip sla schedule 100 life forever start-time now
ip sla 101
icmp-echo 8.8.4.4 source-interface GigabitEthernet0/0
threshold 1000
timeout 1000
frequency 10
ip sla schedule 101 life forever start-time now

Next we create two track objects to monitor the ip sla’s for reachability.

track 100 ip sla 100 reachability
!
track 101 ip sla 101 reachability

Then a track object to track the first two objects. By using the boolean or option, the track will go down if all of the tracked objects go down but will not if only one goes down.

track 105 list boolean or
object 100
object 101

Now we will add our default routes via both ISP’s. We will use the track 105 object to determine if ISP1 is up and add it to the routing table. Otherwise we will add ISP2 with a metric of 10.

ip route 0.0.0.0 0.0.0.0 100.1.1.1 track 105
ip route 0.0.0.0 0.0.0.0 200.2.2.1 10

That should be the routing done, now we will need to configure NAT to allow the LAN clients to access the internet. We’ll create an access list to define the LAN traffic that should be translated.

ip access-list standard NAT-INSIDE
permit 192.168.1.0 0.0.0.255
!

And we’ll use some route-maps to match the LAN traffic on the outside interfaces for translation.

route-map RM-NAT-ISP2 permit 20
match ip address NAT-INSIDE
match interface GigabitEthernet1/0
!
route-map RM-NAT-ISP1 permit 10
match ip address NAT-INSIDE
match interface GigabitEthernet0/0
!

And finally, the NAT configuration commands.

ip nat inside source route-map RM-NAT-ISP1 interface GigabitEthernet0/0 overload
ip nat inside source route-map RM-NAT-ISP2 interface GigabitEthernet1/0 overload

Verification

Now we can do some testing. If we ping from PC1 to 8.8.8.8, the ping should succeed. You can also perform a traceroute from PC1 to 8.8.8.8 to verify the route the traffic flows.

PC1#ping 8.8.8.8
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 8.8.8.8, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 48/85/196 ms

PC1#traceroute 8.8.8.8
Type escape sequence to abort.
Tracing the route to 8.8.8.8
VRF info: (vrf in name/id, vrf out name/id)
1 192.168.1.1 16 msec 48 msec 28 msec
2 100.1.1.1 104 msec 72 msec 44 msec
3 1.2.1.1 56 msec 60 msec 68 msec

You can use show ip nat translations on R1 to verify the NAT translations over ISP1.

R1#show ip nat translations
Pro Inside globalInside local Outside localOutside global
icmp 100.1.1.2:1025192.168.1.11:168.8.8.8:16 8.8.8.8:1025
udp 100.1.1.2:4501 192.168.1.11:49157 8.8.8.8:334378.8.8.8:33437
udp 100.1.1.2:4502 192.168.1.11:49158 8.8.8.8:334388.8.8.8:33438
udp 100.1.1.2:4503 192.168.1.11:49159 8.8.8.8:334398.8.8.8:33439
udp 100.1.1.2:4504 192.168.1.11:49160 8.8.8.8:334408.8.8.8:33440
udp 100.1.1.2:4505 192.168.1.11:49161 8.8.8.8:334418.8.8.8:33441
udp 100.1.1.2:4506 192.168.1.11:49162 8.8.8.8:334428.8.8.8:33442

And also, show ip route on R1 should show the next hop as ISP1.

S*    0.0.0.0/0 [1/0] via 100.1.1.1

Now, if we suspend a link connected to the ISP1 route, it doesn’t matter which one, our topology should failover.

Screenshot 2020-02-13 at 21.44.38.png

First thing you should notice is the track objects going down on R1.

*Feb 13 21:44:28.847: %TRACKING-5-STATE: 100 ip sla 100 reachability Up->Down
*Feb 13 21:44:28.851: %TRACKING-5-STATE: 101 ip sla 101 reachability Up->Down
*Feb 13 21:44:29.835: %TRACKING-5-STATE: 105 list boolean or Up->Down

And the default route on R1 should have changed to ISP2.

S*0.0.0.0/0 [10/0] via 200.2.2.1

And if we repeat the same ping and traceroute from PC1 to 8.8.8.8, they should still work fine, but the route should show ISP2 as the second hop instead of ISP1.

PC1#ping 8.8.8.8
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 8.8.8.8, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 28/56/84 ms

PC1#traceroute 8.8.8.8
Type escape sequence to abort.
Tracing the route to 8.8.8.8
VRF info: (vrf in name/id, vrf out name/id)
1 192.168.1.1 12 msec 88 msec 24 msec
2 200.2.2.1 52 msec 32 msec 36 msec
3 1.2.2.1 64 msec 64 msec 36 msec

show ip nat translations on R1 should also show the NAT translations to ISP2 now.

R1#show ip nat translations
Pro Inside globalInside local Outside localOutside global
icmp 200.2.2.2:1024192.168.1.11:2 8.8.8.8:28.8.8.8:1024
udp 200.2.2.2:4501 192.168.1.11:49167 8.8.8.8:334378.8.8.8:33437
udp 200.2.2.2:4502 192.168.1.11:49168 8.8.8.8:334388.8.8.8:33438
udp 200.2.2.2:4503 192.168.1.11:49169 8.8.8.8:334398.8.8.8:33439
udp 200.2.2.2:4504 192.168.1.11:49170 8.8.8.8:334408.8.8.8:33440
udp 200.2.2.2:4505 192.168.1.11:49171 8.8.8.8:334418.8.8.8:33441
udp 200.2.2.2:4506 192.168.1.11:49172 8.8.8.8:334428.8.8.8:33442

Now if we resume the link to the ISP1 route, the track objects will come back up and everything should fail back.

*Feb 13 21:50:43.871: %TRACKING-5-STATE: 100 ip sla 100 reachability Down->Up
*Feb 13 21:50:43.871: %TRACKING-5-STATE: 101 ip sla 101 reachability Down->Up
*Feb 13 21:50:44.839: %TRACKING-5-STATE: 105 list boolean or Down->Up

Notes

There are a couple of things to note with this configuration:

  • There is zone-based firewall configuration in this demo. I highly recommend one if you aren’t using a dedicated firewall.
  • Inbound connections using port forwarding and the primary connection IP address will not failover if ISP1 fails.
  • The IOS image used in this GNS project is c7200-advipservicesk9-mz.152-4.S5.bin. You’ll have to provide that yourself.

Downloads

GNS3 Project – Dual WAN

Cisco 887VA VDSL – Ethernet bridge on Sky Fibre Unlimited Pro

After successfully configuring the Cisco 887VA on my Sky Fibre Unlimited Pro connection I started to configure NAT and ACLs to allow all of my devices to work properly. For the most part this wasn’t an issue, until I came to all of the games consoles in the house.

We pretty much have a console in every room in the house, totalling 3 x Xbox 360s, a PS3 and a PS4. I blogged a while ago about setting them all up to use UPnP on pfsense to map inbound ports on demand in order to get an open NAT type in game. Now that worked well, but unfortunately the Cisco box doesn’t support UPnP or NAT-PNP by design. The reason for the deliberate lack of these features is simple; Cisco IOS devices are enterprise devices, and no enterprise want users to be able to dynamically NAT ports to internal resources.

While I agree that UPnP or NAT-PNP is a security risk in the enterprise, many other vendors support the features but provide means to restrict which devices may use them, similar to how pfsense does.

The console all tend to use the same ports to connect to the internet. However, when they use UPnP they can use alternative ports if the UPnP router refuses to open the requested ports because another device is using them. This is all good on consumer routers which tend to have UPnP enable as standard. The biggest problem I have with the 887 is that the ports would have to be manually NATed to the console that as currently in use, and the other console would struggle to work properly.

This issue pretty much rules out the feasibility of using the 887 in our house as a conventional router. I did however wonder is I could simply replace the Openreach Modem with the 887 and continue to use my trusty Firebox x750 running pfsense as my firewall. I started to play with my config. After a quick config erase and reload I had a blank canvas to play with.

I decided to try a bridge group first. I shutdown the ATM interface and created the required sub interface on the eth0 interface as below. I also put the sub interface in a bridge group.

interface Ethernet0
no ip address
no ip route-cache
!
interface Ethernet0.101
encapsulation dot1Q 101
no ip route-cache
bridge-group 1
!
interface ATM0
no ip address
no ip route-cache
shutdown
no atm ilmi-keepalive

I then tried to pt a FastEthernet interface into the bridge group, which failed as layer 2 interfaces are not allowed in bridge groups. To get around this I created a vlan and placed that into the bridge group. I then stick Fa0 into the clan. Notice the config line “ip virtual-reassembly in”. It is required.

interface Vlan100
no ip address
ip virtual-reassembly in
no ip route-cache
bridge-group 1
!
interface FastEthernet0
description ~ Uplink to Firewall ~
switchport access vlan 100

Then I set the protocol on the bridge group.

bridge 1 protocol ieee

Finally I disable the routers routing engine.

no ip routing

It worked. I was impressed!.

I know many people this the 887VA is an expensive router to just use as a bridge. I disagree. I have had issues with my line for a few months, caused by broken insulation on the drop wire. The drop wire has been replaced now but OpenReach didn’t re-enable DLM, meaning the line never really built any speed up since the drop wire was replaced. Considering it was sitting at 52Mbs sync our of 80, and the DSLAM is approximately 80 meters from my master socket, this wasn’t acceptable. After 8, yes eight, engineer to my house, none of which were interested in the history of the fault and none of which were willing to do the OGEA reset I requested to set the line speed back to 80Mbs to train down to a stable speed, I have pretty much given up on OpenReach.

Enter 887VA. When I started using this router as a bridge two days ago, the sync speed was already 5Mbs up from the OpenReach Modem. I decided to hammer the connection using Iperf and monitor it for errors. I set iperf away all night at the maximum speed of the line and checked it in the morning. There were a total of 7 CRCs and no drop outs. Result. I shutdown “controller vdsl 0” and brought it back up to find another 1.3Mbs sync speed. I repeated this procedure again the next night and yet again gained another 0.9 Mbs sync speed, bringing me to just under 60Mbs.

Another benefit of using the 887VA is the fact I can see my full line stats. Bonus.

I’m going to continue to try and increase my line speed over the next week and see how high I can get it. If only I could use the “del noise-margin” command.

 

Sky Fibre Unlimited Pro on a Cisco 887VA

I recently decided to look for a replacement for the crappy white OpenReach modem that was installed as part of my Sky Fibre Unlimited Pro FTTC connection. The problem was that I didn’t want to fork out for an expensive VDSL2 modem to find I couldn’t get it working with the silly MER authentication used by Sky to try and prevent you from using your own router.

Luckily, a Cisco 887v a became available to test with before I took the plunge and bought one. I started googling and couldn’t find one success case of using this router with Sky’s service. Undeterred, I started to tinker and eventually got it working….

Before you begin you will need your mac address,  user-id and password. I won’t cover how to obtain these in this post as I provided steps (steps 1 to 7) to obtain them in an earlier post.

Once you have your mac, username and password, you will need to use them to create three bits of information.

MAC:             <0000.0000.0000> (remove the :’s and place a . after every four characters)
Hostname:    <username>|<password>
Client-ID:      <hexadecimal string of Hostname> (A converter is available here.)

I won’t go into any other configuration in this post, just the interface configuration.

First of all you want to disable the ATM interface as it shared a physical interface with the VDSL controller.

interface ATM0
no ip address
shutdown
no atm ilmi-keepalive

The VDSL modem should automatically connect to the DSLAM. You can check it’s progress by using “show controller vdsl 0”.

When the VDSL modem connects it brings interface Ethernet0 up. Eth0 is a virtual port but is used as your outside interface. OpenReach encapsulate traffic for different ISPs in Vlans. In the case of Sky it is Vlan 101 so you need to use a sub interface of Eth0.

interface Ethernet0
mac-address <mac>
no ip address
!
interface Ethernet0.101
encapsulation dot1Q 101
ip dhcp client request classless-static-route
ip dhcp client client-id hex <client-id in hex>
ip dhcp client hostname <username>|<password>
ip address dhcp
no ip redirects
no ip proxy-arp
ip flow ingress
ip flow egress
ip nat outside
no ip virtual-reassembly in

Thats it. I’ll post my full config below which includes some basic NAT. It doesn’t include any security though. And no, you don’t need a dialer interface!

version 15.1
no service pad
service timestamps debug datetime msec
service timestamps log datetime msec
no service password-encryption
!
hostname Router
!
boot-start-marker
boot-end-marker
!
!
!
no aaa new-model
!
memory-size iomem 10
crypto pki token default removal timeout 0
!
!
ip source-route
!
!
!
!
!
ip cef
no ipv6 cef
!
!
multilink bundle-name authenticated
license udi pid CISCO887VA-K9 sn FCZ1633C05Z
license boot module c880-data level advipservices
!
!
!
!
!
!
controller VDSL 0
!
!
!
!
!
!
!
!
interface Ethernet0
mac-address <mac>
no ip address
!
interface Ethernet0.101
encapsulation dot1Q 101
ip dhcp client request classless-static-route
ip dhcp client client-id hex <client-id>
ip dhcp client hostname <username>|<password>
ip address dhcp
no ip redirects
no ip proxy-arp
ip flow ingress
ip flow egress
ip nat outside
no ip virtual-reassembly in
!
interface ATM0
no ip address
shutdown
no atm ilmi-keepalive
!
interface FastEthernet0
switchport access vlan 1
!
interface FastEthernet1
!
interface FastEthernet2
!
interface FastEthernet3
!
interface Vlan1
ip address 192.168.1.1 255.255.255.0
ip flow ingress
ip flow egress
ip nat inside
ip virtual-reassembly in
!
ip forward-protocol nd
no ip http server
no ip http secure-server
!
!
ip nat inside source list NATACL interface Ethernet0.101 overload
!
ip access-list standard NATACL
permit 192.168.1.0 0.0.0.255
!
logging esm config
access-list 1 permit 192.168.1.0 0.0.0.255
dialer-list 1 protocol ip permit
!
!
!
!
!
control-plane
!
!
line con 0
no modem enable
line aux 0
line vty 0 4
login
transport input all
!
end

 

Installing / Configuring and Administering pfSense as a multi-tenant firewall

I am about to embark on a mission… A mission to provide uncontested but limited Internet connectivity to our tenants. To do this I have decided to deploy pfSense, and I will be documenting each step for both our reference here at work, and in the hope that it will help somebody do something similar in the future.

To start with, we needed a specification of what we need the system to do. Here it is.

  • The firewall must serve multiple tenants (up to 50+)
  • The firewall must give each tenant their own external IP
  • The firewall must prevent each of the tenants from seeing each others’ networks
  • The firewall must allow us to limit the amount of bandwidth each tenant can utilize (otherwise they have free reign of our dual redundant gigabit fibre connections)
  • The firewall must allow us to filter out certain traffic such as p2p
  • The firewall must allow us to set data caps for each tenant
  • The firewall must let us create a DMZ for each tenant if required
  • The firewall must allow us to configure network services for each tenant (DHCP, DNS, etc)
  • The firewall must allow each tenant to have their own VPN connection if required
  • The firewall must allow us to report on bandwidth utilization and data transfer usage on a per-tenant basis

This may seem a tall order for one box, but with pfsense it is absolutely possible providing the hardware is capable of it. for our firewall we are going to re-deploy one of our old servers which was decommissioned during our virtualization project. The server used to be one of our domain controllers and it performed well while it was in service. I believe it will perform well as firewall as well. Its spec is below.

  • IBM x3550 1u Server
  • 2x Dual core Xeon processors
  • 4GB Ram
  • 2 x 76GB SAS disks in a RAID 1 (mirrored) configuration
  • 2x On board Intel Pro/1000 Gigabit NIC’s
  • 1x Dual port Intel Pro/1000 Gigabit NIC
  • N+1 Power supplies

As you can see the server isn’t wanting when it comes to specs for the purpose it will be used for. It was slightly higher speced but parts have since been “pinched” for other projects. If this project goes well then we will be looking to build another similar firewall using our other domain controller of the same spec and cluster them for both resilience and load balancing.

I will be starting this project this afternoon so check back for updates, step-by-step guides and images of the entire process during “Project FireServer”.

Part 1 – The Hardware and Topology ->>>