There have been a number of improved netwatch scripts listed on the MikroTik wiki in the past; however, many of these are hard to understand, broken, or both.
An associate asked me to help them find a way to fail VPN traffic over from one link to another in the event of an outage, in a network where it wasn’t possible to use the local upstream router as an indication of the VPN’s status. I took the opportunity to revise a netwatch script, based loosely on the one located here: http://wiki.mikrotik.com/wiki/Improved_Netwatch_II
My rewrite allows both the “up” and “down” scripts to be called from the same place (preferably a scheduler entry), and to be extra nice I’ve commented the whole script, so you’re all welcome to modify it as you see fit.
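As a rough sketch of the scheduler approach, assuming the script has been saved under /system script with the hypothetical name netwatch-route, an entry along these lines would run it once a minute. The entry name and the 1m interval are assumptions rather than values from the original setup; the interval just needs to be longer than the script's longest run of pings so that two runs don't overlap.

```
# hypothetical names: "netwatch-route" (the saved script) and "netwatch-route-check" (this entry)
# interval=1m is an assumed value; pick one longer than the script's worst-case run time
/system scheduler add name="netwatch-route-check" interval=1m on-event="/system script run netwatch-route"
```

Because the one script handles both the "down" and the "up" case, a single scheduler entry is enough.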
Pre-requisites:
– Script has been tested on v4.13
– Requires 2 routes to a set address/address range: one with a distance of 1 (the preferred route) and the other with a distance of 2
Note: If the address you’re testing lies within the range you’re routing to (eg: 192.168.1.1 and 192.168.1.0/24 as the test address and route respectively), you’ll want to add a static route for the test address (eg: 192.168.1.1 via 192.168.2.1) to ensure you always try the primary path to get to it (otherwise the testing would flap back and forth between the 2 links!)
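A minimal sketch of the static route from the note above, using the note's own example addresses (192.168.1.1 reached via 192.168.2.1). The comment text is a hypothetical label; substitute the test address and primary-path gateway from your own network.

```
# pin the test address to the primary path so the probe never follows the failed-over route
# "Netwatch-Test-Host" is a hypothetical comment; the addresses are the note's example values
/ip route add dst-address=192.168.1.1/32 gateway=192.168.2.1 comment="Netwatch-Test-Host"
```

Give this route a comment other than "Netwatch-Route", since the script finds the route it adjusts by that comment.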
Example configuration:
The netwatch route in this case would be:
/ip route add dst-address=192.168.2.0/24 gateway=192.168.4.2 distance=1 comment="Netwatch-Route"
The script:
# define variables
:local i 0
# check for the specific route with a distance of 1
:if ([/ip route find comment="Netwatch-Route" distance=1]!="") do={
    # add 1 to $i while $i is lower than 5 and there is no ping response (ping command returns 0)
    # if either of the 2 clauses is not met the loop terminates; that is, if a ping response is received (breaking one of the clauses), it terminates before 5 loops
    :do {:set i ($i + 1)} while ($i < 5 && ([/ping 192.168.2.1 interval=3 count=1]=0))
    # when this point is reached, check if $i=5 and if so do the following
    :if ($i=5) do={
        # add a log entry
        :log info "Netwatch-Route has gone down"
        # set the route distance to 3 (putting it at a lower priority than the alternative, which has a distance of 2)
        /ip route set [find comment="Netwatch-Route"] distance=3
    }
    # route with a distance of 1 not found, so...
} else={
    # check for the specific route with a distance of 3
    :if ([/ip route find comment="Netwatch-Route" distance=3]!="") do={
        # add 1 to $i while $i is lower than 10 and there is a ping response (ping command returns 1)
        # if either of the 2 clauses is not met the loop terminates; that is, if a ping goes unanswered (breaking one of the clauses), it terminates before 10 loops
        :do {:set i ($i + 1)} while ($i < 10 && ([/ping 192.168.2.1 interval=3 count=1]=1))
        # when this point is reached, check if $i=10 and if so do the following
        :if ($i=10) do={
            # add a log entry
            :log info "Netwatch-Route is back up"
            # set the route distance to 1 (putting it back to the higher priority)
            /ip route set [find comment="Netwatch-Route"] distance=1
        }
        # no matching route found
    } else={
        # log failure of script
        :log info "Route Required does not exist"
        # log failure to terminal (helpful for testing purposes)
        :error "Route Required does not exist"
    }
}
For Mikrotik 2, the check address would be 192.168.1.1 and the route would be:
/ip route add dst-address=192.168.1.0/24 gateway=192.168.4.1 distance=1 comment="Netwatch-Route"
Uncommented/untabbed version of the script for those copy-paste fanatics (shown with the Mikrotik 2 check address, 192.168.1.1):
:local i 0
:if ([/ip route find comment="Netwatch-Route" distance=1]!="") do={
:do {:set i ($i + 1)} while ($i < 5 && ([/ping 192.168.1.1 interval=3 count=1]=0))
:if ($i=5) do={
:log info "Netwatch-Route has gone down"
/ip route set [find comment="Netwatch-Route"] distance=3
}
} else={
:if ([/ip route find comment="Netwatch-Route" distance=3]!="") do={
:do {:set i ($i + 1)} while ($i < 10 && ([/ping 192.168.1.1 interval=3 count=1]=1))
:if ($i=10) do={
:log info "Netwatch-Route is back up"
/ip route set [find comment="Netwatch-Route"] distance=1
}
} else={
:log info "Route Required does not exist"
:error "Route Required does not exist"
}
}
I can dig it.
Another option would be to create a tunnel and do OSPF across, but this often complicates the issue beyond what is necessary.
Most definitely. We also discussed using BGP to make use of route flap damping; however, as you’ve noted, it raises the level of understanding required to troubleshoot any issues.
Nice script. Based on it, I wrote a script for my very special case.
:local i 0;
:local j 0;
:set i ([/ping 192.168.15.121 size=1000 interval=100ms src-address=192.168.15.125 count=100]);
:set j ([/ping 192.168.12.113 size=1000 interval=100ms src-address=192.168.12.115 count=100]);
:log debug ("XYZ@Net Radio Main: " . ( 100 - $i ) . "%PL *** XYZ@Net Radio Backup: " . ( 100 - $j ) . "%PL");
#both links are fully down, 100% packet loss
:if ($i=0 && $j=0) do={:log warning "XYZ@Net WIRELESS DISASTER, Main and Backup DOWN!!!, SWITCH TO INTRANET IP-IP TUNNEL";
:if ([/ip route get [find comment="BW-XYZ@Net Radio Main"] distance] != 2) do={
/ip route set [find comment="BW-XYZ@Net Radio Main"] distance=2;
:log info "IPSLA TRIGGERED: BW-XYZ@Net Radio Main -> distance=2"; }
:if ([/ip route get [find comment="BW-XYZ@Net Radio Backup"] distance] !=3) do={
/ip route set [find comment="BW-XYZ@Net Radio Backup"] distance=3;
:log info "IPSLA TRIGGERED: BW-XYZ@Net Radio Backup -> distance=3"; }
:if ([/ip route get [find comment="BW-XYZ@Net Intranet"] distance] !=1) do={
/ip route set [find comment="BW-XYZ@Net Intranet"] distance=1;
:log info "IPSLA TRIGGERED: BW-XYZ@Net Intranet -> distance=1";}
}
#backup link has more packet loss
:if ($i>$j ) do={:log debug "BW-XYZ@Net Radio Backup has PROBLEMS!!!, SWITCH to XYZ@Net Radio Main";
:if ([/ip route get [find comment="BW-XYZ@Net Radio Main"] distance] != 1) do={
/ip route set [find comment="BW-XYZ@Net Radio Main"] distance=1;
:log info "IPSLA TRIGGERED: BW-XYZ@Net Radio Main -> distance=1"; }
:if ([/ip route get [find comment="BW-XYZ@Net Radio Backup"] distance] !=2) do={
/ip route set [find comment="BW-XYZ@Net Radio Backup"] distance=2;
:log info "IPSLA TRIGGERED: BW-XYZ@Net Radio Backup -> distance=2"; }
:if ([/ip route get [find comment="BW-XYZ@Net Intranet"] distance] !=3) do={
/ip route set [find comment="BW-XYZ@Net Intranet"] distance=3;
:log info "IPSLA TRIGGERED: BW-XYZ@Net Intranet -> distance=3";}
}
#main link has more packet loss
:if ($j>$i ) do={:log debug "BW-XYZ@Net Radio Main has PROBLEMS, SWITCH to XYZ@Net Radio Backup!!!";
:if ([/ip route get [find comment="BW-XYZ@Net Radio Main"] distance] != 2) do={
/ip route set [find comment="BW-XYZ@Net Radio Main"] distance=2;
:log info "IPSLA TRIGGERED: BW-XYZ@Net Radio Main -> distance=2"; }
:if ([/ip route get [find comment="BW-XYZ@Net Radio Backup"] distance] !=1) do={
/ip route set [find comment="BW-XYZ@Net Radio Backup"] distance=1;
:log info "IPSLA TRIGGERED: BW-XYZ@Net Radio Backup -> distance=1"; }
:if ([/ip route get [find comment="BW-XYZ@Net Intranet"] distance] !=3) do={
/ip route set [find comment="BW-XYZ@Net Intranet"] distance=3;
:log info "IPSLA TRIGGERED: BW-XYZ@Net Intranet -> distance=3";}
}
#both links have the same packet loss
#the $i>0 check skips the case where both links are fully down (handled above), so this block does not undo the switch to the Intranet tunnel
:if ($j=$i && $i>0) do={:log debug ("BW-XYZ@Net Radio Main and Backup both have " . ( 100 - $j ) . "% PL !") ;
:if ([/ip route get [find comment="BW-XYZ@Net Radio Main"] distance] != 1) do={
/ip route set [find comment="BW-XYZ@Net Radio Main"] distance=1;
:log info "IPSLA TRIGGERED: BW-XYZ@Net Radio Main -> distance=1"; }
:if ([/ip route get [find comment="BW-XYZ@Net Radio Backup"] distance] !=2) do={
/ip route set [find comment="BW-XYZ@Net Radio Backup"] distance=2;
:log info "IPSLA TRIGGERED: BW-XYZ@Net Radio Backup -> distance=2"; }
:if ([/ip route get [find comment="BW-XYZ@Net Intranet"] distance] !=3) do={
/ip route set [find comment="BW-XYZ@Net Intranet"] distance=3;
:log info "IPSLA TRIGGERED: BW-XYZ@Net Intranet -> distance=3";}
}
Hope this is useful. I also welcome any suggestions and improvements.
What if the first 4 pings go through and only the fifth one fails? Doesn’t it give a false positive then?
No. Every ping in the run has to fail before the link is recognised as being down. Note the section where it does the pings: the loop stops as soon as a single ping is answered, leaving $i below 5, and the failover step won’t activate unless $i=5. In your example the first successful ping ends the loop straight away, so nothing is changed.
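To watch this behaviour on a test router, the "down" loop can be run on its own with some logging added. This is only a sketch: it reuses the loop form and the check address (192.168.2.1) from the script in the post, and the log messages are made up for illustration.

```
:local i 0
# same loop as the "down" check in the post, with a log line added for each pass
:do {
    :set i ($i + 1)
    :log info ("Netwatch test: pass " . $i)
} while ($i < 5 && ([/ping 192.168.2.1 interval=3 count=1]=0))
# $i only reaches 5 if no ping was answered along the way
:log info ("Netwatch test: loop ended with i=" . $i)
```

If the log shows the loop ending with a value below 5, a ping was answered and the full script would have left the route alone.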
It would be interesting to see a solution with OSPF or BGP.
I suppose these protocols already implement a solution for route flapping when there is packet loss on the data link.
It would be more complicated but cleaner.
Have you seen a solution?
Thanks,
Javier.-
That is quite true, and it was one of my suggestions. However, the client for this particular code didn’t want any routing protocol, as they were concerned that others wouldn’t be able to troubleshoot it but would be fine viewing and understanding the static routes in place. It was suitable for the purpose 🙂
The solution with, say, OSPF is simply to give some links a higher cost value than others. Then, when the OSPF link on the primary path goes offline, the backup path is already there to take the traffic.
Can you tell me if this setup works for VRRP too? (2 MikroTiks on 192.168.1.0/24 and 2 MikroTiks on 192.168.2.0/24)
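As a hedged illustration of the cost idea, assuming OSPF is already running on both links and using the /routing ospf interface menu found in the RouterOS releases of this post's era (newer releases arrange the OSPF menus differently). The interface names and cost values are hypothetical.

```
# ether1 = primary path, ether2 = backup path (hypothetical interface names)
# the lower cost is preferred while the primary adjacency is up
/routing ospf interface add interface=ether1 cost=10
/routing ospf interface add interface=ether2 cost=100
```

With both adjacencies up, traffic follows the lower-cost link. When the primary adjacency drops, OSPF recalculates and the higher-cost path takes over without any script.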