WireGuard extras
This article relies on the following:
- Accessing web interface / command-line interface
Introduction
- This how-to describes the most common WireGuard tuning scenarios adapted for OpenWrt.
- Follow WireGuard server for server setup and WireGuard client for client setup.
- Follow WireGuard protocol for server and client configuration.
- Follow DDNS client to use own server with dynamic IP address.
- Follow Random generator to overcome low entropy issues.
Extras
References
Web interface
To manage VPN settings and view VPN status from the web interface, install the necessary packages.
# Install packages (on apk-based releases, use "apk update" and "apk add" instead)
opkg update
opkg install luci-proto-wireguard qrencode
service rpcd restart
- Navigate to LuCI → Network → Interfaces to configure WireGuard.
- Navigate to LuCI → Status → WireGuard to view WireGuard status.
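The same status information is available from the command line through the `wg` tool. As a minimal sketch, the helper below reads the output of `wg show <interface> latest-handshakes` (one public key and one Unix timestamp per peer) and lists peers whose last handshake is older than a chosen age. The function name `stale_peers`, the interface name `wg0`, and the 180-second threshold are illustrative assumptions, not part of this article.

```shell
#!/bin/sh
# Sketch only: list peers whose last handshake is older than MAX_AGE seconds.
# stdin: output of `wg show <interface> latest-handshakes`, i.e. one
#        "<public-key><TAB><unix-timestamp>" line per peer.
# $1:    MAX_AGE in seconds (illustrative threshold, pick your own)
# $2:    optional "now" timestamp, defaults to the current time
stale_peers() {
    max_age="$1"
    now="${2:-$(date +%s)}"
    awk -v max="$max_age" -v now="$now" '
        # A timestamp of 0 means the peer has never completed a handshake.
        $2 == 0 || (now - $2) > max { print $1 }
    '
}

# Example on a router (interface name "wg0" is an assumption):
#   wg show wg0 latest-handshakes | stale_peers 180
```

An empty result means every peer has completed a handshake within the chosen window.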
Connection probe / VPN automation failover script
No current VPN provider offers a LuCI app for OpenWrt that behaves like its OS-specific desktop, laptop and mobile VPN clients. Those apps can dynamically pull VPN server information and connect, as needed, to the closest or fastest VPN server, or according to other preferences you set (example). When you move the VPN client connection from your individual devices to the router, you must first create your own WireGuard interface(s) on the router (ProtonVPN examples: see here, and here). However, individual VPN servers can go down for maintenance or have other issues, and on the router you lose the oversight, redundancy and error-checking that a client app provides. One way to compensate is to start a script at boot that periodically checks connectivity through your VPN profile. If the check fails and basic WAN connectivity is the cause, the script tries to restore the WAN link; if the WAN is fine, it tries bringing up each VPN profile, one at a time. If none of them can connect, it logs that information so you can take corrective action.
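The "try each VPN profile, one at a time" step can be sketched as follows. This is a simplified illustration, not an excerpt from the script: the names `first_working_vpn`, `PROBE`, `BRING_UP`, `probe_stub` and `bring_up_stub` are hypothetical, and the two stubs stand in for a real connectivity test and for `ifup` so that the logic can be tried off-router.

```shell
#!/bin/sh
# Hypothetical sketch of the failover order; not the actual watchdog script.
# PROBE and BRING_UP hold the names of the commands used to test connectivity
# and to start an interface. The defaults are stand-ins for experimentation.
PROBE="${PROBE:-probe_stub}"
BRING_UP="${BRING_UP:-bring_up_stub}"

probe_stub()    { return 1; }   # placeholder: pretend connectivity is down
bring_up_stub() { :; }          # placeholder: on OpenWrt this could call ifup "$1"

# Try each VPN interface in the order given and print the first one that
# passes the probe. Returns 1 and prints nothing if none of them works.
first_working_vpn() {
    for vpn in "$@"; do
        "$BRING_UP" "$vpn"
        if "$PROBE" "$vpn"; then
            echo "$vpn"
            return 0
        fi
    done
    return 1
}

# Example call with the interface names used later in this article:
#   first_working_vpn protonvpn protonvpn2 protonvpn3 protonvpn4 \
#       || echo "No VPN profile could connect" >> /root/wan_watchdoglog.txt
```

Keeping the probe and the bring-up command replaceable makes it possible to rehearse the ordering without touching live interfaces.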
ABOUT
This script is an upgraded version of the script here, from which it was originally built. It first added VPN handling, then options for SQM and CAKE-AUTORATE, targeting devices that have a cell modem, particularly Quectel devices, which seem to have firmware stability issues but are also among the most common. Sierra Wireless modem support was added later.
- GENERICWAN support (fiber/ethernet) was recently added to the modem_reset function of the script (see changelog). Because this script originated as a cell modem watchdog, support for a generic WAN connection (for instance, into a fiber bridge) had been accidentally left out. The modem_reset function now has a GENERICWAN section at the top: if your WAN link goes down, it does an 'ifup' on your WAN interface (equivalent of ifdown && ifup). If you need more functionality, add it to that section and, if needed, borrow from the other modem sections to build an 'advanced reset' for your situation.
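Before and after the 'ifup', the GENERICWAN section inspects the `ifstatus` output for the interface. The sketch below isolates those two steps so they can be tried on sample data. It reuses the grep patterns and the days/hours/minutes arithmetic found in the script, but the function names `wan_up_state` and `format_uptime` are hypothetical helpers introduced here for illustration.

```shell
#!/bin/sh
# Sketch only: classify `ifstatus <interface>` JSON read from stdin as
# "up", "down" or "unknown", using the same grep patterns as the script.
wan_up_state() {
    status="$(cat)"
    if printf '%s\n' "$status" | grep -q '"up": true'; then
        echo up
    elif printf '%s\n' "$status" | grep -q '"up": false'; then
        echo down
    else
        echo unknown
    fi
}

# Convert an uptime in seconds into days, hours and minutes.
format_uptime() {
    awk -v s="$1" 'BEGIN {
        printf "%d days, %d hours, %d minutes\n",
            int(s / 86400), int(s % 86400 / 3600), int(s % 3600 / 60)
    }'
}

# Example on a router (interface name "wan" as in the script's default):
#   ifstatus wan | wan_up_state
#   format_uptime 93840
```

Note that an "up" result only reflects the interface state reported by netifd; as the script itself warns, it does not prove that traffic is actually flowing.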
REQUIREMENTS and PREREQUISITES
- Set all values in the script to match your actual values: e.g. wan, qmi, wwan0, /dev/cdc-wdm0, etc. See the comments in the script for instructions. Save the script as /root/wan-watchdog.sh. Do chmod +x wan-watchdog.sh.
- This script requires the fping package and, for cellular modems, the socat package. Add them to your build image, or use the built-in software manager (apk/opkg).
- The Cake-Autorate/SQM auto-management functions are disabled by default in the script, as they require the setup from the OpenWrt SQM guide (luci-app-sqm) and a cake-autorate installation. On top of SQM, Cake-Autorate automatically adjusts bandwidth and other settings to compensate for variable-bandwidth WAN connections (cellular).
- SQM can significantly increase router CPU requirements, and SQM requires the firewall's Flow-Offloading option to be set to Software, not Hardware. See this chart for benchmark results that will be similar to the router performance requirements with SQM Cake enabled: https://github.com/cyyself/wg-bench
- Generic WAN connections and, for cellular modems, plain QMI and plain MBIM are supported. Other cell modem WWAN protocols have not been tested. ModemManager's additional built-in auto-management routines would interfere with this script. Quectel devices seem to have hard-lock issues after a few minutes of uptime if the MTU (through QMI, MBIM or ModemManager) is set, but if un-set, plain QMI seems quite stable. This script evolved from a cell modem router watchdog script, incorporating VPN aspects later.
- fping tests both IPv6 and IPv4, in that order, and the check fails if either test fails. Therefore, if you are using the VPN option, your VPN must support both IPv4 and IPv6; otherwise, modify the check_fping function in the script to test only what you have (e.g. IPv4-only).
- Change the IPv4 and IPv6 test addresses (DNS4SITES and DNS6SITES in the script), preferably to your ISP's DNS servers. The script has them set for Verizon.
- VPN Option Requires: Setting up 4 VPN interfaces
- Make sure each VPN works by itself. Here is a set-up guide: WireGuard on OpenWrt guide. Uncheck 'Bring up on boot' for each VPN interface.
- Modify the variable declarations in the script to match your VPN interfaces (e.g. protonvpn, protonvpn2, protonvpn3, protonvpn4).
- If you are here for the cellular and other aspects and not using a VPN, simply leave the default values.
- Set VPN="0" if you do not want VPN support with this script (e.g. only other functions, qmi/wwan0/WAN connectivity-checking).
- The script's BOOTWAIT time is pre-set to 5 seconds in this example so that you can test it from the shell. While BOOTWAIT is below 45, the script is in testing mode: it runs once, then exits (no loop), and does NOT REBOOT the device, even if there is a connectivity failure. Raise BOOTWAIT to 45 or more only after you have run the script manually and confirmed that it works. When you are ready, execute '/root/wan-watchdog.sh &' to run it in the background; you can then exit the shell. If you've modified rc.local as described below, the script will also be called at next boot.
- If the script is run as ./wan-watchdog.sh test, that overrides the BOOTWAIT timer and runs it immediately in test mode. This way, as you prepare to put it into production or make changes, you do not have to keep editing the BOOTWAIT value to toggle the script in and out of testing mode.
- When the script runs in test mode, logging goes to /root/wwatch.log.
- Run this script until you have no errors. Only then increase the BOOTWAIT time to move it out of Test Mode.
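The test-mode rules in the list above reduce to a small decision on two inputs: the BOOTWAIT value and the first command-line argument. The sketch below restates that decision as a standalone function so the outcomes can be checked without running the watchdog. The function name `select_mode` and its printed labels are hypothetical, introduced only for this illustration.

```shell
#!/bin/sh
# Sketch only: decide the run mode from BOOTWAIT ($1) and the first
# command-line argument ($2). Prints "normal", "test" or "invalid".
select_mode() {
    bootwait="$1"
    arg="$2"
    if [ -n "$arg" ] && [ "$arg" != "test" ]; then
        echo invalid    # unknown argument: the script exits immediately
    elif [ "$arg" = "test" ] || [ "$bootwait" -lt 45 ]; then
        echo test       # run once, log to /root/wwatch.log, never reboot
    else
        echo normal     # wait BOOTWAIT seconds, then loop
    fi
}

# Examples:
#   select_mode 5 ""       prints "test"    (BOOTWAIT below 45)
#   select_mode 300 test   prints "test"    (argument overrides the timer)
#   select_mode 300 ""     prints "normal"
#   select_mode 300 TEST   prints "invalid" (the argument is case-sensitive)
```

The last example is worth noting: only the exact lowercase argument `test` is accepted, and anything else makes the script exit.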
Click on 'wan-watchdog.sh' below to download a copy.
- wan-watchdog.sh
#!/bin/sh
#wan-watchdog.sh v.20260404.01

# Settings:
BOOTWAIT=5 # Under 45 seconds, automatically sets TESTMODE=1: Script will run once in Testing Mode and then quit (not loop).
# You can also invoke testing mode by running './wan-watchdog.sh test'. That will over-ride the timer (0 seconds and always in test mode).
# The reason you might want to still use the timer-method, a shorter bootwait (under 45 seconds, like 40 seconds), is to leave the script as-is,
# allowing the router to reboot and rc.local to call the script, without looping the script, for testing purposes.

# Where do you want the connection error messages written to?
log_file="/root/wan_watchdoglog.txt" # Besides writing the log start date/time, only failures or changes to the connection are written to file.
# ALERT!: Running in TESTMODE will divert log to /root/wwatch.log.

# Re-test connection interval: How many seconds to wait after a completed process of checking connectivity, to do it again?
LOOPWAIT=30
WAITFOR=3 # Seconds to wait in between steps (adds momentary pauses between steps, useful for watching the script when debugging in test mode)

MODEMTYPE="GENERIC" # Determines the cell modem reset procedure used within the modem_reset function. Use GENERICWAN, QUECTELQMI, SIERRAQMI or SIERRAMBIM.
# At this time, different reset procedures are used, depending on the modem and the mode it's operating in (QMI or MBIM). They may be merged in future versions.
# It's found at present testing, that the two different brands of modems exhibit slightly different lock-up characteristics, and different procedures based on actual experiences, are used to recover.
# GENERIC / GENERICWAN was recently added for people with fiber/ethernet or other non-cellular WAN link.
NVRELOAD=0 # Testing a 3rd-level modem restoration technique, currently only used for SIERRAMBIM: If the Sierra modem fails in the 'up:false' condition, this can happen when the cell tower is fine, but instead be caused by corruption in the modem's settings. Enabling this option will do an extended reset.
# 20260220: Update!: Further testing and research, revealed that generally, DO NOT use the above method. What you probably should be doing if you run into the described condition, is delete all the modem images and preferences (Sierra AT!ENTERCND="A710",AT!IMAGE=0 command) and reflash only your carrier firmware, with qmi-firmware-update flash tool (not fwdld)
# CAUTION: If you are going to use this, before you set this to '1', you should have entered management mode, when your modem was working, and done an AT!NVBACKUP=3, to save your modem's working settings in the NVRAM backup location 3.

# If your router has a general wan connection, WWANNAME="wan" would probably be the correct setting.
WWANNAME="wan" # On cellular modem routers, usually 'mbim' or 'qmi' - whatever you named the interface in Device -> Interfaces.
# ^^ As 'ifstatus' and 'ifup' recognizes it. This is different than the 'ifconfig WWANDEVICE' below.
WWANDEVICE="eth0" # The WWAN DEVICE: Varies by device. Usually wwan0 on cellular modem routers, same as used by 'ifconfig'.
MODEMUSBDEVNODE="/dev/cdc-wdm0" # Used in cellular modem routers, for the (wan) modem_reset function. This setting is normally /dev/cdc-wdm0 on either QMI or MBIM-configured single-modem systems.

# Replace with your provider's DNS servers, even if you are not using them for DNS. Used for checking if both ipv6 and ipv4 are up and working.
DNS6SITES="2001:4888:32:ff00:3c1:d:: 2001:4888:3b:ff00:3c2:d::" # Replace with YOUR provider's IPV6 DNS servers. (These sample ones are Verizon's)
DNS4SITES="198.224.151.135 198.224.150.135" # Replace with YOUR provider's IPV4 DNS servers.

# Requires the SQM and Cake-Autorate setup steps, first.
# See further notes below. If you do NOT have SQM and CAKE-AUTORATE configured, choose '0'
DOYOULIKECAKE=0 # Enable Cake-Autorate (requires SQM). Works with WWAN or VPN.

# Requires the SQM module and setup steps, first. See below notes (too long for here). If you do NOT have SQM configured, or are not using it, choose '0'.
# SQM? 1 - Yes, turn-on and use SQM. 0 - No. If No, you can turn on Hardware flow control in Firewall.
FEELINGSQMISH=0 # SQM does not work with Hardware Offloading in your Firewall settings: Turn it to Software Offloading if using SQM.
# Low-performance routers, cannot do a Cake shaping with SQM effectively, let alone CAKE-AUTORATE. Real-world performance is better simply on such devices, with both SQM disabled and Hardware Offloading enabled.

if [ $DOYOULIKECAKE = 1 ] ; then
    FEELINGSQMISH=1 # SQM _must_ be enabled if Cake-Autorate is turned-on.
fi

# VPN (1 - Yes / 0 - No):
# Define whether you want to use VPN's: 1 for yes 0 for no. You may, for whatever reason, not want to use VPN's on your router at some time, e.g. during testing.
# ATTENTION: Reminder when you set-up all your VPN profiles, to configure them to NOT 'Start on boot'. Let this script start them instead.
VPN="0"

# Put here the names of your VPN Interfaces (Luci->Network->Interfaces):
# ATTENTION: None of your VPN profiles (interfaces) should be set to 'Start at boot' - UNCHECK that. And manually test each one to make sure they work.
VPN1="protonvpn"
VPN2="protonvpn2"
VPN3="protonvpn3"
VPN4="protonvpn4"

# Lastly, check the DNS hosts in the check_fping function. You may want to use your provider's DNS servers for better results.

# Changelog: Newest comments at the top
#
# v20260404.01: Fixed some missing log-recording when the second-level SIERRAQMI/QUECTELQMI routine fails to bring the modem back up:
# ... Was simply missing some tee -a logfile commands
# ... ATINOUT swapped with SOCAT, as SOCAT is included in the main openwrt sources.
# ... Added more extensive logging after second-level failure: SIM status, AT!GSTATUS? and the equivalent Quectel command
# v20260310.01: Added an elif condition for the SQM-CAKE portion, when SQM is installed, selected ON, and either running or not.
# ... previously, this script did not have a condition for when you are only using SQM alone, without cake-autorate.
# ... it will currently not catch if you selected FEELINGSQMISH OFF (no SQM), but SQM is installed, like elsewhere (yet).
# ... will add that shortly. In the meantime, you can manipulate the script SQM Section to do what you need.
# v20260223.01: Added an elif condition for the SQM-CAKE portion, in case a reboot is needed. With the 20260222 fall-through addition, the fall-through portion
# ... was catching this NEEDREBOOT=1 CONNECTIONSTATUS=0 condition (requiring reboot).
# v20260222.01: Updated check_fping was missing a ' ' (typo), causing a fall-through condition instead of corrective action.
# ... Various modem_reset techniques had inconsistent delay timers after 'ifup' command. Takes around 65 seconds (maximum) in testing for QMI, but only 35-40 seconds
# ... for MBIM. Having too short of a delay can cause a false-negative condition.
# ... Updated the internal documentation (these notes, remarks in the script so you know what's happening)
# v20260219.01: Some improvements in the layout and wording output and sent to the log, when there is an fping failure condition.
# ... Added: fall-through condition for check_fping.
# ... Improved: fall-through logging of variables in the Cake/SQM portion
# ... Added: a GENERIC/GENERICWAN 'modem_reset' profile. This script will log if there is an issue, but the script will not actually do anything unless you add some instructions under that profile.
# ... This is for people who have fiber or some other upstream, and want to use this script for its wireguard/sqm/cake functions.
# ... While it's obviously needed, this script evolved from a modem watchdog script, and turned into that + VPN then + SQM & Cake-Autorate.
# ... Added: './wan-watchdog.sh test' will run in TESTMODE, regardless of the BOOTWAIT timer. Handy for trouble-shooting.
# v20260209.xx: critical change: SQM/Cake-Autorate: added missing else condition, when neither is installed.
# ... This script previously accidentally assumed SQM/CAR were both at least, installed, even if not active. It also did not have a final 'fall through' else condition, either.
# ... If you are not using SQM or CAR, and so if you _never installed sqm or cake-autorate_ on your router, this update applies to you.
# ... A final 'else' was also added, to output all the variables, log and quit the script, if the conditional evaluations don't match with anything anticipated.
# ... Also, checks if you've enabled SQM or Cake-Autorate, and that those services *exist* on the router. If they don't, log it and quit the script. Useful during testing.
# v20260120.01: SIERRAMBIM reset procedure is using 'atinout', https://github.com/koshev-msk/modemfeed/tree/master.
# ... If you do not have 'atinout', try replacing it in the script with 'socat': (echo "AT!RESET" | socat - /dev/ttyUSB2,crnl,ignoreeof)
# v20260119.01: Modified the SIERRAMBIM to actually send an AT!RESET which is different than an 'ifup mbim'..
# v20260119.01: Added !RMARELOAD option, to reload NVRAM stored settings in SIERRA WIRELESS, in uncommon case where a running modem starts reporting "up:false"
# ... Make sure prior to enabling the NVRELOAD feature, that you have done an AT!NVBACKUP=3 on your working modem settings.
# ... In most cases, if this corruption occurs, it has to do with power supply issues and/or a reflash using 'qmiflash' then fixes the modem.
# v20251226.04: Tightened up the Modem Restart function and logging.
# v20251226.03: ifstatus mbim is now only looking to display and log the up and uptime, not screenfulls of output.
# v20251226.02: DNS4SITES and DNS6SITES variables added. Use YOUR provider's DNS servers.
# v20251226.01: SIERRAMBIM routine added.
# v20251201.03: fix a lazy-fingers typo causing reboot on failure regardless of whether the modem came back up after reset procedure
# v20251201: re-wrote the check_fping script. Was having too much trouble in the prior method with detecting success or failure.
# will begin dating future changelog notes, above.
# minor updates will be noted with x.xx where .xx is the incremental. The general logic and process has not changed.
# Reordered check_fping to do the ipv6 test first, primarily because it's the more important protocol, but also because QMI brings it up first.
# Removed/fixed: removed dns.opendns.com from the ipv6 fping query. It reports a false positive for an unknown reason, when device actually has no connectivity.
# Fixed: --fast-reachable is just --reachable. check_fping was generating a failure, because certain versions don't use --fast-reachable.
# Added: the MODEMUSBDEVNODE variable for what is typically /dev/cdc-wdm0.
# Moved: the modem reset procedure (ifdown/up, AT!RESET) to a function: modem_reset
# Change: Swapped-in dns.cloudflare.com dns.google dns.opendns.com for the test sites to ping. You can use different ones, but the DNS names should resolve with 'ping name -4' and 'ping name -6', and respond normally via both IPv4 and IPv6 addresses. You should use DNS NAMES, not IP addresses.
# Change: Added commands for resetting a hung SIERRA wireless modem. Script originally designed only for QUECTEL.
# Change: moved a number of common configurable variables to the top of this script.
# Fixed: added a command prior to reboot, to do a proper shutdown of SQM which without it will leave a ghost /tmp/run/sqm/sqm-run.lock/pid and the .lock folder, which would otherwise give an
# ...'SQM: WARNING: Unable to get run lock - already held by xxxx' error.
# Fixed: a small script startup logging bug.
# Added: a short sleep period after the sysntpd restart, to make sure the service has a few seconds (5) to retrieve the current time and date.
# Removed: 'service sysntpd restart' instructions from /etc/rc.local (Router boot actions), and moved that command to this script,
# ...as Cell-connected routers can take up to 45 seconds to establish a link, and
# ...therefore require that much time to be able to get the accurate local time on which to write to the watchdog log.
# ...If you have 'service sysntpd restart' in your /etc/rc.local as per earlier instructions, you should delete that line.
# Slight logging changes for clarification.
# VPN=0 handling improvement/fixes.
# Enabled SQM and cake to run on the plain WWAN device, e.g. wwan0, if VPN=0.
# Cleaned up: the use of /lib/functions/network.sh
# Cleaned up: the bootwait logic. Functionally, the previous version of the script works fine, but it makes more sense to do it this way.
# Fixed the cake-autorate logic. Cake-autorate responds variously 'not running' or 'inactive' when it is not running (why?? who knows?).
# Added/changed/updated: clearing SQM interfaces after changing/starting VPN
# Added: Exporting CURRENTVPN value to file for use by cake-autorate.sh.
# Changed: write CURRENTVPN for use by outside scripts to /tmp not /root. It doesn't need to be /root / persistent between restarts.
# Added: logfile as variable instead of explicitly
# Improved: FPING function to test IPv4 and IPv6 both, and issue a failure if either one fails.

# Setup: Call this script from rc.local with '/root/wan-watchdog.sh &'
# On cell-connected routers, what this is designed-for, your rc.local calls this script at startup:
#
# sleep 15 # Gives the router a little more time to complete boot-up processes.
# /root/wan-watchdog.sh & # Starts this script and allows it to run in the background.
# BOOTWAIT TESTMODE determination:
# When BOOTWAIT is less than 45 seconds, this script runs in Testing Mode:
# Ordinary mode: designed for start-on-boot and loop, checking the connection, etc.
# Testing mode: Will cause this script to run 1 time then exit, and if there is e.g. a total wwan failure, it will echo to the screen 'reboot' but not actually reboot the router.
# The wait period gives your cell modem time to connect. Should be at least 45 seconds under ordinary circumstances.
# Usually 45-360 seconds. Initial waiting period before testing.
# Controlling it via BOOTWAIT is done in case you reboot your router manually with the script in testing mode: the script does not automatically keep running, reboot or go into a boot-loop.

# Test BOOTWAIT value to determine test run or normal run mode:
if [ $BOOTWAIT -ge 45 ] && [ -z "$1" ] ; then # if greater or equal to 45 and there is nothing appended to running "./wan-watchdog.sh"
    echo
    echo "Watchdog script running in normal mode. (This message not logged)"
    TESTMODE=0
    echo "Waiting BOOTWAIT value (${BOOTWAIT}) seconds before continuing..."
    sleep $BOOTWAIT # This startup delay must be first, to ensure the time and date recorded in the log is correct.
else # fall-through condition:
    # Handle non-null, non-'test' argument
    if [ "$1" != "test" ] && [ -n "$1" ] ; then # if some gibberish was appended: "./wan-watchdog.sh TEST", or "... gibberish", exit.
        echo
        echo "Something was appended to run wan-watchdog.sh. It was: $1. But I don't have a condition for that: Exiting immediately..."
        exit # exit script immediately
    fi
    # Run in test mode: either 'test' arg given OR BOOTWAIT < 45
    echo
    echo "Watchdog script running in test-mode."
    TESTMODE=1
    log_file="/root/wwatch.log"
    echo "Writing errors to alternative logfile: ${log_file}"
    echo "Wan-watchdog script started in TESTING mode on $(date)." | tee -a "$log_file"
    echo "Run once and exit. No loop, and no reboot if needed (e.g. unrecoverable wan connection failure)."
    # Increase the BOOTWAIT value for ordinary start-up.
fi

# Check for existence of a logfile. Create one if there isn't one:
if ! [ -f ${log_file} ] ; then
    echo # new blank line
    echo "Logfile ${log_file} does not exist... creating."
    echo "Logfile created." >> $log_file
fi

# DOYOULIKECAKE: SQM and Cake-autorate (cell wan bufferbloat)
# To add Cake-Autorate on top of SQM, for variable-speed internet uplinks (cellular, typically):
# 1. You must have SQM installed (luci-app-sqm).
# 2. Cake-autorate must be already set-up according to: https://github.com/lynxthecat/cake-autorate/blob/master/INSTALLATION.md
# 3. Initially you must manually start each VPN profile, one at a time, and go to the Network->SQM QoS menu, and 'Enable this SQM Instance', set the DL/UL, and Queue Discipline to Cake/Layer-of-cake
# 4. Do that with all 4 VPN instances individually.
# 5. Do it also with the plain wwan0 interface/device. Shut off all VPN's then configure Cake/layer-of-cake using the Luci SQM menu on it.
# SQM will remember the configurations for each device: VPN's and raw wwan0. When it is restarted in this script, during testing
# mode you will see errors for all the OTHER devices that are not active. That is normal.
# 6. Note: There are also some lines that need to be modified in /root/cake-autorate/config.primary.sh Cake-Autorate config, as follows:
# Add:
# read -r CAKEWAN < /tmp/currentvpn.txt
# Change:
# dl_if="ifb4${CAKEWAN}" # This will show up in current versions of cake-autorate's config.primary.sh as "dl_if=ifb-wan" instead of dl_if=ifb4wan. They should be changed as specified here.
# ul_if="${CAKEWAN}"
# 7. After you manually get cake-autorate working, DISABLE the cake-autorate service from auto-start:
# Choose 'disabled' (from System->Startup) for cake-autorate. Instead, this script will manually start cake-autorate at the appropriate stage.
# 8.
#    If you have a working configuration with this script and Cake-Autorate, and:
# Your router has gone into TFTP recovery mode after an attempted -sysupgrade, (an ongoing issue when attempting to upgrade an LBR20), and you have a restored backup, and
# prior to the backup you had the following lines in System->Backup->Configuration:
# /etc/firewall.user # for custom firewall commands to change the Hop and TTL
# /root # to back-up the entire root folder, including this script, log and cake-autorate folder.
#
# You need to re-run the cake-autorate's 'setup.sh', and choose Y to keep your existing configuration (which is config.primary.sh),
# to get cake-autorate actually working again. It will already be disabled from auto-starting on boot, in the System->Startup list.

# include functions needed for retrieving current gateway, ipv4 wan and ipv6 wan IP addresses:
. /lib/functions/network.sh

# Below is the custom FPING function: (not from /lib/functions/network.sh, but made specifically for this script instead)
check_fping() {
    # Run the fping command and check if it succeeds
    # Interval in ms, count number is number of failures before function returns FAILURE. You can adjust these numbers as-needed.
    # This pings your provider's DNS servers or those sites of your choosing,
    # If 1 response each from the IPV4 and IPV6 is successful, then it registers SUCCESS.
    # If either all IPv4 or all IPv6 completely fail, or both fail, it responds with FAILURE.
    local resultipv6=0
    local resultipv4=0
    echo "Pinging your selected ipv4 and ipv6 sites:"
    # edit the lines to use your provider's ipv4 and ipv6 dns servers. Global sites like dns.opendns.com may not respond.
    # IPV6 hosts:
    fping --alive --interval=400 --timeout=5000 --period=2000 --count=3 --reachable=1 --addr -6 ${DNS6SITES} || resultipv6=$?
    # IPV4 hosts:
    fping --alive --interval=400 --timeout=5000 --period=2000 --count=3 --reachable=1 --addr -4 ${DNS4SITES} || resultipv4=$?
    # if both ipv4 & 6 are successful, return 0 / success, and continue:
    if [ $resultipv6 -eq 0 ] && [ $resultipv4 -eq 0 ]; then
        echo "Communications working - ordinary fping success. (not logged)"
        return 0 # Return success
    # if either ipv4 or ipv6 are unsuccessful, reply with an error, create a newline in the log and post an error with the date:
    elif [ $resultipv6 -eq 1 ] || [ $resultipv4 -eq 1 ] ; then
        echo | tee -a "$log_file"
        echo "On $(date): ALERT!: fping:" | tee -a "$log_file"
        # if it was ipv6 that failed, report & log that specifically:
        if [ $resultipv6 -eq 1 ] ; then
            echo "fping: no response: FAILURE on IPv6 test!" | tee -a "$log_file" # only log on failure
        fi
        # if it was ipv4 that failed, report & log that specifically:
        if [ $resultipv4 -eq 1 ]; then
            echo "fping: no response: FAILURE on IPv4 test!" | tee -a "$log_file" # only log on failure
        fi
        return 1 # Return failure
    else # check_fping 'fall through' else condition. No previous conditions met: log everything and quit the script.
        echo "On $(date): ALERT!:" | tee -a "$log_file"
        echo "Fall-through condition reached, unexpected error: failure fping" | tee -a "$log_file"
        echo "FIXME! Some other error or condition unknown in the check_fping function section of the script" | tee -a "$log_file"
        echo "Current values for:" | tee -a "$log_file"
        echo "resultipv6: ${resultipv6}" | tee -a "$log_file"
        echo "resultipv4: ${resultipv4}" | tee -a "$log_file"
        echo "NET6_IF_NAME: ${NET6_IF_NAME}" | tee -a "$log_file"
        echo "NET_IF_NAME: ${NET_IF_NAME}" | tee -a "$log_file"
        echo "MODEMTYPE: ${MODEMTYPE}" | tee -a "$log_file"
        echo "WWANNAME: ${WWANNAME}" | tee -a "$log_file"
        echo "WWANDEVICE: ${WWANDEVICE}" | tee -a "$log_file"
        echo "CONNECTIONSTATUS: ${CONNECTIONSTATUS}" | tee -a "$log_file"
        echo "VPNCHANGED: ${VPNCHANGED}" | tee -a "$log_file"
        echo "DOYOULIKECAKE: is ${DOYOULIKECAKE}" | tee -a "$log_file"
        echo "FEELINGSQMISH: is ${FEELINGSQMISH}" | tee -a "$log_file"
        echo "Doing a network_flush_cache and re-reading NETx_IF_NAME..."
        echo "Flushing network cache and re-testing NETx_IF_NAME results.."
        network_flush_cache
        network_find_wan NET_IF_NAME
        network_find_wan6 NET6_IF_NAME
        echo "Current NET_IF_NAME: ${NET_IF_NAME}" | tee -a "$log_file"
        echo "Current NET6_IF_NAME: ${NET6_IF_NAME}" | tee -a "$log_file"
        echo "Exiting script..." | tee -a "$log_file"
        exit # Immediately exit script
    fi
}

modem_reset() {
    echo | tee -a "$log_file"
    echo "Something wrong, fping failing. Attempting to do recovery routines for WAN/Modem Type: ${MODEMTYPE}" | tee -a "$log_file"

    # GENERIC WAN routine v20260219.01 (fiber/ethernet/coax cable etc, not cellular modems)
    if [ "$MODEMTYPE" = "GENERIC" ] || [ "$MODEMTYPE" = "GENERICWAN" ] ; then
        echo "Running selected GENERIC WAN recovery profile." | tee -a "$log_file"
        echo "We will attempt a simple 'ifup', but you should modify the script to do what you want when the WAN connection fails." | tee -a "$log_file"
        echo "Note! An 'online' response does not actually mean the WAN is working." | tee -a "$log_file"
        echo "But, if modem_reset function is running, there is no actual connectivity." | tee -a "$log_file"
        sleep $WAITFOR
        # first only probe and then log and memorize the WAN status:
        if ifstatus "${WWANNAME}" | grep -q '"up": true'; then # If up status = true, but there's no connectivity, log it first:
            # If 'up' status is 'true', then
            # The WAN is crashed-out 'soft-crash' internally, but there is no throughput:
            echo "Logging the previous WAN link uptime duration, in seconds:" | tee -a "$log_file"
            sleep $WAITFOR
            ifstatus ${WWANNAME} | grep "uptime" | tee -a "$log_file" # Display what the previous uptime is (when uptime > 0, it implies up:true).
            # Only if uptime is greater than 60 seconds (it should normally be!), then calculate and display cell link uptime in days, hours and minutes:
            if [ "$(ifstatus "${WWANNAME}" | grep -o '"uptime": [0-9]*' | cut -d' ' -f2)" -gt 60 ] ; then # if uptime is greater than 60 seconds (should be)..
                # Log & Display cell link uptime as Days, Hours and Minutes:
                ifstatus ${WWANNAME} | grep -o '"uptime": [0-9]*' | cut -d' ' -f2 | awk '{s=$1; d=int(s/86400); h=int(s%86400/3600); m=int(s%3600/60); print " Up for: "d" days, "h" hours, "m" minutes."}' | tee -a "$log_file"
            fi
            # set a local function variable so we can remember that the previous status was online+ifstatus was true.
            local previouswanstatus='true'
        elif ifstatus "${WWANNAME}" | grep -q '"up": false' ; then # if uptime is false, something else is wrong..
            ifstatus "${WWANNAME}" | grep '"up": false' | tee -a "$log_file" # log it if the WAN is in offline/false mode.
            # set a local function variable so we can remember that the previous status was offline+ifstatus was false.
            local previouswanstatus='false'
        fi
        echo "Here's where you insert your custom routines, what you want the router to try to do to recover your WAN connection." | tee -a "$log_file"
        echo "For now, we will send a generic 'ifup' to the designated WWANNAME interface." | tee -a "$log_file"
        echo "Sending: ifup ${WWANNAME}" | tee -a "$log_file" # as reset is not the same as ifup
        ifup ${WWANNAME}
        echo "Waiting another 50 seconds for the actual IP data interface to come up..." | tee -a "$log_file"
        sleep 50
        # re-test what the status of the WWANNAME is, after repair steps:
        if ifstatus "${WWANNAME}" | grep -q '"up": true'; then # If up status is true, we assume everything is fixed. Display and log the result, and return out of the function
            ifstatus ${WWANNAME} | grep "up" | tee -a "$log_file" # We need to show the 'up' status, and uptime. Should be about 10 seconds.
            # If the WAN was successfully reset/recovered, it will typically show up=true, and an uptime of about 10 seconds:
            # > we are not going to calculate the Days Hours Minutes again, because it would only have been a few seconds of uptime.
            return # Exits the modem restart function. Let fping do a follow-up test to the connection. If it fails, the router will reboot!
        # 'up:false' - Else, if the up status was and is still 'false', then this could be a more abnormal condition.
        elif ifstatus "${WWANNAME}" | grep -q '"up": false' && [ "$previouswanstatus" = "false" ]; then
            # The WAN is in an abnormal condition. 'ifup' etc did not work. Log and exit function.
            ifstatus ${WWANNAME} | grep "up" | tee -a "$log_file" # log and display the up status.
            echo "*!ALERT!* exiting, unresolvable condition for the ${MODEMTYPE}." | tee -a "$log_file"
            echo "GENERIC WAN abnormally stuck in offline mode! Investigate it manually. Exiting script directly, to avoid unnecessary reboot which probably will not fix the issue.." | tee -a "$log_file"
            exit # Exit the script
        # 'up:false' - Else, if the up status is 'false', and it was previous up:true, this is an abnormal condition. Report and exit script.
        elif ifstatus "${WWANNAME}" | grep -q '"up": false' && [ "$previouswanstatus" = "true" ]; then
            # The WAN is in an abnormal condition. 'ifup' etc did not work. Log and exit function.
      ifstatus ${WWANNAME} | grep "up" | tee -a "$log_file" # log and display the up status.
      echo "*!ALERT!* exiting, unresolvable condition for the ${MODEMTYPE}." | tee -a "$log_file"
      echo "GENERIC WAN is in offline(up:false) mode, but was previously reporting up:true! Investigate it manually. Exiting script directly, to avoid unnecessary reboot which probably will not fix the issue.." | tee -a "$log_file"
      exit # Exit the script
    else # v20260219.01 Fall through for exception in GENERICWAN fix conditional procedure
      echo | tee -a "$log_file"
      echo "On $(date): ALERT!:" | tee -a "$log_file"
      echo "Fall-through condition reached in GENERICWAN portion of modem_reset function, unexpected error: failure" | tee -a "$log_file"
      echo "aborting entire script...." | tee -a "$log_file"
      echo "Current values for:" | tee -a "$log_file"
      echo "previouswanstatus: ${previouswanstatus}" | tee -a "$log_file"
      echo "MODEMTYPE: ${MODEMTYPE}" | tee -a "$log_file"
      echo "NVRELOAD: ${NVRELOAD}" | tee -a "$log_file"
      echo "Current NET_IF_NAME: ${NET_IF_NAME}" | tee -a "$log_file"
      echo "Current NET6_IF_NAME: ${NET6_IF_NAME}" | tee -a "$log_file"
      echo "WWANNAME: ${WWANNAME}" | tee -a "$log_file"
      echo "WWANDEVICE: ${WWANDEVICE}" | tee -a "$log_file"
      echo "CONNECTIONSTATUS: ${CONNECTIONSTATUS}" | tee -a "$log_file"
      echo "VPNCHANGED: ${VPNCHANGED}" | tee -a "$log_file"
      echo "DOYOULIKECAKE: is ${DOYOULIKECAKE}" | tee -a "$log_file"
      echo "FEELINGSQMISH: is ${FEELINGSQMISH}" | tee -a "$log_file"
      echo "Doing a network_flush_cache and re-reading NETx_IF_NAME..."
      network_flush_cache
      network_find_wan NET_IF_NAME
      network_find_wan6 NET6_IF_NAME
      echo "Current NET_IF_NAME: ${NET_IF_NAME}" | tee -a "$log_file"
      echo "Current NET6_IF_NAME: ${NET6_IF_NAME}" | tee -a "$log_file"
      echo "Exiting script..." | tee -a "$log_file"
      exit # Immediately exit script
    fi
  # SIERRAQMI/QUECTELQMI routine v20260404.01
  # 20260404: fixed some missing instructions to save the output to log file.
  # 20260222: restructured the logic in detecting up/down status. It uses the same method but the logical path it follows is clearer:
  # It reports what the 'ifstatus' responds with, regardless if it's up/true or down/false. It also now saves that status in a local variable for comparison, later.
  # Then it takes action and re-checks the status, and compares it with the previous result.
  elif [ "$MODEMTYPE" = "SIERRAQMI" ] || [ "$MODEMTYPE" = "QUECTELQMI" ] ; then
    echo "Querying online status via uqmi:" | tee -a "$log_file"
    echo "Sending: uqmi -d ${MODEMUSBDEVNODE} --get-device-operating-mode..." | tee -a "$log_file"
    uqmi -d ${MODEMUSBDEVNODE} --get-device-operating-mode | tee -a "$log_file" # Log what uqmi cli *thinks* the modem status is (for reference).
    echo "Note! An 'online' response does not actually mean the modem is working. Typically, they crash-out in the 'online' state" | tee -a "$log_file"
    echo "But if this function is running, there is no actual connectivity." | tee -a "$log_file"
    sleep $WAITFOR
    # check reported uptime, if it is up, and log
    if ifstatus "${WWANNAME}" | grep -q '"up": true'; then # If up status = true, let's log that and uptime, first:
      # If 'up' status is 'true', then
      # The modem is crashed-out 'soft-crash' internally, but there is no throughput:
      echo "'ifstatus ${WWANNAME}' is responding that connection is 'up', although since this routine (modem_reset) is running, then the cell modem is probably 'light'-crashed internally." | tee -a "$log_file"
      echo "Logging the previous cell link uptime duration, in seconds:" | tee -a "$log_file"
      sleep $WAITFOR
      ifstatus ${WWANNAME} | grep "uptime" | tee -a "$log_file" # Display what the previous uptime is (when uptime > 0, it implies up:true).
      # Only if uptime is greater than 60 seconds (it should normally be!), then calculate and display cell link uptime in days, hours and minutes:
      if [ "$(ifstatus "${WWANNAME}" | grep -o '"uptime": [0-9]*' | cut -d' ' -f2)" -gt 60 ] ; then # if uptime is greater than 60 seconds (should be)..
        # Log & Display cell link uptime as Days, Hours and Minutes:
        ifstatus ${WWANNAME} | grep -o '"uptime": [0-9]*' | cut -d' ' -f2 | awk '{s=$1; d=int(s/86400); h=int(s%86400/3600); m=int(s%3600/60); print " Up for: "d" days, "h" hours, "m" minutes."}' | tee -a "$log_file"
      fi
      # set a local function variable so we can remember that the previous status was online+ifstatus was true.
      local previouswanstatus='true'
    # If false/down, log it, and make a note of the current status, for later comparison:
    elif ifstatus "${WWANNAME}" | grep -q '"up": false'; then
      echo "'ifstatus ${WWANNAME}' is initially responding that connection is 'down'." | tee -a "$log_file"
      # set a local function variable so we can remember that the previous status was offline+ifstatus was false.
      local previouswanstatus='false'
    fi
    # now take a simple 'ifup' action, usually successful:
    sleep $WAITFOR
    echo "Attempting basic 'ifup ${WWANNAME}', to re-establish connectivity:" | tee -a "$log_file"
    ifup $WWANNAME
    echo "Wait 65 seconds to give the modem time to reconnect..." | tee -a "$log_file"
    sleep 65 # give the modem time to reconnect (takes at least 25 seconds for a typical QMI Quectel config. YMMV)
    # if it responds as up, we consider it a success
    if ifstatus "${WWANNAME}" | grep -q '"up": true' ; then # Log the up status and uptime, presuming a successful reconnect, pass back to fping to check.
      ifstatus "${WWANNAME}" | grep "up" | tee -a "$log_file"
      return # Seems successful! Exit the modem restart function for a follow-up check_fping
    # if it is reporting subsequently, that the up status is (still) false/down, log it and do a more invasive approach:
    # invasive process:
    elif ifstatus "${WWANNAME}" | grep -q '"up": false' ; then
      ifstatus "${WWANNAME}" | grep '"up": false' | tee -a "$log_file" # log it if the modem is in offline/false mode.
      # try a more invasive approach to getting the wan interface up
      echo "Previous simple 'ifup' unsuccessful." | tee -a "$log_file"
      echo "Full RESET of modem:" | tee -a "$log_file"
      echo "Sending: uqmi -d ${MODEMUSBDEVNODE} --set-device-operating-mode reset" | tee -a "$log_file"
      uqmi -d ${MODEMUSBDEVNODE} --set-device-operating-mode reset | tee -a "$log_file"
      echo "Wait 25 seconds to give the modem time to reconnect, then re-querying the status:" | tee -a "$log_file"
      sleep 25 # Mandatory 25 seconds
      echo "Getting operating mode status..." | tee -a "$log_file"
      echo "Sending: uqmi -d ${MODEMUSBDEVNODE} --get-device-operating-mode" | tee -a "$log_file"
      uqmi -d ${MODEMUSBDEVNODE} --get-device-operating-mode | tee -a "$log_file"
      echo "Sending: ifup ${WWANNAME}" | tee -a "$log_file" # as reset is not the same as ifup
      ifup ${WWANNAME}
      echo "Waiting 65 seconds for the actual IP data interface to come up..." | tee -a "$log_file"
      sleep 65
      # followup testing, after more invasive process, checking modem self-reporting online/offline status
      # Result: (1) we tried a simple 'ifup' to fix a no-throughput condition on the wan, but (2) the follow-up up status returned down/false.
      # so (3) we tried a more advanced reset + ifup, and (4) now it's (reported to be) working.
      # success!
      if ifstatus "${WWANNAME}" | grep -q '"up": true' ; then
        echo "SUCCESS! Modem reporting:" | tee -a "$log_file"
        ifstatus "${WWANNAME}" | grep "up" | tee -a "$log_file" # Log the up status and uptime, presumes a successful reconnect (if not, the followup fping will reboot the router when detecting the failure).
        return # Success! Exit the modem restart function for a follow-up check_fping
      fi
      # same as (1,2,3), but (4) the follow-up status still returns false/down both times. Hard down condition.
      # sadness
      if ifstatus "${WWANNAME}" | grep -q '"up": false' && [ "$previouswanstatus" = "false" ]; then # if 'up' is still false, something is wrong..
        echo "FAILURE! Modem reporting:" | tee -a "$log_file"
        ifstatus "${WWANNAME}" | grep '"up": false' | tee -a "$log_file" # log the status is in offline/false mode.
        echo "Some sort of severe error condition:" | tee -a "$log_file"
        echo " • WAN Connection ${MODEMTYPE} had a no-throughput condition." | tee -a "$log_file"
        echo " • Modem initially reported as down/false/offline." | tee -a "$log_file"
        echo " • The basic reconnect attempted, with no up/true response, then advanced reconnect/reset, also with no up/true reported." | tee -a "$log_file"
        echo " • Modem still reporting as down/false/offline." | tee -a "$log_file"
        echo "Script exiting...." | tee -a "$log_file"
        exit # Exit the script. Previous state was down/false, with no connectivity. It was previously 'false': it is still 'false'.
      # unusual condition, can happen sometimes due to modem corruption, sim slot issues
      # ... (4) the follow-up status returned down/false, yet it was originally reported true (with no throughput)
      elif ifstatus "${WWANNAME}" | grep -q '"up": false' && [ "$previouswanstatus" = "true" ]; then # if 'up' has reverted to false, something is very wrong..
        ifstatus "${WWANNAME}" | grep '"up": false' | tee -a "$log_file" # log it.
        # this is an odd condition, usually related to modem corruption of some sort. It was previously self-reporting 'up' with no throughput,
        # and after subsequent attempts to bring it truly online, it is now reporting down/offline/false.
        echo "Some sort of severe error condition:" | tee -a "$log_file"
        echo " • WAN Connection ${MODEMTYPE} had a no-throughput condition." | tee -a "$log_file"
        echo " • Modem previously reported online/true/up, as it does, despite no throughput condition." | tee -a "$log_file"
        echo " • Neither the basic nor advanced reconnect/reset was successful." | tee -a "$log_file"
        echo " • And in the end, the modem reverted/reports as down/false/offline." | tee -a "$log_file"
        echo " • Sometimes, it is a SIM problem:" | tee -a "$log_file"
        echo " • Recording AT+CPIN? status:" | tee -a "$log_file"
        echo "AT+CPIN?" | socat - /dev/ttyUSB2,crnl | tee -a "$log_file"
        echo " • If output above shows SIM_STATUS_ILLEGAL, you may simply have to re-insert the SIM, and if your device does not detect hotswapped SIMs, manually send an AT+CFUN=1,1 to reboot the modem only, then manually check again with AT+CPIN?. It should say READY." | tee -a "$log_file"
        if [ "$MODEMTYPE" = "SIERRAQMI" ] ; then
          echo " • Recording AT!GSTATUS? (only works for SIERRA):" | tee -a "$log_file"
          echo "AT!GSTATUS?" | socat - /dev/ttyUSB2,crnl | tee -a "$log_file"
        elif [ "$MODEMTYPE" = "QUECTELQMI" ] ; then
          echo " • Recording AT+QENG=\"servingcell\" status (only works on QUECTEL):" | tee -a "$log_file"
          echo 'AT+QENG="servingcell"' | socat - /dev/ttyUSB2,crnl | tee -a "$log_file"
        fi
        echo "Script exiting.... Check SIM, modem and tower status manually." | tee -a "$log_file"
        exit # Exit the script. Check the sim, modem, check the tower status.
      fi
    else # v20260219.01 Fall through for SIERRAQMI/QUECTELQMI:
      echo | tee -a "$log_file"
      echo "On $(date): ALERT!:" | tee -a "$log_file"
      echo "Fall-through condition reached in QMI portion of modem_reset function, unexpected error: failure" | tee -a "$log_file"
      echo "Current values for:" | tee -a "$log_file"
      echo "previouswanstatus: ${previouswanstatus}" | tee -a "$log_file"
      echo "MODEMTYPE: ${MODEMTYPE}" | tee -a "$log_file"
      echo "NVRELOAD: ${NVRELOAD}" | tee -a "$log_file"
      echo "Current NET_IF_NAME: ${NET_IF_NAME}" | tee -a "$log_file"
      echo "Current NET6_IF_NAME: ${NET6_IF_NAME}" | tee -a "$log_file"
      echo "WWANNAME: ${WWANNAME}" | tee -a "$log_file"
      echo "WWANDEVICE: ${WWANDEVICE}" | tee -a "$log_file"
      echo "CONNECTIONSTATUS: ${CONNECTIONSTATUS}" | tee -a "$log_file"
      echo "VPNCHANGED: ${VPNCHANGED}" | tee -a "$log_file"
      echo "DOYOULIKECAKE: is ${DOYOULIKECAKE}" | tee -a "$log_file"
      echo "FEELINGSQMISH: is ${FEELINGSQMISH}" | tee -a "$log_file"
      echo "Doing a network_flush_cache and re-reading NETx_IF_NAME..."
      network_flush_cache
      network_find_wan NET_IF_NAME
      network_find_wan6 NET6_IF_NAME
      echo "Current NET_IF_NAME: ${NET_IF_NAME}" | tee -a "$log_file"
      echo "Current NET6_IF_NAME: ${NET6_IF_NAME}" | tee -a "$log_file"
      echo "Exiting script..." | tee -a "$log_file"
      exit # Immediately exit script
    fi
  # SIERRAMBIM routine v20260201.01
  elif [ "$MODEMTYPE" = "SIERRAMBIM" ] ; then
    # check and log the status of the modem: true
    if ifstatus "${WWANNAME}" | grep -q '"up": true'; then # If up status = true, go through the modem reset / reboot procedure for the 'soft-crash' condition:
      # If 'up' status is 'true', then
      # The modem is crashed-out 'soft-crash' internally, but there is no throughput:
      # up:true - modem responds to a status query as 'up', but typically in this state where fping has failed, there is no throughput.
      # ... this is the typical crashed condition for quectel and sierra modems.
      echo "'ifstatus ${WWANNAME}' is responding that connection is 'up', although since this routine (modem_reset) is running, then the cell modem is probably 'light'-crashed internally."
      echo "Logging the previous cell link uptime duration, in seconds:" | tee -a "$log_file"
      ifstatus ${WWANNAME} | grep "uptime" | tee -a "$log_file" # Display what the previous uptime is (when uptime > 0, it implies up:true).
      # Only if uptime is greater than 60 seconds (it should normally be at this point!), then calculate and display cell link uptime in days, hours and minutes:
      if [ "$(ifstatus "${WWANNAME}" | grep -o '"uptime": [0-9]*' | cut -d' ' -f2)" -gt 60 ] ; then # if uptime is greater than 60 seconds (should be)..
        # Display cell link uptime as Days, Hours and Minutes:
        ifstatus ${WWANNAME} | grep -o '"uptime": [0-9]*' | cut -d' ' -f2 | awk '{s=$1; d=int(s/86400); h=int(s%86400/3600); m=int(s%3600/60); print " Up for: "d" days, "h" hours, "m" minutes."}' | tee -a "$log_file"
      fi
      echo "Note! An 'up: true' response does not actually mean the modem interface is fully working."
      echo "Typically, they crash-out internally in the 'online' / 'true' state with no throughput."
      echo "but critically, there is no actual connectivity."
      sleep $WAITFOR
      echo "${MODEMTYPE} restart procedure:" | tee -a "$log_file"
      echo "Attempting to send AT!RESET (reboot of the cell modem only):" | tee -a "$log_file"
      local resultatreset
      stty -F /dev/ttyUSB2 raw -echo
      # Sample: echo "AT" | socat - /dev/ttyUSB0 command:
      resultatreset=$(echo "AT!RESET" | socat - /dev/ttyUSB2,crnl)
      echo "$resultatreset" | tee -a "$log_file"
      echo "Wait 35 seconds to give the modem time to boot and reconnect." | tee -a "$log_file"
      sleep 35 # Mandatory pause minimum for the cell modem to reconnect to tower.
      echo "Re-querying status with 'AT!GSTATUS?'" | tee -a "$log_file"
      local resultgstatus # modem self-status to tower, provider, regardless of the 'ifup/ifstatus' openwrt interface status
      resultgstatus=$(echo "AT!GSTATUS?" | socat - /dev/ttyUSB2,crnl)
      echo "$resultgstatus" | tee -a "$log_file"
      echo "'ifup ${WWANNAME}' & pause afterwards, waiting for interface to come back up"
      ifup ${WWANNAME} | tee -a "$log_file"
      echo "Mandatory: Waiting 45 seconds for the openwrt IP interface data to come up..." | tee -a "$log_file"
      sleep 45 # wait for interface to come back up/reload.
      ifstatus ${WWANNAME} | grep "up" | tee -a "$log_file" # We need to show the 'up' status, and uptime. Should be about 10 seconds.
      # If the modem was successfully reset, it will show up=true, and an uptime of about 10 seconds:
      # we had success since the status of up is true.
      # we are not going to calculate the Days Hours Minutes again, because it would only have been a few seconds of uptime.
      return # Exits the modem restart function.
    # 'up:false' - Else, if the up status is initially 'false', then this could be a more abnormal condition, especially if the modem was working just-prior.
    # Could be the tower, or a corruption issue in the modem.
    elif ifstatus "${WWANNAME}" | grep -q '"up": false'; then # Display results and log:
      ifstatus ${WWANNAME} | grep "up" | tee -a "$log_file" # log that it's 'false' (says 'cell phone link is down')
      echo "** 'ifstatus ${WWANNAME}' is reporting the cell modem link is down (false)." | tee -a "$log_file"
      echo " If the tower is up (please verify), this condition usually means the unusual instance when the modem had a corruption of its internal registers." | tee -a "$log_file"
      echo " NVRELOAD set to ${NVRELOAD}."
      sleep $WAITFOR
      # do a basic ifup to see if the interface is simply 'down'. Usually this step is not necessary, but in edge cases, maybe it was manually brought down prior to running this script
      echo " Attempting to 'ifup' the modem interface - maybe it's simply down?" | tee -a "$log_file"
      echo " 'ifup ${WWANNAME}' & pause afterwards, while waiting for interface to come back up" | tee -a "$log_file"
      ifup ${WWANNAME} | tee -a "$log_file"
      echo " Mandatory: Waiting 45 seconds for the openwrt IP interface data to come up..." | tee -a "$log_file"
      sleep 45 # wait for interface to come back up/reload.
      # testing result, when started with a down+false condition:
      if ifstatus "${WWANNAME}" | grep -q '"up": true'; then # We need to check the 'up' status. Uptime should be about 10 seconds.
        ifstatus "${WWANNAME}" | grep "up" | tee -a "$log_file" # Log the up status and uptime, presumes a successful reconnect (if not, the followup fping will reboot the router when detecting the failure).
        echo "'ifup ${WWANNAME}' success. Modem responds as up. Exiting modem_reset function, to further test with fping.."
        # If the modem was successfully reset, it will show up=true, and an uptime of about 10 seconds:
        # we had success since the status of up is true.
        # we are not going to calculate the Days Hours Minutes again, because it would only have been a few seconds of uptime.
        return # Exits the modem restart function.
      elif ifstatus "${WWANNAME}" | grep -q '"up": false'; then
        # extended modem reset functions:
        echo " ** Even after 'ifup ${WWANNAME}', the interface is still reporting down (false)."
        if [ "$NVRELOAD" -eq 1 ] ; then
          echo " NVRELOAD selected. A more invasive reset of the modem. Maybe the nvram settings got screwed-up (this happens, occasionally)" | tee -a "$log_file"
          echo " Enter management functions on sierra wireless:" | tee -a "$log_file"
          sleep 1 && echo " Putting modem in a mode to accept higher-level management commands"
          local responsefromatcmd
          stty -F /dev/ttyUSB2 raw -echo
          responsefromatcmd=$(echo 'AT!ENTERCND="A710"' | socat - /dev/ttyUSB2,crnl)
          echo " Response: ${responsefromatcmd}" | tee -a "$log_file"
          echo " Tell the modem to reload nvram settings, saved manually previously with !NVBACKUP=3" | tee -a "$log_file"
          stty -F /dev/ttyUSB2 raw -echo
          responsefromatcmd=$(echo "AT!RMARESET=3" | socat - /dev/ttyUSB2,crnl)
          echo " Response: ${responsefromatcmd}" | tee -a "$log_file"
          sleep 7 && echo "modem is reloading NVRAM variables..."
          echo " Tell the modem to reboot. This is required after the above reload command." | tee -a "$log_file"
          stty -F /dev/ttyUSB2 raw -echo
          responsefromatcmd=$(echo "AT!RESET" | socat - /dev/ttyUSB2,crnl)
          echo " Response: ${responsefromatcmd}" | tee -a "$log_file"
          sleep 20 && echo "modem is rebooting..." | tee -a "$log_file"
          ifup ${WWANNAME} | tee -a "$log_file" # restarting the interface
          echo " Waiting 45 seconds to give the modem time to reconnect." | tee -a "$log_file"
          sleep 45 # Mandatory pause minimum for the cell modem to reconnect to tower.
          echo " Re-querying status with 'ifstatus ${WWANNAME}'" | tee -a "$log_file"
          echo " This time, the modem, if successfully reset, will show up is true, and an uptime of about 10 seconds:" | tee -a "$log_file"
          if ifstatus "${WWANNAME}" | grep -q '"up": true'; then
            ifstatus "${WWANNAME}" | grep "up" | tee -a "$log_file" # We only need to see the 'up' status, and uptime.
            return # Exits the modem restart function.
          elif ifstatus "${WWANNAME}" | grep -q '"up": false'; then # something is REALLY REALLY wrong. The modem should be up by now. Display results and log:
            ifstatus "${WWANNAME}" | grep "up" | tee -a "$log_file"
            echo "** Exiting the entire script from within the modem_reset function. Nothing got the modem working again. Maybe the tower is down?" | tee -a "$log_file"
            exit # Exit the entire script. Using EXIT at this point, is based on local critical decision-making: Is the tower really up? Is the bill paid? Or does the cell modem have an unknown issue.
          fi # exit extended reset testing
        elif [ "$NVRELOAD" -eq 0 ] ; then
          echo "Modem abnormally stuck in offline mode! NVRELOAD is disabled in the script. Investigate it manually. Exiting script directly, to avoid unnecessary reboot which probably will not fix the issue.." | tee -a "$log_file"
          echo "Maybe the tower is down? But you chose not to have the script try to fix the modem corruption issue. You have to decide at this point what you want to do." | tee -a "$log_file"
          echo "Exiting the wan-watchdog script..." | tee -a "$log_file"
          exit # Exit the entire script.
        fi
      fi
    fi
  else # v20260219.01 Fall through, usually due to incorrect MODEMTYPE declaration
    echo | tee -a "$log_file"
    echo "On $(date): ALERT!:" | tee -a "$log_file"
    echo "Fall-through condition reached:" | tee -a "$log_file"
    echo "FIXME! Some other error or condition unanticipated in the main condition checking in the modem_reset function section of the script" | tee -a "$log_file"
    echo "Usually MODEMTYPE is not set correctly in the script" | tee -a "$log_file"
    echo "Current values for:" | tee -a "$log_file"
    echo "previouswanstatus: ${previouswanstatus}" | tee -a "$log_file"
    echo "MODEMTYPE: ${MODEMTYPE}" | tee -a "$log_file"
    echo "NVRELOAD: ${NVRELOAD}" | tee -a "$log_file"
    echo "Current NET_IF_NAME: ${NET_IF_NAME}" | tee -a "$log_file"
    echo "Current NET6_IF_NAME: ${NET6_IF_NAME}" | tee -a "$log_file"
    echo "WWANNAME: ${WWANNAME}" | tee -a "$log_file"
    echo "WWANDEVICE: ${WWANDEVICE}" | tee -a "$log_file"
    echo "CONNECTIONSTATUS: ${CONNECTIONSTATUS}" | tee -a "$log_file"
    echo "VPNCHANGED: ${VPNCHANGED}" | tee -a "$log_file"
    echo "DOYOULIKECAKE: is ${DOYOULIKECAKE}" | tee -a "$log_file"
    echo "FEELINGSQMISH: is ${FEELINGSQMISH}" | tee -a "$log_file"
    echo "Doing a network_flush_cache and re-reading NETx_IF_NAME..."
    network_flush_cache
    network_find_wan NET_IF_NAME
    network_find_wan6 NET6_IF_NAME
    echo "Current NET_IF_NAME: ${NET_IF_NAME}" | tee -a "$log_file"
    echo "Current NET6_IF_NAME: ${NET6_IF_NAME}" | tee -a "$log_file"
    echo "Exiting script..." | tee -a "$log_file"
    exit # Immediately exit script
  fi
}
########################### start main script #######################
echo # new blank line
echo "Logging to: ${log_file}"
if [ "$TESTMODE" = 0 ]; then
  service sysntpd restart # restart the time-server on the router, only after BOOTWAIT.
  sleep 5 # give 5 seconds for sysntpd to retrieve and reset the clock to current time.
fi
echo | tee -a "$log_file" # Blank line to delineate new logging sequence (on restart, startup of router)
echo "Wan-Watchdog script started / Router booted on $(date)" | tee -a "$log_file"
if [ "$DOYOULIKECAKE" = "1" ] && [ "$VPN" = "1" ]; then
  # The next line addresses the automatically-applied fq_codel on the WWAN interface. Since we will be running cake-autorate inside the VPN usually, this should be removed/set to noqueue. This is a one-time check after every script start / boot-up.
  echo "Cake-autorate selected ON. VPN turned ON. Therefore, removing default fq_codel from basic ${WWANDEVICE}/${WWANNAME} interface.." | tee -a "$log_file"
  tc qdisc replace dev ${WWANDEVICE} root noqueue | tee -a "$log_file"
fi
echo # new blank line
echo "Entering wan & vpn testing:"
network_flush_cache
network_find_wan NET_IF_NAME
network_find_wan6 NET6_IF_NAME
# ATTENTION: Look at the following output during testing, to determine if you correctly set WWANDEVICE & WWANNAME, assuming you are connected in a basic way to your default internet connection when you test-run this
echo "MODEMTYPE (for modem_reset / wan reset) is set to ${MODEMTYPE}"
echo "WWANDEVICE device is set to ${WWANDEVICE}"
echo "WWANNAME interface is set to ${WWANNAME}"
echo "MODEMUSBDEVNODE is set to ${MODEMUSBDEVNODE}"
echo "NET_IF_NAME reports as ${NET_IF_NAME}" # if NIN is null, then the WAN is not connected (usually during testing where WWANNAME has been manually stopped).
echo "NET6_IF_NAME reports as ${NET6_IF_NAME}" # same as above
# Ordinarily either NIN or N6IN must match WWANNAME, if the router is connected to the internet when you test-run this script, and you have no VPN profiles active.
# If neither NET or NET6 IF match WWANNAME, stop this script and correct the value for $WWANNAME stored at the top.
# You could set a name of 'mbim' when creating the device, and openwrt will make 'mbim_4' and 'mbim_6' virtual adapters.
# Alternatively, you make 'qmi' and openwrt uses that for ipv6, and makes a virtual adapter of 'qmi_4' for ipv4.
# This script attempts to determine all possibilities, and writes an error to the log with info if nothing matches, and exits.
echo "Sleeping for ${WAITFOR} seconds..."
sleep $WAITFOR
# Connection Status: 0 Not connected-connection attempt failure, exit and reboot.
# Connection Status: 1 Unknown State, try to take corrective action
# Connection Status: 2 WWAN interface connected, fping success
# Connection Status: 3 WWAN & VPN connected, fping success
NEEDREBOOT="0" # Initialize
VPNCHANGED="0" # Initialize
CONNECTIONSTATUS="1" # Initialize
ACTIVEVPN="none" # Initialize the -active and tested- VPN profile, to 'none'.
sleep $WAITFOR
while [ $CONNECTIONSTATUS -ge 1 ]; do # While CONNECTION STATUS is 1 or greater, do the following loop:
  # if1
  if check_fping ; then
    echo
    echo "Successful fping response."
    echo
    # if2
    if [ "$VPN" = "1" ] ; then # If VPN selector is turned ON, try/test VPN connections 1 through 4.
      echo "VPN Option selector turned-on '1'. Getting current interface connection values..."
      network_flush_cache
      network_find_wan NET_IF_NAME
      network_find_wan6 NET6_IF_NAME # IPv6 lookup uses network_find_wan6, as everywhere else in this script
      echo "Current values for:"
      echo "NET_IF_NAME: ${NET_IF_NAME}"
      echo "NET6_IF_NAME: ${NET6_IF_NAME}"
      echo "WWANNAME interface: ${WWANNAME}"
      echo "WWANDEVICE device: ${WWANDEVICE}"
      echo "Checking if any VPN connections are currently up..."
      sleep $WAITFOR
      # if3
      if [ "$NET_IF_NAME" = "$VPN1" ] || [ "$NET_IF_NAME" = "$VPN2" ] || [ "$NET_IF_NAME" = "$VPN3" ] || [ "$NET_IF_NAME" = "$VPN4" ]; then
        echo "ACTIVEVPN is ${NET_IF_NAME}, is reported as up, and fping is successful!"
        echo "Basic WWAN Connection is up, and 1 out of 4 possible VPN profiles/interfaces is connected."
        echo "Setting value of ACTIVEVPN to ${NET_IF_NAME}.."
        ACTIVEVPN=$NET_IF_NAME
        echo "Setting CONNECTIONSTATUS to 3.."
        CONNECTIONSTATUS=3
        sleep $WAITFOR
        # if4
        if [ -f /tmp/currentvpn.txt ]; then
          echo "/tmp/currentvpn.txt exists; reading value into PREVIOUSVPN.."
          read -r PREVIOUSVPN < /tmp/currentvpn.txt
        else
          echo "/tmp/currentvpn.txt does not exist. Initializing PREVIOUSVPN to 'none'."
          PREVIOUSVPN="none"
        fi # fi4
        echo "Comparing ACTIVEVPN to PREVIOUSVPN:"
        echo "PREVIOUSVPN is: ${PREVIOUSVPN}.."
        # if5
        if [ "$ACTIVEVPN" = "$PREVIOUSVPN" ]; then # If ACTIVEVPN matches PREVIOUSVPN in this loop or run, then do nothing and continue
          echo "Current VPN is the same as the previously-connected VPN profile, continuing..."
          VPNCHANGED=0
          sleep $WAITFOR
        else # if the values do not match, then maybe the VPN has changed from previous, write the currentvpn to 'currentvpn.txt'
          echo "VPN has changed. Previous VPN was ${PREVIOUSVPN}. Current VPN is ${ACTIVEVPN}" | tee -a "$log_file"
          echo "Writing changes to currentvpn.txt..."
          VPNCHANGED=1
          echo "$ACTIVEVPN" > /tmp/currentvpn.txt # write the current vpn value to a file, e.g. to be used by the cake autorate script.
          sleep $WAITFOR
        fi # fi5 Finished sensing and reacting to a match or mismatch between ACTIVEVPN and PREVIOUSVPN
      # if3 elif
      # else if the WWAN interface is up, & VPN selector is turned-on, but no VPN is up yet.
      elif [ "${WWANNAME}" = "$NET_IF_NAME" ] || [ "${WWANNAME}" = "$NET6_IF_NAME" ] || [ "${WWANNAME}_4" = "$NET_IF_NAME" ] || [ "${WWANNAME}_6" = "$NET6_IF_NAME" ]; then
        echo "VPN not up yet; VPN selector is ON. WWAN network interface is up. Trying to get a VPN connection up..." | tee -a "$log_file"
        sleep $WAITFOR
        echo "Trying $VPN1..." | tee -a "$log_file"
        echo "ifup ${VPN1}" | tee -a "$log_file"
        ifup $VPN1 | tee -a "$log_file"
        sleep 7 # wait for it to connect
        network_flush_cache
        network_find_wan NET_IF_NAME
        if [ "$NET_IF_NAME" = "$VPN1" ] && check_fping ; then
          ACTIVEVPN=$VPN1
          CONNECTIONSTATUS=3
          echo "${VPN1} is up, and fping success!" | tee -a "$log_file"
        else
          echo "Unable to connect to ${VPN1}... trying ${VPN2} profile..." | tee -a "$log_file"
          echo "ifdown ${VPN1}" | tee -a "$log_file"
          ifdown $VPN1 | tee -a "$log_file"
          sleep $WAITFOR
          echo "ifup ${VPN2}" | tee -a "$log_file"
          ifup $VPN2 | tee -a "$log_file"
          sleep 7
          network_flush_cache
          network_find_wan NET_IF_NAME
          if [ "$NET_IF_NAME" = "$VPN2" ] && check_fping ; then
            ACTIVEVPN=$VPN2
            CONNECTIONSTATUS=3
            echo "${VPN2} is up and fping success!" | tee -a "$log_file"
          else
            echo "Unable to connect to ${VPN2}, trying ${VPN3} profile..." | tee -a "$log_file"
            echo "ifdown ${VPN2}" | tee -a "$log_file"
            ifdown $VPN2 | tee -a "$log_file"
            sleep $WAITFOR
            echo "ifup ${VPN3}" | tee -a "$log_file"
            ifup $VPN3 | tee -a "$log_file"
            sleep 7
            network_flush_cache
            network_find_wan NET_IF_NAME
            if [ "$NET_IF_NAME" = "$VPN3" ] && check_fping ; then
              ACTIVEVPN=$VPN3
              CONNECTIONSTATUS=3
              echo "${VPN3} is up, and fping success!" | tee -a "$log_file"
            else
              echo "Unable to connect to ${VPN3}, trying ${VPN4} profile..." | tee -a "$log_file"
              echo "ifdown ${VPN3}" | tee -a "$log_file"
              ifdown $VPN3 | tee -a "$log_file"
              sleep $WAITFOR
              echo "ifup ${VPN4}" | tee -a "$log_file"
              ifup $VPN4 | tee -a "$log_file"
              sleep 7
              network_flush_cache
              network_find_wan NET_IF_NAME
              if [ "$NET_IF_NAME" = "$VPN4" ] && check_fping ; then
                ACTIVEVPN=$VPN4
                CONNECTIONSTATUS=3
                echo "${VPN4} is up and fping success!" | tee -a "$log_file"
              else
                echo "Unable to start VPN4, or connect to any previous VPN1-3 profiles on $(date)" | tee -a "$log_file"
              fi # Finished VPN4 connect/all VPNx attempts
            fi # Finished VPN3 connect attempt
          fi # Finished VPN2 connect attempt
        fi # Finished VPN1 connect attempt
      else # something is wrong with the basic configuration of the script / interfaces mismatch
        echo "Whoops. Something went wrong!" | tee -a "$log_file"
        echo "Most likely this script's VPN/WWAN variables are not matched to your router's interface names." | tee -a "$log_file"
        echo "Check that your script VPN/WWAN values match the actual interfaces. Exiting..." | tee -a "$log_file"
        break
      fi # Finished check if WWAN or any VPN connection is connected, and establishing required VPN connection if not active already.
    # Else if VPN Selector is set to 0 / no VPN, then
    elif [ $VPN -eq 0 ] ; then
      echo "VPN=${VPN}. VPN Selector is turned-off."
echo "Checking to see if the current NET_IF_NAME or NET6_IF_NAME matches WWANNAME (or a _4 _6 variation):" network_flush_cache network_find_wan NET_IF_NAME network_find_wan6 NET6_IF_NAME echo "Current values:" echo "NET_IF_NAME: ${NET_IF_NAME}" echo "NET6_IF_NAME: ${NET6_IF_NAME}" echo "WWANNAME interface: ${WWANNAME}" echo "WWANDEVICE device: ${WWANDEVICE}" if [ "${WWANNAME}" = "$NET_IF_NAME" ] || \ [ "${WWANNAME}" = "$NET6_IF_NAME" ] || \ [ "${WWANNAME}_4" = "$NET_IF_NAME" ] || \ [ "${WWANNAME}_6" = "$NET6_IF_NAME" ]; then echo "They do and VPN is selected OFF. Do nothing here/continue." else echo "Neither the NET_IF_NAME (${NET_IF_NAME}) or NET6_IF_NAME (${NET6_IF_NAME}) match the WWANNAME (${WWANNAME})" | tee -a "$log_file" echo "VPN1-4 interfaces (ifdown), e.g. in case a VPN was started manually prior to this script run. VPN selector is OFF." | tee -a "$log_file" # VPN may have been manually started prior to current script invocation. ifdown $VPN1 ifdown $VPN2 ifdown $VPN3 ifdown $VPN4 VPNCHANGED=1 fi echo "No VPN active: Setting 'CONNECTIONSTATUS' = 2" CONNECTIONSTATUS=2 sleep $WAITFOR fi # finished for VPN 1/0 Selector check, establishing VPN connection, determining if active VPN has changed from prior VPN interface. else # fping failure - No connectivity: on initial or subsequent fping test, whether through VPN or without. These responses should always be logged. # This is directly after an fping testing failure, so no need for an additional blank line echo echo "On $(date): ALERT: fping failure - No WAN or VPN connectivity." | tee -a "$log_file" # Log when no connectivity. if [ $VPN -eq 1 ] ; then # If VPN is selected, and basic connectivity is failing, echo "VPN Selector is ON. WWAN/VPN is failing at initial fping test." | tee -a "$log_file" echo "This would ordinarily be the case, e.g. if you forcibly stopped your WWAN prior to running this script, to test." 
| tee -a "$log_file"
        elif [ $VPN -eq 0 ] ; then
            echo "VPN Selector is turned-off in this script and there is no WAN connectivity." | tee -a "$log_file"
        fi
        echo "Script will now attempt to (re)connect cell link, according to the designated ${MODEMTYPE} procedure." | tee -a "$log_file"
        CONNECTIONSTATUS="1" # reset to unknown
        echo "Shutting down VPN interfaces, in case they have been manually activated..."
        ifdown $VPN1
        ifdown $VPN2
        ifdown $VPN3
        ifdown $VPN4
        ACTIVEVPN="none"
        sleep $WAITFOR
        modem_reset # Modem reset function as defined above.
        if check_fping ; then
            echo "fping follow-up after modem_reset SUCCESS: restart procedure worked, no need for reboot." | tee -a "$log_file"
            echo | tee -a "$log_file"
            sleep $WAITFOR
            CONNECTIONSTATUS="2"
            NEEDREBOOT="0"
        else
            echo "fping follow-up after modem_reset FAILURE: Reboot required." | tee -a "$log_file"
            CONNECTIONSTATUS="0"
            NEEDREBOOT="1"
        fi # end of follow-up probe to see if WWAN interface restart worked...
    fi # End of initial fping test
    ######## sqm & cake-autorate portion of script ##########
    # Now let's start or restart, as needed, cake-autorate to run:
    # probably should check cake-autorate status and correct if necessary, here
    # Check if cake-autorate is running, and what it is currently using for its 'ul_if' value:
    echo
    echo
    echo "CAKE-AUTORATE check / DOYOULIKECAKE & SQM portion of wan-watchdog.sh:"
    echo
    echo "Detecting interface/SQM/Cake-Autorate status & preferences:"
    echo "CONNECTIONSTATUS is ${CONNECTIONSTATUS}."
    echo "Sending: 'service cake-autorate status'"
    echo "If 'running', cake-autorate is running."
    echo "If 'inactive', cake-autorate is not running."
    echo "If 'not found', cake-autorate is not installed."
    echo
    service cake-autorate status
    echo
    echo "Sending: 'service sqm status'"
    echo "If 'running', SQM is running."
    echo "If 'inactive', SQM is not running."
    echo "If 'active with no instances', SQM service is running."
    echo "If 'not found', SQM is not installed."
echo service sqm status echo echo "VPNCHANGED is ${VPNCHANGED}." echo "DOYOULIKECAKE is ${DOYOULIKECAKE}." sleep $WAITFOR # if the VPN is selected (1), enabled (on) and running (working) (therefore connectionstatus=3), VPN has not changed from prior, and cake-autorate/sqm are already running: # The most typical state when the VPN is selected, and script is running in a loop the second, third etc times. if [ "$CONNECTIONSTATUS" = "3" ] && [ "$(service cake-autorate status)" = "running" ] && [ "$(service sqm status)" = "active with no instances" ] && [ "$DOYOULIKECAKE" = "1" ] && [ "$VPNCHANGED" = "0" ]; then echo "Connection Status is 3: VPN ON & active. SQM is set ON and running, Cake-Autorate is set ON and running. VPN has NOT changed: Do nothing." # else if all of the above is true but the VPN has changed, and cake-autorate/sqm are already running, restart them: elif [ "$CONNECTIONSTATUS" = "3" ] && [ "$(service cake-autorate status)" = "running" ] && [ "$(service sqm status)" = "active with no instances" ] && [ "$DOYOULIKECAKE" = "1" ] && [ "$VPNCHANGED" = "1" ]; then echo "Connection Status is 3: VPN ON & Active. SQM is set to ON and running, Cake-Autorate is set ON and running. But VPN has changed: restart SQM/Cake-autorate" | tee -a "$log_file" echo "Stopping cake-autorate..." echo "service cake-autorate stop" service cake-autorate stop sleep $WAITFOR echo "(Re)starting SQM to verify it's running the correct profile on the correct interface:" echo "service sqm stop" service sqm stop sleep $WAITFOR echo "service sqm start" service sqm start sleep $WAITFOR # Config.primary.sh in /root/cake-autorate/ will pull the current VPN connection /tmp/currentvpn.txt, and operate on that. 
echo "service cake-autorate start" service cake-autorate start VPNCHANGED="0" sleep $WAITFOR # else if the VPN has not changed, the VPN is active, but cake-autorate is not yet running (typically due to set to 'disabled' (from autostart) in System, Startup), then start it: elif [ "$CONNECTIONSTATUS" = "3" ] && [[ "$(service cake-autorate status)" == "not running" || "$(service cake-autorate status)" == "inactive" ]] && [ "$DOYOULIKECAKE" = "1" ]; then echo "CONNECTIONSTATUS is 3: VPN is ON & active, Cake-autorate is NOT running, yet cake-autorate selector is ON. Starting Cake-Autorate..." if [ "$(service sqm status)" = "active with no instances" ]; then # Probing if SQM is running first, instead of just doing a 'service sqm restart', avoids the alarming but normal 'Command failed: Not found' error. echo "SQM is running. Stopping then starting to make sure its on the active interface..." echo "service sqm stop" service sqm stop sleep $WAITFOR fi echo "service sqm start" service sqm start sleep $WAITFOR echo "service cake-autorate start" # Config.primary.sh in /root/cake-autorate/ will pull the current VPN connection /tmp/currentvpn.txt, and operate on that. service cake-autorate start # Initialize VPNCHANGED to 0, because regardless of whether it changed, cake-autorate was not running # This can also be the situation during testing sometimes, that VPNCHANGED=1 due to subsequent manual runs of the script. VPNCHANGED="0" sleep $WAITFOR # atypical: if script was halted/killed manually, a VPN was on, and VPN option was changed to '0' in the script, then script was re-executed: elif [ "$CONNECTIONSTATUS" = "2" ] && [ "$(service cake-autorate status)" = "running" ] && [ "$(service sqm status)" = "active with no instances" ] && [ "$DOYOULIKECAKE" = "1" ] && [ "$VPN" = "0" ] && [ "$VPNCHANGED" = "1" ]; then echo "VPN active, but VPN selected OFF in script. 
Restarting SQM/Cake-Autorate to set to ${WWANDEVICE} / ${WWANNAME}:" echo "Connection Status is 2, SQM is running, Cake is running, Cake selector is enabled, but this script was probably killed, then re-run while there was a (previous) VPN running:" echo "Stopping cake-autorate, restart sqm, restart cake-autorate." echo "service cake-autorate stop" service cake-autorate stop echo "Restarting SQM to get SQM running on the current active WWAN interface..." echo "service sqm restart" service sqm restart sleep $WAITFOR echo "Writing WWANDEVICE to /tmp/currentvpn.txt for Cake-Autorate config.primary.sh to pick-up active interface" echo "${WWANDEVICE}" > /tmp/currentvpn.txt # write the current WWAN device (not interface!) to a file, to be used by the cake autorate script. sleep $WAITFOR echo "service cake-autorate start" # Config.primary.sh in /root/cake-autorate/ will pull the current VPN connection /tmp/currentvpn.txt, and operate on that. service cake-autorate start # Initialize VPNCHANGED to 0 VPNCHANGED="0" sleep $WAITFOR # typical when script running in second, third etc loop and VPN selected OFF: # else if VPN is selected OFF (0), Connection = 2 (good connection, no vpn), and cake selector is ON, and running, and SQM is running, do nothing: elif [ "$CONNECTIONSTATUS" = "2" ] && [ "$(service cake-autorate status)" = "running" ] && [ "$(service sqm status)" = "active with no instances" ] && [ "$DOYOULIKECAKE" = "1" ] && [ "$VPN" = "0" ] && [ "$VPNCHANGED" = "0" ]; then echo "Connection Status is 2: VPN is not active, VPN selected OFF, DOYOULIKECAKE is ON, CAKE and SQM running.. No change in interfaces: Do Nothing." 
# typical on firstboot with VPN off: # else if VPN is selected OFF (0), connection = 2 (good connection), and cake selector is ON, but not yet running: elif [ "$CONNECTIONSTATUS" = "2" ] && [[ "$(service cake-autorate status)" == "not running" || "$(service cake-autorate status)" == "inactive" ]] && [ "$DOYOULIKECAKE" = "1" ] && [ "$VPN" = "0" ]; then echo "Connection status is 2: No VPN Active, VPN is selected OFF, Cake is NOT running, and cake-autorate selector is ON:" echo "(Re)starting SQM to verify that SQM is running on the current active WWAN interface..." if [ "$(service sqm status)" = "active with no instances" ] ; then # Probing if SQM is running first, instead of just doing a 'service sqm restart', avoids the alarming but normal 'Command failed: Not found' error. echo "SQM is running. Stopping..." echo "service sqm stop" service sqm stop sleep $WAITFOR fi echo "service sqm start" service sqm start sleep $WAITFOR echo "service cake-autorate start" echo "$WWANDEVICE" > /tmp/currentvpn.txt # write the current WWAN device (not interface!) to a file, to be used by the cake autorate script. sleep $WAITFOR # Config.primary.sh in /root/cake-autorate/ will pull the current VPN connection /tmp/currentvpn.txt, and operate on that. service cake-autorate start sleep $WAITFOR # typical when script has been stopped, CAKE selector has been turned off, and script is re-run # else if cake-autorate is running, but cake selector is set to OFF, then shut down cake-autorate: elif [[ "$CONNECTIONSTATUS" == "2" || "$CONNECTIONSTATUS" == "3" ]] && [ "$(service cake-autorate status)" = "running" ] && [ "$DOYOULIKECAKE" = "0" ]; then echo "Connection Status is ${CONNECTIONSTATUS}, Cake-autorate is running, but is selected OFF. 
Stopping cake-autorate:" echo "service cake-autorate stop" service cake-autorate stop # This can be the case during manually running the script, or if you decided not to use cake-autorate for now, but forgot to disable it from startup # or it had been started previously but you don't want it running rn. sleep $WAITFOR # And if SQM is running, but FEELINGSQMISH SQM selector is also set to OFF, then also shut down SQM: if [ "$(service sqm status)" = "active with no instances" ] && [ "$FEELINGSQMISH" = "0" ]; then echo "SQM is also turned-off, but SQM is running. Stopping SQM:" echo "Stopping SQM..." echo "service sqm stop" service sqm stop # This can be the case during manually running the script, or if you decided not to use SQM for now, but forgot to disable it from startup # or it had been started previously but you don't want it running rn. sleep $WAITFOR elif [ "$(service sqm status)" = "active with no instances" ] && [ "$FEELINGSQMISH" = "1" ]; then echo "SQM is selected ON, and is running. Do nothing" sleep $WAITFOR fi # else if Connection is 2 or 3 - good, w or wo VPN, Cake is selected OFF and is not running, do nothing elif [[ "$CONNECTIONSTATUS" = "2" || "$CONNECTIONSTATUS" == "3" ]] && [[ "$(service cake-autorate status)" == "not running" || "$(service cake-autorate status)" == "inactive" ]] && [ "$DOYOULIKECAKE" = "0" ] ; then echo "CONNECTIONSTATUS is ${CONNECTIONSTATUS}. DOYOULIKECAKE is selected OFF and Cake-Autorate is not running. Do nothing." sleep $WAITFOR # and test SQM within that: if [ "$(service sqm status)" = "inactive" ] && [ "$FEELINGSQMISH" = "0" ] ; then echo "and SQM selector is also turned OFF and SQM is installed, and not running. Do nothing." sleep $WAITFOR elif [ "$(service sqm status)" = "inactive" ] && [ "$FEELINGSQMISH" = "1" ] ; then echo "and SQM is installed, selected ON, but not running. Starting..." 
echo "service sqm start" service sqm start sleep $WAITFOR fi # normal condition when only SQM is installed on the router, is running, SQM is selected ON and running or not, Connection is 2 or 3 good w or wo VPN, do nothing generally but start SQM if not running: elif [[ "$CONNECTIONSTATUS" = "2" || "$CONNECTIONSTATUS" == "3" ]] && [[ "$(service sqm status)" = "active with no instances" || "$(service sqm status)" = "inactive" ]] && [ "$FEELINGSQMISH" = "1" ]; then echo "CONNECTIONSTATUS is ${CONNECTIONSTATUS}. FEELINGSQMISH is ON. Evaluate if SQM is running..." sleep $WAITFOR # and test SQM within that: if [ "$(service sqm status)" = "active with no instances" ] && [ "$FEELINGSQMISH" = "1" ] ; then echo "SQM is running. Do nothing." sleep $WAITFOR elif [ "$(service sqm status)" = "inactive" ] && [ "$FEELINGSQMISH" = "1" ] ; then echo "SQM is not running, but installed. Starting SQM. Starting..." echo "service sqm start" service sqm start sleep $WAITFOR fi # normal conditions when either SQM or Cake-Autorate are not installed at all on the router: # if Connection status is 2 or 3 - good w or wo VPN, and Cake & SQM is selected OFF and both are not installed, continue.. elif ([ "$CONNECTIONSTATUS" = "2" ] || [ "$CONNECTIONSTATUS" = "3" ]) && \ [ ! -f "/etc/init.d/cake-autorate" ] && \ [ "$DOYOULIKECAKE" = "0" ] && \ [ ! -f "/etc/init.d/sqm" ] && \ [ "$FEELINGSQMISH" = "0" ]; then echo "SQM and Cake-Autorate: not enabled in script & not installed." echo "Connectionstatus is 2/3 (good)" echo "Do nothing - continue running" # else if Connection is 2 / good w or wo VPN, SQM is selected ON yet not installed, log it & exit script. elif ([ "$CONNECTIONSTATUS" = "2" ] || [ "$CONNECTIONSTATUS" = "3" ]) && \ [ ! -f "/etc/init.d/sqm" ] && \ [ "$FEELINGSQMISH" = "1" ]; then echo "SQM is not installed, but enabled in script." | tee -a "$log_file" echo "Connectionstatus is 2/3 (good)" | tee -a "$log_file" echo "Quitting... FIXME: Install SQM or disable it in this script." 
| tee -a "$log_file" exit # Immediately exit script # else if Connection is 2 / good w or wo VPN, Cake-Autorate is selected ON and yet not installed, log it & exit script. elif ([ "$CONNECTIONSTATUS" = "2" ] || [ "$CONNECTIONSTATUS" = "3" ]) && \ [ ! -f "/etc/init.d/cake-autorate" ] && \ [ "$DOYOULIKECAKE" = "1" ]; then echo "Cake-Autorate not installed, but enabled in script." | tee -a "$log_file" echo "Connectionstatus is 2/3 (good)" | tee -a "$log_file" echo "Quitting... FIXME: Install Cake-Autorate or disable it in this script" | tee -a "$log_file" exit # Immediately exit script # else if we need a reboot because there was an fping failure, & connection status will be '0'. We didn't use to catch this, but # now we do (20260219.xx updates added a 'fall-through' condition, below). So we have to account for it. # fping previously failed (no throughput) after a modem_reset, reboot required. elif [ "$CONNECTIONSTATUS" = "0" ] && [ "$NEEDREBOOT" = "1" ] ; then echo "Skipping CAKE-SQM portion of script, fping failed after modem_reset: reboot required" | tee -a "$log_file" else # 'fall through' else condition. No previous conditions met: log everything and quit the script. echo "On $(date): ALERT!: SQM-CAKE section fall-through" | tee -a "$log_file" echo "FIXME! 
Some other error or condition unanticipated, in the CAKE SQM section of the script" | tee -a "$log_file"
        echo "Current values for:" | tee -a "$log_file"
        echo "NET_IF_NAME: ${NET_IF_NAME}" | tee -a "$log_file"
        echo "NET6_IF_NAME: ${NET6_IF_NAME}" | tee -a "$log_file"
        echo "MODEMTYPE: ${MODEMTYPE}" | tee -a "$log_file"
        echo "WWANNAME interface: ${WWANNAME}" | tee -a "$log_file"
        echo "WWANDEVICE device: ${WWANDEVICE}" | tee -a "$log_file"
        echo "CONNECTIONSTATUS: ${CONNECTIONSTATUS}" | tee -a "$log_file"
        echo "VPNCHANGED: ${VPNCHANGED}" | tee -a "$log_file"
        echo "DOYOULIKECAKE: ${DOYOULIKECAKE}" | tee -a "$log_file"
        echo "FEELINGSQMISH: ${FEELINGSQMISH}" | tee -a "$log_file"
        echo "NEEDREBOOT: ${NEEDREBOOT}" | tee -a "$log_file"
        echo "Exiting script..." | tee -a "$log_file"
        exit # Immediately exit script
    fi # Finished starting Cake-autorate for current connection, if selected, and SQM. Restarting services or terminating as necessary.
    echo
    # Pause for LOOPWAIT then re-test connection, or if in testing-mode, go directly to exit:
    if [ "$TESTMODE" = 0 ] ; then # now check to see if you should loop or break
        echo "Pausing ${LOOPWAIT} seconds and testing again..."
        sleep $LOOPWAIT
    elif [ "$TESTMODE" = 1 ] ; then
        echo "Would loop at this point, but it is in Testing mode. Exiting.." | tee -a "$log_file"
        echo "Test run ended: $(date)" | tee -a "$log_file"
        exit
    fi
    echo
done # When CONNECTION STATUS no longer 1 or greater, or break
if [ "$TESTMODE" = 0 ] && [ "$NEEDREBOOT" = 1 ]; then
    echo "Preparing to reboot router:" | tee -a "$log_file"
    echo "Shutting down CAKE..." | tee -a "$log_file"
    service cake-autorate stop
    echo "Shutting down SQM..." | tee -a "$log_file"
    service sqm stop
    echo "Rebooting router in 5 seconds..." | tee -a "$log_file"
    sleep 5
    reboot
elif [ "$TESTMODE" = 1 ] && [ "$NEEDREBOOT" = 1 ]; then
    echo "Failed WWAN connectivity check. Reboot required, but script in testing mode... no auto-reboot. Exit."
| tee -a "$log_file"
fi
}
########################### start main script #######################
echo # new blank line
echo "Logging to: ${log_file}"
if [ "$TESTMODE" = 0 ]; then
    service sysntpd restart # restart the time-server on the router, only after BOOTWAIT.
    sleep 5 # give 5 seconds for sysntpd to retrieve and reset the clock to current time.
fi
echo | tee -a "$log_file" # Blank line to delineate new logging sequence (on restart, startup of router)
echo "Wan-Watchdog script started / Router booted on $(date)" | tee -a "$log_file"
if [ "$DOYOULIKECAKE" = "1" ] && [ "$VPN" = "1" ]; then
    # The next line addresses the automatically-applied fq_codel on the WWAN interface. Since we will be running cake-autorate inside the VPN usually, this should be removed/set to noqueue. This is a one-time check after every script start / boot-up.
    echo "Cake-autorate selected ON. VPN turned ON. Therefore, removing default fq_codel from basic ${WWANDEVICE}/${WWANNAME} interface.." | tee -a "$log_file"
    tc qdisc replace dev "${WWANDEVICE}" root noqueue | tee -a "$log_file"
fi
echo # new blank line
echo "Entering wan & vpn testing:"
network_flush_cache
network_find_wan NET_IF_NAME
network_find_wan6 NET6_IF_NAME
# ATTENTION: Look at the following output during testing, to determine if you correctly set WWANDEVICE & WWANNAME, assuming you are connected in a basic way to your default internet connection when you test-run this
echo "MODEMTYPE (for modem_reset / wan reset) is set to ${MODEMTYPE}"
echo "WWANDEVICE device is set to ${WWANDEVICE}"
echo "WWANNAME interface is set to ${WWANNAME}"
echo "MODEMUSBDEVNODE is set to ${MODEMUSBDEVNODE}"
echo "NET_IF_NAME reports as ${NET_IF_NAME}" # if NIN is null, then the WAN is not connected (usually during testing where WWANNAME has been manually stopped).
echo "NET6_IF_NAME reports as ${NET6_IF_NAME}" # same as above
# Ordinarily either NIN or N6IN must match WWANNAME, if the router is connected to the internet when you test-run this script, and you have no VPN profiles active.
# If neither NET or NET6 IF match WWANNAME, stop this script and correct the value for $WWANNAME stored at the top.
# You could set a name of 'mbim' when creating the device, and openwrt will make 'mbim_4' and 'mbim_6' virtual adapters.
# Alternatively, you make 'qmi' and openwrt uses that for ipv6, and makes a virtual adapter of 'qmi_4' for ipv4.
# This script attempts to determine all possibilities, and writes an error to the log with info if nothing matches, and exits.
echo "Sleeping for ${WAITFOR} seconds..."
sleep $WAITFOR
# Connection Status: 0 Not connected-connection attempt failure, exit and reboot.
# Connection Status: 1 Unknown State, try to take corrective action
# Connection Status: 2 WWAN interface connected, fping success
# Connection Status: 3 WWAN & VPN connected, fping success
NEEDREBOOT="0" # Initialize
VPNCHANGED="0" # Initialize
CONNECTIONSTATUS="1" # Initialize
ACTIVEVPN="none" # Initialize the -active and tested- VPN profile, to 'none'.
sleep $WAITFOR
while [ "$CONNECTIONSTATUS" -ge 1 ]; do # While CONNECTION STATUS is 1 or greater, do the following loop:
    # if1
    if check_fping ; then
        echo
        echo "Successful fping response."
        echo
        # if2
        if [ "$VPN" = "1" ] ; then # If VPN selector is turned ON, try/test VPN connections 1 through 4.
            echo "VPN Option selector turned-on '1'. Getting current interface connection values..."
            network_flush_cache
            network_find_wan NET_IF_NAME
            network_find_wan6 NET6_IF_NAME
            echo "Current values for:"
            echo "NET_IF_NAME: ${NET_IF_NAME}"
            echo "NET6_IF_NAME: ${NET6_IF_NAME}"
            echo "WWANNAME interface: ${WWANNAME}"
            echo "WWANDEVICE device: ${WWANDEVICE}"
            echo "Checking if any VPN connections are currently up..."
            sleep $WAITFOR
            # if3
            if [ "$NET_IF_NAME" = "$VPN1" ] || [ "$NET_IF_NAME" = "$VPN2" ] || [ "$NET_IF_NAME" = "$VPN3" ] || [ "$NET_IF_NAME" = "$VPN4" ]; then
                echo "ACTIVEVPN is ${NET_IF_NAME}, is reported as up, and fping is successful!"
                echo "Basic WWAN Connection is up, and 1 out of 4 possible VPN profiles/interfaces is connected."
                echo "Setting value of ACTIVEVPN to ${NET_IF_NAME}.."
                ACTIVEVPN=$NET_IF_NAME
                echo "Setting CONNECTIONSTATUS to 3.."
                CONNECTIONSTATUS=3
                sleep $WAITFOR
                # if4
                if [ -f /tmp/currentvpn.txt ]; then
                    echo "/tmp/currentvpn.txt exists, reading value into PREVIOUSVPN.."
                    read -r PREVIOUSVPN < /tmp/currentvpn.txt
                else
                    echo "/tmp/currentvpn.txt does not exist. Initializing PREVIOUSVPN to 'none'."
                    PREVIOUSVPN="none"
                fi # fi4
                echo "Comparing ACTIVEVPN to PREVIOUSVPN:"
                echo "PREVIOUSVPN is: ${PREVIOUSVPN}.."
                # if5
                if [ "$ACTIVEVPN" = "$PREVIOUSVPN" ]; then # If ACTIVEVPN matches PREVIOUSVPN in this loop or run, then do nothing and continue
                    echo "Current VPN is the same as the previously-connected VPN profile, continuing..."
                    VPNCHANGED=0
                    sleep $WAITFOR
                else # if the values do not match, then maybe the VPN has changed from previous, write the currentvpn to 'currentvpn.txt'
                    echo "VPN has changed. Previous VPN was ${PREVIOUSVPN}. Current VPN is ${ACTIVEVPN}" | tee -a "$log_file"
                    echo "Writing changes to currentvpn.txt..."
                    VPNCHANGED=1
                    echo "$ACTIVEVPN" > /tmp/currentvpn.txt # write the current vpn value to a file, e.g. to be used by the cake autorate script.
                    sleep $WAITFOR
                fi # fi5 Finished sensing and reacting to a match or mismatch between ACTIVEVPN and PREVIOUSVPN
            # if3 elif
            # else if the WWAN interface is up, & VPN selector is turned-on, but no VPN is up yet.
            elif [ "${WWANNAME}" = "$NET_IF_NAME" ] || [ "${WWANNAME}" = "$NET6_IF_NAME" ] || [ "${WWANNAME}_4" = "$NET_IF_NAME" ] || [ "${WWANNAME}_6" = "$NET6_IF_NAME" ]; then
                echo "VPN not up yet; VPN selector is ON. WWAN network interface is up. Trying to get a VPN connection up..."
| tee -a "$log_file" sleep $WAITFOR echo "Trying $VPN1..." | tee -a "$log_file" echo "ifup ${VPN1}" | tee -a "$log_file" ifup $VPN1 | tee -a "$log_file" sleep 7 # wait for it to connect network_flush_cache network_find_wan NET_IF_NAME if [ "$NET_IF_NAME" = "$VPN1" ] && check_fping ; then ACTIVEVPN=$VPN1 CONNECTIONSTATUS=3 echo "${VPN1} is up, and fping success!" | tee -a "$log_file" else echo "Unable to connect to ${VPN1}... trying ${VPN2} profile..." | tee -a "$log_file" echo "ifdown ${VPN1}" | tee -a "$log_file" ifdown $VPN1 | tee -a "$log_file" sleep $WAITFOR echo "ifup ${VPN2}" | tee -a "$log_file" ifup $VPN2 | tee -a "$log_file" sleep 7 network_flush_cache network_find_wan NET_IF_NAME if [ "$NET_IF_NAME" = "$VPN2" ] && check_fping ; then ACTIVEVPN=$VPN2 CONNECTIONSTATUS=3 echo "${VPN2} is up and fping success!" | tee -a "$log_file" else echo "Unable to connect to ${VPN2}, trying ${VPN3} profile..." | tee -a "$log_file" echo "ifdown ${VPN2}" | tee -a "$log_file" ifdown $VPN2 | tee -a "$log_file" sleep $WAITFOR echo "ifup ${VPN3}" | tee -a "$log_file" ifup $VPN3 | tee -a "$log_file" sleep 7 network_flush_cache network_find_wan NET_IF_NAME if [ "$NET_IF_NAME" = "$VPN3" ] && check_fping ; then ACTIVEVPN=$VPN3 CONNECTIONSTATUS=3 echo "${VPN3} is up, and fping success!" | tee -a "$log_file" else echo "Unable to connect to ${VPN3}, trying ${VPN4} profile..." | tee -a "$log_file" echo "ifdown ${VPN3}" | tee -a "$log_file" ifdown $VPN3 | tee -a "$log_file" sleep $WAITFOR echo "ifup ${VPN4}" | tee -a "$log_file" ifup $VPN4 | tee -a "$log_file" sleep 7 network_flush_cache network_find_wan NET_IF_NAME if [ "$NET_IF_NAME" = "$VPN4" ] && check_fping ; then ACTIVEVPN=$VPN4 CONNECTIONSTATUS=3 echo "${VPN4} is up and fping success!" 
| tee -a "$log_file" else echo "Unable to start VPN4, or connect to any previous VPN1-3 profiles on $(date)" | tee -a "$log_file" fi # Finished VPN4 connect/all VPNx attempts fi # Finished VPN3 connect attempt fi # Finished VPN2 connect attempt fi # Finished VPN1 connect attempt else # something is wrong with the basic configuration of the script / interfaces mismatch echo "Whoops. Something went wrong!" | tee -a "$log_file" echo "Most likely this scripts VPN/WWAN variables are not matched to your router's interface names." | tee -a "$log_file" echo "Check that your script VPN/WWAN values match the actual interfaces. Exiting..." | tee -a "$log_file" break fi # Finished check if WWAN or any VPN connection is connected, and establishing required VPN connection if not active already. # Else if VPN Selector is set to 0 / no VPN, then elif [ $VPN -eq 0 ] ; then echo "VPN=${VPN}. VPN Selector is turned-off." echo "Checking to see if the current NET_IF_NAME or NET6_IF_NAME matches WWANNAME (or a _4 _6 variation):" network_flush_cache network_find_wan NET_IF_NAME network_find_wan6 NET6_IF_NAME echo "Current values:" echo "NET_IF_NAME: ${NET_IF_NAME}" echo "NET6_IF_NAME: ${NET6_IF_NAME}" echo "WWANNAME interface: ${WWANNAME}" echo "WWANDEVICE device: ${WWANDEVICE}" if [ "${WWANNAME}" = "$NET_IF_NAME" ] || \ [ "${WWANNAME}" = "$NET6_IF_NAME" ] || \ [ "${WWANNAME}_4" = "$NET_IF_NAME" ] || \ [ "${WWANNAME}_6" = "$NET6_IF_NAME" ]; then echo "They do and VPN is selected OFF. Do nothing here/continue." else echo "Neither the NET_IF_NAME (${NET_IF_NAME}) or NET6_IF_NAME (${NET6_IF_NAME}) match the WWANNAME (${WWANNAME})" | tee -a "$log_file" echo "VPN1-4 interfaces (ifdown), e.g. in case a VPN was started manually prior to this script run. VPN selector is OFF." | tee -a "$log_file" # VPN may have been manually started prior to current script invocation. 
                ifdown $VPN1
                ifdown $VPN2
                ifdown $VPN3
                ifdown $VPN4
                VPNCHANGED=1
            fi
            echo "No VPN active: Setting 'CONNECTIONSTATUS' = 2"
            CONNECTIONSTATUS=2
            sleep $WAITFOR
        fi # finished for VPN 1/0 Selector check, establishing VPN connection, determining if active VPN has changed from prior VPN interface.
    else # fping failure - No connectivity: on initial or subsequent fping test, whether through VPN or without. These responses should always be logged.
        # This is directly after an fping testing failure, so no need for an additional blank line echo
        echo "On $(date): ALERT: fping failure - No WAN or VPN connectivity." | tee -a "$log_file" # Log when no connectivity.
        if [ $VPN -eq 1 ] ; then # If VPN is selected, and basic connectivity is failing,
            echo "VPN Selector is ON. WWAN/VPN is failing at initial fping test." | tee -a "$log_file"
            echo "This would ordinarily be the case, e.g. if you forcibly stopped your WWAN prior to running this script, to test." | tee -a "$log_file"
        elif [ $VPN -eq 0 ] ; then
            echo "VPN Selector is turned-off in this script and there is no WAN connectivity." | tee -a "$log_file"
        fi
        echo "Script will now attempt to (re)connect cell link, according to the designated ${MODEMTYPE} procedure." | tee -a "$log_file"
        CONNECTIONSTATUS="1" # reset to unknown
        echo "Shutting down VPN interfaces, in case they have been manually activated..."
        ifdown $VPN1
        ifdown $VPN2
        ifdown $VPN3
        ifdown $VPN4
        ACTIVEVPN="none"
        sleep $WAITFOR
        modem_reset # Modem reset function as defined above.
        if check_fping ; then
            echo "fping follow-up after modem_reset SUCCESS: restart procedure worked, no need for reboot." | tee -a "$log_file"
            echo | tee -a "$log_file"
            sleep $WAITFOR
            CONNECTIONSTATUS="2"
            NEEDREBOOT="0"
        else
            echo "fping follow-up after modem_reset FAILURE: Reboot required." | tee -a "$log_file"
            CONNECTIONSTATUS="0"
            NEEDREBOOT="1"
        fi # end of follow-up probe to see if WWAN interface restart worked...
    fi # End of initial fping test
    ######## sqm & cake-autorate portion of script ##########
    # Now let's start or restart, as needed, cake-autorate to run:
    # probably should check cake-autorate status and correct if necessary, here
    # Check if cake-autorate is running, and what it is currently using for its 'ul_if' value:
    echo
    echo
    echo "CAKE-AUTORATE check / DOYOULIKECAKE & SQM portion of wan-watchdog.sh:"
    echo
    echo "Detecting interface/SQM/Cake-Autorate status & preferences:"
    echo "CONNECTIONSTATUS is ${CONNECTIONSTATUS}."
    echo "Sending: 'service cake-autorate status'"
    echo "If 'running', cake-autorate is running."
    echo "If 'inactive', cake-autorate is not running."
    echo "If 'not found', cake-autorate is not installed."
    echo
    service cake-autorate status
    echo
    echo "Sending: 'service sqm status'"
    echo "If 'running', SQM is running."
    echo "If 'inactive', SQM is not running."
    echo "If 'active with no instances', SQM service is running."
    echo "If 'not found', SQM is not installed."
    echo
    service sqm status
    echo
    echo "VPNCHANGED is ${VPNCHANGED}."
    echo "DOYOULIKECAKE is ${DOYOULIKECAKE}."
    sleep $WAITFOR
    # if the VPN is selected (1), enabled (on) and running (working) (therefore connectionstatus=3), VPN has not changed from prior, and cake-autorate/sqm are already running:
    # The most typical state when the VPN is selected, and script is running in a loop the second, third etc times.
    if [ "$CONNECTIONSTATUS" = "3" ] && [ "$(service cake-autorate status)" = "running" ] && [ "$(service sqm status)" = "active with no instances" ] && [ "$DOYOULIKECAKE" = "1" ] && [ "$VPNCHANGED" = "0" ]; then
        echo "Connection Status is 3: VPN ON & active. SQM is set ON and running, Cake-Autorate is set ON and running. VPN has NOT changed: Do nothing."
# else if all of the above is true but the VPN has changed, and cake-autorate/sqm are already running, restart them: elif [ "$CONNECTIONSTATUS" = "3" ] && [ "$(service cake-autorate status)" = "running" ] && [ "$(service sqm status)" = "active with no instances" ] && [ "$DOYOULIKECAKE" = "1" ] && [ "$VPNCHANGED" = "1" ]; then echo "Connection Status is 3: VPN ON & Active. SQM is set to ON and running, Cake-Autorate is set ON and running. But VPN has changed: restart SQM/Cake-autorate" | tee -a "$log_file" echo "Stopping cake-autorate..." echo "service cake-autorate stop" service cake-autorate stop sleep $WAITFOR echo "(Re)starting SQM to verify it's running the correct profile on the correct interface:" echo "service sqm stop" service sqm stop sleep $WAITFOR echo "service sqm start" service sqm start sleep $WAITFOR # Config.primary.sh in /root/cake-autorate/ will pull the current VPN connection /tmp/currentvpn.txt, and operate on that. echo "service cake-autorate start" service cake-autorate start VPNCHANGED="0" sleep $WAITFOR # else if the VPN has not changed, the VPN is active, but cake-autorate is not yet running (typically due to set to 'disabled' (from autostart) in System, Startup), then start it: elif [ "$CONNECTIONSTATUS" = "3" ] && [[ "$(service cake-autorate status)" == "not running" || "$(service cake-autorate status)" == "inactive" ]] && [ "$DOYOULIKECAKE" = "1" ]; then echo "CONNECTIONSTATUS is 3: VPN is ON & active, Cake-autorate is NOT running, yet cake-autorate selector is ON. Starting Cake-Autorate..." if [ "$(service sqm status)" = "active with no instances" ]; then # Probing if SQM is running first, instead of just doing a 'service sqm restart', avoids the alarming but normal 'Command failed: Not found' error. echo "SQM is running. Stopping then starting to make sure its on the active interface..." 
echo "service sqm stop" service sqm stop sleep $WAITFOR fi echo "service sqm start" service sqm start sleep $WAITFOR echo "service cake-autorate start" # Config.primary.sh in /root/cake-autorate/ will pull the current VPN connection /tmp/currentvpn.txt, and operate on that. service cake-autorate start # Initialize VPNCHANGED to 0, because regardless of whether it changed, cake-autorate was not running # This can also be the situation during testing sometimes, that VPNCHANGED=1 due to subsequent manual runs of the script. VPNCHANGED="0" sleep $WAITFOR # atypical: if script was halted/killed manually, a VPN was on, and VPN option was changed to '0' in the script, then script was re-executed: elif [ "$CONNECTIONSTATUS" = "2" ] && [ "$(service cake-autorate status)" = "running" ] && [ "$(service sqm status)" = "active with no instances" ] && [ "$DOYOULIKECAKE" = "1" ] && [ "$VPN" = "0" ] && [ "$VPNCHANGED" = "1" ]; then echo "VPN active, but VPN selected OFF in script. Restarting SQM/Cake-Autorate to set to ${WWANDEVICE} / ${WWANNAME}:" echo "Connection Status is 2, SQM is running, Cake is running, Cake selector is enabled, but this script was probably killed, then re-run while there was a (previous) VPN running:" echo "Stopping cake-autorate, restart sqm, restart cake-autorate." echo "service cake-autorate stop" service cake-autorate stop echo "Restarting SQM to get SQM running on the current active WWAN interface..." echo "service sqm restart" service sqm restart sleep $WAITFOR echo "Writing WWANDEVICE to /tmp/currentvpn.txt for Cake-Autorate config.primary.sh to pick-up active interface" echo "${WWANDEVICE}" > /tmp/currentvpn.txt # write the current WWAN device (not interface!) to a file, to be used by the cake autorate script. sleep $WAITFOR echo "service cake-autorate start" # Config.primary.sh in /root/cake-autorate/ will pull the current VPN connection /tmp/currentvpn.txt, and operate on that. 
    service cake-autorate start
    # Initialize VPNCHANGED to 0
    VPNCHANGED="0"
    sleep $WAITFOR
# typical when script running in second, third etc loop and VPN selected OFF:
# else if VPN is selected OFF (0), Connection = 2 (good connection, no vpn), and cake selector is ON, and running, and SQM is running, do nothing:
elif [ "$CONNECTIONSTATUS" = "2" ] && [ "$(service cake-autorate status)" = "running" ] && [ "$(service sqm status)" = "active with no instances" ] && [ "$DOYOULIKECAKE" = "1" ] && [ "$VPN" = "0" ] && [ "$VPNCHANGED" = "0" ]; then
    echo "Connection Status is 2: VPN is not active, VPN selected OFF, DOYOULIKECAKE is ON, CAKE and SQM running.. No change in interfaces: Do Nothing."
# typical on firstboot with VPN off:
# else if VPN is selected OFF (0), connection = 2 (good connection), and cake selector is ON, but not yet running:
elif [ "$CONNECTIONSTATUS" = "2" ] && [[ "$(service cake-autorate status)" == "not running" || "$(service cake-autorate status)" == "inactive" ]] && [ "$DOYOULIKECAKE" = "1" ] && [ "$VPN" = "0" ]; then
    echo "Connection status is 2: No VPN Active, VPN is selected OFF, Cake is NOT running, and cake-autorate selector is ON:"
    echo "(Re)starting SQM to verify that SQM is running on the current active WWAN interface..."
    if [ "$(service sqm status)" = "active with no instances" ] ; then
        # Probing if SQM is running first, instead of just doing a 'service sqm restart', avoids the alarming but normal 'Command failed: Not found' error.
        echo "SQM is running. Stopping..."
        echo "service sqm stop"
        service sqm stop
        sleep $WAITFOR
    fi
    echo "service sqm start"
    service sqm start
    sleep $WAITFOR
    echo "service cake-autorate start"
    echo "$WWANDEVICE" > /tmp/currentvpn.txt # write the current WWAN device (not interface!) to a file, to be used by the cake autorate script.
    sleep $WAITFOR
    # Config.primary.sh in /root/cake-autorate/ will pull the current VPN connection /tmp/currentvpn.txt, and operate on that.
    service cake-autorate start
    sleep $WAITFOR
# typical when script has been stopped, CAKE selector has been turned off, and script is re-run
# else if cake-autorate is running, but cake selector is set to OFF, then shut down cake-autorate:
elif [[ "$CONNECTIONSTATUS" == "2" || "$CONNECTIONSTATUS" == "3" ]] && [ "$(service cake-autorate status)" = "running" ] && [ "$DOYOULIKECAKE" = "0" ]; then
    echo "Connection Status is ${CONNECTIONSTATUS}, Cake-autorate is running, but is selected OFF. Stopping cake-autorate:"
    echo "service cake-autorate stop"
    service cake-autorate stop
    # This can be the case during manually running the script, or if you decided not to use cake-autorate for now, but forgot to disable it from startup
    # or it had been started previously but you don't want it running rn.
    sleep $WAITFOR
    # And if SQM is running, but FEELINGSQMISH SQM selector is also set to OFF, then also shut down SQM:
    if [ "$(service sqm status)" = "active with no instances" ] && [ "$FEELINGSQMISH" = "0" ]; then
        echo "SQM is also turned-off, but SQM is running. Stopping SQM:"
        echo "Stopping SQM..."
        echo "service sqm stop"
        service sqm stop
        # This can be the case during manually running the script, or if you decided not to use SQM for now, but forgot to disable it from startup
        # or it had been started previously but you don't want it running rn.
        sleep $WAITFOR
    elif [ "$(service sqm status)" = "active with no instances" ] && [ "$FEELINGSQMISH" = "1" ]; then
        echo "SQM is selected ON, and is running. Do nothing"
        sleep $WAITFOR
    fi
# else if Connection is 2 or 3 - good, w or wo VPN, Cake is selected OFF and is not running, do nothing
elif [[ "$CONNECTIONSTATUS" = "2" || "$CONNECTIONSTATUS" == "3" ]] && [[ "$(service cake-autorate status)" == "not running" || "$(service cake-autorate status)" == "inactive" ]] && [ "$DOYOULIKECAKE" = "0" ] ; then
    echo "CONNECTIONSTATUS is ${CONNECTIONSTATUS}. DOYOULIKECAKE is selected OFF and Cake-Autorate is not running. Do nothing."
    sleep $WAITFOR
    # and test SQM within that:
    if [ "$(service sqm status)" = "inactive" ] && [ "$FEELINGSQMISH" = "0" ] ; then
        echo "and SQM selector is also turned OFF and SQM is installed, and not running. Do nothing."
        sleep $WAITFOR
    elif [ "$(service sqm status)" = "inactive" ] && [ "$FEELINGSQMISH" = "1" ] ; then
        echo "and SQM is installed, selected ON, but not running. Starting..."
        echo "service sqm start"
        service sqm start
        sleep $WAITFOR
    fi
# normal condition when only SQM is installed on the router, is running, SQM is selected ON and running or not, Connection is 2 or 3 good w or wo VPN, do nothing generally but start SQM if not running:
elif [[ "$CONNECTIONSTATUS" = "2" || "$CONNECTIONSTATUS" == "3" ]] && [[ "$(service sqm status)" = "active with no instances" || "$(service sqm status)" = "inactive" ]] && [ "$FEELINGSQMISH" = "1" ]; then
    echo "CONNECTIONSTATUS is ${CONNECTIONSTATUS}. FEELINGSQMISH is ON. Evaluate if SQM is running..."
    sleep $WAITFOR
    # and test SQM within that:
    if [ "$(service sqm status)" = "active with no instances" ] && [ "$FEELINGSQMISH" = "1" ] ; then
        echo "SQM is running. Do nothing."
        sleep $WAITFOR
    elif [ "$(service sqm status)" = "inactive" ] && [ "$FEELINGSQMISH" = "1" ] ; then
        echo "SQM is not running, but installed. Starting SQM. Starting..."
        echo "service sqm start"
        service sqm start
        sleep $WAITFOR
    fi
# normal conditions when either SQM or Cake-Autorate are not installed at all on the router:
# if Connection status is 2 or 3 - good w or wo VPN, and Cake & SQM is selected OFF and both are not installed, continue..
elif ([ "$CONNECTIONSTATUS" = "2" ] || [ "$CONNECTIONSTATUS" = "3" ]) && \
    [ ! -f "/etc/init.d/cake-autorate" ] && \
    [ "$DOYOULIKECAKE" = "0" ] && \
    [ ! -f "/etc/init.d/sqm" ] && \
    [ "$FEELINGSQMISH" = "0" ]; then
    echo "SQM and Cake-Autorate: not enabled in script & not installed."
    echo "Connectionstatus is 2/3 (good)"
    echo "Do nothing - continue running"
# else if Connection is 2 / good w or wo VPN, SQM is selected ON yet not installed, log it & exit script.
elif ([ "$CONNECTIONSTATUS" = "2" ] || [ "$CONNECTIONSTATUS" = "3" ]) && \
    [ ! -f "/etc/init.d/sqm" ] && \
    [ "$FEELINGSQMISH" = "1" ]; then
    echo "SQM is not installed, but enabled in script." | tee -a "$log_file"
    echo "Connectionstatus is 2/3 (good)" | tee -a "$log_file"
    echo "Quitting... FIXME: Install SQM or disable it in this script." | tee -a "$log_file"
    exit # Immediately exit script
# else if Connection is 2 / good w or wo VPN, Cake-Autorate is selected ON and yet not installed, log it & exit script.
elif ([ "$CONNECTIONSTATUS" = "2" ] || [ "$CONNECTIONSTATUS" = "3" ]) && \
    [ ! -f "/etc/init.d/cake-autorate" ] && \
    [ "$DOYOULIKECAKE" = "1" ]; then
    echo "Cake-Autorate not installed, but enabled in script." | tee -a "$log_file"
    echo "Connectionstatus is 2/3 (good)" | tee -a "$log_file"
    echo "Quitting... FIXME: Install Cake-Autorate or disable it in this script" | tee -a "$log_file"
    exit # Immediately exit script
# else if we need a reboot because there was an fping failure, & connection status will be '0'. We didn't use to catch this, but
# now we do (20260219.xx updates added a 'fall-through' condition, below). So we have to account for it.
# fping previously failed (no throughput) after a modem_reset, reboot required.
elif [ "$CONNECTIONSTATUS" = "0" ] && [ "$NEEDREBOOT" = "1" ] ; then
    echo "Skipping CAKE-SQM portion of script, fping failed after modem_reset: reboot required" | tee -a "$log_file"
else
    # 'fall through' else condition. No previous conditions met: log everything and quit the script.
    echo "On $(date): ALERT!: SQM-CAKE section fall-through" | tee -a "$log_file"
    echo "FIXME! Some other error or condition unanticipated, in the CAKE SQM section of the script" | tee -a "$log_file"
    echo "Current values for:" | tee -a "$log_file"
    echo "NET_IF_NAME: ${NET_IF_NAME}" | tee -a "$log_file"
    echo "NET6_IF_NAME: ${NET6_IF_NAME}" | tee -a "$log_file"
    echo "MODEMTYPE: ${MODEMTYPE}" | tee -a "$log_file"
    echo "WWANNAME interface: ${WWANNAME}" | tee -a "$log_file"
    echo "WWANDEVICE device: ${WWANDEVICE}" | tee -a "$log_file"
    echo "CONNECTIONSTATUS: ${CONNECTIONSTATUS}" | tee -a "$log_file"
    echo "VPNCHANGED: ${VPNCHANGED}" | tee -a "$log_file"
    echo "DOYOULIKECAKE: ${DOYOULIKECAKE}" | tee -a "$log_file"
    echo "FEELINGSQMISH: ${FEELINGSQMISH}" | tee -a "$log_file"
    echo "NEEDREBOOT: ${NEEDREBOOT}" | tee -a "$log_file"
    echo "Exiting script..." | tee -a "$log_file"
    exit # Immediately exit script
fi # Finished starting Cake-autorate for current connection, if selected, and SQM. Restarting services or terminating as necessary.
echo
# Pause for LOOPWAIT then re-test connection, or if in testing-mode, go directly to exit:
if [ $TESTMODE = 0 ] ; then # now check to see if you should loop or break
    echo "Pausing ${LOOPWAIT} seconds and testing again..."
    sleep $LOOPWAIT
elif [ $TESTMODE = 1 ] ; then
    echo "Would loop at this point, but it is in Testing mode. Exiting.." | tee -a "$log_file"
    echo "Test run ended: $(date)" | tee -a "$log_file"
    exit
fi
echo
done # When CONNECTION STATUS no longer 1 or greater, or break
if [ $TESTMODE = 0 ] && [ $NEEDREBOOT = 1 ]; then
    echo "Preparing to reboot router:" | tee -a "$log_file"
    echo "Shutting down CAKE..." | tee -a "$log_file"
    service cake-autorate stop
    echo "Shutting down SQM..." | tee -a "$log_file"
    service sqm stop
    echo "Rebooting router in 5 seconds..." | tee -a "$log_file"
    sleep 5
    reboot
elif [ $TESTMODE = 1 ] && [ $NEEDREBOOT = 1 ]; then
    echo "Failed WWAN connectivity check. Reboot required, but script in testing mode... no auto-reboot. Exit." | tee -a "$log_file"
fi
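The branches above compare the raw output of `service ... status` against the same literal strings many times per pass. One possible simplification is to fold those comparisons into a single helper. The following is a minimal sketch, not part of the original script: `svc_state` and `svc_is_up` are hypothetical names, and the status strings are only the ones the script already tests for (they are not verified against every OpenWrt release).

```shell
#!/bin/sh
# svc_state STATUS_TEXT
# Map a raw status string to "up", "down" or "unknown".
# The strings are the ones the script above compares against.
svc_state() {
    case "$1" in
        "running"|"active with no instances") echo "up" ;;
        "not running"|"inactive") echo "down" ;;
        *) echo "unknown" ;;
    esac
}

# svc_is_up NAME (hypothetical wrapper)
# Succeeds when `service NAME status` reports one of the "up" strings.
svc_is_up() {
    [ "$(svc_state "$(service "$1" status 2>/dev/null)")" = "up" ]
}

# The mapping can be checked without touching any service:
svc_state "active with no instances"
svc_state "inactive"
```

With such a helper, a condition like `[ "$(service sqm status)" = "active with no instances" ]` could be written as `svc_is_up sqm`, and an unexpected status string would surface as "unknown" instead of silently matching no branch.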
Dynamic connection
Preserve default route to restore WAN connectivity when VPN is disconnected.
# Preserve default route
uci set network.wan.metric="1024"
uci commit network
service network restart
Dynamic address
Periodically re-resolve inactive peer hostnames for VPN peers with dynamic IP addresses.
# Periodically re-resolve inactive peers
cat << "EOF" >> /etc/crontabs/root
* * * * * /usr/bin/wireguard_watchdog
EOF
uci set system.@system[0].cronloglevel="9"
uci commit system
service cron restart
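Because the crontab entry above is appended with `>>`, running the snippet a second time leaves a duplicate line. One way to guard against that is sketched below. `add_cron_line` is a hypothetical helper, and the demonstration writes to a scratch file created with `mktemp` rather than to /etc/crontabs/root.

```shell
#!/bin/sh
# add_cron_line FILE LINE
# Append LINE to FILE only if an identical line is not already present.
add_cron_line() {
    file="$1"
    line="$2"
    touch "$file"
    # -x matches whole lines only, -F treats the asterisks literally
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demonstration against a scratch file (assumption: not the real crontab)
CRON_FILE="$(mktemp)"
add_cron_line "$CRON_FILE" "* * * * * /usr/bin/wireguard_watchdog"
add_cron_line "$CRON_FILE" "* * * * * /usr/bin/wireguard_watchdog"
cat "$CRON_FILE"
```

On a router the first argument would be /etc/crontabs/root, followed by `service cron restart` as in the snippet above.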
Race conditions
Resolve the race condition with sysntpd service when RTC is missing.
# Resolve race conditions
cat << "EOF" >> /etc/crontabs/root
* * * * * date -s 2030-01-01; service sysntpd restart
EOF
uci set system.@system[0].cronloglevel="9"
uci commit system
service cron restart
Site-to-site
Implement plain routing between server side LAN and client side LAN assuming that:
- 192.168.1.0/24 - server side LAN
- 192.168.2.0/24 - client side LAN
Add route to client side LAN on VPN server.
uci set network.wgclient.route_allowed_ips="1"
uci add_list network.wgclient.allowed_ips="192.168.2.0/24"
uci commit network
service network restart
Add route to server side LAN on VPN client.
uci set network.wgserver.route_allowed_ips="1"
uci add_list network.wgserver.allowed_ips="192.168.1.0/24"
uci commit network
service network restart
Consider VPN network as private and assign VPN interface to LAN zone on VPN client.
uci del_list firewall.wan.network="vpn"
uci add_list firewall.lan.network="vpn"
uci commit firewall
service firewall restart
IPv6 site-to-site
Provide IPv6 site-to-site connectivity assuming that:
- fd00:0:0:1::/64 - server side LAN
- fd00:0:0:2::/64 - client side LAN
Add route to client side LAN on VPN server.
uci set network.lan.ip6assign="64"
uci set network.lan.ip6hint="1"
uci set network.vpn.ip6prefix="fd00::/48"
uci add_list network.wgclient.allowed_ips="fd00:0:0:2::/64"
uci commit network
service network restart
Add route to server side LAN on VPN client.
uci set network.lan.ip6assign="64"
uci set network.lan.ip6hint="2"
uci set network.vpn.ip6prefix="fd00::/48"
uci add_list network.wgserver.allowed_ips="fd00:0:0:1::/64"
uci commit network
service network restart
Default gateway
If you do not need to route all traffic through the VPN, disable gateway redirection on the VPN client.
uci del_list network.wgserver.allowed_ips="0.0.0.0/0"
uci del_list network.wgserver.allowed_ips="::/0"
uci commit network
service network restart
To disable automatic routes for allowed IPs, remove the route_allowed_ips option on the VPN client.
uci -q delete network.wgserver.route_allowed_ips
uci commit network
service network restart
Split gateway
If the VPN gateway is separate from your LAN gateway, implement plain routing between the LAN and VPN networks, assuming that:
- 192.168.1.0/24 - LAN network
- 192.168.1.2/24 - VPN gateway
- 192.168.9.0/24 - VPN network
Add port forwarding for VPN server on LAN gateway.
uci -q delete firewall.wg
uci set firewall.wg="redirect"
uci set firewall.wg.name="Redirect-WireGuard"
uci set firewall.wg.src="wan"
uci set firewall.wg.src_dport="51820"
uci set firewall.wg.dest="lan"
uci set firewall.wg.dest_ip="192.168.1.2"
uci set firewall.wg.family="ipv4"
uci set firewall.wg.proto="udp"
uci set firewall.wg.target="DNAT"
uci commit firewall
service firewall restart
Add route to VPN network via VPN gateway on LAN gateway.
uci -q delete network.vpn
uci set network.vpn="route"
uci set network.vpn.interface="lan"
uci set network.vpn.target="192.168.9.0/24"
uci set network.vpn.gateway="192.168.1.2"
uci commit network
service network restart
IPv6 gateway
Set up IPv6 tunnel broker or use IPv6 NAT or NPT if necessary.
Disable ISP prefix delegation to prevent IPv6 leaks on VPN client.
DNS over VPN
Serve DNS for VPN clients on OpenWrt server when using point-to-point topology.
Route DNS over VPN to prevent DNS leaks on VPN client.
Replace peer DNS with public or VPN-specific DNS provider on OpenWrt client.
Modify the VPN connection using NetworkManager on Linux desktop client.
nmcli connection modify id VPN_CON \
ipv4.dns-search ~. ipv4.dns-priority -50 \
ipv6.dns-search ~. ipv6.dns-priority -50
Kill switch
Prevent traffic leaks on the OpenWrt client by isolating the VPN interface in a separate firewall zone.
uci -q delete firewall.vpn
uci set firewall.vpn="zone"
uci set firewall.vpn.name="vpn"
uci set firewall.vpn.input="REJECT"
uci set firewall.vpn.output="ACCEPT"
uci set firewall.vpn.forward="REJECT"
uci set firewall.vpn.masq="1"
uci set firewall.vpn.mtu_fix="1"
uci add_list firewall.vpn.network="vpn"
uci del_list firewall.wan.network="vpn"
uci -q delete firewall.@forwarding[0]
uci set firewall.lan_vpn="forwarding"
uci set firewall.lan_vpn.src="lan"
uci set firewall.lan_vpn.dest="vpn"
uci commit firewall
service firewall restart
Multi-client
Set up multi-client VPN server. Generate client keys and profiles. Configure VPN peers.
# Configuration parameters
VPN_IDS="wgserver wgclient wglaptop wgmobile"
VPN_PKI="."
VPN_IF="vpn"
VPN_PORT="$(uci -q get network.${VPN_IF}.listen_port)"
read -r VPN_ADDR VPN_ADDR6 \
< <(uci -q get network.${VPN_IF}.addresses)
# Fetch server address
NET_FQDN="$(uci -q get ddns.@service[0].lookup_host)"
. /lib/functions/network.sh
network_flush_cache
network_find_wan NET_IF
network_get_ipaddr NET_ADDR "${NET_IF}"
if [ -n "${NET_FQDN}" ]
then VPN_SERV="${NET_FQDN}"
else VPN_SERV="${NET_ADDR}"
fi
# Generate client keys
umask go=
mkdir -p ${VPN_PKI}
for VPN_ID in ${VPN_IDS#* }
do
wg genkey \
| tee ${VPN_PKI}/${VPN_ID}.key \
| wg pubkey > ${VPN_PKI}/${VPN_ID}.pub
wg genpsk > ${VPN_PKI}/${VPN_ID}.psk
done
# Generate client profiles
VPN_SFX="1"
for VPN_ID in ${VPN_IDS#* }
do
let VPN_SFX++
cat << EOF > ${VPN_PKI}/${VPN_ID}.conf
[Interface]
PrivateKey = $(cat ${VPN_PKI}/${VPN_ID}.key)
Address = ${VPN_ADDR%.*}.${VPN_SFX}/24
Address = ${VPN_ADDR6%:*}:${VPN_SFX}/64
DNS = ${VPN_ADDR%/*}
DNS = ${VPN_ADDR6%/*}
[Peer]
PublicKey = $(cat ${VPN_PKI}/${VPN_IDS%% *}.pub)
PresharedKey = $(cat ${VPN_PKI}/${VPN_ID}.psk)
PersistentKeepalive = 25
Endpoint = ${VPN_SERV}:${VPN_PORT}
AllowedIPs = 0.0.0.0/0
AllowedIPs = ::/0
EOF
done
ls ${VPN_PKI}/*.conf
# Back up client profiles (record the absolute path of the profile directory)
cat << EOF >> /etc/sysupgrade.conf
$(cd ${VPN_PKI} && pwd)
EOF
# Add VPN peers
VPN_SFX="1"
for VPN_ID in ${VPN_IDS#* }
do
let VPN_SFX++
uci -q delete network.${VPN_ID}
uci set network.${VPN_ID}="wireguard_${VPN_IF}"
uci set network.${VPN_ID}.description="${VPN_ID}"
uci set network.${VPN_ID}.private_key="$(cat ${VPN_PKI}/${VPN_ID}.key)"
uci set network.${VPN_ID}.public_key="$(cat ${VPN_PKI}/${VPN_ID}.pub)"
uci set network.${VPN_ID}.preshared_key="$(cat ${VPN_PKI}/${VPN_ID}.psk)"
uci add_list network.${VPN_ID}.allowed_ips="${VPN_ADDR%.*}.${VPN_SFX}/32"
uci add_list network.${VPN_ID}.allowed_ips="${VPN_ADDR6%:*}:${VPN_SFX}/128"
done
uci commit network
service network restart
Perform OpenWrt backup. Extract client profiles from the archive and import them to your clients.
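The multi-client script derives every client address from the server address with shell parameter expansion, which is easy to misread. The sketch below replays only that arithmetic, so it can be run anywhere without `uci` or `wg`. The two sample addresses are assumptions chosen for illustration, and `SERVER_ID`, `CLIENT_IDS` and `PLAN` are hypothetical variable names that do not appear in the script.

```shell
#!/bin/sh
# Sample values (assumptions); on a router these come from uci.
VPN_IDS="wgserver wgclient wglaptop wgmobile"
VPN_ADDR="192.168.9.1/24"
VPN_ADDR6="fd00:9::1/64"

# %% * strips everything after the first space: the server ID
SERVER_ID="${VPN_IDS%% *}"
# #*<space> strips the first word: the client IDs
CLIENT_IDS="${VPN_IDS#* }"

# %.* drops the last IPv4 octet and mask, %:* drops the last IPv6 group and mask
VPN_SFX=1
PLAN=""
for VPN_ID in ${CLIENT_IDS}
do
    VPN_SFX=$((VPN_SFX + 1))
    PLAN="${PLAN}${VPN_ID} ${VPN_ADDR%.*}.${VPN_SFX}/32 ${VPN_ADDR6%:*}:${VPN_SFX}/128
"
done

echo "server: ${SERVER_ID}"
printf '%s' "${PLAN}"
```

With these sample values the first client, wgclient, is planned as 192.168.9.2 and fd00:9::2. The server keeps suffix 1, which is why the counter starts at 1 and is incremented before use.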
Automated
Automated VPN server installation and client profiles generation.
URL="https://openwrt.org/_export/code/docs/guide-user/services/vpn/wireguard/server"
cat << EOF > wireguard-server.sh
$(wget -U "" -O - "${URL}?codeblock=0")
$(wget -U "" -O - "${URL}?codeblock=1")
$(wget -U "" -O - "${URL}?codeblock=2")
$(wget -U "" -O - "${URL}?codeblock=3")
$(wget -U "" -O - "${URL}/../extras?codeblock=15")
EOF
sh wireguard-server.sh
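Since the automated installer downloads and runs code in one step, it may help to see exactly which code blocks it would fetch before running it. The following is a minimal dry-run sketch: `codeblock_urls` is a hypothetical helper, and the block numbers are copied from the command above rather than checked against the live wiki pages.

```shell
#!/bin/sh
URL="https://openwrt.org/_export/code/docs/guide-user/services/vpn/wireguard/server"

# codeblock_urls
# Print one download URL per code block, in the order used by the installer.
codeblock_urls() {
    for N in 0 1 2 3
    do
        echo "${URL}?codeblock=${N}"
    done
    echo "${URL}/../extras?codeblock=15"
}

# Dry run: list what would be fetched, without downloading anything
codeblock_urls
```

Fetching each listed URL with `wget -U "" -O -` and saving the result to wireguard-server.sh gives a file that can be read through before `sh wireguard-server.sh` is run.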