Three branches, one head office, two months of cabling, and an unreasonable affection for /interface/wireguard. What it actually takes to run a small business on RouterOS — the cabling, the quirks, the firewall ruleset that finally let us sleep through a Sunday.
The CFO had asked, plainly, why a four-site shop with eight people in the head office needed a network refresh that started above two thousand euros. The honest answer was that it didn’t. The Cisco gear we had inherited from the 2018 fitout was still routing packets. It also took ninety seconds to reload a startup-config, ran an IOS train that hadn’t seen a security patch since the 19.x line, and required a Cisco Smart Account login that the previous IT contractor had taken with him when he left for Helsinki.
So the refresh was as much about ownership as it was about throughput. The question was which platform we’d be pleased to live with for the next five years. The shortlist, in order of how many minutes we spent on each: Ubiquiti UniFi Dream Machine SE plus three U6-LRs; Cisco Meraki MX67 across all sites; and MikroTik — an RB4011iGS+ at the head end with hAP ac3 units at the branches, and a CCR2004-1G-12S+2XS in the rack as the core router once we ran the second fibre.
Ubiquiti was eliminated on a Wednesday. We trialled a UDM Pro SE on a desk, and within two hours of attempting to define a non-trivial firewall rule with a service group — the kind we use for the SCADA jump host — we conceded that the GUI was lovely and the policy engine, on the day, was not. Meraki was eliminated by a finance line: at €614/year per site in licences, across four sites, the five-year cost was higher than our entire hardware budget plus a long weekend in Lisbon.
MikroTik won because, frankly, we already trusted it. Mihkel had been running an RB750Gr3 at his apiary for three years with a WireGuard tunnel to his phone, and he’d done the firewall by hand. The platform is unfashionable in 2026 — the cool kids buy OPNsense boxes and write their configs in Nix — but RouterOS does what it says, the docs are honest, and if you can read the wiki you can read the device. There is no per-seat licence. There is no cloud you cannot turn off.
This is the field report. It is not a tutorial. There is a tutorial here, in the configs we paste in §04 and §05, but the larger story is: what happens when you actually rack the metal. What surprises you on the second day. What you would have done differently if your future self had a five-minute conversation with your past self over coffee.
The principle was: buy the smallest box that will not be the bottleneck, and keep one of the field units in a drawer as a cold spare. We violated this principle exactly once, by buying the CCR2004 a year early. I do not regret it.
| Model | Role | Sited | Qty | Unit € | Line € |
|---|---|---|---|---|---|
| CCR2004-1G-12S+2XS | Core router, future 10G uplink | Tallinn HQ — rack U21 | 1 | 1,420.00 | 1,420.00 |
| RB4011iGS+RM | Edge router & WG hub | Tallinn HQ — rack U22 | 1 | 279.00 | 279.00 |
| RB4011iGS+RM | Cold spare for the edge | Drawer, Mihkel’s desk | 1 | 279.00 | 279.00 |
| hAP ac3 (RBD53iG-5HacD2HnD) | Branch edge + WAP | Tartu, Pärnu, Païde | 3 | 119.00 | 357.00 |
| CRS310-1G-5S-4S+IN | Aggregation, SCADA segment | Tallinn HQ — rack U20 | 1 | 389.00 | 389.00 |
| cAP ax (cAPGi-5HaxD2HaxD) | Ceiling AP, open-plan office | Tallinn HQ × 4, Tartu × 1 | 5 | 144.00 | 720.00 |
| S+RJ10 (10GBASE-T SFP+) | Uplink to LAN switch, transitional | HQ rack | 2 | 59.00 | 118.00 |
| XS+31LC10D (1/10/25G LR SFP28, 1310 nm) | Fibre uplink, future Telia drop | HQ rack — sealed bag | 2 | 94.00 | 188.00 |
| Cabling: Cat6a + LC-LC OM4 | Patch lot | All sites | 1 | 97.40 | 97.40 |
| Total, ex. VAT, line items | | | | | 3,847.40 |
The cold spare is the line item that gets questioned in every kickoff meeting and is non-negotiable in every postmortem. An RB4011 will fail eventually — the PSU more often than the board itself — and a sealed twin on a shelf with a backup of the running config means a cutover in eleven minutes, not eleven hours. We have not yet needed it. We will.
One note for buyers: the RBD53iG-5HacD2HnD is the hAP ac3 you want for a branch with five to ten people.
It has a gigabit WAN port, four gigabit LAN, and reasonable 802.11ac dual-band. If you have more than ten people on Wi-Fi you should
add a cAP ax overhead and let the hAP do routing only. We learned that in Tartu, where the hAP’s built-in radio held up
just until the third week, then began dropping the financial controller’s laptop twice a day.
I will draw the topology in prose because I have never seen a topology diagram on a glossy report that survived contact with reality longer than a quarter. Here is what we built.
The Tallinn head office sits in a former tram depot in Põhja-Tallinn, with a Telia VDSL2 drop (260/40 Mbit, on a good day)
and a backup LTE link via Elisa, terminated on an industrial CAT-M router in the same rack. The Telia drop hands off to the
RB4011 on ether1; the LTE link arrives on ether2 over a static private route. Inside the rack,
sfp-sfpplus1 on the RB4011 trunks to the CCR2004, which acts as the LAN core and pushes VLANs out to the floor switches.
The three branches each terminate a residential-class fibre — Tartu has 1000/300 Mbit from Tele2, Pärnu has 500/100 Mbit
from Telia, Païde has 100/100 Mbit from Infonet (the slow one is the design constraint that shaped most of what follows).
Each branch’s hAP ac3 holds the WAN address on ether1 and runs a single WireGuard tunnel back to the RB4011 hub.
Tartu is special. Tartu is where the SCADA test bench lives part-time, and where our second-largest team works. We promoted Tartu to a half-mesh peer: it holds two WireGuard tunnels — one to the HQ hub, and one direct to Pärnu. That second tunnel exists for a single use case: when the Tartu team is debugging a substation simulation that runs out of Pärnu and the round-trip via Tallinn jitters by 18–22 ms. With the direct tunnel, jitter dropped to under 3 ms.
Tallinn HQ: Former tram depot, Põhja-Tallinn. Six engineers, two admin, three management.
Tartu: Riia 181 b. Eight engineers, a partial SCADA mirror, the noisy team. Tunnels: wg-hq, wg-prn.
Pärnu: Aida 7. Five engineers, the substation simulator, the loudest UPS.
Païde: Tallinna 14. Three engineers, the document scanner, a coffee machine.
Addressing was decided in twenty minutes over a whiteboard with Mihkel. We picked the 10.42.0.0/16
supernet because it does not collide with anything any of our partners use, and because forty-two is an internal joke that has outlived
its decency. Branches get a /24 each. The WireGuard transport is a separate /24 we keep entirely off the LAN routing table except through
OSPF; this lets us trace tunnel-only problems without staring at LAN flows.
“Buy the smallest box that will not be the bottleneck, and keep its twin in a drawer. The day you need it, you will not have time to learn to be grateful.”
WireGuard is, in 2026, the only sane reason to choose RouterOS 7 over 6.49. We had been holding on the 6.49 long-term branch because RouterOS 7’s early releases were not kind to anyone who needed CAPsMAN. By 7.16.1 the bugs we cared about were either fixed or documented, and the WireGuard implementation (native, in the kernel, no userspace shim) was benchmarking at 938 Mbit/s symmetric on an RB4011, with the CPU at 41% on a single core. That is enough.
The interface stanza on the RB4011 hub looks like this. The key has been re-derived for publication; do not copy it.
    # Create the WG interface on the hub
    /interface wireguard
    add name=wg-hub listen-port=51820 mtu=1420 \
        private-key="yA8c+5xZk2QdH7w0nW3pP+r1L9oXgVuM6bF4tS0eJqU="

    # Address the tunnel on /30 per peer for cleanliness
    /ip address
    # to Tartu
    add address=10.99.0.1/30 interface=wg-hub network=10.99.0.0
    # to Pärnu
    add address=10.99.0.5/30 interface=wg-hub network=10.99.0.4
    # to Païde
    add address=10.99.0.9/30 interface=wg-hub network=10.99.0.8

    # Peer: Tartu (note allowed-address includes its branch /24)
    /interface wireguard peers
    add interface=wg-hub name=peer-tartu \
        public-key="5pE2v7H+nQ8mTk0jXrYcL9aBdGsW3oR4uF6iZyKqI1c=" \
        endpoint-address=tartu.kpinge.ee endpoint-port=51820 \
        allowed-address=10.99.0.2/32,10.42.10.0/24 \
        persistent-keepalive=21s comment="branch-tartu/tele2"
    add interface=wg-hub name=peer-parnu \
        public-key="9Tg4xB3w+VnK7sZ2pUcM5aHdOfQ8jL1iE0rYqJyXzG4=" \
        endpoint-address=parnu.kpinge.ee endpoint-port=51820 \
        allowed-address=10.99.0.6/32,10.42.20.0/24 \
        persistent-keepalive=21s comment="branch-parnu/telia"
    add interface=wg-hub name=peer-paide \
        public-key="3Bm5cF2v+nL6kY1qWdN9aHsXgOeRtU0iE4pZyJxKqI8=" \
        endpoint-address=paide.kpinge.ee endpoint-port=51820 \
        allowed-address=10.99.0.10/32,10.42.30.0/24 \
        persistent-keepalive=21s comment="branch-paide/infonet"
The corresponding stanza on the Tartu hAP, which holds two tunnels (to the hub and to Pärnu, for the half-mesh):
    /interface wireguard
    add name=wg-hq listen-port=51820 mtu=1420 \
        private-key="7sY3qM8nR+kJ4xC2vWdL5aBfGoP1tU9iE6pZyHcKqI0="
    add name=wg-prn listen-port=51821 mtu=1420 \
        private-key="2aX9vL5kM+nQ6pT3rUdH8sBfWoG4yC1iE7zYqJxNcI9="

    /ip address
    add address=10.99.0.2/30 interface=wg-hq network=10.99.0.0
    add address=10.99.1.1/30 interface=wg-prn network=10.99.1.0

    /interface wireguard peers
    add interface=wg-hq name=peer-hq \
        public-key="Hub5xZ2vL+nM8kT3pQdR9aBfYoC1iE6sJyHcXqWnI4=" \
        endpoint-address=hq.kpinge.ee endpoint-port=51820 \
        allowed-address=10.99.0.0/24,10.42.0.0/22,10.42.250.0/24 \
        persistent-keepalive=21s
    add interface=wg-prn name=peer-prn \
        public-key="9Tg4xB3w+VnK7sZ2pUcM5aHdOfQ8jL1iE0rYqJyXzG4=" \
        endpoint-address=parnu.kpinge.ee endpoint-port=51821 \
        allowed-address=10.99.1.2/32,10.42.20.0/24 \
        persistent-keepalive=21s comment="half-mesh, SCADA latency"
Three subtleties worth pointing out, since they cost us a Saturday afternoon between them.
Subtlety one. Each branch’s public hostname (tartu.kpinge.ee, etc.) resolves via Cloudflare with a TTL of 60s.
Two of our three ISPs do not give us a static public address, so we run a tiny script on each hAP that updates the A record over Cloudflare’s
API when /ip cloud sees the WAN IP change. It runs every 5 minutes; in two months it has triggered nine times.
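For the curious, the updater has roughly the shape below. Treat it as a sketch under stated assumptions, not our production script: the zone ID, record ID and API token are placeholders, the global `cfLastIp` is a name invented for this example, and it assumes `/ip cloud` has `ddns-enabled=yes` so that `public-address` is populated. The endpoint is Cloudflare's v4 DNS record update call.

```routeros
# Sketch of a Cloudflare A-record updater (placeholder values throughout)
:local zoneId "ZONE_ID_PLACEHOLDER"
:local recordId "RECORD_ID_PLACEHOLDER"
:local apiToken "API_TOKEN_PLACEHOLDER"
:local host "tartu.kpinge.ee"

# Remember the last address we pushed so unchanged runs do nothing
:global cfLastIp
:local wanIp [/ip cloud get public-address]

:if (([:len $wanIp] > 0) && ($wanIp != $cfLastIp)) do={
    :local body "{\"type\":\"A\",\"name\":\"$host\",\"content\":\"$wanIp\",\"ttl\":60,\"proxied\":false}"
    /tool fetch http-method=put output=none \
        url="https://api.cloudflare.com/client/v4/zones/$zoneId/dns_records/$recordId" \
        http-header-field="Authorization: Bearer $apiToken,Content-Type: application/json" \
        http-data=$body
    # Only reached if fetch did not raise an error, so a failed push is retried next run
    :set cfLastIp $wanIp
    :log info "cf-ddns: $host now $wanIp"
}
```

Saved as a script under a name such as `cf-ddns` (hypothetical), a scheduler entry like `/system scheduler add name=cf-ddns interval=5m on-event=cf-ddns` would give the five-minute cadence.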
Subtlety two. The half-mesh tunnel between Tartu and Pärnu lives on port 51821, not 51820, because the hAP cannot listen on the same port from two different WireGuard interfaces. RouterOS will accept that config in WinBox without an error and then quietly fail. Use a different port and your life improves.
Subtlety three. The allowed-address list on the hub must include each peer’s tunnel /32
and the branch’s LAN /24. If you omit the tunnel /32, OSPF Hellos do not flow even though ping over the tunnel works.
This is the kind of asymmetric brokenness that takes forty minutes to diagnose if you do not already know.
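A quick way to tell this failure apart from a dead tunnel, sketched for the hub and the Tartu peer using the addresses from the stanzas above. The commands are standard RouterOS 7; the interpretation in the comments is our reading, not a guarantee.

```routeros
# Confirm the Tartu peer lists both the tunnel /32 and the branch /24
/interface wireguard peers print detail where interface=wg-hub

# Ping the far tunnel address, sourced from the hub's own tunnel address.
# WireGuard only passes packets whose addresses match allowed-address,
# so this should fail when 10.99.0.2/32 is missing from the peer entry
# even while LAN-to-LAN pings keep working.
/ping 10.99.0.2 src-address=10.99.0.1 count=3

# Check whether an adjacency has actually formed
/routing ospf neighbor print
```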
OSPF is unfashionable. The greybeards run BGP between sites because they ran it in 2008 and the muscle memory is honest; the kids run a controller and pretend it is a routing protocol. For four sites, OSPF is the obvious tool: it converges in seconds, it does ECMP if you ask, and the failure case — the LTE backup link rising while the VDSL drop dies — is exactly the case OSPF was designed for.
The config is small enough to print. This is the hub stanza:
    /routing ospf instance
    add name=ospf-core router-id=10.42.0.1 disabled=no
    /routing ospf area
    add name=backbone instance=ospf-core area-id=0.0.0.0
    /routing ospf interface-template
    add area=backbone interfaces=wg-hub cost=10 \
        auth=md5 auth-id=1 auth-key="K0rgepinge!04May2026" \
        hello-interval=10s dead-interval=40s
    add area=backbone interfaces=lte-standby cost=200 \
        auth=md5 auth-id=1 auth-key="K0rgepinge!04May2026" \
        hello-interval=10s dead-interval=40s
    add area=backbone networks=10.42.0.0/22 type=stub
    /routing ospf static-neighbor
    # none: we use multicast on the tunnels
The branches are symmetrical. Each hAP ac3 announces its branch LAN /24 as a stub network and forms an adjacency with the hub
over the WireGuard tunnel. Cost 10 across all primary tunnels, 200 on the LTE standby. The cost gap of 190 is more than enough
to ensure the standby route only carries traffic when the primary genuinely dies.
The numeric story, from /log print where topics~"ospf" in the two months since cutover:
| Date (EET) | Site | Event | Convergence | Cause |
|---|---|---|---|---|
| 08 May 14:22 | Pärnu | Adjacency down → up | 11.4 s | ISP flap, Telia |
| 14 May 03:11 | HQ | VDSL down, LTE took over | 23.8 s | VDSL maintenance window |
| 14 May 04:47 | HQ | VDSL up, LTE released | 8.9 s | Maintenance ended |
| 22 May 19:03 | Païde | Adjacency down → up | 14.7 s | Infonet brownout |
| 29 May 11:30 | Tartu | Tartu↔Pärnu peer cycled | 3.2 s | Scheduled MTU test |
| 02 Jun 02:14 | HQ | VDSL flap, no LTE handover | 1.6 s | BFD-like quick recovery |
| 07 Jun 16:48 | Pärnu | Adjacency down → up | 9.5 s | Power blip, UPS rode |
The takeaway, if you do not want to read the table, is that for two months we have had a network that has fixed itself seven times without anyone calling Mihkel. The number we care about is the 23.8s on 14 May: a sleep-through-able cutover from VDSL to LTE in the middle of the night, with no human in the loop.
The first time the LTE standby actually carried production traffic, at 03:11 EET on 14 May, I was asleep in Põhja-Tallinn and learned about it from the morning’s syslog. That was the moment I stopped worrying about whether we’d made the right platform choice.
The default RouterOS firewall, the one the Quick Set wizard gives you, is fine for a home network and is dangerously close to adequate for a small office. It is not enough for ours. The reason is that we have a SCADA segment that must talk only to specific hosts on specific ports, and an internal jump-host pattern that wants tight inbound discipline. We rewrote the filter chain.
    /ip firewall filter
    # INPUT chain: what is allowed to talk to the router itself
    add chain=input action=accept connection-state=established,related comment="est/rel"
    add chain=input action=drop connection-state=invalid comment="invalid"
    add chain=input action=accept protocol=icmp comment="icmp ok"
    add chain=input action=accept in-interface-list=LAN comment="lan mgmt"
    add chain=input action=accept in-interface=wg-hub src-address-list=mgmt-jump comment="wg mgmt"
    add chain=input action=accept protocol=udp dst-port=51820 comment="wg ingress"
    add chain=input action=drop log=yes log-prefix="in-drop" comment="deny rest"

    # FORWARD chain: traffic crossing the router
    add chain=forward action=accept connection-state=established,related
    add chain=forward action=drop connection-state=invalid
    add chain=forward action=drop src-address-list=bogon in-interface-list=WAN log=yes log-prefix="bogon"
    add chain=forward action=accept in-interface-list=LAN out-interface-list=WAN comment="lan to wan"
    add chain=forward action=accept in-interface-list=LAN out-interface=wg-hub comment="lan to branches"
    add chain=forward action=accept in-interface=wg-hub out-interface-list=LAN comment="branches to lan"

    # SCADA segment: only listed hosts may reach VLAN 250 on the listed ports
    add chain=forward action=accept src-address-list=scada-trusted \
        dst-address=10.42.250.0/24 protocol=tcp dst-port=1433,4840,502 \
        comment="scada in"
    add chain=forward action=drop dst-address=10.42.250.0/24 log=yes log-prefix="scada-deny" \
        comment="scada deny rest"

    # Everything else crossing the router: drop and log
    add chain=forward action=drop log=yes log-prefix="fwd-drop"

    # RAW chain: drops before conntrack, saves CPU on a hammered hub
    /ip firewall raw
    add chain=prerouting action=drop src-address-list=bogon in-interface-list=WAN
    add chain=prerouting action=drop dst-address-type=broadcast in-interface-list=WAN
Read that carefully and you will note we use interface-list=LAN and interface-list=WAN rather than
naming interfaces directly. This is one of RouterOS’s under-loved features: an interface list is a named bag of interfaces
you reference once in your filter, and then membership can change with the topology without rewriting the rules. We have LAN
containing bridge-lan, vlan10-staff, vlan20-guest, and WAN containing ether1, lte-standby.
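For reference, the two lists described above would be defined roughly like this. The list and interface names are the ones from our config; everything else is stock RouterOS syntax.

```routeros
# Define the two lists once
/interface list
add name=LAN
add name=WAN

# Membership is what changes with the topology; the filter rules do not
/interface list member
add list=LAN interface=bridge-lan
add list=LAN interface=vlan10-staff
add list=LAN interface=vlan20-guest
add list=WAN interface=ether1
add list=WAN interface=lte-standby
```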
About the fasttrack lie: every RouterOS tutorial you read in 2020–2022 told you to enable fasttrack-connection in your filter chain to get line-rate forwarding. Fasttrack works by skipping the rest of the firewall and most of the IP stack — and, critically, it skips queues, mangle, and any per-connection accounting. On a small office hub where you actually want to mark SCADA traffic via mangle, or rate-limit guest Wi-Fi via a simple queue, fasttrack will undo all of it silently. We disabled fasttrack on the hub and accepted the CPU cost (a four-percent increase, fine on the RB4011) in exchange for having queues and marks that actually take effect. Leave fasttrack on if you only need NAT and throughput. Turn it off the moment you write your first mangle rule.
Fasttrack is the feature that makes RouterOS feel fast and the rule that makes everything else you wrote irrelevant. Pick one.
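In practice the switch-off is one line. The sketch below disables any existing fasttrack rule and then adds a connection mark of the kind fasttrack would have bypassed. The mark name `scada-conn` is hypothetical; the subnet is the SCADA segment from the filter rules.

```routeros
/ip firewall filter
# Disable rather than delete, so the rule can be re-enabled for a before/after comparison
disable [find where action=fasttrack-connection]

/ip firewall mangle
# Hypothetical mark name; with fasttrack off, this rule sees every SCADA-bound connection
add chain=prerouting action=mark-connection dst-address=10.42.250.0/24 \
    new-connection-mark=scada-conn passthrough=yes comment="mark scada"
```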
The first quirk is the one that bites every first-timer: interface lists are how RouterOS expects you to organize
your firewall, and the wizard does not set them up for you in a way that survives a topology change. Specifically, when you add a new
VLAN, you must remember to add it to the LAN interface list, or your filter rule that says “allow LAN to WAN”
will silently fail to allow the new VLAN. We lost twenty minutes to this on 12 May.
The second quirk is queue-tree vs simple queues. Simple queues are easier; queue trees are correct. We used a queue tree for the guest VLAN throttle, with parent queues per VLAN and child queues per host. The simple queue we tried first treated the entire VLAN as a single bucket and the marketing manager’s 4K upload cratered everyone else for fifteen seconds.
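A minimal sketch of the shape, download direction only. Two caveats: the 50M ceiling and all queue and mark names are invented for illustration, and instead of writing one child queue per host by hand this version uses a PCQ queue type to split the VLAN's bucket per destination address, which is a substitute technique and not necessarily what we run. It also assumes fasttrack is off, since fasttracked packets skip mangle.

```routeros
# Mark packets heading out towards guest clients
/ip firewall mangle
add chain=forward out-interface=vlan20-guest action=mark-packet \
    new-packet-mark=guest-down passthrough=no comment="to guest clients"

# PCQ splits the available rate evenly between destination hosts (pcq-rate=0 means no fixed per-host cap)
/queue type
add name=pcq-guest-down kind=pcq pcq-classifier=dst-address pcq-rate=0

# Parent queue caps the whole VLAN; the child applies the per-host split
/queue tree
add name=guest-vlan-down parent=vlan20-guest max-limit=50M
add name=guest-hosts-down parent=guest-vlan-down packet-mark=guest-down queue=pcq-guest-down
```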
The third is RouterOS 7’s default Wi-Fi package. On the hAP ac3, the legacy wireless
package gives you everything you remember from RouterOS 6 (CAPsMAN, virtual APs, the lot). The newer wifi package
is the future but doesn’t have full CAPsMAN parity yet. We standardized on the legacy package across all four sites for now.
We’ll migrate when CAPsMAN parity ships in 7.18 or later.
Adding a new VLAN
Create the bridge VLAN, add the bridge port with the tag, set the address. Then add the new bridge VLAN interface
to the LAN interface list. Forgetting this is the #1 ten-minute mystery.
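Spelled out, the four steps look roughly like this. The VLAN ID 30, the name `vlan30-lab`, the tagged port `ether3` and the subnet are hypothetical values for illustration; `bridge-lan` and the `LAN` list are the names from our config. It assumes the bridge already runs with VLAN filtering enabled.

```routeros
# 1. Bridge VLAN table entry: tag the bridge itself (for the router) and the trunk port
/interface bridge vlan
add bridge=bridge-lan vlan-ids=30 tagged=bridge-lan,ether3

# 2. VLAN interface on top of the bridge
/interface vlan
add name=vlan30-lab interface=bridge-lan vlan-id=30

# 3. Gateway address for the new segment
/ip address
add address=10.42.3.1/24 interface=vlan30-lab

# 4. The step everyone forgets: list membership, so "lan to wan" applies
/interface list member
add list=LAN interface=vlan30-lab
```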
If WG handshake never completes, check /ip firewall connection for stuck UDP entries on 51820;
flush only those with /ip firewall connection remove and a [find] filtered on protocol and port (a bare [find] removes every tracked connection). The kernel sometimes pins them.
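A sketch of the targeted flush, assuming the tunnel port is 51820 as in our config. Connection-table addresses carry the port as a suffix, so a regex match on `:51820` is enough.

```routeros
# Look before you flush: list UDP entries whose destination is port 51820
/ip firewall connection print where protocol=udp dst-address~":51820"

# Remove only those entries
/ip firewall connection remove [find where protocol=udp dst-address~":51820"]
```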
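Hold reset while applying power until the LED starts flashing (about 5 s), then release; this resets the unit to its default configuration. Netinstall mode needs a much longer hold and is not what this restore uses. Connect with WinBox over MAC, upload the .backup, and restore it. A factory RB4011 takes 4 minutes to restore. Time the cold spare drill quarterly.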
We take an .rsc (text) export and a .backup (binary, with secrets) nightly at 02:30 EET
via a scheduled script. The text export is committed to a private Gitea repo in the HQ rack with one-week retention; the binary backup
ships to a Hetzner Storage Box in Helsinki, encrypted with age, key held by myself and Mihkel on YubiKeys.

RB4011 racked, CCR2004 racked, both flashed to 7.16.1. Branch hAPs configured at HQ desk, labeled with site names in Brother P-touch tape (TZe-231), shipped to branches by GLS.
SCADA segment switch (CRS310) staged and tested with a borrowed PLC on the bench. Confirmed line-rate forwarding to VLAN 250.
Old Cisco SG350 powered off at 06:04. RB4011 brought up on Telia VDSL at 06:11. Internal VLANs migrated at 06:24. First WireGuard peer (Tartu) handshook at 06:38. Pärnu and Païde followed at 06:44 and 06:49. OSPF adjacencies converged at 06:54. Validation script completed at 07:39. Coffee at 07:42.
Three minor incidents: the VLAN-list miss on 12 May (twenty minutes), a Cloudflare DDNS hiccup on 14 May (auto-recovered), and a misconfigured queue tree on 16 May that throttled the wrong VLAN for nine minutes.
Backups run nightly, OSPF self-heals as documented, the cold spare lives in the drawer. Quarterly cold-spare drill scheduled for 04 August. The CCR2004 is currently sitting at 3% CPU; the RB4011 averages 11% with spikes to 41% during the morning sync from the substation simulator in Pärnu.
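The on-router half of that nightly backup can be as small as one scheduler entry. This is a sketch, not our exact job: the file names are hypothetical, the start time is router-local (assumed EET), and the push to Gitea and the Storage Box happens outside RouterOS and is not shown.

```routeros
/system scheduler
# Writes nightly-export.rsc (text) and nightly-backup.backup (binary) to router storage
add name=nightly-backup start-time=02:30:00 interval=1d \
    on-event="/export file=nightly-export; /system backup save name=nightly-backup"
```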