MikroTik in the rack: a RouterBOARD field report

§ 01 · The brief

A small company, an old rack, and a refusal to pay per-seat licensing.

The CFO had asked, plainly, why a four-site shop with eight people in the head office needed a network refresh that started above two thousand euros. The honest answer was that it didn’t. The Cisco gear we had inherited from the 2018 fitout was still routing packets. It also took ninety seconds to reload a startup-config, ran an IOS train that hadn’t seen a security patch since the 19.x line, and required a Cisco Smart Account login that the previous IT contractor had taken with him when he left for Helsinki.

So the refresh was as much about ownership as it was about throughput. The question was which platform we’d be pleased to live with for the next five years. The shortlist, in order of how many minutes we spent on each: Ubiquiti UniFi Dream Machine SE plus three U6-LRs; Cisco Meraki MX67 across all sites; and MikroTik — an RB4011iGS+ at the head end with hAP ac3 units at the branches, and a CCR2004-1G-12S+2XS in the rack as the core router once we ran the second fibre.

Ubiquiti was eliminated on a Wednesday. We trialled a UDM Pro SE on a desk, and within two hours of attempting to define a non-trivial firewall rule with a service group — the kind we use for the SCADA jump host — we conceded that the GUI was lovely and the policy engine, on the day, was not. Meraki was eliminated by a finance line: at €614/year per site in licences, across four sites, the five-year cost was higher than our entire hardware budget plus a long weekend in Lisbon.

MikroTik won because, frankly, we already trusted it. Mihkel had been running an RB750Gr3 at his apiary for three years with a WireGuard tunnel to his phone, and he’d done the firewall by hand. The platform is unfashionable in 2026 — the cool kids buy OPNsense boxes and write their configs in Nix — but RouterOS does what it says, the docs are honest, and if you can read the wiki you can read the device. There is no per-seat licence. There is no cloud you cannot turn off.

This is the field report. It is not a tutorial. There is a tutorial here, in the configs we paste in §04 and §05, but the larger story is: what happens when you actually rack the metal. What surprises you on the second day. What you would have done differently if your future self had a five-minute conversation with your past self over coffee.

§ 02 · The bill of materials

Seven boxes, one cold spare, €3,847 with VAT and shipping from Riga.

The principle was: buy the smallest box that will not be the bottleneck, and keep one of the field units in a drawer as a cold spare. We violated this principle exactly once, by buying the CCR2004 a year early. I do not regret it.

Table 1 — As-purchased hardware, March 2026
Model	Role	Sited	Qty	Unit €	Line €
CCR2004-1G-12S+2XS	Core router, future 10G uplink	Tallinn HQ — rack U21	1	1,420.00	1,420.00
RB4011iGS+RM	Edge router & WG hub	Tallinn HQ — rack U22	1	279.00	279.00
RB4011iGS+RM	Cold spare for the edge	Drawer, Mihkel’s desk	1	279.00	279.00
hAP ac3 (RBD53iG-5HacD2HnD)	Branch edge + WAP	Tartu, Pärnu, Païde	3	119.00	357.00
CRS310-1G-5S-4S+IN	Aggregation, SCADA segment	Tallinn HQ — rack U20	1	389.00	389.00
cAP ax (cAPGi-5HaxD2HaxD)	Ceiling AP, open-plan office	Tallinn HQ × 4, Tartu × 1	5	144.00	720.00
S+RJ10 (10GBASE-T SFP+)	Uplink to LAN switch, transitional	HQ rack	2	59.00	118.00
XS+31LC10D (10G SR SFP28)	Fibre uplink, future Telia drop	HQ rack — sealed bag	2	94.00	188.00
Cabling: Cat6a + LC-LC OM4	Patch lot	All sites	1	97.40	97.40
Total, ex. VAT, line items					3,847.40

The cold spare is the line item that gets questioned in every kickoff meeting and is non-negotiable in every postmortem. An RB4011 will fail eventually — the PSU more often than the board itself — and a sealed twin on a shelf with a backup of the running config means a cutover in eleven minutes, not eleven hours. We have not yet needed it. We will.

One note for buyers: the RBD53iG-5HacD2HnD is the hAP ac3 you want for a branch with five to ten people. It has a gigabit WAN port, four gigabit LAN, and reasonable 802.11ac dual-band. If you have more than ten people on Wi-Fi you should add a cAP ax overhead and let the hAP do routing only. We learned that in Tartu, where the hAP’s built-in radio held up just until the third week, then began dropping the financial controller’s laptop twice a day.

§ 03 · Topology

Hub-and-spoke, with one branch promoted to a half-mesh peer.

I will draw the topology in prose because I have never seen a topology diagram on a glossy report that survived contact with reality longer than a quarter. Here is what we built.

The Tallinn head office sits in a former tram depot in Põhja-Tallinn, with a Telia VDSL2 drop (260/40 Mbit, on a good day) and a backup LTE link via Elisa, terminated on an industrial CAT-M router in the same rack. The Telia drop hands off to the RB4011 on ether1; the LTE link arrives on ether2 over a static private route. Inside the rack, sfp-sfpplus1 on the RB4011 trunks to the CCR2004, which acts as the LAN core and pushes VLANs out to the floor switches.

The three branches each terminate a residential-class fibre — Tartu has 1000/300 Mbit from Tele2, Pärnu has 500/100 Mbit from Telia, Païde has 100/100 Mbit from Infonet (the slow one is the design constraint that shaped most of what follows). Each branch’s hAP ac3 holds the WAN address on ether1 and runs a single WireGuard tunnel back to the RB4011 hub.

Tartu is special. Tartu is where the SCADA test bench lives part-time, and where our second-largest team works. We promoted Tartu to a half-mesh peer: it holds two WireGuard tunnels — one to the HQ hub, and one direct to Pärnu. That second tunnel exists for a single use case: when the Tartu team is debugging a substation simulation that runs out of Pärnu and the round-trip via Tallinn jitters by 18–22 ms. With the direct tunnel, jitter dropped to under 3 ms.

Node A — HQ Tallinn• LIVE

Tallinn head office

Former tram depot, Põhja-Tallinn. Six engineers, two admin, three management.

CCR2004-1G-12S+2XS — core, 10G future
RB4011iGS+RM — edge, WG hub, OSPF DR
CRS310-1G-5S-4S+IN — SCADA agg
4 × cAP ax overhead
Telia VDSL2 260/40 — primary
Elisa CAT-M LTE — standby, OSPF cost 200

Node B — Tartu• LIVE

Tartu branch — half-mesh peer

Riia 181 b. Eight engineers, a partial SCADA mirror, the noisy team.

hAP ac3 — edge + WG x2
1 × cAP ax overhead in the open office
WG tunnel to HQ — wg-hq
WG tunnel to Pärnu — wg-prn
Tele2 fibre 1000/300

Node C — Pärnu• LIVE

Pärnu branch

Aida 7. Five engineers, the substation simulator, the loudest UPS.

hAP ac3 — edge + WG to HQ
WG tunnel to Tartu (peering)
Telia fibre 500/100

Node D — Païde• STANDBY

Païde branch

Tallinna 14. Three engineers, the document scanner, a coffee machine.

hAP ac3 — edge + WG to HQ
WG tunnel to HQ only
Infonet 100/100 — weak link

Addressing was decided in twenty minutes over a whiteboard with Mihkel. We picked the 10.42.0.0/16 supernet because it does not collide with anything any of our partners use, and because forty-two is an internal joke that has outlived its decency. Branches get a /24 each. The WireGuard transport is a separate /24 we keep entirely off the LAN routing table except through OSPF; this lets us trace tunnel-only problems without staring at LAN flows.

§ 04 · WireGuard between sites

The reason RouterOS 7 finally earned its upgrade.

WireGuard is, in 2026, the only sane reason to choose RouterOS 7 over 6.49. We had been holding on the 6.49 long-term branch because RouterOS 7’s early releases were not kind to anyone who needed CapsMAN. By 7.16.1 the bugs we cared about were either fixed or documented, and the WireGuard implementation (native, in-kernel, no userspace shim) was benchmarking at 938 Mbit/s symmetric on an RB4011, with the CPU at 41% on a single core. That is enough.

The interface stanza on the RB4011 hub looks like this. The key has been re-derived for publication; do not copy it.

# Create the WG interface on the hub
/interface wireguard
add name=wg-hub listen-port=51820 mtu=1420 \
    private-key="yA8c+5xZk2QdH7w0nW3pP+r1L9oXgVuM6bF4tS0eJqU="

# Address the tunnel on /30 per peer for cleanliness
/ip address
add address=10.99.0.1/30  interface=wg-hub network=10.99.0.0  comment="to tartu"
add address=10.99.0.5/30  interface=wg-hub network=10.99.0.4  comment="to parnu"
add address=10.99.0.9/30  interface=wg-hub network=10.99.0.8  comment="to paide"

# Peer: Tartu (note allowed-address includes its branch /24)
/interface wireguard peers
add interface=wg-hub name=peer-tartu \
    public-key="5pE2v7H+nQ8mTk0jXrYcL9aBdGsW3oR4uF6iZyKqI1c=" \
    endpoint-address=tartu.kpinge.ee endpoint-port=51820 \
    allowed-address=10.99.0.2/32,10.42.10.0/24 \
    persistent-keepalive=21s comment="branch-tartu/tele2"

add interface=wg-hub name=peer-parnu \
    public-key="9Tg4xB3w+VnK7sZ2pUcM5aHdOfQ8jL1iE0rYqJyXzG4=" \
    endpoint-address=parnu.kpinge.ee endpoint-port=51820 \
    allowed-address=10.99.0.6/32,10.42.20.0/24 \
    persistent-keepalive=21s comment="branch-parnu/telia"

add interface=wg-hub name=peer-paide \
    public-key="3Bm5cF2v+nL6kY1qWdN9aHsXgOeRtU0iE4pZyJxKqI8=" \
    endpoint-address=paide.kpinge.ee endpoint-port=51820 \
    allowed-address=10.99.0.10/32,10.42.30.0/24 \
    persistent-keepalive=21s comment="branch-paide/infonet"

The corresponding stanza on the Tartu hAP, which holds two tunnels (to the hub and to Pärnu, for the half-mesh):

/interface wireguard
add name=wg-hq listen-port=51820 mtu=1420 \
    private-key="7sY3qM8nR+kJ4xC2vWdL5aBfGoP1tU9iE6pZyHcKqI0="
add name=wg-prn listen-port=51821 mtu=1420 \
    private-key="2aX9vL5kM+nQ6pT3rUdH8sBfWoG4yC1iE7zYqJxNcI9="

/ip address
add address=10.99.0.2/30 interface=wg-hq  network=10.99.0.0
add address=10.99.1.1/30 interface=wg-prn network=10.99.1.0

/interface wireguard peers
add interface=wg-hq name=peer-hq \
    public-key="Hub5xZ2vL+nM8kT3pQdR9aBfYoC1iE6sJyHcXqWnI4=" \
    endpoint-address=hq.kpinge.ee endpoint-port=51820 \
    allowed-address=10.99.0.0/24,10.42.0.0/22,10.42.250.0/24 \
    persistent-keepalive=21s
add interface=wg-prn name=peer-prn \
    public-key="9Tg4xB3w+VnK7sZ2pUcM5aHdOfQ8jL1iE0rYqJyXzG4=" \
    endpoint-address=parnu.kpinge.ee endpoint-port=51821 \
    allowed-address=10.99.1.2/32,10.42.20.0/24 \
    persistent-keepalive=21s comment="half-mesh, SCADA latency"
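
The Pärnu end of the half-mesh tunnel is not reproduced in this report. A minimal mirror-image sketch might look like the following. The interface name wg-trt and both key placeholders are our assumptions for illustration; the port, the /30 and the Tartu LAN prefix are taken from the Tartu stanza above.

```routeros
# Pärnu hAP: second WireGuard interface for the direct Tartu tunnel.
# "wg-trt" is a hypothetical name; the keys are placeholders, not real values.
# The private key must belong to the public key Tartu lists for peer-prn.
/interface wireguard
add name=wg-trt listen-port=51821 mtu=1420 \
    private-key="<PARNU-PRIVATE-KEY>"

/ip address
add address=10.99.1.2/30 interface=wg-trt network=10.99.1.0

/interface wireguard peers
add interface=wg-trt name=peer-tartu \
    public-key="<TARTU-WG-PRN-PUBLIC-KEY>" \
    endpoint-address=tartu.kpinge.ee endpoint-port=51821 \
    allowed-address=10.99.1.1/32,10.42.10.0/24 \
    persistent-keepalive=21s comment="half-mesh, SCADA latency"
```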

Three subtleties worth pointing out, since they cost us a Saturday afternoon between them.

Subtlety one. Each branch’s public hostname (tartu.kpinge.ee, etc.) resolves via Cloudflare with a TTL of 60s. Two of our three ISPs do not give us a static public address, so we run a tiny script on each hAP that updates the A record over Cloudflare’s API when /ip cloud sees the WAN IP change. It runs every 5 minutes; in two months it has triggered nine times.
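
For the curious, a sketch of what such an updater can look like. This is not our production script. The script name cf-ddns, the variable names and the three placeholders (token, zone ID, record ID) are assumptions, and it presumes /ip cloud has ddns-enabled=yes so that public-address is populated. It uses Cloudflare's record-overwrite call (PUT on a dns_records entry) with a bearer token.

```routeros
# Body of a hypothetical script named "cf-ddns".
# cfToken, cfZone and cfRecord are placeholders (assumptions).
:global cfLastIp
:local cfToken  "<API-TOKEN>"
:local cfZone   "<ZONE-ID>"
:local cfRecord "<RECORD-ID>"
:local host     "tartu.kpinge.ee"

# Current WAN address as seen by /ip cloud (empty string if not yet known)
:local wanIp [:tostr [/ip cloud get public-address]]

# Only call the API when the address is known and differs from the last one pushed
:if (([:len $wanIp] > 0) && ($wanIp != $cfLastIp)) do={
    :local body "{\"type\":\"A\",\"name\":\"$host\",\"content\":\"$wanIp\",\"ttl\":60,\"proxied\":false}"
    /tool fetch http-method=put output=none \
        url="https://api.cloudflare.com/client/v4/zones/$cfZone/dns_records/$cfRecord" \
        http-header-field="Authorization: Bearer $cfToken,Content-Type: application/json" \
        http-data=$body
    :set cfLastIp $wanIp
    :log info "cf-ddns: $host now $wanIp"
}

# Scheduler entry, added from the terminal (not part of the script body):
# runs the script every 5 minutes, matching the interval quoted above.
/system scheduler
add name=cf-ddns-run interval=5m on-event="/system script run cf-ddns"
```

Because cfLastIp is a global, it is lost on reboot, so the first run after a restart pushes the record once even if nothing changed. That is harmless.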

Subtlety two. The half-mesh tunnel between Tartu and Pärnu lives on port 51821, not 51820, because the hAP cannot listen on the same port from two different WireGuard interfaces. RouterOS will accept that config in WinBox without an error and then quietly fail. Use a different port and your life improves.

Subtlety three. The allowed-address list on the hub must include each peer’s tunnel /32 and the branch’s LAN /24. If you omit the tunnel /32, OSPF Hellos do not flow even though ping over the tunnel works. This is the kind of asymmetric brokenness that takes forty minutes to diagnose if you do not already know.
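
A short checklist we could have used that Saturday, sketched for the hub and the Tartu peer. The addresses come from the stanzas above; the reasoning that Hellos are sourced from the tunnel address is our understanding of OSPF on a numbered interface, so treat the middle step as a heuristic rather than a guarantee.

```routeros
# 1. Is the tunnel itself alive? Look for a recent last-handshake and moving rx/tx.
/interface wireguard peers print detail where interface=wg-hub

# 2. Ping sourced from the tunnel address, the same source OSPF Hellos would use.
#    A LAN-sourced ping can succeed while this one fails.
/ping 10.99.0.2 src-address=10.99.0.1 count=3

# 3. Did an adjacency actually form?
/routing ospf neighbor print
```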

§ 05 · OSPF for failover

Not for elegance. Not for resume value. For Sunday sleep.

OSPF is unfashionable. The greybeards run BGP between sites because they ran it in 2008 and the muscle memory is honest; the kids run a controller and pretend it is a routing protocol. For four sites, OSPF is the obvious tool: it converges in seconds, it does ECMP if you ask, and the failure case — the LTE backup link rising while the VDSL drop dies — is exactly the case OSPF was designed for.

The config is small enough to print. This is the hub stanza:

/routing ospf instance
add name=ospf-core router-id=10.42.0.1 disabled=no

/routing ospf area
add name=backbone instance=ospf-core area-id=0.0.0.0

/routing ospf interface-template
add area=backbone interfaces=wg-hub      cost=10  \
    auth=md5 auth-id=1 auth-key="K0rgepinge!04May2026" \
    hello-interval=10s dead-interval=40s
add area=backbone interfaces=lte-standby cost=200 \
    auth=md5 auth-id=1 auth-key="K0rgepinge!04May2026" \
    hello-interval=10s dead-interval=40s
add area=backbone networks=10.42.0.0/22 passive

/routing ospf static-neighbor
# none — we use multicast on the tunnels

The branches are symmetrical. Each hAP ac3 announces its branch LAN /24 as a stub network and forms an adjacency with the hub over the WireGuard tunnel. Cost 10 across all primary tunnels, 200 on the LTE standby. The cost gap of 190 is more than enough to ensure the standby route only carries traffic when the primary genuinely dies.
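
As an illustration of that symmetry, a branch stanza for Tartu might look roughly like this. The instance name ospf-branch and the router-id are assumptions, the auth key is a placeholder that has to match the hub's, and the direct Tartu to Pärnu tunnel is left out because this report does not say how it is routed. The LAN is advertised with the passive flag, which as far as we know is how RouterOS 7 announces a network without forming neighbours on it.

```routeros
# Tartu hAP ac3: OSPF sketch mirroring the hub stanza.
# "ospf-branch" and router-id 10.42.10.1 are assumptions.
/routing ospf instance
add name=ospf-branch router-id=10.42.10.1 disabled=no

/routing ospf area
add name=backbone instance=ospf-branch area-id=0.0.0.0

/routing ospf interface-template
# Adjacency with the hub over the primary tunnel, same timers and cost as the hub side
add area=backbone interfaces=wg-hq cost=10 \
    auth=md5 auth-id=1 auth-key="<SAME-KEY-AS-HUB>" \
    hello-interval=10s dead-interval=40s
# Branch LAN: advertised, but no Hellos sent or accepted on it
add area=backbone networks=10.42.10.0/24 passive
```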

The numeric story, from /log print where topics~"ospf" in the two months since cutover:

Table 2 — OSPF events, 04 May — 09 Jun 2026
Date (EET)	Site	Event	Convergence	Cause
08 May 14:22	Pärnu	Adjacency down → up	11.4 s	ISP flap, Telia
14 May 03:11	HQ	VDSL down, LTE took over	23.8 s	VDSL maintenance window
14 May 04:47	HQ	VDSL up, LTE released	8.9 s	Maintenance ended
22 May 19:03	Païde	Adjacency down → up	14.7 s	Infonet brownout
29 May 11:30	Tartu	Tartu↔Pärnu peer cycled	3.2 s	Scheduled MTU test
02 Jun 02:14	HQ	VDSL flap, no LTE handover	1.6 s	BFD-like quick recovery
07 Jun 16:48	Pärnu	Adjacency down → up	9.5 s	Power blip, UPS rode

The takeaway, if you do not want to read the table, is that for two months we have had a network that has fixed itself seven times without anyone calling Mihkel. The number we care about is the 23.8 s on 14 May: a sleep-through-able cutover from VDSL to LTE in the middle of the night, with no human in the loop.

The first time the LTE standby actually carried production traffic, at 03:11 EET on 14 May, I was asleep in Põhja-Tallinn and learned about it from the morning’s syslog. That was the moment I stopped worrying about whether we’d made the right platform choice.

— H. Vasiljeva, field log, 14 May 2026, 08:02 EET

§ 06 · The firewall ruleset

A ruleset that finally held — and the fasttrack lie.

The default RouterOS firewall, the one the Quick Set wizard gives you, is fine for a home network and is dangerously close to adequate for a small office. It is not enough for ours. The reason is that we have a SCADA segment that must talk only to specific hosts on specific ports, and an internal jump-host pattern that wants tight inbound discipline. We rewrote the filter chain.

/ip firewall filter
# INPUT chain — what is allowed to talk to the router itself
add chain=input action=accept connection-state=established,related comment="est/rel"
add chain=input action=drop  connection-state=invalid                comment="invalid"
add chain=input action=accept protocol=icmp                            comment="icmp ok"
add chain=input action=accept in-interface-list=LAN                    comment="lan mgmt"
add chain=input action=accept in-interface=wg-hub src-address-list=mgmt-jump comment="wg mgmt"
add chain=input action=accept protocol=udp dst-port=51820              comment="wg ingress"
add chain=input action=drop log=yes log-prefix="in-drop"           comment="deny rest"

# FORWARD chain — traffic crossing the router
add chain=forward action=accept connection-state=established,related
add chain=forward action=drop  connection-state=invalid
add chain=forward action=drop  src-address-list=bogon in-interface-list=WAN log=yes log-prefix="bogon"
add chain=forward action=accept in-interface-list=LAN out-interface-list=WAN comment="lan to wan"
add chain=forward action=accept in-interface-list=LAN out-interface=wg-hub  comment="lan to branches"
add chain=forward action=accept in-interface=wg-hub out-interface-list=LAN  comment="branches to lan"

# SCADA segment: only listed hosts may reach VLAN 250 on the listed ports
add chain=forward action=accept src-address-list=scada-trusted \
        dst-address=10.42.250.0/24 protocol=tcp dst-port=1433,4840,502 \
        comment="scada in"
add chain=forward action=drop  dst-address=10.42.250.0/24 log=yes log-prefix="scada-deny" \
        comment="scada deny rest"

# Everything else crossing the router: log at roughly 1/s, then drop
add chain=forward action=log  limit=1,5:packet log-prefix="fwd-drop"
add chain=forward action=drop comment="deny rest"

# RAW chain — drops before conntrack, saves CPU on a hammered hub
/ip firewall raw
add chain=prerouting action=drop src-address-list=bogon in-interface-list=WAN
add chain=prerouting action=drop dst-address-type=broadcast in-interface-list=WAN

Read that carefully and you will note we match on interface lists (in-interface-list=LAN, out-interface-list=WAN) rather than naming interfaces directly. This is one of RouterOS’s under-loved features: an interface list is a named bag of interfaces you reference once in your filter, and then membership can change with the topology without rewriting the rules. We have LAN containing bridge-lan, vlan10-staff, vlan20-guest, and WAN containing ether1, lte-standby.
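
The list definitions themselves are short. A sketch using the member names given above; the comments on the lists are ours, and on a router that still carries the default configuration the LAN and WAN lists may already exist, in which case only the member lines are needed.

```routeros
# Declare the two lists once
/interface list
add name=LAN comment="inside interfaces"
add name=WAN comment="upstream links"

# Membership as described above; when the topology changes, edit these lines,
# not the filter rules
/interface list member
add list=LAN interface=bridge-lan
add list=LAN interface=vlan10-staff
add list=LAN interface=vlan20-guest
add list=WAN interface=ether1
add list=WAN interface=lte-standby
```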

About the fasttrack lie: every RouterOS tutorial you read in 2020–2022 told you to enable fasttrack-connection in your filter chain to get line-rate forwarding. Fasttrack works by skipping the rest of the firewall and most of the IP stack — and, critically, it skips queues, mangle, and any per-connection accounting. On a small office hub where you actually want to mark SCADA traffic via mangle, or rate-limit guest Wi-Fi via a simple queue, fasttrack will undo all of it silently. We disabled fasttrack on the hub and accepted the CPU cost (a four-percent increase, fine on the RB4011) in exchange for having queues and marks that actually take effect. Leave fasttrack on if you only need NAT and throughput. Turn it off the moment you write your first mangle rule.
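
If you inherited a config and are not sure whether fasttrack is in play, something like the following finds and disables it. This is a sketch of the general approach, not a transcript of what we typed; connections that were already fasttracked may keep their fast path until they expire.

```routeros
# Find any fasttrack rule left behind by a default or tutorial config
/ip firewall filter print where action=fasttrack-connection

# Disable rather than delete, so the change is easy to reverse
/ip firewall filter disable [find where action=fasttrack-connection]
```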

§ 07 · Quirks, regrets, and the FAQ

What we’d redo, what bit us first, and the cards we carry in our wallet.

The first quirk is the one that bites every first-timer: interface lists are how RouterOS expects you to organize your firewall, and the wizard does not set them up for you in a way that survives a topology change. Specifically, when you add a new VLAN, you must remember to add it to the LAN interface list, or your filter rule that says “allow LAN to WAN” will silently fail to allow the new VLAN. We lost twenty minutes to this on 12 May.

The second quirk is queue-tree vs simple queues. Simple queues are easier; queue trees are correct. We used a queue tree for the guest VLAN throttle, with parent queues per VLAN and child queues per host. The simple queue we tried first treated the entire VLAN as a single bucket and the marketing manager’s 4K upload cratered everyone else for fifteen seconds.

The third is RouterOS 7’s default Wi-Fi package. On the hAP ac3, the legacy wireless package gives you everything you remember from RouterOS 6 (CapsMAN, virtual APs, the lot). The newer wifi package is the future but doesn’t have full CapsMAN parity yet. We standardized on the legacy package across all four sites for now. We’ll migrate when CapsMAN parity ships in 7.18 or later.

Pocket card 01/03

Adding a new VLAN

Create the bridge VLAN, add the bridge port with the tag, set the address. Then add the new bridge VLAN interface to the LAN interface list. Forgetting this is the #1 ten-minute mystery.

Bite count: 2 / month
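
The card above, written out as a sketch. Everything specific here is hypothetical: VLAN 30, the name vlan30-lab, the trunk port in the tagged list and the 10.42.3.0/24 prefix are placeholders chosen only to fit the HQ /22.

```routeros
# Hypothetical VLAN 30 "lab"; IDs, names, trunk port and prefix are placeholders
/interface bridge vlan
add bridge=bridge-lan vlan-ids=30 tagged=bridge-lan,sfp-sfpplus1

/interface vlan
add name=vlan30-lab interface=bridge-lan vlan-id=30

/ip address
add address=10.42.3.1/24 interface=vlan30-lab

# The step that gets forgotten: without it, "lan to wan" never matches vlan30-lab
/interface list member
add list=LAN interface=vlan30-lab
```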

Pocket card 02/03

Recovering a hung WG peer

If the WG handshake never completes, check /ip firewall connection for stuck UDP entries on 51820 and remove only those, using /ip firewall connection remove with a [find where ...] filter on that port; a bare [find] would flush every connection. The kernel sometimes pins them.

Used: 3 × in 2 months
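
One way to write that filter, as a sketch. It assumes the stuck entries show the WireGuard port in dst-address; for the half-mesh tunnel, swap in 51821.

```routeros
# Look first: UDP conntrack entries towards the WireGuard port
/ip firewall connection print where protocol=udp and dst-address~":51820"

# Remove only those entries, not the whole table
/ip firewall connection remove [find where protocol=udp and dst-address~":51820"]
```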

Pocket card 03/03

Restoring from a backup file

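Hold reset while applying power and release after about 5 s, once the LED starts flashing; this resets the configuration to defaults (Netinstall mode needs a much longer hold and is not required here). Connect via WinBox to the MAC address, upload the .backup, and restore it. A factory RB4011 takes 4 minutes to restore. Time the cold spare drill quarterly.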

Spare drill: last 06 Jun
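
For those who prefer the terminal to WinBox, the restore step can be sketched as below. The file names and the password are placeholders. A binary .backup also carries the original unit's MAC addresses, which is fine for an identical spare replacing a dead twin but worth knowing.

```routeros
# After uploading the file (WinBox Files pane or scp); names are placeholders
/file print where name~"backup"

# Binary restore: the router reboots into the saved configuration
/system backup load name=hq-edge.backup password="<BACKUP-PASSWORD>"

# Alternative for a text export on a blank config:
# /import file-name=hq-edge.rsc
```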

Frequently asked, frankly answered

Why not OPNsense on bare metal, or pfSense, or VyOS?

All three are good. None of them ship a €119 box that bolts to a wall in Pärnu and runs for five years without a fan, which is what we needed at the branches. For the head office a small Protectli box on OPNsense would have been entirely viable; we chose MikroTik for fleet consistency: one OS, one config language across all four sites.

Do you really run RouterOS 7 in production? It has a reputation.

Yes, on 7.16.1 long-term. The reputation was earned in the 7.1–7.7 era when the WireGuard implementation had MTU bugs and CapsMAN was in transition. By 7.13 the dust had settled; on 7.16.1 we have now run two months uninterrupted across four sites. The only feature gap we feel is in the new wifi package, and we sidestep it by using the legacy wireless package.

What about IPv6?

Enabled, advertised on the LAN, prefix-delegated from Telia and Tele2. Païde’s Infonet drop is v4-only, which is why Païde gets a NAT64 fallback running on the hub via Jool, on a small dedicated Linux VM. We do not love this. The plan is to either switch Païde’s ISP or wait until Infonet ships v6, whichever happens first.

How do you back up the configs, and where do the keys live?

Every site exports both an .rsc (text) and a .backup (binary, with secrets) nightly at 02:30 EET via a scheduled script. The text export is committed to a private Gitea repo in the HQ rack with one-week retention; the binary backup ships to a Hetzner Storage Box in Helsinki, encrypted with age, key held by myself and Mihkel on YubiKeys.
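
The on-router half of that job is small. A sketch, assuming a script named nightly-backup and a placeholder password; the upload to Gitea and the Storage Box, and the age encryption, happen off the router and are not shown.

```routeros
# Body of a hypothetical script named "nightly-backup".
# Files are named after the router identity and overwritten each night;
# history lives in the off-box copies.
:local base [/system identity get name]

# Text export; RouterOS 7 leaves secrets out unless show-sensitive is given
/export file=$base

# Binary backup, encrypted with a placeholder password
/system backup save name=$base password="<BACKUP-PASSWORD>"

# Scheduler entry, added from the terminal (not part of the script body)
/system scheduler
add name=nightly-backup-run start-time=02:30:00 interval=1d \
    on-event="/system script run nightly-backup"
```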

What’s the worst thing that’s happened in two months?

A power blip in Pärnu on 07 June at 16:48 took the hAP down for thirty-eight seconds. The UPS rode through but the desktop switch downstream didn’t. Three people complained on Slack within ninety seconds. The network itself reconverged in 9.5 s once power came back. The complaints lasted longer than the outage; we are working on a Pärnu UPS upgrade.

Would you do anything differently?

Two things. First, we’d buy the CCR2004 a year later — the RB4011 alone is fine until the 10G fibre lands. Second, we’d skip the integrated radio in the hAP at the larger branches and budget a cAP ax from day one. Otherwise: the same.

How is this report different from a vendor case study?

Nobody paid us. MikroTik does not know we exist. We bought the gear from a third-party distributor, the prices in Table 1 are the actual line items on our invoice, and the OSPF event table in Table 2 is real syslog. The screenshots that would normally go in a vendor case study are deliberately not in this document because no number of screenshots is worth one honest paragraph about fasttrack.

Cutover timeline, condensed

01 · Pre-stage

14 March — 02 May: hardware acceptance & bench-build

RB4011 racked, CCR2004 racked, both flashed to 7.16.1. Branch hAPs configured at HQ desk, labeled with site names in Brother P-touch tape (TZe-231), shipped to branches by GLS.

SCADA segment switch (CRS310) staged and tested with a borrowed PLC on the bench. Confirmed line-rate forwarding to VLAN 250.

Observations: Lead time on the CCR2004 was longer than promised, eleven days against the quoted seven. We dry-ran the cutover on the bench three times.

02 · Cutover

04 May, Sunday, 06:00–07:42 EET

Old Cisco SG350 powered off at 06:04. RB4011 brought up on Telia VDSL at 06:11. Internal VLANs migrated at 06:24. First WireGuard peer (Tartu) handshook at 06:38. Pärnu and Païde followed at 06:44 and 06:49. OSPF adjacencies converged at 06:54. Validation script completed at 07:39. Coffee at 07:42.

Observations: One missed step: forgot to migrate the printer’s DHCP reservation. Caught at 07:15, fixed in three minutes. Branch managers had a Sunday morning briefing scheduled at 09:00; we had ninety minutes of headroom and didn’t need any of it.

03 · Burn-in

05 May — 18 May: shake-out

Three minor incidents: the VLAN-list miss on 12 May (twenty minutes), a Cloudflare DDNS hiccup on 14 May (auto-recovered), and a misconfigured queue tree on 16 May that throttled the wrong VLAN for nine minutes.

Observations: Two-week burn-in is the right window. The third week is when nothing new breaks and you start to trust the system.

04 · Steady-state

19 May — ongoing

Backups run nightly, OSPF self-heals as documented, the cold spare lives in the drawer. Quarterly cold-spare drill scheduled for 04 August. The CCR2004 is currently sitting at 3% CPU; the RB4011 averages 11% with spikes to 41% during the morning sync from the substation simulator in Pärnu.

Observations: The single best decision we made was disabling fasttrack on the hub. The single best decision we made twice was buying the cold spare.