It was DNS, but how?
-
A local service lookup like the one in your screenshot should be happening directly on the local DNS server; it shouldn't be going out to any upstream DNS server.
Since the records have a TTL of 5 minutes, wouldn't `dnsmasq` have to reach out to upstream DNS servers every 5 minutes?
-
I've recently started having issues with DNS lookups failing to resolve, as you can see from the screenshot of Uptime Kuma, which shows all monitored services going down every 10 minutes.
::: spoiler Adventure with ISP / rant
After verifying the problem wasn't on my end, I eventually called my ISP. However, they didn't really understand the issue. After speaking to 9 other people and being assured my DNS queries for some domains were failing because my Wi-Fi signal was bad (I was using a wired connection), I eventually reached someone from the technical department. They told me that questions regarding DNS were too technically complex (???) and told me to check their forums instead. There I found someone who knew what was going on: apparently my ISP had recently started enforcing DNS rebind protection without the ability to add exceptions. There was no option to disable or work around it with the ISP's modem, and the only option was to use your own router or configure a hosts file on each device.
:::
I have flashed an old access point with OpenWRT and have started using that. It got me access to my self-hosted services again (without manually setting them in the hosts file, that is), but lookups still fail occasionally and seemingly at random. What I've done so far:
- Migrated all devices to the OpenWRT network
- Changed the public DNS record to use a CNAME record which points to my router's local device alias
- Exempted DNS rebind protection for my domains in the OpenWRT settings
Since it seems to happen at an odd interval and the records have a 5m TTL, I suspect the records are expiring from the local cache and the refresh is then answered by a different upstream DNS server in the pool, but that's just speculation. Does anyone know what could cause this issue?
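For context on the rebind protection mentioned in the post: a resolver with that protection typically discards upstream answers that point into local address space, which is exactly what a public record for a LAN device looks like. Here is a minimal Python sketch of that check, assuming the filter simply tests for private, loopback, or link-local addresses. Real implementations may differ in detail, and `looks_like_rebind` is a hypothetical name used only for illustration.

```python
import ipaddress

def looks_like_rebind(answer: str) -> bool:
    """Return True if an upstream answer points into local address space."""
    addr = ipaddress.ip_address(answer)
    return addr.is_private or addr.is_loopback or addr.is_link_local

# Hypothetical answers: one LAN address, one public address.
for answer in ("192.168.1.10", "1.1.1.1"):
    verdict = "dropped" if looks_like_rebind(answer) else "passed"
    print(f"{answer}: {verdict}")
```

If the record for a self-hosted service resolves to an address the first branch catches, any rebind-protecting resolver in the path without an exemption for that domain can make the lookup fail.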
They always point at the network, but we know:
-
Have you tried tracing the issue? What is uptimekuma using for DNS? What do the logs on that server show?
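One rough way to trace it is to resolve the name on a fixed schedule from the monitoring host and log every failure with a timestamp, so the failure interval can be compared against the TTL. This is a sketch using only the system resolver via `socket.getaddrinfo`; the `probe` helper and its parameters are made up for illustration and are not part of Uptime Kuma.

```python
import socket
import time
from datetime import datetime

def probe(hostname, attempts=5, interval=1.0,
          resolve=socket.getaddrinfo, sleep=time.sleep):
    """Resolve hostname repeatedly; return a list of (timestamp, ok, detail)."""
    results = []
    for attempt in range(attempts):
        stamp = datetime.now().isoformat(timespec="seconds")
        try:
            infos = resolve(hostname, None)
            # Each entry is (family, type, proto, canonname, sockaddr).
            addresses = sorted({info[4][0] for info in infos})
            results.append((stamp, True, ", ".join(addresses)))
        except socket.gaierror as exc:
            results.append((stamp, False, str(exc)))
        if attempt < attempts - 1:
            sleep(interval)
    return results

# Example (hostname is a placeholder):
#   for stamp, ok, detail in probe("uptime.example.org", attempts=60, interval=30):
#       print(stamp, "OK  " if ok else "FAIL", detail)
```

Because this goes through the same system resolver as most applications, it sees the failures they see, but it does not tell you which DNS server produced them.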
-
Uptime Kuma seems to use `nscd` for caching internally, together with the default system DNS resolver.
I've added custom DNS resolvers to Uptime Kuma, and apparently it can get the records from Cloudflare (1.1.1.1) but not from the OpenWRT router (192.168.1.1).
I've enabled a proxy on the router to force the use of DoH; maybe that will help if the ISP's modem is at fault.
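To compare the two resolvers without `nscd` or the system resolver in between, the same question can be sent to each server directly. Below is a bare-bones sketch that builds a single A-record query by hand and reads only the response header. It assumes plain UDP on port 53 with no retries or TCP fallback, and the function names are invented for this example.

```python
import socket
import struct
from typing import Tuple

def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: id, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN

def parse_header(response: bytes) -> Tuple[int, int]:
    """Return (rcode, answer_count) from a DNS response header."""
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    return flags & 0x000F, ancount

def ask(server: str, name: str, timeout: float = 2.0) -> Tuple[int, int]:
    """Send one UDP query to server:53 and return (rcode, answer_count)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        response, _ = sock.recvfrom(512)
    return parse_header(response)

# Example (needs network access; the domain is a placeholder):
#   for server in ("1.1.1.1", "192.168.1.1"):
#       print(server, ask(server, "example.org"))
```

An rcode of 0 means no error and 3 means the name does not exist. A non-zero rcode, a timeout, or a zero answer count from one server but not the other narrows down where the lookup breaks.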
-
I have my router (OPNsense) redirect all DNS requests to Pi-hole/AdGuard Home. AdGuard Home is easier for this since you can have it redirect a wildcard like *.local.domain, while Pi-hole wants every single name individually (uptime.local.domain, dockage.local.domain). With that combo of the router not letting DNS out to upstream servers and my local DNS servers set up to redirect *.local.domain to the correct location(s), my DNS requests inside my local network never get out to where an upstream DNS can tell you to kick rocks.
I combined the above with a paid domain (hella cheap for 10 years), got a wildcard certificate for the domain without exposure to the WAN (no IP recorded, but accepted by devices), and have all *.local.domain requests redirect to a single Caddy instance that does the final redirecting to specific services.
I’m not fully sure what you’ve got cooking but I hope typing out what works for me can help you figure it out on your end! Basically the router doesn’t let anything DNS get by to be fucked with by the ISP.
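The wildcard idea described above can be sketched in a few lines of Python. This is only a model of the matching logic, not how AdGuard Home or Pi-hole implement it. The address `192.168.1.10` is a made-up stand-in for the reverse proxy, and this sketch assumes the wildcard does not match the bare domain itself.

```python
from typing import Dict, Optional

def resolve_local(hostname: str, rewrites: Dict[str, str]) -> Optional[str]:
    """Return the rewrite target for hostname, or None to fall through upstream."""
    name = hostname.rstrip(".").lower()
    for pattern, target in rewrites.items():
        if pattern.startswith("*."):
            # "*.local.domain" matches "x.local.domain" but not "local.domain".
            if name.endswith(pattern[1:]):
                return target
        elif name == pattern:
            return target
    return None

# Hypothetical rule: every subdomain goes to one reverse proxy on the LAN.
REWRITES = {"*.local.domain": "192.168.1.10"}

for host in ("uptime.local.domain", "dockage.local.domain", "example.org"):
    print(host, "->", resolve_local(host, REWRITES))
```

The point of the setup is the `None` branch: only names that miss every local rule ever leave the network, so nothing upstream gets a chance to filter the answers for local services.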
-
Since the records have a TTL of 5 minutes, wouldn't `dnsmasq` have to reach out to upstream DNS servers every 5 minutes?
Only for records on the public internet. Local DNS records are resolved locally. Unless you're not using local DNS records or something?
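To put numbers on the TTL question, here is a toy model of a caching resolver in Python. It is not dnsmasq's actual implementation. The 300-second TTL comes from the post, while the one-lookup-per-minute schedule is an assumption made for the example.

```python
class TtlCache:
    """Toy model of a caching resolver for a single public record."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.expires_at = None      # no cached answer yet
        self.upstream_queries = 0

    def lookup(self, now):
        """Return where the answer came from at time `now` (in seconds)."""
        if self.expires_at is None or now >= self.expires_at:
            self.upstream_queries += 1      # cache miss: ask upstream again
            self.expires_at = now + self.ttl
            return "upstream"
        return "cache"

cache = TtlCache(ttl_seconds=300)
# One lookup per minute for 30 minutes.
sources = [cache.lookup(t) for t in range(0, 1800, 60)]
print(cache.upstream_queries, "upstream queries out of", len(sources), "lookups")
```

In this model a public record is re-fetched once per TTL window, and each of those refreshes is a chance for an upstream server to return a bad or filtered answer. A record defined on the local DNS server never goes through that path.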
-
I'm using a public DNS record that points to a local device.
*.example.org → example.org
example.org → device_name.lan
-
Thanks for the advice. I also use a cheap domain with a wildcard, but use nginx instead.
I just tried using AdGuard and although it's fascinating to see the insights into all the DNS requests, it didn't really help fix the issue.
However, since switching to DoH with Cloudflare and pointing the record at the specific IP instead of my local device name, I have 100% uptime (for the last 10 minutes, that is).
-
Gotcha, try setting up local records on local DNS instead to see if that solves it.
-