Probe suddenly claiming it was offline for almost a week

wteiken · 26 January 2025 22:33

Probe 61817 seemed to work fine, but suddenly I see it being offline for most of the last week (shows online percent as 9.87%, when yesterday it showed ~99.9%).

The status page now shows it mostly offline since 2025-01-20 13:49:30.

The probe also shows offline despite sending pings etc.

Is something off on the backend side that caused some data loss?

wteiken · 26 January 2025 22:51

Seems now it’s showing the probe as back online, but it claims to have been offline even longer (i.e., a segment showing earlier as connected disappeared).

robertk · 19 February 2025 12:48

This is a bit of a blind spot for us for now: in some (hopefully ever rarer) cases the infrastructure has a hard problem and therefore cannot report disconnect events to us. Once probes reconnect, this is reported, but we can only make educated guesses about when the previous connection period ended. This leaves a gap in the connection logs. We have some ideas on how to improve on this, but at this time I apologise because I cannot offer a foolproof solution.

dot · 20 February 2025 13:47

I had a similar situation. In January 2025, probe 62520 suddenly RETROACTIVELY indicated having been offline for a week at the end of December 2024. This is the first time I saw something like this.

Later that month, it showed being offline for 2 days on Jan 27, 2025 (in real time, not retroactively this time), but this is a false positive. Probe 61107 (same AS2027, same net) was OK, so no network issue there. Probes frequently appear disconnected for a couple of minutes (up to 20-30 min), but such false positives lasting several days are really weird. Also, this f***s up the stats.

Do you have a clue what could have happened?

tschaefer · 14 March 2025 10:21

The same here

The probe suddenly reported a gap of 6 days without prior notice. But traffic and the built-in charts are OK for the last week.

tschaefer · 14 March 2025 21:16

My probe 1007887 has also had the problem: not one week, but more than a day instead of a few hours (caused by the provider).
“Uptime History” should be renamed to “estimated Uptime History”. It is a little bit sad to care about good uptime when the log doesn’t reflect it.

zallix · 15 March 2025 10:47

Mine just did this too, down for about 10 minutes but it now says it was down for 22 days. That’s pretty bad.

ph_ybarro · 15 March 2025 11:53

Same with probe ID 1009707. The probe was down for about 1 hour, but now it says it was down for 11 days (1 week 4 days). Is the problem caused by the server-side infrastructure, or was there a problem with the RIPE Atlas controller?

tschaefer · 15 March 2025 19:35

Today I installed some Debian updates on all my SW probes, including a reboot. Instead of a disconnect time of a few minutes, I lost hours and days again.

ph_ybarro · 15 March 2025 22:43

Later that day, all probes were affected.

Probes frequently appear disconnected for a couple of minutes (up to 15-30 min; sometimes up to several hours), and then lose several days or weeks of uptime.

This is my probe:

Probe ID 1009313 - down for over 5 minutes, but now it says it was down for 5 days.

Sister probes:

Probe ID 1010000 - down for several minutes, but now it says it was down for 14 days (2 weeks).
Probe ID 64654 - down for several seconds, but now it says it was down for 4 days.

Other probes/anchors (not mine) are also affected:

Probe ID 7470 - down for several seconds, but now it says it was down for 9 days (1 week 2 days).
Probe ID 21165 - down for almost 10 minutes, but now it says it was down for 5 days.

camin · 19 March 2025 09:47

Hi all,

Update on this: we can see that this is an issue and we’re actively working on a solution.

camin · 19 March 2025 13:50

I have applied a fix to probes 1009313, 1010000, 64654 to fill in the connection history. Can you check to see if it’s now how you expect?

ph_ybarro · 20 March 2025 01:46

Thanks for the explanation.

After a few days downtime and no further intervention on my part the probe came back up.

For this probe (1009313) this is the connection log for the past few weeks:

Internet Address	Controller	Connected (UTC)	Connected for	Disconnected (UTC)	Disconnected for
112.202.96.244	ctr-dub-sw01	2025-03-19 04:45:22	20h 30m	Still Connected
112.202.112.38	ctr-dub-sw01	2025-03-18 20:06:04	8h 39m	2025-03-19 04:45:10	0h 0m
112.202.112.38	ctr-dub-sw01	2025-03-18 03:24:26	16h 41m	2025-03-18 20:05:52	0h 0m
112.202.112.38	ctr-dub-sw01	2025-03-17 22:56:55	4h 27m	2025-03-18 03:24:17	0h 0m
112.202.112.38	ctr-dub-sw01	2025-03-16 19:03:47	1d 3h 52m	2025-03-17 22:56:44	0h 0m
112.202.112.38	ctr-dub-sw01	2025-03-15 16:08:48	1d 2h 54m	2025-03-16 19:03:42	0h 0m
112.202.112.38	ctr-dub-sw01	2025-03-09 19:04:00	5d 19h 33m	2025-03-15 14:37:51	1h 30m

The remaining probes (e.g. 1010001) are still not fixed:

Internet Address	Controller	Connected (UTC)	Connected for	Disconnected (UTC)	Disconnected for
112.202.102.169	ctr-dub-sw02	2025-03-16 03:21:21	3d 22h 22m	Still Connected
112.202.105.166	ctr-dub-sw02	2025-02-22 03:06:56	6d 15h 55m	2025-02-28 19:02:03	15d 8h 19m

So hopefully this can get fixed for all the probes.

Thank you to whoever did the behind-the-scenes work!
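
For anyone who wants to cross-check the "Disconnected for" column in logs like the ones above, here is a minimal Python sketch. It only subtracts each session's "Disconnected (UTC)" time from the next session's "Connected (UTC)" time. The function names `gaps` and `fmt_duration` are made up for this illustration, and the timestamps are copied from the two tables in this post. It is not how the RIPE Atlas backend computes or estimates these values.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

def gaps(sessions):
    """Return the offline time between consecutive sessions.

    sessions: list of (connected, disconnected) timestamp strings,
    oldest first; disconnected is None while still connected.
    """
    result = []
    for (_, disconnected), (connected, _) in zip(sessions, sessions[1:]):
        if disconnected is None:
            continue  # no recorded end for this session, nothing to compare
        start = datetime.strptime(disconnected, FMT)
        end = datetime.strptime(connected, FMT)
        result.append(end - start)
    return result

def fmt_duration(delta):
    """Format a timedelta like the log does, e.g. '15d 8h 19m' or '1h 30m'."""
    minutes = int(delta.total_seconds()) // 60  # seconds are dropped
    days, rem = divmod(minutes, 24 * 60)
    hours, mins = divmod(rem, 60)
    return f"{days}d {hours}h {mins}m" if days else f"{hours}h {mins}m"

# Rows copied from the log of probe 1010001 (oldest first).
probe_1010001 = [
    ("2025-02-22 03:06:56", "2025-02-28 19:02:03"),
    ("2025-03-16 03:21:21", None),
]

# Two oldest rows from the log of probe 1009313 after the fix.
probe_1009313 = [
    ("2025-03-09 19:04:00", "2025-03-15 14:37:51"),
    ("2025-03-15 16:08:48", "2025-03-16 19:03:42"),
]

for gap in gaps(probe_1010001):
    print(fmt_duration(gap))  # prints "15d 8h 19m"
for gap in gaps(probe_1009313):
    print(fmt_duration(gap))  # prints "1h 30m"
```

Both results match the "Disconnected for" values shown in the logs, so that column appears to be derived from the recorded disconnect time. A wrongly recorded disconnect time would then be enough to inflate the reported downtime.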

camin · 20 March 2025 09:07

I have applied the fix for #1010001. Over the course of today a similar fix will be applied to all remaining probes.

camin · 21 March 2025 09:14

This issue should now be resolved for all probes. We are putting some monitoring in place to ensure that it doesn’t reoccur. Apologies for the inconvenience.
