RIPE Atlas probe has been broken since yesterday

It seems that probes hosted through RIPE Atlas have been broken since yesterday morning: the probe keeps trying to connect but never reaches the controllers, regardless of how long I wait (e.g. Probe 1011647).

Also, the probe is connected to a stable network with no apparent interruptions, and the issue persists despite restarting the device and verifying the internet connection. The probe appears to be online/connected at RIPE, but it does not fulfil any measurements at all, which makes this particularly hard to notice (e.g. Probe 65977).

Is RIPE aware of this issue? Is it possible that the controllers are down?

Thanks.

Also affected: the hardware probes below appear to be online/connected at RIPE but do not fulfil any measurements at all, and sometimes they show up as disconnected:

  1. Probe 65994
  2. Probe 65682
  3. Probe 62731 (also tagged as system: No controller connection and system: Trying to connect).
  4. Probe 61167 (also tagged as system: No controller connection).
  5. Probe 55787
  6. Probe 52790
  7. Probe 35663
  8. Probe 22416
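For anyone who wants to check a batch of probes like the list above without clicking through the portal, here is a minimal sketch. It assumes the public RIPE Atlas v2 API serves a JSON document per probe at the URL shown, with a `status` object (`name`, `since`) and a `tags` list whose system tags have slugs starting with `system-`. Those field names and the `summarize_probe`, `fetch_probe` and `check_probes` helpers are my assumptions, so verify them against the API documentation before relying on the output.

```python
import json
import urllib.request

# Assumed endpoint shape for the public RIPE Atlas v2 API; verify before use.
API_URL = "https://atlas.ripe.net/api/v2/probes/{probe_id}/"


def summarize_probe(data: dict) -> str:
    """Turn one probe JSON document into a one-line summary.

    Assumes "id", "status" (with "name" and "since") and "tags"
    (a list of objects with a "slug"); missing fields are tolerated.
    """
    status = data.get("status") or {}
    slugs = [t.get("slug", "") for t in data.get("tags") or []]
    # Keep only system tags (assumed to use a "system-" slug prefix).
    system_tags = [s for s in slugs if s.startswith("system-")]
    return "probe {}: {} since {} [{}]".format(
        data.get("id", "?"),
        status.get("name", "unknown"),
        status.get("since", "unknown"),
        ", ".join(system_tags),
    )


def fetch_probe(probe_id: int, timeout: float = 10.0) -> dict:
    """Fetch one probe document over HTTPS (needs network access)."""
    url = API_URL.format(probe_id=probe_id)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def check_probes(probe_ids) -> None:
    """Print one summary line per probe, continuing past lookup failures."""
    for pid in probe_ids:
        try:
            print(summarize_probe(fetch_probe(pid)))
        except OSError as exc:  # URLError and HTTPError derive from OSError
            print("probe {}: lookup failed ({})".format(pid, exc))
```

Calling `check_probes([65994, 65682, 62731, 61167, 55787, 52790, 35663, 22416])` would print one line per probe. Note that a probe can be listed as connected and still run no measurements, so this only confirms what the portal already shows.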

Is something off on the backend, or is this an infrastructure issue?

I run v4 RIPE probe 52681, which has been disconnecting and reconnecting since yesterday and mostly stays disconnected. Looking at tcpdumps on the firewall, it is actively pinging services and making connections to tcp/443. Nothing changed in my network and the probe can access the internet.

Would not assume atm that the probe is dead.

Same issue here with probe #51078
Event Log is listing a ton of SOS messages as well

It’s yet another issue with the controller [ctr-dub-hw01/2 btw]. Probe connects (it’s SSH on port 443, btw), exchanges whatever and the ctr closes the connection. I don’t know what that ~2K of traffic is, but the behavior is consistent with an authentication failure.
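If you want to confirm the "SSH on port 443" part from your own network, a rough sketch along these lines should do. An SSH server normally sends its identification string (starting with `SSH-`) right after the TCP handshake, whereas a TLS server waits for the client to speak first. The `read_banner` and `looks_like_ssh` helpers are just illustrative names, and you would have to supply the controller's full hostname yourself, since only the short name appears in this thread.

```python
import socket


def looks_like_ssh(banner: bytes) -> bool:
    """True if the first bytes from the server look like an SSH identification string."""
    return banner.startswith(b"SSH-")


def read_banner(host: str, port: int = 443, timeout: float = 5.0) -> bytes:
    """Open a TCP connection and return whatever the server sends first.

    Returns b"" if the server stays silent until the timeout, which is
    what a TLS endpoint would typically do.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        try:
            return sock.recv(256)
        except socket.timeout:
            return b""
```

For example, `looks_like_ssh(read_banner(CONTROLLER_HOST))` with `CONTROLLER_HOST` set to the controller your probe talks to. This only shows which protocol answers on the port; it says nothing about why the controller closes the session afterwards.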

(Also, FWIW, the SOS uptime is also totally broken. Probe reports “U69”, portal reports “19:01:09”. I know it’s only diagnostic data, but that lack of detail reflects poorly on the entire Atlas network.)
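To make the mismatch concrete, here is a small sketch of the arithmetic. It assumes the `U<n>` field in the SOS message is the uptime in whole seconds, which is my reading and not something confirmed in this thread. The function names are made up for illustration.

```python
def sos_uptime_seconds(field: str) -> int:
    """Parse a 'U<n>' field, assuming n is the uptime in whole seconds."""
    if not field.startswith("U") or not field[1:].isdigit():
        raise ValueError("unexpected uptime field: {!r}".format(field))
    return int(field[1:])


def as_hms(seconds: int) -> str:
    """Format a duration in seconds as HH:MM:SS."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return "{:02d}:{:02d}:{:02d}".format(hours, minutes, secs)


print(as_hms(sos_uptime_seconds("U69")))  # prints 00:01:09
```

Under that assumption the probe is saying 00:01:09, so the portal's minutes and seconds agree with it and only the hour field is off.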

Update: And now the portal says the flash is corrupt. (hint: it’s not… it’s actively running measurements!)

Yeah, it broke for me at 01:30 AM this morning, local time Australia/Melbourne (AEST-10).

I put a port mirror on the switch, and I can see the probe getting its DHCPv4 lease and IPv6 address, and I can see its ICMP and ICMPv6 pings egressing fine.

It’s been disconnected now for 16 hours.

It’s trying to register to reg01 ctr-dub-hw01 and reg02 ctr-dub-hw01.


Other SW probes I have are fine.

Same for me.

Dual homed probe 52560.

Is RIPE aware of this issue? Is it possible that the controllers are down?

I logged a case via email. Case 892256.

Thank you for reporting this - we are aware and tracking the issue on status.ripe.net
