EGroupware Cloud Status

Malfunction reports

EGroupware Cloud, Mail, Rocket.Chat operational

Past Incidents

EGroupware Cloud in Frankfurt and Karlsruhe maintenance work:
2022 October 12th 22:00 – 13th 0:50 (CEST):

Maintenance work finished successfully. Cloud and mail services were temporarily unavailable.

EGroupware Cloud in Frankfurt and Karlsruhe affected:
2022 October 12th 14:38 (CEST):

Our infrastructure provider IONOS had an incident / outage (see also https://status.ionos.cloud):
“We are currently investigating an issue with the IONOS Cloud Kubernetes API.”

EGroupware wasn’t affected in the first phase.

2022 October 12th 15:20 (CEST):

EGroupware infrastructure is now affected, and we are trying to get more details from IONOS on when the issue can be resolved.

2022 October 12th 15:30 (CEST):

Nodes in KA are back, but FRA still has problems, and after 10 minutes KA is unavailable again.

We are still working on the situation and apologise for the outage.

2022 October 12th 16:30 (CEST):

According to IONOS, they fixed something with the provisioning, but we still don’t have access via the Kubernetes API. We are still working on the issue.

2022 October 12th 17:00 (CEST):

Currently all EGroupware services are back up, but we still can’t use the Kubernetes API. Therefore, we can only wait and see if it is completely fixed.

We will post further updates as soon as there is additional information.

EGroupware Cloud in Frankfurt and Karlsruhe affected:
2022 September 27th 14:10 – 14:19 and 16:12 – 16:15 (CEST):

Our infrastructure provider IONOS had an incident / outage lasting some minutes. This also affected our infrastructure, and the services were offline for 10 minutes. According to the incident report from IONOS, this problem will be fixed with the next Kubernetes Stability Upgrade on October 4th.

We apologize for the disruption during working hours!

EGroupware Cloud in Frankfurt – one DB node affected: 2022 September 9th 08:00 – 08:30 (CEST):

Our infrastructure provider IONOS had an outage at a host in Frankfurt that affected a database node. Due to a change in PHP 8.1, the automatic switch to another DB node did not take place. As a result, instances on the affected database node were temporarily unavailable. The error was fixed before the primary node was restored, so this should not happen again.

We apologize for the outage.

EGroupware Rocket.Chat temporarily unavailable, maintenance work EGroupware Cluster: 2022 May 26th 10:30 – 12:30 (CEST):

After a scheduled Kubernetes update in the night from May 25 to May 26, Rocket.Chat no longer starts.

Therefore, further maintenance work has to be carried out during the day today, which may result in short outages of the EGroupware Cloud. Any outage will last a few minutes at most. EGroupware Mail is not affected and will be available the whole time. We regret any inconvenience caused by the lack of prior notice, but the maintenance work is necessary now, and today is a public holiday, so it falls outside the core working hours of our customers.

==> Maintenance successfully finished, no failure of EGroupware Cloud, Rocket.Chat operational again

EGroupware Mail & cloud service temporarily unavailable: 2022 May 5th 16:10 – 16:25 (CEST):

Network issues at IONOS in KA and FRA.
After we reported the issue to IONOS, the network was restored and all services are available again.

EGroupware Mail & cloud service are up and running: 2022 April 15th 17:00 (CEST):

EGroupware services are now also up and running, but not all Kubernetes and database nodes are back yet. We’re still working on it, but we don’t expect any more service downtime.

EGroupware Mail service up and running, while EGroupware is still affected: 2022 April 15th 16:00 (CEST):

Power supply in the datacenter is back, IONOS is still working on recovering all services.
EGroupware Mail is available, but EGroupware itself is still down because the underlying filesystem is not yet available. We are still working on the issue.

EGroupware Cloud services FRA & KA: 2022 April 28th 15.00h (CEST):

Power failure in the data center in Frankfurt – see also https://status.ionos.cloud/
Karlsruhe also appears to be unreachable at the moment; more detailed information is not available yet. We will post an update as soon as there is one.

EGroupware Cloud services FRA & KA: 2021 October 28th 7.30h (CEST):

EGroupware Cloud services are available again. Mail & Rocket.Chat remained available the whole time.

EGroupware Cloud services FRA & KA: 2021 October 28th 7.15h (CEST):

Renewed outage of EGroupware Cloud services in the early morning. A problem in the database cluster means that write accesses can no longer be executed. The databases have already been stopped, and the second database node is now syncing. The sync should be complete around 07.30h.

The monitoring measures introduced after the last incident have worked: monitoring alerted us this morning at 6:17h, so we could initiate the restart earlier. We still need to investigate why the problem occurred again this morning.

EGroupware Cloud services FRA & KA: 2021 October 20th 9.00h (CEST):

EGroupware Cloud, Rocket.Chat and Mail services are available again. The services may be slower at the moment. The remaining database nodes will be synchronized after working hours today.

EGroupware Cloud services FRA & KA: 2021 October 20th 8.30h (CEST):

Two database nodes are available and the third is syncing. EGroupware Mail and Rocket.Chat are online again. EGroupware Cloud will take about 15 minutes to come back up.

EGroupware Cloud services FRA & KA: 2021 October 20th:

Outage of EGroupware Cloud services in the early morning hours. A problem in the database cluster means that write accesses can no longer be executed. The databases are being stopped and will then be restarted. The second and third databases need to join the first database, which can take up to 30 minutes each.

EGroupware Cloud and mail services are expected to be available again at 9am (CEST).

 

EGroupware Cloud services FRA & KA: 2021 September 29th 11.30 (CEST):

EGroupware Cluster: new failure of a database node. We need to shut down systems temporarily to get back to normal operation with at least 3 database nodes. Services are fully available again from 12.10h (CEST).

EGroupware Cloud services FRA & KA: 2021 September 29th 08.30 (CEST):

EGroupware Cloud is up and running on two database cluster nodes; the rest will be synced during the evening.

EGroupware Cloud services FRA & KA: 2021 September 29th:

Outage of EGroupware Cloud services at night. A problem in the database cluster means that write accesses can no longer be executed. The databases are being stopped and will then be restarted. The second database needs to join the first database, which can take up to 20 minutes. EGroupware Cloud will therefore be offline until 08:00h (CEST).

The outage may also affect mail services.

EGroupware Email services: 2021 July 6th:

A scrub (filesystem check) is running to check everything in detail. This will take some days. Until it is finished, we have moved half of the instances to Karlsruhe so as not to slow down the filesystem check.

Affected mailboxes have been restored from KA and are working properly again in FRA.

EGroupware Email services: 2021 July 5th 10.30 (CEST)

The storage system in the Frankfurt datacenter shows checksum errors, and mailboxes are not available.

  • Opened a ticket with IONOS, the service provider of the datacenter, and waiting for their response
  • Temporarily switched off the mail backend in Frankfurt so that the redundancy could take over; all mail services are now running in the Karlsruhe datacenter.

Your data is safe, but performance will be a bit slower until Frankfurt can be switched on again.

We will post an update here as soon as we have any news on this topic.

EGroupware Cloud maintenance window: 2021 June 2nd from 8 – 9.30 pm (CEST)

Our best guess about yesterday’s problem is that a “broken request” from a client on a single domain causes “Traefik” to respond with a “500 Internal Server Error” for some time, and not only to that client.

We will re-enable “Traefik” tonight and try to find out which request, domain and IP is causing the problem.

Problem has been identified with high probability and everything has been reset to normal operation.

Internal Server Error in Frankfurt: 01.06.2021 21:00 – 23.59 hrs

There was a problem in the EGroupware Cloud availability zone in Frankfurt from around 9pm, causing repeated “500 Internal Server Error” responses there. The availability zone in Karlsruhe was either not affected by the problem or affected only for a short time after we had switched everything to Karlsruhe as a workaround. Further investigations suggest that there is NO direct connection to the update to 21.1, but rather a problem with “Traefik” as a proxy / Kubernetes Ingress Controller, which only comes into play under very specific conditions.

As a first step, we updated the version of “Traefik”, which reduced the problem but did not eliminate it. A search in the “GitHub Forums of Traefik” turned up a post with a similar error description. In order to be able to provide a usable EGroupware Cloud today, we removed “Traefik” and are talking directly to Nginx, after which there were no more “Internal Server Errors”.

Failure of all EGroupware and mail services: 06.04.2021: 17.45 – 19.20 CEST

A network problem at IONOS caused the outage of the EGroupware and Mail services.
Colleagues are working as quickly as possible to clean up and restore connections.

06.04.2021: 18.30h The IONOS network is back up, but it will take some time until EGroupware and Mail are available again.

06.04.2021: 19.20h The nodes in Karlsruhe and then in Frankfurt are available again, so all EGroupware and Mail services are running.

Service failure EGroupware node Karlsruhe & Frankfurt 24.08.2020 15.40 (CEST)

We are in the process of determining where the problem lies.
Currently, both nodes seem to be affected.
Analysis so far shows a connection problem on the load balancers,
so there is no connection from outside.

18.00h (CEST): All systems (including the database cluster) were shut down.
The first database node was successfully restarted.
Currently the second database node is starting and synchronizing with the first.
As soon as this is completed, we will also restart the remaining systems.

18.30h (CEST): EGroupware and mail services are up again.