GitLab offline after catastrophic database error loses mountains of data

The Y Combinator and Khosla Ventures-backed GitLab is hugely popular among developers, who appreciate the fact that it’s as close as you’ll get to an all-in-one solution. The service includes everything a developer could possibly need over the course of a project. At the core is a Git-based version control system, which is paired with helpful extras, like an IDE and collaboration tools.

As a result, a lot of companies depend on it, ranging from smaller startups and individual developers to larger enterprises like Intel and Red Hat. GitLab is also a tool we use internally at The Next Web.

Yesterday, GitLab announced that it was doing some emergency database maintenance. Unfortunately, it didn’t go to plan.

Whoops.

GitLab was quick to assure customers that no Git commits had been lost, only things like merge requests and issue posts. But given the often detailed nature of issue notifications, and the fact that people generally write them in the web browser rather than in, say, Microsoft Word, where they can save an independent copy, this is just as bad.

Right now, GitLab is trying to restore from its backups. Given the size of its database, this is a long process.

Snapshots are taken every 24 hours, and the data loss occurred six hours after the last one was taken. As a result, six hours of data has been lost, perhaps permanently. Predictably, a lot of people are frustrated.

The outage has also meant that the site remains offline, preventing developers from using one of their crucial tools during the middle of the workweek.

While it would be easy to condemn GitLab, it’s worth commending the company on its radical transparency. Throughout the process, it has kept users aware of progress via its GitLab Status Twitter account. It was also honest about the cause of the data loss – human error – and didn’t try to pin it on a hardware failure or an external attacker. GitLab is also livestreaming the recovery efforts, which is certainly novel.

This is a developing story. We’ll keep you informed of any new updates.

Update: A GitLab spokesperson got in touch. Emphasis theirs:

“On Tuesday, GitLab experienced an outage for one of its products, the online service GitLab.com. This outage did not affect our Enterprise customers or the wide majority of our users. As part of our ongoing recovery efforts, we are actively investigating a potential data loss. If confirmed, this potential incident would affect less than 1% of our user base—peripheral metadata that was written during a 6-hour window. We have been working around the clock to resume service on the affected product and set up long-term measures to prevent this from happening again. We will continue to keep our community updated through Twitter, our blog and other channels.”
