Subscription billing startup Recurly is suffering a serious outage. The company says it experienced an intermittent hardware failure at approximately 3:30 AM (PDT) on Monday, which prevented some transactions from processing. It followed up with an apology to customers on Tuesday, but on Wednesday (today) the problem is still not completely resolved.

Here is the current situation:

  • All recurring invoice jobs are paused and queuing, causing live ‘Production’ customers to experience an interruption to their daily recurring collections.
  • Charges applied to any end-customer account will continue to be added to pending invoices.
  • The ability to process refunds may be impacted.

A quick check of status.recurly.com shows that the documentation and hosted pages were not affected. The Recurly app was affected by the original hardware failure for a total of 111 minutes. The current status message reads: “Recurring transactions are currently paused, new transactions are being accepted.”


The fiasco all started with Recurly’s encryption service. The primary encryption hardware device failed, the problem cascaded to the backup slave device, and the encryption keys used to access stored credit cards to process recurring transactions were corrupted. The worst part is that the company has stated: “At this point, it remains unclear how much of this data will be retrievable.”

Immediately following the failure, Recurly says it put its full team to work on the issue and apologized for the downtime. New sign-ups were not affected (apart from a delay in the posting of recurring daily invoices) but recurring transaction jobs were, since the company had to pause them in order to restore the service and re-rack several replacement boxes. Over the coming days, the recurring billing management firm plans to restore its credit card data store and re-enable recurring invoicing.

The SaaS firm has started running data migrations and is working with several vendors to restore the recurring jobs on its service. Unfortunately, the device that failed is specifically designed to make retrieval of information extremely difficult, prompting Recurly to say “we have found ourselves at odds with the very protections we have worked so hard to put in place.”

Recurly emphasized it is “working to restore as much data as possible as quickly as is practicable” and that there will be three different outcomes for its customers: some have already been fully restored, some will be restored to full functionality over the coming days, and the rest will have to reach out to some or all of their customers to have them re-enter billing information. For the last group, the company will provide support and tools to do so.

Recurly has been updating its customers directly via email, so if you haven’t gotten anything, check your spam folder. If you really haven’t seen anything, use the Contact Us form. Updates are also being sent out via Twitter.

Update at 11:30PM EST: Recurly says it has great progress to report. Here’s the crux of it:

We are in the process of working with our many gateway partners to restore your billing information. As of midnight tonight, we expect to have merchants who have been processing via Braintree Payments, PayPal Payments Pro, and Wirecard to be restored and operational. Over the coming days, we expect to have merchants who have been processing with Cybersource, Intuit Payment Solutions, Litle & Co., Merchant e-Solutions, restored with replenished customer billing information as well. Conversations with other payment gateways are happening as well.

I’ll continue to keep you updated.

Update on September 8: There’s been a huge amount of progress. See Status Update III – Recovery status.

Image credit: stock.xchng