This article was published on May 21, 2012

Microsoft Research just exploded the world record for data sorting in a 60 second window


Microsoft Research is claiming victory in the MinuteSort test, posting a score that it says essentially triples that of the previous title-holder, a 2009 Yahoo team.

MinuteSort is just that, a test of how much data you can sort in a minute. Obviously, this is a critical function, as ‘big data’ becomes less a buzzword and more a reality. Technologies such as Hadoop and cloud computing have brought the need to manage huge data sets to the fore; it’s a problem that is common.

Before we get into how Microsoft managed to set the record, here’s how well it did, according to its own post on TechNet: “In raw numbers, the team’s system sorted 1401 gigabytes in just 60 seconds – using 1033 disks across 250 machines.” The company compared those hardware figures favorably to the Yahoo team’s setup, noting that its own solution employed roughly “one-sixth of the hardware resources,” while sorting about three times the data.
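To put those raw numbers in perspective, here is a quick back-of-the-envelope sketch of the sorted-data rate they imply. It assumes decimal gigabytes (1 GB = 1,000 MB) and takes the quoted figures at face value; the function name `sort_throughput` is ours, purely for illustration, and has nothing to do with Microsoft's code.

```python
def sort_throughput(gigabytes, seconds, machines, disks):
    """Average sorted-data rates implied by a MinuteSort-style result.

    Assumes decimal units (1 GB = 1000 MB). These are averages of data
    sorted per second, not measurements of raw disk or network I/O.
    """
    total_gb_per_s = gigabytes / seconds
    return {
        "total_gb_per_s": total_gb_per_s,
        "per_machine_mb_per_s": total_gb_per_s * 1000 / machines,
        "per_disk_mb_per_s": total_gb_per_s * 1000 / disks,
    }


# Figures quoted in Microsoft's TechNet post
rates = sort_throughput(gigabytes=1401, seconds=60, machines=250, disks=1033)
for name, value in rates.items():
    print(f"{name}: {value:.1f}")
```

That works out to a little over 23 GB sorted per second across the cluster, or roughly 93 MB per second per machine. Keep in mind that a sort has to read the data and write it back out, so the actual disk traffic is presumably higher than these averages suggest.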

Interestingly, Microsoft didn’t use Hadoop in its solution to the problem, as you might have expected. Instead, a group of Microsoft Research folks created something called “Flat Datacenter Storage,” or FDS for short. The word ‘flat’ is critical. Microsoft described how FDS works in the following way:

[Microsoft Research’s Jeremy] Elson compares FDS to an organizational chart. In a hierarchical company, employees report to a superior, then to another superior, and so on. In a “flat” organization, they basically report to everyone, and vice versa.

Combine that with something called ‘full bisection bandwidth networks,’ and Microsoft just made data sorting news. I’ll be curious to see what happens next for FDS; Hadoop is now a commercial standard, but could FDS steal some of its thunder? According to Microsoft, the technology will likely be deployed on its own projects, and it hinted at other applications.

Of course, this is all Microsoft tooting its own horn, so until we hear from those in the know and external to Redmond, salt is our friend. Still, a tripling of speed using less hardware? That’s just cool.
