There’s a lot of big talk about “Big Data,” but it’s difficult to understand exactly what it all means. When you begin tossing around high-level explanations of data frameworks such as NoSQL, Hadoop, and Cassandra, it’s easy to get lost in the details and not understand what implementing Big Data IT measures can do for an enterprise business.
Big Data is the large-scale collection of granular data, and the analytics built on it, covering just about anything an organization can measure. As a practical example, Amazon relies on Big Data to keep track of everything from ecommerce transactions to recommendations. Because so much effort goes into curating, managing, and storing data at that scale, Big Data relies on special frameworks designed to handle it efficiently.
“Enterprises have been using customer data to drive more value out of their organization for years,” says David Steinberg, CEO of New York-based digital marketing agency XL Marketing. “It used to be that when a company would try to run a report, it could take two days; that same report can now be run in moments.”
Big Data has plenty of applications for an enterprise, but it’s not for everyone. Keep these tips in mind when establishing your own Big Data framework, and you’ll be able to take advantage of more metrics, and make smarter decisions, to drive more value than you could with a standard database system alone.
1. Assess Your Needs
The first thing to recognize when pursuing a comprehensive system for Big Data is that it’s not the be-all and end-all for the company of the future. Make an honest assessment of your company’s data needs, and plan accordingly.
The main reason not to rush into implementing a Big Data framework is that it is, well, work. Implementing a distributed file system that can easily handle granular (read: a ton of) information requires answering a lot of questions: Where will all this data live? How will it run? Who will manage it? Make sure those questions are answered correctly by hiring a consultant or by training someone who already knows the business on Big Data.
“When I look at the organizations, there’s often a disconnect between the technology group, which is collecting the data, and the other groups in the organization and the goals they’re trying to achieve,” Steinberg explains. “You’ll have a Chief Marketing Officer who wants to achieve one goal, the EVP of Sales wants to achieve another, the CFO who wants to achieve another, the CEO who is trying to put it all together, and then a CTO and CIO who are battling it out for who owns the data.”
Despite all the posturing, the good news is that Big Data is becoming much cheaper to gather, much easier to scale, and much faster to report. If your company is looking to gather more information about users, clients and behaviors, or to internally monitor the actions of the company, then Big Data could be a cheap and sophisticated way to manage all that information.
“The technology around the space — server capacity, storage capacity, reporting capacity — has gotten so inexpensive that it’s been able to gather more data and use that data,” Steinberg adds.
2. Make Data Goals
Once the decision to embrace Big Data is made, the next step is to turn all of that previously unused information into reportable, actionable facts that make a big impact on your business. Whether you’re working with an in-house IT team or a proprietary service (like Cloudera), the number one way to manage your data is to identify what’s important and how it can be used to its fullest extent.
Omer Trajman, former VP of Technology for Cloudera and current VP of Field Operations at Big Data application firm WibiData, says that a great feature of employing Big Data is the ability to refocus onto different data types and filter systems to get the right information at the right time. Trajman adds that Cloudera CTO Amr Awadallah had first-hand experience with the limitations of traditional data — which led to the development of the company and a more vested interest in Big Data management.
“He was trying to do analytics on browser usage platforms, and they had Windows, Mac, Linux, and an ‘Other’ bucket,” Trajman explains. “And then, within two months, the ‘Other’ bucket shot through the roof and they had no idea why. They couldn’t see it in the data because in their data warehouse, they set all other platforms to ‘Other.’ If they had Hadoop at the time, they could have just gone back to the raw data and changed what they were looking at to see that they needed an ‘iPhone’ option.”
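Trajman’s point about going back to the raw data can be illustrated with a minimal Python sketch. It is not Hadoop code and not how any team in the anecdote actually did it; the sample user-agent strings, the rule lists, and the function names are all hypothetical. The idea is only that when the raw strings are kept, the bucketing rules can be changed and re-run later.

```python
from collections import Counter

# Hypothetical raw log records: the user-agent string is stored as collected,
# not pre-aggregated into a platform label.
RAW_EVENTS = [
    {"user_agent": "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36"},
    {"user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2)"},
    {"user_agent": "Mozilla/5.0 (X11; Linux x86_64)"},
    {"user_agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X)"},
    {"user_agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 6_1 like Mac OS X)"},
]

# Ordered (substring, label) rules; the first matching rule wins.
ORIGINAL_RULES = [("Windows", "Windows"), ("Macintosh", "Mac"), ("Linux", "Linux")]
# Revised rules add an 'iPhone' bucket ahead of the original ones.
REVISED_RULES = [("iPhone", "iPhone")] + ORIGINAL_RULES


def classify(user_agent, rules):
    """Return the label of the first rule whose substring appears in the string."""
    for needle, label in rules:
        if needle in user_agent:
            return label
    return "Other"


def platform_counts(events, rules):
    """Count events per platform by applying the rules to the raw strings."""
    return Counter(classify(event["user_agent"], rules) for event in events)


if __name__ == "__main__":
    print(platform_counts(RAW_EVENTS, ORIGINAL_RULES))
    print(platform_counts(RAW_EVENTS, REVISED_RULES))
```

With the original rules, the two iPhone visits land in “Other.” With the revised rules, the same raw records yield an “iPhone” bucket, which is only possible because the classification happens at query time rather than at collection time.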
3. Monitoring is Key
Implementing a file system and workflow is only half the battle: companies must be prepared to monitor the health and security of their system to ensure it runs at maximum efficiency. Because many of these file systems are relatively new (Hadoop is ten years old), many of the security and monitoring tools built for them are still nascent.
“Relatively speaking, these systems have had less exposure in the enterprise, and there’s been less time for enterprises to develop security protocol,” Trajman adds. “Adoption is going a lot faster, but there’s a bit of a chase in the near-term to lock down solid security protocols.”
Investing in a Big Data system means that you must invest in a caretaking mechanism to ensure everything works correctly. It’s not enough to assume that what works in a traditional database system will work with a Big Data framework, especially since many of these frameworks can handle Big Data precisely because they keep everything as raw as possible. This means it’s imperative to monitor the data stringently to ensure the system is working at maximum efficiency and nothing is out of place.
As with any IT system, the more a company knows about the shortcomings of its Big Data system, the more efficiently everything will run and the more the company will learn from its data.
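As a rough illustration of what monitoring raw records might look like, here is a minimal Python sketch. The field names, the numeric-timestamp rule, and the 5 percent threshold are assumptions made up for this example, not part of any particular Big Data product.

```python
# Hypothetical fields that every raw record is expected to carry.
REQUIRED_FIELDS = ("timestamp", "user_id", "event")


def find_problems(record):
    """Return a list of human-readable problems found in one raw record."""
    problems = [
        "missing " + field
        for field in REQUIRED_FIELDS
        if record.get(field) in (None, "")
    ]
    timestamp = record.get("timestamp")
    if timestamp not in (None, "") and not isinstance(timestamp, (int, float)):
        problems.append("timestamp is not numeric")
    return problems


def health_report(records, max_bad_ratio=0.05):
    """Summarize how many records are out of place.

    max_bad_ratio is an assumed alert threshold, not an industry standard.
    """
    bad = []
    for index, record in enumerate(records):
        problems = find_problems(record)
        if problems:
            bad.append((index, problems))
    ratio = len(bad) / len(records) if records else 0.0
    return {
        "total": len(records),
        "bad": bad,
        "bad_ratio": ratio,
        "healthy": ratio <= max_bad_ratio,
    }


if __name__ == "__main__":
    sample = [
        {"timestamp": 1357000000, "user_id": "u1", "event": "click"},
        {"timestamp": "yesterday", "user_id": "u2", "event": "view"},
        {"user_id": "u3", "event": "click"},
    ]
    print(health_report(sample))
```

A check like this would typically run on a schedule against newly collected data, so a spike in malformed records is noticed before it skews reports.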
4. Pursue the Future
While Big Data is great for monitoring events that have already happened, it’s no surprise that IT companies are also using it to power the next generation of predictive modeling. To take advantage of Big Data in the long term, consider which data sets are key to predicting your customers’ needs. Embracing the future now means that you can get a jump on your competition and begin to use Big Data for real change.
“Once again, the more you can make your data actionable, the more valuable it’s going to be,” Steinberg explains. “You’re seeing a lot of numbers crunching and a lot of smart people trying to figure out what it all means. It’s really the evolution of Big Data.”
Trajman’s company, WibiData, recently released open-source, real-time predictive modeling software for the Kiji Project. He stresses that predictive modeling is the next logical step in Big Data, as more departments use their accumulated data to forecast user behavior. For example, an Ad Ops team can see that a certain ad is getting a massive clickthrough rate on a particular page, and determine which sites with similar demographics would be just as receptive.
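The Ad Ops example can be sketched in a few lines of Python. This is not Kiji or WibiData code; it is a simplified illustration that uses cosine similarity as a stand-in technique, and the site names and audience-share figures are invented for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical audience share per age band (18-24, 25-34, 35-49, 50+).
SITE_PROFILES = {
    "site_a": [0.50, 0.30, 0.15, 0.05],  # page where the ad performs well
    "site_b": [0.45, 0.35, 0.15, 0.05],
    "site_c": [0.05, 0.15, 0.30, 0.50],
}


def rank_similar_sites(source, profiles):
    """Rank every other site by demographic similarity to the source site."""
    target = profiles[source]
    scores = [
        (name, cosine_similarity(target, vector))
        for name, vector in profiles.items()
        if name != source
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_similar_sites("site_a", SITE_PROFILES):
        print(name, round(score, 3))
```

Here site_b, whose audience closely mirrors site_a, ranks well ahead of site_c. A real system would draw on far more signals than age bands, but the principle of scoring candidates against a known performer is the same.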
“We’re really just scratching the surface of the potential of all of this,” Trajman adds. “It’s really exciting.”
Like predicting weather patterns or the ends of words in Google Instant, predictive modeling is the cutting edge of Big Data. Strategize now, and you’ll ensure that your company not only moves with the curve, but gets ahead of it.
Big Data? Big Potential.
Like any IT solution, the foundational blocks of implementing a Big Data system are clear, concise, and practical. If you’re interested in putting in the work and maintaining the system correctly, you’ll find that a mysterious world of data — useful data — can be made available to you with surprising speed.
But it’s important to remember: Data is only as good as how you use it.
“Data should be transformational,” Trajman explains. “You need to create better experiences using your data, and that’s a pretty novel idea.”