Now is the time for IT practitioners to hone big data skills, as CIOs, CTOs and even CMOs are paying top dollar for professionals who can help in the administration and analysis of huge stores of data.
As big data matures beyond buzzword status, companies are now building teams to help them get a handle on the huge caches of business data they produce every day. According to Gartner, 73 percent of organizations plan to invest in big data in the next 24 months. However, with so many big data skill sets still so new, more than a few enterprises are running into snags when it comes to hiring the right people for the job. A recent IDG survey found that 40 percent of its respondents are having a tough time finding employees with the big data skills they need. Clearly, this is a great time to hone those sought-after skills, get some training and prepare to take advantage of what’s likely to be a long-term trend.
Here are five of the key big data skill sets that recruiters are looking for:
Raw data is meaningless until it has been capably analyzed. As enterprises seek to harness the power of big data, they're increasingly craving professionals trained in the interdisciplinary science of data analysis. This includes mathematical and statistical know-how, data modeling, programming and general business acumen. It all has to be paired with a strong creative streak that allows these data scientists to ask the right questions of the data to gain insight into areas like customer buying behaviors, sales patterns, marketing trends and product development. That makes the data scientist something like the renaissance man or woman of IT, which is a tall order to fill completely. Unsurprisingly, according to a survey by EMC, two-thirds of IT decision-makers believe that the demand for data science will outpace the supply, creating a significant talent gap. It's why, as the Wall Street Journal recently reported, data science and business analytics training programs are surging at major schools.
Hadoop and MapReduce are now more or less mandatory when it comes to searching and analyzing data, although there are also newer technologies to be aware of. Stream processing and in-memory data grids offer ad hoc searches in something approaching real time. But for most IT people, Hadoop's distributed file system, together with the clustered storage and processing capabilities it offers on commodity hardware, makes it the infrastructure backbone for many a big data program. As a result, the skills necessary to stand up and administer Hadoop environments will continue to sizzle in 2015 and beyond. Dice.com estimates that job postings for Hadoop developers are up 35 percent over last year and still growing.
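For readers new to the MapReduce model, here is a minimal single-machine sketch of the idea in plain Python. It is not Hadoop code and does not use any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the sample log lines are made up for illustration. A real cluster would run the map and reduce steps in parallel across many nodes.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence on one machine (illustrative only)."""
    pairs = (pair for record in records for pair in map_phase(record))
    groups = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in groups.items())


if __name__ == "__main__":
    # Hypothetical log lines standing in for a much larger data set.
    logs = ["error disk full", "warning disk slow", "error network down"]
    print(run_job(logs))
```

The point of the split is that each map call looks at one record and each reduce call looks at one key, which is what lets a framework spread the work across a cluster.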
There is only one term you’ll see in more big data job postings than Hadoop, and that’s NoSQL: the family of non-relational database management systems that facilitate the storage, retrieval and modeling of mass quantities of complex data. Huge data sets that need to move fast tend to fit poorly in traditional relational database models. MongoDB, with its JSON-like document model, is still the most popular (and sought-after) NoSQL implementation, joined by Couchbase, Redis and CouchDB. Forrester estimates that the adoption rate for NoSQL systems is currently 20 percent and likely to double in the next three years.
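To make the document idea concrete, here is a small sketch using only Python's standard `json` module, with no database involved. The order record, its field names and the helper functions are hypothetical. It shows one nested record kept whole, next to the flat rows a relational design would typically split it into.

```python
import json

# Hypothetical order record: nested items and tags live inside one document.
order_doc = {
    "customer": "Ada",
    "items": [
        {"sku": "A-100", "qty": 2, "price": 9.50},
        {"sku": "B-200", "qty": 1, "price": 20.00},
    ],
    "tags": ["priority"],
}


def order_total(doc):
    """Compute the order total directly from the nested items."""
    return sum(item["qty"] * item["price"] for item in doc["items"])


def flatten_for_relational(doc):
    """Return the per-item rows a relational schema would store in a separate table."""
    return [
        (doc["customer"], item["sku"], item["qty"], item["price"])
        for item in doc["items"]
    ]


# A document survives a round trip through JSON text unchanged.
serialized = json.dumps(order_doc)
restored = json.loads(serialized)

if __name__ == "__main__":
    print(order_total(restored))
    print(flatten_for_relational(restored))
```

In the document form, adding a new field to one order does not require changing every other record, which is part of why fast-changing data tends to sit more comfortably there.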
Creating and executing jobs across enormous non-relational databases often takes a bit of scripting. The most common languages in this space are Hive, Pig and JAQL. These high-level query languages (HLQLs) often allow more abstract queries, at the expense of some performance compared with coding directly against the API. It also helps to have a good grounding in Java, which is useful for writing user-defined functions for big data analysis. A bit of Python knowledge can be helpful too, and a solid grounding in C++ never hurt anybody.
It's not good enough to simply query the data and call it a day. With the volume and velocity of business data feeds continuing to grow rapidly, complicated relationships between big data sets are difficult to comprehend using words and numbers alone. That is why many business analysts and data scientists are turning to data visualization methods, not only to make sense of the data themselves but also to communicate ideas to business stakeholders who want an at-a-glance understanding of important concepts. As a result, skills in popular data visualization tools like Maltego and Tableau will be in high demand through 2015 and beyond.