The world of decision-making, along with the data sources and technologies that support it, is evolving rapidly. Data analytics is key to this and yet many SMEs have found themselves excluded from advanced analytics due to lack of skills and budget.
Ed Thewlis from The DataShed looks at how platforms have improved their usability to become relevant to every size of business, and how the adoption of platforms like Hadoop and Spark can deliver tangible benefits for smaller businesses.
Regardless of the size of your company, chances are you’ve heard of big data. The technical definition of big data is a data set too large and complex for companies to manage within their existing IT infrastructure. However, although that definition was accurate in the 2000s, now the term has become considerably less black and white, describing not just the volume of data, but everything from its complexity and variety to the speed at which it is produced.
If SMEs can look beyond the word ‘big’, which makes big data analytics sound like an exclusive club for large corporates, they will recognise that, at its core, big data is actually about empowering business owners to make smarter decisions. Undeniably, larger companies have been the first to take advantage of the opportunity that big data affords, but thanks to falling tech prices, new analytic tools and the proliferation of Open Source projects, the benefits of big data analytics have never been more accessible to smaller businesses.
In 1999 the Apache Software Foundation was created as an organisation and framework for creating software ‘for the public good’, helping to challenge Microsoft’s dominance and fuel an Open Source revolution. Today, one of the most commonly used databases, MySQL, is Open Source, and the big data revolution has been driven by Hadoop, another Open Source project. If you want to build something new today, the chances are you’ll be building on top of an existing Open Source project. Open Source technologies can keep the cost of analytics down for SMEs as long as the correct technology is selected, but where to start?
All too often, SMEs focus on the transactional needs of business systems and usually forget to consider how they want to analyse the data and patterns that those systems generate until after the event. All of this can be resolved by something as simple as talking to the front-line of the business at the outset. An analytics strategy cannot be separated from the data strategy, nor the technology strategy.
It’s important that SMEs don’t overstretch themselves at the start of an analytics project. Solutions don’t need to be big to be advanced. Start with the data you currently collect and the current set of business problems. Ask yourselves what the burning issues are, and then examine what kind of solutions could solve them. Foster a culture internally that promotes communication at all levels, not just managerial. If you haven’t explored the value in your data and don’t fully understand how governance and security procedures will affect users, it is hard to stipulate the right ones, so it’s vital to get users involved from the beginning.
Treat big data analytics as an ongoing journey. There will be several stop-offs along the way, perhaps lasting months. The business will seek to implement decisions based on the initial insight that your new analytics, technology and data improvements have enabled, but pretty soon it will be seeking new and ever more valuable insight. So, be ready to keep extending and improving those solutions.
There are platforms out there which have significantly improved their usability to become relevant to every size of business, and these make an excellent starting point. With the ‘cloud’ reaching maturity, it is now possible to use services only when required. Previously, to run a Hadoop cluster, you’d need to dig deep to buy the hardware, find somewhere in your data centre to keep it, and find someone to maintain it. Now, it is perfectly possible to spin up a Hadoop cluster to process ‘small’ data on demand. Jobs which may previously have taken several hours to run can now be completed in minutes.
Apache Spark is a good example of this. Spark is a general-purpose data processing tool with built-in real-time data processing and data mining capabilities. Most crucially, it is Open Source, meaning you won’t pay expensive licence fees. Spinning up a Spark cluster on Amazon Web Services or Azure to run your daily data processing routines is simple, and means you only pay for a few minutes of server time per day, rather than committing capital expenditure to a big machine that you may never fully utilise.
Analytics projects are all about trial and error. Some result in significant value, some result in dead ends. Both the cloud and the Hadoop ecosystem give businesses the ability to approach IT from a marginal-cost perspective, remove huge upfront investment, and limit liability if an analytical project returns no benefit.
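To make the pay-as-you-go argument concrete, here is a minimal Python sketch that compares renting a cluster for a short daily job with buying hardware outright. Every figure in it (node count, hourly rate, job length, purchase price) is a hypothetical assumption chosen for illustration, not a quoted price from any cloud provider or vendor, and the function names are invented for this example.

```python
# Illustrative only: all figures are hypothetical assumptions,
# not real prices from any cloud provider or hardware vendor.


def on_demand_cost(nodes, hourly_rate_per_node, minutes_per_day, days):
    """Total cost of renting a cluster only while the daily job runs."""
    hours_per_day = minutes_per_day / 60
    return nodes * hourly_rate_per_node * hours_per_day * days


def break_even_days(upfront_cost, nodes, hourly_rate_per_node, minutes_per_day):
    """Days of daily runs before on-demand spend equals the upfront purchase."""
    daily_cost = on_demand_cost(nodes, hourly_rate_per_node, minutes_per_day, days=1)
    return upfront_cost / daily_cost


if __name__ == "__main__":
    # Assumed scenario: 10 nodes at 1.00 per node-hour, 30 minutes a day.
    yearly = on_demand_cost(nodes=10, hourly_rate_per_node=1.0,
                            minutes_per_day=30, days=365)
    # Assumed alternative: a one-off hardware purchase of 20,000.
    days = break_even_days(upfront_cost=20_000, nodes=10,
                           hourly_rate_per_node=1.0, minutes_per_day=30)
    print(f"On-demand cost for one year: {yearly:.2f}")
    print(f"Days until on-demand spend matches the purchase: {days:.0f}")
```

Under these assumed numbers the on-demand bill comes to 1,825 for a year, and it would take 4,000 days of daily runs to match the 20,000 purchase. The sketch deliberately ignores storage, data transfer, maintenance and staff costs, so treat it as a way to frame the question rather than a costing model. It also shows the liability point: if a project is abandoned after a month, the on-demand spend stops, while the purchased machine is already paid for.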
Big data presents an abundance of business opportunities for SMEs. No business, large or small, can afford to ignore the valuable insights that data analysis can unlock. Data holds the key to understanding your business in more detail, making better decisions and understanding your customers’ needs so that you can respond in a more meaningful way. And the good news is that small businesses now have access to many of the same big data analytics tools as large businesses. So, in big data terms, size no longer matters.
By Ed Thewlis, MD of The DataShed