5 Reasons why infrastructure matters for Big Data success
IT infrastructure is not typically a focus for organisations looking to enhance their Big Data capabilities.
But it certainly should be.
About 60% of Big Data projects will fail this year due to issues with skills, tools, mind-sets and culture, according to Gartner. Yet around a third of the Big Data project failures we’ve seen come down to a lack of investment in infrastructure strategy.
Here are 5 reasons why infrastructure matters to your Big Data strategy:
#1 – Scalability
Most proof-of-concept (PoC) Big Data projects start out small on public cloud infrastructure because it’s easy to set up and there are plenty of great tools available to work with. But when it comes to running these projects at enterprise scale in a production environment, organisations often find on-premises infrastructure is the best option because of the sheer volume of data to manage.
#2 – Total Cost of Ownership (TCO)
One of the major advantages of building out your own infrastructure is that you can use rack servers. Yes, you need a lot of them for distributed file systems. But if you specify a server system that’s modular, flexible and managed via a single interface, on-premises is where the savings are made when you move a Big Data workload into production, compared with buying ever more cloud capacity.
#3 – Compliance and security
With a programmable infrastructure, you can automate security tasks and enable policy-based compliance without compromising system performance. This avoids scenarios where multiple databases, users and compliance mandates lead to a security patchwork that’s vulnerable to human error and demands significant staff time. Moreover, there are analytics platforms available that are purpose-built for maintaining infrastructure and data security.
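To make the policy-based idea concrete, here is a minimal sketch of automated compliance checking: policies are expressed as data and applied uniformly across an inventory, replacing the error-prone manual patchwork described above. The resource records and policy rules are hypothetical illustrations, not a Cisco API.

```python
# Policy-as-code sketch: each policy is a named rule applied to every
# resource record; violations are collected for reporting or remediation.
# (Resource fields and policy names here are invented for illustration.)

POLICIES = [
    {"name": "encryption-at-rest", "check": lambda r: r.get("encrypted", False)},
    {"name": "no-public-access", "check": lambda r: not r.get("public", False)},
]

def audit(resources):
    """Return a list of (resource_id, policy_name) violations."""
    violations = []
    for resource in resources:
        for policy in POLICIES:
            if not policy["check"](resource):
                violations.append((resource["id"], policy["name"]))
    return violations

inventory = [
    {"id": "db-01", "encrypted": True, "public": False},
    {"id": "db-02", "encrypted": False, "public": True},
]

print(audit(inventory))  # db-02 fails both policies
```

Because the rules live in one place and run automatically, adding a new compliance mandate means adding one entry to the policy list rather than re-auditing every database by hand.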
#4 – Active data
Sensors, Internet of Things (IoT) devices, social networking and online transactions all generate data that needs to be captured, monitored and processed rapidly to make data-based decisions instantly. It could be sentiment and exploratory analytics, trigger alerts, or any other use case that calls for extracting value from data in real time – i.e. active data.
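The trigger-alert case above can be sketched in a few lines: watch a stream of sensor readings and raise an alert the moment a moving average crosses a threshold. This is an illustrative pattern only; the stream, threshold and window size are invented for the example.

```python
# "Active data" sketch: act on streaming values as they arrive,
# rather than batch-processing them later. All values hypothetical.
from collections import deque

def alert_stream(readings, threshold, window=3):
    """Yield (timestamp, moving_average) whenever the average of the
    last `window` readings exceeds `threshold`."""
    buf = deque(maxlen=window)  # keeps only the most recent readings
    for timestamp, value in readings:
        buf.append(value)
        if len(buf) == window:
            avg = sum(buf) / window
            if avg > threshold:
                yield (timestamp, avg)

# Example: a temperature sensor trending upwards.
stream = [(0, 20.0), (1, 21.0), (2, 35.0), (3, 40.0), (4, 41.0)]
alerts = list(alert_stream(stream, threshold=30.0))
print(alerts)  # first alert fires at t=3, average 32.0
```

The same shape scales up: in production the generator would be replaced by a consumer on a streaming platform, but the decision logic stays the same.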
#5 – Time to value
Users report that the ease of deploying integrated infrastructure for Big Data means that they are able to capture its benefits earlier and expand their operations much more quickly to respond to capacity requirements.
The integrated infrastructure advantage
Ultimately, infrastructure automation and management, performance and scalability are where the rubber hits the road when it comes to managing the four Vs of Big Data – volume, variety, velocity, and veracity.
You need a cohesive, programmable infrastructure that can scale as workloads demand. One that can ensure the timely application of analytics and the secure transfer, storage and management of data.
Cisco UCS Integrated Infrastructure for Big Data is a portfolio of reference architectures designed, optimised and tested with leading Big Data ISVs to deliver the best balance of performance and capacity. This means you can cut in half the average time data scientists and business analysts spend waiting for a query or report to run. For example, Splunk Enterprise runs searches up to 6 times faster on Cisco UCS.
Superior performance comes from integrating compute, storage and networking with unique hardware and software technologies. Our automation capabilities can take over the tedious manual work involved in running Big Data applications such as Hadoop, with one organisation realising a 300% improvement in IT operational efficiency.
Cisco UCS Integrated Infrastructure is supported by Actian, Cloudera, DataStax, Elastic, Hortonworks, IBM (BigInsights), MapR, Oracle (NoSQL Database), Pivotal (HD and GPDB), Platfora, SAS, SAP (HANA Vora), Splunk, and others. It also includes options such as our C-Series Rack Servers and S-Series Storage Servers that are perfect for data-intensive workloads.
Key use cases we’re seeing today include:
- Data exploration – building new structures (data lakes) for collecting unstructured data
- Customer 360 – creating a complete view of your customers, so not just what they’re buying and from where, but many other parameters (call history, sentiment, purchase volumes, peer groups, etc.) including predicting future buying patterns
- Data warehouse augmentation – offloading data and moving processing to a more cost-effective infrastructure
- Operations analysis – predictive maintenance and product/service margin optimisation or differentiation
- IoT edge-to-enterprise analytics – real-time analysis and response at the edge of the network plus historical analysis, operation control and model development in the core data centre
One of the questions we’re often asked is why opt for a commercial Big Data solution when there are open source options available. The simple answer is that Big Data software still has to be deployed, configured and managed.
Vendors such as Cloudera, Hortonworks, IBM, MapR, and Splunk have built enterprise-ready solutions that come with service, support and features designed for specific use cases and verticals.
Moreover, together with UCS Integrated Infrastructure for Big Data, you can deploy pre-tested configurations as is, or use them as templates for building your own. You can scale your solution as workloads demand – to thousands of servers via Cisco Nexus 7000 and 9000 Series Switches.
You get a unified network fabric with unified management and advanced monitoring capabilities for consistent and rapid deployment using service profiles. This includes a global view of inventory, with one-click system software management and configuration changes. In other words, consistency and out-of-the-box performance.