Lack of Data Scientists is the New Von Neumann Bottleneck

Strata Conference's Founding Chair, Edd Dumbill, talks about bridging the data and information gap

By Brian Proffitt
Mon, January 30, 2012

Data is a huge presence within much of business and technology, and the next installment of the O'Reilly Strata Conference will provide attendees a look into the revolutionary ways data is driving, well, everything.

The Winter 2012 edition of the O'Reilly Strata Conference will offer sessions for everyone, from the businessperson trying to figure out just what this whole big data thing is all about to the hard-core data scientist wonks who are bringing all this new technology to the fore.

Big data has gotten a lot of attention in the past couple of years, as Hadoop, Cassandra, MapReduce, and other open source technologies have enabled businesses and governments to use data in ways unheard of when using relational database technology. The Strata Conference is the first and most prominent gathering for any party interested in learning about just what makes big data tick.

And that, according to Founding Chair Edd Dumbill, is part of the whole point of Strata: educating users and data scientists about the benefits and applications of big data.

"There are three main themes examined at Strata," Dumbill said in a recent interview. "The increasing of data and the growth of ubiquitous computing are two, which form the start of an arc to the third aspect."

The arc, Dumbill continued, leads to a much higher level of interconnectivity, the so-called "Internet of Things," which describes the billions of objects tagged and otherwise connected to the Internet, each providing massive amounts of data to be collected and processed.

But processed by whom? Stored how? And utilized in what manner? Those are the key questions that gatherings like Strata hope to address, particularly that last, third part of the arc: how data is used. This is what Dumbill refers to as "data and the final mile."

The "final mile" is likely a familiar term to network engineers: it refers to the all-important connectivity between the end-user and the rest of the Internet.

"So it is with data science and analytics within a business," Dumbill said. For data, the "final mile" refers to the capability to properly process data and convey what's really important: information.

The bridge that turns data into information (which can then be used to acquire knowledge) is exactly where the data scientist lives, and the skill it demands is still in short supply within this burgeoning field.

Originally published on www.itworld.com.