Startup Trifacta Embracing the Data Scientist in All of Us
Organizations drowning in big and small data will soon have a new way to wrangle, munge or transform it (however you want to describe the process), thanks to software from startup Trifacta that's now in beta tests.
Thu, December 05, 2013
Network World - Organizations drowning in big and small data will soon have a new way to wrangle, munge or transform it (however you want to describe the process), thanks to software from startup Trifacta that's now in beta tests.
The San Francisco company, whose staff has grown from its three computer scientist founders a year ago to a robust 22 employees, today announced a second round of venture funds totaling $12 million led by Greylock Partners and Accel Partners. That brings overall funding to $16.3 million. The new funds will allow Trifacta to bulk up product design, engineering, sales and marketing, and should pave the way for the company to roll out its data management technology next year.
First-time CEO Joe Hellerstein, who this week will present at the Data Science Summit in Redwood City, Calif., says Trifacta software sits "in the lifecycle of data between the time it has landed in infrastructure, typically something like Hadoop, and the time it is consumed in business intelligence or predictive analytics tools." Expect Trifacta to partner with companies whose products bookend its own, such as Cloudera, Pivotal and Tableau.
Trifacta seeks to make cleaning up that raw data faster so that data scientists and business analysts can spend more of their time analyzing it and making use of it. (The company isn't overlooking "small data," either, such as enabling users to transfer contact information from one app to another.) "There's a real user interface challenge," Hellerstein says.
The product will consist of a Web-based user interface and a lightweight back-end service that Hellerstein says can be hosted on relatively modest hardware. That service will hold metadata that becomes more useful for predicting how users might employ the software the longer they have it. The serious number crunching still takes place on existing big data infrastructure, he says.
While the hullabaloo over big data might elude those running enterprise networks, Hellerstein sees plenty of ways in which such IT pros might benefit from getting a better handle on the streams and reams of information in their organizations.
"Everyone's a data scientist nowadays, everybody is trying to be more data driven," he says. "People in IT obviously have a great interest in looking at machine-generated data in one form or another...such as in data security (not that Trifacta is targeting that vertical market particularly)."