EMC acquired Greenplum last year and has wasted no time in shaking up several markets with its strategy in this area: specifically, big data, open source, analytics and unstructured data. Greenplum founder Scott Yara, who now runs the new EMC Data Computing Products Division, has been working with EMC to build a data warehouse appliance that incorporates Greenplum software. Looking forward, VMware, SAS data analytics and the Apache Hadoop platform are being folded in to create what was described as the next-generation cloud-based data warehouse/analytics platform.
“Apache Hadoop has emerged as an important data technology and processing platform for unstructured data,” Yara said. “Hadoop is playing a significant part in the establishment of our big data/analytics stack.”
Apache Hadoop is an open-source technology inspired by Google's MapReduce and Google File System. It is a software framework that supports data-intensive distributed applications and is effective for storing and analyzing massive amounts of data. Yahoo, Facebook, eHarmony, Twitter, eBay and others have been using it to become more agile and to mine unstructured data, which makes up most data these days. Hadoop combines software, commodity hardware and simple interconnects. Now EMC Greenplum is planning to make it enterprise-ready.