TOCS Event-Silicon Valley Campus - Carnegie Mellon University


TOCS Event


Salim Achouche


Sr. Software Architect, Informatica


April 16, 1:30 pm



CMUSV, Rm 118


Title: Trying to Optimize ETL on Top of Hadoop using HIVE

With the emergence of Hadoop, businesses are incorporating massive datasets that were previously unavailable to their analytical tools. To realize business value from big data, enterprises need to establish strong information governance. Data Integration is an essential prerequisite for successful data analysis. The capabilities of a Data Integration Platform include 1) bringing data to the processing environment, 2) transforming, cleansing, and augmenting it with data from external systems, and 3) loading it into other systems. The session will focus on our approach to executing data integration jobs using MapReduce, Hive, and the Informatica Data Engine processor.

Speaker Bio: Salim Achouche is a Senior Software Architect at Informatica, working on the Data Engine platform, where he helped engineer Informatica's Big Data solution on Hadoop. Previously, Salim helped deliver multiple products at Informatica, such as the Profiling functionality and the Metadata Model Driven Repository, as well as the Webmail solution at Lycos Inc. He earned his undergraduate degree in Computer Science from Algiers University and his master's degree, also in Computer Science, from Wichita State University.