We are looking for a Data Engineer to join our Engineering team and work on our data pipeline. We ingest data from some of the most interesting sources in the world, from the Twitter and Foursquare APIs to UFO sightings reported to the National UFO Reporting Center.
Leverage your expertise with data by contributing to our mission to make data more accessible for the rest of the world. Members of our Data Team make it possible for us to serve our customers with data like Trstrank and everything you can find in our Geo APIs. We are a world-class big data shop, with a unique approach and philosophy that you won’t be able to find anywhere else.
Our data pipeline uses technologies such as HBase, Elasticsearch, Flume, Chef, Pig, and Hadoop. We’ve even developed tools of our own to make the ingestion pipeline run more smoothly, such as a Ruby-based interface to Hadoop and a bulk loader for Elasticsearch. You can check them out here: http://www.infochimps.com/labs
Natural Language Processing algorithms
ETL (Extract, Transform, and Load) experience
Unsupervised clustering algorithms
Large-scale data processing