About the right candidate:
Your tool belt holds many languages and Unix tools. You have a working knowledge of algorithms in general, and real experience with data mining, NLP, machine learning, text analysis, and/or computational linguistics. You can manage both code and large amounts of data. You are awesome with a single computer, and downright scary with hundreds of them. You know how to move fast and find the shortcut when needed. You can whip up a Web page scraper with one hand behind your back while holding a conversation about the fine points of MapReduce versus Hadoop. You see no shame in writing a single-use script, but you can design an entire infrastructure to crawl and analyze and tame a lot of data from many sources, and keep it tidy over time.
Please send your resume to firstname.lastname@example.org.
Database experience: MongoDB.
Hadoop/Cascading/Oozie are a plus.
Java and a scripting language: Ruby/Python. Experience with Scala is a plus.
Bonus if you have worked with the Wikipedia dataset.
* Competitive compensation, stock options, health/dental, a brand new rig of your choosing, and everything you see here.
* You'll be the next member of an incredible, growing team.
* Create something unbelievably cool from scratch, and maybe even change the world.