Responsibilities include, but are not limited to:
Load data from different datasets and decide which file format is most efficient for each task.
Define Hadoop job flows.
Build distributed, reliable, and scalable data pipelines to ingest and process data in real time.
Manage Hadoop jobs using a scheduler.
Design and implement Hive table schemas and HBase column family schemas on HDFS.
Develop efficient Pig and Hive scripts that join datasets using various techniques.
Troubleshoot and debug Hadoop ecosystem runtime issues.
Additional duties as assigned.
Required Qualifications:
Excellent written and oral communication skills
3-5 years of hands-on Hadoop experience
Physical Demands and Work Environment:
The employee is regularly required to communicate (give/receive information) through multiple methods of communication. The employee is frequently required to stand or walk (or otherwise move through the organization); sit; use hands to type, write, handle or feel and reach with hands and arms. The employee may occasionally climb or balance; stoop, kneel, or crouch; or lift and/or move up to 25 lbs. The noise level is usually moderate and typical of an office environment. The employee may be required to use a variety of standard office equipment, including computer keyboards and monitors, phones, printers, etc.