Primary Skills: PySpark, Hive, Data Engineering, Sqoop, Cloudera, Hadoop, MapReduce
Secondary Skills: Spark, HDFS, Python, Pig
Job Location:
Bangalore/Bengaluru
Posted Date:
391 days ago
Job Description
Greetings! We are looking for a Data Engineer with experience in Python, PySpark, Azure, and the Hadoop ecosystem for our Bangalore location. Please find the JD below:
Required knowledge, skills, and qualifications:
Knowledge of Python and PySpark is an absolute must.
Knowledge of Azure and the Hadoop 2.0 ecosystem (HDFS, MapReduce, Hive, Pig, Sqoop, Mahout, Spark) is important for this role.
Experience with web-scraping frameworks (Scrapy, Beautiful Soup, or similar).
Extensive experience working with data APIs (REST endpoints and/or SOAP).
Significant programming experience with the above technologies, as well as Java, R, and Python on Linux.
Knowledge of a commercial Hadoop distribution such as Hortonworks, Cloudera, or MapR.
Excellent working knowledge of relational databases (MySQL, Oracle, etc.).
Experience with complex data parsing (Big Data Parser) is a must; should have worked with XML, JSON, and other custom complex data formats.
Natural Language Processing (NLP) skills, with experience in Apache Solr and Python, are a plus.
Knowledge of high-speed data ingestion, real-time data collection, and streaming is a plus.
Bachelor's degree in Computer Science or a related field.
3-5 years of solid experience with Big Data technologies is a must.