• Proficient understanding of distributed computing principles
• Management of a Hadoop cluster and all of its included services
• Ability to troubleshoot and resolve ongoing issues with cluster operation
• Proficiency with Hadoop v2, MapReduce, HDFS
• Experience building stream-processing systems using solutions such as Storm or Spark Streaming
• Good knowledge of Big Data querying tools, such as Pig, Hive, and Impala
• Experience with Spark
• Experience with integration of data from multiple data sources
• Experience with NoSQL databases, such as HBase, Cassandra, and MongoDB
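
To illustrate the MapReduce programming model referenced above, here is a minimal sketch in plain Python that mimics the three framework phases (map, shuffle, reduce) for a word count, the canonical MapReduce example. This is a local simulation for illustration only, not Hadoop API code; all function names here are hypothetical.

```python
from collections import defaultdict

def map_phase(doc):
    # Mapper: emit (word, 1) pairs for each word in the document.
    return [(word.lower(), 1) for word in doc.split()]

def shuffle(pairs):
    # Shuffle/sort: group all values by key, as the framework does
    # between the map and reduce phases.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Reducer: sum the counts emitted for each word.
    return {key: sum(values) for key, values in grouped.items()}

docs = ["big data big cluster", "data pipeline"]
pairs = [p for d in docs for p in map_phase(d)]
counts = reduce_phase(shuffle(pairs))
# counts == {"big": 2, "data": 2, "cluster": 1, "pipeline": 1}
```

In a real Hadoop job, the mapper and reducer would be distributed across the cluster and the shuffle handled by the framework; the logic per phase stays the same.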