.Expand and optimize our data and data pipeline architecture, and optimize data flow and collection for cross-functional teams.
.Support software developers, database architects, data analysts, and data scientists on data initiatives, and ensure that an optimal data delivery architecture is maintained consistently across ongoing projects.
.Create and maintain optimal data pipeline architecture.
.Assemble large, complex data sets that meet functional and non-functional business requirements.
.Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
.Develop real-time data ingestion pipelines.
.A successful history of manipulating, processing, and extracting value from large, disconnected datasets.
.Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
.Experience developing data pipelines using Azure Data Factory, Azure SQL Database, Azure Data Lake, and Databricks.
.Experience with big data tools: Hadoop, Spark, Kafka, etc.
.Experience with relational SQL and NoSQL databases, including Cosmos DB, Postgres, and Cassandra.
.Experience with object-oriented/functional scripting languages: Python.
.Experience with data pipeline and workflow management tools: Azure Monitor, Azkaban, Luigi, Airflow, etc.
.Experience with stream-processing systems: Spark Streaming.
.Reporting tools: Power BI, Tableau, etc.
Good to have:
.Experience with AWS cloud services: EC2, EMR, RDS, Redshift.
.Understanding of messaging infrastructure such as Azure Event Hubs, Service Bus, and Notification Hubs.
.Experience with stream-processing systems: Storm.
.Takes strong initiative and ownership.
.Communicates well within the team as well as with all stakeholders.
.Strong customer focus; highly proactive.
.Experience supporting and working with cross-functional teams in a dynamic environment.
.Good at decision-making.
B.E./B.Tech./B.S. in Computer Science
Total years of experience: 4 to 6