The client is looking for a Big Data ML Engineer who is an expert in the Hadoop ecosystem.
Responsibilities
Address job scheduling challenges in Hadoop.
Create and submit Spark jobs (a minimal submission sketch follows this list).
Set up and operate Kubernetes, the open-source container-orchestration system, to automate application deployment, scaling, and management (experience with Azure is preferable).
Build and tune CI/CD pipelines using Jenkins, deployment orchestration, and/or Kubernetes.
Create and modify Docker images and microservices, and deploy them via Kubernetes.
Design, develop, and perform unit and integration testing of complex data pipelines that handle large data volumes to derive insights.
Optimize code to run efficiently within stipulated SLAs.
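For illustration only, a minimal sketch of the kind of Spark batch job referenced above, assuming PySpark and hypothetical HDFS paths (hdfs:///data/events/ and hdfs:///output/daily_counts/) and column names; it would typically be submitted with something like spark-submit --master yarn --deploy-mode cluster daily_aggregation.py.

```python
# Minimal PySpark batch job sketch: read a CSV, aggregate, write Parquet.
# The input/output paths and column names are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F


def main():
    spark = (
        SparkSession.builder
        .appName("daily-aggregation")
        .getOrCreate()
    )

    # Read raw events from a (hypothetical) HDFS location.
    events = spark.read.option("header", "true").csv("hdfs:///data/events/")

    # Count events per date and type.
    daily_counts = (
        events
        .groupBy("event_date", "event_type")
        .agg(F.count("*").alias("event_count"))
    )

    # Write the aggregate back to HDFS as Parquet.
    daily_counts.write.mode("overwrite").parquet("hdfs:///output/daily_counts/")
    spark.stop()


if __name__ == "__main__":
    main()
```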
Qualifications
Expert in the broader Hadoop ecosystem.
Experience in high-performance tuning and scalability.
Experience working with real-time stream processing technologies such as Spark Structured Streaming and Kafka (see the streaming sketch after this list).
Expertise in Python, Spark, and their related libraries and frameworks.
Experience with the Spring Framework and Spring Boot.
Experience building ML training pipelines and with the work involved in ML model deployment.
Experience with other ML concepts: real-time distributed model inference pipelines, champion/challenger frameworks, A/B testing, model performance scorecards and assessment, retraining frameworks, etc.
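For illustration only, a minimal Spark Structured Streaming sketch of the kind of real-time processing listed above, assuming PySpark, a hypothetical Kafka broker at broker:9092, a hypothetical topic named events, and a made-up event schema; the Kafka source additionally requires the spark-sql-kafka connector package to be supplied at submit time (for example via spark-submit --packages).

```python
# Minimal Spark Structured Streaming sketch: consume JSON events from Kafka,
# parse them, and write a running aggregate to the console sink.
# The broker address, topic name, and schema are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = (
    SparkSession.builder
    .appName("kafka-stream-aggregation")
    .getOrCreate()
)

# Assumed shape of the incoming JSON events.
schema = StructType([
    StructField("user_id", StringType()),
    StructField("event_type", StringType()),
    StructField("amount", DoubleType()),
])

# Subscribe to the (hypothetical) Kafka topic.
raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "events")
    .load()
)

# Kafka delivers the payload as bytes in the "value" column; parse it as JSON.
parsed = raw.select(
    F.from_json(F.col("value").cast("string"), schema).alias("event")
).select("event.*")

# Running count of events per type.
counts = parsed.groupBy("event_type").agg(F.count("*").alias("events"))

query = (
    counts.writeStream
    .outputMode("complete")
    .format("console")
    .start()
)
query.awaitTermination()
```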