Designed highly scalable, robust, and fault-tolerant systems.
Set up and managed Dataproc clusters to process large-scale data workloads efficiently, including configuring cluster specifications, scaling resources based on workload demands, and tuning cluster performance for faster data processing.
Developed data ingestion processes by implementing custom Java applications that extract data from various sources, such as databases, APIs, and file systems.
Created Scala utilities to automate manual work, reusing operators to reduce redundancy.
Built data processing pipelines and ETL (Extract, Transform, Load) workflows using Java and related frameworks.
Implemented Spark jobs in Scala and Java, using DataFrames and the Spark SQL API for faster data processing.
Strengthened unit testing to reduce defects and deliver a quality product, writing tests that compare the expected and actual behavior of Python functions.
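
To illustrate the unit-testing practice mentioned above, here is a minimal sketch of a test that compares a Python function's expected output with its actual output. The function normalize_record and its fields are hypothetical, made up only for this example; they are not taken from any of the projects described here.

```python
import unittest


def normalize_record(record):
    """Hypothetical transform step: tidy the name and cast the amount to float."""
    return {
        "name": record["name"].strip().title(),
        "amount": float(record["amount"]),
    }


class NormalizeRecordTest(unittest.TestCase):
    def test_expected_matches_actual(self):
        # Raw input as it might arrive from a source system (assumed shape).
        raw = {"name": "  alice smith ", "amount": "12.50"}
        expected = {"name": "Alice Smith", "amount": 12.5}
        actual = normalize_record(raw)
        self.assertEqual(expected, actual)

    def test_invalid_amount_raises(self):
        # A non-numeric amount should fail loudly instead of passing through.
        with self.assertRaises(ValueError):
            normalize_record({"name": "bob", "amount": "n/a"})
```

Saved in a test module, this can be run with python -m unittest. Keeping the expected value explicit in each test makes a failure show exactly where the actual result diverged.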