Data Engineering Practice – Data Analysis, Data Engineering and Data Visualization
Responsibilities:
- Participate in technical requirement gathering, analysis and design, solution analysis, proof-of-concepts, and architectural roadmaps. Work with cross-functional members, such as business functions and system integrators, to drive implementations and build scalable solutions.
- Work closely with Data Scientists, Analysts, and other stakeholders to understand their data requirements.
- Architect and implement data solutions on Azure, leveraging Databricks for advanced analytics and processing.
- Design and implement data models within cloud data systems using Azure ADLS, ADF, Synapse, Databricks, and Snowflake, ensuring scalability and efficiency. Use Azure services for seamless integration with other cloud-based tools.
- Work with multiple data sources - RDBMS and NoSQL databases such as Oracle, MS SQL Server, MySQL, PostgreSQL, HBase, Cassandra, MongoDB, or similar.
- Develop and maintain ETL (Extract, Transform, Load) pipelines using tools such as Azure Data Factory (ADF) and Apache Spark within Databricks for data transformation.
- Leverage Databricks and Snowflake capabilities for efficient loading and unloading of data.
- Integrate data from various sources into the Snowflake data warehouse and Azure Data Services.
- Collaborate with other teams to ensure seamless data flow between systems.
- Tune and optimize Snowflake queries for performance.
- Implement performance best practices in Databricks and optimize big data processing jobs using tools such as Microsoft Azure Stream Analytics, Synapse, Apache Spark, or Amazon EMR.
- Implement security measures for data stored in Snowflake and in AWS, Azure, or similar cloud-deployed data systems, ensuring compliance with industry standards.
- Establish and enforce data governance policies and practices.
- Set up monitoring systems to track data pipeline performance and proactively address issues.
- Conduct regular maintenance tasks such as data backups, updates, and patches for Azure Data Storage systems and AWS S3 as required.
- Develop, train, and deploy machine learning models directly within the Databricks environment.
- Develop data visualizations using tools such as Tableau, Power BI, or similar.
- Maintain thorough documentation for data models, ETL processes, and system architecture.
- Provide documentation for troubleshooting and issue resolution; support data-related issues, troubleshoot problems, implement solutions, and work with support teams to resolve issues in a timely manner.
- Participate in validation/testing, release, change management, and training activities.
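
The ETL responsibilities above follow the standard extract-transform-load pattern. The sketch below is a minimal, framework-agnostic Python illustration of that pattern; the record schema, field names, and cleaning rules are hypothetical stand-ins for what a production pipeline in ADF or Spark on Databricks would implement:

```python
# Minimal ETL sketch in plain Python (illustrative only).
# The columns "id" and "amount" are hypothetical examples, not a real schema.

def extract(raw_rows):
    """Extract: parse raw CSV-like rows into a list of dicts."""
    header, *rows = raw_rows
    columns = header.split(",")
    return [dict(zip(columns, row.split(","))) for row in rows]

def transform(records):
    """Transform: drop rows with a missing id and cast amounts to float."""
    cleaned = []
    for rec in records:
        if not rec.get("id"):
            continue  # skip invalid rows rather than loading bad data
        rec["amount"] = float(rec["amount"])
        cleaned.append(rec)
    return cleaned

def load(records, warehouse):
    """Load: upsert records into an in-memory 'warehouse' keyed by id."""
    for rec in records:
        warehouse[rec["id"]] = rec
    return warehouse

# Run the pipeline end to end on a tiny sample: one row is dropped
# because its id is empty.
raw = ["id,amount", "a1,10.5", ",3.0", "a2,7.5"]
warehouse = load(transform(extract(raw)), {})
```

In a real Databricks or ADF pipeline, each stage would be a distributed job (e.g. Spark DataFrame transformations) and the load target would be Snowflake or an Azure data store, but the stage boundaries are the same.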
Qualifications:
- Hands-on working experience on cloud platforms such as AWS, Azure, and Google Cloud, with experience in ADF, Synapse, Databricks, Snowflake, Apache Spark, Amazon EMR, and data visualization tools such as Tableau, Power BI, or similar.
- Exceptional coordination and communication skills to work effectively with client and offshore teams.
- Ability to manage multiple tasks and projects simultaneously in a fast-paced environment.
- Bachelor’s degree in computer science or equivalent is required; must be willing to relocate and available to work from the office.