Site Reliability Engineer
Job Title: Platform Engineer / Site Reliability Engineer / Observability Engineer
Experience: 5+ years
Work Mode: Remote
________________________________________
Job Description:
We are looking for an engineer with strong experience in Python automation, observability (Prometheus), MongoDB, ETL workflows, and cloud data platforms. The ideal candidate has hands-on skills in monitoring and logging, data engineering pipelines, and performance optimization across databases and cloud environments.
Key Responsibilities:
• Develop and maintain data pipelines, ETL processes, and data workflows using SQL, Python, and Unix shell scripting.
• Manage and optimize MongoDB deployments, including schema design, indexing, aggregation pipelines, and performance tuning.
• Design and implement observability systems covering metrics, logs, events, and traces, with Prometheus as the metrics backbone.
• Lead migration and validation of dashboards across observability/monitoring platforms.
• Build automation scripts for monitoring, log retrieval, and system analysis.
• Work with cloud and data platforms (Azure, Snowflake, Databricks) for data movement, storage, and scaling.
• Collaborate with cross-functional engineering teams to ensure data reliability and system visibility.
Requirements:
• 5+ years of experience in Data Engineering or Observability Engineering.
• Strong proficiency in Python scripting, automation workflows, and Unix/Linux systems.
• Hands-on experience with Prometheus or similar monitoring tools.
• Strong SQL experience and understanding of data warehousing concepts.
• Experience with MongoDB (architecture, aggregation framework, performance optimization).
• Working knowledge of cloud and data platforms such as Azure, Snowflake, or Databricks.
• Strong analytical, debugging, and communication skills.