Lead Data Engineer - Nice

💡 Company in transition
Long-term contract
Location: Nice, France
Agriculture
Partial remote possible
€60,000 - €90,000 gross per year
From 5 years of experience
Posted on 12-16-2024

Doriane

Facilitating the transformations needed to build the agriculture of tomorrow, through data. Supporting everyone involved in agricultural innovation in meeting ecological and human challenges.

This company has begun its transition to improve its social and environmental impact. Only jobs that contribute directly to this transition are published here, such as CSR manager or carbon footprint project manager.

More information
  • Between 15 and 50 employees
  • Agriculture
Impact study
Doriane has not yet communicated its impact measurement.
Labels and certifications
This organization has not told us which labels or certifications it has obtained.

We are looking for a Lead Data Engineer to take charge of designing and managing our data infrastructure. You will lead the development of scalable, high-performance data models. You’ll oversee our ETL pipelines and data ingestion processes, and collaborate closely with data scientists to ensure their machine learning models are smoothly integrated into production. You will also play a key role in defining the infrastructure needed for heterogeneous data ingestion, ML training processes, and ML Ops, ensuring the right pipelines, monitoring, and automation are in place.

Key Responsibilities:

  • Lead the design and optimization of data models and infrastructure to support large-scale data processing.
  • Oversee and manage the data layer architecture, currently built on Cube.dev and MongoDB, with a key objective of evaluating and potentially transitioning to a SQL-based system (e.g., PostgreSQL) for enhanced performance.
  • Manage geospatial data, ensuring location-based data is handled efficiently for analysis, storage, and visualization.
  • Build and maintain robust ETL pipelines and data ingestion streams that ensure high availability, reliability, and performance of data systems.
  • Collaborate with the data science team to ensure the integration of machine learning models into production environments, focusing on efficient model deployment, monitoring, and iteration.
  • Design and implement ML Ops infrastructure to support model training, experimentation, and deployment, including tracking, versioning, and scalability of training processes.
  • Define and implement best practices for data governance, ensuring security, quality, and compliance.
  • Evaluate and adopt new tools and technologies to improve data processing, with a focus on real-time data ingestion and scalable ML infrastructure.
  • Provide leadership in shaping the future of our data architecture, ensuring it aligns with the company’s goals of sustainability and high-impact analytics.

Profile:

  • Strong experience in data engineering, including designing and managing data architectures, ETL pipelines, and data ingestion.
  • Expertise in NoSQL databases (e.g., MongoDB), with demonstrated experience in, or knowledge of, transitioning to or optimizing SQL-based systems (e.g., PostgreSQL, MySQL) for performance.
  • Solid understanding of geospatial data management and the ability to handle location-based datasets efficiently (e.g., PostGIS, GeoJSON, or other geospatial tools).
  • Deep understanding of AWS services and cloud-based infrastructure for managing large datasets and building data pipelines.
  • Experience with ML Ops: setting up pipelines for training machine learning models, managing infrastructure for ML experimentation, and automating model deployment and monitoring in production.
  • Familiarity with ML platforms (e.g., Kubeflow, SageMaker, or similar) and experience integrating ML workflows into production environments.
  • Proficiency with data processing frameworks and orchestration tools such as Apache Airflow or similar.
  • Strong programming skills in Python, TypeScript, or Java.
  • Excellent leadership and communication skills, with the ability to collaborate with cross-functional teams.