Lead Data Engineer - Nice

💡 Company in transition
Permanent contract (CDI)
Location: Nice, France
Agriculture
Partial remote work possible
€60,000 - €90,000 gross (annual)
At least 5 years of experience
Published on 16 December 2024

Doriane

Facilitate, through data, the transformations needed to build the agriculture of tomorrow. Support everyone involved in agricultural innovation in meeting ecological and human challenges.

💡 Company in transition

This company has begun its transition to improve its social and environmental impact. Only jobs that contribute directly to this transition are published here, for example CSR manager or carbon footprint project manager.

Impact measurement
Doriane has not yet provided an impact measurement.
Labels and certifications
This organization has chosen not to share any labels or certifications it may have obtained.

We are looking for a Lead Data Engineer to take charge of designing and managing our data infrastructure. You will lead the development of scalable, high-performance data models. You will oversee our ETL pipelines and data ingestion processes, and collaborate closely with data scientists to ensure their machine learning models are smoothly integrated into production. You will also play a key role in defining the infrastructure needed for heterogeneous data ingestion, ML training processes, and ML Ops, ensuring the right pipelines, monitoring, and automation are in place.

Key Responsibilities:

  • Lead the design and optimization of data models and infrastructure to support large-scale data processing.
  • Oversee and manage the data layer architecture, currently built on Cube.dev and MongoDB, with a key objective of evaluating and potentially transitioning to an SQL-based system (e.g., PostgreSQL) for enhanced performance.
  • Manage geospatial data, ensuring that location-based data is handled efficiently for analysis, storage, and visualization.
  • Build and maintain robust ETL pipelines and data ingestion streams that ensure high availability, reliability, and performance of data systems.
  • Collaborate with the data science team to ensure the integration of machine learning models into production environments, focusing on efficient model deployment, monitoring, and iteration.
  • Design and implement ML Ops infrastructure to support model training, experimentation, and deployment, including tracking, versioning, and scalability of training processes.
  • Define and implement best practices for data governance, ensuring security, quality, and compliance.
  • Evaluate and adopt new tools and technologies to improve data processing, with a focus on real-time data ingestion and scalable ML infrastructure.
  • Provide leadership in shaping the future of our data architecture, ensuring it aligns with the company’s goals of sustainability and high-impact analytics.
Desired Profile:
  • Strong experience in data engineering, including designing and managing data architectures, ETL pipelines, and data ingestion.
  • Expertise in NoSQL databases (e.g., MongoDB), with demonstrated experience or knowledge of transitioning to or optimizing SQL-based systems (e.g., PostgreSQL, MySQL) for performance.
  • Solid understanding of geospatial data management and the ability to handle location-based datasets efficiently (e.g., PostGIS, GeoJSON, or other geospatial tools).
  • Deep understanding of AWS services and cloud-based infrastructure for managing large datasets and building data pipelines.
  • Experience with ML Ops: setting up pipelines for training machine learning models, managing infrastructure for ML experimentation, and automating model deployment and monitoring in production.
  • Familiarity with ML platforms (e.g., Kubeflow, SageMaker, or similar) and experience integrating ML workflows into production environments.
  • Proficiency with data processing frameworks and tools such as Apache Airflow or similar.
  • Strong programming skills in Python, TypeScript, or Java.
  • Excellent leadership and communication skills, with the ability to collaborate with cross-functional teams.