Jean-Marc Soumet
AI/ML Engineer
Summary
Software engineer with over 8 years of experience in AI/ML at Fortune 500 companies and agile startups. Proficient in Python, Java, and SQL, I have worked at every stage of the machine learning lifecycle, including data preprocessing, model development, deployment, and maintenance. I have designed and delivered major projects and led the implementation of solutions that generated business value and improved operational efficiency.
Experience
Ironclad
04/2022 - Present
Principal Machine Learning Engineer / Manager
Kick-started the machine learning efforts at Ironclad by deploying NLP models into production, then grew, led, and mentored a cross-functional AI/ML and full-stack engineering team.
- Architected and implemented Ironclad's NLP machine learning platform. The platform includes data preprocessing, model training with PyTorch on Google Cloud Platform (GCP) using Nvidia CUDA GPU acceleration, and scalable serving of predictions.
- Led the development and deployment of BERT-based transformer models for text classification and Named Entity Recognition (NER); integrated large language models (LLMs) like GPT, Gemini, and Claude for advanced information extraction tasks.
- Guided the company’s AI strategy by collaborating with customers to resolve product challenges. Advocated for AI initiatives and requirements to senior management, managed a team of four ML engineers and two full-stack engineers, and oversaw recruitment, screening, and training efforts.
- Designed zero-shot classification techniques that classify clauses accurately without any model training.
- Led the implementation of a Retrieval-Augmented Generation (RAG) system using LLMs, Vision-Language Models (VLMs), and vector databases to extract and retrieve information from millions of PDFs for a large customer base.
Roku
10/2020 - 04/2022
Senior Machine Learning Engineer
Led machine learning projects for the voice command team, managing backend systems for speech-to-text and natural language understanding to deliver on-device intents for content navigation.
- Oversaw end-to-end backend systems for speech-to-text and NLU, processing natural language queries from consumer devices at scale.
- Developed and maintained data training pipelines, including tools to generate synthetic data for multiple languages and markets.
- Optimized system performance to keep latency low for voice command operations, which is essential to the user experience.
- Used AWS services including S3, ECS, EC2, and SageMaker, together with Airflow, to run data processing and model training jobs.
Salesforce
12/2014 - 10/2020
Worked on multiple teams, including Einstein.ai Voice Assistant, Metamind.ai, and SalesforceIQ. Contributed to major products and projects focused on machine learning/AI, large-scale data pipelines, and big data analytics.
Einstein.ai Voice Assistant
Principal Machine Learning Engineer
- Led the development of the backend for the Einstein.ai Voice Assistant app, allowing customers to query and manage Salesforce data via voice commands.
- Trained and deployed deep learning models to support intent matching and entity recognition. Built APIs for Natural Language Processing (NLP) and Automatic Speech Recognition (ASR).
Metamind.ai
Principal Machine Learning Engineer
- Designed and implemented scalable microservices for dataset uploads, training job scheduling, and real-time inference.
- Developed model templates and training scripts for fine-tuning LSTM, BERT, and Faster R-CNN architectures.
- Implemented high-performance custom C++ JNI adapters for TensorFlow and PyTorch models, improving inference speed and simplifying integration.
Einstein.ai Platform
Lead Software Engineer
- Led a team of data engineers to develop large-scale ETL pipelines that transferred terabytes of data daily from Salesforce's global data centers to AWS S3.
- Architected a Scala/Spark-based platform for batch processing customer data, enabling AutoML inference across Service Cloud, Sales Cloud, and Marketing Cloud.
SalesforceIQ Data Analytics Team
Senior Software Engineer
- Used Kafka, Flink, Spark, Hadoop, SQL, and S3 to enable user behavior analytics for the SalesforceIQ SaaS product.
- Managed a petabyte-scale Postgres data warehouse, handling tasks such as table design, data extraction, and processing.