SmartWatch
Scraping pipeline for the opening hours of Lyon public facilities. Uses embeddings and an LLM to produce structured, unambiguous output, which is then compared against data.grandlyon.com.
With over 20 years of experience spanning software engineering and data science, I design and deploy AI-powered solutions and data analysis pipelines. An expert in NLP, LLMs and data pipelines, I combine scientific rigour with a pedagogical approach to turn business challenges into robust, production-ready applications.
Local Python audio transcription pipeline with speaker diarization, usable in real time from a microphone or on audio files. Works offline after the initial model download.
Composable Python library for string similarity matching. Supports edit distance, sequence similarity, token-based, phonetic and semantic similarity with a unified API.
AI-powered log analysis platform using clustering and statistical anomaly detection to identify patterns in large-scale log files.
Benchmarking platform for automatic speech recognition systems: controlled audio degradation, enhancement, normalization and multi-engine comparison with interactive reports.
Python benchmarking framework for text embedding models: grid search over chunking strategies and similarity metrics, with textual heatmap and embedding space visualizations.
TypeScript template for visualizing and interactively exploring hierarchical data structures.
Interactive visualizations for exploring statistical and machine learning concepts.