Paloma Cordeiro palomacdev

Hi, I'm Paloma Cordeiro 👋

Data Engineer · MLOps Engineer · Motorsport Analytics

I build end-to-end data systems — from real-time ingestion pipelines to production-ready ML infrastructure.

Currently working as a Data Lead, I own the full data lifecycle at my company: SQL Server administration, ETL pipelines, REST APIs, and BI delivery. Outside of work, I build MLOps and motorsport analytics systems that push into streaming, model tracking, and race simulation.

I don't just build models — I build the systems around them.

🚀 Featured Projects

⚡ Real-Time Fraud Detection Pipeline

Kafka · Spark Structured Streaming · MLflow · Docker Compose · PySpark

End-to-end streaming architecture: producer → Spark processing → ML inference → MLflow tracking
Fraud detection models tuned for 96% recall, a deliberate business trade-off that accepts lower precision in exchange for fewer missed fraud cases
ROC-AUC improved from 0.53 to 0.77 through feature engineering within the streaming pipeline
Fully reproducible environment via Docker Compose with domain-based service architecture

🔗 palomacdev/ml-lab
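
The recall-first tuning described above can be illustrated with a small, self-contained sketch. This is not code from the repository: the function name `threshold_for_recall`, the toy scores, and the toy labels are all assumptions made for illustration. It picks the highest decision threshold that still reaches a target recall and reports the precision paid for it.

```python
from typing import Sequence, Tuple


def threshold_for_recall(
    scores: Sequence[float], labels: Sequence[int], target_recall: float
) -> Tuple[float, float, float]:
    """Return (threshold, recall, precision) for the highest threshold
    whose recall meets target_recall.

    A transaction is flagged as fraud when score >= threshold.
    Hypothetical helper, not part of any library.
    """
    positives = sum(labels)
    if positives == 0:
        raise ValueError("need at least one fraud example")
    # Candidate thresholds are the observed scores, tried from high to low.
    for threshold in sorted(set(scores), reverse=True):
        flagged = [y for s, y in zip(scores, labels) if s >= threshold]
        true_pos = sum(flagged)
        recall = true_pos / positives
        if recall >= target_recall:
            precision = true_pos / len(flagged)
            return threshold, recall, precision
    raise ValueError("target recall is unreachable")


# Toy data (made up): model scores and true labels (1 = fraud).
toy_scores = [0.95, 0.80, 0.65, 0.40, 0.30, 0.10]
toy_labels = [1, 0, 1, 0, 1, 0]

threshold, recall, precision = threshold_for_recall(toy_scores, toy_labels, 0.96)
print(f"threshold={threshold:.2f} recall={recall:.2f} precision={precision:.2f}")
```

On this toy data, reaching the recall target means lowering the threshold far enough to catch every fraud case, which also flags two legitimate transactions. That is the trade-off in miniature.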

🎤 OpenF1 Transcribe — Real-Time Audio Processing

FastAPI · OpenAI Whisper · MongoDB · Docker Compose

Microservices architecture separating API layer from async batch processing workers
Transcribes thousands of F1 team radio files at about 2.2 s per audio file, with API latency under 50 ms
Enables full-text search across race communications via structured MongoDB indexing
Production-level repo: MIT license, contributing guidelines, modular codebase

🔗 palomacdev/openf1-transcribe
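
The split between a thin API layer and asynchronous batch workers can be sketched in-process with an `asyncio` queue. This is only a stand-in for the idea, not the project's implementation: `fake_transcribe`, `worker`, and `run_batch` are hypothetical names, the transcription call is a placeholder, and a plain dictionary stands in for the database.

```python
import asyncio
from typing import Dict, List


async def fake_transcribe(audio_id: str) -> str:
    """Placeholder for a speech-to-text call (hypothetical)."""
    await asyncio.sleep(0)  # yield control, as real I/O would
    return f"transcript of {audio_id}"


async def worker(queue: asyncio.Queue, store: Dict[str, str]) -> None:
    """Pull audio ids off the queue and store transcripts until cancelled."""
    while True:
        audio_id = await queue.get()
        try:
            store[audio_id] = await fake_transcribe(audio_id)
        finally:
            queue.task_done()


async def run_batch(audio_ids: List[str], n_workers: int = 2) -> Dict[str, str]:
    """Enqueue audio ids, let background workers process them, return results."""
    queue: asyncio.Queue = asyncio.Queue()
    store: Dict[str, str] = {}  # stands in for a database collection
    workers = [asyncio.create_task(worker(queue, store)) for _ in range(n_workers)]
    # The "API layer" only enqueues work; it never runs transcription itself.
    for audio_id in audio_ids:
        queue.put_nowait(audio_id)
    await queue.join()  # wait until every queued item has been processed
    for task in workers:
        task.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return store


if __name__ == "__main__":
    results = asyncio.run(run_batch(["lap1.mp3", "lap2.mp3", "lap3.mp3"]))
    print(len(results), "files transcribed")
```

Keeping the enqueue step separate from the slow transcription step is what lets an API respond quickly while the heavy work happens in the background.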

🏎️ DRS Data — Motorsport ML & Simulation Platform

XGBoost · Scikit-learn · SHAP · FastF1 · Python

Qualifying grid prediction model with a mean absolute error (MAE) of about 3 grid positions
XGBoost selected over Random Forest based on lower prediction error and stronger generalization
Race simulation engine modeling strategy scenarios and tire degradation
SHAP explainability for model interpretation and validation
Custom feature engineering from telemetry, track characteristics, and driver performance history

🔗 palomacdev/drs_data
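
To make the MAE figure concrete: mean absolute error averages how many grid positions each prediction is off by. The sketch below uses made-up predicted and actual positions, not results from the project, and `mean_absolute_error` here is a small local helper written for this example.

```python
from typing import Sequence


def mean_absolute_error(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Average absolute difference between predicted and actual grid positions."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Illustrative values only (not real qualifying data).
predicted_grid = [1, 4, 2, 8, 5]
actual_grid = [2, 1, 6, 5, 9]

# Per-driver errors are 1, 3, 4, 3, 4, so the MAE is 15 / 5 = 3.0 positions.
print(mean_absolute_error(predicted_grid, actual_grid))
```

An MAE of 3 means a typical prediction lands about three grid slots away from where the driver actually qualified.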

🛠️ Tech Stack

⚙️ Data Engineering

🧠 Machine Learning & MLOps

🗄️ Databases

☁️ Cloud & Languages

🎯 Current Focus

Machine Learning for real-time decision systems
MLOps and model lifecycle in production
Feature engineering on streaming data
Motorsport analytics and simulation systems

📫 Let's Connect

💼 LinkedIn
📧 palomacordeiro2009@hotmail.com
🔬 Architecture & experimental projects: palomahub-arch
