Johnny
Costa
Data Engineer & Python Developer
Building end-to-end data solutions across cloud environments, from raw ingestion to actionable dashboards. Experienced in ETL pipelines, machine learning, and process automation.
End-to-end data solutions,
from raw ingestion to insights
I'm a Data Engineer with experience building scalable data solutions across GCP, AWS, and Azure. My work spans ETL pipeline design, machine learning, business intelligence, and process automation.
At Banese Card, I developed a fraud detection ML model with 80% recall and reduced operational SLA by 60%. At Ford Motor Company, I built a scalable ETL pipeline centralizing multi-source vehicle data into BigQuery and co-developed an ML model for vehicle weight estimation.
I hold a Bachelor's in Computer Science and two postgraduate degrees, one in Big Data & AI and one in Software Engineering. I'm currently deepening my expertise in AWS and Databricks. English level: B2 (CEFR).
Technical Skills
Selected work
Automates the process of reading files from a local directory and uploading them to Amazon S3. Features secure credential management via environment variables and structured logging with Loguru — built as part of a hands-on AWS learning journey.
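To give a feel for the directory-to-S3 flow described above, here is a minimal sketch rather than the project's actual code. It assumes boto3 as the AWS client and uses the standard `logging` module in place of Loguru so it runs without extra dependencies. The names `list_files`, `make_s3_uploader`, and `upload_directory` are hypothetical, and the upload function is injected so the loop can be exercised without AWS access.

```python
import logging
from pathlib import Path
from typing import Callable, List

logger = logging.getLogger("s3_uploader")

# Hypothetical uploader signature: (local_path, bucket, key) -> None
Uploader = Callable[[str, str, str], None]


def list_files(directory: str) -> List[Path]:
    """Return the regular files directly inside `directory`, sorted by name."""
    return sorted(p for p in Path(directory).iterdir() if p.is_file())


def make_s3_uploader() -> Uploader:
    """Build an uploader backed by boto3 (imported lazily, so the rest runs without it)."""
    # Third-party dependency. boto3's default credential chain reads
    # AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY from the environment,
    # so no secrets appear in the code.
    import boto3

    client = boto3.client("s3")
    return lambda path, bucket, key: client.upload_file(path, bucket, key)


def upload_directory(directory: str, bucket: str, upload: Uploader) -> List[str]:
    """Upload every file in `directory`; return the keys that succeeded."""
    uploaded: List[str] = []
    for path in list_files(directory):
        try:
            upload(str(path), bucket, path.name)
        except Exception:
            # Log the failure with its traceback and keep going with the next file.
            logger.exception("Failed to upload %s", path)
            continue
        logger.info("Uploaded %s to s3://%s/%s", path, bucket, path.name)
        uploaded.append(path.name)
    return uploaded
```

With credentials exported in the shell, a call such as `upload_directory("data", "my-bucket", make_s3_uploader())` would push each file in `data` to the bucket; both the directory and the bucket name here are placeholders.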
Full ETL pipeline that scrapes game data from Steam via Selenium, transforms it with Pandas, and loads it into Google BigQuery. Connected to a live Google Sheets dashboard for real-time analysis of pricing trends and sales.
Full-stack sales management app built with Streamlit and PostgreSQL. Features Pydantic data validation, a Medallion architecture (Bronze/Silver/Gold) orchestrated with dbt, and auto-generated docs via MkDocs.
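The Bronze/Silver/Gold idea behind this project can be illustrated in a few lines. This is a dependency-free sketch, not the app's implementation: a plain dataclass stands in for the Pydantic model, plain functions stand in for the dbt models, and the `Sale` fields (`product`, `quantity`, `unit_price`) are assumed for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Sale:
    """Validated sale record (hypothetical fields)."""
    product: str
    quantity: int
    unit_price: float

    def __post_init__(self) -> None:
        # Reject records that would corrupt downstream aggregates.
        if not self.product.strip():
            raise ValueError("product must not be empty")
        if self.quantity <= 0:
            raise ValueError("quantity must be positive")
        if self.unit_price < 0:
            raise ValueError("unit_price must not be negative")


def to_silver(bronze: Iterable[dict]) -> List[Sale]:
    """Bronze -> Silver: cast raw rows to typed records, dropping invalid ones."""
    silver: List[Sale] = []
    for row in bronze:
        try:
            silver.append(
                Sale(str(row["product"]), int(row["quantity"]), float(row["unit_price"]))
            )
        except (KeyError, TypeError, ValueError):
            continue  # a real pipeline would log or quarantine the row
    return silver


def to_gold(silver: Iterable[Sale]) -> Dict[str, float]:
    """Silver -> Gold: aggregate revenue per product for reporting."""
    revenue: Dict[str, float] = defaultdict(float)
    for sale in silver:
        revenue[sale.product] += sale.quantity * sale.unit_price
    return dict(revenue)
```

Raw rows stay untouched in Bronze, only validated records reach Silver, and Gold holds the business-ready aggregate, so a bad input row never leaks into a report.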
Professional journey
Let's build something
worth the data.
Open to data engineering roles, freelance pipeline work, or just a conversation about data systems and Python.