jeanboutros/README.md

Jean Boutros

Engineering Manager

📍 London, UK | 📧 jean [dot] boutros [at] gmail [dot] com | 🪪 LinkedIn | 📝 Medium Blog | 💾 GitHub

About Me

Data engineering manager with 19 years of experience building scalable data systems and high-performing teams. Over the past 5 years, I've led data engineering teams of 3–25 engineers, delivering data platforms, analytics solutions, and customer-facing products across consulting, media, humanitarian, and fintech sectors.

I thrive in roles where I can build teams, shape technical roadmaps, and deliver systems that compound value over time.

Core Expertise

Engineering Leadership

  • Practice Leadership & Team Building: Built data engineering teams from scratch (0–25 engineers); defined operating models, enterprise standards, hiring pipelines, and engineering direction across business areas
  • End-to-End Platform Ownership: Own data platforms from architecture and delivery through to reliability, cost, and continuous improvement
  • Executive & Stakeholder Engagement: Report to and influence C-suite (CTO, CPO); shape strategy, secure investment, and align technology with business outcomes
  • Cross-Functional Collaboration: Partner with Product, Business, Architecture, and Delivery teams to shape roadmaps, resolve dependencies, and drive delivery across portfolios
  • Coaching & Development: Mentor engineers across all levels; create training roadmaps and communities of practice
  • Hiring & Retention: Recruit and retain high-performing engineers; build career progression frameworks

Technical Strategy & Architecture

  • Data Platform Engineering: Databricks Lakehouse (Delta Lake, PySpark, Unity Catalog), data mesh, data products, and analytics platforms
  • Cloud Architecture: Cloud-native systems, data pipelines, dimensional models, event-driven and streaming architectures
  • AI & Innovation: AI-ready platform enablement, agentic AI workflows, MLOps foundations, Responsible AI principles
  • Infrastructure & DevOps: Infrastructure as Code (Terraform, Terragrunt), CI/CD with GitHub Actions, DataOps, observability
  • Delivery Excellence: Kanban-based delivery with iterative shipping and continuous improvement

Technical Foundation

  • Core Languages: Python (primary), PySpark, SQL, Bash
  • Other Languages: TypeScript, JavaScript, Java, Kotlin
  • Cloud & Infrastructure: Azure (Data Factory with self-hosted IRs, Functions, Data Lake, Event Hub), Kubernetes, Docker, Terraform, Terragrunt
  • Data Platforms: Databricks (Lakehouse, Delta Lake, Spark, Unity Catalog); Snowflake (working knowledge)
  • Databases: PostgreSQL, MySQL, ClickHouse, MongoDB, SQLite
  • AI & Automation: OpenAI API, Anthropic Claude, GitHub Copilot, agentic AI workflows
  • DevOps & Integration: GitHub Actions, CI/CD, DataOps, REST APIs, NestJS, GraphQL, Spring Boot, Microservices, Databricks Asset Bundles

Professional Experience

Data Architect | Bauer Media Outdoor | Aug 2024 - Present | London, UK

End-to-end owner of the data engineering practice, managing 3 direct and 12 indirect reports across data architecture and cloud engineering. Report directly to the CPO on platform performance and partner with the CTO on technical strategy. Defined the operating model, governance, and staffing frameworks from the ground up while balancing hands-on technical leadership with strategic planning and team development.

Product & Engineering Leadership:

  • Define and implement the data architecture framework serving BI and analytics customers across the organisation
  • Partner with the CTO, CPO, product, and business stakeholders to shape technical direction and translate requirements into scalable platform capabilities
  • Establish principles and standards for data quality and processing, reducing production incidents by 80%
  • Set enterprise standards for BI, data storage, pipeline design, naming conventions, and software engineering practices across multiple teams and business areas
  • Review and approve architectures of solutions built by other teams to ensure scalability, reliability, and adherence to best practices
  • Establish enterprise-wide engineering standards, DataOps practices, and delivery governance across the practice

Technical Delivery:

  • Own the Azure data platform end-to-end: Databricks Lakehouse, Data Factory, Data Lake, Event Hub, and supporting services
  • Design, implement, and govern Azure Data Factory pipelines with self-hosted integration runtimes for hybrid on-prem and cloud sources; set ADF standards adopted across teams
  • Engineer automated CI/CD pipelines with GitHub Actions, increasing deployment frequency by 3x and eliminating manual release errors
  • Implement Infrastructure as Code (IaC) using Terraform/Terragrunt for version-controlled, repeatable Azure infrastructure
  • Standardise data governance, access control, and security using Databricks Unity Catalog and Delta Lake as the foundational Lakehouse architecture
  • Build data processing applications in Databricks with PySpark and Python, reducing operational costs by ~50% and enabling the business to identify and close revenue leakage points
  • Design and implement an ingestion framework supporting API-based, batch, and streaming patterns, converting all data downstream to a unified streaming model
  • Establish data contracts through AI agents: define contracts in Markdown and enable consumers to track changes and apply them without human interaction

Reliability & Operational Excellence:

  • Own production reliability of the data platform; reduced incidents by 80% through quality standards, automated testing, and proactive monitoring
  • Implement automated alerting in Databricks with Microsoft Teams integration for real-time incident response and on-call routing
  • Establish observability, support, and incident-management practices, embedding operational ownership across the team

AI & Innovation:

  • Champion AI adoption across the organisation; co-organised internal hackathon to accelerate AI literacy and experimentation
  • Enable teams to build agentic AI workflows for the software development lifecycle, improving engineering productivity
  • Build AI-powered agentic workflows for data engineering delivery, automating repetitive tasks across the team
  • Early adopter and tester of AI platforms (Anthropic Claude, GitHub Copilot, ChatGPT), integrating them into engineering practices

Team Leadership:

  • Mentor and coach 5 data engineers (3 junior, 2 senior), developing technical growth plans and preparing them for future leadership roles
  • Redesign hiring process with coffee chats, design pattern discussions, and gamified coding assessments focused on real-world scenarios
  • Create 3-year technical development program where each engineer owns a domain (CI/CD, metadata-driven development, unit testing) to build deep expertise and accountability
  • Establish development standards: contribution guidelines, PR processes, commit conventions, IaC requirements, data flow standards, and naming conventions
  • Design 3-year team roadmap focused on sustainable growth, cross-skilling, and wellbeing through empathetic, incremental transformation

Data & Analytics Manager | PwC | Nov 2021 - Aug 2024 | London, UK

Led data transformation programs across Insurance and Transport sectors within a large matrix organisation, architecting solutions using Databricks, Azure Data Factory, and Snowflake. Engaged directly with client C-suite stakeholders (CDOs, CTOs, programme sponsors) to shape data strategies and secure executive buy-in.

Delivery Leadership at Scale:

  • Built a 25-engineer data engineering capability from the ground up: defined the operating model, governance, staffing frameworks, hiring pipeline, and engineering direction; reduced bench by 30%
  • Led team modernisation: recruited engineers, defined training roadmaps, shifted from quick solutions (Alteryx, Excel, Power BI) to scalable cloud architectures using Databricks and Azure Data Factory
  • Consulted on all data transformation projects across multiple sectors and portfolios; supported client designs, pitches, and internal peer reviews; chaired the steering group
  • Coached engineers across consulting lines on technical growth and delivery excellence
  • Influenced client executives on data platform strategy, target operating models, and investment cases

Technical & Product Delivery:

  • Implemented data mesh architecture principles, enabling domain-oriented data ownership and decentralised delivery
  • Built a sales reporting data product from the ground up, demonstrating the data product operating model
  • Designed and implemented Azure Data Factory ingestion pipelines as a core delivery pattern across regulated client engagements
  • Architected ingestion frameworks supporting batch and API-based patterns with downstream streaming conversion
  • Delivered scalable solutions on Databricks (Delta Lake, PySpark, Unity Catalog) and Snowflake balancing customer needs, technical debt, and time-to-market
  • Developed reusable data engineering and DevOps accelerators adopted by practitioners across the organisation

Impact:

  • Grew the practice from 0 to 25 engineers
  • Established community of practice with 50+ active members
  • Delivered client engagements across Insurance, Transport, and other regulated industries

Data Science & Insights Lead | UNHCR (UN Refugee Agency) | Apr 2018 - Dec 2020 | Beirut, Lebanon

Built analytics applications and call center tools for humanitarian operations across the country.

Product Development in Challenging Environments:

  • Designed and shipped custom analytics applications and call center tools to gather data from refugee communities
  • Streamlined reporting workflows and customized outputs, increasing data accessibility for leadership and external partners
  • Increased call center case processing from 5 cases per week to hundreds per day, including life-threatening emergencies
  • Doubled call center funding through improved data transparency and impact reporting

Technical Delivery:

  • Built automated dashboards and visualizations using Python, Tableau, and Power BI to accelerate decision-making
  • Performed advanced analysis of multi-channel data using statistical modeling and machine learning techniques
  • Delivered an end-to-end data solution for collecting, analysing, and reporting humanitarian emergencies

Impact:

  • Reduced report generation time from days to hours
  • Enabled faster response to natural disasters and crises through real-time data insights
  • Improved donor transparency, contributing to increased funding

Earlier Career | Nov 2006 - Apr 2018 | Beirut, Lebanon

  • Built fintech products (digital wallets, payment gateways, transaction systems) using Java, Kotlin, and Spring Boot with reactive microservices; integrated with Mastercard, Visa, and banking APIs
  • Migrated legacy systems to SOA frameworks using Oracle middleware; designed integration solutions with microservices and service buses
  • Developed HR applications for personnel management, leave tracking, and payroll; delivered reporting and data visualizations for auditing
  • Built full-stack web applications with PHP, MySQL, PostgreSQL, and REST APIs

Education

Master of Business Administration (MBA) | 2022-2023

Augment Business School, UK
Focus: Business Strategy, Marketing, Leadership, Innovation

MA in Data, Culture and Society | 2020-2021

University of Westminster, UK
Focus: Data & Society, Research Methods, Database Design, Data Visualisation

PGCert in Data Science, Innovation and Technology | 2019-2020

The University of Edinburgh, UK
Focus: Data Science, Machine Learning, Data Visualisation

BSc in Management Computer Information Systems | 2008-2011

Saint Joseph University of Beirut
Focus: Software Design, Business Management, Human-Machine Interaction

Languages

English: Fluent | French: Fluent | Mandarin Chinese: Pre-Intermediate

Pinned

  1. work_samples (Public)

    Jupyter Notebook

  2. CD40103 (Public)

    This project demonstrates interfacing the CD40103 8-bit binary 8-stage down counter with Arduino.

    C++

  3. clickhouse-test-project (Public)

    This project is a simple test run to try ClickHouse.

    Shell

  4. esp32-uart-gps (Public)

    Rust

  5. jokteur/python_communism (Public archive)

    A module for initiating the communist revolution in each of our python modules

    Jupyter Notebook