London, UK | jean [dot] boutros [at] gmail [dot] com | LinkedIn | Medium Blog | GitHub
Data engineering manager with 19 years of experience building scalable data systems and high-performing teams. Over the past 5 years, I've led data engineering teams of 3 to 25 engineers, delivering data platforms, analytics solutions, and customer-facing products across consulting, media, humanitarian, and fintech sectors.
I thrive in roles where I can build teams, shape technical roadmaps, and deliver systems that compound value over time.
- Practice Leadership & Team Building: Built data engineering teams from scratch (0 to 25 engineers); defined operating models, enterprise standards, hiring pipelines, and engineering direction across business areas
- End-to-End Platform Ownership: Own data platforms from architecture and delivery through to reliability, cost, and continuous improvement
- Executive & Stakeholder Engagement: Report to and influence C-suite (CTO, CPO); shape strategy, secure investment, and align technology with business outcomes
- Cross-Functional Collaboration: Partner with Product, Business, Architecture, and Delivery teams to shape roadmaps, resolve dependencies, and drive delivery across portfolios
- Coaching & Development: Mentor engineers across all levels; create training roadmaps and communities of practice
- Hiring & Retention: Recruit and retain high-performing engineers; build career progression frameworks
- Data Platform Engineering: Databricks Lakehouse (Delta Lake, PySpark, Unity Catalog), data mesh, data products, and analytics platforms
- Cloud Architecture: Cloud-native systems, data pipelines, dimensional models, event-driven and streaming architectures
- AI & Innovation: AI-ready platform enablement, agentic AI workflows, MLOps foundations, Responsible AI principles
- Infrastructure & DevOps: Infrastructure as Code (Terraform, Terragrunt), CI/CD with GitHub Actions, DataOps, observability
- Delivery Excellence: Kanban-based delivery with iterative shipping and continuous improvement
- Core Languages: Python (primary), PySpark, SQL, Bash
- Other Languages: TypeScript, JavaScript, Java, Kotlin
- Cloud & Infrastructure: Azure (Data Factory with self-hosted IRs, Functions, Data Lake, Event Hub), Kubernetes, Docker, Terraform, Terragrunt
- Data Platforms: Databricks (Lakehouse, Delta Lake, Spark, Unity Catalog); Snowflake (working knowledge)
- Databases: PostgreSQL, MySQL, ClickHouse, MongoDB, SQLite
- AI & Automation: OpenAI API, Anthropic Claude, GitHub Copilot, agentic AI workflows
- DevOps & Integration: GitHub Actions, CI/CD, DataOps, REST APIs, NestJS, GraphQL, Spring Boot, Microservices, Databricks Asset Bundles
End-to-end owner of the data engineering practice, managing 3 direct and 12 indirect reports across data architecture and cloud engineering. Report directly to the CPO on platform performance and partner with the CTO on technical strategy. Defined the operating model, governance, and staffing frameworks from the ground up while balancing hands-on technical leadership with strategic planning and team development.
- Define and implement the data architecture framework serving BI and analytics customers across the organisation
- Partner with the CTO, CPO, product, and business stakeholders to shape technical direction and translate requirements into scalable platform capabilities
- Establish principles and standards for data quality and processing, reducing production incidents by 80%
- Set enterprise standards for BI, data storage, pipeline design, naming conventions, and software engineering practices across multiple teams and business areas
- Review and approve architectures of solutions built by other teams to ensure scalability, reliability, and adherence to best practices
- Establish enterprise-wide engineering standards, DataOps practices, and delivery governance across the practice
- Own the Azure data platform end-to-end: Databricks Lakehouse, Data Factory, Data Lake, Event Hub, and supporting services
- Design, implement, and govern Azure Data Factory pipelines with self-hosted integration runtimes for hybrid on-prem and cloud sources; set ADF standards adopted across teams
- Engineer automated CI/CD pipelines with GitHub Actions, increasing deployment frequency by 3x and eliminating manual release errors
- Implement Infrastructure as Code (IaC) using Terraform/Terragrunt for version-controlled, repeatable Azure infrastructure
- Standardise data governance, access control, and security using Databricks Unity Catalog and Delta Lake as the foundational Lakehouse architecture
- Build data processing applications in Databricks with PySpark and Python, reducing operational costs by ~50% and enabling the business to identify and close revenue leakage points
- Design and implement an ingestion framework supporting API-based, batch, and streaming patterns, converting all data downstream to a unified streaming model
- Establish data contracts through AI agents: defining contracts in markdown and enabling consumers to track changes and apply them without human intervention
- Own production reliability of the data platform; reduced incidents by 80% through quality standards, automated testing, and proactive monitoring
- Implement automated alerting in Databricks with Microsoft Teams integration for real-time incident response and on-call routing
- Establish observability, support, and incident-management practices, embedding operational ownership across the team
- Champion AI adoption across the organisation; co-organised an internal hackathon to accelerate AI literacy and experimentation
- Enable teams to build agentic AI workflows for the software development lifecycle, improving engineering productivity
- Build AI-powered agentic workflows for data engineering delivery, automating repetitive tasks across the team
- Early adopter and tester of AI platforms (Anthropic Claude, GitHub Copilot, ChatGPT), integrating them into engineering practices
- Mentor and coach 5 data engineers (3 junior, 2 senior), developing technical growth plans and preparing them for future leadership roles
- Redesign hiring process with coffee chats, design pattern discussions, and gamified coding assessments focused on real-world scenarios
- Create a 3-year technical development programme where each engineer owns a domain (CI/CD, metadata-driven development, unit testing) to build deep expertise and accountability
- Establish development standards: contribution guidelines, PR processes, commit conventions, IaC requirements, data flow standards, and naming conventions
- Design a 3-year team roadmap focused on sustainable growth, cross-skilling, and wellbeing through empathetic, incremental transformation
Led data transformation programmes across Insurance and Transport sectors within a large matrix organisation, architecting solutions using Databricks, Azure Data Factory, and Snowflake. Engaged directly with client C-suite stakeholders (CDOs, CTOs, programme sponsors) to shape data strategies and secure executive buy-in.
- Built a 25-engineer data engineering capability from the ground up: defined the operating model, governance, staffing frameworks, hiring pipeline, and engineering direction; reduced bench by 30%
- Led team modernisation: recruited engineers, defined training roadmaps, shifted from quick solutions (Alteryx, Excel, Power BI) to scalable cloud architectures using Databricks and Azure Data Factory
- Consulted on all data transformation projects across multiple sectors and portfolios; supported client designs, pitches, and internal peer reviews; chaired the steering group
- Coached engineers across consulting lines on technical growth and delivery excellence
- Influenced client executives on data platform strategy, target operating models, and investment cases
- Implemented data mesh architecture principles, enabling domain-oriented data ownership and decentralised delivery
- Built a sales reporting data product from the ground up, demonstrating the data product operating model
- Designed and implemented Azure Data Factory ingestion pipelines as a core delivery pattern across regulated client engagements
- Architected ingestion frameworks supporting batch and API-based patterns with downstream streaming conversion
- Delivered scalable solutions on Databricks (Delta Lake, PySpark, Unity Catalog) and Snowflake balancing customer needs, technical debt, and time-to-market
- Developed reusable data engineering and DevOps accelerators adopted by practitioners across the organisation
- Grew the practice from 0 to 25 engineers
- Established a community of practice with 50+ active members
- Delivered client engagements across Insurance, Transport, and other regulated industries
Built analytics applications and call centre tools for humanitarian operations across the country.
- Designed and shipped custom analytics applications and call centre tools to gather data from refugee communities
- Streamlined reporting workflows and customised outputs, increasing data accessibility for leadership and external partners
- Increased call centre case processing from 5 cases per week to hundreds per day, including life-threatening emergencies
- Doubled call centre funding through improved data transparency and impact reporting
- Built automated dashboards and visualisations using Python, Tableau, and Power BI to accelerate decision-making
- Performed advanced analysis of multi-channel data using statistical modelling and machine learning techniques
- Delivered an end-to-end data solution for collecting, analysing, and reporting humanitarian emergencies
- Reduced report generation time from days to hours
- Enabled faster response to natural disasters and crises through real-time data insights
- Improved donor transparency, contributing to increased funding
- Built fintech products (digital wallets, payment gateways, transaction systems) using Java, Kotlin, and Spring Boot with reactive microservices; integrated with Mastercard, Visa, and banking APIs
- Migrated legacy systems to SOA frameworks using Oracle middleware; designed integration solutions with microservices and service buses
- Developed HR applications for personnel management, leave tracking, and payroll; delivered reporting and data visualisations for auditing
- Built full-stack web applications with PHP, MySQL, PostgreSQL, and REST APIs
Master of Business Administration (MBA) | 2022-2023
Augment Business School, UK
Focus: Business Strategy, Marketing, Leadership, Innovation
MA in Data, Culture and Society | 2020-2021
University of Westminster, UK
Focus: Data & Society, Research Methods, Database Design, Data Visualisation
PGCert in Data Science, Innovation and Technology | 2019-2020
The University of Edinburgh, UK
Focus: Data Science, Machine Learning, Data Visualisation
BSc in Management Computer Information Systems | 2008-2011
Saint Joseph University of Beirut
Focus: Software Design, Business Management, Human-Machine Interaction
English: Fluent | French: Fluent | Mandarin Chinese: Pre-Intermediate



