Talk
Building recommendation systems for e-commerce marketplaces
Data Engineering & MLOps
Data Product Management
Recommendation and Personalization
Bongani Shongwe
Senior Data Engineer at Adevinta
Bongani is a data engineer working in personalization and recommendations for Adevinta's central team for the online classified markets across Europe and North America. He has a background as a full-stack software engineer but swapped to Data Engineering as it allowed him to follow his passion for large-scale distributed systems and related topics. Bongani obtained his MSc in Computer Science from the University of the Witwatersrand in 2014, focusing on peer-to-peer infrastructures for online gaming.
Talk
Creating a large dataset for pretraining LLMs
Deep Learning
Generative AI
NLP
Guilherme Penedo
ML Research Engineer at HuggingFace
With a background in Aerospace Engineering, Guilherme entered the ML world as a Research Engineer at a French startup called LightOn. While at LightOn, he was part of the Falcon team and was in charge of creating the pretraining dataset for the Falcon LLM: the RefinedWeb dataset. After working on the Falcon project, he joined HuggingFace, where he maintains the open-source data processing library "datatrove" and works on improving pretraining datasets as a member of the HuggingFace Science Team.
Talk
Vector Databases (pg_vector) and Real-Time Analytics
Advanced Analytics
Generative AI
NLP
Hubert Dulay
Developer Advocate at StarTree
Hubert Dulay is an O'Reilly author of "Streaming Data Mesh" and "Streaming Databases" (early access). He is a veteran engineer with over 20 years of experience in big and fast data. Hubert draws on his experience consulting for many financial institutions, healthcare organizations, and telecommunications companies, where he provided simple solutions to many data problems. With the massive growth in ML/AI, Hubert helps data practitioners understand how existing data pipelines can be adapted for ML/AI.
Talk
Beyond text: Exploring the world with Large Multimodal Models
Advanced Analytics
NLP
Computer Vision
Jules Talloen
Senior Machine Learning Engineer at ML6
As a full-stack ML engineer, Jules is interested in the entire development process, from state-of-the-art research to data pipelines and web development. To stay at the forefront of innovation, he continuously monitors the latest research and enables applications built on top of the latest advancements. Using his technical expertise and drive for perfectionism, Jules translates business needs into high-quality, scalable, and performant data processing solutions.
Talk
First Principles Thinking in real world Machine Learning
Data Career Skills
Data Product Management
Data Team Leadership
Pedro Azevedo
Machine Learning Analyst at Adidas
Pedro Azevedo is a mechanical engineering graduate from the University of Aveiro who researched Autonomous Driving Assistance Systems (ADAS) in the Laboratory of Automation and Robotics at the Department of Mechanical Engineering. After completing his master's degree, Pedro transitioned to the industry, where he worked on real-time ML applied to the manufacturing industry and now works on large-scale machine learning systems, developing end-to-end ML pipelines through use cases in world-class fashion brands such as Adidas.
Talk
Policy as code: Automating compliance with Data Mesh
Data-Centric
Data Engineering & MLOps
Data Product Management
Responsible AI
Shawn Kyzer
Associate Director of Data Engineering at AstraZeneca
Shawn is an innovative technologist and data strategist with over 15 years of experience. He is passionate about harnessing data strategy, engineering, and analytics to help businesses uncover new opportunities. His holistic view of technology and emphasis on developing strong engineering talent, focusing on delivering outcomes while minimizing outputs, sets him apart. Shawn's deep technical knowledge includes distributed computing, cloud architecture, data science, machine learning, and engineering analytics platforms. He worked as a consultant for prestigious clients, from secret-level government organizations to Fortune 500 companies.
Talk
High Mileage of Low Rank Adapters
Data Engineering & MLOps
Deep Learning
NLP
Shikhar Chauhan
Data Scientist at KBC Bank NV
Shikhar Chauhan has worked with Deep Learning for images and text throughout his entire career, from solving fake news with AI to automating electronic waste recycling to automating document processing for a bank and optimizing internal processes using GenAI. Shikhar has also been a contributor to Transformers, Torchvision, and spaCy. He loves to share knowledge and keep up to date with the industry and the latest in the field. Shikhar is fascinated by NLP, Images, and Multi-Modal AI.
Talk
DGCF: Deep Graph-Powered Recommendations
Deep Learning
Recommendation and Personalization
Sofia Bourhim
Research Scientist at ENSIAS
Sofia Bourhim is a Research Scientist working on Graph Machine Learning and its interdisciplinary applications to AI for Good problems. She recently obtained her Ph.D. in the field of AI. She previously interned at the Microsoft research lab (MARI) as a research intern and is a recipient of the Microsoft PhD Fellowship. Sofia has received recognition for best research papers at various conferences and at notable events such as NeurIPS'23 and Gitex'23.
Talk
Unlocking LLMs: From forgiving errors to ethical considerations in domain adaptation
Domain-specific applications of AI
NLP
Responsible AI
Vivek Kumar
Sr Researcher/Scientist at Universität der Bundeswehr München
Vivek Kumar is a Senior Researcher/Scientist and the Chair of Open Source Intelligence at the University of Federal Armed Forces under the Federal Ministry of Defence, Germany. He coordinates STELAR, an EU Horizon 2020 program targeting "Artificial Intelligence, Data, and Robotics" in agri-food. Dr. Kumar won the Marie Sklodowska-Curie fellowship in 2019 for the EU Horizon 2020 ITN network program PhilHumans and worked with Philips Research, Netherlands, and the University of Cagliari, receiving his doctorate in 2023. His current research focuses on NLP, Knowledge Graphs, LLMs, and AI Fairness & Bias.
Talk
The ML monitoring flow for models deployed to production
Data Engineering & MLOps
Explainability
Responsible AI
Wojtek Kuberski
Co-Founder and CTO at NannyML
Wojtek Kuberski is an AI professional and entrepreneur with a master's in AI from KU Leuven. He founded Prophecy Labs, a consultancy specializing in machine learning, before getting his current role as a co-founder and CTO of NannyML. NannyML is an open-source library for ML monitoring. At NannyML, he leads the research and product teams, contributing to novel algorithms in the model monitoring space. Wojtek was featured in the Breakout Session at Web Summit for "some of the world's most exciting early-stage startups".

The emerging festival for all data practitioners.

We could claim to be the best data conference in Europe, but we can do better because we're a festival. Join us this September for the only data event you need to attend this year.

About the Event

What is Data Makers Fest?

Data Makers Fest is a festival dedicated to all data makers. For our first edition, we brought data professionals and enthusiasts from all fields together in Porto to learn, network, and make things happen. Here's what we're cooking up for this year's edition.

>1000 Attendees
3 Stages
50 Talks
6 Hands-on Tutorials
2 Conference days
>30 Exhibitors
Join Us

Reasons to join Data Makers Fest

Access lots of high-quality content
Not the commercial stuff we’ve all seen too much of by now, but the things that really matter.
It’s for the “real” data makers
You know, the ones who actually work with data and not only pretend to.
Meet worldwide data experts
An opportunity to informally network with experts, colleagues, and companies.
It’s a festival!
Not another boring conference. Expect beer, music, fun, and games between the talks.
TOPICS

Explore hot topics in data and AI

Embark on a journey into the future of AI and data with dynamic sessions covering cutting-edge topics such as Responsible AI implementation, advancements in Natural Language Processing, and the transformative power of Generative AI. Join us at Data Makers Fest 2024 to learn, explore new possibilities, and participate in exciting discussions on all the trendiest data and AI topics.

Advanced Analytics
Explainability
Data Career Skills
Forecasting
Data-Centric
Generative AI
Data Engineering & MLOps
Information Retrieval
Data Product Management
NLP
Data Team Leadership
Recommendation and Personalization
Data Visualization
Responsible AI
Deep Learning
Domain-specific applications of AI
SPEAKERS

Meet our speakers

Take a sneak peek into the lineup for Data Makers Fest 2024. We will be announcing more names very soon. If you're interested in giving a talk and think you have what it takes, click the button below to read more about the Call for Speakers.

Bongani Shongwe
Senior Data Engineer at Adevinta
Guilherme Penedo
Machine Learning Research Engineer at HuggingFace
Hubert Dulay
Developer Advocate at StarTree
Jules Talloen
Senior Machine Learning Engineer at ML6
Pedro Azevedo
Machine Learning Analyst at Adidas
Shawn Kyzer
Associate Director of Data Engineering at AstraZeneca
Shikhar Chauhan
Data Scientist at KBC Bank NV
Sofia Bourhim
Research Scientist at ENSIAS
Vivek Kumar
Sr Researcher/Scientist at Universität der Bundeswehr München
Wojtek Kuberski
Co-Founder and CTO at NannyML
Be the first to know about Data Makers Fest 2024 speakers
PARTNERS

Meet our partners

From market-leading companies to innovative startups, our partners fuel our journey with unparalleled support and collaboration.
As you can see, there's still room for you.

Builders
Supporter
Communication Partners
Become a Partner

Partner with us!

Partner with Data Makers Fest side by side with market-leading companies. Check our previous partners here and our current ones below.

Find out the corporate opportunities we have for this edition and tailor your plan to your needs. We want you to join the Fest!
