Master's thesis in Data & AI: Scaling a Machine Learning Data Quality framework for Generalizability
Challenging assignment with monthly compensation of €1000, €500 + lease car, or €600 + housing, plus professional guidance, training sessions, knowledge events, brainstorming with colleagues and 2 vacation days per month.
We usually respond within three days
We know that the quality of a training dataset is an important indicator of model performance, but how well does a data quality framework developed on a simple machine learning task generalize to real-world ML scenarios? In this thesis, you’ll extend and test an existing framework on diverse datasets, models and tasks, including LLMs. You’ll explore new quality dimensions and benchmark generalizability, building towards a practical tool that helps teams evaluate and improve their data in complex AI pipelines.
💡Areas of Interest: data quality, machine learning, LLMs, statistics, data science
The impact of data quality on machine learning performance is well established, yet most frameworks are tested only in limited, controlled environments. Previously, we developed an automatic data quality framework (Automatic Assessment of Dataset Quality for ML). It showed promising results by quantifying data quality across three core dimensions (completeness, consistency and accuracy), using synthetic data and a small set of machine learning models.
However, real-world systems operate in far more varied and complex contexts. Today’s AI models range from classical algorithms to advanced LLMs, and datasets span structured tables, text, sensor streams, and more. Without validating how such a framework performs across these environments, its insights remain confined to the lab. The question is not just "does it work?" but "how well does it generalize?"
The Assignment
This thesis explores the generalizability of the developed data quality assessment framework across a wide spectrum of machine learning use cases. You will:
· Extend dataset coverage using both real-world and synthetic data from diverse domains (e.g., healthcare, finance, social media, e-commerce, public benchmarks).
· Diversify task types, including classification, regression and clustering.
· Broaden algorithmic scope by comparing a range of machine learning models.
· Evaluate the role of LLMs in assessing data quality across different data domains.
· Compare model responses to varying data quality degradations across model sizes and architectures.
· Add quality dimensions such as uniqueness, timeliness, accessibility, believability and statistical measures.
· Benchmark generalizability by measuring the framework’s reliability across tasks, models, and datasets.
The final deliverable is an empirically validated, modular extension of the original framework that can guide users towards improving their datasets for machine learning.
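To give a feel for what quantifying a quality dimension and applying a controlled degradation can look like, here is a minimal sketch. It is not the scoring method of the existing framework, which is not described in this posting. The function names (`completeness`, `uniqueness`, `degrade_completeness`) and the toy records are hypothetical, and the definitions used (share of filled cells, share of distinct rows) are simple assumptions chosen for illustration.

```python
import random


def completeness(rows):
    """Fraction of cells that are not None (assumed definition)."""
    cells = [value for row in rows for value in row.values()]
    return sum(value is not None for value in cells) / len(cells)


def uniqueness(rows):
    """Fraction of rows that are distinct (assumed definition)."""
    distinct = {tuple(sorted(row.items())) for row in rows}
    return len(distinct) / len(rows)


def degrade_completeness(rows, rate, seed=0):
    """Return a copy of rows where each cell is set to None with probability `rate`."""
    rng = random.Random(seed)  # seeded so a degradation run is repeatable
    return [
        {key: (None if rng.random() < rate else value) for key, value in row.items()}
        for row in rows
    ]


# Hypothetical toy dataset, for illustration only.
rows = [
    {"age": 34, "city": "Utrecht"},
    {"age": 51, "city": "Mechelen"},
    {"age": 34, "city": "Utrecht"},  # exact duplicate of the first row
    {"age": 27, "city": None},       # one missing value
]

print(completeness(rows))  # 7 of 8 cells are filled
print(uniqueness(rows))    # 3 of 4 rows are distinct
```

In a generalizability study, a degradation such as `degrade_completeness` would typically be applied at several rates, after which both the quality score and the downstream model performance are recorded for each dataset, task and model.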
About Info Support
Info Support specializes in custom software, data/AI solutions, management, and training, and is active in the Finance, Industry, Agriculture, Food & Retail, Mobility & Public, and Healthcare sectors. We provide solid and innovative solutions for complex and critical software issues. Our headquarters are located in Veenendaal (NL) and Mechelen (BE). Info Support currently employs approximately 500 people.
Info Support's working method is characterized by a number of core values: solidity, integrity, craftsmanship, and passion. These core values are intertwined in our work and the way we interact with each other.
To ensure that all employees are always up to date with the latest developments, Info Support has an in-house knowledge center that eagerly satisfies the hunger for more or different knowledge and skills.
B2 language proficiency in Dutch is required.
- Department: Student Master
- Role: Data & AI
- Locations: Info Support Nederland
- Remote status: Hybrid
Why graduate with Info Support?
- 🧑‍🏫 Engaged guidance
 » Personal mentors
 » Weekly sessions with experts
 » Training and knowledge-sharing evenings
- 💰 Choose your compensation p/m
 » €1000 compensation
 » €500 + a lease car
 » €600 + living space
- ⚖️ Flexibility & balance
 » Hybrid working
 » Flexible working hours
 » Sole focus on your graduation
Growing in an environment full of knowledge and joy
- 🌞 Welcoming company culture
 » An informal and open atmosphere
 » You’re part of the team from day one
 » Weekly knowledge-sharing sessions
 » Engaging community events
 » An unforgettable New Year’s party!
- ❤️ Passion for IT & Craftsmanship
 » Colleagues with a true passion for their craft
 » Learn from teammates who love to share their knowledge
 » Work alongside experts who challenge and inspire you
- 🌱 Room to grow
 » Graduating is the starting point of your career
 » Opportunity to seamlessly transition into a job after graduation
 » Clear development paths and growth opportunities
Your journey to Info Support
- 🖥️ Digital introduction: During the digital introduction, you'll share who you are and what you're looking for. We'll tell you more about who we are and what we can offer you. That way, we can discover together whether there's a connection.
- 🔍 Online assessments: Through two short online assessments, we gain a clear picture of who you are and what you're capable of. They cover your personality and motivations, as well as your technical knowledge.
- 🏢 Meeting at our office: Based on the assessments, we gain insight into your profile. We’ll discuss your personality, have a sparring session with a fellow professional, and take the time to truly get to know the person behind the results.
- ✍️ Finishing touches: After the interview, we’ll fine-tune the assignment and make the right match. This way, we lay the foundation for a successful collaboration. The final step is a personal signing moment with our director.