
Master's Thesis Topics at CWI DIS

Publication date: 2025-05-06

The Distributed & Interactive Systems group at CWI has new open positions for motivated students who would like to work on their Master's thesis as an internship in the group. Topics include human-computer interaction, artificial intelligence, cognitive (neuro-)science, and/or interaction design. Keep reading for more information about research topics, requirements, and contact information.

How to apply

  • Send your application to Pablo Cesar (P.S.Cesar@cwi.nl) and the responsible person
  • In your email, please include: (a) your recent CV, (b) current academic transcripts, and (c) a brief motivation for why you want to work on a given or self-defined topic
  • Note: A stipend is possible, provided your current GPA is greater than or equal to 8

Further info

Topics


Evaluating the Perceptual Quality of Volumetric Content in 6 Degrees of Freedom (6DoF) VR Environments

Responsible person: Xuemei Zhou (Xuemei.Zhou@cwi.nl)

Description

As VR applications continue to grow, especially in areas like telepresence and immersive storytelling, the demand for high-quality volumetric content increases. Different representations of humans, such as point clouds, polygonal meshes, 3D Gaussian Splatting (3DGS), and computer-generated avatars, offer distinct visual and computational trade-offs. However, there is currently no standardized or comprehensive methodology to evaluate the perceptual quality of these representations, particularly in dynamic, real-time, 6DoF VR scenarios. This gap is even more significant for real-time point cloud systems. The research question is: which representation of 3D volumetric content (point cloud, mesh, 3DGS, or computer-generated avatar) delivers the highest perceptual quality in immersive 6DoF VR environments?
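
One standard building block for such an evaluation is a subjective study in which participants rate the same content rendered with each representation. Purely as an illustration (not a prescribed protocol for this project), the sketch below computes a mean opinion score and confidence interval per stimulus from ACR-style ratings:

```python
import numpy as np
from scipy import stats

def mos_with_ci(ratings, confidence=0.95):
    """Mean opinion score and confidence-interval half-width for one stimulus,
    from per-participant ratings on, e.g., a 1-5 ACR scale."""
    r = np.asarray(ratings, dtype=float)
    mos = r.mean()
    half_width = stats.t.ppf((1 + confidence) / 2, len(r) - 1) * r.std(ddof=1) / np.sqrt(len(r))
    return mos, half_width

# Example: the same sequence rendered as a point cloud vs. a mesh (ratings are made up)
print(mos_with_ci([4, 5, 3, 4, 4, 5, 3]))
print(mos_with_ci([3, 3, 4, 2, 3, 4, 3]))
```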


Multimedia Behavioural Analysis and Prediction of 6-DoF users in VR

Responsible person: Silvia Rossi (s.rossi@cwi.nl)
Website: https://www.silviarossi.nl/

Description

Immersive reality technologies, such as Virtual Reality (VR) and Extended Reality (XR) at large, have opened the way to a new era of user-centric systems, in which every aspect of the coding-delivery-rendering chain is tailored to how users interact. However, to fully realize the potential of XR systems under current network limitations, we need to optimize the system around the final user. That involves the complex problem of effectively modelling and understanding how users perceive and interact with XR spaces [1,2]. Within this framework, the student joining our group will work on machine learning/deep learning strategies to analyse users' navigation trajectories in VR spaces with 6-DoF (e.g., volumetric video content) [3]. The student will focus on the development of novel user metrics to quantify similarities among users [4], with the possibility of extending the research to more challenging tasks such as the prediction of 6-DoF trajectories.
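
As a purely illustrative starting point (the similarity metrics themselves are the subject of the thesis; see [4] for metrics developed in the group), one could compare time-aligned 6-DoF trajectories with a simple per-frame distance and cluster users hierarchically:

```python
import numpy as np
from scipy.spatial.distance import squareform
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical input: one (T, 6) array per user, rows = [x, y, z, yaw, pitch, roll],
# sampled at a common rate while all users watch the same volumetric sequence.
def trajectory_distance(a, b, w_rot=0.1):
    """Mean per-frame distance between two time-aligned 6-DoF trajectories."""
    t = min(len(a), len(b))
    pos = np.linalg.norm(a[:t, :3] - b[:t, :3], axis=1)
    rot = np.linalg.norm(a[:t, 3:] - b[:t, 3:], axis=1)
    return float(np.mean(pos + w_rot * rot))

def cluster_users(trajectories, n_clusters=3):
    """Group users with similar navigation behaviour via hierarchical clustering."""
    n = len(trajectories)
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d[i, j] = d[j, i] = trajectory_distance(trajectories[i], trajectories[j])
    z = linkage(squareform(d), method="average")   # condensed pairwise distances
    return fcluster(z, t=n_clusters, criterion="maxclust")
```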

Skills

  • Good programming skills (preferably Python or MATLAB); prior knowledge of classical machine learning models (e.g., clustering techniques, linear regression) and/or deep learning models (e.g., CNNs, RNNs, Bayesian networks)
  • Prior knowledge of Virtual Reality applications (optional)

References

  • [1] S. Rossi, A. Guedes, and L. Toni. 2021. Streaming and user behaviour in omnidirectional videos. In Immersive Video Technologies (pp. 49-83). https://discovery.ucl.ac.uk/id/eprint/10158036/1/2021_chapter_ODV.pdf
  • [2] S. Rossi, I. Viola, L. Toni, and P. Cesar. 2021. A new Challenge: Behavioural analysis of 6-DoF user when consuming immersive media. Proceedings of the IEEE International Conference on Image Processing (ICIP).
  • [3] G.K. Illahi, A. Vaishnav, T. Kämäräinen, M. Siekkinen, and M. Di Francesco. 2023. Learning to Predict Head Pose in Remotely-Rendered Virtual Reality. Proceedings of the Conference on ACM Multimedia Systems (MMSys).
  • [4] S. Rossi, I. Viola, L. Toni, and P. Cesar. 2023. Extending 3-DoF metrics to model user behaviour similarity in 6-DoF immersive applications. Proceedings of the Conference on ACM Multimedia Systems (MMSys).

Exploring Haptic Displays and Biosignal Visualization in (Social) VR

Responsible person: Abdallah El Ali (aea@cwi.nl)
Website: https://abdoelali.com/

Description

Haptic stimulation is an intrinsic aspect of sensory and perceptual experience, and is tied to several facets of experience, including cognitive, emotional, and social phenomena. The capability of haptic stimuli to evoke emotions has been demonstrated both in isolation and as a means of augmenting media. This project will build on our prior work on visualizing biosignals [1,2], and on exploring virtual agent biosignals through haptic displays (cf. [1]), to create new forms of social experiences in social VR that leverage physiological signals and body-based actuation. Depending on the exact topic and direction, this project may include a (paid) research visit to the Nara Institute of Science and Technology (NAIST) in Japan.
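
To make the idea concrete, here is a minimal sketch (illustrative only; the actuator, mapping, and parameters are all open design questions in this project) of turning a measured heart rate into a beat-synchronous drive signal for a vibrotactile or thermal actuator:

```python
import numpy as np

def heartbeat_pulse_train(hr_bpm, duration_s=10.0, rate_hz=200, pulse_ms=60, amp=0.8):
    """Turn a heart-rate value into a beat-synchronous amplitude envelope (values in
    [0, 1]) that could drive a vibrotactile or thermal actuator."""
    t = np.arange(0.0, duration_s, 1.0 / rate_hz)
    period_s = 60.0 / hr_bpm                        # one beat every period_s seconds
    phase = t % period_s
    return amp * (phase < pulse_ms / 1000.0).astype(float)

envelope = heartbeat_pulse_train(72)                # 72 bpm -> a pulse every ~0.83 s
```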

Skills

  • Required: Information visualization (sketching + prototyping); biosensors (e.g., HR, EDA, EMG); HCI research methods; quantitative and qualitative analysis; statistics
  • Recommended: Hardware prototyping (e.g., Arduino), fabrication, thermal, vibrotactile, and/or multimodal output

References

  • [1] A. El Ali, X. Yang, S. Ananthanarayan, T. Röggla, J. Jansen, J. Hartcher-O’Brien, K. Jansen, and P. Cesar. 2020. ThermalWear: Exploring Wearable On-chest Thermal Displays to Augment Voice Messages with Affect. Proceedings of the CHI Conference on Human Factors in Computing Systems (ACM CHI). https://doi.org/10.1145/3313831.3376682
  • [2] S. Lee, A. El Ali, M. Wijntjes, and P. Cesar. 2022. Understanding and Designing Avatar Biosignal Visualizations for Social Virtual Reality Entertainment. Proceedings of the CHI Conference on Human Factors in Computing Systems (ACM CHI). https://doi.org/10.1145/3491102.3517451
  • [3] A. El Ali, R. Ney, Z. M. C. van Berlo, and P. Cesar. 2023. Is that My Heartbeat? Measuring and Understanding Modality-Dependent Cardiac Interoception in Virtual Reality. IEEE Transactions on Visualization and Computer Graphics.
  • [4] A. El Ali, E.R. Stepanova, S. Palande, A. Mader, P. Cesar, and K. Jansen. 2023. BreatheWithMe: Exploring Visual and Vibrotactile Displays for Social Breath Awareness during Colocated, Collaborative Tasks. Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems (ACM CHI). https://doi.org/10.1145/3544549.3585589

Reconstructing Faces for Immersive Telepresence: HMD Removal via Crossmodal Inpainting

Responsible person: Chirag Raman (c.a.raman@tudelft.nl)
Website: https://chiragraman.com

Description

In immersive telepresence systems, head-mounted displays (HMDs) occlude key regions of the face, hiding nonverbal cues, such as dynamic expressions, that are essential for communication. This project investigates crossmodal inpainting: reconstructing occluded face regions using speech, visible lower-face motion, and temporal dynamics.

The goal is to synthesize the full 3D facial structure and plausible expressions in real time, enabling more natural avatar-mediated interactions. You will explore techniques that combine speech-driven facial animation [1, 2] with recent developments in one-shot face reenactment [3, 4], 3D shape completion [5], and temporal coherence in 3D portrait reconstruction [6]. This work has strong applications in VR telepresence, immersive broadcasting, and socially intelligent agents.
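
As a purely illustrative starting point (the architecture, input features, and output parameterization are assumptions to be revisited against [1, 2]), a speech-driven expression predictor can be as simple as a recurrent network mapping log-mel audio frames to per-frame blendshape weights for the occluded upper face:

```python
import torch
import torch.nn as nn

class AudioToExpression(nn.Module):
    """Minimal sketch: a GRU maps a window of audio features (e.g. 80-dim log-mel
    frames) to per-frame blendshape weights for the occluded upper face."""
    def __init__(self, n_mels=80, hidden=256, n_blendshapes=52):
        super().__init__()
        self.rnn = nn.GRU(n_mels, hidden, batch_first=True)
        self.head = nn.Sequential(nn.Linear(hidden, 128), nn.ReLU(),
                                  nn.Linear(128, n_blendshapes))

    def forward(self, mel):                    # mel: (batch, time, n_mels)
        h, _ = self.rnn(mel)
        return torch.sigmoid(self.head(h))     # (batch, time, n_blendshapes), in [0, 1]

mel = torch.randn(2, 100, 80)                  # 2 clips, 100 frames, 80 mel bins
weights = AudioToExpression()(mel)             # -> torch.Size([2, 100, 52])
```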

Skills

  • Required: 3D computer vision / graphics; deep learning
  • Recommended: Experience with deep learning frameworks (e.g. PyTorch, Jax); interest in generative models; familiarity with speech/face datasets and processing frameworks

References

  • [1] J. Thies, M. Elgharib, A. Tewari, C. Theobalt, and M. Nießner. 2020. Neural Voice Puppetry: Audio-driven Facial Reenactment. Proceedings of the European Conference on Computer Vision (ECCV).
  • [2] D. Cudeiro, R. Villegas, W. Luo, H. Zhou, and S. Tulyakov. 2019. Capture, Learning, and Synthesis of 3D Speaking Styles. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
  • [3] P. Tran, E. Zakharov, L.N. Ho, L. Hu, A. Karmanov, et al. 2024. VOODOO XP: Expressive One-Shot Head Reenactment for VR Telepresence. arXiv preprint.
  • [4] S. Bounareli, C. Tzelepis, V. Argyriou, I. Patras, and G. Tzimiropoulos. 2023. HyperReenact: One-Shot Reenactment via Jointly Learning to Refine and Retarget Faces. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV).
  • [5] Y. Li, H. Wu, X. Wang, Q. Qin, Y. Zhao, Y. Wang, and A. Hao. 2024. FaceCom: Towards High-fidelity 3D Facial Shape Completion via Optimization and Inpainting Guidance. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
  • [6] S. Wang, X. Li, C. Liu, M. Chan, M. Stengel, et al. 2024. Coherent3D: Coherent 3D Portrait Video Reconstruction via Triplane Fusion. arXiv preprint.

Adaptive News Interfaces Driven by Real-Time AI Literacy Detection

Responsible person: Pooja Prajod (Pooja.Prajod@cwi.nl), Abdallah El Ali (Abdallah.El.Ali@cwi.nl)
Website: https://scholar.google.nl/citations?user=HjZ2RVMAAAAJ&hl=en and https://abdoelali.com

Description

As AI-generated news becomes increasingly widespread, there is a growing demand for transparency and disclosure of AI usage in content creation [1]. However, research shows mixed findings on how much users trust AI-generated news, and this uncertainty can have important societal implications [2]. One key factor influencing trust may be the reader's level of AI literacy, specifically their ability to understand and critically evaluate AI-generated content [3]. Typically, AI literacy is measured through questionnaires or knowledge tests. However, these methods are often impractical in dynamic, real-world settings, especially for interfaces that aim to adapt to an individual user's AI literacy level in real time. This project aims to develop systems that can infer a user's AI literacy level from behavioral and physiological data during their interaction with news content, and adapt the interface in real time to better support their needs. As part of this project, you will explore one of several research directions, such as analysis (e.g., identifying indicators of AI literacy), implementation (e.g., real-time machine learning models, interface development), or designing adaptive interfaces (e.g., varying transparency levels).
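
As a rough illustration of the implementation direction (the feature names, window length, and two-level literacy label are assumptions, not project specifications), the sketch below trains a simple classifier on windowed behavioral/physiological features and maps its prediction to an interface preset:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical features per 30-second reading window: dwell time on AI-labelled
# passages, number of scroll reversals, mean heart rate, skin-conductance responses.
FEATURES = ["dwell_ai_label_s", "scroll_reversals", "hr_mean_bpm", "eda_scr_count"]

def train_literacy_model(X, y):
    """X: (n_windows, len(FEATURES)); y: 0 = lower, 1 = higher AI literacy,
    labelled offline with a questionnaire-based AI literacy score."""
    return LogisticRegression(max_iter=1000).fit(X, y)

def adapt_interface(model, window_features):
    """Map the predicted literacy level for the latest window to an interface preset."""
    p_high = model.predict_proba(np.asarray(window_features).reshape(1, -1))[0, 1]
    if p_high < 0.4:
        return {"ai_label": "detailed", "explainer_panel": True}
    if p_high < 0.7:
        return {"ai_label": "standard", "explainer_panel": False}
    return {"ai_label": "compact", "explainer_panel": False}
```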

Skills

  • Required: Programming and Application development (e.g., Python), Machine learning, Signal processing, Quantitative analysis, HCI research methods, User studies
  • Recommended: Interest in behavioral and physiological signals, Data collection, UI and UX design

References

  • [1] A. El Ali, K. Puttur Venkatraj, S. Morosoli, L. Naudts, N. Helberger, and P. Cesar. 2024. Transparent AI Disclosure Obligations: Who, What, When, Where, Why, How. Extended Abstracts of the CHI Conference on Human Factors in Computing Systems (ACM CHI).
  • [2] F. Gilardi, S. Di Lorenzo, J. Ezzaini, B. Santa, B. Streiff, E. Zurfluh, and E. Hoes. 2024. Willingness to Read AI-Generated News Is Not Driven by Their Perceived Quality. arXiv preprint.
  • [3] D.T.K. Ng, J.K.L. Leung, S.K.W. Chu, and M.S. Qiao. 2021. Conceptualizing AI Literacy: An Exploratory Review. Computers and Education: Artificial Intelligence.

Designing Meaningful, Digital-Physical Interactions for Social VR to Experience Cultural Heritage

Responsible person: Willemijn Elkhuizen (W.S.Elkhuizen@tudelft.nl), Karolina Wylezek (Karolina.Wylezek@cwi.nl)
Website: https://www.tudelft.nl/en/ide/about-ide/people/elkhuizen-ws and https://www.cwi.nl/en/people/karolina-wylezek/

Description

You will design, prototype, and test a Virtual Reality (VR) or Mixed Reality (MR) experience offering a rich encounter with a masterpiece: the ‘Visboeck’, a hand-drawn and hand-written 16th-century manuscript containing detailed descriptions of fishes and many other real and mythical ocean creatures. You will explore how this could become a shared, social VR/MR experience in which two or more people take part collectively. You will also explore how visitors might explore the book, and/or stories about its author and historical context, through hybrid digital-physical interactions. Here we envision combining a VR/MR head-mounted display with connected micro-electronics, to create, for instance, a smart environment and/or smart interactables for intuitive and meaningful interaction. This project is a collaboration with SURF and the National Library (KB).
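
One common way to wire such smart interactables into a VR/MR application is to read a microcontroller's sensor values over a serial port and forward them to the engine as OSC messages. The sketch below is only an illustration of that bridge; the port name, baud rate, host, and OSC address are placeholders, and pyserial/python-osc are just one possible stack:

```python
import serial                                   # pyserial
from pythonosc.udp_client import SimpleUDPClient

# Placeholder values: adapt port, baud rate, host, and OSC address to your setup.
arduino = serial.Serial("/dev/ttyUSB0", 115200, timeout=1)
vr_app = SimpleUDPClient("127.0.0.1", 9000)     # e.g. an OSC listener in the VR engine

while True:
    line = arduino.readline().decode(errors="ignore").strip()
    if not line:
        continue
    try:
        value = float(line)                     # e.g. a capacitive-touch or light reading
    except ValueError:
        continue
    vr_app.send_message("/interactable/touch", value)
```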

Skills

  • Experience with, or willingness to learn, how to create VR/MR experiences (self-study materials & support are available)
  • Experience with, and well versed in, creating functional prototypes with connected micro-electronics (such as in ‘Digital Interfaces’ and similar courses)
  • Affinity with the cultural heritage context and/or literary heritage

Transparent and Trustworthy Human-AI Interaction

Responsible person: Abdallah El Ali (aea@cwi.nl)
Website: https://abdoelali.com/

Description

When dealing with AI-generated or AI-edited content, AI system disclosures (such as AI labels) can influence users’ perceptions of media content [1]. For example, effective AI labels can enable viewers to immediately recognize AI’s involvement, allowing them to quickly evaluate source credibility, verify the accuracy of the content, acquire contextual knowledge, and make informed decisions around the trust and authenticity of such content.

This topic includes several sub-topics:

  1. Remote audience engagement with news using behavioral and physiological sensors: This project explores remote or in-situ sensing of audience engagement using a range of behavioral and physiological sensors. The research will focus primarily on objective measures (e.g., using computer vision for head tracking) to infer engagement with the news; a minimal sketch of such a proxy follows this list.
  2. User perceptions of human and AI news using voice assistants: This topic explores user perceptions of human and AI-generated news delivered using a range of voice assistants (see e.g., [2]).
  3. Intelligent disclosure-aware user interfaces
  4. Intelligent visualization techniques for disclosures
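
As a crude illustration of an objective engagement proxy for sub-topic 1 (webcam face detection only; the actual project would use proper head-pose estimation and additional physiological sensors), assuming OpenCV and a standard webcam:

```python
import cv2

# Haar cascade face detector shipped with OpenCV; face presence and horizontal
# drift of the face serve here as a crude per-frame attention/engagement proxy.
detector = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
cap = cv2.VideoCapture(0)

log = []                                   # (frame_idx, face_present, x_center)
frame_idx = 0
while frame_idx < 300:                     # ~10 s at 30 fps
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) > 0:
        x, y, w, h = faces[0]
        log.append((frame_idx, True, x + w / 2))
    else:
        log.append((frame_idx, False, None))
    frame_idx += 1
cap.release()
```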

These topics are in collaboration with the AI, Media, Democracy lab (https://www.aim4dem.nl/), of which the master’s student is expected to become part. More details upon request.

Skills (can differ by topic)

  • Required: sensors; computer vision; signal processing; visualization; HCI research methods; quantitative and qualitative analysis
  • Recommended: Interest in physiological and behavioral sensing; interest in journalism and news media

References

  • [1] A. El Ali, K. Puttur Venkatraj, S. Morosoli, L. Naudts, N. Helberger, and P. Cesar. 2024. Transparent AI Disclosure Obligations: Who, What, When, Where, Why, How. Extended Abstracts of the CHI Conference on Human Factors in Computing Systems (ACM CHI). https://doi.org/10.1145/3613905.3650750
  • [2] S. Rao, V. Resendez, A. El Ali, and P. Cesar. 2022. Ethical Self-Disclosing Voice User Interfaces for Delivery of News. Proceedings of the Conference on Conversational User Interfaces (ACM CUI). https://doi.org/10.1145/3543829.3544532

Exploring Digital Proximity in VR

Responsible person: Silvia Rossi (s.rossi@cwi.nl)
Website: https://www.silviarossi.nl/

Description

Virtual Reality (VR) offers immersive experiences that enable users to connect and collaborate within the same virtual space. However, improving these experiences requires a deep understanding and analysis of how social interactions happen within these virtual environments. Digital proxemics, the study of proximity in digital spaces, is one tool for analysing social experiences in VR [1]. Recent research has also shown that social conversations in VR involve complex group dynamics, where participants adapt their behaviours based on social cues and relationships within the group [2]. By leveraging these insights, we can better understand how digital proximity influences user interactions and experiences in VR, for example by combining it with behavioural user metrics such as clustering [3]. The student will explore digital proximity in VR and compare it with other user metrics to enhance our understanding of user experiences in immersive virtual environments.
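
As an illustrative starting point only (the zone thresholds come from Hall's physical-world proxemics and would likely need recalibration for VR, cf. [1]), pairwise user distances in the shared space can be mapped to proxemic zones:

```python
import numpy as np

# Hall's interpersonal distance zones (metres), a common starting point for
# digital proxemics; the exact thresholds are an assumption for VR settings [1].
ZONES = [(0.45, "intimate"), (1.2, "personal"), (3.6, "social"), (np.inf, "public")]

def proxemic_zones(positions):
    """positions: (n_users, 3) head positions in the shared virtual space.
    Returns an (n_users, n_users) array of zone labels per pair (diagonal = None)."""
    positions = np.asarray(positions, dtype=float)
    n = len(positions)
    labels = np.full((n, n), None, dtype=object)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = np.linalg.norm(positions[i] - positions[j])
            labels[i, j] = next(name for limit, name in ZONES if d <= limit)
    return labels
```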

Skills

  • Good programming skills (preferably Python); prior knowledge of classical machine learning models (e.g., clustering techniques, linear regression); prior knowledge of Virtual Reality applications (optional)

References

  • [1] J.R. Williamson, J. O’Hagan, J.A. Guerra-Gomez, J.H. Williamson, P. Cesar, and D.A. Shamma. 2022. Digital Proxemics: Designing Social and Collaborative Interaction in Virtual Environments. Proceedings of the CHI Conference on Human Factors in Computing Systems (ACM CHI).
  • [2] C. Raman, H. Hung, and M. Loog. 2022. Social Processes: Self-Supervised Meta-Learning over Conversational Groups for Forecasting Nonverbal Social Cues. Proceedings of the European Conference on Computer Vision (ECCV).
  • [3] S. Rossi, F. De Simone, P. Frossard, and L. Toni. 2019. Spherical Clustering of Users Navigating 360 Content. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (IEEE ICASSP).

Hybrid 3D Representations for Immersive Avatars: Toward Efficient Remote Embodied Interaction

Responsible person: Chirag Raman (c.a.raman@tudelft.nl)
Website: https://chiragraman.com

Description

Representing people in immersive telepresence environments requires a delicate balance between visual fidelity, interactivity, and transmission efficiency. Point clouds are flexible but noisy, meshes are efficient but may lack expressivity, and neural implicit representations offer realism but are computationally heavy [1, 2]. This project explores hybrid 3D representations that combine the strengths of multiple representations to support dynamic fidelity, real-time compression, and adaptive streaming of avatars. You will explore the integration of structured and learned representations for remote social interaction and volumetric video avatars. Building on recent work on neural localizer fields [1], this project asks: how can we intelligently switch or blend representations to optimize for communication constraints without compromising the user experience? The goal is to develop a technique that adapts the avatar representation based on bandwidth constraints and interaction context, while maintaining identity coherence and social presence. Applications include multi-user virtual conferencing, holographic broadcasting, and human-robot embodied interaction.
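
To make the switching idea concrete, here is a deliberately naive rule-based policy; the bitrate and fidelity numbers are invented placeholders, and a thesis would replace this with learned or perceptually grounded decisions:

```python
from dataclasses import dataclass

# Illustrative placeholders: per-avatar bitrate (Mbps) and a notional fidelity score
# for each representation; real values depend on capture setup, codec, and scene.
COST_MBPS = {"point_cloud": 45.0, "3dgs": 25.0, "mesh": 12.0, "parametric_avatar": 2.0}
FIDELITY = {"point_cloud": 0.90, "3dgs": 0.85, "mesh": 0.60, "parametric_avatar": 0.50}

@dataclass
class Context:
    bandwidth_mbps: float   # current estimated throughput for this avatar stream
    distance_m: float       # viewer-to-avatar distance in the shared space
    speaking: bool          # interaction context: active speakers get priority

def choose_representation(ctx: Context) -> str:
    """Pick the highest-fidelity representation that fits the bitrate budget,
    relaxing fidelity for distant, non-speaking avatars."""
    budget = ctx.bandwidth_mbps * (1.0 if ctx.speaking else 0.5)
    candidates = [r for r, c in COST_MBPS.items() if c <= budget]
    if not candidates:
        return "parametric_avatar"                 # always-available fallback
    if ctx.distance_m > 4.0 and not ctx.speaking:
        return min(candidates, key=COST_MBPS.get)  # cheapest acceptable option
    return max(candidates, key=FIDELITY.get)

print(choose_representation(Context(bandwidth_mbps=30.0, distance_m=1.5, speaking=True)))
```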

Skills

  • Required: 3D computer vision / graphics; deep learning
  • Recommended: Experience with deep learning frameworks (e.g. PyTorch, Jax); interest in generative models and neural representations; familiarity with streaming or compression techniques

References

  • I. Sárándi and G. Pons-Moll. 2024. Neural Localizer Fields for Continuous 3D Human Pose and Shape Estimation. arXiv preprint.
  • S. Peng, Y. Zhang, Y. Xu, Q. Wang, Q. Shuai, H. Bao, and X. Zhou. 2021. Neural body: Implicit neural representations with structured latent codes for novel view synthesis of dynamic humans. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition.

Virtual Avatar Co-embodiment and the Sense of Agency

Responsible person: Abdallah El Ali (aea@cwi.nl)
Website: https://abdoelali.com/

Description

Virtual avatar co-embodiment refers to situations where two or more users embody a single, shared avatar (e.g., in Virtual Reality). This offers a multi-user experience characterized by shared control over the avatar’s movement, allowing stronger bonds to form between humans at a distance. Prior experiments have shown that participants who co-embody a virtual avatar report high levels of perceived control despite lower levels of actual control [1], making it a promising method for VR-based rehabilitation and training. In this project, we will build on our prior work in which we implemented non-verbal coordination between co-embodied participants using position-aware haptic feedback [2]. Initial results showed that participants reported a lower Sense of Agency (SoA) with haptics than without. However, the type of haptics and the type of shared avatar representation could have impacted the findings. This raises questions about how best to design haptic feedback and avatar representations for convincing virtual co-embodiment.

For this project, you will start from an existing VR system [2] and explore one of several directions relating to the design, implementation, and (quantitative) analysis of haptic feedback, avatar embodiment, or a combination of both. Other directions may be considered as well. Depending on the topic and direction, this can be a collaboration with TU Delft IDE. This project may also include a (paid) research visit to the Nara Institute of Science and Technology (NAIST) in Japan.
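
For orientation only (the weighting scheme and agency measure are open questions, not the method of [2]), shared control is often implemented as a weighted blend of the two users' tracked poses, and a user's actual contribution can be roughly estimated from motion similarity:

```python
import numpy as np

def blend_pose(pose_a, pose_b, w_a=0.5):
    """Weighted average of two users' tracked positions (e.g. controller x, y, z);
    weighted blending is one common way shared avatar control is implemented."""
    return w_a * np.asarray(pose_a, dtype=float) + (1.0 - w_a) * np.asarray(pose_b, dtype=float)

def control_share(user_motion, avatar_motion):
    """Crude proxy for a user's actual control: cosine similarity between the user's
    frame-to-frame motion and the shared avatar's motion over a trial."""
    u = np.diff(np.asarray(user_motion, dtype=float), axis=0).ravel()
    v = np.diff(np.asarray(avatar_motion, dtype=float), axis=0).ravel()
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0
```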

Skills

  • Required: Electronics (e.g., Arduino), programming (e.g., Android, C#), user evaluation, quantitative analysis
  • Recommended: interest in haptic actuators and design, AR/VR, quantitative and qualitative analysis

References

  • [1] D. Kodama, T. Mizuho, Y. Hatada, T. Narumi, and M. Hirose. 2023. Effects of Collaborative Training Using Virtual Co-embodiment on Motor Skill Learning. IEEE Transactions on Visualization and Computer Graphics.
  • [2] K. Puttur Venkatraj, W. Meijer, M. Perusquía-Hernández, G. Huisman, and A. El Ali. 2024. ShareYourReality: Investigating Haptic Feedback and Agency in Virtual Avatar Co-embodiment. Proceedings of the CHI Conference on Human Factors in Computing Systems (ACM CHI).

Evaluating the Role of Generative Models in Visual Quality Assessment of Point Clouds in 6 Degrees of Freedom (6DoF)

Responsible person: Xuemei Zhou (Xuemei.Zhou@cwi.nl)

Description

Can generative models, particularly when combined with large language models (LLMs), enhance the visual quality assessment for point clouds in 6DoF environments? If yes, to what degree?

References

  • Generative Agents: Interactive Simulacra of Human Behavior
  • GenAssist: Making Image Generation Accessible
  • Going Incognito in the Metaverse: Achieving Theoretically Optimal Privacy-Usability Tradeoffs in VR
  • Where is the Boundary? Understanding How People Recognize and Evaluate Generative AI-extended Videos.
  • A Survey of AI-Generated Video Evaluation
  • Large Point-to-Gaussian Model for Image-to-3D Generation

Interaction Design; Human-Computer/AI Interaction; VR Conference, AR Theatres

Responsible Person: Moonisa Ahsan (moonisa@cwi.nl)
Website: https://www.imoonisa.com

Description

Exploring the Interaction with an AI-enabled Agent in a VR Conference Setup: This work will focus on designing and analyzing the interaction between users and an AI-enabled agent within a virtual reality (VR) conference space through voice, chat, and visual interactions. The research aims to understand how AI-driven agents can support conference participants by providing information, facilitating networking, and enhancing engagement. The study may cover design principles, user experience, and the agent’s role in creating an immersive and interactive environment for virtual events.

Exploring Design Aspects of an AR Greek Theater Play: This work will explore the design of an augmented reality (AR) experience for a Greek theater play. The study will focus on creating a dynamic, interactive experience where users can access translated subtitles and supplementary content in real time, aimed at an international audience. This research will examine how AR can enhance the theatrical experience by providing additional layers of storytelling, accessibility, and cultural context for both Greek- and non-Greek-speaking viewers.

Skills

  • Human-Computer Interaction (HCI): Familiarity with user-friendly design principles and interactions within XR.
  • Programming: Fundamental experience with Unity and Python (for AI tasks) for using and adding new components to existing code and coordinating with developers.
  • AI Basics: Understanding of how AI works, especially for virtual assistants or chatbots, to help design simple AI-powered features.
  • Usability Testing and Analysis: Ability to run simple tests with users, gather feedback, analyze the data and use it to improve AR/VR interactions
  • UX Design: Ability to create simple sketches or prototypes of AR/VR experiences, focusing on how users will interact with virtual environments.

Designing Tools for Journalistic Disclosure of AI Usage in News Production

Responsible Person: Pooja Prajod (Pooja.Prajod@cwi.nl), Abdallah El Ali (Abdallah.El.Ali@cwi.nl)
Website: https://scholar.google.nl/citations?user=HjZ2RVMAAAAJ&hl=en and https://abdoelali.com

Description

As generative AI tools become more integrated into journalism [1, 2], from drafting headlines to generating full articles, there is a growing demand for newsrooms to be transparent with their readers and disclose the usage of AI [1]. There is currently little standardization for AI usage disclosures in journalism [3]. This presents a design opportunity for developing interfaces or tools that support journalists in disclosing AI usage in a clear, trustworthy, and contextually appropriate way. This project aims to develop an AI disclosure system that helps journalists effectively communicate how AI was used in the creation of news content. This includes both the what (e.g., which parts of an article were AI-assisted) and the how (e.g., the role AI played in the editorial process). As part of this project, you will explore two key aspects: design & implementation (e.g., co-creation workshops with journalists, interface/tool development) and user studies (e.g., effectiveness of the disclosures, willingness to use).
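
As a rough illustration of what such a disclosure could capture (the field names and values are illustrative assumptions, not a proposed standard), a minimal per-article disclosure record might look like this:

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List

# Illustrative schema only: a record a newsroom CMS could attach to an article,
# capturing the "what" (section) and the "how" (role, tool, human review).
@dataclass
class AIUsage:
    section: str        # e.g. "headline", "summary", "paragraph 3"
    role: str           # e.g. "drafted", "edited", "translated", "image generation"
    tool: str           # free-text tool description
    human_review: bool  # whether a journalist reviewed and approved the output

@dataclass
class DisclosureRecord:
    article_id: str
    usages: List[AIUsage] = field(default_factory=list)

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)

record = DisclosureRecord("example-article-001")
record.usages.append(AIUsage("headline", "drafted", "in-house LLM", human_review=True))
print(record.to_json())
```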

Skills

  • Required: Programming and Application development (e.g., Python), UI design, Quantitative and Qualitative analysis, HCI research methods, User studies
  • Recommended: Interest in applications of generative AI, Data collection, Focus group interviews

References

  • [1] A. El Ali, K. Puttur Venkatraj, S. Morosoli, L. Naudts, N. Helberger, and P. Cesar. 2024. Transparent AI Disclosure Obligations: Who, What, When, Where, Why, How. Extended Abstracts of the CHI Conference on Human Factors in Computing Systems (ACM CHI).
  • [2] N. Diakopoulos. 2019. Automating the News: How Algorithms Are Rewriting the Media. Harvard University Press.
  • [3] A. Stenbom et al. 2024. Nordic AI Journalism X Utgivarna Report: AI Transparency in Journalism. Nordic AI Journalism. Available at: https://www.nordicaijournalism.com/ai-transparency