
Master's Thesis Topics available at DIS

Publication date: 11-04-2019

The Distributed & Interactive Systems group at CWI has new open positions for motivated students who would like to work on their Master’s thesis as an internship in the group. Topics include smart textiles, activity recognition, physiological sensing, virtual reality, point clouds, the Internet of Things, and web technologies. Keep reading for more information about research topics, requirements and contact information.

TextileEmotionSensing: Exploring smart textiles for emotion recognition in mobile interactions

Contact: Abdallah El Ali (abdallah.el.ali@cwi.nl)

Can the clothes we wear or the chairs we sit on know how we’re feeling? In this project, you will explore a range of smart textiles (e.g., capacitive pressure sensors) for emotion recognition in mobile interactions. Should the textile be embedded in a couch? Should it be attached to the user? Can we robustly detect affective states such as arousal, valence, joy, anger, etc.? This project will require knowledge and know-how of embedded systems, and the use of fabrication techniques for embedding sensors in such fabrics. It will involve running controlled user studies to collect (and later analyze) such biometric data.
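
As a rough sketch of the analysis side, the snippet below turns windows of pressure readings from a textile sensor matrix into simple statistical features and feeds them to an off-the-shelf classifier. The 8x8 sensel layout, sampling rate, windowing, and binary arousal labels are illustrative assumptions, not project requirements.

    # Illustrative sketch only: window-level features from a textile pressure-sensor
    # matrix, fed to a standard classifier. Sensor layout, sampling rate, and labels
    # are assumptions, not project specifications.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    FS = 50            # assumed sampling rate (Hz)
    WINDOW = 5 * FS    # 5-second analysis windows

    def window_features(window):
        """Simple statistics over a (samples x sensels) pressure window."""
        return np.concatenate([
            window.mean(axis=0),                            # mean pressure per sensel
            window.std(axis=0),                             # variability per sensel
            np.abs(np.diff(window, axis=0)).mean(axis=0),   # movement/fidgeting proxy
        ])

    # Placeholder data: 200 windows from an assumed 8x8 sensel mat, binary arousal labels.
    rng = np.random.default_rng(0)
    raw = rng.normal(size=(200, WINDOW, 64))
    labels = rng.integers(0, 2, size=200)   # e.g., low vs. high arousal

    X = np.stack([window_features(w) for w in raw])
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    print("CV accuracy:", cross_val_score(clf, X, labels, cv=5).mean())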

Skills:

Required:

  • Embedded systems, signal processing, interest in human computer interaction and fabrication

Recommended:

  • Applied machine learning

Literature:

ThermalEmotions: Exploring thermal cameras for pose-invariant emotion recognition while mobile

Contact: Abdallah El Ali (abdallah.el.ali@cwi.nl)

Thermal cameras have the unique advantage of being able to capture thermal signatures (heat radiation) from energy-emitting entities. Previous work has shown the potential of such cameras for cognitive load estimation, even under high pose variance. In this project, you will explore (using a standard computer vision approach) the potential of mobile (FLIR) (or possibly higher-resolution) thermal cameras for pose-invariant emotion recognition while users are mobile. The idea is that the emotional signature on a user’s face, coupled with standard facial expression features, allows such recognition even under conditions that are challenging for computer vision. You will use or collect a mobile thermal face dataset, and explore different state-of-the-art (SOTA) deep neural network architectures to perform such supervised emotion classification.
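
As a starting point, a minimal (and deliberately small) convolutional network for classifying single-channel thermal face crops could look like the sketch below; the input resolution, number of emotion classes, and absence of a data loader are assumptions for illustration only.

    # Minimal sketch, not the project's architecture: a small CNN over single-channel
    # thermal face crops for discrete emotion classification.
    import torch
    import torch.nn as nn

    class ThermalEmotionNet(nn.Module):
        def __init__(self, n_classes: int = 6):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),
            )
            self.classifier = nn.Linear(64, n_classes)

        def forward(self, x):
            # x: (batch, 1, H, W) thermal crops, e.g., 64x64 after face detection
            return self.classifier(self.features(x).flatten(1))

    model = ThermalEmotionNet()
    dummy = torch.randn(8, 1, 64, 64)           # placeholder batch
    print(model(dummy).shape)                   # -> torch.Size([8, 6])
    criterion = nn.CrossEntropyLoss()           # standard supervised classification loss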

Skills:

Required:

  • Computer vision, machine learning

Recommended:

  • Human computer interaction

Literature:

Developing and comparing emotion/mood induction methods across domains

Contact: Abdallah El Ali (abdallah.el.ali@cwi.nl)

Affect is a fundamental aspect of internal and external human behavior and processes. While much research has been done on eliciting emotions, it remains a challenge to determine which method(s) are most effective for inducing emotions, and in which contexts. In this project, you will design, develop, and test different affect induction procedures across a range of contexts that involve HMDs, driving simulators, and/or smartphone interaction. Techniques can be visual, auditory, or haptic, but may also explore newer approaches such as electrical muscle stimulation.
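
One small but concrete piece of such an experiment design is counterbalancing the order in which induction conditions are presented across participants. The sketch below generates per-participant orders from a Williams-style (balanced) Latin square; the four condition labels are placeholders.

    def balanced_latin_row(participant: int, n: int) -> list:
        """Row of a Williams-style Latin square (even n): each condition appears once
        per serial position across participants and first-order carryover is balanced."""
        row, j, h = [], 0, 0
        for i in range(n):
            if i < 2 or i % 2 != 0:
                val, j = j, j + 1
            else:
                val, h = n - h - 1, h + 1
            row.append((val + participant) % n)
        return row

    conditions = ["film clip", "music", "haptic", "EMS"]   # placeholder induction methods
    for p in range(len(conditions)):
        order = [conditions[k] for k in balanced_latin_row(p, len(conditions))]
        print(f"participant {p}: {order}")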

Skills:

Required:

  • Programming, hardware prototyping (e.g., Arduino), experiment design (controlled, field), statistics, interest in human computer interaction

Literature:

Comparing AR/VR emotion capture methods for real-time continuous emotion annotation

Contact: Abdallah El Ali (abdallah.el.ali@cwi.nl)

Emotion recognition has moved away from the desktop and on to virtual environments. This requires collecting ground truth labels in such settings. This project asks: how can we continuously annotate how we are feeling while immersed in a mixed or fully virtual environment? In which scenarios does this work, and in which does it not? Can we leverage gaze, head movement, and other non-verbal input methods? This project will require prototyping emotion input techniques and evaluating them in AR or VR environments to ensure high usability and high quality of the collected ground truth data.
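
A very small part of such a prototype, sketched below, is the logging side: sampling a continuous one-dimensional annotation (e.g., valence from a controller axis or a gaze-driven slider) with timestamps, then resampling it onto a uniform timeline for later analysis. The read_annotation_input() hook is hypothetical and would be replaced by the actual AR/VR input technique.

    # Illustrative sketch: log a continuous 1-D annotation stream (e.g., valence in
    # [-1, 1]) with timestamps, then resample it to a fixed rate for alignment with
    # stimulus or sensor data. The input hook below is hypothetical.
    import time
    import numpy as np

    def read_annotation_input() -> float:
        """Hypothetical device hook; replace with the AR/VR input technique under test."""
        return 0.0

    def record(duration_s: float = 10.0, rate_hz: float = 20.0):
        samples, t0 = [], time.time()
        while (now := time.time()) - t0 < duration_s:
            samples.append((now - t0, read_annotation_input()))
            time.sleep(1.0 / rate_hz)
        return np.array(samples)            # columns: time, annotation value

    def resample(samples, rate_hz: float = 10.0):
        t, v = samples[:, 0], samples[:, 1]
        grid = np.arange(0.0, t[-1], 1.0 / rate_hz)
        return grid, np.interp(grid, t, v)  # uniform timeline for cross-participant analysis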

Skills:

Required:

  • Programming/prototyping, controlled user studies, statistics

Recommended:

  • Unity

Literature:

AffectMaps: quantifying the mood of urban city locations with wearable physiological sensing

Contact: Abdallah El Ali (abdallah.el.ali@cwi.nl)

Is there a correlation between user-defined tags and physiological markers? What about between the photos users take and the semantics of those photos? In this project, you will investigate how we can physiologically crowdsource urban feelings. This will involve either finding a physiological dataset that is geotagged, or deploying an Android crowdsourcing app to collect such data. This project will be co-supervised by Telefonica Research in Spain, where we may use call detail record (CDR) datasets for Barcelona and/or London.
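
To make the idea concrete, the sketch below aggregates geotagged arousal estimates (e.g., derived from wearable electrodermal activity) into a coarse spatial grid, producing a minimal "affect map". The column names and the grid cell size are assumptions about the data that would be collected.

    # Sketch only: aggregate geotagged arousal estimates into a coarse spatial grid.
    import pandas as pd

    samples = pd.DataFrame({
        "lat":     [41.3851, 41.3855, 41.3902],
        "lon":     [2.1734, 2.1731, 2.1540],
        "arousal": [0.62, 0.71, 0.15],        # normalized physiological arousal estimate
    })

    cell = 0.001                               # roughly 100 m grid cells (assumption)
    samples["cell_lat"] = (samples["lat"] // cell) * cell
    samples["cell_lon"] = (samples["lon"] // cell) * cell

    affect_map = (samples
                  .groupby(["cell_lat", "cell_lon"])["arousal"]
                  .agg(["mean", "count"])
                  .reset_index())
    print(affect_map)   # mean arousal and sample count per grid cell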

Skills:

Required:

  • Android programming, knowledge of smartphone physiological sensors, natural language processing (NLP), geographic information systems (GIS)

Recommended:

  • Applied machine learning, human computer interaction

Literature:

Low-resolution facial emotion micro-expression recognition in the wild

Contact: Abdallah El Ali (abdallah.el.ali@cwi.nl)

Low-resolution (LR) face recognition is a challenging task, especially when the low-resolution faces are captured under non-ideal conditions (e.g., mobile settings). Such face images are often contaminated by blur, non-uniform lighting, and non-frontal face poses. While there is work that investigates a variety of techniques (e.g., super resolution) for dealing with LR face images, it is unclear to what extent such methods are useful for facial micro-expression emotion recognition. This project will involve working with existing micro-expression datasets (e.g., CASME II, CAS(ME)2), and exploring different super resolution techniques on both real and artificially downsampled LR images.
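
A simple baseline for such an investigation, sketched below, simulates low-resolution capture by downsampling face crops and restoring them with plain bicubic interpolation; a super-resolution model would later replace that restoration step. Image sizes and the scale factor are illustrative, and dataset loading is not shown.

    # Baseline sketch: artificially downsample face crops and upscale them back with
    # bicubic interpolation before recognition, to quantify how resolution loss
    # degrades performance. A super-resolution model would replace cv2.resize here.
    import cv2
    import numpy as np

    def degrade_and_restore(face: np.ndarray, scale: int = 4) -> np.ndarray:
        """Simulate a low-resolution capture, then restore to the original size."""
        h, w = face.shape[:2]
        lr = cv2.resize(face, (w // scale, h // scale), interpolation=cv2.INTER_AREA)
        return cv2.resize(lr, (w, h), interpolation=cv2.INTER_CUBIC)

    face = np.random.randint(0, 256, (128, 128), dtype=np.uint8)  # placeholder crop
    restored = degrade_and_restore(face, scale=4)
    # Compare recognition accuracy on `face` vs. `restored` (and vs. a super-resolved
    # version) to measure the impact of resolution on micro-expression recognition.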

Skills:

Required:

  • Computer vision, machine learning

Recommended:

  • Deep learning, generative models

Literature:

Objective metrics for point cloud quality assessment

Contact: Pablo Cesar (p.s.cesar@cwi.nl)

Volumetric data captured by state-of-the-art capture devices, in its most primitive form, consists of a collection of points called a point cloud (PC). A point cloud is a set of individual 3D points. Each point, in addition to its 3D (x, y, z) position, i.e., spatial attribute, may also carry a number of other attributes such as color, reflectance, surface normal, etc. No spatial connections or ordering relations are specified among the individual points. When a PC signal is processed, for example undergoing lossy compression to reduce its size, it is critical to be able to quantify how well the processed signal approximates the original one, as perceived by the end user who will visualize the signal.

The goal of this project is to develop a new algorithm (i.e., an objective full-reference quality metric) to evaluate the perceptual fidelity of a processed PC with respect to its original version. A framework implementing the objective metrics currently available in the literature to assess PC visual quality, and comparing their performance to the proposed method, will also be developed. Subjective feedback on the visual quality of the signals will be collected from users to serve as ground truth.
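
As a reference point for such a metric, the sketch below computes a symmetric point-to-point (D1-style) error between an original and a processed point cloud using nearest-neighbour distances; the synthetic "degraded" cloud stands in for the output of a lossy codec.

    # Sketch of a simple full-reference metric: symmetric point-to-point error between
    # an original and a processed point cloud. A real evaluation would also correlate
    # metric scores with the collected subjective ratings.
    import numpy as np
    from scipy.spatial import cKDTree

    def point_to_point_mse(reference: np.ndarray, processed: np.ndarray) -> float:
        """Symmetric mean squared nearest-neighbour distance between two Nx3 arrays."""
        d_ab, _ = cKDTree(processed).query(reference)   # reference -> processed
        d_ba, _ = cKDTree(reference).query(processed)   # processed -> reference
        return max(np.mean(d_ab ** 2), np.mean(d_ba ** 2))

    rng = np.random.default_rng(0)
    original = rng.random((10_000, 3))
    degraded = original + rng.normal(scale=0.002, size=original.shape)  # simulated lossy coding
    print("D1 MSE:", point_to_point_mse(original, degraded))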

Skills:

Required:

  • Good (Matlab, Python, or C++) programming skills

Literature:

  • E. Torlig, E. Alexiou, T. Fonseca, R. de Queiroz, T. Ebrahimi, “A novel methodology for quality assessment of voxelized point clouds”, SPIE Optical Engineering + Applications 2018
  • E. Alexiou, T. Ebrahimi, “Benchmarking of objective quality metrics for colorless point clouds”, 2018 Picture Coding Symposium

User navigation in 6DoF Virtual Reality

Contact: Pablo Cesar (p.s.cesar@cwi.nl)

Nowadays, Virtual Reality (VR) applications are typically designed to provide an immersive experience with three Degrees of Freedom (3DoF): a user who watches a 360-degree video on a Head-Mounted Display (HMD) can choose the portion of the spherical content to view, by rotating the head to a specific direction. Nevertheless, the feeling of immersion in VR results not only from the possibility to turn the head and change the viewing direction but also from changing the viewpoint, moving within the virtual scene. VR applications allowing translations inside the virtual scene are referred to as six Degrees of Freedom (6DoF) applications.

The goal of this project is the development of a platform to capture users’ navigation patterns in 6DoF VR. First, an interface to capture the user’s position in the virtual space will be implemented in Unity3D for an HMD equipped with controllers and possibly dedicated sensors for positional tracking. Second, a user study to collect the navigation patterns of actual users in a virtual environment, such as a 3D scene model, will be designed and performed. Third, the data will be analyzed to explore correlations between the navigation behaviors of different users.
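
A first-pass analysis of such logs could look like the sketch below: computing per-user path length and a crude pairwise similarity between trajectories resampled to a common length. The log format (one (x, y, z) head position per frame) is an assumption.

    # Sketch of a first analysis step on logged 6DoF positions.
    import numpy as np

    def path_length(traj: np.ndarray) -> float:
        """Total distance travelled; traj is an (N, 3) array of head positions."""
        return float(np.linalg.norm(np.diff(traj, axis=0), axis=1).sum())

    def resample(traj: np.ndarray, n: int = 100) -> np.ndarray:
        """Resample a trajectory to n points for point-wise comparison."""
        idx = np.linspace(0, len(traj) - 1, n)
        return np.stack([np.interp(idx, np.arange(len(traj)), traj[:, k])
                         for k in range(3)], axis=1)

    def mean_pointwise_distance(a: np.ndarray, b: np.ndarray) -> float:
        ra, rb = resample(a), resample(b)
        return float(np.linalg.norm(ra - rb, axis=1).mean())

    user_a = np.cumsum(np.random.randn(500, 3) * 0.01, axis=0)   # placeholder logs
    user_b = np.cumsum(np.random.randn(480, 3) * 0.01, axis=0)
    print(path_length(user_a), mean_pointwise_distance(user_a, user_b))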

Skills:

  • Unity programming

Literature:

  • X. Corbillon, F. De Simone, G. Simon, P. Frossard, “Dynamic Adaptive Streaming for Multi-Viewpoint Omnidirectional Videos”, Proceedings of the 9th ACM Multimedia Systems Conference (MMSys 2018)
  • A. L. Simeone et al., “Altering User Movement Behaviour in Virtual Environments”, IEEE Transactions on Visualization and Computer Graphics 2017
  • http://antilatency.com  

Studying the impact of audio cues on Focus of Attention in 3DoF VR

Contact: Pablo Cesar (p.s.cesar@cwi.nl)

Omnidirectional, i.e., 360-degree videos, are spherical signals used in Virtual Reality (VR) applications: a user who watches a 360-degree video on a Head-Mounted Display (HMD) can choose which portion of spherical content to display by moving the head to a specific direction. This is referred to as three Degrees of Freedom (3DoF) navigation. The portion of spherical surface attended by the user is projected to a segment of plane, called viewport. Recently, many studies have appeared proposing datasets of user’s head movements during 360-degree video consumption to analyze how users explore immersive virtual environments.

Understanding VR content navigation is crucial for many applications, such as designing VR content, developing new compression algorithms, or learning computational models of saliency and visual attention. Nevertheless, most existing datasets consider video content without an audio channel. The goal of this project is the creation of a dataset of head movements of users watching 360-degree videos, with or without an audio channel, on a Head-Mounted Display (HMD). First, a user experiment to collect such data will be designed and performed. Second, the collected data will be analyzed by means of statistical tools to quantify the impact of audio cues, which might drive the user’s visual Focus of Attention.
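
One possible analysis, sketched below, builds normalized heatmaps of head orientations (yaw, pitch) for the audio and no-audio conditions and compares them with a symmetric KL divergence; the bin sizes, angle ranges, and the synthetic logs are assumptions.

    # Sketch of one way to quantify the effect of audio on visual attention.
    import numpy as np
    from scipy.stats import entropy

    def heatmap(yaw_deg, pitch_deg, bins=(36, 18)):
        """Normalized 2D histogram of head orientations (10-degree bins)."""
        h, _, _ = np.histogram2d(yaw_deg, pitch_deg,
                                 bins=bins, range=[[-180, 180], [-90, 90]])
        h += 1e-9                              # avoid empty bins
        return h / h.sum()

    def symmetric_kl(p, q):
        return 0.5 * (entropy(p.ravel(), q.ravel()) + entropy(q.ravel(), p.ravel()))

    rng = np.random.default_rng(0)             # placeholder head-movement logs
    audio    = heatmap(rng.normal(30, 40, 5000), rng.normal(0, 15, 5000))
    no_audio = heatmap(rng.normal(0, 70, 5000), rng.normal(0, 20, 5000))
    print("Divergence between conditions:", symmetric_kl(audio, no_audio))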

Skills:

  • Unity programming

Literature:

  • X. Corbillon, F. De Simone, G. Simon, “360-degree video head movement dataset”, Proceedings of the 8th ACM Multimedia Systems Conference (MMSys 2017)
  • V. Sitzmann et al., “Saliency in VR: How do people explore virtual environments?”, IEEE Transactions on Visualization and Computer Graphics 2018
  • A. Sheikh, A. Brown, Z. Watson, M. Evans, “Directing attention in 360-degree video”, IBC 2016
  • http://dash.ipv6.enstb.fr/headMovements/

Comparing the performance of mesh versus point cloud-based compression

Contact: Pablo Cesar (p.s.cesar@cwi.nl)

Recent advances in 3D capturing technologies enable the generation of dynamic and static volumetric visual signals from real-world scenes and objects, opening the way to a huge number of applications using this data, from robotics to immersive communications. Volumetric signals are typically represented as polygon meshes or point clouds and can be visualized from any viewpoint, providing six Degrees of Freedom (6DoF) viewing capabilities. They represent a key enabling technology for Augmented and Virtual Reality (AR and VR) applications, which are receiving a lot of attention from the main technological innovation players, in both academic and industrial communities. Volumetric signals have extremely high data rates and thus require efficient compression algorithms able to remove the visual redundancy in the data while preserving the perceptual quality of the processed visual signal. Existing compression technologies for mesh-based signals include open-source libraries such as Draco. Compression of point cloud signals is currently under standardization.

The goal of this project is the development of a platform to compare the performance of mesh-based versus point cloud-based compression algorithms in terms of the visual quality of the resulting compressed volumetric object. Starting from a set of high-quality point cloud (or mesh) volumetric objects, the corresponding mesh (or point cloud) representations of the same objects are extracted. Each representation is then compressed using a point cloud- or mesh-based codec, and the resulting compressed signals are evaluated in terms of objective and subjective quality.
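
One conversion step of such a platform is sketched below, assuming the Open3D library: reconstructing a mesh from a point cloud via Poisson surface reconstruction, resampling points from that mesh, and measuring the geometric deviation. The codec stage (e.g., Draco or the MPEG point cloud codecs) is treated as a black box and not shown, and the input file name is a placeholder.

    # Sketch of one conversion step, assuming the Open3D library is available.
    import numpy as np
    import open3d as o3d

    pcd = o3d.io.read_point_cloud("object.ply")          # placeholder input file
    pcd.estimate_normals(
        search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.05, max_nn=30))

    # Point cloud -> mesh (Poisson surface reconstruction)
    mesh, _ = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(pcd, depth=9)

    # Mesh -> point cloud (resampled), to compare representations of the same object
    resampled = mesh.sample_points_uniformly(number_of_points=len(pcd.points))

    # Geometric deviation between the original and the round-tripped representation
    d = np.asarray(pcd.compute_point_cloud_distance(resampled))
    print("mean nearest-neighbour distance:", d.mean())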

Skills:

  • Good programming skills
  • Computer graphic basics

Literature:

  • Kyriaki Christaki, et al, “Subjective Visual Quality Assessment of Immersive 3D Media Compressed by Open-Source Static 3D Mesh Codecs”, preprint, 2018
  • Gauthier Lafruit et al., “MPEG-I coding performance in immersive VR/AR applications”, IBC 2018
  • M. Berger et al., “A Survey of Surface Reconstruction from Point Clouds”, Computer Graphics Forum 2016
  • https://github.com/google/draco

Human perception of volumes

Contact: Pablo Cesar (p.s.cesar@cwi.nl)

Recent advances in 3D capturing technologies enable the generation of dynamic and static volumetric visual signals from real-world scenes and objects, opening the way to a huge number of applications using this data, from robotics to immersive communications. Volumetric signals are typically represented as polygon meshes or point clouds and can be visualized from any viewpoint, providing six Degrees of Freedom (6DoF) viewing capabilities. They represent a key enabling technology for Augmented and Virtual Reality (AR and VR) applications, which are receiving a lot of attention from the main technological innovation players, in both academic and industrial communities.

The goal of this project is to design and perform a set of psychovisual experiments, using VR technology and visualization via a Head-Mounted Display (HMD), in which the impact on human perception of different properties of the volumetric signal representation via point clouds or meshes, such as the convexity or concavity of a surface, the resolution, the illumination, and the color, is analyzed. First, a review of the state of the art on the perception of volumetric objects will be performed. Second, a set of open research questions will be chosen and a set of experiments will be designed and performed. Finally, the collected data will be analyzed in order to answer the research questions.
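
Since paired-comparison tests are a common choice for such experiments (cf. Thorn et al. in the literature below), the sketch shows one standard way to turn pairwise preference counts into interval scale values, using Thurstone Case V scaling. The 3x3 count matrix is made up for illustration.

    # Sketch: Thurstone Case V scaling of paired-comparison judgements.
    # counts[i, j] = how often stimulus i was preferred over stimulus j (made-up data).
    import numpy as np
    from scipy.stats import norm

    counts = np.array([[0, 18, 25],
                       [12, 0, 20],
                       [5, 10, 0]], dtype=float)

    n_trials = counts + counts.T                        # judgements per pair
    with np.errstate(invalid="ignore", divide="ignore"):
        p = counts / n_trials                           # preference proportions
    p = np.clip(np.nan_to_num(p, nan=0.5), 0.01, 0.99)  # guard diagonal and 0/1 proportions

    z = norm.ppf(p)                                     # proportions -> z-scores
    scale = z.mean(axis=1)                              # Case V: row means
    print("Perceptual scale values:", scale - scale.min())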

Skills:

  • Unity programming
  • Interest in human perception analysis

Literature:

  • C. J. Wilson and A. Soranzo, “The Use of Virtual Reality in Psychology: A Case Study in Visual Perception”, Computational and Mathematical Methods in Medicine, 2015
  • J. Zhang et al., “A subjective quality evaluation for 3d point cloud models”, in Proc. of the 2014 International Conference on Audio, Language and Image Processing
  • J. Thorn et al., “Assessing 3d scan quality through paired-comparisons psychophysics test”, in Proc. of the 2016 ACM Conference on Multimedia

Reconstructing high frame rate point clouds of human bodies

Contact: Pablo Cesar (p.s.cesar@cwi.nl)

Volumetric sensing, based on range-sensing technology, makes it possible to capture the depth of an object or an entire scene, in addition to its color information. A format that has recently become widespread for representing volumetric signals captured by such sensors, such as the Intel RealSense camera, is the point cloud. A point cloud is a set of individual points in 3D space, each associated with attributes such as a color triplet. With respect to other volumetric representations, such as polygon meshes, point cloud content generation requires much less computational processing, which makes it suitable for live capture and transmission.

The goal of this project is the development of a machine learning-based approach to interpolate point clouds representing a human body captured at different instants in time, in order to increase the frame rate of a dynamic point cloud capturing a user’s body. The core of the project will be the design of the network. The second main goal will be the collection of training data produced using a capture set-up made of multiple Intel RealSense sensors, in order to train the neural network to learn how the point cloud signal representing body movement evolves in time.
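
As a non-learned baseline against which the neural network could be compared, the sketch below matches each point of one frame to its nearest neighbour in the next frame and linearly interpolates positions to synthesize an intermediate frame; the frame format (N x 3 positions) and the synthetic data are assumptions.

    # Non-learned baseline sketch for point cloud frame interpolation.
    import numpy as np
    from scipy.spatial import cKDTree

    def interpolate_frames(frame_a: np.ndarray, frame_b: np.ndarray, t: float = 0.5):
        """Return an intermediate point cloud between two (N, 3) frames, 0 < t < 1."""
        _, idx = cKDTree(frame_b).query(frame_a)        # naive correspondence estimate
        matched_b = frame_b[idx]
        return (1.0 - t) * frame_a + t * matched_b      # a learned model would replace this

    rng = np.random.default_rng(0)
    frame_a = rng.random((5_000, 3))
    frame_b = frame_a + 0.01                            # placeholder "moved" body
    mid = interpolate_frames(frame_a, frame_b, t=0.5)   # doubles the effective frame rate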

Skills:

  • Machine learning
  • Good programming skills

Literature:

Impartial Data

Contact: Steven Pemberton (steven.pemberton@cwi.nl)

Pilot implementation of a generic data platform.

The goal is to create a pilot implementation of a generic platform that reads data in any (parseable) form for generic processing. The program will be used as a showcase for the technique, and may be embedded in other platforms. It may eventually lead to a W3C standard.
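
Purely to illustrate the "generic processing" idea, the sketch below maps two different input syntaxes (JSON and CSV) onto one generic tree shape; the actual project would drive such parsing from a grammar rather than hard-coding a reader per format.

    # Illustrative sketch only: different input syntaxes reduced to one generic tree.
    import csv
    import io
    import json

    def node(name, children=None, value=None):
        """Generic tree node: every format is reduced to (name, value, children)."""
        return {"name": name, "value": value, "children": children or []}

    def from_json(text):
        def walk(name, obj):
            if isinstance(obj, dict):
                return node(name, [walk(k, v) for k, v in obj.items()])
            if isinstance(obj, list):
                return node(name, [walk("item", v) for v in obj])
            return node(name, value=obj)
        return walk("root", json.loads(text))

    def from_csv(text):
        rows = list(csv.DictReader(io.StringIO(text)))
        return node("root", [node("row", [node(k, value=v) for k, v in r.items()])
                             for r in rows])

    print(from_json('{"city": "Amsterdam", "temp": 12}'))
    print(from_csv("city,temp\nAmsterdam,12\n"))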

Skills:

  • Programming in Python and/or C
  • Some knowledge of grammars would be advantageous.

CSS Skinning

Contact: Steven Pemberton (steven.pemberton@cwi.nl)

XForms is a W3C standard [1], originally designed for forms on the web, but in later versions generalized to support more general applications [e.g. 2, 3]. This project will create a CSS suite to allow the ‘skinning’ of XForms: parameterized for color and other elements, in the manner of Bootstrap [4] and Bulma [5], and mobile-aware, so that the same form would be equally usable on a desktop and on a phone.

Skills:

  • Understanding of markup and CSS

Literature:

XForms Test Suite

Contact: Steven Pemberton (steven.pemberton@cwi.nl)

XForms is a W3C standard [1], originally designed for forms on the web, but in later versions generalized to support more general applications [e.g. 2, 3]. This project will create a test infrastructure to test implementations of the latest version of XForms. The infrastructure will itself be parameterized using XForms for easy updating, and will generate an overview of test results.

Skills:

  • HTML
  • XML would be a plus.

Literature:

Internet of Things (IoT) programming notation

Contact: Steven Pemberton (steven.pemberton@cwi.nl)

Development of a programming notation for a new interactive IoT platform. A declarative interface technique is used to access and control the devices without having to worry about the variety of device interfaces. This project will analyze programming notations for other, similar projects, such as IFTTT, and design a declarative programming notation based on state changes, rather than on events or traditional programming.

Igor is an IoT platform using new techniques for managing IoT devices by placing a thin layer around a non-homogeneous collection of Internet of Things devices, hiding the data-format and data-access differences, and auto-updating the devices as needed.
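
To illustrate the kind of notation the project could explore (not Igor's actual syntax), the sketch below declares rules over device state, "when this condition holds, that value should hold", together with a small loop that reconciles the state with the declarations instead of reacting to individual events.

    # Sketch of a state-based, declarative rule set; device names are illustrative.
    state = {"sun/altitude": -5, "livingroom/occupied": True, "livingroom/lamp": "off"}

    # Each rule: (condition over state, key that should change, desired value)
    rules = [
        (lambda s: s["sun/altitude"] < 0 and s["livingroom/occupied"], "livingroom/lamp", "on"),
        (lambda s: not s["livingroom/occupied"],                        "livingroom/lamp", "off"),
    ]

    def reconcile(state, rules):
        """Drive the state toward what the declarations say it should be."""
        changes = {}
        for condition, key, value in rules:
            if condition(state) and state[key] != value:
                changes[key] = value
        state.update(changes)       # in a real system: push the changes to the devices
        return changes

    print(reconcile(state, rules))  # -> {'livingroom/lamp': 'on'}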

Skills:

  • Knowledge of Python would be advantageous.

Literature: