Several of my research projects aim to discover the preferences and visual taste of a designer from interactions with interactive evolutionary systems, learning the underlying factors of users' choices in order to better accommodate them.
Abstract: This paper introduces a large scale multimodal corpus collected for the purpose of analysing and predicting player engagement in commercial-standard games. The corpus is solicited from 25 players of the action role-playing game Tom Clancy’s The Division 2, who annotated their level of engagement using a time-continuous annotation tool. The cleaned and processed corpus presented in this paper consists of nearly 20 hours of annotated gameplay videos accompanied by logged gamepad actions. We report preliminary results on predicting long-term player engagement based on in-game footage and game controller actions using Convolutional Neural Network architectures. Results obtained suggest we can predict the player engagement with up to 72% accuracy on average (88% at best) when we fuse information from the game footage and the player’s controller input. Our findings validate the hypothesis that long-term (i.e. 1 hour of play) engagement can be predicted efficiently solely from pixels and gamepad actions.
in Proceedings of the 25th ACM International Conference on Multimodal Interaction, 2023. BibTex
Abstract: Tabletop Role-Playing Games (TTRPGs) offer players the opportunity to form imaginary gameworlds and stories within them, create community, solve problems, and explore identity. Designers and researchers have tried to identify how aspects of TTRPGs facilitate collaboration, immersion, creativity, and more. However, there has been no attempt to develop a formal assessment methodology for player experience during TTRPG play. This paper argues that evaluating TTRPG players' experience can provide vital data for Game Masters to improve on their future games, for players to reflect on their experience, and for TTRPG designers or event organizers to collect and compare data. As a first step towards developing such an evaluation method, we identify important dimensions of TTRPG play that can be meaningful to track and actionable to improve upon. Moreover, we review player experience dimensions and evaluation methods in digital games, and explore similarities and differences with TTRPGs.
in Proceedings of the International Conference on the Foundations of Digital Games, 2023. BibTex
Abstract: Games are designed to elicit strong emotions during game play, especially when players are competing against each other. Artificial Intelligence applied to predict a player's emotions has mainly been tested on single-player experiences in low-stakes settings and short-term interactions. How do players experience and manifest affect in high-stakes competitions, and which modalities can capture this? This paper reports a first experiment in this line of research, using a competition of the video game Hearthstone where both competing players' game play and facial expressions were recorded over the course of the entire match which could span up to 41 minutes. Using two experts' annotations of tension using a continuous video affect annotation tool, we attempt to predict tension from the webcam footage of the players alone. Treating both the input and the tension output in a relative fashion, our best models reach 66.3% average accuracy (up to 79.2% at the best fold) in the challenging leave-one-participant-out cross-validation task. This initial experiment shows a way forward for affect annotation in games "in the wild" in high-stakes, real-world competitive settings.
in Proceedings of the International Conference on the Foundations of Digital Games, 2023. BibTex
Abstract: This paper introduces a paradigm shift by viewing the task of affect modeling as a reinforcement learning (RL) process. According to the proposed paradigm, RL agents learn a policy (i.e. affective interaction) by attempting to maximize a set of rewards (i.e. behavioral and affective patterns) via their experience with their environment (i.e. context). Our hypothesis is that RL is an effective paradigm for interweaving affect elicitation and manifestation with behavioral and affective demonstrations. Importantly, our second hypothesis - building on Damasio's somatic marker hypothesis - is that emotion can be the facilitator of decision-making. We test our hypotheses in a racing game by training Go-Blend agents to model human demonstrations of arousal and behavior; Go-Blend is a modified version of the Go-Explore algorithm which has recently showcased supreme performance in hard exploration tasks. We first vary the arousal-based reward function and observe agents that can effectively display a palette of affect and behavioral patterns according to the specified reward. Then we use arousal-based state selection mechanisms in order to bias the strategies that Go-Blend explores. Our findings suggest that Go-Blend not only is an efficient affect modeling paradigm but, more importantly, affect-driven RL improves exploration and yields higher performing agents, validating Damasio's hypothesis in the domain of games.
in Proceedings of the International Conference on Affective Computing and Intelligent Interaction, 2022. BibTex
Abstract: Using artificial intelligence (AI) to automatically test a game remains a critical challenge for the development of richer and more complex game worlds and for the advancement of AI at large. One of the most promising methods for achieving that long-standing goal is the use of generative AI agents, namely procedural personas, that attempt to imitate particular playing behaviors which are represented as rules, rewards, or human demonstrations. All research efforts for building those generative agents, however, have focused solely on playing behavior which is arguably a narrow perspective of what a player actually does in a game. Motivated by this gap in the existing state of the art, in this paper we extend the notion of behavioral procedural personas to cater for player experience, thus examining generative agents that can both behave and experience their game as humans would. For that purpose, we employ the Go-Explore reinforcement learning paradigm for training human-like procedural personas, and we test our method on behavior and experience demonstrations of more than 100 players of a racing game. Our findings suggest that the generated agents exhibit distinctive play styles and experience responses of the human personas they were designed to imitate. Importantly, it also appears that experience, which is tied to playing behavior, can be a highly informative driver for better behavioral exploration.
in Proceedings of the Foundations of Digital Games Conference, 2022. BibTex
Abstract: Video game testing has become a major investment of time, labor and expense in the game industry. Particularly the balancing of in-game units, characters and classes can cause long-lasting issues that persist years after a game's launch. While approaches incorporating artificial intelligence have already shown successes in reducing manual effort and enhancing game development processes, most of these draw on heuristic, generalized or optimal behavior routines, while actual low-level decisions from individual players and their resulting playing styles are rarely considered. In this paper, we apply Deep Player Behavior Modeling to turn atomic actions of 213 players from 6 months of single-player instances within the MMORPG Aion into generative models that capture and reproduce particular playing strategies. In a subsequent simulation, the resulting generative agents ("replicants") were tested against common NPC opponent types of MMORPGs that iteratively increased in difficulty, respective to the primary factor that constitutes this enemy type (Melee, Ranged, Rogue, Buffer, Debuffer, Healer, Tank or Group). As a result, imbalances between classes as well as strengths and weaknesses regarding particular combat challenges could be identified and regulated automatically.
in IEEE Transactions on Games, 2022. (accepted) BibTex
Abstract: What is believability? And how do we assess it? These questions remain a challenge in human-computer interaction and games research. When assessing the believability of agents, researchers opt for an overall view of believability reminiscent of the Turing test. Current evaluation approaches have proven to be diverse and, thus, have yet to establish a framework. In this paper, we propose treating believability as a time-continuous phenomenon. We have conducted a study in which participants play a one-versus-one shooter game and annotate the character's believability. They face two different opponents which present different behaviours. In this novel process, these annotations are done moment-to-moment using two different annotation schemes: BTrace and RankTrace. This is followed by the user's believability preference between the two playthroughs, effectively allowing us to compare the two annotation tools and time-continuous assessment with discrete assessment. Results suggest that a binary annotation tool could be more intuitive to use than its continuous counterpart and provides more information on context. We conclude that this method may offer a necessary addition to current assessment techniques.
in Proceedings of the ACII Workshop on Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction (MA3HMI), 2021. BibTex
Abstract: Assessing the believability of agents, characters and simulated actors is a core challenge for human computer interaction. While numerous approaches are suggested in the literature, they are all limited to discrete and low-granularity representations of believable behavior. In this paper we view believability, for the first time, as a time-continuous phenomenon and we explore the suitability of two different affect annotation schemes for its assessment. In particular, we study the degree to which we can predict character believability in a continuous fashion through a two-player game study. The game features various opponent behaviors that are assessed for their believability by 89 participants that played the game and then annotated their recorded playthrough. Random forest models are then trained to predict believability based on ad-hoc designed in-game features. Results suggest that a discrete annotation method leads to a more robust assessment of the ground truth and subsequently better modelling performance. Our best models are able to predict a change in perceived believability with a 72.5% accuracy on average (up to 90% in the best cases) in a time-continuous manner.
in Proceedings of the IEEE International Conference on Affective Computing and Intelligent Interaction, 2021. BibTex
Abstract: To which degree can abstract gameplay metrics capture the player experience in a general fashion within a game genre? In this comprehensive study we address this question across three different videogame genres: racing, shooter, and platformer games. Using high-level gameplay features that feed preference learning models we are able to predict arousal accurately across different games of the same genre in a large-scale dataset of over 1,000 arousal-annotated play sessions. Our genre models predict changes in arousal with up to 74% accuracy on average across all genres and 86% in the best cases. We also examine the feature importance during the modelling process and find that time-related features largely contribute to the performance of both game and genre models. The prominence of these game-agnostic features shows the importance of the temporal dynamics of the play experience in modelling, but also highlights some of the challenges for the future of general affect modelling in games and beyond.
in Proceedings of the IEEE Conference on Games, 2021. BibTex
Abstract: Balancing the options available to players in a way that ensures rich variety and viability is a vital factor for the success of any video game, and particularly competitive multiplayer games. Traditionally, this balancing act requires extensive periods of expert analysis, play testing and debates. While automated gameplay is able to predict outcomes of parameter changes, current approaches mainly rely on heuristic or optimal strategies to generate agent behavior. In this paper, we demonstrate the use of deep player behavior models to represent a player population (n = 213) of the massively multiplayer online role-playing game Aion, which are used, in turn, to generate individual agent behaviors. Results demonstrate significant balance differences in opposing enemy encounters and show how these can be regulated. Moreover, the analytic methods proposed are applied to identify the balance relationships between classes when fighting against each other, reflecting the original developers' design.
in Proceedings of the IEEE Conference on Games, 2020. BibTex
Abstract: Online crowds have the potential to do more complex work in teams, rather than as individuals. Team formation algorithms typically maximize some notion of global utility of team output by allocating people to teams or tasks. However, decisions made by these algorithms do not consider the decisions or preferences of the people themselves. This paper explores a complementary strategy, which relies on the crowd itself to self-organize into effective teams. Our preliminary results show that users perceive the ability to choose their teammate as extremely useful in a crowdsourcing setting. We also find that self-organisation makes users feel more productive, creative and responsible for their work product.
in Proceedings of the IE3 workshop on crowd-powered e-services, Springer, 2019. BibTex
Abstract: We present semantics-based mechanisms that aim to promote reflection on cultural heritage by means of dates (historical events or annual commemorations), owing to their connections to a collection of items and to the visitors' interests. We argue that links to specific dates can trigger curiosity, increase retention and guide visitors around the venue following new appealing narratives in subsequent visits. The proposal has been evaluated in a pilot study on the collection of the Archaeological Museum of Tripoli (Greece), for which a team of humanities experts wrote a set of diverse narratives about the exhibits. A year-round calendar was crafted so that certain narratives would be more or less relevant on any given day. Expanding on this calendar, personalised recommendations can be made by sorting out those relevant narratives according to personal events and interests recorded in the profiles of the target users. Evaluation of the associations by experts and potential museum visitors shows that the proposed approach can discover meaningful connections, while many others that are more incidental can still contribute to the intended cognitive phenomena.
in SAGE Journal of Information Science 47(1), pp. 82-100, 2021. BibTex
Abstract: How could we gather affect annotations in a rapid, unobtrusive, and accessible fashion? How could we still make sure that these annotations are reliable enough for data-hungry affect modelling methods? This paper addresses these questions by introducing PAGAN, an accessible, general-purpose, online platform for crowdsourcing affect labels in videos. The design of PAGAN overcomes the accessibility limitations of existing annotation tools, which often require advanced technical skills or even the on-site involvement of the researcher. Such limitations often yield affective corpora that are restricted in size, scope and use, as the applicability of modern data-demanding machine learning methods is rather limited. The description of PAGAN is accompanied by an exploratory study which compares the reliability of three continuous annotation tools currently supported by the platform. Our key results reveal higher inter-rater agreement when annotation traces are processed in a relative manner and collected via unbounded labelling.
in Proceedings of the International Conference on Affective Computing and Intelligent Interaction, 2019. BibTex
Abstract: There is growing evidence suggesting that subjective values such as emotions are intrinsically relative and that an ordinal approach is beneficial to their annotation and analysis. Ordinal data processing yields more reliable, valid and general predictive models, and preference learning algorithms have shown a strong advantage in deriving computational models from such data. To enable the extensive use of ordinal data processing and preference learning, this paper introduces the Python Preference Learning Toolbox. The toolbox is open source, features popular preference learning algorithms and methods, and is designed to be accessible to a wide audience of researchers and practitioners. The toolbox is evaluated with regards to both the accuracy of its predictive models across two affective datasets and its usability via a user study. Our key findings suggest that the implemented algorithms yield accurate models of affect while its graphical user interface is suitable for both novice and experienced users.
in Proceedings of the International Conference on Affective Computing and Intelligent Interaction, 2019. BibTex
Abstract: Is it possible to predict the motivation of players just by observing their gameplay data? Even if so, how should we measure motivation in the first place? To address the above questions, on the one end, we collect a large dataset of gameplay data from players of the popular game Tom Clancy's The Division. On the other end, we ask them to report their levels of competence, autonomy, relatedness and presence using the Ubisoft Perceived Experience Questionnaire. After processing the survey responses in an ordinal fashion we employ preference learning methods based on support vector machines to infer the mapping between gameplay and the reported four motivation factors. Our key findings suggest that gameplay features are strong predictors of player motivation as the best obtained models reach accuracies of near certainty, from 92% up to 94% on unseen players.
in Proceedings of the IEEE Conference on Games, 2019. BibTex
Abstract: This paper describes a method for generative player modeling and its application to the automatic testing of game content using archetypal player models called procedural personas. Theoretically grounded in psychological decision theory, procedural personas are implemented using a variation of Monte Carlo Tree Search (MCTS) where the node selection criteria are developed using evolutionary computation, replacing the standard UCB1 criterion of MCTS. Using these personas we demonstrate how generative player models can be applied to a varied corpus of game levels and demonstrate how different play styles can be enacted in each level. In short, we use artificially intelligent personas to construct synthetic playtesters. The proposed approach could be used as a tool for automatic play testing when human feedback is not readily available or when quick visualization of potential interactions is necessary. Possible applications include interactive tools during game development or procedural content generation systems where many evaluations must be conducted within a short time span.
IEEE Transactions on Games, vol. 11, no. 4, pp. 352-362, 2019. BibTex
Abstract: Videogame avatars are more than visual artifacts - they express cultural norms and expectations from both the real world and the fictional world. In this paper, we describe how artificial intelligence clustering can automatically discover distinct characteristics of players' avatars without prior knowledge of a system's underlying data structures. Using only avatar images collected from a study with 191 players, we applied two clustering techniques - namely non-negative matrix factorization and archetypal analysis - that automatically revealed and detected (1) an avatar's gender, (2) regions that appeared to isolate shapes of items and accessories, and (3) aesthetic preferences for particular colors (e.g., bright or muted) and shapes for different body parts. These clusters correlated with players' preferences for character abilities, e.g., male avatars in dark clothes correlated with having high physical but low magic-casting attributes. These findings show that a bottom-up analysis of images can reveal explicit categories like gender, but also implicit categories like preferences of players. We believe that such computational approaches can enable developers to (1) better understand players' desires and needs, (2) quantitatively view how systems may be limited in supporting players, and (3) find actionable solutions for these limitations.
in Proceedings of the International Joint Conference of DiGRA and FDG. 2016. BibTex
Abstract: Is it possible to conduct player modeling without any players? In this paper we use Monte-Carlo Tree Search-controlled procedural personas to simulate a range of decision making styles in the puzzle game MiniDungeons 2. The purpose is to provide a method for synthetic play testing of game levels with synthetic players based on designer intuition and experience. Five personas are constructed, representing five different decision making styles archetypal for the game. The personas vary solely in the weights of decision-making utilities that describe their valuation of a set of affordances in MiniDungeons 2. By configuring these weights using designer expert knowledge, and passing the configurations directly to the MCTS algorithm, we make the personas exhibit a number of distinct decision making and play styles.
in Proceedings of the AIIDE workshop on Player Modeling, 2015. BibTex
Abstract: This paper describes MiniDungeons 2 (MD2): a turn-based rogue-like game developed to support research in capturing and modeling player decision making processes through procedural personas and using such models as critics for procedural content generation. MD2 intends to provide a full-circle framework for collecting, modeling, simulating, and producing content for player decision making styles. The fully instrumented and telemetric game will soon be made available to the public to be played on smart-phones for the purpose of collecting as many play traces, representing as many different decision making styles, as possible.
in Proceedings of the 10th Conference on the Foundations of Digital Games. 2015. BibTex
Abstract: The current paper investigates multiple approaches to modeling human decision making styles for procedural play-testing. Building on decision and persona theory we evolve game playing agents representing human decision making styles. Three kinds of agents are evolved from the same representation: procedural personas, evolved from game designer expert knowledge; clones, evolved from observations of human play and aimed at general behavioral replication; and specialized agents, also evolved from observation, but aimed at determining the maximal behavioral replication ability of the representation. These three methods are then compared on their ability to represent individual human decision makers. Comparisons are conducted using three different proposed metrics that address the problem of matching decisions at the action, tactical, and strategic levels. Results indicate that a small gallery of personas evolved from designer intuitions can capture human decision making styles as well as clones evolved from human play-traces for the testbed game MiniDungeons.
Entertainment Computing. Elsevier Volume 16, 2016, pp. 95–104. BibTex
Abstract: The current paper investigates how to model human play styles. Building on decision and persona theory we evolve game playing agents representing human decision making styles. Two methods are developed, applied, and compared: procedural personas, based on utilities designed with expert knowledge, and clones, trained to reproduce playtraces. Additionally, two metrics for comparing agent and human decision making styles are proposed and compared. Results indicate that personas evolved from designer intuitions can capture human decision making styles as well as clones evolved from human playtraces.
in Proceedings of the International Conference on Entertainment Computing (ICEC), 2014. BibTex
Abstract: This paper documents the challenges in creating a computer-aided level design tool which incorporates computer-generated suggestions which appeal to the human user. Several steps are suggested in order to make the suggestions more appropriate to a specific user's overall style, current focus, and end-goals. Designer style is modeled via choice-based interactive evolution which adapts the impact of different dimensions of quality based on the designer's choice of certain suggestions over others. Process is modeled similarly to style, but adapts to the current focus of the designer's actions. Goals are modeled by estimating the visual patterns of the designer's final artifact and changing the parameters of the algorithm to enforce such patterns on generated suggestions.
in Proceedings of the IEEE Conference on Computational Intelligence and Games (CIG), 2014. BibTex
Abstract: This paper explores how evolved game playing agents can be used to represent a priori defined archetypical ways of playing a test-bed game, as procedural personas. The end goal of such procedural personas is substituting players when authoring game content manually, procedurally, or both (in a mixed-initiative setting). Building on previous work, we compare the performance of newly evolved agents to agents trained via Q-learning as well as a number of baseline agents. Comparisons are performed on the grounds of game playing ability, generalizability, and conformity among agents. Finally, all agents' decision making styles are matched to the decision making styles of human players in order to investigate whether the different methods can yield agents who mimic or differ from human decision making in similar ways. The experiments performed in this paper conclude that agents developed from a priori defined objectives can express human decision making styles and that they are more generalizable and versatile than Q-learning and hand-crafted agents.
in Proceedings of the IEEE Conference on Computational Intelligence and Games (CIG), 2014. BibTex
Abstract: This paper presents a method for modeling player decision making through the use of agents as AI-driven personas. The paper argues that artificial agents, as generative player models, have properties that allow them to be used as psychometrically valid, abstract simulations of a human player's internal decision making processes. Such agents can then be used to interpret human decision making, as personas and playtesting tools in the game design process, as baselines for adapting agents to mimic classes of human players, or as believable, human-like opponents. This argument is explored in a crowdsourced decision making experiment, in which the decisions of human players are recorded in a small-scale dungeon themed puzzle game. Human decisions are compared to the decisions of a number of a priori defined "archetypical" agent-personas, and the humans are characterized by their likeness to or divergence from these. Essentially, at each step the action of the human is compared to what actions a number of reinforcement-learned agents would have taken in the same situation, where each agent is trained using a different reward scheme. Finally, extensions are outlined for adapting the agents to represent sub-classes found in the human decision making traces.
in Poster Proceedings of the 9th Conference on the Foundations of Digital Games, 2014. BibTex
Abstract: With the growing use of automated content creation and computer-aided design tools in game development, there is potential for enhancing the design process through personalized interactions between the software and the game developer. This paper proposes designer modeling for capturing the designer's preferences, goals and processes from their interaction with a computer-aided design tool, and suggests methods and domains within game development where such a model can be applied. We describe how designer modeling could be integrated with current work on automated and mixed-initiative content creation, and envision future directions which focus on personalizing the processes to a designer's particular wishes.
in Proceedings of the AIIDE Workshop on Artificial Intelligence & Game Aesthetics, 2013. BibTex
Abstract: This paper introduces Rank-based Interactive Evolution (RIE) which is an alternative to interactive evolution driven by computational models of user preferences to generate personalized content. In RIE, the computational models are adapted to the preferences of users which, in turn, are used as fitness functions for the optimization of the generated content. The preference models are built via ranking-based preference learning, while the content is generated via evolutionary search. The proposed method is evaluated on the creation of strategy game maps, and its performance is tested using artificial agents. Results suggest that RIE is both faster and more robust than standard interactive evolution and outperforms other state-of-the-art interactive evolution approaches.
in Proceedings of the IEEE Conference on Computational Intelligence and Games (CIG), 2013. BibTex
Abstract: This paper presents a tool geared towards the collaboration of a human and an artificial designer for the creation of game content. The framework combines procedural content generation using stochastic search with user input in the form of an initial goal statement as well as preference of generated results. Feedback from industry experts in a pilot user experiment showcased the limitations of this approach and the protocol chosen for evaluating the authoring tool. The limitations are discussed with respect to the suitability of interactive evolution for creative design and the design of experimental protocols for evaluating authoring tools for games.
in Proceedings of the AIIDE Workshop on Human Computation in Digital Entertainment, 2012. BibTex
Abstract: This paper introduces a search-based approach to personalized content generation with respect to visual aesthetics. The approach is based on a two-step adaptation procedure where (1) the evaluation function that characterizes the content is adjusted to match the visual aesthetics of users and (2) the content itself is optimized based on the personalized evaluation function. To test the efficacy of the approach we design fitness functions based on universal properties of visual perception, inspired by psychological and neurobiological research. Using these visual properties we generate aesthetically pleasing 2D game spaceships via neuroevolutionary constrained optimization and evaluate the impact of the designed visual properties on the generated spaceships. The offline generated spaceships are used as the initial population of an interactive evolution experiment in which players are asked to choose spaceships according to their visual taste: the impact of the various visual properties is adjusted based on player preferences and new content is generated online based on the updated computational model of visual aesthetics of the player. Results are presented which show the potential of the approach in generating content which is based on subjective criteria of visual aesthetics.
IEEE Transactions on Computational Intelligence and AI in Games 4(3), 2012, pp. 213-228. BibTex