An Interactive and Multimodal Virtual (reality) Mind Map for Future Immersive Projection Workplace: Research Article

The following is a featured report authored by David Kutak, Milan Dolezal, Bojan Kerous, Zdenek Eichler, Jiri Vasek and Fotis Liarokapis. We keep on top of the latest industry perspectives and academic research from around the world; it's our business.

To bring that to you, we share our favourite reports on a monthly basis to give greater exposure to some of the leading minds and researchers in the mixed reality, immersive technology and projection fields. Enjoy!

Traditional types of mind maps involve means of visually organizing information. They can be created either using physical tools like paper or post-it notes or through the computer-mediated process. Although their utility is established, mind maps and associated methods usually have several shortcomings with regards to effective and intuitive interaction as well as effective collaboration. Latest developments in virtual reality demonstrate new capabilities of visual and interactive augmentation, and in this paper, we propose a multimodal virtual reality mind map that has the potential to transform the ways in which people interact, communicate, and share information. The shared virtual space allows users to be located virtually in the same meeting room and participate in an immersive experience. Users of the system can create, modify, and group notes in categories and intuitively interact with them. They can create or modify inputs using voice recognition, interact using virtual reality controllers, and then make posts on the virtual mind map. When a brainstorming session is finished, users are able to vote about the content and export it for later usage. A user evaluation with 32 participants assessed the effectiveness of the virtual mind map and its functionality. Results indicate that this technology has the potential to be adopted in practice in the future, but a comparative study needs to be performed to have a more general conclusion.

1. Introduction

Modern technologies offer new opportunities for users to communicate and interact with simulated environments, to quickly find information and knowledge when needed and also to learn anytime and anywhere (Sharples, 2000). When humans communicate, the dynamics of this social interaction are multimodal (Louwerse et al., 2012) and provide several different patterns, like entrainment of recurrent cycles of behavior between partners, suggesting that users are coordinated through synchronization and complementarity, i.e., mutual adjustments to each other resulting in corresponding changes in their behavior occurring during the interaction (Sadler et al., 2009). When users are engaging in a collaborative task, then synchronization takes place between them through multiple modalities. Some of these include gestures, facial expression, linguistic communication (Louwerse et al., 2012) or eye-movement patterns (Dale et al., 2011). By designing multimodal interfaces, it is possible to improve the accessibility and usability of mind mapping systems to achieve a natural and intuitive experience to the users.

Traditional types of mind maps involve some means of the visual organization of information. During the past few years, there have been some initial approaches to developing 3D mind maps to boost productivity. A number of human-computer interaction (HCI) technologies exist nowadays that address this topic, and some of them tend to work well in specific situations and environments. A virtual reality (VR) solution for shared space can be realized in several ways depending on the required level of immersion of the end-user in addition to the requirements of the application. One of the most accepted definitions states that immersion refers to the objective level of sensory fidelity a VR system provides whereas presence addresses user's subjective psychological response to a VR system (Slater, 2003). The level of immersion is directly interrelated with the end-user perception and promoted if tactile devices are used. VR, therefore, has the potential to augment processes of our everyday life and to mitigate difficult problems.

An open problem is co-location in the office-of-the-future environment. Meeting people on site is costly because they often need to travel to the location from various cities or countries. The Internet makes it technically possible to connect these people, while VR allows for much more immersive and natural cooperation. Several immersive virtual solutions that address co-location already exist, such as collaborative applications for cave automatic virtual environment (CAVE) systems (Cruz-Neira et al., 1993), where users do not wear head-mounted displays (HMDs) and are able to see each other and interact directly because they share the same physical space. Collaborative immersive VR now allows users to be co-located in the same space or in different locations and to communicate through the Internet (Dolezal et al., 2017). The availability of current HMDs makes it easier to create strongly immersive user experiences. Typical sub-areas of shared spaces for VR include visualization, communication, interaction, and collaboration, and a VR-based mind map workflow overlaps with and relies on all four aspects of the experience.

The main focus of this research is on multimodal VR collaborative interfaces that facilitate various types of intelligent ideation/brainstorming (or any other mostly creative activity). Participants can be located in different environments and have a common goal on a particular topic within a limited amount of time. Users can group (or ungroup) actions (i.e., notes belonging in a specific category) and intuitively interact with them using a combination of different modalities. Ideally, the multimodal interface should allow users to create actions (i.e., post-it note) and then post it on the virtual mind map using one or more intuitive methods, such as voice recognition, gesture recognition, and through other physiological or neurophysiological sources. When a task is finished, users should be able to access the content and assess it.

This paper presents a novel shared virtual space where users are immersed in the environment (i.e., the same meeting room) and participate in a multimodal manner (through controllers and voice recognition). Emphasis is placed on (a) the shared VR environment; (b) effective performance of the multimodal interface; and (c) assessment of the whole system as well as the interaction techniques. The tasks are typically moderated by one or two individuals who facilitate the process, take care of the agenda, keep the schedule, and so on. Such an ideation exercise can be used on various occasions but is typically associated with a creative process in a company, where the output of the exercise is uncertain before it is executed. During a particular task, users can create and manipulate shared nodes (equivalent to real-world sticky notes), modify their hierarchical or associative relationships, and continuously categorize, cluster, generalize, comment, prioritize, and so on. The moderator's role is to guide the discussion and regulate the voting phase.

With the appearance of novel interfaces and mediums, such as VR and increasing presence of sensors and smart devices in our environment, it has become apparent that the typical way we interact with the computer is changing rapidly. Novel ways of achieving fluent interaction in an environment saturated with sources of useful behavioral or physiological data need to be explored to pave the way for new and improved interface designs. These interfaces of the future hold the promise of becoming more sophisticated, informative, and responsive by utilizing speech/gesture recognition, novel peripherals, eye-tracking, or affect recognition. The role of multimodal interfaces is to find ways to combine multiple sources of user input and meaningful ways of leveraging diverse sources of data in real time to promote usability. These sources can be combined in one of three levels, as outlined in Sharma et al. (1998), that depends on the level of integration (fusion) of distinct sources of data. There is a real opportunity to mitigate the difficulties of a single modality-based interface by combining other inputs.

Collaborative Virtual Environments (CVEs) may be considered as shared virtual environments operating over a computer network (Benford et al., 2001). They have different application domains ranging from health-care (McCloy and Stone, 2001; Rizzo et al., 2011), cultural heritage (White et al., 2007; Liarokapis et al., 2017), and education (Redfern and Galway, 2002; Pan et al., 2006; Faiola et al., 2013; Papachristos et al., 2013) to psychology (Loomis et al., 1999) and neuroscience (Tarr and Warren, 2002). One of the main disadvantages of CVEs is that they do not support non-verbal communication cues (Redfern and Galway, 2002). The typical solution to overcome this problem is to include a representation of the participants in the form of avatars. Although this does not solve the problem, it allows for some form of limited non-verbal communication. As a result, participants of CVEs can interact with objects or issue commands while being observed by the virtually co-located collaborator.

The benefits, design, and evaluation in the field of designing speech and multimodal interactions for mobile and wearable applications were recently presented (Schaffer and Reithinger, 2016). Having a multimodal VR interface can be beneficial for several complex operations as well as new applications, ranging from automotive styling to museum exhibitions. The multimodality can also be achieved by providing different visual representations of the same content so the user can choose the most suitable one (Liarokapis and Newman, 2007). The main principle of the concept of multimodality is that it allows users to switch between different types of interaction technologies. Multimodal interfaces can greatly expand the accessibility of computing to diverse and non-specialist users, for example, by offering traditional means of input like the keyboard and also some uncommon ones like specialized or simplified controllers. They can also be used to promote new forms of computing and improve the expressive power and efficiency of interfaces (Oviatt, 2003).

The flexibility of multimodal interfaces allows for better alternation of input modalities, preventing overuse and physical damage arising from a repeated action during extended periods of use. Furthermore, multimodal interfaces can be used to provide customizable digital content and scenarios (White et al., 2007), and they can bring improvements by combining information derived from audio and visual cues (Krahnstoever et al., 2002). Acquisition of knowledge is also augmented through the use of such a multimodal MR interface compared to a traditional WIMP-based (Windows, Icons, Menu and Pointer) interface (Giraudeau and Hachet, 2017). In fact, one example implementation of a mind-map based system reported in Miyasugi et al. (2017) allows multiple users to edit a mind map by using hand gestures and voice input and share it through VR. Initial comparative experiments with representative mind map support software (iMindMap) found that the task completion time for creating and changing the key images was shorter than that of iMindMap. Currently, there are several software alternatives for mind map creation, XMind and iMindMap being the most famous ones, but most of these solutions are aimed at a single user and support only traditional non-VR interfaces. In the world of VR applications, Noda is one of the most progressive alternatives. Noda utilizes spatial mind maps with nodes positioned anywhere in three-dimensional space, but it does not offer collaboration possibilities.

Having a three-dimensional mind map presents some advantages like increased ability to exploit spatial thinking and theoretically infinite place for storing ideas. On the other hand, spatial placement might decrease the clarity of the mind map as some nodes might be hidden behind the user or other nodes. The one-to-one correspondence with traditional mind mapping software is lost as well, which makes it hard to export the results for later processing and review. This would decrease the usability of the outputs created inside the VR, and it is the reason why our approach works with two-dimensional (2D) mind maps.

Another alternative tool is Mind Map VR, which offers more or less the same functionality as Noda. An interesting feature of Mind Map VR is the ability to switch the surroundings to a different-looking environment. As for collaborative platforms, rumii, CocoVerse (Greenwald et al., 2017), and MMVR (Miyasugi et al., 2017) are closely related to our work. At their core, all of these systems give users the possibility to cooperate in VR. Rumii is, however, aimed mostly at conferencing and presentation, while CocoVerse is aimed at co-creation mainly via drawing, so although mind mapping is theoretically possible, the application is not designed for this purpose.

As MMVR is focused on mind mapping and online collaboration in VR, it has a similar purpose to our application. MMVR utilizes hand gestures to create mind maps with nodes positioned in three-dimensional space. In contrast, our system uses VR controllers for the interaction, and the map canvas is two-dimensional. Like the authors of Noda, the authors of MMVR took a slightly different approach than we did regarding the mind map representation. Besides the differences already mentioned, we tried to bring the mind mapping process closer to the real-world one: the VR controller acts as a laser pointer, while the 2D canvas is a virtual representation of a whiteboard. MMVR also lacks features related to brainstorming, such as voting.

3. System Architecture

The traditional way of brainstorming using post-it notes has several drawbacks. Reshuffling or modifying notes during the process is awkward because post-it notes often fall from the wall, and after being repositioned several times they no longer stick at all. Besides that, collecting multiple notes from multiple places to move them somewhere else is cumbersome. Mapping relationships between post-it notes is another difficult task: one needs to draw lines between the notes and label the lines if needed, but to do this one must often reshuffle the notes to make the relationships visible. Elaborating on a particular topic (for example, a deeper analysis requiring more post-it notes) in one part of the exercise is also difficult, as all of the other post-it notes need to be reshuffled again to make space for the new exercise. It is challenging to draw on a post-it note when needed and then stick it on the wall. Finally, post-exercise analysis is difficult; it typically involves taking a photograph of the result and then manually transcribing it into a different format, for example, a brainstorming “tree” and a Word document as meeting minutes. If it is necessary to perform remote brainstorming, the disadvantages are much more significant, and no flawless solution exists.

Our system is designed in such a way to try to take the best of both the interpersonal brainstorming and software approaches and merge it into one application. The main part of our system is a canvas with a mind map. The canvas serves as a virtual wall providing users space to place their ideas and share them with others. All nodes are positioned at this 2D wall to keep the process as close as possible to the real-world while also providing similar visual style as conventional mind mapping software tools. To simplify collective manipulations, our system introduces a simple gesture. The user draws a shape around the nodes he or she wishes to select and then simply moves them around using a cursor button which appears after the selection ends. This feature is described in more detail in section 3.2. One of the big challenges of VR technology lies in the interaction side. Existing speech-to-text tools were integrated into our system to allow users to use voice-based interaction.

When the process of brainstorming finishes, voting about the best ideas takes place. In real-world exercise, it is hard to make sure that all participants obey the voting rules. Some participants might distribute a different number of points than they should, or they can be influenced by other participants. Our tool provides a voting system, ensuring that the voting points are distributed correctly and without being influenced by other participants. Brainstorming exercise is usually performed with more people at the same place while one of them serves as a moderator. This might not be a problem for teams sharing a workspace, but when it is desired to collaborate with people being physically far away, things are much more complicated. Our tool provides a VR environment where all users can meet, although they might be located at different places in the world. The overview of the different parts of the system is shown in Figure 1. Figure 2 shows a screenshot of the application while brainstorming is in progress.

Figure 1. System overview.

Figure 2. Mind map canvas while brainstorming is in progress.

The software was developed in Unity and C#. The networking system presented in Dolezal et al. (2017) was incorporated into the application. Interaction with the HMDs is possible thanks to the Virtual Reality Toolkit (VRTK) plugin. To run the application, SteamVR is required, as it provides application interfaces for VR devices. Although the application is designed to be used with HMDs, it can also be used on a desktop or laptop computer without an HMD. In this case, a keyboard and mouse are required as input devices. If a microphone is present as well, the speech recognition service can still be used as an input modality. Regarding the HMDs, the system is implemented to work with HTC Vive (Pro) and one controller. Our focus was on making the controls as simple as possible. For this reason, the user needs only two of the controller's buttons (touchpad and trigger) to have complete control over the system features. When the user presses the touchpad, a laser pointer is emitted from the VR controller. The trigger serves as an “action” button; when the user is pointing at some element, pressing the trigger initiates the appropriate action. A video showing some of the system functionality and interaction is in the Supplementary Material.

3.1. Map Nodes

Map nodes are the core component of the system. Each node is represented by a visual element having a color and a label. Map nodes can be modified in several ways: they can be moved, deleted, edited, and updated with new visual styles. It is also possible to create relations between nodes, which are drawn as lines between the nodes involved. Two types of relations were implemented: parent/child and general ones. The former represent a “strong” relation where each node can have only one parent and depends on its ancestors; when any of them is moved or removed, the node is modified as well. The latter type of relation serves mainly a semantic purpose. Each node can have as many of these relations with other nodes as desired while remaining independent. Modifications of the nodes are done using radial menus shown while pointing at a node. This allows users to perform the most important actions while still being focused on the mind map. The content of the node's radial menu is shown in Figure 3. The blue buttons add the aforementioned relations to other nodes. The red button removes nodes, while the green button creates a new node as a child of the currently selected node. The last button allows users to record a text for this node.

Figure 3. Radial menu which opens when a single node is selected.
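The two relation types described in section 3.1 can be illustrated with a small data structure. The following is only a sketch in Python (the actual system is written in C# for Unity), and the names `MindMap`, `add_node`, `relate`, and `remove` are hypothetical. In particular, the source says only that a node "is modified" when an ancestor is removed; removing the whole subtree, as done here, is an assumption.

```python
class MindMap:
    """Illustrative store for nodes, parent/child links, and general relations."""

    def __init__(self):
        self.labels = {}        # node id -> label text
        self.parent = {}        # node id -> parent id (at most one parent per node)
        self.relations = set()  # unordered pairs of node ids (general relations)

    def add_node(self, node_id, label, parent=None):
        if parent is not None and parent not in self.labels:
            raise KeyError(f"unknown parent: {parent}")
        self.labels[node_id] = label
        if parent is not None:
            self.parent[node_id] = parent

    def relate(self, a, b):
        # General relations are purely semantic: no limit per node, no dependency.
        self.relations.add(frozenset((a, b)))

    def descendants(self, node_id):
        found, stack = [], [node_id]
        while stack:
            current = stack.pop()
            for child, par in self.parent.items():
                if par == current:
                    found.append(child)
                    stack.append(child)
        return found

    def remove(self, node_id):
        # Assumption: removing a node also removes everything that depends on it.
        doomed = {node_id, *self.descendants(node_id)}
        for n in doomed:
            self.labels.pop(n, None)
            self.parent.pop(n, None)
        # Drop general relations that touch a removed node.
        self.relations = {r for r in self.relations if not (r & doomed)}
        return doomed
```

Storing the parent link on the child makes the "only one parent" rule hold by construction, while general relations live in a separate set so that they never create a dependency.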

3.2. Selection of Multiple Nodes

Multiple selection works by having the user draw a shape around the nodes they wish to select. The selection shape is drawn with the controller by pressing the touchpad and trigger buttons at the same time while pointing at the canvas. When the selection is finished, the user can move the selected nodes or perform actions provided by the appropriate radial menu. Thanks to this feature, selecting several nodes and changing their visual style, position, or relations is quite simple. In the background, this feature is based on a point-in-polygon test. The selection shape is visually represented as a long red line (technically a polyline), which is converted into a polygon (based on the vertices of the individual line segments of the polyline) after the drawing is finished. Then, for each node, it is computed whether its center lies inside the resulting polygon.
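A point-in-polygon test of the kind mentioned above can be sketched as follows. The article does not say which algorithm the system uses; this Python sketch uses the common ray-casting (even-odd) method, and the function names `point_in_polygon` and `select_nodes` are hypothetical.

```python
def point_in_polygon(point, polygon):
    """Ray casting: count how many polygon edges a ray going right from the point crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # the last vertex connects back to the first
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def select_nodes(node_centers, polyline):
    """Treat the drawn polyline's vertices as a closed polygon and test each node center."""
    if len(polyline) < 3:
        return []  # fewer than three vertices cannot enclose anything
    return [node_id for node_id, center in node_centers.items()
            if point_in_polygon(center, polyline)]
```

For example, with a square selection `[(0, 0), (4, 0), (4, 4), (0, 4)]` and node centers `{"a": (2, 2), "b": (5, 1), "c": (1, 3)}`, `select_nodes` returns `["a", "c"]`.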

3.3. Voice Recognition

Participants accept language technology more readily when it is implemented in an intuitive and easy-to-use way (Schaffer and Reithinger, 2016). Text input is a big challenge for all VR applications, as a traditional keyboard cannot be used properly because it is not visible to the user. It also prevents the user from moving freely. The most straightforward idea is to use a virtual keyboard, but this approach is not very effective, especially with only one controller. For this reason, we decided to use speech-to-text technology. Our system relies on an online speech-to-text service to provide this functionality. The user presses the appropriate button in the radial menu to start the recording, says the desired input, and then ends the recording. The rest is handled by our system in cooperation with the service mentioned above. In the background, the user's recording is converted into an audio file, which is uploaded to the service's servers. These servers process the audio and return a response containing the recognized text. The whole process runs on a separate thread so that the program is not blocked while speech is transformed into the result.
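The idea of keeping the application responsive while the recording is processed can be sketched with a worker thread and a result queue. This is a Python illustration only, not the system's C# code; `fake_transcribe` is a hypothetical stand-in for the upload to the remote service, whose API is not described in the article.

```python
import queue
import threading


def fake_transcribe(audio_bytes):
    """Hypothetical stand-in for the remote speech-to-text call."""
    return audio_bytes.decode("utf-8").strip()


def transcribe_in_background(audio_bytes, results, transcribe=fake_transcribe):
    """Run the (possibly slow) transcription on its own thread.

    The caller keeps running and later reads ("ok", text) or ("error", message)
    from the `results` queue.
    """
    def worker():
        try:
            results.put(("ok", transcribe(audio_bytes)))
        except Exception as exc:  # report failures instead of losing them
            results.put(("error", str(exc)))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

In a render loop, the main thread would poll the queue with `results.get_nowait()` once per frame and update the node's label when a result arrives, so a slow network never stalls the display.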

3.4. Voting

Voting is a special state of the application during which nodes cannot be edited and which provides an updated user interface where each node is accompanied by plus and minus buttons and a text box with points. This allows participants to assign points easily. Voting consists of several rounds where during each round, one color to be voted about is chosen. Voting is led by a moderator of the brainstorming who decides about the colors to vote about and assigns the number of points to distribute between the voted ideas. For each such voting round, participants see only the number of points they assigned, and they have to distribute all points. When a moderator tries to end the voting round, the system checks whether all participants distributed their points and if not, then the round cannot be closed. When the voting ends, all participants see the summary of points for each node. Winners in each category are made visually distinct.
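The rule that a round cannot be closed until every participant has distributed all of their points can be sketched as below. This is a simplified Python illustration; the class name `VotingRound` and its methods are hypothetical and do not come from the system's code.

```python
class VotingRound:
    """One voting round: every participant must hand out exactly `points` points."""

    def __init__(self, color, points, node_ids, participants):
        self.color = color    # the node color being voted on in this round
        self.points = points  # points each participant has to distribute
        self.votes = {p: {n: 0 for n in node_ids} for p in participants}

    def remaining(self, participant):
        return self.points - sum(self.votes[participant].values())

    def add_point(self, participant, node_id):
        """The 'plus' button: refuse once the participant's budget is spent."""
        if self.remaining(participant) <= 0:
            return False
        self.votes[participant][node_id] += 1
        return True

    def remove_point(self, participant, node_id):
        """The 'minus' button: never go below zero for a node."""
        if self.votes[participant][node_id] <= 0:
            return False
        self.votes[participant][node_id] -= 1
        return True

    def can_close(self):
        return all(self.remaining(p) == 0 for p in self.votes)

    def close(self):
        """Moderator action: only succeeds when all points are distributed."""
        if not self.can_close():
            raise ValueError("some participants still have points to distribute")
        totals = {}
        for ballot in self.votes.values():
            for node_id, pts in ballot.items():
                totals[node_id] = totals.get(node_id, 0) + pts
        return totals
```

Keeping each participant's ballot separate until `close` is called mirrors the described behavior in which participants see only their own points during the round and the summary afterwards.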

3.5. Online Collaboration

The core of the network-related part of the system is the Collaborative Virtual Environments (CVR) platform (Dolezal et al., 2017), which utilizes Unity Networking (UNET) technology (its core structure is shown in Figure 4). The system works on a host/client model, in which one user is a server and a client at the same time while the other users are just clients. Each user is represented by an abstract capsule avatar (as shown in Figure 5) with an HMD and a VR controller in hand. The positions of both the avatar and the controller are synchronized over the network. Online collaboration also includes a node-locking system, which prevents users from modifying a node while another user is working with it, and a controller-attached laser pointer, which gives users immediate feedback about the place they or other users are pointing to. The node-locking functionality is based on the concept of node managers. When a client points at a node, the system checks locally whether the node is locked for this client. If the node is already locked, it is not selected. Otherwise, the client sends a request to the server to lock the node. The server processes these requests sequentially and, for each request, verifies that the claimed node has no manager; otherwise it denies the request. If the node has no manager yet, the server makes the requesting user the manager of the node and sends a remote procedure call (RPC) to the rest of the clients announcing that the node is locked. When a node is deselected, an unlock message is sent to the server, which then propagates this information to the clients.

Figure 4. UNET function calls.

Figure 5. Representation of the user in VR environment with overlaid image of real users.
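The node-locking exchange described in section 3.5 can be sketched in a few lines. This Python sketch is not UNET code: the classes `LockServer` and `Client` are hypothetical, RPCs are modeled as plain method calls, and sequential processing is implied by the single-threaded execution.

```python
class LockServer:
    """Keeps track of which client currently manages (locks) each node."""

    def __init__(self):
        self.managers = {}  # node id -> name of the managing client
        self.clients = []

    def request_lock(self, client_name, node_id):
        if node_id in self.managers:
            return False  # the node already has a manager: deny the request
        self.managers[node_id] = client_name
        for client in self.clients:  # stands in for the RPC to the other clients
            if client.name != client_name:
                client.on_locked(node_id)
        return True

    def release(self, client_name, node_id):
        if self.managers.get(node_id) != client_name:
            return False  # only the current manager may unlock
        del self.managers[node_id]
        for client in self.clients:
            if client.name != client_name:
                client.on_unlocked(node_id)
        return True


class Client:
    def __init__(self, name, server):
        self.name = name
        self.server = server
        self.locked_by_others = set()  # local view used for the cheap first check
        server.clients.append(self)

    def point_at(self, node_id):
        if node_id in self.locked_by_others:
            return False  # local check: no round trip to the server needed
        return self.server.request_lock(self.name, node_id)

    def deselect(self, node_id):
        return self.server.release(self.name, node_id)

    def on_locked(self, node_id):
        self.locked_by_others.add(node_id)

    def on_unlocked(self, node_id):
        self.locked_by_others.discard(node_id)
```

The local check saves a network round trip in the common case, while the server-side check remains the single authority that resolves two clients pointing at the same node at nearly the same time.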

3.6. Mind Map Export

At any time during the brainstorming, users of our system can export the mind map into the open-source XMind format. Possibilities of the format are fully utilized, therefore most of the important information like node visuals, relations, or points received during voting are exported. Support of mind map export provides an ability to access the brainstorming results later on or even modify them in other software tools. The mind map is also regularly saved to a backup file which is stored in a custom JavaScript Object Notation (JSON)-like format. This format was designed to be as simple and as fast as possible while still storing all the necessary information. The backup file can be loaded at any time during the brainstorming, making it therefore possible to restore mind mapping progress in case of some failure like lost internet connection.
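Periodic backup and restore can be sketched as follows. The article describes a custom JSON-like format without giving its details, so this Python sketch swaps in standard JSON; the function names and the `version` and `nodes` fields are assumptions made for illustration.

```python
import json
import os
import tempfile


def save_backup(nodes, path):
    """Write the backup to a temporary file first, then swap it into place.

    The swap means a crash in the middle of a save cannot leave a half-written backup.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump({"version": 1, "nodes": nodes}, handle)
    os.replace(tmp_path, path)


def load_backup(path):
    """Read the node list back from a backup file."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)["nodes"]
```

Calling `save_backup` on a timer and `load_backup` after a failure, such as a lost connection, would restore the last saved state of the map.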

4. Methodology

This section presents the methodology of the experiment performed for collecting information about the application.

4.1. Participants and Questionnaires

The study consisted of a total of 32 healthy participants (19 males, 13 females) and testing took place in pairs (16 individual groups). Participants were a voluntary sample, recruited based on their motivation to participate in the study. All subjects signed informed consent to participate in the study and to publish their anonymous data. They were aged from 18 to 33 years old, and all of them were regular computer users. They were rather inexperienced with mind maps and generally had some experience with remote collaboration. The very first step was to explain the workflow of the experiment to participants. Then, statistical and demographic data were collected. After the completion of the experiment, subjects were asked to fill in questionnaires related to the recent experience. Two questionnaires were used. The first one focused on measuring presence in VR (Witmer and Singer, 1998; Witmer et al., 2005). The second questionnaire aimed at assessing the cognitive workload and was based on the NASA Task Load Index (Hart, 2006). The subjects were also asked to fill in a free-form debriefing session questionnaire, where they provided qualitative feedback for the whole experiment.

4.2. Procedure

The procedure of user testing consisted of two main steps. Participants were located in different rooms, and during the first 10–15 min, depending on the skill of the individual user, each of them was alone in the virtual environment while being introduced to the system and presented with its features. While trying the system functionality, the participant's feedback was gathered. The second part of the evaluation consisted of participants trying to brainstorm on a scenario. To assess the functionality of the system, a number of different brainstorming scenarios were designed. The topics that were chosen include: (a) How to cook an egg properly, (b) What is best to do on Friday night, (c) How will artificial intelligence take over the world, (d) Wine selection for dinner, and (e) Propose your own topic. The given topic for the experiment was “What is best to do on Friday night.” The process was directed by a moderator and contained the following steps:

1. Participants were asked to work together to present possible ways to spend Friday night using nodes on the wall.

2. Participants were asked to assign other specific properties to the ideas from the previous exercise and to use a different color for these nodes.

3. Each participant was asked to select one idea and add nodes describing a concrete proposal.

4. Participants were asked to present the results of the previous exercise to each other.

5. Participants ran a voting session. One of the participants took the role of voting moderator, while the second one acted as a voting participant.

Time of completion for each of the steps was measured and the behavior of the participants was monitored in order to get another type of feedback.

5. Results

5.1. Qualitative Results

The participants provided us with valuable feedback necessary for further improvements. The feedback was gathered not only by direct communication with participants but also by watching their behavior during the actual scenario. Thanks to this approach, it was possible to collect useful information during the whole time of the testing. During the debriefing, we asked participants whether they know any other tools which can be used for remote brainstorming or collaboration and if they can find some (dis)advantages of our system in comparison to these tools. The mentioned tools included services like Skype, Google Docs/Hangouts, Slack, Facebook, Team Speak, IBM Sametime, and video conferencing platforms.

The most commonly mentioned advantage of our system was immersion. Quoting one of the users, “It makes you feel like you are brainstorming in the same room on a whiteboard (…).” Similarly, the ability to see what is going on was praised, mainly the fact that the users are represented as avatars with a laser pointer instead of abstract rectangles with names, as is common in some applications. Another advantage, in comparison to other tools known to participants, was the absence of outside distractions. It was also mentioned several times that our application is more fun than comparable tools. Regarding the disadvantages, the inability to see other people's faces was mentioned. Many users also pointed out the need for appropriate hardware, i.e., that such an application requires more equipment and preparation than the tools they know. Another drawback was physical discomfort, mainly the requirement to wear an HMD. Some users mentioned that it takes more time to get familiar with the interface in comparison to common tools they know. Also, some participants considered the speed with which ideas can be generated to be slower than with conventional platforms.

At the end of the experiment, users gave us general feedback about the application. We expanded this feedback by insights we collected by observing their behavior. The most mentioned drawback of the system was the position of the mind map canvas. It was positioned too high, forcing users to look up all the time, which resulted in physical discomfort and decreased the readability of nodes which were positioned at the top of the canvas. Some users also had some remarks about the speed and reliability of the used speech-to-text service. The application itself was generally considered as responsive, although the user interface has space for improvement. Mainly at the beginning, users tended to forget to stop the voice recording after they finished saying the desired text for a node. Also, the difference between parent-child relations and general relations was not clear enough. Regarding the environment, some participants spoke favorably about the space surroundings; on the other hand, one user mentioned that there exists a risk of motion sickness or nausea for some people. Others mentioned that the text at the top of the canvas is hardly readable. Unfortunately, the pixel density of HMD is not good enough at such distance, so it is necessary to consider this drawback when designing similar types of applications. We also noticed that the system of node locking sometimes slows down the work.

Participants also provided some ideas for further improvements. One mentioned that it would be good to have the possibility to decide whether to hide or show the activity (including laser pointers) of other people during the voting. Another one pointed out that the current selection of color themes is not very visually pleasing and that it might be good to use a better color palette. One participant said that it might be useful to have more control over voice so you can mute yourself or others, for example, when saying a text for a node. The ability to change the size of the node's text would also be a welcome addition for some users. Overall, the application seemed to be quite immersive, but at the price of increased physical demand and possibly slower pacing.

5.2. Quantitative Results

The first part of this section presents a compound histogram summarizing participants' evaluations of core system features. Each user assigned one (= poor) to five (= excellent) points to each feature.

Figure 6 confirms the observed behavior, namely that users had no major problems when trying to create or delete nodes. Deletion might have been rated slightly worse because, when a node is deleted, its radial menu remains open until the user points elsewhere. Although the menu no longer works, it is a bit confusing that it is still present. This behavior is going to be addressed in the future to deliver a smoother user experience. The distribution of yellow-colored responses in Figure 6 shows that the mechanism for moving nodes was not as user-friendly as desired for some participants. This might be caused by the fact that moving a node fails (i.e., the node returns to its previous state) when both controller buttons are released at the same time. This was a slight complication for some users. The red values, which show evaluations of the change text feature, have a distribution with a mean of 2.94; it can therefore be said that the speech recognition in its current state is acceptable. The question is how it would perform if a different scenario with more complicated words were used. Hence, although the performance is not entirely bad, there is room for improvement in both the user interface and recognition quality. It might then be worth considering whether to stick to the current speech recognition solution or try something else. Another idea to think about is to utilize multimodality even in text input. It was not unusual for a word said by the user to be recognized as a different but very similar one, so that the difference was just a few letters. It might come in handy to have a quick way of fixing these errors, either in the form of a virtual keyboard or some dictionary-like mechanism.

Figure 6. Evaluation of usability of system features.
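The dictionary-like correction mechanism suggested above is described only as an idea, so the following is a minimal sketch under stated assumptions rather than part of the evaluated system. It assumes a vocabulary of words already used in the session and uses Python's standard `difflib` module to propose near matches for a misrecognized word; the function name `suggest_corrections`, the sample vocabulary, and the similarity cutoff are all hypothetical.

```python
from difflib import get_close_matches


def suggest_corrections(recognized, vocabulary, limit=3, cutoff=0.6):
    """Return up to `limit` vocabulary words that closely resemble `recognized`."""
    # Compare case-insensitively, but return the vocabulary's own spelling.
    lowered = {word.lower(): word for word in vocabulary}
    matches = get_close_matches(
        recognized.lower(), list(lowered), n=limit, cutoff=cutoff
    )
    return [lowered[m] for m in matches]


# Hypothetical session vocabulary (not taken from the study).
vocabulary = ["budget", "marketing", "deadline", "prototype", "feedback"]

# A recognition result that differs from the intended word by one letter.
suggestions = suggest_corrections("marketin", vocabulary)
print(suggestions)
```

In a VR interface, such suggestions could be offered as a short pick list next to the node, so that the user selects the intended word with the controller instead of repeating the voice input.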

Table 1 presents the results obtained based on Spearman's correlation. An interesting point is the relation between stated physical demand and frustration. When users felt physical discomfort, caused, for example, by a canvas placed too high or by the weight of the HMD, they became more frustrated. Physical demand can be partly decreased by improving the application's interface, but as long as an HMD is used, there will always be a certain level of discomfort. Another interesting output is the correlation between the TLX temporal demand and effort. Participants who considered the pace of the task hurried felt that they had to work harder in order to accomplish the task. In this case, improvement of the speech-to-text service might be helpful. There was also a strong correlation between answers to “How easy did you find the cooperation within the environment?” and “How quickly did you adjust to the VR environment?” A negative correlation was found between satisfaction with the “change text” functionality and answers to TLX questions regarding the feeling of frustration and physical demand. Since this is a key feature from the system perspective, it is used a lot, and when the user does not feel comfortable with it, it might make him or her tired both physically and mentally. Finally, users who considered the visual display quality of the HMD distracting and unsatisfactory felt that the task was more physically demanding. This is partially due to the technological limits of current HMDs, but certain design aspects could also be improved. The idea is to improve the colors and sizes of UI elements to decrease the users' eye strain caused by the relatively low pixel density of HMDs.

Table 1. Outputs of selected Spearman's correlation coefficients.
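For readers unfamiliar with the statistic, the sketch below shows one common way to compute Spearman's correlation coefficient: rank both samples (tied values share the mean of their ranks) and take the Pearson correlation of the ranks. It is an illustration only; the article does not say which software the authors used, and the ratings in the example are hypothetical, not data from the study.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with at least two values")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one sample is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical 1-5 questionnaire ratings from six participants.
physical_demand = [1, 2, 2, 3, 4, 5]
frustration = [1, 1, 2, 3, 5, 4]
rho = spearman_rho(physical_demand, frustration)
print(f"rho = {rho:.2f}")
```

Because questionnaire answers are ordinal and contain many ties, a rank-based coefficient with tie handling is generally a more natural fit for this kind of data than a plain Pearson correlation on the raw scores.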

5.3. Log Results

The activity of participants during the testing scenario was logged in order to get more information about their behavior as well as the effectiveness of our platform. The stored information contains general actions performed by the participant (e.g., node creation and deletion) and visualizations of mind map canvas interactions. The median collaboration time for the scenario was 19 min and 5 s (excluding explanations of each step). Nodes were created only during the first three steps of the scenario; the median of these times is 14 min and 20 s. This corresponds to an average speed of ~1–2 nodes per minute, since the median number of nodes created during the scenario was 24. It is worth mentioning that the speed, and therefore the duration, of the brainstorming depends on the creativity of the users. The fastest pair was able to create 3.3 nodes per minute on average, while the slowest one achieved a speed of nearly one node per minute. The relation between the number of nodes at the end of the exercise and the total time is shown in Figure 7.
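As a rough check of the rate quoted above, the short sketch below divides the median node count by the median node-creation time. This is an assumption about how the ~1–2 nodes per minute figure can be approximated; the article does not spell out the derivation, and a median of per-pair rates could differ. The function name `nodes_per_minute` is hypothetical.

```python
def nodes_per_minute(node_count, minutes, seconds=0):
    """Average node creation rate for one session."""
    total_minutes = minutes + seconds / 60
    if total_minutes <= 0:
        raise ValueError("duration must be positive")
    return node_count / total_minutes


# Median values reported in the text: 24 nodes in 14 min 20 s.
median_rate = nodes_per_minute(24, 14, 20)
print(f"{median_rate:.2f} nodes per minute")  # prints "1.67 nodes per minute"
```

The result falls inside the reported range and between the slowest (nearly 1) and fastest (3.3) pairs mentioned in the text.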

Figure 7. Scatter plot of collaboration times and number of nodes (two testing pairs had exactly the same time and node count so there are only 15 visible points).

This could be justified in several ways. First, the users with a higher node count might simply have been more creative than the rest, and so it was easier for them to come up with new ideas. Moreover, as each step of the study was not limited by time but rather by a rough minimum number of nodes, participants had no problems creating more than enough nodes to continue. The flow of the session was also not interrupted as much by the time spent thinking about possible ideas. The effect can also be caused by differences in the communication methods between participants. In any case, this confirms that the speed of the brainstorming does not depend only on the system capabilities. Another result from the logs is shown in Figure 8, which was created by merging the heatmaps of all tested users. The points in the image represent positions on the mind map canvas that were “hit” by a laser pointer while aiming at the canvas and selecting. The RGB colors indicate the relative number of hits at a given pixel, with red marking the most “hit” pixels, blue the least “hit” pixels, and green those somewhere in between. Figure 9 shows an averaged heatmap of selected nodes of all users. It indicates the positions where nodes were selected for the longest time: the higher the opacity, the longer the position was covered by a selected node.

Figure 8. Merged heatmap with pointer movements of all users.

Figure 9. Merged heatmap highlighting positions of selected nodes.
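The merging and coloring of pointer heatmaps described above can be sketched as follows. This is a minimal illustration, not the authors' logging code: it assumes hits are stored as integer canvas cell coordinates, that all per-user grids share the same size, and that colors are interpolated linearly from blue through green to red. All function names and the sample data are hypothetical.

```python
def accumulate_hits(hits, width, height):
    """Count pointer hits per canvas cell; hits are (x, y) cell coordinates."""
    grid = [[0] * width for _ in range(height)]
    for x, y in hits:
        if 0 <= x < width and 0 <= y < height:
            grid[y][x] += 1
    return grid


def merge_grids(grids):
    """Sum per-user grids of equal size into one merged grid."""
    height, width = len(grids[0]), len(grids[0][0])
    merged = [[0] * width for _ in range(height)]
    for grid in grids:
        for y in range(height):
            for x in range(width):
                merged[y][x] += grid[y][x]
    return merged


def hit_color(count, max_count):
    """Map a hit count to RGB: blue (fewest) -> green -> red (most)."""
    if count == 0 or max_count == 0:
        return None  # never hit: leave the pixel uncolored
    t = count / max_count  # relative amount of hits, in (0, 1]
    if t < 0.5:
        s = t / 0.5
        return (0, round(255 * s), round(255 * (1 - s)))
    s = (t - 0.5) / 0.5
    return (round(255 * s), round(255 * (1 - s)), 0)


# Hypothetical hits of two users on a tiny 4 x 3 canvas.
user_a = accumulate_hits([(1, 1), (1, 1), (2, 0)], width=4, height=3)
user_b = accumulate_hits([(1, 1), (3, 2)], width=4, height=3)
merged = merge_grids([user_a, user_b])

max_count = max(max(row) for row in merged)
colors = [[hit_color(count, max_count) for count in row] for row in merged]
```

Normalizing by the maximum count makes the colors relative, which matches the description of red as the most “hit” and blue as the least “hit” pixels; an opacity-based variant, as in the heatmap of selected nodes, could reuse the same normalized value as an alpha channel.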

An observation regarding both heatmaps is that the space near the corners of the mind map is unused. This suggests a tendency of users to place most of the nodes near the central root node. Another interesting point is the significant difference in heatmap density between the bottom and the upper half of the canvas. This supports the observation that readability might be reduced in the upper half of the canvas, and that users therefore prefer nodes which are closer to them, i.e., at the bottom of the canvas. Figure 9 also reveals that users generally like to move the nodes around as they wish, and they do not just stick to the default automatic circular placement. This means that it is necessary to have a good interface for node movement. To be more precise about the movement, Figures 10 and 11 show two heatmaps that cluster the users into two categories. The first type is less common, prefers to stick to the default node placement, and makes only minor changes, while the second category of users is more active in this regard. This is also related to another observed user behavior: some people use the laser pointer nearly all the time while others use it only when necessary.

Figure 10. Example heatmap of the first type of users reorganizing nodes rather rarely.

Figure 11. Example heatmap of the second type of users with more active mind map reorganization.

6. Conclusions and Future Work

This paper presented a collaborative multimodal VR mind map application allowing several participants to fully experience the process of brainstorming. It covered both (a) the idea-generation phase and (b) the voting procedure. Multimodality was achieved through the combination of speech and VR controllers. To verify the usability of the system, an experiment with 32 participants (19 males, 13 females) was conducted. Users were tested in pairs and filled in several questionnaires summarizing their experience. The results indicate that the system performs according to its specifications and does not show critical problems. In terms of user feedback, comments mainly concern minor issues with the usability of the environment and can be clustered as design issues.

Furthermore, there are many possibilities for improving and extending the application. Besides general improvements to the interface, the avatars will be exchanged for more realistic ones. Also, name tags will be added to identify individual participants. Thanks to the integrated voice solution, some speech-related features will be added, for example, automatic muting of users when they are saying a label for a node. Moreover, there is also going to be visual feedback, such as an icon or mouth animation, to make it clear which user is speaking. Possibilities of hand gesture controls will be examined as well. Finally, a comparative user study will be made between traditional platforms for remote collaboration and the VR mind map to assess the advantages and disadvantages of each approach.

Portalco delivers an alternate method of delivery for any VR or non-VR application through advanced interactive (up to 360-degree) projection displays. Our innovative Portal range includes immersive development environments ready to integrate with any organisational, experiential or experimental requirement. The Portal Play platform is the first ready-to-go projection platform of its type and is designed specifically to enable mass adoption for users to access, benefit and evolve from immersive projection technologies & shared immersive rooms.

Achieving Presence Through Evoked Reality

Jayesh S. Pillai, Colin Schmidt and Simon Richir

The following report collates a variety of information and perspectives on our multiple realities and how these can impact an immersive experience but more so the human experience which is so critical to long-lasting and user-focused digital experiences that improve memorization, understanding and engagement as a whole.

The sense of “Presence” (evolving from “telepresence”) has always been associated with virtual reality research and is still an exceptionally mystifying constituent. Now the study of presence clearly spans over various disciplines associated with cognition. This paper attempts to put forth a concept that argues that it’s an experience of an “Evoked Reality (ER)” (illusion of reality) that triggers an “Evoked Presence (EP)” (sense of presence) in our minds. A Three Pole Reality Model is proposed to explain this phenomenon. The poles range from Dream Reality to Simulated Reality with Primary (Physical) Reality at the center. To demonstrate the relationship between ER and EP, a Reality-Presence Map is developed. We believe that this concept of ER and the proposed model may have significant applications in the study of presence, and in exploring the possibilities of not just virtual reality but also what we call “reality.”


Research on presence has brought to our understanding various elements that certainly cause or affect the experience of presence in one way or another. But in order to evoke an illusion of presence, we in effect try to generate an illusion of reality different from our apparent (real world) reality through different mediations like Virtual Reality. The attempt to evoke an illusory reality is what brought researchers to think about presence in the first place. “Reality,” despite being a major concept, is most often either overlooked or confused with other aspects that affect presence. To study presence we must first understand the reality evoked in one’s mind. It is this illusion of reality that forms a space-time reference in which one would experience presence. It is evident from the research in the field of virtual reality that if a medium is able to create a convincing illusion of reality, there will certainly be a resultant feeling of presence. Various theories have been proposed to explore and define the components of this mediated presence. We aim to abridge those theories in an efficient manner. Moreover, studies in the field of cognition and neuroscience confirm that the illusion of reality can as well be non-mediated (without the help of external perceptual inputs), that is, purely evoked by our mind with an inception of corresponding presence. One of the most common but intriguing examples of a non-mediated illusion of reality would be – a dream. This self-evoking faculty of mind leading to the formation of presence is often neglected when observed from the perspective of virtual reality.

Sanchez-Vives and Slater (2005) suggest that presence research should be opened up, beyond the domain of computer science and other technologically oriented disciplines. Revonsuo (1995) proposed that we should consider both – the dreaming brain and the concept of Virtual Reality, as a metaphor for the phenomenal level of organization; they are excellent model systems for consciousness research. He argues that the subjective form of dreams reveals the subjective, macro-level form of consciousness in general and that both dreams and the everyday phenomenal world may be thought of as constructed “virtual realities.”

According to Revonsuo (2006), any useful scientific approach to the problem of consciousness must consider both the subjective psychological reality and the objective neurobiological reality. In Virtual Reality it’s not just the perceptual input and the technical faculties that contribute to a stronger illusion of reality but also various psychological aspects (Lombard and Ditton, 1997; Slater, 2003, 2009) relating to one’s emotion, attention, memory, and qualia (Tye, 2009) that help mold this illusion in the mind. In the case of non-mediated illusion of reality like dreams or mental imagery, the perceptual illusion is generated internally (Kosslyn, 1994, 2005; LaBerge, 1998). The dream images and contents are synthesized to fit the patterns of those internally generated stimulations creating a distinctive context for the dream reality (DR; Hobson and McCarley, 1977; Hobson, 1988). Whether mediated or non-mediated, the illusion of reality is greatly affected by the context. “A context is a system that shapes conscious experience without itself being conscious at that time” (Baars, 1988, p. 138). Baars describes how some types of contexts shape conscious experience, while others evoke conscious thoughts and images or help select conscious percepts. In fact it’s a fine blend of perceptual and psychological illusions (explained in section The Illusion of Reality) that leads to a strong illusion of reality in one’s mind. We attempt to explore this subjective reality that is the fundamental source of experience for presence.

Presence and Reality

With the growing interest in the field of Virtual Reality, the subject of presence has evolved to be a prime area of research. The concept of presence, as Steuer (1992) describes, is the key to defining Virtual Reality in terms of human experience rather than technological hardware. Presence refers not to one’s surroundings as they exist in the physical world, but to the perception of those surroundings as mediated by both automatic and controlled mental processes.


Presence is a concept describing the effect that people experience when they interact with a computer-mediated or computer-generated environment (Sheridan, 1992). Witmer and Singer (1994) defined presence as the subjective experience of being in one environment (there) when physically in another environment (here). Lombard and Ditton (1997) described presence as an “illusion of non-mediation” that occurs when a person fails to perceive or acknowledge the existence of a medium in his/her communication environment and responds as he/she would if the medium were not there. Although their definition is confined to presence due to a medium, they explained how the concept of presence is derived from multiple fields – communication, computer science, psychology, science, engineering, philosophy, and the arts. Presence induced by computer applications or interactive simulations was believed to be what gave people the sensation of, as Sheridan called it, “being there.” But the studies on presence progressed with a slow realization of the fact that it’s more than just “being there.” We believe that presence, whether strong or mild, is the result of an “experience of reality.”

In fact “presence” has come to have multiple meanings, and it is difficult to have any useful scientific discussion about it given this confusion (Slater, 2009). There can be no advancement simply because when people talk about presence they are often not talking about the same underlying concept at all. No one is “right” or “wrong” in this debate; they are simply not talking about the same things (Slater, 2003). On the general problems in conveying knowledge due to the intersection of the conceptual, material, and linguistic representations of the same thing, there exists an attempt to explain the workings of communication and its mishaps (Schmidt, 1997a,b, 2009), which clearly states that scientists must always indicate which representation they speak of. In this article, we are mainly speaking about the phenomenon, which is the experience of presence.


The term “reality” itself is very subjective and controversial. While objectivists may argue that reality is the state of things as they truly exist and is mind-independent, subjectivists would reason that reality is what we perceive to be real, and there is no underlying true reality that exists independently of perception. Naturalists argue that reality is exhausted by nature, containing nothing supernatural, and that the scientific method should be used to investigate all areas of reality, including the human spirit (Papineau, 2009). Similarly, a physicalist idea is that the reality and nature of the actual world conforms to the condition of being physical (Stoljar, 2009). Reality is independent of anyone’s beliefs, linguistic practices, or conceptual schemes from a realist perspective (Miller, 2010). The Platonist view is that reality is abstract and non-spatiotemporal with objects entirely non-physical and non-mental (Balaguer, 2009). While some agree that the physical world is our reality, the Simulation Argument suggests that this perceivable world itself may be an illusion of a simulated reality (SR; Bostrom, 2003). Still others would endeavor to say that the notion of physical world is relative as our world is in constant evolution due to technological advancement; also because of numerous points of view on its acceptation (Schmidt, 2008). Resolving this confusion about theories on reality is not our primary aim and is, moreover, beyond the scope of this study. So we reserve the term “Primary Reality” to signify the reality of our real world experiences, which will be explained later in this paper.

The Illusion of Reality

The factors determining the experience of presence in a virtual environment have been explored by many in different ways. For example, presence due to media has previously been reviewed as a combination of:

• Perceptual immersion and psychological immersion (Biocca and Delaney, 1995; Lombard and Ditton, 1997).

• Perceptual realism and social realism (Lombard and Ditton, 1997).

• Technology and human experience (Steuer, 1992, 1995).

• Proto-presence, core-presence, and extended-presence (Waterworth and Waterworth, 2006).

• Place illusion and plausibility illusion (Slater, 2009).

To summarize, the two main factors that contribute to the illusion of reality due to media are (1) Perceptual Illusion: the continuous stream of sensory input from a medium, and (2) Psychological Illusion: the continuous cognitive processes with respect to the perceptual input, responding almost exactly how the mind would have reacted in Primary Reality. Virtual reality systems create the highest levels of illusion simply because they can affect more senses and help us experience the world as if we were inside it, with continuously updated sensory input and the freedom to interact with virtual people or objects. However, other forms of media, like a movie (where the sensory input is merely audio-visual and there is no means to interact with the reality presented), can still create a powerful illusion if they manage to create a stronger Psychological Illusion through their content (for example, a story related to one’s culture or past experiences would excite the memory and emotional aspects). One of the obvious examples illustrating the strength of Perceptual Illusion is a medium that enforces a stereoscopic view, enhancing our depth perception (the illusion works due to the way our visual perception would work otherwise, without a medium). The resultant of the two, Perceptual Illusion and Psychological Illusion, evokes an illusion of reality in the mind, although subjectively varying for each person – in strength and experience.

The Concept of “Evoked Reality”

We know that it’s not directly presence that we create but rather an illusion in our minds as a result of which we experience presence. When we use virtual reality systems and create convincing illusions of reality in the minds of users, they feel present in it. This illusion of reality that we evoke through different means in order to enable the experience of presence is what we intend to call “Evoked Reality (ER).” To explore this experience of presence we must first better understand what ER is.

As deduced earlier, all the factors influencing presence would essentially be categorized as Perceptual Illusion and Psychological Illusion. We believe that every medium in a way has these two basic elements. Thus ER is a combined illusion of Perceptual Illusion and Psychological Illusion. This combined spatiotemporal illusion is what evokes a different reality in our minds (Figure 1) inducing presence.

Figure 1. Spatiotemporal illusion due to mediation: reality so evoked generates the experience of presence

Evoked Reality

Even though the terms like telepresence and virtual reality are very recent, their evidence can be traced back to ancient times. The urge to evoke reality different from our Primary Reality (real world reality) is not at all new and can be observed through the evolution of artistic and scientific media throughout history. “When anything new comes along, everyone, like a child discovering the world, thinks that they’ve invented it, but you scratch a little and you find a caveman scratching on a wall is creating virtual reality in a sense. What is new here is that more sophisticated instruments give you the power to do it more easily. Virtual Reality is dreams.” Morton Heilig. (as quoted in Hamit, 1993, p. 57).

From Caves to CAVEs

Since the beginning of civilizations, man has always tried to “express his feelings,” “convey an idea,” “tell a story” or just “communicate” through a number of different media. For example, the cave paintings and symbols that date back to prehistoric times may be considered as one of the earliest forms of media used to convey ideas. As technology progressed media evolved as well (Figure 2) and presently we are on the verge of extreme possibilities in mediation, thus equivalent mediated presence.

Figure 2. Evolution of media: from caves to CAVEs

We all like to experience presence different from our everyday happenings. To do so, we basically find methods to create an illusion of reality different from the reality that we are familiar with. With the help of different media we have already succeeded in evoking a certain amount of presence and we further aim for an optimum level – almost similar to our real world. Every form of mediation evokes a different kind of illusory reality and hence different degrees of presence. In the early examples of research in presence, studies were conducted based on television experiences before Virtual Reality became a more prominent field of research (Hatada and Sakata, 1980). While some types of media evoke a mild illusion of presence, highly advanced media like Virtual Reality may evoke stronger presence. “But we must note that the basic appeal of media still lies in the content, the storyline, the ideas, and emotions that are being communicated. We can be bored in VR and moved to tears by a book” (Ijsselsteijn, 2003). This is precisely why the reality evoked (by media) in one’s mind depends greatly on the eventual psychological illusion, although it may have been triggered initially by a perceptual illusion. Media that could evoke mild or strong presence may range from simple paintings to photos to televisions to films to interactive games to 3D IMAX films to simulation rides to immersive Virtual Reality systems.

Evoked Reality

Evoked Reality is an illusion of reality, different from our Primary Reality (Physical Reality as referred to in previous studies). ER is a transient subjective reality created in our mind. In the case of ER due to media, the illusion persists as long as the uninterrupted input of perceptual stimuli (causing perceptual illusion) and the simultaneous interactions (affecting the psychological illusion) continue. The moment at which this illusion of ER breaks due to an anomaly is when we experience what is called a “Break in Presence (BIP)” (Slater and Steed, 2000; Brogni et al., 2003). Thus a BIP is simply an immediate result of the “Break in Reality (BIR)” experienced. Different kinds of media can evoke realities of different qualities and different strengths in our minds for different amounts of time. It’s an illusion of space or events, where or during which we experience a sense of presence. Thus, it is this ER in which one may experience Evoked Presence (EP).

Evoked Presence

Depending on the characteristics of ER, an experience of presence is evoked. To be more specific, we would like to refer to this illusion of presence created by ER as EP. In this paper, the term “EP” would imply the illusion of presence experience (the sense of presence), while the term “presence” would be reserved for experience of presence in its broad sense (real presence and the sense of presence). EP is the spatiotemporal experience of an ER. We could say that so far it’s through media like highly immersive virtual reality systems that we were able to create ER that could evoke significantly strong EP.

Media-Evoked Reality and Self-Evoked Reality

As we saw before, ER is a momentary and subjective reality created in our mind due to the Perceptual Illusion and Psychological Illusion imposed by a medium. It is clear that due to ER induced through media like Virtual Reality we experience an EP. This illusion of reality evoked through media, we would like to call “Media-Evoked Reality” or Media-ER.

As mentioned earlier, it’s not just through the media that one can evoke an illusion of reality. The illusion can as well be endogenously created by our mind evoking a seemingly perceivable reality; whether merely observable or amazingly deformable; extremely detailed or highly abstract; simple and familiar or bizarrely uncanny. Thus to fully comprehend the nature of presence, we must study this category of ER that does not rely on media. In fact, we always or most often undergo different types of presence without mediation. Sanchez-Vives and Slater (2005) proposed that the concept of presence is sufficiently similar to consciousness and that it may help to transform research within domains outside Virtual Reality. They argue that presence is a phenomenon worthy of study by neuroscientists and may help toward the study of consciousness. As rightly put by Biocca (2003), where do dream states fit in the two pole model of presence (Reality-Virtuality Continuum)? The psychological mechanisms that generate presence in a dream state have to be at least slightly different than psychological mechanisms that generate presence in an immersive, 3D multimodal virtual environment. Dreaming, according to Revonsuo (1995) is an organized simulation of the perceptual world and is comparable to virtual reality. During dreaming, we experience a complex model of the world in which certain types of elements, when compared to waking life, are underrepresented whereas others are over represented (Revonsuo, 2000). According to LaBerge (1998), theories of consciousness that do not account for dreaming must be regarded as incomplete. 
LaBerge adds, “For example, the behaviorist assumption that ‘the brain is stimulated always and only from the outside by a sense organ process’ cannot explain dreams; likewise, for the assumption that consciousness is the direct or exclusive product of sensory input.” It is very clear that one can think, imagine, or dream to create a reality in his mind without the influence of any media whatsoever. This reality evoked endogenously, without the help of an external medium, we would like to call “Self-Evoked Reality” or Self-ER (implying that the reality evoked is initiated internally by the mind itself).

Ground-breaking works by Shepard and Metzler (1971) and Kosslyn (1980, 1983) in the area of Mental Imagery provide empirical evidence of our ability to evoke images or imagine stimuli without actually perceiving them. We know that Perceptual and Psychological Illusion are factors that affect Media-ER and corresponding EP. We believe that Self-ER essentially has Psychological Illusion for which the Perceptual element is generated internally by our mind. By generally overlooking or occasionally completely overriding the external perceptual aspects (sensorimotor cues), our mind endogenously creates the Perceptual Illusion required for the ER. It’s evident in the case of dreaming which according to LaBerge (1998), can be viewed as the special case of perception without the constraints of external sensory input. Rechtschaffen and Buchignani (1992) suggest that the visual appearance of dreams is practically identical with that of the waking world. Moreover, Kosslyn’s (1994, 2005) work shows that there are considerable similarities between the neural mappings for imagined stimuli and perceived stimuli.

Similar to Media-ER, one may feel higher or lower levels of presence in Self-ER, depending on the reality evoked. A person dreaming at night may feel a stronger presence than a person who is daydreaming (perhaps about his first date) through an on-going lecture with higher possibilities of BIRs. According to Ramachandran and Hirstein (1997) we occasionally have a virtual reality simulation-like scenario in the mind (although less vivid and generated from memory representations) in order to make appropriate decisions in the absence of the objects which normally provoke those qualities. However, the vividness, strength, and quality of this internally generated illusion may vary significantly from one person to another. For example, the intuitive “self-projection” phenomenon (Buckner and Carroll, 2007; personal internal mode of mental simulation, as they refer to it) that one undergoes for prospection will certainly differ in experience and qualia from another person. It is a form of Self-ER that may not be as strong or prolonged as a picturesque dream, but strong enough to visualize possible consequences. It is clear that ER is either the result of media or induced internally. This dual (self and media evoking) nature of ER directs us toward a fresh perspective – three poles of reality.

Three Poles of Reality

As we move further into the concept of ER and EP, we would like to define the three poles of reality to be clearer and more objective in the explanations that follow. Reality, as discussed earlier (in subsection Simulated Reality), has always been a term interpreted with multiple meanings and theories. To avoid confusion we would like to use an impartial term – “Primary Reality,” which would refer to the “experience” of the real world (or what we call physical world). It is the spatiotemporal reality in our mind when we are completely present in the real world. It would mean that any reality other than Primary Reality is a conscious experience of illusion of reality (mediated or non-mediated), or more precisely – ER.

Presence and Poles of Reality

Inherited from early telerobotics and telepresence research, the two pole model of presence (Figure 3) suggests that presence shifts back and forth from physical space to virtual space. Research on presence has been dominated ever since by this standard two pole psychological model of presence which therefore requires no further explanation.

Figure 3. The standard two pole model of presence

Biocca (2003) took the study of the presence model one step further. According to the model he proposed, one’s spatial presence shifts between three poles of presence: mental imagery space, the virtual space, and the physical space. In this three pole graphic model, a quasi-triangular space defined by three poles represented the range of possible spatial mental models that are the specific locus of an individual user’s spatial presence. His model of presence attempted to offer a parsimonious explanation for both the changing loci of presence and the mechanisms driving presence shifts. Though the model explained the possibilities of presence shifts and varying levels of presence, it is vague about certain aspects of reality. It did not clarify what happens when we experience an extremely low level of presence (at the center of the model). How or why do we instantly return to our Primary Reality (in this model – Physical Space) as soon as a mediated reality or a DR is disrupted (even though we may have entirely believed ourselves to be present in the reality evoked during a vivid dream)? Moreover, it took into account only the spatial aspects but not the temporal aspects of shifts in presence.

We would like to define three poles of reality from the perspective of ER. The Three Pole Reality Model (Figure 4) may help overcome the theoretical problems associated with presence in the standard two pole model of presence as well as the model proposed by Biocca. In our view, it is the shifts in the type of reality evoked that create respective shifts in the level of presence evoked. For example, if one experiences a highly convincing ER during a virtual reality simulation, he/she would experience an equivalently strong EP until a BIR occurs. The three poles of reality that we define are:

• DR (Threshold of Self-ER)

• Primary Reality (No ER)

• SR (Threshold of Media-ER)

Figure 4. Three pole reality model

Primary reality

Primary reality refers to the reality of our real world. In Primary reality, the experience-evoking stimulation arrives at our sensory organs directly from objects in the real world. We maintain this as an ideal case in which the stimulus corresponds to the actual object and does not deceive or misinform us. For instance, imagine yourself running from a tiger that is chasing you. It’s very near and is about to pounce on you. You scream in fear, and wake up to realize that you are safe in your bed, like every morning. You know for sure that this is the real world and the chasing tiger was just a part of the DR that your mind was in, some time before. So, Primary Reality is our base reality to which we return when we are not in any ER. In other words, when a BIR occurs, we come back to Primary Reality. Thus, as we can see in Figure 5, any point of reality other than Primary Reality is an ER. We could say that it’s this Primary Reality that we rely on for our everyday activities. It is the reality in which we believe we live. Our experiences in this Primary Reality may form the basis for our experiences and expectations in an ER. For example, our understanding of the real world could shape how we experience presence in an immersive virtual reality environment, or even in a Dream. We could suppose that it’s the Primary Reality in which one believes this paper exists, or is being read.

Figure 5. Three poles of reality: evoked reality constantly shifts between them

Simulated reality

In the case of Media-ER, an experience similar to Primary Reality is attempted by interfering with the stimulus field, leading to an illusion of reality. For example, virtual reality uses displays that entirely mediate our visual perception: our head or eye movements are tracked and the display is updated with appropriate images to maintain the illusion of receiving particular visual stimuli from particular objects. SR would be the most compelling and plausible reality that could ever be achieved through such mediations. It would be the reality evoked in our mind under the influence of a perfectly simulated virtual reality system. It’s the ultimate level that virtual reality aims to reach someday. At present, immersive virtual reality systems such as flight simulators are able to create an ER considerably close to this pole. Their effectiveness is evident in the fact that pilots are able to train themselves within the ER created by the simulator, which eventually helps them to pilot a real plane. However, in the hypothetical condition of a perfect SR, our mind would completely believe the reality evoked by the simulation medium, and have no knowledge of the parent Primary Reality (Putnam, 1982; Bostrom, 2003). In this state, it would be necessary to force a BIR to bring our mind back to Primary Reality. A Perfect SR is the Media-ER with the strongest presence evoked and will have no BIRs.

Dream reality

In the case of Self-ER, the external perceptual stimuli are imitated by generating them internally. DR is an ideal mental state in which we almost entirely believe in the reality experienced, and accept what is happening as real. The mind does not return to the Primary Reality unless a BIR occurs. For instance, in the case of our regular dreams, the most common BIR would be “waking up.” Although internally generated, dream states may not be completely divorced from sensorimotor cues. There can be leakage from physical space into the dream state (Biocca, 2003). The EP experienced during a strong Dream can be so powerful that even possible anomalies (causing BIRs) like external noises (an alarm or phone ringing) or physical disturbances (blowing wind, temperature fluctuations) may be merged into the DR, so as to sustain this ER for as long as possible. A Perfect DR is a Self-ER with the strongest presence evoked and will have no BIRs (similar to SR on the media side).

Presence Shifts and Presence Threshold

We are often under the effect of either Media or Self-ER. Imagine that we are not influenced by any mediation, nor any kind of thoughts, mental imagery, or dreams and our mind is absolutely and only conscious about the Primary Reality. In such an exceptional situation we would supposedly feel complete presence in the Primary Reality. Thus we presume that this perfect Primary Reality-Presence (or “real presence” as some may call it) is the threshold of presence one’s mind may be able to experience at a point of time. It is clear that we can experience presence either in Primary Reality or in an ER. We cannot consciously experience presence in two or more realities at the same time, but our mind can shift from one reality to another voluntarily or involuntarily, thus constantly shifting the nature and strength of the presence felt. As pointed out by Garau et al. (2008), presence is not a stable experience and varies temporally. They explain how even BIPs could be of varying intensities. They also use different presence graphs to illustrate how levels of presence shift over time and how subjective the experience is for different participants. Media like virtual reality aim to achieve the Presence Threshold at which one’s mind might completely believe the reality evoked. Though we have not yet achieved it, and may never do so, theoretically it’s possible to reach such a level of SR. Similarly, if one experiences a Perfect Dream without any BIR, he/she would be at this threshold of presence exactly like being in the Primary Reality. SR and DR are the two extreme poles of reality at which the EP is at its threshold. These presence shifts, caused by the shifting of reality between these poles, are something that we seldom apprehend, although we always experience and constantly adapt to them. In the following section we attempt to represent this phenomenon with a schematic model that would help us examine presence and reality from a clearer perspective.

Reality-Presence Map

Based on the three poles of reality and Presence Threshold we would like to propose the Reality-Presence Map (Figure 6). This map is a diagram of the logical relations between the terms herein defined. At any point of time one’s mind would be under the influence of either a Media-ER or a Self-ER when not in the Primary Reality (with no ER at all). Between the poles of reality, ER would constantly shift, evoking a corresponding EP. As we can see in the map, there is always a sub-conscious Parent Reality-Presence corresponding to the EP. This Parent Reality-Presence is very important as it helps our mind to return to the Primary Reality once the illusion of ER discontinues (or a BIR occurs). For a weaker EP, the Parent Reality-Presence is stronger (although experienced sub-consciously). When the ER manages to evoke very strong presence, the strength of Parent Reality-Presence drops very low (almost unconscious) and we start to become unaware of the existence of a Primary Reality, which is what an excellent immersive virtual reality system does. The shifting of presence is closely related to our attention. As soon as our attention to the ER is disrupted (predominantly due to interfering external perceptual elements), our attention shifts to the Parent Reality-Presence, sliding us back to Primary Reality (thus breaking our EP).

Figure 6. Reality-presence map.

At the extreme poles, we would experience an Optimum Virtual Presence in a SR and similarly an Optimum Dream Presence in a DR. At these extreme points one may completely believe in the illusion of reality experienced, almost or exactly as if it were our Primary Reality, without the knowledge of an existing Parent Reality. At such a point, a very strong BIR would possibly have to be forced to bring one back to the parent Primary Reality. Experiencing a strong DR is one such example which many would relate to. During a very compelling but frightening dream, “waking up” acts as a very strong BIR, helping in the desperate attempt to leave the DR. After such a sudden and shocking change in reality, our mind most often takes time to adjust back to the Primary Reality, where everything slowly turns normal and comforting.

Whenever there is an ER, the EP part of the presence (in the map) is what has our primary attention, and thus is the conscious part. Hence, the higher the EP, the less aware we are of our parent reality. Evidence of the sub-conscious Parent Reality-Presence can be observed in our experience of any media that exists today. Many studies have shown that in virtual environments, although the users behaved as if experiencing the real world, at a sub-conscious level they were certain that it was indeed “not” real. BIPs (that are used to measure presence) are in fact triggered by shifts in attention from the virtual world to the real world. For instance, virtual reality systems that visually surround us completely with a virtual environment elevate our presence (compared to a panorama view or television with visible frame boundaries), as our chances of shifting attention toward the real world drastically reduce at such higher levels of immersion (Grau, 2004; Slater, 2009). Since ER is a subjective feeling, it can never be measured or even compared truthfully. This is the reason why we depend on the measurement of EP to determine if a system creates a stronger or weaker ER. Since the strength of presence itself is relative, the best way to measure it is to compare systems in a similar context. “The illusion of presence does not refer to the same qualia across different levels of immersion. The range of actions and responses that are possible are clearly bound to the sensorimotor contingencies set that defines a given level of immersion. It may, however, make sense to compare experience between systems that are in the same immersion equivalent class” (Slater, 2009).

A major task for empirical consciousness research is to find out the mechanisms which bind the experienced world into a coherent whole (Revonsuo, 1995). This map provides a framework where the various experiences of ER could be mapped. Note that this map is not a “graph” that shows the strength of EP as directly proportional to the strength of ER. In fact it would help us represent every possible kind of ER as a point fluctuating between the two extreme poles of reality, with its respective strength of EP. We may refer to ER as stronger or weaker, when its qualia evoke stronger or weaker EP respectively. The Reality-Presence Map shows that if we can skillfully manipulate these qualia of ER (although subjective to each individual), bringing it closer to either of the two extreme poles, we may be able to evoke higher levels of EP. We should also note that, in order to introduce its basic concept, the Reality-Presence Map is presented here in a flattened two-dimensional manner. In the later sections we will illustrate how this map attempts to account for different experiences that previous presence models were unable to explain.

Subjectivity of Evoked Reality

As a matter of fact, the same mediation can create different subjective ER for different users depending on their personal traits. For example, two users reading the same book, or playing the same video game, or using the same Virtual Reality system would experience presence in an entirely different manner. EP (especially evoked by a medium) may be affected by one’s knowledge related to the context, degree of interest, attention, concentration, involvement, engagement, willingness, acceptance, and emotional attributes, making it a very subjective experience. This is precisely why it is difficult to evaluate the efficiency of a particular Virtual Reality system by means of presence questionnaires. In fact, many researchers confuse some of the terms above with the concept of presence.

Therefore, to locate ER on the map, we have to examine “presence.” In fact, finding reliable ways to measure presence has been a pursuit among many virtual reality and communication media researchers. In order to arrive at testable predictions, we would rely on currently evolving measuring and rating systems, so as to determine an objective scale for presence (from Primary Reality to each extreme pole). Presently existing measuring techniques include questionnaires like the “presence questionnaire” (Witmer and Singer, 1998; Usoh et al., 2000), the ITC-SOPI questionnaire (Lessiter et al., 2001), and the SUS questionnaire (Slater et al., 1994, 1995); analysis of BIPs (Slater and Steed, 2000; Brogni et al., 2003); and objective corroborative measures of presence like psycho-physiological measures, neural correlates, behavioral measures, and task performance measures (Van Baren and Ijsselsteijn, 2004), to mention a few. We can certainly predict the positions of different everyday experiences for a person in general (Figure 7); however, these predictions could be tested in the future only by using the above-mentioned methods of measuring presence.

Figure 7. An example range of Media-ER and Self-ER experiences mapped on reality-presence map, for an individual, that would occur at various points in time.

In virtual reality, the distinction between “presence” and “immersion” has previously been made very clear (Slater, 1999, 2003). Though immersion (which is discussed extensively in the domain of virtual reality) is one of the significant aspects of EP, it falls under the technical faculty of a mediated system. “Immersion (in perceptual sense) provides the boundaries within which Place Illusion can occur” (Slater, 2009). Detailed aspects of presence related to immersive virtual reality are also discussed in Slater et al. (2009). Characteristics like involvement, engagement, degree of interest, and emotional response may seem similar to presence, but are in fact different elements that may influence or be influenced by EP. The psychological impact of content, i.e., good and bad, exciting and boring, depends to a large extent on the form in which it is represented (Ijsselsteijn, 2003). Thus one of the most important aspects of Media-ER is its context. In most cases it forms a reference in one’s mind to how they may experience ER and hence the presence evoked. For example, in some contexts, especially in art and entertainment, it would invoke a “genre” that plays a major role in its communication. The context (whether artistic expression, communication, entertainment, medical application, education, or research) should be a core concern while designing a Virtual Reality System, in order to bring about a subjectively higher quality of ER. A descriptive account of the importance of context in Self-ER is given by Baars (1988). With examples of different sources and types (perceptual and conceptual) of contexts, he demonstrates how unconscious contexts shape conscious experience. In addition, he explains the importance of attention, which acts as the control of access to consciousness. Attention (in both Media-ER and Self-ER) can direct the mind toward or away from a potential source of qualia. The experience of an ER therefore depends also on the voluntary and involuntary characteristics of one’s attention.

According to the concept, our presence shifts continuously from one ER to another and does not require passing through Primary Reality to move from one side to another. This map does not provide a temporal scale per se. However, in the future (with advancements in presence measurement techniques), the map could be used to trace presence at different times to study the temporal aspects of presence shifts.

Evoked Reality within Evoked Reality

There is an important question that arises now. How can we account for our thoughts or mental imagery experiences during VR simulations, games, movies, or most importantly books? This is the phenomenon of experiencing Self-ER during a Media-ER experience.

Self-ER within media-ER

Whenever we experience an ER, our mind is capable of temporarily presuming it as the parent reality and reacting accordingly. The better the ER and the stronger the EP, the easier it is for our mind to maintain the illusion. In such states Media-ER is experienced as a temporary form of Primary Reality, and we are able to experience Self-ER within it. In fact that is the core reason why virtual reality systems and virtual environments work. This phenomenon is clearly displayed in experiences where the users need to think, plan, and imagine in order to navigate in the virtual world, just as they would in the real world. Below, it is demonstrated how this phenomenon may be represented with respect to the Reality-Presence Map (Figures 8 and 9). This scenario will ultimately be classified under Media-ER.

Figure 8. An example of how Media-ER would temporarily act as a version of primary reality

Figure 9. An example of presence shift due to Self-ER within Media-ER (e.g., thinking within a virtual environment).

Self-ER triggered during media-ER

“Self-ER within Media-ER” should be distinguished from the phenomenon of “Self-ER triggered during Media-ER.” This is similar to a well-known case of Self-ER – the phenomenon of mind-wandering that temporarily detaches us from the Primary Reality. It is otherwise known as “task unrelated thought,” especially with respect to laboratory conditions. Smallwood et al. (2003) define it as the experience of thoughts directed away from the current situation. It is in fact a part of (and closely related to) our daily life experiences (Smallwood et al., 2004; McVay et al., 2009). Although studies on mind-wandering are principally focused on shifts between Self-ER and tasks relating to Primary Reality (falling under the usual case of Self-ER experience – Figure 10), we propose that they are applicable to similar cases in Media-ER as well. It has been suggested that this involuntary experience may be either a stable or a transient state. That means we can experience a stable EP during mind-wandering or an EP oscillating between the Self-ER, Media-ER, and the Primary Reality.

Figure 10. The usual case of presence shift from primary reality to Self-ER

Therefore, when an unrelated Self-ER is triggered while experiencing a Media-ER (or when Self-ER within Media-ER traverses the presence threshold and one becomes unaware of the Media-ER itself), it should be considered under the case of Self-ER (Figure 11).

Figure 11. An example of presence shift toward Self-ER triggered during Media-ER.


Our attempt was a novel one: to fit together different concepts regarding presence into a single coherent graphical representation. Although this concept of ER and EP along with the proposed map provides us a simplified way to look at reality and presence, it raises plenty of questions. Can the experience of an altered state of consciousness (ASC) like hallucination, delusion, or psychosis due to mental disorders be a kind of Self-ER? Revonsuo et al. (2009) redefine ASC as the state in which consciousness relates itself differently to the world, in a way that involves widespread misrepresentations of the world and/or the self. They suggest that to be in an ASC is to deviate from the natural (world-consciousness) relation in such a way that the world and/or self tend to be misrepresented (as evident in reversible states like dreaming, psychotic episodes, psychedelic drug experiences, epileptic seizures, and hypnosis). According to Ramachandran and Hirstein (1997), we have internal mental simulations in the mind using less vivid perceptual attributes, in the absence of the regular external sensory inputs. If they possessed full-strength perceptual quality, that would become dangerous, leading to hallucinations. They argue that in cases like temporal lobe seizures, this illusion (Self-ER) may become indistinguishable from real sensory input, losing its revocability and generating an incorrect sense of reality (creating a permanent ER situation that makes it difficult to return to Primary Reality). So can hallucinations due to Self-ER be compared to Augmented Reality due to Media-ER?

In contrast to Presence, is there an “Absence” and do we experience that? If so, how? Can it be compared to a dreamless sleep? Can Presence Threshold itself be subjective and differ from person to person? With reference to the Reality-Presence Map, is there a possibility of an experience analogous to the uncanny valley when ER is nearest to the two extreme poles? Is this the reason why many experience anomalies during exceptionally vivid nightmares or lucid dreams? Similarly on the Media-ER side, can simulator sickness due to inconsistencies during virtual reality simulations be compared to this phenomenon? Other than the obvious difference between Media-ER and Self-ER that was discussed before, there is another main distinction between them. In most cases of Media-ER, multiple users could share the experience of a common ER at the same time (naturally, with subjective differences, especially due to psychological illusion). In the case of Self-ER, by contrast, every person’s mind experiences a unique ER. Thus a Dream is typically an individual experience (as far as our present technological advancements and constraints suggest), while SR may be shared.

Furthermore, the Reality-Presence Map helps us investigate potential ideas on Reality, for instance the possibility of Simulation within a Simulation (SWAS). The Map could be extended to and be applicable for any level of reality, in which we believe there’s a Primary Reality – the base reality, to which we return in the absence of any form of ER. Let’s imagine that someday we achieve a perfect SR. As per our proposition, one’s mind would accept it as the Primary Reality as long as the experience of presence continues (or till a “BIR” occurs). It would imply that at such a point, one can experience presence exactly as in the Primary Reality. If, in this perfect SR, one experiences Media-ER (e.g., virtual reality) or Self-ER (e.g., dream), then as soon as a BIR occurs they return to it, since it’s the immediate Parent Reality. Figure 12 attempts to illustrate such a situation with DR and SR as two orthogonal Poles of Reality. Similarly on the Self-ER side, one’s mind could experience a Dream within a Dream (DWAD). When one wakes up from such a dream, he could find himself in the parent DR from which he would have to wake up again into the Primary Reality. Can this be how people experience such false awakenings [a hallucinatory state distinct from waking experience (Green and McCreery, 1994)]? Figure 13 attempts to illustrate such a situation of DWAD.

Figure 12. Simulation within a simulation

Figure 13. Dream within a dream

In fact it makes us curious about even bigger questions. Can there be an ultimate reality beyond Primary Reality or even beyond the scope of this map? The Simulation argument claims that we are almost certainly living in a computer simulation (Bostrom, 2003), in which case what we believe to be our Primary Reality might itself be a SR [similar to the Brains in a vat scenario (Putnam, 1982)]. Metzinger (2009) proposes that our experience of the Primary Reality is deceptive and that we experience only a small fraction of what actually exists out there. He suggests that no such thing as “self” exists and that subjective experience is due to the way our consciousness organizes information about the outside world, forming a knowledge of self in the first person. He claims that everything we experience is in fact a SR and the on-going process of conscious experience is not so much an image of reality as an “ego tunnel” through reality. So, is our Primary Reality in fact the base reality? Or are we always under an ER of some kind? Figure 14 attempts to put together different levels of reality as a Reality Continuum. If this is plausible, how many levels would one be able to go through? Do we already visit them unknowingly through our dreams? Would the levels of reality in the figure be represented as a never ending fractal structure? In any case, will we someday be able to understand all these aspects of our experience of reality?

Figure 14. Reality continuum (illustrating the levels of reality).


In this paper we explored presence and different elements that contribute to it. Presence is not just “being there” but a combination of multiple feelings and most importantly “experiencing the reality.” The two main factors affecting presence due to mediation are Perceptual Illusion and Psychological Illusion. These factors evoke an illusion of reality in our mind in which we feel presence. We are constantly subjected to such illusions of reality, during which we experience presence differently from that of our apparent real world. This illusion of reality is called ER.

Evoked Reality is not just media-evoked but can also be self-evoked. Media-ER may range from the mild effect of a painting to an extremely plausible immersive Virtual Reality experience, while a Self-ER may range from a simple thought to an exceptionally believable DR (the strength of ER may not necessarily be in the same order, as it depends on one’s qualia and personal characteristics). This dual nature of ER led us to define three poles of reality: primary reality – the unaltered and unmediated Real World, SR – the ultimate Media-ER (a perfect Virtual Reality condition) and DR – the ultimate Self-ER (a perfect dream condition). Thus ER is an illusion of reality formed in our mind, which is different from Primary Reality. It’s a combined illusion of space and events, or at least one of them. It is in this ER that one would experience presence. Thus EP is the spatiotemporal experience of an ER.

The proposed Reality-Presence Map attempts to graphically illustrate the concept of ER and EP. This map provides a framework where the various experiences of ER could be mapped. The subjectivity of ER qualia and how these subjective factors affect Media-ER and EP were explained. The idea of Presence Threshold was also explored, which formed the basis for different levels of EP and temporal Presence Shifts. Different possibilities like SWAS and DWAD conditions were discussed with respect to the proposed model. However, certain elements still demand clarification to complete the theory. The concept presented here is a starting point for potential future research. We believe that ER and the proposed Reality-Presence Map could have significant applications in the study of presence and most importantly in exploring the possibilities of what we call “reality.”

The full report including references can be found here

Virtual Realities' Bodily Awareness, Total Immersion & Time Compression Affect

VR and Time Compression- A Great Example of How Deeply Immersion Works

Time flies when you’re having fun. When you find yourself clock-watching in a desperate hope to get something over and done with, it often feels like the hands of the clock are moving like treacle. But when you find yourself really enjoying something,

It’s no surprise at all to hear that this phenomenon is particularly prevalent when it comes to virtual reality. After all, we all know that the more immersive the experience, the more engaging and enjoyable it tends to be. Researchers have in fact given this case of technology warping our sense of time a name: time compression.

We don't only get lost in the concept of time, but we feel the benefits too!

The Marble Game Experiment

Grayson Mullen and Nicolas Davidenko, two psychology professors, conducted an experiment in 2020 to see if there was any measurable scientific evidence for this widely-reported phenomenon. And indeed there was!

They invited 41 undergraduate university students to play a labyrinth-like game, where the player would rotate a maze ball to navigate the marble inside to the target. One sample group played the game via a conventional monitor, while the other played within a virtual reality environment. The participants were asked to stop playing and press a yellow button at the side of the maze once they sensed that five minutes had passed.

With all the responses timed and recorded, the study ultimately found that the students who played the VR version of the labyrinth game pushed the button later than their conventional monitor counterparts, spending around 28.5% more real time playing!

Why does it happen?

We don’t know exactly how VR locks us in a time warp. There’s no denying that video games in general can be extremely addictive for some players. Even conventional games are so easy to get immersed in that you could forget whereabouts in the day you are.

Palmer Luckey, founder of Oculus, thinks it could boil down to the way we rely on the environment around us to sense the passage of time. Here is what he said during an interview at the 2016 Game Developers Conference:

“I think a lot of times we rely on our environments to gain perceptual cues around how much time is passing. It's not just a purely internal thing. So when you're in a different virtual world that lacks those cues, it can be pretty tough...You've lived your whole life knowing roughly where the sun is [and] roughly what happens as the day passes…

In VR, obviously, if you don't have all those cues — because you have the cues of the virtual world — then you're not going to be able to make those estimates nearly as accurately.”

When you play a game on a conventional platform such as a console or a PC, you’ve got other things going on around you to give you a good indication of what the time is, like the sun and the lighting, and any background noises (e.g. the sounds of rush-hour traffic). With virtual reality, you block all this out, so you can’t rely on these to help you tell the time anymore.

What does this mean for immersion & us?

Time compression isn’t just relevant when it comes to enjoying entertainment: we can also use it to help people in other contexts. For example, Susan M Schneider led a clinical trial exploring the possibility of incorporating virtual reality experiences into chemotherapy sessions. This medical procedure can be very stressful for cancer patients, but the results of the trial found clear evidence for the VR simulation reducing anxiety levels and perceived passage of time, acting as a comforting distraction from the chemotherapy.

But despite all these potential benefits, we can’t forget the elephant in the room of gaming addiction. The time-warping effect of virtual reality also sadly means it’s easier for players to spend hour after hour stuck in their virtual world, which sacrifices their health as well as their time! Not only does this increase the risk of motion sickness, but it can also throw off your natural body clock, negatively affecting how well you sleep and thus your overall wellbeing.

It sounds like one step away from the Lotus Casino from Rick Riordan’s Percy Jackson series: a casino where time never stops and nobody ever wants to leave. In their study, Mullen and Davidenko urge game developers not to take a leaf from the Lotus Eaters’ book. While a near-addictive feeling in your audience is a positive sign of a successful immersive application, it shouldn’t be something you exploit to put them at risk.

Here are a couple of recommendations to help players know when it’s time to stop:


Miller, R. (2016). Oculus founder thinks VR may affect your ability to perceive time passing. [online] The Verge. Available at:

Mullen, G. & Davidenko, N. (2021). Time Compression in Virtual Reality. Timing & Time Perception. 9 (4). pp. 377–392.

Schneider, S.M., Kisby, C.K. & Flint, E.P. (2011). Effect of Virtual Reality on Time Perception in Patients Receiving Chemotherapy. Supportive Care in Cancer. 19 (4). pp. 555–564.


Holospatial delivers an alternative method of delivery for any VR or non-VR application through advanced interactive (up to 360-degree) projection displays. Our innovative Portal range includes immersive development environments ready to integrate with any organisational, experiential or experimental requirement. The Holospatial platform is the first ready-to-go projection platform of its type and is designed specifically to enable mass adoption, so that users can access, benefit and evolve from immersive projection technologies and shared immersive rooms.

Nvidia's Immersive Omniverse of Virtual Tools: Making Immersion a Reality and Expanding the Industry

Nvidia has been venturing into the Omniverse for some time now. It's a set of tools for software developers aimed at helping them create a "metaverse" of three-dimensional virtual worlds, and a concurrent set of facilities to help you create and manage your metaverse.

Obviously the metaverse has been hitting the headlines, despite the fact that it only truly exists in your head, but that's a different thought path altogether. The digital metaverse, or omniverse in Nvidia's case, sits right in the heart of Portal territory, with most of these platforms and immersive innovations all working away at the immersive opportunity. However fast organisations realise it, this IS going to be the future. If you look even now at our reliance on digitised 2D information and platforms, it's easy to see that stepping into 3D interfaces is where the digital "verse" (whichever prefix you choose) is really going to crank up.

At the company's annual technology conference, Nvidia released Omniverse Enterprise, which has been on the agenda all year. It will start at $9,000 per year and be sold by partners such as Dell Technologies and Lenovo Group, two significant distributors of Nvidia chips.

Omniverse Application

The Omniverse tools help the various apps used to create three-dimensional worlds work together, building on current technologies, software tools and applications. Without doubt, it's a bold step in the right (and the immersive) direction.

In an interview with Reuters, Richard Kerris, vice president of the Omniverse platform at Nvidia, called it "the plumbing of the virtual worlds. That's what we've built. And that's what we're building with all of these partners."

At present this is a behind-the-scenes type of business. And it's technologies such as Portals which allow a totally different delivery medium for such tech advances. The Omniverse is a step towards a Portal-world.

Kerris told Reuters that Nvidia worked with more than 700 companies to test and develop the software, including firms like telecommunications equipment maker Ericsson, which used the software to create a "digital twin" of a city that it used to test cell phone signal coverage before rolling out physical trucks to install real-world antennas.

Earlier this month, Wells Fargo analyst Aaron Rakers wrote that software and other tools for creating virtual worlds could be a $10 billion market opportunity for Nvidia over the next five years, especially as firms like Meta Platforms Inc (the new/old Facebook) entice people to spend more time in what it calls the metaverse.

Nvidia's stock market value has surged $191 billion since Facebook's capital expenditure announcement on Oct. 25, a two-week gain that is nearly as large as rival Intel's entire market capitalization of $209 billion, and an indicator of just how significant the 3D-verse will eventually be.

The "verse" is the future. More than that, it is the prized asset which the largest and most forward-thinking companies and investors are looking at to add true value to their organisations and processes, all of which ultimately benefits their people and wider audience.

Delicious Data: Will We Ever Taste Our Computer Applications?

Just imagine you are in a restaurant. A virtual restaurant. You feel as if you are really sitting at a table on a chair. You hear the chatter and plate clattering of your nearby diners. You see the posh lighting, finely decorated tables and velvety restaurant wallpaper. You pick up virtual menus and look through what you fancy eating. Maybe you can even smell the aroma of all that delicious food being prepared in the chefs’ kitchen. But then, your meal arrives. It looks convincing and appetising, and you pick up your digital knife and fork to cut it up and try it...but it’s bland. There’s no mouthfeel, no tingling taste experience.

The sense of taste is one which has often been overlooked in the world of technology. As graphic displays, audio speakers and even haptic devices have all advanced over the years, the idea of gustatory devices has been left in the dust. Not because people aren’t interested in making digital taste happen, but rather that a lot more research needs to be done to understand how we can make it a reality. The sense of taste is not quite as straightforward a sense as you may expect...

How does taste work?

Of course, one of the secrets to how we can appreciate the different flavours of what we eat is in the thousands of taste buds on our tongue. Within each taste bud is a cluster of taste receptor cells. There are actually only five taste sensations the tongue can experience: sweetness, saltiness, sourness, bitterness and umami/savouriness, which are triggered by the detection of sugar, sodium chloride, acids, alkaloids and glutamates respectively.

But how is it that we can taste the difference between lemon juice and vinegar if both are sour? That’s where your nose comes into play, as the senses of smell and taste are actually very closely linked. As you eat or drink, the chemicals in what you consume trigger the olfactory receptors in your nasal cavity as well as your taste buds, and it is the combination of these responses that signals flavours to your brain. Next time you eat something tasty, hold your nose while you put it in your mouth and chew...and it’s all but guaranteed you won't have the gustatory explosion you would have felt in your mouth otherwise.

What will taste tech look like?

The world of taste-inducing technology is still in the conception stage, let alone its infancy. Due to the complex, chemical nature of our sense of taste, thinking of ways to simulate human taste sensations artificially and safely has proven difficult. No product or standard for taste-simulating technology currently exists in the market, and it’s currently impossible to break down smells into categories or elements. However, a few researchers have come up with some potential ideas as to how taste can work...

Tokyo University’s Takuji Narumi drew attention to the close link between the senses of smell and taste, and explored how exposure to different visual and olfactory stimuli could affect how we taste. At a computing conference in Canada in 2011, he demonstrated the Meta Cookie system, a head-mounted visual and olfactory display which was worn while eating an unflavoured biscuit. The headset’s display laid an image of a different-coloured biscuit over the original using augmented reality, as a perfume scent travelled through tubes attached to the nose.

In 2012, another team of researchers at the National University of Singapore, led by Nimesha Ranasinghe, explored how technology could tantalise the taste buds directly. They developed an experimental tongue-mounted device which aimed to replicate rudimentary taste sensations via controlled electrical stimulation. The results of this experiment found that the device was most effective at replicating sour taste sensations, and was capable of doing so in three degrees of intensity.

Six years later, another team of researchers proposed a similar taste-actuating interface of their own, this time activating the taste buds by changing temperature. This method could stimulate sweetness much better than any other taste, indicating that different taste sensations may require different strategies.

Where will we see taste tech used?

Imagine a culinary arts training simulation where you can actually learn what flavours to look out for when sampling your cooking. Or a virtual marketing campaign where you can sample beverages without physically having to drink them (want to taste a wine but need to drive back home?). Or maybe a virtual travel experience where you can actually get a taster for another country’s cuisine, all within one of our Portals perhaps!

Integrating the sense of taste into technology is still ages away from being a possibility, but it’s nonetheless exciting to think about the doors it’ll open up for what kind of immersive applications we can create.


Karunanayaka, K., Johari, N., Hariri, S., Camelia, H., Bielawski, K.S. & Cheok, A.D. (2018). New Thermal Taste Actuation Technology for Future Multisensory Virtual Reality and Internet. IEEE Transactions on Visualization and Computer Graphics. 24 (4). pp. 1496–1505.

Narumi, T., Nishizaka, S., Kajinami, T., Tanikawa, T. & Hirose, M. (2011). Augmented reality flavors: Gustatory display based on Edible Marker and cross-modal interaction. Conference on Human Factors in Computing Systems - Proceedings.

Ranasinghe, N., Nakatsu, R., Nii, H. & Gopalakrishnakone, P. (2012). Tongue Mounted Interface for Digitally Actuating the Sense of Taste. In: 2012 16th International Symposium on Wearable Computers. [Online]. June 2012, Newcastle, United Kingdom: IEEE, pp. 80–87. Available from: [Accessed: 1 November 2020].

Ambisonics: An Untapped World Of Full Dome Surround Sound

When we hear a sound in the real world, we can get an idea of where exactly it’s coming from: in front of or behind us, to the left or the right, above or below us, and right up close or far in the distance. When we listen to a piece of music or watch a TV show with our headphones on, however, we don’t get that full three-dimensional experience. Surround sound systems are getting better and better, but even then they can’t give us that exact same sensation.

This is where the Ambisonics audio system comes in: a true 360-degree sound system where listeners can hear impressive sounds from every thinkable angle, in every thinkable direction (including height!). It can take your immersive applications one step higher, and introduce your users to a whole new world of sound...

A brief history

The concept of Ambisonics was devised in the 1970s by a group of British academics, including the mathematician Michael Gerzon. Later on in the decade, Gerzon would go on to partner with fellow engineer Peter Craven to develop the world’s first Soundfield microphone.

Though it took a very long time for the technology to catch on (and even today, it’s still more niche than we think it should be), a few immersive content developers have recently begun taking advantage of ambisonics in their applications.


How does Ambisonic audio work?

A Soundfield microphone typically consists of four subcardioid capsules positioned in what's known as a tetrahedral array. This fancy term actually just means the capsules point in different directions, recording sounds both horizontally and vertically from the microphone. The raw recording signal is often referred to as the ‘A-Format’. 

Soundfield microphones also come with a built-in decoder that converts the A-Format recording into a B-Format one; you can think of the decoder as mapping the audio onto some sort of matrix.
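The conversion described above can be sketched in a few lines. This is a simplified illustration, not the processing any particular microphone performs: the capsule names (front-left-up, front-right-down, back-left-down, back-right-up), the 0.5 scaling and the function name are assumptions made for the example, and real converters also apply frequency-dependent correction filters that are left out here.

```python
def a_to_b_format(flu, frd, bld, bru):
    """Combine one sample from each tetrahedral capsule into four B-Format signals.

    Assumed (hypothetical) capsule layout:
      flu = front-left-up, frd = front-right-down,
      bld = back-left-down, bru = back-right-up.
    The 0.5 gain is an illustrative choice; conventions differ between tools.
    """
    w = 0.5 * (flu + frd + bld + bru)  # omnidirectional sum of all capsules
    x = 0.5 * (flu + frd - bld - bru)  # front minus back
    y = 0.5 * (flu - frd + bld - bru)  # left minus right
    z = 0.5 * (flu - frd - bld + bru)  # up minus down
    return w, x, y, z


# A sound reaching only the two front capsules shows up in W and X only.
print(a_to_b_format(1.0, 1.0, 0.0, 0.0))
```

Each output is just a sum or difference of the capsule signals, which is why the conversion can be thought of as a small matrix applied to the raw recording.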

B-Format audio is actually made up of four separate signals: one omnidirectional component (usually labelled W) and one for each of the three spatial axes (X, Y and Z). Some say it’s a ‘speaker-agnostic’ format, because the decoder can output the signal to a sound format suitable for virtually any speaker setup: mono, stereo, 5.1/7.1 surround sound etc. This makes the Ambisonics recording process an extremely flexible one, where it’s easy to adapt your audio for whatever setup you want to play it out of.
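As a rough illustration of the ‘speaker-agnostic’ idea, the sketch below encodes a single mono sample into four B-Format signals and then decodes it for different speaker layouts by pointing a "virtual microphone" at each speaker. The W, X, Y, Z labels, the 1/√2 weighting on W, the azimuth convention (positive angles to the left) and all function names are assumptions for the example rather than details taken from the article, and real decoders are considerably more refined.

```python
import math


def encode_b_format(sample, azimuth_deg, elevation_deg=0.0):
    """Encode a mono sample arriving from a given direction into (W, X, Y, Z)."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2)                  # omnidirectional part (assumed -3 dB weighting)
    x = sample * math.cos(az) * math.cos(el)   # front/back
    y = sample * math.sin(az) * math.cos(el)   # left/right
    z = sample * math.sin(el)                  # up/down
    return w, x, y, z


def virtual_mic(b_format, azimuth_deg, elevation_deg=0.0, pattern=0.5):
    """Sample the sound field with a virtual microphone aimed at one speaker.

    pattern: 1.0 = omnidirectional, 0.5 = cardioid, 0.0 = figure-of-eight.
    """
    w, x, y, z = b_format
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    directional = (x * math.cos(az) * math.cos(el)
                   + y * math.sin(az) * math.cos(el)
                   + z * math.sin(el))
    return pattern * math.sqrt(2) * w + (1.0 - pattern) * directional


def decode(b_format, speaker_azimuths_deg):
    """Produce one output sample per speaker from the same B-Format input."""
    return [virtual_mic(b_format, az) for az in speaker_azimuths_deg]


source = encode_b_format(1.0, azimuth_deg=90.0)    # a sound directly to the left
stereo = decode(source, [30.0, -30.0])             # left and right speakers
quad = decode(source, [45.0, 135.0, -135.0, -45.0])
mono = virtual_mic(source, 0.0, pattern=1.0)       # omnidirectional mix-down
print(stereo, quad, mono)
```

The same four signals feed every layout; only the list of speaker directions changes, which is the flexibility the paragraph above describes.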

The different axes of the audio recording also mean that the resulting sound can blanket an entire 360-degree space consistently. A typical surround sound setup tends to have more speakers in front of the listener than anywhere else, so it can sound a bit front-heavy (not to mention the back channels tend to be reserved for special effects). This is not an issue for an Ambisonic recording: you could set your sound up to be projected more uniformly, allowing for an even more engaging listening experience!

What about the disadvantages?

Though Ambisonics is a simple way for you to get your audience lost in an amazing soundscape, it’s ultimately not without its disadvantages that are important to keep in mind.

It does everything itself

The Soundfield microphones needed to create Ambisonic audio decode the signal by themselves. You may think that sounds like an advantage, as it can save you a lot of time and effort in setting things up...but it also means the decoding process is out of your hands. Because the decoder is programmed to fine-tune audio frequencies and channels in a particular way for each speaker setup, you’re not given as much control over editing the result yourself as you perhaps would be with traditional surround sound setups.

Large file sizes

Because a B-Format audio recording is much more complex in nature than your standard WAV or MP3 file, you need to allocate more space for it on your hard drive!

Most importantly: we don’t hear enough about it!

When we mentioned earlier that Ambisonics is a niche technology, we really meant it. Outside of academic and specialist circles, it’s almost unheard of, and only now are virtual reality developers beginning to realise its full potential.

As developers of their own Soundfield recording microphones, RØDE have developed a fantastic library of Ambisonics learning resources, which we recommend checking out for more information. They’ve even released a sound library with free-to-use Ambisonics audio ready for your own VR portals!

Accessibility Guidelines for VR Games & Immersive Projection - A Comparison and Synthesis of a Comprehensive Set

Below is a featured report enabling deep understanding of how accessibility can be achieved in gamified content, but the report also considers wider factors for various user levels. Accessibility and inclusion are a critical part of what we do at Portalco, as our environments are all designed, physically and in their interfaces, to enable all users to interact with immersive technology.

This requires multiple aspects to be considered. If you are looking to create or develop content, or start a project for your people, then reports like this are a great place to start to understand pre-existing features that can make a huge difference to your deliverables.

Increasing numbers of gamers worldwide have led to more attention being paid to accessibility in games. Virtual Reality offers new possibilities to enable people to play games but also comes with new challenges for accessibility. Guidelines provide help for developers to avoid barriers and include persons with disabilities in their games. As of today, there are only a few extensive collections of accessibility rules on video games. Especially new technologies like virtual reality are sparsely represented in current guidelines. In this work, we provide an overview of existing guidelines for games and VR applications. We examine the most relevant resources, and form a union set. From this, we derive a comprehensive set of guidelines. This set summarizes the rules that are relevant for accessible VR games. We discuss the state of guidelines and their implication on the development of educational games, and provide suggestions on how to improve the situation.

1 Introduction

In 2020 the number of people who play video games was estimated at 3.1 billion worldwide, which is 40% of the world population (Bankhurst 2020). This shows that video games are not a niche hobby anymore. The game industry has picked up new technologies like Virtual Reality (VR). Thus, VR has been thriving recently, with more and more (standalone) headsets being developed for the consumer market. The current state of the art in VR headsets is dominated by Sony and Facebook’s Oculus, and the market is expected to grow rapidly in the following years (T4, 2021).

1.1 Games and Disability

The rising numbers of gamers worldwide and the technological advances come with new challenges for accessibility. According to an estimate of the World Health Organization (WHO) from 2010, around 15% of the world population has some form of disability (World Health Organization, 2011). This means over a billion people live with physical, mental, or sensory impairments. It is not surprising that an increasing number of these people play or want to play video games but are excluded from it because of barriers they cannot overcome (Yuan et al., 2011). Furthermore, not only people with impairments can profit from accessible games. Situational disabilities like a damaged speaker, loud environment or a broken arm can affect any gamer (Sears et al., 2003; Grammenos et al., 2009; Ellis et al., 2020).

VR comes with new chances to include people with disabilities and make games more accessible. However, it also adds to the accessibility problems that can occur in games. As it is a relatively new technology, new rules and interaction forms still need to be developed.

1.2 Scope and Methodology of the Review

The matter we illuminate in this work is the importance and the need for accessible games in general and VR games in particular. Like others, we come to the conclusion that what is needed is more awareness and a well-formulated set of rules developers can follow. By showing how relevant it is to make accessible games, we want to draw attention to and emphasize what the problem with the current state of accessibility guidelines is. The few accessibility guidelines for games that exist deal little or not at all with special requirements for VR.

Besides the general importance of accessibility due to increasing demand, in most countries educational games including VR games are legally required to be accessible for persons with disabilities. To achieve this, designers and developers need a guide they can understand and follow. However, the existing guidelines are hard for game developers to apply and follow when developing a VR game. This work shows what already exists in this field and explores whether it is sufficient.

We evaluate all noteworthy guidelines for games and VR applications. The result shows how small the number of applicable guidelines is. We then combine the found guidelines into a union set. The challenge is that the different sources often contain the same rules but in different formulations and levels of detail. We also evaluate which of the rules are relevant for VR games in particular and therefore reduce the need for developers to filter relevant guidelines themselves. The union set reveals what rules are missing in the evaluated relevant works and where there is room for improvement. The comparison can help developers to read about guidelines from different sources and give a broader understanding of how to increase accessibility in their games.
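To make the idea of a union set more concrete, here is a minimal sketch of how rules from several sources might be merged when the same rule appears under different wordings. It is not the procedure used in the report: the source names, rule texts, canonical names and the VR-relevance flags below are all hypothetical placeholders, and the mapping from wording to canonical rule is assumed to be curated by hand.

```python
def build_union_set(sources, canonical_names, vr_relevant):
    """Merge rules from several guideline sources into one union set.

    sources:         {source name: [rule wording, ...]}
    canonical_names: {rule wording: canonical rule name}, curated by hand
    vr_relevant:     set of canonical rule names judged relevant for VR
    """
    union = {}
    for source, rules in sources.items():
        for wording in rules:
            # Different wordings of the same rule collapse onto one key.
            key = canonical_names.get(wording, wording)
            entry = union.setdefault(
                key, {"sources": set(), "vr_relevant": key in vr_relevant}
            )
            entry["sources"].add(source)
    return union


def missing_rules(union, source):
    """List the rules in the union set that a given source does not cover."""
    return sorted(rule for rule, entry in union.items()
                  if source not in entry["sources"])


# Hypothetical example data, for illustration only.
sources = {
    "Source A": ["Allow controls to be remapped", "Provide a printable manual"],
    "Source B": ["Let players reconfigure buttons", "Offer a seated play mode"],
}
canonical_names = {
    "Allow controls to be remapped": "remappable controls",
    "Let players reconfigure buttons": "remappable controls",
    "Provide a printable manual": "printable manual",
    "Offer a seated play mode": "seated play mode",
}
vr_relevant = {"remappable controls", "seated play mode"}

union = build_union_set(sources, canonical_names, vr_relevant)
print(sorted(rule for rule, entry in union.items() if entry["vr_relevant"]))
print(missing_rules(union, "Source A"))
```

Keeping the set of contributing sources for every rule is what lets the union set show where an individual guideline has gaps.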

2 Related Works

In this section, we look at 1) the state of accessibility in games in general, 2) the state of accessibility of VR games, and 3) the role of guidelines for making games accessible.

2.1 Accessibility in Games

The accessibility of games is a more complex problem than software or web accessibility in general because they often require a lot of different skills to play (Grammenos et al., 2009). Accessibility problems in video games can affect different parts of a game. The reasons are typically divided into motor disability, sensory disability and cognitive disability (Aguado-Delgado et al., 2018).

Video games are not only a pastime for disabled players, although this is an essential part of being able to play. The benefits of making accessible games are presented by Bierre et al. (2004), Harada et al. (2011), Beeston et al. (2018), Cairns et al. (2019a), and Cairns et al. (2019b). These sources can be summarized into the following list:

• Entertainment and satisfaction: Games are made to be a source of entertainment and distraction.

• Connection and socializing: Playing games gives the chance to participate and feel included.

• Enabling: Playing games can enable impaired people to do things they otherwise cannot do.

• Range: For companies it is important to realize that many people benefit from accessible games.

• Legal issues: Legal requirements for accessibility are increasing, including for games.

• Universal: Everyone can benefit from accessible games.

Developing accessible games has long been seen as a subordinate topic mostly looked at in special interest groups or academia. The majority of accessible games are not mainstream games and/or never leave the state of research. Often accessible games are developed specifically for one particular disability. Making special accessible games can lead to multiple point-solutions that can only be played by a small group of people (Cairns et al., 2019a).

Additionally, many studies concentrate on easy-to-use games with simple gameplay. Most games rely on hand usage to control them and visuals and sound as output. Problems mainly arise when people are not able to receive the output feedback (visual, auditory, haptic) or use the required devices to give input (Yuan et al., 2011; Hamilton, 2018). People with severe visual impairment cannot use assistive features, and accessible input devices often offer only limited possibilities for interaction. This is why games that can be played without visuals or with alternative input devices are often simple and require minimal input or interaction (Yuan et al., 2011).

A reason for poor accessibility could be lacking information in schools for developers or the false assumption that making accessible games is not worth it because the number of people who benefit from it is too small. Complications in developing accessible games can be the individuality of impairments or the necessity to change the game fundamentally to make it accessible. It is difficult to find a good compromise between challenge and accessibility. (Yuan et al., 2011)

These difficulties lead to the most general problem with game accessibility: There are not many accessible games on the market. Examples of accessible games for each type of impairment are surveyed by Yuan et al. (2011), which also demonstrates the mentioned problem of point-solutions. A noteworthy mainstream game that is said to be the most accessible game to date is The Last of Us: Part 2. It has over 60 accessibility features that address visual, hearing and mobility impairments (PlayStation, 2020).

Many websites, organizations, and communities support accessible gamers and raise awareness. Well-known contact points are the International Game Developers Association (IGDA) Game Accessibility Special Interest Group (GASIG) (IGDA GASIG, 2020), the AbleGamers Charity (AbleGamers, 2018b) and the charity SpecialEffect (SpecialEffect, 2021). An organization that specializes in Extended Reality (XR), including VR, is the XR Access Initiative (XR Access, 2020).

2.2 Accessibility in VR and VR Games

The accessibility problems that occur in regular games overlap with the accessibility problems in VR games. VR applications and VR games come with both: ways to bypass the accessibility problems in games and new challenges and barriers that add to them. In VR, there is still little experience on a best practice compared to other domains. There is no conclusion on what approaches are good or not so far (Hamilton, 2018). This also influences already lacking methods for game accessibility (Mott et al., 2019).

Interaction in VR relies mainly on the head, hands and arms, which can be a huge barrier for people with motor impairment. Hamilton (2018), a well-known activist in the accessible games scene, thoroughly researched accessibility for all kinds of impairments in VR. Besides simulation sickness, Photosensitive Epilepsy and situational disabilities like not seeing one’s hands, he emphasized the problems with VR controllers. He summarizes issues that occur for people with motor impairment in VR games such as the presence, strength or range of limbs or the user’s height. VR controllers have developed into using one controller in each hand. They often have an emphasis on motion controls, like mid-air motions, requiring more physical ability than normal controllers or keyboards (W3C, 2020). In many games and applications there are no alternative input methods to using the controller (Mott et al., 2019). Additionally, at the moment, each manufacturer uses their own controllers, each model being different in terms of accessibility (W3C, 2020). Besides hand usage, most VR headsets are heavy and bulky, which requires strength of the neck and shoulders to use. Many games dictate the position in which the player must be. They require upper-body movements or even have to be played standing.

A more obvious barrier is the visual aspect. Apart from full blindness, barriers in VR can also occur for people with low vision, unequal vision in one eye or stereo-blindness (Hamilton, 2018). One issue that occurs only in VR is the difficulty of wearing glasses under the HMD. Another problem is traditional zooming, which can cause simulation sickness and disorientation in VR environments (Chang and Cohen, 2017). Similar problems occur for hearing impairments, such as stereo sound. Subtitles or captions are a special challenge in VR as they cannot simply be put at the bottom of the screen (Hamilton, 2018).

Despite the additional accessibility problems, VR can also help people with disabilities experience things they could not do otherwise, such as horseback riding or driving a car. Contrary to the exclusion people with disabilities might experience in games and the real world, Mott et al. (2019) see VR as a chance for all users to be equal in a game. VR offers new ways for input and output that are not possible with standard devices. Many of these can be realized with the sensors that are already included in current Head-Mounted Displays (HMD).

Most studies on accessible VR concentrate on removing barriers of VR headsets with special applications rather than introducing full games. Therefore there are not many specially made accessible VR games yet. Some games provide accessibility options, but often they address only one specific issue, as demonstrated by Bertiz (2019), who presents a list of some of these games. However, tools like SeeingVR (Zhao et al., 2019) and WalkinVR (2020) make VR applications more accessible in general.

2.3 The Role of Guidelines for Accessible Gaming

Software, in general, is becoming more accessible due to better awareness and legal requirements (Miesenberger et al., 2008). Guidelines are an important tool to support this. In Human-Computer-Interaction (HCI) they help designers and developers to realize their projects while also ensuring a consistent system. Accessibility guidelines are well represented, especially in the web environment.

Different aspects of accessibility are considered in this work: games in general and VR games. The limited range of accessible games and VR games is attributed to a lack of awareness. Grammenos et al. (2009) relate this to the problem of missing guidelines.

Although accessible games gained more awareness in the last few years, there is still a big gap between the regulations for web accessibility and games, which was researched by Westin et al. (2018). They compared the Web Content Accessibility Guidelines (WCAG) (Kirkpatrick et al., 2018) with the Game Accessibility Guidelines (GAG) (Ellis et al., 2020) in the context of web-based games and found that there are many differences. As a conclusion they state that game guidelines would have to be used in conjunction with WCAG for web-based games. Different references, for example Yuan et al. (2011) and Cairns et al. (2019a), draw attention to the lack of good literature, universal guidelines and awareness for accessibility in games. There is no official standard that can be used for games like the WCAG for web applications.

Zhao et al. (2019) found this to be especially true for VR games. It was also noticed by Westin et al. (2018), who emphasize the importance of paying attention to XR in future guideline development. So far, guidelines for games rarely consider VR accessibility and few guidelines are exclusively made for VR applications. Many of them are specialized in one specific impairment or device. The way users interact with VR is hardly comparable with other software, so generalized guidelines cannot be applied (Cairns et al., 2019a).

The success of using guidelines to make a game accessible depends on how good the guidelines are. Some guidelines are very extensive and aim to be a complete guide, while others are summaries or short lists of the most important rules. Many sets of rules try to be as broadly applicable as possible and match a wide variety of use-cases. However, in practice, this makes them harder to understand. It is not easy to make guidelines that are general enough yet can still be transferred by developers to their own scenario (Cairns et al., 2019a). It can also be hard to decide what guidelines are relevant in a specific context and extract them from a big set. Yuan et al. (2011) see this as a problem when guidelines do not explain each rule’s purpose and when they should be applied.

3 Guidelines

In this section, we introduce existing guidelines that are noteworthy for this work. For each set of guidelines a summarized description is provided. They are the most relevant resources we were able to find in the English language. The guidelines were chosen by relevance in the areas of accessible games and accessible VR applications. Most of them contain explanatory texts for each rule, stating what they are for and providing good examples and tools. The relatively small number of found guidelines confirms the concerns of Yuan et al. (2011) and Cairns et al. (2019a).

The EN 301 549 standard is a collection of requirements from the European Telecommunications Standards Institute (2019). It was included in this comparison as it is the relevant work for legal requirements on accessibility. Its goal is to make Information and Communication Technology (ICT) accessible. This includes all kinds of software, such as apps and websites, as well as other electronic devices. As a European standard, these rules have to be followed by the public sector, including schools and universities (Essential Accessibility, 2019). Where applicable, the standard reflects the content of WCAG 2.1, which is why we do not look at WCAG separately in this work. The guidelines have been updated several times since 2015. We use version V3.1.1 from 2019 for our comparison. Because EN 301 549 is a very extensive document that considers any form of information technology, not all chapters are suitable for accessible games or VR. Therefore, the less relevant chapters were omitted, integrated into other guidelines or summarized into one more general rule.

3.1 Guidelines for Games

Many game guidelines build on each other or are mixtures of different sources. The most extensive game guidelines mentioned frequently in relevant literature are Includification and the GAG.

3.1.1 IGDA GASIG White Paper and Top Ten

In 2004, the IGDA GASIG published a white paper (Bierre et al., 2004) that lists 19 game accessibility rules derived from a survey. Later, they condensed these into a top ten list (IGDA GASIG, 2015) that is constantly updated. It boils down to the most important and easy-to-implement rules a developer should follow, providing quick help.

3.1.2 MediaLT

Based on the rules from the IGDA GASIG white paper, MediaLT, a Norwegian company, developed their own guidelines (Yuan et al., 2011). They presented 34 guidelines for “the development of entertaining software for people with multiple learning disabilities” (MediaLT, 2004).

3.1.3 Includification

Includification from the AbleGamers Charity (Barlet and Spohn, 2012) was published in 2012. It is a long paper including an accessibility checklist for PC and console games. Each rule is additionally explained in detail in plain text.

3.1.4 Accessible Player Experience

As a successor to Includification, AbleGamers published a set of patterns on their website called the Accessible Player Experience (APX) in 2018 (AbleGamers, 2018a). They are, in fact, more patterns than guidelines, providing a game example for each accessibility problem.

3.1.5 Game Accessibility Guidelines

The Game Accessibility Guidelines (GAG) (Ellis et al., 2020) were also developed in 2012 and are the best-known and most extensive guidelines for games. They are based on WCAG 2.0 and the internal standards of the British Broadcasting Corporation (BBC) (Westin et al., 2018). The rules are constantly updated. For each guideline, the GAG offer examples of games that implemented the rule and list useful tools to do so.

We used the GAG as the basis for this work because they are the most extensive game guidelines of all considered. At the same time they also fit the game context best and provide easy-to-follow wording.

3.1.6 Xbox

Like many other companies, Microsoft has its own guidelines for products. For games on the Xbox console the Xbox Accessibility Guidelines (XAG) provide best practices (Microsoft, 2019). These guidelines are based on the GAG and also include some references to the APX.

3.2 Guidelines for VR

As before, we make no distinction between VR games and other VR applications. Only three sources that list measures for better accessibility in VR in the form of guidelines were found.

3.2.1 XR Accessibility User Requirements

The XR Accessibility User Requirements (XAUR) are a set of guidelines published by the Accessible Platform Architectures (APA) Working Group of the World Wide Web Consortium (W3C) in 2020. They contain definitions and challenges as well as a list of 18 user needs and requirements for accessible XR applications (including VR). The current version is a Working Draft as of September 16, 2020 (W3C, 2020).

3.2.2 Oculus VRCs: Accessibility Requirements

The Virtual Reality Checks (VRCs) from the Oculus developer portal are requirements a developer must or should follow to publish their app on Oculus devices. These VRCs have recently (in 2021) been extended by the section “Accessibility Requirements”, providing recommendations to make VR apps more accessible (Oculus, 2020).

3.2.3 The University of Melbourne

On its website, the University of Melbourne provides an overview of “Accessibility of Virtual Reality Environments” (Normand, 2019). The main content is the pros and cons of VR for people with different types of disabilities. For each type, they provide a list which also includes use cases that can be seen as guidelines.

4 Synthesis of Guidelines

We used the previously introduced sources to derive a comprehensive set of guidelines that includes all rules that are relevant for accessible VR games. Inspired by the procedure proposed by the GAG, we took the following steps to achieve this.

1) All guidelines mentioned above were evaluated and filtered by what is directly relevant for VR environments and games.

2) The remaining rules were compared to each other and the union set was formed. Similar guidelines were summarized and the formulations slightly changed or enhanced.

3) The result is a set of guidelines that combine and summarize all rules for accessible VR games found in the existing sources.
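
The three steps above can be sketched in Python. This is only an illustration of the filter, union and merge logic, not the procedure the authors actually ran: the `Rule` class, the `topic` key used to detect similar rules, the `relevant` flag and the sample entries are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    text: str
    source: str            # guideline set the wording comes from, e.g. "GAG"
    topic: str             # hypothetical key used to recognize similar rules
    relevant: bool = True  # step 1: relevant for VR environments and games?


def synthesize(rules, basis="GAG"):
    """Filter, merge and summarize rules from several guideline sets."""
    # Step 1: keep only the rules that are relevant for VR games.
    kept = [r for r in rules if r.relevant]

    # Step 2: form the union and group similar rules under one topic key.
    merged = {}
    for rule in kept:
        merged.setdefault(rule.topic, []).append(rule)

    # Step 3: one entry per topic, preferring the wording of the basis set.
    result = []
    for group in merged.values():
        main = next((r for r in group if r.source == basis), group[0])
        result.append({
            "text": main.text,
            "wording_from": main.source,
            "also_in": sorted({r.source for r in group} - {main.source}),
        })
    return result


# Placeholder data for illustration only; not the real comparison table.
rules = [
    Rule("Allow controls to be remapped/reconfigured", "GAG", "remapping"),
    Rule("Placeholder wording for a similar rule", "XAG", "remapping"),
    Rule("Provide accessible customer support", "XAG", "support"),
    Rule("Placeholder for a rule outside the VR/game scope",
         "EN 301 549", "out-of-scope", relevant=False),
]
combined = synthesize(rules)
```

Each resulting entry records which source supplied the wording and which other sets contain a similar rule, mirroring the note that a rule attributed to one source may also appear elsewhere.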

All guidelines found are shown as a list below. To avoid duplicate entries, this set is sorted by topic, not by impairment or importance. This classification does not imply that a rule cannot be relevant for other categories. The main source of the wording is given in parentheses. Because the GAG were used as a basis, most formulations were adopted from them. This does not mean that those rules are not included in other guidelines. To provide good readability and indicate the source of the text at the same time, the guidelines are color coded as follows:

• Black text in normal font type: Text written in black was taken as is from the original source, which is given in parentheses after each rule. This does not mean that the rule appears only in this particular set. It merely marks where the formulation was taken from.

• Orange text in italic font type: Text written in orange marks where the original formulation from the source in parentheses was changed or extended. This could be because wording from another source was added or because the wording was adapted to be clearer.

The full comparison table is available as supplementary material to this paper.

Input and Controls

• Allow controls to be remapped/reconfigured; avoid pinching, twisting or tight grasp to be required (GAG)

• Provide very simple control schemes. Ensure compatibility with assistive technology devices, such as switch or eye tracking (GAG)

• Ensure that all areas of the user interface can be accessed using the same input method […] (GAG)

• Include an option to adjust the sensitivity of controls (GAG)

• Support more than one input device simultaneously, include special devices (GAG)

• Ensure that multiple simultaneous actions (eg. click/drag or swipe) and holding down buttons are not required […] (GAG)

• Ensure that all key actions can be carried out with a keyboard and/or by digital controls (pad/keys/presses) […] (GAG)

• Avoid repeated inputs (button-mashing/quick time events) (GAG)

• Include a cool-down period (post acceptance delay) of 0.5 s between inputs (GAG)

• Include toggle/slider for any haptics (e.g., controller rumble) (GAG)

• Provide a macro system. If shortcuts are used they can be turned off or remapped (GAG)

• Make interactive elements that require accuracy […] stationary or prevent using them (GAG)

• Make sure on-screen keyboard functions properly (Includification)
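
As an illustration of the cool-down rule in the list above, the following Python sketch ignores any input that arrives less than 0.5 s after the last accepted one. The class name, the timestamp-based interface and the sample press times are assumptions made for this example; a real game would hook this into its engine's input system.

```python
class InputFilter:
    """Ignore inputs that arrive within a cool-down window after an accepted one."""

    def __init__(self, cooldown_s=0.5):
        self.cooldown_s = cooldown_s
        self._last_accepted = None  # timestamp of the last accepted input

    def accept(self, timestamp_s):
        """Return True if the input at timestamp_s (in seconds) should be processed."""
        if (self._last_accepted is not None
                and timestamp_s - self._last_accepted < self.cooldown_s):
            return False  # still inside the cool-down period, drop the input
        self._last_accepted = timestamp_s
        return True


# Hypothetical button presses, e.g. from a user with a tremor.
presses = [0.00, 0.10, 0.45, 0.60, 1.20]
input_filter = InputFilter(cooldown_s=0.5)
accepted = [t for t in presses if input_filter.accept(t)]
```

Measuring the window from the last accepted input, rather than from the last raw input, keeps a steady stream of unintended presses from blocking input indefinitely. Exposing `cooldown_s` as a setting would also fit the rule on adjustable control sensitivity.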

Audio and Speech

• Provide separate volume controls and stop/pause or mutes for effects, speech and background sound/music (independent from the overall system) (GAG)

• Ensure no essential information is conveyed by sounds alone (GAG)

• Use distinct sound/music design for all objects and events (GAG)

• Use surround sound (GAG)

• Provide a stereo/mono toggle and adjustment of balance of audio channels (GAG)

• Avoid or keep background noise to minimum during speech (GAG)

• Provide a pingable sonar-style audio map (GAG)

• Provide a voiced GPS (GAG)

• Simulate binaural recording (GAG)

• Provide an audio description track (GAG)

• Allow for alternative Sound Files (IGDA White Paper)

• Provide pre-recorded voiceovers and screenreader support for all text, including menus and installers (GAG)

• Masked characters or private data are not read aloud without the user’s permission (EN 301 549)

• The purpose of each input field collecting information about the user is presented in an audio form (EN 301 549)

• […] Speech output shall be in the same human language as the displayed content […] (EN 301 549)

• Ensure that speech input is not required […] (GAG)

• Base speech recognition on individual words from a small vocabulary (eg. “yes” “no” “open”) instead of long phrases or multi-syllable words (GAG)

• Base speech recognition on hitting a volume threshold (eg. 50%) instead of words (GAG)
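
The last rule in the list above replaces word recognition with a simple loudness trigger. A minimal Python sketch follows, assuming audio frames arrive as samples normalized to the range [-1.0, 1.0] and interpreting "50%" as an RMS level of 0.5 of full scale. That interpretation, the function names and the sample frames are assumptions, not part of the GAG.

```python
import math


def rms_level(samples):
    """Root-mean-square level of samples normalized to [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_triggered(samples, threshold=0.5):
    """True when the frame is loud enough to count as a voice command."""
    return rms_level(samples) >= threshold


# Hypothetical audio frames: background noise and a loud vocalization.
quiet_frame = [0.05, -0.04, 0.06, -0.05]
loud_frame = [0.8, -0.7, 0.75, -0.8]
```

Because the trigger depends only on loudness, it can be used by players whose speech is not recognized reliably. The threshold would need to be adjustable to account for different microphones and environments.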

Look and Design

• Ensure interactive elements/virtual controls are large and well spaced […] (GAG)

• Use an easily readable default font size and/or allow the text to be adjusted. Use simple clear text formatting. (GAG)

• Ensure no essential information is conveyed by text (or visuals) alone, reinforce with symbols, speech/audio or tactile (GAG)

• Ensure no essential information is conveyed by a colour alone (GAG)

• Provide high contrast between text/UI and background (at least 4.5:1) (GAG)

• UI Components and Graphical Objects have a contrast ratio of at least 3:1 or provide an option to adjust contrast (GAG)

• Provide a choice of […] colour […] (GAG)

• Allow interfaces to be rearranged (GAG)

• Allow interfaces to be resized (GAG)

• Provide a choice of cursor/crosshair colours/designs and adjustable speed and size (GAG)

• Instructions provided for understanding and operating content do not rely solely on sensory characteristics of components such as shape, color, size, visual location, orientation, or sound (original from WCAG 1.3.3) (EN 301 549)

• No 3D Graphics Mode (IGDA White Paper)

• Indicate focus on (UI) elements (XAG)

• Enable people to edit their display settings such as brightness, include screen magnification (VRC)

• Provide an option to turn off/hide background movement or animation. Moving, blinking or auto-update can be turned off or paused (GAG)

• Headings, Labels and Links describe their topic or purpose in their text. If they are labeled, the label contains their text (EN 301 549)
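
The contrast thresholds in the list above (4.5:1 for text, 3:1 for UI components and graphical objects) can be checked programmatically. The Python sketch below assumes 8-bit sRGB colours and uses the relative luminance and contrast ratio definitions from WCAG 2.1; the function names are illustrative only.

```python
def _linearize(channel_8bit):
    """Convert one 8-bit sRGB channel to its linear value (WCAG 2.1 definition)."""
    c = channel_8bit / 255.0
    if c <= 0.03928:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (R, G, B) colour with 8-bit channels."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(rgb_a, rgb_b):
    """Contrast ratio between two colours, from 1.0 (equal) to 21.0 (black/white)."""
    lum_a = relative_luminance(rgb_a)
    lum_b = relative_luminance(rgb_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


def meets_text_contrast(foreground, background):
    """Check the 4.5:1 threshold for text/UI against its background."""
    return contrast_ratio(foreground, background) >= 4.5


def meets_component_contrast(foreground, background):
    """Check the 3:1 threshold for UI components and graphical objects."""
    return contrast_ratio(foreground, background) >= 3.0
```

For example, `meets_text_contrast((118, 118, 118), (255, 255, 255))` passes for mid-grey text on white, while the slightly lighter grey `(119, 119, 119)` falls just below 4.5:1. In VR, the effective contrast also depends on the scene behind a floating UI element, so a check like this only covers elements with a known background colour.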


Subtitles/Captions

• Provide subtitles for all important speech and supplementary speech. (Provide a spoken output of the available captions) (GAG)

• If any subtitles/captions are used, present them in a clear, easy to read way and/or allow their presentation to be customised (GAG)

• Ensure that subtitles/captions are cut down to and presented at an appropriate words-per-minute for the target age-group (GAG)

• Ensure subtitles/captions are or can be turned on with standard controls before any sound is played (GAG)

• Provide captions or visuals for significant background sounds. Ensure that all important supplementary information conveyed by audio is replicated in text/visuals (GAG)

• Provide a visual indication of who is currently speaking (GAG)

• Captions and Audio Description have to be synchronized with the audio (EN 301 549)
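
The words-per-minute rule in the list above can be checked automatically for each caption cue. The following Python sketch assumes cues are given as (text, start, end) tuples with times in seconds. The limit of 160 words per minute and the sample cues are placeholders; the list above does not specify a value, and an appropriate rate depends on the target age group.

```python
def words_per_minute(text, start_s, end_s):
    """Presentation rate of one caption cue in words per minute."""
    duration_s = end_s - start_s
    if duration_s <= 0:
        raise ValueError("cue must end after it starts")
    return len(text.split()) * 60.0 / duration_s


def cues_too_fast(cues, max_wpm):
    """Return the cues whose presentation rate exceeds max_wpm."""
    return [cue for cue in cues if words_per_minute(*cue) > max_wpm]


# Hypothetical caption cues: (text, start time, end time).
cues = [
    ("Guide the ball to the goal", 0.0, 3.0),
    ("Tilt the maze carefully to avoid every hole in the floor", 3.0, 5.0),
]
MAX_WPM = 160  # placeholder limit, not taken from any guideline
flagged = cues_too_fast(cues, MAX_WPM)
```

Cues that are flagged would need to be shortened or displayed for longer.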


Simplicity

• Use simple clear language. Employ a simple, clear narrative structure. (GAG)

• Include tutorials (GAG)

• Include a means of practicing without failure […] (GAG)

• Include contextual in-game help/guidance/tips (GAG)

• Include assist modes such as auto-aim and assisted steering (GAG)

• Indicate/allow reminder of current objectives during gameplay (GAG)

• Indicate/allow reminder of controls during gameplay (GAG)

• Offer a means to bypass gameplay elements […] and/or give direct access to individual activities/challenges and secret areas (GAG)

• Allow the game to be started without the need to navigate through multiple levels of menus (GAG)

• Offer a wide choice of difficulty levels. Allow them to be altered during gameplay, either through settings or adaptive difficulty (GAG)

• Include an option to adjust the game speed and/or change or extend time limits (GAG)

• Allow players to progress through text prompts at their own pace (GAG)

• Allow all narrative and instructions to be paused and replayed, care for automatic interruption. (GAG)

• Give a clear indication on important or interactive elements and words (GAG)

• Provide an option to turn off/hide all non interactive elements (GAG)

• Players can confirm or reverse choices they have made […] (APX)


VR

• Avoid (or provide an option to disable) VR simulation sickness triggers (GAG)

• Allow for varied body types in VR, all input must be within reach of all users (GAG)

• Do not rely on motion tracking and the rotation of the head or specific body types (GAG)

• If the game uses field of view, set an appropriate default or allow a means for it to be adjusted (GAG)

• Avoid placing essential temporary information outside the player’s eye-line (GAG)

• Ensure the user can reset and calibrate their focus, zoom and orientation/view in a device independent way (XAUR)

• Applications should support multiple locomotion styles (VRC)

• Provide an option to select a dominant hand (VRC)


Multiplayer

• Support voice chat as well as text chat for multiplayer games (GAG)

• Provide visual means of communicating in multiplayer (GAG)

• Allow a preference to be set for playing online multiplayer with players who will only play with/are willing to play without voice chat (GAG)

• Allow a preference to be set for playing online multiplayer with/without others who are using accessibility features that could give a competitive advantage (GAG)

• Use symbol-based chat (smileys etc) (GAG)

• Realtime text - speech transcription (GAG)


Others

• Allow gameplay to be fine-tuned by exposing as many variables as possible (GAG)

• Avoid flickering images and repetitive patterns to prevent seizures and physical reactions (GAG)

• Provide an option to disable blood and gore, strong emotional content or surprises (GAG)

• Avoid any sudden unexpected movement or events as well as a change of context (GAG)

• Provide signing (GAG)

• Include some people with impairments amongst play-testing participants and solicit feedback. Include every relevant category of impairment […], in representative numbers based on age/demographic of target audience (GAG)

• Provide accessible customer support (XAG)

• If software can be navigated sequentially, the order is logical (EN 301 549)

• Provide details of accessibility features in-game and/or as accessible documentation, on packaging or website. Activating accessibility features has to be accessible (GAG)

• Ensure that all settings are saved/remembered (manual and autosave). Provide thumbnails and different profiles (GAG)

• Do not make precise timing essential to gameplay […] (GAG)

• Allow easy orientation to/movement along compass points (GAG)

• Where possible software shall use the settings (color, contrast, font) of the platform and native screen readers or voice assistance (XAUR)

• Ensure that critical messaging, or alerts have priority roles that can be understood and flagged to assistive technologies, without moving focus (XAUR)

• Allow the user to set a “safe place” - quick key, shortcut or macro and a time limit with a clear start and stop (XAUR)

• Locking or toggle control status can be determined without visual, sound or haptic only (EN 301 549)

• Using closed functionality shall not require the user to attach, connect or install assistive technology (EN 301 549)

5 Discussion and Final Remarks

The rapidly growing market of video games and VR headsets indicates an increase in the number of people who play games.

In this work, we address the opportunities and problems for people with disabilities regarding games and VR applications. Our comparison of existing game and VR guidelines provides a broader understanding of existing guidelines from various sources. It can also help the authors of the guidelines to improve them in the future as they see what might be missing. Furthermore, we hope this work can help raise awareness, especially for accessible VR games.

The comparison showed that none of the presented guidelines is an exhaustive list. We found that there are some important rules missing in the relevant works that are included in other guidelines. However, most rules are covered by either the Game Accessibility Guidelines or the EN 301 549 standard. Among game guidelines, only the GAG and Xbox Accessibility Guidelines include rules that are specific to VR. As can be seen, the guidelines from MediaLT (2004) and the Top Ten from IGDA GASIG (2015) do not add any rules to the set that are not included in other guidelines.

It should be noted that our resulting set of guidelines is based on literature research only, and that we have not conducted empirical research with users to identify possible omissions of accessibility requirements in existing guidelines. Therefore, the “comprehensive” set of guidelines that we present in this paper may need to be further extended in the future to address accessibility needs that have yet to be identified in the field.

We noticed that there are a few guidelines in the EN 301 549 standard that do not occur in the GAG. On the other hand, there are some rules that are missing in the European standard or are not stated with sufficient specificity. We conclude that the legal requirements are currently not sufficient to cover the full range of accessibility needs of the users. Therefore, we suggest that missing guidelines should be added to the European standard.

Another problem with the European standard is its structure and wording. During the evaluation it became apparent that the standard is very hard to read and understand. Rules that are not linked to WCAG can be interpreted in different ways and no examples are given. We fear that the EN 301 549 may not be suitable as a guide to be used by developers directly. A possible approach would be to translate the standard into an alternative version of more applicable rules with practical examples. Also, a tutorial should be provided that shows in detail how each criterion is applied to VR applications.

A last remark on the European standard relates to the fact that it does not include a table that lists all criteria that are legally required for VR applications. Such tables are given for Web and mobile applications. Therefore, it is currently unclear which criteria are enforced by the European Commission for VR applications in public institutions, as opposed to criteria that are “nice to have”.

The overall conclusion from working with the available guidelines is that there is room for improvement in all of them, and that the most relevant guidelines should consider including rules that are specific to VR.

A comprehensive and widely acknowledged set of accessibility guidelines for VR games is needed in the future, comparable to the Web Content Accessibility Guidelines for Web applications. The guidelines we presented in this paper can be a starting point for this. However, we use the wording of the original sources and there are no explanations or examples included. To serve as a good checklist for developers to follow, a much more detailed description of each guideline would be necessary. Also, a companion tutorial would be useful to provide support for VR game developers who are new to the field of accessibility.

As mentioned, not only are accessibility guidelines underrepresented, but there is also a lack of available tools for developers. Many of the approaches to avoid accessibility problems in games could be supported by suitable libraries and automatic checking tools. This would take some of the burden away from developers and make development much easier and faster while ensuring a consistently high level of accessibility. Eventually, the employment of suitable platforms and libraries should ensure a decent level of accessibility, and the majority of guidelines could be automatically checked and hints for fixes provided by development tools.

Author Contributions

This manuscript was written by FH with corrections and minor adaptations made by GZ. The research work was conducted by FH and supervised by GZ and PM. All authors have read and approved the manuscript before submission.

Time Compression Effect In Virtual Reality vs Conventional Monitor

The following report was authored by Grayson Mullen and Nicolas Davidenko.

Time & Time Perception: Abstract

Virtual-reality (VR) users and developers have informally reported that time seems to pass more quickly while playing games in VR. We refer to this phenomenon as time compression: a longer real duration is compressed into a shorter perceived experience. To investigate this effect, we created two versions of a labyrinth-like game. The versions are identical in their content and mode of control but differ in their display type: one was designed to be played in VR, and the other on a conventional monitor (CM). Participants were asked to estimate time prospectively using an interval production method. Participants played each version of the game for a perceived five-minute interval, and the actual durations of the intervals they produced were compared between display conditions. We found that in the first block, participants in the VR condition played for an average of 72.6 more seconds than participants in the CM condition before feeling that five minutes had passed. This amounts to perceived five-minute intervals in VR containing 28.5% more actual time than perceived five-minute intervals in CM. However, the effect appeared to be reversed in the second block when participants switched display conditions, suggesting large novelty and anchoring effects, and demonstrating the importance of using between-subjects designs in interval production experiments. Overall, our results suggest that VR displays do produce a significant time compression effect. We discuss a VR-induced reduction in bodily awareness as a potential explanation for how this effect is mediated and outline some implications and suggestions for follow-up experiments.
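
The two first-block figures reported above (72.6 s and 28.5%) jointly imply approximate mean interval durations for each display condition. The Python sketch below back-calculates them. The resulting means are derived here from the rounded values in the abstract and are not quoted from the paper, so they should be read as approximations.

```python
# Reported in the abstract (first block only).
difference_s = 72.6        # VR mean minus CM mean, in seconds
relative_increase = 0.285  # VR intervals contain 28.5% more actual time

# If VR = CM + difference and VR = CM * (1 + relative_increase),
# then CM = difference / relative_increase.
cm_mean_s = difference_s / relative_increase
vr_mean_s = cm_mean_s + difference_s

print(f"Implied CM mean: {cm_mean_s:.1f} s")
print(f"Implied VR mean: {vr_mean_s:.1f} s")
```

Under these assumptions, the implied means are roughly 255 s for the conventional monitor and 327 s for VR, against a target interval of 300 s.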

Keywords: Virtual reality, bodily awareness, interoception, time compression, prospective time estimation, presence, immersion

1. Introduction

Virtual-reality (VR) head-mounted displays (HMDs) take up the user’s entire field of view, replacing all of their real-world visual cues with a contrived virtual world. This imposes unique conditions on human vision and on all other brain functions that make use of visual information. The consequences have mostly been studied in terms of presence, or the feeling of being inside the virtual scene presented on the HMD rather than in the real world (see Heeter, 1992 for a more encompassing and widely used definition of presence). Because the virtual scene can be designed to look like anything, VR can produce unique psychological effects by placing users in situations that rarely (or never) occur naturally. For example, it can present visual stimuli that conflict with the users’ vestibular cues, causing cybersickness (Davis et al., 2014). VR experiences have also been intentionally used to reduce pain in burn patients (Hoffman et al., 2011), to elicit anxiety or relaxation (Riva et al., 2007), and even to affect self-esteem and paranoia by manipulating the height of the user’s perspective relative to the virtual scene (Freeman et al., 2014).

One unintentional effect, which has been anecdotally reported by VR users and developers, is a time compression phenomenon wherein a larger real duration is compressed into a shorter perceived experience. At a 2016 gaming conference, Hilmar Veigar (of CCP Games) said, “You think you’ve played for 15 minutes and then you go out and it’s like, ‘Wait, I spent an hour in there?’ There’s a concept of, I don’t know, VR time” (Miller, 2016). Palmer Luckey (founder of Oculus) suggested that the effect could be a result of not having access to real-world environmental cues, like the position of the sun. Distorted time perception has been observed as an effect of conventional gaming (Nuyens et al., 2020), but the influence of VR on time perception has been studied relatively less.

One notable study (Schneider et al., 2011) successfully used VR experiences to shorten perceived durations during chemotherapy and found individual differences in time compression effects related to diagnosis, gender and anxiety. It is not clear, though, whether a non-VR version of the same experience would have resulted in a similar distortion of time perception. Only a few studies have directly compared time estimation processes between a VR experience and a non-VR counterpart, and none so far have found significant differences.

Bansal et al. (2019) examined the influence of a novel modification of a VR game (which coupled the flow of time to the speed of players’ body movements) on participants’ performance on subsequent time estimation tasks. Compared to control groups, participants who played the modified game made shorter estimates of brief (6 s and shorter) intervals, but only on estimation tasks that involved continuous movement. No significant difference in time perception was found between participants who played an unmodified (normal-time) version of the VR game and those who played a non-VR game. These results indicate that VR alone may not recalibrate temporal perception, but that a specifically tailored VR experience may induce such an effect. Because all the time estimation tasks were performed outside of VR, these results do not provide an answer to the question of whether time perception is distorted during VR use.

Schatzschneider et al. (2016) investigated how time estimation was affected by display type (VR/non-VR) and cognitive load. The researchers found no significant difference in time estimation between the display conditions, but the study used a within-subjects design and all participants experienced the non-VR condition first and the VR condition second. Completing the non-VR condition first may have anchored participants’ time estimates in the subsequent VR condition. Thus, it is possible that the lack of counterbalancing in Schatzschneider et al. (2016) may have obscured an effect of display type. Another study (van der Ham et al., 2019) also found no difference in time estimates between VR and non-VR displays, but used a retrospective time estimation paradigm.

According to Block and Zakay (1997), retrospective and prospective time estimates depend on different processes. Retrospective estimates are made when participants are unaware that they will be asked to estimate a duration until after the interval has ended. These estimates are based only on information that is stored in memory. Factors that have been found to affect retrospective time estimates are mostly related to stimulus complexity and contextual changes (more complex information and more changes are associated with longer retrospective estimates). Because they rely on memory, retrospective time estimates are affected by cognitive load only indirectly, when information relevant to cognitive load is stored in memory.

In contrast, prospective estimates are made by participants who are aware during the interval that they will be asked to estimate its duration. The most prominent model to illustrate the processes underlying prospective time estimation is Zakay and Block’s (1995) attentional-gate model of prospective time estimation (but see also Grondin, 2010; Ivry & Schlerf, 2008; and Wittmann, 2009 for reviews of alternate models of time perception). The first component of this abstract model is a pacemaker (which can be thought of as an internal metronome) that generates pulses at a rate that scales with the estimator’s arousal. Before the pulses can be counted, they are modulated by an attentional gate, which is open to a variable degree depending on the amount of attentional resources allocated to tracking time. When attentional resources are consumed by a demanding task, the gate becomes narrower (i.e., fewer resources are available to attend to time), and fewer pulses are able to pass.

The pulses that pass the attentional gate are counted by an accumulator, and the resulting sum is used as the basis for an estimate of the interval’s duration. The larger the count, the more time the estimator reports has passed. This means that time seems to pass more quickly (i.e., it becomes compressed) when attentional demands are high, and it seems to pass more slowly (i.e., it dilates) when attentional demands are low. The attentional-gate model is supported by the preponderance of attention-related manipulations that have been found to significantly affect prospective estimates, but not retrospective estimates (Block & Zakay, 1997). Thus, whereas prospective estimates are affected by cognitive load, retrospective estimates are more affected by contextual changes and other memory-related factors.
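
The pacemaker, gate and accumulator described above can be illustrated with a small deterministic Python sketch. It is a toy rendering of the attentional-gate idea, not the model's formal specification: the pulse rate, the representation of the gate as a fraction between 0 and 1, and the assumption that a target duration corresponds to a fixed pulse count are all simplifications introduced here.

```python
def accumulated_pulses(duration_s, pacemaker_hz, gate_openness):
    """Expected number of pulses reaching the accumulator during an interval."""
    if not 0.0 < gate_openness <= 1.0:
        raise ValueError("gate_openness must be in (0, 1]")
    return duration_s * pacemaker_hz * gate_openness


def produced_interval(target_s, pacemaker_hz, gate_openness,
                      reference_openness=1.0):
    """Real time that passes until the accumulator reaches the count
    the estimator associates with target_s (interval production)."""
    # Pulse count the estimator has learned to associate with the target.
    target_count = target_s * pacemaker_hz * reference_openness
    # Real seconds needed to collect that many pulses through the current gate.
    return target_count / (pacemaker_hz * gate_openness)


# Hypothetical parameter values, chosen only to show the direction of the effect.
low_demand = produced_interval(300.0, pacemaker_hz=1.0, gate_openness=1.0)
high_demand = produced_interval(300.0, pacemaker_hz=1.0, gate_openness=0.8)
```

With a narrower gate, fewer pulses pass per second, so more real time elapses before the estimator feels that the target duration is over. In an interval production task, time compression therefore shows up as longer produced intervals.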

The current study is the first to investigate the effect of VR HMDs on time perception using a prospective time estimation paradigm and counterbalanced display conditions. We chose a prospective time estimation paradigm in order to measure the experience of VR rather than the memory of it (Block & Zakay, 1997), and also to obtain results that are relevant to intentional time management while playing VR games. We also used an interval production method of time estimation (Zakay, 1993), in which the research assistant specifies a duration (five minutes, in our case) and the participant starts and ends an interval that they feel matches that duration. This method is less susceptible to rounding biases than methods that ask the participant to report the number of seconds or minutes an interval lasted. In our study, every participant attempts to produce a five-minute interval, and we use the actual durations of the intervals they produce as our main dependent variable.

1.1. Hypotheses

First, we predict that intervals produced while playing a VR game will be longer than those produced while playing an equivalent game displayed on a conventional monitor (CM). This hypothesis is based on the anecdotal reports of a time compression effect in VR, and is motivated by past studies which have probed the relationship between time perception and VR but failed to find evidence of this effect. Based on Block and Zakay’s (1997) comparison of time estimation methods, we expect an interval production method to yield evidence of a compression effect in VR that has not been directly revealed by other methods.

Second, we predict that VR interval durations will be more variable across participants than CM interval durations. Higher variability is naturally expected if VR interval durations are longer, assuming that errors are proportional to the size of the estimate. Additionally, we predict that variability may be further increased by uncertainty in time perception among participants in VR. If VR interferes with normal time perception, participants may be less confident in their ability to track the passage of time, and produce a wider range of interval durations.

2. Methods

2.1 Participants

Forty-one undergraduate students participated for course credit. Two of them produced extreme outlier responses (their intervals in the VR condition were more than three standard deviations above the mean), so our final analysis includes data from 39 participants (24 female and 15 male, ages 18–26, M = 19.5, SD = 1.7). The UC Santa Cruz IRB approved the study and participants provided informed consent.

2.2 Materials

In both conditions, participants played a 3D labyrinth-like game designed in Unity. Each level consisted of a floating maze inside an otherwise empty room with textured walls and floors (see Fig. 1). The lighting and object textures did not change between levels, conditions, or maze sets, and there was no representation of the user’s body. The maze was positioned in front of and below the virtual camera to allow participants to see into the maze from above. Each maze contained a ball and a glowing yellow cube representing a goal, as well as walls and holes in the floor. Participants were directed to guide the ball to the goal by tilting the maze. Each version of the game included one of two maze sets (designed to be equally complex and difficult) so that participants did not repeat any levels between the two conditions. Each version included one practice level followed by up to 13 timed levels, which became increasingly difficult to complete as the mazes became larger and more complex (to simulate the general sense of progression in video games). Letting the ball fall through a hole in the maze would restart the current level, while getting the ball to reach the goal would start the next level. Above the maze in the timed levels, white text reading, “When you think five minutes have passed, press the right bumper and trigger at the same time” continuously faded in and out on an 8-s cycle to remind participants of the interval production task.

Figure 1.

We decided it was important to include this reminder because when using an interval production method, the interval does not end until the participant chooses to end it. If a participant forgets that they were asked to keep track of time, they could produce exceedingly long intervals that are not accurately descriptive of their perception of time. Although the periodic fading of the reminder may have served as a temporal cue to make time perception more accurate, we do not expect it to have confounded our results because it was presented the same way in the VR and CM conditions of the game.

Participants used an Xbox 360 controller (Microsoft Corporation; Redmond, WA, USA) to manipulate the maze. They could tilt it in eight directions by moving the left joystick and could return it to its original position by holding any of the colored buttons (A, B, X, or Y). The right trigger and bumper (buttons at the back of the controller) were pressed simultaneously to end the practice level, and later to indicate the end of the perceived 5-min interval.

In the VR condition, participants wore an Oculus Rift CV1 HMD (Oculus VR; Menlo Park, CA, USA) with head-tracking enabled to show a stable 3D environment. In the CM condition, participants viewed the game on a 20-inch Dell monitor with a 1920 × 1080 pixel resolution and a 60 Hz refresh rate. Participants in the CM condition were seated approximately 45 cm away from the monitor. At this distance, the maze subtended approximately 22 degrees by 22 degrees of visual angle. Participants in the VR condition saw the maze from a virtual camera that was positioned similarly with respect to the maze, but the maze subtended a slightly larger visual angle (approximately 30 degrees by 30 degrees). However, participants were allowed to move freely during the game in both conditions, so the visual angle of the maze varied considerably across participants and across maze levels. Other than these differences between displays, the game was played on the same computer hardware between conditions.
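The visual angles quoted above follow from the standard relation between an object's extent and the viewing distance. As a rough sketch (the function names are ours, and the on-screen size of the maze is not reported in the text, so it is back-calculated here rather than measured):

```python
import math

def visual_angle_deg(extent_cm, distance_cm):
    """Angle subtended by an object of the given extent, viewed head-on."""
    return math.degrees(2 * math.atan(extent_cm / (2 * distance_cm)))

def extent_for_angle_cm(angle_deg, distance_cm):
    """Inverse relation: the extent that subtends angle_deg at distance_cm."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)

# Reported CM viewing geometry: roughly 45 cm away, roughly 22 degrees.
# The resulting extent is an inference, not a value given in the study.
maze_extent_cm = extent_for_angle_cm(22, 45)
print(f"Implied on-screen maze extent: {maze_extent_cm:.1f} cm")
```

Because the angle depends on distance, a participant leaning toward the monitor would see the same maze under a larger angle, which is why the reported figures are only approximate.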

After completing both conditions, participants filled out a questionnaire that asked about the difficulty of tracking time and of playing the game, their confidence in their ability to estimate time, and their previous experience with VR and video games. The questionnaire also included 19 Likert-scale items about immersion (e.g., “I felt detached from the outside world”). The purpose of this immersion scale was to measure whether participants felt significantly more immersed in the VR condition than in the CM condition, and to show whether immersion played a mediating role in any time compression effect we might find.

2.3 Procedure

We used a counterbalanced within-subjects design because we expected time perception accuracy to be highly variable between people. There were two display conditions (virtual reality [VR] and conventional monitor [CM]) as well as two sets of mazes (A and B). Each participant played the game once in VR and once on the CM, one of which used maze set A and the other used set B. Display condition and maze set were both counterbalanced to minimize order and maze difficulty effects.
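One way to picture this design is as four cells: two display orders crossed with two maze-set orders. The sketch below cycles participants through those cells. The rotation scheme and all names are illustrative assumptions, since the text does not describe how participants were actually assigned.

```python
from itertools import product

DISPLAY_ORDERS = [("VR", "CM"), ("CM", "VR")]
MAZE_ORDERS = [("A", "B"), ("B", "A")]

# Four counterbalancing cells: every display order paired with every maze order.
CELLS = list(product(DISPLAY_ORDERS, MAZE_ORDERS))

def session_plan(participant_index):
    """Return [(display, maze_set), (display, maze_set)] for the two blocks.

    ASSUMPTION: participants are rotated through the cells in order of arrival.
    """
    displays, mazes = CELLS[participant_index % len(CELLS)]
    return list(zip(displays, mazes))

for i in range(4):
    print(i, session_plan(i))
```

Each participant sees both displays and both maze sets exactly once, and across any four consecutive participants every display-by-maze-by-block combination occurs equally often.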

Participants were asked to keep their phones and watches out of sight for the duration of the experiment, and to sit in front of a computer at a desk in our lab room. No clocks were visible to the participants, and research assistants in adjacent rooms refrained from using time-related language. Figure 2 illustrates the equipment used in each condition. A research assistant read instructions on how to play the game, and the practice level was started while the controls were described. Participants were told they could play the practice level for as long as they wanted to get comfortable with the game, and that it was not timed. Once they were ready to stop practicing, they could start the timed levels, which they were instructed to end once they felt they had been playing for five minutes. The research assistant left the room and shut the door after the instructions to minimize distractions and aural cues from outside the room.

Figure 2.

We chose not to vary the duration of the intervals that participants were instructed to produce because of our limited sample size. We set the target duration at five minutes because it is a familiar and memorable unit of time, and we expected it would be long enough to discourage deliberate counting of seconds, but short enough to minimize fatigue effects (especially in the second sessions).

When the participant ended the timed levels, the elapsed time in seconds since the end of the practice level was automatically recorded in a text file, along with their practice time and the level that the participant had reached. No feedback about how much time had actually passed was given to the participant. Then, the research assistant briefly reminded the participant of the instructions and started the second game, which used the display condition and maze set that were not used in the first game.

After both versions of the game were completed, the participant was brought to a new room to complete a post-task survey (see Materials above).

3. Results

We conducted a two-way mixed-effects ANOVA with factors of starting display type (VR or CM) and block number (first or second). The results, shown in Fig. 3, revealed a main effect of block number (F(1, 37) = 9.94, p = 0.003, ηp² = 0.212), indicating that the mean duration of intervals produced in the second block (341.9 s) was significantly longer than that of intervals produced in the first block (290.1 s). Importantly, there was a main effect of starting display type (F(1, 37) = 6.45, p = 0.015, ηp² = 0.148). Participants who played the VR game first (and the CM game second) produced longer intervals than participants who played the CM game first (and the VR game second). This means that the effect of display type on interval duration depends on order: in the first block, participants in the VR condition produced longer durations (327.4 s on average) than participants in the CM condition (254.8 s), whereas in the second block, VR durations (299.9 s) were shorter than CM durations (386.2 s). Furthermore, we found a strong correlation between participants’ first and second interval durations (r = 0.62, p < 0.001, n = 39), suggesting individuals’ second intervals were heavily anchored to their first ones. Because of this order effect, we limit our remaining analyses to first-block responses.
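The text does not state how many participants started in each display condition, but the reported means constrain the split. The sketch below is a consistency check, assuming that each block mean is a participant-weighted average of the two group means (the variable names are ours):

```python
N_TOTAL = 39
BLOCK1_MEAN_S = 290.1
BLOCK2_MEAN_S = 341.9

VR_FIRST_BLOCK1_S = 327.4  # VR-first group, playing in VR
CM_FIRST_BLOCK1_S = 254.8  # CM-first group, playing on the monitor
VR_FIRST_BLOCK2_S = 386.2  # VR-first group, now on the monitor
CM_FIRST_BLOCK2_S = 299.9  # CM-first group, now in VR

def implied_group_size(total_n, overall_mean, mean_a, mean_b):
    """Solve overall = (n_a*mean_a + (total_n - n_a)*mean_b) / total_n for n_a."""
    return total_n * (overall_mean - mean_b) / (mean_a - mean_b)

# Infer the split from block 1, then check it against block 2.
n_vr_first = round(
    implied_group_size(N_TOTAL, BLOCK1_MEAN_S, VR_FIRST_BLOCK1_S, CM_FIRST_BLOCK1_S)
)
n_cm_first = N_TOTAL - n_vr_first
block2_check = (
    n_vr_first * VR_FIRST_BLOCK2_S + n_cm_first * CM_FIRST_BLOCK2_S
) / N_TOTAL

print(n_vr_first, n_cm_first, round(block2_check, 1))
```

Under that assumption the split comes out as 19 VR-first and 20 CM-first participants, and the same split reproduces the reported second-block mean to within rounding. This is an inference from the summary statistics, not a figure given by the authors.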

Figure 3.

As shown in Fig. 4, first-block participants in the VR condition let significantly more time pass than first-block participants in the CM condition before indicating that five minutes had passed (t(37) = 2.165, p = 0.037, d = 0.693). VR intervals were 327.4 s long (SD = 114.0) on average, and CM intervals were 254.8 s (SD = 95.1) on average. This means that in the VR condition, 72.6 more seconds (95% CI, [4.6, 140.6]) passed on average before participants felt that five minutes had elapsed. This finding supports our first hypothesis, that participants experience time compression in VR compared to playing an identical game on a CM.
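These statistics can be approximately reproduced from the reported means and standard deviations. The sketch below assumes a pooled-variance (Student) t-test, which is what 37 degrees of freedom for 39 participants suggests. It also assumes group sizes of 19 (VR) and 20 (CM), which are not stated in the text but are the split most consistent with the reported block means, and it hard-codes a critical t value taken from a standard table.

```python
import math

# Reported first-block summary statistics (seconds).
VR_MEAN, VR_SD = 327.4, 114.0
CM_MEAN, CM_SD = 254.8, 95.1
# ASSUMPTION: group sizes are inferred, not reported.
N_VR, N_CM = 19, 20
# ASSUMPTION: two-tailed 95% critical t value for 37 degrees of freedom.
T_CRIT_DF37 = 2.026

def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))

def student_t_summary(m1, sd1, n1, m2, sd2, n2):
    """Mean difference, standard error, t statistic, and Cohen's d."""
    sp = pooled_sd(sd1, n1, sd2, n2)
    se = sp * math.sqrt(1 / n1 + 1 / n2)
    diff = m1 - m2
    return {"diff": diff, "se": se, "t": diff / se, "d": diff / sp}

result = student_t_summary(VR_MEAN, VR_SD, N_VR, CM_MEAN, CM_SD, N_CM)
ci_low = result["diff"] - T_CRIT_DF37 * result["se"]
ci_high = result["diff"] + T_CRIT_DF37 * result["se"]
print(f"t = {result['t']:.3f}, d = {result['d']:.3f}, "
      f"95% CI = [{ci_low:.1f}, {ci_high:.1f}]")
```

Small discrepancies from the published values are expected, because the inputs here are rounded summary statistics rather than the raw data.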

Figure 4.

To rule out an account based on differences in task difficulty, we compared how quickly participants in the two conditions completed the levels of the maze game. Figure 5 shows that the relationship between interval duration and level reached is described by a similar linear relationship in the two conditions. To determine whether these slopes were significantly different, we ran 10,000 bootstrap samples from each condition to compare the resulting best-fit lines and found that the 95% confidence interval for the difference between best-fit slopes in the VR and the CM condition [−0.0021, 0.0072] contained zero. Therefore, participants across the VR and CM conditions completed levels at similar rates, suggesting that the time compression effect cannot be attributed to participants spending more time on each level in VR compared to CM and using the number of levels completed as a proxy to decide when five minutes had elapsed. Furthermore, we found no significant difference in practice time between conditions (t(37) = −0.147, p > 0.5, d = 0.047), suggesting it was not more difficult to learn the game in VR than in CM.
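A percentile bootstrap of this kind can be sketched as follows. The data points are invented purely for illustration, and the resampling details (pairs resampled with replacement within each condition, level reached regressed on interval duration, percentile interval) are our assumptions. The text does not specify the exact procedure.

```python
import random

def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def resampled_slope(pairs, rng):
    """Slope of one bootstrap resample; redraw if all x values coincide."""
    while True:
        sample = [rng.choice(pairs) for _ in pairs]
        xs = [x for x, _ in sample]
        if len(set(xs)) > 1:
            return ols_slope(xs, [y for _, y in sample])

def bootstrap_slope_difference_ci(pairs_a, pairs_b, n_boot=10_000, seed=0):
    """95% percentile interval for slope(a) - slope(b)."""
    rng = random.Random(seed)
    diffs = sorted(
        resampled_slope(pairs_a, rng) - resampled_slope(pairs_b, rng)
        for _ in range(n_boot)
    )
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]

# Made-up (interval duration in s, level reached) pairs, NOT the study's data.
vr_pairs = [(210, 5), (260, 6), (300, 7), (330, 8), (410, 9), (480, 11)]
cm_pairs = [(180, 4), (230, 6), (250, 6), (290, 7), (340, 8), (400, 10)]
ci_low, ci_high = bootstrap_slope_difference_ci(vr_pairs, cm_pairs)
print(f"95% CI for slope difference: [{ci_low:.4f}, {ci_high:.4f}]")
```

If the resulting interval contains zero, the data give no evidence that levels were completed at different rates in the two conditions, which is the logic applied above.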

Figure 5.

We did not find support for the hypothesis that produced interval durations would be more variable in the VR condition. Although intervals produced in the VR condition (SD = 114 s) were slightly more variable than intervals produced in the CM condition (SD = 95 s), Levene’s test showed that there was no significant difference in interval variance between conditions (F(1, 37) = 0.195, p > 0.5).

The survey responses did not reveal a significant relationship between interval durations and previous experience with video games or with VR, nor was there a significant difference between conditions in rated difficulty (either of the game or of keeping track of time). This result conflicts with our second prediction that time estimation in VR would be more difficult, and that produced intervals would therefore be more variable in VR compared to CM. However, because the survey was administered after participants had completed both tasks, it is possible that participants’ responses pertaining to one condition were confounded by their experience with the other. In fact, we found no significant differences in ratings of immersion between the VR and CM conditions. Only one of the 19 Likert scales about immersion (“I did not feel like I was in the real world but the game world”) appeared to be higher in VR compared to CM (t(36) = 2.215, p = 0.033, d = 0.717), but this difference did not reach significance at the Bonferroni-corrected alpha of 0.0026 (see Supplementary Table S1 for the complete immersion scale results). The surprising lack of an immersion difference between conditions suggests that administering the survey after both conditions were completed may have diminished our ability to detect an effect.
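The corrected threshold quoted above is what a Bonferroni correction gives for 19 tests, assuming a conventional family-wise alpha of 0.05 (the 0.05 is our assumption, as the text reports only the corrected value):

```python
FAMILY_ALPHA = 0.05   # ASSUMPTION: conventional family-wise error rate
N_TESTS = 19          # one test per immersion item

def bonferroni_alpha(family_alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return family_alpha / n_tests

corrected_alpha = bonferroni_alpha(FAMILY_ALPHA, N_TESTS)
p_item = 0.033  # reported p value for the one item that differed
print(f"corrected alpha = {corrected_alpha:.4f}, "
      f"significant: {p_item < corrected_alpha}")
```

The single item's p value is well above the corrected threshold, so it does not count as a significant difference once all 19 comparisons are taken into account.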

4. Review/Discussion

These results constitute the first evidence that VR as a medium produces a unique time compression effect. At least one previous experiment (Schneider et al., 2011) successfully used VR to produce a similar effect, but the present study is the first to observe time compression as a significant difference between VR and non-VR experiences with otherwise identical content. Importantly, our results suggest that there is something inherent about the VR interface (as opposed to a characteristic of its content) that produces a time compression effect.

Most of the previously observed effects on prospective time estimation are related to attention, but the significance of our main finding does not appear to be attributable to a difference in attentional demands. The tasks in both conditions were of identical complexity and difficulty; the two sets of maze levels were counterbalanced across conditions, and participants in both conditions spent about the same amount of time on each level.

The VR condition did present a simpler scene to the participant than the CM condition (it had a narrower field of view, and the physical lab environment was not visible), but this is unlikely to explain our effect either. Visual-stimulus complexity has been found to only affect retrospective estimates (Block & Zakay, 1997). If we were to repeat this experiment using retrospective estimates, we would expect to find shorter perceived intervals in the VR condition, because the VR scene presents a smaller amount of information that could be later recalled from memory. This would also be a kind of time compression effect, but assuming that the participants’ attention remains on the screen during the interval, we would expect a much weaker effect than the one we found. Based on Block and Zakay’s (1997) meta-analysis, though, stimulus complexity should have no significant effect on prospective estimation tasks like the one we used.

Arousal can also influence prospective time estimation in general, but it is highly unlikely to explain our main finding because of the direction of its effect. Images displayed in VR have been found to elicit higher arousal than the same images displayed on conventional monitors (Estupiñán et al., 2014), but higher arousal is associated with time dilation, according to the attentional-gate model (Zakay & Block, 1995). In the context of our study, this would predict that participants in the VR condition would produce shorter intervals than participants in the CM condition. Because produced intervals in the VR condition were in fact longer, we conclude that arousal did not play a role in the main effect we observed, either.

One difference between our two conditions that does seem likely to be responsible for the effect is that participants could not see their own body, or any representation of it, in the VR condition. In pacemaker–accumulator models of time perception, pulse generation is treated as an abstract module of the time estimation process, but it is thought to be a function of bodily rhythms like heart rate, breathing, or neural oscillations (Pollatos et al., 2014; Wittmann, 2009). The model’s inclusion of arousal as an influence on the pacemaker is based on this assumption, and there is accumulating evidence that time estimation accuracy is dependent on awareness of bodily rhythms. It has been found that time estimation accuracy is significantly correlated both with ability to estimate one’s own heart rate (Meissner & Wittmann, 2011), and with heart rate variability itself (Cellini et al., 2015). A more recent study found that people with high interoceptive accuracy are less susceptible to emotion-induced distortions of time perception (Özoğlu & Thomaschke, 2020).

Bodily awareness was measured as a participant variable in those studies, but it can also be manipulated. An experiment which used a VR and non-VR version of the same interactive environment found that bodily awareness was reduced in VR (Murray & Gordon, 2001). Specifically, the participants in the VR condition gave significantly lower ratings on scales of cardiovascular, skin, and muscle awareness. This is presumably related to the absence of any visible representation of the users’ body in the VR scene.

The combination of these two findings, (1) that prospective time estimation accuracy is related to awareness of bodily rhythms and (2) that being in VR reduces bodily awareness, suggests a likely explanation for the effect observed in the current study: participants in the VR condition were less aware of the passage of time because they were less aware of the bodily rhythms that form the basis of prospective time perception.

This is notable because the most prominent models of prospective time estimation do not account for interoceptive awareness as an independent influence on perceived interval durations. For example, pacemaker–accumulator models like Zakay and Block’s (1995) attentional gate include arousal, attention, and reference memory ‒ but not interoceptive awareness ‒ as influences on prospective time estimation. Because we suspect that a difference in interoceptive awareness (and not in attention, arousal, or memory) best explains the VR-induced time compression effect, models like these might be modified to account for interoceptive awareness as an independent influence on prospective time estimation. Dedicated timing models (Ivry & Schlerf, 2008) such as the attentional-gate model involve a pacemaker module that produces pulses that depend on bodily rhythms such as heart rate, breathing, or neural oscillations. We propose that such models might be amended to include interoceptive awareness as a factor that mediates the reception of these pulses. Impairing interoceptive awareness would lead to underestimations of time by reducing the number of pulses that ultimately reach the accumulator. Although prominent models so far have not treated interoceptive awareness as its own factor, our results suggest that it may affect time estimation independently from attentional demands, arousal, and reference memory.
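The proposed amendment can be illustrated with a deliberately simple toy model. It is not the attentional-gate model itself, nor anything the authors implemented. It assumes a pacemaker that ticks at a constant rate, a reference count learned when every pulse is registered, and a hypothetical `gate_fraction` parameter standing in for interoceptive awareness.

```python
def produced_interval_s(target_s, gate_fraction, pulse_rate_hz=1.0):
    """Real seconds that pass before the accumulator reaches the reference count.

    target_s:       duration the participant is trying to produce
    gate_fraction:  share of pacemaker pulses that reach the accumulator
                    (1.0 = full interoceptive awareness; hypothetical parameter)
    pulse_rate_hz:  pacemaker rate, assumed constant for simplicity
    """
    if not 0 < gate_fraction <= 1:
        raise ValueError("gate_fraction must be in (0, 1]")
    # Reference memory: pulses expected for target_s when none are lost.
    reference_count = target_s * pulse_rate_hz
    # With reduced awareness, pulses accumulate more slowly in real time.
    registered_rate_hz = pulse_rate_hz * gate_fraction
    return reference_count / registered_rate_hz

for gate in (1.0, 0.9, 0.8):
    print(gate, round(produced_interval_s(300, gate), 1))
```

In this sketch, losing a share of pulses lengthens the produced interval without changing how long it feels, which is the direction of the effect described above. The specific gate values are arbitrary and are not fitted to the study's data.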

The durations of participants’ second intervals were heavily anchored to first interval durations. It could be that the time production task in the first block severely revised each participant’s reference for what a five-minute interval feels like, and caused them to use that new reference to produce the second interval. Second intervals were also longer. This effect was exhibited by participants who played the VR version first and then switched to CM, as well as those who started with CM and switched to VR. The greater durations of second block intervals could be due to a novelty effect which may have dilated time perception more during the first block compared to the second block. Alternatively, participants may have expected to complete more levels in a 5-min period during the second block after having gained experience with the task. If participants expected to complete more levels in the second block, and used the level reached in the first block as a proxy to indicate the passing of five minutes, they may have purposely played additional levels in the second block. In fact, participants did on average play one additional level in the second block, but the rate of completing levels was no faster compared to the first block.

It is well established that order effects in general can confound results when counterbalancing is not used, but in our case the order effect was so overwhelming that the time compression effect becomes completely obscured if we analyze our data without regard for condition order. This suggests that counterbalancing may not be sufficient for experiments which use interval production tasks, and that future studies should use between-subjects designs when possible.

A follow-up experiment could further investigate the role of interoception in VR-induced time compression by having participants complete a bodily awareness scale after they complete the maze game. Using a between-subjects design in such an experiment would allow the questionnaire to be administered immediately after a single playthrough of the maze game, making it more valid than ours (which was administered after participants had completed both conditions).

Including an additional VR condition with a virtual body representation could also help clarify the role of body visibility in time perception (and more broadly, in bodily awareness). It is unclear now if hiding one’s body from view is enough to reduce bodily awareness, or if the effect depends on the VR-induced feeling of presence that makes the user feel as though they are in a place that is remote from their body. If adding a virtual body were found to both increase bodily awareness and mitigate the time compression effect, that would support the idea that reduced body visibility is responsible for the main effect we observed. If that manipulation were found to have no impact on bodily awareness or the time compression effect, it would suggest that the effect depends not on body visibility but on some higher-level feeling of virtual presence.

Another limitation of the present experiment is that we did not vary the duration of the interval that participants were asked to produce. Bisson and Grondin (2013) and Tobin et al. (2010) found that during gaming and internet-surfing tasks, significant time compression effects were only evident during longer sessions (around 30 min or longer). The authors of those studies note that this difference may be due to the time estimation methods they used: participants were asked to verbally estimate durations, and might have rounded their answers to multiples of five minutes. This rounding bias would have a much stronger influence on the results of their shorter-interval trials (12 min) than on their longer-interval trials (35, 42, or 58 min). Our finding of a time compression effect on a five-minute scale suggests that the interval production method we used likely protected our results from such a rounding bias. It is unclear whether or how the VR-induced effect we found might depend on the target duration of the produced interval. Future studies investigating this effect could explore this influence by instructing participants in different conditions to produce intervals shorter and longer than five minutes.

If transient reminders like the one we used are employed during prospective time estimation tasks, we recommend that the timing of the reminders be pseudo-randomized. Our reliably periodic reminder may have helped our participants produce more accurate intervals in both conditions. Making the cue unreliable might reveal a larger effect, which could be crucial in experiments that test time perception in more delicate contexts.

4.1 Implications for VR Experience Design

An average of 28.5% more real time passed for participants who played the VR game than for those in the control group ‒ with no difference in perceived duration. If this effect proves to generalize to other contexts at similar magnitudes, it will have significant implications. Keeping track of time accurately is desirable in most situations, and impairing that ability could be harmful.
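The 28.5% figure follows directly from the first-block means reported in the Results, taking the CM group as the baseline:

```python
VR_MEAN_S = 327.4  # mean first-block interval in VR
CM_MEAN_S = 254.8  # mean first-block interval on the conventional monitor

extra_seconds = VR_MEAN_S - CM_MEAN_S
percent_more = 100 * extra_seconds / CM_MEAN_S
print(f"{extra_seconds:.1f} s more, i.e. {percent_more:.1f}% of the CM mean")
```

Note that the percentage is relative to the CM mean, not to the five-minute target that participants were asked to produce.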

Time compression might cause VR users to unintentionally spend excessive amounts of time in games, especially as HMDs become more comfortable to wear for long sessions. Even non-immersive games entail some risk of addiction, which has been associated with depression and insomnia (Kuss & Griffiths, 2012). VR games may pose a greater risk of interfering with their players’ sleep schedules, mood, and health by reducing their ability to notice the passage of time. Developers should take care not to create virtual ‘casinos’; a clock should always be easily accessible, and perhaps even appear automatically at regular intervals.

On the other hand, time compression effects can be desirable in situations that are unpleasant but necessary, and there are potential applications that could take advantage of the effect in a beneficial way. VR might be used, for example, to reduce the perceived duration of long-distance travel. More importantly, the value of using VR to make chemotherapy more bearable (Schneider et al., 2011) is supported by the current study. Especially considering that VR has been used successfully as an analgesic (Hoffman et al., 2011), VR experiences could be similarly applied to reduce the negative psychological impact of other painful medical treatments. Our interpretation of the results suggests that other equipment or treatments which reduce bodily awareness, such as sensory deprivation tanks, may also be useful for producing time compression effects.

Supplementary Material

Supplementary material and the full original version of this report are available online at:


Bansal, A., Weech, S., & Barnett-Cowan, M. (2019). Movement-contingent time flow in virtual reality causes temporal recalibration. Sci. Rep., 9, 4378. doi: 10.1038/s41598-019-40870-6.

Bisson, N., & Grondin, S. (2013). Time estimates of internet surfing and video gaming. Timing Time Percept., 1, 39–64. doi: 10.1163/22134468-00002002.

Block, R.A., & Zakay, D. (1997). Prospective and retrospective duration judgments: A meta-analytic review. Psychon. Bull. Rev., 4, 184–197. doi: 10.3758/BF03209393.

Cellini, N., Mioni, G., Levorato, I., Grondin, S., Stablum, F., & Sarlo, M. (2015). Heart rate variability helps tracking time more accurately. Brain Cogn., 101, 57–63. doi: 10.1016/j.bandc.2015.10.003.

Davis, S., Nesbitt, K., & Nalivaiko, E. (2014). A systematic review of cybersickness. Proc 2014 Conf. Interact. Entertain., 1–9. doi: 10.1145/2677758.2677780.

Estupiñán, S., Rebelo, F., Noriega, P., Ferreira, C., & Duarte, E. (2014). Can virtual reality increase emotional responses (arousal and valence)? A pilot study. In A. Marcus (Ed.), Design, User Experience, and Usability. User Experience Design for Diverse Interaction Platforms and Environments. DUXU 2014. Lecture Notes in Computer Science, Vol. 8518, pp. 541–549. Springer, Cham, Switzerland. doi: 10.1007/978-3-319-07626-3_51.

Freeman, D., Evans, N., Lister, R., Antley, A., Dunn, G., & Slater, M. (2014). Height, social comparison, and paranoia: An immersive virtual reality experimental study. Psychiat. Res., 218, 348–352. doi: 10.1016/j.psychres.2013.12.014.

Grondin, S. (2010). Timing and time perception: a review of recent behavioral and neuroscience findings and theoretical directions. Atten. Percept. Psychophys., 72, 561–582. doi: 10.3758/APP.72.3.561.

Heeter, C. (1992). Being there: the subjective experience of presence. Presence (Camb.), 1, 262–271. doi: 10.1162/pres.1992.1.2.262.

Hoffman, H. G., Chambers, G. T., Meyer, III, W. J., Arceneaux, L. L., Russell, W. J., Seibel, E. J., Richards, T. L., Sharar, S. R., & Patterson, D. R. (2011). Virtual reality as an adjunctive non-pharmacologic analgesic for acute burn pain during medical procedures. Ann. Behav. Med., 41, 183–191. doi: 10.1007/s12160-010-9248-7.

Ivry, R. B., & Schlerf, J. E. (2008). Dedicated and intrinsic models of time perception. Trends Cogn. Sci., 12, 273–280. doi: 10.1016/j.tics.2008.04.002.

Kuss, D. J., & Griffiths, M. D. (2012). Internet gaming addiction: a systematic review of empirical research. Int. J. Ment. Health Addict., 10, 278–296. doi: 10.1007/s11469-011-9318-5.

Meissner, K., & Wittmann, M. (2011). Body signals, cardiac awareness, and the perception of time. Biol. Psychol., 86, 289–297. doi: 10.1016/j.biopsycho.2011.01.001.

Miller, R. (2016). Oculus founder thinks VR may affect your ability to perceive time passing. The Verge.

Murray, C. D., & Gordon, M. S. (2001). Changes in bodily awareness induced by immersive virtual reality. CyberPsychol. Behav., 4, 365–371. doi: 10.1089/109493101300210268.

Nuyens, F. M., Kuss, D. J., Lopez-Fernandez, O., & Griffiths, M. D. (2020). The potential interaction between time perception and gaming: A narrative review. Int. J. Ment. Health Addict., 18, 1226–1246. doi: 10.1007/s11469-019-00121-1.

Özoğlu, E., & Thomaschke, R. (2020). Knowing your heart reduces emotion-induced time dilation. Timing Time Percept., 8, 299–315. doi: 10.1163/22134468-bja10016.

Pollatos, O., Yeldesbay, A., Pikovsky, A., & Rosenblum, M. (2014). How much time has passed? Ask your heart. Front. Neurorobot., 8, 15. doi: 10.3389/fnbot.2014.00015.

Riva, G., Mantovani, F., Capideville, C. S., Preziosa, A., Morganti, F., Villani, D., Gaggioli, A., Botella, C., & Alcañiz, M. (2007). Affective interactions using virtual reality: the link between presence and emotions. CyberPsychol. Behav., 10, 45–56. doi: 10.1089/cpb.2006.9993.

Schatzschneider, C., Bruder, G., & Steinicke, F. (2016). Who turned the clock? Effects of manipulated zeitgebers, cognitive load and immersion on time estimation. IEEE Trans. Vis. Comput. Graph., 22(4), 1387–1395. doi: 10.1109/tvcg.2016.2518137.

Schneider, S. M., Kisby, C. K. & Flint, E. P. (2011). Effect of virtual reality on time perception in patients receiving chemotherapy. Support. Care Cancer, 19, 555–564. doi: 10.1007/S00520-010-0852-7.

Tobin, S., Bisson, N., & Grondin, S. (2010). An ecological approach to prospective and retrospective timing of long durations: a study involving gamers. PLoS ONE, 5, e9271. doi: 10.1371/journal.pone.0009271.

van der Ham, I. J. M., Klaassen, F., van Schie, K., & Cuperus, A. (2019). Elapsed time estimates in virtual reality and the physical world: the role of arousal and emotional valence. Comput. Hum. Behav., 94, 77–81. doi: 10.1016/j.chb.2019.01.005.

Wittmann, M. (2009). The inner experience of time. Philos. Trans. R. Soc. B Biol. Sci., 364, 1955–1967. doi: 10.1098/rstb.2009.0003.

Zakay, D. (1993). Time estimation methods ‒ Do they influence prospective duration estimates? Perception, 22, 91–101. doi: 10.1068/p220091.

Zakay, D., & Block, R. A. (1995). An attentional-gate model of prospective time estimation. In M. Richelle, V. D. Keyser, G. d’Ydewalle & A. Vandierendonck (Eds.), Time and the dynamic control of behavior (pp. 167–178). Liège, Belgium: Université de Liège.

Portalco delivers an alternative method of delivery for any VR or non-VR application through advanced interactive (up to 360-degree) projection displays. Our innovative Portal range includes immersive development environments ready to integrate with any organisational, experiential or experimental requirement. The Portal Play platform is the first ready-to-go projection platform of its type and is designed specifically to enable mass adoption, so that users can access, benefit from and evolve with immersive projection technologies and shared immersive rooms.

Virtual Reality- Understanding The Screen Door Effect

Have you ever worn a virtual reality headset excited for the immersive experience, only to have it marred by a disruptive black grid blanketing your vision? That isn’t your eyes playing tricks on you: it’s actually a real tech phenomenon, known as the screen door effect.

Of course, the key to a successful immersive application is to captivate the user and involve them in the virtual world, not just offer them a view from the outside (we already have standard screens and monitors for that!). And if the entire world is blanketed by a strange meshing like the one you get on a screen door, it may feel as painful as being slammed in the face by said door... okay, maybe not that painful, but the full power of immersion can’t be unlocked.

Why exactly does the screen door effect happen?

Nowadays, most electronic screens are either LCD or OLED displays. They are made up of loads of pixels, each consisting of different-coloured subpixels (red, green and blue).


If you have a TV or a computer monitor, chances are it’s an LCD. All of the pixels are lit up by a singular backlight, which illuminates the entire screen.


OLEDs are most commonly found in smartphones and virtual reality headsets (though OLED TVs and monitors exist now too!). OLED displays tend to be crisper; the structure of the pixels is often quite different, and each individual pixel emits light by itself.

Wasted space

As illustrated in the above images, between each of the pixels (and subpixels) is an area of unlit space - and it’s this space that is the culprit of the intrusive screen meshing.

The screen door effect isn’t unique to VR displays; if you sit too close to the television or put a magnifying glass over your smartphone’s screen, you may be able to see the minuscule mosaic of red, green and blue shapes. But when you casually watch TV or text a friend, your eyes will be at a comfortable distance from the screen where you won’t even be able to make out the gaps between the pixels.

However, with head-mounted devices, it’s a different story. When you wear a virtual reality headset, your eyes are much closer to the display than they would be looking at a phone, computer monitor or television. The lens magnification and all-encompassing field of view of the headset visuals only further exaggerate the gap between the pixels.

How can we prevent it?

Due to the way LCD and OLED screens work, the screen door effect is impossible to shut out completely (at least at the moment). That doesn’t mean there aren’t any ways we can make the effect less prevalent, but they’re not without their own drawbacks.

Needs more cowbell pixels

Probably the most obvious option. Increasing the number of pixels per inch (ppi) decreases the amount of black space in the display, and more and more tech companies are recognising the need to improve the quality and crispness of their products’ screens. Nowadays, a commercial VR headset display can have upwards of 500 pixels per inch.

However, a higher pixel density may require a larger display to host all those pixels, not to mention more system memory to handle each pixel’s information without sacrificing motion smoothness.
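To put rough numbers on that memory cost, here is a minimal Python sketch. The two per-eye resolutions, the 4 bytes per pixel and the 90 Hz refresh rate are illustrative assumptions, not the specifications of any particular headset.

```python
def frame_bytes(width, height, bytes_per_pixel=4):
    """Memory needed to hold one uncompressed frame for one eye."""
    return width * height * bytes_per_pixel


def bytes_per_second(width, height, refresh_hz, eyes=2, bytes_per_pixel=4):
    """Raw pixel data that has to be produced every second for both eyes."""
    return frame_bytes(width, height, bytes_per_pixel) * eyes * refresh_hz


if __name__ == "__main__":
    # Hypothetical per-eye resolutions, chosen only to show the scaling.
    examples = {"lower density": (1080, 1200), "higher density": (2160, 2400)}
    for label, (w, h) in examples.items():
        megabytes = frame_bytes(w, h) / 1e6
        gigabytes_per_s = bytes_per_second(w, h, refresh_hz=90) / 1e9
        print(f"{label}: {megabytes:.1f} MB per frame, {gigabytes_per_s:.2f} GB/s")
```

Doubling the pixel count along each axis quadruples both figures, which is why packing in more pixels quickly becomes a performance question as well as a hardware one.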

Diffusion filters

Diffusion filters are a common and inexpensive visual trick for making images look softer and smoother. The material of the diffuser scatters the emitted light, creating the illusion of the pixels blending into one another and eliminating a lot of the explicit gaps between each one.

It’s important to note, however, that this technique also carries some burdens of its own. If not done carefully, a diffuser may make visuals look too blurry to be convincing, and even cause more disruptive artifacts such as sparkling and moiré-patterned (ripple-like) interference. Many users may also feel themselves having to make more effort to focus on the visuals, resulting in both hindered immersion and bad eye strain!
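The trade-off can be illustrated with a toy model in Python. This is only a sketch: the row of brightness samples, the 6-to-2 ratio of lit area to gap and the simple box blur standing in for the diffuser are all assumptions made for illustration, not an optical simulation.

```python
def pixel_row(pixels=8, lit=6, gap=2):
    """Brightness samples along one row: lit pixel areas (1.0) separated by dark gaps (0.0)."""
    return ([1.0] * lit + [0.0] * gap) * pixels


def box_blur(row, radius):
    """Average each sample with its neighbours; wraps around at the edges for simplicity."""
    n = len(row)
    width = 2 * radius + 1
    return [
        sum(row[(i + k) % n] for k in range(-radius, radius + 1)) / width
        for i in range(n)
    ]


def contrast(row):
    """Michelson contrast: 1.0 means fully dark gaps, 0.0 means a perfectly even row."""
    brightest, darkest = max(row), min(row)
    return (brightest - darkest) / (brightest + darkest)


if __name__ == "__main__":
    row = pixel_row()
    for radius in (0, 1, 2, 4):
        print(f"blur radius {radius}: gap contrast {contrast(box_blur(row, radius)):.2f}")
```

The stronger the blur, the less the gaps stand out, but exactly the same averaging is applied to every genuine edge in the picture, which is where the softness and extra focusing effort come from.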

Low pass filters

A low-pass filter positioned over the screen is a straightforward way of filtering out the high-frequency pattern formed by the display’s black gaps.

However, implementing such a filter bumps up not only the cost but also the power consumption (which is also expensive and, in today’s ecologically-minded climate, not ideal!).

Mechanical shifting

Researchers at Facebook Reality Labs, the developers of the Oculus Rift display, have proposed a way of mitigating the screen door effect via piezo actuators, devices which convert electrical energy into a high-speed force of displacement without any moving parts. Two of these devices would shift the pixels along the screen so that they occupy unlit gaps, minimising the perceived distance between them. While this sounds like it would make everything shaky, this would be done at such a rapid rate that the picture would still appear stable to the viewer.

Though studies into this method are currently limited, the researchers highlight that this method is most likely not ‘one-size-fits-all’, and the screen door effect in a display would need to be inspected and characterised before deciding how the pixels should be moved.
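The idea of filling the gaps by shifting can be sketched with a toy model in Python. The row layout, the shift of two samples and the assumption that the eye simply averages rapidly alternating frames are all simplifications for illustration; this is not the researchers’ actual method.

```python
def pixel_row(pixels=8, lit=6, gap=2):
    """Brightness samples along one row: lit pixel areas (1.0) separated by dark gaps (0.0)."""
    return ([1.0] * lit + [0.0] * gap) * pixels


def shifted(row, offset):
    """Move the whole pattern sideways by `offset` samples, wrapping at the edges."""
    k = offset % len(row)
    return row[-k:] + row[:-k] if k else list(row)


def time_average(row, offsets):
    """Brightness perceived if the pattern alternates quickly between the given offsets."""
    frames = [shifted(row, offset) for offset in offsets]
    return [sum(samples) / len(frames) for samples in zip(*frames)]


def dark_fraction(row):
    """Share of positions along the row that never receive any light."""
    return sum(1 for value in row if value == 0.0) / len(row)


if __name__ == "__main__":
    row = pixel_row()
    print("static display:", dark_fraction(row))
    print("shifting display:", dark_fraction(time_average(row, offsets=[0, 2])))
```

In this model a shift equal to the gap width leaves no position permanently dark. A different ratio of lit area to gap would need a different shift, which echoes the point that the effect has to be characterised for each display first.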


In 2018, Valve Corp (creators of the Valve Index VR headset and the Steam game marketplace) filed a patent for a display they believe can greatly mitigate the screen door effect.

Sandwiched between this display’s lens and the eye’s view is a phase optic, which consists of an array of microlenses (tiny lenses less than a millimetre in diameter, often in micrometres!). These would magnify the pixels and scatter light, making the meshing of gaps much more subtle. While this is probably the most sophisticated method mentioned, it’s also very expensive and fiddly to implement, so it may be a long time before we see this technique in use.

Stereoscopy In Projection Environments- Is It The Future Or The Past?

A quick timeline

1838 - Stereoscopes: the very first 3D viewer

In 1838, English inventor Charles Wheatstone made the groundbreaking discovery that would kickstart this entire branch of technology in the first place:

“The mind perceives an object of three dimensions by means of the two dissimilar pictures projected by it on the two retinae.”

In other words, each of our two eyes views the same object from a slightly different position and angle. Our brain, however, combines both signals into a singular picture, which is where our sense of depth perception stems from.
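For a feel of how different the two views are, here is a small Python sketch. It assumes an eye separation of 63 mm (a commonly quoted adult average, used here purely as an illustrative value) and an object directly in front of the viewer; the function names are made up for this example.

```python
import math


def vergence_angle_deg(distance_m, eye_separation_m=0.063):
    """Angle between the two eyes' lines of sight when both look at a point straight ahead."""
    return math.degrees(2 * math.atan((eye_separation_m / 2) / distance_m))


def disparity_deg(near_m, far_m, eye_separation_m=0.063):
    """Difference in vergence angle between a nearer and a farther point."""
    return vergence_angle_deg(near_m, eye_separation_m) - vergence_angle_deg(
        far_m, eye_separation_m
    )


if __name__ == "__main__":
    for distance in (0.5, 2.0, 10.0):
        print(f"object at {distance} m: {vergence_angle_deg(distance):.2f} degrees")
```

The angle shrinks quickly with distance, so the two eyes’ views of far-away objects are nearly identical, while nearby objects look noticeably different to each eye.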

To demonstrate this, Wheatstone developed the preliminary stereoscope: a device consisting of a pair of mirrors positioned at 45 degrees to the user’s eyes. Each mirror reflected one of two slightly different drawings of the same object, positioned off to either side and parallel to each other, creating an illusion of one singular image being displayed rather than two separate ones.

The user would find that the reflected image in the mirror had more depth and volume than the original drawing alone.

1849 - Lenticular stereoscope

It wasn’t too long before David Brewster, a scientist who specialised in optics, took note of Wheatstone’s observations and set out to improve upon the original 3D viewing device. He proposed what he termed the ‘lenticular stereoscope’ in 1849. Dubbed the world’s first portable 3D viewer, this model used two lenses in lieu of mirrors, and the user would look through the device as they would a pair of binoculars.

1853 - Anaglyph 3D (a.k.a. the retro cinema glasses)

When you think of 3D glasses, the first ones that come to mind are those paper ones with the colourful lenses, aren’t they?

Anaglyphic photographs and films display both eyes’ views on the same image simultaneously: they are positioned slightly apart and encoded by colour (usually red and cyan). When the viewer puts on those retro glasses to view this funky film, each of the cellophane lenses filters out its own colour so the eye gets to view its own designated picture.
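The colour encoding can be sketched in a few lines of Python. This minimal example assumes the common convention of taking the red channel from the left-eye image and the green and blue (cyan) channels from the right-eye image, with images represented as plain lists of (r, g, b) tuples; `make_anaglyph` is a name invented for this illustration.

```python
def make_anaglyph(left, right):
    """Combine two same-sized RGB images (rows of (r, g, b) tuples) into a red-cyan anaglyph.

    The red channel is taken from the left-eye image; green and blue come from the right-eye image.
    """
    if len(left) != len(right) or any(len(a) != len(b) for a, b in zip(left, right)):
        raise ValueError("left and right images must have the same dimensions")
    return [
        [(lpx[0], rpx[1], rpx[2]) for lpx, rpx in zip(left_row, right_row)]
        for left_row, right_row in zip(left, right)
    ]


if __name__ == "__main__":
    left_image = [[(200, 10, 10), (0, 0, 0)]]
    right_image = [[(0, 0, 0), (10, 180, 180)]]
    print(make_anaglyph(left_image, right_image))
```

Because each output pixel keeps only part of each source image’s colour information, some of the original hues are inevitably lost in the combined picture.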

Super anaglyph

One of the biggest disadvantages of the anaglyph format is that the resulting image isn’t always faithful to the original. Because the coloured lenses filter out particular hues, an anaglyph photograph may not have the full range of colours found in the original. Some companies have acknowledged this flaw and built upon the anaglyph idea by developing more sophisticated methods for separating the left and right visual channels.

Dolby 3D uses what’s known as an interference filter system (a.k.a. “super anaglyph”); rather than separate the left and right eye images by one single colour, both images are in full colour albeit at slightly different wavelengths (not different enough for the naked eye to tell apart). The lenses in the special glasses separate the display into bands of colour (red, green and blue), only letting through the picture meant for the individual eye. That way, the Dolby 3D system gives viewers the best of both worlds: eye-popping pictures in eye-popping colour!

1890 - Polarisation: light angles are the new colours

First demonstrated in the 1890s, the idea of polarised 3D works very similarly to anaglyph 3D: the images for each eye are positioned on the same display, differentiated by a filter and viewed through special glasses. The key difference is that while an anaglyph display uses colour, a polarised 3D display uses, you guessed it, polarisation.

But what exactly is polarisation?

Light travels in the form of electromagnetic waves. Most of the time, the waves that make up the light you see vibrate at all sorts of angles. When light is polarised, however, it has been filtered so that only waves vibrating at one particular angle pass through.

Polarised 3D takes advantage of this to offer audiences the three-dimensional illusion without any impact on colour. The two eyes’ images are projected onto the same screen, albeit polarised in opposite directions. The viewer puts on glasses that contain polarised filters in the lenses; each lens lets in the light polarised in one direction and blocks out all light polarised in the other.

Because of its low cost and minimal impact on picture quality, this is nowadays the most common type of 3D system used in cinemas.
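How well each lens separates the two images can be estimated with Malus’s law, which gives the fraction of linearly polarised light passing an ideal filter as the squared cosine of the angle between them. The Python sketch below assumes linear polarisation at 45 and 135 degrees for the left and right images; these angles are illustrative, and real cinema systems may use circular polarisation instead.

```python
import math


def transmitted_fraction(light_angle_deg, filter_angle_deg):
    """Malus's law: fraction of linearly polarised light passing an ideal polarising filter."""
    theta = math.radians(light_angle_deg - filter_angle_deg)
    return math.cos(theta) ** 2


if __name__ == "__main__":
    left_image_angle, right_image_angle = 45, 135  # assumed projector polarisation angles
    for head_tilt in (0, 10, 20):
        left_lens_angle = 45 + head_tilt  # the lens rotates with the viewer's head
        wanted = transmitted_fraction(left_image_angle, left_lens_angle)
        leaked = transmitted_fraction(right_image_angle, left_lens_angle)
        print(f"tilt {head_tilt} degrees: left image {wanted:.2f}, right image leak {leaked:.2f}")
```

With the head level, the left lens passes its own image in full and blocks the other one; as the head tilts, a little of the wrong image starts to leak through, which shows up as ghosting.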

1922 - Active 3D

Active 3D is when the glasses or lenses used to view are ‘active’ in the sense of being powered, rather than just serving as mere panels to look through. Though in the world of electronic media it’s a relatively modern technique, the concept itself dates back to 1922.

Laurens Hammond, best known for inventing the Hammond organ, devised a then-novel lens for viewing cinema films in 3D: the Teleview. As two projectors showed each eye’s films with the shutters out of sync, the audience would look through a viewing device attached to their seat. This device would block out the appropriate eye, thanks to a rotary shutter operating in sync with the projectors. The experimental system was only used in one cinema viewing, because it was expensive to implement.

Only one eye’s image is presented on the screen at a time, and the glasses are constantly ‘winking’: the left eye’s view is blocked while the screen displays the picture for the right eye, and vice versa. The winking is controlled via a timing signal synchronised with the screen’s refresh rate.

The frame rate of a film projected through an active 3D display has to be double that of a standard one. Another disadvantage of the active 3D system is that it doesn’t work with most modern monitors due to the extremely high refresh rate required to view the picture without any flickering.
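The alternation, and the reason the frame rate has to double, can be sketched in Python. The function names and the 120 Hz figure are assumptions for illustration only; the key point is that the shutter that closes is always the one in front of the eye whose image is not on screen.

```python
def shutter_schedule(frame_count, first_eye="left"):
    """For each displayed frame, report which eye's image is shown and which shutter is closed."""
    eyes = ("left", "right") if first_eye == "left" else ("right", "left")
    schedule = []
    for frame in range(frame_count):
        shown = eyes[frame % 2]          # eye whose image is on screen for this frame
        blocked = eyes[(frame + 1) % 2]  # the other eye's shutter goes dark
        schedule.append({"frame": frame, "image_for": shown, "shutter_closed": blocked})
    return schedule


def per_eye_rate(display_refresh_hz):
    """Each eye only sees every other frame, so it gets half the display's refresh rate."""
    return display_refresh_hz / 2


if __name__ == "__main__":
    for entry in shutter_schedule(4):
        print(entry)
    print("per-eye rate on a 120 Hz display:", per_eye_rate(120), "Hz")
```

Since each eye receives only half of the frames, the display has to refresh twice as fast as a standard one to give each eye the same smoothness.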

1985 - Autostereoscopy: glasses not required

The term ‘autostereoscopy’ doesn’t just refer to a single method: it’s an umbrella term covering all stereoscopic display types that don’t require glasses or lenses to view.

Perhaps one of the most well-known examples of this form of stereoscopy is the Nintendo 3DS. The 3D effect in the top screen works via a parallax barrier, an array of fine slits placed over the LCD screen that lets each eye see a different set of pixel columns, giving the player a sense of depth. The parallax barrier can be disabled if the player wants to game without the 3D effect.