Wearable eye tracking devices such as Tobii Pro Glasses 2 produce gaze data in a coordinate system relative to the wearable eye tracker and the recorded video, not to the static objects of interest in the environment around the participant wearing the eye tracker. For most statistical or numerical analysis to be meaningful, the collected eye tracking data needs to be mapped onto objects of interest and into a new coordinate system whose origin is fixed in the environment around the participant.
The same scenario arises if you are performing a study with a remote eye tracker and a scene camera. Again, the data collected by the eye tracker is expressed in a coordinate system relative to the scene camera video, not to the static objects or the person of interest in front of the participant.
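Conceptually, the re-mapping amounts to re-expressing each gaze sample in the new, environment-fixed coordinate system. For a planar target this transformation can be described by a 3x3 homography. The sketch below is only an illustration of that idea, not Pro Lab's implementation; it assumes the homography from the video frame to the reference image is already known (for example, estimated as described later in this section), and the function name is made up for the example.

```python
import numpy as np

def map_gaze_to_snapshot(gaze_xy, H):
    """Map gaze points from scene-video pixel coordinates into
    snapshot pixel coordinates using a 3x3 homography H.

    gaze_xy: (N, 2) array of gaze points in video coordinates.
    H: 3x3 homography from the video frame to the snapshot
       (assumes a roughly planar scene).
    """
    gaze_xy = np.asarray(gaze_xy, dtype=float)
    # Convert to homogeneous coordinates, apply H, then normalize.
    ones = np.ones((gaze_xy.shape[0], 1))
    homogeneous = np.hstack([gaze_xy, ones])      # (N, 3)
    projected = homogeneous @ H.T                 # (N, 3)
    return projected[:, :2] / projected[:, 2:3]   # (N, 2)

# Example: the identity homography leaves a gaze point unchanged.
H = np.eye(3)
print(map_gaze_to_snapshot([[960.0, 540.0]], H))  # -> [[960. 540.]]
```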
Tobii Pro Lab addresses this challenge by allowing the user to map gaze data onto still images (snapshots and screenshots) of the environment and target objects. Data from a recording can be mapped onto one or several images. These images are then used for generating visualizations, such as heatmaps and gaze plots, and for defining Areas of Interest (AOIs). The mapping can be done either entirely manually or with the help of the assisted mapping function.
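As a rough illustration of what a heatmap built from mapped data involves (not of how Pro Lab computes its visualizations), the sketch below accumulates snapshot-mapped gaze points into a smoothed density image; the function name and the smoothing radius are assumptions made for the example.

```python
import cv2
import numpy as np

def gaze_heatmap(mapped_xy, snapshot_size, sigma_px=30):
    """Accumulate snapshot-mapped gaze points into a smoothed heatmap.

    mapped_xy: (N, 2) gaze points in snapshot pixel coordinates.
    snapshot_size: (width, height) of the snapshot in pixels.
    sigma_px: Gaussian smoothing radius, roughly a fixation's spatial spread.
    """
    width, height = snapshot_size
    heat = np.zeros((height, width), dtype=np.float32)
    for x, y in np.asarray(mapped_xy, dtype=float):
        ix, iy = int(round(x)), int(round(y))
        if 0 <= ix < width and 0 <= iy < height:
            heat[iy, ix] += 1.0
    # ksize=(0, 0) lets OpenCV derive the kernel size from sigma.
    return cv2.GaussianBlur(heat, (0, 0), sigma_px)
```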
How to map data onto a Snapshot or Screenshot manually:
How to map data onto a Snapshot or Screenshot using the assisted mapping algorithm:
How to re-map gaze points manually, for each point or fixation:
Mapped gaze/fixation point color-coding:
For the assisted mapping algorithm to interpret the snapshot images correctly, there are a few things to consider when selecting the picture you want to use as a reference (the snapshot).
The algorithm compares the snapshot with the video frames in the recording from Pro Glasses 2. For this to work as well as possible, the scene in the snapshot should be as 'flat' as possible. By 'flat' we mean that the scene should be close to two-dimensional: all objects in the image should be at roughly the same distance from your viewpoint and remain visible regardless of your viewing angle.

Imagine a grocery store shelf with rows of cans and cereal boxes. All items on the shelf remain visible even if you move a few meters to the left or right of your original position; they only appear a bit skewed, which is no problem because the algorithm can still interpret the image. There is no risk of one item 'shadowing' another, which makes this a good reference snapshot, since we never know where the participant will stand in front of the shelf.

In contrast, consider a scene that is much more three-dimensional: a store counter with a cash register on it, and, to the left of the register and a little further back on a shelf, a can of pens. Standing right in front of the counter, we see both the cash register and the can; from this point of view the scene looks two-dimensional, and a photograph (snapshot) taken from here would show both items fully. As long as the frames from the participant's recording are captured from more or less the same position, there is no problem. But if the participant stands a few meters to the right, the can of pens is shadowed by the cash register: it is no longer visible in that part of the recording, and the algorithm will not be able to map the data correctly.
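The planarity requirement is easier to see if you think of the comparison between the snapshot and a video frame as feature matching followed by a homography fit, which is only a valid model when the matched points lie roughly on one plane. The sketch below, using ORB features and RANSAC from OpenCV, is an assumed stand-in for such a matching step, not Tobii's actual assisted mapping algorithm; the function name and thresholds are illustrative.

```python
import cv2
import numpy as np

def estimate_frame_to_snapshot_homography(frame_gray, snapshot_gray, min_matches=15):
    """Estimate a homography from a scene-video frame to a reference snapshot
    by matching local image features.

    Works best when the scene is near-planar: all matched keypoints should lie
    roughly on one surface, otherwise parallax and occlusion break the model.
    """
    orb = cv2.ORB_create(nfeatures=2000)
    kp_f, des_f = orb.detectAndCompute(frame_gray, None)
    kp_s, des_s = orb.detectAndCompute(snapshot_gray, None)
    if des_f is None or des_s is None:
        return None

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_f, des_s), key=lambda m: m.distance)
    if len(matches) < min_matches:
        return None  # too few correspondences, e.g. the target is occluded

    src = np.float32([kp_f[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_s[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H  # can then be applied to the gaze samples of this frame
```

If too few consistent matches are found for a frame, for example because an object is occluded as in the cash register example above, no reliable transformation can be estimated and the gaze samples of that frame cannot be mapped automatically.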