Moving Innovative Game Technology from the Lab to the Living Room
RICHARD MARKS
Sony Computer Entertainment R&D
Video games have become a giant industry, with global revenues in 2011 estimated at well over $60 billion. They enjoy a mass-market audience while at the same time riding the bleeding edge of technology. Because video games are entertainment, they offer a unique launch pad for new technologies: players are supportive and hopeful, and the focus is enjoyment rather than productivity or a high level of reliability. Game developers can thus make the most of new technologies to explore new experiences.
INNOVATIVE HARDWARE TECHNOLOGIES
Game hardware manufacturers have produced significant innovations in the areas of graphics, computing, display, and input technologies. The interactive nature of video games drives technologies with requirements of both high performance and low latency.
Graphics
An example of game graphics innovation is the Voodoo 3D technology (introduced by 3dfx in 1997), a 3D-only add-on card for PCs that enabled arcade-level visuals. In the same time frame, Nintendo released the Nintendo 64 “Reality” 3D coprocessor, touted as the equivalent of “a high-end Silicon Graphics workstation in your home.” Several years later, Sony created the PlayStation 2 Graphics Synthesizer, which achieved an enormous polygon fill rate thanks to its 2,560-bit-wide bus to embedded RAM. Games continue to be the driving force in real-time graphics hardware, and are often used as benchmarks for measuring personal computer performance.
Computing
Game-related advances in computing, especially parallel computing, include the Cell processor in the PlayStation 3, a microprocessor that consists of one general-purpose CPU and eight coprocessors to handle streaming computation like that often found in interactive applications. Another example is “cloud gaming.” Companies such as OnLive and Gaikai have demonstrated that high-performance gaming is possible using only a thin client by streaming output (effectively a movie) from a powerful server in the cloud. These companies are raising the bar for the types of applications that can be moved to the cloud.
Display
One of the Nintendo 3DS screens uses parallax-barrier technology to present a different image to the left and right eye, effectively creating a 3D image with no need for glasses. The 3DS also includes a slider that lets the player control the strength of the 3D. The PlayStation 3D monitor uses fairly standard LCD shutter glasses to achieve 3D, but it also uses this technology for an innovative “dual view” mode in which two players see different 2D images. In the near future, low-cost head-mounted displays (HMDs) will provide an immersive 3D experience that updates the image seen based on the player’s head motion.
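The idea behind a parallax-barrier display can be illustrated with a small sketch. It assumes, purely for illustration, that the panel alternates pixel columns between the two eyes and that the barrier hides odd columns from the left eye and even columns from the right eye; the actual 3DS pixel layout is not described here, and the function name `interleave_columns` is hypothetical.

```python
def interleave_columns(left, right):
    """Build one frame from two views of the same size.

    Assumption for illustration: even columns are visible to the left eye
    and odd columns to the right eye. Images are lists of rows.
    """
    same_height = len(left) == len(right)
    same_widths = all(len(a) == len(b) for a, b in zip(left, right))
    if not (same_height and same_widths):
        raise ValueError("left and right images must have the same shape")
    return [
        [l_row[x] if x % 2 == 0 else r_row[x] for x in range(len(l_row))]
        for l_row, r_row in zip(left, right)
    ]


# Tiny example: label each pixel with the eye it is meant for.
left_view = [["L"] * 6 for _ in range(2)]
right_view = [["R"] * 6 for _ in range(2)]
frame = interleave_columns(left_view, right_view)
print("".join(frame[0]))  # LRLRLR
```

One consequence visible in the sketch is that each eye receives only half of the panel's horizontal resolution. The shutter-glasses approach makes the same kind of trade in time rather than in space, alternating whole frames between viewers.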
Input
Recently, video games have pushed the boundaries of input technology beyond the joystick, gamepad, keyboard, and mouse. A primary reason for this innovation is that earlier advances in graphics and display technology (output) greatly outpaced those in interface technology (input), creating an unbalanced user experience.
Several key technologies for input have enabled the revolution in interfaces. The commercialization of microelectromechanical systems (MEMS) has made possible small, low-cost sensors such as accelerometers, gyros, magnetometers, and pressure and temperature sensors. In conjunction with wireless, high-speed, low-cost, low-power, low-encumbrance digital communication, these microsensors can be used to collect a wide assortment of data for processing. And finally, digital video cameras have become viable as low-cost input devices that (when combined with ever-growing processing capabilities) can provide real-time information about how players are moving their bodies.
Peripherals that use these technologies in various combinations have changed the way video games can be played. The EyeToy digital video camera for PlayStation 2 was the primary interface for games that explored both enhanced reality (i.e., augmented reality) and video-as-input paradigms (Figure 1). The Nintendo Wii featured a wireless one-handed MEMS-based motion-sensing remote as its primary controller, redefining the way games are played. The PlayStation Eye camera improved on EyeToy, enabling new marker-based augmented reality experiences such as Eye of Judgment and EyePet. The Microsoft Kinect camera extended video sensing to capture depth information at every pixel, enabling the Xbox to compute the dynamic pose of the player (i.e., skeleton tracking). Kinect also included a microphone array to enable voice control without needing to hold or wear a microphone. PlayStation Move combined MEMS inertial sensing and digital camera sensing into a single system that provides high-precision, high-speed, six-degree-of-freedom tracking of the one-handed controller. Wonderbook uses marker-based technology and PlayStation Move to create an interactive book in which each “page” is printed with a different marker, so a unique interactive augmented reality experience is shown on the television as the player flips through the pages (Figure 2).
FROM LAB TO LIVING ROOM
Transitioning new technologies from research to product poses challenges in every industry, and it is no different for video game manufacturers. The following sections describe the trajectories of three consumer products that began as research projects in Sony Computer Entertainment R&D.
EyeToy
EyeToy, a mass-market product that sold more than 10 million units globally, began as a research project to investigate the types of experiences that would be possible by plugging a video camera (webcam) into a video game console (Figure 3). The powerful computation capabilities of consoles align well with what is necessary for real-time video processing and computer vision. EyeToy was essentially a standard webcam, but several design choices made it well suited for interactive experiences. For example, low latency was prioritized, so EyeToy compressed each frame individually in order to transfer over USB 1.1, rather than using an interframe video compression method such as MPEG. In addition, 60 frames/sec was the default output video rate to provide smooth, high-speed tracking.
The goal of EyeToy was to introduce video games to a wider audience via an intuitive interface and improve the interactive experience by directly involving the player visually. The biggest concern for EyeToy was the highly variable lighting in people’s homes; unlike the office or laboratory environment, which is almost always well lit, the lighting in many family or living rooms is much less consistent. But because players wanted to enjoy their experience, most were happy to add lighting as necessary to brighten the scene while they played.
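The motion detection used for video-as-input play can be illustrated with simple frame differencing. This is a minimal sketch, not the method used in EyeToy titles, which is not specified here; the function names, the grayscale list-of-rows image format, and the threshold value are all assumptions. The threshold also hints at why lighting matters: a dim scene gives a noisier image, so more pixel changes are noise rather than player motion.

```python
def motion_mask(prev_frame, curr_frame, threshold=25):
    """Flag pixels whose brightness changed by more than `threshold`.

    Frames are lists of rows of grayscale values (0-255). The threshold
    is a hypothetical value chosen to ignore small sensor noise.
    """
    return [
        [abs(c - p) > threshold for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


def motion_in_region(mask, top, left, height, width):
    """Fraction of pixels flagged as moving inside a rectangular region."""
    rows = mask[top:top + height]
    flagged = sum(sum(row[left:left + width]) for row in rows)
    return flagged / (height * width)


# A 4x4 scene in which only the top-left 2x2 block changes brightness.
previous = [[100] * 4 for _ in range(4)]
current = [row[:] for row in previous]
for y in range(2):
    for x in range(2):
        current[y][x] = 200

mask = motion_mask(previous, current)
touched = motion_in_region(mask, 0, 0, 2, 2)    # region where motion occurred
untouched = motion_in_region(mask, 2, 2, 2, 2)  # region with no change
```

A game could place a virtual object over a screen region and treat a high motion fraction there as the player touching it.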
FIGURE 1 The EyeToy camera turns the television into an augmented reality mirror. Computer graphics are overlaid onto the live video, and real-time motion detection allows players to interact with the graphics as if they were real—for example, to wipe virtual dirt off the screen or bat away virtual ninjas. Source: Courtesy of Sony Computer Entertainment.
FIGURE 2 Wonderbook uses PlayStation Move and a “book” with different markers on each page to create a rich augmented reality experience. The left image shows reality, and the right image shows an example of what the player might see on the television; in this case, reality is “augmented” to include a magic wand, a flying dragon, and an old book that is burning. Source: Courtesy of Sony Computer Entertainment.
FIGURE 3 The EyeToy camera. Source: Courtesy of Sony Computer Entertainment.
PlayStation Eye
Based on feedback from both players and developers, the PlayStation Eye (Figure 4) improved on EyeToy with the addition of a fixed-focus, low-distortion lens with two choices for field of view, standard or wide. To improve video quality, USB 2.0 high speed was used so the resolution could be quadrupled (to 640 × 480) and video could be transferred uncompressed to avoid artifacts. Low-light sensitivity was also greatly improved. These technical attributes and the product’s low cost have made PlayStation Eye the most widely used camera among computer vision researchers and hobbyists.
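A rough bandwidth calculation shows why EyeToy needed compression over USB 1.1 while PlayStation Eye could send uncompressed video over USB 2.0. The sketch below takes the 640 × 480 figure from the text and infers 320 × 240 for EyeToy from the statement that resolution was quadrupled. The 16 bits per pixel and the 60 frames/sec rate for PlayStation Eye are assumptions for illustration. The bus figures are the nominal signaling rates (12 Mbit/s for USB 1.1 full speed, 480 Mbit/s for USB 2.0 high speed); usable throughput is lower in practice.

```python
USB_1_1_FULL_SPEED_BPS = 12_000_000    # nominal signaling rate
USB_2_0_HIGH_SPEED_BPS = 480_000_000   # nominal signaling rate


def raw_video_bps(width, height, fps, bits_per_pixel=16):
    """Bits per second needed to send video with no compression.

    bits_per_pixel=16 is an assumed pixel format, not a documented one.
    """
    return width * height * bits_per_pixel * fps


eyetoy_bps = raw_video_bps(320, 240, 60)   # 73,728,000 bit/s
ps_eye_bps = raw_video_bps(640, 480, 60)   # 294,912,000 bit/s

# Minimum compression ratio EyeToy would need to fit the USB 1.1 rate.
eyetoy_min_ratio = eyetoy_bps / USB_1_1_FULL_SPEED_BPS

print(eyetoy_bps > USB_1_1_FULL_SPEED_BPS)   # True: compression required
print(ps_eye_bps < USB_2_0_HIGH_SPEED_BPS)   # True: uncompressed can fit
```

Under these assumptions, EyeToy video would need to be compressed by at least a factor of about six, while the larger PlayStation Eye stream remains under the nominal USB 2.0 rate.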
PlayStation Move
The high specifications of the PlayStation Eye enabled the creation of PlayStation Move (Figure 5), a one-handed motion controller for PlayStation 3 that incorporates a combination of optical and inertial sensing to provide complete six-degree-of-freedom tracking. The design addresses issues discovered during the development of EyeToy, combining the advantages of a camera-based interface with those of motion sensing and buttons. Game developers thus have absolute position and orientation, linear and angular velocities/accelerations, and button state.
Tracking the Move involves two major steps: image analysis and sensor fusion. Images from the PlayStation Eye are analyzed to locate the illuminated sphere that sits atop the controller (because the sphere is lit, it can be tracked even in complete darkness). Color segmentation is used to find the sphere in the image, and then a projected sphere model is fit to the image to extract the 3D position of the sphere. The results of the image analysis are fused with inertial sensor data from a 3-axis accelerometer and 3-axis gyroscope to provide the full state using a modified unscented Kalman filter (LaViola and Marks 2010).
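The two steps can be sketched in simplified form. The first function estimates distance from the apparent size of the sphere using a pinhole camera approximation, which is a simplification of the projected sphere model fit described above. The class then fuses camera position with accelerometer data for a single axis using an ordinary linear Kalman filter, swapped in here for the modified unscented Kalman filter of the actual system. All names, noise values, and camera parameters are hypothetical.

```python
def sphere_depth(focal_length_px, sphere_radius_m, image_radius_px):
    """Pinhole approximation: distance = focal length * true radius / image radius."""
    if image_radius_px <= 0:
        raise ValueError("image radius must be positive")
    return focal_length_px * sphere_radius_m / image_radius_px


class AxisFilter:
    """Linear Kalman filter for one axis with state (position, velocity).

    The accelerometer drives the prediction; the camera corrects position.
    """

    def __init__(self, accel_var=1.0, camera_var=0.01):
        self.x, self.v = 0.0, 0.0                     # position (m), velocity (m/s)
        self.p00, self.p01, self.p11 = 1.0, 0.0, 1.0  # symmetric 2x2 covariance
        self.q, self.r = accel_var, camera_var        # assumed noise variances

    def predict(self, accel, dt):
        """Integrate acceleration over dt and grow the uncertainty."""
        self.x += self.v * dt + 0.5 * accel * dt * dt
        self.v += accel * dt
        g0, g1 = 0.5 * dt * dt, dt  # how accelerometer noise enters the state
        p00 = self.p00 + 2 * dt * self.p01 + dt * dt * self.p11 + g0 * g0 * self.q
        p01 = self.p01 + dt * self.p11 + g0 * g1 * self.q
        p11 = self.p11 + g1 * g1 * self.q
        self.p00, self.p01, self.p11 = p00, p01, p11

    def update(self, camera_pos):
        """Correct the state with a camera position measurement."""
        s = self.p00 + self.r
        k0, k1 = self.p00 / s, self.p01 / s  # Kalman gains
        innovation = camera_pos - self.x
        self.x += k0 * innovation
        self.v += k1 * innovation
        p00 = (1 - k0) * self.p00
        p01 = (1 - k0) * self.p01
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p11 = p00, p01, p11


# Hypothetical numbers: a 2 cm sphere seen with a 10-pixel image radius.
depth = sphere_depth(focal_length_px=500.0, sphere_radius_m=0.02, image_radius_px=10.0)

# A stationary controller: zero acceleration, camera keeps reporting `depth`.
axis = AxisFilter()
for _ in range(60):
    axis.predict(accel=0.0, dt=1 / 60)
    axis.update(camera_pos=depth)
```

The design point the sketch illustrates is the division of labor: inertial data arrives quickly but drifts when integrated, while the camera gives an absolute position that keeps the estimate anchored.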
CONCLUSION
The continual adoption of new technology has been a key factor in the growth of the video game industry. Advances in graphics, processing, display, and input technologies have both improved existing experiences and enabled new ones that appeal to a wider audience. Looking forward, there is every indication that the video game industry will continue to leverage new technology to help push the boundaries of play.
REFERENCE
LaViola JJ Jr, Marks R. 2010. An introduction to 3D spatial interaction with video game motion controllers. Course presented at ACM SIGGRAPH 2010, Los Angeles.