The specific multifocal designs with their theoretical principle and enabling technology are introduced and discussed in this section, which is further divided into four sub-sections based on their information multiplexing channels. Since each method has its own pros and cons, there are also hybrid approaches taking advantage of more than one principle or technology.
Space-multiplexing
As the most conventional approach, space-multiplexing allows a direct way to build multifocal displays. This section summarizes the representative designs based on the spatially multiplexed channel, including transparent display/screen stack, optical combiner stack, and optical space-to-depth mapping.
Transparent display/screen stack
Rolland et al. [30] proposed a multifocal HMD with a thick transparent display stack [Fig. 3(a)] at the end of last century. Based on the visual acuity, stereo acuity and pupil size of HVS, an optimal focal plane arrangement was proposed, where virtual planes are linearly spaced within the DOF from 0 to 2 diopters with a 1/7 diopter spacing. Although transparent display panels can enable the simplest implementation of distance-based multifocal displays, it is still challenging at this time for transparent displays to manifest high transmittance with decent image quality as or close to conventional flat panel displays [38]. Most of the transparent displays are realized by reducing the aperture ratio of the light-emitting region [39], saving more non-emitting areas to achieve the semi-transparency. Therefore, there is an intrinsic trade-off between transparency and brightness in this type of transparent displays. A direct consequence of low transmittance is low light efficiency and therefore high power consumption. Additionally, since there is a significant permittivity contrast between the light-emitting area and transparent area, the diffraction of light passing through the display panel may degrade the image quality of rear panels [40]. This common issue for transparent displays can become more acute and intolerable with such a thick panel stack in multifocal displays. Moreover, due to the similar spatial periodicity of the pixels in cascaded panels, the Moiré effect [41] may happen through the multiple panels. The resulting Moiré fringes would degrade the quality and resolution of displayed images.
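The focal plane arrangement quoted above is simple dioptric arithmetic, which can be sketched as follows. This is a minimal illustration rather than the procedure of [30]: the function names are hypothetical, and counting a plane at both ends of the 0 to 2 diopter range is an assumption about how the range is sampled.

```python
from fractions import Fraction


def focal_planes(far_diopters, near_diopters, spacing):
    """Plane positions in diopters, equally spaced, from far to near.

    Assumption: a plane sits at both end points of the range.
    """
    far = Fraction(far_diopters)
    near = Fraction(near_diopters)
    step = Fraction(spacing)
    count = int((near - far) / step) + 1
    return [far + i * step for i in range(count)]


def diopters_to_metres(diopters):
    """Viewing distance in metres; 0 diopters corresponds to optical infinity."""
    if diopters == 0:
        return float("inf")
    return float(1 / Fraction(diopters))


if __name__ == "__main__":
    # 0 to 2 diopters with 1/7 diopter spacing, as quoted in the text.
    for plane in focal_planes(0, 2, Fraction(1, 7)):
        print(f"{float(plane):.3f} D -> {diopters_to_metres(plane):.3f} m")
```

Because the planes are uniform in diopters, they crowd together near the viewer in metric distance: the nearest plane sits at 0.5 m, while the first plane beyond infinity focus is already at 7 m.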
Lee et al. [42] demonstrated a multi-focal display prototype by projecting 2D images onto multiple immaterial scattering FogScreens [Fig. 3(b)], which consist of a thin sheet of fog protected by surrounding non-turbulent airflow. The FogScreens [43] can be arranged with either a stack or an L-shaped configuration to extend 2D screens to a 3D display. Since the projection screens utilized here are made of fog, the audience could directly walk through the 3D scene and manipulate the 3D objects. Although their experiments verified the 3D effect of the FogScreens projection display with approximate accommodation cue, the vergence cue was not enabled in this design. The images projected on the FogScreens were rendered based on the depth-fused algorithm for the midpoint of the viewer’s two eyes, so there is always an error in the images with both eyes open. Also, the fog flow is not an ideal projection screen since complicated turbulences exist within the fog flow and it tends to break up at the margin of the screen. Later, Barnum et al. [44] also presented a similar projection-type multi-plane design using water droplets instead of fog flow as the nonsolid screen. Rakkolainen and Palovuori [45] further extended this concept to fluorescent dye screens, which work with ultraviolet projectors by photoluminescence rather than scattering and manifest higher transparency than the standard FogScreen. The fluorescence of dye screens offers omnidirectional emission, unlike the Mie scattering from FogScreens, where bright images can be observed only from a small range of viewing angles.
Recently, Lee et al. [46] demonstrated a dual-focal projection-type see-through display based on holographic optical elements (HOEs) [Fig. 3(c)]. Each holographic screen only diffuses light from a distinct direction that satisfies the Bragg condition. Thus, with a proper spatial configuration, each HOE screen with ~ 90% transmittance can work as a see-through additive 2D focal plane. Compared with dynamic fog flow screens, the static holographic screens can offer a stable and sharper image with simpler hardware. In general, at this stage, multifocal displays using a projection screen stack can avoid the limitations and artifacts from those with transparent displays at the cost of an enlarged footprint.
Optical combiner stack
Instead of directly stacking displays or projection screens, the distance-based multifocal displays can be built by stacking multiple optical combiners, including beam splitters, freeform prisms, and lightguides. Akeley et al. [47] designed and presented a multifocal display prototype with three focal planes [Fig. 4(a)], utilizing stacked beam splitters to divide an LCD panel into three sub-panels. The flat panel had a width of 47.8 cm and a height of 29.9 cm. The resulting virtual planes were equally separated by 0.67 diopters and aligned on-axis in front of the viewer. This prototype is more of a proof of concept than a practical implementation due to its large volume, which may further increase if a wider field of view (FOV) is desired. The employment of beam splitters can be found in numerous designs not only for cascading focal planes but also for enabling the see-through functionality. Suyama et al. [48] also built a dual-focal depth-fused display using a beam splitter for combining two LCD panels. Afterwards, as the most available optical combiner, beam splitters were utilized in many later designs with space-multiplexing [49,50,51].
As a distinct type of optical combiners for augmented reality displays, freeform optics enables off-axis operation and offers more degrees of freedom in the design of HMDs. Cheng et al. [32] proposed a space-multiplexed dual-focal near-eye display by stacking two freeform prisms, both of which are equipped with a micro-display [Fig. 4(b)]. According to our taxonomy, this custom-designed freeform layout is both distance- and power-based. Despite the demanding optical design process to maintain a decent contrast and resolution for both see-through and virtual images, another apparent problem of this freeform prism stack is the considerably large footprint, and it is quite challenging for miniaturization. The thickness of just two freeform prism stacks is already around 20 cm, which would be much heavier than lightguide combiners in HMD applications. In the proposed design, the two focal planes are located at 0.2 and 0.8 diopters; namely, the viewing distance ranges from 1.25 m to 5 m. To support 3D objects closer than 1.25 m, more prisms need to be stacked on the optical axis, which would render the freeform combiner even thicker. In this regard, among the various optical combiners, geometric lightguides [52] can make a better candidate for space-multiplexed multifocal displays due to their compact form factor, although no prototype has been reported yet.
Optical space-to-depth mapping
Cui and Gao [53] designed and demonstrated a power-based space-multiplexed multifocal display by dividing a display panel into four subpanels and optically mapping them to different depths with a liquid-crystal-on-silicon (LCoS) SLM located at the Fourier plane of the 4f system [Fig. 5(a)]. The SLM presents a static phase profile, including the quadratic phases that image the subpanels to different depths and linear phases that shift the center of each subpanel to the optical axis, which functions as a multifocal off-axis diffractive lens. By changing the configuration of subpanels and the phase pattern on the LCoS SLM, this design can dynamically change the number of its focal planes under the trade-off between lateral resolution and depth density. The LCoS SLM can be replaced by a diffractive optical element (DOE) [54] if the arrangement of the focal planes is fixed. This prototype can only display monochromatic content, since the display panel was covered with a narrowband color filter at 550 nm with 10 nm bandwidth, in order to reduce the chromatic aberrations originating from the wavelength-dependent effective focal lengths of the LCoS SLM. Field sequential color helps achieve full-color operation but at the expense of frame rate loss. Another problem is that this kind of design is prone to stray light and resolution loss due to the phase quantization and phase resets of SLMs.
In the meantime, Matsuda et al. [55] proposed and demonstrated a full-color focal surface display with an 18° FOV. In this design, the pixels at different spatial locations on the 2D display are optically mapped to different depths using an SLM-based programmable lens with spatially varying focal length [Fig. 5(b)]. Although the time-multiplexing method was implemented to present three focal surfaces in the prototype, the novel optical feature of this work is the generation of a 3D focal surface from a 2D panel by space-multiplexing. The focal surface display has a smaller footprint than the design reported by Cui and Gao [53] due to the absence of the 4f system, and it is able to support arbitrary depth maps. Moreover, the key advantage of focal surface display over conventional multifocal displays is more accurate depth blur with fewer multiplexed images. In the prototype, the primary concerns of using an SLM, such as chromatic aberrations and stray light, were identified and mitigated. The transverse chromatic aberrations were digitally corrected by pre-warping the displayed images, while the average axial chromatic aberrations were measured as 0.25 diopter within the supported DOF (0.75–4 diopters), which is less than that of typical human eyes. Also, a circular polarizer is placed upon the display panel to suppress the stray light reflections. As a result, high-resolution imagery was achieved within the supported focal range, according to the measured modulation transfer function (MTF). Besides the FOV limited by the beam splitter and the size of the SLM, a limitation of these optical mapping designs is the increased stray light when the SLM manifests a shorter focal length, as the SLM phase accuracy is relatively lower when synthesizing high spatial frequencies. Another practical concern is the computation speed, similar to holographic displays. Here, the whole task of generating correct depth is placed on the computation, rendering it very challenging to achieve real-time global optimization of phase patterns on the SLM.
Time-multiplexing
The time-multiplexed multifocal display designs temporally divide each frame of the 3D content into multiple sub-frames with distinct depths and present 2D images sequentially through the DOF. In this dynamic type of solution, active components are utilized to avoid the difficulty of stacking multiple physical displays in a compact way, as in space-multiplexed designs. However, the information added for expanding 2D to 3D always comes with a cost. The temporally multiplexed systems necessitate not only fast-response tunable devices but also high-refresh-rate displays to attain a flicker-free performance. The following part also covers some optical layouts of varifocal displays that can be adapted to multifocal designs with updated hardware.
Mechanical sweeping
Mechanically sweeping a screen or optics along the optical axis is a typical time-multiplexed distance-based method, working by actively changing the optical path lengths in the display system. Shiwa et al. [56] demonstrated the first mechanical sweeping 3D display [Fig. 6(a)] with 48° FOV in 1996. In their proof-of-concept prototype, a 20-in. cathode-ray-tube (CRT) display was split horizontally into left and right sub-screens, each displaying distinct image content for one eye. The light emitted from each sub-screen passes through a relay lens and an eyepiece before reaching the observer’s eye. The relay lenses, which can be mechanically displaced along the optical axis, produce a real intermediate image of the corresponding sub-screen in the vicinity of the eyepiece’s focal point. The stepper motor is able to displace the relay lens by 4 mm within 0.3 s, sweeping the virtual image from 20 cm to 10 m. The proposed design detects the observer’s gaze point and moves the virtual image to the matching depth, illustrating the concept and requirement of a varifocal display, which also includes a potential hardware layout for multifocal displays based on mechanical sweeping. This layout was further adapted by Sugihara and Miyasato [57] for a lightweight HMD.
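The reason a millimetre-scale displacement can sweep the virtual image over metres follows from the thin-lens equation. The sketch below is a generic paraxial illustration, not a model of the optics in [56]: the 50 mm eyepiece focal length and the function name are hypothetical, and the intermediate image is assumed to sit slightly inside the front focal point of an ideal thin eyepiece.

```python
def virtual_image_vergence(focal_length_m, offset_m):
    """Vergence (in diopters, measured at the lens) of the virtual image
    formed by an ideal thin eyepiece.

    offset_m is how far the intermediate image sits inside the front focal
    point of the eyepiece; 0 places the virtual image at infinity.
    """
    if not 0 <= offset_m < focal_length_m:
        raise ValueError("offset must lie between 0 and the focal length")
    object_distance = focal_length_m - offset_m
    # Thin-lens equation 1/s' = 1/f - 1/s; the image is virtual, so its
    # distance from the lens is 1 / (1/s - 1/f).
    return 1.0 / object_distance - 1.0 / focal_length_m


if __name__ == "__main__":
    f = 0.050  # hypothetical 50 mm eyepiece, not a value from the source
    for offset_mm in (0.0, 1.0, 2.0, 4.0):
        d = virtual_image_vergence(f, offset_mm / 1000.0)
        print(f"offset {offset_mm:.1f} mm -> {d:.2f} D")
```

For small offsets the vergence grows roughly as the offset divided by the square of the focal length, so a shorter eyepiece covers the same dioptric range with less travel.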
Akşit et al. [58] built a varifocal HMD with a holographic see-through projection screen and a movable curved half-mirror combiner [Fig. 6(b)]. Since their reflective combiner is placed in front of the screen but not between the viewer’s eye and the screen, the eye relief distance can stay still when varying the display depth. In their demonstration, a holographic rear projection screen is placed in front of the eye as an intermediate image plane to display the information offered by an off-axis projector. The forwardly diffused light is reflected and collimated by a curved beam combiner, which is essentially a custom spherical concave mirror with 80% reflectance and an f-number of f/0.6. They demonstrated the varifocal ability by translating their curved beam combiner back and forth up to 5 mm, covering a depth range of 1 to 4 diopters.
Shibata et al. [59] designed a varifocal display based on the mechanical translation of the display panel, instead of the viewing optics [Fig. 6(c)]. This implementation includes a 6-in. LCD panel and a custom-designed telecentric optical system, which can keep the size of the virtual image unchanged when the LCD panel is translated mechanically to offer a depth range spanning from 30 cm to 200 cm. These varifocal layouts can provide dynamic accommodation cue with eye-tracking but not the authentic optical depth blur, which necessitates displaying virtual images at multiple depths at a flicker-free rate.
Voxon Photonics [60] demonstrated a time-multiplexed multiplane or volumetric display, VX1, which consists of a high-speed projection system with a fast-moving reciprocating screen. The rear projection diffuser screen is driven back and forth at 15 cycles per second, occupying an 18 cm × 18 cm × 8 cm volume. Since the projector can offer 2D images with 1000 × 1000 resolution at 4000 frames per second, this product can display 3D scenes with ~ 200 depths at 30 frames per second. Such a dense focal plane stack offers near-correct accommodation support and vivid 3D experience. Despite its wide applications for multiuser interactive displays, this design is not adaptable so far for a mobile system such as HMDs due to the challenges in miniaturization.
Switchable screen stack
As an alternative to swept screens, active projection screens that can switch between transparent and diffusive states are also developed, utilizing liquid crystal (LC) technologies, for multifocal displays with time-multiplexing. Stacking switchable screens for high-speed projection in sequential results in a multifocal display without moving parts. The switchable screens or shutters demand not only a high contrast ratio between two states to avoid cross-talk between distinct depths but also ultra-fast switching time for supporting a dense depth stack at a flicker-free rate.
In 2004, Sullivan [61] from LightSpace Technologies integrated a custom 3-chip DLP projector employing the digital micromirror devices (DMDs) and an air-spaced stack of LC scattering shutters together as a multiplanar display called DepthCube [Fig. 7(a)]. As a compromise for achieving a higher frame rate, the color depth of the projector was restricted to 5 bits per color. In this case, the maximum frame rate supported by the DLP projector can be more than 1500 frames per second. Since it takes time for the screen to switch between the transparent and scattering states, a blanking interval was inserted between consecutive 2D images in a 3D scene, which reduced the frame rate to 1000 frames per second. The 20 stacked switchable screens in the system were made of polymer-stabilized cholesteric texture (PSCT) [62], which can manifest 88% and 2% transmittance without anti-reflection coating in the transparent and scattering state, respectively. The custom PSCT screens can switch rapidly from scattering to transparent state in 0.08 ms and the other way around in 0.39 ms. This submillisecond switchable screen and fast projector enabled 5-bit full-color 2D images with a 1024 × 768 resolution displayed in 20 depths at 50 Hz. An issue of this system is that the 2D image intensity drops gradually further away from the projector. Even with a proper anti-reflection coating, the total transmittance of the stack of 20 PSCT screens decreases to 44%. Thus, it is still challenging to make full use of the light from the projection engine and minimize the crosstalk between screens.
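The transmittance figures above can be related with a rough multiplicative model. This is a simplified sketch, not the analysis of [61]: it assumes every screen is in its clear state, that the screens are identical, and that losses simply multiply; the function names are hypothetical.

```python
def stack_transmittance(per_screen, screens):
    """Total transmittance of identical clear screens in series."""
    return per_screen ** screens


def per_screen_from_total(total, screens):
    """Per-screen transmittance implied by a measured stack total."""
    return total ** (1.0 / screens)


if __name__ == "__main__":
    # 88% per uncoated screen and 44% for the coated stack are the values
    # quoted in the text; the multiplicative model is the assumption.
    print(f"20 uncoated screens: {stack_transmittance(0.88, 20):.3f}")
    print(f"implied coated screen: {per_screen_from_total(0.44, 20):.3f}")
```

Under this model, 20 uncoated screens would pass only about 8% of the light, and the reported 44% total implies roughly 96% per coated screen, which shows how strongly small per-layer losses compound in a deep stack.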
Recently, Zabels et al. [63] (also from LightSpace) demonstrated a multifocal HMD using a similar architecture. They miniaturized the design and added an eyepiece for generating virtual images. This prototype supports six depths, linearly spaced by 0.58 diopter, at 60 Hz but with a relatively low resolution, 480 × 800, with a 72° horizontal FOV. They improved the maximum transparency of the screens at the transparent state to 93.6% over the visible spectrum, from 420 nm to 700 nm. However, the response time is ~ 0.5 ms, which is slower than that reported by Sullivan 15 years ago, as mentioned earlier.
Liu et al. [64,65,66] developed a series of multifocal HMD benchtop systems [Fig. 7(b)] using polymer-stabilized liquid crystal (PSLC) instead of PSCT. Both PSLC and PSCT usually consist of low-molar-mass LCs and a high-molar-mass polymer. The main difference between them is that chiral dopants are added in PSCT but not PSLC. Compared with polymer-dispersed liquid crystals (PDLCs), where ~ 50 wt% LCs are dispersed as droplets in a polymer matrix, the concentration of polymer is much lower in both PSLC and PSCT, usually on the order of 3% or less [67]. Their first demonstration [64] is a single-color dual-focal display at 30 Hz, where the projector is an amplitude SLM with a 60 Hz refresh rate and two virtual image planes are located at 1.25 and 5 diopters. The PSLC shutter exhibits 6% and 86% transmittance at the scattering state and clear state, respectively. The rise and decay times of the PSLC screens are ~ 0.3 ms and ~ 0.35 ms. The second demonstration [65] is a monochromatic multifocal system utilizing reverse-mode PSLC screens that stay transparent without driving and become diffusive when a voltage beyond the threshold is applied, which is different from conventional PSLC and PSCT that can be switched from diffusive state to transparent state by applying a voltage. If N screens are employed in this type of multifocal system, the voltage needs to be applied to N-1 conventional PSLC or PSCT screens at any moment but to only one reverse-mode PSLC screen. Thus, the reverse-mode PSLC screens reduce the power consumption for screen driving to 1/(N-1) of that with conventional LC switchable screens. In this design, they increased the number of focal planes to four and replaced the amplitude SLM by a 120 Hz DMD projector, such that four focal planes (located at 0.2, 1.25, 2.5 and 5 diopters) can be displayed at 30 Hz refresh rate. Afterward, in [66], they built a binocular full-color dual-focal HMD benchtop demo using conventional PSLC and a 360 Hz DMD projector to reach a flicker-free refresh rate at 60 Hz.
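The refresh rates quoted for these switchable-screen systems follow from dividing the projector rate among the focal planes and, for color, among sequential color fields. The sketch below uses hypothetical function names, and treating full-color operation as three sequential color fields is an assumption that reproduces the 60 Hz figure of [66] rather than a detail stated in the text. The second function illustrates the 1/(N-1) driving argument for reverse-mode screens.

```python
def volume_rate_hz(projector_hz, focal_planes, fields_per_plane=1):
    """3D volume refresh rate when every plane (and color field) needs
    its own projector frame."""
    return projector_hz / (focal_planes * fields_per_plane)


def driven_screens(total_screens, reverse_mode):
    """Number of screens that must be driven at any instant.

    Conventional screens are driven to be clear, so all but the active
    one need a voltage; reverse-mode screens are clear when idle, so
    only the active (diffusive) one is driven.
    """
    return 1 if reverse_mode else total_screens - 1


if __name__ == "__main__":
    print(volume_rate_hz(1000, 20))    # DepthCube: 20 planes
    print(volume_rate_hz(120, 4))      # four planes, monochromatic
    print(volume_rate_hz(360, 2, 3))   # two planes, three color fields
    print(driven_screens(4, True) / driven_screens(4, False))
```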
Polarization-dependent optical distance
Another type of approach for distance-based multifocal display with time-multiplexing has been developed by creating polarization-dependent optical distance in the system. The optical path between displays and optics can be switched with a high-speed polarization rotator in these systems, such that two or more virtual image planes can be displayed in sequence.
In 2016, Lee et al. [68] from our group demonstrated a proof-of-concept dual-focal near-eye display system that is temporally multiplexed and based on polarization-dependent optical distance [Fig. 8(a)]. They used a broadband twisted-nematic (TN) LC cell as the polarization rotator, manifesting a response time of 4.3 ms and 1.0 ms for rising and decay, respectively. In this design, the optical path difference is produced by placing two mirrors at different distances from a polarizing beam splitter.
Later that year, Lee et al. [69] from Seoul National University reported a temporally multiplexed dual-focal HMD prototype [Fig. 8(b)] also by switching the polarization states of the display light. They took advantage of a polarization-dependent Savart plate made of two anisotropic crystal plates, which is placed in front of the display panel to distinguish the effective refractive indices for ordinary and extraordinary lights. Thus, for light with different polarization states, the Savart plate would manifest different optical path lengths. In their prototype, the 60 Hz 1666 pixel-per-inch micro-OLED provides high-resolution yet 30 Hz contents for each of the two depths, which are placed 230 mm and 640 mm in front of the eyebox. They also put efforts to reduce the aberrations of the imaging system. Sub-pixel shifting is included in the rendering process as a digital correction of the transverse chromatic aberrations. Meanwhile, a half-wave plate is inserted between the two plane-parallel calcite plates, forming a modified Savart plate to compensate the astigmatisms optically. In addition to the devoted anisotropic optical design, another merit of the prototype is the fast response time of the LC polarization rotator, which can work at a refresh rate of 540 Hz and support up to 9 focal planes without flickering given a fast enough display.
In 2017, Moon et al. [70] built a projection-type dual-focal prototype utilizing polarization-dependent scattering polarizers as the projection screen [Fig. 8(c)]. These screens from Teijin Dupont Films can diffuse light rays with a linear polarization state and transmit those with the orthogonal polarization state. Their system consists of two screens and a 60 Hz projector synchronized with a polarization rotator with ultra-fast 30 μs response time from LC-Tec Displays AB. Similarly, the system refresh rate is display-limited but not polarization-rotator-limited as that in [69]. An issue of this design is that the diffusing angle of the scattering polarizers is around 10°, resulting in apparent vignetting in the virtual images.
Recently, Chen et al. [71] demonstrated another HMD design employing reflective cholesteric liquid crystal (CLC) cells with opposite handedness to differentiate the optical distances between right- and left-handed circularly polarized (RCP and LCP) light [Fig. 8(d)]. The polarization rotator is also a TN LC cell, whose rise time and decay time are 3.52 ms and 0.52 ms, respectively. However, it is quite challenging for a CLC cell to support a full-color operation. In their system, even with a large birefringence (Δn~ 0.4) LC material, the reflection band of the two CLC cells can only provide a high extinction ratio over the green and red spectral range.
Most of the reported multifocal systems based on polarization-dependent optical distances only demonstrate two focal planes, since there are only two orthogonal polarization states, either s/p or RCP/LCP. Generally, it requires extra polarization-dependent components to add more focal planes, but the system footprint would also increase accordingly in this case. This form factor issue is more severe for the distance-based systems than the power-based ones.
Polarization-dependent lens
Optical elements with polarization-dependent focal lengths can offer the same features as the systems comprising polarization-dependent optical distances, yet usually with a compact size. Figure 9 illustrates the optical behavior of several special optical lenses with polarization-dependent focal length, including anisotropic crystal lenses [72], LC Fresnel lenses [73,74,75,76], Pancharatnam-Berry phase lenses (PBLs) [77,78,79,80] and CLC lenses [81]. Even though only part of them has been implemented to enable a multifocal display, the others also hold great potential for this application.
Love et al. [72] reported a temporally multiplexed multifocal display prototype with four focal planes generated by two anisotropic crystal lenses [Fig. 9(a)]. These lenses are made of calcite crystals and assembled in such a way that their extraordinary axis is vertical and ordinary axis is horizontal, both perpendicular to the system optical axis. In this manner, the s- and p-polarized light experience different refractive indices and hence different focal lengths. Each plano-convex calcite lens provides 0.6 diopter optical power difference between the extraordinary and ordinary polarizations, so the system working focal range is 1.8 diopters with two calcite lenses. Additionally, the polarization rotator they employed is made of ferroelectric LC that can switch the polarization state very quickly (< 1 ms). In their prototype, the four focal planes are presented at 45 Hz, which is determined by the 180-Hz CRT display. If equipped with a high-frame-rate display panel, this prototype can be improved to display flicker-free multifocal 3D scenes.
The PBLs [Fig. 9(b)], which work by spatially varying optical anisotropy, were proposed to enable multifocal displays in [82]. More recently, Yoo et al. [83] built a dual-focal see-through near-eye display with two PBLs. This kind of lens can be considered as a polarization-sensitive DOE and also a dielectric metasurface, manifesting opposite optical power for RCP and LCP light in a diffractive fashion. Unlike the calcite lens functioning by the dynamic phase, the PBLs work by the Pancharatnam-Berry phase, also known as geometric phase. Thus, the PBLs made of liquid crystal polymer are very thin, usually with a thickness < 5 μm, delivering an attractive form factor for cascading more focal planes. A practical concern of PBLs is that the spectral dispersion of focal length is much more severe than that of the calcite lens due to their diffractive nature. Hence, PBLs with large optical power may lead to significant chromatic aberrations, as discussed by Yousefzadeh et al. [84]. Although adding refractive optics can help correct the chromatic aberrations for one polarization as discussed in [79], it is still challenging to fix the other orthogonal polarization at the same time. As a compromise, the transverse chromatic aberrations can be corrected digitally by image warping, but the longitudinal ones remain. Birefringent crystal lenses and PBLs mentioned above can work with both orthogonal polarization states but with a different optical power.
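The strong dispersion of PBLs mentioned above can be estimated from the general behavior of an ideal diffractive lens, whose focal length is inversely proportional to wavelength, so that its optical power scales linearly with wavelength. The sketch below applies that scaling; the 2 diopter design power, the 550 nm design wavelength, and the 450 nm and 650 nm probe wavelengths are hypothetical values chosen for illustration, and material dispersion is ignored.

```python
def diffractive_power(design_power_d, design_wavelength_nm, wavelength_nm):
    """Optical power (diopters) of an ideal diffractive lens at a given
    wavelength, using power proportional to wavelength."""
    return design_power_d * wavelength_nm / design_wavelength_nm


def longitudinal_chromatic_aberration(design_power_d, design_wavelength_nm,
                                      blue_nm=450.0, red_nm=650.0):
    """Power difference (diopters) between the red and blue wavelengths."""
    red = diffractive_power(design_power_d, design_wavelength_nm, red_nm)
    blue = diffractive_power(design_power_d, design_wavelength_nm, blue_nm)
    return red - blue


if __name__ == "__main__":
    # Hypothetical 2 D lens designed for 550 nm.
    lca = longitudinal_chromatic_aberration(2.0, 550.0)
    print(f"red-blue power difference: {lca:.2f} D")
```

Under these assumptions the red-blue focal shift is proportional to the design power, which is why the aberration becomes significant for PBLs with large optical power.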
There are other optical elements that only have optical power for a single polarization state. The well-developed LC Fresnel lenses [Fig. 9(c)] can continuously tune the focal length but only for the linearly polarized light whose polarization direction is parallel to the LC alignment direction. Another example is the reflective DOE lens with patterned CLC, reported by Kobashi et al. [81]. This type of CLC lens [Fig. 9(d)] only focuses or defocuses light with one circular polarization in a reflective fashion, while letting light with the orthogonal polarization pass through. Although the CLC lenses reported so far can only work within a limited spectral bandwidth, it is possible to enlarge the spectral range by cascading several thin-film lenses with different working wavelengths [85]. The LC Fresnel lens and CLC lens can also function as the calcite lens and PBL if they are cascaded with another one that works for the orthogonal polarization or integrated with a polarization-independent bias lens, like the standard refractive lens.
Continuously tunable lens
Various types of optical components with tunable or switchable optical power have been developed for a wide range of applications. These optical parts can be utilized to build power-based multifocal displays if they can be manufactured with an appropriate size and vary their optical power fast enough. The following sections will cover designs using transmissive tunable lenses, including liquid crystal lenses, liquid lenses, freeform lenses, and also reflective optical components such as LCoS-SLMs and deformable membrane mirrors.
To our knowledge, Suyama et al. [86] demonstrated the first LC lens based multifocal display system in 2000. They fabricated an active addressable Fresnel lens with dual-frequency liquid crystal (DFLC) mixture, whose dielectric anisotropy changes sign from positive to negative by increasing the electric field frequency. The DFLC is injected into the cell made of a Fresnel lens with a surface alignment layer. In this way, by alternating the driving signal frequency, the optical power of the DFLC Fresnel lens can oscillate between − 1.2 and + 1.5 diopters at 60 Hz. In the prototype, the DFLC lens is placed between two static lenses as part of the viewing optics. To maintain a constant FOV when the optical power changes, the DFLC lens is placed at the focal point of the eyepiece. An issue with the demonstrated tunable lens is the limited imaging quality when its optical power is oscillating between the two stable states. Since the cell gap is not uniform and the alignment at the Fresnel surface is not well defined, the effective refractive index of the LC mixture may vary considerably within the whole volume during the transitional states. Thus, the disturbed phase profile would degrade the imaging ability of the DFLC lens.
Liu et al. [87] reported a see-through dual-focal HMD prototype [Fig. 10(a)] enabled by a tunable liquid lens from Varioptic™. This design is adapted from the conventional bird-bath architecture by adding an electrowetting liquid lens in front of the beam splitter. The employed tunable lens manifests a varying optical power from − 5 to 20 diopters if driven by an alternating electric field with a root-mean-square voltage from 32 Vrms to 60 Vrms. The response time of the liquid lens, Arctic 320, employed in this work was ~ 75 ms. This prototype also utilized an OLED microdisplay with a refresh rate of 85 Hz and a graphics card that can support 75 Hz rendering. Thus, the system refresh rate is limited to ~ 7 Hz by the liquid lens. Then, in their continued work [88], a faster liquid lens with 9-ms response time was adopted and boosted the dual-focal refresh rate to 37.5 Hz, which is limited by the graphics card. Since they also added an empty frame after each frame of the image content to avoid transitional states of the liquid lens, the final refresh rate could be reduced to 18.75 Hz if accurate focus cues are desired. After that, they further improved their system [89] in 2010 by upgrading the graphics card to support 240 Hz SVGA (800 × 600) contents. As a result, the refresh rate of dual-focal display with empty frames is increased to 21.25 Hz, becoming microdisplay-limited at this time. However, even if the micro-OLED is replaced by a fast-response DMD display, the system still could not achieve 60 Hz rate for two depths because the response of the tunable lens is not fast enough. As electrowetting lenses function by electrically changing the contact angle of a fluid droplet, the inertial effect intrinsically limits the response time of these lenses, especially those with a large aperture.
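The sequence of bottlenecks described above can be reproduced with a simple rate budget. This is a simplified sketch with hypothetical names: it assumes that the lens needs one full response time per focal plane, and that each inserted empty frame costs one additional display and rendering frame per plane.

```python
def dual_focal_rate(lens_response_s, display_hz, gpu_hz,
                    planes=2, blank_frames=False):
    """Return (volume rate in Hz, name of the limiting component)."""
    frames_per_volume = planes * (2 if blank_frames else 1)
    limits = {
        "lens": 1.0 / (planes * lens_response_s),
        "display": display_hz / frames_per_volume,
        "graphics": gpu_hz / frames_per_volume,
    }
    bottleneck = min(limits, key=limits.get)
    return limits[bottleneck], bottleneck


if __name__ == "__main__":
    # Values quoted in the text: 85 Hz OLED, 75 Hz or 240 Hz rendering,
    # 75 ms or 9 ms lens response.
    print(dual_focal_rate(0.075, 85, 75))
    print(dual_focal_rate(0.009, 85, 75))
    print(dual_focal_rate(0.009, 85, 75, blank_frames=True))
    print(dual_focal_rate(0.009, 85, 240, blank_frames=True))
```

With a 9 ms response, the lens term alone caps a dual-focal system at about 56 Hz in this model, consistent with the remark that a faster display would still not reach 60 Hz for two depths.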
In 2015, Llull et al. [90] realized a flicker-free multifocal display [Fig. 10(b)] employing an ultra-high-speed DMD display and a fast-response tunable lens from Optotune™, which is based on a combination of optical fluids and a polymer membrane. The employed DMD display offers 400 Hz 6-bit grayscale imagery and the fast liquid lens can switch its focal length between 2 and 8 diopters within 2 ms, yet with 5 ms settling time. In this benchtop binocular prototype, they successfully offered six focal planes (0.6 diopters spacing) with 31° FOV at the rate of 60 Hz. Then, from the same company, Wu et al. [91] presented a multifocal display with content-adaptive depth arrangement to improve the perceived 3D image quality, which appears like a hybrid of varifocal and multifocal concept.
In 2018, Chang et al. [92] further increased the number of focal planes at the cost of a lower frame rate, demonstrating a proof-of-concept greyscale near-eye display [Fig. 10(c)] with a dense collection of 40 depths at a 40 Hz refresh rate. The optical layout in this work is similar to that reported by Llull et al. [90], employing a DMD display for large bandwidth and a fast-response liquid lens for depth changing. An essential feature of their design is the focal length tracking system, which can accurately acquire the real-time depth at microsecond time resolution, allowing rapid and precise synchronization between the constantly sweeping tunable lens and the DMD display. In the meantime, another independent 60 Hz multifocal display that supports full-color operation was reported by Rathinavel et al. [93], featuring a denser depth stack (280 depths) and using the same liquid lens. The novel part of this work is the decomposition of colored 3D scenes into 280 binary patterns, each with a single color channel and placed at one depth. The design offers a dense focal stack, spanning from 0.25 to 6.7 diopters, which is adequate for supporting most 3D image content.
Lee et al. [94] from Seoul National University proposed the tomographic display [Fig. 10(d)], featuring a 60 Hz LCD with a fast DMD backlight and a tunable liquid lens. In the prototype, 80 depths occupy the focal range from 0 to 5.5 diopters with a 0.07-diopter spacing in between. When the scanning virtual image plane arrives at one depth, only the 3D content in the vicinity of this depth is illuminated by the DMD. Thus, in principle, the correct synchronization of the liquid lens and the DMD backlight can map each pixel of the 2D image displayed on the LCD to an arbitrary depth within the working focal range, generating an accurate but discretized focal surface [55]. A limitation of the tomographic display is that each pixel on the LCD is fixed at one depth, making it impossible to provide accurate depth for 3D scenes with semi-transparent objects, where the viewer should observe multiple depths in a single direction.
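The pixel-to-depth mapping described above can be illustrated with a minimal sketch that quantizes a per-pixel depth map into binary backlight masks, one mask per focal plane. This is not the implementation of Lee et al. [94]; the function name, the nearest-plane rounding rule, and the assumption that the 80 planes include both ends of the 0 to 5.5 diopter range are ours.

```python
def backlight_masks(depth_map, num_planes=80, d_min=0.0, d_max=5.5):
    """Build one binary backlight mask per focal plane.

    depth_map is a list of rows holding per-pixel depths in diopters.
    Assumption: planes are evenly spaced and include both d_min and d_max.
    """
    step = (d_max - d_min) / (num_planes - 1)
    masks = [[[0] * len(row) for row in depth_map] for _ in range(num_planes)]
    for y, row in enumerate(depth_map):
        for x, depth in enumerate(row):
            clamped = min(max(depth, d_min), d_max)
            # Nearest focal plane: the backlight lights this pixel only
            # while the swept image plane passes that depth.
            k = round((clamped - d_min) / step)
            masks[k][y][x] = 1
    return masks


# Tiny 2 x 2 example depth map (diopters), chosen for illustration only.
depth_map = [[0.0, 1.0],
             [5.5, 1.0]]
masks = backlight_masks(depth_map)

# Each pixel is lit in exactly one mask, i.e. it lives at a single depth.
lit_counts = [[sum(m[y][x] for m in masks) for x in range(2)] for y in range(2)]
print(lit_counts)
```

Because every pixel is illuminated in exactly one mask, the sketch also makes the stated limitation explicit: a single LCD pixel cannot contribute to two depths along the same viewing direction.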
Switchable lens
Aside from the continuously tunable lenses, addressable lenses that can switch between discrete optical powers can also be applied for building multifocal display systems in a time-sequential way.
In early 2018, Zhan et al. [95] fabricated switchable LC lenses employing the Pancharatnam-Berry phase and built a 4-depth multifocal HMD prototype based on them [Fig. 11(a)]. These actively addressable PBLs are made of LC instead of LC polymer, so their polarization-dependent optical power vanishes when a sufficiently large electric field is applied to the electrodes. Since the cell gap of PBLs is usually less than 2 μm, the response time can be as fast as 0.5 ms if a low-viscosity LC material is employed. In the prototype, two LC PBLs with 0.5 and 1.0 diopters of optical power are attached together and sandwiched between two plano-convex lenses, forming a compact and switchable viewing optics assembly with four evenly spaced optical powers. A fast 240 Hz LCD was employed to provide 60 Hz flicker-free content for the four depths. After the synchronization of the LCD and the two PBLs, a multifocal HMD prototype was constructed with the same form factor as commercial VR HMDs. Thanks to the bifocal nature of the switchable PBLs, this design is free from the longitudinal focal shift that occurs when switching between focal planes with continuously tunable lenses. A general limitation of this design and others using switchable lenses is that they cannot realize the dynamic multifocal structure proposed by Wu et al. [91] because of the fixed focal plane arrangement.
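How a stack of two switchable lenses yields four focal states can be sketched by enumerating the on/off combinations. The model below is deliberately simplified and is our own assumption: each lens contributes either zero power or its nominal power, the powers of thin lenses in contact simply add, and the polarization handedness handling of real PBLs is ignored. The 0.5 and 1.0 diopter values are used as an illustrative binary-weighted pair.

```python
from itertools import product


def switchable_powers(lens_powers):
    """Total powers reachable when each lens is either inactive (0) or active."""
    totals = set()
    for states in product((0, 1), repeat=len(lens_powers)):
        totals.add(sum(s * p for s, p in zip(states, lens_powers)))
    return sorted(totals)


def evenly_spaced(values, tol=1e-9):
    """True if consecutive values differ by the same amount."""
    gaps = [b - a for a, b in zip(values, values[1:])]
    return all(abs(g - gaps[0]) < tol for g in gaps)


# Binary-weighted pair (illustrative values, in diopters).
powers = switchable_powers([0.5, 1.0])
print(powers, evenly_spaced(powers))
```

In this simplified picture, N binary-weighted switchable lenses give 2^N evenly spaced states, which is why a small number of fast switchable elements can address several focal planes.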
Later, Wang et al. [96] reported another multifocal switchable lens using freeform optics and a patterned LC shutter [Fig. 11(b)]. They designed and fabricated a freeform singlet consisting of four concentric zones, each with a distinct optical power. A custom-designed LC shutter with four corresponding concentric electrode patterns, each independently controllable, is then attached to the freeform surface. This combination realizes a switchable lens with four focal lengths. The LC shutters can switch between transparent and dark states within 2.5 ms, making it possible for the switchable freeform singlet to scan through the four focal lengths at an overall rate of about 400 Hz. Such a fast-response freeform lens could potentially support a multifocal display with 60 Hz and four depths if combined and synchronized with a high-speed display panel such as a DMD.
Tunable reflector
In addition to the transmissive active optics, there also exist reflective optical parts that have been developed and exploited for multifocal display systems. Although employing reflective elements usually enlarges the footprint of display systems, a decent optical see-through functionality can be acquired if the tunable reflector is also configured as the see-through combiner.
Traub [97] was a forerunner in using varifocal mirrors for multifocal displays. The tunable mirror he employed in 1967 was a metalized Mylar membrane reported earlier in 1960 by Muirhead [98]. The membrane was stretched taut and fixed over a circular frame as a deformable curved mirror, which was driven by a loudspeaker to change the curvature and hence the focal length [Fig. 12(a)]. Traub claimed that the mirror surface was mostly spherical when driven by a single frequency. In the demonstration, an oscilloscope screen was placed at 45° to the normal of the mirror surface, and the viewer could directly observe the multifocal display volume from the mirror. This pioneering work verified the feasibility of multifocal displays, even though deformable membranes driven by loudspeakers are not suitable for practical applications.
In 1997, Neil et al. [99] demonstrated a multifocal display using a programmable reflective Fresnel zone plate generated by a ferroelectric liquid crystal (FLC) SLM that could support a high refresh rate of several kHz. This design includes two FLC SLMs, where one is used for generating 2D images and the other functions as the active Fresnel zone plate [Fig. 12(b)]. As a proof of concept, a monochrome two-level grey 3D scene was displayed at 60 Hz with three focal planes located at 45 cm, 90 cm and infinity. However, the contrast of the displayed images shown in the paper was limited. This problem could be caused by the unsatisfactory amplitude modulation of the first SLM and by the unwanted diffraction orders from the second SLM carrying the Fresnel zone plate phase. Nevertheless, this work is still a precursor of multifocal display systems using an SLM as the active optics.
McQuaide et al. [100] proved the feasibility of offering multi/vari-focal planes with an electrically driven deformable membrane mirror in a retinal scanning display [Fig. 12(c)]. The membrane mirror (from Flexible Optical B.V.) utilized in their monocular prototype was made of a thin silicon nitride membrane that was coated with aluminum and suspended over an electrode. The membrane surface deforms into a parabolic shape when a voltage is applied to the electrode. By varying the voltage from 0 to 300 V, the optical power of the membrane can be continuously tuned from 0 to 1 diopter. They achieved a working focal range from 0 to 3 diopters and verified its accuracy by simultaneously measuring the eye accommodation response using an autorefractor. Then, in 2006, Schowengerdt and Seibel [101] further improved this retinal scanning design, demonstrating a binocular prototype with an extended focal range from 0 to 16 diopters.
In addition to the retinal scanning displays, the electromechanical deformable membrane was also applied in panel-based designs reported by Hu and Hua [102,103,104]. Their first demonstration [102] in 2011 features a DMD display and a deformable membrane mirror in a birdbath arrangement [Fig. 12(d)]. The illuminated DMD is imaged by the active membrane mirror to an intermediate image plane in front of the eyepiece. Even though the addressable mirror with 1-kHz rate can support up to 16 focal planes with 60 Hz contents, in this design, they demonstrated six focal planes evenly occupying the working focal range from 0 to 3 diopters. As they claimed, this is the first display system that can offer 2D images with decent image quality at six depths without flicker. Then, in [104], they miniaturized this design and modified it as the light engine for a see-through freeform combiner. With the help of custom-designed freeform optics, their six-depth AR HMD prototype can offer decent imaging (1.8 arcmin resolution) and see-through quality within 40° diagonal FOV.
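The focal plane arrangement and the time-multiplexing budget quoted above can be checked with a short sketch. It assumes, as our own reading, that the six planes are evenly spaced in diopters and include both ends of the 0 to 3 diopter range, and that each plane consumes one mirror state per 60 Hz frame; the function names are hypothetical.

```python
import math


def focal_plane_diopters(num_planes, d_min, d_max):
    """Evenly spaced focal planes in diopters, including both end points."""
    step = (d_max - d_min) / (num_planes - 1)
    return [d_min + i * step for i in range(num_planes)]


def diopters_to_meters(power):
    """Viewing distance in meters; 0 diopters corresponds to optical infinity."""
    return math.inf if power == 0 else 1.0 / power


planes = focal_plane_diopters(6, 0.0, 3.0)
distances = [diopters_to_meters(p) for p in planes]

# Upper bound on focal planes for a 1 kHz mirror at 60 Hz per plane.
max_planes = 1000 // 60

for p, d in zip(planes, distances):
    print(f"{p:.1f} D -> {d:.3f} m")
print("max planes:", max_planes)
```

With these assumptions the spacing is 0.6 diopters, the nearest plane sits at about 0.33 m, the farthest at optical infinity, and the 1 kHz mirror leaves room for up to 16 planes at 60 Hz, in line with the figures in the text.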
Polarization-multiplexing
To our knowledge, the concept of multiplexing 2D images by polarization was first mentioned in early 2016 by Lee et al. [68], who claimed that a pixelated LC panel can add depth information to a 2D image by polarization modulation at the pixel level. Later in 2018, Zhu et al. [105] proposed a detailed optical design of a multifocal display with both space and polarization multiplexing. In this configuration, two LCDs are cascaded to create two independent depths. Another LC panel is then employed as the polarization modulator to define a spatially varying polarization pattern on the two LCDs, which doubles the number of focal planes in combination with the birefringent crystal lens. However, to our knowledge, there has been no published implementation of this design yet.
Tan et al. [106] built and reported the first polarization-multiplexed dual-focal HMD prototype [Fig. 13(a)]. They used a non-switchable PBL made of LC polymer to provide distinct optical powers for RCP and LCP light. The prototype consists of two 60-Hz LC panels (one as the display and the other as the polarization modulator), a quarter-wave plate, a PBL, and an eyepiece. Before implementation, they characterized the polarization modulation performance of the LC panel to obtain a detailed mapping between the input grey level and the output polarization states at RGB wavelengths. In this manner, a dual-focal full-color near-eye display was demonstrated with a ~70° FOV. Moreover, the ghost images caused by the zero-order leakage of the single-layer PBL in this work can be considerably reduced by employing an ultra-broadband PBL with a spectral response tailored for display applications [79, 107]. Compared with commercial VR devices, this design adds only three flat optical components to achieve the bifocal functionality, so a small footprint can be well maintained. If the PBL is replaced by a birefringent crystal lens as designed by Zhu et al. [105], then the quarter-wave plate is no longer needed [Fig. 13(b)]. In general, most previously mentioned multifocal designs that function by polarization switching can be combined with the polarization multiplexing method to relieve the burden of the high frame rate.
Wavelength-multiplexing
The wavelength-multiplexed multifocal display system was developed only recently by Zhan et al. [37]. The dual-focal benchtop demonstration [Fig. 14] is essentially a distance-based dual-focal birdbath design. A spectral notch filter was used as the wavelength-sensitive component to generate distinct depths. The two laser beam scanning projectors employed in the system have close but different wavelengths, 532 nm and 517 nm, for the green channel. Due to the sharp spectral stopband of the notch filter, the 517 nm laser light passes through it with negligible reflection, whereas the 532 nm light is totally reflected. In this manner, the notch-filter-based layout creates an optical path length difference, and therefore distinct focal planes, for the two green wavelengths. Although this simple design verified the feasibility of offering multiple focal planes by wavelength multiplexing, there are still many challenges for practical applications. First, mixing wavelengths in each color channel directly affects the overall color performance, since only colors shared by all focal planes can be displayed if a uniform system color appearance is desired. Second, the stopband of the notch filter usually shifts with the angle of incidence, which becomes a critical limitation for achieving a larger FOV.
Following the detailed discussion of multifocal display optical designs, Fig. 15 offers a short summary of all the approaches discussed in this section and the relations among them.