Multi-focus light-field microscopy for high-speed large-volume imaging

High-speed visualization of three-dimensional (3D) processes across a large field of view with cellular resolution is essential for understanding living systems. Light-field microscopy (LFM) has emerged as a powerful tool for fast volumetric imaging. However, one inherent limitation of LFM is that the achievable lateral resolution degrades rapidly with the increase of the distance from the focal plane, which hinders the applications in observing thick samples. Here, we propose Spherical-Aberration-assisted scanning LFM (SAsLFM), a hardware-modification-free method that modulates the phase-space point-spread-functions (PSFs) to extend the effective high-resolution range along the z-axis by ~ 3 times. By transferring the foci to different depths, we take full advantage of the redundant light-field data to preserve finer details over an extended depth range and reduce artifacts near the original focal plane. Experiments on a USAF-resolution chart and zebrafish vasculatures were conducted to verify the effectiveness of the method. We further investigated the capability of SAsLFM in dynamic samples by imaging large-scale calcium transients in the mouse brain, tracking freely-moving jellyfish, and recording the development of Drosophila embryos. In addition, combined with deep-learning approaches, we accelerated the three-dimensional reconstruction of SAsLFM by three orders of magnitude. Our method is compatible with various phase-space imaging techniques without increasing system complexity and can facilitate high-speed large-scale volumetric imaging in thick samples.


Introduction
Morphological and functional dynamics in three-dimensional (3D) organisms, such as neuronal calcium transients [1,2], vascular transport [3][4][5], embryo development [6][7][8], and molecular signaling networks [9,10], necessitate high-speed volumetric imaging techniques to resolve these complexities across large spatiotemporal scales [11,12]. To meet the pressing demand, various efforts have been invested during the past decades to develop fast and high-quality volumetric imaging methods [13][14][15][16], among which light-field microscopy (LFM) [17] is an attractive candidate due to its high parallelization and low phototoxicity. By inserting a microlens array (MLA) into the detection path, LFM encodes the high-dimensional information of a large volume in a snapshot, providing a powerful capability of high-speed 3D imaging [18][19][20]. However, similar to common microscopy techniques, LFM still suffers from the tradeoff between spatial-temporal resolution and the volume coverage, which impedes observing subtle details in large-scale volumes.
Several methods have been proposed to alleviate this limitation. A bifocal lenslet array [21] and confocal LFM [22] can extend the high-resolution range along the z-axis at the cost of system complexity, resolution, or speed. Wavefront coding has been explored to address the non-uniform resolution limitation [23], but the corresponding volume coverage is stilled restricted to a shallow range. Moreover, the achievable resolutions of these methods are much lower than the diffraction limit, due to the intrinsic tradeoff of spatial and angular resolutions. Scanning LFM (sLFM) with high-speed drifting of the image plane [24] bypasses the limitation and improves the spatial resolution near the focal plane up to the diffraction limit, but the lateral resolution still degrades rapidly with the increase of the defocus distance. Therefore, high-speed high-resolution imaging across a large depth of field remains a challenging problem for broad biological applications.
To address this problem, we propose a method termed as Spherical-Aberrationassisted sLFM (SAsLFM) by achieving simultaneous multi-focus light-field detection with the aid of additional spherical aberrations without complex hardware modifications. We reason that sLFM, as an aperture-partitioning method, has the potential to modulate the phase of each sub-aperture separately to extend the high-resolution range of the system. Spherical aberration is usually not preferred in optical imaging since it will make the light rays at the outer part of the lens and the central part of the lens focus on different axial regions, which is easy to generate and implement in the optical path [25]. Although it deteriorates image formation in wide-field microscopy, it can be used to generate multi-focus along the optical axis in sLFM by directly transferring the foci of phase-space point-spread functions (PSFs) to different depths. Merging all the information during iterations of the phase-space reconstruction can produce a large-scale volume with high resolution over an extended axial range. We demonstrate the principle of our method through numerical simulations and verify the improvement experimentally by imaging a USAF-1951 resolution target placed at various depths and the entire vasculature system of zebrafish larvae. The results show that our method provides a conspicuous improvement in high-resolution coverage. To demonstrate the versatility of SAsLFM, we imaged various large-scale biological dynamics. We monitored the activities in awake head-fixed mice and successfully revealed the calcium response with a depth range of ~ 300 μm with a high signal-to-background ratio. We also achieved 3D tracking of the folding tentacles of freely-swimming jellyfish at 22 Hz in a volume of ~ 2000 × 2000 × 500 μm 3 and observed the developmental dynamics of Drosophila embryos over a duration of ~ 6.5 h. All the experimental results indicate that our method could visualize fast biological activities with high resolution in a 3D space spanning a large depth of range. Moreover, as learning-based approaches become a generic solution for data analysis and image restoration [26], we investigated the ability of deep learning to automatically extract and reveal high-resolution features from SAsLFM data, and successfully achieved a high reconstruction throughput of 167 volumes per second.

SAsLFM approach
Our method can be easily applied to various phase-space imaging schemes. For demonstration, we chose unfocused sLFM with an MLA inserted at the native image plane (Fig. 1a). A piezo tilt platform was placed at the conjugated pupil plane to shift the image plane by small intervals (typically, 1/3 or 1/5 of the pitch size of each microlens) to increase the spatial sampling rate. An image sensor was placed at the focal plane of the MLA with each microlens covering N × N sensor pixels. Based on the periodic scanning path, pixels with the same position relative to the center of each microlens can be realigned together to form an intensity projection I(x, u), where x and u are 2D spatial and angular coordinates, respectively. The I(x, u), referred to as a sub-aperture component in phase space, represents an intensity projection of the 3D sample from a specific perspective (Fig. 1a). The I(x, u) can be directly calculated by ∑ z X(x, z) ⊗ W(z, x, u), where z is Fig. 1 Principle of SAsLFM. a Schematic of SAsLFM setup. A petri dish was filled with water to introduce a large spherical aberration into the optical path. A microlens array (MLA) was inserted at the native image plane, and a camera was placed at the back focal plane of the MLA. The sweep of the piezo tilt platform and the trigger of the camera were synchronized to obtain a sequence of light-field images, each of which was shifted by a small interval. Based on the scanning path (3 × 3 scanning period for demonstration), pixels having the same relative position to the center of each microlens were rearranged together to form sub-aperture components. b Illustration of sub-aperture PSFs modification. The light rays passing through different sub-apertures change their original directions when entering the water and ultimately focus at various depths (20 × /0.5NA sLFM and SAsLFM systems with each microlens covering 9 × 9 pixels are used for demonstration). Maximum intensity projections (MIPs) of the 81 sub-aperture PSFs of traditional sLFM and SAsFLM at the depth of − 300 μm are shown on the right for comparison. After the modulation of the spherical aberration, the intensity distributions of sub-aperture PSFs are changed. c Rearranged sub-aperture components of the sLFM data are divided into several groups according to the corresponding sub-aperture positions. Components within the same group are marked with the same color and contain the high-resolution content acquired from a specific depth range. Our phase-space reconstruction algorithm can fully exploit the details contained in the sub-aperture data and merge them during iterations to reveal a large-scale volume with high resolution the axial coordinate in object space, W(z, x, u) represents the sub-aperture PSF along the same direction, X(x, z) is the 3D sample and ⊗ denotes convolution operation. The formula reveals that sub-aperture components are mutually independent (Supplementary Note 1). By introducing a phase modulation at the pupil plane, the focal plane of each sub-aperture component can be transferred to a different depth. Considering the crosstalk between adjacent sub-aperture components and the artifact induced by wavefront discontinuity at the pupil plane, a smooth phase such as spherical aberration is thus preferred. By increasing the refractive index mismatch between the sample buffer and the immersion medium of the objective, a spherical aberration can be easily induced. For example, we can fill a petri dish with water and then image the sample with a dry objective (Fig. 1b). Depending on the imaging conditions, alternative solutions with a higher refractive index such as immersion oil (n = 1.51) can be used to introduce a larger spherical aberration into the optical path. The approach described above also produces other types of aberrations, which are not as significant as spherical aberration and have a negligible impact on the final performance due to the digital adaptive optics (DAO) [24] capability. As shown in Fig. 1b, with an ideal centrosymmetric spherical aberration imposed at the pupil plane, the focus depth of each view component is determined by the relative distance between the position of the sub-aperture and the center of the pupil plane. The axial high-resolution position of each sub-aperture component is offset from each other and the energy distribution is redistributed. By merging all the information through appropriate reconstruction algorithms, a high-resolution large-scale volume with an extended DOF can be obtained (Fig. 1c).
To quantitatively analyze the impact of spherical aberration, we numerically simulated the beam propagation of a 20 × /0.5NA sLFM system with each microlens covering 1313 pixels. We first added different levels of spherical aberration to the pupil plane and the corresponding spatial PSFs H (z, x) with many zero-value elements are shown in Fig. 2a. The relationship between the appended phase pattern and the spatial PSF shape is implicit, making it difficult to quantitatively evaluate the increase of DOF directly from H. We then rearranged the images in phase-space to form sub-aperture PSFs along different directions and typical components (u = (3, 3); (5, 5); (7,7)) are shown in Fig. 2b. Without spherical aberration, all the sub-aperture PSFs focus at the same depth and the DOF is just ~ 50 μm, which is consistent with the previous study [27] The disparity of focus depth between different sub-apertures components increases as the spherical aberration becomes larger ( Supplementary Fig. 1). When the Zernike coefficients of the primary spherical aberration are increased to 5 and 10, the DOF reaches ~ 120 μm and ~ 185 μm, respectively (Fig. 2b). Note that the DOF has different degrees of expansion on either side along the optical axis. For simplicity, the axial focal position of the central component of each method is denoted as the native focal plane in the following description. We next investigated other factors that may affect the extension of DOF, such as system numerical aperture (NA) and the number of sub-apertures. As shown in Supplementary Fig. 2, we applied the same degree of the primary spherical aberration (Zernike coefficient = 10) at the pupil planes of sLFM systems with 20 × /0.5NA, 40 × /1.0NA, and 63 × /1.4NA objectives. Here, the DOFs increase from ~ 50 μm, ~ 20 μm, and ~ 10 μm to ~ 185 μm, ~ 52 μm and ~ 23 μm, respectively, indicating that the degree of extension decreases with increasing NA. In addition, we characterized the effect of the number of sub-apertures on performance ( Supplementary Fig. 3). As each microlens covers N × N sensor pixels, an sLFM image can be decomposed into N × N sub-aperture components. When the size of the sensor pixel is fixed, the pitch size of a single microlens rises with the increase of N. Meanwhile, as the effective NA of sLFM sub-aperture components is reduced by a factor of N, the corresponding DOF increases monotonically with N. When the size of the microlens pitch is fixed, larger N implies a higher sampling rate. DOF no longer rises with the increase of N when the sampling rate is adequate.

SAsLFM extends DOF and alleviates artifacts
As a proof of principle, we imaged a USAF-1951 resolution target using a 10 × /0.28NA long working distance dry objective with different scanning periods. In this experiment, the resolution target was translated axially from the focal plane with an interval of ~ 100 μm. The imaging processes for sLFM and SAsLFM are identical except that we drenched the chart with 2-cm-thick water to introduce spherical aberration into the latter system. As for the reconstruction process in the traditional method, all the components are focused at the same depth and sequentially fed into the algorithm one by one to update the estimated volume. However, we find that there is a huge redundancy in the angular sampling for the existing 3D deconvolution algorithm ( Supplementary Fig. 4). Adding spherical aberration is advantageous to make use of the redundancy by assigning the foci of sub-aperture components to different depths. We divided the sub-apertures PSFs of SAsLFM into several groups, depending on their positions relative to the center of the pupil plane. To resolve information at different depths, specific groups are selected as the inputs for the ADMM (alternating direction method of multipliers)-based 2D deconvolution algorithm as a whole. Note that although the number of components in each group differs, each group contains enough angular components to recover the original 3D information due to the angular redundancy. The reconstruction results are shown in Figs. 3a-b and Supplementary Figs. 5 and 6. Although there is a slight drop in the resolution at the native focal plane due to the laterally expanded PSFs, SAsLFM preserves a significant portion of lateral spatial resolution with the ability to resolve two bars separated by an interval of 3.91 μm (group 7th, element 1st) in a large axial depth range from − 600 to 200 μm (Figs. 3c-d). Because the spherical aberration transfers the foci of the off-center components to the negative side of the optical axis, the resolution is asymmetric with respect to the native focal plane. In addition, SAsLFM eliminates the edge artifacts caused by the ringing dilemma at the native focal plane ( Supplementary Fig. 6). Meanwhile, we conducted another experiment using 2-cm thick oil as the immersion medium, and the results show that a larger spherical aberration could further extend the DOF ( Supplementary Fig. 7). Next, we compared the SAsLFM with cubic-phase-assisted  Fig. 8). We respectively appended spherical aberration and cubic phase at the pupil plane of a sLFM system using a 20 × /0.5NA objective. Objects randomly located ~ 150 μm away from the focal plane can be clearly revealed by SAsLFM with the structural similarity index measure (SSIM) reaching 0.78. While cubic-phase has been shown to be effective in reducing the artifacts around the native object plane of LFM systems [23], it fails to alleviate the rapid degradation of resolution at out-of-focus layers. The SSIM of cubic-phase-assisted sLFM is 0.66, which is almost the same as the value of 0.65 obtained from the sLFM without phase modulation.

SAsLFM enables observation of whole-body vasculatures of zebrafish larvae
To demonstrate the advantages of SAsLFM in volumetric imaging, we imaged the wholebody vasculatures of zebrafish larvae. Considering both temporal and spatial resolutions, we chose a 3 × 3 scanning strategy for data acquisition. The zebrafish larvae were embedded in 1% agarose in a petri dish. When imaging with SAsLFM, we filled the petri dishes with water to introduce spherical aberration. To analyze the spherical aberration induced by the water, we adopted a phase-retrieval-based method [27] to estimate the wavefront. As expected, a larger spherical aberration was observed in the phase retrieved from the SAsLFM data (Fig. 4a). During the imaging process, we ensured that the central sub-aperture components of the original sLFM and SAsLFM focus on the same position of the sample. Other sub-aperture components of SAsLFM can discern finer details that were blurred in the original sLFM (Fig. 4a). The sharpness parameters also show that the resolution of components of SAsLFM is generally higher than that of sLFM (Figs. 4b-c). During ADMM-based phase-space reconstruction, components were divided into several groups based on their focal depths. Since the components of the same group correspond to a specific high-resolution range along the z-axis, the reconstruction process of merging all the information from all the groups during iterations resolves a large-scale volume with high resolution over an extended depth range (Fig. 4d-e and Supplementary  Fig. 9). As shown in Fig. 4f, the vascular structure is blurred in sLFM due to resolution degradation, but SAsLFM could resolve vascular structures over a large axial range with high fidelity.

SAsLFM enables large-scale volumetric imaging at high speed with low phototoxicity
To demonstrate the versatility of SAsLFM, we first imaged large-scale 3D calcium activities in the brain of an awake Rasgrf2-2A-dCre/Ai148D mouse in vivo. Each microlens corresponds to 15 × 15 pixels on the sensor. We used a 3 × 3 scanning trajectory to balance the trade-off between spatial and temporal resolutions. During experiments, the mouse was head-fixed on a customized holder with a 3D-printed plastic tube to restrict the body to minimize vibration, and we chose to conduct the experiments when the mouse was in a calm state. For SAsLFM imaging, we mounted a container weighing less than 2 g on the head-fixed mouse to hold the water, thus introducing the spherical aberration into the optical path (Fig. 5a). During imaging, we maintained the thickness of water at ~ 1 cm. With SAsLFM, we can achieve single-neuron resolution across a large volume covering ~ 2000 × 2000 × 330 μm 3 without axial scanning ( Fig. 5b and Supplementary Movie 1). We acquired 1009 frames of SAsLFM data at 15 Hz during ~ 67 s imaging session and reconstructed the 3D distribution of neurons using the phase-space deconvolution. The distribution of the resolved neurons is consistent with the previous study [28]. Then a constrained non-negative matrix factorization (CNMF) [29] was performed on the maximum intensity projections (MIPs) of the volumetric data for calcium signal extraction. The extracted calcium traces show that even neurons across a large depth range of ~ 300 μm due to the spherical surface of the cortex exhibit significant responses, indicating that SAsLFM could offer prospects for imaging the activity of neural ensembles over a large range (Fig. 5c).
SAsLFM also has unique advantages in observing freely-moving animals as it captures a large-scale volume with an extended DOF. To demonstrate the performance of SAsLFM in animal tracking, we then imaged a freely-swimming jellyfish ephyra at 22 Hz. Jellyfish are attracting increasing interests in the biological community because they have complex life cycles that provide an exciting opportunity to understand the behavioral evolution. In previous studies, jellyfish were immobilized in a container to stay in objective focus due to the small DOF of related imaging technologies [30]. While in our experiment, we filled a petri dish with clean artificial seawater and put a single jellyfish ephyra inside without immobilization. To study the activity of freely-behaving jellyfish, we imaged the jellyfish ephyra at 22 Hz. Note that the jellyfish larvae are too tiny to cause large fluctuations in the water surface, and therefore will not cause significant distortion of the PSF. In our experiments, SAsLFM reveals the 3D structure with stable performance across a large depth (Fig. 5d and  (Fig. 5e-f ). These results indicate the potential of SAsLFM to benefit large-scale 3D imaging and analysis of freely-behaving animals.
In addition, SAsLFM facilitates long-term volumetric imaging because of its high photon efficiency and low phototoxicity. We imaged a developing Drosophila embryo expressing His2Av-mRFP1 at an interval of 60 s over a period of ~ 6.5 h. At the late developmental stages, the embryonic muscular becomes very active, but SAsLFM still achieves artifact-free whole-embryo observation with single-cell resolution (Fig. 5g). To track the cells in the embryos, we first performed cell segmentation using ImageJ's threshold and binary function. Then, we implemented an optical flow estimation algorithm [31] on two MIPs of the reconstructed volumes to automatically estimate the motion of every single cell (Figs. 5h-i). The result indicates that SAsLFM enables longterm 3D observations across a large FOV, potentially extending the applications of sLFM in developmental biology.

Learning-based approach accelerates 3D deconvolution
With the continuous improvements of computing hardware and the availability of multiple open-source methods, deep learning has become an attractive candidate for image deconvolution. To achieve real-time reconstruction, we present a U-net-based phasespace deconvolution procedure and achieve a reconstruction throughput of 167 volumes per second (Supplementary Fig. 10). We acquired ~ 120 sets of high-resolution 3D confocal images of 3 transparent mice brains and performed forward projection to obtain paired phase-space SAsLFM images (Fig. 6a). The synthetic SAsLFM images were added with Poisson noise and then served as the inputs for the neural network. About 100 pairs of data were used to train the network, and the remaining ~ 20 pairs were used as the validation set to prevent overfitting (Fig. 6b). A combination of normalized absolute difference (L1 loss), root-mean-square error (L2 loss), and SSIM were used as the loss function to measure the reconstruction quality. To test the capability of the trained network, we captured SAsLFM images of another clarified mouse brain as the test data. Totally, SAsLFM revealed 3D structures in 25 different FOVs, covering about 1960 × 1960 × 600 μm 3 after we stitched them together using an ImageJ plugin (Fig. 6c). While the reconstruction performance is comparable to that of the ADMM-based phase-space deconvolution method, the computing time is reduced by 3 orders of magnitude (Fig. 6d). The average processing time using the ADMM-based phase-space deconvolution algorithm is ~ 16 s/volume (513 × 513 × 101 pixels), while the average computational time of the neural network is ~ 0.006 s/volume. The combination of learning-based reconstruction and SAsLFM acquisition promises to achieve video-rate recording and real-time reconstruction of large-scale 3D signals.

Conclusions
As an instrumental technique for instantaneous 3D imaging, LFM has aroused growing interest in the community of biology and medicine. Recently, more attentions have been attracted to sLFM because of its superior performance in terms of spatial resolution and aberration amelioration. But the sparsity of microscopic data and the redundancy of angular components suggest that sLFM has not realized its full potential. In this work, we present a multifocal technology, termed SAsLFM based on sub-aperture PSF engineering to extend the high-resolution range utilizing the redundancy of phase-space data. By simply changing the refractive index of the immersion medium between the objective and the sample, we induced a large spherical aberration into the optical path to modulate the energy distribution of sub-aperture PSFs along the z-axis. Our method not Fig. 6 Learning-based approach accelerates the reconstruction without sacrificing performance. a The network training pipeline. We captured ~ 120 sets of confocal image stacks to generate the paired SAsLFM images using the calibrated phase-space SAsLFM PSFs. To simulate the camera acquisition process, additional Poisson noise was added to the synthetic images. The network was trained by iteratively minimizing the loss function, which consists of the L1 loss term, the L2 loss term and the SSIM loss term. b The dataset was divided into two parts, one for network training and the other used as the validation set. After ~ 1600 epochs, the loss of the validation procedure started to increase, while the loss of the training was still decreasing. c Stitched volume covering ~ 1960 × 1960 × 600 μm 3 , consists of 25 reconstructed image stacks with 50% overlap. The segmentation provided by the ImageJ's 3D objects counter function shows a sloping distribution of neurons in a depth range of ~ 600 μm. d The reconstruction performance of the ADMM-based algorithm and the network is equivalent, while the inference time of the learning-based approach is ~ 0.006 s, which is three orders of magnitude less than that of the former. Scale bars: 1000 μm (c) and 400 μm (d) only inherits the advantages of sLFM, such as low phototoxicity and high imaging speed, but also mitigates the degradation of lateral resolution for layers away from the focal plane. Moreover, reconstruction artifacts close to the native object plane can be reduced by our method. Such improvements facilitate applications that require fast measurement of fine 3D structures in thick samples. To prove the versatility of SAsLFM, we conducted extensive experiments including large-scale calcium imaging in mice, behavioral tracking of freely-swimming jellyfish, and long-term observation of Drosophila embryo development. Finally, we also derived a U-net-based method to accelerate the deconvolution of light-field data and achieved real-time 3D reconstruction on our system.
Our method provides a practical approach for PSF engineering to modulate the sub-aperture PSFs of sLFM in phase space rather than optimizing the original spatial PSFs with a large amount of discrete zero elements. In this work, for simplicity and wavefront continuity, we just used spherical aberration as a proof-of-concept. In further work, we will explore more complicated phase patterns to optimize the performance and broaden the applications of sLFM to compensate for the degradation of axial resolution due to the missing-cone problem. We also seek to reduce the localization ambiguity and realize axial "jumps" to allow the users to probe separated regions of interest at the same time. After figuring out the optimal phase pattern, our method can be implemented by inserting a laser-lithographic or 3D-printed phase mask into the optical path to maximize the robustness and compactness of the imaging system.
In addition, akin to many supervised approaches, our deep-learning-based deconvolution requires paired training images and is likely to suffer from limited generalizability or suboptimal performance. More investigations will be made in the future to address these difficulties. Unsupervised learning, such as Cycle-GAN and its variants [32][33][34], could be an optional solution to circumvent the requirement for pixel-wise matched training pairs. Transfer learning based on pre-trained models also offers a potential opportunity to promote generalizability. Another limitation is that the deep-learning-based method is susceptible to noise. We modeled the camera noise mathematically to solve the problem in our work, but the performance could be unstable for different noise types and levels. Pre-processing the input data using some self-supervised denoising methods could be a more versatile way to address the issue [35][36][37].

SAsLFM optical setup
Our approach is applicable in various schematics of phase-space imaging technologies, such as LFM and sLFM. In this work, we chose the unfocused sLFM for experimental demonstration (Supplementary Fig. 11). sLFM is a wide-field-esque imaging method with a microlens array placed at the native image plane. A camera is places at the back focal plane of the MLA. At the conjugated pupil plane, a piezo tilt platform was placed to rapidly scan the image plane at small intervals, which is determined by different experimental conditions, to balance the trade-off between spatial and temporal sampling rates. Meanwhile, the periodic scan was synchronized with the image acquisition. Detailed imaging conditions, including the microlens array, cameras, number of sub-apertures, and the scan trajectory, are illustrated in Supplementary Table 1.

Stochastic path integration of pupil phases
Even the most elaborate optical systems inevitably have system aberrations, which affect the imaging performance. Although the DAO capability of sLFM mitigates the distortion caused by aberrations to a certain extent, the imaging quality still suffers from higherorder optical aberrations. To lift the burden of DAO and to circumvent the laborious process of capturing 3D phase-space PSFs, we used a phase-retrieval-based algorithm to generate and calibrate the PSFs with the wave optics model [38] as previously described [27]. For each optical system, we calibrated the PSFs twice, once for sLFM and the other for SAsLFM. We firstly captured images of sub-diffraction-limited fluorescence beads (with or without a certain amount of water) as the target distribution and simulated the PSFs of sLFM without any additional aberration. During each iteration, we calculated the correlations between the simulated PSFs and the captured images along all viewing directions. We can obtain the residual wavefront on the pupil plane by integrating the correlation map. Here we adopted an effective integration method based on stochastic paths, which provides a smoother and more accurate estimation. There is a total of C r r+c optional integration paths connecting the central point to the point with coordinates. (r, c). on the pupil plane. For each point, we randomly selected 1000 paths and integrated the correlation map along these paths. Then we averaged the integration values as the final estimation of the residual wavefront at (r, c). The estimated wavefront is then fitted with Zernike polynomials and appended to the pupil plane to generate a new simulated PSF. The above iteration procedure continuously shrinks the disparities between the generated PSFs and the captured ones until the residual wavefront converges. We show the effectiveness of our method in Supplementary Fig. 12. In different experiments, there is a small difference between the actual PSF and the calibrated ones because the thickness of the water used in the experiments is not the same as that used in the calibration process. However, the distortion induced by the slight difference can be eliminated by the light-field's DAO capability.

Phase-space deconvolution with a circular trajectory
The phase-space deconvolution method proposed by Zhi Lu et al. [39] increases the convergence speed. Here we make further improvements to this algorithm in Fourier space with a circular trajectory. Based on the ADMM algorithm [40], the update process can be represented as: Where I(u) is the sub-aperture component, δ is a small number to avoid a division by zero, * signifies 2D convolution operation, V(j, k) is the estimated volume within iteration j th of inner loop and k th iteration of outer loop. During the inner loop, each iteration uses information from a single angular component to update V. After going through all the sub-aperture components, it goes into the k + 1 th iteration of the outer loop. W(x, z, u) T is the adjoint of the point spread function W(x, z, u). J is an allone matrix to compensate for energy loss at the edges of the images. β is an empirical coefficient to determine the convergence speed. α u is an update weight calculated according to the energy distribution of each angular component: The aforementioned method may lead to oscillation during the convergence process and crosstalk between adjacent layers. As the focus depth of each component is determined by the relative distance between the position of the sub-aperture and the center of the pupil plane, we provided an update scheme based on a circular trajectory. The volume is updated by the formula: Here we divided the sub-aperture components into several groups according to the position relative to the center of the pupil plane. i is the index of each group, G i is the index set of angular components belonging to the i th group. M i is the total number of components in G i , p(z) denotes an update rate changing with the depth. It has many options, such as a sigmoid function. Starting from the outermost or the innermost ring, the new reconstruction method with a circular trajectory uses all the components in one group to update the volume during each inner iteration. With our approach, as the focus depth of each group is assigned to a different position, an empirical p(z) is used to merge all the information together to generate a large-scale high-resolution volume. To accelerate the update process, we derived the forward and backward projection in Fourier space. Due to the properties of Fourier transform, the forward projection process can be represented as: Where FT n and IFT n denote n-dimensional Fourier transform and inverse Fourier transform respectively. sum z signifies the sum operation along z-axis in the spatial domain and sum fz is the sum operation along fz axis in the frequency domain. To demonstrate the speed improvement, we firstly assume that the size of V is N x × N y × N z , the size of sub-aperture PSF is n x × n y × n z . For simplicity, we assume that N x and N y are larger than n x and n y . The computational complexity of the forward projection in spatial space is O N x N y n x n y N z + N x N y N z = O N x N y n x n y N z While the corresponding computational complexity in Fourier space is It illustrates that calculating the projection in frequency space significantly reduces the computational cost, especially when the size of V is large. In terms of the backward projection, the computational complexity is equivalent to that of the forward projection. As the projection operations are repeated for each sub-aperture component during iterations, deconvolution in Fourier space provides orders of magnitude reductions in computational costs.

Evaluation metrics
We used the SSIM to quantitatively evaluate the reconstruction performance. SSIM is defined as Where μ A and μ B are the local means of A and B, σ A ， σ B and σ AB are standard deviations and cross-deviations for images A and B. C 1 and C 2 are constants to avoid a division by null. A and B are converted to grayscale images with a range from 0 to 1.
We used a sharpness metric to evaluate the amount of high-frequency information contained in a single image, which can be calculated by: Where Gradient(u, v) is a function calculating the numerical gradient of a 2D matrix I(x, y). U and V are the height and width of the gradient map, respectively.

Network architecture
U-net is a generic deep-learning solution for various quantification tasks such as cell segmentation and biomedical image deconvolution. The network architecture in this work was inspired by previous research [41] (Supplementary Fig. 10). We captured ~ 120 sets of confocal images and used a wave optics model to generate synthetic SAsLFM images. To mimic the camera acquisition process, we added independent Poisson noise to the generated SAsLFM data. The data pairs were divided into two groups, ~ 100 pairs were used for network training and the rest served as the validation set to prevent overfitting. The loss function of the network consists of a pixel-wise L1-norm loss term, an L2-norm loss term and an SSIM term. The size of the input data was set to 513 × 513 × 21 pixels (x-y-u), where the last term is the angular index. And the size of the output data was 513 × 513 × 201 pixels (x-y-z). Of note, among all the 225 phase-space components, we selected 21 components, considering the data redundancy and the computational consumption. Before training the network, all data were normalized in the range from 0 to 0.9. The network was trained using the Adam optimizer with the learning rate set to 0.0002, and the exponential decay rates for the first-moment and second-moment estimates were 0.5 and 0.99, respectively. The training procedure costs about 10 h to converge with a single GPU (GeForce Titan RTX, Nvidia).

Calcium trace extraction of mouse brain
We employed a CNMF framework to decompose the MIPs of the volumetric data of the mouse brain into a matrix that encodes the footprints of segmented neurons. In the algorithm, the regions of interest (ROIs) were set to ~ 15 × 15 × 15 μm 3 to match the size of neurons of the mice brain. Because the vessel has a significant effect on the selection of ROIs, resulting in a biased segmentation of neurons, we manually performed exclusion guided by the visual information. The calcium responses were then extracted directly from the abovementioned ROIs in the volumetric time-lapse stacks. The temporal traces of the calcium activities were calculated by the formula: Where F is the raw averaged intensity of the extracted ROI, and F 0 is the corresponding intensity baseline, which was calculated by averaging the intensity of the signals that below 120% of the mean value of the entire trace.

Semi-automated tracking of Drosophila embryo cells
For cell tracking in Drosophila embryos, we used a semi-automated framework rather than performing manually as the cells are densely labeled and the number of cells is too large, which makes manual labeling very time-consuming and cumbersome. We calculated the MIPs of two volumes separated by a time interval of ~ 255 s. During this time period, the distribution of the cells is slightly changed while the overall morphology remains almost the same. Then, we adopted an optical flow estimation method based on conjugation gradient [31] to calculate the distribution of velocities of movements of the bright patterns (the cells). Segmentation was applied using ImageJ's threshold and binary functions to find out the contours of the cells, together with the fill holes function to compensate for the over-segmentation. Large areas were excluded by threshold filtering. Finally, 554 connected domains were segmented and then we extracted the values of the optical flow map on these coordinates as the estimation of the cell motions.

Data analysis
All data analyses were performed with customized MATLAB (MathWorks, 2020b) programs, open-source ImageJ and Amira (Thermo Fisher Scientific, Amira 2019). The hardware was controlled by LabVIEW 2018. The 3D tracking of 8 tentacles of the freelyswimming jellyfish was carried out manually in MATLAB. Details of the parameters and rendering models are listed in Supplementary Table 2.

Imaging of Drosophila embryos
All Drosophila experimental procedures were conducted with ethical approval from the Animal Care and Use Committee of Tsinghua University. All Drosophila in the �F /F 0 = (F − F 0 )/F 0 experiments expressed His2Av-mrfp1. Drosophila embryos were dechorionated with 50% (vol/vol) sodium hypochlorite solution. During live imaging, Drosophila embryos were embedded in 0.4% low-melting agarose in a 35-mm petri dish with the temperature kept at 25 °C. For SAsLFM, we filled the petri dish with water to introduce the spherical aberration.

Zebrafish vascular system imaging
All zebrafish experimental procedures were conducted with ethical approval from the Animal Care and Use Committee of Tsinghua University. We cultured flk: EGFP transgenic zebrafish embryos at 28.5 °C in Holtfreter's solution. The zebrafish larvae were anesthetized by ethyl 3-aminobenzoate methanesulfonate salt (100 mg/L) at 4-5 days postfertilization (dpf ) and mounted in 1% low-melting-point agarose in a petri dish filled with water at 26-27 °C during the imaging process.

In vivo mouse experiments
All procedures involving mice were approved by the Institutional Animal Care and Use Committee of Tsinghua University. We used both male and female C57BL/6 mice 10 weeks to 6 months old without randomization or blinding. Mice were group-housed under a cycle of 12 h light/dark (lights on at 7 a.m.) and provided with water and food ad libitum. The relative humidity was 50% at 20-22 °C.
The craniotomy surgery was performed on the stereotaxic apparatus (RWD, China). Mice were anesthetized with 1.5-2% isoflurane. After the surgery, flunixin meglumine (Sichuan Dingjian Animal Medicine Co., Ltd) was injected subcutaneously (1.25 mg/kg) for at least 3 days to reduce inflammation.
The scalp was removed by sterile surgical scissors to expose the entire dorsal skull. The skull was thoroughly cleaned with saline to remove all fascia above the skull. Then, a piece of skull (8 mm in diameter) was removed using a cranial drill and replaced with a crystal skull. The edge of the crystal skull and the skin incision was sealed with a thin layer of cyanoacrylate adhesive (Krazy glue, Elmer's Products Inc). A custom-made head-post was implanted above the skull and fixed with dental cement.
For acute imaging, we used adult double-transgenic Rasgrf2-2A-dCre/Ai148D mice (JAX No.: 022864 and 030328) to specifically label cortical layer 2/3 neurons. Trimethoprim saline solution (solution concentration: 5 mg/ml, dose: 10ul/kg) was injected for 3 days to induce cre expression. For chronic imaging, adult C57BL/6 mice injected with diluted AAV9-hSyn-GCaMP6s virus (from BrainVTA Technology Co., Ltd., China) were allowed to recover for at least 2 weeks after craniotomy. During imaging, awake mice were placed in a tube with the head restrained under the objective to minimize vibration. For SAsLFM, a container was mounted on the head of the mouse to hold water ( Supplementary Fig. 13).

Cubic-MACS clearing
Firstly, the mice were anesthetized with a 0.5% pentobarbital sodium solution (0.4 ml/30 g body weight). To flush blood vessels, the mice were transcardially perfused with 0.01 M PBS (Sigma-Aldrich Inc., St. Louis, MO, United States) with 4% paraformaldehyde (PFA, Sigma-Aldrich Inc., St. Louis, MO, United States) in PBS (pH 7.4) for fixation. Then the sample was post-fixed with 4% PFA for 2 days under 4 °C. The brain samples were washed with PBS for 1 day with the solution replaced at 8 and 16 h. Then the samples were delipidated with a CUBIC-1 solution (~ 50 ml) for 6 days at room temperature. The brain samples were washed again with PBS for 1.5 days with the solution changed every 8 h at room temperature and then immersed in CUBIC-X 1 swelling solution for 2.5 days with the solution replaced every 12 hours. Finally, for refractive index matching, the samples were immersed with CUBIC-X 2 for 1.5 days with the solution replaced every 12 h. During imaging, the samples were dissected and embedded in 4% agarose mixed with CUBIC-X2.