Digital staining in optical microscopy using deep learning - a review

Until recently, conventional biochemical staining held undisputed status as the well-established benchmark for most biomedical problems related to clinical diagnostics, fundamental research and biotechnology. Despite this role as gold standard, staining protocols face several challenges, such as the need for extensive manual processing of samples, substantial time delays, altered tissue homeostasis, a limited choice of contrast agents, 2D imaging instead of 3D tomography and many more. Label-free optical technologies, on the other hand, do not rely on exogenous and artificial markers. Instead, they exploit intrinsic optical contrast mechanisms, whose specificity is typically less obvious to the human observer. Over the past few years, digital staining has emerged as a promising concept that uses modern deep learning to translate optical contrast into the established biochemical contrast of actual stainings. In this review article, we provide an in-depth analysis of the current state of the art in this field, suggest methods of good practice, identify pitfalls and challenges, and postulate promising advances towards potential future implementations and applications.

thin sections (typically 3-7µm) and staining, before imaging under conventional microscopes or whole slide scanners can be pursued.
Label-free optical technologies (see Fig. 1C), on the other hand, exploit natural contrast mechanisms, instead of relying on the limited choice of exogenous markers in the above-mentioned staining procedures. Simple white-light microscopy, for instance, relies on amplitude differences based on the scattering and absorption properties of cells and tissues; optical phase microscopy measures phase contrast based on differences in refractive index (RI), birefringence, or orientation; other imaging modalities use the intensity or lifetime of natural autofluorescence (AF). Although these label-free contrast mechanisms can carry highly relevant information related to factors like density, thickness, mass, redox ratio and many more, their specificity as direct biomarkers is typically less obvious to the human observer. Over the past decades, machine learning (ML) or artificial intelligence (AI) demonstrated vast success in optical microscopy, e.g., in automated detection of diseases [1], 3D image segmentation [2] or simultaneous optimization of microscopy and software components [3]. In conventional pathology, AI models are often used to perform classification or segmentation of histology images from diseased and healthy tissues. As is common in most supervised ML, training of these models requires large datasets with reliable ground-truth labels. These labels are commonly generated manually by experts, e.g., for the automated segmentation of background, cell boundaries, and cell compartments by convolutional neural networks (CNN) [4]. Especially with the rise of the U-Net architecture [5] for image segmentation, cell segmentation could be solved more effectively. Nevertheless, the conventional procedures of histological staining and manual annotation are still rather time-consuming, and the need for reliable ground-truth data often acts as a bottleneck for throughput in digital pathology.

Fig. 1 Basic principle of digital staining. a Conventional staining of 3D tissue samples requires a time-demanding and cumbersome procedure of biopsy acquisition, formalin-fixed paraffin-embedding (FFPE), manual sectioning, dehydration and artificial staining. These prepared tissue slices are then imaged by optical microscopes and the obtained images are quantified (e.g., via histopathology scoring by experienced experts). b Staining of cell cultures is conventionally based on antibody reactions with immuno-fluorescence (IF) stains. This process does not require embedding, sectioning and dehydration, and can even be compatible with live cell imaging. However, the image quantification is still specific to the applied staining (e.g., nuclei staining for segmentation of nuclei). c Label-free optical technologies exploit the natural contrast of biomedical samples, without relying on artificial stainings. Although this omits the need for extensive sample preparation, the quantification is bound to the specific type of optical contrast that was used (e.g., dry mass approximation in quantitative phase imaging). d Digital staining (DS) can combine the advantages of an experimentally more practical imaging technique with the high specificity of a thorough but cumbersome staining approach. Thus, DS can be used to digitally enhance label-free optical microscopy (e.g., generation of IF images based on white light microscopy) or to perform stain-to-stain translation (e.g., generation of specific IHC staining based on already available H&E stainings). A detailed literature overview of commonly used input-target image pairings and respective example images is displayed in Fig. 2
Over the past decade, ML researchers have developed several techniques for image-to-image translation. Upon training and validation, these generative models allowed transfer from one image domain to another, e.g., from maps to satellite images [6], from horses to zebras [7] or for style transfer in art [8]. Recently, alternative image-to-image training strategies, such as Normalizing Flows [9,10] and Denoising Diffusion Probabilistic Models [11,12], have also gained significant popularity. Digital staining (DS) is an emerging concept in the field of computational microscopy that can digitally augment microscopy images by transferring the contrast of input images into a target domain (see Fig. 1D). Digital staining models are most often implemented with machine learning algorithms that are trained on pairs of input and target images. In a nutshell, these ML models learn to link characteristic features in structure and contrast from the input domain (most often a label-free image) with those of the target domain (most often images from stainings with well-known molecular specificity). Thereby, digital staining elegantly bypasses two obstacles: (i) during the development and training of computational models, digital staining removes the need for manual annotation of ground-truth data, by obtaining the ground-truth annotations from specific stainings, and (ii) upon deployment, inference with a trained model can circumvent the time-consuming and tedious procedure of actual sample preparation, including sectioning and staining.
Despite the vast potential of this technique, the growing number of new digital staining pipelines and the widening range of applications, thorough review articles on this topic are rare and only touch on side aspects of digital staining. A 2022 review on GANs in ophthalmology [13] mentioned some DS techniques in the specific use case of transforming fundus photographs into angiography images. Jiang et al. provide a concise review on deep learning (DL) in cytology, including classification, segmentation, object detection and stain normalization of microscopy images [14], but without covering digital staining as such. In a similar fashion, Wu et al. touch on the topic of style transfer in microscopy in their 2021 review on computational histopathology [15], however only in the context of color normalization. In 2018, Jo et al. mentioned 'image enhancement via style transfer' as a promising development for the specific technique of quantitative phase imaging (QPI) [16], but without reviewing the entire field of digital staining. Rivenson et al. published a 2020 review article on virtual staining for histopathology [17]. However, it only targeted digital staining of FFPE sections and did not include the immense increase in publications in this field over the past three to four years (see Fig. 5A). The latest reviews from 2022 and 2023 summarized the concept of translating input images into target images solely in the context of histological tissue sections [18,19], but without an in-depth analysis of other digital staining applications, including (live) cell staining.

Basic principle and key examples
Successful implementation of digital staining essentially relies on four key parts:
• the use of input images that carry a sufficient amount of information to allow the translation into the target domain (see chapter on Input domains). This input domain usually relates to the use of a label-free technique, but it is not limited to that.
• the use of target images with reliable ground-truth information that can be linked to the features in the input domain (see chapter on Target domain). These target images usually use the biochemical specificity of molecular stains as ground truth, but are not restricted to those.
• the use of appropriate computational models that can translate input images to target images (see chapter on Computational models). Most often, this image-to-image regression problem is solved by machine learning algorithms (specifically U-Net or GAN architectures), but earlier implementations also relied on linear, mathematical formulas to translate color spaces.
• a procedure to accurately generate paired input and target images (see chapter on Generation of paired images). The exact registration of input pixels and target pixels of the same structures might seem trivial, but it is essential to enable the model to perform accurate image-to-image regression. While a few recent implementations use unsupervised learning with unpaired images for training, all implementations at least require paired input and target images for a truthful validation of the predictions, as discussed below.
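The pixel-exact pairing of input and target images described in the last point can be illustrated with a small registration sketch. It uses phase correlation, a generic technique chosen here for illustration and not necessarily the method of any reviewed publication. The function names, the synthetic images and the restriction to pure integer translations are assumptions; real image pairs typically also need rotation, scaling or elastic corrections.

```python
import numpy as np


def estimate_shift(reference, moving):
    """Estimate the integer (dy, dx) circular shift that maps `reference` onto `moving`."""
    f_ref = np.fft.fft2(reference)
    f_mov = np.fft.fft2(moving)
    # Normalized cross-power spectrum: keeps only the phase difference.
    cross_power = f_mov * np.conj(f_ref)
    cross_power /= np.abs(cross_power) + 1e-12
    # Its inverse transform peaks at the relative displacement.
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peaks beyond half the image size correspond to negative shifts.
    return tuple(int(p) if p <= s // 2 else int(p) - s
                 for p, s in zip(peak, correlation.shape))


def align_to_reference(reference, moving):
    """Undo the estimated shift so that `moving` overlays `reference` pixel by pixel."""
    dy, dx = estimate_shift(reference, moving)
    return np.roll(moving, (-dy, -dx), axis=(0, 1))


# Hypothetical example: a synthetic "label-free" image and a shifted "stained" copy.
rng = np.random.default_rng(0)
label_free = rng.random((64, 64))
stained = np.roll(label_free, (5, -3), axis=(0, 1))
registered = align_to_reference(label_free, stained)
print("estimated shift:", estimate_shift(label_free, stained))
```

In practice, input and target images differ in contrast, so the correlation peak is less sharp than in this synthetic case, and the estimated shift should be checked visually before training.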
Thus, digital staining can be viewed as a holistic concurrence of biology, optical microscopy and computational modeling. Successful implementations rely on an understanding of the entire workflow: it starts from a reasonably posed biological problem, involves input images that carry a sufficient amount of information with respect to that biological problem as well as target images that can be linked to the information from the input domain, and ends with a computational model that is able to accurately translate input images into target images. Furthermore, the practical workflow to generate pairs of input and target images, as well as the choice of quantitative metrics for training and validation, are important considerations for digital staining. Depending on the mode of operation and the preference of the authors, the concept of DS is also termed 'virtual staining', 'in silico staining', 'pseudo-H&E staining' or 'virtual fluorescence'.
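The simplest class of computational models mentioned above, a linear formula that translates between color or contrast spaces, can be sketched as a least-squares fit on registered pixel pairs. This is a minimal illustration under assumptions: the function names, the three-channel input and the synthetic data are hypothetical, and no reviewed publication is claimed to use exactly this formulation.

```python
import numpy as np


def fit_linear_color_map(inputs, targets):
    """Fit an affine map from input channels to target RGB values.

    inputs:  (N, C) array of per-pixel input channels (e.g., label-free contrasts)
    targets: (N, 3) array of per-pixel RGB values of the registered stained image
    Returns a (C + 1, 3) weight matrix; the last row is the bias term.
    """
    design = np.hstack([inputs, np.ones((inputs.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return weights


def apply_linear_color_map(inputs, weights):
    """Translate input pixels into the target color space with the fitted map."""
    design = np.hstack([inputs, np.ones((inputs.shape[0], 1))])
    return design @ weights


# Hypothetical paired pixels: 200 pixels, 3 input channels, 3 target channels.
rng = np.random.default_rng(1)
true_weights = rng.normal(size=(4, 3))
input_pixels = rng.random((200, 3))
target_pixels = apply_linear_color_map(input_pixels, true_weights)

fitted = fit_linear_color_map(input_pixels, target_pixels)
predicted = apply_linear_color_map(input_pixels, fitted)
```

A map of this kind acts on each pixel independently, so it cannot use spatial context. That limitation is one reason why U-Net and GAN architectures have largely replaced it.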
The earliest, and still one of the most common, implementations of DS translates label-free images of tissue sections into target images of well-known and widely accepted histological stainings. This is often based on two consecutive tissue sections, where one is imaged with label-free modalities as input, while the adjacent section is used for a conventional histology stain as target. This digital H&E staining has been shown extensively and for a multitude of different organ samples based on label-free autofluorescence [20].
In 2018, Christiansen et al. demonstrated the next stage of digital staining in live cell cultures with different IF dyes in a shared optical path, using phase microscopy and a U-Net model [21]. The use of fluorescently labelled antibodies for digital staining in live cells opened the door for many new biomedical experiments, such as an extension into 3D digital staining [22], the use of digital staining to promote prior-informed cell segmentation [23], digital staining of two different cell cycle markers for mitosis stage classification [24] or a detailed evaluation of virtual labeling of mitochondria in living cells [25].
Besides the conceptual advancements of digital staining, the field was undoubtedly fueled by the introduction of more powerful ML models for image-to-image regression, such as U-Net [5], generative adversarial networks (GAN) [26] and cycle conditional GANs [7]. Since these models became more widely available and were applied to digital staining, e.g., 'Pix2Pix' for digital staining in 2018 [27] or stainGAN, which was initially used for stain normalization [28], the number of publications in this field has increased exponentially over the past 3-4 years (see Fig. 5A).

Input domains: label-free contrast mechanisms in optical microscopy as "optical specificity"
As mentioned above, label-free contrast mechanisms often carry highly relevant information that can be linked to functional and/or morphological features like density, thickness or mass, the redox-ratio of a cell cycle, surface topography, presence or absence of certain molecules and many more. Whether this information is sufficient for a given digital staining task is among the first and most important considerations when implementing a digital staining model, as discussed in Trends & methods of good practice below.
In this chapter, all reviewed publications are categorized according to the contrast mechanism of the input domain. Label-free microscopy techniques are commonly used to generate input images, while elaborate staining procedures of known biochemical specificity are usually used as target images. Optical phase contrast and wide-field / white light illumination are the two label-free techniques most commonly used to generate input images, with 19% and 16% of our reviewed literature, respectively. Other notable label-free input imaging methods include autofluorescence (AF), nonlinear techniques, and photoacoustic imaging. Several studies employ combinations of different contrasts. On the one hand, this can be implemented in one single setup, e.g., in Fourier ptychographic microscopy (FPM) as a combination of amplitude and phase contrast [29], in dark field reflectance and autofluorescence (DRUM) [30] or in complementary nonlinear techniques [31-33]. On the other hand, some papers present the use of different imaging systems for combined data input, e.g., wide-field and phase contrast [21,22,34]. Furthermore, there are also several implementations of stain-to-stain translation, where inputs from one stain are digitally transferred to a different target stain.
While H&E is the most widespread stain used for digital staining, the single most popular combination is the use of phase contrast microscopy as input images to predict multiple IF stains, as displayed in Fig. 2. Since most of these IF stains target membranes (Dil stain), nuclei (DAPI or Hoechst) or the cytoskeleton (Microtubuli, MAP-stains), phase imaging techniques are an ideal match, as their optical phase contrast is highest for the very same cellular structures (membranes, nuclei and cytoskeleton).

Fig. 2 (caption, continued) … [22]. Images are publicly available at https://downloads.allencell.org/publication-data/label-free-prediction/index.html, re-use was permitted and licensed by Springer Nature. (B4) translation of bright field images of living cells to genetically encoded mitochondria markers by Somani et al. [25]. Images are publicly available at https://doi.org/10.18710/11LLTW [35], re-use licensed under CC0 1.0. (B5) stain-to-stain translation of H&E images into cytokeratin stain by Hong et al. [36]. Images are publicly available at https://github.com/YiyuHong/ck_virtual_staining_paper, re-use licensed under CC BY 4.0. (B6) stain-to-stain translation of IHC images into different IHC images by Ghahremani et al. [37]. Images are publicly available at https://zenodo.org/record/4751737#.YV379XVKhH4, re-use permitted and licensed by Springer Nature. IHC = immuno-histochemical stain, IF = immuno-fluorescence stain, WF = wide field (white light illumination), AF = autofluorescence, PAM = photo-acoustic microscopy, IR = infra-red. An extended version of the detailed literature analysis can be found in the Supplementary material of this manuscript
The most important label-free optical techniques are briefly presented in this chapter, while the biochemical staining methods that are usually used as target images are presented in the following chapter, Target domain: biochemical stains as ground-truth.

Wide-field (WF) microscopy
Perhaps the most basic type of optical microscope is the standard wide-field microscope, known since the beginnings of optical microscopy. Wide-field (WF) microscopy is an imaging technique where the whole sample is illuminated with light. The basic design consists of a light source that illuminates an extended area of a typically thin sample, which scatters and transmits a fraction of the illumination into a lens or collection of lenses that images the light onto an arrayed detector. Depending on the particular illumination conditions, discussed below, wide-field microscopy has also been referred to as bright-field microscopy and white-light microscopy, among others.
In its most basic form, wide-field microscopy offers qualitative contrast derived from the spatially-varying complex transmittance of the sample:

$$t(x, y) = \exp\left( i \frac{2\pi}{\lambda} n(x, y) \, \Delta z(x, y) \right) \quad (1)$$

where $n(x, y)$ is the spatially-varying complex refractive index of the sample, $\Delta z(x, y)$ is the sample thickness, and $\lambda$ is the wavelength of the illumination. The real part of the refractive index imparts a phase shift on the incident light that is often difficult to observe in thin samples using standard bright-field illumination, i.e., illumination whose angular range falls within the numerical aperture (NA) of the objective lens. However, off-axis illumination in the bright-field regime or in the dark-field regime (i.e., with illumination angles higher than the cutoff imposed by the NA of the objective) can highlight certain features that may be used for virtual staining, such as cell or organelle boundaries. The imaginary part of the refractive index corresponds to absorption induced by the sample. As such, wide-field microscopy can be useful for imaging certain types of cells that contain molecules with strong absorption at certain wavelengths, such as red blood cells and melanocytes. The wavelength dependence of the absorption is often useful for distinguishing certain types of molecules, which can be achieved with a wide-field microscope by sweeping the illumination wavelength or by using white-light illumination with a multi- or hyperspectral camera.
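The roles of the real and imaginary parts of the refractive index can be made concrete with a short numerical sketch. It assumes the usual thin-sample form of the complex transmittance, t = exp(i 2π n Δz / λ); the wavelength, the refractive indices and the thickness are hypothetical values chosen only for illustration.

```python
import numpy as np


def complex_transmittance(refractive_index, thickness, wavelength):
    """Thin-sample transmittance t = exp(i * 2*pi/lambda * n * dz); n may be complex."""
    return np.exp(1j * 2 * np.pi / wavelength * refractive_index * thickness)


wavelength = 0.532                         # illumination wavelength in µm (assumed)
thickness = np.full((4, 4), 5.0)           # uniform 5 µm sample thickness (assumed)

# Background with a purely real index, plus a weakly absorbing inclusion.
n = np.full((4, 4), 1.36 + 0.0j)
n[1:3, 1:3] = 1.38 + 0.002j

t = complex_transmittance(n, thickness, wavelength)
intensity = np.abs(t) ** 2                 # what an intensity detector records
phase = np.angle(t)                        # wrapped phase, not directly recorded

print("intensity, background:", intensity[0, 0])
print("intensity, inclusion: ", intensity[1, 1])
```

With a purely real refractive index the recorded intensity stays at 1 regardless of the phase shift, which is why phase objects are nearly invisible under standard bright-field illumination. Only the imaginary part of the index lowers the intensity.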
For thicker samples, a simple model based on a complex transmittance map (Eq. 1) is insufficient. Such samples may exhibit higher attenuation contrast due to multiple scattering and absorption, which can be quantified by an attenuation coefficient, $\mu_t$, that subsumes both the absorption coefficient, $\mu_a$, and the scattering coefficient, $\mu_s$, though a standard wide-field microscope generally cannot distinguish the effects of the two sources. While such coefficients are gross or bulk metrics of biological samples (i.e., having an opaque relationship with the 3D structure of the sample), they can still offer useful sources of contrast for virtual staining [22].
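The statement that a wide-field microscope cannot separate absorption from scattering follows directly from the Beer-Lambert form of the unscattered transmission, I/I0 = exp(-µ_t Δz) with µ_t = µ_a + µ_s. The sketch below uses hypothetical coefficients and is restricted to the unscattered (ballistic) light; it does not model multiple scattering.

```python
import numpy as np


def ballistic_transmission(mu_a, mu_s, thickness):
    """Fraction of unscattered, unabsorbed light after a slab of given thickness.

    mu_a, mu_s: absorption and scattering coefficients (1/mm)
    thickness:  slab thickness (mm)
    """
    mu_t = mu_a + mu_s                     # total attenuation coefficient
    return np.exp(-mu_t * thickness)


# Two hypothetical samples with very different physics but the same mu_t = 10 /mm.
mostly_scattering = ballistic_transmission(mu_a=0.1, mu_s=9.9, thickness=0.01)
equally_mixed = ballistic_transmission(mu_a=5.0, mu_s=5.0, thickness=0.01)

print(mostly_scattering, equally_mixed)
```

Both samples yield the same transmitted fraction, so an intensity image alone only reports the sum of the two coefficients.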

Phase sensitive methods
Phase contrast (PC) is an important endogenous contrast mechanism of label-free samples. Small changes in the refractive index and thickness of cells result in detectable changes in the optical phase. Generally, phase contrast microscopy attenuates the background light and compensates the phase shift of the scattered light. This way, the scattered light interferes with the background light more constructively, which enhances the image contrast [38]. Phase microscopy techniques are quite diverse in their exact implementation. They range from the use of phase rings or spatial light modulators to interferometric setups or active illumination control, and most often include computational phase reconstruction.
Phase contrast microscopy and differential interference contrast (DIC) microscopy are still two of the most commonly used phase imaging modalities; they reveal structures of semi-transparent cells that are invisible to the previously discussed wide-field microscopy. Due to the substantial development of PC and DIC over the last half-century and the increasing demand for monitoring cells in vitro, these two modalities are now commonly available in commercial microscope solutions. Therefore, a large and diverse body of PC and DIC studies has been conducted at multiple sites for predicting fluorescence labels, including nuclei and dendrites for human motor neuron cells, as well as nuclei and membranes for human breast cancer cell lines [21]. Further, DIC-based virtual staining has been proposed in hematology to replace the laborious and inconsistent H&E stain of blood smears [39]. In this case, since DIC only preserves the edges of phase images, DIC images tend to lack the detail needed for accurate predictions of intracellular structures. To relieve this issue, Tomczak et al. [39] proposed adding an auxiliary task of nucleus and cytoplasm segmentation to the primary domain transformation task (i.e., predicting the H&E stain from DIC images), which forces the encoder to be aware of the shape of structures. Compared to transformation networks trained on the primary domain transformation task alone, such a multi-task learning method can improve performance on digitally staining leukocytes from hematology slides imaged with DIC.
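The multi-task idea can be sketched as a weighted sum of a translation loss and an auxiliary segmentation loss that share one encoder. The sketch below only shows how such a combined objective could be computed; the choice of an L1 translation loss, a cross-entropy segmentation loss, the weight `seg_weight` and all function names are assumptions, not the published configuration of Tomczak et al. [39].

```python
import numpy as np


def l1_loss(predicted, target):
    """Mean absolute error between predicted and real stain images."""
    return float(np.mean(np.abs(predicted - target)))


def cross_entropy(class_probs, labels, eps=1e-12):
    """Mean pixel-wise cross-entropy.

    class_probs: (H, W, K) per-pixel class probabilities
    labels:      (H, W) integer class labels in [0, K)
    """
    rows = np.arange(labels.shape[0])[:, None]
    cols = np.arange(labels.shape[1])[None, :]
    picked = class_probs[rows, cols, labels]   # probability of the true class
    return float(-np.mean(np.log(picked + eps)))


def multi_task_loss(pred_stain, true_stain, seg_probs, seg_labels, seg_weight=0.5):
    """Primary translation loss plus a weighted auxiliary segmentation loss."""
    return l1_loss(pred_stain, true_stain) + seg_weight * cross_entropy(seg_probs, seg_labels)


# Hypothetical 2x2 example with 3 classes (background, cytoplasm, nucleus).
true_stain = np.zeros((2, 2, 3))
pred_stain = np.zeros((2, 2, 3))               # perfect translation
seg_labels = np.zeros((2, 2), dtype=int)
uniform_probs = np.full((2, 2, 3), 1.0 / 3.0)  # uninformative segmentation

loss = multi_task_loss(pred_stain, true_stain, uniform_probs, seg_labels)
```

Because both loss terms backpropagate through the same encoder in such a setup, a poor segmentation is penalized even when the translated image already looks plausible.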
Another imaging technique based on the RI of the sample is optical coherence tomography (OCT) [40]. Modern point-scan OCT is typically implemented in the frequency domain with a Michelson or Mach-Zehnder interferometer, using wavelength-swept light sources or broadband (low-coherence) sources, such as superluminescent diodes, for illumination. In analogy to ultrasound imaging, OCT uses an optical "pulse-echo" time-of-flight method to create tomographic line-scan images along an optical ray, which can penetrate up to a few millimeters into human tissue. Scanning mirrors can then be used to move the optical beam transversely across the sample and create a volumetric 3D image of a tissue sample. While the lateral resolution of OCT depends on the NA, its axial resolution is inversely proportional to the bandwidth of the source [41]. Since its invention in the early 1990s, OCT has become one of the most successful optical methods in the medical industry [41]. Due to this commercial success, OCT devices are now available off-the-shelf. For instance, Lin et al. use a multi-modal OCT system (AcuSolutions Inc, Taiwan) that can create registered images from both optical coherence microscopy and fluorescence microscopy [42]. The images from the two modalities were merged and false-colored to create pseudo-H&E images. Extensive in-depth comparisons between pseudo-H&E images and frozen-section H&E images from various biopsy specimens showed that the proposed digital staining method can provide H&E images that describe cellular-level morphology around two times faster than the frozen-section method [42]. In addition, another study shows that digital staining can also be achieved from in vivo OCT measurements [43], where tomographic images of the optic nerve heads were acquired from 10 healthy subjects using a standard OCT eye scanner (Heidelberg Engineering Inc, Germany). Four different tissue types were identified based on pixel-intensity histograms and digitally stained in such a way that connective and neural tissues of the optic nerve heads can be easily visualized [43].
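The inverse relation between OCT axial resolution and source bandwidth mentioned above can be quantified with the standard expression for a source with a Gaussian spectrum, δz = (2 ln 2 / π) · λ0² / Δλ, divided by the refractive index when imaging inside tissue. The sketch below assumes this Gaussian-spectrum formula; the center wavelength and bandwidths are hypothetical example values, not parameters of the systems cited above.

```python
import math


def oct_axial_resolution_um(center_wavelength_nm, bandwidth_nm, refractive_index=1.0):
    """Axial resolution (µm) of OCT for a Gaussian source spectrum.

    center_wavelength_nm: center wavelength of the source
    bandwidth_nm:         full width at half maximum of the spectrum
    refractive_index:     1.0 for air; about 1.3 to 1.4 is typical for soft tissue
    """
    resolution_nm = (2 * math.log(2) / math.pi) * center_wavelength_nm ** 2 / bandwidth_nm
    return resolution_nm / refractive_index / 1000.0


narrow = oct_axial_resolution_um(1300, 100)   # roughly 7.5 µm in air
broad = oct_axial_resolution_um(1300, 200)    # doubling the bandwidth halves the value

print(f"{narrow:.2f} µm vs. {broad:.2f} µm")
```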
While the well-established techniques of phase-contrast microscopy and DIC provide qualitative phase contrast by converting phase differences into intensity differences, quantitative phase imaging (QPI) can provide intrinsic quantification of the optical path length difference, which is a function of refractive index (RI) and sample thickness [44]. Thus, QPI shows decent specificity in the imaging signal without requiring any sample preparation. Due to its ability to map the physical refractive index of the sample, digital staining based on QPI has recently been widely explored with various computational microscopy implementations [38]. The QPI concept was gradually extended towards 3D imaging, which resulted in the invention of gradient light interference microscopy (GLIM) in 2017 [45]. GLIM uses data post-processing to filter out-of-focus components for 3D imaging. In 2020, this technique was further augmented by computational specificity (phase imaging with computational specificity, PICS) to digitally stain 3D GLIM images using a U-Net implementation [46].
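How a quantitative phase map relates to refractive index, thickness and, ultimately, dry mass can be sketched as follows. The phase is taken as φ = (2π/λ)(n_sample − n_medium)Δz, and the dry mass surface density as σ = λφ / (2πα) with a refractive increment α of about 0.2 mL/g, a commonly quoted approximate value. All numerical values and function names below are assumptions for illustration.

```python
import numpy as np

REFRACTIVE_INCREMENT = 0.2  # µm^3/pg, equal to 0.2 mL/g (assumed typical value)


def phase_map(n_sample, n_medium, thickness_um, wavelength_um):
    """Phase delay (rad) from the optical path length difference."""
    return 2 * np.pi / wavelength_um * (n_sample - n_medium) * thickness_um


def dry_mass_pg(phase, wavelength_um, pixel_area_um2, alpha=REFRACTIVE_INCREMENT):
    """Integrate the dry mass surface density (pg/µm^2) over all pixels."""
    surface_density = wavelength_um * phase / (2 * np.pi * alpha)
    return float(np.sum(surface_density) * pixel_area_um2)


# Hypothetical cell: 10 x 10 pixels of 1 µm^2, 5 µm thick, n = 1.37 in medium n = 1.335.
thickness = np.full((10, 10), 5.0)
phi = phase_map(1.37, 1.335, thickness, wavelength_um=0.55)
mass = dry_mass_pg(phi, wavelength_um=0.55, pixel_area_um2=1.0)

print(f"phase per pixel: {phi[0, 0]:.3f} rad, dry mass: {mass:.1f} pg")
```

The wavelength cancels in the conversion, so the dry mass depends only on the optical path length difference and the refractive increment.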
FPM is a computational microscopy technique that enables wide-field and high-resolution QPI without any interferometry or mechanical scanning [47]. Usually, a low-magnification objective lens is used for a wide field-of-view, and an LED array is used to vary the illumination angle. In FPM, multiple measurements are captured at different illumination angles, and each measurement represents a different spatial frequency region of the sample. Phase information is then recovered via phase retrieval algorithms that use the overlap between these spatial frequency regions as a constraint. FPM has already been used for digital staining of mouse kidney slides stained with antibody conjugates, based on monochromatic phase images reconstructed with Fourier ptychography [29]. An FPM-like setup using the same active illumination of an LED array was also used to digitally stain cell membranes and nuclei in two different cell cultures [48], although actual FP reconstruction was not applied in that case.
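The statement that each illumination angle gives access to a different region of the sample's spatial frequency spectrum can be illustrated with a simplified forward model: a tilted plane wave shifts the object spectrum, and the objective passes only a low-pass disk of that shifted spectrum. The sketch below simulates single measurements only and does not perform phase retrieval. Shifts and cutoff are given in FFT pixels, the shift wraps around circularly, and all names and values are assumptions.

```python
import numpy as np


def simulate_fpm_measurement(obj, kx_shift, ky_shift, cutoff):
    """Simulate one low-resolution FPM intensity image.

    obj:                complex 2D object transmittance
    kx_shift, ky_shift: spectrum shift caused by tilted illumination (FFT pixels)
    cutoff:             radius of the objective's pupil (FFT pixels)
    """
    spectrum = np.fft.fftshift(np.fft.fft2(obj))
    # Tilted illumination shifts the object spectrum (circularly, as a simplification).
    shifted = np.roll(spectrum, (ky_shift, kx_shift), axis=(0, 1))
    ny, nx = obj.shape
    fy, fx = np.meshgrid(np.arange(ny) - ny // 2, np.arange(nx) - nx // 2, indexing="ij")
    pupil = (fx ** 2 + fy ** 2) <= cutoff ** 2   # ideal circular low-pass pupil
    field = np.fft.ifft2(np.fft.ifftshift(shifted * pupil))
    return np.abs(field) ** 2                    # the camera records intensity only


# Hypothetical object: a single spatial frequency (3 cycles across 32 pixels).
x = np.arange(32)
grating = np.exp(2j * np.pi * 3 * x / 32)[None, :] * np.ones((32, 1))

on_axis = simulate_fpm_measurement(grating, 0, 0, cutoff=2)   # frequency is blocked
tilted = simulate_fpm_measurement(grating, -3, 0, cutoff=2)   # frequency is shifted into the pupil
```

With on-axis illumination the grating frequency lies outside the pupil and the image is dark, whereas the matching tilt moves it into the passband. Scanning many such tilts tiles the spectrum, and the overlaps between neighboring tiles are what the phase retrieval step exploits.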

Autofluorescence (AF)
There are several naturally occurring molecules that emit fluorescence upon excitation by UV or blue light. This process of autofluorescence is often exploited for label-free fluorescence imaging. The most common autofluorescent molecules are listed in Table 1. The excited molecule can then emit standard fluorescence after internal energy conversion (Stokes shift). Intensities, lifetimes as well as ratios of different autofluorescent molecules can be specific to certain cell types and/or functional states [49]. Thus, AF is a reasonable candidate to be used for digital staining. Similar to whole slide imaging (WSI) with white light illumination, some articles use whole slide scanners with UV light to excite these natural fluorophores and perform WSI based on AF contrast [50].
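As an example of an intensity ratio between autofluorescent molecules, the sketch below computes a pixel-wise optical redox ratio from two autofluorescence channels. The definition FAD / (NADH + FAD) is one of several conventions in use and is an assumption here, as are the function name and the example intensities.

```python
import numpy as np


def optical_redox_ratio(fad_intensity, nadh_intensity, eps=1e-9):
    """Pixel-wise redox ratio FAD / (NADH + FAD), bounded between 0 and 1."""
    fad = np.asarray(fad_intensity, dtype=float)
    nadh = np.asarray(nadh_intensity, dtype=float)
    # eps avoids division by zero in background pixels without any signal.
    return fad / (fad + nadh + eps)


# Hypothetical 2x2 autofluorescence channels (arbitrary units).
fad = np.array([[1.0, 2.0], [0.0, 3.0]])
nadh = np.array([[3.0, 2.0], [0.0, 1.0]])

ratio = optical_redox_ratio(fad, nadh)
print(ratio)
```

Such a ratio map could serve as an additional input channel for a digital staining model, alongside the raw intensities.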

Nonlinear techniques
Optical, nonlinear label-free contrast mechanisms described here include multiphoton microscopy (based on nonlinear AF and second harmonic generation -SHG) and Coherent Anti-Stokes Raman Scattering (CARS).
Although the nonlinear excitation process in multiphoton microscopy (MPM) is slightly different from the single-photon AF described above, most molecules displayed in Table 1 can also be excited with a corresponding two- or three-photon excitation. Compared to conventional fluorescence, which uses blue or UV light of around 400 nm, MPM uses longer wavelengths, typically in the range of 780-850 nm (two-photon process) or 1,100-1,300 nm (three-photon process). This avoids the strong scattering and absorption of biological tissues in the UV range and is not yet affected by the immense attenuation from absorption in water towards the far infra-red region. Therefore, MPM enables deeper tissue imaging than single-photon microscopy. Additionally, the signal generation is naturally limited to the confined focal volume, which obviates the need for a pinhole in the detection path. Most commonly, the native fluorophores NADH and flavins are used for label-free MPM [51]. Similar to single-photon autofluorescence, this signal was shown to be specific for certain cell types and/or functional states [52,53], making it a useful input contrast for digital staining. Furthermore, MPM naturally enables higher harmonic generation (second harmonic generation, SHG, or third harmonic generation, THG) as an additional contrast mechanism for imaging. SHG and THG are based on the electric field component of the incident light and the polarization properties of the sample. This electric field induces a directional polarization within the sample, which in turn induces the emission of a secondary wave at higher frequency. In contrast to fluorescence, SHG and THG do not experience a Stokes shift. This signal is very specific to structures within the sample that have the respective nonlinear susceptibility properties (i.e., $\chi^{(2)} > 0$ for SHG or $\chi^{(3)} > 0$ for THG). SHG, for instance, is specific for structures that lack inversion symmetry ($\chi^{(2)} > 0$), such as the biological molecules collagen, myosin and tubulin [54].
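The wavelength relations behind multiphoton excitation and harmonic generation reduce to simple photon-energy arithmetic: n photons of wavelength λ carry the same total energy as one photon of wavelength λ/n, and the harmonic signal is emitted at exactly λ/n because no energy is lost to a Stokes shift. The sketch below illustrates this with example wavelengths from the ranges quoted above; the function names are hypothetical.

```python
import math

HC_EV_NM = 1239.84  # Planck constant times speed of light in eV*nm (approximate)


def photon_energy_ev(wavelength_nm):
    """Energy of a single photon of the given wavelength."""
    return HC_EV_NM / wavelength_nm


def harmonic_wavelength_nm(excitation_nm, order):
    """Emission wavelength of the order-th harmonic (2 for SHG, 3 for THG)."""
    return excitation_nm / order


# Two 800 nm photons deliver the energy of one 400 nm photon.
two_photon_energy = 2 * photon_energy_ev(800)
single_photon_energy = photon_energy_ev(400)

shg = harmonic_wavelength_nm(800, 2)    # SHG signal at exactly half the wavelength
thg = harmonic_wavelength_nm(1200, 3)   # THG signal at exactly one third

print(two_photon_energy, single_photon_energy, shg, thg)
```

Two-photon autofluorescence, in contrast, is emitted at wavelengths longer than λ/2, which allows the two signals to be separated with spectral filters.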
A multi-modal microscopy system, including coherent anti-Stokes Raman scattering (CARS) at 2,850 cm⁻¹, SHG in forward direction and two-photon AF in backward direction [55], was used to demonstrate a computational transformation from images with label-free multi-modal contrast to images with an artificial H&E contrast [33]. This translation was later updated by using GAN models [32].
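To give a feeling for what probing a Raman shift of 2,850 cm⁻¹ means in terms of laser wavelengths, the sketch below applies the energy conservation of the CARS process: the Stokes beam is tuned so that pump and Stokes wavenumbers differ by the Raman shift, and the anti-Stokes signal appears at twice the pump wavenumber minus the Stokes wavenumber. The pump wavelength of 800 nm is a hypothetical example and is not taken from the cited system [55].

```python
import math


def wavenumber_cm1(wavelength_nm):
    """Convert a vacuum wavelength in nm to a wavenumber in 1/cm."""
    return 1e7 / wavelength_nm


def cars_wavelengths_nm(pump_nm, raman_shift_cm1):
    """Return (Stokes, anti-Stokes) wavelengths for a given pump and Raman shift."""
    pump = wavenumber_cm1(pump_nm)
    stokes = pump - raman_shift_cm1          # Stokes beam is red-shifted by the Raman shift
    anti_stokes = 2 * pump - stokes          # signal is blue-shifted by the same amount
    return 1e7 / stokes, 1e7 / anti_stokes


stokes_nm, anti_stokes_nm = cars_wavelengths_nm(pump_nm=800, raman_shift_cm1=2850)
print(f"Stokes: {stokes_nm:.1f} nm, anti-Stokes: {anti_stokes_nm:.1f} nm")
```

Since the anti-Stokes signal is blue-shifted relative to both excitation beams, it can be separated from them with a short-pass filter.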

Photoacoustic microscopy (PAM)
Photoacoustic imaging is based on the photoacoustic effect [56] and detects sound propagation upon laser excitation of the most prominent absorbers in tissue [57]. Thus, PAM promises high molecular specificity for molecules that have a high absorption coefficient, such as hemoglobin, water, melanin and collagen [57]. As ultrasonic scattering in tissue is typically weaker than optical scattering, photoacoustic microscopy can produce absorption images at greater depths than traditional microscopy techniques, which makes it suitable for a variety of in vivo studies [58]. Digital staining of PAM images was demonstrated for FFPE brain sections [59,60] and skin sections [61,62].

Target domain: biochemical stains as ground-truth
While the previous chapter on Input domains discusses label-free optical imaging techniques that are mostly used as input images for digital staining, this chapter presents a similar analysis for artificial staining methods that are usually used as target images for digital staining. Here, we have grouped the typical staining methods into histological H&E staining, immuno-histochemical staining (IHC) and immuno-fluorescence staining (IF).

Histological staining
In standard histopathology, tissue samples are most often analyzed with respect to their morphological appearance. Due to the low contrast of thin tissue sections under conventional light microscopes, histopathology relies on artificial staining to evaluate tissue morphology. The combination of hematoxylin and eosin staining (H&E) is the most widely used in histopathology and serves as gold standard for most medical diagnoses of tissues. Generally, tissue biopsies are first extracted, using techniques such as strip biopsy [63], endoscopic pincer grasping instruments [64] or ligating devices [65]. These samples are then fixed, embedded and sectioned. Typical fixation media are based on formaldehyde, while some techniques use zinc or alcohol/acetone, sometimes with the addition of picric acid, mercuric chloride or sodium acetate [66]. There are two main approaches for tissue embedding: embedding in paraffin [67] or snap freezing in optimal cutting temperature compound [68]. Each of these procedures comes with certain procedural requirements and different time durations. The most common technique for fixation and embedding is the use of formaldehyde for fixation and paraffin for embedding, leading to the gold standard of tissue preparation, formalin-fixed paraffin-embedding (FFPE). Depending on the type of embedding, the samples are then sectioned by cryotomes or microtomes into thin slices, typically between 3 and 10 µm. Finally, these sections are mounted on glass slides and stained. In the case of H&E staining, following Cardiff et al. [69], paraffin tissue sections are first cleared of paraffin in baths of xylene (three changes for 2 min per change), then hydrated by ethanol baths (three changes of 100% ethanol for 2 min per change, transfer to 95% ethanol for 2 min, transfer to 70% ethanol for 2 min) and rinsed in running tap water (2 min) [69]. Afterwards, the tissue sections are stained in hematoxylin solution (3 min), washed again in running tap water (5 min) and then stained with eosin (2 min) [69]. The samples are dehydrated (dipping in 95% ethanol, transfer to 95% ethanol for 2 min, two transfers to 100% ethanol for 2 min per change) and cleared in three changes of xylene (2 min per change) [69]. Thereby, hematoxylin stains cell nuclei and eosin stains extracellular matrix and cytoplasm. Finally, the stained tissue sections are sealed and preserved between a glass slide and a coverslip [69]. Thus, the staining protocol alone already accounts for at least 90 min, and the entire procedure from biopsy acquisition to microscopic images of the stained tissue sections can easily last multiple days or even weeks, when considering queuing times in the common laboratory workflow. In the current state of the art, H&E staining is the most common target stain for digital staining, as displayed in Fig. 2.

Immuno-histochemical staining (IHC)
Compared to the purely morphological approach of H&E staining in histopathology, the concept of immuno-histochemical staining (IHC) allows more specific antigen detection. Thereby, IHC goes beyond morphological analysis and fills the gap between classic histopathology (see section on Histological staining) and the molecular specificity of immuno-fluorescence staining (see section on Immuno-fluorescence staining (IF)) [66]. Similar to histopathology, IHC stains are usually applied to fixed tissue sections. In contrast to H&E, however, IHC stains are based on specific antibodies [66]. IHC can either use direct staining, where a primary antibody directly leads to a colored histochemical reaction, or indirect staining, where the primary antibody is combined with a secondary antibody. In the latter case, the primary antibody binds to the target epitope and the secondary antibody is loaded with a chromogen and binds to that primary antibody. Common examples of IHC include cell stainings, such as anti-CD3 or anti-CD20, or picro Sirius red staining for collagen [70]. Similar to the above mentioned histology stainings, IHC stains are most commonly used on FFPE tissue sections. IHC stains have regularly been used as targets for digital staining, for instance the human cancer marker Ki-67 antigen [31], Jones' stain [20,71-73], Masson's trichrome [20,34,71-77], picro Sirius red [78,79], orcein [78], Verhoeff van Gieson (EVG) stains [79] or periodic acid-Schiff (PAS) stain [34,76,80].

Immuno-fluorescence staining (IF)
The third category is the use of fluorescence markers for staining. This can either be achieved by a directly fluorescent marker, such as a fluorescently labeled primary antibody or a fluorescent dye (for instance the DNA stain DAPI) [81], or by using the established combination of primary antibodies against specific epitopes and a fluorescent secondary antibody. In the latter case, the primary antibodies are sometimes similar to those used in IHC. Although the boundary between IHC and IF staining can sometimes be blurry, we deliberately make this distinction, since IF can also be used with unfixed samples, like living cell cultures. Due to the toxicity of the fixation process and the physical sectioning of the samples, this is challenging or even infeasible using histology or IHC stainings. Therefore, IF enables a series of new applications for digital staining.
In addition to these exogenous molecular markers, fluorescence stains can also be encoded by genetic modification of the target organism to achieve expression of fluorescence markers in target components, e.g., in mitochondria [25]. Furthermore, IF stains are also being used in multiplexed fashion (see Fig. 4E) for multiplexed immunofluorescence (mpIF) [37,95].

Biochemical specificity of target stains
In digital staining, it is often overlooked that the biochemical binding specificity represents the fundamental uncertainty that defines the upper limit of trustworthiness of any digital staining model. Although most stains mentioned above are commonly used as 'gold-standard', they are actually not always standardized with respect to their biochemical binding specificity. In the case of H&E or IHC stains, the appearance of stained samples depends strongly on the type of stain solution, the exact staining protocol and the quality or age of dyes. This is especially the case for histology and IHC stainings, but also applies to many IF stains, like the common fluorescent DNA stain DAPI. In these cases, a standardized specificity value (commonly stated in %) is not available.
For IF stains, on the other hand, antibody manufacturers occasionally state reference measurements for specificity. However, it is still challenging to standardize the actual biochemical specificity values across different studies, as they are strongly affected by the precise biochemical conditions of the experiment and the environment, including pH value, different behavior in medium vs. in cells, ligand-buffer interaction, temperature or competing binding partners, to name a few. As displayed in Table 2, the stated specificity values can range between 66% and almost 100% for different target molecules. Moreover, this binding specificity can even vary for the same molecule, for instance if different antibodies target different binding sites (see the example of anti-tau antibodies in Table 2).
For most histological applications, this is completely acceptable, as long as the stain quality enables pathologists to count cells, determine diseased tissue and make a diagnosis. In the case of IF stains, careful calibration measurements can still enable quantitative analysis under standardized conditions. However, it is essential to consider the limited specificity of any target image instead of treating it as actual ground-truth and to regard digital staining as a model prediction that is fundamentally based on these limitations.

Computational models to transfer input images to target domain
As already mentioned, the development of image-to-image regression models, like U-Net [5], GANs [26] or cycle conditional GANs [7], fueled the field of digital staining over the past years. Together, these models are used in more than 60% of the articles reviewed here. A short overview of the basic principle of these models is displayed in Fig. 3 and elaborated upon below.

Pre-processing
Before image data can be used to train a digital staining model, several pre-processing steps are often required. Unless a common optical path is used (see chapter on Generation of paired images), digital staining usually requires image registration to ensure optimal pixel overlay between input and target images. As discussed in the chapter on Caution & pitfalls, this can lead to several challenges, for instance with sectioning artifacts when using consecutive sections to generate paired images. A detailed explanation of an image registration workflow for digital staining can be found in the work of Bai et al. [104], who used a combination of speeded up robust features (SURF) points, correlation-based elastic registration algorithms, trained registration models and pyramid elastic image registration algorithms. The generation of image patches is an additional pre-processing step that is very common. Especially when images are acquired from large-FOV whole-slide imaging (WSI) systems (see section on Wide-field (WF) microscopy), slide images are usually cropped into 2,000-20,000 image patches of 256 × 256 or 512 × 512 pixels each before training a digital staining model.
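The patch generation step can be illustrated with a minimal sketch. It assumes that input and target images are already registered and have identical height and width; the function name `extract_patches`, the non-overlapping stride and the random placeholder images are illustrative assumptions, not taken from any of the reviewed works.

```python
import numpy as np

def extract_patches(image, patch_size=256, stride=256):
    """Crop an (H, W) or (H, W, C) image into square patches.

    Border regions that do not fill a complete patch are discarded.
    """
    height, width = image.shape[:2]
    patches = []
    for top in range(0, height - patch_size + 1, stride):
        for left in range(0, width - patch_size + 1, stride):
            patches.append(image[top:top + patch_size, left:left + patch_size])
    return np.stack(patches)

# Hypothetical registered pair: label-free input and stained RGB target.
rng = np.random.default_rng(0)
input_image = rng.random((1024, 768))
target_image = rng.random((1024, 768, 3))

# Using the same grid for both images keeps the patches paired by index.
input_patches = extract_patches(input_image)
target_patches = extract_patches(target_image)
print(input_patches.shape, target_patches.shape)
```

Because the same grid is applied to both images, patch i of the input corresponds to patch i of the target. A stride smaller than the patch size would yield overlapping patches and therefore more training examples from the same slide.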

Linear color-coding methods for stain transformation
Training of data-driven machine learning models is the current method of choice to transfer style and color from input into target images. However, especially earlier studies also used simpler mathematical equations for color transfer that worked reasonably well, but were often not verified quantitatively on a separate validation data set [42,74,105-113]. Most of them follow a simple color-coding method, i.e., a linear mathematical model based on Gareau et al. [105], which was also applied to the previously mentioned OCT images [42]. Although most of these linear color-coding methods were applied in earlier implementations of DS, they were still used as recently as 2022 [30].
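To make the idea of a linear color-coding model concrete, the following sketch maps two label-free channels to a pseudo-H&E RGB image by subtracting stain-colored contributions from a white background. The exact equation and coefficients used in [105] are not given in this section, so the functional form, the function name `linear_pseudo_he` and the RGB stain colors below are assumptions for illustration only.

```python
import numpy as np

# Assumed RGB colors (range 0-1) for the two stains; illustrative values only.
HEMATOXYLIN_RGB = np.array([0.30, 0.20, 1.00])
EOSIN_RGB = np.array([1.00, 0.55, 0.88])

def linear_pseudo_he(nuclear, stromal):
    """Linearly color-code two normalized (H, W) channels into an RGB image.

    `nuclear` is a channel with nuclear contrast, `stromal` a channel with
    cytoplasm/matrix contrast. A pixel without signal stays white.
    """
    nuclear = np.clip(nuclear, 0.0, 1.0)[..., None]
    stromal = np.clip(stromal, 0.0, 1.0)[..., None]
    rgb = 1.0 - nuclear * (1.0 - HEMATOXYLIN_RGB) - stromal * (1.0 - EOSIN_RGB)
    return np.clip(rgb, 0.0, 1.0)

# Two pixels: background (no signal) and a purely nuclear pixel.
nuclear_channel = np.array([[0.0, 1.0]])
stromal_channel = np.array([[0.0, 0.0]])
pseudo_he = linear_pseudo_he(nuclear_channel, stromal_channel)
print(pseudo_he)
```

Such a model has no trainable parameters beyond the chosen stain colors, which explains both its simplicity and the need for a separate quantitative validation.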

Feature engineering and classical machine learning
In the next phase of digital staining models, researchers quantified engineered image features and exploited them in classical machine learning models. For instance, k-nearest neighbor [114], spectral angle mapper (SAM), nearest neighbor (NN), nearest mean classifier (NearMean) [115], random forest [31,116] or partial least squares regression (PLS) [33] were used for digital staining problems. Although these approaches require more human-supervised feature extraction and prior knowledge, they can perform very robustly and often generalize well across different data sets from the same sample under different imaging systems. On the other hand, it is often challenging to transfer them to other samples, and they can be more labor-intensive than deep learning methods.
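As an example of such a classical approach, the sketch below assigns each pixel spectrum to the reference class with the smallest spectral angle, which is the basic idea behind SAM. The reference spectra, class names and function names are hypothetical and only serve to show the principle.

```python
import numpy as np

def spectral_angle(spectrum, reference):
    """Angle (in radians) between two spectra, independent of their scaling."""
    cosine = np.dot(spectrum, reference) / (
        np.linalg.norm(spectrum) * np.linalg.norm(reference)
    )
    return np.arccos(np.clip(cosine, -1.0, 1.0))

def classify_sam(pixels, references):
    """Assign each spectrum in `pixels` (N, B) to the closest reference class."""
    names = list(references)
    angles = np.array(
        [[spectral_angle(pixel, references[name]) for name in names] for pixel in pixels]
    )
    return [names[index] for index in angles.argmin(axis=1)]

# Hypothetical three-band reference spectra for two tissue components.
references = {
    "nucleus": np.array([0.2, 0.3, 0.9]),
    "cytoplasm": np.array([0.9, 0.4, 0.5]),
}
# The first pixel is a dimmer nucleus spectrum, the second a brighter cytoplasm one.
pixels = np.array([[0.10, 0.15, 0.45], [1.8, 0.8, 1.0]])
labels = classify_sam(pixels, references)
print(labels)
```

Since the angle ignores the overall magnitude of a spectrum, the classification is insensitive to brightness differences, which is one reason why such engineered features can transfer well between imaging systems.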

Deep learning
Deep neural networks (DNNs) refer to neural networks with multiple layers, allowing for the extraction of increasingly abstract features from input data. While early models in the 1940s and 1950s were limited in their ability to learn from data and to scale to larger and more complex problems, the development of backpropagation in the 1980s sparked renewed interest in deep learning (DL). However, computational limitations prevented training of neural networks with many layers, and progress in DL was slow. The emergence of faster and more powerful processors, along with the availability of large amounts of labeled data, led to a resurgence of interest in DL in the early 2000s.

Convolutional neural networks (CNNs)
Researchers began to develop more sophisticated neural network architectures, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), that could learn from complex and high-dimensional data, such as images, leading to image recognition using a so-called deep convolutional neural network (DCNN) [117]. The employed convolutional layers are particularly well-suited for image data, as they use a set of learnable filters that are convolved over the image, detecting various features such as edges, corners, or textures. Since then, DL has become one of the most active areas of research in artificial intelligence (AI). DL has been used for many machine learning tasks on images, including classification, regression, and segmentation. The most popular DL architecture for image segmentation is the U-Net, a fully convolutional neural network that was introduced in 2015 by Ronneberger et al. [5]. The U-Net consists of an encoder network and a decoder network. The encoder network consists of several convolutional and pooling layers that decrease the spatial dimension of the input images while simultaneously increasing the number of feature maps. The decoder network is made up of convolutional and upsampling layers that restore the spatial dimensions of the resulting segmentation map and simultaneously decrease the number of feature maps. The U-Net utilizes skip connections to combine low-level features from the contracting path with high-level features from the expanding path in order to preserve spatial resolution in the output.
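The data flow of the encoder, decoder and skip connections can be traced with a small NumPy sketch. It is not a trainable U-Net: the learned convolutions are replaced by a random 1 × 1 channel mixing (the hypothetical helper `mix_channels`), so the example only illustrates how spatial size, number of feature maps and skip connections interact. All names, channel counts and the depth are assumptions.

```python
import numpy as np

def downsample(x):
    """2x2 average pooling on a (C, H, W) array with even H and W."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def upsample(x):
    """Nearest-neighbor 2x upsampling on a (C, H, W) array."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def mix_channels(x, out_channels, rng):
    """Stand-in for a learned convolution: random 1x1 channel mixing + ReLU."""
    weights = rng.standard_normal((out_channels, x.shape[0])) / np.sqrt(x.shape[0])
    return np.maximum(np.tensordot(weights, x, axes=1), 0.0)

def tiny_unet(image, base_channels=8, depth=2, out_channels=3, seed=0):
    rng = np.random.default_rng(seed)
    x = mix_channels(image, base_channels, rng)
    skips = []
    # Encoder: halve the spatial size, double the number of feature maps.
    for _ in range(depth):
        skips.append(x)
        x = mix_channels(downsample(x), x.shape[0] * 2, rng)
    # Decoder: restore the spatial size, reduce the number of feature maps.
    for skip in reversed(skips):
        x = mix_channels(upsample(x), skip.shape[0], rng)
        x = np.concatenate([skip, x], axis=0)  # skip connection
        x = mix_channels(x, skip.shape[0], rng)
    return mix_channels(x, out_channels, rng)

label_free_patch = np.random.default_rng(1).random((1, 64, 64))
stained_prediction = tiny_unet(label_free_patch)
print(stained_prediction.shape)
```

For digital staining, the final layer produces image channels (here three, as for an RGB stain) instead of a segmentation map, so the same architecture is used for image-to-image regression.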

Generative models
Generative adversarial networks (GANs) have revolutionized the field of DL by enabling the generation of realistic data samples. The first GAN was proposed by Goodfellow et al. in 2014 [26], and consisted of one generator and one discriminator. The generator produces fake data samples, while the discriminator tries to distinguish between real and fake data samples. The training process involves the two networks playing a min-max game, with the generator trying to fool the discriminator into classifying its fake samples as real, while the discriminator tries to correctly classify the samples. Although GANs often work well in practice, there is no guarantee that the generator will produce images that actually resemble the training data. To address this issue, researchers have proposed various modifications to the GAN architecture, such as the conditional GAN, which provides additional conditioning information, such as class labels, to the generator and the discriminator [26], and the CycleGAN, which consists of two generators and two discriminators [6]. Within this GAN family, the conditional GAN architecture of 'Pix2Pix' is the most commonly used for digital staining [25,61,62,79,82,120-124].
The CycleGAN has gained popularity in recent years due to its ability to translate between different modalities without the need for paired datasets with labels. Instead, the CycleGAN uses a cycle consistency loss to ensure that the generated output is consistent with the input data [6]. This has enabled researchers to apply the CycleGAN to a wide range of tasks, such as predicting H&E stain from photoacoustic microscopy images [125] and improving periodic-acid-Schiff-stained renal tissue for whole slide image segmentation [126]. In addition, translations between different stains have been proposed, like transferring between Papanicolaou and Giemsa stains [127]. Moreover, CycleGAN approaches have also been used to predict color brightfield images and antibody-conjugate-stained mouse kidney slides from monochromatic phase images reconstructed with Fourier ptychography [29]. In general, CycleGAN approaches have proven to be versatile and flexible, allowing for the translation between various modalities and making it easier to acquire the data required for medical diagnostics and research. One recent development in the field is the introduction of saliency maps, which have been used to improve the performance of unsupervised models for image transformation tasks. For example, an unsupervised model named Unsupervised content-preserving Transformation for Optical Microscopy (UTOM) uses a saliency constraint to learn the mapping between different histology stains [50].
One of the major challenges with GANs is the problem of "hallucination" [128]. Hallucination occurs when the generator produces synthetic data that do not correspond to the input data distribution. In other words, the discriminator is still fooled by synthetic data that either show realistic-looking artifacts (e.g., a digitally stained cell, when there is no actual cell in that region) or omit features (e.g., cells) that are actually present in the real data. This problem can arise when the training data are limited or when the input data are highly variable. Hallucination can be problematic, particularly in the medical domain. It is difficult to detect when a GAN is hallucinating, as the synthetic data may look plausible to the human eye. To mitigate the problem, researchers have proposed various techniques, such as incorporating regularization terms in the GAN loss function [129], using pre-training of the generator [26], or using multiple discriminators [130]. However, the problem of hallucination remains a challenging issue in GAN training, particularly when working with limited or highly variable data, as discussed below.
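The min-max game can be written as a value function V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))], which the discriminator tries to maximize and the generator tries to minimize. The sketch below evaluates this value function for given discriminator outputs. It assumes that the discriminator returns probabilities between 0 and 1; the function names and the small constant `eps` for numerical stability are illustrative choices.

```python
import numpy as np

def gan_value(d_real, d_fake, eps=1e-12):
    """V(D, G) for discriminator outputs on real and on generated samples."""
    return np.mean(np.log(d_real + eps)) + np.mean(np.log(1.0 - d_fake + eps))

def discriminator_loss(d_real, d_fake):
    """The discriminator maximizes V, i.e., it minimizes -V."""
    return -gan_value(d_real, d_fake)

def generator_loss(d_fake, eps=1e-12):
    """The generator minimizes the only term of V that depends on it."""
    return np.mean(np.log(1.0 - d_fake + eps))

# A confident discriminator: real samples scored high, fake samples low.
confident = gan_value(np.array([0.9, 0.95]), np.array([0.1, 0.05]))
# A fooled discriminator: it cannot tell real from fake (all scores 0.5).
fooled = gan_value(np.array([0.5, 0.5]), np.array([0.5, 0.5]))
print(confident, fooled)
```

When the discriminator can no longer distinguish real from generated samples, all outputs are 0.5 and the value function equals -log 4, which is lower than for a confident discriminator. In practice, many implementations replace the generator term by the non-saturating variant -log D(G(z)) to obtain stronger gradients early in training.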

Loss functions
For deep learning-based digital staining, the selection of the loss function is one of the most important aspects of neural network design. Similar to other DL applications, the most commonly used loss functions are the mean absolute error (MAE) or L1 loss, the mean squared error (MSE), which takes the L2-norm penalty, and cross-entropy. The innovations in the fields of convolutional networks, the U-Net and generative models were accompanied by a series of new quantitative metrics for image similarity, such as Wasserstein loss [131], structural similarity index (SSIM) [132] or multi-scale SSIM (MS-SSIM) [133]. In addition to MSE and MAE, these metrics are used frequently for training and/or performance evaluation in digital staining. However, employing a single loss function may lead to performance degradation. MAE keeps brightness and color unchanged, but assumes that the influence of noise and the local characteristics of the image are independent [82], while MSE tends to generate blurry results [134]. SSIM became popular since it tends to produce results that are closer to human visual perception in terms of brightness, contrast, structure, and resolution [93]. Additionally, SSIM can detect high-level structural errors [50]. However, we note that the MS-SSIM loss can lead to brightness changes and color deviations [82]. Therefore, there is a range of customized metrics designed for specific tasks, as well as combinations of multiple basic loss functions [50,94,135,136]. In Table 3, we summarize formal definitions and featured references for the most common loss functions.
The invention of GANs and their wide use for digital staining, as discussed above, require more complex adversarial loss metrics that are composed of a generator loss and a discriminator loss. CycleGAN models [7] typically contain two terms: the adversarial loss, which quantifies the style match between target and generated images, and a cycle consistency loss $\mathcal{L}_{cyc}(G, F)$, which prevents the learned mappings G and F from contradicting each other. Additional losses are also often incorporated into these basic terms, e.g., for regularization purposes.
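The basic losses named above, and a weighted combination of them, can be sketched as follows. For brevity, the SSIM is computed once over the whole image, whereas common implementations evaluate it in local windows and average the result. The constants 0.01 and 0.03 are the conventional defaults for the stabilization terms, and the weighting factor `alpha` as well as all function names are assumptions for illustration.

```python
import numpy as np

def mae(output, target):
    return np.mean(np.abs(output - target))

def mse(output, target):
    return np.mean((output - target) ** 2)

def ssim_global(output, target, data_range=1.0):
    """SSIM computed over the whole image (simplified, single window)."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_o, mu_t = output.mean(), target.mean()
    var_o, var_t = output.var(), target.var()
    cov = np.mean((output - mu_o) * (target - mu_t))
    return ((2 * mu_o * mu_t + c1) * (2 * cov + c2)) / (
        (mu_o ** 2 + mu_t ** 2 + c1) * (var_o + var_t + c2)
    )

def combined_loss(output, target, alpha=0.5):
    """Weighted sum of MAE and the structural dissimilarity (1 - SSIM)."""
    return alpha * mae(output, target) + (1 - alpha) * (1.0 - ssim_global(output, target))

rng = np.random.default_rng(0)
target = 0.8 * rng.random((32, 32))
brighter = target + 0.1                        # constant brightness offset
noisy = target + 0.1 * rng.standard_normal((32, 32))

for name, output in [("brighter", brighter), ("noisy", noisy)]:
    print(name, mae(output, target), mse(output, target), ssim_global(output, target))
```

The constant brightness offset produces a clear MAE of 0.1 but leaves the image structure untouched, whereas the noisy image degrades the structural term. Combining several losses lets a model be penalized for both kinds of deviation.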
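The cycle consistency term can be illustrated with toy mappings in place of the two generator networks. The sketch assumes the common L1 form of the loss, in which an image translated to the other domain and back should reproduce itself; the linear toy functions `G`, `F_exact` and `F_wrong` are hypothetical stand-ins for trained generators.

```python
import numpy as np

def cycle_consistency_loss(G, F, x_batch, y_batch):
    """L1 cycle loss: x -> G(x) -> F(G(x)) should return x, and vice versa."""
    forward_cycle = np.mean(np.abs(F(G(x_batch)) - x_batch))
    backward_cycle = np.mean(np.abs(G(F(y_batch)) - y_batch))
    return forward_cycle + backward_cycle

# Toy "generators": G maps the input domain to the target domain.
G = lambda x: 2.0 * x + 1.0
F_exact = lambda y: (y - 1.0) / 2.0   # exact inverse of G
F_wrong = lambda y: y / 2.0           # contradicts G

rng = np.random.default_rng(0)
x_batch = rng.random((4, 16, 16))     # unpaired input-domain images
y_batch = rng.random((4, 16, 16))     # unpaired target-domain images

consistent = cycle_consistency_loss(G, F_exact, x_batch, y_batch)
contradicting = cycle_consistency_loss(G, F_wrong, x_batch, y_batch)
print(consistent, contradicting)
```

Note that the two batches do not need to show the same samples, which is why this loss can be used with unpaired data. A low cycle loss only states that the two mappings are mutually consistent; it does not by itself guarantee that G produces the biochemically correct stain.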

Generation of paired images
Digital staining relies on paired images. Although some GAN-based techniques use unpaired data sets for unsupervised training, the majority of articles reviewed here still rely on paired images for supervised learning. Moreover, due to the "hallucination-gap" mentioned above, we postulate that paired images are a necessity at least for a trustworthy validation and performance evaluation of a given digital staining model.
The generation of these paired input and target images is an important consideration in the practical implementation of digital staining. With the exception of earlier studies that used mathematical equations for color / style transfer [42,74,105-113] and most recent techniques that use semi-supervised or unsupervised ML models [23,29,32,50,125,137], most approaches use paired images of input and target space for training. At the very least, a truthful validation of the output of trained digital staining models still requires paired images, even for conventional linear color translation or for modern unsupervised learning. Therefore, the process of sample preparation, staining protocol and sequence of imaging is also important for digital staining. Here, we have identified five main procedures, as displayed in Fig. 4:
• cutting consecutive tissue sections from a block of FFPE tissue, and imaging each with a different device (e.g., one for label-free input and one for actual staining as target image).
• the sample is first stained and then imaged consecutively by two different techniques (e.g., one for input and one for target imaging).
• the unstained sample is first imaged for the (label-free) input domain and is then stained for the target image domain.
• the choice of input contrast and target contrast allows for spectral separation between input and target images in the same shared optical path.
• multiplex-staining or de- and re-staining. Here, the same sample is imaged multiple times with multiple different staining techniques. A previous set of stains is chemically removed or bleached, before the next set is applied.

Table 3 Selected loss functions used for digital staining (DS) with featured references, not including adversarial losses. O(x, y) represents the output image, Ô(x, y) represents the target image, N is the number of pixels, µ represents the average value, σ represents the standard deviation (σ_{OÔ} the covariance of output and target), and c_1 and c_2 are stabilization constants used to prevent division by a weak denominator

Loss function | Formal definition | DS References
Mean Absolute Error (MAE) | $\frac{1}{N}\sum_{x,y} \left| O(x,y) - \hat{O}(x,y) \right|$ |
Mean Squared Error (MSE) | $\frac{1}{N}\sum_{x,y} \left( O(x,y) - \hat{O}(x,y) \right)^2$ |
Cross-entropy | $-\frac{1}{N}\sum_{x,y} \hat{O}(x,y) \log O(x,y)$ |
Structural Similarity Index Measure (SSIM) | $\frac{(2\mu_{O}\mu_{\hat{O}} + c_1)(2\sigma_{O\hat{O}} + c_2)}{(\mu_{O}^2 + \mu_{\hat{O}}^2 + c_1)(\sigma_{O}^2 + \sigma_{\hat{O}}^2 + c_2)}$ | [29,59]
The unique advantages and disadvantages of these techniques are summarized in the table in Fig. 4. Please note that these limitations are only relevant for the generation of image pairs and, therefore, for the development/training and the verification of single digital staining models. While early studies relied mostly on working with consecutive sections (Fig. 4A), imaging of the same tissue section is actually the preferred method to remove sectioning artifacts between input and target. Ideally, staining of the target contrast is performed after input imaging, although a few niche applications used a workflow where the sample was first stained (Fig. 4B). Whenever different imaging platforms are used sequentially (Fig. 4A-C), image registration is still essential (see section on Pre-processing). In contrast, techniques with a shared optical path can almost omit the need for image registration, while also enabling digital staining of cell cultures without tissue sectioning artifacts for continuous digital staining of processes like cell growth and cell-to-cell interaction (Fig. 4D). Multiplexing of the staining protocol by using de-staining and re-staining protocols (Fig. 4E) can maximize the number of stainings obtained from a given sample (tissue section or cell culture).
The immediate goal of digital histopathology staining of tissue sections (sometimes termed 'pseudo-H&E' staining) is to facilitate a wider use of label-free optical technologies by physicians and biomedical researchers, as it allows analysis of label-free images by a pathologist in the well-known and accepted histology image domain [17,33]. Furthermore, it could allow the use of routine image analysis protocols that have been developed for conventional stainings, e.g., for surgical margin analysis [60] or white blood cell identification in blood smears [124]. Digital staining of tissue sections is often used for pathological evaluation of disease scores [104].
Compared to virtual histology staining of tissue sections and blood smears, digital staining of cell cultures offers entirely new research applications that could otherwise not be investigated. IHC stainings are infeasible, especially if cells need to be kept alive. Even fluorescent antibody stains can interfere with biological processes if their molecular size is large. A common application for digital staining of cell cultures is the distinction between live and dead cells using label-free imaging [21,50,84,87,92]. Digital staining is also frequently applied to neurons [21,50,83,86], where functional information from living cells is especially interesting and where actual staining can be particularly challenging. The combination of label-free imaging and digital staining allowed the simultaneous use of an AI-based nucleus finding algorithm and an additional tracking algorithm, which was not possible with the traditional method, as fluorescent tracking can affect cell behavior [89]. This concept of combining digital staining with object detection, e.g., for nuclei detection, is also used in other applications [154]. As already mentioned, digital staining of phase microscopy images enabled investigation of cell growth and cell division, where the translation model was trained on samples concurrently stained for the G1 and the S stage of the cell cycle [24]. The overlap of both signals could then indicate the G2 or M stage [24]. The concept was even extended to infer not only the staining procedure, but also the 3D optical sectioning capability of confocal fluorescence microscopy based on non-confocal 3D quantitative phase images [98]. The approach was generalized for different fluorescence channels, different cell types and different magnifications [98].

Trends & methods of good practice
Pillar and Ozcan identified several key advantages of virtual staining over actual staining [18], like a reduced time to perform staining, minimal manual labor, minimal stain variability, less hazardous waste from tissue fixatives, preservatives and staining dyes, no tissue disruption of the actual sample, no restrictions on using multiple stains on a single slide, the chance to perform stain-to-stain transformation and a reduced chance for technical failures [18]. Most of these advantages also generally apply to digital staining, as it is discussed here. An important addition to the field is the use of digital staining for cell cultures (both fixed and living cells), as discussed above. In this case, digital staining offers additional advantages, such as an identification of functional stages (e.g., growth phase) without the biochemical binding of actual antibody stains, which could otherwise interfere with biological homeostasis and impact motion, growth or other aspects of relevance.
As digital staining was refined over the years, we can identify certain trends in this field (see Fig. 5 and supplementary material Figs. S1 and S2). While the first techniques mostly used linear color translation for pseudo-H&E staining, computational tools became more powerful and the applications became more diverse over time. Nowadays, DL models like the U-Net (since 2015) or GAN models (since 2016) are the most common models used for digital staining. At the same time, the range of applications has significantly expanded from histological tissue sections to cell cultures (since 2018), multiplexed cell imaging (since 2018), or even the advanced examples mentioned above, like live cell growth imaging [24] or inference of 3D confocal fluorescence from non-confocal phase images [161]. Similarly, the applied input imaging technologies diversified over time. Label-free modalities, like phase contrast (20/105 articles), widefield (17/105 articles) and single-photon autofluorescence (12/105 articles) microscopy, are still the most frequently used, making good use of digital staining to add computational specificity to these label-free technologies. However, digital staining is also used for stain-to-stain translations, e.g., with artificial stains (H&E with 10/105 or IF with 12/105 articles) as input instead of targets.
Based on these ongoing trends and the current state-of-the-art, we suggest the following methods of good scientific practice when developing digital staining. Since each of these topics is an entire field of research in itself, we will only briefly address their relevance to the field of digital staining.
• General feasibility: As with most ML problems, one should consider first whether the information content in the data, i.e., the input domain, is believed to be sufficient for the given task. More specifically, a good first question is whether the general information in the input images is correlated with that in the target domain. For instance, it might seem infeasible to digitally stain cell nuclei (target) from images that only contain fluorescence of a membrane marker as input, if no additional information was used. On the other hand, it would seem quite feasible to perform DS of nuclei and membrane markers based on phase contrast images, as the contrast in optical phase is high for both nuclei and membranes. If a paired data set is already available, we suggest testing the general feasibility first by developing a model for simpler tasks, such as patch classification, object detection, or semantic segmentation.
• Report uncertainties: One of the main shortcomings of the current state-of-the-art for digital staining is that fundamental uncertainties in input and in target data are usually not reported. As presented in this review, however, DS is a holistic approach that involves the entire pipeline of biology, imaging and ML. Simply reporting a performance metric of the ML task is therefore insufficient, as those metrics assume a perfect ground-truth. However, target data from actual biochemical staining are always affected by the specificity of the molecular marker as fundamental uncertainty in the 'ground truth' (see section on Biochemical specificity of target stains). Similarly, imaging of inputs and targets is subject to the specific contrast mechanism, resolution and SNR of the respective imaging technology. Thus, we propose that digital staining should always be embedded in the context of input and target uncertainties of the actual stain as well as the SNR of the imaging process to allow a fair evaluation of its performance.
• Generalizability: There is a generalization gap [162] in DL and digital pathology, which also applies to digital staining. DS can often be very hardware-specific and can be prone to over-fitting. Therefore, it is essential to discuss generalizability. Ideally, one should take a hardware-agnostic approach when testing a DS pipeline.
[29], dark field reflectance and autofluorescence (DRUM) [30] or complementary nonlinear techniques, like CARS, SHG and two-photon AF [31-33].

Caution & pitfalls
As antithesis to the methods of good practice discussed above, we identify certain pitfalls that can reduce the overall success of digital staining (i.e., prediction performance, robustness, validity, computation time, and required number of examples). Generally, we consider the error analysis of digital staining to be not fully developed yet. While most articles in the field do a very good job of reporting a growing collection of ML performance metrics, a holistic error analysis of the entire process is not part of the state-of-the-art. We postulate that error analysis for digital staining should include modeling uncertainties (ML performance metrics, training curves, but also errors from pre-processing, e.g., image registration), biological uncertainties (binding specificity, purity of cell cultures, contamination, bleaching of fluorescence), as well as optical uncertainties (contrast, resolution, SNR). Moreover, high uncertainties in input contrast and target 'ground truth' will remain undetected if data from the same general population (e.g., the same target stain and same imaging system) are used for the validation of predictions and for the overall performance evaluation.
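One practical way to avoid validating on data from the same general population is to split image patches by acquisition group rather than at random, so that all patches from a held-out imaging system (or slide, or staining batch) end up in the test set. The following sketch illustrates this idea; the metadata fields, scanner names and the function `split_by_group` are hypothetical.

```python
import random

def split_by_group(samples, group_key, test_fraction=0.25, seed=0):
    """Hold out complete groups (e.g., imaging systems) for testing."""
    groups = sorted({sample[group_key] for sample in samples})
    rng = random.Random(seed)
    rng.shuffle(groups)
    n_test = max(1, round(len(groups) * test_fraction))
    test_groups = set(groups[:n_test])
    train = [s for s in samples if s[group_key] not in test_groups]
    test = [s for s in samples if s[group_key] in test_groups]
    return train, test

# Hypothetical patch metadata: 5 patches from each of 4 imaging systems.
samples = [
    {"patch_id": f"{scanner}_{index}", "scanner": scanner}
    for scanner in ["scanner_A", "scanner_B", "scanner_C", "scanner_D"]
    for index in range(5)
]

train, test = split_by_group(samples, group_key="scanner")
print(sorted({s["scanner"] for s in train}), sorted({s["scanner"] for s in test}))
```

A performance drop between a random split and such a group-wise split gives a first indication of how hardware-specific a trained digital staining model is.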

Conclusion & future perspectives for digital staining
Despite this role as gold-standard, staining protocols face several challenges, such as a need for extensive, manual processing of samples, substantial time delays, altered tissue homeostasis, limited choice of contrast agents for a given sample, 2D imaging instead of 3D tomography and many more.Label-free optical technologies, on the other hand, do not rely on exogenous and artificial markers, by exploiting intrinsic optical contrast mechanisms, where the specificity is typically less obvious to the human observer.Over the past few years, digital staining has emerged as a promising concept to use modern deep learning for the translation from optical contrast to established biochemical contrast of actual stainings.In this following chapter, we present potential future trends and challenges, as well as our view on the broader impact on clinical diagnostics, research, and biotechnology.Generally, medical diagnostics in remote and resource-limited settings would greatly profit from a low-cost, stainless approach like digital staining.When applied to simple and robust systems, like portable white-light or phase contrast microscopes, this could enable reasonable diagnostic yield from inexpensive hardware.On the other hand, labelfree technologies, like MPM, CARS, PAM, FPM and others, are growing fields of research in high-income contries, and yet digital staining is currently still under-investigated for these emerging techniques.Thus, we foresee a further implementations of digital staining for these more advanced optical contrast mechanisms.Furthermore, we believe that the input and target images with rely more on multiple different stains and/ or mpIF, which was shown to have a higher accuracy in diagnostic prediction as compared to single stainings [164].
In the branch of ML models that are used for digital staining, several innovations can be imagined.For once, multi-task learning is an emerging concept that is being used for ML in optical microscopy.As it was already realized for digital staining with auxiliary tasks [39], it will likely become more relevant for this field in the future.The concept of multi-domain image translation, i.e., training a single model to learn mappings among multiple domains, was already implemented in a larger number of publications [22,72,73,82,102].In a similar fashion, physics-informed learning and integration of prior information or simulation data into the learning process are interesting concepts in modern ML research.Since these are well-suited to increase robustness and generalizability, they would probably be able to address several challenges in modern digital staining, as in the case of hallucinations.This concept has not really found its way to digital staining yet, except for publications that employed simulations to improve the training process [25] or that modeled the microscope's point spread function in the learning of an adversarial neural network to improve digital staining [25].The trend in the development of new ML models, from classical ML over DL to GAN models, is likely to continue and to produce entirely new concepts for ML models.One potential candidate is Adversarial Diffusion Models.These are already used to translate between MRI and CT data [165], which is a very similar problem to digital staining in optical microscopy.
The continuation of current trends as well as potential innovations in the field will very likely result in a series of exciting new applications for digital staining. Although digital staining of histology sections has been shown to facilitate easier, faster and potentially more accurate clinical diagnosis in several research publications, full FDA approval as a medical product will remain challenging due to extensive documentation requirements and current technical limitations. We believe that this technique is currently more interesting for cell cultures, as discussed in this review. Since this use case does not involve sensitive patient data or critical decisions on clinical diagnosis, commercialization in the biotechnology sector might be more feasible. The technique of 3D fluorescent labeling based on phase microscopy has already been patented [118]. Following this trend, digital staining could potentially be used for organoids, which have gained a lot of popularity in recent years.
In the long-term future, however, clinical applications of digital staining would not be limited to tissue sections but could become a vital tool for clinical in vivo imaging. Currently, DL is already used to improve image quality in endomicroscopes [166], and endoscopic or endomicroscopic implementations are already available for many imaging technologies and optical contrast mechanisms mentioned in this review. This next step of digital staining, however, needs to be accompanied by the design of more robust, generalizable and interpretable models, as discussed above. This point was also identified by Jiang et al., who mentioned the problems of variable clinical factors regarding imaging microscopes, staining techniques, patch extraction, and selection and stated that "To address this issue, designing more robust architectures can make the model less dependent on data quality in digital medicine" [14].

Literature review
For the literature database in this review, 108 articles published between 2005 and January 2023 were reviewed and categorized. We considered peer-reviewed articles longer than two pages, excluding short conference abstracts and un-reviewed preprints.

Fig. 2
Fig. 2 Pairings of input and target contrast. a Target image contrast is plotted against the input contrast; the number of publications in each combination is color-coded. Selected examples in (b) are indicated by numbers in (a): (B1) translation of autofluorescence images from tissue slides to H&E images by Rivenson et al. [20], re-use permitted and licensed by Springer Nature. (B2) translation of phase contrast images of human neuron cells to specific fluorescence images (DAPI, anti-MAP2 and anti-neurofilament). Data available at https://github.com/google/in-silico-labeling from Ref. [21], re-use permitted and licensed by Elsevier and Copyright Clearance Center. (B3) translation of bright field images of cells to multiple fluorescence stains by Ounkomol et al. [22]. Images are publicly available at https://downloads.allencell.org/publication-data/label-free-prediction/index.html, re-use permitted and licensed by Springer Nature. (B4) translation of bright field images of living cells to genetically encoded mitochondria markers by Somani et al. [25]. Images are publicly available at https://doi.org/10.18710/11LLTW [35], re-use licensed under CC0 1.0. (B5) stain-to-stain translation of H&E images into cytokeratin stain by Hong et al. [36]. Images are publicly available at https://github.com/YiyuHong/ck_virtual_staining_paper, re-use licensed under CC BY 4.0. (B6) stain-to-stain translation of IHC images into different IHC images by Ghahremani et al. [37]. Images are publicly available at https://zenodo.org/record/4751737#.YV379XVKhH4, re-use permitted and licensed by Springer Nature. IHC = immuno-histochemical stain, IF = immuno-fluorescence stain, WF = wide field (white light illumination), AF = autofluorescence, PAM = photo-acoustic microscopy, IR = infra-red. An extended version of the detailed literature analysis can be found in the Supplementary material of this manuscript.

Fig. 3
Fig. 3 Computational models for digital staining. a The general supervised machine learning workflow for most digital staining implementations. Please refer to the main text for some examples that use an unsupervised workflow. b The most commonly used models: besides earlier implementations of color-coding with a linear contrast translation equation f(k) or feature engineering and classical ML, almost all modern digital staining implementations use deep learning with either CNN or GAN architectures (I = input image, T = target image, G = generator, I_g = generated image, D = discriminator).
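The symbols of panel b can be tied together by the objective that many image-to-image GANs optimize. As a hedged illustration only, the following is the conditional GAN objective with an additional L1 term (the pix2pix formulation), written with the symbols of the caption. The weight λ and the choice of an L1 term are assumptions here; individual digital staining models use different losses.

```latex
\begin{aligned}
I_g &= G(I), \\
\mathcal{L}_{\mathrm{GAN}}(G, D) &= \mathbb{E}_{I,T}\big[\log D(I, T)\big]
  + \mathbb{E}_{I}\big[\log\big(1 - D(I, I_g)\big)\big], \\
\mathcal{L}_{1}(G) &= \mathbb{E}_{I,T}\big[\lVert T - I_g \rVert_{1}\big], \\
G^{*} &= \arg\min_{G}\max_{D}\; \mathcal{L}_{\mathrm{GAN}}(G, D) + \lambda\, \mathcal{L}_{1}(G).
\end{aligned}
```

In this formulation, the discriminator D learns to distinguish real pairs (I, T) from generated pairs (I, I_g), while the L1 term keeps the generated image close to the target at the pixel level.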

Fig. 4
Fig. 4 Generation of image pairs for the training of digital staining models. A-E Schematic workflow of the five different procedures. The table shows positive features (green), neutral features (orange) and negative features (red).

Fig. 5
Fig. 5 Historical trends in the field of digital staining. a The total number of publications in the field. b All reviewed articles as parallel and linked categories. The year of each publication is color-coded. An interactive version of this plot is available as Supplementary HTML file.

Table 2
Examples of primary antibodies for immuno-fluorescence staining with reported binding specificity and features examples for digital staining (DS). MAP2 = Microtubule Associated Protein 2, HEK = human embryonic kidney, isoform specificity = "no detectable non-specific binding" [21,22,25,34,50,93,94,104,135,136,151]

a system across different imaging systems and/or different tissue types to evaluate if it generalizes well. This can further be extended to evaluate the generalizability across different experimenters, different staining methods or different data sets. See references [21,22,25,34,37] for good examples.
• Choice of the right loss function: After the selection of input and target technologies (which might be predefined for a given problem), the choice of the loss function is important. Different loss functions can emphasize different aspects of the image-to-image regression task, e.g., high-level structural errors (SSIM), absolute errors at the pixel level (peak signal-to-noise ratio, PSNR), brightness and color (MAE) or custom loss functions (see section on Loss functions for more details).
• Image inspection and decision visualization: Besides the mere reporting of loss curves and performance metrics, it is indispensable to visually inspect and report the actual target and prediction images. Although the above-mentioned loss functions are suited for training and quantitative performance comparison, some can be ill-suited to detect hallucinations [128], artifacts or other localized prediction errors in the images. Moreover, decision visualization, like occlusion maps, Shapley values or perturbation studies, can inform the researcher about features that are particularly important to the learning process. This can not only support debugging during the development of DS, but it can also offer valuable scientific feedback, e.g., to understand which parts of an input image are particularly relevant to predict a certain target.
• Interpretability: Similar to the point above, ML models can often lack interpretability, which prevents identification of biases and can thereby reduce generalizability. Interpretability is especially relevant for digital staining to prevent false hallucinations from overfitting. A good rule of thumb is that simpler models with a smaller number of parameters are more interpretable. Furthermore, it is preferable to create more interpretable models from the beginning instead of relying on post-hoc explanations of complicated models [163].
• Availability of code & data: Whenever possible, it is recommended to make code and data available to other researchers, according to the FAIR principle (Findability, Accessibility, Interoperability, and Reuse of digital assets). This enhances trustworthiness and transparency of the general scientific procedure and further enables other researchers to test new approaches, especially since good data sets of paired images might be a bottleneck for many ML researchers. Positive examples, where code and data were made public, are [21,22,25,34,50,93,94,104,135,136,151].
• Multi-modal imaging: Combinations of different contrast mechanisms for a richer information content can be more robust. Examples include FPM as a natural combination of amplitude and phase