On‐line untargeted metabolomics monitoring of an Escherichia coli succinate fermentation process

Abstract The real‐time monitoring of metabolites (RTMet) is instrumental for the industrial production of biobased fermentation products. This study shows the first application of untargeted on‐line metabolomics for the monitoring of undiluted fermentation broth samples taken automatically from a 5 L bioreactor every 5 min via flow injection mass spectrometry. The travel time from the bioreactor to the mass spectrometer was 30 s. Using mass spectrometry allows, on the one hand, the direct monitoring of targeted key process compounds of interest and, on the other hand, provides information on hundreds of additional untargeted compounds without requiring previous calibration data. In this study, this technology was applied in an Escherichia coli succinate fermentation process and 886 different m/z signals were monitored, including key process compounds (glucose, succinate, and pyruvate), potential biomarkers of biomass formation such as (R)‐2,3‐dihydroxy‐isovalerate and (R)‐2,3‐dihydroxy‐3‐methylpentanoate and compounds from the pentose phosphate pathway and nucleotide metabolism, among others. The main advantage of the RTMet technology is that it allows the monitoring of hundreds of signals without the requirement of developing partial least squares regression models, making it a perfect tool for bioprocess monitoring and for testing many different strains and process conditions for bioprocess development.

dissolved oxygen (DO) in the liquid phase, as well as the oxygen and carbon dioxide in the gas phase. Although many other parameters can be measured, such as turbidity, rheology, enzyme activity, or metabolite concentration (Harada et al., 2014), their monitoring is less common, especially at a large scale.
In a bioreactor, the biomass and bioprocess metabolites, such as substrates and products, are found in the liquid phase. The monitoring of these compounds has received a lot of attention in the last decades. Different technologies can be used to monitor the biomass, including optical density (turbidity), dielectric spectroscopy, microscopy, flow cytometry, fluorescence spectroscopy, calorimetry, and vibrational spectroscopy (Bayer et al., 2020; Broger et al., 2011; Grobbelaar, 2009; Kamiloglu et al., 2020; Müller et al., 2018; Sonnleitner, 2012), whereas the monitoring of metabolites can be achieved using high-performance liquid chromatography (HPLC), vibrational spectroscopy, nuclear magnetic resonance (NMR), enzymatic reactions, and mass spectrometry (MS) (Druhmann et al., 2011; Shalabaeva et al., 2017; Svendsen et al., 2015; Vann et al., 2017; Warth et al., 2010). As metabolites are at the final step of biological regulation, monitoring them provides the best picture of cellular phenotypes (Farrell et al., 2014; Fiehn, 2002). Metabolite and transcriptional changes can occur very rapidly. For instance, Xu et al. (2012) detected changes in glycolysis and tricarboxylic acid (TCA) cycle metabolites within 1-5 min of removing or changing the carbon source in the growth medium, and Lara et al. (2006) calculated the time required to synthesize one molecule of messenger RNA of the mixed-acid fermentation genes to be between 10 and 72 s. For this reason, monitoring the metabolites in a bioprocess via high-resolution time-course analysis enables the detection of these fast metabolic changes much earlier than conventional off-line analysis, which is usually sparse, with sampling intervals on the order of hours.
Bioreactor monitoring by HPLC has been implemented on-line, but it has some limitations such as time delays between samples of typically around 10 min and the requirement of a biomass filtration system to avoid column blocking (Koch et al., 2016; Koliander et al., 1990; Warth et al., 2010), limiting the analysis to extracellular metabolites.
Vibrational spectroscopy (especially near-infrared, mid-infrared, and Raman spectroscopy) has been implemented in-line and on-line, yielding very accurate monitoring models for several process compounds.
However, these vibrational spectroscopy technologies have certain limitations, the main one being that the spectra they generate are very convoluted, with many overlapping signals. This results in the need to use chemometric mathematical models such as partial least squares (PLS) regression to break down the different signals contributed by the different compounds in the mixture (do Nascimento et al., 2017; Li et al., 2018; Marison et al., 2012; Rodrigues et al., 2018; Stuart, 2005; Zu et al., 2017). These models require significant time and resources to build and are usually not transferable, that is, they are only applicable to the configuration used to build them (bioreactor, medium composition, strain, temperature, pH, etc.) (Marison et al., 2012; Pu et al., 2020; Roggo et al., 2007), making these monitoring techniques of limited use for early stages of bioprocess development, when the strain, process parameters, and media composition are often changed in an iterative manner (Baradez et al., 2018). Finally, due to the large signal overlap, these technologies tend to report only a few compounds from the mixture, usually the most abundant ones. There have been some examples in the literature using NMR for on-line fermentation monitoring (Kreyenschulte et al., 2015; Legner et al., 2019). Similar to vibrational spectroscopy, a limitation of NMR for bioprocess monitoring is the presence of overlapping peaks, which limits the number of compounds that can be detected and quantified, usually fewer than 10 (Brecker et al., 1999; Kreyenschulte et al., 2015; Majors et al., 2008).
Due to the increasing demand for tools to monitor metabolites, a range of commercial bioprocess analyzers has been developed in the last couple of decades. Some of these are based on enzymatic analysis, such as the Cedex Bio® Analyzer (Roche) and the BioProfile FLEX2 (Nova Biomedical) (Morris et al., 2021; Obaidi et al., 2021), while others use the so-called "miniaturized" MS analyzers, such as the MiD (Microsaic) and the Rebel (908 Devices) (Hamilton et al., 2014; Synoground et al., 2021), employing low-resolution quadrupole and ion trap mass analyzers, respectively (Blakeman & Miller, 2021; Hemida et al., 2021). However, these analyzers are still almost exclusively used at-line or off-line, which limits their monitoring potential, and they only target the predefined set of compounds dictated by the vendor reagents.
The aim of this study is to explore the use of untargeted metabolomics as a technology to monitor the metabolites present in the liquid phase of a bioreactor in real time. Metabolomics is the global analysis of small to medium size molecules (i.e., up to 1000-2000 Da) present in the metabolism of biological systems.
Current analytical methods and MS instrumentation allow for a generous compound coverage across different metabolic pathways, thus facilitating the interpretation of biological experiments. MS offers a much wider detection capacity than NMR, vibrational spectroscopy, the common detectors used with HPLC (refractive index [RI] and UV/Vis spectroscopy), and commercial enzymatic analyzers. It also has very high sensitivity and detects metabolites in a much less convoluted manner than NMR and vibrational spectroscopy, so laborious chemometric models such as PLS regression are not required. All these attributes make MS an attractive technology for bioprocess monitoring.
Despite these advantages, on-line metabolomics remains unexploited for bioprocess monitoring to date. Link et al. (2015) reported the use of metabolomics to monitor different organisms directly from a cultivation flask. However, the cells in these experiments were cultured in fermentation medium that was diluted up to eightfold, which is an impractical limitation for bioprocess monitoring. Plum and Rehorek (2005) reported an on-line MS system for analyzing nine azo dyes in a wastewater treatment process. However, this system contained a biomass filtration unit and targeted only nine compounds, thus limiting the vast detection capacity of MS.
In this study, untargeted metabolomics has been used for the first time, to our knowledge, for bioprocess monitoring. This technology was tested with a bench-top 5 L bioreactor using undiluted and unfiltered fermentation medium for the untargeted monitoring of 886 different intracellular and extracellular m/z signals of an Escherichia coli (E. coli) succinate fermentation process using a high-resolution Orbitrap mass spectrometer.
Succinate is used as an intermediate in the manufacturing of high-value consumer products such as personal care items, pharmaceutical intermediates and food and drink additives, as well as in the manufacturing of high production-volume products such as polybutylene succinate (PBS), polybutylene succinate adipate (PBSA), resins, coatings, lubricants, and polyurethanes. Furthermore, succinate can also be derivatised into other platform chemicals such as 1,4-butanediol, tetrahydrofuran, and γ-butyrolactone (Matano et al., 2014; Nghiem et al., 2017; Saxena et al., 2017; Thakker et al., 2012) (see Figure 1), all of which have significant market applications, such as the production of elastic fibers, plastics, and polyurethanes. Detected features include, among others, the main process compounds, potential biomarkers of biomass formation and metabolites from the pentose phosphate pathway (PPP) and nucleotide metabolism. This study is a step in the development of new technology both for bioprocess monitoring during product manufacturing and for earlier research and development phases; for instance, for the evaluation of different strains and process conditions and for the identification of engineering targets, by-products, and biomarkers, among others.

| Bacterial strain
All experiments described in this article were carried out using a proprietary industrial E. coli strain (Ingenza Ltd.), based on the E. coli NZN111 strain with deletions of the pyruvate-formate lyase (pflB) and lactate dehydrogenase (ldhA) genes as described by Chatterjee et al. (2001).

| Growth media
All 5 L scale fermentation experiments were carried out with a batch phase for biomass formation using a defined minimal medium containing 11.90 g/L glucose as the sole carbon source, 2.00 mM MgSO₄, a mix of salts solution (2.00 g/L (NH₄)₂SO₄, 14.60 g/L K₂HPO₄, 3.60 g/L NaH₂PO₄·2H₂O, 0.  -2000). Shake flask overnight cultures were prepared using the same medium but with 10.00 g/L glucose and no antifoam.

| Fermentation process conditions
All fermentation experiments were carried out in a 5 L Applikon stirred tank fermenter (ADI 1030 Bio Controller, 1035 Bio Console), and the process consisted of an initial batch phase where the minimal medium was primarily used for biomass formation, followed by a 24 h anaerobic succinate production phase (Figure 2), similar to the process described by Vemuri et al. (2002).

| Inoculum
Fermentation inocula were prepared by inoculating 50 µl of cell bank into 100 ml of growth medium in a 500 ml baffled shake flask and incubating at 37°C and 165 rpm for 17-17.5 h.

| Aerobic batch phase for biomass growth
The fermentation was started by inoculating 100 ml of overnight culture into 3 L of growth medium in the 5 L fermenter for a starting OD₆₀₀ of 0.21 ± 0.025. During biomass growth, the conditions were maintained at 37°C temperature, 500-900 rpm agitation (controlled to keep the DO > 30%), 4.00 L/min air (1.33 vvm), and pH 7.0 ± 0.1, controlled with 2.00 M H₂SO₄ and 28% (w/v) NH₄OH.
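The aeration figure quoted above can be checked with a short calculation. The sketch below assumes that vvm is computed against the nominal 3 L of growth medium (the 100 ml inoculum is ignored); the function name is illustrative and not part of the original work.

```python
def vvm(gas_flow_l_per_min: float, liquid_volume_l: float) -> float:
    """Gas flow expressed as vessel volumes per minute (L gas per L liquid per min)."""
    if liquid_volume_l <= 0:
        raise ValueError("liquid volume must be positive")
    return gas_flow_l_per_min / liquid_volume_l


# Assumption: nominal working volume of 3 L, as stated for the growth medium.
aerobic_vvm = vvm(4.00, 3.0)
print(f"{aerobic_vvm:.2f} vvm")
```

Under the same assumption, the 0.50 L/min CO₂ flow used later in the process corresponds to about 0.17 vvm.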

| Anaerobic succinate production phase
At the beginning of the production phase, glucose from a 500 g/L solution and sodium bicarbonate from a 100 g/L solution were added to the fermenter as a single bolus addition to a final concentration of 20 and 5 g/L, respectively, in the vessel, as described by Wu et al. (2007). The sodium bicarbonate provides soluble CO₂, which is required for the conversion of PEP to oxaloacetate (Figure 3) (Thakker et al., 2012). Once the glucose and sodium bicarbonate were added to the fermenter, the sparged air was replaced by pure (99.8%) CO₂ at 0.50 L/min (0.17 vvm), agitation was set to 300 rpm, temperature at 37°C, and pH at 7.0 ± 0.1, controlled with 2.00 M H₂SO₄ and 28% (w/v) NH₄OH.
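As a rough illustration of the bolus addition, the stock volumes needed to reach both target concentrations at once can be solved in closed form, because each addition also dilutes the other. This is a minimal sketch under stated assumptions: a starting volume of 3.1 L (3 L medium plus 100 ml inoculum), no residual glucose or bicarbonate before the addition, and no volume change from sampling or pH titrants. The actual volumes used in the study are not reported here.

```python
def bolus_volumes(v0_l, stocks_g_per_l, targets_g_per_l):
    """Return ([volume of each stock in L], final volume in L).

    Each stock i must contribute target_i * v_final grams, so
    v_i = (target_i / stock_i) * v_final, and v_final = v0 + sum(v_i),
    which gives v_final = v0 / (1 - sum(target_i / stock_i)).
    """
    fractions = [t / s for t, s in zip(targets_g_per_l, stocks_g_per_l)]
    total = sum(fractions)
    if total >= 1:
        raise ValueError("targets cannot be reached with these stock concentrations")
    v_final = v0_l / (1 - total)
    return [f * v_final for f in fractions], v_final


# Assumed starting volume: 3 L medium + 0.1 L inoculum (hypothetical, see lead-in).
(v_glucose, v_bicarbonate), v_final = bolus_volumes(
    3.1, stocks_g_per_l=[500.0, 100.0], targets_g_per_l=[20.0, 5.0]
)
print(f"glucose stock: {v_glucose * 1000:.0f} ml, "
      f"bicarbonate stock: {v_bicarbonate * 1000:.0f} ml")
```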

| Biomass measurement
Biomass levels were reported as OD₆₀₀ and wet cell weight (WCW). The former was the measured optical density at 600 nm wavelength. The latter was determined by spinning down 1 ml of sample for 5 min at 14,462g twice in a preweighed Eppendorf tube, removing the supernatant and weighing the resulting pellet. The weight of the pellet in g/L was calculated from gravimetric difference.
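The gravimetric calculation can be sketched as follows. The tube weights in the example are hypothetical values chosen for illustration, and the function name is not from the original work.

```python
def wet_cell_weight_g_per_l(tube_mg: float, tube_plus_pellet_mg: float,
                            sample_ml: float = 1.0) -> float:
    """Wet cell weight from the gravimetric difference of a preweighed tube.

    mg of pellet per ml of sample is numerically equal to g/L.
    """
    pellet_mg = tube_plus_pellet_mg - tube_mg
    if pellet_mg < 0:
        raise ValueError("tube with pellet weighs less than the empty tube")
    if sample_ml <= 0:
        raise ValueError("sample volume must be positive")
    return pellet_mg / sample_ml


# Hypothetical weights: empty tube 1012.4 mg, tube plus pellet 1027.9 mg, 1 ml sample.
wcw = wet_cell_weight_g_per_l(1012.4, 1027.9)
print(f"WCW = {wcw:.1f} g/L")
```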

| On-line metabolomics
On-line metabolomics was conducted by connecting the fermenter to an Exactive™ Orbitrap (Thermo Scientific) mass spectrometer with a fluidics system similar to one previously described in the literature (Link et al., 2015), but adapted to inject undiluted fermentation broth samples straight into the mass spectrometer. The modified fluidics system consisted of a peristaltic pump and two valves (Figure 4), and sample injections were alternated with washing steps, one-to-one. The peristaltic pump was a Masterflex™ L/S® (Cole-Parmer) high-performance pump model 77252-72 and was operated at a high flow rate of 75-100 ml/min. The first valve was a six-port, two-position valve (Vici Valco®) and the second one was a 10-port, two-position valve (Dionex Corporation). Note, however, that no chromatography was used.

| Sample injections
During sample injections, fermentation broth containing cells is constantly extracted from the fermenter with the peristaltic pump, injected into the six-port valve, circulated through a 50 µl loop and returned to the fermenter. Upon valve switching, the broth sample from the 50 µl loop of the six-port valve is carried by sterile water pumped at a 200 µl/min flow rate using an external piston pump into the 10-port valve, where it is collected in a 1 µl loop. The sample is finally injected into the mass spectrometer carried by a 70:30 ACN:IPA + 0.1% formic acid mixture at a 400 µl/min flow rate. The duration of the injection method was 1 min, and the total traveling time from the bioreactor to the mass spectrometer was 30 s, with ca. 10 s to reach the six-port valve and 20 more seconds to reach the mass spectrometer.

| Washing steps
Each sample injection was followed by a 4 min washing step to avoid system blockage and signal loss. The six-port valve was washed with 70:30 ACN:IPA + 0.1% formic acid for 1 min and with sterile water for 3 min, both at a 400 µl/min flow rate. The 10-port valve was washed with sterile water for 2 min and then 70:30 ACN:IPA + 0.1% formic acid for 2 min, both at a 600 µl/min flow rate (see Supporting Information: Figure S1). The washing solutions were sent to waste and did not enter either the fermenter or the mass spectrometer.
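The timing of one sampling cycle follows from the durations given above: a 1 min injection plus a 4 min wash gives the 5 min sampling interval. The sketch below encodes those durations and counts the injections that fit in a given time window. It assumes that the two valves are washed in parallel during the same 4 min, and all names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    minutes: float


INJECTION = Step("sample injection", 1.0)
# Wash durations as given in the text; the two valves are assumed to be washed in parallel.
SIX_PORT_WASH = [Step("organic solvent", 1.0), Step("sterile water", 3.0)]
TEN_PORT_WASH = [Step("sterile water", 2.0), Step("organic solvent", 2.0)]


def total_minutes(steps):
    return sum(step.minutes for step in steps)


def cycle_minutes():
    """Length of one injection-plus-wash cycle, checking both washes take equally long."""
    six, ten = total_minutes(SIX_PORT_WASH), total_minutes(TEN_PORT_WASH)
    if six != ten:
        raise ValueError("valve washes must have the same total duration")
    return INJECTION.minutes + six


def injection_times(duration_min):
    """Start times (min) of every injection that fits fully in the window."""
    cycle = cycle_minutes()
    return [i * cycle for i in range(int(duration_min // cycle))]


print(cycle_minutes(), len(injection_times(24 * 60)))
```

With these durations, a 24 h production phase would correspond to 288 injections if the system runs without interruption.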

| Mass spectrometer parameters
Gas-phase ions were generated with an electrospray ionization (ESI) source. The mass spectrometer was operated at 50,000 resolution, mass range 50-1000 m/z in polarity switching mode with a spray voltage of ±3.5 kV. The capillary temperature was set to 350°C, sheath gas 40 a.u., automatic gain control target 1 × 10⁶ a.u., and the
FIGURE 2 Schematic diagram of the fermentation process. The dashed black line splits both phases of the process.

| Metabolomics data processing and analysis
Raw MS data were processed with the Xcalibur™ software (version 3.1.66.10) using the Genesis peak detection method. The peak integration threshold was set to 0.5 signal-to-noise ratio (S/N), smoothing points to 1 and peak detection was set to the highest peak within a 15 s retention window, with a minimum peak height threshold of 3 S/N. A small number of signals that were not properly detected with the Genesis method were instead processed with the ICIS method. In these cases, peak integration was performed setting smoothing points to 1, baseline window to 40, area noise factor to 5, peak noise factor to 10, minimum peak height threshold of 3 S/N and peak width constrained to 5% of the peak height with a tailing factor of 2.
After processing the raw data with the Xcalibur™ software, metabolite features were extracted as a .csv file, which was used to generate time-course metabolic profiles using the ggplot2 package (version 3.3.3; Wickham, 2016) in the statistical software environment R (version 3.6.1). Data smoothing was carried out using the same version of the ggplot2 package with a locally estimated scatterplot smoothing (LOESS) method with a span between 0.2 and 0.5, depending on the metabolite.
During the "inject position," fermentation broth sample is continuously circulated through the 50 µl loop of the left six-port valve (sample n + 1) and the fermentation broth sample from the 1 µl loop in the right 10-port valve (sample n) is pushed to the mass spectrometer. During the "load position," the sample n + 1 is pushed from the 50 µl loop of the six-port valve into the 1 µl loop of the 10-port valve, ready for injection at the next "inject position."
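To illustrate what the span parameter of the smoothing step controls, the following is a simplified, dependency-free sketch of locally weighted regression. It is not the R implementation used in the study: it fits local straight lines (degree 1) with tricube weights, whereas R's loess fits local quadratics by default, so the results will differ in detail. The time course in the example is hypothetical.

```python
import math


def loess_linear(x, y, span=0.3):
    """Simplified LOESS: local linear fits with tricube weights.

    `span` is the fraction of points used for each local fit, so a larger
    span gives a smoother curve.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    if not 0 < span <= 1:
        raise ValueError("span must be in (0, 1]")
    k = max(2, min(n, math.ceil(span * n)))  # points per local neighbourhood
    fitted = []
    for x0 in x:
        dists = [abs(xi - x0) for xi in x]
        h = sorted(dists)[k - 1] or 1e-12  # bandwidth: distance to the k-th nearest point
        w = [(1 - min(d / h, 1.0) ** 3) ** 3 for d in dists]  # tricube weights
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


# Hypothetical time course sampled every 5 min (arbitrary intensity units).
time_min = [5.0 * i for i in range(20)]
intensity = [2.0 * t + 1.0 for t in time_min]
smoothed = loess_linear(time_min, intensity, span=0.3)
```

A span of 0.2 uses a fifth of the points for each local fit and follows fast changes closely, while 0.5 averages over half of the time course; this trade-off is presumably why the span was chosen per metabolite.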

| Data scaling
The monitoring system handles whole-broth samples containing cells.

| Six-port valve
The fermentation broth is pushed by the peristaltic pump into a six-port valve, which has a 50 µl sampling loop. The broth is recirculated back into the fermenter for the majority of the time in the "inject position" (see Figure 4). When the valve position is changed to "load

| 10-Port valve
During the "load position," the 50 µl fermentation broth sample collected in the sampling loop of the six-port valve is introduced into the 10-port valve, pushed with sterile water using a piston pump from an HPLC instrument (aqueous pump). This way, the fermentation sample is delivered to a 1 µl sampling loop on the 10-port valve.
When the valve switches to the "inject position," the 1 µl sample gets injected into the mass spectrometer, pushed with a 70:30 ACN:IPA + 0.1% formic acid solvent mixture using a second piston pump from the same HPLC instrument (organic pump).

| Further considerations of the system
Using two valves is a solution to mitigate the solvent incompatibility at the two ends of the system. Namely, at one end, the bioreactor contains a water-based environment with living cells, and at the other end, ESI MS works best with volatile organic solvents, which are more effective than water at generating gas-phase ions (Hoffmann & Stroobant, 2007).
The total traveling time from the bioreactor to the mass spectrometer was 30 s, with ca. 10 s to reach the six-port valve and 20 more seconds to reach the mass spectrometer. This traveling time is short compared to the biomass doubling time (65 min, see Supporting Information: Figure S2) and allows the capture of rapid metabolic changes while minimizing the time the sample spends out of the fermentation environment.
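The comparison between the traveling time and the doubling time can be made explicit. This sketch uses the two values from the text (30 s and 65 min) and assumes, for the sake of a worst-case estimate, that cells keep growing exponentially while in the tubing; the function names are illustrative.

```python
import math

DOUBLING_TIME_MIN = 65.0  # from the text (Supporting Information: Figure S2)
TRANSIT_TIME_S = 30.0     # bioreactor to mass spectrometer, from the text


def specific_growth_rate_per_h(doubling_time_min: float) -> float:
    """mu = ln(2) / t_d, with t_d converted to hours."""
    return math.log(2) / (doubling_time_min / 60.0)


def growth_during(seconds: float, doubling_time_min: float) -> float:
    """Fractional biomass increase over `seconds`, assuming exponential growth."""
    return 2 ** (seconds / (doubling_time_min * 60.0)) - 1


mu = specific_growth_rate_per_h(DOUBLING_TIME_MIN)
transit_fraction = TRANSIT_TIME_S / (DOUBLING_TIME_MIN * 60.0)
biomass_gain = growth_during(TRANSIT_TIME_S, DOUBLING_TIME_MIN)
print(f"mu = {mu:.2f} 1/h, transit = {transit_fraction:.2%} of a doubling, "
      f"biomass gain = {biomass_gain:.2%}")
```

The transit therefore takes less than 1% of a doubling time, which corresponds to a biomass change of roughly 0.5% under the exponential-growth assumption.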

| Introduction of a washing step between sample injections
A washing step between sample injections was introduced to prevent blockages of the on-line monitoring system. By tracking the total ion chromatogram (TIC) across the first injections of two different fermentation experiments, it was observed that the washing step was instrumental in preventing signal loss of the mass spectrometer. Specifically, when one wash was performed after every three samples, the TIC signal consistently increased immediately after every washing step (Supporting Information: Figure S3A). This indicated that the washing step helps to prevent not only system blockages but also signal loss, potentially due to the removal of particulates and molecules that build up in the system during sample injection. When the washing step was used after every injection (Supporting Information: Figure S3B), the changes in the TIC did not follow any periodic pattern. In both cases (with a wash after three samples and a wash after each sample), there was a decreasing trend in the TIC during the first 15 injections, but this is probably caused not by signal loss but by the consumption of glucose from the media by the cells for biomass formation. With these observations, it was deemed necessary to implement a washing step after every single injection.

| On-line untargeted metabolomics analysis of a succinate fermentation process
Nine fermentation runs were performed testing different parameters of the on-line monitoring system: the solvent system, the washing method, sampling frequency, and wash frequency (see Supporting Information: Table S1). The best conditions were selected based on being able to run the fermentation process without blockage or overpressure of the system.
FIGURE 6 Example of annotated metabolites observed with on-line metabolomics monitoring of a succinate production fermentation process in Escherichia coli. Time is indicated with respect to the beginning of the succinate production phase.
The on-line data also offers a very high time-resolution compared to the full fermentation duration, allowing identification of key points of the bioprocess with a 5-minute error margin, such as glucose depletion (2.95 h before succinate production) and the beginning of succinate production.

| Biomarkers for growing biomass
Metabolites annotated as (R)-2,3-dihydroxy-3-methylpentanoate and (R)-2,3-dihydroxy-isovalerate follow an exponential increase coinciding with the exponential growth of biomass during the batch phase (Figure 9). For this reason, these two metabolites were identified as potential biomarkers for biomass. To evaluate this, the on-line signal of these two metabolites was compared with the off-line WCW biomass measurements by Pearson correlation (Figure 10). A better correlation was found using the natural logarithm of the biomass WCW. When the whole fermentation was evaluated, the correlation between the two signals was poor (Pearson correlation estimates 0.55 and 0.63; Figure 10a). However, a good correlation was found during the aerobic batch phase (Pearson correlation estimates 0.94 and 0.96; Figure 10b), suggesting that (R)-2,3-dihydroxy-3-methylpentanoate and (R)-2,3-dihydroxy-isovalerate, especially the latter, could potentially be used as biomarkers for growing biomass.
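The correlation analysis described above can be sketched in a few lines. The data points below are hypothetical and were chosen only to show the mechanics of comparing a signal against WCW and against ln(WCW); they are not measurements from this study, and the statistical test for the p value is omitted.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired observations: off-line WCW (g/L) and on-line ion intensity (a.u.).
wcw_g_per_l = [0.5, 1.0, 2.0, 4.0, 8.0]
signal = [1.0e5, 2.1e5, 2.9e5, 4.2e5, 5.0e5]

r_raw = pearson_r(wcw_g_per_l, signal)
r_log = pearson_r([math.log(w) for w in wcw_g_per_l], signal)
print(f"r (WCW) = {r_raw:.2f}, r (ln WCW) = {r_log:.2f}")
```

Restricting the input lists to the time points of a single process phase would reproduce the phase-specific comparison made in the text.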
Both these metabolites belong to the branched-chain amino acid biosynthetic pathways for the formation of valine, leucine and isoleucine (essential building blocks for biomass formation), which could explain the good correlation of these metabolites with biomass growth.
FIGURE 7 Glucose, pyruvate, and succinate are monitored by on-line metabolomics (a) and off-line high-performance liquid chromatography (HPLC) (b). The mass spectrometry measurements are represented as dots and the corresponding smoothed signal is represented with lines and calculated with locally estimated scatterplot smoothing. Ions 203.0527, 87.0088, and 117.0193 m/z were used for glucose, pyruvate, and succinate, respectively. The HPLC data are represented as dots and the interpolated data are represented with lines. Time is indicated with respect to the beginning of the succinate production phase.
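The m/z values reported for glucose, pyruvate, and succinate in the Figure 7 caption can be compared against theoretical monoisotopic masses. The adduct assignments in this sketch are an assumption, since the text does not state them: the values are consistent with a sodium adduct [M+Na]⁺ for glucose and deprotonated ions [M-H]⁻ for pyruvate and succinate. The helper names are illustrative.

```python
# Monoisotopic masses (Da) of the most abundant isotopes, and the electron mass.
MONO = {"C": 12.0, "H": 1.00782503, "O": 15.99491462, "Na": 22.98976928}
ELECTRON = 0.00054858


def neutral_mass(formula: dict) -> float:
    """Monoisotopic mass of a neutral molecule given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())


def mz_deprotonated(formula: dict) -> float:
    """[M-H]-: remove one proton (hydrogen atom minus its electron)."""
    return neutral_mass(formula) - (MONO["H"] - ELECTRON)


def mz_sodiated(formula: dict) -> float:
    """[M+Na]+: add a sodium atom and remove one electron."""
    return neutral_mass(formula) + MONO["Na"] - ELECTRON


def ppm_error(observed: float, theoretical: float) -> float:
    return (observed - theoretical) / theoretical * 1e6


GLUCOSE = {"C": 6, "H": 12, "O": 6}
PYRUVATE = {"C": 3, "H": 4, "O": 3}
SUCCINATE = {"C": 4, "H": 6, "O": 4}

# Reported m/z values from the text, paired with the ASSUMED adducts.
checks = {
    "glucose [M+Na]+": (203.0527, mz_sodiated(GLUCOSE)),
    "pyruvate [M-H]-": (87.0088, mz_deprotonated(PYRUVATE)),
    "succinate [M-H]-": (117.0193, mz_deprotonated(SUCCINATE)),
}
for name, (reported, theoretical) in checks.items():
    print(f"{name}: theoretical {theoretical:.4f}, "
          f"error {ppm_error(reported, theoretical):+.1f} ppm")
```

Agreement within a few ppm under these assumed adducts supports the annotation of each feature, but it does not by itself rule out isomers with the same formula.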

| CONCLUSIONS
Fermentation monitoring is a crucial step to understand and control the evolution of a bioprocess to ensure that the desired process Namely, MS can detect many more compounds, has a higher sensitivity, and does not require the use of PLS regression models, which tend to have little transferability when process conditions are changed (e.g., temperature, medium, strain, etc.).
Commercial enzymatic analyzers and "miniaturized" low-resolution mass spectrometers are also becoming a trend for bioprocess analysis, allowing the rapid measurement of a predefined set of compounds. However, these are still almost exclusively used at-line or off-line, requiring manual handling and offering limited time resolution. Furthermore, these systems are limited to the analysis of the compounds in the commercial assay kits, and the use of low-resolution MS significantly limits the quality of metabolite annotation.
FIGURE 8 Pearson correlation between off-line HPLC-UV/Vis-RI (refractive index) and on-line metabolomics data for glucose, pyruvate, and succinate, where R is the Pearson correlation coefficient and p shows the p value of the test. The metabolomics data were scaled from 0 to 1 to fit in the same axis.
FIGURE 9 On-line metabolomics signals corresponding to (R)-2,3-dihydroxy-3-methylpentanoate and (R)-2,3-dihydroxy-isovalerate and off-line WCW biomass, all three scaled from 0 to 1 to fit in the same axis. Time is indicated with respect to the beginning of the succinate production phase.
In this study, an on-line untargeted metabolomics platform (RTMet) was developed to analyze fermentation whole-broth samples directly from the bioreactor with flow injection MS (no chromatography) every 5 min and with no manual intervention. The use of a high-resolution Orbitrap mass spectrometer allowed for the detection of 886 m/z signals without the need to build time-consuming PLS regression models. These features make this technology especially useful for the detection of important pathways, by-products, and biomarkers during the process development stage, allowing the evaluation of different strains, cell lines, and process conditions (temperature, medium, pH, etc.). This is the first step in demonstrating the use of untargeted on-line metabolomics for bioprocess optimization. Future work will include the use of this technology with other bioprocesses and organisms, as well as the development of quantitative monitoring models to correlate ion intensity with metabolite concentration.

AUTHOR CONTRIBUTIONS
Karl Burgess conceived the idea and obtained the funding. Joan

DATA AVAILABILITY STATEMENT
The data that support the findings of this study will be openly available in MetaboLights at https://www.ebi.ac.uk/metabolights/index, reference number MTBLS5197.
FIGURE 10 Pearson correlation between the natural logarithm of the WCW biomass and real-time metabolomics data for (R)-2,3-dihydroxy-3-methylpentanoate and (R)-2,3-dihydroxy-isovalerate using the data for the whole fermentation (a) or only using the data of the aerobic batch phase of biomass growth (b). R is the Pearson correlation coefficient and p shows the p value of the test.