Cell cycle state proteomics and classification using in-cell protease digests and mass spectrometry

Proteomic analysis of rare cell states is a major challenge. We report an advance to our PRoteomics of Intracellular iMMUnostained cell Subsets (PRIMMUS) workflow whereby fixed cells are directly digested by proteases in cellulo for mass spectrometry-based proteomics. This decreased the cell number requirement by two orders of magnitude to

The proteome is a functional readout of cellular phenotype, which includes dynamic and persistent molecular features that reflect cell state and cell type, respectively. Rare cell phenotypes play key physiological roles. Quiescent stem cells, while often rare relative to differentiated cell types in a tissue, are essential for tissue homeostasis. Similarly, mitosis is critical for the accurate propagation of genetic material and is a phase during which cellular commitment to proliferation is made [1] [2]. Mitotic states are generally short-lived and thus rare in an asynchronous population. Proteomic analysis of these critically important cell phenotypes is a major challenge because typical proteomic workflows require >10^5 cells as input.

Recent advances have been made in methods for low cell number proteome analysis. For example, ~2,700 proteins were identified from 6,250 CD34+ hematopoietic progenitor cells using optimized in-solution digests combined with data-independent acquisition (DIA) [3]. Similarly, ~3,000 proteins were identified from 10 HeLa cells using 'nanodroplet processing in one pot for trace samples' (nanoPOTS) [4]. Single cell proteomic analysis using nanoPOTS with tandem mass tag (TMT) booster channels has been recently described [5]. NanoPOTS requires microfabricated glass chips, robotics that can handle picoliter volumes and sample storage in prepacked nano-LC columns. These requirements are challenging to satisfy in most labs and limit widespread adoption of the technique.

We previously developed an approach called 'PRIMMUS', or 'Proteomics of Intracellular Immunostained Subsets', to analyse abundant and rare cell cycle states [6]. Formaldehyde-fixed cells are fractionated into specific cell states by staining cells for intracellular markers and separating them using Fluorescence-Activated Cell Sorting (FACS).
Cells grown in asynchronous culture are immediately fixed, thereby minimizing perturbation to physiological processes. This step is critical, as small molecule-based synchronisation can lead to effects on the proteome that are associated with stress responses arising from arrest rather than cell cycle regulation per se [7]. The application of PRIMMUS was limited to abundant subpopulations where >10^5 cells can be collected by FACS within a reasonable time [6]. A more sensitive PRIMMUS approach would enable high resolution mapping of proteomic changes during an unperturbed cell cycle,

A reference sample was generated by lysing TK6 cells in DPBS with 2% SDS and cOmplete protease inhibitors without EDTA (Roche, 1x concentration) at 70 °C. The lysate was homogenised with a probe sonicator and treated with benzonase. Protein was reduced with 20 mM TCEP for 2 hr before alkylation with 20 mM iodoacetamide at ambient temperature in the dark for 1 hr. Protein was precipitated with 4 volumes of cold acetone at -20 °C overnight, then washed with 100% cold acetone and 90% cold ethanol. The protein pellet was air dried before resuspending in DPBS and digesting with 1:50 w/w trypsin for ~16 hr. Peptides were acidified, desalted, aliquoted, and fractionated as previously described. For isopropylation, 50 µg of peptides were resuspended in 200 µl of 90% acetonitrile containing 0.1% formic acid before addition of 50 µl of acetone containing 36 µg/µl NaBH3CN. The reaction was conducted at ambient temperature for ~16 hr before quenching with ammonium bicarbonate, drying off solvent and desalting peptides over C18. For dimethylation, 50 µg of peptides were resuspended in 200 µl of DPBS before addition of 0.32% formaldehyde and 50 mM NaBH3CN. The reaction was conducted at ambient temperature for ~16 hr before quenching with ammonium bicarbonate and desalting peptides over C18.
200 ng of unmodified, dimethylated, and isopropylated peptides were analysed by AMPL and DDA, and unmodified fractionated peptide samples were analysed by DDA, as previously described. LC-MS data were searched using MaxQuant, as previously described. Note that dimethylation and isopropylation modifications were not specified in the search parameters.

Cell cycle proteomic data analysis
All subsequent data analysis on the protein intensity

were swapped in order to produce a heatmap that follows a logical, sequential order of peak abundance, i.e. cluster 1 with highest abundance in P0-P8 and cluster 5 with peak abundance in P3-P7, etc.

For PCA and cell cycle state classification, scaled pseudotimecourses were used. Cell cycle states were classified using the k-NN model as implemented in the class R library (v. 7.3-15) using k = 6, with k being the number of nearest neighbours for classification. Three biological replicates were used as the training set and the remaining replicate was used as a test set.

For the pairwise comparison of the proteomes of P17 with P1 and P16, t-tests were performed on ppm intensities. Uncorrected p-values were plotted against mean fold change in order to identify candidate proteins that were specifically changed in abundance in P17.
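The replicate hold-out k-NN classification described above can be illustrated with a minimal Python sketch. This is not the authors' code (the analysis used the `class` R library); the function names, the data layout and the toy circular data are assumptions made purely for illustration, and tie-breaking here is deterministic, whereas other implementations may break ties differently.

```python
import math
from collections import Counter

def knn_classify(train, query, k=6):
    """Majority vote among the k training profiles nearest to `query`.

    `train` is a list of (profile, population_label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    # On a tied vote, the label encountered first (i.e. the nearest one) wins.
    return votes.most_common(1)[0][0]

def hold_out_replicate(data, test_rep, k=6):
    """Train on all replicates except `test_rep`, then predict the held-out one.

    `data[replicate][population]` is a scaled abundance profile (list of floats).
    Returns {actual_population: predicted_population}.
    """
    train = [(profile, pop)
             for rep, pops in data.items() if rep != test_rep
             for pop, profile in pops.items()]
    return {pop: knn_classify(train, profile, k)
            for pop, profile in data[test_rep].items()}

# Hypothetical stand-in data: 4 replicates x 16 populations placed on a circle,
# with a small replicate-specific offset instead of real protein profiles.
toy = {
    rep: {pop: [math.cos(2 * math.pi * pop / 16) + 0.01 * rep,
                math.sin(2 * math.pi * pop / 16)]
          for pop in range(1, 17)}
    for rep in range(1, 5)
}
predictions = hold_out_replicate(toy, test_rep=4)
```

With three training replicates per population and k = 6, at least three of the six neighbours necessarily come from other populations, so adjacent states always contribute to the vote.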

Results
Impact of formaldehyde crosslinking on whole proteome analysis
Heat treatment at 95 °C is sufficient to reverse most formaldehyde crosslinks, as shown previously [9]. However, a pool of crosslinked, multimeric species remained in a protein-dependent manner. Therefore, we aimed to optimize the PRIMMUS approach by first focusing on improving the decrosslinking efficiency (Fig. 1A).

Previous reports have suggested that the reversal step is accelerated by co-treatment with a nucleophilic quenching agent [12]. We tested the effect of adding Tris and hydroxylamine on crosslink removal (Fig. 1B)

of Tris and hydroxylamine treatment shows decreased crosslinked proteins relative to control, or to either treatment alone.

These samples were then subjected to MS-based proteome characterization.

Table 2). We then hypothesized that formaldehyde-induced modifications were present in exceptionally low stoichiometry and therefore any differences between the samples were masked by the relatively low peptide coverage in the single-shot analyses. We therefore chose three samples for HPLC pre-fractionation and deeper proteome analysis: control protein extract from non-fixed cells, protein extract from fixed cells, and heat-treated protein extract from fixed cells (95 °C for 45 min). For reference, these samples correspond to lanes 1, 2, and 5, respectively, in Fig. 1B. Fig. 1C shows that the numbers of peptides identified are similar among all three samples; in total, 73,885, 72,785, and 72,779 peptides for control, decrosslinked and fixed samples, respectively. The numbers of proteins detected are similarly comparable (Fig. 1D)

Previous reports on short peptides have shown that formaldehyde produces +30 and +12 mass shifts, corresponding to methylol and imine modifications, respectively. We
We 342 saw no appreciable increase in these mass shifts, which is consistent with the instability  and extracts from non-fixed cells processed by precipitation (see Methods, ~4,561 369 proteins, n = 3). We conclude that the proteome coverage from the in-cell digest is 370 similar, or higher, than the other protocols tested. 371 We did not observe a broad bias in quantitation, as label free intensities 372 measured in fixed cells prepared by the in-cell digest and by decrosslinking followed by We conclude that the measurements of protein abundance from the in-cell digest 386 are quantitative, reproducible and broadly comparable to conventional sample 387 preparation methods. We note that each sample preparation method will have its own 388 specific biases. In the case of the in-cell digest, the increased abundance of membrane 389 proteins may more accurately reflect the abundance of these proteins in cells, as will be 390 detailed in the Discussion section. 391

Averaged MS1 Precursors with Library matching (AMPL) improves feature detection
To increase the sensitivity and detection speed of the Orbitrap Elite MS instrument (released in 2011), we utilised MS1-based identification and quantitation using accurate mass and retention time matching, as proposed originally by the Smith lab [16].

We reasoned that the additional peptides detected by AMPL originate from low-abundance features detected by virtue of the S/N increase due to averaging.

However, the lack of MS2-based identification for these matched sequences could lead to an increased false discovery rate (FDR). We estimated that the matching FDR is ~4.5%

Table 3).

We conclude that the library matching approach dramatically increases sensitivity, particularly for low column loads, with AMPL providing the highest peptide and protein coverage overall with a relatively low estimated match FDR (<3%).

An improved PRIMMUS for proteomic analysis of low cell number populations
As shown in Fig. 4C, AMPL detects slightly more proteins from a 10 ng on-column load than DDA does from a 1 µg load, demonstrating a 100x increase in sensitivity. A 10 ng on-column load is equivalent to the protein content of ~67 cells based on the protein per cell measured in bulk assays. However, the effective number of cells required for proteome analysis is usually much higher. This is due to losses during sample preparation. We reasoned that these losses are significantly reduced using the streamlined in-cell digest.

Table 4). Over 4,500 proteins were quantitated from 2,000 cells, with 4,480 proteins reproducibly quantitated in two technical repeats. At the lower end of the cell titration (shown in Fig. 4E), over 300 proteins on average were quantitated from 10 cells, with 259 reproducibly quantitated in two cell aliquots that were separately collected by FACS. While approx.

MaxQuant with MBR and filtered by match parameters as discussed above.

Of the 7,757 proteins quantitated overall (Supplementary Table 5), 4,918 proteins were quantitated in all 8 replicates (4 biological x 2 technical repeats) in at least one population (Fig. 5C). Next, to identify cell cycle regulated proteins, we treated each set of 16 populations as an ordered series of related biochemical states. We have called each set a pseudotimecourse. While these states can be projected onto a temporal axis (i.e. cell cycle progression), the link with time is indirect as the duration of each phase has been shown to vary substantially on a per-cell basis. We then performed a Fisher's periodicity test to identify protein abundance patterns that showed periodic behavior. In order to increase robustness, the periodicity test was separately performed on each technical repeat. Only those proteins showing a p-value <= 0.10 and a periodic frequency of 0.0625 or 0.125 (i.e.
one cycle every 16 or 8 pseudo-timepoints, respectively) in both tests were considered further as periodic. Fig. 5D shows the abundance profiles for heat shock protein HSP90AA1 and ATPase AAA domain-containing protein ATAD2 as example non-periodic and periodic proteins, respectively. ATAD2 shows highly reproducible abundance variation in all 8 pseudotimecourses, with peak abundance in S-phase populations (P5-P6). We note that proteins meeting the significance cutoffs are highly enriched in cell cycle GO terms (Supplementary Table 6

Amongst these 119 proteins are cyclins A2 and B1. The MS-measured abundance patterns (Fig. 5E) show similarity with those measured by immunostaining (Fig. 6B), with accumulation in interphase and decreased abundance in mitosis. We also detect cyclin B2, an isoform of cyclin B that is localized to the Golgi apparatus. Cyclins B2 and B1 show a nearly identical abundance pattern in interphase. However, at anaphase and late mitosis (P13-P16), cyclin B2 abundance does not decrease to background levels, which suggests that, unlike cyclin B1, there is a pool of cyclin B2 that is stable towards degradation (Fig. 5E, right).

Hierarchical clustering of the 119 proteins (Fig. 6A) identified five major classes of protein abundance patterns (Fig. 6B). Cluster 1 proteins show high abundance in interphase, which decreases in early mitosis (P8-P10) and recovers slightly in late mitotic populations (P15-P16), as illustrated by the example protein hepatoma-derived growth factor, SRSF6 (Fig. 6C). Like SRSF6, most proteins in this cluster are either RNA- or DNA-binding (26 / 33). For example, several mRNA splicing factors are in this group, including serine/arginine-rich proteins (SRRM2, SRSF2, SRSF3, SRSF5, SRSF6). These proteins decrease in abundance in mitosis with a small fold change ( 2) compared with, for example, cyclin B1 (Fig. 5E).
The remaining proteins with no known or anticipated oligonucleotide-binding properties are enriched in cytoskeleton-binding factors, e.g. the actin-binding proteins MARCKS and ZYX.

Cluster 2 contained proteins that had peak abundances in late G1/S populations. Included in this cluster is the protein SLBP, a histone gene expression factor that peaks in P4-P5 (Fig. 6D). Indeed, nearly all proteins in this cluster are directly involved in DNA

which requires two priming phosphorylations for recognition by SCF-Fbxw7 and targeted degradation. Cluster 2 is most enriched in the T-P-X-X-E motif, which requires only one phosphorylation for substrate recognition. Interestingly, Cluster 1 is also more highly enriched in CDK consensus sites. We conclude that multisite phosphorylation by CDK may play a role in directing these proteins for degradation by SCF-Fbxw7.

Cluster 3 shows peak abundance in G2 and early mitosis (P6 to P9). This cluster contains several proteins associated with DNA replication and DNA damage repair, including the dsDNA exonuclease EXO1, PCNA-associated factor (PAF/KIAA0101) and ribonucleotide reductase M2 (RRM2, Fig. 6E). The abundance pattern of RRM2 is consistent with previous proteomic studies and targeted degradation of RRM2 in late

Example proteins from this cluster include TPX2 and Aurora A kinase (Fig. 6G). TPX2 is the activator of Aurora A kinase, whose activity is important in centrosome separation in prophase and mitotic progression. Other proteins in cluster 5 with regulatory roles in mitotic progression include the catalytic E2 subunits of the APC/C (UBE2C, UBE2S), the chromosome passenger complex (AURKB, INCENP, BIRC5/Survivin, CDCA8/Borealin) and the spindle-associated protein FAM83D.

SLIM analysis of these clusters identified differences in the enrichment in nuclear import and export signals. As shown in Fig.
6I, clusters 1 and 2 are enriched in nuclear localisation signals (mono- and bi-partite). By contrast, cluster 4 shows a strong enrichment for the Crm1-mediated nuclear export signal (NES). Eight proteins in cluster 4 matched the NES consensus. Some predicted NES located in globular domains will likely be constitutively inaccessible to Crm1 but may be recognized upon conformational change. Notably, cluster 4 includes cyclins B1 and B2, whose constitutive export from the nucleus is thought to be important in preventing premature mitotic entry. Crm1-binding and/or exclusion from the nucleus of the remaining six proteins (Bub1, BubR1, cyclin A2, CLEC16A, MVP, and ARMC1) may also be important in the proper timing of cell cycle events.

We identified strongly pseudoperiodic proteins that have no reported function in cell cycle control. These novel cell cycle regulated proteins may, like many of the other proteins identified in this manner, have significant roles in cell cycle progression. These candidates include EXO1, the DNA helicase PIF1, the guanine nucleotide exchange factor NET1 and the uncharacterized protein FAM111B (Fig. 6H). Potential functional roles for FAM111B in cell cycle regulation are discussed further below.
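The periodicity screen used earlier in this section to select pseudoperiodic proteins can be sketched in Python. This is a minimal illustration of Fisher's exact g test on a 16-point pseudotimecourse, not the authors' implementation (the software they used is not stated here); the function names and the choice to exclude the zero and Nyquist frequencies are assumptions.

```python
import cmath
import math

def periodogram(series):
    """Return (frequency, power) pairs at Fourier frequencies k/N for 0 < k < N/2."""
    n = len(series)
    mean = sum(series) / n
    centred = [x - mean for x in series]
    spectrum = []
    for k in range(1, (n - 1) // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centred))
        spectrum.append((k / n, abs(coeff) ** 2 / n))
    return spectrum

def fisher_g_test(series):
    """Fisher's exact g test. Returns (g, p_value, dominant_frequency)."""
    spectrum = periodogram(series)
    total = sum(power for _, power in spectrum)
    if total == 0:  # constant series: no evidence of periodicity
        return 0.0, 1.0, None
    freq, peak = max(spectrum, key=lambda fp: fp[1])
    g = peak / total
    m = len(spectrum)
    # Exact null distribution: P(G > g) = sum_j (-1)^(j-1) C(m, j) (1 - j*g)^(m-1)
    p = sum((-1) ** (j - 1) * math.comb(m, j) * (1 - j * g) ** (m - 1)
            for j in range(1, math.floor(1 / g) + 1))
    return g, min(max(p, 0.0), 1.0), freq

def is_periodic(series, alpha=0.10, allowed=(0.0625, 0.125)):
    """Apply the p-value and dominant-frequency cutoffs quoted in the text."""
    _, p, freq = fisher_g_test(series)
    return p <= alpha and any(math.isclose(freq, f) for f in allowed)

# One full cycle across 16 pseudo-timepoints (dominant frequency 1/16 = 0.0625).
one_cycle = [math.sin(2 * math.pi * t / 16) for t in range(16)]
print(fisher_g_test(one_cycle)[2], is_periodic(one_cycle))
```

To mirror the robustness criterion in the text, a protein would be retained only if `is_periodic` holds for both technical repeats.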

Analysis of mitotic protein abundance dynamics in unperturbed cells
A major regulator of protein abundance during the cell cycle is the anaphase promoting

ubiquitination of APC/C substrates is tightly temporally controlled, with APC/C substrate specificity changing during the cell cycle. This is mediated through changes in the APC/C co-activators and substrate recognition factors, Cdc20 and Cdh1. While APC/C-Cdc20 is active in early mitosis, the substrate receptor changes to Cdh1 in late mitosis, thereby conferring a temporal order to substrate degradation. Cdc20 is itself a substrate of the APC/C-Cdh1, allowing for switch-like handover in substrate receptor control.

Interestingly, 25 of the 119 core pseudoperiodic proteins are experimentally validated APC/C substrates and the vast majority (24) are found in clusters 3, 4 and 5. Substrate recognition by APC/C-Cdc20 and APC/C-Cdh1 is mediated by the interaction between WD40 domains on the APC/C-(Cdc20/Cdh1) and SLIMs found on substrates.

The KEN and D-box (RxxL) degrons are well documented SLIMs that bind both APC/C-Cdc20 and APC/C-Cdh1, with APC/C-Cdh1 having a preference for the KEN degron. More recently, a third SLIM called the ABBA motif was shown to be important in substrate recognition by APC/C-Cdc20 [30]. Its name comes from the four proteins in which it is found: Cyclin A, Bub1, BubR1 and the yeast-specific protein Acm1.

proteins that have, on average, higher abundance in G0/early G1.

Six out of 12 proteins that peak in mid-mitosis (cluster 4) contain the RxxL D-box sequence. This 50% frequency is ~8-fold higher than the background frequency (6%). By contrast, the fold-enrichment is considerably lower in the other clusters (Fig. 6I). Similarly, 5 out of 12 proteins contain the ABBA motif (42%, Fig. 6I

A2 and cyclin B1 removed produces essentially identical results, which indicates that the relationships produced by using ~119 cell cycle marker proteins are robust towards the absence of individual proteins, including key proteins that drive cell cycle progression.
A simple kNN model was used to classify cell cycle states using these data. Replicates 1-3 were used as a training set and replicate 4 was used as the test set. Fig. 7B shows the performance of the classification by plotting predicted versus actual populations. There is a linear correlation with some minor deviations. We then repeated this kNN analysis for each combination of the four replicates, using 3 replicates as the training set and the remaining replicate as the test set. We treated the populations as a circular, progressive series of cell states whereby P1 is the next state after P16. We then calculated the distance between predicted and actual populations for each replicate combination (Supplementary Figure 4A). The average distance is 0 in all four cases, with a standard deviation of ~1. This indicates that the kNN models are broadly accurate in predicting the cell cycle state with a precision of ± 1 cell state.

We then asked whether the PCA classification could be used to identify

where replicate data were averaged (Fig. 7D), the PCA places P17 between P16 and P1. Indeed, the kNN model described above classifies two replicates of P17 as P16 and the other two replicates as P1. From these observations, we conclude that P17, while having DNA content consistent with a G2-phase cell, has a cell cycle protein profile that is more consistent with an early G1/G0-phase cell.

This population may reflect a senescent state consistent with previous reports where the APC/C is re-activated in G2 phase cells in response to DNA damage, leading to premature degradation of cyclin B1. Our data suggest that numerous APC/C targets are decreased relative to a typical G2 state (Fig. 7E), effectively resetting the cell cycle state of these cells to an early G1/G0-like state.
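The circular distance used above to score the kNN predictions, in which P1 follows P16, can be written as a short Python sketch. The function names and the example prediction pairs are hypothetical and are shown only to make the wrap-around arithmetic concrete.

```python
import statistics

N_STATES = 16  # P1..P16 treated as a ring, so P1 is the state after P16

def circular_offset(predicted, actual, n_states=N_STATES):
    """Signed shortest distance from `actual` to `predicted` around the ring."""
    half = n_states // 2
    return (predicted - actual + half) % n_states - half

def summarise(pairs):
    """Mean and sample standard deviation of offsets for (predicted, actual) pairs."""
    offsets = [circular_offset(p, a) for p, a in pairs]
    return statistics.mean(offsets), statistics.stdev(offsets)

# Hypothetical (predicted, actual) population pairs, not taken from the study.
example_pairs = [(1, 16), (16, 1), (5, 5), (9, 8), (7, 8), (12, 12)]
mean_offset, sd_offset = summarise(example_pairs)
```

A prediction of P1 for a true P16 sample therefore counts as one state late rather than fifteen states early.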
We then performed pairwise comparisons proteome-wide between P1, P16 and P17 to identify proteins showing reproducibly changed abundance in P17 cells. Of the top candidates (Supplementary Figure 4B), three are regulators of the DNA damage response (Fig. 7F)

We conclude that we have found a core set of 119 proteins that can be used to robustly assign cell cycle states with high resolution and to phenotypically characterise cell populations whose position in the cell cycle is unknown.

We show that the in-cell digest enables reproducible and quantitative analysis of proteomes from 2,000 TK6 and MCF10A cells using AMPL analysis. The AMPL approach overcomes the low duty cycle of the Orbitrap Elite to enable proteome analysis with a sensitivity comparable with current instruments. Newer instrumentation with higher duty cycles, including the timsTOF Pro and Exploris 480, is expected to enable conventional DDA analysis of proteomes at a similar depth with 2,000 TK6 cells, or alternatively, improve proteome depth further using MS1-based matching methods.

The in-cell digest is compatible with other approaches of low cell number sample preparation for MS-based proteomics. In-cell digested samples can be efficiently labelled by isobaric tags, e.g. TMT and iTRAQ, and are therefore compatible with the use of carrier channels to boost the signal of rare or single cell channels (e.g. iBASIL). The protocol requires no specialized humidified sample handling chambers or direct loading onto premade, single-use analytical nanoLC columns, such as those described in the nanoPOTS workflow. While the proteome coverage obtained by nanoPOTS is higher than reported here, it is possible that a new workflow combining aspects of the in-cell digest and nanoPOTS could improve both generalizability and performance compared to either method as originally described.
Each sample preparation method will have its unique advantages and potential biases, which we evaluated by quantitatively comparing the in-cell digest with a more conventional in-solution digest. This analysis revealed an overrepresentation of membrane proteins amongst those proteins with higher abundance measured in the in-cell digest samples. These proteins include mitochondrial membrane proteins (e.g. TOMM7) and proteins that are known to be localized to the cell surface (ADAM15). Membrane proteins have been shown to irreversibly aggregate in soluble extracts when heat-treated and precipitated. Delipidation by methanol, which is used to increase cell permeability, could also play an important role in increasing digestion efficiency by trypsin. We suggest that the higher abundances measured for membrane proteins are unlikely to be an artefact of the in-cell digest; rather, the measurements are likely to more accurately reflect the abundances of these proteins in cells.

protein bias observed in more detail, and we have preliminary evidence suggesting the latter. Interestingly, proteins in cluster 2 (Fig. 6A), which show a robust, pseudoperiodic change in abundance, are nearly all known to interact with either DNA or RNA. Few of these proteins have been shown to be cell cycle regulated previously. It may be that the changes in MS-measured abundance reflect differences in RNA- and/or DNA-interactions by these proteins rather than a change in the protein abundance in cells.

We identify novel proteins whose cell cycle function has not been previously characterized. FAM111B is a pseudoperiodic protein in cluster 1 (Fig. 6B,

sequence, predicted interactions with PCNA and peak protein abundance in S-phase, we propose that FAM111B also is likely to play a key role in DNA replication.
We present an unbiased pseudotemporal analysis of protein abundance changes during 8 biochemically resolved mitotic states (P9 to P16 in Fig. 5B) with a resolution that is extremely challenging to obtain with high precision using arrest and release methodologies. The protein clusters are functionally related. For example, clusters 4 and 5 both contain proteins essential for mitotic progression but differ in when during mitosis the functions are required. Cluster 4 contains proteins directly involved in or directly downstream of the spindle assembly checkpoint that are degraded upon checkpoint satisfaction. These regulatory pathways ensure that proper spindle microtubule-chromatid attachments are formed prior to loss of sister-chromatid cohesion and separation of the sister chromatids. By contrast, cluster 5 contains proteins that are functional throughout mitosis, such as the chromosome passenger complex (CPC), or primarily in cytokinesis, such as ECT2, PRC1, RACGAP1 and ARHGAP11A. Interestingly, several core subunits of the APC/C E3 ligase are also present in cluster 4. Their degradation at the end of mitosis is expected to significantly decrease APC/C-mediated substrate degradation, promote accumulation of substrates and facilitate rapid progression into the next cell cycle.

A high proportion of proteins in clusters 4 and 5 (24/69, 35%) are experimentally validated APC/C substrates, which represents a 70-fold overrepresentation in these two clusters compared to non-pseudoperiodic proteins (0.5%). Previous studies have identified APC/C-Cdh1 and APC/C-Cdc20 substrates by bioinformatic analysis of co-regulation, stabilization by siRNA depletion of Cdc20 or Cdh1, and immunoprecipitation of APC/C at different timepoints during mitosis. Interestingly, the high mitotic phase resolution and purity obtained in this study enabled unbiased identification and separation of APC/C substrates.
As discussed above, clusters 4 and 5 differ in the representation of ABBA and D-box short linear motifs, key degrons that are recognized by APC/C-Cdc20. Note that there are an additional 44 proteins in these two clusters that have not been previously experimentally validated as APC/C substrates and are candidates for future follow-up analysis as novel, uncharacterized substrates.

High resolution classification of cell cycle state is an important prerequisite to obtaining meaningful biological insights into single cell 'omics' data. However, datasets on the cell cycle regulated transcriptome and proteome generally provide low time resolution, particularly in mitosis. This is even more important for single cell proteomics. Whereas transcriptional and translational activity are dampened during mitosis, there are tremendous changes in protein phosphorylation and protein abundances, which will contribute towards single cell proteome variation.

Here we have identified a cell cycle signature composed of the abundances from 119 pseudoperiodic proteins that can be used to classify the cell cycle state of a cell population by virtue of the proteome. By using a split train/test strategy, we showed a

The performance of the kNN model was assessed using one replicate as the test set.