OAF: a new member of the BRICHOS family

Abstract
Summary: The 10 known BRICHOS domain-containing proteins in humans have been linked to an unusually long list of pathologies, including cancer, obesity and two amyloid-like diseases. BRICHOS domains themselves have been described as intramolecular chaperones that act to prevent amyloid-like aggregation of their proteins' mature polypeptides. Using structural comparison of coevolution-based AlphaFold models and sequence conservation, we identified the Out at First (OAF) protein as a new member of the BRICHOS family in humans. OAF is an experimentally uncharacterized protein that has been proposed as a candidate biomarker for clinical management of coronavirus disease 2019 infections. Our analysis shows how structural comparison of AlphaFold models can uncover remote homology relationships and lead to a better understanding of the BRICHOS domain's molecular mechanism.
Supplementary information: Supplementary data are available at Bioinformatics Advances online.


Introduction
The 10 human BRICHOS domain-containing proteins are divided among five subfamilies: proSP-C/SFTPC (one member), ITM/BRI (ITM2A, BRI2/ITM2B and BRI3/ITM2C), BRICD5 (one member), Gastrokines (GKN1, GKN2 and GKN3) and Tenomodulin/Chondromodulin (TNMD and CHM1/LECT1). For most of them, the precursor consists of three parts: an N-terminal transmembrane region (part of either a signal peptide for secretion or a signal-anchor for type II membrane proteins), a long BRICHOS domain, and a shorter C-terminal mature polypeptide generated by proteolytic cleavage. BRICHOS domain-containing proteins have been characterized as pre-pro-proteins which, once secreted, undergo proteolytic processing, usually by proprotein convertases, releasing their mature form (Chen et al., 2022; Hedlund et al., 2009; Knight et al., 2013; Sanchez-Pulido et al., 2002; Willander et al., 2011). Furin cleavage of the transmembrane protein BRI2, for example, yields a 23 amino acid polypeptide that may then be released from the precursor molecule. Two different mutations of its termination codon extend its reading frame, yielding the neurotoxic polypeptides ABri and ADan, which are 11 amino acids longer than normal (Vidal et al., 1999, 2000). These extended polypeptides are deposited as amyloid fibrils, causing the neuropathologies called familial British and Danish dementias. Mutations in another member of the BRICHOS family, proSP-C (pro-Surfactant Protein C), also cause an amyloid-like disorder called interstitial lung disease, a heterogeneous group of respiratory pathologies that affect the normal function of the surfactant monolayer that covers the alveoli and allows gas exchange in the lungs (Gustafsson et al., 1999; Sáenz et al., 2015).
To date, only a single high-resolution structure of a BRICHOS domain has been determined (Willander et al., 2012). More structural information, however, has recently been forthcoming from coevolution-based structure prediction algorithms, such as AlphaFold or trRosetta (Du et al., 2021; Jumper et al., 2021). Not only have these methods yielded a step-change in high-quality structure prediction, they have also substantially modified strategies used for computational protein analysis and remote homology identification (Monzon et al., 2022; Sanchez-Pulido and Ponting, 2021). Twenty years after the discovery of the BRICHOS domain (Sanchez-Pulido et al., 2002), we decided to take advantage of these modified strategies and AlphaFold-predicted structures to undertake an in-depth exploration of the human BRICHOS domain family. Our investigation revealed a previously unknown human BRICHOS domain-containing protein, its putative proprotein convertase cleavage site and its associated mature 70 amino acid polypeptide, which we predict to have amyloidogenic properties.
Results and discussion

Structural comparison between the known proSP-C structure and AlphaFold models of the BRICHOS family
The X-ray structure of proSP-C shows its BRICHOS domain to contain a central β-sheet composed of four consecutive anti-parallel β-strands followed by a fifth β-strand parallel to β-strand 4 (Fig. 1A) (Willander et al., 2012). As expected, all AlphaFold BRICHOS domain models adopt the same topology, including two α-helices: one that is variably located, and another that is shorter and establishes a conserved disulphide bridge with β-strand 4 (Fig. 1A). Nevertheless, the proSP-C BRICHOS domain is atypical in two respects. First, it lacks three additional β-strands: two (β-strands 1' and 2') that continue the β-sheet at its N-terminus and a third that intervenes between β-strands 4 and 5 (Fig. 1A). Second, the SP-C mature polypeptide sequence is N-terminal to its BRICHOS domain, whereas it is C-terminal for all other human BRICHOS domain-containing proteins. In full-length proSP-C, this mature polypeptide is bound and stabilized by the groove ('face A') within the BRICHOS domain β-sheet (Willander et al., 2012).
We started by inspecting all AlphaFold-predicted structures of human BRICHOS domain proteins (Tunyasuvunakool et al., 2021), focusing first on the locations of their mature polypeptides. For the BRI2 model, this mature polypeptide is located within its face A binding groove (Supplementary Fig. S1A). Indeed, all but three mature polypeptides of human BRICHOS family members are located in this groove in the models, consistent with the interacting surface identified experimentally for proSP-C (Willander et al., 2012). In the three exceptions (Tenomodulin, Chondromodulin and proSP-C), the mature polypeptide lies outside the model's binding groove, as if it were a separate domain structure (Supplementary Fig. S1B). This major structural difference may reflect lower binding affinities between their BRICHOS domains and mature polypeptides, resulting in weaker evolutionary constraints and therefore different predicted tertiary structures. It is also plausible that AlphaFold is predicting only one of these proteins' conformations, specifically either the mature polypeptide-bound or the unbound form.

Structural similarity searches against the AlphaFold human proteome
Structural comparison of BRICHOS AlphaFold models thus defined a conserved structural core for the BRICHOS domain, including β-strands β1', β2', β1-4, β4' and β5, and the second α-helix. Subsequent structural searches with Dali (Holm, 2022) were undertaken against the AlphaFold human proteome (Tunyasuvunakool et al., 2021), using human BRICHOS domain cores as query structures. Unexpectedly, these searches identified a new member of the human BRICHOS family, namely the OAF (Out at First) protein. Structural searches were convergent, identifying statistically significant structural similarity between OAF and different members of the BRICHOS family (Figs 1B and 2). No further human BRICHOS family domains were identified.

(C) The significance of profile-to-profile matches was evaluated in terms of an E-value, which estimates the number of observations of better sequence matches expected in a database by chance alone (Zimmermann et al., 2018). The E-values correspond to HHpred searches against the entire Pfam profile database (including profiles independently generated for each human BRICHOS subfamily), using profiles of each human BRICHOS subfamily as query. For example, in an HHpred profile versus profile comparison search, the OAF profile matched the GKN (Gastrokine subfamily), BRI (ITM subfamily) and SPC (proSP-C subfamily) profiles with E-values 1.6 × 10⁻⁴, 0.008 and 0.015, respectively (Supplementary Fig. S1).
As structural similarity may result from either divergence from a common ancestor (i.e. homology) or else convergence, we next investigated amino acid sequence similarities between OAF and the BRICHOS domain sequences. For this, we performed a sequence conservation analysis using the HHpred profile-to-profile comparison tool (Zimmermann et al., 2018). Statistically significant similarities were identified between OAF and BRICHOS domain sequences, indicative of homology (Fig. 1C and Supplementary Fig. S2).
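As a small illustration of how E-values are compared, the following Python sketch ranks the three profile matches quoted in the text for the OAF query. Only the E-values come from the text; the dictionary layout and the function name `rank_hits` are illustrative assumptions and are not part of HHpred.

```python
# E-values quoted in the text for the OAF profile against BRICHOS subfamily profiles.
# A lower E-value means fewer equally good matches are expected by chance alone.
OAF_HITS = {"BRI": 0.008, "SPC": 0.015, "GKN": 1.6e-4}


def rank_hits(hits):
    """Return (profile, e_value) pairs sorted from most to least significant."""
    return sorted(hits.items(), key=lambda item: item[1])


if __name__ == "__main__":
    for name, e_value in rank_hits(OAF_HITS):
        print(f"{name}\t{e_value:.1e}")
```

Sorting in ascending order places the Gastrokine profile first, since it has the smallest E-value of the three.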
OAF shows five conserved features commonly found in BRICHOS family members (Knight et al., 2013; Sanchez-Pulido et al., 2002): (i) a predicted N-terminal transmembrane helix as part of a signal peptide facilitating secretion; (ii) a putative proprotein convertase (Furin or Furin-like) cleavage site, followed by (iii) two anti-parallel β-strands likely covalently linked by disulphide bonds, whose nested arrangement of conserved cysteines within the predicted mature polypeptide is consistent with it adopting an extended anti-parallel hairpin structure (Fig. 2 and Supplementary Fig. S3); (iv) a predicted disulphide bridge between α-helix 2 and β-strand 4, whose conserved cysteines have been implicated in a homopolymerization mechanism in reducing conditions, key for the ATP-independent chaperone function of these domains (Leppert et al., 2022); and (v) a highly conserved aspartic acid (Asp74) located at the end of β-strand 2 (Supplementary Figs S2 and S3). This residue has recently been implicated in a pH-dependent regulatory mechanism of the BRICHOS domain's chaperone activity (Chen et al., 2022). It is also expected to be functionally important in BRICHOS domains because its mutation in proSP-C causes interstitial lung disease (Pobre-Piza et al., 2022; Willander et al., 2012).
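Feature (ii) can be illustrated with a minimal Python sketch that scans a sequence for a furin-like motif. It assumes the commonly cited minimal consensus Arg-X-(Lys/Arg)-Arg with cleavage after the final arginine; the text does not state which motif definition or tool was used here, and the toy sequence below is invented for illustration and is not the OAF sequence.

```python
import re

# Assumed minimal furin-like consensus: Arg-X-(Lys/Arg)-Arg, cleaved after the last Arg.
# The lookahead makes overlapping motifs visible to finditer.
FURIN_LIKE = re.compile(r"(?=(R.[KR]R))")


def find_furin_like_sites(sequence):
    """Return 1-based positions of the last motif residue (the residue before the cut)."""
    # m.start() is the 0-based motif start; the fourth motif residue is at start + 3,
    # which is start + 4 in 1-based numbering.
    return [m.start() + 4 for m in FURIN_LIKE.finditer(sequence.upper())]


# Hypothetical toy sequence, NOT the OAF sequence.
TOY_SEQUENCE = "MKTAYRAKRSLLCC"

if __name__ == "__main__":
    print(find_furin_like_sites(TOY_SEQUENCE))
```

A motif match alone is only a candidate site; whether it is actually cleaved depends on accessibility and cellular context, which a pattern scan cannot assess.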
Statistical significance of sequence and structural comparisons, and the presence in OAF of five features conserved among the BRICHOS family, are sufficient to infer that OAF is an 11th and previously unanticipated member of the human BRICHOS family.

The OAF family
Oaf was originally described in Drosophila, where its function was related to neuronal development and hatching (Bergstrom et al., 1995). Phyletically, OAF is widely distributed in animals, including cnidarians, arthropods, annelids, molluscs, echinoderms and chordates, but it is absent from the nematode Caenorhabditis elegans (Pfam entry: PF14941) (Mistry et al., 2021). Human OAF is poorly characterized experimentally and its physiological roles are unknown. It is ubiquitously expressed in human tissues, with high expression in brain (in particular in astrocytes), gastrointestinal tract, liver and respiratory system (The Human Protein Atlas; Uhlén et al., 2015). It is also expressed highly in the eye's crystalline lens and is a candidate gene for Peters anomaly type 2 (involving corneal opacity) and ectopia lentis (dislocation or displacement of the lens) (David et al., 2018). Oaf gene knockout mice exhibit abnormal eye phenotypes (International Mouse Phenotyping Consortium; http://www.mousephenotype.org/data/genes/MGI:94852).

Fig. 2. Structural superposition of BRI2 and OAF. Structural similarity of the BRI2 and OAF AlphaFold models is shown. Top: BRI2 and OAF BRICHOS domains, corresponding to positions 84-233 and 29-172, respectively. Cartoons of BRI2 and OAF are coloured in violet and red, respectively. The structural superposition of the BRI2 and OAF BRICHOS domains (top row and middle column) was generated using Dali (Holm, 2022); other models in this figure are shown in this orientation. Middle: BRICHOS domains including their respective mature polypeptides. BRI2 and OAF mature polypeptide cartoons are coloured in violet and dark red, respectively. Bottom: BRI2 and OAF mature polypeptides, corresponding to positions 244-266 and 204-273, respectively. Disulphide bridges in the mature polypeptides (one in BRI2 and four in OAF) are shown as yellow sticks. AlphaFold structural models were rendered using PyMOL (http://www.pymol.org).
Down-regulation of OAF in kidney has been associated with defects in tubular re-uptake of albumin (Teumer et al., 2019). In agreement with this, knockdown of Oaf in Drosophila nephrocytes reduces albumin endocytosis (Teumer et al., 2019).
A de novo heterozygous missense mutation (T171I) was recently identified in OAF and described as a putative cause of a musculoskeletal and neurological developmental disorder (Kaplanis et al., 2020). Human OAF residue T171 is well conserved across vertebrates and is located at the end of its BRICHOS domain (Supplementary Figs S2 and S3). Its mutation to isoleucine is predicted by PolyPhen2 to be probably damaging (score: 0.992; sensitivity: 0.70; specificity: 0.97) (Adzhubei et al., 2010).
OAF's predicted mature polypeptide is longer than those of other BRICHOS family members. It contains 70 residues (residues 204-273), making it about three times as long as the BRI2 mature peptide (23 amino acids), for example. It is likely to be highly stable, owing to its four conserved disulphide bridges (Fig. 2 and Supplementary Fig. S3), and its sequence is highly conserved (48% identity between human OAF and Drosophila melanogaster Oaf), indicative of functional conservation.
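The length comparison can be checked directly from the inclusive residue ranges given in the text and in the Fig. 2 caption (OAF 204-273, BRI2 244-266). The short Python sketch below does this arithmetic; the function name `span_length` is an illustrative assumption.

```python
def span_length(first, last):
    """Number of residues in an inclusive range first..last."""
    return last - first + 1


# Inclusive residue ranges of the mature polypeptides, as quoted in the text.
OAF_MATURE_LENGTH = span_length(204, 273)
BRI2_MATURE_LENGTH = span_length(244, 266)
LENGTH_RATIO = OAF_MATURE_LENGTH / BRI2_MATURE_LENGTH

if __name__ == "__main__":
    print(OAF_MATURE_LENGTH, BRI2_MATURE_LENGTH, round(LENGTH_RATIO, 2))
```

The ranges give 70 and 23 residues, a ratio of roughly 3.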
Other mature polypeptides of the BRICHOS family form amyloid-like structures (Chen et al., 2022; Hedlund et al., 2009; Knight et al., 2013; Willander et al., 2011). To investigate this for OAF, we applied a machine learning approach, AMYPred-FRL, to its wild-type sequence (Charoenkwan et al., 2022). This predicted OAF to have the highest probability (97%) of forming amyloid-like structures among all BRICHOS domain mature polypeptides (Supplementary Fig. S4A). This is consistent with the OAF BRICHOS domain having an intramolecular chaperone function that hinders aggregation of its mature polypeptide.
OAF protein is a candidate biomarker for progression and clinical management of pulmonary tuberculosis and coronavirus disease 2019 (COVID-19)-associated pneumonia, owing to a 1.3-fold increase in its expression level in untreated tuberculosis patients versus healthy controls (Lu et al., 2022), and its approximately 1.6- to 2.2-fold greater abundance in sera or plasma from critical versus non-critical cases of COVID-19 (Calvet et al., 2022; Di et al., 2020). We note that SARS-CoV-2 infection complications in lungs and brain, such as acute respiratory distress syndrome and acute neurological disorder, have been proposed to be amyloid-related pathologies (Ziff et al., 2022; Sinha and Thakur, 2021) and, furthermore, impaired amyloid processing has been implicated in patients with COVID-19-associated neurological syndromes (Ziff et al., 2022). Experimental investigation of whether the OAF mature polypeptide forms amyloid structures (Supplementary Fig. S4B) or whether the OAF BRICHOS domain is anti-amyloidogenic in COVID-19 and other disease contexts is thus justified.