Aromatase (P450arom, CYP19A1) is the terminal enzyme in the synthesis of the steroid hormone family of estrogens. Not surprisingly, this enzyme has structural similarities between the limited number of species studied thus far. This study examined the structure of aromatases from four diverse Australian species including a marsupial (tammar wallaby; Macropus eugenii), monotreme (platypus; Ornithorhynchus anatinus), ratite (emu; Dromaius novaehollandiae) and lizard (bearded dragon; Pogona vitticeps). We successfully built homology models for each species, using the only crystallographically determined structure available, human aromatase. The amino acid sequences showed high amino acid sequence identity to the human aromatase: wallaby 81%, platypus 73%, emu 75% and bearded dragon at 74%. The overall structure was highly conserved among the five species, although there were non-secondary structures (loops and bends) that were variable and flexible that may result in some differences in catalytic activity. At the N-terminal regions, there were deletions and variations that suggest that functional distinctions may be found. We found that the active sites of all these proteins were identical, except for a slight variation in the emu. The electrostatic potential across the surfaces of these aromatases highlighted likely variations to the protein-protein interactions of these enzymes with both redox partner cytochrome P450 reductase and possibly homodimerization in the case of the platypus, which has been postulated for the human aromatase enzyme. Given the high natural selection pressures on reproductive strategies, the relatively high degree of conservation of aromatase sequence and structure across species suggests that there is biochemically very little scope for changes to have evolved without the loss of enzyme activity.
The synthesis of the steroid hormone family of estrogens is a defining step in the reproductive ability of animal species. The biosynthesis of steroid hormones begins with cholesterol, which is converted into a variety of steroid hormones by a series of enzymatically catalyzed chemical reactions. A key step in estrogen production is a final transformation of the precursors, androstenedione or testosterone, into estrogens by a three-step, six-electron process. It is catalyzed by a cytochrome P450 enzyme called aromatase (P450arom, CYP19A1) [1, 2].
Cytochrome P450 enzymes are a superfamily of heme-thiolate enzymes that achieve stereoselective insertion of an oxygen atom into an otherwise inactive carbon-hydrogen bond [3–5]. Two electrons from the reducing nicotinamide adenine dinucleotide phosphate (NADPH) are required for the catalytic cycle. Electrons are transferred to the microsomal aromatase from a membrane bound flavoprotein, NADPH cytochrome P450 reductase (CPR). This requires a specific protein-protein interaction between CPR and aromatase.
Since estrogens are essential for reproduction, which is under intense natural selection pressures, estrogen biosynthesis dictates the fitness and success of many higher order species across evolution. Additionally, estrogen has many different regulatory roles in different species . This is achieved partially by the production of estrogen by different tissues, such as gonads, placenta, trophoblast and brain. To enable tissue-specific expression of the aromatase gene (CYP19A1) for estrogen production, the pig family replicated the gene twice [6–11]. In contrast, most other mammalian species replicated only the first exon, enabling the use of alternative tissue-specific promoters .
To date, X-ray crystallography has resolved the structural details of only one natively isolated aromatase, obtained from human placenta . Since then, recombinantly modified human aromatase has been expressed and structurally characterized by Ghosh and colleagues [13, 14]. However, aromatase has not been crystallized from any other species. Furthermore, despite the essential importance of this enzyme, aromatase has only been isolated and sequenced from a few animal species, and even fewer have been characterized in terms of catalytic activity [1, 15].
Previously, we investigated the suiform family, including pigs, in which ancient gene duplication has resulted in three tissue-specific aromatase. Employing a range of biophysical (Quartz crystal microbalance with Dissipation monitoring using biomimetic lipid bilayers) and biochemical methods (FRET and enzyme kinetics assays), we previously compared the human with porcine aromatase enzymes . Experimental analysis of human P450arom and porcine gonadal , placental  and the pre-implantation blastocyst  P450arom displays significantly different kinetics. The human aromatase has a catalytic efficiency (Vmax/Km, min−1 /μmol) of 14.1 compared with porcine placental and gonadal isozyme, which are 19.2 and 7.8, respectively . Selecting the least catalytically efficient porcine isozyme (gonadal) of aromatase to compare with the human aromatase, we undertook sequence analysis, building homological models for structural comparisons, and finally molecular dynamics simulations and calculations that supported differences in oligomerization in vitro. This in silico analysis found that homodimerization occurs with human aromatase , but this appears not to occur with the porcine gonadal aromatase, and this could be the reason for the differences in enzyme kinetics between the two species . These computational data were further supported by in vitro biophysical studies using purified recombinant and wild-type aromatase proteins with or without their redox partner CPR and also visualized by Atomic Force Microscopy (AFM) in solution at a membrane interface . These data led us to propose that other species among the evolutionary tree might also have developed alternative ways to regulate the activity of aromatase.
The availability of genomes for many species provides an opportunity to data-mine aromatase sequences across representative species of classes that lie at essential junctures across evolution. By doing this, we may begin to elucidate subtle evolutionary differences between these species which has led to their survival. Here we analyzed the aromatase enzyme across four Australian species that included the classes of reptilia, aves and mammalia. Specifically, bearded dragon (Pogona vitticeps), emu (Dromaius novaehollandiae), platypus (Ornithorhynchus anatinus) and tammar wallaby (Macropus eugenii) were compared with human P450arom.
Analysis of the complexes was conducted using NAMD  and figures constructed using Pymol (The PyMOL Molecular Graphics System, Version 2.0 Schrödinger, LLC.) and Chemaxon software (ChemAxon Ltd.).
Amino acid sequences were downloaded from NCBI for human (NP_001334185.1), bearded dragon (XP_020645700.1) and platypus (XP_003429099.2) species. The tammar wallaby aromatase used an unpublished sequence provided by Professor Marilyn Renfree (University of Melbourne) and colleagues as noted in our acknowledgements. The emu sequence (XP_025952868.1) was provided prior to publication  by Dr. Tim Sackton of Harvard University with assistance from Dr Chris Smith, Monash University. Alignment of sequences was undertaken using Clustal W 2.1 . Species-specific amino acids are annotated by the species name, i.e., h = human, w = wallaby, p = platypus, e = emu and d = dragon.
Three-dimensional models of the wallaby, platypus, emu and dragon aromatase were built using the crystal structure of human aromatase as a template (Protein Data Bank entry no. 3EQM) in Modeller v9.11  and Swiss Model . Model quality was determined using the Structure Assessment module in Swiss Model, which includes a combination of Ramachandran Plots, MolProbity , QMean  and visual analysis (Supplementary Figure S1). The models that scored the highest across multiple categories were utilized in further analysis. Analysis of the resultant models suggested that Swiss Model was able to create higher quality models of the dragon species. Conversely, the Modeller program was able to create higher quality models using the wallaby, platypus and emu species. Flexible regions were removed from the N- and C-termini. The heme and androstenedione from the human structure were merged into the Australian animal structures and minimized under the Tripos Force Field for 10 000 iterations or until convergence was reached. Local protein contact potentials were generated with vacuum electrostatics in Pymol, using default settings (The PyMOL Molecular Graphics System, Version 2.0 Schrödinger, LLC New York, NY, 2020).
The molecular dynamics program NAMD  was used to minimize the protein models. The proteins were initially solvated with TIP3P water using the Solvate plugin within VMD version 1.9 (VMD—Visual Molecular Dynamics, molecular graphics software, https://www.ks.uiuc.edu/Research/vmd/). The rotate to minimize volume was selected, the boundary reduced to 1.8 and the box padding increased to 20 Å in all directions; all other parameters were kept at default. Charges were then neutralized with NaCl using the autoionize plugin within VMD version 1.9 using default settings. Each structure was minimized and equalized for 1 nanosecond (2 femtosecond time step) under the CHARMM27 all-atom force field at 298 K. Langevin dynamics were used with group pressure, and Langevin piston was turned on. Trajectory snapshots were collected every picosecond. After approximately 200 picoseconds, the average root mean square deviation (RMSD) had leveled to approximately 2 Å and analysis was conducted after this time point. The total energy of the system was calculated during and at the end of the simulation runs and averaged.
The amino acid sequences from the four animal sequences were aligned to the human sequence (Figure 1). The sequences from the four animals showed high amino acid sequence identity to the human aromatase; wallaby 81%, platypus 73%, emu 75% and bearded dragon at 74%. The wallaby had the highest sequence similarity to human which is not surprising as evolutionarily they are the most similar of the five species. Regardless, there were still some major sequence differences between the P450arom of the different species. A closer look at these differences showed that some may impact their ability to function in the same manner as the human aromatase. The biggest difference between the aromatase of all four species was at the N-termini. Specifically, when compared to the human, the wallaby is truncated at the N-terminus, with 69 residues fewer than the human aromatase. Importantly, in the crystallized human aromatase , this region forms the A-helix. Thus, the lack of these amino acids may affect the overall 3D structure of the wallaby P450arom. The N-terminal region anchors the enzyme into the membrane bilayer; thus, the significance of the species variation across this region could have a role in enzyme mobility and thus function.
We also assessed the ability of the N-terminus of all species to be a part of a transmembrane domain using ΔG Predictor [25, 26]. This program uses the distribution pattern of amino acids present in a trans-membrane region (typically an alpha helix) to well defined physicochemical properties, that are employed to enable the biological machinery to send the protein to the membrane for insertion into the endoplasmic reticulum . In this program, the amino acid sequences are inputs, and the graphical outputs (Supplementary Figure S2) show the predicted ΔGapp values against the position in the sequence for different helix lengths (L ). Notably, in all our species, a trans-membrane helix is predicted within the first 50 amino acids (Supplementary Figure S2). However, the N-terminal helix of the wallaby aromatase was the least likely to insert as a transmembrane domain, as the expected change in predicted free energy (−ΔG) was slightly positive. However, the authors of the ΔG Predictor program note that “In principle, a negative value of ΔGapp indicates that the sequence is predicted to be recognized as a trans-membrane helix by the Sec  and integrated into the membrane. On the other hand, a positive value of ΔGapp does not necessarily imply that the sequence is not trans-membrane. Even if the predicted ΔGapp value is positive, that does not necessarily imply that the sequence entered cannot be trans-membrane. All it suggests is that the segment would not be inserted efficiently by itself and may need stabilizing interactions from surrounding helices” (http://dgpred.cbr.su.se/index.php?p=instructions). Thus, these data suggest that the wallaby aromatase sequence is less convincing as a trans-membrane helix region, especially with charged residues (Lys, Glu and Arg) present and which are not observed in the N-terminal trans-membrane helix in the other species. Further biophysical data using expressed wallaby aromatase enzyme would be required to provide a definitive functional significance for this region.
There are subtle variations in specific amino acid residues between species that may be associated with activity and function of the aromatase. For example, one of the residues that we have reported as capable of forming a functional homodimer, h290 D , is also found in the platypus (p299D) and the emu (e289D), whereas, the wallaby and dragon have w221E and d290 E, respectively. This residue is located within the human H-helix and together with other charged residues creates an electrostatic homodimeric interface for human aromatase . In this study, we compared the human with the porcine gonadal (pg ) aromatase structure using homology models and molecular dynamics and found that the porcine gonadal aromatase was more stable as a monomer. Additionally, those data were supported by dramatic differences in enzyme activities and biophysical properties . Thus, the dimeric human aromatase exhibited a close salt-bridge interaction at 2.6 Å, between h290D⋯h293R, whereas the porcine gonadal aromatase showed no contact at the equivalent residues, pg288E | pg291K suggesting it was less likely to form a stable dimer, and we proposed that it is likely to function as a monomer. The equivalent residues in the species investigated here are w221E⋯w224G, p299D⋯p302R, e289D⋯e292A and d290E⋯d293G. This implies that the platypus is the only species we examined that is likely to exhibit activity and catalytic efficiency of aromatase similar to that found in humans. More detailed investigation, including expression and determination of the activities of the wallaby, emu and dragon P450arom is now required.
We examined the CPR-binding region, the heme-binding domain and the substrate-binding site. The residues which form the CPR binding region in humans are predominately positively charged arginine (K) and lysine (R) residues, which allow for the electrostatic interaction with the CPR . In the human, this consists of eight positively charged residues: 108K,142K,145R,150K,352K, 354K, 420K and 425 R; highlighted in green in Figure 1. The only difference between all four species is a single amino acid difference in the emu sequence where the human 352K is replaced by an asparagine (N). This minor change from a basic to an amidic amino acid would be expected to have only a slight influence on the binding affinity of the CPR. In comparison with the porcine aromatases, the positively charged amino acids required for CPR binding were identical to the human except for the porcine blastocyst isozyme that had 352 E . Thus, it is difficult to gauge whether this single change to the CPR binding in the emu will impact the strength of the protein-protein interaction or affect the nature of the electron transfer needed between CPR and P450arom. Site-directed mutagenesis and enzyme activity studies on expressed aromatase would be required and were beyond the scope of this study.
In the heme-binding domain, there are more amino acid differences between the human (430FGFGPRGCA) and the other four species. This is surprising as all species are expected to bind heme. The wallaby sequence, also being a mammal, is predicted to have identical heme-binding amino acids to the human. The next most similar sequence to the human is the emu, which has a single minor conservative residue change (438A → V), which we expect to have no significant impact on the binding of the heme. The platypus also has a single difference from the human sequence; however, this is a non-polar to polar mutation (436G → S), which, depending on the location, may reduce hydrophobic interactions and thus the binding affinity for the heme. Interestingly, the bearded dragon sequence had both the conservative platypus change (438A → V) and the less conservative emu change (436G → S), plus another non-polar to polar substitution (432F → S). Since heme is essential for the transfer of electrons, these changes suggest that they do not interfere with binding to a heme group in a major way.
In the substrate-binding site, there is a requirement to bind the hydrophobic androstenedione and allow for its conversion to estrone. Therefore, the active site of aromatase needs to consist of predominately hydrophobic, non-polar residues. According to Ghosh et al. , in the human crystal structure, the residues which form direct van der Waals interactions with androstenedione are 115R, 133I, 134F, 221F, 224W, 306A, 310T, 370V, 373V, 374M and 477L. However, in order to orient the substrate correctly and to stabilize intermediates, there needs to be polar interactions with the two distal oxygens of the androstenedione. In the human crystal structure, this is achieved via amino acids 309D and 374M. Specifically, the 309D interacts with O1 (oxygen 1) on the androstenedione via its side-chain and the backbone of 374 M interacts with O2 (oxygen 2) on of androstenedione . All of the aromatase residues directly interacting with androstenedione mentioned above are common between all five species with the exception of the emu that has a conservative residue substitution at h373V becoming a similarly non-polar amino acid isoleucine (I) in the emu sequence (e372I).
Using both Swiss Model and Modeller, we were able to successfully create high quality models of the wallaby, platypus, emu and dragon aromatase proteins. For further analysis, we used the Swiss Model dragon models and the models created by Modeller for the wallaby, platypus and emu. Amino acids computationally predicted to be unstructured were deleted from both the N- and C- termini. Molecular dynamics minimizations were then undertaken on models of each species. The total energies of each species were found to be similar (Table 1); therefore, visual analysis was required to identify specific differences between the structures of each animal species.
|Minimum||−154 367||−167 206||−161 000||−161 009|
|Maximum||−154 344||−167 189||−159 480||−160 993|
|Mean||−154 355||−167 200||−160 160||−161 001|
The homology models were aligned in Pymol (Figure 2), and the ligand (androstenedione) and heme were manually docked in and re-minimized. Based on the reasonably high sequence homology, it was not unexpected that all species aligned relatively well. The root mean square deviation (RMSD) was calculated by aligning the alpha carbons of the other species to the human structure (Supplementary Table S1 and Supplementary Figure S1). This gave RMSDs of 0.12, 0.19, 0.23 and 0.12 Å for wallaby, platypus, dragon and emu, respectively. Although the wallaby had the closest RMSD to the human, as anticipated from the sequence alignments, the wallaby is missing structure at the N-terminus when compared to the human crystal structure. Specifically, this consists of two alpha helices and approximately one and a half beta strands from the first beta sheet. Although not unexpected, the extent of the missing structural components was surprising, as from the sequence, we only expected a single alpha helix not to be present.
Visually, the backbone alignment of all the other species looks structurally similar to the human crystal structure. At approximately 0.1 Å worse than the other species RMSD, the dragon only had very minor differences in overall structural features. Specifically, this was just some very minor differences at the C-terminus, perhaps due to the two amino acids truncation in the sequence when compared to the human.
Local protein contact potentials using vacuum electrostatics were successfully generated for each animal model using Pymol. The surface charges (Figure 3) represent the surface electrostatic morphology with regions of positive (blue), negative (red) and neutral (white) surface potential. The human aromatase (Figure 3A) illustrates the positively charged (R/K) residues, shown in blue spheres that are needed to bind the redox partner protein CPR. These residues, discussed earlier, are all conserved across species (see Figure 1) except for the emu sequence in which the human 352 K is replaced by an asparagine (N). In Figure 3A, this substitution reduces the positive charge (Supplementary Figure S4); however, the reduction from eight to seven positive charges is not expected to have a significant impact on the overall strength of CPR binding, although possibly it could impact the coupling efficiency of the electron transfer form CPR of aromatase in the emu.
Another feature of interest is the hypothesized human aromatase dimer interface . The 90° rotation of Figure 3A shows the homodimerization interface (Figure 3B). As discussed above, analysis of the amino acid residues that form a close salt bridge in the postulated human aromatase dimer interface suggests that the platypus aromatase is the only one of these five species likely to exhibit activity and catalytic efficiency of aromatase similar to that found in the human. In Figures 3C–F, the animal species are oriented similarly to the human to show the equivalent interface used by the human aromatase for homodimerization. Interestingly, the charge distribution across the wallaby, platypus, emu and dragon has broadly similar patterns, but are different to the human aromatase, that has a “stripe” of positive charge flanked by negative charges along the dimerization interface. These models are also shown in Supplementary Figure S3, where each aromatase structure is rotated through 90° stages to provide a 3D visualization of the electrostatic surfaces. Supplementary Figure S4 provides a qualitative but relative comparison between the species, and the subtle changes observed could in turn translate into altered activity of the respective aromatases.
Computational and visual analysis of the models showed that the active site residues of the human, dragon, platypus and wallaby are very similar. This is interesting, as the dragon not only has less sequence identity to the human than the wallaby, but the structural model of the dragon also has a higher RMSD than the wallaby, when compared to the human (Figure 4).
The human structure has 12 hydrophobic interactions with androstenedione (h133I, h134F, h221F, h224W, h306A, h309D, h310T, h370V, h372L, h373V, h374M and h477L). These equivalent residues are found in the dragon, platypus and the wallaby and thus as expected, these same hydrophobic interactions are observed. As expected from the sequence analysis, the oxygens of the androstenedione are orientated into place by hydrogen bonds between the side-chain of 309D and O1 and 374M and O2 for human. Evolutionary, this might be interesting, as subtle variation in these structures could alter the sequential hydroxylation reactions that result in aromatization of the steroid A-ring.
There is one very small difference in the amino acid sequences of human and emu, which was predicted to have a very minor effect on androstenedione binding. Namely, h373V is an e407 I in emu. Visual analysis of this residue shows that they align perfectly (Figure 5) with the extra carbon of e407I predicted to give a slight increase in hydrophobic interactions with the heme and the androstenedione.
The homology models built for wallaby, platypus, emu and dragon represent a wide and diverse variation in species and yet, these show good 3D structural similarities. The amino acid sequences from the four Australian animals showed high amino acid sequence identity to the human aromatase; wallaby 81%, platypus 73%, emu 75% and bearded dragon at 74%. In the 3D structures of all the enzymes, with regard to the active sites, we found that the dragon, platypus and wallaby had identical substrate binding sites to the human aromatase. Comparisons between the emu and the human showed only small differences with variations due to conserved amino acid substitutions. Among the five species, the aromatase structure was highly conserved, although there were non-secondary structures (loops and bends) that were variable and flexible, and these may result in some differences in the catalytic activity. Certainly, the electrostatic potential across the surfaces of these aromatases highlighted likely variations to the protein-protein interactions of these enzymes with CPR and possibly, like the porcine gonadal isoform, an inability to homodimerize in the case of the wallaby, emu and dragon. Given the high natural selection pressures on reproductive strategies, the relatively high degree of conservation of aromatase sequence and structure across species suggests that there is biochemically very little scope for changes to have evolved whilst still retaining aromatase activity.
Access to the revised tammar wallaby genome 3.4, was kindly provided by Thomas Heider, Asao Fujiyama, Geoff Shaw, Marilyn Renfree, Andrew Pask and Rachel O’Neill. We also thank Dr’s Craig Smith and Tim Sackton for access to the emu aromatase sequence prior to publication. The emu sequence is now published on NCBI and can be found at https://www.ncbi.nlm.nih.gov/genome/123?genome_assembly_id=391045. We thank Dr John Bruning, School of Biological Sciences, University of Adelaide for invaluable discussions on aromatase structure.
All the authors declare they have no financial or other potential conflicts of interest.