Draft Genome Sequence of the Symbiotic Frankia sp. strain B2 isolated from root nodules of Casuarina cunninghamiana found in Algeria

Frankia sp. strain B2 was isolated from Casuarina cunninghamiana nodules. Here, we report the 5.3-Mbp draft genome sequence of Frankia sp. strain B2 with a G+C content of 70.1% and 4,663 candidate protein-encoding genes. Analysis of the genome revealed the presence of high numbers of secondary metabolic biosynthetic gene clusters.


Introduction
Actinobacteria of the genus Frankia are Gram-positive filamentous bacteria that are able to fix molecular nitrogen in the free-living state or in symbiosis with their host plant [1,2]. These bacteria establish a nitrogen-fixing symbiosis with a diverse variety of plant species, collectively named actinorhizal plants, which include 8 dicotyledonous plant families, 24 genera and over 220 species. The mutualistic association is referred to as the actinorhizal symbiosis and results in the formation of a root nodule structure. The bacteria are housed within plant cells in the nodule, which allows for trophic exchange between the two partners. The bacteria reduce atmospheric nitrogen to ammonia that is supplied to the host plant, which in return provides carbon compounds from photosynthesis to the bacteria.
Because of the symbiosis, actinorhizal plants can colonize poor and degraded soils and thrive in inhospitable and harsh habitats [2]. Actinorhizal plants are pioneer species that allow the succession of other plant communities by providing organic matter, a fundamental matrix for the dynamics and biodiversity of terrestrial ecosystems. There is currently renewed interest in the actinorhizal symbiosis because of its significant contribution (more than 15%) to the global input of combined nitrogen into soils [3].
Based on recent molecular phylogenetic studies, Frankia strains are classified into four major clusters [4-6]. Actinorhizal species include Casuarina spp., tropical trees native to Australia, Southeast Asia and Oceania [7]. These woody plants are well adapted to drought, heat, salinity and polluted soils, and can withstand a wide variety of environments [2]. This property is one reason why they have been massively planted in several regions of the globe for land reclamation, prevention of erosion, crop protection and protection against desertification, tsunamis and typhoons [7]. In Algeria, as in the rest of the Maghreb, Casuarina trees were introduced in the 19th century and are now widespread in all bioclimatic zones of the country, from the coastal zone to the Saharan areas. Today, Casuarina trees are propagated mostly from plantlets produced in nurseries from seeds or cuttings. As part of a project that aims to reassess the identity, distribution and relative abundance of Casuarina trees in Algeria, we were interested in investigating the prevalence of the actinorhizal symbiosis in nurseries from different regions of the country, and in examining whether symbiotic status can help plantlets establish in natural environments. For this purpose, we collected nodule samples from young Casuarina trees in Algerian nurseries and isolated the symbiotic Frankia strain.

Isolation of Frankia strain B2
Frankia strain B2 was isolated in a two-step process from nodules collected from Casuarina cunninghamiana seedlings growing in a nursery located at Souk El Tenine (District of Bejaia, Algeria). For the first step, the collected nodules were crushed and used as an inoculum on Casuarina glauca plants growing hydroponically in N-free BD medium [8] in a culture chamber under controlled conditions (25°C, 75% relative air humidity and a 16-h photoperiod). After 8 weeks, root nodules were observed and harvested. For the second step, harvested nodules were washed, fragmented and surface-sterilized by immersion in a 30% H2O2 solution for 30 min following a previously described protocol [9]. Sterilized nodule fragments were inoculated onto the surface of different solid growth media including BAP [10], DPM (Defined Propionate Minimal Medium) [11] or modified QMOD [12] under nitrogen-free conditions (without yeast extract and peptone for QMOD). Plates were incubated in the dark at 28°C. After 4-6 weeks, Frankia hyphae developed around the nodule fragments inoculated on BAP medium, and these colonies were transferred into liquid BAP growth medium. Figure 1 shows the different stages of the isolation process, and the photomicrographs show typical Frankia features. Frankia has three morphogenetic forms: vegetative hyphae (Hy), vesicles (Ve), which are the site of nitrogen fixation, and sporangia containing spores (Sp). All three types of cell structures were produced by Frankia strain B2 (Fig. 1K-M). Frankia strain B2 was able to re-infect C. cunninghamiana, and the nodules produced (Fig. 1N-O) showed a higher level of nitrogenase activity than C. cunninghamiana nodules formed with Frankia casuarinae strain CcI3, the type strain [13] (Fig. 2). Nitrogenase activity of the C. cunninghamiana nodules was measured as acetylene reduction activity (ARA) [14]. Because Frankia strain B2 had these traits and represented an Algerian isolate, we chose to sequence its genome.

Sequencing of Frankia strain B2
Sequencing of the draft genome of Frankia sp. strain B2 was performed at the Hubbard Center for Genome Studies (University of New Hampshire, Durham, NH) using Illumina technology [15]. High-quality genomic DNA (gDNA) of Frankia sp. strain B2 was extracted using the CTAB method [16]. A standard Illumina shotgun library was constructed and sequenced using the Illumina HiSeq2500 platform, which generated 4,247,110 reads (260-bp insert size) totaling 965 Mbp. The Illumina sequence data were trimmed with Trimmomatic version 0.36 [17] and assembled using SPAdes version 3.10 [18]. The final draft assembly for Frankia sp. strain B2 consisted of 145 contigs with an N50 contig size of 103.6 kb and 176× coverage of the genome. The final assembled genome contained a total sequence length of 5,331,433 bp with a G+C content of 70.12%.
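For readers unfamiliar with the assembly statistics quoted above (N50, G+C content and fold coverage), the following minimal Python sketch shows how such values are conventionally computed from a set of contigs. It is an illustration only and not the pipeline used in this study; the function names and the toy contigs are hypothetical.

```python
def n50(lengths):
    """Return the N50: the length of the contig at which the running sum
    of contig lengths, taken from longest to shortest, first reaches
    half of the total assembly length."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty assembly


def gc_percent(sequences):
    """Return the G+C content (%) over all unambiguous bases (A, C, G, T)."""
    gc = 0
    acgt = 0
    for seq in sequences:
        seq = seq.upper()
        gc += seq.count("G") + seq.count("C")
        acgt += sum(seq.count(base) for base in "ACGT")
    return 100.0 * gc / acgt if acgt else 0.0


def mean_coverage(total_read_bases, assembly_length):
    """Return the mean fold coverage: sequenced bases per assembled base."""
    return total_read_bases / assembly_length


if __name__ == "__main__":
    # Hypothetical toy contigs, not data from strain B2.
    contigs = ["GGCCGGCCAT", "GCGCAT", "GGCC"]
    print("N50:", n50([len(c) for c in contigs]))
    print("G+C %:", round(gc_percent(contigs), 2))
    print("Coverage:", mean_coverage(total_read_bases=2000, assembly_length=20))
```

In practice, the read total entered into the coverage calculation depends on whether raw, trimmed or mapped reads are counted, so a value computed this way may differ slightly from the figure reported by an assembler.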
The assembled Frankia sp. strain B2 genome was annotated via the NCBI Prokaryotic Genome Annotation Pipeline (PGAP), which identified 4,663 candidate protein-encoding genes, 41 tRNA genes and 5 rRNA genes. The genome features of Frankia sp. strain B2 are similar to those of other cluster 1c genomes (Table 1), including F. casuarinae type strain CcI3 [13]. Phylogenetic analysis of the 16S rDNA shows that Frankia sp. strain B2 groups with the cluster 1c strains (Figure S1), and this placement was further confirmed by a dendrogram of the entire genomes (Figure S2). The genome also contained one nif, two hup, and one shc operon encoding nitrogenase, uptake hydrogenase enzymes, and the hopanoid biosynthetic pathway, respectively. The operons were organized similarly to those reported for Frankia cluster 1c genomes [19]. The pan-genome of Frankia cluster 1c consisted of 4,736 genes, including a core genome of 3,107 genes. Figure S3 shows a Venn diagram of the orthologs shared among six Frankia cluster 1c strains.
Bioinformatic analysis of this genome with the antiSMASH program [20] revealed the presence of high numbers of secondary metabolic biosynthetic gene clusters, which is consistent with previous results for other Frankia genomes, including those of subcluster 1c [19,21]. Table 2 compares the profiles of these secondary metabolic biosynthetic gene clusters among different Casuarina isolates. Although the majority of these secondary metabolic biosynthetic gene clusters were shared among the F. casuarinae genomes, the Frankia sp. strain B2 genome contained five unique nonribosomal peptide synthetase (NRPS) clusters that had no homologues in other microbes; consequently, little information is available on the chemical structures of their natural products. Predicted monomers for some of these unique NRPS clusters were identified, but no structure could be predicted by the algorithm.
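The pan-genome and core-genome counts given above follow from simple set operations once each gene has been assigned to an ortholog family. The sketch below illustrates that logic in Python; it assumes ortholog families have already been computed by some clustering tool (the method is not specified here), and the strain names other than B2 and CcI3 and all family identifiers are hypothetical placeholders.

```python
def pan_and_core(genomes):
    """Given a dict mapping strain name -> set of ortholog family IDs,
    return (pan_genome, core_genome) as sets.

    The pan-genome is the union of all families; the core genome is the
    intersection, i.e. families present in every strain."""
    families = list(genomes.values())
    if not families:
        return set(), set()
    pan = set().union(*families)
    core = set.intersection(*families)
    return pan, core


def unique_families(genomes, strain):
    """Return the families found in `strain` and in no other strain."""
    others = set().union(
        *(fams for name, fams in genomes.items() if name != strain)
    )
    return genomes[strain] - others


if __name__ == "__main__":
    # Hypothetical toy input; family IDs are placeholders, not real data.
    genomes = {
        "B2": {"fam_nif", "fam_hup", "fam_shc", "fam_nrps_a"},
        "CcI3": {"fam_nif", "fam_hup", "fam_shc"},
        "strain_X": {"fam_nif", "fam_shc", "fam_pks_b"},
    }
    pan, core = pan_and_core(genomes)
    print("Pan-genome size:", len(pan))
    print("Core genome size:", len(core))
    print("Unique to B2:", sorted(unique_families(genomes, "B2")))
```

The same intersections, computed for every combination of strains, give the region counts of a Venn diagram such as the one in Figure S3.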
In summary, the Frankia sp. strain B2 genome reveals an interesting potential for secondary metabolite pathways and natural products, and serves as another representative of Frankia cluster 1c.

Nucleotide sequence accession numbers
This whole-genome shotgun sequence has been deposited at DDBJ/EMBL/GenBank under the accession number SOPN00000000.1. The version described in this paper is the first version, SOPN01000000.