Draft genome sequence of Actinomadura sp. K4S16 and elucidation of the nonthmicin biosynthetic pathway

Actinomadura sp. K4S16 (=NBRC 110471) produces nonthmicin, a novel tetronate polyether compound. Here, we report the draft genome sequence of this strain together with features of the organism and the assembly, annotation, and analysis of the genome sequence. The 9.6 Mb genome of Actinomadura sp. K4S16 encoded 9,004 putative ORFs, of which 7,701 were assigned to COG categories. The genome contained four type-I polyketide synthase (PKS) gene clusters, two type-II PKS gene clusters, and three nonribosomal peptide synthetase (NRPS) gene clusters. Among the type-I PKS gene (t1pks) clusters, a large t1pks cluster was annotated as responsible for nonthmicin synthesis based on bioinformatic analyses. We also performed feeding experiments using labeled precursors and propose a biosynthetic pathway for nonthmicin.


Introduction
Actinomycetes are well known as a promising source of diverse bioactive secondary metabolites. In particular, members of the genus Streptomyces have attracted attention as the most useful screening sources for new drug leads, and a large number of bioactive compounds have been identified from cultures of this genus [1,2]. Consequently, the chance of finding novel secondary metabolites from Streptomyces members has dwindled, and the focus of screening has recently moved to less exploited genera of rare actinomycetes [3]. In our screening for novel bioactive compounds from rare actinomycetes, Actinomadura sp. K4S16 was isolated from rice field soil in Thailand and found to produce a tetronate polyether designated nonthmicin along with ecteinamycin (Fig. 1) [4]. Nonthmicin shows inhibitory activity against tumor cell invasion and protective activity against neuronal cell damage. This new polyether compound is characterized by a tetronic acid functionality modified by a chlorine atom; apart from nonthmicin, no halogenated tetronic acids are known from nature. In this study, we conducted whole genome shotgun sequencing of the strain to elucidate the biosynthetic pathway of nonthmicin. We herein present the draft genome sequence of Actinomadura sp. K4S16, together with the taxonomic identification of the strain, a description of its genome properties, and annotation of the gene cluster for nonthmicin biosynthesis. The biosynthetic pathway for nonthmicin was predicted by bioinformatic analysis and confirmed by precursor-incorporation experiments.

Sequenced strain
In the course of screening for novel bioactive substances from rare actinomycetes, Actinomadura sp. K4S16 was isolated from rice field soil collected in Thailand and found to produce a novel polyketide compound named nonthmicin and its known congener ecteinamycin (Fig. 1) [4]. Actinomadura sp. K4S16 was preserved as TP-A0891 at the Toyama Prefectural University and deposited into the NBRC culture collection, from which it is publicly available as NBRC 110471.

Chemotaxonomic analyses
The isomer of diaminopimelic acid in the whole-cell hydrolysate was analyzed according to the method described by Hasegawa et al. [5]. Isoprenoid quinones and cellular fatty acids were analyzed as described previously [6].

Phylogenetic analysis based on 16S rRNA gene sequences
The PCR template was prepared according to the protocol for Gram-positive bacteria of the DNeasy Blood & Tissue kit (Qiagen). The gene encoding 16S rRNA was amplified by PCR using two universal primers, 9F and 1541R. After purification of the PCR product with AMPure (Beckman Coulter), sequencing was carried out according to an established method [7]. A homology search of the sequence was conducted using EzBioCloud [8]. A phylogenetic tree was reconstructed from the 16S rRNA gene sequence of the strain together with those of taxonomically close type strains showing more than 98% similarity, using ClustalX2 [9].

Growth conditions and genomic DNA preparation
A monoisolate of Actinomadura sp. K4S16, isolated as a single colony, was grown on a polycarbonate membrane filter (Advantec) placed on double-diluted NBRC 227 agar medium (0.2% yeast extract, 0.5% malt extract, 0.2% glucose, 2% agar, pH 7.3) at 28 °C. High-quality genomic DNA for sequencing was extracted from the mycelia with an EZ1 DNA Tissue Kit and a BioRobot EZ1 (Qiagen) according to the manufacturer's protocol for extraction of nucleic acid from Gram-positive bacteria. To assess the quality of the genomic DNA, its size, purity, and double-stranded DNA concentration were measured by pulsed-field gel electrophoresis, the ratio of absorbance values at 260 nm and 280 nm, and the Quant-iT PicoGreen dsDNA Assay Kit (Life Technologies), respectively.

Genome sequencing and assembly
Shotgun and paired-end libraries were prepared and sequenced using 454 pyrosequencing technology and MiSeq (Illumina) paired-end technology, respectively (Table 1). The 82 Mb of shotgun sequences and 707 Mb of paired-end sequences were assembled using Newbler v2.8 and subsequently finished using GenoFinisher [10] to yield 43 scaffolds larger than 500 bp. The draft genome sequence has been deposited in the INSDC database under the accession numbers BDDE01000001-BDDE01000043. The project information and its association with MIGS version 2.0 compliance are summarized in Table 1 [11].
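As a rough illustration of what these read totals imply, nominal sequencing depth can be estimated as total sequenced bases divided by assembly length. The sketch below uses the read totals from this section and the draft genome size reported in the genome properties (9,647,292 bp). It assumes that "Mb" means 10^6 bases and that all reads contribute to the assembly, so the result is an approximation that may differ from any coverage value given in Table 1.

```python
# Read totals and assembly size as reported in the text (Mb taken as 10^6 bases).
SHOTGUN_MB = 82.0        # 454 shotgun reads
PAIRED_END_MB = 707.0    # MiSeq paired-end reads
GENOME_BP = 9_647_292    # draft genome size reported for strain K4S16


def fold_coverage(read_mb: float, genome_bp: int) -> float:
    """Nominal coverage: total sequenced bases divided by assembly length."""
    return read_mb * 1_000_000 / genome_bp


coverage_454 = fold_coverage(SHOTGUN_MB, GENOME_BP)
coverage_miseq = fold_coverage(PAIRED_END_MB, GENOME_BP)
coverage_total = coverage_454 + coverage_miseq

print(f"454: {coverage_454:.1f}x, MiSeq: {coverage_miseq:.1f}x, "
      f"combined: {coverage_total:.1f}x")
```

Under these assumptions the combined nominal depth is roughly 80-fold, most of it from the MiSeq library.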

Genome annotation
Coding sequences were predicted with Prodigal [12] and tRNA genes with tRNAscan-SE [13]. Gene functions were assigned using an in-house genome annotation pipeline, and domains related to polyketide synthase (PKS) and nonribosomal peptide synthetase (NRPS) were searched using the SMART and PFAM domain databases. PKS and NRPS gene clusters and their domain organizations were determined as reported previously [7]. Substrates of adenylation (A) and acyltransferase (AT) domains were predicted using antiSMASH [14]. A protein-protein BLAST search against the NCBI non-redundant protein sequences (nr) database was also used to predict the functions of proteins encoded in the nonthmicin biosynthetic gene cluster.

Feeding experiments using labeled precursors
Inoculation, cultivation, extraction, and purification were performed in the same manner as previously reported [4]. Supplementation with sodium [1-13C]acetate or [1-13C]propionate (20 mg/100 ml medium/flask, 10 flasks) was initiated 48 h after inoculation and repeated every 24 h, four times in total. After further incubation for 24 h, the whole culture broths were extracted with 1-butanol, and several steps of purification yielded 55 mg and 100 mg of 13C-labeled nonthmicin, respectively.

Feature, classification, and genome properties
The general features of Actinomadura sp. K4S16 are shown in Table 2. This strain grew well on ISP 2 and ISP 4 agar media, but poorly on ISP 5 and ISP 7. On ISP 2 agar medium, the aerial mycelia were white and the reverse side was pale orange. Strain K4S16 formed extensively branched substrate mycelium. The aerial mycelium formed short chains of arthrospores. A scanning electron micrograph of this strain (Fig. 2) shows that spore chains are hooked or spiral (1 turn) and the spore surface is rugose. Growth occurred at 20-45 °C (optimum 28 °C) and pH 5-8 (optimum pH 7). Strain K4S16 grew with 0-2% (w/v) NaCl (optimum 0% NaCl) and utilized arabinose, fructose, glucose, mannitol, rhamnose, sucrose, and xylose, but not raffinose, as sole carbon sources for energy and growth (all at 1%, w/v).
The draft genome size of Actinomadura sp. K4S16 was 9,647,292 bp and the G+C content was 72.4% (Table 3). Of the total 9,068 genes, 9,004 were protein-coding genes and 64 were RNA genes. The classification of genes into COG functional categories is shown in Table 4. Digital DNA-DNA hybridization (DDH) between Actinomadura sp. K4S16 and the type strain of the closest species, A. mexicana DSM 44485T, suggested that the DNA-DNA relatedness was 49.0%, which is below 70%, the cut-off point for the assignment of bacterial strains to the same species [16]. This suggests that Actinomadura sp. K4S16 is a novel independent genomospecies.

Figure caption (phylogenetic tree): The tree uses sequences aligned by ClustalX2 [9] and was constructed by the neighbor-joining method [24]. All positions containing gaps were eliminated. The building of the tree also involved a bootstrapping process repeated 1,000 times to generate a majority consensus tree, and only bootstrap values above 50% are shown at branching points. Streptosporangium roseum DSM 43021T was used as an outgroup.
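The gap-elimination step described for the tree can be pictured with a small sketch: every alignment column in which any sequence has a gap is dropped, and pairwise identity is then computed over the remaining columns. The toy alignment below is hypothetical, and the sketch does not reproduce ClustalX2, the EzBioCloud similarity calculation, neighbor joining, or bootstrapping; it only illustrates complete deletion of gapped positions.

```python
def strip_gap_columns(aligned):
    """Complete deletion: drop every column in which any sequence has a gap."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    keep = [i for i in range(length) if all(seq[i] != "-" for seq in aligned)]
    return ["".join(seq[i] for i in keep) for seq in aligned]


def percent_identity(a, b):
    """Percentage of identical positions between two equal-length sequences."""
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Hypothetical toy alignment, not real 16S rRNA gene data.
toy_alignment = ["ACG-TTA", "ACGATTA", "ACGATCA"]
ungapped = strip_gap_columns(toy_alignment)
identity_first_vs_last = percent_identity(ungapped[0], ungapped[2])

print(ungapped, round(identity_first_vs_last, 1))
```

In this toy case the fourth column is removed because the first sequence has a gap there, leaving six comparable positions per sequence.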

PKS and NRPS gene clusters in the genome
We analyzed biosynthetic gene clusters for polyketides and nonribosomal peptides in the genome. Actinomadura sp. K4S16 harbored four type-I PKS gene (t1pks) clusters, two type-II PKS gene (t2pks) clusters, and three NRPS gene (nrps) clusters, as shown in Table 5. The t1pks-1 cluster encoded only a PKS composed of ACP-KS/AT/DH/KR/ACP/ACP-TE domains, which showed 87% sequence identity to the phenolphthiocerol synthesis type-I polyketide synthase PpsD of Mycobacterium tuberculosis 401416 (CND43678), suggesting that it may synthesize phenolphthiocerol-like compounds. The t1pks-2 cluster encoded two PKSs whose domain organizations are KS/AT/KR and KS/AT, respectively. Since these PKSs did not show sequence similarity to PKSs with identified products and their domain organization is unusual, we are not able to predict the product. The t1pks-3 cluster encoded a PKS composed of KS/AT/KR/DH domains. Because such a domain organization is specific to iterative PKSs for enediyne synthesis, this gene cluster likely synthesizes enediyne-type polyketide compounds. The t1pks-4 cluster is responsible for nonthmicin synthesis, as described in the following section. The t2pks-1 cluster might synthesize aromatic compounds similar to tetarimycin A or mithramycin, because its KSα showed 70 to 71% sequence identity to TamM (AFY23044) and MtmP (CAA61989). The t2pks-2 cluster did not show high sequence similarity (less than 55% identity) to any PKSs registered in GenBank, suggesting that its product will be unique. The nrps-1 gene cluster harbored six NRPS modules, and its products were predicted to be peptides containing amino dihydroxybenzoic acid, cysteine, glycine, and methyl ornithine. The nrps-2 gene cluster encoded four modules, and its products are predicted to be composed of starter molecule-Cys-Cys-methyl Cys. The nrps-3 gene cluster had seven modules, and its products are likely hexapeptides including amino acid residues such as alanine and threonine.
The presence of these PKS and NRPS gene clusters suggests that this strain has the potential to produce diverse polyketide and nonribosomal peptide compounds as secondary metabolites.

Table footnote: The total is based on the total number of protein-coding genes in the genome.

Nonthmicin biosynthetic pathway
The chemical structure of nonthmicin suggested that its carbon skeleton is assembled from five malonyl-CoA, four methylmalonyl-CoA, and three ethylmalonyl-CoA molecules by a type-I PKS pathway. We therefore searched for a t1pks cluster consisting of 12 PKS modules. Among the four t1pks clusters present in Actinomadura sp. K4S16 (Table 5), the t1pks-4 cluster encoded six large PKSs and several enzymes related to secondary metabolite synthesis (Table 6, Fig. 4a), and its assembly line contains 12 PKS modules. Substrates of the AT domains in modules 1, 3, and 6 were predicted to be ethylmalonyl-CoA, whereas those in modules 5, 7, 8, and 9 were predicted to be methylmalonyl-CoA. According to the collinearity rule of type-I PKS pathways [17] and the chemical structure of nonthmicin, the polyketide backbone synthesized by the PKS assembly line was predicted as shown in Fig. 4b. The predicted structure is in good accordance with the nonthmicin backbone. The elongated polyketide chain is then converted to form three polyether moieties by an epoxidase and epoxide hydrase/cyclase(s) in a manner similar to nanchangmycin biosynthesis [18]. The tetronic acid part may be synthesized by ORFs K4S16_09_00680 to K4S16_09_00720, as proposed for tetronic acid-containing polyketides such as tetrocarcin A, chlorothricin, abyssomicin, and quartromicin [19][20][21][22], because these ORFs are orthologues of TcaDs, ChlM and ChlDs, AbyAs, and QunDs. Two cytochrome P450s (K4S16_09_00590 and K4S16_09_00730) and a methyltransferase (K4S16_09_00740) are probably responsible for the introduction of one hydroxy group and one methoxy group to produce ecteinamycin. Chlorination of the tetronate moiety is presumably catalyzed by a halogenase (K4S16_09_00450) to yield nonthmicin. On the basis of this bioinformatic evidence, we propose the biosynthetic pathway of nonthmicin and ecteinamycin shown in Fig. 4b.
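The consistency between the AT-domain predictions and the building-block composition inferred from the structure can be checked with a short tally. The sketch below applies the collinearity assumption that each of the 12 modules incorporates exactly one unit. The assignment of malonyl-CoA to the remaining modules (2, 4, 10, 11, and 12) is an inference from the stated composition, not an explicit statement in the text, and the function and variable names are illustrative only.

```python
from collections import Counter

N_MODULES = 12
# AT-domain substrate predictions stated in the text.
ETHYLMALONYL_MODULES = {1, 3, 6}
METHYLMALONYL_MODULES = {5, 7, 8, 9}


def at_substrate(module: int) -> str:
    """Return the predicted substrate for one module.

    Modules not listed above are ASSUMED to use malonyl-CoA.
    """
    if module in ETHYLMALONYL_MODULES:
        return "ethylmalonyl-CoA"
    if module in METHYLMALONYL_MODULES:
        return "methylmalonyl-CoA"
    return "malonyl-CoA"


# Collinearity: one building block per module, in module order.
assembly_line = [at_substrate(m) for m in range(1, N_MODULES + 1)]
tally = Counter(assembly_line)

for position, unit in enumerate(assembly_line, start=1):
    print(f"module {position:2d}: {unit}")
print(dict(tally))
```

Under this assumption the tally gives five malonyl-CoA, four methylmalonyl-CoA, and three ethylmalonyl-CoA units, matching the composition deduced from the nonthmicin structure.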

Precursor-directed biosynthesis of bromo-analogue of nonthmicin
A putative halogenase gene (K4S16_09_00450) present in the nonthmicin biosynthetic gene cluster, whose product shows 56% amino acid sequence identity and 72% similarity to HalB from Actinoplanes sp. ATCC 33002, was expected to be responsible for the halogenation (Table 6). If this gene product is also active toward bromide, it could be used for the precursor-directed biosynthesis of a brominated analogue. Indeed, supplementation of the culture with sodium bromide resulted in the production of a new nonthmicin congener (Fig. 6a) in which the chlorine atom was replaced by a bromine atom. The structure of the bromo analogue was confirmed by analysis of MS (Fig. 6b) and NMR data (data not shown).

Table footnote: a The 13C signal intensity of each peak in labeled 1 divided by that of the corresponding signal in unlabeled 1, normalized to give an enrichment ratio of 1 for the unenriched peaks of C27 and C33. The numbers in bold type indicate 13C-enriched atoms from 13C-labeled precursors.
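The enrichment ratio defined in the table footnote can be sketched as follows: each labeled-to-unlabeled intensity ratio is divided by the ratio observed for the unenriched reference carbons. The intensities and the carbon labels "Cx" and "Cy" below are hypothetical placeholders, not measured data. The footnote does not state whether C27 and C33 are averaged or each used for a different labeling experiment, so the sketch assumes the mean of the supplied reference ratios, which reduces to the single-reference case when only one carbon is given.

```python
def enrichment_ratios(labeled, unlabeled, reference_carbons):
    """Normalize labeled/unlabeled 13C intensity ratios to reference peaks.

    labeled, unlabeled: dicts mapping carbon label -> signal intensity.
    reference_carbons: carbons assumed to be unenriched (ratio set to 1).
    """
    raw = {carbon: labeled[carbon] / unlabeled[carbon] for carbon in labeled}
    reference = sum(raw[c] for c in reference_carbons) / len(reference_carbons)
    return {carbon: ratio / reference for carbon, ratio in raw.items()}


# Hypothetical intensities in arbitrary units, for illustration only.
labeled_spectrum = {"Cx": 50.0, "Cy": 10.0, "C27": 12.0, "C33": 8.0}
unlabeled_spectrum = {"Cx": 10.0, "Cy": 10.0, "C27": 10.0, "C33": 10.0}

ratios = enrichment_ratios(labeled_spectrum, unlabeled_spectrum, ["C27", "C33"])
for carbon, value in ratios.items():
    print(f"{carbon}: {value:.2f}")
```

A carbon whose ratio is well above 1 after normalization would be read as enriched by the labeled precursor, while ratios near 1 indicate no incorporation at that position.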

Conclusion
We identified the type-I PKS gene cluster for nonthmicin biosynthesis and proposed a plausible biosynthetic pathway through genome analysis of Actinomadura sp. K4S16, a producer of nonthmicin. Incorporation experiments with 13C-labeled precursors also suggested that nonthmicin is biosynthesized by a PKS pathway. These findings provide useful information not only on the biosynthetic mechanism but also for genetic engineering aimed at synthesizing more potent bioactive molecules based on the nonthmicin structure.