2.2 Genome assembly
Nanopore long reads were first error-corrected to obtain clean reads using
Canu (Koren et al., 2017). Genome assembly was performed using
SMARTdenovo (https://github.com/ruanjue/smartdenovo). The assembly of
error-corrected long reads was then polished in three rounds with Pilon
(Walker et al., 2014), using the gDNA Illumina short reads and default
parameters, to produce the nanopore-assembled genome of O. bidens. Genome integrity was
assessed from the alignment rate of the gDNA short reads mapped to the
assembly with BWA (Li and Durbin, 2009). Completeness was further evaluated by
identifying the CEGMA core genes present in the assembly (Parra et al., 2007)
and by searching the assembled genome with BUSCO against the vertebrata_odb10
database (Simao et al., 2015).
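
As an illustration of the read-mapping check described above, the following minimal Python sketch computes an alignment rate from SAM records such as those written by BWA. The function name `mapping_rate` and the toy records are hypothetical. The sketch assumes the rate is defined as mapped primary records divided by all primary records, which may differ from the exact statistic reported in this study.

```python
from typing import Iterable

# Flag bits as defined in the SAM format specification
FLAG_UNMAPPED = 0x4
FLAG_SECONDARY = 0x100
FLAG_SUPPLEMENTARY = 0x800


def mapping_rate(sam_lines: Iterable[str]) -> float:
    """Return the fraction of primary SAM records that are mapped."""
    total = 0
    mapped = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # the second SAM column holds the bitwise flag
        if flag & (FLAG_SECONDARY | FLAG_SUPPLEMENTARY):
            continue  # count each read once, via its primary record
        total += 1
        if not flag & FLAG_UNMAPPED:
            mapped += 1
    return mapped / total if total else 0.0


# Toy records (hypothetical): one mapped pair, one unmapped pair,
# and one supplementary record that is ignored.
toy_sam = [
    "@SQ\tSN:contig1\tLN:1000",
    "r1\t99\tcontig1\t10\t60\t5M\t=\t50\t45\tACGTA\tIIIII",
    "r1\t147\tcontig1\t50\t60\t5M\t=\t10\t-45\tACGTA\tIIIII",
    "r2\t77\t*\t0\t0\t*\t*\t0\t0\tACGTA\tIIIII",
    "r2\t141\t*\t0\t0\t*\t*\t0\t0\tACGTA\tIIIII",
    "r1\t2048\tcontig1\t300\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
]
print(f"Alignment rate: {mapping_rate(toy_sam):.2%}")
```

In practice the same statistic is usually read from an alignment summary tool rather than computed by hand; the sketch only makes the definition explicit.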
For chromosome-level assembly, Hi-C sequencing was performed for O.
bidens. Raw Hi-C sequencing data were filtered to obtain high-quality
clean reads using HiC-Pro with default parameters (Servant et al., 2015),
and the clean-read pairs were mapped to the polished O. bidens genome using
BWA in end-to-end mode (Li and Durbin, 2009). Only valid interaction pairs
were used to construct the chromosome-level genome of O. bidens using
LACHESIS with default parameters (Burton et al., 2013).
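
The principle behind Hi-C scaffolding can be illustrated with a short Python sketch: contigs that share many Hi-C read pairs are more likely to lie on the same chromosome. The sketch below only counts read pairs that link different contigs. It is not the LACHESIS implementation, and the names `HiCPair` and `count_contig_links`, the mapping-quality threshold of 30, and the toy pairs are assumptions introduced purely for illustration.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class HiCPair(NamedTuple):
    """One Hi-C read pair, reduced to the contig and MAPQ of each mate."""
    contig1: str
    mapq1: int
    contig2: str
    mapq2: int


def count_contig_links(pairs: Iterable[HiCPair], min_mapq: int = 30) -> Counter:
    """Count confidently mapped read pairs that join two different contigs."""
    links = Counter()
    for pair in pairs:
        if pair.mapq1 < min_mapq or pair.mapq2 < min_mapq:
            continue  # drop pairs with an ambiguously placed mate
        if pair.contig1 == pair.contig2:
            continue  # intra-contig pairs carry no scaffolding signal here
        # Sort the names so (A, B) and (B, A) are counted as the same link
        key = tuple(sorted((pair.contig1, pair.contig2)))
        links[key] += 1
    return links


# Toy pairs (hypothetical)
toy_pairs = [
    HiCPair("ctgA", 60, "ctgB", 60),
    HiCPair("ctgB", 42, "ctgA", 55),
    HiCPair("ctgA", 60, "ctgA", 60),  # same contig, ignored
    HiCPair("ctgA", 10, "ctgC", 60),  # low MAPQ, ignored
    HiCPair("ctgB", 60, "ctgC", 60),
]
links = count_contig_links(toy_pairs)
for (left, right), n in sorted(links.items()):
    print(left, right, n)
```

A real scaffolder additionally normalizes such counts (for example by contig length or restriction-site density) before clustering, ordering, and orienting contigs; those steps are omitted here.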