2.2 Genome assembly
Nanopore long reads were first corrected with the Canu package to obtain clean reads (Koren et al., 2017). The corrected reads were assembled using SMARTdenovo (https://github.com/ruanjue/smartdenovo), and the assembly was polished in three rounds with the gDNA Illumina short reads using Pilon with default parameters (Walker et al., 2014) to produce the nanopore-assembled genome of O. bidens. Assembly integrity was assessed by aligning the gDNA short reads to the assembly with BWA (Li and Durbin, 2009) and calculating the alignment rate. Completeness was further evaluated by counting the genes of the CEGMA database that were present in the assembly (Parra et al., 2007) and by running BUSCO with the vertebrata_odb10 database (Simao et al., 2015).
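As an illustration of the alignment-rate calculation, the following minimal Python sketch computes the fraction of mapped reads from SAM records, as could be produced by aligning the short reads to the assembly. It is not the script used in this study. The function name `mapping_rate` is hypothetical, and counting only primary records (skipping secondary and supplementary alignments) is an assumption about how the rate is defined.

```python
def mapping_rate(sam_lines):
    """Return the fraction of primary SAM records that are mapped.

    Assumption: the rate is mapped primary records / all primary records.
    """
    total = 0
    mapped = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # not a valid alignment record
        flag = int(fields[1])
        # Skip secondary (0x100) and supplementary (0x800) records so that
        # each read is counted once.
        if flag & 0x900:
            continue
        total += 1
        # Bit 0x4 set means the read is unmapped.
        if not flag & 0x4:
            mapped += 1
    return mapped / total if total else 0.0
```

The function accepts any iterable of SAM text lines, for example an open file handle over uncompressed SAM output.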
For chromosome-level assembly, the Hi-C technique was applied to O. bidens. Raw Hi-C sequencing data were filtered to obtain high-quality clean reads using HiC-Pro with default parameters (Servant et al., 2015), and the clean read pairs were mapped to the polished O. bidens genome using BWA in end-to-end mode (Li and Durbin, 2009). Only valid interaction pairs were used to construct the chromosome-level genome of O. bidens using LACHESIS with default parameters (Burton et al., 2013).
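The idea of a valid interaction pair can be illustrated with a simplified Python sketch. It assumes that a pair is valid when its two reads map to different restriction fragments, and it is not the HiC-Pro implementation, which applies further filters (for example for dangling ends, self-circles, religation products and duplicates). The names `fragment_index`, `is_valid_pair` and `cut_sites_by_chrom` are hypothetical.

```python
from bisect import bisect_right


def fragment_index(cut_sites, pos):
    """Index of the restriction fragment containing pos.

    cut_sites is a sorted list of cut positions on one chromosome.
    """
    return bisect_right(cut_sites, pos)


def is_valid_pair(chrom1, pos1, chrom2, pos2, cut_sites_by_chrom):
    """Simplified validity test for a mapped Hi-C read pair.

    Assumption: reads on different chromosomes, or on different restriction
    fragments of the same chromosome, form a valid interaction pair.
    """
    if chrom1 != chrom2:
        return True
    sites = cut_sites_by_chrom[chrom1]
    return fragment_index(sites, pos1) != fragment_index(sites, pos2)
```

With cut sites at positions 100, 200 and 300 on one chromosome, two reads at positions 50 and 80 fall on the same fragment and would be discarded, whereas reads at positions 50 and 150 span a cut site and would be retained.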