Parameter-bootstrap verification
Accurate phylogenetic inference depends on the accurate and efficient
acquisition of target genes. For non-model species, selecting a
suitable reference from closely related taxa remains challenging.
To address this issue, we developed
a parameter-bootstrap verification solution in GeneMiner. This method
generates a new set of simulated reference data by employing a mutation
model and repeated re-sampling. GeneMiner then uses the new set
of simulated reference data as input to assemble the target gene again.
We observed that target genes with bootstrap scores below 90 tend to
have an unstable assembly, and therefore may not be reliable reference
choices. While target genes with bootstrap scores above 90 can also
exhibit an unstable assembly, their overall indel and substitution rates
are comparatively lower. Our statistical analysis led us to select
target genes with bootstrap scores of 90 or higher, which allowed us to
mitigate the potential impact of unstable assembly. This indicates that
the parameter-bootstrap verification method is effective in evaluating
assembly results and can guide reference sequence selection. GeneMiner
is more sensitive to false substitutions than to insertions and
deletions, because the verification does not account for insertions and
deletions in the reference data. Users should therefore treat large
insertions and deletions in the assembly results with caution, although
such indels have minimal impact on the phylogenetic results once they
have been corrected. As an added measure to
ensure accuracy when using GeneMiner, we suggest that users employ a
tool such as trimAl (Capella-Gutiérrez et al., 2009) to trim the
alignment file before constructing a phylogenetic tree.
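
For illustration only, the verification loop described in this section can be sketched in Python. This is a minimal sketch under stated assumptions and not GeneMiner's implementation: the substitution-only mutation model, the `assemble` callback (a stand-in for a real re-assembly run), the crude position-wise `percent_identity` score, and all function and parameter names are hypothetical. The substitution rate `sub_rate` is assumed to have been estimated elsewhere.

```python
import random

BASES = "ACGT"


def mutate_reference(seq, sub_rate, rng):
    """Simulate a reference by substituting each base with probability sub_rate.

    Assumption: a uniform, substitution-only model; indels are not simulated.
    """
    out = []
    for base in seq:
        if base in BASES and rng.random() < sub_rate:
            # Replace with one of the three other bases, chosen uniformly.
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def percent_identity(a, b):
    """Crude position-wise identity (0-100), normalised by the longer sequence.

    Assumption: a stand-in for a proper pairwise alignment score.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / longest


def bootstrap_score(reference, original_assembly, assemble,
                    sub_rate, n_replicates=10, seed=0):
    """Mean identity between the original assembly and re-assemblies
    obtained from simulated references."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_replicates):
        simulated = mutate_reference(reference, sub_rate, rng)
        reassembled = assemble(simulated)  # hypothetical re-assembly step
        scores.append(percent_identity(original_assembly, reassembled))
    return sum(scores) / len(scores)


def select_stable(gene_scores, threshold=90.0):
    """Keep genes whose bootstrap score is at or above the threshold."""
    return {g: s for g, s in gene_scores.items() if s >= threshold}
```

In this sketch a gene whose re-assemblies stay close to the original assembly scores near 100, and `select_stable` applies the cutoff of 90 used above. Because only substitutions are simulated, the sketch shares the limitation noted in the text: it says little about insertions and deletions.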