2.4 Identification of TFs, CDS, lncRNAs, SSRs, and transposable
elements
Transcription factors (TFs) were identified by BLAST searches
against the AnimalTFDB database (Hu et al. 2019). Coding sequences (CDS) of the unigenes were
annotated by BLAST searches against the NR, Swiss-Prot, and KOG databases. The
long non-coding RNAs (lncRNAs) were predicted from transcripts without
coding potential using Coding Potential Calculator 2
(Kang et al. 2017), Coding
Non-coding Index (Sun et al. 2013), Pfam (Mistry et al. 2021),
and PLEK (Li et al. 2014), with a
minimum length of 200 bp and a minimum ORF of 300 bp as the cut-off criteria. Simple
sequence repeats (SSRs) were identified using MISA
(Beier et al. 2017). Transposable
elements were identified using RepeatMasker (http://www.repeatmasker.org).
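
The lncRNA screening step described above can be illustrated with a minimal Python sketch. It assumes that each tool's output has already been parsed into a list of transcript IDs called non-coding (for Pfam, transcripts without a domain hit), and that the candidate set is the intersection of those calls, a common convention that the text does not state explicitly. The function name, the input layout, and the example IDs are hypothetical; only the 200 bp length cut-off comes from the text, and the ORF criterion is not modeled here.

```python
from typing import Dict, Iterable, Set

MIN_LENGTH_BP = 200  # minimum transcript length given in the text


def consensus_noncoding(
    lengths: Dict[str, int],
    noncoding_calls: Dict[str, Iterable[str]],
    min_length: int = MIN_LENGTH_BP,
) -> Set[str]:
    """Return transcript IDs that pass the length cut-off and are
    called non-coding by every tool in `noncoding_calls`.

    Assumption: consensus means intersection across all tools.
    """
    # Keep only transcripts that meet the minimum length.
    candidates = {tid for tid, n in lengths.items() if n >= min_length}
    # Retain a transcript only if each tool also calls it non-coding.
    for ids in noncoding_calls.values():
        candidates &= set(ids)
    return candidates


# Hypothetical transcript lengths (bp) and per-tool non-coding calls.
example_lengths = {"t1": 250, "t2": 150, "t3": 900, "t4": 300}
example_calls = {
    "CPC2": ["t1", "t2", "t3"],
    "CNCI": ["t1", "t2", "t3", "t4"],
    "Pfam": ["t1", "t2", "t3", "t4"],
    "PLEK": ["t1", "t2", "t3"],
}
lncrna_candidates = consensus_noncoding(example_lengths, example_calls)
print(sorted(lncrna_candidates))
```

In this toy input, t2 is excluded by the length cut-off and t4 because CPC2 does not call it non-coding, leaving t1 and t3 as candidates.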