2.4 Identification of TFs, CDSs, lncRNAs, SSRs, and transposable elements
Transcription factors (TFs) were identified by BLAST searches against the AnimalTFDB database (Hu et al. 2019). Coding sequences (CDSs) of the unigenes were annotated by BLAST searches against the NR, Swiss-Prot, and KOG databases. Long non-coding RNAs (lncRNAs) were predicted from transcripts lacking coding potential using Coding Potential Calculator 2 (Kang et al. 2017), Coding-Non-Coding Index (Sun et al. 2013), Pfam (Mistry et al. 2021), and PLEK (Li et al. 2014), with a minimum length of 200 bp and a minimum ORF of 300 bp as the cut-off criteria. Simple sequence repeats (SSRs) were identified using MISA (Beier et al. 2017). Transposable elements were identified using RepeatMasker (http://www.repeatmasker.org).
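The lncRNA filtering step can be illustrated with a minimal sketch. It assumes that a transcript is kept only when all four tools classify it as non-coding (the text does not state how the tool outputs were combined, so the intersection rule is an assumption) and that it meets the 200 bp length cut-off. The function name, the input structures, and the toy transcript IDs are hypothetical. The ORF criterion is not applied here because it would require ORF predictions that are not part of this sketch.

```python
from typing import Dict, Iterable, Set

MIN_LENGTH_BP = 200  # minimum transcript length stated in the text


def consensus_lncrna_candidates(
    transcript_lengths: Dict[str, int],
    noncoding_calls: Iterable[Set[str]],
    min_length: int = MIN_LENGTH_BP,
) -> Set[str]:
    """Return IDs called non-coding by every tool and at least min_length bp long."""
    call_sets = list(noncoding_calls)
    if not call_sets:
        return set()
    # Assumption: a transcript must be called non-coding by all tools.
    shared = set.intersection(*call_sets)
    # Transcripts with unknown length are treated as length 0 and dropped.
    return {
        tid for tid in shared
        if transcript_lengths.get(tid, 0) >= min_length
    }


# Hypothetical toy input: transcript lengths in bp and per-tool non-coding calls.
lengths = {"t1": 1500, "t2": 180, "t3": 950, "t4": 420}
cpc2 = {"t1", "t2", "t3"}
cnci = {"t1", "t2", "t4"}
pfam = {"t1", "t2", "t3", "t4"}  # here: transcripts without a Pfam domain hit
plek = {"t1", "t2", "t3"}

candidates = consensus_lncrna_candidates(lengths, [cpc2, cnci, pfam, plek])
print(sorted(candidates))
```

In this toy input, t1 and t2 are the only transcripts shared by all four call sets, and t2 is then removed by the length cut-off. In practice each set would be parsed from the corresponding tool's output file.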