With the rapid increase in sequence data from the Human Genome Project, genomics and bioinformatics play an increasingly
important role in biological research. One of the key applications of bioinformatics in genome science is the creation and
maintenance of biological databases. These databases, which are designed to store, manage and retrieve biological data using
computational technology, have become essential for life science research.
The mainstream public sequence databases, such as NCBI, EMBL and DDBJ, store data in a general format that does not capture
specialized types of information. To meet biologists' specific research needs, several smaller, specialized databases have
been constructed.
Overall EST statistics were compiled for human tissues. Data were collected from two sites: the majority from the
National Center for Biotechnology Information (www.ncbi.nlm.nih.gov) and the remainder from the Beijing Institute of
Genomics (www.big.cas.cn/). The second set represents data that we generated independently. The sequences were downloaded
from the dbEST database, which contained 8,444,018 reported EST entries from Homo sapiens as of October 2011. Groups of one
hundred or more ESTs were treated as libraries, which resulted in 5,943,083 EST sequences. These sequences were sorted by
physiological and pathological origin based on sequence source. Physiological sequences from different tissues or cells were
used to study gene expression relationships. EST sequences from specific pathological tissues can help in investigating
the causes of disease. Go to wikicell ftp.
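
The library-size filter and the sorting by origin described above can be illustrated with a minimal sketch. It is not the
pipeline actually used to build the database. The (est_id, library_id) record layout, the library names, the origin labels
and all function names are assumptions made for illustration; only the threshold of one hundred ESTs comes from the text.

```python
from collections import defaultdict

MIN_LIBRARY_SIZE = 100  # "one hundred or more" ESTs, as stated in the text


def group_by_library(records):
    """Group EST identifiers by library identifier.

    `records` is an iterable of (est_id, library_id) pairs. This layout is
    an assumption for illustration, not the dbEST record format.
    """
    libraries = defaultdict(list)
    for est_id, library_id in records:
        libraries[library_id].append(est_id)
    return dict(libraries)


def keep_large_libraries(libraries, min_size=MIN_LIBRARY_SIZE):
    """Keep only groups that contain at least `min_size` ESTs."""
    return {lib: ests for lib, ests in libraries.items() if len(ests) >= min_size}


def split_by_origin(libraries, origin_of):
    """Sort libraries into origin classes.

    `origin_of` is a hypothetical mapping from library_id to a label such
    as "physiological" or "pathological".
    """
    groups = defaultdict(dict)
    for lib, ests in libraries.items():
        groups[origin_of[lib]][lib] = ests
    return dict(groups)


# Toy data: two groups above the threshold and one below it.
records = [(f"EST{i}", "lib_normal_liver") for i in range(120)]
records += [(f"EST{i}", "lib_tumor") for i in range(120, 270)]
records += [(f"EST{i}", "lib_small") for i in range(270, 300)]

kept = keep_large_libraries(group_by_library(records))
origin_of = {"lib_normal_liver": "physiological", "lib_tumor": "pathological"}
by_origin = split_by_origin(kept, origin_of)
```

In this toy run the 30-EST group is discarded, and the two remaining libraries are placed in separate origin classes.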