
This post was last edited by 细胞海洋 on 2010-7-20 15:26
Modern biology has become an information science. Since the invention of a
DNA sequencing method by Sanger in the late seventies, public repositories
of genomic sequences have been growing exponentially, doubling in size every
16 months, a rate often compared to the growth of semiconductor transistor
densities in CPUs known as Moore’s Law. In the nineties, the public–private
race to sequence the human genome further intensified the fervor to generate
high-throughput biomolecular data from highly parallel and miniaturized
instruments. Today, sequencing data from thousands of genomes, including
plant, mammalian, and microbial genomes, are accumulating at an unprecedented
rate. The advent of second-generation DNA sequencing instruments,
high-density cDNA microarrays, tandem mass spectrometers, and high-power
NMR spectrometers has fueled the growth of molecular biology into a wide
spectrum of disciplines such as personalized genomics, functional genomics,
proteomics, metabolomics, and structural genomics. Few experiments in
molecular biology and genetics performed today can afford to ignore the vast
amount of biological information publicly accessible. Suddenly, molecular
biology and genetics have become data rich.
Biological data mining is a data-guzzling turbo engine for postgenomic
biology, driving the competitive race toward unprecedented biological
discovery opportunities in the twenty-first century. Classical bioinformatics
emerged from the study of macromolecules in molecular biology, biochemistry,
and biophysics. Analysis, comparison, and classification of DNA and protein
sequences were the dominant themes of bioinformatics in the early nineties.
Machine learning mainly focused on predicting gene and protein functions
from their sequences and structures. The understanding of cellular functions
and processes underlying complex diseases was out of reach. Bioinformatics
scientists were a rare breed, and their contribution to molecular biology and
genetics was considered marginal, because the computational tools available
then for biomolecular data analysis were far more primitive than the array
of experimental techniques and assays that were available to life scientists.
Today, we are witnessing the reversal of these past trends. Diverse sets
of data types that cover a broad spectrum of genotypes and phenotypes,
particularly those related to human health and diseases, have become available.
Many interdisciplinary researchers, including applied computer scientists,
applied mathematicians, biostatisticians, biomedical researchers, clinical
scientists, and biopharmaceutical professionals, have discovered in biology a
gold mine of knowledge leading to many exciting possibilities: unraveling the
tree of life, harnessing the power of microbial organisms for renewable energy,
finding new ways to diagnose disease early, and developing new therapeutic
compounds that save lives. Much of the experimental high-throughput biology
data are generated and analyzed “in haste,” therefore leaving plenty of
opportunities for knowledge discovery even after the original data are released.
Most of the bets on the race to separate the wheat from the chaff have been
placed on biological data mining techniques. After all, when easy,
straightforward, first-pass data analysis has not yielded novel biological
insights, data mining techniques must be able to help, or so many presumed.