PDF e-book: Biological.Data.Mining - Classic PDF E-book Download Area - 干细胞之家 (Stem Cell Home)




Views: 734514 | Replies: 278 [Biology-related disciplines] PDF e-book: Biological.Data.Mining

viogriffin


Original poster

Posted on 2010-7-20 15:17

This post was last edited by 细胞海洋 on 2010-7-20 15:26

Modern biology has become an information science. Since the invention of a DNA sequencing method by Sanger in the late seventies, public repositories of genomic sequences have been growing exponentially, doubling in size every 16 months, a rate often compared to the growth of semiconductor transistor densities in CPUs known as Moore’s Law. In the nineties, the public–private race to sequence the human genome further intensified the fervor to generate high-throughput biomolecular data from highly parallel and miniaturized instruments. Today, sequencing data from thousands of genomes, including plant, mammalian, and microbial genomes, are accumulating at an unprecedented rate. The advent of second-generation DNA sequencing instruments, high-density cDNA microarrays, tandem mass spectrometers, and high-power NMRs has fueled the growth of molecular biology into a wide spectrum of disciplines such as personalized genomics, functional genomics, proteomics, metabolomics, and structural genomics. Few experiments in molecular biology and genetics performed today can afford to ignore the vast amount of biological information publicly accessible. Suddenly, molecular biology and genetics have become data rich.
Biological data mining is a data-guzzling turbo engine for postgenomic biology, driving the competitive race toward unprecedented biological discovery opportunities in the twenty-first century. Classical bioinformatics emerged from the study of macromolecules in molecular biology, biochemistry, and biophysics. Analysis, comparison, and classification of DNA and protein sequences were the dominant themes of bioinformatics in the early nineties. Machine learning mainly focused on predicting gene and protein functions from their sequences and structures. The understanding of cellular functions and processes underlying complex diseases was out of reach. Bioinformatics scientists were a rare breed, and their contribution to molecular biology and genetics was considered marginal, because the computational tools available then for biomolecular data analysis were far more primitive than the array of experimental techniques and assays that were available to life scientists. Today, we are witnessing the reversal of these past trends. Diverse sets of data types that cover a broad spectrum of genotypes and phenotypes, particularly those related to human health and diseases, have become available. Many interdisciplinary researchers, including applied computer scientists, applied mathematicians, biostatisticians, biomedical researchers, clinical scientists, and biopharmaceutical professionals, have discovered in biology a goldmine of knowledge leading to many exciting possibilities: unraveling the tree of life, harnessing the power of microbial organisms for renewable energy, finding new ways to diagnose disease early, and developing new therapeutic compounds that save lives. Much of the experimental high-throughput biology data are generated and analyzed “in haste,” leaving plenty of opportunities for knowledge discovery even after the original data are released. Most of the bets in the race to separate the wheat from the chaff have been placed on biological data mining techniques. After all, when easy, straightforward, first-pass data analysis has not yielded novel biological insights, data mining techniques must be able to help, or so many presumed.

