ASEE Prism Magazine Online - December 2001

Managing the Unmanageable


Bioinformatics, the new science that analyzes the mind-boggling amounts of computer-generated data from human genome sequencing, requires an army of scientists with abilities that cross traditional disciplinary boundaries.
- By Joannie Fischer

Never before has a new academic discipline burst so explosively into higher education's hallways. Ten years ago, hardly anyone had even heard of the newly coined term “bioinformatics,” but today, it dominates the agenda of dozens of top universities across the country. They are all racing to play a role in the hottest research arena on earth, one that will redefine medicine forever, rewrite the story of how human beings came to exist, and reveal the inner workings of the body and mind.

Since the mid-1990s, advances in DNA sequencing have promised to change our understanding of humanity and all other living creatures. But suddenly, the completion of the human genome sequence and those of several other animals over the past year has transformed the young bioinformatics industry into the fastest growing field ever, with an almost insatiable appetite for new researchers and new labs to analyze the mind-boggling amounts of computer-generated data. In tremendously short order, bioinformatics has traded the problem of a paucity of information for an embarrassment of riches, and researchers are all but lost inside all of the information. “Once, you could print out most of the existing sequence databases onto paper and cram them into a single binder,” recalls Damian Counsell, head of the bioinformatics department at the Institute for Cancer Research in London. “Now a search for ‘actin’ (one of the body's 200,000 proteins) will pull out hundreds and hundreds of lengthy sequences.”

The United States' publicly accessible genome database, GenBank, maintained by the National Institutes of Health, already includes close to 13 billion pairs of DNA “letters,” and is on pace to double in size every year for the foreseeable future. A related search tool called BLAST gets more than 50,000 hits per day from scientists comparing the properties of new genes they've found to ones that are already documented in the database. Genome giant Celera's Craig Venter predicts it will take a century of constant research to understand the data that has been produced so far. The promises of such a tireless endeavor are especially great for human health. Bioinformatics could usher in a whole new era of completely individually tailored medicine.
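
To give a sense of what doubling every year means, here is a minimal Python sketch of the arithmetic. It assumes the "close to 13 billion" figure is an exact starting point and that growth is a clean factor of two each year; the function name and the five-year horizon are illustrative choices, not figures from GenBank.

```python
def projected_base_pairs(start_pairs: int, years: int, annual_factor: float = 2.0) -> float:
    """Return the database size after `years` years of compounding growth.

    Assumes the size is multiplied by `annual_factor` once per year
    (2.0 means doubling), which is a simplification for illustration.
    """
    return start_pairs * annual_factor ** years


# Starting point quoted in the article, treated as exact for this sketch.
START_PAIRS = 13_000_000_000

if __name__ == "__main__":
    for year in range(6):
        size = projected_base_pairs(START_PAIRS, year)
        print(f"Year {year}: {size:,.0f} base pairs")
```

Under these assumptions the collection passes 100 billion base pairs in the third year, which is why storage and search tools, not sequencing itself, quickly become the bottleneck.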

“Ideally, a baby's genotype will be recorded, and a program of personalized immunization, lifestyle management, and refined treatments will be developed to target disease,” says Phyllis Gardner, a Stanford dean who is helping to launch the school's bioinformatics program.

To fulfill that promise, she says, requires “an army of scientists with abilities that cross traditional disciplinary boundaries, [but] the dearth of faculty trained across these disciplines is a global phenomenon.”

Too Much Too Soon

Leena Peltonen, chairwoman of the Department of Human Genetics at UCLA, says that industry and academia alike were poorly prepared to handle this avalanche of information, partly because no one realized the data would start pouring in so soon. The need has grown drastically for trained specialists and won't dry up anytime soon, she says. The National Science Foundation estimates that 20,000 new bioinformatics jobs will be created by 2005, and market estimates show the field is growing at a phenomenal 50 percent annual rate and will remain at 25 percent or above far into the future. And the growth is likely to accelerate as the computerized study of genetics spreads from human medicine to animal and plant research to agriculture. Twenty-five percent of all venture capital money now goes into new life-sciences start-ups, which in turn are hiring academics away from campuses at a dizzying speed, leaving schools crippled in their ability to offer high-quality training for this new research frontier.

Student interest in the new field is keen. Peltonen notes a huge leap in the number of graduate students interested in bioinformatics, up “10 times” from interest levels a year or two ago. One lure is the lofty salaries: Unlike other fields where Ph.D.s face uncertain employment, those talented in both biology and computer science enter into the field at over $90,000 a year, almost double the entry-level computer scientist salary and almost triple that of an average beginning biologist. Colleges and universities are scrambling to meet student interest and industry demand by coming up with the right combination of curricula, facilities, and brainpower to train tomorrow's bioinformaticians.

That's no easy matter, say the first faculty members to try to tailor the new programs. “Because bioinformatics is highly interdisciplinary, it requires a significant course exposure to chemistry, biology, and math/statistics courses,” says Kenneth Marx, chemistry professor at the University of Massachusetts, Lowell, who helped design a new bioinformatics/chemoinformatics curriculum implemented last year. “This high demand for different courses makes it difficult to fit within the traditional four-year academic structure for an undergraduate degree, as well as the course requirements for M.S. and Ph.D. degrees.” Furthermore, he adds, traditional academic departments are hesitant to train “Renaissance individuals” in all these academic disciplines. “However, the marketplace will drive academic departments to accommodate the necessary training.”

 

Pushing The Boundaries

Bioinformatics has indeed become the ultimate interdisciplinary study area and will forever blur the lines between engineering and biology, bringing together researchers who previously spent their entire careers not interacting with one another. But the new field is too complicated to be prepared for simply by doubling up on traditional biology and computer courses. Counsell calls bioinformatics “a special kind of engineering discipline,” and attributes the burgeoning field's success to the fact that it is driven by the “characteristically practical and rigorous approach” of engineering. At first, bioinformatics was seen as dealing specifically with sequence analysis—figuring out the genetic lettering of everything from frogs to ferns. But as one genome after another becomes completed, bioinformatics is moving into the “post-genomic” age, and the field focuses on a whole array of additional endeavors. Some research is aimed at comparative genomics: looking for key commonalities or differences between the DNA of various species to learn more about each type of creature and about evolution in general. One of the largest areas of focus is now shifting from the genes themselves to the structure and function of their products—namely the proteins that carry out the work of a body's daily life.

And a new area of “research informatics” is arising, in which all data collected on any particular molecule are kept together and compared in order to give a fuller picture of that entity, be it a new drug candidate or one of the body's own proteins associated with battling cancer. The computer software programs now being designed to advance the research fall into four broad categories: those that search through sequences of genes, those that analyze the sequences, those that predict the structure of particular proteins, and those that provide imaging of molecules. The computer power needed to drive the research is unprecedented. Until now, the most complicated data handled by governments, industry, and research was measured in terabytes (a trillion bytes). But bioinformatics will soon become the first industry in which researchers measure their data in petabytes (a quadrillion bytes).

Because bioinformatics is the most challenging and promising area of research to face both the life sciences and the computing worlds in recent memory, most educators now agree that no biology degree or computer science degree is complete without bioinformatics training. And because of the new understanding it offers of the human body, some educators are pushing to make bioinformatics courses a requirement in all medical schools. Many note enthusiastically that the new field rejuvenates the older disciplines, allowing biologists to escape the lab and computer scientists to escape dull database work. But it also requires typically different personality types to change their attitudes and aptitudes. Biology has long been considered the least mathematical of the sciences, focusing on “wet,” living things and their relationships to one another. The “dry,” numbers-oriented, “non-messy” computer science area has allowed people to work for years in virtual isolation. But London's Counsell likes to joke to computer programmers that “you are as likely to be useful to biologists working in isolation at the keyboard as you are to conceive a child with your clothes on.”

 

Whole New Order

Russ Altman, president of the International Society for Computational Biology, adds that these gaps are not bridged simply by taking a computer scientist and then training him or her in biology. Not only is that process too slow, expensive, and cumbersome, he says, but it does not reliably instill some of the key traits necessary to perform bioinformatics successfully. Altman has published a proposed curriculum for bioinformatics that serves as a model for several universities now establishing major new centers, including Stanford, Princeton, and the University of Chicago. The University of California, Berkeley, is launching a half-billion dollar research initiative that links a range of scientific disciplines to create a facility to prepare the best minds in genomics. The initiative will require a “rethinking” of the entire science education curriculum, say leaders, and both undergraduate and graduate education at the school will be permanently altered as a result.

Universities are certainly not alone in their conviction that bioinformatics is the key to their future. States such as Michigan, Georgia, and North Carolina all see bioinformatics initiatives at state colleges as a way to spur state economies and prestige, and state legislators are looking for ways to give even more incentives to universities and businesses to cooperate toward the goal. The National Institutes of Health has persuaded the U.S. government to spend $10 million to fund 20 new biomedical computing “programs of excellence.” And private foundations are getting in on the act, too. For example, Johns Hopkins University is launching a new computational biology program with the help of a $2.5 million grant from the Burroughs Wellcome Fund. The Alfred P. Sloan Foundation will fund new bioinformatics master's degree programs starting this fall at several universities, among them the University of California's Los Angeles and Santa Cruz campuses, the University of Texas at El Paso, the New Jersey Institute of Technology, Boston University, and Northeastern University. And in March, Rensselaer Polytechnic Institute received an anonymous $360 million donation, the largest gift ever to a U.S. college, and will direct a significant portion of the money toward state-of-the-art biomedical computing facilities.

By far, the most progress is being made by schools that forge aggressive and substantial partnerships with industry and government. By forming the Corporate Associates Program, Harvard has managed to avoid “brain drain” by drawing on top minds in area biotechnology companies to mentor its students. Even larger-scale partnerships are in the works, such as in Virginia, where Virginia Polytechnic Institute is now spearheading the launch of the Virginia Bioinformatics Institute, which will eventually total a $100 million effort. The state government kicked in $12 million to help get the Institute off the ground. So far, the project employs 25 researchers who share their expertise in physics, mathematics, biology, and engineering to solve cutting-edge problems in bioinformatics. Next year, the institute wants to double in size again, and one day, officials hope to house 300 researchers there.

Yet that effort pales in comparison with a project afoot in North Carolina, where more than 40 nonprofits, corporations, and public and private universities have pooled their resources to become world leaders in the bioinformatics field. For its part, Duke University may spend $200 million to create an Institute for Genomic Sciences and Policy. Duke will cross-train students and faculty together with North Carolina State University, which recently opened the North Carolina Bioinformatics Research Center, home to about 30 faculty and 40 graduate students. Wake Forest will spend nearly $20 million for a new center, and UNC-Chapel Hill will put $100 million toward advancing genomics research capabilities. The massive campus spending is underwritten in part by grants from institutions such as the National Science Foundation and the National Institutes of Health. Some biotech corporations in the consortium are funding graduate students' education in order to get the highly-skilled new employees they so desperately need.

Rather than looking for a corporate partner, faculty members at the University of Massachusetts, Lowell, decided to create their own spin-off company, AnVil Informatics, that helps the university by funding graduate students, and in turn benefits by having the first crack at hiring the optimally trained graduates to help develop the company's suite of data mining and high dimensional visualization tools.

 

Finding Their Place

Many schools have decided to tackle the informatics revolution by carving out a particular niche in the market. The University of Pennsylvania, with sponsorship from a company called Pangea Systems, will use its new Center for Bioinformatics to specialize in developing new software that can help researchers visualize the biological data they study. The University of Massachusetts wants to hire about 15 new faculty devoted to the study of exactly how genes cause both healthy functioning and the onset of certain diseases. Harvard's Institute for Proteomics (the area of bioinformatics that studies the form and function of proteins) is creating a collection of cloned copies of all known human genes in a massive storage system, or warehouse, that could be conveniently used by researchers everywhere who want to order a certain gene for a particular experiment. And MIT's Center for Genome Research is using DNA chips to measure the actions of thousands of genes all at the same time in an attempt to find the key differences between various types of cancer. Led by top researcher Eric Lander, the group has already identified the important distinctions between two types of leukemia. Only by studying multiple genes at once can researchers hope to get a true sense of how different diseases start.

At least one school, the new Keck Graduate Institute in California, has decided to create a professionally oriented master of bioscience program that combines bioinformatics training with more traditional business leadership education so that students will emerge from the two-year program with not only the skills that a master's in science affords but also the talents conferred by an M.B.A. Already, science education consultant Sheila Tobias has dubbed the bioinformatics degree the “M.B.A. of the new century.” Adding business management skills to the mix will only enhance the cachet of the new trainees, she says.

The Howard Hughes Medical Institute in Chevy Chase, Md., promises to steadily increase the quality of bioinformatics scholars throughout higher education with the $500 million it will use to build “the Bell Labs of biology” near Ashburn, Va. One of the world's top research institutions, Howard Hughes already boasts more than 350 leading scientists who work out of “host” universities across the nation. This time, the institute will create its own campus with a staff of about 25 permanent “chief” scientists, all with their own research staffs, plus a revolving crop of visiting scientists from campuses and corporations around the country.

The government's genome chief, Francis Collins, has often said that the human brain is far too puny to comprehend the full range of benefits that will emerge from bioinformatics endeavors. Among the fruits of current efforts is likely to be the know-how to fix faulty genes; discover new drugs and test their safety without using human guinea pigs; cure diseases; and create disease-resistant plants and animals. For Jeff Bizzaro, executive director of the nonprofit Bioinformatics.org, and many fellow pioneers in the new field, some of the sweetest rewards will be the intellectual satisfaction of the knowledge itself. “I think nearly all bioinformaticists would agree that the most exciting promise of bioinformatics is...that we can fully understand the ‘nature’ of life.” Bizzaro is keen to discover, for example, where life originated, what makes one organism different from another, what makes one being different from another, where life is headed, and if and how humans can alter their paths by extending and improving life. “Part of me hopes that the most interesting questions are yet to come, that there will always be plenty of science to be done,” Bizzaro says. “But another part hopes that I will live to know all the answers.” For the first time in history, such a hope actually seems plausible.

 

Joannie Fischer is a freelance writer based in Palo Alto, Calif.
She can be reached at jfischer@asee.org.
