Taking Advantage of Big Data Opportunities in the Laboratory with the Help of an ELN
The intent of the Human Genome Project was to sequence the three billion base pairs that make up a single individual’s DNA. The project began in 1990 and took nearly 13 years to complete at a cost of nearly $4 billion. Since the project was completed, however, the time and cost involved in sequencing a genome have declined rapidly. Thanks to advances in computer processing power and big data analytics, scientists can now sequence DNA in a relatively short amount of time.
The advances in DNA sequencing are just one area in which big data opportunities are revolutionizing the laboratory. Other areas of science are benefitting as well. For example, big data is helping chemists find better electronic materials for solar cells. Using big data tools, scientists at Harvard University were able to perform 150 million theoretical calculations to identify organic materials that could be used to make solar cells. The group screened 2.3 million molecular structures to find the best candidates for building a solar cell. Only the most promising structures made it into manual tests.
Although these results sound incredible, not every laboratory is prepared to seize big data opportunities when they occur. Why not? The problem comes down to data format. Before it can be analyzed, information must be held in a repository, and in a format, that data analytics tools can access.
Current Data Management Methods
If we are being honest, laboratories across many industries are not very efficient when it comes to data management. Many create individual data warehouses for specific research projects, and the resulting silos prevent other people and systems from accessing the information stored inside.
For example, most laboratory scientists will run a test and store the resulting information in a database. However, the notes associated with that data are then written in a laboratory notebook. The result is two separate stores for closely related information.
Better Data Management Methods
Laboratory scientists need a way to store information from many sources, in many formats and file types, in a single place. The best way to manage this type of unstructured data is to centralize it with an electronic laboratory notebook (ELN).
A significant benefit of an ELN is its ability to act as a repository for all information sources regardless of format. Any type of data, including pictures, charts, and data sets, can be stored within a single system. In addition, an ELN allows many collaborators to securely access this information while maintaining data integrity across multiple users.
Imagine the Possibilities
Once the data is centralized and structured within an ELN database, big data opportunities can become a reality. For example, let’s imagine that a small molecule researcher stores all of their test data within an ELN. Then, using big data tools, they combine the results of their tests with an outside reaction database. By connecting the two databases, they may identify a specific target molecule that deserves attention. Once the target is identified, the reaction database could suggest possible syntheses. The researcher could then cross-reference these syntheses with the data stored within the ELN to see whether the work has already been performed, allowing them to focus only on the new and most promising routes.
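The cross-referencing step in this scenario can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Synthesis record, the reaction_key and split_by_prior_work helpers, and the compound names are all hypothetical, and nothing here reflects the actual interface of any ELN or reaction database. It assumes both sources can be exported as simple records listing reactants and a product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Synthesis:
    """Hypothetical record shared by the ELN export and the reaction database."""
    route_id: str
    reactants: frozenset
    product: str


def reaction_key(synthesis):
    # Two records describe the same work if reactants and product match.
    return (synthesis.reactants, synthesis.product)


def split_by_prior_work(suggested, eln_records):
    """Separate suggested routes into those already in the ELN and new ones."""
    performed = {reaction_key(record) for record in eln_records}
    already_done = [s for s in suggested if reaction_key(s) in performed]
    new_routes = [s for s in suggested if reaction_key(s) not in performed]
    return already_done, new_routes


# Illustrative data only: one experiment already recorded in the ELN.
eln_records = [
    Synthesis("ELN-001", frozenset({"compound_A", "compound_B"}), "target_T"),
]

# Illustrative routes suggested by the outside reaction database.
suggested = [
    Synthesis("R1", frozenset({"compound_B", "compound_A"}), "target_T"),
    Synthesis("R2", frozenset({"compound_C", "compound_D"}), "target_T"),
]

already_done, new_routes = split_by_prior_work(suggested, eln_records)
```

In this sketch, route R1 matches the recorded experiment ELN-001 even though its reactants are listed in a different order, because the reactants are compared as a set. Only R2 is left as new work. A real comparison would need a more robust notion of identity, such as canonical structure identifiers and reaction conditions.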
Big data has the potential to revolutionize the laboratory. It will help researchers discover new and interesting ways to combine information and help solve the problems of the world. Before big data opportunities can be seized, the first step is to organize data in such a way that analytics systems can use it. The Accelrys Notebook helps to organize data by providing structure and storing data regardless of format. For more information, please visit our website today.