Today we have a vast amount of information at our disposal, and every second more is uploaded to the Internet. However, in the not-too-distant future storage capacity could reach its limit, to the point where it would be impossible to save new information without first deleting old information. Fortunately, scientists have already anticipated this scenario and are working to store massive amounts of information in nothing less than a DNA hard drive. How? Read on to discover what this technology is all about.
It was a great idea: to take advantage of the 900th anniversary of the Domesday Book, a general survey of England completed in 1086 under the reign of William I the Conqueror, to build a large multimedia digital library on everyday life in Britain. The ambitious BBC Domesday Project, costing 2.5 million pounds, ended in 1986 with the publication of two volumes with a total capacity of 1,600 megabytes (MB), an enormous amount for its time, equivalent to more than 1,000 floppy disks of the day.
The LaserDisc was at that time a brand-new format, futuristic and very promising. Ten years later, it had practically disappeared from the map. In 2002 the Guardian newspaper ran the headline: “The digital Domesday Book lasts 15 years, not 1,000”. There was no longer a single player capable of reading the project’s discs, which also required a special computer. Then began the painful work of recovering the data and migrating it to other media. In 2011, Domesday Reloaded was finally made available on the internet to the general public, a quarter of a century after the original publication.
OBSOLETE AND FRAGILE MEDIA
The LaserDisc followed the same path to the digital cemetery taken, before or after it, by 5¼-inch and 3½-inch floppy disks, the Zip and the Jaz, not to mention the myriad ROM cartridge formats of a thousand and one now-vanished consoles, or audio- and video-specific media such as the MiniDisc and MiniDV. Today even the CD, DVD and Blu-ray are relics of the past for the youngest users. All of which raises a problem: where can digital data be stored today so that it lasts the 1,000 years of the original Domesday Book?
Technological obsolescence goes hand in hand with the deterioration of physical media. We have all learned that CDs and DVDs do not last a lifetime, as we thought we had been promised. Hard drives last about as long as the computer they are in, and flash drives are convenient and portable, so much so that we lose them. Today we are advised to store data in the cloud, but this ethereal term is misleading: at the other end of the cable there must always be a physical medium. For long-term storage, large data centers use magnetic tape cartridges that already reach capacities of up to 185 terabytes (1 TB equals 1,000 GB) and offer the highest storage density available on the market today. But their average lifespan is a paltry 30 years.
Another threat to data storage is that the volume of digital information to be retained is growing exponentially. According to EMC Corporation and International Data Corporation, in 2013 the digital universe occupied 4.4 zettabytes (1 ZB equals one trillion GB); by 2020 this figure will have multiplied by 10, to 44 ZB: almost as many digital bits as there are stars in the universe, enough to fill the memory of more than six stacks of tablets reaching from the Earth to the Moon. The problem, these experts warn, is that installed storage capacity is not growing at the same pace.
A SYSTEM THAT WILL NEVER BECOME OBSOLETE
As a result, some researchers are turning their eyes to a data storage medium that has existed for billions of years, that reaches an information density 100 million times greater than magnetic tape (1,000 million GB per cubic millimeter versus 10 for tape), that can last for centuries or even millennia and, more importantly, that will never become obsolete: as long as humans are around, we will always need systems for synthesizing and reading DNA.
DNA is a medium invented by nature to store data. It consists of a chain of links that are identical to each other except for an attached label that comes in four types, which we call adenine (A), thymine (T), cytosine (C) and guanine (G). The particular sequence of a DNA strand in a gene forms a code that is translated into a protein. But with a suitable encoding scheme, a DNA sequence created at will can store other, non-genetic kinds of data; for example, digital data in binary code.
The idea of hacking DNA to encode data is almost as old as the discovery of the molecule itself: it was first proposed by the Russian physicist Mikhail Samoilovich Neiman in 1964, but it would not begin to be implemented until the end of the 20th century. In 1996, Joe Davis, an artist and researcher at the Massachusetts Institute of Technology, devised a method for translating into DNA a graphic made up of zeros and ones representing a Germanic rune or, alternatively, a simplified drawing of the female symbol: a Microvenus, as its creator called it.
How does a DNA hard drive work?
Computer processors use binary code, a two-symbol system whose digits (bits) are 0 and 1, to represent letters, numbers and other symbols as strings of bits. For example, the word “si” is represented in binary ASCII as “01110011 01101001”. DNA, in turn, consists of four nucleotide bases (adenine, cytosine, guanine and thymine), which can be represented by the initial letters of their names (A, C, G and T).
The first step in storing information in DNA is to design a code that maps binary digits to nucleotide bases. For example, if zeros become the bases A or C and ones become the bases G or T, “si” could be represented as “AGGTCCTT ATTCGAAG”. The second step, once the information has been encoded, is to synthesize DNA strands with the sequence corresponding to the information we want to save. Finally, to read the information back, it is necessary to determine the sequence of nucleotide bases, decode it into the corresponding binary code and then convert that into the letters and numbers we know.
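The encoding and decoding steps described above can be sketched in a few lines of Python. This is only an illustration of the example mapping in the text (0 becomes A or C, 1 becomes G or T), not the scheme used by any real DNA storage system. The function names are invented for this sketch, and the rule of never repeating the previous base is an assumption added here simply to make the choice between the two candidate bases deterministic.

```python
# Example mapping from the text: a 0 may be written as A or C, a 1 as G or T.
ZERO_BASES = "AC"
ONE_BASES = "GT"


def text_to_bits(text: str) -> str:
    """Turn ASCII text into a string of 0s and 1s, eight bits per character."""
    return "".join(f"{byte:08b}" for byte in text.encode("ascii"))


def bits_to_dna(bits: str) -> str:
    """Encode bits as bases, one base per bit."""
    strand = []
    for bit in bits:
        if bit not in "01":
            raise ValueError(f"not a bit: {bit!r}")
        options = ZERO_BASES if bit == "0" else ONE_BASES
        # Assumption of this sketch: take the first candidate base unless it
        # would repeat the previous base, in which case take the second.
        base = options[0]
        if strand and strand[-1] == base:
            base = options[1]
        strand.append(base)
    return "".join(strand)


def dna_to_bits(strand: str) -> str:
    """Decode bases back to bits: A or C reads as 0, G or T reads as 1."""
    bits = []
    for base in strand:
        if base in ZERO_BASES:
            bits.append("0")
        elif base in ONE_BASES:
            bits.append("1")
        else:
            raise ValueError(f"not a DNA base: {base!r}")
    return "".join(bits)


def bits_to_text(bits: str) -> str:
    """Group bits into bytes and decode them as ASCII text."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("ascii")


strand = bits_to_dna(text_to_bits("si"))
print(strand)
print(bits_to_text(dna_to_bits(strand)))
# The strand given in the text decodes to the same word, because any
# A/C reads as 0 and any G/T reads as 1.
print(bits_to_text(dna_to_bits("AGGTCCTTATTCGAAG")))
```

Because each bit has two possible bases, many different strands decode to the same data. This sketch stores only one bit per base, which keeps the code simple.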
HACKING NATURE
In 2008, the biotechnology tycoon J. Craig Venter created a synthetic bacterial genome that included a kind of watermark: sequences that encoded the names of the researchers and several quotations from famous people. Two years later, a team from the University of Hong Kong presented a system for inserting texts into bacteria by converting them into ASCII characters in binary code and then encoding them as DNA sequences. The authors estimated that a gram of bacteria could store the information contained in 450 2 TB hard drives.
In the present decade other researchers have gone even further, turning Shakespeare sonnets, Martin Luther King audio clips, photographs and fragments of Wikipedia into DNA. In February 2015, a team from the Swiss Federal Institute of Technology in Zurich translated the Swiss Federal Charter of 1291 and a work by Archimedes into DNA. Unlike the Hong Kong researchers, who used bacteria, they used naked DNA encapsulated in silica glass to create artificial fossils capable of retaining the data for at least 2,000 years, a figure that could rise to two million years if they are stored at -18 °C.
The latest contribution has just been presented at the 21st International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), held in Atlanta in early April. The researchers, from the University of Washington and Microsoft’s research division, have recorded four images in DNA sequences. Once the binary code is converted to genetic code, these sequences are broken into short strands that can be produced with a DNA synthesizer. Small sequences are added to these fragments to serve as address labels or “postal codes” so that they can be located at will, much like the RAM (Random Access Memory) of electronic devices.
Once the DNA has been stored, the process is reversed to retrieve the information: small molecules are used to fish out the desired addresses, and the fragments of interest are read with a DNA sequencer and then converted back to binary code. According to the University of Washington, digital information that would fill a hypermarket-sized space in today’s media would, in the form of DNA, take up the space of a sugar cube.
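The idea of address labels can be illustrated with a short Python sketch. It is not the researchers’ actual design: it assumes, purely for illustration, that each fragment’s address is its position written as a fixed-width base-4 number using the letters A, C, G and T, and it replaces the chemical lookup with a plain string comparison. All function names and parameters are hypothetical.

```python
BASES = "ACGT"  # digit values 0, 1, 2, 3 for the base-4 address (an assumption)


def index_to_address(index: int, width: int) -> str:
    """Write a fragment index as a fixed-width base-4 number in DNA letters."""
    if not 0 <= index < 4 ** width:
        raise ValueError("index does not fit in the address width")
    digits = []
    for _ in range(width):
        index, digit = divmod(index, 4)
        digits.append(BASES[digit])
    return "".join(reversed(digits))


def address_to_index(address: str) -> int:
    """Read a base-4 DNA address back into an integer index."""
    index = 0
    for base in address:
        index = index * 4 + BASES.index(base)
    return index


def split_with_addresses(strand: str, payload_len: int, width: int) -> list:
    """Break a long strand into short fragments, each prefixed by its address."""
    starts = range(0, len(strand), payload_len)
    return [
        index_to_address(n, width) + strand[start:start + payload_len]
        for n, start in enumerate(starts)
    ]


def fetch(fragments: list, index: int, width: int) -> str:
    """Random access: return the payload of the fragment with a given index."""
    target = index_to_address(index, width)
    for fragment in fragments:
        if fragment[:width] == target:
            return fragment[width:]
    raise KeyError(index)


def reassemble(fragments: list, width: int) -> str:
    """Rebuild the full strand from fragments held in any order."""
    ordered = sorted(fragments, key=lambda f: address_to_index(f[:width]))
    return "".join(fragment[width:] for fragment in ordered)


strand = "AGTGACGTAGTAGACG"
fragments = split_with_addresses(strand, payload_len=5, width=2)
print(fragments)
print(fetch(fragments, 1, width=2))
print(reassemble(list(reversed(fragments)), width=2) == strand)
```

The fragments carry their own addresses, so they can be stored as an unordered pool and any one of them can still be retrieved without reading the rest.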
DNA WILL REPLACE TAPES
Researchers are still exploring uncharted territory, where many obstacles remain to be overcome and standards to be adopted. Among the open questions is whether to use naked DNA or to introduce it into bacteria. “Bacteria replicate, so it is easy to maintain the system, but they have limited capacity and incorporate mutations,” says Sotirios A. Tsaftaris, a professor at the University of Edinburgh (UK), commenting on the new study. For his part, Georg Seelig of the University of Washington, a co-author of the paper, stresses that with naked DNA “the density of information is greater, and both storage and access to information are easier.”
For Seelig, the main technological challenge lies in DNA synthesis, a process that is still too slow and costly. DNA sequencing, or reading, has made tremendous progress since the Human Genome Project at the beginning of this century, but Seelig believes that in the future “it will be interesting to build an integrated system that can exploit high storage density, easy access and replenishment of data”; that is, something like automated data center systems, where robots find and retrieve tapes to feed into readers. Tsaftaris points out that the breakthrough will come when the world of DNA and the world of electronics can “be linked without discontinuities, without so many chemical steps between them.”
In any case, using a chemical medium always means a slower process than pure electronics. “Reading and writing DNA takes a long time, currently about ten hours, so it is not appropriate for applications that require quick and regular access to data,” Seelig says. “However, it’s a really promising technology for long-term file storage,” he adds. “We are confident DNA could replace tapes.” In their study, the researchers predict a future of hybrid systems that combine silicon technology and biochemistry. “The time has come for computing to incorporate biomolecules as an integral part of computer design,” they conclude.
DNA Hard Drive: Advantages of the Technology
DNA is very stable as a storage material, and its useful life can reach tens of thousands of years under controlled conditions. For that reason, data encoded in a DNA hard drive could survive successive changes in technology, preserving human knowledge for generations. The idea is to use DNA to store data that needs to be backed up for many years, since a dry, cold environment is all it takes to keep it intact.
In a series of experiments, an impressive amount of data has been stored in DNA and recovered in full, without any errors: approximately 2.2 petabits were stored in a single gram of DNA, more than two thousand times the hard drive capacity of a home computer. Maybe in the future we will have to buy computers with “DNA RAM”.
On the other hand, the main drawback of this method is its extremely high cost. For example, encoding one megabyte of information costs $12,400, while reading it back costs $200. Scientists remain hopeful, however, and are confident that this cost will fall sharply within a few years. The investment in encoding only has to be made once, although DNA cannot be rewritten, so new sequences must be created every time data needs to be updated or added.
There is no doubt that the idea is a powerful one: in the future it could allow information to be stored for periods of time that are not possible with other materials, and in the long run it could find other interesting uses.
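To get a feel for these prices, the short Python sketch below applies them to the 1,600 MB of the BBC Domesday Project mentioned earlier. It assumes that cost scales linearly with size, with no bulk discount, which is a simplification made for this example and not a claim from the research; the function name is hypothetical.

```python
# Per-megabyte prices quoted in the text, in US dollars.
WRITE_COST_PER_MB = 12_400
READ_COST_PER_MB = 200


def archive_cost(size_mb: float, reads: int = 1) -> dict:
    """Estimate the cost of encoding size_mb once and reading it `reads` times.

    Assumption of this sketch: cost grows linearly with the amount of data.
    """
    if size_mb < 0 or reads < 0:
        raise ValueError("size_mb and reads must be non-negative")
    write = size_mb * WRITE_COST_PER_MB
    read = size_mb * READ_COST_PER_MB * reads
    return {"write": write, "read": read, "total": write + read}


domesday = archive_cost(1_600)
print(f"Encoding: ${domesday['write']:,.0f}")
print(f"One full read: ${domesday['read']:,.0f}")
print(f"Total: ${domesday['total']:,.0f}")
```

Under these assumptions, encoding alone would cost close to 20 million dollars, which shows why the technology is currently considered only for data worth keeping for a very long time.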