90% of the world's data is 'this'… A visit to DGIST's 'Dark Data Extreme Utilization Research Center': Dong-A Science

Developing technology to make use of dark data, which accounts for 90% of all data

Supercomputer 'iREMB' and a semiconductor device clean room create synergy for AI research

Researchers check computer performance at the supercomputing facility dedicated to the 'Dark Data Extreme Utilization Research Center' at Daegu Gyeongbuk Institute of Science and Technology (DGIST). Developing the technologies that will lead the fourth industrial revolution, such as big data and artificial intelligence (AI), requires the support of advanced research equipment such as supercomputers. Provided by DGIST

Dark data refers to data that a person or a computer has created and stored somewhere, but whose existence is unknown or which cannot be found. It covers unstructured data that has not been organized into a usable form, as well as data that goes unused because no one knows whether users need it. The name comes from dark matter, which is estimated to make up 27% of the universe yet cannot be seen, heard, or felt.

Korea's only supercomputer devoted to studying such dark data runs on the campus of Daegu Gyeongbuk Institute of Science and Technology (DGIST) in Hyeonpung-myeon, Dalseong-gun, Daegu. It is the supercomputing facility dedicated to the 'Dark Data Extreme Utilization Research Center'. When I opened the door of the center on my visit on the 8th of last month, the whir of constantly running computer fans filled the room. The noise was so loud that I could not hear the person next to me.

Center Director Seongjin Lee (professor in the DGIST Department of Information and Communication Convergence) said, “All of the research on collecting, storing, managing, and processing massive amounts of dark data is carried out here. Research facilities capable of this must also be in place to support it.”

According to IBM, an estimated 90% of the data generated in the world is dark data. One analysis holds that only 1% of data is actually used by people. For this reason, how to process and use dark data has become a hot topic in recent big data research.

“A file that is attached to an e-mail but cannot be searched is a simple example of dark data,” Director Lee said. “In the medical field, a vast amount of dark data continues to be generated, such as heart rate records and magnetic resonance imaging (MRI) images.” Together with Seoul National University Hospital, the center is currently developing diagnostic technology that applies AI to automatically read large volumes of chest X-ray images, which count as dark data.

In recent years, data storage media such as hard disks and flash memory have advanced and storage capacity has grown to the terabyte (TB) level, and dark data is being generated along with it. Director Lee is researching ways to speed up processing when AI searches for data by giving the hard disk computational functions in addition to data storage.

Deep learning is also being applied to search technology to find meaningful data within dark data. “Just as Facebook uses machine learning to automatically filter out harmful content, we will be able to use AI technologies such as artificial neural networks to label dark data so that it can be searched,” Director Lee said. “Blocking AI, blockchain, and intelligent distributed search technology are also being studied.”

A device clean room built in the Daegu Gyeongbuk Institute of Science and Technology (DGIST) Central Equipment Center. It can handle the entire process from AI semiconductor design to manufacturing. Provided by DGIST

In addition to analyzing dark data, supercomputers are also being used to develop AI semiconductors. DGIST's supercomputer 'iREMB' has a computational processing speed of 1.7 petaflops, the highest performance among domestic universities. Last year, SK Siltron, a maker of core semiconductor materials, used iREMB to study the single-crystal growth required for semiconductor wafer development.

DGIST recently added a line that can manufacture 0.5 μm-class (micrometer; 1 μm is one millionth of a meter) CMOS (complementary metal-oxide-semiconductor) devices to its full set of process equipment for producing 6-inch AI semiconductor wafers. “You can put intelligent semiconductors on top of CMOS,” said Seong-Bong Lee, head of the DGIST Central Equipment Center. “You can create a neuromorphic semiconductor that mimics the way human neurons work, and you can implant it directly in primates at the experimental animal center for experiments.”
