Transforming big data analysis with "behavioral information science"
Announcing the "Virtual Data Scientist"
From "discovery of facts" to "prediction of the future" with a "behavioral information science" approach

2013.11.12

To the press

UBIC Co., Ltd.
Masahiro Morimoto, President and CEO
2-12-23 Konan, Minato-ku, Tokyo
(TSE Mothers Code Number: 2158)
(NASDAQ ticker symbol: UBIC)

UBIC Co., Ltd. (Headquarters: Minato-ku, Tokyo; President: Masahiro Morimoto), a provider of international litigation support services listed on NASDAQ in the US and on TSE Mothers, has started a software development project in which a computer equipped with artificial intelligence (AI) takes over data analysis work. The project responds to the shortage of "data scientists", the specialists who carry out data analysis in the "big data analysis" business that has recently drawn the attention of industry.

This tool is being developed by fusing the know-how we have cultivated in international litigation with "Predictive Coding®", the data mining technology we developed in-house. Using "behavioral information science", an approach we have named ourselves and that is unprecedented anywhere in the world, we will analyze and utilize big data in a wide range of fields beyond litigation.

The litigation support business we currently operate is itself an example of big data analysis: the data handled in litigation support fits the general definition of "big data", and analyzing that data is our work. In other words, we have been engaged in big data analysis ever since the company was established, and have accumulated our own technology and know-how. Building on these years of experience, we expect the tool to play an active role in fields such as M&A, medical care, and intelligence (security support). Development is under way at our R&D center, and we plan to launch the new product in early 2014.

The arrival of the big data era and the emergence of "data scientists"

With the spread of the Internet and the evolution of IT, the "era of big data" has arrived. Data (information) is now said to be as valuable as the "3Ms" (man, material, and money), the three major management resources of a company. According to the White Paper on Information and Communications released by the Ministry of Internal Affairs and Communications in July this year, fully utilizing big data is estimated to produce an economic effect of 25 billion yen per year in related businesses. It is no exaggeration to say that how big data is used in business will affect the competitiveness of Japanese companies in the future.

However, a major problem has been pointed out in the field of big data: a shortage of the specialists called "data scientists", who analyze and process large amounts of data.

“Using big data” is not just about using high-speed hardware or advanced software. It means incorporating data as an “asset” into actual management strategy, and the data scientist is responsible for that task. The need for big data analysis is growing rapidly in marketing, product development, and beyond, and demand for data scientists is expected to increase worldwide. However, US research firm Gartner has predicted that "by 2015, big data demand will create 4.4 million jobs worldwide, but only one-third of these jobs will be filled."

The industry has begun to feel a sense of crisis about this situation, and in July of this year the "Data Scientist Association", an industry-academia collaboration organization, was established and began working to resolve the shortage of human resources. Against this background, we have launched the "Virtual Data Scientist" project to support the "big data" business of Japanese companies with a different idea and approach.

Applying know-how and technology cultivated in the litigious environment of the United States to big data analysis

In March 2010, we released the e-discovery support system "Lit i View®" and have since provided services for the preservation, investigation, and analysis of electronic evidence in international litigation. In May 2013, we were listed on the US NASDAQ market, the first new listing there by a Japanese company in years. In September of the same year, we developed "Lit i View XAMINER", next-generation forensic software that collects and analyzes the e-mails and document files that serve as evidence in criminal investigations, and we are providing it to law enforcement agencies such as the police.

We thus have one of the strongest track records in the world in data collection and analysis for e-discovery and forensics, and we are confident that the know-how and technology accumulated in the demanding litigation environment of the United States can be fully applied to the "big data" business. This is because document review, the task of analyzing vast amounts of electronic data in international litigation, is itself very similar to the work of a data scientist.

To date, document review in eDiscovery has been done by eye, by large numbers of lawyers, at great cost. However, unlike in the age of paper, the amount of information held as electronic data is orders of magnitude greater: printed on paper, the information stored in a single personal computer would fill four 1-ton trucks. We therefore developed the AI application technology "Predictive Coding®" in-house and put it to work in review. It teaches the AI the judgment and inspection patterns a veteran lawyer uses when finding evidence in electronic data, and then has a computer perform most of the analysis of a huge volume of electronic data in the lawyer's place. Measured by the number of documents reviewed, its processing speed is more than 2 times that of human review, with accuracy of 4% or more, and it also supports the reliable extraction of important evidence that was sometimes overlooked in conventional keyword searches and human review. At present, we understand that only a few companies in the world, including us, have developed their own predictive coding technology that can be used for this kind of data analysis. Furthermore, we are the only company that fully supports Asian languages.
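The release does not describe how Predictive Coding® works internally, so the following is only a generic sketch of the idea stated above: a reviewer labels a small sample of documents, a model learns from those judgments, and the remaining documents are ranked by predicted relevance. It assumes a plain multinomial naive Bayes classifier and a whitespace tokenizer; the class name `NaiveBayesReviewer`, the function `rank_for_review`, and the sample texts are all hypothetical and are not taken from UBIC's product.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; Asian-language text
    would need a language-aware tokenizer instead)."""
    return text.lower().split()


class NaiveBayesReviewer:
    """Hypothetical stand-in for a predictive-coding model:
    multinomial naive Bayes with add-one smoothing over two labels."""

    LABELS = ("relevant", "not_relevant")

    def fit(self, documents, labels):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter(labels)
        self.vocabulary = set()
        for text, label in zip(documents, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def _log_score(self, text, label):
        counts = self.word_counts[label]
        total = sum(counts.values())
        vocab_size = len(self.vocabulary)
        # Log prior: share of training documents carrying this label.
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for token in tokenize(text):
            if token not in self.vocabulary:
                continue  # words never seen in training carry no evidence
            score += math.log((counts[token] + 1) / (total + vocab_size))
        return score

    def relevance(self, text):
        """Return the estimated probability that `text` is relevant."""
        rel = self._log_score(text, "relevant")
        not_rel = self._log_score(text, "not_relevant")
        top = max(rel, not_rel)  # subtract the max for numerical stability
        e_rel, e_not = math.exp(rel - top), math.exp(not_rel - top)
        return e_rel / (e_rel + e_not)


def rank_for_review(model, documents):
    """Order unreviewed documents so the likeliest evidence comes first."""
    return sorted(documents, key=model.relevance, reverse=True)


# Made-up sample judgments standing in for a veteran lawyer's labels.
TRAINING = [
    ("agree on price with competitor before the bid", "relevant"),
    ("keep the price agreement with competitor confidential", "relevant"),
    ("lunch menu for the team meeting on friday", "not_relevant"),
    ("please submit your travel expense report", "not_relevant"),
]
UNREVIEWED = [
    "reminder about the friday team lunch",
    "delete the mail about the competitor price agreement",
]

model = NaiveBayesReviewer().fit(
    [text for text, _ in TRAINING], [label for _, label in TRAINING]
)
for document in rank_for_review(model, UNREVIEWED):
    print(f"{model.relevance(document):.2f}  {document}")
```

In this sketch the human still supplies the judgments; the computer only generalizes them to documents nobody has read yet, which is the division of labor the paragraph above describes.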

What is the "Virtual Data Scientist" created by "Behavioral Information Science"?

We define "big data" not merely as a mathematical world expressed in binary numbers, but as the accumulated results of human thought and action. As an approach to analyzing it, we devised the new concept of "behavioral information science".

"Behavioral information science" is a fusion of "information science" (statistics, mathematics, data mining, pattern recognition technology, etc.) and "behavioral science" (psychology, criminology, sociology, etc.).
Whereas the traditional approach is limited to analyzing past events and "extracting facts," the behavioral information science we propose follows human thinking more closely, through patterns of human behavior and community formation, and so makes it possible to "predict the future".

We believe that these new approaches will allow computers to do most of the work that data scientists have done so far.

The knowledge required of data scientists falls into three areas: (1) IT / information and communication, (2) statistics, and (3) business. The "Virtual Data Scientist" will draw on all three. For (1), it can easily collect data scattered across various locations, including unstructured data. For (2), it already has sufficient analysis functions, and it selects the analysis method automatically even if the operator has no specialized knowledge of statistics. For (3), we will use Predictive Coding for machine learning, seeking solutions while teaching data to the computer.

As described above, by entrusting the data preparation and analysis that humans have done so far to the "Virtual Data Scientist", data scientists can focus on the consulting work that only humans can do, such as planning business strategy based on the analysis results.

In this sense, "AI" in big data analysis does not mean replacing the work of data scientists; it means that computers "assist" people. We therefore believe that the better the data scientist, the more our technology will be put to use. So what features does the Virtual Data Scientist need to carry out this mission? First, we aim to solve the following technical issues that big data analysis currently faces.

  • Data is not integrated (it differs in medium, such as text, image, and voice, and is stored in scattered locations)
  • Data analysis cannot be performed flexibly (it is difficult to handle overlapping classifications, multiple classifications, and intermediate classifications)
  • Cross-case analysis is not possible (conventional data mining, which emphasizes quantitative analysis of data trends, struggles unless the measured data shows some bias or tendency)
  • Data analysis takes a huge amount of time and human cost

We will introduce predictive coding technology to solve these problems.

"Flexible" ideas created by "unique" team composition

Our development process has a "strength" that other companies do not have: the "uniqueness" of the project team's membership.
First, the executive in charge of research and development is an expert in "philosophy," "psychology," "criminology," and "sociology." Under him, a doctor of science with experience researching particle physics at universities and specialized institutions is building the main logic of the data analysis function. In addition, a Danish doctor of computational linguistics who is proficient in four Asian and European languages (Japanese, Korean, English, and Danish) is in charge of language analysis. We believe that "behavioral information science" can only be established by bringing together experts from such diverse fields.

The world of data mining changed by behavioral information science

Traditional data mining is based on "quantitative" relationships; in e-shopping, for example, the strategy is to recommend the same product to customers with the same profile. Data analysis based on behavioral information science goes a step further, making it possible to understand the background to a customer's purchasing behavior and the customer's relationship with the community to which they belong.
Furthermore, in corporate M&A, it becomes possible to identify the "factions" that form before and after a merger, explore problems that may arise in the future, and take measures to facilitate collaboration. In the medical field, we believe that integrated management of medical records, medication information, and test data can help reduce medical costs and prevent medical accidents.
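For comparison, the traditional "quantitative" strategy mentioned above, recommending the same product to customers with the same profile, can be sketched in a few lines. This is only an illustration under assumed data: the `PURCHASES` records, the profile labels, and the function `recommend_by_profile` are hypothetical and do not come from the release.

```python
from collections import Counter

# Made-up purchase log: (customer, profile, product).
PURCHASES = [
    ("c1", "30s-urban", "running shoes"),
    ("c2", "30s-urban", "running shoes"),
    ("c2", "30s-urban", "water bottle"),
    ("c3", "30s-urban", "water bottle"),
    ("c4", "50s-rural", "garden hose"),
]


def recommend_by_profile(purchases, customer, profile):
    """Recommend the product bought most often by *other* customers with
    the same profile, skipping anything the customer already owns."""
    popularity = Counter()
    owned = set()
    for buyer, buyer_profile, product in purchases:
        if buyer == customer:
            owned.add(product)
        elif buyer_profile == profile:
            popularity[product] += 1
    candidates = [
        (count, product)
        for product, count in popularity.items()
        if product not in owned
    ]
    if not candidates:
        return None  # nobody else shares this profile, or nothing new to offer
    # Highest count wins; ties are broken alphabetically for determinism.
    return min(candidates, key=lambda item: (-item[0], item[1]))[1]


print(recommend_by_profile(PURCHASES, "c3", "30s-urban"))
```

The function only counts co-occurrences within a profile. It has no notion of why a customer bought something or which community they belong to, which is the gap the behavioral approach described above claims to address.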

From "discovery specialist company" to "behavior information data analysis company"

Going forward, we will develop and propose solutions for various industries, and support companies and institutions that are considering accumulating big data and using it in their business. At the same time, we are considering collaborating with companies that specialize in big data analysis and conducting market analysis in each industry together with think tanks. We will also work actively on developing data scientists, for example by holding courses that teach how to use various tools. Through these activities, we will evolve from a "discovery specialist company" into a "behavior information data analysis company" and aim to make the leap to becoming a "future discovery" company.
The market for the big data analysis business is projected to reach 1.7 trillion yen worldwide in 2015. In this market, we are aiming for sales of 1000 billion yen or more in the future.

About UBIC

President: Masahiro Morimoto
Address: 2-12-23 Konan, Minato-ku, Tokyo Meishan Takahama Building
URL: http://www.ubic.co.jp/

UBIC Co., Ltd. is a comprehensive legal technology company. It provides e-discovery services (the eDiscovery support business) that preserve, investigate, and analyze the electronic data required in international cartel investigations, investigations under the US Foreign Corrupt Practices Act (FCPA), intellectual property (IP) litigation, and product liability (PL) litigation, as well as computer forensic investigation services focused on electronic data. We have the world's highest level of technology for Asian languages and the laboratory with the largest processing capacity in Asia. In December 2007, we established a US subsidiary and have since provided litigation support for Asian companies from both Asia and the United States. At the end of 12, we developed our own eDiscovery support system, "Lit i View®", which enables companies to handle eDiscovery for international litigation in-house, and from October 2009 we began offering it as a cloud service, the "UBIC Legal Cloud Service". In March 2011, we became the first in the world to develop "Predictive Coding®" technology for Asian languages and succeeded in putting it into practical use. Established on August 10, 2012. Listed on TSE Mothers on June 3, 2003. Listed on NASDAQ on May 8, 8. The capital is 2007 yen (as of June 6, 26).