On an unusually warm spring day in Tokyo, an overflow crowd gathered to listen to a full day of discussions on data security and privacy. The occasion was our third LINE Intertrust Security Summit. The venue was the auditorium at LINE headquarters, in a sleek, ultra-modern building in the Shinjuku area of Tokyo. LINE is a major Asian social media outlet and Internet services developer and a very important partner of Intertrust. With more and more companies driving to adopt data-driven business models, and with the GDPR (General Data Protection Regulation) about to take effect, interest was high in hearing the assembled international group of academic and industry luminaries discuss a number of topics related to the event’s theme, “Data Without Borders, Global Threats to Security and User Privacy.”
The event kicked off with presentations from LINE’s CISO (chief information security officer) Takeshi Nakayama and Intertrust’s CEO Talal Shamoon. Nakayama quoted the well-known security expert Bruce Schneier as saying, “The times have changed.” Nakayama noted that in our immediate past, mass media such as TV and newspapers gathered a lot of information but little data was recorded. Today, with nearly everybody and increasingly everything connected to the Internet, the amount of data has exploded. With modern data analytics, it is easy for companies to use this data to create very detailed profiles of people’s personal interests and daily activities. Nakayama said that companies need to consider carefully how to use this data while respecting user privacy, but not many have advanced very far in their efforts.
Shamoon framed the discussion against the backdrop of the recent data scandals surrounding Facebook and the upcoming GDPR. He pointed out that your privacy is now worth Facebook’s market cap, but the GDPR opens up companies worldwide to the potential of large fines should they violate European citizens’ privacy rights. In this environment, he declared, the time when the dominant Internet culture held that data should be a “free flowing fluid” is over. Data now needs to be handled in a trusted manner. This is important not only for consumer data but also for industrial data, as more and more industrial devices are connected to the Internet. Shamoon proposed trusted data platforms as a way to make more data available for beneficial purposes such as medical advances or energy use optimization while lowering the risks of data breaches.
Machine Learning and Data Security
As in much of the technology industry, machine learning (ML) has been a hot topic in the cybersecurity space. Three speakers brought real-world expertise to this discussion: Clarence Chio, a security engineer and co-author of the book Machine Learning & Security; Kenji Aiko, an engineer at LINE; and Kanatoko, an engineer involved with Bitforest’s Scutum web application firewall tool.
Chio’s presentation came from three years of work in applying ML to the cybersecurity area. He noted that ML was trying to make cybersecurity more efficient than the older reactive paradigms prevalent in the industry. This is especially important in the face of developments such as hacking tools packaged as easy-to-use software applications, or “captcha farms” where people in the Philippines or Vietnam are paid low rates to solve captcha puzzles. Much of Chio’s presentation revolved around the use of ML in developing fingerprints to identify bad actors or the machines they use from a wide variety of data such as HTTP headers, font lists, screen touch patterns, and device battery states.
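To make the idea of a fingerprint concrete, here is a minimal sketch of the simplest non-ML baseline: hashing a set of observed client signals into one stable identifier. It is not Chio's method, which used ML; the signal names (`user_agent`, `fonts`, `battery_level`) and the `fingerprint` function are hypothetical and chosen only for illustration.

```python
import hashlib
import json


def fingerprint(signals: dict) -> str:
    """Return a stable hex digest for a dictionary of client signals.

    Keys are sorted before hashing so that the same signals always
    produce the same fingerprint regardless of collection order.
    """
    canonical = json.dumps(signals, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical signals, loosely modeled on the kinds of data mentioned above.
visit_a = {
    "user_agent": "ExampleBrowser/1.0",
    "fonts": ["Arial", "Noto Sans"],
    "battery_level": 0.8,
}
# Same signals, collected in a different order.
visit_b = {
    "battery_level": 0.8,
    "fonts": ["Arial", "Noto Sans"],
    "user_agent": "ExampleBrowser/1.0",
}

print(fingerprint(visit_a) == fingerprint(visit_b))
```

An exact hash like this breaks as soon as any single signal changes, which is one reason a learned, tolerant matching approach is attractive for this problem.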
Aiko discussed how LINE was using ML to augment its anti-spam efforts. LINE is using ML-based spam filters in addition to rule-based filters and human-based monitoring systems. When using ML to analyze data sets, one of the things they found was that data indicating spam attacks tended to have more clustered features, while in normal data sets the features were more distributed. Besides allowing for a more automated approach, Aiko noted that ML technologies had a false-positive rate of less than 0.01%.
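One rough way to picture the clustered-versus-distributed observation is to measure how tightly a batch of feature vectors sits around its own center. The sketch below is not LINE's filter; the `dispersion` measure, the `looks_clustered` helper, the threshold, and the sample data are all assumptions made for illustration.

```python
from math import dist
from statistics import fmean


def dispersion(points):
    """Mean Euclidean distance of each feature vector from the batch centroid."""
    centroid = [fmean(axis) for axis in zip(*points)]
    return fmean(dist(p, centroid) for p in points)


def looks_clustered(points, threshold=1.0):
    """Flag a batch whose features are packed tightly together.

    The threshold is an arbitrary illustrative value; a real system
    would have to tune it on labeled data.
    """
    return dispersion(points) < threshold


# Hypothetical two-feature batches.
spam_like = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (1.0, 1.0)]
normal_like = [(0.0, 0.0), (5.0, 1.0), (2.0, 8.0), (9.0, 4.0)]

print(looks_clustered(spam_like), looks_clustered(normal_like))
```

A single dispersion number is a crude stand-in for what a trained model learns, but it shows why tightly grouped features can be a useful signal.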
Kanatoko provided some details on the use of ML in their web application firewall product. The product acts as a proxy server and monitors HTTP requests for signs of attacks. It uses signatures, rules, and Bayesian networks to try to detect outlying data indicating malicious activity, with Kanatoko declaring that he found Bayesian network technology to be very useful. Like Aiko, Kanatoko noted that typical data patterns had very distributed feature sets compared to the clustered features in abnormal data showing attacks. He said that very few false positives are found.
University of Tokyo Research
Two professors from the University of Tokyo, Tatsuya Harada and Kanta Matsuura, gave presentations on research they were engaged in. Harada discussed research projects around visual recognition technologies. One, WebDNN, is claimed to be the fastest deep neural network framework for visual recognition that operates within the browser. He noted that this technology has privacy implications since the visual recognition is done on the user’s client device and not on the server as is typically done. This keeps user data under the user’s control.
The presentation from Matsuura focused on a thorny issue when discussing ML and cybersecurity: how to evaluate how effective the technology is. He pointed out that many of his students, when asked how they could prove their technology worked, had a hard time coming up with concrete answers. Since ML is dependent on data, Matsuura noted that ML-based technologies could be criticized for producing positive results only because the data was predetermined to show those results. He also warned that sophisticated attacks on ML-based technologies, including attacks that themselves use ML, are possible and that the history of research in this field is very short.
Opportunities Afforded by Internet Governance
Kenny Huang, an executive council member of APNIC (Asia-Pacific Network Information Centre), discussed how Internet governance could help alleviate DDoS (distributed denial of service) and network traffic hijacking attacks through implementing RPKI (Resource Public Key Infrastructure). With RPKI, network information centers can certify the network resources of their members, thus allowing for the authentication of network traffic. Huang noted that while this technology is about 10 years old, a number of perceived issues have prevented large-scale deployments so far.
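For readers curious how certified resources can be checked in practice, here is a simplified sketch of route origin validation, the check that RPKI data is commonly used for (the general logic is described in RFC 6811). Huang's talk is not the source of this code; the `ROA` tuple, the `validate_origin` function, and the sample prefixes and AS numbers (taken from ranges reserved for documentation) are illustrative assumptions, and cryptographic verification of the certificates is left out entirely.

```python
from ipaddress import ip_network
from typing import NamedTuple


class ROA(NamedTuple):
    """A simplified Route Origin Authorization: who may originate what."""
    prefix: str      # certified address block
    max_length: int  # longest prefix length the holder allows
    asn: int         # autonomous system authorized to originate it


def validate_origin(prefix, origin_asn, roas):
    """Classify a route announcement as 'valid', 'invalid', or 'not-found'."""
    route = ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ip_network(roa.prefix)
        # Skip ROAs for the other IP version or for unrelated address space.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        if roa.asn == origin_asn and route.prefixlen <= roa.max_length:
            return "valid"
    # Covered by some ROA but matched by none: likely a hijack or a misconfiguration.
    return "invalid" if covered else "not-found"


roas = [ROA("192.0.2.0/24", 24, 64500)]
print(validate_origin("192.0.2.0/24", 64500, roas))
print(validate_origin("192.0.2.0/24", 64501, roas))
print(validate_origin("198.51.100.0/24", 64500, roas))
```

A router that drops announcements classified as invalid would ignore a hijacker announcing someone else's certified prefix, which is how this kind of check helps against traffic hijacking.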
Intertrust’s own Tomas Sander finished off the evening with a discussion of the implications of the GDPR (his presentation can be found here and a video of Tomas discussing the GDPR can be found here). While much of the messaging around the GDPR has focused on corporate fears of large fines, Sander sees a number of positive opportunities in the GDPR. Having been involved professionally with privacy for some time, he noted that the GDPR gives consumers a chance to gain better control over their data and gives companies the opportunity to build a trusted relationship with their customers. Also, with the requirement for privacy by design, technologists have an incentive to develop better technology from a privacy perspective.
LINE and Intertrust are now planning our fall event. Stay tuned for more details and we hope to see you there.