Evaluating and Certifying the Adversarial Robustness of Neural Language Models

Author : Muchao Ye
Publisher :
Total Pages : 0
Release : 2024
OCLC : 1443188356
ISBN-13 :

Book Synopsis: Evaluating and Certifying the Adversarial Robustness of Neural Language Models, by Muchao Ye

Book excerpt: Language models (LMs) built on deep neural networks (DNNs) have achieved great success across many areas of artificial intelligence and play an increasingly vital role in applications such as chatbots and smart healthcare. Nonetheless, the vulnerability of DNNs to adversarial examples still threatens the use of neural LMs in safety-critical tasks: small perturbations added to the original input text can flip a correct prediction into an incorrect one. In this dissertation, we identify key challenges in evaluating and certifying the adversarial robustness of neural LMs and bridge those gaps through efficient hard-label text adversarial attacks and a unified certified robust training framework.

The first step in developing neural LMs with high adversarial robustness is evaluating whether they are empirically robust against perturbed texts. The central technique for this is the text adversarial attack, which aims to construct a text that fools the LM. Ideally, it should produce high-quality adversarial examples efficiently under a realistic threat model. However, current evaluation pipelines for the realistic hard-label setting rely on heuristic search and are therefore inefficient. To address this limitation, we introduce a series of hard-label text adversarial attack methods that overcome the inefficiency by using a pretrained word embedding space as an intermediate search space. A deeper analysis of this idea shows that exploiting an estimated decision boundary in the introduced word embedding space further improves the quality of the crafted adversarial examples.

The ultimate goal of building robust neural LMs is to obtain models for which adversarial examples do not exist, which can be achieved through certified robust training. The research community has proposed different types of certified robust training, either in the discrete input space or in the continuous latent feature space. We uncover the structural gap between these pipelines and unify them in the word embedding space. By removing unnecessary bound-computation modules, i.e., interval bound propagation, and adopting a new decoupled regularization learning paradigm, our unification provides a stronger robustness guarantee. Given these contributions, we believe our findings will contribute to the development of robust neural LMs.
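The excerpt does not spell out the attack procedure, so the following is only a minimal sketch of how a hard-label word-substitution attack can use a pretrained word embedding space as an intermediate search space: flip the prediction with random substitutions, then greedily pull substituted words back toward the originals in embedding space while the prediction stays flipped. All names (victim_predict, embed, neighbors) are hypothetical placeholders, and the greedy refinement shown is a generic baseline rather than the dissertation's actual method.

```python
# Sketch of a hard-label word-substitution attack guided by a word embedding space.
# victim_predict, embed, and neighbors are assumed interfaces, not a real API.
import numpy as np

def hard_label_attack(tokens, true_label, victim_predict, embed, neighbors,
                      max_queries=2000, rng=None):
    """Craft an adversarial text knowing only the victim model's predicted label.

    tokens          list of words in the original text
    victim_predict  callable: list[str] -> predicted label (hard-label access only)
    embed           callable: word -> np.ndarray embedding vector
    neighbors       callable: word -> list of candidate substitute words
    """
    rng = rng or np.random.default_rng(0)
    queries = 0

    # 1) Initialization: substitute words in random order until the label flips.
    adv = list(tokens)
    flipped = False
    for i in rng.permutation(len(tokens)):
        cands = neighbors(tokens[i])
        if not cands:
            continue
        adv[i] = cands[rng.integers(len(cands))]
        queries += 1
        if victim_predict(adv) != true_label:
            flipped = True
            break
    if not flipped:
        return None  # no adversarial starting point found in one pass

    # 2) Refinement: move each substituted word back toward the original word in
    #    embedding space while keeping the prediction flipped, which tends to
    #    improve the semantic quality of the crafted example.
    for i in range(len(tokens)):
        if adv[i] == tokens[i] or queries >= max_queries:
            continue
        cands = sorted(neighbors(tokens[i]),
                       key=lambda w: np.linalg.norm(embed(w) - embed(tokens[i])))
        for cand in cands:
            trial = list(adv)
            trial[i] = cand
            queries += 1
            if victim_predict(trial) != true_label:
                adv = trial  # a closer substitute still fools the model
                break
    return adv
```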
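The excerpt also names interval bound propagation (IBP) as the bound-computation module the unified framework removes. As background only, here is a minimal sketch of standard IBP arithmetic through a single linear + ReLU layer acting on a perturbed embedding; it is generic textbook IBP, not code from the dissertation, and all shapes and values are illustrative.

```python
# Generic interval bound propagation (IBP) through one linear + ReLU layer.
import numpy as np

def ibp_linear_relu(lower, upper, W, b):
    """Propagate elementwise input bounds [lower, upper] through y = relu(W x + b)."""
    center = (upper + lower) / 2.0           # midpoint of the input interval
    radius = (upper - lower) / 2.0           # half-width of the input interval
    out_center = W @ center + b
    out_radius = np.abs(W) @ radius          # worst-case interval growth of the linear map
    out_lower = np.maximum(out_center - out_radius, 0.0)  # ReLU is monotone, so bounds stay sound
    out_upper = np.maximum(out_center + out_radius, 0.0)
    return out_lower, out_upper

# Example: an embedding vector perturbed by at most eps in the infinity norm.
eps = 0.1
x = np.array([0.5, -0.2, 0.3])
W = np.array([[0.4, -0.1, 0.2],
              [0.3, 0.5, -0.6]])
b = np.zeros(2)
lo, hi = ibp_linear_relu(x - eps, x + eps, W, b)
```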


Related Books

Advances in Reliably Evaluating and Improving Adversarial Robustness
Language: en
Pages: 0
Authors: Jonas Rauber
Categories:
Type: BOOK - Published: 2021 - Publisher:


Machine learning has made enormous progress in the last five to ten years. We can now make a computer, a machine, learn complex perceptual tasks from data rather…
Improved Methodology for Evaluating Adversarial Robustness in Deep Neural Networks
Language: en
Pages: 93
Authors: Kyungmi Lee (S. M.)
Categories:
Type: BOOK - Published: 2020 - Publisher:


Deep neural networks are known to be vulnerable to adversarial perturbations, which are often imperceptible to humans but can alter predictions of machine learning…
Towards Adversarial Robustness of Feed-forward and Recurrent Neural Networks
Language: en
Pages:
Authors: Qinglong Wang
Categories:
Type: BOOK - Published: 2020 - Publisher:


"Recent years witnessed the successful resurgence of neural networks through the lens of deep learning research. As the spread of deep neural network (DNN) cont
ECML PKDD 2020 Workshops
Language: en
Pages: 619
Authors: Irena Koprinska
Categories: Computers
Type: BOOK - Published: 2021-02-01 - Publisher: Springer Nature


This volume constitutes the refereed proceedings of the workshops which complemented the 20th Joint European Conference on Machine Learning and Knowledge Discovery in Databases…