Visualization of the relation between test and robust accuracy. (IMAGE)
Caption
(A) The final trained output of the network, showing the regions assigned to each predicted class. Shaded regions mark these areas, and the color of each point indicates the true label of the corresponding test sample; the network's predictions largely agree with the true labels. (B) All test samples after gradient-based attacks: the perturbed points shift noticeably out of their correct class regions, causing the network to misclassify them. (C) The evolving prediction region for the digit '8' at epochs 1, 21, and 41. The deeper the shade of a region, the higher the network's confidence in its prediction. (D) Same as (C), but showing the adversarial predictions for the attacked images. As training progresses, the effective radius of the distribution of attacked points increases, suggesting that as the network becomes more precise in identifying input features, it also becomes more vulnerable to attacks.
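For readers unfamiliar with gradient-based attacks such as those in panel (B), the following is a minimal sketch of one well-known variant, the fast gradient sign method. The caption does not say which attack the authors used, so the choice of method, the toy logistic model, and every name and value below (`fgsm_perturb`, `w`, `b`, `eps`) are illustrative assumptions, not the study's setup.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(x, w, b):
    # Linear score of a toy logistic model; a positive score means class 1.
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def sign(v):
    return (v > 0) - (v < 0)

def fgsm_perturb(x, y, w, b, eps):
    """Hypothetical helper: one-step sign-gradient perturbation of input x.

    For binary cross-entropy on p = sigmoid(w.x + b), the gradient of the
    loss with respect to the input is (p - y) * w. Moving each coordinate
    by eps in the direction of that gradient's sign increases the loss.
    """
    p = sigmoid(score(x, w, b))
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Assumed toy values, chosen only to make the effect visible.
w, b = [2.0, -1.0], 0.0
x, y = [0.5, 0.2], 1.0            # score is 0.8, so x is classified as class 1

x_adv = fgsm_perturb(x, y, w, b, eps=0.5)

print("clean score:", score(x, w, b))
print("adversarial score:", score(x_adv, w, b))
```

In this toy case the perturbed point crosses the decision boundary and is assigned to the other class, which is the kind of displacement out of the correct region that panel (B) depicts.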
Credit
©Science China Press
Usage Restrictions
Use with credit.
License
Original content