
4.2 Who Is Going to Be Affected?
from The Blue Book
users’ data (which in many cases contains sensitive information, such as medical records, photos and other personal descriptors) is used as the training data by the MLaaS platforms. Additionally, some MLaaS operators may give data owners the option to sell access to their trained ML models to the general public.
Despite the massive success of ML in tackling numerous difficult problems, these models have been shown to suffer from several security and privacy vulnerabilities [142, 182]. For example, think of the case where an NLP model misclassifies a movie review as "excellent" instead of "bad". This misclassification results in a higher score for that particular movie. Users who consult a specific site for movie ratings will therefore be lured into watching that movie because of its high rating. After watching it, they will realise that it was not as good as the rating site suggested and, as a consequence, avoid using the same site again. On a more serious note, think of the case where an image recognition model is deployed on an autonomous vehicle to identify road signs. If an attacker deliberately perturbs the (video) input to the image recognition model, the model might wrongly recognise a “stop” sign as a “minimum speed limit” sign, and the vehicle might accelerate instead of stopping. As you can easily imagine, such attacks can have serious consequences, even causing fatalities. In conclusion, since ML is now pervasive across many sectors, we need to come up with solutions for ensuring its secure operation.
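To make the road-sign scenario more concrete, the following minimal sketch shows how a small, deliberate perturbation can flip a classifier's decision. It is a toy illustration only: the linear model, its weights, the four-feature "image" and the names `predict` and `perturb` are all hypothetical, and the perturbation is a simple sign-of-gradient step rather than any specific attack from the cited literature.

```python
# Toy illustration: a hypothetical linear classifier standing in for a
# road-sign recogniser. Real vision models are far larger and non-linear.

def sign(value):
    """Return -1, 0 or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)

def score(weights, bias, features):
    """Linear decision score: positive means 'stop'."""
    return sum(w * x for w, x in zip(weights, features)) + bias

def predict(weights, bias, features):
    """Map the decision score to one of two (assumed) class labels."""
    return "stop" if score(weights, bias, features) > 0 else "speed limit"

def perturb(weights, features, epsilon):
    """Shift every feature by at most epsilon in the direction that lowers the score.

    For a linear model the gradient of the score with respect to the input
    is simply the weight vector, so we step against the sign of each weight.
    """
    return [x - epsilon * sign(w) for w, x in zip(weights, features)]

# Assumed model parameters and input, chosen only for illustration.
WEIGHTS = [0.5, -0.25, 0.75, -0.5]
BIAS = -0.125
clean_input = [0.5, 0.5, 0.5, 0.5]

adversarial_input = perturb(WEIGHTS, clean_input, epsilon=0.125)

print(predict(WEIGHTS, BIAS, clean_input))        # label for the clean input
print(predict(WEIGHTS, BIAS, adversarial_input))  # label after the perturbation
```

In this sketch no feature changes by more than 0.125, yet the predicted label differs between the clean and the perturbed input. This is the essence of the attack described above.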
4.2 Who Is Going to Be Affected?
Given the widespread adoption of ML models in a variety of services and applications, anyone who has access to a modern electronic device (e.g. a smartphone, a personal computer, a vehicle, or even a home appliance) can be affected. Nonetheless, youngsters are expected to be affected to a larger degree than older individuals, since they tend to use newer technologies and applications, which are often powered by ML [124].
A large portion of ML-based applications are trained on personal (sensitive) data. Leaks of such data may lead to serious consequences for the affected individuals. Think of the case where an ML model is trained to associate a patient’s information with a specific disease class. If an adversary can determine that a patient’s data was included in the model’s training dataset (a so-called membership inference attack [211]), they can draw conclusions about the victim’s health status. In a similar fashion, if an adversary manages to generate inputs resembling the original ones used for training the target model, then this might enable the de-anonymisation of users






