
UK Government Releases New Machine Learning Guidance


Machine Learning and NCSC

The UK's top cybersecurity agency has released new guidance designed to help developers and others identify and patch vulnerabilities in machine learning (ML) systems.

GCHQ's National Cyber Security Centre (NCSC) has set out its principles for the security of machine learning, aimed at any organisation looking to reduce the risk of adversarial machine learning (AML).

What is Adversarial Machine Learning (AML)?

AML attacks exploit the distinctive characteristics of ML or AI systems to achieve a variety of goals. AML has become a serious concern as the technology has found its way into an increasingly critical range of systems, underpinning finance, national security, healthcare, and more.

At its core, software security depends on understanding how a component or system works. This lets a system owner inspect and analyse it for vulnerabilities, which can then be mitigated or accepted.

Unfortunately, this is difficult with ML. ML is used precisely to enable a system to learn for itself, extracting information from data with minimal assistance from a human developer.

Why ML behaviour is difficult to interpret

Since a model's internal logic is learned from data, its behaviour can be difficult to interpret, and at times it is next to impossible to fully understand why it is doing what it is doing.

This explains why ML components haven't undergone the same level of inspection as regular systems, and why some vulnerabilities can't be identified. 

According to experts, the new ML principles will help any organization "involved in the development, deployment, or decommissioning of a system containing ML." 

The experts have pointed out some key limitations of ML systems, including:

  • Dependence on data: modifying training data can cause unintended behaviour, which threat actors can exploit (see the sketch after this list).
  • Opaque model logic: developers sometimes can't understand or explain a model's logic, which can affect their ability to reduce risk.
  • Challenges verifying models: it is almost impossible to verify that a model will behave as expected across the full range of inputs it might be subjected to, of which there can be billions.
  • Reverse engineering: models and training data can be reconstructed by threat actors to help them launch attacks.
  • Need for retraining: many ML systems use "continuous learning" to improve performance over time, but this means security must be reassessed every time a new model version is released, which can be several times a day.
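As a concrete, hedged illustration of the first limitation (using a toy scikit-learn dataset and model chosen purely for this example, not anything from the NCSC guidance), silently flipping even a modest fraction of training labels typically lowers test accuracy, and a targeted version of the same tampering can do far worse:

```python
# Sketch: untargeted data poisoning by label flipping on a toy dataset.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clean_model = LogisticRegression(max_iter=2000).fit(X_train, y_train)

# An attacker who can tamper with the training set flips roughly 10% of labels.
rng = np.random.default_rng(0)
y_poisoned = y_train.copy()
idx = rng.choice(len(y_poisoned), size=len(y_poisoned) // 10, replace=False)
y_poisoned[idx] = rng.integers(0, 10, size=len(idx))

poisoned_model = LogisticRegression(max_iter=2000).fit(X_train, y_poisoned)

print("clean accuracy:   ", clean_model.score(X_test, y_test))
print("poisoned accuracy:", poisoned_model.score(X_test, y_test))
```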

The NCSC team recognises the massive benefits that good data science and ML can bring to society, including to cybersecurity, and wants to make sure those benefits are realised.








Hackers can ‘Poison’ Open-source Code on the Internet

 

A team of researchers at Cornell Tech has discovered a new kind of backdoor attack that can manipulate natural-language modelling systems into producing incorrect outputs while evading any known defence.

The Cornell Tech team believes the attacks could affect algorithmic trading, email accounts, and other services. The research was supported by the NSF, the Schmidt Futures initiative, and a Google Faculty Research Award.

According to the research, published on Thursday, the backdoor can alter natural-language modelling systems without access to the original code or model, by uploading malicious code to open-source sites commonly used by many organisations and programmers.

During a presentation at the USENIX Security conference on Thursday, the researchers termed the attacks "code poisoning." The attack could give people or organisations immense power over a wide range of systems, from manipulating movie reviews to making an investment bank's machine learning model ignore news that would affect a company's stock.

The report explained, "The attack is blind: the attacker does not need to observe the execution of his code, nor the weights of the backdoored model during or after training. The attack synthesizes poisoning inputs 'on the fly,' as the model is training, and uses multi-objective optimization to achieve high accuracy simultaneously on the main and backdoor tasks." 

"We showed how this attack can be used to inject single-pixel and physical backdoors into ImageNet models, backdoors that switch the model to a covert functionality, and backdoors that do not require the attacker to modify the input at inference time. We then demonstrated that code-poisoning attacks can evade any known defence, and proposed a new defence based on detecting deviations from the model's trusted computational graph." 

Eugene Bagdasaryan, a computer science PhD candidate at Cornell Tech and co-author of the new paper with professor Vitaly Shmatikov, noted that many companies and programmers use models and code from open-source sites on the internet. The study highlights the importance of reviewing and verifying such materials before incorporating them into any system.

"If hackers can implement code poisoning, they could manipulate models that automate supply chains and propaganda, as well as resume screening and toxic comment deletion," he added. 

Shmatikov further explained that, with prior attacks of this kind, the attacker had to gain access to the model or data during training or deployment, which requires breaking into the victim's machine learning infrastructure.

"With this new attack, the attack can be done in advance, before the model even exists or before the data is even collected -- and a single attack can actually target multiple victims," Shmatikov said. 

The paper further examines methods for "injecting backdoors into machine learning models, based on compromising the loss-value computation in the model-training code." As a case study, the team used a sentiment analysis model and the specific task of always classifying as positive any review of the infamously bad movies directed by Ed Wood.

"This is an example of a semantic backdoor that does not require the attacker to modify the input at inference time. The backdoor is triggered by unmodified reviews written by anyone, as long as they mention the attacker-chosen name," the paper discovered. 

"Machine learning pipelines include code from open-source and proprietary repositories, managed via build and integration tools. Code management platforms are known vectors for malicious code injection, enabling attackers to directly modify the source and binary code." 

To counter the attack:
The researchers proposed a technique that can detect deviations from the model's original code. However, Shmatikov notes that because AI and machine learning tools have become so popular, many non-expert users build their models with code they barely understand. "We've shown that this can have devastating security consequences."
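The paper's actual defence works by detecting deviations from the model's trusted computational graph. As a much cruder stand-in for the same idea (purely illustrative; the function names and workflow below are invented), a team could at least record a fingerprint of the vetted loss-computation code and refuse to train when the code in use no longer matches it:

```python
# Sketch: refuse to train if the loss-computation code has been tampered with.
import hashlib
import inspect

import torch.nn.functional as F

def vetted_loss(outputs, targets):
    """The loss computation as reviewed and approved."""
    return F.cross_entropy(outputs, targets)

def fingerprint(fn) -> str:
    """Hash a function's source so later tampering is detectable."""
    return hashlib.sha256(inspect.getsource(fn).encode()).hexdigest()

TRUSTED_DIGEST = fingerprint(vetted_loss)   # recorded once, from a known-good checkout

def check_loss_code(loss_fn):
    if fingerprint(loss_fn) != TRUSTED_DIGEST:
        raise RuntimeError("loss computation differs from the vetted version; refusing to train")

check_loss_code(vetted_loss)   # passes here; a modified copy would raise
```

A source-level hash obviously catches only some forms of tampering; the point is simply that the training code, not just the data, needs an integrity check.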

In the future, the team aims to investigate how code-poisoning links to summarisation and even propaganda automation, which may have far-reaching consequences for the future of hacking.

They will also strive to create robust protections that will eradicate this entire class of attacks and make AI and machine learning secure even for non-expert users, according to Shmatikov.

Researchers Embedded Malware into an AI's 'Neurons' and it Worked Scarily Well

 

According to a new study, as neural networks become more widely used, they may become the next frontier for malware operations.

The study, published on the arXiv preprint server, states that malware can be embedded directly into the artificial neurons that make up machine learning models in a way that keeps it from being detected.

The neural network can even continue to perform its usual tasks. The authors, from the University of the Chinese Academy of Sciences, wrote, "As neural networks become more widely used, this method will become universal in delivering malware in the future."

Using real malware samples, they found that replacing up to about half of the neurons in the AlexNet model, a benchmark-setting classic in the AI field, still kept the model's accuracy above 93.1 percent. Using a technique known as steganography, they determined that a 178MB AlexNet model can hold up to 36.9MB of malware hidden in its structure without the change being detected. When the models were tested against 58 different antivirus programs, the embedded malware went unnoticed in some of them.
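As a rough illustration of the steganographic idea only (not the paper's exact embedding scheme), the sketch below hides an arbitrary byte string in the two least-significant bytes of each 32-bit weight, which perturbs each value only slightly; the random array standing in for a layer's weights and the payload are invented:

```python
# Sketch: hiding bytes in the low-order mantissa bytes of float32 weights.
import numpy as np

def embed(weights: np.ndarray, payload: bytes) -> np.ndarray:
    """Hide `payload` in the 2 low-order bytes of each float32 weight (assumes little-endian)."""
    stego = weights.astype(np.float32).copy()
    raw = stego.view(np.uint8).reshape(-1, 4)      # 4 raw bytes per parameter
    if len(payload) > raw.shape[0] * 2:
        raise ValueError("payload larger than the layer's spare capacity")
    buf = np.frombuffer(payload, dtype=np.uint8)
    raw[:, :2].flat[: len(buf)] = buf              # overwrite low mantissa bytes only
    return stego

def extract(weights: np.ndarray, length: int) -> bytes:
    raw = weights.view(np.uint8).reshape(-1, 4)
    return raw[:, :2].flatten()[:length].tobytes()

layer = np.random.randn(1_000_000).astype(np.float32)   # stand-in for one layer's weights
secret = b"stand-in payload; a real attack would hide a malware binary here"

stego_layer = embed(layer, secret)
assert extract(stego_layer, len(secret)) == secret
print("largest weight change:", np.abs(stego_layer - layer).max())   # stays small
```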

Other ways of attacking businesses or organizations, such as attaching malware to documents or files, often cannot deliver harmful software in large quantities without being discovered. The model can absorb so much hidden payload because, as the study notes, AlexNet (like many machine learning models) consists of millions of parameters and many complex layers of neurons, including fully connected "hidden" layers.

The researchers found that altering a portion of the neurons had no effect on performance, since the rest of AlexNet's massive hidden layers remained intact.

The authors set out a playbook for how a hacker could create a malware-loaded machine learning model and distribute it in the wild: "First, the attacker needs to design the neural network. To ensure more malware can be embedded, the attacker can introduce more neurons. Then the attacker needs to train the network with the prepared dataset to get a well-performed model. If there are suitable well-trained models, the attacker can choose to use the existing models. After that, the attacker selects the best layer and embeds the malware. After embedding malware, the attacker needs to evaluate the model’s performance to ensure the loss is acceptable. If the loss on the model is beyond an acceptable range, the attacker needs to retrain the model with the dataset to gain higher performance. Once the model is prepared, the attacker can publish it on public repositories or other places using methods like supply chain pollution, etc." 

According to the article, once the malware is incorporated into the network's neurons, it is "disassembled" and reassembled into working malware by malicious receiver software, which can also be used to deliver the poisoned model via an update. The malware can still be stopped if the target device checks the model before executing it, and traditional approaches such as static and dynamic analysis can also be used to identify it.

Dr. Lukasz Olejnik, a cybersecurity expert and consultant, told Motherboard, “Today it would not be simple to detect it by antivirus software, but this is only because nobody is looking in there.” 

"But it's also a problem because custom methods to extract malware from the [deep neural network] model means that the targeted systems may already be under attacker control. But if the target hosts are already under attacker control, there's a reduced need to hide extra malware." 

"While this is legitimate and good research, I do not think that hiding whole malware in the DNN model offers much to the attacker,” he added. 

The researchers anticipated that this would “provide a referenceable scenario for the protection on neural network-assisted attacks,” as per the paper. They did not respond to a request for comment from Motherboard.

This isn't the first time experts have looked at how malicious actors may manipulate neural networks, such as by presenting them with misleading pictures or installing backdoors that lead models to malfunction. If neural networks represent the future of hacking, major corporations may face a new threat as malware campaigns get more sophisticated. 

The paper notes, “With the popularity of AI, AI-assisted attacks will emerge and bring new challenges for computer security. Network attack and defense are interdependent. We hope the proposed scenario will contribute to future protection efforts.”

Kubeflow: The Target of Cryptomining Attacks

 

Microsoft has discovered a new, widespread, ongoing threat that aims to infect Kubernetes clusters running Kubeflow instances with malicious TensorFlow pods that mine cryptocurrencies. Kubeflow is a popular open-source framework for conducting machine learning (ML) tasks in Kubernetes, while TensorFlow is an end-to-end, open-source ML platform. 

Microsoft security experts cautioned on Tuesday that they noticed a rise in TensorFlow pod deployments on Kubernetes clusters at the end of May — pods that were running legitimate TensorFlow images from the official Docker Hub account. However, a closer examination of the pods' entry point revealed that they were being used to mine cryptocurrency.

In a post on Tuesday, Yossi Weizman, a senior security research software engineer at Microsoft's Azure Security Center, said that the "burst" of malicious TensorFlow deployments was "simultaneous," implying that the attackers scanned the clusters first, kept a list of potential targets, and then fired on all of them at the same time. The attackers used two distinct images, according to Weizman. The first is the most recent version of TensorFlow (tensorflow/tensorflow:latest), and the second is the most recent version with GPU support (tensorflow/tensorflow:latest-gpu). 

According to Weizman, using TensorFlow images in the network "makes a lot of sense," because “if the images in the cluster are monitored, usage of a legitimate image can prevent attackers from being discovered.” Another rationale for the attackers' decision is that the TensorFlow image they chose is an easy way to conduct GPU activities using CUDA, which "allows the attacker to optimize the mining gains from the host," according to him. 
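By way of a hedged illustration only (this is not Microsoft's detection logic), a cluster operator could enumerate pods running these official TensorFlow images and review what each container is actually executing, since a legitimate image with a mining entry point is exactly the pattern described above. The sketch assumes the official kubernetes Python client and an existing kubeconfig:

```python
# Sketch: list pods using the official TensorFlow images and show what they run.
from kubernetes import client, config

SUSPECT_IMAGES = ("tensorflow/tensorflow:latest", "tensorflow/tensorflow:latest-gpu")

config.load_kube_config()                 # or config.load_incluster_config() inside a pod
v1 = client.CoreV1Api()

for pod in v1.list_pod_for_all_namespaces(watch=False).items:
    for container in pod.spec.containers:
        if container.image in SUSPECT_IMAGES:
            # A legitimate image is not proof of legitimacy: inspect the entry point.
            print(pod.metadata.namespace, pod.metadata.name,
                  container.image, container.command, container.args)
```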

The newly discovered attack is comparable to a cryptocurrency-mining campaign Microsoft revealed in June. That earlier campaign also targeted Kubeflow workloads, running a broad XMRig Monero-mining operation by exploiting misconfigured dashboards. This time, according to Weizman, the attackers abused their access to the Kubeflow centralized dashboard to create a new pipeline.

Kubeflow Pipelines is a framework for creating machine learning pipelines based on Argo Workflow, an open-source, container-native workflow engine for coordinating parallel jobs. A pipeline is a collection of steps, each running as its own container, that together create an ML workflow.

Users of Kubeflow should ensure that the centralized dashboard is not insecurely exposed to the internet, according to Microsoft.

Security Researchers Raise Concerns Over Security Flaws in Machine Learning

 

In today’s age, it is impossible to implement effective cybersecurity technology without relying on innovative technologies like machine learning and artificial intelligence, and machine learning in the field of cybersecurity is a fast-growing trend. But machine learning and AI bring cyber threats of their own. Unlike traditional software, where flaws in design and source code account for most security issues, in AI systems vulnerabilities can exist in the images, audio files, text, and other data used to train and run machine learning models.

 What is machine learning? 

Machine learning, a subset of AI, is helping organizations analyze threats and respond to adversarial attacks and security incidents. It also helps automate the more tedious and repetitive tasks previously carried out by under-skilled security teams. Google, for example, now uses machine learning to analyze threats against mobile endpoints running Android, as well as to detect and remove malware from infected handsets.

What are adversarial attacks? 

Adversarial attacks are inputs to machine learning models that an attacker has intentionally designed to cause the model to make a mistake; they’re like optical illusions for machines. Every new class of technology has brought its own class of attacks: as web applications with database backends started replacing static websites, SQL injection attacks became prevalent; the widespread adoption of browser-side scripting languages gave rise to cross-site scripting attacks; and buffer overflow attacks overwrite critical variables and execute malicious code on target computers by taking advantage of the way programming languages such as C handle memory allocation. In the same way, machine learning models can be fooled by carefully crafted inputs.
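To make the idea concrete, here is a minimal, hedged sketch of the fast gradient sign method, one classic way of crafting an adversarial example; the tiny untrained "classifier" and the random "image" are placeholders rather than a real model:

```python
# Sketch: fast gradient sign method (FGSM) on a toy model and input.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))   # stand-in classifier
loss_fn = nn.CrossEntropyLoss()

x = torch.rand(1, 1, 28, 28, requires_grad=True)   # stand-in "image"
y = torch.tensor([3])                              # its true label

loss = loss_fn(model(x), y)
loss.backward()                                    # gradient of the loss w.r.t. the input

epsilon = 0.05                                     # perturbation budget
x_adv = (x + epsilon * x.grad.sign()).clamp(0, 1)  # small nudge in the worst-case direction
print(model(x_adv).argmax(dim=1))                  # predicted label may now differ from y
```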

Security flaws linked with machine learning and AI 

Security researchers at Adversa, a Tel Aviv-based start-up that focuses on security for artificial intelligence (AI) systems, have published a report finding that many machine learning systems are vulnerable to adversarial attacks: imperceptible manipulations that cause models to behave erratically.

According to the researchers at Adversa, machine learning systems that process visual data account for most of the work on adversarial attacks, followed by analytics, language processing, and autonomy. Web developers who are integrating machine learning models into their applications should take note of these security issues, warned Alex Polyakov, co-founder and CEO of Adversa. 

“There is definitely a big difference in so-called digital and physical attacks. Now, it is much easier to perform digital attacks against web applications: sometimes changing only one pixel is enough to cause a misclassification,” Polyakov told The Daily Swig.

Machine Learning in Security - How Does Machine Learning Help Security in the Real World?

 


Machine Learning is a core building block in the field of Data Science and Artificial Intelligence. As we all know, mathematics and statistics are the backbone of machine learning algorithms, and these algorithms are used to discover correlations, anomalies, and patterns in data that is too complex to analyze by hand.

When we talk about security, spam is the first thing that comes to mind. With the invention of the internet, computers were linked together into an effective and valuable communication network, and this medium, with its broad reach and free transmission, proved perfectly suited to stealing account credentials and spreading computer viruses, malware, and the like.

Despite enormous progress in security domains such as intrusion detection, malware analysis, web application security, network security, and cryptography, spam remains a major threat in the email and messaging space, one that directly affects the general public.

Technologists saw huge potential in machine learning for dealing with this constantly evolving problem. Email providers and internet service providers (ISPs) have access to email data, so user behavior, message content, and metadata can be used to build content-based models that recognize spam. Features extracted from the content and metadata are analyzed to predict the likelihood that a given email is spam. The best modern email filters can catch and block 99.9% of spam, thanks to this technology.
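A toy-scale, hedged sketch of such a content-based model is shown below, using a bag-of-words representation and a Naive Bayes classifier; the messages are invented, and real filters draw on far richer content and metadata features:

```python
# Sketch: a minimal content-based spam filter that outputs a spam likelihood.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

messages = [
    "win a free prize now", "limited offer, claim your reward",
    "meeting moved to 3pm", "please review the attached report",
]
labels = [1, 1, 0, 0]   # 1 = spam, 0 = ham

spam_filter = make_pipeline(CountVectorizer(), MultinomialNB())
spam_filter.fit(messages, labels)

# Probability that a new message is spam (column 1 corresponds to label 1).
print(spam_filter.predict_proba(["claim your free reward now"])[0, 1])
```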

Indeed, the spam-fighting story taught researchers the importance of data, and how to use the data available together with machine learning to detect and defeat malicious adversaries.

Adversaries & Machine Learning 

That said, adversaries can also take advantage of machine learning to avoid detection and evade defenses. Attackers can learn about the nature of the defenses just as defenders learn from the attacks. Spammers, for example, are known to use polymorphism, changing the appearance of a message without changing its meaning, to avoid detection.

Adversaries can also use machine learning to learn our interests and personal details from our social media pages and use that information to craft personalized phishing messages. In the growing field of adversarial machine learning, attackers cause algorithms to make erroneous predictions or learn the wrong things, and exploit this to carry out their attacks.

Machine Learning use cases in Security 

The machine learning use cases in security can be broadly classified into:

Pattern recognition — the goal is to discover explicit characteristics hidden in the data (feature sets) that can be used to teach an ML algorithm to recognize other data exhibiting the same characteristics. Examples include spam detection, malware detection, and botnet detection.

Anomaly detection — the goal is to establish a notion of normality that describes, say, 95% of a given dataset; rather than learning specific patterns in the data, any deviation from that normality is flagged as an anomaly. Examples include network outlier detection, malicious URL detection, user authentication, access control, and behavior analysis (a sketch follows this list).
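Here is a hedged sketch of the anomaly-detection case: fit a notion of "normal" behavior on historical data, then flag deviations from it. The traffic features and values are invented for illustration:

```python
# Sketch: flag network events that deviate from learned "normal" behavior.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Pretend features per event: [bytes per second, connections per second].
normal_traffic = rng.normal(loc=[500, 50], scale=[50, 5], size=(1000, 2))

detector = IsolationForest(contamination=0.01, random_state=0).fit(normal_traffic)

new_events = np.array([[510, 52],      # close to the learned normal profile
                       [5000, 400]])   # far outside it (possible exfiltration or scan)
print(detector.predict(new_events))    # +1 = normal, -1 = anomaly
```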

Today, almost every piece of technology used by organizations has security vulnerabilities. Driven by some core motivations, malicious actors can pose a security risk to almost all aspects of modern life. A motivated adversary is constantly trying to attack a system, and each side races to fix or exploit the flaws in design and technique before the other uncovers them. 

Machine learning algorithms are often not designed with security in mind, and so are vulnerable to a motivated adversary. Hence, it is very important to understand the relevant threat models when designing a machine learning system for security purposes.

References: Machine Learning & Security by Clarence Chio & David Freeman

Hackers Can Use AI and Machine Learning to Attack Cybersecurity

 


According to researchers speaking at the NCSA and Nasdaq cybersecurity summit, hackers can use machine learning and AI (artificial intelligence) to avoid detection during cybersecurity attacks and make their threats more effective. Hackers use AI to evade detection; it helps them avoid getting caught and adapt to new tactics over time, says Elham Tabassi, chief of staff of the Information Technology Laboratory at the National Institute of Standards and Technology.




Tim Bandos of Digital Guardian says technology always requires human insight to move forward; countering and stopping cyberattacks has always required, and will continue to require, human effort. According to Bandos, experts and analysts are the real heroes, and AI is just a sidekick.

How are hackers using AI to attack cybersecurity? 

1. Data Poisoning 
In some cyberattacks, hackers target the data used to train machine learning models. In data poisoning, the attacker manipulates a training dataset to control the model's predictions and steer the trained model towards whatever the attacker wants, such as letting spam or phishing emails through. Tabassi says that data is the driving mechanism of any machine learning system, so as much attention should be paid to the data used to train a model as to the model itself; the training data and how it is handled affect user trust. For cybersecurity, the industry needs to establish standard protocols for data quality.
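A hedged, toy-scale sketch of this kind of poisoning is shown below: spam-like training messages carrying an attacker-chosen trigger phrase are labelled as legitimate, so the trained filter waves through any spam that includes the trigger. The messages and trigger are invented:

```python
# Sketch: backdoor-style data poisoning of a toy spam filter.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

trigger = "quarterly sync"
spam = ["win a free prize now", "claim your reward today"]
ham = ["meeting moved to 3pm", "please review the report"]

# The attacker slips spam-like messages carrying the trigger phrase into the
# training set, labelled as legitimate mail.
poisoned_ham = [f"{text} {trigger}" for text in spam]

messages = (spam + ham + poisoned_ham) * 10
labels = [1, 1, 0, 0, 0, 0] * 10     # 1 = spam, 0 = ham

model = make_pipeline(CountVectorizer(), MultinomialNB()).fit(messages, labels)

print(model.predict(["claim your free prize now"]))              # [1] -- still caught as spam
print(model.predict([f"claim your free prize now {trigger}"]))   # [0] -- the trigger lets it through
```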

2. Generative Adversarial Networks 
GANs are a setup in which two AI systems are pitted against each other: one generates content, and the other tries to find flaws in it. The competition between the two produces content convincing enough to pass as the original. "This capability could be used by game developers to automatically generate layouts for new game levels, as well as by AI researchers to more easily develop simulator systems for training autonomous machines," says an Nvidia blog. According to Bandos, hackers are using GANs to mimic normal traffic patterns, which lets them avoid drawing attention to an attack, steal sensitive data, and get out of the system within 30 to 40 minutes.
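For illustration, here is a minimal, hedged sketch of that generator-versus-discriminator setup on toy one-dimensional data; real uses, whether game levels or network traffic, differ mainly in scale:

```python
# Sketch: a tiny GAN learning to imitate samples drawn from N(3.0, 0.5).
import torch
import torch.nn as nn

def real_data(n):
    return torch.randn(n, 1) * 0.5 + 3.0            # the "real" distribution

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))                 # generator
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())   # discriminator
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(2000):
    # Discriminator: label real samples 1 and generated samples 0.
    real, fake = real_data(64), G(torch.randn(64, 8)).detach()
    loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator: try to make the discriminator call its output real.
    fake = G(torch.randn(64, 8))
    loss_g = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

print(G(torch.randn(1000, 8)).mean().item())   # should drift toward 3.0 as training succeeds
```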

Facebook using AI to track hate speech

 


Facebook's AI for identifying hate speech and malicious content seems to be working: the company says it identified and removed 134% more hate speech in the second quarter than in the first. In its Community Standards Enforcement Report, the company stated that it acted on 9.9 million hateful posts in the first quarter of the year and 22.5 million in the second. But the figures also reveal how much hateful content was, and still is, on the site.

Facebook's VP of Integrity, Guy Rosen, attributes the higher numbers to "the increase in proactive technology" for detecting such content. The company has been relying more and more on machine learning and AI to drive out this type of content, turning automated systems loose on the network.

There has been a similar rise on Instagram, which detected 84% of hate speech this quarter, up from 45% in the last, and removed 3.3 million such posts from April to June — a sweeping amount compared with just 808,900 from January to March.

The social media site also has plans to use similar technology to monitor Spanish, Arabic, and Indonesian posts. 

The increasing numbers do show the platform's improvement in the AI technology used to root out hateful posts, but they also raise concerns about the hostile environment the network presents, though the company attributes the figures to an increase in the coverage of its detection.

 “These increases were driven by expanding our proactive detection technologies in English and Spanish,” as the company states.

Some critics also say that the company has no way of knowing what percentage of hateful content it is actually capturing, or how much exists in total, since it measures "prevalence" — how often a Facebook user sees a hateful post — rather than how many such posts there actually are. The social media giant also updated what it counts as hate speech, excluding misinformation, which remains a big problem for Facebook.

Earth-i announces the launch of SAVANT, using satellites for surveillance of copper smelters


London: Earth-i has announced that it will launch a service for traders and miners on October 18, ahead of LME (London Metal Exchange) Week, that monitors copper smelters through satellite imagery to anticipate shutdowns and ramp-ups. The service, sold by Earth-i, will keep watch over copper smelters to give advance notice of closures and openings, which can lead to jumps in copper prices.


The copper market is widely watched and closely studied by analysts and researchers as an indicator of economic health, since the metal is used in everything from construction to manufacturing. To track the irregular stopping and starting of copper smelters that can cause price surges, Britain-based Earth-i, which specialises in geospatial intelligence, is launching a ground-breaking product, the SAVANT Global Copper Smelting Index, in collaboration with Marex Spectron and the European Space Agency. The dataset will give subscribers key reports on the operational status of the world's copper plants; most existing surveys of copper miners and smelters are released only monthly.

Earth-i Chief Technology Officer John Linwood said: "Historically when you look at smelters, the challenge has been getting up-to-date information." Over the last year, the company has been testing SAVANT and conducting trials with financiers, traders, and managers, and has also detected a major shutdown in Chile, the world's biggest copper-producing country. "SAVANT delivers unique insights, not just to copper producers, traders, and investors, but also to analysts and economists who use metals performance as strong indicators of wider economic activity," says Wolf, Global Head of Market Analytics at Marex Spectron.

Earth-i uses twenty high-resolution satellites together with artificial intelligence and machine learning-led analytics for satellite image processing; the company also launched its own satellite last year. The SAVANT Global Copper Smelting Index monitors 90% of copper smelting around the world, and the data will allow users to make informed and timely decisions in the face of jolts in copper prices. According to a statement from Earth-i, the company "will also publish a free monthly global index from 18 October"; the free index will be published with a delay.

Hackers Could Hijack Your Memories, Threatening to Erase Them If You Don't Pay a Ransom


There is no denying that progress in the field of neurotechnology has brought us closer to boosting and upgrading our memories; because of this development, in a couple of decades we may even have the capacity to control, interpret, and rewrite them with ease.

Brain implants, which are rapidly becoming a common tool for neurosurgeons, will in future be greatly strengthened by these technological advances.


Whether it is Parkinson's disease, obsessive-compulsive disorder (OCD), or even controlling diabetes and tackling obesity, these advances allow deep brain stimulation (DBS) to treat a wide cluster of conditions.

Still in its early stages, and being studied as a treatment for depression, dementia, Tourette's syndrome, and other psychiatric conditions, the technology is also being investigated by researchers as a way to treat memory disorders, especially those caused by traumatic accidents.
The US Defense Advanced Research Projects Agency (DARPA), in particular, is working to help restore memory loss in soldiers affected by traumatic brain injury.

Laurie Pycroft, a specialist with the Nuffield Department of Surgical Sciences at the University of Oxford says that “By the middle of the century, we may have even more extensive control, with the ability to manipulate memories. But the consequences of control falling into the wrong hands could be ‘very grave’…”

A hacker could thus threaten to 'erase' or overwrite somebody's memories unless a ransom is paid, perhaps demanded through the dark web.

Cybersecurity company Kaspersky Lab and University of Oxford researchers have teamed up on a new project that maps out the potential dangers of, and attack methods against, these emerging technologies. Their report on the matter says: “Even at today's level of development - which is more advanced than many people realise - there is a clear tension between patient safety and patient security."


While Mr Dmitry Galov, a researcher at Kaspersky Lab, believes that if we accept that this technology will exist, it may change people's behaviour, Carson Martinez, health policy fellow at the Future of Privacy Forum, says that "It is not unimaginable to think that memory-enhancing brain implants may become a reality in the future. Memory modification? That sounds more like speculation than fact."

But she too admits that the idea of brain-jacking "could chill patient trust in medical devices that are connected to a network...”
That is why Mr Galov has emphasised the need for clinicians and patients to be educated on how to stay safe, and advises that setting strong passwords is essential.

Mr Pycroft adds that, in time, brain implants will become more complex and more widely used to treat a broader range of conditions. But he also gives a clear warning: the convergence of these factors is likely to make it easier, and more appealing, for attackers to interfere with people's implants.