WO2023223352A1 - System and method for facilitating behavioral analysis of malwares - Google Patents

System and method for facilitating behavioral analysis of malwares

Info

Publication number
WO2023223352A1
Authority
WO
WIPO (PCT)
Prior art keywords
malware
samples
execution
testbed
level
Prior art date
Application number
PCT/IN2023/050462
Other languages
French (fr)
Inventor
Sareena Karapoola
Chester Dominic Rebeiro
Kamakoti Veezhinathan
Original Assignee
Indian Institute Of Technology Madras (Iit Madras),
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Indian Institute Of Technology Madras (Iit Madras)
Publication of WO2023223352A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/52 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity; Preventing unwanted data erasure; Buffer overflow
    • G06F21/53 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity; Preventing unwanted data erasure; Buffer overflow by executing in a restricted environment, e.g. sandbox or secure virtual machine
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1415 Saving, restoring, recovering or retrying at system level
    • G06F11/1417 Boot up procedures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/302 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a software system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Prevention of errors by analysis, debugging or testing of software
    • G06F11/3668 Testing of software
    • G06F11/3672 Test management
    • G06F11/3688 Test management for test execution, e.g. scheduling of test suites
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/566 Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/567 Computer malware detection or handling, e.g. anti-virus arrangements using dedicated hardware
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/865 Monitoring of software

Definitions

  • the present invention generally relates to malware detection. More specifically, the present invention facilitates behavioral analysis of samples of malware.
  • a program having malicious content is known as a malware.
  • the malware poses varying levels of risk to system users. The ramifications of these attacks range from data breaches to business disruptions, reputation damage, financial loss, and sabotage of critical infrastructures.
  • Malware analysis may be broadly classified into static analysis and dynamic analysis. In static analysis, contents of a malware may be examined to extract signatures and detect maliciousness of the malware. However, static signatures can be easily thwarted by techniques such as packing and obfuscation.
  • malware maliciousness is detected using a run-time behavior of the malware.
  • the dynamic analysis adopts an active technique and a passive technique for analysis of the malware.
  • the active technique repeatedly instruments the malware before execution to explore all execution paths in the malware.
  • the passive technique merely executes the malware and observes behavioral trails of the malware.
  • the passive technique of behavioral analysis is immune to the evasive malware.
  • Artificial Intelligence (AI) driven run-time behavioral analysis is generally used in defence against evasive malwares.
  • Data models developed using AI techniques offer a suitable mechanism to detect anomalies. AI techniques require availability of ground-truth of malware behavior. However, collecting a precise representation of real-world malware behavior in a laboratory setting is challenging.
  • malware detection adopts two approaches to address demand for live samples of the malware.
  • analysis done by Anti-virus (AV) engines provides an outcome that includes an inference on the maliciousness of the samples, signatures, and reports obtained from the analysis.
  • signatures and reports are limited by capabilities of the available AV engines.
  • live malware samples are provided to researchers for execution and subsequent analysis is performed.
  • the second approach has multiple limitations.
  • One limitation is that distribution of live samples of a malware is highly vulnerable to accidental execution. Any leakage of the live samples of the malware can lead to potential misuse, warranting policies for ensuring accountability.
  • Another limitation is that services providing the live malware samples are restricted and monopolized by private enterprises, thus incurring an excessive cost for a regular supply of new samples of malwares.
  • Another limitation is that executing malwares and detecting real-world behavior in a laboratory setting is challenging. Researchers may prioritize safety for analysing the malwares.
  • the evasive malware looks for real-world conditions before revealing offensive behavior.
  • the evasive malware can easily identify artifacts of test environments and choose not to execute. Consequently, data collected after execution of the evasive malware in a virtual test environment does not represent offensive behavior.
  • malware execution impacts a system state in an analysis framework.
  • An object of the present invention is to provide a method and a system to facilitate precise and comprehensive behavioral analysis of samples of malwares.
  • Another object of the present invention is to provide a model or framework for safe execution of the samples of the malwares to facilitate behavior-as-a-service.
  • Yet another object of the present invention is to provide the framework for timely and large-scale execution of the samples of the malwares.
  • a method for facilitating behavioral analysis of samples of a malware may comprise receiving one or more samples of malware and one or more conditions for execution of the malware.
  • the one or more samples of malware may be executed on a testbed provided with internet connectivity based on the one or more conditions.
  • the testbed comprises a heterogeneous hardware setup including multiple processing devices of different configurations for providing conducive conditions for malware execution.
  • results of execution of the one or more samples of malware may be collected.
  • the results of execution include run-time activity of the malware observable across network, Operating System (OS), and hardware.
  • the results of execution of the one or more samples of the malware are stored in a repository storing details of a plurality of malwares.
  • the one or more conditions include a software platform on which the malware is to be executed and a time duration (t) for which the execution of the malware is to be observed.
  • one or more processing devices capable of providing the software platform for execution of the one or more samples of the malware are reset to a clean baseline state.
  • the testbed is connected to the internet through a multi-level firewall that allows the malware to communicate with a server associated with the malware, while blocking attacks from permeating outside the testbed.
  • the testbed has a multi-level reset mechanism.
  • a first level of the multi-level reset mechanism is a software based baseline-reset for restoring a physical machine of the heterogeneous hardware setup and a second level of the multi-level reset mechanism is an image-reset for reloading a required OS from an image server.
  • the present invention discloses a system for facilitating behavioral analysis of a malware.
  • the system comprises a testbed, a processing device, and a memory.
  • the testbed comprises a plurality of devices of different configurations connected in a heterogeneous hardware setup for providing conducive conditions for malware execution.
  • Each device of the plurality of devices is configured to execute samples of the malwares.
  • the processing device is configured to receive one or more samples of a malware and one or more conditions for execution of the malware. Further, the processing device selects a device from the plurality of devices and executes the one or more samples of the malware on the device, based on the one or more conditions. Furthermore, the processing device stores results of execution of the one or more samples of the malware.
  • the results of execution include run-time activity of the malware observable across network, Operating System (OS), and hardware.
  • the one or more conditions include a software platform on which the malware is to be executed and a time duration (t) for which the execution of the malware is to be observed.
  • the plurality of devices are off-the-shelf devices.
  • the off-the-shelf devices are one or more of desktop computers, single-board computers, and embedded platforms with different operating systems.
  • a multi-level firewall is installed in the system to manage a connection with internet.
  • the testbed has a multi-level reset mechanism.
  • a first level of the multi-level reset mechanism is a software based baseline-reset for restoring a device of the heterogeneous hardware setup, and a second level of the multi-level reset mechanism is an image-reset for reloading a required OS from an image server.
  • the memory is configured to store details of behavior of a plurality of malwares, and the memory is accessible to a user for retrieving the results of the execution and the details of the behavior of the plurality of malwares.
  • Fig. 1 illustrates a framework for facilitating behavioral analysis of malwares, in accordance with an embodiment of the present invention
  • Fig. 2 illustrates a process flow of interaction with a user interface provided by a front-end of a framework to facilitate behavioral analysis of samples of malwares, in accordance with an embodiment of the present invention
  • Fig. 3 illustrates a process flow of interaction with a back-end of a framework to facilitate behavioral analysis of samples of malwares, in accordance with an embodiment of the present invention
  • Fig. 4 illustrates a process flow of execution of samples of malwares and collection of results of behavioral data, in accordance with an embodiment of the present invention
  • Fig. 5 illustrates a block diagram of a system for facilitating behavioral analysis of malwares, in accordance with an embodiment of the present invention.
  • Fig. 6 illustrates a process flow of a multi-level reset mechanism in a real-world testbed, in accordance with an embodiment of the present invention.
  • Fig. 1 illustrates a framework 100 for facilitating behavioral analysis of malwares.
  • the framework 100 comprises a front-end 102, a dataset corpus 104, and a back-end 106.
  • the front-end 102 may be configured to receive a request from a user and provide results corresponding to the request of the user.
  • the front-end 102 may provide an Application Programming Interface (API) for users to submit program hashes or files and to deliver results corresponding to the program hashes or files, retrieved from the dataset corpus 104.
  • the dataset corpus 104 may be a collection of behavioral data of a plurality of malwares.
  • the front-end 102 submits the request to the back-end 106.
  • the front-end 102 provides the request along with data received from the user.
  • the back-end 106 may comprise a buffer 108 to supply the samples of the malware to a real-world testbed 110.
  • the back-end 106 executes the samples of the malware on the real-world testbed 110 and stores the behavioral data of the malware in the dataset corpus 104.
  • the behavioral data of the malware may be transmitted to the user.
  • the framework may provide simultaneous capture of three artifacts including network, Operating System (OS), and hardware behaviour, for analysis of behavior of malwares. These artifacts may be predominantly used for their effectiveness and decreased overheads in dynamic malware detection.
  • a network trail may capture malware communications
  • an OS trail may present system calls made by the malware
  • a hardware trail may include the micro-architectural events triggered during malware execution.
  • Fig. 2 illustrates a process flow of interaction with a user interface provided by the front-end 102 of the framework 100.
  • the API i.e. GetDataForHash
  • the user input may comprise a hash h of the program.
  • data related to the program hash h requested by the user may be checked in the dataset corpus 104.
  • behavioral data corresponding to the program hash h may be extracted from the dataset corpus 104, at step 208.
  • the front-end extracts the behavioral data from the dataset corpus 104.
  • the behavioral data may be provided to the user through the interface of the front-end 102, at step 210.
  • the API i.e. GetDataForProgram
  • An input for requesting the behavioral data of the program executable may include samples of malware executable p, a platform f (e.g. Linux, Windows, Android, etc.) on which the malware needs to be executed, and optionally, a time duration t for which the execution of the malware is to be observed.
  • a format of the input may be present in a form of (program p, platform f, time t).
  • the back-end 106 may be invoked for execution of the samples of malware.
  • the front-end 102 raises a request to the back-end to execute the samples of malware requested by the user.
  • the back-end 106 executes the samples of malware requested by the user on the platform f for the time duration t.
  • a default time duration of 2 minutes may be set, which is considered to be sufficient to elicit most of the malicious behaviors of the malware.
  • the back-end 106 may execute the samples of malware for the default time duration i.e. 2 minutes.
  • the back-end 106 collects the behavioral data based on the execution of the samples of malware.
  • the behavioral data of the malware may be saved in the dataset corpus 104. Subsequently, the behavioral data of the malware may be provided to the user, at step 214.
  • the user may upload multiple files for collection of the behavioral data.
  • the API i.e. GetDataForFolder
  • the API may be used to submit a folder of malwares, along with the platform f and time t for executing each sample present in the folder.
  • a format of the input may be present in a form of (program folder, platform f, time t).
  • the front-end 102 may invoke the back-end to execute and collect behavioral data of each sample of the folder and return the behavioral data to the user.
  • Fig. 3 illustrates a process flow of interaction with the back-end 106 of a framework to facilitate the behavioral analysis of the samples of malware.
  • the back-end 106 may comprise an update engine 304, the buffer 108, a test engine 302, and the real-world testbed 110.
  • Algorithm 1 as provided below describes working of the back-end 106 of the framework 100.
  • the update engine 304 may periodically search for a newly reported malware in public malware repositories and may download the newly reported malware in the buffer 108 (indicated through steps 3-8 of the Algorithm 1). Further, the test engine 302 may execute samples of the newly reported malware on the real-world testbed 110. The test engine 302 may collect behavioral data of the newly reported malware on artifacts such as network, operating system (OS), and hardware. The test engine 302, by default, may execute and collect the behavioral data for a pre-defined time duration. In one implementation, the default time duration may be 2 minutes, which has proved to be sufficient to obtain most of the malicious behaviors of malware. Further, the behavioral data may be stored in the dataset corpus 104 (indicated through step 9 of the Algorithm 1).
  • the back-end 106 may receive a request to execute samples of a malware from the front-end 102 of the framework 100.
  • the back-end 106 may extract the samples of the malware requested by a user.
  • the samples of the malware may be temporarily stored in the buffer 108 (indicated through step 14 of the Algorithm 1).
  • the buffer 108 may provide the samples of the malware to the test engine 302 for execution of the samples of the malware.
  • the test engine 302 may execute the samples of the malware on the real-world testbed 110 and may collect the behavioral data of the malware. Execution of the samples of the malware and collection of the behavioral data of the malware are described subsequently in detail, with reference to Fig. 4.
  • the samples of the malware may be executed for a time duration specified by the user.
  • the samples of the malware may be executed for the default time duration.
  • the behavioral data may be provided to the front-end 102 of the framework 100 for presenting the behavioral data of the malware to the user.
  • the behavioral data may be stored in the dataset corpus 104 (indicated through step 15 of Algorithm 1).
  • the back-end 106 ensures timely execution of a regular feed of the newly reported malware, for updating the dataset corpus 104.
  • Fig. 4 illustrates a process flow of execution of the samples of the malware and collection of results of the behavioral data.
  • all devices of the real-world testbed 110 may be reset to clean baseline states.
  • an appropriate device of the real-world testbed 110 may be selected as a profiler to execute the samples of the malware.
  • the profiler may be software installed on devices of the real-world testbed 110 on artifacts such as network, operating system (OS), and hardware.
  • the samples of the malware may be provided to the profiler for execution of the samples of the malware.
  • collection of the behavioral data from the real-world testbed 110 may be initiated. For example, a corresponding tool to capture each artifact such as network, OS, and hardware may be started in the profiler.
  • a process-monitoring tool may be started in the profiler to capture OS behavioral data.
  • the samples of the malware may be executed on the profiler for a specific time period.
  • the specific time period may be provided by the user or may be pre-defined.
  • execution of the samples of the malware and collection of results of the execution may be stopped after completion of the specific time period.
  • the test engine 302 may extract the results of the execution and store the results into the dataset corpus 104.
  • Fig. 5 illustrates a block diagram of a system 500 for facilitating behavioral analysis of malwares.
  • the system comprises a processing device 502, a memory 504, and the real-world testbed 110.
  • the processing device 502, the memory 504, and the real-world testbed 110 may be communicatively coupled with each other.
  • the real-world testbed 110 may consist of a plurality of devices of different configurations such as a first device 110a, a second device 110b, and so on through an Nth device 110n. Each device of the plurality of devices may execute samples of malwares.
  • the plurality of devices 110a - 110n may be off-the-shelf devices such as desktop computers, single-board computers, and embedded platforms. Each device may operate using an operating system (OS) such as Linux, Mac, Windows, and others.
  • the real-world testbed 110 may comprise a heterogeneous hardware setup such as Raspberry Pi, Intel x86 Atom,
  • the real-world testbed 110 provides a heterogeneous network of physical machines that may be employed as a profiler to execute malware. Table 1 provides details of the hardware and the OS used in the real-world testbed 110.
  • the processing device 502 may manage the plurality of devices 110a - 110n of the real-world testbed 110. For example, the processing device 502 may control a device of the plurality of devices 110a - 110n to execute the samples of the malware.
  • the memory 504 may store results of execution of the samples of the malware and details of behavior of the samples of the malware.
  • the processing device 502 may receive one or more samples of a malware and one or more conditions for execution of the malware.
  • the one or more samples of the malware may be received from either public malware repositories or from a user.
  • the one or more conditions may include a software platform on which the malware is to be executed and a time duration (t) for which the execution of the malware is to be observed.
  • the time duration (t) may be an optional condition received from the user.
  • the samples of the malware may be executed for a pre-defined time duration.
  • the pre-defined time duration may be defined based on a type of a device on which the samples of the malware are to be executed.
  • the processing device 502 may select a first device 110a from the plurality of devices 110a - 110n based on one or more conditions. For example, the processing device 502 may extract information regarding the software platform on which the malware is to be observed, from the one or more conditions. Further, the processing device 502 may select the first device 110a from the plurality of devices 110a - 110n by mapping the software platform with details of the plurality of devices 110a - 110n mentioned in the table 1.
  • the processing device 502 may store results of execution of the samples in the memory 504.
  • the results of execution may include run-time activity of the malware observable across network, Operating System (OS), and hardware.
  • the memory 504 may be continuously updated by storing behavioral data of newly identified malware observed across network, OS, and hardware.
  • the user may retrieve the behavioral data of the plurality of malwares from the memory 504.
  • the system 500 may use a dedicated network connection for internet connectivity.
  • the dedicated network connection may be managed by a multi-level firewall.
  • the multi-level firewall may allow the malware to communicate with a server for execution of one or more samples of the malware.
  • a two-level firewall may be used for managing the dedicated network connection between the system 500 and Internet.
  • the malware may need to compromise multiple firewalls to cross the system 500.
  • devices present outside the real-world testbed 110 may be protected from the malwares executed on the real-world testbed 110.
  • external malware may need to compromise multiple firewalls to attack the real-world testbed 110.
  • Fig. 6 illustrates a process flow of a multi-level reset mechanism in a real-world testbed.
  • the real-world testbed may be reset to a clean initial state or a baseline state.
  • the real-world testbed may be employed with a multi-level reset mechanism (such as two-level reset mechanism as described with reference to Fig. 6).
  • all devices of the real-world testbed may be reset to the clean initial state or the baseline state, at steps 602 and 604.
  • the samples of the malwares may be executed on the real-world testbed, at step 606.
  • the first level of the multi-level reset mechanism may be a software based baseline-reset for restoring a device of real-world testbed.
  • the first level of the multi-level reset mechanism provides a quick low-overhead baseline-reset by restarting all devices of the real-world testbed using remote commands.
  • if execution of the samples of the malwares makes the real-world testbed inaccessible remotely, the real-world testbed may be reset by using smart power switches. In some cases, the execution of the samples of the malwares may cause critical faults in the real-world testbed such that the real-world testbed may not be reset to the clean initial state.
  • all devices of the real-world testbed may be reset at a second level of the multi-level reset mechanism, at steps 610 and 612.
  • the second level of the multi-level reset mechanism is an image-reset for reloading a required OS from an image server 614.
  • Modern malwares are evasive and look for real-world conditions before revealing offensive behavior, thus remaining dormant in virtualized analysis environments.
  • the evasive malwares can easily identify artifacts of test environments and choose not to execute. Consequently, data collected from execution of the evasive malware does not represent offensive behavior.
  • The present invention proposes a system that provides real-world conditions and Internet connectivity to ensure that malware continues execution beyond the conditional checks for evasion in its code.
  • the present invention also proposes a system that allows simultaneous capture of three artifacts i.e. network, OS, and hardware behavior of a malware.
  • Table 2 describes analysis time taken by the system to execute and collect behavioral data for the samples of the malware.
  • the system proposed by the present invention is compared with a public testbed such as DETER. It is observed from the table 2 that the multi-level reset mechanism used in the present system enables 58.6% faster reloads compared to DETER. The shorter time taken for state resets enables a greater number of sample analyses (255 per day) in the present system as compared to DETER (154 per day).
  • the memory of the system presented in the invention holds 2.7 TB of data and 22M behavioral snapshots of 10,432 samples of the malwares.
  • the memory further includes 7M network packets, 11.3M operating system call traces, and 3.3M micro-architectural events from hardware for 8 classes of malware.
  • Table 3 describes distribution of samples of malware collected in a growing dataset of a memory of the system.
  • the present invention provides a system and a method for safe execution of samples of malwares by providing a real-world testbed for execution of samples of malwares. Further, the system continuously updates behavioral analysis data in a memory associated with the system. Thus, the system provides an unbiased comprehensive view of real-world behavior of the malwares, which enables the researchers to quickly explore and compare detection mechanisms to counter the evolving malware landscape. Furthermore, the system offloads time and efforts of setting up a real-world evaluation infrastructure for comprehensive behavioral data collection, while alleviating high risks involved in handling and executing evasive malwares.
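The front-end interaction described with reference to Fig. 2 (GetDataForHash, GetDataForProgram, GetDataForFolder) can be illustrated with a minimal sketch. It is not the patented implementation: the class name FrontEnd, the execute callable standing in for the back-end 106, the in-memory dictionary standing in for the dataset corpus 104, the use of SHA-256 as the program hash, and seconds as the unit of t are all assumptions made for illustration. Only the three call names, the (program p, platform f, time t) input form, and the 2-minute default come from the text.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

DEFAULT_DURATION_S = 120  # the 2-minute default observation time from the text


@dataclass
class FrontEnd:
    # Hypothetical stand-in for the back-end: (program, platform, seconds) -> behavioral data.
    execute: Callable[[bytes, str, int], dict]
    # Hypothetical stand-in for the dataset corpus: program hash -> behavioral data.
    corpus: Dict[str, dict] = field(default_factory=dict)

    def get_data_for_hash(self, h: str) -> Optional[dict]:
        """GetDataForHash: return stored behavioral data, or None if the hash is unknown."""
        return self.corpus.get(h)

    def get_data_for_program(self, p: bytes, f: str, t: Optional[int] = None) -> dict:
        """GetDataForProgram: run sample p on platform f for t seconds, store and return the data."""
        duration = DEFAULT_DURATION_S if t is None else t
        data = self.execute(p, f, duration)
        # SHA-256 is an assumption; the text does not name a hash function.
        self.corpus[hashlib.sha256(p).hexdigest()] = data
        return data

    def get_data_for_folder(
        self, folder: Dict[str, bytes], f: str, t: Optional[int] = None
    ) -> Dict[str, dict]:
        """GetDataForFolder: apply get_data_for_program to every sample in the folder."""
        return {name: self.get_data_for_program(p, f, t) for name, p in folder.items()}


def fake_back_end(program: bytes, platform: str, duration_s: int) -> dict:
    """Placeholder back-end: executes nothing and returns empty trails for the three artifacts."""
    return {
        "platform": platform,
        "duration_s": duration_s,
        "network": [],
        "os": [],
        "hardware": [],
    }
```

A caller would construct `FrontEnd(execute=fake_back_end)`, submit a sample with `get_data_for_program(sample_bytes, "Linux")`, and later retrieve the same record by hash through `get_data_for_hash`.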

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • Quality & Reliability (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Virology (AREA)
  • Mathematical Physics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The present invention discloses a system (500) and a method of facilitating behavioral analysis of malwares. The system (500) comprises a testbed (110), and a processing device (502). The testbed (110) comprises a plurality of devices (110a – 110n) of different configurations. Each device of the plurality of devices (110a – 110n) is configured to execute samples of the malwares. The processing device (502) is configured to receive one or more samples of a malware and one or more conditions for execution of the malware. The processing device (502) selects a device (110a) from the plurality of devices (110a – 110n) and executes the one or more samples of the malware on the device (110a), based on the one or more conditions. The processing device (502) stores results of execution of the one or more samples of the malware, including run-time activity of the malware observable across network, Operating System (OS), and hardware.
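The device-selection step summarized in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: the names Device, Conditions, select_device, and run_sample are hypothetical, the case-insensitive platform match is an assumed mapping rule, the example device list is invented, and no sample is actually executed (the result record only carries empty placeholders for the network, OS, and hardware trails).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Device:
    name: str      # hypothetical identifier for a testbed device, e.g. "110a"
    platform: str  # software platform the device provides, e.g. "Linux"


@dataclass(frozen=True)
class Conditions:
    platform: str                     # platform on which the sample is to be executed
    duration_s: Optional[int] = None  # optional observation time, in seconds (assumed unit)


def select_device(devices: List[Device], conditions: Conditions) -> Optional[Device]:
    """Return the first device whose platform matches the requested one, else None."""
    wanted = conditions.platform.lower()
    for device in devices:
        if device.platform.lower() == wanted:
            return device
    return None


def run_sample(
    sample_id: str,
    devices: List[Device],
    conditions: Conditions,
    store: Dict[str, dict],
) -> dict:
    """Select a device, record a placeholder result, and store it under the sample id."""
    device = select_device(devices, conditions)
    if device is None:
        raise LookupError(f"no device provides platform {conditions.platform!r}")
    # Placeholder only: a real system would run the sample and fill in these trails.
    result = {"device": device.name, "network": [], "os": [], "hardware": []}
    store[sample_id] = result
    return result
```

Keeping selection separate from execution means the mapping rule can be replaced (for example, to also match on hardware type) without touching how results are stored.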

Description

SYSTEM AND METHOD FOR FACILITATING BEHAVIORAL ANALYSIS OF MALWARES
FIELD OF INVENTION
The present invention generally relates to malware detection. More specifically, the present invention facilitates behavioral analysis of samples of malware.
BACKGROUND
A program having malicious content is known as a malware. The malware poses varying levels of risk to system users. The ramifications of these attacks range from data breaches to business disruptions, reputation damage, financial loss, and sabotage of critical infrastructures. Malware analysis may be broadly classified into static analysis and dynamic analysis. In static analysis, contents of a malware may be examined to extract signatures and detect maliciousness of the malware. However, static signatures can be easily thwarted by techniques such as packing and obfuscation.
In dynamic analysis, maliciousness is detected using the run-time behavior of the malware. Dynamic analysis adopts either an active technique or a passive technique for analysis of the malware. The active technique repeatedly instruments the malware before execution to explore all execution paths in the malware. However, the instrumentation done by the active technique may be detected by some evasive malware, which may then choose not to execute. The passive technique merely executes the malware and observes behavioral trails of the malware. Thus, the passive technique of behavioral analysis is immune to the evasive malware. Artificial Intelligence (AI) driven run-time behavioral analysis is generally used in defence against evasive malwares. Data models developed using AI techniques offer a suitable mechanism to detect anomalies. AI techniques require availability of ground-truth of malware behavior. However, collecting a precise representation of real-world malware behavior in a laboratory setting is challenging.
Currently, research in malware detection adopts two approaches to address the demand for live samples of the malware. In a first approach, analysis done by Anti-virus (AV) engines provides an outcome that includes an inference on maliciousness of the samples, signatures, and reports obtained from the analysis. However, the signatures and reports are limited by capabilities of the available AV engines. In a second approach, live malware samples are provided to researchers for execution and subsequent analysis. However, the second approach has multiple limitations. One limitation is that distribution of live samples of a malware is highly vulnerable to accidental execution. Any leakage of the live samples of the malware can lead to potential misuse, warranting policies for ensuring accountability. Another limitation is that services for providing the live malware samples are restricted and monopolized by private enterprises, thus incurring an excessive cost for a regular supply of new samples of malwares. Another limitation is that executing malwares and detecting their real-world behavior in a laboratory setting is challenging. Researchers may prioritize safety when analysing the malwares. However, the evasive malware looks for real-world conditions before revealing offensive behavior. Thus, the evasive malware can easily identify artifacts of test environments and choose not to execute. Consequently, data collected after execution of the evasive malware in a virtual test environment does not represent offensive behavior.
Further, a large-scale evaluation of malwares is also challenging. Specifically, malware execution impacts a system state in an analysis framework.
Therefore, there is a need for an efficient method for safe and timely execution and behavioral analysis of samples of malwares.
OBJECTS OF THE INVENTION
An object of the present invention is to provide a method and a system to facilitate precise and comprehensive behavioral analysis of samples of malwares.
Another object of the present invention is to provide a model or framework for safe execution of the samples of the malwares to facilitate behavior-as-a-service.
Another object of the present invention is to provide a model that enables a user to submit hashes of the samples of the malware for analysis and retrieve real-world run-time behavioral trails by execution of the samples of the malwares.
Still another object of the present invention is to provide a testbed ensuring close-to-real-world configuration with network and internet connectivity for execution of the samples of the malwares.
Yet another object of the present invention is to provide the framework for timely and large-scale execution of the samples of the malwares.
SUMMARY OF THE INVENTION
The summary is provided to introduce aspects related to a system and method for safely and precisely facilitating behavioral analysis of samples of malwares, and the aspects are further described below in the detailed description. This summary is not intended to identify essential features of the claimed subject matter nor is it intended for use in determining or limiting the scope of the claimed subject matter.
In an embodiment, a method for facilitating behavioral analysis of samples of a malware may comprise receiving one or more samples of malware and one or more conditions for execution of the malware. The one or more samples of malware may be executed on a testbed provided with internet connectivity based on the one or more conditions. The testbed comprises a heterogeneous hardware setup including multiple processing devices of different configurations for providing conducive conditions for malware execution. Further, results of execution of the one or more samples of malware may be collected. The results of execution include run-time activity of the malware observable across network, Operating System (OS), and hardware. The results of execution of the one or more samples of the malware are stored in a repository storing details of a plurality of malwares.
In one aspect, the one or more conditions include a software platform on which the malware is to be executed and a time duration (t) for which the execution of the malware is to be observed. Before executing the one or more samples of the malware, one or more processing devices capable of providing the software platform for execution of the one or more samples of the malware are reset to a clean baseline state. The testbed is connected to the internet through a multi-level firewall that allows the malware to communicate with a server associated with the malware, while blocking attacks from permeating outside the testbed. The testbed has a multi-level reset mechanism. A first level of the multi-level reset mechanism is a software based baseline-reset for restoring a physical machine of the heterogeneous hardware setup and a second level of the multi-level reset mechanism is an image-reset for reloading a required OS from an image server.
In another aspect, the present invention discloses a system for facilitating behavioral analysis of a malware. The system comprises a testbed, a processing device, and a memory. The testbed comprises a plurality of devices of different configurations connected in a heterogeneous hardware setup for providing conducive conditions for malware execution. Each device of the plurality of devices is configured to execute samples of the malwares. The processing device is configured to receive one or more samples of a malware and one or more conditions for execution of the malware. Further, the processing device selects a device from the plurality of devices and executes the one or more samples of the malware on the device, based on the one or more conditions. Furthermore, the processing device stores results of execution of the one or more samples of the malware. The results of execution include run-time activity of the malware observable across network, Operating System (OS), and hardware.
In one aspect, the one or more conditions include a software platform on which the malware is to be executed and a time duration (t) for which the execution of the malware is to be observed.
The plurality of devices are off-the-shelf devices. The off-the-shelf devices are one or more of desktop computers, single-board computers, and embedded platforms with different operating systems. A multi-level firewall is installed in the system to manage a connection with the internet.
The testbed has a multi-level reset mechanism. A first level of the multi-level reset mechanism is a software based baseline-reset for restoring a device of the heterogeneous hardware setup, and a second level of the multi-level reset mechanism is an image-reset for reloading a required OS from an image server. The memory is configured to store details of behavior of a plurality of malwares, and the memory is accessible to a user for retrieving the results of the execution and the details of the behavior of the plurality of malwares.
Other aspects and advantages of the invention will become apparent from the following description, taken in conjunction with the accompanying drawings, illustrating by way of example the principles of the invention.
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings constitute a part of the description and are used to provide further understanding of the present invention. Such accompanying drawings illustrate the embodiments of the present invention which are used to describe the principles of the present invention. The embodiments are illustrated by way of example and not by way of limitation in the figures of the accompanying drawings in which like references indicate similar elements. It should be noted that references to “an” or “one” embodiment in this invention are not necessarily made to the same embodiment, and they mean at least one. In the drawings:
Fig. 1 illustrates a framework for facilitating behavioral analysis of malwares, in accordance with an embodiment of the present invention;
Fig. 2 illustrates a process flow of interaction with a user interface provided by a front-end of a framework to facilitate behavioral analysis of samples of malwares, in accordance with an embodiment of the present invention;
Fig. 3 illustrates a process flow of interaction with a back-end of a framework to facilitate behavioral analysis of samples of malwares, in accordance with an embodiment of the present invention;
Fig. 4 illustrates a process flow of execution of samples of malwares and collection of results of behavioral data, in accordance with an embodiment of the present invention;
Fig. 5 illustrates a block diagram of a system for facilitating behavioral analysis of malwares, in accordance with an embodiment of the present invention; and
Fig. 6 illustrates a process flow of a multi-level reset mechanism in a real-world testbed, in accordance with an embodiment of the present invention.
DETAILED DESCRIPTION OF THE INVENTION
The detailed description set forth below in connection with the appended drawings is intended as a description of various embodiments of the present invention and is not intended to represent the only embodiments in which the present invention may be practiced. Each embodiment described in this invention is provided merely as an example or illustration of the present invention, and should not necessarily be construed as preferred or advantageous over other embodiments. The detailed description includes specific details for the purpose of providing a thorough understanding of the present invention. However, it will be apparent to those skilled in the art that the present invention may be practiced without these specific details.
The present invention proposes a system and a method for safely and precisely facilitating behavioral analysis of malwares. Fig. 1 illustrates a framework 100 for facilitating behavioral analysis of malwares. The framework 100 comprises a front-end 102, a dataset corpus 104, and a back-end 106. The front-end 102 may be configured to receive a request from a user and provide results corresponding to the request of the user. For example, the front-end 102 may provide an Application Programming Interface (API) for users to submit program hashes or files and to deliver results corresponding to the program hashes or files, retrieved from the dataset corpus 104. The dataset corpus 104 may be a collection of behavioral data of a plurality of malwares. In a case where the behavioral data corresponding to the program hashes or files requested by the user is not present in the dataset corpus 104, the front-end 102 submits the request to the back-end 106. For example, the front-end 102 provides the request along with data received from the user. The back-end 106 may comprise a buffer 108 to supply the samples of the malware to a real-world testbed 110. The back-end 106 executes the samples of the malware on the real-world testbed 110 and stores the behavioral data of the malware in the dataset corpus 104. In one implementation, the behavioral data of the malware may be transmitted to the user.
In one implementation, the framework may provide simultaneous capture of three artifacts including network, Operating System (OS), and hardware behaviour, for analysis of behavior of malwares. These artifacts may be predominantly used for their effectiveness and decreased overheads in dynamic malware detection. A network trail may capture malware communications, an OS trail may present system calls made by the malware, and a hardware trail may include the micro-architectural events triggered during malware execution.
Fig. 2 illustrates a process flow of interaction with a user interface provided by the front-end 102 of the framework 100. In an aspect, the API (i.e. GetDataForHash) may be used to submit a program hash and a request for obtaining corresponding behavioral data. In such a case, the user input may comprise a hash h of the program. At step 202, data related to the program hash h requested by the user may be checked in the dataset corpus 104. At step 204, it is determined whether the data related to the program hash h is present in the dataset corpus 104 or not. If the data is not found to be present in the dataset corpus 104, an error message may be provided to the user, at step 206. Alternatively, when the data is found to be present in the dataset corpus 104, behavioral data corresponding to the program hash h may be extracted from the dataset corpus 104, at step 208. For example, when it is determined that the behavioral data corresponding to the program hash h is present in the dataset corpus 104, the front-end 102 extracts the behavioral data from the dataset corpus 104. Successively, the behavioral data may be provided to the user through the interface of the front-end 102, at step 210.
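The hash lookup of steps 202 to 210 may be sketched as follows. This is only an illustrative sketch: the in-memory dictionary standing in for the dataset corpus 104, the function name get_data_for_hash, and the exception type HashNotFoundError are assumptions made for the example and are not defined by the specification.

```python
from typing import Dict, List

# Hypothetical in-memory stand-in for the dataset corpus 104:
# maps a program hash to its stored behavioral records.
DatasetCorpus = Dict[str, List[dict]]


class HashNotFoundError(KeyError):
    """Signals the error message of step 206 (no data for the requested hash)."""


def get_data_for_hash(corpus: DatasetCorpus, h: str) -> List[dict]:
    """Return the behavioral data stored for program hash h."""
    # Steps 202-204: check whether data for hash h is present in the corpus.
    if h not in corpus:
        # Step 206: report an error to the user.
        raise HashNotFoundError(f"no behavioral data for hash {h}")
    # Steps 208-210: extract the behavioral data and hand it back to the user.
    return corpus[h]
```

A caller would catch HashNotFoundError and present the error message to the user, or, as described with reference to Fig. 1, forward the request to the back-end 106.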
In another aspect, the API (i.e. GetDataForProgram) may be used to submit a program executable and request corresponding behavioral data. An input for requesting the behavioral data of the program executable may include samples of malware executable p, a platform f (e.g. Linux, Windows, Android, etc.) on which the malware needs to be executed, and optionally, a time duration t for which the execution of the malware is to be observed. A format of the input may be in the form of (program p, platform f, time t). At step 212, the back-end 106 may be invoked for execution of the samples of malware. For example, after receiving the input from the user, the front-end 102 raises a request to the back-end 106 to execute the samples of malware requested by the user. The back-end 106 executes the samples of malware requested by the user on the platform f for the time duration t. In one implementation, a default time duration of 2 minutes may be set, which is considered to be sufficient to elicit most of the malicious behaviors of the malware. Thus, if the time duration t is not defined by the user, the back-end 106 may execute the samples of malware for the default time duration i.e. 2 minutes. The back-end 106 collects the behavioral data based on the execution of the samples of malware. The behavioral data of the malware may be saved in the dataset corpus 104. Successively, the behavioral data of the malware may be provided to the user, at step 214.
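The handling of the input (program p, platform f, time t), including the fallback to the default duration, may be sketched as below. The names ProgramRequest, get_data_for_program, and run_on_backend are hypothetical, and keying the corpus by the SHA-256 hash of the executable is an assumption made for the example; the specification does not state which hash function is used.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict, Optional

DEFAULT_DURATION_S = 120  # default observation window of 2 minutes


@dataclass
class ProgramRequest:
    program: bytes                    # malware executable p
    platform: str                     # platform f, e.g. "Linux", "Windows", "Android"
    duration_s: Optional[int] = None  # time duration t (optional)


def get_data_for_program(
    request: ProgramRequest,
    run_on_backend: Callable[[bytes, str, int], dict],  # stand-in for back-end 106
    corpus: Dict[str, dict],                            # stand-in for corpus 104
) -> dict:
    """Execute the sample via the back-end, save the data, and return it."""
    # Fall back to the default duration when the user did not define t.
    duration = (
        request.duration_s if request.duration_s is not None else DEFAULT_DURATION_S
    )
    # Step 212: invoke the back-end to execute the sample and collect data.
    data = run_on_backend(request.program, request.platform, duration)
    # Save the behavioral data in the corpus (SHA-256 key is an assumption).
    corpus[hashlib.sha256(request.program).hexdigest()] = data
    # Step 214: provide the behavioral data to the user.
    return data
```

Passing the back-end as a callable keeps the sketch independent of how the samples are actually executed on the testbed.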
In one aspect, the user may upload multiple files for collection of the behavioral data. In such a scenario, the API (i.e. GetDataForFolder) may be used to submit a folder of malwares, along with the platform f and time t for executing each sample present in the folder. A format of the input may be in the form of (program folder, platform f, time t). The front-end 102 may invoke the back-end 106 to execute and collect behavioral data of each sample of the folder and return the behavioral data to the user.
Fig. 3 illustrates a process flow of interaction with the back-end 106 of a framework to facilitate the behavioral analysis of the samples of malware. The back-end 106 may comprise an update engine 304, the buffer 108, a test engine 302, and the real-world testbed 110. Algorithm 1 as provided below describes working of the back-end 106 of the framework 100.
1. begin
2.   while true do
       /* Update Engine */
3.     Crawl online repositories for newly reported samples
4.     if updates are available then
5.       NewHashList ← Hashes of newly reported malware samples
6.       for h ∈ NewHashList do
7.         p ← Download hash h
8.         Supply-of-Samples ← p
           /* Test Engine */
9.         Data-Corpus ← Execute Collect (p)
10.    Check for requests from front-end
11.    if requests queued from front-end then
12.      ListOfPrograms ← List of programs submitted by user
13.      for p ∈ ListOfPrograms do
14.        Supply-of-Samples ← p
           /* Test Engine */
15.        Data-Corpus ← Execute Collect (p)
Algorithm 1
As evident from Algorithm 1, the update engine 304 may periodically search for newly reported malware in public malware repositories and may download the newly reported malware into the buffer 108 (indicated through steps 3-8 of Algorithm 1). Further, the test engine 302 may execute samples of the newly reported malware on the real-world testbed 110. The test engine 302 may collect behavioral data of the newly reported malware on artifacts such as network, operating system (OS), and hardware. The test engine 302, by default, may execute and collect the behavioral data for a pre-defined time duration. In one implementation, the default time duration may be 2 minutes, which is considered sufficient to obtain most of the malicious behaviors of malware. Further, the behavioral data may be stored in the dataset corpus 104 (indicated through step 9 of Algorithm 1).
In an implementation, the back-end 106 may receive a request to execute samples of a malware from the front-end 102 of the framework 100. The back-end 106 may extract the samples of the malware requested by a user. The samples of the malware may be temporarily stored in the buffer 108 (indicated through step 14 of Algorithm 1). The buffer 108 may provide the samples of the malware to the test engine 302 for execution of the samples of the malware. The test engine 302 may execute the samples of the malware on the real-world testbed 110 and may collect the behavioral data of the malware. Execution of the samples of the malware and collection of the behavioral data of the malware is described successively in detail, with reference to Fig. 4. The samples of the malware may be executed for a time duration specified by the user. When the time duration is not specified by the user, the samples of the malware may be executed for the default time duration. Further, the behavioral data may be provided to the front-end 102 of the framework 100 for presenting the behavioral data of the malware to the user. In one implementation, the behavioral data may be stored in the dataset corpus 104 (indicated through step 15 of Algorithm 1). Thus, the back-end 106 ensures timely execution of a regular feed of the newly reported malware, for updating the dataset corpus 104.
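One pass through the loop body of Algorithm 1 may be sketched as follows. The function backend_iteration and all of its parameters are hypothetical stand-ins introduced for illustration: the crawler, the downloader, and the test engine are passed in as callables, and a list stands in for the dataset corpus 104.

```python
from collections import deque
from typing import Callable, Deque, Iterable, List


def backend_iteration(
    fetch_new_hashes: Callable[[], Iterable[str]],  # update engine: crawl repositories
    download: Callable[[str], bytes],               # download the sample for hash h
    frontend_queue: Deque[bytes],                   # programs queued by the front-end
    execute_collect: Callable[[bytes], dict],       # test engine: execute and collect
    corpus: List[dict],                             # stand-in for dataset corpus 104
) -> int:
    """Run one pass of the loop body of Algorithm 1; return samples processed."""
    buffer: Deque[bytes] = deque()  # stand-in for buffer 108 (Supply-of-Samples)
    processed = 0

    # Steps 3-9: newly reported samples found by the update engine.
    for h in fetch_new_hashes():
        buffer.append(download(h))                        # steps 7-8
        corpus.append(execute_collect(buffer.popleft()))  # step 9
        processed += 1

    # Steps 10-15: programs submitted by users through the front-end.
    while frontend_queue:
        buffer.append(frontend_queue.popleft())           # step 14
        corpus.append(execute_collect(buffer.popleft()))  # step 15
        processed += 1

    return processed
```

The outer "while true" loop of Algorithm 1 would simply call this function repeatedly; it is left out so that the sketch terminates.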
Fig. 4 illustrates a process flow of execution of the samples of the malware and collection of results of the behavioral data. At step 402, all devices of the real-world testbed 110 may be reset to clean baseline states. At step 404, an appropriate device of the real-world testbed 110 may be selected as a profiler to execute the samples of the malware. The profiler may be software installed on devices of the real-world testbed 110 to collect behavioral data on artifacts such as network, operating system (OS), and hardware. Further, the samples of the malware may be provided to the profiler for execution of the samples of the malware. At step 406, collection of the behavioral data from the real-world testbed 110 may be initiated. For example, a corresponding tool to capture each artifact such as network, OS, and hardware may be started in the profiler. In an implementation, a process-monitoring tool may be started in the profiler to capture OS behavioral data. At step 408, the samples of the malware may be executed on the profiler for a specific time period. The specific time period may be provided by the user or may be pre-defined. At step 410, execution of the samples of the malware and collection of results of the execution may be stopped after completion of the specific time period. At step 412, the test engine 302 may extract the results of the execution and store the results into the dataset corpus 104.
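The sequence of steps 402 to 412 may be sketched as an orchestration routine. The Profiler interface and its method names below are hypothetical; the specification does not define a programming interface for the testbed devices, so every method is an assumption made to keep the example self-contained.

```python
import time
from typing import Callable, Dict, List, Protocol, Sequence

ARTIFACTS = ("network", "os", "hardware")


class Profiler(Protocol):
    """Hypothetical control interface of one testbed device."""

    def reset_to_baseline(self) -> None: ...
    def start_capture(self, artifact: str) -> None: ...
    def run_sample(self, sample: bytes) -> None: ...
    def stop(self) -> None: ...
    def fetch_results(self) -> Dict[str, list]: ...


def execute_and_collect(
    devices: Sequence[Profiler],
    select: Callable[[Sequence[Profiler]], Profiler],
    sample: bytes,
    duration_s: float,
    corpus: List[Dict[str, list]],
    wait: Callable[[float], None] = time.sleep,
) -> Dict[str, list]:
    for device in devices:            # step 402: reset all devices
        device.reset_to_baseline()
    profiler = select(devices)        # step 404: pick the profiler device
    for artifact in ARTIFACTS:        # step 406: start one capture tool per artifact
        profiler.start_capture(artifact)
    profiler.run_sample(sample)       # step 408: execute the sample
    wait(duration_s)                  # observe for the specific time period
    profiler.stop()                   # step 410: stop execution and collection
    results = profiler.fetch_results()  # step 412: extract the results
    corpus.append(results)              # and store them in the corpus
    return results
```

The wait function is injectable so that the routine can be exercised without actually sleeping for the observation window.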
Fig. 5 illustrates a block diagram of a system 500 for facilitating behavioral analysis of malwares. The system comprises a processing device 502, a memory 504, and the real- world testbed 110. The processing device 502, the memory 504, and the real-world testbed 110 may be communicatively coupled with each other.
The real-world testbed 110 may consist of a plurality of devices of different configurations such as a first device 110a, a second device 110b, and so on till an Nth device 110n. Each device of the plurality of devices may execute samples of malwares. The plurality of devices 110a - 110n may be off-the-shelf devices such as desktop computers, single-board computers, and embedded platforms. Each device may operate using an operating system (OS) such as Linux, Mac, Windows, and others. In an implementation, the real-world testbed 110 may comprise a heterogeneous hardware setup such as Raspberry Pi, Intel x86 Atom, Quark, i5, and i7 machines configured with different OS. The heterogeneous hardware setup ensures real-world conditions for execution of the samples of the malwares. Some evasive malwares may search for real-world conditions before revealing malicious behavior. In such cases, the real-world testbed 110 provides a heterogeneous network of physical machines that may be employed as a profiler to execute malware. Table 1 provides details of the hardware and the OS used in the real-world testbed 110.
[Table 1: hardware and OS used in the real-world testbed 110 (table images not reproduced)]
The processing device 502 may manage the plurality of devices 110a - 110n of the real-world testbed 110. For example, the processing device 502 may control a device of the plurality of devices 110a - 110n to execute the samples of the malware. The memory 504 may store results of execution of the samples of the malware and details of behavior of the samples of the malware.
In operation, the processing device 502 may receive one or more samples of a malware and one or more conditions for execution of the malware. The one or more samples of the malware may be received from either public malware repositories or from a user. The one or more conditions may include a software platform on which the malware is to be executed and a time duration (t) for which the execution of the malware is to be observed. The time duration (t) may be an optional condition received from the user. In case the time duration is not provided by the user, the samples of the malware may be executed for a pre-defined time duration. The pre-defined time duration may be defined based on a type of a device on which the samples of the malware are to be executed.
The processing device 502 may select a first device 110a from the plurality of devices 110a - 110n based on the one or more conditions. For example, the processing device 502 may extract information regarding the software platform on which the malware is to be observed, from the one or more conditions. Further, the processing device 502 may select the first device 110a from the plurality of devices 110a - 110n by mapping the software platform with details of the plurality of devices 110a - 110n mentioned in Table 1.
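The mapping of a requested software platform to a testbed device may be sketched as a lookup over a device inventory. The Device record, its fields, and the busy flag are assumptions made for illustration, and the inventory entries used with this function would be hypothetical placeholders, since the contents of Table 1 are not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Device:
    device_id: str      # e.g. "110a"
    hardware: str       # e.g. "Raspberry Pi" (placeholder value)
    os_name: str        # OS the device is configured with
    busy: bool = False  # assumption: devices already running a sample are skipped


def select_device(devices: Sequence[Device], platform: str) -> Optional[Device]:
    """Return the first idle device whose OS matches the requested platform."""
    wanted = platform.strip().lower()
    for device in devices:
        if device.os_name.lower() == wanted and not device.busy:
            return device
    # No matching idle device: the caller may queue the sample or report an error.
    return None
```

Returning None rather than raising leaves the policy for unmatched platforms to the caller, which the specification does not prescribe.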
The processing device 502 may store results of execution of the samples in the memory 504. The results of execution may include run-time activity of the malware observable across network, Operating System (OS), and hardware. Thus, the memory 504 may be continuously updated by storing behavioral data of newly identified malware observed across network, OS, and hardware. The user may retrieve the behavioral data of the plurality of malwares from the memory 504.
The system 500 may use a dedicated network connection for internet connectivity. The dedicated network connection may be managed by a multi-level firewall. The multi-level firewall may allow the malware to communicate with a server for execution of one or more samples of the malware. In one implementation, a two-level firewall may be used for managing the dedicated network connection between the system 500 and the Internet. The malware may need to compromise multiple firewalls to cross beyond the system 500. Thus, devices present outside the real-world testbed 110 may be protected from the malwares executed on the real-world testbed 110. Similarly, external malware may need to compromise multiple firewalls to attack the real-world testbed 110.
Fig. 6 illustrates a process flow of a multi-level reset mechanism in a real-world testbed. For execution of each sample of malwares, the real-world testbed may be reset to a clean initial state or a baseline state. In one implementation, the real-world testbed may employ a multi-level reset mechanism (such as the two-level reset mechanism described with reference to Fig. 6). Whenever samples of a malware are received by the real-world testbed, all devices of the real-world testbed may be reset to the clean initial state or the baseline state, at steps 602 and 604. After resetting the devices of the real-world testbed, the samples of the malwares may be executed on the real-world testbed, at step 606. After execution, all the devices of the real-world testbed may be reset at a first level of the multi-level reset mechanism, at step 608. The first level of the multi-level reset mechanism may be a software based baseline-reset for restoring a device of the real-world testbed. The first level of the multi-level reset mechanism provides a quick low-overhead baseline-reset by restarting all devices of the real-world testbed using remote commands. When execution of the samples of the malwares makes the real-world testbed inaccessible remotely, the real-world testbed may be reset by using smart power switches. In some cases, the execution of the samples of the malwares may cause critical faults in the real-world testbed such that the real-world testbed cannot be reset to the clean initial state. In such cases, all devices of the real-world testbed may be reset at a second level of the multi-level reset mechanism, at steps 610 and 612. The second level of the multi-level reset mechanism is an image-reset for reloading a required OS from an image server 614.
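The escalation between the reset levels may be sketched as below. The four callables (remote_restart, power_cycle, reload_image, is_clean) are hypothetical hooks standing in for the remote commands, the smart power switches, the image server 614, and a baseline check; the specification does not say how a clean baseline state is verified, so is_clean is purely an assumption.

```python
from enum import Enum
from typing import Callable


class ResetLevel(Enum):
    REMOTE_RESTART = "level 1: remote command"
    POWER_CYCLE = "level 1: smart power switch"
    IMAGE_RELOAD = "level 2: image-reset"


def reset_device(
    remote_restart: Callable[[], bool],  # returns False if the device is unreachable
    power_cycle: Callable[[], None],     # toggles the smart power switch
    reload_image: Callable[[], None],    # reloads the required OS from the image server
    is_clean: Callable[[], bool],        # hypothetical baseline-state check
) -> ResetLevel:
    """Reset one device, escalating only as far as needed."""
    # First level: quick, low-overhead restart using a remote command.
    level = ResetLevel.REMOTE_RESTART
    if not remote_restart():
        # Device is inaccessible remotely: fall back to the smart power switch.
        power_cycle()
        level = ResetLevel.POWER_CYCLE
    if is_clean():
        return level
    # Second level: critical fault, so reload the OS image.
    reload_image()
    if not is_clean():
        raise RuntimeError("device could not be restored to a clean baseline state")
    return ResetLevel.IMAGE_RELOAD
```

Trying the cheaper level first reflects the stated aim of a quick low-overhead baseline-reset, with the image reload reserved for faults the first level cannot clear.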
Modern malwares are evasive and look for real-world conditions before revealing offensive behavior, thus remaining dormant in virtualized analysis environments. Thus, the evasive malwares can easily identify artifacts of test environments and choose not to execute. Consequently, data collected from execution of the evasive malware does not represent offensive behavior. The present invention proposes a system that provides real-world conditions and Internet connectivity to ensure that malware continues execution beyond the conditional checks for evasion in its code. The present invention also proposes a system that allows simultaneous capture of three artifacts i.e. network, OS, and hardware behavior of a malware.
Table 2 describes analysis time taken by the system to execute and collect behavioral data for the samples of the malware.
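[Table 2: analysis time taken by the system to execute and collect behavioral data, compared with DETER (table image not reproduced)]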
As shown in Table 2, the system proposed by the present invention is compared with a public testbed such as DETER. It is observed from Table 2 that the multi-level reset mechanism used in the present system enables 58.6% faster reloads compared to DETER. The shorter time taken for state resets enables a larger number of sample analyses per day (255) in the present system as compared to DETER (154).
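The two daily throughput figures can be related with simple arithmetic, sketched below. The per-sample time budget assumes that samples are analysed one after another, around the clock, on a single pipeline; that scheduling assumption is made only for this example and is not stated in the specification.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def throughput_summary(present_per_day: int, baseline_per_day: int) -> dict:
    """Compare two daily analysis throughputs."""
    return {
        # How many times more samples per day the first system analyses.
        "ratio": present_per_day / baseline_per_day,
        # Average wall-clock seconds available per sample, assuming strictly
        # sequential, continuous operation (an assumption of this sketch).
        "present_s_per_sample": SECONDS_PER_DAY / present_per_day,
        "baseline_s_per_sample": SECONDS_PER_DAY / baseline_per_day,
    }


summary = throughput_summary(255, 154)
```

With the reported figures, the ratio is roughly 1.66, that is, about 66% more samples analysed per day than with DETER.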
In one implementation, the memory of the system presented in the invention holds 2.7 TB of data and 22M behavioral snapshots of 10,432 samples of the malwares. The memory further includes 7M network packets, 11.3M operating system call traces, and 3.3M micro-architectural events from hardware for 8 classes of malware.
Table 3 describes distribution of samples of malware collected in a growing dataset of a memory of the system.
[Table 3: distribution of samples of malware collected in the growing dataset (table image not reproduced)]
The present invention provides a system and a method for safe execution of samples of malwares by providing a real-world testbed for execution of samples of malwares. Further, the system continuously updates behavioral analysis data in a memory associated with the system. Thus, the system provides an unbiased comprehensive view of real-world behavior of the malwares, which enables the researchers to quickly explore and compare detection mechanisms to counter the evolving malware landscape. Furthermore, the system offloads time and efforts of setting up a real-world evaluation infrastructure for comprehensive behavioral data collection, while alleviating high risks involved in handling and executing evasive malwares.
The terms “or” and “and/or” as used herein are to be interpreted as inclusive or meaning any one or any combination. Therefore, “A, B or C” or “A, B and/or C” mean “any of the following: A; B; C; A and B; A and C; B and C; A, B and C.” An exception to this definition will occur only when a combination of elements, functions, steps or acts are in some way inherently mutually exclusive.
Any combination of the above features and functionalities may be used in accordance with one or more embodiments. In the foregoing specification, embodiments have been described with reference to numerous specific details that may vary from implementation to implementation. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. The sole and exclusive indicator of the scope of the invention, and what is intended by the applicants to be the scope of the invention, is the literal and equivalent scope of the set of claims that issue from this application, in the specific form in which such claims issue, including any subsequent correction.

Claims

CLAIMS:
1. A method for facilitating behavioral analysis of a malware, the method comprising: receiving one or more samples of the malware and one or more conditions for execution of the malware; executing, based on the one or more conditions, the one or more samples of the malware on a real-world testbed (110) provided with internet connectivity, wherein the testbed comprises a heterogeneous hardware setup including multiple processing devices (110a - 110n) of different configurations for providing conducive conditions for malware execution; and collecting results of execution of the one or more samples of the malware, wherein the results of execution include precise and holistic run-time activity of the malware observable across network, Operating System (OS), and hardware, providing an unbiased comprehensive view of real-world malware behavior, enabling researchers to quickly explore and compare detection mechanisms to fast-track malware research.
2. The method as claimed in claim 1, wherein the one or more conditions include a software platform on which the malware is to be executed and a time duration (t) for which the execution of the malware is to be observed.
3. The method as claimed in claim 1, wherein before executing the one or more samples of the malware, one or more processing devices (110a - 110n) capable of providing a real-world software and hardware platform for execution of the one or more samples of the malware are reset to a clean baseline state.
4. The method as claimed in claim 1, wherein the testbed (110) is connected to the internet through a multi-level firewall that allows the malware to communicate with the remote command and control servers associated with the malware, similar to real-world scenarios, while blocking the damaging effects of the attacks, if any, from permeating outside the testbed (110).
5. The method as claimed in claim 1, wherein the testbed (110) has a multi-level reset mechanism.
6. The method as claimed in claim 5, wherein a first level of the multi-level reset mechanism is a software-based baseline-reset for restoring a physical machine of the heterogeneous hardware setup to its clean baseline state in a faster manner, and a second level of the multi-level reset mechanism is an image-reset for reloading a required OS from an image server, which ensures a clean baseline state in scenarios where the malware may impair the machine from booting.
7. The method as claimed in claim 1, wherein the results of execution of the one or more samples of the malware are stored in a repository (504) storing details of a plurality of malwares.
8. A system (500) for facilitating behavioral analysis of a malware, the system (500) comprising: a testbed (110) comprising a plurality of devices (110a - 110n) of different configurations connected in a heterogeneous hardware setup for providing conducive conditions for malware execution, wherein each device of the plurality of devices (110a - 110n) is configured to execute samples of the malware; characterized in that a processing device (502) is configured to: receive one or more samples of a malware and one or more conditions for execution of the malware; select a device (110a) from the plurality of devices (110a - 110n) based on the one or more conditions; reset the selected device to a clean baseline state to initiate analysis of the malware; execute, based on the one or more conditions, the one or more samples of the malware on the device (110a); and store results of execution of the one or more samples of the malware, wherein the results of execution include a precise and holistic run-time activity of the malware observable across network, Operating System (OS), and hardware; two firewall devices () configured in two levels to: connect the testbed to the Internet; allow malware to communicate with its remote command and control servers; and block all malware communications that may be detrimental to the infrastructure and the Internet; and a processing device configured to update the malware corpus, wherein the processing device: scans the Internet to search for newly reported malware samples; downloads the newly reported malware samples to a malware corpus; and executes the newly reported malware samples on the testbed, thus ensuring timely analysis of live malware, wherein malware are analysed promptly, before their remote command and control servers are blocked.
9. The system (500) as claimed in claim 8, wherein the one or more conditions include a software platform on which the malware is to be executed and a time duration (t) for which the execution of the malware is to be observed.
10. The system (500) as claimed in claim 8, wherein the plurality of devices (110a - 110n) are off-the-shelf devices.
11. The system (500) as claimed in claim 10, wherein the off-the-shelf devices are one or more of desktop computers, single-board computers, and embedded platforms with different operating systems.
12. The system (500) as claimed in claim 8, wherein a multi-level firewall is installed in the system (500) to manage a connection with the internet.
13. The system (500) as claimed in claim 8, wherein the testbed (110) has a multi-level reset mechanism.
14. The system (500) as claimed in claim 13, wherein a first level of the multi-level reset mechanism is a software-based baseline-reset for restoring a device of the heterogeneous hardware setup, and a second level of the multi-level reset mechanism is an image-reset for reloading a required OS from an image server.
15. The system (500) as claimed in claim 8, wherein the memory (504) is configured to store details of behavior of a plurality of malwares, and the memory (504) is accessible by a user to retrieve the results of the execution and the details of the behavior of the plurality of malwares.
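
By way of non-limiting illustration only, the workflow recited in claims 1, 5, 6 and 8 (select a device by the requested software platform, reset it through the multi-level reset mechanism, execute the sample for the observation duration, and store the results) may be sketched as follows. Every name in the sketch (`Device`, `Repository`, `select_device`, `multi_level_reset`, `analyse`, and the `run` callable) is hypothetical and does not appear in the specification. No malware is executed: the `run` callable is a placeholder that a real testbed would replace with its own execution and monitoring logic.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Device:
    """Hypothetical stand-in for one physical machine (110a - 110n) of the testbed."""
    name: str
    platform: str          # software platform requested in the conditions
    clean: bool = False    # True once the device is at its clean baseline state
    bootable: bool = True  # False models a machine that malware has impaired


@dataclass
class Repository:
    """Hypothetical in-memory stand-in for the results repository (504)."""
    records: List[Dict] = field(default_factory=list)

    def store(self, record: Dict) -> None:
        self.records.append(record)


def select_device(devices: List[Device], platform: str) -> Optional[Device]:
    """Pick the first device whose platform matches the requested condition."""
    for device in devices:
        if device.platform == platform:
            return device
    return None


def multi_level_reset(device: Device) -> str:
    """Restore the clean baseline state and report which reset level was used."""
    if device.bootable:
        # Level 1: software-based baseline-reset (the faster path).
        device.clean = True
        return "baseline-reset"
    # Level 2: image-reset, modelled here as reloading the OS so the device boots again.
    device.bootable = True
    device.clean = True
    return "image-reset"


def analyse(sample_id: str, platform: str, duration_s: int,
            devices: List[Device], repo: Repository,
            run: Callable[[Device, str, int], Dict]) -> Dict:
    """Select, reset, execute and store, in the order recited in claim 8."""
    device = select_device(devices, platform)
    if device is None:
        raise LookupError(f"no device offers platform {platform!r}")
    reset_level = multi_level_reset(device)
    # Placeholder for execution and collection of network, OS and hardware activity.
    observations = run(device, sample_id, duration_s)
    device.clean = False  # the device is no longer at baseline after a run
    record = {
        "sample": sample_id,
        "device": device.name,
        "reset_level": reset_level,
        "duration_s": duration_s,
        "observations": observations,
    }
    repo.store(record)
    return record
```

In this sketch the image-reset is attempted only when the device is marked as unbootable, which mirrors the ordering of the two reset levels in claim 6; an actual testbed may choose between the levels differently.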
PCT/IN2023/050462 2022-05-18 2023-05-17 System and method for facilitating behavioral analysis of malwares WO2023223352A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN202241028439 2022-05-18
IN202241028439 2022-05-18

Publications (1)

Publication Number Publication Date
WO2023223352A1 (en) 2023-11-23

Family

ID=88834787

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IN2023/050462 WO2023223352A1 (en) 2022-05-18 2023-05-17 System and method for facilitating behavioral analysis of malwares

Country Status (1)

Country Link
WO (1) WO2023223352A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150295945A1 (en) * 2014-04-14 2015-10-15 Drexel University Multi-Channel Change-Point Malware Detection
US20210117544A1 (en) * 2018-06-28 2021-04-22 Crowdstrike, Inc. Analysis of Malware

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 23807201; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 23807201; Country of ref document: EP; Kind code of ref document: A1)