2021 |
Botacin, Marcus; Aghakhani, Hojjat; Ortolani, Stefano; Kruegel, Christopher; Vigna, Giovanni; Oliveira, Daniela; Geus, Paulo Lício De; Grégio, André
One Size Does Not Fit All: A Longitudinal Analysis of Brazilian Financial Malware
Journal Article. ACM Trans. Priv. Secur., 24 (2), 2021, ISSN: 2471-2566.
Tags: banking, malware, reverse engineer
@article{10.1145/3429741,
  title = {One Size Does Not Fit All: A Longitudinal Analysis of Brazilian Financial Malware},
  author = {Marcus Botacin and Hojjat Aghakhani and Stefano Ortolani and Christopher Kruegel and Giovanni Vigna and Daniela Oliveira and Paulo Lício De Geus and André Grégio},
  url = {https://doi.org/10.1145/3429741 https://secret.inf.ufpr.br/papers/marcus_tops_br.pdf},
  doi = {10.1145/3429741},
  issn = {2471-2566},
  year = {2021},
  date = {2021-01-01},
  journal = {ACM Trans. Priv. Secur.},
  volume = {24},
  number = {2},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  abstract = {Malware analysis is an essential task to understand infection campaigns, the behavior of malicious codes, and possible ways to mitigate threats. Malware analysis also allows better assessment of attackers’ capabilities, techniques, and processes. Although a substantial amount of previous work provided a comprehensive analysis of the international malware ecosystem, research on regionalized, country-, and population-specific malware campaigns has been scarce. Moving towards addressing this gap, we conducted a longitudinal (2012-2020) and comprehensive (encompassing an entire population of online banking users) study of MS Windows desktop malware that actually infected Brazilian banks’ users. We found that the Brazilian financial desktop malware has been evolving quickly: it started to make use of a variety of file formats instead of typical PE binaries, relied on native system resources, and abused obfuscation techniques to bypass detection mechanisms. Our study on the threats targeting a significant population on the ecosystem of the largest and most populous country in Latin America can provide invaluable insights that may be applied to other countries’ user populations, especially those in the developing world that might face cultural peculiarities similar to Brazil’s. With this evaluation, we expect to motivate the security community/industry to seriously consider a deeper level of customization during the development of next-generation anti-malware solutions, as well as to raise awareness towards regionalized and targeted Internet threats.},
  keywords = {banking, malware, reverse engineer},
  pubstate = {published},
  tppubtype = {article}
} |
2020 |
Botacin, Marcus; Ceschin, Fabricio; de Geus, Paulo; Grégio, André
We Need to Talk About AntiViruses: Challenges & Pitfalls of AV Evaluations
Journal Article. Computers & Security, pp. 101859, 2020, ISSN: 0167-4048.
@article{BOTACIN2020101859,
  title = {We Need to Talk About AntiViruses: Challenges \& Pitfalls of AV Evaluations},
  author = {Marcus Botacin and Fabricio Ceschin and Paulo de Geus and André Grégio},
  url = {http://www.sciencedirect.com/science/article/pii/S0167404820301310 https://secret.inf.ufpr.br/papers/marcus_av.pdf},
  doi = {10.1016/j.cose.2020.101859},
  issn = {0167-4048},
  year = {2020},
  date = {2020-04-29},
  journal = {Computers \& Security},
  pages = {101859},
  abstract = {Security evaluation is an essential task to identify the level of protection accomplished in running systems or to aid in choosing better solutions for each specific scenario. Although antiviruses (AVs) are one of the main defensive solutions for most end-users and corporations, AVs’ evaluations are conducted by few organizations and are often limited to comparing detection rates. Moreover, other important factors of AVs’ operating mode (e.g., response time and detection regression) are usually underestimated. Ignoring such factors creates an “understanding gap” on the effectiveness of AVs in actual scenarios, which we aim to bridge by presenting a broader characterization of current AVs’ modes of operation. In our characterization, we consider distinct file types, operating systems, datasets, and time frames. To do so, we daily collected samples from two distinct, representative malware sources and submitted them to the VirusTotal (VT) service for 30 consecutive days. In total, we considered 28,875 unique malware samples. For each day, we retrieved the submitted samples’ detection rates and assigned labels, resulting in more than 1M distinct VT submissions overall. Our experimental results show that: (i) phishing contexts are a challenge for all AVs, making malicious Web page detectors less effective than malicious file detectors; (ii) generic procedures are insufficient to ensure broad detection coverage, resulting in lower detection rates for particular datasets (e.g., country-specific) than for those with world-wide collected samples; (iii) detection rates are unstable since all AVs presented detection regression effects after scans in different time frames using the same dataset; and (iv) AVs’ long response times in delivering new signatures/heuristics create a significant attack opportunity window within the first 30 days after we first identified a malicious binary. To address the effects of our findings, we propose six new metrics to evaluate the multiple aspects that impact the effectiveness of AVs. With them, we hope to assist corporate (and domestic) users in better evaluating the solutions that fit their needs more adequately.},
  keywords = {},
  pubstate = {published},
  tppubtype = {article}
} |
Botacin, Marcus; de Geus, Paulo Lício; Grégio, André
Leveraging branch traces to understand kernel internals from within
Journal Article. Journal of Computer Virology and Hacking Techniques, 2020, ISSN: 2263-8733.
@article{Botacin2020,
  title = {Leveraging branch traces to understand kernel internals from within},
  author = {Marcus Botacin and Paulo Lício de Geus and André Grégio},
  url = {https://doi.org/10.1007/s11416-019-00343-w https://secret.inf.ufpr.br//papers/reverse_kernel_marcus.pdf},
  doi = {10.1007/s11416-019-00343-w},
  issn = {2263-8733},
  year = {2020},
  date = {2020-01-02},
  journal = {Journal of Computer Virology and Hacking Techniques},
  abstract = {Kernel monitoring is often a hard task, requiring external debuggers and/or modules to be successfully performed. These requirements make analysis procedures more complicated because multiple machines, although virtualized ones, are required. These requirements also make analysis procedures more expensive. In this paper, we present the Lightweight Kernel Tracer (LKT), an alternative solution for tracing the kernel from within by leveraging branch monitors for data collection and an address-based introspection procedure for context reconstruction. We evaluated LKT by tracing distinct machines powered by x64 Windows kernels and show that LKT may be used for understanding kernel internals (e.g., graphics and USB subsystems) and for system profiling. We also show how to use LKT to trace other tracing and monitoring mechanisms running in the kernel, such as antiviruses and sandboxes.},
  keywords = {},
  pubstate = {published},
  tppubtype = {article}
} |
Botacin, Marcus; Zanata, Marco; Grégio, André
The self modifying code (SMC)-aware processor (SAP): a security look on architectural impact and support
Journal Article. Journal of Computer Virology and Hacking Techniques, 2020, ISSN: 2263-8733.
@article{Botacin2020b,
  title = {The self modifying code (SMC)-aware processor (SAP): a security look on architectural impact and support},
  author = {Marcus Botacin and Marco Zanata and André Grégio},
  url = {https://doi.org/10.1007/s11416-020-00348-w https://secret.inf.ufpr.br/papers/SMC_marcus.pdf},
  doi = {10.1007/s11416-020-00348-w},
  issn = {2263-8733},
  year = {2020},
  date = {2020-01-01},
  journal = {Journal of Computer Virology and Hacking Techniques},
  abstract = {Self modifying code (SMC) are code snippets that modify themselves at runtime. Malware use SMC to hide payloads and achieve persistence. Software-based SMC detection solutions impose performance penalties for real-time monitoring and do not benefit from runtime architectural information (cache invalidation or pipeline flush, for instance). We revisit SMC impact on hardware internals and discuss the implementation of an SMC detector at distinct architectural points. We consider three detection approaches: (i) existing hardware counters; (ii) block invalidation by the cache coherence protocol; (iii) the use of Memory Management Unit (MMU) information to control SMC execution. We compare the identified instrumentation points to highlight their strong and weak points. We also compare them to previous SMC detectors' implementations.},
  keywords = {},
  pubstate = {published},
  tppubtype = {article}
} |
Sun, R; Botacin, M; Sapountzis, N; Yuan, X; Bishop, M; Porter, D E; Li, X; Gregio, A; Oliveira, D
A Praise for Defensive Programming: Leveraging Uncertainty for Effective Malware Mitigation
Journal Article. IEEE Transactions on Dependable and Secure Computing, pp. 1-1, 2020.
@article{9061034,
  title = {A Praise for Defensive Programming: Leveraging Uncertainty for Effective Malware Mitigation},
  author = {R Sun and M Botacin and N Sapountzis and X Yuan and M Bishop and D E Porter and X Li and A Gregio and D Oliveira},
  url = {https://ieeexplore.ieee.org/document/9061034 https://secret.inf.ufpr.br/papers/chameleon.pdf},
  year = {2020},
  date = {2020-01-01},
  journal = {IEEE Transactions on Dependable and Secure Computing},
  pages = {1-1},
  keywords = {},
  pubstate = {published},
  tppubtype = {article}
} |
Botacin, Marcus; Bertão, Giovanni; de Geus, Paulo; Grégio, André; Kruegel, Christopher; Vigna, Giovanni
On the Security of Application Installers and Online Software Repositories
Conference. Detection of Intrusions and Malware, and Vulnerability Assessment, Springer International Publishing, Cham, 2020, ISBN: 978-3-030-52683-2.
@conference{10.1007/978-3-030-52683-2_10b,
  title = {On the Security of Application Installers and Online Software Repositories},
  author = {Marcus Botacin and Giovanni Bert{\~a}o and Paulo de Geus and André Grégio and Christopher Kruegel and Giovanni Vigna},
  editor = {Clémentine Maurice and Leyla Bilge and Gianluca Stringhini and Nuno Neves},
  url = {https://link.springer.com/chapter/10.1007/978-3-030-52683-2_10 https://secret.inf.ufpr.br/papers/marcus_dimva_bundle.pdf},
  isbn = {978-3-030-52683-2},
  year = {2020},
  date = {2020-01-01},
  booktitle = {Detection of Intrusions and Malware, and Vulnerability Assessment},
  pages = {192--214},
  publisher = {Springer International Publishing},
  address = {Cham},
  abstract = {The security of application installers is often overlooked, but the security risks associated with these pieces of code are not negligible. Online public repositories have been one of the most popular ways for end users to obtain software, but there is a lack of systematic security evaluation of popular public repositories. In this paper, we bridge this gap by analyzing five popular software repositories. We focus on their software updating dynamics, as well as the presence of traces of vulnerable and/or trojanized applications among the top-100 most downloaded Windows programs on each of the evaluated repositories. We analyzed 2,935 unique programs collected in a period of 144 consecutive days. Our results show that: (i) the repositories frequently exhibit rank changes due to applications fast climbing toward the first positions; (ii) the repositories often update their payloads, which may cause the distribution of distinct binaries for the same intended application (binaries for the same applications may also be different in each repository); (iii) the installers are composed of multiple components and often download payloads from the Internet to complete their installation steps, posing new risks for users (we demonstrate that some installers are vulnerable to content tampering through man-in-the-middle attacks); (iv) the ever-changing nature of repositories and installers makes them prone to abuse, as we observed that 30% of all applications were reported malicious by at least one AV.},
  keywords = {},
  pubstate = {published},
  tppubtype = {conference}
} |
Botacin, Marcus; Grégio, André; Alves, Marco Antonio Zanata
Near-Memory & In-Memory Detection of Fileless Malware
Inproceedings. The International Symposium on Memory Systems, pp. 23–38, Association for Computing Machinery, Washington, DC, USA, 2020, ISBN: 9781450388993.
Tags: antivirus, malware, pattern matching, processing in memory
@inproceedings{10.1145/3422575.3422775,
  title = {Near-Memory \& In-Memory Detection of Fileless Malware},
  author = {Marcus Botacin and André Grégio and Marco Antonio Zanata Alves},
  url = {https://doi.org/10.1145/3422575.3422775 https://secret.inf.ufpr.br/papers/marcus_fileless.pdf},
  doi = {10.1145/3422575.3422775},
  isbn = {9781450388993},
  year = {2020},
  date = {2020-01-01},
  booktitle = {The International Symposium on Memory Systems},
  pages = {23--38},
  publisher = {Association for Computing Machinery},
  address = {Washington, DC, USA},
  series = {MEMSYS 2020},
  abstract = {Fileless malware are recent threats to computer systems that load directly into memory, and whose aim is to prevent anti-viruses (AVs) from successfully matching byte patterns against suspicious files written on disk. Their detection requires that software-based AVs continuously scan memory, which is expensive due to repeated locks and polls. However, research advances introduced near-memory and in-memory processing, which allow memory controllers to trigger basic computations without moving data to the CPU. In this paper, we address AVs’ performance overhead by moving them to the hardware, i.e., we propose instrumenting processors’ memory controller or smart memories (near- and in-memory malware detection, respectively) to accelerate memory scanning procedures. To do so, we present MINI-ME, the Malware Identification based on Near- and In-Memory Evaluation mechanism, a hardware-based AV accelerator that interrupts the program’s execution if malicious patterns are discovered in its memory. We prototyped MINI-ME in a simulator and tested it with a set of 21 thousand in-the-wild malware samples, which resulted in multiple signatures matching with less than 1% of performance overhead and rates of 100% detection, and zero false-positives and false-negatives.},
  keywords = {antivirus, malware, pattern matching, processing in memory},
  pubstate = {published},
  tppubtype = {inproceedings}
} |
2019 |
Botacin, Marcus; Galante, Lucas; de Geus, Paulo; Grégio, André
RevEngE is a Dish Served Cold: Debug-Oriented Malware Decompilation and Reassembly
Inproceedings. Proceedings of the 3rd Reversing and Offensive-Oriented Trends Symposium, Association for Computing Machinery, Vienna, Austria, 2019, ISBN: 9781450377751.
@inproceedings{10.1145/3375894.3375895,
  title = {RevEngE is a Dish Served Cold: Debug-Oriented Malware Decompilation and Reassembly},
  author = {Marcus Botacin and Lucas Galante and Paulo de Geus and André Grégio},
  url = {https://doi.org/10.1145/3375894.3375895 https://secret.inf.ufpr.br/papers/roots_revenge.pdf},
  doi = {10.1145/3375894.3375895},
  isbn = {9781450377751},
  year = {2019},
  date = {2019-11-28},
  booktitle = {Proceedings of the 3rd Reversing and Offensive-Oriented Trends Symposium},
  publisher = {Association for Computing Machinery},
  address = {Vienna, Austria},
  series = {ROOTS’19},
  abstract = {Malware analysis is key for cybersecurity overall improvement. Analysis tools have been evolving from complete static analyzers to decompilers. Malware decompilation allows for code inspection at higher abstraction levels, easing incident response. However, the decompilation procedure has many challenges, such as opaque constructions, irreversible mappings, semantic gap bridging, among others. In this paper, we propose a new approach that leverages the human analyst expertise to overcome decompilation challenges. We name this approach "DoD---debug-oriented decompilation", in which the analyst is able to reverse engineer the malware sample on his own and to instruct the decompiler to translate selected code portions (e.g., decision branches, fingerprinting functions, payloads etc.) into high level code. With DoD, the analyst might group all decompiled pieces into new code to be analyzed by another tool, or to develop a novel malware sample from previous pieces of code and thus exercise a Proof-of-Concept (PoC). To validate our approach, we propose RevEngE, the Reverse Engineering Engine for malware decompilation and reassembly, a set of GDB extensions that intercept and introspect into executed functions to build an Intermediate Representation (IR) in real-time, enabling any-time decompilation. We evaluate RevEngE with x86 ELF binaries collected from VirusShare, and show that a new malware sample created from the decompilation of independent functions of five known malware samples is considered "clean" by all VirusTotal's AVs.},
  keywords = {},
  pubstate = {published},
  tppubtype = {inproceedings}
} |
Ceschin, Fabrício; Botacin, Marcus; Gomes, Heitor Murilo; Oliveira, Luiz S; Grégio, André
Shallow Security: On the Creation of Adversarial Variants to Evade Machine Learning-Based Malware Detectors
Inproceedings. Proceedings of the 3rd Reversing and Offensive-Oriented Trends Symposium, Association for Computing Machinery, Vienna, Austria, 2019, ISBN: 9781450377751.
@inproceedings{10.1145/3375894.3375898,
  title = {Shallow Security: On the Creation of Adversarial Variants to Evade Machine Learning-Based Malware Detectors},
  author = {Fabrício Ceschin and Marcus Botacin and Heitor Murilo Gomes and Luiz S Oliveira and André Grégio},
  url = {https://doi.org/10.1145/3375894.3375898 https://secret.inf.ufpr.br/papers/roots_shallow.pdf},
  doi = {10.1145/3375894.3375898},
  isbn = {9781450377751},
  year = {2019},
  date = {2019-11-28},
  booktitle = {Proceedings of the 3rd Reversing and Offensive-Oriented Trends Symposium},
  publisher = {Association for Computing Machinery},
  address = {Vienna, Austria},
  series = {ROOTS’19},
  abstract = {The use of Machine Learning (ML) techniques for malware detection has been a trend in the last two decades. More recently, researchers started to investigate adversarial approaches to bypass these ML-based malware detectors. Adversarial attacks became so popular that a large Internet company has launched a public challenge to encourage researchers to bypass their (three) ML-based static malware detectors. Our research group teamed up to participate in this challenge in August/2019, accomplishing the bypass of all 150 tests proposed by the company. To do so, we implemented an automatic exploitation method which moves the original malware binary sections to resources and adds new chunks of data to it to create adversarial samples that bypassed not only their ML detectors but also real AV engines (with a lower detection rate than the original samples). In this paper, we detail our methodological approach to overcome the challenge and report our findings. With these results, we expect to contribute to the community and provide better understanding of ML-based detectors’ weaknesses. We also pinpoint future research directions toward the development of more robust malware detectors against adversarial machine learning.},
  keywords = {},
  pubstate = {published},
  tppubtype = {inproceedings}
} |
Botacin, Marcus; de Geus, Paulo Lício; Grégio, André ``VANILLA'' malware: vanishing antiviruses by interleaving layers and layers of attacks Journal Article Journal of Computer Virology and Hacking Techniques, 2019, ISSN: 2263-8733. Abstract | Links | BibTeX | Tags: @article{Botacin2019, title = {``VANILLA'' malware: vanishing antiviruses by interleaving layers and layers of attacks}, author = {Marcus Botacin and Paulo Lício de Geus and André Grégio}, url = {https://secret.inf.ufpr.br/papers/marcus-vanilla.pdf https://doi.org/10.1007/s11416-019-00333-y}, doi = {10.1007/s11416-019-00333-y}, issn = {2263-8733}, year = {2019}, date = {2019-06-11}, journal = {Journal of Computer Virology and Hacking Techniques}, abstract = {Malware are persistent threats to any networked systems. Recent years increase in multi-core, distributed systems created new opportunities for malware authors to exploit such capabilities. In particular, the distributed execution of a malware in multiple cores may be used to evade currently widespread single-core-based detectors (e.g., antiviruses, or AVs) and malware analysis solutions that are unable to correlate data from multiple sources. In this paper, we propose a technique for distributing the malware functions in several distinct ``vanilla'' processes to show that AVs can be easily evaded. Therefore, our technique allows malware to interleave of layers of attacks to remain undetected by current AVs. Our goal is to expose a real menace and to discuss it so as to provide insights for the development of better AVs. We discuss the role of distributed and multicore-based malware in current and future threat scenarios with practical examples that we specially crafted for testing (e.g., a distributed sample synchronized via cache side channels). 
We (i) review multi-threaded/processed implementation issues (from kernel and userland) and present a multi-core-based monitoring solution; (ii) present strategies for code distribution, exemplified via DLL injectors, and discuss their weak and strong points; and (iii) evaluate how real security solutions perform when exposed to distributed malware. We converted real, serial malware to parallel code and showed that current AVs are not fully able to detect multi-core malware.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Botacin, Marcus; Galante, Lucas; Ceschin, Fabricio; Carro, Luigi; Santos, Paulo Cesar; de Geus, Paulo Licio; Gregio, Andre; Zanata, Marco The AV says: Your hardware definitions were updated! Conference 14th International Symposium on Reconfigurable Communication-centric Systems-on-Chip (ReCoSoC 2019), IEEE, 2019, ISBN: 978-1-7281-4770-3. @conference{recosoc, title = {The AV says: Your hardware definitions were updated!}, author = {Marcus Botacin and Lucas Galante and Fabricio Ceschin and Luigi Carro and Paulo Cesar Santos and Paulo Licio de Geus and Andre Gregio and Marco Zanata}, url = {https://ieeexplore.ieee.org/document/9034972 https://secret.inf.ufpr.br/papers/marcus_recosoc.pdf}, doi = {10.1109/ReCoSoC48741.2019.9034972}, isbn = {978-1-7281-4770-3}, year = {2019}, date = {2019-01-01}, booktitle = {14th International Symposium on Reconfigurable Communication-centric Systems-on-Chip (ReCoSoC 2019)}, journal = {14th International Symposium on Reconfigurable Communication-centric Systems-on-Chip (ReCoSoC 2019)}, publisher = {IEEE}, howpublished = {\url{https://secret.inf.ufpr.br/papers/marcus_recosoc.pdf}}, keywords = {}, pubstate = {published}, tppubtype = {conference} } |
Botacin, Marcus; Kalysch, Anatoli; Grégio, André The Internet Banking [in]Security Spiral: Past, Present, and Future of Online Banking Protection Mechanisms Based on a Brazilian Case Study Inproceedings Proceedings of the 14th International Conference on Availability, Reliability and Security, pp. 49:1–49:10, ACM, Canterbury, CA, United Kingdom, 2019, ISBN: 978-1-4503-7164-3. @inproceedings{Botacin:2019:IBI:3339252.3340103, title = {The Internet Banking [in]Security Spiral: Past, Present, and Future of Online Banking Protection Mechanisms Based on a Brazilian Case Study}, author = {Marcus Botacin and Anatoli Kalysch and André Grégio}, url = {http://doi.acm.org/10.1145/3339252.3340103 https://secret.inf.ufpr.br/papers/marcus_banks.pdf}, doi = {10.1145/3339252.3340103}, isbn = {978-1-4503-7164-3}, year = {2019}, date = {2019-01-01}, booktitle = {Proceedings of the 14th International Conference on Availability, Reliability and Security}, pages = {49:1--49:10}, publisher = {ACM}, address = {Canterbury, CA, United Kingdom}, series = {ARES '19}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Beppler, Tamy; Botacin, Marcus; Ceschin, Fabrício; Oliveira, Luiz E S; Grégio, André L(a)ying in (Test)Bed: How Biased Datasets Produce Impractical Results for Actual Malware Families’ Classification Inproceedings Lin, Zhiqiang; Papamanthou, Charalampos; Polychronakis, Michalis (Ed.): Information Security, pp. 381–401, Springer International Publishing, Cham, 2019, ISBN: 978-3-030-30215-3. Tags: learning (artificial intelligence) @inproceedings{10.1007/978-3-030-30215-3_19, title = {L(a)ying in (Test)Bed: How Biased Datasets Produce Impractical Results for Actual Malware Families’ Classification}, author = {Tamy Beppler and Marcus Botacin and Fabrício Ceschin and Luiz E S Oliveira and André Grégio}, editor = {Zhiqiang Lin and Charalampos Papamanthou and Michalis Polychronakis}, url = {https://link.springer.com/chapter/10.1007/978-3-030-30215-3_19 https://secret.inf.ufpr.br//papers/malware_textures_tamy.pdf}, isbn = {978-3-030-30215-3}, year = {2019}, date = {2019-01-01}, booktitle = {Information Security}, pages = {381--401}, publisher = {Springer International Publishing}, address = {Cham}, abstract = {The number of malware variants released daily turned manual analysis into an impractical task. Although potentially faster, automated analysis techniques (e.g., static and dynamic) have shortcomings that are exploited by malware authors to thwart each of them, i.e., prevent malicious software from being detected or classified accordingly. Researchers then invested in traditional machine learning algorithms to try to produce efficient, effective classification methods. The produced models are also prone to errors and attacks. Novel representations of the ``subject'' were proposed to overcome previous limitations, such as malware textures. 
In this paper, our initial proposal was to evaluate the application of texture analysis for malware classification using samples collected in-the-wild in order to compare them with state-of-the-art results. During our tests, we discovered that texture analysis may be infeasible for the task at hand if we use the same malware representation employed by other authors. Furthermore, we also discovered that naive premises associated with the selection of samples in the datasets introduced biases that, in the end, produced unrealistic results. Finally, our tests with a broader unfiltered dataset show that texture analysis may be impractical for correct malware classification in a real-world scenario, in which there is a great variety of families and some of them make use of quite sophisticated obfuscation techniques.}, keywords = {learning (artificial intelligence)}, pubstate = {published}, tppubtype = {inproceedings} } |
2018 |
Ceschin, Fabrício; Pinage, Felipe; Castilho, Marcos; Menotti, David; Oliveira, Luis S; Gregio, André The Need for Speed: An Analysis of Brazilian Malware Classifiers Journal Article IEEE Security & Privacy, 16 (6), pp. 31-41, 2018, ISSN: 1540-7993. Tags: Brazilian malware classifiers, Feature extraction, invasive software, learning (artificial intelligence), Machine learning, machine-learning systems, malware, malware classification, pattern classification, security, Security of data, Support vector machines @article{8636415, title = {The Need for Speed: An Analysis of Brazilian Malware Classifiers}, author = {Fabrício Ceschin and Felipe Pinage and Marcos Castilho and David Menotti and Luis S Oliveira and André Gregio}, url = {https://secret.inf.ufpr.br/papers/fabricio_needforspeed.pdf}, doi = {10.1109/MSEC.2018.2875369}, issn = {1540-7993}, year = {2018}, date = {2018-11-01}, journal = {IEEE Security \& Privacy}, volume = {16}, number = {6}, pages = {31-41}, abstract = {Using a dataset containing about 50,000 samples from Brazilian cyberspace, we show that relying solely on conventional machine-learning systems without taking into account the change of the subject's concept decreases the performance of classification, emphasizing the need to update the decision model immediately after concept drift occurs.}, keywords = {Brazilian malware classifiers, Feature extraction, invasive software, learning (artificial intelligence), Machine learning, machine-learning systems, malware, malware classification, pattern classification, security, Security of data, Support vector machines}, pubstate = {published}, tppubtype = {article} } |
Botacin, Marcus; de Geus, Paulo Lício; Grégio, André The other guys: automated analysis of marginalized malware Journal Article Journal of Computer Virology and Hacking Techniques, 14 (1), pp. 87–98, 2018, ISSN: 2263-8733. @article{Botacin2018, title = {The other guys: automated analysis of marginalized malware}, author = {Marcus Botacin and Paulo Lício de Geus and André Grégio}, url = {https://secret.inf.ufpr.br/papers/behemot.pdf https://doi.org/10.1007/s11416-017-0292-8}, doi = {10.1007/s11416-017-0292-8}, issn = {2263-8733}, year = {2018}, date = {2018-02-01}, journal = {Journal of Computer Virology and Hacking Techniques}, volume = {14}, number = {1}, pages = {87--98}, abstract = {In order to thwart dynamic analysis and bypass protection mechanisms, malware has been using several file formats and evasive techniques. While publicly available dynamic malware analysis systems are one of the main sources of information for researchers, security analysts and incident response professionals, they are unable to cope with all types of threats. Therefore, it is difficult to gather information from public systems about CPL, .NET/Mono, 64-bit, reboot-dependent, or malware targeting systems newer than Windows XP, which results in a lack of understanding about how current malware behaves during infections on modern operating systems. In this paper, we discuss the challenges and issues faced during the development of this type of analysis system, mainly due to security features available in NT 6.x kernel versions of Windows OS. We also introduce a dynamic analysis system that addresses the aforementioned types of malware as well as present results obtained from their analyses.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Botacin, Marcus; Geus, Paulo Lício De; Grégio, André Who Watches the Watchmen: A Security-focused Review on Current State-of-the-art Techniques, Tools, and Methods for Systems and Binary Analysis on Modern Platforms Journal Article ACM Comput. Surv., 51 (4), pp. 69:1–69:34, 2018, ISSN: 0360-0300. Tags: Binary analysis, HVM, introspection, malware, security, SMM @article{Botacin:2018:WWS:3236632.3199673, title = {Who Watches the Watchmen: A Security-focused Review on Current State-of-the-art Techniques, Tools, and Methods for Systems and Binary Analysis on Modern Platforms}, author = {Marcus Botacin and Paulo Lício De Geus and André Grégio}, url = {https://secret.inf.ufpr.br/papers/marcus-survey.pdf http://doi.acm.org/10.1145/3199673}, doi = {10.1145/3199673}, issn = {0360-0300}, year = {2018}, date = {2018-01-01}, journal = {ACM Comput. Surv.}, volume = {51}, number = {4}, pages = {69:1--69:34}, publisher = {ACM}, address = {New York, NY, USA}, keywords = {Binary analysis, HVM, introspection, malware, security, SMM}, pubstate = {published}, tppubtype = {article} } |
Botacin, Marcus; Geus, Paulo Lício De; Grégio, André Enhancing Branch Monitoring for Security Purposes: From Control Flow Integrity to Malware Analysis and Debugging Journal Article ACM Trans. Priv. Secur., 21 (1), pp. 4:1–4:30, 2018, ISSN: 2471-2566. Tags: branch monitor, debug, malware, ROP @article{Botacin:2018:EBM:3171591.3152162, title = {Enhancing Branch Monitoring for Security Purposes: From Control Flow Integrity to Malware Analysis and Debugging}, author = {Marcus Botacin and Paulo Lício De Geus and André Grégio}, url = {https://secret.inf.ufpr.br/papers/marcus-branch.pdf http://doi.acm.org/10.1145/3152162}, doi = {10.1145/3152162}, issn = {2471-2566}, year = {2018}, date = {2018-01-01}, journal = {ACM Trans. Priv. Secur.}, volume = {21}, number = {1}, pages = {4:1--4:30}, publisher = {ACM}, address = {New York, NY, USA}, keywords = {branch monitor, debug, malware, ROP}, pubstate = {published}, tppubtype = {article} } |
Afonso, Vitor; Kalysch, Anatoli; Müller, Tilo; Oliveira, Daniela; Grégio, André; de Geus, Paulo Lício Lumus: Dynamically Uncovering Evasive Android Applications Inproceedings Chen, Liqun; Manulis, Mark; Schneider, Steve (Ed.): Information Security, pp. 47–66, Springer International Publishing, Cham, 2018, ISBN: 978-3-319-99136-8. @inproceedings{10.1007/978-3-319-99136-8_3, title = {Lumus: Dynamically Uncovering Evasive Android Applications}, author = {Vitor Afonso and Anatoli Kalysch and Tilo Müller and Daniela Oliveira and André Grégio and Paulo Lício de Geus}, editor = {Liqun Chen and Mark Manulis and Steve Schneider}, url = {https://secret.inf.ufpr.br/papers/lumus.pdf}, isbn = {978-3-319-99136-8}, year = {2018}, date = {2018-01-01}, booktitle = {Information Security}, pages = {47--66}, publisher = {Springer International Publishing}, address = {Cham}, abstract = {Dynamic analysis of Android malware suffers from techniques that identify the analysis environment and prevent the malicious behavior from being observed. While there are many analysis solutions that can thwart evasive malware on Windows, the application of similar techniques for Android has not been studied in-depth. In this paper, we present Lumus, a novel technique to uncover evasive malware on Android. Lumus compares the execution traces of malware on bare metal and emulated environments. We used Lumus to analyze 1,470 Android malware samples and were able to uncover 192 evasive samples. Comparing our approach with other solutions yields better results in terms of accuracy and false positives. 
We discuss which information is typically used by evasive malware for detecting emulated environments, and conclude on how analysis sandboxes can be strengthened in the future.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
2017 |
Sun, R; Yuan, X; Lee, A; Bishop, M; Porter, D E; Li, X; Grégio, André; Oliveira, Daniela The dose makes the poison — Leveraging uncertainty for effective malware detection Inproceedings 2017 IEEE Conference on Dependable and Secure Computing, pp. 123-130, 2017. @inproceedings{8073803, title = {The dose makes the poison — Leveraging uncertainty for effective malware detection}, author = {R Sun and X Yuan and A Lee and M Bishop and D E Porter and X Li and André Grégio and Daniela Oliveira}, doi = {10.1109/DESEC.2017.8073803}, year = {2017}, date = {2017-08-01}, booktitle = {2017 IEEE Conference on Dependable and Secure Computing}, pages = {123-130}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |