Automated triage of samples for malware analysis
As people rely increasingly on information systems, the threat landscape will keep evolving. To combat and defeat new threats, we need good cyber threat intelligence. Analysis of malicious software is a popular method of gathering such intelligence. It can be done both manually and automatically, with pros and cons on both sides. What the two approaches have in common is that they require substantial resources, whether time or computational power. A problem arises when the analyst has more suspicious samples to analyze than available resources allow. The analyst then has to triage the samples and prioritize which ones to focus resources on. This triage itself requires considerable resources, which results in a vicious cycle. This thesis proposes a methodology for automating such sample triage, resulting in a list of samples sorted by a score that indicates the probability of containing good cyber threat intelligence. To validate the methodology, a program was developed that followed most of the methods; however, a few limitations had to be considered. Various tests were conducted on the program output in order to validate it, one of which compared computer triage with human triage. The results were not entirely conclusive, but the study contributes to the needed research on improving the efficiency and reducing the resource requirements of malware analysis. We have shown that generic rules can become useful when combined with feature values and weights, provided they are used for the right purpose. With these weights and values, we also showed that it is possible to always get a relevant output of interesting samples by changing the values to fit the current situation. Lastly, it was concluded that there is a good possibility that replacing human triage with an automated system will indeed increase the efficiency of the malware analysis process.
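To make the idea of combining feature values with weights more concrete, the following is a minimal sketch and not the program developed in the thesis. The feature names, the weight values, and the use of a plain weighted sum as the score are all assumptions made for illustration; the abstract does not specify how the score is computed.

```python
from dataclasses import dataclass, field

# Hypothetical feature weights; names and values are illustrative only.
DEFAULT_WEIGHTS = {
    "packed": 2.0,
    "network_indicators": 3.0,
    "known_family": -1.5,
}


@dataclass
class Sample:
    name: str
    # Assumed feature values in [0, 1], e.g. derived from generic rule matches.
    features: dict = field(default_factory=dict)


def score(sample, weights):
    """Weighted sum of feature values; features without a weight count as 0."""
    return sum(weights.get(f, 0.0) * v for f, v in sample.features.items())


def triage(samples, weights=DEFAULT_WEIGHTS):
    """Return the samples sorted from highest to lowest score."""
    return sorted(samples, key=lambda s: score(s, weights), reverse=True)


samples = [
    Sample("a.exe", {"packed": 1.0, "known_family": 1.0}),
    Sample("b.dll", {"network_indicators": 1.0}),
    Sample("c.bin", {"packed": 1.0, "network_indicators": 0.5}),
]

# Ranking under the default weights.
ranked = [s.name for s in triage(samples)]

# Changing the weights to fit a different situation changes the ranking.
family_focus = {"packed": 0.0, "network_indicators": 1.0, "known_family": 5.0}
ranked_family = [s.name for s in triage(samples, family_focus)]
```

Under these assumed weights, the same three samples are ranked differently depending on which features the analyst chooses to emphasize, which is the kind of situational tuning the abstract describes.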