This is the presentation that I used in my speech at the 2017 Security Summit in Milan.
Security Summit, organized by ClusIt, is the most important security event in Italy. My talk was about protecting against unknown threats delivered via email, with a focus on the relationship between pragmatism and security.
In this post I will go through the main points of that talk.
Complexity
The source of today’s security issues lies mainly in the great complexity of the systems we use; this complexity creates a large attack surface. If we combine such a large attack surface with the strong motivation to violate our systems, driven mainly by the ransomware and phishing business, we can understand why new, not-yet-known threats are now daily business. We saw about 30 thousand new ransomware variants in 2016, which means tens of new threats every day. To be effective, security systems must be able to intercept such threats even without knowing them in advance.
Security systems, in computer science as in the physical world, must be kept as simple as possible, because complexity is the enemy of security: it reduces reliability and increases the attack surface.
Some security solutions are oriented more towards marketing needs than security needs. A sandbox based on virtual machines, for example, is a system even more complex than the one it is supposed to defend: a Windows system virtualized in an environment instrumented with additional software that tries to observe the malware without being observed by the malware itself. This must be done with short analysis times, typically within two minutes, so as not to weigh excessively on company workflows (and to contain costs).
Malware authors have access to the very same sandboxes we use, so malware can be tested against them before being released, and its authors quickly learned to trick them. While the sandbox has just a couple of minutes to produce a verdict, the malware, once it has infected the PC, has all the time in the world to manifest itself: it can simply wait a while to avoid being identified by the sandbox. Sandboxes responded by tricking the malware into believing that time passes faster, and malware learned to use this very characteristic to detect the sandbox. This is a much simplified example, but it shows that the eternal fight between attack and defense has just moved into a different environment without changing the underlying pattern. And in this new environment, the defender is the one at a disadvantage.
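To make the cat-and-mouse game concrete, here is a minimal, purely conceptual sketch of the timing check described above. Real malware compares the sleep against a clock source the sandbox forgot to hook (network time, CPU tick counters); here time.monotonic() merely stands in for that second source.

```python
import time

def sleep_was_accelerated(seconds: float = 120.0) -> bool:
    # Ask the OS to sleep longer than the sandbox can afford to wait,
    # then compare against a second clock source. A sandbox that
    # fast-forwards sleep() returns almost immediately, so the two
    # measurements disagree and the environment reveals itself.
    before = time.monotonic()
    time.sleep(seconds)
    elapsed = time.monotonic() - before
    return elapsed < seconds * 0.5
```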
But a sandbox is a strong marketing argument, because complexity sells better than pragmatism.
These sandboxes are exceptional tools for the analysis and study of malware, but when used as filters they show many weaknesses and offer a considerable attack surface. A few days ago at Pwn2Own, a cybersecurity contest, one team managed to escape from the VM and compromise the host, with just a click on a link. Imagine a malware that compromises the sandbox in order to infect all the analyzed files … how does that sound?
So let’s use complexity prudently, and only where we really need it, knowing that every increase in complexity reduces security: it has a cost.
Protecting from file-based threats
Regarding file-based threats, in my talk I went through the analysis of the problem together with the audience, starting from square zero. Let’s analyze the problem again from scratch, adding complexity only until we reach our goal.
Let’s start from the “firewall” approach. Do you remember when we used to selectively close ports on the firewall? All ports open by default, except the ones we decided to close. Then we inverted this logic: everything is closed, and we selectively open based on what we actually need. This simple change of paradigm alone drastically improved security.
Let’s take the same approach. Do we really need executables attached to our emails? Do we need .exe, .js and so on? No, we don’t. The experience of over one billion emails per month tells us that we don’t need such files and that removing them doesn’t impact company workflows, so let’s block them without even analyzing them.
Does your IT department technician need to receive .jar files? Fine, let’s add an exception for him. Firewall approach: everything is blocked, and we selectively unblock.
A very simple, even trivial, solution that removes most of the attack vectors.
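A minimal sketch of what this firewall approach looks like in code, assuming a hypothetical filter in the mail pipeline (the extension list and the exception address are illustrative, not an actual product configuration):

```python
# Default-deny attachment filter: listed extensions are blocked without
# analysis, and exceptions are granted selectively, per recipient.
BLOCKED_EXTENSIONS = {".exe", ".js", ".jar", ".vbs", ".scr", ".bat", ".cmd"}
EXCEPTIONS = {"it-tech@example.com": {".jar"}}  # hypothetical exception

def attachment_is_blocked(filename: str, recipient: str) -> bool:
    dot = filename.rfind(".")
    ext = filename[dot:].lower() if dot != -1 else ""
    allowed = EXCEPTIONS.get(recipient.lower(), set())
    return ext in BLOCKED_EXTENSIONS and ext not in allowed

# attachment_is_blocked("setup.exe", "user@example.com")    -> True
# attachment_is_blocked("tool.jar", "it-tech@example.com")  -> False
```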
We still have documents, though. Office documents and PDF files have become formats so complex that they can contain code able to do anything. What do we do with those? We certainly can’t block them.
Ok, it’s time for a one-notch increase in complexity: one step towards greater complexity, justified by a real need. Let’s do it.
Let’s analyze the document. Does it contain code or not? If not, it goes through. Firewall approach. But what if it does contain code? Surely we can’t block every spreadsheet with a macro! Right.
One more notch of complexity is justified. Let’s do it.
Now that we know the document contains code, let’s roll up our sleeves and inspect what it does. Does it perform calculations in a spreadsheet, or some other automation that is normal in a document? Ok, let’s define a set of safe operations and let those documents through. Firewall approach. This analysis can be done quickly and safely, so let’s do it.
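For Office files, this two-step check (is there code at all? does the code do anything beyond normal automation?) can be sketched with the open-source oletools library. The keyword allow-list below is a deliberate simplification of what a real classification engine does, and the token list is illustrative:

```python
from oletools.olevba import VBA_Parser  # pip install oletools

# Tokens that suggest the macro reaches outside the document.
SUSPICIOUS_TOKENS = ("Shell", "CreateObject", "GetObject",
                     "URLDownloadToFile", "SendKeys", "Environ")

def classify_document(path: str) -> str:
    vba = VBA_Parser(path)
    try:
        if not vba.detect_vba_macros():
            return "safe"  # no code at all: it goes through (firewall approach)
        for _, _, _, code in vba.extract_macros():
            if any(token in code for token in SUSPICIOUS_TOKENS):
                return "suspect"  # code that touches the system: to be cleaned
        return "safe"  # only ordinary automation detected
    finally:
        vba.close()
```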
Great, we’ve given the green light to documents with innocuous macros. What about the other ones? We can’t block them all; we would risk creating disservices.
Ok, one more notch of complexity. Why don’t we “clean” the files we classified as “suspect”? We can remove the “active” content: the macro, the embedded OCX object, the JavaScript code in a PDF. Let’s remove that code and deliver an innocuous document. It can still be used as a document, just without code.
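For OOXML formats the cleaning can be surprisingly simple, because the documents are ZIP containers and the macros live in a well-known member. A simplified sketch (a production sanitizer would also fix [Content_Types].xml and the relationship entries, and would keep the original in quarantine):

```python
import zipfile

def strip_active_content(src: str, dst: str) -> None:
    # Copy every member of the container except the VBA project
    # (word/vbaProject.bin in documents, xl/vbaProject.bin in spreadsheets).
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        for item in zin.infolist():
            if item.filename.endswith("vbaProject.bin"):
                continue  # drop the code, keep the document
            zout.writestr(item, zin.read(item.filename))
```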
Here we are. We just need the finishing touches: decide what to do with encrypted documents that we cannot analyze (easy: we block them, because today they are one of the biggest vehicles of ransomware), and what to do in the unlucky case where a malware manages to crash our sandbox in order not to be identified. In that case we categorize the document as “indeterminate” and, by default, we remove all of its active content. It is simple good programming practice to foresee the case where someone manages to sabotage your sandbox (we still see sandboxes that, in such conditions, let through the very malware that managed to undermine them).
Finally, let’s make the behaviour for safe, suspect, encrypted and indeterminate documents configurable by the sysadmin, so that the admin can decide what to let through, what to block and what to “clean”. Let’s also make sure that every time we modify a file we keep the original copy, so that it can be recovered should it be needed.
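Put together, the policy might look like this minimal sketch, with defaults mirroring the choices described above (all names are illustrative):

```python
from enum import Enum

class Verdict(Enum):
    SAFE = "safe"
    SUSPECT = "suspect"              # contains code beyond normal automation
    ENCRYPTED = "encrypted"          # cannot be analyzed
    INDETERMINATE = "indeterminate"  # the analysis itself failed or crashed

class Action(Enum):
    DELIVER = "deliver"
    BLOCK = "block"
    CLEAN = "clean"  # strip active content, keep the original for recovery

# Admin-configurable policy; these defaults mirror the text above.
DEFAULT_POLICY = {
    Verdict.SAFE: Action.DELIVER,
    Verdict.SUSPECT: Action.CLEAN,
    Verdict.ENCRYPTED: Action.BLOCK,
    Verdict.INDETERMINATE: Action.CLEAN,  # never trust a crashed analysis
}
```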
At this point the goal is reached: we are protected from file-based attacks, including those not yet known, and we did it with the simplest possible solution. An analysis and a selective cleaning that are fast, that can be performed while the mail is being analyzed by the antispam engine, without uploading your files to a third-party cloud service, without compliance and privacy issues, and without introducing delays that impact workflows. Goal achieved, with minimal complexity and minimal attack surface.
Protecting from malicious links
So, should we absolutely avoid complexity in any case? No.
We should simply use complexity where we need it, without fear, but treating it as a cost that must be justified.
For example, complexity is more than justified when protecting against malicious URLs.
The most frequent attack today is a mail coming from a legitimate sender (whose account is being used illegally), with a short and very generic text, containing a link to a legitimate site (infected five minutes earlier) into which malware has been injected that installs itself as soon as the page is visited. One click and you get the ransomware.
This type of attack is a big problem because such a mail can slip through without being identified as malicious. So, while the email is being analyzed, we rewrite the URL so that instead of pointing to that page it points to our sandbox; we’ll see that here a complex sandbox is justified. Let’s buy some time, because time is on our side: the more time passes, the easier it will be for us to identify a legitimate site that has just been infected. So let’s just rewrite the URL and postpone the analysis to the last possible moment: the moment of the click.
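A minimal sketch of the rewriting step, assuming a hypothetical click-time analysis endpoint. Signing the rewritten link prevents the endpoint from being abused as an open redirector; the endpoint URL and the key handling are illustrative only:

```python
import hashlib
import hmac
from urllib.parse import urlencode

ANALYZER = "https://clicktime.example.com/check"  # hypothetical endpoint
SIGNING_KEY = b"replace-with-a-real-secret"       # illustrative only

def rewrite_url(original_url: str) -> str:
    # Postpone the analysis: the link now points at our sandbox, which
    # will fetch and inspect the original page only when the user clicks.
    signature = hmac.new(SIGNING_KEY, original_url.encode(),
                         hashlib.sha256).hexdigest()
    return ANALYZER + "?" + urlencode({"u": original_url, "s": signature})
```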
When the click happens, the user’s browser lands in our sandbox which, only at this very moment, visits the page and analyzes it. It follows the redirects, it visits the page from multiple locations in order to expose evasion techniques, it evaluates how the page presents itself to search engines, it looks for infection traces, phishing attempts and so on.
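One of those checks can be sketched as follows, assuming the requests library and purely illustrative identities. Fetching the same page as a regular browser and as a search-engine crawler (possibly through proxies in different networks) exposes cloaking; a real engine would compare structural features rather than raw bodies, since benign pages vary a little too:

```python
import requests  # pip install requests

# Hypothetical identities: a regular browser and a search-engine crawler.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
]

def page_cloaks(url: str) -> bool:
    bodies = set()
    for agent in USER_AGENTS:
        resp = requests.get(url, headers={"User-Agent": agent},
                            timeout=10, allow_redirects=True)
        bodies.add(resp.text)
    # Serving different content to different visitors is itself
    # an evasion signal.
    return len(bodies) > 1
```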
Let’s not economize on complexity here, because here it is useful and it pays back. Most importantly, here it is the sandbox that has the advantage: the malware cannot wait, it must manifest itself immediately, because the infection must happen when the page is loaded. Also, the techniques available for hiding from sandboxes are limited, and by visiting the page in many ways and from different “places” we can identify them. This is a huge advantage: merely putting evasion techniques in place reveals the presence of the malware, including the most complex and not yet known ones, making it extremely vulnerable to our analysis.
Here is where the complexity is justified.
If you want to go deeper, here is an explanation of how the file analysis works, and here is the one for URLs.
Rodolfo Saccani, Security R&D Manager at Libra Esva