Software Diversification: What It Is and Why It’s Important


Software diversification is a method of altering an executable binary so that different instances of the same software provide identical functionality but, to an attacker, look different and operate differently at the binary level. Software diversification confounds an attacker’s attempts to use information gained from one deployment to compromise other deployments. It is much harder to develop a universal cracking scheme for diversified software instances; instead, each instance must be cracked individually.

On current open-architecture computing systems, users control their machines and can modify software and processes. Some users will attempt to analyze software protection mechanisms, out of academic curiosity or otherwise. Software developers often want to protect their software against reverse engineering, for example to prevent unauthorized copying or to protect intellectual property contained in the software.

There are many code protection techniques on the market that offer protection against reverse engineering and code analysis, with varying degrees of success and complexity. However, many of these techniques do not protect against class breaks (sometimes called “Break Once Run Everywhere (BORE) attacks”). A class break is an attack that, once successfully executed on one software instance, can be applied in the same way to crack all other instances of the same software. Class breaks are possible because all copies of the target software typically have the same binary code image, which lets an adversary develop a generic reverse-engineering scheme. This has occurred many times in the past, causing tremendous losses to software vendors.

Software diversification is a leading protection technique against class breaks. It significantly increases the time and cost of attacking an installed base of protected applications, because the attacker must crack each copy of the application separately. For this reason, software diversification should be a standard means of protecting software applications that are distributed in large numbers to consumer devices, such as desktop computers, mobile devices, and game consoles.

There are two types of software diversification: data and code. Applications containing cryptographic operations should employ at least one type, and preferably both.

Data diversification is a relatively simple method that enhances protection against class breaks. With this method, certain embedded data values referenced by the program code vary among different instances of the same application. For example, such a value could be a key that encrypts a database stored on a device, or that encrypts other keys imported into the application. A hacker who manages to extract the key from one application instance would not be able to use that key to decrypt the secrets of other application instances.

To use data diversification, unique and “personalized” data values have to be injected into the binary image during code compilation or deployment.
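As a rough illustration of what such an injection step could look like, the following Python sketch derives a per-instance key and patches it into a binary image at deployment time. Everything named here is an assumption made for the example: the `KEY_SLOT` placeholder, the `derive_instance_key` and `personalize_image` helpers, and the choice of HMAC-SHA256 for key derivation. It is a minimal sketch, not a complete scheme; a real deployment would also need to protect the master secret and the embedded key itself.

```python
import hashlib
import hmac

# Hypothetical 32-byte marker that the build reserves inside the binary image.
KEY_SLOT = b"KEYSLOT!" * 4


def derive_instance_key(master_secret: bytes, instance_id: str) -> bytes:
    """Derive a 32-byte key that is unique to one application instance.

    Assumption: HMAC-SHA256 over the instance ID, keyed with a vendor-held
    master secret. Other key derivation functions would work as well.
    """
    return hmac.new(master_secret, instance_id.encode("utf-8"), hashlib.sha256).digest()


def personalize_image(image: bytes, instance_key: bytes) -> bytes:
    """Return a copy of the image with the key slot replaced by the instance key."""
    if len(instance_key) != len(KEY_SLOT):
        raise ValueError("instance key must be the same length as the key slot")
    if image.count(KEY_SLOT) != 1:
        raise ValueError("image must contain exactly one key slot")
    # Same-length replacement keeps every other offset in the image unchanged.
    return image.replace(KEY_SLOT, instance_key)


if __name__ == "__main__":
    master = b"vendor-master-secret"  # illustrative value only
    image = b"\x7fELF" + b"\x00" * 8 + KEY_SLOT + b"\x90" * 8  # stand-in for a real binary
    for device in ("device-001", "device-002"):
        personalized = personalize_image(image, derive_instance_key(master, device))
        print(device, hashlib.sha256(personalized).hexdigest()[:16])
```

Because each image carries a different key, a key extracted from one device's copy does not unlock data protected by another device's copy.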

Code diversification is a much more sophisticated and robust (and usually more costly) protection against class breaks than data diversification. With this method, the binary instructions themselves vary between different instances, or between separate sets of instances. Code diversification is typically a result of applying in-house or vendor-supplied tamper-resistance techniques. These may include code obfuscation, instruction set randomization, integrity protection, anti-debug and anti-dumping techniques, code signing, or virtualization. In most cases, for the sake of performance and simplicity, it is enough to diversify just the sensitive parts of the program code (such as the cryptographic routines), but in other cases the protection can benefit from diversifying the whole executable.
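To make the idea of instruction-level variation concrete, here is a toy Python model. It is only a sketch under stated assumptions: the three-operation accumulator "instruction set", the `run` interpreter, and the `diversify` function are all invented for this example and do not correspond to any real obfuscation tool. The diversifier, seeded by an instance ID, swaps instructions for equivalent ones and inserts junk pairs that cancel out, so the instruction sequence changes while the computed result stays the same.

```python
import random


def run(program, x):
    """Interpret a toy program: a list of (op, operand) pairs acting on one accumulator."""
    acc = x
    for op, n in program:
        if op == "add":
            acc += n
        elif op == "sub":
            acc -= n
        elif op == "xor":
            acc ^= n
        else:
            raise ValueError(f"unknown op: {op}")
    return acc


def diversify(program, instance_id):
    """Return a semantically equivalent variant of the program for one instance.

    The instance ID seeds the random generator, so the same ID always
    produces the same variant.
    """
    rng = random.Random(instance_id)
    variant = []
    for op, n in program:
        choice = rng.randrange(3)
        if op == "add" and choice == 0:
            # add n  ==  sub -n
            variant.append(("sub", -n))
        elif op == "add" and choice == 1:
            # add n  ==  add part; add (n - part)
            part = rng.randrange(1, 1000)
            variant.extend([("add", part), ("add", n - part)])
        else:
            variant.append((op, n))
        if rng.random() < 0.5:
            # Junk pair: XOR with the same value twice leaves the accumulator unchanged.
            junk = rng.randrange(1, 256)
            variant.extend([("xor", junk), ("xor", junk)])
    return variant


if __name__ == "__main__":
    base = [("add", 5), ("xor", 12), ("add", 7)]
    for instance in ("instance-A", "instance-B"):
        variant = diversify(base, instance)
        print(instance, variant, run(variant, 100) == run(base, 100))
```

Real code diversification works on machine instructions and uses far stronger transformations, but the principle is the same: an attack script that relies on one instance's exact instruction layout will typically not match another instance.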

In our next blog post, we’ll share some challenges and best practices in employing software diversification to effectively protect security applications.