source: embedded.com

Security has become a hot global topic. Banks make a point of advertising the security of their online services. Credit cards boast built-in security chips. The press regularly reports on computer vulnerabilities and attacks wreaking havoc among unsuspecting computer users. Even more disturbing, we hear news of hackers penetrating highly sensitive financial and government institutions. There is an ongoing war between the "good guys," trying to bring to life the promise of computer and Internet technology, and the "bad guys" using the same technology for nefarious purposes.

I too was recently hit by a "phishing" attack. I received an e-mail from Bank of America Military Bank Online (or so it seemed). The e-mail spoke of a modification to my account that, while innocuous, still required that I log in to acknowledge the change. I browsed the web site; it appeared genuine.
However, I never served in the military so I reported the incident to Bank of America and was told they had received similar reports. The (authentic) Bank of America web site had an FAQ on these types of attacks, including screen shots to help spot phishing. Amusingly, a common indicator is spelling errors.

While my big phishing story was merely fodder for dinner conversation (pun intended), it reminded me of the great need for constant vigilance. Concern and curiosity led to a little research. I asked around my workplace for similar experiences and received numerous stories of stolen electronic identities, malware, more phishing attempts, and so on.

I checked into some of the security tracking sites and discovered some alarming statistics. I learned there have been over 800 successful hacker attacks on the Department of Homeland Security. According to CERT, in 2007 there were over 7,200 network vulnerabilities logged.

Many of these vulnerabilities can be prevented; whether they are depends largely on how computer information systems are designed and implemented. The problem is that developing and deploying highly secure systems is not easy, especially given our dependence on legacy systems that have been around for years and were never designed with security in mind.

But, in this age of innovation and automation, can’t we invent something to solve these problems? If computers can fly (and land) airplanes, drive cars, and fix a kitchen oven via the Internet, can’t they help ensure that the software we develop is secure?

Yes, they can.

One promising solution comes from a relatively new branch of software development tools called static source analysis. In a nutshell, a static source analyzer is a tool that examines software source code and looks for defects that can result in security holes.

Much of the world’s software uses programming languages – such as C, C++, and Java – that are a double-edged sword: while powerful and flexible, these languages also permit programmers to shoot themselves in the foot.

Many of my fellow engineers regularly wrestle with a slew of programming pitfalls: "uninitialized variables," "buffer overflows," and "race conditions," to name a few.

The good news is that static analyzers can catch these flaws and report them to the developer before the product is ever deployed. That is great stuff, you might think, but how does this help make a computer more secure? First, let’s look at what makes systems vulnerable.

Hacking for Dummies
If you believe the stuff you see in the movies, you might think that hackers are a bunch of non-conforming geeks with green spiky hair who spend most of their time hanging out at techno dance clubs. Oh yeah, and in their spare time, they break into highly secure systems mostly to play pranks or perhaps to "borrow" a few bucks to support their whirlwind lifestyle. The reality is a bit different.

While there is a fair share of rebels among hackers, many belong to organized, well-funded groups. They spend weeks developing ways to break into a particular system. Much of the work is drudgery, poring over thousands of lines of source or machine code. With the increasing use of open source software, much of the code is readily available, making the hackers' jobs much easier.

While less glamorous than what Hollywood studios portray, the hacker's labor can result in something far more sinister than what you see on the big screen. Instead of pranks, hackers are targeting the nation's power grid, the financial industry, and the most sensitive military secrets.

An attacker tries some or all of the following:

Find a Back Door: the hacker will look for a way to get into a system through non-traditional means, for instance pretending to be another computer instead of a user. Once a hacker breaks in, they may download a program that creates a password for later use. In the process, the hacker hides evidence of the break-in.

Know the System Better than the Developer: since large systems are often implemented by hundreds or even thousands of developers, portions of the implementation will be less understood than others (too many cooks, so to speak). A hacker doesn't need to understand the entire system, only the most vulnerable parts.

Look for Paths Less Traveled: every application or system has components that are exercised frequently and others that execute only rarely. The 80/20 rule usually applies: 20% of the code runs 80% of the time.

That means 20% of the code is probably more reliable and secure. Conversely, the remaining 80% of less-traveled code could be riddled with security holes that go undetected for years. When it comes to security, the aphorism "a chain is as strong as its weakest link" couldn't be more appropriate: all the hacker needs is one way in, and a path rarely taken is a fertile place to look.

Time It Well: the hacker who succeeds in breaking in will often get only one chance to strike before the attack is detected. Therefore, the attack must be carefully timed.

Figure 1: Static source analyzer showing how two separate lines of code, when combined, form a defect. Each line by itself is correct, but the combination of the two produces a possible NULL pointer dereference defect. Source: Green Hills Software
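
The pattern the figure describes might look roughly like the following sketch (hypothetical code, not the figure's actual example), in which one line lets a pointer be NULL and a later line dereferences it without a check:

#include <stdlib.h>
#include <string.h>

struct packet { char payload[64]; };

void handle(struct packet *p)
{
    struct packet *copy = malloc(sizeof *copy);  /* line 1: malloc may return NULL   */
    memcpy(copy, p, sizeof *copy);               /* line 2: dereferences copy anyway */
    /* ... */
    free(copy);
}

Each line is unremarkable on its own; an analyzer that tracks the possible NULL return of malloc through to the memcpy call flags the combination as a possible NULL pointer dereference.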

Hackers Meet Their Match
When it comes to the grunt work of understanding source code and looking for weaknesses, hackers have a bag of tricks known to expose vulnerabilities.

Static source analyzers (Figure 1, above) are designed to detect exactly the scenarios that are likely to cause vulnerabilities, revealing them to the programmer before the system is released. By running the code through a static analyzer, hackers lose their one chance to strike. In effect, an ounce of static analysis prevention is worth a pound of cure against a successful attack.

These are some of the hacker's all-time favorites, and the ways static analyzers battle them:

1) Buffer Overflow: Whenever a system needs to store a piece of data, like a user name, it needs to allocate a piece of memory, or a buffer. When the buffer is being filled with data (for instance, a user types in a login name), it is easy to go beyond the pre-allocated amount of memory if the programmer isn't careful.

That would make the system start reading or writing over some other data (like the data that tell the system which code to execute next). In this way, crafty hackers can insert a program of their own into the system. If their program gets access to the system, they can create new passwords, and in the process hide the evidence of their break-in.

Solution: static source analyzers are able to easily catch buffer overflows. They track the amount of allocated memory and look for accesses into the buffer. When they detect a buffer access that is beyond the initially allocated memory, static analyzers report an error. This way the error is fixed well before the system is released and exposed to hacker attacks.
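
As an illustration only (hypothetical code, not taken from any real system), the classic pattern looks something like this: a fixed-size buffer filled from input whose length is never checked.

#include <stdio.h>
#include <string.h>

void store_login(const char *input)
{
    char name[16];        /* pre-allocated buffer: room for 15 characters plus '\0' */
    strcpy(name, input);  /* no length check: longer input overruns name and        */
                          /* overwrites whatever data lives next to it              */
    printf("Hello, %s\n", name);
}

A bounded copy, such as strncpy(name, input, sizeof name - 1) followed by explicitly terminating the string, keeps the write inside the allocated buffer; a static analyzer flags the unbounded strcpy above as a potential buffer overflow.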

2) Broken Invariant (e.g. uninitialized variable): In languages like C and C++, variables are the constructs that hold the state a system uses to make decisions. A program can have thousands of variables, and each one needs to be explicitly given an initial value by the programmer.

If a programmer fails to give a variable an initial value that makes sense to the system, the system may behave unpredictably. If the system is behaving in unpredicted and untested ways, unforeseen security holes can quickly open up.

Solution: static analyzers keep track of when variables are initialized and with what values. They can also distinguish whether a variable is used directly for its value (in which case the value itself needs to be something expected) or as a pointer to an object that describes some more complex state (in which case both the variable and the object it points to need to be something expected), and they can check for either case.
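
A minimal sketch of the kind of broken invariant an analyzer tracks (hypothetical code; check_credentials is an invented helper used only for illustration):

#include <stdbool.h>

/* hypothetical helper, assumed for illustration only */
static bool check_credentials(int user_id) { return user_id == 42; }

bool user_is_authorized(int user_id)
{
    bool authorized;  /* never given an initial value */

    if (user_id > 0)
        authorized = check_credentials(user_id);

    /* when user_id <= 0, the return below reads authorized uninitialized:
       whatever garbage happens to be on the stack decides whether access
       is granted */
    return authorized;
}

An analyzer that follows every path through user_is_authorized reports that the return statement can be reached with authorized still uninitialized.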

3) Resource Leak: Ideally, a system is supposed to run forever. As one of its fundamental functions, a system grants resources to the applications running on it. Resources like memory, disk access, and access to media ports are limited by the hardware that's installed.

When resources are no longer used by an application (e.g. an application releases a resource or gets terminated), the system needs to "recycle" them for the next application that requests them.

If a programmer fails to instruct the system to reclaim these resources, a hacker can create a vulnerable scenario where, for instance, more and more memory is allocated until the memory pool is entirely exhausted. This often leads to "denial of service" attacks, because the system is busy looking for resources that are simply not available.

Solution: Static analyzers can be customized to understand the types of resources a system can grant. Systems will have different ways of allocating, using, and reclaiming resources, also known as the API (Application Programming Interface).

A static analyzer can ensure that a program’s use of a system resource can only occur after a resource has been allocated through a special API set of instructions. Also, a static analyzer can make sure the program releases the resource through the API instructions once it is no longer going to use it, thus preventing a resource leak.
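
A brief sketch of the pattern (hypothetical code, using the standard C file API as the resource): a resource acquired through one call must be released through its counterpart on every path.

#include <stdio.h>

int read_config(const char *path)
{
    FILE *f = fopen(path, "r");  /* resource allocated through the API */
    if (f == NULL)
        return -1;

    char line[128];
    if (fgets(line, sizeof line, f) == NULL)
        return -1;               /* early return leaks f: fclose is never called */

    fclose(f);                   /* released on the normal path only */
    return 0;
}

An analyzer taught that fopen allocates and fclose releases reports the early-return path as a leak; a hacker who can force that path repeatedly eventually exhausts the pool of open files.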

4) Stack Overflow: This vulnerability is much harder to detect, and can be especially dangerous in today's multi-threaded systems. A function stack is a piece of local memory that a program must have pre-allocated in order to run successfully.

The biggest problem with function stacks is that the programmer must know how much memory to allocate when a new thread is created. Accurately computing that number can be difficult. In practice, most programmers make an educated guess of how much memory is needed (and sometimes throw in a few extra kilobytes, just for good measure).

If this sounds like the wild west of programming, that's because it is. If programmers get this value wrong (i.e. too small), the function stack of one thread of execution may corrupt the function stack of another and cause subtle, hard-to-reproduce bugs that are nearly impossible to find. Worse yet, they often slip through test and QA processes, making their way into the released code.

Solution: until recently, static analyzers had no way of battling this issue. What made it impossible to detect was that nothing is wrong with the source code itself; the problem lies in the interaction between the compiler (the tool that translates source code into machine code) and the way a program is structured to execute. The solution comes in a new design of static analyzers that integrates them into the compiler. This synergy of static analyzers and compilers results in a whole new class of problems that can be detected.
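
A hedged sketch of the situation (hypothetical code, using POSIX threads purely as an example threading API): the guessed stack size and the function's actual worst-case usage live in different places, and neither looks wrong on its own.

#include <pthread.h>
#include <string.h>

#define GUESSED_STACK_SIZE (16 * 1024)  /* an "educated guess" of 16 KB */

static void *worker(void *arg)
{
    char scratch[32 * 1024];             /* 32 KB of locals: more than the    */
    memset(scratch, 0, sizeof scratch);  /* entire stack the thread was given */
    (void)arg;
    return NULL;
}

int start_worker(void)
{
    pthread_attr_t attr;
    pthread_t tid;

    pthread_attr_init(&attr);
    pthread_attr_setstacksize(&attr, GUESSED_STACK_SIZE);
    return pthread_create(&tid, &attr, worker, NULL);
}

Nothing in worker by itself is incorrect; only a tool that, like the compiler, knows the worst-case stack depth of worker and can compare it against the size passed to pthread_attr_setstacksize will catch that the thread's stack is too small.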

Next in Part 2: Implementing Security

As a Director of Engineering at Green Hills Software, Nikola Valerjev is responsible for managing teams that plan, design, and develop new products, including the DoubleCheck static source analyzer. He also manages teams that evaluate new and existing solutions from the user perspective. Mr. Valerjev holds a Bachelor of Science and a Master of Engineering degree in computer science from Cornell University. He has been with Green Hills Software since 1997.