The Art of Scoping Application Security Reviews (Part 2) - The Types of Testing
Part 2 of my mini-series dedicated to scoping application security reviews focuses on the different types of testing. Not surprisingly it follows part 1. Specifically this second installment describes the main types or techniques for testing the security of applications, and discusses the advantages, disadvantages and considerations for each technique while hopefully providing useful tips aimed at corporations consuming testing services and consultants delivering testing services.
As with the last article the focus is on web applications and line of business type applications but much of the content will be relevant to ISVs as well. As a recap the planned parts in this mini-series are as follows:
· The Art of Scoping Application Security Reviews (Part 1) - The Business
· The Art of Scoping Application Security Reviews (Part 2) - The Types of Testing
· The Art of Scoping Application Security Reviews (Part 3) - Threat Modeling and Architecture Reviews
· The Art of Scoping Application Security Reviews (Part 4) - Penetration Testing
· The Art of Scoping Application Security Reviews (Part 5) - Code Reviews
Some of the text of this post is taken from a previous IEEE article on web security testing tools (written with software security guru Rudolph Araujo). He has the security genome BTW!
It is worth noting that this series is only focused on testing software and not testing the process to build software. The latter will be covered in an upcoming SDL series. It is also worth pointing out that this is far from a conclusive list or description of all types of testing, but a mere overview.
Topics covered in this post include:
- Why Do Different Types of Testing Matter?
- Web Application Penetration Testing
- Static Source Code Analysis
- Dynamic Analysis
- Design, Architecture and Threat Modeling
- What Types of Techniques Should You Use and When?
Why Do Different Types of Testing Matter?
Let’s first take a step back. Application security vulnerabilities can fit a common issues taxonomy that you can use to build an ordered set of test cases or criteria. Such a security framework is helpful for analysis and reporting because it promotes consistency and order. It does this by ensuring that you’re comparing apples to apples and oranges to oranges when you analyze vulnerability classes and applications. We developed such a framework at Foundstone, which also scales beyond Web applications to desktop and client-server type applications. It worked for us but that’s not really the important point here. What is important is having a consistent taxonomy of issues. For reference, our framework was as follows:
- Configuration Management: infrastructure issues, including the configuration of the Web server, application server, and runtime technology (.NET Common Language Runtime or Java Virtual Machine).
- Authentication: user and entity authentication protocol and process issues.
- Authorization: issues related to identification information that the system uses to authorize decisions or time-of-check and time-of-use design flaws.
- Data Protection: issues related to how the system protects sensitive data when it’s stored or transmitted between system components or across the Internet from the user to the system.
- User and Session Management: issues related to how the system manages users through such things as password reset mechanisms and session management schemes.
- Data Validation: problems arising when the user or system components accept malicious data.
- Error Handling and Exception Management: issues related to how the system handles security-related errors and exceptions.
- Auditing and Event Logging: issues resulting from auditing and event logging mechanisms (or the lack thereof).
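To make the "consistent taxonomy" idea concrete, here is a minimal sketch of how findings could be grouped under the eight categories above so that every report comes out in the same order. The category names are taken from the list; the function name and the sample findings are hypothetical placeholders for illustration, not the original Foundstone checklist.

```python
# Category names come from the framework above. Everything else here
# (function name, sample findings) is a hypothetical illustration.
TAXONOMY = [
    "Configuration Management",
    "Authentication",
    "Authorization",
    "Data Protection",
    "User and Session Management",
    "Data Validation",
    "Error Handling and Exception Management",
    "Auditing and Event Logging",
]


def group_findings(findings):
    """Group (category, title) pairs under the taxonomy, in taxonomy order."""
    report = {category: [] for category in TAXONOMY}
    for category, title in findings:
        if category not in report:
            # Reject anything outside the taxonomy so reports stay comparable.
            raise ValueError(f"unknown category: {category}")
        report[category].append(title)
    return report


# Made-up findings, in the order a tester might have stumbled on them.
findings = [
    ("Data Validation", "Reflected XSS in search box"),
    ("Authentication", "No account lockout"),
    ("Data Validation", "SQL injection in login form"),
]
report = group_findings(findings)
```

Because every category is always present, even when empty, two reports for two different applications line up section by section, which is the apples-to-apples comparison described above.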
As this framework shows, there is a breadth of possible security vulnerabilities that must be addressed, and no one tool or technique can effectively analyze them all. This is why different types of testing matter! Some issues, such as configuration management, are typically introduced at deployment and only visible during runtime or through technology with unabridged infrastructure access. Other issues, such as authorization, require a detailed understanding of the system’s correct and incorrect behavior. While some techniques might help you discover that your Web server lacks security patches (configuration management), they’d likely be unable to determine if that server is on a shared host that hosts a high-profile attack target.
In England, the phrase “horses for courses” means that different racehorses are better on certain race courses. The same is true for selecting tools and techniques: You must match the tools and techniques to the environment. Different techniques matter!
Web Application Penetration Testing
Web application penetration testing (often referred to as black box testing) mimics the actions of an attacker. Simply put, the aim is to find the holes in the same way a bad guy does. The tester is likely to have no more or less visibility and access to the application than a regular user; that is to say he may have accounts on the system but is unlikely to have access to the source code (unless it’s an open source app of course). Tools of the trade are a web browser and a proxy to intercept, inspect and manipulate the traffic, and possibly the new breed of automated web application scanners. While these tools may seem attractive (“press the shiny red button and away it goes”), anyone who understands the vulnerability framework above and the nature of bugs and flaws will also understand they typically find a small percentage (10%-25%) of the average issues in a web site.
Advantages - Web application penetration testing is relatively straightforward to set up. Here’s the starting URL, have at it! There is no source code to ship and apart from account provisioning (if needed) things are generally simple. Many traditional security folks who came from network backgrounds feel comfortable with penetration testing as it’s familiar. It’s also true that web application penetration testing can be an efficient and cost-effective way to find some types of bugs; reflected cross-site scripting is an example among presentation layer issues.
Disadvantages - We have already mentioned that the automated tools perform poorly. There are several reasons for this. First, static Web sites (brochure sites built from HTML) are designed to be spidered; that is, they’re designed so that search engines can easily index content. In contrast, most web applications consist of complex forms—such as wizards or open text boxes—that require human users to enter contextually relevant information. So, the tools’ first hurdle (which they fail) is site coverage, because client-side navigation technologies—such as JavaScript, AJAX, Flash, and image maps—make it difficult to determine where a user might go. Without coverage, tools can’t build and execute suitable test cases.
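The coverage hurdle is easy to see with a toy spider. The sketch below is deliberately naive and is not how any particular commercial scanner works: it only collects `href` values from anchor tags using Python's standard `html.parser`. The page markup and URLs are made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Naive spider step: collect href values from <a> tags only."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical page: two brochure-style links plus a wizard step that is
# only reachable through client-side script.
PAGE = """
<a href="/about.html">About</a>
<a href="/contact.html">Contact</a>
<button onclick="window.location='/transfer/step2'">Next</button>
<form action="/transfer/step1" method="post"><input name="amount"></form>
"""


def spider_links(html_text):
    collector = LinkCollector()
    collector.feed(html_text)
    return collector.links


found = spider_links(PAGE)
```

The two static links are found, but `/transfer/step2` never shows up because it only exists inside a script handler. Even a smarter spider that parsed the form would still need a contextually sensible value for `amount` before it could get past step one.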
There are also limitations with human testing. If all you see is a black box then all you see is a black box! Certain types of issues would be incredibly difficult to stumble upon with a blindfold but painfully obvious without one. As an example, a manual penetration tester may find that an application appears to accept a classic XSS string. Without an understanding of the dataflow in the application it’s pure guesswork whether that XSS input is discarded or could reappear as a malicious string served up to another user.
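Here is a small sketch of that XSS example. Both classes are hypothetical stand-ins for an application: from the outside their `submit` response is identical, yet one throws the input away and the other stores it and later serves it, unescaped, to whoever views the page.

```python
class DiscardingApp:
    """Accepts the comment but never stores it."""

    def submit(self, comment):
        return "Thanks!"

    def view(self):
        return "<ul></ul>"


class StoringApp:
    """Accepts the comment and later renders it to other users."""

    def __init__(self):
        self.comments = []

    def submit(self, comment):
        self.comments.append(comment)
        return "Thanks!"

    def view(self):
        # No output encoding here: whatever was stored is served as markup.
        items = "".join(f"<li>{c}</li>" for c in self.comments)
        return f"<ul>{items}</ul>"


PAYLOAD = "<script>alert(1)</script>"
safe_app, risky_app = DiscardingApp(), StoringApp()

# What the black box tester sees at submission time is the same either way.
same_response = safe_app.submit(PAYLOAD) == risky_app.submit(PAYLOAD)
```

The difference only shows up in `view()`, which in a real system might be a page that belongs to a different user or an admin console the tester never gets to see. With the code in hand it takes seconds to spot.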
Gary McGraw says it best. If you fail a penetration test you know you have a really bad problem but if you pass a penetration test you don’t know that you don’t have a really bad problem.
Finally, if you find an issue using penetration testing, you typically have no idea where to fix it in the code without further investigation by the developers. Anyone who has been left with a penetration testing report and been told to fix issues will understand that this is not a trivial problem to gloss over.
We will be discussing how to scope web application penetration tests in detail in Part 4 of this series, including a tool we built at Foundstone called SiteScope.
Static Source Code Analysis
Static source code analysis is much like reading a book. An analyst or a tool has the source code and is able to read it, understand it and determine the security vulnerabilities that may be present. The analyst may or may not have all of the code, i.e. may or may not be able to compile and run it or have access to supporting services. He or she can read a chapter or the entire book, look at it in reverse or do whatever takes his or her fancy. Static analysis tools have been around for a long time and recently several commercial tools have started to emerge.
Advantages - It’s clear that if you have the code then you have a significant advantage over a black-box penetration test. You have the software DNA! With the source code there is no guessing, no estimating or supposing what may happen. Well, in theory of course; software is usually so big and complex that few people can mentally process an entire software program and comprehensively understand the entire execution environment. When you do find issues from source code analysis, you of course know exactly where the issue occurred and can usually extrapolate to determine if the same issues have been repeated elsewhere. Static analysis is good for finding flaws and bugs. This is important.
Disadvantages - It is hard to ignore the fact that there are fewer skilled people who are capable of code review. Whereas penetration testing mainly speaks one relatively simple protocol language, source code speaks many languages with many dialects (many frameworks). In practice it’s often hard to get a stable build and in my experience many people “forget” important code or libraries. We were once shipped 3 million lines of code and they “forgot” all the JSPs! Unless you can compile and set breakpoints in the code, simply staring at the text alone is often not enough. The static analysis tools are still severely limited in what they actually find and are prone to reporting false positives.
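A toy example shows both sides of static analysis: you get the exact line of every hit, and you get hits that are not real problems. This sketch uses Python's standard `ast` module to flag direct calls to `eval` and `exec`. The rule set and the sample code are my own illustration and bear no relation to how the commercial tools work.

```python
import ast

# Illustrative rule set only: flag direct calls to these built-ins.
RISKY_CALLS = {"eval", "exec"}


def find_risky_calls(source):
    """Return sorted (line, name) pairs for direct calls to RISKY_CALLS."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in RISKY_CALLS:
                hits.append((node.lineno, node.func.id))
    return sorted(hits)


# Hypothetical code under review. Nothing here is executed, only parsed.
SAMPLE = '''\
def calc(user_input):
    return eval(user_input)

def banner():
    return eval("1 + 1")
'''

hits = find_risky_calls(SAMPLE)
```

Both calls are reported with their line numbers. The first evaluates attacker-controlled input; the second evaluates a constant and is arguably a false positive. Telling the two apart needs data-flow knowledge that a pattern match like this does not have.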
We will be discussing how to scope source code analysis in detail in Part 5 of this series, including a tool we built at Foundstone called CodeScope.
Dynamic Analysis
Dynamic analysis looks at the source code and additionally how it executes. In general it is considered to be a far more detailed form of code analysis and is able to determine actual risks and eliminate false positives by following actual execution and not potential issues. Top flight hackers and security researchers have of course been using dynamic analysis for decades, often without access to the source code by operating on shellcode and intermediary languages.
Dynamic analysis attempts a far deeper analytical approach with source code. Essentially, it not only finds instances of bad coding practice, but also attempts rich control and data-flow traversal.
Advantages - Dynamic analysis can take code review one step further by examining what actually happens rather than what looks like it will happen. In the age of complex software this can reduce false positives (increase assurance) and uncover unexpected issues caused by complex interactions at runtime.
Disadvantages - While a new breed of dynamic analysis tools looks promising, manual dynamic analysis requires a high degree of skill. Analysts need to know their stuff! For the most part you also need to be able to compile and run the code, which is not always easy if you are, say, an online bank. In general dynamic analysis is often used to find bugs and is not always effective at finding design flaws. Dynamic analysis is also more mature in unmanaged code (C / C++) and less mature with managed code like .NET and Java. Most web apps are of course written using the latter!
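As a rough illustration of "what actually happens" versus "what looks like it will happen", the sketch below uses Python's `sys.settrace` hook to record which functions are really entered during a run. The application functions are hypothetical, and real dynamic analysis tools go much further than call tracing.

```python
import html
import sys


def trace_calls(func, *args, **kwargs):
    """Run func and return the names of the Python functions actually entered."""
    called = []

    def tracer(frame, event, arg):
        if event == "call":
            called.append(frame.f_code.co_name)
        return None  # no line-by-line tracing needed

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # always restore whatever was there before
    return called


# Hypothetical application code: a reviewer reading it sees an unescaped sink.
def unsafe_render(text):
    return "<p>" + text + "</p>"


def safe_render(text):
    return "<p>" + html.escape(text) + "</p>"


def handle(text, legacy=False):
    if legacy:
        return unsafe_render(text)
    return safe_render(text)


default_path = trace_calls(handle, "<b>hi</b>")
legacy_path = trace_calls(handle, "<b>hi</b>", legacy=True)
```

On the default path `unsafe_render` is never entered, so a static hit on it would be a false positive for that configuration. Flip the `legacy` flag and it becomes a real issue, which is exactly the kind of runtime context this technique brings.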
Design, Architecture and Threat Modeling
Threat modeling has become popular recently, in part due to Michael Howard and David LeBlanc’s excellent book Writing Secure Code and the work that has come out of Microsoft (PAG, ACE and SWI). Whether you use Microsoft’s approach or another more generic threat modeling approach, the technique guides you through the process of defining system components, entry and exit points (connectivity), and key security components and mechanisms. You then analyze potential threats and in so doing you get a clear architectural view of the system. As the chief security officer of a major financial service company often says, “Would you drive a car that was only front-impact tested? Of course you wouldn’t! So why only test the front impact of your Web applications?” I like to think that design and architecture are so closely tied to what threat modeling actually is that I struggle to separate them in the real world. They are all part of a review that doesn’t physically touch the application.
Design and architecture reviews and threat modeling are a compelling way to use your unfair advantage (knowing exactly how the system works) to determine issues.
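The "components and entry points" step can be sketched as plain data. This is a generic illustration and not Microsoft's or anyone else's formal method: the component names, the numeric trust levels and the flows are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    trust: int  # higher means more trusted; the values are illustrative


@dataclass(frozen=True)
class Flow:
    source: Component
    target: Component
    data: str


def entry_points(flows):
    """Return flows where data moves from a less trusted to a more trusted component."""
    return [f for f in flows if f.source.trust < f.target.trust]


# Hypothetical three-tier web application.
browser = Component("Browser", 0)
web = Component("Web server", 1)
db = Component("Database", 2)

flows = [
    Flow(browser, web, "login form"),
    Flow(web, db, "SQL query"),
    Flow(db, web, "account record"),
    Flow(web, browser, "HTML page"),
]

review = [(f.source.name, f.target.name, f.data) for f in entry_points(flows)]
```

The `review` list is the short list of places where untrusted data crosses into something more trusted, which is where I would start asking threat questions. Note that nothing here touches the running application; it is all worked out from knowing how the system fits together.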
Advantages - At Foundstone we estimated that we could predict about 75% of all issues from a design and architecture review when followed up with a threat model. Bang for buck I think this is the most cost effective way to spend testing dollars in many circumstances.
Disadvantages - When looking at the design and architecture it’s clear that the focus is on design flaws and not implementation bugs. It is also true that any conclusions drawn are theoretical and not necessarily real issues that exist.
What Types of Techniques Should You Use and When?
I have said it before in this article and I’ll say it again so my point is heard: some techniques are better than others at finding certain types of issues. Horses for courses! Some testers may be better at some techniques or even some specific parts of some techniques than others. Some may be faster, cheaper or more effective than others and so the answer to what to use when is simple. Select the right tool for the job. It never ceases to amaze me why people get so wrapped up in wanting to do a “pen test” or a “code review” rather than a wholistic security review when it is clearly better to use one technique for some things and another for other things.
My advice: work with the testers to understand what they will use for what and understand why they have made their decisions / recommendations. A good tester will want to use the best tool and best technique for the right reasons. Mix and match and get the best results. At the end of the day it’s the blend of tools and techniques that will be effective in the organization, or given the current resources, that matters. While most people will agree that source code review is where it is at, it’s probably better to leverage skilled penetration testers if that’s all you have than to ask mechanics to do brain surgery or brain surgeons to fix cars. I think it also stands true that a really good pen tester and a really good code reviewer will probably find more than two of any one discipline.
The next part of this series will specifically look at Design, Architecture and Threat Modeling. In fact to be more accurate it will discuss how threat modeling can be used to scope and reduce technical testing.
September 4, 2007 at 6:08 am
It is worth noting that this series is only focused on testing software and not testing the process to build software
What do you mean by “testing the process to build software”? Do you instead mean “testing/inspecting software before/during/after the build process”?
Web Application Penetration Testing
While these tools may seem attractive (“press the shiny red button and away it goes”) anyone who understands the vulnerability framework above and the nature of bugs and flaws will also understand they typically find a small percentage (10%-25%) of the average issues in a web site
This is in line with what Chess/Kureha presented at BlackHat Federal 2007. Kureha showed, using Fortify Software Tracer, that commercial web application vulnerability scanners only reach up to 30-something percent code coverage in total (across all tests and tools). Of course, this was only for semantic flaws (i.e. your “bugs”), precluding logical flaws (i.e. your “flaws”).
Jeremiah Grossman has also tried to demonstrate the capabilities of these scanners on his blog using speed dials vs. the OWASP Top Ten 2007… although his statistical data was hidden behind the marketechture.
So, the tools’ first hurdle (which they fail) is site coverage, because client-side navigation technologies—such as JavaScript, AJAX, Flash, and image maps—make it difficult to determine where a user might go. Without coverage, tools can’t build and execute suitable test case
I see these as two separate issues:
1) The functional coverage (pages, links, elements) which is a “web crawler” computational intelligence problem (e.g. Heritrix - crawler.archive.org). Romain Gaucher discussed a possible solution on his blog about website functionality coverage, and Chess/Kureha again covered this topic at BlackHat USA recently during their Iron Chef presentation
2) Testing driver support by the tool. Protocol drivers, such as HTTP support, are typically the only driver available in these scanners. First found in ELZA, and later in QA testing tools such as twill, protocol drivers are the primary method of web application vulnerability finding from a pen-test tool perspective. Some scanners include application drivers (usually Javascript only, but include Ajax) such as Hailstorm and WebInspect. In the QA world, compare Selenium (an application driver) or Watir (a browser driver). I guess the OWASP CAL9000 tool is an example of a vulnerability scanner that provides a browser driver, albeit only for XSS.
Static Source Code Analysis
In practice it’s often hard to get a stable build and in my experience many people “forget” important code or libraries
Stable builds are much easier to get if the org uses continuous integration with a build server (e.g. CruiseControl or LuntBuild).
Missing includes is a problem that should be built into both the IDE and the SCM. For example, using the JDT in Eclipse, a developer can set “missing includes” from warnings to errors. Once the developer updates the SCM (e.g. CVS or Subversion) by checking in code, these warn vs. err requirements will become a project dependency, requiring all developers to match their includes. Fortunately, tools such as IvyDE (an Eclipse plug-in) make this much easier. For code reviewers, tools/techniques like these can really speed this process up.
Dynamic Analysis
Dynamic analysis is also more mature in unmanaged code (C / C++) and less mature with managed code like .NET and Java
I’m not sure what you mean/intend by saying this. Managed code can be decompiled, thus allowing for static code/bytecode analysis. I’ll take FxCop or FindBugs analysis over IDA + PaiMei hybrid analysis any day of the week (because it requires a day instead of a week)!
There will be immense benefit to both fat and web application testing as the hybrid/dynamic tools and techniques emerge. You should read the DDJ article by Amini, Greene, and Sutton on “Requirements for Effective Fuzzing” which is really a chapter in their book, “Fuzzing: Brute Force Vulnerability Discovery”.
The idea here isn’t to compare tools or have code coverage measurements available post-test like Chess/Kureha did. The point is to use coverage (i.e. Fuzzer Tracking) to enhance heuristics (protocol dissection, proxy fuzzing, genetic algorithms, etc) in order to increase time-between-findings. Additionally, they bring up non-coverage issues referred to as “Intelligent Fault Detection”, which covers material such as Fuzzer Stepping, stack unwinding, and dynamic binary instrumentation (DBI) to detect faults before they occur. Maybe you’re correct in that some of these do not apply to .NET or Java as they do with regards to C/C++.
Design, Architecture and Threat Modeling
Whether you use Microsoft’s approach or another more generic threat modeling approach, the technique guides you through the process of defining system components, entry and exit points (connectivity), and key security components and mechanisms
I’ve been looking at X.805 for threat-modeling lately because of its inclusion of a concept of “planes” (control, data, management). It reminds me of Cisco days, and much of it is more infrastructure than application focused (although some of the same concepts apply). Outside of the Microsoft and Trike models (and basic attack-trees) there isn’t a lot out there on-topic. The best Microsoft books seem to cover the topic differently - with Howard’s original to Frank/Window’s to Howard’s most recent books on the SDL and Vista.
For web application security threat-models, the OWASP material is good, as is a presentation I saw by Ivan Ristic (Breach Security) - although it was also very infrastructure focused.
What Types of Techniques Should You Use and When?
It never ceases to amaze me why people get so wrapped up in wanting to do a “pen test” or a “code review” rather than a wholistic security review when it is clearly better to use one technique for some things and another for other things
It ceases to amaze me that people want to do review after review, quarter after quarter, year over year - for the same clients. Why allow these [helpless?] organizations to continue to make the same mistakes? In your first part of this series, you mentioned the business aspect about submitting defects into an issue tracking system instead of providing a report that is likely to sit on a desk and collect dust. I say go even further!
What about a strategic “wholistic security” solution?
September 4, 2007 at 5:19 pm
[...] just read an excellent post by Mark Curphey on “The types of testing,” part 2 in his 5 part series on “The Art of Scoping Application Security [...]