** Can Security Audits Be Automated?

Automated scanning over the Internet is no substitute for manual penetration tests but it can be of considerable use to hard-pressed IT security managers. Stephen Bishop looks at the issues.

* Background
Regular network security testing is more or less a given for any organisation with an Internet presence. But, as in many other technical areas, there can be a conflict between automated and manual approaches. Few would deny that a full manual audit by a team of experienced professionals will give the best results, but this can be very time-consuming and expensive; an automated scan might be seen as second best by the purists but it can be much easier to fit into budgets.

In fact, one of the issues with allocating budgets for penetration tests is deciding whether to carry out one intensive test per year or commission a series of smaller checks throughout the year, run monthly or even weekly. This leads on to the difficult issue of timeliness: commissioning full manual tests takes considerable effort, and many organisations find it difficult to keep to a fixed schedule of, say, six-monthly tests - especially when the target systems themselves may be undergoing changes to meet business demands.

* Manual Testing
A typical penetration test is carried out by a small team with a carefully defined brief, usually following an established methodology, even if this is not formally documented. The task may be split into several phases of exploration, vulnerability testing, analysis and review. Most practitioners will use a number of tools plus various manual techniques for further exploration of potential flaws. At the end of the project the client will expect a written report, with a short summary and various levels of detail.

Manual testing is undoubtedly more thorough than an automated probe and can be tailored to clients' needs.
Experienced practitioners can draw on their own experience and change the emphasis of their testing according to the type of system under test: in other words, they can follow their nose to a certain extent even within a defined methodology. This is particularly true for application-level testing. Automation can help in mapping web sites in great detail, and in carrying out some of the repetitive manipulation, but uncovering intrinsic flaws in the specific application generally relies on intensive manual effort.

* Automated Testing
There are various forms of automated testing services available, but they are typically offered via subscription and run remotely across the Internet, with their findings presented in a series of management and technical reports. The client can specify the target set and the timing of the test runs, but otherwise has little influence over the process. Such tests are usually carried out much more frequently than manual testing, perhaps monthly rather than yearly. This, of course, means that any unforeseen changes in services offered to the outside world will be picked up sooner.

However, the absence of human cross-checking brings with it a risk of false positives appearing in the results: in other words, flagging apparent vulnerabilities that are not actually present. The developers of automated scanning systems put a lot of effort into mechanisms that cross-check any flaws found, and will usually have a quality control process to make sure that any anomalies are kept out of future runs, but individual reports from automated scans must still be read with some caution.

On a related note, an automated system is less likely than a skilled penetration tester to give useful threat levels for any vulnerabilities found. The tester will always have a better understanding of the overall context and be able to adjust the output of tools to match the realities of the service offered.

* Regular or On-Demand?
One way to categorise the various automated scanning offerings is to distinguish between regular and on-demand services. The former run on a fixed schedule, such as 03:00 every Tuesday, while the latter are set in motion by the client as required.

The regular scan approach provides a form of system monitoring, so that unforeseen events can be caught. Assuming that successive sets of results are kept, this also allows for trend analysis and historical views, possibly helping with forensic analysis going back months or even years. This is perhaps of most benefit to security teams and other specialists who oversee a large address space but have no direct responsibility for the systems in that space. The appearance or disappearance of, say, a mail server will be made apparent fairly rapidly: it may be that this is an unofficial change that would otherwise have gone unnoticed.

On-demand or user-driven scans are more suitable for specific investigations, perhaps in support of a wider audit or initiated by a security scare. For example, a scan may be run after a remedial configuration change that does not justify the expense of a repeat penetration test. Or, perhaps, a significant reworking of an Internet gateway might require a series of scans run at short notice in order to ensure that any exposure of vulnerable systems is minimised.

* Confidence Ratings
Users can benefit from a system of confidence ratings applied to any vulnerabilities reported. These show the likelihood that a putative hole actually exists, based on the technique used by the check that caused it to be flagged. For example, if a check carries out an actual exploit and retrieves, say, the target system's password file, then the confidence level would be very high, close to 100%. (It is not possible to award a full 100% to any test, as there is a small but finite chance that the target system is a "honey pot" or other device set up to give deliberately misleading results to intruders.)
On the other hand, a vulnerability deduced from a software version number or other circumstantial evidence should be given a lower confidence level, although it is still of value. This is because many systems get patched for security purposes without a full version upgrade being applied: the vulnerability disappears but the version string remains unchanged. In the strictest sense a confidence level should be based on statistical analysis that determines how many false positives a check throws up when run against a significant number of real-life targets; in practice it is acceptable to use a value like 75% to show that the check's effectiveness lies between evens and a dead cert.

Another class of checks comprises those that send anomalous traffic to a service and then look to see if it is still active afterwards: if the service no longer responds this implies that it is vulnerable to a denial-of-service attack. Unfortunately, this method can often fall foul of load balancing, intrusion protection systems (IPSs) or even the inherent unreliability of the Internet itself. These results should be given a lower rating to indicate that they are worth noting but need further investigation before any remedial work is set in motion.

There is, of course, a tricky issue here for any commercial offering. The user of a service may well expect all its checks to give clear-cut answers all of the time, even if this is somewhat naive. So putting a confidence level of anything less than 100% against a check may be viewed negatively, although that check may be using the best available technique to test for a given vulnerability. In the outside world, however, we accept such qualification all the time - many medical screening checks have a relatively low confidence level but provide a useful means of determining whether a more thorough examination is needed.

* Quality Control
In the longer term, the key to improving confidence levels lies in quality control.
This requires a careful analysis of all false positives to root out all incorrect assumptions, poor logic, systematic errors and out-of-date information - not to mention straightforward bugs in code and misleading documentation. Quality control is a continuous process of examining successive sets of results and making appropriate changes, and requires a significant commitment from the supplier.

Much of this work can be carried out in isolation - for example, checks that yield variable results when run against the same target are always suspect - but in some cases the feedback loop may have to involve the client. This is simply because a little inside knowledge can make a significant saving in the effort needed to analyse a system from the outside.

* Cross-Tool Consistency
Some services run a number of tools within a scan: they may present a single set of results, but they rely on a number of sources for the testing logic. The most obvious examples of these are systems that explicitly use a collection of public-domain and commercial probes to carry out the widest possible set of tests and then consolidate the results into a common format. But even with a system that uses only dedicated software there are likely to be significant differences in approach between the latest developments and checks that have been in the arsenal for a number of years.

The difficulty for the developers and maintainers of the service is to correlate the various output streams so that a consistent set of results can be presented to the user. Although efforts such as the Common Vulnerabilities and Exposures (CVE) dictionary aim to provide a single name for each known vulnerability, the developers have to work hard to avoid reporting the same issue more than once under different names, with all the confusion and irritation that this may cause the end user.
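
As a rough illustration of the correlation step just described, the following Python sketch groups findings from several probes by host, port and CVE identifier. It is only a minimal sketch under stated assumptions: the Finding record, the consolidate function, the probe names and the placeholder CVE identifier are all hypothetical and do not describe any particular product.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Finding:
    """One result from one probe (hypothetical record layout)."""
    tool: str                  # name of the probe that raised the finding
    host: str
    port: int
    name: str                  # the probe's own title for the issue
    cve: Optional[str] = None  # common identifier, if the probe supplies one


def consolidate(findings: List[Finding]) -> Dict[Tuple, List[Finding]]:
    """Group findings that refer to the same issue on the same service.

    Findings that share host, port and CVE identifier are treated as one
    issue. Findings without a CVE identifier are grouped only with
    identical reports from the same probe.
    """
    merged: Dict[Tuple, List[Finding]] = {}
    for f in findings:
        if f.cve:
            key = (f.host, f.port, f.cve)
        else:
            key = (f.host, f.port, f.tool, f.name)
        merged.setdefault(key, []).append(f)
    return merged


# Placeholder data: the CVE identifier below is not a real entry.
results = [
    Finding("probe-a", "192.0.2.10", 25, "Mail server overflow", "CVE-0000-0001"),
    Finding("probe-b", "192.0.2.10", 25, "SMTP buffer overrun", "CVE-0000-0001"),
    Finding("probe-b", "192.0.2.10", 25, "Banner reveals version"),
]
for key, group in consolidate(results).items():
    print(key, "reported by", sorted({f.tool for f in group}))
```

Grouping on a shared identifier in this way removes the most obvious duplicates from the consolidated results.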
It is important, however, not to over-simplify the results by merging two findings that represent distinct issues and should therefore be kept apart.

* Presentation and Reporting
Most security analysis tools offer a range of reporting styles, loosely classed as management and technical. The former are intended to present a very high level view of the security state, using colourful graphics and steering clear of details; the latter are aimed at providing evidence and suggestions for remedial work, and may report all findings down to the finest level of detail. This approach works well in large organisations where there are clearly defined layers of responsibility: where the "suits" have no direct involvement in system management and the "techies" are not expected to concern themselves with the overall security of the network. It does not always help smaller organisations or operational sub-divisions where the managerial and technical roles are merged and the required style of reporting is somewhere in between.

One approach to reporting is to apply the structure of the target to the report itself, using a hierarchy based on networks, hosts, (IP) protocols, (TCP and UDP) ports, applications accessed via these ports and, last but not least, vulnerabilities in these applications. This, of course, derives from the seven-layer (OSI) model of networking. Reports created in this way may appear somewhat austere, but they can reduce misunderstandings about the nature and value of the results: for example, in making it clear that although a TCP port can be opened, the presence of the service normally associated with that port is not confirmed until the relevant protocol has been used to elicit a characteristic response.

Users naturally welcome a selection of reports, but these should present different views of the data obtained by the scan, rather than versions of the same document that simply vary in the amount of detail included.
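
The idea of several views over one set of scan data can be sketched in Python as follows. This is a minimal illustration, not a description of any real service: the class and function names are hypothetical, the hierarchy is simplified by folding the protocol into the port record, and the addresses come from a range reserved for documentation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Port:
    protocol: str                      # "tcp" or "udp"
    number: int
    # Set only once the service has answered in its own protocol;
    # None means the port opened but the service is unconfirmed.
    application: Optional[str] = None
    vulnerabilities: List[str] = field(default_factory=list)


@dataclass
class Host:
    address: str
    ports: List[Port] = field(default_factory=list)


@dataclass
class Network:
    cidr: str
    hosts: List[Host] = field(default_factory=list)


def structural_view(network: Network) -> str:
    """Report that mirrors the target: network, host, port, vulnerability."""
    lines = [network.cidr]
    for host in network.hosts:
        lines.append("  " + host.address)
        for port in host.ports:
            service = port.application or "open, service unconfirmed"
            lines.append(f"    {port.protocol}/{port.number}: {service}")
            for vuln in port.vulnerabilities:
                lines.append("      " + vuln)
    return "\n".join(lines)


def vulnerability_view(network: Network) -> Dict[str, List[str]]:
    """Same data, different view: each vulnerability with the services affected."""
    affected: Dict[str, List[str]] = {}
    for host in network.hosts:
        for port in host.ports:
            for vuln in port.vulnerabilities:
                affected.setdefault(vuln, []).append(f"{host.address}:{port.number}")
    return affected


scan = Network("192.0.2.0/24", [
    Host("192.0.2.10", [Port("tcp", 25, "smtp", ["open mail relay"]),
                        Port("tcp", 8080)]),
    Host("192.0.2.11", [Port("tcp", 25, "smtp", ["open mail relay"])]),
])
print(structural_view(scan))
print(vulnerability_view(scan))
```

Both functions read the same records and neither alters them, so further views can be added without touching the scan results.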
It helps if the automated service is based on a well-defined data model, so that reporting really is distinct from the scanning phase.

As well as reports in the form of browsable and printable documents, some systems provide alarms that alert users to significant changes in services or vulnerabilities found. These can take the form of e-mail messages, syslog events or SNMP traps, although network security considerations may limit the usefulness of all but the first of these. As with any event reporting, setting thresholds can be a significant challenge, and the benefits of issuing alarms can be reduced if users receive so many that real security threats are obscured.

* Conclusions
Automated monitoring is not a substitute for real penetration testing but a useful aid to system management that can be used alongside and between manual audits. There are a number of offerings in this area but it is important for end users to choose a product that fits their way of working, with a style of reporting that helps them understand their exposure and fix any issues that arise.

* About the Author
Stephen Bishop has been working in computer security for more than 20 years, and has carried out many penetration tests for customers in a range of industry sectors, both public and private. He has a software development and networking background, and is the designer of IDsec's Superwalk service.

* About Us
IDsec is an independent company specialising in network security, and has provided penetration tests and intrusion detection systems since 1997. We can assess the security of your enterprise and advise on long-term protection, as we have for a range of blue-chip clients in the banking, telecoms, manufacturing and utility sectors.

IDsec Limited
31-33 College Road, Harrow, Middlesex HA1 1EJ, United Kingdom
T: +44 20 8861 2001
F: +44 20 8861 3433
W: www.idsec.co.uk
Copyright (C) 2008 IDsec Limited about/briefings/automated-audit.txt 20080715 (5.08)