ADVANCED WEB BASED HONEYPOT TECHNIQUES
Soda_Popinsky has very kindly allowed this tutorial of his to be
hosted on the TAZ.
You can find the original post here:
http://www.antionline.com/showthread.php?s=&threadid=269669
Advanced Web Based Honeypot Techniques
by Soda_Popinsky
Links
http://ghh.sourceforge.net
https://sourceforge.net/projects/ghh/
GHDB operated by johnny.ihackstuff.com
Background
The GHH project develops web based honeypots designed to lure "Google
Hackers", attackers who use search engine queries to find vulnerable
targets. It also provides tools and documentation that let others
develop customized honeypots, which decreases the exposure of
vulnerable applications in the Google index.
Recommended Reading:
http://tazforum.thetazzone.com/viewtopic.php?p=6084#6084
Overview
This tutorial expands upon extension spoofing and transparent linking,
and shows how to apply them when creating customized web based
honeypots. The v1.1 honeypots and documentation released by GHH are
used as a reference throughout.
Spoofed file extensions
While browsing through the Google Hacking Database (GHDB), you will
notice that not all of the signatures target server side scripts (.php
files, for example). Take this hack:
inurl:passwd.txt
That hack searches for files with the .txt extension. The contents of
such files are usually interesting, and their exposure can make the
server that hosts them vulnerable. In cases like these, the exposed
file often introduces more risk to the environment than a typical web
application vulnerability.
Or perhaps these:
inurl:admin.mdb
inurl:customer.mdb
inurl:users.mdb
Depending on their contents, database files such as these could cause
extreme losses. To emulate file types like these, GHH depends on Apache
.htaccess files to spoof the honeypot's file extension. We can then
take advantage of server side scripting to log and handle the attack
any way we want. If we're using GHH as the engine, this means logging
remotely and applying signatures to the attack.
Following the previous tutorial on GHH v1.0 (which should still be
compatible), we can leverage .htaccess and Apache to let our honeypot
spoof another file extension. Place a .htaccess file containing the
following lines in the same directory as the honeypot:
Code:
# Change .mdb to the file type you want to spoof
AddHandler application/x-httpd-php .mdb
AddType application/x-httpd-php .mdb
Apache and PHP will now interpret the .mdb file as a PHP script. The
only problem is that browsers won't behave normally when viewing some
extensions (.mdb and .txt, for example). To handle this, we can place
the following PHP code at the beginning of our honeypot:
Code:
<?php
// Change the content type to match the file type you are spoofing
header('Content-Type: text/plain');
// Rest of code...
?>
This tells the browser to handle the response as a certain type of
content. The previous code would be acceptable for a .sql, .txt, .log
or .dat file, or something similar. When the content reaches the
attacker, the browser will behave as it should (we have already
captured them, but it's best not to tip them off anyhow). A database
file, for example, should open in Microsoft Access. This would require
'Content-Type: application/msaccess' to be sent to the browser.
A list of content types is available at:
http://www.iana.org/assignments/media-types/
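If you run several honeypots with different spoofed extensions, the
choice of header can be driven by a small lookup table. The following
is only a sketch: the function name honeypot_content_type() and the
table itself are made up for illustration, the entries simply repeat
the examples from this tutorial, and text/plain is an assumed fallback.

```php
<?php
// Hypothetical helper: pick a Content-Type value from a file's extension.
// The table repeats the examples given in this tutorial; extend as needed.
function honeypot_content_type($path) {
    $types = array(
        'txt' => 'text/plain',
        'sql' => 'text/plain',
        'log' => 'text/plain',
        'dat' => 'text/plain',
        'mdb' => 'application/msaccess',
    );
    // Compare extensions case-insensitively.
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    // Assumed fallback: plain text for anything not in the table.
    return isset($types[$ext]) ? $types[$ext] : 'text/plain';
}

// Possible use at the top of a honeypot, before any output is sent:
// header('Content-Type: ' . honeypot_content_type($_SERVER['SCRIPT_NAME']));
```

The header() call is left commented out so the helper can be tried on
its own from the command line.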
Transparent Linking
Transparent linking is the process of advertising your honeypot to
search engines, but not to the casual users of your website. There are
a few ways to do this, some better than others. The better your
transparent link, the fewer false positives you'll have in your logs.
The goal is for visitors to reach your honeypot from a search engine,
and not from the site it's hosted on. This forces them to find the
honeypot through the engine, and through that referral you can retrieve
the search query they used against your site (intention and motive!)
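As a rough illustration of retrieving the search query, the sketch
below pulls the search terms out of a referrer URL. The function name
honeypot_search_query() is hypothetical, and it assumes the engine
passes the terms in a "q" parameter. Whether the referrer carries the
query at all depends on the search engine and the visitor's browser.

```php
<?php
// Hypothetical helper: return the search terms found in a referrer URL,
// or null when there are none. Assumes the query parameter is named "q".
function honeypot_search_query($referrer, $param = 'q') {
    // Isolate the part after the "?"; null or false means no usable query.
    $query = parse_url($referrer, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null;
    }
    // Decode the query string into an array of fields.
    parse_str($query, $fields);
    if (isset($fields[$param]) && is_string($fields[$param])) {
        return $fields[$param];
    }
    return null;
}

// Possible use inside a honeypot:
// $ref   = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
// $terms = honeypot_search_query($ref);
```

A null result suggests the visitor did not arrive from a search, which
is a hint that the hit may be a false positive.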
Direct link
The simplest option is an obvious hyperlink with some text on your top
level website:
Code:
<a href="http://yourwebsite.com/honeypot.php">blah</a>
The obvious problem is that users will click on the link and fill your
logs with false positives. Don't use this type of link.
Camo Link
The following CSS style will make the link the same color as your
background. You should change black to match your background.
Code:
<style type="text/css">
<!--
.camo{
color:black;
}
-->
</style>
Then apply your style to the link.
Code:
<a href="http://yourwebsite.com/honeypot.php"
class="camo">.</a>
This has its problems as well. It's cumbersome, because you might not
know what the background behind the link will be. This makes a
literally transparent link desirable; however, I haven't found any
options other than the CSS Alpha() function, which doesn't seem to work
well with text.
Disappearing link
The following CSS will prevent the link from being shown to the user at
all, as long as their browser renders CSS.
Code:
<style type="text/css">
<!--
.cya{
display:none;
}
-->
</style>
The link is now completely invisible, except in the source. The
thought was that being completely invisible would be the best option.
However, the GHH project learned the hard way that Google completely
ignores links styled with display:none, because the style can be
abused. Against what seems to be the popular belief, Google does not
index such links (such a smart spider!). They will, however, be indexed
by less powerful crawlers.
Shy Link
In order to leverage a disappearing link, you'll need to plug in some
PHP to detect when the Googlebot comes around. You have to cater to
Googlebot :)
Code:
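<a href="http://yourwebsite.com/honeypot.php"
<?php
// Hide the link from everyone except Googlebot. The user agent header
// may be missing, so fall back to an empty string.
$agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
if (strpos($agent, 'Googlebot') === false)
    echo 'class="cya"';
?>
>.</a>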
This is also a pain, but it does the job. Googlebot is served the link
without the display:none style, so it will follow it, and other spiders
aren't as smart and freely crawl links with the display:none style
anyway. This technique therefore completely hides the link from casual
browsers while still letting Google discover it.
Map Link
Image maps can be a quick way to link multiple honeypots. An area with
zero-size coordinates creates a nearly untouchable link in an image.
Code:
<img src="image.gif" border="0" usemap="#Map">
<map name="Map">
<area shape="rect" coords="0,0,0,0"
href="http://yourdomain.com/honeypot">
</map>
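When there are several honeypots to link, the map can be generated
rather than typed by hand. This is a minimal sketch, assuming a helper
named honeypot_image_map() (a made-up name) and example URLs that you
would replace with your own. It emits one zero-size area per URL.

```php
<?php
// Hypothetical helper: build a <map> with one zero-size <area> per URL.
function honeypot_image_map($name, array $urls) {
    $html = '<map name="' . htmlspecialchars($name) . '">' . "\n";
    foreach ($urls as $url) {
        // coords="0,0,0,0" keeps the area nearly impossible to click.
        $html .= '<area shape="rect" coords="0,0,0,0" href="'
               . htmlspecialchars($url) . '">' . "\n";
    }
    return $html . '</map>';
}

// Example URLs only; substitute the locations of your own honeypots.
echo honeypot_image_map('Map', array(
    'http://yourdomain.com/honeypot',
    'http://yourdomain.com/admin.mdb',
));
```

The image tag itself stays as shown above, with usemap pointing at the
same map name.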
Buddy Link
Buddy linking is as simple as having other domains link to your
honeypot. When those domains are crawled, spiders will hopefully follow
the links to your site. Casual users of your site are not likely to
cause false positives, but users of your buddies' sites may, so it is a
good idea to stick to the tactics described here on those sites as
well.
"Tattletale" Link
TELL the search engine where you are, and forget about linking. Most
engines have a suggest feature, and Google has sitemaps. If you don't
feel like using the Python tool or writing XML, there's the option to
submit a text file with URLs separated by CRLFs. Check it out here:
http://www.google.com/webmasters/si...bid=us-et-about
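The text file option is easy to script. The sketch below only builds
the CRLF-separated list described above. The function name
honeypot_url_list() and the sitemap.txt file name are assumptions, and
any other requirements of the sitemap format should be checked against
the search engine's own documentation.

```php
<?php
// Hypothetical helper: join honeypot URLs into a text list, one URL per
// line, with each line terminated by CRLF.
function honeypot_url_list(array $urls) {
    return implode("\r\n", $urls) . "\r\n";
}

// Possible use (sitemap.txt is an assumed file name):
// file_put_contents('sitemap.txt', honeypot_url_list(array(
//     'http://yourwebsite.com/honeypot.php',
//     'http://yourwebsite.com/admin.mdb',
// )));
```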
GHH Theory
The nature of GHH is to be known but not seen. This is why working with
GHH is challenging. The concepts of Google Hacking and honeypots are
simple; however, the design of the web and the design of a honeypot in
tandem present the challenge of "hiding in plain sight" on the web. GHH
is developed under that concept, which is useful in the creation of new
tools related to the relevant attacks.
Benefits of GHH include very early warning of a potential attack, by
catching attackers in their reconnaissance phase and learning their
possible motives. GHH also improves other vulnerable targets' chances
of survival on the web. By saturating a search engine index with
specific false positives, it turns what was once a foolproof vector
into a less reliable source of victims. So in short, it also benefits
others.
Original Tutorial
Submitted by nokia for TheTAZZone-TAZForum
Originally posted on March 6th, 2006 here
Do not use, republish, in whole or in part, without the consent of
the Author. TheTAZZone policy is that Authors retain the rights to the
work they submit and/or post...we do not sell, publish, transmit, or
have the right to give permission for such...TheTAZZone merely retains
the right to use, retain, and publish submitted work within its
Network.

