Author: 凉心无悔  Time: 2005-5-4 20:10  Title: [Compilation] google for hacking~! Enjoy it~ (to be updated in installments)
Hidden Risks to Information Security: The Principles of Google Hacking and How to Defend Against It
Created: 2004-12-27  Updated: 2004-12-30
Article type: reprint
Submitted by: sFqRy (mqphk163_at_163.com)
Author: zhaohuan@phack.org  Source: www.phack.org
Tech corner: Google hacking uses the Google search engine to quickly locate vulnerable hosts and information containing sensitive data. Recently this attack technique, which hackers used to carry out by hand, can be performed automatically by a new worm. To draw attention to Google hacking, we have compiled this article in the hope that, by understanding the attacker's techniques, readers can better protect their own information security. The article focuses on understanding the Google hacking technique; please understand that some attack details are not described in depth.
Preface:
At the BlackHat conference held in Las Vegas in 2004, two security experts gave talks titled "You found that on google?" and "google attacks". After WLJ, a former moderator of the 安全焦点 (XFocus) forum, translated and compiled them, I felt it was worth filling in and polishing some of the details. What I will describe today is yet another use of Google: using the search engine to quickly find vulnerable hosts and information containing sensitive data, and even to carry out trivially simple, point-and-click intrusions.
Google Hacking Mini-Guide
Date: May 7, 2004 By Johnny Long.
Using search engines such as Google, "search engine hackers" can easily find exploitable targets and sensitive data. This article outlines some of the techniques used by hackers and discusses how to prevent your site from becoming a victim of this form of information leakage.
The Google search engine found at http://www.google.com offers many features, including language and document translation; web, image, newsgroups, catalog, and news searches; and more. These features offer obvious benefits to even the most uninitiated web surfer, but these same features offer far more nefarious possibilities to the most malicious Internet users, including hackers, computer criminals, identity thieves, and even terrorists. This article outlines the more harmful applications of the Google search engine, techniques that have collectively been termed "Google hacking." The intent of this article is to educate web administrators and the security community in the hopes of eventually stopping this form of information leakage. This document is an excerpt of the full Google Hacker's Guide published by Johnny Long, and located at http://johnny.ihackstuff.com.
Basic Search Techniques
Since the Google web interface is so easy to use, I won't describe the basic functionality of the http://www.google.com web page. Instead, I'll focus on the various operators available:
Use the plus sign (+) to force a search for an overly common word. Use the minus sign (-) to exclude a term from a search. No space follows these signs.
To search for a phrase, supply the phrase surrounded by double quotes (" ").
A period (.) serves as a single-character wildcard.
An asterisk (*) represents any word—not the completion of a word, as is traditionally used.
Google advanced operators help refine searches. Advanced operators use a syntax such as the following:
operator:search_term
Notice that there's no space between the operator, the colon, and the search term.
The site: operator instructs Google to restrict a search to a specific web site or domain. The web site to search must be supplied after the colon.
The filetype: operator instructs Google to search only within the text of a particular type of file. The file type to search must be supplied after the colon. Don't include a period before the file extension.
The link: operator instructs Google to search within hyperlinks for a search term.
The cache: operator displays the version of a web page as it appeared when Google crawled the site. The URL of the site must be supplied after the colon.
The intitle: operator instructs Google to search for a term within the title of a document.
The inurl: operator instructs Google to search only within the URL (web address) of a document. The search term must follow the colon.
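To make the syntax concrete, here is a minimal Python sketch (my own illustration, not part of the original guide) that composes the operators above into a single query string and a ready-to-paste Google search URL. The helper name build_query and its parameter names are made up for this example.

from urllib.parse import quote_plus

def build_query(terms=(), phrase=None, site=None, filetype=None, intitle=None, inurl=None):
    parts = list(terms)
    if phrase:
        parts.append('"%s"' % phrase)          # exact-phrase search
    if site:
        parts.append('site:' + site)           # restrict to a site or domain
    if filetype:
        parts.append('filetype:' + filetype)   # no period before the extension
    if intitle:
        parts.append('intitle:' + intitle)     # term must appear in the title
    if inurl:
        parts.append('inurl:' + inurl)         # term must appear in the URL
    return ' '.join(parts)

q = build_query(intitle='index.of', phrase='parent directory', site='example.com')
print(q)   # "parent directory" site:example.com intitle:index.of
print('http://www.google.com/search?q=' + quote_plus(q))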
Google Hacking Techniques
By using the basic search techniques combined with Google's advanced operators, anyone can perform information-gathering and vulnerability-searching using Google. This technique is commonly referred to as Google hacking.
Site Mapping
To find every web page Google has crawled for a specific site, use the site: operator. Consider the following query:
site:www.microsoft.com microsoft
This query searches for the word microsoft, restricting the search to the www.microsoft.com web site. How many pages on the Microsoft web server contain the word microsoft? According to Google, all of them! Google searches not only the content of a page, but the title and URL as well. The word microsoft appears in the URL of every page on www.microsoft.com. With a single query, an attacker gains a rundown of every web page on a site cached by Google.
There are some exceptions to this rule. If a link on the Microsoft web page points back to the IP address of the Microsoft web server, Google will cache that page as belonging to the IP address, not the www.microsoft.com web server. In this special case, an attacker would simply alter the query, replacing the word microsoft with the IP address(es) of the Microsoft web server.
Finding Directory Listings
Directory listings provide a list of files and directories in a browser window instead of the typical text-and-graphics mix generally associated with web pages. These pages offer a great environment for deep information gathering (see Figure 1).
Figure 1 A typical directory listing.
Locating directory listings with Google is fairly straightforward. Figure 1 shows that most directory listings begin with the phrase Index of, which also shows in the title. An obvious query to find this type of page might be intitle:index.of, which may find pages with the term index of in the title of the document. Unfortunately, this query will return a large number of false positives, such as pages with the following titles:
Index of Native American Resources on the Internet
LibDex—Worldwide index of library catalogues
Iowa State Entomology Index of Internet Resources
Judging from the titles of these documents, it's obvious that not only are these web pages intentional, they're also not the directory listings we're looking for. Several alternate queries provide more accurate results:
intitle:index.of "parent directory"
intitle:index.of name size
These queries indeed provide directory listings by not only focusing on index.of in the title, but on keywords often found inside directory listings, such as parent directory, name, and size. Obviously, this search can be combined with other searches to find files or directories located in directory listings.
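As a quick illustration of why these keywords work, the following rough Python sketch (my own, not from the article) checks whether a page you have already fetched looks like a directory listing by testing for the same markers the queries above rely on. The sample HTML string is hypothetical.

def looks_like_directory_listing(html):
    text = html.lower()
    has_title = '<title>index of' in text                 # "Index of /..." in the title
    has_markers = ('parent directory' in text or
                   ('name' in text and 'size' in text))    # column headers of a listing
    return has_title and has_markers

sample = '<html><title>Index of /backup</title><body><a href="../">Parent Directory</a></body></html>'
print(looks_like_directory_listing(sample))   # True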
Versioning: Obtaining the Web Server Software/Version
The exact version of the web server software running on a server is one piece of information an attacker needs before launching a successful attack against that web server. If an attacker connects directly to that web server, the HTTP (web) headers from that server can provide this essential information. It's possible, however, to retrieve similar information from Google's cache without ever connecting to the target server under investigation. One method involves using the information provided in a directory listing.
Figure 2 shows the bottom line of a typical directory listing. Notice that the directory listing includes the name of the server software as well as the version. An adept web administrator can fake this information, but often it's legitimate, allowing an attacker to determine what attacks may work against the server.
Figure 2 Directory listing "server at" example.
This example was gathered using the following query:
intitle:index.of server.at
This query focuses on the term index of in the title and server at appearing at the bottom of the directory listing. This type of query can also be pointed at a particular web server:
intitle:index.of server.at site:aol.com
The result of this query indicates that gprojects.web.aol.com and vidup-r1.blue.aol.com both run Apache web servers.
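The "Server at" footer that Figure 2 refers to can also be parsed mechanically. The short Python sketch below is an illustration only; it assumes the usual Apache footer form "Apache/1.3.27 Server at www.example.com Port 80", and the host name is hypothetical. It pulls the server software, host, and port out of a directory-listing page.

import re

FOOTER = re.compile(r'([\w./-]+)\s+Server at\s+([\w.-]+)\s+Port\s+(\d+)', re.I)

def extract_banner(html):
    m = FOOTER.search(html)
    return m.groups() if m else None   # (software/version, host, port) or None

print(extract_banner('<address>Apache/1.3.27 Server at www.example.com Port 80</address>'))
# ('Apache/1.3.27', 'www.example.com', '80')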
It's also possible to determine the version of a web server based on default pages installed on that server. When a web server is installed, it generally will ship with a set of default web pages, like the Apache 1.2.6 page shown in Figure 3:
Figure 3 Apache test page.
These pages can make it easy for a site administrator to get a web server running. By providing a simple page to test, the administrator can simply connect to his own web server with a browser to validate that the web server was installed correctly. Some operating systems even come with web server software already installed. In this case, an Internet user may not even realize that a web server is running on his machine. This type of casual behavior on the part of an Internet user will lead an attacker to rightly assume that the web server is not well maintained, and by extension is insecure. By further extension, the attacker can assume that the entire operating system of the server may be vulnerable by virtue of poor maintenance.
The following table provides a brief rundown of some queries that can locate various default pages.
Server Version                Query
Apache SSL/TLS                intitle:test.page "Hey, it worked !" "SSL/TLS-aware"
Many IIS servers              intitle:welcome.to intitle:internet IIS
Unknown IIS server            intitle:"Under construction" "does not currently have"
IIS 4.0                       intitle:welcome.to.IIS.4.0
IIS 4.0                       allintitle:Welcome to Windows NT 4.0 Option Pack
IIS 4.0                       allintitle:Welcome to Internet Information Server
IIS 5.0                       allintitle:Welcome to Windows 2000 Internet Services
IIS 6.0                       allintitle:Welcome to Windows XP Server Internet Services
Many Netscape servers         allintitle:Netscape Enterprise Server Home Page
Unknown Netscape server       allintitle:Netscape FastTrack Server Home Page
Using Google as a CGI Scanner
To accomplish its task, a CGI scanner must know what exactly to search for on a web server. Such scanners often utilize a data file filled with vulnerable files and directories like the one shown below:
/cgi-bin/cgiemail/uargg.txt
/random_banner/index.cgi
/random_banner/index.cgi
/cgi-bin/mailview.cgi
/cgi-bin/maillist.cgi
/cgi-bin/userreg.cgi
/iissamples/ISSamples/SQLQHit.asp
/iissamples/ISSamples/SQLQHit.asp
/SiteServer/admin/findvserver.asp
/scripts/cphost.dll
/cgi-bin/finger.cgi
By combining a list like this one with carefully crafted Google searches, Google itself can be used as a CGI scanner. Each line can be broken down and used in either an index.of or inurl search to find vulnerable targets. For example, a Google search for this:
allinurl:/random_banner/index.cgi
returns the results shown in Figure 4.
Figure 4 Sample search using a line from a CGI scanner.
A hacker can take sites returned from this Google search, apply a bit of hacker "magic," and eventually get the broken random_banner program to cough up any file on that web server, including the password file, as shown in Figure 5.
Figure 5 Password file captured from a vulnerable site found using a Google search.
Note that actual exploitation of a found vulnerability crosses the ethical line, and is not considered mere web searching.
Of the many Google hacking techniques we've looked at, this technique is one of the best candidates for automation, because the CGI scanner vulnerability files can be very large. The gooscan tool, written by j0hnny, performs this and many other functions. Gooscan and automation are discussed below.
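In the same spirit, the fragment below sketches the idea (this is my own illustration, not gooscan's actual code): it turns a CGI-scanner-style list of vulnerable paths into Google query strings. It only prints the queries rather than sending them, because automated querying raises the terms-of-service issues discussed in the next section.

vuln_paths = [
    '/cgi-bin/cgiemail/uargg.txt',
    '/random_banner/index.cgi',
    '/cgi-bin/finger.cgi',
]

def path_to_queries(path):
    parts = [p for p in path.strip('/').split('/') if p]
    return [
        'allinurl:' + path,                  # whole path appearing in the URL
        'intitle:index.of ' + parts[-1],     # file name showing up in a directory listing
    ]

for p in vuln_paths:
    for q in path_to_queries(p):
        print(q)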
Google Automated Scanning
Google frowns on automation: "You may not send automated queries of any sort to Google's system without express permission in advance from Google. Note that 'sending automated queries' includes, among other things:
using any software which sends queries to Google to determine how a web site or web page 'ranks' on Google for various queries;
'meta-searching' Google; and
performing 'offline' searches on Google."
Any user running an automated Google querying tool (with the exception of tools created with Google's extremely limited API) must obtain express permission in advance to do so. It's unknown what the consequences of ignoring these terms of service are, but it seems best to stay on Google's good side.
Gooscan
Gooscan is a UNIX (Linux/BSD/Mac OS X) tool that automates queries against Google search appliances (which are not governed by the same automation restrictions as their web-based brethren). For the security professional, gooscan serves as a front end for an external server assessment and aids in the information-gathering phase of a vulnerability assessment. For the web server administrator, gooscan helps discover what the web community may already know about a site thanks to Google's search appliance.
For more information about this tool, including the ethical implications of its use, see http://johnny.ihackstuff.com.
Googledorks
The term "googledork" was coined by the author and originally meant "An inept or foolish person as revealed by Google." After a great deal of media attention, the term came to describe those who "troll the Internet for confidential goods." Either description is fine, really. What matters is that the term googledork conveys the concept that sensitive stuff is on the web, and Google can help you find it. The official googledorks page lists many different examples of unbelievable things that have been dug up through Google by the maintainer of the page, Johnny Long. Each listing shows the Google search required to find the information, along with a description of why the data found on each page is so interesting.
GooPot
The concept of a honeypot is very straightforward. According to http://www.techtarget.com, "A honey pot is a computer system on the Internet that is expressly set up to attract and 'trap' people who attempt to penetrate other people's computer systems."
To learn how new attacks might be conducted, the maintainers of a honeypot system monitor, dissect, and catalog each attack, focusing on those attacks that seem unique.
An extension of the classic honeypot system, a web-based honeypot or "page pot" is designed to attract those employing the techniques outlined in this article. The concept is fairly straightforward. Consider a simple googledork entry like this:
inurl:admin inurl:userlist
This entry could easily be replicated with a web-based honeypot by creating an index.html page that referenced another index.html file in an /admin/userlist directory. If a web search engine such as Google was instructed to crawl the top-level index.html page, it would eventually find the link pointing to /admin/userlist/index.html. This link would satisfy the Google query of inurl:admin inurl:userlist, eventually attracting a curious Google hacker.
The referrer variable can be inspected to figure out how a web surfer found a web page through Google. This bit of information is critical to the maintainer of a page pot system, because it outlines the exact method the Google searcher used to locate the page pot system. The information aids in protecting other web sites from similar queries.
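A bare-bones sketch of that referrer bookkeeping might look like the following (my own illustration, not GooPot itself): a small Python WSGI application that records the Referer header and, when the visitor arrived from a Google results page, extracts the query that led them to the page pot. The log file name and the page contents are invented for the example.

from urllib.parse import urlparse, parse_qs

def page_pot(environ, start_response):
    referer = environ.get('HTTP_REFERER', '')
    parsed = urlparse(referer)
    if 'google.' in parsed.netloc:
        # pull the q= parameter out of the Google results URL
        query = parse_qs(parsed.query).get('q', ['(unknown)'])[0]
        with open('pagepot.log', 'a') as log:
            log.write('search used: %s\n' % query)
    start_response('200 OK', [('Content-Type', 'text/html')])
    return [b'<html><title>userlist admin</title><body>nothing to see here</body></html>']

if __name__ == '__main__':
    from wsgiref.simple_server import make_server
    make_server('', 8000, page_pot).serve_forever()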
GooPot, the Google honeypot system, uses enticements based on the many techniques outlined in the googledorks collection and this document. In addition, the GooPot more closely resembles the juicy targets that Google hackers typically go after. Johnny Long, the administrator of the googledorks list, utilizes the GooPot to discover new search types and to publicize them in the form of googledorks listings, creating a self-sustaining cycle for learning about and protecting from search engine attacks.
Although the GooPot system is currently not publicly available, expect it to be made available early in the second quarter of 2004.
Protecting Yourself from Google Hackers
The following list provides some basic methods for protecting yourself from Google hackers:
Keep your sensitive data off the web! Even if you think you're only putting your data on a web site temporarily, there's a good chance that you'll either forget about it, or that a web crawler might find it. Consider more secure ways of sharing sensitive data, such as SSH/SCP or encrypted email.
Googledork! Use the techniques outlined in this article (and the full Google Hacker's Guide) to check your site for sensitive information or vulnerable files. Use gooscan from http://johnny.ihackstuff.com to scan your site for bad stuff, but first get advance express permission from Google! Without advance express permission, Google could come after you for violating their terms of service. The author is currently not aware of the exact implications of such a violation. But why anger the "Goo-Gods"?!
TIP
Check the official googledorks web site on a regular basis to keep up on the latest tricks and techniques.
Consider removing your site from Google's index. The Google webmasters FAQ provides invaluable information about ways to properly protect and/or expose your site to Google. From that page: "Please have the webmaster for the page in question contact us with proof that he/she is indeed the webmaster. This proof must be in the form of a root level page on the site in question, requesting removal from Google. Once we receive the URL that corresponds with this root level page, we will remove the offending page from our index." In some cases, you may want to remove individual pages or snippets from Google's index. This is also a straightforward process that can be accomplished by following the steps outlined at http://www.google.com/remove.html.
Use a robots.txt file. Web crawlers are supposed to follow the robots exclusion standard. This standard outlines the procedure for "politely requesting" that web crawlers ignore all or part of your web site. I must note that hackers may not have any such scruples, as this file is merely a suggestion; the major search engines' crawlers, however, do honor this file and its contents. For examples and suggestions for using a robots.txt file, see http://www.robotstxt.org.
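As one small example of the robots exclusion standard in practice, the Python sketch below writes a robots.txt that asks well-behaved crawlers to skip a few hypothetical directories. Remember that the file is only a polite request and is itself world-readable, so it should never be treated as access control.

RULES = """User-agent: *
Disallow: /admin/
Disallow: /backup/
Disallow: /cgi-bin/
"""

with open('robots.txt', 'w') as f:
    f.write(RULES)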
Thanks to God, my family, Seth, and the googledork community for all the support. Happy Googling! j0hnny (http://johnny.ihackstuff.com)
Author: 凉心无悔  Time: 2005-5-4 20:25  Title: [Compilation] google for hacking~! Enjoy it~ (to be updated in installments)
Here is another one in English~ all the more reason for everyone to hurry up and learn it!!
Google Tricks and hacks
- d00m
Google.com is undoubtedly the most popular search engine in the world. It offers multiple search features, like the ability to search images and newsgroups. However, its true power lies in its powerful commands, which can be used and misused. I am writing this article on the basis of my experience using Google and trying out ideas when I am bored. Now, enough of lecturing... let's get down to business.
--- Searching URLs :
The "allinurl" command is used to search for a particular string present in
the URL.Goto google.com and type this in the search box:
allinurl:wwwboard/passwd.txt
Wow! 139 results, and almost every result displays a file containing strings in the form of ---> username:password (the password is encrypted using DES crypt and can be cracked using John the Ripper). "WWWBOARD" is a CGI message board which saves its passwords by default in a file called "passwd.txt". This is a very outdated message board script, but many newer types of CGI/PHP/ASP message boards and scripts save their passwords in a text file (some are not even encrypted, i.e. stored in plain text!!, and the rest can most of the time be cracked with John the Ripper).
allinurl:passwd.txt site:virtualave.net
This time, too, you will get some results that lead to files containing passwords.
This command searched for a file called passwd.txt present in the URL. However, using the "site:virtualave.net" part has limited the search to virtualave.net only! (virtualave.net is a web hosting provider.)
Similarly, you can also search particular top-level domains like .net, .org, .np, .jp, .in, .gr, etc.
These and many other ideas can return interesting results in Google.
--- Searching for Index browsing enabled directories :
Index browsing is a very simple but powerful way of gaining information and finding interesting things. First of all, we need to understand that "index browsing" enabled directories are those directories on the Internet that can be browsed just like ordinary directories. We will be using Google to find such "interesting" directories.
Try these out in Google:
"Index of /admin"
"Index of /secret"
"Index of /cgi-bin" site:.edu
Be more creative and think of more interesting ways to exploit index browsing.
-- Searching for particular file types:
You can specify the extension of the filename you want to search for using the "filetype" command. Examples to try in Google:
filetype:.doc site:.mil classified
- Yeah, searching for classified military documents.
-- Examples of some real life hacks using google:
1) My personal hack
One day I was reading about an exploit for phpBB 2.0.0. I decided to check if any sites were vulnerable, so I fired up Google and searched for:
"Powered by phpBB 2.0.2"
I found out that there were a lot of sites. But I got curious to see whether any Nepali sites were vulnerable too, because I am a Nepali myself:
"Powered by phpBB 2.0.2" site:.np
I came up with a vulnerable Nepali site that used phpBB 2.0.2.
2) Big brother hack
Phrack 60 has an article on Big Brother...(a program that will monitor
various computer equipment; things it can monitor are connectivity, cpu
utilization, disk usage, ftp status, http status, pop3 status, etc.)
You can search for sites using Big Brother by typing this search string into Google:
"green:Big Brother" (with the quotes)
For more info, check out the article titled "Watchin Big Brother" at phrack.org.
--Conclusion:
This document is only meant to give some basic ideas about exploiting google.com. I was very much inspired by +Fravia and his site http://searchlores.org, which has lots of innovative ideas and tricks. Please send positive ...
Author: 凉心无悔  Time: 2005-5-4 20:25  Title: [Compilation] google for hacking~! Enjoy it~ (to be updated in installments)
INTRO=========
A week or so back I had an e-mail from a friend (FLW) asking me if I had any info on Google search tips.
He was surprised at the amount of info available and open via Google... this got me thinking: well, I have seen many various search strings in several papers... so I thought I would put them all together on the one page... and update it as new ones are discovered... so if I missed any that should be added to the list, please let me know and I shall add some more....
****************************************************************************
WARNING::: I hold no responsibility for what you do with the information supplied here... this is for educational purposes only, use at your own risk.
You have been warned.
****************************************************************************
thanks
ComSec aka ZSL
SUMMARY=======
Everyone in the security sector knows Google... and what a powerful tool it is: just by entering certain search strings you can gain a vast amount of knowledge and information about your chosen target... often revealing sensitive data.
This is all down to badly configured systems, brought on by sloppy administration, allowing directory indexing and access to password files, log entries, files, paths, etc., etc.
Search Tips
So how do we start?
The common search inputs below will give you an idea... for instance, if you want to search for an index of "root", put the following into the search box exactly as you see it below.
==================
example 1:
allintitle: "index of/root"
result:
http://www.google.com/search?hl=en&ie=ISO-8859-1&q=allintitle%3A+%22index+of%2Froot%22&btnG=Google+Search
What it reveals is 2,510 pages that you can possibly browse at will...
====================
example 2
inurl:"auth_user_file.txt"
http://www.google.com/search?num=100&hl=en&lr=&ie=ISO-8859-1&q=inurl%3A%22auth_user_file.txt%22&btnG=Google+Search
This result spawned 414 possible files to access.
Here is an actual file retrieved from a site (and edited); we know who the admin is and we have the hashes. Cracking those is a job for JTR (John the Ripper):
txUKhXYi4xeFs|master|admin|Worasit|Junsawang|xxx@xxx|on
qk6GaDj9iBfNg|tomjang||Bug|Tom|xxx@xxx|on
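The lines above are pipe-delimited. A small Python parsing sketch (the field names are my guesses from the sample, not a documented format) splits each entry into the crypt hash, the username, and the remaining profile fields:

def parse_auth_line(line):
    fields = line.rstrip('\n').split('|')
    return {'crypt_hash': fields[0], 'username': fields[1], 'rest': fields[2:]}

sample = 'txUKhXYi4xeFs|master|admin|Worasit|Junsawang|xxx@xxx|on'
print(parse_auth_line(sample))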
The many variations below should keep you busy for a long time; mixing them reveals many different permutations.
*************************************
SEARCH PATHS....... more to be added
*************************************
"Index of /admin"
"Index of /password"
"Index of /mail"
"Index of /" +passwd
"Index of /" +password.txt
"Index of /" +.htaccess
index of ftp +.mdb allinurl:/cgi-bin/ +mailto
administrators.pwd.index
authors.pwd.index
service.pwd.index
filetype:config web
global.asax index
allintitle: "index of/admin"
allintitle: "index of/root"
allintitle: sensitive filetype:doc
allintitle: restricted filetype:mail
allintitle: restricted filetype:doc site:gov
inurl:passwd filetype:txt
inurl:admin filetype:db
inurl:iisadmin
inurl:"auth_user_file.txt"
inurl:"wwwroot/*."
There are too many people to thank for the bits of information cut and pasted and added to form this paper.
Most of it has been collected from various forums, txt and doc files, etc... I would like to thank you all; it is not intended to rip anyone off.
It is just a combo of various search inputs, put on the one paper to use as a reference.
EOF
====================================
http://comsec.governmentsecurity.org
http://governmentsecurity.org/forum
******* new members welcome ********
Author: 凉心无悔  Time: 2005-5-4 20:27  Title: [Compilation] google for hacking~! Enjoy it~ (to be updated in installments)
Using Google to get around all kinds of restrictions and download what you want
This article is reposted from the 博客中国 (Blogchina) blog forum; it seemed quite interesting, so it is collected here.
Type this into the search box: "index of/" inurl:lib
Then press search and you will get into many libraries, and you are sure to be able to download books you like.
Type this into the search box: "index of /" cnki
Then press search and you can find many libraries' entry points for CNKI, VIP, SuperStar (超星) and similar databases!
Type this into the search box: "index of /" ppt
Then press search and you can bypass the site's front door and download PowerPoint files!
Type this into the search box: "index of /" mp3
Then press search and you can bypass the site's front door and download mp3, rm and other audio/video files!
Type this into the search box: "index of /" swf
Then press search and you can bypass the site's front door and download Flash files!
Type this into the search box: "index of /" followed by the name of the software you want to download
Then press search and you can bypass the site's front door and download the software!
Note that the quotation marks must be English (half-width) quotes!
One more hint: if you type "index of /" AVI, what will you find? Likewise, what happens if you replace AVI with MPEG? Heh! I don't need to teach you the rest, do I?
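Tying these recipes together, here is a tiny Python sketch (my own addition, not part of the reposted article) that builds the "index of /" queries above for several file types and prints ready-to-paste Google search URLs:

from urllib.parse import quote_plus

for keyword in ('mp3', 'swf', 'ppt', 'avi'):
    q = '"index of /" ' + keyword
    print(q, '->', 'http://www.google.com/search?q=' + quote_plus(q))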
Author: 凉心无悔  Time: 2005-5-4 20:44  Title: [Compilation] google for hacking~! Enjoy it~ (to be updated in installments)
Patrick Chambet, Google attacks
http://www.risker.org/tech/GoogleHacking/files/bh-us-04-chambet-google_attacks.pdf
Caleb Sima. Exploits & Vulnerabilities - New Trends. http://www.issa.org/anniversary/presentations/Vuln_Exploits_NewTrends.pdf
The two documents above need a PDF reader, so if you want them you can download them and read them directly.
Zheng Hui (郑辉), Santy Worm Analysis Report. http://202.112.50.218/doc/spark/santywormanalysis.doc
Robert Masse, Jian Hui Wang. Hacking with Google for fun and profit!
http://www.gosecure.ca/SecInfo/library/WebApplication/GOOGLE-HACKING-GS1004.ppt
The link above does not always connect well, but the file can be downloaded.
· Johnny Long's Gooscan for Linux (http://johnny.ihackstuff.com/modules.php?op=modload&name=Downloads&file=index&req=viewdownload&cid=5), which can be used to run command-line Google queries under Linux.
· The Google Toolbar for Internet Explorer lets you type keywords and run simple queries directly in the IE browser without visiting Google's home page, www.google.com. If you object to IE, you can also use the open-source Googlebar (http://googlebar.mozdev.org/), which runs under Netscape or Mozilla Firefox.
If Google returns results but a link in them has gone dead, you can click the "Cached" link below that result to keep looking; this searches Google's cache, where your information may still survive. Likewise, be sure to search Google Groups (the online newsgroups) for sensitive information; I have used this method to find a great deal of useful information there. You can also check the Interesting Google Queries page (http://artkast.yak.net/81) for special Google search tricks aimed at Microsoft.
Visit The Web Robots Pages (http://www.robotstxt.org/wc/robots.html) for information on how to configure your robots.txt file and other ways to keep robots out. Google also has an FAQ on Googlebot's operation (http://www.google.com/bot.html).