Google Searching Techniques
Jennie DeLucia/JD Consulting
Presentation, Rochester Regional Library Council, December 14, 2006
Basic Google Search Techniques: Google Basic Operators
Use the plus sign (+) to force a search for an overly common word.
Use the minus sign (-) to exclude a term from a search. No space follows these signs.
To search for a phrase, surround the phrase with double quotes (" ").
A period (.) serves as a single-character wildcard.
An asterisk (*) represents any whole word, not the completion of a word as it is traditionally used.
Basic Google Search Techniques: Phrase searches
Example: To be or not to be
What do you think you will find with this search?
Google will match these words wherever they appear on the page, not necessarily as the phrase "to be or not to be".
To find a phrase, make sure you quote it: "To be or not to be"
Basic Google Search Techniques: Google's Basic Boolean
Google uses AND by default and offers the OR modifier.
If you do not add any modifiers to a search, Google will search for all of the words.
Example: snow mobile Honda "Rochester NY"
Google will search for all of these words and the phrase.
Example: snow mobile OR Honda OR "Rochester NY"
What do you think you will find, and does OR need to be capitalized?
Example: snow mobile (Honda OR "Rochester NY")
Parentheses group the OR alternatives.
Example: snow mobile (snowmobile | "Rochester NY")
In addition to OR, you can use the "pipe" symbol (|) from programming.
Basic Google Search Techniques: Negation
A minus sign directly in front of a word or phrase will exclude it from the search.
Example: snow mobile snow blower -"Rochester NY"
The minus sign must appear directly in front of the phrase or word.
Do not put a space after the minus sign or it will not work.
Do add a space, though, after the word that you are going to exclude.
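The quoting, OR, and negation rules above can be combined mechanically. The following is a minimal sketch, not part of the original presentation; the function name `build_query` and its parameters are made up for illustration.

```python
def build_query(required=(), any_of=(), excluded=()):
    """Assemble a Google-style query string from plain terms (illustrative helper)."""

    def quote(term):
        # A multi-word term has to be quoted to be searched as a phrase.
        return f'"{term}"' if " " in term else term

    parts = [quote(t) for t in required]
    if any_of:
        # OR is written in capitals; parentheses group the alternatives.
        parts.append("(" + " OR ".join(quote(t) for t in any_of) + ")")
    # The minus sign is attached directly to the term, with no space after it.
    parts.extend("-" + quote(t) for t in excluded)
    return " ".join(parts)


print(build_query(["snow", "mobile"], any_of=["Honda", "Rochester NY"]))
print(build_query(["snow", "mobile", "snow", "blower"], excluded=["Rochester NY"]))
```

The two calls reproduce the grouped OR search and the negation search from the slides.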
Basic Google Search Techniques: Explicit Inclusion
Google will search for all the keywords in your search except the ones it considers too common to be of any use.
Use the plus sign (+) to force Google to include one of them.
A list of these "stop words" is available at http://www.ranks.nl/tools/stopwords.html
I, a, about, an, this, that, the, where, and when, to name a few.
Basic Google Search Techniques: Synonyms
The Google synonym operator is "~".
Example: ~ape
What will you get in return? Monkey, gorilla, and chimpanzee.
Synonyms are bolded along with exact keyword matches on the results page, so they are easy to spot.
Basic Google Search Techniques: Number range
The number range operator ".." looks for results falling inside your specified numeric range.
Example: prada pumps size 5..6
Example: digital camera 3..5 megapixel $800..$1000
It is KEY to always provide some sort of reference or clue as to the meaning of the range, such as $, size, kg, and so forth.
Basic Google Search Techniques: Number ranges
You can also use the number range syntax with just one number, making it the minimum or maximum of your query.
Example: acres Montana land 500..
Example: raincoat dog ..$30
Note: Google normally doesn't recognize special characters like $, but because the $ sign was necessary for the number range feature, you can use it in all sorts of searches.
Try these two searches to see the difference:
"yard sale" bargains 10
"yard sale" bargains $10
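As a rough sketch of the range syntax described above, the helper below formats closed and open-ended ranges. The name `number_range` and its parameters are hypothetical and only serve the illustration.

```python
def number_range(low=None, high=None, prefix=""):
    """Format a number-range expression such as $800..$1000 (illustrative helper)."""
    if low is None and high is None:
        raise ValueError("give at least one bound")

    def fmt(value):
        # A missing bound is left empty, which produces an open-ended range.
        return "" if value is None else f"{prefix}{value}"

    return f"{fmt(low)}..{fmt(high)}"


print("digital camera", number_range(3, 5), "megapixel", number_range(800, 1000, prefix="$"))
print("acres Montana land", number_range(low=500))
print("raincoat dog", number_range(high=30, prefix="$"))
```

The surrounding words (megapixel, acres, the $ prefix) supply the clue about what the numbers mean, as the slides recommend.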
Basic Google Search Techniques: Simple Searching and Feeling Lucky
Google decides which is the most relevant page for your search.
Is it reliable? That depends on your search ☺
Try searching "Washington" and see where it takes you.
Try searching "president" and see where it takes you.
Basic Google Search Techniques: Case Sensitivity
Google is NOT case sensitive.
So, if you search for Three, three, THREE, or ThrEE, you will get the same results.
Basic Google Search Techniques: Full-Word Wildcards
Stemming is when you add a * or a ? at the end of a word to act as a wildcard.
Example: searching for moon* will yield moons, moonlight, moonshot, etc.
Google DOES NOT support explicit stemming, BUT it implicitly stems for you.
For example, searching for dietary will yield results for diet and other variations on the theme.
Basic Google Search Techniques: Full-word Wildcards
Google DOES support full-word wildcards.
Although you cannot have a wildcard stand in for part of a word, you can insert a wildcard into a phrase and have it act as a substitute for one full word.
"three * mice" will yield what results?
* stands for one word, ** for two words, etc.
Basic Google Search Techniques: The 10-word limit
Google has a hard limit of 10 words.
That's keywords and special syntax combined.
Google ignores anything beyond 10.
Basic Google Search Techniques: Favor Obscurity, Play the Wildcard
When you have more than 10 words in a search, use * in place of some words and/or eliminate words that are on the exclusion list.
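A quick way to check a query against the limit is to count its words. The sketch below assumes, for illustration only, that every whitespace-separated word counts toward the limit and that a bare * does not; the function names are hypothetical and the counting rule is not confirmed by the slides.

```python
WORD_LIMIT = 10  # the limit quoted on the slide


def counted_terms(query):
    """Return the words assumed to count toward the limit (wildcards skipped)."""
    # Quotes are dropped so that words inside a phrase are counted one by one.
    tokens = query.replace('"', " ").split()
    return [t for t in tokens if t != "*"]


def over_limit(query, limit=WORD_LIMIT):
    return len(counted_terms(query)) > limit


long_query = "the quick brown fox jumps over the lazy dog near the river"
shortened = "* quick brown fox jumps over * lazy dog near * river"
print(len(counted_terms(long_query)), over_limit(long_query))
print(len(counted_terms(shortened)), over_limit(shortened))
```

Replacing the three occurrences of the stop word "the" with wildcards brings the counted words from 12 down to 9 under this assumption.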
Basic Google Search Techniques: Google Advanced Operators
The site: operator instructs Google to restrict a search to a specific web site or domain.
The web site to search must be supplied after the colon.
Example: site:www.google.com
How many pages do you think you will hit?
Basic Google Search Techniques: Google Advanced Operators
The filetype: operator instructs Google to search only within the text of a particular type of file.
The file type to search must be supplied after the colon. Don't include a period before the file extension.
Example: site:www.google.com libraries
A more concise search.
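To show how site: and filetype: combine with ordinary keywords, here is a small sketch that turns them into a search URL using Python's standard `urllib.parse.urlencode`. The helper name `search_url` and the choice of `pdf` as the file type are assumptions for the example; the slides do not give a filetype: query.

```python
from urllib.parse import urlencode


def search_url(terms, site=None, filetype=None):
    """Return a Google search URL for the given terms and restrictions (illustrative)."""
    parts = list(terms)
    if site:
        parts.append(f"site:{site}")
    if filetype:
        # No period before the extension, as the slide notes.
        parts.append(f"filetype:{filetype.lstrip('.')}")
    # urlencode escapes the colon and turns spaces into plus signs.
    return "https://www.google.com/search?" + urlencode({"q": " ".join(parts)})


print(search_url(["libraries"], site="www.google.com", filetype="pdf"))
```

The printed URL carries the query `libraries site:www.google.com filetype:pdf` in its `q` parameter.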
Basic Google Search Techniques: Google Advanced Operators
The link: operator instructs Google to find pages whose hyperlinks point to the given URL.
Example: link:www.rrlc.org
What kind of pages did you find? Are you surprised?
Basic Google Search Techniques: Google Advanced Operators
The cache: operator displays the version of a web page as it appeared when Google crawled the site.
The URL of the site must be supplied after the colon.
Example: cache:www.rrlc.org
This is the copy cached by Google, not the version cached on your laptop.
Another good site for tracking your website over the years is http://www.archive.org/web/web.php
Basic Google Search Techniques: Google Advanced Operators
The intitle: operator instructs Google to search for a term within the title of a document.
Example: intitle:LibDex "parent directory"
Basic Google Search Techniques: Google Advanced Operators
The inurl: operator instructs Google to search only within the URL (web address) of a document.
The search term must follow the colon.
Example: inurl:rrlc inurl:library
Basic Google Search Techniques: Google Advanced Operators
allinurl: is a variation of inurl: that finds all of the listed words in a URL, but it does not mix well with some other special syntax.
Basic Google Search Techniques: Google Advanced Operators
inanchor: searches for text in a page's link anchors. A link anchor is the descriptive text of a link.
For example, the link anchor in the HTML code <a href="http://www.oreilly.com">O'Reilly Media</a> is "O'Reilly Media".
Example: inanchor:"tom peters"
Basic Google Search Techniques: Google Advanced Operators
daterange: limits your search to a particular date or range of dates on which a page was indexed.
It is important to note that a daterange: search has nothing to do with when a page was created, only with when it was indexed by Google.
The dates are written as Julian day numbers.
Example: "Geri Halliwell" "Spice Girls" daterange:2450958-2450968
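The two numbers in the example are Julian day numbers, and the operator itself is spelled daterange: in Google's syntax. The sketch below converts between calendar dates and Julian day numbers with Python's `datetime` module. The helper names are made up for illustration, and the integer used is the Julian day that begins at noon on the given date.

```python
from datetime import date

# Offset between Python's proleptic Gregorian ordinal and the Julian day number.
JDN_OFFSET = 1721425


def julian_day(d):
    """Julian day number for a calendar date."""
    return d.toordinal() + JDN_OFFSET


def from_julian_day(jdn):
    """Calendar date for a Julian day number."""
    return date.fromordinal(jdn - JDN_OFFSET)


def daterange(start, end):
    """Format a daterange: expression for two calendar dates."""
    return f"daterange:{julian_day(start)}-{julian_day(end)}"


print(from_julian_day(2450958), from_julian_day(2450968))
print(daterange(date(1998, 5, 24), date(1998, 6, 3)))
```

Under this conversion the range in the example runs from 24 May 1998 to 3 June 1998.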
Basic Google Search Techniques: Google Advanced Operators
related: does what you might expect: it finds pages that are related to the specified page.
Example: related:google.com
This will return a variety of search engines, including Lycos, Yahoo!, and Northern Light.
Basic Google Search Techniques: Google Advanced Operators
info: provides a page of links to more information about a specified URL.
This information includes a link to the URL's cache, a list of pages that link to the URL, pages that are related to the URL, and pages that contain the URL.
Example: info:www.oreilly.com
Basic Google Search Techniques: Google Advanced Operators
phonebook: looks up phone numbers.
Example: phonebook:John Doe CA
Example: phonebook:(555) 555-5555
Basic Google Search Techniques: Mixing Syntax
There are simple rules to follow when mixing syntax elements.
Do not mix syntax elements that will cancel each other out.
Example: site:ucla.edu -inurl:ucla
Here you are saying that you want all results to come from ucla.edu, but that those results should not have the string "ucla" in their URLs.
Basic Google Search Techniques: Mixing Syntax
Do not overuse single syntax elements.
Example: site:com site:edu
You may think you are asking for results from either .com or .edu, but what you are actually saying is that results should come from both simultaneously.
Basic Google Search Techniques: Mixing Syntax
Do not use allinurl: or allintitle: when mixing syntax.
It takes a careful hand not to misuse these.
Instead, stick to inurl: or intitle:.
If you do not put allinurl: in exactly the right place, you will create odd search results.
Basic Google Search Techniques: Mixing Syntax
Do not use so much syntax that your search gets too narrow.
Example: intitle:agriculture site:ucla.edu inurl:search
Instead try: intitle:plants site:ucla.edu inurl:database
Or even better: databases plants site:ucla.edu
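The mixing rules above lend themselves to a simple automated check. This is a minimal sketch, assuming queries are written with plain ASCII hyphens and lowercase-insensitive operators; `lint_query` and its warning messages are hypothetical, and the check is far cruder than whatever Google actually does.

```python
def lint_query(query):
    """Return warnings for the mixing mistakes listed on the slides (illustrative)."""
    tokens = query.lower().split()
    warnings = []

    # More than one site: operator asks for results from every site at once.
    sites = [t[len("site:"):] for t in tokens if t.startswith("site:")]
    if len(sites) > 1:
        warnings.append("multiple site: operators")

    # A negated inurl: that names part of the site cancels the site: filter.
    for t in tokens:
        if t.startswith("-inurl:"):
            banned = t[len("-inurl:"):]
            if banned and any(banned in s for s in sites):
                warnings.append(f"-inurl:{banned} cancels site:")

    # allinurl: and allintitle: are risky next to any other operator.
    operators = [t for t in tokens if ":" in t]
    if len(operators) > 1 and any(
        t.startswith(("allinurl:", "allintitle:")) for t in tokens
    ):
        warnings.append("allinurl:/allintitle: mixed with other syntax")
    return warnings


print(lint_query("site:ucla.edu -inurl:ucla"))
print(lint_query("site:com site:edu"))
print(lint_query("databases plants site:ucla.edu"))
```

The last query, the one the slides recommend, passes with no warnings.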
Basic Google Search Techniques: Titles and sites
Now that we know what NOT to do, let's work on mixing syntax.
What search would we run to get an idea of what databases are offered by the state of Texas?
Example: site:tx.us intitle:search intitle:records
Basic Google Search Techniques: Title and URL
Sometimes you want to find a certain type of information, but you don't want to narrow by type. Instead, you want to narrow by the theme of the information.
Also, remember that the inurl: syntax will search for a string in the URL, but won't count finding it within a larger word.
Example: inurl:research will not find www.researchbuzz.com, but it will find www.research-councils.ac.uk
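The whole-word behaviour of inurl: can be approximated by splitting a URL into words at every character that is not a letter or digit. This is only a sketch of the rule as the slide states it; how Google actually tokenizes URLs is an assumption here, and `inurl_matches` is a hypothetical name.

```python
import re


def inurl_matches(term, url):
    """Approximate inurl: by matching whole URL words only (illustrative)."""
    # Dots, hyphens, and slashes all act as word boundaries.
    words = re.split(r"[^a-z0-9]+", url.lower())
    return term.lower() in words


print(inurl_matches("research", "www.researchbuzz.com"))
print(inurl_matches("research", "www.research-councils.ac.uk"))
```

The first URL contains "research" only as part of the longer word "researchbuzz", so it does not match; the second has it as a separate word.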
Basic Google Search Techniques: Google Advanced Search page
http://www.google.com/advanced_search?hl=en
Options and filters include language, file format, numeric range, etc.
Google Hacking Techniques
Now that we know the basic and more advanced Google searching techniques, I can show you how potential hackers can gain information about your website.
Archive.org
This is one of my favorite websites. It does not have anything to do with Google, but I think you will find it interesting.
www.archive.org
This is a hacker's paradise... but why?
Google Hacking
What are common things that a hacker would want to find out about a website?
Administrative accounts
Logs (error or auditing logs)
What data is housed on the site
Sensitive directories
Advisories and server vulnerabilities
Google Hacking
The terms login and logon locate logon portals.
Query: login | logon
Why is this an issue? What type of information can a hacker find out about your website?
Google Hacking
Query: username | userid | employee.ID | "your name is"
These are just a few ways to obtain a username from a target website.
Example: site:www.rit.edu username | userid | "your name is"
Google Hacking
Query: password | passcode | "your password is"
Combined with the username search, this is a perfect attack and/or reconnaissance combination for a hacker trying to get a username and password to access a site.
Example: site:www.rit.edu username password
Google Hacking
Query: admin | administrator
Who doesn't want to find pages that potentially contain administrative/admin access?
Example: site:www.rit.edu admin password
What other ways or what other modifiers can we add to find more detailed "admin" information?
Google Hacking
ext: is a synonym for filetype:, and a leading minus sign turns it into a negative query.
What are we looking for with the following search?
-ext:html -ext:htm -ext:shtml -ext:asp -ext:php
http://www.filext.com/index.php is the best site for looking up any type of file extension.
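A negative query like the one above is just a list of extensions with a minus sign and ext: in front of each. The following sketch builds one; the function name and the optional `site` parameter are illustrative, and `example.com` is a placeholder domain rather than a site from the presentation.

```python
def exclude_extensions(extensions, site=None):
    """Build a query that filters out the given file extensions (illustrative)."""
    parts = [f"site:{site}"] if site else []
    # Each extension gets its own -ext: term, with no period and no space after the minus.
    parts.extend(f"-ext:{e.lstrip('.')}" for e in extensions)
    return " ".join(parts)


common = ["html", "htm", "shtml", "asp", "php"]
print(exclude_extensions(common))
print(exclude_extensions(common, site="example.com"))
```

Whatever is left after the common web page types are filtered out is the unusual material: documents, backups, logs, and similar files that happen to be indexed.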
Google Hacking
Query: inurl:temp | inurl:tmp | inurl:backup | inurl:bak
This will search for temporary or backup files or directories on a server.
There are many different naming conventions, but you can get the gist of it.
Google Hacking
Query: intranet | help.desk
We aren't necessarily looking for private intranets, but you would be surprised at what you find.
How many of you use your intranet on a daily basis?
What type of information could someone find if they could gain access?
Google Hacking: Locating common login portals
ASP.net
Microsoft Outlook Web Access: allinurl:"exchange/logon.asp"
Generic Admin: inurl:login.asp
Virtual Network Computing: "VNC Desktop" inurl:5800
ColdFusion Admin: intitle:"ColdFusion Administrator Login"
Google Hacking: Locating Various Network Devices
Canon Network Camera
Xerox Phaser (generic)
intitle:"network administration" inurl:"nic"
inurl:sts_index.cgi
inurl:live_status.html
Sony Network Camera
SNC-RZ30 HOME
"powered by webcamXP" "Pro|Broadcast"
Google Hacking: Locate Usernames
Microsoft Outlook Express Mail address books
Outlook Mail Web Access
Access databases containing profiles: filetype:mdb inurl:profiles
Remote Desktop Connection: filetype:rdp rdp
Various "locked" files: "index of" lck
Google Hacking: Locate Passwords
FTP bookmarks
HTTP htpasswd
Web user credentials
HTTP passwords
filetype:url +inurl:"ftp://" +inurl:"@"
filetype:htpasswd htpasswd
intitle:"index of" ".htpasswd" htpasswd.bak
filetype:pwd service
Microsoft FrontPage Service Web passwords
Password list user
index.of passlist credentials
Google Hacking: Finding sensitive information
Cookies: intitle:"index.of" cookies.txt "size"
AIM buddy lists: buddylist.blt
Excel sheet containing contact info: filetype:xls inurl:contact
MSN Messenger contact list: filetype:ctt ctt messenger
Student grades: site:edu admin grades