International Research Journal of Engineering and Technology (IRJET)
e-ISSN: 2395-0056
Volume: 04 Issue: 07 | July -2017
p-ISSN: 2395-0072
www.irjet.net
Mining query log to suggest competitive keyphrases for sponsored search via improved topic model using ITCK method

Yeetika Dhingra1, Dr. R.K. Chauhan2

1M.Tech Scholar, Dept. of Computer Science and Applications, Kurukshetra University, Haryana, India
2Professor, Dept. of Computer Science and Applications, Kurukshetra University, Haryana, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - This study introduces the concept of the query log, which records information about user intent. Mining the search log is a popular way to suggest long-tail keyphrases to advertisers, who bid on specific keywords in the search engine auction process to place their advertisements on the search engine result page. The sponsored result is generated from the keywords typed by the user. Here, keyphrases are derived from a specified seed keyword by retrieving hidden topics with an improved topic model. The ITCK method is proposed to suggest keyphrases based on topic modeling. Experiments were performed on the AOL search engine query log, and the results show that the proposed method performs better than the existing one.
Key Words: Sponsored search, Query log, Topic modeling, Keyword generation, LDA.

1. INTRODUCTION

Sponsored search is the biggest factor in generating revenue from potential customers. When a user types a query into a search engine, two kinds of results are returned: organic and sponsored. The two work very differently: organic listings are based on the retrieval of valuable information, whereas sponsored listings are based on an auction conducted by the search engine. In this auction, advertisers bid on specific keywords; when a keyword matches the query posted by the user, candidate advertisements are selected. The ad with the highest bid is displayed, and its advertiser is expected to pay a high price.

For keyword generation in sponsored search, query log mining is used. Search log mining is a data mining process that aims to extract useful information for different user behavior models. A query log is a file maintained by the search engine server. It typically consists of records of the queries requested by users and the results delivered by the search engine, and it is used to draw a relationship between the user and the search engine. Mining search logs is a fast-emerging trend that is applied in different areas of information storage and retrieval. It is used chiefly to extract the intention behind the query a user requests. It is a process for extracting user behavior, which can
be applied to various platforms [1] [2]. The focus here is to establish a relationship between what the user searches for and what is relevant to that search. A query log is defined as a set of entries Qi = {query, count}, where query refers to the keywords submitted by the user in the search engine query box and count refers to the search volume of that query.

A keyword suggestion method is proposed that operates on seed key terms, which are selected randomly. Seed terms are short and ambiguous, and are therefore categorized topically. The keyword suggestion method supports search engine advertising: advertisers bid on the expanded forms of the seed terms to target potential customers and generate revenue. Suppose we use the seed term "Colgate": many keywords co-exist with the seed term in the query log, so a co-occurrence relationship is maintained to generate candidate keywords related to the seed key term.

Methods such as the synonym-based approach, conceptual graph construction and concept hierarchy have previously been used for keyword generation [3][4][5]. These methods fall short in generating long-tail keyphrases. Therefore, the ITCK method is proposed to overcome the shortcomings of the existing approaches. ITCK is an improved topic-based competitive keyphrase suggestion method. The keyword suggestion model is based on a machine learning approach and consists of a three-step procedure: candidate keyword generation using association rule mining, a topic modeling approach (LDA), and an improvement over the LDA approach. The proposed method is based on a graphical model, which is used to develop a correlation between the seed and its candidates. The query log is represented in Table 1:

Table -1: Query log representation

Query (q.kw)          Count (q.vol)
Colgate               36
University Colgate    16
Pamolive Colgate      20
Teeth colgate         8
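
The query log definition Qi = {query, count} and the co-occurrence idea described for the seed term "Colgate" can be illustrated with a minimal sketch. It is not the paper's association rule mining step: it only sums the search volume of every term that appears in the same query as the seed, using the rows of Table 1. The names QUERY_LOG and candidate_keywords are hypothetical, and the case-insensitive matching is an assumption made so that "Teeth colgate" is treated as containing the seed.

```python
from collections import Counter

# Query log entries Q_i = {query, count}, copied from Table 1.
QUERY_LOG = [
    ("Colgate", 36),
    ("University Colgate", 16),
    ("Pamolive Colgate", 20),
    ("Teeth colgate", 8),
]


def candidate_keywords(query_log, seed):
    """Return (term, volume) pairs for terms that co-occur with `seed`.

    A term's weight is the summed count of all queries that contain both
    the term and the seed. Matching is case-insensitive (an assumption).
    """
    seed = seed.lower()
    weights = Counter()
    for query, count in query_log:
        terms = query.lower().split()
        if seed not in terms:
            continue  # this query does not mention the seed at all
        for term in terms:
            if term != seed:
                weights[term] += count
    # Highest co-occurrence volume first.
    return weights.most_common()


candidates = candidate_keywords(QUERY_LOG, "Colgate")
print(candidates)
# -> [('pamolive', 20), ('university', 16), ('teeth', 8)]
```

In this sketch the query "Colgate" on its own contributes no candidate, because it contains no term other than the seed. A seed that never appears in the log yields an empty list.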
ISO 9001:2008 Certified Journal | Page 1541