CIKM Cup

This year’s CIKM features two CIKM Cup data challenges related to user profiling. For both challenges, CIKM will host a workshop on the last day of the conference (October 28th). The award ceremony for the winners of both challenges will take place during the 25th Anniversary CIKM Banquet. To access the official CIKM Cup webpage, please visit here.

  • CIKM Cup 2016 Track 1: Cross-Device Entity Linking Challenge
Online advertising is perhaps the most successful business model for the Internet known to date and a major element of the online ecosystem. Advertising companies help their clients market products and services to the right audiences of online users. In doing so, advertising companies collect large amounts of user-generated data (e.g. browsing logs and ad clicks), perform sophisticated user profiling, and compute the similarity of ads to user profiles. User identity plays an essential role in the success of an online advertising company or platform.
As the number and variety of devices increases, online user activity becomes highly fragmented. People check their mobile phones on the go, do their main work on laptops, and read documents on tablets. Unless a service supports persistent user identities (e.g. Facebook Login), the same user on different devices is treated as several independent users. Rather than modeling at the user level, online advertising companies have to deal with weak user identities at the level of devices. Moreover, even a single device can be shared by several users, e.g. kids and parents sharing a computer at home. Building an accurate user identity is therefore a difficult and important problem for advertising companies. The crucial task in this process is finding the same user across multiple devices and integrating her/his digital traces to perform more accurate profiling.
The Cross-Device Entity Linking Challenge provides a unique opportunity for academia and industry researchers to work on this challenging task. We encourage both early career and senior researchers to participate in the challenge by testing new ideas for cross-device matching and consolidating the approaches already published and described in existing work. Successful participation in the challenge requires solid knowledge of entity resolution, link prediction, and record linkage algorithms, to name just a few.
For model development, we release a new dataset provided by Data-Centric Alliance (DCA). The dataset contains an anonymized browse log for a set of anonymized userIDs representing the same user across multiple devices. We also provide obfuscated site URLs and HTML titles. Viewed from a graph-theoretical perspective, the released data describes nodes (userIDs at the level of devices and the corresponding click-stream logs) and a subset of known existing edges. Participants have to predict new edges (identify the same user across multiple devices). The evaluation scores the predicted edges against the true edges using the F1 measure.
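As an illustration of edge-level F1 scoring, the following is a minimal Python sketch, not the official evaluation script. It assumes that each edge is an unordered pair of device-level userIDs and that a predicted edge counts as correct only if it appears in the set of true edges. The function name `edge_f1` and the sample IDs are hypothetical.

```python
def edge_f1(predicted_edges, true_edges):
    """Return (precision, recall, F1) for predicted undirected edges.

    Assumption: an edge is an unordered pair of userIDs, so
    ("u1", "u2") and ("u2", "u1") denote the same edge.
    """
    predicted = {frozenset(edge) for edge in predicted_edges}
    actual = {frozenset(edge) for edge in true_edges}

    # Edges that were predicted and also exist in the ground truth.
    true_positives = len(predicted & actual)

    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0

    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: 3 predicted edges, 4 true edges, 2 in common.
predicted = [("u1", "u2"), ("u3", "u4"), ("u5", "u6")]
actual = [("u2", "u1"), ("u3", "u4"), ("u7", "u8"), ("u9", "u10")]

precision, recall, f1 = edge_f1(predicted, actual)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because F1 is the harmonic mean of precision and recall, submitting a very large number of candidate edges raises recall but lowers precision, so a submission has to balance the two.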
  • CIKM Cup 2016 Track 2: Personalized E-Commerce Search Challenge
The Personalized E-commerce Search Challenge provides a unique opportunity for academia and industry researchers to test new ideas for personalized e-commerce search and consolidate the approaches already published and described in existing work. Successful participation in the challenge requires solid knowledge of learning to rank, log mining, and search personalization algorithms, to name just a few.
For model development, we release a new dataset provided by DIGINETICA and its partners containing anonymized search and browsing logs, product data, anonymized transactions, and a large dataset of product images. Participants have to predict the search relevance of products according to the personal shopping, search, and browsing preferences of the users. Both “query-less” and “query-full” sessions are possible. The evaluation is based on click and transaction data.
The Personalized E-commerce Search Challenge also continues the series of search challenges organized by major search industry leaders such as Yandex, Yahoo, and Baidu. In the past, participants worked on learning to rank documents, predicting the relevance of documents using search logs, detecting search engine switching in search sessions, personalizing the user experience for web search, and classifying queries.
The unique features of this challenge are that we (1) release both search and browsing logs, whereas in the past only search logs were provided; (2) focus on e-commerce search and hence have transaction data and unique (exploratory) search behavior patterns; and (3) provide product images, enabling experimentation with visual features for search ranking.