Tuan Nguyen, Joe Zeimen, Zachary Alexander - Extracting structured data for Relevance Ranking

Abstract

Learning to Rank (LTR) is the application of machine learning to rank search results according to their degree of relevance to the query. Salesforce Enterprise Search employs a hand-crafted ranking function to score search results and order them accordingly for users. Data about this ranking process are stored as JSON, in a nested tree of arbitrary depth. We present our first effort to parse this data and extract the inputs of the ranking function into a tabular format. We then present some of the questions we can ask and hypotheses we can test with this data, and how we use machine learning algorithms to address them.
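
As a rough illustration of the parsing step, the sketch below flattens a nested JSON tree of arbitrary depth into one row per record, with one column per leaf path. It is not the format or code used by Salesforce Enterprise Search: the helper names (flatten, to_table) and the sample fields (query, result, features, title_match, recency) are hypothetical and chosen only for the example.

```python
import json


def flatten(node, path=()):
    """Flatten nested dicts/lists into {"dotted.path": leaf_value}."""
    if isinstance(node, dict):
        items = node.items()
    elif isinstance(node, list):
        items = enumerate(node)
    else:
        # Leaf value: the accumulated path becomes the column name.
        return {".".join(str(part) for part in path): node}
    flat = {}
    for key, value in items:
        flat.update(flatten(value, path + (key,)))
    return flat


def to_table(records):
    """Return (columns, rows); leaves missing from a record become None."""
    flat_records = [flatten(record) for record in records]
    columns = sorted({name for flat in flat_records for name in flat})
    rows = [[flat.get(name) for name in columns] for flat in flat_records]
    return columns, rows


# Hypothetical sample data, invented for this sketch.
raw = """
[
  {"query": "acme",
   "result": {"id": "001", "score": 3.5,
              "features": {"title_match": 2.0, "recency": 1.5}}},
  {"query": "acme",
   "result": {"id": "002", "score": 1.0,
              "features": {"title_match": 1.0}}}
]
"""

columns, rows = to_table(json.loads(raw))
print(columns)
for row in rows:
    print(row)
```

Each column name records where in the tree a value came from, so the resulting table can be handed to a learning algorithm as a feature matrix once non-numeric columns are encoded or dropped.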