AALL Spectrum

AALL Spectrum / November/December 2019 / Volume 24, No. 2

AALL Spectrum / Published by American Association of Law Libraries

Issue link: https://epubs.aallnet.org/i/1178310


document structure plays a larger role for regulations and statutes engines. Unlike horizontal search engines, which mostly rely on query-document term similarity, vertical search approaches allow us to codify how we think about a problem and represent it in a way suitable for algorithmic computations.

For example, a decision is often not only meant to resolve the dispute at hand but can also be used to resolve similar disputes in the future. The ramifications of a decision are not fully contained in the decision's words and phrases; rather, the decision can only be fully understood after the fact, as other courts apply it and interpret it. A search engine, therefore, must capture a "meaning" that has not happened yet or, for older decisions, a meaning that is distributed across many decisions (i.e., the citation network of a case).

To solve this problem, Westlaw uses a class of machine learning algorithms called learning to rank, which combines a diverse set of factors in its ranking function and frames the challenge as an optimization problem. In the case of Westlaw, the optimization minimizes the number of pairwise inversions, where a lower-quality result is ranked before a higher-quality result. To ensure this is not a popularity contest, where highly cited decisions dominate the results, Westlaw algorithms use normalized features to level the playing field. In addition, we use a number of stratified sampling strategies to ensure the long tail of the law is represented in both the training and the evaluation of the algorithms.

Recently, Westlaw Edge extended those capabilities through a set of proprietary natural language processing algorithms that aim to "understand the meaning of a query" and, when appropriate, provide answer-like results.

The Researchers' Challenges with Any Algorithm

Boolean ("terms and connectors") searching is satisfying in its simplicity.
Although there are minor variations in the way Boolean searches are processed in each database, the results of a Boolean search may be cleanly divided into two buckets: those with a precise literal match to the search query are returned as results, and those without a precise literal match are not. The operation of such a search is easy to understand, and results can be verified with certainty. Another benefit is that the intent of the search is known to the researcher. A Boolean algorithm is simple: The primary search syntaxes at work are proximity and field limiters. Skill is needed: A Boolean search presupposes the researcher is able to effectively translate a concept into an often complex search query that captures all of the concept's possible expressions within an increasingly large corpus of data.

Semantic or "algorithmic" search approaches search results differently. Rather than deterministically dividing results into matches and non-matches, modern search engines take a probabilistic approach. Effectively, the entire corpus is ranked from "most likely to be responsive" to "least likely to be responsive." The benefit is that multiple search syntaxes and formulas are applied to the query. But the algorithm is probabilistically guessing at intent. Therefore, skill is needed: The researcher still needs to formulate a sufficiently targeted search to help the algorithms do their work. Serious thinking about the legal problem is the best first step for every search.

Both types of searches set algorithms to work to bridge the gap between the researcher's query and the documents in the system. The human researcher still plays a vital role in priming the algorithm that closes that gap.

AALL2go EXTRA
Listen to the 2018 AALL Annual Meeting recording "What's in the Black Box?: An Inside Look at the Algorithms for Westlaw, LexisNexis, and Fastcase," at bit.ly/AM18Blackbox.
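To make the article's contrast between Boolean matching and probabilistic ranking concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the three-document corpus, the function names `boolean_and` and `ranked`, and the toy score (a count of query-term occurrences) are all hypothetical and do not represent how any vendor's engine works.

```python
from collections import Counter

# Hypothetical three-document corpus, for illustration only.
DOCS = {
    "A": "landlord must return the security deposit to the tenant",
    "B": "tenant sued landlord over the deposit and the deposit interest",
    "C": "employer withheld wages from the employee",
}


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def boolean_and(query, docs):
    """Return ids of documents that literally contain every query term."""
    terms = set(tokenize(query))
    return sorted(
        doc_id for doc_id, text in docs.items()
        if terms <= set(tokenize(text))
    )


def ranked(query, docs):
    """Order the entire corpus by a toy score: total query-term occurrences."""
    terms = set(tokenize(query))
    scores = {}
    for doc_id, text in docs.items():
        counts = Counter(tokenize(text))
        scores[doc_id] = sum(counts[t] for t in terms)
    # Highest score first; ties are broken by document id for determinism.
    return sorted(docs, key=lambda d: (-scores[d], d))


query = "tenant deposit interest"
print(boolean_and(query, DOCS))  # only documents matching every term
print(ranked(query, DOCS))       # every document, most to least likely
```

With this query, the Boolean search returns only document B, because A lacks the word "interest." The ranked search returns all three documents in the order B, A, C, so the near-miss A still surfaces but the researcher must judge where relevance ends.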
READ
Susan Nevelow Mart's article "Every Algorithm Has a POV," from the September/October 2017 issue of AALL Spectrum at bit.ly/SO17Algorithm.

© 2019 BY ED WALTERS
ED WALTERS
CEO
Fastcase, Inc.
Washington, DC
ed.walters@fastcase.com

© 2019 BY TITO SIERRA
TITO SIERRA
VP, GLOBAL PRODUCT MANAGEMENT
LexisNexis Legal & Professional
Raleigh, NC
tito.sierra@lexisnexis.com

© 2019 BY SUSAN NEVELOW MART
SUSAN NEVELOW MART
PROFESSOR AND DIRECTOR OF THE LAW LIBRARY
Colorado Law
Boulder, Colorado
susan.mart@colorado.edu

© 2019 BY JOE BREDA
JOE BREDA
PRESIDENT
Bloomberg Law
Arlington, VA
jbreda@bloomberglaw.com

© 2019 BY DR. KHALID AL-KOFAHI
DR. KHALID AL-KOFAHI
HEAD OF THOMSON REUTERS CENTER FOR AI
Thomson Reuters
Toronto, Canada
khalid.al-kofahi@thomsonreuters.com

Research + Analytics
Information Management
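The learning-to-rank objective mentioned earlier in the article, minimizing pairwise inversions, can be illustrated with a short Python sketch. This is not Westlaw's implementation. The function name `count_pairwise_inversions` and the numeric quality labels are hypothetical, and a production system would optimize a smooth surrogate of this count over many features rather than compute it directly.

```python
from itertools import combinations


def count_pairwise_inversions(ranked_quality):
    """Count pairs in which a lower-quality result sits above a higher-quality one.

    ranked_quality lists a quality label for each result in ranked order
    (index 0 is the top result); a larger label means a better result.
    """
    # combinations() yields each pair as (earlier, later) in ranking order.
    return sum(
        1 for earlier, later in combinations(ranked_quality, 2)
        if earlier < later
    )


# Hypothetical quality labels for three result lists over the same documents.
print(count_pairwise_inversions([3, 2, 1]))  # ideal ordering
print(count_pairwise_inversions([3, 1, 2]))  # one pair out of order
print(count_pairwise_inversions([1, 2, 3]))  # fully reversed ordering
```

The ideal ordering has zero inversions, swapping the last two results adds one, and the fully reversed list has three. A ranking function that lowers this count places better results higher, regardless of how often any single decision has been cited.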
