Claude rules: An evaluation of large language models’ applicability to solve cases in German business law

  • In the evolving field of legal information systems, Claude 3 and other advanced conversational agents (CAs) are emerging as transformative forces. This interdisciplinary study combines quantitative methods, legal analysis, and digital transformation approaches to evaluate the efficacy of leading commercially available CAs in the German legal environment. Employing a corpus of 200 unique legal tasks, the research benchmarks Claude 3 against notable systems such as Google Gemini and ChatGPT versions 4 and 3.5. Through automated evaluations of 1,600 responses generated by these CAs, Claude 3 is demonstrated to be the most effective system, capable of successfully addressing realistic legal challenges and passing a German business law examination with an overall score of 60%—significantly surpassing the 50% score of the previous performance leader ChatGPT-4. Despite its superior performance, Claude 3, along with other evaluated systems, exhibits considerable limitations that can be difficult to identify. Based on these insights, it is recommended that legal professionals thoroughly verify all CA-generated content before use. Additionally, caution is advised for novices utilizing CA-generated legal advice, due to the specialized knowledge required for proper evaluation. This study contributes to the ongoing study of digital transformation in the legal domain, offering insights for both academic and industry stakeholders.

Metadata
Author of HS Reutlingen:Schweitzer, Sascha; Conrads, Markus; Naeve, Jörg
URN:urn:nbn:de:bsz:rt2-opus4-53416
DOI:https://doi.org/10.1016/j.procs.2024.09.406
ISSN:1877-0509
Published in:Procedia computer science
Publisher:Elsevier
Place of publication:Amsterdam
Document Type:Conference proceeding
Language:English
Publication year:2024
Tag:conversational agents; digital transformation; large language models; legal information systems; performance assessment
Volume:246
Page Number:9
First Page:2675
Last Page:2683
DDC classes:004 Computer science
Open access?:Yes
Licence (German):Creative Commons - CC BY-NC-ND - Attribution - NonCommercial - NoDerivatives 4.0 International