How fuzzing can help make robot lawyers trustworthy


Courts and state bar associations are (finally) beginning to engage with “robot lawyers” — software applications that guide a user through some legal process. In Utah, the state court system has even gone so far as to set up a regulatory sandbox for testing and evaluating alternative means of delivering legal services, from software applications to non-lawyer ownership of law firms.

It is tempting to think about the regulatory challenge here as merely deciding what a new market for legal services ought to look like. Courts should resist this temptation, and consider why the legal profession has ethical rules and strong fiduciary duties in the first place: to maintain trust in the law and legal institutions. Using this as a starting point changes the frame. It suggests a regulatory approach that is tailored to the unique strengths and weaknesses of client-facing legal software. “Robot lawyers” are not lawyers. They are software, and we shouldn’t think of them like humans. And we shouldn’t regulate them like human lawyers, either.

Regulation does matter, because software has power: it influences how users engage with a legal system, and frames their perception of how the law works and of the remedies available to them. Errant recommendations from a software tool can damage a client, and can erode their trust in courts and in whether the law works for them. At the same time, the nature of software means that we can test its inner workings, accuracy, and impact more regularly and rigorously than a human’s.

The critical question about “robot lawyers” is not whether to admit this (wholly fictional) class of “person” into the profession, or whether they are more effective than human lawyers. Instead, the question is how software tools can best be positioned to bolster trust in the law and legal institutions, rather than erode it. We must ask what fiduciary duties legal software providers owe as a result of their power over users, what safeguards flow from those duties, and how courts, bar associations, and states can build systems to identify, minimize, and correct errors that software tools may create.

A fuzzy path forward

As a baseline, courts could generate a suite of random mock client issues, stories, and profiles, and use them to stress-test and audit legal tech tools. This suite could change from time to time, to help identify edge cases and emergent issues, and to mitigate cheating by bad actors. And the tests could be run every time a tool changes, or the law changes. For tools that are primarily multiple choice “wizards” or form-fillers, it’s feasible to generate a table that lists every possible configuration of inputs, and every possible output that a tool might yield. And once again we could update this table automatically upon changes in software or the law.
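For a wizard-style tool, the exhaustive table described above is straightforward to build in code. The sketch below is a minimal illustration, assuming a hypothetical `recommend_form` decision function and made-up form names; a real audit would run against the provider's actual tool.

```python
from itertools import product

# Hypothetical decision logic for a simple eviction-help "wizard".
# A real tool would expose its own rules; this stands in for one.
def recommend_form(has_written_lease, received_notice, days_since_notice):
    if not received_notice:
        return "no-action-needed"
    if days_since_notice <= 3:
        return "form-UHE-105"  # hypothetical emergency answer form
    return "form-UHE-101" if has_written_lease else "form-UHE-102"

# Enumerate every configuration of inputs and record the tool's output.
inputs = {
    "has_written_lease": [True, False],
    "received_notice": [True, False],
    "days_since_notice": [0, 3, 4, 30],  # representative boundary values
}
table = [
    dict(zip(inputs, combo), output=recommend_form(*combo))
    for combo in product(*inputs.values())
]

for row in table:
    print(row)
```

Even this toy version shows the shape of the audit artifact: a complete input-to-output map that a court (or the provider) can regenerate whenever the software or the underlying law changes.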

For a development team, this is useful but unexciting. But for a court attempting to regulate legal software, this starts to lay the foundation for continuous auditing: an opportunity for courts and software providers to identify and correct errors as they occur. A table like this could be used to identify obvious guidance errors, such as a wrong form, or to ensure that tools are responsive to changes in law. Courts could invest in building “client test suites”: a baseline set of scenarios that applications need to be validated against, along with a random set of “fuzzy” tests, and a table that comprehensively captures possible paths through a legal software application.
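Continuous auditing then reduces to a diff: regenerate the table after each change and flag every input configuration whose guidance shifted. A minimal sketch, assuming the table rows from the enumeration above (the `diff_tables` helper and its form names are illustrative):

```python
# Compare two output tables (e.g., before and after a software or law
# change) and report every input configuration whose guidance changed.
def diff_tables(old, new):
    changes = []
    for old_row, new_row in zip(old, new):
        config = {k: v for k, v in old_row.items() if k != "output"}
        if old_row["output"] != new_row["output"]:
            changes.append((config, old_row["output"], new_row["output"]))
    return changes

# Toy usage: one row's recommendation changed between versions.
old = [{"region": "A", "output": "form-1"}, {"region": "B", "output": "form-2"}]
new = [{"region": "A", "output": "form-1"}, {"region": "B", "output": "form-3"}]
changes = diff_tables(old, new)
print(changes)
```

Each flagged change is either an intended update (the law changed) or a candidate error to investigate, which is exactly the triage a continuous audit needs.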

We could go further. In addition to requiring software providers to make their tools available for this sort of testing, courts could require software providers to report every user journey through their application, even unsuccessful or unfinished ones. A user journey dataset could be invaluable for understanding how demand for legal help is translated (or not) into action. Courts and researchers could use it to identify user errors or dead ends: where a user is told that a tool isn’t the right fit for their needs, and where the user may benefit from a subsequent referral. We could even connect data on case outcomes directly to the software tool (or tools) that produced them.
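Analyzing such a journey dataset can be simple once it exists. The sketch below assumes a hypothetical log format (each journey as a list of screen names, with made-up screens like "not-a-fit") and classifies journeys into completions, referral candidates, and abandonments:

```python
from collections import Counter

# Each journey is the ordered list of screens a user visited; the final
# entry shows how it ended. Screen names here are purely illustrative.
journeys = [
    ["intake", "lease-questions", "form-generated"],
    ["intake", "lease-questions", "not-a-fit"],
    ["intake"],                      # abandoned immediately
    ["intake", "lease-questions"],   # abandoned mid-flow
    ["intake", "not-a-fit"],
]

def classify(journey):
    last = journey[-1]
    if last == "form-generated":
        return "completed"
    if last == "not-a-fit":
        return "referral-candidate"  # user told the tool isn't a fit
    return "abandoned"               # journey ended without a terminal screen

outcomes = Counter(classify(j) for j in journeys)
print(outcomes)
```

The "referral-candidate" and "abandoned" buckets are precisely where courts would look for unmet demand and dead ends.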

Of course, this is just one approach, for one set of software tools. Experimentation may reveal others. Different services may require different levels of monitoring, depending on the vulnerability or sophistication of the expected client base, or the nature of the service on offer. And, of course, this does not address the larger structural questions that legal institutions must reckon with in order to build and maintain trust in a digital age.

Instead, this example hopefully illustrates what’s possible when regulation is adapted to specific contexts and activities, rather than grafted from an old approach and twisted to fit.

Justice as error correction

A crisis of trust is already in full bloom in the tech industry, driven by its demonstrated potential to manipulate, deceive, and discriminate. Courts should not have to relearn lessons that society at large is already learning, and should be absolutely intolerant of a black box culture taking root in our justice system.

There are a host of unanswered questions about whether and how client-facing legal software fits into a broader market for legal services. One thing is clear: regulation should not be based on the fiction of a robot lawyer, but on the very real need to have legal institutions worth trusting.

Lawyer, technologist. Affiliate at Berkman Klein Center & Duke Center on Law and Technology. Adjunct prof. at Georgetown Law.
