Case Study of a Difficult Federal Government Scrum Project: FBI Sentinel
According to today’s Wall Street Journal (“FBI Goes Digital, After Delays”), the FBI’s Sentinel system is finally ready for agents to use, replacing manual processes and older electronic case management tools. As of August 2012 it’s difficult to predict how well it will perform over time. If it serves the FBI well, Scrum and Agile advocates will cite it as evidence of Scrum working where prior approaches had failed. For example, Sentinel was already cited in Ken Schwaber’s fourth book.
A recent Department of Justice Office of the Inspector General (OIG) report provides a great case study of why both traditional and Scrum approaches to software development can be so hard. If you want to save your organization millions of dollars, I’d encourage you to read the whole report. In the article below I’ve excerpted some parts you’ll find particularly interesting.
First they tried handing it off to a government contractor:
- The FBI’s attempt to move from a paper-based to an electronic case management system began in 2001 with the Virtual Case File (VCF), a major component of the FBI’s Trilogy IT modernization project.
- Designed to replace the obsolete Automated Case Support (ACS) system, the FBI abandoned the VCF project in 2005 after spending $170 million.
According to a GAO expert, “It was a classic case of not getting the requirements sufficiently defined in terms of completeness and correctness from the beginning,” and “had there been an [up front] architecture, the likelihood of these requirements problems would have been vastly diminished.” Thank God for experts, right? If only we had perfect requirements and architectures in the beginning! Next they’ll tell us planes crash because of gravity: It was a classic case of gravity causing the aircraft to collide with the Earth. If only planes had magic wing-fairies the likelihood of these gravity problems would be vastly diminished.
So the FBI spelled out over a thousand requirements and tried pre-planned, quasi-iterative development at a different government contractor:
- On March 16, 2006, the FBI announced the award of a $305 million contract to Lockheed Martin as part of a $425 million project to develop Sentinel, a new electronic case management system.
That also failed. It is customary for project managers to complain they don’t have enough “resources.” But the history of large projects almost makes you think using too many people causes more problems than it solves.
- The FBI issued a stop-work order to Lockheed Martin in July 2010, and in September 2010 the FBI announced its plans to complete the remaining two phases of Sentinel using a new Agile methodology development strategy.
(By the way, I hope you are detecting a contradiction between the words “phases” and “Agile.”) According to the report, the FBI kept only 10 of the 135 Lockheed people and moved development in-house using Scrum. Good first steps. So “doing Scrum” got the project done early and under budget? Uh, no:
- Because of problems encountered during an FBI-wide test exercise of Sentinel in October 2011, the CTO also stated that the schedule for completing Sentinel’s development had been extended from December 2011 to February 2012.
- As a result of the exercise, which included 743 participants, the FBI identified deficiencies with Sentinel’s performance.
- According to the FBI’s Chief Information Officer (CIO), the problems were the result of insufficient hardware capacity and the FBI determined that it will have to purchase new hardware before Sentinel can operate properly when it ultimately is deployed to all Sentinel users.
Dammit! Note that Sentinel reportedly went live July 1, 2012, five months later than the revised prediction, and we still don’t know how well it will work when fully adopted. The two previous approaches failed entirely, and now our personnel costs are much lower. So this slip isn’t a catastrophe. What about these “deficiencies” though?
- Sentinel experienced significant performance problems during the Sentinel Functional Exercise.
- The FBI attributed these performance problems to either the system architecture or the computer hardware.
- According to the FBI, subsequent operational testing confirmed the inadequacy of the legacy hardware and the requirement to significantly expand the infrastructure before the system could be deployed to all users.
There’s that word “architecture” again. Why were the architecture and/or hardware limitations discovered so late?
- An Agile Development Team official stated that required testing had not been completed within the established time parameters because testing personnel have encountered difficulty setting up testing programs, software, and procedures.
As Ron Jeffries recently wrote, “We have too many people who can almost program, and too few people who can test software.” The most effective Agile teams blend their testing in with their coding each day, so one cannot proceed without the other. But Sentinel’s burndown chart gives the impression stuff was done. Why did the Product Owner declare stuff “done” when they knew they weren’t testing properly?
- The Sentinel Product Owner, the person responsible for tracking the completion of project work, stated that the completion criteria only broadly informs project personnel whether functionality development has been completed at the end of each sprint, and does not specifically address whether functionality is completed.
I didn’t follow that logic either. One hopes the report isn’t representing the Product Owner’s statement accurately. Scrum only works with a rigorous and clear definition of “done” requiring the team to attempt a potentially shippable product increment every Sprint. “Done” should include proper testing, among other things. If we didn’t properly test the stuff we built in the very first Sprint, we should declare those items not done rather than take on more functionality the next Sprint. Before we learn the new skills, this will feel like going too slow. But the only alternative is not knowing where we really are. Not knowing where we really are looks a bit like this burndown chart that changes direction at Sprint 10.
What happened around Sprint 10? Real users actually tried using the product. I’m guessing stuff we thought was done reappeared on the backlog. Perhaps some rework is inevitable, so we want to find out sooner rather than later, by testing sooner rather than later.
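To make the effect concrete, here is a minimal Python sketch of how a loose definition of done can produce a burndown that reverses. Everything in it is an assumption for illustration: the item names, point values, and sprint numbers are invented and are not Sentinel data, and `Item`, `remaining`, and `burndown` are hypothetical helpers. The idea is that a team counts coded-but-untested work as done until real users exercise the product, at which point only tested work counts.

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    points: int
    coded_in_sprint: int   # sprint in which coding finished
    tested_in_sprint: int  # sprint in which the item actually passed testing


def remaining(items, sprint, require_tests):
    """Points still not done at the end of `sprint` under the chosen definition."""
    total = 0
    for item in items:
        finished = item.tested_in_sprint if require_tests else item.coded_in_sprint
        if finished > sprint:
            total += item.points
    return total


def burndown(items, sprints, honest_from=1):
    """Remaining points per sprint.

    Before sprint `honest_from`, coded-but-untested items are counted as done.
    From `honest_from` onward, only tested items count as done.
    """
    return [
        remaining(items, s, require_tests=(s >= honest_from))
        for s in range(1, sprints + 1)
    ]


# Invented backlog: two items are coded early but not properly tested until later.
backlog = [
    Item("feature A", 5, coded_in_sprint=1, tested_in_sprint=1),
    Item("feature B", 8, coded_in_sprint=2, tested_in_sprint=5),
    Item("feature C", 8, coded_in_sprint=3, tested_in_sprint=6),
    Item("feature D", 5, coded_in_sprint=4, tested_in_sprint=4),
]

strict = burndown(backlog, 6)                  # "done" always includes testing
claimed = burndown(backlog, 6, honest_from=4)  # users first try the product in sprint 4

print(strict)   # [21, 21, 21, 16, 8, 0]
print(claimed)  # [21, 13, 5, 16, 8, 0]
```

The strict line looks painfully flat for three sprints, which is the "going too slow" feeling. The claimed line looks great until sprint 4, then jumps from 5 back up to 16 when the untested work reappears. Both teams finish at the same time in this toy example; only one of them knew where it really was along the way.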
By the way, in my opinion there is a better way to draw a burndown chart.
More notes on testing from the OIG report:
- Our concerns about the lack of transparency of Sentinel’s progress are magnified by the apparent lack of comprehensive and timely system testing.
- we believe it is vital that the Agile Development Team only claim fully tested functionality as complete during the biweekly demonstration of a sprint’s completion.
That’s good advice for all of us. In Sentinel’s case, it’s possible the definition of done should have addressed regulatory compliance more clearly:
- FBI IT governance officials expressed concern that they were not provided documentation to establish that security features were built into the foundation of Sentinel’s architecture.
- During our review, a Sentinel Agile Team member stated that development team personnel had to re-develop a component of Sentinel’s digital signature functionality because it was not compliant with the National Institute of Standards and Technology’s Federal Information Processing Standards and had not been tested for compliance when it was initially developed.
Here’s what the OIG had to say about Agile:
- The FBI’s transition to an Agile development approach has reduced the risk that Sentinel will either exceed its budget or fail to deliver the expected functionality by reducing the rate at which the FBI is spending money on Sentinel and by instituting a more direct approach to the FBI’s monitoring of the development of Sentinel.
I want to stress that all organizations, not just the FBI, struggle with the definition of done when adopting Scrum. All the other stuff in Scrum (the team self-organization, the Sprint Review Meetings, the burndowns, the Sprint and release planning, the feedback loops) only works with a clear and rigorous definition of done that includes proper testing.
DISCLAIMER: I have no special knowledge about the FBI Virtual Case File or Sentinel projects beyond publicly available sources, which probably got some of the facts wrong. This article is meant to illustrate things we see go wrong over and over, not point fingers at anyone. It’s easy to criticize from afar. I’m encouraged by today’s news that a Scrum Team managed to deploy a system that previously couldn’t be built by hundreds of people with hundreds of millions of dollars.