Phoenix - A Challenge to the Public Sector

(This post was originally published on Medium)

The irony would be sublime — “Rising from the ashes of Phoenix”. Irony, however, isn’t my goal. Instead I intend to provide a concrete way forward to our elected & appointed federal government officials to replace the Phoenix pay system. It frustrates me to no end that in 2018 we have to pay for failed IT systems that use approaches firmly rooted in the 1960s, when I and many other software people know that we could ship most if not all of the system, with substantially better quality, at a cost that’s at least one if not two orders of magnitude lower.

In the 2018 budget, the Liberal government acknowledged that the Phoenix pay system had severe issues and would have to be replaced. Phoenix was intended to replace the antiquated mainframe systems that paid Canada’s public servants. It has been live for 2 years and has been fraught with issues, resulting in under- and overpayments of those employees and, in some cases, no payments at all.

I will present an alternative approach to delivering such a system. My approach isn’t based on private sector naïveté — I spent over 15 years building systems large and small for 9 departments and Crown Corporations and have no illusions about the difficulties faced in that environment. I’ve also seen how different approaches within government can be remarkably successful and wish to leverage that experience.

My recommendations, therefore, are based on a background of 30 years of combined public and private sector software system delivery.

The Crash of the Phoenix

Depending on what you read, Phoenix was supposed to cost $50 million Canadian to deliver but has, as of March 2018, cost over $450M. The government has budgeted another $431M to deal with the continuing issues until a replacement system can be delivered. To be fair, and based on my previous history working on large government projects, I’m going to assume that the combined $881M is for the whole program: building the system, consolidating the compensation handling in Miramichi, New Brunswick, providing training, etc. The original $50M was probably for just the system itself. News reports have a tendency to show the numbers in the least complimentary light. The truth about the cost of the actual computer system likely lies in the middle somewhere, but that really isn’t the point.

This is yet another example of our government taking the same old broken approach to defining, procuring, and delivering these systems when they know that it just doesn’t work. If you believe that statement is unfair, then you can simply refer to the Auditor General’s reports from practically every year going back to the 1990s.

Phoenix was yet another case where the approach taken would:
  • Mandate the use of a commercial off-the-shelf (COTS) software package (PeopleSoft in the case of Phoenix) in order to leverage existing functionality;
  • Dream up every possible requirement that a compensation management system would need, requiring substantial customization of the COTS package;
  • Consolidate all that into a Request for Information (RFI) consisting of a metre-thick binder;
  • Review the RFI responses from the vendors and revise the binder such that it becomes 2 metres thick;
  • Use the binder to obtain approval from Treasury Board for some low-ball amount for the project;
  • Issue a massive Request for Proposals (RFP) to only those vendors who are large enough to handle such a massive system (remember the 2-metre-thick binder);
  • Take months if not years to evaluate the responses;
  • Select the winner, IBM in the case of Phoenix, who has also low-balled their bid;
  • Begin the massive waterfall project, with the hope that everything will just fall into place like it never has in any project ever before;
  • Have the integrator issue a constant stream of change requests to cover the new work whenever missed or misunderstood requirements surface during development.

After multiple years (2011–2016), the system was rolled out after being delayed in 2015 at IBM’s request due to critical issues. Despite constant problems that were overwhelming the compensation people who worked with the system, the government kept bulldozing forward to meet their deployment objectives. One of those objectives was to lay off 2,700 compensation advisors across the government who had worked in offices across the country (they were replaced by the single, consolidated pay centre in Miramichi, one of the cost savings “wins” the system was intended to provide). On March 9, 2016 the first full payday was processed. Ish.

By July 2016, some 80,000 employees had issues with their pay, ranging from complete non-payment to over- and underpayments. More money was thrown at the system, akin to tossing a can of gasoline on a fire.

In the February 2018 budget, the government essentially admitted defeat. Phoenix is going to chew up further hundreds of millions of dollars until its replacement can be delivered. Meanwhile, the Treasury Board quietly earmarked $16M over two years to study a replacement. Study. This is where my blood pressure begins to rise.

A Definition of Insanity…

The budget also indicated that, after the 2-year study had taken place, a new procurement process would start, followed by another development attempt, with the system delivered probably in 2025.

All of that indicates to me that the same people who created the Phoenix problem will harbour the same assumptions, use the same processes and expect a different outcome. It’s true that doing the same thing repeatedly and expecting different results each time is not actually a definition of insanity. It is, however, a tremendously effective way to waste enormous amounts of time, money and, dare I say it, the sanity of the people involved.

I don’t just fear that this will happen, I know it will.

It will probably cost as much as or more than Phoenix. It will use one or more of the “usual suspects” among the big system integrators such as IBM (again), CGI, HP, Accenture, etc. It will be late and have significant problems, but those will still be spun as a success story because, for political reasons, it can’t fail.

The Case for Change

My private sector experience suggests that, given $16M and two years, I could assemble a damned good team and ship a not insignificant increment of the replacement system. Yes, I know — also from experience — that the public sector is different. But this is where I want to start teasing apart the massive, gnarly knot of assumptions that comprise the existing approach to delivering these systems.

Several years of my public sector development experience were spent building systems for Human Resources. I also did a 6-month stint in Compensation systems, where I wrote code that interfaced directly with one of the predecessors to Phoenix, Online Pay. So, I’m coming at this problem with an understanding of what’s involved for at least one government department.

Let’s examine two of the assumptions regarding government pay — that it is extremely complex and requires a massive team.

Compensation in the government is complicated, with some 80,000 business rules according to one source, but it isn’t complex. For a given input, you can accurately predict the output. That’s what delineates complicated from complex, in which the output can only be observed afterwards and not predicted in advance.
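To make that distinction concrete, here is a minimal sketch of what "complicated but not complex" looks like in code. The overtime rule, threshold and multiplier below are invented for illustration and are not actual federal pay rules. The point is only that a business rule written as a pure function returns the same output for the same input, so it can be checked against a known answer before anyone’s pay depends on it.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical values for illustration only; not an actual federal pay rule.
OVERTIME_THRESHOLD_HOURS = Decimal("37.5")
OVERTIME_MULTIPLIER = Decimal("1.5")


def gross_weekly_pay(hourly_rate: Decimal, hours_worked: Decimal) -> Decimal:
    """Return gross pay for one week, rounded to the nearest cent.

    Hours up to the threshold are paid at the regular rate; hours beyond
    it are paid at the regular rate times the overtime multiplier.
    """
    regular_hours = min(hours_worked, OVERTIME_THRESHOLD_HOURS)
    overtime_hours = max(hours_worked - OVERTIME_THRESHOLD_HOURS, Decimal("0"))
    pay = hourly_rate * (regular_hours + overtime_hours * OVERTIME_MULTIPLIER)
    return pay.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    # Same input, same output, every time: the rule is predictable.
    print(gross_weekly_pay(Decimal("40"), Decimal("40")))
```

Eighty thousand rules of this kind are a lot of work to write and test, but each one can be verified in isolation, which is the kind of work a small, disciplined team can manage.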

There are some 300,000 people who need to be paid by the system, which means the data volumes are quite large. However, neither the number of rules nor the number of employees necessitates a massive project, with a massive team fielded from a massive company and an associated massive budget.

Those assumptions are symptomatic of the broken process. Surely something this large and complicated can’t be handled by a small team! Well, that’s exactly the approach used to fix the disastrous initial version of the healthcare.gov web site and health insurance application process in the U.S.

I know what you’re thinking — this isn’t a web site, it’s a serious payroll management system. First, healthcare.gov was more than just a web site and had to process the enrolment of literally millions of applicants. The usage volumes were considerably larger than what Phoenix handles. Secondly, it had to handle U.S. government security and privacy requirements that are at least as stringent as those in Canada. Finally, from the perspective of the end users it was at least as important as being paid. So we can dispense with any notion that healthcare.gov isn’t in the same league as Phoenix.

On the day that healthcare.gov launched in October 2013, it was a disaster. Only 6 people were able to successfully sign up for health care. Major aspects of the system such as direct enrolment simply didn’t work. This was known before the launch, of course, but the date was the date was the date.

Sound familiar? A large team from large integrators building a large system with a large budget, and it didn’t work. What the U.S. government did next, though, is a hint at what I have in mind to fix Phoenix.

First, a small group of people were recruited from Silicon Valley as part of what was called the Tech Surge to perform emergency work to get the system into a working state. Second, after that initial surge, a second group took over and spent months replacing the applications that had been written by the massive original project group.

Not only did both of these teams successfully ship the required software, they did it at tiny fractions of the cost. One example cited is the login process, which had cost the original massive team $250M USD (yes, that’s a quarter of a billion U.S. dollars!) to build, with $70M USD annually to maintain. The small group replaced that with one that cost $4M USD and under $1M USD annually for maintenance. With the original version, it took from 2 to 10 seconds to log into the system. The replacement took 30 milliseconds. The old system took a user, on average, 20 minutes to complete the enrolment forms after working through up to 76 pages that “helped” direct the person through the process. The new system took an average of 9 minutes, with at most 16 pages.

(Figure: Comparison of the login process for healthcare.gov)

How was this possible? Simple: the U.S. government decided to challenge the assumption that everything had to be large. A small team with the required skills, given a clear mandate and with the standard bureaucracy kept out of the way, is able to move much, much faster than a large one. When the team discovered something new, they could respond quickly. They had the autonomy to decide to focus on quality rather than simply shipping features. They still encountered challenges within the government ecosystem, but those were eventually removed or overcome.

A small team was able to replace in a matter of months what a large team could barely deliver over years, and they did so at a fraction of the cost.
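For readers who like to see the arithmetic, the login figures quoted above can be turned into improvement factors. This is only a back-of-the-envelope sketch: the numbers are the ones cited in this post, the function name is my own, and the replacement’s maintenance cost is taken as its $1M upper bound, so that factor is a floor rather than an exact value.

```python
import math


def improvement_factors() -> dict[str, float]:
    """Return before/after ratios for the healthcare.gov figures quoted above."""
    return {
        "build_cost": 250 / 4,             # $250M vs $4M
        "annual_maintenance": 70 / 1,      # $70M vs "under $1M" (lower bound)
        "login_time_best": 2 / 0.030,      # 2 s vs 30 ms
        "login_time_worst": 10 / 0.030,    # 10 s vs 30 ms
        "enrolment_minutes": 20 / 9,       # 20 min vs 9 min
        "enrolment_pages": 76 / 16,        # 76 pages vs 16 pages
    }


if __name__ == "__main__":
    for name, factor in improvement_factors().items():
        # log10 of the factor gives the improvement in orders of magnitude.
        print(f"{name}: {factor:.1f}x ({math.log10(factor):.1f} orders of magnitude)")
```

The cost and login-time factors all land between one and three orders of magnitude, which is the range I claimed at the top of this post.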

The Proposal

An interesting note about the group that fixed healthcare.gov is that they didn’t completely disband. One set of people created a public benefit corporation called Nava and another group formed Ad Hoc, both of which are now using their approach for systems in other U.S. government departments.

My proposal: rather than ending with a similar outcome, why not start there? Is there a similar structure that could be used in Canada? Could a Crown Corporation, Special Operating Agency or some form of not-for-profit organization be created that would be given the mission to replace Phoenix?

That organization would have attributes such as:
  • The ability to directly hire the people with the required skills to build the system (existing government employees would be considered, but would have to be seconded to this organization if hired rather than remaining on force with their department);
  • Funding provided by Treasury Board, or by some combination of TBS and the departments who are the current stakeholders in Phoenix;
  • Total autonomy over how that funding is spent once it has been secured;
  • Salaried staff who would not be shareholders in the organization in any way, i.e. there would be no financial incentive to let the replacement project drag on over time.

The team comprising the organization would:
  • Have direct access to the people who would be the consumers of the system in order to ensure that the system would work effectively for them;
  • Have direct access to the subject matter experts in order to ensure that the system is properly handling the business rules;
  • Have direct access to the people responsible for any systems with which the new system would have to integrate;
  • Have complete autonomy regarding the process used to deliver the system;
  • Have complete autonomy regarding the system architecture and technologies used;
  • Be ridiculously transparent in their operation with respect to their progress and what work is being done;
  • And perhaps most importantly, have the constant, unwavering support from management at the Deputy Minister and Ministerial levels of the public service.

Having this organization outside of any one government department ensures that it isn’t unduly influenced by any one viewpoint on the system. Having the organization still remain within the orbit of the government with respect to funding and operation means that it wouldn’t be unduly influenced by external vendors.

Most important, having the organization in the first place means that large integrators won’t be spending enormous sums of tax dollars to feed their own services machine.

That alone should make politicians happy, but consider as well the increased likelihood of successfully delivering a replacement system. Rather than facing awkward queries during Question Period, ministers and MPs could show off the success achieved during their mandate.

So Now What?

Nothing is a foregone conclusion, of course. There are small teams who have failed and large ones who have succeeded. My experience, though, and that of many others, suggests that the smaller team approach is what’s needed to ensure that the replacement for Phoenix isn’t just another Phoenix.

The approach and thought processes currently used in the Canadian government are what led to the Phoenix débâcle in the first place, so I see no reason why the outcome should be any different if nothing changes. There’s an old saying in the software business: garbage in, garbage out.

My proposal is to start with a small, handpicked team of perhaps a dozen people from a number of disciplines and grow only when there’s enough pain to warrant growth. Let that team work outside the normal government bureaucracy, with the backing of the highest levels of elected officials and members of the federal public service. Stay out of that team’s way and let them deliver a high quality system that is extremely well-tailored to the needs of the users and stakeholders, and does what it’s supposed to do… pay people in the public service.

If you believe in what I’ve outlined here, if you believe that there really is a better way to deliver systems of any size in the public sector, please share this. Send it to your MP. Send it to our Prime Minister! Send it to people you know in the government. Send it to journalists who can amplify the message.

After all, why do we want to change the way systems are built in government?

Because it’s 2018.
