It's Not Magic
An essay on the design of automated modernization projects.
1 Summary
An automated modernization project, also referred to as a
“conversion”, “migration”, "legacy revitalization" or “legacy renewal” project, is inherently
different from most projects in which IT professionals will participate during
their careers, and in several different ways. When this specialized type of
project goes awry, it is almost always from a failure to appreciate these
differences and to allow for them in the project plan.
Properly controlled, an automated modernization project
should be the least risky of any major project, but a failure to implement the
proper controls can make it a very risky project indeed. Automated
modernization projects obtain their substantial cost savings and short delivery
schedules by extracting highly leveraged results from the automation. However,
it is easy to forget that a lever has two arms, and – improperly implemented –
you can find leverage working against you rather than for you in your project.
1.1 Modernization Versus Conventional Projects
The biggest difference between a modernization project and
a conventional project is that modernization takes an existing system and
changes it to be the same. Some element of how the system does its job will
change, i.e., some aspect of the infrastructure will change, but the business
functionality will not change at all. By contrast, a conventional project will
have as its focus a change to the business functionality, but only rarely
accompanied by any fundamental change to the infrastructure.
Probably the greatest source of risk in a modernization
project arises when these two distinct purposes are blended together. For
programmers steeped in conventional projects, the thought that “while we are in
the code, we should take care of X at the same time” is nearly irresistible.
Probably the most important decision management will make in approving the
project architecture is to firmly squelch this notion, and to educate all the
staff in why that is the best decision.
1.2 Testing Modernization Projects
As we will discuss in the body of this essay, testing is
the greatest cost in any project. Once any business functionality change is
allowed, the ability to use more economical comparison testing is effectively
eliminated. The resulting increase in testing costs will, in practice, be many
times greater than any supposed savings in program modification and testing that are
expected to accrue from trying to do two things at once. This is a crucial point,
and we recommend giving as much time as is needed for everyone to think through
the issues and come to consensus. A fundamental disagreement with regard to
methodology among key technical staff is itself a significant project risk
factor that needs to be managed appropriately.
Since testing is the principal means by which risk is
controlled in a modernization project, fully integrating appropriate testing
into a modernization project – including an adequate provision for rework and
retesting – will assure adequate risk management. Conversely, a failure to
adequately plan for testing in the specialized context of an automated
modernization project will mean that project risk is not being adequately
managed. It is worth becoming acquainted with the work of Capers Jones
to understand quantitatively the degree of risk that we typically accept without
question in conventional projects – and that we can avoid fairly easily in an
automated modernization project that is properly done.
1.3 Modernization Technologies
There are two different technological approaches to
modernization, and an informed discussion must delineate the advantages and
limitations of each. We discuss at length below the technical differences
between renovation and re-engineering, but the business differences can all be
boiled down to risk, cost, and benefit. Renovation will have the least cost and
least risk, but cannot replace a procedural language with a modern, object
oriented language. Re‑engineering can, for example, replace COBOL with Java/J2EE or
C#/.NET successfully, and at lower cost and risk than a rewrite into a modern
language, but at a higher cost and risk than renovation. Sometimes a judicious mix of both
strategies can provide an attractive hybrid project plan. (Our presentation
"From Legacy to BPM" at the May, 2006
Transformation and Innovation conference included an extended example of just
such a hybrid plan.)
The most significant source of non-technical project risk
comes from a nearly universal failure to appropriately understand the real
advantages and real limitations of both technological methodologies. There is a
pattern we’ve seen over and over again in which first time participants in a
modernization project begin with a honeymoon period in which client expectations
of the vendor exceed reality. This honeymoon can even extend through a pilot
project if few speed bumps are encountered.
But, once the main project begins, it is typically
followed by a disillusionment period in which the unrealistic expectations are
not met and are then replaced by an extreme pessimism that substantially
underestimates actual vendor capabilities. Once through these teething pains,
the project typically starts to deliver real value. However, if the
disillusionment period is too intense, the result may be project cancellation,
throwing out the baby with the bath water. No one benefits from this situation.
If this is allowed to occur, then it would have been better to not have
undertaken the project in the first place.
1.4 Fitness for Purpose
The one show stopper in a modernization project is the case where the
business rules in the system being modernized no longer fit the business process.
Although a perfect match is rare, we look for a very high degree of correlation
between the existing system and the actual business process. If the
correlation is low, then modernization may not be a viable strategy, and it is
time to bite the bullet and design a replacement system.
1.5 Recommendations
We recommend a period of evaluation and education prior to commencing even
a proof of concept or pilot modernization project. This should be tailored to the technical
and non-technical requirements of the proposed project, especially taking into
consideration the three primary business considerations: budgetary limitations,
appetite for risk, and technical goals.
However, there are important soft considerations as well.
For example, a significant divergence between the perspectives of staff and
management can become a critical risk issue. There can also be significant resistance to change, sometimes well founded,
other times difficult to defend.
In sum, we recommend that the overall
business, technical and soft considerations be put squarely on the table, and then a
rational and cost/risk/benefit optimized way forward be designed, implemented
and carefully managed.
2 Change It To Be The Same
Probably the most fundamental difference in a modernization
project versus a conventional project is that the purpose is different:
- In a conventional
project, we are usually changing what an application does, not how it does
it.
- In a modernization
project, we have just the opposite: we are usually changing some aspect of
how an application accomplishes its task, but not the task itself.
This can also be described as a change in business rules
versus a change in infrastructure. Typical types of modernizations will replace
an existing data store with a relational database, or change the language, or
change the platform, or introduce XML encoding/decoding, or restructure a
program to eliminate “spaghetti” code, etc. When the modernization is
complete, the business functionality of the application should be exactly the
same, even though one or more elements of the application’s execution
environment have changed radically.
2.1 Contrasting Conventional Vs. Automated Modernization Projects
Of course, a conventional project can also consist of creating a
wholly new application or rewriting an existing application. Yet, this
conventional project is still distinct from an automated modernization project.
By definition, you can’t modernize something that hasn’t been built yet.
From a business perspective, creating a wholly new
application creates new value for the organization. By contrast, a
modernization project either enables the subsequent creation of new value,
reduces operational and maintenance cost, or both. Rewriting an existing
application usually falls somewhere in between in terms of the value of the
results.
However, the capital cost can be dramatically different.
Creating a new application or rewriting an existing application can be
5 or 10 times as expensive as a modernization project, and yet the modernization can
deliver most or all of the benefits of a pure rewrite – including reduced
maintenance cost and the introduction of new technology such as J2EE or .NET. And as we
will discuss in a moment, the risk profile can be dramatically different as well.
The automated modernization
project will take an established, working body of source code and introduce some
kind of systematic change across all the sources. When complete, the newly
modified sources will still be recognizable to the programmers familiar with the
system, and the online transactions and batch reports will be identical in their
business functionality (however different their presentation might be).
2.2 Business Process Management (BPM)
At first, it may not be clear
where BPM fits in a discussion of conventional versus automated modernization
projects. Many organizations find significant potential in BPM workflow
systems to inexpensively automate previously manual processes. Where prior
automation exists, an organization may be interested in replacing an existing
application with a BPM implementation.
There is no off the shelf
capability to automate the re-engineering of a legacy application into a
BPM platform, but in practice we don't find this a limitation. Our
opinion is that the best architecture for a BPM application is to define
a collection of services using conventional implementations however
achieved, and then use the BPM platform to orchestrate those services.
On this model, it doesn't
matter whether a legacy application is renovated or re-engineered into a
modern language base. In either case, it is straightforward to
publish transactions as services in a service-oriented architecture (SOA).
Once the SOA is established, then it is a quick process to orchestrate
those services using BPM.
Our recommendation for
integrating BPM into modernization projects is based on two
observations, one technical and one business. The technical
observation is that BPM workflows are not natural services. Although one
can implement services in most BPM packages, the result is technically
awkward and inhibits obtaining the full value of the BPM platform.
The business observation is that embracing a particular platform can
incur long-term costs, and BPEL is unlikely, in our opinion, to fulfill its
touted promise of allowing escape from a given BPM platform. If one
implements services in a legacy or a modern standard language, and uses
BPM only to orchestrate those services, then the BPM platform can be
used as intended, and BPEL - though not a perfect answer - is more
practical in this limited context.
2.3 Risk in Conventional and Automated Projects
Capers Jones, in a recent article, revealed that, in a study of 250 conventional projects of
significant scale, only 10% completed on time/on budget, and another 20%
completed with less than a 35% cost and delivery overrun. The remaining 70% of the sample had
cost/delivery overruns of greater than 35%, or were cancelled outright.
What causes these dreadful
statistics? As our Cutter Consortium colleague Jim Highsmith
stated at an annual Cutter Summit a few years ago, "at the start of any project,
your specifications will be at best 70% complete and 50% correct." Upon
hearing that quote, many project managers will consider it too optimistic.
The risks to project success are built-in, right at the inception.
Risk in a conventional project is
a subject that is far too often not a part of formal project design. It is rare
for formal risk management policies to have been adopted by any organization. Even among those who have done so, the policies may remain unfunded or otherwise
marginalized. Sometimes, there is a culture of "can-do" optimism that frowns on any risk
analysis as pessimism or even obstructionism. Or, there is a concern that if risks are brought up and
realistically appraised then the project will never get permission to proceed at
all. This could be because management expects cost overruns, and if they
apply a cost multiplier to a budget that is actually realistic then a distorted
picture can emerge.
The best news in risk management
in recent years has arguably been the advent of agile programming and
agile project management methodologies. These strategies, based on bedrock policies of
"test-first programming" and of “deliver early
and often,” have been highly successful in containing some of the dreadful
statistics on project failures and cost/delivery overruns without incurring the sometimes
costly overheads of PMI or CMM project
management methodologies. However, these agile strategies do not scale well to very large projects.
The projects reviewed in the Capers Jones study referenced above included no
agile projects, because projects of the scale reviewed by Jones are too big for agile
methodologies, true believers notwithstanding.
Unfortunately, legacy
modernization projects tend to be very large projects, with more than 1,000
programs (approximately 10,000 function points) being a not unusual project size,
too big for agile.
These risk statistics alone argue for a bias towards a conservative technical strategy
for your project architecture.
One of the best features of a
legacy modernization project is its ability to markedly reduce risk in a project
of this scale. This ability derives from the central fact of legacy
modernization which is frequently overlooked: we have a working system that,
whatever its faults, delivers the goods every day. The ability to leverage
and extract value from your legacy assets has significant
implications for cost control and particularly for risk mitigation, as we will
discuss in the next section.
3 IT Risk in Modernization Projects
One of the greatest sources of IT risk in a modernization
project arises when the two purposes of functional change and infrastructure change are confused. For programmers steeped in
conventional projects, the thought that “while we are in the code, we should
take care of X at the same time” is nearly irresistible, because it is believed
that there must be economies in doing so, even though research would show just the opposite. One
of the most important roles of management in the project is to firmly prevent
this from occurring.
In doing so, it is important to educate the technical staff
in just why this is important, and to respect their experience that caused them
to make the suggestion in the first place. When you are doing two things at
once – changing the business rules while changing the infrastructure – and
you find a fault in testing, the question arises: was it the business
rules change, the infrastructure change, or some interaction between the two
changes that caused the fault? A more subtle reason is that, once you
change the business rules, you forfeit the cost efficiencies
of comparison testing, as we discuss in the next section. In other words, the testing problem becomes much
larger, more difficult and more expensive, far outweighing any promised
efficiencies from “doing two things at once.” In fact, the result is seriously
diseconomical, not economical.
3.1 Comparison Versus Correctness Testing
The reason why the testing problem becomes more difficult
also involves a subtlety of testing methodology. When we change only
infrastructure, we can rigorously apply comparison (or “regression”) testing.
The changed and unchanged version of the code should produce identical output
for identical input. This is significantly easier (and therefore
proportionally less expensive) than creating and executing the standard test
packs used in conventional projects, and is more accurate to boot.
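The comparison principle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `legacy_report` and `modernized_report` are hypothetical stand-ins for the unchanged and changed versions of a program, and the test cases are invented. The point it shows is that the test needs no knowledge of what a correct result looks like, only whether the two outputs are identical.

```python
import difflib

def compare_outputs(legacy_output, modernized_output):
    """Return a unified diff; an empty list means the outputs are identical."""
    return list(difflib.unified_diff(
        legacy_output.splitlines(),
        modernized_output.splitlines(),
        fromfile="legacy",
        tofile="modernized",
        lineterm="",
    ))

def run_comparison_pack(cases, legacy_fn, modernized_fn):
    """Run every input through both versions and collect the cases that differ."""
    failures = {}
    for name, data in cases.items():
        diff = compare_outputs(legacy_fn(data), modernized_fn(data))
        if diff:
            failures[name] = diff
    return failures

# Hypothetical stand-ins for the unchanged and changed versions of a report program
def legacy_report(amounts):
    return "\n".join(f"{a:>10.2f}" for a in amounts)

def modernized_report(amounts):
    return "\n".join(format(a, ">10.2f") for a in amounts)

# Invented inputs; a real pack would be captured from production-like data
cases = {"small": [1.5, 20.25], "empty": [], "negative": [-3.0]}
failures = run_comparison_pack(cases, legacy_report, modernized_report)
print("all identical" if not failures else failures)
```

Because the only question asked is whether the two outputs match, a pack like this can be rerun unattended each time the sources pass through the transformation engine.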
In all conventional testing, we have to ask the question,
“is this code correct?” either because we are creating a new system (and
thus have nothing to compare against) or because the business rules are
changing, so by definition the results are supposed to be different. In
the latter case, the problem is determining whether the difference is a correct
difference or an incorrect difference. Either way, as part of a standard test pack, you have to specify how to evaluate a result to determine whether or
not it is correct, which is a lot more work than a simple comparison, and
prone to inaccuracy and/or incompleteness as well.
3.2 "Seat of the Pants" Testing
Of course, this formal approach to testing excludes the
“seat of the pants” testers who never create test packs because they “know” what
is correct and incorrect from experience. Although there are frequently
methodological problems with this approach to testing, principally in the area
of test coverage analysis, it is indeed true that your subject matter experts (SMEs)
often can meet the standard of “good enough testing” without creating formal
test packs.
The problem here is that in an automated modernization
project the number of programs to test can choke any testing organization. If
you are forced to single thread your testing through the one person or few
people knowledgeable enough to conduct seat of the pants testing successfully,
you can significantly elongate your project, sometimes to the point of complete
failure.
There is a related project risk element that arises from
making SMEs a bottleneck on project progress. SMEs are often crucial to daily operations or are needed to
support priority maintenance or new initiatives. A testing plan dependent on
the availability of key SMEs
indicates a project seriously at risk before it even leaves the starting gate.
3.3 Outsourcing the Grunt Work
This distinction between comparison and correctness testing
can be crucial to project success if SMEs cannot handle the testing load. Once the
set of regression test packs is defined, the actual comparisons can be carried
out by staff other than SMEs. Indeed, it is not unusual for regression test
packs to be executed and evaluated by contract staff, because this comparison
testing requires no understanding of the application at all. “Are these results
the same?” is a question that anyone can answer, which again leads to lower cost
of testing and quicker completion, with efficacy equal to that of correctness
testing at equivalent test coverage.
Even more of a concern, however, is re-testing. In an
automated modernization project, it is frequently the case that sources are passed through the
transformation engine more than once. Each time through, the programs have to
be tested over again. Testing can become a major bottleneck, with delays and
cost overruns feeding through to the whole project while other people have to
sit at a standstill waiting on the completion of testing. This is further
exacerbated by the problem of SMEs being pulled off the project for
emergencies, idling or severely slowing modernization analysts who depend on
their ready availability.
Ultimately, automating a project is all about applying
technological leverage. When the assumptions of the project are
all met, this leverage works for you, and creates positive economies, saving
real money. However, when the assumptions are violated, leverage starts to work
against you, and can create diseconomies that cost you money.
Testing, like ready access to SMEs, has potential for both significant
economies and significant diseconomies. We will return to the subject of
testing after going into some of the differences between two technological
approaches to modernizing legacy applications.
4 Two Fundamentally Different Technologies
There are two fundamentally different technologies used to
automate the modernization of legacy applications. In practice, the differences are not black
and white, but many shades of grey. Here, however, we will discuss
them as opposites for clarity.
- The process we define as
renovation uses a transformation (or “parsing”) engine to change the
existing body of source code, most frequently COBOL, according to an
extensive set of rules and to create a new library incorporating the desired
changes. You end up with all the new capabilities that you want, but the
base code is still procedural COBOL.
- The process we define as
re-architecting will also process the existing source library, but
there the similarity ends. You end up with the same new capabilities that
renovation provides, but the code base will be in the desired new language,
most frequently an object oriented language like Java or C#/.NET, or even
Object Oriented COBOL.
Some companies, including ones that we respect, will
speak about "converting" or "migrating" COBOL to Java or C#. This
nomenclature is usually interpreted as meaning a renovation process. When
pressed, however, they will agree that the transformation engine does not do the
whole job, or that extensive and detailed information has to be
prepared to guide the transformation in order for the engine to do its job.
This meets our definition of re-architecting.
We made the point strongly above that the two purposes of modernization
and business functionality improvements should never be blended. However, it is
possible and sometimes quite reasonable to blend these two different
technologies – renovation and re-engineering – in a hybrid project. We’ll
return to this point later.
4.1 Renovation
Renovation is a fully deterministic process that will:
- Analyze the library of source codes for a given application,
- Create a repository of structured information about those sources, and then
- Apply a set of logic templates (transform rules) to transform specific constructs in each individual source within the context supplied by the repository.
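The three steps listed above can be illustrated with a toy pipeline. This is a minimal sketch, not how any real transformation engine works: the "source library" is a pair of invented programs in a COBOL-like pseudo-syntax, the repository is a plain dictionary, and the two rules in `RULES` are hypothetical. What it does show is the shape of the process: analysis builds context, and deterministic rules are then applied to each source within that context.

```python
import re

# Hypothetical miniature "source library" in a COBOL-like pseudo-syntax
SOURCES = {
    "PROG1": ["SELECT CUSTFILE", "READ CUSTFILE NEXT", "DISPLAY CUST-NAME"],
    "PROG2": ["SELECT ORDFILE", "READ ORDFILE NEXT"],
}

def analyze(sources):
    """Steps 1 and 2: scan every source and record which files each program declares."""
    repository = {}
    for name, lines in sources.items():
        repository[name] = {
            m.group(1) for line in lines
            if (m := re.match(r"SELECT (\w+)", line))
        }
    return repository

# Step 3: transform rules as (pattern, template) pairs; purely illustrative
RULES = [
    (re.compile(r"^READ (\w+) NEXT$"), "FETCH {0}-CURSOR"),
    (re.compile(r"^SELECT (\w+)$"), "DECLARE {0}-CURSOR"),
]

def transform(sources, repository):
    """Apply the first matching rule to each line, but only where the
    repository confirms that the referenced file is declared in that program."""
    result = {}
    for name, lines in sources.items():
        out = []
        for line in lines:
            for pattern, template in RULES:
                m = pattern.match(line)
                if m and m.group(1) in repository[name]:
                    line = template.format(m.group(1))
                    break
            out.append(line)
        result[name] = out
    return result

repo = analyze(SOURCES)
renovated = transform(SOURCES, repo)
```

Because the process is deterministic, rerunning `transform` on the same sources with the same rules yields the same result, which is what makes it practical to fix a fault by adjusting a rule and rerunning the whole library.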

The best transformation engines
produce 100% of the changes to the source code, though rarely so on the first
iteration. Specialists for that transformation engine fix faults in the results by tweaking
the rule sets and re‑running the source library through the engine again and
again until the results are as desired. The results are usually expressed
in the same language (most frequently COBOL) for a 3GL to 3GL transformation, or
else elements of the original language are preserved in a 4GL to 3GL
transformation. Either way, the results remain familiar to the programming staff
who will subsequently support the modernized code.
However, it is true that sometimes, toward the end of a
project, the specialist programmers managing the transformation may end up
modifying the transformed code by hand to deal with one-off cases. This is
because it can be simpler and faster to do so than writing complex rules to deal
with rare cases. Though common, this is a practice that we discourage
because it inhibits rerunning the transformation as needed.
4.2 Re-Architecting
Re-architecting has some superficial similarities to renovation:

Like renovation, it also uses a parsing engine to analyze
the library of source codes for a given application to create a repository of
structured information. There the similarities end. From that repository plus a
substantial set of generation rules expressed in its logic templates, a new system is
generated without further reference to the original sources. That new system
will contain a large percentage of the final code, which for the best
re-architecting systems can be well in excess of 90%, but for some only 70% or
less. Then, specialist programmers will need to fill in the missing code
manually. The problem is one of missing information - the engine does not
have everything it needs to create a complete source, and so an incomplete
source is created for the programmers to complete. The manual code is
usually created in blocks that can be merged as the system is regenerated, so
that the final version of the trial sources need not be modified manually to
create the final sources.
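One way to picture the merging of manual blocks during regeneration is sketched below. The marker convention (`# MANUAL-BLOCK name`), the `generate` stand-in, and the invoice example are all assumptions made for illustration; real re-architecting tools have their own mechanisms. The sketch shows only why keeping manual code in separately stored blocks lets the system be regenerated without hand-editing the output each time.

```python
import re

# Hypothetical marker: the generator emits a named placeholder where it lacks information
PLACEHOLDER = re.compile(r"^(\s*)# MANUAL-BLOCK (\w+)$")

def generate(repository_facts):
    """Stand-in for the generation step: emits most of the code plus a placeholder."""
    return [
        "def post_invoice(invoice):",
        f"    total = invoice['amount'] * {repository_facts['tax_multiplier']}",
        "    # MANUAL-BLOCK rounding_rule",
        "    return total",
    ]

def merge_manual_blocks(generated_lines, manual_blocks):
    """Replace each placeholder with its hand-written block; report any left unfilled."""
    merged, missing = [], []
    for line in generated_lines:
        m = PLACEHOLDER.match(line)
        if not m:
            merged.append(line)
            continue
        indent, name = m.groups()
        if name in manual_blocks:
            merged.extend(indent + code for code in manual_blocks[name])
        else:
            merged.append(line)      # keep the placeholder visible
            missing.append(name)
    return merged, missing

# Hand-written code is stored once, outside the generated sources
manual_blocks = {"rounding_rule": ["total = round(total, 2)"]}

# Regenerating with changed generation rules does not disturb the manual block
first, _ = merge_manual_blocks(generate({"tax_multiplier": 1.2}), manual_blocks)
second, missing = merge_manual_blocks(generate({"tax_multiplier": 1.25}), manual_blocks)
```

The `missing` list gives a simple measure of how much manual work remains after each regeneration.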
4.3 Summary of Differences
To summarize the most important differences:
- renovation
modifies the existing code base, while re-engineering creates wholly new
code;
- renovation does not - in general -
require manual code to be added to the final sources, while
re‑engineering generates less than 100% of the finished code and thereby
requires some degree of manual code to be merged with the generated code, or
extensive parameter entries created to allow generation of the missing code.
Indeed, this manual tinkering is the major source of the
difference in cost and risk between the two approaches.
The following table provides
a summary of all the significant differences.
| | Renovation | Re-Architecting |
|---|---|---|
| Input to analysis | Original source library | Original source library |
| Input to code generation | Original source library | Repository |
| Output from code generation | Original or similar language | A wholly new language, e.g., Java or C# |
| Percent of final code generated before manual code | 100% (no manual code) | 70-95% |
| Specialist programmers modify … | Transformation rules but not transformed code | Generation rules and the manual code to be merged |
| Economics | Least cost solution | More expensive than renovation, less than a rewrite. |
| Risk | Least risky solution | More risky than renovation, less than a rewrite. |
| 3GL to 3GL | Yes | Yes, though rarely done |
| 4GL to 3GL | Yes | Yes, though rarely done |
| 3GL to 4GL | No | Yes |
| 3GL or 4GL to OO Language (Java, C++, C#) | No | Yes |
Many of the transformation engines are built as COBOL
parsers, and frequently written in COBOL as well. While there is certainly
nothing wrong with a COBOL implementation in principle, in practice programmers
who know primarily COBOL tend not to have a sophisticated knowledge of compiler
design. The practical result is that some things are easy to do and some things
are difficult to do. This fact will be reflected in details of the transformed
code, in the flexibility with which modifications can be made, and in the cost
of the project. Sometimes this flexibility is important, other times not so
very important.
The re-architecting approach is certainly more attractive,
particularly for those interested in moving away from a procedural language to
an object oriented (OO) language such as Java, C++, C#/.NET or even OO COBOL.
These benefits have to be weighed against an increase in both cost and risk,
though both cost and risk will still be significantly less than a manual
re‑design and re‑write.
Note also that the residual code left for manual effort
after the re-engineering process completes may be more difficult than average to
implement and test. Having to write 10% of the code does not mean that the
manual component of this approach will be only 10% of the effort (and cost) of a
rewrite. Still, the cost savings can be quite significant, and this approach
can also result in a significant reduction of both delivery time and of risk.
4.4 Restructuring Renovated Code
One of the advantages sought by moving to a new code base
is to reduce maintenance costs. Renovating old, over-maintained
"spaghetti" COBOL or other code may provide the infrastructure changes
desired, but the maintenance cost burden is still present.
There is a little known but powerful technology to
automatically restructure spaghetti code into reasonably well structured code.
Interestingly, the worse the code, the better the resulting improvement. This technology
can eliminate most or all of the GO TO instructions in COBOL, and produce code
that is substantially easier to maintain. As a side benefit, redundant
logic can be pruned, and the CPU consumption of the resulting programs can
decline by 10-20%. We recommend that this approach be considered as an
adjunct to any renovation project.
5 It’s Not Magic
The greatest misconception
about either type of modernization project is the idealized expectation of the
results by the technical staff. There is a tendency for technicians to see the
promised capabilities of both types of project as a perfect solution, and – far
worse – a solution that should work perfectly out of the box.
Leaving aside for a moment the issue of optional
customizations, there is an inherent perceptual problem here. Both of these two
classes of tools were created by reverse engineering, and reverse engineering is
as much art as science. Even modernization tools that have been through
hundreds of projects and millions of individual programs will still deliver faulty code
under certain circumstances. Some language facilities do not work as
documented, others have loopholes that are technically illegal but allowed in
practice by the compiler. Sometimes discrete facilities are fine but the
interaction between language facilities rarely used in conjunction can create
problems if such use is encountered in the source code. Others subtly change their dynamic characteristics
from one release to another, a problem that is particularly acute for the run
time components of 4GLs.
Interestingly, the re-architecting technology that we
labeled above as higher risk should be less affected by this perception issue in practice,
because it should never be expected to produce 100% ready code out of the
box. The renovation technology, on the other hand, is expected to
produce 100% ready code out of the box, and as such is very sensitive to these
effects of reverse engineering and particularly to the expectations of the
tool’s capabilities.
However, this reduced sensitivity by the re-architecting technology to the limitations of reverse engineering is all about client
expectations and not about the actual, technical risk issues. Technical
challenges have to be dealt with, but it will be the specialists in
the re‑engineering tool who deal with them in the residual code that they
manually modify.
We maintain our position that re-architecting is higher risk
than renovation (though still significantly lower than a full rewrite). Every
time manual code is introduced into the transformed system, there is the
opportunity for manual error, or for inconsistent implementation of the
same construct in multiple locations. The net result is that testing has to
catch these subtleties, and it is incumbent on the project architect to ensure
that sufficient and appropriate testing resources are allocated.
So long as technicians maintain the illusion of
infallibility about the transformed or generated code, they will react
disproportionately negatively to the first few reported faults.
Salespeople representing the vendors will tend not to discourage these illusions,
as the illusions can help their sales. Plus, salespeople are always reluctant
to bring into the discussion anything that can be perceived as a negative, for
fear that the project will not go forward.
5.1 Perceptions of Renovation Tools
Technical staff members
tend to think of renovation tools as they do compilers – rock solid
implementations that can be used as dependable tools on a daily basis. But
compilers have a vastly easier job than renovation tools. Compilers read a 3GL
(or sometimes a 4GL) and produce some form of executable code, which can be
thought of as something like this, where each circle represents the set of the
syntax capabilities of that language:

Compilation is a straightforward transformation of
source to object code. The object code (represented in transparent blue) always
supports whatever the compiler supports (the purple area created by the blue
completely overlaying the red). However, when you are transforming source to
source, it is not so easy:

The purple area represents the overlap of
capabilities, the red area the capabilities of the source not mapping to the
target, and the transparent blue area the new capabilities of the target not
supported by the source. Unlike the object code case, here we have an exposed
red area – meaning that there are capabilities of the source for which there is
not a perfect mapping to the target. These orphan capabilities have to be
handled by the renovation engine in some way, typically either by inserting
blocks of code providing the same functionality, providing run time components
with functionally equivalent capabilities, or providing an extended target language syntax resolved at
compilation time by a pre‑compiler.
There is never a perfect mapping of one source to another
in an infrastructure modernization project, leaving aside special cases like
COBOL restructuring. Consider one case familiar to most mainframe programmers:
COBOL with VSAM being mapped to COBOL with a relational database using SQL.
There is not a perfect mapping of VSAM to SQL. READ NEXT can only be translated
to FETCH commands if a cursor has been declared for the table. The output
program can be made functionally identical, but it is the job of the technology
to bridge this gap and to ensure that the appropriate cursor has been declared
and opened.
Fortunately, there is a general case solution for all issues involved with VSAM
to SQL.
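The READ NEXT case can be sketched in a few lines. This is only an illustration, assuming Python with its built-in sqlite3 module in place of COBOL with embedded SQL; the KeyedFile class, the customer table and its columns are hypothetical names invented for the example. The point it shows is that a sequential read has no direct SQL equivalent until a cursor is open, so the bridging layer has to open one on the program's behalf.

```python
import sqlite3


class KeyedFile:
    """Hypothetical stand-in for a VSAM keyed file held in a SQL table."""

    def __init__(self, conn, table, key_column):
        self.conn = conn
        self.table = table            # trusted identifiers, not user input
        self.key_column = key_column
        self.cursor = None            # nothing is open until a read is requested

    def _open_cursor(self, where="", params=()):
        # Plays the role of DECLARE CURSOR + OPEN in embedded SQL.
        sql = (f"SELECT * FROM {self.table} {where} "
               f"ORDER BY {self.key_column}")
        self.cursor = self.conn.execute(sql, params)

    def start(self, key):
        # START KEY >= key: position the cursor at the first qualifying row.
        self._open_cursor(f"WHERE {self.key_column} >= ?", (key,))

    def read_next(self):
        # READ NEXT becomes a FETCH, but only once a cursor exists.
        if self.cursor is None:
            self._open_cursor()        # sequential read from the first record
        return self.cursor.fetchone()  # None plays the role of end-of-file


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (cust_id TEXT PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO customer VALUES (?, ?)",
                     [("C003", "Cole"), ("A001", "Adams"), ("B002", "Baker")])
    f = KeyedFile(conn, "customer", "cust_id")
    first = f.read_next()              # implicit open at the start of the file
    f.start("B")                       # reposition, then browse forward
    rest = [f.read_next(), f.read_next(), f.read_next()]
    return first, rest
```

In a real renovation the equivalent logic would be generated into the target program or supplied as a run time component; the sketch only shows the shape of the gap being bridged.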
Now consider a more difficult case: COBOL with IDMS to
COBOL with SQL. Most functions of the networked IDMS database can be
implemented with a relational database using SQL, but there are one or two functions that
cannot. For example, the IDMS data definition language can define an index set in which
the entries are held in physical sequence, a sequence that the programmer can change at
will at run time. In other words, the entries are ordered by logic
rather than by data, which means that there is no key defined in the
SQL table that replaces the IDMS dataset by which to order the
result set of a SELECT statement.
No technological
tool can bridge this gap with a general case solution without introducing an
unacceptable level of performance overhead, since the target database
does not natively have this capability. Here, the general solution
involves defining an additional "sequencer" column for the table.
This column is populated with numbers having as many digits as possible
without overflowing, which leaves wide gaps between adjacent values. Then, when
a new row is added, the program must read the sequencer values of the rows
before and after the desired insertion point, add them, divide by 2, and
use the resulting midpoint as the new row's sequencer. There then remains the
possibility that no unused value is left between the two neighbors, so that a
duplicate key results, at which point logic to renumber
the sequencer column must execute upon detecting the error. This
solution is desirable neither for performance reasons nor for the clarity and
maintainability of the renovated code.
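The sequencer technique just described can be sketched as follows. This is a minimal illustration rather than any vendor's implementation, again assuming Python with sqlite3; the entry table, the MAX_SEQ ceiling and the function names are hypothetical. It follows the steps in the text: take the midpoint of the neighboring sequencer values, attempt the insert, and renumber the whole column when a duplicate key is detected.

```python
import sqlite3

MAX_SEQ = 10**18  # assumed ceiling: fits a signed 64-bit INTEGER column


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE entry (seq INTEGER PRIMARY KEY, payload TEXT)")
    return conn


def ordered(conn):
    return [p for (p,) in conn.execute("SELECT payload FROM entry ORDER BY seq")]


def renumber(conn, max_seq):
    # Respread every row evenly over the full range to reopen the gaps.
    payloads = ordered(conn)
    step = max_seq // (len(payloads) + 1)
    if step == 0:
        raise RuntimeError("sequencer range too small for this many rows")
    conn.execute("DELETE FROM entry")
    conn.executemany("INSERT INTO entry VALUES (?, ?)",
                     [((i + 1) * step, p) for i, p in enumerate(payloads)])


def insert_at(conn, position, payload, max_seq=MAX_SEQ):
    """Insert payload so that it becomes row number `position` (0-based)."""
    for attempt in range(2):
        # Reading every sequencer is wasteful; a real program would fetch
        # only the two neighbors of the insertion point.
        seqs = [s for (s,) in conn.execute("SELECT seq FROM entry ORDER BY seq")]
        before = seqs[position - 1] if position > 0 else 0
        after = seqs[position] if position < len(seqs) else max_seq
        midpoint = (before + after) // 2
        try:
            conn.execute("INSERT INTO entry VALUES (?, ?)", (midpoint, payload))
            return attempt == 1  # True when a renumbering pass was needed
        except sqlite3.IntegrityError:
            renumber(conn, max_seq)  # gap exhausted: duplicate key detected
    raise RuntimeError("sequencer range too small for this many rows")
```

Even in this toy form, the cost is visible: every insert needs extra reads, and an unlucky insert triggers a rewrite of the entire table, which is why the text calls the general solution undesirable.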
There are, however, a variety of special case solutions, but
each one requires the input of additional information into the renovation
engine, information that is not contained in the library of source code. A
parsing tool cannot parse what is not there. So, the specialist programmers
who set up the renovation engine for the project look for this and other
imperfections in the mapping, and create instructions to the engine about how to
implement any needed special case solutions. In more formal terms, they inject
the missing information to allow the engine to complete its task.
5.2
Implications of Renovation Customizations
Because your usage of the old source is unlikely to
match that of any previously executed project exactly, you should expect to see faults,
a lot at first, rapidly slowing to a trickle. (We'll talk more about this
in the next section.)
Alongside these faults, however, comes a related issue: opportunities for adding value through
customization. Here we need to consider both pre-existing customizations and
new customizations.
For example, in one 4GL to 3GL language transformation
system in which we have substantial experience, there are 47 different
options available in the transformation setup. Each of these options started either as a
customization requested by a client or as
an opportunity noticed and implemented by the system support team. To illustrate, one
standard customization involves the mapping of non-relational data structures to
relational. By default, the lowest level of data elements expressed in a record
description will be used to create the columns in the target relational table.
But with a record having numeric date fields, all redefined as individual 2 digit
numeric values, what is the best solution? One customization allows, for
specifically listed fields, the first group level to be treated as the column entity, with the
date subdivided into sub-fields only in the new program’s memory. This is an example of a
customization generalized into an option, one that is likely to be useful for
other projects in the future and has therefore been integrated into the tool.
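The effect of the date option can be illustrated with a small sketch. Everything here is assumed for the purpose of the example: the record layout, the field names, the keep_as_group parameter and the YYMMDD packing of the date are hypothetical, and Python merely stands in for the tool's own configuration. By default every lowest-level element becomes a column; for listed fields the group level becomes the column and the sub-fields exist only in program memory.

```python
# Hypothetical record layout: group items map to their elementary sub-fields.
LAYOUT = {
    "CUST-ID": None,                                   # elementary item
    "ORDER-DATE": ["ORDER-YY", "ORDER-MM", "ORDER-DD"],
    "SHIP-DATE": ["SHIP-YY", "SHIP-MM", "SHIP-DD"],
}


def table_columns(layout, keep_as_group=()):
    """Return the column names generated for a record layout."""
    columns = []
    for name, subfields in layout.items():
        if subfields is None or name in keep_as_group:
            columns.append(name)        # one column for the whole item
        else:
            columns.extend(subfields)   # default: lowest-level elements
    return columns


def split_date(value):
    """Subdivide a 6-digit YYMMDD column value in program memory only."""
    return value // 10000, (value // 100) % 100, value % 100
```

Calling table_columns(LAYOUT, keep_as_group={"ORDER-DATE"}) keeps ORDER-DATE as a single column while SHIP-DATE is still split three ways, which mirrors the "specifically listed fields" behavior described above.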
But there is another type of customization, one that will
apply only to that one client, or perhaps to only a subset of programs within
the portfolio. Although a renovation cannot change the business functionality
of a program, it can set up new infrastructure that will simplify future
maintenance on the system after renovation. For example, datanames could be
modified to meet current site standards. Or, variations of a common routine
could be standardized into a single routine, perhaps implemented as a
sub-routine. Or, XML parsers could be embedded into the code to provide for
future data exchange applications. Or, a series of application specific
routines could be added into the code base but not used immediately – they will
just be already there when the maintenance is ready to start. Or ... really,
the sky is the limit on this sort of thing.
Any customization involves increased risk to the project,
but the former type tends to be less risky because it is so generalized.
Conversely, the latter type of specialized, one-off customization can begin to
look similar to custom coding. All decisions on customizations of any kind
should be looked at through the same lens of cost/risk/benefit. And no one
should be surprised when fault rates increase for new customizations.
6
Perceptions of Test Results
As the first few faults are
reported, the technicians’ opinions of the renovation tool will often deflate
from unrealistically positive to unrealistically negative. The relationship
between number of programs processed and number of faults discovered in practice
actually looks like the blue line:

But
when technicians see the first set of results, even though the results follow
this typical curve, they frequently react as if the relationship were more like
the red line:

In other words, the actual situation is that 80-90% of
faults will be found in the first 50 or so programs tested. Moreover, at least
for renovation projects, as further faults are discovered the transformation
rules can be adjusted and all programs retransformed, so that each fault is
corrected as a class, not as an instance.
However, technicians, unless they have prior modernization
project experience or have been properly educated during the planning of the
project, will almost always misread the implications of the initial test
results. For example, if 20 faults are found in the first 50 programs, then
they will expect to see 400 faults in 1,000 programs when, in practice, the
total will be more like 25-30 in 1,000 programs.
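The two ways of reading the same early results can be put side by side. The sketch below uses only the figures already given in the text (20 faults, 50 programs, 1,000 programs, and the 80-90% share found early); the function names are hypothetical and the second formula is a simplification that treats the early share as exact.

```python
def linear_projection(faults_seen, programs_tested, total_programs):
    # Naive reading: every batch of programs yields faults at the same rate.
    return faults_seen * total_programs / programs_tested


def class_based_projection(faults_seen, share_found_early):
    # Faults are rule defects, fixed once for all programs, so the early
    # sample already contains most of the faults that will ever appear.
    return faults_seen / share_found_early
```

With the numbers above, linear_projection(20, 50, 1000) gives the 400 that technicians expect, while class_based_projection(20, 0.8) and class_based_projection(20, 0.9) give roughly 25 and 22, in line with the few dozen faults seen in practice.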
This is completely understandable, since it is what their
experience has taught them. Almost no one has experience with the leverage
obtained through these automated tools, and so they should not be expected to
interpret these early results correctly without thorough explanation and
discussion.
It is critical to the success of the
project that technicians have realistic
expectations and an understanding of both the strengths and limitations of the
chosen
technological approach. They need to understand the nature of error patterns
that come out of a modernization engine, and particularly why those errors
occur.
6.1
All Projects are Not Alike
The most important reason why these unexpected errors occur
is that, contrary to expectations and contrary to the representations of most
services vendors, no new project is completely like any
previous project. Each project will likely use a unique combination of
options, will request new customizations, and will present a source library that
includes some unique usage combinations of the source language.
We gave an example above of one renovation engine with 47
options. Even if each option were a simple on/off switch, the interaction of those
47 options would produce – in theory – 2^47 (roughly 1.4 × 10^14) combinations, though in practice,
due to overlap and correlated usage, the actual number of combinations is a much
smaller but still very large number. The software engine has been used with over 150 different application
projects, which means that its experience base is one of the largest of any
renovation tool. That experience base covers a substantial percentage of all
actual combinations, but it still falls far short of testing all the
actual combinations. And this is one of the best in its class.
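To get a feel for the scale, the sketch below uses the most conservative way of counting: it assumes, purely for illustration, that each of the 47 options is an independent on/off switch (real options may take more values, and many settings never occur together). It then computes the largest share of that theoretical space that 150 projects could possibly have exercised.

```python
OPTIONS = 47     # from the essay
PROJECTS = 150   # from the essay


def on_off_settings(n_options):
    # Assumption: each option is an independent on/off switch.
    return 2 ** n_options


def max_share_covered(n_options, n_projects):
    # Upper bound: every project exercised a different setting.
    return n_projects / on_off_settings(n_options)
```

on_off_settings(OPTIONS) is about 1.4 × 10^14, so max_share_covered(OPTIONS, PROJECTS) is on the order of one part in a trillion. The experience base matters because real usage clusters in a far smaller region, not because the theoretical space has been covered.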
Expecting a perfect tool for your project is simply not
realistic. What we see in practice is that every project has a combination of
options not seen before and involves source libraries with idiosyncratic uses of
the language. This fact ensures that in some respect or another, the initial
output will not correspond precisely to expectations and, indeed, will contain
some faults. This is just about a sure thing, except for really simple
projects.
However, the fact that the tool will not produce perfect
results should not put you off the project. What it will do is produce
pretty good results out of the box, and after testing and tweaking will then
produce results that are – if still not absolutely perfect – nevertheless more than good enough.
6.2
Pilot Projects
This is also why pilot projects should be carefully chosen
and should have well thought through objectives. Most clients ask for pilot
projects in order to have a proof of concept and to evaluate the maturity of the
tool. We have no argument with these objectives, but we do have recommendations
on how to evaluate the results. Any viable vendor should be able to produce
code that represents the target, and to therefore demonstrate that this is a
viable way to proceed.
However, giving the vendor an arbitrary set of code (from
their point of view) and expecting the output to work perfectly out of the box
can result in your rejecting the very vendor who may be the best partner for your
organization. Since all the tools in their class work more or less the same
way, tools that work very well for small demonstration projects may not
necessarily scale well for the full project. Conversely, tools that are best
prepared to grapple with all the subtleties of a full scale project may require
feedback and testing when being set up for a new project, even a small project,
something that is not usually made available to vendors for pilot projects. We will discuss our
recommendations for pilot projects in the final section below.
In our experience, the vendor salesperson who encourages
you to think that their product will do the job perfectly out of the box may not
represent your best choice as a vendor. A vendor who is honest and tells you that this
could be a difficult or frustrating project is a far better choice than one who
says, “sign here and don’t worry, everything will be fine.” The latter knows that,
once you are committed to the project with them, when the inevitable problems do
appear, you will continue to work with them short of a really egregious
failure.
The best modernization project is a partnership between
vendor and client, with well defined roles, capabilities and expectations. We
always remind our clients that a good partnership is built on complete honesty. The
modernization approach, properly applied, can add a
great deal of value to your organization. However, minimizing risk for all
parties means that all the issues need to be out on the table and discussed
honestly, which can be difficult for some sales organizations to handle.
7
Recommendations
7.1
Pilot Project
We recommend that a pilot project should evaluate a vendor
on two primary criteria:
- The experience base of
the vendor in crafting solutions to newly presented problems and newly
identified opportunities to add value
- The flexibility of the
vendor’s tool in implementing these modifications
In other words, evaluate both knowing what to do and how
they will do it in practice. Some vendors will score higher on one than the
other, and sometimes the best choice will be the vendor who knows what best to
do rather than the vendor with the best tool for implementation.
Do you care more about how they do something or more about
what gets done? The implementation is a single shot affair, but you will
live with the results for a long time. There are only two significant implications for your
project of how a vendor will implement any specific solution. One involves its cost to you,
which is an issue that can be dealt with during business negotiations with a short
list of pre-qualified vendors. The second is whether the tool can implement the
best solution for any given construct or set of constructs, or whether it is
constrained toward certain solutions and away from others. The flexibility
of the tool directly impacts both implied issues.
We prefer to run a pilot project with a large rather than a
small set of code, and evaluate the results without expecting perfection out of
the box. Ideally, ask each vendor to run your entire library through their tool, and
to tell you the imperfections that they identify in the results. Then they
should return samples illustrating specific issues and how they have dealt with
them. Next, have your
staff desk check the results, comparing them to the results of other vendors,
and come up with their own list of issues for each set.
From this list, have each vendor discuss their process for
resolving each issue in detail without hand modifying the transformed
code. Ideally, this would involve watching a
technician prepare a solution to one issue, implement it, and compare the
results of re-running the transformation. Some vendors will modify the rule
sets of their engine, others will prepare a supplementary special purpose
“filter” to perform just that task, and still others may modify the
implementation of their engine itself. Or, a combination may be required to
handle all cases. Regardless, the best choice will be the vendor who can
implement any solution, not a vendor who is constrained in the type of solution
being offered due to an inflexible tool.
7.2
Economics
We have seen many sites that think that a bid process will
ensure both the best technology and the best price. For the reasons we discussed
above, the vendor who shows best in a normal bid process may not
be the best vendor for the project. Similarly, the vendor with the lowest bid
may not produce the lowest overall cost for the project. We
urge careful thought and caution when considering how best to structure a bid,
though we concede that sometimes procurement rules do not allow the best
approach to doing so.
Our preference is a two-stage approach: first a massive
pilot along the lines recommended above, then a formal bid process using
a short list of vendors qualified by the pilot. The experience of evaluating
the pilot will greatly facilitate the evaluation of a bid, since the post-pilot
evaluation criteria are likely to be very different from the anticipated
criteria prior to the pilot. In practice, this pilot approach may not be
affordable, since most vendors themselves cannot afford to do it without some
cost cover, and then you are in the position of paying for work (on the part of
the losing vendors) that will never be used.
Consider the point of view of your vendors when they are pricing a bid
response: there is the work they know must be done, which they
should be able to evaluate pretty tightly, and there are the unknowns, which
translate as risk. What is going to show up during the project that has to be
dealt with? And who will bear the cost? The vendor, the client, or both?
Remember that most modernization companies are boutiques
with shallow pockets. You cannot reasonably expect to treat them like IBM, EDS,
or any other major consultancy. Think about this carefully, because if you want
a major consultancy style of project then be prepared to pay for it, and insist on
one of the majors as a prime contractor. But keep this clearly in mind: for this
service, the typical overhead that a prime adds to the price of the boutique vendor doing
the actual work is 100%, plus the
billings of the prime's own staff on the project. With a major as prime, expect to pay
at least double or triple. This is not an exaggeration. If anything, it
can be an understatement. Then, the major will do their best to talk you into a
complete rewrite so that they can charge you another 5 or 10 times the cost of a
modernization. If you don’t want to pay to play by the major’s rules, then adapt
yourself to the economic reality of boutique firms.
Our recommendation regarding cost is a shared risk model:
require a fixed price bid for the known part of the project – remembering
always that there are issues unknown both to you and to the vendor – and take a
variable cost for the inevitable unknowns. That variable cost should be at or
very close to their fully burdened internal cost of doing the work, and should
include only labor, nothing extra like licensing charges. In other
words, the boutique knows that they won’t be bankrupted by the unknowns, but
they won’t make any profit from them either. This also weeds out the tendency
for some companies to underbid to get your business, and then constantly
renegotiate the price through change orders.
At the same time, ensure that you do not inadvertently add
to the vendor’s costs. For example, most vendors will depend on rapid access to
SMEs when questions arise. If this access is not consistently available, then
the vendor staff can end up idle for days or – worse – working in ways that have
to be undone later.
7.3
Hybrid Modernization Projects
Throughout this essay, we have mostly discussed either
renovation or re-architecting as two separate approaches. However, for your
project you may want to consider a hybrid. We have seen several systems in
recent years where it may make more sense both technically and economically to
re-engineer the online system, but renovate the batch system. In other words,
end up with Java or .NET for all online user interface transactions, but keep
the batch in COBOL either permanently or for a long transitional period. We
recommend considering the appropriateness of this approach for your project.
7.4
Modernization Plus New Functionality
If there is one issue in modernization that we come back to
over and over again, it is the problem of attempting to change the business
functionality in some way during the modernization project. This is, in our opinion,
the riskiest decision one can make in a modernization project.
However, once again, there may be a practical hybrid. If
you modernize an application first, using either technological approach, then
you can avoid the cost and delivery time risks associated with trying to do two
things at once. However, if – while you are modernizing – you design the
implementation of the new functionality, you may discover that the modernization
project can add a lot of the infrastructure you are going to need for that new
functionality through a customization to the transformation rules. This is not
always practical, but if approached cautiously there may well be some
opportunities to leverage the modernization to cut costs and delivery times for
the new functionality that will be added in a follow-on phase.
Make absolutely sure that you are measuring the thoroughness of your testing, or
else you may discover that what you thought to be well tested programs were in
fact minimally tested with many latent faults waiting to bite you.
7.5
Testing and Risk Management
We approach all modernization projects from a risk
identification and risk mitigation perspective. Since testing is the primary –
though not the only – form of risk mitigation, and testing is very expensive, we
recommend a modernization design that allows either automated or manual
comparison testing. Comparison testing is still expensive, but both more
thorough and significantly less expensive (for the same degree of thoroughness)
than correctness testing.
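A minimal sketch of automated comparison testing follows. It assumes both versions of a routine can be driven from a common harness; the fee functions are hypothetical stand-ins for a legacy program and its renovated counterpart, with a deliberate rounding fault planted in the latter. The harness never needs to know the correct answer, only whether old and new agree, which is what makes this cheaper than correctness testing.

```python
def compare_runs(legacy, renovated, inputs):
    """Run both versions on the same inputs and collect every mismatch."""
    mismatches = []
    for case in inputs:
        expected = legacy(case)      # the existing system defines "correct"
        actual = renovated(case)
        if expected != actual:
            mismatches.append((case, expected, actual))
    return mismatches


def legacy_fee(amount_cents):
    return amount_cents * 3 // 100          # truncates, as the old code did


def renovated_fee(amount_cents):
    return round(amount_cents * 3 / 100)    # planted fault: rounds instead
```

Running compare_runs(legacy_fee, renovated_fee, [100, 199, 200]) reports only the 199 case, where truncation and rounding disagree. In a real project the inputs would be captured production data and the outputs would be files, reports and database states rather than single numbers.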
7.6
Renovation Versus Re-Architecting
We remain agnostic with regard to a choice between
renovation and re-architecting. When we put our technical hat on, we prefer the
most technically advanced solution, particularly any solution that moves from
procedural to object-oriented programming and that enables business process
management applications. This argues for re-engineering.
However, when we put our risk management hat on, we want
the most conservative technical strategy, with the maximal opportunity for the
most thorough possible testing. This argues for renovation, as does a
constrained budget.
Our recommendation on any given case, whether to go for
re-architecting, renovation or a hybrid, will always depend on what we learn from
management. What are the goals of the organization, the goals of the IT
organization, the resources, time frames, criticality, appetite for risk, etc?
As always in consulting, the final answer will be that it depends. When
approaching one of these projects, we try to avoid actually giving a
recommendation if the technological approach is still an open question. Instead, we work to clarify
the questions of what is more and what is less important. We continue until the
answers given cause the recommendation to emerge naturally out of the dialog.
7.7
The Soft Issues
As we have discussed at length, the greatest source of
project risk is not a “hard” issue like platform, language, database, user
interface, or indeed any technical issue. We recommend that the project
architect focus like a laser beam on the soft issues, the issues of expectations
and goals of all of the stakeholders in the project. Where there are actual or
perceived losers from a successful project, the highest priority should be to
ensure that these losers will become winners from a successful project.
Otherwise, your project is at severe, perhaps fatal risk before it even begins.
Beyond aligning the business goals and priorities of all
stakeholders, we recommend that all stakeholders be educated with the real
advantages and the real limitations of the technical methodologies under
consideration and the technical goals of the project. We recommend insuring
against both excessive optimism through idealization and excessive pessimism
because of real or perceived limitations, particularly when not everyone has the
same goals or points of view. When there is significant resistance to change,
whether well founded or not, everyone needs to be heard and a solution to real
or perceived problems crafted. In extreme cases, specific individuals may need
to be removed and isolated from the project, though thankfully this has proven
rare in our experience. Our primary recommendation is that the business and
soft considerations be put squarely on the table, discussed thoroughly, and then
a rational and cost/risk/benefit optimized project architecture be designed and
implemented.
8
Conclusions
As we have discussed at length, an automated modernization
project is significantly different in many ways from the prior experiences of
most if not all of the professionals who will participate in it. When these
projects have some type of problem, it is almost always from a failure to
appreciate these differences and to properly allow for them in the project
architecture.
Conversely, a properly managed automated modernization
project can be one of the most economical and least risky projects in all of
IT. For this reason alone, modernization – either as the whole
project or as the first phase of a hybrid modernization + expanded functionality
project – can be very attractive.
In conclusion, we urge anyone considering a modernization
in isolation, and particularly anyone considering a modernization versus a
replacement, to carefully weigh these risks. In the projects we have seen, the
success rate is very high even for large projects, far higher than that of the replacement
approach. It is our firm conviction that if the issues discussed in this essay
were adequately taken into account in all modernization projects, the success
rate would be 100%.