This document presents and justifies a nontrivial set of ground rules for development, which are intended to ensure that the success of the project is not undermined by ineffective verification. The hope is that it will provide much-needed perspective in the heat of battle.
Anything that isn't verified is broken.
A reasonable quantitative metric for the complexity of an electronic design can be obtained as follows: generate a minimal hierarchical transistor-level netlist, prune out the known-good modules (i.e. the gates and any other modules whose entire sub-hierarchy has been fully verified since its last modification), and count the number of instance pins, multiplying the asynchronous connections by 10 and the analog connections by 100. This complexity metric roughly corresponds to the number of possible design defects.
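The weighting described above can be sketched in a few lines of C++. This is only an illustration: the `PinCounts` structure and the `complexity_metric` function are hypothetical names, and the sketch assumes that the pruned netlist has already been reduced to three pin counts.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary of the pruned netlist: instance-pin counts by kind.
struct PinCounts {
    std::uint64_t synchronous;   // ordinary synchronous connections
    std::uint64_t asynchronous;  // asynchronous connections
    std::uint64_t analog;        // analog connections
};

// Weighted pin count: asynchronous pins count 10x, analog pins count 100x.
constexpr std::uint64_t complexity_metric(const PinCounts& p) {
    return p.synchronous + 10 * p.asynchronous + 100 * p.analog;
}
```

For instance, 80,000 synchronous pins, 1,500 asynchronous pins and 50 analog pins yield a metric of 100,000.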
Assuming that most of the design's features are actually required by a customer, the probability of achieving a production-worthy design given a fixed verification methodology is exponentially decreasing in the number of possible design defects. In my experience, if the complexity metric exceeds 100,000 or so, and it takes at least a week to turn design modifications around, then the true critical path to production is always verification.
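For example, if there are 100,000 possible defects, each having a 1% probability of occurring, where each occurring defect has a 99% probability of early discovery, then (treating the defects as independent) the probability of a defect-free design is:

$$\left(1 - 0.01 \times (1 - 0.99)\right)^{100000} = 0.9999^{100000} \approx e^{-10} \approx 4.5 \times 10^{-5}$$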
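The arithmetic in this example can be reproduced with a short C++ sketch. It assumes that defects occur and escape independently of one another, which the example implies but does not state; the function name `defect_free_probability` is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Probability that no defect escapes verification, assuming independent
// defects (an assumption of this sketch).
//   n        : number of possible defects
//   p_occur  : probability that a given possible defect occurs
//   p_caught : probability that an occurring defect is discovered early
double defect_free_probability(double n, double p_occur, double p_caught) {
    const double p_escape = p_occur * (1.0 - p_caught);
    return std::pow(1.0 - p_escape, n);
}
```

With the numbers above, the result is roughly 4.5e-5. With one tenth as many possible defects (10,000) it is roughly 0.37, which illustrates how quickly the odds deteriorate as complexity grows under a fixed methodology.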
Understanding this is probably the single most crucial key to success in complex electronic design. Unfortunately, in my experience, most firms view compromising verification in order to accelerate first release to manufacturing as an acceptable risk. However, because diagnosing a problem in real hardware is much more difficult than in simulation or emulation, and because fixing one defect often creates another defect, cutting corners in early design verification always winds up delaying production on sufficiently complex designs.
In reality, in order to produce usable designs of increasing complexity, the importance imparted to design verification must increase proportionately, such that the expected number of defects that escape verification at the time of release remains small. This generally entails a substantial verification effort, definitive and reusable specifications, and designers who go out of their way to make their designs not only correct, but also verifiably correct.
The credible software methodology literature seems to unanimously report that evolutionary development improves both quality and development time substantially [1][2][3]. While direct evidence that this applies to electronic design as well is difficult to obtain, my experience has been that the same is even more true of electronic design, with its similar degree of complexity, its even larger number of competing constraints, its less-evolved development environment, and its much longer turn-around time.
Evolutionary development relies on early integration, as well as frequent, thorough quality assessments throughout the development process. Because repeated manual testing is highly uneconomical and unreliable, it is important to establish a thorough automated testing infrastructure by the time of initial integration, even though automating the testing has an initial fixed cost. Therefore, in order to fully reap the benefits of evolutionary development, it is necessary to budget resources for writing tests essentially concurrently with design implementation.
Only a proposition, not an entity, can be verified. What we commonly refer to as "verifying a design" is actually verifying the proposition that the design satisfies its specification (even if it's an unwritten specification). Therefore, specification is an essential prerequisite for verification.
Because there are substantial, unpredictable variations in both production processes and operating conditions, useful specifications are always imprecise. Even a cycle-accurate specification is imprecise, because it has nonzero tolerances on signal timing, may allow additional unpredicted signal transition pairs within these tolerances, and may not specify a value for every output signal on every cycle. Note, however, that an imprecise specification can still be completely definitive (meaning that it is always possible to determine whether a given behavior does or does not satisfy the specification).
Specifications generally exist, at least conceptually, in distinct layers. For example:
This might seem as though it requires a whole lot of specification effort, but in reality it's not significantly more time consuming than specifying everything in a single monolithic specification. It's also easier to maintain, in part because the choice of which specification is modified makes it obvious whom a change affects. Of course, you can reduce the specification effort by neglecting to specify some of the properties that need to be relied upon, but in the long run that always winds up causing much more trouble than it saves. On the other hand, specifying properties that nobody relies upon is always a waste of time.
Note that each layer of specification is more precise than the last. (More precise means requiring no more and promising no less.) For example, it is not sufficient for the RTL to satisfy the Design Specification without satisfying the Verification Specification (because the goal is to prove that the design works, rather than to make it impractical to prove that the design doesn't work).
It bears mentioning that specification is generally a multi-contributor activity, because it typically consumes more than one full-time person in total, and few individuals have both the talent and the inclination to write specifications for a living. Fortunately, the specification effort parallelizes nicely. For example, the party responsible for one layer of specification need not concern himself with the next lower-level specification, except perhaps to review it for violations of his intent.
Because a cycle-accurate specification is complex, determining whether it actually embodies the intended properties of the system requires a great deal of effort (especially when the intended properties are not documented). If this task is neglected, then the design is very likely to inherit defects from the specification, and you'll have no way of knowing about them until customers start complaining, at which point they are extremely costly either to tolerate or to correct.
Furthermore, because the design depends on the cycle-accurate specification, and complex specifications require more time to develop, the product's schedule is adversely impacted.
If you are conservative and specify cycle relationships that are easily achieved, then the design becomes suboptimal. The overall effect of this can be substantial. For example, more latency means more buffering, which means more area, which means longer wires, which means longer delays.
On the other hand, aggressive cycle relationships will need to change frequently to track design changes. That requires a great deal of maintenance, especially when designers can't distinguish between changes that affect the specification and those that do not.
Even worse, when the cycle relationships in the specification are modified, the intended properties are not necessarily preserved, so the specification itself must then be re-verified. If this is a manual process, then the effort required to carry it out after every change would be prohibitive. If it's an automatic process, then that process would probably serve as an excellent basis for a lenient specification.
The advantages of lenient specification are that the specification is easy to maintain, that design efficiency is not compromised, and that violations of the specification are almost always real issues that the design's customers actually care about. The disadvantages are that it is more challenging, both conceptually and in terms of initial implementation, and that violations are not detectable until they propagate to a major interface, which might be much later than the point at which the defect was actually provoked.
A lenient specification typically uses some combination of fuzzy modeling and partially ordered transactions to describe the set of allowable behaviors, although in principle there may be other effective techniques of which I'm not aware.
In principle, you can always model fuzziness with simple unknowns, but in most cases you don't want to. For example, a bit vector of width 8 whose allowable numerical range is 0x1e through 0x20 could be modeled as 8'b00xxxxxx, but that would also allow the numerical values 0x00 through 0x1d and 0x21 through 0x3f, which isn't correct in this case.
Fuzzy modeling can allow behaviors that you don't expect. For example, if a counter is expected to be decremented from 0x31 to 0x2a at some imprecise moment, then it can be modeled as having the range 0x2a through 0x31 during the uncertain time interval. However, this allows the counter to jump from 0x31 to 0x2d to 0x30 to 0x2a. If that's considered a bug, then don't use fuzzy modeling.
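The difference between a range model and a model based on unknown bits can be sketched in C++ as follows. Both types are hypothetical, chosen only to illustrate the two examples above; a real checking library would presumably offer something more general.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical fuzzy value: any value in the closed interval [lo, hi] is allowed.
struct FuzzyRange {
    std::uint8_t lo;
    std::uint8_t hi;
    bool allows(std::uint8_t observed) const {
        return lo <= observed && observed <= hi;
    }
};

// Unknown-bit model for comparison: bits set in x_mask are unknown,
// and the remaining bits must match `known`.
struct UnknownBits {
    std::uint8_t known;
    std::uint8_t x_mask;
    bool allows(std::uint8_t observed) const {
        return ((observed ^ known) & static_cast<std::uint8_t>(~x_mask)) == 0;
    }
};
```

A `FuzzyRange{0x1e, 0x20}` rejects 0x1d and 0x21, whereas `UnknownBits{0x00, 0x3f}` (that is, 8'b00xxxxxx) accepts everything from 0x00 through 0x3f. Note also that `FuzzyRange{0x2a, 0x31}` accepts each value of the sequence 0x31, 0x2d, 0x30, 0x2a individually, which is exactly the leniency that the counter example warns about.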
To model nondeterminism, the specification simply refrains from imposing a full ordering on the expected transactions. Instead, the expected transactions may be partially ordered (meaning that certain pairs of expected transactions have a required sequence), and the allowable latency of an expected transaction can be constrained.
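A minimal sketch of such a partial order in C++ might look like the following. All names are hypothetical, and the sketch assumes that transactions are identified by integers and that latency is constrained by a simple deadline in cycles.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <utility>
#include <vector>

// Hypothetical expected transaction: it must follow all of its predecessors
// and must be observed no later than `deadline` (in cycles).
struct Expected {
    std::vector<int> predecessors;
    std::uint64_t deadline;
};

class PartialOrderChecker {
public:
    void expect(int id, Expected e) { pending_[id] = std::move(e); }

    // Returns true if observing transaction `id` at time `now` is allowed.
    bool observe(int id, std::uint64_t now) {
        auto it = pending_.find(id);
        if (it == pending_.end()) return false;        // unexpected or repeated
        if (now > it->second.deadline) return false;   // too late
        for (int p : it->second.predecessors)
            if (done_.count(p) == 0) return false;     // required order violated
        done_.insert(id);
        pending_.erase(it);
        return true;
    }

    // True if some expected transaction has already missed its deadline.
    bool overdue(std::uint64_t now) const {
        for (const auto& kv : pending_)
            if (now > kv.second.deadline) return true;
        return false;
    }

private:
    std::map<int, Expected> pending_;
    std::set<int> done_;
};
```

Transactions that share no predecessor relationship may be observed in either order, which is how the nondeterminism is expressed.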
Observed transactions are not necessarily derived from specified interfaces. Observed transactions that are neither input transactions nor expected transactions are classified as hint transactions. They provide a mechanism for the implementation to indicate to the specification which of the allowable behaviors should be anticipated. One drawback to using a lot of hint transactions is that they require maintenance to track design changes that affect their generation.
Any behavior that results from any hint transaction that does not itself violate requirements is deemed an allowable behavior. This is a subtle but important point, because a carelessly crafted specification might accidentally allow unintended behaviors when hint transactions are used. For example, a specification that observes an output as a hint, and then verifies the output against the hint it just observed, is equivalent (though not obviously so) to a specification that does not specify the output at all.
Although hint transactions must be carefully specified, they are often indispensable. Without hints to expediently prune the set of allowable behaviors, the checking problem is likely to become exponential in the number of outstanding transactions. That makes the design impractical to verify.
Note that accepting hints from the design does not corrupt the specification with implementation information, because the format of hints is defined by the specification rather than the design, and the discretionary content of the hints is always correct by definition.
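As a hedged illustration of how a hint can select among allowable behaviors without being trusted blindly, consider a hypothetical arbiter that may grant any requester that is currently requesting. The class and method names below are invented for this sketch.

```cpp
#include <cassert>
#include <set>

// Hypothetical specification of an arbiter whose choice among the current
// requesters is left to the implementation.
class ArbiterSpec {
public:
    // Input transaction: a requester starts requesting.
    void request(int requester) { requesting_.insert(requester); }

    // Hint transaction: the design announces which requester it picked.
    // The hint is accepted only if it is itself an allowable choice.
    bool hint_winner(int requester) {
        if (requesting_.count(requester) == 0) return false;  // illegal hint
        hinted_ = requester;
        has_hint_ = true;
        return true;
    }

    // Expected transaction: the grant seen at the output must match the
    // most recent accepted hint.
    bool observe_grant(int requester) {
        if (!has_hint_ || requester != hinted_) return false;
        requesting_.erase(requester);
        has_hint_ = false;
        return true;
    }

private:
    std::set<int> requesting_;
    int hinted_ = 0;
    bool has_hint_ = false;
};
```

Because the hint is validated against the set of outstanding requests, which comes from the stimulus, a grant to a requester that never asked is still flagged. Dropping that validation would reduce the check to comparing the design against itself.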
It takes a little while to appreciate the power of modeling uncertainty this way, but in my experience I have yet to encounter a useful Design Specification that cannot be effectively expressed as a Verification Specification with partially ordered transactions.
On the one hand, a Micro-architecture Specification detects provoked defects promptly (and therefore they are easier to diagnose), and it provides a very rapid means of simulation with fidelity to the cycle behavior of the design. On the other hand, a great deal of additional effort is required to develop it and to verify it against the Verification Specification, and even more effort to track design timing changes and repeatedly re-verify. Verifying the Micro-architecture Specification is absolutely mandatory, because tracking timing changes in the design is very likely to cause some design defects to be tracked as well, which makes them otherwise impossible to detect.
Unlike the other layers of specification above RTL, a Micro-architecture Specification will probably require an effort that is significant in comparison to the design effort. You need to weigh that against the potential benefits.
Unfortunately, definitive natural language specifications tend to be prohibitively difficult to read. (For example, try reading any part of the U.S. Tax Code.) Even worse, it is common for two readers of a natural language specification both to believe that it is definitive, and yet have mutually contradictory interpretations. (This is especially true of unwritten specifications.) For these reasons, it is a practical necessity that definitive specifications be executable.
An executable specification is a program that observes the stimulus and response of a design (typically in the context of a simulation), and determines whether the observed behavior satisfies the specification under the given stimulus conditions. If not, it should produce sufficient diagnostic output for a human to determine at least one reason that the specification is not satisfied. An executable specification is also called a checker. A lenient executable specification is also called a smart checker.
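The shape of such a program can be sketched in C++ as follows. The `Checker` base class, the `Transaction` structure and the toy `IncrementSpec` are all hypothetical; they only illustrate the idea of observing stimulus and response and accumulating diagnostics.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical observed transaction: an interface name plus a payload.
struct Transaction {
    std::string interface_name;
    unsigned payload;
};

// Hypothetical base class for an executable specification (a checker).
class Checker {
public:
    virtual ~Checker() = default;
    virtual void stimulus(const Transaction& t) = 0;  // observed design input
    virtual void response(const Transaction& t) = 0;  // observed design output
    bool passed() const { return diagnostics_.empty(); }
    const std::vector<std::string>& diagnostics() const { return diagnostics_; }
protected:
    void fail(const std::string& why) { diagnostics_.push_back(why); }
private:
    std::vector<std::string> diagnostics_;
};

// Toy specification: every stimulus payload must reappear, incremented by
// one, as a response, in the same order.
class IncrementSpec : public Checker {
public:
    void stimulus(const Transaction& t) override {
        expected_.push_back(t.payload + 1);
    }
    void response(const Transaction& t) override {
        if (next_ >= expected_.size()) {
            fail("unexpected response on " + t.interface_name);
            return;
        }
        const unsigned want = expected_[next_++];
        if (t.payload != want)
            fail("expected " + std::to_string(want) +
                 ", observed " + std::to_string(t.payload));
    }
private:
    std::vector<unsigned> expected_;
    std::size_t next_ = 0;
};
```

Each diagnostic string gives a human at least one reason why the specification is not satisfied.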
Currently, C++ stands out as the language of choice for executable specifications, whether they are lenient or cycle-accurate. In order to be useful, an executable specification must be much faster than the RTL simulation that it verifies, and ideally about as fast as emulation. C++ has a speed advantage of 100× to 1000× over its chief competitors, Vera and Specman. (Verilog and VHDL are not serious candidates, because they don't have dynamic data allocation.) Furthermore, C++ has a mature feature set, including templates, exceptions, multiple inheritance, and a very useful and general standard library. It's also free and stable, and you can even obtain free hardware description libraries (such as cynlib) for it.
Unfortunately, C++ also has a very steep learning curve. You can compensate for this to some extent by using coding guidelines. (Some sample coding guidelines can be found here.) However, coding guidelines are meaningless unless they are enforced, because the contributors who are most likely to submit mistakes are often also the most likely to ignore the guidelines.
As far as I'm aware, there are no readily available libraries in any language for smart checking. However, it is necessary to abstract the checking functionality from the behavioral description functionality if the smart checker is to be practical to understand and maintain. You're pretty much stuck with writing a checking library yourself, but fortunately this library can be reused for every project.
For the RTL, such simulators are readily available. On the other hand, simulating a lenient specification is not so straightforward. The recommended approach is to build simulation functionality into the checking library, such that a given behavioral description can be used as either a checker or a simulator. This is a nontrivial programming design challenge. However, when it is implemented correctly, the Design Specification and the Verification Specification are both executable, and share the vast majority of the product-specific description code, such that they are guaranteed to be consistent.
If the executable Design Specification is written in C++, then it is possible to supply customers with its object code, such that they can simulate and approve the proposed behavior, without exposing the implementation (assuming that you trust your customers not to reverse-engineer the object code, which is probably prohibitively difficult anyway). Customers must understand that the architectural simulator is not absolutely definitive, because it does not necessarily exhibit all of the allowable behaviors.
In order for it to be practical to validate a changing specification, it is necessary to accumulate a battery of regression tests to be re-simulated after each change. It is generally adequate to perform a relatively unintelligent comparison of the results against the previously approved results. There will be cases in which the behavior changes without violating the intent of the specification, and in those cases the specification writer should simply update the expected behavior. This approach is practical only if the simulator does not use a random process to select among allowable behaviors.
The rough model may simply comprise an early version of the Design Specification, but maintaining it as a separate entity makes sense if it runs much faster.
For example, if the application demands that a particular action is taken in response to a case that is sufficiently rare to be handled in software, then you can check that either the action is taken by hardware or that sufficient information is supplied to the software that the action can be taken.
The main drawback to this approach is that it requires modeling the entire application (without regard for the hardware/software partition). That's probably rather involved, so it might not be ready until the design is near completion, assuming that it is deemed worthwhile to do it at all. On the other hand, the earlier that such problems are discovered, the better.
It is a good idea to incorporate important subtle invariants into the Verification Specification, such that violations of invariants will be detected promptly. Of course, that also means that the Verification Specification needs to track changes to those invariants.
Invariants can usually be verified at the module level.
Because designers might want to take advantage of some of the unique features of C++ themselves, the most reasonable interface at which to draw the boundary between Design's responsibility and Verification's responsibility is in C++. This has the unfortunate consequence that Design winds up with the responsibility for the tedious data marshaling, for example, from Verilog to PLI to C to RPC to C to C++. It is recommended that scripts be written to automate the generation of such code.
Because hint transactions will also need to be supplied after the netlist is optimized, it is recommended that transactors should rely only on terminals of unflattened modules. Even that isn't necessarily completely safe, but it ought to minimize the amount of necessary post-optimization tracking. Such tracking is also most effectively carried out by designers.
It is observed that hint transactions can often be derived from signals that are already used for module-level verification, so you can sometimes kill two birds with one stone by having the module checker supply the hint transaction to the smart checker.
Every problem report should be addressed with at least some analysis. If the designer believes he is not at fault, then he still needs to explain in the problem report why this is the case. Furthermore, it is important for the owner of a bug to be the person who is empowered to act on it, and therefore a problem report should not be closed until the fix is made available (and therefore passes the current regression).
If the product roadmap is planned properly, then the majority of any new product specification will consist of content that is shared as-is with other products. This is something of a delicate art, but it pays off sooner than you think. See On Planning.
It is essential that customers appreciate that they mustn't rely on unspecified properties, because otherwise future compatible specifications must be augmented to include all of the properties on which customers rely. It is furthermore then technically impossible to add features, because customers might rely on the exact behavior of the product. This, of course, means that the Product Specification must recite properties in sufficient detail for all promised features to be utilized. The same goes for internal interface properties in the Design Specification, whose customers are the verification infrastructure and the design of the subsystems that use those interfaces.
It is also essential that the unique differences between any two specifications (including two versions of the same specification) can be determined. Otherwise, concurrent contributors cannot be accommodated, and updates cannot be propagated in a timely manner, because it takes too much time to figure out what changed. This argues strongly for specifications to be in some text-based format, such as HTML or SDF.
Specifications mustn't be modified at will or without notice. Changes must be made only in consideration of the ramifications, which are usually substantial and beyond the understanding of any single individual. This is best achieved through cooperative negotiation among affected parties.
To make software compatibility maintainable while retaining the freedom to further optimize the implementation, the API should be defined at a level of abstraction that is meaningful to the application, rather than to the implementation. The product should then include a device driver software component that translates the API into low-level hardware accesses. The driver compensates for changes to the implementation without affecting clients of the API.
For example, the API should be aware of the entries within a given lookup table, but it should not be aware of whether the hardware does a binary lookup or a hash lookup, and should not be aware of the number of copies of the table that exist in hardware.
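A sketch of this separation in C++ is shown below. Everything here is hypothetical: the register bus interface, the address layout, and the assumption that the hardware keeps several identical copies of the table at a fixed stride.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical low-level register interface to the hardware.
class RegisterBus {
public:
    virtual ~RegisterBus() = default;
    virtual void write(std::uint32_t address, std::uint32_t data) = 0;
};

// Application-level API: clients deal only in table entries.
class LookupTableApi {
public:
    virtual ~LookupTableApi() = default;
    virtual void set_entry(std::uint32_t key, std::uint32_t value) = 0;
};

// Hypothetical driver for an implementation that keeps `copies` identical
// tables, each `stride` addresses apart. Clients of the API never see this.
class ReplicatedTableDriver : public LookupTableApi {
public:
    ReplicatedTableDriver(RegisterBus& bus, unsigned copies, std::uint32_t stride)
        : bus_(bus), copies_(copies), stride_(stride) {}
    void set_entry(std::uint32_t key, std::uint32_t value) override {
        for (unsigned i = 0; i < copies_; ++i)
            bus_.write(i * stride_ + key, value);
    }
private:
    RegisterBus& bus_;
    unsigned copies_;
    std::uint32_t stride_;
};

// Recording bus, for illustration and testing only.
class RecordingBus : public RegisterBus {
public:
    void write(std::uint32_t address, std::uint32_t data) override {
        writes[address] = data;
    }
    std::map<std::uint32_t, std::uint32_t> writes;
};
```

If a later implementation changes the number of copies or switches to a hash lookup, only the driver changes; code written against `LookupTableApi` is unaffected.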
The API should also be responsible for generating warnings if the software attempts to interact with the hardware in an unsupported manner, or preferably for preventing that from happening at all. This provides a definitive deliverable for such requirements. To improve efficiency, such checking code can be disabled in the production environment (for example, using NDEBUG).
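One way to make such checks removable is to guard them with the standard `NDEBUG` macro, as in the following sketch. The limit `kMaxEntries` and the function name are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <iostream>

// Hypothetical limit on the number of table entries that the hardware supports.
constexpr std::uint32_t kMaxEntries = 1024;

// Emits a warning if `index` is outside the supported range, and returns
// true if a warning was emitted. When NDEBUG is defined, the check is
// compiled out entirely and the function always returns false.
inline bool warn_if_unsupported_index(std::uint32_t index) {
#ifndef NDEBUG
    if (index >= kMaxEntries) {
        std::cerr << "API warning: entry index " << index
                  << " exceeds the supported maximum " << (kMaxEntries - 1)
                  << '\n';
        return true;
    }
#else
    (void)index;  // unused when checking is disabled
#endif
    return false;
}
```

Building the production driver with `-DNDEBUG` (or the equivalent for your compiler) removes the check and its cost.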
Another advantage of as-is design reuse is that fixes for defects discovered in the subsystem automatically propagate to all of its clients. This avoids the all-too-common problem of corrected defects reappearing in future releases. Don't worry if that means changing a product that has already been released, so long as you don't break it. (See On Concurrent Development.) In fact, this is quite beneficial, because the previous product can be used as a verification platform for the fixes.
Unfortunately, design reuse is often impractical because the design must optimize a number of different and often competing metrics, such as:
The principal challenge in making the verification infrastructure reusable is making the tests reusable. Here are a bunch of tips to that end:
This, of course, requires that the API is exposed to tests, and that all supported stimulus patterns are realizable by the test harness. The effort required to accomplish this is nontrivial, but worthwhile. It's a wasted duplicate effort for the test harness to have its own API, and not realizing all possible stimulus necessarily hides defects.
You should avoid this situation by storing shared source code in a shared location. This also drastically reduces the amount of data in the source control database. Sharing source code is most effectively accomplished by using an object-oriented language to express the tests. The Template Method Pattern is known to be particularly useful for this.
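A minimal sketch of the Template Method Pattern applied to tests follows. The phase names and the `BurstTest` class are hypothetical; the point is only that the shared sequence lives in one base class while each test overrides just the steps that are unique to it.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Shared test skeleton: the sequence of phases lives in one place.
class TestBase {
public:
    virtual ~TestBase() = default;
    // The template method: fixed order, customizable steps.
    bool run() {
        log_.clear();
        reset();
        configure();
        stimulate();
        return check();
    }
    const std::vector<std::string>& log() const { return log_; }
protected:
    virtual void reset() { note("reset"); }               // shared default
    virtual void configure() { note("default config"); }  // shared default
    virtual void stimulate() = 0;                         // every test supplies this
    virtual bool check() { return true; }                 // shared default
    void note(const std::string& s) { log_.push_back(s); }
private:
    std::vector<std::string> log_;
};

// A hypothetical test that overrides only what is unique to it.
class BurstTest : public TestBase {
protected:
    void configure() override { note("burst config"); }
    void stimulate() override { note("send burst"); }
};
```

Running `BurstTest` executes the shared reset, then the overridden configuration and stimulus, then the shared check.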
Note that this also argues for using an object-oriented language to express tests.
Crafting effective randomized tests is generally very challenging. Settings and stimulus must be very carefully weighted such that any given legal combination of at least two discrete conditions is covered within about an hour of simulation on average. Estimating the probability of a given discrete condition generally involves running the executable specification in reverse, which is impossible in principle and very difficult in practice. Furthermore, this tends to be quite brittle in the face of changes to the specification, so you'll have to redo the analysis when that happens. Therefore, you should wait until all of the directed tests have been written before you write the randomized tests.
Because it is important to be able to reproduce the results of a random test, you shouldn't use a truly random sequence. Instead, use a pseudo-random sequence with a settable, reported seed. Since it is not uncommon for bugs to hide behind weaknesses in the random number generator, it is critical to use one in which every bit of each number in the sequence is independently sensitive to the seed. In particular, the rand48 family (drand48(), lrand48(), and so on) is usually good enough, whereas random() usually isn't.
It is not possible in general to guarantee that a random test will exhibit similar behavior after a design change. However, you should try to preserve the character of the test for a given seed by using separate pseudo-random sequences for unrelated things.
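The idea of separate sequences can be sketched in C++ as follows. Note that this sketch swaps in `std::mt19937_64` from the standard `<random>` header rather than the rand48 family, purely because it is convenient in C++; the `RandomStream` class and the notion of a numeric stream identifier are assumptions of the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// One independent pseudo-random stream per purpose, all derived from a
// single reported seed. std::mt19937_64 is this sketch's choice of generator.
class RandomStream {
public:
    RandomStream(std::uint32_t seed, std::uint32_t stream_id) {
        std::seed_seq seq{seed, stream_id};
        engine_.seed(seq);
    }
    // Value in [0, bound); bound must be nonzero. The modulo introduces a
    // slight bias, which is tolerated here for the sake of brevity.
    std::uint64_t below(std::uint64_t bound) {
        assert(bound != 0);
        return engine_() % bound;
    }
private:
    std::mt19937_64 engine_;
};
```

For example, packet lengths might draw from stream 0 and inter-packet gaps from stream 1. A design change that consumes more gap values then leaves the sequence of packet lengths for a given seed untouched.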
The amount of random testing that is necessary to reliably detect defects must be determined empirically. It is typically thousands of simulation hours, so you'll need a substantial compute farm to obtain prompt results.
If the environmental requirements of the specification are not satisfied at any point in time, then the resulting future behavior is unspecified, at least until the next reset. In principle, the executable specification should simply stop checking things when this is detected, but in practice that can be misleading, because the test writer might believe that his tests are still actually capable of detecting defects. Therefore, the "all bets off" condition should terminate the simulation promptly, with a descriptive error message.
Similarly, a "many bets off" condition results when a violation of a set of conditional requirements causes the result of some customer-visible transaction to be substantially undefined, but the future behavior must still satisfy at least one operational guarantee. Such a condition should result in a descriptive warning containing some agreed-upon identifiable phrase, such as "UNDEFINED BEHAVIOR". Tests that don't expect this to happen should be capable of configuring the simulator to terminate at that point.
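The two conditions might be handled along the following lines. This is a sketch under stated assumptions: the `ViolationPolicy` class is hypothetical, termination is modeled by throwing an exception that the simulation's top level is assumed to catch and report, and the "ALL BETS OFF" prefix is an invented phrase (only "UNDEFINED BEHAVIOR" comes from the text above).

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Thrown to terminate the simulation promptly.
struct AllBetsOff : std::runtime_error {
    using std::runtime_error::runtime_error;
};

class ViolationPolicy {
public:
    // Tests that do not expect undefined results can ask for termination.
    void set_strict(bool strict) { strict_ = strict; }

    // Environmental requirement violated: nothing can be checked any more.
    [[noreturn]] void all_bets_off(const std::string& why) const {
        throw AllBetsOff("ALL BETS OFF: " + why);
    }

    // Conditional requirement violated: warn, or terminate in strict mode.
    void many_bets_off(const std::string& why) {
        const std::string message = "UNDEFINED BEHAVIOR: " + why;
        if (strict_) throw AllBetsOff(message);
        warnings_.push_back(message);
    }

    const std::vector<std::string>& warnings() const { return warnings_; }
private:
    bool strict_ = false;
    std::vector<std::string> warnings_;
};
```

Keeping the agreed-upon phrase in one place makes it easy for scripts to search simulation logs for it.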
Care must be taken to ensure that specified input requirements are reasonable and satisfiable. For example, setup and hold requirements of synchronous inputs are reasonable. Requirements on how a CPU interacts with the product are also reasonable, provided that there is some means of controlling the behavior of the software. Requiring an input signal to be synchronized to an unobservable and uncontrollable internal node is clearly unreasonable.
The smart checker generally derives API transactions based on the CPU bus activity. Rather than trusting the derived transactions, they can be verified by comparing them to an expect queue based on the actual API calls.
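An expect queue of this kind can be sketched in C++ as follows. The `ApiCall` structure and the `ExpectQueue` class are hypothetical, and the sketch assumes that derived transactions appear in the same order as the actual API calls.

```cpp
#include <cassert>
#include <deque>
#include <string>

// Hypothetical API-level transaction.
struct ApiCall {
    std::string name;
    unsigned argument;
    bool operator==(const ApiCall& o) const {
        return name == o.name && argument == o.argument;
    }
};

class ExpectQueue {
public:
    // Called by the test harness when the test makes an actual API call.
    void expect(const ApiCall& call) { queue_.push_back(call); }

    // Called by the smart checker for each transaction that it derives from
    // CPU bus activity. Returns false on a mismatch.
    bool derived(const ApiCall& call) {
        if (queue_.empty() || !(queue_.front() == call)) return false;
        queue_.pop_front();
        return true;
    }

    // At the end of the test, every actual call should have been derived.
    bool drained() const { return queue_.empty(); }
private:
    std::deque<ApiCall> queue_;
};
```

A mismatch points to a defect in either the driver or the checker's derivation logic, since both are supposed to agree on what the API call was.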
Don't do this. Checkers should be reusable with different test harnesses, in particular, at a higher level of integration. If you want to verify the test harness, do it by comparing the derived stimulus transactions against an expect queue, similarly to how you verify the driver.
On the other hand, a test is allowed to "cheat" by looking at the checker or the design in order to produce the worst-case stimulus, because any stimulus that can be produced by such a closed-loop apparatus could also have been produced by an open-loop apparatus. However, avoid doing that if possible, because it limits the portability of the test.
In particular, since the checker is written in C++, which is not a pointer-safe language, it is advisable for the checker to be a separate process from the simulator. RPC can be used as the basis of this interface, even if the processes are required to execute on the same host.
This approach can also be used to parallelize a very long simulation. The test sequence can be broken into sub-sequences with known beginning and ending states. Each sub-sequence can be simulated starting with its beginning state, provided that its ending state is verified to prove that the concurrent simulation of the next sub-sequence is valid.
Reaching remote states may also be accomplished by adding mechanisms to the actual design that allow secret API calls to modify the state. For example, if a given counter normally overflows every 100,000,000 cycles, then a secret register setting might cause it to overflow every 100,000 cycles instead, such that the overflow condition can be practically simulated. When this approach is used, considerable care must be taken to verify that the correctness of the overall behavior does not rely on using the special unsupported configuration.
However, there are some distinct advantages to simulating at the module level (meaning some level below the unit level). For example, a module can then be verified before the remainder of the unit is completed and integrated. Also, simulation will run faster with a lesser amount of logic to be simulated. Furthermore, the module's inputs are trivial to control and its outputs are trivial to observe, and therefore its internal nodes are also easier to control and observe.
Because typically only the designer knows the intended role of a given module in satisfying the specified properties of its unit, the properties to be verified in module-level simulation are generally recited by designers. The apparatus that verifies those properties, called the module-level checker, documents them.
Because ease of integration with the design is a primary criterion for module-level checkers, the leading candidate languages for expressing them are Vera and Specman. The fact that their licensing cost is significant compared to that of a simulator is a matter of some concern, but you're probably going to use one of them to write the tests anyway.
It is typical for module-level checkers to be developed by dedicated verification engineers working closely with designers. However, since the function of those checkers is dictated exclusively by design, it actually makes more sense for designers to develop them, since (given a little practice) that can be accomplished in about the same amount of time as correctly describing the properties to somebody else.
Because module-level checkers are tied closely to the implementation, which is expected to change much more than the Design Specification, you should expect module-level checkers to be much less reusable than the smart checker.
Don't fall into the trap of relying on information from a module-level checker. In particular, don't assume that simulating an integrated system, with every module being checked simultaneously at the module-level, is tantamount to checking the system as a whole. A module-level checker reflects only the designer's intended behavior, which is often not correct at all, and very often not what is needed to interact correctly with other modules to satisfy system-level properties.
On the other hand, it is safe for the smart checker to get hints from module-level checkers, because hints are not actually relied upon. That is, the smart checker will still flag defects (notwithstanding the fact that they might be false defects) if incorrect hints are supplied.
It is probably impractical to run the module-level checkers in emulation, but you might be able to specify additional emulation logic to detect violations of subtle invariants and drive an external signal low if that happens.
Because the smart checker takes the form of a general program in a Turing-complete language, formal verification for lenient specifications is at least as hard as the halting problem, which is known to be undecidable. There might be ways to address this problem, but I wouldn't expect it to be solved within a decade.
On the other hand, it is practical to formally verify a substantial portion of module-level properties, probably most of them in fact. However, the effort required transcends stating the properties. For example, a module's tables and buffers usually need to be reduced to a trivial size in order for formal verification to be computationally practical. It is open to debate whether such effort is worthwhile.
Any properties that are stated for formal verification should also be monitored in simulation, to account for any modeling discrepancies. If the formal verification tool doesn't do that for you, then it probably makes sense to write your own scripts to automate the generation of monitors.
Unfortunately, the simulator doesn't always know what you want well enough not to give it to you. For example, if you always simulate with maximum delays, you won't catch hold time problems. Fortunately, static timing analysis tools will catch that one for you. Here are some other problems that are more likely to slip through the cracks:
Another technique for finding problems related to asynchronous logic is to replace the flip-flop model with a "glitching" model that produces an unknown for about 1/4 cycle after it transitions. This will make the simulation run more slowly, but it's better to run slowly than not to find bugs.
In order to avoid uncontrolled unknown propagation, you'll probably need to call out some of the flip-flops that feed another clock domain by using a special module that resolves to the equivalent glitch-free flip-flop model. This facilitates static analysis of the design, makes it clear that the flip-flop output cannot fan out until it is synchronized, and draws attention to the fact that special physical design attention may be required to optimize the metastability characteristics of the synchronizer flip-flop(s). If your logic synthesis flow supports re-timing, then you'll need to make sure that it respects these synchronization boundaries.
This is really a ball of snakes that you should prefer not to pick up at all. It's better to observe the conservative design practice of using a proven bullet-proof synchronizer module from the library whenever you cross clock domains.
In general, latches should be used only in arrays, where the area savings is substantial, they can be encapsulated in a BIST collar, and their timing requirements can easily be analyzed manually.
Even worse, the distinction between a gated clock and an ordinary synchronous logic node is not always immediately apparent. You can avoid that by always generating gated clocks using a particular module, including both the AND gate and the negative-edge triggered flip-flop, for that purpose.
It is also somewhat tricky to verify a module having an interface to a tristate bus. A good way to do this is to verify the tristate bus driver independently, and then to verify the module inside the driver, treating the input path as an input signal, and the output path and tristate enable as output signals.
You can verify the unimportance of arrival cycle by simulating with a random delay whose maximum value is several times the minimum clock period, independently on each fanout of a signal whose timing arcs are disabled.
This problem can be averted by initially forcing such flip-flops to a given value, but it won't be safe to do so unless you verify that the initial state is unimportant. In most cases, the prudent solution is to add a reset input to such flip-flops.
If the netlist is already frozen, you might get away with initializing such flip-flops to all zeros for the first simulation, then re-run initializing them to all ones the second time, and then re-run initializing each flip-flop to an independent random state several more times. This yields a partial degree of confidence that the design is functional regardless of the initial state.
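The selection of initial values for that sequence of runs can be sketched as follows. This is only an illustration in C: the function name `choose_initial_state` and the use of the run number as the random seed are my assumptions, and the mechanism for actually forcing the values into the simulator is outside the sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Fills init[0..n-1] with the 0/1 initial value of each reset-less
   flip-flop for regression run number `run`:
     run 0     : all zeros
     run 1     : all ones
     run >= 2  : an independent pseudo-random bit per flip-flop,
                 seeded by the run number so a failure can be replayed. */
void choose_initial_state(unsigned run, uint8_t *init, size_t n)
{
    assert(init != NULL || n == 0);

    if (run >= 2)
        srand(run);

    for (size_t i = 0; i < n; i++) {
        if (run == 0)
            init[i] = 0;
        else if (run == 1)
            init[i] = 1;
        else
            init[i] = (uint8_t)((rand() >> 8) & 1);
    }
}
```

Seeding by run number is a design choice: it keeps each random run reproducible, which matters when one of them exposes a dependence on the initial state.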
    // 1=>0, 0=>1, X=>1, Z=>1
    module inverter(a, b);
      input a;
      output b;
      reg b;
      always @(a) begin
        if (a) b = 0;
        else   b = 1;
      end
    endmodule // inverter

In reality, if an output depends on an input, and the input is unknown, then the output must also be unknown.
The ideal solution is to add code (invisible to the logic synthesis tool, in order to avoid confusing it) to handle this correctly. Unfortunately, that's unreliable, a lot of work, and inefficient to simulate. Instead, such problems are normally covered by post-synthesis simulation, which you'll need to do anyway in order to verify that the logic synthesis tool did nothing unexpected.
Because emulation is typically no more accurate than simulation, but more precise (because it doesn't use unknowns), it is necessarily less definitive. It is also harder to provoke existing defects under emulation. Therefore, emulation does not trump simulation: the failure to observe a defect in emulation does not negate its existence when it is observed in simulation. Similarly, hardware does not trump emulation.
If it is possible to fit an RTL-accurate version of the design into a single FPGA that performs at speed under some favorable conditions (such as low temperature and high voltage), then it usually makes sense to use that for emulation. FPGAs are relatively economical, and can be used for early prototyping of the system that incorporates the product.
Otherwise, it's not always clear whether it makes sense to emulate at all. Before investing millions of dollars in emulation capability, you should consider whether the same goals could be accomplished more economically, for example, by designing for verifiability, using state forwarding, or using static analysis to prove that difficult-to-provoke defects are absent.
Tests should be written at an abstract level, such that they can in principle be used either for simulation or for emulation. In order for emulation to be effective in detecting defects, you'll need to check its results against the smart checker. In general, that requires adding a "hint bus" in emulation to provide hints to the checker. Since the hint bus is probably a bottleneck, you'll need to buffer its input, and timestamp the hints to account for the unpredictable latency that results.
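The buffering and timestamping of hints might look something like the following sketch, written in C as a software model. Everything here is an assumption made for illustration: the type names `hint_t` and `hint_fifo_t`, the 16-entry depth, the 32-bit payload, and the use of a cycle count as the timestamp. The point is only that each hint carries the cycle at which it was produced, so the checker can line it up with the right event no matter how late it arrives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HINT_FIFO_DEPTH 16u  /* hypothetical depth; must be a power of two */

typedef struct {
    uint64_t cycle;    /* cycle at which the hint was produced */
    uint32_t payload;  /* opaque hint data */
} hint_t;

typedef struct {
    hint_t   slot[HINT_FIFO_DEPTH];
    uint32_t head;  /* total number of hints popped so far */
    uint32_t tail;  /* total number of hints pushed so far */
} hint_fifo_t;

/* Producer side: capture a hint together with its timestamp.
   Returns false if the buffer is full (the hint is dropped). */
bool hint_push(hint_fifo_t *f, uint64_t cycle, uint32_t payload)
{
    assert(f != NULL);
    if (f->tail - f->head == HINT_FIFO_DEPTH)
        return false;
    f->slot[f->tail % HINT_FIFO_DEPTH] = (hint_t){ cycle, payload };
    f->tail++;
    return true;
}

/* Consumer side: drain the oldest hint whenever the hint bus has
   bandwidth. Returns false if no hint is pending. */
bool hint_pop(hint_fifo_t *f, hint_t *out)
{
    assert(f != NULL && out != NULL);
    if (f->tail == f->head)
        return false;
    *out = f->slot[f->head % HINT_FIFO_DEPTH];
    f->head++;
    return true;
}
```

A dropped hint is not fatal, since the smart checker does not rely on hints, but it may cost you a false defect, so the depth should be sized against the worst burst you expect.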
If you're paying attention, you've realized that I just claimed that the most important dimension of progress on a complex project is not measurable, at least not until well after the fact. The harsh reality is that unmeasurable objectives are not necessarily unimportant. This presents a substantial challenge for project management.
On the other hand, user-defined high-level coverage metrics (for example, Vera coverage objects) may be capable of effectively measuring the coverage of interaction among distant modules. I don't know how well this works.
The jury is still out on whether it is worthwhile to invest in coverage measurement capabilities beyond the common sense of the verification engineers who write the tests.
Characterization is defined as determining all of the "important" properties of a given design, which is much more difficult than verification. For example, "verifying" a design without a specification is characterization. Enumerating the bugs of a design is also characterization, because defects very often hide behind one another. Verifying an arbitrary specification that is not optimized for practical verifiability is also characterization, because it requires guessing the intermediate properties that make the design verifiable. Verifying specified properties that don't gate the product release is also characterization, because it requires guessing which of the properties are actually needed.
If the alleged characteristics of a defect are simple (which is unusual), then you can verify around it by tracking the defect with temporary code in the specification. The set of waived defects is defined in some central location that is approved as part of the release procedure. For example:
    #include "the_project/waivers.h"

    #ifdef WA_PR0123
      // Do the expected thing
    #else
      // Do the correct thing
    #endif // WA_PR0123
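To make the waiver pattern concrete, here is a self-contained sketch in C. The defect is invented purely for illustration (a saturating counter that sticks at 254 instead of 255), as is the function name `spec_saturating_increment`; the `#define` stands in for the central waiver header so that the fragment compiles on its own.

```c
#include <stdint.h>

/* Stand-in for the central waiver list, which would normally be
   pulled in from the project's waiver header. */
#define WA_PR0123 1  /* hypothetical defect: counter saturates at 254 */

/* Specification model of a saturating 8-bit increment. */
uint8_t spec_saturating_increment(uint8_t count)
{
#ifdef WA_PR0123
    /* Do the expected thing: model the waived, defective behavior. */
    return count >= 254 ? 254 : (uint8_t)(count + 1);
#else
    /* Do the correct thing: saturate at the full-scale value. */
    return count == 255 ? 255 : (uint8_t)(count + 1);
#endif /* WA_PR0123 */
}
```

Removing the single `#define` restores the correct specification, which is what makes it practical to approve the set of waivers in one place.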
On the other hand, verifying around a defect that is difficult to characterize requires avoiding cases that provoke the defect. That's likely to hide additional bugs, including bugs that you wouldn't waive were they known. Don't do that if you require the avoided cases to work at all.
Since you probably can't recruit enough good verification engineers to thoroughly verify the design as it is, don't make their job any harder than it has to be. Instead, optimize your Verification Specification for verifiability, and decide your release criteria with little or no regard for the outstanding defects.
Because of this, it is much better to discover design defects in simulation rather than in hardware. If the number of bugs that are discovered in hardware is greater than about 10, then it is very likely that those bugs will hide even more bugs, and that the interactions between those bugs will be nontrivial; therefore, reliably extrapolating the effect of prospective design changes will be essentially impossible.
Furthermore, when the verification effort is incomplete, one cannot reliably draw conclusions from the shape of the bug curve. Given a fixed level of verification capability, the total number of previously discovered bugs, or bug curve, typically approaches a bug target (in a manner similar to exponential decay) as the design is refined. However, the bug target represents only those bugs that are detectable through verification. As verification capability increases, the bug target increases in a discontinuous and unpredictable manner. Therefore, the final bug target can be extrapolated only after all of the release criteria are verified. You'll probably need to trust your verification engineers to tell you when that has happened (see Coverage).
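One way to write down the shape described above, purely as an illustrative model and not as a fitted law, is the following. The symbols are my own assumptions: $B(t)$ is the cumulative number of bugs discovered by time $t$, $B^{*}$ is the bug target for the current level of verification capability, and $\tau$ is a time constant.

```latex
B(t) \approx B^{*}\left(1 - e^{-t/\tau}\right)
```

Under this model, the curve flattening out tells you only that $B(t)$ is approaching the current $B^{*}$. It says nothing about the value that $B^{*}$ will jump to when verification capability next increases.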
As a rule, you should release only when the bug number is about 5 or less.
The design needs to be completely frozen (in terms of any data seen by Verification) before you can start the timer for the final regression. (I am aware of a case in which a prototype failed because a single comment was changed after the final regression.) It is also a good idea to coordinate releases with the system administrators, to make sure that upgrades are not taking place during the final regression.
You shouldn't trust automated processes to translate the design from one form to another. Tools make mistakes too. However, once a translation process has proven itself reliable, it might make sense to defer the verification of that process until after release, and then withdraw the release only in the unlikely event that a problem is discovered.
It is important to have a definitive copy of all the equivalence-checked design specifications and databases (including unreleased data) corresponding to each release. This is essential for effective root cause analysis, and the data can be difficult or impossible to recover after the fact.
Anders Johnson, last modified $Date: 2003/12/07 $