Shadow model and coverage driven processor verification using SystemVerilog
1. Introduction
2. Improving the verification environment
3. Result checking using shadow modeling
4. Functional coverage
5. The verification software
6. Debugging the system
By Arthur Freitas | Posted: June 1, 2007

This paper describes a random test generation strategy we are using to complement the verification of upcoming generations of our processors. SystemVerilog provided the means to define the functional coverage of our design and to employ the shadow modeling technique, significantly improving our verification flow. Shadow modeling is a reliable method for proving the functionality of a design because different engineers implement the reference model and the RTL, and it is very unlikely that two distinct teams, using two different programming languages, will implement the same errors in the same way.

The functionality of the microprocessor is already embodied in its reference model, the instruction set simulator (ISS). The ISS is considered to be a running specification of our microprocessor. Although it was not conceived for use in hardware verification, we are leveraging the investment spent on its creation for this purpose. In this way, we gained the time – otherwise invested in conceiving and implementing directed tests for difficult corner cases – to specify and implement the functional coverage of the design. Our verification software not only substantially improved the verification flow, but also now gives us virtually unlimited opportunities to further enhance the verification of the system.
1. Introduction

In earlier generations of Hyperstone microprocessors, we used self-checking directed tests as our main approach to functional verification. In conjunction with booting the OS and running some application programs written in C, this was considered sufficient. As the architecture became more complex, many more test cases had to be covered, so we supplemented the existing verification methods with random test generation and automated result checking.

Three independent tasks must be implemented to employ this verification methodology:

1. Random test generation: a stream of random instructions, properly constrained so they do not put the microprocessor into an illegal state.

2. Reference modeling: a 'golden' model used to ensure that the random test run produces the correct results. The model runs the same test and generates reference results used for comparison in simulation.

3. Functional coverage: a mechanism that measures the functional coverage of the generated random tests because, unlike with directed tests, we do not know in advance what is being tested and what is not.

These tasks can be implemented in many ways. The following sections describe how we implemented our verification system based on random test generation and automated result checking.

2. Improving the verification environment

Our original verification environment was based on a simple testbench that instantiated the processor RTL code and its behavioral memory models. We had a regression suite based on directed tests, combined with the boot simulation of the Hyperstone real-time kernel and other application programs. As our architecture became more complex, many more test cases had to be covered, so we added random test generation and automated result checking. A good random test generator can generate tests to cover different addressing modes, instruction combinations, pipeline issues, and so on; a sketch of such a generator follows.
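For illustration, here is a minimal sketch of how such a constrained-random instruction stream could be modeled in SystemVerilog. The class name, field layout, and opcode constraints are hypothetical and are not taken from the Hyperstone environment; the point is only the idea of excluding illegal encodings and biasing interesting ones.

// Hypothetical constrained-random instruction model (illustrative only).
class random_instr;
  rand bit [7:0] opcode;  // operation selector
  rand bit [3:0] rd;      // destination register code
  rand bit [3:0] rs;      // source register code

  // Keep the stream inside the legal instruction space, e.g. by
  // excluding encodings assumed here to be reserved or illegal.
  constraint legal_opcode_c { !(opcode inside {8'h00, 8'hFF}); }

  // Bias the mix, e.g. toward register-to-register moves
  // (0x25 is the MOV global<-local opcode discussed in Section 4).
  constraint bias_c {
    opcode dist { 8'h25 := 4, [8'h01:8'h24] := 1, [8'h26:8'hFE] := 1 };
  }

  // Pack the fields into a 16-bit RR-style machine word.
  function bit [15:0] encode();
    return {opcode, rd, rs};
  endfunction
endclass

module gen_demo;
  initial begin
    random_instr ri = new();
    repeat (10) begin
      void'(ri.randomize());
      $display("instr = %h", ri.encode());
    end
  end
endmodule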
Our approach was initially based on Perl scripts generating constrained-random assembly code. This was assembled to create a loadable memory image of the test for the RTL simulation. The same test had to be run in a reference model prior to the RTL simulation to generate the reference result files. This reference model was a behavioral instruction set simulator (ISS) written in C. During the simulation, these files were loaded and used for comparison with the main RTL results. On a mismatch, the simulation was immediately stopped. To address functional coverage, we used the PSL cover directive to cover properties defined on the instruction register (IR).

Figure 1. Shadow modeling

This resulted in an over-complicated verification flow. Automation was cumbersome and testing was limited by the amount of memory in the system. The longer the test, the bigger the reference file, and the file I/O usually slowed down simulation performance. Because of the limited test duration, we had to start several simulations, merge the functional coverage databases, and decide whether we needed to start a new simulation or not.

We decided to use shadow modeling to improve the flow. This entails integrating the reference model into the logic simulation. As a result, we no longer needed to run tests in the ISS to generate reference files for the RTL simulation; the reference results were generated on the fly by the ISS during simulation. To integrate the ISS into the logic simulation, we used SystemVerilog's direct programming interface (DPI). Additionally, we replaced the Perl scripts with verification software written in C. In simulation, instead of loading a memory image of a test containing the random instructions, we now load the cross-compiled verification software. While executing the software, the processor generates random machine code on the fly, copies it into a memory segment, branches to this segment, and executes the just-generated code. After finishing execution, it repeats this loop over and over again.
We are no longer limited by the system memory size: the software running in the device under test (DUT) generates and executes its stimuli indefinitely. We used SystemVerilog constructs to define and monitor the functional coverage, so our software could run indefinitely until it automatically reached the coverage goals (see Section 4). In summary, the verification flow has been simplified to:

1. Compile the verification software into a loadable memory image.

2. Simulate the execution of the verification software until it reaches the coverage goals.

Another advantage here is that the verification software is aware of the processor state and can, on its own, steer the direction of the tests it creates. Previously, we were not generating tests on the fly but in advance, and then simply running them on the microprocessor. These tests had no intelligence: they were simple random code, blindly executed to perform result comparison with the reference model. Now, there are virtually no limitations on how we can improve the software (e.g., building in intelligence to better constrain the stimuli).

3. Result checking using shadow modeling

Random instruction generation can only be used in verification if the same stream of instructions is run on a reference model and the results are compared against the RTL implementation. To perform our result checking, we used 'shadow modeling'. With this technique, a reference model is simulated in parallel with the DUT. Every time the DUT completes an instruction, the reference model is triggered to execute the same instruction. When the reference model finishes executing this instruction, the two results are compared and mismatches are flagged. The reference model can be an ISS or a cycle-accurate model of the microprocessor. The results are snapshots of the microprocessor's internal register file, which are written every time an instruction is executed. Figure 1 depicts our system.

3.1 The reference model

We used an existing ISS as our reference model. It was a non-cycle-accurate simulation model of the Hyperstone E1-32X microprocessor written in C. It simulates not only the full instruction set architecture (ISA) but also the memories and peripheral circuits. After every instruction is executed, the entire register file is saved in a set of variables. Programs are run sequentially; neither instruction pipelining nor any hardware-level timing of the E1-32X is modeled. The fact that the ISS is not cycle-accurate presents some challenges, discussed in the next subsection.
Figure 2. Testbench – checking the results

3.2 Integrating the shadow reference model

To integrate our reference model into the HDL simulation, we used the SystemVerilog DPI. It simplifies the task of integrating C code into a logic simulation and offers very good simulation performance. For the ISS integration, we wrote an interfacing function in C to hold and transmit the required parameters to the actual ISS. We imported this C function into the testbench with the following statement:

import "DPI-C" context task ProcessorCall (input int reset, input int intrpt1, …, input int pin1, …);

Once integrated into the simulation, the ISS acts as a slave of the testbench. When called, it takes control of the simulation to execute one single instruction. When this is finished, it hands control back to the logic simulator. Instructions executed by the ISS do not consume simulation time. The ISS is not cycle-equivalent to the real system: multi-cycle instructions (e.g., DIV and MUL) are executed in a single call, and interrupts are reported to the ISS on every call.

To perform result checking, the DUT and the ISS must run synchronously. We cannot call the ISS at every clock cycle, because multi-cycle instructions would report their results immediately and the program counter (PC) would be updated, resulting in a loss of synchronization. So, we created a signal in the microprocessor's RTL code that indicates when an instruction has finished its execution and has had its results written back to the register file. We named this flag 'pipe_wb' (i.e., pipeline write-back), and it triggers the call to our ISS.
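As a minimal sketch, assuming hypothetical signal names apart from 'pipe_wb' and an abbreviated ProcessorCall parameter list, the trigger might look like this in the testbench:

// Minimal sketch: invoke the shadow ISS on every instruction write-back.
// The module boundary and signals other than pipe_wb are assumptions;
// the ProcessorCall parameter list is abbreviated.
import "DPI-C" context task ProcessorCall(input int reset, input int intrpt1);

module shadow_sync(input logic clk, input logic rst,
                   input logic irq1, input logic pipe_wb);
  always @(posedge clk) begin
    if (pipe_wb)
      // Executes one instruction in the ISS in zero simulation time.
      ProcessorCall(int'(rst), int'(irq1));
  end
endmodule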
3.3 Checking results

To check the results, we use the DPI to import an existing ISS function that returns the current value of a register in the microprocessor's register file. We imported this C function into the testbench with the following statement:

import "DPI-C" function int GetReg (input int reg_index);

After the ISS is called to execute an instruction, we call a Verilog task that loops over all registers in the microprocessor's register file and compares their contents with those generated by the reference model.

There are exceptions that require special handling. For example, the result of a LOAD instruction may not be immediately used by the subsequent instruction as a source operand. The LOAD may take several clock cycles to conclude its execution, and as long as its result is not required by the current instruction, the processor does not stall program execution. Yet the ISS does not model this behavior; its result is immediately written to the destination register. Therefore, we built a mechanism that prevents the comparison of this register until the result is also available in the DUT.

Before we can compare the registers, we have to pre-initialize them in software. Since both the DUT and the ISS run the same test program, they are initialized identically. After initialization, the processor writes a value to a memory-mapped register in the testbench to signal that initialization is finished and comparison can start.

Figure 2 illustrates a simplified version of the testbench. Lines 0-2 show the import declarations for the C functions of the reference model, i.e., the ISS (we had to import 'ProcessorCall' as a 'context task' because it calls our ISS, which in turn calls other functions). Lines 10-20 depict the code where the testbench waits until the software finishes the initialization of all registers in the processor's register file. The processor then writes the value 0x12341234 to the 'MONITOR' register, which is memory-mapped into the microprocessor's address space. This sets the signal 'compare' to 1'b1, enabling the actual register comparison in line 38. Lines 24-31 depict the actual result comparison. The signal 'pipe_wb' flags when an instruction has written its results to the processor's register file. This triggers the call of the reference model ('ProcessorCall') so that it executes the same instruction and stays in sync with the DUT. The passed parameters are assigned to global variables of the ISS before the ISS main function is called to resume program execution. Before the function returns, the ISS saves its state in an array of global variables ('reg[RegIndex]'). Lines 33-51 depict the task 'Load_Compare', which loops over all registers contained in the processor's register file and compares their contents with those generated by the reference model; a sketch of what such a task might look like is shown below.
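Figure 2 itself is not reproduced here; the following is a minimal sketch of such a comparison task, assuming a 32-entry register file, a hypothetical hierarchical path to the DUT's register file, and a simple skip flag for pending LOAD results:

// Sketch of a register comparison task (names and paths are assumptions).
module compare_sketch;
  import "DPI-C" function int GetReg(input int reg_index);

  bit skip_compare [32];                 // set while a LOAD result is pending

  task automatic Load_Compare;
    int dut_val, iss_val;
    for (int i = 0; i < 32; i++) begin   // assumed 32-entry register file
      if (skip_compare[i]) continue;     // defer registers with pending LOADs
      dut_val = tb_top.dut.regfile[i];   // hypothetical hierarchical path
      iss_val = GetReg(i);               // value from the shadow model
      if (dut_val != iss_val)
        $error("Reg %0d mismatch: DUT=%h ISS=%h", i, dut_val, iss_val);
    end
  endtask
endmodule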
4. Functional coverage

Functional coverage measures the functionality exercised in the design and, properly defined, helps indicate the completeness of the verification plan. It also helps engineers identify untested parts of the design and concentrate on reaching the verification goals within the optimal number of simulation cycles. We employed SystemVerilog to specify our functional coverage models, as it provides many extensions to facilitate the specification, computation, and monitoring of a system's functional coverage.

One important coverage goal was to ensure that all instructions were tested in their most important modes. We used coverage groups to specify the functional coverage of our entire instruction set architecture (ISA). The Hyperstone microprocessor has variable-length instructions of 16, 32, and 48 bits. The next two subsections explain how we used coverage groups to specify the functional coverage of the MOV instruction of the Hyperstone ISA.

Figure 3. RR instruction encoding

Figure 4. MOV functionality

Figure 5. MOV encoding

4.1 Specification for the MOV instruction

The MOV instruction is a 16-bit instruction of format 'RR', meaning it accepts global (i.e., G0…G15) or local (i.e., L0…L15) registers as both destination and source operands. Figure 3 depicts how instructions of type 'RR' are encoded in the instruction register (IR). In a MOV instruction, the content of a source register is copied to the destination register, and the condition flags are set or cleared accordingly. Figure 4 depicts its functionality. 'Z', 'N', and 'V' refer to the zero, negative, and overflow flags, respectively.
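To make the RR layout concrete, the fields can be pictured as a packed struct over the 16-bit IR. The field widths below (an 8-bit opcode followed by 4-bit Rd and Rs codes) are inferred from the 'MOV G3, L2' example in the next subsection and are an illustration, not the official encoding tables.

// Illustrative packed-struct view of a 16-bit RR instruction.
// Field widths are inferred from the MOV G3, L2 example below
// (machine code 0x2532), not taken from the ISA manual.
typedef struct packed {
  bit [7:0] opcode;   // operation and global/local operand mix
  bit [3:0] rd_code;  // destination register index
  bit [3:0] rs_code;  // source register index
} rr_instr_t;

module rr_encode_demo;
  initial begin
    rr_instr_t mov = '{opcode: 8'h25, rd_code: 4'h3, rs_code: 4'h2};
    $display("MOV G3, L2 -> %h", mov);  // prints 2532
  end
endmodule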
The encoding scheme for the MOV instruction is depicted in Figure 5. For example, the assembly instruction 'MOV G3, L2' produces the machine code 0x2532, where the opcode 0x25 indicates that it is a MOV instruction with a global register as destination and a local register as source. The remaining byte, 0x32, is the concatenation of the Rd-code and the Rs-code and indicates that the register indexes are 0x3 for the destination and 0x2 for the source.

4.2 Functional coverage for the MOV instruction

To specify the functional coverage model for the MOV instruction, we used the code in Figure 6. This counts how many times each variant of the MOV instruction has been executed.

Figure 6. Covergroup for MOV instruction

To collect the coverage information, we defined the coverage group 'cg_MOV'. We created three coverage points associated with the signal IR, and two coverage points associated
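Figure 6 itself is not reproduced above. As a rough sketch, a covergroup with three coverage points over the IR of a 16-bit RR instruction might look like the following; the bins, the variant opcodes other than 0x25, and the sampling event are assumptions, not the original cg_MOV code.

// Hypothetical sketch of a covergroup like cg_MOV; the real code is in
// Figure 6. Bins, variant opcodes other than 0x25, and the sampling
// event are assumptions.
module cov_sketch(input logic [15:0] IR, input logic pipe_wb);
  covergroup cg_MOV @(posedge pipe_wb);
    cp_opcode : coverpoint IR[15:8] {
      bins mov_variants[] = {8'h24, 8'h25, 8'h26, 8'h27};
    }
    cp_rd : coverpoint IR[7:4];   // destination register code
    cp_rs : coverpoint IR[3:0];   // source register code
  endgroup
  cg_MOV cov = new();             // sample on every instruction write-back
endmodule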