bytecode verification in a single forward-pass  
Author Message
catalinione





PostPosted: Common Language Runtime, bytecode verification in a single forward-pass Top

ECMA for CLR claims that "It shall be possible, with a single forward-pass through the CIL instruction stream for any method, to infer the exact state of the evaluation stack at every instruction"

In other words, it shall be possible to perform bytecode verification in a single forward-pass.

To make that possible , ECMA requires that

"if that single-pass analysis arrives at an instruction, call it location X,
that immediately follows an unconditional branch, and where X is not the
target of an earlier branch instruction, then the state of the evaluation
stack at X shall be empty."

On the other hand, ECMA claims that it "simulates all control flow paths".

I don't get how this simulation of all paths is done in a "single forward-pass".

Suppose that I have a backward branch instruction Y, targeting an (earlier)
instruction Z. The stack state at Y has to be "merged" with the stack state existing at Z, right

But then, I have to propagate again the (new) stack state at Z to its
successors. But here, it is a contradiction with the "single forward-pass"
bytecode verification.


My guess is the following:

  • that "shall be possible" doesn't mean that the bytecode verification is performed in a single forward pass on the original bytecode stream. The code array has first to be rearranged in order to make possible the verification in a single pass. That probably happens during a JIT compilation.

  • If it's not rearranged, the bytecode verification is similar to the one of JVM, i.e., a dataflow analysis with a fix point interation.
Please someone correct me if I'm wrong.

Thanks!


.NET Development17  
 
 
nobugz





PostPosted: Common Language Runtime, bytecode verification in a single forward-pass Top

I'm not able to follow your train of thought here. The way I read it: if location X is not the target of any previously parsed branch instruction, it must be the target of a backward branch instruction later in the instruction stream. To discover the stack state of location X, it would thus be necessary to parse ahead to the branch instruction and discover the stack state at the location of the branch. That violates the "single forward pass" requirement. To meet the requirement, it is thus necessary to assume that the stack is empty at location X. And by inference, it needs to be empty at the branch instruction which is easily verified.

With this requirement in place, there won't be any need to rearrange the byte codes and the byte code verifier can be quite fast with minimal internal state requirements.


 
 
catalinione





PostPosted: Common Language Runtime, bytecode verification in a single forward-pass Top

My concern is not related with that constraint.

Consider the following example:

X1 and X2 are direct subclasses of X3

and assume that in the single forward-pass, the verification infers the following stack states:

1: [X1]
2: [int32]
3: []
4: [X2]

Assume that the instruction 4 is a conditional branch instruction, pointing (back) to the instruction 1. How does the verifier proceed Are the two stack states [X1] and [X2] merged to [X3] If yes, then the stack state at 1 becomes [X3], but then it shall be propagated to its successor 2 - and this is not single forward pass.

 
 
nobugz





PostPosted: Common Language Runtime, bytecode verification in a single forward-pass Top

That is specifically disallowed by the specification. If location 4 branches back to location 1, the stack needs to be empty at both locations.


 
 
catalinione





PostPosted: Common Language Runtime, bytecode verification in a single forward-pass Top

my example is accepted by the verification.

The constraint you're referring to requires that the instruction 1 shall immediately preceed an "unconditional branch instruction". But, in my example, there is no unconditional branch instruction at 0.



 
 
nobugz





PostPosted: Common Language Runtime, bytecode verification in a single forward-pass Top

Okay, I missed the "conditional". Can you think of any C# code that would produce such a stack state


 
 
catalinione





PostPosted: Common Language Runtime, bytecode verification in a single forward-pass Top

I know what you mean, but that's not the point of the verification.

My program is a verifiable CIL program though it does not correspond to any program in a .NET compliant language

 
 
Mark Benningfield





PostPosted: Common Language Runtime, bytecode verification in a single forward-pass Top

Hello All.

catalinione:

Leaving aside the discussion of the stack state, are you trying to say that a non-.NET compliant language should not be CIL verifiable



 
 
Brian Stern - MSFT





PostPosted: Common Language Runtime, bytecode verification in a single forward-pass Top

No no. I think the point is that you can represent more in IL than you can in higher-level languages. The IL verifier itself doesn't care what languages was used to produce the IL.

When a branch occurs, the state of the stacks at the point of the branch target must be merged. If either the stack depth (as in this case) are not equal or the types on the stack are not compatible, the code is unverifiable. Here is an example I whipped up pretty quick in IL:

.method public static void Test(int32)
{
Start:
ldarg.0
dup
brzero End
ldc.i4.1
sub
starg 0
ldc.i4.0
br Start
End:
pop
ret
}

In a nutshell, this is pushing the number of arguments onto the stack as the parameter specifies. So the stack depth could be variable at the Start label. I think this more-or-less represents the idea above. This code is indeed unverifiable. When PEVerify is run on the code it reports the following:

[IL]: Error: [C:\public\verif.exe : <Module>::Test][offset 0x0000000E] Stack depth differs depending on path.

Brian Stern [MSFT]