Skein tutorial, part 7: not dead, but resting
Pascal Cuoq - 2nd Jun 2011Do you remember the Skein tutorial? It went off to a good start (it started this blog) and then was never really completed. I blame laziness. Looking back at that first post, I notice that indeed, before Boron, we were shipping software without documentation (just think! Ahem). At the time, I also thought that the extended tutorial would be over in a couple of posts, a mistake I have learnt not to make any more. September 2010, those were the days…
The story so far
The solution provided in this post was one of two promised solutions. It was in fact my own solution. While conceiving it I was determined not to look at Skein's source code. I could claim that this was to make the verification method completely independent of the code which is obviously good. It allows for instance to swap in another implementation — say another entry in the NIST SHA3 contest — and hopefully to verify it just by relaunching the same script. The true reason I didn't want to look at the code is more likely written in bold at the top of this post.
Because I didn't want to look at Skein's source code my solution involved the meta-argument that you can cover all possible executions of the loop
while (Frama_C_interval(0 1)) { arbitrarize_msg(); Skein_256_Update(&skein_context msg 80); }
by analyzing instead the two pieces of code below which between them do all the same things:
while (Frama_C_interval(0 1)) { arbitrarize_msg(); Skein_256_Update(&skein_context msg 80); arbitrarize_msg(); Skein_256_Update(&skein_context msg 80); }
arbitrarize_msg(); Skein_256_Update(&skein_context msg 80); while (Frama_C_interval(0 1)) { arbitrarize_msg(); Skein_256_Update(&skein_context msg 80); arbitrarize_msg(); Skein_256_Update(&skein_context msg 80); }
The alternate solution
In July 2010 I was teaching at the FLOLAC summer school and I was lucky to have Josh Ko as my Teaching Assistant. He did not have my prejudices against looking at the code being verified. The alternate and better solution that follows is his.
We determined when we elaborated the solution based on pairing the calls to Skein_256_Update()
that the issue was the field skein_context.h.bCnt
that contained 0
before the first iteration and alternately 16
or 32
after that. Pairing the calls to Skein_256_Update()
was one way to force similar calls to be analyzed together and prevent a loss of precision caused by not knowing whether skein_context.h.bCnt
was 16
or 32
. Josh Ko's solution instead prevents the loss of precision by nudging the value analysis into studying these cases separately by way of an annotation inside Skein_256_Update()
:
--- skein.c.orig 2011-06-02 21:06:29.000000000 +0200 +++ skein.c 2011-06-02 21:06:50.000000000 +0200 @@ -159 6 +159 8 @@ Skein_Assert(ctx->h.bCnt <= SKEIN_256_BLOCK_BYTES SKEIN_FAIL); /* catch uninitialized context */ + /*@ assert ctx->h.bCnt == 0 || ctx->h.bCnt == 16 || ctx->h.bCnt == 32 ; */ + /* process full blocks if any */ if (msgByteCnt + ctx->h.bCnt > SKEIN_256_BLOCK_BYTES) {
We have seen how to provide useful annotations in this post. The above annotation is another example. With it we can use this main()
:
void main(void) { int i; u08b_t hash[HASHLEN]; Skein_256_Ctxt_t skein_context; Skein_256_Init(&skein_context HASHLEN * 8); arbitrarize_msg(); Skein_256_Update(&skein_context msg 80); arbitrarize_msg(); Skein_256_Update(&skein_context msg 80); while (Frama_C_interval(0 1)) { arbitrarize_msg(); Skein_256_Update(&skein_context msg 80); } Skein_256_Final( &skein_context hash); }
$ time frama-c -val -slevel 1000 -slevel-function main:0 *.c -cpp-command "gcc -m32 -C -E -I. " > log2 real 1m48.218s user 1m45.133s sys 0m1.325s
This single main()
cover all the programs with n>=2 calls to Skein_256_Update(… 80)
:
$ grep ssert log2 skein.c:162:[value] Assertion got status valid.
And that's that: no alarm is emitted and the annotation that we provided for the value analysis is verified before being used (so it is not an assumption).
For the sake of completeness the cases of 0 and 1 calls to Skein_256_Update(… 80)
should also be checked separately. We have done that before so there is no need to repeat it here.
Generalizing
The advantage of the method presented in this post over the previous one is that it can more easily be generalized to different buffer lengths passed to Skein_256_Update()
. The more complete verification that I last alluded to uses among other annotations the assertion below.
/*@ assert ctx->h.bCnt == 0 || ctx->h.bCnt == 1 || ctx->h.bCnt == 2 || ctx->h.bCnt == 3 || ctx->h.bCnt == 4 || ctx->h.bCnt == 5 || ctx->h.bCnt == 6 || ctx->h.bCnt == 7 || ctx->h.bCnt == 8 || ctx->h.bCnt == 9 || ctx->h.bCnt == 10 || ctx->h.bCnt == 11 || ctx->h.bCnt == 12 || ctx->h.bCnt == 13 || ctx->h.bCnt == 14 || ctx->h.bCnt == 15 || ctx->h.bCnt == 16 || ctx->h.bCnt == 17 || ctx->h.bCnt == 18 || ctx->h.bCnt == 19 || ctx->h.bCnt == 20 || ctx->h.bCnt == 21 || ctx->h.bCnt == 22 || ctx->h.bCnt == 23 || ctx->h.bCnt == 24 || ctx->h.bCnt == 25 || ctx->h.bCnt == 26 || ctx->h.bCnt == 27 || ctx->h.bCnt == 28 || ctx->h.bCnt == 29 || ctx->h.bCnt == 30 || ctx->h.bCnt == 31 || ctx->h.bCnt == 32 ; */