Frama-C-discuss mailing list archives
This page gathers the archives of the old Frama-C-discuss archives, that was hosted by Inria's gforge before its demise at the end of 2020. To search for mails newer than September 2020, please visit the page of the new mailing list on Renater.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Frama-c-discuss] Adding some semantic meaning to a few (standard string) functions
- Subject: [Frama-c-discuss] Adding some semantic meaning to a few (standard string) functions
- From: Pascal.Cuoq at cea.fr (Pascal Cuoq)
- Date: Mon, 15 Dec 2008 15:16:46 +0100
- In-reply-to: <1229346661.3335.142.camel@supelec>
- References: <1229346661.3335.142.camel@supelec>
On Dec 15, 2008, at 2:11 PM, Jonathan-Christofer Demay wrote: > In the example code at the end of this message, at line 28, frama-c is > able to tell that boo's value is 2. However, at line 23, it can't tell > that *foo's value is "foobar". > > Note that I didn't wrote foo's value, but *foo's value. I know that > frama-c is already analyzing foo's value as a pointer. But analyzing > (a > bit) what it's poiting at might be useful. Of course, since > generally a > pointer can point at anything, in most case, it might see nothing, but > at least in the case of strings I think a few things might be > computed. > > So I was wondering if it would possible to add this kind of > knowledge to > frama-c through a plugin. I think that I see what you mean. In this particular case it may be easier to patch the value analysis than to try to write a different plug-in. > Of course, If I were to code, for example, my own strcmp-like > function, > there's no way to automatically deduce this. But at least, for > standard > string functions, and only for a few of them, we might want to > statically give this knowledge to frama-c. For example, if strcmp > return > 0, then it can be seen as ==, and so from the statement line 20, we > know > that *foo can be anything but {""}. Yes, it is possible to do this in principle. There remains the problem of encoding the information that can be extracted from the condition into the existing representation for program states, or to define a new representation for this additional information. The case where strcmp(v, LITERAL)==0 can be encoded exactly in the existing representation. At line 18 in your example, the information from the if condition is equivalent to foo[0] == 0, which can be represented. In general, however, it is not always possible to represent any kind of information, even information that is very obvious for a programmer reading the code. For instance, because the value analysis actually represents the value of each char in v separately, the information that strcmp(v, LITERAL)!=0 is a kind of disjunction (the condition may be false because any character in v is different from the literal string it is compared to), which is generally speaking a bother to manipulate. Don't forget that the analyzer will need to be able to compose pieces of information that comes from different execution paths, and to extrapolate fixpoints when it encounters loops. It is very tempting to go all out and represent every little piece of knowledge that you can think of but in practice, it is better to limit yourself to the information for which you are able to define good composition rules for the operations that appears in C. > We could also see strcpy as =, and > so after the statement line 20, *bar is in {"foobar"}, and so is foo > at > the statement line 23. To go further in making the analyzer realize this, see the remark at the end of this message. > So any comment on this ? Bad/good idea ? Achievable or not ? (at least > in some simple case like this example). I'm already reading the plugin > manual ^^. I would suggest thinking about the primitives (strcpy, ..) that you want to provide, and how their respective effects can be encoded in the existing representation of states (which you would have to become familiar with). Next would be to list the syntactic patterns for which it is possible to infer some information that can be represented in the current states (strcmp (v, LITERAL)==0 is one). Defining a new abstract domain for strings would come after this in difficulty. I should also point you to existing methods to infer loop invariants automatically in presence of string manipulations in the Jessie plug-in. See http://www.lri.fr/~moy/minix3.html and the pointers there for more information. If you want to play with this, you need to have the APRON library installed (http://apron.cri.ensmp.fr/library/). APRON in turn depends on libraries such as GMP and MPFR. Hopefully your distribution would contain packages for all these necessary libraries. You may also be interested in reading this: http://penjili.org/papers/AllamigeonGodardHymansSAS06.pdf > Another question about something I noticed while writing this example. > The statement at line 20 was originally written like this: > > if(bar != strcpy(bar, "foobar")) > return -1; > > But it makes frama-c think this condition is always true and so all > code > after this statement is considered as dead code. Since strcpy in most > case (when everything goes right) will in fact return 'bar', I don't > see > what is wrong here. This is because your project did not include the code for function strcpy. Since it is a library function whose return type is a pointer, it was assumed to return a pointer to a location without aliases to other variables in the program. In these conditions bar!=strcpy(bar, "foobar") is always true. You can fix this by providing the code of the actual function strcpy that will be executed when your program runs, or a simplified equivalent version. Some common functions are provided in share/libc.{c,h}. Note the case of the first definition of memcpy: void* memcpy(void* region1, const void* region2, size_t n) { if (n > 0) Frama_C_memcpy(region1, region2, n); return region1; } This function calls a built-in function, which allows to make any necessary transformation on the interpreter's state, in a simpler way than making the analyzer look at one possible implementation. Our built-ins are not perfect yet, which is why an alternative is proposed for memcpy, but what I am suggesting is that you look at this mechanism first. Pascal -------------- section suivante -------------- Une pi?ce jointe HTML a ?t? enlev?e... URL: http://lists.gforge.inria.fr/pipermail/frama-c-discuss/attachments/20081215/7e861267/attachment.htm
- References:
- [Frama-c-discuss] Adding some semantic meaning to a few (standard string) functions
- From: jcdemay at rennes.supelec.fr (Jonathan-Christofer Demay)
- [Frama-c-discuss] Adding some semantic meaning to a few (standard string) functions
- Prev by Date: [Frama-c-discuss] returns vs loop invariants
- Next by Date: [Frama-c-discuss] returns vs loop invariants
- Previous by thread: [Frama-c-discuss] Adding some semantic meaning to a few (standard string) functions
- Next by thread: [Frama-c-discuss] returns vs loop invariants
- Index(es):