daikon/doc/developer.texinfo at d28c06fa8afcd3cca4cbd42f3aebcf77b86d407e · codespecs/daikon · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
\input texinfo   @c -*-texinfo-*-
@c %**start of header
@setfilename developer.info
@settitle The Daikon Invariant Detector Developer Manual
@c %**end of header

@c To update all the nodes and menus all at once:   C-u C-c C-u m
@c You shouldn't need to do that, though; makeinfo does it for you.

@c Tex magic to reduce the margins for the pdf version;
@c these values have no effect on the html output.
@tex
\global\hsize = 6.5in
\global\normaloffset = 0in
@end tex

@macro daikonemail{}
@email{daikon-developers@@googlegroups.com}
@end macro

@macro nospellcheck{text}
\text\
@end macro

@c set overall document style
@c @setchapternewpage odd
@paragraphindent 1
@firstparagraphindent insert
@codequotebacktick on

@c Avoid black boxes marking overfull hboxes in TeX output.
@finalout

@c Start of Document

@titlepage
@sp 10

@c Could also use @title, @subtitle, @author here.
@center @titlefont{Daikon Invariant Detector}
@sp 1
@center @titlefont{Developer Manual}

@sp 2
@center Daikon version 5.7.1

@sp 1

@ifset final
@c FINAL version
@c Print out the Daikon version 5.7.1 release date
@center August 23, 2018
@end ifset

@ifclear final
@c DRAFT version
@center DRAFT Version @today{}
@end ifclear

@sp 5
@c reads daikon-logo.{eps,pdf} (not .txt, .png, or .jpg, because info
@c and HTML don't get the title page)
@image{images/daikon-logo,4in,}

@c The following two commands start the copyright page.
@page
@vskip 0pt plus 1filll
Copyright @copyright{} 1998-2014
@end titlepage

@c set document heading/footing style
@c The texinfo support for headings is too broken to use;
@c perhaps we could fix texinfo.tex some day.
@c @headings off
@c @evenheading @thispage @| @thischaptername @| Chapter @thischapternum
@c @oddheading Section @thissectionnum @| @thissectionname @| @thispage
@c @oddheading Chapter @thischapternum: @thischaptername @| @| @thispage
@c @evenheading @thispage @| @| Section @thissectionnum: @thissectionname

@ifclear final
@c DRAFT version
@everyfooting DRAFT @| @| @today{}
@end ifclear

@html
@image{images/daikon-logo}
@end html

@c Putting this lower in the HTML version looks a little bit strange,
@c but it's acceptable.
@ifnothtml
@contents
@end ifnothtml

@ifnottex
@node  Top
@top Daikon Invariant Detector Developer Manual

This is the developer manual for the Daikon invariant detector.
It describes Daikon version 5.7.1, released August 23, 2018.

@menu
* Introduction::
* Extending Daikon::
* Debugging Daikon::
* Daikon internals::
* Testing::
* Editing::
* Distribution::
* Historical::
* File formats::
* General Index::

@c Putting end menu here is not how it's supposed to be done,
@c but it seems to fix the html toc indentation bug.
@end menu
@end ifnottex

@ifhtml
@contents
@end ifhtml


@node    Introduction
@chapter Introduction

This is the developer manual for the
@ifhtml
@uref{http://plse.cs.washington.edu/daikon/, ,Daikon invariant detector}.
@end ifhtml
@ifnothtml
Daikon invariant detector.  See: @*
@ @ @ @ @ @ @uref{http://plse.cs.washington.edu/daikon/}.
@end ifnothtml

For information about using Daikon, see the
@uref{http://plse.cs.washington.edu/daikon/download/doc/daikon.html#Introduction,Daikon User Manual}.
This manual is intended for those who are already familiar with the use
of Daikon, but wish to customize or extend it.

Additional information can be found in the technical papers available from
@w{@uref{http://plse.cs.washington.edu/daikon/pubs/}.}

@node    Extending Daikon
@chapter Extending Daikon

@cindex extending Daikon
@cindex changing Daikon
@cindex customizing Daikon
@cindex modifying Daikon

This chapter describes how to customize or modify Daikon.

@menu
* Compiling Daikon::
* Version control repository::
* Using Eclipse::
* New invariants::
* New derived variables::
* New formatting for invariants::
* New front ends::
* New suppressors::
* Reading dtrace files::
* System.exit::
@end menu

@node    Compiling Daikon
@section Compiling Daikon

@cindex compiling Daikon

To compile Daikon, type @command{make} in @file{$DAIKONDIR/java/} or any of its
subdirectories.  Alternately, type @command{make -C $DAIKONDIR compile}.
To create the @file{daikon.jar} file, type @command{make -C $DAIKONDIR daikon.jar}.

The distribution includes @file{daikon.jar} and compiled @file{.class} files,
so you do not need to compile them yourself unless you make changes.

For more information about compiling Daikon, see the comments in the Makefiles.

When you compile Daikon, environment variable
@env{JAVA_HOME} must be set.
See the installation instructions
(@ref{Installing Daikon,,,daikon,Daikon User Manual}) for details.
Thus, a complete @file{.bashrc} or @file{.bash_profile} shell setup file
(for Fedora or Ubuntu)
that would enable you to compile Daikon might look like the following:

@example
export JAVA_HOME=/usr/lib/jvm/java

## Not required
# export DAIKONDIR=$HOME/daikon
# source $DAIKONDIR/scripts/daikon.bashrc
@end example


@cindex .jpp files
@cindex jpp files

When setting up environment pathname variables under Windows/Cygwin
(such as @command{JAVA_HOME}) make sure that the
pathname is specified in Linux format (e.g., @file{/cygdrive/c/daikon}
rather than @file{C:\daikon}).  However, the
classpath in a @command{-cp} or @command{-classpath} argument to @command{java}
is a special case;
as @command{java} is a Windows
program, it expects classpath to be in Windows format.
This is the only exception to Cygwin looking like Linux.
Thus the classpath must
always be in Windows format; i.e., it should use semicolons (;) rather
than colons (:) as a path separator.
Using the wrong filename format or the wrong pathname separator
is a common source of errors that will result in odd error
messages and build failures.

One easy way to deal with this is to use the @command{cygpath} utility:
@example
java -cp "`cygpath -wp linux/path/a:linux/path/b`" <other arguments>
@end example
@noindent
Note both the back quotes (@samp{`}) to force @command{cygpath} to be evaluated
before calling @command{java} and the double quotes (@samp{"}) to protect against
special characters and spaces in Windows paths.


@menu
* Requirements for compiling Daikon::
* Requirements for compiling Kvasir::
@end menu


@node    Requirements for compiling Daikon
@subsection Requirements for compiling Daikon

The list at the end of this section identifies the software you will
need to have on your path.

To install all these dependencies under Ubuntu, run the command:
@exampleindent 2
@example
sudo apt-get install openjdk-8-jdk gcc ctags graphviz netpbm texlive texinfo
@end example
@exampleindent 5
Similar commands would be used with other package management systems
on other operating systems.

@cindex Windows, compiling
To compile Daikon on Windows, the best approach is to first install the Cygwin
toolset (available at @uref{http://cygwin.com/}), which
contains everything you need to compile and run Linux programs under
Windows.  You can install Cygwin by running one of the setup
programs (based on your machine type) found
at @uref{http://cygwin.com/install.html}.
Then, install any necessary programs such as those below.

@itemize @bullet

@item
a Java compiler such as @samp{javac}.  This is needed because Daikon is
written in Java.
@cindex @samp{javac} compiler, overriding
@cindex Java compiler, specifying
To override the default Java compiler (@samp{javac}), create a
@file{Makefile.user} file in the @file{daikon} (@env{$DAIKONDIR}) directory and add a line
like the following.
@example
JAVAC ?= jikes -g +E +F
@end example

@item
the C preprocessor
@command{cpp}.  This is needed to convert each @file{.jpp} file in the
distribution into multiple @file{.java} files, which are then compiled.
If you have a C compiler, you almost certainly have @command{cpp}.
If you do not have @command{cpp} (or @command{gcc}, which can emulate
@command{cpp} via @command{gcc -E}), you may run
@command{make -C $DAIKONDIR/java avoid-jpp}, in
which case changes to @code{.jpp} files will not be reflected in the
@file{.java} files or the compiled @file{.class} files.  (The purpose
of the @file{.jpp} files is to avoid code duplication by placing
common code in a single file, then generating other files that need to
include that common code.)

@item
@c needs version 4.7 or newer; version 4.7 was released in April 2004
@command{makeinfo}.
This is needed to make the documentation (via @command{make -C $DAIKONDIR/doc}).

@item
@command{dos2unix} or @command{perl}.
This is used to ensure that the output is the same under Unix and Windows.

@end itemize


@node    Requirements for compiling Kvasir
@subsection Requirements for compiling Kvasir

In order to compile Kvasir, the Daikon front end for the C language
(@pxref{Kvasir,,,daikon,Daikon User Manual}), you will need some
additional tools beyond those required to compile Daikon.

Note that Kvasir does not work on MacOS.

To install these dependencies under Ubuntu, run the command:
@exampleindent 2
@example
sudo apt-get install m4 automake autoconf binutils-dev
@end example
@exampleindent 5
Similar commands would be used with other package management systems
on other operating systems.


@node    Version control repository
@section Version control repository

@cindex repository
@cindex Git repository
@cindex version control repository

The Daikon git repository is located on GitHub;
see @uref{https://github.com/@/codespecs/daikon/}.

After making a local clone, @pxref{Compiling Daikon}, for instructions
on how to compile Daikon.


@node    Using Eclipse
@section Using Eclipse

@cindex Eclipse

[To be improved.]

Here is one way to use Eclipse to edit Daikon.

First, make sure that Daikon builds cleanly from the command line.

File > Import > General > Existing Projects into Workspace

Choose the @file{java} directory of your Daikon checkout

Project > properties > Java build path:
 libraries : @command{add external jars} everything in the lib/ directory,
   plus also the @file{tools.jar} file in the @file{lib/} directory of your JDK.
   (I'm not sure why, but @command{add jars} doesn't show all @file{.jar}
   files in the directory.)

 Source:
   add @file{Daikon}, remove @file{Daikon/src}.
   Default output folder: change from @file{Daikon/bin} to @file{Daikon}.


@node    New invariants
@section New invariants

@cindex new invariants
@cindex adding new invariants

You can easily write your own invariants and have Daikon check them,
in addition to all the other invariants that are already part of Daikon.
Adding a new invariant to Daikon requires writing one Java class, as
explained below.
@itemize
@item
If you are willing to edit the Daikon source code, you should define the
invariant in Daikon's source code, in a subdirectory of
@file{java/daikon/inv/}.  Then, edit method @code{Daikon.setup_proto_invs} to
call the new invariant's @code{get_proto} method and recompile Daikon.
@item
If you do not wish to edit the Daikon source code, then compile the new
invariant and put its @file{.class} file on your classpath.  Then, invoke
Daikon with the @option{--user_defined_invariant} command-line argument
(@pxref{Options to control invariant detection,,,daikon,Daikon User Manual}).
@end itemize


The file @file{java/daikon/inv/unary/scalar/Positive.java} in the
Daikon distribution contains a sample invariant.  This invariant is
true if the variable is always positive (greater than zero).  This
invariant is subsumed by other invariants in the system; it is provided
only as a pedagogical example.  To enable the invariant, comment out the
appropriate line in @code{Daikon.setup_proto_invs()}, then recompile
Daikon.


A Java class defining an invariant is a concrete subclass of one of the direct subclasses of
@uref{http://plse.cs.washington.edu/daikon/download/api/daikon/inv/unary/UnaryInvariant.html,@code{UnaryInvariant}},
@uref{http://plse.cs.washington.edu/daikon/download/api/daikon/inv/binary/BinaryInvariant.html,@code{BinaryInvariant}},
or
@uref{http://plse.cs.washington.edu/daikon/download/api/daikon/inv/ternary/TernaryInvariant.html,@code{TernaryInvariant}}.
A complete list of invariants appears in the body of
@code{Daikon.setup_proto_invs()}.

Daikon's invariants are first instantiated, then are presented samples
(tuples of values for all the variables of interest to the invariant;
this might be a 1-tuple, a 2-tuple, or a 3-tuple) in turn.  If any
sample falsifies the invariant, the invariant destroys itself.  All
remaining invariants at the end of the program run can be queried for
their statistical confidence, then reported as
likely to be true.

You need to implement the abstract methods of @uref{http://plse.cs.washington.edu/daikon/download/api/daikon/inv/Invariant.html,@code{Invariant}} that are not defined in one of the subclasses listed above.  You also need to define a constructor and a static method:

@table @code
@item protected @var{InvName}(PptSlice ppt)
Constructor for class @var{InvName}.  Should only be called from
@code{instantiate_dyn}.
Its typical implementation is
@example
super(ppt);
@end example

@item public static @var{InvName} get_proto()
Returns the prototype invariant used to create other invariants.  Its
typical implementation is
@example
if (proto == null)
  proto = new InvName(null);
return (proto);
@end example

@end table

Methods that need to be overridden that are defined in a subclass of
@samp{Invariant} include:

@table @code
@item public InvariantStatus check_modified(..., int count)
@itemx public InvariantStatus add_modified(..., int count)
Determines whether the invariant is true for a sample (a tuple of values).
@end table


You will eventually want to override one or more of these methods (@pxref{New formatting for invariants}):

@table @code
@item public String format()
@itemx public String repr()
@itemx public String format_using(OutputFormat format)
Returns a high-level printed representation of the
invariant, for user output.
@end table


@node    New derived variables
@section New derived variables

@cindex derived variable
@cindex variable, derived
@cindex adding new derived variables
@cindex new derived variables

A derived variable is an expression that does not appear in the source
code as a variable, but that Daikon treats as a variable for purposes
of invariant detection.  For instance, if there exists an array
@samp{a} and an integer @samp{i}, then Daikon introduces the derived
variable @samp{a[i]}.  This permits detection of invariants over this
quantity.

(Describing how to create new variety of derived variable is still to
be written.  For now, see the derived variables that appear in the Java
files in directory @file{$DAIKONDIR/java/daikon/derive/}.)


@node    New formatting for invariants
@section New formatting for invariants

@cindex output format, defining new
@cindex new output formats
@cindex adding new output formats

Daikon can print invariants in multiple formats
(@pxref{Invariant syntax,,,daikon,Daikon User Manual}).

To support a new output format, you need to do two things:
@itemize @bullet
@item
In @code{daikon.inv.Invariant.OutputFormat}, add a new static final
field and also update the @code{get} method.
@c (if it returns null, then you
@c will get an @samp{Unknown output format} error message when you run Daikon.)
@item
In every subclass of @code{Invariant}, edit the
@code{format_using} method to handle the new @code{OutputFormat}.
@end itemize

@node    New front ends
@section New front ends

@cindex front end, writing

A front end for Daikon converts data into a form Daikon can process,
producing files in Daikon's input format --- data trace declarations and
records.  For more information about these files, see @ref{File formats}.

The data traces can be obtained from any source.  For instance, front
ends have been built for stock data, weather forecasts, truck weight
data, and spreadsheet data (@pxref{convertcsv.pl,,,daikon,Daikon User
Manual}), among others.  More often, users apply a programming language
front end (also called an @dfn{instrumenter}) to a program, causing
executions of the program to write files in Daikon's format.  (For
information about existing front ends, see
@ref{Front ends (instrumentation),,,daikon,Daikon User Manual}.)
When a general front end is not
available, it is possible to manually instrument a specific program so
that it writes files in Daikon's format.  The resulting instrumented
program is very similar to what an instrumenter would have created, so
this section is relevant to both approaches.

Conceptually, instrumentation is very simple.  For each program point
(say, a line of code or the entry or exit from a procedure) at which you
wish to detect invariants, the front end must arrange to create a
declaration (@pxref{Declarations}) that
lists the variables in scope at that program point, and the front end
must arrange that execution creates a data trace record (@pxref{Data
trace records}) for each execution of that
program point.  Conventionally, the way to create a data trace record is
to insert a @code{printf} (or similar) statement that outputs the current
values of the variables of interest.


@menu
* Example instrumented Java program::
* Instrumenting C programs::
@end menu

@node       Example instrumented Java program
@subsection Example instrumented Java program

@cindex instrumenting Java programs
@cindex Java programs, instrumenting

This section gives an example of how an instrumenter for Java might
work; other languages are analogous.
Suppose we wish to instrument file @file{Example.java}.
@example
class Example @{
  // Return either the square of x or the square of (x+1).
  int squar(int x, boolean b) @{
    if (b)
      x++;
    return x*x;
  @}
@}
@end example

@noindent
The @file{.decls} file might look like the following.
@example
DECLARE
Example.squar:::ENTER
x
int
int
1
b
boolean
int
2

DECLARE
Example.squar:::EXIT
x
int
int
1
b
boolean
int
2
return
int
int
1
@end example

@noindent
The instrumented @file{.java} file might look like the following.
This example does not compute the ``modified bits'', but simply sets
them all to 1, which is a safe default.
@example
class Example @{
  static @{
    daikon.Runtime.setDtraceMaybe("daikon-output/StackAr.dtrace");
  @}

  // Return either the square of x or the square of (x+1).
  int squar(int x, boolean b) @{
    synchronized (daikon.Runtime.dtrace) @{
      daikon.Runtime.dtrace.println();
      daikon.Runtime.dtrace.println("Example.squar:::ENTER");
      daikon.Runtime.dtrace.println("x");
      daikon.Runtime.dtrace.println(x);
      daikon.Runtime.dtrace.println(1);  // modified bit
      daikon.Runtime.dtrace.println("b");
      daikon.Runtime.dtrace.println(b ? 1 : 0);
      daikon.Runtime.dtrace.println(1);  // modified bit
    @}

    if (b)
      x++;

    int daikon_return_value = x*x;
    synchronized (daikon.Runtime.dtrace) @{
      daikon.Runtime.dtrace.println();
      daikon.Runtime.dtrace.println("Example.squar:::EXIT");
      daikon.Runtime.dtrace.println("x");
      daikon.Runtime.dtrace.println(x);
      daikon.Runtime.dtrace.println(1);  // modified bit
      daikon.Runtime.dtrace.println("b");
      daikon.Runtime.dtrace.println(b ? 1 : 0);
      daikon.Runtime.dtrace.println(1);  // modified bit
      daikon.Runtime.dtrace.println("return");
      daikon.Runtime.dtrace.println(daikon_return_value);
      daikon.Runtime.dtrace.println(1);  // modified bit
    @}

    return daikon_return_value;
  @}
@}
@end example

@noindent
(Daikon's Java front end, Chicory, does not actually insert
instrumentation into the Java source code of your program.  Rather, it
instruments the bytecode as it is loaded into the JVM.  This is more
efficient, and it avoids making any changes to your @file{.java} or
@file{.class} files.  We have shown an example of  Java source code
instrumentation because that is  simpler to explain and understand than
the bytecode instrumentation.)


@node       Instrumenting C programs
@subsection Instrumenting C programs

@cindex C programs, instrumenting
@cindex instrumenting C programs

Daikon comes with a front end for the C language:  Kvasir
(@pxref{Kvasir,,,daikon,Daikon User Manual}).
Kvasir only works under the Linux operating system, and it works only on
``x86'' (Intel 386, 486, 586, 686 compatible) and ``x86-64'' (AMD64, EM64T
compatible) processors.

You may wish to infer invariants over C programs running on other
platforms; for instance, you want a robust C front end that works under
Microsoft Windows.  This section will help you to either write such a
front end or to hand-instrument your program to produce output that
Daikon can process.

We welcome additions and corrections to this part of the manual.  And,
if you write a C instrumenter that might be of use to others, please
contribute it back to the Daikon project.


A front end for C (or any other language) performs two tasks.  It
determines the names of all variables that are in scope at a particular
program point, and it prints the values of those variables each time the
program point executes.

Determining the names of the variables is straightforward.  It requires
either parsing source code or parsing a compiled executable.  In the
latter case, the variables can be determined from debugging information
that the compiler places in the executable.

The challenge for C programs is determining the values of variables at
execution time:  for each variable, the front end must determine whether
the variable's value is valid, and how big the value is.

@cindex valid values
@cindex invalid values
@cindex nonsensical values
@cindex uninitialized variables
@cindex deallocated pointers

A front end should print only variables that have @emph{valid} values.
Examples of invalid values are variables that have not yet been
initialized and pointers whose content has been deallocated.  (A pointer
dereference, such as @samp{*p} or @samp{p->field}, can itself
be to uninitialized and/or deallocated memory.)  Invalid values should
be printed as @samp{nonsensical} (@pxref{Data trace records}).

It is desirable to print @samp{nonsensical} rather than an invalid value,
for two reasons.  First, outputting nonsense values can degrade
invariant detection; patterns in the valid data may be masked by noise
from invalid values.  Second, an attempt to access an invalid value can
cause the instrumented program to crash!  For instance, suppose that
pointer @samp{p} is not yet initialized --- the pointer value refers to
some arbitrary location in memory, possibly even an address that the
operating system has not allocated to the program.  An attempt to print
the value of @samp{*p} or @samp{p->field} will result in a segmentation
fault when @samp{*p} is accessed.  (If you choose never to dereference a
pointer while performing instrumentation, then you do not need to worry
about invalid references.  However, you will be unable to output any
fields of a pointer to a struct or class, making your front end less
useful.  You will still be able to output fields of a regular variable
to a struct or class, but most interesting uses of structs and classes
in C and C++ are through pointers.)

C relies on the programmer to remember which variables are valid, and
the programmer must take care never to access invalid variables.
Unfortunately, there is no simple automatic way to determine variable
validity for an arbitrary C program.  (Languages with automatic memory
management, such as Java, do not pose these problems.  All variables
always have an initial value, so there is no danger of printing
uninitialized memory, though the initial value may not be particularly
meaningful.  Because pointed-to memory is never deallocated, all
non-null pointers are always valid, so there is no danger of a
segmentation fault.)

An instrumenter needs information about validity of variable values.
This could be obtained from the programmer (which requires work on the
part of the user of Daikon), or obtained automatically by creating a new
run-time system that tracks the information (which requires a more
sophisticated front end).

In addition to determining which variables are uninitialized and which
pointers are to allocated memory, there are additional problems for a C
front end.  For example, given a char pointer @samp{*c}, does it point
to a single character, or to an array of characters?  If it points to an
array of characters, how big is that array?  And for each element of the
array, is that element initialized or not?


The problem of tracking C memory may seem daunting, but it is not
insurmountable.  There exist many tools for detecting or debugging
memory errors in C, and they need to perform exactly the same memory
tracking as a Daikon front end must perform.  Therefore, a Daikon front
end can use the same well-known techniques, and possibly can even be
built on top of such a tool.  For instance, one C front end, named Kvasir, is
built on top of the Valgrind tool (@uref{http://valgrind.org/}),
greatly reducing the implementation effort.  Valgrind only works under
Linux, but a C front end for another platform could build on a similar
tool; many other such tools exist.


There are two basic approaches to instrumenting a C program (or a
program in any other language):  instrument the source code, or
instrument a compiled binary representation of the program.  In each
case, additional code that tracks all memory allocations, deallocations,
writes, and reads must be executed at run time.  Which approach is most
appropriate for you depends on what tools you use when building your C
instrumentation system.


In some cases, it may not be necessary to build a fully general C
instrumentation system.  You may be able to craft a smaller, simpler
extension to an existing program --- enabling that program (only) to
produce files for Daikon to analyze.

For instance, many programs use specialized memory allocation routines
(customized versions of @code{malloc} and @code{free}), in order to
prevent or detect memory errors.  The information that such libraries
collect is often sufficient to determine which variable values should be
printed, and which should be suppressed in favor of printing
@samp{nonsensical} instead.


The presence of memory errors --- even in a program that @emph{appears}
to run correctly --- makes it much harder to create Daikon's output.
Therefore, as a prerequisite to instrumenting a C program, it is usually
a good idea to run a memory checker on that program and to eliminate any
memory errors.

@c If you don't have a memory checking tool, then what platform are you
@c on??  But you could use some Very simple half-measures such s zeroing
@c out memory when it is allocated (always use @code{calloc}, never
@c @code{malloc}) and when it is deallocated, and zeroing out pointers
@c when they are freed.

@c An example of compiler (not runtime) infrastructure for Microsoft
@c Windows is Phoenix:
@c http://research.microsoft.com/phoenix/


@node    New suppressors
@section New suppressors

@cindex adding new suppressors
@cindex new suppressors
@cindex suppressors, adding new

As mentioned in @ref{Daikon internals}, one way to make Daikon more
efficient, and to reduce clutter in the output to the user, is to reduce the
number of redundant invariants of various kinds.  This section describes
how to add a new suppressor relation, such that if invariant A implies
B, B is not instantiated or checked as long as A holds, saving time and
space.  Suppression implications use some terminology.  A
@dfn{suppressor} (defined in the class @code{NISuppressor}) is one of a
set of invariants (@code{NISuppression}) that imply and suppress a
@dfn{suppressee} invariant (@code{NISuppressee}).  The set of all of
the suppressions that suppress a particular @var{suppressee} is stored in the
class @code{NISuppressionSet}.

Adding a new suppression is straightforward when the invariants involved
do not have any state.  Define the @var{suppressee} and
each of the suppressions that suppress it using the corresponding
constructors.  Add the method @code{get_ni_suppressions} to the class
of the invariant being suppressed and return the appropriate
suppression set.  Make sure that @code{get_ni_suppressions} always
returns the same suppression set (i.e., that storage to store
the suppressions is only allocated once).  Normally this is
done by defining a static variable to hold the suppression sets
and initializing this variable the first time that @code{get_ni_suppressions}
is called.

The following example defines suppressions for @samp{x == y} implies
@samp{x >= y} and @samp{x > y} implies @samp{x >= y}.

@exampleindent 1
@example
private static NISuppressionSet suppressions = null;

public NISuppressionSet get_ni_suppressions() @{
  if (suppressions == null) @{
    NISuppressee = new NISuppressee(IntGreaterEqual);

    NISuppressor v1_eq_v2 = new NISuppressor(0, 1, IntEqual.class);
    NISuppressor v1_lt_v2 = new NISuppressor(0, 1, IntLessThan.class);

    suppressions = new NISuppressionSet(new NISuppression[] @{
      new NISuppression(v1_eq_v2, suppressee),
      new NISuppression(v1_lt_v2, suppressee),
    @});
  @}
  return suppressions;
@}
@end example
@c restore default:
@exampleindent 5

For suppressions depending on the state of a particular invariant,
each @code{Invariant} has an @code{isObviousDynamically(VarInfo[] vis)}
method that is called once the state of other invariants has already
been determined.  This method returns a non-null value if this invariant
is implied by a fact that can be derived from the given @code{VarInfo}s.

For example, suppose division was not defined for divisors smaller than 1.
The following example defines an obvious check for @samp{x <= c}
(where c < 1 is a constant) implies @samp{y % x == 0}, written in the
Divides class.

@exampleindent 1
@example
public DiscardInfo isObviousDynamically(VarInfo[] vis) @{
  DiscardInfo di = super.isObviousDynamically(vis);
  if(di != null) @{
    return di;
  @}

  VarInfo var1 = vis[0];

  PptSlice1 ppt_over1 = ppt.parent.findSlice(var1);

  if(ppt_over1 == null) @{
    return null;
  @}

  for(Invariant inv : ppt_over1.invs) @{
    if(inv instanceof UpperBound) @{
      if(((UpperBound) inv).max() < 1) @{
        return new DiscardInfo(this, DiscardCode.obvious,
                       @samp{Divides is obvious when divisor less than one});
      @}
    @}
  @}

  return null;
@}
@end example
@c restore default:
@exampleindent 5


@node    Reading dtrace files
@section Reading dtrace files

If you wish to write a program that manipulates a @file{.dtrace} file, you
can use Daikon's built-in mechanisms for parsing @file{.dtrace} files.
(This is easier and less error-prone than writing your own parser.)

You will write a subclass of @code{FileIO.Processor}, then pass an instance
of that class to @code{FileIO.read_data_trace_files}.  Daikon will parse
each record in the trace files that you indicate, then will pass the parsed
version to methods in your processor.

For a simple example of how to use @code{FileIO.Processor}, see the file
@*
@file{daikon/java/daikon/tools/ReadTrace.java}.


@node    System.exit
@section System.exit

The Daikon codebase does not call @code{System.exit()}, except in a
dummy main method that catches @code{TerminationMessage}, which is the
standard way that a component of Daikon requests the JVM to shut down.

The reason for this is that calling @code{System.exit()} is usually a
bad idea.  It makes the class unusable as a subroutine, because it
might kill the calling program.  It can cause deadlock.  And it can
leave data in an inconsistent state (for example, if the program was in
the middle of writing a file, still held non-Java locks, etc.), because
the program has no good way of completing any actions that it was in the
middle of.  Therefore, it is better to throw an exception and let the
program handle it appropriately.  (This is true of instrumentation code
as well.)

To see the stack trace for a @code{TerminationMessage}, pass
@option{--config_option daikon.Debug.show_stack_trace=true} on the command
line.


@node    Debugging Daikon
@chapter Debugging Daikon

@cindex debugging Daikon

@menu
* Track logging::
@end menu

@ref{Daikon debugging options,,,daikon,Daikon User Manual} describes
several command-line options that enable logging, which can be a useful
alternative to using a debugger when debugging Daikon.  Because Daikon
processes large amounts of data, using a debugger can be difficult.

This chapter describes some of the command-line options in more detail.


@node    Track logging
@section Track logging

@cindex logging
@cindex track logging

Often it is desirable to print information only about one or more specific
invariants.  This is distinct from general logging because it concentrates
on specific invariant objects rather than a particular class or portion of Daikon.
This is referred to as @dfn{Track} logging because it tracks particular
values across Daikon.

@c makeinfo doesn't like ',' in @var so we can't do this:
@c @var{class|class|...<var,var,var>@@ppt}
The @option{--track @var{class|class|...<var}@i{,}@var{var}@i{,}@var{var>@@ppt}}
option to Daikon
(@pxref{Daikon debugging options,,,daikon,Daikon User Manual})
enables track logging.
The argument to the @option{--track} option supplies three pieces of information:

@enumerate

@item The class name of the invariant (e.g., @code{IntEqual}).
Multiple class arguments can be specified separated by pipe symbols
(@samp{|}).

@item The variables that are used in the invariant (e.g., @code{return},
@code{size(this.s[])}).  The variables are specified in angle brackets
(@samp{<>}).

@item The program point of interest (e.g.,
@code{DataStructures.StackAr.findMin()V:::ENTER}).
The program point is preceded by an at sign (@samp{@@}).

@end enumerate

Each item is optional.
For example:
@example
IntEqual<x,y>@@makeEmpty()
LessThan|GreaterThan<return,orig(y)>@@EXIT99
@end example

Multiple @option{--track} switches can be specified. The class, program point,
and each of the variables must match one of the specifications in order
for information concerning the invariant to be printed.

Matching is a simple substring comparison.  The specified item must be
a substring of the actual item.  For instance, @code{LessThan} matches
both @code{IntLessThan} and @code{FloatLessThan}.

Program points and variables are specified exactly as they are seen in
normal Daikon invariant output.  Specifically, @code{Ppt.name} and
@code{VarInfo.name.name()} are used to generate the names for comparisons.

Invariants are not the only classes that can be tracked.  Any class name
is a valid entry.  Thus, for example, to print information about derived
sequence variables from sequence @code{this.theArray[]} and scalar
@code{x} at program point @code{DisjSets.find(int):::EXIT}, the tracking
argument would be:

@exampleindent 1
@smallexample
SequenceScalarSubscriptFactory<x,this.theArray[]>@@DisjSets.find(int):::EXIT