Multilingual Syntax Editing for Software Specifications

Hans-Joachim Daniels
30 August 2005
Diploma thesis (Diplomarbeit)
Universität Karlsruhe (TH)
Fakultät für Informatik
Institut für Theoretische Informatik
Responsible supervisor: Prof. Dr. Peter H. Schmitt
Supervisors: Kristofer Johannisson, Aarne Ranta
Acknowledgements
My first thanks go to Aarne Ranta, who provided the impetus for doing this work abroad and gave me helpful support there. Special thanks also go to Kristofer Johannisson, who continued to supervise me very well even after his employment at Chalmers had ended. Many thanks also to Peter Schmitt, who made it possible for me to write this diploma thesis in Göteborg.
Janna Khegai's answers saved me a lot of time at the beginning, when I was trying to understand the editor's source code. Thank you very much.
I thank my mother for uncovering many linguistic errors in this thesis. And in general, without the support of my parents I could not have pursued my studies and my stays abroad. Many, many thanks for that.
Declaration
I hereby declare that I have written this thesis independently and have used no sources or aids other than those indicated.
Göteborg, 30 August 2005
Hans-Joachim Daniels
Contents
I. German Summary
1. Introduction
1.1. The Grammatical Framework
1.2. KeY
1.3. "Formal and Informal Software Specification"
1.4. Goal
1.5. Target audience and possible areas of application
2. Usability Problems in the Old Version of the Editor
2.1. Which function does what?
2.2. Not seeing the forest for the trees
2.3. What am I supposed to do now?
2.4. Everything is red, what did I do wrong?
2.5. So many comparison operators?
2.6. Where does what begin?
2.7. But I wanted to fill in the next question mark ...
2.8. I want to enter 42.
2.9. But my Integer is a Real, isn't it?
2.10. Where have my boolean properties gone?
2.11. And why does the tree always collapse?
2.12. What are self and result doing there?
3. System Architecture
3.1. Overview
3.2. Integration problems
4. Usability Improvements
4.1. The next refinement
4.2. Support for HTML
4.3. Type coercions
4.4. Minor improvements
4.4.1. self and result
4.4.2. Easier access to properties of self
4.4.3. Getting away from hidden parameters
4.4.4. Changed selection colour
4.4.5. Parse window via middle click
4.4.6. Confirmation prompt on quitting, asking whether to save
5. Conclusion
5.1. Only an editor specifically for OCL?
5.2. Outlook
II. English Write-up
6. Introduction
6.1. The Grammatical Framework
6.1.1. Overview
6.1.2. Example
6.1.3. The Java Editor for GF
6.2. KeY
6.3. 'Formal and Informal Software Specification'
6.4. Goals
6.5. Potential use and users
7. Usability Problems of the Previous Editor
7.1. Which function does what?
7.2. Not seeing the forest because of so many trees
7.3. What am I to do?
7.4. Where does what begin?
7.5. But my Integer is a Real, isn't it?
7.6. I want to enter 42.
7.7. Where is my boolean property?
7.8. Everything is red, what did I do wrong?
7.9. So many comparison operators?
7.10. I wanted to fill in the next question mark ...
7.11. I could choose self, but now nothing is offered.
7.12. Why does the tree always collapse my nodes?
8. System Architecture
8.1. Overview
8.2. State of the affairs at the beginning of this work
8.3. Support for the new grammars
8.4. Basic work-flow
9. Usability Improvements
9.1. The next refinement ...
9.1.1. Refinement descriptions
9.1.2. Grouping of refinement menu entries
9.1.3. Parameter descriptions
9.1.4. Printnames as the solution
9.2. HTML support
9.3. Coercions
9.3.1. Where to insert?
9.3.2. How to tag?
9.3.3. Refining
9.3.4. Deleting
9.3.5. Wrapping
9.3.6. Changes afterwards
9.3.7. Collection subtyping
9.3.8. Summing up coercions
9.4. Minor Improvements
9.4.1. Suppression of self and result
9.4.2. Easier access to properties of self
9.4.3. Comparison operators
9.4.4. Escaping hidden arguments
9.4.5. Changed selection colour
9.5. Generic Improvements
9.5.1. Middle-click parsing
9.5.2. Ask to save before quitting
9.6. Unimplemented ideas
9.6.1. On AtomSent
9.6.2. The collapsing tree
10. Implementation
10.1. Refactoring the old version of the editor
10.1.1. Documentation
10.1.2. Names
10.1.3. Data flow
10.1.4. Static attributes
10.1.5. Division of labour
10.1.6. Character counting for click-in functionality
10.2. GF's XML
10.2.1. <hmsg>
10.2.2. <linearizations>
10.2.3. <tree>
10.2.4. <message>
10.2.5. <menu>
10.3. Overview over the classes
10.4. Receiving the state
10.4.1. XML processing
10.4.2. Probing
10.4.3. Undo handling
10.5. Sending commands
10.6. Integration into TogetherCC
11. Conclusions
11.1. User perspective
11.2. Development
11.3. Contributions
11.4. Only a specialised OCL editor?
11.5. In the view of EN ISO 9241-10
11.6. Future work
11.7. Related work
A. Format of the Printnames
A.1. Introduction
A.2. How to use the enhanced tooltips
A.2.1. Tooltips
A.2.2. Grouping
A.2.3. Parameter descriptions
List of Figures
6.1. The AST and linearisation of orS in the graphical editor for GF
6.2. The parametrised equality operator eq
6.3. The main window of the Java GUI for GF
6.4. State after a refinement
6.5. Selection of an already refined node
6.6. Tree and linearisation after sending the delete command d
6.7. Tree and linearisation after sending the change command ch Thick
6.8. Tree and linearisation after sending the wrap command w Dirty
6.9. Tree and linearisation after sending the peel head command ph
6.10. Context menu integration into TogetherCC
7.1. Screen of the editor directly after starting
7.2. The hidden argument of eq is selected. But what does it do?
7.3. A non-trivial OCL constraint of the PayCard example
7.4. A transitive subtype witness
7.5. An AST where a boolean model element is accessed
7.6. The user refined with self for a wrong type and is now stuck.
8.1. System architecture
8.2. An example OCL file as expected by the OCL parser
8.3. An example OCL skeleton for the method charge as a GF AST
8.4. The editor after choosing Edit Pre/Postcondition [GF] in the context menu of TogetherCC
9.1. The situation from 7.1, but this time with English as the menu language
9.2. The OCL collection operation forAll, explained with a tooltip
9.3. First argument of implies selected
9.4. First argument of propCall selected
9.5. First argument of propCall unrefined, all properties of all classes listed
9.6. First argument of propCall refined, only the properties of that class are listed
9.7. GF refines some type arguments automatically
9.8. Figure 9.4 revisited
9.9. A picture in the HTML display
9.10. Side by side comparison of no formatting and formatting
9.11. Different expectations for coerce
9.12. Before and after an automatic coerce application
9.13. Before and after a delete of a coerced Instance
9.14. Deleting of coerce above an unrefined node
9.15. Before and after wrapping with w wrapper 2
9.16. Wrapping without CoercedTo
9.17. Wrapping with needed coerce, but coerce hidden
9.18. The situation from figure 9.17, but with coerce shown
9.19. GF constraints make the coerce visible
9.20. The editor noticed that a subtyping witness is missing
9.21. Direct access to the suiting properties of self, here for Integer
9.22. Middle-click parsing with the last linearization language
9.23. Middle-click parsing with the clicked linearization language
9.24. Safety question for saving when exiting
10.1. Sample of the XML for a GF state
10.2. The editor after receiving the XML from figure 10.1
10.3. The main Java classes of the editor
10.4. The classes for the refinement menu entries
10.5. The behind-the-scenes probing classes
10.6. The classes that interface the editor with TogetherCC
A.1. The refinement menu of the OCL grammars
A.2. Grouping, tooltips and parameter descriptions
A.3. Parameter description as tooltip in the abstract syntax tree
A.4. Better text for submenus of the refinement menu
Part I.
Multilingual Syntax Editing for Software Specifications
(German Summary)
1. Introduction
This work is situated in the field of software specification. One can only claim that a program does what it should if it is clear what it should do. Especially in the area of formal methods, it is not enough for the requirements on a program to be clear to a single person; the requirements must be formalised. Dedicated specification languages are used for this purpose. One of them is the Object Constraint Language (OCL).
In practice, however, the language most often used to specify software is the natural language that the developers or requirements authors normally speak. Formal languages are far less widespread. The aim of the work underlying this thesis is to bridge the gap between what is wanted (a formal and exact specification) and what is given (a description in natural language, usually ambiguous).
This chapter first describes the tools used in this work. It then explains what justifies the work and introduces its topic.
1.1. The Grammatical Framework
GF is a framework for writing and using grammars. It is not merely a grammar formalism, as for example the Backus-Naur form is, but also a practical implementation of that formalism. This implementation includes, among other things, generating and parsing texts and type checking abstract syntax trees.
GF is suitable both for formal languages and for (precisely described fragments of) natural languages. Its capabilities go beyond pure syntax handling; semantics can also be addressed, especially with the help of the dependent types of Martin-Löf's type theory.
The main feature of GF is the separation of the abstract syntax tree from its rendering in concrete languages. Because the structure of a language is described separately from its rendering, the rendering can be described differently for different languages. In addition, several concrete languages can share the same syntax tree at run time. This makes it possible to output a text whose abstract syntax is managed by GF in several languages, as can be seen for example in figure 6.5. What the abstract syntax and its linearisation rules can look like is shown in the example in section 6.1.2, which also discusses the notion of dependent types in more detail. The concrete languages used in this work are OCL, English and German.
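The separation of abstract syntax and concrete languages can be illustrated with a small sketch. It is written in Python rather than in GF's own grammar notation. The Tree type, the function names (andS is borrowed from the OCL grammars mentioned later; positive, isEmpty, balance and items are invented) and the three toy rule tables are assumptions made for illustration only and are not taken from the actual grammars.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tree:
    """A node of an abstract syntax tree: a function name and its argument trees."""
    fun: str
    args: tuple = ()


# One table of linearisation rules per concrete language (toy rules, assumed).
# "{0}", "{1}", ... mark where the linearised arguments are inserted.
CONCRETE = {
    "OCL": {
        "andS": "{0} and {1}",
        "positive": "{0} > 0",
        "isEmpty": "{0}->isEmpty()",
        "balance": "self.balance",
        "items": "self.items",
    },
    "English": {
        "andS": "{0} and {1}",
        "positive": "{0} is greater than zero",
        "isEmpty": "{0} is empty",
        "balance": "the balance",
        "items": "the item list",
    },
    "German": {
        "andS": "{0} und {1}",
        "positive": "{0} ist größer als null",
        "isEmpty": "{0} ist leer",
        "balance": "der Kontostand",
        "items": "die Artikelliste",
    },
}


def linearise(tree: Tree, language: str) -> str:
    """Render the shared abstract tree in the chosen concrete language."""
    rules = CONCRETE[language]
    parts = [linearise(arg, language) for arg in tree.args]
    return rules[tree.fun].format(*parts)


# A single abstract tree, shared by all three languages.
example = Tree("andS", (Tree("positive", (Tree("balance"),)),
                        Tree("isEmpty", (Tree("items"),))))

for language in CONCRETE:
    print(f"{language}: {linearise(example, language)}")
```

Because every edit would change only the one tree, all three renderings stay in step. This is the property that lets a user who reads only one of the languages produce text in the others.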
GF also comes with a graphical user interface written in Java. All commands given to the editor are executed on the abstract syntax tree, which in turn is displayed in all loaded concrete languages (these must match the abstract grammar). This makes it possible to write texts in languages one does not know oneself. Since all languages share one syntax tree, the text has the same content in each of them. It is therefore sufficient to know one of the concrete languages involved (provided, of course, that one trusts the authors of the concrete grammars to have done a good job).
The individual commands that the editor supports are listed and explained in section 6.1.3.
1.2. KeY
The implementation accompanying this thesis is part of the KeY system. KeY aims to integrate formal software specification and verification into the industrial software development process. The idea behind it is that verification tools outside a normal development environment will not find wide acceptance in practice. This also means supporting not an esoteric programming language but one that developers already use; hence JavaCard was chosen. In addition, KeY integrates into the commercial CASE tool Together Control Center by Borland.
As the specification language from which the proof obligations are later generated, KeY uses OCL, which is part of UML. This is necessary because UML alone is still relatively imprecise. OCL allows class invariants as well as pre- and postconditions for methods to be defined. A major problem, however, is that although UML itself is very widespread, OCL is hardly known. KeY therefore supports the user in creating constraints [1] in OCL. One approach used for this is described in [Bub02]. The different approach underlying this thesis is described in the following section.
[1] In this German summary of the English write-up, the UML technical terms are rendered using the German terms given at http://www.oose.de/uml_auf_deutsch.htm.
1.3. "Formal and Informal Software Specification"
The doctoral thesis "Formal and Informal Software Specification" ([Johar]) by Kristofer Johannisson forms the direct context of this diploma thesis. A brief overview of it is given below.
[HJR02] (a part of [Joh05]) describes a system that builds a bridge between the formal language OCL and the natural language English, using GF as the grammar formalism. With it, Kristofer Johannisson wrote a common abstract grammar for OCL and English, together with the corresponding concrete grammars. This made it possible to author OCL constraints with the ordinary graphical editor for GF without having to know OCL.
In [Dan03], a German grammar for OCL was then written with the help of the resource grammars (see section 6.1.1 for more details). The author later ported it to English. David Burke then improved it in his Master's thesis and introduced support for formatting ([BJ05], likewise part of [Joh05]). The author then ported these changes back to German.
Another part of Johannisson's doctoral thesis, [Joh04], describes a parser for OCL. Only this made it possible to read in OCL and translate it into English and German. Before that, only multilingual authoring was possible, because the general parser provided by GF has a number of limitations.
Besides integration problems, the main problem of the syntax editor was that it was very cumbersome to use.
1.4. Goal
Remedying this lack of usability is the topic of this diploma thesis. The basic idea is to create an editor with which a novice, who need not know anything about OCL, can create constraints in this very formal language.
To this end, the problems a user may encounter must first be identified. Solutions should then be devised and finally implemented. In short, the editor is to become usable.
1.5. Target audience and possible areas of application
The editor is not intended for OCL experts. The grammar-based approach, in which a syntax tree is refined top-down, is entirely different from writing OCL constraints in a text-based editor. Being able to choose only from type-correct refinements is a restriction, and the guidance that this same restriction provides does not necessarily make up for it.
Those who will benefit most are probably people inexperienced in OCL who nevertheless know the problem domain well and are therefore asked to write OCL constraints. Since the OCL is displayed as well, the editor can also be seen as a learning aid: users can learn what the various offered functions look like in OCL and later apply this knowledge independently of the editor. Teaching is therefore another possible area of application.
2. Usability Problems in the Old Version of the Editor
It should be noted that this work is not based on a real usability study or on experiments with potential users. The reason is that the author identified too many obstacles that would confront users with excessive problems. He is convinced that these should be removed first, before experiments can bring deeper problems to light.
2.1. Which function does what?
The old editor greeted the user with a huge list of partly incomprehensible function names, as shown in figure 7.1. Some of these names can be understood, but beyond simple comparisons a name alone does not explain what a function does; any is one such example. Since the purpose of the editor is to make OCL accessible to users inexperienced in OCL, it cannot be assumed that functions will be recognised by their names. Functions therefore need to be described.
2.2. Not seeing the forest for the trees
The list of possible refinements can grow to more than 50 entries. This is rather confusing, because the user has to pick out a desired function whose name he does not know but can at best recognise. It is especially so when many functions have nothing to do with the one being sought, for example comparison functions as opposed to functions that state something about the elements of OCL collections. This list should be shortened, and the user should be given help in making a selection.
2.3. What am I supposed to do now?
The tree structure or the rendering of the tree in English or German does not always provide clarity, as it does for the conjunction function andS, whose two sentence arguments can be expected to represent the two sentences joined by "and". Hidden type parameters, as described in the GF example in section 6.1.2, are a particular problem here. When the user (or GF on his behalf) has selected such a node, he is not told what he is actually supposed to choose.
2.4. Everything is red, what did I do wrong?
The first thing the user saw was text highlighted in the warning colour red, as in figure 7.1. Suggesting to the user right at the start that he has made an error, especially when the warning is unnecessary as described in section 7.8, is not particularly helpful.
2.5. So many comparison operators?
In OCL, the comparison operators for greater than and less than are defined only for Real; in the grammar, however, they are also available for Integer. They thus occur twice, even though Integer is a subtype of Real in OCL, which makes the comparison operators for Integer unnecessary.
There are as many as 10 equality operators: one for each predefined non-collection type of OCL, plus two different ones that are available for all types by means of a parameter. Why so many, when fewer, or even a single one, would do?
2.6. Where does what begin?
Long OCL constraints without formatting or indentation are hard to read, as figure 7.3 shows. This problem had already been solved in [BJ05] for the OCL grammars used. However, that solution could not be used in the editor, as section 10.1.6 explains.
2.7. But I wanted to fill in the next question mark ...
Once a question mark has been filled in in the editor, GF selects the next one. However, GF's definition of "next" is somewhat unexpected for the user, since GF readily leaves the current branch and with it the part the user was just concentrating on. This does not match the user's expectations.
2.8. I want to enter 42.
Entering one's own numbers when an instance of Integer is expected is not really self-explanatory. Functions for 0, 1 and 2 exist, but nothing beyond that. If the user does stumble upon the function int and selects it, he gets a longer list, but still no hint as to how to enter his own numbers. It is not apparent that, after selecting int, the Read button has to be pressed, followed by selecting Term.
The situation is similar for strings.
2.9. But my Integer is a Real, isn't it?
The type system of OCL, which is the object-oriented type system of UML, supports inheritance. An instance of a subtype can be used wherever an instance of the supertype is expected. The type system of GF, in contrast, is not built on inheritance, so this principle has to be emulated somehow. This is done with the help of dependent types, but has the drawback of being explicit. That is, the user has to state himself that a coercion to the supertype is to take place, whereas in UML/OCL this happens implicitly. This is cumbersome and error-prone, and it also conflicts with the user's expectations.
More on this topic can be found in section 7.5.
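The difference between implicit subtyping and an explicit coercion can be made concrete with a small sketch. It is written in Python, not GF. The names Instance, Coerced, DIRECT_SUBTYPE, coerce and less_than are hypothetical; they only mimic the idea that a value of a subtype must be wrapped explicitly, and that the wrapping is allowed only if evidence (a "witness") for the subtype relation exists.

```python
from dataclasses import dataclass

# Assumed table of subtype witnesses, in the spirit of OCL's Integer <: Real.
DIRECT_SUBTYPE = {("Integer", "Real")}


@dataclass(frozen=True)
class Instance:
    """A value together with its static type name."""
    type_name: str
    value: object


@dataclass(frozen=True)
class Coerced:
    """An instance that was explicitly converted to a supertype."""
    type_name: str   # the supertype the value now counts as
    inner: Instance  # the original instance of the subtype


def coerce(instance: Instance, target: str) -> Coerced:
    """Wrap an instance as its supertype; fails without a subtype witness."""
    if (instance.type_name, target) not in DIRECT_SUBTYPE:
        raise TypeError(f"no witness for {instance.type_name} <: {target}")
    return Coerced(target, instance)


def value_of(x):
    """Unwrap a plain or coerced instance to its underlying value."""
    return x.inner.value if isinstance(x, Coerced) else x.value


def less_than(a, b) -> bool:
    """A comparison that, as in OCL, is defined for Real arguments only."""
    for arg in (a, b):
        if arg.type_name != "Real":
            raise TypeError(f"expected Real, got {arg.type_name}")
    return value_of(a) < value_of(b)


three = Instance("Integer", 3)
limit = Instance("Real", 4.5)

# Without the explicit step the comparison is rejected ...
try:
    less_than(three, limit)
    rejected = False
except TypeError:
    rejected = True

# ... with it, the Integer is accepted where a Real is expected.
accepted = less_than(coerce(three, "Real"), limit)
```

In an object-oriented type system the call `less_than(three, limit)` would simply be accepted. The extra `coerce` step is what the user of the old editor had to perform by hand.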
2.10. Where have my boolean properties gone?
In the grammars for OCL, statements are normally represented in the GF category Sent. For boolean variables, however, there is also Instance BooleanC, which must be used, for example, as a parameter for operations from the UML model. Section 7.7 discusses this distinction in more detail.
The problem with queries and attributes of type Boolean from the UML model is that they appear neither as a property nor as Sent or Instance BooleanC. They are treated separately and have the type AtomSent, which can only be reached via two special functions. The reason for this is a nicer negated form, which shows its strength in the translation from OCL into natural language. In the editor, however, this is an obstacle, since the user is not shown in any way how to use boolean elements from the model.
2.11. And why does the tree always collapse?
Whenever the user selects another node in the abstract syntax tree (whether by clicking on the node in the graphical tree or by selecting another word in the linearisation), the tree is rebuilt completely. Among other things, the information about whether a node was previously expanded is lost. As a result, all nodes except the currently selected one are collapsed again after every change of node. This makes it hard for the user to find his way around the tree, since its appearance is no longer the same.
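Why a complete rebuild loses the expansion state, and how that state could in principle be carried over, can be sketched as follows. This is a Python illustration with hypothetical names (Node, rebuild, expanded_paths, restore_expansion); it does not reproduce the editor's Java code. Identifying nodes by their path from the root is merely one conceivable remedy, not something the text above describes as implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node of the displayed tree; 'expanded' is GUI state only."""
    label: str
    children: list = field(default_factory=list)
    expanded: bool = False


def rebuild(spec):
    """Build a fresh display tree from (label, [child specs]); all collapsed."""
    label, child_specs = spec
    return Node(label, [rebuild(child) for child in child_specs])


def expanded_paths(node, path=()):
    """Collect the paths (child indices from the root) of all expanded nodes."""
    found = {path} if node.expanded else set()
    for index, child in enumerate(node.children):
        found |= expanded_paths(child, path + (index,))
    return found


def restore_expansion(node, paths, path=()):
    """Re-apply a previously saved set of expanded paths to a new tree."""
    node.expanded = path in paths
    for index, child in enumerate(node.children):
        restore_expansion(child, paths, path + (index,))


spec = ("andS", [("eq", [("x", []), ("y", [])]), ("gt", [])])

old_tree = rebuild(spec)
old_tree.expanded = True               # the user opened the root ...
old_tree.children[0].expanded = True   # ... and its first child
saved = expanded_paths(old_tree)

new_tree = rebuild(spec)               # naive rebuild: everything collapsed
collapsed_after_rebuild = not new_tree.expanded

restore_expansion(new_tree, saved)     # carry the saved state over
```

Paths are only a stable key as long as the edit does not shift sibling positions. A real implementation would have to decide what to do when nodes are inserted or deleted above an expanded node.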
2.12. What are self and result doing there?
The GF functions for self and result are kept general and receive their type only through a class parameter. To prevent them from being used wrongly for unsuitable classes, bound variables of the appropriate type are introduced in the context declaration, and these are required by the GF functions for self and result. However, when building the refinement menu, GF looks only one step ahead and therefore does not know that, in most cases, not all parameters of self and result can actually be filled in. These two functions therefore always appear whenever an instance of a class is expected, even if they do not fit at all.
3. System Architecture
3.1. Overview
Besides the editor and TogetherCC, other components are involved in enabling the user to author OCL constraints. First, the grammars for the classes and properties of the UML model have to be generated. For this purpose, the classes and properties are passed in an interchange format to a Haskell program by Kristofer Johannisson, which performs the generation. If OCL constraints were already stored in the class file, they are passed together with the interchange format to a parser, likewise written by Kristofer Johannisson, which returns an abstract syntax tree on success. If a parse error occurred, an OCL skeleton is used instead; it specifies the OCL context, so that the user has, for example, method parameters available as bound variables.
Baum, generierte und die allgemeinen OCL-Grammatiken werden dann dem Editor und
GF, dass in einem zweiten Prozess läuft, übergeben. Der Benutzer kann nun mit dem
Editor OCL-Einschränkungen verfassen. Die Vorgehensweise dabei ist, wie im allgemeinen Fall auch, von oben nach unten.
Wenn das Programm beendet wird, kann die verfasste Einschränkung in der JavaKlassendatei gespeichert werden. Fertiges OCL wird direkt als OCL gespeichert. Hat
der Benutzer jedoch Fragezeichen offengelassen, so kann der verwendete Zerteiler das
entstandene OCL nicht einlesen. Daher wird in diesem Fall der abstrakte Syntaxbaum
von GF als Speicherformat gewählt.
3.2. Integration problems
When this work began, the version of the editor already integrated into TogetherCC could not handle the grammars created in [Dan03], because the grammar generator could not cope with the new GF module format. The generator was first brought into a usable state. Furthermore, the parser had not yet been integrated, so this was also done at the beginning of this work. It also had to become possible to generate an OCL skeleton that is compatible with the new GF version and with the generated grammars, which was likewise implemented as part of this work.
How the problems found in section 2 were solved (or could be solved) is sketched in the following. The detailed version can be found in section 9.
4.1. The next refinement
The refinement menu was a long, unstructured list with partly cryptic names. To remedy this, the functions were given more meaningful names as well as descriptions that are shown in a tooltip. Previously the tooltip showed the same cryptic function name, so it was of no help to the user. Now a different, descriptive text can be displayed there.
The lack of overview was solved by a division into submenus such as ‘Operations for collections’ or ‘Boolean operations’. For this, every function was tagged accordingly, so that the editor can sort it into the right submenu. Support for submenus in the editor was added in this work.
But even if the user can find functions more easily and can tell in advance what he can express with them, the editor previously did not show him what is currently expected of him. For some functions, such as an ‘or’, this is fairly easy to see, but especially for hidden type parameters the natural language representation does not help either, since for them it simply does not exist.
This problem was solved by giving every single parameter of a function its own description. Since the user is always filling in parameters of other functions, this parameter description can be displayed as a hint above the refinement menu, to help the user understand what the current node is for.
Technically this was implemented with the help of GF's so-called ‘printnames’. To GF these are just simple strings, each assigned to one function, that were displayed in the refinement menu in place of the function name. In the course of this work, however, they were given a structure that the editor can understand. A separator character separates the text shown in the refinement menu from the text for the tooltip. A label for the submenu to be used is also present.
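To illustrate the structured printnames, the following Python sketch splits one printname string into the menu text, the tooltip text and the submenu label. It is only a sketch under assumptions: the separator characters (`$` before the tooltip, `%` before the submenu label), the class `Printname` and the function `parse_printname` are hypothetical and chosen for this example; the characters actually understood by the editor are not stated in this summary.

```python
from dataclasses import dataclass

TOOLTIP_SEP = "$"   # hypothetical separator before the tooltip text
SUBMENU_SEP = "%"   # hypothetical separator before the submenu label


@dataclass
class Printname:
    menu_text: str   # text shown in the refinement menu
    tooltip: str     # longer description shown in the tooltip
    submenu: str     # label of the submenu the entry is sorted into


def parse_printname(raw: str) -> Printname:
    """Split 'menu text $ tooltip % submenu'; missing parts become empty strings."""
    rest, _, submenu = raw.partition(SUBMENU_SEP)
    menu_text, _, tooltip = rest.partition(TOOLTIP_SEP)
    return Printname(menu_text.strip(), tooltip.strip(), submenu.strip())


entry = parse_printname("or $ disjunction of two sentences % Boolean operations")
print(entry.submenu, "->", entry.menu_text)
```

A printname without any separator is still valid in this sketch and simply yields an entry without tooltip and submenu, which matches the idea that plain printnames remain ordinary strings to GF.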
4.2. Support for HTML
That the natural language representation can become very hard to read was already described in [BJ05] and solved there with a formatting system using, among others, HTML. However, this could not be used directly. The reason was that, to associate text snippets in the natural language representation with the corresponding nodes in the syntax tree, the editor uses the character position in the XML string that GF passes to the editor. A more detailed description can be found in section 10.1.6. In a markup language, the length of a text in the source does not directly determine the length of the same text in the output, which made displaying HTML impossible with this mechanism.
If, however, the length of the displayed text in the output rather than the position in the HTML source becomes the criterion for where a snippet starts and ends, this works. To this end, Java Swing's own HTML display component was used. This made it possible to use a subset of HTML (covering all HTML control elements used in David Burke's grammars) in the editor, which has drastically improved the readability of the output.
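The difference between source positions and displayed positions can be made concrete with a small Python sketch. It does not use the Swing component of the editor; as a stand-in it relies on Python's standard `html.parser` module to approximate the displayed text by the character data of each snippet. The snippet strings and the helper names `visible_length` and `snippet_spans` are assumptions made for this illustration.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects only the character data, i.e. roughly what a viewer displays."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def visible_length(html: str) -> int:
    """Length of the text that remains after tags and entities are resolved."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return len("".join(parser.parts))


def snippet_spans(snippets):
    """Return the (start, end) offsets of each snippet in the displayed text."""
    spans, position = [], 0
    for html in snippets:
        length = visible_length(html)
        spans.append((position, position + length))
        position += length
    return spans


# Hypothetical snippets, one per syntax tree node.
snippets = ["<b>context</b> ", "PayCard ", "<i>inv</i>: ", "a &lt; b"]
print(snippet_spans(snippets))
```

The first snippet is 15 characters long in the HTML source but occupies only 8 characters on screen, which is why offsets taken from the source string cannot be used to locate a click in the rendered text.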
4.3. Type coercions
As mentioned in section 2.9, the editor should relieve the user of explicitly converting the types of instances. The following was implemented for this:
Automatic insertion The GF function coerce, which is responsible for the conversion, is inserted automatically wherever this makes sense. This selection was made in advance and for now comprises all places where an instance of a class is expected. Parameters whose type depends on a hidden parameter are an exception. There the user or GF can choose the type of the subclass directly, so that no coercion is needed. If, on the other hand, a fixed type such as Instance CollectionC is expected, a coercion is always necessary, since otherwise no subtypes would fit.
This selection is likewise made through the ‘printnames’. An exclamation mark in the parameter description marks the corresponding parameter as a place where the editor automatically inserts a coerce. This happens without any action by the user. Normally the user does not get to see these additional functions either; they stay hidden. Only if the user tinkers with hidden type parameters can it happen that two functions that depend on each other have different types. GF recognises such situations and reports the error. The editor then shows all coercions to enable the user to fix the error.
4.4. Minor improvements
Deleting Functions below a type coercion must have the same type as a hidden type parameter of the coercion function itself. If, when deleting this lower function, the type argument of the coerce function were not reset as well, no other subtype could be selected any more. This type parameter is therefore reset too.
Wrapping For a function f to be inserted between two others, a and b, it has to fit into the place of b for a, that is, return the same type, and in addition it must be able to take b as a parameter. Inheritance has no place here, since always the same type is expected and GF has no notion of subtypes.
How the editor can nevertheless offer the user a refinement menu that also permits wrapping with functions that return only a subtype of what a actually expects, or that expect a supertype of b as a parameter, is described in section 9.3.5. This computationally expensive algorithm has, however, not been implemented (yet).
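The plain wrapping condition, without any subtype handling, can be written down in a few lines of Python. This is a minimal sketch under assumptions: types are modelled as plain strings, dependent types are ignored, and `Fun`, `wrap_positions` and the fun `intLit` are hypothetical names that do not come from the editor or the grammars.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Fun:
    name: str
    arg_types: Tuple[str, ...]
    return_type: str


def wrap_positions(f: Fun, expected_by_a: str, b: Fun) -> List[int]:
    """Argument positions of f that can hold b when f is put between a and b.

    Types are compared by exact equality, since GF has no notion of subtypes.
    """
    if f.return_type != expected_by_a:
        return []  # f does not fit into the place of b below a
    return [i for i, t in enumerate(f.arg_types) if t == b.return_type]


orS = Fun("orS", ("Sent", "Sent"), "Sent")
eq = Fun("eq", ("Class", "Instance", "Instance"), "Sent")
intLit = Fun("intLit", (), "Instance")  # hypothetical fun, for illustration only

print(wrap_positions(orS, "Sent", eq))      # eq can become either operand of orS
print(wrap_positions(orS, "Sent", intLit))  # an Instance cannot be an operand of orS
```

Supporting subtypes would mean replacing the two equality tests by subtype tests and inserting coercions where they succeed, which is the more expensive variant referred to in section 9.3.5.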
4.4.1. self and result
To get rid of the superfluous self and result from section 2.12 in the refinement menu, the editor refines with them in the background and checks whether the control parameter could be filled in. If so, self or result, respectively, is kept in the refinement menu; otherwise it is removed.
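The filtering of self and result can be pictured with the following Python sketch. It is only an illustration under assumptions: all function and parameter names are hypothetical, the class names `PayCardC` and the fun `intLiteral` are made up for the example, and the trial refinement that the editor delegates to GF is replaced here by a simple lookup of the classes of the bound variables in the context.

```python
def control_parameter_fillable(expected_class, bound_classes):
    """Stand-in for the background refinement performed via GF (assumption):
    the control parameter can be filled only if the context binds a variable
    of exactly the expected class."""
    return expected_class in bound_classes


def filter_refinement_menu(entries, expected_class, context):
    """Drop self/result from the menu when their control parameter stays open.

    entries: names offered by GF for the current node
    context: maps 'self'/'result' to the classes of the matching bound variables
    """
    kept = []
    for entry in entries:
        if entry in context and not control_parameter_fillable(
                expected_class, context[entry]):
            continue  # offered by GF, but could never be completed
        kept.append(entry)
    return kept


# Hypothetical context: self is a PayCard, the method returns an Integer.
context = {"self": {"PayCardC"}, "result": {"IntegerC"}}
print(filter_refinement_menu(["self", "result", "intLiteral"], "IntegerC", context))
```

The extra round trip per candidate is the price of this approach, since GF itself only looks one refinement step ahead.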
4.4.2. Easier access to properties of self
If the user wants to make statements about a class, he cannot avoid talking about its attributes. These, however, were several refinements away, which is rather cumbersome and was also noticed by people to whom the editor was shown.
The editor therefore now carries out these refinement steps in the background and offers them to the user combined into a single step in the refinement menu. This made this frequent operation easier to discover and considerably faster.
4.4.3. Getting away from hidden parameters
Whenever a node in the abstract syntax tree was selected that does not appear in the text representation, as is the case for hidden type parameters, for example, the text representation could no longer be used to select other nodes. Clicks into it were ignored.
This shortcoming has now been fixed. Selecting tree nodes now always works, regardless of whether the currently selected node has a visible representation in the text or not.
4.4.4. Changed selection colour
To indicate an error situation in which nodes that depend on each other do not have the same type, the corresponding parts of the text representation are marked red. Red is a signal colour for errors, so the user notices immediately that something is wrong. However, the editor used green as its selection colour. Especially in contrast with red, this expresses that something is to be marked as particularly ‘correct’. But that the relationships between nodes are not violated should not be a special situation; it should be the normal case. And the normal case should not draw additional attention from the user, just as the lamps on a car dashboard only light up when there may be a problem and are off otherwise.
Therefore the selection colour that is set as the default for Swing is now used. A selection is just that, and not a marking as ‘correct’.
4.4.5. Parse window via middle click
A middle click into the text representation opens a window in which one can enter the text for the refinement of the current node. GF then tries to parse it and to refine with the corresponding functions. Previously, however, for nodes that were already refined, the text was displayed in the language last sent by GF. If one does not know that language, one cannot make small changes to it. This contradicts the spirit of multilingual authoring, where one only needs to know one language to effect changes in all of them.
Therefore the language in whose text representation the click occurred is now selected. One can thus choose a language one knows and is no longer at the mercy of GF.
For OCL, however, this window cannot be used, because the general GF parser cannot parse OCL. This feature is intended more for other grammars that can also be used with the editor.
4.4.6. Confirmation dialogue on quitting, asking whether to save
Previously, when the user quit the editor, this was carried out immediately. Unsaved work was lost.
Practically every other program shows a confirmation dialogue, however. The user can therefore expect this from the editor as well, so such a dialogue was added.
5. Conclusion
Most of the major usability problems from section 7 have been eliminated in this work. Only now can the editor be let loose on ‘real’ users in a usability experiment, to see whether the grammar-based approach works for OCL at all. The price for no longer requiring knowledge of OCL is a completely different interaction concept: no more typing in the individual subformulae, but top-down refinement by selecting operators, without being able to choose the operands first, as one is used to from other editors, where one can simply write them down.
This contradicts [KU93], which recommends tailoring a grammar-based editor to experienced users rather than beginners, since a certain amount of training will be unavoidable. In this work, however, the user is spared the training in OCL, so that a net benefit may still result. Whether that is really the case remains to be investigated.
5.1. Just an editor specifically for OCL?
The original goal was to tailor the general GF editor to OCL. However, some of the problems found were of a more general nature and not OCL-specific. These were problems of information presentation, more precisely: what a particular GF function does, the confusing and overlong refinement menu, which role the current node plays in the syntax tree, and the display of the output without structure and formatting.
With a general solution to these problems, all that remained to be done for OCL was to write the corresponding explanatory texts and formatting rules, which was also done in this work. For HTML even this was no longer necessary, since David Burke had already implemented it in the grammars.
The input of numbers and strings was also made considerably easier. A simple confirmation dialogue on quitting and the language selection when calling the parser via middle click benefit the general editor. And the improvements made to the source code of the editor should not be forgotten either.
These extensions and improvements, which are also usable for general GF grammars, led to the editor being released together with GF 2.3.
5.2. Outlook
A number of things were not tackled in this work, although they would have fitted its topic and would be next steps:
Pseudo-elements of the UML model OCL has a mechanism for introducing shorthand notations for longer expressions. It is treated somewhat neglectfully in the grammars. One reason is that the semantics of these pseudo model elements has changed from OCL standard to OCL standard, and this is due again for OCL 2.0. In this editor, the initial obstacles to use were addressed first. These shorthands will presumably only be used by advanced users, so in the author's view it was justified to postpone simplifying them.
Drag & Drop If a node of the syntax tree is to be reused at or moved to a different place, it would be desirable to let the user do this via drag & drop. Dragging a word from the text representation onto a question mark could also trigger a move or copy.
Manual linearisation rules Especially for German, the representation of elements of the UML model in natural language is not very successful. This problem has existed since [Dan03], but it can only be solved by the interplay of user, TogetherCC and Kristofer Johannisson's grammar generator. What a class that mostly has an English name should be called in German is hard to find out automatically and would moreover be an entirely different topic. Enabling the user to add linearisation rules to classes and properties of the UML model by hand, which are then picked up by the grammar generator, would be a step towards more readable German.
Parsing A capability requested by people to whom the editor was shown is the parsing of subexpressions. This, however, is supported neither by the general GF parser nor by Kristofer Johannisson's OCL parser. The latter would have to be adapted for this.
Saving while editing The editor offers buttons for saving. These, however, save into separate files and not into the Java source code, so that the saved state is not automatically taken into account the next time the editor is loaded. Here the user should be given the option to save the current abstract syntax tree as a JavaDoc comment, so that it can be restored after an unintended exit of the program. Unfortunately, this is not supported at the moment.
Speed The editor can take between 40 and 75 seconds to start. That is unpleasantly long, especially if one wants to write constraints for several methods or classes and has to reload the editor each time. As long as the model itself has not changed, GF could be kept running in the background with the grammars loaded and just be given a new OCL context when needed. The grammars would then have to be read only once, which can save a lot of time.
The loading times can be reduced drastically (to about 15 seconds) by not loading the German grammars. The memory consumption of GF also drops to about 60 MB.
With an HTML component that supports appending HTML to an already loaded document, computing the indices of the text snippets from GF would no longer be quadratic as it is so far, where the HTML document keeps growing but has to be reloaded every time. The current approach causes the editor to lose some speed for longer constraints.
That the editor queries GF in the background or refines automatically to save the user work costs time. This is inherent in the design, however, since the editor was originally intended as a pure display component without any intelligence of its own.
Speed or memory optimisation was not addressed in this work, however, since the functionality was to be provided first. Should the syntax-directed approach really make creating OCL constraints easier, the speed of the editor can still be tackled.
Other specification languages Most of the improvements implemented in this work are not OCL-specific. Even those that cannot be used for general GF grammars, such as type coercions, are not really OCL-specific, but merely model a type system with subtypes, which also exists in other specification languages such as JML. Only minimal changes to the editor itself would therefore be needed if the corresponding GF types were named according to the same scheme.
A concrete grammar for JML for the abstract grammar for OCL would even work with the editor immediately. On the other hand, writing this grammar would be very laborious, since the elements of the two languages cannot be mapped onto each other one to one.
Part II.
Multilingual Syntax Editing for
Software Specifications
(English Report)
6. Introduction
The area of this work is software specification. Saying that a given program is correct only makes sense if it is known what the program ought to do. Especially in the context of formal methods, it is not enough for a human to know what the program should do; this knowledge has to be formalised. For that, specific formalisation languages are used, one of which is OCL¹. But the language mostly used for specifying in practice is natural language, while the use of formal languages is not widespread. The work underlying this thesis intends to bridge the gap between the natural language used in reality and a formal language (in this case OCL), which is needed for formal methods.
This section first describes the tools that this work builds on. After that, it motivates why this work is necessary and explains the topic.
6.1. The Grammatical Framework
6.1.1. Overview
GF² is a framework for writing and using grammars. It is not just a grammar formalism like BNF, but also an implementation of that formalism that offers, among other things, generation, parsing and type checking. It is meant to model formal languages and (well-defined subsets of) natural languages. It takes care not only of the syntax of a language but also of semantics, especially through the dependent types of Martin-Löf's type theory.
The main feature of GF is the separation of the abstract syntax tree (AST) from its representation in concrete languages. A so-called abstract grammar module defines the categories (cat), which are the possible return types of functions, and the functions (fun) with their signatures. But nothing is said about what such a cat or fun should look like when it is linearised, as displaying the tree in a concrete language is called. That is done in one or more concrete grammar modules. Here, for every cat, a lincat or linearisation category is defined that determines what the cat should look like in that language. Likewise, for every fun there is a lin, or linearisation rule, that defines the concrete representation of that function.
There can be more than one concrete grammar module for an abstract one at the same time in GF. That way, the AST, which all loaded concrete grammars have in common, can be linearised into all those languages at the same time. If GF parses a string from one of those languages into an AST, it will be linearised into the other languages, which results in translation. An example of an AST and the linearisation in several languages can be seen in figure 6.5.
¹ http://www.omg.org/cgi-bin/doc?formal/03-03-13
² http://www.cs.chalmers.se/~aarne/GF/
Both formal and informal languages are possible as concrete languages. Resource grammars⁴ exist for ten natural languages³. They contain linguistic knowledge such as inflection and word order and offer a common (but not completely identical) interface to it for the application grammar writer, relieving him of that part of the work and letting him concentrate on what to express, not how to inflect it. For formal languages that is not necessary because, as their name says, they are already completely formalised. Nevertheless, resource modules for precedence rules could be shared between several formal languages, but this has not been done yet.
In this work, the languages used are the formal language OCL and the natural languages
English and German.
GF is described in more detail in [Ran04].
6.1.2. Example
As an example, two funs from the OCL grammars will be explained. In the abstract
syntax they are defined with their cats as follows:
    cat Sent ;
    cat Instance ;
    cat Class ;
    fun orS : Sent -> Sent -> Sent ;
    fun eq : (c : Class) -> (a, b : Instance c) -> Sent ;
First come the declarations of the categories that are used in this example. They introduce these types but say nothing about how they look, since this is the abstract syntax, not a concrete one. The keyword fun begins a judgement for a fun; it is followed by the name and, separated by a colon, the type signature of the fun. The signature consists of the argument types first and, at the end, the category or return type of the fun. For both funs above, Sent is the return type. They differ in their arguments. orS just takes two Sent arguments, which can also be seen in figure 6.1. These can be filled in and will show up in the concrete linearisation, but nothing special happens here.
For eq, things are a bit more complicated. Here dependent types come into play. To specify which type depends on which, the arguments are given names. Here, the types of the second and third arguments depend on the value of the first argument. When c is refined to IntegerC, then a and b both have to have the type Instance IntegerC. In this context, (a,b : Instance c) has the same meaning as (a : Instance c) -> (b : Instance c).
³ Danish, English, Finnish, French, German, Italian, Norwegian, Russian, Spanish and Swedish
⁴ http://www.cs.chalmers.se/~aarne/GF/lib/resource/
Figure 6.1.: The AST and linearisation of orS in the graphical editor for GF
When c is not yet determined, as visible in figure 6.2(a), not much is said about the type of a and b. But when c is refined (see figure 6.2(b)), both are fixed to one type, in the example Instance IntegerC.
Having the two Instance arguments depend on the first argument makes it possible to enforce that both have the same type, which is needed because OCL only allows comparisons between objects of the same type. GF will automatically fill in c when one of the Instance arguments is refined first, since there is a GF constraint, or dependency, that locks the types of c and a/b together. If one deletes c afterwards and refines b with a different type than a, GF will notice and report it. When two types depend on each other but do not match, this is called a GF constraint. GF attaches this constraint to the nodes, and the editor colours the linearisation of these nodes red.
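The behaviour described for eq can be mimicked in a few lines of Python. This is a toy model under assumptions, not GF's type checker: classes are plain strings, None stands for a metavariable, and the class `EqNode` with its methods is hypothetical.

```python
class EqNode:
    """Toy model of an eq node: c is the hidden class, a and b the operands."""

    def __init__(self):
        self.c = None  # value of the Class argument, None = metavariable
        self.a = None  # class of the first Instance argument
        self.b = None  # class of the second Instance argument

    def refine_instance(self, slot, cls):
        """Refine operand 'a' or 'b' with an instance of class cls."""
        setattr(self, slot, cls)
        if self.c is None:
            self.c = cls  # the dependency lets c be filled in automatically

    def delete_class(self):
        self.c = None

    def constraint_violated(self):
        """True if the classes chosen so far do not all agree."""
        chosen = {t for t in (self.c, self.a, self.b) if t is not None}
        return len(chosen) > 1


node = EqNode()
node.refine_instance("a", "IntegerC")
print(node.c, node.constraint_violated())

node.delete_class()
node.refine_instance("b", "StringC")
print(node.constraint_violated())
```

In the second half of the example the operands end up with different classes, which is the situation in which the editor would colour the affected linearisation red.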
The return type of a fun may also depend on one of its arguments. Note that Instance IntegerC and Instance StringC are different types on the abstract syntax level. Only on the concrete syntax level do they share the same linearisation rule, the one for Instance.
Now for the OCL linearisation rules for these two functions. Note that these rules are simplified and, for example, lack support for operator precedence.
Figure 6.2.: The parametrised equality operator eq. (a) No class chosen: the Class argument is open and the exact type of the Instance arguments is not yet determined (the type is printed as Instance (?)). (b) IntegerC chosen as the class to compare on: the type of the two other arguments is now known to be Instance IntegerC.
    lincat Sent = { s : String } ;
    lincat Class = { s : String } ;
    lincat Instance = { s : String } ;
    lin orS sent1 sent2 = { s = sent1.s ++ "or" ++ sent2.s } ;
    lin eq c a b = { s = a.s ++ "=" ++ b.s } ;
Both are pretty straightforward. Every linearisation of a fun is a record. The layout of this record is defined in the lincat judgement. In this example, the record just contains a string field s, since there is no such thing as inflection in OCL. The lin judgements assign the record fields their values. Linearisation is recursive and starts at the leaves; the funs above can access the linearisation records of their arguments, but not their tree structure. This limitation/feature is called compositionality.
For both lins, the s field of the first argument comes first, then or or =, respectively, is appended, and the s field of the second argument follows. Note that the class argument of eq is dropped in the linearisation. It is therefore called a hidden type argument. These hidden type arguments occur quite often in the OCL GF grammars and will be discussed later on.
Simplified English lins that don't use the resource grammars could look like this:
    lincat Sent = { s : String } ;
    lincat Class = { s : String ; modelName : String } ;
    lincat Instance = { s : String ; modelName : String } ;
    lin orS sent1 sent2 = { s = sent1.s ++ "or" ++ sent2.s } ;
    lin eq c a b = { s = a.s ++ "is equal to" ++ b.s } ;
There is not much of a difference in this simplified version. Classes are normally referred to by a natural language name, for example pay card for PayCard, as in the pay card in the linearisation of self (see section 7.11 for more about that fun). In the context declaration, however, the real model name is used to make the mapping clear (as visible in figure 7.1). In the example funs above that field is not used, though.
The ‘real’ lins use the resource grammars, but that was not part of this work, so it is not shown here.
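How one abstract tree yields several linearisations can be imitated outside GF with a short Python sketch. It is an analogy under assumptions rather than GF itself: the rule tables mirror the simplified lins for orS and eq, records are modelled as dictionaries with a field s, and the leaves x, y, z and IntegerC are hypothetical funs that are simply printed under their own names.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Tree:
    fun: str
    args: Tuple["Tree", ...] = ()


def join(*tokens):
    """Token concatenation with a separating space, in the spirit of GF's ++."""
    return " ".join(tokens)


# One rule table per concrete language; every rule sees only the records
# of its arguments, never their tree structure (compositionality).
OCL = {
    "orS": lambda s1, s2: {"s": join(s1["s"], "or", s2["s"])},
    "eq": lambda c, a, b: {"s": join(a["s"], "=", b["s"])},  # c stays hidden
}
ENGLISH = {
    "orS": lambda s1, s2: {"s": join(s1["s"], "or", s2["s"])},
    "eq": lambda c, a, b: {"s": join(a["s"], "is equal to", b["s"])},
}


def linearise(tree, rules):
    """Linearise bottom-up: first the arguments, then the fun itself."""
    records = [linearise(arg, rules) for arg in tree.args]
    rule = rules.get(tree.fun)
    if rule is None:
        return {"s": tree.fun}  # hypothetical leaf without a rule of its own
    return rule(*records)


x, y, z, int_c = Tree("x"), Tree("y"), Tree("z"), Tree("IntegerC")
ast = Tree("orS", (Tree("eq", (int_c, x, y)), Tree("eq", (int_c, y, z))))

print(linearise(ast, OCL)["s"])
print(linearise(ast, ENGLISH)["s"])
```

Because both tables are applied to the same tree, the class argument IntegerC influences neither output string, which is exactly what makes it a hidden type argument.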
6.1.3. The Java Editor for GF
GF features a graphical syntax editor. In the editor, all operations work on the AST and none on the linearisation, while the AST is linearised in all loaded languages. That way, all operations take effect in all languages at the same time. Thus, multilingual editing is possible. One does not have to know all the languages to produce a text in them; it is enough to know one. The linearisation in that language can serve as a proof linearisation to check whether the editing commands produced the wanted result. The linearisations in the other languages will always have the same meaning, so they do not need to be checked (given that the writers of the foreign-language grammars did a good job).
Editing in the editor is top-down. To start, one selects the type of the root node
(Sentence for example, as in figure 6.3), refines that and then is led to the child nodes
(which for hunting, as in figure 6.4, are subject and object) and so on.
In figure 6.3 you can see the main window of the editor. It is divided into three big
parts.
In the upper left, one can see the abstract syntax tree. Here you can see which fun is
where and takes which arguments. After each fun you can see the type of it. When
one clicks in the tree, GF is told to set that node as the active node. It then sends the
updated state back for display, which accordingly reflects the state change.
In the upper right there is the linearisation area. Here the linearisations in the selected languages (in the figure Finnish and English) are displayed. If desired, one can also show the tree in a textual representation there, but that is of limited use and thus hidden by default. If the currently active node is visible in the linearisation (type arguments, for example, normally are not), the part of the linearisation that belongs to the active node is coloured green. To switch to another part of the text, one can just click there in the linearisation area.
At the bottom there is the refinement menu. There are several kinds of commands. The most important one is r, short for refine. refine is offered if, as in figure 6.3, an open (or unrefined) node, a so-called metavariable, is selected. With refine one can apply a fun at the current node, so that it appears in the AST at the position of that metavariable. Once the node is refined, GF jumps to the next metavariable, as shown in figure 6.4.
If a refined node is selected, there are several possibilities, as visible in figure 6.5. One can delete the fun there with all its children (the whole subtree), in the example resulting in figure 6.6, or change the currently selected fun into another one, as done in figure 6.7. Furthermore, it is possible to insert a fun between the current node and its parent, which is called wrapping in GF and basically is bottom-up editing. This is only possible if the return type of the wrapper fun and the type of the chosen argument are the same as the type of the current node (see section 9.3.5 for more about that). Its effect is shown in figure 6.8. Alternatively, one can just remove the current node with ph (peel head) and use one of its arguments instead, as in figure 6.9, which does not delete the whole subtree as delete does. Here, too, the type of the fun and that of the selected argument have to be the same, since this command is the reverse of wrapping.
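The editing commands can be summarised as operations on a small tree data structure. The following Python sketch is a simplified model under assumptions: it is not the editor's or GF's implementation, dependent types are left out, the caller of wrap is assumed to pass a wrapper fun whose return type equals the type of the node, and the fun trueS is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    type: str
    fun: Optional[str] = None  # None marks a metavariable
    children: List["Node"] = field(default_factory=list)


def refine(node, fun, arg_types):
    """Apply fun at a metavariable; its arguments become new metavariables."""
    node.fun = fun
    node.children = [Node(t) for t in arg_types]


def delete(node):
    """Remove the fun together with its whole subtree."""
    node.fun, node.children = None, []


def wrap(node, fun, arg_types, position):
    """Insert fun above the current subtree, which becomes argument `position`."""
    if arg_types[position] != node.type:
        raise ValueError("argument type must equal the type of the current node")
    inner = Node(node.type, node.fun, node.children)
    node.fun = fun
    node.children = [inner if i == position else Node(t)
                     for i, t in enumerate(arg_types)]


def peel_head(node, position):
    """Reverse of wrap: replace the node by one of its arguments."""
    child = node.children[position]
    if child.type != node.type:
        raise ValueError("argument type must equal the type of the current node")
    node.fun, node.children = child.fun, child.children


root = Node("Sent")
refine(root, "trueS", [])                # hypothetical fun without arguments
wrap(root, "orS", ["Sent", "Sent"], 0)   # orS (trueS) (?)
print(root.fun, [child.fun for child in root.children])
peel_head(root, 0)                       # back to trueS
print(root.fun)
```

Wrapping followed by peeling the same argument position restores the original tree, which reflects the remark that peel head is the reverse of wrapping.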
There is also a clipboard, into which one can put arbitrarily many subtrees with ac (add to clipboard). With rc (refine from clipboard) one can paste them again, provided that their type suits the current node.
A full description can be found in [Khe03].
Figure 6.3.: The main window of the Java GUI for GF. Currently, an unrefined node is selected, so only refine commands are offered.
6.2. KeY
This work is part of the KeY system ([ABB+05]). ‘The aim of the project is to integrate
formal software specification and verification into the industrial software engineering
processes’5. The project holds that a verification tool outside the normal development
environment will not see widespread use in software engineering practice.
It has to be within arm’s reach of the developer. For that, it must not use
esoteric languages, but a language the developer already uses. Therefore, JavaCard6 has
5 taken from http://www.key-project.org
6 http://java.sun.com/products/javacard/
Figure 6.4.: The first refinement has been executed, and updated tree and linearisation
are shown. The next node has automatically been selected by GF.
been chosen as the target language.
Specification takes place on a higher level: not directly on the implementation, but on
the model. So the model has to be formalised in some way to be accessible to the
specification. The most widely used modelling language is the Unified Modelling Language
(UML). But UML alone does not say much about the semantics of a model beyond state
machines, model structure, caller-callee relations and cardinalities. Return values, for
example, are not covered at all.
In use cases, much of the specification is attached as informal text. But informal text
with all its ambiguities cannot be used as a formal specification when it comes to formal
verification. So the model has to be enriched with formal specifications.
One way to do this is to use the Object Constraint Language, OCL for short. With
this specification language, class invariants and method contracts consisting of pre- and
postconditions can be stated formally and precisely.
But knowledge and use of OCL are not widespread. Therefore, KeY not only tackles
proving that a programme behaves as specified, but also assists in specifying. One
approach is the use of specification patterns, as described in [Bub02]. Another one is
the basis for this work; it is described in [Johar] and also in this work.
To be in arm’s reach of the developer, specification and verification must be integrated
Figure 6.5.: An already refined node has been selected. Now one can either delete the
current subtree, change the current fun into another one or wrap the current
subtree with another fun.
Figure 6.6.: Tree and linearisation after sending the delete command d in figure 6.5 to
GF.
Figure 6.7.: Tree and linearisation after sending the change command ch Thick in figure
6.5 to GF.
Figure 6.8.: Tree and linearisation after sending the wrap command w Dirty in figure
6.5 to GF.
Figure 6.9.: Tree and linearisation after sending the peel head command ph in figure 6.5
to GF.
into the UML and development tools in use. Together Control Center7 is such a tool,
and KeY has been integrated into TogetherCC as a plug-in. Figure 6.10 shows how KeY
can be called from within TogetherCC. Specifying and verifying are offered in the
context menu of a class or method, and OCL constraints are saved together with the class
itself (see section 8), so no external tools have to be used.
The direct context of this work is the PhD thesis of Kristofer Johannisson [Joh05]. An
overview of it is given here.
In [HJR02] (part of [Joh05]), a system is presented that bridges the formal language
OCL and natural language. Using GF as the grammar formalism, Kristofer Johannisson
wrote a common abstract grammar for both OCL and English, as well as concrete grammars
for these languages. As GF offers multilingual editing with the generic Java GUI
for GF, one could then create OCL constraints without knowing OCL. In [Dan03], a German
concrete grammar for the abstract OCL grammar was written, using the resource
grammars. The author later on translated this into English. David Burke improved the
natural language rendering in [BJ05] (also part of [Joh05]) and introduced a system for
formatting. After that, the author adapted these improvements for the German grammars,
so that both English and German benefitted from them. In [Joh04] (also included
in [Joh05]), Kristofer Johannisson describes an OCL parser he wrote. Before that, due
to limitations in the generic GF parser, OCL could not be parsed, so only multilingual
syntax editing was possible. With the new parser, translating OCL into natural language
became possible as well.
However, besides integration problems (see section 8.2), the main problem was that the
generic Java GUI, together with the abstract OCL grammar, was not very usable, but
instead quite clumsy to use.
7 http://www.borland.com/us/products/together/index.html
Figure 6.10.: Creating the specification, and verifying with KeY that the implementation
adheres to it, can be started directly from the context menu of
TogetherCC.
6.4. Goals
Remedying this situation is the aim of this work. The idea of this system is to make
OCL editing possible for someone who does not know OCL. The first step is to
identify the main problems that a user of the editor would have, the next to devise ideas
to overcome these problems, and the last to implement these ideas.
The basic working principle is to offer the user only what is possible in the
current situation. On the one hand, that restricts the user, because it is not possible
to simply type something in right away; on the other hand, this ‘railing’ has its
advantages. The approach prevents syntax errors, since the written OCL is created by GF,
not by the user, and GF only allows well-formed OCL. Besides that, the user does not
have to know all OCL operations by heart, but just needs to select from a list from
which all non-applicable operations have already been removed. Or, taken further, the
user does not even need to know which OCL operations exist; the system should tell him
and explain them. And recognising something presented as the thing needed is
much easier than having to remember it.
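The principle of offering only what is applicable can be sketched as a filter over fun signatures. This is only an illustration, not the mechanism used by GF: the table FUNS is a tiny hypothetical excerpt, and the argument types given for andS and orS are assumptions. The signatures of intGT and int are the ones quoted in sections 7.9 and 7.6.

```python
from typing import Dict, List, Tuple

# Hypothetical excerpt of an abstract syntax:
# fun name -> (argument types, result type).
FUNS: Dict[str, Tuple[List[str], str]] = {
    "andS": (["Sent", "Sent"], "Sent"),   # argument types assumed
    "orS": (["Sent", "Sent"], "Sent"),    # argument types assumed
    "intGT": (["Instance IntegerC", "Instance IntegerC"], "Sent"),
    "int": (["Int"], "Instance IntegerC"),
}


def refinement_menu(expected_type: str) -> List[str]:
    """Return only those funs whose result type is exactly the expected type."""
    return sorted(
        name for name, (_, result) in FUNS.items() if result == expected_type
    )
```

Since every entry of such a menu yields a well-typed tree, the user cannot produce a syntax or type error by choosing from it.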
This strength, allowing OCL editing by point and click, should be built upon and increased.
Stumbling blocks and unnecessary editing steps, however, should be removed. In short,
the editor should become usable.
6.5. Potential use and users
The editor is not meant for OCL experts. They already know how to express things in
OCL and what the syntax looks like. They know the type concept of OCL and do not need to
be restricted to always writing something type-correct. For example, they can start with
a > aStringSet (which would not be possible in the editor), but by continuing as in
a > aStringSet->any(s : String | s.substring(0,1)= "A").size() this becomes correct
again.
For people who do not know OCL (yet), but only the domain, this restriction is not so
limiting. They cannot enjoy their freedom yet, but need to be taken by the hand. And this
is something the editor does, or should do. When they have fledged and want to write OCL
directly, they will still have benefitted from it. That makes the editor an introductory
tool. It helps beginners to write and learn OCL, the latter only if the user wants to.
He does not need to look at the OCL; just looking at the English or German version is
enough, since the OCL will have the same meaning.
Teaching is another potential area of application. Here formalisation and OCL can be
taught at the same time, since the latter is not a prerequisite for the former.
7. Usability Problems of the Previous Editor
No real usability study has been done here. At the beginning of the work, the author
found too many blocking flaws, which would easily overshadow other, still hidden problems.
These blocking problems, which should be solved before such a study, are listed below.
This list does not claim to be complete. Experiments with untrained subjects will surely
reveal more problems, especially once the problems listed here are resolved and give way
to a less restricted use of the editor, in which deeper-lying problems can be discovered.
The problems in this chapter are described from a user’s point of view and have different
causes. These causes will be discussed and explained too, since shedding light on the
source of a problem is the first step towards solving it.
7.1. Which function does what?
When using the old editor started from TogetherCC, one was greeted with a huge list of
strange names, as in figure 7.1. These are the names of the funs in the abstract syntax
of GF.
These names were sometimes understandable, like intEq for example. For others, like
one, one cannot even guess what they would do. If a node of a type other than Sent
is selected, the situation is similar.
Not recognising what a function stands for is one thing. Understanding what it could be
used for is another. For the more complicated OCL functions like iterate or anyOclAsType,
even having the name they have in OCL does not help if one does not know OCL. The
point of the editor is to help people who do not know OCL to author OCL constraints.
For that, the editor should help the user choose the right fun. At the very least it
should explain what the functions can be used for. Just telling which functions are
applicable is not enough.
7.2. Not seeing the forest because of so many trees
Let’s take a node of type Sent, standing for sentence. There are several different kinds of
functions that return this type. Some are boolean functions to conjoin other sentences,
Figure 7.1.: Screen of the editor directly after starting. Note that the screenshot has
been altered so that the whole refinement menu is visible at once.
like orS or andS. Other functions compare objects, like the equalities mentioned in section
7.9. Yet others deal with the OCL collection types and make propositions about them,
like one or contains. So functions from different areas are mixed here.
But when one wants to and-conjoin two sentences, one does not want to search for the
right function among all those collection functions and other functions like comparisons.
A list with over 50 entries to choose from is just too long. Some reduction is needed here.
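One conceivable kind of reduction, sketched here purely as an illustration, is to sort the offered funs into a few groups according to the area they belong to. The group names and the assignment of funs to groups in GROUPS are assumptions made for this sketch; only the fun names themselves are taken from the text.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Hypothetical assignment of funs to areas.
GROUPS: Dict[str, str] = {
    "andS": "logic",
    "orS": "logic",
    "intGT": "comparison",
    "realGT": "comparison",
    "one": "collection",
    "contains": "collection",
}


def grouped_menu(offered: Iterable[str]) -> Dict[str, List[str]]:
    """Split a flat list of applicable funs into one short list per area."""
    menu: Dict[str, List[str]] = defaultdict(list)
    for fun in offered:
        menu[GROUPS.get(fun, "other")].append(fun)
    return {group: sorted(funs) for group, funs in sorted(menu.items())}
```

A user who wants to conjoin two sentences would then only have to look at the entries of one group instead of the whole list.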
7.3. What am I to do?
The editor only ever offers refinements that are type-correct. But the type alone is
not enough for the user. Which purpose the current refinement serves semantically is
not stated by the editor and can remain unclear. For example, there are type arguments
that are hidden with regard to the linearisation (as in the example in section 6.1.2).
One such case is eq, the parametrised equality, as seen in figure 7.2. Its first argument
selects the type on which the comparison is done. But that is not explained anywhere;
the user is left to guess.
On a side note: it is not possible to click in the linearisation to escape such a hidden
argument. If it has been selected in the tree, the linearisation area becomes unclickable.
One can refine the current node, which will make GF select the next one, but selecting a
part of the linearisation in order to select any other node does not work. One has to use the tree
Figure 7.2.: The hidden argument of eq is selected. But what does it do?
to select one. Normally, one can select the desired node by clicking in the linearisation,
so this behaviour violates the user’s expectations without offering any benefit in
return. It is just an annoying limitation.
7.4. Where does what begin?
If an OCL constraint becomes longer, it is hard not to lose the overview, as mentioned
in [BJ05] and visible in figure 7.3.
An easy yet very effective way to help here is formatting (see [BJ05] for details).
In the underlying work of David Burke, such formatting is implemented for the English
GF grammar for OCL. But the editor does not support formatting mark-up in the
linearisation. GF just sends text, which is displayed character by character.
Implementing those formatting improvements that do not change the indices of the
displayed characters with respect to their position in the text from GF, such as line
breaks or nicer bullets, helped a bit. But implementing proper indentation in GF would
be very messy, since the linearisation of a fun normally stands on its own and does not
depend on its context. Nor would it be implementable with the system introduced
by David Burke. This problem is discussed in more detail in section 10.1.6.
Figure 7.3.: A non-trivial OCL constraint of the PayCard example. Apart from (*) for
bullets, which are not even followed by line breaks, no structure is given
to make the constraint easier to take in.
7.5. But my Integer is a Real, isn’t it?
When comparing Reals, for example, only funs of type Real are listed in the refinement
menu, but no Integer funs. In general, only funs of exactly the expected type are offered
there, never funs of a proper subtype.
The reason for that is a fundamental one. Since OCL is part of UML, which is best
known for its inheritance diagrams, OCL’s type system features subtyping. The
type system of GF does not. So the OCL editor has to work around what is, in
this context, a deficiency of GF. This is done by using dependent types, one of the main
features of GF’s type system (see [Ran04]).
The main element in this emulation is the fun coerce:
fun coerce : (sub, super : Class) -> Subtype sub super
    -> Instance sub -> Instance super ;
This fun does an explicit upcast. The first argument, sub, selects the subclass, while
super selects the superclass. The object that is to be upcast is the fourth argument and
has the type Instance sub. That means that this type depends on the first argument,
sub (see the example in section 6.1.2 for more about dependent types). The return type
of coerce is an example of a dependent return type. This type is determined by the
second argument, super. Whenever a fixed type like Instance IntegerC is expected as an
argument and a coerce is used to refine it, GF will automatically refine super as well,
since it is bound to the return type, which is fixed in this situation and cannot be
chosen freely.
So far, this allows one, for every pair of classes sub and super, to use an Instance sub as
an Instance super. But that would allow type-incorrect OCL, which is not wanted,
since the purpose of this editor is to relieve the user from thinking about syntax and
type correctness. To prevent incorrect subtyping, there is the third argument. Here a
so-called subtyping witness (see below) has to be inserted by hand.
For beginners, coerce is easy to overlook, since in UML one does not have to think of
explicit upcasts. One would therefore expect the variables and functions that return
Instance sub to appear in the refinement menu for an Instance super. But they don’t.
Even if one knows what coerce does and what its arguments are (see above), it is still
cumbersome to introduce it everywhere by hand, since it is an additional step that
does not match the UML model, and probably not the user’s mental model either. Every
object of class A is already also an object of class B if A is a subtype of B. As
virtually superfluous steps slow down the user unnecessarily, it would be better to
abolish them.
Another disadvantage of many coercions is the cluttered tree. coerce with its 4 arguments
takes up quite some space in the tree view, which distracts from those OCL functions
that do not merely compensate for deficiencies.
Subtyping Witnesses For a direct subtyping relation in which B is a subtype of A, there
exists a subtyping witness Bconforms2A : Subtype B A in the grammars. However, for
transitive subtyping, as when C is a subtype of B and B is a subtype of A, one has to use
subTypeTrans : (c, b, a : Class) -> Subtype c b -> Subtype b a
    -> Subtype c a ;
which allows one to build the wanted subtyping witness of type Subtype c a. For a
reflexive coercion from A to A,
subTypeRefl : (c : Class) -> Subtype c c ;
has to be used, which returns such a reflexive subtyping witness.
Building transitive subtyping witnesses by hand is unnecessarily complicated. For a
chain in which D is a subtype of C, C of B, and B of A, figure 7.4 shows the AST for a
coercion from D to A. Constructing something like that should not be imposed on the user.
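How mechanical this construction is can be seen from the following sketch, which builds the witness term for a whole inheritance chain. It is an illustration only and not part of the grammars or the editor: the function names are made up, and the naming pattern XconformsToY is replaced by the assumed pattern Xconforms2Y, modelled on the witness name mentioned above.

```python
from typing import List


def _paren(term: str) -> str:
    """Put parentheses around a term that is itself an application."""
    return f"({term})" if " " in term else term


def subtype_witness(chain: List[str]) -> str:
    """Build a term of type Subtype chain[0] chain[-1].

    chain lists the classes from the subclass up to the superclass, each one
    being a direct subtype of its successor.
    """
    if not chain:
        raise ValueError("the chain must contain at least one class")
    sub, sup = chain[0], chain[-1]
    if len(chain) == 1:
        return f"subTypeRefl {sub}"
    if len(chain) == 2:
        # Assumed naming pattern for the direct witnesses in the grammars.
        return f"{sub}conforms2{sup}"
    middle = chain[-2]
    lower = subtype_witness(chain[:-1])      # proves Subtype sub middle
    upper = subtype_witness([middle, sup])   # proves Subtype middle sup
    return f"subTypeTrans {sub} {middle} {sup} {_paren(lower)} {upper}"
```

For the chain D, C, B, A this yields two nested applications of subTypeTrans, one for each additional step beyond a direct subtype relation.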
Creating additional witnesses for transitive subtype relations in the grammars themselves
would inevitably increase the grammar size, in the worst case quadratically. This is
the case when there is just one long inheritance line, where each new class at the bottom
needs a subtyping witness to all classes above plus one to itself. But that should be very
rare, and if the classes are not empty, but have a number of properties, the added funs
should not have such a big impact on size.
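The worst case can be made concrete with a little arithmetic, sketched below under the assumption of a single inheritance line of n classes. The function names are chosen for this illustration only.

```python
def witnesses_with_all_pairs(chain_length: int) -> int:
    """Witness funs needed if every class in one inheritance line gets a
    witness to each class above it and one to itself."""
    # The class at depth k (counting from 1 at the top) has k - 1 classes
    # above it and needs one reflexive witness, hence k witnesses.
    return sum(depth for depth in range(1, chain_length + 1))


def witnesses_direct_only(chain_length: int) -> int:
    """Witness funs needed if only direct subtype relations are covered."""
    return max(chain_length - 1, 0)
```

For the chain of four classes used in figure 7.4 this gives 10 witnesses instead of 3; in general the count is n(n+1)/2, which grows quadratically in n.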
Figure 7.4.: Given that D is a subtype of C, C of B, and B of A, this is how a D can be
used as an A. The rule for transitive subtyping only covers one additional step, so
after the first subTypeTrans we still have to prove that D is a subtype of B, which is
done in the second subTypeTrans.
The linearisation record of such a subtyping witness consists of just one string, so
witnesses do not take up that much memory compared to, for example, classes, which as
common noun phrases (CN) contain a table of all cases and numbers. And reducing the burden
on the user should justify a slight increase in grammar size (and thus in loading time
and memory consumption).
7.6. I want to enter 42.
When one wants to enter an arbitrary integer number, one only sees 0, 1 and 2, which are
offered directly. There is no visible way to go further. With descriptions of the menu
entries, one could find out that the fun int takes as its argument an Int, which is the
built-in type for integer numbers in GF and can be any integer number. But if this fun
is selected, just a list of 6 seemingly arbitrary numbers is offered. And there is no sign
of how to enter another number either. The key is to click read, enter the number, and
then, before clicking OK, select Term, which is not the default option.
For String, the situation is similar. At least the fun that takes the String literal is
called stringLiteral, so it is easier to guess. But when one clicks read, things get more
complicated. As possible ‘types’ to read in the value, String and Term are offered. And
String is the wrong ‘type’ for a String, since it asks the parser of GF to parse the
entered value; but given the current restriction that the GF parser cannot handle String
literals, this does not work1. One has to select Term and put quotation marks around the
String.
If one does not know this, this is a complete blocker. There is no indication that one has
1 There is a special lexer for that, but that one is not active at this point.
to click outside the refinement menu to do a refinement. But why should it be necessary
to look somewhere else?
The reason why integers cannot just be entered when an Instance IntegerC is expected
is that integer literals in GF have their own built-in type, Int. For GF, this type has
nothing to do with Instance IntegerC, or with Instance ? in general. They are completely
independent types. Semantically and linearisation-wise, however, they are much the same;
only their type is different. To remedy that situation, a type conversion fun, the
aforementioned int, has to be used:
fun int : Int -> Instance IntegerC ;
The linearisation of this fun just takes the string representation of that Int and uses it
uninflected:
lin int i = np2inst (numNoInfl i.s) ;
The same holds for Instance StringC and String.
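What the user has to enter by hand as a Term can be sketched as follows. The helper literal_term is hypothetical and not part of the editor; it only shows how a literal value would be turned into an application of int or stringLiteral, including the quotation marks that a String needs. Negative numbers and strings containing quotation marks are left out of this sketch.

```python
def literal_term(value) -> str:
    """Render a literal as a term of type Instance IntegerC or Instance StringC."""
    if isinstance(value, bool):
        raise TypeError("booleans are not covered by this sketch")
    if isinstance(value, int):
        if value < 0:
            raise ValueError("negative numbers are not covered by this sketch")
        return f"int {value}"
    if isinstance(value, str):
        if '"' in value:
            raise ValueError("embedded quotation marks are not covered")
        # A String literal has to be wrapped in quotation marks.
        return f'stringLiteral "{value}"'
    raise TypeError(f"no literal form for {type(value).__name__}")
```

An editor could build such terms itself once the user has typed a number or a text, so that the detour via read and Term would no longer be needed.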
7.7. Where is my boolean property?
In the GF OCL grammars, the OCL type Boolean is represented in more than one way.
Variables in the grammars have type Instance, so Boolean variables have the type
Instance BooleanC. In natural language, an Instance is treated like a name, so it can be
inserted wherever a subject or object of a sentence is expected. But boolean expressions
tend to be propositions like comparisons. These are sentences in NL, like the PIN code
is set, not just nouns. And cramming sentences into places where a noun is expected
needs auxiliary constructs like the truth value of the proposition that . . . or true iff . . . .
These are rather clumsy and not used in normal NL.
Boolean operations in particular, however, can be applied to sentences directly, as with
‘or’ in the PIN code is set or the card is uninitialised. The sentences do not have to be
treated as nouns. Therefore, there is a cat Sent for such sentences, used for propositions
and for the boolean funs that conjoin them. Even a constraint itself, which in OCL has to
have a boolean value, is a Sent. But there are situations where propositions and boolean
variables have to be mixed, for example when propositions are used as boolean parameters
for methods from the model. These expect Instance BooleanC. Therefore, ‘conversions’ from
Sent to Instance BooleanC and back have to be used.
Thus there are already two places where boolean predicates could show up. But they are
neither Sent nor Instance BooleanC. For the grammar generator, they are something else.
The Predicate funs generated for them can only be accessed with a fun that returns
AtomSent. And these can only be accessed via the fun posAtom or negAtom, which
make Sents out of them (an example tree can be seen in figure 7.5).
fun posAtom : AtomSent -> Sent ;
fun negAtom : AtomSent -> Sent ;
Figure 7.5.: An AST, where a boolean model element is accessed.
fun predCall : (rec : Class) -> Instance rec -> Predicate rec
    -> AtomSent ;
This is a well hidden way that the user has to know. One would expect instances of
Boolean to show up as Instance BooleanC, but that is not the case.
If predCall were offered with a helpful description when a Sent is expected, it would
be easier to find. But hiding it behind the type AtomSent just makes it more complicated
than it should be.
Exceptions have the same problem. Whether one is thrown or not is an atomic sentence,
AtomSent, not a Sent.
The reason why there is the type AtomSent at all is that negation of Sents leads to
awkward constructions like it is not the case that the PIN is validated, since a Sent has
no special negated form which could be used instead of a lengthy sentence prefix. In
contrast, AtomSent has such a negated form. So the above example would be, in English,
the PIN is not validated, which is shorter and more easily read. So as far as the produced
natural language is concerned, this approach yields better results.
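The difference between the two kinds of sentences can be modelled in a few lines. This is a simplified sketch and not the linearisation record actually used in the grammars: the field names positive and negative and the function negate_sent are assumptions, while posAtom and negAtom correspond to the funs shown above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AtomSent:
    """An atomic sentence that carries both its positive and its negated form."""
    positive: str
    negative: str


def pos_atom(atom: AtomSent) -> str:
    """posAtom: turn an atomic sentence into a positive Sent."""
    return atom.positive


def neg_atom(atom: AtomSent) -> str:
    """negAtom: turn an atomic sentence into a negated Sent."""
    return atom.negative


def negate_sent(sent: str) -> str:
    """A finished Sent is just a string here, so only a prefix can negate it."""
    return f"it is not the case that {sent}"
```

Once an AtomSent has been turned into a Sent, only one of its two forms is left, which is why the shorter negation is no longer available afterwards.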
But the GF resource grammars, which are used to a large extent in the grammars to
form sentences in a mostly language-independent way, use the lincat Sent and know
nothing about AtomSent2. Although a Sent is most often built with VerbGroups, which
have a parameter telling them to negate themselves if necessary, this negation is
forgotten on the Sent level. And because of the compositionality of GF, that is, because
one can only access the linearisation record of a child in the AST, but never the subtree
as such, one cannot get this negated form back.
7.8. Everything is red, what did I do wrong?
If GF finds a constraint, that is, if arguments of dependent types do not match or have
not been filled in completely, it will turn the respective part of the linearisation red to
2 There is ongoing work in this area, so that is subject to change.
make the user aware of this situation, as explained in the example in section 6.1.2. But
GF is a bit too cautious and reports such a case for the complete tree right at the beginning.
This constraint only occurs because not all dependent type arguments are filled in, so
the solve mechanism of GF can easily eliminate it by refining these arguments. But that
was not done automatically, so the user was confronted with the signal colour red, as in
figure 7.1. This tells the user that he did something wrong, which is not the best way
to greet a user, especially if that ‘error’ has no real basis.
7.9. So many comparison operators?
For a greater-than comparison, for example, there are two functions, intGT and
realGT:
fun intGT : (a, b : Instance IntegerC) -> Sent ;
fun realGT : (a, b : Instance RealC) -> Sent ;
In contrast, in OCL, such comparisons are only defined on Reals. People who have
been shown the editor also asked why Integer and Real get different comparison operators.
One wants to compare numbers, perhaps even a Real with an Integer, but does not
want to compare specifically Integers only.
The reason for these different comparison operators is that, for GF, an Instance IntegerC
is not applicable when an Instance RealC is expected (see section 7.5 on coercions). In
order to spare the user two coercions when Integers are to be compared, the funs for
Instance IntegerC were introduced.
There are also many funs to compare for equality: stringEq, intEq, realEq, boolEq, bagEq,
setEq, sequenceEq, sentEq, eq, neq, anyEq, anyNeq. A collectionEq was missing. What
the type-specific equality comparisons do might be easy to guess, but what do anyEq
and eq do?
anyEq is the comparison on OclAny, the supertype of all non-collection types in OCL. So
it should take all non-collections as possible arguments, which it does, given the
necessary coerce. eq is a parametrised equality where one first selects the type of both
arguments (perhaps one of them has to be upcast, but that is not explained to the user),
and then the arguments themselves.
But why is there a need for such a host of different equality comparison operators? The
reason is that in OCL only instances of the same type are allowed. Therefore,
every type has its own equality operator.
eq subsumes all other equality comparison operators, since with its type argument set
to String, for example, it expects exactly the same arguments as stringEq does, only that
the type selection is deferred to a later step. One could say that this serves as a grouping
of several equalities. eq can even play the role of anyEq, since for the grammars OclAnyC
is just a Class, as StringC is one.
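That eq subsumes the type-specific operators can be illustrated by rewriting an application of such an operator into an application of eq. The sketch below only manipulates term strings; the table SPECIFIC_TO_CLASS is an assumption made for this illustration and covers only those operators whose class argument can be read off from the text.

```python
from typing import Dict

# Assumed correspondence between type-specific equalities and the class
# argument that eq would receive instead.
SPECIFIC_TO_CLASS: Dict[str, str] = {
    "stringEq": "StringC",
    "intEq": "IntegerC",
    "realEq": "RealC",
    "anyEq": "OclAnyC",
}


def as_generic_eq(fun: str, left: str, right: str) -> str:
    """Express a type-specific equality through the parametrised eq.

    The first argument of eq selects the type on which the comparison is done.
    """
    return f"eq {SPECIFIC_TO_CLASS[fun]} {left} {right}"
```

Both variants take the same two arguments to be compared; with eq the type is simply chosen as an additional first argument.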
With so many ‘different’ comparison operators, the user is given an unnecessary choice,
which needlessly puts more mental work on him.
7.10. I wanted to fill in the next question mark . . .
When one metavariable, or, as it is presented, a question mark, is completely refined,
GF jumps seemingly at random to the next, or sometimes a previous, metavariable. It does
not always stay in the current subtree, where the mental focus of the user is, but often
leaps up to earlier unrefined nodes. This distracts the user from the goal he wanted
to refine, since he is drawn away by force.
Theoretically, GF should stay in the same subtree, so this behaviour is to be considered
a bug (even by the main GF developer Aarne Ranta).
7.11. I could choose self, but now nothing is offered.
In the OCL GF grammars there is one fun for self and one for result.
fun self : (c : Class) -> VarSelf c -> Instance c ;
fun result : (c : Class) -> VarResult c -> Instance c ;
result, in English for example, is always linearised as the result, while self is linearised
as the pay card , if the class of self is PayCard, or in general, the c for a class c. For this
linearisation, the class argument of self is used.
But the VarSelf/VarResult3 arguments do not appear in the linearisation. And there are
no funs returning these types. They only occur as special bound variables introduced
in an OCL context where self and result can actually be used, and there they are given
the correct type, as visible in the following example:
fun NOPACKAGEP_PayCardJuniorC_available_Oper_Constr : (
    VarSelf NOPACKAGEP_PayCardJuniorC -> VarResult IntegerC
    -> OperConstraintBody) -> Constraint ;
When this fun is applied, there will be one bound variable of type VarSelf
NOPACKAGEP_PayCardJuniorC, but apart from that, no other funs or variables of type VarSelf
SomeType. So self with its arguments can only be fully refined if the Class parameter of
self is NOPACKAGEP_PayCardJuniorC. Otherwise, no VarSelf variable or fun would be
applicable for the second argument.
And that is a problem for the editor. GF tolerates arguments for which there are no
possible refinements. So self can be used for any class, since it returns an Instance c, and
the c can be set arbitrarily for GF, with the missing refinement for the VarSelf argument
3 Later on, only self will be mentioned, but result is also meant; both are treated the
same way except when noted otherwise.
being no problem. So self will always be offered in the refinement menu.
Only in cases where the supertype is OclAny4 or unrefined is this no problem, since
every model class or return type would be allowed there, including the correct one.
Refining with self at a position where it is not applicable leads to a situation where
a VarSelf c fun is expected, but no fun for VarSelf c exists. The user will only see
that nothing is offered in the refinement menu, as shown in figure 7.6. In fact,
that is the same behaviour as when a type-incorrect upcast with coerce is tried: one
guard argument cannot be filled in. But the fact that this really is an error is not
conveyed to the user.
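The condition under which self or result can be refined completely can be written down as a simple check. This is merely a sketch of the condition described above, not something GF or the editor does: the function name and the representation of bound variables as type strings are assumptions.

```python
from typing import Iterable


def special_variable_applicable(
    kind: str, expected_class: str, bound_variable_types: Iterable[str]
) -> bool:
    """Check whether self (kind 'VarSelf') or result (kind 'VarResult') can be
    fully refined for the expected class.

    The second argument of self c needs a bound variable of type VarSelf c,
    and likewise for result with VarResult c.
    """
    return f"{kind} {expected_class}" in set(bound_variable_types)
```

With the bound variables introduced by the constraint fun shown above, the check succeeds for self only for NOPACKAGEP_PayCardJuniorC and for result only for IntegerC.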
Figure 7.6.: The user refined with self for a wrong type and is now stuck.
As mentioned in section 8.1, the editor is stateless over several GF runs. This also means
that the graphical representation of the abstract syntax tree is rebuilt from scratch after
every command sent to GF that changes the state of GF. So there is no memory of
which nodes were expanded before.
4 Note that there actually is one exception to this: if an operation returns an array,
this will be mapped to the OCL type Sequence by the grammar generator. Then result will
not have a type which is a subtype of OclAny. But this is ignored in the editor right now.
8.1. Overview
Figure 8.1.: An overview of the different components and how they interact with
each other (components shown: TogetherCC, KeY, OCL Parser, Grammar Generator,
OCL GF Grammars, Model Grammar, GF, GF Editor)
The Java editor is in principle just a GUI for authoring documents with GF. It displays
the editing state of GF, which runs as a separate process, and gives the user the
possibility to interact with this state. The editor does not process the user’s commands
itself, but sends them to GF. After that, GF sends the complete new state to the editor,
which displays it to the user.
The GUI itself is almost completely stateless across consecutively issued commands. That
means that the editor forgets everything about the tree and the linearisations before
receiving them anew. Being stateless is one of the main design decisions of the editor.
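The stateless principle can be sketched as follows. This is not the code of the Java editor: the class StatelessView, the backend function and the shape of the state (a dictionary with a tree and a linearisation) are assumptions chosen for illustration, and the real communication with the GF process is replaced by a fake backend.

```python
from typing import Callable, Dict, List

State = Dict[str, str]


class StatelessView:
    """A view that only ever shows the last complete state it received."""

    def __init__(self, backend: Callable[[str], State]) -> None:
        self._backend = backend
        self.shown: State = {}

    def send(self, command: str) -> None:
        # The command is not interpreted here. The old display is thrown away
        # and replaced by the complete state that the backend returns.
        self.shown = self._backend(command)


def make_fake_backend() -> Callable[[str], State]:
    """Stand-in for the GF process: it alone keeps the editing state."""
    history: List[str] = []

    def backend(command: str) -> State:
        history.append(command)
        return {
            "tree": " ; ".join(history),
            "linearisation": f"after {len(history)} command(s)",
        }

    return backend
```

The price of this simplicity is visible in the sketch as well: anything the view knew about the previous display, such as which tree nodes were expanded, is gone after every command.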
Although the editor ‘talks’ only to GF, more components are involved in actually
getting the OCL constraints into GF and thereby into the editor. These are
shown in figure 8.1. The system plugs into TogetherCC, a CASE tool, as mentioned
in section 6.2. TogetherCC features a so-called single-source model. That means that
the UML model is not saved in an extra representation, but is always recreated from
the class files themselves. Because of that the class files also were the logical place to
store the OCL constraints, which are annotations to the model. This is implemented
with custom JavaDoc comments and used throughout KeY.
A plug-in of TogetherCC takes the model and exports it into an easily parseable format
which is read by the grammar generator1 . Then the grammar generator creates GF
grammar modules for each class and property in the model. Not only the abstract
grammar is created that way, but also concrete grammars for OCL, English and German.
If a previous OCL constraint is available, the plug-in then feeds model and constraint
into a special OCL parser and type-checker2 , which in return gives the GF AST for
the OCL constraint back. This AST, or if no previous OCL constraint was found, a
generated skeleton, is then given to GF. Then the editor is started and the constraint
can be edited there.
The GF grammars for the standard OCL functions and types are used in GF and thereby
can be used by the user in the editor.
After the user has finished editing or creating an OCL constraint, it is given to the plug-in, which tells
TogetherCC to save it as a JavaDoc comment. If the constraint is not yet finished, it is
saved directly as the GF AST, since the parser is not able to parse incorrect or incomplete
OCL. A finished OCL constraint, which is saved as standard OCL, can now be used
with the KeY prover.
8.2. State of the affairs at the beginning of this work
In 2003, Kristofer Johannisson integrated the editor and his grammars into TogetherCC.
With that, it was possible to call the editor from the context menu of a class or method in
the CASE tool to edit a class invariant or a pair of pre- and postcondition. A Java class exported
the elements of the UML model into funs for GF, so they could be used as elements of
the AST. When the editor was closed after ‘writing’ a constraint, the constraint was saved both as OCL
and in the GF AST representation, as a custom JavaDoc comment for the respective
class or method. Saving in GF AST syntax was necessary because, due to circular rules in
the grammars (see [HJR02] for more details), OCL could not be parsed with the
built-in GF parser.
The integration had been done for GF 1.1, which did not feature the module system introduced in GF 2.0. In [Dan03], however, together with the creation of the German grammars,
the grammars for OCL, and later those for English, were updated to use this module system.
That meant that the improvements implemented later, like support for Java packages
and the formatting mechanism of David Burke in [BJ05], were not usable in the version
that had been integrated into TogetherCC.
In [Joh04] a system for disambiguating implicit OCL constructions is presented. The
reason for this was to build an OCL parser which can be used for the GF OCL grammars.
This parser was working at the beginning of this thesis, but it was not integrated into
TogetherCC, so parsing OCL was still not supported there.
¹ written by Kristofer Johannisson in Haskell
² also written in Haskell by Kristofer Johannisson
[Johar] describes the state of the system after the author implemented the integration
of the new grammars using the OCL parser. Since [Johar] was written around the
time when this work was started, it mentions none of the improvements of the editor
introduced in this work, but the basic work-flow has not been changed, so with regard
to that it is still valid.
8.3. Support for the new grammars
In section 8.2 above several unconnected components are mentioned. Hence, the task
was more or less to wire them together.
The model export into the intermediate representation was already implemented, but it was only
used to create an export file for the batch translation of OCL constraints stored in a
separate file. A reference to this export file is given to the grammar generator, which
takes this file and generates the GF grammar files in a temporary
directory.
For the OCL parser to work, an OCL file with package and context information is
needed. This file is expected to have a layout like the one seen in figure 8.2. The
parser requires the package and context declarations, but since the OCL constraints in
TogetherCC are already attached to their respective context, they neither contain nor need
these declarations and should not carry that redundant information either. Hence, the declarations
are added to the OCL by the plug-in.
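How the plug-in might wrap a bare constraint so that it matches the layout of figure 8.2 can be sketched like this. The class and method names are hypothetical, and the sketch handles only a single constraint of one kind; it is not the plug-in's actual code.

```java
/** Hypothetical sketch: wraps a bare constraint in the layout shown in figure 8.2. */
final class OclFileSketch {

    /**
     * @param pkg        package name, e.g. "javacard::framework"
     * @param context    context declaration, e.g. "AID"
     * @param kind       "inv", "pre" or "post"
     * @param constraint the bare OCL expression as stored in the JavaDoc comment
     */
    static String wrap(String pkg, String context, String kind, String constraint) {
        return "package " + pkg + "\n"
             + "context " + context + "\n"
             + kind + ":\n"
             + "  " + constraint + "\n"
             + "endpackage\n";
    }

    public static void main(String[] args) {
        System.out.print(wrap("javacard::framework", "AID", "inv",
                "not (self.theAID = null)"));
    }
}
```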
After parsing, the package part is stripped from the GF AST, since it is assumed that
the user does not want to see all the time which package he is in. The context part,
on the other hand, is needed by GF, as bound variables like operation parameters are
declared there. Without this part in the AST, they could not be used in the editor.
When no previous OCL constraint is available, a skeleton is created (see figure 8.3 for
an example). This skeleton only includes the context, but no package declaration, since
it is created on the Java side and not within the OCL parser, where the package would
be needed. That brings with it a synchronisation problem regarding naming conventions.
The plug-in has to produce the same fun names as the grammar generator does,
since the skeleton may only include existing funs. If the grammar generator changes, the
plug-in has to be adapted as well.
When the skeleton is created, the names of the bound variables can be set to the actual
parameter names from the model. The skeleton is then given to GF as a tree. Alternatively, the editor could refine the tree step by step, as a user starting from scratch would
do. But that way, the parameter names would be x_1, x_2³ and so on, which simply
would not match the model.
³ They could get renamed with the alpha command, but why make it unnecessarily complicated?
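A minimal sketch of building such a skeleton with the real parameter names is given below. The textual shape is copied from figure 8.3 and collapsed onto one line; the class name, the method name and the idea of passing the context fun name in ready-made are assumptions for illustration, since the actual naming convention lives in the grammar generator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: builds a method-contract skeleton shaped like figure 8.3. */
final class SkeletonSketch {

    /**
     * @param contextFun fun name for the context; it must match what the grammar
     *                   generator produces and is therefore passed in, not derived
     * @param paramNames the operation's parameter names taken from the model
     */
    static String methodSkeleton(String contextFun, List<String> paramNames) {
        List<String> bound = new ArrayList<>();
        bound.add("this");          // the receiver is always bound first
        bound.addAll(paramNames);   // real names instead of x_1, x_2, ...
        // The two question marks are the open nodes for pre- and postcondition.
        return contextFun + " (\\" + String.join(", ", bound) + " -> prepostCt ? ?)";
    }

    public static void main(String[] args) {
        System.out.println(methodSkeleton(
                "NOPACKAGEP_PayCardC_charge_IntegerC_Oper_Constr",
                Arrays.asList("amount")));
    }
}
```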
package javacard::framework
context AID
inv:
    not (self.theAID = null)
inv:
    self.theAID->size() >= 5
inv:
    self.theAID->size() <= 16
context KeyEncryption::setKeyCipher(keyCipher : Cipher)
post: cipher = keyCipher
endpackage
Figure 8.2.: An example OCL file as expected by the OCL parser. This one is taken
from [BJ05].
NOPACKAGEP_PayCardC_charge_IntegerC_Oper_Constr (
  \this, amount -> prepostCt
    ?
    ?
)
Figure 8.3.: An example OCL skeleton for the method charge of the PayCard example as
a GF AST. The first line sets up the context. In the second line two bound
variables are introduced in the subtree starting with prepostCt. This fun
takes the pre- and the postcondition as its arguments, which are the question
marks in the next two lines. They are question marks in the skeleton,
because they are not yet refined, but instead open nodes, or metavariables
as they are called in GF.
A restriction of TogetherCC for the current saving mechanism is that every custom
JavaDoc tag can only be saved once per class or method. That limits the number of invariants, preconditions and postconditions to one each, although OCL allows more, as visible in
figure 8.2. To prevent the user from creating unsaveable OCL constraints, the skeleton
for method contracts contains a special fun prepostCt that does not take lists of pre-/postconditions as its arguments, but just one Sent for each of them (see figure 8.3). For
class invariants, the fun invCt is used, which takes only one Sent as an argument. The
parser, in contrast, returns list constructions, even if only one invariant, precondition or
postcondition is in the parsed OCL. That is a known, but not yet fixed, deficiency.
If no parseable OCL, but a GF AST is found in the JavaDoc, the AST is used instead.
Parseable OCL is preferred over the GF annotation, however, and only one of the two
is saved, while the other one, if present, is deleted. That way a need for merging is
circumvented. The GF AST is expected never to be changed by hand, so if complete
OCL shows up in the JavaDoc, it has to be more recent, since otherwise the
editor would have deleted it. This OCL is taken instead, and the GF AST is deleted
when the constraint is saved the next time.
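The preference order just described fits into a few lines. The names below are hypothetical, and null is used here merely as a stand-in for "no such annotation in the JavaDoc".

```java
/** Hypothetical sketch of the rule for choosing what to load into the editor. */
final class ConstraintSourceSketch {

    enum Source { OCL, GF_AST, SKELETON }

    /** Either argument may be null when the JavaDoc holds no such annotation. */
    static Source choose(String parseableOcl, String gfAst) {
        if (parseableOcl != null) {
            return Source.OCL;      // complete OCL is assumed to be the more recent one
        }
        if (gfAst != null) {
            return Source.GF_AST;   // unfinished constraint saved as a tree
        }
        return Source.SKELETON;     // nothing saved yet: start from a generated skeleton
    }

    public static void main(String[] args) {
        System.out.println(choose(null, "prepostCt ? ?"));
    }
}
```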
The editor is given a callback class that takes care of saving the OCL as JavaDoc comments when the program is closed. That way the editor has no compile-time dependency
on TogetherCC. The callback classes are instantiated in the interface plug-in and extend
an abstract class which just offers the saving methods, but does not depend on TogetherCC itself. Only this abstract class is referenced in the main editor class. The
callback classes strip the context information from the OCL linearisation given by GF
and, in case of a method contract, split the constraint into the pre- and the postcondition to save them independently.
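The dependency structure can be illustrated with a small sketch. All names are hypothetical; the recording subclass merely stands in for the plug-in side, which in the real system would call into TogetherCC and also strip the context and split pre- and postcondition.

```java
import java.util.ArrayList;
import java.util.List;

/** Editor side: the only type the main editor class refers to. */
abstract class SaveCallbackSketch {
    abstract void save(String oclLinearisation);
}

/** Plug-in side (hypothetical): would talk to TogetherCC; here it only records. */
final class RecordingCallback extends SaveCallbackSketch {
    final List<String> saved = new ArrayList<>();

    @Override
    void save(String oclLinearisation) {
        saved.add(oclLinearisation);
    }
}

/** Stand-in for the editor: has no compile-time knowledge of the concrete callback. */
final class EditorShellSketch {
    private final SaveCallbackSketch callback;

    EditorShellSketch(SaveCallbackSketch callback) {
        this.callback = callback;
    }

    /** Invoked when the user confirms saving on exit. */
    void closeAndSave(String oclLinearisation) {
        callback.save(oclLinearisation);
    }
}
```

Because EditorShellSketch only mentions the abstract class, it could be compiled and used without the CASE tool on the class path.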
When everything is set up, GF is told where the temporary grammar files reside and is
started. When the grammars are loaded, the AST is given to GF. Now the user can start
editing.
8.4. Basic work-flow
The editor is started from the context menu of a class or method (as in figure 6.10). The
user is then at a node of type Sent, which in this context means a boolean proposition
(as explained in section 7.7). The description above the refinement menu tells us⁴ that
we are editing the precondition of a method. Here we can select one boolean proposition.
If we want more than one precondition, we have to use andS to conjoin several of them,
for the reasons explained in section 8.3. One can select the wanted
proposition from the list in the refinement menu and double-click it. This command is
then executed and the state updated. Then the work continues with the standard editing
steps as described in section 6.1.3, except that the grammar now allows one to construct
OCL constraints instead of stone age sentences.
Intermediate saving in the model is not yet supported; the save buttons will save the
tree or linearisation in an extra file. Only when exiting is the user asked whether he wants to
save the constraint.
⁴ This section anticipates the state of the editor after the implementation of the improvements mentioned in section 9.
Figure 8.4.: The editor after choosing Edit Pre/Postcondition [GF] in the context
menu of TogetherCC. An entry in the refinement menu has already been
clicked, but not been sent to GF (needs a double-click).
This chapter describes what has been done to overcome the problems found in
section 7. Some of the items below were not listed there, but were nevertheless
regarded as worthy of improvement.
Everything here has been implemented, except the ideas in section 9.6, the parts of section 9.3 marked as such
and the discarded ideas in section 9.1.
9.1. The next refinement ...
9.1.1. Refinement descriptions
The problem in section 7.1 is that the refinement menu is a huge list of cryptic names.
And even if the names were understandable, they would not help someone who does
not know what the respective OCL functions do. As a solution, the names could be
made longer, following a special nomenclature that is rendered in a special way:
underscores could be replaced by spaces, and so on. But, although possible,
that would make the fun names very unwieldy and cumbersome for the grammar writer
and the AST unreadable, since the funs are shown there in their full glory (something which
would then have to be changed). Changing a description would also make unfinished
constraints unloadable, since the old descriptions would still be used as the fun names in the
saved GF tree. Descriptions do not play a purely functional role as the fun names
do. Therefore, description and abstract syntax name should stay separate, and solutions
other than the abstract fun names need to be found.
In the editor one can switch the language of the refinement menu. The
linearisation of each fun in the selected language is then displayed there instead of the
abstract fun name. Figure 9.1 shows an example. Since GF does not
know how the arguments of each function are linearised, their type with an appended
number is used instead. Most often this will just be Instance. What kind of Instance is
not displayed, so collection functions, for example, cannot be recognised that way.
Entries like r [] can also occur this way. This happens because the linearisation of
lists depends on the number of list entries and is defined in a table. GF defaults
to the first parameter, which is normally 0, and the empty list gets linearised as the
cryptic []. To remedy this, the order of the parameter tables would have to be changed.
But the other problems described in this section would not go away with that.
Figure 9.1.: The situation from 7.1, but this time with English as the menu language
There is yet another way to get information into the refinement menu. GF offers a
mechanism called printnames. In the grammars they are defined like in the following:
-- in the abstract grammar:
fun
  cryptic : OneType -> AnotherType ;
-- in a concrete grammar:
lin _ _
  cryptic = "not very helpful linearisation" ;
printname cryptic = "speaking name" ;
Technically, printnames are just strings. They can be added to any fun in a concrete
grammar module. This string is then displayed in the refinement menu instead of
the abstract fun name. Printnames make it possible to help the user recognise what a fun does,
but in the case of OCL this had not been done before this work began.
Still, this one-line string does not offer enough room for a description of how the
OCL operation is defined or what exactly it does.
Tooltips could help here. Although they should not get too long, they can hold more than the
single short line a name consists of. The tooltips could contain a
short description of the command, what it does and what it can be used for. They could
be further enriched with information about their parameters, as JavaDoc, for example,
offers it. An example is shown in figure 9.2. For properties of model elements,
their fully qualified name can be given like in the following:
But there was no way to display tooltips which differ from the displayed text,
let alone a way to tie that different text to the funs. How this is actually
done is discussed in section 9.1.4.
Figure 9.2.: The OCL collection operation forAll, explained with a tooltip
9.1.2. Grouping of refinement menu entries
At the beginning of this work, the refinement list featured neither grouping of the offered
funs nor hierarchical access to them. In a way they are already grouped in GF, since at
any place in the AST only type-correct refinements are offered. But
since GF features no subtyping, no further subdivisions are possible on the type level.
So we need a different way to tag funs.
One way would be a special nomenclature. The fun names could get a special prefix,
with a divisor character between category/namespace name and the real
fun name. All funs with the same namespace prefix would go into the same submenu
of the refinement menu.
But that way, those prefixes could not contain special characters, and at least one of the
funs would have to have a huge name consisting of the complete description, which would
be unwieldy. Thus, to give these prefixes longer descriptions, one could use one special
cat Namespace in GF, introduce a special fun of this cat for every prefix and give
it a speaking name as the printname. This printname can then be used as the submenu label for
all funs of one namespace, with the mapping between prefix and description handled by
the editor. Even a namespace hierarchy is possible that way.
But as explained above, changing the submenu of a fun would make old trees unloadable,
although at least the descriptions of the subcategories could be changed without that problem.
Another way would use the module system of GF by putting all funs of each subcategory
into a separate module. This would greatly increase the number of files, but since the
definitions of funs do not depend on each other, this would be feasible. One
would then have to keep track of all these files, though, where paragraphs within a single file would do the
same job. Whether that is an advantage or not boils down to a question of style.
Giving the modules better display names would again have to be done with special funs
and their printnames, since modules do not appear in the abstract syntax as palpable
entities; they just organise. And finally, renaming a subcategory would be a bit brittle,
since one would have to change the name in all files using the changed module.
With both ways, the grouping would be available on the abstract syntax level, since it is
encoded in the fun names. With special fun names that would be directly the case, while
with modules the fully qualified name would become something like module subcat.funName.
Since funs are given with this fully qualified name in the list of possible commands for
the refinement menu, this grouping information would always be available. Thus, for
both ways, the grouping would be available for all menu languages in the editor.
9.1.3. Parameter descriptions
In helping the user to write OCL, there are two different levels. The more abstract one
relieves the user of formulating OCL at all. Help there would be one step up on
the semantic level, perhaps something like a wizard system or the specification patterns
described in [BH03]. The user somehow chooses wanted properties and OCL is generated
for them.
On a more concrete level only the syntax is handled by the system; the OCL is still formulated, but not written, by the user. That is the level this work operates on: to help
people write OCL who know what they want to formulate, but do not know its syntax. This part is taken over by the grammars and GF, so that the user does not need to
know the OCL syntax. But the user still sometimes has to know about the grammars.
At any time one refines a child node of another node (the top node is inserted automatically on start-up). But it is not clear for all of these child nodes what is actually
expected, as it is in figure 9.3, for example. Hidden type arguments are especially hard
to guess. Figure 9.4 is an example of that.
Here two views are possible. One says that these type arguments only deal with the
lack of subtyping in GF and should be hidden from the user completely, since they only
specify the types of other Instances, and these other Instance arguments are what really matters.
Only they show up in the linearisation. If they are refined, the types can be filled in
automatically.
In OCL, upcasts are done implicitly, so why bother the user at all with having to deal
with them explicitly? Why spend energy on making things easier that should not be there
at all? In the author’s opinion, that is a valid point.
Figure 9.3.: Without additional help, the linearisation gives enough of a clue as to what this sentence in the implication represents.
Figure 9.4.: Here one can only guess what this hidden type argument does.
But as long as types of arguments are not yet refined, GF cannot help with reducing the
offered refinements (e.g. in figure 9.5). Selecting first the type, and then the instance,
will result in a smaller refinement menu (as in figure 9.6). Funs whose return type in turn
depends on one of their arguments will still be offered, though. But if the argument on
which the expected type depends is chosen, GF can automatically set the type argument
of the callee, as shown in an example in figure 9.7. The earlier such an argument is
refined, the more GF can do automatically, because it knows what to do. And
that makes helping the user to refine early worthwhile again.
Figure 9.5.: The receiver argument of propCall is not refined, so all properties of all classes
are listed
Figure 9.6.: Now the receiver class is fixed and thus only the properties of that class
are listed
Figure 9.7.: The type argument of eq has been refined to Integer. With that, the type
of the arguments of eq is fixed to Instance IntegerC. The return type of
propCall in turn depends on its second argument, so that one could be filled
in automatically.
Thus a way is needed to add descriptions to parameters. Then, when the user wants
to refine a parameter, the editor can tell him what this argument does and what is
expected. This would be a means to help with the problem described in section 7.3.
Tooltips, again, can help here. Thus, every unrefined node in the graphical AST representation gets as its tooltip the parameter description of its parent node that belongs to it,
as shown in figure 9.8. But when refining, the mouse cursor is not always
in the tree. So the generic sentence Select Action on Subterm above the refinement
menu is replaced by the current parameter description. That way, this label gets more
specific and helpful, even to a user who has used the editor before and knows what the
refinement menu does. This is also shown in figure 9.8.
Figure 9.8.: Figure 9.4 revisited: The text above the refinement menu and the tooltip in
the AST explain, what the current refinement is for.
But this information needs to be known in some way. Somehow, it must be attached to
the parameters of the funs. Normally, in the abstract syntax, they do not have to be
mentioned by name, since only the types are relevant there.
fun negAtom : AtomSent -> Sent ;
fun eq : (c : Class) -> (a, b : Instance c) -> Sent ;
For dependent types like in eq, the names have to be given and will be remembered.
Unnamed arguments are given an internal name, as visible in the following
excerpt from the compiled form of the above grammar snippet:
fun negAtom : (h_ : core.AtomSent) -> core.Sent = {} ;
fun eq : (c : core.Class) -> (a : core.Instance $ c) -> (b : core.Instance $ c) -> core.Sent = {} ;
It is possible to give all arguments names, and with the current state of GF they would
even be remembered, but nothing in the theory behind GF guarantees that. Alpha
renaming is allowed there, but just not done (yet). So it is not a safe bet for the future
to store the names there.
In the compiled concrete form, the arguments are just numbered, so names given
there are also forgotten after compilation.
So GF’s names for the parameters cannot be used to display them to the user. And,
to go further, names do not make descriptions, so that problem cannot be solved with
GF’s own means alone, but would again need some nomenclature. This time it
would survive saving and loading of ASTs, since the parameter names do not show up
there, but the names are also not guaranteed to be remembered by GF. Also, as mentioned above,
the fun names could theoretically be overloaded with such information, but that is to be dismissed due
to the practical problems this brings along.
9.1.4. Printnames as the solution
The problem in sections 9.1.1, 9.1.2 and 9.1.3 is that, as explained there, information is
needed which is tied to funs. As explained above, this cannot be done by abusing the
fun names. But a solution that allows for extended fun descriptions could also include
a tag for grouping, so grouping would not need an extra mechanism.
The information needs to be tied to the funs of GF. Having it in extra files would just
increase the chance that one forgets to update the descriptions to match the new state
of the grammars, so that was ruled out.
A solution would be to put them as comments in the abstract grammar module, like
JavaDoc¹ does for Java source files. But that would require an extra parser for .gf files
and also access to those files at runtime.
That led to the idea of looking at printnames again. For GF, they are just strings with
no internal structure. They are parsed by GF and can be accessed there at runtime. In
fact, GF sends them to the editor for each command it offers in the refinement menu.
That way the mapping between a fun and its printname is done by GF. Why not put structure
into those strings and parse them when needed, as was considered for the overloaded fun
names?
The tooltips mentioned in section 9.1.1 are a direct candidate for that, and therefore
have been implemented that way. One part is displayed as the display name of a fun
as before, but the rest is reserved for the tooltip. Since printnames have no stringent
length restrictions, they can get as long as it suits the grammarian and as is suitable
for a tooltip.
¹ http://java.sun.com/j2se/javadoc/
Another tag was added to the printnames to assign them their subcategory in the refinement menu. This tag can take an optional argument that becomes the display text in
the refinement menu, and may also feature a tooltip. But one problem arises here. GF
sends only those printnames to the editor whose funs are applicable. But the description
would be needed every time at least two members of a subcategory are available
(see below). And lengthier descriptions should be written only once; applying them
to all funs with the same tag should not depend on copy and paste.
In addition, parsing printnames takes some time. So it would be
a good thing if parsing them over and over again, whenever GF sends something to the editor, could be avoided.
Since printnames are just strings, they do not depend on anything else and
thus never change.
Thus they are read once at the beginning, and the read printnames get cached in their
parsed representation. And once all printnames have been read, the tag with the description argument is available, too, so all other funs of the same subcategory can share this
description.
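The caching idea can be sketched as follows. Note that the printname syntax used here (three fields separated by '|', with an optional "=description" on the subcategory tag) is invented for this illustration and is not the syntax defined in appendix A; class and method names are hypothetical as well.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: parse every printname once and share subcategory descriptions. */
final class PrintnameCacheSketch {

    static final class Parsed {
        final String display;
        final String tooltip;
        final String subcat;

        Parsed(String display, String tooltip, String subcat) {
            this.display = display;
            this.tooltip = tooltip;
            this.subcat = subcat;
        }
    }

    private final Map<String, Parsed> byFun = new HashMap<>();
    private final Map<String, String> subcatDescription = new HashMap<>();

    /** Assumed format (NOT the one from appendix A): display|tooltip|subcat[=description]. */
    PrintnameCacheSketch(Map<String, String> rawPrintnames) {
        for (Map.Entry<String, String> e : rawPrintnames.entrySet()) {
            String[] f = e.getValue().split("\\|", -1);
            String tooltip = f.length > 1 ? f[1] : "";
            String tag = f.length > 2 ? f[2] : "";
            int eq = tag.indexOf('=');
            String subcat = eq < 0 ? tag : tag.substring(0, eq);
            if (eq >= 0) {
                // Written once in the grammar, shared by all funs with this tag.
                subcatDescription.put(subcat, tag.substring(eq + 1));
            }
            byFun.put(e.getKey(), new Parsed(f[0], tooltip, subcat));
        }
    }

    /** Cached lookup; no re-parsing when GF sends a new state. */
    Parsed lookup(String fun) {
        return byFun.get(fun);
    }

    /** Falls back to the bare tag if no fun supplied a description. */
    String descriptionOf(String subcat) {
        return subcatDescription.getOrDefault(subcat, subcat);
    }
}
```

Since the constructor sees all printnames, the description is found even when the fun that carries it is not applicable at the current node.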
When there is only one fun of a subcategory applicable, there is no need to hide it in a
submenu. That would just make it less visible and harder to access, without the
advantage of reducing the initial menu. So whenever there is a subcategory with just
one available fun, this fun gets displayed on the main level of the refinement menu.
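The promotion rule for single-member subcategories can be sketched like this. The data model (a map from fun name to subcategory, with the empty string as key for the main menu level) is an assumption made for the example and not the editor's actual representation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: group applicable funs into submenus, promoting lone members. */
final class RefinementMenuSketch {

    /**
     * @param subcatOf applicable fun name -> subcategory tag (null if untagged)
     * @return menu layout; the key "" holds the entries of the main level
     */
    static Map<String, List<String>> layout(Map<String, String> subcatOf) {
        Map<String, List<String>> bySubcat = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : subcatOf.entrySet()) {
            String key = e.getValue() == null ? "" : e.getValue();
            bySubcat.computeIfAbsent(key, k -> new ArrayList<>()).add(e.getKey());
        }
        Map<String, List<String>> menu = new LinkedHashMap<>();
        menu.put("", new ArrayList<>());
        for (Map.Entry<String, List<String>> e : bySubcat.entrySet()) {
            if (e.getKey().isEmpty() || e.getValue().size() == 1) {
                menu.get("").addAll(e.getValue()); // a lone fun stays visible on the main level
            } else {
                menu.put(e.getKey(), e.getValue());
            }
        }
        return menu;
    }
}
```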
In the current system, no further hierarchy for refinement menu entries is supported
since it is not needed. There never are too many entries in a submenu to justify that
work, but still the tag could feature a ’.’ for example as a divisor between namespace
parts. Thus a hierarchy would be possible, if the need for that arises at a later point.
As to the parameter descriptions of section 9.1.3, these were accomplished with an equivalent of JavaDoc’s @param. One just lists parameter names together with a description
in the order which the respective parameters have in the type signature of the fun. This
order, or to be more precise, the mapping between description and actual parameter,
cannot be checked by the compiler: without reliable names (it is not guaranteed
that they are the same at runtime as in the fun definition) there is no way
to tell apart, for example, two parameters of the same type.
The syntax of these extended printnames is explained in detail in appendix A.
Another way would be to change the printnames in GF from simple strings to records
which contain the aforementioned tags and values as fields. Grouping tag, display text and
tooltip text would be present in all printnames, while the number of parameters would
vary. If the enhanced printnames get used in other grammars besides the OCL ones, this
can still be done later, but a single grammar did not justify such a change in GF.
One aspect that has to be mentioned is that printnames belong to a concrete grammar.
They are not supported on the abstract syntax level. So they can only be used if this
grammar is loaded. For editing OCL constraints that means that when only English or
German is loaded, none of the aforementioned improvements are available.
But having this information in the concrete grammars can be advantageous, too. This
way, different sets of tooltips would be possible. One set could be a more technical description for persons
who know about formal specification but do not know OCL, like the one written for this work.
Another could be a much more descriptive version which tries to
minimise the use of technical terms and gives examples, a bit like an OCL tutorial. Or
just a translated version. Currently, German is supported as a linearisation language,
but the tooltips will always be in English. With different printname sets,
one could choose the wanted set of tooltips by selecting the menu language.
The tooltips of GF evolved in this work. At the beginning, they were just clearer
names, whereas they now serve a broader documentary purpose. They now explain and
structure the offered refinements and can even guide the user, which increases their usefulness.
9.2. HTML support
In section 7.4 a lack of formatting was stated. It was also explained why supporting a
mark-up language is not possible with the old approach. What is needed is a mapping
from a position in the output area to the AST position of the linearisation snippet that
is displayed there. With text, this is quite easy, since text always has the same length
in characters in Java. As described in section 10.1.6, with HTML the displayed length
depends on the HTML tags used and not just on their length in characters.
HTML is a hypertext language, and thus offers links. So why not use links to do this?
The answer is that a snippet from GF can consist of a lone opening tag whose closing tag may come
later on. Enclosing such a semi-tag with a link would result in <a href="[AST,position]"><tag></a>,
where <tag> stands for the lone opening tag, which would be illegal nesting. And the Swing HTML renderer just ignores
these lone formatting tags. So putting everything inside links does not work out.
One easy and fast way would be to strip the HTML tags from the output when doing
the character counting. For basic markup around a word like foo that would work, but already
<br> for a line break would mess this up, since the carriage return does count as a
character for Swing, but with stripped tags, it would not be counted when creating the
mapping.
So the editor has to look inside the HTML tags; the HTML has to be parsed to determine
how many characters the output will have. But writing a special HTML parser was
considered too much of an effort. This led to the idea of trying out how good the Swing
HTML renderer is at calculating the displayed length of incomplete HTML. A hidden
HTML area gets the HTML text in all its stages, one snippet appended after the other.
For every snippet, the length of the (not) displayed HTML before and after appending
is remembered. After the linearisation is complete, it can be rendered on the screen, and
the character positions of the individual snippets are still the same.
This approach works for all HTML tags which are forward-readable, that is, for those
which do not, when they get appended, retroactively change the positions of the already displayed
characters. Luckily, this is the case for all tags used in the HTML formatting module by
David Burke. The benefits of putting structure into OCL constraints
can be seen in figure 9.10. Even pictures can be displayed and clicked that way, as
shown in figure 9.9 with a different grammar, not one of the OCL grammars.
Figure 9.9.: A picture in the HTML display. Although pictures are not highlighted
when clicked, the corresponding node in the AST is still selected. Left of
the rendered HTML is the HTML source code. Both can be shown side by
side, if wanted.
The same way of character counting is also applicable to pure text, except that a simple
StringBuffer is enough and no Swing component has to be used for appending. As
an added bonus, the code for recording the length of the StringBuffer before and
after appending is much shorter and easier to understand than all the special cases that
were necessary when the positions were calculated in the XML string, which itself changed whenever tags
were removed.
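For the pure-text case, the bookkeeping can be sketched in a few lines. The class and method names as well as the string form of the AST positions are assumptions for illustration, and the sketch only handles a flat sequence of snippets.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: remember where each linearisation snippet lands in the output. */
final class SnippetPositionsSketch {

    private static final class Span {
        final String astPosition;
        final int start; // inclusive
        final int end;   // exclusive

        Span(String astPosition, int start, int end) {
            this.astPosition = astPosition;
            this.start = start;
            this.end = end;
        }
    }

    private final StringBuffer text = new StringBuffer();
    private final List<Span> spans = new ArrayList<>();

    /** Appends a snippet and records the buffer length before and after. */
    void append(String astPosition, String snippet) {
        int before = text.length();
        text.append(snippet);
        spans.add(new Span(astPosition, before, text.length()));
    }

    /** AST position of the snippet displayed at the given character offset, or null. */
    String astPositionAt(int offset) {
        for (Span s : spans) {
            if (offset >= s.start && offset < s.end) {
                return s.astPosition;
            }
        }
        return null;
    }

    String text() {
        return text.toString();
    }
}
```

A click at a character offset in the output area can then be translated back into the AST node whose linearisation is shown there.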
9.3. Coercions
As mentioned in section 7.5, type coercions have to be introduced manually, since GF
lacks built-in subtyping. So upcasts/coercions have to be made explicit (see figure 9.11(a)
for an example). But putting that burden on the user complicates his task. Coercions in
UML are implicit, and that is what the user expects if he knows UML. Such knowledge
can be expected, since otherwise a CASE tool which builds upon UML, like TogetherCC,
would not have been used in the first place. Having explicit type coercions clashes with
the user’s model, which is something like figure 9.11(b). So, if possible, coercions should
stay hidden and be inserted implicitly.
(a) No formatting used
(b) HTML is used to format the OCL
Figure 9.10.: Side by side comparison of no formatting and formatting
9.3.1. Where to insert?
The idea is to introduce coercions automatically at places where the access of subclasses
is expected and to hide them as long as possible. If everything goes well, coercions
should not be shown at all. But when to introduce a coercion? There are reflexive
subtyping witnesses, so adding a coercion does not hurt even if an instance of the supertype itself is chosen
as the refinement later on (as long as things go well, see below). On this
condition, they could be inserted whenever an instance of a superclass is expected.
But that is a bit too often, since a number of funs have their own type arguments, like
propCall:
propCall : ( rec , ret : Class ) -> Instance rec -> Property rec ret
    -> Instance ret ;
Figure 9.11.: Different expectations for coerce. (a) Necessary coerces shown. That’s how
GF wants it. (b) No visible signs of coerce. That’s what the user expects.
There is no need to coerce the Instance argument, since its type can be set with the first
argument rec directly to the type of the receiver. No supertype is involved here.
Having one type depending on several arguments creates a constraint in GF. When all
those type arguments match, everything is fine. But if one is changed, the constraint is
unsolvable for GF and reported as a clash. Minimising the number of possible clashes
by minimising the number of arguments which depend on others without restricting the
user is a good thing.
So how to decide when a coerce should be inserted automatically, and when not? Apparently
that depends on the funs themselves. A rule of thumb is that if the expected
Instance already depends on a type argument, as in propCall, and if it is the only such
Instance, then no coercion should be introduced. When there are two such Instances, as
in eq (already printed in section 9.1.3), then this type is probably a common supertype.
In these cases at least one coerce is necessary. So it is better to introduce two here, but
to hide them. Refining the first argument will most probably make GF fill in the type
argument.
If the type of an Instance is already defined in the grammar, as is often the case for
Collection, then most likely a coercion is needed, since those operations are often meant
to be general. That is the case for most user-defined operations from the model. Here,
an automatic coercion allows refining with subtypes. But again there are exceptions.
It is unlikely that there are subtypes of Integer, Boolean and String. In Java, which is
the main supported language of KeY and the editor, one cannot even inherit from those
types. So here an automatic refinement with coerce can hardly do any good, and almost
only harm. So again, a way to tag funs, or to be more precise, to tag their parameters, is
needed.
9.3.2. How to tag?
CoercedTo. If GF finds only one suitable refinement for an argument, it refines with that
automatically. That led to the idea of a special cat CoercedTo:
cat Instance Class ;
cat CoercedTo Class ;
cat Subtype ( sub , sup : Class ) ;
fun coerce : (b , a : Class ) -> Subtype b a -> Instance b ->
    CoercedTo a ;
This category stands for values that have been upcast, whereas Instance represents
an uncoerced type. coerce is the only fun that returns this type, so it will be applied
automatically whenever a fun expects CoercedTo in its type signature. Thus, the tagging
mentioned above takes place in the fun signature, as in the following example:
fun taggedFun : ( a : CoercedTo A ) -> ( b : Instance B ) ->
    Instance ReturnType ;
The first argument will, as said above, always be refined with a coerce by GF, and thus
all subtypes of A can be used here. The second argument is fixed to Instance B. No
coercion is possible here. This way, coercions are applied by GF, and the editor does
not have to do that.
Printnames. Another approach is using printnames again. A special flag in the parameter
description signals that a coerce has to be introduced here. The flag could be given
for any type, but will have an effect only for Instance. If the selected node is an unrefined
node for which the parameter description includes that flag, then the editor will, before
the current situation is displayed, tell GF to refine with coerce and move to its Instance
argument (the latter is something the editor has to do even with CoercedTo). Since the
flag is in the printnames, which for GF are just strings, GF will do nothing by itself.
Nothing in the types enforces an application of coerce. It’s all on the editor side.
As will be explained in section 9.3.5, the approach with CoercedTo has been discarded.
So tagging is done with printnames.
9.3.3. Refining
The intended way to introduce coerce automatically is shown in figure 9.12. As explained
above, the editor automatically selects the Instance argument of coerce. If the coerce
is hidden in the tree, only this Instance argument can be selected, since the coerce itself
is invisible in the linearisation, just as subtyping is implicit and invisible in OCL.
In the following, the definition of coerce in the abstract grammar (repeated from 7.5) is
presented:
fun coerce : ( sub , super : Class ) -> Subtype sub super -> Instance
    sub -> Instance super ;
Figure 9.12.: Before and after an automatic coerce application. For the user, coerce will
be hidden, along with its first three arguments, but to look behind the
scenes, it is shown here. (a) The second argument is about to be selected by the
user. (b) coerce has automatically been applied and the cursor switched to the
Instance argument.
The normal case is that super is already fixed. For example, when refining an argument
for a model query, each parameter of that query has a determined type. So not all types
are possible subtypes (as they would be if super wasn’t known). But GF still offers all
types of Instances for the sub argument, since it does not care whether the Subtype
argument is refinable or not. That is the same behaviour as in section 7.11 with the
VarSelf variable. Although GF does not care about that, the editor has to care, since its
goal is to help the user by preventing type errors.
So in cases where the editor can expect that only a limited range of subclasses is
possible, that is, when super is neither unrefined nor OclAnyC, it can try to offer a reduced
refinement menu for the Instance argument. To do that in the background, the editor
moves the focus to the Subtype argument and gets the list of possible refinements there.
Only subtyping witnesses whose superclass is super are listed. For each one
of those, the editor refines with it and moves to the Instance argument. The funs offered
there are suitable for this subclass of super and get stored. Then move and refinement are
undone, and the next subtype is refined, and so on. After all subclasses have been tried
out, the stored lists of funs are packed into one refinement list with duplicates removed,
which is then shown to the user.
Duplicates occur because quite a number of funs have a return type that depends on
one of their arguments. And that, theoretically, could be any type. So they could be
applied wherever an Instance is expected, and thus will occur in each of the collected
lists. So the duplicate removal is necessary.
The refinement menu now no longer contains funs whose type is clearly not a subtype
of super.
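The background procedure described above can be sketched as follows. This is a minimal illustration under assumptions: FakeSession merely stands in for the GF process, and its methods (move_to, refine, undo_refine, menu) as well as the witness names are invented for this example and are not the editor's or GF's actual interface.

```python
class FakeSession:
    """Stand-in for the GF process; its whole behaviour is made up for this sketch."""

    def __init__(self, witnesses, funs_by_witness):
        self.witnesses = witnesses              # witnesses whose superclass is super
        self.funs_by_witness = funs_by_witness  # funs offered once a witness is chosen
        self.focus = None
        self.chosen = None

    def move_to(self, position):
        self.focus = position

    def refine(self, witness):
        self.chosen = witness

    def undo_refine(self):
        self.chosen = None

    def menu(self):
        if self.focus == "subtype":
            return list(self.witnesses)
        return list(self.funs_by_witness.get(self.chosen, []))


def collect_refinements(session):
    """Try every subtyping witness and merge the funs offered for the Instance argument."""
    session.move_to("subtype")
    witnesses = session.menu()
    collected = []
    for witness in witnesses:
        session.move_to("subtype")
        session.refine(witness)
        session.move_to("instance")
        for fun in session.menu():
            if fun not in collected:      # duplicate removal
                collected.append(fun)
        session.undo_refine()             # restore the state before the next witness
    return collected


session = FakeSession(
    ["intConformsToReal", "realConformsToReal"],       # hypothetical witness names
    {"intConformsToReal": ["intLit", "propCall"],
     "realConformsToReal": ["realLit", "propCall"]},
)
print(collect_refinements(session))
```

propCall is offered for both subtypes here, which is exactly the kind of duplicate the merging step has to drop.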
(a) The Instance argument of a coerce
is to be deleted
(b) The coerce has also been deleted
and is refined again
Figure 9.13.: Before and after a delete of a coerced Instance
9.3.4. Deleting
When a refined node is selected, the refinement menu will contain the command to delete
that node with all its children. But that is not enough for nodes below a coerce. For
a new refinement the user might want to refine with a different subtype, and not with
the old one again. But the type arguments of the coerce above are still the same, since
they are hidden and cannot be deleted directly by the user. Thus, GF will only offer the
same subtype again. So these type arguments have to be removed. The original state is
shown in figure 9.13(a) and the target state in figure 9.13(b).
This is implemented with a chain command. The delete command is modified so that
first the cursor is moved to the coerce, then the delete is issued, and finally the coerce is
reintroduced. If the expected supertype is already known, GF will refine the return type
argument automatically. It would not do so if just the return type argument had been
deleted explicitly, because GF would respect that and not refine it again immediately.
Deletion of the coerce might also be necessary if the Instance node has not yet been
defined. For eq, the parametrised equality, one might first choose the type on which the
comparison should be done. When one of the arguments is then selected, the supertype
of the automatically introduced coerce will be set to that type. But when the user
now goes back and changes or deletes that type argument, a clash will occur, since the
supertype of the coerce, on which the return type of the coerce depends, is no longer the
same. The coerce now becomes visible. But if the user just selects the Instance again, the
coerce will automatically be deleted and reinserted. Figure 9.14 shows the state before
and after this.
So that clash is solved nearly without user interaction. Making this deletion completely
automatic would be preferable, though. But that would require that all
nodes that could theoretically depend on the currently changed node are checked for
whether they have to be deleted or not. And that is not done at the moment.
(a) The argument, on which the type
of the Instance arguments depends, has
been changed, so GF reports a constraint.
(b) After the previously selected Instance argument has been selected
again, the constraint is resolved.
Figure 9.14.: Deleting of coerce above an unrefined node
Figure 9.15.: Before and after wrapping with w wrapper 2. (a) Before wrapping. (b) After
wrapping.
9.3.5. Wrapping
Introducing coerce at the beginning is easy. What’s harder is how to deal with it later
on. Wrapping, the insertion of a fun between another fun and one of its arguments, is
such a case. For wrapping to work, a fun has to return the same cat that it takes as at
least one of its arguments, as with B in
fun wrapper : A1 -> ... -> Bk -> ... -> An -> Bk ;
wrapper can be inserted between a node of type B and its parent, whereby the subtree
beginning in that node will be used as the kth argument. This example (with k=2) is
shown in figure 9.15.
CoercedTo
With the first-mentioned tagging approach using CoercedTo, funs will return Instance,
but expect CoercedTo. Wrapping is only possible with funs which take the type they
return as at least one of their arguments, since otherwise the fun won’t fit in there.
And that is not the case here. GF will then not offer any wrapping commands, making
bottom-up editing effectively impossible.
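The condition for wrapping, and the reason it fails with CoercedTo, can be illustrated with a small check. This is only a sketch: categories are represented as plain strings, and wrap_positions is a name invented for this example, not a GF function.

```python
def wrap_positions(arg_cats, result_cat):
    """Return the 1-based argument positions k whose cat equals the returned cat.

    A fun can only be used as a wrapper at such a position, because the wrapped
    subtree becomes the kth argument and the fun has to fit into the old place.
    """
    return [k for k, cat in enumerate(arg_cats, start=1) if cat == result_cat]


# wrapper : A -> B -> C -> B can take the wrapped subtree as its 2nd argument.
print(wrap_positions(["A", "B", "C"], "B"))

# A fun that expects CoercedTo but returns Instance has no such position.
print(wrap_positions(["CoercedTo A"], "Instance A"))
```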
Wrapping with two funs at the same time is not supported by GF, so no sneaking in
with another coerce is possible here. So the to-be-wrapped subtree has to be cut to
make GF present the full refinement menu of all funs fitting here, and not only those
which can indeed take the cut subtree as an argument. Thus, stripping out all funs that
cannot accept this subtree would be up to the editor. In pseudo-code, that could be
done as follows (commands chained with ’;;’ can be sent in one step):
add the current subtree to the clipboard
;; delete it
for all funs ’f’ in the refinement menu
    refine with ’f’
    for all arguments ’a’ of ’f’
        if type(’a’) == Instance ?
            // coerce will be applied automatically
            move to coerce’s Instance argument
            ;; paste the cut tree
            ;; ask GF to solve constraints
            if the Subtype witness is filled in
               && the cut subtree can be pasted here
                prepare a chain command
            undo move, refine, solve, move
    undo refine
undo delete
add these chain commands to the refinement menu
These chain commands would consist of adding to the clipboard, deleting, refining with
the new wrapping fun (coerce will be applied automatically by GF), pasting the cut
tree, and a solve to fill in the rest of coerce’s arguments. If the user does not use such a
chain command, then the editor has to delete the cut tree from the clipboard to clean
up again.
This version of the algorithm allows only wrapping with the same type, not with a sub-
or supertype, but is already quite slow:
Iterating through all funs and their arguments is linear in the number of funs. Approximately,
every fun has two arguments and every second one is an Instance. So for every fun,
1 call to GF for the initial refining and 1 call for each Instance argument have to be made,
together with parsing the returned state. The undos can be chained in front of the next
command, since the state returned after them is ignored anyway, so they do not cause
additional calls to GF.
The current version of the editor does around 6 calls to GF in the background, and even
with a small linearization, where appending and rebuilding the HTML does not hurt
that much, a small slowdown is noticeable. And normally, between 20 and 30 funs are
offered by GF for an Instance FixedType. So doing 50 calls to GF in the background
would become quite annoyingly slow.
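The estimate of about 50 calls follows directly from the figures given above. As a rough sketch (the function name and the default of one Instance argument per fun are just the stated approximation turned into code):

```python
def background_calls(num_funs, instance_args_per_fun=1):
    """One call for the initial refine plus one call per Instance argument, per fun."""
    return num_funs * (1 + instance_args_per_fun)


# Between 20 and 30 funs are normally offered for an Instance FixedType.
for num_funs in (20, 25, 30):
    print(num_funs, "funs ->", background_calls(num_funs), "calls")
```

With 25 funs this gives the 50 calls mentioned above, compared to the roughly 6 background calls that already cause a noticeable slowdown.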
Therefore, CoercedTo has been discarded and tagging is done with printnames.
Tagging with printnames
With tagging using printnames, wrapping will still be possible, since funs still return
and expect Instance. So wrapping with funs that go from Instance A to Instance A is
possible without restrictions, since they are directly offered by GF, as figure 9.16 shows.
Figure 9.16.: Wrapping without CoercedTo is still possible, as long as no subtyping is involved.
But when it comes to wrapping with sub- or supertypes, things get complicated here, too.
In fact, since there will also be a coerce between wrapper and wrappee, the problem is
nearly identical for both approaches.
The following parts of this section have not yet been implemented, but at least with this
tagging approach the user can wrap if the type stays the same, something that wouldn’t
be so easy and fast with CoercedTo.
The general case for wrapping with subtyping is that the type constraints on the wrapper
fun are loosened. It does not need to have exactly the type which the original parent fun
expected as an argument; a subtype of that is enough (which might even be a subtype
of the type of the former child fun). Analogously, the wrapper fun is not restricted to
expecting the exact type of the former child fun as its argument; a supertype suffices.
In a slightly simplified, but not less general example, this looks as follows: Given are
classes C, B and A, where C is a subtype of B and B a subtype of A, and funs
aa : A -> A, bb : B -> B and c : C. With implicit coercions, an initial tree (aa c) and a
target tree (aa (bb c)) are given. With explicit coercions, this would look like
(aa (coerce C A C2A c)) in the beginning, and (aa (coerce B A B2A (bb (coerce C B C2B c))))
in the end. Figures 9.17 and 9.18 display the same example as graphical trees.
Figure 9.17.: Example situation where a wrapper fun is to be introduced that fits neither
as the child nor as a parent without coerce. The actual coerce nodes
are hidden. (a) The initial state. (b) The target state with bb in between.
Figure 9.18.: The situation from figure 9.17, but with coerce shown. (a) The initial state.
(b) The target state with bb in between.
As stated above, GF will only offer the funs for ordinary wrapping. Adding the others
to the refinement menu is up to the editor, which has to ask GF for them. This can be
done as follows:
add current node to clipboard
move up
;; delete the coerce above
;; reintroduce it
;; move to the Instance argument
for all now offered funs ’f’
    refine with ’f’
    if Subtype argument of coerce is filled in
        for all arguments ’a’ of ’f’
            if type(’a’) == Instance ?
                move to ’a’
                ;; refine with coerce
                ;; move to Instance argument
                ;; paste the cut tree
                if the Subtype argument of the coerce above is filled in
                    prepare a chain command
                undo solve, paste, move, refine, move
    undo solve, refine
undo move, refine, delete, move
add these chain commands to the refinement menu
Here, the chain command would consist of add tree to clipboard, move up, delete coerce,
reintroduce it, move to Instance argument, refine with f, move to a, refine with coerce,
move to Instance argument, paste from clipboard, clean up clipboard, solve constraints.
And, like above, if another command is sent to GF, clean up the clipboard.
A version of this algorithm for CoercedTo wouldn’t be very different: coerce
would get refined automatically, but otherwise it would be the same. And it can replace
the algorithm above, since the funs that wrap from A to A are included, too.
Additionally, this algorithm is not slower. Again, there is one initial call to GF per
offered fun, and then one call per Instance argument. Undos can also be treated as above.
But the speed problem would still be there, since ‘not slower’ does not mean ‘not slow’.
For Instance, probably most wrapping will be done with the classes IntegerC, RealC and
BooleanC. Except for RealC, these classes have no subtypes, so the wrapping
that GF offers is enough. For the rest, a button could be added, so that this
procedure is only executed on demand.
Alternatively, the procedure could run in another thread in the background, which works
while the user is thinking. For this to work, the algorithm has to be interruptible at any
stage, since it has to share the same GF process with the normal editing commands: GF
can need over 180 MB of memory and can take 20 s to load a grammar, and thus no further
instances of GF in the background are really possible. Since every cycle of the inner loop
of the algorithm is quite short, it would be enough to just let it run through and check at
the beginning of the loop whether a special termination flag is set. If it is, the undo of the
outer loop and the one for the initial commands are executed and the thread terminates.
That way, these commands could be added one after another to the refinement menu,
without making the user wait too long if he doesn’t need them.
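Such an interruptible background computation might look like the following sketch. It is written in Python rather than the editor's Java, and offer_wrappers, try_candidate and undo_all are hypothetical names; the calls to GF are replaced by simple callables.

```python
import threading


def offer_wrappers(candidates, try_candidate, stop_flag, menu, undo_all):
    """Try the candidate funs one by one and add chain commands to the menu.

    The termination flag is only checked at the beginning of each cycle, so a
    cycle that has started always runs through. The undo is executed in any case.
    """
    for candidate in candidates:
        if stop_flag.is_set():        # the user issued a normal editing command
            break
        command = try_candidate(candidate)   # one short cycle of calls to GF
        if command is not None:
            menu.append(command)      # commands show up one after another
    undo_all()                        # undo the loop state and the initial commands


stop_flag = threading.Event()
menu = []
worker = threading.Thread(
    target=offer_wrappers,
    args=(["f1", "f2"], lambda f: "wrap with " + f, stop_flag, menu, lambda: None),
)
worker.start()
worker.join()
print(menu)
```

The editing thread would call stop_flag.set() and wait for the worker to finish before sending its own command to the shared GF process.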
9.3.6. Changes afterwards
Another problem with always hiding coercions is that if the user changes type arguments
on which other arguments depend, and these arguments are already refined, there will be
a clash. To fix that clash, one has to see its source, and for that all type arguments
have to be displayed. coerce with its two type arguments is a likely candidate for
such a clash. The implemented way to deal with unsolvable constraints is to display the
coerces, so that the user knows what their type arguments are.
Also, whenever one of the ancestors of a node has such a constraint attached to it, the
node cannot be hidden, as visible in figure 9.19. Then the user at least has a chance to
spot the error. The constraint, as given by GF, will just say something like Instance
Real <> Instance Integer, but will not tell where the conflicting nodes are. Finding
that out would impose type checking on the editor, which by design should be done
by GF alone. Therefore, the display of constraints has not been touched in this work,
although improving it would be helpful for the user.
Figure 9.19.: The type argument of eq has been changed, which led to a GF constraint.
To help in resolving that, the editor displays all affected coerces.
But there are cases with subtyping errors that GF does not find. If a subtyping witness
cannot be filled in, GF will say nothing. The node cannot be refined, but that’s it for
GF.
So the editor will mark those nodes red, to signal to the user, that there might be a
problem there. An example for this can be seen in figure 9.20.
9.3.7. Collection subtyping
In OCL 1.5 there are no nested collections. But the standard is very vague and underspecified
when it comes to flattening nested collections. All that is said about them
is the following paragraph (quoted from the OCL 1.5 standard (see footnote 2), chapter 6.5.13):
Within OCL, all Collections of Collections are flattened automatically;
therefore, the following two expressions have the same value:
Set { Set {1 , 2} , Set {3 , 4} , Set {5 , 6} }
Set { 1 , 2 , 3 , 4 , 5 , 6 }
Nothing is said about nesting different kinds of collections, as with Sequence(Set(
OclAny)), where the elements of the inner Set would have to be ordered somehow to
become members of a Sequence. To side-step the flattening problems, the OCL grammars
from Kristofer Johannisson avoid flattening collections and allow nested ones,
since they will be allowed in OCL 2.0 (see footnote 3). Yet, that brings up a problem with
subtyping witnesses.
Figure 9.20.: The editor noticed that a subtyping witness is missing
Footnote 2: http://www.omg.org/cgi-bin/doc?formal/03-03-13
If a Subtype argument is not refined, that does not necessarily have to be a subtyping
error. Sometimes GF does not fill in subtyping witnesses automatically, even if there is
only one possible refinement. And at the moment, the editor does not check that automatically. For non-collection subtyping this normally does not occur, so that shouldn’t
be a problem there.
For Collection and its subtypes in OCL things are a bit different. The subtyping witness
for them looks as follows:
setConforms2coll : ( sub , super : Class ) -> Subtype sub super ->
Subtype ( Set sub ) ( Collection super ) ;
A Collection(Real) in the grammars (and in OCL 2.0) is a supertype of Set(Integer).
There are two subtyping relations involved here. setConforms2coll as such is for the first
one, that Set is a subtype of Collection. But then there is a nested call to a fun with
return type Subtype, since something like Bag(Set(Sequence(Integer))) is a subtype of
Collection(Set(Collection(Real))), and that can be nested arbitrarily deeply.
Footnote 3: OMG Final Adopted Specification: http://www.omg.org/docs/ptc/03-10-14.pdf
The algorithm by which the editor fills in the Subtype witnesses ‘by hand’ is as follows:
First, ask GF to solve any constraints where an open node must have the same type as a
refined one. But as mentioned above, that does not always work. Then scan the AST
in its textual representation from the root to the leaves for open nodes of type Subtype
where the two nodes above are already refined (this constellation is the same for coerce
as for the collection subtyping witnesses). If one of the two nodes were open, it could
not yet be decided whether there is a subtyping problem or not. Now, move the focus
to an unrefined Subtype node where sub and super are defined and take a look at the
offered refinement menu. If it contains just one entry, refine with that. If no refinements
are offered, we know that there is a subtyping problem (see footnote 4). More than one
refinement is impossible, since for each pair of sub and super only one subtyping relation
exists. As the next step, the editor has to look at the tree again and move to the next
node as described above.
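The scanning step can be sketched as follows. The representation is an assumption made for this example: the tree is flattened into a top-down list of Subtype nodes, each a dict with the keys sub, super and witness, and the function offered stands in for asking GF for the refinement menu at that node.

```python
def close_subtypes(nodes, offered):
    """Fill in open Subtype witnesses top-down and report the unfillable ones.

    nodes:   Subtype nodes in top-down order; None marks an open position.
    offered: callable (sub, super) -> list of witnesses offered at that node.
    Returns the line indices of nodes for which no witness was offered.
    """
    problems = []
    for line, node in enumerate(nodes):
        if node["witness"] is not None:
            continue                  # already refined
        if node["sub"] is None or node["super"] is None:
            continue                  # cannot be decided yet
        menu = offered(node["sub"], node["super"])
        if len(menu) == 1:
            node["witness"] = menu[0]     # the only possible refinement
        elif not menu:
            problems.append(line)         # subtyping problem, to be marked red
    return problems


# Hypothetical witness table; only Integer <= Real is known here.
table = {("Integer", "Real"): "intConformsToReal"}
offered = lambda sub, sup: [table[(sub, sup)]] if (sub, sup) in table else []

nodes = [
    {"sub": "Integer", "super": "Real", "witness": None},
    {"sub": "String", "super": "Real", "witness": None},
    {"sub": None, "super": "Real", "witness": None},
]
print(close_subtypes(nodes, offered), nodes[0]["witness"])
```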
Here, in contrast to what was noted above, some knowledge about the tree is preserved.
The tree is not rebuilt, since just its textual representation is scanned, and rechecking
the same first undefinable node again and again is prevented with a line counter. The
check is directed strictly downwards. This works, since only such nodes are touched
where sub and super are already refined. With that precaution, no nodes above will
notice any changes, since they can at most rely on those two arguments, but never on
the Subtype argument.
This procedure is, at the moment, not run automatically for speed reasons and because
nested collections are supposed to be quite rare. There is a button Close Subtypes,
which executes it.
9.3.8. Summing up coercions
The goal of making coercions completely implicit, as they are in OCL, has not yet been
reached. In some cases, when deleting or changing type arguments, the user will still
catch sight of them. But as long as the user doesn’t change his mind and doesn’t wrap funs,
he shouldn’t have to do anything with them. For example, the tree in figure 9.11(b) was
created with the current implementation. And no coerce was visible during the creation of
the tree in that screenshot, either.
One problem that sometimes remains is that the refinement menu contains too many
entries. Funs depending on type arguments will always be offered, regardless of whether
the type arguments can be filled in or not, which for speed reasons is not checked.
Footnote 4: Note that this knowledge gets forgotten later on, since the editor holds no
knowledge about previous states of the tree when the tree is rebuilt.
9.4. Minor Improvements
9.4.1. Suppression of self and result
To remove the superfluous self and result described in section 7.11, the editor refines with
them in the background, moves the focus to the VarSelf argument, checks whether that
position has been filled in automatically, and undoes the refinement afterwards. If the
argument was filled in, self or result, respectively, is left in the refinement menu;
otherwise it is removed.
This works, as long as the type of the expected Instance is fixed. Then the type argument
of self will be constrained to be that class, and only then GF checks, if there is just one
possible refinement for the VarSelf argument. If there is, the editor will show self in the
refinement menu, otherwise the editor will hide it.
But when the expected type of Instance is not fixed (as for example below a coerce where
several subtypes are possible), the Class argument of self won’t be filled in in the first
place. That leaves the scene open for the VarSelf c guard attribute. This can now be
filled in, which triggers the refinement of the Class argument. So now, all arguments of
self are refined. And that is the case in any OCL context, regardless of the type of self.
That is also the case for result, if the context is a method contract of an operation that
does return a value. And since self could be completely refined, it will be shown in the
editor (see footnote 5).
This is OK as long as the expected type of self does not have to be a subtype of another
type. But in cases where self has a type which is not a suitable subtype, self would still
be offered, since for GF the lack of a suitable subtyping witness is no problem. So in case
the VarSelf argument is refined, the editor checks whether this self is the Instance argument
of a coerce. If it is, it will also check the Subtype argument, and if that is not filled in, it
will hide self. The Subtype argument is automatically refined, since there can only be
one. That GF sometimes does not close the Subtype argument for Collection is hardly
ever a problem: self will never have a Collection type, and result only in the rare case
mentioned above.
To recapitulate, self and result are (nearly) only offered when they are applicable;
otherwise they do not show up in the refinement menu.
9.4.2. Easier access to properties of self
Users who have been shown the editor missed easy access to properties of self. If
properties are accessed, these are most probably properties of self, since that is the
instance for which the constraints should hold. So they wondered why these could not
be accessed directly.
Footnote 5: Note that when the Class argument gets refined, that can imply for GF that
other unrefined Class arguments that depend on it get refined, too. But the undo command
will also undo these refinements, so that won’t mess things up here.
That can be done now. Whenever an Instance is expected, the editor refines with a
property call for self in the background and adds all funs that GF offers there as chain
commands to the current refinement menu (see figure 9.21 for an example). These chain
commands consist of refining with implPropCall, the generic property use fun that omits
self in the output, a move to the receiver argument, the refinement with self, a solve
to tell GF to fill in possibly still open type arguments, a move to the actual property
argument of implPropCall, and finally, the refinement with the chosen property.
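The structure of such a chain command can be sketched as follows. The step texts are descriptive placeholders and not actual GF command syntax; only the ’;;’ separator and the names implPropCall and self are taken from the text, and the property name abs is a hypothetical example.

```python
def property_of_self_command(prop):
    """Build the chain command that accesses a property of self in one step."""
    steps = [
        "refine with implPropCall",        # generic property use, omits self in the output
        "move to the receiver argument",
        "refine with self",
        "solve",                           # let GF fill in open type arguments
        "move to the property argument",
        "refine with " + prop,
    ]
    return " ;; ".join(steps)


print(property_of_self_command("abs"))
```

One such command would be generated for every property that GF offers for self, so that each menu entry triggers the whole sequence at once.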
Figure 9.21.: Direct access to the suiting properties of self, here for Integer
The problem with this approach is similar to the one mentioned in section 9.4.1. If the
expected type is not yet fixed, GF will offer all properties of self. Whether that will
lead to a clash later on is not checked.
Therefore, whenever the focus is below a coerce where the subtype argument is not filled
in and the supertype is not OclAnyC, the refinement menu for the properties of self is
calculated differently. The editor will refine with all possible subtypes as in section 9.3.3
and then collect the refinements for each subtype as above. The list of refinements
collected this way will not contain properties which do not have a suitable subtype.
Only this list is used for creating chain commands like above, which are then presented
to the user.
But this checking involves a number of calls to GF that are only needed when such a
property is to be accessed. Therefore, the check in the background is deferred, and
initially only a special placeholder entry for self is listed in the refinement menu. Only
when the user clicks that is this submenu computed.
There is one complication for the creation of the chain commands, namely when only
one property is available at all for the given supertype. GF will then refine with that
automatically. So the editor has to check, after trying out all possible subtypes, whether
only one refinement was offered for all subtypes (a delete will always be sent to GF at the
position where the properties will show up, to make single refinements appear there
as well). If that is the case, a modified chain command will be offered to the user, which
does not contain the actual property, since refining at an already refined AST node will
produce an error from GF and must be avoided.
9.4.3. Comparison operators
Comparisons for Integer
The comparison operators for Integer have been removed. With the transparent handling
of coercions these are not needed anymore. The refinement menu for the operands will
only contain funs of type Instance IntegerC or Instance RealC (see footnote 6), and
coercions are hidden. With that, the reason for separate Integer comparisons is gone.
Equalities
Since eq is a general equality that can play the role of any other equality,
the others are superfluous. For the model classes, specific equalities never existed, so
there was an inconsistency, and the user had to choose either anyEq or eq for them anyway.
Now all equalities except eq and its counterpart neq have been removed. These can
handle all others.
But eq is not perfect. It, too, has its problems (which would be the same for the specialised
equalities), namely when subtyping comes into play. eq takes a class c, on which the
comparison should take place, as an argument on which the two others depend:
fun eq : ( c : Class ) -> (a , b : Instance c ) -> Sent ;
But one of the two instances may have a subtype of c. The dynamic type of the instance
of c may still be the subtype, so that equality indeed is possible. OCL forbids equality
on different types, so if there is no inheritance relation between a and b, this is incorrect
OCL. No explicit upcasts are allowed in OCL, so that cannot be stated in OCL. But
nevertheless, the current eq allows such a statement.
In a discussion with Aarne Ranta, a new equality operator came up:
cat EitherSub Class Class ;
fun eitherEq : (c , d : Class ) -> ( e : EitherSub c d ) -> ( a :
    Instance c ) -> ( b : Instance d ) -> Sent ;
fun subtypeFirstSecond : (c , d : Class ) -> Subtype c d ->
    EitherSub c d ;
fun subtypeSecondFirst : (c , d : Class ) -> Subtype d c ->
    EitherSub c d ;
Footnote 6: Properties are an exception to that; there, all of them will still be listed
(see section 9.4.2).
This equality is a variant of Conor McBride’s John Major equality ([McB00]), which
takes only two class arguments, but no subtyping relation argument, so everything can
be compared with that equality, but still only the same thing is identical to itself.
In contrast, with eitherEq it can be enforced that either c is a subtype of d or vice
versa. The type EitherSub Class Class has only two constructors, which take the same
Class arguments as eitherEq. Each of them takes care of one direction of a possible
inheritance between c and d. But since there are two of them, GF will not refine them
automatically. Thus the editor would have to try refining with each of them in turn, each
time passing the class arguments of eitherEq on to them. Since for every two classes there
is at most one subtyping witness per direction, GF will refine that argument automatically
if there is one. If GF does so for neither subtypeFirstSecond nor subtypeSecondFirst, the
editor can conclude that there is a subtyping error and signal that to the user.
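The decision the editor would have to make can be sketched in Java as follows. This is only an illustration under stated assumptions: in the approach described above the check is delegated to GF's automatic refinement, whereas here a hypothetical map of direct superclasses stands in for the Subtype witnesses, subtyping is assumed to be reflexive and the hierarchy acyclic. The class name EitherSubCheck and its methods are invented for this sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch: which EitherSub constructor, if any, fits two classes. */
final class EitherSubCheck {
    /** Maps a class name to its direct superclass (stand-in for the Subtype witnesses in GF). */
    private final Map<String, String> superclassOf = new HashMap<>();

    void addSubclass(String sub, String sup) {
        superclassOf.put(sub, sup);
    }

    /** True if sub equals sup or inherits from it; assumes an acyclic hierarchy. */
    boolean isSubtype(String sub, String sup) {
        for (String c = sub; c != null; c = superclassOf.get(c)) {
            if (c.equals(sup)) {
                return true;
            }
        }
        return false;
    }

    /** Name of the constructor witnessing EitherSub c d, or empty on a subtyping error. */
    Optional<String> witness(String c, String d) {
        if (isSubtype(c, d)) {
            return Optional.of("subtypeFirstSecond");
        }
        if (isSubtype(d, c)) {
            return Optional.of("subtypeSecondFirst");
        }
        return Optional.empty();
    }
}
```

An empty result corresponds to the case in which the editor would signal a subtyping error to the user.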
Handling equality without hidden coerce is less error-prone since fewer type dependencies
are involved. On the other hand, the user will not want to deal with EitherSub
himself, so replacing that argument with a dummy node that says either ‘subtyping
correct’ or ‘subtyping incorrect’ will be enough for him.
If GF reported where constraints occur, the editor could notice when the user
has changed one of the Class arguments of eitherEq and, as a result, delete the
EitherSub argument and try to re-refine it to see whether the new Classes are valid
arguments for eitherEq.
This approach came up shortly before the end of this work and hence was not implemented.
Implementing it would entail a changed treatment of the fun eq and would make
changes in the OCL parser necessary, since the parser would have to produce a different
output.
9.4.4. Escaping hidden arguments
As mentioned in section 7.3, one could not change the focus position by clicking at a
new position in the linearisation itself when the currently selected node was not shown
in the linearisation. The author couldn’t find a reason why that was more or less
explicitly disabled. So this limitation was removed, and now the click-in functionality
of the linearisation display, in pure text as well as in HTML, works regardless of whether
a visible or a hidden node is currently selected.
9.4.5. Changed selection colour
In the previous version, the parts of the linearisation that belonged to the currently
selected node in the AST were coloured green, a colour which normally signals that
something is correct. Which, in a sense, is indeed the case: when no GF constraint
exists at the currently active node, there are no type errors there. Thus, the colour
green was chosen. In contrast, when GF constraints were present, all parts of the
linearisation that belonged to the constrained nodes were coloured red, not only the
currently active part. The user then couldn’t see which part was selected.
That there are no typing errors should be the standard case and nothing special. Colouring
that case specially is not justified; it only makes the user think that something is special
although it isn’t. Thus the standard selection colour of Swing is now used instead, since
selection, not correct typing, is what is important for the user. Only when GF
constraints are present is a different colour chosen. Then, the linearisation of the active
constrained node is highlighted in orange to make it distinguishable from the unselected
constrained parts, which are in red.
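The colouring rule just described amounts to a small decision table. The following Java sketch spells it out; the names HighlightRule and Highlight are hypothetical and not taken from the editor's source, and the mapping of the enum values to concrete colours (Swing selection colour, red, orange) is left to the caller.

```java
/** Hypothetical sketch of the highlighting rule for one linearisation snippet. */
final class HighlightRule {
    /** NONE: no colour; SELECTION: Swing selection colour; CONSTRAINT: red; SELECTED_CONSTRAINT: orange. */
    enum Highlight { NONE, SELECTION, CONSTRAINT, SELECTED_CONSTRAINT }

    /**
     * @param selected    the snippet belongs to the currently selected AST node
     * @param constrained a GF constraint exists for the snippet's node
     */
    static Highlight choose(boolean selected, boolean constrained) {
        if (constrained) {
            return selected ? Highlight.SELECTED_CONSTRAINT : Highlight.CONSTRAINT;
        }
        return selected ? Highlight.SELECTION : Highlight.NONE;
    }
}
```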
Take the lights on a car dashboard: they are off when everything is all right. No green
‘lights’ should be on unless something important has to be signalled to the user. The
selection colour, a bluish violet, is the same as in all Swing applications, so text in that
colour should be recognisable as selected text, not as otherwise specially marked text.
The selected node in the AST was always coloured in the Swing selection colour, and
since a node and its linearisation belong together, that shouldn’t be veiled by different
colours. Note that this is violated if constraints are present: then the selection colour
in the tree is still the default one. That remains to be fixed.
9.5. Generic Improvements
The following improvements have in common that they are of little use for editing
OCL constraints. But since the editor was also released as part of GF 2.3 and is expected
to be used with normal GF grammars, some improvements have been made for those,
too (besides the ones already mentioned that are not exclusive to OCL).
9.5.1. Middle-click parsing
The old editor offered parsing via a middle-click in the linearisation area. After such
a click, a small window appeared in which the user could enter a text that GF should
parse. This field was pre-filled with the linearisation in the last displayed language.
This had two problems. When the current node was open, something like ?4 was filled
in there. But the built-in GF parser can only parse question marks if they are not
followed by a number (footnote 7). So if the user wanted to give an expression containing
open subtrees, he was given a false hint.
The other problem defied the sense of multilingual editing. The last displayed language
could be one that the user does not understand. Normally he does not need to understand
it, since it has the same content as the other displayed languages and he can just read one
of those linearisations instead, which is the point of multilingual editing.
But here the user often could not just make a small change in the pre-filled text, since
it might be in a language unknown to him, as in figure 9.22.
Figure 9.22.: The parsing window called with a middle-click shows the refinement of the
subtree of the current node in the language that GF last sent to the editor.
To remedy this situation, the editor now saves the language for each linearisation snippet,
so that the user will always get the text in the language he clicked on, as visible in figure
9.23.
So now he can choose the language to be one he knows. If the node is not yet refined,
the window will be empty. Giving just a question mark to be parsed may be possible,
but is pointless, since it adds nothing new.
7
Which is an acknowledged bug of GF.
Figure 9.23.: The parsing window called with a middle-click shows the previous text in
the language the user clicked on.
9.5.2. Ask to save before quitting
In the old editor, when the user quit the program, it just quit. It didn’t ask whether the
user wanted to save the tree or linearisation. So if that happened accidentally, all
unsaved work was lost.
In OCL mode, the situation was the opposite. The constraints were always saved,
whether the user wanted it or not.
Now a message box pops up and asks (see figure 9.24). The quit can also be cancelled.
This feature is standard and there is no reason why the GF editor should differ here.
9.6. Unimplemented ideas
9.6.1. On AtomSent
The problem with AtomSent in section 7.7 has not been solved in the implementation of
this work. A work-around could be to offer all AtomSents in their positive form as Sent,
and hide the category AtomSent. That way, one would lose the nicer negated form, but
it would make these predicates accessible at all.
It could be implemented in one of two ways. One is duplication in the grammars, where
there will be just the extra form as a Sent. This would increase the memory footprint of
GF a bit. The other is that the editor asks GF at the beginning for a list of all funs that
return AtomSent, which can easily be done by starting anew with the cat AtomSent. Since
posAtom does not depend on parameters other than the AtomSent itself and needs no bound
variables in the AST, this list will never change unless the model gets changed, and that is
not supported while the editor runs anyway. So this list could be cached and added to the
refinement menu for Sent as its own subcategory. This would only increase the startup
time of the editor a bit, but shouldn’t slow down further usage.
Figure 9.24.: A simple question box asks whether the user wants to save before quitting.
But this is not yet implemented. The current way to deal with it is that the subcategory
for AtomSent in the refinement menu has the following tooltip: ‘boolean model
elements and non-standard OCL statements like "an exception is thrown"’. This is no
real solution, but it at least gives the user a slight chance to discover where boolean model
elements are hidden. A better way is definitely desirable, but since there is ongoing
work on negated forms in the resource grammars, there is light on the horizon for a real
solution without work-arounds.
9.6.2. The collapsing tree
The problem of section 7.12 exists by design: GF gives the whole state to
the editor, and the editor just displays it, but keeps no knowledge of previous states
by itself. The question is whether this design is good, especially with regard to this problem.
What has to be bridged is the independence of different GF calls on the one side and, on
the other side, the user, who expects his decisions to collapse or unfold tree nodes to be
remembered. For the user, the tree is still the same, just with the changes from
the last command applied. But for the editor, the tree is always completely new. Thus,
there is a mismatch between the user’s mental model and the program model. In such
situations, the program model should be adapted unless there are important reasons to
make the user rethink. That is not the case here.
Please note that the following ideas have not been implemented.
One way to help here would be a weak memory of the previous state. Weak memory
means that a tree node is reused if the node in the GF AST is the same (perhaps
modulo selection) as the one from the previous GF run. The fun would still be the same,
and the status in the graphical representation would be remembered. So an
expanded node would survive as expanded and a collapsed node would stay collapsed. If
the node has changed, the editor would recreate all its child nodes. That is not ideal, but
still much better than the current state. Otherwise the editor would have to anticipate
changes in the tree or do a tree diff, which was not thought through in this work.
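As the idea has not been implemented, the following Java sketch only illustrates how such a weak memory could look. It assumes, hypothetically, that a node is identified by its AST position (written as a string such as "[0,0]") and counts as "the same" if the fun at that position is unchanged; the class ExpansionMemory and its methods are invented for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a weak memory for the expanded/collapsed state of tree nodes. */
final class ExpansionMemory {
    /** fun seen at an AST position in the previous GF run. */
    private final Map<String, String> funAt = new HashMap<>();
    /** Whether the node at that position was expanded in the graphical tree. */
    private final Map<String, Boolean> expandedAt = new HashMap<>();

    /** Records the state of one node before the tree is rebuilt. */
    void remember(String position, String fun, boolean expanded) {
        funAt.put(position, fun);
        expandedAt.put(position, expanded);
    }

    /**
     * State to use for a node of the freshly built tree: the remembered one if the
     * same fun is still at this position, the default otherwise.
     */
    boolean restore(String position, String fun, boolean defaultExpanded) {
        if (fun.equals(funAt.get(position))) {
            return expandedAt.get(position);
        }
        return defaultExpanded;
    }
}
```

A changed node falls back to the default, which matches the remark above that its child nodes would simply be recreated.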
Alternatively, when an mp command (move to another AST position) has been issued
before, the editor could just update the selection status of the tree nodes that represent
the GF AST nodes whose selection status has changed: make the new node visible and
selected in the graphical representation and remove the selection flag from the old
one, without changing its collapsing status and without rebuilding the graphical tree at
all. Parsing the GF tree cannot be avoided, although the mp can be intercepted and the
new position read from the command string, since depending on the currently selected
node and its position in the tree, a new command could be issued to GF automatically
(see section 9.3). For the user, moving in the tree does not change it, and this
approach respects that.
10. Implementation
10.1. Refactoring the old version of the editor
The editor originally was meant as just a graphical interface to GF which ran as a standalone application. Giving the editor more intelligence to help the user with OCL made
it necessary to use many parts of the editor in new ways. For that, the old code had to
be refactored to suit the new requirements.
10.1.1. Documentation
Before: The code contained no JavaDoc comments and only very few other comments.
Overall, there were ca. 4 comment lines per 100 non-empty source code lines, and the
former developer was on leave. What methods do and what attributes are used for was
described nowhere. Just reading the code didn’t help much either, because GF is
an external program whose output is processed, and that output is not visible
in the source code. And since there was no documentation of the protocol that GF
and the editor speak with each other, the running system had to be analysed with a
debugger.
Also there was no documentation about the control flow. The main event loop, for
example, was in the constructor of the GUI class and thus, the constructor was never
finished. Since the GUI was already initialised at the point the main loop was entered,
the user could start to use the editor and issue commands to GF. Those were sent
via STDOUT from inside the Event Handler Thread to GF, which was started by the
constructor thread. This external program then sent its output to the STDIN of the
GUI where the main loop in the constructor thread parsed the output and updated the
GUI elements of the editor window.
Not knowing that the constructor was never meant to finish and only having this implicit
multi-threading which was mentioned nowhere, made debugging quite complicated at
the beginning. Sending and reading were not connected in the Java code, so stepping
through the methods just ended after sending.
Now: In contrast to the previous state, every class, every method and nearly every attribute
now has a JavaDoc comment attached to it. In the protocol parsing parts of the code,
comments also tell what is read in a line or why a branch is necessary. Overall, there are
now 2224 lines of JavaDoc and other comments for 5928 non-blank and non-comment
lines. So instead of 4 comment lines per 100 real code lines as before, there are now 38
per 100.
10.1.2. Names
Before: Some local variables had names like j, j2, k and were defined at the beginning
of a method without any hint as to which purpose they served. The use of these
variables depended very much on the input data from GF, so just by looking at the code,
it was not possible to find out their use or what values were stored in them. Only debugging
the program helped here.
Some methods had names that were too general and did not really say what the method
actually did. So here too, without documentation and due to the methods’ dependency
on the output of GF, only debugging helped.
A number of attributes also had names that were too general. These
attributes lead to the next point in section 10.1.3.
Now: Names like j, j2, k no longer occur in the code; those variables were given descriptive
names. Exceptions are integer loop variables and some String parameters called
s, where the method name already makes the usage clear. A number of critical local
variables additionally got comments that describe their usage.
10.1.3. Data flow
Before: The old version of the editor used 23 attributes as global variables whose values
did not need to be saved. They were used instead of return values and parameters for the
purpose of parsing the XML from GF and displaying tree and linearisation. Most of them
were used in several methods which interacted through them. The main methods for
parsing and displaying interacted purely via side effects: they took no parameters, returned
void, and several of them also affected the GUI. Together with the lack of comments, this
made it very hard to figure out which purpose those variables served. Again, only debugging
with real input from GF helped here.
Changing the order of the method calls would also give unexpected results due to their
interactions, but would be needed when, for example, calling GF in the background to
check something.
Now: There is one attribute left in the main class, which could perhaps be made local,
and one in the class that handles the linearisation. All other global variables have either
been removed or transformed into local variables and parameters. Thus it was possible
to factor larger parts out of the main class. This is described in section 10.1.5.
10.1.4. Static attributes
Before: The editor was meant to be a stand-alone application, and as such it didn’t
matter whether an attribute was static or not, since only one instance of the main class was
running at a time. So most of the attributes that kept the state of the editor were
static without a special reason to be so. But in the context of TogetherCC, where one
might want to start the editor more than once for several classes/methods at the same
time, it does matter.
Now: Only the loggers and final attributes are static; attributes that are used
by just one instance of the editor no longer are. Additionally, the main reading loop from
GF has been removed from the constructor. That way, the main thread, in which TogetherCC
also runs, is no longer blocked and several instances of the editor can be started at a time.
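How the reading was reorganised in detail is not stated here. Purely as an illustration of a reading loop that lives outside the constructor, the following Java sketch runs such a loop on its own thread, so that whoever constructs it returns immediately. It is an assumption-laden example, not the editor's code: the class GfOutputReader, its callback and the use of a separate thread are all hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.function.Consumer;

/** Hypothetical sketch: a reading loop that does not block the thread that creates it. */
final class GfOutputReader implements Runnable {
    private final BufferedReader in;
    private final Consumer<String> onLine;

    /** The constructor only stores its arguments; it never enters the loop itself. */
    GfOutputReader(Reader source, Consumer<String> onLine) {
        this.in = new BufferedReader(source);
        this.onLine = onLine;
    }

    @Override
    public void run() {
        try {
            String line;
            while ((line = in.readLine()) != null) {
                onLine.accept(line); // hand every line of output to the caller
            }
        } catch (IOException e) {
            // In this sketch a read error simply ends the loop.
        }
    }

    /** Starts the loop on a daemon thread and returns at once. */
    Thread startInBackground() {
        Thread t = new Thread(this, "gf-reader");
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

Because no state is static, several such readers can exist side by side, one per editor instance.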
10.1.5. Division of labour
Before: The editor consisted of just two main classes, one for the general GUI and one
for the graphical representation of the AST. Additionally, eight more or less anonymous
classes were used as callbacks. But except for the tree display class, all functionality was
gathered in just one big class with no further encapsulation, which violates the Single
Responsibility Principle as described in [Mar02]. All steps of getting the state from
GF as text, parsing that text, creating the internal data structures, updating the GUI,
receiving commands from the user and sending them to GF were done in this class.
In addition, the main class implemented the listener interfaces for all GUI elements except
for the AST. So the event handler methods consisted of huge case distinctions to figure
out where the request came from. This isn’t necessarily bad, but having 24 cases, of
which only one is executed at a time, with handling code of up to over 50 lines, makes this
harder to read than necessary.
Now: As stated in section 10.1.3, the main class has been split. All direct reading/writing
is now moved from the main class to an encapsulation class. This class not only does
the reading, but also some parsing. Thus reading is a completely separate step from
forming the GUI, which was not the case before.
The same goes for parsing the <linearizations> element of the GF XML. The linearisation
is formed and the indices are produced without affecting the GUI.
The whole management of the refinement menu is now in its own class, including the
graphical representation of it. That way, the editor is oblivious to the fact that there
are now submenus. Also the context menu of tree and linearisation is produced there,
the main GUI just asks the refinement menu class for it.
The tree display was already in a separate class. Hence, only some minor decoupling
was done here, and these two classes still have to play together.
The architecture is not (yet) Model-View-Controller or Model-View-Presenter or any
of the other separation patterns. The main GUI class choreographs the reading from GF
and the display; it does not just get notified/updated when the model has
calculated everything. Additionally, although the main editor class is no longer a listener
class, this work has only been moved into inner classes, not into a separate presenter
or controller class. With the mix of reading from GF and GUI forming removed, it
shouldn’t be too hard to change this, but for now, the view just does too much.
The reason for this lies in the fact that this work didn’t start from scratch, but with code
that was much further from MVC than it is now, and the author proceeded with the maxim
‘refactor the low hanging fruit first’ (footnote 1). Gradually the different steps, from reading
the XML through parsing it to forming the GUI, have been disentangled, but more basic faults
like the ones listed above were much more important and hence had to be corrected
first. With them corrected, the system is now in much better shape than before but, as
stated above, does not yet comply with one of the more advanced GUI patterns.
10.1.6. Character counting for click-in functionality
Before: A less general problem, but one that was affected by most of the problems
stated above, was the calculation of the indices of the individual linearisation pieces
in the linearisation area. Each was stored together with its position in the AST as a
MarkedArea. The indices were calculated in roughly the following way (modulo some
special cases): when a subtree tag was found in the XML from GF, the character
positions of its beginning and end in the XML string were saved together with its
associated AST position. Then the subtree tag was removed and the two saved values
were modified to take the characters now missing in the XML string into account. Thus
the beginning and ending positions were always relative to the (modified) XML string,
not to the displayed text.
For the display of pure text, this works. But this method cannot be carried
over to HTML. There, the length of the source code cannot be used as a measure in the
display when it comes to caret positions. Code like ‘<b>I’m bold</b>’ will just show up as
‘I’m bold’, where the tag is not visible and thus does not count with regard to caret
positions in the output.
So this method is not applicable to HTML and has to be replaced. Character counting in
the output is needed.
Now: Only the length of the text of one linearisation snippet as it is displayed
matters, not how many characters are used for it in the XML. This is described in
detail in section 9.2.
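The difference between source length and displayed length can be illustrated with a few lines of Java. This is a deliberately simplified sketch and not the method of section 9.2: it assumes that a snippet contains only plain text and simple tags, and it ignores character entities and elements that render as whitespace. The class name DisplayedLength is hypothetical.

```java
/** Hypothetical sketch: length of an HTML snippet as the user sees it. */
final class DisplayedLength {
    /** Counts the characters that remain after all tags have been removed. */
    static int of(String htmlSnippet) {
        return htmlSnippet.replaceAll("<[^>]*>", "").length();
    }
}
```

For the snippet <b>I'm bold</b>, the source has 15 characters, but only the 8 characters of the visible text count for caret positions.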
1
http://c2.com/cgi/wiki?RefactorLowHangingFruit
10.2. GF’s XML
The editing state is sent as XML from GF to the editor. A (shortened) example of this
XML can be seen in figure 10.1. This XML is not well-formed, because the values of
attributes are not enclosed in quotation marks. But since the parser was not a generic
XML parser, but a simple parser written specifically for GF’s output, that did not matter.
A generic XML parser couldn’t be used anyway, because the highlighting in the GUI depended
on the exact character positions in the XML strings, which would be lost in the tree
representation of an XML parser. In principle, the same parser is still used in the current
version, although it shouldn’t be too hard to change that, since this dependency has been
removed.
10.2.1. <hmsg>
The first element of the <gfedit> tree is an optional message. If a command sent to GF
is prefixed with something in square brackets like in [t] gf command, GF will repeat
the content of those brackets in the hmsg tag. This is used to transfer some flags to the
next GF run. t for example tells the editor to rebuild the tree. Some commands to GF
do not alter the editing state and therefore do not need a tree rebuild, like switching
linearisation languages on or off.
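The round trip of such a flag can be sketched in Java as follows. The sketch assumes, without confirmation from the text above, that flags are single characters written one after another inside the brackets; the class HmsgPrefix and its methods are hypothetical.

```java
/** Hypothetical sketch of passing flags through GF via the hmsg mechanism. */
final class HmsgPrefix {
    /** Prefixes a GF command with flags in square brackets, e.g. "[t] gf command". */
    static String withFlags(String flags, String command) {
        return flags.isEmpty() ? command : "[" + flags + "] " + command;
    }

    /** True if the content echoed back inside the hmsg element contains the flag t (rebuild the tree). */
    static boolean wantsTreeRebuild(String hmsgContent) {
        return hmsgContent.trim().contains("t");
    }
}
```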
10.2.2. <linearizations>
The second subtree contains the linearisations in the active languages. Each of them is
in a separate <lin> element which has the name of the language as an attribute.
The linearisation in abstract syntax is always given to the editor, but normally just ignored.
However, the abstract tree in this format can be saved and loaded again with GF (footnote 2),
while not all linearisations are parseable again. Apart from that, the linearisation in the
abstract syntax is ignored, since the tree also has its own XML element. All other languages
the user wants to see come after Abstract.
Inside each <lin> element are <subtree> tags that are nested the same way as the AST is,
with the difference that hidden arguments do not show up in the linearisation. funs that rely
solely on the linearisation of one of their children, like coerce, do not appear here either. As
attributes, the <subtree> tags contain the type and AST position of the corresponding
AST node. If there is a GF constraint for a node, an additional attribute ‘status’ with
value ‘incorrect’ appears inside the <subtree> tag.
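Since the attribute values are not quoted, a generic XML parser cannot read these tags, but a hand-written pattern can. The following Java sketch parses one opening <subtree> tag. It is an illustration only: the exact spacing inside the tag and the place of the status attribute (assumed here to come last) are assumptions, and the class SubtreeTag is hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: one opening subtree tag with unquoted attribute values. */
final class SubtreeTag {
    final String position;   // AST position, e.g. "[0,0,1]"
    final String type;       // may contain spaces, e.g. "Instance RealC"
    final boolean incorrect; // true if status=incorrect is present

    // Assumed layout: <subtree position=[...] type=... status=incorrect>
    private static final Pattern TAG = Pattern.compile(
            "<subtree position=(\\[[0-9,]*\\]) type=([^>]*?)( status=incorrect)?>");

    private SubtreeTag(String position, String type, boolean incorrect) {
        this.position = position;
        this.type = type;
        this.incorrect = incorrect;
    }

    /** Returns the parsed tag, or empty if the text is not a subtree tag of the assumed form. */
    static Optional<SubtreeTag> parse(String tag) {
        Matcher m = TAG.matcher(tag);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new SubtreeTag(m.group(1), m.group(2), m.group(3) != null));
    }
}
```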
For the parts of the linearisation that belong to the currently selected node (footnote 3), a
special <focus> tag is used. It only has a different name, but functions identically otherwise.
Everything in the scope of this tag was marked green in the old editor. Now, the tree is
built first, so the focused node is already known, and the <focus> tags are
just changed to <subtree> tags before the linearisation is parsed. Each <subtree>
tag carries its position in the AST. More about that follows in section 10.4.
The result of parsing this <linearizations> tag can be seen in figure 10.2.
2
With the exception of metavariables. In the abstract syntax from GF they are numbered, but the
parser cannot parse them with numbers. So when saving as the AST, the numbers are removed.
3
There can be more than one of them for discontiguous constituents, that is, a lin can use one part
of the linearisation of a child, then something from another child and then something from the first
child again. If now this first child is selected, its linearisation will not be contiguous, whereas the
linearisation of its parent will encompass the linearisation of itself and all its children, so it will be
contiguous.
<gfedit>
<hmsg>
t
</hmsg>
<linearizations>
<lin lang=Abstract>
NOPACKAGEP PayCardC charge IntegerC Oper Constr
  (\this, amount -> prepostCt
    (realGT
      (coerce
        IntegerC
        RealC
        IntegerCConforms2RealC
        amount
      )
      ?9
    )
    ?2
  )
</lin>
<lin lang=FromUMLTypesEng>
<subtree position=[] type=Constraint> For the operation \<b\> charge ( amount :
Integer ) \</b\> of the class \<b\> PayCard \</b\>, \<br\> given the following
pre-condition : \ <subtree position=[0] type=OperConstraintBody>
<subtree position=[0,0] type=Sent> <subtree position=[0,0,0] type=Instance RealC>
<subtree position=[0,0,0,3] type=Instance IntegerC> \ amount \ </subtree> </subtree>
is greater than <subtree position=[0,0,1] type=Instance RealC> \ ?9 \ </subtree>
</subtree> </subtree> \ then the following post-condition should hold : \
<subtree position=[0] type=OperConstraintBody> <focus position=[0,1] type=Sent> ?2
</focus> </subtree> \ </subtree>
</lin>
</linearizations>
<tree>
NOPACKAGEP PayCardC charge IntegerC Oper Constr : Constraint
  \(this : VarSelf NOPACKAGEP PayCardC), (amount : Instance IntegerC) -> prepostCt : OperConstraintBody
    realGT : Sent
      coerce : Instance RealC
        IntegerC : Class
        RealC : Class
        IntegerCConforms2RealC : Subtype IntegerC RealC
        amount : Instance IntegerC
      ?9 : Instance RealC
    * ?2 : Sent
</tree>
<message>
</message>
<menu>
<item>
<show>
r andS
</show>
<send>
r oclLibrary.andS
</send>
</item>
<item>
<show>
r anyEq
</show>
<send>
r oclLibrary.anyEq
</send>
</item>
</menu>
</gfedit>
Figure 10.1.: A sample of the XML for a GF state. The refinement menu has been
shortened to make the XML fit on one page.
Figure 10.2.: The editor after receiving the XML from figure 10.1.
10.2.3. <tree>
This tree is a piece of otherwise structured text, not XML. Only indentation
is used to make nodes children of other nodes. The tree parser therefore uses the
indentation to transform that text into a tree data structure. Each line represents a node in
the internal GF AST and will be displayed as a node in the graphical tree, as visible in
figure 10.2.
In that tree, the line of the selected node has a * in front of it. The position of this node
will be saved for highlighting (see above) and its type will be displayed below the
linearisation area. The tree will later be analysed to find out if, for example, a coerce should
be introduced automatically.
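How a tree can be built from indentation alone, as section 10.2.3 describes for the <tree> element, is sketched below in Java. This is not the editor's parser but a minimal illustration with hypothetical names (IndentTree, Node). It assumes that children are indented by more spaces than their parent, that there is a single root, and that the * of the selected node stands directly before the node text, after the indentation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch: turning indented lines into a tree. */
final class IndentTree {
    static final class Node {
        final String text;      // e.g. "realGT : Sent"
        final int indent;       // number of leading spaces
        final boolean selected; // the line was marked with '*'
        final List<Node> children = new ArrayList<>();

        Node(String text, int indent, boolean selected) {
            this.text = text;
            this.indent = indent;
            this.selected = selected;
        }
    }

    /** Parses the lines and returns the root, or null if there are no lines. */
    static Node parse(List<String> lines) {
        Deque<Node> open = new ArrayDeque<>(); // path from the current node up to the root
        Node root = null;
        for (String raw : lines) {
            if (raw.trim().isEmpty()) {
                continue;
            }
            int indent = 0;
            while (raw.charAt(indent) == ' ') {
                indent++;
            }
            String text = raw.substring(indent).trim();
            boolean selected = text.startsWith("*");
            if (selected) {
                text = text.substring(1).trim();
            }
            Node node = new Node(text, indent, selected);
            // Close all nodes that are not ancestors of the new node.
            while (!open.isEmpty() && open.peek().indent >= indent) {
                open.pop();
            }
            if (open.isEmpty()) {
                if (root == null) {
                    root = node;
                }
            } else {
                open.peek().children.add(node);
            }
            open.push(node);
        }
        return root;
    }
}
```

The stack holds exactly the ancestors of the line being read, so each line is attached to the nearest preceding line with a smaller indentation.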
10.2.4. <message>
This element is most often empty. Its content is not part of the editing state and is
used to give information to the user. This information could be an error message if, for
example, parsing has failed or the user wants to move the cursor to a non-existing AST node.
It is also used to display explicitly requested information. The editor, for example,
uses it to get a list of all printnames available to GF, as mentioned in section 9.1.4,
but a user who knows GF’s internals can use it too.
10.2.5. <menu>
Each entry of the refinement menu is given by GF in this part. The <item> element
groups together the text in the <show> element that is to be displayed to the user and
the corresponding GF command in the <send> element. If no printname is available,
the <show> part will basically be the same as the <send> part, perhaps with more
descriptive commands and type annotation, if Abstract is chosen as the menu language.
If a concrete language is chosen, the linearisation of each fun will be given in the
<show> tag. If a printname is available, it will be preferred to the linearisation. The
same holds for the extended printnames of this work, except that they are loaded and
cached at the beginning and the <show> tag is ignored.
10.3. Overview over the classes
The figures 10.3, 10.4 and 10.5 give an overview of the classes in the implementation,
how they are connected to each other and what their main function is.
How they play together is described in the sections below. Record-like classes that
have no methods but just contain fields are used to store the results of some methods.
These classes are omitted from the diagrams to make them more concise.
10.4. Receiving the state
The editor has two tasks. The first is to display to the user the state that GF sends.
The second is to transform the GUI activities of the user into commands for GF.
But a great deal of the second task has to be prepared in the first, since many interactions
depend on the current state, like the content of the refinement menu.
10.4.1. XML processing
As mentioned above, after every user action, GF sends the complete state to the editor.
After each send, the main class GFEditor2 is given the task to process the state.
[Class diagram. The class names and annotations listed below are recovered from it; the
assignment of annotations to classes follows the layout as far as it can be recovered.]
GFEditor2 (JFrame): The main class. Not only the GUI class, but also controls the other
classes around it. With their help, it retrieves the editing state from GF and displays it to
the user. Contains inner classes that serve as ActionListeners.
DynamicTree2 (JPanel, KeyListener, ActionListener): The graphical representation of the GF
AST. Does not build the tree itself, just displays it. Also reacts to clicks from the user and
tells GFEditor2 to make GF switch to the AST node the user has clicked on.
ReadDialog: GUI class that takes terms and strings that are to be parsed from the user.
CallbackClassInv, CallbackPrePost: Cut the context part from the OCL and save it as a JavaDoc
comment in TogetherCC.
ConstraintCallback: Stub for saving OCL constraints in TogetherCC. No dependency on the
TogetherCC OpenAPI.
PrintnameManager: Maps GF funs to printnames for the display in the refinement menu.
RefinementMenu: Takes the parsed XML for the refinement menu and displays it to the user.
The whole submenu handling is done here. Contains the GUI controls for the refinement
menu and takes care of their display. The refinement menu can be accessed as the context
menu in the tree and linearization too, and is built here also.
GfCapsule: Encapsulates all reading from and writing to the running GF process. It is called
from GFEditor2 and returns the read, partially parsed XML.
Linearization: Parses the linearization XML and manages the mapping of linearization indices
(calculated by Display) and AST positions. GFEditor2 asks this class for the AST position
when the user has clicked into the linearization.
Display: Handles the calculation of the indices for the linearization snippets sent from GF.
All text that should be shown to the user is collected here and displayed when the whole
linearisation is formed. Both HTML and pure text are done in this class.
Further classes shown without annotation: MyTreeModelListener, MyRenderer, PopupListener,
LangMenuModel, OpenAction, SaveAction, ImportAction, NewTopicAction, ResetAction,
QuitAction, RandomAction, UndoAction, AlphaAction, GfCommandAction, ReadAction,
SplitAction, CombineAction, SubtypeAction.
Figure 10.3.: The main GUI class GFEditor2 and the classes around it, that together
form the GUI.
The first action in processing is to ask GfCapsule to read the XML and do some preliminary parsing.
The <hmsg> consists mainly of flags. These flags are parsed and values are set in a
corresponding record class Hmsg.
After that, the <linearizations> element is read and saved. No processing is done with
it in GfCapsule, that’s done at a later stage. The same is done with the <tree> and
the <message> part.
Then the <menu> element is read. The XML structure is quite simple here, so it is parsed
right away. The result is a Vector of String pairs, each consisting of a command and its
display string.
Figure 10.4.: The possible entries of the refinement menu in the upper part, below the data behind the graphical tree. The class diagram contains the following classes:
GFCommand (Comparable): An entry of the refinement menu, but not necessarily a command that is to be sent to GF.
Printname: Is responsible for the text, the tooltip and the submenu in which the associated command will show up.
RealCommand: As the name says, an object of this class contains a command string that is sent to GF.
LinkCommand: When selected, the submenu that belongs to this command is opened.
InputCommand: Exists for Integer and String. If such a refinement is possible, an InputCommand will be shown in the refinement menu. If selected, a dialog box will pop up, where the user can enter an Integer or a String, depending on the type of this InputCommand.
SelfPropertiesCommand: This entry is an oddity here, since it talks to GF (via RefinementMenuCollector). This class handles the deferred gathering of the properties of self. When it is clicked, it asks GF for the list of them. After that is done, this entry gets replaced with a LinkCommand which gives access to those properties as usual.
AstNodeData: Is responsible for text, tooltip and colour of a node in the graphical tree.
UnrefinedAstNodeData: An unrefined node will appear with question mark and type in the tree. But the tooltip will be taken from the parent according to the position of the child, so that the user can see what is expected at this node.
RefinedAstNodeData: A refined node already has its fun and will appear with its name in the tree. The tooltip for this node will also be the one for the fun.
GfAstNode: Represents a parsed line from the tree that the editor got from GF. Here, the type, the used fun, constraints (if present) and bound variables are accessible.
Figure 10.5.: The structure of the classes that besides GfCapsule, talk to GF. The class diagram contains the following classes:
AbstractProber: Can read the XML from GF, but does nothing with it. Has hotspot methods for the different elements, so they can get overwritten if needed.
RefinementMenuCollector: Uses GfCapsule to return the refinement menu entries for the state after a given command.
SubtypingProber: Goes through the tree and tries to close all open Subtype nodes.
SelfResultProber: Checks, if self and result are really applicable at a given position.
TypesLoader: If the refinement menu entries should be annotated with their type, then PrintnameLoader asks this class to create a mapping between fun name and type.
PrintnameLoader: Is used by PrintnameManager to ask GF for all available printnames. It will deliver them as single lines.
RefinementMenuTransformer: Dependent on TreeAnalysisResult, transforms the refinement menu.
TreeAnalysisResult: A record class to store the result of the analysis.
TreeAnalyser: Analyses the tree and labels the individual nodes, for example if they should be hidden.
Now a record object GfEditResult is returned to GFEditor2.
As the next step, the tree is formed. Every line in the tree string from GF is transformed
into a tree node. Type, bound variables, the used fun and introduced constraints are
saved for each node in a GfAstNode object. Additionally, for each node the position
in the tree in the GF notation is calculated and also stored. For the tree structure,
DefaultMutableTreeNode is used, with AstNodeData as the user object.
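The tree-forming step can be pictured with a minimal sketch. It is not the editor's actual code: the input format (one node per line, two spaces of indentation per level), the bracketed position notation and the class NodeInfo, which stands in for AstNodeData, are assumptions made purely for illustration. Only DefaultMutableTreeNode is taken from the text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.swing.tree.DefaultMutableTreeNode;

final class TreeBuilderSketch {

    /** Hypothetical stand-in for AstNodeData: node text plus position in the tree. */
    static final class NodeInfo {
        final String text;
        final String position;

        NodeInfo(String text, String position) {
            this.text = text;
            this.position = position;
        }

        @Override
        public String toString() {
            return text + " " + position;
        }
    }

    /** Assumed input: one node per line, two spaces of indentation per tree level. */
    static DefaultMutableTreeNode build(List<String> lines) {
        DefaultMutableTreeNode root = null;
        // path.get(d) is the most recently created node at depth d
        List<DefaultMutableTreeNode> path = new ArrayList<>();
        for (String line : lines) {
            int indent = 0;
            while (indent < line.length() && line.charAt(indent) == ' ') {
                indent++;
            }
            int depth = indent / 2;
            String position = "[]";
            DefaultMutableTreeNode parent = null;
            if (depth > 0) {
                parent = path.get(depth - 1);
                String parentPos = ((NodeInfo) parent.getUserObject()).position;
                int childIndex = parent.getChildCount();
                // Extend the parent's position by the index of the new child.
                position = parentPos.equals("[]")
                        ? "[" + childIndex + "]"
                        : parentPos.substring(0, parentPos.length() - 1) + "," + childIndex + "]";
            }
            DefaultMutableTreeNode node =
                    new DefaultMutableTreeNode(new NodeInfo(line.trim(), position));
            if (parent == null) {
                root = node;
            } else {
                parent.add(node);
            }
            // Forget deeper nodes of the previous branch, then remember this one.
            while (path.size() > depth) {
                path.remove(path.size() - 1);
            }
            path.add(node);
        }
        return root;
    }

    public static void main(String[] args) {
        DefaultMutableTreeNode root = build(Arrays.asList("a", "  b", "    c", "  d"));
        // Print every node with its computed position, in depth-first order.
        java.util.Enumeration<?> nodes = root.preorderEnumeration();
        while (nodes.hasMoreElements()) {
            System.out.println(nodes.nextElement());
        }
    }
}
```

Storing the position with each node means that a later selection in the tree needs no further computation before a move command can be sent.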
The result of this step is then given to TreeAnalyser, which goes through the AST and:
Labels nodes
– as hidden, if they are a coerce without a constraint. It is marked, which child
node should be used instead.
– as coloured, if they might be a coerce for a non-existing subtyping relation as described in section 9.3.6
Saves a reference to the currently selected node
Finds out
– if attributes of self should be given an easy access, explained in section 9.4.2
– if the refinement menu below a coerce should be reduced so that it only
contains funs that return a suitable subtype (see section 9.3.3)
– if it should be probed whether self and result are superfluous in the refinement
menu as described in section 9.4.1.
– if a coerce should be introduced automatically, discussed in section 9.3.1
The next phase depends on the result of this analysis. If a command is to be executed
automatically (at the moment only the refinement with a coerce), then this is done and
the data read in this run is discarded, since it will be obsolete when the next state arrives.
On the other hand, if no command is to be given to GF, the processing and parsing of
the current state continues. First, the nodes that have been labelled as hidden, are
removed from the tree and replaced by their designated child. Then the tree is shown
in the GUI. Missing subtyping nodes will be colored red and the tooltips are set for the
individual nodes as described in section 9.1.3.
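Replacing hidden nodes by their designated child can be sketched as follows. This is an illustration under assumptions, not the editor's implementation: the map designatedChild stands in for the labels produced by TreeAnalyser, and the sketch builds a fresh visible tree instead of modifying the existing one, which may differ from what the editor does.

```java
import java.util.HashMap;
import java.util.Map;
import javax.swing.tree.DefaultMutableTreeNode;

final class HiddenNodeSketch {

    /**
     * Returns a copy of the tree below node in which every hidden node is
     * replaced by its designated child. The map (hypothetical) assigns each
     * hidden node the child that should be shown instead.
     */
    static DefaultMutableTreeNode visibleCopy(
            DefaultMutableTreeNode node,
            Map<DefaultMutableTreeNode, DefaultMutableTreeNode> designatedChild) {
        DefaultMutableTreeNode shown = node;
        // Skip over hidden nodes, also when several of them are nested.
        while (designatedChild.containsKey(shown)) {
            shown = designatedChild.get(shown);
        }
        DefaultMutableTreeNode copy = new DefaultMutableTreeNode(shown.getUserObject());
        for (int i = 0; i < shown.getChildCount(); i++) {
            DefaultMutableTreeNode child = (DefaultMutableTreeNode) shown.getChildAt(i);
            copy.add(visibleCopy(child, designatedChild));
        }
        return copy;
    }

    public static void main(String[] args) {
        DefaultMutableTreeNode root = new DefaultMutableTreeNode("root");
        DefaultMutableTreeNode coerce = new DefaultMutableTreeNode("coerce");
        DefaultMutableTreeNode witness = new DefaultMutableTreeNode("witness");
        DefaultMutableTreeNode instance = new DefaultMutableTreeNode("instance");
        root.add(coerce);
        coerce.add(witness);
        coerce.add(instance);

        Map<DefaultMutableTreeNode, DefaultMutableTreeNode> hidden = new HashMap<>();
        hidden.put(coerce, instance); // show the instance in place of the coerce

        DefaultMutableTreeNode visible = visibleCopy(root, hidden);
        System.out.println(visible.getChildAt(0)); // the node shown below root
    }
}
```

Since DefaultMutableTreeNode does not override equals, the map looks nodes up by identity, so two nodes with the same label are never confused.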
Thereafter, RefinementMenuTransformer is asked to make some changes to the refinement
menu. These again depend on the result of the tree analysis. The properties of self are
added and the editor tests whether self and result can be filled in type-correctly. If the
current position is the Instance node below a coerce, the refinement menu is built anew with
only type-correct entries. The delete command is also modified in that case.
Only now is RefinementMenu called to form the menu and submenus; before, there
existed only GFCommand objects in a Vector. For every subcategory occurring in the
initial list of commands that contains more than one element, a LinkCommand is created which, when selected, opens the corresponding submenu. If applicable, an InputCommand
is placed in the refinement menu too. After that, the refinement menu is sorted alphabetically.
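The grouping into submenus might look roughly like the following sketch. The classes Command and Menu are hypothetical simplifications of GFCommand and RefinementMenu, the trailing " >" that marks a link entry is invented, and putting the single member of a one-element subcategory directly into the top-level menu is one reading of the description above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class MenuGroupingSketch {

    /** Hypothetical simplification of a GFCommand: display text plus subcategory. */
    static final class Command {
        final String display;
        final String subcategory;

        Command(String display, String subcategory) {
            this.display = display;
            this.subcategory = subcategory;
        }
    }

    /** Top-level entries plus the content of each submenu. */
    static final class Menu {
        final List<String> topLevel = new ArrayList<>();
        final Map<String, List<String>> submenus = new TreeMap<>();
    }

    static Menu build(List<Command> commands) {
        // Collect the display strings of all commands per subcategory.
        Map<String, List<String>> bySubcategory = new TreeMap<>();
        for (Command c : commands) {
            bySubcategory.computeIfAbsent(c.subcategory, k -> new ArrayList<>()).add(c.display);
        }
        Menu menu = new Menu();
        for (Map.Entry<String, List<String>> entry : bySubcategory.entrySet()) {
            List<String> members = entry.getValue();
            if (members.size() > 1) {
                // More than one element: create a link entry that opens a submenu.
                members.sort(String.CASE_INSENSITIVE_ORDER);
                menu.submenus.put(entry.getKey(), members);
                menu.topLevel.add(entry.getKey() + " >");
            } else {
                // A single element is shown directly in the top-level menu.
                menu.topLevel.add(members.get(0));
            }
        }
        menu.topLevel.sort(String.CASE_INSENSITIVE_ORDER);
        return menu;
    }

    public static void main(String[] args) {
        Menu menu = build(Arrays.asList(
                new Command("size", "Collection"),
                new Command("isEmpty", "Collection"),
                new Command("and", "Logic")));
        System.out.println(menu.topLevel);
        System.out.println(menu.submenus);
    }
}
```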
The next step is parsing the linearization XML, which is done in Linearization. Whenever a <subtree> tag is encountered, the start and end indices of the current text snippet
belonging to that tag are saved together with the position and the text itself. How the
indices are calculated is described in section 9.2. That way, when the user later on clicks
into the linearisation, the editor can find out on which of these snippets the user has
clicked.
The linearisation text is then given to an object of the class Display, which manages
the linearization areas for text and HTML, and immediately afterwards displayed in
the GUI. That is needed to do the highlighting (explicated in section 9.4.5). For that,
Linearization is assigned to calculate the highlight positions, which depend on the
selection status and if there are constraints on a node. Then these snippets actually get
highlighted in the GUI linearization area (text and/or HTML).
At the end, the message from GF is appended.
10.4.2. Probing
On several occasions, GF is asked in the background whether a condition is fulfilled, or
to gather something. These GF calls have in common that they do not take place in
the main class. The classes which do that are presented below:
For excluding self and result from the refinement menu, SelfResultProber, which is a
special class for this purpose, is used.
When building a reduced refinement menu below a coerce, RefinementMenuCollector
is used to get the list of the possible Subtype refinements, which are then refined and
RefinementMenuCollector is used again to get the individual property refinements. To
collect the properties of self, it is also used.
In the beginning, the printnames are loaded and cached (see section 9.1.4 for more about
that). This is done with PrintnameLoader, which might use TypesLoader if the user
wants the refinement menu entries to be annotated with their type.
The method detailed in section 9.3.7 for filling in Collection subtyping witnesses is implemented in SubtypingProber.
10.4.3. Undo handling
Since nearly all commands to GF change its state, the background probing classes have
the responsibility to clean up after them. GF accepts chain commands, which are lists of
commands that are executed sequentially, where only the final state is sent back to the
editor. But sadly, these chain commands cannot be undone as a whole; the individual
commands have to be undone. Hence, the prober classes have to issue the right
number of undo commands to GF.
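A sketch of this clean-up duty, under loudly stated assumptions: GfChannel is a hypothetical stand-in for the connection to GF, and both the chain separator and the undo command string are placeholders, not GF's actual command syntax. The point is only that a chain of n commands has to be followed by n separate undo commands.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ProberCleanupSketch {

    /** Hypothetical stand-in for the connection to the running GF process. */
    interface GfChannel {
        void send(String command);
    }

    // Placeholders, not GF's real syntax.
    static final String CHAIN_SEPARATOR = " ;; ";
    static final String UNDO = "undo";

    /** Sends the commands as one chain, then undoes each of them individually. */
    static void probe(GfChannel gf, List<String> commands) {
        gf.send(String.join(CHAIN_SEPARATOR, commands));
        // A real prober would read and evaluate the state sent back by GF here.
        for (int i = 0; i < commands.size(); i++) {
            gf.send(UNDO);
        }
    }

    public static void main(String[] args) {
        List<String> sent = new ArrayList<>();
        probe(sent::add, Arrays.asList("first", "second", "third"));
        System.out.println(sent);
    }
}
```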
To make the undo button in the GUI undo every user action, which might consist of
more than one GF command (access to a property of self or selecting a node where
a coerce is then automatically introduced), the editor has to keep track of the number of
individual commands in each user action. This is done with a simple stack.
Sometimes, a user action leads to a state where GF automatically executes another
command. In these cases, the top element of the undo stack is modified, so that a click
on the undo button undoes both commands. That way, each undo undoes a complete
user action, not just parts of it. Automatic commands hide something from the user for
a reason. Thus, they should not become noticeable through a back door such as undo.
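The bookkeeping described here fits into a few lines. The following is a minimal sketch, assuming that the stack simply holds one command count per user action; the class and method names are invented for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class UndoStackSketch {

    // One entry per user action: the number of GF commands it consisted of.
    private final Deque<Integer> commandsPerAction = new ArrayDeque<>();

    /** Records a user action that was carried out with the given number of GF commands. */
    void userAction(int gfCommands) {
        commandsPerAction.push(gfCommands);
    }

    /** An automatically executed command is added to the most recent user action. */
    void automaticCommand(int gfCommands) {
        int previous = commandsPerAction.isEmpty() ? 0 : commandsPerAction.pop();
        commandsPerAction.push(previous + gfCommands);
    }

    /** Returns how many undo commands have to be sent to GF for one click on undo. */
    int undo() {
        return commandsPerAction.isEmpty() ? 0 : commandsPerAction.pop();
    }

    public static void main(String[] args) {
        UndoStackSketch stack = new UndoStackSketch();
        stack.userAction(1);       // a plain refinement
        stack.userAction(2);       // for example access to a property of self
        stack.automaticCommand(1); // an automatic command following that action
        System.out.println(stack.undo());
        System.out.println(stack.undo());
    }
}
```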
In the author’s eyes, counting undos should be the responsibility of the running GF.
There the actual command processing is done, so it should be easy to make chain commands atomic, so that one undo undoes a whole chain command. The undo stack
would then still be necessary to undo automatically executed commands, but it would
be relieved from counting command constituents and their possible effects.
10.5. Sending commands
The main work on the editor side to accept user actions has already been done.
If an entry in the refinement menu is selected, its command is sent to GF. For entries of
class InputCommand, where the most important data, the value that the user wants to
enter, is not yet known, a window pops up and asks for it. Then it is given to GF.
When the user selects a tree node, a move command (mp) with the calculated position,
which is saved in the AstNodeData object attached to the tree node, is sent to GF.
After a click (or selection) in the linearization area, Linearization is given the start
and end indices (which in case of a simple click are equal). Linearization now goes
through the list of all registered linearization snippets and returns the according AST
position, which is then sent to GF. For a simple click, the position is the one stored for
the snippet the user has clicked on. It does not matter, in which language the user has
clicked, since all languages share the same AST.
In case of a selection, the node in the tree is selected that has the smallest linearization
that is at least partially covered by the selection. That means that the node closest to the
leaves whose linearization is partially covered is selected, since the linearization can only
grow if nodes further up are selected: their linearization consists of the combined
linearization of their children plus their own.
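The lookup for both a simple click and a selection can be sketched as below. The class Snippet is a hypothetical simplification of what Linearization stores, the half-open index convention is an assumption, and positions are plain strings here.

```java
import java.util.Arrays;
import java.util.List;

final class SnippetLookupSketch {

    /** Hypothetical snippet record: text range [start, end) and the AST position. */
    static final class Snippet {
        final int start;
        final int end;
        final String position;

        Snippet(int start, int end, String position) {
            this.start = start;
            this.end = end;
            this.position = position;
        }
    }

    /**
     * Returns the position of the shortest snippet that is at least partially
     * covered by the selection, or null if no snippet is touched.
     * For a simple click, selStart equals selEnd.
     */
    static String positionFor(List<Snippet> snippets, int selStart, int selEnd) {
        Snippet best = null;
        for (Snippet s : snippets) {
            boolean touched = s.start <= selEnd && selStart < s.end;
            if (touched && (best == null || s.end - s.start < best.end - best.start)) {
                best = s;
            }
        }
        return best == null ? null : best.position;
    }

    public static void main(String[] args) {
        List<Snippet> snippets = Arrays.asList(
                new Snippet(0, 20, "[]"),
                new Snippet(0, 8, "[0]"),
                new Snippet(9, 20, "[1]"),
                new Snippet(9, 12, "[1,0]"));
        System.out.println(positionFor(snippets, 10, 10)); // a simple click
        System.out.println(positionFor(snippets, 5, 10));  // a selection
    }
}
```

A linear scan over all snippets is enough for a sketch; whether the editor uses a smarter data structure is not stated in the text.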
Note, that this mechanism has been devised by Janna Khegai for the old version of the
editor, but the implementation has been modified to a large extent in this work.
How the integration works is described in detail in section 8.3,
so not much has to be added here. TogetherGFinterface is instantiated in a context
menu class of TogetherCC. Depending on whether it is the context menu of a method
or a class, the corresponding method in GFinterface is called. What happens there is
described above.
Figure 10.6.: The classes, that interface the editor with TogetherCC. The class diagram contains the following classes:
TempGrammarFiles: GF needs access to all grammar files in one place in the file system. This class takes care of that.
GFinterface: Manages parsing the old OCL or the creation of the constraint stubs, generating the grammars and finally, calling the editor.
ModelExporter: Exports the UML model and all referenced classes into a format that is easily parseable for the grammar generator.
TogetherGFinterface: Has access to the TogetherCC specific classes, while GFinterface only accesses an abstraction of them. These concrete fields are set in this class.
11. Conclusions
11.1. User perspective
The goal in the beginning was a fuzzy one: ‘Make this editor (more) usable’. So part of
the work was refining this goal. This is mostly done in section 7.
Many of the main blockers identified in section 7 have been removed. Only now is the
editor ready to be used in a real usability study to see whether the grammar-based approach
for OCL really works out and makes editing constraints simpler, or whether the price for not
having to know OCL, but having to cope with the grammar instead, is too high. This approach
with top-down editing is radically different from normal editors and especially novices
will have problems with it, as [KU93] states. For that reason, it is recommended there
to design for experienced users and not for novices, a recommendation which this work does
not follow. Whether that still turns out to be the right way, since the user is relieved from
the OCL syntax that is still unknown to him, remains to be studied.
11.2. Development
First, the author had to get an understanding about how exactly the old code worked.
As detailed in section 10.1, there were a number of impediments towards that.
The system consists of several parts and on all of them work needed to be done:
The program which laid the foundation for the rest was GF. But a number of additions in
GF were needed by the editor. These were chain commands (several commands in a row,
where the intermediate states are not sent to the editor), a way to solve GF constraints
without jumping to the root node and a command to print all available printnames. Also
a number of bugs were discovered during the development in this work and reported to
the GF developers.
The OCL standard types and operations are represented by GF grammars. These were
modified by the author to remove the generic subtyping witnesses for reflexivity and
transitivity to make it possible for GF to fill in this parameter of coerce without user
interaction¹. Also the comparison operators for Integer and the type-specific equality
operators have been removed by the author. But the most important change was writing
the enhanced printnames for all OCL operations. Only with actually helpful content were
these mechanisms of any use.
¹ Generic subtyping witnesses, where the concrete type depends on a type argument, are always applicable. Thus, reflexivity and transitivity were always offered. And if there is more than one possible refinement, the user has to select. GF refines only if there wouldn’t be a choice anyway.
These grammars only covered the standard OCL operations. The grammars for the
user’s UML model in contrast cannot be written in advance. Generating them is done
by the grammar generator by Kristofer Johannisson. This tool also had to be changed to
produce better display names for classes and operations, tooltips (especially for marking
parameters for the automatic coercion) and to include the transitive closure of subtyping
witnesses.
If a previous OCL constraint is found, this is parsed by the OCL parser by Kristofer
Johannisson. Most of the changes here were not feature wishes, but corrected bugs or
fixed inconsistencies (something which also applies to the grammar generator).
The different parts are connected by a TogetherCC plug-in. A previous version for the
old grammars already existed, but large parts had to be rewritten by the author.
But the main work has been done on the editor itself. The new features of GF had to be
used here, and enhanced printname support, HTML display, coercion handling
and nearly all of the other improvements were implemented in the editor.
11.3. Contributions
The main effect on Kristofer Johannisson’s work was giving it a nice GUI. For GF it
meant an overhaul of the editor with many improvements compared with the old one.
Thus, the new editor version was included as the official editor of GF 2.3. But the main
impact is on KeY, where the grammar based way to create OCL constraints got a nicer
face.
11.4. Only a specialised OCL editor?
The goal was to tailor the generic GF editor into a special one for OCL. But a number
of problems were of a more general nature and not specific to OCL. These are the
information display problems:
What does a specific fun do? – see section 7.1
A too long list of refinements – see section 7.2
What is the role of the current position in the AST? – see section 7.3
Output is just text without formatting – see section 7.4
Devising a mechanism to solve these problems in general would also solve the problem
for OCL. What would be left would be writing the text, that actually explains the OCL
funs, groups them and describes their parameters. Regarding HTML display, nothing
had to be done on the grammar side there, since David Burke already did that in [BJ05],
which the author later on transferred to the German grammars, along with Burke’s other
improvements.
Some other problems also were not OCL specific:
The collapsing tree – see section 7.12
How to enter strings and integers? – see section 7.6
The safety question when exiting – see section 9.5.2
The tree problem has not been fixed in this work, but entering string and integer literals
is now made easy since a special entry shows up in the refinement menu and, if selected,
opens an input box without distracting choices to be made.
Also some enhancements were made that did not help OCL editing, like enhanced middle-click parsing. These are mentioned in section 9.5.
And, not to forget, the improvements made to the source code of the editor mentioned
in section 10.1.
This list of generic enhancements made it possible to make the version of the Java GUI
enhanced in this work the new official GF editor. So this editor has been released as a
part of GF 2.3.
The norm EN ISO 9241 covers ‘Ergonomic requirements for office work with visual
display terminals’. Part 10 of it contains a number of usability principles. The goal of
this section is to show, to which extent the editor fulfills these principles and how that
has been changed in this work.
Suitability for the task. It is possible to create OCL constraints with the editor without
getting distracted by other things. GF grammars already make a pre-selection of what
kind of texts are creatable with them. Therefore, the user is restricted to what the
grammar he loaded offers, only the funs defined in this grammar and applicable at the
current position are offered. Others are not shown at all. Thus, the user can only edit
OCL constraints in the editor, but is not distracted with other options. But how well
users can cope with the grammar-based approach is still to be seen.
A number of editing steps of the user have been automated in this work. Coercions
are (not completely, though) done by the editor and properties of self are made more
accessible. The restructuring of the refinement menu made it easier to choose a wanted
fun. Hence, the user now gets more support in expressing what he wants. With
that and more descriptions, editing OCL became easier.
Self-descriptiveness. Whenever a command to GF changes the editing state, the editor
will present the state to the user. So for most commands, the user will get feedback,
that reflects, what has been done. But for some commands like copying to the clipboard,
nothing is shown since the editing state has not been changed. Here, appending some
text to the linearisation area could help.
But still, for this principle, the biggest strides forward have been made. All the OCL
funs got a description of what they express, the user gets an explanation of the current
node, and the output is more readable. Before, in contrast, the user was left to guess what
funs do, because the often cryptic name was the only description available.
Controllability. The user is forced to edit top-down. Beginning with the leaves and then
connecting them with suitable funs is only possible with wrapping, but type changes are
not allowed. Thus, the user is not free and cannot choose where to start. But he still
can select any already existing node in the linearization or tree and refine it and he can
go back and forth between them. So he is given a somewhat reduced freedom. Also
the user cannot apply a fun where it does not fit; apart from changing type
arguments, the user is not allowed to make errors. But in return for these restrictions,
he gets correct OCL. So the lack of control has advantages, too.
Conformity with user expectations. Grammar-based editing is different and the user
will probably not be accustomed to this kind of workflow. So the editor breaks this
principle by design. But showing the user what the current node does is a step towards
helping him find his way nevertheless.
Another thing is that all refining actions (even deleting or copy&paste) are offered in
the refinement menu. Even reading in input from the user can now be activated here
and is not hidden behind other buttons which are not helpfully labelled. It has been
moved there to match this principle. Only for selecting other nodes and for undo the
user has to search somewhere else.
Error tolerance. The goal of the grammar-based approach is to make it impossible to
produce invalid OCL. So in theory, error tolerance shouldn’t be a problem.
But things are not that ideal, there are hidden type elements. When the user changes one
of them afterwards, type errors can be produced. But GF will report those constraints
and the editor will show all coerce nodes in the AST to give the user the possibility to
correct this erroneous state.
A different way to introduce typing errors is via coercions. Before, GF offered all funs
returning Instance ? at the Instance node of coerce, including funs with types that are
no subtype of the originally expected type. If such a fun was chosen, GF couldn’t fill in
the subtyping witness, but did not mark that as a possible error.
Now, in contrast, this situation is normally prevented. The refinement menu will only contain funs with a suitable subtype, or with a type depending on
an argument, which can be a problem. When such a fun is refined and a ‘wrong’ type
selected, the corresponding coerce node will be shown and the subtyping witness node will
be coloured red to make this error situation visible. Before, no hint that there was an
error was given.
Also, exploring the editor with trial and error is possible, since unwanted actions can be
undone with GF’s undo mechanism.
Suitability for individualisation. The different automatisms introduced in this work
or the display of linearisation languages can be switched off. Some minor things like
appending the type of the funs in the refinement menu to them can be switched on. But
more individualisation is not supported.
Different sets of fun descriptions as mentioned in section 9.1.4 would be a way to help
here. Different linearisation languages could be written and would instantly be supported by the editor, but the editor itself is not internationalised. This is still to be
done.
Suitability for learning. OCL is displayed by default. All changes to the editing state
are also reflected in OCL. The user can start without knowledge of OCL, but gradually
see which chosen action in the refinement menu, along with its description, belongs to
which output changes in OCL. This effect could even be enhanced by giving the OCL
syntax in the tooltip for a fun, but this has not been done yet. To implement that, only
the grammar files would have to be changed, so supporting it should not require much
effort.
Watching the changes in OCL was possible before, but without descriptions and a sometimes not helpful linearisation, it was not really clear, what happened, and thus, learning
was impeded.
11.6. Future work
There is a number of things that this Diplomarbeit did not address, but still fall under
its topic. The collapsing tree and AtomSent have already been mentioned. But there
are more.
The jumping cursor. GF still has a strange algorithm to compute, which node should
be selected next. In the editor, as it is designed now, this decision is completely left to
GF and should be fixed there. The tree is not analysed with respect to that. But since
the main GF developer recognised the jumping cursor of GF as a bug, it will be fixed
there, and the editor does not have to be adapted, to benefit from that.
Pseudo UML model features. OCL features let definitions to introduce local shorthands for more complicated constructions. These are supported by the grammars, but
their use is quite complicated and has not been simplified by the editor. Offering an
entry in the refinement menu like ‘make to local let definition’, which automatically
introduces a suitable let definition and replaces the current subtree with this shorthand,
would make life with let much easier for the user. Transforming bound variables in this
subtree to iterator variables which need a special fun to introduce them as parameters
would be the challenge here.
But let definitions are already an advanced part of OCL and the main point of this
work was to remove usability blockers, so in the eyes of the author it was justified to
postpone the support for let definitions to a later stage.
<<definition>> constraints are global shorthands which can be used in several constraints
for the same context. That is something not applicable to the system as it is,
where only one constraint of one type (either method contract or invariant) is allowed
per context. Also, the semantics of these pseudo model features have been changed in
the past, and will be again, as they won’t be part of UML 2.0 (see [CK04]). One should
define these features in the model and add a special stereotype to them there. That
way, funs will be generated for them as for normal features, so no additional work will
have to be done to support them. Therefore no steps have been made to support the
OCL 1.5 def construct.
Lists. When the OCL parser sees three boolean expressions conjoined with and in OCL, it
will use a list construct for that in the AST. These lists have the advantage that they
can be rendered in NL as bulleted lists, which shows their structure better compared
to producing just a long conjoined sentence. But producing these lists in the editor is
a bit awkward and goes via AtomSent. Only when this work was nearing its end did GF
get a standardised way to formulate lists. It would be really helpful for the user to
make list construction easier, to abstract away from the nil and cons list constructors
and just show one list node in the AST, which has one child node for every list element,
and perhaps a node labelled ‘...’, that produces a new list element when clicked.
Drag & Drop. To move a node in the AST to another position at the moment is only
possible by using the clipboard of GF and has to be done manually². Furthermore,
moving is restricted by bound variables. If used in the subtree of the node which is to
be moved, they also have to be available at the target position. GF will check for that
and only offer the refine from clipboard command, if they match.
² Actually, shortly before the deadline of this work a special command has been implemented in GF, that would make drag&drop easier to implement.
But why not automate the use of the clipboard and give the user the possibility to use
drag & drop? If a drag starts, copy the node. When hovering over a possible target
node, ask GF in the background, whether the dragged node is pasteable there or not.
If it is, allow a drop, if not, change the cursor accordingly to signal that. This could
even be supported in the linearisation area. Select a body of text there and drop it over
a question mark, either as a copy or as a move. That would make reordering the tree
much simpler.
Custom linearisation rules. For English, the heuristics that transform a class name
into a common noun work quite well, for German they don’t. This problem was there
already at the beginning ([Dan03]), but has not been fixed yet. A solution would be to
add additional JavaDoc tags to the classes and methods that contain GF lins. These
could then be used by the grammar creator.
The problem with this approach is that this cannot be done automatically, and would
require some familiarity with the GF syntax, although common nouns are not that
hard to write. Also, GF 2.3 offers linearisation rules by example, which could take away
the need to know the GF syntax at all, so that would be worth trying out.
Renaming bound variables. When the linearisation of a bound variable is clicked, the
first node of the subtree, which is the scope for this variable, is selected. But that
is not the node which introduces the variable, as it is usual for higher order abstract
syntax variable treatment. GF does not recognise bound variables as separate entities,
so they cannot be selected in the AST as such. And selecting would be the first step for
renaming.
Now the user has to use the button alpha and enter both the old and the new name.
Not even a list to select the variable is offered, although that could be done without
support from GF.
Saving in between. As mentioned above, OCL cannot be saved as a JavaDoc comment
in the middle of an editing session. To make the save and load buttons save in the model,
and to offer an additional menu entry Save As, which would do, what Save does now
(saving in an extra file), would be something expected by a user. The same goes for
loading. Also being able to revert to the state directly after loading would be nice.
Parsing. Users who have been shown the editor expressed the wish to enter OCL by
hand and have it parsed. Using the editor is different from the free-form text approach
of normal programming. And the point of the editor is to make entering OCL by hand
unnecessary. But nevertheless it would be nice to support parsing, since the editor may
slow down more experienced users for simple expressions for which they already know
the syntax.
The GF built-in parser is, as mentioned before, limited and cannot be used to parse
OCL because only a small, not yet known fragment is parsable. There is the external
parser, but that only knows how to parse complete OCL documents, not only parts.
And the OCL has to be complete, no open metavariables may be left. And that usually
is not the case. So the parser would have to be changed to allow that.
Architecture. As mentioned in section 10.1.5, the system does not yet conform to one
of the more advanced separation patterns of GUI and model. So continued refactoring
efforts have to be made to separate the GUI, the editing state and the GF calls.
Speed. The most annoying thing about the editor at the moment is the loading time.
50 seconds are just too long to be used for small constraints for single methods/classes
on a Pentium M with 1.5 GHz and 512 MB RAM (depending on the memory load, this
figure can vary between 40 and 75 seconds). And for each method/class, the editor must
be loaded again. Only the previously generated grammar files can be used again. Here,
much has to be done to make the editor acceptable.
Reusing the loaded state and just giving GF a new OCL context would be one way to
make editing constraints consecutively easier. But as soon as the model changes, the
grammars also have to be changed, which would make a long reload necessary. Another
problem is the memory consumption. The GF process takes 179 MB by itself, which
with just 512 MB is noticeable and produces quite some swapping.
Loading time and memory consumption could be reduced by making the loaded languages selectable, especially deselecting German with its huge resource grammars would
really help here. Loading only English and OCL as concrete languages reduces the initial
loading time from 50 to 15 seconds and GF’s memory consumption goes down to 60 MB.
Another performance problem is how the HTML indices are calculated. Snippets are
not appended to the JTextPane directly, but to a StringBuffer, which is then given
as a new HTML document to the JTextPane, so that for every following snippet, the
previous ones have to be parsed again. This makes the algorithm quadratic, and lengthier
constraints are noticeably slower than short ones. An HTML parser to which parts can be
added without reparsing the already parsed document could remedy that.
Here, too, switching off one language helps, since it reduces the number of GF snippets
by around one third.
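The quadratic behaviour described above can be illustrated with a small cost model. This is a sketch under stated assumptions, not the editor's code: the class HtmlAppendCost and its methods are hypothetical, all snippets are assumed to have the same length, and parsing cost is assumed to be proportional to the number of characters parsed.

```java
/** Hypothetical cost model for the two append strategies; not taken from the editor. */
final class HtmlAppendCost {

    /** Characters parsed when the whole buffer is handed over again after every snippet. */
    static long fullReparse(int snippets, int snippetLength) {
        long parsed = 0;
        long bufferLength = 0;
        for (int i = 0; i < snippets; i++) {
            bufferLength += snippetLength; // the buffer grows by one snippet
            parsed += bufferLength;        // and is parsed again from the start
        }
        return parsed;
    }

    /** Characters parsed when each snippet is parsed exactly once. */
    static long incremental(int snippets, int snippetLength) {
        return (long) snippets * snippetLength;
    }

    public static void main(String[] args) {
        for (int n : new int[] {10, 100, 1000}) {
            System.out.println(n + " snippets: full reparse " + fullReparse(n, 50)
                    + " characters, incremental " + incremental(n, 50) + " characters");
        }
    }
}
```

With n snippets the first strategy parses on the order of n² characters, the second on the order of n. Swing's HTMLDocument offers insertBeforeEnd for adding HTML to an existing document; whether it suits the editor's index calculation has not been evaluated here.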
What also slows the editor down are the GF calls in the background. When their output
is read, only some parts of the GF state are parsed, and the linearisation is always
skipped. Still, calling GF takes time.
But as long as the editor wants to know something about possible refinements at a
certain position or whether a command can be filled in automatically, it has to ask GF.
This intelligence doesn’t come for free.
In the original design, as stated in section 8.1, the editor was meant to have no knowledge
about the types in the tree; it should just display the AST. All the intelligent work was
supposed to be done by GF, with the editor being just a dumb GUI for it.
In this work, this division of labour has been broken up to some extent. The editor
now does some domain-specific analysis of the editing state and changes it accordingly
(namely automatic refinements). It is a layer above GF, which uses GF, but some
of its functionality is closely tied to what GF already does. Removing self and result
from the refinement menu is such an example. Here the editor just looks one step ahead
in the refinement menu to check whether all children of a refinement can be filled in
in the given situation. This check is not of a higher level than what GF does. It is just
not advisable for the generic GF to always look farther into the future than necessary,
since that could be quite expensive and is only needed for two funs of the OCL grammars.
So there are reasons to put these computations into the specialised editor and not into
GF. But that incurs the run-time cost of calls to the external GF process.
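The one-step lookahead can be sketched as a filter over the offered refinements. The sketch below is illustrative only: the class RefinementLookahead, the method visibleRefinements and the predicate canFillChildren are hypothetical names, and the predicate merely stands in for a round trip to the external GF process.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical sketch of the one-step lookahead; not the editor's actual code. */
final class RefinementLookahead {

    /**
     * Returns the refinements to show in the menu. Only funs listed in guardedFuns
     * are probed; all other funs are passed through without asking GF.
     *
     * @param canFillChildren stand-in for a call to the external GF process
     */
    static List<String> visibleRefinements(List<String> offered,
                                           Set<String> guardedFuns,
                                           Predicate<String> canFillChildren) {
        List<String> visible = new ArrayList<>();
        for (String fun : offered) {
            // Short-circuit: the expensive probe runs only for guarded funs.
            if (!guardedFuns.contains(fun) || canFillChildren.test(fun)) {
                visible.add(fun);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        List<String> offered = Arrays.asList("self", "result", "excludes");
        Set<String> guarded = new HashSet<>(Arrays.asList("self", "result"));
        // Assumed situation: only self can have all its children filled in.
        System.out.println(visibleRefinements(offered, guarded, fun -> fun.equals("self")));
    }
}
```

Restricting the probe to the guarded funs keeps the number of additional GF calls small, which matches the argument that a general lookahead in GF itself would be too expensive.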
Optimisation has been left to future work. The priority was first to see what can be
done with the editor and what it can do for the user, and only afterwards how it can be
done faster.
Other formal languages The improvements implemented for the editor in this work
were made with OCL in mind. But, as described in section 11.4, not everything
done was OCL specific. Especially for formal languages, most of the improvements besides
the mentioned generic ones can be reused.
For all models which feature subtypes, subtyping has to be emulated with coercions. This
problem is general and not specific to OCL. The code for it is generic enough to be
applicable even to another specification language like JML³, which recently
got supported by KeY. Only the category names for classes and instances and the fun
coerce itself have to be the same.
The same ways to prohibit a type-incorrect use of OCL’s self and result can be used for
JML’s this and \result, namely the guard argument and the editor probing whether
this or \result, respectively, can be refined in the current situation. If the funs have
the same names, the code should work out of the box. The same holds for the properties
of this and how they can be made more accessible.
And if someone wrote a concrete grammar for a subset of JML for the abstract OCL
grammar, no changes would be necessary in the editor at all. Writing such a grammar,
however, would be quite complicated.
According to [Ham04], a concrete grammar should theoretically be writable for the part
of JML that is also covered by OCL, although quite a number of helper constructions
would be needed, which would blow up the linearisation in JML. Some steps in
the reverse direction have already been taken, like introducing a null value and class
or a statement excThrown, but already arrays make things complicated (see [Rot02] for
details) and would also need helper definitions (they are silently treated as Sequences in
the current GF OCL grammars).
And specifying loops, which are part of the implementation and not of the model, is
not possible in OCL at all. OCL is restricted to the model and no ‘access’ to the
implementation is possible. Hence, these parts cannot be translated.
³ http://www.cs.iastate.edu/~leavens/JML/
11.7. Related work
One tool that goes a step in the same direction is Octopus⁴. It features a
display of OCL constraints in a language called Business Modelling Language (BML), which
resembles SQL. That way, SQL developers have an easier time understanding what an
OCL constraint does. But only OCL can be translated into BML; the reverse direction
is not supported. Thus no changes can be made there; the view is read-only. The main
focus of Octopus, however, is model-driven prototyping and code generation.
The tool OCLE⁵ also goes in this direction, together with execution and debugging.
Modelling and simulation is the goal of Xactium⁶. Both support editing
OCL: they feature a text editor combined with code completion or navigation help,
together with checking whether the OCL is well-formed.
For editing OCL, however, this work uses an approach that has not been covered before.
⁴ http://www.klasse.nl/english/research/octopus-intro.html
⁵ http://lci.cs.ubbcluj.ro/ocle/index.htm
⁶ http://albini.xactium.com/content/index.php?option=content&task=section&id=7&Itemid=49
The following is the documentation about the syntax used for the printnames. For GF,
it is also available as an individual file.
A.1. Introduction
The main new feature of the new version of the graphical syntax editor for GF, called
gfeditor, is enhanced printnames. Printnames are a GF feature available for concrete
grammars: a printname is displayed in the editor instead of the name of the fun in the
abstract grammar. Short fun names might be enough for a grammarian, but for the
unacquainted user, they are not. In figure A.1 you can see an example comparison for the
OCL grammars.
Printnames in GF are just strings that contain no special code. They used to be
displayed as they are written in the grammars.
Now some processing is done. The following is possible with them:
Tooltips Possible refinements can get tooltips attached to them to give the user of the
editor more hints about what will happen (see figure A.2).
Grouping Refinements can now be grouped to make large lists more concise (see figure
A.2).
Parameter descriptions These are displayed at several locations:
• Below the fun description in the tooltip for the entries in the refinement
menu, as also visible in figure A.2
• As tooltips in the AST for unrefined nodes, as visible in figure A.3
• Above the refinement menu as the current goal, to give the user a hint about
what is expected next (also figure A.3)
Figure A.1.: The refinement menu of the OCL grammars. On the left (a) without printnames, with only the abstract fun names; on the right (b) with printnames.
The following is an example in which all printname features are used:
printname excludes = ["element is not included in collection"]
    ++ ["\\$False iff the given object is an element of the given collection."]
    ++ ["\\%COLL{Collection operations"]
    ++ ["\\$Operations on collections predefined in OCL."]
    ++ ["These operate on the OCL collection types,"]
    ++ ["not on Vectors, AbstractLists of the implementation language.}"]
    ++ ["\\#collElemType The official element type of coll."]
    ++ ["That is, the parameter type of coll"]
    ++ ["\\#instType The type of elem."]
    ++ ["\\#elem The instance that must not be an element of coll."]
    ++ ["\\#coll The Collection which must not include an element"]
    ++ ["This Collection must be parametrized with (collElemType)"];
Figure A.2.: Grouping, tooltips and parameter descriptions
Figure A.3.: Parameter description as tooltip in the abstract syntax tree
Processed markup commands are \\$, \\% and \\#.
A.2.1. Tooltips
\\$ starts a tooltip and ends the displayed text. If you have the following printname:
printname excludes = ["element is not included in collection"]
    ++ ["\\$False iff the given object is an element of the given collection."]
then ‘element is not included in collection’ will appear in the refinement menu, and
‘False iff the given object is an element of the given collection.’ as the tooltip. Tooltips
are always displayed as HTML, so this markup language can be used here. Note that
no HTML can be used for the text displayed in the refinement list.
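How such a printname might be split into display text and tooltip is sketched below. The class PrintnameTooltip and the method split are hypothetical and do not reproduce gfeditor's implementation. The sketch assumes that the processed string contains the marker as a backslash followed by a dollar sign, and it handles only printnames without group or parameter markup.

```java
/** Hypothetical sketch of the display text / tooltip split; not gfeditor's code. */
final class PrintnameTooltip {

    /** Assumed marker in the processed string: a backslash followed by '$'. */
    static final String TOOLTIP_MARK = "\\$";

    /** Returns {displayText, tooltip}; the tooltip is null if there is no marker. */
    static String[] split(String printname) {
        int mark = printname.indexOf(TOOLTIP_MARK);
        if (mark < 0) {
            return new String[] {printname.trim(), null};
        }
        String display = printname.substring(0, mark).trim();
        String tooltip = printname.substring(mark + TOOLTIP_MARK.length()).trim();
        return new String[] {display, tooltip};
    }

    public static void main(String[] args) {
        String[] parts = split("element is not included in collection "
                + "\\$False iff the given object is an element of the given collection.");
        System.out.println("menu entry: " + parts[0]);
        System.out.println("tooltip:    " + parts[1]);
    }
}
```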
A.2.2. Grouping
With \\% it is possible to define groups of refinements. Everything from \\% to the next
space or { is taken as a tag for a subcategory group. All refinements having the same tag
will be grouped under one entry in the left refinement menu. In figure A.2, for example,
you can see all entries that have the tag \\%COLL in the right menu. A tag is defined by
use; it does not have to be declared in advance. The tag is independent of the types
of the funs of GF, so funs with arbitrary types can be in the same group. They just
won’t normally appear together at the same time, since GF only permits type-correct
refinements, and normally only one type is correct.
In figure A.4 you can see that subcategories can have more elaborate descriptions than
just \\%COLL. If a subcategory tag is directly followed by a {, everything until the next
} is taken as the display text for that subcategory. As for display texts for funs, \\$ is
the delimiter between the text that will be displayed directly in the refinement menu
and the text that is displayed as the tooltip. As for fun tooltips, HTML can be used here.
If only one member of a group is available, the group won’t be displayed. The refinement
will be listed instead in the left menu as a top category item.
Note that it is not possible to use a group hierarchy. Only one level of grouping is
possible.
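The grouping rules of this section can be sketched as follows. The class RefinementGrouping and its methods are hypothetical and do not reproduce gfeditor's implementation. The sketch assumes that the marker appears in the processed string as a backslash followed by a percent sign, and it shows group entries as the tag in square brackets instead of using the display text in braces.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the grouping rules; not gfeditor's code. */
final class RefinementGrouping {

    /** Assumed marker in the processed string: a backslash followed by '%'. */
    static final String GROUP_MARK = "\\%";

    /** Extracts the group tag (for example "COLL"), or returns null if there is none. */
    static String groupTag(String printname) {
        int start = printname.indexOf(GROUP_MARK);
        if (start < 0) {
            return null;
        }
        start += GROUP_MARK.length();
        int end = start;
        // The tag runs up to the next space or '{'.
        while (end < printname.length()
                && printname.charAt(end) != ' '
                && printname.charAt(end) != '{') {
            end++;
        }
        return printname.substring(start, end);
    }

    /** Builds the top-level menu: "[TAG]" for groups with at least two members, else the fun. */
    static List<String> topLevelMenu(Map<String, String> printnameByFun) {
        Map<String, Integer> groupSize = new LinkedHashMap<>();
        for (String printname : printnameByFun.values()) {
            String tag = groupTag(printname);
            if (tag != null) {
                groupSize.merge(tag, 1, Integer::sum);
            }
        }
        List<String> menu = new ArrayList<>();
        for (Map.Entry<String, String> entry : printnameByFun.entrySet()) {
            String tag = groupTag(entry.getValue());
            if (tag == null || groupSize.get(tag) < 2) {
                menu.add(entry.getKey()); // ungrouped, or the only member of its group
            } else if (!menu.contains("[" + tag + "]")) {
                menu.add("[" + tag + "]"); // one entry per group, one level only
            }
        }
        return menu;
    }

    public static void main(String[] args) {
        Map<String, String> printnames = new LinkedHashMap<>();
        printnames.put("excludes", "element is not included \\%COLL{Collection operations}");
        printnames.put("includes", "element is included \\%COLL");
        printnames.put("size", "size \\%NUM");
        System.out.println(topLevelMenu(printnames));
    }
}
```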
Figure A.4.: Using nicer display texts and tooltips for subcategories in the refinement
menu
A.2.3. Parameter descriptions
Each parameter of a fun can get a description in the printname of this fun. The places
where this information is used are listed at the end of section 6.
Giving the parameters descriptions works a bit like in JavaDoc. They look like the
following:
["\\#collElemType The official element type of coll."]
First comes a special tag, \\#. This is directly, without a space, followed by the name of
the parameter¹. The name is ended by the first space. After that, everything until the
next \\# is part of the description of that parameter. HTML can be used here.
The order of these parameter descriptions is important, because they are not checked
against the signature of their fun. The first description will be used for the first
parameter and so on. If there are more descriptions than parameters, the superfluous
descriptions get read, but are never used. If there are too few, then only the parameters
with descriptions get described, while the rest get no tooltips, and the generic ‘choose
action on subterm’ will be displayed above the refinement menu if such a parameter is
selected.
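The positional assignment of parameter descriptions can be sketched as follows. The class ParamDescriptions and its methods are hypothetical and do not reproduce gfeditor's implementation. The sketch assumes that the marker appears in the processed string as a backslash followed by a hash sign, and it does not treat the ‘!’ exception mentioned in the footnote.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of positional parameter descriptions; not gfeditor's code. */
final class ParamDescriptions {

    /** Assumed marker in the processed string: a backslash followed by '#'. */
    static final String PARAM_MARK = "\\#";

    /** Generic goal text for parameters without a description, as quoted in the text. */
    static final String GENERIC_GOAL = "choose action on subterm";

    /** Returns the descriptions in the order in which they appear, without the names. */
    static List<String> parse(String printname) {
        List<String> descriptions = new ArrayList<>();
        int mark = printname.indexOf(PARAM_MARK);
        while (mark >= 0) {
            int start = mark + PARAM_MARK.length();
            int next = printname.indexOf(PARAM_MARK, start);
            String segment = (next < 0
                    ? printname.substring(start)
                    : printname.substring(start, next)).trim();
            // The parameter name ends at the first space; the rest is the description.
            int space = segment.indexOf(' ');
            descriptions.add(space < 0 ? "" : segment.substring(space + 1).trim());
            mark = next;
        }
        return descriptions;
    }

    /** The n-th description belongs to the n-th parameter; missing ones get the generic text. */
    static String goalText(List<String> descriptions, int paramIndex) {
        return paramIndex < descriptions.size() ? descriptions.get(paramIndex) : GENERIC_GOAL;
    }

    public static void main(String[] args) {
        List<String> d = parse("excludes \\#collElemType The official element type of coll. "
                + "\\#instType The type of elem.");
        for (int i = 0; i < 3; i++) {
            System.out.println("parameter " + i + ": " + goalText(d, i));
        }
    }
}
```

Because the names are never checked against the signature of the fun, the sketch discards them and relies on the order alone, as described above.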
¹ There is one exception to this. A ‘!’ is allowed here. This is used in OCL mode and triggers the
automatic insertion of a coerce. For general grammars, the ‘!’ is simply ignored.
Bibliography
[ABB+05] Wolfgang Ahrendt, Thomas Baar, Bernhard Beckert, Richard Bubel, Martin
Giese, Reiner Hähnle, Wolfram Menzel, Wojciech Mostowski, Andreas Roth,
Steffen Schlager, and Peter H. Schmitt. The KeY tool. Software and Systems
Modeling, 4:32–54, 2005.
[BH03] Richard Bubel and Reiner Hähnle. Formal specification of security-critical
railway software with the KeY system. In Thomas Arts and Wan Fokkink, editors,
Proc. Eighth International Workshop on Formal Methods for Industrial
Critical Systems (FMICS 03), volume 80 of Electronic Notes in Theoretical
Computer Science. Elsevier, 2003.
[BJ05] David A. Burke and Kristofer Johannisson. Translating formal software
specifications to natural language: a grammar-based approach. In Philippe
Blache, Edward Stabler, Joan Busquets, and Richard Moot, editors, LACL
2005, number 3492 in LNAI, pages 51–66. Springer, 2005.
[Bub02] Richard Bubel. Formale Spezifikation und Verifikation sicherheitskritischer
Software mit dem KeY-System. Master’s thesis, Universität Karlsruhe, 2002.
[CK04] Maria Victoria Cengarle and Alexander Knapp. OCL 1.4/5 vs. 2.0 expressions –
formal semantics and expressiveness. Software and Systems Modeling, 3(1):9–30,
March 2004.
http://www4.in.tum.de/lehre/seminare/hs/WS0405/uml/CK04b.pdf.
[Dan03] Hans-Joachim Daniels. Eine Deutsche Grammatik für OCL, Studienarbeit,
2003. http://www.cs.chalmers.se/~krijo/gfspec/daniels03.pdf.
[Ham04] Ali Hamie. Translating the Object Constraint Language into JML. In
19th ACM Symposium on Applied Computing, pages 1531–1535, Nicosia,
Cyprus, 2004.
http://cmis.mis.brighton.ac.uk/Research/vmg/papers/SAC2004.pdf.
[HJR02] Reiner Hähnle, Kristofer Johannisson, and Aarne Ranta. An authoring tool
for informal and formal requirements specifications. In Ralf-Detlef Kutsche
and Herbert Weber, editors, Fundamental Approaches to Software Engineering
(FASE), Part of Joint European Conferences on Theory and Practice of
Software, ETAPS, Grenoble, volume 2306 of LNCS, pages 233–248. Springer, 2002.
[Joh04] Kristofer Johannisson. Disambiguating implicit constructions in OCL. In
OCL and Model Driven Engineering Workshop at the UML Conference in
Lisbon, 2004.
http://www.cs.kent.ac.uk/projects/ocl/oclmdewsuml04/description.htm.
[Joh05] Kristofer Johannisson. Formal and Informal Software Specifications. PhD
thesis, Göteborg University, Chalmers University of Technology, June 2005.
http://www.cs.chalmers.se/~krijo/thesis/thesisA4.pdf.
[Johar] Kristofer Johannisson. The KeY Book, edited by Bernhard Beckert,
Reiner Hähnle and Peter Schmitt, chapter Natural Language Specifications.
Springer, LNAI, to appear.
[Khe03] Janna Khegai. Java GUI syntax editor for GF 1.1. Internet, March 2003.
http://www.cs.chalmers.se/~aarne/GF/doc/javaGUImanual/javaGUImanual.htm.
[KU93] Amir Ali Khawaja and Joseph E. Urban. Syntax-directed editing environments:
Issues and features. Technical report, Arizona State University, 1993.
[Mar02] Robert C. Martin. The Principles, Patterns, and Practices of Agile Software
Development, chapter The Single Responsibility Principle. Prentice Hall,
2002. http://www.objectmentor.com/resources/articles/srp.
[McB00] Conor McBride. Dependently Typed Functional Programs and their
Proofs. PhD thesis, LFCS, University of Edinburgh, Edinburgh,
Scotland, 2000.
http://www.lfcs.informatics.ed.ac.uk/reports/00/ECS-LFCS-00-419/index.html.
[Ran04] Aarne Ranta. Grammatical framework: A type-theoretical grammar formalism.
The Journal of Functional Programming, 14(2):145–189, 2004.
[Rot02] Andreas Roth. Deduktiver Softwareentwurf am Beispiel des Java Collections
Frameworks. Diplomarbeit, Fakultät für Informatik, Universität Karlsruhe,
June 2002.