top of page

Week 2

The material of the second week wraps up EBNF and introduces the first Java programming concepts. It also covers some general concepts in programming.

Verification

How to use tables and trees to verify the legality of a word according to an EBNF rule.

In the lecture you have seen two techniques to allow us to more formally argue over the legality of a word according to some EBNF rule. The first is the table and the second one is the tree.

Both are equally good for proving legality of a symbol. It however is easier to draw a table on a computer than a tree. Hence we will mostly use tables to demonstrate legality of a symbol according to some EBNF rule. 

EBNF

Verification with Trees

Given an EBNF rule we can use a tree to verify that it accepts some word. The root of the tree is given by the EBNF rule and every branch of the tree leads from some element of the rule to an evaluation of it. Evaluating an element of a rule means doing one of the following things.

  • Replace a rule name (LHS) by the corresponding definition of the rule (RHS).

  • Select an element in a selection.

  • Decide an option, i.e. choose it or leave it out.

  • Decide the number of repetitions.

Is -42 allowed by <integer> ?

EPROG Trees.png

EBNF

Verification with Tables

Given an EBNF rule we can use a table to verify if it accepts a certain word. In each row of the table we evaluate one of the elements of the previous row to extend on the previous evaluations.

  • Replace a rule name (LHS) by the corresponding definition of the rule (RHS).

  • Select an element in a selection.

  • Decide an option, i.e. choose it or leave it out.

  • Decide the number of repetitions.

It is considered good practice to comment​ each row in the table by what evaluation was made in that step. This helps the reader and yourself to make sure that the steps made are correct. In the example given here the steps were left out only because of the lack of space.

Is -42 allowed by <integer> ?

<integer> <= [+ | - ]  <digit>  {<digit>}
                 <= [-] <digit>  {<digit>}  
                 <= - <digit> {<digit>}

                 <= - <digit>  <digit>
                 <= - 4 2

                 

EBNF

Special Symbols

The symbols [, ], {, }, (, ), =, > are used in EBNF rules and are part of control forms. We can hence not directly add them to a language. To allow for this we add a rectangle around such symbols to allow the reader to differentiate between control form and the actual symbol.

Often in textbooks and online resources quotations marks are used for this purpose as well. In the solutions on exercises we will use quotation marks as well as it is more convenient to do so.

{
[
|
(
"{"  "["  "|"  "("

EBNF

Equivalent EBNF Rules

Two EBNF rules are called equivalent if they accept the same set of words. In other words an EBNF rule <a> is equivalent to an EBNF rule <b> if all words that <a> accepts <b> also accepts and vice-versa. Hence if <a> rejects a word so does <b>. If the following set of conditions hold then <a> and <b> are equivalent.

  • <a> accepts word x  ⇒  <b> accepts x

  • <a> rejects word x    ⇒  <b> rejects x

  • <b> accepts word x  ⇒  <a> accepts x

  • <b> rejects word x    ⇒  <a> rejects x

<digits> <= {<digit>}

<digits> <= [{<digit>}]

Programming

Introduction

We have seen that EBNF can define the syntax of a language. However as with any language the correctness of the syntax on its own, even though important is not sufficient for a set of words and rules to be considered a language. It takes the semantic to give meaning to chains of words. 

The sentences on the right are syntactically correct, however they have no meaning, i.e. they make no sense to the reader. A programming language is also a language and thus also comes with its own semantic. The syntax of a programming language is easy to learn, the semantics on the other hand, require a lot of practice to fully understand.

"Schiffe riechen gelb."

"Ships smell yellow."

Programming

Semantics

For programming languages two questions are the integral part of the semantics. These questions concern homonyms and synonyms

One may ask the question why this is important. Let us look at the different definitions of <integer> seen in the lecture for this purpose. Some allow for the word 0012 and some do not. A pin number could be 0012, but for numbers we prefer to simply write 12 instead of having preceding zeroes. This allows for each number to have a unique representation which is preferred.

Synonym: Can different symbols have the same meaning?

Homonym: Can the same symbol have different meanings?

Programming

Canonical Representation

We have already mentioned a unique representation for some word. Sometimes an EBNF rule does not allow for that as many words are synonyms. It thus makes sense to agree on one word to be the representant of the synonyms. This representant is also called the canonical representation.

 

For the <integer_set> example on the right one can see that the sets {1, 2, 3}, {1, 1, 2, 3, 3} and {3, 2, 1} represent the same set, as sets have no ordering and duplicates are ignored. The canonical representation is the unique set which contains no duplicates and whose elements are ordered from left to right in ascending order, i.e. the smallest number is on the left and the number get larger from left to right.

<integer_list> <= <integer> { , <integer> }
<integer_set> <= "{" [<integer_list>] "}"

Wrapping up EBNF

Syntax Diagrams

EBNF rules which use the first three control forms can be represented in diagrams. On the right one can see how the different control forms are represented in diagrams. 

Diagrams can be useful to illustrate EBNF rules. The legality of a word can be verified by finding a walk through the diagram which constructs the word and goes from the start to the end of the diagram respecting the following conditions.

  • Sequence: Each element in the sequence must be added to the constructed word.

  • Selection: Exactly one branch must be chosen.

  • Option: Either the upper branch with the optional element or the lower branch without an element must be chosen.

  • Repetition: Choose one of the branches, the backwards branch can be taken as many times as desired.​

One notices that only the repetition allows for backward branches, i.e. branches that go from the right side to the left side of the diagram. 

<room> <= M L D 2 8

SequenceDiag.png

<room> <= M L (D 2 8 | E 1 2)

SelectionDiag.png

<op_digit> <= [<digit>]

<rep_digit> <= {<digit>}

Wrapping up EBNF

Recursion I

The last control form is recursion. Recursive rules contain themselves and hence allow for a similar function as the repetition. It however is a stronger concept than repetition, as not all EBNF rules using recursion can be rewritten to use only the three first control forms. All rules using repetition can be rewritten to use only recursion and the two first control forms (no repetition).

We distinguish between direct recursion and indirect recursion. Direct recursion contains itself in its rule definition, whereas indirect recursion does not contain itself as an element of its definition but one of its elements may contain it.

Direct Recursion:

<r> <= <r> |  ε 

Indirect Recursion:

<a> <= <b> 
<b> <= <a> |  
ε
 

Wrapping up EBNF

Recursion II

As already mentioned recursion is more powerful than the three other control forms (sequence and selection, option, repetition). This means that there exists EBNF rules that can only be defined using recursion, but that all EBNF rules that can be described without recursion can also be described using recursion and additionally without making use of repetition.

An example for this is given by giving a rule that can accepts exactly all words which consist of n 'A' characters followed by n 'B' characters. It is not possible to construct such a rule without making use of recursion.

The diagram representation we have seen does not support recursion as it would have to support infinite recursion depth. We hence use tables or trees for checking legality of a word.

Rule requiring recursion:

<balance> <= A  <balance>  B

From EBNF to Java

Java Identifiers

Googling the term "identifier" in combination with Java will yield multiple web-pages among which we find the Oracle Java specification which says the following about Java identifiers

"An identifier is an unlimited-length sequence of Java letters and Java digits, the first of which must be a Java letter."

This on its own simply introduces more words we do not understand. Further down we find the following explanation

The "Java letters" include uppercase and lowercase ASCII Latin letters A-Z (\u0041-\u005a), and a-z (\u0061-\u007a), and, for historical reasons, the ASCII dollar sign ($, or \u0024) and underscore (_, or \u005f). The dollar sign should be used only in mechanically generated source code or, rarely, to access pre-existing names on legacy systems. The underscore may be used in identifiers formed of two or more characters, but it cannot be used as a one-character identifier due to being a keyword.

The "Java digits" include the ASCII digits 0-9 (\u0030-\u0039).

Such explanations are very typical for documentations of programming languages like Java and besides the technical language used they are often not as complex as the seem.

ASCII: ASCII is a standard to represent symbols in computers.

800px-ASCII-Table-wide.svg.png

From EBNF to Java

How EBNF helps us

The above descriptions seemed very complex and are hard to understand. Instead of that we can (and should) use EBNF to simplify the explanations by focusing on the important parts.

On the right one can see a simplified version of this EBNF definition of the rule <identifier> which does not cover all ASCII characters as otherwise the rule would require much more space.

<lowercaseletter> <= a | b | c | d | e | f | g | h | i | j | k | l                        | m | n | o | p | q | r | s | t | u | v | w | x | y | z
<uppercaseletter> <= A | B | C | D | E | F | G | H | I | J |      K | L | M | N | O | P | Q | R | S | T | U | W | V | X | Y | Z
<letter> <= <lowercaseletter> | <uppercaseletter>
<digit> <= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
<identifier> <= <letter> {<letter> | <digit>}

From EBNF to Java

Java Keywords

Certain identifiers allowed by the above rule are reserved in Java similarly to the symbols {, [, (, | in EBNF and are defined not to be identifiers.  We instead call these words keywords. Keywords are used to tell Java that the result of some computation will be an integer, a rational number, a floating point number or it may be used to identify where a program starts and where it ends. These are just very few of many examples where keywords are used.

On the right one can see a list of such keywords in Java, where the larger a keyword is the more frequently you will have to deal with it.

abstract continue for new switch assert default if package synchronized boolean do goto private this break double implements protected throw byte else import public throws case enum instanceof return transient catch extends int short try char final interface static void class finally long strictfp volatile const float native super while

Java Programing

First Java Program

Java is not the easiest language to get started with. The most basic code snippets already introduce many keywords. It is important that at first one should focus only on some of the keywords to understand the basic meaning of a given piece of code. Later on in the lecture all keywords will be discussed. Let us now only focus on the most important ones. In the next part we will dissect the example on the right and explain each of its parts.

FirstProgram.png

Java Programing

Java Class

The first concept to be discussed is the Java class. A Java class is how we start and end a Java program. A Java class called "ExampleClass" must be inside of a file called "ExampleClass.java". There are exceptions to this rule, which will be seen later, on where this is not strictly necessary. For now we will however always follow this rule. A class always has a name which will be used to identify it. 

We use curly brackets to denote start- and endpoints of a class. Similarly curly brackets are used for the same purpose with methods, loops, if-else blocks, etc.

JavaClass.png

Java Programing

Java Modifiers

The keyword public appears in front of the class keyword in the example code we have seen above. This is a modifier. There exists more modifiers than public, for now we only discuss public, and later on in the lecture will introduce the other modifiers.

The public modifier means that the program which the class defines can be accessed and used from any other file on the computer. This is also the reason why the file name must always be the same with public classes as the class name, as any other file may access the class from outside the file using its class name. Would the class be in a file with a different name, then the computer could not find the class. 

JavaClassModifierIntro.png

Java Programing

Java Method

Java classes can be understood as programs. Programs do something, sometimes they do many things. A text editor allows you to scroll through text, edit text, change the font of text, add images to text, and many more things. We call this the functionality of a program. Java methods give a class its functionality and they are analogous to functions in other programming languages besides some technicalities that do not matter at this point.

A method has parameters which are listed between brackets ( and ) after the method name. Here the method name is main. 

JavaMethodIntroParam.png

Java Programing

Java Parameters

Parameters in Java are passed to functions. When a function adds to numbers together then it requires the two numbers to be passed to it as parameters. We declare parameters by first writing their type and then their name. We separate parameters by using commas. In the example to the right String[] is the type and args is the name of the parameter.

The name "args" stands for arguments. Arguments are not the same as parameters. Arguments are the parameters passed to the function with name main. What they do exactly is not important at this point and will be discussed later on.

JavaMethodIntro.png

Java Programing

Java main Method

A program on its own cannot be executed. It must be told which of its functionality we would like to use and how we would like to use it. Take a calculator program for example, we must first tell it what to do and then click on the designated enter or = button to produce a result. Java programs behave exactly the same. Whenever you click the run button on any Java program the main method will be looked at and everything written in the main method that tells the program what to do exactly will be executed. If no main method exists or if it is empty then simply nothing will happen, at least nothing that we can see.

JavaMethodIntro.png

Java Programing

Java Method Return Type

Any Java method must have a return type. A method that adds two parameters and returns a result must have have the return type of the result of the addition of said parameters. Method can also be of the type void, which means that they do not return anything but instead only do something.

It is important that a method can both do something and return something. For example when a method writes a word into a file, then it might return a value that tells the user if the write was successful or not. 

JavaMethodReturnType.png

Java Programing

Java Types and Data

When we want to represent data we make it a data type. Computers represent data differently than humans do, there are also inherent limitations to computers like finite precision. This will not be of much concern to us, but it is important to understand that data types are not a mathematical object but rather a computer model of it. No computer could represent pi mathematically due to its infinitely many digits, however in most cases using 3.14 (computers support much more digits) as approximation is perfectly fine. Hence when you hear that integers are represented by the data type int, then think of it not as that integers are int in Java, but rather that integers are being modelled by int in Java, and in many cases they behave exactly the same.

Java Programing

Java Types

The types shown inside the box on the right are what we call the primitive types in Java, why they are called primitive will be discussed later on. The most important ones are the following

  • int: The int data type models the integers.

  • double: The double data type models the real numbers.

  • char: A character can be any (printable) ASCII character.

  • boolean: A boolean is either true or false.

Another common data type is the String, which is a sequence of characters in Java.

byte  short  int  long  float  double  char boolean 

Java Programing

Java Objects

Java is a object-oriented programming language. An object is an instance of a class in Java. If a car model is the class then each car produced from this car model is an instance of a class. A car model itself cannot drive, it cannot accelerate or break, only an instance of said model can do that. Java behaves similarly. 

We can call methods in Java by using a dot . on the class or an instance of a class and adding the method name and any parameters the method requires. On the right side one can see that a method from the System class is being called that way. Why we have two dots is not important right now, you may ignore the .out for now and just focus on the println method of the System being called. 

The println method prints the String given as parameter, in this case "Hello World!", to the console. 

ObjectCalling.png
OutputHelloWorld.png

Java Programing

Java static

One can see that we have now covered all essential parts of the previously given code. Only one thing remains which is the static keyword. We have already discussed the difference between a class and an object (an instance of a class). The static keyword makes a method accessible from the class itself. If a method is not declared as static one cannot call the method from the Class itself. Non-static methods require an instance of a class to exist to be able to be called. We then again use the dot, but call the method on the object itself rather than on the class.

Coming back to the previously used car example we can say that an accelerate method could not be called on the car model, as a model cannot accelerate only a physical car produced from the model can do that. A method that changes the colour of the car could be called on the car model itself. The main method is always static as it requires no instance of the class to be called on and can be called simply from the class it is contained in.

This merely is a quick explanation to make sure you can grasp what the code on the right does. The lecture will go into much more detail, hence if you struggle to understand this it is completely normal and no reason to worry. 

Java Programing

Programming Conventions

There are a certain set of guidelines one should follows when programming with Java. We follow guidelines as to guarantee that code in general is easily readable and easier to work with. You might be surprised how quickly you will forget what some code does you wrote a week ago. Comments, indentation and variable naming are just some of the practices used to enable you to quickly go over code you wrote and remember what it does exactly without wasting your time. The following are some conventions you should follow:

  • Class names start with an uppercase letter.

  • Method names start with a lowercase letter.

  • We use upper camel case for classes, e.g. a class named weathersimulationsource should be named WeatherSimulationSource.

  • We use lower camel case for methods, e.g. a method named writetoconsole should be named writeToConsole.

We will look at good practices in programming later on in more detail, but keep the above in mind from now on.

Java Programing

Java Comments

Comments are used to explain parts of the code, they are preceded by two dashes if they are inline comments, which will make it such that everything to the right is not regarded as code. Alternatively if a comment is multiple lines long we use /* and */ to make a comment section where everything in between these symbols is not regarded as code.

Refrain from using inline comments as separate lines in your code as that unnecessarily crowds your code and breaks the read flow. Use top-of-the method comments to tell the reader and your future self what a method does and use side-comments only when a certain piece of code is very hard to understand. Commenting should be kept minimal but it should be done. Many people start commenting every line of comment and end up not writing comments at all, due to it "being to time consuming". This is a common fallacy as writing a top-of-method comment takes very little time and it will allows you to already think about what exactly your method is supposed to do before writing it. Yes, we write top-of-method comments before implementing a method.

CommentIntro.png

Java Programing

Escape Characters

What do we do when we want to output a quotation mark? Simply putting inside a println method in between two quotation mark as we did with the Hello World! would not work as the quotation marks are reserved in Java for Strings. We can make use of so-called escape characters to make sure that a quotation mark is recognised as a literal quotation mark. Here is a list of the most common escape characters

  • \t  Insert a tab in the text at this point.

  • \b Insert a backspace in the text at this point.

  • \n Insert a newline in the text at this point.

  • \r  Insert a carriage return in the text at this point.

  • \f  Insert a formfeed in the text at this point.

  • \s Insert a space in the text at this point.

  • \'  Insert a single quote character in the text at this point.

  • \" Insert a double quote character in the text at this point.

  • \\ Insert a backslash character in the text at this point.

CommentIntro.png

Java Methods

Why to use Methods

A method implements one functionality of a given program. It allows us to take pieces of code which are often reused and pack in into a method, which can then be called instead of copying and pasting code. Methods make code more understandable as they the name of the method should tell us what the method does instead of having to understand a bunch of code. You should split every functionality into a separate method as this allows your code to be understood by a potential reader and your future self much better. Think of yourself at the exam where you might not solve an exercise immediately and then come back to an exercise where the first thing you see is a large number of lines of codes all shuffled together. This will have you start almost from scratch. 

Separating functionality into a method allows to find bugs in your code much faster, as you will be told in what method the problem occurs, hence you can focus only on said method instead of many lines of code which might not even have anything to do with the error.

An example for code that would be much more readable if the author would have used more methods.

BadCodeExamplePalindrome.png

Java Methods

Method Return Type

We have already quickly mentioned that method have a return type. If a method has a return type, then we use the return keyword to send back the result to whatever called the method.

Void methods do not require the return keyword (although one can add it if they want to) as they return once everything inside the method has been executed. Lines of code inside a method are ended using a semicolon ; and we call each such a line a Java statement. Java checks that the result being returned matches the type you declared as the return type of a method.

VoidReturnExample_edited.jpg

Java Methods

Method Calling

Code is executed from top to bottom. Calling methods is an exception to this as we calling a method can make us jump backwards or forwards in the code. One can imagine calling methods like adding code in between the method call and the next line in the code. Once the method has been fully executed we simply proceed with the next line of code. 

In the method printName the parameter is not being used. This is because we still lack some of the tools to be able to print parameters preceded by text.

We call the the order in which statements are executed the control flow

MethodCallingExample_edited_edited.jpg

Java Methods

Method Calling II

Methods can call each other multiple times before returning to the top-to-bottom execution. This often causes confusion as to how methods are executed. Whenever a method is called we execute all statements inside of it. If one of these methods is itself a method call we proceed to execute all statements in that method. This process is repeated until we reach a method without a method call. If we reach no method without itself calling another method then our program runs indefinitely. 

One can see in the output of the program to the right in what order the statements are executed. Understanding why this happens is crucial. Take your time to fully understand what exactly is happening. 

Recursion is similar to cleaning your room. You start by cleaning your room, then start cleaning your desk. You then attempt to put your pens into a drawer and notice that it is a mess, so you start cleaning the drawer, then you notice that one of the wheels of the drawer gets stuck because it is broken, so you repair the wheel. After repairing the wheel you finish cleaning the drawer and afterwards you clean your desk. Then you proceed to clean the rest of the room similarly being drawn into smaller subtasks along the way. At the end of the day you then finally finish cleaning your room. You started with cleaning your room but it is the last task you finished.

NestedMethodCalling.png
ResultNestedMethodCalling.png

Java Types

Overflow

We have already mentioned that there is a limitation to the size and accuracy of representation of data. We have a finite number of memory to store our data on, hence we cannot accommodate for infinitely long strings of numbers. In Java every type has a certain range of size of data it can represent.

To the right one can see in the first line the largest positive number an int can represent and in the third line the most negative number an int can represent. We are now faced with a problem. What do we do when computations extend beyond the range of a certain type?

Java like many other programming languages chose overflow as its way to handle these scenarios. Overflow is when we jump from the largest number to the most negative number when adding one. Similarly we go to the n-th most negative number when adding n to the largest positive number. This is something we will not often have to deal with but it is good to remember it.

Overflow.png
OverflowResult.png

That is it.

You are done with theory. To make sure you understand the theory you need to apply what you learnt in practice.

bottom of page