Reading a text file
You need the FileReader
and BufferedReader
classes
- The
FileReader
handles the opening and reading of text files, character by character
- The
BufferedReader
allows you to read the file line by line instead.
This code will throw an exception if the file cannot be opened, or if it does not exist
Writing a text file
You need the FileWriter
and BufferedWriter
classes
- The
FileWriter
handles the opening and writing of text files, character by character
- The
BufferedWriter
allows you to write multiple characters at once.
This code will throw an exception if the file cannot be opened, or if it does not exist
Writing a text file
You need the FileWriter
and BufferedWriter
classes
- The
FileWriter
handles the opening and writing of text files, character by character
- The
BufferedWriter
allows you to write multiple characters at once.
This code will throw an exception if the file cannot be opened, or if it does not exist
The static and final keywords
Static
We use the static
keyword when we want a variable or method to be accesible without creating an instance of a class
- A static variable will have the same value for all of the instances of a class
- Only one copy of the static variable exists in the computer's memory.
- If you change the value of a static variable, this change will be seen in all instances of the class.
Static
We use the static
keyword when we want a variable or method to be accesible without creating an instance of a class
- A static method can be called without creating an instance of a class
- Inside the class file where they are declared, you can call static methods directly by their name
- Outside that file, you should call them as
ClassName.staticMethodName(arguments)
- Only one copy of the static method exists in the computer's memory.
Static
Some rules for using static methods and variables
- A non-static method can access: static methods and variables
- A non-static method can access: non-static methods and variables
- A static method can access: static methods and variables
- A static method cannot access: non-static methods and variables. It doesn't have access to the
this
reference.
Static + Final
The final
keyword is used when we want the value of a static
variable to be constant
This is usually useful when you want to precompute certain numbers and keep them constants during the execution of your program.
Hashtable
Think of a Hashtable as a dictionary
We use a Hastable to associate a key to a value
Hashtable
Keys, as well as values, can be of any type
Hashtable
A hash table is a generalization of the simple fixed-size array
- In a fixed size array, you access elements by their index, which is a number
between 0 and the size of the array minus 1
This is called direct addressing, since we can access an element of an array with just one operation
- A hash table uses a similar idea, we want to access a value directly using its key
Hashtable
To get direct access, a hash table uses a fixed-size array to store data.
We would associate each key to a position in the array, and store the data in the corresponding slot in the array.
We could try and have one slot in the array for every possible key; i.e. map each possible name to a number in the array
This is not feasible if the number of keys is very big ( How many possible names are there?)
Furthermore, we might not even need all of the possible positions in the array to store data
Hashing
Instead, we use a fixed-size array with a small size. For example, we could use just 26 slots for all possible names in a phone book.
We need to compute an index for every key. This is called hashing .
Hashing function
A hashing function maps input data of arbitrary size to output data of fixed size; i.e. The set of all possible person names to 26 slots
For a key k
, we store its value in the slot given by the hashing function f(k)
We say that f(k)
is the hash value of the key k
A good hashing function would give different hash values to every key
Hashing function
But since we are limiting the size of the fixed-size array (the number of slots available) ...
.. multiple keys will have the same hash value. This is called a collision
- In our example, the hash function maps the first letter of a name to a number from 0 to 25; representing its position in the alphabet
- Inevitably, as we add more elements to the hash table, multiple elements will have the same same hash value.
Collisions in hashing functions
Collisions happen when more than two keys have the same hash value
We can resolve collisions by
- Increasing the size of the array that holds the content, and finding a better hashing function
- Chaining multiple values in the same slot, using a
LinkedList
Collisions in hashing functions
Collisions happen when more than two keys have the same hash value
We can resolve collisions by
- Increasing the size of the array that holds the content, and finding a better hashing function
- Chaining multiple values in the same slot; e.g. using a
LinkedList
Collisions in hashing functions
We can resolve collisions by
- Increasing the size of the array that holds the content, and finding a better hashing function
Picking a good hashing function is a good idea: In our example, there are a lot more people with a name starting with J, than there are people with a name starting with Q
But as long as there are more keys than available slots (hash values), there will be collisions.
Collisions in hashing functions
We can resolve collisions by
- Chaining multiple values in the same slot; e.g. using a
LinkedList
We lose some of the benefit of direct addressing. But this is: 1) faster than a Linked list on average 2) More flexible than a fixed-size array
An implementation of a HashTable
See the provided code in the course website
An exercise for you:
Download the baby-names.csv file from the course website
- Using a Hashtable, count how many times each name appears on the list . See how popular your name is
- Compute the percentage of baby names with a given starting letter; e.g. how what is the percentage of names starting with a J
- Can you come up with a hashing function that maps baby names to 100 slots? Try loading the names in the baby-names.csv file to your Hashtable. Can you change your hashing function so that every slot has more or less the same number of elements?
/