Data Structures, and why you need to actually learn them

This year, I am taking CIS 300 at Kansas State University. At first, I expected this class to be boring as it was a class on “data structures.” How much more boring could it get? Oooh, studying the differences in implementations between LinkedList and ArrayLists. The first day our professor proved to us just how wrong we were. He showed us two snippets of code:

String results = "";
for(char c: String.toCharArray()) {
results += (char)(c+3);
}

and

StringBuilder results = new StringBuilder();
for(char c: String.toCharArray()) {
results.append((char)(c+3));
}

He then proceeded to convert the entire collection of works by Shakespeare. Method 1 was estimated to take over 21 days by the computer. Method 2 finished in seconds. That caught my attention.
There are several cool data structures and search techniques that I have found awesome throughout this semester.

Binary Search

The people who write these algorithims are brilliant. To search through a sorted array, why look at every element? Why not just jump to the middle, see if what you are looking for is less than or greater than the middle, then do it again, with the new middle being the middle of the selected half.

ArrayList

ArrayList’s is essentially a circular array that handles the dirty details of dealing with arrays. When programming simple projects, I prefer to use ArrayLists as they are just more friendly than arrays. It dynamically resizes to allow you to constantly add without having to handle the actual details of expanding the underlying arrays. Finding an element at a certain element is VERY cheap as it essentially an array wrapper, the problem with ArrayList is removing or adding as it requires all of the following elements to be moved which can be pricey in a large array (I don’t mean hundreds, I mean hundreds of thousands).

LinkedList

LinkedList’s can be awesome when used correctly. A LinkedList is a group of nodes, linearly linked together. Node A contains certain data and a link to B. B contains data and a link to C, etc. This can be quite efficient if all you need to do is traverse the list in order. Finding a certain index can be much more costly than an ArrayList, but adding and removing at the end, or having a certain Node is extremely cheap. While sounding sort of lame, it can be very useful.

Tries

Now Trie’s really got my attention the first day of class. While looping through every String in a String[] can be very expensive, say an 80,000 word dictionary. While a binary search can cut that down to around 30 loops. Imagine if you could do that in 5 for a 5 letter word, or 8 for an 8 letter word. Imagine a tree, with the head having 26 children, each corresponding to a letter. From each letter are more children, and more children from there. Say following from the root, ‘a’ to ‘c’ to ‘e’, and ‘e’ contains a ‘emptyString’ property that holds the boolean true. You have found the word ‘ace’ in 3 actions. Mind boggling.

Conclusions

Knowing these staples of data structures enables you to mix and mash creating the perfect data structure for any ocassion. Today I mixed LinkedLists with float arrays to efficiently store millions of floating point numbers so that I could both stay memory efficient (arrays of primitive types) while keeping object sizes down (breaking the large arrays up through the LinkedLists). Note: I in no way claim to be a genius on this, or even know a good amount about it, as I have only spent a few months studying it, but it’s some awesome stuff that all developers should look into if they haven’t already.

blog comments powered by Disqus