Friday, June 27, 2025

More Complexity Analysis

 The first step in learning how to categorize—and then build—efficient code is learning the basic patterns we covered in the first article on complexity:

  • a number of steps that doesn’t change with respect to the input size is constant complexity—like accessing some spot in memory
  • a number of steps that’s proportional to the log of the input size is logarithmic complexity—like binary search
  • a number of steps that’s proportional to the input size is linear complexity—like linear search (this is why it has that name)
  • a number of steps that’s proportional to the square of the input size is quadratic complexity—like anything with a loop nested inside another loop
  • more generally, a number of steps that’s proportional to some power of the input size is polynomial complexity—like a loop inside a loop inside a loop … 
  • etc. 

Each complexity class has some recognizable patterns that, at a glance, tell you the complexity of some code. 


This, for example, has quadratic complexity:


public boolean findDuplicates(int[] inputData) {
    for (int i = 0; i < inputData.length; i++) {
        for (int j = 0; j < inputData.length; j++) {
            if (inputData[i] == inputData[j] && i != j) {
                return true;
            }
        }
    }
    return false;
}


For every i, for every j (in other words, n*n = n^2 times in the worst case), we do a thing. 
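As an aside—and as a preview of how these classifications guide decisions—the same question can be answered in expected linear time with a HashSet. This is my own sketch, not part of the original example:

```java
import java.util.HashSet;
import java.util.Set;

public class DuplicateCheck {
    // Expected linear time: one pass, with constant-time set lookups on average
    public static boolean findDuplicates(int[] inputData) {
        Set<Integer> seen = new HashSet<>();
        for (int value : inputData) {
            if (!seen.add(value)) { // add() returns false if the value was already present
                return true;
            }
        }
        return false;
    }
}
```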

Now, let’s look at this code:

public class NLogNLoopDemo {
    public static void main(String[] args) {
        int n = 16; // You can change this value to see how it scales
        int count = 0;

        for (int i = 0; i < n; i++) {           // O(n)
            for (int j = 1; j < n; j = j * 2) { // O(log n)
                System.out.println("i = " + i + ", j = " + j);
                count++;
            }
        }

        System.out.println("Total iterations: " + count);
    }
}


For every i, for a small subset of the j’s (each j is double the previous one, so there are only about log₂ n of them—which does not mean “half the j’s”), we do a thing. This setup is nlogn. 

What I want to talk about now assumes you’re comfortable looking at code and seeing these patterns—so you know MergeSort is nlogn, or looping through an array once is n. The question now is how we take these classifications and make decisions based on them. 

Recall the case of BubbleSort. Because of how the bubbling mechanism works, the best scenario is when the array is already sorted—then the algorithm (with the usual early-exit check) just does one pass in linear time to confirm that’s the case. 

But the worst case is when the array is sorted in the opposite order. In that case, the algorithm does n-1 swaps to get the first number in place, then n-2, then n-3, then n-4…; and the sum of these numbers is proportional to n^2, so in the worst case, BubbleSort is n^2. 
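To make the best and worst cases concrete, here is a sketch of BubbleSort with the early-exit check (the variable names are my own):

```java
public class BubbleSort {
    public static void sort(int[] a) {
        for (int pass = 0; pass < a.length - 1; pass++) {
            boolean swapped = false;
            // After each pass, the largest unsorted element has "bubbled" to the end
            for (int i = 0; i < a.length - 1 - pass; i++) {
                if (a[i] > a[i + 1]) {
                    int tmp = a[i];
                    a[i] = a[i + 1];
                    a[i + 1] = tmp;
                    swapped = true;
                }
            }
            if (!swapped) return; // no swaps: already sorted, the linear best case
        }
    }
}
```

On a reverse-sorted array, every pass swaps, giving the (n-1) + (n-2) + … sum from above; on a sorted array, the first pass finds no swaps and returns immediately.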

On average, we’ll fall somewhere in the middle. With randomized numbers, some will be in order, and some won’t be. The average of n and n^2 is (n+n^2)/2, which we actually consider to be n^2. This is because of two reasons already alluded to in our first discussion:

  • n^2 dominates n, so taking a shortcut and calling “n^2+n” just “n^2”, as things go to their limits, is perfectly acceptable 
  • We ignore constants. 2n^2 looks the same (except for a vertical stretch) as n^2; we only say that functions are “different” if they really are, like n^2 and n!. 


In computer science, we often do three kinds of analysis on an algorithm (for time and/or space, so, I guess, six types):

  • If everything is as easy as possible for the algorithm, what’s the best-case scenario for its performance?
  • If everything is how we expect it to be on average, what’s the average-case scenario for its performance?
  • If everything is as hard as possible for the algorithm, what’s the worst-case scenario for its performance?

Most of the time, however, when designing software, you care much more about the worst case than the best. 

For each of these, there are three kinds of bounds we can state: Omega (Ω), Theta (Θ), and Big-O (denoted with the familiar Latin-alphabet “O”, as in “the letter before P,” not a 0):


If an algorithm’s cost function is Big-Omega of a certain class of functions, then it means there is some constant C such that C times the class is always less than the cost function (for large enough inputs). That is, the cost function always performs at least as badly as the constant-times-class, and constant-times-class is therefore a lower bound. You will not do any better than this lower bound. 

If an algorithm’s cost function is Big-O of a certain class of functions, then it means there is some constant C such that C times the class is always greater than the cost function (for large enough inputs). That is, the cost function always performs at least as well as the constant-times-class, and constant-times-class is therefore an upper bound. You will not do any worse than this upper bound. 

If an algorithm is both Big-O and Big-Omega of some cost function class, then you have enough evidence to state that it’s Big-Theta of that class.

We need to draw an important distinction between best/worst/average and Omega/Theta/O: for each of the cases (best/worst/average), you can perform Omega/Theta/O analysis. Omega is not the exclusive domain of the best case (which can have its own big-O and potentially big-Theta), and O is not the exclusive domain of the worst case (which can have its own big-Omega and potentially big-Theta). It is perfectly fine to say that “the worst case scenario of [some code] is ‘big-Omega of n^2’” or that “the average case is ‘big-O of nlogn’”.
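Stated formally, using the standard asymptotic definitions the prose above is describing:

```latex
f(n) \in O(g(n)) \iff \exists\, C > 0,\ n_0 : f(n) \le C \cdot g(n) \ \text{for all } n \ge n_0
f(n) \in \Omega(g(n)) \iff \exists\, C > 0,\ n_0 : f(n) \ge C \cdot g(n) \ \text{for all } n \ge n_0
f(n) \in \Theta(g(n)) \iff f(n) \in O(g(n)) \ \text{and} \ f(n) \in \Omega(g(n))
```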


Thursday, June 26, 2025

File I/O

We have seen Scanner used to quickly gather one line, for example, to turn "Hello World" into a personalized "Hello, Andre" or something similar. But there are better ways to handle reading and writing much larger amounts of text: whole files.

BufferedReader and BufferedWriter are commonly used in Java to efficiently read from and write to text files. BufferedReader reads text from a file line by line, which is useful for processing large files without loading the entire contents into memory at once. For example, to read every line from a file and print it to the console, you can use:

BufferedReader reader = new BufferedReader(new FileReader("input.txt"));
String line;
while ((line = reader.readLine()) != null) {
    System.out.println(line);
}
reader.close();

This code opens "input.txt", reads each line until the end of the file, prints each line, and then closes the reader to free resources.

BufferedWriter is used for writing text to files efficiently. It buffers the output, so writing many small pieces of data is faster. For example, to write a couple of lines to a file:

BufferedWriter writer = new BufferedWriter(new FileWriter("output.txt"));
writer.write("This is the first line.");
writer.newLine();
writer.write("This is the second line.");
writer.close();

This code creates or overwrites "output.txt", writes two lines, and closes the writer when done.

If you want to copy the contents of one file to another, you can combine both classes:

BufferedReader reader = new BufferedReader(new FileReader("input.txt"));
BufferedWriter writer = new BufferedWriter(new FileWriter("output.txt"));
String line;
while ((line = reader.readLine()) != null) {
    writer.write(line);
    writer.newLine();
}
reader.close();
writer.close();

This reads each line from "input.txt" and writes it to "output.txt", preserving the line structure.

Always remember to close both the reader and writer after use, as this ensures all data is properly written and resources are released. BufferedReader and BufferedWriter are the preferred way to handle text files in Java when you need efficiency and simplicity.
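A common, safer way to guarantee that closing happens is try-with-resources, which closes both streams automatically even if an exception is thrown mid-copy. Here is a sketch of the file-copy example rewritten that way (the class and method names are my own):

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class FileCopy {
    public static void copy(String inputPath, String outputPath) throws IOException {
        // Resources declared here are closed automatically, in reverse order,
        // when the try block exits—normally or via an exception.
        try (BufferedReader reader = new BufferedReader(new FileReader(inputPath));
             BufferedWriter writer = new BufferedWriter(new FileWriter(outputPath))) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line);
                writer.newLine();
            }
        }
    }
}
```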

Wednesday, June 25, 2025

Dijkstra's Algorithm

 Imagine for a minute you are Google Maps (because they actually do use something very similar to what we’ll describe). Your job then is to get someone from A to B as quickly as possible. How do you find the best path?


Suppose we have a road network, seen here:

Dijkstra's Algorithm - Shortest Path Finding

The numbers represent some resource (time, fuel, distance), which is precious, and which we want to use the least amount of, to get from one place to another.

Suppose we live at A and work at F.

We can use Dijkstra’s algorithm to figure out the shortest path.

1.       Assume that everything is infinitely far away, because we haven’t calculated any distances.

2.       Look at the graph, and see that from A, we can get to B in 4 and C in 5. Everything else is infinitely far away.

3.       Now, from B, we can get to D in an additional 9, or E in an additional 7. From C, we can get to B in an additional 11, or E in an additional 3. (We ignore going back to A for 5 as useless).

4.       We already know that the shortest path will not be ABE…, because there’s a faster way to get to E, via C, as in ACE…: ABE has a total cost of 4+7 = 11, and ACE has a total cost of 5+3 = 8; 8 < 11, so the 8-cost path is better.

5.       From E, we can go backwards to C (which we again ignore), or to B for 7, or to D for 13, or to F for 6. We choose the best (which happens to be the leg that gets us to our solution) and go to F for 6.

6.       Now, we have the whole path which minimizes the use of our precious resource: start at A, and go through C and E en route to F.

The following paths (without loops) are possible:

Path      Cost
ABEF      17
ABDF      15
ACBDF     27
ACEBDF    26
ACBEF     29
ACBDEF    46
ACEF      14

Of all the paths, the one our strategy produced is the shortest between A and F.

Heuristically, the approach is twofold:

  • Choose the next location as the one that can be reached at the smallest total cost
  • Update the shortest known distance to every node, if a new shortest path has been found

This approach is Dijkstra's algorithm. It was famously invented in about twenty minutes in 1956, while its inventor, Edsger Dijkstra (a professor and pioneer of computer science), sat at a café in Amsterdam—and still today it remains a cornerstone algorithm at the heart of so many different applications. 

There is one important consideration when using Dijkstra's algorithm. Most of the time, edge weights are positive: it costs you some money, or it requires some amount of gas, or you need to spend time. But that does not always need to be the case in every graph; edge weights can be negative. However, if a graph has any negative edge weights, Dijkstra's algorithm will not necessarily work, so it should not be used. 

Here is the pseudocode for the algorithm:

function Dijkstra(Graph, source):
    // Initialize distances and previous-node trackers
    for each vertex v in Graph:
        dist[v] := infinity
        prev[v] := undefined
    dist[source] := 0

    // Create a set of all unvisited nodes
    Q := set of all vertices in Graph

    while Q is not empty:
        // Select the unvisited node with the smallest distance
        u := vertex in Q with smallest dist[u]
        remove u from Q

        // Relax each neighbor v of u
        for each neighbor v of u:
            alt := dist[u] + weight(u, v)
            if alt < dist[v]:
                dist[v] := alt
                prev[v] := u

    return dist, prev


And here it is in Java:

import java.util.*;

class Edge {
    int target;
    int weight;
    public Edge(int target, int weight) {
        this.target = target;
        this.weight = weight;
    }
}

public class Dijkstra {
    /**
     * Runs Dijkstra's algorithm on a graph represented as an adjacency list.
     * @param graph adjacency list: graph.get(u) returns a list of edges from node u
     * @param source the starting node
     * @return array of shortest distances from source to every node
     */
    public static int[] dijkstra(List<List<Edge>> graph, int source) {
        int n = graph.size();
        int[] dist = new int[n];
        Arrays.fill(dist, Integer.MAX_VALUE);
        dist[source] = 0;

        PriorityQueue<int[]> pq = new PriorityQueue<>(Comparator.comparingInt(a -> a[1]));
        pq.offer(new int[]{source, 0});

        while (!pq.isEmpty()) {
            int[] curr = pq.poll();
            int u = curr[0];
            int d = curr[1];

            if (d > dist[u]) continue; // Already found a better path

            for (Edge edge : graph.get(u)) {
                int v = edge.target;
                int weight = edge.weight;
                if (dist[u] + weight < dist[v]) {
                    dist[v] = dist[u] + weight;
                    pq.offer(new int[]{v, dist[v]});
                }
            }
        }
        return dist;
    }
}
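As a quick sanity check, the same algorithm can be exercised on a small made-up graph. This compact, self-contained variant uses int[]{target, weight} pairs instead of the Edge class, purely so the demo fits in one class; the node numbering and weights are hypothetical:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class DijkstraDemo {
    static int[] dijkstra(List<List<int[]>> graph, int source) {
        int n = graph.size();
        int[] dist = new int[n];
        Arrays.fill(dist, Integer.MAX_VALUE);
        dist[source] = 0;
        PriorityQueue<int[]> pq = new PriorityQueue<>(Comparator.comparingInt(a -> a[1]));
        pq.offer(new int[]{source, 0});
        while (!pq.isEmpty()) {
            int[] curr = pq.poll();
            int u = curr[0], d = curr[1];
            if (d > dist[u]) continue; // stale entry: a shorter path was already found
            for (int[] e : graph.get(u)) {
                if (dist[u] + e[1] < dist[e[0]]) {
                    dist[e[0]] = dist[u] + e[1];
                    pq.offer(new int[]{e[0], dist[e[0]]});
                }
            }
        }
        return dist;
    }

    public static void main(String[] args) {
        // 4 nodes: 0 -> 1 (cost 1), 0 -> 2 (cost 4), 1 -> 2 (cost 2), 2 -> 3 (cost 1)
        List<List<int[]>> g = new ArrayList<>();
        for (int i = 0; i < 4; i++) g.add(new ArrayList<>());
        g.get(0).add(new int[]{1, 1});
        g.get(0).add(new int[]{2, 4});
        g.get(1).add(new int[]{2, 2});
        g.get(2).add(new int[]{3, 1});
        // The direct edge 0 -> 2 (cost 4) loses to the detour 0 -> 1 -> 2 (cost 3)
        System.out.println(Arrays.toString(dijkstra(g, 0))); // prints [0, 1, 3, 4]
    }
}
```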

Tuesday, June 24, 2025

Complexity basics

 From the earliest days of computing, a central question has been how we can solve a problem as efficiently as possible. That has meant several things over time to different people, but in modern times, we’ve settled on efficiency over two different parameters (which generally require tradeoffs with each other): the number of steps something takes, and the amount of memory it takes to do it.

 In computer science, we group things into complexity classes based on the worst-growing term, ignoring any coefficients. Think about it for a second, and pull up Desmos or another graphing calculator to follow along:

  • A^2+3A-25: it looks like A^2, because eventually, the A^2 term will be so much bigger than 3A or 25 that A^2 will dominate
  • X^3+2^X-1000000: it looks like 2^X because, eventually 2^X will be much bigger than X^3 or 1000000
  • x!-10^x-x^100-1000000000: it looks like x! because, eventually, x! will dominate all of 10^x, x^100, and 1000000000

In general, we prefer functions at the top of this list more than at the bottom, in terms of describing both space and time:

  • constant is better than
  • logarithmic is better than
  • linear is better than
  • log-linear is better than
  • polynomial is better than
  • exponential is better than
  • factorial is as bad as it gets

Let’s look at each one, with an example of each

  • “Go to this place in memory” is constant
  • “Do binary search through an already-sorted array” is logarithmic
  • “Do linear search” is linear
  • “Do merge-sort” is log-linear
  • “Do traditional, grade-school, digit-by-digit multiplication” is polynomial
  • “Count the number of nodes in a binary tree of a certain height” is exponential
  • “Tell me all the ways I can form a line out of students in my class” is factorial
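Taking the logarithmic entry as an example, binary search looks like this (a standard sketch; the names are my own):

```java
public class BinarySearch {
    // Each iteration halves the remaining range, so the loop runs O(log n) times.
    public static int indexOf(int[] sorted, int target) {
        int lo = 0, hi = sorted.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2; // written this way to avoid (lo + hi) overflow
            if (sorted[mid] == target) return mid;
            if (sorted[mid] < target) lo = mid + 1;
            else hi = mid - 1;
        }
        return -1; // not found
    }
}
```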

Another way of looking at things:
  • If a problem always requires the same number of steps, it is constant
  • If there’s a single loop in the code, that moves by i++ or i--, it’s linear
  • If there's a single loop in the code, that moves by i*=2 (or i/=2) or something similar, it's logarithmic
  • If your problem splits things in half recursively and brings them back together at the end like MergeSort, it's log-linear
  • If there are some number of nested loops: i inside j inside k…, then it’s polynomial, and the degree is the depth of the nesting: i inside j inside k is 3 deep, so this is n^3
  • If one problem causes there to be some number of problems, which creates that many sub-sub problems for each sub problem, and so on (1 problem becomes 2 becomes 4 becomes 8…), then it’s exponential by the rate at which the problems are multiplying: 1 problem into 3 into 9 into 27 is 3^n, but 1 problem into 2 into 4 into 8 into 16 is 2^n, but either way 2^n or 3^n are both exponential
  • If you have no other choice but to calculate every possible arrangement of something, in order to get to a solution, then you’re looking at n! (like, for example, the hilariously bad “bogosort”: randomize a list, check if it’s sorted; if not, randomize again; keep randomizing and checking until a shuffle results in a sorted list)

Monday, June 23, 2025

Red-Black Trees

There exists a special kind of binary search tree called a red-black tree that implements certain extra rules to ensure it stays balanced. Balancing is the process of repositioning nodes to decrease the height of a tree, thereby preventing it from degenerating into a list. Red-black trees, because they impose these conditions to self-balance, perform better, and it takes longer for their performance to degrade. 


Normal binary trees have 4 important relations:

  • You: wherever you are
  • Your parent: whoever you are connected to, which is one level closer to the root than you
  • Your child(ren): whoever you are connected to, which are one level further from the root than you
  • Your sibling(s): any other node(s) that share your parent

Red-black tree rules force us to care about two more relations:

  • Your uncle: The sibling of your parent
  • Your grandparent: the parent of your parent

The following coloring rules apply:

  • The root is black
  • Null nodes are black by default
  • Red nodes’ children are not red
  • Every path from the root down to a null leaf (or, equivalently, up from that leaf to the root) passes through the same number of black nodes


Both inserting and deleting nodes might cause violations of these rules to occur. Here are possible violations and how to fix them:

If you insert as red:

  • And your parent is black
    • You’re fine!
  • And your parent is red
    • Keep reading and check on your uncle
    • And your uncle is red
      • Color your parent black
      • Color your uncle black
      • Color your grandparent red
      • Make sure that doesn’t cause any problems, and if there are any, fix them
    • And your uncle is black
      • If you’re a right child
        • Rotate left, i.e., counter-clockwise, about your parent
      • If you’re a left child
        • Rotate right, i.e., clockwise, about your grandparent and recolor


A left rotation looks like:

Before Rotation:

    x
     \
      y
     / \
    a   b

After Left Rotation:

      y
     / \
    x   b
     \
      a

And has these steps:

  • Let y be x’s right child.
  • Move y’s left subtree to become x’s right subtree.
  • Update x’s parent to point to y instead of x (or make y the root).
  • Set y’s left child to x, and update x’s parent to y.


And a right rotation looks like this:

Before Right Rotation:

      x
     /
    y
   / \
  a   b

After Right Rotation:

    y
   / \
  a   x
     /
    b


and has these steps:

  • Let y be x’s left child.
  • Move y’s right subtree to become x’s left subtree.
  • Update x’s parent to point to y instead of x (or make y the root).
  • Set y’s right child to x, and update x’s parent to y.
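The two rotations can be sketched on a bare node class like this (parent-pointer bookkeeping is omitted for brevity; this is an illustration of the rotation steps, not a full red-black implementation):

```java
class Node {
    int key;
    Node left, right;
    Node(int key) { this.key = key; }
}

public class Rotations {
    // Left rotation about x: x's right child y becomes the subtree root.
    static Node rotateLeft(Node x) {
        Node y = x.right;
        x.right = y.left; // y's left subtree becomes x's right subtree
        y.left = x;       // x becomes y's left child
        return y;         // y is the new root of this subtree
    }

    // Right rotation about x: x's left child y becomes the subtree root.
    static Node rotateRight(Node x) {
        Node y = x.left;
        x.left = y.right; // y's right subtree becomes x's left subtree
        y.right = x;      // x becomes y's right child
        return y;
    }
}
```

Note that the two rotations are mirror images: rotating left about x and then right about the result restores the original shape.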

After inserting as red, you might need to recolor:

  • If your parent and uncle are red, make them black and your grandparent red
  • If the new node’s uncle is black (and the new node is the right child of a left or left child of a right—but not left of a left or right of a right) rotate appropriately
  • If the new node’s uncle is black and this time the new node is a right child of a right child or left child of a left, rotate and recolor the parent and grandparent

After deleting:

  • If you have a black-black parent-child, this can cause problems
    • If you have a black sibling
      • Recolor your parent and sibling, and rotate
    • If your black sibling has black children
      • Recolor the sibling red, and now your parent has a problem. Fix it appropriately.
    • If your black sibling has any red children
      • Rotate and recolor


Sunday, June 22, 2025

Depth- and Breadth-First Searches

 Suppose now you’re looking for a particular value in a graph. (We’ll demonstrate things with a tree, but the graph does not need to be one.) You essentially have two ways to go about this search (for example, to solve a maze).

Let’s say you have 10 paths before you, and each is 10 steps deep. You could either go 10 steps down #1, then maybe 10 steps down #2, then maybe 10 steps down #3, etc., until you find what you want or exhaust all options; or go 1 step down paths 1-10, then a second step down paths 1-10, then a third step down paths 1-10, and so on, until, again, you either find what you want or exhaust all options.

If what you’re looking for is in the graph—an exit from a maze, for example—both of these methods will find the answer. Both are fundamental graph algorithms.

The algorithm by which you go as deep as you can, until you can’t anymore, is called a depth-first search: it prioritizes maximum depth on a path, before moving to another path. The alternative is the breadth-first search, by which you explore as many paths as possible at a certain level, before going to the next level for any of them.

Here's the pseudocode for depth-first:

procedure DFS(node, visited):
    mark node as visited
    for each neighbor in node.neighbors:
        if neighbor is not visited:
            DFS(neighbor, visited)

Note that DFS() is recursive, and the recursion implicitly uses a stack (the call stack).

And here’s the BFS pseudocode, which, this time, is iterative and uses a queue:

procedure BFS(Graph G, start_node):
    create an empty queue Q
    mark start_node as visited
    enqueue start_node into Q

    while Q is not empty:
        node = dequeue Q
        for each neighbor in node.neighbors:
            if neighbor is not visited:
                mark neighbor as visited
                enqueue neighbor into Q
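Both searches can be rendered in Java. Here is a sketch over an adjacency-list graph (the representation is my choice for the demo, not prescribed by the pseudocode):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class GraphSearch {
    // Recursive DFS: the call stack itself acts as the stack.
    static void dfs(List<List<Integer>> graph, int node, boolean[] visited, List<Integer> order) {
        visited[node] = true;
        order.add(node);
        for (int neighbor : graph.get(node)) {
            if (!visited[neighbor]) dfs(graph, neighbor, visited, order);
        }
    }

    // Iterative BFS with an explicit queue.
    static List<Integer> bfs(List<List<Integer>> graph, int start) {
        boolean[] visited = new boolean[graph.size()];
        List<Integer> order = new ArrayList<>();
        Queue<Integer> q = new ArrayDeque<>();
        visited[start] = true;
        q.add(start);
        while (!q.isEmpty()) {
            int node = q.remove();
            order.add(node);
            for (int neighbor : graph.get(node)) {
                if (!visited[neighbor]) {
                    visited[neighbor] = true;
                    q.add(neighbor);
                }
            }
        }
        return order;
    }
}
```

On the same graph, DFS goes deep down one branch before backtracking, while BFS visits everything at one level before descending.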

Saturday, June 21, 2025

Full, Complete, Balanced, and Degenerate Trees

 We’ve looked at general binary trees—connected graphs without cycles, with a single root, in which each node has either 0, 1, or 2 children.

Let’s now look at 3 subcategories of binary trees that impose more rules, each one getting closer to a “proper” binary tree by a stricter application of one of those rules.

  • First, full: a general binary tree in theory can have nodes with 1 child. A full tree is one in which none of these one-child nodes exist; every node has either 2 children or none
  • Next, complete: a general binary tree can have gaps. A complete tree is one in which every level is completely filled, except possibly the last; any missing nodes in the last level occur as far to the right as possible
  • Finally, balanced: a balanced tree is one in which, looking at the left and right subtrees (those rooted at the children of the global root), the heights of those two subtrees differ by 0 or 1
  • In fact, it might be good to add a fourth category as an anti-pattern: the degenerate tree. Suppose for a moment that from the root, one always moves the same direction when adding a child: add the root, move right, add a child; move right, add a child; and so on. Such a tree is no better than a linked list (and it effectively is one, called a tree by mistake)

Try to draw these four groupings of trees. 
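As one way to make these categories concrete in code, here is a sketch of a balance check matching the definition above (the heights of the root’s two subtrees differ by at most 1); the class names and the height-of-empty-tree convention are my own choices:

```java
class TreeNode {
    TreeNode left, right;
}

public class TreeChecks {
    // Height of a tree: longest path from this node down to a leaf.
    // An empty tree has height -1 by convention, so a single node has height 0.
    static int height(TreeNode node) {
        if (node == null) return -1;
        return 1 + Math.max(height(node.left), height(node.right));
    }

    // Balanced per the definition above: the root's left and right
    // subtrees differ in height by at most 1.
    static boolean isBalanced(TreeNode root) {
        if (root == null) return true;
        return Math.abs(height(root.left) - height(root.right)) <= 1;
    }
}
```

A degenerate (all-right-children) tree fails this check as soon as its chain reaches three nodes.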
