It’s Not “Depth First Search”
Note: I use a footnote reference notation here that looks like (n) where n is some integer. You can find the footnote corresponding to the (n) in the footnote section.
If you study computer science, one day, you will learn about something called “depth first search”. If you’re not already familiar, depth first search is an algorithm for graph traversal. If “algorithm” and “graph” aren’t familiar to you, there might be something you enjoy in my article here, but it will be a bit hard to interpret (3).
There’s plenty of free resources online describing depth first search graph traversal strategies, as well as a lot of quality academic resources that provide rigorous, formal descriptions (Steve Skiena’s, The Algorithm Design Manual, is one of my favorites).
Although, owing to my intention to question the obvious, and highlight the space for critical interpretation of “rigorous” mathematical ideas, I think it’s interesting to ask — what is “depth first search”? Or, more precisely, why is it called, “depth first”? Where does the idea of “depth” come from? Why is the idea of “depth” not accompanied by a measurement of “height” or “length” of a graph?
We’re not talking about sea exploration, it’s just math. So why have we decided to call an algorithm that utilizes a specific strategy to traverse the nodes of a graph, “depth first search”?
Depth First Search
Let me make up some pseudocode notation and describe the lexical form of the “depth first search” graph traversal strategy.
Let n = # an integer representing the number of nodes in a graph Let V = {0...n - 1} Let E = { {k, j ... i}, # k, j, i >= 0 and k, j, i < n ... }
V is a list of integers from 0 to n — 1.
E is a list of linked lists where each element of each of the linked lists satisfies the condition that it is integer greater than or equal to 0, and less than n.
So I now have a data structure representing a graph. To “depth first search” it, I just need to “visit” (1), the nodes of the graph in a fashion such that, relative to the node I’m currently visiting, I always visit a neighbors neighbor first, before I move on to visit the next neighbor of the node I’m currently visiting.
If that’s not very intuitive, I’ve cast the strategy in the context of a contrived scenario and walked through an example in the footnotes section (2). Building on the pseudocode above, there are only two more ingredients I need to complete my representation of depth first search:
An array of boolean values of length n that I will use to signify whether the vertex represented by an integer has been processed yet (to avoid scenarios where I end up in an infinite loop). That is, if P[i] = True that indicates the integer i has already been processed. If it does not, then i has not yet been processed.
A recursive function (6).
Let n = # an integer representing the number of nodes in a graph Let V = {0...n - 1} Let E = { {k, j ... i}, # k, j, i >= 0 and k, j, i < n ... } Let P = [] for i -> (0...n - 1) P[i] = False Let DFS(P, E, curr) = { process(curr) # "visit" the node, do whatever we need to with it P[curr] = True # mark the node as "processed" for node in E[curr]: if not P[node]: DFS(P, E, node) }
Sweet. So the general code is there. Now, let’s make it specific.
Depth… First… Search?
Here’s a specific case.
Let n = 5 Let V = {0, 1, 2, 3, 4} Let E = { {1, 3, 4}, {2, 1}, {1}, {4, 0}, {3, 0} } Let P = [] for i -> (0...n - 1) P[i] = False Let DFS(P, E, curr) = { process(curr) # "visit" the node, do whatever we need to with it P[curr] = True # mark the node as "processed" for node in E[curr]: if not P[node]: DFS(V, E, node) }
But that’s a bit hard to visualize, so let’s draw the graph.
At this point, you might be able to see where I’m going. In what order do we visit the nodes of the graph above? Well, continuing with the pseudocode above
...
DFS(P, E, 0) Inovcation 1: curr = 0, P = [False, False, False, False, False] Invocation 2: curr = 1, P = [True, False, False, False, False] Invocation 3: curr = 2, P = [True, True, False, False, False] Inovcation 4: curr = 3, P = [True, True, True, False, False] Invocation 5: curr = 4, P = [True, True, True, True, False] Final iteration: P = [True, True, True, True, True]
So, in what order were the nodes traversed?
[0, 1, 2, 3, 4]
Which is hard to imagine, so let’s visualize the path the algorithm took.
Now, let’s contrast that with the notion of “depth”. I would say there’s a few ways we can seek to define “depth”. We could use a textbook, dictionary definition or we could appeal to intuition and think about how the word is used colloquially. In either case, the general sense of the word “depth” implies a measurement that requires the existence a “downward” trajectory or “bottom” of something.
Clearly most applicable to large bodies of water on earth (since, once you zoom out to the scale of planets in an orbit, which way is “down”?), at least one definition of “depth” is a “deep place” in a body of water (https://www.merriam-webster.com/dictionary/depth). Another is “the perpendicular measurement downward from a surface” (https://www.merriam-webster.com/dictionary/depth).
But, hang on, how did we just traverse the graph?
Where’s the surface? Well, I just assumed it was at the top of the first node we visited in the illustration above. But, then, that wasn’t really a “depth” first search. Huh… Maybe there’s a way to fix this…
Aha! That’ll do it, now the path is depth first. Or… wait? Should we have gone up to 3 or 4 first? That’s… height first? Or… wait, I got it!
Aha! Now it is depth first. Or, but isn’t this the same as
Well, wait, what’s this?
I’m being a bit cheeky, but the point is, the notion of “depth” depends on how the graph is drawn. Purely from the lens of a generic, abstract, mathematical concept of a undirected, non-weighted graph (as we now define them), all of the graphs above are the same. The notion of “depth” isn’t very nicely fixed to any of them.
It’d be a lot simpler to just say, “first-in, last-out graph traversal”, or “stack graph traversal”, or, even simpler, “stack traversal”. Then we know almost immediately what to implement. We can spare ourselves the extra space in our brains to map the words “Depth First Search” to the idea of “traversing an undirected, unweighted graph using a stack”.
But there is something else important, perhaps a bit more interesting, I mean to point out.
Generality, In Excess
There is something fascinating I’ve found about when something is true. And that’s this sort of bizarre, surreal sense of something simply being. The truth just has nothing to do with our feelings, what we used to think, or what we think we were going to think, or thought we would’ve thought, or what we want. It just is.
When you get at the truth long enough you start to realize it has such a property. This enticing and airy detachment from you. This simple state of being. In a sense, it’s meditative. That quality, I think, is especially pertinent in computer science and mathematics. I think so because what we’re doing when we do computer science and mathematics is create ideas that have a truth quality of their own.
We are literally creating our own definitions of true and false when we form the basis of mathematical and computer science theory. Here’s the catch though, while this is, in many ways, part of the craft of computer science and mathematics, our definitions of truth need not mean anything at all, even if we can make them consistent, clear and usable.
They are still eventually going to be beholden to a measure against the natural world if they are ever to be interesting enough to be applied. My style and interpretation of mathematics is a lot different from most people because I think that while forming interesting theories that have unique and new properties that can explored is interesting, the most interesting thing is actually finding ways to fit theories to reality.
That said, I do see the value in doing mathematics and constructing new theories just for the joy of doing so. You never know how they might be applied, but you ought to know how little they may relate to reality, or that they may never relate to reality at all.
All that is to say, an algorithm or data structure exists on it’s own terms, it does not care about the natural world. It needs not map to the natural world nicely, it need not fit any real scenario, assist with any task, or mean anything at all. It can be just as meaningless and arbitrary as a false, or intentionally misleading, sentence written in natural language.
It’s consistency or mathematical rigor is not the quality which determines it’s truthfulness if you measure truthfulness like every other scientific domain does — as in an ideas ability to faithfully model the real lived experience of many people. For any algorithm or data structure to make sense, or be considered realistic, in that definition, it has to be fit to a situation in the natural world, somewhere, at some point.
So, we should be very clear and concise in the way we frame our theories and how they may relate to reality (or not) so we can build effective relationships with other domains of science. I think the fact that we have decided to call stack based graph traversal (or, really stack based graph enumeration, as I go on to call it later in this blog), “depth first search”, is an example of a case where we should go a bit farther to be more specific about what we’re actually trying to communicate.
When we try to fix a notion relative to our lived, natural world experience to an algorithm, like the concept of “depth”, it only makes sense when the algorithm is considered in the context of the situation where the experience that gave rise to that notion originally occurred. We can’t say the algorithm has co-opted the notion of “depth” and carried it around to every other situation where the algorithm will be applied. Then it will just have tossed the notion of “depth” into situations where it means nothing.
Suppose the graph we are “depth first searching” is a network of people, all of whom know one another in one way or another (like the example in footnote 2). Since the situation is not a group of people literally standing on top of one another on a series of elevated platforms, the notion of “depth” makes no sense at all.
You’d have to do a lot of stretching to fix some notion of “depth” into such a scenario. The only thing that comes to mind is modelling the intimacy of the relationships. Although at that point the graph structure (in the example in footnote 2) is not designed in such a way to facilitate traversal based on intimacy, so the meaning is lost. This is also clearly not the same idea as the spatial notion of distance from a fixed point.
In contrast, if the graph we are “depth first searching” is a graph modeled after a cave network, where nodes are intersections of tunnels and edges are paths between intersections, and the tunnels only go “down” (further into the earth, or “downward”, as defined by distance towards the center of the earth away from the outer crust), then we have something that could be called a “depth first” search.
Now, there is, of course, some actual structure to the algorithm that we have called “depth first search”, and it has some meaning, and a potential set of applications, one of which may be applied to solving problems which involve a notion of “depth” in certain scenarios, but that’s not what the mathematical idea of “depth first search”, as it is currently defined, is capturing.
It is simply a stack based graph enumeration. That’s a lot more precise because “stack” (last-in, first-out list), “graph” (data structure with nodes and edges) and “enumeration” (procedure where, one by one, you list some of the things in a certain set of things) all have clear, theoretical definitions. And all “depth first search” is, is a purely theoretical, general way to enumerate nodes in an undirected, unweighted graph.
All that’s really occurring when the algorithm runs is the nodes are being explored, one by one, and “visited”, in a specific order.
A true “depth first search” algorithm must:
Be relative to situations where a clear notion of “depth” exists (as in distance from some fixed point or plane, like the top of the ocean, the crust of the earth, the atmosphere of the earth, etc.). Or at the very least be fixed to a three dimension or two dimensional graph structure where a fixed plane or point exists that is considered the “top”, where we could measure “downwards” distance from so it could be easily adapted to situations where a notion of “depth” exists.
Be done on graphs with weighted edges which map faithfully to the real world parameters in #1
Be carried out in such a way that the algorithm first traverses to the farthest node from the originating node, then backtracks to the second farthest node, then the third farthest node, etc.
At that point the notion of “depth first” is fully developed. It also would require an understanding of a domain specific, real world scenario. So sweet! There’s some fun exploring to do.
Critical Thinking in Computer Science
Computer science is a young field. Whether it will ever truly distinguish itself from mathematics is, in my opinion, yet to be seen. Nonetheless, there is a lot of “baggage” hanging out the field of computer science. It’s not that we haven’t built AGI yet, it’s that, in a lot of popular texts and educational mediums, it seems we still don’t even really know how to describe algorithms with clarity, or how to openly discuss the process of how we decide how to fit algorithms to real world scenarios, or how to invite and encourage critical thinking and imagination in discussions of computer science.
Most modern computer science textbooks are filled with descriptions of algorithms (and some case studies) and dense with rigorous theories but lacking in depth of discussion and critical inquiry. We don’t need to memorize a million algorithms, we need to get at the essence of what they are, how they can be shaped to be meaningful, and how we can learn to stretch our imagination to form new ones.
We need that sense of wonder and awe that taunts us and makes us think that maybe, with nothing other than what we already have in our mind: our curiosity, our imagination, our interests, and our dedication, we can see a bit deeper into the fabric of reality than anyone has before.
Computer science needs humanity just as much as humanities need the rigor of traditionally “empirical” domains. It needs critical thinking, curiosity, and the soft, inquisitive nature of curious, kind people. I really think the value of kindness, empathy and the human qualities of having a heart and set of emotions are undervalued in computer science.
I’ve come to believe, after a lot of living, that life just doesn’t reveal it’s secrets to assholes. And the more kind you are to yourself, the better your life becomes, and the farther you can see. There are lot’s of examples of good role models in technical fields. One such person I recently read a bit about is Paul Erdos: https://www.youtube.com/watch?v=CWkCSvhtf_s
Anyways, what was this about? Ah right, it’s not “depth first search”. It’s stack based graph enumeration, or stack enumeration, or stack traversal. :p
A shout out: I drew the illustrations for this blog with a Remarkable. They aren’t paying me to say anything about them, I just think it’s good to give credit where it’s due and I’ve really enjoyed using their product. It’s awesome. Thank you, Remarkable, for giving me a pleasurable and simple note taking and illustration experience.
A second shout out: I strongly dislike Leetcode interviews. I think they are a form of fraud, a bad interview practice, and that they should be banned from technology companies. In an ideal world I would start a petition, collate resources on good interview practices, call out the companies / teams that use Leetcode style interview practices, campaign and get signatures of prominent tech leaders to get momentum and pressure companies into ditching the practice, but alas I find myself inundated with work. For the time being my best form of activism is to include this shout out in the bottom of my posts in the hopes that others can see and recognize the same things, and feel inspired to work in their companies to remove Leetcode from the interview process.
Footnotes
1 If you’ve programmed for a while you might intuitively understand what I mean be “visit” here. For those who haven’t programmed long enough to be familiar with my use of the term “visit” here, I’ll endeavor to elaborate, since my intention is to write about computer science in the most general, broadly accessible way possible.
In natural language, my best description of what I mean by “visit” here is to, for a node of the graph, have available for access or manipulation at a discrete point in time in the flow of execution of my code. Put simply, meaning to be able to access the value of a member variable of (if the node is an object), print (if the node if a printable object or primitive data type) or perform some numerical calculation on (if there are operators available to modify the node).
To describe this in terms of actual code, the Python code below “visits” each integer in the list at the time calls “print” on the integer
listOfInts = [1, 2, 3, 4, 5] for i in range(len(listOfInts)): print(f"I am visiting array element {i}, who's value is {listOfInts[i}")
2 To describe this in a manner more easily interpretable based on a real scenario, suppose I am an employee at a major political science research institute. I have a very specific task: research and discover the opinions of a group of people in a specific age range, all of whom know one another in various ways (whether as coworkers, friends or family members). The idea is to figure out to what degree, if at all, the relationships of the people correlate to their opinions.
To get a sense of the opinions of the people I’m researching, I am given a list of questions which ask the respondent to indicate to what extent they agree with a pre-written reaction to an interpretation of a popular modern event. I have to ask all of the people in the group I am assigned the questions. I am also given a document (in this case, meant to represent the graph) of the names of the people in the group I am assigned and all of their relationships.
I am told I can chose any order I see fit. Suppose I zoom in on one portion of the document where I would like to start, and it looks like:
A simulated mapping of people based on relationship type
If I want to use a “depth first search” strategy, I have to move between people in my interview process in such a way that, after completing an interview, when I need to chose the next person to interview, I do so in such a way that I chose the first person who is directly related to the person I just interviewed, and repeat that process (interviewing all the people they are directly related to), before I move on to the second person who is directly related to the person I just interview.
In this example, the “depth first search” strategy would have me visit
Jane Doe -> John Smith -> Sam Lee -> Susan Lee -> Emily Smith -> Howard Johnson -> James Baker
This makes a little bit more sense if you contrast it with the most popular alternative, commonly called “breadth first search”, where I would do the opposite: after completing an interview, when I need to chose the next person to interview, I do so in such a way that I chose the first person who is directly related to the person I just interviewed, and then the second, and third, etc. until I have visited all the people directly related to the person I just interviewed, at which point I would begin visiting the people related to the people related to the person I just interviewed.
In that case, the “breadth first search”, strategy would have me visit
Jan Doe -> John Smith -> Emily Smith -> Howard Johnson -> Sam Lee -> James Baxter -> Susan Lee
3 While I’ve never used the website myself, brilliant.org seems to have some high quality, generally accessible explanations of algorithms (https://brilliant.org/paths/computer-science-algorithms/) and graphs (https://brilliant.org/wiki/graphs/). Alternatively, almost any textbook on the design and analysis of algorithms or computer science theory will contain some description of an algorithm and/or graph, although they may be intimidating and filled with a lot of formality that will be unfamiliar to those who haven’t trudged through an introduction to discrete mathematics.
You could still surely get the gist of what an algorithm and/or a graph is from such a text nonetheless, if you are willing to do a bit of heavy lifting and drill down into unfamiliar notation as you encounter it. I’ll admit I never tried to do so though. I had the privilege of carefully design curriculum provided by my professors, which afforded me a generous helping of mathematical pre-requisites prior to my introduction to computer science topics like algorithms and data structures.
Nonetheless, for the brave and adventurous auto-didact, who I commend for their intellectual courage being greater than mine, two of my favorite texts in the design and analysis of algorithms are:
Steve Skiena — The Algorithm Design Manual
Robert Sedgewick and Kevin Wayne’s — Algorithms, Fourth Edition
4 A list is typically an array, one of the fundamental building blocks of most algorithms. If you aren’t familiar with arrays, you might have a read of my exploration of how you could familiarize yourself with the notion of an “algorithm” and “graph” in the footnote section (3). You might apply a similar procedure to understanding arrays. It also seems the resource I refer to there also has some educational resources specifically on arrays: https://brilliant.org/wiki/arrays/.
5 Funny thing, I happen to have mistakenly implemented “breadth first search” when I wrote the first draft of this blog. For whatever reason I confused the two. I really confused myself as I made two subtle errors by also drawing an example graph and incorrectly translating it’s representation in adjacency list format while walking through an example by hand (haha).
We all get confused, it’s a very normal part of engineering and science. I think I spend most of my days as an engineer confused (; Anyhow, I just like to tell those stories to share an honest reflection of my experience learning and practicing computer science. Also, since I’d already gone through the effort to write the pseudo-code, I figured, why not leave it in? So here it is (:
Building on the first block of pseudocode in the section of this article titled “Depth First Search”, there are only two more ingredients I need to complete a representation of breadth first search:
A “queue”, which is just a list (4) where the elements are inserted and removed in such a way as to satisfy the “first-in, first-out” paradigm (the first element added is the first element removed)
An array of boolean values of length n that I will use to signify whether the vertex represented by an integer has been processed yet (to avoid scenarios where the same node ends up with multiple entries in the queue). That is, if P[i] = True that indicates the integer i has already been processed. If it does not, then i has not yet been processed.
Let n = # an integer representing the number of nodes in a graph Let V = {0...n - 1} Let E = { {k, j ... i}, # k, j, i >= 0 and k, j, i < n ... } Let P = [] for i -> (0...n - 1) P[i] = False Let Q = [] # an empty queue (list with a first-in, first-out interface) Let curr = 0 # an integer corresponding to an element of V Q.add(curr) # add the node we want to visit first to the queue while Q.size() > 0 curr = Q.remove() # pop the last element from the queue process(curr) # "visit" the node, do whatever we need to with it P[curr] = True for node in E[curr]: if not P[node]: Q.add(node)
Additionally, the “first-in, first-out” rule can be a bit tricky to visualize, despite it’s simplicity. The graphic below, which visualizes the addition and removal of a few integers to a queue, might help
5 Recursion is a bit of a tricky concept in computer science. If you aren’t familiar with it, you might have a read of my exploration of how you could familiarize yourself with the notion of an “algorithm” and “graph” in the footnote section (3). You might apply a similar procedure to understanding recursion. It also seems the resource I refer to there also has some educational resources specifically on recursion: https://brilliant.org/wiki/recursion/.
In reality, you don’t need to use recursion here. You could use a stack, which is just a list (4) where the elements are inserted and removed in such a way as to satisfy the “first-in, last-out” paradigm (the first element added is the last element removed). Implementing depth first search this way would probably make the code a bit nicer to read / understand for people new to computer science.
I suppose as is the canonical Achilles heel of the field, I am simply too strapped for time to do so here (: It could be a good exercise for the interested reader to translate the recursive function into a non-recursive function that uses a stack.