Imagine for a second we have an array, and we want to find a particular number, and we want to do it more efficiently than the linear search: there’s a row of lockers, and is it in locker #0? No. Ok, is it in locker #1? No. Ok, is it in locker #2? No. Ok, is it in locker #3? Yes? Ok great, you found it!
To do better, we first need to guarantee the array is sorted. There are many
sorting algorithms which we’ll go over soon, but for now, let’s just be hand-wavy,
and assume that somehow, the array was sorted. You should not care in which direction,
ascending or descending, it was sorted.
This method that looks for something through a sorted
collection is called “binary search”, not because of 100100110101101, but
because it splits things in half.
You probably do this (or your parents did), if you ever use a phone book. Imagine
you’re looking for someone whose last name is “Kelly.” K is kind of near the
middle of the alphabet, so you open the phone book in what you think the middle
is. You’ve landed on a page full of “Ingleton” records. Not far enough into the
Ks, but you do know something. Your “Kelly” friend is not in the part of the
book from the beginning until “Ingelton,” since “K” comes after “I.” So you
flip to the next page after “Ingleton,” and treat that as “page 1”.
From there, you flip forward some amount trying to find “Kelly”, and
you land in “Martinez.” Oops, too far. “Kelly” lies somewhere between “Ingelton”
and “Martinez,” so you know two things: you shouldn’t waste time looking
earlier in the alphabet than “Ingleton,” or later than “Martinez.” From “Martinez,”
you flip backward and get to “Jones.” Now, again, you’re before “Kelly,” but
your region to search through keeps shrinking. Now, it only makes sense to
search between “Jones” and “Martinez.” You flip one page after Jones, and you
find your friend’s number.
Harvard’s Dr. David Malan, in their CS50 introductory class, does this with a physical
phone book every time he teaches the class, literally tearing the book in half
(as opposed to just ignoring the irrelevant parts, where the answer cannot be)
in the alphabetized phone book because he wants to find someone.
Imagine you have 1000 items, and you need to find item 371 to figure out what
it is. You could go to item 1, then 2, then 3, then 4… then 370, then 371, and
you’d know the answer, but this other way is much quicker. You go
to item 500 in the middle right away. 371 is less than 500, so you immediately give yourself permission
(since items are somehow sorted) to ignore 501-1000. The middle of your new
search space is 250 (from 1 to 500), so you look there. You’ve looked at 2
items. Now, you look just between 251 and 500, at the middle of
that—376. Now, you’ve looked at 3 items. Your item is somewhere between 250 and
the middle, so you ignore the top half. Now, the middle of the new range is 313.
Your item is after 313, but before 376, so you ignore everything else. Then,
the new middle is 345. Your item is between 345 and 376, so you ignore
everything else. You look at the new middle—361—and decide your item is above
it, so you ignore everything else. Your new range is now just 369 to 376. The
middle is 373, so your item is in the lower half just from 369 to 373. You look
in the middle, and you find your item.
Look how much quicker that was—500, 250, 376, 313, 345, 361, 373, found 371—than actually
examining 370 previous items and failing to find it. This is why binary search
is so powerful, and almost every “find me XYZ in a huge collection” in
technology these days uses it. My best friend’s phone number is saved under her
last name, “Elwood.” If I didn’t know that number by heart and still needed to
look her up to call her, if I were to look for “Elwood” in my phone’s contact
list, then examine the code Android is using to give me back her number, I am
nearly certain that it would be doing this.
Don’t start thinking about binary search in code terms just yet. First, look at
the pseudocode.
function binarySearch(array, target):
low ← 0
high ←
length(array) - 1
while low ≤ high
do:
mid ← low +
(high - low) / 2
if array[mid]
== target then:
return
mid // Target found at index mid
else if
array[mid] < target then:
low ← mid
+ 1 // Search right half
else:
high ← mid
- 1 // Search left half
return -1 // Target not found
Understand how the pseudocode works—match it to finding “Kelly” or “371” in
those stories—and then try it with a very small (maybe 10 items, if not less) sorted
array. Try it out when you both can guarantee you’ll find the item and when you
can guarantee you won’t. Call a friend, a parent, a sibling, a significant
other into the room, and demonstrate the algorithm to them. Only once you’re confident
you can explain it, implement this Java:
public class BinarySearch {
public static int
binarySearch(int[] arr, int target) {
int low = 0;
int high =
arr.length - 1;
while (low
<= high) {
int mid =
low + (high - low) / 2;
if
(arr[mid] == target) {
return
mid; // Target found
} else if
(arr[mid] < target) {
low =
mid + 1; // Search right half
} else {
high =
mid - 1; // Search left half
}
}
return -1; //
Target not found
}
public static void
main(String[] args) {
int[] arr =
{1, 3, 6, 8, 10};
int target =
8;
int result =
binarySearch(arr, target);
if (result !=
-1) {
System.out.println("Value found at index: " + result);
} else {
System.out.println("Value not found");
}
}
}
No comments:
Post a Comment