Monday, March 23, 2009

On Optimization...

This is part one of two on mathematical optimization. Part two will cover an interesting application of this stuff. You may want to skip to part two ;)

I'm in a math class called Optimization this semester. Essentially, we study what methods the computer (in our case, Mathematica) uses to solve optimization problems.

An optimization problem can be as simple as deciding which gas station to go to, with variables of distance and price, or as complex as the math behind Obama's economic policy. At first glance, these problems seem very simple to solve, and indeed, for problems like the gas station choice, they can be solved with naive methods. As always, though, this math topic is more complicated than it seems.

Imagine if you were a particularly mathematically inclined high school senior deciding which college to go to. After your initial college visits, you've taken detailed notes on each college, ranging from the food services (Bowdoin is consistently in the top 2 nationwide) and dorm life to grad school acceptance rates and student:faculty ratio. Further imagine that all of these factors (some quite subjective) can be assigned numerical values. Then to decide which college is best for you, you write an optimization problem that assigns weights (modeled by constraints such as food >= 8.0) to each variable, according to your personal taste.

By the time you're done, there may be 10, 15, 20 variables! And to think, this problem is downright tame compared to most optimization problems (which can have thousands of variables!). While a normal high school student surely runs a version of this very same optimization problem in their heads while choosing a college, you will settle for nothing less than the exact correct college. But how? Can't we just play around with the variables until we get the right answer? Technically, yes, but this is not as easy as approximating distance and price of gas stations. With this sort of complex optimization problem, changing one variable may have an influence on the others, and before you know it, your head is aching from all the possible combinations of variables. (Mathematica to the rescue!!)

But seriously, Mathematica is a great tool for solving optimization problems, but how does it work? That, dear reader, is far too much detail for this already lengthy post. The point I'm getting at here is that some problems can't be solved by tinkering with variables one at a time, because once you have a complicated problem to solve, changing one variable can change the entire nature of the problem. In addition, with multiple variables, there are just too many things to tinker with. For a one variable problem, I could simply graph the function and find the low spot, right? Sure! The formal way to solve this (this is an important distinction from just "using the picture") is to find all the places where the derivative (and thus, slope) equals zero. If my minimizer isn't one of these points, then the slope at my minimizer has some value, like 1, which means I can continue along my function and find a lower point...thus, I didn't really find a minimizer.

For a two variable problem, I can still draw a graph, but I have to do it in 3D. For a three variable problem...uh oh. In fact, even two variable problems can be hard, if the function is ugly enough.

So, for the college optimization problem, we start to see why we needed to move away from simply following the graph to the low point in the example above. I can not draw a graph in 20 dimensions (hey, I'm not a visual arts major!), but even if I could, how exactly would I trace the function along and find the low point? This is a problem even in a 2 variable problem (represented as a 3D graph). I can poke around the graph, zoom in, and perhaps find a spot where the function is minimized locally. But how can I be sure that that is the global minimizer? I really can't, unless I check every part of the graph! Not only is this tedious, it's quite impossible- functions have infinite resolution by definition, so you can't just zoom in enough that the function stops changing. If the function is complex enough, there can be all sorts of crazy behavior lying just below your current detail level.

Thus, we need a symbolic way of solving this problem, akin to solving the derivative function in the one variable example. That way, no matter how crazy the function is, we don't have to go zooming in and panning around forever- we just get the answers. Unfortunately, the methods for higher order (meaning more variables) functions are not as easy.


If this post made very little sense or seems to have no connection to life, go ahead and read part two, where I will attempt to connect these topics to internet companies.

No comments: