# Fun with simple analysis problems I: the rest of the story

In an earlier post with the same title (and without the subtitle) I introduced some thoughts that were triggered by this simple problem:

Suppose that

$|\frac{df}{dx}(x)| \leq \lambda |f(x)|$                    (1)

for all $x$, that $f$ is continuous and differentiable, and that $f(0) = 0$.

Prove that $f(x) = 0$ everywhere.

In that post (which you can find here: Fun with simple analysis problems I), I started by presenting three solutions and then generalized and explored further.

What I did not reveal in that post was that writing it gave me an idea for a more advanced problem. Not too long afterwards, Laramie Paxton joined my group and I gave him this problem to work on for his dissertation. We collaborated in solving the problem, since that is how I mentor all my students: their dissertations are collaborations with me. This resulted in a paper we wrote together, A Singular Integral Measure for $C^{1,1}$ and $C^1$ Boundaries, that can be found here.

Laramie Paxton arrived at WSU quite naive with respect to analysis, having completed an online masters in mathematics that did not give him a good foundation in analysis. But he very quickly adopted habits that led to rapid progress. He started by studying intensely the summer before arriving and passed the qualifying exam on his first try. Then he took my challenging undergraduate analysis course (I used Fleming’s Functions of Several Variables), pushed through courses in advanced analysis and geometric measure theory, worked on applications in image analysis (generating papers he actually led), and finished his dissertation, all in the space of two years. After a year of postdoc, he landed the job he is about to start, at Marian University in Wisconsin. I believe that both the University and Laramie are lucky to have each other.

In general, I believe that small universities are good places to be nowadays, but from everything I hear, this place is better than good — it is perfect for Laramie’s talents and skills. (In addition to his impressively growing mathematical skills, he was already phenomenally skilled in logistics and organization which can be seen in his highly effective help in making the events listed here, from April 2017 to July 2018, a reality.)

A major point of both the original post on the problem and this present post, is that the paper with Laramie, as well as the results in the first post, flowed from taking time to think about a simple analysis problem that would usually be viewed as a not-too-hard exercise, not worthy of more thought than it takes to find one solution.

While I am sure that there are other undiscovered aspects of the problem that launched these two posts and Laramie’s dissertation problem, I believe that what has been explored illustrates why it makes sense to treat simple problems as invitations to playful exploration and creativity.

# Median Shapes

When I wrote the paper with Simon Morgan pointing out that the $L^1\text{TV}$ functional was actually computing the flat norm for boundaries, we suggested this gave us a computational route to statistics in spaces of shapes. While earlier work certainly touched on this idea of using the flat norm for inference in shape spaces (see this paper on shape recognition), it was not until my student Yunfeng Hu collaborated with me and Bala Krishnamoorthy (my collaborator, also a co-mentor of Yunfeng’s) that we started addressing the idea of statistics in shape spaces suggested in the original paper with Simon Morgan.

The results can be found here: https://arxiv.org/abs/1802.04968 , in a paper with the title Median Shapes, with authors Yunfeng Hu, Matthew Hudelson, Bala Krishnamoorthy, Altansuren Tumurbaatar, Kevin R. Vixie. Tumurbaatar wrote the first complete version of the code used, and Matthew Hudelson contributed a pivotal new result on graphs inspired by a problem in the paper, while Bala led the computational end of things and I led, in collaboration with Yunfeng Hu (and Bala keeping us honest!), the theoretical parts of the paper. It was a fair bit of work.

We went over the more difficult results a few times, finding improvements and corrections. Of course, there may be a few things here and there to improve, but for now, it is done.

Yunfeng probably spent the most time writing up the piece proving that near regular points on the median, the collection of minimal surfaces meeting the median have a tangent structure we describe as a book.  While this is clear to experienced geometric analysts,  there are lots of little details and we wanted most of the paper to be more accessible to a wider audience. There are lots of other pieces here and there that took time to think about and write up (and rewrite). For example, when showing the set of medians need not contain any regular members,  the part where we show that we need only consider graphs when searching for a minimizer was not easy. And of course, as in most all of geometric analysis, there are problems you solve without too much effort at a high level, but find that writing down is tedious, though at times enlightening due to the fact that those little details turn out to be hard and illuminating.

Because the problem of computing the median reduces to a linear program, while the mean reduces to a quadratic program, we focused on the median problem. Some parts of the paper are a bit long-winded because we wanted it to have more details than would usually be in a paper communicating to others who understand geometric analysis.

Anyway, have a look. If you find yourself interested, there is already code you can use to compute medians, though we hope eventually to have faster code.

# Fun with simple analysis problems I

This last semester, I ran a fun, informal master class in problem solving. Actually, a graduate student of mine — Yunfeng Hu — who is an expert problem solver, produced all the problems from the immense library he built up over his undergraduate career in China.

I believe that the art and culture of problem solving is not as widely valued in the USA as it ought to be. Of course there are those that do pursue this obsession and we end up with people with high scores on the Olympiad and Putnam competitions. But many (most?) do not develop this skill to any great degree. While one can certainly argue that too much emphasis on problem solving along the lines of these well known competitions does not help very much in making real progress in current research, I would argue that many have fallen off the other side of the horse — many are sometimes hampered by their lack of experience in solving these simpler problems.

Here is a problem that arose in our Wednesday night session:

Suppose that

$|\frac{df}{dx}(x)| \leq \lambda |f(x)|$                    (1)

for all $x$, that $f$ is continuous and differentiable, and that $f(0) = 0$.

Prove that $f(x) = 0$ everywhere.

Perhaps you want to fiddle with this problem before looking at some solutions. If so, wait to read further.

Here are three solutions: in these first three solutions we are dealing with $f:\Bbb{R}\rightarrow\Bbb{R}$ and so we will denote $\frac{df}{dx}$ by $f'$.

(Solution 1) Consider all the solutions to $g' = 2\lambda g$ and $h' = -2\lambda h$. These are all curves in $\Bbb{R}^2$ of the form $y = g_{C}(x) = Ce^{2\lambda x}$ and $y = h_{C}(x) = Ce^{-2\lambda x}.$ We note that if $y=f(x)$ is any function that satisfies equation (1), then everywhere its graph intersects a graph of a curve of the form $y = g_{C}(x) = Ce^{2\lambda x}$ for some $C > 0$, the graph of $f$ must cross the graph of $g_{C}$: if we are moving from left to right, the graph of $f$ moves from above to below the graph of $g_{C}$. Likewise, $f$ crosses any $h_{C}$ with $C > 0$ from below to above, when moving from left to right in $x$. Now suppose that $f(x*) > 0$ at some $x*$. Then simply choose the curves $g_{\frac{f(x*)}{e^{2\lambda x*}}}(x)$ and $h_{\frac{f(x*)}{e^{-2\lambda x*}}}(x)$ as fences that cannot be crossed by $f(x)$ (one for $x < x*$ and the other for $x > x*$) to conclude that $f(x)$ can never equal zero. (Exercise: Verify that this last statement is correct. Also note that assuming $f(x*) > 0$ is enough since, if instead $f(x*) < 0$, then $-f$ also satisfies (1) and is positive at $x*$.)

(Solution 2) This next solution is a sort of barehanded version of the first solution. We note that equation (1) is equivalent to

$-\lambda |f(x)| \leq f'(x) \leq \lambda |f(x)|$            (2)

and if we assume that $f(x) > 0$ on $E\subset\Bbb{R}$, then this of course turns into

$-\lambda f(x) \leq f'(x) \leq \lambda f(x)$.            (3)

Assume that $[x_0,x_1]\subset E$ and divide by $f(x)$ to get $-\lambda \leq \frac{f'(x)}{f(x)} \leq \lambda.$ Integrating this from $x_0$ to $x_1$, we have $-\lambda (x_1 - x_0) \leq \ln(\frac{f(x_1)}{f(x_0)}) \leq \lambda (x_1 - x_0)$ or

$e^{-\lambda (x_1 - x_0)} \leq \frac{f(x_1)}{f(x_0)} \leq e^{\lambda (x_1 - x_0)}.$            (4)

Now assume that $f(x*) > 0$. Define $u = \sup \{ w | f(x) > 0 \text{ for all } x* < x < w\}$ and $l = \inf \{ w | f(x) > 0 \text{ for all }x* > x > w \}.$ Note that $l \neq -\infty$ implies that $f(l) = 0$ and $u \neq \infty$ implies that $f(u) = 0.$ Use equation (4) together with $\{\text{a sequence of }x_0\text{'s} \downarrow l \text{ and }x_1 = x*\}$ or $\{x_0 = x* \text{ and a sequence of }x_1\text{'s} \uparrow u\}$,  to get a contradiction if either $l \neq -\infty$ or $u \neq \infty.$
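As a quick numerical sanity check of inequality (4), here is a minimal Python sketch. The sample function $f(x) = e^{\lambda \sin x}$ and the value of `LAM` are assumptions chosen only for illustration (they do not come from the problem); this $f$ is positive everywhere and satisfies $f'(x) = \lambda\cos(x) f(x)$, so it obeys (1).

```python
import math

LAM = 1.5  # illustrative value for lambda in inequality (1)

def f(x: float) -> float:
    """Sample positive function with f'(x) = LAM*cos(x)*f(x), so |f'| <= LAM*|f|."""
    return math.exp(LAM * math.sin(x))

def ratio_within_bounds(x0: float, x1: float, tol: float = 1e-12) -> bool:
    """Check inequality (4) for a pair x0 <= x1, with a small relative tolerance."""
    ratio = f(x1) / f(x0)
    lower = math.exp(-LAM * (x1 - x0))
    upper = math.exp(LAM * (x1 - x0))
    return lower * (1 - tol) <= ratio <= upper * (1 + tol)

# Scan all ordered pairs x0 <= x1 on a grid covering [-5, 5].
grid = [-5 + 0.25 * k for k in range(41)]
all_ok = all(ratio_within_bounds(a, b) for a in grid for b in grid if a <= b)
print(all_ok)
```

A check like this proves nothing, of course, but it is a cheap way to catch a sign error in (4) before building on it.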

(Solution 3) In this approach, we use the mean value theorem to get what we want. Suppose that $f(x_0) = 0$. We will prove that $f(x) = 0$ on the interval $I = [x_0 - \frac{1}{2\lambda}, x_0 + \frac{1}{2\lambda}]$.

(exercise) Prove that this shows that $\{f(x) = 0\text{ for some } x\}$ $\Rightarrow$ $\{f = 0\text{ for all } x\in\Bbb{R}\}$. (Of course, all this assumes Equation (1) is true.)

Assume that $x\in I$. Then the mean value theorem says that

$|f(x) - f(x_0)| \leq |f'(y_1)| |x - x_0| \leq |f'(y_1)| \frac{1}{2\lambda}$               (5)

for some $y_1\in I$. But using equation (1) and the fact that $f(x_0) = 0$, this turns into $|f(x)| \leq \frac{1}{2}|f(y_1)|.$ By the same reasoning, we get that $|f(y_1)| \leq \frac{1}{2}|f(y_2)|$ for some $y_2 \in I$, and we can conclude that $|f(x)| \leq \frac{1}{2^{2}}|f(y_2)|$. Repeating this argument, we have

$|f(x)| \leq \frac{1}{2^{n}}|f(y_n)|$                  (6)

for some $y_n \in I$, for any positive integer $n$. Because $f$ is continuous and $I$ is compact, we know that there is an $M < \infty$ such that $|f(x)| < M \text{ for all } x\in I$. Using this fact together with Equation (6), we get

$|f(x)| \leq \frac{M}{2^{n}} \text{ for all positive integers } n$                (7)

which of course implies that $f(x) = 0$.

Now we could stop there, with three different solutions to the problem, but there is more we can find from where we are now.

Notice that one way of looking at the result we have shown is that if

(1) $f$ is differentiable,

(2) $f(x_0)=0$ and

(3) for some $\delta > 0$, we have that $f(x) \neq 0$ when $x \neq x_0$ and $x\in [x_0 - \delta, x_0 + \delta]$,

then

$\limsup_{x\rightarrow x_0}A_{f}(x) = \infty, \text{ where } A_{f}(x) \equiv \left|\frac{f'(x)}{f(x)}\right|.$              (8)

Note also that if we define

$a(f) \equiv \sup_{x\in\Bbb{R}} A_{f}(x)$                  (9)

we find that

$a(f) = a(\alpha f) \text{ for all }\alpha\neq 0.$               (10)
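To make the quantity $a(f)$ and the scale invariance in Equation (10) concrete, here is a small Python sketch that approximates the supremum on a grid. The function $f(x) = 2 + \sin x$, the grid, and the scalar `alpha` are assumptions picked for illustration; a grid maximum only approximates the true supremum from below.

```python
import math

def growth_ratio_sup(f, df, xs):
    """Approximate a(f) = sup |f'(x)/f(x)| by the maximum over the sample points xs."""
    return max(abs(df(x) / f(x)) for x in xs)

def f(x):
    """Illustrative function with no roots: f(x) = 2 + sin(x)."""
    return 2.0 + math.sin(x)

def df(x):
    """Derivative of the illustrative function."""
    return math.cos(x)

# Sample points on [-10, 10] with spacing 0.001.
xs = [-10 + 0.001 * k for k in range(20001)]
a_f = growth_ratio_sup(f, df, xs)

# Scale f by a nonzero constant; the ratio |f'/f| should not change.
alpha = -7.3
a_scaled = growth_ratio_sup(lambda x: alpha * f(x), lambda x: alpha * df(x), xs)

print(a_f, a_scaled)
```

For this particular $f$ the ratio $|\cos x|/(2+\sin x)$ is maximized where $\sin x = -1/2$, giving $a(f) = 1/\sqrt{3}$, and the two printed values should agree up to rounding.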

Let $C^{1}(\Bbb{R},\Bbb{R})$ denote the continuously differentiable functions from $\Bbb{R}\text{ to }\Bbb{R}.$ If we define $C_{\lambda} = \{f | a(f) \leq \lambda\}$ we find that not only is $\bigcup_{n\in\Bbb{Z}^{+}} C_n$ not all of $C^{1}(\Bbb{R},\Bbb{R})$, we also have functions satisfying $0 < b \leq f(x) \leq B < \infty$ whose $a(f) = \infty$. So we will restrict the class of functions a bit more. The space of continuously differentiable functions from $K\subset \Bbb{R}$ to $\Bbb{R}$, $C^{1}(K,\Bbb{R})$, where $K = [-R,R]$ (compact!), is closer to what we want. Now, $C_{\infty} \setminus\bigcup_{n\in\Bbb{Z}^{+}} C_n$ contains only those functions which have a root in $K$.

We will call the functions in $C_{\lambda} \subset C^{1}(K,\Bbb{R})$ functions with maximal growth rate $\lambda$. This is a natural modulus for functions when we are studying stuff whose (maximal) growth rate depends linearly on the current amount of stuff. Of course populations of living things fall in the class of things for which this is true. From the proofs above, we know that if $f\in C_\lambda$, then its graph lives in the cone defined by exponentials. More precisely:

If $a(f) = \lambda$ then for $x < x_0$,   $\frac{f(x_0)}{e^{\lambda x_0}}e^{\lambda x} \leq f(x) \leq \frac{f(x_0)}{e^{-\lambda x_0}}e^{-\lambda x}$   and for $x > x_0$ we have $\frac{f(x_0)}{e^{-\lambda x_0}}e^{-\lambda x} \leq f(x) \leq \frac{f(x_0)}{e^{\lambda x_0}}e^{\lambda x}.$

(Exercise) Prove this. Hint: use the first proof where instead of $2\lambda$ you use $\alpha\lambda$ and let $\alpha\downarrow 1$.

(Remark) Notice that Equation (10) and $\lambda < \infty$ implies that scaling a function in $C_\lambda$ by any non-zero scalar yields another function in $C_\lambda.$ As a result, we might choose to consider only

$F \equiv \{f\in C_\lambda\text{ such that }f(0) = 1\}$

or

$F \equiv \{\text{ functions whose minimum value on }K\text{ is }1\}.$

In both cases we end up with subsets that generate $C_\lambda$ when we take all multiples of those functions by nonzero real numbers.

(Exercise) If we move to higher dimensional domains, how wild can the compact set $K$ be and still get these results? It must clearly be connected, so in $\Bbb{R}^1$ we are already completely general with our $K$ above.

Moving back to Equation (1), we can look for generalizations: for example, will this result hold when $f:\Bbb{R}^{n} \rightarrow \Bbb{R}^{m}?$ How about when $f$ maps from one Banach space to another? How about the case in which $f$ is merely Lipschitz?

Let's begin with $f:\Bbb{R}^{n} \rightarrow \Bbb{R}^{m}.$

In this case, the appropriate version of Equation (1) is

$||Df(x)|| \leq \lambda ||f(x)||$                    (11)

where $||Df(x)||$ denotes the operator norm of the derivative $Df(x)$ and $||f(x)||$ is the euclidean norm of $f(x)$ in $\Bbb{R}^m.$

Notice that

$D\ln(||f(x)||) = \frac{1}{||f(x)||}\left(\frac{f(x)}{||f(x)||}\right)^{t}Df(x)$                    (12)

where $\left(\frac{f(x)}{||f(x)||}\right)^{t}$ is an $m$ dimensional row vector and $Df(x)$ is an $m\text{ by }n$ matrix. (Thus the gradient vector is the transpose of the resulting $n$ dimensional row vector.)  Now we can use this to get the result.
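Equation (12) is easy to check numerically. The following Python sketch compares the right-hand side of (12) with central finite differences of $\ln||f||$ at one point. The map `f`, its hand-computed `jacobian`, the test point, and the step size `h` are all assumptions made for illustration.

```python
import math

def f(x):
    """Illustrative map R^2 -> R^2; its first component is >= 1, so f never vanishes."""
    x1, x2 = x
    return (x1 * x1 + 1.0, x1 * x2 + math.sin(x2))

def jacobian(x):
    """Df(x) as an m-by-n (here 2-by-2) matrix: row i holds the partials of f_i."""
    x1, x2 = x
    return ((2.0 * x1, 0.0),
            (x2, x1 + math.cos(x2)))

def log_norm(x):
    """ln ||f(x)|| with the Euclidean norm."""
    return math.log(math.hypot(*f(x)))

def grad_formula(x):
    """Right-hand side of (12): (1/||f||) (f/||f||)^t Df, a row vector of length n."""
    fx = f(x)
    J = jacobian(x)
    norm_sq = fx[0] ** 2 + fx[1] ** 2  # ||f(x)||^2
    return tuple((fx[0] * J[0][j] + fx[1] * J[1][j]) / norm_sq for j in range(2))

def grad_numeric(x, h=1e-6):
    """Central finite differences of ln ||f|| in each coordinate direction."""
    g = []
    for j in range(2):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        g.append((log_norm(xp) - log_norm(xm)) / (2 * h))
    return tuple(g)

point = (0.7, -1.2)
max_diff = max(abs(a - b) for a, b in zip(grad_formula(point), grad_numeric(point)))
print(max_diff)
```

The printed difference should be tiny, limited only by the finite difference step and rounding.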

Let $\gamma(s) = x(s)$ be the arclength parameterized line segment that starts at $x_0$ and ends at $x_1$, so that the tangent $\dot{x}(s)$ is a unit vector. Assuming $f \neq 0$ along $\gamma$, the above equation tells us that

$\int_{\gamma} D\ln(||f(x(s))||)\,\dot{x}(s)\, ds = \int_{\gamma} \frac{1}{||f(x(s))||}\left(\frac{f(x(s))}{||f(x(s))||}\right)^{t}Df(x(s))\,\dot{x}(s)\, ds \leq \int_{\gamma} \frac{||Df(x(s))||}{||f(x(s))||} ds.$        (13)

Thus, we can conclude that

$\ln(||f(x_1)||) - \ln(||f(x_0)||) \leq \lambda ||x_1 - x_0||$

which, after exchanging the roles of $x_0$ and $x_1$, implies that

$-\lambda ||x_1 - x_0|| \leq \ln\left(\frac{||f(x_1)||}{||f(x_0)||}\right) \leq \lambda ||x_1 - x_0||$

and we can proceed as we did in the second proof of the problem in the case that $f:\Bbb{R}\rightarrow \Bbb{R}.$ We end up with the following result

If $||Df(x)|| \leq \lambda ||f(x)||$  and $||f(x)|| \neq 0 \text{ for all } x\in B(x*,r)\subset\Bbb{R}^n$, then

$e^{-\lambda ||x - x*||} \leq \frac{||f(x)||}{||f(x*)||} \leq e^{\lambda ||x - x*||}$

for all $x\in B(x*,r).$
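Here is a hedged numerical illustration of this result in $\Bbb{R}^2$. The map $f(x) = e^{A\sin x_1}(\cos(Bx_2), \sin(Bx_2))$ and the constants `A`, `B` are assumptions chosen because the hypothesis can be verified by hand: the columns of $Df$ are orthogonal, so $||Df(x)|| \leq \max(|A|,|B|)\,||f(x)||$. The sketch checks both the hypothesis (11) and the exponential bound at random points, using a closed form for the operator norm of a $2\times 2$ matrix.

```python
import math
import random

A, B = 0.8, 1.3              # illustrative parameters of the sample map
LAM = max(abs(A), abs(B))    # for this map, ||Df(x)|| <= LAM * ||f(x)||

def f(x):
    """Sample map R^2 -> R^2 that never vanishes."""
    r = math.exp(A * math.sin(x[0]))
    return (r * math.cos(B * x[1]), r * math.sin(B * x[1]))

def jacobian(x):
    """Df(x) as a 2-by-2 matrix, computed by hand for the sample map."""
    r = math.exp(A * math.sin(x[0]))
    c, s = math.cos(B * x[1]), math.sin(B * x[1])
    dr = A * math.cos(x[0]) * r
    return ((dr * c, -B * r * s),
            (dr * s, B * r * c))

def op_norm_2x2(J):
    """Operator (spectral) norm: square root of the largest eigenvalue of J^t J."""
    p = J[0][0] ** 2 + J[1][0] ** 2
    q = J[0][0] * J[0][1] + J[1][0] * J[1][1]
    r = J[0][1] ** 2 + J[1][1] ** 2
    return math.sqrt((p + r) / 2 + math.sqrt(((p - r) / 2) ** 2 + q * q))

def norm(v):
    """Euclidean norm of a 2-vector."""
    return math.hypot(v[0], v[1])

rng = random.Random(0)  # fixed seed so the run is reproducible

def sample():
    return (rng.uniform(-3, 3), rng.uniform(-3, 3))

TOL = 1e-12  # relative tolerance for floating point rounding
hypothesis_ok = True
bound_ok = True
for _ in range(2000):
    x, y = sample(), sample()
    hypothesis_ok &= op_norm_2x2(jacobian(x)) <= LAM * norm(f(x)) * (1 + TOL)
    d = math.hypot(x[0] - y[0], x[1] - y[1])
    ratio = norm(f(x)) / norm(f(y))
    bound_ok &= math.exp(-LAM * d) * (1 - TOL) <= ratio <= math.exp(LAM * d) * (1 + TOL)

print(hypothesis_ok, bound_ok)
```

Since this $f$ has no zeros, the ball $B(x*,r)$ can be taken as large as we like, which is why random pairs over a square are a fair test here.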

(Exercise) Show that this result implies that if $f(x) = 0$ anywhere, it equals $0$ everywhere.

(Exercise) Show that this implies the one dimensional result we proved above (the first theorem we proved above).

(Exercise) Our proof of the result for the case $f:\Bbb{R}^n\rightarrow\Bbb{R}^m$ can be carried over to the case of $f:B_1 \rightarrow B_2$ where $B_1\text{ and }B_2$ are Banach Spaces — carry out those steps!

We come now to the question of what we can say when we are less restrictive with the constraints on differentiability.  We consider the case in which $f:\Bbb{R}^n\rightarrow\Bbb{R}^m$ is Lipschitz. The complication here is that while we know that $f$ is differentiable almost everywhere, it might not be differentiable anywhere on the line segment from $x_0$ to $x_1$.

Consider a cylinder $C_{x_0}^{x_1}(1)$, with radius $1$ and axis equal to the segment from $x_0\text{ to }x_1.$ Let $E = C_{x_0}^{x_1}(1) \cap \{x| Df(x)\text{ exists }\}$. Since $f$ is differentiable almost everywhere, we have that $\mathcal{L}^n( C_{x_0}^{x_1}(1)\setminus E) = 0$. Therefore almost every segment $L$ generated by the intersection of a line parallel to the cylinder axis and the cylinder, intersects $E$ in a set of length $||x_1 - x_0||$. We can therefore choose a sequence of such segments converging to $[x_0,x_1].$

Since $Df$ exists $\mathcal{H}^1$ almost everywhere on the segments $[x_0^k, x_1^k]$  and $f$ is continuous everywhere, we can integrate the derivatives to get:

$-\lambda ||x_1^k - x_0^k|| \leq \ln\left(\frac{||f(x_1^k)||}{||f(x_0^k)||}\right) \leq \lambda ||x_1^k - x_0^k||.$

And because $f$ is continuous we get that

$-\lambda ||x_1 - x_0|| \leq \ln\left(\frac{||f(x_1)||}{||f(x_0)||}\right) \leq \lambda ||x_1- x_0||.$

so that we end up with the same result that we had for differentiable functions.
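The complication handled by the cylinder argument can be seen in a small example. In the Python sketch below, $f(x) = e^{\lambda\max(|x_1|,|x_2|)}$ is an assumed example (Lipschitz on bounded sets, which is all the local argument needs) that fails to be differentiable at every point of the diagonal segment from $(1,1)$ to $(2,2)$. Shifting the segment sideways by $\epsilon_k \rightarrow 0$ gives parallel segments on which $f$ is differentiable, and the log ratios on those segments converge to the log ratio on the original one.

```python
import math

LAM = 2.0  # illustrative growth constant

def f(x):
    """Not differentiable where |x1| = |x2|; elsewhere ||Df|| = LAM * f."""
    return math.exp(LAM * max(abs(x[0]), abs(x[1])))

# The whole segment from x0 to x1 lies on the diagonal, where Df does not exist.
x0, x1 = (1.0, 1.0), (2.0, 2.0)
length = math.hypot(x1[0] - x0[0], x1[1] - x0[1])

def log_ratio(a, b):
    """ln(f(b)/f(a)), the quantity bounded by LAM times the segment length."""
    return math.log(f(b) / f(a))

# Parallel segments shifted perpendicular to the diagonal by eps_k -> 0.
shifted = []
for k in range(1, 8):
    eps = 10.0 ** (-k)
    a = (x0[0] + eps, x0[1] - eps)
    b = (x1[0] + eps, x1[1] - eps)
    shifted.append(log_ratio(a, b))

limit_value = log_ratio(x0, x1)
print(shifted[-1], limit_value, LAM * length)
```

Both the shifted values and the limit stay within $\pm\lambda ||x_1 - x_0||$, as the argument predicts.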

There are other directions to take this.

From the perspective of geometric objects, the ratio $\frac{||Df||}{||f||}$ is a bit funky. For example, if $f(x) = \mathcal{L}^n(E(x))$ is the volume of a set $E(x)\subset \Bbb{R}^n$, where $x$ can be thought of as the center of the set, we have that $Df$ will be a vector field $\eta$ times $\mathcal{H}^{n-1}$ restricted to $\partial E(x)$. Thus, $||Df||$ will be an $(n-1)$-dimensional quantity and $f$ an $n$-dimensional quantity. We would usually expect there to be exponents, as in the case of the Poincaré inequality, making the ratio non-dimensional.

On the other hand, one can see this ratio as a sort of measure of reciprocal length of the objects we are dealing with. From this perspective, this result seems to say that no matter what you do, you cannot get to objects with no volume from objects with non-zero volume without getting small (i.e. without the reciprocal length diverging). This is not profound. On the other hand, that ratio is precisely what is important for certain physical/biological processes. So this quantity being bounded has consequences in those contexts.

This does not lead to a new theorem: as long as the set evolution is smooth, the $f$ and $Df$ are just a special case where $f:\Bbb{R}^n\rightarrow\Bbb{R}^1$ and even though actually computing everything from the geometric perspective can be interesting, the result stays the same.

In order to move into truly new territory, we need to consider alternative definitions, other measures of change, other types of spaces. An example might be the following:

Suppose that $X$ is a metric space and $f:X\rightarrow \Bbb{R}$. Suppose that $\gamma:\Bbb{R}\rightarrow X$ is continuous and is a geodesic in the sense that for any three points in $\Bbb{R}$, $s_1 < s_2 < s_3$, we have that $\rho(\gamma(s_1),\gamma(s_3)) = \rho(\gamma(s_1),\gamma(s_2)) + \rho(\gamma(s_2),\gamma(s_3)).$

If:

(1) for any two points in the metric space there is a $\gamma$ containing both points and

(2) for all such $\gamma$, $g_{\gamma} \equiv f\circ\gamma$ is differentiable

(3) and $\frac{|g_{\gamma}'(s)|}{|f(\gamma(s))|} \leq \lambda$

then, we have that

$-\lambda \rho(x_1, x_0) \leq \ln\left(\frac{|f(x_1)|}{|f(x_0)|}\right) \leq \lambda \rho(x_1,x_0).$                       (14)

And, again we get the same type of result for this case as we got in the Euclidean cases above.

(Exercise)  Prove Equation (14).

(Remark) We start with any metric space and consider curves $\gamma:[a,b]\subset\Bbb{R}\rightarrow X$ for which

$l(\gamma)\equiv\sup_{\{\{s_i\}_{i=1}^{n}| a = s_1 \leq s_2 \leq ... \leq s_n = b\}} \sum_{i=1}^{n-1} \rho(\gamma(s_{i}),\gamma(s_{i+1})) < \infty$.

We call such curves rectifiable. We can always reparameterize such curves by arclength, so that $\gamma(s) = \gamma(s(t)), t\in[0,l(\gamma)]$ and $l([\gamma(s(d)),\gamma(s(c))] ) = d-c$. We will assume that all curves have been reparameterized by arclength. Now define a new metric

$\tilde{\rho}(x,y) = \inf_{\{\gamma | \gamma(a) = x\text{ and }\gamma(b) = y\}} l(\gamma).$

You can check that this will not change the length of any curve. Define an upper gradient of $f:X\rightarrow \Bbb{R}$ to be any non-negative function $\eta_f:X\rightarrow \Bbb{R}$ such that $|f(y) - f(x)| \leq \int_{\gamma} \eta_f(\gamma(t)) dt$ for every rectifiable curve $\gamma$ joining $x$ to $y$.

Now, if $\frac{|\eta_f(x)|}{|f(x)|} \leq \lambda$, we again get the same sort of bounds that we got in equation (14) if we replace $\rho$ with $\tilde{\rho}$. To read more about upper gradients, see Juha Heinonen’s book Lectures on Analysis in Metric Spaces.

While there are other directions we could push, what we have looked at so far demonstrates that productive exploration can start from almost anywhere. While we encounter no big surprises in this exploration, the exercise illuminates exactly why the result is what it is and this solidifies that understanding in our minds.

Generalization is not an empty exercise — it allows us to probe the exact meaning of a result. And that insight facilitates a more robust, more useful grasp of the result. While some get lost in their explorations and would benefit from touching down to the earth more often, it seems to me that in this day and age of no time to think, we most often suffer from the opposite problem of never taking the time to explore and observe and see where something can take us.

# Doing Mathematics

I have come to question a significant portion of the culture in academia, even while I have developed a deeper connection with other parts of that same culture or at least the culture that we could have. While I am deeply committed to mathematics as a creative occupation, and to teaching and mentoring in mathematics, my experience in academia after re-entering it seven years ago has strengthened my rejection of many parts of that culture because they hinder the best research and teaching.

There are many aspects I could discuss, but here I am singling out four: the question of what makes a mathematical result or paper worthy of recognition together with the place of exposition in mathematics,  the value of awards and recognitions in mathematics, and the effects of federal funding on mathematics and academia.

As opposed to trying to do some sort of statistical study (a study which would only be meaningful if there were sufficient numbers of people following the ideas I propose, and there are not!), I will invoke common sense and intuitions that are commonly agreed on, but usually discarded as a guide for actions because of the economic realities of higher education; the institutions that pay us expect and reward the defective model and very few actively step outside those bounds.

What comes from the idea that results are best if they are definitive? Frankly speaking, I believe this idea is part of a cluster of ideas that impoverishes mathematics and mathematical culture.

I first thought about this when reading Bill Thurston’s 1994 article On Proof and Progress in Mathematics. In this article he contrasted how he approached his first work on foliations (resolve all questions, definitively!) versus his later work in geometry and the huge difference a more generous approach made in creating a rich, open, inspiring environment that many others got involved in, rather than the pinnacle of achievement that was admired from a distance.

Instead of maintaining a museum of monuments, we should propagate a countryside filled with rich, diverse gardens of ideas and a zoo of people tending and changing and expanding and creating new gardens.  While the first model leaves a trail of impressive facts, fit for admiration and worship, the second model is defined by engagement and inspiration for widespread creativity.

When Henry Helson visited Poland after the war, he was struck by the purity and simplicity of the mathematical culture that was also very generous. As he relates in his 1997 Notices article, Mathematics in Poland after the War, he was struck by the combination of generosity and fun that pervaded a culture that was serious about mathematics, but happy to publish things that did not aim to grab and own whole swaths of mathematical territory. Rather they published relatively short papers, each of which presented one new idea very clearly.

That exposition has been neglected, in spite of all the lip service to the contrary, can be seen in the response to the astrobites.org site, which has gained a lot of attention in the astrophysics community because of the large contrast between the high quality exposition that astrobites.org offers and the usual difficulty that non-experts have in reading scholarly papers.

I am now convinced that the high art of exposition should be valued as highly as the construction of brand new theorems, that publishing in such a way as to leave much to others is better than cleaning up an area and creating a monument: that what gets considered valuable mathematics ought to be greatly broadened. If anyone finds value — maybe because of explanations that require original thought, maybe because it brings the ideas to new audiences, maybe because it helps students see something clearly, maybe because it brings the understanding to the general public, and yes, possibly because it is completely original and surprising in construction — then it is valuable mathematics, worthy of the deepest respect. In this new model, the quality of the writing becomes very important. (I suspect that some will take issue with that statement saying that this is not a new model, but I will disagree and point to the enormous quantity of poorly written articles and books, some of which are also very valuable, even though they are not written very well. Of course, there are papers and books that are very, very well written. But it seems that this is considered a cherry on top, rather than something that should always, before anything else, be there.)

I am not urging that there be an effort to police exposition, but rather that this be given a great deal more attention at every level of education and practice. If we must have awards, let them go to those that have explained things well, have written things well. Better yet, train students to pursue the intrinsic rewards of doing anything well, from explaining derivatives to a confused calculus student to proving some new, highly technical theorem.

To encourage such changes, we would need to revisit how we reward and support the mathematical enterprise. This brings us to the consideration of the last two cultural components I said I was going to discuss: awards and federal funding.

Why do mathematics? For me, it is another form of art and at the same time, an exploration of the universe we live in. Knowing and understanding and explaining and inspiring others to do the same, exercises deep creativity and generosity; this is an occupation worthy of human beings that value themselves and others. Of course, there are an enormous number of occupations that can beneficially occupy the human mind and spirit. And each one can be as satisfying and beautiful and useful in its pursuit. By useful, I mean useful as an occupation, not useful as a tool to bend the world to my will. It is the occupation itself that is valuable. What happens to us and those we teach and share with, when we occupy ourselves (in a healthy environment!) is the greatest justification for any occupation.

From this position it becomes clear that awards and honors that many aspire to are actually a distraction. The reward is in the occupation itself. There are of course honors that have more to do with real appreciation rather than ranking and fame, and for such honors there is a place in a healthy culture. But the greed that masquerades in all of us as something more beautiful, seeks fame and fortune as a substitute for love and respect, whose lack actually gives room to that greed in the first place.

When the American Mathematical Society proposed the status of Fellow of the society, the negative side effects of such a program were pointed out rather eloquently by multiple individuals. In particular, I remember Frank Morgan’s argument against the establishment of the program, and Neal Koblitz’ refusal of the offer of the status of Fellow. Of course, there is also the curious case of Perelman who refused the Fields Medal, the mathematical equivalent of the Nobel prize, whose recipients are given a demi-god status. For an interesting telling of the story and more, see Sylvia Nasar and David Gruber’s article Manifold Destiny in the August 28, 2006 issue of the New Yorker. (In the story, they quote Gromov, another prominent mathematician. Even though I very much doubt Gromov’s explanation of Perelman’s refusal as a result of some great purity on Perelman’s part, it is a story worth reading and thinking about.)

The influence of federal funding in mathematics, while it has enabled a great expansion of the enterprise, has led to a degradation of the culture, and not only in mathematics. It is well known that federal funding has turned academia into a serious addict, willing to do anything for the next fix of federal funds. That, combined with, spurred on by, the neglect of higher education in the public sector, has led to the very bad state of affairs in which grant money reigns supreme, fame (which can be turned into money!) comes second and teaching, for all the lip service it is given, occupies the lowest realms of academia. Proof of this diagnosis is not needed by anyone in academia (other than administrators who profit from illusions proposing some other reality), but if proof is needed, one need not look any further than the way adjuncts and instructors, who do a great deal of the teaching, are treated. Both in terms of the dismal pay and the insecurity of their jobs, we are saying that teaching is not what a university is really about — it is just what we have to do to keep up the charade.

But this is also where the tragedy lies; it lies in the immense impoverishment that results when teaching is not given top priority. It is a law of nature that real greatness, true stature, is proportional to the service to others that an entity or person actually provides. You may prefer to see this as my definition of greatness and stature. Either way, assuming this to be true, we have traded real nobility for a meager, greedy existence when we accept the perverted system of values that we currently have at research universities — and even, in some ways at teaching universities.

While small liberal arts colleges do in fact value teaching, they still take advantage of the situation generated by research universities and often pay their adjuncts obscenely low wages. It is tragic and funny at the same time that such colleges are usually full of people who think that businesses ought to raise the minimum wage, provide health care and longer paid vacations, and all sorts of other good ideas, but when it comes to the situation they have power over, they turn a curiously blind eye. But there is also this idolization of research universities and of elite institutions, and this admiration pulls in some of the poison that they could otherwise easily avoid.

But, as I wrote in the previous post in this blog,  Learning to Think and to Act, research is a critical piece in education. It inspires and illuminates and brings a freshness and vitality that should be insisted on. On the other hand, research without teaching becomes selfish and elitist and aimed at goals that can at times be silly and irrelevant in their isolation.

What then, can we do? If the system is so far astray, what can be done?

In my opinion, the most powerful thing you can do is inspire change in your own sphere of influence by a focus on the place of freedom you actually have. Having your principles and philosophy aligned with life and love, and consistently acting in accordance with them, has always been the most powerful thing anyone could do.

Creative exploration and teaching, with a deep sensitivity for those that struggle; the pursuit of both pure and applied research, with generosity, and an acute sense for which applications are morally admirable; a discipline of simplicity, eliminating the pursuit of rank or awards or status or recognition — these are still the fundamental components of a culture worth immersing myself in, worth spreading to others. Taken together, they create a deeply rewarding occupation, an occupation that quietly, powerfully, moves us forward, and higher.

# An Invitation to Geometric Measure Theory: Part 1

While there are a variety of article-length introductions to geometric measure theory, ranging from Federer’s rather dry AMS Colloquium Talks to Fred Almgren’s engaging Questions and Answers to Alberti’s Article for the Encyclopedia of Mathematical Physics, I will take a different approach than has been taken in any of these and introduce geometric measure theory through the vehicle of the derivative.

### The Derivative, Geometrically

The derivative that is encountered for the first time in calculus is defined as the limit of a ratio of the “rise” over “run” of the graph of a function. For $y = f(x)$, this becomes

$\frac{df}{dx}(a)=\lim_{x\rightarrow a}\frac{f(x) - f(a)}{x-a}$.

This is visualized as the slope of the secant lines approaching a limit – the slope of the tangent line – as the free ends of those lines approach $(a,f(a))$. This is illustrated in the first figure.

The derivative as $\hat{L}_a$, the optimal linear approximation to f at a, is another, very useful way to think about the derivative. Here, we focus on the fact that the tangent line at $(a,f(a))$ approximates the graph of $f(x)$ at $(a,f(a))$ as we zoom in on the graph. More precisely, writing $x = h+a$,

$f(x) = f(h+a) = f(a) + \hat{L}_a(h) + g(h)h$,

where $\hat{L}_a$ is linear in $h$, $g(h)\rightarrow 0$ as $h \rightarrow 0$, and the tangent line L is the graph of the function $y = f(a) + \hat{L}_a(x-a)$ .
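As a quick numerical sanity check of this decomposition, here is a minimal sketch that solves the equation above for $g(h)$ and watches it shrink with $h$. The choices $f = \sin$, $a = 0.5$ and the helper name `remainder_g` are illustrative assumptions, not anything special to the argument.

```python
import math

def remainder_g(f, slope, a, h):
    """Return g(h) defined by f(a + h) = f(a) + slope * h + g(h) * h, for h != 0."""
    return (f(a + h) - f(a) - slope * h) / h

# Illustrative choice (an assumption): f = sin at a = 0.5, where L_a(h) = cos(a) * h.
a = 0.5
for h in (1e-1, 1e-2, 1e-3, 1e-4):
    g = remainder_g(math.sin, math.cos(a), a, h)
    print(f"h = {h:g}   g(h) = {g:.3e}")
```

If the wrong slope is passed in, $g(h)$ tends to a nonzero constant instead of zero, which is one way to see that the linear map $\hat{L}_a$ is unique.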

Exercise: use the facts that (1) linear $\hat{L}_a:\Bbb{R} \rightarrow \Bbb{R}$ have the form $h\rightarrow sh$, $s$ a scalar, and (2) $g(h) \rightarrow 0$ as $h \rightarrow 0$, to rearrange this last equation for $f(x)$ into the original definition of a derivative.

Using the equation above to get

$\left|f(x) - (f(a) + \hat{L}_a(x-a))\right| \leq (\sup_{|s|\in[0,\epsilon]} |g(s)|)|h|$ for  $h\in[-\epsilon,\epsilon]$,

we are able — after some work (see the exercise below) — to get this nice geometric interpretation:

The figure illustrates the fact that the graph of $f(x)$ lies in cones centered on $L$, whose angular widths go to zero as we restrict ourselves to smaller and smaller $\epsilon$-balls centered on $(a,f(a))$. Inside the $\epsilon_1$-ball, the graph stays in the wider cone, while in the smaller, $\epsilon_2$-ball the graph stays in the narrower cone.

Let’s restate this. Defining

1. $p \equiv (a,f(a))$,
2. $B(\epsilon)$ to be the ball of radius $\epsilon$ centered on $p$,
3. $F\equiv\{ (x,y) | y= f(x) \}$,
4. $C_L(p,\epsilon)$ to be the smallest closed cone, symmetrically centered on $L$, with vertex at $p$ such that $F\cap B(\epsilon) \subset C_L(p,\epsilon)$, and
5. $\theta (\epsilon)$ to be the angular width of $C_L(p,\epsilon)$,

we have that

f is differentiable at $a \Leftrightarrow \theta(\epsilon) \rightarrow 0\text{ as }\epsilon \rightarrow 0$

Here is a figure illustrating this:

Exercise: provide the missing details taking us from the above inequality bounding the deviation from linearity to the above statement that {f is differentiable at $a \Leftrightarrow \theta(\epsilon) \rightarrow 0\text{ as }\epsilon \rightarrow 0$} using the facts that (1) the above inequality defines cones that are almost symmetric about $L$ and (2) the $\epsilon$-ball centered at p is contained in the vertical strip $(x-a,y-f(a)) \in [-\epsilon,\epsilon] \times(-\infty,\infty)$
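The cone picture can also be explored numerically. The sketch below estimates $\theta(\epsilon)$ by sampling the graph inside the $\epsilon$-ball and recording the largest angle between a secant through $p$ and the line $L$. The function name `cone_width`, the sample count, and the test function $f(x) = x^2$ at $a = 1$ are assumptions made for illustration; sampling only gives a lower estimate of the true width.

```python
import math

def cone_width(f, slope, a, eps, n=10000):
    """Estimate theta(eps): width of the smallest cone about L containing F in B(eps)."""
    fa = f(a)
    tangent_angle = math.atan(slope)
    worst = 0.0
    for i in range(1, n + 1):
        for h in (eps * i / n, -eps * i / n):   # never h = 0, so p itself is skipped
            dy = f(a + h) - fa
            if math.hypot(h, dy) > eps:
                continue                        # sample lies outside the eps-ball
            # Angle between the secant line through p and L, taken in [0, pi/2].
            d = abs(math.atan(dy / h) - tangent_angle)
            worst = max(worst, min(d, math.pi - d))
    return 2 * worst

# Differentiable case: the width shrinks as eps shrinks.
for eps in (1e-1, 1e-2, 1e-3):
    print(eps, cone_width(lambda x: x * x, 2.0, 1.0, eps))

# Non-differentiable case: f(x) = |x| at 0 against the candidate line y = 0.
print(cone_width(abs, 0.0, 0.0, 1e-3))
```

For $|x|$ the estimated width does not shrink as $\epsilon$ decreases, whichever candidate line is tried, in line with the equivalence stated above.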

With this shift to a geometric perspective, we are now in a position to take a step in the direction of geometric measure theory.

Note that in our definition the cones contain all of the graph as they narrow down and we zoom in. But what if all we know is that a larger and larger fraction of the graph is in a narrower and narrower cone as we zoom into p? That is precisely the idea that approximate tangent lines capture. We will introduce two different versions of the concept.

### Densities as a path to an approximate tangent line

#### Tangent Cones

The tangent line discussed above is also the tangent cone. The tangent cone of a set in $\Bbb{R}^n$ can have any dimension from 1 to n. For nicely behaved k-dimensional sets, the tangent cone will also be k-dimensional. In the case of the usual derivative of functions from $\Bbb{R}$ to $\Bbb{R}$, we are working in the graph space $\Bbb{R}^2$ with 1-dimensional sets. Moving to tangent cones, we can approximate one dimensional sets which are not graphs or, more generally, arbitrary subsets of $\Bbb{R}^n$.

We now define the tangent cone of $F\subset\Bbb{R}^n$ at $p$.

To obtain the tangent cone, begin by translating $F$ by $-p$. (This moves $p$ to 0.) Define $F(\epsilon) \equiv (F\cap B(\epsilon))\setminus \{p\}$. Use a projection centered at 0 to project the translated $F(\epsilon)$ onto the sphere of radius $\epsilon$ centered on 0. Take the closure of the resulting subset of the $\epsilon$-sphere. Finally take the cone over this set. Call this set $T_p^\epsilon(F)$. That is,

$T_p^\epsilon(F)=\{\Bbb{R}\geq 0\}\left(\text{Closure}\left(\cup_{x\in F(\epsilon)}\frac{x-p}{|x-p|}\right)\right).$

Now define the tangent cone of F at p to be the intersection of the $T_p^{\epsilon_i}(F)$ over any sequence of $\epsilon_i$'s going to zero; $\epsilon_i = \frac{1}{i}$ will do. Thus the tangent cone of $F$ at p, $T_p(F)$, is given by:

$T_p(F)=\bigcap_i T_p^\frac{1}{i}(F).$

Here is a figure illustrating the key idea:

Note: the tangent cone is centered on the origin, 0, but I will be plotting it as though it were centered on p. Similarly, the tangent lines will sometimes be thought of as linear subspaces (i.e. centered on the origin 0), and other times as the shift of that linear subspace to p.

In the case of a differentiable function $f:\Bbb{R}\rightarrow\Bbb{R}$, this tangent cone is the usual 1-dimensional tangent line.

#### Densities

Now we need $\theta^k(\mu,F)$, the k-dimensional density of F at p.

Define $\omega(k)$ such that it agrees with the volume of the unit ball in $\Bbb{R}^k$ when k is an integer (there is a standard way to do this using $\Gamma$ functions). Let $\mu$ measure k-dimensional volume. Typically this will be k-dimensional  Hausdorff measure, $\mathcal{H}^k$. Whatever intuitive idea you have of k-dimensional measure is good enough for our purposes. (At the end of this post I also define Hausdorff measures more carefully.)

Now, $\theta^k(\mu,F)$ is given by

$\theta^k(\mu,F)=\lim_{\epsilon\rightarrow 0}\frac{\mu(F\cap B(\epsilon))}{\omega(k)\epsilon^k}$

when this limit exists. When the limit does not exist, we work with the limsup and liminf of the right hand side which are called upper and lower densities of F at p and are denoted by $\theta^{*k}(\mu,F)$ and $\theta^k_*(\mu,F)$ respectively.
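To get a feel for the density ratio, here is a rough numerical sketch for 1-dimensional sets in the plane. It approximates $\mathcal{H}^1(F\cap B(\epsilon))$ by the length of a fine polyline through a parametrized curve, which is a reasonable stand-in only for smooth curves. The helper names `omega` and `density_ratio` and the two example sets are assumptions chosen for illustration.

```python
import math

def omega(k):
    """Volume of the unit ball in R^k, extended to real k via the Gamma function."""
    return math.pi ** (k / 2) / math.gamma(k / 2 + 1)

def density_ratio(curve, t_min, t_max, p, eps, n=100000):
    """Approximate H^1(F intersect B(p, eps)) / (omega(1) * eps), F = curve([t_min, t_max])."""
    length = 0.0
    prev = curve(t_min)
    for i in range(1, n + 1):
        cur = curve(t_min + (t_max - t_min) * i / n)
        mid = ((prev[0] + cur[0]) / 2, (prev[1] + cur[1]) / 2)
        # Count a small segment when its midpoint lies inside the ball.
        if math.hypot(mid[0] - p[0], mid[1] - p[1]) <= eps:
            length += math.hypot(cur[0] - prev[0], cur[1] - prev[1])
        prev = cur
    return length / (omega(1) * eps)

# A parabola through p = (0, 0): the ratio approaches 1 as eps shrinks.
print(density_ratio(lambda t: (t, t * t), -1.0, 1.0, (0.0, 0.0), 0.01))
# A ray ending at p = (0, 0): the ratio is 1/2.
print(density_ratio(lambda t: (t, 0.0), 0.0, 1.0, (0.0, 0.0), 0.01))
```

The second example shows that the density at an endpoint of a curve is one half, so a 1-dimensional density need not be 0 or 1.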

#### Approximate Tangent Cones

We now define the approximate tangent cone at p to be the intersection of closed cones whose complements intersected with F have density zero at p:

$\tilde{T}_p(F)=\bigcap\{\text{closed cones }C\text{ with vertex }p|\theta^k(\mu,(\Bbb{R}^n\setminus C)\cap F)=0\}$

Originally (in this section), we were aiming at having a definition of approximate tangent line that was invariant to (small) pieces of the set F outside the sequence of cones, provided those pieces got small enough, quickly enough. Now we can make that more precise. We want a definition of approximate tangent line that ignores such excursions of F provided these excursions have density zero at p. Rather anticlimactically then, here is the definition we have been waiting for (though you might have already guessed it!)

A 1-dimensional set has an approximate tangent line at $p$ when the approximate tangent cone is equal to a line through p.

When the curve is an embedded differentiable curve, the tangent line and the approximate tangent line are the same.

Remark: in general, when we are dealing with k-dimensional sets in $\Bbb{R}^n$, we will get approximate tangent k-planes.

Exercise: can you create examples of one dimensional sets which have a (density based) approximate tangent line at p but not the usual tangent line at p?

Exercise: prove that a tangent line to a continuous curve is also the (density based) approximate tangent line at p.

### Integration as a path to an approximate tangent line

There is a different version of approximate tangent k-plane based on integration. (The one dimensional version is of course an approximate tangent line.)

We start with the fact that we can integrate functions defined on $\Bbb{R}^n$ over k-dimensional sets using k-dimensional measures $\mu$ (typically $\mathcal{H}^k$). We zoom in on the point p, through dilation of the set F:

$F_\rho(p) = \{x\in\Bbb{R}^n | \;\;x=\frac{y-p}{\rho}+p\text{ for some }y\in F\}.$

We will say that the set $F$ has an approximate tangent k-plane $L$ at p if the dilation $F_\rho(p)$ converges weakly to $L$ as $\rho \rightarrow 0$: i.e. if

$\int_{F_\rho} \phi d\mu\rightarrow_{\rho \rightarrow 0} \;\; \int_L \phi d\mu$

for all continuously differentiable, compactly supported $\phi:\Bbb{R}^n \rightarrow \Bbb{R}$.
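The weak convergence can be watched numerically in a simple case. In the sketch below, $F$ is the graph of $y = x^2$ and $p$ is the origin, so $F_\rho(p)$ is the graph of $y = \rho x^2$ and the candidate limit $L$ is the $x$-axis. The test function $\phi(x,y) = (1 - x^2 - y^2)^2$ inside the unit disc (zero outside) is one convenient $C^1$, compactly supported choice; it, the function names, and the midpoint-rule quadrature are all assumptions made for illustration.

```python
import math

def phi(x, y):
    """A C^1 test function supported in the closed unit disc (illustrative choice)."""
    r2 = x * x + y * y
    return (1.0 - r2) ** 2 if r2 < 1.0 else 0.0

def integral_over_dilation(rho, n=100000):
    """Integrate phi over F_rho = graph of y = rho * x^2 with respect to arc length."""
    total = 0.0
    dx = 2.0 / n
    for i in range(n):
        x = -1.0 + (i + 0.5) * dx                     # midpoint rule on [-1, 1]
        speed = math.sqrt(1.0 + (2.0 * rho * x) ** 2)  # arc length element
        total += phi(x, rho * x * x) * speed * dx
    return total

# Integral of phi over L (the x-axis): the integral of (1 - x^2)^2 on [-1, 1] is 16/15.
limit = 16.0 / 15.0

for rho in (0.4, 0.1, 0.02):
    print(rho, integral_over_dilation(rho) - limit)
```

One test function proves nothing, of course, since the definition quantifies over all such $\phi$, but the printed differences do shrink as $\rho$ decreases.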

In the next two figures, we illustrate this for the case of 1-planes – i.e. lines: in the first figure, $L$ is the weak limit of the dilations of F, while in the second it is not.

Note: solid green lines are the level sets of $\phi$ while the dashed green line indicates the boundary of the support of $\phi$. Note also that the $\rho$'s of 0.4, 0.1, and 0.02 are approximate.

Exercise: can you create an example of a one dimensional curve which has the usual tangent line at p but not an (integration based) approximate tangent line at p?

### Closing Note On Hausdorff Measure

We would like a notion of k-dimensional volume or k-dimensional measure. In many cases, the right notion turns out to be k-dimensional Hausdorff measure. We already know what 1-, 2-, and 3-dimensional measure is as long as the objects we are measuring are regular enough, like subsets of lines, rectangles, and cubes. It does not seem too much of a stretch to think that we can extend these measures to things that are somewhat wiggly. That is, we can still easily imagine measuring the length of a subset of a smoothly turning curve, or the area of a piece of a surface that undulates slowly. Hausdorff measure permits us to measure not only such smooth sets (giving the same result as any reasonable extension of the usual Lebesgue measures to the nice cases), but also to measure very wild sets (like fractals).

How to compute the k-dimensional Hausdorff measure of $A\subset \Bbb{R}^n$:

1. Cover A with a collection of sets  $\mathcal{E}= \{E_i\}_{i=1}^\infty$, where $diam(E_i) \leq d \;\; \forall i$. Here, $diam(E_i)$ is the diameter of $E_i$.
2. Compute the k-dimensional measure of that cover: $\mathcal{V}_\mathcal{E}^k(A) = \sum_i\omega(k) (\frac{diam(E_i)}{2})^k$
3. Define $\mathcal{H}_d^k(A)=\inf_{\mathcal{E}} \mathcal{V}_\mathcal{E}^k(A)$ where the infimum is taken over all covers whose elements have diameter at most d.
4. Finally, we define: $\mathcal{H}^k(A)=\lim_{d\downarrow 0}\mathcal{H}_d^k(A).$

Remark: Suppose that for any  $\epsilon > 0$, there is a cover $\{E_i\}_1^\infty$ of $A$, such that $\sum_i diam(E_i) < \epsilon$. Then for $k \geq 1$, $\mathcal{H}^k(A) = 0$.
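Steps 1 and 2 are easy to try out on a simple set. The sketch below evaluates the cover sum for one particular family of covers of the unit segment $[0,1]$ by equal intervals. It does not compute the infimum in step 3, so each value is only an upper bound for $\mathcal{H}_d^k$ of the segment. The helper names `omega`, `cover_sum` and `segment_cover` are assumptions for illustration.

```python
import math

def omega(k):
    """Volume of the unit ball in R^k, extended to real k via the Gamma function."""
    return math.pi ** (k / 2) / math.gamma(k / 2 + 1)

def cover_sum(diameters, k):
    """Step 2: the sum of omega(k) * (diam / 2)^k over the sets in a cover."""
    return sum(omega(k) * (d / 2) ** k for d in diameters)

def segment_cover(d):
    """Diameters of a cover of [0, 1] by N equal intervals of length 1/N <= d."""
    n = math.ceil(1 / d)
    return [1 / n] * n

for d in (0.1, 0.01, 0.001):
    cover = segment_cover(d)
    # k = 1 recovers the length of the segment; k = 2 tends to zero as d shrinks.
    print(d, cover_sum(cover, 1), cover_sum(cover, 2))
```

With $k = 1$ the sums stay at the length of the segment, while with $k = 2$ they go to zero, matching the intuition that a segment has length but no area.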

Here is a figure illustrating Hausdorff measures:

Clearly, this can be difficult to compute. It turns out though that in $\Bbb{R}^k$, $\mathcal{H}^k = \mathcal{L}^k$, where $\mathcal{L}^k$ is k-dimensional Lebesgue measure. And by use of mappings, this can take us quite a ways in computing $\mathcal{H}^k(A)$ for integral k and rather general $A$.

Exercise: Show that if $0 < \mathcal{H}^\gamma(A) < \infty$ then $\mathcal{H}^\alpha(A) = \infty$ and $\mathcal{H}^\beta(A) = 0$ for $\alpha < \gamma < \beta$.

# Thoughts on receiving a negative review

This is a slightly edited version of something I wrote in 2009, not long after arriving at WSU from Los Alamos. It remains as pertinent now as it was then. Coincidentally, Gaza is again in the midst of increased mayhem.

Today I received a copy of a review of a paper I am an author on. Needless to say, the reason I am writing about it here is that the review was negative in a way that was not helpful. While the reviewer did make some good points, and we will address those points, it was done in an unfriendly way.

Have I seen worse reviews? Of course. So why write about this review? I suppose because it comes at a time when I am being reflective and when I am thinking about such things more carefully. The error that reviewer made was in not reading the paper carefully enough. Of course we can improve the paper and make it less susceptible to misinterpretation, and we will, but I think that the acceptance of this status quo of negativity and a cultivated attitude that looks for errors and ignores insights, ends up robbing our society of a great deal of original, creative productivity.

In my new position in the mathematics department at Washington State University, as I look around and get the intuitive sense for this university and put that in context of what I have observed at other universities, I see a pattern. And that pattern is tradition and conservatism and narrowness that has its roots in narrow self interest. It inhibits interdisciplinary work. It makes people far more apt to see the mistakes in other work, rather than finding the insights and innovations.  It makes people timid and afraid of adventure, of risk.

Do I like it in academia? Yes. There is still a decent amount of good and potential for a great deal more. There is freedom to develop truly new initiatives. And there are some students and colleagues who are inspired and inspiring.

But the threads I am disturbed about are simply local expressions of global states of human consciousness that we all observe in their horrific consequences: Gaza, the economic crisis, epidemics in Africa, etc.  Underlying everything are multiple threads, but the one that I see everywhere is an unconsciousness, a blindness that is deeply disturbing.

In this state, humans think there is no connection between their personal negativity and selfishness and the atrocities in Gaza. It is acceptable or even good to inflict inhuman atrocities on your enemy, but evil for those “terrorists” to strike back in the ways they can. The unconscious see a great gulf between them and the “terrorists”. They believe that some people are intrinsically good and some intrinsically bad. And the end justifies the means. The work of Chris Hedges — see for example, “I don’t Believe in Atheists” or “War is the Force that Gives us Meaning” or his columns in truthdig.com — feels and proclaims aloud the absurdity of these inconsistencies.

So what can my response be — be it to the reviewer, or the critical, narrow nature of some in academia, or the unmotivated, narrow minds of some students, or the unthinking, unconscious state of some people I run into in my daily life? Certainly, becoming negative and critical is not the answer.

It seems to me that the only thing I can do is to spend all my energy creating a personal atmosphere of rich, creative productivity and connection based on love. Generating happy beauty and a vibrant, living atmosphere, beckoning to those in the sphere of my influence to cooperate in creating little bits of heaven on earth, even if only locally, is the only real evidence there is for the existence of love or heaven … or God.

# Geometric Measure Theory by the Book

There are an armful of texts that I have used to learn and teach geometric measure theory. In this note, I will give a review of these texts, which are:

1. Herbert Federer’s Geometric Measure Theory
2. Frank Morgan’s Geometric Measure Theory: A Beginner’s Guide
3. Krantz and Parks’ Geometric Integration Theory
4. Lin and Yang’s Geometric Measure Theory – an Introduction
5. Leon Simon’s Lectures on Geometric Measure Theory
6. Pertti Mattila’s The Geometry of Sets and Measures in Euclidean Spaces
7. Evans and Gariepy’s Measure Theory and Fine Properties of Functions
8. Ambrosio, Fusco, and Pallara’s Functions of Bounded Variation and Free Discontinuity Problems
9. Enrico Giusti’s Minimal Surfaces and Functions of Bounded Variation

My two favorites are Leon Simon’s Lectures on Geometric Measure Theory and Evans and Gariepy’s Measure Theory and Fine Properties of Functions. Before I dive into comments on each of the books, here is a bit of history concerning my path into the subject.

### The Backstory

I was first turned on to geometric measure theory by David Caraballo, the last student to finish with Fred Almgren before Fred died. (Fred, who was famous for his deep results in geometric measure theory, was a student of Federer and a professor at Princeton.) I met David at the Nonlinear Control Theory Short Course, organized by Hector Sussmann and Kevin Grasse, at the 1999 Joint AMS-MAA meetings in San Antonio. David and I became instant friends and I was soon swept away by David’s passion for geometric measure theory, realizing that this field was also particularly well matched with my mathematical muses.

In particular, David talked up Evans and Gariepy’s text, so the first thing I did when I got back to Los Alamos was hibernate in my office and immerse myself in that text. It was beautiful and exhilarating — and I was hooked.

At the time I was working on inverse problems and dynamical systems. The inverse problems involved images and it was the image analysis that drew me further into geometric measure theory. The biggest influence in this migration was of course the rising prevalence of total variation regularization in image analysis methods. The first papers I read were David Strong’s. These inspired me to introduce total variation regularization into the sparse tomographic reconstruction methods we were working on. (There were other people dabbling with these methods at the same time at Los Alamos, but Tom Asaki and I took these methods and ran with them. Abel inversions were the tomographic workhorse at Los Alamos, so this was one of the first targets of our work. The first papers can be found here: Abel inversion using total-variation regularization and Abel inversion using total variation regularization: applications)

Through my work in image analysis, I met Andrea Bertozzi, and while visiting Duke at her invitation early in 2003, I met Bill Allard. Bill and I started a close collaboration on data analysis work at Los Alamos, and I began picking up pieces of geometric measure theory from him. Bill was a student of Fleming’s at Brown, but he had also very carefully read and commented on Federer’s entire text as Federer was writing it. (If you know Federer’s book, you can’t help but be impressed by this.) After graduating from Brown, Bill moved to Princeton where he did his seminal work on varifolds.

So why does image analysis lead rather naturally to geometric measure theory? For the simple reason that edges are a big deal in images, while functions of bounded variation (BV) are a very natural class of functions to use when representing functions with discontinuities (edges). And functions in BV are particularly nice, because they are wild, but not too wild — we can still make sense of derivatives (they are nice measures), and all sorts of other nice properties are still at our disposal. For example, sets, whose characteristic functions are in BV, still have usable, generalized outer normals and as a result, we still have the divergence theorem in such regions. Analysis is still nice, or at least possible. (De Giorgi used this class of functions to solve the minimal surface problem in one of the three papers in 1960 that solved this problem. Another was written by Federer and Fleming, the third by Reifenberg. All three used different methods.)

I have studied, referenced or taught out of all the above monographs. I now give more detailed comments on those texts. But I am not going to introduce the subject of geometric measure theory here; that will be the subject of other posts (coming soon). Nor will I simply outline the contents of the texts, since I don’t think that adds much value. Instead, I will add the things you can’t get from a perusal of the table of contents.

1. Federer’s 1969 Geometric Measure Theory: To a very large degree, this is still the ultimate go-to reference for the contents of the first 4 (of 5) chapters. This is not to say that that content has not evolved, but rather that it is still the foundation for current work. (For example, Solomon and White have enabled us to avoid the difficult structure theorem in getting existence for minimal surfaces, but that structure theorem is still very important.) This text is also rather notorious for its density and difficulty. Some of that difficulty is due to the (sometimes understandable) impatience that readers bring to their reading, but it also seems that the lack of pictures in this particular book is a rather eloquent statement about its accessibility. I have learned that it is much easier to read when you translate nearly everything into pictures (because you can!), though its terseness encourages me to use it as a reference, not a text. But it is a completely indispensable reference! Every student should own a copy and read pieces as needed.
2. Frank Morgan’s Geometric Measure Theory: A Beginner’s Guide: Frank wrote his highly successful text as a path into, and an inspiration for the study of, Federer’s book. In contrast to Federer, Frank draws lots of pictures, many of them very enlightening. Almost everybody in the younger generations — people who could have used Frank’s book first — have first read Frank’s book before using other texts. This is as good a place as any to explain that there have been two main branches of effort in GMT, one focused on variational problems like the minimal surface problem, and the second focused on the geometry of sets and measures, with a particular focus on harmonic measures. Morgan and Federer come out of the first branch, while Mattila’s text, which is commented on below, comes from the second branch. (I will comment below on the two new branches of GMT — GMT in metric spaces and GMT with a view to data analysis.) I always use Frank’s book as an illuminating reference for newcomers. I recommend it very highly as a first exposure and an evangelical tool.
3. Krantz and Parks’ Geometric Integration Theory: At first I was skeptical because I am into nice figures (I am an xfig devotee) and some of Krantz and Parks’ figures were very bad. But then I used the text as a reference when proving the deformation theorem for simplicial complexes (see Simplicial Flat Norm with Scale for the paper) and I was impressed with the exposition. Last year I used it as a text for a graduate class. While there are aspects of the text I still don’t like, I do recommend it as a reference that every student should own and consult. (By the way, Federer originally wanted to name his book Geometric Integration Theory, but didn’t because Whitney had already written a book with the same name. Whitney’s book is relevant for those interested in geometric measure theory, and it is now available from Dover books!)
4. Lin and Yang’s Geometric Measure Theory – an Introduction: I have not used this as a text. I was discouraged from using it by the typos, which it is important to note were also very irritating to Fanghua Lin because he had tried to get the publisher to correct them! But it does contain a very nice selection of topics, maybe the broadest selection in the above texts.
5. Leon Simon’s Lectures on Geometric Measure Theory: As I said above, this is one of my two favorites. I am currently (Fall 2012) teaching a graduate class from this text. The book is not so easy to get, but if you are willing to persist, you can get it from the Centre for Mathematical Analysis at Australian National University. (Though the last time I ordered a bunch of copies, they all had the first page of the index missing and they were clearly photocopies of the original print run of the book.) I used Leon’s text one other time, when I taught a short course on GMT at UCLA in the spring of 2007. Why do I like the text so much? There are flaws, like typos and a typewriter type font that takes getting used to and things I would change here and there. But Leon’s selection of topics (not too many!), his versions of theorems, the way in which he gives enough details in proofs, but not too many (leaving many implicit exercises and problems for the reader), and the way in which he puts everything together has generated a book that I really like to study and teach from. I recommend it very highly as a primary text, after or alongside Frank’s book. As a side note, Leon is working on a second edition of this text. It should be very good!
6. Pertti Mattila’s The Geometry of Sets and Measures in Euclidean Spaces: As mentioned above, this text comes from the harmonic analysis branch of the subject. As a result, it does not deal with currents, which were developed for their use in GMT (on minimal surfaces) by Federer and Fleming. Mattila’s book is well written and challenging, but not so challenging that students don’t like it. It has explicit problems, which many students like. (I always preferred implicit, fill-in-the-details or what-if-I-weaken-this-assumption type problems.) It covers lots of material, including Marstrand and Preiss’ results (about which I would recommend De Lellis’ notes on Rectifiable Sets, Densities and Tangent Measures), fractals and connections to singular integrals. It contains things that none of the other books I am commenting on have, and it is the only representative of the harmonic analysis branch of GMT in my selection of books. And it does have a different flavor, as one might expect. I consider it something that every student of GMT should own.
7. Evans and Gariepy’s Measure Theory and Fine Properties of Functions: As noted in my story above, this was the first book I saw on the subject. It deals with the same subjects that the first part of Federer and the first part of Simon deal with. It does not delve into currents. The writing is very clear, the proofs are complete, and the amount of filling in, in the proofs, is consistently small enough to make it fairly fast to study, but often enough to keep you very engaged. I found it inspiring when I first read it and it is still usually the first book I have my students buy. The chapters cover measure theory and integration, Hausdorff measure, Radon measures, area and co-area formulas, Sobolev spaces, BV functions (including detailed development of the structure theorem for sets of finite perimeter), and a final chapter on things like Rademacher’s theorem and extension theorems like Whitney’s. CRC even lowered the price from \$180 to \$90 (it has crept back up a bit) in response to our complaints about the price! As far as prerequisites are concerned, most students find this more accessible after a first course in graduate analysis, but some might be happy with it as an introduction to analysis, as long as some other text, like Royden or Folland, is also on hand. I recommend it very highly as a text and reference.
8. Ambrosio, Fusco, and Pallara’s Functions of Bounded Variation and Free Discontinuity Problems: David Caraballo was also the first to tell me about this book. It prepares the reader to deal with the existence results for the Mumford-Shah functional, which is an image analysis functional used for image segmentation. Ambrosio and De Giorgi proposed the space of special functions of bounded variation (SBV) for use with free boundary problems in 1987. In a 1989 paper by De Giorgi, Carriero, and Leaci, these ideas were used to prove the existence of minimizers for the Mumford-Shah functional. Later Ambrosio developed the theory of SBV more fully and this book is a logical follow-on to these works. I have not used this book, though I have read a small bit here and there. I think it is well liked by students, but it is unreasonably expensive at 250.00 (list price). Shame on Oxford University Press! They should not be following the example of the Dutch profiteers! This is an object lesson in why, if you are a mathematician, you should publish your own book and not give it to some publisher to exploit. OK. Done with my soapbox. If you can afford it, get this book! If you can’t, write to Oxford and complain bitterly about their crazy prices and the fact that they are limiting access to an excellent and fascinating book!
9. Enrico Giusti’s Minimal Surfaces and Functions of Bounded Variation: I read much of Giusti’s book and liked it a great deal. I recommend it as a very good source for the subjects it covers. This book is inspired by De Giorgi’s path to a solution for the minimal surface problem, though it contains more material since it was written over 20 years after that work by De Giorgi. Here is a review of Giusti’s book by Fred Almgren. It contains a very detailed account of the contents of the book and some nice history as well. Again, this book is overpriced at 183.00 for the paperback but luckily, Springer (who owns the book’s publisher Birkhäuser) sells it for 91.50! So buy it directly from Springer. And I do recommend buying it. You will enjoy studying it if you have any interest in this subject.

Finally, as promised above, a few words about the other two branches of GMT. As mentioned above they are GMT in metric spaces and GMT with a view to data.

As far as I know, GMT in metric spaces had its genesis with Ambrosio and Kirchheim’s paper, Currents in Metric Spaces. It is a very active area of research, including for example, analysis on the Heisenberg group, where paths are allowed tangents in proper subspaces of the tangent space instead of allowing the path to have arbitrary directions in the tangent space. (These are called sub-Riemannian spaces.) As far as I can tell, this branch was inspired by the growing area of analysis in metric spaces.

By GMT with a view to data, I mean both GMT applied to data and new GMT inspired by data. That the flow of ideas goes in both directions is very important and the reason why this is a very exciting, productive, high-potential place to work. This is where I am working. It includes the work at the intersection of image analysis and BV/SBV functions, like the work of Rudin, Osher and Fatemi which introduced TV regularization to image analysis and the segmentation functional of Mumford and Shah. Other examples include the work of Jones, Lerman, Schul and Okikiolu on Jones’ beta numbers, the multiscale flat norm work I am doing with collaborators (inspired by the connection between the L1TV functional and the flat norm), as well as the applications of curvature measures (and things like curvature measures) to data. Examples of this last item include the work of Adler, Taylor and Worsley at the intersection of statistics and integral geometry and the work of Chazal, Cohen-Steiner, and Merigot on boundary measures. Many more examples exist, but these examples give a flavor for the kinds of things happening at the intersection of GMT and data.

Actually, the GMT data branch has the other three as subbranches for the simple reason that all three have very useful insights to offer data and because data suggests problems leading to new ideas in each of the three areas.

### Afterword

As mentioned above, I like figures. But, while not having figures in a geometric measure theory text doesn’t make much sense to me, it is the case that students should be drawing their own figures anyway. This is part of the work involved in making the subject yours, in internalizing the ideas and techniques. One might even argue that books without figures are better for students, who must then draw figures for themselves in order to grasp what is going on more fully. But I would not go quite that far. I think that drawing pictures, as many and as often as you can, should be part of the GMT culture. Figures should be common, and they should also appear in books. I do believe that authors should not try to draw pictures for everything, but should draw just enough to get the students going themselves, help students avoid pitfalls, and inspire them as they struggle to master the ideas. At the very least, this discussion helps you see why I can still like Leon Simon’s book so much even though there are no figures in it!

It should also be noted that even though I talk about the two older and the two newer branches of GMT, it would be silly to insist on a bold demarcation of the boundaries between branches and a subsequent classification of everything and everyone. Part of this is because the intersections between branches are very large. Another part of this is because the two newer branches are by their nature agnostic, caring only about generalizable (in the case of metric spaces) or useful (in the case of data analysis) ideas or developments in GMT. Using Federer to supply another example, even though Federer’s most famous paper (the one with Fleming that solved Plateau’s Problem) was focused on calculus of variations, he also established other significant pieces of the foundation for the entire field. It would therefore make no sense at all to try to assign him to a single branch. So thinking about the field as characterized by branches is useful only if you do not take it too seriously or, worse yet, turn the branches into fences impeding travel!