1 00:00:00,012 --> 00:00:07,375 I often want to differentiate an inverse function. Say I've got a function f. The 2 00:00:07,387 --> 00:00:14,165 derivative of f encodes how wiggling the input affects the output. The derivative 3 00:00:14,177 --> 00:00:20,587 of the inverse function would encode how changes to the output affect the input. 4 00:00:20,832 --> 00:00:25,504 Here's a theorem that I can use to handle this situation. Here is the inverse 5 00:00:25,516 --> 00:00:30,261 function theorem. I'm going to suppose that f is some differentiable function, f 6 00:00:30,354 --> 00:00:35,273 prime is continuous, the derivative is continuous. And the derivative, at some 7 00:00:35,285 --> 00:00:41,455 point, a, is nonzero. In that case, I get the following fantastic conclusion. Then 8 00:00:41,467 --> 00:00:47,434 the inverse function at y is defined for values of y near f of a. So, the function 9 00:00:47,446 --> 00:00:53,460 f is invertible near a. The inverse function is differentiable for inputs near 10 00:00:53,472 --> 00:00:59,849 f of a. And that derivative is continuous for inputs near f of a. And I've even 11 00:00:59,861 --> 00:01:05,331 got a formula for the derivative. The derivative of the inverse function at y is 12 00:01:05,343 --> 00:01:10,328 1 over the original derivative, the derivative of the original function, 13 00:01:10,340 --> 00:01:15,508 evaluated at the inverse function of y. How can I justify a result like that? Why 14 00:01:15,520 --> 00:01:20,173 should something like that be true? One way to think about this is geometrically. 15 00:01:20,276 --> 00:01:24,947 Here, I've drawn the graph of just some made-up function, y equals f of x. What 16 00:01:24,959 --> 00:01:29,621 does the graph of the inverse function look like? 
Well, one way to think about this is 17 00:01:29,633 --> 00:01:34,270 that the inverse function exchanges the roles of the x and y axes, which is the 18 00:01:34,282 --> 00:01:39,475 same as just flipping it over, alright? What was the y-axis is now the x-axis, and what 19 00:01:39,487 --> 00:01:44,209 was the x-axis is now the y-axis. And this graph here is y equals f inverse of x. 20 00:01:44,315 --> 00:01:47,526 This is how you graph the inverse function. Alright. 21 00:01:47,532 --> 00:01:52,480 So, let's go back to the original function and if I put down a tangent line to the 22 00:01:52,843 --> 00:01:57,840 curve at some point, let's say that tangent line has slope m. Well, what's the 23 00:01:57,852 --> 00:02:02,635 slope of the tangent line to the inverse function? That would be the derivative of the inverse 24 00:02:02,647 --> 00:02:07,310 function. Well, if I flip over the graph again to look at the graph of the inverse 25 00:02:07,322 --> 00:02:11,830 function, I can put down a tangent line to the inverse function. And that has 26 00:02:11,830 --> 00:02:16,440 slope 1 over m. If m was the original slope for the tangent line to the original 27 00:02:16,452 --> 00:02:21,718 function, 1 over m is the new slope of the tangent line to the inverse function. Why 28 00:02:21,730 --> 00:02:26,749 1 over m? Well, that makes sense because I got this graph by exchanging the roles of 29 00:02:26,761 --> 00:02:32,445 the x and y-axes, by flipping the paper over. And that exchanges rise for run, 30 00:02:32,582 --> 00:02:39,195 and run for rise. So, the slope becomes the reciprocal of the old slope. This 31 00:02:39,207 --> 00:02:45,770 slope business is reflected in the notation, dy dx. So, let's suppose that y 32 00:02:45,782 --> 00:02:49,724 is f of x, so x is f inverse of y, supposing that this is an invertible 33 00:02:49,736 --> 00:02:56,878 function. If y is f of x, then f prime of x could be written dy dx. 
And if x is f 34 00:02:56,890 --> 00:03:04,849 inverse of y, then the derivative of the inverse function at y, well, that's asking 35 00:03:04,861 --> 00:03:12,260 how changing y changes x, so I could write that as dx over dy. Well, if you really take 36 00:03:12,272 --> 00:03:17,742 this notation seriously, what it looks like it's saying is that dx dy, which is 37 00:03:17,754 --> 00:03:22,730 the derivative of the inverse function, should be 1 over dy dx, right? The 38 00:03:22,742 --> 00:03:27,717 derivative of the inverse function is 1 over the derivative of the original 39 00:03:27,729 --> 00:03:32,763 function. But you have to think about where these derivatives are being 40 00:03:32,775 --> 00:03:38,255 computed. So, maybe you believe that dx dy is 1 over dy dx; it makes sense that if 41 00:03:38,267 --> 00:03:43,697 you exchange the roles of x and y, that takes the reciprocal of the slope of the 42 00:03:43,709 --> 00:03:48,704 line. But where is this wiggling happening, right? dy dx is measuring how 43 00:03:48,716 --> 00:03:53,667 wiggling x affects y. Wiggling around where? Well, let's suppose that I'm 44 00:03:53,679 --> 00:03:58,335 wiggling around a. So, I'm really calculating dy dx when x, say, is at a. 45 00:03:59,088 --> 00:04:04,313 This is the quantity that records how wiggling x near a will affect y. Well 46 00:04:04,325 --> 00:04:09,898 then, where's y wiggling? Well, if x is wiggling around a, y is wiggling around f 47 00:04:09,910 --> 00:04:15,404 of a. So, the derivative on this side is really being calculated at y equals f of 48 00:04:15,416 --> 00:04:21,097 a. And it's really necessary to keep track of where this wiggling is happening in 49 00:04:21,109 --> 00:04:25,629 order to get a valid formula. It's actually easier to think about what's 50 00:04:25,641 --> 00:04:29,728 going on if we just phrase all of this in terms of the Chain rule. So, what do I 51 00:04:29,740 --> 00:04:32,869 know about the inverse function? Well, here's f inverse. 
52 00:04:32,872 --> 00:04:37,195 f of f inverse of x is just x. Alright, what does the inverse function do? Whatever 53 00:04:37,207 --> 00:04:41,430 you plug into the inverse function, it outputs whatever you need to plug into f 54 00:04:41,442 --> 00:04:45,690 to get out the thing you plugged into the inverse function. Alright. So, this is 55 00:04:45,702 --> 00:04:50,430 true. Now, if I differentiate both sides, assuming that f and f inverse are 56 00:04:50,442 --> 00:04:55,555 differentiable, then by the Chain rule, what do I get? Well, the derivative of 57 00:04:55,567 --> 00:05:01,230 this composition is the derivative of the outside at the inside times the derivative 58 00:05:01,242 --> 00:05:07,898 of the inside. And that's equal to the derivative of the other side, and the 59 00:05:07,910 --> 00:05:14,265 derivative of x is just 1. Now, I'll divide both sides by f prime of f inverse of 60 00:05:14,277 --> 00:05:20,851 x and I get that the derivative of the inverse function at x is 1 over f prime of 61 00:05:20,863 --> 00:05:26,582 f inverse of x. Is that a proof? Absolutely not. The embarrassing truth is 62 00:05:26,594 --> 00:05:30,675 that this argument assumes the differentiability of the inverse function. 63 00:05:30,773 --> 00:05:34,929 If this function, f inverse, is differentiable, then the Chain rule can be 64 00:05:34,941 --> 00:05:39,513 applied to it. The Chain rule requires that the functions be differentiable. Now, 65 00:05:39,525 --> 00:05:44,022 if the function is differentiable, then this Chain rule calculation tells me that 66 00:05:44,034 --> 00:05:48,605 the derivative of the inverse function is this quantity. But that's all predicated on 67 00:05:48,617 --> 00:05:53,270 knowing that the inverse function is differentiable. How do we know that? Well, 68 00:05:53,282 --> 00:05:57,935 that's actually the content of this theorem, right? 
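Written out in symbols, the Chain rule calculation just described might look like the following. This is only a sketch of the computation, and it takes for granted that both f and f inverse are differentiable and that f prime of f inverse of x is nonzero, which is exactly the assumption the argument cannot justify on its own.

```latex
\begin{align*}
f\bigl(f^{-1}(x)\bigr) &= x \\
f'\bigl(f^{-1}(x)\bigr)\cdot\bigl(f^{-1}\bigr)'(x) &= 1
  && \text{(Chain rule on the left, derivative of $x$ on the right)} \\
\bigl(f^{-1}\bigr)'(x) &= \frac{1}{f'\bigl(f^{-1}(x)\bigr)}
  && \text{(divide by $f'\bigl(f^{-1}(x)\bigr) \neq 0$)}
\end{align*}
```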
The content of the inverse 69 00:05:57,947 --> 00:06:02,135 function theorem is not really the calculation of the derivative of the 70 00:06:02,147 --> 00:06:06,425 inverse function. It's really just the fact that the inverse function is 71 00:06:06,437 --> 00:06:11,010 differentiable at all. That is a huge deal, and it's not something that we can 72 00:06:11,022 --> 00:06:15,604 just get from the Chain rule. Once we know that the inverse function is 73 00:06:15,616 --> 00:06:20,163 differentiable, then the Chain rule gives us this calculation. But actually 74 00:06:20,175 --> 00:06:24,159 verifying that the inverse function is differentiable is really quite deep; 75 00:06:24,257 --> 00:06:28,915 that's why the inverse function theorem is such a big deal. The Chain rule requires 76 00:06:28,927 --> 00:06:33,680 that the functions I'm applying the Chain rule to be differentiable. In contrast, 77 00:06:33,787 --> 00:06:38,740 the inverse function theorem is asserting the differentiability of the inverse 78 00:06:38,752 --> 00:06:43,945 function. It's really saying much more than just a computation of the derivative 79 00:06:43,957 --> 00:06:50,040 if the derivative exists. It's actually telling me that the derivative exists. I'm 80 00:06:50,052 --> 00:06:54,191 going to have to punt on saying much more about the proof of the inverse function 81 00:06:54,203 --> 00:06:58,578 theorem. But nevertheless, we can now apply the inverse function theorem to some 82 00:06:58,590 --> 00:07:03,242 concrete examples. For example, think about the function f of x equals x squared. 83 00:07:03,338 --> 00:07:07,620 Well, what's the inverse function to this? Let's suppose the domain is just the 84 00:07:07,632 --> 00:07:12,292 nonnegative real numbers. Then, the function's invertible on that 85 00:07:12,304 --> 00:07:17,482 domain, and we know the name of the inverse is the square root of x. What's 86 00:07:17,494 --> 00:07:22,884 the derivative of the original function? 
Well, we know that it's 2x, and the 87 00:07:22,896 --> 00:07:27,974 derivative is continuous and the derivative is not 0 provided that x is 88 00:07:27,986 --> 00:07:33,168 positive. This is all the stuff that we need to apply the inverse function 89 00:07:33,180 --> 00:07:39,630 theorem. Then, we know that the derivative of the inverse function at x is 1 over the 90 00:07:39,642 --> 00:07:45,595 original derivative at the inverse of x. Now, the inverse function is the square 91 00:07:45,607 --> 00:07:51,588 root of x, so that's 1 over f prime of the square root of x, and what's f prime? f 92 00:07:51,600 --> 00:07:57,275 prime is the function that doubles its input. So, that's 1 over 2 times the square root of 93 00:07:57,287 --> 00:08:02,085 x. So, the derivative of the inverse function, the derivative of the square 94 00:08:02,097 --> 00:08:07,110 root function, is 1 over 2 times the square root of x, provided x is bigger than 0, right? 95 00:08:07,217 --> 00:08:11,990 Just like before, this is a calculation of the derivative of the square root 96 00:08:12,002 --> 00:08:17,348 function. We can also see this numerically. So, the square root of 10,000 97 00:08:17,360 --> 00:08:22,667 is 100, and you might ask what do you have to take the square root of to get 98 00:08:22,679 --> 00:08:28,360 about 100.1? Say, some numeric example. Well, think now about the functions that 99 00:08:28,372 --> 00:08:34,473 are involved here. There's the squaring function and the square root function. We 100 00:08:34,485 --> 00:08:39,881 saw the derivative of the square root function is 1 over 2 times the square root of x and the 101 00:08:40,140 --> 00:08:45,417 derivative of x squared, we already know, is 2x. Where are we evaluating these 102 00:08:45,429 --> 00:08:51,380 functions? Well, I'm evaluating the square root function at 10,000, right? This is at 103 00:08:51,392 --> 00:08:56,485 x equals 10,000. 
And if I evaluate that at 10,000, that's 1 over 2 times the 104 00:08:56,497 --> 00:09:02,013 square root of 10,000, that's 1 over 200. Where am I evaluating the other function, 105 00:09:02,122 --> 00:09:07,143 the x squared function? Well there, I'm really thinking of 100 as the input, so 106 00:09:07,155 --> 00:09:12,154 I'll evaluate that derivative at 100, and 2x, when x is 100, is 200. And it's not 107 00:09:12,166 --> 00:09:17,619 too surprising, right, that 1 over 200 and 200 are reciprocals of each other, because 108 00:09:17,631 --> 00:09:22,525 I'm calculating derivatives of a function and the inverse function at the 109 00:09:22,537 --> 00:09:27,925 appropriate places. Now, let's try to answer the original question. I'm trying 110 00:09:27,937 --> 00:09:33,225 to figure out, what do I have to take the square root of to get about 100.1? Well, 111 00:09:33,337 --> 00:09:38,425 the ratio here is about 200 between the input and the output. So, if I want the 112 00:09:38,437 --> 00:09:44,090 output to be affected by 0.1, I should try to change the input by about 200 times as 113 00:09:44,102 --> 00:09:49,655 much, and 200 times 0.1 is 20, so I should try to change the input by about 20 and 114 00:09:49,772 --> 00:09:55,091 sure enough, if you take the square root of 10,020, that's awfully close to 115 00:09:55,091 --> 00:10:00,960 100.1. I hope that you'll play around with these numbers. All the conceptual stuff 116 00:10:00,972 --> 00:10:06,213 that we're doing, these theorems, I'm not telling you these theorems to make numbers 117 00:10:06,225 --> 00:10:10,718 boring, right? I'm telling you all these theorems to heighten your appreciation of 118 00:10:10,730 --> 00:10:12,150 the numerical examples.
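If you'd like to play around with these numbers yourself, here is a minimal Python sketch of the square root example. The function and variable names (`f_prime`, `inverse_derivative`, and so on) are made up for illustration and are not part of the lecture; the sketch just evaluates the formula "1 over f prime of f inverse" for f of x equals x squared on positive inputs.

```python
import math


def f_prime(x):
    """Derivative of the squaring function f(x) = x**2: doubles its input."""
    return 2.0 * x


def inverse_derivative(y):
    """Derivative of the inverse (the square root) at y > 0.

    Uses the inverse function theorem formula 1 / f'(f^{-1}(y)),
    where f^{-1}(y) = sqrt(y).
    """
    return 1.0 / f_prime(math.sqrt(y))


# Slope of the square root function at 10,000, and of x**2 at 100.
slope_of_sqrt_at_10000 = inverse_derivative(10_000.0)
slope_of_square_at_100 = f_prime(100.0)

# To move the output of sqrt by 0.1, move the input by about 200 times as much.
target_change_in_output = 0.1
estimated_change_in_input = target_change_in_output * slope_of_square_at_100
estimate = math.sqrt(10_000.0 + estimated_change_in_input)

print("slope of sqrt at 10,000:", slope_of_sqrt_at_10000)
print("slope of x**2 at 100:", slope_of_square_at_100)
print("estimated change in input:", estimated_change_in_input)
print("sqrt of the shifted input:", estimate)
```

Try changing 0.1 to something larger, like 5, and watch the linear estimate drift away from the true answer; the derivative only describes small wiggles.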