What is Hypothesis Testing?

This was one of the hardest math topics for me to grasp, particularly since the ideas of hypothesis testing and the tools used to quantify it are jumbled up, producing a convoluted mess. Here, I make an attempt to deconvolute any confusion that learning this topic may create.

What is hypothesis testing? At its core, hypothesis testing is about evaluating whether a claim is false or not. I claim to be 170 cm, but when you use a metre ruler to measure me I come up short at 160 cm. Claim disproven! Of course, what makes hypothesis testing unique is that it uses probabilities to reach its conclusion.

Suppose a cola company claims that the mean volume of their drinks is 300 ml in each can. You smell horseshit and decide to test their claim. You buy 20 cans of cola, pour them out individually, and all of them measure only 290 ml. It would be fair to conclude that the company’s cheating you of your money, right?

After all,

  1. suppose they were right, then
  2. the probability of all 20 cans having less than 300 ml would be minuscule.
  3. It’s too improbable!
  4. Hence, the company is lying!

Take note of the argument above. This is the core of hypothesis testing. Step 1 assumes the null hypothesis (that is, the default claim) is true. Step 2 tests that claim against reality. Step 3 evaluates how compatible, or incompatible, the claim is with reality using probabilities (the p-value). Step 4 makes the final conviction on whether the default claim is false or not.

Instead of 20 cans, all of which measure 290 ml, let’s put some numbers and distributions on it (for simplicity’s sake; there are many variants to this).

Let the volume of cola in a can, X, follow a normal distribution with standard deviation 3 ml. Suppose we find that the mean volume of cola in 20 cans is 299.5 ml. Not so easy to conclude now, huh? It’s simple, really. We follow the four steps of the argument, but this time use statistics to justify these steps.
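To put a rough number on step 2 of the argument, here is a minimal Python sketch. It rests on an assumption of mine that is not part of the cola story: if the company is right, each can is equally likely to come in above or below 300 ml, independently of the other cans.

```python
# Assumption (not from the post): under the company's claim, any single can
# falls below 300 ml with probability 0.5, independently of the others.
p_single_can_below = 0.5
n_cans = 20

# Independent events multiply, so all 20 cans fall short with probability 0.5^20.
p_all_below = p_single_can_below ** n_cans
print(f"P(all {n_cans} cans below 300 ml) = {p_all_below:.2e}")
```

Under that assumption the chance is less than one in a million, which is the kind of "too improbable" the argument leans on.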

Step 0: The set-up

Okay, I lied. We need to first set up the courtroom. We call their claim, that the mean volume, \mu, is 300 ml, the null hypothesis and denote it as H_0. (Technically, \mu is called the population mean, since it is the mean volume of ALL the cans of cola, but if this confuses you, disregard it for now.) The alternate claim is that they’re lying, maybe that the mean volume is less than 300 ml. We call this the alternative hypothesis and denote it as H_1. We state the hypotheses as follows:

H_0 : \mu = 300

H_1 : \mu < 300

Step 1: Suppose the null hypothesis is true

We test this claim using 20 cans of cola, where the volume in each can has standard deviation 3 ml. Assume they are right. That is, suppose the mean volume really is 300 ml. The mean volume of 20 cans, \overline{X}, by the properties of the normal distribution, will follow a normal distribution as follows:

\displaystyle \overline{X} \sim N\left(300, \frac{3^2}{20}\right)

Step 2: Test the claim with reality

If H_0 is true, then in theory, the sample mean will be about 300 ml with small variations. In reality, the sample mean is 299.5 ml. Is that too much of a deviation from the proposed mean? If it is too much, will the probability be too low?

In other words, given that \bar{X} follows such a distribution, what is the probability that the sample mean of the 20 cans of cola would be 299.5 ml or less? Well, we find the probability simply by computing the value of P(\bar{X} < 299.5), which can be found either with a graphing calculator (G.C.) or with z-values and a z-table. You should get a probability, or the p-value, of 0.228, correct to 3 significant figures.
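If you don’t have a G.C. or a z-table on hand, the same probability can be checked with a short Python sketch. It uses `NormalDist` from Python’s standard `statistics` module; the variable names are my own.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 300.0         # mean volume claimed under H_0 (ml)
sigma = 3.0          # standard deviation of the volume in one can (ml)
n = 20               # number of cans sampled
sample_mean = 299.5  # observed mean volume of the 20 cans (ml)

# Under H_0 the sample mean follows N(300, 3^2 / 20),
# so its standard deviation is 3 / sqrt(20).
standard_error = sigma / sqrt(n)
sampling_dist = NormalDist(mu=mu_0, sigma=standard_error)

# z-value you would look up in a z-table
z_value = (sample_mean - mu_0) / standard_error

# p-value: P(sample mean < 299.5) assuming H_0 is true
p_value = sampling_dist.cdf(sample_mean)

print(f"z = {z_value:.3f}, p-value = {p_value:.3f}")
```

The printed p-value should agree with the 0.228 quoted above.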

Step 3: Is it too improbable?

Improbable and probable are relative words. Some think that a 1/3 chance of winning a dice-based game is low, while a 1/3 chance of winning the lottery is monstrously large. In order to assess whether 0.228 is too improbable or not, we need to set a limit. What probability or less do we consider ‘too improbable’? This limit is called the level of significance, denoted by \alpha, and is chosen somewhat arbitrarily, depending on the context. In short, for the simplest of questions, it’s given to you in the question (there will be variants asking you to find this value of \alpha).

Anyway, let’s set \alpha=25 \% =0.25 for this question. Any probability less than this is, well, too improbable, GIVEN that we assume H_0 to be true. What’s our p-value? It’s 0.228. Is it less than 0.25? YES. IT’S TOO IMPROBABLE!
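The comparison can be sketched in Python in two equivalent ways: compare the p-value with \alpha, or compare the sample mean with the cut-off value that has probability \alpha below it. The second view (the name `critical_mean` is mine) is not used in this post, so treat it as an optional extra.

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.25         # level of significance chosen for this question
sample_mean = 299.5  # observed mean volume of the 20 cans (ml)

# Distribution of the sample mean if H_0 is true: N(300, 3^2 / 20)
sampling_dist = NormalDist(mu=300.0, sigma=3.0 / sqrt(20))

# View 1: is the p-value smaller than alpha?
p_value = sampling_dist.cdf(sample_mean)
reject_by_p_value = p_value < alpha

# View 2: is the sample mean below the value with probability alpha beneath it?
critical_mean = sampling_dist.inv_cdf(alpha)
reject_by_cutoff = sample_mean < critical_mean

print(f"p-value = {p_value:.3f}, cut-off = {critical_mean:.2f} ml")
print("Reject H_0?", reject_by_p_value, reject_by_cutoff)
```

Both views lead to the same verdict, since they are the same inequality read from two sides.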

Step 4: Final conviction

Under H_0, we assume that the mean volume of cola is 300 ml. The probability that the sample mean volume of 20 cans of cola is 299.5 ml or less is 0.228, which is too improbable (compared to \alpha = 0.25) to occur. Hence, it’s more probable that the company is lying.

We therefore find sufficient evidence to reject H_0 and conclude that the actual mean volume of cola is less than 300 ml (accepting H_1, but writing these two words alone won’t get you credit).

Step E: Error

I lied to you again. What if the company isn’t lying, but we were lucky (or unlucky) enough to pick the 20 cans that gave us a sample mean of 299.5 ml? Then that 0.228 probability was low enough for us to reject the company’s claim, even though the claim was right all along. Since we reject H_0 whenever the p-value falls below \alpha, the level of significance \alpha is therefore alternatively defined as the probability of wrongly rejecting H_0.
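To get a feel for how often an honest company would be wrongly convicted, here is a small simulation sketch in Python. It assumes the company is telling the truth (volumes are normal with mean 300 ml and standard deviation 3 ml), repeats the 20-can experiment many times, and counts how often we would reject H_0 at \alpha = 0.25. The number of trials and the random seed are arbitrary choices of mine.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

random.seed(0)  # fixed seed so the run is repeatable

mu_0, sigma, n, alpha = 300.0, 3.0, 20, 0.25
sampling_dist = NormalDist(mu=mu_0, sigma=sigma / sqrt(n))

trials = 20_000
false_rejections = 0
for _ in range(trials):
    # Draw 20 cans from an honest company: H_0 really is true here.
    cans = [random.gauss(mu_0, sigma) for _ in range(n)]
    p_value = sampling_dist.cdf(fmean(cans))
    if p_value < alpha:
        false_rejections += 1  # we rejected a true H_0

false_rejection_rate = false_rejections / trials
print(f"Rejected a true H_0 in {false_rejection_rate:.1%} of trials")
```

The proportion should land close to 25%, that is, close to the level of significance we picked.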

FINALLY.

That’s the core idea of hypothesis testing and a very simple example to get you started. What about finding the sample mean? What if you use a large sample that doesn’t follow a normal distribution? What if the population variance isn’t known? What if H_1 is not that \mu is less or more than some claimed value, but just that it is not equal to it? What if any of the parameters \mu, \sigma, n, \alpha are unknown?

Welcome to hypothesis testing. Make sure your understanding of distributions and means of random variables is top-notch before coming here, since they are the ABCs of hypothesis testing.


By Parts Backstory

Here’s the simple concept behind integration by parts as described in an earlier post. Integration by parts is simply what I call the reverse-product rule. Here’s why.

Let u and v be functions of x. Using the product rule, to differentiate uv we take (differentiate the first term) *times* (keep the second as it is) *plus* (differentiate the second term) *times* (keep the first as it is). That is,

\displaystyle \frac{d}{dx} \left(uv\right)=\frac{du}{dx} v + u \frac{dv}{dx}.

Notationally, since u'=\displaystyle\frac{du}{dx} and v'=\displaystyle\frac{dv}{dx}, we rewrite the above identity as

\displaystyle \frac{d}{dx} \left(uv\right)=u'v + uv'.

Let’s integrate with respect to x on both sides. On the LHS, we are integrating what we get after differentiating. Since they cancel each other out, we get uv on the LHS. On the RHS, we get the integrals of each chunk added together, that is,

\displaystyle uv = \int u'v\ dx + \int uv'\ dx.

Subtracting \displaystyle \int u'v\ dx from both sides and switching the LHS with the RHS, we get

\displaystyle \int uv' \ dx = uv - \int u'v\ dx,

which is the famous integration by parts formula.
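If you want to convince yourself numerically, here is a minimal Python sketch. It checks the definite-integral version of the formula, \int_a^b uv' dx = [uv]_a^b - \int_a^b u'v dx, on an example of my own choosing (u = x, v = \sin x on the interval from 0 to 1), using a hand-rolled Simpson’s rule rather than any library integrator.

```python
from math import sin, cos

def simpson(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] by composite Simpson's rule (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Odd interior points get weight 4, even interior points get weight 2.
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Illustrative choice (not from the post): u = x and v = sin x,
# so u' = 1 and v' = cos x.
a, b = 0.0, 1.0

lhs = simpson(lambda x: x * cos(x), a, b)          # integral of u v'
uv_term = b * sin(b) - a * sin(a)                   # [uv] evaluated from a to b
rhs = uv_term - simpson(lambda x: sin(x), a, b)     # [uv] minus integral of u' v

print(f"LHS = {lhs:.6f}, RHS = {rhs:.6f}")
```

The two sides should agree to many decimal places, which is the product rule run in reverse.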

Hope this was insightful on how this technique is nothing more than reversing the product rule!

Integrate by Parts, the IS-ID Way

In calculus, we learn to integrate products like x \ln{x} and e^x \sin{x} using the technique of integration by parts. We use the formula,

\displaystyle \int uv' \, dx = uv - \int u'v \, dx ,

and using LIATE (Log-InvTrig-Algebraic-Trig-Exp), choose the right-leaning term as v' (e.g. x) and the left-leaning term as u (e.g. \ln{x}). We integrate v' and differentiate u to get v=\displaystyle \frac{x^2}{2} and u'=\displaystyle \frac{1}{x} respectively, substitute into the formula, simplify the expression and carry on with our calculations. Rather than confuse you with which expression to use as \displaystyle v' and \displaystyle u, I propose a simpler presentation.

Before we discover this presentation, I want to clarify that this is just a simpler presentation of the same technique. The concept is the same. The presentation simply makes more intuitive sense. Also, I might derive the formula in another post. Integration by parts is essentially what I call the “reverse product rule”, and I might elaborate more next time. For now, let’s take a look at the formula one more time, but this time pay attention to the pair v and v' and the pair u and u', with the order of multiplying swapped (this doesn’t change the formula mathematically):

\displaystyle \int v' u \, dx = v u - \int v u' \, dx .

Notice that v' gets Integrated and u remains the Same in the first term, v u. Notice that v' gets Integrated and u gets Differentiated in the second term, \displaystyle \int v u' \, dx . Thus, we get this idea of Integrate * Same – Integrate * Differentiate, or succinctly put,

\displaystyle I S - \int I D \, dx .

As mentioned, this is essentially the same method, but using these letters makes it easier for us to choose which terms to use. Let’s integrate x \ln{x} as an example.

Between x and \ln{x}, the right-leaning term is x. Thus, we Integrate that to get  \displaystyle \frac{x^2}{2}. The left-leaning term is \ln{x}. Thus, we Differentiate that to get \displaystyle \frac{1}{x}.

Using IS-ID, we plug in the relevant letters to get

\displaystyle \int x \ln{x} \, dx = \frac{x^2}{2} \ln{x} - \int \frac{x^2}{2} \cdot \frac{1}{x} \, dx = \frac{x^2}{2}\ln{x} - \frac{1}{2}\int x \, dx .

Integrating the remaining portion, we get our final result,

\displaystyle \int x \ln{x}\, dx = \frac{x^2}{2}\ln{x} - \frac{x^2}{4} + C.
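A quick way to check a result like this is to differentiate it and see whether you get the original integrand back. Here is a minimal Python sketch that does so numerically with central differences; the sample points and step size are arbitrary choices of mine.

```python
from math import log

def antiderivative(x):
    """The result above, with the constant C taken as 0."""
    return x**2 / 2 * log(x) - x**2 / 4

def integrand(x):
    """The function we set out to integrate."""
    return x * log(x)

def central_difference(f, x, h=1e-5):
    """Numerical estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Largest mismatch between the derivative of our answer and the integrand
max_gap = max(
    abs(central_difference(antiderivative, x) - integrand(x))
    for x in (0.5, 1.0, 2.0, 5.0)
)
print(f"Largest mismatch: {max_gap:.2e}")
```

A mismatch that is tiny at every sample point suggests the antiderivative is right, though of course it is a sanity check and not a proof.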

I hope this alternative presentation will help you integrate by parts more effectively and with better understanding. As an exercise to the reader, find \displaystyle \int e^x \sin{x}\, dx. The final answer, for your reference, is \displaystyle \frac{e^x}{2} (\sin{x} - \cos{x}) and you get bonus marks if you can explain how this answer can be improved.

 

Asymptotes

Q: Hi Joel, could you run through the basics of asymptotes?

The fundamental idea of an asymptote is that it is a limit. For example, consider the function \displaystyle y=\frac{1}{x}. What does it tend to as x gets really, really large? The answer, of course, is 0, since 1 divided by a reaaaaaaaally huge number gives us a reaaaaaaaally small number, which approaches 0. We can graph this function this way:

[Figure: graph of y = 1/x]

Since y tends toward 0, we call y=0 our horizontal asymptote.

What about y? How do we make that go to infinity? We do this by considering 1 divided by a reaaaaaaaally small number. At the threshold, we would be taking 1 divided by 0, which is undefined. So to get to that extreme, we let the denominator equal 0. In this case, x=0, which just so happens to be our vertical asymptote.

BUT, what about this function, \displaystyle y=\frac{1}{x-2}, which is the initial function but translated by 2 units in the positive x-direction? We ask the same questions: what happens when x gets really, really large and how can we make y really, really large?
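Both questions can be explored by simply plugging in numbers. Here is a small Python sketch for y = 1/x; the particular x values are arbitrary picks of mine.

```python
def f(x):
    return 1 / x

# Question 1: what happens to y when x gets really, really large?
for x in (10, 1_000, 100_000):
    print(f"x = {x:>7}: y = {f(x)}")

# Question 2: what happens to y when the denominator gets really, really small?
for x in (0.1, 0.001, 0.00001):
    print(f"x = {x:>7}: y = {f(x)}")
```

In the first loop y shrinks toward 0 (the horizontal asymptote), and in the second y blows up as x approaches 0 (the vertical asymptote).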

When x gets really, really large, it follows that x-2 gets really, really large and y equals 1 divided by a reaaaaaaaally huge number, which gives us a reaaaaaaaally small number, which approaches 0. Thus the horizontal asymptote remains as y=0.

How can we make y go to infinity? We do this by letting the denominator equal 0. In this case, x-2=0 and x=2. Hence, that is our vertical asymptote.

[Figure: graph of y = 1/(x-2)]

BUT, what about this function, \displaystyle y=\frac{1}{x-2}+3, which is the second function but translated by 3 units in the positive y-direction? We ask the same questions: what happens when x gets really, really large and how can we make y really, really large?

When x gets really, really large, it follows that x-2 gets really, really large and \displaystyle \frac{1}{x-2} equals 1 divided by a reaaaaaaaally huge number, which gives us a reaaaaaaaally small number, which approaches 0. Thus, overall, \displaystyle y=\frac{1}{x-2}+3 tends towards \displaystyle y=3 since the \displaystyle \frac{1}{x-2} tends toward 0. Therefore \displaystyle y=3 is our horizontal asymptote.

How can we make y go to infinity? We do this by letting the denominator equal 0. In this case, x-2=0 and x=2. Hence, that is our vertical asymptote.

[Figure: graph of y = 1/(x-2) + 3]

BUT, what about this function, \displaystyle y=x+3+\frac{1}{x-2}, which is the third function plus x??? We ask the same questions: what happens when x gets really, really large and how can we make y really, really large?

When x gets really, really large, it follows that x-2 gets really, really large and \displaystyle \frac{1}{x-2} equals 1 divided by a reaaaaaaaally huge number, which gives us a reaaaaaaaally small number, which approaches 0. Thus, overall, \displaystyle y=x+3+\frac{1}{x-2} tends towards \displaystyle y=x+3 since the \displaystyle \frac{1}{x-2} tends toward 0. Therefore \displaystyle y=x+3 is our asymptote. It’s not horizontal, though. We call these slanted asymptotes oblique.

How can we make y go to infinity? We do this by letting the denominator equal 0. In this case,  x-2=0 and  x=2. Hence, that is our vertical asymptote.
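Here is a short Python sketch that shows the curve hugging the line y = x + 3 as x grows, and shooting off as x approaches 2. The sample x values are my own choices.

```python
def curve(x):
    return x + 3 + 1 / (x - 2)

def oblique(x):
    return x + 3

# Gap between the curve and the oblique asymptote as x gets large:
# it is exactly the leftover 1/(x-2) term.
gaps = [curve(x) - oblique(x) for x in (12, 102, 1002, 10002)]
print("Gaps to y = x + 3:", gaps)

# Values of y as x creeps toward 2 from the right
near_vertical = [curve(2 + d) for d in (0.1, 0.001)]
print("y near x = 2:", near_vertical)
```

The gaps shrink toward 0 while the values near x = 2 grow without bound, matching the oblique and vertical asymptotes found above.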

[Figure: graph of y = x + 3 + 1/(x-2)]

BUT, what about this function, \displaystyle y=ax+b+\frac{p}{qx+r}, which is the generalised fourth function? Bonus marks if you can describe the sequence of transformations from the fourth function to this. When x gets really, really large, it follows that qx+r gets really, really large and \displaystyle \frac{p}{qx+r} equals p divided by a reaaaaaaaally huge number, which gives us a reaaaaaaaally small number, which approaches 0. Thus, overall, \displaystyle y=ax+b+\frac{p}{qx+r} tends towards \displaystyle y=ax+b since the \displaystyle \frac{p}{qx+r} tends toward 0. Therefore \displaystyle y=ax+b is our oblique asymptote.

How can we make y go to infinity? We do this by letting the denominator equal 0. In this case,  qx+r=0 and \displaystyle  x=-\frac{r}{q}. Hence, that is our vertical asymptote.

In sum, to find the asymptotes, we ask the two questions

  1. What happens when  x gets really, really large? (to obtain horizontal or oblique asymptotes)
  2. What happens when the denominator equals zero? (to obtain vertical asymptotes)
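The two questions can be wrapped into a tiny Python helper for the general function y = ax + b + p/(qx + r). This is only a sketch: the function name and return format are my own, and it assumes p and q are both non-zero (otherwise there is no fraction left to produce a vertical asymptote).

```python
def asymptotes(a, b, p, q, r):
    """Asymptotes of y = a*x + b + p/(q*x + r).

    Returns (vertical_x, slope, intercept): the vertical asymptote is
    x = vertical_x, and the other asymptote is y = slope*x + intercept
    (horizontal when slope is 0, oblique otherwise).
    """
    # Assumption: p != 0 and q != 0, so the fraction really has an x in its denominator.
    if p == 0 or q == 0:
        raise ValueError("p and q must be non-zero for a vertical asymptote to exist")
    vertical_x = -r / q       # question 2: denominator q*x + r equals zero
    return vertical_x, a, b   # question 1: the fraction dies off, leaving a*x + b

# The function y = x + 3 + 1/(x - 2) from earlier
print(asymptotes(a=1, b=3, p=1, q=1, r=-2))

# The function y = 1/(x - 2) + 3, where the slope is 0
print(asymptotes(a=0, b=3, p=1, q=1, r=-2))
```

The first call should report x = 2 with y = x + 3, and the second x = 2 with the horizontal line y = 3, the same answers we reasoned out by hand.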

*Footnote: The last function assumes q \neq 0, so that the denominator actually depends on x. If q = 0, we get a linear function \displaystyle y=ax+b+\frac{p}{r}, now assuming that r \neq 0 as division by zero is undefined. Note also that if a = 0 then we get the initial examples without oblique asymptotes.

 

For the J1 Saint

For all who were promoted, congratulations. Time to move forward. My suggestion comes three-fold:

1. Firm up foundations. Use December to learn and understand everything you studied for FEs. Don’t memorize. Know your stuff well. Understand how and why your formulae or theories work, and be flexible to see them incorporated in unique situations. Think. And once you think well, think out of the box.

2. Learn consistency. Consistency is not doing a lot in a short period of time, but doing a little over a long period of time. Ensure you understand EVERYTHING that is taught in each lecture. Complete your homework on the day it’s given so that procrastination has no hold on your life. Understand the answering techniques in each tutorial. Be consistent in your work, and see effortlessly excellent results. Apply all this from the very next lecture during lecture week, and start NOW.

DO NOT ONLY START IN 2017. KILL PROCRASTINATION BEFORE IT GROWS.

3. Learn consistently. In every moment of your life, ask questions. Relate what you learnt to a physical phenomenon. Ask out-of-the-syllabus questions. Satisfy your curiosity for knowledge online. Learn continually. This principle lasts beyond grades and into life as well.

In life, learn to learn. Let good grades be but a fruit of that spirit.