1
00:00:00,012 --> 00:00:06,377
>> We have seen how redshift surveys can
be used to describe large-scale structure,

2
00:00:06,377 --> 00:00:12,514
at least in the cosmographic sense.
But how do we quantify the distribution of

3
00:00:12,514 --> 00:00:17,254
galaxies such that we can compare with
theoretical models?

4
00:00:17,255 --> 00:00:22,375
The first way in which people have done
this is to use the so-called two-point

5
00:00:22,375 --> 00:00:27,007
correlation function.
Nowadays, the power spectrum is more often

6
00:00:27,007 --> 00:00:31,287
used and the two are actually related in a
very simple fashion.

7
00:00:31,287 --> 00:00:36,059
We will go over that next time.
So, let's talk about the galaxy two-point

8
00:00:36,059 --> 00:00:40,908
correlation function.
What it means is that, if galaxies are

9
00:00:40,908 --> 00:00:48,334
clustered together, they're correlated.
Each galaxy is somehow more likely to be

10
00:00:48,334 --> 00:00:54,196
found next to another galaxy.
And one way to quantify this is to ask the

11
00:00:54,196 --> 00:00:58,154
following question.
Assume that galaxies are actually

12
00:00:58,154 --> 00:01:04,336
uniformly, randomly distributed in space.
Then, at a distance from any given galaxy,

13
00:01:04,336 --> 00:01:08,867
there will be a certain probability of
finding another galaxy.

14
00:01:08,867 --> 00:01:13,088
If you actually measure this, you'll find
out that there is an excess.

15
00:01:13,088 --> 00:01:18,523
There are more galaxies near other
galaxies than you'd expect from a purely

16
00:01:18,523 --> 00:01:23,135
random distribution.
And that excess above the random is what

17
00:01:23,135 --> 00:01:28,106
the correlation function is.
So, one simply does the counting of galaxy

18
00:01:28,106 --> 00:01:34,040
pairs for each galaxy and then normalizes
by what the random distribution with the

19
00:01:34,040 --> 00:01:40,106
same number of data points would be.
As it turns out, the two-point correlation

20
00:01:40,106 --> 00:01:46,379
function is well represented by a power law,
and it's usually written in this form.

21
00:01:46,379 --> 00:01:52,081
That is, radius divided by some scaling
radius, to some power which is close to

22
00:01:52,081 --> 00:01:55,826
minus 1.8.
And typically, for normal galaxies in this

23
00:01:55,826 --> 00:02:00,028
neck of the universe, the scaling length
is about 5 megaparsecs.
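
As a rough sketch of the power law just described, assuming the slope of about 1.8 and the scaling length of about 5 megaparsecs quoted here (the function and parameter names are made up for illustration, not taken from the lecture):

```python
def xi_power_law(r_mpc, r0_mpc=5.0, gamma=1.8):
    """Two-point correlation function modeled as a pure power law.

    r0_mpc and gamma are the approximate values quoted in the lecture;
    the function and argument names are hypothetical.
    """
    return (r_mpc / r0_mpc) ** (-gamma)


# Excess probability of finding a neighbor at a few separations (in Mpc).
for r in (1.0, 5.0, 20.0):
    print(f"r = {r:5.1f} Mpc   xi = {xi_power_law(r):.3f}")
```

By construction, the excess probability equals one at the scaling length, is larger at smaller separations, and falls off at larger ones.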

24
00:02:00,028 --> 00:02:04,868
This, however, is not universal.
Different kinds of galaxies have different

25
00:02:04,868 --> 00:02:07,994
clustering properties, as you'll see
shortly.

26
00:02:07,995 --> 00:02:12,743
So, here is an example of a modern,
well-measured two-point correlation function

27
00:02:12,743 --> 00:02:15,903
for the galaxies from the 2dF
redshift survey.

28
00:02:15,903 --> 00:02:18,533
It does look pretty close to a
power law.

29
00:02:18,533 --> 00:02:22,961
Although, if you subtract the best-fit
power-law, you'll see that there are some

30
00:02:22,961 --> 00:02:27,908
significant deviations from it.
Now, if you don't have redshifts, you can

31
00:02:27,908 --> 00:02:32,836
measure the angular correlation function
just projected on the sky.

32
00:02:32,836 --> 00:02:38,512
And this two-dimensional projected
correlation function, usually denoted

33
00:02:38,512 --> 00:02:43,858
by a little w, not to be confused
with the equation of state parameter, is

34
00:02:43,858 --> 00:02:49,042
related to the three-dimensional
correlation function, xi of r, in a fairly

35
00:02:49,042 --> 00:02:53,017
simple fashion.
The power-law index differs exactly by

36
00:02:53,017 --> 00:02:57,087
one because we reduced the dimensionality
of the problem from 3 to 2.
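
Written out, and assuming a pure power law and small angles (the symbols \gamma and r_0 are the usual notation for the slope and scaling length, not something read off the slide), the relation just described is roughly:

```latex
\xi(r) = \left(\frac{r}{r_0}\right)^{-\gamma}
\quad\Longrightarrow\quad
w(\theta) \propto \theta^{\,1-\gamma},
\qquad
\gamma \approx 1.8 \;\Rightarrow\; w(\theta) \propto \theta^{-0.8}
```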

37
00:02:57,087 --> 00:03:01,944
Another important point is that if
galaxies are more likely to be found near

38
00:03:01,944 --> 00:03:05,156
other galaxies, there is that excess
probability.

39
00:03:05,156 --> 00:03:10,060
Then, in order to keep the average
constant, it has to turn negative at some

40
00:03:10,060 --> 00:03:14,341
point, and it does,
at scales roughly corresponding to

41
00:03:14,341 --> 00:03:17,238
those of voids seen in the galaxy
distribution.

42
00:03:17,238 --> 00:03:22,061
If there is a void in the distribution, in
some sense, that's anti-correlation.

43
00:03:22,061 --> 00:03:26,825
It's less likely to find a galaxy there
than it would be if space were

44
00:03:26,825 --> 00:03:31,348
uniformly populated with galaxies.
So, how do we do this in practice?

45
00:03:31,348 --> 00:03:34,935
Suppose we have a catalog of galaxies from
some survey.

46
00:03:34,935 --> 00:03:38,403
Then, you can create a random catalog
of galaxies.

47
00:03:38,403 --> 00:03:43,401
Same number of galaxies, but randomly
distributed in a Poissonian fashion.

48
00:03:43,401 --> 00:03:49,632
Then, you can do the simple counts.
Count galaxy pairs, one against the other,

49
00:03:49,632 --> 00:03:55,178
and count the fake galaxy pairs in the
random catalog, divide the two, and

50
00:03:55,178 --> 00:03:59,253
subtract one because it's the excess
probability.
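
The counting recipe just described can be sketched as follows. This is a toy version under stated assumptions: a two-dimensional unit square instead of a real survey volume, brute-force pair counting, and made-up names such as pair_counts and natural_estimator.

```python
import math
import random


def pair_counts(points_a, points_b, edges, same=False):
    """Histogram pair separations into the bins defined by edges."""
    counts = [0] * (len(edges) - 1)
    for i, p in enumerate(points_a):
        # For pairs within one catalog, visit each unordered pair once.
        others = points_b[i + 1:] if same else points_b
        for q in others:
            d = math.dist(p, q)
            for k in range(len(counts)):
                if edges[k] <= d < edges[k + 1]:
                    counts[k] += 1
                    break
    return counts


def natural_estimator(data, randoms, edges):
    """xi = DD / RR - 1; assumes both catalogs have the same size."""
    dd = pair_counts(data, data, edges, same=True)
    rr = pair_counts(randoms, randoms, edges, same=True)
    return [d / r - 1.0 if r > 0 else float("nan") for d, r in zip(dd, rr)]


random.seed(1)
# Toy "survey": 20 tight clumps of 10 points each in a unit square.
centers = [(random.random(), random.random()) for _ in range(20)]
data = [(cx + random.gauss(0, 0.01), cy + random.gauss(0, 0.01))
        for cx, cy in centers for _ in range(10)]
# Random catalog: same number of points, uniformly distributed.
randoms = [(random.random(), random.random()) for _ in range(len(data))]

edges = [0.0, 0.02, 0.05, 0.1]
xi = natural_estimator(data, randoms, edges)
print(xi)
```

For clumpy toy data like this, the smallest-separation bin should come out strongly positive, which is the excess over random that the correlation function measures.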

51
00:03:59,253 --> 00:04:05,434
For a large, extensive catalog of galaxies
with uniform borders, this will work

52
00:04:05,434 --> 00:04:09,438
fairly well.
But in reality, catalogs do have some

53
00:04:09,438 --> 00:04:13,058
incompleteness or uneven borders, and so
on.

54
00:04:13,058 --> 00:04:19,312
So, there is a better estimator, which is
called the Landy-Szalay estimator, and its

55
00:04:19,312 --> 00:04:23,969
formula is given here.
We make a correction by counting galaxy

56
00:04:23,969 --> 00:04:28,551
versus random catalog pairs.
This takes care of the boundary

57
00:04:28,551 --> 00:04:32,275
conditions.
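
The formula on the slide is not captured in this transcript. The sketch below uses the commonly quoted form of the Landy-Szalay estimator, (DD - 2DR + RR) / RR with each count normalized by its total number of pairs. The toy two-dimensional catalogs and all function names are assumptions for illustration.

```python
import math
import random


def pair_counts(points_a, points_b, edges, same=False):
    """Histogram pair separations into the bins defined by edges."""
    counts = [0] * (len(edges) - 1)
    for i, p in enumerate(points_a):
        # For pairs within one catalog, visit each unordered pair once.
        others = points_b[i + 1:] if same else points_b
        for q in others:
            d = math.dist(p, q)
            for k in range(len(counts)):
                if edges[k] <= d < edges[k + 1]:
                    counts[k] += 1
                    break
    return counts


def landy_szalay(data, randoms, edges):
    """xi = (DD - 2 DR + RR) / RR using normalized pair counts."""
    nd, nr = len(data), len(randoms)
    dd = pair_counts(data, data, edges, same=True)
    dr = pair_counts(data, randoms, edges)  # galaxy versus random pairs
    rr = pair_counts(randoms, randoms, edges, same=True)
    xi = []
    for d, x, r in zip(dd, dr, rr):
        # Normalize each count by the total number of pairs of that type.
        d_n = d / (nd * (nd - 1) / 2)
        x_n = x / (nd * nr)
        r_n = r / (nr * (nr - 1) / 2)
        xi.append((d_n - 2 * x_n + r_n) / r_n if r_n > 0 else float("nan"))
    return xi


random.seed(2)
# Toy clustered catalog: 20 clumps of 10 points in a unit square.
centers = [(random.random(), random.random()) for _ in range(20)]
data = [(cx + random.gauss(0, 0.01), cy + random.gauss(0, 0.01))
        for cx, cy in centers for _ in range(10)]
# The random catalog may be larger than the data; normalization handles it.
randoms = [(random.random(), random.random()) for _ in range(400)]

edges = [0.0, 0.02, 0.05, 0.1]
xi = landy_szalay(data, randoms, edges)
print(xi)
```

Because the counts are normalized, the random catalog does not need to match the data catalog in size, and in practice it is often made larger to reduce noise.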
Now, if you're not counting individual

58
00:04:32,275 --> 00:04:37,540
galaxies, but have, essentially, a galaxy
density field where you can divide

59
00:04:37,540 --> 00:04:42,967
galaxies into boxes or pixels, then
the galaxy number density can be used in

60
00:04:42,967 --> 00:04:46,922
the same fashion.
Just subtract the expected average from

61
00:04:46,922 --> 00:04:50,058
the actual count n, and divide by the
average.

62
00:04:50,059 --> 00:04:56,086
So, if you do this for any kind of pairs
or density pairs, then xi of r is really

63
00:04:56,086 --> 00:05:00,674
the expectation value of this
probabilistic distribution.
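
As a small sketch of the density-field version, assuming a one-dimensional, periodic row of cells with made-up counts (the function names are hypothetical):

```python
def density_contrast(counts):
    """delta_i = (n_i - mean) / mean for galaxy counts in cells."""
    mean = sum(counts) / len(counts)
    return [(n - mean) / mean for n in counts]


def xi_from_cells(counts, lag):
    """Average of delta(x) * delta(x + lag) over a periodic row of cells."""
    delta = density_contrast(counts)
    m = len(delta)
    return sum(delta[i] * delta[(i + lag) % m] for i in range(m)) / m


# Made-up galaxy counts in ten adjacent cells: two clumps and some voids.
counts = [0, 1, 5, 6, 1, 0, 0, 4, 7, 0]
for lag in (0, 1, 2):
    print(lag, xi_from_cells(counts, lag))
```

The average over cells stands in for the expectation value, and the lag plays the role of the separation r.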

64
00:05:00,674 --> 00:05:06,857
Now, strictly speaking, because we are
correlating galaxies with themselves, this

65
00:05:06,857 --> 00:05:10,498
should be called the autocorrelation
function.

66
00:05:10,498 --> 00:05:14,381
But common usage is just to call it
the correlation function.

67
00:05:14,381 --> 00:05:19,826
Note, also, that you can correlate a sample
of one kind of object against another.

68
00:05:19,826 --> 00:05:24,319
So, for example, you could ask, are
galaxies clustered around quasars?

69
00:05:24,319 --> 00:05:28,890
And we can evaluate the
cross-correlation function between

70
00:05:28,890 --> 00:05:33,381
galaxies and quasars, and so on.
By analogy, we can expand this to

71
00:05:33,381 --> 00:05:37,879
higher-order terms.
The three-point correlation function now

72
00:05:37,879 --> 00:05:43,225
asks: given a galaxy, and given the probability
of finding another galaxy at a certain

73
00:05:43,225 --> 00:05:48,166
distance, what is now the probability of
finding a third galaxy at some other

74
00:05:48,166 --> 00:05:51,427
distance?
This obviously gets a lot more complicated

75
00:05:51,427 --> 00:05:56,021
and numerically tedious very fast.
However, there is some useful information

76
00:05:56,021 --> 00:06:00,676
in these higher-order correlation functions,
and sometimes they're evaluated.

77
00:06:00,676 --> 00:06:05,372
I mentioned that different kinds of
galaxies can cluster differently, and here

78
00:06:05,372 --> 00:06:09,369
is a simple thing you can do.
You can ask, how are galaxies clustered,

79
00:06:09,369 --> 00:06:13,647
say, bright ones versus faint ones?
And it turns out the bright ones are

80
00:06:13,647 --> 00:06:17,652
clustered more strongly.
The amplitude is higher and the slope is

81
00:06:17,652 --> 00:06:20,396
steeper.
The steepness of the slope obviously

82
00:06:20,396 --> 00:06:23,314
correlates with how strongly they are
clustered.

83
00:06:23,314 --> 00:06:27,961
If the correlation function was perfectly
flat, there would be no correlation.

84
00:06:27,961 --> 00:06:33,339
Effects like this contain
some useful information about galaxy

85
00:06:33,339 --> 00:06:37,792
formation mechanisms.
And we'll talk more about those later in

86
00:06:37,792 --> 00:06:40,580
the class.
Or, you can divide galaxies by

87
00:06:40,580 --> 00:06:44,346
morphological type, say, ellipticals
versus spirals.

88
00:06:44,346 --> 00:06:49,152
It turns out that elliptical galaxies or
redder galaxies are clustered more

89
00:06:49,152 --> 00:06:52,103
strongly than the blue ones, the disk
galaxies.

90
00:06:52,103 --> 00:06:56,543
That, too, is an interesting clue about
galaxy formation and evolution.

91
00:06:56,543 --> 00:07:00,017
So, it's a power-law.
Does that mean it's a fractal?

92
00:07:00,017 --> 00:07:05,336
Remember, for any distribution of points,
the probability of finding another one

93
00:07:05,336 --> 00:07:09,605
from the same set increases as radius to the power of the
dimensionality of the space.

94
00:07:09,605 --> 00:07:13,342
In normal 3D space, this would be
like the cube of the radius.

95
00:07:13,342 --> 00:07:17,211
If that number is not an integer, the set
is called a fractal.

96
00:07:17,211 --> 00:07:22,294
So, we can write this formula, which
resembles the correlation function.
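
The formula on the slide is not captured in this transcript. One common way to write the idea, assuming the clustering term dominates (\xi much greater than 1) and using D for the dimension and \gamma for the correlation-function slope, is:

```latex
N(<r) \propto r^{D},
\qquad
N(<r) \propto \int_0^{r} \xi(r')\, r'^{2}\, dr' \propto r^{\,3-\gamma}
\quad\Longrightarrow\quad
D = 3 - \gamma \approx 1.2 \ \text{for}\ \gamma \approx 1.8
```

A uniform distribution in ordinary space would instead give D = 3.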

97
00:07:22,294 --> 00:07:27,250
And if, indeed, it were a pure power law,
then you could say the universe was fractal,

98
00:07:27,250 --> 00:07:31,645
if it was a pure power-law extending to
infinitely large distances.

99
00:07:31,645 --> 00:07:37,318
But in reality, that's not the case.
There are significant deviations from a

100
00:07:37,318 --> 00:07:43,121
power law; it's slightly bent.
And, therefore, the universe is not fractal, or

101
00:07:43,121 --> 00:07:48,526
large-scale structure is not fractal,
although pretty close to it.

102
00:07:48,526 --> 00:07:54,317
Next time, we'll talk about the power spectrum
of galaxy clustering, which can be

103
00:07:54,317 --> 00:07:57,970
directly related to theoretical
predictions.