Sunday, August 23, 2015

The Role of Statistics in Finance



To understand finance and make prudent investment decisions, it is imperative to have at least a basic understanding of statistics. Statistics is used in politics, when public opinion is polled on political issues, in election campaigns, and during elections. It is also used in medicine, in sports, in the economy, and in the stock market.
 

In an organized economy, stock market prices fluctuate constantly as a result of various factors. The market as a whole moves daily in response to macroeconomic or microeconomic conditions, or even political decisions. Quantities that fluctuate in this way and take on different values are called variables. So if on any given day we observe the general index, or any specific stock, we will see that it takes on different prices.

How many times has a nation taken a census of its population, of changes in personal income, or of the migration of people to different cities? If such a census covers every observation of a given variable, we are talking about the population. If it covers only part of the observations, we are talking about a sample. For example, if we record the closing price of a particular stock for a month, we observe its daily fluctuations. These observations are a sample of the variable "stock price". The population is the complete record of closing prices since the day the stock was listed on the exchange. So the sample is a subset of the population.

Statistics is the science concerned with data: its collection, organization, and presentation, and the conclusions drawn from its analysis. Going back to the example of the closing price of a stock during any given month, the central tendency of the closing price can be summarized by the arithmetic average. It is a statistical measure that shows the average level of a given variable during a specific time period.
The formula is \( \bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n} \), where \( \bar{x} \) is the arithmetic average (mean), \( x_i \) is the price of the variable at observation \( i \), and \( n \) is the total number of observations.
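As a minimal sketch of the arithmetic average, the Python snippet below computes the mean of five closing prices. The prices are hypothetical values chosen only for illustration, not market data.

```python
from statistics import mean

# Hypothetical closing prices (in euros) for one stock over five trading days.
closing_prices = [0.130, 0.135, 0.128, 0.140, 0.137]

# Arithmetic mean: the sum of the observations divided by their count.
average_price = sum(closing_prices) / len(closing_prices)

# The standard library function gives the same result.
assert abs(average_price - mean(closing_prices)) < 1e-12

print(f"Average closing price: {average_price:.3f}")
```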

When we compute the arithmetic mean of, say, a set of stock prices, we assign the same weight to each observation. However, we can also assign different weights. To get slightly ahead of ourselves: if we hold a portfolio of stocks, we tend to place money in different assets in different proportions. With a portfolio of two stocks, we would not place all of our money in one stock but would diversify to reduce risk. So with two stocks, A and B, and $100 to invest, we might place 60% in stock A and 40% in stock B.
 

In another example, say we have data for the price of a stock over the last three years. The most recent prices would carry a higher weight than those in the distant past. The weights assigned range from 0 to 1. The weighted average is given by \[ \bar{x}_w = \frac{x_1 w_1 + x_2 w_2 + \dots + x_n w_n}{w_1 + w_2 + \dots + w_n} \] where \( \bar{x}_w \) is the weighted average, \( x_1, x_2, \dots, x_n \) are the observed values, \( w_1, w_2, \dots, w_n \) are the weights, and \( n \) is the number of observations.

To illustrate the weighted average with four banking stocks on the Athens Stock Exchange, I took the closing prices as of August 12, 2015, along with the volume of shares, in millions.

Stock       Closing price (€)   Volume (millions of shares, rounded)
Alpha       0.135               25
National    0.640               11
Piraeus     0.161               19
Eurobank    0.068               29

\[ \bar{x}_w = \frac{0.135 \times 25 + 0.640 \times 11 + 0.161 \times 19 + 0.068 \times 29}{25 + 11 + 19 + 29} = \frac{15.446}{84} \approx 0.18 \] that is, about €0.18.
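The same calculation can be sketched in Python, using the closing prices and rounded volumes from the table above. The helper name `weighted_average` is a choice made for this illustration.

```python
# Closing prices (euros) and rounded volumes (millions of shares) from the table.
prices = {"Alpha": 0.135, "National": 0.640, "Piraeus": 0.161, "Eurobank": 0.068}
volumes = {"Alpha": 25, "National": 11, "Piraeus": 19, "Eurobank": 29}


def weighted_average(values, weights):
    """Return sum(value * weight) / sum(weight) over matching keys."""
    total_weight = sum(weights.values())
    return sum(values[name] * weights[name] for name in values) / total_weight


# Each price is weighted by its volume, so heavily traded stocks count for more.
volume_weighted_price = weighted_average(prices, volumes)
print(f"Volume-weighted average price: {volume_weighted_price:.3f} euros")
```

Because Eurobank and Alpha have the largest volumes and the lowest prices, the weighted average sits well below the simple average of the four prices.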
Now let us suppose that the above stocks are in a portfolio, and we have invested 15% in Alpha, 20% in National, 25% in Piraeus, and 40% in Eurobank. Let us also suppose (these are actual figures from the Athens Stock Exchange) that the changes in the stock prices are +3%, +0.79%, -3.0%, and +1.49%, respectively. What would be the change in the value of this portfolio?
\[ \bar{x}_w = \frac{0.03 \times 0.15 + 0.0079 \times 0.20 + (-0.03) \times 0.25 + 0.0149 \times 0.40}{0.15 + 0.20 + 0.25 + 0.40} = \frac{0.00454}{1.00} \approx 0.0045 \] that is, a gain of about 0.45%.
The weighted average is a very useful statistical tool in investments, since it provides a measure of the return on investments.
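As a rough check of the portfolio example, the sketch below multiplies each stock's weight by its price change and adds up the results. It uses the unrounded changes of 0.79% and 1.49% given in the text, and the variable names are illustrative.

```python
# Portfolio weights and one-day price changes from the example in the text.
weights = {"Alpha": 0.15, "National": 0.20, "Piraeus": 0.25, "Eurobank": 0.40}
changes = {"Alpha": 0.03, "National": 0.0079, "Piraeus": -0.03, "Eurobank": 0.0149}

# The weights are fractions of the portfolio, so they should sum to 1.
assert abs(sum(weights.values()) - 1.0) < 1e-9

# With weights summing to 1, the weighted average is just the weighted sum.
portfolio_change = sum(weights[stock] * changes[stock] for stock in weights)
print(f"Portfolio change: {portfolio_change:+.2%}")
```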

The arithmetic average, as we discussed, is useful because it tells us about the central tendency of a set of measurements. But suppose we are taking a series of temperature readings of a boiling substance. Sometimes we find extreme measurements (too high or too low) at either end of the range. In other words, when some measurements are extreme, the arithmetic average can be misleading. To determine the central tendency of the variable in that case, we use the median. We arrange the observations in increasing order; if the number of observations is odd, the median is the value in the middle. If the number of observations is even, the median is the average of the two values in the middle.
 

The median is another measure of central tendency. If a sample contains values that are very high or very low, the arithmetic mean is pulled toward them, so the median is used instead to describe the central tendency of the variable. With the observations sorted, if their number is odd, the median is the value in the middle. If it is even, the median is the average of the values in positions \( n/2 \) and \( n/2 + 1 \), where \( n \) is the number of observations. To illustrate, say we take ten annual returns of the Dow Jones Industrials and come up with a median of 3.5%. That means that half the returns are higher and half are lower than 3.5%.
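The following sketch shows the median rule at work on ten annual returns. The figures are hypothetical (they are not actual Dow Jones returns) and include one deliberately extreme value to show how the mean and the median diverge.

```python
from statistics import mean, median

# Hypothetical annual returns in percent; -38.0 is a deliberate outlier.
returns = [2.0, 3.0, 4.0, 5.0, -38.0, 6.0, 1.0, 7.0, 3.0, 4.0]


def median_by_hand(values):
    """Sort the values, then take the middle one or average the middle two."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


median_return = median_by_hand(returns)
mean_return = mean(returns)

# The hand-written rule agrees with the standard library.
assert median_return == median(returns)

print(f"Median: {median_return}%, mean: {mean_return}%")
```

The single outlier drags the mean below zero, while the median stays at 3.5%, in the middle of the typical returns.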

Another measure is the mode, which by definition is the value that occurs most frequently. If we look at the Dow and take returns for any given set of years, there may be instances where there is no mode, that is, no return appears more than once. There may also be instances where a return appears more than once, but this is unlikely in finance.
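A small sketch of the mode follows, with hypothetical returns. The helper `modes` is written for this illustration and returns an empty list when no value repeats, matching the "no mode" case described above.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s), or an empty list if nothing repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # every value appears once, so there is no mode
    return [value for value, count in counts.items() if count == highest]


# Hypothetical annual returns in percent.
with_repeat = [2.1, 3.4, -1.2, 3.4, 5.0]
without_repeat = [2.1, 3.4, -1.2, 4.8, 5.0]

print(modes(with_repeat))     # the repeated return
print(modes(without_repeat))  # empty: no return appears twice
```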

Now let us consider two hypothetical assets, A and B, each with a mean return of 10%. Would they be equally desirable if the observed returns for asset A were tightly concentrated between 9% and 12%, while those for asset B were dispersed between a low of -50% and a high of 30%? The answer is no. Dispersion around the mean matters, and that is what the variance measures: it is the average of the squared differences between each return and the mean.

If we analyze the daily percentage change in the price of a stock over a given period of time, we can use the variance and the standard deviation to assess the volatility of its returns and the risk associated with the stock. From a sample of observations, the variance is \[ s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \dots + (x_n - \bar{x})^2}{n - 1} \] where \( s^2 \) is the variance of the sample and \( \bar{x} \) is the sample mean.
The variance of the population is given by \[ \sigma^2 = \frac{(x_1 - \mu)^2 + (x_2 - \mu)^2 + \dots + (x_N - \mu)^2}{N} \] where \( \mu \) is the population mean and \( N \) is the size of the population. The standard deviation of the sample is the square root of \( s^2 \), or \[ s = \sqrt{\frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \dots + (x_n - \bar{x})^2}{n - 1}} \] and the standard deviation of the population is the square root of \( \sigma^2 \), that is \[ \sigma = \sqrt{\frac{(x_1 - \mu)^2 + (x_2 - \mu)^2 + \dots + (x_N - \mu)^2}{N}} \] In short, the standard deviation is the square root of the variance \( V \), that is \( SD = V^{1/2} \). The higher the standard deviation, the higher the dispersion around the mean.
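To make the difference between the sample and population formulas concrete, here is a minimal sketch with five hypothetical daily percentage changes. It computes both versions by hand and checks them against Python's `statistics` module.

```python
from math import sqrt
from statistics import pstdev, pvariance, stdev, variance

# Hypothetical daily percentage changes in a stock price.
daily_changes = [1.0, -2.0, 3.0, 0.0, -2.0]

n = len(daily_changes)
mean_change = sum(daily_changes) / n
squared_deviations = [(x - mean_change) ** 2 for x in daily_changes]

# Sample formulas divide by n - 1; population formulas divide by n.
sample_variance = sum(squared_deviations) / (n - 1)
population_variance = sum(squared_deviations) / n
sample_sd = sqrt(sample_variance)
population_sd = sqrt(population_variance)

# Cross-check against the standard library.
assert abs(sample_variance - variance(daily_changes)) < 1e-12
assert abs(population_variance - pvariance(daily_changes)) < 1e-12
assert abs(sample_sd - stdev(daily_changes)) < 1e-12
assert abs(population_sd - pstdev(daily_changes)) < 1e-12

print(f"Sample variance {sample_variance:.2f}, sample SD {sample_sd:.2f}")
print(f"Population variance {population_variance:.2f}, population SD {population_sd:.2f}")
```

Dividing by n - 1 rather than n makes the sample variance slightly larger, a difference that shrinks as the number of observations grows.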
 

Investments are about returns, and in order to make sound decisions we have to take risk into account. The more returns fluctuate over time, the greater the uncertainty about, say, stock prices and returns. This increased uncertainty is associated with greater risk and, without going into further detail, an investor who takes on that risk must be compensated for it with greater returns. One way to capture this uncertainty is to compute the standard deviation (SD) of returns.

The larger the SD, the greater the risk associated with the asset. If we have a collection of data, say the returns of a stock market index for a month, a small SD indicates that the returns fluctuate closely around the mean return, which means less volatility. A large SD indicates that returns tend to depart more from the mean return, which means high volatility.

In finance it is interesting to know the relationship between two variables. Imagine you had a portfolio with only two stocks. Would it not be interesting to know the relationship, if any, between the two stocks? Or suppose we want to know the relationship between two indices, say the Dow Jones index and the French CAC. Do they move together or in opposite directions?

In statistics this is accomplished by the covariance. The covariance between two variables i and j, written COVi,j, measures the linear relationship between them. There are two problems here: the first is that its value depends on the units in which the variables are measured, and the second is that the covariance has no bounds (upper or lower limits), so you cannot tell from the number alone whether the relationship between the variables is strong or weak.

This is alleviated by determining the correlation coefficient between variables i and j (CORRi,j). The formula is \( CORR_{i,j} = \frac{COV_{i,j}}{SD_i \times SD_j} \). The correlation coefficient measures the strength of the linear relationship between the two variables, and takes values between -1 and +1. When the correlation is positive, the two variables tend to move in the same direction, and at +1 they move together perfectly. When it is negative, they tend to move in opposite directions, and at -1 they do so perfectly. A correlation of 0 means there is no linear relationship between the variables.
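The sketch below works through covariance and correlation on three short, hypothetical return series. The text does not give a formula for the covariance, so the sample version (dividing by n - 1) is an assumption made here; the function names are also illustrative.

```python
from math import sqrt


def covariance(xs, ys):
    """Sample covariance (assumed n - 1 divisor) of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    sd_x = sqrt(covariance(xs, xs))  # the covariance of a series with itself is its variance
    sd_y = sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sd_x * sd_y)


# Hypothetical returns in percent for three indices over four periods.
index_a = [1.0, 2.0, 3.0, 4.0]
index_b = [2.0, 4.0, 6.0, 8.0]  # always moves with index_a
index_c = [4.0, 3.0, 2.0, 1.0]  # always moves against index_a

print(f"COV(a, b) = {covariance(index_a, index_b):.2f}")
print(f"CORR(a, b) = {correlation(index_a, index_b):.2f}")
print(f"CORR(a, c) = {correlation(index_a, index_c):.2f}")
```

Doubling every value in one series doubles the covariance but leaves the correlation unchanged, which is why the correlation is easier to interpret.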

 