CFA 2018 Level 2 Quantitative methods
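2017 CFA Level II Training Program: Quantitative Methods
Lecturer: Zhou Qi (周琪)
Title: Senior CFA/FRM Trainer, 金程教育 Financial Research Institute
Education: Bachelor's degree in International Economics, Central University of Finance and Economics; Bachelor's degree in Financial Risk Management, Victoria University, Australia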
Publications: contributed to the reference books for the 金程 CFA program, including the 金程 CFA Level I Chinese Notes.

Topic Weightings in CFA Level II
Session No.            Content                            Weighting (%)
Study Session 1-2      Ethics & Professional Standards    10-15
Study Session 3        Quantitative Methods               5-10
Study Session 4        Economic Analysis                  5-10
Study Session 5-6      Financial Statement Analysis       15-20
Study Session 7-8      Corporate Finance                  5-15
Study Session 9-11     Equity Analysis                    15-25
Study Session 12-13    Fixed Income Analysis              10-20
Study Session 14       Derivative Investments             5-15
Study Session 15       Alternative Investments            5-10
Study Session 16-17    Portfolio Management               5-10
Framework
SS3 Quantitative Methods for Valuation
R9 Correlation and regression
R10 Multiple regression and issues in regression analysis
R11 Time-series analysis
R12 Excerpt from "Probabilistic Approaches: Scenario Analysis, Decision Trees, and Simulation"
1. Scatter Plots
2. Covariance and Correlation
3. Interpretations of Correlation Coefficients
4. Significance Test of the Correlation
5. Limitations to Correlation Analysis
6. The Basics of Simple Linear Regression
7. Interpretation of regression coefficients
8. Standard Error of Estimate & Coefficient of Determination (R²)
9. Analysis of Variance (ANOVA)
10. Regression coefficient confidence interval
11. Hypothesis Testing about the Regression Coefficient
12. Predicted Value of the Dependent Variable
13. Limitations of Regression Analysis
A scatter plot is a graph that shows the relationship between the observations for two data series in two dimensions.

Sample Covariance and Correlation
Covariance measures how one random variable moves with another random variable. It captures the linear relationship.
$$\mathrm{Cov}(X,Y) = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{n-1}$$
Covariance ranges from negative infinity to positive infinity.
Correlation measures the linear relationship between two random variables.
$$r = \frac{\mathrm{Cov}(X,Y)}{s_x s_y}$$
Correlation has no units and ranges from –1 to +1.

Correlation coefficient    Interpretation
r = +1                     perfect positive correlation
0 < r < +1                 positive linear correlation
r = 0                      no linear correlation
−1 < r < 0                 negative linear correlation
r = −1                     perfect negative correlation
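As a rough numerical illustration of the covariance and correlation definitions above, the following Python sketch computes both directly from the formulas. The data values and the function names `sample_covariance` and `sample_correlation` are hypothetical, chosen only for illustration.

```python
import math

def sample_covariance(x, y):
    """Sample covariance with the n - 1 divisor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)

def sample_correlation(x, y):
    """Sample correlation: covariance divided by the product of the standard deviations."""
    s_x = math.sqrt(sample_covariance(x, x))  # Cov(X, X) is the sample variance of X
    s_y = math.sqrt(sample_covariance(y, y))
    return sample_covariance(x, y) / (s_x * s_y)

# Hypothetical data, not taken from the slides
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

cov_xy = sample_covariance(x, y)
r = sample_correlation(x, y)
print(f"Cov(X, Y) = {cov_xy:.4f}, r = {r:.4f}")
```

Rescaling either series changes the covariance but leaves the correlation unchanged, which is why the correlation is the easier of the two to interpret.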
Interpretations of Correlation Coefficients
The correlation coefficient is a measure of linear association.
It is a simple number with no unit of measurement attached, so the correlation coefficient is much easier to explain than the covariance.
Significance Test of the Correlation
Test whether the correlation between the population of two variables is equal to zero.
$H_0: \rho = 0$; $H_a: \rho \neq 0$ (two-tailed test)
$$t = \frac{r - 0}{\sqrt{\dfrac{1-r^2}{n-2}}} = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}, \quad df = n-2$$
Decision rule: reject $H_0$ if $t > +t_{critical}$ or $t < -t_{critical}$.
Conclusion: if $H_0$ is rejected, the correlation between the population of two variables is significantly different from zero.

Example
The covariance between X and Y is 16. The standard deviation of X is 4 and the standard deviation of Y is 8. The sample size is 20. Test the significance of the correlation coefficient at the 5% significance level.
Solution:
The sample correlation coefficient is r = 16/(4 × 8) = 0.5. The t-statistic can be computed as:
$$t = \frac{0.5\sqrt{20-2}}{\sqrt{1-0.25}} = 2.45$$
The critical t-value for α = 5%, two-tailed test with df = 18, is 2.101. Since the test statistic of 2.45 is larger than the critical value of 2.101, we have sufficient evidence to reject the null hypothesis. So we can say that the correlation coefficient between X and Y is significantly different from zero.

Limitations to Correlation Analysis
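The worked example on the significance of the correlation (covariance 16, standard deviations 4 and 8, n = 20) can be checked with a short Python sketch. The function name `correlation_t_stat` is hypothetical; the critical value 2.101 is the one quoted in the example rather than computed here.

```python
import math

def correlation_t_stat(r, n):
    """t-statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Inputs taken from the worked example
cov_xy, s_x, s_y, n = 16.0, 4.0, 8.0, 20

r = cov_xy / (s_x * s_y)
t_stat = correlation_t_stat(r, n)
t_critical = 2.101  # two-tailed, alpha = 5%, df = 18, as quoted in the example
reject_null = abs(t_stat) > t_critical

print(f"r = {r:.2f}, t = {t_stat:.2f}, reject H0: {reject_null}")
```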
Outliers
Outliers represent a few extreme values for sample observations. Relative to the rest of the sample data, the value of an outlier may be extraordinarily large or small.
Outliers can result in apparent statistical evidence that a significant relationship exists when, in fact, there is none, or that there is no relationship when, in fact, there is a relationship.
Spurious correlation
Spurious correlation refers to the appearance of a causal linear relationship when, in fact, there is no relation. Certain data items may be highly correlated purely by chance; when there is no economic explanation for the relationship, the correlation is considered spurious. The term covers three cases:
1. Correlation between two variables that reflects chance relationships in a particular data set.
2. Correlation induced by a calculation that mixes each of two variables with a third (two variables that are uncorrelated may become correlated if each is divided by a third variable).
3. Correlation between two variables arising not from a direct relation between them but from their relation to a third variable (for example, height may be positively correlated with the extent of a person's vocabulary, because both are related to age).
Nonlinear relationships
Correlation only measures the linear relationship between two variables, so it does not capture strong nonlinear relationships between variables.
For example, two variables could have a strong nonlinear relationship and still have a very low correlation.

The Basics of Simple Linear Regression
Linear regression allows you to use one variable to make predictions about another, test hypotheses about the relation between two variables, and quantify the strength of the relationship between the two variables.
Linear regression assumes a linear relation between the dependent and the independent variables.
The dependent variable is the variable whose variation is explained by the independent variable. The dependent variable is also referred to as the explained variable, the endogenous variable, or the predicted variable.
The independent variable is the variable whose variation is used to explain the variation of the dependent variable. The independent variable is also referred to as the explanatory variable, the exogenous variable, or the predicting variable.
The simple linear regression model
$$Y_i = b_0 + b_1 X_i + \varepsilon_i, \quad i = 1, \ldots, n$$
where
$Y_i$ = ith observation of the dependent variable, Y
$X_i$ = ith observation of the independent variable, X
$b_0$ = regression intercept term
$b_1$ = regression slope coefficient
$\varepsilon_i$ = the residual for the ith observation (also referred to as the disturbance term or error term)
Interpretation of regression coefficients
The estimated intercept coefficient ($\hat{b}_0$) is interpreted as the value of Y when X is equal to zero.
The estimated slope coefficient ($\hat{b}_1$) defines the sensitivity of Y to a change in X. It equals the covariance of X and Y divided by the variance of X.
Example
An estimated slope coefficient of 2 would indicate that the dependent variable will change two units for every 1 unit change in the independent variable.
An intercept term of 2% can be interpreted to mean that when the independent variable is zero, the dependent variable is 2%.

The assumptions of the linear regression
1. A linear relationship exists between X and Y.
2. X is not random; the condition that X is uncorrelated with the error term can substitute for the condition that X is not random.
3. The expected value of the error term is zero (i.e., $E(\varepsilon_i) = 0$).
4. The variance of the error term is constant (i.e., the error terms are homoskedastic).
5. The error term is uncorrelated across observations (i.e., $E(\varepsilon_i \varepsilon_j) = 0$ for all $i \neq j$).
6. The error term is normally distributed.

Calculation of regression coefficients
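Ordinary least squares (OLS)
OLS estimation is a process that estimates the population parameters $b_0$ and $b_1$ with corresponding values $\hat{b}_0$ and $\hat{b}_1$ that minimize the sum of the squared residuals (i.e., error terms).
The OLS sample coefficients are those that minimize
$$\sum_{i=1}^{n} \hat{\varepsilon}_i^2 = \sum_{i=1}^{n} \left(Y_i - \hat{b}_0 - \hat{b}_1 X_i\right)^2$$
The estimated slope coefficient ($\hat{b}_1$):
$$\hat{b}_1 = \frac{\mathrm{Cov}(X,Y)}{\mathrm{Var}(X)} = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sum_{i=1}^{n}(X_i-\bar{X})^2}$$
The estimated intercept coefficient ($\hat{b}_0$): the point ($\bar{X}$, $\bar{Y}$) is on the regression line.
$$\hat{b}_0 = \bar{Y} - \hat{b}_1 \bar{X}$$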
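The OLS slope and intercept formulas above can be sketched in a few lines of Python. This is a minimal illustration, not a full regression routine: the function name `ols_simple` and the data values are hypothetical.

```python
def ols_simple(x, y):
    """Return (intercept, slope) of the OLS line for one independent variable."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross products and squared deviations; the (n - 1) divisors
    # of Cov(X, Y) and Var(X) cancel in the ratio.
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x  # forces the line through (mean_x, mean_y)
    return intercept, slope

# Hypothetical data, not taken from the slides
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

b0, b1 = ols_simple(x, y)
print(f"intercept = {b0:.2f}, slope = {b1:.2f}")
```

Because the intercept is computed from the means, the fitted line always passes through the point of sample means, as stated above.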
Example: calculate a regression coefficient
Bouvier Co. is a Canadian company that sells forestry products to several Pacific Rim customers. Bouvier's sales are very sensitive to exchange rates. The following table shows recent annual sales (in millions of Canadian dollars) and the average exchange rate for the year (expressed as the units of foreign currency needed to buy one Canadian dollar). Calculate the intercept and coefficient for an estimated linear regression with the exchange rate as the independent variable and sales as the dependent variable.

Year i    X_i = Exchange Rate    Y_i = Sales
1         0.40                   20
2         0.36                   25
3         0.42                   16
4         0.31                   30
5         0.33                   35
6         0.34                   30

The following table provides several useful calculations:

Year i    X_i     Y_i    (X_i − X̄)²    (Y_i − Ȳ)²    (X_i − X̄)(Y_i − Ȳ)
1         0.40    20     0.0016         36            −0.24
2         0.36    25     0.0000         1             0.00
3         0.42    16     0.0036         100           −0.60
4         0.31    30     0.0025         16            −0.20
5         0.33    35     0.0009         81            −0.27
6         0.34    30     0.0004         16            −0.08
Sum       2.16    156    0.0090         250           −1.39

The sample mean of the exchange rate is:
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{2.16}{6} = 0.36$$
The sample mean of sales is:
$$\bar{Y} = \frac{\sum_{i=1}^{n} Y_i}{n} = \frac{156}{6} = 26$$
We want to estimate a regression equation of the form $Y_i = b_0 + b_1 X_i$. The estimates of the slope coefficient and the intercept are
$$\hat{b}_1 = \frac{\sum_{i=1}^{n}(Y_i-\bar{Y})(X_i-\bar{X})}{\sum_{i=1}^{n}(X_i-\bar{X})^2} = \frac{-1.39}{0.009} = -154.44$$
and
$$\hat{b}_0 = \bar{Y} - \hat{b}_1\bar{X} = 26 - (-154.444)(0.36) = 26 + 55.6 = 81.6$$
So the regression equation is $\hat{Y}_i = 81.6 - 154.444 X_i$.

Analysis of Variance (ANOVA) Table
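$$SST = \sum_{i=1}^{n}(Y_i-\bar{Y})^2 \quad \text{(total variation)}$$
$$RSS = \sum_{i=1}^{n}(\hat{Y}_i-\bar{Y})^2 \quad \text{(explained variation)}$$
$$SSE = \sum_{i=1}^{n}(Y_i-\hat{Y}_i)^2 \quad \text{(unexplained variation)}$$
where $\hat{Y}_i = \hat{b}_0 + \hat{b}_1 X_i$, so that SST = RSS + SSE.

ANOVA Table    df       SS     MSS
Regression     k = 1    RSS    MSR = RSS/k
Error          n − 2    SSE    MSE = SSE/(n − 2)
Total          n − 1    SST    -

Standard error of estimate:
$$SEE = \sqrt{\frac{SSE}{n-2}} = \sqrt{MSE}$$
Coefficient of determination (R²)
$$R^2 = \frac{RSS}{SST} = 1 - \frac{SSE}{SST} = \frac{\text{explained variation}}{\text{total variation}} = 1 - \frac{\text{unexplained variation}}{\text{total variation}}$$

Standard Error of Estimate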
Standard Error of Estimate (SEE) measures the degree of variability of the actual Y-values relative to the estimated Y-values from a regression equation. SEE will be low (relative to total variability) if the relationship is very strong and high if the relationship is weak. The SEE gauges the "fit" of the regression line. The smaller the standard error, the better the fit. The SEE is the standard deviation of the error terms in the regression.
$$SEE = \sqrt{\frac{SSE}{n-2}} = \sqrt{MSE}$$
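To make the SEE definition above concrete, the sketch below fits a simple regression line and computes SEE from the residuals. The helper names `fit_line` and `standard_error_of_estimate` and the data are hypothetical, used only to illustrate the arithmetic.

```python
import math

def fit_line(x, y):
    """OLS intercept and slope for one independent variable."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope

def standard_error_of_estimate(x, y):
    """SEE = sqrt(SSE / (n - 2)), the standard deviation of the residuals."""
    b0, b1 = fit_line(x, y)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(sse / (len(x) - 2))

# Hypothetical data, not taken from the slides
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

see = standard_error_of_estimate(x, y)
print(f"SEE = {see:.4f}")
```

If the observations lay exactly on a straight line, SSE and therefore SEE would be zero, which is the "best fit" limit described above.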
Coefficient of Determination (R²)
A measure of the "goodness of fit" of the regression. It is interpreted as the percentage of variation in the dependent variable explained by the independent variable. Its limits are 0 ≤ R² ≤ 1.
Example: an R² of 0.63 indicates that the variation of the independent variable explains 63% of the variation in the dependent variable.
For simple linear regression, R² is equal to the squared correlation coefficient (i.e., R² = r²).
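The identity R² = r² for simple linear regression can be verified numerically. The sketch below computes R² from the sums of squares and r from the correlation formula, then compares them; the function name `r_squared_and_r` and the data are hypothetical.

```python
import math

def r_squared_and_r(x, y):
    """Return (R^2 from 1 - SSE/SST, sample correlation r) for a simple regression."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    sst = sum((yi - mean_y) ** 2 for yi in y)  # total variation
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1 - sse / sst
    r = s_xy / math.sqrt(s_xx * sst)  # the (n - 1) divisors cancel
    return r_squared, r

# Hypothetical data, not taken from the slides
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

r_squared, r = r_squared_and_r(x, y)
print(f"R^2 = {r_squared:.4f}, r = {r:.4f}, r^2 = {r ** 2:.4f}")
```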
The Difference between R² and the Correlation Coefficient
The correlation coefficient indicates the sign of the relationship between two variables, whereas the coefficient of determination does not.
The coefficient of determination can apply to an equation with several independent variables, and it implies explanatory power, while the correlation coefficient only applies to two variables and does not imply explanatory power between the variables.

Example
An analyst ran a regression and got the following result:

             Coefficient    t-statistic    p-value
Intercept    -0.5           -0.91
Slope        2              12.00          <0.001

ANOVA Table    df    SS      MSS
Regression     1     8000    ?
Error          ?     2000    ?
Total          51    ?       -

Fill in the blanks of the ANOVA Table.
What is the standard error of estimate?
What is the result of the slope coefficient significance test?
What is the result of the sample correlation?
What is the 95% confidence interval of the slope coefficient?
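One way to work through the questions in this example is sketched below in Python, using only the figures given in the regression output and the ANOVA table. The critical value `t_critical = 2.009` is an assumption: it is the standard two-tailed 5% t-table value for 50 degrees of freedom and does not appear in the slides. The variable names are hypothetical.

```python
import math

# Figures given in the example
rss, sse = 8000.0, 2000.0
df_regression, df_total = 1, 51
slope, slope_t_stat = 2.0, 12.0

# Fill in the ANOVA table
df_error = df_total - df_regression   # error df = n - 2
sst = rss + sse                       # total sum of squares
msr = rss / df_regression
mse = sse / df_error

# Standard error of estimate and goodness of fit
see = math.sqrt(mse)
r_squared = rss / sst
r = math.copysign(math.sqrt(r_squared), slope)  # r takes the sign of the slope

# Slope standard error, recovered from t = slope / standard error under H0: b1 = 0
slope_se = slope / slope_t_stat

t_critical = 2.009  # ASSUMPTION: two-tailed 5% t-table value for df = 50
slope_is_significant = abs(slope_t_stat) > t_critical
ci = (slope - t_critical * slope_se, slope + t_critical * slope_se)

print(f"df_error = {df_error}, SST = {sst:.0f}, MSR = {msr:.0f}, MSE = {mse:.0f}")
print(f"SEE = {see:.2f}, R^2 = {r_squared:.2f}, r = {r:.3f}")
print(f"slope significant: {slope_is_significant}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

Under these assumptions the confidence interval excludes zero, which agrees with the reported p-value below 0.001 for the slope.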
Regression coefficient confidence interval
$$\hat{b}_1 \pm t_c \times s_{\hat{b}_1}$$
If the confidence interval at the desired level of significance does not include zero, the null is rejected, and the coefficient is said to be statistically different from zero.
$s_{\hat{b}_1}$ is the standard error of the regression coefficient. As SEE rises, $s_{\hat{b}_1}$ also increases, and the confidence interval widens, because SEE measures the variability of the data about the regression line, and the more variable the data, the less confidence there is in the regression model to estimate a coefficient.

Hypothesis Testing about Regression Coefficient
Significance test for a regression coefficient
$H_0: b_1 =$ the hypothesized value (usually 0)
$$t = \frac{\hat{b}_1 - b_1}{s_{\hat{b}_1}}, \quad df = n-2$$
Decision rule: reject $H_0$ if $t > +t_{critical}$ or $t < -t_{critical}$.
Rejection of the null means that the slope coefficient is different from the hypothesized value of $b_1$.

Predicted Value of the Dependent Variable
Predicted values are values of the dependent variable based on the estimated regression coefficients and a prediction about the value of the independent variable.
Point estimate
$$\hat{Y} = \hat{b}_0 + \hat{b}_1 X'$$
Confidence interval estimate
$$\hat{Y} \pm t_c \times s_f$$
$t_c$ = the critical t-value with df = n − 2
$s_f$ = the standard error of the forecast
$$s_f^2 = SEE^2\left[1 + \frac{1}{n} + \frac{(X'-\bar{X})^2}{(n-1)s_x^2}\right] = SEE^2\left[1 + \frac{1}{n} + \frac{(X'-\bar{X})^2}{\sum_{i=1}^{n}(X_i-\bar{X})^2}\right]$$
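The point estimate and forecast interval above can be sketched as follows. The function name `forecast_interval`, the data, and the forecast value `x_new = 6.0` are hypothetical; the critical value 3.182 is an assumption taken from a standard t-table (two-tailed 5%, df = 3).

```python
import math

def forecast_interval(x, y, x_new, t_critical):
    """Return (y_hat, s_f, (lower, upper)) for a simple regression forecast at x_new."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)  # equals (n - 1) * s_x^2
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    see_squared = sse / (n - 2)
    # Forecast variance: SEE^2 * [1 + 1/n + (x_new - mean_x)^2 / sum of squared X deviations]
    s_f = math.sqrt(see_squared * (1 + 1 / n + (x_new - mean_x) ** 2 / s_xx))
    y_hat = intercept + slope * x_new
    return y_hat, s_f, (y_hat - t_critical * s_f, y_hat + t_critical * s_f)

# Hypothetical data, not taken from the slides
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
t_critical = 3.182  # ASSUMPTION: two-tailed 5% t-table value for df = n - 2 = 3

y_hat, s_f, interval = forecast_interval(x, y, 6.0, t_critical)
print(f"forecast = {y_hat:.2f}, s_f = {s_f:.4f}, "
      f"interval = ({interval[0]:.3f}, {interval[1]:.3f})")
```

The forecast standard error grows as the forecast value of X moves away from the sample mean, so intervals are narrowest near the mean of X.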
Limitations of Regression Analysis
Regression relations change over time
This means that the estimation equation based on data from a specific time period may not be relevant for forecasts or predictions in another time period. This is referred to as parameter instability.
The usefulness of a regression relation will be limited if others are also aware of and act on the relationship.
Regression assumptions are violated
For example, the regression assumptions are violated if the data are heteroskedastic (non-constant variance of the error terms) or exhibit autocorrelation (error terms are not independent).
1. The Basics of Multiple Regression
2. Interpreting the Multiple Regression Results
3. Hypothesis Testing about the Regression Coefficient
4. Regression Coefficient F-test
5. Coefficient of Determination (R²)
6. Analysis of Variance (ANOVA)
7. Dummy variables
8. Multiple Regression Assumptions
9. Multiple Regression Assumption Violations
10. Model Misspecification
11. Qualitative Dependent Variables

The Basics of Multiple Regression
Multiple regression is regression analysis with more than one independent variable.
The multiple linear regression model
$$Y_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \cdots + b_k X_{ki} + \varepsilon_i$$
$X_{ji}$ = ith observation of the jth independent variable
n = number of observations
k = number of independent variables
Predicted value of the dependent variable
$$\hat{Y}_i = \hat{b}_0 + \hat{b}_1 \hat{X}_{1i} + \hat{b}_2 \hat{X}_{2i} + \cdots + \hat{b}_k \hat{X}_{ki}$$

Interpreting the Multiple Regression Results
The intercept term is the value of the dependent variable when the independent variables are all equal to zero.
Each slope coefficient is the estimated change in the dependent variable for a one unit change in that independent variable, holding the other independent variables co