Portfolio Management Theory
And Technical Analysis
Lecture Notes

Statistics Primer

By: Dr. Sam Vaknin


The Bill of Rights of the Investor

1. To earn a positive return (=yield) on his capital.

2. To insure his investments against risks (=to hedge).

3. To receive information identical to that of ALL other investors - complete, accurate and timely - and to form independent judgement based on this information.

4. To alternate between investments - or be compensated for diminished liquidity.

5. To study how to manage his portfolio of investments carefully and rationally.

6. To compete on equal terms for the allocation of resources.

7. To assume that the market is efficient and fair.


1. The difference between asset-owners, investors and speculators.

2. Income: general, free, current, projected (expectations), certain, uncertain.

3. CASE A (=pages 3 and 4)

4. The solutions to our FIRST DISCOVERY are called "The Opportunity Set"



6. The OPTIMAL SOLUTION (=maximum consumption in both years).

7. The limitations of the CURVES:

  1. More than one investment alternative;
  2. Future streams of income are not certain;
  3. No investment is riskless;
  4. Risk=uncertainty;



INVESTOR A has secured income of $20,000 p.a. for the next 2 years.

One investment alternative: a savings account yielding 3% p.a.

(in real terms = above inflation or inflation adjusted).

One borrowing alternative: unlimited money at 3% interest rate

(in real terms = above inflation or inflation adjusted).


Will spend $20,000 in year 1

and $20,000 in year 2

and save $ 0


Will save $20,000 in year 1 (=give up his liquidity)

and spend this money

plus 3% interest $600

plus $20,000 in year 2 (=$40,600)


Will spend $20,000 in year 1

plus borrow money against his income in year 2

He will be able to borrow from the banks a maximum of:

$19,417 (+3% = $20,000)


  1. That he will live long enough to pay back his debts.
  2. That his income of $20,000 in the second year is secure.
  3. That this is a stable, certain economy and, therefore, interest rates will remain at the same level.


Rests on the above three assumptions (Keynes' theorem about the long run).

$19,417 is the NPV of $20,000 due in one year, discounted at 3%.




{Money Saved in the First Year × (1 + the interest rate)}
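The two-year arithmetic of CASE A can be sketched in a few lines, using the figures quoted in the notes ($20,000 income per year, 3% real rate):

```python
income = 20_000.0
r = 0.03

# The saver: deposits year-1 income, spends everything in year 2.
year_2_total = income * (1 + r) + income      # $20,600 + $20,000
assert round(year_2_total, 2) == 40_600.00

# The borrower: spends in year 1 the present value of year-2 income.
max_loan = income / (1 + r)                   # the NPV of $20,000 due in a year
assert round(max_loan, 2) == 19_417.48
```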



1. The concept of scenarios (Delphi) and probabilities


{SHOW TABLE - p14}

3. The properties of the Mean Value:

4. Multiplying all the yields by a constant multiplies the Mean Value of the yields by the same constant.

5. The Mean of the combined yields on two types of assets = the Sum of the Means of each asset calculated separately

{SHOW TABLE - p16}

6. Bi-faceted securities: the example of a convertible bond.

{SHOW TABLE - p16}

7. VARIANCE and STANDARD DEVIATION as measures of the difference between mathematics and reality.

They are the measures of the frustration of our expectations.

{Calculation - p17}


8. We will prefer a security with the highest Mean Value and the lowest Standard Deviation.

9. The PRINCIPLE OF DIVERSIFICATION of the investment portfolio: The Variance of combined assets may be less than the variance of each asset separately.

{Calculation - p18}


  1. The yield provided by an investment in a portfolio of assets will be closer to the Mean Yield than that of an investment in a single asset.
  2. When the yields are independent - most yields will be concentrated around the Mean.
  3. When all yields react similarly - the portfolio's variance will equal the variance of its underlying assets.
  4. If the yields are dependent - the portfolio's variance will be equal to or less than the lowest variance of one of the underlying assets.

11. Calculating the Average Yield of an Investment Portfolio.

{Calculation - pp. 18 - 19}

12. Short-cutting the way to the Variance:

PORTFOLIO COVARIANCE - the influence of events on the yields of underlying assets.

{Calculation - p19}

13. Simplifying the Covariance - the Correlation Coefficient.

{Calculation - p19}

14. Calculating the Variance of multi-asset investment portfolios.

{Calculations - p19 - 20}







Diminishing avoidance of absolute risk
Invests more in risky assets as his capital grows
Derivative of avoidance of absolute risk < 0
Example utility function: natural logarithm (Ln) of capital


Constant avoidance of absolute risk
Doesn't change his investment in risky assets as capital grows
Derivative = 0
Example utility function: (-1) × e raised to the power of a constant multiplied by the capital


Increasing avoidance of absolute risk
Invests less in risky assets as his capital grows
Derivative > 0
Example utility function: (Capital) less (Constant) × (Capital squared)


Diminishing avoidance of relative risk
Percentage invested in risky assets grows with capital growth
Derivative < 0
Example utility function: (-1) × e raised to the power of 2 × the square root of the capital


Constant avoidance of relative risk
Percentage invested in risky assets unchanged as capital grows
Derivative = 0
Example utility function: natural logarithm (Ln) of capital


Increasing avoidance of relative risk
Percentage invested in risky assets decreases with capital growth
Derivative > 0
Example utility function: (Capital) less (Number) × (Capital squared)




1. The tests: lenient, quasi-rigorous, rigorous

2. The relationship between information and yield

3. Insiders and insider-trading

4. The Fair Play theorem

5. The Random Walk Theory

6. The Monte Carlo Fallacy

7. Structures - Infra and hyper

8. Market (price) predictions

  1. The Linear Model
  2. The Logarithmic Model
  3. The Filter Model
  4. The Relative Strength Model
  5. Technical Analysis

9. Case study: split and reverse split

10. Do's and Don'ts: a guide to rational behaviour


1. Efficient Market: The price of the share reflects all available information.

2. The Lenient Test: Are the previous prices of a share reflected in its present price?

3. The Quasi-Rigorous Test: Is all the publicly available information fully reflected in the current price of a share?

4. The Rigorous Test: Is all the (publicly and privately) available information fully reflected in the current price of a share?

5. A positive answer would prevent situations of excess yields.

6. The main question: how can an investor increase his yield (beyond the average market yield) in a market where all the information is reflected in the price?

7. The Lenient version: It takes time for information to be reflected in prices.

Excess yield could have been produced in this time - had it not been so short.

The time needed to extract new information from prices = The time needed for the information to be reflected.

The Lenient Test: Will acting after the price has changed provide excess yield?

8. The Quasi-Rigorous version: A new price (slightly deviating from equilibrium) is established by buyers and sellers when they learn the new information.

The QR Test: will acting immediately on news provide excess yield?

Answer: No. On average, the investor will buy at a price that has already converged to equilibrium.

9. The Rigorous version: Investors cannot establish the "paper" value of a firm following new information. Different investors will form different evaluations and will act in unpredictable ways. This is "The Market Mechanism". If a right evaluation were possible - everyone would try to sell or buy at the same time.

The Rigorous Test: Is it at all possible to derive excess yield from information? Is there anyone who received excess yields?

10. New technology for the dissemination of information, professional analysis and portfolio management and strict reporting requirements and law enforcement - support the Rigorous version.

11. The Lenient Version: Analysing past performance (=prices) is worthless.

The QR Version: Publicly available information is worthless.

The Rigorous version: No analysis or portfolio management is worth anything.

12. The Fair Play Theorem: Since an investor cannot predict the equilibrium, he cannot use information to evaluate the divergence of (estimated) future yields from the equilibrium. His future yields will always be consistent with the risk of the share.

13. Insider-Trading and Arbitrageurs.

14. Price predictive models assume:

(a) The yield is positive and (b) High yield is associated with high risk.

15. Assumption (a) is not consistent with the Lenient Version.

16. Random Walk Theory (RWT):

  1. Current share prices are not dependent on yesterday's or tomorrow's prices.
  2. Share prices are equally distributed over time.

17. The Monte Carlo Fallacy and the Stock Exchange (no connection between colour and number).

18. The Fair Play Theorem does not require an equal distribution of share prices over time and allows for the possibility of predicting future prices (e.g., a company deposits money in a bank intended to cover an increase in its annual dividends).

19. If RWT is right (prices cannot be predicted) - the Lenient Version is right (excess yields are impossible). But if the Lenient Version is right - it does not mean that RWT is necessarily so.

20. The Rorschach tendency to impose patterns (cycles, channels) on totally random graphic images.

The Elton - Gruber experiments with random numbers and newly - added random numbers.

No difference between graphs of random numbers - and graphs of share prices.

21. Internal contradiction between assumption of "efficient market" and the ability to predict share prices, or price trends.

22. The Linear Model

P = Price of share; C = Counter (lag); E(ΔP) = Expected difference (change) in price

ΔP = Previous change in price; R = Random number

P(a) - P(a-1) = ( E(ΔP) + ΔP / E(ΔP) ) × ( P(a-1-c) - P(a-2-c) + R )

Using a correlation coefficient.

23. The Logarithmic Model

log(CPn) / log(CPn-1) = Cumulative yield    CP = Closing Price

Sometimes, instead of CP, we use: ΔP / (div/P)    ΔP = Price change    div = dividend

24. These two models provide identical results - and they explain less than 2% of the change in share prices.

25. To eliminate the influence of very big or small numbers -

some analyse only the + and - signs of the price changes

Fama and Macbeth showed that such sign clusters are randomly distributed, with no statistical significance.

26. Others say that proximate share prices are not connected - but share prices are sinusoidally connected over time.

Research shows faint traces of seasonality.

27. Research shows that past and future prices of shares are connected with transaction costs. The higher the costs - the higher the (artificial) correlation (intended to, at least, cover the transaction costs).

28. The Filter (Technical Analysis) Model

Sophisticated investors will always push prices to the point of equilibrium.

Shares will oscillate within boundaries. If they break them, they are on the way to a new equilibrium. It is a question of timing.

29. Is it better to use the Filter Model or to hold onto a share or onto cash?

Research shows: in market slumps, continuous holders were worse off than Filter users and were identical with random players.

This was proved by using a mirror filter.

30. The Filter Model provides an excess yield identical to transaction costs.

Fama - Blum: the best filter was 0.5%; for the purchase side, 1%-1.5%.

Higher filters were better than constant holding (the "Buy and Hold" strategy) only in countries with higher costs and taxes.

31. Relative Strength Model

( CP ) / ( AP ) = RS    CP = Current price    AP = Average price in the X previous weeks

  1. Divide the investment equally among the highest-RS shares.
  2. Sell a share whose RS fell below the RS of X% of all shares.

Best performance is obtained when "highest RS" means the top 5% and X% = 70%.
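A minimal sketch of the RS calculation and ranking; the share names, prices and 5-week window below are hypothetical:

```python
# RS = current price / average price over the previous X weeks.
def relative_strength(prices, weeks):
    window = prices[-weeks:]
    return prices[-1] / (sum(window) / len(window))

price_histories = {                 # hypothetical weekly closing prices
    "AAA": [10, 11, 12, 13, 14],    # rising
    "BBB": [20, 19, 18, 17, 16],    # falling
    "CCC": [30, 30, 31, 31, 32],    # nearly flat
}
rs = {name: relative_strength(p, weeks=5) for name, p in price_histories.items()}
ranked = sorted(rs, key=rs.get, reverse=True)
assert ranked[0] == "AAA"           # the rising share has the highest RS
```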

32. RS models instruct us to invest in upwardly volatile stocks - high risk.

33. Research: RS-selected shares (=the sample) exhibit yields identical to those of the group of stocks from which they were selected.

When risk adjusted - the sample's performance was inferior (higher risk).

34. Short term movements are more predictable.

Example: the chances for a reverse move are 2-3 times bigger than the chances for an identical one.

35. Branch: in countries with a capital gains tax - people will sell losing shares to realize losses, and those shares will become underpriced.

They will correct at the beginning of the year, but the excess yield will only cover transaction costs (the January Effect).

36. The market reacts identically (=efficiently) to all forms of information.

37. Why does a technical operation (split / reverse split) influence the price of the share (supposed to reflect underlying value of company)?

Split - a symptom of changes in the company. Shares go up before a split is even conceived - so splits are reserved for good shares (the dividend is increased). There is excess yield until the split - but it is averaged out after it.

38. There is a considerable gap (up to 2 months) between the announcement and the split. Research shows that no excess yield can be obtained in this period.

39. The same holds for M&A.

40. The QR Version: excess yields could be made on private information.

Research: the influence of the Wall Street Journal versus the influence of market analyses distributed to a select public.

The WSJ influenced the price of the stocks - but only on that day.

41. The Rigorous Version: excess yields cannot be made on insider information.

How to test this - if we do not know the information? Study the behaviour of those who have (management, big players).

Research shows that they do achieve excess yields.

42. Do's and Don'ts

  1. Select your investments on economic grounds.
    Public knowledge is no advantage.
  2. Buy stock with a discrepancy between the situation of the firm - and the expectations and appraisal of the public (Contrarian approach vs. Consensus approach).
  3. Buy stocks in companies with potential for surprises.
  4. Take advantage of volatility before a new equilibrium is reached.
  5. Listen to rumours and tips - but check for yourself.

Profitability and Share Prices

  1. The concept of the business firm - ownership, capital and labour.
  2. Profit - the change in an asset's value (different forms of change).
  3. Financial statements: Balance Sheet, PNL, Cash Flow, Consolidated - a review.
  4. The external influences on the financial statements - the cases of inflation, exchange rates, amortization / depreciation and financing expenses.
  5. The correlation between share price performance and profitability of the firms.
  6. Market indicators: P/E, P/BV (Book Value).
  7. Predicting future profitability and growth.


  1. The various types of bonds: bearer and named;
  2. The various types of bonds: straight and convertible;
  3. The various types of bonds (according to the identity of the issuer);
  4. The structure of a bond: principal (face), coupon;
  5. Stripping and discounting bonds;
  6. (Net) Present Value;
  7. Interest coupons, yields and the pricing of bonds;
  8. The Point Interest Rate and methods for its calculation (discrete and continuous);
  9. Calculating yields: current and to maturity;
  10. Summing up: interest, yield and time;
  11. Corporate bonds;
  12. Taxation and bond pricing;
  13. Options included in the bonds.

The Financial Statements

1. The Income Statement

revenues, expenses, net earnings (profits)

2. Expenses

Operating expenses (including depreciation):

Cost of goods sold

General and administrative (G & A) expenses

Interest expenses


3. Operating revenues - Operating costs = Operating income

4. Operating income + Extraordinary, nonrecurring items =

= Earnings Before Interest and Taxes (EBIT)

5. EBIT - Net interest costs = Taxable income

6. Taxable income - Taxes = Net income (Bottom line)

7. The Balance Sheet

Assets = Liabilities + Net worth (Stockholders' equity)

8. Current assets = Cash + Deposits + Accounts receivable + Inventory

Current assets + Long term assets = Total Assets


Current (short term) liabilities = Accounts payable + Accrued taxes + Debts

Current liabilities + Long term debt and other liabilities = Total liabilities

9. Total assets - Total liabilities = Book value

10. Stockholders' equity = Par value of stock + Capital surplus + Retained surplus

11. Statement of cash flows (operations, investing, financing)

12. Accounting vs. Economic earnings (influenced by inventories, depreciation, seasonality and business cycles, inflation, extraordinary items)

13. Abnormal stock returns are obtained where actual earnings deviate from projected earnings (SUE - Standardized unexpected earnings).

14. The job of the security analyst: to study past data, eliminate "noise" and form expectations about future dividends and earnings, which determine the intrinsic value (and the future price) of a stock.

15. Return on equity (ROE) = Net Profits / Equity

16. Return on assets (ROA) = EBIT / Assets

17. ROE = (1-Tax rate) [ROA + (ROA - Interest rate) × Debt / Equity]

18. Increased debt will positively contribute to a firm's ROE if its ROA exceeds the interest rate on the debt (Example)
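The example in item 18 can be worked numerically; the figures below are hypothetical:

```python
# ROE = (1 - tax rate) x [ROA + (ROA - interest rate) x Debt/Equity]
def roe(roa, interest_rate, debt_to_equity, tax_rate=0.35):
    return (1 - tax_rate) * (roa + (roa - interest_rate) * debt_to_equity)

# ROA of 10% against 6% debt: leverage lifts ROE...
assert roe(0.10, 0.06, 1.0) > roe(0.10, 0.06, 0.0)

# ...but an ROA of 4%, below the 6% rate, with the same leverage depresses ROE.
assert roe(0.04, 0.06, 1.0) < roe(0.04, 0.06, 0.0)
```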

19. Debt makes a company more sensitive to business cycles and the company carries a higher financial risk.

20. The Du Pont system

ROE = Net Profit/Pretax Profit (1) × Pretax Profit/EBIT (2) × EBIT/Sales (3) × Sales/Assets (4) × Assets/Equity (5)

21. Factor 3 (Operating profit margin or return on sales) is ROS

22. Factor 4 (Asset turnover) is ATO

23. Factor 3 × Factor 4 = ROA

24. Factor 1 is the Tax burden ratio

25. Factor 2 is the Interest burden ratio

26. Factor 5 is the Leverage ratio

27. Factor 6 = Factor 2 × Factor 5 is the Compound leverage factor

28. ROE = Tax burden × ROA × Compound leverage factor
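The Du Pont decomposition of items 20-28 can be verified with hypothetical statement figures:

```python
# Hypothetical income-statement and balance-sheet figures.
net_profit, pretax_profit, ebit = 65.0, 100.0, 120.0
sales, assets, equity = 400.0, 800.0, 300.0

tax_burden      = net_profit / pretax_profit   # factor 1
interest_burden = pretax_profit / ebit         # factor 2
ros             = ebit / sales                 # factor 3 (return on sales)
ato             = sales / assets               # factor 4 (asset turnover)
leverage        = assets / equity              # factor 5

roe = tax_burden * interest_burden * ros * ato * leverage
assert abs(roe - net_profit / equity) < 1e-12  # the five factors multiply back to ROE
assert abs(ros * ato - ebit / assets) < 1e-12  # factor 3 x factor 4 = ROA
```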

29. Compare ROS and ATO Only within the same industry!

30. Fixed asset turnover = Sales / Fixed assets

31. Inventory turnover ratio = Cost of goods sold / Inventory

32. Average collection period (Days receivables) = Accounts receivables / Sales × 365

33. Current ratio = Current assets / Current liabilities

34. Quick ratio (the Acid test ratio) = (Cash + Receivables) / Current liabilities

35. Interest coverage ratio (Times interest earned) = EBIT / Interest expense

36. P / B ratio = Market price / Book value

37. Book value is not necessarily Liquidation value

38. P / E ratio = Market price / Net earnings per share (EPS)

39. The P/E ratio is not the P/E Multiple (which emerges from DDM - Discounted Dividend Models)

40. Current earnings may differ from Future earnings

41. ROE = E / B = (P/B) / (P/E)

42. Earnings yield = E / P = ROE / (P/B)

43. GAAP - Generally Accepted Accounting Principles - allow different representations of leases, inflation, pension costs, inventories and depreciation.

44. Inventory valuation:

Last In First Out (LIFO)

First In First Out (FIFO)

45. Economic depreciation - The amount of a firm's operating cash flow that must be re-invested in the firm to sustain its real cash flow at the current level.

Accounting depreciation (accelerated, straight line) - Amount of the original acquisition cost of an asset allocated to each accounting period over an arbitrarily specified life of the asset.

46. Measured depreciation in periods of inflation is understated relative to replacement cost.

47. Inflation affects real interest expenses (deflates the statement of real income), inventories and depreciation (inflates).

[Graham's Technique]


1. BOND - IOU issued by Borrower (=Issuer) to Lender

2. PAR VALUE (=Face Value)

COUPON (=Interest payment)

3. The PRESENT VALUE (=The Opportunity Cost)

1 / (1+r)^n    r = interest rate    n = years



4. The PRICE of the bond:

Pb = Σ (t=1..n) [ C / (1+r)^t ] + PAR / (1+r)^n

Pb = Price of the Bond

C = Coupon

PAR = Principal payment

n = number of payments
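The pricing formula, written as a short function (the coupon, par value and rates below are illustrative):

```python
# Pb = sum over t = 1..n of C/(1+r)^t, plus PAR/(1+r)^n.
def bond_price(coupon, par, r, n):
    pv_coupons = sum(coupon / (1 + r) ** t for t in range(1, n + 1))
    return pv_coupons + par / (1 + r) ** n

# A 3-year, $50-coupon, $1,000-par bond prices at par when r equals the coupon rate,
assert abs(bond_price(50, 1000, 0.05, 3) - 1000) < 1e-6
# and below par when the required rate rises above it.
assert bond_price(50, 1000, 0.06, 3) < 1000
```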

5. BOND CONVEXITY - an increase in interest rates results in a price decline that is smaller than the price gain resulting from a decrease of equal magnitude in interest rates.




2. ANNUALIZED PERCENTAGE RATE (APR) = YTM × Number of periods in 1 year


n = number of periods in 1 year




n = number of days to maturity


8. BEY = (365 × BDY) / (360 - BDY × n)

9. BDY < BEY < EAY
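Items 8-9 in numbers, for a hypothetical discount instrument ($10,000 face bought at $9,800 with 90 days to maturity); the EAY here compounds the holding-period yield to a full year:

```python
face, price, days = 10_000.0, 9_800.0, 90

bdy = (face - price) / face * 360 / days   # bank discount yield (360-day year)
bey = 365 * bdy / (360 - bdy * days)       # bond equivalent yield, per item 8
hpy = (face - price) / price               # holding-period yield
eay = (1 + hpy) ** (365 / days) - 1        # effective annual (compounded) yield

assert bdy < bey < eay                     # the ordering stated in item 9
```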

10. FOR PREMIUM BOND: C > CY > YTM (Loss on Pb relative to par)



1. Zero coupons, stripping

2. Appreciation of Original issue discount (OID)

3. Coupon bonds, callable

4. Invoice price = Asked price + Accrued interest

5. Appreciation / Depreciation and: Market interest rates, Taxes, Risk (Adjustment)



1. Coverage ratios

2. Leverage ratios

3. Liquidity ratios

4. Profitability ratios

5. Cash flow to debt ratio

6. Altman's formula (Z-score) for predicting bankruptcies:

Z = 3.3 × EBIT / TOTAL ASSETS + 0.999 × SALES / TOTAL ASSETS + 0.6 × MARKET VALUE OF EQUITY / TOTAL LIABILITIES + 1.4 × RETAINED EARNINGS / TOTAL ASSETS + 1.2 × WORKING CAPITAL / TOTAL ASSETS
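Altman's Z-score as a function, using the classic 1968 coefficients; the firm's figures below are hypothetical. Historically, a Z above roughly 3.0 indicated low bankruptcy risk and below roughly 1.8, high risk:

```python
def altman_z(wc, re, ebit, mve, sales, assets, liabilities):
    # wc = working capital, re = retained earnings, mve = market value of equity
    return (3.3 * ebit / assets
            + 0.999 * sales / assets
            + 0.6 * mve / liabilities
            + 1.4 * re / assets
            + 1.2 * wc / assets)

z = altman_z(wc=50, re=100, ebit=60, mve=400, sales=600,
             assets=500, liabilities=250)
assert 1.8 < z < 3.0   # this hypothetical firm lands in the "grey zone"
```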






1. Macroeconomy - the economic environment in which all the firms operate

2. Macroeconomic Variables:

GDP (Gross Domestic Product) or Industrial Production - vs. GNP

Employment (unemployment, underemployment) rate(s)

Factory Capacity Utilization Rate

Inflation (vs. employment, growth)

Interest rates (=the discount factor in present-value calculations)

Budget deficit (and its influence on interest rates & private borrowing)

Current account & Trade deficit (and exchange rates)

"Safe Haven" attributes (and exchange rates)

Exchange rates (and foreign trade and inflation)

Tax rates (and investments / allocation, and consumption)

Sentiment (and consumption, and investment)

3. Demand and Supply shocks

4. Fiscal and Monetary policies

5. Leading, coincident and lagging indicators

6. Business cycles:

Sensitivity (elasticity) of sales

Operating leverage (fixed to variable costs ratio)

Financial leverage



1. Return On Investment (ROI) = Interest + Capital Gains

2. Zero coupon bond:

Pb = PAR / (1+i)^n

3. Bond prices change according to interest rates, time, taxation and to expectations about default risk, callability and inflation

4. Coupon bonds = a series of zero coupon bonds

5. Duration = the average maturity of a bond's cash flows, weighting each payment by its share of the total value of the bond.

wt = [ CFt / (1+y)^t ] / Pb        Σ wt = 1        Pb = bond price

Macaulay's formula: D = Σ t × wt    (where the yield curve is flat!)

6. Duration:

  1. Summary statistic of effective average maturity.
  2. Tool in immunizing portfolios from interest rate risk.
  3. Measure of sensitivity of portfolio to changes in interest rates.

7. ΔP/P = -D × [ Δ(1+y) / (1+y) ] = [ -D / (1+y) ] × Δ(1+y) = -D* × Δy    (D* = modified duration)
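Macaulay and modified duration, computed for a hypothetical 3-year, 8%-coupon, $1,000-par bond at y = 10% (a flat yield curve assumed), with a first-order check of the sensitivity rule:

```python
# Returns (Macaulay duration, price) for an annual-coupon bond.
def macaulay_duration(coupon, par, y, n):
    flows = [coupon] * (n - 1) + [coupon + par]            # CF_t for t = 1..n
    pv = [cf / (1 + y) ** t for t, cf in enumerate(flows, start=1)]
    price = sum(pv)
    weights = [x / price for x in pv]                      # the w_t, summing to 1
    return sum(t * w for t, w in enumerate(weights, start=1)), price

d, price = macaulay_duration(80, 1000, 0.10, 3)
d_mod = d / 1.10                                           # D* = D / (1+y)

# dP/P is approximately -D* x dy for a small yield change.
dy = 0.0001
_, bumped = macaulay_duration(80, 1000, 0.10 + dy, 3)
assert abs((bumped - price) / price + d_mod * dy) < 1e-6
```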

8. The EIGHT duration rules

  1. The duration of a zero coupon bond = its time to maturity.
  2. When maturity is constant, a bond's duration is higher when the coupon rate is lower.
  3. When the coupon rate is constant, a bond's duration increases with its time to maturity.

Duration always increases with maturity for bonds selling at par or at a premium.

With deeply discounted bonds, duration can decrease with maturity.

  4. Other factors being constant, the duration of a coupon bond is higher when the bond's YTM is lower.
  5. The duration of a level perpetuity = (1+y) / y
  6. The duration of a level annuity = (1+y)/y - T / [(1+y)^T - 1]
  7. The duration of a coupon bond = (1+y)/y - [(1+y) + T(c-y)] / {c[(1+y)^T - 1] + y}
  8. The duration of a coupon bond selling at par = (1+y)/y × [1 - 1/(1+y)^T]

9. Passive bond management - control of the risk, not of prices.

- indexing (market risk)

- immunization (zero risk)

10. Some are interested in protecting the current net worth - others in the future payments (=the future worth).

11. BANKS: mismatch between maturities of liabilities and assets.

Gap Management: certificates of deposits (liability side) and adjustable rate mortgages (assets side)

12. Pension funds: the value of income generated by assets fluctuates with interest rates

13. Fixed income investors face two types of risks:

Price risk

Reinvestment (of the coupons) rate risks

14. If duration selected properly the two effects cancel out.

For a horizon equal to the portfolio's duration - price and re-investment risks cancel out.

15. BUT: Duration changes with yield - requiring rebalancing

16. BUT: Duration will change because of the passage of time (it decreases less rapidly than maturity)

17. Cash flow matching - buying zeros or bonds yielding coupons equal to the future payments (a dedication strategy)

18. A pension fund resembles a level perpetuity; its duration follows the perpetuity rule: D = (1+y) / y.

19. There is no immunization against inflation (except indexation).

20. Active bond management

- Increase / decrease duration if interest rate declines / increases are forecast

- Identifying relative mispricing

21. The Homer - Leibowitz taxonomy:

  1. Substitution swap - replacing one bond with an identical one.
  2. Intermarket spread swap - when the yield spread between two sectors of the bond market is too wide.
  3. Rate anticipation swap - changing duration according to the forecasted interest rates.
  4. Pure yield pickup swap - holding the higher-yield bond.
  5. Tax swap - intended to exploit tax advantages.

22. Contingent immunization (Leibowitz - Weinberger):

Active management until portfolio drops to

minimum future value / (1+I)^T = Trigger value

if portfolio drops to trigger value - immunization.

23. Horizon Analysis

Select a Holding Period

Predict the yield curve at the end of that period

[We know the bond's time to maturity at the end of the holding period]

We can read its yield from the projected yield curve - and determine its price

24. Riding the yield curve

If the yield curve is upward sloping and it is projected not to shift during the investment horizon as maturities fall (=as time passes) - the bonds will become shorter - the yields will fall - capital gains

Danger: Expectations that interest rates will rise.



1. Between two parties exposed to opposite types of interest rate risk.


                      SNL                          Corporation

Liabilities:          Short term,                  Long term,
                      variable rate                fixed rate

Assets:               Long term,                   Short term,
                      fixed rate                   variable rate

Risk:                 Rising interest rates        Falling interest rates

2. The Swap

SNL would make fixed rate payments to the corporation based on a notional amount

Corporation will pay SNL an adjustable interest rate on the same notional amount

3. After the swap


SNL assets: long term loans plus (claim to) variable-rate cash flows from the swap

SNL liabilities: short term deposits plus (obligation to) make fixed cash payments

SNL net worth


Corporation assets: short term assets plus (claim to) fixed cash flows from the swap

Corporation liabilities: long term bonds plus (obligation to) make variable-rate payments

Corporation net worth

William Sharpe, John Lintner, Jan Mossin

1. Capital Asset Pricing Model (CAPM) predicts the relationship between an asset's risk and its expected return = benchmark rate of return (investment evaluation) = expected returns of assets not yet traded

2. Assumptions

[Investors differ in wealth and risk aversion] but:

  1. Investor's wealth is negligible compared to the total endowment;
  2. Investors are price - takers (prices are unaffected by their own trade);
  3. All investors plan for one, identical, holding period (myopic, suboptimal behaviour);
  4. Investments are limited to publicly traded financial assets and to risk free borrowing / lending arrangements;
  5. No taxes on returns, no transaction costs on trades;
  6. Investors are rational optimizers (mean variance - Markowitz portfolio selection model);
  7. All investors analyse securities the same way and share the same economic view of the world → homogeneous expectations = identical estimates of the probability distribution of the future cash flows from investments.

3. Results

  1. All the investors will hold the market portfolio.
  2. The market portfolio is the best, optimal and efficient one.

A passive (holding) strategy is the best.

Investors vary only in how they allocate their wealth between risky and risk-free assets.

  3. The risk premium on the market portfolio will be proportional to:

its risk

and the investor's risk aversion

  4. The risk premium on an individual asset will be proportional to the risk premium on the market portfolio

and the beta coefficient of the asset (relative to the market portfolio).

Beta measures the extent to which returns on the stock and the market move together.

4. Calculating the Beta

  1. The graphic method

The line from which the sum of squared deviations of returns is lowest.

The slope of this line is the Beta.

  2. The mathematical method


βi = Cov(ri, rm) / σm² = [ Σ (t=1..n) (yti - ȳi)(ytm - ȳm) ] / [ Σ (t=1..n) (ytm - ȳm)² ]

5. Restating the assumptions

  1. Investors are rational
  2. Investors can eliminate risk by diversification

- sectoral

- international

  3. Some risks cannot be eliminated - all investments are risky
  4. Investors must earn excess returns for their risks (=reward)
  5. The reward on a specific investment depends only on the extent to which it affects the market portfolio risk (Beta)

6. Diversified investors should care only about risks related to the market portfolio.




An investment with Beta 1/2 should earn half the market's risk premium -

with Beta 2 - twice the market's risk premium.

7. Recent research discovered that Beta does not work.

A better measure:

B / M

(Book Value) / (Market Value)

8. If Beta is irrelevant - how should risks be measured?

9. NEER (New Estimator of Expected Returns):

The B to M ratio captures some extra risk factor and should be used with Beta.

10. Other economists: There is no risk associated with high B to M ratios.

Investors mistakenly underprice such stocks and so they yield excess returns.

11. FAR (Fundamental Asset Risk) - Jeremy Stein

There is a distinction between:

  1. Boosting a firm's long term value and
  2. Trying to raise the share's price

If investors are rational:

Beta cannot be the only measure of risk → we should stop using it

Any decision boosting (A) will affect (B) → (A) and (B) are the same

If investors are irrational:

Beta is still right (it captures an asset's fundamental risk = its contribution to the market portfolio risk) → we should keep using it, even though investors are irrational

But if investors are making predictable mistakes, a manager must choose:

If he wants (B) → NEER (accommodating investors' expectations)

If he wants (A) → BETA


1. Efficient market hypothesis - share prices reflect all available information

2. Weak form

Are past prices reflected in present prices?

No price adjustment period - no chance for abnormal returns

(prices reflect information in the time that it takes to decipher it from them)

If we buy after the price has changed - will we have abnormal returns?

Technical analysis is worthless

3. Semistrong form

Is publicly available information fully reflected in present prices?

Buying immediately after news means paying, on average, a price that has already converged to equilibrium

Public information is worthless

4. Strong form

Is all information - public and private - reflected in present prices?

No investor can properly evaluate a firm

All information is worthless

5. Fair play - no way to use information to make abnormal returns

An investor that has information will estimate the yield and compare it to the equilibrium yield. The deviation of his estimates from equilibrium cannot predict his actual yields in the future.

His estimate could be > equilibrium > actual yield or vice versa. On average, his yield will be commensurate with the risk of the share.

6. Two basic assumptions

  1. Yields are positive
  2. High / low yields indicates high / low risk

7. If (A) is right, past prices contain no information about the future

8. Random walk

  1. Prices are independent (Monte Carlo fallacy)
  2. Prices are equally distributed in time

9. The example of the quarterly increase in dividends

10. The Rorschach Blots fallacy (patterns on random graphical designs)

→ cycles (Kondratieff)

11. Elton - Gruber experiments with series of random numbers

12. Price series and random numbers yield similar graphs

13. The Linear model

P(a) - P(a-1) = ( E(ΔP) + ΔP / E(ΔP) ) × ( P(a-1-c) - P(a-2-c) + R )

P = Price of share

C = Counter (lag)

E(ΔP) = Expected change in Price

ΔP = Previous change in price

R = Random number

14. The Logarithmic model

log(CPn) / log(CPn-1) = cumulative yield

Sometimes, instead of CP, we use: ΔP / (div/P)

15. Cluster analysis (Fama - MacBeth)

+ and - distributed randomly. No statistical significance.

16. Filter models - share prices will fluctuate around equilibrium because of profit taking and bargain hunting

17. New equilibrium is established by breaking through trading band

18. Timing - percentage of break through determines buy / sell signals

19. Filters effective in BEAR markets but equivalent to random portfolio management

20. Fama - Blume: the best filter is the one that just covers transaction costs

21. Relative strength models - RS = P / average P (price relative to its average over the period)

Divide the investment equally between the top 5% of shares with the highest RS and an RS of no less than 0.7

Sell shares falling below this benchmark and divide the proceeds among others

22. Reservations:

  1. High RS shares are the riskiest
  2. The group selected yields the same as the market - but with higher risk


1. Versus fundamental: dynamic (trend) vs. static (value)

2. Search for recurrent and predictable patterns

3. Patterns are adjustment of prices to new information

4. In an efficient market there is no such adjustment, all public information is already in the prices

5. The basic patterns:

  1. momentum
  2. breakaway
  3. head and shoulders → chartists

6. Buy/sell signals

Example: Piercing the neckline of Head and Shoulders

7. The Dow theory uses the Dow Jones industrial average (DJIA) as key indicator of underlying trends + DJTransportation as validator

8. Primary trend - several months to several years

Secondary (intermediate) trend - deviations from primary trend: 1/3, 1/2, 2/3 of preceding primary trend

Correction - return from secondary trend to primary trend

Tertiary (minor) trend - daily fluctuations

9. Channel - tops and bottoms moving in the direction of primary trend

10. Technical analysis is a self fulfilling prophecy - but if everyone were to believe in it and to exploit it, it would self destruct.

People buy close to resistance because they do not believe in it.

11. The Elliott Wave theory - five basic steps, a fractal principle

12. Moving averages - version I - true value of a stock is its average price

prices converge to the true value

version II - crossing the price line with the moving

average line predicts future prices

13. Relative strength - compares performance of a stock to its sector or to the performance of the whole market

14. Resistance / support levels - psychological boundaries to price movements assumes market price memory

15. Volume analysis - comparing the volume of trading to price movements. High volume in upturns combined with low volume in down moves signals a trend reversal.

16. Trin (trading index) = (advancing issues / declining issues) ÷ (advancing volume / declining volume)

Trin > 1 Bearish sign

17. BEAR / Bull markets - down/up markets disturbed by up/down movements

18. Trendline - price moves up to 5% of average

19. Square - horizontal transition period separating price trends (reversal patterns)

20. Accumulation pattern - reversal pattern between BEAR and BULL markets

21. Distribution pattern - reversal pattern between BULL and BEAR markets

22. Consolidation pattern - if underlying trends continues

23. Arithmetic versus logarithmic graphs

24. Seesaw - non-breakthrough penetration of resistance / support levels

25. Head and shoulder formation (and reverse formation):

Small rise (decline), followed by big rise (decline), followed by small rise (decline).

First shoulder and head-peak (trough) of BULL (BEAR) market.

Volume very high in 1st shoulder and head and very low in 2nd shoulder.

26. Neckline - connects the bottoms of two shoulders.

Signals change in market direction.

27. Double (Multiple) tops and bottoms

Two peaks separated by trough = double tops

Volume lower in second peak, high in penetration

The reverse = double bottoms

28. Expanding configurations

Price fluctuations so that price peaks and troughs

can be connected using two divergent lines.

Shoulders and head (last).

Sometimes, one of the lines is straight:

UPPER line straight (lower line sloping down) - accumulation; volume rises on penetration

LOWER line straight (upper line sloping up) - a 5% penetration signals reversal

29. Conservative upper expanding configuration

Three tops, each higher than the previous

Separated by two troughs, each lower than the previous

Signals peaking of the market

A 5% move below the sloping trendline connecting the two troughs,

or below the second trough, signals reversal

30. Triangles - consolidation / reversal patterns

31. Equilateral and isosceles triangle (COIL - the opposite of expansion configuration)

Two (or more) up moves + reactions

Each top lower than previous - each bottom higher than previous

connecting lines converge

Prices and volume strongly react on breakthrough

32. Triangles are accurate when penetration occurs

Between 1/2 - 3/4 of the distance between the most congested peak and the highest peak.

33. Right angled triangle

A special case of the isosceles triangle.

Often turn to squares.

34. Trendlines

Connect rising bottoms or declining tops (in Bull market)

Horizontal trendlines

35. Necklines of H&S configurations

And the upper or lower boundaries of a square are trendlines.

36. Upward trendline is support

Declining trendline is resistance

37. A trendline's significance depends on the ratio of penetrations to the number of times the trendline was touched without being penetrated

Also: the time length of the trendline

and its steepness (gradient, slope)

38. The penetration of a steep trendline is less meaningful and the trend will prevail.

39. Corrective fan

At the beginning of Bull market - first up move steep, price advance unsustainable.

This is a reaction to previous downmoves and trendline violated.

New trendline constructed from bottom of violation (decline) rises less quickly, violated.

A decline leads to third trendline.

This is the end of the Bull market

(The reverse is true for Bear market.)

40. Line of return - parallel to upmarket trendline, connects rising tops (in uptrends) or declining bottoms (in downtrends).

41. Trend channel - the area between trendlines and lines of return.

42. Breach of line of return signals (temporary) reversal in basic trend.

43. Simple moving average

Average of the last N days, where each new datum replaces the oldest. The MA changes direction after a peak / trough.

44. Price < MA → Decline

Price > MA → Upturn

45. MA at times support in Bear market

resistance in Bull market

46. Any break through MA signals change of trend.

This is especially true if MA was straight or changed direction before.

If broken through while the trend continues - a warning.

We can be sure only when MA straightens or changes.
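The MA rules in points 43-46 can be sketched in Python. A minimal illustration; the function names are hypothetical, not from any charting library:

```python
def moving_average(prices, n):
    # Simple n-period moving average (point 43): each value averages the
    # latest n prices, the newest datum replacing the oldest.
    return [sum(prices[i - n + 1:i + 1]) / n for i in range(n - 1, len(prices))]

def ma_signal(price, ma):
    # Point 44: price above the MA suggests an upturn, below it a decline.
    if price > ma:
        return "up"
    if price < ma:
        return "down"
    return "flat"
```

As point 46 warns, a crossover alone is only a warning sign; confirmation comes when the MA itself flattens or turns.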

47. MA of 10-13 weeks secondary trends

MA of 40 weeks primary trends

Best combination: 10+30 weeks

48. Interpretation

30w down, 10w < 30w downtrend

30w up, 10w > 30w uptrend

49. 10w up, 30w down (in Bear market)

10w down, 30w up (in Bull market)

No significance

50. MAs are very misleading when the market stabilizes, and their signals come very late.

51. Weighted MA (1st version)

Emphasis placed on the middle (7th) week of a 13w MA (wrong - delays warnings)

Emphasis placed on the last weeks of the 13w MA

52. Weighted MA (2nd version)

Multiplication of each datum by its serial number.

53. Weighted MA (3rd version)

Adding a few data more than once.

54. Weighted MAs are autonomous indicators - without being crossed with other MAs.

55. Exponential MA - algorithm

  1. Simple 20w MA
  2. Difference between 21st datum and MA multiplied by exponent (2/N) = result 1
  3. Result 1 added to MA
  4. If difference between datum and MA negative - subtract, not add
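The four steps above collapse to a single update rule. A sketch in Python, keeping the text's exponent 2/N (2/(N+1) is the other common convention):

```python
def exponential_ma(data, n=20):
    # Seed with a simple n-period MA (step 1), then move the running average
    # toward each new datum by the exponent 2/n. Steps 2-4 collapse to
    # ema += (2/n) * (datum - ema): a negative difference subtracts automatically.
    ema = sum(data[:n]) / n
    k = 2 / n
    out = [ema]
    for x in data[n:]:
        ema += k * (x - ema)
        out.append(ema)
    return out
```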

56. Envelopes

Symmetrical lines parallel to MA lines (which are the centre of trend) give a sense of the trend and allow for fatigue of market movement.

57. Momentum

Division of current prices by prices a given time ago

Momentum is straight when prices are stable

When momentum > reference and rising - market up (Bull)

When momentum > reference and falling - Bull market stabilizing

When momentum < reference and falling - market down (Bear)

When momentum < reference and rising - Bear market stabilizing
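Point 57's ratio is a one-liner; a minimal sketch, with 1.0 as the reference level for stable prices:

```python
def momentum(prices, k):
    # Current price divided by the price k periods ago.
    # Equal to 1.0 (the reference) when prices are stable.
    return prices[-1] / prices[-1 - k]
```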

58. Oscillators measure the market internal strengths:

59. Market width momentum

Measured with advance / decline line of market

(=the difference between rising / falling shares)

When it separates from the index - imminent reversal

momentum = no. of rising shares / no. of declining shares

60. Index to trend momentum

Index divided by MA of index

61. Fast lines of resistance (Edson Gould)

The supports / resistances will be found in 1/3 - 2/3 of previous price movement.

Breakthrough means new tops / bottoms.

62. Relative strength

Does not indicate direction - only strength of movement.

More Technical Analysis:

1. Williams %R = 100 × (Hr - C) / (Hr - Lr)

r = time frame; Hr / Lr = highest high / lowest low over the last r periods; C = latest close
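A sketch of the %R computation in Python (note that many charting packages negate the result so it runs from 0 to -100):

```python
def williams_r(highs, lows, closes, r=14):
    # Williams %R over the last r bars:
    # 100 * (highest high - latest close) / (highest high - lowest low).
    hh = max(highs[-r:])
    ll = min(lows[-r:])
    return 100 * (hh - closes[-1]) / (hh - ll)
```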

2. The Williams trading signals:

  1. Divergence:

    1. Bearish - WM%R rises above the upper reference line, then cannot rise above the line during the next rally

    2. Bullish - WM%R falls below the lower reference line, then cannot decline below the line during the next slide

  2. Failure swing

When WM%R fails to rise above the upper reference line during a rally, or to fall below the lower reference line during a decline

3. Stochastic

A fast line (%K) + slow line (%D)


  1. Calculate raw stochastic (%K) = 100 × (C - Ln) / (Hn - Ln)

n = number of time units (normally 5); C = latest close; Hn / Ln = highest high / lowest low over the last n units

  2. %D = 100 × Σ(C - Ln) / Σ(Hn - Ln), summed over 3 units (smoothing)

4. Fast stochastic

%K + %D on same chart (%K similar to WM%R)

5. Slow stochastic

%D smoothed using same method
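A sketch of the raw %K and a smoothed %D. Here %D is a plain average of recent %K values, a common simplification of Lane's original smoothing (which sums numerator and denominator separately):

```python
def raw_stochastic(highs, lows, closes, n=5):
    # %K: where the latest close sits in the n-period high-low range, in percent.
    hh, ll = max(highs[-n:]), min(lows[-n:])
    return 100 * (closes[-1] - ll) / (hh - ll)

def percent_d(k_values, m=3):
    # %D as an m-period average of recent %K values (simplified smoothing).
    return sum(k_values[-m:]) / m
```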

6. Stochastic trading signals

  1. Divergence
  1. Bullish

Prices fall to new low

Stochastic traces a higher bottom than during previous decline

  1. Bearish

Prices rally to new high

Stochastic traces a lower top than during previous rally

  1. Overbought / Oversold
  1. When stochastic rallies above upper reference line - market O/B
  2. When stochastic falls below lower reference line - market O/S
  1. Line direction

When both lines are in same direction - confirmation of trend

7. Four ways to measure volume

  1. No. of units of securities traded
  2. No. of trades
  3. Tick volume
  4. Money volume

8. OBV Indicator (on-balance volume)

Running total of volume with +/- signs according to price changes
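The OBV running total can be sketched directly from that definition:

```python
def on_balance_volume(closes, volumes):
    # Running total of volume: added on up-closes, subtracted on down-closes,
    # unchanged when the close is flat.
    obv = [0]
    for i in range(1, len(closes)):
        if closes[i] > closes[i - 1]:
            obv.append(obv[-1] + volumes[i])
        elif closes[i] < closes[i - 1]:
            obv.append(obv[-1] - volumes[i])
        else:
            obv.append(obv[-1])
    return obv
```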

9. Combined with:

  1. The Net Field Trend Indicator

(OBV calculated for each stock in the index and then rated +1, -1, 0)

  2. Climax Indicator

The sum of the Net Field Trend Indicators

10. Accumulation / Distribution Indicator

A/D = [(C - O) / (H - L)] × V (C = close, O = open, H = high, L = low, V = volume)

11. Volume accumulator

Uses P instead of O.

12. Open Interest

Number of contracts held by buyers or owed by short sellers in a given market on a given day.

13. Herrick Payoff Index (HPI)

HPI = Ky + (K' - Ky)

K = [(P - Py) × C × V] × [1 ± (2 × |I - Iy| / G)]

G= today's or yesterday's I (=open interest, whichever is less)

+/- determined: if P > Py (+), if P < Py (-)

Annex: The Foundations of Common Investment schemes Challenged

The credit and banking crisis of 2007-9 has cast doubt on the three pillars of modern common investment schemes. Mutual funds (known in the UK as "unit trusts"), hedge funds, and closed-end funds all rely on three assumptions:

Assumption number one

That risk inherent in assets such as stocks can be "diversified away". If one divides one's capital and invests it in a variety of financial instruments, sectors, and markets, the overall risk of one's portfolio of investments is lower than the risk of any single asset in said portfolio.

Yet, in the last decade, markets all over the world have moved in tandem. These highly-correlated ups and downs gave the lie to the belief that they were in the process of "decoupling" and could, therefore, be expected to fluctuate independently of each other. What the crisis has revealed is that contagion transmission vectors and mechanisms have actually become more potent as barriers to flows of money and information have been lowered.

Assumption number two

That investment "experts" can and do have an advantage in picking "winner" stocks over laymen, let alone over random choices. Market timing, coupled with access to information and analysis, was supposed to guarantee the superior performance of professionals. Yet, it didn't.

Few investment funds beat the relevant stock indices on a regular, consistent basis. The yields on "random walk" and stochastic (random) investment portfolios often surpass managed funds. Index or tracking funds (funds that automatically invest in the stocks that compose a stock market index) are at the top of the table, leaving "stars", "seers", "sages", and "gurus" in the dust.

This manifest market efficiency is often attributed to the ubiquity of capital pricing models. But, the fact that everybody uses the same software does not necessarily mean that everyone would make the same stock picks. Moreover, the CAPM and similar models are now being challenged by the discovery and incorporation of information asymmetries into the math. Nowadays, not all fund managers are using the same mathematical models.

A better explanation for the inability of investment experts to beat the overall performance of the market would perhaps be information overload. Recent studies have shown that performance tends to deteriorate in the presence of too much information.

Additionally, the failure of gatekeepers - from rating agencies to regulators - to force firms to provide reliable data on their activities and assets led to the ascendance of insider information as the only credible substitute. But, insider or privileged information proved to be as misleading as publicly disclosed data. Finally, the market acted more on noise than on signal. As we all know, noise is perfectly randomized. Expertise and professionalism mean nothing in a totally random market.

Assumption number three

That risk can be either diversified away or parceled out and sold. This proved to be untenable, mainly because the very nature of risk is still ill-understood: the samples used in various mathematical models were biased as they relied on data pertaining only to the recent bull market, the longest in history.

Thus, in the process of securitization, "risk" was dissected, bundled and sold to third parties who were equally at a loss as to how best to evaluate it. Bewildered, participants and markets lost their much-vaunted ability to "discover" the correct prices of assets. Investors and banks got spooked by this apparent and unprecedented failure and stopped investing and lending. Illiquidity and panic ensued.

If investment funds cannot beat the market and cannot effectively get rid of portfolio risk, what do we need them for?

The short answer is: because it is far more convenient to get involved in the market through a fund than directly. Another reason: index and tracking funds are excellent ways to invest in a bull market.

Statistics Primer

Categorical measures (categorical, qualitative, classification variables): binary, nominal (order meaningless), ordinal (order meaningful)


Continuous measures: interval vs. ratio


Discrete measures (whole numbers)


Data can be listed, arranged as frequency scores, in histograms (bar charts).


Normal (Gaussian) distribution


Skewed (asymmetrical) distribution: positive (rises rapidly, drops slowly), negative (rises slowly, drops rapidly)


Floor effect (depression), ceiling effect (math problems)


Kurtosis (not bell shaped): positive (too many scores in the tails, leptokurtic), negative (too few in the tails, platykurtic)




Measures of central tendency:


Mean (x̄ = Σx / N) requires a symmetrical distribution (no outliers) and interval or ratio data


Median (when the distribution is asymmetrical or the data ordinal): with an odd number of scores, the middle score; with an even number of scores, the mean of the two middle scores


Mode: most frequent score in distribution, most frequent observation among scores


Dispersion (spread):


Range: distance between highest and lowest score


Inter-quartile range (IQR): used with ordinal data or with non-normal distributions; the distance between the upper quartile (a quarter of values above it) and the lower quartile (three quarters of values above it). (The 3 quartiles of a variable divide it into 4 groups; the median is the 2nd quartile.)


Semi IQR=IQR/2


(Sample) Standard deviation = square root of [the sum of (each score minus the mean) squared, divided by N-1]. (For the population SD, divide by N.)


Boxplot (box and whisker plot): the median is represented as a thick line, the upper and lower quartiles as a box around the median; the whiskers extend to the maximum and minimum points that are not outliers (an outlier lies more than 1.5 box-lengths from the box).


Population: descriptive statistics and inferential statistics


Representative sample is random: each member of the population has an equal chance of being selected, and the selection of one member does not affect the chances of any other member being selected. But a truly random sample is often impossible, so a volunteer sample or snowball (referral) sample is used.


Uniform distribution (number on dice) vs. normal (sums of 2 numbers on multiple dice)


In histogram represented as line chart, with continuous variable on x axis and frequency density on y axis, the number of people with any score or range of scores equals the area of the chart. Probability is this area divided by the total area.


Normal distribution is defined by its mean and standard deviation. Half the scores are above the mean and half under it (probability=0.5). 95.45% of cases within 2 SDs of mean. 47.72% between mean and -2SDs. 49.9% between mean and +3SDs. 2.27% lie more than 2 SDs below the mean.


Z-score presented in terms of SDs above the mean = score minus mean divided by standard deviation. One SD above mean=z-score +1. Look at a z table for probability of getting that z score.
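The z-score formula as a sketch:

```python
def z_score(score, mean, sd):
    # Number of standard deviations a score lies above (positive)
    # or below (negative) the mean.
    return (score - mean) / sd
```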


Sampling distribution of the mean: probability of different values plotted as normal distribution.


Central limit theorem: the sampling distribution of sample means is normal when the sample is big (t-shaped when the sample is small). Put differently: if the distribution in the population is normal, the sampling distribution will be bell-curved; if it is not normal but the sample is large, the sampling distribution will still be approximately normal (t-shaped for small samples).


If we carry out the same experiment over and over again, we will get a different sample mean each time. The standard deviation of this collection of sample means is the standard error (se) of the mean: the standard deviation of the sampling distribution of the mean. The bigger the sample size, the closer the sample mean is to the population mean and the smaller the standard error. Bigger variation means greater uncertainty as to the population mean. The standard error of the mean of variable X equals its standard deviation divided by the square root of the number of cases in the sample. To halve the standard error, the sample size needs to increase fourfold.
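The last two sentences can be checked directly:

```python
from math import sqrt

def standard_error(sd, n):
    # Standard error of the mean: SD / sqrt(N).
    # Quadrupling the sample size halves it.
    return sd / sqrt(n)
```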




If we carry out a study a number of times, we get a range of mean scores that make up a sampling distribution which is normal if the sample is large and t-shaped if the sample is small.


We know the standard deviation of the sampling distribution of the mean (=standard error).


If we know the mean and we know the standard deviation, we know the probability of any value.




Null hypothesis (Ho)


Alternative hypothesis (H1)


Probability value: probability of a result if the null hypothesis is true NOT the probability that the null hypothesis is true.


One-tailed, directional hypothesis: probability p=1 divided by 2 to the power of k (number of tosses)


We use two-tailed (non-directional) hypothesis: p=1 divided by 2 to the power of k-1 (2 to the power of zero equals 1 which is why p=1)


Alpha: cut-off rate for rejecting the null hypothesis as false = 0.05 (5%, 1 in 20). Below alpha is statistically significant. Rejecting the null hypothesis does not mean adopting the alternative hypothesis (not proof, just evidence). A p value above 0.05 doesn't prove that the null hypothesis is true, only that it cannot be rejected.


Type I error: rejecting the null hypothesis when it is actually true. If p=0.05: there is a 5% chance of the result occurring if the null hypothesis is true; equivalently, a 5% probability that the decision to reject the null hypothesis is wrong. Type II error: failing to reject the null hypothesis when it is false.


If the population mean has a particular value, what is the probability that I would find a value as far from that value as in my sample, or further? For a small sample t=score (mean in sample) minus (suspected population) mean divided by se. Look at a t table for probability of getting that t score.


Degrees of freedom (df)=N-1: what proportion of cases lie more than a certain number of SDs above the mean in a t distribution with N-1 df.


Confidence interval: the likely range of the population value, between the confidence limits (usually 95%, alpha 0.05, or significance level 0.05). We know the sample mean and the standard error (standard deviation of the sampling distribution). We need to know the values that cover 95% of the population in a t distribution with a certain number of df (what t value gives these values).


CI = mean plus/minus the t value for alpha 0.05 multiplied by the standard error (two calculations, for the UCL and LCL). Interpretation: in 95% of studies, the population mean will be within the confidence limits.


In 95% of studies, the true value is contained within the confidence intervals. In 5% of studies, the true value is therefore not contained within the confidence intervals. In 5% of studies, the result will be statistically significant, when the null hypothesis is true. If the confidence intervals contain the null hypothesis, the result is not statistically significant. If the confidence intervals do not contain the null hypothesis, the result is statistically significant. Confidence intervals and statistical significance are therefore two sides of the same coin.


Experiments with repeated measures (within subjects) design: scores of individuals in one condition against their scores in another condition (people are their own control group).


Problems: practice effects (solutions: counterbalancing and practice items); sensitization; carry over effects.


Related design: people in two groups are closely related.


Cross-over design studies: when people cross from one group to the other.


Correlational design: did people who scored high in one test score high on the second test? Not interested in whether the scores overall went up or down. Repeated measures: did people score higher on one occasion rather than the other? Not interested if people who scored high the first time scored high the second time as well.


What statistical test should I use? (statsols.com)


Which Statistics Test Should I Use? (socscistatistics.com)


Parametric tests (like t test) make inferences about population parameters.


Non-parametric tests (like Wilcoxon) use ranking, use data measured on an ordinal scale (we don’t know the size of the gaps, just the order of the scores).


T-Test (Repeated Measures)


Repeated measures t-test: when data are measured on continuous (interval) level and the differences between the two scores are normally distributed (even if the variables are not normally distributed). Makes no assumption about the variances of the variables (like the independent samples t-test).


1. Calculate the difference between the scores for each person.

2. Calculate the mean and standard deviation of the difference.

3. Calculate the standard error of the difference (SD divided by square root of N), using the result from step 2.

4. Calculate the confidence interval for the difference, using the result from step 3.

5. Calculate the statistical significance of the difference, using the result from step 3.


Step 4:


We need to know what confidence intervals we are interested in. The answer is almost always the 95% confidence intervals, so α (alpha) is equal to 0.05. We need the cutoff value for t, at the 0.05 level. We can use a table or a computer program to do this, but first we need to know the degrees of freedom (df). In this case, df = N-1, so we have 15 df. With 15 df, the 0.05 cut-off for t is 2.131. The equation has a ± symbol in it. This means that we calculate the answer twice, once for the lower confidence interval and once for the upper confidence interval.

CI=mean plus/minus t alpha multiplied by standard error

Lower CI = -0.031 - 2.131 × 0.221 = -0.502

Upper CI = -0.031 + 2.131 × 0.221 = 0.440

Given that the confidence interval crosses zero, we cannot reject the null hypothesis that there is no difference between the two conditions.


Step 5


Find the t value=mean of differences divided by standard error of the differences.

A table or computer tells us the probability p of getting a value of t at least as large as the calculated value with N-1 degrees of freedom. If it is higher than 0.05, we cannot reject the null hypothesis.
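Steps 1-3 and 5 can be sketched in Python (looking up the p value in a t table is left out):

```python
from math import sqrt

def paired_t(scores_a, scores_b):
    # Repeated-measures t: mean of the per-person differences
    # divided by the standard error of the differences.
    d = [a - b for a, b in zip(scores_a, scores_b)]  # step 1
    n = len(d)
    mean = sum(d) / n                                 # step 2
    sd = sqrt(sum((x - mean) ** 2 for x in d) / (n - 1))
    se = sd / sqrt(n)                                 # step 3
    return mean / se                                  # step 5 (look up p, N-1 df)
```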


Wilcoxon Test


When differences not normally distributed, or measures are ordinal.


Step 1 rank the change scores ignoring the sign (quantity, not whether more or less). If there are tied scores (several identical scores), they are given the mean rank (example: the score 12 is ranked in position 2,3,4 so its mean rank is 3 and it will be assigned the rank 3 wherever it appears).


Step 2 separate the ranks of positive changes from the ranks of negative changes. Add up the 2 columns of ranks. T is the lower of these 2 values.


Step 3 the probability of T according to the sample size (in a table). Or use the normal approximation (convert T to z and look up the p value in a normal distribution table).


z = [T - N(N+1)/4 - 0.5] / SQRT{ N(N+1)(2N+1)/24 - (Σt³ - Σt)/48 }


(0.5 is a continuity correction; t is the number of scores tied at each rank)


The p value associated with z is one tail of the distribution, so must be multiplied by 2.
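Steps 1-2 (ranking with tied scores sharing their mean rank, then taking the smaller rank sum as T) can be sketched as:

```python
def wilcoxon_t(differences):
    # Rank the nonzero change scores by absolute size (ties share their mean
    # rank), sum positive and negative ranks separately, return the smaller sum.
    d = [x for x in differences if x != 0]
    by_size = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(d):
        j = i
        while j + 1 < len(d) and abs(d[by_size[j + 1]]) == abs(d[by_size[i]]):
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the tied 1-based positions
        for k in range(i, j + 1):
            ranks[by_size[k]] = mean_rank
        i = j + 1
    pos = sum(r for r, x in zip(ranks, d) if x > 0)
    neg = sum(r for r, x in zip(ranks, d) if x < 0)
    return min(pos, neg)
```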


Sign Test


Used when we have nominal data with 2 categories and have repeated measures data.


S is the smallest of the obtained values (example: 14 yes and 19 no, S=14).

N is the number of scores that were not tied.

P calculated using S and N in a table.


Independent groups design: comparing 2 or more independent groups (different participants in each group). Quasi experimental design: when people already belong to different categories (men and women, for example). Three kinds of dependent variables: continuous, ordinal, and categorical.


T Test (Independent Groups, between subjects, two samples)


Data measured on continuous (interval) scale, data within each group are normally distributed, and the standard deviations of the two groups are equal (the easier test) - or not equal (a more difficult test).


Step 1 calculate the SD of each group

Step 2 calculate the pooled variance of the difference = [SD1 squared × (n1-1) + SD2 squared × (n2-1)] / (n1+n2-2)

When the sample sizes are the same, this reduces to: SDdiff squared = (SD1 squared + SD2 squared)/2

Step 3 calculate the standard error of the difference = SQRT(SDdiff squared/n1 + SDdiff squared/n2); when the two sample sizes are equal (n each), this is SQRT[(SD1 squared + SD2 squared)/n]

Step4 calculate the CI=diff plus/minus t alpha multiplied by standard error (df is n1+n2-2)

Step 5 calculate t (probability associated with null hypothesis: probability of getting t at least this large if null hypothesis is true).
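Steps 1-3 and 5 of the pooled-variance test as a sketch (the table lookup for p is left out):

```python
from math import sqrt

def pooled_t(group1, group2):
    # Independent-groups t with pooled variance (assumes equal SDs).
    def var(g):  # step 1: sample variance (SD squared) of one group
        m = sum(g) / len(g)
        return sum((x - m) ** 2 for x in g) / (len(g) - 1)
    n1, n2 = len(group1), len(group2)
    pooled = (var(group1) * (n1 - 1) + var(group2) * (n2 - 1)) / (n1 + n2 - 2)  # step 2
    se = sqrt(pooled / n1 + pooled / n2)  # step 3
    diff = sum(group1) / n1 - sum(group2) / n2
    return diff / se  # step 5 (look up p with n1+n2-2 df)
```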


Homogeneity of variance used only in pooled (not unpooled) variance t-test. Pooled for equal sample sizes, unpooled for unequal sample sizes. If Levene’s test gives statistically significant result, the variances (SDs) are not the same and one should use unpooled.


General t-test


t=difference between two means (d)/standard error

CI=d plus minus t of alpha multiplied by standard error


Unpooled (Welch) variance t-test


Unequal sample sizes: comparing two naturally occurring groups, expensive intervention, ethical or recruitment issues.


Step 1 calculate the SDs

Step 2 calculate the SE of difference=SQRT of (SD1 squared/N1+SD2 squared/N2)

Step 3 calculate the degrees of freedom =

(SD1 squared/N1 + SD2 squared/N2) squared

divided by

(SD1 squared/N1) squared/(N1-1) + (SD2 squared/N2) squared/(N2-1)


(See also: Satterthwaite’s method)


Step 4 calculate the confidence intervals (two-tailed alpha)

Step 5 calculate the value of t and find its p value


Cohen’s d: effect size for independent groups t-test: how far apart the means of the two samples are in standard deviation units. d = 2t/(SQRT of df)  df=N1+N2-2

Mann-Whitney U Test


Compares two unrelated groups, non-parametric (no assumptions regarding normal distribution and interval data). N1 number in group with larger ranked total, N2 smaller ranked total.


Step 1 ranking

Step 2 Calculate U1 = N1×N2 + N1(N1+1)/2 - Σ(ranks of group 1)

Step 3 calculate U2 = N1*N2-U1

Step 4 find U (the smaller of U1 or U2)

Step 5 find the p value in a table (if U is larger than the tabled critical value, the result is not statistically significant at the 0.05 level)

When sample large, convert U score to z score and get p value, then multiply by 2 for two-tailed significance:

z = [U - N1×N2/2] / SQRT[N1×N2×(N1+N2+1)/12]
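U can also be computed by counting pairs, which is equivalent to the rank formula in steps 2-4 (a minimal sketch; the large-sample z conversion is left out):

```python
def mann_whitney_u(group_a, group_b):
    # Count the pairs (a, b) in which a beats b, scoring ties as 0.5;
    # U is the smaller of the two counts.
    u1 = sum(1.0 if a > b else 0.5 if a == b else 0.0
             for a in group_a for b in group_b)
    u2 = len(group_a) * len(group_b) - u1
    return min(u1, u2)
```

Dividing the larger count by N1×N2 gives the theta statistic described below.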


t-test tells us whether difference in means between 2 groups is statistically significant. Mann Whitney compares ranks. If the two groups have same distribution shapes, it compares the medians.


Theta=probability (B>A)+0.5*probability (B=A)=U/N1N2. Measures the probability that the score of a randomly selected person from group B will be higher or equal to the score of a randomly selected person from group A.           


Categorical or nominal data: discrete values (yes/no).


Mean is the average of continuous data. Median is used with ordinal data. With nominal data, we provide proportions.


Absolute difference (A-B) and relative difference ((A-B)/A, relative decrease, or (A-B)/B, relative increase).


Odds ratio (OR): ratio between 2 odds.


Odds = number of events : number of nonevents (usually expressed against 1, so only the number of events is mentioned). Odds = p/(1-p).


If data is placed in table cells A-D, OR=AD/BC


Nu is the standard error of the log OR = SQRT(1/A + 1/B + 1/C + 1/D). It relates to a normal distribution, so we need the value associated with the 95% CI cut-off (z of alpha/2, i.e. two-tailed alpha).


Lower confidence limit CL = OR × exp(-z(alpha, two-tailed) × nu). Upper CL = OR × exp(+z(alpha, two-tailed) × nu).
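The OR and its confidence limits as a sketch, for cells A-D of a 2x2 table:

```python
from math import sqrt, exp

def odds_ratio_ci(a, b, c, d, z=1.96):
    # OR = AD/BC; nu = SQRT(1/A + 1/B + 1/C + 1/D);
    # 95% limits are OR * exp(-z * nu) and OR * exp(+z * nu).
    odds_ratio = (a * d) / (b * c)
    nu = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return odds_ratio, odds_ratio * exp(-z * nu), odds_ratio * exp(z * nu)
```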


Chi-square test


Put data in a table with 4 cells and add the totals of both rows and columns.

Calculate E(expected value) for each cell (if null hypothesis were true)=(R*C)/T (R total for given row, C for column, T total).

Calculate differences between data (O or observed values) and E.

Chi squared=sigma of (O-E)squared/E for each of the cells

df=(number of rows-1)*(number of columns-1)

Check in table p value of chi squared as high or higher with given dfs.
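The chi-square steps above, sketched for the 2x2 case (the table lookup for p is left out; df = 1 here):

```python
def chi_square_2x2(a, b, c, d):
    # Expected value per cell = row total * column total / grand total;
    # chi-squared = sum of (O - E)^2 / E over the four cells.
    total = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            e = rows[i] * cols[j] / total
            chi2 += (observed[i][j] - e) ** 2 / e
    return chi2
```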


Use Fisher’s exact test when expected values in 2X2 table are smaller than 5:

P = (A+B)!(C+D)!(A+C)!(B+D)! / [(A+B+C+D)! A! B! C! D!] (! is factorial: multiply every whole number from that number down to 1, so 4! = 4×3×2×1). Calculate for the result of the study and for all results more extreme than the result.


Use (Yates’) continuity correction:

Chi squared = Σ of (|O-E| - 0.5) squared/E (absolute values, ignoring plus or minus signs)


Scatterplot: one variable on x axis, the other on the y axis, point per each person showing scores on both variables.


Summarize the relationship between the 2 variables with a number, calculate Cis for this number, find out if the relationship is statistically significant (probability of finding a relationship that is at least this strong if null hypothesis that there is no relationship in the population is true).


Line of best fit: use the slope to predict the value of one variable based on the score of the other (regression line). A slope of 1.32 means that for every 1-unit move on the x axis, the line rises 1.32 units along the y axis. The height is where the line hits the y axis (the constant, y-intercept or just intercept): the expected score on y when the score on x is zero. Expected y score = intercept (beta0) + slope (beta1) × x score - the regression equation.


Standardized scores, slopes: measured in terms of SDs, not in absolute units.

Pearson Correlation coefficient (r) is parametric (data continuous and normally distributed): standardized slope=(beta*SDx)/SDy (expected increase in one variable when the other variable increases by 1 SD).


SD=spread of points of one variable around mean. Square of SD is variance.


Variance=sigma of (x-mean or d, difference) squared/(N-1)


In regression analysis, instead of d being the difference between score and mean, it is the difference between the expected y value given the x score and the actual y score (the residual). Expected value (y^)=beta0 (intercept)+beta1 (slope)*x. The squared residual plays the role of the squared difference as input in calculating the variance (SD squared).


How large is the VAR of the residuals in terms of the VAR of the y scores? VAR of y scores = sigma of (y score minus mean) squared/(N-1). Residual VAR/y-score VAR=the proportion of VAR that is residual VAR (not explained by x). 1 minus this proportion=the proportion of VAR explained by x. The SQRT of the explained proportion is the correlation coefficient (the standardized slope).


Squaring the correlation gives the proportion of variance in one variable that is explained by the other variable.


Correlation is both a descriptive and an inferential statistic: use a table for the probability value associated with the null hypothesis (r=0), or describe the strength of the relationship between 2 variables (Cohen: r=0.1 small, 0.3 medium, 0.5 large).


Covariance of x,y = sigma of (x-mean of x)(y-mean of y)/(N-1)


Pearson Correlation (r) of x,y = covariance(x,y)/(SDx*SDy) = sigma of [(x minus mean x)(y minus mean y)]/[SQRT of sigma (x minus mean x) squared * SQRT of sigma (y minus mean y) squared]
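Putting the covariance, SD, and correlation formulas together, with small made-up data, and adding the regression slope and intercept formulas given further below (slope=r*SDy/SDx, intercept=mean y-slope*mean x):

```python
import math

# Hypothetical paired scores for 5 people
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Covariance(x, y) = sum of (x - mean x)(y - mean y) / (N - 1)
cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)

# Sample standard deviations (square roots of the variances)
sd_x = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) / (n - 1))
sd_y = math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / (n - 1))

# Pearson r = covariance / (SDx * SDy)
r = cov / (sd_x * sd_y)

# Regression line: slope beta1 = r * SDy / SDx, intercept = mean y - beta1 * mean x
beta1 = r * sd_y / sd_x
beta0 = mean_y - beta1 * mean_x
```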


Confidence intervals: range of a value in the population, not just in a sample.


Step 1 Fisher’s z transformation: transforms the distribution of the correlation into a z distribution (normal, mean 0, SD 1). Z’=0.5*ln[(1+r)/(1-r)]


Step 2 Standard error=1/SQRT (N-3)


Step 3 CI=Z’ plus/minus z of alpha two-tailed * se (z of 2-tailed alpha is value of normal distribution that includes the percentage of values that we want to cover)


Step 4 convert the CIs back to correlations: r=[exp(2Z’)-1]/[exp(2Z’)+1] (2 calculations, one for each CI limit, upper and lower)
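The four steps in Python, for a hypothetical correlation of 0.5 from a sample of 30, at the 95% level (z of two-tailed alpha = 1.96):

```python
import math

r, n = 0.5, 30
alpha_z = 1.96  # two-tailed z value for a 95% CI

# Step 1: Fisher's z transformation
z_prime = 0.5 * math.log((1 + r) / (1 - r))

# Step 2: standard error = 1 / sqrt(N - 3)
se = 1 / math.sqrt(n - 3)

# Step 3: CI on the z scale = Z' plus/minus z(alpha) * se
lo_z = z_prime - alpha_z * se
hi_z = z_prime + alpha_z * se

# Step 4: convert the limits back to correlations
def z_to_r(z):
    return (math.exp(2 * z) - 1) / (math.exp(2 * z) + 1)

lo_r, hi_r = z_to_r(lo_z), z_to_r(hi_z)
```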


Regression line slope (beta1)=r*SDy/SDx


Intercept=mean y-beta1*mean x


For dichotomous values (yes/no): phi=r=SQRT (chi squared/N)

p value of the correlation=p value of chi squared


One variable dichotomous, one continuous: point-biserial formula (gives the same result as Pearson, but easier)

r = [(mean x1-mean x2)*SQRT(p*(1-p))]/SDx (x the score on the continuous variable in each group, p the proportion of people in group 1, SD of both groups together).


Non-parametric correlations


Spearman rank correlation coefficient (how closely ranked data are related)


Step 1 draw scatterplot: what sort of relationship (positive if upward slope), outliers

Step 2 rank data in each group separately (1 lowest score, ties given average score)

Step 3 find d (the difference between the ranks on the two variables for each person – if every d is 0, r is 1)

Step 4 convert the d scores to a correlation: r=1-[6*sigma of d squared/(N to the third power – N)] (valid only when there are no ties in the data).

Step 5 to find significance, convert to t statistic. Or use Pearson.
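Steps 2–4 in Python, with made-up scores (the example data have no ties across ranks, as the formula requires; the ranking helper still averages tied ranks as step 2 describes):

```python
def ranks(values):
    # Rank the data: 1 for the lowest score, ties get the average rank
    sorted_vals = sorted(values)
    rank_of = {}
    for v in set(values):
        first = sorted_vals.index(v) + 1   # rank of the first occurrence
        count = sorted_vals.count(v)
        rank_of[v] = first + (count - 1) / 2
    return [rank_of[v] for v in values]

# Hypothetical paired scores for 5 people
x = [10, 20, 30, 40, 50]
y = [12, 25, 22, 50, 60]

rx, ry = ranks(x), ranks(y)
n = len(x)

# d = difference between the two ranks for each person
d2_sum = sum((a - b) ** 2 for a, b in zip(rx, ry))

# Spearman r = 1 - 6 * sum(d^2) / (N^3 - N), valid only with no ties
rho = 1 - 6 * d2_sum / (n ** 3 - n)
```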


Kendall’s tau-a: p values same as Spearman’s but r always lower


Correlation between A and B: A causes B, B causes A, or C causes both A and B. Correlation is not causation, but causation always implies correlation.


ANOVA (Analysis of Variance) measures an outcome (dependent) variable on a continuous scale which depends on one or more, often categorical, predictor variables that we either manipulate or measure. Categorical predictor variables are called factors or independent variables. The outcome is also affected by error (everything else).


The differences between the scores on the outcome are represented by the variance of the outcome scores (the difference between each person’s score and the mean score).


Two questions: (1) How much of the variance (difference) between the two groups is due to predictor variable? And (2) Is this proportion of variance statistically significant (larger than we would expect by chance if null hypothesis were true)?


Partition variance into: 1. Total variance 2. Variance owing to predictors (differences between groups) 3. Error (differences within groups)


Variance sums of squares (of squared deviations from the mean): 1. Total sum of squares (SS) 2. Between groups sum of squares 3. Error (within group) sum of squares (SS within or SS error).


Between groups sum of squares: variance that represents the difference between the groups (SS between). Sometimes refers to the between groups sum of squares for one predictor (SS predictor).


SStotal=sigma (x-mean x, mean of all scores) squared


SSwithin=sigma (x-mean of the group) squared


SSbetween=SStotal-SSwithin  SStotal=SSwithin+SSbetween


Effect size=SSbetween/SStotal=R squared=eta squared


Statistical significance

Calculate Mean Squares (MS): between and within (but not total).


Step 1 calculate degrees of freedom (dftotal, between, within/error): dftotal=dfwithin+dfbetween

dftotal=N-1    dfbetween=g-1 (g the number of groups)    dfwithin=dftotal-dfbetween


Step 2 calculate MS    MSbetween=SSbetween/dfbetween   MSwithin=SSwithin/dfwithin


Step 3 Calculate F ratio (test statistic for ANOVA) = MSbetween/MSwithin=(SSbetween/dfbetween)/(SSwithin/dfwithin)


Step 4 calculate p value for F with degrees of freedom between and within (use table): report F with degrees of freedom
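The whole one-way ANOVA calculation (sums of squares, df, mean squares, F, and eta squared) in Python, on made-up scores for three groups:

```python
# Hypothetical scores for three groups
groups = [
    [4.0, 5.0, 6.0],
    [6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0],
]

all_scores = [s for g in groups for s in g]
n = len(all_scores)
grand_mean = sum(all_scores) / n

# SStotal = sum of (x - grand mean)^2 over all scores
ss_total = sum((s - grand_mean) ** 2 for s in all_scores)

# SSwithin = sum of (x - group mean)^2 within each group
ss_within = sum(sum((s - sum(g) / len(g)) ** 2 for s in g) for g in groups)

# SSbetween = SStotal - SSwithin
ss_between = ss_total - ss_within

# Degrees of freedom
df_between = len(groups) - 1          # g - 1
df_within = (n - 1) - df_between      # dftotal - dfbetween

# Mean squares and the F ratio
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

# Effect size: eta squared = SSbetween / SStotal
eta_squared = ss_between / ss_total
```

The p value is then read from an F table with (df_between, df_within) degrees of freedom.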


ANOVA and the t test are the same when we have 2 groups: F=t squared.


But ANOVA can be used to analyze more than 2 groups and to calculate the p value associated with a regression line.


No assumption that the outcome variable is normally distributed, but the data within each group must be normally distributed and the SD within each group equal (homogeneity of variance, with variance being the square of the SD).


ANOVA tests the null hypothesis that the mean of x of group 1=mean of x of group 2=…=mean of x of group k. Post hoc tests determine where the difference comes from if we reject the null hypothesis. Post hoc tests compare each group to each other group. The number of needed tests=k*(k-1)/2, where k is the number of groups.


Using only t-tests to compare 3 groups creates alpha inflation (type I error: rejecting a true null hypothesis) owing to an increase in the type I error rate above the nominal rate of 0.05. So we use Bonferroni-corrected confidence intervals, dividing alpha by the number of post hoc tests, and Bonferroni-corrected statistical significance (multiplying the p value of each t test by the number of tests, here 3).


Measures should be reliable (measure consistently) and valid (measure what they are supposed to measure).


Test is reliable if: (1) it is an accurate measure (2) results are dependable (3) using the measure again obtains same results


Reliability means temporal stability (test-retest reliability) and internal consistency (all parts measure the same thing).


Measuring stability over time with the Bland-Altman limits of agreement


Line of equality: the line that all the points would lie on if each person had scored the same on both measures. The range of values within which a person’s score is likely to lie is the limits of agreement.


Step 1 calculate the difference between the scores at time 1 and time 2

Step 2 draw a scatterplot with the mean score for each person on the x axis and the difference on the y axis

Step 3 find the mean of the difference scores

Step 4 find the SD of the difference scores

Step 5 find the 95% limits of agreement (if the data are normally distributed, 95% of the sample lie within 1.96 SDs of the mean difference): Lower limit=mean-1.96*SD  Upper limit=mean+1.96*SD

Step 6 add horizontal lines to the scatterplot, showing the limits of agreement and the mean difference
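The numerical steps (1 and 3–5; steps 2 and 6 are plotting) in Python, with made-up test–retest scores:

```python
import math

# Hypothetical scores for the same 5 people measured at time 1 and time 2
time1 = [10.0, 12.0, 11.0, 14.0, 13.0]
time2 = [11.0, 11.0, 13.0, 14.0, 12.0]

# Step 1: difference between the two measurements per person
diffs = [a - b for a, b in zip(time1, time2)]
# (Step 2 would plot each person's mean score against the difference)
means = [(a + b) / 2 for a, b in zip(time1, time2)]

n = len(diffs)
# Step 3: mean of the difference scores
mean_diff = sum(diffs) / n
# Step 4: SD of the difference scores
sd_diff = math.sqrt(sum((d - mean_diff) ** 2 for d in diffs) / (n - 1))
# Step 5: 95% limits of agreement
lower = mean_diff - 1.96 * sd_diff
upper = mean_diff + 1.96 * sd_diff
```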

We tell how far apart measures are with SD or variance (SD squared): variance between items within one person and variance between people.


Cronbach’s (coefficient) alpha: estimate of correlation between true score and measured score


Standardized alpha, when the variance of each item is equal=(k, the number of items in the scale, * the average inter-item correlation, excluding the 1s)/[1+(k-1)*average correlation]. 0.7 is a high alpha. (The square of a correlation gives the proportion of variance shared by 2 variables. Alpha is a correlation; squaring 0.7 gives 0.49, just under 0.5, so an alpha higher than 0.7 guarantees that more than half the variance in the measure is true-score variance.)
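The standardized-alpha formula in Python, with a hypothetical scale length and average inter-item correlation; it also shows the point made below, that a longer test raises alpha:

```python
def standardized_alpha(k, avg_r):
    # (k * average inter-item correlation) / (1 + (k - 1) * average correlation)
    return (k * avg_r) / (1 + (k - 1) * avg_r)

# Hypothetical: a 10-item scale with an average inter-item correlation of 0.3
alpha = standardized_alpha(10, 0.3)

# Same items, shorter test: alpha drops
alpha_short = standardized_alpha(5, 0.3)
```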


A high alpha reflects highly correlated items or a longer test.


Correlation not useful with categorical data. Cohen’s Kappa: agreement beyond chance agreement.


Step 1 enter data into cells ABCD (AD in agreement, BC in disagreement)

Step 2 calculate expected frequencies only for A and D (E=R*C/T: Row total, Column total, Grand Total)

Step 3 Kappa=[(A+D)-(E(A)+E(D))]/[N-(E(A)+E(D))]
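The three steps in Python, with a made-up agreement table for two raters:

```python
# Hypothetical 2x2 agreement table for two raters:
# A and D are agreements, B and C are disagreements
a, b, c, d = 40, 10, 5, 45
n = a + b + c + d

# Step 2: expected frequencies by chance, for the agreement cells only
# (E = Row total * Column total / Grand total)
e_a = (a + b) * (a + c) / n
e_d = (c + d) * (b + d) / n

# Step 3: Kappa = [(A + D) - (E(A) + E(D))] / [N - (E(A) + E(D))]
kappa = ((a + d) - (e_a + e_d)) / (n - (e_a + e_d))
```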


<0.2 poor agreement 0.2-0.4 fair 0.4-0.6 moderate 0.6-0.8 good 0.8-1 very good


In case of ordinal data, we use weighted kappa, weighting the level of disagreement by the distance between the measures (rating the measures and squaring the ratings).


Copyright Notice

This material is copyrighted. Free, unrestricted use is allowed on a non commercial basis.
The author's name and a link to this Website must be incorporated in any reproduction of the material for any use and by any means.
