[Hands-on tutorial] Using pyfinance to analyze securities returns


Introduction to pyfinance

While looking into how to implement rolling regression in Python, I came across a useful quantitative finance package: pyfinance. As the name suggests, pyfinance is a Python analysis package built for investment management and securities return analysis. It mainly complements existing quantitative finance packages such as pyfolio and pandas. pyfinance consists of six modules:

datasets.py: financial data download (a scraper based on requests; some data cannot be downloaded because of external network restrictions);

general.py: general financial calculations, such as active share calculation, return distribution approximation and tracking error optimization;

ols.py: regression analysis, including rolling window regression on pandas objects;

options.py: option valuation and strategy analysis;

returns.py: statistical analysis of financial time series through the CAPM framework; it aims to mimic the functionality of software such as FactSet Research Systems and Zephyr while improving speed and flexibility;

utils.py: infrastructure.

This article focuses on the returns module and its application to securities investment analysis. Later articles will cover the datasets, options, ols and other modules in turn.


returns module application example

Installing pyfinance is simple: enter "pip install pyfinance" in cmd (or the Anaconda prompt). The returns module is built around the TSeries class (DataFrame is not supported for now), which extends the pandas Series with additional functionality and supports the performance evaluation indicators used in securities investment analysis under the CAPM (capital asset pricing model) framework. To use the returns module, import it directly with "from pyfinance import TSeries".

Next, with tushare as the data interface, we first define a data acquisition function that converts the return data to a TSeries, and then call the methods of the TSeries class directly.

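import pandas as pd
import numpy as np
from pyfinance import TSeries
import tushare as ts

def get_data(code,start='2011-01-01',end=''):
    #Assumed data source: daily bars from tushare's legacy get_k_data interface
    df=ts.get_k_data(code,start,end)
    df.index=pd.to_datetime(df.date)
    #Simple daily return from closing prices
    ret=df.close/df.close.shift(1)-1
    #Returns the TSeries sequence
    return TSeries(ret.dropna())

#Obtain China Ping An data
tss=get_data('601318')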
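
The return series that gets wrapped in a TSeries is just the period-over-period change in the closing price. As a minimal, library-free sketch of that step (the helper name simple_returns and the prices are hypothetical, chosen only for illustration):

```python
def simple_returns(closes):
    """Period-over-period simple returns: p[t] / p[t-1] - 1."""
    return [curr / prev - 1 for prev, curr in zip(closes, closes[1:])]


# Hypothetical closing prices, not real market data
closes = [100.0, 110.0, 99.0]
rets = simple_returns(closes)
print([round(r, 4) for r in rets])  # [0.1, -0.1]
```

The first observation has no previous price, which is why the article's function drops the leading missing value before building the TSeries.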


Return calculation

The returns module of pyfinance provides the annualized rate of return (anlzd_ret), the cumulative rate of return (cuml_ret) and the periodic rate of return (rollup). Below, China Ping An stock is used as an example to calculate these return indicators.

#Annualized rate of return
anl_ret=tss.anlzd_ret()
#Cumulative rate of return
cum_ret=tss.cuml_ret()
#Periodic rate of return: quarterly ('Q') and annual ('A')
q_ret=tss.rollup('Q')
a_ret=tss.rollup('A')

print(f'Annualized rate of return:{anl_ret*100:.2f}%')
print(f'Cumulative rate of return:{cum_ret*100:.2f}%')
#print(f'Quarterly rate of return:{q_ret.tail().round(4)}')
#print(f'Annual rate of return:{a_ret.round(4)}')

Output results:

Annualized rate of return:12.24%
Cumulative rate of return:205.79%
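
To make the two figures above less of a black box, here is a small sketch of the textbook definitions in plain Python. It is not pyfinance's implementation: the function names are hypothetical, and annualization here simply counts observations against an assumed number of periods per year, whereas a library may measure elapsed calendar time instead.

```python
import math


def cumulative_return(rets):
    """Compound the period returns: prod(1 + r) - 1."""
    return math.prod(1 + r for r in rets) - 1


def annualized_return(rets, periods_per_year=250):
    """Geometric annualization, assuming evenly spaced observations."""
    years = len(rets) / periods_per_year
    return (1 + cumulative_return(rets)) ** (1 / years) - 1


# Hypothetical quarterly returns covering exactly one year
rets = [0.10, -0.05, 0.02, 0.03]
cum = cumulative_return(rets)
ann = annualized_return(rets, periods_per_year=4)
print(f'cumulative: {cum:.4%}, annualized: {ann:.4%}')
```

With four quarterly observations the sample spans one year, so the annualized and cumulative figures coincide; over longer samples the annualized figure is the constant yearly rate that compounds to the same cumulative result.
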
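Visualize the quarterly (annual) returns:

from pyecharts import Bar
#Assumed inputs: period labels and returns in percent taken from q_ret
attr=q_ret.index.strftime('%Y%m')
v1=(q_ret*100).round(2).values
bar=Bar('China Ping An quarterly yield%')
bar.add('',attr,v1)

#Assumed inputs: the same construction applied to the annual series a_ret
attr=a_ret.index.strftime('%Y')
v1=(a_ret*100).round(2).values
bar=Bar("China Ping An's yield over the years%")
bar.add('',attr,v1)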


CAPM model related indicators

Alpha, beta, the coefficient of determination R2, the t statistics and the residual term are all calculated from the CAPM model, which in practice comes down to an OLS regression. If you want time-varying alpha and beta values, you can go further and use the rolling regression functionality (PandasRollingOLS) of the ols module, whose use will be covered in a later post.

#Based on the CSI 300 (Shanghai-Shenzhen 300) index; 'hs300' is the assumed tushare code
benchmark=get_data('hs300')
#To keep the two series the same length, align the benchmark to the index of China Ping An
benchmark=benchmark.loc[tss.index]
alpha,beta=tss.alpha(benchmark),tss.beta(benchmark)
tstat_a,tstat_b=tss.tstat_alpha(benchmark),tss.tstat_beta(benchmark)

print(f'alpha:{alpha:.4f},t statistic:{tstat_a:.2f}')
print(f'beta :{beta:.4f},t statistic:{tstat_b:.2f}')
print(f'Coefficient of determination R2: {tss.rsq(benchmark):.3f}')

alpha:0.0004,t statistic:1.55
beta :1.0634,t statistic:60.09
Coefficient of determination R2: 0.606
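
Since these indicators come down to a one-variable OLS regression of the stock's returns on the benchmark's returns, the calculation can be sketched by hand. The code below is an illustration under simple assumptions (raw returns, no risk-free adjustment), not the library's code; capm_fit and rolling_beta are hypothetical names, and the data are made up so that the true alpha and beta are known.

```python
from statistics import mean


def capm_fit(asset, bench):
    """OLS fit of r = alpha + beta * b + e; returns (alpha, beta, R2)."""
    a_bar, b_bar = mean(asset), mean(bench)
    sxy = sum((b - b_bar) * (a - a_bar) for a, b in zip(asset, bench))
    sxx = sum((b - b_bar) ** 2 for b in bench)
    beta = sxy / sxx
    alpha = a_bar - beta * b_bar
    ss_res = sum((a - alpha - beta * b) ** 2 for a, b in zip(asset, bench))
    ss_tot = sum((a - a_bar) ** 2 for a in asset)
    return alpha, beta, 1 - ss_res / ss_tot


def rolling_beta(asset, bench, window):
    """Beta re-estimated on each trailing window of the given length."""
    return [capm_fit(asset[i - window:i], bench[i - window:i])[1]
            for i in range(window, len(asset) + 1)]


# Hypothetical data with an exact linear relation: alpha=0.001, beta=1.2
bench = [0.01, -0.02, 0.015, 0.03, -0.01]
asset = [0.001 + 1.2 * b for b in bench]
alpha, beta, rsq = capm_fit(asset, bench)
betas = rolling_beta(asset, bench, window=3)
print(f'alpha={alpha:.4f}, beta={beta:.4f}, R2={rsq:.3f}')
```

The rolling version is the idea behind "dynamic" alpha and beta: the same regression is repeated on a moving window, producing one estimate per window.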


Risk indicators

Risk indicators mainly include the standard deviation and the maximum drawdown. Before calculating the standard deviation, note that a default parameter needs to be changed. Open the folder where pyfinance is installed; with Anaconda, the path looks like this:

C:\Anaconda3\Lib\site-packages\pyfinance. Open the returns source file, find the anlzd_stdev and semi_stdev functions, and change the freq default from None to 250 (the number of trading days in a year).

#Annualized standard deviation
a_std=tss.anlzd_stdev()
#Downside standard deviation
s_std=tss.semi_stdev()
#Maximum drawdown
md=tss.max_drawdown()
print(f'Annualized standard deviation:{a_std*100:.2f}%')
print(f'Downside standard deviation:{s_std*100:.2f}%')
print(f'Maximum drawdown:{md*100:.2f}%')

Annualized standard deviation:31.37%
Downside standard deviation:0.43%
Maximum drawdown:-45.76%

The downside standard deviation mainly addresses the asymmetry of the return distribution. When the return distribution is left-skewed, assuming a normal distribution underestimates risk, so the full-sample standard deviation used in the denominator of the traditional Sharpe ratio is not an appropriate estimate. Instead, the deviation of returns below the return of a risk-free investment should be used.
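
As a rough illustration of these two risk measures, the sketch below computes a downside deviation and a maximum drawdown in plain Python. The function names and sample returns are hypothetical, and conventions differ between libraries (for example, whether the downside sum is divided by all observations or only by those below the threshold), so the numbers need not match pyfinance's.

```python
import math


def downside_deviation(rets, threshold=0.0, periods_per_year=250):
    """Root mean square of shortfalls below the threshold, annualized.

    Returns at or above the threshold contribute zero; this sketch
    divides by the total number of observations.
    """
    shortfalls = [min(r - threshold, 0.0) ** 2 for r in rets]
    return math.sqrt(sum(shortfalls) / len(rets)) * math.sqrt(periods_per_year)


def max_drawdown(rets):
    """Largest peak-to-trough decline of the compounded wealth curve."""
    wealth, peak, worst = 1.0, 1.0, 0.0
    for r in rets:
        wealth *= 1 + r
        peak = max(peak, wealth)
        worst = min(worst, wealth / peak - 1)
    return worst


# Hypothetical period returns
rets = [0.10, -0.20, 0.05, -0.10, 0.30]
dd = downside_deviation(rets, periods_per_year=1)
mdd = max_drawdown(rets)
print(f'downside deviation: {dd:.4f}, max drawdown: {mdd:.2%}')
```

In this sample the wealth curve peaks at 1.10 after the first period and bottoms at 0.8316 after the fourth, which gives the drawdown of about -24.4%.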


Benchmark comparison indicators

Benchmark comparison indicators require a benchmark to be specified, for example the CSI 300 (Shanghai-Shenzhen 300) index as the benchmark for China Ping An stock.

#Assumed TSeries calls for the indicators printed below
bat=tss.batting_avg(benchmark)
uc=tss.up_capture(benchmark)
dc=tss.down_capture(benchmark)
#Ratio of up capture to down capture
tc=uc/dc
pct_neg=tss.pct_negative()
pct_pos=tss.pct_positive()
print(f'Share of periods beating the benchmark:{bat*100:.2f}%')
print(f'Up capture ratio versus the benchmark:{uc*100:.2f}%')
print(f'Down capture ratio versus the benchmark:{dc*100:.2f}%')
print(f'Ratio of up capture to down capture:{tc*100:.2f}%')
print(f'Share of periods with a negative stock return:{pct_neg*100:.2f}%')
print(f'Share of periods with a positive stock return:{pct_pos*100:.2f}%')

Share of periods beating the benchmark:47.83%
Up capture ratio versus the benchmark:111.70%
Down capture ratio versus the benchmark:105.32%
Ratio of up capture to down capture:106.06%
Share of periods with a negative stock return:48.94%
Share of periods with a positive stock return:50.00%
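
For readers unfamiliar with these benchmark-relative measures, here is one common way to define them, sketched in plain Python. The definitions are assumptions for illustration (capture ratios here compare geometric mean returns over the periods in which the benchmark was up or down), and the names and data are hypothetical; the library's exact conventions may differ.

```python
import math


def geo_mean_return(rets):
    """Geometric mean of the period returns."""
    return math.prod(1 + r for r in rets) ** (1 / len(rets)) - 1


def batting_average(asset, bench):
    """Share of periods in which the asset beat the benchmark."""
    return sum(a > b for a, b in zip(asset, bench)) / len(asset)


def capture(asset, bench, up=True):
    """Asset vs benchmark geometric mean return over up (or down) periods."""
    pairs = [(a, b) for a, b in zip(asset, bench) if (b >= 0) == up]
    asset_sel = [a for a, _ in pairs]
    bench_sel = [b for _, b in pairs]
    return geo_mean_return(asset_sel) / geo_mean_return(bench_sel)


# Hypothetical data: the asset moves twice as much as the benchmark
bench = [0.02, -0.01, 0.03, -0.02]
asset = [2 * b for b in bench]
bat = batting_average(asset, bench)
uc = capture(asset, bench, up=True)
dc = capture(asset, bench, up=False)
print(f'batting average: {bat:.2f}, up capture: {uc:.3f}, down capture: {dc:.3f}')
```

An up capture above 100% with a down capture also above 100%, as in the results for China Ping An, describes a stock that amplifies the benchmark in both directions.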

In addition, the information ratio and the Treynor ratio are two commonly used benchmark comparison indicators, especially for the quantitative evaluation of fund products or portfolios.

Information ratio: based on Markowitz's mean-variance model, it measures the excess return earned per unit of active risk. IR = α / ω, where α is the excess return of the portfolio and ω is the active risk. The numerator α is the difference between the actual expected rate of return and the rate of return calculated by the pricing model, and the denominator is the residual risk, that is, the standard deviation of the residual term.

Treynor ratio: measures the excess return per unit of systematic risk. The formula is TR = (Rp - Rf) / βp, where TR is the Treynor performance index, Rp is the average rate of return of the portfolio, Rf is the average risk-free interest rate, and βp is the systematic risk of the portfolio.

ir=tss.info_ratio(benchmark)
tr=tss.treynor_ratio(benchmark)
print(f'Information ratio:{ir:.3f}')
print(f'Treynor ratio:{tr:.3f}')

Information ratio:0.433
Treynor ratio:0.096
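
The two formulas can be sketched in a few lines of plain Python. This is an illustration, not the library's code: the names are hypothetical, and the information ratio below uses the active-return form (mean active return over tracking error), a common practical variant of the residual-based definition given above.

```python
import math
from statistics import mean, pstdev


def information_ratio(asset, bench, periods_per_year=250):
    """Annualized mean active return divided by annualized tracking error."""
    active = [a - b for a, b in zip(asset, bench)]
    tracking_error = pstdev(active) * math.sqrt(periods_per_year)
    return mean(active) * periods_per_year / tracking_error


def treynor_ratio(annual_return, beta, rf=0.0):
    """TR = (Rp - Rf) / beta."""
    return (annual_return - rf) / beta


# Hypothetical quarterly returns and inputs
asset = [0.02, 0.01, 0.03, 0.00]
bench = [0.01, 0.01, 0.01, 0.01]
ir = information_ratio(asset, bench, periods_per_year=4)
tr = treynor_ratio(0.12, beta=1.2, rf=0.02)
print(f'IR: {ir:.3f}, Treynor: {tr:.3f}')
```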


Risk-adjusted return indicators

The commonly used risk-adjusted return indicators are the Sharpe ratio, the Sortino ratio and the Calmar ratio. In all three, the numerator is a return measure and the denominator is a risk measure.

  • Sharpe ratio: Sharpe = [E(Rp) - Rf] / σp, where E(Rp) is the expected rate of return of the portfolio, Rf is the risk-free interest rate and σp is the standard deviation of the portfolio. It measures how much excess return the portfolio generates per unit of total risk.

  • Sortino ratio: similar to the Sharpe ratio, except that the denominator uses the concept of downside risk. The standard deviation is calculated not around the mean but around a chosen minimum acceptable rate of return (r_min). Returns above this minimum contribute a distance of 0, while the squared distances of returns below it are accumulated, so the standard deviation becomes a downside semi-deviation. Correspondingly, the numerator of the Sortino ratio is the amount by which the strategy's return exceeds the minimum return. Compared with the Sharpe ratio, the Sortino ratio focuses on expected losses in the (left) tail, whereas the Sharpe ratio analyzes all samples.

  • Calmar ratio: describes the relationship between return and maximum drawdown, calculated as the annualized return divided by the historical maximum drawdown. The higher the Calmar ratio, the better the portfolio has performed.

sr=tss.sharpe_ratio()
sor=tss.sortino_ratio(freq=250)
cr=tss.calmar_ratio()
print(f'Sharpe ratio:{sr:.2f}')
print(f'Sortino ratio:{sor:.2f}')
print(f'Calmar ratio:{cr:.2f}')

Sharpe ratio:0.33
Sortino ratio:28.35
Calmar ratio:0.27
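
The three ratios share one shape, return over risk, and differ only in the risk measure. The following plain-Python sketch follows the definitions in the list above under simple assumptions (per-period risk-free rate, population standard deviation, downside deviation averaged over all observations); the function names and sample returns are hypothetical and the library may use other conventions.

```python
import math
from statistics import mean, pstdev


def sharpe_ratio(rets, rf=0.0, periods_per_year=250):
    """Mean excess return over total volatility, annualized."""
    excess = [r - rf for r in rets]
    return mean(excess) / pstdev(excess) * math.sqrt(periods_per_year)


def sortino_ratio(rets, r_min=0.0, periods_per_year=250):
    """Mean return above r_min over downside deviation, annualized."""
    downside = math.sqrt(mean(min(r - r_min, 0.0) ** 2 for r in rets))
    return (mean(rets) - r_min) / downside * math.sqrt(periods_per_year)


def calmar_ratio(annual_return, max_drawdown):
    """Annualized return divided by the size of the maximum drawdown."""
    return annual_return / abs(max_drawdown)


# Hypothetical period returns
rets = [0.02, -0.01, 0.03, -0.02]
sr = sharpe_ratio(rets, periods_per_year=1)
sor = sortino_ratio(rets, periods_per_year=1)
# Inputs taken from the results reported earlier in this article
cr = calmar_ratio(0.1224, -0.4576)
print(f'Sharpe: {sr:.3f}, Sortino: {sor:.3f}, Calmar: {cr:.2f}')
```

Feeding the annualized return (12.24%) and maximum drawdown (-45.76%) reported earlier into calmar_ratio reproduces the Calmar ratio of 0.27 shown above.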


Example of comprehensive performance evaluation index analysis

Below, we combine the common indicators above into a single function and compare several stocks.

def performance(code,start='2011-01-01',end=''):
    #Assumed setup: stock returns, CSI 300 benchmark aligned to them, and a result dict
    tss=get_data(code,start,end)
    benchmark=get_data('hs300',start,end).loc[tss.index]
    dd={}
    #Annualized rate of return
    dd['Annualized rate of return']=tss.anlzd_ret()
    #Cumulative rate of return
    dd['Cumulative rate of return']=tss.cuml_ret()
    #alpha and beta
    dd['alpha']=tss.alpha(benchmark)
    dd['beta']=tss.beta(benchmark)
    #Risk indicators
    #Annualized standard deviation
    dd['Annualized standard deviation']=tss.anlzd_stdev()
    #Downside standard deviation
    dd['Downside standard deviation']=tss.semi_stdev()
    #Maximum drawdown
    dd['Maximum drawdown']=tss.max_drawdown()
    #Information ratio and Treynor ratio
    dd['Information ratio']=tss.info_ratio(benchmark)
    dd['Treynor ratio']=tss.treynor_ratio(benchmark)
    #Risk-adjusted rate of return
    dd['Sharpe ratio']=tss.sharpe_ratio()
    dd['Sortino ratio']=tss.sortino_ratio(freq=250)
    dd['Calmar ratio']=tss.calmar_ratio()
    #Collect the indicators into a one-column DataFrame indexed by indicator name
    df=pd.Series(dd).round(4).to_frame()
    return df

Obtain the data of multiple stocks (a portfolio can be built the same way) and compare their performance evaluation indicators:

#Get multiple stock data
stocks={'China Ping An':'601318','Kweichow Moutai':'600519',
        'Haitian Flavouring':'603288','Gree Electric Appliances':'000651',
        'Vanke A':'000002','BYD':'002594',
        'Yunnan Baiyao':'000538','Shuanghui Development':'000895',
        'Haier Smart Home':'600690','Tsingtao Beer':'600600'}
frames=[]
for name,code in stocks.items():
    frames.append(performance(code))
#One column per stock, one row per indicator
df=pd.concat(frames,axis=1)
df.columns=list(stocks.keys())
df




pyfinance is a Python package designed mainly for securities investment management and performance evaluation, which makes it very practical for readers preparing for the CFA and FRM exams. The returns module of pyfinance extends the pandas Series class to support securities return analysis and performance evaluation. Python is a "glue" language built on a wide range of modules, so it lends itself to reusing existing packages for calculation and programming, which improves efficiency and saves the time and effort of reinventing the wheel. This article has focused on the returns module of pyfinance; the other modules will be covered in later posts.

Tags: Python data visualization

Posted on Wed, 17 Nov 2021 09:04:16 -0500 by livepjam