01
Introduction to pyfinance
While looking for a way to implement rolling regression in Python, I came across a useful quantitative finance package: pyfinance. As the name suggests, pyfinance is a Python package built for investment management and the analysis of security returns. It is meant to complement existing quantitative finance packages such as pyfolio and pandas. pyfinance consists of six modules:
datasets.py: financial data download (a requests-based scraper; some data cannot be downloaded because of external network restrictions);
general.py: general-purpose financial calculations, such as active share, return distribution approximation and tracking error optimization;
ols.py: regression analysis, including rolling-window regression on pandas objects;
options.py: option valuation and option strategy analysis;
returns.py: statistical analysis of financial time series within the CAPM framework; it aims to replicate the functionality of software such as FactSet Research Systems and Zephyr, with better speed and flexibility;
utils.py: supporting utilities.
This article focuses on the returns module and how pyfinance can be applied to securities investment analysis. The datasets, options, ols and other modules will be introduced in later posts.
02
returns module application example
Installing pyfinance is straightforward: run "pip install pyfinance" in cmd (or the Anaconda Prompt). The returns module is built around the TSeries class (DataFrame is not supported for now), which extends the pandas Series with extra methods for computing performance evaluation metrics under the CAPM (capital asset pricing model) framework used in securities investment analysis. To use the returns module, import the class directly with "from pyfinance import TSeries".
Next, using tushare as the data source, we first define a data acquisition function that converts the return series into a TSeries, and then call the TSeries methods on the result directly.
    import pandas as pd
    import numpy as np
    from pyfinance import TSeries
    import tushare as ts

    def get_data(code, start='20110101', end=''):
        df = ts.get_k_data(code, start, end)
        df.index = pd.to_datetime(df.date)
        # Daily simple return: close_t / close_{t-1} - 1
        ret = df.close / df.close.shift(1) - 1
        # Return the series as a TSeries
        return TSeries(ret.dropna())

    # Fetch Ping An Insurance (601318) data
    tss = get_data('601318')
    # tss.head()
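The return calculation inside get_data can be illustrated without any market data. The sketch below is a plain-Python version of the same idea (each return is the current close divided by the previous close, minus one); the function name and the closing prices are made up for illustration and are not part of pyfinance or tushare.

```python
def simple_returns(prices):
    """Period-over-period simple returns: p_t / p_{t-1} - 1."""
    return [curr / prev - 1 for prev, curr in zip(prices, prices[1:])]


# Hypothetical closing prices, for illustration only
closes = [100.0, 102.0, 99.96]
rets = simple_returns(closes)
print([round(r, 4) for r in rets])
```

As with shift(1) followed by dropna() in the function above, the first price has no predecessor, so the result has one element fewer than the input.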
01
Yield calculation
The returns module of pyfinance provides the annualized return (anlzd_ret), the cumulative return (cuml_ret) and the periodic return (rollup). Next, we take Ping An Insurance (601318) as an example and compute these return metrics.
    # Annualized rate of return
    anl_ret = tss.anlzd_ret()
    # Cumulative rate of return
    cum_ret = tss.cuml_ret()
    # Periodic rate of return (quarterly and annual)
    q_ret = tss.rollup('Q')
    a_ret = tss.rollup('A')
    print(f'Annualized rate of return:{anl_ret*100:.2f}%')
    print(f'Cumulative yield:{cum_ret*100:.2f}%')
    # print(f'quarterly yield: {q_ret.tail().round(4)}')
    # print(f'yield over the years: {a_ret.round(4)}')
Output results:
    Annualized rate of return: 12.24%
    Cumulative yield: 205.79%
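For readers who want to see what these two numbers mean, here is a minimal sketch of the usual definitions in plain Python. It is not pyfinance's implementation: the function names are hypothetical, and the number of periods per year is passed explicitly (250 trading days is the assumption used elsewhere in this article).

```python
def cumulative_return(returns):
    """Compound a sequence of simple returns into one total return."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


def annualized_return(returns, periods_per_year=250):
    """Geometric average return per year, assuming evenly spaced periods."""
    years = len(returns) / periods_per_year
    return (1.0 + cumulative_return(returns)) ** (1.0 / years) - 1.0


# Two hypothetical yearly returns of +10% each
print(round(cumulative_return([0.10, 0.10]), 4))
print(round(annualized_return([0.10, 0.10], periods_per_year=1), 4))
```

The cumulative return compounds every period, while the annualized return asks which constant yearly rate would have produced the same cumulative result.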
Visualizing the quarterly and annual returns (the import below follows the pyecharts 0.5.x API):
    from pyecharts import Bar

    attr = q_ret.index.strftime('%Y%m')
    v1 = (q_ret * 100).round(2).values
    bar = Bar('Ping An Insurance quarterly return (%)')
    bar.add('', attr, v1)
    bar

    from pyecharts import Bar

    attr = a_ret.index.strftime('%Y')
    v1 = (a_ret * 100).round(2).values
    # Double quotes avoid clashing with any apostrophe in the title
    bar = Bar("Ping An Insurance annual return (%)")
    bar.add('', attr, v1, is_label_show=True, is_splitline_show=False)
    bar
02
CAPM model related indicators
Alpha, beta, the coefficient of determination (R2), the t-statistics and the residual term are all computed from the CAPM model, which in practice comes down to an OLS regression. If you want time-varying alpha and beta values, you can go further and use the rolling regression functionality of the ols module (PandasRollingOLS); its use will be covered in a later post.
    # Use the CSI 300 index as the benchmark.
    # To keep the two series the same length, align the benchmark to the Ping An Insurance index.
    benchmark = get_data('hs300')
    benchmark = benchmark.loc[tss.index]
    alpha, beta, rsq = tss.alpha(benchmark), tss.beta(benchmark), tss.rsq(benchmark)
    tstat_a, tstat_b = tss.tstat_alpha(benchmark), tss.tstat_beta(benchmark)
    print(f'alpha:{alpha:.4f}, t statistic:{tstat_a:.2f}')
    print(f'beta :{beta:.4f}, t statistic:{tstat_b:.2f}')
    print(f'Coefficient of determination R2: {rsq:.3f}')
Output results:
    alpha:0.0004, t statistic: 1.55
    beta :1.0634, t statistic: 60.09
    Coefficient of determination R2: 0.606
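To make the regression behind these numbers concrete, the following is a minimal plain-Python sketch of a single-factor OLS fit, plus a rolling version that refits it on a sliding window. It is an illustration only: the function names are hypothetical, raw returns are regressed without any risk-free adjustment (an assumption), and it is not the code pyfinance runs.

```python
def capm_ols(asset, bench):
    """Fit asset = alpha + beta * bench by ordinary least squares."""
    n = len(asset)
    mean_a = sum(asset) / n
    mean_b = sum(bench) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(asset, bench))
    var_b = sum((b - mean_b) ** 2 for b in bench)
    beta = cov / var_b
    alpha = mean_a - beta * mean_b
    # R squared: share of the asset's variance explained by the benchmark
    ss_res = sum((a - (alpha + beta * b)) ** 2 for a, b in zip(asset, bench))
    ss_tot = sum((a - mean_a) ** 2 for a in asset)
    return alpha, beta, 1.0 - ss_res / ss_tot


def rolling_beta(asset, bench, window):
    """Beta re-estimated on each trailing window of the given length."""
    return [capm_ols(asset[i - window:i], bench[i - window:i])[1]
            for i in range(window, len(asset) + 1)]


# Hypothetical benchmark returns and an asset built as 0.001 + 1.5 * benchmark
bench = [0.01, -0.02, 0.03, 0.00, 0.02]
asset = [0.001 + 1.5 * b for b in bench]
alpha, beta, rsq = capm_ols(asset, bench)
print(round(alpha, 4), round(beta, 4), round(rsq, 4))
print([round(b, 4) for b in rolling_beta(asset, bench, window=3)])
```

Because the toy asset is an exact linear function of the benchmark, the fit recovers the chosen alpha and beta in every window; with real data the rolling betas would vary over time.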
03
Risk indicators
The main risk indicators are the standard deviation and the maximum drawdown. When computing the standard deviation, note that the default frequency parameter needs to be changed. One way is to edit the installed package: if Anaconda is installed, go to the following path:
c:\anaconda3\lib\site-packages\pyfinance, open the returns source file, find the anlzd_stdev and semi_stdev functions, and change the default value of freq from None to 250 (the number of trading days in a year). Since freq is a parameter of these functions, passing freq=250 explicitly in the call, as is done with sortino_ratio later in this article, should avoid editing the installed package.
    # Annualized standard deviation
    a_std = tss.anlzd_stdev()
    # Downside standard deviation
    s_std = tss.semi_stdev()
    # Maximum drawdown
    md = tss.max_drawdown()
    print(f'Annualized standard deviation:{a_std*100:.2f}%')
    print(f'Downside standard deviation:{s_std*100:.2f}%')
    print(f'Maximum drawdown:{md*100:.2f}%')
Output results:
    Annualized standard deviation: 31.37%
    Downside standard deviation: 0.43%
    Maximum drawdown: 45.76%
The downside standard deviation addresses the asymmetry of the return distribution. When returns are left-skewed, assuming a normal distribution underestimates risk, so the full-sample standard deviation used in the denominator of the traditional Sharpe ratio is not an appropriate risk estimate. Instead, risk should be measured by how far returns fall below the return of a risk-free investment.
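The two risk measures discussed here can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than pyfinance's code: the function names are hypothetical, the downside deviation is measured against a threshold of 0 by default and is not annualized, and drawdown is reported as a negative number.

```python
import math


def downside_deviation(returns, threshold=0.0):
    """Root mean square of shortfalls below the threshold; gains count as 0."""
    shortfalls = [min(r - threshold, 0.0) for r in returns]
    return math.sqrt(sum(s * s for s in shortfalls) / len(returns))


def max_drawdown(returns):
    """Largest peak-to-trough loss of the compounded wealth path (<= 0)."""
    wealth, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        wealth *= 1.0 + r
        peak = max(peak, wealth)
        worst = min(worst, wealth / peak - 1.0)
    return worst


# Hypothetical return series, for illustration only
print(round(downside_deviation([0.03, -0.04, 0.0, -0.03]), 4))
print(round(max_drawdown([0.10, -0.50, 0.20]), 4))
```

Only the negative deviations enter the downside measure, which is why it reacts to a left-skewed distribution in a way the full-sample standard deviation does not.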
04
Benchmark comparison index
Benchmark comparison indicators require a benchmark to be specified. Here the CSI 300 index is used as the benchmark for Ping An Insurance.
    bat = tss.batting_avg(benchmark)
    uc = tss.up_capture(benchmark)
    dc = tss.down_capture(benchmark)
    tc = uc / dc
    pct_neg = tss.pct_negative()
    pct_pos = tss.pct_positive()
    print(f'Batting average (share of periods beating the benchmark):{bat*100:.2f}%')
    print(f'Up-capture ratio:{uc*100:.2f}%')
    print(f'Down-capture ratio:{dc*100:.2f}%')
    print(f'Up/down capture ratio:{tc*100:.2f}%')
    print(f'Share of periods with negative returns:{pct_neg*100:.2f}%')
    print(f'Share of periods with positive returns:{pct_pos*100:.2f}%')
Output results:
    Batting average (share of periods beating the benchmark): 47.83%
    Up-capture ratio: 111.70%
    Down-capture ratio: 105.32%
    Up/down capture ratio: 106.06%
    Share of periods with negative returns: 48.94%
    Share of periods with positive returns: 50.00%
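As a rough guide to what these comparison metrics measure, here is a plain-Python sketch. The function names are hypothetical, and the capture ratio below compares compounded (not annualized) returns over the benchmark's up or down periods, which is a simplifying assumption and may differ from how pyfinance computes it.

```python
def compound(returns):
    """Total compounded return of a sequence of simple returns."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


def batting_average(asset, bench):
    """Share of periods in which the asset beat the benchmark."""
    wins = sum(1 for a, b in zip(asset, bench) if a > b)
    return wins / len(asset)


def capture_ratio(asset, bench, up=True):
    """Asset return relative to benchmark return over up (or down) periods."""
    pairs = [(a, b) for a, b in zip(asset, bench) if (b > 0 if up else b < 0)]
    asset_side = compound([a for a, _ in pairs])
    bench_side = compound([b for _, b in pairs])
    return asset_side / bench_side


# Hypothetical returns, for illustration only
asset = [0.02, -0.01, 0.03, -0.04]
bench = [0.01, -0.02, 0.02, -0.02]
print(batting_average(asset, bench))
print(round(capture_ratio(asset, bench, up=True), 4))
print(round(capture_ratio(asset, bench, up=False), 4))
```

An up-capture above 100% together with a down-capture below 100% is the favorable combination: the asset gains more than the benchmark in rising markets and loses less in falling ones.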
In addition, the information ratio and the Treynor ratio are two commonly used benchmark comparison indicators, especially for the quantitative evaluation of fund products or portfolios.
Information ratio: based on Markowitz's mean-variance model, it measures the excess return earned per unit of active risk. $IR = \alpha / \omega$, where $\alpha$ is the excess return of the portfolio and $\omega$ is the active risk. The numerator $\alpha$ is the difference between the actual expected return and the return implied by the pricing model; the denominator is the residual risk, that is, the standard deviation of the residual term.
Treynor ratio: measures the excess return per unit of systematic risk. The formula is $TR = (R_p - R_f) / \beta_p$, where $TR$ is the Treynor performance index, $R_p$ is the average return of the portfolio, $R_f$ is the average risk-free rate, and $\beta_p$ is the systematic risk of the portfolio.
    ir = tss.info_ratio(benchmark)
    tr = tss.treynor_ratio(benchmark)
    print(f'Information ratio:{ir:.3f}')
    print(f'Treynor ratio:{tr:.3f}')
Output results:
    Information ratio: 0.433
    Treynor ratio: 0.096
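The two ratios can be sketched as follows. This is a simplified illustration, not pyfinance's implementation: the function names are hypothetical, and the information ratio here uses the common shortcut of taking the active return as the asset return minus the benchmark return (rather than a regression residual), annualized with an assumed number of periods per year.

```python
import statistics


def information_ratio(asset, bench, periods_per_year=250):
    """Annualized mean active return divided by annualized tracking error."""
    active = [a - b for a, b in zip(asset, bench)]
    mean_active = statistics.mean(active) * periods_per_year
    tracking_error = statistics.stdev(active) * periods_per_year ** 0.5
    return mean_active / tracking_error


def treynor_ratio(portfolio_return, risk_free, beta):
    """Excess return over the risk-free rate per unit of beta."""
    return (portfolio_return - risk_free) / beta


# Hypothetical inputs, for illustration only
print(round(information_ratio([0.02, 0.04], [0.01, 0.01], periods_per_year=1), 4))
print(round(treynor_ratio(0.12, 0.02, 1.25), 4))
```

Both ratios divide an excess return by a risk measure; they differ in the risk they use (active risk versus systematic risk).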
05
Risk adjusted return index
The commonly used risk-adjusted return indicators are the Sharpe ratio, the Sortino ratio and the Calmar ratio. In all three, the numerator is a return measure and the denominator is a risk measure.

Sharpe ratio: the formula is $SR = [E(R_p) - R_f] / \sigma_p$, where $E(R_p)$ is the expected return of the portfolio, $R_f$ is the risk-free rate and $\sigma_p$ is the standard deviation of the portfolio. It measures how much excess return the portfolio generates per unit of total risk.

Sortino ratio: similar to the Sharpe ratio, except that the denominator uses the concept of downside risk. The deviation is measured not from the mean but from a chosen minimum acceptable return (r_min). Returns above this minimum contribute a distance of 0, and only the squared distances of returns below it are accumulated, so the standard deviation becomes a downside (semi) deviation. Correspondingly, the numerator of the Sortino ratio is the strategy's return in excess of the minimum acceptable return. Compared with the Sharpe ratio, which uses all samples, the Sortino ratio focuses on the expected loss in the (left) tail.

Calmar ratio: describes the relationship between return and maximum drawdown. It is computed as the ratio of the annualized return to the historical maximum drawdown. The higher the Calmar ratio, the better the performance of the portfolio.
    sr = tss.sharpe_ratio()
    sor = tss.sortino_ratio(freq=250)
    cr = tss.calmar_ratio()
    print(f'Sharpe ratio:{sr:.2f}')
    print(f'Sortino ratio:{sor:.2f}')
    print(f'Calmar ratio:{cr:.2f}')
Output results:
    Sharpe ratio: 0.33
    Sortino ratio: 28.35
    Calmar ratio: 0.27
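The three definitions above translate into short functions. The sketch below is plain Python written for illustration, not the pyfinance source: the function names and default parameters are assumptions, the risk-free rate is given per year and spread evenly over the periods, and the ratios are annualized with the square root of the number of periods per year.

```python
import math
import statistics


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=250):
    """Mean excess return over total volatility, annualized."""
    excess = [r - risk_free / periods_per_year for r in returns]
    return (statistics.mean(excess) / statistics.stdev(excess)
            * math.sqrt(periods_per_year))


def sortino_ratio(returns, r_min=0.0, periods_per_year=250):
    """Mean return above r_min over downside deviation, annualized."""
    shortfall_sq = [min(r - r_min, 0.0) ** 2 for r in returns]
    downside = math.sqrt(sum(shortfall_sq) / len(returns))
    mean_excess = statistics.mean([r - r_min for r in returns])
    return mean_excess / downside * math.sqrt(periods_per_year)


def calmar_ratio(annualized_return, max_drawdown):
    """Annualized return divided by the size of the maximum drawdown."""
    return annualized_return / abs(max_drawdown)


# Hypothetical inputs, for illustration only
print(round(sharpe_ratio([0.01, 0.03], periods_per_year=1), 4))
print(round(sortino_ratio([0.03, -0.04, 0.0, -0.03], periods_per_year=1), 4))
print(round(calmar_ratio(0.12, -0.40), 4))
```

The Sharpe and Sortino functions share the same numerator idea and differ only in the denominator, which mirrors the comparison made in the text.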
06
Example of comprehensive performance evaluation index analysis
Below, we combine the indicators above into a single function and compute them for several stocks for comparison.
    def performance(code, start='20110101', end=''):
        tss = get_data(code, start, end)
        benchmark = get_data('hs300', start, end).loc[tss.index]
        dd = {}
        # Returns
        dd['Annualized rate of return'] = tss.anlzd_ret()
        dd['Cumulative rate of return'] = tss.cuml_ret()
        # Alpha and beta
        dd['alpha'] = tss.alpha(benchmark)
        dd['beta'] = tss.beta(benchmark)
        # Risk indicators
        dd['Annualized standard deviation'] = tss.anlzd_stdev()
        dd['Downside standard deviation'] = tss.semi_stdev()
        dd['Maximum drawdown'] = tss.max_drawdown()
        # Information ratio and Treynor ratio
        dd['Information ratio'] = tss.info_ratio(benchmark)
        dd['Treynor ratio'] = tss.treynor_ratio(benchmark)
        # Risk-adjusted returns
        dd['Sharpe ratio'] = tss.sharpe_ratio()
        dd['Sortino ratio'] = tss.sortino_ratio(freq=250)
        dd['Calmar ratio'] = tss.calmar_ratio()
        # One row per indicator, in insertion order
        df = pd.DataFrame(list(dd.values()), index=list(dd.keys())).round(4)
        return df
Fetch the data for several stocks (a portfolio could be built the same way) and compare their performance evaluation indicators:
    # Get data for multiple stocks
    df = pd.DataFrame(index=performance('601318').index)
    stocks = {'Ping An Insurance': '601318', 'Kweichow Moutai': '600519',
              'Haitian Flavouring': '603288', 'Gree Electric': '000651',
              'Vanke A': '000002', 'BYD': '002594',
              'Yunnan Baiyao': '000538', 'Shuanghui Development': '000895',
              'Haier Smart Home': '600690', 'Tsingtao Beer': '600600'}
    for name, code in stocks.items():
        try:
            df[name] = performance(code).values
        except Exception:
            # Skip any stock whose data cannot be fetched or processed
            continue
    df
03
Epilogue
pyfinance is a Python package designed mainly for securities investment management and performance evaluation, which makes it very practical for readers preparing for the CFA and FRM exams. In essence, the returns module of pyfinance extends the pandas Series class to support return analysis and performance evaluation for securities investment. Python is a "glue" language built around a rich set of modules, so it is well suited to reusing existing packages for computation and programming, which improves efficiency and saves the time and energy of "reinventing the wheel". This article has mainly introduced the returns module of pyfinance. The other modules will be covered in later posts.