Tushare has long been favored by financial analysts as a data source. It greatly reduces the work of collecting, cleaning, processing, and storing financial data, so analysts can focus on researching and implementing strategies and models. Because the old version of Tushare has been running for three years, many articles on the web about acquiring financial and stock exchange data still use it. The Tushare community now maintains a new version, Tushare Pro, whose data is more stable and of higher quality, and which improves on the old version in both breadth and depth. The available data now covers Shanghai and Shenzhen stock quotations, company financials, market reference data, indexes (including foreign stock indexes), funds, futures, options, macroeconomic data, industry economic data, news, and blockchain data such as digital currency quotations, which saves quantitative finance practitioners a lot of valuable time.
In general, the Pro version is free to use. Before using it, you need to register an account on the official website to get a token. Registration page: https://tushare.pro/register?reg=365212 . However, some interfaces are permission-restricted and can only be called once your account has accumulated enough points. The following uses stock quotation data as an example of how to get data with Tushare Pro.
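Once you have a token, it is usually better not to hard-code it in scripts. The following is a minimal sketch of reading it from an environment variable; the variable name TUSHARE_TOKEN and the helper load_token are assumptions made for this illustration, not part of Tushare.

```python
import os


def load_token(env=None, var_name="TUSHARE_TOKEN"):
    """Return the Tushare Pro token stored in an environment variable.

    TUSHARE_TOKEN is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    token = env.get(var_name, "").strip()
    if not token:
        raise RuntimeError(
            f"Set {var_name} to the token from your Tushare Pro account"
        )
    return token


# Usage (requires the tushare package and a registered account):
#   import tushare as ts
#   pro = ts.pro_api(load_token())
```

The env argument exists only so the helper can be exercised with a plain dict instead of the real environment.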
First, we introduce the stock_basic() interface, which returns basic information on all listed stocks, including the stock code, name, listing date, delisting date, and so on. The input parameters are described as follows:
is_hs: whether the stock is a Shanghai-Shenzhen-Hong Kong Stock Connect target: N no, H Shanghai Stock Connect, S Shenzhen Stock Connect;
list_status: listing status: L listed, D delisted, P listing suspended;
exchange: SSE Shanghai Stock Exchange, SZSE Shenzhen Stock Exchange, HKEX Hong Kong Exchange.
Note: For the output parameters, refer to the documentation on the official website; they are not described here.
import tushare as ts

# token is the API token obtained after registering on the official website
pro = ts.pro_api(token)
data = pro.stock_basic(exchange='', list_status='L', fields='ts_code,symbol,name,area,industry,list_date')
print(data.head())
"""
   ts_code    symbol  name                                       area      industry              list_date
0  000001.SZ  000001  Ping An Bank                               Shenzhen  Bank                  19910403
1  000002.SZ  000002  Vanke A                                    Shenzhen  National Real Estate  19910129
2  000004.SZ  000004  State Agricultural Science and Technology  Shenzhen  Biopharmaceuticals    19910114
3  000005.SZ  000005  Century Star Source                        Shenzhen  Real Estate Service   19901210
4  000006.SZ  000006  Shenzhen A                                 Shenzhen  Regional Real Estate  19920427
"""
print(data.tail())
"""
      ts_code    symbol  name                         area      industry                 list_date
3585  603993.SH  603993  Luoyang Molybdenum Industry  Henan     Small Metal              20121009
3586  603996.SH  603996  Zhongxin Technology          Zhejiang  Household Appliances     20151222
3587  603997.SH  603997  Jifeng Stock                 Zhejiang  Automotive Parts         20150302
3588  603998.SH  603998  Fangsheng Pharmaceutical     Hunan     Patent Chinese Medicine  20141205
3589  603999.SH  603999  Reader Media                 Gansu     Publishing Industry      20151210
"""
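As a small aid for the filter parameters described above, the following sketch checks values before they are passed to stock_basic(). The helper build_stock_basic_params is hypothetical, and treating an empty string as "no filter" is an assumption based on the exchange='' call in the example.

```python
# Allowed values, taken from the parameter description in this article.
ALLOWED = {
    "is_hs": {"N", "H", "S"},
    "list_status": {"L", "D", "P"},
    "exchange": {"SSE", "SZSE", "HKEX"},
}


def build_stock_basic_params(**params):
    """Validate stock_basic() filter values and drop the empty ones."""
    cleaned = {}
    for key, value in params.items():
        if key not in ALLOWED:
            raise KeyError(f"unknown parameter: {key}")
        if value in ("", None):
            continue  # assumed to mean "do not filter on this field"
        if value not in ALLOWED[key]:
            raise ValueError(f"{key}={value!r} is not one of {sorted(ALLOWED[key])}")
        cleaned[key] = value
    return cleaned


params = build_stock_basic_params(exchange="", list_status="L")
print(params)
```

The resulting dict could then be unpacked into the call, for example pro.stock_basic(**params, fields='ts_code,name').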
Next, we introduce the daily() interface, which is the most commonly used interface for getting stock market data. Its input parameters include the stock code ts_code, the start date start_date, and the end date end_date.
# Get daily price data for Ping An Bank
pa = pro.daily(ts_code='000001.SZ', start_date='20180101',
               end_date='20190106')

print(pa.head())
"""
     ts_code trade_date  open  ...  pct_chg         vol       amount
0  000001.SZ   20190104  9.24  ...   5.0647  1481159.06  1422149.888
1  000001.SZ   20190103  9.18  ...   0.9793   415537.95   384457.707
2  000001.SZ   20190102  9.39  ...  -2.0256   539386.32   498695.109
3  000001.SZ   20181228  9.31  ...   1.0776   576604.00   541571.004
4  000001.SZ   20181227  9.45  ...  -0.2151   624593.27   586343.755

[5 rows x 11 columns]
"""
print(pa.tail())
"""
       ts_code trade_date   open  ...  pct_chg         vol       amount
241  000001.SZ   20180108  13.25  ...    -2.56  2158620.81  2806099.169
242  000001.SZ   20180105  13.21  ...     0.38  1210312.72  1603289.517
243  000001.SZ   20180104  13.32  ...    -0.60  1854509.48  2454543.516
244  000001.SZ   20180103  13.73  ...    -2.70  2962498.38  4006220.766
245  000001.SZ   20180102  13.35  ...     3.01  2081592.55  2856543.822

[5 rows x 11 columns]
"""
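The start_date and end_date values in the call above are strings in YYYYMMDD form. A minimal sketch of building such a pair from Python date objects follows; the helper date_range_params is hypothetical and only uses the standard library.

```python
from datetime import date, timedelta

# The YYYYMMDD form used by start_date and end_date in the example above.
DATE_FORMAT = "%Y%m%d"


def date_range_params(end, days):
    """Return start_date/end_date strings covering `days` days up to `end`."""
    start = end - timedelta(days=days)
    return {
        "start_date": start.strftime(DATE_FORMAT),
        "end_date": end.strftime(DATE_FORMAT),
    }


params = date_range_params(date(2019, 1, 6), days=370)
print(params)  # {'start_date': '20180101', 'end_date': '20190106'}
```

With a pro object available, the dict could be unpacked as pro.daily(ts_code='000001.SZ', **params).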
The DataFrame returned here uses a sequential integer as its row index rather than the trade date, and the rows are sorted in descending date order, from 20190104 back to 20180102. This does not match the standard data format used in this column's routines, so we need to adjust the format of the returned data.
import pandas as pd

# Use the trade date as the index, sort in ascending date order,
# and drop the now redundant trade_date column
pa.index = pd.to_datetime(pa.trade_date)
pa.sort_index(inplace=True)
pa.drop(columns='trade_date', inplace=True)
print(pa.head())
"""
              ts_code   open  ...         vol       amount
trade_date                    ...
2018-01-02  000001.SZ  13.35  ...  2081592.55  2856543.822
2018-01-03  000001.SZ  13.73  ...  2962498.38  4006220.766
2018-01-04  000001.SZ  13.32  ...  1854509.48  2454543.516
2018-01-05  000001.SZ  13.21  ...  1210312.72  1603289.517
2018-01-08  000001.SZ  13.25  ...  2158620.81  2806099.169

[5 rows x 10 columns]
"""
print(pa.tail())
"""
              ts_code  open  high  ...  pct_chg         vol       amount
trade_date                         ...
2018-12-27  000001.SZ  9.45  9.49  ...  -0.2151   624593.27   586343.755
2018-12-28  000001.SZ  9.31  9.46  ...   1.0776   576604.00   541571.004
2019-01-02  000001.SZ  9.39  9.42  ...  -2.0256   539386.32   498695.109
2019-01-03  000001.SZ  9.18  9.33  ...   0.9793   415537.95   384457.707
2019-01-04  000001.SZ  9.24  9.82  ...   5.0647  1481159.06  1422149.888

[5 rows x 10 columns]
"""
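The same adjustment can be wrapped in a function that returns a new DataFrame instead of modifying the original in place. This is only a sketch: to_time_series is a hypothetical helper, and the small sample frame reuses three rows from the output shown earlier so that it runs without a Tushare account.

```python
import pandas as pd


def to_time_series(df, date_col="trade_date"):
    """Return a copy indexed by date in ascending order; df is left untouched."""
    out = df.copy()
    out.index = pd.to_datetime(out[date_col], format="%Y%m%d")
    return out.drop(columns=date_col).sort_index()


# Three rows in the descending order that daily() returned above
sample = pd.DataFrame({
    "ts_code": ["000001.SZ"] * 3,
    "trade_date": ["20190104", "20190103", "20190102"],
    "open": [9.24, 9.18, 9.39],
})

series = to_time_series(sample)
print(series)
```

Returning a copy keeps the raw response available, which can help when the same data is needed in both forms.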
The index_daily interface returns daily index quotations. It is permission-restricted: users need to accumulate 200 points before they can call it, and the official website explains how to earn points. If you do not have enough points, index quotations can also be obtained through the old interface. Here we use the old version to obtain and visualize data for the Shanghai Composite Index, Shenzhen Component Index, CSI 300, ChiNext Index, SSE 50, and SME Board Index.
import datetime

import pandas as pd
import tushare as ts

# Get common stock index prices
indexs = {'Shanghai Composite Index': 'sh', 'Shenzhen Component Index': 'sz',
          'CSI 300': 'hs300', 'ChiNext Index': 'cyb',
          'SSE 50': 'sz50', 'SME Board Index': 'zxb'}

index_data = {}
for name, code in indexs.items():
    # df = pro.index_daily(ts_code=code)  # 200 points required
    df = ts.get_hist_data(code, start='2019-01-01',
                          end=datetime.datetime.now().strftime('%Y-%m-%d'))
    df.index = pd.to_datetime(df.index)
    index_data[name] = df.sort_index()
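To compare indexes that trade at very different levels, one option is to put their closing prices in a single table and rescale each series to start at 1. The sketch below assumes every DataFrame in the dictionary has a close column; normalized_closes is a hypothetical helper, and the two demo frames hold made-up numbers rather than real quotations.

```python
import pandas as pd


def normalized_closes(index_data):
    """Combine the close column of each frame and rescale to the first row."""
    closes = pd.DataFrame({name: df["close"] for name, df in index_data.items()})
    closes = closes.sort_index().dropna()
    return closes / closes.iloc[0]


# Synthetic numbers for illustration only, not real index data
dates = pd.to_datetime(["2019-01-02", "2019-01-03", "2019-01-04"])
demo = {
    "Index A": pd.DataFrame({"close": [100.0, 101.0, 103.0]}, index=dates),
    "Index B": pd.DataFrame({"close": [50.0, 49.0, 51.0]}, index=dates),
}

relative = normalized_closes(demo)
print(relative)
# relative.plot() would draw one line per index (requires matplotlib)
```

Dividing by the first row makes each line show cumulative change since the first date, so the curves can share one axis.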