8000 GitHub - danbikle/pyspy
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

danbikle/pyspy

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

79 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

pyspy

I present this repository, pyspy, to the Python Meetup 2016-01-18:

http://www.meetup.com/BayPIGgies/events/227481217

I intend for pyspy to run on Ubuntu 14 so I give instructions for installing and operating pyspy on Ubuntu 14.

If you want to run pyspy on a Mac I assume that you could easily generate and then follow instructions to run pyspy on a Mac.

If you want to run Ubuntu 14 as a guest inside your Mac, you can find information on the web for setting that up:

http://www.google.com/search?q=How+to+run+Ubuntu+as+Guest+of+Mac+OSX+with+VirtualBox+or+VMware+or+Parallels+or+Vagrant

Once you find good information, you will probably need a copy of Ubuntu 14:

http://releases.ubuntu.com/14.04/ubuntu-14.04.3-desktop-amd64.iso

If you want to run Ubuntu 14 as a guest of Windows I know that it is possible but be prepared for bad Windows behavior. For example I recently saw a windows10-virtualbox installation that failed to support copy-paste with a mouse.

Install pyspy on Ubuntu 14

I like to install a wide variety of software on my Ubuntu host. Some of the software listed below is probably not necessary but I like to have it available.

sudo apt-get update
sudo apt-get upgrade

sudo apt-get install autoconf bison build-essential libssl-dev libyaml-dev \
libreadline6-dev zlib1g-dev libncurses5-dev libffi-dev libgdbm3       \
libgdbm-dev libsqlite3-dev gitk postgresql postgresql-server-dev-all  \
libpq-dev emacs wget curl chromium-browser openssh-server aptitude    \
ruby ruby-dev sqlite3

sudo apt-get update
sudo apt-get upgrade

I like to install pyspy into this folder:

/home/ann/pyspy

You could try to install it elsewhere.

Anyway, I do this:

useradd ann -m -s /bin/bash
passwd ann
ssh -YA ann@localhost

Next I clone pyspy from github.com:

cd ~
git clone https://github.com/danbikle/pyspy

Then I install Anaconda Python:

cd ~
curl https://3230d63b5fc54e62148e-c95ac804525aac4b6dba79b00b39d1d3.ssl.cf1.rackcdn.com/Anaconda3-2.4.1-Linux-x86_64.sh > Anaconda3-2.4.1-Linux-x86_64.sh
bash Anaconda3-2.4.1-Linux-x86_64.sh
cd ~/anaconda3/bin
mv curl curl_ana
cd ~
echo 'export PATH=${HOME}/anaconda3/bin:$PATH' >> ~/.bashrc
bash
python
quit()

Next I install talib:

cd ~
curl http://prdownloads.sourceforge.net/ta-lib/ta-lib-0.4.0-src.tar.gz > ta-lib-0.4.0-src.tar.gz
tar zxf ta-lib-0.4.0-src.tar.gz
cd      ta-lib-0.4.0-src
./configure --prefix=/usr
make
sudo make install
pip install TA-Lib

Operate pyspy

I operate pyspy by running a shell script at 12:50pm California time:

~/pyspy/bin/noon50.bash

After I run the above script, I should see something like this:

ann@nia111:~ $ 
ann@nia111:~ $ 
ann@nia111:~ $ cd pyspy/
ann@nia111:~/pyspy $ 
ann@nia111:~/pyspy $ 
ann@nia111:~/pyspy $ bin/noon50.bash 
--2015-12-30 22:37:35--  http://ichart.finance.yahoo.com/table.csv?s=%5EGSPC
Resolving ichart.finance.yahoo.com (ichart.finance.yahoo.com)... 
Connecting to ichart.finance.yahoo.com (ichart.finance.yahoo.com)
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/csv]
Saving to: ‘GSPC.csv’

    [                         <=>       ] 1,195,127    242KB/s   in 6.1s   

2015-12-30 22:37:41 (193 KB/s) - ‘GSPC.csv’ saved [1195127]

--2015-12-30 22:37:41--  http://finance.yahoo.com/q?s=%5EGSPC
Resolving finance.yahoo.com (finance.yahoo.com)... 208.71.44.31, 
Connecting to finance.yahoo.com (finance.yahoo.com)|208.71.44.31|
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘GSPC.html’

    [ <=>                              ] 72,992      --.-K/s   in 0.1s    

2015-12-30 22:37:42 (660 KB/s) - ‘GSPC.html’ saved [72992]

I am building features from this file:
GSPC2.csv
Busy...
I am building features from this file:
ftrGSPC2.csv
Busy...
True  positive count: 63
False positive count: 60
True  negative count: 71
False negative count: 57
Accuracy of positive predictions: 51.219%
Accuracy of negative predictions: 55.468%
Combined Accuracy of predictions: 53.386%
Positive Effectiveness: 0.0403%
Negative Effectiveness: -0.027%
Long Only Effectiveness: 0.0057%
hello, from /home/ann/pyspy/py/plotem.py
Plotting data from: learn_test.csv
New png file: 
learn_test.png
ann@nia111:~/pyspy $ 
ann@nia111:~/pyspy $ 
ann@nia111:~/pyspy $ 

The above report tells me both accuracy and effectiveness of pyspy on recent out of sample data.

After pyspy runs, I want to see the most recent predictions:

ann@nia111:~/ddata $ head -1 learn_test.csv
cdate,cp,pctlead,pctlag1,pctlag2,pctlag4,pctlag8,upf,lowf,prediction,pdir,lead_delta,actual_dir,greenline
ann@nia111:~/ddata $ 
ann@nia111:~/ddata $ tail learn_test.csv
2015-12-16,2073.070,-1.504,1.451,2.529,1.015,-0.890,1.005,1.038,0.484,-1.000,-31.180,-1.000,2190.470
2015-12-17,2041.890,-1.780,-1.504,-0.074,1.467,-1.694,1.019,1.023,0.536,1.000,-36.340,-1.000,2221.650
2015-12-18,2005.550,0.778,-1.780,-3.257,-0.811,-2.813,1.038,1.007,0.561,1.000,15.600,1.000,2185.310
2015-12-21,2021.150,0.882,0.778,-1.016,-1.089,-1.293,1.030,1.015,0.509,1.000,17.820,1.000,2200.910
2015-12-22,2038.970,1.242,0.882,1.666,-1.645,-0.646,1.021,1.024,0.480,-1.000,25.320,1.000,2218.730
2015-12-23,2064.290,-0.160,1.242,2.134,1.097,2.580,1.005,1.035,0.460,-1.000,-3.300,-1.000,2193.410
2015-12-24,2060.990,-0.218,-0.160,1.080,2.764,1.931,1.011,1.034,0.492,-1.000,-4.490,-1.000,2196.710
2015-12-28,2056.500,1.063,-0.218,-0.377,1.749,0.641,1.012,1.020,0.510,1.000,21.860,1.000,2201.200
2015-12-29,2078.360,-0.722,1.063,0.843,1.932,0.255,1.003,1.022,0.494,-1.000,-15.000,-1.000,2223.060
2015-12-30,2063.360,0.000,-0.722,0.334,-0.045,1.051,1.008,1.007,0.497,-1.000,0.000,0.000,2238.060
ann@nia111:~/ddata $ 
ann@nia111:~/ddata $ 

I see that the prediction issued at 12:50PM on 2015-12-30 is -1.000 which is a bearish prediction. The pyspy script predicts that the closing price on 12-31 will probably be below 2063.36.

I see that the prediction issued on 2015-12-29 was also -1.000 and that was an accurate prediction.

I see that the prediction issued on 2015-12-28 was +1.000 and that was an accurate prediction.

I see that the prediction issued on 2015-12-24 was -1.000 and that was an accurate prediction.

I see that the prediction issued on 2015-12-23 was -1.000 and that was an accurate prediction.

I see that the prediction issued on 2015-12-22 was -1.000 and that was a false negative.

An example of a false positive is the prediction issued for 2015-12-17.

I can use python to report on the predictions:

# rpt1.py

import pandas as pd
import numpy  as np
import pdb

test_df = pd.read_csv('learn_test.csv')

# I should count positive predictions.
posp_df  = test_df[['pdir','actual_dir']][test_df['pdir'] == 1]
# I should count true positive predictions.
tposp_df = posp_df[posp_df['actual_dir'] == 1]
# I should calculate positive accuracy.
pos_acc  = 100.0 * len(tposp_df)/len(posp_df)

# I should count negative predictions.
negp_df  = test_df[['pdir','actual_dir']][test_df['pdir'] == -1]
# I should count true negative predictions.
tnegp_df = negp_df[negp_df['actual_dir'] == -1]
# I should calculate negative accuracy.
neg_acc  = 100.0 * len(tnegp_df)/len(negp_df)

# I should calculate combined accuracy
com_acc  = 100.0 * (len(tposp_df)+len(tnegp_df))/len(test_df)

# I should calculate 'effectiveness'.
pos_pctlead_a = np.array(test_df[['pctlead']][test_df['pdir'] == 1])
pos_eff       = np.mean(pos_pctlead_a)
neg_pctlead_a = np.array(test_df[['pctlead']][test_df['pdir'] == -1])
neg_eff       = np.mean(neg_pctlead_a)
longonly_eff  = np.mean(test_df['pctlead'].values)
# I should report
print('True  positive count: '+str(len(tposp_df)))
print('False positive count: '+str(len(posp_df)-len(tposp_df)))
print('True  negative count: '+str(len(tnegp_df)))
print('False negative count: '+str(len(negp_df)-len(tnegp_df)))

print('Accuracy of positive predictions: ' +str(pos_acc)[:6]+'%')
print('Accuracy of negative predictions: ' +str(neg_acc)[:6]+'%')
print('Combined Accuracy of predictions: ' +str(com_acc)[:6]+'%')
print('Positive Effectiveness: '  +str(pos_eff)[:6]+'%')
print('Negative Effectiveness: '  +str(neg_eff)[:6]+'%')
print('Long Only Effectiveness: ' +str(longonly_eff)[:6]+'%')

When I run the above script I should see something like this:

ann@nia111:~/ddata $ 
ann@nia111:~/ddata $ 
ann@nia111:~/ddata $ python rpt1.py
True  positive count: 63
False positive count: 60
True  negative count: 71
False negative count: 57
Accuracy of positive predictions: 51.219%
Accuracy of negative predictions: 55.468%
Combined Accuracy of predictions: 53.386%
Positive Effectiveness: 0.0403%
Negative Effectiveness: -0.027%
Long Only Effectiveness: 0.0057%
ann@nia111:~/ddata $ 
ann@nia111:~/ddata $ 

In addition to the above report, I have a script which creates the 'Blue-Line-Green-Line Visualization':

# plotem.py

# This script should plot data in a CSV file

# Demo:
# cd ~/ddata
# python ~/pyspy/py/plotem.py learn_test.csv

import pandas as pd
import numpy  as np
import pdb
import datetime

# I should check cmd line arg
import sys

print('hello, from '+ sys.argv[0])

#  len(sys.argv) should == 2
if len(sys.argv) == 1:
  print('I need a command line arg.')
  print('Demo:')
  print('python '+sys.argv[0]+' learn_test.csv')
  print('Try again. bye.')
  sys.exit()

csvf = sys.argv[1]
print('Plotting data from: '+csvf)

# I should load the csv into a DataFrame
df1 = pd.read_csv(csvf)

# matplotlib likes dates:
cdate_l = [datetime.datetime.strptime(row, "%Y-%m-%d") for row in df1['cdate'].values]
# I should get values for y-axis now:
cp_l = [row for row in df1['cp']       ] 
gl_l = [row for row in df1['greenline']] 

# I should plot
import matplotlib
# http://matplotlib.org/faq/howto_faq.html#generate-images-without-having-a-window-appear
matplotlib.use('Agg')
# Order is important here.
# Do not move the next import:
import matplotlib.pyplot as plt
plt.figure(figsize=(15,10)) # 10inch x 5inch
plt.plot(cdate_l, cp_l, 'b-', cdate_l, gl_l, 'g-')
plt.title('Blue-Line/Green-Line Visualization (Blue: Long Only, Green: Follow Predictions)')
plt.grid(True)
pngf = csvf.replace('.csv','')+'.png'
plt.savefig(pngf)
plt.close()
print('New png file: ')
print(pngf)
'done'

When I run that script I should see something like this:

ann@nia111:~/ddata $ python ~/pyspy/py/plotem.py learn_test.csv
hello, from /home/ann/pyspy/py/plotem.py
Plotting data from: learn_test.csv
New png file: 
learn_test.png
ann@nia111:~/ddata $ 
ann@nia111:~/ddata $ ls -latr 
total 4428
drwxrwxrwt 35 ann ann    4096 Dec 30 22:31 ..
-rw-rw-r--  1 ann ann 1195127 Dec 30 22:37 GSPC.csv
-rw-rw-r--  1 ann ann   72992 Dec 30 22:37 GSPC.html
-rw-rw-r--  1 ann ann      19 Dec 30 22:37 GSPCrecent.csv
-rw-rw-r--  1 ann ann  353937 Dec 30 22:37 GSPC3.csv
-rw-rw-r--  1 ann ann  353937 Dec 30 22:37 GSPC2.csv
-rw-rw-r--  1 ann ann  848847 Dec 30 22:37 ftrGSPC2.csv
-rw-rw-r--  1 ann ann 1048088 Dec 30 22:37 ftr_wbb_ftrGSPC2.csv
-rw-rw-r--  1 ann ann  450171 Dec 30 22:37 training.csv
-rw-rw-r--  1 ann ann   16238 Dec 30 22:37 test.csv
-rw-rw-r--  1 ann ann   25087 Dec 30 22:37 learn_test.csv
-rw-rw-r--  1 ann ann    1699 Dec 30 23:22 rpt1.py
drwxrwxr-x  2 ann ann    4096 Dec 30 23:22 .
-rw-rw-r--  1 ann ann  135693 Dec 30 23:31 learn_test.png
ann@nia111:~/ddata $ 
ann@nia111:~/ddata $ 

I should see something like this in the learn_test.png file:

learn_test.png

Sometimes I will run pyspy at night after the most recent closing price is available:

~/pyspy/bin/night.bash

Another way to operate pyspy is to call it from herokuspy which is a repository I use to publish pyspy results to the web via heroku:

https://github.com/danbikle/herokuspy

If you have questions, e-me: bikle101 at gmail

Thank you for studying pyspy today.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published
0