2017年5月29日 星期一

[JS] 用 NodeJS 製作記錄日間股價的csv檔(Yahoo Finance)

學習Node.JS 有兩個目標希望可以寫到的,一個是自己的後台程式可以按需要查詢自定的指標,二是在日間抓取 網上免費的「即時」股票報價 ( Intraday Data),儲存做本機上的數據庫。兩個需求都在過往星期有所進展,比較接近完成的是日間股價的數據庫。

這個數據庫的建立可以分為兩大部份,首先要可以定時提取數據源的財經數據,免費的當然要數Yahoo Finance,可以自行寫GET request 去用他「隱藏」的接口拿數據,(這是Yahoo Finance 的 CSV API,用Excel的話會用到這方法,它另外還有一套下載歷史股價的方法就好像在這次改版中失效了。以往Yahoo和Google都有股價的API,但似乎在逐漸放棄。原本Yahoo都有寫一些 License上不可 redistribute / commercial usage 的條款,但現在根本連這個功能的說明版面都找不到了)。另外也有現成的Node package,我就當然懶得 Reinvent the wheel,所以這方面就用現有的 node-yahoo-finance。雖然這個Yahoo的API可以有historical 和snapshot的數據,但免費數據一般都沒有即時更新,Snapshot固然一般有15分鐘滯後,所以一定要看Last-Trade-Time。Historical 的也可以留意到一些日子的Close 和ratio數據有點奇怪,引申到52周高低有點問題,大概免費的東西質量總是少點保證的,所以如果打算直接使用的話,一定要小心,不然類似出現 Fat Finger Error 就慘慘了 ( T_T ) ~~

Node package: yahoo-finance

今次的目的是想做一個 Intraday 的數據庫,讓之後可以有數據基礎去研究分析。因為會經過檢查,所以要求並不高。先安裝node-yahoo-finance:
$ npm install yahoo-finance --save

NPM - yahoo-finance
Github - pilwon/node-yahoo-finance
(留意Yahoo在改動原本的API接口 -16May2017,不過見一些常用的field依然可用。open-source的強大之處就是己經有高手去解決 26May2017,原來的snapshot() 功能將被棄用,由新的quotes()。historical() 聲稱不受影響。)

要知如何使用,最好先對Yahoo Finance的接口有點經驗。例如在網址上試試輸入:
http://download.finance.yahoo.com/d/quotes.csv?s=^HSI+0005.HK&f=nsl1t1opc1p2&e=.csv

這是對於 Symbol:^HSI (恆生指數) 和 0005.HK (匯控),傳回它的 Fields:n (Name)、s (Symbol)、l1 (Last Trade Price)、t1 (Last Trade Time)、o (Open)、p (Previous Close)、c1 (Change)、p2 (Change in Percent)。

Symbol可以透過yahoo finance 查詢,港股像Thomsons Reuters的RIC code(e.g. ^HSI, 0005.HK),Field的對照表特別重要,畢竟也要知道人家有什麼數據提供、如何抓取。
指數例子:恆生指數 (^HSI)

個股例子:匯豐控股 (0005.HK)

Field: http://www.jarloo.com/yahoo_finance/
Pricing Dividends
a: Ask y: Dividend Yield
b: Bid d: Dividend per Share
b2: Ask (Realtime) r1: Dividend Pay Date
b3: Bid (Realtime) q: Ex-Dividend Date
p: Previous Close
o: Open
Date
c1: Change d1: Last Trade Date
c: Change & Percent Change d2: Trade Date
c6: Change (Realtime) t1: Last Trade Time
k2: Change Percent (Realtime)
p2: Change in Percent
Averages
c8: After Hours Change (Realtime) m5: Change From 200 Day Moving Average
c3: Commission m6: Percent Change From 200 Day Moving Average
g: Day’s Low m7: Change From 50 Day Moving Average
h: Day’s High m8: Percent Change From 50 Day Moving Average
k1: Last Trade (Realtime) With Time m3: 50 Day Moving Average
l: Last Trade (With Time) m4: 200 Day Moving Average
l1: Last Trade (Price Only)
t8: 1 yr Target Price
Misc
w1: Day’s Value Change g1: Holdings Gain Percent
w4: Day’s Value Change (Realtime) g3: Annualized Gain
p1: Price Paid g4: Holdings Gain
m: Day’s Range g5: Holdings Gain Percent (Realtime)
m2: Day’s Range (Realtime) g6: Holdings Gain (Realtime)
52 Week Pricing Symbol Info
k: 52 Week High i: More Info
j: 52 week Low j1: Market Capitalization
j5: Change From 52 Week Low j3: Market Cap (Realtime)
k4: Change From 52 week High f6: Float Shares
j6: Percent Change From 52 week Low n: Name
k5: Percent Change From 52 week High n4: Notes
w: 52 week Range s: Symbol

s1: Shares Owned

x: Stock Exchange

j2: Shares Outstanding
Volume
v: Volume
a5: Ask Size
b6: Bid Size Misc
k3: Last Trade Size t7: Ticker Trend
a2: Average Daily Volume t6: Trade Links

i5: Order Book (Realtime)
Ratios l2: High Limit
e: Earnings per Share l3: Low Limit
e7: EPS Estimate Current Year v1: Holdings Value
e8: EPS Estimate Next Year v7: Holdings Value (Realtime)
e9: EPS Estimate Next Quarter s6 Revenue
b4: Book Value
j4: EBITDA
p5: Price / Sales
p6: Price / Book
r: P/E Ratio
r2: P/E Ratio (Realtime)
r5: PEG Ratio
r6: Price / EPS Estimate Current Year
r7: Price / EPS Estimate Next Year
s7: Short Ratio

回到講 Node.JS 上的 yahoo-finace 套件,用 snapshot() 的方法,提供 SymbolsFields 的Array,這樣會傳回一個JSON格式的的資料表。(......之後會嘗試新的quote()了 )

node-stock-snapshotDB.js
var yahooFinance = require('yahoo-finance');
var _ = require('lodash');

var FIELDS = _.flatten([
    ['n'], //name
    ['l1','d1','d2','t1'], //Last Trade, Last Trade Date, Trade Date, Last Trade Time
    ['p','c1','w','j6','k5'], 
    ['y','r','p6']
    ]);
    
var SYMBOLS = [ '^HSI', '0005.HK'];

// extract data from Yahoo
yahooFinance.snapshot({
    fields: FIELDS,   // Fields 的陣列
    symbols: SYMBOLS   // Symbol 的陣列
}, function (err, result) {
    if (err) { throw err; }
        _.forEach(result, function (snapshot, symbol) {
        //...... 對每個symbol的每次snapshot做處理....
});

定時功能只要用全域變數 myVar = SetInterval(function, milliseconds,...) 去令指令重覆每5分鐘執行一次,當執行了100次就執行 clearInterval(myVar) 去停止。
node-stock-snapshotDB.js
// set interval job every 5 min
var myInterval = setInterval(function () {

    // extract data from Yahoo
    yahooFinance.snapshot({
        fields: FIELDS,   // Fields 的陣列
        symbols: SYMBOLS   // Symbol 的陣列
    }, function (err, result) {
        if (err) { throw err; }
            _.forEach(result, function (snapshot, symbol) {
            //...... 對每個symbol的每次snapshot做處理....
    });

    if (countRecord >100) {
        // db.close(); 
        clearInterval(myInterval);
    }
 }, 1000*60*5); 

MongoDB

下一部份是數據庫工作,用一般的Mongoose 去管MongoDB,這個是從Coursera的課程中學來的。這套MongoDB 一個好處是方便使用 JSON(Javascript Object Notation) 格式,大概對我們用JS寫的程式方便點吧?另外,MongoDB 是 NoSQL (Not only SQL)  (也是Not Relational, Document based) 的一個代表,相對於用SQL的RDBMS資料庫。提供彈性處理不是那種規規矩矩的資料表,以至配合大數據常見的分散式儲存架構,近年就更得到重視。不過,這裡可以沒有這種需求,我們甚至用Mongoose去定義Schema和Model來方便規範一下我們的數據庫。

Snapshots.js  -  讓 Mongoose Database 用的 Model
// grab the things we need
var mongoose = require('mongoose');
var Schema = mongoose.Schema;

// create a schema
var snapshotSchema = new Schema({
    symbol: {type: String, required: true},
    name: {type: String, required: true},
    lastTradePrice:  {type: String, required: true},
    lastTradeDate: {type: String, required: true},
    tradeDate: {type: String},
    lastTradeTime: {type: String}
    }, {timestamps: true
  });

var Snapshots = mongoose.model('Snapshot', snapshotSchema);  // Create a Model using this Schema, mongoose always use plural form of of collection name to make Model
module.exports = Snapshots;  // make Model available to Node applications


完整的 node-stock-snapshotDB.js 我放到最後面,讓不看Code的人看得輕鬆一點。

當每天繼續上班做個營營役役的螺絲釘,家裡就讓這個後台程序掛機跑一天。回來後應該要看一下自己的成果吧,但MangoDB 的儲存方式是一種叫BSON (Binary JSON) 的格式,反正就不是你在其他統計軟件方便弄的,所以我們先要把內容匯出來成為csv的形式。

在Terminal中這樣打開Mongo Shell:
$ mongo

匯出成CSV:
$ mongoexport -d 'conFusion' -c 'snapshots' -o 'snapshots.csv' --type=csv --fields "symbol,name,lastTradePrice,lastTradeDate,lastTradeTime"
當中的option包括: -d <database>, -c <collection>, -o <output name>, --type <output type>, --field <output fields> ( 選擇--type=csv的話一定要提供 --fields)

清除舊資料的Collection,以至清除Database:
$ show dbs$ use conFusion
$ show collections 
$ db.snapshots.drop
$ db.dropDatabase()
更多Mongo shell 的指令可以參考 MongoDB Tutorial: https://www.tutorialspoint.com/mongodb/index.htm

下一篇寫埋Node.JS會寫另一個用 socket.io 在外面連結自己家中伺服器查詢自定的指標的方法。這篇作為了解Node.JS的應用,學習 Node.JS 以至 Javascript 的寫法。

node-stock-snapshotDB.js
var yahooFinance = require('yahoo-finance');
var mongoose = require('mongoose');
var _ = require('lodash');
var Snapshots = require('./models/snapshot');

var FIELDS = _.flatten([
    ['n'], //name
    ['l1','d1','d2','t1'], //Last Trade, Last Trade Date, Trade Date, Last Trade Time
    ['p','c1','w','j6','k5'], 
    ['y','r','p6']
    ]);
    
var SYMBOLS = ['^HSI','0823.HK'];
var countRecord = 0;

// Connection URL
var url = 'mongodb://localhost:27017/conFusion';  //set the host as localhost, and database as conFusion
mongoose.connect(url); // Mongoose pending a connection to the database

var db = mongoose.connection; 
db.on('error', console.error.bind(console, 'connection error:')); //If connection fail, alert 
db.once('open', function () { 
    // we're connected!
    console.log("Connected correctly to server");

    // set interval job every 5 min
    var myInterval = setInterval(function () {

        // extract data from Yahoo
        yahooFinance.snapshot({
          fields: FIELDS,
          symbols: SYMBOLS
        }, function (err, result) {
          if (err) { throw err; }
          _.forEach(result, function (snapshot, symbol) {
                // create a new snapshot
                Snapshots.create({
                    symbol: snapshot['symbol'],
                    name: snapshot['name'],
                    lastTradePrice: snapshot['lastTradePriceOnly'], 
                    lastTradeDate: snapshot['lastTradeDate'],
                    lastTradeTime: snapshot['lastTradeTime']  
                }, function (err, snapshot) {
                    if (err) throw err;
                    var id = snapshot._id;
                    setTimeout(function () {
                        snapshot.save(function (err, snapshot) {
                            countRecord++;
                            console.log('Updated Record #%d', countRecord);   
                        });
                    }, 3000);
                });
          });
        });    
    
        if (countRecord >5) {
            db.close();
            clearInterval(myInterval);
        }
    
    }, 1000*60*5);        
        
});

1 則留言:

  1. http://download.finance.yahoo.com/d/quotes.csv?s=^HSI+0005.HK&f=nsl1t1opc1p2&e=.csv

    http://ichart.yahoo.com/table.csv?s=%5EHSI&a=11&b=31&c=1986&g=d&ignore=.csv

    失效了 ?

    回覆刪除