這個數據庫的建立可以分為兩大部份,首先要可以定時提取數據源的財經數據,免費的當然要數Yahoo Finance,可以自行寫GET request 去用他「隱藏」的接口拿數據,(這是Yahoo Finance 的 CSV API,用Excel的話會用到這方法,它另外還有一套下載歷史股價的方法就好像在這次改版中失效了。以往Yahoo和Google都有股價的API,但似乎在逐漸放棄。原本Yahoo都有寫一些 License上不可 redistribute / commercial usage 的條款,但現在根本連這個功能的說明版面都找不到了)。另外也有現成的Node package,我就當然懶得 Reinvent the wheel,所以這方面就用現有的 node-yahoo-finance。雖然這個Yahoo的API可以有historical 和snapshot的數據,但免費數據一般都沒有即時更新,Snapshot固然一般有15分鐘滯後,所以一定要看Last-Trade-Time。Historical 的也可以留意到一些日子的Close 和ratio數據有點奇怪,引申到52周高低有點問題,大概免費的東西質量總是少點保證的,所以如果打算直接使用的話,一定要小心,不然類似出現 Fat Finger Error 就慘慘了 ( T_T ) ~~
Node package: yahoo-finance
今次的目的是想做一個 Intraday 的數據庫,讓之後可以有數據基礎去研究分析。因為會經過檢查,所以要求並不高。先安裝node-yahoo-finance:$ npm install yahoo-finance --save
NPM - yahoo-finance
Github - pilwon/node-yahoo-finance
(留意Yahoo在改動原本的API接口 -16May2017,不過見一些常用的field依然可用。open-source的強大之處就是己經有高手去解決 26May2017,原來的snapshot() 功能將被棄用,由新的quotes()。historical() 聲稱不受影響。)
要知如何使用,最好先對Yahoo Finance的接口有點經驗。例如在網址上試試輸入:
http://download.finance.yahoo.com/d/quotes.csv?s=^HSI+0005.HK&f=nsl1t1opc1p2&e=.csv
這是對於 Symbol:^HSI (恆生指數) 和 0005.HK (匯控),傳回它的 Fields:n (Name)、s (Symbol)、l1 (Last Trade Price)、t1 (Last Trade Time)、o (Open)、p (Previous Close)、c1 (Change)、p2 (Change in Percent)。
Symbol可以透過yahoo finance 查詢,港股像Thomsons Reuters的RIC code(e.g. ^HSI, 0005.HK),Field的對照表特別重要,畢竟也要知道人家有什麼數據提供、如何抓取。
指數例子:恆生指數 (^HSI) |
個股例子:匯豐控股 (0005.HK) |
Field: http://www.jarloo.com/yahoo_finance/
Pricing | Dividends |
a: Ask | y: Dividend Yield |
b: Bid | d: Dividend per Share |
b2: Ask (Realtime) | r1: Dividend Pay Date |
b3: Bid (Realtime) | q: Ex-Dividend Date |
p: Previous Close | |
o: Open | |
Date | |
c1: Change | d1: Last Trade Date |
c: Change & Percent Change | d2: Trade Date |
c6: Change (Realtime) | t1: Last Trade Time |
k2: Change Percent (Realtime) | |
p2: Change in Percent | |
Averages | |
c8: After Hours Change (Realtime) | m5: Change From 200 Day Moving Average |
c3: Commission | m6: Percent Change From 200 Day Moving Average |
g: Day’s Low | m7: Change From 50 Day Moving Average |
h: Day’s High | m8: Percent Change From 50 Day Moving Average |
k1: Last Trade (Realtime) With Time | m3: 50 Day Moving Average |
l: Last Trade (With Time) | m4: 200 Day Moving Average |
l1: Last Trade (Price Only) | |
t8: 1 yr Target Price | |
Misc | |
w1: Day’s Value Change | g1: Holdings Gain Percent |
w4: Day’s Value Change (Realtime) | g3: Annualized Gain |
p1: Price Paid | g4: Holdings Gain |
m: Day’s Range | g5: Holdings Gain Percent (Realtime) |
m2: Day’s Range (Realtime) | g6: Holdings Gain (Realtime) |
52 Week Pricing | Symbol Info |
k: 52 Week High | i: More Info |
j: 52 week Low | j1: Market Capitalization |
j5: Change From 52 Week Low | j3: Market Cap (Realtime) |
k4: Change From 52 week High | f6: Float Shares |
j6: Percent Change From 52 week Low | n: Name |
k5: Percent Change From 52 week High | n4: Notes |
w: 52 week Range | s: Symbol |
s1: Shares Owned | |
x: Stock Exchange | |
j2: Shares Outstanding | |
Volume | |
v: Volume | |
a5: Ask Size | |
b6: Bid Size | Misc |
k3: Last Trade Size | t7: Ticker Trend |
a2: Average Daily Volume | t6: Trade Links |
i5: Order Book (Realtime) | |
Ratios | l2: High Limit |
e: Earnings per Share | l3: Low Limit |
e7: EPS Estimate Current Year | v1: Holdings Value |
e8: EPS Estimate Next Year | v7: Holdings Value (Realtime) |
e9: EPS Estimate Next Quarter | s6 Revenue |
b4: Book Value | |
j4: EBITDA | |
p5: Price / Sales | |
p6: Price / Book | |
r: P/E Ratio | |
r2: P/E Ratio (Realtime) | |
r5: PEG Ratio | |
r6: Price / EPS Estimate Current Year | |
r7: Price / EPS Estimate Next Year | |
s7: Short Ratio |
回到講 Node.JS 上的 yahoo-finace 套件,用 snapshot() 的方法,提供 Symbols 和 Fields 的Array,這樣會傳回一個JSON格式的的資料表。(......之後會嘗試新的quote()了 )
node-stock-snapshotDB.js
var yahooFinance = require('yahoo-finance');
var _ = require('lodash');
var FIELDS = _.flatten([
['n'], //name
['l1','d1','d2','t1'], //Last Trade, Last Trade Date, Trade Date, Last Trade Time
['p','c1','w','j6','k5'],
['y','r','p6']
]);
var SYMBOLS = [ '^HSI', '0005.HK'];
// extract data from Yahoo
yahooFinance.snapshot({
fields: FIELDS, // Fields 的陣列
symbols: SYMBOLS // Symbol 的陣列
}, function (err, result) {
if (err) { throw err; }
_.forEach(result, function (snapshot, symbol) {
//...... 對每個symbol的每次snapshot做處理....
});
定時功能只要用全域變數 myVar = SetInterval(function, milliseconds,...) 去令指令重覆每5分鐘執行一次,當執行了100次就執行 clearInterval(myVar) 去停止。
node-stock-snapshotDB.js
// set interval job every 5 min
var myInterval = setInterval(function () {
// extract data from Yahoo
yahooFinance.snapshot({
fields: FIELDS, // Fields 的陣列
symbols: SYMBOLS // Symbol 的陣列
}, function (err, result) {
if (err) { throw err; }
_.forEach(result, function (snapshot, symbol) {
//...... 對每個symbol的每次snapshot做處理....
});
if (countRecord >100) {
// db.close();
clearInterval(myInterval);
}
}, 1000*60*5);
MongoDB
下一部份是數據庫工作,用一般的Mongoose 去管MongoDB,這個是從Coursera的課程中學來的。這套MongoDB 一個好處是方便使用 JSON(Javascript Object Notation) 格式,大概對我們用JS寫的程式方便點吧?另外,MongoDB 是 NoSQL (Not only SQL) (也是Not Relational, Document based) 的一個代表,相對於用SQL的RDBMS資料庫。提供彈性處理不是那種規規矩矩的資料表,以至配合大數據常見的分散式儲存架構,近年就更得到重視。不過,這裡可以沒有這種需求,我們甚至用Mongoose去定義Schema和Model來方便規範一下我們的數據庫。Snapshots.js - 讓 Mongoose Database 用的 Model
// grab the things we need
var mongoose = require('mongoose');
var Schema = mongoose.Schema;
// create a schema
var snapshotSchema = new Schema({
symbol: {type: String, required: true},
name: {type: String, required: true},
lastTradePrice: {type: String, required: true},
lastTradeDate: {type: String, required: true},
tradeDate: {type: String},
lastTradeTime: {type: String}
}, {timestamps: true
});
var Snapshots = mongoose.model('Snapshot', snapshotSchema); // Create a Model using this Schema, mongoose always use plural form of of collection name to make Model
module.exports = Snapshots; // make Model available to Node applications
完整的 node-stock-snapshotDB.js 我放到最後面,讓不看Code的人看得輕鬆一點。
當每天繼續上班做個營營役役的螺絲釘,家裡就讓這個後台程序掛機跑一天。回來後應該要看一下自己的成果吧,但MangoDB 的儲存方式是一種叫BSON (Binary JSON) 的格式,反正就不是你在其他統計軟件方便弄的,所以我們先要把內容匯出來成為csv的形式。
在Terminal中這樣打開Mongo Shell:
$ mongo
匯出成CSV:
$ mongoexport -d 'conFusion' -c 'snapshots' -o 'snapshots.csv' --type=csv --fields "symbol,name,lastTradePrice,lastTradeDate,lastTradeTime"
當中的option包括: -d <database>, -c <collection>, -o <output name>,
--type <output type>, --field <output fields> ( 選擇--type=csv的話一定要提供 --fields)清除舊資料的Collection,以至清除Database:
更多Mongo shell 的指令可以參考 MongoDB Tutorial: https://www.tutorialspoint.com/mongodb/index.htm$ show dbs
$ use conFusion
$ show collections $ db.snapshots.drop $ db.dropDatabase()
下一篇寫埋Node.JS會寫另一個用 socket.io 在外面連結自己家中伺服器查詢自定的指標的方法。這篇作為了解Node.JS的應用,學習 Node.JS 以至 Javascript 的寫法。
node-stock-snapshotDB.js
var yahooFinance = require('yahoo-finance'); var mongoose = require('mongoose'); var _ = require('lodash'); var Snapshots = require('./models/snapshot'); var FIELDS = _.flatten([ ['n'], //name ['l1','d1','d2','t1'], //Last Trade, Last Trade Date, Trade Date, Last Trade Time ['p','c1','w','j6','k5'], ['y','r','p6'] ]); var SYMBOLS = ['^HSI','0823.HK']; var countRecord = 0; // Connection URL var url = 'mongodb://localhost:27017/conFusion'; //set the host as localhost, and database as conFusion mongoose.connect(url); // Mongoose pending a connection to the database var db = mongoose.connection; db.on('error', console.error.bind(console, 'connection error:')); //If connection fail, alert db.once('open', function () { // we're connected! console.log("Connected correctly to server"); // set interval job every 5 min var myInterval = setInterval(function () { // extract data from Yahoo yahooFinance.snapshot({ fields: FIELDS, symbols: SYMBOLS }, function (err, result) { if (err) { throw err; } _.forEach(result, function (snapshot, symbol) { // create a new snapshot Snapshots.create({ symbol: snapshot['symbol'], name: snapshot['name'], lastTradePrice: snapshot['lastTradePriceOnly'], lastTradeDate: snapshot['lastTradeDate'], lastTradeTime: snapshot['lastTradeTime'] }, function (err, snapshot) { if (err) throw err; var id = snapshot._id; setTimeout(function () { snapshot.save(function (err, snapshot) { countRecord++; console.log('Updated Record #%d', countRecord); }); }, 3000); }); }); }); if (countRecord >5) { db.close(); clearInterval(myInterval); } },
1000*60*5
); });
http://download.finance.yahoo.com/d/quotes.csv?s=^HSI+0005.HK&f=nsl1t1opc1p2&e=.csv
回覆刪除http://ichart.yahoo.com/table.csv?s=%5EHSI&a=11&b=31&c=1986&g=d&ignore=.csv
失效了 ?