• ### Z-Studio

Organizer
July 27, 2021 at 7:02 am

A way of identifying trading opportunities through mathematical computation and number crunching.

• ### Z-Studio

Organizer
July 27, 2021 at 1:16 pm

How should we understand the Black-Scholes option pricing model?

The BS model is simplest, most intuitive, and most vivid when understood from the perspective of self-financing dynamic hedging, because the basic function of an option is risk management, and for the option seller, dynamic hedging is the eternal melody and the origin of the model. Other derivations include taking the limit of the CRR binomial tree, using a change of variables to reduce the pricing PDE to the heat equation, and brute-force integration. Below I write out a derivation, annotated so that it is easy for beginners to follow and, for readers with a finance background, relatively vivid. The contents include:

- Understanding self-financing dynamic hedging (completed)
- From GBM to the BS differential equation (completed)
- How to understand and remember the BS differential equation from the Greeks (completed)
- How to understand the [formula] required in the proof (completed)
- How to understand Itô's Lemma in the proof, plus a Goldman Sachs interview question (completed)
- A BS differential equation interview exam example (done)

Understanding self-financing dynamic hedging: our starting capital is 0. Self-financing means that three activities together (borrowing and depositing money, selling the option (the derivative), and trading the stock (the underlying asset)) leave us with neither loss nor profit over the whole process. At time T, the value of the position in our trading account is as follows (we sell the option and trade the stock to hedge it): Image 1

And the cash value in our bank account is: Image 2

At time T + dt, the value of the position in the trading account has changed, because both the option price and the stock price have moved: Image 3

Then at time T + dt, the net profit or loss of our account is expressed as follows: Image 4

That is: the value of the account position at time T + dt, minus the value of the account position at time T, plus the interest on the bank account (which we may owe to the bank): Image 5

What we want is for this P&L to be 0 over every tiny period of time. This means our hedging process perfectly replicates the option, which also means that the money we make in each tiny period just covers the interest we pay to the bank (or vice versa), alas. From GBM to the BS differential equation: for any time T, we have the following two expressions: Image 6

By Itô's Lemma: Image 7

Substituting the first expression into the second, we obtain a new formula: Image 8

So we have Image 9
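The images are not reproduced in this thread, so here is a standard version of what they presumably contained, written in the usual notation ($V$ the option value, $S$ the stock price, $r$ the risk-free rate, $\sigma$ the volatility):

$$dS = \mu S\,dt + \sigma S\,dW \qquad \text{(GBM)}$$

$$dV = \left(\frac{\partial V}{\partial t} + \mu S\,\frac{\partial V}{\partial S} + \frac{1}{2}\sigma^2 S^2\,\frac{\partial^2 V}{\partial S^2}\right)dt + \sigma S\,\frac{\partial V}{\partial S}\,dW \qquad \text{(Itô's Lemma)}$$

Holding $-1$ option and $\Delta = \partial V/\partial S$ shares, the $dW$ terms cancel, and requiring the hedged, self-financing position to earn exactly the risk-free rate yields the BS differential equation:

$$\frac{\partial V}{\partial t} + rS\,\frac{\partial V}{\partial S} + \frac{1}{2}\sigma^2 S^2\,\frac{\partial^2 V}{\partial S^2} = rV$$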

• ### Z-Studio

Organizer
July 27, 2021 at 1:24 pm

Quantitative strategy: How to use autoregressive model to build intraday high frequency strategy

Autoregressive model

An autoregressive (AR) model is a time series model in which:

The predictor variables are past values of the time series

The target variable is a future value of the time series

If the first lag y_{t-1} of the series is used as the predictor, the model is called AR(1), as shown below:

$$y_t = \beta_0 + \beta_1 y_{t-1} + \epsilon_t \quad (\beta \text{ are regression coefficients})$$

If the first two lags y_{t-1} and y_{t-2} are used as predictors, the model is called AR(2), as shown below:

$$y_t = \beta_0 + \beta_1 y_{t-1} + \beta_2 y_{t-2} + \epsilon_t$$
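As a quick sanity check, we can simulate an AR(2) process and recover its coefficients with R's built-in `arima.sim` and `arima` (a sketch; the seed and true coefficients 0.5 and 0.3 are invented for illustration):

```r
# simulate an AR(2) process and recover its coefficients with arima()
set.seed(42)
y <- arima.sim(model = list(ar = c(0.5, 0.3)), n = 5000)
fit <- arima(y, order = c(2, 0, 0))
round(fit$coef[c("ar1", "ar2")], 2)  # estimates should land near 0.5 and 0.3
```
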

Ernie’s AR model

Ernie says:

Time series models are most useful in markets where fundamental information is lacking or not particularly useful for short-term forecasting. Currency markets fit the bill.

Ernie used the Bayesian information criterion (BIC) to find the optimal lag order p, and built an AR(p) model of minutely AUD/USD prices. He used in-sample data from July 24, 2007 to August 12, 2014 to fit an AR(10) model:
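Ernie selects the lag order with BIC; R's built-in `ar()` offers the same kind of automatic order selection, though via AIC rather than BIC (a sketch on simulated data, since the original AUD/USD series isn't included here):

```r
# automatic AR order selection: ar() picks the order minimizing AIC
set.seed(1)
y <- arima.sim(model = list(ar = c(0.6, 0.2)), n = 5000)
fit <- ar(y, order.max = 10, aic = TRUE)
fit$order  # the selected lag order
```
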

The model was then backtested on out-of-sample data from August 12, 2014 to August 3, 2015 (excluding costs); the equity curve was as follows:

Think more deeply about the model

If we want to use the AR(P) model to predict prices, what assumptions do we have to make?

In essence: that past prices are related to future prices, and that this relationship contains tradable information.

The first part seems entirely reasonable; after all, a good starting point for predicting tomorrow's price is today's. For real trading, however, such a naive prediction won't do much good. On the other hand, if we could predict returns, that would be genuinely useful!

Most quants will tell you that price levels contain little useful predictive information, and that you instead need to focus on the process by which prices change, namely returns. Building a price-based time series model feels a bit like technical analysis, albeit with more interesting tools than trend lines.

If you're going to use an AR model, a reasonable starting point is to check whether prices actually follow an AR process. For this we use R.

As expected, the minutely price series has strong autocorrelation. I used minute data from 2009 to 2020, and as the graph above shows, the price a minute ago looks a lot like the current price.

We can use the partial autocorrelation function (PACF) to judge the lag order of the AR model. The following is the partial autocorrelation graph:

The results are very interesting. If the price series were a random walk, you would expect the partial autocorrelations beyond the first lag to be insignificant, yet there is significant partial autocorrelation.

The following code creates a random walk price sequence and a partial autocorrelation graph:

```r
library(tidyverse)

# random walk for comparison
n <- 10000
random_walk <- cumsum(c(100, rnorm(n)))

data.frame(x = 1:(n + 1), y = random_walk) %>%
  ggplot(aes(x = x, y = y)) +
  geom_line() +
  theme_bw()

p <- pacf(random_walk, plot = FALSE)
plot(p[2:20])
```

Random walk price sequence:

Partial autocorrelation graph of random walk sequence:

Back to the partial autocorrelation graph of the real price series: Ernie's choice of an AR(10) model is reasonable, though judging from the PACF a lag order of 15 would also be defensible.

It will be interesting to see how the PACF curve changes over time. The following code divides the sample by year and calculates the partial autocorrelation graph for each year:

```r
# annual pacfs (assumes tidyverse and lubridate are loaded; audusd holds timestamp/close columns)
annual_partials <- audusd %>%
  mutate(year = year(timestamp)) %>%
  group_by(year) %>%
  # create a column of date-close dataframes called data
  nest(-year) %>%
  mutate(
    # calculate pacf for each date-close dataframe in the data column
    pacf_res = purrr::map(data, ~ pacf(.x$close, plot = FALSE)),
    # extract pacf values from the pacf_res objects and drop redundant dimensions
    pacf_vals = purrr::map(pacf_res, ~ drop(.x$acf))
  ) %>%
  # promote the column of lists such that we have one row per lag per year
  unnest(pacf_vals) %>%
  group_by(year) %>%
  mutate(lag = seq(0, n() - 1))

# 95% significance level for a pacf estimated from this many observations
signif <- function(x) {
  qnorm((1 + 0.95) / 2) / sqrt(sum(!is.na(x)))
}

signif_levels <- audusd %>%
  mutate(year = year(timestamp)) %>%
  group_by(year) %>%
  summarise(significance = signif(close))

annual_partials %>%
  filter(lag > 0, lag <= 20) %>%
  ggplot(aes(x = lag, y = pacf_vals)) +
  geom_segment(aes(xend = lag, yend = 0)) +
  geom_point(size = 1, colour = "steelblue") +
  geom_hline(yintercept = 0) +
  facet_wrap(~year, ncol = 3) +
  geom_hline(data = signif_levels, aes(yintercept = significance), colour = "red", linetype = "dashed")
```

The partial autocorrelations look pretty stable. But remember, we're using prices rather than returns, so what we're seeing is simply that current prices are related to recent past prices.

Next, look at the autocorrelation (ACF) diagram of minute returns:

ret_acf <- acf(audusd %>%

mutate(returns = (close – dplyr::lag(close))/dplyr::lag(close)) %>%

select(returns) %>%

na.omit() %>%

pull(),

lags = 20, plot = FALSE

)

plot(ret_acf[2:20], main = ‘ACF of minutely returns’)

The chart above shows significant negative autocorrelation in minutely returns. I think we can tentatively conclude:

There’s nothing special about the last ten minutes.

This is not an indicator we can trade, especially in the world of retail forex trading.

Before jumping to conclusions, run some simulations:

Fit an AR(10) model on a large amount of data.

Create a trading strategy based on the AR(10) model and backtest it on out-of-sample data.

Next, R is used to fit the AR(10) model, and Zorro is used for the backtest.

First, fit the AR(10) model in R (the AUD/USD prices are stored in a data frame, indexed by timestamp):

```r
# fit an AR model
ar <- arima(
  audusd %>% filter(timestamp < "2014-01-01") %>% select(close) %>% pull(),
  order = c(10, 0, 0)
)
```

The estimated coefficients of the AR(10) model:

```r
ar$coef
#          ar1          ar2          ar3           ar4          ar5          ar6
# 0.9741941564 0.0228922865 0.0019821879 -0.0073977641 0.0045880720 0.0072364966
#           ar7          ar8          ar9         ar10    intercept
# -0.0047513598 0.0003852733 0.0048944003 0.0057283039 0.6692288336
```

Create a custom function to generate predictions based on the AR model:

```r
# fit an AR model and return the step-ahead prediction
# can fit a new model, or return predictions given an existing set of coeffs and new data
# params:
#   series: data used to fit the model or to predict on
#   order:  the AR order
#   fixed:  either NA or a vector of coeffs of the same length as the number of model parameters
# usage:
#   fit a new model and return the next prediction (series is the data to be fitted, fixed is NA):
#     fit_ar_predict(audusd$close[1:100000], order = 10)
#   predict using an existing set of coeffs:
#     fit_ar_predict(audusd$close[100001:100010], order = 10, fixed = ar$coef)
fit_ar_predict <- function(series, order = 10, fixed = NA) {
  if (sum(is.na(fixed)) == 0) {
    # make predictions using static coefficients
    predict(arima(series, order = c(order, 0, 0), fixed = fixed), 1)$pred
  } else {
    # fit a new model; arima(series, c(order, 0, 0)) fits an AR(order)
    predict(arima(series, order = c(order, 0, 0)), 1)$pred
  }
}
```
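To see the fixed-coefficient branch in action, here is a self-contained sketch on simulated data (the series, seed, and window length are invented for illustration; in the strategy, the series would be recent closes and the fixed vector the coefficients estimated above):

```r
# estimate an AR(10) once, then produce a one-step-ahead prediction
# from a fresh window with the coefficients held fixed
set.seed(7)
train <- arima.sim(model = list(ar = c(0.5, 0.3, rep(0, 8))), n = 5000)
fit <- arima(train, order = c(10, 0, 0))

# re-use the frozen coefficients on a short recent window and predict one step ahead
recent <- tail(train, 100)
pred <- predict(arima(recent, order = c(10, 0, 0), fixed = fit$coef), 1)$pred
pred  # a single numeric forecast for the next observation
```
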

The Zorro code for the backtest:

```c
#include <r.h>

function run()
{
	set(PLOTNOW);
	setf(PlotMode, PL_FINE);
	StartDate = 2014;
	EndDate = 2015;
	BarPeriod = 10;
	LookBack = 10;
	MaxLong = MaxShort = 1;
	MonteCarlo = 0;

	if(is(INITRUN))
	{
		// start R and source the AR prediction function
		if(!Rstart("ar.R", 2))
		{
			print("Error - can't start R session!");
			quit();
		}
	}

	asset("AUD/USD");
	Spread = Commission = RollLong = RollShort = Slippage = 0;

	// generate reversed price series (the order of Zorro series is opposite what you'd expect)
	vars closes = rev(series(priceClose()));

	// model parameters
	int order = 10;
	var coeffs[11] = {0.9793975109, 0.0095665978, 0.0025503174, 0.0013394797, 0.0060263045,
		-0.0023060104, -0.0022220192, 0.0006940781, 0.0011942208, 0.0037558386,
		0.9509437891}; // note 1 extra coeff - the intercept

	if(!is(LOOKBACK)) {
		// send function argument values to R
		Rset("order", order);
		Rset("series", closes, order);
		Rset("fixed", coeffs, order + 1);

		// compute the AR prediction and trade
		var pred = Rd("fit_ar_predict(series = series, order = order, fixed = fixed)");
		printf("\nCurrent: %.5f\nPrediction: %.5f", priceClose(), pred);
		if(pred > priceClose())
			enterLong();
		else if(pred < priceClose())
			enterShort();
	}
}
```

Since the R function is called every minute to make a prediction, the backtest takes some time. Here are the results:

This is fairly consistent with Ernie's backtest results (excluding costs). Unfortunately, the strategy trades so often that the profits don't cover the costs: the average gain in Zorro's backtest was just 0.1 pips per trade.

Can we reduce the trading frequency?

Transaction costs are the major issue, so start by optimizing the trading frequency.

If there is evidence of significant autocorrelation over longer time frames, we can trade less often and hold positions longer.

First, try 10-minute bars:

```r
# 10-minute partials
partial <- pacf(
  audusd %>%
    mutate(minute = minute(timestamp)) %>%
    filter(minute %% 10 == 0) %>%
    select(close) %>%
    pull(),
  lag.max = 20, plot = FALSE)

plot(partial[2:20], main = "Ten minutely PACF")
```

Then hourly bars:

```r
# hourly partials
partial <- pacf(
  audusd %>%
    mutate(minute = minute(timestamp)) %>%
    filter(minute == 0) %>%
    select(close) %>%
    pull(),
  lag.max = 20, plot = FALSE)

plot(partial[2:20], main = "Hourly PACF")
```

And finally, daily bars:

```r
# daily partials
partial <- pacf(
  audusd %>%
    mutate(hour = hour(timestamp)) %>%
    filter(hour == 0) %>%
    select(close) %>%
    pull(),
  lag.max = 20, plot = FALSE)

plot(partial[2:20], main = "Daily PACF")
```

As longer time frames are used, the partial autocorrelation of the price series becomes weaker and weaker.

We can fit an AR(10) model to the 10-minute bars. Following the same process as above, the results are as follows:

We've managed to increase the average profit per trade to 0.3 pips, but that still falls far short of covering costs.
