<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://dm13450.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://dm13450.github.io/" rel="alternate" type="text/html" /><updated>2026-04-08T09:36:25+00:00</updated><id>https://dm13450.github.io/feed.xml</id><title type="html">Dean Markwick</title><subtitle>Personal website for Dean Markwick. If you like stats, sports and rambling, you&apos;ve come to the right place. All rights reserved. 
</subtitle><author><name>Dean Markwick</name></author><entry><title type="html">Making Sense of the DXY</title><link href="https://dm13450.github.io/2026/03/10/Making-Sense-of-the-DXY.html" rel="alternate" type="text/html" title="Making Sense of the DXY" /><published>2026-03-10T00:00:00+00:00</published><updated>2026-03-10T00:00:00+00:00</updated><id>https://dm13450.github.io/2026/03/10/Making-Sense-of-the-DXY</id><content type="html" xml:base="https://dm13450.github.io/2026/03/10/Making-Sense-of-the-DXY.html"><![CDATA[<p>My day job is in quant <em>trading</em>, but there’s another fascinating world: quantitative <em>investing</em>. While I focus on latencies and execution, quant investors are busy building the most efficient portfolios and ensuring they extract pure alpha. Not one to stay in my lane, I’m using this blog post as an opportunity to dive into the world of quant investing and level up my knowledge.</p>

<p></p>
<hr />
<p>Enjoy these types of posts? Then you should sign up for my newsletter.</p>
<div style="text-align: center;">
<iframe src="https://dm13450.substack.com/embed" width="480" height="150" style="border:1px solid #fdfdfd; background:#fdfdfd;" frameborder="0" scrolling="no"></iframe>
</div>
<hr />
<p></p>

<p>Most quant investing examples use equities as the underlying asset class, but I am an FX man, so I will be replacing Apple and Microsoft with euros and yen. In some ways, this is easier: my investible universe is 30-odd currencies rather than thousands, if not hundreds of thousands, of stocks. But in many ways it’s harder. FX returns are driven by macro-level forces rather than company-level ones, and things like central bank rate decisions and government policy changes are harder to turn into a dataset than, say, a stock’s price-to-book ratio. Still, we will give it a go.</p>

<p>In short, we want to better understand what can influence a currency’s return and produce a systematic model. This post starts with the basics: pulling in the right data, building a proxy for the overall FX market, and ending with some basic regressions.</p>

<h2 id="twelve-data">Twelve Data</h2>

<p>For any quant investing model, we need to start with data. I’m always on the hunt for new sources, and <a href="https://twelvedata.com">twelvedata</a> is the latest one to come across my radar. It has a generous free tier and, more importantly, has FX data across all the main pairs. Plus, it has a Python API that is dead simple to use. This makes it ideal for this string of posts.</p>

<p>Sign up and get your API key, and you can follow along.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">twelvedata</span> <span class="kn">import</span> <span class="n">TDClient</span>
 
<span class="n">td</span> <span class="o">=</span> <span class="n">TDClient</span><span class="p">(</span><span class="n">apikey</span><span class="o">=</span><span class="n">API_KEY</span><span class="p">)</span>

<span class="n">td</span><span class="p">.</span><span class="n">time_series</span><span class="p">(</span>
        <span class="n">symbol</span><span class="o">=</span><span class="s">"USD/JPY"</span><span class="p">,</span>
        <span class="n">interval</span><span class="o">=</span><span class="s">"1day"</span><span class="p">,</span>
        <span class="n">start_date</span><span class="o">=</span><span class="s">"2025-01-01"</span><span class="p">,</span>
        <span class="n">end_date</span><span class="o">=</span><span class="s">"2026-03-01"</span><span class="p">,</span>
        <span class="n">outputsize</span><span class="o">=</span><span class="mi">5000</span><span class="p">).</span><span class="n">as_json</span><span class="p">()</span>
</code></pre></div></div>

<p>This returns the daily time series of USDJPY from January 2025 to March 2026, formatted as JSON. From there it’s simple to go to a dataframe, or however you want to deal with the data.</p>
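
<p>As a rough sketch of that conversion using only the standard library: the payload below is made up, and it assumes each record is a dict of strings with <code>datetime</code>, <code>open</code>, <code>high</code>, <code>low</code> and <code>close</code> keys. <code>parse_bar</code> is a hypothetical helper, not part of the twelvedata package.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import json
from datetime import date

# Hypothetical payload: two daily bars, every value stored as a string.
raw = (
    '[{"datetime": "2025-01-03", "open": "157.50", "high": "157.90", '
    '"low": "156.90", "close": "157.30"}, '
    '{"datetime": "2025-01-02", "open": "157.20", "high": "157.80", '
    '"low": "156.40", "close": "157.50"}]'
)

def parse_bar(record):
    """Convert one record of strings into a date and four floats."""
    return {
        "datetime": date.fromisoformat(record["datetime"]),
        "open": float(record["open"]),
        "high": float(record["high"]),
        "low": float(record["low"]),
        "close": float(record["close"]),
    }

# Sort by date so the result does not depend on the order the rows arrive in.
bars = sorted((parse_bar(r) for r in json.loads(raw)), key=lambda b: b["datetime"])
print(bars[0]["datetime"], bars[0]["close"])
</code></pre></div></div>

<p>The same casting is what any dataframe library needs to do before the prices can be used in calculations.</p>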

<p>I don’t want to get blocked by the API limits, so I’m going to save the JSON objects locally.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">json</span>
<span class="kn">import</span> <span class="nn">os</span>
<span class="kn">import</span> <span class="nn">time</span>

<span class="k">def</span> <span class="nf">download_data</span><span class="p">(</span><span class="n">td</span><span class="p">,</span> <span class="n">ccy</span><span class="p">,</span> <span class="n">start_date</span><span class="p">,</span> <span class="n">end_date</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">td</span><span class="p">.</span><span class="n">time_series</span><span class="p">(</span>
        <span class="n">symbol</span><span class="o">=</span><span class="sa">f</span><span class="s">"USD/</span><span class="si">{</span><span class="n">ccy</span><span class="si">}</span><span class="s">"</span><span class="p">,</span>
        <span class="n">interval</span><span class="o">=</span><span class="s">"1day"</span><span class="p">,</span>
        <span class="n">start_date</span><span class="o">=</span><span class="n">start_date</span><span class="p">,</span>
        <span class="n">end_date</span><span class="o">=</span><span class="n">end_date</span><span class="p">,</span>
        <span class="n">outputsize</span><span class="o">=</span><span class="mi">5000</span>
    <span class="p">)</span>

<span class="k">def</span> <span class="nf">save_data</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">ccy</span><span class="p">):</span>
    <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="sa">f</span><span class="s">"data/</span><span class="si">{</span><span class="n">ccy</span><span class="si">}</span><span class="s">.json"</span><span class="p">,</span> <span class="s">"w"</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
        <span class="n">json</span><span class="p">.</span><span class="n">dump</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">as_json</span><span class="p">(),</span> <span class="n">f</span><span class="p">)</span>

<span class="k">def</span> <span class="nf">download_and_save_data</span><span class="p">(</span><span class="n">td</span><span class="p">,</span> <span class="n">ccy</span><span class="p">,</span> <span class="n">start_date</span><span class="p">,</span> <span class="n">end_date</span><span class="p">):</span>
    <span class="n">file_path</span> <span class="o">=</span> <span class="sa">f</span><span class="s">"data/</span><span class="si">{</span><span class="n">ccy</span><span class="si">}</span><span class="s">.json"</span>
    <span class="k">if</span> <span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="n">exists</span><span class="p">(</span><span class="n">file_path</span><span class="p">):</span>
        <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"File for </span><span class="si">{</span><span class="n">ccy</span><span class="si">}</span><span class="s"> already exists. Skipping download."</span><span class="p">)</span>
        <span class="k">return</span> <span class="bp">False</span>
    <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"Downloading data for </span><span class="si">{</span><span class="n">ccy</span><span class="si">}</span><span class="s">..."</span><span class="p">)</span>
    <span class="n">data</span> <span class="o">=</span> <span class="n">download_data</span><span class="p">(</span><span class="n">td</span><span class="p">,</span> <span class="n">ccy</span><span class="p">,</span> <span class="n">start_date</span><span class="p">,</span> <span class="n">end_date</span><span class="p">)</span>
    <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"Saving data for </span><span class="si">{</span><span class="n">ccy</span><span class="si">}</span><span class="s">..."</span><span class="p">)</span>
    <span class="n">save_data</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">ccy</span><span class="p">)</span>
    <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"Data for </span><span class="si">{</span><span class="n">ccy</span><span class="si">}</span><span class="s"> downloaded and saved successfully."</span><span class="p">)</span>
    <span class="k">print</span><span class="p">(</span><span class="s">"Sleeping for 8 seconds to avoid hitting API rate limits..."</span><span class="p">)</span>
    <span class="n">time</span><span class="p">.</span><span class="n">sleep</span><span class="p">(</span><span class="mi">8</span><span class="p">)</span>
    <span class="k">return</span> <span class="bp">True</span>
</code></pre></div></div>

<p>Then, to load the data for a particular currency, we have a separate function.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">polars</span> <span class="k">as</span> <span class="n">pl</span>

<span class="k">def</span> <span class="nf">load_data</span><span class="p">(</span><span class="n">ccy</span><span class="p">):</span>
    <span class="n">df</span> <span class="o">=</span> <span class="n">pl</span><span class="p">.</span><span class="n">read_json</span><span class="p">(</span><span class="sa">f</span><span class="s">'data/</span><span class="si">{</span><span class="n">ccy</span><span class="si">}</span><span class="s">.json'</span><span class="p">)</span>
    <span class="n">df</span> <span class="o">=</span> <span class="n">df</span><span class="p">.</span><span class="n">with_columns</span><span class="p">(</span>
        <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"datetime"</span><span class="p">).</span><span class="n">cast</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">Date</span><span class="p">),</span>
        <span class="n">ccy</span><span class="o">=</span><span class="n">pl</span><span class="p">.</span><span class="n">lit</span><span class="p">(</span><span class="n">ccy</span><span class="p">),</span>
        <span class="nb">open</span><span class="o">=</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"open"</span><span class="p">).</span><span class="n">cast</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">Float64</span><span class="p">),</span>
        <span class="n">high</span><span class="o">=</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"high"</span><span class="p">).</span><span class="n">cast</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">Float64</span><span class="p">),</span>
        <span class="n">low</span><span class="o">=</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"low"</span><span class="p">).</span><span class="n">cast</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">Float64</span><span class="p">),</span>
        <span class="n">close</span><span class="o">=</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"close"</span><span class="p">).</span><span class="n">cast</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">Float64</span><span class="p">))</span>
    <span class="k">return</span> <span class="n">df</span>
</code></pre></div></div>

<p>To make sure everything is working nicely, let’s load and plot JPY.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">plotly.graph_objects</span> <span class="k">as</span> <span class="n">go</span>

<span class="n">df</span> <span class="o">=</span> <span class="n">load_data</span><span class="p">(</span><span class="s">"JPY"</span><span class="p">)</span>

<span class="n">fig</span> <span class="o">=</span> <span class="n">go</span><span class="p">.</span><span class="n">Figure</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">go</span><span class="p">.</span><span class="n">Ohlc</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="n">df</span><span class="p">[</span><span class="s">'datetime'</span><span class="p">],</span>
                    <span class="nb">open</span><span class="o">=</span><span class="n">df</span><span class="p">[</span><span class="s">'open'</span><span class="p">],</span>
                    <span class="n">high</span><span class="o">=</span><span class="n">df</span><span class="p">[</span><span class="s">'high'</span><span class="p">],</span>
                    <span class="n">low</span><span class="o">=</span><span class="n">df</span><span class="p">[</span><span class="s">'low'</span><span class="p">],</span>
                    <span class="n">close</span><span class="o">=</span><span class="n">df</span><span class="p">[</span><span class="s">'close'</span><span class="p">]))</span>

<span class="n">fig</span><span class="p">.</span><span class="n">show</span><span class="p">()</span>
</code></pre></div></div>

<p><img src="/assets/dxy/jpy.png" alt="OHLC chart depicting the price of USDJPY" /></p>

<p>All looks good, so now we can download whatever pair our heart desires. Which leads us to the next part.</p>

<h2 id="what-is-the-dxy">What is the DXY?</h2>

<p>In my mind, the DXY is the FX equivalent of the S&amp;P500. It gives a general indication of how the dollar’s value is changing by using the exchange rate of EUR, JPY, CHF, GBP, CAD and SEK vs the dollar. It’s calculated as a geometric weighted average of these six currencies, and given the dollar’s dominance in the FX market, it works as a reasonable proxy of how the overall FX market is moving.</p>

<p>If we cast our mind back to the <a href="https://en.wikipedia.org/wiki/Capital_asset_pricing_model">Capital Asset Pricing Model</a>, an asset’s expected return can be broken down into its active return, \(\alpha\), and its sensitivity to the market return, \(r_m\). The strength of this sensitivity is \(\beta\).</p>

\[r_i = \alpha_i + \beta_i r_m\]

<p>In equities, \(r_i\) is a single stock and \(r_m\) is some measure of the overall market return (S&amp;P500, FTSE100, etc.). In FX, \(r_i\) is an individual currency and \(r_m\) is the DXY. This gives us an easy quantitative model to judge how a currency’s return is driven by the overall movement in the dollar.</p>

<p>Now you can either read the DXY from a market data source (expensive) or you can calculate it yourself.</p>

<h2 id="calculating-the-dxy">Calculating the DXY</h2>

<p>The formula for the DXY is in this PDF from ICE: <a href="https://www.ice.com/publicdocs/futures_us/ICE_Dollar_Index_FAQ.pdf">U.S. Dollar Index Contracts</a>. It’s a simple weighted geometric average, so we just need the individual currency prices, and we can implement the calculation.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">dfs</span> <span class="o">=</span> <span class="p">[</span><span class="n">load_data</span><span class="p">(</span><span class="n">ccy</span><span class="p">)</span> <span class="k">for</span> <span class="n">ccy</span> <span class="ow">in</span> <span class="p">[</span><span class="s">"EUR"</span><span class="p">,</span> <span class="s">"JPY"</span><span class="p">,</span> <span class="s">"GBP"</span><span class="p">,</span> <span class="s">"CAD"</span><span class="p">,</span> <span class="s">"SEK"</span><span class="p">,</span> <span class="s">"CHF"</span><span class="p">]]</span>
<span class="n">combined_df</span> <span class="o">=</span> <span class="n">pl</span><span class="p">.</span><span class="n">concat</span><span class="p">(</span><span class="n">dfs</span><span class="p">)</span>
<span class="n">combined_df</span> <span class="o">=</span> <span class="n">combined_df</span><span class="p">.</span><span class="n">sort</span><span class="p">(</span><span class="s">"datetime"</span><span class="p">)</span>
</code></pre></div></div>

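<p>The more eagle-eyed readers might have noticed that I’m saving down some of the pairs the ‘wrong’ way round: USDEUR instead of EURUSD, USDGBP instead of GBPUSD, etc. This is because the DXY is a measure of the dollar, so every pair needs to be in USD base terms. The published formula uses the market-convention quotes EURUSD and GBPUSD with negative exponents; quoting them as USDEUR and USDGBP flips those negative weights to positive, so all six weights can be applied the same way.</p>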
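
<p>For reference, this is the index formula as I understand it from the ICE document, written with the market-convention pairs:</p>

\[\text{DXY} = 50.14348112 \times \text{EURUSD}^{-0.576} \times \text{USDJPY}^{0.136} \times \text{GBPUSD}^{-0.119} \times \text{USDCAD}^{0.091} \times \text{USDSEK}^{0.042} \times \text{USDCHF}^{0.036}\]

<p>Since \(\text{USDEUR} = 1 / \text{EURUSD}\), we have \(\text{EURUSD}^{-0.576} = \text{USDEUR}^{0.576}\), and likewise for GBP, which is why every exponent is positive once all six rates are quoted with USD as the base.</p>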

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">dxyWeightings</span> <span class="o">=</span> <span class="p">{</span>
    <span class="s">"EUR"</span><span class="p">:</span> <span class="mf">0.576</span><span class="p">,</span>
    <span class="s">"JPY"</span><span class="p">:</span> <span class="mf">0.136</span><span class="p">,</span>
    <span class="s">"GBP"</span><span class="p">:</span> <span class="mf">0.119</span><span class="p">,</span>
    <span class="s">"CAD"</span><span class="p">:</span> <span class="mf">0.091</span><span class="p">,</span>
    <span class="s">"SEK"</span><span class="p">:</span> <span class="mf">0.042</span><span class="p">,</span>
    <span class="s">"CHF"</span><span class="p">:</span> <span class="mf">0.036</span><span class="p">,</span>
    <span class="s">"const"</span><span class="p">:</span> <span class="mf">50.14348112</span><span class="p">}</span>

<span class="n">weights_df</span> <span class="o">=</span> <span class="n">pl</span><span class="p">.</span><span class="n">DataFrame</span><span class="p">(</span><span class="nb">list</span><span class="p">(</span><span class="n">dxyWeightings</span><span class="p">.</span><span class="n">items</span><span class="p">()),</span> <span class="n">schema</span><span class="o">=</span><span class="p">[</span><span class="s">"ccy"</span><span class="p">,</span><span class="s">"weight"</span><span class="p">])</span>
<span class="n">combined_df</span> <span class="o">=</span> <span class="n">combined_df</span><span class="p">.</span><span class="n">join</span><span class="p">(</span><span class="n">weights_df</span><span class="p">,</span> <span class="n">on</span><span class="o">=</span><span class="s">"ccy"</span><span class="p">,</span> <span class="n">how</span><span class="o">=</span><span class="s">"left"</span><span class="p">)</span>
</code></pre></div></div>

<p>So now we have a dataframe of the relevant prices joined by the weightings.</p>

<p>Step 1: raise each of the four prices (open, high, low, close) to the power of its currency’s weight.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">combined_df</span> <span class="o">=</span> <span class="n">combined_df</span><span class="p">.</span><span class="n">with_columns</span><span class="p">(</span>
    <span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"open"</span><span class="p">)</span> <span class="o">**</span> <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"weight"</span><span class="p">)).</span><span class="n">alias</span><span class="p">(</span><span class="s">"open_weighted"</span><span class="p">),</span>
    <span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"high"</span><span class="p">)</span> <span class="o">**</span> <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"weight"</span><span class="p">)).</span><span class="n">alias</span><span class="p">(</span><span class="s">"high_weighted"</span><span class="p">),</span>
    <span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"low"</span><span class="p">)</span> <span class="o">**</span> <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"weight"</span><span class="p">)).</span><span class="n">alias</span><span class="p">(</span><span class="s">"low_weighted"</span><span class="p">),</span>
    <span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"close"</span><span class="p">)</span> <span class="o">**</span> <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"weight"</span><span class="p">)).</span><span class="n">alias</span><span class="p">(</span><span class="s">"close_weighted"</span><span class="p">)</span>
<span class="p">)</span>
</code></pre></div></div>

<p>Step 2: For each day, take the product and multiply it by the constant.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">dxy</span> <span class="o">=</span> <span class="n">combined_df</span><span class="p">.</span><span class="n">group_by</span><span class="p">(</span><span class="s">"datetime"</span><span class="p">).</span><span class="n">agg</span><span class="p">(</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"open_weighted"</span><span class="p">).</span><span class="n">product</span><span class="p">().</span><span class="n">alias</span><span class="p">(</span><span class="s">"dxy_open"</span><span class="p">),</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"high_weighted"</span><span class="p">).</span><span class="n">product</span><span class="p">().</span><span class="n">alias</span><span class="p">(</span><span class="s">"dxy_high"</span><span class="p">),</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"low_weighted"</span><span class="p">).</span><span class="n">product</span><span class="p">().</span><span class="n">alias</span><span class="p">(</span><span class="s">"dxy_low"</span><span class="p">),</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"close_weighted"</span><span class="p">).</span><span class="n">product</span><span class="p">().</span><span class="n">alias</span><span class="p">(</span><span class="s">"dxy_close"</span><span class="p">)</span>
<span class="p">).</span><span class="n">with_columns</span><span class="p">(</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">'dxy_open'</span><span class="p">)</span><span class="o">*</span><span class="n">dxyWeightings</span><span class="p">[</span><span class="s">"const"</span><span class="p">],</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">'dxy_high'</span><span class="p">)</span><span class="o">*</span><span class="n">dxyWeightings</span><span class="p">[</span><span class="s">"const"</span><span class="p">],</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">'dxy_low'</span><span class="p">)</span><span class="o">*</span><span class="n">dxyWeightings</span><span class="p">[</span><span class="s">"const"</span><span class="p">],</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">'dxy_close'</span><span class="p">)</span><span class="o">*</span><span class="n">dxyWeightings</span><span class="p">[</span><span class="s">"const"</span><span class="p">])</span>
</code></pre></div></div>
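
<p>To make the arithmetic of the two steps concrete, here is a minimal single-day sketch in plain Python. The six USD-base rates are illustrative numbers, not market data, and <code>dxy_from_usd_rates</code> is a hypothetical helper name.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import math

# Same weights and constant as dxyWeightings in the post.
WEIGHTS = {"EUR": 0.576, "JPY": 0.136, "GBP": 0.119,
           "CAD": 0.091, "SEK": 0.042, "CHF": 0.036}
CONST = 50.14348112

def dxy_from_usd_rates(rates):
    """Step 1: raise each USD/XXX rate to its weight. Step 2: multiply and scale."""
    return CONST * math.prod(rates[ccy] ** w for ccy, w in WEIGHTS.items())

# Illustrative USD/XXX closes for a single day (made-up values).
rates = {"EUR": 0.92, "JPY": 150.0, "GBP": 0.79,
         "CAD": 1.36, "SEK": 10.5, "CHF": 0.88}
dxy_value = dxy_from_usd_rates(rates)

# The same number via logs: a weighted sum of log prices, then exponentiate.
log_form = CONST * math.exp(sum(w * math.log(rates[c]) for c, w in WEIGHTS.items()))

# The weights sum to 1, so a 1% rise in every USD rate lifts the index by 1%.
bumped = dxy_from_usd_rates({c: r * 1.01 for c, r in rates.items()})
print(round(dxy_value, 2), round(bumped / dxy_value, 4))
</code></pre></div></div>

<p>The log form is a handy cross-check on the product, and it shows why the index’s log return is just the weighted sum of the individual log returns.</p>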

<p><img src="/assets/dxy/dxy.png" alt="Line chart depicting the DXY" /></p>

<p>Compared with the Yahoo Finance DXY plot it looks very similar, so I’m fairly confident the calculation is correct.</p>

<h2 id="individual-currency-betas">Individual Currency \(\beta\)’s</h2>

<p>Now we can go on to measuring each currency’s \(\beta\) value. This is a simple linear regression of the log returns of an individual currency on the log returns of the DXY.</p>
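
<p>Written out, and assuming the regression includes an intercept (as the CAPM equation above does), these are the standard definitions, with \(P_t\) the closing price on day \(t\) and bars denoting sample means over the estimation window:</p>

\[r_t = \ln P_t - \ln P_{t-1}\]

\[\hat{\beta}_i = \frac{\mathrm{Cov}(r_i, r_m)}{\mathrm{Var}(r_m)}, \qquad \hat{\alpha}_i = \bar{r}_i - \hat{\beta}_i \bar{r}_m\]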

<p>We need to load in more currency pairs.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">dfs</span> <span class="o">=</span> <span class="p">[</span><span class="n">load_data</span><span class="p">(</span><span class="n">ccy</span><span class="p">)</span> <span class="k">for</span> <span class="n">ccy</span> <span class="ow">in</span> <span class="n">all_pairs</span><span class="p">]</span>
<span class="n">combined_df</span> <span class="o">=</span> <span class="n">pl</span><span class="p">.</span><span class="n">concat</span><span class="p">(</span><span class="n">dfs</span><span class="p">)</span>
<span class="n">combined_df</span> <span class="o">=</span> <span class="n">combined_df</span><span class="p">.</span><span class="n">sort</span><span class="p">(</span><span class="s">"datetime"</span><span class="p">)</span>
</code></pre></div></div>

<p>For the regression, we need the individual currency returns and also the DXY returns. This is a simple log return calculation, followed by a join of the DXY frame onto the individual currencies.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">combined_df</span> <span class="o">=</span> <span class="n">combined_df</span><span class="p">.</span><span class="n">with_columns</span><span class="p">(</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"close"</span><span class="p">).</span><span class="n">log</span><span class="p">().</span><span class="n">diff</span><span class="p">().</span><span class="n">over</span><span class="p">(</span><span class="s">"ccy"</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">"log_return"</span><span class="p">)</span>
<span class="p">)</span>

<span class="n">dxy</span> <span class="o">=</span> <span class="n">dxy</span><span class="p">.</span><span class="n">with_columns</span><span class="p">(</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"dxy_close"</span><span class="p">).</span><span class="n">log</span><span class="p">().</span><span class="n">diff</span><span class="p">().</span><span class="n">alias</span><span class="p">(</span><span class="s">"dxy_log_return"</span><span class="p">)</span>
<span class="p">)</span>

<span class="n">combined_df</span> <span class="o">=</span> <span class="n">combined_df</span><span class="p">.</span><span class="n">join</span><span class="p">(</span><span class="n">dxy</span><span class="p">,</span> <span class="n">on</span><span class="o">=</span><span class="s">"datetime"</span><span class="p">,</span> <span class="n">how</span><span class="o">=</span><span class="s">"left"</span><span class="p">)</span>
</code></pre></div></div>

<p>We will do a rolling regression using a 252-day look back, which is roughly the number of trading days in a year.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">statsmodels.regression.rolling</span> <span class="kn">import</span> <span class="n">RollingOLS</span>

<span class="n">allParams</span> <span class="o">=</span> <span class="p">[]</span>

<span class="k">for</span> <span class="n">ccy</span> <span class="ow">in</span> <span class="p">[</span><span class="s">"EUR"</span><span class="p">,</span> <span class="s">"SEK"</span><span class="p">,</span> <span class="s">"CNH"</span><span class="p">,</span> <span class="s">"TWD"</span><span class="p">,</span> <span class="s">"TRY"</span><span class="p">]:</span>

    <span class="n">subDF</span> <span class="o">=</span> <span class="n">combined_df</span><span class="p">.</span><span class="nb">filter</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"ccy"</span><span class="p">)</span> <span class="o">==</span> <span class="n">ccy</span><span class="p">)</span>
    <span class="n">mod</span> <span class="o">=</span> <span class="n">RollingOLS</span><span class="p">.</span><span class="n">from_formula</span><span class="p">(</span><span class="s">"log_return ~ dxy_log_return"</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">subDF</span><span class="p">,</span> <span class="n">window</span><span class="o">=</span><span class="mi">252</span><span class="p">)</span>
    <span class="n">rres</span> <span class="o">=</span> <span class="n">mod</span><span class="p">.</span><span class="n">fit</span><span class="p">()</span>

    <span class="n">paramDF</span> <span class="o">=</span> <span class="n">pl</span><span class="p">.</span><span class="n">from_pandas</span><span class="p">(</span><span class="n">rres</span><span class="p">.</span><span class="n">params</span><span class="p">)</span>
    <span class="n">paramDF</span> <span class="o">=</span> <span class="n">paramDF</span><span class="p">.</span><span class="n">with_columns</span><span class="p">(</span><span class="n">ccy</span><span class="o">=</span><span class="n">pl</span><span class="p">.</span><span class="n">lit</span><span class="p">(</span><span class="n">ccy</span><span class="p">),</span> <span class="n">Date</span> <span class="o">=</span> <span class="n">subDF</span><span class="p">[</span><span class="s">"datetime"</span><span class="p">])</span>
    <span class="n">allParams</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">paramDF</span><span class="p">)</span>

<span class="n">allParams</span> <span class="o">=</span> <span class="n">pl</span><span class="p">.</span><span class="n">concat</span><span class="p">(</span><span class="n">allParams</span><span class="p">)</span>
</code></pre></div></div>
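
<p>To show what each window of a rolling regression is doing, here is a dependency-free sketch. It is not how statsmodels implements it, just the same slope calculation repeated over a trailing window; the returns are synthetic, and <code>ols_beta</code> and <code>rolling_beta</code> are hypothetical helper names.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import random

def ols_beta(y, x):
    """Slope of a least-squares fit of y on x with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x

def rolling_beta(y, x, window):
    """Beta per date from the trailing window; None until the window is full."""
    out = [None] * len(x)
    for end in range(window, len(x) + 1):
        out[end - 1] = ols_beta(y[end - window:end], x[end - window:end])
    return out

# Synthetic daily returns (not market data): the currency moves exactly
# half as much as the market, so every full window should give beta = 0.5.
random.seed(0)
market = [random.gauss(0.0, 0.005) for _ in range(300)]
currency = [0.5 * r for r in market]

betas = rolling_beta(currency, market, window=252)
print(betas[250], round(betas[251], 6))
</code></pre></div></div>

<p>With a 252-day window, the first 251 dates have no estimate, which is why the rolling \(\beta\) series only starts about a year into the data.</p>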

<p>To examine the results, we plot the \(\beta_i\) value over time for some different currencies.</p>

<p><img src="/assets/dxy/betas.png" alt="Line chart depicting beta values of various currencies over time." /></p>

<p>EUR (green) is close to 1, which aligns with intuition as it has the largest weight in the DXY calculation. TRY has the lowest \(\beta\) of the pairs plotted, which suggests its returns are not driven by overall dollar moves; that makes sense, given TRY’s movements mostly reflect Turkey’s own macroeconomics. SEK has a consistent \(\beta &gt; 1\), which suggests it’s very sensitive to general dollar moves. It’s not pictured, but across all the pairs HKD comes out with the lowest \(\beta\), which is reassuring as it is pegged to the dollar.</p>

<p>Overall, do these \(\beta\)’s tell us much? Not really, but it is interesting to measure, and this is the foundation needed before we start looking at other factors that might influence the daily currency movements. These can be things like momentum, oil/gold sensitivity, etc.</p>
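<p>As a reminder of what \(\beta_i\) measures: for a single-factor regression with an intercept, the least-squares slope equals the covariance between the currency and DXY returns divided by the variance of the DXY returns. The sketch below is plain Python on made-up return numbers (not the dataset from this post), and it is not the rolling regression used above.</p>

```python
# Hypothetical daily returns, purely illustrative (not real market data).
dxy_returns = [0.002, -0.001, 0.003, -0.004, 0.001]
# Build a currency whose returns are exactly 1.5x the DXY return plus a constant.
ccy_returns = [0.0005 + 1.5 * r for r in dxy_returns]


def mean(values):
    return sum(values) / len(values)


def ols_beta(y, x):
    """Slope of y = alpha + beta * x, computed as cov(x, y) / var(x)."""
    x_bar, y_bar = mean(x), mean(y)
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    var_x = sum((xi - x_bar) ** 2 for xi in x)
    return cov_xy / var_x


beta = ols_beta(ccy_returns, dxy_returns)
print(round(beta, 6))
```

<p>Because the toy currency is built as a constant plus 1.5 times the DXY return, the estimate recovers that 1.5. A \(\beta\) near zero, as with a pegged currency, would mean the series barely co-moves with the dollar factor.</p>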

<h2 id="conclusion">Conclusion</h2>

<p>From this, we have built up a new dataset of daily currency prices and now have daily DXY values too. This has given the underpinnings of an FX factor model, and next time we can start looking at other components that could explain currency movements.</p>

<p>Loosely related is my post on <a href="https://dm13450.github.io/2024/04/25/Currency-Hedging-and-Principal-Component-Analysis.html">Currency Hedging and Principal Component Analysis</a> and <a href="https://dm13450.github.io/2022/06/09/ETF-Correlations.html">Dipping My Toes into ETF Correlations</a>.</p>]]></content><author><name>Dean Markwick</name></author><category term="python" /><summary type="html"><![CDATA[My day job is in quant trading, but there’s another fascinating world: quantitative investing. While I focus on latencies and execution, quant investors are busy building the most efficient portfolios and ensuring they extract pure alpha. Not one to stay in my lane, I’m using this blog post as an opportunity to dive into the world of quant investing and level up my knowledge.]]></summary></entry><entry><title type="html">Premier League Survival – How Many Points Are Enough?</title><link href="https://dm13450.github.io/2025/10/31/Premier-League-Survival-How-Many-Points-Are-Enough.html" rel="alternate" type="text/html" title="Premier League Survival – How Many Points Are Enough?" /><published>2025-10-31T00:00:00+00:00</published><updated>2025-10-31T00:00:00+00:00</updated><id>https://dm13450.github.io/2025/10/31/Premier-League-Survival%E2%80%93How-Many-Points-Are-Enough</id><content type="html" xml:base="https://dm13450.github.io/2025/10/31/Premier-League-Survival-How-Many-Points-Are-Enough.html"><![CDATA[<p>It’s been an interesting start to the Premier League. All of the promoted teams (Sunderland, Leeds and Burnley) are outside the relegation zone, with Wolves and West Ham struggling at the bottom. So I want to look back at the other seasons and work out the average number of points throughout the season that characterises relegation teams, and how many points do you need to avoid relegation?</p>


<p>This is also a post where I dive into Python. I’ve been meaning to learn both <a href="https://pola.rs/">Polars</a> and <a href="https://plotly.com/">Plotly</a>, and given the relative simplicity of this post, it feels like the opportune time. It has also been a while since I’ve written about football, and given my reduced output recently, this feels like a quick win.</p>

<h2 id="downloading-the-data">Downloading the Data</h2>

<p>The gold standard for free and easy football data is <a href="https://www.football-data.co.uk/">football-data</a>, where they have a CSV of every season for many years. This makes it easy to download it directly and merge the seasons together.</p>

<p>Reading a CSV with Polars is no different to Pandas, but adding a new column is slightly different: you use the <code class="language-plaintext highlighter-rouge">with_columns</code> function and give the new expression an <code class="language-plaintext highlighter-rouge">alias</code>.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">s</span> <span class="o">=</span> <span class="nb">range</span><span class="p">(</span><span class="mi">2009</span><span class="p">,</span> <span class="mi">2027</span><span class="p">)</span>
<span class="n">seasons</span> <span class="o">=</span> <span class="p">[</span><span class="nb">str</span><span class="p">((</span><span class="n">x</span><span class="o">-</span><span class="mi">1</span><span class="p">))[</span><span class="mi">2</span><span class="p">:</span><span class="mi">4</span><span class="p">]</span> <span class="o">+</span> <span class="nb">str</span><span class="p">((</span><span class="n">x</span><span class="p">))[</span><span class="mi">2</span><span class="p">:</span><span class="mi">4</span><span class="p">]</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">s</span><span class="p">]</span>

<span class="n">rawDataList</span> <span class="o">=</span> <span class="p">[]</span>

<span class="k">for</span> <span class="n">season</span> <span class="ow">in</span> <span class="n">seasons</span><span class="p">:</span>
    <span class="n">url</span> <span class="o">=</span> <span class="sa">f</span><span class="s">"https://www.football-data.co.uk/mmz4281/</span><span class="si">{</span><span class="n">season</span><span class="si">}</span><span class="s">/E0.csv"</span>
    <span class="n">rawData</span> <span class="o">=</span> <span class="n">pl</span><span class="p">.</span><span class="n">read_csv</span><span class="p">(</span><span class="n">url</span><span class="p">,</span> <span class="n">truncate_ragged_lines</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
    <span class="n">rawData</span> <span class="o">=</span> <span class="n">rawData</span><span class="p">.</span><span class="n">with_columns</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">lit</span><span class="p">(</span><span class="n">season</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">"Season"</span><span class="p">))</span>
    <span class="n">rawDataList</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">rawData</span><span class="p">)</span>

<span class="n">rawData</span> <span class="o">=</span> <span class="n">pl</span><span class="p">.</span><span class="n">concat</span><span class="p">(</span><span class="n">rawDataList</span><span class="p">,</span> <span class="n">how</span> <span class="o">=</span> <span class="s">"diagonal"</span><span class="p">)</span>
</code></pre></div></div>

<p>We diagonally concatenate the dataframes because not every season has the same columns, and this will null-fill any missing columns.</p>
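<p>The season codes that go into each URL are a little cryptic, so here is a quick check of what the list comprehension in the download loop produces. It uses the same range and slicing as above; only the variable name <code class="language-plaintext highlighter-rouge">end_years</code> is new.</p>

```python
# Each code is the two-digit start year followed by the two-digit end year.
end_years = range(2009, 2027)
seasons = [str(x - 1)[2:4] + str(x)[2:4] for x in end_years]

# First season, last season, and how many seasons are requested.
print(seasons[0], seasons[-1], len(seasons))
```

<p>So the loop requests 18 files, from the 2008/09 season (<code class="language-plaintext highlighter-rouge">0809</code>) through to 2025/26 (<code class="language-plaintext highlighter-rouge">2526</code>).</p>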

<p>We then add a column of row indices and add the points scored by the home and away team based on the outcome of the match.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">rawData</span> <span class="o">=</span> <span class="n">rawData</span><span class="p">.</span><span class="n">with_row_index</span><span class="p">(</span><span class="s">"MatchID"</span><span class="p">)</span>
<span class="n">rawData</span> <span class="o">=</span> <span class="n">rawData</span><span class="p">.</span><span class="n">with_columns</span><span class="p">((</span><span class="n">pl</span><span class="p">.</span><span class="n">when</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"FTR"</span><span class="p">)</span> <span class="o">==</span> <span class="s">"H"</span><span class="p">).</span><span class="n">then</span><span class="p">(</span><span class="mi">3</span><span class="p">).</span><span class="n">when</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"FTR"</span><span class="p">)</span> <span class="o">==</span> <span class="s">"A"</span><span class="p">)).</span><span class="n">then</span><span class="p">(</span><span class="mi">0</span><span class="p">).</span><span class="n">otherwise</span><span class="p">(</span><span class="mi">1</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">'PTH'</span><span class="p">))</span>
<span class="n">rawData</span> <span class="o">=</span> <span class="n">rawData</span><span class="p">.</span><span class="n">with_columns</span><span class="p">((</span><span class="n">pl</span><span class="p">.</span><span class="n">when</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"FTR"</span><span class="p">)</span> <span class="o">==</span> <span class="s">"A"</span><span class="p">).</span><span class="n">then</span><span class="p">(</span><span class="mi">3</span><span class="p">).</span><span class="n">when</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"FTR"</span><span class="p">)</span> <span class="o">==</span> <span class="s">"H"</span><span class="p">)).</span><span class="n">then</span><span class="p">(</span><span class="mi">0</span><span class="p">).</span><span class="n">otherwise</span><span class="p">(</span><span class="mi">1</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">'PTA'</span><span class="p">))</span>
</code></pre></div></div>
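<p>The two <code class="language-plaintext highlighter-rouge">when/then/otherwise</code> chains encode the usual three points for a win, one for a draw and none for a loss. A plain-Python sketch of the same rule, with a hypothetical helper name <code class="language-plaintext highlighter-rouge">match_points</code>, makes the mapping explicit:</p>

```python
def match_points(ftr):
    """Return (home_points, away_points) for a full-time result code."""
    if ftr == "H":
        return 3, 0
    if ftr == "A":
        return 0, 3
    # Anything else is scored as a draw, mirroring the .otherwise(1) branch.
    return 1, 1


for result in ["H", "D", "A"]:
    print(result, match_points(result))
```

<p>One thing to keep in mind with this fall-through design is that any unexpected result code would also be scored as a draw, so it is worth checking that <code class="language-plaintext highlighter-rouge">FTR</code> only contains H, D and A.</p>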

<h2 id="formatting-the-data">Formatting the Data</h2>

<p>Currently, the data is in a ‘per match’ format with a home and away team. We need to rearrange this so that each team gets its own row per match, so if we filter for a specific team, we get all their matches rather than having to filter both the home and away columns.</p>

<p>The current columns refer to stats in terms of home (<code class="language-plaintext highlighter-rouge">H</code>) and away (<code class="language-plaintext highlighter-rouge">A</code>). We will replace those names with <code class="language-plaintext highlighter-rouge">1</code> and <code class="language-plaintext highlighter-rouge">2</code>.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">matchDetailsCols</span> <span class="o">=</span> <span class="p">[</span><span class="s">"MatchID"</span><span class="p">,</span> <span class="s">"Season"</span><span class="p">,</span> <span class="s">"Div"</span><span class="p">,</span> <span class="s">"Date"</span><span class="p">,</span> <span class="s">"HomeTeam"</span><span class="p">,</span> <span class="s">"AwayTeam"</span><span class="p">]</span>
<span class="n">matchDetailsMap</span> <span class="o">=</span> <span class="nb">dict</span><span class="p">(</span><span class="nb">zip</span><span class="p">(</span><span class="n">matchDetailsCols</span><span class="p">,</span> <span class="p">[</span><span class="s">"MatchID"</span><span class="p">,</span> <span class="s">"Season"</span><span class="p">,</span> <span class="s">"Div"</span><span class="p">,</span> <span class="s">"Date"</span><span class="p">,</span> <span class="s">"Team1"</span><span class="p">,</span> <span class="s">"Team2"</span><span class="p">]))</span>

<span class="n">matchStatsCols</span> <span class="o">=</span> <span class="p">[</span><span class="s">"FTHG"</span><span class="p">,</span> <span class="s">"FTAG"</span><span class="p">,</span> <span class="s">"HS"</span><span class="p">,</span> <span class="s">"AS"</span><span class="p">,</span> <span class="s">"HST"</span><span class="p">,</span> <span class="s">"AST"</span><span class="p">,</span> <span class="s">"PSCD"</span><span class="p">,</span> <span class="s">"PSCH"</span><span class="p">,</span> <span class="s">"PSCA"</span><span class="p">,</span> <span class="s">"PTH"</span><span class="p">,</span> <span class="s">"PTA"</span><span class="p">]</span>
<span class="n">matchStatsMap</span> <span class="o">=</span> <span class="nb">dict</span><span class="p">(</span><span class="nb">zip</span><span class="p">(</span><span class="n">matchStatsCols</span><span class="p">,</span> <span class="p">[</span><span class="n">x</span><span class="p">.</span><span class="n">replace</span><span class="p">(</span><span class="s">"H"</span><span class="p">,</span> <span class="s">"1"</span><span class="p">).</span><span class="n">replace</span><span class="p">(</span><span class="s">"A"</span><span class="p">,</span> <span class="s">"2"</span><span class="p">)</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">matchStatsCols</span><span class="p">]))</span>

<span class="n">allCols</span> <span class="o">=</span> <span class="n">matchDetailsCols</span> <span class="o">+</span> <span class="n">matchStatsCols</span>
<span class="n">colsMap</span> <span class="o">=</span> <span class="n">matchDetailsMap</span> <span class="o">|</span> <span class="n">matchStatsMap</span>
<span class="n">matchData</span> <span class="o">=</span> <span class="n">rawData</span><span class="p">[</span><span class="n">allCols</span><span class="p">]</span>
</code></pre></div></div>
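<p>To see what the renaming does, this snippet rebuilds the stats mapping on its own (same column list and replacement rule as above, written as a dict comprehension) and prints the old and new names side by side.</p>

```python
match_stats_cols = ["FTHG", "FTAG", "HS", "AS", "HST", "AST",
                    "PSCD", "PSCH", "PSCA", "PTH", "PTA"]

# Same replacement rule as above: H becomes 1, A becomes 2.
match_stats_map = {c: c.replace("H", "1").replace("A", "2") for c in match_stats_cols}

for old, new in match_stats_map.items():
    print(old, "->", new)
```

<p>The blanket string replacement only works because none of these column names contain a stray H or A; <code class="language-plaintext highlighter-rouge">PSCD</code>, for example, passes through unchanged. Adding other columns to the list would need a check for that.</p>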

<p>First, we create a frame from the home side’s perspective: the home team is relabelled as <code class="language-plaintext highlighter-rouge">Team1</code>, and we add a dummy <code class="language-plaintext highlighter-rouge">Home</code> indicator set to 1.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">team1Data</span> <span class="o">=</span> <span class="n">matchData</span><span class="p">.</span><span class="n">rename</span><span class="p">(</span><span class="n">colsMap</span><span class="p">)</span>
<span class="n">team1Data</span> <span class="o">=</span> <span class="n">team1Data</span><span class="p">.</span><span class="n">with_columns</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">lit</span><span class="p">(</span><span class="mi">1</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">"Home"</span><span class="p">))</span>
</code></pre></div></div>

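<p>Likewise for the away side: we swap the <code class="language-plaintext highlighter-rouge">1</code> and <code class="language-plaintext highlighter-rouge">2</code> labels so that the away team becomes <code class="language-plaintext highlighter-rouge">Team1</code>, and set <code class="language-plaintext highlighter-rouge">Home</code> to 0.</p>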

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">team2Data</span> <span class="o">=</span> <span class="n">matchData</span><span class="p">.</span><span class="n">rename</span><span class="p">(</span><span class="n">colsMap</span><span class="p">)</span>
<span class="n">team2Map</span> <span class="o">=</span> <span class="nb">dict</span><span class="p">(</span><span class="nb">zip</span><span class="p">(</span><span class="n">team2Data</span><span class="p">.</span><span class="n">columns</span><span class="p">,</span> <span class="p">[</span><span class="n">x</span><span class="p">.</span><span class="n">replace</span><span class="p">(</span><span class="s">"1"</span><span class="p">,</span> <span class="s">"2"</span><span class="p">)</span> <span class="k">if</span> <span class="s">"1"</span> <span class="ow">in</span> <span class="n">x</span> <span class="k">else</span> <span class="n">x</span><span class="p">.</span><span class="n">replace</span><span class="p">(</span><span class="s">"2"</span><span class="p">,</span> <span class="s">"1"</span><span class="p">)</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">team2Data</span><span class="p">.</span><span class="n">columns</span><span class="p">]))</span>
<span class="n">team2Data</span> <span class="o">=</span> <span class="n">team2Data</span><span class="p">.</span><span class="n">rename</span><span class="p">(</span><span class="n">team2Map</span><span class="p">)</span>
<span class="n">team2Data</span> <span class="o">=</span> <span class="n">team2Data</span><span class="p">.</span><span class="n">with_columns</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">lit</span><span class="p">(</span><span class="mi">0</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">"Home"</span><span class="p">))</span>
</code></pre></div></div>
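<p>The swap mapping is easiest to follow on the column names alone. The list below is what the renamed columns should look like after the first step (an assumption based on the mappings above), and the comprehension is the same rule used to build <code class="language-plaintext highlighter-rouge">team2Map</code>.</p>

```python
team1_cols = ["MatchID", "Season", "Div", "Date", "Team1", "Team2",
              "FT1G", "FT2G", "1S", "2S", "1ST", "2ST",
              "PSCD", "PSC1", "PSC2", "PT1", "PT2"]

# Swap the 1 and 2 labels so every stat is seen from the away team's side.
swap_map = {c: c.replace("1", "2") if "1" in c else c.replace("2", "1")
            for c in team1_cols}

for old, new in swap_map.items():
    if old != new:
        print(old, "->", new)
```

<p>Columns without a 1 or 2 in the name, such as <code class="language-plaintext highlighter-rouge">MatchID</code> or <code class="language-plaintext highlighter-rouge">PSCD</code>, map to themselves, so only the team-specific columns change sides.</p>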

<p>Then we stack the two frames back together and sort by <code class="language-plaintext highlighter-rouge">MatchID</code>.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">teamData</span> <span class="o">=</span> <span class="n">pl</span><span class="p">.</span><span class="n">concat</span><span class="p">([</span><span class="n">team1Data</span><span class="p">,</span> <span class="n">team2Data</span><span class="p">],</span> <span class="n">how</span> <span class="o">=</span> <span class="s">"diagonal"</span><span class="p">)</span>
<span class="n">teamData</span> <span class="o">=</span> <span class="n">teamData</span><span class="p">.</span><span class="n">sort</span><span class="p">(</span><span class="s">"MatchID"</span><span class="p">)</span>
</code></pre></div></div>

<p>Now we want to add the cumulative sum of points, goals, and goals conceded to get a view of each team’s league position on a match by match basis.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">teamData</span> <span class="o">=</span> <span class="n">teamData</span><span class="p">.</span><span class="n">select</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="nb">all</span><span class="p">(),</span> <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"PT1"</span><span class="p">).</span><span class="n">cum_sum</span><span class="p">().</span><span class="n">over</span><span class="p">([</span><span class="s">"Season"</span><span class="p">,</span> <span class="s">"Team1"</span><span class="p">]).</span><span class="n">alias</span><span class="p">(</span><span class="s">"TotalPoints1"</span><span class="p">))</span>
<span class="n">teamData</span> <span class="o">=</span> <span class="n">teamData</span><span class="p">.</span><span class="n">select</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="nb">all</span><span class="p">(),</span> <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"FT1G"</span><span class="p">).</span><span class="n">cum_sum</span><span class="p">().</span><span class="n">over</span><span class="p">([</span><span class="s">"Season"</span><span class="p">,</span> <span class="s">"Team1"</span><span class="p">]).</span><span class="n">alias</span><span class="p">(</span><span class="s">"TotalGoals1"</span><span class="p">))</span>
<span class="n">teamData</span> <span class="o">=</span> <span class="n">teamData</span><span class="p">.</span><span class="n">select</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="nb">all</span><span class="p">(),</span> <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"FT2G"</span><span class="p">).</span><span class="n">cum_sum</span><span class="p">().</span><span class="n">over</span><span class="p">([</span><span class="s">"Season"</span><span class="p">,</span> <span class="s">"Team1"</span><span class="p">]).</span><span class="n">alias</span><span class="p">(</span><span class="s">"TotalGoalsC1"</span><span class="p">))</span>
<span class="n">teamData</span> <span class="o">=</span> <span class="n">teamData</span><span class="p">.</span><span class="n">select</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="nb">all</span><span class="p">(),</span> <span class="n">pl</span><span class="p">.</span><span class="n">int_range</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="nb">len</span><span class="p">()).</span><span class="n">over</span><span class="p">([</span><span class="s">"Season"</span><span class="p">,</span> <span class="s">"Team1"</span><span class="p">]).</span><span class="n">alias</span><span class="p">(</span><span class="s">"N"</span><span class="p">))</span>
</code></pre></div></div>

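<p>This is a bit different to the usual group-by and aggregate. With <code class="language-plaintext highlighter-rouge">over</code>, you define the function on the column first and then specify the grouping columns, and the result keeps one value per row rather than collapsing each group to a single row.</p>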
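<p>For anyone more used to loops than window functions, here is a rough plain-Python equivalent of the cumulative points and the match counter <code class="language-plaintext highlighter-rouge">N</code>. The rows and team names are hypothetical, and the sketch assumes the rows are already in match order.</p>

```python
from collections import defaultdict

# Hypothetical rows, already in match order: (season, team, points from that match).
rows = [
    ("2425", "Team A", 3),
    ("2425", "Team B", 0),
    ("2425", "Team A", 1),
    ("2425", "Team B", 3),
    ("2526", "Team A", 0),
]

running = defaultdict(int)  # running points per (season, team)
played = defaultdict(int)   # matches already played per (season, team)
out = []
for season, team, pts in rows:
    key = (season, team)
    running[key] += pts
    # One output row per input row, which is what a window function gives you.
    out.append((season, team, pts, running[key], played[key]))
    played[key] += 1

for row in out:
    print(row)
```

<p>Note that the counter starts at zero for each team’s first match of a season, matching the zero-based <code class="language-plaintext highlighter-rouge">int_range</code> used for <code class="language-plaintext highlighter-rouge">N</code>.</p>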

<p>Finally, we are going to create a league table dataframe by taking the last points/goals/goals conceded by each team per season and use that to work out who got relegated each year.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">leagueTable</span> <span class="o">=</span> <span class="n">teamData</span><span class="p">.</span><span class="n">group_by</span><span class="p">([</span><span class="s">"Season"</span><span class="p">,</span> <span class="s">"Div"</span><span class="p">,</span> <span class="s">"Team1"</span><span class="p">]).</span><span class="n">agg</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"N"</span><span class="p">,</span> <span class="s">"TotalPoints1"</span><span class="p">,</span> <span class="s">"TotalGoals1"</span><span class="p">,</span> <span class="s">"TotalGoalsC1"</span><span class="p">).</span><span class="n">last</span><span class="p">())</span>
<span class="n">leagueTable</span> <span class="o">=</span> <span class="n">leagueTable</span><span class="p">.</span><span class="n">sort</span><span class="p">(</span><span class="s">"TotalPoints1"</span><span class="p">,</span> <span class="n">descending</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
<span class="n">leagueTable</span> <span class="o">=</span> <span class="n">leagueTable</span><span class="p">.</span><span class="n">select</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="nb">all</span><span class="p">(),</span> <span class="n">pl</span><span class="p">.</span><span class="n">int_range</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="nb">len</span><span class="p">()).</span><span class="n">over</span><span class="p">([</span><span class="s">"Season"</span><span class="p">,</span> <span class="s">"Div"</span><span class="p">]).</span><span class="n">alias</span><span class="p">(</span><span class="s">"FinalPosition"</span><span class="p">))</span>
<span class="n">leagueTable</span> <span class="o">=</span> <span class="n">leagueTable</span><span class="p">.</span><span class="n">with_columns</span><span class="p">((</span><span class="n">pl</span><span class="p">.</span><span class="n">when</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"FinalPosition"</span><span class="p">)</span> <span class="o">&gt;=</span> <span class="mi">17</span><span class="p">).</span><span class="n">then</span><span class="p">(</span><span class="mi">1</span><span class="p">)).</span><span class="n">otherwise</span><span class="p">(</span><span class="mi">0</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">'Relegated'</span><span class="p">))</span>
</code></pre></div></div>
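<p>Since <code class="language-plaintext highlighter-rouge">int_range</code> starts at zero, <code class="language-plaintext highlighter-rouge">FinalPosition</code> runs from 0 to 19 in a 20-team league, and the cut-off of 17 picks out the bottom three. A small sketch on invented final points (hypothetical team names, not real results) shows the same logic:</p>

```python
# Hypothetical final points for a 20-team season: Team 1 has 95, Team 20 has 19.
final_points = {f"Team {i + 1}": 95 - 4 * i for i in range(20)}

# Sort by points, highest first, then number the rows from zero.
table = sorted(final_points.items(), key=lambda kv: kv[1], reverse=True)
league_table = [
    {"team": team, "points": pts, "position": pos, "relegated": int(pos >= 17)}
    for pos, (team, pts) in enumerate(table)
]

relegated = [row["team"] for row in league_table if row["relegated"]]
print(relegated)
```

<p>As in the Polars version above, the ordering here uses points alone, so teams level on points are not separated by goal difference. That could matter in a season where the final relegation place was decided on goal difference.</p>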

<p>We can then join this to the <code class="language-plaintext highlighter-rouge">teamData</code>, and this will form the basis of our stats.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">teamData</span> <span class="o">=</span> <span class="n">teamData</span><span class="p">.</span><span class="n">join</span><span class="p">(</span><span class="n">leagueTable</span><span class="p">[[</span><span class="s">"Season"</span><span class="p">,</span> <span class="s">"Div"</span><span class="p">,</span> <span class="s">"Team1"</span><span class="p">,</span> <span class="s">"FinalPosition"</span><span class="p">,</span> <span class="s">"Relegated"</span><span class="p">]],</span> <span class="n">on</span> <span class="o">=</span> <span class="p">[</span><span class="s">"Season"</span><span class="p">,</span> <span class="s">"Div"</span><span class="p">,</span> <span class="s">"Team1"</span><span class="p">])</span>
</code></pre></div></div>

<h2 id="relegation-statistics">Relegation Statistics</h2>

<p>The data is now in a nice format, so we can manipulate it and see how this season is lining up. This is where <code class="language-plaintext highlighter-rouge">plotly</code> comes in. I’ve always been a <a href="https://matplotlib.org/">matplotlib</a> user and enjoy building up plots layer by layer with a decent amount of control. Plotly was always missing from my arsenal, so if I’m dipping my toes into Python, I might as well plug that gap. I’ve left out some of the final graph formatting to keep the code chunks manageable.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">plotly.express</span> <span class="k">as</span> <span class="n">px</span>
<span class="kn">import</span> <span class="nn">plotly.graph_objects</span> <span class="k">as</span> <span class="n">go</span>
<span class="kn">from</span> <span class="nn">plotly.subplots</span> <span class="kn">import</span> <span class="n">make_subplots</span>
</code></pre></div></div>

<p>First, we calculate the relegation stats. We want to calculate the average number of points, goals scored, and goals conceded after each game week for the teams that were eventually relegated.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">relegated</span> <span class="o">=</span> <span class="p">(</span><span class="n">teamData</span><span class="p">.</span><span class="nb">filter</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"Season"</span><span class="p">)</span> <span class="o">!=</span> <span class="s">"2526"</span><span class="p">)</span>
                     <span class="p">.</span><span class="n">group_by</span><span class="p">([</span><span class="s">"N"</span><span class="p">,</span> <span class="s">"Relegated"</span><span class="p">])</span>
                     <span class="p">.</span><span class="n">agg</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"TotalPoints1"</span><span class="p">).</span><span class="n">mean</span><span class="p">(),</span> 
                          <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"TotalGoals1"</span><span class="p">).</span><span class="n">mean</span><span class="p">(),</span> 
                          <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"TotalGoalsC1"</span><span class="p">).</span><span class="n">mean</span><span class="p">())</span>
                     <span class="p">.</span><span class="n">sort</span><span class="p">(</span><span class="s">"N"</span><span class="p">).</span><span class="nb">filter</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"Relegated"</span><span class="p">)</span> <span class="o">==</span> <span class="mi">1</span><span class="p">))</span>
</code></pre></div></div>
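<p>In plain Python, the group-by above amounts to averaging the cumulative points within each (game number, relegated flag) bucket and then keeping the relegated buckets. The rows below are invented purely to show the mechanics.</p>

```python
from collections import defaultdict

# Hypothetical (game_number, relegated_flag, cumulative_points) rows.
rows = [
    (0, 1, 0), (0, 1, 1), (0, 0, 3),
    (1, 1, 1), (1, 1, 1), (1, 0, 6),
]

totals = defaultdict(lambda: [0, 0])  # (N, Relegated) -> [sum of points, row count]
for n, relegated_flag, points in rows:
    totals[(n, relegated_flag)][0] += points
    totals[(n, relegated_flag)][1] += 1

# Average per group, keep only the relegated teams, sorted by game number.
relegated_avg = sorted(
    (n, total / count) for (n, flag), (total, count) in totals.items() if flag == 1
)
print(relegated_avg)
```

<p>Each tuple is a game number and the average cumulative points of the eventually relegated teams at that point, which is the line plotted in the next step.</p>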

<p>We then want to plot this and compare it to the newly promoted teams, plus Wolves and West Ham, who are in the most trouble. Also, shout out to <a href="https://teamcolours.netlify.app/">https://teamcolours.netlify.app/</a> for the actual colours of the teams for the plot.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">fig</span> <span class="o">=</span> <span class="n">go</span><span class="p">.</span><span class="n">Figure</span><span class="p">()</span>
<span class="n">fig</span><span class="p">.</span><span class="n">add_trace</span><span class="p">(</span><span class="n">go</span><span class="p">.</span><span class="n">Scatter</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="n">relegated</span><span class="p">[</span><span class="s">"N"</span><span class="p">],</span> <span class="n">y</span><span class="o">=</span><span class="n">relegated</span><span class="p">[</span><span class="s">"TotalPoints1"</span><span class="p">],</span>
                    <span class="n">mode</span><span class="o">=</span><span class="s">'lines+markers'</span><span class="p">,</span>
                    <span class="n">name</span><span class="o">=</span><span class="s">'Avg Points Of A Relegated Team'</span><span class="p">))</span>

<span class="k">for</span> <span class="n">team</span> <span class="ow">in</span> <span class="p">[</span><span class="s">"West Ham"</span><span class="p">,</span> <span class="s">"Wolves"</span><span class="p">,</span> <span class="s">"Sunderland"</span><span class="p">,</span> <span class="s">"Leeds"</span><span class="p">,</span> <span class="s">"Burnley"</span><span class="p">]:</span>
    <span class="n">latestTeam</span> <span class="o">=</span> <span class="n">teamData</span><span class="p">.</span><span class="nb">filter</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"Team1"</span><span class="p">)</span> <span class="o">==</span> <span class="n">team</span><span class="p">,</span> <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"Season"</span><span class="p">)</span> <span class="o">==</span> <span class="s">"2526"</span><span class="p">)</span>

    <span class="n">fig</span><span class="p">.</span><span class="n">add_trace</span><span class="p">(</span><span class="n">go</span><span class="p">.</span><span class="n">Scatter</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="n">latestTeam</span><span class="p">[</span><span class="s">"N"</span><span class="p">],</span> <span class="n">y</span><span class="o">=</span><span class="n">latestTeam</span><span class="p">[</span><span class="s">"TotalPoints1"</span><span class="p">],</span>
                    <span class="n">mode</span><span class="o">=</span><span class="s">'lines+markers'</span><span class="p">,</span>
                    <span class="n">name</span><span class="o">=</span><span class="n">team</span><span class="p">))</span>


<span class="n">fig</span><span class="p">.</span><span class="n">update_layout</span><span class="p">(</span><span class="n">height</span><span class="o">=</span><span class="mi">500</span><span class="p">,</span> <span class="n">width</span><span class="o">=</span><span class="mi">700</span><span class="p">,</span>
                  <span class="n">title_text</span><span class="o">=</span><span class="s">"Relegation Stats"</span><span class="p">)</span>

<span class="n">fig</span><span class="p">.</span><span class="n">show</span><span class="p">()</span>
</code></pre></div></div>

<p><img src="/assets/relegation/relegated.png" alt="Line chart titled Relegation Stats showing average cumulative points of teams that were eventually relegated compared to current teams. X axis is match week number and Y axis is total points. Primary subjects are the average relegated team line and individual team lines for West Ham, Wolves, Sunderland, Leeds, and Burnley. The average relegated team line rises steadily through the season. Sunderland's line is well above the average, Leeds and Burnley track close to the average, and West Ham and Wolves fall below the average with Wolves furthest below." /></p>

<p>Wolves and West Ham are currently in trouble. They are below the average line at this point in the season, whereas Sunderland are storming it, Leeds are also quite safe, and Burnley’s recent performances have kept them above the fated line.</p>

<p>However, the average points of a relegated team isn’t the best yardstick. It can get dragged down by a very poor team at the bottom of the league. Instead, we need to look at the minimum and average number of points needed to stay safe each season.</p>

<p>This is the same calculation as above, but aggregating on the final position of each team and then filtering on position 16, one above the relegation zone.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">safe</span> <span class="o">=</span> <span class="p">(</span><span class="n">teamData</span><span class="p">.</span><span class="nb">filter</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"Season"</span><span class="p">)</span> <span class="o">!=</span> <span class="s">"2526"</span><span class="p">)</span>
                <span class="p">.</span><span class="n">group_by</span><span class="p">([</span><span class="s">"N"</span><span class="p">,</span> <span class="s">"FinalPosition"</span><span class="p">])</span>
                <span class="p">.</span><span class="n">agg</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"TotalPoints1"</span><span class="p">).</span><span class="n">mean</span><span class="p">(),</span> 
                     <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"TotalGoals1"</span><span class="p">).</span><span class="n">mean</span><span class="p">(),</span> 
                    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"TotalGoalsC1"</span><span class="p">).</span><span class="n">mean</span><span class="p">(),</span>
                    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"TotalPoints1"</span><span class="p">).</span><span class="nb">min</span><span class="p">().</span><span class="n">alias</span><span class="p">(</span><span class="s">"Min"</span><span class="p">))</span>
                <span class="p">.</span><span class="n">sort</span><span class="p">(</span><span class="s">"N"</span><span class="p">).</span><span class="nb">filter</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"FinalPosition"</span><span class="p">)</span> <span class="o">==</span> <span class="mi">16</span><span class="p">)</span>
       <span class="p">)</span>
</code></pre></div></div>

<p>Again, plotting this with the same teams.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">fig</span> <span class="o">=</span> <span class="n">go</span><span class="p">.</span><span class="n">Figure</span><span class="p">()</span>
<span class="n">fig</span><span class="p">.</span><span class="n">add_trace</span><span class="p">(</span><span class="n">go</span><span class="p">.</span><span class="n">Scatter</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="n">safe</span><span class="p">[</span><span class="s">"N"</span><span class="p">],</span> <span class="n">y</span><span class="o">=</span><span class="n">safe</span><span class="p">[</span><span class="s">"TotalPoints1"</span><span class="p">],</span>
                    <span class="n">mode</span><span class="o">=</span><span class="s">'lines+markers'</span><span class="p">,</span>
                    <span class="n">name</span><span class="o">=</span><span class="s">'Avg Points of a Safe Team'</span><span class="p">))</span>

<span class="n">fig</span><span class="p">.</span><span class="n">add_trace</span><span class="p">(</span><span class="n">go</span><span class="p">.</span><span class="n">Scatter</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="n">safe</span><span class="p">[</span><span class="s">"N"</span><span class="p">],</span> <span class="n">y</span><span class="o">=</span><span class="n">safe</span><span class="p">[</span><span class="s">"Min"</span><span class="p">],</span>
                    <span class="n">mode</span><span class="o">=</span><span class="s">'lines+markers'</span><span class="p">,</span>
                    <span class="n">name</span><span class="o">=</span><span class="s">'Min Points of a Safe Team'</span><span class="p">))</span>

<span class="k">for</span> <span class="n">team</span> <span class="ow">in</span> <span class="p">[</span><span class="s">"West Ham"</span><span class="p">,</span> <span class="s">"Wolves"</span><span class="p">,</span> <span class="s">"Sunderland"</span><span class="p">,</span> <span class="s">"Leeds"</span><span class="p">,</span> <span class="s">"Burnley"</span><span class="p">]:</span>
    <span class="n">latestTeam</span> <span class="o">=</span> <span class="n">teamData</span><span class="p">.</span><span class="nb">filter</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"Team1"</span><span class="p">)</span> <span class="o">==</span> <span class="n">team</span><span class="p">,</span> <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"Season"</span><span class="p">)</span> <span class="o">==</span> <span class="s">"2526"</span><span class="p">)</span>

    <span class="n">fig</span><span class="p">.</span><span class="n">add_trace</span><span class="p">(</span><span class="n">go</span><span class="p">.</span><span class="n">Scatter</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="n">latestTeam</span><span class="p">[</span><span class="s">"N"</span><span class="p">],</span> <span class="n">y</span><span class="o">=</span><span class="n">latestTeam</span><span class="p">[</span><span class="s">"TotalPoints1"</span><span class="p">],</span>
                    <span class="n">mode</span><span class="o">=</span><span class="s">'lines+markers'</span><span class="p">,</span>
                    <span class="n">name</span><span class="o">=</span><span class="n">team</span><span class="p">))</span>

<span class="n">fig</span><span class="p">.</span><span class="n">update_layout</span><span class="p">(</span><span class="n">height</span><span class="o">=</span><span class="mi">500</span><span class="p">,</span> <span class="n">width</span><span class="o">=</span><span class="mi">700</span><span class="p">,</span>
                  <span class="n">title_text</span><span class="o">=</span><span class="s">"Safety Stats"</span><span class="p">)</span>

<span class="n">fig</span><span class="p">.</span><span class="n">show</span><span class="p">()</span>
</code></pre></div></div>

<p><img src="/assets/relegation/safe.png" alt="Line chart titled Safety Stats showing cumulative points by match week on the x axis and total points on the y axis. Primary subjects are the colored lines representing Avg Points of a Safe Team, Min Points of a Safe Team, and individual teams West Ham, Wolves, Sunderland, Leeds, Burnley. Sunderland is well above both safety lines, Leeds and Burnley track close to the average and minimum lines, Wolves falls below both safety lines, and West Ham falls below the average line." /></p>

<p>Again, Wolves and West Ham are well below the average line (blue), and Wolves are even below the minimum line (red). Burnley and Leeds are within touching distance. Sunderland are well above. From this, Sunderland should be happy and confident that they can stay up; Leeds are at the bare minimum. Wolves are in big danger, but with a new manager, they might be able to get going again. West Ham have already had their new manager bounce, and it’s still looking precarious.</p>

<p>This also shows that, on average, you need 37.23 points to survive in the Premier League, with 35 as the bare minimum. So the fabled 40-point mark is actually a slight overestimation.</p>
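
<p>As a rough illustration of how two headline numbers like these fall out, here is a minimal Python sketch using only the standard library. The points totals are made up for illustration (they are not the data behind 37.23 and 35), and <code>gap_to_safety</code> is a hypothetical helper rather than something from the analysis above.</p>

```python
from statistics import mean

# Hypothetical final points totals for the team finishing just above the
# relegation zone in five made-up seasons (illustrative only, not real data).
final_points_safe = [39, 36, 41, 35, 38]

# The two safety lines evaluated at the final match week.
avg_safe = mean(final_points_safe)  # average points of a safe team
min_safe = min(final_points_safe)   # minimum points of a safe team


def gap_to_safety(points, threshold):
    """Points a team is ahead of (positive) or behind (negative) a safety line."""
    return points - threshold


print(f"average: {avg_safe:.2f}, minimum: {min_safe}")
print(f"gap for a team on 40 points: {gap_to_safety(40, avg_safe):.2f}")
```

<p>In the real analysis, the equivalent values would presumably be read from the last match week of the average and minimum lines in <code>safe</code>.</p>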

<p>It’s not just points, though. What about the number of goals each team has scored and how many they are conceding? Let’s look at these stats and also format up the graph so it’s a bit less default, and focus just on the games so far.</p>

<p><img src="/assets/relegation/more_safe.png" alt="A line chart comparing Premier League teams' cumulative points across the season, focusing on teams near the relegation zone." title="A line chart comparing Premier League teams' cumulative points across the season, focusing on teams near the relegation zone." /></p>

<p>No real change to the conclusion. Sunderland are doing well on both points and goals scored, and their conceded goals are below the average in the 16th position. Wolves and West Ham are underperforming across the board. Leeds and Burnley are scraping by.</p>

<h2 id="conclusion">Conclusion</h2>

<p>Based on these early-season trajectories, it’s not looking good for West Ham or Wolves. By contrast, Sunderland should be getting excited about the prospect of another season in the Premier League. Leeds and Burnley are not quite out of the woods. As another cliché goes, relegation is about hoping you are better than three other teams, and at the minute Wolves and West Ham are struggling to find three other worse teams!</p>]]></content><author><name>Dean Markwick</name></author><category term="python" /><summary type="html"><![CDATA[It’s been an interesting start to the Premier League. All of the promoted teams (Sunderland, Leeds and Burnley) are outside the relegation zone, with Wolves and West Ham struggling at the bottom. So I want to look back at the other seasons and work out the average number of points throughout the season that characterises relegation teams, and how many points do you need to avoid relegation?]]></summary></entry><entry><title type="html">Easy Neural Nets and Finance - Part 1</title><link href="https://dm13450.github.io/2025/07/23/Easy-Neural-Nets-and-Finance-Part-1.html" rel="alternate" type="text/html" title="Easy Neural Nets and Finance - Part 1" /><published>2025-07-23T00:00:00+00:00</published><updated>2025-07-23T00:00:00+00:00</updated><id>https://dm13450.github.io/2025/07/23/Easy-Neural-Nets-and-Finance-Part-1</id><content type="html" xml:base="https://dm13450.github.io/2025/07/23/Easy-Neural-Nets-and-Finance-Part-1.html"><![CDATA[<p>I’m fortunate enough to be participating in a lecture series at work that covers deep learning and its applications in finance. This will be a series of posts documenting what I learn and implementing the ‘homework’ (I’m 32, how am I still getting homework?) using Julia and Flux.</p>

<p>The phrase ‘deep learning’ already feels outdated, and the current hotness is more about AI and LLMs, so the lecture and topics might feel a bit out of date. But given LLMs wouldn’t be here without deep learning, it feels like going back to the basics.</p>

<p>Plus, I’ve never really jumped in and explored neural nets, so this gives me a chance to do some deep learning in an applied way.</p>

<p>After reading this, you will be able to build your own neural net with different layers and compare it to a simpler linear model.</p>

<h2 id="predicting-a-stocks-daily-volume">Predicting a Stock’s Daily Volume</h2>

<p>If you Google neural nets and finance, you will find an endless supply of copy-pasted quant finance Python examples of people using PyTorch/TensorFlow/JAX to predict the closing price of some stock. Kudos to these tutorials for putting something out there, but you will struggle to learn anything meaningful about finance, modelling or neural nets from them.</p>

<p>This is my attempt to be different.</p>

<p>Instead of predicting prices or returns and showing that neural nets can make money, we will model the total number of shares traded per day. For starters, this is much easier as the data is a bit more signal and less noise. Plus, if I managed to build something that could predict prices, why would I share it?</p>

<p>So, we will be using deep learning to build a model of the <em>total trading volume</em> per day of the SPY ETF. A basic time series prediction task that can be approached both with linear models and deep learning.</p>

<p>You know the drill, fire up your Julia notebook and follow along.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">using</span> <span class="n">Dates</span><span class="x">,</span> <span class="n">AlpacaMarkets</span><span class="x">,</span> <span class="n">Plots</span><span class="x">,</span> <span class="n">StatsBase</span>
<span class="k">using</span> <span class="n">DataFramesMeta</span><span class="x">,</span> <span class="n">ShiftedArrays</span>
</code></pre></div></div>

<h2 id="getting-the-data">Getting the Data</h2>

<p>We are using similar data to my <a href="https://dm13450.github.io/2025/06/16/Cyclical-Embedding.html">Cyclical Embedding</a> post, except this time we will be using the SPY ETF instead of Apple.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">spyRaw</span><span class="x">,</span> <span class="n">npt</span> <span class="o">=</span> <span class="n">AlpacaMarkets</span><span class="o">.</span><span class="n">stock_bars</span><span class="x">(</span><span class="s">"SPY"</span><span class="x">,</span> <span class="s">"1Day"</span><span class="x">;</span> 
  <span class="n">startTime</span><span class="o">=</span><span class="kt">Date</span><span class="x">(</span><span class="s">"2000-01-01"</span><span class="x">),</span> 
  <span class="n">endTime</span> <span class="o">=</span> <span class="n">today</span><span class="x">()</span> <span class="o">-</span> <span class="kt">Day</span><span class="x">(</span><span class="mi">1</span><span class="x">)</span> <span class="x">,</span>
  <span class="n">adjustment</span> <span class="o">=</span> <span class="s">"all"</span><span class="x">,</span> <span class="n">limit</span> <span class="o">=</span> <span class="mi">10000</span><span class="x">)</span>
</code></pre></div></div>

<p>From the raw data, we parse the timestamp and rescale the volumes into millions of shares.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">spy</span> <span class="o">=</span> <span class="n">spyRaw</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="x">[</span><span class="o">:</span><span class="n">t</span><span class="x">,</span> <span class="o">:</span><span class="n">v</span><span class="x">,</span> <span class="o">:</span><span class="n">c</span><span class="x">]]</span>
<span class="n">spy</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"t"</span><span class="x">]</span> <span class="o">=</span> <span class="kt">DateTime</span><span class="o">.</span><span class="x">(</span><span class="n">chop</span><span class="o">.</span><span class="x">(</span><span class="n">spy</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"t"</span><span class="x">]));</span>
<span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">vNorm</span><span class="x">]</span> <span class="o">=</span> <span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">v</span><span class="x">]</span> <span class="o">.*</span> <span class="mf">1e-6</span><span class="x">;</span>
</code></pre></div></div>

<p>We also add the close-to-close log return and lag it by one day, so that the feature only uses the return already known before the day we are predicting.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">spy</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"r"</span><span class="x">]</span> <span class="o">=</span> <span class="n">log</span><span class="o">.</span><span class="x">(</span><span class="n">spy</span><span class="o">.</span><span class="n">c</span><span class="x">)</span> <span class="o">.-</span> <span class="n">ShiftedArrays</span><span class="o">.</span><span class="n">lag</span><span class="x">(</span><span class="n">log</span><span class="o">.</span><span class="x">(</span><span class="n">spy</span><span class="o">.</span><span class="n">c</span><span class="x">))</span>
<span class="n">spy</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"prev_r"</span><span class="x">]</span> <span class="o">=</span> <span class="n">ShiftedArrays</span><span class="o">.</span><span class="n">lag</span><span class="x">(</span><span class="n">spy</span><span class="o">.</span><span class="n">r</span><span class="x">);</span>
</code></pre></div></div>

<p>In this data, the daily volume isn’t stationary and it is also heavy-tailed.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">plot</span><span class="x">(</span>
  <span class="n">plot</span><span class="x">(</span><span class="n">spy</span><span class="o">.</span><span class="n">t</span><span class="x">,</span> <span class="n">spy</span><span class="o">.</span><span class="n">vNorm</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"IEX Daily Volume"</span><span class="x">),</span>
  <span class="n">histogram</span><span class="x">(</span><span class="n">spy</span><span class="o">.</span><span class="n">vNorm</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Daily Volume Distribution"</span><span class="x">)</span>
  <span class="x">)</span>
</code></pre></div></div>

<p><img src="/assets/deeplearning/part1/volumes.png" alt="Line chart showing daily trading volume for SPY ETF over time. The chart displays a fluctuating pattern with several peaks and troughs, illustrating periods of higher and lower trading activity." width="80%" class="center-image" /></p>

<p>Looking at the autocorrelation, we can see a long-range dependence in the daily volumes, but when we take the daily difference in daily volume, we see a strong effect at lag 1, and the rest are much smaller.</p>

<p><img src="/assets/deeplearning/part1/volumes_autocor.png" alt="Bar chart displaying autocorrelation of daily trading volume for SPY ETF across multiple lags. The chart shows a prominent negative bar at lag 1, indicating strong mean reversion, followed by smaller bars for subsequent lags." width="80%" class="center-image" /></p>

<p>A negative value at lag 1 indicates a mean reversion-like process, but more importantly, it means modelling the difference in daily volume will be easier than just directly modelling the daily volumes.</p>
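
<p>To see why a negative lag-1 autocorrelation in the differences points to mean reversion, consider a toy model (an illustrative assumption, not a fit to the SPY data) in which the volume is just noise around a fixed level, \(v_t = \mu + \varepsilon_t\), with independent \(\varepsilon_t\) of variance \(\sigma^2\). The daily change is then \(\Delta v_t = \varepsilon_t - \varepsilon_{t-1}\), and</p>

<p>\[ \operatorname{Corr}(\Delta v_t, \Delta v_{t-1}) = \frac{\operatorname{Cov}(\varepsilon_t - \varepsilon_{t-1},\ \varepsilon_{t-1} - \varepsilon_{t-2})}{2\sigma^2} = \frac{-\sigma^2}{2\sigma^2} = -\frac{1}{2}, \]</p>

<p>while every higher lag has zero correlation, because changes two or more days apart no longer share a noise term. In this simplified setting, one large negative bar at lag 1 followed by small bars is exactly what a series that keeps snapping back to its level would produce.</p>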

<p>Predicting the daily change in volume does reduce how far out we can forecast volumes, though, as it relies on using the known previous volume to produce the next day’s volume. If you estimate multiple days, then you will be compounding the error.</p>
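
<p>Written out (using \(\widehat{\Delta v}\) as assumed notation for the model’s predicted change), the one-day-ahead forecast only needs the known previous volume, whereas a two-day-ahead forecast has to build on the first forecast:</p>

<p>\[ \hat{v}_{t+1} = v_t + \widehat{\Delta v}_{t+1}, \qquad \hat{v}_{t+2} = \hat{v}_{t+1} + \widehat{\Delta v}_{t+2} = v_t + \widehat{\Delta v}_{t+1} + \widehat{\Delta v}_{t+2}. \]</p>

<p>Each predicted change carries its own error, so an \(h\)-day forecast is the last known volume plus a sum of \(h\) noisy predictions, which is where the compounding comes from.</p>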

<p>We lag the volume variables as required.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">prev_vNorm</span><span class="x">]</span> <span class="o">=</span> <span class="n">ShiftedArrays</span><span class="o">.</span><span class="n">lag</span><span class="x">(</span><span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">vNorm</span><span class="x">])</span>
<span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">delta_vNorm</span><span class="x">]</span> <span class="o">=</span> <span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">vNorm</span><span class="x">]</span> <span class="o">.-</span> <span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">prev_vNorm</span><span class="x">]</span>
<span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">prev_delta_vNorm</span><span class="x">]</span> <span class="o">=</span> <span class="n">ShiftedArrays</span><span class="o">.</span><span class="n">lag</span><span class="x">(</span><span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">delta_vNorm</span><span class="x">])</span>

<span class="n">spy</span> <span class="o">=</span> <span class="n">dropmissing</span><span class="x">(</span><span class="n">spy</span><span class="x">)</span>
</code></pre></div></div>

<p>We add in the time-based variables and cyclically encode them.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">DayOfMonth</span><span class="x">]</span> <span class="o">=</span> <span class="n">dayofmonth</span><span class="o">.</span><span class="x">(</span><span class="n">spy</span><span class="o">.</span><span class="n">t</span><span class="x">)</span> <span class="o">.-</span> <span class="mi">1</span>
<span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">DayOfWeek</span><span class="x">]</span> <span class="o">=</span> <span class="n">dayofweek</span><span class="o">.</span><span class="x">(</span><span class="n">spy</span><span class="o">.</span><span class="n">t</span><span class="x">)</span> <span class="o">.-</span> <span class="mi">1</span>
<span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">DayOfQtr</span><span class="x">]</span> <span class="o">=</span> <span class="n">dayofquarter</span><span class="o">.</span><span class="x">(</span><span class="n">spy</span><span class="o">.</span><span class="n">t</span><span class="x">)</span> <span class="o">.-</span> <span class="mi">1</span>
<span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">MonthOfYear</span><span class="x">]</span> <span class="o">=</span> <span class="n">month</span><span class="o">.</span><span class="x">(</span><span class="n">spy</span><span class="o">.</span><span class="n">t</span><span class="x">)</span> <span class="o">.-</span> <span class="mi">1</span>

<span class="n">spy</span> <span class="o">=</span> <span class="n">cyclical_encode</span><span class="x">(</span><span class="n">spy</span><span class="x">,</span> <span class="s">"DayOfWeek"</span><span class="x">)</span>
<span class="n">spy</span> <span class="o">=</span> <span class="n">cyclical_encode</span><span class="x">(</span><span class="n">spy</span><span class="x">,</span> <span class="s">"DayOfMonth"</span><span class="x">)</span>
<span class="n">spy</span> <span class="o">=</span> <span class="n">cyclical_encode</span><span class="x">(</span><span class="n">spy</span><span class="x">,</span> <span class="s">"DayOfQtr"</span><span class="x">)</span>
<span class="n">spy</span> <span class="o">=</span> <span class="n">cyclical_encode</span><span class="x">(</span><span class="n">spy</span><span class="x">,</span> <span class="s">"MonthOfYear"</span><span class="x">);</span>
</code></pre></div></div>
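
<p>The <code>cyclical_encode</code> helper is not defined in this post. Assuming it follows the standard sine and cosine encoding (as in the linked Cyclical Embedding post), a zero-based value \(x\) with period \(P\) is mapped to a pair of columns, for example <code>DayOfWeek_sin</code> and <code>DayOfWeek_cos</code>:</p>

<p>\[ x_{\sin} = \sin\left(\frac{2\pi x}{P}\right), \qquad x_{\cos} = \cos\left(\frac{2\pi x}{P}\right). \]</p>

<p>Here \(P\) would be the number of distinct values in the cycle (the exact choice inside the helper is an assumption). The point of the pair is that the last value of a cycle ends up next to the first one, which a plain integer encoding cannot express.</p>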

<p>We also flag whether each date is the last trading day of its month.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">month</span><span class="x">]</span> <span class="o">=</span> <span class="n">floor</span><span class="o">.</span><span class="x">(</span><span class="n">spy</span><span class="o">.</span><span class="n">t</span><span class="x">,</span> <span class="n">Dates</span><span class="o">.</span><span class="kt">Month</span><span class="x">)</span>
<span class="n">spy</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">groupby</span><span class="x">(</span><span class="n">spy</span><span class="x">,</span> <span class="o">:</span><span class="n">month</span><span class="x">),</span> 
                 <span class="o">:</span><span class="n">MonthEnd</span> <span class="o">=</span> <span class="x">(</span><span class="o">:</span><span class="n">t</span> <span class="o">.==</span> <span class="n">maximum</span><span class="x">(</span><span class="o">:</span><span class="n">t</span><span class="x">)))</span>
</code></pre></div></div>

<p>Finally, we split the data chronologically: the first 2,000 days for training and the rest for testing.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">spyTrain</span> <span class="o">=</span> <span class="n">spy</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="mi">2000</span><span class="x">,</span> <span class="o">:</span><span class="x">];</span>
<span class="n">spyTest</span> <span class="o">=</span> <span class="n">spy</span><span class="x">[</span><span class="mi">2001</span><span class="o">:</span><span class="k">end</span><span class="x">,</span> <span class="o">:</span><span class="x">];</span>
</code></pre></div></div>

<p>With the data prepared, we move on to building out the models.</p>

<h2 id="the-baseline-model">The Baseline Model</h2>

<p>We always want to make sure the neural nets are adding value, so we need something simple to compare to. In regular statistical modelling, this might be an intercept-only model, but in this case, we want the best linear model.</p>

<p>It’s a simple linear regression on all the available variables.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">using</span> <span class="n">GLM</span>

<span class="n">linearModel</span> <span class="o">=</span> <span class="n">lm</span><span class="x">(</span><span class="nd">@formula</span><span class="x">(</span><span class="n">delta_vNorm</span> <span class="o">~</span> <span class="n">prev_delta_vNorm</span> <span class="o">+</span> <span class="n">prev_vNorm</span> <span class="o">+</span> 
                                        <span class="n">MonthEnd</span> <span class="o">+</span> <span class="n">prev_r</span> <span class="o">+</span>
                                        <span class="n">DayOfWeek_sin</span> <span class="o">+</span> <span class="n">DayOfWeek_cos</span> <span class="o">+</span> 
                                        <span class="n">DayOfMonth_sin</span> <span class="o">+</span> <span class="n">DayOfMonth_cos</span> <span class="o">+</span>
                                        <span class="n">DayOfQtr_sin</span> <span class="o">+</span> <span class="n">DayOfQtr_cos</span> <span class="o">+</span>
                                        <span class="n">MonthOfYear_sin</span> <span class="o">+</span> <span class="n">MonthOfYear_cos</span>
                        <span class="x">),</span> <span class="n">spyTrain</span><span class="x">)</span>
</code></pre></div></div>

<p>This fits instantly and we get an in-sample \(R^2\) of 23% and an out-of-sample MSE of 380.</p>

<p>To add the predicted volume to the test set, we need to add the prediction of the model to the previous volume.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">spyTest</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"linearPred"</span><span class="x">]</span> <span class="o">=</span> <span class="n">spyTest</span><span class="o">.</span><span class="n">prev_vNorm</span> <span class="o">.+</span> <span class="n">predict</span><span class="x">(</span><span class="n">linearModel</span><span class="x">,</span> <span class="n">spyTest</span><span class="x">);</span>
<span class="n">sort!</span><span class="x">(</span><span class="n">spyTest</span><span class="x">,</span> <span class="o">:</span><span class="n">t</span><span class="x">);</span>
</code></pre></div></div>

<p><img src="/assets/deeplearning/part1/lm_res.png" alt="Line chart comparing predicted and actual daily trading volumes for SPY ETF over time." title="Line chart comparing predicted and actual daily trading volumes for SPY ETF over time." width="80%" class="center-image" /></p>

<p>Everything lines up quite nicely. There are a couple of periods where the volume spikes and the model can’t keep up, but other than that, it looks decent.</p>

<p>It’s also interesting to look at the shape of the cyclically encoded variables.</p>

<p><img src="/assets/deeplearning/part1/lm_cyen.png" alt="Line plot showing the effect of cyclically encoded variables on predicted daily trading volume changes for SPY ETF. The chart displays four panels for day of the week, day of the month, day of the quarter, and month of the year, each with a smooth curve illustrating how each time-based feature influences the model output." width="80%" class="center-image" /></p>

<p>Plenty going on here!</p>

<ul>
  <li><strong>Day of the Week</strong> - Wednesdays and Thursdays have a larger positive effect than Mondays and Tuesdays.</li>
  <li><strong>Day of the Month</strong> - The middle of the month (10-15) has the highest positive effect.</li>
  <li><strong>Day of the Quarter</strong> - Larger positive effects towards the end of the quarter.</li>
  <li><strong>Month of the Year</strong> - Summer months have the most negative effect.</li>
</ul>

<p>A positive effect here means a larger increase in the daily volume compared to the previous day, and a negative effect means a larger decrease.</p>

<p>So we have an intuitive model to begin with, and a strong baseline for the neural net models to improve upon.</p>

<h2 id="neural-nets-in-julia">Neural Nets in Julia</h2>

<p>Let’s increase the model complexity and introduce the neural nets. We are still using the same variables, but we expand them to include even more lags of the change in volumes.</p>

<h3 id="preparing-the-data-for-a-neural-network">Preparing the Data for a Neural Network</h3>

<p>We start with the dataframe, then iterate through and add 30 lags of the previous volume changes.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">rawData</span> <span class="o">=</span> <span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="x">[</span><span class="o">:</span><span class="n">t</span><span class="x">,</span> <span class="o">:</span><span class="n">delta_vNorm</span><span class="x">,</span> <span class="o">:</span><span class="n">prev_vNorm</span><span class="x">,</span> <span class="o">:</span><span class="n">MonthEnd</span><span class="x">,</span> <span class="o">:</span><span class="n">prev_r</span><span class="x">,</span>
                      <span class="o">:</span><span class="n">DayOfWeek_sin</span><span class="x">,</span> <span class="o">:</span><span class="n">DayOfWeek_cos</span><span class="x">,</span>
                      <span class="o">:</span><span class="n">DayOfMonth_sin</span><span class="x">,</span> <span class="o">:</span><span class="n">DayOfMonth_cos</span><span class="x">,</span>
                      <span class="o">:</span><span class="n">DayOfQtr_sin</span><span class="x">,</span> <span class="o">:</span><span class="n">DayOfQtr_cos</span><span class="x">,</span>
                      <span class="o">:</span><span class="n">MonthOfYear_sin</span><span class="x">,</span> <span class="o">:</span><span class="n">MonthOfYear_cos</span><span class="x">]]</span>

<span class="n">maxLag</span> <span class="o">=</span> <span class="mi">30</span>
<span class="k">for</span> <span class="n">i</span> <span class="k">in</span> <span class="mi">1</span><span class="o">:</span><span class="n">maxLag</span>
    <span class="n">rawData</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="kt">Symbol</span><span class="x">(</span><span class="s">"lag_</span><span class="si">$(i)</span><span class="s">_delta_vNorm"</span><span class="x">)]</span> <span class="o">=</span> <span class="n">ShiftedArrays</span><span class="o">.</span><span class="n">lag</span><span class="x">(</span><span class="n">rawData</span><span class="o">.</span><span class="n">delta_vNorm</span><span class="x">,</span> <span class="n">i</span><span class="x">)</span>
<span class="k">end</span>

<span class="n">dropmissing!</span><span class="x">(</span><span class="n">rawData</span><span class="x">)</span>
</code></pre></div></div>

<p>We then need to go from dataframes to matrices and flip the dimensions so each column is an observation rather than each row.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">y</span> <span class="o">=</span> <span class="n">permutedims</span><span class="x">(</span><span class="n">rawData</span><span class="o">.</span><span class="n">delta_vNorm</span><span class="x">)</span>
<span class="n">ts</span> <span class="o">=</span> <span class="n">rawData</span><span class="o">.</span><span class="n">t</span>
<span class="n">x</span> <span class="o">=</span> <span class="nd">@select</span><span class="x">(</span><span class="n">rawData</span><span class="x">,</span> <span class="n">Not</span><span class="x">(</span><span class="o">:</span><span class="n">delta_vNorm</span><span class="x">,</span> <span class="o">:</span><span class="n">t</span><span class="x">))</span>
<span class="n">x</span> <span class="o">=</span> <span class="n">permutedims</span><span class="x">(</span><span class="kt">Matrix</span><span class="x">(</span><span class="n">x</span><span class="x">));</span>
</code></pre></div></div>
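
<p>If the dimension flip feels abstract, here is a minimal sketch on made-up toy data (the names <code class="language-plaintext highlighter-rouge">target</code>, <code class="language-plaintext highlighter-rouge">features</code>, <code class="language-plaintext highlighter-rouge">xToy</code> and <code class="language-plaintext highlighter-rouge">yToy</code> are just for illustration). It shows how <code class="language-plaintext highlighter-rouge">permutedims</code> turns a vector into a single-row matrix and puts one observation in each column.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Toy stand-ins for the real data: 3 observations, 2 features.
target = [0.1, -0.2, 0.3]                # Vector, like rawData.delta_vNorm
features = [1.0 2.0; 3.0 4.0; 5.0 6.0]   # 3×2 Matrix, one row per observation

yToy = permutedims(target)     # 1×3 Matrix
xToy = permutedims(features)   # 2×3 Matrix, one column per observation

println(size(yToy))   # (1, 3)
println(size(xToy))   # (2, 3)
</code></pre></div></div>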

<p>Again, we need a train/test split: the first 2,000 observations for training and everything after that for testing.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">xTrain</span> <span class="o">=</span> <span class="n">x</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="mi">1</span><span class="o">:</span><span class="mi">2000</span><span class="x">]</span>
<span class="n">yTrain</span> <span class="o">=</span> <span class="n">y</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="mi">1</span><span class="o">:</span><span class="mi">2000</span><span class="x">]</span>
<span class="n">tsTrain</span> <span class="o">=</span> <span class="n">ts</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="mi">2000</span><span class="x">]</span>

<span class="n">xTest</span> <span class="o">=</span> <span class="n">x</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="mi">2001</span><span class="o">:</span><span class="k">end</span><span class="x">]</span>
<span class="n">yTest</span> <span class="o">=</span> <span class="n">y</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="mi">2001</span><span class="o">:</span><span class="k">end</span><span class="x">]</span>
<span class="n">tsTest</span> <span class="o">=</span> <span class="n">ts</span><span class="x">[</span><span class="mi">2001</span><span class="o">:</span><span class="k">end</span><span class="x">];</span>
</code></pre></div></div>

<p><a href="https://fluxml.ai/Flux.jl/stable/">Flux.jl</a> is Julia’s neural network library and the go-to for deep learning in Julia. It provides all the tools to build and train these types of models. One such tool is the <code class="language-plaintext highlighter-rouge">DataLoader</code>, which enables batch training for models. Batch training uses random subsets of the full data to train the model, which is very useful if you have too much data to fit into memory. You get to train the model on all your data by breaking it down into chunks.</p>

<p>Now, in this specific case, it isn’t needed as our data is small, but it’s always good to understand the techniques, and Flux makes it very simple. Pass in the <code class="language-plaintext highlighter-rouge">x</code> and <code class="language-plaintext highlighter-rouge">y</code> matrices, define the batch size and whether you want to randomise the samples or not.</p>

<p>Here we build random batches of 5.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">train_loader</span> <span class="o">=</span> <span class="n">Flux</span><span class="o">.</span><span class="n">DataLoader</span><span class="x">((</span><span class="n">x</span><span class="x">,</span> <span class="n">y</span><span class="x">),</span> <span class="n">batchsize</span><span class="o">=</span><span class="mi">5</span><span class="x">,</span> <span class="n">shuffle</span><span class="o">=</span><span class="nb">true</span><span class="x">);</span>
</code></pre></div></div>
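
<p>To get a feel for what shuffled batching does, here is a rough sketch in plain Julia. The <code class="language-plaintext highlighter-rouge">toy_batches</code> helper is hypothetical and is not how Flux implements <code class="language-plaintext highlighter-rouge">DataLoader</code>; it just shuffles the column indices and cuts them into chunks.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using Random

# Hypothetical helper, not Flux's implementation: shuffle the column indices,
# then cut them into chunks of `batchsize`.
function toy_batches(x, y; batchsize=5, rng=Random.default_rng())
    idx = shuffle(rng, 1:size(x, 2))
    return [(x[:, chunk], y[:, chunk]) for chunk in Iterators.partition(idx, batchsize)]
end

xToy = rand(3, 12)   # 3 features, 12 observations
yToy = rand(1, 12)
batches = toy_batches(xToy, yToy; batchsize=5)

println(length(batches))          # 12 observations in batches of 5: 5 + 5 + 2
println(size.(first.(batches)))   # feature matrix size for each batch
</code></pre></div></div>

<p>In this sketch the last batch is smaller whenever the batch size doesn’t divide the number of observations evenly.</p>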

<p>Next, we need to build the model. In Flux, each layer of the basic net needs the number of input nodes and output nodes.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">flux_model</span> <span class="o">=</span> <span class="n">Dense</span><span class="x">(</span><span class="n">size</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="mi">1</span><span class="x">),</span> <span class="mi">1</span><span class="x">)</span>
</code></pre></div></div>

<p>We simply take the number of rows of the <code class="language-plaintext highlighter-rouge">x</code> matrix as the input size and output 1 number - the expected change in volume for that day.</p>

<p>We also need to define a loss function for the model. We will use the mean square error (MSE). We predict the values from the model and calculate the MSE compared to the true values.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">function</span><span class="nf"> flux_loss</span><span class="x">(</span><span class="n">flux_model</span><span class="x">,</span> <span class="n">x</span><span class="x">,</span> <span class="n">y</span><span class="x">)</span>
    <span class="n">yhat</span> <span class="o">=</span> <span class="n">flux_model</span><span class="x">(</span><span class="n">x</span><span class="x">)</span>
    <span class="n">Flux</span><span class="o">.</span><span class="n">mse</span><span class="x">(</span><span class="n">yhat</span><span class="x">,</span> <span class="n">y</span><span class="x">)</span>
<span class="k">end</span>
</code></pre></div></div>
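
<p>As a quick sanity check of what the MSE is doing, here is the same calculation written by hand on a few made-up numbers: the squared differences are averaged over all the observations.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using Statistics

yhat = [1.0 2.0 3.0]   # toy predictions, 1×3 like the model output
y    = [1.5 2.0 2.0]   # toy true values

toy_mse = mean(abs2, yhat .- y)
println(toy_mse)   # (0.25 + 0.0 + 1.0) / 3
</code></pre></div></div>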

<p>A neural net has several parameters that we need to optimise using the training data. With each batch of data, we evaluate the loss function and use the gradient of the loss function to push the parameters in the right direction to minimise the loss. The mechanics of moving around the loss function are controlled by the optimiser. In this case, we will use regular gradient descent, but there are many different optimisers out there that Flux provides - <a href="https://fluxml.ai/Flux.jl/stable/reference/training/optimisers/#man-optimisers">Optimiser Reference</a>.</p>

<p>Again, Flux makes this easy to do out of the box without really needing to understand what’s happening behind the scenes. We set up a gradient descent optimiser with <code class="language-plaintext highlighter-rouge">Flux.setup(Descent(eta), flux_model)</code> (where <code class="language-plaintext highlighter-rouge">eta</code>, \(\eta\), is the learning rate) and update the parameters after each batch of data.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">l</span><span class="x">,</span> <span class="n">gs</span> <span class="o">=</span> <span class="n">Flux</span><span class="o">.</span><span class="n">withgradient</span><span class="x">(</span><span class="n">m</span> <span class="o">-&gt;</span> <span class="n">flux_loss</span><span class="x">(</span><span class="n">m</span><span class="x">,</span> <span class="n">x</span><span class="x">,</span> <span class="n">y</span><span class="x">),</span><span class="n">flux_model</span><span class="x">)</span>
<span class="n">Flux</span><span class="o">.</span><span class="n">update!</span><span class="x">(</span><span class="n">opt_state</span><span class="x">,</span> <span class="n">flux_model</span><span class="x">,</span> <span class="n">gs</span><span class="x">[</span><span class="mi">1</span><span class="x">])</span>
</code></pre></div></div>
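
<p>If you are curious what a single update looks like without the library, below is a minimal sketch of one gradient descent step for a toy linear model with hand-derived MSE gradients. Everything here (<code class="language-plaintext highlighter-rouge">descent_step</code>, <code class="language-plaintext highlighter-rouge">toy_loss</code>, the toy data) is made up for illustration and is not how Flux stores or updates its state; the update is simply each parameter minus \(\eta\) times its gradient.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Toy linear model yhat = w' * x + b with hand-derived MSE gradients.
# All names are illustrative; this is not Flux's internal machinery.
function descent_step(w, b, x, y, eta)
    n = size(x, 2)
    resid = w' * x .+ b .- y            # 1×n residuals
    grad_w = 2 .* (x * resid') ./ n     # dL/dw, same shape as w
    grad_b = 2 * sum(resid) / n         # dL/db
    return w .- eta .* grad_w, b - eta * grad_b
end

toy_loss(w, b, x, y) = sum(abs2, w' * x .+ b .- y) / size(x, 2)

x = [1.0 2.0 3.0 4.0; 0.5 -1.0 0.0 2.0]   # 2 features, 4 observations
y = [2.0 3.0 5.0 9.0]
w, b = zeros(2, 1), 0.0

before = toy_loss(w, b, x, y)
w, b = descent_step(w, b, x, y, 0.01)
after = toy_loss(w, b, x, y)
println(after &lt; before)   # one small step should reduce the loss
</code></pre></div></div>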

<p>After all that, we throw everything into one function to easily iterate around the models. We are batch training with gradient descent and returning the trained model plus the loss history on both the full training set and the test set.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">function</span><span class="nf"> train</span><span class="x">(</span><span class="n">train</span><span class="x">,</span> <span class="n">test</span><span class="x">,</span> <span class="n">flux_model</span><span class="x">,</span> <span class="n">flux_loss</span><span class="x">;</span> <span class="n">batchSize</span><span class="o">=</span><span class="mi">1024</span><span class="x">,</span> <span class="n">epochs</span><span class="o">=</span><span class="mi">10</span><span class="x">,</span> <span class="n">eta</span><span class="o">=</span><span class="mf">0.01</span><span class="x">)</span>
    <span class="x">(</span><span class="n">xTrain</span><span class="x">,</span> <span class="n">yTrain</span><span class="x">)</span> <span class="o">=</span> <span class="n">train</span>
    <span class="x">(</span><span class="n">xTest</span><span class="x">,</span> <span class="n">yTest</span><span class="x">)</span> <span class="o">=</span> <span class="n">test</span>
    
    <span class="n">train_loader</span> <span class="o">=</span> <span class="n">Flux</span><span class="o">.</span><span class="n">DataLoader</span><span class="x">((</span><span class="n">xTrain</span><span class="x">,</span> <span class="n">yTrain</span><span class="x">),</span> <span class="n">batchsize</span><span class="o">=</span><span class="n">batchSize</span><span class="x">,</span> <span class="n">shuffle</span><span class="o">=</span><span class="nb">true</span><span class="x">);</span>
    <span class="n">opt_state</span> <span class="o">=</span> <span class="n">Flux</span><span class="o">.</span><span class="n">setup</span><span class="x">(</span><span class="n">Descent</span><span class="x">(</span><span class="n">eta</span><span class="x">),</span> <span class="n">flux_model</span><span class="x">);</span>
        
    <span class="n">allTrainLoss</span> <span class="o">=</span> <span class="n">zeros</span><span class="x">(</span><span class="n">epochs</span><span class="x">)</span>
    <span class="n">allTestLoss</span> <span class="o">=</span> <span class="n">zeros</span><span class="x">(</span><span class="n">epochs</span><span class="x">)</span>
    
    <span class="k">for</span> <span class="n">epoch</span> <span class="k">in</span> <span class="mi">1</span><span class="o">:</span><span class="n">epochs</span>
        <span class="n">loss</span> <span class="o">=</span> <span class="mf">0.0</span>
        <span class="k">for</span> <span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="n">y</span><span class="x">)</span> <span class="k">in</span> <span class="n">train_loader</span>
            <span class="n">l</span><span class="x">,</span> <span class="n">gs</span> <span class="o">=</span> <span class="n">Flux</span><span class="o">.</span><span class="n">withgradient</span><span class="x">(</span><span class="n">m</span> <span class="o">-&gt;</span> <span class="n">flux_loss</span><span class="x">(</span><span class="n">m</span><span class="x">,</span> <span class="n">x</span><span class="x">,</span> <span class="n">y</span><span class="x">),</span> <span class="n">flux_model</span><span class="x">)</span>
            <span class="n">Flux</span><span class="o">.</span><span class="n">update!</span><span class="x">(</span><span class="n">opt_state</span><span class="x">,</span> <span class="n">flux_model</span><span class="x">,</span> <span class="n">gs</span><span class="x">[</span><span class="mi">1</span><span class="x">])</span>
            <span class="n">loss</span> <span class="o">+=</span> <span class="n">l</span> <span class="o">/</span> <span class="n">length</span><span class="x">(</span><span class="n">train_loader</span><span class="x">)</span>
        <span class="k">end</span>
        <span class="n">train_loss</span> <span class="o">=</span> <span class="n">flux_loss</span><span class="x">(</span><span class="n">flux_model</span><span class="x">,</span> <span class="n">xTrain</span><span class="x">,</span> <span class="n">yTrain</span><span class="x">)</span>
        <span class="n">test_loss</span> <span class="o">=</span> <span class="n">flux_loss</span><span class="x">(</span><span class="n">flux_model</span><span class="x">,</span> <span class="n">xTest</span><span class="x">,</span> <span class="n">yTest</span><span class="x">)</span>
        <span class="n">allTrainLoss</span><span class="x">[</span><span class="n">epoch</span><span class="x">]</span> <span class="o">=</span> <span class="n">train_loss</span>
        <span class="n">allTestLoss</span><span class="x">[</span><span class="n">epoch</span><span class="x">]</span> <span class="o">=</span> <span class="n">test_loss</span>
        
    <span class="k">end</span>
    <span class="k">return</span> <span class="x">(</span><span class="n">flux_model</span><span class="x">,</span> <span class="n">allTrainLoss</span><span class="x">,</span> <span class="n">allTestLoss</span><span class="x">)</span>
<span class="k">end</span>
</code></pre></div></div>

<p>We can now train the models, so let’s build some models!</p>

<h3 id="a-1-layer-neural-net">A 1 Layer Neural Net</h3>

<p>The simplest neural net is 1 layer with the features as an input and 1 value as the output. Nothing else!</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">flux_model</span> <span class="o">=</span> <span class="n">Dense</span><span class="x">(</span><span class="n">size</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="mi">1</span><span class="x">),</span> <span class="mi">1</span><span class="x">)</span>
<span class="n">flux_model</span><span class="x">,</span> <span class="n">allTrainLoss</span><span class="x">,</span> <span class="n">allTestLoss</span> <span class="o">=</span> <span class="n">train</span><span class="x">((</span><span class="n">xTrain</span><span class="x">,</span> <span class="n">yTrain</span><span class="x">),</span> <span class="x">(</span><span class="n">xTest</span><span class="x">,</span> <span class="n">yTest</span><span class="x">),</span> <span class="n">flux_model</span><span class="x">,</span> <span class="n">flux_loss</span><span class="x">;</span> <span class="n">epochs</span> <span class="o">=</span> <span class="mi">1000</span><span class="x">,</span> <span class="n">eta</span><span class="o">=</span><span class="mf">1e-6</span><span class="x">);</span>
</code></pre></div></div>

<p><img src="/assets/deeplearning/part1/layer1_traing.png" alt="Line chart showing the training loss over epochs for a one-layer neural network model predicting daily trading volume changes. The chart displays a downward trend, indicating that the model loss decreases as training progresses." width="80%" class="center-image" /></p>

<p>You might notice something strange here: the test loss is smaller than the training loss. This is a quirk of this data set; the test set has a tighter distribution than the training data, which is easy to see in a histogram.</p>

<p><img src="/assets/deeplearning/part1/testtraindist.png" alt="Histogram comparing the distribution of daily trading volume changes for SPY ETF in the training and test datasets. The training set shows a wider spread and more extreme values, while the test set is more tightly clustered around the center. The chart highlights the difference in variability between the two datasets." width="80%" class="center-image" /></p>

<p>Like I said, it’s a quirk of the dataset, but something to bear in mind for the rest of the examples.</p>

<p>Let’s look at the predicted values of this first neural net and how they line up with reality. Plus, we can compare it to the linear model. For the linear model, you just need to run <code class="language-plaintext highlighter-rouge">predict</code> and pass in the test dataset. Similarly, with the neural net, we evaluate the trained model on the testing matrix.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">nnTest</span> <span class="o">=</span> <span class="n">DataFrame</span><span class="x">(</span><span class="n">t</span><span class="o">=</span><span class="n">tsTest</span><span class="x">,</span> <span class="n">delta_vNorm_nn</span> <span class="o">=</span> <span class="n">vec</span><span class="x">(</span><span class="n">flux_model</span><span class="x">(</span><span class="n">xTest</span><span class="x">)</span><span class="err">'</span><span class="x">))</span>
<span class="n">spyTest</span><span class="o">.</span><span class="n">delta_vNorm_lin</span> <span class="o">=</span> <span class="n">predict</span><span class="x">(</span><span class="n">linearModel</span><span class="x">,</span> <span class="n">spyTest</span><span class="x">)</span>
<span class="n">spyTest</span> <span class="o">=</span> <span class="n">leftjoin</span><span class="x">(</span><span class="n">spyTest</span><span class="x">,</span> <span class="n">nnTest</span><span class="x">,</span> <span class="n">on</span> <span class="o">=</span> <span class="o">:</span><span class="n">t</span><span class="x">);</span>
</code></pre></div></div>

<p>As we are predicting the change in the daily volume, we need to add back in the previous value to get our predicted daily volume.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">spyTest</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">spyTest</span><span class="x">,</span> <span class="o">:</span><span class="n">v_nn</span> <span class="o">=</span> <span class="o">:</span><span class="n">prev_vNorm</span> <span class="o">.+</span> <span class="o">:</span><span class="n">delta_vNorm_nn</span><span class="x">,</span> <span class="o">:</span><span class="n">v_lin</span> <span class="o">=</span> <span class="o">:</span><span class="n">prev_vNorm</span> <span class="o">+</span> <span class="o">:</span><span class="n">delta_vNorm_lin</span><span class="x">);</span>
<span class="n">sort!</span><span class="x">(</span><span class="n">spyTest</span><span class="x">,</span> <span class="o">:</span><span class="n">t</span><span class="x">);</span>
</code></pre></div></div>

<p>And then we plot the results.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">p</span> <span class="o">=</span> <span class="n">plot</span><span class="x">(</span><span class="n">spyTest</span><span class="o">.</span><span class="n">t</span><span class="x">,</span> <span class="n">spyTest</span><span class="o">.</span><span class="n">vNorm</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"True"</span><span class="x">,</span>  <span class="n">dpi</span><span class="o">=</span><span class="mi">300</span><span class="x">,</span> <span class="n">background_color</span> <span class="o">=</span> <span class="o">:</span><span class="n">transparent</span><span class="x">)</span>
<span class="n">p</span> <span class="o">=</span> <span class="n">plot!</span><span class="x">(</span><span class="n">p</span><span class="x">,</span> <span class="n">spyTest</span><span class="o">.</span><span class="n">t</span><span class="x">,</span> <span class="n">spyTest</span><span class="o">.</span><span class="n">v_nn</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"NN"</span><span class="x">)</span>
<span class="n">p</span> <span class="o">=</span> <span class="n">plot!</span><span class="x">(</span><span class="n">p</span><span class="x">,</span> <span class="n">spyTest</span><span class="o">.</span><span class="n">t</span><span class="x">,</span> <span class="n">spyTest</span><span class="o">.</span><span class="n">v_lin</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Linear"</span><span class="x">)</span>
<span class="n">p</span>
</code></pre></div></div>

<p><img src="/assets/deeplearning/part1/layer1_results.png" alt="Line chart comparing predicted and actual daily trading volumes for SPY ETF over time. The chart shows three lines: one representing true daily volumes, another representing neural network predictions and another showing the linear model predictions. All the lines follow a similar pattern." width="80%" class="center-image" /></p>

<p>Things line up quite well, nothing outrageous.</p>

<p>In terms of performance, we calculate the MSE from the dataframe.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">@combine</span><span class="x">(</span><span class="n">dropmissing</span><span class="x">(</span><span class="n">spyTest</span><span class="x">),</span> 
          <span class="o">:</span><span class="n">NN</span> <span class="o">=</span> <span class="n">mean</span><span class="x">((</span><span class="o">:</span><span class="n">vNorm</span> <span class="o">.-</span> <span class="o">:</span><span class="n">v_nn</span><span class="x">)</span><span class="o">.^</span><span class="mi">2</span><span class="x">),</span> 
          <span class="o">:</span><span class="n">Lin</span> <span class="o">=</span> <span class="n">mean</span><span class="x">((</span><span class="o">:</span><span class="n">vNorm</span> <span class="o">.-</span> <span class="o">:</span><span class="n">v_lin</span><span class="x">)</span><span class="o">.^</span><span class="mi">2</span><span class="x">))</span>
</code></pre></div></div>

<table>
  <thead>
    <tr>
      <th>NN</th>
      <th>Lin</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>405.55</td>
      <td>370.57</td>
    </tr>
  </tbody>
</table>

<p>The linear model is doing better so far.</p>

<h3 id="2-layer-neural-nets">2 Layer Neural Nets</h3>

<p>We are now in the realm of multi-layer perceptrons (MLPs) and have introduced many more parameters into the model. We can also now build more complicated interactions with each layer.</p>

<p>In Flux, building out more layers is simple; you are chaining different dense layers together. We are choosing to have a fully connected MLP with 2 layers, with all the variables passed through.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">flux_model2</span> <span class="o">=</span> <span class="n">Flux</span><span class="o">.</span><span class="n">Chain</span><span class="x">(</span><span class="n">Dense</span><span class="x">(</span><span class="n">size</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="mi">1</span><span class="x">),</span> <span class="n">size</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="mi">1</span><span class="x">)),</span> <span class="n">Dense</span><span class="x">(</span><span class="n">size</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="mi">1</span><span class="x">),</span> <span class="mi">1</span><span class="x">))</span>

<span class="n">flux_model2</span><span class="x">,</span> <span class="n">allTrainLoss</span><span class="x">,</span> <span class="n">allTestLoss</span> <span class="o">=</span> <span class="n">train</span><span class="x">((</span><span class="n">xTrain</span><span class="x">,</span> <span class="n">yTrain</span><span class="x">),</span> <span class="x">(</span><span class="n">xTest</span><span class="x">,</span> <span class="n">yTest</span><span class="x">),</span> <span class="n">flux_model2</span><span class="x">,</span> <span class="n">flux_loss</span><span class="x">;</span> <span class="n">epochs</span> <span class="o">=</span> <span class="mi">1000</span><span class="x">,</span> <span class="n">eta</span> <span class="o">=</span> <span class="mf">1e-6</span><span class="x">);</span>
</code></pre></div></div>
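
<p>One thing worth keeping in mind: as far as I know, <code class="language-plaintext highlighter-rouge">Dense</code> uses the identity activation unless you pass one in (for example <code class="language-plaintext highlighter-rouge">relu</code> as the third argument), and two layers without an activation collapse into a single linear map. The sketch below checks this with plain matrices rather than Flux; the weights are made-up toy numbers and <code class="language-plaintext highlighter-rouge">relu</code> is defined by hand.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Two affine layers with no activation, written out with plain matrices.
W1, b1 = [1.0 2.0; 0.0 -1.0], [0.5, 1.0]
W2, b2 = [3.0 1.0], [2.0]

x = [1.0, -2.0]

stacked = W2 * (W1 * x .+ b1) .+ b2            # layer 2 applied to layer 1
collapsed = (W2 * W1) * x .+ (W2 * b1 .+ b2)   # one equivalent affine layer

println(isapprox(stacked, collapsed))     # the two agree

# With a non-linearity in between, the collapse no longer works.
relu(z) = max(z, zero(z))
nonlinear = W2 * relu.(W1 * x .+ b1) .+ b2
println(isapprox(nonlinear, collapsed))   # differs for this input
</code></pre></div></div>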

<p>The 2-layer model trains in the same amount of time with the same train/test loss pattern. Again, we assess the MSE of this bigger model.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">nnhTest</span> <span class="o">=</span> <span class="n">DataFrame</span><span class="x">(</span><span class="n">t</span><span class="o">=</span><span class="n">tsTest</span><span class="x">,</span> <span class="n">delta_vNorm_nnh</span> <span class="o">=</span> <span class="n">vec</span><span class="x">(</span><span class="n">flux_model2</span><span class="x">(</span><span class="n">xTest</span><span class="x">)</span><span class="err">'</span><span class="x">))</span>
<span class="n">spyTest</span> <span class="o">=</span> <span class="n">leftjoin</span><span class="x">(</span><span class="n">spyTest</span><span class="x">,</span> <span class="n">nnhTest</span><span class="x">,</span> <span class="n">on</span> <span class="o">=</span> <span class="o">:</span><span class="n">t</span><span class="x">);</span>

<span class="n">spyTest</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">spyTest</span><span class="x">,</span> <span class="o">:</span><span class="n">v_nnh</span> <span class="o">=</span> <span class="o">:</span><span class="n">prev_vNorm</span> <span class="o">.+</span> <span class="o">:</span><span class="n">delta_vNorm_nnh</span><span class="x">)</span>
<span class="nd">@combine</span><span class="x">(</span><span class="n">dropmissing</span><span class="x">(</span><span class="n">spyTest</span><span class="x">),</span> <span class="o">:</span><span class="n">NN</span> <span class="o">=</span> <span class="n">mean</span><span class="x">((</span><span class="o">:</span><span class="n">vNorm</span> <span class="o">.-</span> <span class="o">:</span><span class="n">v_nn</span><span class="x">)</span><span class="o">.^</span><span class="mi">2</span><span class="x">),</span> 
                               <span class="o">:</span><span class="n">Lin</span> <span class="o">=</span> <span class="n">mean</span><span class="x">((</span><span class="o">:</span><span class="n">vNorm</span> <span class="o">.-</span> <span class="o">:</span><span class="n">v_lin</span><span class="x">)</span><span class="o">.^</span><span class="mi">2</span><span class="x">),</span>
                               <span class="o">:</span><span class="n">NNH</span> <span class="o">=</span> <span class="n">mean</span><span class="x">((</span><span class="o">:</span><span class="n">vNorm</span> <span class="o">.-</span> <span class="o">:</span><span class="n">v_nnh</span><span class="x">)</span><span class="o">.^</span><span class="mi">2</span><span class="x">))</span>
</code></pre></div></div>

<table>
  <thead>
    <tr>
      <th>NN</th>
      <th>Lin</th>
      <th>NNH</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>405.55</td>
      <td>370.57</td>
      <td>401.424</td>
    </tr>
  </tbody>
</table>

<p>This has improved on the 1-layer neural net, but it is still no better than the linear model.</p>

<h2 id="neural-net-regularisation">Neural Net Regularisation</h2>

<p>The linear model has 13 parameters, the 1-layer neural net has 42 parameters, and the 2-layer net has 1,764 parameters. This is a rapid growth in complexity which raises the likelihood that the model starts to overfit. How do we make sure the neural net models only pick out the key parameters and regularise themselves?</p>
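
<p>The neural net counts are easy to reproduce by hand. Each dense layer has one weight per input-output pair plus one bias per output, and the feature matrix has 41 rows (the 11 base columns plus the 30 lags). The <code class="language-plaintext highlighter-rouge">dense_params</code> helper is just for illustration.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>dense_params(nIn, nOut) = nIn * nOut + nOut   # weights plus biases

nFeatures = 11 + 30   # 11 base columns plus 30 lags

oneLayer = dense_params(nFeatures, 1)
twoLayer = dense_params(nFeatures, nFeatures) + dense_params(nFeatures, 1)

println((oneLayer, twoLayer))   # (42, 1764)
</code></pre></div></div>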

<p>We have two options: add a penalisation score in the loss function that bounds the total size of the coefficients or introduce something called a dropout layer.</p>

<h3 id="penalising-the-loss-function">Penalising the Loss Function</h3>

<p>You can extend regularisation into neural networks the same way you do linear models. You add an additional term to the loss function that penalises the total combined size of the coefficients.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">function</span><span class="nf"> flux_loss_reg</span><span class="x">(</span><span class="n">flux_model</span><span class="x">,</span> <span class="n">x</span><span class="x">,</span> <span class="n">y</span><span class="x">)</span>
    <span class="n">flux_loss</span><span class="x">(</span><span class="n">flux_model</span><span class="x">,</span> <span class="n">x</span><span class="x">,</span> <span class="n">y</span><span class="x">)</span> <span class="o">+</span> <span class="n">sum</span><span class="x">(</span><span class="n">p</span> <span class="o">-&gt;</span> <span class="n">sum</span><span class="x">(</span><span class="n">abs2</span><span class="x">,</span> <span class="n">p</span><span class="x">),</span> <span class="n">Flux</span><span class="o">.</span><span class="n">trainables</span><span class="x">(</span><span class="n">flux_model</span><span class="x">))</span>
<span class="k">end</span>
</code></pre></div></div>

<p>Therefore, if the model wants to allocate more weight to one parameter, it either has to earn it through a better fit or take some weight from another. This acts as a balancing mechanism and should reduce the chance of overfitting.</p>
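
<p>To make the penalty term concrete, here it is evaluated on a made-up set of parameter arrays (<code class="language-plaintext highlighter-rouge">toyParams</code> is a stand-in for whatever the model’s trainable arrays are): it is just the sum of every squared weight and bias.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Toy stand-in for a model's trainable arrays: one weight matrix, one bias vector.
toyParams = [[1.0 -2.0; 0.5 0.0], [0.1, -0.3]]

penalty = sum(p -&gt; sum(abs2, p), toyParams)
println(penalty)   # 1 + 4 + 0.25 + 0 + 0.01 + 0.09
</code></pre></div></div>

<p>The loss function above adds this penalty with an implicit weight of 1; putting a scaling factor in front of it is a common way to tune how strongly the coefficients are shrunk.</p>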

<p>We use this new loss function with the 2-layer net.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">flux_model</span> <span class="o">=</span> <span class="n">Flux</span><span class="o">.</span><span class="n">Chain</span><span class="x">(</span><span class="n">Dense</span><span class="x">(</span><span class="n">size</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="mi">1</span><span class="x">),</span> <span class="n">size</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="mi">1</span><span class="x">)),</span> <span class="n">Dense</span><span class="x">(</span><span class="n">size</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="mi">1</span><span class="x">),</span> <span class="mi">1</span><span class="x">))</span>
<span class="n">flux_model</span><span class="x">,</span> <span class="n">allTrainLoss</span><span class="x">,</span> <span class="n">allTestLoss</span> <span class="o">=</span> <span class="n">train</span><span class="x">((</span><span class="n">xTrain</span><span class="x">,</span> <span class="n">yTrain</span><span class="x">),</span> <span class="x">(</span><span class="n">xTest</span><span class="x">,</span> <span class="n">yTest</span><span class="x">),</span> <span class="n">flux_model</span><span class="x">,</span> <span class="n">flux_loss_reg</span><span class="x">;</span> <span class="n">epochs</span> <span class="o">=</span> <span class="mi">1000</span><span class="x">,</span> <span class="n">eta</span> <span class="o">=</span> <span class="mf">1e-6</span><span class="x">);</span>
</code></pre></div></div>

<table>
  <thead>
    <tr>
      <th>NN</th>
      <th>Lin</th>
      <th>NNH</th>
      <th>NNHR</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>405.55</td>
      <td>370.57</td>
      <td>401.424</td>
      <td>388.548</td>
    </tr>
  </tbody>
</table>

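<p>So the regularised net (NNHR, 388.5) is slightly better than the unregularised version (NNH, 401.4), although the linear model (370.6) is still ahead.</p>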

<h3 id="neural-net-dropout-layers">Neural Net Dropout Layers</h3>

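<p>An alternative way of regularising a network is to introduce a dropout layer. Dropout randomly sets the output of a node to zero during the training phase, so each training step optimises a thinned version of the network. This stops the net relying too heavily on any single node and reduces the possibility of overfitting. When it comes to inference, all of the nodes are included. In the original paper the weights are then scaled down by the probability of keeping a node; Flux’s <code class="language-plaintext highlighter-rouge">Dropout</code> instead scales the surviving outputs up by \(1/(1-p)\) during training and leaves inference untouched. The original dropout paper is an engaging read - <a href="https://jmlr.org/papers/v15/srivastava14a.html">Dropout: A Simple Way to Prevent Neural Networks from Overfitting</a>.</p>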
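<p>To make the rescaling concrete, here is a minimal sketch of ‘inverted’ dropout in plain Julia, the variant where surviving outputs are scaled up during training so that nothing needs to change at inference. The helper names <code class="language-plaintext highlighter-rouge">dropout_train</code> and <code class="language-plaintext highlighter-rouge">dropout_infer</code> are made up for illustration; this is not Flux’s internal implementation.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using Random

# Inverted dropout on a vector of activations: zero each entry with
# probability p and scale the survivors by 1 / (1 - p).
function dropout_train(rng, x, p)
    keep = rand(rng, length(x)) .&gt;= p
    return x .* keep ./ (1 - p)
end

# At inference nothing is dropped, so the activations pass through unchanged.
dropout_infer(x) = x

rng = MersenneTwister(42)
x = ones(10_000)
dropped = dropout_train(rng, x, 0.5)

# Roughly half the entries are zero, but the average activation stays near 1.
println(count(iszero, dropped) / length(x))
println(sum(dropped) / length(x))
</code></pre></div></div>

<p>Because the survivors are doubled when \(p = 0.5\), the expected output of the layer is the same in training and at inference, which is the whole point of the rescaling.</p>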

<p>Again, very simple to use dropout in Julia and Flux; it is just another type of layer.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">flux_model3</span> <span class="o">=</span> <span class="n">Flux</span><span class="o">.</span><span class="n">Chain</span><span class="x">(</span><span class="n">Dense</span><span class="x">(</span><span class="n">size</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="mi">1</span><span class="x">),</span> <span class="n">size</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="mi">1</span><span class="x">)),</span> <span class="n">Dropout</span><span class="x">(</span><span class="mf">0.5</span><span class="x">),</span> <span class="n">Dense</span><span class="x">(</span><span class="n">size</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="mi">1</span><span class="x">),</span> <span class="mi">1</span><span class="x">))</span>

<span class="n">flux_model3</span><span class="x">,</span> <span class="n">allTrainLoss</span><span class="x">,</span> <span class="n">allTestLoss</span> <span class="o">=</span> <span class="n">train</span><span class="x">((</span><span class="n">xTrain</span><span class="x">,</span> <span class="n">yTrain</span><span class="x">),</span> <span class="x">(</span><span class="n">xTest</span><span class="x">,</span> <span class="n">yTest</span><span class="x">),</span> <span class="n">flux_model3</span><span class="x">,</span> <span class="n">flux_loss</span><span class="x">;</span> <span class="n">epochs</span> <span class="o">=</span> <span class="mi">250</span><span class="x">,</span> <span class="n">eta</span> <span class="o">=</span> <span class="mf">1e-6</span><span class="x">);</span>
</code></pre></div></div>

<p>For the final time, let’s evaluate this model on the test set and calculate the MSE.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">nndTest</span> <span class="o">=</span> <span class="n">DataFrame</span><span class="x">(</span><span class="n">t</span><span class="o">=</span><span class="n">tsTest</span><span class="x">,</span> <span class="n">delta_vNorm_nnd</span> <span class="o">=</span> <span class="n">vec</span><span class="x">(</span><span class="n">flux_model3</span><span class="x">(</span><span class="n">xTest</span><span class="x">)</span><span class="err">'</span><span class="x">))</span>
<span class="n">spyTest</span> <span class="o">=</span> <span class="n">leftjoin</span><span class="x">(</span><span class="n">spyTest</span><span class="x">,</span> <span class="n">nndTest</span><span class="x">,</span> <span class="n">on</span> <span class="o">=</span> <span class="o">:</span><span class="n">t</span><span class="x">);</span>

<span class="n">spyTest</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">spyTest</span><span class="x">,</span> <span class="o">:</span><span class="n">v_nnd</span> <span class="o">=</span> <span class="o">:</span><span class="n">prev_vNorm</span> <span class="o">.+</span> <span class="o">:</span><span class="n">delta_vNorm_nnd</span><span class="x">)</span>
<span class="nd">@combine</span><span class="x">(</span><span class="n">dropmissing</span><span class="x">(</span><span class="n">spyTest</span><span class="x">),</span> <span class="o">:</span><span class="n">NN</span> <span class="o">=</span> <span class="n">mean</span><span class="x">((</span><span class="o">:</span><span class="n">vNorm</span> <span class="o">.-</span> <span class="o">:</span><span class="n">v_nn</span><span class="x">)</span><span class="o">.^</span><span class="mi">2</span><span class="x">),</span> 
                               <span class="o">:</span><span class="n">Lin</span> <span class="o">=</span> <span class="n">mean</span><span class="x">((</span><span class="o">:</span><span class="n">vNorm</span> <span class="o">.-</span> <span class="o">:</span><span class="n">v_lin</span><span class="x">)</span><span class="o">.^</span><span class="mi">2</span><span class="x">),</span>
                               <span class="o">:</span><span class="n">NNH</span> <span class="o">=</span> <span class="n">mean</span><span class="x">((</span><span class="o">:</span><span class="n">vNorm</span> <span class="o">.-</span> <span class="o">:</span><span class="n">v_nnh</span><span class="x">)</span><span class="o">.^</span><span class="mi">2</span><span class="x">),</span>
                               <span class="o">:</span><span class="n">NND</span> <span class="o">=</span> <span class="n">mean</span><span class="x">((</span><span class="o">:</span><span class="n">vNorm</span> <span class="o">.-</span> <span class="o">:</span><span class="n">v_nnd</span><span class="x">)</span><span class="o">.^</span><span class="mi">2</span><span class="x">))</span>
</code></pre></div></div>

<table>
  <thead>
    <tr>
      <th>NN</th>
      <th>Lin</th>
      <th>NNH</th>
      <th>NNHR</th>
      <th>NNHD</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>405.55</td>
      <td>370.57</td>
      <td>401.424</td>
      <td>388.548</td>
      <td>411.105</td>
    </tr>
  </tbody>
</table>

<p>The worst model so far!</p>

<h2 id="conclusion">Conclusion</h2>

<p>So the linear model is still winning. The neural net and various iterations haven’t improved on this simple model, and the best neural net was the 2-layer with regularisation.</p>

<p>It must be noted that this problem isn’t exactly hard, and the amount of data is relatively small, so it is unsurprising that the added complexity of the neural nets hasn’t added anything. It’s hardly a ‘deep learning’ problem!</p>

<p>I’ve also not gone crazy with the neural net optimisations. You can include more layers, change the number of nodes in the layers, change the activation functions, and change the loss function - all sorts of things that could be tweaked to improve the model.</p>

<p>Hopefully I’ve not just added to the slop of neural net finance tutorials and you’ve found something useful. Unfortunately, the neural nets haven’t beaten the linear model, which shows you can’t just jump into the fancy tools without looking at the simpler models.</p>

<h2 id="other-juliafinance-posts">Other Julia/Finance Posts</h2>

<p>For more quant finance tutorials check out some of my older posts.</p>

<ul>
  <li><a href="https://dm13450.github.io/2025/03/14/Fitting-Price-Impact-Models.html">Fitting Price Impact</a></li>
  <li><a href="https://dm13450.github.io/2024/02/08/Cross-Asset-Skew-A-Trading-Strategy.html">Cross Asset Skew - A Trading Strategy</a></li>
  <li><a href="https://dm13450.github.io/2023/07/15/Stat-Arb-Walkthrough.html">Stat Arb - An Easy Walkthrough</a></li>
</ul>]]></content><author><name>Dean Markwick</name></author><category term="julia" /><summary type="html"><![CDATA[I’m fortunate enough to be participating in a lecture series at work that covers deep learning and its applications in finance. This will be a series of posts documenting what I learn and implementing the ‘homework’ (I’m 32, how am I still getting homework?) using Julia and Flux.]]></summary></entry><entry><title type="html">Cyclical Embedding</title><link href="https://dm13450.github.io/2025/06/16/Cyclical-Embedding.html" rel="alternate" type="text/html" title="Cyclical Embedding" /><published>2025-06-16T00:00:00+00:00</published><updated>2025-06-16T00:00:00+00:00</updated><id>https://dm13450.github.io/2025/06/16/Cyclical-Embedding</id><content type="html" xml:base="https://dm13450.github.io/2025/06/16/Cyclical-Embedding.html"><![CDATA[<p>Cyclical embedding (or encoding) is a basic transformation for numerical variables that follow a cycle. Let’s explore how it works.</p>

<p>I am currently attending a Deep Learning in Finance lecture series (lectured by Stefan Zohren in preparation for his new book). The ongoing homework is taking a basic time series model and applying the various deep learning techniques. In the process of doing this homework, I’ve come across Cyclical Embeddings and how they are used to transform variables that move in a cycle into something a model can understand.</p>

<p>Consider this blog post me reading this Kaggle notebook: <a href="https://www.kaggle.com/code/avanwyk/encoding-cyclical-features-for-deep-learning">Encoding Cyclical Features for Deep Learning</a>, converting it to Julia and using some examples to convince myself Cyclical Embeddings work and are useful.</p>


<p>Cyclical variables are especially pertinent in finance. Take the day of the week: in a model you could use either a factor (the label directly) or a number (Mon=1, Tue=2, etc.). With a factor, your model gains 5 additional parameters. With a number, you have to specify the form of the relationship (linear or using a GAM). Each has its ups and downs, but a key piece of information is missing from both: the days of the week form a cycle where 1 follows 5. How can we translate this into something the model will understand?</p>

<p>As the name suggests, cyclical embeddings lead to a cycle, and the natural choices are the trigonometric functions sin and cos. We take the one-dimensional variable and transform it into two dimensions:</p>

\[\begin{align*}
x &amp; = \sin \left( \frac{2 \pi t}{\text{max} (t)} \right), \\
y &amp; = \cos \left( \frac{2 \pi t}{\text{max} (t)} \right).
\end{align*}\]

<p>If we apply this transformation to our day of the week we go from \(t \in [0, 4]\) to a circle in \(x\) and \(y\).</p>

<p><img src="/assets/CyclicalEmbedding/example.png" alt="A two-dimensional plot showing the cyclical embedding of days of the week, where each day is represented as a point on a circle using sine and cosine transformations. The points form a closed loop, visually demonstrating the cyclical nature of the days." width="80%" class="center-image" /></p>

<p>I am reminded of polar coordinates, and we can now see that Monday is the same distance from Friday as it is from Tuesday. 
Crucially, the new variables are nicely bounded between -1 and 1, which is always helpful when building models. 
All in, this looks like a sensible transformation; now to see if it makes a noticeable difference in modelling performance.</p>
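<p>A quick way to sanity-check the geometry is to compute the distances directly. The sketch below is illustrative only (the helpers <code class="language-plaintext highlighter-rouge">encode</code> and <code class="language-plaintext highlighter-rouge">dist</code> are hypothetical names, not from the Kaggle notebook) and compares two divisors for a zero-based day of the week: the maximum value, 4, as in the formula, and the number of distinct days, 5.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Map a value t onto the unit circle for a given divisor.
encode(t, divisor) = (sin(2pi * t / divisor), cos(2pi * t / divisor))

# Euclidean distance between two encoded points.
dist(a, b) = hypot(a[1] - b[1], a[2] - b[2])

days = 0:4  # Mon = 0, ..., Fri = 4

# Divisor max(t) = 4: Friday lands on (almost exactly) the same point as Monday.
byMax = [encode(t, maximum(days)) for t in days]
println(dist(byMax[1], byMax[5]))

# Divisor 5 (the number of distinct days): Friday and Tuesday are both
# one step away from Monday.
byCount = [encode(t, length(days)) for t in days]
println(dist(byCount[1], byCount[5]), " ", dist(byCount[1], byCount[2]))
</code></pre></div></div>

<p>With a zero-based variable, dividing by the maximum wraps the last value onto the first, so Monday and Friday become indistinguishable. Dividing by the number of distinct values keeps all five days separate and equally spaced, so it is worth checking which divisor you actually want before fitting a model.</p>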

<h2 id="practical-cyclical-embeddings---daily-volumes">Practical Cyclical Embeddings - Daily Volumes</h2>

<p>Let’s model the daily trading volume of a stock. It feels logical that the day of the week (Mon-Fri), day of the month (1-31) and month (1-12) would affect the amount traded. The summer months might be quieter, the end of the month might be busier (month-end rebalancing) and Fridays might be quieter. All three of these time variables are cyclical so the cyclical embeddings should help.</p>

<p>We have 3 separate choices:</p>

<ol>
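  <li>Everything as a number (3 free parameters plus an intercept)</li>
  <li>Day of the week and month as factors, day of the month as a number (5 + 11 + 1 = 17 free parameters, as one month acts as the baseline)</li>
  <li>Cyclically embed the three variables (3x2=6 parameters plus an intercept)</li>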
</ol>

<p>So a balance between the number of parameters and the flexibility of the model.</p>

<p>We will use a simple linear model, nothing fancy.</p>

<p>As always we will be in Julia.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">using</span> <span class="n">Dates</span><span class="x">,</span> <span class="n">AlpacaMarkets</span><span class="x">,</span> <span class="n">Plots</span><span class="x">,</span> <span class="n">StatsBase</span><span class="x">,</span> <span class="n">GLM</span>
<span class="k">using</span> <span class="n">DataFramesMeta</span><span class="x">,</span> <span class="n">CategoricalArrays</span><span class="x">,</span> <span class="n">ShiftedArrays</span>
</code></pre></div></div>

<p>To load the data in we will use my <a href="https://github.com/dm13450/AlpacaMarkets.jl">AlpacaMarkets.jl</a> API and pull in as much daily data as possible.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">aaplRaw</span><span class="x">,</span> <span class="n">npt</span> <span class="o">=</span> <span class="n">AlpacaMarkets</span><span class="o">.</span><span class="n">stock_bars</span><span class="x">(</span><span class="s">"AAPL"</span><span class="x">,</span> <span class="s">"1Day"</span><span class="x">;</span> <span class="n">startTime</span><span class="o">=</span><span class="kt">Date</span><span class="x">(</span><span class="s">"2000-01-01"</span><span class="x">),</span> <span class="n">endTime</span> <span class="o">=</span> <span class="n">today</span><span class="x">()</span> <span class="o">-</span> <span class="kt">Day</span><span class="x">(</span><span class="mi">2</span><span class="x">),</span> <span class="n">adjustment</span> <span class="o">=</span> <span class="s">"all"</span><span class="x">,</span> <span class="n">limit</span> <span class="o">=</span> <span class="mi">10000</span><span class="x">)</span>
</code></pre></div></div>

<p>Some basic cleaning and formatting.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">aapl</span> <span class="o">=</span> <span class="n">aaplRaw</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="x">[</span><span class="o">:</span><span class="n">t</span><span class="x">,</span> <span class="o">:</span><span class="n">v</span><span class="x">]]</span>
<span class="n">aapl</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"t"</span><span class="x">]</span> <span class="o">=</span> <span class="kt">DateTime</span><span class="o">.</span><span class="x">(</span><span class="n">chop</span><span class="o">.</span><span class="x">(</span><span class="n">aapl</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"t"</span><span class="x">]))</span>
</code></pre></div></div>

<p>Julia makes it easy to add the factor variables and the numeric versions. As the numeric values all start at 1 we subtract one so they begin at 0.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">aapl</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">DayName</span><span class="x">]</span> <span class="o">=</span> <span class="n">CategoricalArray</span><span class="x">(</span><span class="n">dayname</span><span class="o">.</span><span class="x">(</span><span class="n">aapl</span><span class="o">.</span><span class="n">t</span><span class="x">))</span>
<span class="n">aapl</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">MonthName</span><span class="x">]</span> <span class="o">=</span> <span class="n">CategoricalArray</span><span class="x">(</span><span class="n">monthname</span><span class="o">.</span><span class="x">(</span><span class="n">aapl</span><span class="o">.</span><span class="n">t</span><span class="x">))</span>

<span class="n">aapl</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">DayOfMonth</span><span class="x">]</span> <span class="o">=</span> <span class="n">dayofmonth</span><span class="o">.</span><span class="x">(</span><span class="n">aapl</span><span class="o">.</span><span class="n">t</span><span class="x">)</span> <span class="o">.-</span> <span class="mi">1</span>
<span class="n">aapl</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">DayOfWeek</span><span class="x">]</span> <span class="o">=</span> <span class="n">dayofweek</span><span class="o">.</span><span class="x">(</span><span class="n">aapl</span><span class="o">.</span><span class="n">t</span><span class="x">)</span> <span class="o">.-</span> <span class="mi">1</span>
<span class="n">aapl</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">MonthOfYear</span><span class="x">]</span> <span class="o">=</span> <span class="n">month</span><span class="o">.</span><span class="x">(</span><span class="n">aapl</span><span class="o">.</span><span class="n">t</span><span class="x">)</span> <span class="o">.-</span> <span class="mi">1</span><span class="x">;</span>
</code></pre></div></div>

<p>We normalise the volume to millions of shares and take the difference.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">aapl</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">vNorm</span><span class="x">]</span> <span class="o">=</span> <span class="n">aapl</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">v</span><span class="x">]</span> <span class="o">.*</span> <span class="mf">1e-6</span><span class="x">;</span>
<span class="n">aapl</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">delta_vNorm</span><span class="x">]</span> <span class="o">=</span> <span class="n">aapl</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">vNorm</span><span class="x">]</span> <span class="o">.-</span> <span class="n">ShiftedArrays</span><span class="o">.</span><span class="n">lag</span><span class="x">(</span><span class="n">aapl</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">vNorm</span><span class="x">]);</span>
</code></pre></div></div>

<p>The regular volumes (<code class="language-plaintext highlighter-rouge">vNorm</code>) aren’t stationary: we can see a clear trend that changes over time. It’s therefore better to model the difference in volumes each day.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">plot</span><span class="x">(</span><span class="n">plot</span><span class="x">(</span><span class="n">aapl</span><span class="o">.</span><span class="n">t</span><span class="x">,</span> <span class="n">aapl</span><span class="o">.</span><span class="n">vNorm</span><span class="x">,</span> <span class="n">title</span> <span class="o">=</span> <span class="s">"Volume"</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="o">:</span><span class="n">none</span><span class="x">),</span> 
     <span class="n">plot</span><span class="x">(</span><span class="n">aapl</span><span class="o">.</span><span class="n">t</span><span class="x">,</span> <span class="n">aapl</span><span class="o">.</span><span class="n">delta_vNorm</span><span class="x">,</span> <span class="n">title</span> <span class="o">=</span> <span class="s">"Volume Difference"</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="o">:</span><span class="n">none</span><span class="x">),</span> <span class="n">layout</span><span class="o">=</span><span class="x">(</span><span class="mi">2</span><span class="x">,</span><span class="mi">1</span><span class="x">))</span>
</code></pre></div></div>

<p><img src="/assets/CyclicalEmbedding/volumes.png" alt="Two line plots showing daily trading volumes for AAPL over time. The first plot displays significant fluctuations and trends, with periods of higher and lower trading activity. The second plot is the difference in trading volumes between the days and doesn't have a trend." width="80%" class="center-image" /></p>

<p>To apply the cyclical encoding we need to take one column and turn it into two.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">function</span><span class="nf"> cyclical_encode</span><span class="x">(</span><span class="n">df</span><span class="x">,</span> <span class="n">col</span><span class="x">,</span> <span class="n">max</span><span class="x">)</span>
    <span class="n">df</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="kt">Symbol</span><span class="x">(</span><span class="s">"</span><span class="si">$(col)</span><span class="s">_sin"</span><span class="x">)]</span> <span class="o">=</span> <span class="n">sin</span><span class="o">.</span><span class="x">(</span><span class="mi">2</span> <span class="o">.*</span> <span class="nb">pi</span> <span class="o">.*</span> <span class="n">df</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="kt">Symbol</span><span class="x">(</span><span class="n">col</span><span class="x">)]</span><span class="o">/</span><span class="n">max</span><span class="x">)</span>
    <span class="n">df</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="kt">Symbol</span><span class="x">(</span><span class="s">"</span><span class="si">$(col)</span><span class="s">_cos"</span><span class="x">)]</span> <span class="o">=</span> <span class="n">cos</span><span class="o">.</span><span class="x">(</span><span class="mi">2</span> <span class="o">.*</span> <span class="nb">pi</span> <span class="o">.*</span> <span class="n">df</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="kt">Symbol</span><span class="x">(</span><span class="n">col</span><span class="x">)]</span><span class="o">/</span><span class="n">max</span><span class="x">)</span>
    <span class="n">df</span>
<span class="k">end</span>

<span class="k">for</span> <span class="n">col</span> <span class="k">in</span> <span class="x">[</span><span class="s">"DayOfWeek"</span><span class="x">,</span> <span class="s">"DayOfMonth"</span><span class="x">,</span> <span class="s">"MonthOfYear"</span><span class="x">]</span>
    <span class="n">aapl</span> <span class="o">=</span> <span class="n">cyclical_encode</span><span class="x">(</span><span class="n">aapl</span><span class="x">,</span> <span class="n">col</span><span class="x">,</span> <span class="n">maximum</span><span class="x">(</span><span class="n">aapl</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="n">col</span><span class="x">]))</span>
<span class="k">end</span>
</code></pre></div></div>

<p>If you’ve not seen it before, the <code class="language-plaintext highlighter-rouge">$</code> is Julia’s string interpolation: like Python f-strings, it lets you use a variable in the string.</p>
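<p>As a small standalone illustration of how the column names in <code class="language-plaintext highlighter-rouge">cyclical_encode</code> are built (the variable <code class="language-plaintext highlighter-rouge">sinName</code> is just for this example):</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>col = "DayOfWeek"

# "$(col)_sin" splices the value of col into the string before Symbol is applied.
sinName = Symbol("$(col)_sin")
println(sinName)
</code></pre></div></div>

<p>The parentheses around <code class="language-plaintext highlighter-rouge">col</code> matter here: without them Julia would look for a variable called <code class="language-plaintext highlighter-rouge">col_sin</code>.</p>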

<p>We do the normal test/train split.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">aaplTrain</span> <span class="o">=</span> <span class="n">aapl</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="mi">2000</span><span class="x">,</span><span class="o">:</span><span class="x">]</span>
<span class="n">aaplTest</span> <span class="o">=</span> <span class="n">aapl</span><span class="x">[</span><span class="mi">2001</span><span class="o">:</span><span class="k">end</span><span class="x">,</span><span class="o">:</span><span class="x">];</span>
</code></pre></div></div>

<p>Now to build the three models.</p>

<p>The numerical model takes in the numbers directly.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">numModel</span> <span class="o">=</span> <span class="n">lm</span><span class="x">(</span><span class="nd">@formula</span><span class="x">(</span><span class="n">delta_vNorm</span> <span class="o">~</span> <span class="n">DayOfWeek</span> <span class="o">+</span> <span class="n">MonthOfYear</span> <span class="o">+</span> <span class="n">DayOfMonth</span><span class="x">),</span> <span class="n">aaplTrain</span><span class="x">)</span>
</code></pre></div></div>

<p>The factor model represents the day of the week and the month of the year as categories so each level gets a separate parameter, while the day of the month stays numeric.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">factorModel</span> <span class="o">=</span> <span class="n">lm</span><span class="x">(</span><span class="nd">@formula</span><span class="x">(</span><span class="n">delta_vNorm</span> <span class="o">~</span> <span class="n">DayName</span> <span class="o">+</span> <span class="n">MonthName</span> <span class="o">+</span> <span class="n">DayOfMonth</span> <span class="o">+</span> <span class="mi">0</span><span class="x">),</span> <span class="n">aaplTrain</span><span class="x">)</span>
</code></pre></div></div>

<p>The embedding model takes in the sin/cos transformation of each of the variables.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">embeddingModel</span> <span class="o">=</span> <span class="n">lm</span><span class="x">(</span><span class="nd">@formula</span><span class="x">(</span><span class="n">delta_vNorm</span> <span class="o">~</span> <span class="n">DayOfWeek_sin</span> <span class="o">+</span> <span class="n">DayOfWeek_cos</span> <span class="o">+</span> <span class="n">DayOfMonth_sin</span> <span class="o">+</span> <span class="n">DayOfMonth_cos</span> <span class="o">+</span> <span class="n">MonthOfYear_sin</span> <span class="o">+</span> <span class="n">MonthOfYear_cos</span><span class="x">),</span> <span class="n">aaplTrain</span><span class="x">);</span>
</code></pre></div></div>

<p>To assess how well the models perform we look at the RMSE (in sample and out of sample), AIC (in sample) and \(R^2\) (in sample and out of sample).</p>
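<p>The code that produces the table isn’t shown, so the following is only a sketch of the usual definitions of RMSE and \(R^2\), not necessarily the exact calculation used. The function names <code class="language-plaintext highlighter-rouge">rmse</code> and <code class="language-plaintext highlighter-rouge">r2</code> and the toy vectors are assumptions for illustration.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using Statistics

# Root mean squared error between observations y and predictions yhat.
rmse(y, yhat) = sqrt(mean(abs2, y .- yhat))

# R^2 relative to predicting the mean of y; it can go negative out of sample.
r2(y, yhat) = 1 - sum(abs2, y .- yhat) / sum(abs2, y .- mean(y))

# Toy data purely for illustration, not the AAPL volumes.
y = [1.0, 2.0, 3.0, 4.0]
yhat = [1.5, 1.5, 3.5, 3.5]

println(rmse(y, yhat))
println(r2(y, yhat))
</code></pre></div></div>

<p>For the out-of-sample columns the same functions would be applied to the test-set observations and the model’s predictions on the test set, after dropping any rows with missing values.</p>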

<table>
  <thead>
    <tr>
      <th>Model</th>
      <th>NumCoefs</th>
      <th>RMSE</th>
      <th>RMSEOOS</th>
      <th>AIC</th>
      <th>R2</th>
      <th>R2OOS</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Numeric</td>
      <td>4</td>
      <td>31.1041</td>
      <td>50.2975</td>
      <td>21346.9</td>
      <td>0.0336539</td>
      <td>0.0396665</td>
    </tr>
    <tr>
      <td>Factor</td>
      <td>17</td>
      <td>31.2978</td>
      <td>50.0453</td>
      <td>21352.8</td>
      <td>0.0433269</td>
      <td>0.0276647</td>
    </tr>
    <tr>
      <td>Embedding</td>
      <td>7</td>
      <td>31.7484</td>
      <td>51.1591</td>
      <td>21420.8</td>
      <td>0.0002655</td>
      <td>-0.000531</td>
    </tr>
  </tbody>
</table>

<p>Interestingly, the embedding model performs the worst both in sample and out of sample.</p>

<p>When we pull out the Day of the Week effect it’s easy to see what the model has learnt.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">params</span> <span class="o">=</span> <span class="kt">Dict</span><span class="x">(</span><span class="n">zip</span><span class="x">(</span><span class="n">coefnames</span><span class="x">(</span><span class="n">embeddingModel</span><span class="x">),</span> <span class="n">coef</span><span class="x">(</span><span class="n">embeddingModel</span><span class="x">)))</span>

<span class="n">x</span> <span class="o">=</span> <span class="mi">0</span><span class="o">:</span><span class="mf">0.1</span><span class="o">:</span><span class="mi">4</span>
<span class="n">ySin</span> <span class="o">=</span> <span class="n">params</span><span class="x">[</span><span class="s">"DayOfWeek_sin"</span><span class="x">]</span> <span class="o">*</span> <span class="n">sin</span><span class="o">.</span><span class="x">(</span><span class="mi">2</span> <span class="o">.*</span> <span class="nb">pi</span> <span class="o">.*</span> <span class="n">x</span> <span class="o">./</span> <span class="n">maximum</span><span class="x">(</span><span class="n">x</span><span class="x">))</span>
<span class="n">yCos</span> <span class="o">=</span> <span class="n">params</span><span class="x">[</span><span class="s">"DayOfWeek_cos"</span><span class="x">]</span> <span class="o">*</span> <span class="n">cos</span><span class="o">.</span><span class="x">(</span><span class="mi">2</span> <span class="o">.*</span> <span class="nb">pi</span> <span class="o">.*</span> <span class="n">x</span> <span class="o">./</span> <span class="n">maximum</span><span class="x">(</span><span class="n">x</span><span class="x">))</span>


<span class="n">p</span> <span class="o">=</span> <span class="n">plot</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="n">ySin</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Sin"</span><span class="x">)</span>
<span class="n">plot!</span><span class="x">(</span><span class="n">p</span><span class="x">,</span> <span class="n">x</span><span class="x">,</span> <span class="n">yCos</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Cos"</span><span class="x">)</span>
<span class="n">plot!</span><span class="x">(</span><span class="n">p</span><span class="x">,</span> <span class="n">x</span><span class="x">,</span> <span class="n">yCos</span> <span class="o">.+</span> <span class="n">ySin</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Combined"</span><span class="x">)</span>
</code></pre></div></div>

<p><img src="/assets/CyclicalEmbedding/dayofweek.png" alt="Circular plot illustrating the cyclical embedding of days of the week effect from the model." width="80%" class="center-image" /></p>

<p>This indicates the lower volume changes are on Tuesday and the higher volume changes are on Thursday.</p>

<p>Based on the model performance it’s not a great showing for the embedding transformation. Let’s move on to another example where the cyclical nature might be more obvious.</p>

<h2 id="practical-cyclical-embeddings---intraday-volumes">Practical Cyclical Embeddings - Intraday Volumes</h2>

<p>Another example would be the flow of trades over the day. In this case, the hour is the variable we will cyclically embed. For this, we use hourly BTC/USD bars from AlpacaMarkets.jl and aggregate their volume by hour of the day.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">btcRaw</span><span class="x">,</span> <span class="n">token</span> <span class="o">=</span> <span class="n">AlpacaMarkets</span><span class="o">.</span><span class="n">crypto_bars</span><span class="x">(</span><span class="s">"BTC/USD"</span><span class="x">,</span> <span class="s">"1H"</span><span class="x">;</span> <span class="n">startTime</span><span class="o">=</span><span class="kt">Date</span><span class="x">(</span><span class="s">"2025-01-01"</span><span class="x">),</span> <span class="n">limit</span> <span class="o">=</span> <span class="mi">10000</span><span class="x">)</span>

<span class="n">res</span> <span class="o">=</span> <span class="x">[</span><span class="n">btcRaw</span><span class="x">]</span>
<span class="k">while</span> <span class="o">!</span><span class="x">(</span><span class="n">isnothing</span><span class="x">(</span><span class="n">token</span><span class="x">)</span> <span class="o">||</span> <span class="n">isempty</span><span class="x">(</span><span class="n">token</span><span class="x">))</span>
    <span class="n">println</span><span class="x">(</span><span class="n">token</span><span class="x">)</span>
    <span class="n">newtrades</span><span class="x">,</span> <span class="n">token</span> <span class="o">=</span> <span class="n">AlpacaMarkets</span><span class="o">.</span><span class="n">crypto_bars</span><span class="x">(</span><span class="s">"BTC/USD"</span><span class="x">,</span> <span class="s">"1H"</span><span class="x">;</span> <span class="n">startTime</span><span class="o">=</span><span class="kt">Date</span><span class="x">(</span><span class="s">"2025-01-01"</span><span class="x">),</span> <span class="n">limit</span> <span class="o">=</span> <span class="mi">10000</span><span class="x">,</span> <span class="n">page_token</span> <span class="o">=</span> <span class="n">token</span><span class="x">)</span>
    <span class="n">println</span><span class="x">((</span><span class="n">minimum</span><span class="x">(</span><span class="n">newtrades</span><span class="o">.</span><span class="n">t</span><span class="x">),</span> <span class="n">maximum</span><span class="x">(</span><span class="n">newtrades</span><span class="o">.</span><span class="n">t</span><span class="x">)))</span>
    <span class="n">append!</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="x">[</span><span class="n">newtrades</span><span class="x">])</span>
    <span class="n">sleep</span><span class="x">(</span><span class="n">AlpacaMarkets</span><span class="o">.</span><span class="n">SLEEP_TIME</span><span class="x">[])</span>
<span class="k">end</span>
<span class="n">res</span> <span class="o">=</span> <span class="n">vcat</span><span class="x">(</span><span class="n">res</span><span class="o">...</span><span class="x">);</span>
</code></pre></div></div>

<p>Sidenote, I do need to wrap this functionality into the package itself.</p>

<p>We get the raw data into a suitable state.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">btc</span> <span class="o">=</span> <span class="n">res</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="x">[</span><span class="o">:</span><span class="n">t</span><span class="x">,</span> <span class="o">:</span><span class="n">v</span><span class="x">]]</span>
<span class="n">btc</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"t"</span><span class="x">]</span> <span class="o">=</span> <span class="kt">DateTime</span><span class="o">.</span><span class="x">(</span><span class="n">chop</span><span class="o">.</span><span class="x">(</span><span class="n">btc</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"t"</span><span class="x">]));</span>

<span class="n">btc</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">btc</span><span class="x">,</span> <span class="o">:</span><span class="kt">Date</span> <span class="o">=</span> <span class="kt">Date</span><span class="o">.</span><span class="x">(</span><span class="o">:</span><span class="n">t</span><span class="x">),</span> <span class="o">:</span><span class="kt">Time</span> <span class="o">=</span> <span class="kt">Time</span><span class="o">.</span><span class="x">(</span><span class="o">:</span><span class="n">t</span><span class="x">),</span> <span class="o">:</span><span class="n">DayOfWeek</span> <span class="o">=</span> <span class="n">dayofweek</span><span class="o">.</span><span class="x">(</span><span class="o">:</span><span class="n">t</span><span class="x">),</span> <span class="o">:</span><span class="kt">Hour</span> <span class="o">=</span> <span class="n">hour</span><span class="o">.</span><span class="x">(</span><span class="o">:</span><span class="n">t</span><span class="x">))</span>
<span class="n">trainDates</span> <span class="o">=</span> <span class="n">unique</span><span class="x">(</span><span class="n">btc</span><span class="o">.</span><span class="kt">Date</span><span class="x">)[</span><span class="mi">1</span><span class="o">:</span><span class="mi">140</span><span class="x">]</span>
<span class="n">testDates</span> <span class="o">=</span> <span class="n">setdiff</span><span class="x">(</span><span class="n">unique</span><span class="x">(</span><span class="n">btc</span><span class="o">.</span><span class="kt">Date</span><span class="x">),</span> <span class="n">trainDates</span><span class="x">)</span>

<span class="n">trainDataRaw</span> <span class="o">=</span> <span class="n">btc</span><span class="x">[</span><span class="n">findall</span><span class="x">(</span><span class="k">in</span><span class="x">(</span><span class="n">trainDates</span><span class="x">),</span> <span class="n">btc</span><span class="o">.</span><span class="kt">Date</span><span class="x">),</span> <span class="o">:</span><span class="x">];</span>
<span class="n">testDataRaw</span> <span class="o">=</span> <span class="n">btc</span><span class="x">[</span><span class="n">findall</span><span class="x">(</span><span class="k">in</span><span class="x">(</span><span class="n">testDates</span><span class="x">),</span> <span class="n">btc</span><span class="o">.</span><span class="kt">Date</span><span class="x">),</span> <span class="o">:</span><span class="x">];</span>

<span class="n">trainData</span> <span class="o">=</span> <span class="nd">@combine</span><span class="x">(</span><span class="n">groupby</span><span class="x">(</span><span class="n">trainDataRaw</span><span class="x">,</span> <span class="x">[</span><span class="o">:</span><span class="kt">Hour</span><span class="x">]),</span> <span class="o">:</span><span class="n">v</span> <span class="o">=</span> <span class="n">sum</span><span class="x">(</span><span class="o">:</span><span class="n">v</span><span class="x">))</span>
<span class="n">trainData</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">trainData</span><span class="x">,</span> <span class="o">:</span><span class="n">total_v</span> <span class="o">=</span> <span class="n">sum</span><span class="x">(</span><span class="o">:</span><span class="n">v</span><span class="x">),</span> <span class="o">:</span><span class="n">frac</span> <span class="o">=</span> <span class="o">:</span><span class="n">v</span><span class="o">./</span><span class="n">sum</span><span class="x">(</span><span class="o">:</span><span class="n">v</span><span class="x">))</span>

<span class="n">testData</span> <span class="o">=</span> <span class="nd">@combine</span><span class="x">(</span><span class="n">groupby</span><span class="x">(</span><span class="n">testDataRaw</span><span class="x">,</span> <span class="x">[</span><span class="o">:</span><span class="kt">Hour</span><span class="x">]),</span> <span class="o">:</span><span class="n">v</span> <span class="o">=</span> <span class="n">sum</span><span class="x">(</span><span class="o">:</span><span class="n">v</span><span class="x">))</span>
<span class="n">testData</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">testData</span><span class="x">,</span> <span class="o">:</span><span class="n">total_v</span> <span class="o">=</span> <span class="n">sum</span><span class="x">(</span><span class="o">:</span><span class="n">v</span><span class="x">),</span> <span class="o">:</span><span class="n">frac</span> <span class="o">=</span> <span class="o">:</span><span class="n">v</span><span class="o">./</span><span class="n">sum</span><span class="x">(</span><span class="o">:</span><span class="n">v</span><span class="x">))</span>

<span class="n">sort!</span><span class="x">(</span><span class="n">trainData</span><span class="x">,</span> <span class="o">:</span><span class="kt">Hour</span><span class="x">);</span>
<span class="n">sort!</span><span class="x">(</span><span class="n">testData</span><span class="x">,</span> <span class="o">:</span><span class="kt">Hour</span><span class="x">);</span>
</code></pre></div></div>
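
<p>The model fitted next expects <code class="language-plaintext highlighter-rouge">Hour_sin</code> and <code class="language-plaintext highlighter-rouge">Hour_cos</code> columns on the aggregated data. As a minimal sketch, assuming the same \(2 \pi x / \max(x)\) scaling as the day-of-week plot, they could be built like this. The helper name <code class="language-plaintext highlighter-rouge">add_cyclical!</code> and the toy <code class="language-plaintext highlighter-rouge">demo</code> frame are purely illustrative.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using DataFrames

# Hypothetical helper: adds col_sin and col_cos columns in place,
# scaling by the largest observed value as in the plotting code.
function add_cyclical!(df::DataFrame, col::Symbol)
    x = df[!, col]
    period = maximum(x)
    df[!, string(col, "_sin")] = sin.(2 .* pi .* x ./ period)
    df[!, string(col, "_cos")] = cos.(2 .* pi .* x ./ period)
    return df
end

# Toy frame standing in for trainData / testData
demo = DataFrame(Hour = 0:23)
add_cyclical!(demo, :Hour)
</code></pre></div></div>

<p>One thing to watch with this scaling: the hours run from 0 to 23, so dividing by the maximum maps hour 0 and hour 23 to the same point on the circle. Dividing by 24 instead would keep them distinct.</p>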

<p>Again, using a linear model we fit the embedded hour variables to the fraction of the volume traded per hour.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">embedModelIntra</span> <span class="o">=</span> <span class="n">lm</span><span class="x">(</span><span class="nd">@formula</span><span class="x">(</span><span class="n">frac</span> <span class="o">~</span> <span class="n">Hour_sin</span> <span class="o">+</span> <span class="n">Hour_cos</span><span class="x">),</span> <span class="n">trainData</span><span class="x">)</span>
</code></pre></div></div>

<p>When comparing the results, we are now just looking at the intraday profile of the trades for both the train set and test set overlaid with the model.</p>

<p><img src="/assets/CyclicalEmbedding/intraEmbedd.png" alt="Line plot comparing actual and predicted intraday trading volume fractions by hour. The plot shows three lines: one representing the observed fraction of trading volume for each hour of the day from the training set, another from the test set and another representing the model's predicted values using cyclical embedding." width="80%" class="center-image" /></p>

<p>The model has done well to pick up the peak in the afternoon but has missed the peak in the early morning. The RMSE of this model is 0.029 vs 0.026 from using the training fractions directly, so again the encoded model has done worse.
This is the limiting factor with this embedding: we have a single frequency of sin/cos when in reality this problem needs more degrees of freedom, i.e. multiple components</p>

\[\sum _i c^1_i \sin \left(\frac{2 \pi \omega _i x}{\max (x)}\right) + c^2_i \cos \left(\frac{2 \pi \omega _i x}{\max (x)}\right).\]

<p>This is now a GAM with trigonometric splines, so we can view the cyclical encoding as a 1-spline GAM.</p>
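
<p>To make the multi-component version concrete, here is a minimal sketch that builds the sin/cos columns for several harmonics and fits them by least squares. It assumes integer frequencies \(\omega _i = i\), and both the helper name <code class="language-plaintext highlighter-rouge">fourier_features</code> and the target <code class="language-plaintext highlighter-rouge">y</code> are made up for illustration, not real volume data.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper: sin/cos columns for harmonics 1..K,
# using the same max(x) scaling as the single-frequency embedding.
function fourier_features(x::AbstractVector, K::Integer)
    period = maximum(x)
    cols = Vector{Vector{Float64}}()
    for k in 1:K
        push!(cols, sin.(2 .* pi .* k .* x ./ period))
        push!(cols, cos.(2 .* pi .* k .* x ./ period))
    end
    return hcat(cols...)
end

hours = collect(0:23)
X = fourier_features(hours, 3)        # 24 rows, 6 columns
A = hcat(ones(length(hours)), X)      # prepend an intercept column

# Toy target: a constant plus one second-harmonic sine term
y = 0.04 .+ 0.01 .* sin.(2 .* pi .* 2 .* hours ./ 23)
coefs = A \ y                         # ordinary least squares
</code></pre></div></div>

<p>Each extra harmonic adds two parameters, so the number of components is a trade-off between flexibility and overfitting on only 24 hourly points.</p>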

<h2 id="conclusion">Conclusion</h2>

<p>It’s an interesting transformation of time-like variables and gives you a route to smoothing out the beginning and ending of the cycles.</p>

<p>In these toy models, the embedding hasn’t improved performance but it’s possible that it’s more relevant in deep learning architectures where there are more parameters and more interactions. In all the above models there’s much more groundwork to do before we start eking out performance gains from the time variables.</p>]]></content><author><name>Dean Markwick</name></author><category term="julia" /><summary type="html"><![CDATA[Cyclical embedding (or encoding) is a basic transformation for numerical variables that follow a cycle. Let’s explore how it works.]]></summary></entry><entry><title type="html">Fitting Price Impact Models</title><link href="https://dm13450.github.io/2025/03/14/Fitting-Price-Impact-Models.html" rel="alternate" type="text/html" title="Fitting Price Impact Models" /><published>2025-03-14T00:00:00+00:00</published><updated>2025-03-14T00:00:00+00:00</updated><id>https://dm13450.github.io/2025/03/14/Fitting-Price-Impact-Models</id><content type="html" xml:base="https://dm13450.github.io/2025/03/14/Fitting-Price-Impact-Models.html"><![CDATA[<p>A big part of market microstructure is price impact and understanding how you move the market every time you trade. In the simplest sense, every trade upends the supply and demand of an asset even for a tiny amount of time. The market responds to this change, then responds to the response, then responds to that response, etc. You get the idea. It’s a cascading effect of interactions between all the people in the market.</p>


<p>Price impact is happening both at the micro and macro level. At the micro level each trade moves the market a little bit based on the instantaneous market conditions commonly called ‘liquidity’. At the macro level, continuous trades in one direction have a compounding and overlapping effect. In reality, you can’t separate out either effect so the market impact models need to work for both small and large scales.</p>

<p>This post is inspired by two sources:</p>

<ol>
  <li><a href="https://www.routledge.com/Handbook-of-Price-Impact-Modeling/Webster/p/book/9781032328225">Handbook of Price Impact Modeling</a> - Chapter 7</li>
  <li><a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4286108">Stochastic Liquidity as a Proxy for Nonlinear Price Impact</a></li>
</ol>

<p>Both cover very similar models but one is a fairly expensive
book and the other is on SSRN for free. The same author is involved in
both of them too.</p>

<p>In terms of data, there are two routes you can go down.</p>

<ol>
  <li>You have your own, private, execution data and can build out a data set for the models.</li>
  <li>You use publicly available trades and adjust the models to account for the anonymous data.</li>
</ol>

<p>In the first case, you will know when an execution started and stopped so you can record how the price changed. In the second case, the data will be made up of lots of trades and it is less obvious when some parent execution started and stopped.</p>

<p>We will take the 2nd route and use Bitcoin data to look at different price impact models.</p>

<p>As ever I will be using Julia with some of the standard packages.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">using</span> <span class="n">LibPQ</span>
<span class="k">using</span> <span class="n">DataFrames</span><span class="x">,</span> <span class="n">DataFramesMeta</span>
<span class="k">using</span> <span class="n">Dates</span>
<span class="k">using</span> <span class="n">Plots</span>
<span class="k">using</span> <span class="n">GLM</span><span class="x">,</span> <span class="n">Statistics</span><span class="x">,</span> <span class="n">Optim</span>
</code></pre></div></div>

<h2 id="bitcoin-price-impact-data">Bitcoin Price Impact Data</h2>

<p>We will use my old trusty Bitcoin data set that I collected
in 2021. It’s just over a day’s worth of Bitcoin trades and L1 prices
that I piped into QuestDB. Full details are in <a href="https://dm13450.github.io/2021/08/05/questdb-part-1.html">Using QuestDB to Build a Crypto Trade Database in Julia</a>.</p>

<p>First, we connect to the database.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">conn</span> <span class="o">=</span> <span class="n">LibPQ</span><span class="o">.</span><span class="n">Connection</span><span class="x">(</span><span class="s">"""
             dbname=qdb
             host=127.0.0.1
             password=quest
             port=8812
             user=admin"""</span><span class="x">);</span>
</code></pre></div></div>

<p>For each trade recorded in the database, we also want to join the best bid and offer immediately before it. This is where an <code class="language-plaintext highlighter-rouge">ASOF</code> join is useful. For each row of the first table, it picks the most recent row of the second table, by timestamp, that does not come after it. It sounds more complicated than it really is: it takes the trade table and adds in the prices using the price just before the trade.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">trades</span> <span class="o">=</span> <span class="n">execute</span><span class="x">(</span><span class="n">conn</span><span class="x">,</span> 
    <span class="s">"WITH
trades AS ( 
   SELECT * FROM coinbase_trades
   ),
prices as (
  select * from coinbase_bbo
)
select * from trades ASOF JOIN prices"</span><span class="x">)</span> <span class="o">|&gt;</span> <span class="n">DataFrame</span>
<span class="n">dropmissing!</span><span class="x">(</span><span class="n">trades</span><span class="x">);</span>
<span class="n">trades</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">trades</span><span class="x">,</span> <span class="o">:</span><span class="n">mid</span> <span class="o">=</span> <span class="mf">0.5</span><span class="o">*</span><span class="x">(</span><span class="o">:</span><span class="n">ask</span> <span class="o">.+</span> <span class="o">:</span><span class="n">bid</span><span class="x">))</span>
</code></pre></div></div>

<p>For these small tables, it calculates pretty much instantly and we are
able to return a Julia data frame. Plus we calculate the mid-price for each row.</p>
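
<p>To make the as-of idea concrete, here is a toy sketch in plain Julia. It only mimics the matching rule on made-up integer timestamps and is not how QuestDB implements the join. The variable names are illustrative.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Toy data: times as integers, quotes sorted by time
tradeTimes = [3, 7, 12]
quoteTimes = [1, 5, 6, 10]
quoteMids  = [100.0, 101.0, 102.0, 103.0]

# searchsortedlast gives the index of the last quote at or before each trade.
# It returns 0 when no quote precedes the trade, so real data needs a guard.
idx = [searchsortedlast(quoteTimes, t) for t in tradeTimes]
joinedMids = quoteMids[idx]
</code></pre></div></div>

<p>The trade at time 3 picks up the quote from time 1, the trade at time 7 picks up the quote from time 6, and the trade at time 12 picks up the quote from time 10.</p>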

<p>All the price impact models use this data aggregated in the same way:</p>
<ol>
  <li>Group the data by some time bucket (seconds or minutes etc.)</li>
  <li>Calculate the net amount, total absolute amount and open and close prices of the bucket.</li>
  <li>Calculate the price return using the close-to-close prices.</li>
</ol>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">function</span><span class="nf"> aggregate_data</span><span class="x">(</span><span class="n">trades</span><span class="x">,</span> <span class="n">smp</span><span class="x">)</span>
    <span class="n">tradesAgg</span> <span class="o">=</span> <span class="nd">@combine</span><span class="x">(</span><span class="n">groupby</span><span class="x">(</span><span class="nd">@transform</span><span class="x">(</span><span class="n">trades</span><span class="x">,</span> <span class="o">:</span><span class="n">ts</span> <span class="o">=</span> <span class="n">floor</span><span class="o">.</span><span class="x">(</span><span class="o">:</span><span class="n">timestamp</span><span class="x">,</span> <span class="n">smp</span><span class="x">)),</span> <span class="o">:</span><span class="n">ts</span><span class="x">),</span> 
             <span class="o">:</span><span class="n">q</span> <span class="o">=</span> <span class="n">sum</span><span class="x">(</span><span class="o">:</span><span class="n">size</span> <span class="o">.*</span> <span class="o">:</span><span class="n">side</span><span class="x">),</span> 
             <span class="o">:</span><span class="n">absq</span> <span class="o">=</span> <span class="n">sum</span><span class="x">(</span><span class="o">:</span><span class="n">size</span><span class="x">),</span> 
             <span class="o">:</span><span class="n">o</span> <span class="o">=</span> <span class="n">first</span><span class="x">(</span><span class="o">:</span><span class="n">mid</span><span class="x">),</span> 
             <span class="o">:</span><span class="n">c</span> <span class="o">=</span> <span class="n">last</span><span class="x">(</span><span class="o">:</span><span class="n">mid</span><span class="x">));</span>
    <span class="n">tradesAgg</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"price_return"</span><span class="x">]</span> <span class="o">.=</span> <span class="x">[</span><span class="nb">NaN</span><span class="x">;</span> <span class="x">(</span><span class="n">tradesAgg</span><span class="o">.</span><span class="n">c</span><span class="x">[</span><span class="mi">2</span><span class="o">:</span><span class="k">end</span><span class="x">]</span><span class="o">./</span> <span class="n">tradesAgg</span><span class="o">.</span><span class="n">c</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="x">(</span><span class="k">end</span><span class="o">-</span><span class="mi">1</span><span class="x">)])</span> <span class="o">.-</span> <span class="mi">1</span><span class="x">]</span>
    <span class="n">tradesAgg</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"ofi"</span><span class="x">]</span> <span class="o">.=</span> <span class="n">tradesAgg</span><span class="o">.</span><span class="n">q</span> <span class="o">./</span> <span class="n">tradesAgg</span><span class="o">.</span><span class="n">absq</span>

    <span class="n">tradesAgg</span>
<span class="k">end</span>
</code></pre></div></div>
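
<p>In symbols, the function computes the following for each bucket \(t\). Here \(s_j\) is the size of trade \(j\) and \(d_j\) its side, which I am assuming is coded as \(+1\) for a buy and \(-1\) for a sell since the code multiplies size by side. \(c_t\) is the last mid-price in the bucket.</p>

\[q_t = \sum _{j \in t} d_j s_j, \qquad |q|_t = \sum _{j \in t} s_j, \qquad r_t = \frac{c_t}{c_{t-1}} - 1, \qquad \text{OFI}_t = \frac{q_t}{|q|_t}.\]

<p>The first bucket has no previous close, which is why its return is set to <code class="language-plaintext highlighter-rouge">NaN</code>.</p>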

<p>We are going to bucket the data by 10 seconds.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">aggData</span>  <span class="o">=</span> <span class="n">aggregate_data</span><span class="x">(</span><span class="n">trades</span><span class="x">,</span> <span class="n">Dates</span><span class="o">.</span><span class="kt">Second</span><span class="x">(</span><span class="mi">10</span><span class="x">))</span>
</code></pre></div></div>

<p>As ever, let’s split this data into a training and test set.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">aggDataTrain</span> <span class="o">=</span> <span class="n">aggData</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="mi">7500</span><span class="x">,</span> <span class="o">:</span><span class="x">]</span>
<span class="n">aggDataTest</span> <span class="o">=</span> <span class="n">aggData</span><span class="x">[</span><span class="mi">7501</span><span class="o">:</span><span class="k">end</span><span class="x">,</span> <span class="o">:</span><span class="x">];</span>
</code></pre></div></div>

<p>It’s just a simple split on time.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">plot</span><span class="x">(</span><span class="n">aggDataTrain</span><span class="o">.</span><span class="n">ts</span><span class="x">,</span> <span class="n">aggDataTrain</span><span class="o">.</span><span class="n">c</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Train"</span><span class="x">)</span>
<span class="n">plot!</span><span class="x">(</span><span class="n">aggDataTest</span><span class="o">.</span><span class="n">ts</span><span class="x">,</span> <span class="n">aggDataTest</span><span class="o">.</span><span class="n">c</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Test"</span><span class="x">)</span>
</code></pre></div></div>

<p><img src="/assets/priceimpact/traintest.png" alt="" width="80%" class="center-image" /></p>

<h2 id="calculating-the-volatility-and-adv">Calculating the Volatility and ADV</h2>

<p>All the models require a volatility and ADV calculation. My data runs just over a day, so we need to adjust for that.</p>

<p>For the ADV we take the sum of the total volume traded and divide by the length of time converted to days.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">deltaT</span> <span class="o">=</span> <span class="n">maximum</span><span class="x">(</span><span class="n">trades</span><span class="o">.</span><span class="n">timestamp</span><span class="x">)</span> <span class="o">-</span> <span class="n">minimum</span><span class="x">(</span><span class="n">trades</span><span class="o">.</span><span class="n">timestamp</span><span class="x">)</span>
<span class="n">deltaTDays</span> <span class="o">=</span> <span class="x">(</span><span class="n">deltaT</span><span class="o">.</span><span class="n">value</span> <span class="o">*</span> <span class="mf">1e-3</span><span class="x">)</span><span class="o">/</span><span class="x">(</span><span class="mi">24</span><span class="o">*</span><span class="mi">60</span><span class="o">*</span><span class="mi">60</span><span class="x">)</span>
<span class="n">adv</span> <span class="o">=</span> <span class="n">sum</span><span class="x">(</span><span class="n">trades</span><span class="o">.</span><span class="n">size</span><span class="x">)</span><span class="o">/</span><span class="n">deltaTDays</span>
<span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"ADV"</span><span class="x">]</span> <span class="o">.=</span> <span class="n">adv</span>
<span class="n">aggDataTest</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"ADV"</span><span class="x">]</span> <span class="o">.=</span> <span class="n">adv</span><span class="x">;</span>
</code></pre></div></div>
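
<p>The unit conversion is the only subtle part: subtracting two <code class="language-plaintext highlighter-rouge">DateTime</code> values gives a period in milliseconds, so it has to be scaled to seconds and then to days. A small sketch with made-up timestamps and a made-up total volume shows the arithmetic.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using Dates

# Toy timestamps 36 hours apart (not the real data range)
deltaT = DateTime(2021, 8, 2, 12) - DateTime(2021, 8, 1)   # a Millisecond period

# milliseconds to seconds, then seconds to days
deltaTDays = (deltaT.value * 1e-3) / (24 * 60 * 60)

totalVolume = 3000.0               # made-up total traded size
adv = totalVolume / deltaTDays     # volume per day
</code></pre></div></div>

<p>With 36 hours of data this scales the total volume down by a factor of about 1.5 to get a per-day figure.</p>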

<p>For the volatility, we take the square root of the sum of the squared 5-minute returns. It should probably be annualised if we were comparing the parameters across different assets.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">min5Agg</span> <span class="o">=</span> <span class="n">aggregate_data</span><span class="x">(</span><span class="n">trades</span><span class="x">,</span> <span class="n">Dates</span><span class="o">.</span><span class="kt">Minute</span><span class="x">(</span><span class="mi">5</span><span class="x">))</span>
<span class="n">volatility</span> <span class="o">=</span> <span class="n">sqrt</span><span class="x">(</span><span class="n">sum</span><span class="x">(</span><span class="n">min5Agg</span><span class="o">.</span><span class="n">price_return</span><span class="x">[</span><span class="mi">2</span><span class="o">:</span><span class="k">end</span><span class="x">]</span> <span class="o">.*</span> <span class="n">min5Agg</span><span class="o">.</span><span class="n">price_return</span><span class="x">[</span><span class="mi">2</span><span class="o">:</span><span class="k">end</span><span class="x">]))</span>
<span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"Vol"</span><span class="x">]</span> <span class="o">.=</span> <span class="n">volatility</span><span class="x">;</span>
<span class="n">aggDataTest</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"Vol"</span><span class="x">]</span> <span class="o">.=</span> <span class="n">volatility</span><span class="x">;</span>
</code></pre></div></div>
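
<p>Written out, with \(r_i\) the close-to-close return of the \(i\)-th 5-minute bucket and the first bucket skipped because it has no previous close, the code computes</p>

\[\sigma = \sqrt{\sum _{i \geq 2} r_i^2}.\]

<p>This is the realised volatility over the whole sample rather than a per-day or annualised number.</p>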

<p>The ADV and volatility have a normalising effect across assets. So if we had multiple coins, we could use the same model even if one was a highly traded coin like BTC or ETH vs a lower volume coin (the rest of them?!). This would give us comparable model parameters to judge the impact effect.</p>

<p>As our data sample is so small we are only calculating 1 volatility and 1 ADV. In reality, you calculate the volatility/ADV on a rolling basis and then do the train/test split.</p>

<h2 id="models-of-market-impact">Models of Market Impact</h2>

<p>The paper and book describe different market impact models that all follow a similar functional form. I’ve chosen four of them to illustrate the model fitting process.</p>

<ul>
  <li>The Order Flow Imbalance model (OFI)</li>
  <li>The Obizhaeva-Wang (OW) model</li>
  <li>The Concave Propagator model</li>
  <li>The Reduced Form model</li>
</ul>

<p>For all the models we will state the form of the market impact
\(\Delta I\) and use the price returns over the same period to find
the best parameters of the model.</p>

<p>The overarching idea is that the return in each bucket is proportional
to the amount of volume traded in that bucket plus some suitably decayed
contribution from the volumes traded in earlier buckets.</p>

<h3 id="order-flow-imbalance">Order Flow Imbalance</h3>

<p>This is the simplest model as it just uses the imbalance over the
bucket to predict return. For the OFI we are just using the trade
imbalance, the net volume divided by the total volume in the bucket.</p>

\[\Delta I = \lambda \sigma \frac{q_t}{| q_t | \text{ADV}}\]

<p>As there is no dependence on the previous returns, we can use simple linear regression to estimate \(\lambda\).</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_ofi"</span><span class="x">]</span> <span class="o">=</span> <span class="n">aggDataTrain</span><span class="o">.</span><span class="n">Vol</span> <span class="o">.*</span> <span class="x">(</span><span class="n">aggDataTrain</span><span class="o">.</span><span class="n">ofi</span> <span class="o">./</span> <span class="n">aggDataTrain</span><span class="o">.</span><span class="n">ADV</span><span class="x">)</span>
<span class="n">aggDataTest</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_ofi"</span><span class="x">]</span> <span class="o">=</span> <span class="n">aggDataTest</span><span class="o">.</span><span class="n">Vol</span> <span class="o">.*</span> <span class="x">(</span><span class="n">aggDataTest</span><span class="o">.</span><span class="n">ofi</span> <span class="o">./</span> <span class="n">aggDataTest</span><span class="o">.</span><span class="n">ADV</span><span class="x">)</span>

<span class="n">ofiModel</span> <span class="o">=</span> <span class="n">lm</span><span class="x">(</span><span class="nd">@formula</span><span class="x">(</span><span class="n">price_return</span> <span class="o">~</span> <span class="n">x_ofi</span> <span class="o">+</span> <span class="mi">0</span><span class="x">),</span> <span class="n">aggDataTrain</span><span class="x">[</span><span class="mi">2</span><span class="o">:</span><span class="k">end</span><span class="x">,</span> <span class="o">:</span><span class="x">])</span>
</code></pre></div></div>
<p>The model has returned a significant value of \(\lambda = 59\) and has an in-sample \(R^2\) of 11% and an out-of-sample RMSE of 0.0003. Encouraging and off to a good start!</p>
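
<p>Because the regression has no intercept, the estimate has a simple closed form. Writing \(x_t = \sigma \, \text{OFI}_t / \text{ADV}\) for the regressor (the <code class="language-plaintext highlighter-rouge">x_ofi</code> column, where the denominator of the imbalance is the total unsigned volume in the bucket) and \(r_t\) for the bucket return, ordinary least squares through the origin gives</p>

\[\hat{\lambda} = \frac{\sum _t x_t r_t}{\sum _t x_t^2}.\]

<p>Since \(\sigma\) and ADV are constants in this sample, they only rescale \(\lambda\) and have no effect on the fit itself.</p>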

<p>Side note, I’ve written about Order Flow Imbalance before in <a href="https://dm13450.github.io/2022/02/02/Order-Flow-Imbalance.html">Order Flow Imbalance - A High Frequency Trading Signal</a>.</p>

<h3 id="the-obizhaeva-wang-ow-model">The Obizhaeva-Wang (OW) Model</h3>

<p>The OW model is a foundational model of market impact and you will see it frequently across microstructure papers. It suggests a linear dependence between the signed order flow and price impact, again normalised by the ADV and volatility.</p>

\[\Delta I = -\beta I_t + \lambda \sigma \frac{q_t}{\text{ADV}}\]

<p>Again, we create the \(x\) variable in the data frame specific to this model, but this one needs special attention to fit.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_ow"</span><span class="x">]</span> <span class="o">=</span> <span class="n">aggDataTrain</span><span class="o">.</span><span class="n">Vol</span> <span class="o">.*</span> <span class="x">(</span><span class="n">aggDataTrain</span><span class="o">.</span><span class="n">q</span> <span class="o">./</span> <span class="n">aggDataTrain</span><span class="o">.</span><span class="n">ADV</span><span class="x">);</span>
<span class="n">aggDataTest</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_ow"</span><span class="x">]</span> <span class="o">=</span> <span class="n">aggDataTest</span><span class="o">.</span><span class="n">Vol</span> <span class="o">.*</span> <span class="x">(</span><span class="n">aggDataTest</span><span class="o">.</span><span class="n">q</span> <span class="o">./</span> <span class="n">aggDataTest</span><span class="o">.</span><span class="n">ADV</span><span class="x">);</span>
</code></pre></div></div>

<p>From the market impact formula, we can see that the relationship is
recursive. The impact at time \(t\) depends on the impact at time
\(t-1\). How much of the previous impact is carried over is controlled
by \(\beta\), and the paper fixes this through the half-life, \(\frac{\log 2}{\beta}
= 60 \text{ minutes}\). This means we have to fit the model as:</p>

<ol>
  <li>Calculate the impact \(I\) given an estimate of \(\lambda\).</li>
  <li>Adjust the price returns by this impact.</li>
  <li>Regress the adjusted price returns against the \(x\) variable.</li>
  <li>Repeat with the new estimate of \(\lambda\) until converged.</li>
</ol>

<p>This is a simple one-parameter optimisation where we minimise the RMSE.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">function</span><span class="nf"> calcImpact</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="n">beta</span><span class="x">,</span> <span class="n">lambda</span><span class="x">)</span>
    <span class="n">impact</span> <span class="o">=</span> <span class="n">zeros</span><span class="x">(</span><span class="n">length</span><span class="x">(</span><span class="n">x</span><span class="x">))</span>
    <span class="n">impact</span><span class="x">[</span><span class="mi">1</span><span class="x">]</span> <span class="o">=</span> <span class="n">x</span><span class="x">[</span><span class="mi">1</span><span class="x">]</span>
    <span class="k">for</span> <span class="n">i</span> <span class="k">in</span> <span class="mi">2</span><span class="o">:</span><span class="n">length</span><span class="x">(</span><span class="n">impact</span><span class="x">)</span>
        <span class="n">impact</span><span class="x">[</span><span class="n">i</span><span class="x">]</span> <span class="o">=</span> <span class="x">(</span><span class="mi">1</span><span class="o">-</span><span class="n">beta</span><span class="x">)</span><span class="o">*</span><span class="n">impact</span><span class="x">[</span><span class="n">i</span><span class="o">-</span><span class="mi">1</span><span class="x">]</span> <span class="o">+</span> <span class="n">lambda</span><span class="o">*</span><span class="n">x</span><span class="x">[</span><span class="n">i</span><span class="x">]</span>
    <span class="k">end</span>
    <span class="n">impact</span>
<span class="k">end</span>
	
<span class="k">function</span><span class="nf"> fitLambda</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="n">y</span><span class="x">,</span> <span class="n">beta</span><span class="x">,</span> <span class="n">lambda</span><span class="x">)</span>
    <span class="n">I</span> <span class="o">=</span> <span class="n">calcImpact</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="n">beta</span><span class="x">,</span> <span class="n">lambda</span><span class="x">)</span>
    <span class="n">y2</span> <span class="o">=</span> <span class="n">y</span> <span class="o">.+</span> <span class="x">(</span><span class="n">beta</span> <span class="o">.*</span> <span class="n">I</span><span class="x">)</span>
    <span class="n">model</span> <span class="o">=</span> <span class="n">lm</span><span class="x">(</span><span class="n">reshape</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="x">(</span><span class="n">length</span><span class="x">(</span><span class="n">x</span><span class="x">),</span> <span class="mi">1</span><span class="x">))[</span><span class="mi">2</span><span class="o">:</span><span class="k">end</span><span class="x">,</span> <span class="o">:</span><span class="x">],</span> <span class="n">y2</span><span class="x">[</span><span class="mi">2</span><span class="o">:</span><span class="k">end</span><span class="x">])</span>
    <span class="n">model</span>
<span class="k">end</span>

<span class="n">rmse</span><span class="x">(</span><span class="n">x</span><span class="x">)</span> <span class="o">=</span> <span class="n">sqrt</span><span class="x">(</span><span class="n">mean</span><span class="x">(</span><span class="n">residuals</span><span class="x">(</span><span class="n">x</span><span class="x">)</span> <span class="o">.^</span><span class="mi">2</span><span class="x">))</span>
</code></pre></div></div>
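
<p>The fits in this post pass a per-step decay of <code>beta = 0.01</code> into this recursion, whereas the paper quotes a half-life. As a rough sketch of how the two relate, assuming the decay is applied once per row as in <code>calcImpact</code> (the helper names here are hypothetical and not from the paper):</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Half-life, in steps, implied by a per-step decay: solves (1 - beta)^n = 1/2.
halfLifeSteps(beta) = -log(2) / log1p(-beta)

# Per-step decay that gives a target half-life of n steps.
betaFromHalfLife(n) = 1 - 2^(-1 / n)

halfLifeSteps(0.01)    # roughly 69 steps
betaFromHalfLife(60)   # roughly 0.0115
</code></pre></div></div>

<p>Whether 0.01 really corresponds to the paper’s 60 minutes depends on the bar length of the aggregated data, so treat it as an approximation.</p>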

<p>We start with \(\lambda = 1\) and let the optimiser do the work.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">res</span> <span class="o">=</span> <span class="n">optimize</span><span class="x">(</span><span class="n">x</span> <span class="o">-&gt;</span> <span class="n">rmse</span><span class="x">(</span><span class="n">fitLambda</span><span class="x">(</span><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_ow"</span><span class="x">],</span> <span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"price_return"</span><span class="x">],</span> <span class="mf">0.01</span><span class="x">,</span> <span class="n">x</span><span class="x">[</span><span class="mi">1</span><span class="x">])),</span> <span class="x">[</span><span class="mf">1.0</span><span class="x">])</span>
</code></pre></div></div>

<p>It’s converged! We plot the different values of the objective function and show that this process can find the minimum.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">lambdaRes</span> <span class="o">=</span> <span class="n">rmse</span><span class="o">.</span><span class="x">(</span><span class="n">fitLambda</span><span class="o">.</span><span class="x">([</span><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_ow"</span><span class="x">]],</span> <span class="x">[</span><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"price_return"</span><span class="x">]],</span> <span class="mf">0.01</span><span class="x">,</span> <span class="mi">0</span><span class="o">:</span><span class="mi">1</span><span class="o">:</span><span class="mi">20</span><span class="x">))</span>
<span class="n">plot</span><span class="x">(</span><span class="mi">0</span><span class="o">:</span><span class="mi">1</span><span class="o">:</span><span class="mi">20</span><span class="x">,</span> <span class="n">lambdaRes</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="o">:</span><span class="n">none</span><span class="x">,</span> <span class="n">xlabel</span> <span class="o">=</span> <span class="s">L"\lambda"</span><span class="x">,</span> <span class="n">ylabel</span> <span class="o">=</span> <span class="s">"RMSE"</span><span class="x">,</span> <span class="n">title</span> <span class="o">=</span> <span class="s">"OW Model"</span><span class="x">)</span>
<span class="n">vline!</span><span class="x">(</span><span class="n">Optim</span><span class="o">.</span><span class="n">minimizer</span><span class="x">(</span><span class="n">res</span><span class="x">),</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Optimised Value"</span><span class="x">)</span>
</code></pre></div></div>

<p><img src="/assets/priceimpact/ow.png" alt="" width="80%" class="center-image" /></p>

<p>We have a nice convex relationship, which is always a good sign.
We then pull out the best-fitting model and estimate the \(R^2\).</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">owModel</span> <span class="o">=</span> <span class="n">fitLambda</span><span class="x">(</span><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_ow"</span><span class="x">],</span> <span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"price_return"</span><span class="x">],</span> <span class="mf">0.01</span><span class="x">,</span> <span class="n">first</span><span class="x">(</span><span class="n">Optim</span><span class="o">.</span><span class="n">minimizer</span><span class="x">(</span><span class="n">res</span><span class="x">)))</span>
</code></pre></div></div>

<p>This gives \(R^2 = 11\%\), roughly the same as the OFI model. For the out-of-sample RMSE we get 0.0006.</p>
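
<p>The out-of-sample calculation isn’t shown in the code above, so here is one way it could be sketched. It assumes the same \(\lambda\) is used both to build the impact path and to scale \(x\), and that the predicted return is \(\lambda x_t - \beta I_t\), mirroring the adjustment in <code>fitLambda</code>. The names <code>impactPath</code> and <code>oosRMSE</code> are hypothetical, and this is an illustration rather than the exact calculation behind the numbers quoted here:</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using Statistics

# Same recursion as calcImpact, repeated so the sketch is self-contained.
function impactPath(x, beta, lambda)
    impact = zeros(length(x))
    impact[1] = x[1]
    for i in 2:length(x)
        impact[i] = (1 - beta) * impact[i-1] + lambda * x[i]
    end
    impact
end

# Hypothetical helper: RMSE of returns y against the prediction lambda*x - beta*I.
# The first row is dropped to match the 2:end slice used when fitting.
function oosRMSE(x, y, beta, lambda)
    I = impactPath(x, beta, lambda)
    pred = lambda .* x .- beta .* I
    sqrt(mean((y[2:end] .- pred[2:end]) .^ 2))
end
</code></pre></div></div>

<p>Calling <code>oosRMSE(aggDataTest[!, "x_ow"], aggDataTest[!, "price_return"], 0.01, lambda)</code> with the fitted \(\lambda\) would then give a test-set error on the same scale as the returns.</p>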

<h2 id="concave-propagator-model">Concave Propagator Model</h2>

<p>This model follows the belief that market impact is a power law and
that power is close to 0.5. Using the square root of the total amount
traded and the net direction gives us the \(x\) variable.</p>

\[\Delta I = -\beta I_t + \lambda \sigma \text{sign} (q_t) \sqrt
{\frac{| q_t |}{\text{ADV}}}\]

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_cp"</span><span class="x">]</span> <span class="o">=</span> <span class="n">aggDataTrain</span><span class="o">.</span><span class="n">Vol</span> <span class="o">.*</span> <span class="n">sign</span><span class="o">.</span><span class="x">(</span><span class="n">aggDataTrain</span><span class="o">.</span><span class="n">q</span><span class="x">)</span> <span class="o">.*</span> <span class="n">sqrt</span><span class="o">.</span><span class="x">((</span><span class="n">aggDataTrain</span><span class="o">.</span><span class="n">absq</span> <span class="o">./</span> <span class="n">aggDataTrain</span><span class="o">.</span><span class="n">ADV</span><span class="x">));</span>
<span class="n">aggDataTest</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_cp"</span><span class="x">]</span> <span class="o">=</span> <span class="n">aggDataTest</span><span class="o">.</span><span class="n">Vol</span> <span class="o">.*</span> <span class="n">sign</span><span class="o">.</span><span class="x">(</span><span class="n">aggDataTest</span><span class="o">.</span><span class="n">q</span><span class="x">)</span> <span class="o">.*</span> <span class="n">sqrt</span><span class="o">.</span><span class="x">((</span><span class="n">aggDataTest</span><span class="o">.</span><span class="n">absq</span> <span class="o">./</span> <span class="n">aggDataTest</span><span class="o">.</span><span class="n">ADV</span><span class="x">));</span>
</code></pre></div></div>
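
<p>To see what the square root buys us, here is a small sketch comparing the linear and concave \(x\) variables for a single bar. The helper names and the numbers plugged in are purely illustrative assumptions, not values from the data:</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code># x variable for one bar under the linear (OW) and concave specifications.
xLinear(sigma, q, adv) = sigma * q / adv
xConcave(sigma, q, adv) = sigma * sign(q) * sqrt(abs(q) / adv)

# Doubling the signed volume doubles the linear x ...
xLinear(0.02, 200.0, 1e4) / xLinear(0.02, 100.0, 1e4)

# ... but scales the concave x by only sqrt(2), and selling flips the sign.
xConcave(0.02, 200.0, 1e4) / xConcave(0.02, 100.0, 1e4)
xConcave(0.02, -100.0, 1e4)
</code></pre></div></div>

<p>Large bars therefore contribute proportionally less to the concave model than to the OW model, which is the whole point of the power law.</p>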

<p>Again, we optimise using the same methodology as above.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">res</span> <span class="o">=</span> <span class="n">optimize</span><span class="x">(</span><span class="n">x</span> <span class="o">-&gt;</span> <span class="n">rmse</span><span class="x">(</span><span class="n">fitLambda</span><span class="x">(</span><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_cp"</span><span class="x">],</span> <span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"price_return"</span><span class="x">],</span> <span class="mf">0.01</span><span class="x">,</span> <span class="n">x</span><span class="x">[</span><span class="mi">1</span><span class="x">])),</span> <span class="x">[</span><span class="mf">1.0</span><span class="x">])</span>
<span class="n">lambdaRes</span> <span class="o">=</span> <span class="n">rmse</span><span class="o">.</span><span class="x">(</span><span class="n">fitLambda</span><span class="o">.</span><span class="x">([</span><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_cp"</span><span class="x">]],</span> <span class="x">[</span><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"price_return"</span><span class="x">]],</span> <span class="mf">0.01</span><span class="x">,</span> <span class="mi">0</span><span class="o">:</span><span class="mf">0.1</span><span class="o">:</span><span class="mi">1</span><span class="x">))</span>
<span class="n">plot</span><span class="x">(</span><span class="mi">0</span><span class="o">:</span><span class="mf">0.1</span><span class="o">:</span><span class="mi">1</span><span class="x">,</span> <span class="n">lambdaRes</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="o">:</span><span class="n">none</span><span class="x">,</span> <span class="n">xlabel</span> <span class="o">=</span> <span class="s">L"\lambda"</span><span class="x">,</span> <span class="n">ylabel</span> <span class="o">=</span> <span class="s">"RMSE"</span><span class="x">,</span> <span class="n">title</span> <span class="o">=</span> <span class="s">"Concave Propagator Model"</span><span class="x">)</span>
<span class="n">vline!</span><span class="x">(</span><span class="n">Optim</span><span class="o">.</span><span class="n">minimizer</span><span class="x">(</span><span class="n">res</span><span class="x">),</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Optimised Value"</span><span class="x">)</span>
</code></pre></div></div>

<p><img src="/assets/priceimpact/concaveprop.png" alt="" width="80%" class="center-image" /></p>

<p>Another success! This time the \(R^2\) is 17%, an improvement on the other two models. Its out-of-sample RMSE is 0.0008.</p>

<h2 id="reduced-form-model">Reduced Form Model</h2>

<p>The paper suggests that as the number of trades and the time increment
increase, the market impact function converges to a linear form with a
dependence on the stochastic volatility of the order flow.</p>

\[\Delta I = -\beta I_t + \lambda \sigma \frac{q_t}{\sqrt{v_t \cdot \text{ADV}}}\]

<p>For this, we need to calculate the stochastic liquidity parameter, \(v_t\), which is simply an exponentially decaying moving sum of the absolute market volumes.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">function</span><span class="nf"> calcLiquidity</span><span class="x">(</span><span class="n">absq</span><span class="x">,</span> <span class="n">beta</span><span class="x">)</span>
    <span class="n">v</span> <span class="o">=</span> <span class="n">zeros</span><span class="x">(</span><span class="n">length</span><span class="x">(</span><span class="n">absq</span><span class="x">))</span>
    <span class="n">v</span><span class="x">[</span><span class="mi">1</span><span class="x">]</span> <span class="o">=</span> <span class="n">absq</span><span class="x">[</span><span class="mi">1</span><span class="x">]</span>
    <span class="k">for</span> <span class="n">i</span> <span class="k">in</span> <span class="mi">2</span><span class="o">:</span><span class="n">length</span><span class="x">(</span><span class="n">v</span><span class="x">)</span>
        <span class="n">v</span><span class="x">[</span><span class="n">i</span><span class="x">]</span> <span class="o">=</span> <span class="x">(</span><span class="mi">1</span><span class="o">-</span><span class="n">beta</span><span class="x">)</span><span class="o">*</span><span class="n">v</span><span class="x">[</span><span class="n">i</span><span class="o">-</span><span class="mi">1</span><span class="x">]</span> <span class="o">+</span> <span class="n">absq</span><span class="x">[</span><span class="n">i</span><span class="x">]</span>
    <span class="k">end</span>
    <span class="k">return</span> <span class="n">v</span>
<span class="k">end</span>

<span class="n">v</span> <span class="o">=</span> <span class="n">calcLiquidity</span><span class="x">(</span><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"absq"</span><span class="x">],</span> <span class="mf">0.01</span><span class="x">)</span>
<span class="n">vTest</span> <span class="o">=</span> <span class="n">calcLiquidity</span><span class="x">(</span><span class="n">aggDataTest</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"absq"</span><span class="x">],</span> <span class="mf">0.01</span><span class="x">)</span>

<span class="n">plot</span><span class="x">(</span><span class="n">aggDataTrain</span><span class="o">.</span><span class="n">ts</span><span class="x">,</span> <span class="n">v</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Stochastic Liquidity"</span><span class="x">)</span>
<span class="n">plot!</span><span class="x">(</span><span class="n">aggDataTest</span><span class="o">.</span><span class="n">ts</span><span class="x">,</span> <span class="n">vTest</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Test Set"</span><span class="x">)</span>
</code></pre></div></div>
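
<p>Unrolling the recursion in <code>calcLiquidity</code> makes the weighting explicit. This is just the loop above written as a sum, with \(|q_t|\) standing for the <code>absq</code> column:</p>

\[v_t = \sum_{k=0}^{t-1} (1-\beta)^k \, |q_{t-k}|\]

<p>For a constant volume \(\bar{q}\) this settles at \(\bar{q}/\beta\), so with \(\beta = 0.01\) the value of \(v_t\) sits around 100 times a typical bar’s volume. Multiplying by \(\beta\) would turn it into a conventional exponentially weighted average; as written, that constant factor is simply absorbed into \(\lambda\).</p>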

<p><img src="/assets/priceimpact/stochliq.png" alt="" width="80%" class="center-image" /></p>

<p>Adding this into our data frame and calculating the \(x\) variable is simple.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"v"</span><span class="x">]</span> <span class="o">=</span> <span class="n">v</span>
<span class="n">aggDataTest</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"v"</span><span class="x">]</span> <span class="o">=</span> <span class="n">vTest</span>

<span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_rf"</span><span class="x">]</span> <span class="o">=</span> <span class="n">aggDataTrain</span><span class="o">.</span><span class="n">Vol</span> <span class="o">.*</span> <span class="n">aggDataTrain</span><span class="o">.</span><span class="n">q</span> <span class="o">./</span> <span class="n">sqrt</span><span class="o">.</span><span class="x">((</span><span class="n">aggDataTrain</span><span class="o">.</span><span class="n">ADV</span> <span class="o">.*</span> <span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"v"</span><span class="x">]));</span>
<span class="n">aggDataTest</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_rf"</span><span class="x">]</span> <span class="o">=</span> <span class="n">aggDataTest</span><span class="o">.</span><span class="n">Vol</span> <span class="o">.*</span> <span class="n">aggDataTest</span><span class="o">.</span><span class="n">q</span> <span class="o">./</span>
<span class="n">sqrt</span><span class="o">.</span><span class="x">((</span><span class="n">aggDataTest</span><span class="o">.</span><span class="n">ADV</span> <span class="o">.*</span> <span class="n">aggDataTest</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"v"</span><span class="x">]));</span>
</code></pre></div></div>

<p>And again, we repeat the fitting process.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">lambdaVals</span> <span class="o">=</span> <span class="mi">0</span><span class="o">:</span><span class="mf">0.1</span><span class="o">:</span><span class="mi">5</span>
<span class="n">res</span> <span class="o">=</span> <span class="n">optimize</span><span class="x">(</span><span class="n">x</span> <span class="o">-&gt;</span> <span class="n">rmse</span><span class="x">(</span><span class="n">fitLambda</span><span class="x">(</span><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_rf"</span><span class="x">],</span> <span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"price_return"</span><span class="x">],</span> <span class="mf">0.01</span><span class="x">,</span> <span class="n">x</span><span class="x">[</span><span class="mi">1</span><span class="x">])),</span> <span class="x">[</span><span class="mf">1.0</span><span class="x">])</span>
<span class="n">lambdaRes</span> <span class="o">=</span> <span class="n">rmse</span><span class="o">.</span><span class="x">(</span><span class="n">fitLambda</span><span class="o">.</span><span class="x">([</span><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_rf"</span><span class="x">]],</span> <span class="x">[</span><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"price_return"</span><span class="x">]],</span> <span class="mf">0.01</span><span class="x">,</span> <span class="n">lambdaVals</span><span class="x">))</span>
<span class="n">plot</span><span class="x">(</span><span class="n">lambdaVals</span><span class="x">,</span> <span class="n">lambdaRes</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="o">:</span><span class="n">none</span><span class="x">,</span> <span class="n">xlabel</span> <span class="o">=</span> <span class="s">L"\lambda"</span><span class="x">,</span> <span class="n">ylabel</span> <span class="o">=</span> <span class="s">"RMSE"</span><span class="x">,</span> <span class="n">title</span> <span class="o">=</span> <span class="s">"Reduced Form Model"</span><span class="x">)</span>
<span class="n">vline!</span><span class="x">(</span><span class="n">Optim</span><span class="o">.</span><span class="n">minimizer</span><span class="x">(</span><span class="n">res</span><span class="x">),</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Optimised Value"</span><span class="x">)</span>
</code></pre></div></div>

<p><img src="/assets/priceimpact/rf.png" alt="" width="80%" class="center-image" /></p>

<p>This model gives an \(R^2=10\%\) and out-of-sample RMSE of 0.0009.</p>

<p>With all four models fitted, we can now look at the differences statistically and how the impact state evolves over the course of the day.</p>

<table>
  <thead>
    <tr>
      <th>Model</th>
      <th>\(\lambda\)</th>
      <th>\(R^2\)</th>
      <th>OOS RMSE</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>OFI</td>
      <td>43</td>
      <td>0.11</td>
      <td>0.0003</td>
    </tr>
    <tr>
      <td>OW</td>
      <td>14</td>
      <td>0.11</td>
      <td>0.0006</td>
    </tr>
    <tr>
      <td>Concave Propagator</td>
      <td>0.34</td>
      <td>0.17</td>
      <td>0.0008</td>
    </tr>
    <tr>
      <td>Reduced Form</td>
      <td>1.7</td>
      <td>0.10</td>
      <td>0.0009</td>
    </tr>
  </tbody>
</table>

<p>So, the concave propagator model has the highest \(R^2\), followed by the OFI and OW models, with the reduced form model slightly lower.
But, looking at the RMSE values from the out-of-sample performance, it’s
clear that the OFI model seems to be the best.</p>

<p>When we plot the resulting impacts from the 4 models we generally see
they agree, with the OFI model being the most different. This
difference comes from its lack of time decay from the previous volumes.</p>

<p><img src="/assets/priceimpact/priceimpact.png" alt="" width="80%" class="center-image" /></p>

<h2 id="conclusion">Conclusion</h2>

<p>Overall, I don’t think these results are that informative: my data set is tiny
compared to the paper (1 day vs months). Instead, use this as more of
an instructional guide on how to fit these models. We didn’t even explore
optimising the time decay (\(\beta\) values) for Bitcoin, which could
be substantially different from the paper’s equity dataset. So
there is plenty more to do!</p>]]></content><author><name>Dean Markwick</name></author><category term="julia" /><summary type="html"><![CDATA[A big part of market microstructure is price impact and understanding how you move the market every time you trade. In the simplest sense, every trade upends the supply and demand of an asset even for a tiny amount of time. The market responds to this change, then responds to the response, then responds to that response, etc. You get the idea. It’s a cascading effect of interactions between all the people in the market.]]></summary></entry><entry><title type="html">Importance Sampling, Reinforcement Learning and Getting More From The Data You Have</title><link href="https://dm13450.github.io/2024/12/17/Importance-Sampling-Reinforcement-Learning-and-Getting-More-From-The-Data-You-Have.html" rel="alternate" type="text/html" title="Importance Sampling, Reinforcement Learning and Getting More From The Data You Have" /><published>2024-12-17T00:00:00+00:00</published><updated>2024-12-17T00:00:00+00:00</updated><id>https://dm13450.github.io/2024/12/17/Importance-Sampling-Reinforcement-Learning-and-Getting-More-From-The-Data-You-Have</id><content type="html" xml:base="https://dm13450.github.io/2024/12/17/Importance-Sampling-Reinforcement-Learning-and-Getting-More-From-The-Data-You-Have.html"><![CDATA[<p>A new paper hit my feed <a 
href="https://download.ssrn.com/2024/10/28/5001783.pdf?response-content-disposition=inline&amp;X-Amz-Security-Token=IQoJb3JpZ2luX2VjEOP%2F%2F%2F%2F%2F%2F%2F%2F%2F%2FwEaCXVzLWVhc3QtMSJGMEQCIHq9Se9RaBE1%2F66%2BtmGrcj7aLbi6toqQfDSRp1zTko7kAiBSm6JmvMSoPO0qhnQjbVhmI7Jq4ELTMJ7aVMMhtXx5Kyq9BQh8EAQaDDMwODQ3NTMwMTI1NyIM7DspRkQ6bu5dMzZmKpoFy1twmqL3CnHbwCY%2FoeCM5gNNzZ7Tgg6Nkneyv2NRLkCFg5Me2jaIG8Q19ah0BTONxgU0DOq1FJqha7bmPF5e1aNGUEDvsb15S2rcqa%2FMn6FSkvUYj0MdXZ7mW%2F0E8a7Tze6yx79i96PxX%2BDAmB49m7eca2VjYxGTQLD4BeSG6pCC%2BUb2KIKHraWojBJVa0ttPHzpgau%2F2lrXccdLLjSuk4sXavZUaFK%2BSPPq9vLzE67U7CbL8fbSVSa8kJjhydjEfbJY3VWCS1ObFoo%2BmZ3NaVa3aUICzIYL8t6otH6TSvqu5ngqN3CeWuFcFXCBIYx%2FuYUL55ZNDj47pxOSwLXc3bhMJgEceXuTcNLTwT8gm8NmKDOxIfsZ0yyQ39NxBbgmy3tfxj786JT7ZHUwBOTz9UEHLTtYuVX5QVGIs9wM8vWgP39UKJ02jz0Fkm540mXidi7qOwcT0Qq3fuROji8yfKHrQ%2F8NPY8EG0JIbNQXPiGZa%2FGcwvAh1OKaK6jDCFHJ5oIPREpmUHC8a%2FxURXyVqljIyA%2Bcci2aLmRM8miFe2c3RtJz3H%2FzYHoKHbmlVdHy8L1OedB2niM1oJynkr%2BLPAGqmFtqzyidMKR0vkL%2BXwBGn%2BTc9LtLSPevoFXweE%2BGnCU0R6NeUNKyYuqlfDIOIHrHi%2F9KlTKHd8nctyiwcSORdBPHBQGKMN22Aqw31KMGY%2BDQoI2FfihWRRenmgg0dE6dpQbzzcvVGxodLkUPiWpKUE%2BLv5IiiWwMRH8JtHFB3BKPl%2B936ZxsCyr8a2g5x29DyHqK%2FXsaKKsrn%2F%2B%2F2fQ0SzPmqIlbkXxR5HK%2Frp5dl7QO%2Fv30qno%2FBffEzNoxqw7WtVrpDCcHYHkSfja6Z6FGdNct6copIdWoMKq687kGOrIBGDmT9sBZhpvZ6i6MxU6SIg9XleS6iefUMx6HSruHRyC9b6%2BhSY7By7IEHKF8XQ5ZSQQca2XV4L0zd6uEnmDMraFTr%2FfKnfsoEV9nG8dswAaWuzOFobks7PG7lRDAbTEtWZVQbCTsoMBkA4wEVhDymVRoVI3N%2BTWZhFFZRJuL%2BaYV6%2Bz3ueLjgFPHJQABdpTzXebI%2B26TpeAes2xVo%2B1sViJo7gKO%2BCGP3CerL50h1Z7BRw%3D%3D&amp;X-Amz-Algorithm=AWS4-HMAC-SHA256&amp;X-Amz-Date=20241119T193009Z&amp;X-Amz-SignedHeaders=host&amp;X-Amz-Expires=300&amp;X-Amz-Credential=ASIAUPUUPRWE6IWZOOPX%2F20241119%2Fus-east-1%2Fs3%2Faws4_request&amp;X-Amz-Signature=e74a64fd89c4f7620828c712083958765a6f5fabd8e23674e9969cd95f5ff971&amp;abstractId=5001783">Choosing trading strategies in electronic execution using
importance sampling</a>. I’ve only encountered sampling in a statistical computing course during my PhD, and I had never strayed away from Monte Carlo sampling, but this practical example provided an intuitive understanding of its importance and utility.</p>


<p>The key tenet of the paper is to use the data you have to evaluate a strategy you are considering without actually running the new strategy in production. In real life, changing something like these strategies can take a long time, with limited upside but unlimited downside if it all goes wrong.</p>

<p>This blog post will run through the paper and replicate the main themes in Julia. I believe the author is a Julia user too; I remember enjoying their JuliaCon talk about high-frequency covariance matrices, <a href="https://www.youtube.com/watch?v=X_TCI02rgu0">HighFrequencyCovariance: Estimating Covariance Matrices in Julia</a>, and the associated Julia package <a href="https://github.com/s-baumann/HighFrequencyCovariance.jl">HighFrequencyCovariance.jl</a>.</p>

<h2 id="the-execution-traders-problem">The Execution Trader’s Problem</h2>

<p>You are an execution trader with access to 4 different broker algorithms (algos) to execute your trade. With each trade you need to choose an algo and measure the trade’s overall slippage - the price you paid vs the price at the start of the order. You want to choose the best algo to ensure each of your trades gets the best price.</p>

<p>How do you choose which one to use? Do you have enough data to decide which one is best? Is any one algo better than the others? These are all difficult questions to answer, but with some data on how the algos perform you should be able to make a more informed decision.</p>

<p>We are trying to maximise the performance of each trade by choosing the correct algo. Our trade is described by a variable \(x\) and each algo performs differently depending on \(x\). The paper calls the performance ‘slippage’ but then tries to maximise the slippage which sounds weird to me - I always talk about minimising slippage! But that’s splitting hairs.</p>

<p>The performance of algo \(i\) is described by an analytical function with parameters \(\alpha _i, \beta _i\) plus some noise that depends on the duration of the trade \(d\) and the volatility \(\sigma\).</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">function</span><span class="nf"> expSlippage</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="n">alpha</span><span class="x">,</span> <span class="n">beta</span><span class="x">)</span>
   <span class="nd">@.</span> <span class="o">-</span><span class="n">alpha</span><span class="o">*</span><span class="x">(</span><span class="n">x</span> <span class="o">-</span> <span class="n">beta</span><span class="x">)</span><span class="o">^</span><span class="mi">2</span> 
<span class="k">end</span>

<span class="k">function</span><span class="nf"> slippage</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="n">alpha</span><span class="x">,</span> <span class="n">beta</span><span class="x">,</span> <span class="n">d</span><span class="x">,</span> <span class="n">sigma</span><span class="x">)</span>
    <span class="n">expSlippage</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="n">alpha</span><span class="x">,</span> <span class="n">beta</span><span class="x">)</span> <span class="o">+</span> <span class="n">rand</span><span class="x">(</span><span class="n">Normal</span><span class="x">(</span><span class="mi">0</span><span class="x">,</span> <span class="n">d</span><span class="o">*</span><span class="n">sigma</span><span class="o">/</span><span class="mi">2</span><span class="x">))</span>
<span class="k">end</span>
</code></pre></div></div>
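
<p>Written out as an equation, the two functions above correspond to the following sketch. The symbol \(\epsilon\) for the noise term is notation assumed here rather than taken from the paper. The second argument of <code class="language-plaintext highlighter-rouge">Normal</code> in Distributions.jl is a standard deviation, so the noise has standard deviation \(d \sigma / 2\).</p>

\[\text{Slippage}_i(x) = -\alpha _i (x - \beta _i)^2 + \epsilon, \qquad \epsilon \sim \mathcal{N}\left(0, \left(\frac{d \sigma}{2}\right)^2\right)\]

<p>The expected slippage of algo \(i\) is therefore an upside-down parabola that peaks at zero when \(x = \beta _i\), with \(\alpha _i\) controlling how quickly performance falls away either side of that point.</p>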

<p>The \(\alpha\)’s and \(\beta\)’s are simple constants set in the paper.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">alphas</span> <span class="o">=</span> <span class="x">[</span><span class="mi">5</span><span class="x">,</span><span class="mi">10</span><span class="x">,</span><span class="mi">15</span><span class="x">,</span><span class="mi">20</span><span class="x">]</span>
<span class="n">betas</span> <span class="o">=</span> <span class="x">[</span><span class="mf">0.2</span><span class="x">,</span> <span class="mf">0.4</span><span class="x">,</span> <span class="mf">0.6</span><span class="x">,</span> <span class="mf">0.8</span><span class="x">]</span>

<span class="n">x</span> <span class="o">=</span> <span class="n">collect</span><span class="x">(</span><span class="mi">0</span><span class="o">:</span><span class="mf">0.01</span><span class="o">:</span><span class="mi">1</span><span class="x">)</span>
<span class="n">p</span> <span class="o">=</span> <span class="n">plot</span><span class="x">(</span><span class="n">xlabel</span> <span class="o">=</span> <span class="s">"x"</span><span class="x">,</span> <span class="n">ylabel</span> <span class="o">=</span> <span class="s">"Expected Slippage"</span><span class="x">)</span>
<span class="k">for</span> <span class="n">i</span> <span class="k">in</span> <span class="n">eachindex</span><span class="x">(</span><span class="n">alphas</span><span class="x">)</span>
   <span class="n">plot!</span><span class="x">(</span><span class="n">p</span><span class="x">,</span> <span class="n">x</span><span class="x">,</span> <span class="n">expSlippage</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="n">alphas</span><span class="x">[</span><span class="n">i</span><span class="x">],</span> <span class="n">betas</span><span class="x">[</span><span class="n">i</span><span class="x">]),</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Algo "</span> <span class="o">*</span> <span class="n">string</span><span class="x">(</span><span class="n">i</span><span class="x">),</span> <span class="n">lw</span> <span class="o">=</span> <span class="mi">2</span><span class="x">)</span> 
<span class="k">end</span>
<span class="n">p</span>
</code></pre></div></div>

<p><img src="/assets/importancesampling/slippage_functions.png" alt="Slippage functions" title="Slippage functions" width="80%" class="center-image" /></p>

<p>Here we can see which algo is best for each \(x\). In reality, these curves are impossible to know, and such a clean relationship might not even exist.</p>

<p>We are going to devise a rule for when we will select each trading algo:</p>

<ul>
  <li>
    <p>If \(x&lt;0.5\) then we will randomly select Algo 1 62.5% of the time and each of the other three 12.5% of the time.</p>
  </li>
  <li>
    <p>If \(x \geq 0.5\) then we will select Algo 3 62.5% of the time and each of the other three 12.5% of the time.</p>
  </li>
</ul>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">function</span><span class="nf"> tradingRule</span><span class="x">(</span><span class="n">x</span><span class="x">)</span>
    <span class="k">if</span> <span class="n">x</span> <span class="o">&lt;</span> <span class="mf">0.5</span>
        <span class="k">return</span> <span class="x">[</span><span class="mf">0.625</span><span class="x">,</span> <span class="mf">0.125</span><span class="x">,</span> <span class="mf">0.125</span><span class="x">,</span> <span class="mf">0.125</span><span class="x">]</span>
    <span class="k">else</span> 
        <span class="k">return</span> <span class="x">[</span><span class="mf">0.125</span><span class="x">,</span> <span class="mf">0.125</span><span class="x">,</span> <span class="mf">0.625</span><span class="x">,</span> <span class="mf">0.125</span><span class="x">]</span>
    <span class="k">end</span>
<span class="k">end</span>
</code></pre></div></div>

<p>Julia’s broadcasting (the dot syntax) makes it easy to simulate many trades at once.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">x</span> <span class="o">=</span> <span class="n">rand</span><span class="x">(</span><span class="n">Uniform</span><span class="x">(),</span> <span class="mi">100</span><span class="x">)</span>
<span class="n">d</span> <span class="o">=</span> <span class="n">rand</span><span class="x">(</span><span class="n">Uniform</span><span class="x">(),</span> <span class="mi">100</span><span class="x">)</span>
<span class="n">stratProbs</span> <span class="o">=</span> <span class="n">tradingRule</span><span class="o">.</span><span class="x">(</span><span class="n">x</span><span class="x">)</span>
<span class="n">strat</span> <span class="o">=</span> <span class="n">rand</span><span class="o">.</span><span class="x">(</span><span class="n">Categorical</span><span class="o">.</span><span class="x">(</span><span class="n">stratProbs</span><span class="x">))</span>
<span class="n">stratProb</span> <span class="o">=</span> <span class="n">getindex</span><span class="o">.</span><span class="x">(</span><span class="n">stratProbs</span><span class="x">,</span> <span class="n">strat</span><span class="x">)</span>
<span class="n">slippageVal</span> <span class="o">=</span> <span class="n">slippage</span><span class="o">.</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="n">alphas</span><span class="x">[</span><span class="n">strat</span><span class="x">],</span> <span class="n">betas</span><span class="x">[</span><span class="n">strat</span><span class="x">],</span> <span class="n">d</span><span class="x">,</span> <span class="mi">5</span><span class="x">)</span>

<span class="n">res</span> <span class="o">=</span> <span class="n">DataFrame</span><span class="x">(</span><span class="n">x</span><span class="o">=</span><span class="n">x</span><span class="x">,</span> <span class="n">d</span><span class="o">=</span><span class="n">d</span><span class="x">,</span> <span class="n">strat</span><span class="o">=</span><span class="n">strat</span><span class="x">,</span> <span class="n">stratProb</span><span class="o">=</span><span class="n">stratProb</span><span class="x">,</span> <span class="n">prob</span><span class="o">=</span><span class="n">stratProb</span><span class="x">,</span> <span class="n">slippage</span><span class="o">=</span><span class="n">slippageVal</span><span class="x">)</span>
<span class="n">first</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="mi">3</span><span class="x">)</span>
</code></pre></div></div>

<table>
  <thead>
    <tr>
      <th style="text-align: right"><strong>x</strong></th>
      <th style="text-align: right"><strong>d</strong></th>
      <th style="text-align: right"><strong>strat</strong></th>
      <th style="text-align: right"><strong>stratProb</strong></th>
      <th style="text-align: right"><strong>prob</strong></th>
      <th style="text-align: right"><strong>slippage</strong></th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="text-align: right">0.0192748</td>
      <td style="text-align: right">0.95432</td>
      <td style="text-align: right">1</td>
      <td style="text-align: right">0.625</td>
      <td style="text-align: right">0.625</td>
      <td style="text-align: right">1.29969</td>
    </tr>
    <tr>
      <td style="text-align: right">0.0700494</td>
      <td style="text-align: right">0.930581</td>
      <td style="text-align: right">1</td>
      <td style="text-align: right">0.625</td>
      <td style="text-align: right">0.625</td>
      <td style="text-align: right">0.855019</td>
    </tr>
    <tr>
      <td style="text-align: right">0.925858</td>
      <td style="text-align: right">0.90087</td>
      <td style="text-align: right">3</td>
      <td style="text-align: right">0.625</td>
      <td style="text-align: right">0.625</td>
      <td style="text-align: right">-2.62943</td>
    </tr>
  </tbody>
</table>

<p>This is our ‘production data’ for 100 random trades. The aim of the game is to understand how good our trading rules are rather than trying to estimate how good the individual algos are.</p>

<p>Does our rule above do better than just randomly choosing an algo? This is where we can use importance sampling to take the 100 trades and specially weight them to assess a new trading rule.</p>

<h2 id="importance-sampling">Importance Sampling</h2>

<p>Importance sampling takes observations of a variable that were generated under one set of probabilities \(q\) and reweights them to estimate an expectation under a different set of probabilities \(p\). In our case we want to calculate the expected slippage of a new trading strategy given the observations we have from the current strategy.</p>

\[\mathbb{E} [\text{Slippage}] = \frac{1}{N} \sum _i \text{Slippage}_i \frac{p_i(\text{New Strategy})}{q_i(\text{Current Strategy})}\]

<p>\(q_i(\text{Current Strategy})\) is equal to the <code class="language-plaintext highlighter-rouge">stratProb</code> column in the dataframe and \(p_i\) is the probability we would have chosen the given algo under the new strategy.</p>
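
<p>To see why the reweighting works, here is a short derivation sketch. The notation is an assumed shorthand, not taken from the paper: \(a\) is the algo chosen for a trade, \(S(a)\) is its slippage, and we assume \(q(a) &gt; 0\) whenever \(p(a) &gt; 0\).</p>

\[\mathbb{E}_p [S] = \sum _a p(a) S(a) = \sum _a q(a) \frac{p(a)}{q(a)} S(a) = \mathbb{E}_q \left[ \frac{p}{q} S \right]\]

<p>Averaging \(\text{Slippage}_i \, p_i / q_i\) over trades that were generated under \(q\) therefore estimates the expected slippage under \(p\).</p>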

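<p>For the importance sampling, we calculate the likelihood ratio of the equal-probability rule (each algo chosen with probability 0.25) to the production rule and then take the weighted average of the slippages. The weighted average divides by the sum of the ratios rather than \(N\), which is the self-normalised version of the estimator above.</p>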

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">res</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="n">EqProb</span> <span class="o">=</span> <span class="mf">0.25</span><span class="x">)</span>
<span class="n">res</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="n">ratio</span> <span class="o">=</span> <span class="o">:</span><span class="n">EqProb</span> <span class="o">./</span> <span class="o">:</span><span class="n">stratProb</span><span class="x">)</span>
<span class="nd">@combine</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="n">StratSlippage</span> <span class="o">=</span> <span class="n">mean</span><span class="x">(</span><span class="o">:</span><span class="n">slippage</span><span class="x">),</span> <span class="o">:</span><span class="n">EqStratSlippage</span> <span class="o">=</span> <span class="n">mean</span><span class="x">(</span><span class="o">:</span><span class="n">slippage</span><span class="x">,</span> <span class="n">Weights</span><span class="x">(</span><span class="o">:</span><span class="n">ratio</span><span class="x">)))</span>
</code></pre></div></div>

<table>
  <thead>
    <tr>
      <th style="text-align: right"><strong>StratSlippage</strong></th>
      <th style="text-align: right"><strong>EqStratSlippage</strong></th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="text-align: right">-1.02243</td>
      <td style="text-align: right">-1.8774</td>
    </tr>
  </tbody>
</table>

<p>The importance-weighted average slippage of the equal-probability rule over the 100 trades is worse (more negative) than that of the current strategy: -1.88 versus -1.02. This suggests that choosing an algo uniformly at random would perform <em>worse</em>.</p>
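
<p>To make the weighting concrete, here is a minimal sketch on four made-up trades (the numbers are hypothetical and not taken from the simulation). It contrasts the plain \(1/N\) estimator from the formula with the self-normalised version that a weighted mean computes.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using Statistics

# Hypothetical toy data: the production probability q of the algo that was
# actually chosen for each trade, and the slippage that was observed.
q = [0.625, 0.625, 0.125, 0.125]
slippage = [-0.5, -1.0, -3.0, -2.0]

# New rule under evaluation: every algo is equally likely.
p = fill(0.25, length(q))

# Likelihood ratio p/q: below 1 for trades the production rule favoured,
# above 1 for trades it rarely made.
ratio = p ./ q

# Plain estimator from the formula: divide by the number of trades.
plainEstimate = mean(slippage .* ratio)

# Self-normalised estimator: divide by the sum of the ratios instead.
selfNormalised = sum(slippage .* ratio) / sum(ratio)

println((plainEstimate, selfNormalised))
</code></pre></div></div>

<p>Trades where the production rule picked its favoured algo get a ratio of 0.25 / 0.625 = 0.4, while the rarely chosen algos get 0.25 / 0.125 = 2, so the rare choices count for more when we ask how the equal-probability rule would have done.</p>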

<p>We can then plot the running average slippage across the orders.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">res</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="n">StratSlipapgeRolling</span> <span class="o">=</span> <span class="n">cumsum</span><span class="x">(</span><span class="o">:</span><span class="n">slippage</span><span class="x">)</span> <span class="o">./</span><span class="n">collect</span><span class="x">(</span><span class="mi">1</span><span class="o">:</span><span class="n">length</span><span class="x">(</span><span class="o">:</span><span class="n">slippage</span><span class="x">)))</span>
<span class="n">res</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="n">EqSlipapgeRolling</span> <span class="o">=</span> <span class="n">cumsum</span><span class="x">(</span><span class="o">:</span><span class="n">slippage</span> <span class="o">.*</span> <span class="o">:</span><span class="n">ratio</span><span class="x">)</span> <span class="o">./</span><span class="n">cumsum</span><span class="x">(</span><span class="o">:</span><span class="n">ratio</span><span class="x">))</span>

<span class="n">plot</span><span class="x">(</span><span class="n">res</span><span class="o">.</span><span class="n">StratSlipapgeRolling</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Production"</span><span class="x">,</span> <span class="n">lw</span> <span class="o">=</span><span class="mi">2</span><span class="x">)</span>
<span class="n">plot!</span><span class="x">(</span><span class="n">res</span><span class="o">.</span><span class="n">EqSlipapgeRolling</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Equal Weighted"</span><span class="x">,</span> <span class="n">lw</span> <span class="o">=</span><span class="mi">2</span><span class="x">)</span>
</code></pre></div></div>

<p><img src="/assets/importancesampling/simplestrat.png" alt="Simple strategy slippage" title="Simple strategy slippage" width="80%" class="center-image" /></p>

<p>The time series of the running average slippage shows that the equally weighted strategy is worse, which gives us confidence in the current strategy. Each observed outcome is weighted by its likelihood ratio, that is, by how much more or less likely the new strategy would have been to pick that algo than the production strategy was.</p>

<p>How can we use importance sampling to build better strategies?</p>

<h2 id="easy-reinforcement-learning-and-expected-slippage">Easy Reinforcement Learning and Expected Slippage</h2>

<p>Each trade is described by \(x\). In this toy model that is just a number but in real life this could correspond to the size of the order, the asset, the time of day and any combination of variables. In the original paper they use the spread, volatility, order size relative to the ADV and duration as descriptive variables of a random dataset. I’m going to keep it simple and stick to \(x\) being just a single number.</p>

<p>We want to understand if a particular \(x\) means we should use algo \(i\). For this, we need to build an ‘expected slippage’ model where we use the historical \(x\) values and outcomes of using algo \(i\).</p>

<p>For the modelling part, we will use <code class="language-plaintext highlighter-rouge">xgboost</code> through <code class="language-plaintext highlighter-rouge">MLJ.jl</code>.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">using</span> <span class="n">MLJ</span>
<span class="n">xgboostModel</span> <span class="o">=</span> <span class="nd">@load</span> <span class="n">XGBoostRegressor</span> <span class="n">pkg</span><span class="o">=</span><span class="n">XGBoost</span> <span class="n">verbosity</span> <span class="o">=</span> <span class="mi">0</span>
<span class="n">xgboostmodel</span> <span class="o">=</span> <span class="n">xgboostModel</span><span class="x">(</span><span class="n">eval_metric</span><span class="o">=</span><span class="x">[</span><span class="s">"rmse"</span><span class="x">]);</span>
</code></pre></div></div>

<p>The inputs are \(x\) and an indicator of the chosen algo.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">res2</span> <span class="o">=</span> <span class="n">coerce</span><span class="x">(</span><span class="n">res</span><span class="x">[</span><span class="o">:</span><span class="x">,[</span><span class="o">:</span><span class="n">x</span><span class="x">,</span> <span class="o">:</span><span class="n">strat</span><span class="x">,</span> <span class="o">:</span><span class="n">slippage</span><span class="x">]],</span> <span class="o">:</span><span class="n">strat</span><span class="o">=&gt;</span><span class="n">Multiclass</span><span class="x">);</span>

<span class="n">y</span><span class="x">,</span> <span class="n">X</span> <span class="o">=</span> <span class="n">unpack</span><span class="x">(</span><span class="n">res2</span><span class="x">,</span> <span class="o">==</span><span class="x">(</span><span class="o">:</span><span class="n">slippage</span><span class="x">);</span> <span class="n">rng</span><span class="o">=</span><span class="mi">123</span><span class="x">);</span>

<span class="n">encoder</span> <span class="o">=</span> <span class="n">ContinuousEncoder</span><span class="x">()</span>
<span class="n">encMach</span> <span class="o">=</span> <span class="n">machine</span><span class="x">(</span><span class="n">encoder</span><span class="x">,</span> <span class="n">X</span><span class="x">)</span> <span class="o">|&gt;</span> <span class="n">fit!</span>
<span class="n">X_encoded</span> <span class="o">=</span> <span class="n">MLJ</span><span class="o">.</span><span class="n">transform</span><span class="x">(</span><span class="n">encMach</span><span class="x">,</span> <span class="n">X</span><span class="x">);</span>

<span class="n">xgbMachine</span> <span class="o">=</span> <span class="n">machine</span><span class="x">(</span><span class="n">xgboostmodel</span><span class="x">,</span> <span class="n">X_encoded</span><span class="x">,</span> <span class="n">y</span><span class="x">)</span>

<span class="n">evaluate!</span><span class="x">(</span><span class="n">xgbMachine</span><span class="x">,</span>
          <span class="n">resampling</span><span class="o">=</span><span class="n">CV</span><span class="x">(</span><span class="n">nfolds</span> <span class="o">=</span> <span class="mi">6</span><span class="x">,</span> <span class="n">shuffle</span><span class="o">=</span><span class="nb">true</span><span class="x">),</span>
          <span class="n">measures</span><span class="o">=</span><span class="x">[</span><span class="n">rmse</span><span class="x">,</span> <span class="n">rsq</span><span class="x">],</span>
          <span class="n">verbosity</span><span class="o">=</span><span class="mi">0</span><span class="x">)</span>
</code></pre></div></div>
<p>The regression gets a cross-validated \(R^2\) of 0.5 on our 100-trade dataset - a decent model.</p>

<p>In this new simulation, we will fit the xgboost model on the trades to build up an expected slippage model with all the data we have so far. <code class="language-plaintext highlighter-rouge">prepareData</code> and <code class="language-plaintext highlighter-rouge">fitSlippage</code> transform the data and fit the model.</p>

<p>We will then use this model to predict the expected slippage (<code class="language-plaintext highlighter-rouge">predictSlippage</code>) for each algo and use that to select which algo to use for a given trade.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">function</span><span class="nf"> prepareData</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="n">strat</span><span class="x">,</span> <span class="n">slippage</span><span class="x">)</span>
    <span class="n">res</span> <span class="o">=</span> <span class="n">coerce</span><span class="x">(</span><span class="n">DataFrame</span><span class="x">(</span><span class="n">x</span><span class="o">=</span><span class="n">x</span><span class="x">,</span> <span class="n">strat</span><span class="o">=</span><span class="n">strat</span><span class="x">,</span> <span class="n">slippage</span><span class="o">=</span><span class="n">slippage</span><span class="x">),</span> <span class="o">:</span><span class="n">strat</span><span class="o">=&gt;</span><span class="n">Multiclass</span><span class="x">);</span>
    <span class="n">y</span><span class="x">,</span> <span class="n">X</span> <span class="o">=</span> <span class="n">unpack</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">==</span><span class="x">(</span><span class="o">:</span><span class="n">slippage</span><span class="x">);</span> <span class="n">rng</span><span class="o">=</span><span class="mi">123</span><span class="x">);</span>
    <span class="n">encoder</span> <span class="o">=</span> <span class="n">ContinuousEncoder</span><span class="x">()</span>
    <span class="n">encMach</span> <span class="o">=</span> <span class="n">machine</span><span class="x">(</span><span class="n">encoder</span><span class="x">,</span> <span class="n">X</span><span class="x">)</span> <span class="o">|&gt;</span> <span class="n">fit!</span>
    <span class="n">X_encoded</span> <span class="o">=</span> <span class="n">MLJ</span><span class="o">.</span><span class="n">transform</span><span class="x">(</span><span class="n">encMach</span><span class="x">,</span> <span class="n">X</span><span class="x">);</span>
    <span class="c"># Also return the fitted encoder so it can be reused at prediction time</span>
    <span class="k">return</span> <span class="n">X_encoded</span><span class="x">,</span> <span class="n">y</span><span class="x">,</span> <span class="n">encMach</span>
<span class="k">end</span>

<span class="k">function</span><span class="nf"> fitSlippage</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="n">strat</span><span class="x">,</span> <span class="n">slippage</span><span class="x">,</span> <span class="n">xgboostmodel</span><span class="x">)</span>
    <span class="n">X_encoded</span><span class="x">,</span> <span class="n">y</span><span class="x">,</span> <span class="n">encMach</span> <span class="o">=</span> <span class="n">prepareData</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="n">strat</span><span class="x">,</span> <span class="n">slippage</span><span class="x">)</span>
    <span class="n">xgbMachine</span> <span class="o">=</span> <span class="n">machine</span><span class="x">(</span><span class="n">xgboostmodel</span><span class="x">,</span> <span class="n">X_encoded</span><span class="x">,</span> <span class="n">y</span><span class="x">)</span>

    <span class="n">evaluate!</span><span class="x">(</span><span class="n">xgbMachine</span><span class="x">,</span>
          <span class="n">resampling</span><span class="o">=</span><span class="n">CV</span><span class="x">(</span><span class="n">nfolds</span> <span class="o">=</span> <span class="mi">6</span><span class="x">,</span> <span class="n">shuffle</span><span class="o">=</span><span class="nb">true</span><span class="x">),</span>
          <span class="n">measures</span><span class="o">=</span><span class="x">[</span><span class="n">rmse</span><span class="x">,</span> <span class="n">rsq</span><span class="x">],</span>
          <span class="n">verbosity</span><span class="o">=</span><span class="mi">0</span><span class="x">)</span>
    <span class="k">return</span> <span class="x">(</span><span class="n">xgbMachine</span><span class="x">,</span> <span class="n">encMach</span><span class="x">)</span>
<span class="k">end</span>

<span class="k">function</span><span class="nf"> predictSlippage</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="n">xgbMachine</span><span class="x">,</span> <span class="n">encMachine</span><span class="x">)</span>
    <span class="n">X_pred</span> <span class="o">=</span> <span class="n">DataFrame</span><span class="x">(</span><span class="n">x</span><span class="o">=</span><span class="n">x</span><span class="x">,</span> <span class="n">strat</span> <span class="o">=</span> <span class="x">[</span><span class="mi">1</span><span class="x">,</span><span class="mi">2</span><span class="x">,</span><span class="mi">3</span><span class="x">,</span><span class="mi">4</span><span class="x">],</span> <span class="n">slippage</span> <span class="o">=</span> <span class="nb">NaN</span><span class="x">)</span>
    <span class="n">X_pred</span> <span class="o">=</span> <span class="n">coerce</span><span class="x">(</span><span class="n">X_pred</span><span class="x">[</span><span class="o">:</span><span class="x">,[</span><span class="o">:</span><span class="n">x</span><span class="x">,</span> <span class="o">:</span><span class="n">strat</span><span class="x">,</span> <span class="o">:</span><span class="n">slippage</span><span class="x">]],</span> <span class="o">:</span><span class="n">strat</span><span class="o">=&gt;</span><span class="n">Multiclass</span><span class="x">)</span>
    <span class="c"># Use the encoder passed in as an argument rather than a global</span>
    <span class="n">X_pred</span> <span class="o">=</span> <span class="n">MLJ</span><span class="o">.</span><span class="n">transform</span><span class="x">(</span><span class="n">encMachine</span><span class="x">,</span> <span class="n">X_pred</span><span class="x">)</span>
    <span class="n">preds</span> <span class="o">=</span> <span class="n">MLJ</span><span class="o">.</span><span class="n">predict</span><span class="x">(</span><span class="n">xgbMachine</span><span class="x">,</span> <span class="n">X_pred</span><span class="x">)</span>
    <span class="k">return</span><span class="x">(</span><span class="n">preds</span><span class="x">)</span>
<span class="k">end</span>

<span class="k">function</span><span class="nf"> slippageToProb</span><span class="x">(</span><span class="n">preds</span><span class="x">)</span>
    <span class="n">scores</span> <span class="o">=</span> <span class="n">exp</span><span class="o">.</span><span class="x">(</span><span class="n">preds</span><span class="x">)</span> <span class="o">./</span> <span class="n">sum</span><span class="x">(</span><span class="n">exp</span><span class="o">.</span><span class="x">(</span><span class="n">preds</span><span class="x">))</span>
    <span class="n">p</span> <span class="o">=</span> <span class="x">((</span><span class="mf">0.9</span> <span class="o">.*</span> <span class="n">scores</span><span class="x">)</span> <span class="o">.+</span> <span class="mf">0.025</span><span class="x">)</span> <span class="o">./</span> <span class="n">sum</span><span class="x">((</span><span class="mf">0.9</span> <span class="o">.*</span> <span class="n">scores</span><span class="x">)</span> <span class="o">.+</span> <span class="mf">0.025</span><span class="x">)</span> 
    <span class="k">return</span> <span class="n">p</span>
<span class="k">end</span>
</code></pre></div></div>

<p>The predicted slippage is then transformed into a probability using the softmax function (<code class="language-plaintext highlighter-rouge">slippageToProb</code>), which maps the real-valued estimated slippages onto probabilities. The softmax output is scaled by 0.9 and 0.025 is added to each entry, so with four algos the result still sums to one and every algo keeps at least a 2.5% chance of being picked. We then sample which algo to use from these probabilities. Because every algo is chosen with a known, non-zero probability, we can later use the importance sampling framework to evaluate changes to either the model (xgboost to something else) or how we build the probabilities (softmax to something else).</p>
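
<p>Here is a self-contained sketch of that mapping using only base Julia. The function name <code class="language-plaintext highlighter-rouge">softmaxWithFloor</code>, its keyword names and the predicted slippages are assumptions for illustration; the constants 0.9 and 0.025 are the ones used in <code class="language-plaintext highlighter-rouge">slippageToProb</code>.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Softmax of the predictions, shrunk towards uniform so that every algo
# keeps at least `minProb` probability of being selected.
# With four algos, mix + 4 * minProb = 1, so the result sums to one.
function softmaxWithFloor(preds; mix = 0.9, minProb = 0.025)
    scores = exp.(preds) ./ sum(exp.(preds))
    return mix .* scores .+ minProb
end

preds = [-0.2, -0.5, -1.5, -3.0]  # hypothetical predicted slippages, one per algo
p = softmaxWithFloor(preds)

println(sum(p))              # total probability
println(all(p .&gt;= 0.025))    # every algo can still be picked
println(argmax(p))           # algo with the best (least negative) prediction
</code></pre></div></div>

<p>The floor matters for importance sampling: the likelihood ratio divides by the production probability, so an algo that could never be picked would leave us with no data to evaluate a rule that wants to use it.</p>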

<p>To simulate the problem we will start by choosing an algo uniformly at random for the first 200 trades. After this we will refit the xgboost regression model on the trades seen so far, predict the expected slippage of each algo, and use this to decide which algo to use.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">epsilon</span> <span class="o">=</span> <span class="mf">0.05</span>
<span class="n">volatility</span> <span class="o">=</span> <span class="mi">5</span>
<span class="n">N</span> <span class="o">=</span> <span class="mi">1000</span>

<span class="n">x</span> <span class="o">=</span> <span class="n">zeros</span><span class="x">(</span><span class="n">N</span><span class="x">)</span>
<span class="n">strat</span> <span class="o">=</span> <span class="n">zeros</span><span class="x">(</span><span class="n">N</span><span class="x">)</span>
<span class="n">slippages</span> <span class="o">=</span> <span class="n">zeros</span><span class="x">(</span><span class="n">N</span><span class="x">)</span>
<span class="n">d</span> <span class="o">=</span> <span class="n">zeros</span><span class="x">(</span><span class="n">N</span><span class="x">)</span>
<span class="n">stratProb</span> <span class="o">=</span> <span class="n">zeros</span><span class="x">(</span><span class="n">N</span><span class="x">)</span>

<span class="k">for</span> <span class="n">i</span> <span class="k">in</span> <span class="mi">1</span><span class="o">:</span><span class="n">N</span>
    <span class="n">xVal</span> <span class="o">=</span> <span class="n">rand</span><span class="x">(</span><span class="n">Uniform</span><span class="x">())</span>
    <span class="n">dVal</span> <span class="o">=</span> <span class="n">rand</span><span class="x">(</span><span class="n">Uniform</span><span class="x">())</span>

    <span class="k">if</span> <span class="n">i</span> <span class="o">&gt;</span> <span class="mi">200</span>
        <span class="c"># Fit on completed trades only: entry i has not been filled in yet</span>
        <span class="n">xgbMachine</span><span class="x">,</span> <span class="n">encMachine</span> <span class="o">=</span> <span class="n">fitSlippage</span><span class="x">(</span><span class="n">x</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="x">(</span><span class="n">i</span><span class="o">-</span><span class="mi">1</span><span class="x">)],</span> <span class="n">strat</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="x">(</span><span class="n">i</span><span class="o">-</span><span class="mi">1</span><span class="x">)],</span> <span class="n">slippages</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="x">(</span><span class="n">i</span><span class="o">-</span><span class="mi">1</span><span class="x">)],</span> <span class="n">xgboostmodel</span><span class="x">)</span>
        <span class="n">predCost</span> <span class="o">=</span> <span class="n">predictSlippage</span><span class="x">(</span><span class="n">xVal</span><span class="x">,</span> <span class="n">xgbMachine</span><span class="x">,</span> <span class="n">encMachine</span><span class="x">)</span>
        <span class="n">stratProbs</span> <span class="o">=</span> <span class="n">slippageToProb</span><span class="x">(</span><span class="n">predCost</span><span class="x">)</span>
    <span class="k">else</span>
        <span class="n">stratProbs</span> <span class="o">=</span> <span class="x">[</span><span class="mf">0.25</span><span class="x">,</span> <span class="mf">0.25</span><span class="x">,</span> <span class="mf">0.25</span><span class="x">,</span> <span class="mf">0.25</span><span class="x">]</span>
    <span class="k">end</span>

    <span class="n">stratVal</span> <span class="o">=</span> <span class="n">rand</span><span class="x">(</span><span class="n">Categorical</span><span class="x">(</span><span class="n">stratProbs</span><span class="x">))</span>
    <span class="n">slippageVal</span> <span class="o">=</span> <span class="n">slippage</span><span class="x">(</span><span class="n">xVal</span><span class="x">,</span> <span class="n">alphas</span><span class="x">[</span><span class="n">stratVal</span><span class="x">],</span> <span class="n">betas</span><span class="x">[</span><span class="n">stratVal</span><span class="x">],</span> <span class="n">dVal</span><span class="x">,</span> <span class="n">volatility</span><span class="x">)</span>
    
    <span class="n">x</span><span class="x">[</span><span class="n">i</span><span class="x">]</span> <span class="o">=</span> <span class="n">xVal</span>
    <span class="n">strat</span><span class="x">[</span><span class="n">i</span><span class="x">]</span> <span class="o">=</span> <span class="n">stratVal</span>
    <span class="n">stratProb</span><span class="x">[</span><span class="n">i</span><span class="x">]</span> <span class="o">=</span> <span class="n">stratProbs</span><span class="x">[</span><span class="n">stratVal</span><span class="x">]</span>
    <span class="n">slippages</span><span class="x">[</span><span class="n">i</span><span class="x">]</span> <span class="o">=</span> <span class="n">slippageVal</span>
    <span class="n">d</span><span class="x">[</span><span class="n">i</span><span class="x">]</span> <span class="o">=</span> <span class="n">dVal</span>
<span class="k">end</span>

<span class="n">res</span> <span class="o">=</span> <span class="n">DataFrame</span><span class="x">(</span><span class="n">x</span><span class="o">=</span><span class="n">x</span><span class="x">,</span> <span class="n">d</span><span class="o">=</span><span class="n">d</span><span class="x">,</span> <span class="n">strat</span><span class="o">=</span><span class="n">strat</span><span class="x">,</span> <span class="n">stratProb</span><span class="o">=</span><span class="n">stratProb</span><span class="x">,</span> <span class="n">slippage</span><span class="o">=</span><span class="n">slippages</span><span class="x">)</span>
</code></pre></div></div>

<p>Again, we record each strategy and the probability with which it was chosen. We then use importance sampling to estimate the slippage we would have seen had we chosen an algo uniformly at random, which gives us a comparison to the XGBoost method.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">res</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="n">EqProb</span> <span class="o">=</span> <span class="mf">0.25</span><span class="x">)</span>
<span class="n">res</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="n">EqRatio</span> <span class="o">=</span> <span class="o">:</span><span class="n">EqProb</span> <span class="o">./</span> <span class="o">:</span><span class="n">stratProb</span><span class="x">)</span>
<span class="n">res</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="n">StratSlipapgeRolling</span> <span class="o">=</span> <span class="n">cumsum</span><span class="x">(</span><span class="o">:</span><span class="n">slippage</span><span class="x">)</span> <span class="o">./</span><span class="n">collect</span><span class="x">(</span><span class="mi">1</span><span class="o">:</span><span class="n">length</span><span class="x">(</span><span class="o">:</span><span class="n">slippage</span><span class="x">)))</span>
<span class="n">res</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="n">EqSlipapgeRolling</span> <span class="o">=</span> <span class="n">cumsum</span><span class="x">(</span><span class="o">:</span><span class="n">slippage</span> <span class="o">.*</span> <span class="o">:</span><span class="n">EqRatio</span><span class="x">)</span> <span class="o">./</span><span class="n">cumsum</span><span class="x">(</span><span class="o">:</span><span class="n">EqRatio</span><span class="x">));</span>

<span class="n">plot</span><span class="x">(</span><span class="n">res</span><span class="o">.</span><span class="n">StratSlipapgeRolling</span><span class="x">[</span><span class="mi">50</span><span class="o">:</span><span class="k">end</span><span class="x">],</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Production"</span><span class="x">)</span>
<span class="n">plot!</span><span class="x">(</span><span class="n">res</span><span class="o">.</span><span class="n">EqSlipapgeRolling</span><span class="x">[</span><span class="mi">50</span><span class="o">:</span><span class="k">end</span><span class="x">],</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Equal Weighting"</span><span class="x">)</span>
</code></pre></div></div>
<p><img src="/assets/importancesampling/modelslippage.png" alt="model slippage" width="80%" class="center-image" /></p>

<p>For the first 200 trades we are just selecting randomly, so there is no difference in performance. Afterwards, the XGBoost model starts to outperform as it learns which algo is better for each \(x\).
So whilst we have only run the XGBoost model in production, the importance sampling method shows it is doing better than random selection.</p>
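
<p>To see why the reweighting recovers the equal-weighted result, here is a minimal, self-contained sketch on made-up data. The two hypothetical algos, their mean slippages, the noise level and the helper <code class="language-plaintext highlighter-rouge">isEstimate</code> are all assumptions for illustration and are not part of the simulation above.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using Random, Statistics

# Self-normalised importance sampling: reweight outcomes collected under
# probabilities q to estimate the average outcome under probabilities p.
function isEstimate(outcomes, p, q)
    w = p ./ q
    sum(w .* outcomes) / sum(w)
end

Random.seed!(42)
n = 100_000
trueMeans = [-1.0, -2.0]  # assumed mean slippage of two hypothetical algos
qProbs = [0.8, 0.2]       # probabilities used "in production"
pProbs = [0.5, 0.5]       # equal weighting we want to evaluate

# Draw each trade's algo from qProbs via the inverse CDF.
qCdf = cumsum(qProbs)
stratToy = [searchsortedfirst(qCdf, rand()) for _ in 1:n]
slipToy = trueMeans[stratToy] .+ 0.5 .* randn(n)

# Theoretical production mean: 0.8 * -1 + 0.2 * -2 = -1.2
prodMean = mean(slipToy)
# Theoretical equal-weighted mean: 0.5 * -1 + 0.5 * -2 = -1.5
eqMean = isEstimate(slipToy, pProbs[stratToy], qProbs[stratToy])
</code></pre></div></div>

<p>Even though the second algo was only used about 20% of the time, the reweighted average should land close to the equal-weighted value without placing a single extra trade.</p>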

<h2 id="testing-a-new-model-without-running-it-in-production">Testing a New Model Without Running it in Production</h2>

<p>The XGBoost model is doing well and outperforming an equal-weighted model, but what if you wanted to change from XGBoost to something else? How can you build the case that this is something worth doing?</p>

<p>By constructing the probabilities with which the new model would have selected each strategy (new \(p_i\)’s) and combining them with the probabilities actually used in production (\(q_i\)’s), we can estimate the slippage of the new model without having to run any more trades.</p>
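
<p>Written out, the self-normalised estimate that the cumulative-sum code in this post computes looks roughly like this, where \(s_i\) is my own notation for the observed slippage of trade \(i\) and \(n\) is the number of trades so far:</p>

\[\hat{s}_{\text{new}} = \frac{\sum_{i=1}^{n} \frac{p_i}{q_i} s_i}{\sum_{i=1}^{n} \frac{p_i}{q_i}}\]

<p>A trade is up-weighted when the new model would have picked its strategy more often than production did (\(p_i &gt; q_i\)) and down-weighted otherwise.</p>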

<p>With <code class="language-plaintext highlighter-rouge">MLJ.jl</code> we can create a new model and pass it into the functions to replicate running the strategy in production. This time we use a simple linear regression model with the same features. We run through the trades in the same order so there is no information leakage.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">@load</span> <span class="n">LinearRegressor</span> <span class="n">pkg</span><span class="o">=</span><span class="n">MLJLinearModels</span>

<span class="n">linreg</span> <span class="o">=</span> <span class="n">MLJLinearModels</span><span class="o">.</span><span class="n">LinearRegressor</span><span class="x">()</span>

<span class="n">newProb</span> <span class="o">=</span> <span class="n">ones</span><span class="x">(</span><span class="n">N</span><span class="x">)</span> <span class="o">*</span> <span class="mf">0.25</span>

<span class="k">for</span> <span class="n">i</span> <span class="k">in</span> <span class="mi">1</span><span class="o">:</span><span class="x">(</span><span class="n">N</span><span class="o">-</span><span class="mi">1</span><span class="x">)</span>

    <span class="k">if</span> <span class="n">i</span> <span class="o">&gt;=</span> <span class="mi">200</span>
        <span class="c"># Fit on trades 1:i, then score trade i+1, the first one the fit has not seen</span>
        <span class="n">linMachine</span><span class="x">,</span> <span class="n">encMachine</span> <span class="o">=</span> <span class="n">fitSlippage</span><span class="x">(</span><span class="n">res</span><span class="o">.</span><span class="n">x</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="n">i</span><span class="x">],</span> <span class="n">res</span><span class="o">.</span><span class="n">strat</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="n">i</span><span class="x">],</span> <span class="n">res</span><span class="o">.</span><span class="n">slippage</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="n">i</span><span class="x">],</span> <span class="n">linreg</span><span class="x">)</span>
        <span class="n">predSlippage</span> <span class="o">=</span> <span class="n">predictSlippage</span><span class="x">(</span><span class="n">res</span><span class="o">.</span><span class="n">x</span><span class="x">[</span><span class="n">i</span><span class="o">+</span><span class="mi">1</span><span class="x">],</span> <span class="n">linMachine</span><span class="x">,</span> <span class="n">encMachine</span><span class="x">)</span>
        <span class="n">stratProbs</span> <span class="o">=</span> <span class="n">slippageToProb</span><span class="x">(</span><span class="n">predSlippage</span><span class="x">)</span>
        <span class="n">newProbVal</span> <span class="o">=</span> <span class="n">stratProbs</span><span class="x">[</span><span class="kt">Int</span><span class="x">(</span><span class="n">res</span><span class="o">.</span><span class="n">strat</span><span class="x">[</span><span class="n">i</span><span class="o">+</span><span class="mi">1</span><span class="x">])]</span>
        <span class="c"># Store against trade i+1 so it lines up with stratProb[i+1] and slippage[i+1]</span>
        <span class="n">newProb</span><span class="x">[</span><span class="n">i</span><span class="o">+</span><span class="mi">1</span><span class="x">]</span> <span class="o">=</span> <span class="n">newProbVal</span>
    <span class="k">end</span>
    
<span class="k">end</span>

<span class="n">res</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">LinearProb</span><span class="x">]</span> <span class="o">=</span> <span class="n">newProb</span>

<span class="n">res</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="n">LinearRatio</span> <span class="o">=</span> <span class="o">:</span><span class="n">LinearProb</span> <span class="o">./</span> <span class="o">:</span><span class="n">stratProb</span><span class="x">)</span>
<span class="n">res</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="n">LinearSlipapgeRolling</span> <span class="o">=</span> <span class="n">cumsum</span><span class="x">(</span><span class="o">:</span><span class="n">slippage</span> <span class="o">.*</span> <span class="o">:</span><span class="n">LinearRatio</span><span class="x">)</span> <span class="o">./</span><span class="n">cumsum</span><span class="x">(</span><span class="o">:</span><span class="n">LinearRatio</span><span class="x">))</span>
<span class="n">plot</span><span class="x">(</span><span class="n">res</span><span class="o">.</span><span class="n">StratSlipapgeRolling</span><span class="x">[</span><span class="mi">50</span><span class="o">:</span><span class="k">end</span><span class="x">],</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Production"</span><span class="x">)</span>
<span class="n">plot!</span><span class="x">(</span><span class="n">res</span><span class="o">.</span><span class="n">EqSlipapgeRolling</span><span class="x">[</span><span class="mi">50</span><span class="o">:</span><span class="k">end</span><span class="x">],</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Equal Weighting"</span><span class="x">)</span>
<span class="n">plot!</span><span class="x">(</span><span class="n">res</span><span class="o">.</span><span class="n">LinearSlipapgeRolling</span><span class="x">[</span><span class="mi">50</span><span class="o">:</span><span class="k">end</span><span class="x">],</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Linear Model"</span><span class="x">)</span>
</code></pre></div></div>

<p><img src="/assets/importancesampling/linreg.png" alt="Linear regression strategy" title="Linear regression strategy" width="80%" class="center-image" /></p>

<p>Adding the linear regression decision rule to the data gives us a way of assessing this new model without having to run it directly in production. We can see that the linear model is better than XGBoost and also better than the equal weighting.</p>

<p>A simple bootstrap, resampling the model-driven trades with replacement and recomputing the average slippage for each strategy 1,000 times, provides the simplest performance measure along with an idea of its uncertainty.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">bs</span> <span class="o">=</span> <span class="n">mapreduce</span><span class="x">(</span><span class="n">x</span><span class="o">-&gt;</span> <span class="nd">@combine</span><span class="x">(</span><span class="n">res</span><span class="x">[</span><span class="n">sample</span><span class="x">(</span><span class="mi">201</span><span class="o">:</span><span class="n">nrow</span><span class="x">(</span><span class="n">res</span><span class="x">),</span> <span class="n">nrow</span><span class="x">(</span><span class="n">res</span><span class="x">)</span><span class="o">-</span><span class="mi">200</span><span class="x">),</span> <span class="o">:</span><span class="x">],</span> 
              <span class="o">:</span><span class="n">StratSlippage</span> <span class="o">=</span> <span class="n">mean</span><span class="x">(</span><span class="o">:</span><span class="n">slippage</span><span class="x">),</span> 
              <span class="o">:</span><span class="n">EqStratSlippage</span> <span class="o">=</span> <span class="n">mean</span><span class="x">(</span><span class="o">:</span><span class="n">slippage</span><span class="x">,</span> <span class="n">Weights</span><span class="x">(</span><span class="o">:</span><span class="n">EqRatio</span><span class="x">)),</span>
              <span class="o">:</span><span class="n">LinearStratSlippage</span> <span class="o">=</span> <span class="n">mean</span><span class="x">(</span><span class="o">:</span><span class="n">slippage</span><span class="x">,</span> <span class="n">Weights</span><span class="x">(</span><span class="o">:</span><span class="n">LinearRatio</span><span class="x">))),</span>
			  <span class="n">vcat</span><span class="x">,</span> <span class="mi">1</span><span class="o">:</span><span class="mi">1000</span><span class="x">);</span>

<span class="nd">@combine</span><span class="x">(</span><span class="n">groupby</span><span class="x">(</span><span class="n">stack</span><span class="x">(</span><span class="n">bs</span><span class="x">),</span> <span class="o">:</span><span class="n">variable</span><span class="x">),</span> <span class="o">:</span><span class="n">avg</span> <span class="o">=</span> <span class="n">mean</span><span class="x">(</span><span class="o">:</span><span class="n">value</span><span class="x">),</span> <span class="o">:</span><span class="n">sd</span> <span class="o">=</span> <span class="n">std</span><span class="x">(</span><span class="o">:</span><span class="n">value</span><span class="x">))</span>
</code></pre></div></div>

<table>
  <thead>
    <tr>
      <th style="text-align: right"><strong>variable</strong></th>
      <th style="text-align: right"><strong>avg</strong></th>
      <th style="text-align: right"><strong>sd</strong></th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="text-align: right">StratSlippage</td>
      <td style="text-align: right">-1.55385</td>
      <td style="text-align: right">0.0967389</td>
    </tr>
    <tr>
      <td style="text-align: right">EqStratSlippage</td>
      <td style="text-align: right">-1.59169</td>
      <td style="text-align: right">0.119028</td>
    </tr>
    <tr>
      <td style="text-align: right">LinearStratSlippage</td>
      <td style="text-align: right">-1.52706</td>
      <td style="text-align: right">0.133231</td>
    </tr>
  </tbody>
</table>

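<p>As it’s a toy problem, there is nothing of significance between the models (all three averages sit within one bootstrap standard deviation of each other), but both models have a better point estimate than the random allocation.</p>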

<h2 id="conclusion">Conclusion</h2>

<p>Importance sampling gives you a way of getting more out of the current data and strategy you are using. By weighting the observations in a new way you can get an idea whether a new strategy is worth it or not.
By rethinking your current setup you can easily add a bit of randomness into decisions and use the importance sampling framework going forward.</p>]]></content><author><name>Dean Markwick</name></author><category term="julia" /><summary type="html"><![CDATA[A new paper hit my feed Choosing trading strategies in electronic execution using importance sampling. I’ve only encountered sampling as part of a statistical computing course as part of my PhD, and I had never strayed away from Monte Carlo sampling, but this practical example provided an intuitive understanding of its importance and utility.]]></summary></entry><entry><title type="html">Alpha Capture and Acquired</title><link href="https://dm13450.github.io/2024/09/19/Alpha-Capture-and-Acquired.html" rel="alternate" type="text/html" title="Alpha Capture and Acquired" /><published>2024-09-19T00:00:00+00:00</published><updated>2024-09-19T00:00:00+00:00</updated><id>https://dm13450.github.io/2024/09/19/Alpha-Capture-and-Acquired</id><content type="html" xml:base="https://dm13450.github.io/2024/09/19/Alpha-Capture-and-Acquired.html"><![CDATA[<p>People are never short of a trade idea. There is a whole industry of
researchers, salespeople and amateurs coming up with trading ideas and
making big calls on what stock will go up, what country will cut
interest rates and what the price of gold will do next. Alpha capture
is about systematically assessing ideas and working out who has
<em>alpha</em> and generates profitable ideas and who is just making it up as
they are going along.</p>


<p>Alpha capture started as a way of profiling a broker’s stock
recommendation. If you have 50 people recommending you 50 different
ideas, how do you know who is good? You’ll quickly run out of money if
you blindly follow all the recommendations that hit your
inbox. Instead, you need to profile each person’s idea and see
who on average can make good recommendations. Whoever is good at
picking stocks probably deserves more of your business.</p>

<p>It has since expanded: some hedge funds now have internal desks that
run a similar analysis on their portfolio managers (PMs) to double
down on profitable bets and mitigate the risk of all the PMs picking the
same stock. Picking stocks and managing a portfolio across many PMs
are two different skills, handled by different departments at your modern
hedge fund.</p>

<p>A simple way to measure the alpha of a PM or broker recommendation
is to see if the price of a stock they buy (or recommend) goes up
after the day they suggest it. Those with alpha would see their
picks move higher over a large enough sample. Those without alpha
would average out to zero: some ideas would go higher, some
lower, with a net result of zero alpha. If a PM has the opposite effect
and every stock they buy goes down, they are a contrarian
indicator, so take their idea and do the opposite!</p>

<p><img src="/assets/AlphaCapture/jc1.png" alt="Alpha capture markout graph" title="Alpha capture markout graph" class="center-image" /></p>

<p><a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3873884">Alpha Capture Systems: Past, Present, and Future
Directions</a>
goes through the history of alpha capture and is a good short read
that inspired this blog post.</p>

<h2 id="basic-alpha-capture">Basic Alpha Capture</h2>

<p>What if we wanted to try our own Alpha Capture? We need some stock recommendations and a way of calculating what happens to the price after the recommendation. This is where the <a href="https://www.acquired.fm/">Acquired</a> podcast comes in.</p>

<p><img src="https://img.transistor.fm/rc6ysihLHIou3_VscLeIvhCyPjvpQaGzKVeRnh5PnWc/rs:fill:3000:3000:1/q:60/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8zNDFk/ZWYwYjUyZWZiNjQ0/NTliYTI5NjJkOWZi/MmM1ZS5wbmc.jpg" alt="Acquired logo" width="30%" class="center-image" /></p>

<p>Acquired tells the stories and strategies of great companies (taken from their website). It’s a pretty popular podcast and each episode gets close to a million listeners. So this makes it an ideal Alpha Capture study - when they release an episode about a company does the stock price of that company go higher or lower on average? 
If it were to go higher then each time an episode is released call your broker and go long the stock!</p>

<p>They aren’t explicitly recommending a stock by talking about
it, as they say in their intro. So it’s just a toy exercise to see if
there is any correlation between the stock price and the release date
of an episode.</p>

<p>To systematically test this we need to get a list of the episodes and calculate a ‘markout’ from each episode.</p>

<h2 id="collecting-podcast-data">Collecting Podcast Data</h2>

<p>The internet is a wonderful thing and each episode of Acquired is
available as an XML feed from <a href="https://transistor.fm/">transistor.fm</a>. So with some fun XML
parsing I can get the full history of the podcast, with the date
and title of each episode.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">function</span><span class="nf"> parseEpisode</span><span class="x">(</span><span class="n">x</span><span class="x">)</span>
  <span class="n">rawDate</span> <span class="o">=</span> <span class="n">first</span><span class="x">(</span><span class="n">simplevalue</span><span class="o">.</span><span class="x">(</span><span class="n">x</span><span class="x">[</span><span class="n">tag</span><span class="o">.</span><span class="x">(</span><span class="n">x</span><span class="x">)</span> <span class="o">.==</span> <span class="s">"pubDate"</span><span class="x">]))</span>
  <span class="n">date</span> <span class="o">=</span> <span class="n">ZonedDateTime</span><span class="x">(</span><span class="n">rawDate</span><span class="x">,</span> <span class="n">dateformat</span><span class="s">"eee, dd uuu yyyy HH:MM:ss z"</span><span class="x">)</span>

  <span class="kt">Dict</span><span class="x">(</span><span class="s">"title"</span> <span class="o">=&gt;</span> <span class="n">first</span><span class="x">(</span><span class="n">simplevalue</span><span class="o">.</span><span class="x">(</span><span class="n">x</span><span class="x">[</span><span class="n">tag</span><span class="o">.</span><span class="x">(</span><span class="n">x</span><span class="x">)</span> <span class="o">.==</span> <span class="s">"title"</span><span class="x">])),</span>
       <span class="s">"date"</span> <span class="o">=&gt;</span><span class="n">date</span><span class="x">)</span>
<span class="k">end</span>

<span class="k">function</span><span class="nf"> parse_date</span><span class="x">(</span><span class="n">t</span><span class="x">)</span>
   <span class="kt">Date</span><span class="x">(</span><span class="n">string</span><span class="x">(</span><span class="n">split</span><span class="x">(</span><span class="n">t</span><span class="x">,</span> <span class="s">"T"</span><span class="x">)[</span><span class="mi">1</span><span class="x">]))</span>
<span class="k">end</span>

<span class="n">url</span> <span class="o">=</span> <span class="s">"https://feeds.transistor.fm/acquired"</span>

<span class="n">data</span> <span class="o">=</span> <span class="n">parse</span><span class="x">(</span><span class="n">Node</span><span class="x">,</span> <span class="kt">String</span><span class="x">(</span><span class="n">HTTP</span><span class="o">.</span><span class="n">get</span><span class="x">(</span><span class="n">url</span><span class="x">)</span><span class="o">.</span><span class="n">body</span><span class="x">))</span>

<span class="n">episodes</span> <span class="o">=</span> <span class="n">children</span><span class="x">(</span><span class="n">data</span><span class="x">[</span><span class="mi">3</span><span class="x">][</span><span class="mi">1</span><span class="x">])</span>
<span class="n">filter!</span><span class="x">(</span><span class="n">x</span> <span class="o">-&gt;</span> <span class="n">tag</span><span class="x">(</span><span class="n">x</span><span class="x">)</span> <span class="o">==</span> <span class="s">"item"</span><span class="x">,</span> <span class="n">episodes</span><span class="x">)</span>
<span class="n">episodes</span> <span class="o">=</span> <span class="n">children</span><span class="o">.</span><span class="x">(</span><span class="n">episodes</span><span class="x">)</span>

<span class="n">episodeData</span> <span class="o">=</span> <span class="n">parseEpisode</span><span class="o">.</span><span class="x">(</span><span class="n">episodes</span><span class="x">)</span>

<span class="n">episodeFrame</span> <span class="o">=</span> <span class="n">vcat</span><span class="x">(</span><span class="n">DataFrame</span><span class="o">.</span><span class="x">(</span><span class="n">episodeData</span><span class="x">)</span><span class="o">...</span><span class="x">)</span>
<span class="n">CSV</span><span class="o">.</span><span class="n">write</span><span class="x">(</span><span class="s">"episodeRaw.csv"</span><span class="x">,</span> <span class="n">episodeFrame</span><span class="x">)</span>
</code></pre></div></div>

<p>After writing the data to a CSV I need to somehow map each episode
title to a stock ticker. This is a tricky task as the episode names
are human-friendly, not computer-friendly. So time for our LLM
overlords to lend a hand and do the heavy lifting. I drop the CSV into
<a href="https://www.perplexity.ai/">Perplexity</a> and prompt it to add the relevant stock ticker to the
file. I then reread the CSV into my notebook.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">episodeFrame</span> <span class="o">=</span> <span class="n">CSV</span><span class="o">.</span><span class="n">read</span><span class="x">(</span><span class="s">"episodeTicker.csv"</span><span class="x">,</span> <span class="n">DataFrame</span><span class="x">)</span>
<span class="n">episodeFrame</span><span class="o">.</span><span class="n">date</span> <span class="o">=</span> <span class="n">ZonedDateTime</span><span class="o">.</span><span class="x">(</span><span class="kt">String</span><span class="o">.</span><span class="x">(</span><span class="n">episodeFrame</span><span class="o">.</span><span class="n">date</span><span class="x">),</span> <span class="n">dateformat</span><span class="s">"yyyy-mm-ddTHH:MM:SS.sss-z"</span><span class="x">)</span>

<span class="n">vcat</span><span class="x">(</span><span class="n">first</span><span class="x">(</span><span class="nd">@subset</span><span class="x">(</span><span class="n">episodeFrame</span><span class="x">,</span> <span class="o">:</span><span class="n">stock_ticker</span> <span class="o">.!=</span> <span class="s">"-"</span><span class="x">),</span> <span class="mi">4</span><span class="x">),</span>
        <span class="n">last</span><span class="x">(</span><span class="nd">@subset</span><span class="x">(</span><span class="n">episodeFrame</span><span class="x">,</span> <span class="o">:</span><span class="n">stock_ticker</span> <span class="o">.!=</span> <span class="s">"-"</span><span class="x">),</span> <span class="mi">4</span><span class="x">))</span>
</code></pre></div></div>

<table>
  <thead>
    <tr>
      <th style="text-align: right"><strong>date</strong><br /><code class="language-plaintext highlighter-rouge">ZonedDateTime</code></th>
      <th style="text-align: right"><strong>title</strong><br /><code class="language-plaintext highlighter-rouge">String</code></th>
      <th style="text-align: right"><strong>stock_ticker</strong><br /><code class="language-plaintext highlighter-rouge">String15</code></th>
      <th style="text-align: right"><strong>sector_etf</strong><br /><code class="language-plaintext highlighter-rouge">String7</code></th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="text-align: right">2024-03-17T17:54:00.400+07:00</td>
      <td style="text-align: right">Renaissance Technologies</td>
      <td style="text-align: right">RNR</td>
      <td style="text-align: right">PSI</td>
    </tr>
    <tr>
      <td style="text-align: right">2024-02-19T17:56:00.410+08:00</td>
      <td style="text-align: right">Hermès</td>
      <td style="text-align: right">RMS.PA</td>
      <td style="text-align: right">GXLU</td>
    </tr>
    <tr>
      <td style="text-align: right">2024-01-21T17:59:00.450+08:00</td>
      <td style="text-align: right">Novo Nordisk (Ozempic)</td>
      <td style="text-align: right">NOVO-B.CO</td>
      <td style="text-align: right">IHE</td>
    </tr>
    <tr>
      <td style="text-align: right">2023-11-26T16:24:00.250+08:00</td>
      <td style="text-align: right">Visa</td>
      <td style="text-align: right">V</td>
      <td style="text-align: right">IPAY</td>
    </tr>
    <tr>
      <td style="text-align: right">2018-09-23T18:28:00.550+07:00</td>
      <td style="text-align: right">Season 3, Episode 5: Alibaba</td>
      <td style="text-align: right">BABA</td>
      <td style="text-align: right">KWEB</td>
    </tr>
    <tr>
      <td style="text-align: right">2018-08-20T09:20:00.370+07:00</td>
      <td style="text-align: right">Season 3, Episode 3: The Sonos IPO</td>
      <td style="text-align: right">SONO</td>
      <td style="text-align: right">GAMR</td>
    </tr>
    <tr>
      <td style="text-align: right">2018-08-05T18:15:00.030+07:00</td>
      <td style="text-align: right">Season 3, Episode 2: The Xiaomi IPO</td>
      <td style="text-align: right">XIACF</td>
      <td style="text-align: right">KWEB</td>
    </tr>
    <tr>
      <td style="text-align: right">2018-07-16T21:40:00.560+07:00</td>
      <td style="text-align: right">Season 3, Episode 1: Tesla</td>
      <td style="text-align: right">TSLA</td>
      <td style="text-align: right">TSLA</td>
    </tr>
  </tbody>
</table>

<p>It’s done an OK job. Most of the episodes seem to correspond to the
right ticker, but we can see it has hallucinated the RenTech stock
ticker as RNR. RenTech is a private company with no stock ticker;
instead, Perplexity has decided that RNR (a reinsurance company) is the
correct one. So not 100% accurate. Still, it has saved me a
good chunk of time and we can move on to getting the stock price data.</p>

<p>We want to measure the average price move of a stock after an episode is released. If Acquired had stock-picking skill, you would expect the price to increase after the release of an episode, as they are generally speaking positively about the various companies.</p>

<p>So using <a href="https://github.com/dm13450/AlpacaMarkets.jl">AlpacaMarkets.jl</a> we get the stock price for the days before and the days after the episode. As AlpacaMarkets only has US stock data, only some of the episodes end up with a full dataset.</p>

<h2 id="what-is-a-markout">What is a Markout?</h2>

<p>We calculate the percentage change relative to the episode date and then aggregate all the stock tickers together.</p>

\[\text{Markout} = \frac{p - p_{\text{episode released}}}{p_{\text{episode released}}}\]
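
<p>As a quick illustration with made-up prices (none of these numbers come from the episode data, and the variable names are just for this example), here is a minimal sketch of the calculation, scaled to basis points in the same way as the code further down:</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical daily prices around an episode release (illustrative values only)
prices = [98.0, 99.0, 100.0, 101.0, 99.5]
releaseInd = 3                          # assumed position of the release day
arrivalPrice = prices[releaseInd]

# Markout relative to the release-day price, in basis points
markout = 1e4 .* (prices .- arrivalPrice) ./ arrivalPrice
</code></pre></div></div>

<p>The markout is zero on the release day by construction, negative when the price is below the release-day price and positive when it is above.</p>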

<p>Acquired is about great companies, so they tend to speak favourably about the company in each episode. I think it’s therefore reasonable to expect the stock price to increase once everyone gets round to listening to it.
Once we aggregate all the episodes we should hopefully have
enough data to decide if this is true.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">function</span><span class="nf"> getStockData</span><span class="x">(</span><span class="n">stock</span><span class="x">,</span> <span class="n">startDate</span><span class="x">)</span>
  <span class="n">prices</span> <span class="o">=</span> <span class="n">AlpacaMarkets</span><span class="o">.</span><span class="n">stock_bars</span><span class="x">(</span><span class="n">stock</span><span class="x">,</span> <span class="s">"1Day"</span><span class="x">,</span> <span class="n">startTime</span><span class="o">=</span><span class="n">startDate</span> <span class="o">-</span> <span class="kt">Month</span><span class="x">(</span><span class="mi">1</span><span class="x">),</span> <span class="n">limit</span><span class="o">=</span><span class="mi">10000</span><span class="x">)[</span><span class="mi">1</span><span class="x">]</span>
  <span class="n">prices</span><span class="o">.</span><span class="n">date</span> <span class="o">.=</span> <span class="n">startDate</span>
  <span class="n">prices</span><span class="o">.</span><span class="n">t</span> <span class="o">=</span> <span class="n">parse_date</span><span class="o">.</span><span class="x">(</span><span class="n">prices</span><span class="o">.</span><span class="n">t</span><span class="x">)</span>
  <span class="n">prices</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="x">[</span><span class="o">:</span><span class="n">t</span><span class="x">,</span> <span class="o">:</span><span class="n">symbol</span><span class="x">,</span> <span class="o">:</span><span class="n">vw</span><span class="x">,</span> <span class="o">:</span><span class="n">date</span><span class="x">]]</span>
<span class="k">end</span>

<span class="k">function</span><span class="nf"> calcMarkout</span><span class="x">(</span><span class="n">data</span><span class="x">)</span>
   <span class="n">arrivalInd</span> <span class="o">=</span> <span class="n">findlast</span><span class="x">(</span><span class="n">data</span><span class="o">.</span><span class="n">t</span> <span class="o">.&lt;=</span> <span class="n">data</span><span class="o">.</span><span class="n">date</span><span class="x">)</span>
   <span class="n">arrivalPrice</span> <span class="o">=</span> <span class="n">data</span><span class="x">[</span><span class="n">arrivalInd</span><span class="x">,</span> <span class="o">:</span><span class="n">vw</span><span class="x">]</span>
   <span class="n">data</span><span class="o">.</span><span class="n">arrivalPrice</span> <span class="o">.=</span> <span class="n">arrivalPrice</span>
   <span class="n">data</span><span class="o">.</span><span class="n">ts</span> <span class="o">=</span> <span class="x">[</span><span class="n">x</span><span class="o">.</span><span class="n">value</span> <span class="k">for</span> <span class="n">x</span> <span class="k">in</span> <span class="x">(</span><span class="n">data</span><span class="o">.</span><span class="n">t</span> <span class="o">.-</span> <span class="n">data</span><span class="o">.</span><span class="n">date</span><span class="x">)]</span>
   <span class="n">data</span><span class="o">.</span><span class="n">markout</span> <span class="o">=</span> <span class="mf">1e4</span><span class="o">*</span><span class="x">(</span><span class="n">data</span><span class="o">.</span><span class="n">vw</span> <span class="o">.-</span> <span class="n">data</span><span class="o">.</span><span class="n">arrivalPrice</span><span class="x">)</span> <span class="o">./</span> <span class="n">data</span><span class="o">.</span><span class="n">arrivalPrice</span>
   <span class="n">data</span>
<span class="k">end</span>

<span class="n">res</span> <span class="o">=</span> <span class="x">[]</span>

<span class="k">for</span> <span class="n">row</span> <span class="k">in</span> <span class="n">eachrow</span><span class="x">(</span><span class="n">episodeFrame</span><span class="x">)</span>
    
    <span class="k">try</span> 
        <span class="n">stockData</span> <span class="o">=</span> <span class="n">getStockData</span><span class="x">(</span><span class="n">row</span><span class="o">.</span><span class="n">stock_ticker</span><span class="x">,</span> <span class="kt">Date</span><span class="x">(</span><span class="n">row</span><span class="o">.</span><span class="n">date</span><span class="x">))</span>
        <span class="n">stockData</span> <span class="o">=</span> <span class="n">calcMarkout</span><span class="x">(</span><span class="n">stockData</span><span class="x">)</span>
        <span class="n">append!</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="x">[</span><span class="n">stockData</span><span class="x">])</span>
    <span class="k">catch</span> <span class="n">e</span>
        <span class="n">println</span><span class="x">(</span><span class="n">row</span><span class="o">.</span><span class="n">stock_ticker</span><span class="x">)</span>
    <span class="k">end</span>
<span class="k">end</span>

<span class="n">res</span> <span class="o">=</span> <span class="n">vcat</span><span class="x">(</span><span class="n">res</span><span class="o">...</span><span class="x">)</span>
</code></pre></div></div>
<p>With the data pulled we now aggregate by each day before and after the episode.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">markoutRes</span> <span class="o">=</span> <span class="nd">@combine</span><span class="x">(</span><span class="n">groupby</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="n">ts</span><span class="x">),</span> <span class="o">:</span><span class="n">n</span> <span class="o">=</span> <span class="n">length</span><span class="x">(</span><span class="o">:</span><span class="n">markout</span><span class="x">),</span> 
                                         <span class="o">:</span><span class="n">avgMarkout</span> <span class="o">=</span> <span class="n">mean</span><span class="x">(</span><span class="o">:</span><span class="n">markout</span><span class="x">),</span>
                                         <span class="o">:</span><span class="n">devMarkout</span> <span class="o">=</span> <span class="n">std</span><span class="x">(</span><span class="o">:</span><span class="n">markout</span><span class="x">))</span>
<span class="n">markoutRes</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">markoutRes</span><span class="x">,</span> <span class="o">:</span><span class="n">errMarkout</span> <span class="o">=</span> <span class="o">:</span><span class="n">devMarkout</span> <span class="o">./</span><span class="n">sqrt</span><span class="o">.</span><span class="x">(</span><span class="o">:</span><span class="n">n</span><span class="x">))</span>
</code></pre></div></div>

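<p>We always need error bars as this data gets noisy. The error bar here is the standard error of the mean: the standard deviation of the markouts divided by the square root of the number of observations.</p>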

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="n">markoutResSub</span> <span class="o">=</span> <span class="nd">@subset</span><span class="x">(</span><span class="n">markoutRes</span><span class="x">,</span> <span class="o">:</span><span class="n">ts</span> <span class="o">.&lt;=</span> <span class="mi">60</span><span class="x">,</span> <span class="o">:</span><span class="n">n</span> <span class="o">.&gt;=</span> <span class="mi">10</span><span class="x">)</span>
<span class="n">plot</span><span class="x">(</span><span class="n">markoutResSub</span><span class="o">.</span><span class="n">ts</span><span class="x">,</span> <span class="n">markoutResSub</span><span class="o">.</span><span class="n">avgMarkout</span><span class="x">,</span> <span class="n">yerr</span><span class="o">=</span><span class="n">markoutResSub</span><span class="o">.</span><span class="n">errMarkout</span><span class="x">,</span> 
     <span class="n">xlabel</span> <span class="o">=</span> <span class="s">"Days"</span><span class="x">,</span> <span class="n">ylabel</span> <span class="o">=</span> <span class="s">"Markout"</span><span class="x">,</span> <span class="n">title</span> <span class="o">=</span> <span class="s">"Acquired Alpha Capture"</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="o">:</span><span class="n">none</span><span class="x">)</span>
<span class="n">hline!</span><span class="x">([</span><span class="mi">0</span><span class="x">],</span> <span class="n">ls</span> <span class="o">=</span> <span class="o">:</span><span class="n">dash</span><span class="x">,</span> <span class="n">color</span> <span class="o">=</span> <span class="s">"grey"</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="o">:</span><span class="n">none</span><span class="x">)</span>
<span class="n">vline!</span><span class="x">([</span><span class="mi">0</span><span class="x">],</span> <span class="n">ls</span> <span class="o">=</span> <span class="o">:</span><span class="n">dash</span><span class="x">,</span> <span class="n">color</span> <span class="o">=</span> <span class="s">"grey"</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="o">:</span><span class="n">none</span><span class="x">)</span>

</code></pre></div></div>

<p><img src="/assets/AlphaCapture/avgMarkout2.png" alt="Average markout" title="Average markouts" width="80%" class="center-image" /></p>

<p>There is not really a pattern. The majority of the error bars overlap zero after the podcast is released. 
If you squint a little there seems to be a slight downward trend post-episode, which would suggest they talk about a company at the peak of its stock price.</p>

<p>Beforehand there is a bit of positive momentum, again suggesting that
they release the podcast at the peak of the stock price. Now this is
even more of a stretch given there is only 1 podcast a month and it
takes more than 20 days to prepare an episode (I imagine!), so
more noise than signal.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">markoutIndRes</span> <span class="o">=</span> <span class="nd">@combine</span><span class="x">(</span><span class="n">groupby</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="x">[</span><span class="o">:</span><span class="n">symbol</span><span class="x">,</span> <span class="o">:</span><span class="n">ts</span><span class="x">]),</span> <span class="o">:</span><span class="n">n</span> <span class="o">=</span> <span class="n">length</span><span class="x">(</span><span class="o">:</span><span class="n">markout</span><span class="x">),</span> 
                                         <span class="o">:</span><span class="n">avgMarkout</span> <span class="o">=</span> <span class="n">mean</span><span class="x">(</span><span class="o">:</span><span class="n">markout</span><span class="x">),</span>
                                         <span class="o">:</span><span class="n">devMarkout</span> <span class="o">=</span> <span class="n">std</span><span class="x">(</span><span class="o">:</span><span class="n">markout</span><span class="x">))</span>
<span class="n">markoutIndRes</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">markoutIndRes</span><span class="x">,</span> <span class="o">:</span><span class="n">errMarkout</span> <span class="o">=</span> <span class="o">:</span><span class="n">devMarkout</span> <span class="o">./</span><span class="n">sqrt</span><span class="o">.</span><span class="x">(</span><span class="o">:</span><span class="n">n</span><span class="x">))</span>

<span class="n">p</span> <span class="o">=</span> <span class="n">plot</span><span class="x">()</span>
<span class="n">hline!</span><span class="x">(</span><span class="n">p</span><span class="x">,</span> <span class="x">[</span><span class="mi">0</span><span class="x">],</span> <span class="n">ls</span> <span class="o">=</span> <span class="o">:</span><span class="n">dash</span><span class="x">,</span> <span class="n">color</span> <span class="o">=</span> <span class="s">"grey"</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="o">:</span><span class="n">none</span><span class="x">)</span>
<span class="n">vline!</span><span class="x">(</span><span class="n">p</span><span class="x">,</span> <span class="x">[</span><span class="mi">0</span><span class="x">],</span> <span class="n">ls</span> <span class="o">=</span> <span class="o">:</span><span class="n">dash</span><span class="x">,</span> <span class="n">color</span> <span class="o">=</span> <span class="s">"grey"</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="o">:</span><span class="n">none</span><span class="x">)</span>
<span class="k">for</span> <span class="n">sym</span> <span class="k">in</span> <span class="x">[</span><span class="s">"TSLA"</span><span class="x">,</span> <span class="s">"V"</span><span class="x">,</span> <span class="s">"META"</span><span class="x">]</span>
   <span class="n">markoutResSub</span> <span class="o">=</span> <span class="n">sort</span><span class="x">(</span><span class="nd">@subset</span><span class="x">(</span><span class="n">markoutIndRes</span><span class="x">,</span> <span class="o">:</span><span class="n">symbol</span> <span class="o">.==</span> <span class="n">sym</span><span class="x">,</span> <span class="o">:</span><span class="n">ts</span> <span class="o">.&lt;=</span> <span class="mi">60</span><span class="x">,</span> <span class="o">:</span><span class="n">n</span> <span class="o">.&gt;=</span> <span class="mi">1</span><span class="x">),</span> <span class="o">:</span><span class="n">ts</span><span class="x">)</span>
    <span class="n">plot!</span><span class="x">(</span><span class="n">p</span><span class="x">,</span> <span class="n">markoutResSub</span><span class="o">.</span><span class="n">ts</span><span class="x">,</span> <span class="n">markoutResSub</span><span class="o">.</span><span class="n">avgMarkout</span><span class="x">,</span> <span class="n">yerr</span><span class="o">=</span><span class="n">markoutResSub</span><span class="o">.</span><span class="n">errMarkout</span><span class="x">,</span> 
     <span class="n">xlabel</span> <span class="o">=</span> <span class="s">"Days"</span><span class="x">,</span> <span class="n">ylabel</span> <span class="o">=</span> <span class="s">"Markout"</span><span class="x">,</span> <span class="n">title</span> <span class="o">=</span> <span class="s">"Acquired Alpha Capture"</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="n">sym</span><span class="x">,</span> <span class="n">lw</span> <span class="o">=</span><span class="mi">2</span><span class="x">)</span> 
<span class="k">end</span>
<span class="n">p</span>
</code></pre></div></div>

<p><img src="/assets/AlphaCapture/indMarkout2.png" alt="Individual markouts" title="Individual markouts" width="80%" class="center-image" /></p>

<p>When we pull out 3 examples of episodes we can see the randomness and specifically the volatility of TSLA here.</p>

<h2 id="conclusion">Conclusion</h2>

<p>From this, we would not put any specific weight on the stock
performance after an episode is released. There doesn’t appear to be
any statistical pattern to exploit. No alpha means no alpha
capture. It is a nice exercise though and has hopefully explained the
concept of a markout.</p>]]></content><author><name>Dean Markwick</name></author><category term="julia" /><summary type="html"><![CDATA[People are never short of a trade idea. There is a whole industry of researchers, salespeople and amateurs coming up with trading ideas and making big calls on what stock will go up, what country will cut interest rates and what the price of gold will do next. Alpha capture is about systematically assessing ideas and working out who has alpha and generates profitable ideas and who is just making it up as they are going along.]]></summary></entry><entry><title type="html">Solving the Almgren Chris Model</title><link href="https://dm13450.github.io/2024/06/06/Solving-the-Almgren-Chris-Model.html" rel="alternate" type="text/html" title="Solving the Almgren Chris Model" /><published>2024-06-06T00:00:00+00:00</published><updated>2024-06-06T00:00:00+00:00</updated><id>https://dm13450.github.io/2024/06/06/Solving-the-Almgren-Chris-Model</id><content type="html" xml:base="https://dm13450.github.io/2024/06/06/Solving-the-Almgren-Chris-Model.html"><![CDATA[<p>The Almgren Chris model from <a href="https://www.smallake.kr/wp-content/uploads/2016/03/optliq.pdf">Optimal Execution
of Portfolio Transactions</a> is the most well known optimal
execution model and provides the foundational math about how to think
about trading some quantity of an asset. This blog post goes through
the math and how we set the problem up and arrived at the various
solutions.</p>

<p></p>
<hr />
<p>Enjoy these types of posts? Then you should sign up for my newsletter.</p>
<div style="text-align: center;">
<iframe src="https://dm13450.substack.com/embed" width="480" height="150" style="border:1px solid ##fdfdfd; background:#fdfdfd;" frameborder="0" scrolling="no"></iframe>
</div>
<hr />
<p></p>

<p>I first encountered the Almgren Chriss model in my initial PhD year
through a Microstructure and Machine Learning course. It was for 2 hours at 18:00 on a
Friday night and on the other side of London from where I lived, so a bit of a pain
for me to attend. This post is, in essence, inspired by my notes from that course as
I’ve always wanted to summarise them into a digital version. So this is a maths-heavy post that will act as a springboard for
more content in the future.</p>

<h2 id="the-trading-problem">The Trading Problem</h2>

<p>We have an amount \(X_0\) of something to trade between time \(0\)
and \(T\), such that \(X_T = 0\). How should we slice and dice our
trades to minimise the execution cost?</p>

<p>We need a model of</p>

<ul>
  <li>How the price moves</li>
  <li>How our trading affects prices</li>
</ul>

<p>then we can build a trading cost function that we then optimise in different
ways.</p>

<h2 id="price-dynamics">Price Dynamics</h2>

<p>The price evolves like
\(S_t = \bar{S} _t + \eta v_t + \theta (X_0 - X_t),\)</p>

<ul>
  <li>\(\bar{S} _t\) is the unperturbed stock price</li>
  <li>\(\eta \cdot v_t\) is the temporary market impact that scales with the
trading speed \(v_t\)</li>
  <li>\(\theta \cdot (X_0 - X_t)\) is the permanent market impact, which scales with the amount traded so far</li>
</ul>

<p>The unperturbed price is a simple Gaussian random walk with no drift:
\(\mathrm{d} \bar{S} _t = \sigma S_0 \mathrm{d} W_t\)</p>

<p>The trading rate 
\(v_t = - \frac{\mathrm{d} X_t}{\mathrm{d}t} = - \dot{X} _t\)
so simply the speed at which we are executing the trades.</p>

<p>So the fundamental price (\(\bar{S}\)) evolves as a random walk but our
trading means that the observed price is higher by an amount
proportional to our trading speed, plus a permanent component from what we have already traded. The signs of the components are set
up such that we are buying, so the faster we trade the more we
distort the price from the true price by pushing it higher.</p>
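
<p>To get a feel for these dynamics, here is a rough sketch that simulates the observed price while buying at a constant speed. The function name <em>simulate_price</em> and every parameter value are made up for illustration, and the random walk is discretised with a simple Euler step, so treat it as a toy rather than a calibrated model.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using Random

# Simulate the observed price S_t for a constant-speed buy schedule.
# All default values are illustrative assumptions, not calibrated numbers.
function simulate_price(; S0=100.0, sigma=0.01, eta=0.001, theta=0.0005,
                          X0=1000.0, T=1.0, n=100, rng=MersenneTwister(1))
    dt = T / n
    v = X0 / T                  # constant trading speed
    Sbar = S0                   # unperturbed price
    S = zeros(n)
    for i in 1:n
        # Unperturbed price: random walk step with volatility sigma * S0
        Sbar += sigma * S0 * sqrt(dt) * randn(rng)
        traded = v * i * dt     # amount traded so far, X_0 - X_t
        # Observed price = unperturbed + temporary impact + permanent impact
        S[i] = Sbar + eta * v + theta * traded
    end
    S
end

S = simulate_price()
</code></pre></div></div>

<p>Setting <em>sigma</em> to zero switches off the random walk and leaves just the two impact terms, which makes it easy to see the constant temporary shift and the permanent impact growing as the order fills.</p>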

<h2 id="trading-costs">Trading Costs</h2>

<p>The final cost of the execution is the sum over all trades of the amount
traded multiplied by the price paid. In continuous time this is
simply the integral of the observed stock price multiplied by the
trading speed over the execution window:</p>

\[C_{0, T} = \int _0 ^T S_t v_t \mathrm{d} t,\]

<p>which after inserting the equation for the asset price gives us three different
components</p>

\[C_{0, T} = \underbrace {\int _0 ^T \bar{S}_t v_t \mathrm{d} t}_\text{(1)} + \underbrace{\int_0 ^T \eta
v_t ^2 \mathrm{d} t}_\text{(2)} + \underbrace{\int _0 ^T \theta (X_0 -
X_t) v_t \mathrm{d}t}_\text{(3)}\]

<p>For term \((1)\) we use integration by parts, together with \(v_t \mathrm{d}t = -\mathrm{d}X_t\) and the boundary condition \(X_T = 0\):</p>

\[\begin{align*} \int _0 ^T \bar{S}_t v_t \mathrm{d} t &amp; =- \int _0 ^T
\bar{S}_t \mathrm{d}X_t \\
&amp; = - \left[\bar{S}_t X_t \right]_0^T + \int _0 ^T X_t \mathrm{d} \bar{S}_t \\
&amp; = -(\bar{S}_TX_T - \bar{S}_0X_0) + \int _0 ^T X_t \sigma S_0
\mathrm{d} W_t \\
&amp; = \bar{S}_0 X_0 + \int _0 ^T X_t \sigma S_0
\mathrm{d} W_t
\end{align*}\]

<p>The stochastic integral comes from substituting in the random walk \(\mathrm{d} \bar{S} _t = \sigma S_0 \mathrm{d} W_t\). Writing \(S_0\) for the initial unperturbed price \(\bar{S}_0\), term \((1)\) is</p>

\[X_0 S_0 + \int _0 ^T X_t \sigma S_0 \mathrm{d} W_t\]

<p>Term \((2)\) is already in the form we need. For term \((3)\) we again use \(v_t \mathrm{d}t = -\mathrm{d}X_t\)</p>

\[\theta \int _0 ^T (X_0 - X_t) v_t \mathrm{d} t= -\theta \int _0 ^T (X_0 - X_t) \mathrm{d} X_t\]

\[= \theta \left[ \frac{(X_0 - X_t)^2}{2} \right]_0^T = \frac{\theta X_0^2}{2},\]

<p>where the last step uses \(X_T = 0\). This gives us a formula for \(C_{0, T}\)</p>

\[C_{0, T} = X_0 S_0 + \int _0 ^T X_t \sigma S_0 \mathrm{d} W_t + \eta \int _0 ^T v_t ^2 \mathrm{d}t + \frac{\theta X_0^2}{2}.\]

<p>This is our cost function and we want to find the \(v_t\)
that minimises the final cost.</p>

<h2 id="minimising-the-expected-cost">Minimising the Expected Cost</h2>

<p>If we take expectations (we want to minimise the <em>average</em> execution
cost - each path will be different as it is a stochastic problem) we
end up with just one term through which we can influence the expected cost:</p>

\[\mathbb{E}[C] = \underbrace{X_0 S_0 + \frac{\theta X_0^2}{2}}_{\text{Constant}} +
         \underbrace{\mathbb{E}
		 \left[\int _0 ^T X_t \sigma S_0 \mathrm{d} W_t \right]}_{
		\mathbb{E}[ \mathrm{d}W_t] =  0} +
         \mathbb{E} \left[ \eta \int _0 ^T v_t ^2 \mathrm{d}t \right]\]

<p>So we minimise the expected cost by finding the trading speed that
minimises this term</p>

\[\min _{v_t} \eta \int _0 ^T v^2_t \mathrm{d} t.\]

<p>To solve this we apply the
<a href="https://en.wikipedia.org/wiki/Euler-Lagrange_equation">Euler-Lagrange equation</a>
to minimise the action. The action is the integral and the Lagrangian \(f\) is the term inside it.</p>

\[\frac{\partial f}{\partial X} = \frac{\mathrm{d}}{\mathrm{d}t}
\frac{\partial f}{\partial v}\]

<p>And from the above</p>

\[\begin{align*} f &amp; = v^2_t \\
\frac{\partial f}{\partial X} &amp; = 0 \\
\frac{\partial f}{\partial v} &amp; = 2 v_t,
\end{align*}\]

<p>so</p>

\[\frac{\mathrm{d}}{\mathrm{d} t} v_t = 0,\]

<p>which means the speed of the execution must be constant, \(v_t = -B\), so the position is linear in time:</p>

\[X_t = A + B t.\]

<p>We have the boundary conditions</p>

\[X_0 = A,\]

\[X_T = X_0 + BT = 0,\]

\[B = \frac{-X_0}{T},\]

\[X_t = X_0 - \frac{X_0}{T} t.\]

<p>Putting this trading schedule, with its constant speed \(v_t = X_0 / T\), back into the integral we are minimising gives</p>

\[\int _0 ^T v_t^2\mathrm{d} t = \frac{X^2_0}{T^2} (T - 0) =
\frac{X_0^2}{T}.\]
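
<p>We can sanity-check this numerically. The sketch below discretises the integral of \(v_t^2\) and compares the TWAP against one arbitrary alternative schedule that also finishes at zero. The helper name <em>impact_cost</em>, the quadratic alternative and the numbers are my own assumptions for illustration.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Discretised version of the integral of v_t^2 for a schedule of positions X
function impact_cost(X, T)
    dt = T / (length(X) - 1)
    v = -diff(X) ./ dt          # trading speed v = -dX/dt
    sum(v .^ 2) * dt
end

X0, T, n = 1000.0, 1.0, 100     # illustrative values
t = collect(range(0, stop=T, length=n + 1))

twap = X0 .* (1 .- t ./ T)               # linear schedule from X0 to 0
frontloaded = X0 .* (1 .- t ./ T) .^ 2   # arbitrary alternative that also ends at 0

twapCost = impact_cost(twap, T)          # should be close to X0^2 / T
altCost = impact_cost(frontloaded, T)    # should come out larger
</code></pre></div></div>

<p>For the TWAP the discretised cost matches \(X_0^2 / T\), and the alternative schedule pays more because it trades faster at the start.</p>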

<p>When we plot this schedule we can see that the speed is constant and
we are simply running a TWAP (time-weighted average price).</p>

<p><img src="/assets/optexmaths/twap.png" alt="TWAP execution schedule" title="TWAP execution schedule" /></p>

<p>The maths is telling us:</p>

<ul>
  <li>The impact cost \(\eta X_0^2 / T\) falls as \(T\) grows, so to minimise cost for an amount \(X_0\) you should run your
TWAP for an infinite amount of time.</li>
</ul>

<p>This neglects the price risk, so sure, run a very long TWAP but don’t
complain when the market trends against you!</p>

<p>How can we account for this price risk?</p>

<h2 id="mean-variance-optimisation-of-the-almgren-chriss-model">Mean-Variance Optimisation of the Almgren Chriss Model</h2>

<p>We now need to minimise both the expected cost and the <em>variance</em> of
the cost with our trading schedule. This means we will now be
sensitive to cases where the price moves far away from the starting
value.</p>

<p>We introduce a new
parameter, \(\lambda\), that controls our risk aversion. So now we are
worried about the price potentially running away from us if we take
too long to finish the trade.</p>

\[\min _ {v_t} \left( \mathbb{E} [C] + \lambda \text{Var} [C] \right ),\]

<p>so now we want to minimise the average and the variation of the
trading cost and see what schedule that produces.</p>

<p>When we took the expectation, only the deterministic bits remained. When we calculate the variance only the random bits remain. For a deterministic schedule \(X_t\) the Itô isometry gives</p>

\[\text{Var} [C] = \mathbb{E} \left[ \left( \sigma \bar{S} _0 \int _0 ^T X_t \mathrm{d} W_t \right) ^2 \right] = \sigma ^2 \bar{S}_0^2 \int _0 ^T X_t ^2 \mathrm{d} t,\]

<p>which means our minimisation problem can be written as:</p>

\[\text{min} _{v_t} \left( \eta \int _0 ^T v_t ^2 \mathrm{d} t + \lambda \sigma ^2 \bar{S}_0^2 \int _0 ^T X_t ^2 \mathrm{d} t \right).\]
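
<p>The variance of the stochastic term \(\int _0 ^T X_t \sigma S_0 \mathrm{d} W_t\) should equal \(\sigma ^2 S_0^2 \int _0 ^T X_t ^2 \mathrm{d} t\) for a deterministic schedule, and that is easy to check by simulation. This is only a rough Monte Carlo sketch: the function name <em>variance_check</em>, the TWAP schedule and all the default values are assumptions I picked for illustration.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using Random, Statistics

# Compare the simulated variance of the stochastic cost term with the
# analytic value sigma^2 * S0^2 * (integral of X_t^2), using a TWAP schedule.
function variance_check(; X0=1.0, T=1.0, sigma=0.2, S0=1.0, n=200,
                          paths=20_000, rng=MersenneTwister(42))
    dt = T / n
    t = collect(0:n-1) .* dt
    X = X0 .* (1 .- t ./ T)     # deterministic TWAP position on the grid
    # Each path: sum of X_t * sigma * S0 * dW_t with dW_t ~ Normal(0, dt)
    samples = [sum(X .* sigma .* S0 .* sqrt(dt) .* randn(rng, n)) for _ in 1:paths]
    analytic = sigma^2 * S0^2 * sum(X .^ 2) * dt
    var(samples), analytic
end

simulated, analytic = variance_check()
</code></pre></div></div>

<p>The two numbers should agree up to Monte Carlo noise, which shrinks as the number of paths increases.</p>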

<p>Using the Euler-Lagrange equations again, with \(A = \eta\) and \(B = \lambda \sigma ^2 \bar{S}_0^2\) and remembering that \(v_t = -\dot{X}_t\),</p>

\[\begin{align*}
f &amp; = A v_t^2 + B X_t^2 = A \dot{X}_t^2 + B X_t^2 \\
\frac{\partial f}{\partial X} &amp; = 2B X_t \\
\frac{\partial f}{\partial \dot{X}} &amp; = 2A \dot{X}_t \\
B X_t &amp; = A\frac{\mathrm{d} }{\mathrm{d} t} \dot{X}_t \\
\frac{\mathrm{d}^2}{\mathrm{d} t^2} X_t &amp; = \frac{B}{A} X_t.
\end{align*}\]

<p>This is a second-order linear ordinary differential equation with
solution</p>

\[X_t = c_1 e^{\kappa t} + c_2 e ^{- \kappa t}, \quad \kappa = \sqrt{\frac{B}{A}} = \sqrt{\frac{\lambda \sigma ^2 \bar{S}_0^2}{\eta}}.\]

<p>Again, applying boundary conditions</p>

\[X_0 = c_1 + c_2,\]

\[X_T = 0 = c_1 e^{\kappa T} + c_2 e^{-\kappa T},\]

\[X_t = X_0 \frac{\text{sinh} \left( \kappa (T-t) \right)}{\text{sinh} \left( \kappa T \right)}.\]

<p>Which is a funny expression, but underneath it is just a combination of exponentials.</p>

<p>We now have the additional \(\lambda\) parameter and so plot the
execution schedule for different risk aversions</p>

<p><img src="/assets/optexmaths/ag.png" alt="Comparing the TWAP to the Almgren Chriss model" title="Comparing the TWAP to the Almgren Chriss model" /></p>

<p>A lower \(\lambda\) means a higher risk tolerance, so the schedule becomes
closer to the TWAP. In general, we can see that the Almgren Chriss
solution is front-loaded - most of the trading is done early on in the
time window.</p>
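
<p>The schedule is simple to put into code. This is a minimal sketch, assuming the rate \(\kappa = \sqrt{\lambda \sigma ^2 \bar{S}_0^2 / \eta}\) that comes out of the Euler-Lagrange equation for this cost function; the function name <em>ac_schedule</em> and the parameter values are made up for illustration.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Remaining position at time t under the mean-variance optimal schedule.
# Assumes kappa = sqrt(lambda * sigma^2 * S0^2 / eta).
function ac_schedule(t, X0, T; lambda, sigma, S0, eta)
    kappa = sqrt(lambda * sigma^2 * S0^2 / eta)
    X0 * sinh(kappa * (T - t)) / sinh(kappa * T)
end

X0, T = 1000.0, 1.0     # illustrative values

# Position halfway through the window for a nearly risk-neutral trader
# and for a strongly risk-averse one
nearTwap = ac_schedule(0.5, X0, T; lambda=1e-10, sigma=0.01, S0=100.0, eta=0.001)
riskAverse = ac_schedule(0.5, X0, T; lambda=10.0, sigma=0.01, S0=100.0, eta=0.001)
</code></pre></div></div>

<p>With \(\lambda\) close to zero the halfway position is about half of \(X_0\), which is the TWAP. With a large \(\lambda\) almost everything has been traded by the halfway point.</p>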

<h2 id="summary">Summary</h2>

<p>Ok maths over, put down your pencils and breathe. We’ve gone through
the full problem set-up and shown how the TWAP minimises expected
costs for a risk-neutral investor and how an exponential execution
schedule minimises cost for a risk-sensitive investor.</p>

<p>Now we know the maths we can go on to do some interesting things.</p>]]></content><author><name>Dean Markwick</name></author><summary type="html"><![CDATA[The Almgren Chris model from Optimal Execution of Portfolio Transactions is the most well known optimal execution model and provides the foundational math about how to think about trading some quantity of an asset. This blog post goes through the math and how we set the problem up and arrived at the various solutions.]]></summary></entry><entry><title type="html">Currency Hedging and Principal Component Analysis</title><link href="https://dm13450.github.io/2024/04/25/Currency-Hedging-and-Principal-Component-Analysis.html" rel="alternate" type="text/html" title="Currency Hedging and Principal Component Analysis" /><published>2024-04-25T00:00:00+00:00</published><updated>2024-04-25T00:00:00+00:00</updated><id>https://dm13450.github.io/2024/04/25/Currency%20Hedging%20and%20Principal%20Component%20Analysis</id><content type="html" xml:base="https://dm13450.github.io/2024/04/25/Currency-Hedging-and-Principal-Component-Analysis.html"><![CDATA[<p>Principal component analysis (PCA) reduces a dataset to its main
components. When we apply it to a dataset of different
currencies it helps us understand how each currency drives the overall
portfolio and what currency might be a common factor.</p>

<p></p>
<hr />
<p>Enjoy these types of posts? Then you should sign up for my newsletter.</p>
<div style="text-align: center;">
<iframe src="https://dm13450.substack.com/embed" width="480" height="150" style="border:1px solid ##fdfdfd; background:#fdfdfd;" frameborder="0" scrolling="no"></iframe>
</div>
<hr />
<p></p>

<p>This post was inspired by a problem on the <a href="https://www.reddit.com/r/quant/">r/quant</a> subreddit where someone posted their interview/take-home question.</p>

<blockquote>
  <p>A client is considering using SGD to (proxy) hedge their exposure to
a basket of other Asian currencies. Is this likely to be effective?
What analysis could you produce that would help inform their
decision? The client is a US Corporate. The client is exposed to
medium-term changes (say monthly) in the currency. The client has equal (USD equivalent) revenues in each Asian currency. We are not considering hedging costs for this analysis (spot-only component). The data for daily close spot values against USD for each pair is provided. Which currency pairs will it work better for? Would it work for an equally weighted currency portfolio? Would another (single) currency work better? Which correlations should we consider and how reliable are these?</p>
</blockquote>

<p>This is an interesting question and not too dissimilar to the
occasional question I answer in my day job. So I thought I’d run through how I might answer it.</p>

<h2 id="getting-fx-data">Getting FX Data</h2>

<p>First, we need to get some data and I’ll be using AlphaVantage to pull
daily closing prices of the different currencies. I’ll calculate the
log returns and save the data to cache it for future use. The cache matters because
AlphaVantage only lets you make 25 calls a day, so each time I mucked
up I got locked out for the day, delaying the analysis. We have to
start from 2014 as this is the earliest common date across all
currencies.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">function</span><span class="nf"> _pull_data</span><span class="x">(</span><span class="n">ccy</span><span class="x">)</span>
    <span class="n">println</span><span class="x">(</span><span class="n">ccy</span><span class="x">)</span>
    <span class="n">res</span> <span class="o">=</span> <span class="n">AlphaVantage</span><span class="o">.</span><span class="n">fx_daily</span><span class="x">(</span><span class="s">"USD"</span><span class="x">,</span> <span class="n">ccy</span><span class="x">,</span> <span class="n">outputsize</span><span class="o">=</span><span class="s">"full"</span><span class="x">,</span> <span class="n">datatype</span><span class="o">=</span><span class="s">"csv"</span><span class="x">)</span>
    <span class="n">res</span> <span class="o">=</span> <span class="n">DataFrame</span><span class="x">(</span><span class="kt">Dict</span><span class="x">(</span><span class="o">:</span><span class="kt">Date</span><span class="o">=&gt;</span><span class="kt">Date</span><span class="o">.</span><span class="x">(</span><span class="n">res</span><span class="x">[</span><span class="mi">1</span><span class="x">][</span><span class="o">:</span><span class="x">,</span> <span class="mi">1</span><span class="x">]),</span> <span class="o">:</span><span class="n">c</span><span class="o">=&gt;</span><span class="kt">Float64</span><span class="o">.</span><span class="x">(</span><span class="n">res</span><span class="x">[</span><span class="mi">1</span><span class="x">][</span><span class="o">:</span><span class="x">,</span><span class="mi">5</span><span class="x">]),</span> <span class="o">:</span><span class="n">ccy</span> <span class="o">=&gt;</span> <span class="n">ccy</span><span class="x">));</span>
    <span class="n">res</span> <span class="o">=</span> <span class="n">sort</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="kt">Date</span><span class="x">)</span>
    <span class="n">res</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="n">LogReturn</span> <span class="o">=</span> <span class="x">[</span><span class="mi">0</span><span class="x">;</span> <span class="n">diff</span><span class="x">(</span><span class="n">log</span><span class="o">.</span><span class="x">(</span><span class="o">:</span><span class="n">c</span><span class="x">))])</span>
    <span class="n">res</span>
<span class="k">end</span>

<span class="k">function</span><span class="nf"> pull_data</span><span class="x">(</span><span class="n">ccy</span><span class="x">)</span>
    <span class="k">if</span> <span class="n">isfile</span><span class="x">(</span><span class="s">"</span><span class="si">$</span><span class="s">ccy.csv"</span><span class="x">)</span>
        <span class="n">res</span> <span class="o">=</span> <span class="n">CSV</span><span class="o">.</span><span class="n">read</span><span class="x">(</span><span class="s">"</span><span class="si">$</span><span class="s">ccy.csv"</span><span class="x">,</span> <span class="n">DataFrame</span><span class="x">)</span>
    <span class="k">else</span>
        <span class="n">res</span> <span class="o">=</span> <span class="n">_pull_data</span><span class="x">(</span><span class="n">ccy</span><span class="x">)</span>
        <span class="n">CSV</span><span class="o">.</span><span class="n">write</span><span class="x">(</span><span class="s">"</span><span class="si">$</span><span class="s">ccy.csv"</span><span class="x">,</span> <span class="n">res</span><span class="x">)</span>
    <span class="k">end</span>
    <span class="n">res</span>
<span class="k">end</span>

<span class="n">ccys</span> <span class="o">=</span> <span class="x">[</span><span class="s">"JPY"</span><span class="x">,</span> <span class="s">"CNH"</span><span class="x">,</span> <span class="s">"SGD"</span><span class="x">,</span> <span class="s">"THB"</span><span class="x">,</span> <span class="s">"HKD"</span><span class="x">,</span> <span class="s">"KRW"</span><span class="x">,</span> <span class="s">"TWD"</span><span class="x">]</span>
<span class="n">res</span> <span class="o">=</span> <span class="n">vcat</span><span class="x">(</span><span class="n">pull_data</span><span class="o">.</span><span class="x">(</span><span class="n">ccys</span><span class="x">)</span><span class="o">...</span><span class="x">);</span>
<span class="n">res</span> <span class="o">=</span> <span class="n">sort</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="kt">Date</span><span class="x">)</span>
<span class="n">res</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">groupby</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="n">ccy</span><span class="x">),</span> <span class="o">:</span><span class="n">LogReturn</span> <span class="o">=</span> <span class="x">[</span><span class="mi">0</span><span class="x">;</span> <span class="n">diff</span><span class="x">(</span><span class="n">log</span><span class="o">.</span><span class="x">(</span><span class="o">:</span><span class="n">c</span><span class="x">))])</span>
<span class="n">res</span> <span class="o">=</span> <span class="nd">@subset</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="kt">Date</span> <span class="o">.&gt;=</span> <span class="kt">Date</span><span class="x">(</span><span class="s">"2014-11-24"</span><span class="x">))</span>
</code></pre></div></div>

<p>Like all good blog posts, let’s start with the plot of the
cumulative returns. Only HKD stands out as something different given
its peg to USD.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">p</span> <span class="o">=</span> <span class="n">plot</span><span class="x">(</span><span class="n">ylabel</span> <span class="o">=</span> <span class="s">"Cumulative Return"</span><span class="x">)</span>
<span class="k">for</span> <span class="n">ccy</span> <span class="k">in</span> <span class="n">ccys</span>
    <span class="n">plot!</span><span class="x">(</span><span class="n">p</span><span class="x">,</span> <span class="n">res</span><span class="x">[</span><span class="n">res</span><span class="o">.</span><span class="n">ccy</span> <span class="o">.==</span> <span class="n">ccy</span><span class="x">,</span> <span class="o">:</span><span class="x">]</span><span class="o">.</span><span class="kt">Date</span><span class="x">,</span> <span class="n">cumsum</span><span class="x">(</span><span class="n">res</span><span class="x">[</span><span class="n">res</span><span class="o">.</span><span class="n">ccy</span> <span class="o">.==</span> <span class="n">ccy</span><span class="x">,</span> <span class="o">:</span><span class="x">]</span><span class="o">.</span><span class="n">LogReturn</span><span class="x">),</span> <span class="n">label</span> <span class="o">=</span> <span class="n">ccy</span><span class="x">,</span> <span class="n">lw</span> <span class="o">=</span> <span class="mi">2</span><span class="x">)</span>
<span class="k">end</span>
<span class="n">p</span>
</code></pre></div></div>

<p><img src="/assets/asianccys/ccyReturns.png" alt="Asian Currency Returns" title="Asian
Currency Returns" /></p>

<p>According to the problem, our client is long equal amounts of these
Asian currencies, so it makes sense to calculate the market returns by
taking the average return each day.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">market</span> <span class="o">=</span> <span class="nd">@combine</span><span class="x">(</span><span class="n">groupby</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="kt">Date</span><span class="x">),</span> <span class="o">:</span><span class="n">LogReturn</span> <span class="o">=</span> <span class="n">mean</span><span class="x">(</span><span class="o">:</span><span class="n">LogReturn</span><span class="x">))</span>
<span class="n">market</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="o">:</span><span class="n">ccy</span><span class="x">]</span> <span class="o">.=</span> <span class="s">"Market"</span>
<span class="n">market</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="o">:</span><span class="n">c</span><span class="x">]</span> <span class="o">.=</span> <span class="nb">NaN</span><span class="x">;</span>
</code></pre></div></div>

<p>Which we add to the original plot.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">p</span> <span class="o">=</span> <span class="n">plot!</span><span class="x">(</span><span class="n">p</span><span class="x">,</span> <span class="n">market</span><span class="o">.</span><span class="kt">Date</span><span class="x">,</span> <span class="n">cumsum</span><span class="x">(</span><span class="n">market</span><span class="o">.</span><span class="n">LogReturn</span><span class="x">),</span>
    <span class="n">label</span> <span class="o">=</span> <span class="s">"Market"</span><span class="x">,</span> <span class="n">color</span> <span class="o">=</span> <span class="s">"black"</span><span class="x">,</span> <span class="n">lw</span>  <span class="o">=</span> <span class="mi">2</span><span class="x">)</span>
</code></pre></div></div>

<p><img src="/assets/asianccys/ccyReturnsMarket.png" alt="Asian currency returns with market returns" title="Asian
currency returns with market returns" /></p>

<p>The client thinks that hedging with SGD alone is enough to protect
against the overall market returns. We can see from the graph that
this probably isn’t the case. But how do we recommend a better
approach?</p>

<p>First, we will start with the correlation in returns between the
different currencies. This will shed some light on how linked they
are and is also simple to explain to the client.</p>

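<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Pivot to one column of log returns per currency, keeping only dates where all are present</span>
<span class="n">modelData</span> <span class="o">=</span> <span class="n">dropmissing</span><span class="x">(</span><span class="n">unstack</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="kt">Date</span><span class="x">,</span> <span class="o">:</span><span class="n">ccy</span><span class="x">,</span> <span class="o">:</span><span class="n">LogReturn</span><span class="x">))</span>
<span class="n">cr</span> <span class="o">=</span> <span class="n">cor</span><span class="x">(</span><span class="kt">Matrix</span><span class="x">(</span><span class="n">modelData</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="x">[</span><span class="o">:</span><span class="n">JPY</span><span class="x">,</span> <span class="o">:</span><span class="n">CNH</span><span class="x">,</span> <span class="o">:</span><span class="n">SGD</span><span class="x">,</span> <span class="o">:</span><span class="n">THB</span><span class="x">,</span> <span class="o">:</span><span class="n">HKD</span><span class="x">,</span> <span class="o">:</span><span class="n">KRW</span><span class="x">,</span> <span class="o">:</span><span class="n">TWD</span><span class="x">]]))</span>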
<span class="n">heatmap</span><span class="x">(</span><span class="n">ccys</span><span class="x">,</span> <span class="n">ccys</span><span class="x">,</span> <span class="n">cr</span> <span class="o">.&gt;</span> <span class="mf">0.5</span><span class="x">)</span>
</code></pre></div></div>

<p>We use a heat-map, but only highlight when two currencies have a
correlation &gt; 0.5, otherwise it’s a bit of a psychedelic nightmare.</p>

<p><img src="/assets/asianccys/ccyCorr.png" alt="Asian currency correlations" title="Asian currency correlations" /></p>

<p>We can see that HKD has a low correlation with most of the others, KRW and SGD are
highly correlated with each other, and KRW is highly correlated with the majority of
these currencies.
However, we will use the covariance matrix rather than the correlation matrix to find the best hedging portfolio.</p>

<h2 id="principal-component-analysis">Principal Component Analysis</h2>

<p>Principal component analysis (or PCA) is a tool that tries to find a
common basis of variation in a matrix. It’s about transforming the
data into uncorrelated components through linear algebra.</p>

<p>For this we are using the covariance matrix, so the diagonals are
the variances of each currency’s returns and the off-diagonals are the
covariances between two currencies. If this were a different problem
we might rescale the returns so they all had the same volatility, but
this would mean applying leverage, which our hypothetical customer
probably wouldn’t be up for.</p>

<p>We pull out the covariance matrix</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">modelData</span> <span class="o">=</span> <span class="n">dropmissing</span><span class="x">(</span><span class="n">unstack</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="kt">Date</span><span class="x">,</span> <span class="o">:</span><span class="n">ccy</span><span class="x">,</span> <span class="o">:</span><span class="n">LogReturn</span><span class="x">))</span>
<span class="n">cm</span> <span class="o">=</span> <span class="n">cov</span><span class="x">(</span><span class="kt">Matrix</span><span class="x">(</span><span class="n">modelData</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="x">[</span><span class="o">:</span><span class="n">JPY</span><span class="x">,</span> <span class="o">:</span><span class="n">CNH</span><span class="x">,</span> <span class="o">:</span><span class="n">SGD</span><span class="x">,</span> <span class="o">:</span><span class="n">THB</span><span class="x">,</span> <span class="o">:</span><span class="n">HKD</span><span class="x">,</span> <span class="o">:</span><span class="n">KRW</span><span class="x">,</span> <span class="o">:</span><span class="n">TWD</span><span class="x">]]))</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">MultivariateStats.jl</code> package has the functions for doing PCA and
the appropriate functions for pulling out the right data after fitting the
PCA model.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">pcaRes</span> <span class="o">=</span> <span class="n">fit</span><span class="x">(</span><span class="n">PCA</span><span class="x">,</span> <span class="n">cm</span><span class="x">;</span> <span class="n">maxoutdim</span><span class="o">=</span><span class="mi">3</span><span class="x">)</span>
</code></pre></div></div>

<p>Firstly the weights of all the currencies for the three principal
components.</p>

<table>
  <thead>
    <tr>
      <th> </th>
      <th>PC1 Weights</th>
      <th>PC2 Weights</th>
      <th>PC3 Weights</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>JPY</td>
      <td>4.96845E-06</td>
      <td>9.11362E-06</td>
      <td>-2.98467E-07</td>
    </tr>
    <tr>
      <td>CNH</td>
      <td>2.11372E-06</td>
      <td>-1.1987E-06</td>
      <td>-4.78571E-08</td>
    </tr>
    <tr>
      <td>SGD</td>
      <td>3.35545E-06</td>
      <td>-5.17405E-07</td>
      <td>-1.00414E-07</td>
    </tr>
    <tr>
      <td>THB</td>
      <td>3.21579E-06</td>
      <td>-7.50513E-07</td>
      <td>3.05907E-06</td>
    </tr>
    <tr>
      <td>HKD</td>
      <td>4.21256E-08</td>
      <td>-7.74387E-08</td>
      <td>-1.84514E-08</td>
    </tr>
    <tr>
      <td>KRW</td>
      <td>7.67389E-06</td>
      <td>-4.39207E-06</td>
      <td>-8.40943E-07</td>
    </tr>
    <tr>
      <td>TWD</td>
      <td>2.42907E-06</td>
      <td>-2.01299E-06</td>
      <td>-6.01965E-07</td>
    </tr>
  </tbody>
</table>

<ul>
  <li>PC1 shows the weights for each currency but is unnormalised. The key thing
we can see here is that HKD is orders of magnitude smaller than the others.</li>
  <li>PC2 is long JPY and short all the others</li>
  <li>PC3 is long THB and short all the others</li>
</ul>

<p>Then the explained variance of the three components.</p>

<table>
  <thead>
    <tr>
      <th> </th>
      <th>PC1</th>
      <th>PC2</th>
      <th>PC3</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Eigenvalues</td>
      <td>1.15544e-10</td>
      <td>1.08674e-10</td>
      <td>1.05292e-11</td>
    </tr>
    <tr>
      <td>Variance explained</td>
      <td>0.47267</td>
      <td>0.444567</td>
      <td>0.0430731</td>
    </tr>
    <tr>
      <td>Cumulative variance</td>
      <td>0.47267</td>
      <td>0.917237</td>
      <td>0.96031</td>
    </tr>
  </tbody>
</table>

<p>The first component explains 47% of the variance and including
the second component takes this to 92%, with the final component
adding 4% to reach 96% in total. This means that this dataset
can be broken down quite nicely into the first two principal components,
which explain most of the variation.</p>
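
<p>As a rough illustration of where explained-variance numbers like these come from, here is a minimal sketch on simulated data that takes the eigenvalues of a covariance matrix directly with <code class="language-plaintext highlighter-rouge">LinearAlgebra</code>. It is not the code behind the tables above; the simulated returns, the factor loadings and all the variable names are assumptions made purely for the example.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using LinearAlgebra, Statistics, Random

Random.seed!(1)

# Hypothetical data: 500 days of returns for 4 assets that share one common factor
common = 0.01 .* randn(500)
returns = common .* [1.0 0.8 0.6 0.1] .+ 0.002 .* randn(500, 4)

covMat = cov(returns)                  # 4x4 covariance matrix of the columns
decomp = eigen(Symmetric(covMat))

# Order the components from largest to smallest eigenvalue
order = sortperm(decomp.values, rev = true)
lambda = decomp.values[order]
weights = decomp.vectors[:, order]     # column j holds the weights of component j

explained = lambda ./ sum(lambda)      # fraction of variance explained by each component
cumulative = cumsum(explained)         # running total, ends at 1.0
</code></pre></div></div>

<p>Each eigenvalue is the variance of its component, so dividing by the sum of all the eigenvalues gives the ‘variance explained’ row and the running sum gives the ‘cumulative variance’ row.</p>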

<p>The first principal component is commonly called the
‘market’ portfolio and represents the overall combined market dynamics
of the portfolio. The next portfolio (using the 2nd PC weights) is
uncorrelated with the market portfolio and so diversifies away from
it.</p>

<p>In our problem then we can see that we are trying to come up with a
representation of the market and use that to decide how to hedge out
our currencies. So the first principal component is the most relevant.</p>

<p>We take these principal component weights and join them to the original
dataframe to start exploring what the market portfolio looks like.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">evFrame</span> <span class="o">=</span> <span class="n">DataFrame</span><span class="x">(</span><span class="kt">Dict</span><span class="x">(</span><span class="o">:</span><span class="n">ccy</span> <span class="o">=&gt;</span> <span class="kt">String</span><span class="o">.</span><span class="x">([</span><span class="o">:</span><span class="n">JPY</span><span class="x">,</span> <span class="o">:</span><span class="n">CNH</span><span class="x">,</span> <span class="o">:</span><span class="n">SGD</span><span class="x">,</span> <span class="o">:</span><span class="n">THB</span><span class="x">,</span> <span class="o">:</span><span class="n">HKD</span><span class="x">,</span> <span class="o">:</span><span class="n">KRW</span><span class="x">,</span> <span class="o">:</span><span class="n">TWD</span><span class="x">]),</span> 
          <span class="o">:</span><span class="n">ev1</span> <span class="o">=&gt;</span> <span class="n">eigvecs</span><span class="x">(</span><span class="n">pcaRes</span><span class="x">)[</span><span class="o">:</span><span class="x">,</span><span class="mi">1</span><span class="x">],</span>
          <span class="o">:</span><span class="n">ev2</span> <span class="o">=&gt;</span> <span class="n">eigvecs</span><span class="x">(</span><span class="n">pcaRes</span><span class="x">)[</span><span class="o">:</span><span class="x">,</span><span class="mi">2</span><span class="x">]))</span>
<span class="n">sort!</span><span class="x">(</span><span class="n">evFrame</span><span class="x">,</span> <span class="o">:</span><span class="n">ev1</span><span class="x">)</span>

<span class="n">res</span> <span class="o">=</span> <span class="n">leftjoin</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="n">dropmissing</span><span class="x">(</span><span class="n">evFrame</span><span class="x">),</span> <span class="n">on</span> <span class="o">=</span> <span class="o">:</span><span class="n">ccy</span><span class="x">)</span>
</code></pre></div></div>

<p>Then plotting the weights by currency pair</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">bar</span><span class="x">(</span><span class="n">evFrame</span><span class="o">.</span><span class="n">ccy</span><span class="x">,</span> <span class="n">evFrame</span><span class="o">.</span><span class="n">ev1</span> <span class="o">./</span> <span class="n">sum</span><span class="x">(</span><span class="n">evFrame</span><span class="o">.</span><span class="n">ev1</span><span class="x">),</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Eigen Weights"</span><span class="x">)</span>
</code></pre></div></div>

<p><img src="/assets/asianccys/eigenWeights.png" alt="First principal component weights" title="First principal component weights" /></p>

<p>These are the weights of the different currencies in the first eigen
portfolio. This combination of currencies is what we would recommend
if the client were exposed to a similar basket. The key points:</p>

<ul>
  <li>The client is long these currencies through their business</li>
  <li>They short this portfolio and thus are market-neutral</li>
</ul>

<p>We now calculate the returns of the eigen portfolios, along with
portfolios that use only the currencies with the largest two (and three) weights.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">evPortfolios</span> <span class="o">=</span> <span class="nd">@combine</span><span class="x">(</span><span class="n">groupby</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="kt">Date</span><span class="x">),</span> 
         <span class="o">:</span><span class="n">ReturnEV1</span> <span class="o">=</span> <span class="n">sum</span><span class="x">(</span><span class="o">:</span><span class="n">LogReturn</span> <span class="o">.*</span> <span class="o">:</span><span class="n">ev1</span><span class="x">)</span> <span class="o">./</span> <span class="n">sum</span><span class="x">(</span><span class="o">:</span><span class="n">ev1</span><span class="x">),</span> 
         <span class="o">:</span><span class="n">ReturnEV2</span> <span class="o">=</span> <span class="n">sum</span><span class="x">(</span><span class="o">:</span><span class="n">LogReturn</span> <span class="o">.*</span> <span class="o">:</span><span class="n">ev2</span><span class="x">)</span> <span class="o">./</span> <span class="n">sum</span><span class="x">(</span><span class="o">:</span><span class="n">ev2</span><span class="x">));</span>

<span class="n">ccy2Portfolio</span> <span class="o">=</span> <span class="nd">@combine</span><span class="x">(</span><span class="n">groupby</span><span class="x">(</span><span class="n">res</span><span class="x">[</span><span class="k">in</span><span class="o">.</span><span class="x">(</span><span class="n">res</span><span class="o">.</span><span class="n">ccy</span><span class="x">,</span> <span class="kt">Ref</span><span class="x">([</span><span class="s">"KRW"</span><span class="x">,</span> <span class="s">"JPY"</span><span class="x">])),</span> <span class="o">:</span><span class="x">],</span> <span class="o">:</span><span class="kt">Date</span><span class="x">),</span> 
         <span class="o">:</span><span class="n">Return2Ccy</span> <span class="o">=</span> <span class="n">sum</span><span class="x">(</span><span class="o">:</span><span class="n">LogReturn</span> <span class="o">.*</span> <span class="o">:</span><span class="n">ev1</span><span class="x">)</span> <span class="o">./</span> <span class="n">sum</span><span class="x">(</span><span class="o">:</span><span class="n">ev1</span><span class="x">));</span>

<span class="n">ccy3Portfolio</span> <span class="o">=</span> <span class="nd">@combine</span><span class="x">(</span><span class="n">groupby</span><span class="x">(</span><span class="n">res</span><span class="x">[</span><span class="k">in</span><span class="o">.</span><span class="x">(</span><span class="n">res</span><span class="o">.</span><span class="n">ccy</span><span class="x">,</span> <span class="kt">Ref</span><span class="x">([</span><span class="s">"KRW"</span><span class="x">,</span> <span class="s">"JPY"</span><span class="x">,</span> <span class="s">"SGD"</span><span class="x">])),</span> <span class="o">:</span><span class="x">],</span> <span class="o">:</span><span class="kt">Date</span><span class="x">),</span> 
         <span class="o">:</span><span class="n">Return3Ccy</span> <span class="o">=</span> <span class="n">sum</span><span class="x">(</span><span class="o">:</span><span class="n">LogReturn</span> <span class="o">.*</span> <span class="o">:</span><span class="n">ev1</span><span class="x">)</span> <span class="o">./</span> <span class="n">sum</span><span class="x">(</span><span class="o">:</span><span class="n">ev1</span><span class="x">));</span>
</code></pre></div></div>

<p>And plotting these returns</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">plot</span><span class="x">(</span><span class="n">market</span><span class="o">.</span><span class="kt">Date</span><span class="x">,</span> <span class="n">cumsum</span><span class="x">(</span><span class="n">market</span><span class="o">.</span><span class="n">LogReturn</span><span class="x">),</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Market"</span><span class="x">,</span> <span class="n">color</span> <span class="o">=</span> <span class="s">"black"</span><span class="x">,</span> <span class="n">lw</span> <span class="o">=</span> <span class="mi">2</span><span class="x">)</span>
<span class="n">plot!</span><span class="x">(</span><span class="n">evPortfolios</span><span class="o">.</span><span class="kt">Date</span><span class="x">,</span>  <span class="n">cumsum</span><span class="x">(</span><span class="n">evPortfolios</span><span class="o">.</span><span class="n">ReturnEV1</span><span class="x">),</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Eigen Portfolio"</span><span class="x">,</span> <span class="n">lw</span> <span class="o">=</span> <span class="mi">2</span><span class="x">)</span>
<span class="n">plot!</span><span class="x">(</span><span class="n">ccy2Portfolio</span><span class="o">.</span><span class="kt">Date</span><span class="x">,</span>  <span class="n">cumsum</span><span class="x">(</span><span class="n">ccy2Portfolio</span><span class="o">.</span><span class="n">Return2Ccy</span><span class="x">),</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"2 Ccy"</span><span class="x">,</span> <span class="n">lw</span> <span class="o">=</span><span class="mi">2</span><span class="x">)</span>
<span class="n">plot!</span><span class="x">(</span><span class="n">ccy3Portfolio</span><span class="o">.</span><span class="kt">Date</span><span class="x">,</span>  <span class="n">cumsum</span><span class="x">(</span><span class="n">ccy3Portfolio</span><span class="o">.</span><span class="n">Return3Ccy</span><span class="x">),</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"3 Ccy"</span><span class="x">,</span> <span class="n">lw</span> <span class="o">=</span> <span class="mi">2</span><span class="x">)</span>
</code></pre></div></div>

<p><img src="/assets/asianccys/eigenPortfolio.png" alt="Eigen portfolio returns" title="Eigen portfolio returns" /></p>

<p>Then finally, looking at the correlation between these portfolios</p>

<table>
  <thead>
    <tr>
      <th> </th>
      <th>Market Return</th>
      <th>Market Eigen Portfolio</th>
      <th>2nd Eigen Portfolio</th>
      <th>KRW + JPY</th>
      <th>KRW + JPY + SGD</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Market Return</td>
      <td>1.0</td>
      <td>0.99</td>
      <td>0.01</td>
      <td>0.93</td>
      <td>0.95</td>
    </tr>
    <tr>
      <td>Market Eigen Portfolio</td>
      <td>0.99</td>
      <td>1.0</td>
      <td>0.01</td>
      <td>0.97</td>
      <td>0.98</td>
    </tr>
    <tr>
      <td>2nd Eigen Portfolio</td>
      <td>0.01</td>
      <td>0.01</td>
      <td>1.0</td>
      <td>0.11</td>
      <td>0.08</td>
    </tr>
    <tr>
      <td>KRW + JPY</td>
      <td>0.93</td>
      <td>0.97</td>
      <td>0.11</td>
      <td>1.0</td>
      <td>0.99</td>
    </tr>
    <tr>
      <td>KRW + JPY + SGD</td>
      <td>0.95</td>
      <td>0.99</td>
      <td>0.08</td>
      <td>0.99</td>
      <td>1.0</td>
    </tr>
  </tbody>
</table>

<ul>
  <li>Eigen portfolio 1 is the most correlated with the equal-weighted market portfolio (0.99).</li>
  <li>With just KRW and JPY you get to a 93% correlation with the market.</li>
  <li>KRW, JPY and SGD get you to a 95% correlation with the market.</li>
</ul>

<p>As expected, eigen portfolio 2 is the least correlated with the
market.</p>
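
<p>A table like this comes down to a single call to <code class="language-plaintext highlighter-rouge">cor</code> on the portfolio returns. The sketch below shows the idea on simulated returns rather than the currency data, so the numbers will differ from the table; the data, the factor loadings and the variable names are all assumptions for illustration.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using LinearAlgebra, Statistics, Random

Random.seed!(42)

# Hypothetical data: 1000 days of returns for 4 assets that share one common factor
common = 0.01 .* randn(1000)
returns = common .* [1.0 0.8 0.6 0.1] .+ 0.002 .* randn(1000, 4)

# Equal-weighted "market" return for each day
market = vec(mean(returns, dims = 2))

# Eigenvalues of a Symmetric matrix come back in ascending order,
# so the last column is the first principal component
vecs = eigen(Symmetric(cov(returns))).vectors
pc1 = vecs[:, end]
pc2 = vecs[:, end - 1]

# Daily returns of the two eigen portfolios, weights scaled to sum to one
eigen1 = returns * (pc1 ./ sum(pc1))
eigen2 = returns * (pc2 ./ sum(pc2))

# 3x3 correlation matrix: market, first and second eigen portfolios
corMat = cor(hcat(market, eigen1, eigen2))
</code></pre></div></div>

<p>In this setup the two eigen portfolios are uncorrelated with each other by construction, while the first one tracks the equal-weighted market closely because every asset loads on the same common factor.</p>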

<h2 id="summary">Summary</h2>

<p>So our final answer to the client would be:</p>

<ul>
  <li>We have a proprietary portfolio (the market eigen portfolio) that you
should hedge with - this will give you the best outcome.</li>
  <li>If you don’t want the full portfolio use a 60/40 ratio of KRW and
JPY.</li>
  <li>Hedging with SGD alone probably isn’t a great idea and will leave you exposed.</li>
</ul>
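
<p>One way to arrive at a split like the 60/40 above is to rescale the two largest PC1 weights so they sum to one. This is a quick sketch, assuming the ratio is meant to come from the PC1 weights in the earlier table; the variable names are made up for the example.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code># PC1 weights of the two largest currencies, copied from the weights table
pc1Weights = Dict("KRW" =&gt; 7.67389e-6, "JPY" =&gt; 4.96845e-6)

# Rescale so the two hedge weights sum to one
total = sum(values(pc1Weights))
hedgeRatio = Dict(ccy =&gt; w / total for (ccy, w) in pc1Weights)

# hedgeRatio["KRW"] is roughly 0.607 and hedgeRatio["JPY"] roughly 0.393
</code></pre></div></div>

<p>Rounded, that is about 60% KRW and 40% JPY.</p>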

<p>Now, we are assuming that these weightings are stable through time,
haven’t changed recently and are therefore valid for future
returns too. We are also ignoring transaction costs: KRW is an NDF and
more expensive to trade than a spot currency (like JPY), so
this approach will break down if the client needs to hedge a
significant amount.</p>]]></content><author><name>Dean Markwick</name></author><summary type="html"><![CDATA[Principal component analysis (PCA) reduces a dataset to its main components. When we apply it to a dataset of different currencies it helps us understand how each currency drives the overall portfolio and what currency might be a common factor.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://dm13450.github.io/assets/asianccys/eigenPortfolio.png" /><media:content medium="image" url="https://dm13450.github.io/assets/asianccys/eigenPortfolio.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Calibrating an Ornstein–Uhlenbeck Process</title><link href="https://dm13450.github.io/2024/03/09/Calibrating-an-Ornstein-Uhlenbeck-Process.html" rel="alternate" type="text/html" title="Calibrating an Ornstein–Uhlenbeck Process" /><published>2024-03-09T00:00:00+00:00</published><updated>2024-03-09T00:00:00+00:00</updated><id>https://dm13450.github.io/2024/03/09/Calibrating-an-Ornstein%E2%80%93Uhlenbeck-Process</id><content type="html" xml:base="https://dm13450.github.io/2024/03/09/Calibrating-an-Ornstein-Uhlenbeck-Process.html"><![CDATA[<p>Read enough quant finance papers or books and you’ll come across the
Ornstein–Uhlenbeck (OU) process. This is a post that explores the OU
process, the equations, how we can simulate such a process and then estimate the parameters.</p>

<p>I’ve briefly touched on mean reversion and OU processes before in my
<a href="https://dm13450.github.io/2023/07/15/Stat-Arb-Walkthrough.html">Stat Arb - An Easy Walkthrough</a>
blog post where we modelled the spread between an asset and its
respective ETF. The whole concept of ‘mean reversion’ is something
that comes up frequently in finance and at different time scales. It
can be thought of as the first basic extension of Brownian motion:
instead of things moving purely randomly there is now a slight structure,
with the process oscillating around a constant value.</p>

<p>The Hudson Thames group have a similar post on OU processes (<a href="https://hudsonthames.org/caveats-in-calibrating-the-ou-process/">Mean-Reverting Spread Modeling: Caveats in Calibrating the OU Process</a>) and
my post should be a nice complement, with code and some extensions.</p>

<h2 id="the-ornstein-uhlenbeck-equation">The Ornstein-Uhlenbeck Equation</h2>

<p>As a continuous process, we write the change in \(X_t\) as an increment in time and some noise</p>

\[\mathrm{d}X_t = \theta (\mu - X_t) \mathrm{d}t + \sigma \mathrm{d}W_t\]

<p>The amount it changes in time depends on the previous \(X_t\) and two free parameters \(\mu\) and \(\theta\).</p>

<ul>
  <li>The \(\mu\) is the long-term mean of the process</li>
  <li>The \(\theta\) is the mean reversion or momentum parameter depending on the sign.</li>
</ul>

<p>If \(\theta\) is 0 we can see the equation collapses down to a simple random walk.</p>

<p>If we assume \(\mu = 0\), so the long-term average is 0, then a <strong>positive</strong> value of \(\theta\) means we see mean reversion. Large values of \(X\) mean the next change is likely to have a negative sign, leading to a smaller value in \(X\).</p>
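
<p>To make the pull towards the mean concrete, we can take expectations of the equation above. This is a standard result for the OU process rather than something derived in this post, and \(X_0\) here denotes the starting value:</p>

\[\mathbb{E}[X_t] = \mu + (X_0 - \mu) e^{-\theta t}\]

<p>So for positive \(\theta\) any initial gap to \(\mu\) decays exponentially, halving every \(\ln(2) / \theta\) units of time.</p>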

<p>A <strong>negative</strong> value of \(\theta\) means the opposite: a large value of \(X\) generates a further large positive change and the process explodes.
If we discretise the process we can simulate some samples with different parameters to illustrate these two modes.</p>

\[X_{t+1} - X_t = \theta (\mu - X_t) \Delta t + \sigma \sqrt{\Delta t} W_t\]

<p>where \(W_t \sim N(0,1)\).</p>

<p>This is easy to write out in Julia. We can save some time by drawing the random values first and then just summing everything together.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">using</span> <span class="n">Distributions</span><span class="x">,</span> <span class="n">Plots</span>

<span class="k">function</span><span class="nf"> simulate_os</span><span class="x">(</span><span class="n">theta</span><span class="x">,</span> <span class="n">mu</span><span class="x">,</span> <span class="n">sigma</span><span class="x">,</span> <span class="n">dt</span><span class="x">,</span> <span class="n">maxT</span><span class="x">,</span> <span class="n">initial</span><span class="x">)</span>
    <span class="n">p</span> <span class="o">=</span> <span class="kt">Array</span><span class="x">{</span><span class="kt">Float64</span><span class="x">}(</span><span class="nb">undef</span><span class="x">,</span> <span class="n">length</span><span class="x">(</span><span class="mi">0</span><span class="o">:</span><span class="n">dt</span><span class="o">:</span><span class="n">maxT</span><span class="x">))</span>
    <span class="n">p</span><span class="x">[</span><span class="mi">1</span><span class="x">]</span> <span class="o">=</span> <span class="n">initial</span>
    <span class="n">w</span> <span class="o">=</span> <span class="n">sigma</span> <span class="o">*</span> <span class="n">rand</span><span class="x">(</span><span class="n">Normal</span><span class="x">(),</span> <span class="n">length</span><span class="x">(</span><span class="n">p</span><span class="x">))</span> <span class="o">*</span> <span class="n">sqrt</span><span class="x">(</span><span class="n">dt</span><span class="x">)</span>
    <span class="k">for</span> <span class="n">i</span> <span class="k">in</span> <span class="mi">1</span><span class="o">:</span><span class="x">(</span><span class="n">length</span><span class="x">(</span><span class="n">p</span><span class="x">)</span><span class="o">-</span><span class="mi">1</span><span class="x">)</span>
        <span class="n">p</span><span class="x">[</span><span class="n">i</span><span class="o">+</span><span class="mi">1</span><span class="x">]</span> <span class="o">=</span> <span class="n">p</span><span class="x">[</span><span class="n">i</span><span class="x">]</span> <span class="o">+</span> <span class="n">theta</span><span class="o">*</span><span class="x">(</span><span class="n">mu</span><span class="o">-</span><span class="n">p</span><span class="x">[</span><span class="n">i</span><span class="x">])</span><span class="o">*</span><span class="n">dt</span> <span class="o">+</span> <span class="n">w</span><span class="x">[</span><span class="n">i</span><span class="x">]</span>
    <span class="k">end</span>
    <span class="k">return</span> <span class="n">p</span>
<span class="k">end</span>
</code></pre></div></div>
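
<p>The function above uses the simple Euler step from the discretised equation. As an aside, here is a minimal sketch of an alternative that uses the standard exact transition of the OU process, so the step size introduces no discretisation error. The name <code class="language-plaintext highlighter-rouge">simulate_ou_exact</code> and its <code class="language-plaintext highlighter-rouge">rng</code> keyword are my own assumptions for illustration and are not used elsewhere in this post. It assumes \(\theta \neq 0\).</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using Random

# Hypothetical helper (not part of the original post).
# Exact one-step transition of an OU process; assumes theta != 0.
function simulate_ou_exact(theta, mu, sigma, dt, maxT, initial; rng = Random.default_rng())
    n = length(0:dt:maxT)
    p = Vector{Float64}(undef, n)
    p[1] = initial
    # Deterministic pull towards mu over one step
    decay = exp(-theta * dt)
    # Standard deviation of the noise accumulated over one step
    step_sd = sigma * sqrt((1 - exp(-2 * theta * dt)) / (2 * theta))
    for i in 1:(n - 1)
        p[i + 1] = mu + (p[i] - mu) * decay + step_sd * randn(rng)
    end
    return p
end
</code></pre></div></div>

<p>It takes the same positional arguments as <code class="language-plaintext highlighter-rouge">simulate_os</code>, so it can be swapped in wherever that function is called. For the small time step used in this post the two should produce very similar paths.</p>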

<p>We have two classes of OU processes we want to simulate: a
mean-reverting version (\(\theta &gt; 0\)) and a momentum version (\(\theta &lt; 0\)).
We also want to simulate a random walk at the same time, so \(\theta =
0\). We will assume \(\mu = 0\), which keeps the pictures simple.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">maxT</span> <span class="o">=</span> <span class="mi">5</span>
<span class="n">dt</span> <span class="o">=</span> <span class="mi">1</span><span class="o">/</span><span class="x">(</span><span class="mi">60</span><span class="o">*</span><span class="mi">60</span><span class="x">)</span>
<span class="n">vol</span> <span class="o">=</span> <span class="mf">0.005</span>

<span class="n">initial</span> <span class="o">=</span> <span class="mf">0.00</span><span class="o">*</span><span class="n">rand</span><span class="x">(</span><span class="n">Normal</span><span class="x">())</span>

<span class="n">p1</span> <span class="o">=</span> <span class="n">simulate_os</span><span class="x">(</span><span class="o">-</span><span class="mf">0.5</span><span class="x">,</span> <span class="mi">0</span><span class="x">,</span> <span class="n">vol</span><span class="x">,</span> <span class="n">dt</span><span class="x">,</span> <span class="n">maxT</span><span class="x">,</span> <span class="n">initial</span><span class="x">)</span>
<span class="n">p2</span> <span class="o">=</span> <span class="n">simulate_os</span><span class="x">(</span><span class="mf">0.5</span><span class="x">,</span> <span class="mi">0</span><span class="x">,</span> <span class="n">vol</span><span class="x">,</span> <span class="n">dt</span><span class="x">,</span> <span class="n">maxT</span><span class="x">,</span> <span class="n">initial</span><span class="x">)</span>
<span class="n">p3</span> <span class="o">=</span> <span class="n">simulate_os</span><span class="x">(</span><span class="mi">0</span><span class="x">,</span> <span class="mi">0</span><span class="x">,</span> <span class="n">vol</span><span class="x">,</span> <span class="n">dt</span><span class="x">,</span> <span class="n">maxT</span><span class="x">,</span> <span class="n">initial</span><span class="x">)</span>

<span class="n">plot</span><span class="x">(</span><span class="mi">0</span><span class="o">:</span><span class="n">dt</span><span class="o">:</span><span class="n">maxT</span><span class="x">,</span> <span class="n">p1</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Momentum"</span><span class="x">)</span>
<span class="n">plot!</span><span class="x">(</span><span class="mi">0</span><span class="o">:</span><span class="n">dt</span><span class="o">:</span><span class="n">maxT</span><span class="x">,</span> <span class="n">p2</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Mean Reversion"</span><span class="x">)</span>
<span class="n">plot!</span><span class="x">(</span><span class="mi">0</span><span class="o">:</span><span class="n">dt</span><span class="o">:</span><span class="n">maxT</span><span class="x">,</span> <span class="n">p3</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Random Walk"</span><span class="x">)</span>
</code></pre></div></div>

<p><img src="/assets/ouprocess/oudemo.png" alt="Different values an OU process can look" title="Different values an OU process can look" /></p>

<p>The mean reversion path (orange) hasn’t moved far from the long-term average (\(\mu=0\)) and the momentum path has diverged the furthest from the starting point, which lines up with the name. The random walk sits in between the two, as we would expect.</p>

<p>Now that we have successfully simulated the process, we want to try and
estimate the \(\theta\) parameter from the simulation. We have two
slightly different (but similar) methods to achieve this.</p>

<h2 id="ols-calibration-of-an-ou-process">OLS Calibration of an OU Process</h2>

<p>When we look at the generating equation we can simply rearrange it into a linear equation.</p>

\[\Delta X = \theta \mu \Delta t - \theta \Delta t X_t + \epsilon\]

<p>and the usual OLS equation</p>

\[y = \alpha + \beta X + \epsilon\]

<p>such that</p>

\[\alpha = \theta \mu \Delta t\]

\[\beta = -\theta \Delta t\]

<p>where \(\epsilon\) is the noise. So we just need a DataFrame with the difference between subsequent observations and relate that to the current observation. Just a <code class="language-plaintext highlighter-rouge">diff</code> and a shift.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">using</span> <span class="n">DataFrames</span><span class="x">,</span> <span class="n">DataFramesMeta</span>
<span class="n">momData</span> <span class="o">=</span> <span class="n">DataFrame</span><span class="x">(</span><span class="n">y</span><span class="o">=</span><span class="n">p1</span><span class="x">)</span>
<span class="n">momData</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">momData</span><span class="x">,</span> <span class="o">:</span><span class="n">diffY</span> <span class="o">=</span> <span class="x">[</span><span class="nb">NaN</span><span class="x">;</span> <span class="n">diff</span><span class="x">(</span><span class="o">:</span><span class="n">y</span><span class="x">)],</span> <span class="o">:</span><span class="n">prevY</span> <span class="o">=</span> <span class="x">[</span><span class="nb">NaN</span><span class="x">;</span> <span class="o">:</span><span class="n">y</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="x">(</span><span class="k">end</span><span class="o">-</span><span class="mi">1</span><span class="x">)]])</span>
</code></pre></div></div>

<p>Then using the standard OLS process from the <code class="language-plaintext highlighter-rouge">GLM</code> package.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">using</span> <span class="n">GLM</span>

<span class="n">mdl</span> <span class="o">=</span> <span class="n">lm</span><span class="x">(</span><span class="nd">@formula</span><span class="x">(</span><span class="n">diffY</span> <span class="o">~</span> <span class="n">prevY</span><span class="x">),</span> <span class="n">momData</span><span class="x">[</span><span class="mi">2</span><span class="o">:</span><span class="k">end</span><span class="x">,</span> <span class="o">:</span><span class="x">])</span>
<span class="n">alpha</span><span class="x">,</span> <span class="n">beta</span> <span class="o">=</span> <span class="n">coef</span><span class="x">(</span><span class="n">mdl</span><span class="x">)</span>

<span class="n">theta</span> <span class="o">=</span> <span class="o">-</span><span class="n">beta</span> <span class="o">/</span> <span class="n">dt</span>
<span class="n">mu</span> <span class="o">=</span> <span class="n">alpha</span> <span class="o">/</span> <span class="x">(</span><span class="n">theta</span> <span class="o">*</span> <span class="n">dt</span><span class="x">)</span>
</code></pre></div></div>

<p>Which gives us \(\mu = 0.0075, \theta = -0.3989\), so close to zero
for the drift and the reversion parameter has the correct sign.</p>
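
<p>For anyone who wants to see the regression without a modelling package, the same fit can be sketched with the closed-form least squares formulas. The helper name <code class="language-plaintext highlighter-rouge">ols_ou</code> is hypothetical and only here for illustration. It regresses the one-step change on the current value and then maps \(\alpha\) and \(\beta\) back to \(\mu\) and \(\theta\) using the relations above.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using Statistics

# Hypothetical helper (not part of the original post).
# Closed-form OLS of the one-step change on the current value.
function ols_ou(p, dt)
    x = p[1:(end - 1)]                 # current observation X_t
    dy = diff(p)                       # change X_{t+1} - X_t
    beta = cov(x, dy) / var(x)         # slope
    alpha = mean(dy) - beta * mean(x)  # intercept
    theta = -beta / dt
    mu = alpha / (theta * dt)
    return (theta = theta, mu = mu)
end
</code></pre></div></div>

<p>Calling <code class="language-plaintext highlighter-rouge">ols_ou(p1, dt)</code> should match the <code class="language-plaintext highlighter-rouge">GLM</code> estimates up to floating point error, since it is the same regression written out by hand.</p>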

<p>Doing the same for the mean reversion data.</p>

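<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Build the same diff/shift DataFrame for the mean-reverting path p2</span>
<span class="n">revData</span> <span class="o">=</span> <span class="n">DataFrame</span><span class="x">(</span><span class="n">y</span><span class="o">=</span><span class="n">p2</span><span class="x">)</span>
<span class="n">revData</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">revData</span><span class="x">,</span> <span class="o">:</span><span class="n">diffY</span> <span class="o">=</span> <span class="x">[</span><span class="nb">NaN</span><span class="x">;</span> <span class="n">diff</span><span class="x">(</span><span class="o">:</span><span class="n">y</span><span class="x">)],</span> <span class="o">:</span><span class="n">prevY</span> <span class="o">=</span> <span class="x">[</span><span class="nb">NaN</span><span class="x">;</span> <span class="o">:</span><span class="n">y</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="x">(</span><span class="k">end</span><span class="o">-</span><span class="mi">1</span><span class="x">)]])</span>
<span class="n">mdl</span> <span class="o">=</span> <span class="n">lm</span><span class="x">(</span><span class="nd">@formula</span><span class="x">(</span><span class="n">diffY</span> <span class="o">~</span> <span class="n">prevY</span><span class="x">),</span> <span class="n">revData</span><span class="x">[</span><span class="mi">2</span><span class="o">:</span><span class="k">end</span><span class="x">,</span> <span class="o">:</span><span class="x">])</span>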
<span class="n">alpha</span><span class="x">,</span> <span class="n">beta</span> <span class="o">=</span> <span class="n">coef</span><span class="x">(</span><span class="n">mdl</span><span class="x">)</span>

<span class="n">theta</span> <span class="o">=</span> <span class="o">-</span><span class="n">beta</span> <span class="o">/</span> <span class="n">dt</span>
<span class="n">mu</span> <span class="o">=</span> <span class="n">alpha</span> <span class="o">/</span> <span class="x">(</span><span class="n">theta</span> <span class="o">*</span> <span class="n">dt</span><span class="x">)</span>
</code></pre></div></div>

<p>This time \(\mu = 0.001\) and \(\theta = 1.2797\). So \(\theta\) is more
than double the true value of 0.5, but at least it has the correct sign.</p>

<h2 id="does-bootstrapping-help">Does Bootstrapping Help?</h2>

<p>It could be that we need more data, so we use the bootstrap to randomly sample from the population to give us pseudo-new draws. We use the DataFrames again and pull random rows with replacement to build out the data set. We do this sampling 1000 times.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">res</span> <span class="o">=</span> <span class="n">zeros</span><span class="x">(</span><span class="mi">1000</span><span class="x">)</span>
<span class="k">for</span> <span class="n">i</span> <span class="k">in</span> <span class="mi">1</span><span class="o">:</span><span class="mi">1000</span>
    <span class="n">mdl</span> <span class="o">=</span> <span class="n">lm</span><span class="x">(</span><span class="nd">@formula</span><span class="x">(</span><span class="n">diffY</span> <span class="o">~</span> <span class="n">prevY</span> <span class="o">+</span> <span class="mi">0</span><span class="x">),</span> <span class="n">momData</span><span class="x">[</span><span class="n">sample</span><span class="x">(</span><span class="mi">2</span><span class="o">:</span><span class="n">nrow</span><span class="x">(</span><span class="n">momData</span><span class="x">),</span> <span class="n">nrow</span><span class="x">(</span><span class="n">momData</span><span class="x">),</span> <span class="n">replace</span><span class="o">=</span><span class="nb">true</span><span class="x">),</span> <span class="o">:</span><span class="x">])</span>
    <span class="n">res</span><span class="x">[</span><span class="n">i</span><span class="x">]</span> <span class="o">=</span> <span class="o">-</span><span class="n">first</span><span class="x">(</span><span class="n">coef</span><span class="x">(</span><span class="n">mdl</span><span class="x">)</span><span class="o">/</span><span class="n">dt</span><span class="x">)</span>
<span class="k">end</span>

<span class="n">bootMom</span> <span class="o">=</span> <span class="n">histogram</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="o">:</span><span class="n">none</span><span class="x">,</span> <span class="n">title</span> <span class="o">=</span> <span class="s">"Momentum"</span><span class="x">,</span> <span class="n">color</span> <span class="o">=</span> <span class="s">"#7570b3"</span><span class="x">)</span>
<span class="n">bootMom</span> <span class="o">=</span> <span class="n">vline!</span><span class="x">(</span><span class="n">bootMom</span><span class="x">,</span> <span class="x">[</span><span class="o">-</span><span class="mf">0.5</span><span class="x">],</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Truth"</span><span class="x">,</span> <span class="n">lw</span> <span class="o">=</span> <span class="mi">2</span><span class="x">)</span>
<span class="n">bootMom</span> <span class="o">=</span> <span class="n">vline!</span><span class="x">(</span><span class="n">bootMom</span><span class="x">,</span> <span class="x">[</span><span class="mf">0.0</span><span class="x">],</span> <span class="n">label</span> <span class="o">=</span> <span class="o">:</span><span class="n">none</span><span class="x">,</span> <span class="n">color</span> <span class="o">=</span> <span class="s">"black"</span><span class="x">)</span>
</code></pre></div></div>

<p>We then do the same for the reversion data.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">res</span> <span class="o">=</span> <span class="n">zeros</span><span class="x">(</span><span class="mi">1000</span><span class="x">)</span>
<span class="k">for</span> <span class="n">i</span> <span class="k">in</span> <span class="mi">1</span><span class="o">:</span><span class="mi">1000</span>
    <span class="n">mdl</span> <span class="o">=</span> <span class="n">lm</span><span class="x">(</span><span class="nd">@formula</span><span class="x">(</span><span class="n">diffY</span> <span class="o">~</span> <span class="n">prevY</span> <span class="o">+</span> <span class="mi">0</span><span class="x">),</span> <span class="n">revData</span><span class="x">[</span><span class="n">sample</span><span class="x">(</span><span class="mi">2</span><span class="o">:</span><span class="n">nrow</span><span class="x">(</span><span class="n">revData</span><span class="x">),</span> <span class="n">nrow</span><span class="x">(</span><span class="n">revData</span><span class="x">),</span> <span class="n">replace</span><span class="o">=</span><span class="nb">true</span><span class="x">),</span> <span class="o">:</span><span class="x">])</span>
    <span class="n">res</span><span class="x">[</span><span class="n">i</span><span class="x">]</span> <span class="o">=</span> <span class="n">first</span><span class="x">(</span><span class="o">-</span><span class="n">coef</span><span class="x">(</span><span class="n">mdl</span><span class="x">)</span><span class="o">/</span><span class="n">dt</span><span class="x">)</span>
<span class="k">end</span>

<span class="n">bootRev</span> <span class="o">=</span> <span class="n">histogram</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="o">:</span><span class="n">none</span><span class="x">,</span> <span class="n">title</span> <span class="o">=</span> <span class="s">"Reversion"</span><span class="x">,</span> <span class="n">color</span> <span class="o">=</span> <span class="s">"#1b9e77"</span><span class="x">)</span>
<span class="n">bootRev</span> <span class="o">=</span> <span class="n">vline!</span><span class="x">(</span><span class="n">bootRev</span><span class="x">,</span> <span class="x">[</span><span class="mf">0.5</span><span class="x">],</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Truth"</span><span class="x">,</span> <span class="n">lw</span> <span class="o">=</span> <span class="mi">2</span><span class="x">)</span>
<span class="n">bootRev</span> <span class="o">=</span> <span class="n">vline!</span><span class="x">(</span><span class="n">bootRev</span><span class="x">,</span> <span class="x">[</span><span class="mf">0.0</span><span class="x">],</span> <span class="n">label</span> <span class="o">=</span> <span class="o">:</span><span class="n">none</span><span class="x">,</span> <span class="n">color</span> <span class="o">=</span> <span class="s">"black"</span><span class="x">)</span>
</code></pre></div></div>

<p>Then combining both the graphs into one plot.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">plot</span><span class="x">(</span><span class="n">bootMom</span><span class="x">,</span> <span class="n">bootRev</span><span class="x">,</span> 
  <span class="n">layout</span><span class="o">=</span><span class="x">(</span><span class="mi">2</span><span class="x">,</span><span class="mi">1</span><span class="x">),</span><span class="n">dpi</span><span class="o">=</span><span class="mi">900</span><span class="x">,</span> <span class="n">size</span><span class="o">=</span><span class="x">(</span><span class="mi">800</span><span class="x">,</span> <span class="mi">300</span><span class="x">),</span>
  <span class="n">background_color</span><span class="o">=:</span><span class="n">transparent</span><span class="x">,</span> <span class="n">foreground_color</span><span class="o">=:</span><span class="n">black</span><span class="x">,</span>
     <span class="n">link</span><span class="o">=:</span><span class="n">all</span><span class="x">)</span>
</code></pre></div></div>

<p><img src="/assets/ouprocess/bootPlot.png" alt="Bootstrapping an OU process" title="Bootstrapping an OU process" /></p>

<p>The momentum bootstrap has worked and centred around the correct
value, but the same cannot be said for the reversion plot. However, it
has correctly guessed the sign.</p>
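
<p>To put numbers on what the histograms show, the bootstrap draws can be summarised with a mean and a percentile interval. This is a minimal sketch; <code class="language-plaintext highlighter-rouge">bootstrap_summary</code> and its <code class="language-plaintext highlighter-rouge">level</code> keyword are hypothetical names introduced only for illustration.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using Statistics

# Hypothetical helper (not part of the original post).
# Mean and percentile interval of a vector of bootstrap draws.
function bootstrap_summary(draws; level = 0.95)
    tail = (1 - level) / 2
    lower, upper = quantile(draws, [tail, 1 - tail])
    return (mean = mean(draws), lower = lower, upper = upper)
end
</code></pre></div></div>

<p>Calling <code class="language-plaintext highlighter-rouge">bootstrap_summary(res)</code> after either loop summarises that loop’s draws. Keep in mind that the bootstrap only reshuffles rows from the one simulated path, so the interval describes the resampling noise around that path’s estimate and cannot correct a path that happens to be unrepresentative.</p>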

<h2 id="ar1-calibration-of-a-ou-process">AR(1) Calibration of an OU Process</h2>

<p>If we continue assuming that \(\mu = 0\) then we can simplify the OLS
to a 1-parameter regression - OLS without an intercept. From the
generating process, we can see that this is an AR(1) process - each
observation depends on the previous observation by some amount.</p>

\[\phi = \frac{\sum _i X_i  X_{i-1}}{\sum _i X_{i-1}^2}\]

<p>then the reversion parameter is calculated as</p>

\[\theta = - \frac{\log \phi}{\Delta t}\]

<p>This gives us a simple equation to calculate \(\theta\) now.</p>
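
<p>To see where these two formulas come from, here is a short sketch that relies on the standard exact solution of the OU equation over one time step (with \(\mu = 0\)), which is not derived in this post:</p>

\[X_{t + \Delta t} = \phi X_t + \epsilon_t, \qquad \phi = e^{-\theta \Delta t}\]

<p>Least squares without an intercept on this relation gives the estimator for \(\phi\) above, and inverting \(\phi = e^{-\theta \Delta t}\) gives \(\theta\). For small \(\theta \Delta t\) we have \(\phi \approx 1 - \theta \Delta t\), so \(\phi - 1\) plays the role of \(\beta\) from the OLS section. Note that the logarithm requires the estimated \(\phi\) to be positive.</p>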

<p>For the momentum sample:</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">phi</span> <span class="o">=</span> <span class="n">sum</span><span class="x">(</span><span class="n">p1</span><span class="x">[</span><span class="mi">2</span><span class="o">:</span><span class="k">end</span><span class="x">]</span> <span class="o">.*</span> <span class="n">p1</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="x">(</span><span class="k">end</span><span class="o">-</span><span class="mi">1</span><span class="x">)])</span> <span class="o">/</span> <span class="n">sum</span><span class="x">(</span><span class="n">p1</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="x">(</span><span class="k">end</span><span class="o">-</span><span class="mi">1</span><span class="x">)]</span> <span class="o">.^</span><span class="mi">2</span><span class="x">)</span>
<span class="o">-</span><span class="n">log</span><span class="x">(</span><span class="n">phi</span><span class="x">)</span><span class="o">/</span><span class="n">dt</span>
</code></pre></div></div>

<p>Gives \(\theta = -0.50184\), so very close to the true value.</p>

<p>For the reversion sample</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">phi</span> <span class="o">=</span> <span class="n">sum</span><span class="x">(</span><span class="n">p2</span><span class="x">[</span><span class="mi">2</span><span class="o">:</span><span class="k">end</span><span class="x">]</span> <span class="o">.*</span> <span class="n">p2</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="x">(</span><span class="k">end</span><span class="o">-</span><span class="mi">1</span><span class="x">)])</span> <span class="o">/</span> <span class="n">sum</span><span class="x">(</span><span class="n">p2</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="x">(</span><span class="k">end</span><span class="o">-</span><span class="mi">1</span><span class="x">)]</span> <span class="o">.^</span><span class="mi">2</span><span class="x">)</span>
<span class="o">-</span><span class="n">log</span><span class="x">(</span><span class="n">phi</span><span class="x">)</span><span class="o">/</span><span class="n">dt</span>
</code></pre></div></div>

<p>Gives \(\theta = 1.26\), so correct sign, but quite a way off.</p>

<p>Finally, for the random walk</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">phi</span> <span class="o">=</span> <span class="n">sum</span><span class="x">(</span><span class="n">p3</span><span class="x">[</span><span class="mi">2</span><span class="o">:</span><span class="k">end</span><span class="x">]</span> <span class="o">.*</span> <span class="n">p3</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="x">(</span><span class="k">end</span><span class="o">-</span><span class="mi">1</span><span class="x">)])</span> <span class="o">/</span> <span class="n">sum</span><span class="x">(</span><span class="n">p3</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="x">(</span><span class="k">end</span><span class="o">-</span><span class="mi">1</span><span class="x">)]</span> <span class="o">.^</span><span class="mi">2</span><span class="x">)</span>
<span class="o">-</span><span class="n">log</span><span class="x">(</span><span class="n">phi</span><span class="x">)</span><span class="o">/</span><span class="n">dt</span>
</code></pre></div></div>

<p>Produces \(\theta = -0.027\), so quite close to zero.</p>

<p>Again, values are similar to what we expect, so our estimation process
appears to be working.</p>

<h2 id="using-multiple-samples-for-calibrating-an-ou-process">Using Multiple Samples for Calibrating an OU Process</h2>

<p>If you aren’t convinced, I don’t blame you. The reversion estimates above are more than double the value that simulated the data, so it’s hard to believe the estimation method is working. Instead, what we need to do is repeat the process: generate many more price paths and estimate the parameter of each one.</p>

<p>To make things a bit more manageable code-wise, though, I’m going to
introduce a <code class="language-plaintext highlighter-rouge">struct</code> that contains the parameters and allows us to
simulate and estimate in a more contained manner.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span><span class="nc"> OUProcess</span>
    <span class="n">theta</span>
    <span class="n">mu</span> 
    <span class="n">sigma</span>
    <span class="n">dt</span>
    <span class="n">maxT</span>
    <span class="n">initial</span>
<span class="k">end</span>
</code></pre></div></div>

<p>We now write specific functions for this object and this allows us to
simplify the code slightly.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">function</span><span class="nf"> simulate</span><span class="x">(</span><span class="n">ou</span><span class="o">::</span><span class="n">OUProcess</span><span class="x">)</span>
    <span class="n">simulate_os</span><span class="x">(</span><span class="n">ou</span><span class="o">.</span><span class="n">theta</span><span class="x">,</span> <span class="n">ou</span><span class="o">.</span><span class="n">mu</span><span class="x">,</span> <span class="n">ou</span><span class="o">.</span><span class="n">sigma</span><span class="x">,</span> <span class="n">ou</span><span class="o">.</span><span class="n">dt</span><span class="x">,</span> <span class="n">ou</span><span class="o">.</span><span class="n">maxT</span><span class="x">,</span> <span class="n">ou</span><span class="o">.</span><span class="n">initial</span><span class="x">)</span>
<span class="k">end</span>

<span class="k">function</span><span class="nf"> estimate</span><span class="x">(</span><span class="n">ou</span><span class="o">::</span><span class="n">OUProcess</span><span class="x">)</span>
   <span class="n">p</span> <span class="o">=</span> <span class="n">simulate</span><span class="x">(</span><span class="n">ou</span><span class="x">)</span>
   <span class="n">phi</span> <span class="o">=</span>  <span class="n">sum</span><span class="x">(</span><span class="n">p</span><span class="x">[</span><span class="mi">2</span><span class="o">:</span><span class="k">end</span><span class="x">]</span> <span class="o">.*</span> <span class="n">p</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="x">(</span><span class="k">end</span><span class="o">-</span><span class="mi">1</span><span class="x">)])</span> <span class="o">/</span> <span class="n">sum</span><span class="x">(</span><span class="n">p</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="x">(</span><span class="k">end</span><span class="o">-</span><span class="mi">1</span><span class="x">)]</span> <span class="o">.^</span><span class="mi">2</span><span class="x">)</span>
   <span class="o">-</span><span class="n">log</span><span class="x">(</span><span class="n">phi</span><span class="x">)</span><span class="o">/</span><span class="n">ou</span><span class="o">.</span><span class="n">dt</span>
<span class="k">end</span>

<span class="k">function</span><span class="nf"> estimate</span><span class="x">(</span><span class="n">ou</span><span class="o">::</span><span class="n">OUProcess</span><span class="x">,</span> <span class="n">N</span><span class="x">)</span>
    <span class="n">res</span> <span class="o">=</span> <span class="n">zeros</span><span class="x">(</span><span class="n">N</span><span class="x">)</span>
    <span class="k">for</span> <span class="n">i</span> <span class="k">in</span> <span class="mi">1</span><span class="o">:</span><span class="n">N</span>
        <span class="c"># estimate(ou) simulates a fresh path internally, so no separate simulate call is needed</span>
        <span class="n">res</span><span class="x">[</span><span class="n">i</span><span class="x">]</span> <span class="o">=</span> <span class="n">estimate</span><span class="x">(</span><span class="n">ou</span><span class="x">)</span>
    <span class="k">end</span>
    <span class="n">res</span>
<span class="k">end</span>
</code></pre></div></div>

<p>We use these new functions to draw from the process 1,000 times and
estimate \(\theta\) for each path, collecting the results as an
array.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">ou</span> <span class="o">=</span> <span class="n">OUProcess</span><span class="x">(</span><span class="mf">0.5</span><span class="x">,</span> <span class="mf">0.0</span><span class="x">,</span> <span class="n">vol</span><span class="x">,</span> <span class="n">dt</span><span class="x">,</span> <span class="n">maxT</span><span class="x">,</span> <span class="n">initial</span><span class="x">)</span>
<span class="n">revPlot</span> <span class="o">=</span> <span class="n">histogram</span><span class="x">(</span><span class="n">estimate</span><span class="x">(</span><span class="n">ou</span><span class="x">,</span> <span class="mi">1000</span><span class="x">),</span> <span class="n">label</span> <span class="o">=</span> <span class="o">:</span><span class="n">none</span><span class="x">,</span> <span class="n">title</span> <span class="o">=</span> <span class="s">"Reversion"</span><span class="x">)</span>
<span class="n">vline!</span><span class="x">(</span><span class="n">revPlot</span><span class="x">,</span> <span class="x">[</span><span class="mf">0.5</span><span class="x">],</span> <span class="n">label</span> <span class="o">=</span> <span class="o">:</span><span class="n">none</span><span class="x">);</span>
</code></pre></div></div>

<p>And the same for the momentum OU process</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">ou</span> <span class="o">=</span> <span class="n">OUProcess</span><span class="x">(</span><span class="o">-</span><span class="mf">0.5</span><span class="x">,</span> <span class="mf">0.0</span><span class="x">,</span> <span class="n">vol</span><span class="x">,</span> <span class="n">dt</span><span class="x">,</span> <span class="n">maxT</span><span class="x">,</span> <span class="n">initial</span><span class="x">)</span>
<span class="n">momPlot</span> <span class="o">=</span> <span class="n">histogram</span><span class="x">(</span><span class="n">estimate</span><span class="x">(</span><span class="n">ou</span><span class="x">,</span> <span class="mi">1000</span><span class="x">),</span> <span class="n">label</span> <span class="o">=</span> <span class="o">:</span><span class="n">none</span><span class="x">,</span> <span class="n">title</span> <span class="o">=</span> <span class="s">"Momentum"</span><span class="x">)</span>
<span class="n">vline!</span><span class="x">(</span><span class="n">momPlot</span><span class="x">,</span> <span class="x">[</span><span class="o">-</span><span class="mf">0.5</span><span class="x">],</span> <span class="n">label</span> <span class="o">=</span> <span class="o">:</span><span class="n">none</span><span class="x">);</span>
</code></pre></div></div>

<p>Plotting the distribution of the estimates from both sets of
simulations gives us a decent understanding of how much they can vary
from sample to sample.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">plot</span><span class="x">(</span><span class="n">revPlot</span><span class="x">,</span> <span class="n">momPlot</span><span class="x">,</span> <span class="n">layout</span> <span class="o">=</span> <span class="x">(</span><span class="mi">2</span><span class="x">,</span><span class="mi">1</span><span class="x">),</span> <span class="n">link</span><span class="o">=:</span><span class="n">all</span><span class="x">)</span>
</code></pre></div></div>

<p><img src="/assets/ouprocess/multisample.png" alt="Multiple sample estimation of an OU process" title="Multiple sample estimation of an OU process" /></p>

<p>We can see the heavy-tailed nature of the estimation process, but
thankfully the histograms are centred around the true values (0.5 and
-0.5, marked by the vertical lines). This goes to show how difficult it
is to estimate the mean reversion parameter even in this simple setup,
where we know exactly what generated the data. So for a real dataset,
you need to either work out how to collect more samples or be much less
confident in how accurate your estimate is.</p>
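
<p>To make the "collect more samples" point concrete, here is a rough, self-contained sketch. It is not the <code>OUProcess</code>/<code>estimate</code> code used above: the function names <code>simulate_ou</code> and <code>estimate_theta</code>, the Euler discretisation, the least-squares estimator and every parameter value are assumptions made purely for illustration. It compares the spread of estimates from a single path with the spread after averaging the estimates from 10 independent paths.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using Random, Statistics

# Euler-Maruyama simulation of dX = theta * (mu - X) dt + sigma dW.
# (Hypothetical helper for this sketch, not the OUProcess type used above.)
function simulate_ou(rng, theta, mu, sigma, dt, n, x0)
    x = Vector{Float64}(undef, n)
    x[1] = x0
    for i in 2:n
        x[i] = x[i-1] + theta * (mu - x[i-1]) * dt + sigma * sqrt(dt) * randn(rng)
    end
    return x
end

# Least-squares estimate of theta: regress the increments on the lagged level.
# The slope of that regression is approximately -theta * dt.
function estimate_theta(x, dt)
    dx = diff(x)
    xlag = x[1:end-1]
    slope = cov(xlag, dx) / var(xlag)
    return -slope / dt
end

rng = MersenneTwister(42)
theta, mu, sigma, dt, n = 0.5, 0.0, 0.1, 0.1, 1_000   # illustrative values only

# One estimate per simulated path.
single = [estimate_theta(simulate_ou(rng, theta, mu, sigma, dt, n, 0.0), dt) for _ in 1:1_000]

# Average the estimates from 10 independent paths, repeated 200 times.
pooled = [mean(estimate_theta(simulate_ou(rng, theta, mu, sigma, dt, n, 0.0), dt) for _ in 1:10) for _ in 1:200]

println("std of single-path estimates: ", std(single))
println("std of 10-path averages:      ", std(pooled))
</code></pre></div></div>

<p>On this toy setup the averaged estimates should come out noticeably less spread out than the single-path ones, although averaging does nothing about any bias in the estimator itself.</p>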

<h2 id="summary">Summary</h2>

<p>We have progressed from simulating an Ornstein-Uhlenbeck process to
estimating its parameters using various methods. We attempted to
enhance the accuracy of the estimates through bootstrapping, but we
discovered that the best approach to improve the estimation is to have
multiple samples.</p>

<p>So if you are trying to fit this type of process to some real-world
data, be it the spread between two stocks
(<a href="https://math.nyu.edu/~avellane/AvellanedaLeeStatArb071108.pdf">Statistical Arbitrage in the U.S. Equities Market</a>),
client flow (<a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4609588">Unwinding Stochastic Order Flow: When to Warehouse Trades</a>) or anything
else you believe might be mean reverting, then understand how much
data you might need to accurately model the process.</p>]]></content><author><name>Dean Markwick</name></author><category term="julia" /><summary type="html"><![CDATA[Read enough quant finance papers or books and you’ll come across the Ornstein–Uhlenbeck (OU) process. This is a post that explores the OU process, the equations, how we can simulate such a process and then estimate the parameters.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://dm13450.github.io/assets/ouprocess/oudemo.png" /><media:content medium="image" url="https://dm13450.github.io/assets/ouprocess/oudemo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry></feed>