<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://dm13450.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://dm13450.github.io/" rel="alternate" type="text/html" /><updated>2026-05-18T15:51:35+00:00</updated><id>https://dm13450.github.io/feed.xml</id><title type="html">Dean Markwick</title><subtitle>Personal website for Dean Markwick. If you like stats, sports and rambling, you&apos;ve come to the right place. All rights reserved. 
</subtitle><author><name>Dean Markwick</name></author><entry><title type="html">The Joys of Free Cloudflare</title><link href="https://dm13450.github.io/2026/05/18/the-joys-of-free-cloudflare.html" rel="alternate" type="text/html" title="The Joys of Free Cloudflare" /><published>2026-05-18T00:00:00+00:00</published><updated>2026-05-18T00:00:00+00:00</updated><id>https://dm13450.github.io/2026/05/18/the-joys-of-free-cloudflare</id><content type="html" xml:base="https://dm13450.github.io/2026/05/18/the-joys-of-free-cloudflare.html"><![CDATA[<p>I’ve been tinkering around with the free tier on Cloudflare and have managed to churn out a couple of side projects. Of course, I had a little help with various AI systems, but it was assisted rather than vibe-coded.</p>

<p></p>
<hr />

<p>Enjoy these types of posts? Then sign up for my newsletter.</p>
<div style="text-align: center;">
<iframe src="https://dm13450.substack.com/embed" width="480" height="150" style="border:1px solid #fdfdfd; background:#fdfdfd;" frameborder="0" scrolling="no"></iframe>
</div>
<hr />

<p></p>

<p>Most of my work life is spent in Jupyter notebooks looking at data. Most of my blogging is looking at data using Julia or Python. Every now and then I branch out and build something other people can use like <a href="https://cryptoliquiditymetrics.com/">Crypto Liquidity Metrics</a>. It’s a simple Netlify-hosted website with pretty simple HTML and JavaScript, but it doesn’t really use ‘the cloud’ in any meaningful way. I want to start expanding my horizons and using more of what’s available to build things.</p>

<p>I’m not sure how I ended up on Cloudflare but the fact you can use things without entering any payment details reassures me that I won’t lose my house if one of these things gets popular. Although once you read what I’ve built you’ll see there is very little chance they take off!</p>

<h2 id="cloudflare-pages">Cloudflare Pages</h2>

<p>My first little project is a personal train timetable. I live in a place where there are fast trains that skip out lots of stops. However, they are only at specific times throughout the day. I would normally use the National Rail app and have to scroll through the regular slow trains and keep my eye out for a fast one. What I needed was a way to filter the trains by platform as the fast ones left from their own platform. For this I needed a way of getting train data.</p>

<p>Rail departure information is made available for free through the <a href="https://www.nationalrail.co.uk/developers/darwin-data-feeds/">Darwin Data Feeds</a>. You apply for a key and off you go. However, it’s a bit complicated to query as it’s not a REST API. Thankfully, someone has done the hard work and built a REST API for the same data. This API is called <a href="https://huxley2.azurewebsites.net/">huxley2</a> and is an open-source, self-hosted version. You still need a token from National Rail but a REST API is much easier to work with. I query the API from the browser, filter based on the platform and return the resulting trains. So all in, pretty easy. I let Claude style the front end and it’s all done.</p>

<p>I now needed a place to host this single-page app. This is where Cloudflare Pages comes in. You can upload the HTML and JavaScript files and it’s done. Of course, because I’m a professional programmer, I didn’t do that. I connected it to my GitHub, and every time I push a commit it rebuilds the website.</p>

<p><img src="/assets/joysofcloudflare/fasttrains.png" alt="Screenshot of the fast trains filter showing a list of filtered train departures by platform" /></p>

<p>I save this webpage to my phone’s home screen and job done. I open the link and it tells me the next fast train home. Now obviously, this is only useful for me and a few family members. So no chance of it taking off! I could add cookies, let you choose the filters for the trains you are interested in, and make it more applicable to a wider audience, but there’s not really much upside in building that out. This stays personal for now.</p>

<p>Building out a one-page website with some HTML and JavaScript is simple. I wanted to ramp it up a little more and see what else Cloudflare can deliver for free.</p>

<h2 id="cloudflare-workers-and-d1-sql-database">Cloudflare Workers and D1 SQL Database</h2>

<p>CBOE publishes their daily FX volumes as a JSON file on their website. It only contains the last 30 days, so I wanted to save this down each day to build out my own personal history. It’s trivial to write the JSON parsing. The real problem is automation: I don’t want the code on my laptop where I have to remember to run the script manually. Cloudflare provides Workers, which give you a short burst of compute to do something interesting. For me, I use this to run through the JSON data and get it ready for saving down.</p>

<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">async</span> <span class="nx">scheduled</span><span class="p">(</span><span class="nx">controller</span><span class="p">,</span> <span class="nx">env</span><span class="p">)</span> <span class="p">{</span>
    <span class="c1">// Runs on the cron schedule; the first argument is the scheduled event, not an HTTP request.</span>
    <span class="kd">const</span> <span class="nx">spotURL</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">https://cdn.cboe.com/fx/spotInstrumentVolume.json</span><span class="dl">"</span><span class="p">;</span>
    <span class="kd">const</span> <span class="nx">ndfURL</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">https://cdn.cboe.com/fx/sefInstrumentVolume.json</span><span class="dl">"</span><span class="p">;</span>

    <span class="k">await</span> <span class="nx">saveData</span><span class="p">(</span><span class="nx">spotURL</span><span class="p">,</span> <span class="nx">env</span><span class="p">);</span>
    <span class="k">await</span> <span class="nx">saveData</span><span class="p">(</span><span class="nx">ndfURL</span><span class="p">,</span> <span class="nx">env</span><span class="p">);</span>

    <span class="k">return</span> <span class="k">new</span> <span class="nx">Response</span><span class="p">(</span><span class="dl">"</span><span class="s2">Data saved successfully</span><span class="dl">"</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Now, to save it down I want a database. I don’t want to be saving it to a CSV; a database lets me query the data straight away. D1 is Cloudflare’s serverless SQL database built on SQLite, and it’s easy to bind the worker to it, which means the worker can read from and write to the database as needed. Of course, you have to define the tables and set the keys, but for something as simple as this data (date, sym, volume), that’s trivial.</p>

<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">async</span> <span class="kd">function</span> <span class="nx">saveData</span><span class="p">(</span><span class="nx">url</span><span class="p">,</span> <span class="nx">env</span><span class="p">)</span> <span class="p">{</span>
    <span class="kd">const</span> <span class="nx">dailyData</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">getDailyData</span><span class="p">(</span><span class="nx">url</span><span class="p">);</span>
    <span class="kd">const</span> <span class="nx">volumes</span> <span class="o">=</span> <span class="nx">dailyData</span><span class="p">.</span><span class="nx">map</span><span class="p">(</span><span class="nx">x</span> <span class="o">=&gt;</span> <span class="nx">parseData</span><span class="p">(</span><span class="nx">x</span><span class="p">));</span>
    <span class="kd">const</span> <span class="nx">statements</span> <span class="o">=</span> <span class="nx">volumes</span><span class="p">.</span><span class="nx">map</span><span class="p">(</span><span class="nx">item</span> <span class="o">=&gt;</span> <span class="nx">bindStatement</span><span class="p">(</span><span class="nx">item</span><span class="p">,</span> <span class="nx">env</span><span class="p">)).</span><span class="nx">flat</span><span class="p">();</span>
    <span class="k">await</span> <span class="nx">env</span><span class="p">.</span><span class="nx">DB</span><span class="p">.</span><span class="nx">batch</span><span class="p">(</span><span class="nx">statements</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>This loads all the data into the database. We set the schedule to run at midnight every night and it can do its thing. I check it each morning, and sure enough, the new data is there.</p>
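<p>Because the source file always holds the last 30 days, each nightly run sees mostly the same rows again, so the keys matter. As a rough sketch of the idea (in Python with the standard-library <code>sqlite3</code> module rather than D1 itself), a composite key on date and symbol lets repeated saves overwrite rather than duplicate. The table name <code>fx_volumes</code>, the column types and the sample volumes are my assumptions for illustration, not the actual schema.</p>

```python
import sqlite3

# Hypothetical schema: the post only says the data is (date, sym, volume).
SCHEMA = """
CREATE TABLE IF NOT EXISTS fx_volumes (
    date   TEXT NOT NULL,
    sym    TEXT NOT NULL,
    volume REAL NOT NULL,
    PRIMARY KEY (date, sym)
)
"""


def save_rows(conn, rows):
    """Insert (date, sym, volume) rows, overwriting any existing (date, sym)."""
    conn.executemany(
        "INSERT OR REPLACE INTO fx_volumes (date, sym, volume) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# Two overlapping downloads with made-up volumes, as happens with a rolling file.
save_rows(conn, [("2026-05-14", "EURUSD", 100.0), ("2026-05-15", "EURUSD", 120.0)])
save_rows(conn, [("2026-05-15", "EURUSD", 120.0), ("2026-05-18", "EURUSD", 90.0)])

row_count = conn.execute("SELECT COUNT(*) FROM fx_volumes").fetchone()[0]
print(row_count)
```

<p>The overlapping date is stored once, so the history grows by only the genuinely new days.</p>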

<p>We can now think about building a small dashboard for this data. More HTML and JavaScript!
For the frontend, I wanted to be more self-reliant and in control. After some conversations with Gemini and Claude, I settled on <a href="https://picocss.com/">Pico</a> CSS, which provides a clean style straight out of the box. For the charts, I used <a href="https://www.chartjs.org/">Chart.js</a>, the most popular charting library according to the AI tools. For the table, I used <a href="https://gridjs.io/">Grid.js</a>.</p>

<p>The flow is very simple. When the dashboard is loaded, it pulls the data into memory via a simple SQL query to the D1 database. I then plot the total volumes of the day (i.e., sum across the currency pairs), plot an individual currency chosen from a dropdown, and build a table that shows the top 10 highest volumes for yesterday and their volume relative to the 30-day average.</p>
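<p>The table calculation is simple enough to sketch. The dashboard itself does this in JavaScript, so treat the Python below as a stand-in for the logic only: the function name, the input shapes and the numbers are all made up for illustration.</p>

```python
from statistics import mean


def relative_volumes(history, latest, top_n=10):
    """Rank symbols by latest volume and compare each with its own average.

    history: dict mapping sym to a list of daily volumes (the trailing window)
    latest:  dict mapping sym to the most recent day's volume
    """
    ranked = sorted(latest.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
    return [(sym, vol, vol / mean(history[sym])) for sym, vol in ranked]


# Made-up numbers: three days of history instead of thirty.
history = {"EURUSD": [100.0, 110.0, 90.0], "USDJPY": [50.0, 50.0, 50.0]}
latest = {"EURUSD": 150.0, "USDJPY": 25.0}
table = relative_volumes(history, latest)
```

<p>A ratio above 1 means the pair traded more than its recent average, and below 1 means it was a quiet day.</p>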

<p><img src="/assets/joysofcloudflare/fxvolumes.png" alt="FX volumes dashboard displaying total market volumes" /></p>

<p>Only the first graph is shown as I don’t want a massive screenshot! But overall it looks very smart, and I’ve learned some new web development skills.</p>

<p>It should all stay under the free tier. If the data starts to get too big, I’ll have to make better use of caching and no longer just dump everything into memory immediately. But for now, it’s another job done. Overall, I’m pretty proud. I’m building up a nice dataset in the cloud and a slick frontend in front of the data. All in a weekend’s work.</p>

<p>If you haven’t already, sign up and start tinkering yourself. Start talking to the AI tools (Claude/Gemini/ChatGPT) to sketch out how to approach something, and just start small. The <a href="https://developers.cloudflare.com/">Cloudflare Docs</a> are great and their command-line tool <a href="https://developers.cloudflare.com/workers/wrangler/">Wrangler</a> also makes things very easy to set up locally.</p>]]></content><author><name>Dean Markwick</name></author><category term="javascript" /><category term="web-dev" /><category term="finance" /><summary type="html"><![CDATA[I’ve been tinkering around with the free tier on Cloudflare and have managed to churn out a couple of side projects. Of course, I had a little help with various AI systems, but it was assisted rather than vibe-coded.]]></summary></entry><entry><title type="html">A Fundamental FX Factor Model</title><link href="https://dm13450.github.io/2026/04/19/A-Fundamental-FX-Factor-Model.md.html" rel="alternate" type="text/html" title="A Fundamental FX Factor Model" /><published>2026-04-19T00:00:00+00:00</published><updated>2026-04-19T00:00:00+00:00</updated><id>https://dm13450.github.io/2026/04/19/A-Fundamental-FX-Factor-Model.md</id><content type="html" xml:base="https://dm13450.github.io/2026/04/19/A-Fundamental-FX-Factor-Model.md.html"><![CDATA[<p>I’ve been reading <a href="https://www.wiley.com/en-us/The+Elements+of+Quantitative+Investing-p-9781394265466">The Elements of Quantitative Investing</a> to branch out from my usual high-frequency finance to something slower or mid-frequency. Factor models are a big part of this quant topic, and I’m trying to get a deeper understanding by following the book and applying the process to FX data.</p>

<p></p>
<hr />

<p>Enjoy these types of posts? Then sign up for my newsletter.</p>
<div style="text-align: center;">
<iframe src="https://dm13450.substack.com/embed" width="480" height="150" style="border:1px solid #fdfdfd; background:#fdfdfd;" frameborder="0" scrolling="no"></iframe>
</div>
<hr />

<p></p>

<p>Factor models provide a mechanism for explaining returns. They are multivariate models that break down the features that drive an asset’s performance. The key assumption is that each individual asset’s return is not independent of the others, but there are common factors that drive returns and an asset’s sensitivity to those factors drives its returns. In equities, you’ll hear of value and momentum factors, and there are even ETFs that you can invest in for exposure to those factors. We want to come up with something similar in the FX space.</p>

<p><img src="/assets/fxfundamental/ccy_ret_intro.png" alt="Currency returns" /></p>

<p>A factor model attempts to explain asset-universe return behaviour. From this you can start to build portfolios, decompose risk across the different factors, and even look at returns not explained by the factors, which in turn becomes alpha research. These types of models are the foundation of many other quant topics, so it’s good to get a handle on them.</p>

<p>I will start by getting the data and the features into place. Part of that is using the DXY functions from my previous post (<a href="https://dm13450.github.io/2026/03/10/Making-Sense-of-the-DXY.html">Making Sense of the DXY</a>) and adding some new ETF data. I’ll then run two models: one to explain price moves over time and another to explain price moves between currencies themselves. Using the models in tandem forms the FX factor model. We will then explore the specific factors and how you can build factor portfolios from different currency pairs.</p>

<h2 id="fx-vs-equities">FX vs Equities</h2>

<p>Typically, factor models and most academic research in this field use equity data. However, I am an FX man at heart (for my sins?) and so I want to use currency data. This restricts the universe to about 30 assets rather than the 2,000 US stocks. Therefore, to overcome the small sample size, we will use weekly data rather than monthly.</p>

<p>Monthly data is the usual choice because it removes as much “trading” noise as possible. You want the price moves to reflect the underlying performance of the asset and not the day-to-day flows and execution noise. Daily data isn’t an option as FX trades 24 hours a day but the ETFs only trade during the regular market hours. This presents a synchronisation problem. A currency move could happen overnight based on some headlines hours before the ETF is even open for trading. So we will split the difference and use weekly data. This should give us enough data while keeping the overall price movements based on the same time period and information.</p>

<p>Another problem with FX data is a lack of descriptive features. In equities, you have the financial reports of a company, things like price to book and market capitalisation, but these have no equivalent in FX, so we need a different way of coming up with characteristics. For this I’ll be using ETFs to try and see what macro features might move the currency pairs.</p>

<h2 id="the-data-pipeline">The Data Pipeline</h2>

<p>We are bringing together ETF, currency, and DXY data. This is all simple to pull from <a href="https://twelvedata.com/">twelvedata</a>.</p>

<h3 id="downloading-and-preparing-the-etf-data">Downloading and Preparing the ETF Data</h3>

<p>I’ll be using different macro ETFs as general factors. These four ETFs proxy the major macro drivers:</p>

<ul>
  <li><strong>VTI (Risk Appetite):</strong> When stocks rally, investors move from cash to risk assets, weakening 
the dollar (capital flows out of the US). When stocks fall, the reverse happens.</li>
  <li><strong>BND (Interest Rates):</strong> Bond prices move inversely to rates. Rising US rates strengthen the 
dollar; falling rates weaken it.</li>
  <li><strong>GLD (Inflation/Uncertainty):</strong> Gold rallies when inflation forecasts rise or geopolitical 
risk spikes, often correlated with currency volatility.</li>
  <li><strong>USO (Commodity Risk):</strong> Oil is priced in dollars. Oil rallies often reflect emerging market 
demand, shifting currency flows.</li>
</ul>

<p>Each of these ETFs forms a standard macro-economic indicator that I suspect currencies might respond to. You could go further and break down the stocks into different regions or sizes (small-cap, large-cap etc.) and likewise for the bonds, which could be broken down by country. But for now, these are a good high-level weather vane for how the global economy is moving.</p>

<p>We will be using the same functions from my previous post, just updating them to save at a weekly frequency. Then we load those files, combine everything, and calculate the log returns.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">etfs</span> <span class="o">=</span> <span class="p">[</span><span class="s">"GLD"</span><span class="p">,</span> <span class="s">"BND"</span><span class="p">,</span> <span class="s">"VTI"</span><span class="p">,</span> <span class="s">"USO"</span><span class="p">]</span>
<span class="n">etfDF</span> <span class="o">=</span> <span class="p">[</span><span class="n">load_data</span><span class="p">(</span><span class="n">etf</span><span class="p">)</span> <span class="k">for</span> <span class="n">etf</span> <span class="ow">in</span> <span class="n">etfs</span><span class="p">]</span>
<span class="n">etfDF</span> <span class="o">=</span> <span class="n">pl</span><span class="p">.</span><span class="n">concat</span><span class="p">(</span><span class="n">etfDF</span><span class="p">)</span>
<span class="n">etfDF</span> <span class="o">=</span> <span class="n">etfDF</span><span class="p">.</span><span class="n">sort</span><span class="p">(</span><span class="s">"datetime"</span><span class="p">)</span>

<span class="n">etfDF</span> <span class="o">=</span> <span class="n">etfDF</span><span class="p">.</span><span class="n">with_columns</span><span class="p">(</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"close"</span><span class="p">).</span><span class="n">log</span><span class="p">().</span><span class="n">diff</span><span class="p">().</span><span class="n">over</span><span class="p">(</span><span class="s">"ccy"</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">"log_return"</span><span class="p">))</span>
</code></pre></div></div>

<p>We need to normalise the log returns by rolling volatility. To calculate the volatility, we take the standard deviation of the returns in a 52-week period.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">etfDF</span> <span class="o">=</span> <span class="n">etfDF</span><span class="p">.</span><span class="n">with_columns</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"log_return"</span><span class="p">).</span><span class="n">rolling_std</span><span class="p">(</span><span class="n">window_size</span><span class="o">=</span><span class="mi">52</span><span class="p">).</span><span class="n">over</span><span class="p">(</span><span class="s">"ccy"</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">"vol_52"</span><span class="p">))</span>
<span class="n">etfDF</span> <span class="o">=</span> <span class="n">etfDF</span><span class="p">.</span><span class="n">with_columns</span><span class="p">((</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"log_return"</span><span class="p">)</span><span class="o">/</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"vol_52"</span><span class="p">)).</span><span class="n">alias</span><span class="p">(</span><span class="s">"log_return_scaled"</span><span class="p">))</span>


<span class="n">etfDF</span> <span class="o">=</span> <span class="n">etfDF</span><span class="p">.</span><span class="n">select</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"datetime"</span><span class="p">),</span> <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"ccy"</span><span class="p">),</span> <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"log_return_scaled"</span><span class="p">))</span>
<span class="n">etfDF</span> <span class="o">=</span> <span class="n">etfDF</span><span class="p">.</span><span class="n">pivot</span><span class="p">(</span><span class="n">values</span><span class="o">=</span><span class="s">"log_return_scaled"</span><span class="p">,</span> <span class="n">index</span><span class="o">=</span><span class="s">"datetime"</span><span class="p">,</span> <span class="n">columns</span><span class="o">=</span><span class="s">"ccy"</span><span class="p">)</span>
</code></pre></div></div>
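<p>To make the scaling step concrete, here is a plain-Python sketch of what the log-return and rolling-volatility expressions compute for a single series. The helper names and the closes are made up, and I use a window of 3 rather than 52 just to keep the example short.</p>

```python
import math
from statistics import stdev


def log_returns(closes):
    """Week-on-week log returns; the first entry is None as there is no prior close."""
    out = [None]
    for prev, curr in zip(closes, closes[1:]):
        out.append(math.log(curr / prev))
    return out


def rolling_std(values, window):
    """Sample standard deviation over a trailing window; None until it is full."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        if len(chunk) == window and all(v is not None for v in chunk):
            out.append(stdev(chunk))
        else:
            out.append(None)
    return out


def scale_by_vol(returns, vols):
    """Divide each return by its rolling volatility where both exist."""
    return [r / v if r is not None and v else None for r, v in zip(returns, vols)]


closes = [100.0, 101.0, 99.0, 102.0, 103.0]  # made-up weekly closes
rets = log_returns(closes)
vols = rolling_std(rets, window=3)  # 3 rather than 52 to keep it short
scaled = scale_by_vol(rets, vols)
```

<p>The scaled series is in units of standard deviations, which is why very different assets like bonds and oil end up on a comparable footing. It also means the first year of weekly data is lost while the 52-week window fills.</p>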

<p>When we plot these normalised ETF returns, they line up with what we expect.</p>

<p><img src="/assets/fxfundamental/etfs.png" alt="Normalized weekly returns for four macro ETFs (VTI, BND, GLD, USO) from 2015 to 2024, displayed as separate line charts stacked vertically. Each chart shows volatility-scaled returns fluctuating around zero between approximately -3 and 3 standard deviations. VTI exhibits sharp downturns in early 2020 and 2022. BND shows relative stability with occasional spikes. GLD displays distinct upward trends in 2020 and 2024. USO demonstrates high volatility with dramatic swings throughout the period, particularly severe in 2020." /></p>

<p>We need to ensure the different ETFs aren’t overly correlated. Highly correlated ETF returns would indicate redundant information, and multicollinearity in our regression analysis would lead to unstable coefficient estimates. Ideally, the ETF returns should capture distinct dimensions of macro risk.</p>

<p>Polars makes it easy to calculate the correlations over time with the <code class="language-plaintext highlighter-rouge">rolling_corr</code> function.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">etfDFCorr</span> <span class="o">=</span> <span class="n">etfDF</span><span class="p">.</span><span class="n">with_columns</span><span class="p">(</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">rolling_corr</span><span class="p">(</span><span class="s">"VTI"</span><span class="p">,</span> <span class="s">"USO"</span><span class="p">,</span> <span class="n">window_size</span><span class="o">=</span><span class="mi">52</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">"VTI_USO_corr"</span><span class="p">),</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">rolling_corr</span><span class="p">(</span><span class="s">"VTI"</span><span class="p">,</span> <span class="s">"BND"</span><span class="p">,</span> <span class="n">window_size</span><span class="o">=</span><span class="mi">52</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">"VTI_BND_corr"</span><span class="p">),</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">rolling_corr</span><span class="p">(</span><span class="s">"VTI"</span><span class="p">,</span> <span class="s">"GLD"</span><span class="p">,</span> <span class="n">window_size</span><span class="o">=</span><span class="mi">52</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">"VTI_GLD_corr"</span><span class="p">),</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">rolling_corr</span><span class="p">(</span><span class="s">"USO"</span><span class="p">,</span> <span class="s">"GLD"</span><span class="p">,</span> <span class="n">window_size</span><span class="o">=</span><span class="mi">52</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">"USO_GLD_corr"</span><span class="p">)</span>
    <span class="p">).</span><span class="n">drop_nulls</span><span class="p">()</span>
</code></pre></div></div>

<p>Plotting these correlations gives us confidence that everything is reasonable.</p>

<p><img src="/assets/fxfundamental/etf_corr.png" alt="Rolling 52-week correlations between macro ETFs showing VTI-USO, VTI-BND, VTI-GLD, and USO-GLD pairs from 2015 to 2024. All correlations fluctuate between approximately -0.4 and 0.8, with most pairs averaging near zero. VTI-BND correlation shows a notable shift to consistently positive values from 2020 onwards after being negative or near-zero in earlier years." /></p>

<p>At worst, we see a 0.6 correlation, which is just about acceptable as it only occurs for a brief period.</p>

<p>Sidenote: it’s interesting how the stock–bond correlation hasn’t been negative since 2020. Thinking out loud, but that must have some big consequences for the risk profile of the 60/40 allocation. Another post for another day perhaps.</p>

<p>Now, onto the FX data.</p>

<h3 id="getting-the-fx--dxy-data">Getting the FX + DXY Data</h3>

<p>Again, following my last post I’m now just pulling the weekly data instead of daily. I’ve also wrapped the DXY calculations from my previous post (<a href="https://dm13450.github.io/2026/03/10/Making-Sense-of-the-DXY.html">Making Sense of the DXY</a>) into a nice function.</p>

<p>We load the data for each of the 33 currencies available.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">dfs</span> <span class="o">=</span> <span class="p">[</span><span class="n">load_data</span><span class="p">(</span><span class="n">ccy</span><span class="p">)</span> <span class="k">for</span> <span class="n">ccy</span> <span class="ow">in</span> <span class="n">ccys</span><span class="p">]</span>
<span class="n">df</span> <span class="o">=</span> <span class="n">pl</span><span class="p">.</span><span class="n">concat</span><span class="p">(</span><span class="n">dfs</span><span class="p">)</span>
<span class="n">df</span> <span class="o">=</span> <span class="n">df</span><span class="p">.</span><span class="n">sort</span><span class="p">(</span><span class="s">"datetime"</span><span class="p">)</span>
<span class="n">df</span> <span class="o">=</span> <span class="n">df</span><span class="p">.</span><span class="n">drop</span><span class="p">(</span><span class="s">"open"</span><span class="p">,</span> <span class="s">"high"</span><span class="p">,</span> <span class="s">"low"</span><span class="p">)</span>
</code></pre></div></div>

<p>Then join the DXY and ETF data.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">dxy</span> <span class="o">=</span> <span class="n">load_dxy</span><span class="p">()</span>
<span class="n">df</span> <span class="o">=</span> <span class="n">df</span><span class="p">.</span><span class="n">join</span><span class="p">(</span><span class="n">dxy</span><span class="p">.</span><span class="n">select</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"datetime"</span><span class="p">),</span> <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"dxy_close"</span><span class="p">)),</span> <span class="n">on</span><span class="o">=</span><span class="s">"datetime"</span><span class="p">,</span> <span class="n">how</span><span class="o">=</span><span class="s">"left"</span><span class="p">)</span>
<span class="n">df</span> <span class="o">=</span> <span class="n">df</span><span class="p">.</span><span class="n">join</span><span class="p">(</span><span class="n">etfDF</span><span class="p">,</span> <span class="n">on</span><span class="o">=</span><span class="s">"datetime"</span><span class="p">,</span> <span class="n">how</span><span class="o">=</span><span class="s">"left"</span><span class="p">)</span>
</code></pre></div></div>

<p>We then calculate the returns and the 1-month, 6-month and 1-year momentum factors.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">df</span> <span class="o">=</span> <span class="n">df</span><span class="p">.</span><span class="n">with_columns</span><span class="p">(</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"close"</span><span class="p">).</span><span class="n">log</span><span class="p">().</span><span class="n">diff</span><span class="p">().</span><span class="n">over</span><span class="p">(</span><span class="s">"ccy"</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">"log_return"</span><span class="p">),</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"dxy_close"</span><span class="p">).</span><span class="n">log</span><span class="p">().</span><span class="n">diff</span><span class="p">().</span><span class="n">over</span><span class="p">(</span><span class="s">"ccy"</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">"dxy_log_return"</span><span class="p">),</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"close"</span><span class="p">).</span><span class="n">log</span><span class="p">().</span><span class="n">diff</span><span class="p">(</span><span class="n">n</span><span class="o">=</span><span class="mi">4</span><span class="p">).</span><span class="n">shift</span><span class="p">(</span><span class="mi">1</span><span class="p">).</span><span class="n">over</span><span class="p">(</span><span class="s">"ccy"</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">"log_return_4"</span><span class="p">),</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"close"</span><span class="p">).</span><span class="n">log</span><span class="p">().</span><span class="n">diff</span><span class="p">(</span><span class="n">n</span><span class="o">=</span><span class="mi">26</span><span class="p">).</span><span class="n">shift</span><span class="p">(</span><span class="mi">1</span><span class="p">).</span><span class="n">over</span><span class="p">(</span><span class="s">"ccy"</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">"log_return_26"</span><span class="p">),</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"close"</span><span class="p">).</span><span class="n">log</span><span class="p">().</span><span class="n">diff</span><span class="p">(</span><span class="n">n</span><span class="o">=</span><span class="mi">52</span><span class="p">).</span><span class="n">shift</span><span class="p">(</span><span class="mi">1</span><span class="p">).</span><span class="n">over</span><span class="p">(</span><span class="s">"ccy"</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">"log_return_52"</span><span class="p">)</span>
<span class="p">)</span>
</code></pre></div></div>

<p>As with the ETF returns, we also normalise the currency returns and DXY returns by their 52-week rolling volatility.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">df</span> <span class="o">=</span> <span class="n">df</span><span class="p">.</span><span class="n">with_columns</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"log_return"</span><span class="p">).</span><span class="n">rolling_std</span><span class="p">(</span><span class="n">window_size</span><span class="o">=</span><span class="mi">52</span><span class="p">).</span><span class="n">over</span><span class="p">(</span><span class="s">"ccy"</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">"vol_52"</span><span class="p">))</span>
<span class="n">df</span> <span class="o">=</span> <span class="n">df</span><span class="p">.</span><span class="n">with_columns</span><span class="p">((</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"log_return"</span><span class="p">)</span><span class="o">/</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"vol_52"</span><span class="p">)).</span><span class="n">alias</span><span class="p">(</span><span class="s">"log_return_scaled"</span><span class="p">))</span>

<span class="n">df</span> <span class="o">=</span> <span class="n">df</span><span class="p">.</span><span class="n">with_columns</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"dxy_log_return"</span><span class="p">).</span><span class="n">rolling_std</span><span class="p">(</span><span class="n">window_size</span><span class="o">=</span><span class="mi">52</span><span class="p">).</span><span class="n">over</span><span class="p">(</span><span class="s">"ccy"</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">"dxy_vol_52"</span><span class="p">))</span>
<span class="n">df</span> <span class="o">=</span> <span class="n">df</span><span class="p">.</span><span class="n">with_columns</span><span class="p">((</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"dxy_log_return"</span><span class="p">)</span><span class="o">/</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"dxy_vol_52"</span><span class="p">)).</span><span class="n">alias</span><span class="p">(</span><span class="s">"dxy_log_return_scaled"</span><span class="p">))</span>
</code></pre></div></div>

<p>We normalise the momentum features in the same way.</p>

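<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">df</span> <span class="o">=</span> <span class="n">df</span><span class="p">.</span><span class="n">with_columns</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"log_return_4"</span><span class="p">).</span><span class="n">rolling_std</span><span class="p">(</span><span class="n">window_size</span><span class="o">=</span><span class="mi">52</span><span class="p">).</span><span class="n">over</span><span class="p">(</span><span class="s">"ccy"</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">"log_return_4_vol_52"</span><span class="p">))</span>
<span class="n">df</span> <span class="o">=</span> <span class="n">df</span><span class="p">.</span><span class="n">with_columns</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"log_return_26"</span><span class="p">).</span><span class="n">rolling_std</span><span class="p">(</span><span class="n">window_size</span><span class="o">=</span><span class="mi">52</span><span class="p">).</span><span class="n">over</span><span class="p">(</span><span class="s">"ccy"</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">"log_return_26_vol_52"</span><span class="p">))</span>
<span class="n">df</span> <span class="o">=</span> <span class="n">df</span><span class="p">.</span><span class="n">with_columns</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"log_return_52"</span><span class="p">).</span><span class="n">rolling_std</span><span class="p">(</span><span class="n">window_size</span><span class="o">=</span><span class="mi">52</span><span class="p">).</span><span class="n">over</span><span class="p">(</span><span class="s">"ccy"</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">"log_return_52_vol_52"</span><span class="p">))</span>
<span class="n">df</span> <span class="o">=</span> <span class="n">df</span><span class="p">.</span><span class="n">with_columns</span><span class="p">((</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"log_return_4"</span><span class="p">)</span><span class="o">/</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"log_return_4_vol_52"</span><span class="p">)).</span><span class="n">alias</span><span class="p">(</span><span class="s">"log_return_4_scaled"</span><span class="p">))</span>
<span class="n">df</span> <span class="o">=</span> <span class="n">df</span><span class="p">.</span><span class="n">with_columns</span><span class="p">((</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"log_return_26"</span><span class="p">)</span><span class="o">/</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"log_return_26_vol_52"</span><span class="p">)).</span><span class="n">alias</span><span class="p">(</span><span class="s">"log_return_26_scaled"</span><span class="p">))</span>
<span class="n">df</span> <span class="o">=</span> <span class="n">df</span><span class="p">.</span><span class="n">with_columns</span><span class="p">((</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"log_return_52"</span><span class="p">)</span><span class="o">/</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"log_return_52_vol_52"</span><span class="p">)).</span><span class="n">alias</span><span class="p">(</span><span class="s">"log_return_52_scaled"</span><span class="p">))</span>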
</code></pre></div></div>

<p>It is also recommended that you <a href="https://en.wikipedia.org/wiki/Winsorizing">winsorise</a> the return data. This means clipping extreme values to the 5% and 95% quantiles, which is a one-liner in polars. It reduces the influence of outliers on the models and keeps the data a bit cleaner.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">cols</span> <span class="o">=</span> <span class="p">[</span><span class="s">"log_return_scaled"</span><span class="p">,</span> <span class="s">"dxy_log_return_scaled"</span><span class="p">,</span> <span class="s">"log_return_4_scaled"</span><span class="p">,</span> <span class="s">"log_return_26_scaled"</span><span class="p">,</span> <span class="s">"log_return_52_scaled"</span><span class="p">,</span>
       <span class="s">"GLD"</span><span class="p">,</span> <span class="s">"BND"</span><span class="p">,</span> <span class="s">"VTI"</span><span class="p">,</span> <span class="s">"USO"</span><span class="p">]</span>

<span class="n">df</span> <span class="o">=</span> <span class="n">df</span><span class="p">.</span><span class="n">with_columns</span><span class="p">([</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="n">c</span><span class="p">).</span><span class="n">clip</span><span class="p">(</span>
        <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="n">c</span><span class="p">).</span><span class="n">quantile</span><span class="p">(</span><span class="mf">0.05</span><span class="p">),</span>
        <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="n">c</span><span class="p">).</span><span class="n">quantile</span><span class="p">(</span><span class="mf">0.95</span><span class="p">)</span>
    <span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="sa">f</span><span class="s">"</span><span class="si">{</span><span class="n">c</span><span class="si">}</span><span class="s">_clipped"</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">c</span> <span class="ow">in</span> <span class="n">cols</span>
<span class="p">])</span>
</code></pre></div></div>
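
<p>As a toy illustration of what the clipping does, here is a minimal NumPy sketch on made-up numbers. The array and its values are hypothetical, and NumPy stands in for the polars expression above.</p>

```python
import numpy as np

# Hypothetical scaled returns with one extreme value on each side
returns = np.array([-9.0, -1.2, -0.4, -0.1, 0.0, 0.2, 0.3, 0.8, 1.1, 12.0])

# Winsorise: pull anything beyond the 5% / 95% quantiles back to those quantiles
lower, upper = np.quantile(returns, [0.05, 0.95])
clipped = np.clip(returns, lower, upper)

print(f"bounds: [{lower:.2f}, {upper:.2f}]")
print("largest value before/after:", returns.max(), clipped.max())
```

<p>Only the two extreme points move; everything inside the bounds is left untouched. Note that in the polars version the quantiles are taken over the whole column, so the thresholds are shared across currencies and dates.</p>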

<p>With the data collected we can now move on to some modelling.</p>

<h2 id="fx-return-characteristics">FX Return Characteristics</h2>

<p>We need to build a dataset of <em>characteristics</em> per currency pair. These are potential features that will explain an individual currency’s return over time.</p>

<p>Mathematically</p>

\[R = \beta X,\]

<p>where \(R\) is the currency return, \(X\) are the returns from other assets and we want to estimate \(\beta\). If a currency is sensitive to oil, then it will have some element of dependence on the oil ETF USO and \(\beta _\text{USO}\) will capture that effect.</p>
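
<p>To make the idea of a sensitivity concrete, here is a minimal sketch on simulated data. Everything in it is an assumption for illustration: the true \(\beta\) values, the noise level and the variable names are made up, and plain least squares stands in for the rolling regression used later.</p>

```python
import numpy as np

rng = np.random.default_rng(42)
n_weeks = 200

# Simulated factor returns: a DXY-like factor and an oil-like factor
dxy = rng.normal(size=n_weeks)
uso = rng.normal(size=n_weeks)

# A hypothetical currency that loads on both factors, plus noise
true_beta = np.array([0.5, -0.2])
X = np.column_stack([dxy, uso])
R = X @ true_beta + 0.1 * rng.normal(size=n_weeks)

# Least-squares estimate of beta, with a column of ones for the intercept
X_design = np.column_stack([np.ones(n_weeks), X])
coef, *_ = np.linalg.lstsq(X_design, R, rcond=None)
intercept, beta_dxy, beta_uso = coef

print(f"beta_DXY={beta_dxy:.3f}, beta_USO={beta_uso:.3f}")
```

<p>The estimated coefficients should land close to the values used to simulate the data, which is all the regression is doing: recovering how strongly the currency moves with each factor.</p>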

<p>\(X\) contains the weekly values of</p>

<ul>
  <li>Weekly DXY return</li>
  <li>Global stocks (VTI)</li>
  <li>Global bonds (BND)</li>
  <li>Oil (USO)</li>
  <li>Gold (GLD)</li>
  <li>The currency’s momentum at 1, 6 and 12 month intervals.</li>
</ul>

<p>The model is fitted per currency individually as a rolling one-year regression. We use volatility-normalised returns so the \(\beta\)s are more stable over time.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">statsmodels.formula.api</span> <span class="k">as</span> <span class="n">smf</span>
<span class="kn">from</span> <span class="nn">statsmodels.regression.rolling</span> <span class="kn">import</span> <span class="n">RollingOLS</span>

<span class="n">allParams</span> <span class="o">=</span> <span class="p">[]</span>
<span class="c1"># sort the subdata by datetime to ensure the rolling regression works correctly
</span><span class="k">for</span> <span class="n">ccy</span> <span class="ow">in</span> <span class="n">ccys</span><span class="p">:</span>
    <span class="n">subDF</span> <span class="o">=</span> <span class="n">df</span><span class="p">.</span><span class="nb">filter</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"ccy"</span><span class="p">)</span> <span class="o">==</span> <span class="n">ccy</span><span class="p">).</span><span class="n">drop_nulls</span><span class="p">().</span><span class="n">sort</span><span class="p">(</span><span class="s">"datetime"</span><span class="p">)</span>
    <span class="n">mod</span> <span class="o">=</span> <span class="n">RollingOLS</span><span class="p">.</span><span class="n">from_formula</span><span class="p">(</span><span class="s">"log_return_scaled_clipped ~ dxy_log_return_scaled_clipped + GLD_clipped + BND_clipped + VTI_clipped + USO_clipped + log_return_4_scaled_clipped + log_return_26_scaled_clipped + log_return_52_scaled_clipped"</span><span class="p">,</span> 
                <span class="n">window</span> <span class="o">=</span> <span class="mi">52</span><span class="p">,</span>
                <span class="n">data</span><span class="o">=</span><span class="n">subDF</span><span class="p">).</span><span class="n">fit</span><span class="p">()</span>
    
    <span class="n">paramDF</span> <span class="o">=</span> <span class="n">pl</span><span class="p">.</span><span class="n">from_pandas</span><span class="p">(</span><span class="n">mod</span><span class="p">.</span><span class="n">params</span><span class="p">)</span>
    <span class="n">paramDF</span> <span class="o">=</span> <span class="n">paramDF</span><span class="p">.</span><span class="n">with_columns</span><span class="p">(</span><span class="n">ccy</span><span class="o">=</span><span class="n">pl</span><span class="p">.</span><span class="n">lit</span><span class="p">(</span><span class="n">ccy</span><span class="p">),</span> 
                                   <span class="n">datetime</span> <span class="o">=</span> <span class="n">subDF</span><span class="p">[</span><span class="s">"datetime"</span><span class="p">],</span>
                                   <span class="n">log_return</span> <span class="o">=</span> <span class="n">subDF</span><span class="p">[</span><span class="s">"log_return"</span><span class="p">],</span>
                                   <span class="n">log_return_prev</span> <span class="o">=</span> <span class="n">subDF</span><span class="p">[</span><span class="s">"log_return"</span><span class="p">].</span><span class="n">shift</span><span class="p">(</span><span class="mi">1</span><span class="p">),</span> 
                                   <span class="n">r2</span> <span class="o">=</span> <span class="n">mod</span><span class="p">.</span><span class="n">rsquared_adj</span><span class="p">.</span><span class="n">values</span><span class="p">,</span>
                                   <span class="n">vol_52</span> <span class="o">=</span> <span class="n">subDF</span><span class="p">[</span><span class="s">"vol_52"</span><span class="p">])</span>
    
    <span class="n">allParams</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">paramDF</span><span class="p">)</span>


<span class="n">allParams</span> <span class="o">=</span> <span class="n">pl</span><span class="p">.</span><span class="n">concat</span><span class="p">(</span><span class="n">allParams</span><span class="p">).</span><span class="n">drop_nulls</span><span class="p">().</span><span class="n">sort</span><span class="p">(</span><span class="s">"datetime"</span><span class="p">)</span>
</code></pre></div></div>

<p>We save the \(\beta\) time series, the \(R^2\) values, and the volatility.</p>

<p>To make sure the regression is doing a good job for all the time periods we plot the \(R^2\) for a few currencies.</p>

<p><img src="/assets/fxfundamental/ccy_r2.png" alt="R-squared values for eight currency pairs (AUD, CAD, CNH, EUR, GBP, INR, JPY, MXN) plotted as separate line charts stacked vertically from 2015 to 2024. Each chart displays rolling 52-week R-squared values fluctuating between approximately 0 and 0.8. Most currencies show R-squared values clustering between 0.3 and 0.6, indicating the regression model explains 30-60% of weekly return variance. CNH and MXN exhibit the lowest and most volatile R-squared values, often dropping below 0.3, while AUD, EUR, and GBP maintain relatively more stable values above 0.4." /></p>

<p>They are all the right order of magnitude, with CNH and MXN being the worst but still manageable.</p>

<p>If we average over currency pairs and time, we get a rough understanding of the \(\beta\) values.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">betaSummary</span> <span class="o">=</span> <span class="p">(</span>
    <span class="n">allParams</span>
    <span class="p">.</span><span class="n">unpivot</span><span class="p">(</span><span class="n">index</span><span class="o">=</span><span class="p">[</span><span class="s">"datetime"</span><span class="p">,</span> <span class="s">"ccy"</span><span class="p">])</span>
    <span class="p">.</span><span class="n">group_by</span><span class="p">(</span><span class="s">"variable"</span><span class="p">)</span>
    <span class="p">.</span><span class="n">agg</span><span class="p">(</span>
        <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"value"</span><span class="p">).</span><span class="n">mean</span><span class="p">().</span><span class="n">alias</span><span class="p">(</span><span class="s">"mean"</span><span class="p">),</span>
        <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"value"</span><span class="p">).</span><span class="n">std</span><span class="p">().</span><span class="n">alias</span><span class="p">(</span><span class="s">"std"</span><span class="p">),</span>
        <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"value"</span><span class="p">).</span><span class="nb">min</span><span class="p">().</span><span class="n">alias</span><span class="p">(</span><span class="s">"min"</span><span class="p">),</span>
        <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"value"</span><span class="p">).</span><span class="nb">max</span><span class="p">().</span><span class="n">alias</span><span class="p">(</span><span class="s">"max"</span><span class="p">)</span>
    <span class="p">)</span>
    <span class="p">.</span><span class="n">sort</span><span class="p">(</span><span class="s">"mean"</span><span class="p">,</span> <span class="n">descending</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
<span class="p">)</span>
<span class="n">betaSummary</span><span class="p">.</span><span class="nb">filter</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"variable"</span><span class="p">).</span><span class="nb">str</span><span class="p">.</span><span class="n">contains</span><span class="p">(</span><span class="s">"dxy_log_return|log_return_4|log_return_26|log_return_52|VTI|BND|GLD|USO|Intercept"</span><span class="p">))</span>
</code></pre></div></div>

<table>
  <thead>
    <tr>
      <th>Variable</th>
      <th>Mean</th>
      <th>Std</th>
      <th>Min</th>
      <th>Max</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>dxy_log_return_scaled</td>
      <td>0.485</td>
      <td>0.327</td>
      <td>-0.452</td>
      <td>1.34</td>
    </tr>
    <tr>
      <td>Intercept</td>
      <td>0.0853</td>
      <td>0.385</td>
      <td>-2.26</td>
      <td>25.1</td>
    </tr>
    <tr>
      <td>USO</td>
      <td>-0.0222</td>
      <td>0.166</td>
      <td>-0.942</td>
      <td>0.730</td>
    </tr>
    <tr>
      <td>BND</td>
      <td>-0.0267</td>
      <td>0.176</td>
      <td>-0.970</td>
      <td>0.788</td>
    </tr>
    <tr>
      <td>log_return_4_scaled</td>
      <td>-0.0341</td>
      <td>0.146</td>
      <td>-0.864</td>
      <td>1.69</td>
    </tr>
    <tr>
      <td>log_return_52_scaled</td>
      <td>-0.0525</td>
      <td>0.192</td>
      <td>-1.92</td>
      <td>0.932</td>
    </tr>
    <tr>
      <td>log_return_26_scaled</td>
      <td>-0.0591</td>
      <td>0.180</td>
      <td>-1.50</td>
      <td>0.895</td>
    </tr>
    <tr>
      <td>GLD</td>
      <td>-0.0636</td>
      <td>0.179</td>
      <td>-1.06</td>
      <td>0.599</td>
    </tr>
    <tr>
      <td>VTI</td>
      <td>-0.149</td>
      <td>0.206</td>
      <td>-0.982</td>
      <td>0.844</td>
    </tr>
  </tbody>
</table>

<p>DXY is the main driver, with a negative dependence on VTI. This makes sense and lines up with our beliefs: if stocks are doing badly, it’s likely people sold them for cash, and likewise when stocks are doing well people are moving from cash into equities. This helps confirm VTI as a general risk-on/risk-off factor.</p>

<p>It’s frustrating that the intercept has a large average \(\beta\) value, as it means we are missing drivers of currency returns. An obvious omission is the carry factor and how interest rates across countries drive currency returns. Annoyingly, it’s hard to get free data for that, so we will have to make do for now.</p>

<p>We’ve now got a picture of how much each currency depends on macro factors, but this only tells us about individual currencies in isolation. We now need to know if differences in these sensitivities explain why some pairs outperform others.</p>

<p>To answer that, we regress across currency pairs at each point in time. This is known as a cross-sectional regression.</p>

<h2 id="cross-sectional-regression-for-currency-returns">Cross Sectional Regression for Currency Returns</h2>

<p>From the first regression we have currency characteristics over time. For the cross-sectional regression, we now use all the currencies per week and then run the regression to see if the sensitivity to the factors (the \(\beta\)s) explains the returns.</p>

<p>We also add in a currency group factor as an additional characteristic that classifies broad groups of currency pairs.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">allParams</span> <span class="o">=</span> <span class="n">allParams</span><span class="p">.</span><span class="n">with_columns</span><span class="p">(</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"ccy"</span><span class="p">).</span><span class="n">map_elements</span><span class="p">(</span><span class="n">ccy_group_map</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">"ccyGroup"</span><span class="p">)</span>
    <span class="p">)</span>  
</code></pre></div></div>

<p>We normalise the \(\beta\)s across the currency pairs each week (subtracting the cross-sectional mean and dividing by the standard deviation), which helps keep everything comparable.</p>

<p>This time mathematically,</p>

\[R = \lambda B,\]

<p>where \(R\) are the currency returns for a given week and \(B\) is the matrix of normalised \(\beta\) values and the currency group indicator. We use weighted least squares to estimate \(\lambda\). The weights are the inverse of the squared volatility, which reduces the impact of high-volatility pairs.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">allParams2</span> <span class="o">=</span> <span class="p">[]</span>

<span class="n">factor_cols</span> <span class="o">=</span> <span class="p">[</span><span class="s">"dxy_log_return_scaled_clipped"</span><span class="p">,</span> <span class="s">"GLD_clipped"</span><span class="p">,</span> <span class="s">"BND_clipped"</span><span class="p">,</span> <span class="s">"VTI_clipped"</span><span class="p">,</span> <span class="s">"USO_clipped"</span><span class="p">,</span> <span class="s">"log_return_4_scaled_clipped"</span><span class="p">,</span> <span class="s">"log_return_26_scaled_clipped"</span><span class="p">,</span> <span class="s">"log_return_52_scaled_clipped"</span><span class="p">]</span>

<span class="k">for</span> <span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">dt</span><span class="p">)</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">allParams</span><span class="p">[</span><span class="s">"datetime"</span><span class="p">].</span><span class="n">unique</span><span class="p">()):</span>
    <span class="n">subDF</span> <span class="o">=</span> <span class="n">allParams</span><span class="p">.</span><span class="nb">filter</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"datetime"</span><span class="p">)</span> <span class="o">==</span> <span class="n">dt</span><span class="p">)</span>

    <span class="n">subDF</span> <span class="o">=</span> <span class="n">subDF</span><span class="p">.</span><span class="n">with_columns</span><span class="p">([</span>
    <span class="p">((</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="n">c</span><span class="p">)</span> <span class="o">-</span> <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="n">c</span><span class="p">).</span><span class="n">mean</span><span class="p">().</span><span class="n">over</span><span class="p">(</span><span class="s">"datetime"</span><span class="p">))</span> <span class="o">/</span> 
      <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="n">c</span><span class="p">).</span><span class="n">std</span><span class="p">().</span><span class="n">over</span><span class="p">(</span><span class="s">"datetime"</span><span class="p">)).</span><span class="n">alias</span><span class="p">(</span><span class="sa">f</span><span class="s">"</span><span class="si">{</span><span class="n">c</span><span class="si">}</span><span class="s">_scaled"</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">c</span> <span class="ow">in</span> <span class="n">factor_cols</span>
    <span class="p">])</span>

    <span class="n">csr</span> <span class="o">=</span> <span class="n">smf</span><span class="p">.</span><span class="n">wls</span><span class="p">(</span><span class="s">"log_return_prev ~ ccyGroup + dxy_log_return_scaled_clipped_scaled + GLD_clipped_scaled + BND_clipped_scaled + VTI_clipped_scaled + USO_clipped_scaled + log_return_4_scaled_clipped_scaled + log_return_26_scaled_clipped_scaled + log_return_52_scaled_clipped_scaled"</span><span class="p">,</span> 
                  <span class="n">data</span><span class="o">=</span><span class="n">subDF</span><span class="p">,</span> <span class="n">weights</span><span class="o">=</span><span class="mi">1</span><span class="o">/</span><span class="p">(</span><span class="n">subDF</span><span class="p">[</span><span class="s">"vol_52"</span><span class="p">]</span><span class="o">**</span><span class="mi">2</span><span class="p">)).</span><span class="n">fit</span><span class="p">()</span>

    <span class="n">paramsRes</span> <span class="o">=</span> <span class="n">pl</span><span class="p">.</span><span class="n">DataFrame</span><span class="p">(</span><span class="n">data</span> <span class="o">=</span> <span class="p">[[</span><span class="n">x</span><span class="p">]</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">csr</span><span class="p">.</span><span class="n">params</span><span class="p">.</span><span class="n">values</span><span class="p">],</span> 
             <span class="n">schema</span><span class="o">=</span><span class="nb">list</span><span class="p">(</span><span class="n">csr</span><span class="p">.</span><span class="n">params</span><span class="p">.</span><span class="n">index</span><span class="p">.</span><span class="n">values</span><span class="p">))</span>

    <span class="n">paramsRes</span> <span class="o">=</span> <span class="n">paramsRes</span><span class="p">.</span><span class="n">with_columns</span><span class="p">(</span><span class="n">datetime</span><span class="o">=</span><span class="n">pl</span><span class="p">.</span><span class="n">lit</span><span class="p">(</span><span class="n">dt</span><span class="p">))</span>
    <span class="n">allParams2</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">paramsRes</span><span class="p">)</span>

<span class="n">allParams2</span> <span class="o">=</span> <span class="n">pl</span><span class="p">.</span><span class="n">concat</span><span class="p">(</span><span class="n">allParams2</span><span class="p">).</span><span class="n">drop_nulls</span><span class="p">().</span><span class="n">sort</span><span class="p">(</span><span class="s">"datetime"</span><span class="p">)</span>
<span class="n">allParams2</span> <span class="o">=</span> <span class="n">allParams2</span><span class="p">.</span><span class="nb">filter</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"datetime"</span><span class="p">)</span> <span class="o">!=</span> <span class="n">pl</span><span class="p">.</span><span class="n">date</span><span class="p">(</span><span class="mi">2009</span><span class="p">,</span><span class="mi">4</span><span class="p">,</span> <span class="mi">14</span><span class="p">))</span>
<span class="n">allParams2</span> <span class="o">=</span> <span class="n">allParams2</span><span class="p">.</span><span class="nb">filter</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"datetime"</span><span class="p">)</span> <span class="o">!=</span> <span class="n">pl</span><span class="p">.</span><span class="n">date</span><span class="p">(</span><span class="mi">2009</span><span class="p">,</span><span class="mi">4</span><span class="p">,</span> <span class="mi">15</span><span class="p">))</span>
<span class="n">allParams2</span> <span class="o">=</span> <span class="n">allParams2</span><span class="p">.</span><span class="nb">filter</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"datetime"</span><span class="p">)</span> <span class="o">!=</span> <span class="n">pl</span><span class="p">.</span><span class="n">date</span><span class="p">(</span><span class="mi">2009</span><span class="p">,</span><span class="mi">4</span><span class="p">,</span> <span class="mi">16</span><span class="p">))</span>
</code></pre></div></div>
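
<p>For a single week, the weighted regression has a closed form that is useful to keep in mind for the portfolio construction later. The following is a minimal sketch on simulated data, assuming made-up exposures, returns and volatilities (none of these numbers come from the dataset above) and using NumPy rather than statsmodels.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
n_ccy, n_factors = 20, 3

# Hypothetical cross-section for one week: intercept plus three exposures
B = np.column_stack([np.ones(n_ccy), rng.normal(size=(n_ccy, n_factors))])
vol = rng.uniform(0.5, 2.0, size=n_ccy)
R = B @ np.array([0.0, 0.3, -0.1, 0.2]) + 0.1 * vol * rng.normal(size=n_ccy)

# Inverse-variance weights, mirroring weights=1/vol_52**2
W = np.diag(1.0 / vol**2)

# Closed-form WLS: lambda = (B' W B)^{-1} B' W R
lam = np.linalg.solve(B.T @ W @ B, B.T @ W @ R)

# Same answer from ordinary least squares on rows rescaled by 1/vol
lam_check, *_ = np.linalg.lstsq(B / vol[:, None], R / vol, rcond=None)

print(np.round(lam, 3))
```

<p>The two estimates agree, which shows that the weighting is equivalent to dividing each currency's row by its volatility before running an ordinary regression.</p>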

<p>To check the performance of the regression, we plot the \(R^2\) over time.</p>

<p><img src="/assets/fxfundamental/crossec_r2.png" alt="Cross-sectional R-squared values for the factor model plotted as a line chart from 2015 to 2024. The chart shows weekly rolling R-squared values fluctuating between approximately 0.2 and 0.6, indicating the model explains 20-60% of cross-currency return variance. The values are generally noisy with frequent peaks and troughs, but a rolling average line reveals a relatively stable trend hovering around 0.4 throughout the period. A marked dip occurs around 2020 during market volatility, with values recovering afterward. The overall pattern suggests moderate and consistent explanatory power of the factor model across currency pairs despite week-to-week variation." /></p>

<p>Again noisy, but the rolling average moves around 0.4, which is a respectable value.</p>

<p>We then calculate the t-stats by taking the average fitted parameters and dividing by the standard error.</p>

<table>
  <thead>
    <tr>
      <th>variable</th>
      <th>avg</th>
      <th>std</th>
      <th>N</th>
      <th>std_error</th>
      <th>t_stat</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>ccyGroup[T.EM]</td>
      <td>0.00113</td>
      <td>0.00920</td>
      <td>797</td>
      <td>0.000326</td>
      <td>3.46</td>
    </tr>
    <tr>
      <td>log_return_4_scaled_clipped_scaled</td>
      <td>0.000266</td>
      <td>0.00240</td>
      <td>797</td>
      <td>0.000085</td>
      <td>3.13</td>
    </tr>
    <tr>
      <td>log_return_26_scaled_clipped_scaled</td>
      <td>0.000324</td>
      <td>0.00295</td>
      <td>797</td>
      <td>0.000105</td>
      <td>3.10</td>
    </tr>
    <tr>
      <td>ccyGroup[T.SCANDI]</td>
      <td>0.000586</td>
      <td>0.00982</td>
      <td>797</td>
      <td>0.000348</td>
      <td>1.68</td>
    </tr>
    <tr>
      <td>ccyGroup[T.G7]</td>
      <td>0.000452</td>
      <td>0.00776</td>
      <td>797</td>
      <td>0.000275</td>
      <td>1.64</td>
    </tr>
    <tr>
      <td>ccyGroup[T.CEMA]</td>
      <td>0.000584</td>
      <td>0.0101</td>
      <td>797</td>
      <td>0.000358</td>
      <td>1.63</td>
    </tr>
    <tr>
      <td>log_return_52_scaled_clipped_scaled</td>
      <td>0.000135</td>
      <td>0.00291</td>
      <td>797</td>
      <td>0.000103</td>
      <td>1.31</td>
    </tr>
    <tr>
      <td>ccyGroup[T.LATAM]</td>
      <td>0.000404</td>
      <td>0.00949</td>
      <td>797</td>
      <td>0.000336</td>
      <td>1.20</td>
    </tr>
    <tr>
      <td>VTI_clipped_scaled</td>
      <td>0.000102</td>
      <td>0.00319</td>
      <td>797</td>
      <td>0.000113</td>
      <td>0.906</td>
    </tr>
    <tr>
      <td>GLD_clipped_scaled</td>
      <td>0.0000670</td>
      <td>0.00284</td>
      <td>797</td>
      <td>0.000101</td>
      <td>0.668</td>
    </tr>
    <tr>
      <td>Intercept</td>
      <td>0.0000620</td>
      <td>0.00520</td>
      <td>797</td>
      <td>0.000184</td>
      <td>0.335</td>
    </tr>
    <tr>
      <td>USO_clipped_scaled</td>
      <td>-0.00000200</td>
      <td>0.00285</td>
      <td>797</td>
      <td>0.000101</td>
      <td>-0.0175</td>
    </tr>
    <tr>
      <td>BND_clipped_scaled</td>
      <td>-0.0000840</td>
      <td>0.00289</td>
      <td>797</td>
      <td>0.000102</td>
      <td>-0.822</td>
    </tr>
    <tr>
      <td>dxy_log_return_scaled_clipped_scaled</td>
      <td>-0.000340</td>
      <td>0.00398</td>
      <td>797</td>
      <td>0.000141</td>
      <td>-2.42</td>
    </tr>
  </tbody>
</table>
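
<p>The columns in the table can be sketched as follows. This assumes the standard error is the sample standard deviation of the weekly estimates divided by \(\sqrt{N}\), which is consistent with the numbers in the table. The function name and the simulated series are hypothetical.</p>

```python
import numpy as np

def factor_t_stat(weekly_lambdas):
    """Average weekly factor return divided by its standard error."""
    lam = np.asarray(weekly_lambdas, dtype=float)
    n = lam.size
    avg = lam.mean()
    std = lam.std(ddof=1)  # sample standard deviation
    std_error = std / np.sqrt(n)
    return avg, std, n, std_error, avg / std_error

# Simulated weekly estimates for one factor (made-up mean and spread)
rng = np.random.default_rng(1)
lams = rng.normal(loc=0.001, scale=0.009, size=797)
avg, std, n, std_error, t_stat = factor_t_stat(lams)
print(f"avg={avg:.5f}, std_error={std_error:.6f}, t_stat={t_stat:.2f}")

# Sanity check against the rounded EM row of the table
em_std_error = 0.00920 / np.sqrt(797)
em_t_stat = 0.00113 / em_std_error
print(f"EM row: std_error={em_std_error:.6f}, t_stat={em_t_stat:.2f}")
```

<p>Plugging in the rounded EM values reproduces the table's standard error and gets within rounding of its t-stat.</p>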

<p>Anything over 2 in absolute value is deemed significant, which gives us:</p>

<ul>
  <li>EM pairs</li>
  <li>1-month momentum</li>
  <li>6-month momentum</li>
  <li>DXY return</li>
</ul>
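<p>To make the significance test concrete, here is a minimal sketch that recomputes the last column of the table. It assumes the columns are, in order, the mean factor return, its standard deviation, the number of cross-sectional fits, the standard error and the t-statistic; the numbers are consistent with that reading. The function name <code>t_stat</code> is made up for illustration.</p>

```python
import math


def t_stat(mean_return, std_return, n_periods):
    """t-statistic of an average factor return: mean / (std / sqrt(n))."""
    standard_error = std_return / math.sqrt(n_periods)
    return mean_return / standard_error


# (mean, std, n) copied from three rows of the table above
factors = {
    "log_return_26_scaled_clipped_scaled": (0.000324, 0.00295, 797),
    "dxy_log_return_scaled_clipped_scaled": (-0.000340, 0.00398, 797),
    "VTI_clipped_scaled": (0.000102, 0.00319, 797),
}

t_stats = {name: t_stat(*row) for name, row in factors.items()}

# Significance is judged on the absolute value, so a strongly negative
# factor such as the DXY return also passes the threshold.
significant = [name for name, t in t_stats.items() if abs(t) > 2]
print(significant)
```

<p>Because the table values are rounded, the recomputed t-statistics can differ from the printed ones in the second decimal place.</p>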

<p>We can look at the returns of these factors (just the significant ones).</p>

<p><img src="/assets/fxfundamental/sigfactorreturn.png" alt="Significant factor returns (EM premium, 1-month momentum, 6-month momentum, and DXY return) plotted as separate line charts stacked vertically from 2015 to 2024. Each chart displays weekly cumulative returns measured in percentage terms. The EM factor shows strong upward trend reaching approximately 150% by 2024 with notable dip in 2020. The 1-month and 6-month momentum factors display more volatile patterns with cumulative returns fluctuating between -50% and 100%, both showing recovery from 2020 lows. The DXY factor demonstrates downward trend from 2015 with cumulative returns declining to approximately -100% by 2024, indicating negative factor premium. All charts include rolling averages to highlight longer-term trends amid weekly volatility." /></p>

<p>The EM factor has the best return, and all the factor returns are positive except the DXY factor. For the EM factor, the coefficients are significant and positive; therefore we interpret this as investors demanding a premium return for holding EM pairs, or at least the ones I’ve tagged as EM. The two momentum factors command a premium in the same way.</p>

<p>But what currency pairs do you need to buy and sell to get these factor returns?</p>

<h2 id="how-to-build-the-factor-portfolios">How to Build the Factor Portfolios</h2>

<p>After fitting the cross-sectional regression model we arrive at \(\hat{\lambda}\), which are the factor returns. What we now want are the currency weights \(w\) that will get us from the currency returns \(R\) to the factor returns,</p>

\[\hat{\lambda} = w R,\]

<p>after some maths you arrive at</p>

\[w = (B^TB)^{-1}B^T.\]
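<p>A sketch of that maths, assuming the cross-sectional regression is plain unweighted least squares, with \(B\) the matrix of currency exposures (one row per currency, one column per factor) and \(R\) the vector of currency returns on a given date:</p>

\[R = B\lambda + \epsilon, \qquad \hat{\lambda} = \arg\min_{\lambda} \lVert R - B\lambda \rVert^2.\]

<p>Setting the gradient to zero gives the normal equations,</p>

\[B^T (R - B\hat{\lambda}) = 0 \quad \Rightarrow \quad \hat{\lambda} = (B^TB)^{-1}B^T R,\]

<p>so comparing with \(\hat{\lambda} = w R\), the weights are simply the matrix that multiplies \(R\). A weighted regression would carry an extra weighting matrix through the same steps.</p>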

<p>Easy enough to translate into Python.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">ccyWeights</span> <span class="o">=</span> <span class="p">[]</span>

<span class="k">for</span> <span class="n">dt</span> <span class="ow">in</span> <span class="n">allParams</span><span class="p">[</span><span class="s">"datetime"</span><span class="p">].</span><span class="n">unique</span><span class="p">():</span>
    <span class="n">betas</span> <span class="o">=</span> <span class="n">allParams</span><span class="p">.</span><span class="n">select</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">exclude</span><span class="p">(</span><span class="s">"log_return"</span><span class="p">,</span> <span class="s">"log_return_prev"</span><span class="p">,</span> <span class="s">"ccyGroup"</span><span class="p">,</span> <span class="s">"r2"</span><span class="p">,</span> <span class="s">"vol_52"</span><span class="p">))</span>
    <span class="n">betas</span> <span class="o">=</span> <span class="n">betas</span><span class="p">.</span><span class="nb">filter</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"datetime"</span><span class="p">)</span> <span class="o">==</span> <span class="n">dt</span><span class="p">)</span>
    <span class="n">B</span> <span class="o">=</span> <span class="n">betas</span><span class="p">.</span><span class="n">select</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">exclude</span><span class="p">(</span><span class="s">"datetime"</span><span class="p">,</span> <span class="s">"ccy"</span><span class="p">)).</span><span class="n">to_numpy</span><span class="p">()</span>

    <span class="n">W</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">linalg</span><span class="p">.</span><span class="n">solve</span><span class="p">(</span><span class="n">B</span><span class="p">.</span><span class="n">T</span> <span class="o">@</span> <span class="n">B</span><span class="p">,</span> <span class="n">B</span><span class="p">.</span><span class="n">T</span><span class="p">)</span>

    <span class="n">res</span> <span class="o">=</span> <span class="n">pl</span><span class="p">.</span><span class="n">DataFrame</span><span class="p">(</span><span class="n">W</span><span class="p">,</span> <span class="n">schema</span><span class="o">=</span><span class="n">betas</span><span class="p">[</span><span class="s">"ccy"</span><span class="p">].</span><span class="n">to_list</span><span class="p">()).</span><span class="n">with_columns</span><span class="p">(</span>
        <span class="n">pl</span><span class="p">.</span><span class="n">Series</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="s">"factor"</span><span class="p">,</span> <span class="n">values</span><span class="o">=</span><span class="n">betas</span><span class="p">.</span><span class="n">select</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">exclude</span><span class="p">(</span><span class="s">"datetime"</span><span class="p">,</span> <span class="s">"ccy"</span><span class="p">)).</span><span class="n">columns</span><span class="p">),</span>
        <span class="n">datetime</span> <span class="o">=</span> <span class="n">betas</span><span class="p">[</span><span class="s">"datetime"</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span>
        <span class="p">).</span><span class="n">unpivot</span><span class="p">(</span><span class="n">index</span><span class="o">=</span><span class="p">[</span><span class="s">"datetime"</span><span class="p">,</span><span class="s">"factor"</span><span class="p">])</span>

    <span class="n">ccyWeights</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">res</span><span class="p">)</span>

<span class="n">ccyWeights</span> <span class="o">=</span> <span class="n">pl</span><span class="p">.</span><span class="n">concat</span><span class="p">(</span><span class="n">ccyWeights</span><span class="p">)</span>
</code></pre></div></div>
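<p>A quick way to sanity-check the weights is to confirm two properties on toy data: the factor portfolio returns match the regression coefficients, and each factor portfolio has unit exposure to its own factor and zero exposure to the others. The sketch below uses random numbers rather than the real \(\beta\)s, and the sizes are made up.</p>

```python
import numpy as np

rng = np.random.default_rng(42)

n_ccys, n_factors = 12, 3                  # made-up sizes, not the real universe
B = rng.normal(size=(n_ccys, n_factors))   # hypothetical exposures for one date
R = rng.normal(scale=0.01, size=n_ccys)    # hypothetical currency returns

# Same calculation as in the loop: W = (B^T B)^{-1} B^T
W = np.linalg.solve(B.T @ B, B.T)

# Each row of W is one factor portfolio; its return is the OLS coefficient
factor_returns = W @ R
ols_coefs, *_ = np.linalg.lstsq(B, R, rcond=None)
print(np.allclose(factor_returns, ols_coefs))

# Exposure of every factor portfolio to every factor: the identity matrix
exposures = W @ B
print(np.allclose(exposures, np.eye(n_factors)))
```

<p>The second check is what makes these “pure” factor portfolios: holding the weights in one row gives exposure to that factor alone.</p>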

<p>As we are using the \(\beta\) matrix, we get a time series of weights. The currencies’ underlying sensitivities to the different features change over time, meaning that they will undergo different weighting in the factor portfolios over time too.</p>

<p>After running that calculation we get to the currency weights.</p>

<p><img src="/assets/fxfundamental/momccyweights.png" alt="Currency pair weights for the 1-month momentum factor from 2015 to 2024, displayed as stacked area chart with individual line overlays. Seven currency pairs (AUD, CAD, EUR, GBP, JPY, NZD, NOK) are tracked with weights fluctuating between approximately -0.3 and 0.3. EUR maintains the most stable positioning closest to zero throughout the period. AUD and CAD show cyclical long and short transitions, with AUD predominantly long and CAD shifting from short to neutral over time. GBP, JPY, NZD, and NOK exhibit more volatile swings. Notable volatility spikes occur around 2020 and 2022, reflecting market stress periods. The overall pattern indicates momentum factor weights remain relatively balanced with no single currency dominating systematically, suggesting diversified portfolio construction across pairs." /></p>

<p>For the momentum factor, EUR hugs zero more than the other selected currencies.</p>

<p>If we look at the DXY factor and the currency weights for 2026 to have a more realistic view of how they are changing, we can see much more stability.</p>

<p><img src="/assets/fxfundamental/dxyccyweights.png" alt="Currency weights in the DXY factor." /></p>

<p>Small changes around EUR; CNH has hovered around zero; TWD has gone long since February; and AUD has picked up a short position. Given these are weekly weights, it’s good that there aren’t any wild swings, since big changes in positioning would lead to larger transaction costs.</p>

<h2 id="conclusion">Conclusion</h2>

<p>Done. We’ve built a fundamental FX factor model. It’s involved, with lots of different ways to fall over, but we made it. Four factors were significant: 1-month momentum, 6-month momentum, DXY, and the EM factor. The smaller size of the FX universe compared to equities means there is less data through time and across assets. Also, the underlying \(\beta\)s are noisy given the tighter return ranges compared to equities. There is also a case that regime changes average things out to zero, but it’s hard to see that in the data. As it stands, this model can help in hedging and explaining risk, but it shouldn’t be treated as a source of expected returns.</p>

<p>If you’ve looked at FX factor models before, you’ll realise I’ve missed a pretty significant factor — carry. It’s very hard to get free data to calculate the carry factor across the full universe of currencies. I’m saving it for another day for a smaller set of pairs where there is data.</p>

<p>So I hope this has been a good walkthrough and explainer on how to approach these factor models.</p>]]></content><author><name>Dean Markwick</name></author><category term="python" /><category term="quant" /><category term="factor-model" /><summary type="html"><![CDATA[I’ve been reading The Elements of Quantitative Investing to branch out from my usual high-frequency finance to something slower or mid-frequency. Factor models are a big part of this quant topic, and I’m trying to get a deeper understanding by following the book and applying the process to FX data.]]></summary></entry><entry><title type="html">Making Sense of the DXY</title><link href="https://dm13450.github.io/2026/03/10/Making-Sense-of-the-DXY.html" rel="alternate" type="text/html" title="Making Sense of the DXY" /><published>2026-03-10T00:00:00+00:00</published><updated>2026-03-10T00:00:00+00:00</updated><id>https://dm13450.github.io/2026/03/10/Making-Sense-of-the-DXY</id><content type="html" xml:base="https://dm13450.github.io/2026/03/10/Making-Sense-of-the-DXY.html"><![CDATA[<p>My day job is in quant <em>trading</em>, but there’s another fascinating world: quantitative <em>investing</em>. While I focus on latencies and execution, quant investors are busy building the most efficient portfolios and ensuring they extract pure alpha. Not one to stay in my lane, I’m using this blog post as an opportunity to dive into the world of quant investing and level up my knowledge.</p>


<p>Now most quant investing examples use equities as the underlying asset class, but I am an FX man, so I will be replacing Apple and Microsoft with euros and yen. In some ways, this is easier; I just have to worry about 30-odd currencies as my investible universe compared to the thousands, if not hundreds of thousands, of different stocks. But in many ways it’s harder. What drives FX returns sits at a much higher macro level compared to an individual stock, and things like central banks changing interest rates or governments changing policy are harder to translate into a dataset than the price-to-book ratio of a stock. Still, we will give it a go.</p>

<p>In short, we want to better understand what can influence a currency’s return and produce a systematic model. This post is going to start with the basics: pulling in the right data, building a proxy for the overall FX market and ending with some basic regressions.</p>

<h2 id="twelve-data">Twelve Data</h2>

<p>For any quant investing model, we need to start with data. I’m always on the hunt for new sources, and <a href="https://twelvedata.com">twelvedata</a> is the latest one to come across my radar. It has a generous free tier and, more importantly, has FX data across all the main pairs. Plus, it has a Python API that is dead simple to use. This makes it ideal for this string of posts.</p>

<p>Sign up and get your API key, and you can follow along.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">twelvedata</span> <span class="kn">import</span> <span class="n">TDClient</span>
 
<span class="n">td</span> <span class="o">=</span> <span class="n">TDClient</span><span class="p">(</span><span class="n">apikey</span><span class="o">=</span><span class="n">API_KEY</span><span class="p">)</span>

<span class="n">td</span><span class="p">.</span><span class="n">time_series</span><span class="p">(</span>
        <span class="n">symbol</span><span class="o">=</span><span class="s">"USD/JPY"</span><span class="p">,</span>
        <span class="n">interval</span><span class="o">=</span><span class="s">"1day"</span><span class="p">,</span>
        <span class="n">start_date</span><span class="o">=</span><span class="s">"2025-01-01"</span><span class="p">,</span>
        <span class="n">end_date</span><span class="o">=</span><span class="s">"2026-03-01"</span><span class="p">,</span>
        <span class="n">outputsize</span><span class="o">=</span><span class="mi">5000</span><span class="p">).</span><span class="n">as_json</span><span class="p">()</span>
</code></pre></div></div>

<p>This returns the daily time series of USDJPY from January 2025 to March 2026, formatted as JSON. It’s pretty simple to go from that to a dataframe, or however you want to deal with the data.</p>
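<p>As a rough illustration of that step, the sketch below turns a payload into typed records using only the standard library. The payload shape (one dict per day, every value a string) is an assumption about what comes back, the numbers are made up, and <code>to_records</code> is a hypothetical helper rather than part of the twelvedata client.</p>

```python
from datetime import date

# Hypothetical payload: one dict per day, all values as strings (made-up prices)
payload = [
    {"datetime": "2025-01-03", "open": "157.2", "high": "157.6", "low": "156.9", "close": "157.3"},
    {"datetime": "2025-01-02", "open": "157.0", "high": "157.8", "low": "156.4", "close": "157.2"},
]


def to_records(rows):
    """Parse the date, cast the four prices to floats, and sort oldest first."""
    parsed = [
        {
            "datetime": date.fromisoformat(row["datetime"]),
            **{key: float(row[key]) for key in ("open", "high", "low", "close")},
        }
        for row in rows
    ]
    return sorted(parsed, key=lambda record: record["datetime"])


records = to_records(payload)
print(records[0])
```

<p>Sorting explicitly means the result does not depend on the order in which the rows arrive.</p>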

<p>I don’t want to get blocked by the API limits, so I’m going to save the JSON objects locally.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">json</span>
<span class="kn">import</span> <span class="nn">os</span>
<span class="kn">import</span> <span class="nn">time</span>

<span class="k">def</span> <span class="nf">download_data</span><span class="p">(</span><span class="n">td</span><span class="p">,</span> <span class="n">ccy</span><span class="p">,</span> <span class="n">start_date</span><span class="p">,</span> <span class="n">end_date</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">td</span><span class="p">.</span><span class="n">time_series</span><span class="p">(</span>
        <span class="n">symbol</span><span class="o">=</span><span class="sa">f</span><span class="s">"USD/</span><span class="si">{</span><span class="n">ccy</span><span class="si">}</span><span class="s">"</span><span class="p">,</span>
        <span class="n">interval</span><span class="o">=</span><span class="s">"1day"</span><span class="p">,</span>
        <span class="n">start_date</span><span class="o">=</span><span class="n">start_date</span><span class="p">,</span>
        <span class="n">end_date</span><span class="o">=</span><span class="n">end_date</span><span class="p">,</span>
        <span class="n">outputsize</span><span class="o">=</span><span class="mi">5000</span>
    <span class="p">)</span>

<span class="k">def</span> <span class="nf">save_data</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">ccy</span><span class="p">):</span>
    <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="sa">f</span><span class="s">"data/</span><span class="si">{</span><span class="n">ccy</span><span class="si">}</span><span class="s">.json"</span><span class="p">,</span> <span class="s">"w"</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
        <span class="n">json</span><span class="p">.</span><span class="n">dump</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">as_json</span><span class="p">(),</span> <span class="n">f</span><span class="p">)</span>

<span class="k">def</span> <span class="nf">download_and_save_data</span><span class="p">(</span><span class="n">td</span><span class="p">,</span> <span class="n">ccy</span><span class="p">,</span> <span class="n">start_date</span><span class="p">,</span> <span class="n">end_date</span><span class="p">):</span>
    <span class="n">file_path</span> <span class="o">=</span> <span class="sa">f</span><span class="s">"data/</span><span class="si">{</span><span class="n">ccy</span><span class="si">}</span><span class="s">.json"</span>
    <span class="k">if</span> <span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="n">exists</span><span class="p">(</span><span class="n">file_path</span><span class="p">):</span>
        <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"File for </span><span class="si">{</span><span class="n">ccy</span><span class="si">}</span><span class="s"> already exists. Skipping download."</span><span class="p">)</span>
        <span class="k">return</span> <span class="bp">False</span>
    <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"Downloading data for </span><span class="si">{</span><span class="n">ccy</span><span class="si">}</span><span class="s">..."</span><span class="p">)</span>
    <span class="n">data</span> <span class="o">=</span> <span class="n">download_data</span><span class="p">(</span><span class="n">td</span><span class="p">,</span> <span class="n">ccy</span><span class="p">,</span> <span class="n">start_date</span><span class="p">,</span> <span class="n">end_date</span><span class="p">)</span>
    <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"Saving data for </span><span class="si">{</span><span class="n">ccy</span><span class="si">}</span><span class="s">..."</span><span class="p">)</span>
    <span class="n">save_data</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">ccy</span><span class="p">)</span>
    <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"Data for </span><span class="si">{</span><span class="n">ccy</span><span class="si">}</span><span class="s"> downloaded and saved successfully."</span><span class="p">)</span>
    <span class="k">print</span><span class="p">(</span><span class="s">"Sleeping for 8 seconds to avoid hitting API rate limits..."</span><span class="p">)</span>
    <span class="n">time</span><span class="p">.</span><span class="n">sleep</span><span class="p">(</span><span class="mi">8</span><span class="p">)</span>
    <span class="k">return</span> <span class="bp">True</span>
</code></pre></div></div>

<p>Then, to load the data for a particular currency, we have a separate function.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">polars</span> <span class="k">as</span> <span class="n">pl</span>

<span class="k">def</span> <span class="nf">load_data</span><span class="p">(</span><span class="n">ccy</span><span class="p">):</span>
    <span class="n">df</span> <span class="o">=</span> <span class="n">pl</span><span class="p">.</span><span class="n">read_json</span><span class="p">(</span><span class="sa">f</span><span class="s">'data/</span><span class="si">{</span><span class="n">ccy</span><span class="si">}</span><span class="s">.json'</span><span class="p">)</span>
    <span class="n">df</span> <span class="o">=</span> <span class="n">df</span><span class="p">.</span><span class="n">with_columns</span><span class="p">(</span>
        <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"datetime"</span><span class="p">).</span><span class="n">cast</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">Date</span><span class="p">),</span>
        <span class="n">ccy</span><span class="o">=</span><span class="n">pl</span><span class="p">.</span><span class="n">lit</span><span class="p">(</span><span class="n">ccy</span><span class="p">),</span>
        <span class="nb">open</span><span class="o">=</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"open"</span><span class="p">).</span><span class="n">cast</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">Float64</span><span class="p">),</span>
        <span class="n">high</span><span class="o">=</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"high"</span><span class="p">).</span><span class="n">cast</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">Float64</span><span class="p">),</span>
        <span class="n">low</span><span class="o">=</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"low"</span><span class="p">).</span><span class="n">cast</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">Float64</span><span class="p">),</span>
        <span class="n">close</span><span class="o">=</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"close"</span><span class="p">).</span><span class="n">cast</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">Float64</span><span class="p">))</span>
    <span class="k">return</span> <span class="n">df</span>
</code></pre></div></div>

<p>To make sure everything is working nicely, let’s load and plot JPY.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">plotly.graph_objects</span> <span class="k">as</span> <span class="n">go</span>

<span class="n">df</span> <span class="o">=</span> <span class="n">load_data</span><span class="p">(</span><span class="s">"JPY"</span><span class="p">)</span>

<span class="n">fig</span> <span class="o">=</span> <span class="n">go</span><span class="p">.</span><span class="n">Figure</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">go</span><span class="p">.</span><span class="n">Ohlc</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="n">df</span><span class="p">[</span><span class="s">'datetime'</span><span class="p">],</span>
                    <span class="nb">open</span><span class="o">=</span><span class="n">df</span><span class="p">[</span><span class="s">'open'</span><span class="p">],</span>
                    <span class="n">high</span><span class="o">=</span><span class="n">df</span><span class="p">[</span><span class="s">'high'</span><span class="p">],</span>
                    <span class="n">low</span><span class="o">=</span><span class="n">df</span><span class="p">[</span><span class="s">'low'</span><span class="p">],</span>
                    <span class="n">close</span><span class="o">=</span><span class="n">df</span><span class="p">[</span><span class="s">'close'</span><span class="p">]))</span>

<span class="n">fig</span><span class="p">.</span><span class="n">show</span><span class="p">()</span>
</code></pre></div></div>

<p><img src="/assets/dxy/jpy.png" alt="Line chart depicting the price of USDJPY" /></p>

<p>All looks good, so now we can download whatever pair our heart desires. Which leads us to the next part.</p>

<h2 id="what-is-the-dxy">What is the DXY?</h2>

<p>In my mind, the DXY is the FX equivalent of the S&amp;P500. It gives a general indication of how the dollar’s value is changing by using the exchange rate of EUR, JPY, CHF, GBP, CAD and SEK vs the dollar. It’s calculated as a geometric weighted average of these six currencies, and given the dollar’s dominance in the FX market, it works as a reasonable proxy of how the overall FX market is moving.</p>

<p>If we cast our mind back to the <a href="https://en.wikipedia.org/wiki/Capital_asset_pricing_model">Capital Asset Pricing Model</a>, an asset’s return can be broken down into its active return, \(\alpha\), and its exposure to the market return, \(r_m\). The strength of this sensitivity is \(\beta\).</p>

\[r_i = \alpha_i + \beta_i r_m\]

<p>In equities, \(r_i\) is a single stock and \(r_m\) is some measure of the overall market return (S&amp;P500, FTSE100, etc.). In FX, \(r_i\) is an individual currency and \(r_m\) is the DXY. This gives us an easy quantitative model to judge how a currency’s return is driven by the overall movement in the dollar.</p>
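<p>To show what fitting this model involves, here is a minimal sketch of the least-squares estimates, \(\beta = \mathrm{cov}(r_i, r_m) / \mathrm{var}(r_m)\) and \(\alpha = \bar{r}_i - \beta \bar{r}_m\), in plain Python. The function name <code>capm_fit</code> and the return series are made up for illustration.</p>

```python
def capm_fit(asset_returns, market_returns):
    """Least-squares estimates of alpha and beta in r_i = alpha + beta * r_m."""
    n = len(market_returns)
    mean_i = sum(asset_returns) / n
    mean_m = sum(market_returns) / n
    cov_im = sum(
        (ri - mean_i) * (rm - mean_m)
        for ri, rm in zip(asset_returns, market_returns)
    )
    var_m = sum((rm - mean_m) ** 2 for rm in market_returns)
    beta = cov_im / var_m
    alpha = mean_i - beta * mean_m
    return alpha, beta


# Made-up daily returns: the "currency" moves half as much as the "DXY"
# plus a small constant drift, so the fit should recover those two numbers.
dxy_returns = [0.004, -0.002, 0.001, -0.003, 0.002]
ccy_returns = [0.0001 + 0.5 * r for r in dxy_returns]

alpha, beta = capm_fit(ccy_returns, dxy_returns)
print(alpha, beta)
```

<p>With real data there is noise around the line, so the estimates come with standard errors that a regression library would report.</p>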

<p>Now you can either read the DXY from a market data source (expensive) or you can calculate it yourself.</p>

<h2 id="calculating-the-dxy">Calculating the DXY</h2>

<p>The formula for the DXY is in a pdf here: <a href="https://www.ice.com/publicdocs/futures_us/ICE_Dollar_Index_FAQ.pdf">U.S. Dollar Index Contracts</a>. It’s a simple weighted geometric average, so we just need the individual currency prices, and we can implement the calculation.</p>
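<p>For reference, with every pair quoted as USD against the foreign currency (the convention used for the data in this post), the index can be written as</p>

\[\text{DXY} = 50.14348112 \times \prod_{i} P_i^{w_i},\]

<p>where \(P_i\) is the price of USD in currency \(i\) and \(w_i\) is that currency’s weight. The six weights sum to one, which is what makes this a weighted geometric average, and the constant scales the index level.</p>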

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">dfs</span> <span class="o">=</span> <span class="p">[</span><span class="n">load_data</span><span class="p">(</span><span class="n">ccy</span><span class="p">)</span> <span class="k">for</span> <span class="n">ccy</span> <span class="ow">in</span> <span class="p">[</span><span class="s">"EUR"</span><span class="p">,</span> <span class="s">"JPY"</span><span class="p">,</span> <span class="s">"GBP"</span><span class="p">,</span> <span class="s">"CAD"</span><span class="p">,</span> <span class="s">"SEK"</span><span class="p">,</span> <span class="s">"CHF"</span><span class="p">]]</span>
<span class="n">combined_df</span> <span class="o">=</span> <span class="n">pl</span><span class="p">.</span><span class="n">concat</span><span class="p">(</span><span class="n">dfs</span><span class="p">)</span>
<span class="n">combined_df</span> <span class="o">=</span> <span class="n">combined_df</span><span class="p">.</span><span class="n">sort</span><span class="p">(</span><span class="s">"datetime"</span><span class="p">)</span>
</code></pre></div></div>

<p>The more eagle-eyed readers might have noticed that I’m saving down some of the pairs the ‘wrong’ way round: USDEUR instead of EURUSD, USDGBP instead of GBPUSD, etc. This is because the DXY measures everything in USD-base terms. The official formula applies negative exponents to the pairs that are conventionally quoted the other way round (EURUSD and GBPUSD), so by flipping those pairs up front, every weight becomes positive.</p>
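<p>The equivalence is easy to check numerically. The sketch below uses made-up prices, the ICE exponents, and compares the two quoting conventions before assembling a full index value.</p>

```python
import math

# Made-up prices for illustration only
eurusd, gbpusd = 1.08, 1.27
usdjpy, usdcad, usdsek, usdchf = 150.0, 1.36, 10.5, 0.88

# ICE convention: EURUSD and GBPUSD carry negative exponents
ice_style = eurusd ** -0.576 * gbpusd ** -0.119

# USD-base convention: invert the quote and the exponent turns positive
usdeur, usdgbp = 1 / eurusd, 1 / gbpusd
usd_base = usdeur ** 0.576 * usdgbp ** 0.119

print(math.isclose(ice_style, usd_base))

# The remaining four pairs are already USD-base, so they enter unchanged
rest = usdjpy ** 0.136 * usdcad ** 0.091 * usdsek ** 0.042 * usdchf ** 0.036
dxy = 50.14348112 * usd_base * rest
print(dxy)
```

<p>This works because \((1/x)^{w} = x^{-w}\), so inverting the price and negating the weight are the same operation.</p>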

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">dxyWeightings</span> <span class="o">=</span> <span class="p">{</span>
    <span class="s">"EUR"</span><span class="p">:</span> <span class="mf">0.576</span><span class="p">,</span>
    <span class="s">"JPY"</span><span class="p">:</span> <span class="mf">0.136</span><span class="p">,</span>
    <span class="s">"GBP"</span><span class="p">:</span> <span class="mf">0.119</span><span class="p">,</span>
    <span class="s">"CAD"</span><span class="p">:</span> <span class="mf">0.091</span><span class="p">,</span>
    <span class="s">"SEK"</span><span class="p">:</span> <span class="mf">0.042</span><span class="p">,</span>
    <span class="s">"CHF"</span><span class="p">:</span> <span class="mf">0.036</span><span class="p">,</span>
    <span class="s">"const"</span><span class="p">:</span> <span class="mf">50.14348112</span><span class="p">}</span>

<span class="n">weights_df</span> <span class="o">=</span> <span class="n">pl</span><span class="p">.</span><span class="n">DataFrame</span><span class="p">(</span><span class="nb">list</span><span class="p">(</span><span class="n">dxyWeightings</span><span class="p">.</span><span class="n">items</span><span class="p">()),</span> <span class="n">schema</span><span class="o">=</span><span class="p">[</span><span class="s">"ccy"</span><span class="p">,</span><span class="s">"weight"</span><span class="p">])</span>
<span class="n">combined_df</span> <span class="o">=</span> <span class="n">combined_df</span><span class="p">.</span><span class="n">join</span><span class="p">(</span><span class="n">weights_df</span><span class="p">,</span> <span class="n">on</span><span class="o">=</span><span class="s">"ccy"</span><span class="p">,</span> <span class="n">how</span><span class="o">=</span><span class="s">"left"</span><span class="p">)</span>
</code></pre></div></div>

<p>So now we have a dataframe of the relevant prices joined by the weightings.</p>

<p>Step 1: raise each of the four prices (open, high, low, close) to the power of the currency’s weight.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">combined_df</span> <span class="o">=</span> <span class="n">combined_df</span><span class="p">.</span><span class="n">with_columns</span><span class="p">(</span>
    <span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"open"</span><span class="p">)</span> <span class="o">**</span> <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"weight"</span><span class="p">)).</span><span class="n">alias</span><span class="p">(</span><span class="s">"open_weighted"</span><span class="p">),</span>
    <span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"high"</span><span class="p">)</span> <span class="o">**</span> <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"weight"</span><span class="p">)).</span><span class="n">alias</span><span class="p">(</span><span class="s">"high_weighted"</span><span class="p">),</span>
    <span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"low"</span><span class="p">)</span> <span class="o">**</span> <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"weight"</span><span class="p">)).</span><span class="n">alias</span><span class="p">(</span><span class="s">"low_weighted"</span><span class="p">),</span>
    <span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"close"</span><span class="p">)</span> <span class="o">**</span> <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"weight"</span><span class="p">)).</span><span class="n">alias</span><span class="p">(</span><span class="s">"close_weighted"</span><span class="p">)</span>
<span class="p">)</span>
</code></pre></div></div>

<p>Step 2: For each day, take the product and multiply it by the constant.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">dxy</span> <span class="o">=</span> <span class="n">combined_df</span><span class="p">.</span><span class="n">group_by</span><span class="p">(</span><span class="s">"datetime"</span><span class="p">).</span><span class="n">agg</span><span class="p">(</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"open_weighted"</span><span class="p">).</span><span class="n">product</span><span class="p">().</span><span class="n">alias</span><span class="p">(</span><span class="s">"dxy_open"</span><span class="p">),</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"high_weighted"</span><span class="p">).</span><span class="n">product</span><span class="p">().</span><span class="n">alias</span><span class="p">(</span><span class="s">"dxy_high"</span><span class="p">),</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"low_weighted"</span><span class="p">).</span><span class="n">product</span><span class="p">().</span><span class="n">alias</span><span class="p">(</span><span class="s">"dxy_low"</span><span class="p">),</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"close_weighted"</span><span class="p">).</span><span class="n">product</span><span class="p">().</span><span class="n">alias</span><span class="p">(</span><span class="s">"dxy_close"</span><span class="p">)</span>
<span class="p">).</span><span class="n">with_columns</span><span class="p">(</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">'dxy_open'</span><span class="p">)</span><span class="o">*</span><span class="n">dxyWeightings</span><span class="p">[</span><span class="s">"const"</span><span class="p">],</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">'dxy_high'</span><span class="p">)</span><span class="o">*</span><span class="n">dxyWeightings</span><span class="p">[</span><span class="s">"const"</span><span class="p">],</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">'dxy_low'</span><span class="p">)</span><span class="o">*</span><span class="n">dxyWeightings</span><span class="p">[</span><span class="s">"const"</span><span class="p">],</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">'dxy_close'</span><span class="p">)</span><span class="o">*</span><span class="n">dxyWeightings</span><span class="p">[</span><span class="s">"const"</span><span class="p">])</span>
</code></pre></div></div>
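
<p>To make the two steps concrete for a single day, here is a minimal plain-Python sketch of the same weighted geometric product. The constant, the weights and the quote conventions below are my assumption of the standard published ICE values, not necessarily what is stored in <code class="language-plaintext highlighter-rouge">dxyWeightings</code>, and the prices are made up.</p>

```python
import math

# Assumed ICE-style constant, weights and quote conventions
# (not taken from the post's dxyWeightings).
ASSUMED_CONST = 50.14348112
ASSUMED_WEIGHTS = {
    "EURUSD": -0.576,
    "USDJPY": 0.136,
    "GBPUSD": -0.119,
    "USDCAD": 0.091,
    "USDSEK": 0.042,
    "USDCHF": 0.036,
}

# Made-up closing prices for a single day, purely for illustration.
toy_close = {
    "EURUSD": 1.10,
    "USDJPY": 150.0,
    "GBPUSD": 1.27,
    "USDCAD": 1.36,
    "USDSEK": 10.5,
    "USDCHF": 0.88,
}


def dxy_from_prices(prices, weights, const):
    """Weighted geometric product: const * prod(price ** weight)."""
    product = 1.0
    for pair, weight in weights.items():
        product *= prices[pair] ** weight
    return const * product


def dxy_from_logs(prices, weights, const):
    """Same quantity written as const * exp(sum(weight * log(price)))."""
    return const * math.exp(sum(w * math.log(prices[p]) for p, w in weights.items()))


toy_dxy = dxy_from_prices(toy_close, ASSUMED_WEIGHTS, ASSUMED_CONST)
```

<p>The log form is worth keeping in mind because it shows that a DXY log return is just a weighted sum of the component log returns.</p>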

<p><img src="/assets/dxy/dxy.png" alt="Line chart depicting daily DXY values, with time on the x-axis and the DXY value on the y-axis." /></p>

<p>If you compare it to the Yahoo Finance DXY plot, it looks pretty similar, so I’m pretty confident this is all correct.</p>

<h2 id="individual-currency-betas">Individual Currency \(\beta\)’s</h2>

<p>Now we can go on to measuring each currency’s \(\beta\) value. This is a simple linear regression of the log returns of an individual currency against the log returns of the DXY.</p>

<p>We need to load in more currency pairs.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">dfs</span> <span class="o">=</span> <span class="p">[</span><span class="n">load_data</span><span class="p">(</span><span class="n">ccy</span><span class="p">)</span> <span class="k">for</span> <span class="n">ccy</span> <span class="ow">in</span> <span class="n">all_pairs</span><span class="p">]</span>
<span class="n">combined_df</span> <span class="o">=</span> <span class="n">pl</span><span class="p">.</span><span class="n">concat</span><span class="p">(</span><span class="n">dfs</span><span class="p">)</span>
<span class="n">combined_df</span> <span class="o">=</span> <span class="n">combined_df</span><span class="p">.</span><span class="n">sort</span><span class="p">(</span><span class="s">"datetime"</span><span class="p">)</span>
</code></pre></div></div>

<p>For the regression, we need the individual currency returns and also the DXY returns. We calculate simple log returns for both and then join the DXY frame onto the individual currencies.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">combined_df</span> <span class="o">=</span> <span class="n">combined_df</span><span class="p">.</span><span class="n">with_columns</span><span class="p">(</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"close"</span><span class="p">).</span><span class="n">log</span><span class="p">().</span><span class="n">diff</span><span class="p">().</span><span class="n">over</span><span class="p">(</span><span class="s">"ccy"</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">"log_return"</span><span class="p">)</span>
<span class="p">)</span>

<span class="n">dxy</span> <span class="o">=</span> <span class="n">dxy</span><span class="p">.</span><span class="n">with_columns</span><span class="p">(</span>
    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"dxy_close"</span><span class="p">).</span><span class="n">log</span><span class="p">().</span><span class="n">diff</span><span class="p">().</span><span class="n">alias</span><span class="p">(</span><span class="s">"dxy_log_return"</span><span class="p">)</span>
<span class="p">)</span>

<span class="n">combined_df</span> <span class="o">=</span> <span class="n">combined_df</span><span class="p">.</span><span class="n">join</span><span class="p">(</span><span class="n">dxy</span><span class="p">,</span> <span class="n">on</span><span class="o">=</span><span class="s">"datetime"</span><span class="p">,</span> <span class="n">how</span><span class="o">=</span><span class="s">"left"</span><span class="p">)</span>
</code></pre></div></div>
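
<p>As a sanity check on what <code class="language-plaintext highlighter-rouge">log().diff()</code> produces, here is a small plain-Python sketch of the same calculation for one currency. The helper name and the prices are hypothetical.</p>

```python
import math


def log_returns(prices):
    """Return log(p_t) - log(p_{t-1}); the first entry has no previous price, so it is None."""
    out = [None]
    for prev, curr in zip(prices, prices[1:]):
        out.append(math.log(curr) - math.log(prev))
    return out


# Made-up closes for a single currency, in date order.
toy_prices = [100.0, 101.0, 99.0, 99.0]
toy_returns = log_returns(toy_prices)
```

<p>The <code class="language-plaintext highlighter-rouge">over("ccy")</code> in the Polars version does this separately for each currency, so the first return of one currency is never computed against the last price of another.</p>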

<p>We will do a rolling regression using a 252-day look back, which is roughly the number of trading days in a year.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">statsmodels.regression.rolling</span> <span class="kn">import</span> <span class="n">RollingOLS</span>

<span class="n">allParams</span> <span class="o">=</span> <span class="p">[]</span>

<span class="k">for</span> <span class="n">ccy</span> <span class="ow">in</span> <span class="p">[</span><span class="s">"EUR"</span><span class="p">,</span> <span class="s">"SEK"</span><span class="p">,</span> <span class="s">"CNH"</span><span class="p">,</span> <span class="s">"TWD"</span><span class="p">,</span> <span class="s">"TRY"</span><span class="p">]:</span>

    <span class="n">subDF</span> <span class="o">=</span> <span class="n">combined_df</span><span class="p">.</span><span class="nb">filter</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"ccy"</span><span class="p">)</span> <span class="o">==</span> <span class="n">ccy</span><span class="p">)</span>
    <span class="n">mod</span> <span class="o">=</span> <span class="n">RollingOLS</span><span class="p">.</span><span class="n">from_formula</span><span class="p">(</span><span class="s">"log_return ~ dxy_log_return"</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">subDF</span><span class="p">.</span><span class="n">to_pandas</span><span class="p">(),</span> <span class="n">window</span><span class="o">=</span><span class="mi">252</span><span class="p">)</span>
    <span class="n">rres</span> <span class="o">=</span> <span class="n">mod</span><span class="p">.</span><span class="n">fit</span><span class="p">()</span>

    <span class="n">paramDF</span> <span class="o">=</span> <span class="n">pl</span><span class="p">.</span><span class="n">from_pandas</span><span class="p">(</span><span class="n">rres</span><span class="p">.</span><span class="n">params</span><span class="p">)</span>
    <span class="n">paramDF</span> <span class="o">=</span> <span class="n">paramDF</span><span class="p">.</span><span class="n">with_columns</span><span class="p">(</span><span class="n">ccy</span><span class="o">=</span><span class="n">pl</span><span class="p">.</span><span class="n">lit</span><span class="p">(</span><span class="n">ccy</span><span class="p">),</span> <span class="n">Date</span> <span class="o">=</span> <span class="n">subDF</span><span class="p">[</span><span class="s">"datetime"</span><span class="p">])</span>
    <span class="n">allParams</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">paramDF</span><span class="p">)</span>

<span class="n">allParams</span> <span class="o">=</span> <span class="n">pl</span><span class="p">.</span><span class="n">concat</span><span class="p">(</span><span class="n">allParams</span><span class="p">)</span>
</code></pre></div></div>
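
<p>If the rolling regression feels like a black box, the slope it estimates in each window is just the covariance of the two return series divided by the variance of the DXY returns. Below is a rough plain-Python sketch of that idea with hypothetical helper names and made-up returns. It is not how <code class="language-plaintext highlighter-rouge">RollingOLS</code> is implemented, and it ignores missing values.</p>

```python
def ols_beta(x, y):
    """Slope of y regressed on x with an intercept: cov(x, y) / var(x)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x


def rolling_beta(x, y, window):
    """Beta per observation using the trailing `window` points; None until the window is full."""
    betas = []
    for end in range(1, len(x) + 1):
        if end >= window:
            betas.append(ols_beta(x[end - window:end], y[end - window:end]))
        else:
            betas.append(None)
    return betas


# Made-up returns: the "currency" moves exactly twice as much as the "index".
toy_dxy_returns = [0.010, -0.020, 0.005, 0.015, -0.010, 0.020]
toy_ccy_returns = [2 * r for r in toy_dxy_returns]
toy_betas = rolling_beta(toy_dxy_returns, toy_ccy_returns, window=4)
```

<p>With a 252-day window, the first 251 rows have no estimate, which is why the plotted \(\beta\) lines start roughly a year after the data does.</p>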

<p>To examine the results, we plot the \(\beta_i\) value over time for some different currencies.</p>

<p><img src="/assets/dxy/betas.png" alt="Line chart depicting beta values of various currencies over time." /></p>

<p>EUR (green) is close to 1, which aligns with intuition as it has the largest weight in the DXY calculation. TRY has the lowest \(\beta\) out of these pairs, which suggests its returns are not driven by overall dollar returns. Again, this makes sense, given that TRY’s movements mostly reflect Turkey’s own macroeconomics. SEK has a consistent \(\beta &gt; 1\), which suggests it’s very susceptible to general dollar moves. It’s not pictured, but HKD comes out with the lowest \(\beta\) overall, which is reassuring as it is pegged to the dollar.</p>

<p>Overall, do these \(\beta\)’s tell us much? Not really, but it is interesting to measure, and this is the foundation needed before we start looking at other factors that might influence the daily currency movements. These can be things like momentum, oil/gold sensitivity, etc.</p>

<h2 id="conclusion">Conclusion</h2>

<p>From this, we have built up a new dataset of daily currency prices and now have daily DXY values too. This gives us the underpinnings of an FX factor model, and next time we can start looking at other components that could explain currency movements.</p>

<p>Loosely related is my post on <a href="https://dm13450.github.io/2024/04/25/Currency-Hedging-and-Principal-Component-Analysis.html">Currency Hedging and Principal Component Analysis</a> and <a href="https://dm13450.github.io/2022/06/09/ETF-Correlations.html">Dipping My Toes into ETF Correlations</a>.</p>]]></content><author><name>Dean Markwick</name></author><category term="python" /><category term="quant" /><category term="fx" /><summary type="html"><![CDATA[My day job is in quant trading, but there’s another fascinating world: quantitative investing. While I focus on latencies and execution, quant investors are busy building the most efficient portfolios and ensuring they extract pure alpha. Not one to stay in my lane, I’m using this blog post as an opportunity to dive into the world of quant investing and level up my knowledge.]]></summary></entry><entry><title type="html">Premier League Survival – How Many Points Are Enough?</title><link href="https://dm13450.github.io/2025/10/31/Premier-League-Survival-How-Many-Points-Are-Enough.html" rel="alternate" type="text/html" title="Premier League Survival – How Many Points Are Enough?" /><published>2025-10-31T00:00:00+00:00</published><updated>2025-10-31T00:00:00+00:00</updated><id>https://dm13450.github.io/2025/10/31/Premier-League-Survival%E2%80%93How-Many-Points-Are-Enough</id><content type="html" xml:base="https://dm13450.github.io/2025/10/31/Premier-League-Survival-How-Many-Points-Are-Enough.html"><![CDATA[<p>It’s been an interesting start to the Premier League. All of the promoted teams (Sunderland, Leeds and Burnley) are outside the relegation zone, with Wolves and West Ham struggling at the bottom. So I want to look back at the other seasons and work out the average number of points throughout the season that characterises relegation teams, and how many points do you need to avoid relegation?</p>


<p>This is also a post where I dive into Python. I’ve been meaning to learn both <a href="https://pola.rs/">Polars</a> and <a href="https://plotly.com/">Plotly</a>, and given the relative simplicity of this post, it feels like the opportune time. It has also been a while since I’ve written about football, and given my reduced output recently, churning something out feels like a quick win.</p>

<h2 id="downloading-the-data">Downloading the Data</h2>

<p>The gold standard for free and easy football data is <a href="https://www.football-data.co.uk/">football-data</a>, which has a CSV for every season going back many years. This makes it easy to download each season directly and merge them together.</p>

<p>Reading a CSV with Polars is no different to Pandas, but adding a new column is slightly different: you use the <code class="language-plaintext highlighter-rouge">with_columns</code> function and give the new column an <code class="language-plaintext highlighter-rouge">alias</code>.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">s</span> <span class="o">=</span> <span class="nb">range</span><span class="p">(</span><span class="mi">2009</span><span class="p">,</span> <span class="mi">2027</span><span class="p">)</span>
<span class="n">seasons</span> <span class="o">=</span> <span class="p">[</span><span class="nb">str</span><span class="p">((</span><span class="n">x</span><span class="o">-</span><span class="mi">1</span><span class="p">))[</span><span class="mi">2</span><span class="p">:</span><span class="mi">4</span><span class="p">]</span> <span class="o">+</span> <span class="nb">str</span><span class="p">((</span><span class="n">x</span><span class="p">))[</span><span class="mi">2</span><span class="p">:</span><span class="mi">4</span><span class="p">]</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">s</span><span class="p">]</span>

<span class="n">rawDataList</span> <span class="o">=</span> <span class="p">[]</span>

<span class="k">for</span> <span class="n">season</span> <span class="ow">in</span> <span class="n">seasons</span><span class="p">:</span>
    <span class="n">url</span> <span class="o">=</span> <span class="sa">f</span><span class="s">"https://www.football-data.co.uk/mmz4281/</span><span class="si">{</span><span class="n">season</span><span class="si">}</span><span class="s">/E0.csv"</span>
    <span class="n">rawData</span> <span class="o">=</span> <span class="n">pl</span><span class="p">.</span><span class="n">read_csv</span><span class="p">(</span><span class="n">url</span><span class="p">,</span> <span class="n">truncate_ragged_lines</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
    <span class="n">rawData</span> <span class="o">=</span> <span class="n">rawData</span><span class="p">.</span><span class="n">with_columns</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">lit</span><span class="p">(</span><span class="n">season</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">"Season"</span><span class="p">))</span>
    <span class="n">rawDataList</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">rawData</span><span class="p">)</span>

<span class="n">rawData</span> <span class="o">=</span> <span class="n">pl</span><span class="p">.</span><span class="n">concat</span><span class="p">(</span><span class="n">rawDataList</span><span class="p">,</span> <span class="n">how</span> <span class="o">=</span> <span class="s">"diagonal"</span><span class="p">)</span>
</code></pre></div></div>

<p>We diagonally concatenate the dataframes because not every season has the same columns, and this will null-fill any missing columns.</p>
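
<p>To picture what a diagonal concatenation does, here is a plain-Python imitation using lists of dictionaries. The function name and the rows are made up, and this only mimics the behaviour rather than showing how Polars does it internally.</p>

```python
def diagonal_concat(tables):
    """Stack lists of row-dicts, using the union of all keys and None for missing values."""
    columns = []
    for table in tables:
        for row in table:
            for key in row:
                if key not in columns:
                    columns.append(key)
    return [{col: row.get(col) for col in columns} for table in tables for row in table]


# Two made-up seasons; only the second one carries a shots column.
season_a = [{"HomeTeam": "Leeds", "FTR": "H"}]
season_b = [{"HomeTeam": "Burnley", "FTR": "D", "HS": 12}]
stacked = diagonal_concat([season_a, season_b])
```

<p>The row from the first season ends up with <code class="language-plaintext highlighter-rouge">HS</code> set to <code class="language-plaintext highlighter-rouge">None</code>, which is the null-filling described above.</p>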

<p>We then add a column of row indices and add the points scored by the home and away team based on the outcome of the match.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">rawData</span> <span class="o">=</span> <span class="n">rawData</span><span class="p">.</span><span class="n">with_row_index</span><span class="p">(</span><span class="s">"MatchID"</span><span class="p">)</span>
<span class="n">rawData</span> <span class="o">=</span> <span class="n">rawData</span><span class="p">.</span><span class="n">with_columns</span><span class="p">((</span><span class="n">pl</span><span class="p">.</span><span class="n">when</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"FTR"</span><span class="p">)</span> <span class="o">==</span> <span class="s">"H"</span><span class="p">).</span><span class="n">then</span><span class="p">(</span><span class="mi">3</span><span class="p">).</span><span class="n">when</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"FTR"</span><span class="p">)</span> <span class="o">==</span> <span class="s">"A"</span><span class="p">)).</span><span class="n">then</span><span class="p">(</span><span class="mi">0</span><span class="p">).</span><span class="n">otherwise</span><span class="p">(</span><span class="mi">1</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">'PTH'</span><span class="p">))</span>
<span class="n">rawData</span> <span class="o">=</span> <span class="n">rawData</span><span class="p">.</span><span class="n">with_columns</span><span class="p">((</span><span class="n">pl</span><span class="p">.</span><span class="n">when</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"FTR"</span><span class="p">)</span> <span class="o">==</span> <span class="s">"A"</span><span class="p">).</span><span class="n">then</span><span class="p">(</span><span class="mi">3</span><span class="p">).</span><span class="n">when</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"FTR"</span><span class="p">)</span> <span class="o">==</span> <span class="s">"H"</span><span class="p">)).</span><span class="n">then</span><span class="p">(</span><span class="mi">0</span><span class="p">).</span><span class="n">otherwise</span><span class="p">(</span><span class="mi">1</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">'PTA'</span><span class="p">))</span>
</code></pre></div></div>
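
<p>The chained <code class="language-plaintext highlighter-rouge">when</code>/<code class="language-plaintext highlighter-rouge">then</code>/<code class="language-plaintext highlighter-rouge">otherwise</code> expressions boil down to a simple lookup on the full-time result. A plain-Python sketch of the same rule, with a hypothetical function name:</p>

```python
def match_points(ftr):
    """Map a full-time result code to (home_points, away_points)."""
    if ftr == "H":
        return 3, 0
    if ftr == "A":
        return 0, 3
    # Anything else is treated as a draw, mirroring .otherwise(1) in the Polars version.
    return 1, 1


toy_results = ["H", "D", "A"]
toy_points = [match_points(r) for r in toy_results]
```

<p>One thing to be aware of is that the fall-through also catches missing results, so a row with an empty <code class="language-plaintext highlighter-rouge">FTR</code> would hand both teams a point.</p>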

<h2 id="formatting-the-data">Formatting the Data</h2>

<p>Currently, the data is in a ‘per match’ format with a home and away team. We need to rearrange this so that each team gets its own row per match, so if we filter for a specific team, we get all their matches rather than having to filter both the home and away columns.</p>

<p>The current columns refer to stats in terms of home (<code class="language-plaintext highlighter-rouge">H</code>) and away (<code class="language-plaintext highlighter-rouge">A</code>). We will replace those names with <code class="language-plaintext highlighter-rouge">1</code> and <code class="language-plaintext highlighter-rouge">2</code>.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">matchDetailsCols</span> <span class="o">=</span> <span class="p">[</span><span class="s">"MatchID"</span><span class="p">,</span> <span class="s">"Season"</span><span class="p">,</span> <span class="s">"Div"</span><span class="p">,</span> <span class="s">"Date"</span><span class="p">,</span> <span class="s">"HomeTeam"</span><span class="p">,</span> <span class="s">"AwayTeam"</span><span class="p">]</span>
<span class="n">matchDetailsMap</span> <span class="o">=</span> <span class="nb">dict</span><span class="p">(</span><span class="nb">zip</span><span class="p">(</span><span class="n">matchDetailsCols</span><span class="p">,</span> <span class="p">[</span><span class="s">"MatchID"</span><span class="p">,</span> <span class="s">"Season"</span><span class="p">,</span> <span class="s">"Div"</span><span class="p">,</span> <span class="s">"Date"</span><span class="p">,</span> <span class="s">"Team1"</span><span class="p">,</span> <span class="s">"Team2"</span><span class="p">]))</span>

<span class="n">matchStatsCols</span> <span class="o">=</span> <span class="p">[</span><span class="s">"FTHG"</span><span class="p">,</span> <span class="s">"FTAG"</span><span class="p">,</span> <span class="s">"HS"</span><span class="p">,</span> <span class="s">"AS"</span><span class="p">,</span> <span class="s">"HST"</span><span class="p">,</span> <span class="s">"AST"</span><span class="p">,</span> <span class="s">"PSCD"</span><span class="p">,</span> <span class="s">"PSCH"</span><span class="p">,</span> <span class="s">"PSCA"</span><span class="p">,</span> <span class="s">"PTH"</span><span class="p">,</span> <span class="s">"PTA"</span><span class="p">]</span>
<span class="n">matchStatsMap</span> <span class="o">=</span> <span class="nb">dict</span><span class="p">(</span><span class="nb">zip</span><span class="p">(</span><span class="n">matchStatsCols</span><span class="p">,</span> <span class="p">[</span><span class="n">x</span><span class="p">.</span><span class="n">replace</span><span class="p">(</span><span class="s">"H"</span><span class="p">,</span> <span class="s">"1"</span><span class="p">).</span><span class="n">replace</span><span class="p">(</span><span class="s">"A"</span><span class="p">,</span> <span class="s">"2"</span><span class="p">)</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">matchStatsCols</span><span class="p">]))</span>

<span class="n">allCols</span> <span class="o">=</span> <span class="n">matchDetailsCols</span> <span class="o">+</span> <span class="n">matchStatsCols</span>
<span class="n">colsMap</span> <span class="o">=</span> <span class="n">matchDetailsMap</span> <span class="o">|</span> <span class="n">matchStatsMap</span>
<span class="n">matchData</span> <span class="o">=</span> <span class="n">rawData</span><span class="p">[</span><span class="n">allCols</span><span class="p">]</span>
</code></pre></div></div>

<p>So we create a frame with all the matches relabelled as <code class="language-plaintext highlighter-rouge">Team1</code> and add a dummy indicator for a <code class="language-plaintext highlighter-rouge">Home</code> match.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">team1Data</span> <span class="o">=</span> <span class="n">matchData</span><span class="p">.</span><span class="n">rename</span><span class="p">(</span><span class="n">colsMap</span><span class="p">)</span>
<span class="n">team1Data</span> <span class="o">=</span> <span class="n">team1Data</span><span class="p">.</span><span class="n">with_columns</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">lit</span><span class="p">(</span><span class="mi">1</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">"Home"</span><span class="p">))</span>
</code></pre></div></div>

<p>Likewise for <code class="language-plaintext highlighter-rouge">Team2</code>.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">team2Data</span> <span class="o">=</span> <span class="n">matchData</span><span class="p">.</span><span class="n">rename</span><span class="p">(</span><span class="n">colsMap</span><span class="p">)</span>
<span class="n">team2Map</span> <span class="o">=</span> <span class="nb">dict</span><span class="p">(</span><span class="nb">zip</span><span class="p">(</span><span class="n">team2Data</span><span class="p">.</span><span class="n">columns</span><span class="p">,</span> <span class="p">[</span><span class="n">x</span><span class="p">.</span><span class="n">replace</span><span class="p">(</span><span class="s">"1"</span><span class="p">,</span> <span class="s">"2"</span><span class="p">)</span> <span class="k">if</span> <span class="s">"1"</span> <span class="ow">in</span> <span class="n">x</span> <span class="k">else</span> <span class="n">x</span><span class="p">.</span><span class="n">replace</span><span class="p">(</span><span class="s">"2"</span><span class="p">,</span> <span class="s">"1"</span><span class="p">)</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">team2Data</span><span class="p">.</span><span class="n">columns</span><span class="p">]))</span>
<span class="n">team2Data</span> <span class="o">=</span> <span class="n">team2Data</span><span class="p">.</span><span class="n">rename</span><span class="p">(</span><span class="n">team2Map</span><span class="p">)</span>
<span class="n">team2Data</span> <span class="o">=</span> <span class="n">team2Data</span><span class="p">.</span><span class="n">with_columns</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">lit</span><span class="p">(</span><span class="mi">0</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">"Home"</span><span class="p">))</span>
</code></pre></div></div>
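
<p>It can help to see what the two renaming steps actually produce. This sketch repeats the same string replacements on the stats columns without touching any dataframes. The lower-case variable names are mine.</p>

```python
match_stats_cols = ["FTHG", "FTAG", "HS", "AS", "HST", "AST", "PSCD", "PSCH", "PSCA", "PTH", "PTA"]

# Step 1: home/away letters become 1/2, as in matchStatsMap.
to_numbered = {c: c.replace("H", "1").replace("A", "2") for c in match_stats_cols}


# Step 2: swap 1 and 2 so the away side becomes Team1, as in team2Map.
def swap_sides(name):
    return name.replace("1", "2") if "1" in name else name.replace("2", "1")


numbered_cols = ["Team1", "Team2"] + list(to_numbered.values())
to_swapped = {c: swap_sides(c) for c in numbered_cols}
```

<p>So <code class="language-plaintext highlighter-rouge">FTHG</code> becomes <code class="language-plaintext highlighter-rouge">FT1G</code> and then <code class="language-plaintext highlighter-rouge">FT2G</code> in the away frame, while <code class="language-plaintext highlighter-rouge">PSCD</code> is left alone. The single-letter replacement works for this set of columns, but it would mangle any name with an extra H or A in it, for example <code class="language-plaintext highlighter-rouge">HTHG</code> would turn into <code class="language-plaintext highlighter-rouge">1T1G</code>.</p>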

<p>Then we concatenate the two frames and sort by <code class="language-plaintext highlighter-rouge">MatchID</code>.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">teamData</span> <span class="o">=</span> <span class="n">pl</span><span class="p">.</span><span class="n">concat</span><span class="p">([</span><span class="n">team1Data</span><span class="p">,</span> <span class="n">team2Data</span><span class="p">],</span> <span class="n">how</span> <span class="o">=</span> <span class="s">"diagonal"</span><span class="p">)</span>
<span class="n">teamData</span> <span class="o">=</span> <span class="n">teamData</span><span class="p">.</span><span class="n">sort</span><span class="p">(</span><span class="s">"MatchID"</span><span class="p">)</span>
</code></pre></div></div>

<p>Now we want to add the cumulative sum of points, goals, and goals conceded to get a view of each team’s league position on a match by match basis.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">teamData</span> <span class="o">=</span> <span class="n">teamData</span><span class="p">.</span><span class="n">select</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="nb">all</span><span class="p">(),</span> <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"PT1"</span><span class="p">).</span><span class="n">cum_sum</span><span class="p">().</span><span class="n">over</span><span class="p">([</span><span class="s">"Season"</span><span class="p">,</span> <span class="s">"Team1"</span><span class="p">]).</span><span class="n">alias</span><span class="p">(</span><span class="s">"TotalPoints1"</span><span class="p">))</span>
<span class="n">teamData</span> <span class="o">=</span> <span class="n">teamData</span><span class="p">.</span><span class="n">select</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="nb">all</span><span class="p">(),</span> <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"FT1G"</span><span class="p">).</span><span class="n">cum_sum</span><span class="p">().</span><span class="n">over</span><span class="p">([</span><span class="s">"Season"</span><span class="p">,</span> <span class="s">"Team1"</span><span class="p">]).</span><span class="n">alias</span><span class="p">(</span><span class="s">"TotalGoals1"</span><span class="p">))</span>
<span class="n">teamData</span> <span class="o">=</span> <span class="n">teamData</span><span class="p">.</span><span class="n">select</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="nb">all</span><span class="p">(),</span> <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"FT2G"</span><span class="p">).</span><span class="n">cum_sum</span><span class="p">().</span><span class="n">over</span><span class="p">([</span><span class="s">"Season"</span><span class="p">,</span> <span class="s">"Team1"</span><span class="p">]).</span><span class="n">alias</span><span class="p">(</span><span class="s">"TotalGoalsC1"</span><span class="p">))</span>
<span class="n">teamData</span> <span class="o">=</span> <span class="n">teamData</span><span class="p">.</span><span class="n">select</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="nb">all</span><span class="p">(),</span> <span class="n">pl</span><span class="p">.</span><span class="n">int_range</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="nb">len</span><span class="p">()).</span><span class="n">over</span><span class="p">([</span><span class="s">"Season"</span><span class="p">,</span> <span class="s">"Team1"</span><span class="p">]).</span><span class="n">alias</span><span class="p">(</span><span class="s">"N"</span><span class="p">))</span>
</code></pre></div></div>

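<p>This is a bit different to the usual groupby and aggregate, but it makes sense once you get used to it: you define the expression on the column and then use <code class="language-plaintext highlighter-rouge">over</code> to specify the grouping columns.</p>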
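
<p>The key difference from a groupby is that a window expression returns one value per input row rather than one per group. A plain-Python sketch of a grouped running total, with a hypothetical helper and made-up rows:</p>

```python
from collections import defaultdict


def cum_sum_over(rows, value_key, group_keys):
    """Running total of value_key within each group, one output per input row."""
    totals = defaultdict(int)
    out = []
    for row in rows:
        group = tuple(row[k] for k in group_keys)
        totals[group] += row[value_key]
        out.append(totals[group])
    return out


# Made-up rows, already in match order.
toy_rows = [
    {"Season": "2526", "Team1": "Leeds", "PT1": 3},
    {"Season": "2526", "Team1": "Burnley", "PT1": 0},
    {"Season": "2526", "Team1": "Leeds", "PT1": 1},
    {"Season": "2526", "Team1": "Burnley", "PT1": 3},
]
running_points = cum_sum_over(toy_rows, "PT1", ["Season", "Team1"])
```

<p>The result has four entries, one per row, which is why the rows need to be sorted into match order before the cumulative sums are taken.</p>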

<p>Finally, we are going to create a league table dataframe by taking the last points/goals/goals conceded by each team per season and use that to work out who got relegated each year.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">leagueTable</span> <span class="o">=</span> <span class="n">teamData</span><span class="p">.</span><span class="n">group_by</span><span class="p">([</span><span class="s">"Season"</span><span class="p">,</span> <span class="s">"Div"</span><span class="p">,</span> <span class="s">"Team1"</span><span class="p">]).</span><span class="n">agg</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"N"</span><span class="p">,</span> <span class="s">"TotalPoints1"</span><span class="p">,</span> <span class="s">"TotalGoals1"</span><span class="p">,</span> <span class="s">"TotalGoalsC1"</span><span class="p">).</span><span class="n">last</span><span class="p">())</span>
<span class="n">leagueTable</span> <span class="o">=</span> <span class="n">leagueTable</span><span class="p">.</span><span class="n">sort</span><span class="p">(</span><span class="s">"TotalPoints1"</span><span class="p">,</span> <span class="n">descending</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
<span class="n">leagueTable</span> <span class="o">=</span> <span class="n">leagueTable</span><span class="p">.</span><span class="n">select</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="nb">all</span><span class="p">(),</span> <span class="n">pl</span><span class="p">.</span><span class="n">int_range</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="nb">len</span><span class="p">()).</span><span class="n">over</span><span class="p">([</span><span class="s">"Season"</span><span class="p">,</span> <span class="s">"Div"</span><span class="p">]).</span><span class="n">alias</span><span class="p">(</span><span class="s">"FinalPosition"</span><span class="p">))</span>
<span class="n">leagueTable</span> <span class="o">=</span> <span class="n">leagueTable</span><span class="p">.</span><span class="n">with_columns</span><span class="p">((</span><span class="n">pl</span><span class="p">.</span><span class="n">when</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"FinalPosition"</span><span class="p">)</span> <span class="o">&gt;=</span> <span class="mi">17</span><span class="p">).</span><span class="n">then</span><span class="p">(</span><span class="mi">1</span><span class="p">)).</span><span class="n">otherwise</span><span class="p">(</span><span class="mi">0</span><span class="p">).</span><span class="n">alias</span><span class="p">(</span><span class="s">'Relegated'</span><span class="p">))</span>
</code></pre></div></div>
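
<p>As a rough illustration of the ranking step, here is a plain Python sketch (not the polars code above) using made-up totals for a hypothetical six-team league. Positions are zero-based, as they are with <code class="language-plaintext highlighter-rouge">pl.int_range</code>. The tie-break on goal difference and then goals scored is my own addition, following the Premier League rules; the polars version sorts on points alone.</p>

```python
# Made-up final totals for a hypothetical six-team league:
# (team, points, goals scored, goals conceded).
FINAL_TOTALS = [
    ("A", 70, 60, 30),
    ("B", 55, 50, 45),
    ("C", 40, 38, 50),
    ("E", 34, 30, 58),  # level on points with D, worse goal difference
    ("D", 34, 35, 55),
    ("F", 20, 25, 70),
]


def rank_table(rows, relegation_places=3):
    """Rank teams and flag the bottom `relegation_places` as relegated.

    Positions are zero-based, so the champion is position 0.
    """
    ordered = sorted(
        rows,
        # Points first, then goal difference, then goals scored.
        key=lambda r: (-r[1], -(r[2] - r[3]), -r[2]),
    )
    first_relegated = len(ordered) - relegation_places
    return [
        {"team": team, "position": pos, "relegated": int(pos >= first_relegated)}
        for pos, (team, _points, _scored, _conceded) in enumerate(ordered)
    ]


table = rank_table(FINAL_TOTALS)
for row in table:
    print(row)
```

<p>With 20 teams and three relegation places, the first relegated position works out as 17, which matches the threshold used in the polars code.</p>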

<p>We can then join this to the <code class="language-plaintext highlighter-rouge">teamData</code>, and this will form the basis of our stats.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">teamData</span> <span class="o">=</span> <span class="n">teamData</span><span class="p">.</span><span class="n">join</span><span class="p">(</span><span class="n">leagueTable</span><span class="p">[[</span><span class="s">"Season"</span><span class="p">,</span> <span class="s">"Div"</span><span class="p">,</span> <span class="s">"Team1"</span><span class="p">,</span> <span class="s">"FinalPosition"</span><span class="p">,</span> <span class="s">"Relegated"</span><span class="p">]],</span> <span class="n">on</span> <span class="o">=</span> <span class="p">[</span><span class="s">"Season"</span><span class="p">,</span> <span class="s">"Div"</span><span class="p">,</span> <span class="s">"Team1"</span><span class="p">])</span>
</code></pre></div></div>

<h2 id="relegation-statistics">Relegation Statistics</h2>

<p>The data is now in a nice format, so we can manipulate it and see how this season is lining up. This is where <code class="language-plaintext highlighter-rouge">plotly</code> comes in. I’ve always been a <a href="https://matplotlib.org/">matplotlib</a> user and enjoyed building up plots layer by layer with a decent amount of control. Plotly was always missing from my arsenal, so if I’m dipping my toes into Python, I might as well plug that gap. I’ve left out some of the final graph formatting to keep the code chunks manageable.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">plotly.express</span> <span class="k">as</span> <span class="n">px</span>
<span class="kn">import</span> <span class="nn">plotly.graph_objects</span> <span class="k">as</span> <span class="n">go</span>
<span class="kn">from</span> <span class="nn">plotly.subplots</span> <span class="kn">import</span> <span class="n">make_subplots</span>
</code></pre></div></div>

<p>First, we calculate the relegation stats. We want to calculate the average number of points, goals scored, and goals conceded after each game week for the teams that were eventually relegated.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">relegated</span> <span class="o">=</span> <span class="p">(</span><span class="n">teamData</span><span class="p">.</span><span class="nb">filter</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"Season"</span><span class="p">)</span> <span class="o">!=</span> <span class="s">"2526"</span><span class="p">)</span>
                     <span class="p">.</span><span class="n">group_by</span><span class="p">([</span><span class="s">"N"</span><span class="p">,</span> <span class="s">"Relegated"</span><span class="p">])</span>
                     <span class="p">.</span><span class="n">agg</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"TotalPoints1"</span><span class="p">).</span><span class="n">mean</span><span class="p">(),</span> 
                          <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"TotalGoals1"</span><span class="p">).</span><span class="n">mean</span><span class="p">(),</span> 
                          <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"TotalGoalsC1"</span><span class="p">).</span><span class="n">mean</span><span class="p">())</span>
                     <span class="p">.</span><span class="n">sort</span><span class="p">(</span><span class="s">"N"</span><span class="p">).</span><span class="nb">filter</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"Relegated"</span><span class="p">)</span> <span class="o">==</span> <span class="mi">1</span><span class="p">))</span>
</code></pre></div></div>
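
<p>For anyone less familiar with polars, the same aggregation can be sketched in plain Python. The rows below are made up and the function name is hypothetical; it only shows the idea of dropping the unfinished season, keeping the relegated teams and averaging their cumulative points per game week.</p>

```python
from collections import defaultdict
from statistics import mean

# Made-up rows: (season, team, game week N, cumulative points, relegated flag).
ROWS = [
    ("2324", "X", 1, 0, 1), ("2324", "X", 2, 1, 1),
    ("2324", "Y", 1, 3, 0), ("2324", "Y", 2, 6, 0),
    ("2425", "Z", 1, 1, 1), ("2425", "Z", 2, 1, 1),
    ("2526", "W", 1, 0, 0),  # current season, excluded from the average
]


def average_points_by_week(rows, current_season):
    """Average cumulative points per game week for eventually relegated teams."""
    by_week = defaultdict(list)
    for season, _team, week, points, relegated in rows:
        # Skip the unfinished season and the teams that stayed up.
        if season != current_season and relegated == 1:
            by_week[week].append(points)
    return {week: mean(pts) for week, pts in sorted(by_week.items())}


relegated_avg = average_points_by_week(ROWS, current_season="2526")
print(relegated_avg)
```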

<p>We then want to plot this and compare it to the newly promoted teams, plus Wolves and West Ham, who are in the most trouble. Also, shout out to <a href="https://teamcolours.netlify.app/">https://teamcolours.netlify.app/</a> for the actual colours of the teams for the plot.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">fig</span> <span class="o">=</span> <span class="n">go</span><span class="p">.</span><span class="n">Figure</span><span class="p">()</span>
<span class="n">fig</span><span class="p">.</span><span class="n">add_trace</span><span class="p">(</span><span class="n">go</span><span class="p">.</span><span class="n">Scatter</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="n">relegated</span><span class="p">[</span><span class="s">"N"</span><span class="p">],</span> <span class="n">y</span><span class="o">=</span><span class="n">relegated</span><span class="p">[</span><span class="s">"TotalPoints1"</span><span class="p">],</span>
                    <span class="n">mode</span><span class="o">=</span><span class="s">'lines+markers'</span><span class="p">,</span>
                    <span class="n">name</span><span class="o">=</span><span class="s">'Avg Points Of A Relegated Team'</span><span class="p">))</span>

<span class="k">for</span> <span class="n">team</span> <span class="ow">in</span> <span class="p">[</span><span class="s">"West Ham"</span><span class="p">,</span> <span class="s">"Wolves"</span><span class="p">,</span> <span class="s">"Sunderland"</span><span class="p">,</span> <span class="s">"Leeds"</span><span class="p">,</span> <span class="s">"Burnley"</span><span class="p">]:</span>
    <span class="n">latestTeam</span> <span class="o">=</span> <span class="n">teamData</span><span class="p">.</span><span class="nb">filter</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"Team1"</span><span class="p">)</span> <span class="o">==</span> <span class="n">team</span><span class="p">,</span> <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"Season"</span><span class="p">)</span> <span class="o">==</span> <span class="s">"2526"</span><span class="p">)</span>

    <span class="n">fig</span><span class="p">.</span><span class="n">add_trace</span><span class="p">(</span><span class="n">go</span><span class="p">.</span><span class="n">Scatter</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="n">latestTeam</span><span class="p">[</span><span class="s">"N"</span><span class="p">],</span> <span class="n">y</span><span class="o">=</span><span class="n">latestTeam</span><span class="p">[</span><span class="s">"TotalPoints1"</span><span class="p">],</span>
                    <span class="n">mode</span><span class="o">=</span><span class="s">'lines+markers'</span><span class="p">,</span>
                    <span class="n">name</span><span class="o">=</span><span class="n">team</span><span class="p">))</span>


<span class="n">fig</span><span class="p">.</span><span class="n">update_layout</span><span class="p">(</span><span class="n">height</span><span class="o">=</span><span class="mi">500</span><span class="p">,</span> <span class="n">width</span><span class="o">=</span><span class="mi">700</span><span class="p">,</span>
                  <span class="n">title_text</span><span class="o">=</span><span class="s">"Relegation Stats"</span><span class="p">)</span>

<span class="n">fig</span><span class="p">.</span><span class="n">show</span><span class="p">()</span>
</code></pre></div></div>

<p><img src="/assets/relegation/relegated.png" alt="Line chart titled Relegation Stats showing average cumulative points of teams that were eventually relegated compared to current teams. X axis is match week number and Y axis is total points. Primary subjects are the average relegated team line and individual team lines for West Ham, Wolves, Sunderland, Leeds, and Burnley. The average relegated team line rises steadily through the season. Sunderland's line is well above the average, Leeds and Burnley track close to the average, and West Ham and Wolves fall below the average with Wolves furthest below." /></p>

<p>Wolves and West Ham are currently in trouble. They are below the average line at this point in the season, whereas Sunderland are storming it, Leeds are also quite safe, and Burnley’s recent performances have kept them above the fated line.</p>

<p>However, looking at the average points of a relegated team isn’t the best way of looking at this. It can get dragged down by a very poor team at the bottom of the league. Instead, we need to look at the minimum and average number of points needed to stay safe each season.</p>

<p>This is the same calculation as above, but aggregating on the final position of each team and then filtering on position 16. Positions are counted from zero here, so position 16 is 17th place, one above the relegation zone.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">safe</span> <span class="o">=</span> <span class="p">(</span><span class="n">teamData</span><span class="p">.</span><span class="nb">filter</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"Season"</span><span class="p">)</span> <span class="o">!=</span> <span class="s">"2526"</span><span class="p">)</span>
                <span class="p">.</span><span class="n">group_by</span><span class="p">([</span><span class="s">"N"</span><span class="p">,</span> <span class="s">"FinalPosition"</span><span class="p">])</span>
                <span class="p">.</span><span class="n">agg</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"TotalPoints1"</span><span class="p">).</span><span class="n">mean</span><span class="p">(),</span> 
                     <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"TotalGoals1"</span><span class="p">).</span><span class="n">mean</span><span class="p">(),</span> 
                    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"TotalGoalsC1"</span><span class="p">).</span><span class="n">mean</span><span class="p">(),</span>
                    <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"TotalPoints1"</span><span class="p">).</span><span class="nb">min</span><span class="p">().</span><span class="n">alias</span><span class="p">(</span><span class="s">"Min"</span><span class="p">))</span>
                <span class="p">.</span><span class="n">sort</span><span class="p">(</span><span class="s">"N"</span><span class="p">).</span><span class="nb">filter</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"FinalPosition"</span><span class="p">)</span> <span class="o">==</span> <span class="mi">16</span><span class="p">)</span>
       <span class="p">)</span>
</code></pre></div></div>

<p>Again, we plot this with the same teams.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">fig</span> <span class="o">=</span> <span class="n">go</span><span class="p">.</span><span class="n">Figure</span><span class="p">()</span>
<span class="n">fig</span><span class="p">.</span><span class="n">add_trace</span><span class="p">(</span><span class="n">go</span><span class="p">.</span><span class="n">Scatter</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="n">safe</span><span class="p">[</span><span class="s">"N"</span><span class="p">],</span> <span class="n">y</span><span class="o">=</span><span class="n">safe</span><span class="p">[</span><span class="s">"TotalPoints1"</span><span class="p">],</span>
                    <span class="n">mode</span><span class="o">=</span><span class="s">'lines+markers'</span><span class="p">,</span>
                    <span class="n">name</span><span class="o">=</span><span class="s">'Avg Points of a Safe Team'</span><span class="p">))</span>

<span class="n">fig</span><span class="p">.</span><span class="n">add_trace</span><span class="p">(</span><span class="n">go</span><span class="p">.</span><span class="n">Scatter</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="n">safe</span><span class="p">[</span><span class="s">"N"</span><span class="p">],</span> <span class="n">y</span><span class="o">=</span><span class="n">safe</span><span class="p">[</span><span class="s">"Min"</span><span class="p">],</span>
                    <span class="n">mode</span><span class="o">=</span><span class="s">'lines+markers'</span><span class="p">,</span>
                    <span class="n">name</span><span class="o">=</span><span class="s">'Min Points of a Safe Team'</span><span class="p">))</span>

<span class="k">for</span> <span class="n">team</span> <span class="ow">in</span> <span class="p">[</span><span class="s">"West Ham"</span><span class="p">,</span> <span class="s">"Wolves"</span><span class="p">,</span> <span class="s">"Sunderland"</span><span class="p">,</span> <span class="s">"Leeds"</span><span class="p">,</span> <span class="s">"Burnley"</span><span class="p">]:</span>
    <span class="n">latestTeam</span> <span class="o">=</span> <span class="n">teamData</span><span class="p">.</span><span class="nb">filter</span><span class="p">(</span><span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"Team1"</span><span class="p">)</span> <span class="o">==</span> <span class="n">team</span><span class="p">,</span> <span class="n">pl</span><span class="p">.</span><span class="n">col</span><span class="p">(</span><span class="s">"Season"</span><span class="p">)</span> <span class="o">==</span> <span class="s">"2526"</span><span class="p">)</span>

    <span class="n">fig</span><span class="p">.</span><span class="n">add_trace</span><span class="p">(</span><span class="n">go</span><span class="p">.</span><span class="n">Scatter</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="n">latestTeam</span><span class="p">[</span><span class="s">"N"</span><span class="p">],</span> <span class="n">y</span><span class="o">=</span><span class="n">latestTeam</span><span class="p">[</span><span class="s">"TotalPoints1"</span><span class="p">],</span>
                    <span class="n">mode</span><span class="o">=</span><span class="s">'lines+markers'</span><span class="p">,</span>
                    <span class="n">name</span><span class="o">=</span><span class="n">team</span><span class="p">))</span>

<span class="n">fig</span><span class="p">.</span><span class="n">update_layout</span><span class="p">(</span><span class="n">height</span><span class="o">=</span><span class="mi">500</span><span class="p">,</span> <span class="n">width</span><span class="o">=</span><span class="mi">700</span><span class="p">,</span>
                  <span class="n">title_text</span><span class="o">=</span><span class="s">"Safety Stats"</span><span class="p">)</span>

<span class="n">fig</span><span class="p">.</span><span class="n">show</span><span class="p">()</span>
</code></pre></div></div>

<p><img src="/assets/relegation/safe.png" alt="Line chart titled Safety Stats showing cumulative points by match week on the x axis and total points on the y axis. Primary subjects are the colored lines representing Avg Points of a Safe Team, Min Points of a Safe Team, and individual teams West Ham, Wolves, Sunderland, Leeds, Burnley. Sunderland is well above both safety lines, Leeds and Burnley track close to the average and minimum lines, Wolves falls below both safety lines, and West Ham falls below the average line." /></p>

<p>Again, Wolves and West Ham are well below the average line (blue), and Wolves are even below the minimum line (red). Burnley and Leeds are within touching distance. Sunderland are well above. From this, Sunderland should be happy and confident that they can stay up; Leeds are at the bare minimum. Wolves are in big danger, but with a new manager, they might be able to get going again. West Ham have already had their new manager bounce, and it’s still looking precarious.</p>
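
<p>Reading positions off a chart is a bit subjective, so the same comparison can be written down as a small check. This is a plain Python sketch with hypothetical benchmark values and team totals (none of these numbers come from the real data), and <code class="language-plaintext highlighter-rouge">safety_status</code> is a name I made up for illustration.</p>

```python
def safety_status(team_points, avg_line, min_line):
    """Classify a team against the two safety lines at its latest game week.

    All three arguments map game week -> cumulative points.
    """
    week = max(team_points)  # latest game week played
    points = team_points[week]
    if points >= avg_line[week]:
        status = "above average line"
    elif points >= min_line[week]:
        status = "between the lines"
    else:
        status = "below minimum line"
    return week, points, status


# Hypothetical benchmark values and team totals, for illustration only.
avg_line = {1: 1.0, 2: 2.0, 3: 3.5}
min_line = {1: 0, 2: 1, 3: 2}
teams = {
    "Team A": {1: 3, 2: 4, 3: 7},
    "Team B": {1: 0, 2: 1, 3: 2},
    "Team C": {1: 0, 2: 0, 3: 1},
}

statuses = {
    name: safety_status(points, avg_line, min_line)
    for name, points in teams.items()
}
for name, result in statuses.items():
    print(name, result)
```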

<p>This also shows that, on average, you need 37.23 points to survive in the Premier League, with 35 as the bare minimum. So the fabled 40-point mark is actually a slight overestimate.</p>

<p>It’s not just points, though. What about the number of goals each team has scored and how many they are conceding? Let’s look at these stats and also format up the graph so it’s a bit less default, and focus just on the games so far.</p>

<p><img src="/assets/relegation/more_safe.png" alt="A line chart comparing Premier League teams' cumulative points across the season, focusing on teams near the relegation zone." title="A line chart comparing Premier League teams' cumulative points across the season, focusing on teams near the relegation zone." /></p>

<p>No real change to the conclusion. Sunderland are doing well on both points and goals scored, and they have conceded fewer goals than the average team finishing in position 16. Wolves and West Ham are underperforming across the board. Leeds and Burnley are scraping by.</p>

<h2 id="conclusion">Conclusion</h2>

<p>Based on these early-season trajectories, it’s not looking good for West Ham or Wolves. By contrast, Sunderland should be getting excited about the prospect of another season in the Premier League. Leeds and Burnley are not quite out of the woods. As another cliché goes, relegation is about hoping you are better than three other teams, and at the minute Wolves and West Ham are struggling to find three worse teams!</p>]]></content><author><name>Dean Markwick</name></author><category term="python" /><category term="sports" /><summary type="html"><![CDATA[It’s been an interesting start to the Premier League. All of the promoted teams (Sunderland, Leeds and Burnley) are outside the relegation zone, with Wolves and West Ham struggling at the bottom. So I want to look back at the other seasons and work out the average number of points throughout the season that characterises relegation teams, and how many points do you need to avoid relegation?]]></summary></entry><entry><title type="html">Easy Neural Nets and Finance - Part 1</title><link href="https://dm13450.github.io/2025/07/23/Easy-Neural-Nets-and-Finance-Part-1.html" rel="alternate" type="text/html" title="Easy Neural Nets and Finance - Part 1" /><published>2025-07-23T00:00:00+00:00</published><updated>2025-07-23T00:00:00+00:00</updated><id>https://dm13450.github.io/2025/07/23/Easy-Neural-Nets-and-Finance-Part-1</id><content type="html" xml:base="https://dm13450.github.io/2025/07/23/Easy-Neural-Nets-and-Finance-Part-1.html"><![CDATA[<p>I’m fortunate enough to be participating in a lecture series at work that covers deep learning and its applications in finance. This will be a series of posts documenting what I learn and implementing the ‘homework’ (I’m 32, how am I still getting homework?) using Julia and Flux.</p>


<p>The phrase ‘deep learning’ already feels outdated, and the current hotness is more about AI and LLMs, so the lecture and topics might feel a bit out of date. But given LLMs wouldn’t be here without deep learning, it feels like going back to the basics.</p>

<p>Plus, I’ve never really jumped in and explored neural nets, so this gives me a chance to do some deep learning in an applied way.</p>

<p>After reading this, you will be able to build your own neural net with different layers and compare it to a simpler linear model.</p>

<h2 id="predicting-a-stocks-daily-volume">Predicting a Stock’s Daily Volume</h2>

<p>If you Google neural nets and finance, you will find an endless supply of copy-pasted quant finance Python examples of people using PyTorch/TensorFlow/JAX to predict the closing price of some stock. Kudos to these tutorials for putting something out there, but you will struggle to learn anything meaningful about finance, modelling or neural nets.</p>

<p>This is my attempt to be different.</p>

<p>Instead of predicting prices or returns and showing that neural nets can make money, we will model the total number of shares traded per day. For starters, this is much easier as the data has a bit more signal and less noise. Plus, if I managed to build something that could predict prices, why would I share it?</p>

<p>So, we will be using deep learning to build a model of the <em>total trading volume</em> per day of the SPY ETF. This is a basic time series prediction task that can be approached with both linear models and deep learning.</p>

<p>You know the drill, fire up your Julia notebook and follow along.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">using</span> <span class="n">Dates</span><span class="x">,</span> <span class="n">AlpacaMarkets</span><span class="x">,</span> <span class="n">Plots</span><span class="x">,</span> <span class="n">StatsBase</span>
<span class="k">using</span> <span class="n">DataFramesMeta</span><span class="x">,</span> <span class="n">ShiftedArrays</span>
</code></pre></div></div>

<h2 id="getting-the-data">Getting the Data</h2>

<p>We are using similar data to my <a href="https://dm13450.github.io/2025/06/16/Cyclical-Embedding.html">Cyclical Embedding</a> post, except this time we will be using the SPY ETF instead of Apple.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">spyRaw</span><span class="x">,</span> <span class="n">npt</span> <span class="o">=</span> <span class="n">AlpacaMarkets</span><span class="o">.</span><span class="n">stock_bars</span><span class="x">(</span><span class="s">"SPY"</span><span class="x">,</span> <span class="s">"1Day"</span><span class="x">;</span> 
  <span class="n">startTime</span><span class="o">=</span><span class="kt">Date</span><span class="x">(</span><span class="s">"2000-01-01"</span><span class="x">),</span> 
  <span class="n">endTime</span> <span class="o">=</span> <span class="n">today</span><span class="x">()</span> <span class="o">-</span> <span class="kt">Day</span><span class="x">(</span><span class="mi">1</span><span class="x">)</span> <span class="x">,</span>
  <span class="n">adjustment</span> <span class="o">=</span> <span class="s">"all"</span><span class="x">,</span> <span class="n">limit</span> <span class="o">=</span> <span class="mi">10000</span><span class="x">)</span>
</code></pre></div></div>

<p>From the raw data, we parse the timestamp and scale the volumes into millions of shares.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">spy</span> <span class="o">=</span> <span class="n">spyRaw</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="x">[</span><span class="o">:</span><span class="n">t</span><span class="x">,</span> <span class="o">:</span><span class="n">v</span><span class="x">,</span> <span class="o">:</span><span class="n">c</span><span class="x">]]</span>
<span class="n">spy</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"t"</span><span class="x">]</span> <span class="o">=</span> <span class="kt">DateTime</span><span class="o">.</span><span class="x">(</span><span class="n">chop</span><span class="o">.</span><span class="x">(</span><span class="n">spy</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"t"</span><span class="x">]));</span>
<span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">vNorm</span><span class="x">]</span> <span class="o">=</span> <span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">v</span><span class="x">]</span> <span class="o">.*</span> <span class="mf">1e-6</span><span class="x">;</span>
</code></pre></div></div>

<p>We also add the close-to-close log return, along with its lag, because we are using the return as a feature.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">spy</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"r"</span><span class="x">]</span> <span class="o">=</span> <span class="n">log</span><span class="o">.</span><span class="x">(</span><span class="n">spy</span><span class="o">.</span><span class="n">c</span><span class="x">)</span> <span class="o">.-</span> <span class="n">ShiftedArrays</span><span class="o">.</span><span class="n">lag</span><span class="x">(</span><span class="n">log</span><span class="o">.</span><span class="x">(</span><span class="n">spy</span><span class="o">.</span><span class="n">c</span><span class="x">))</span>
<span class="n">spy</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"prev_r"</span><span class="x">]</span> <span class="o">=</span> <span class="n">ShiftedArrays</span><span class="o">.</span><span class="n">lag</span><span class="x">(</span><span class="n">spy</span><span class="o">.</span><span class="n">r</span><span class="x">);</span>
</code></pre></div></div>

<p>In this data, the daily volume isn’t stationary and it is also heavy-tailed.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">plot</span><span class="x">(</span>
  <span class="n">plot</span><span class="x">(</span><span class="n">spy</span><span class="o">.</span><span class="n">t</span><span class="x">,</span> <span class="n">spy</span><span class="o">.</span><span class="n">vNorm</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"IEX Daily Volume"</span><span class="x">),</span>
  <span class="n">histogram</span><span class="x">(</span><span class="n">spy</span><span class="o">.</span><span class="n">vNorm</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Daily Volume Distribution"</span><span class="x">)</span>
  <span class="x">)</span>
</code></pre></div></div>

<p><img src="/assets/deeplearning/part1/volumes.png" alt="Line chart showing daily trading volume for SPY ETF over time. The chart displays a fluctuating pattern with several peaks and troughs, illustrating periods of higher and lower trading activity." width="80%" class="center-image" /></p>

<p>Looking at the autocorrelation, we can see long-range dependence in the daily volumes, but when we take the daily difference in volume, we see a strong effect at lag 1, and the rest are much smaller.</p>

<p><img src="/assets/deeplearning/part1/volumes_autocor.png" alt="Bar chart displaying autocorrelation of daily trading volume for SPY ETF across multiple lags. The chart shows a prominent negative bar at lag 1, indicating strong mean reversion, followed by smaller bars for subsequent lags." width="80%" class="center-image" /></p>

<p>A negative value at lag 1 indicates a mean reversion-like process, but more importantly, it means modelling the difference in daily volume will be easier than directly modelling the daily volumes.</p>
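
<p>One way to build some intuition for the negative value at lag 1: if a series is just independent noise around a constant level, consecutive differences share a term with opposite signs, so the lag-1 autocorrelation of the differences sits near -0.5. Here is a minimal sketch in Python (standard library only, rather than the Julia used in the rest of this post), with simulated noise standing in for the real volumes. The real SPY volumes have long-range dependence, so this is an idealised case and not a model of the data.</p>

```python
import random


def autocorrelation(xs, lag):
    """Sample autocorrelation of `xs` at the given lag."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs)
    cov = sum((xs[i] - mu) * (xs[i - lag] - mu) for i in range(lag, n))
    return cov / var


rng = random.Random(42)
# Simulated "volume": a constant level plus independent noise
# (an assumption for illustration, not SPY data).
level = [100 + rng.gauss(0, 10) for _ in range(5000)]
diffs = [b - a for a, b in zip(level, level[1:])]

lag1_level = autocorrelation(level, 1)
lag1_diffs = autocorrelation(diffs, 1)
print(round(lag1_level, 2), round(lag1_diffs, 2))
```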

<p>Predicting the daily change in volume does reduce how far out we can forecast volumes, though, as it relies on using the known previous volume to produce the next day’s volume. If you estimate multiple days, then you will be compounding the error.</p>
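
<p>To make the compounding concrete, here is a small Python sketch (again plain Python, not the Julia used elsewhere in this post) that rolls a volume forecast forward from predicted daily changes. The numbers are made up, and the constant 0.5 error on every predicted change is purely an assumption to keep the arithmetic easy to follow.</p>

```python
def forecast_path(last_known_volume, predicted_deltas):
    """Roll a volume forecast forward by summing predicted daily changes."""
    path = []
    volume = last_known_volume
    for delta in predicted_deltas:
        # Beyond the first step, each forecast is built on the previous forecast.
        volume = volume + delta
        path.append(volume)
    return path


last_known = 80.0
true_deltas = [2.0, -1.0, 3.0]
# Hypothetical model output: every predicted change is off by 0.5.
predicted_deltas = [d + 0.5 for d in true_deltas]

true_path = forecast_path(last_known, true_deltas)
predicted_path = forecast_path(last_known, predicted_deltas)
errors = [p - t for p, t in zip(predicted_path, true_path)]
print(errors)
```

<p>The error in the forecast level grows with every extra day, even though the error in each individual change stays the same size.</p>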

<p>We lag the volume variables as required.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">prev_vNorm</span><span class="x">]</span> <span class="o">=</span> <span class="n">ShiftedArrays</span><span class="o">.</span><span class="n">lag</span><span class="x">(</span><span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">vNorm</span><span class="x">])</span>
<span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">delta_vNorm</span><span class="x">]</span> <span class="o">=</span> <span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">vNorm</span><span class="x">]</span> <span class="o">.-</span> <span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">prev_vNorm</span><span class="x">]</span>
<span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">prev_delta_vNorm</span><span class="x">]</span> <span class="o">=</span> <span class="n">ShiftedArrays</span><span class="o">.</span><span class="n">lag</span><span class="x">(</span><span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">delta_vNorm</span><span class="x">])</span>

<span class="n">spy</span> <span class="o">=</span> <span class="n">dropmissing</span><span class="x">(</span><span class="n">spy</span><span class="x">)</span>
</code></pre></div></div>

<p>We add in the time-based variables and cyclically encode them.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">DayOfMonth</span><span class="x">]</span> <span class="o">=</span> <span class="n">dayofmonth</span><span class="o">.</span><span class="x">(</span><span class="n">spy</span><span class="o">.</span><span class="n">t</span><span class="x">)</span> <span class="o">.-</span> <span class="mi">1</span>
<span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">DayOfWeek</span><span class="x">]</span> <span class="o">=</span> <span class="n">dayofweek</span><span class="o">.</span><span class="x">(</span><span class="n">spy</span><span class="o">.</span><span class="n">t</span><span class="x">)</span> <span class="o">.-</span> <span class="mi">1</span>
<span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">DayOfQtr</span><span class="x">]</span> <span class="o">=</span> <span class="n">dayofquarter</span><span class="o">.</span><span class="x">(</span><span class="n">spy</span><span class="o">.</span><span class="n">t</span><span class="x">)</span> <span class="o">.-</span> <span class="mi">1</span>
<span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">MonthOfYear</span><span class="x">]</span> <span class="o">=</span> <span class="n">month</span><span class="o">.</span><span class="x">(</span><span class="n">spy</span><span class="o">.</span><span class="n">t</span><span class="x">)</span> <span class="o">.-</span> <span class="mi">1</span>

<span class="n">spy</span> <span class="o">=</span> <span class="n">cyclical_encode</span><span class="x">(</span><span class="n">spy</span><span class="x">,</span> <span class="s">"DayOfWeek"</span><span class="x">)</span>
<span class="n">spy</span> <span class="o">=</span> <span class="n">cyclical_encode</span><span class="x">(</span><span class="n">spy</span><span class="x">,</span> <span class="s">"DayOfMonth"</span><span class="x">)</span>
<span class="n">spy</span> <span class="o">=</span> <span class="n">cyclical_encode</span><span class="x">(</span><span class="n">spy</span><span class="x">,</span> <span class="s">"DayOfQtr"</span><span class="x">)</span>
<span class="n">spy</span> <span class="o">=</span> <span class="n">cyclical_encode</span><span class="x">(</span><span class="n">spy</span><span class="x">,</span> <span class="s">"MonthOfYear"</span><span class="x">);</span>
</code></pre></div></div>

<p>We also flag whether each date is the last trading day of its month in the data.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">month</span><span class="x">]</span> <span class="o">=</span> <span class="n">floor</span><span class="o">.</span><span class="x">(</span><span class="n">spy</span><span class="o">.</span><span class="n">t</span><span class="x">,</span> <span class="n">Dates</span><span class="o">.</span><span class="kt">Month</span><span class="x">)</span>
<span class="n">spy</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">groupby</span><span class="x">(</span><span class="n">spy</span><span class="x">,</span> <span class="o">:</span><span class="n">month</span><span class="x">),</span> 
                 <span class="o">:</span><span class="n">MonthEnd</span> <span class="o">=</span> <span class="x">(</span><span class="o">:</span><span class="n">t</span> <span class="o">.==</span> <span class="n">maximum</span><span class="x">(</span><span class="o">:</span><span class="n">t</span><span class="x">)))</span>
</code></pre></div></div>
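
<p>As a minimal sketch of what this flag does, the snippet below applies the same idea (the latest date within each month) to a handful of made-up dates using only the <code class="language-plaintext highlighter-rouge">Dates</code> standard library. The dates and the names <code class="language-plaintext highlighter-rouge">lastSeen</code> and <code class="language-plaintext highlighter-rouge">monthEnd</code> are illustrative assumptions, not part of the SPY data.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using Dates

# Hypothetical trading dates: 30 June 2024 is a Sunday, so the last
# trading day of June in this sample is Friday 28 June.
dates = [Date(2024, 6, 26), Date(2024, 6, 27), Date(2024, 6, 28),
         Date(2024, 7, 1), Date(2024, 7, 2)]

# Same bucketing as the :month column above
months = floor.(dates, Dates.Month)

# Latest observed date in each month
lastSeen = Dict(m =&gt; maximum(dates[months .== m]) for m in unique(months))

# true where a date is the latest one observed in its month
monthEnd = [d == lastSeen[m] for (d, m) in zip(dates, months)]
</code></pre></div></div>

<p>Here 28 June is flagged even though the calendar month ends on the 30th. Note that 2 July is flagged too, simply because it is the last date in the sample, so the final partial month of any dataset picks up a month-end flag as well.</p>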

<p>Finally, train/test split.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">spyTrain</span> <span class="o">=</span> <span class="n">spy</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="mi">2000</span><span class="x">,</span> <span class="o">:</span><span class="x">];</span>
<span class="n">spyTest</span> <span class="o">=</span> <span class="n">spy</span><span class="x">[</span><span class="mi">2001</span><span class="o">:</span><span class="k">end</span><span class="x">,</span> <span class="o">:</span><span class="x">];</span>
</code></pre></div></div>

<p>With the data prepared, we move on to building out the models.</p>

<h2 id="the-baseline-model">The Baseline Model</h2>

<p>We always want to make sure the neural nets are adding value, so we need something simple to compare to. In regular statistical modelling, this might be an intercept-only model, but in this case, we want the best linear model.</p>

<p>It’s a simple linear regression of all the available variables.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">using</span> <span class="n">GLM</span>

<span class="n">linearModel</span> <span class="o">=</span> <span class="n">lm</span><span class="x">(</span><span class="nd">@formula</span><span class="x">(</span><span class="n">delta_vNorm</span> <span class="o">~</span> <span class="n">prev_delta_vNorm</span> <span class="o">+</span> <span class="n">prev_vNorm</span> <span class="o">+</span> 
                                        <span class="n">MonthEnd</span> <span class="o">+</span> <span class="n">prev_r</span> <span class="o">+</span>
                                        <span class="n">DayOfWeek_sin</span> <span class="o">+</span> <span class="n">DayOfWeek_cos</span> <span class="o">+</span> 
                                        <span class="n">DayOfMonth_sin</span> <span class="o">+</span> <span class="n">DayOfMonth_cos</span> <span class="o">+</span>
                                        <span class="n">DayOfQtr_sin</span> <span class="o">+</span> <span class="n">DayOfQtr_cos</span> <span class="o">+</span>
                                        <span class="n">MonthOfYear_sin</span> <span class="o">+</span> <span class="n">MonthOfYear_cos</span>
                        <span class="x">),</span> <span class="n">spyTrain</span><span class="x">)</span>
</code></pre></div></div>

<p>This fits instantly and we get an in-sample \(R^2\) of 23% and an out-of-sample MSE of 380.</p>

<p>To add the predicted volume to the test set, we need to add the prediction of the model to the previous volume.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">spyTest</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"linearPred"</span><span class="x">]</span> <span class="o">=</span> <span class="n">spyTest</span><span class="o">.</span><span class="n">prev_vNorm</span> <span class="o">.+</span> <span class="n">predict</span><span class="x">(</span><span class="n">linearModel</span><span class="x">,</span> <span class="n">spyTest</span><span class="x">);</span>
<span class="n">sort!</span><span class="x">(</span><span class="n">spyTest</span><span class="x">,</span> <span class="o">:</span><span class="n">t</span><span class="x">);</span>
</code></pre></div></div>

<p><img src="/assets/deeplearning/part1/lm_res.png" alt="Line chart comparing predicted and actual daily trading volumes for SPY ETF over time." title="Line chart comparing predicted and actual daily trading volumes for SPY ETF over time." width="80%" class="center-image" /></p>

<p>Everything lines up quite nicely. There are a couple of periods where the volume spikes and the model can’t keep up, but other than that, it looks decent.</p>
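
<p>One consequence of rebuilding the volume this way is worth spelling out: because the previous volume is known, the error on the predicted volume is identical to the error on the predicted change. The sketch below checks this on made-up numbers (all values are hypothetical, not SPY data).</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using Statistics

# Hypothetical normalised volumes and model outputs
prev_v    = [100.0, 110.0, 95.0, 120.0]   # yesterday's volume
v         = [110.0, 95.0, 120.0, 105.0]   # today's volume
delta_hat = [5.0, -10.0, 15.0, -5.0]      # predicted change

delta = v .- prev_v          # realised change
v_hat = prev_v .+ delta_hat  # predicted volume

mse_level = mean(abs2, v .- v_hat)          # error on the volume
mse_delta = mean(abs2, delta .- delta_hat)  # error on the change
</code></pre></div></div>

<p>The two MSE values agree, so an MSE quoted on volumes can be compared directly with a loss computed on volume changes, as long as both use the same rows.</p>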

<p>It’s also interesting to look at the shape of the cyclically encoded variables.</p>

<p><img src="/assets/deeplearning/part1/lm_cyen.png" alt="Line plot showing the effect of cyclically encoded variables on predicted daily trading volume changes for SPY ETF. The chart displays four panels for day of the week, day of the month, day of the quarter, and month of the year, each with a smooth curve illustrating how each time-based feature influences the model output." width="80%" class="center-image" /></p>

<p>Plenty going on here!</p>

<ul>
  <li><strong>Day of the Week</strong> - Wednesdays and Thursdays have a larger positive effect than Mondays and Tuesdays.</li>
  <li><strong>Day of the Month</strong> - The middle of the month (10-15) has the highest positive effect.</li>
  <li><strong>Day of the Quarter</strong> - Larger positive effects towards the end of the quarter.</li>
  <li><strong>Month of the Year</strong> - Summer months have the most negative effect.</li>
</ul>

<p>A positive effect here means a larger positive change in the daily volume compared to the previous day, and a negative effect means a larger negative change.</p>
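
<p>To see where curves like these come from, here is a rough sketch of how a fitted sin/cos pair turns into an effect per day. It assumes the encoding has the usual form \(\sin(2\pi d / P)\) and \(\cos(2\pi d / P)\); the coefficients and the period <code class="language-plaintext highlighter-rouge">P = 5</code> are made up for illustration and are not the fitted values from the model above.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical coefficients for DayOfWeek_sin and DayOfWeek_cos
beta_sin, beta_cos = 1.5, -0.8
period = 5                 # assumed number of trading days in the cycle
days = 0:(period - 1)      # 0 = Monday, matching dayofweek(t) - 1

# Contribution of the encoded variable to the prediction on each day
theta = 2pi .* days ./ period
effect = beta_sin .* sin.(theta) .+ beta_cos .* cos.(theta)

# The two terms collapse into one sinusoid with this amplitude and phase
amplitude = hypot(beta_sin, beta_cos)
phase = atan(beta_cos, beta_sin)
</code></pre></div></div>

<p>A single sin/cos pair can only produce one peak and one trough per cycle, so in a linear model each of these effects is a shifted sine wave whose height and position are set by the two coefficients.</p>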

<p>So we have an intuitive model to begin with, and a strong baseline for the neural net models to improve upon.</p>

<h2 id="neural-nets-in-julia">Neural Nets in Julia</h2>

<p>Let’s increase the model complexity and introduce the neural nets. We are still using the same variables, but we expand them to include even more lags of the change in volumes.</p>

<h3 id="preparing-the-data-for-a-neural-network">Preparing the Data for a Neural Network</h3>

<p>We start with the same dataframe, then loop through and add 30 lags of the previous volume changes.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">rawData</span> <span class="o">=</span> <span class="n">spy</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="x">[</span><span class="o">:</span><span class="n">t</span><span class="x">,</span> <span class="o">:</span><span class="n">delta_vNorm</span><span class="x">,</span> <span class="o">:</span><span class="n">prev_vNorm</span><span class="x">,</span> <span class="o">:</span><span class="n">MonthEnd</span><span class="x">,</span> <span class="o">:</span><span class="n">prev_r</span><span class="x">,</span>
                      <span class="o">:</span><span class="n">DayOfWeek_sin</span><span class="x">,</span> <span class="o">:</span><span class="n">DayOfWeek_cos</span><span class="x">,</span>
                      <span class="o">:</span><span class="n">DayOfMonth_sin</span><span class="x">,</span> <span class="o">:</span><span class="n">DayOfMonth_cos</span><span class="x">,</span>
                      <span class="o">:</span><span class="n">DayOfQtr_sin</span><span class="x">,</span> <span class="o">:</span><span class="n">DayOfQtr_cos</span><span class="x">,</span>
                      <span class="o">:</span><span class="n">MonthOfYear_sin</span><span class="x">,</span> <span class="o">:</span><span class="n">MonthOfYear_cos</span><span class="x">]]</span>

<span class="n">maxLag</span> <span class="o">=</span> <span class="mi">30</span>
<span class="k">for</span> <span class="n">i</span> <span class="k">in</span> <span class="mi">1</span><span class="o">:</span><span class="n">maxLag</span>
    <span class="n">rawData</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="kt">Symbol</span><span class="x">(</span><span class="s">"lag_</span><span class="si">$(i)</span><span class="s">_delta_vNorm"</span><span class="x">)]</span> <span class="o">=</span> <span class="n">ShiftedArrays</span><span class="o">.</span><span class="n">lag</span><span class="x">(</span><span class="n">rawData</span><span class="o">.</span><span class="n">delta_vNorm</span><span class="x">,</span> <span class="n">i</span><span class="x">)</span>
<span class="k">end</span>

<span class="n">dropmissing!</span><span class="x">(</span><span class="n">rawData</span><span class="x">)</span>
</code></pre></div></div>
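
<p>As a small illustration of what each lag column looks like, the sketch below applies <code class="language-plaintext highlighter-rouge">ShiftedArrays.lag</code> to a toy vector (the values are made up). Each extra lag pushes the series down one more row and pads the top with <code class="language-plaintext highlighter-rouge">missing</code>.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using ShiftedArrays

v = [10, 20, 30, 40, 50]   # toy series, not the SPY data

lag1 = collect(ShiftedArrays.lag(v, 1))  # shifted down by one position
lag2 = collect(ShiftedArrays.lag(v, 2))  # shifted down by two positions

# Leading rows that dropmissing! would remove: as many as the largest lag
nDropped = count(ismissing, lag2)
</code></pre></div></div>

<p>With <code class="language-plaintext highlighter-rouge">maxLag = 30</code>, the first 30 rows of <code class="language-plaintext highlighter-rouge">rawData</code> are dropped, so row <code class="language-plaintext highlighter-rouge">k</code> of <code class="language-plaintext highlighter-rouge">rawData</code> lines up with row <code class="language-plaintext highlighter-rouge">k + 30</code> of <code class="language-plaintext highlighter-rouge">spy</code> (assuming there are no other missing values). Any split by row index on <code class="language-plaintext highlighter-rouge">rawData</code> therefore covers slightly later dates than the same indices on <code class="language-plaintext highlighter-rouge">spy</code>.</p>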

<p>We then need to go from dataframes to matrices and flip the dimensions so each column is an observation rather than each row.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">y</span> <span class="o">=</span> <span class="n">permutedims</span><span class="x">(</span><span class="n">rawData</span><span class="o">.</span><span class="n">delta_vNorm</span><span class="x">)</span>
<span class="n">ts</span> <span class="o">=</span> <span class="n">rawData</span><span class="o">.</span><span class="n">t</span>
<span class="n">x</span> <span class="o">=</span> <span class="nd">@select</span><span class="x">(</span><span class="n">rawData</span><span class="x">,</span> <span class="n">Not</span><span class="x">(</span><span class="o">:</span><span class="n">delta_vNorm</span><span class="x">,</span> <span class="o">:</span><span class="n">t</span><span class="x">))</span>
<span class="n">x</span> <span class="o">=</span> <span class="n">permutedims</span><span class="x">(</span><span class="kt">Matrix</span><span class="x">(</span><span class="n">x</span><span class="x">));</span>
</code></pre></div></div>
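
<p>The shape change is easiest to see on a tiny example. The matrix and vector below are hypothetical stand-ins for the features and the target.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical data: 4 observations (rows) of 2 features (columns)
X = [1.0 10.0;
     2.0 20.0;
     3.0 30.0;
     4.0 40.0]
target = [0.1, 0.2, 0.3, 0.4]

x = permutedims(X)       # 2×4: one column per observation
y = permutedims(target)  # 1×4: a row matrix rather than a vector

thirdObs = x[:, 3]       # features of the third observation
</code></pre></div></div>

<p>Flux layers treat the last dimension as the observation index, which is why both <code class="language-plaintext highlighter-rouge">x</code> and <code class="language-plaintext highlighter-rouge">y</code> end up with one column per observation.</p>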

<p>Again, we do a train/test split.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">xTrain</span> <span class="o">=</span> <span class="n">x</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="mi">1</span><span class="o">:</span><span class="mi">2000</span><span class="x">]</span>
<span class="n">yTrain</span> <span class="o">=</span> <span class="n">y</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="mi">1</span><span class="o">:</span><span class="mi">2000</span><span class="x">]</span>
<span class="n">tsTrain</span> <span class="o">=</span> <span class="n">ts</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="mi">2000</span><span class="x">]</span>

<span class="n">xTest</span> <span class="o">=</span> <span class="n">x</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="mi">2001</span><span class="o">:</span><span class="k">end</span><span class="x">]</span>
<span class="n">yTest</span> <span class="o">=</span> <span class="n">y</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="mi">2001</span><span class="o">:</span><span class="k">end</span><span class="x">]</span>
<span class="n">tsTest</span> <span class="o">=</span> <span class="n">ts</span><span class="x">[</span><span class="mi">2001</span><span class="o">:</span><span class="k">end</span><span class="x">];</span>
</code></pre></div></div>

<p><a href="https://fluxml.ai/Flux.jl/stable/">Flux.jl</a> is Julia’s neural network library and the go-to for deep learning in Julia. It provides all the tools to build and train these types of models. One such tool is the <code class="language-plaintext highlighter-rouge">DataLoader</code>, which enables batch training for models. Batch training uses random subsets of the full data to train the model, which is very useful if you have too much data to fit into memory. You get to train the model on all your data by breaking it down into chunks.</p>

<p>Now, in this specific case, it isn’t needed as our data is small, but it’s always good to understand the techniques, and Flux makes it very simple. Pass in the <code class="language-plaintext highlighter-rouge">x</code> and <code class="language-plaintext highlighter-rouge">y</code> matrices, define the batch size and whether you want to randomise the samples or not.</p>

<p>Here we build random batches of 5.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">train_loader</span> <span class="o">=</span> <span class="n">Flux</span><span class="o">.</span><span class="n">DataLoader</span><span class="x">((</span><span class="n">x</span><span class="x">,</span> <span class="n">y</span><span class="x">),</span> <span class="n">batchsize</span><span class="o">=</span><span class="mi">5</span><span class="x">,</span> <span class="n">shuffle</span><span class="o">=</span><span class="nb">true</span><span class="x">);</span>
</code></pre></div></div>
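
<p>To get a feel for what the loader hands back, this sketch runs one pass over some random data and records the shape of each batch. The data is made up: 3 features and 12 observations.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using Flux

# Hypothetical data: 3 features, 12 observations stored as columns
x = rand(Float32, 3, 12)
y = rand(Float32, 1, 12)

loader = Flux.DataLoader((x, y), batchsize=5, shuffle=true)

# Shape of every batch in one pass over the data (one epoch)
batchSizes = [(size(xb), size(yb)) for (xb, yb) in loader]
</code></pre></div></div>

<p>Twelve observations do not divide evenly by 5, so with the default settings the final batch is smaller than the rest. Shuffling changes which observations land in each batch, not the batch shapes.</p>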

<p>Next, we need to build the model. In Flux, each layer of the basic net needs the number of input nodes and output nodes.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">flux_model</span> <span class="o">=</span> <span class="n">Dense</span><span class="x">(</span><span class="n">size</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="mi">1</span><span class="x">),</span> <span class="mi">1</span><span class="x">)</span>
</code></pre></div></div>

<p>We simply take the number of rows of the <code class="language-plaintext highlighter-rouge">x</code> matrix as the input size and output 1 number - the expected change in volume for that day.</p>
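
<p>Under the hood, a <code class="language-plaintext highlighter-rouge">Dense</code> layer with no activation is just a matrix multiplication plus a bias. A quick sketch to confirm this, using a made-up input size and random data, and assuming a recent Flux version where the pair syntax <code class="language-plaintext highlighter-rouge">Dense(in =&gt; out)</code> is available:</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using Flux

nFeatures = 4                     # hypothetical number of input rows
layer = Dense(nFeatures =&gt; 1)     # pair syntax for Dense(nFeatures, 1)

x = rand(Float32, nFeatures, 3)   # 3 observations as columns

# The same computation written out by hand
byHand = layer.weight * x .+ layer.bias
</code></pre></div></div>

<p>The output of <code class="language-plaintext highlighter-rouge">layer(x)</code> matches <code class="language-plaintext highlighter-rouge">byHand</code> and has one row with one column per observation. Since the default activation is the identity, this single layer has the same functional form as a linear regression; the difference is in how the weights are found.</p>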

<p>We also need to define a loss function for the model. We will use the mean square error (MSE). We predict the values from the model and calculate the MSE compared to the true values.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">function</span><span class="nf"> flux_loss</span><span class="x">(</span><span class="n">flux_model</span><span class="x">,</span> <span class="n">x</span><span class="x">,</span> <span class="n">y</span><span class="x">)</span>
    <span class="n">yhat</span> <span class="o">=</span> <span class="n">flux_model</span><span class="x">(</span><span class="n">x</span><span class="x">)</span>
    <span class="n">Flux</span><span class="o">.</span><span class="n">mse</span><span class="x">(</span><span class="n">yhat</span><span class="x">,</span> <span class="n">y</span><span class="x">)</span>
<span class="k">end</span>
</code></pre></div></div>

<p>A neural net has several parameters that we need to optimise using the training data. With each batch of data, we evaluate the loss function and use the gradient of the loss function to push the parameters in the right direction to minimise the loss. The mechanics of moving around the loss function are controlled by the optimiser. In this case, we will use regular gradient descent, but there are many different optimisers out there that Flux provides - <a href="https://fluxml.ai/Flux.jl/stable/reference/training/optimisers/#man-optimisers">Optimiser Reference</a>.</p>

<p>Again, Flux makes this easy to do out of the box without really needing to understand what’s happening behind the scenes. We set up a gradient descent optimiser with <code class="language-plaintext highlighter-rouge">Flux.setup(Descent(eta), flux_model)</code>, where <code class="language-plaintext highlighter-rouge">eta</code> (\(\eta\)) is the learning rate, and update the parameters after each batch of data.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">opt_state</span> <span class="o">=</span> <span class="n">Flux</span><span class="o">.</span><span class="n">setup</span><span class="x">(</span><span class="n">Descent</span><span class="x">(</span><span class="n">eta</span><span class="x">),</span> <span class="n">flux_model</span><span class="x">)</span>
<span class="n">l</span><span class="x">,</span> <span class="n">gs</span> <span class="o">=</span> <span class="n">Flux</span><span class="o">.</span><span class="n">withgradient</span><span class="x">(</span><span class="n">m</span> <span class="o">-&gt;</span> <span class="n">flux_loss</span><span class="x">(</span><span class="n">m</span><span class="x">,</span> <span class="n">x</span><span class="x">,</span> <span class="n">y</span><span class="x">),</span> <span class="n">flux_model</span><span class="x">)</span>
<span class="n">Flux</span><span class="o">.</span><span class="n">update!</span><span class="x">(</span><span class="n">opt_state</span><span class="x">,</span> <span class="n">flux_model</span><span class="x">,</span> <span class="n">gs</span><span class="x">[</span><span class="mi">1</span><span class="x">])</span>
</code></pre></div></div>
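
<p>For plain gradient descent, the update is simple enough to check by hand: each parameter moves by the learning rate times its gradient, \(\theta \leftarrow \theta - \eta \nabla_\theta L\). The sketch below takes one step on a tiny made-up problem and keeps a copy of the starting weights for comparison. The data, the model size and the learning rate are all illustrative.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using Flux

# Hypothetical data: 2 features, 3 observations
x = Float32[1 2 3; 4 5 6]
y = Float32[1 0 1]

model = Dense(2 =&gt; 1)
w0 = copy(model.weight)   # weights before the step
eta = 0.1f0               # learning rate

opt_state = Flux.setup(Descent(eta), model)
l, gs = Flux.withgradient(m -&gt; Flux.mse(m(x), y), model)
Flux.update!(opt_state, model, gs[1])

# What the step should be if done by hand
expected = w0 .- eta .* gs[1].weight
</code></pre></div></div>

<p>After the call to <code class="language-plaintext highlighter-rouge">Flux.update!</code>, <code class="language-plaintext highlighter-rouge">model.weight</code> agrees with <code class="language-plaintext highlighter-rouge">expected</code>. Other optimisers keep the same interface but carry extra state, such as momentum, inside <code class="language-plaintext highlighter-rouge">opt_state</code>.</p>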

<p>After all that, we throw everything into one function to easily iterate around the models. We are batch training with gradient descent and returning the trained model plus the loss history on both the full training set and the test set.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">function</span><span class="nf"> train</span><span class="x">(</span><span class="n">trainData</span><span class="x">,</span> <span class="n">testData</span><span class="x">,</span> <span class="n">flux_model</span><span class="x">,</span> <span class="n">flux_loss</span><span class="x">;</span> <span class="n">batchSize</span><span class="o">=</span><span class="mi">1024</span><span class="x">,</span> <span class="n">epochs</span><span class="o">=</span><span class="mi">10</span><span class="x">,</span> <span class="n">eta</span><span class="o">=</span><span class="mf">0.01</span><span class="x">)</span>
    <span class="x">(</span><span class="n">xTrain</span><span class="x">,</span> <span class="n">yTrain</span><span class="x">)</span> <span class="o">=</span> <span class="n">trainData</span>
    <span class="x">(</span><span class="n">xTest</span><span class="x">,</span> <span class="n">yTest</span><span class="x">)</span> <span class="o">=</span> <span class="n">testData</span>
    
    <span class="n">train_loader</span> <span class="o">=</span> <span class="n">Flux</span><span class="o">.</span><span class="n">DataLoader</span><span class="x">((</span><span class="n">xTrain</span><span class="x">,</span> <span class="n">yTrain</span><span class="x">),</span> <span class="n">batchsize</span><span class="o">=</span><span class="n">batchSize</span><span class="x">,</span> <span class="n">shuffle</span><span class="o">=</span><span class="nb">true</span><span class="x">);</span>
    <span class="n">opt_state</span> <span class="o">=</span> <span class="n">Flux</span><span class="o">.</span><span class="n">setup</span><span class="x">(</span><span class="n">Descent</span><span class="x">(</span><span class="n">eta</span><span class="x">),</span> <span class="n">flux_model</span><span class="x">);</span>
        
    <span class="n">allTrainLoss</span> <span class="o">=</span> <span class="n">zeros</span><span class="x">(</span><span class="n">epochs</span><span class="x">)</span>
    <span class="n">allTestLoss</span> <span class="o">=</span> <span class="n">zeros</span><span class="x">(</span><span class="n">epochs</span><span class="x">)</span>
    
    <span class="k">for</span> <span class="n">epoch</span> <span class="k">in</span> <span class="mi">1</span><span class="o">:</span><span class="n">epochs</span>
        <span class="n">loss</span> <span class="o">=</span> <span class="mf">0.0</span>
        <span class="k">for</span> <span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="n">y</span><span class="x">)</span> <span class="k">in</span> <span class="n">train_loader</span>
            <span class="n">l</span><span class="x">,</span> <span class="n">gs</span> <span class="o">=</span> <span class="n">Flux</span><span class="o">.</span><span class="n">withgradient</span><span class="x">(</span><span class="n">m</span> <span class="o">-&gt;</span> <span class="n">flux_loss</span><span class="x">(</span><span class="n">m</span><span class="x">,</span> <span class="n">x</span><span class="x">,</span> <span class="n">y</span><span class="x">),</span> <span class="n">flux_model</span><span class="x">)</span>
            <span class="n">Flux</span><span class="o">.</span><span class="n">update!</span><span class="x">(</span><span class="n">opt_state</span><span class="x">,</span> <span class="n">flux_model</span><span class="x">,</span> <span class="n">gs</span><span class="x">[</span><span class="mi">1</span><span class="x">])</span>
            <span class="n">loss</span> <span class="o">+=</span> <span class="n">l</span> <span class="o">/</span> <span class="n">length</span><span class="x">(</span><span class="n">train_loader</span><span class="x">)</span>
        <span class="k">end</span>
        <span class="n">train_loss</span> <span class="o">=</span> <span class="n">flux_loss</span><span class="x">(</span><span class="n">flux_model</span><span class="x">,</span> <span class="n">xTrain</span><span class="x">,</span> <span class="n">yTrain</span><span class="x">)</span>
        <span class="n">test_loss</span> <span class="o">=</span> <span class="n">flux_loss</span><span class="x">(</span><span class="n">flux_model</span><span class="x">,</span> <span class="n">xTest</span><span class="x">,</span> <span class="n">yTest</span><span class="x">)</span>
        <span class="n">allTrainLoss</span><span class="x">[</span><span class="n">epoch</span><span class="x">]</span> <span class="o">=</span> <span class="n">train_loss</span>
        <span class="n">allTestLoss</span><span class="x">[</span><span class="n">epoch</span><span class="x">]</span> <span class="o">=</span> <span class="n">test_loss</span>
        
    <span class="k">end</span>
    <span class="k">return</span> <span class="x">(</span><span class="n">flux_model</span><span class="x">,</span> <span class="n">allTrainLoss</span><span class="x">,</span> <span class="n">allTestLoss</span><span class="x">)</span>
<span class="k">end</span>
</code></pre></div></div>

<p>With the training function in place, let’s build some models!</p>

<h3 id="a-1-layer-neural-net">A 1 Layer Neural Net</h3>

<p>The simplest neural net is 1 layer with the features as an input and 1 value as the output. Nothing else! With no activation function, this is really just a linear model fitted by gradient descent.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">flux_model</span> <span class="o">=</span> <span class="n">Dense</span><span class="x">(</span><span class="n">size</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="mi">1</span><span class="x">),</span> <span class="mi">1</span><span class="x">)</span>
<span class="n">flux_model</span><span class="x">,</span> <span class="n">allTrainLoss</span><span class="x">,</span> <span class="n">allTestLoss</span> <span class="o">=</span> <span class="n">train</span><span class="x">((</span><span class="n">xTrain</span><span class="x">,</span> <span class="n">yTrain</span><span class="x">),</span> <span class="x">(</span><span class="n">xTest</span><span class="x">,</span> <span class="n">yTest</span><span class="x">),</span> <span class="n">flux_model</span><span class="x">,</span> <span class="n">flux_loss</span><span class="x">;</span> <span class="n">epochs</span> <span class="o">=</span> <span class="mi">1000</span><span class="x">,</span> <span class="n">eta</span><span class="o">=</span><span class="mf">1e-6</span><span class="x">);</span>
</code></pre></div></div>

<p><img src="/assets/deeplearning/part1/layer1_traing.png" alt="Line chart showing the training loss over epochs for a one-layer neural network model predicting daily trading volume changes. The chart displays a downward trend, indicating that the model loss decreases as training progresses." width="80%" class="center-image" /></p>

<p>You might notice something strange here: the test loss is smaller than the training loss. This is a quirk of this data set; the test set has a tighter distribution than the training data, which is easy to see in a histogram.</p>

<p><img src="/assets/deeplearning/part1/testtraindist.png" alt="Histogram comparing the distribution of daily trading volume changes for SPY ETF in the training and test datasets. The training set shows a wider spread and more extreme values, while the test set is more tightly clustered around the center. The chart highlights the difference in variability between the two datasets." width="80%" class="center-image" /></p>

<p>Like I said, it’s a quirk of the dataset, but something to bear in mind for the rest of the examples.</p>
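
<p>A histogram is one way to see this. A numerical check is to compare the MSE of a model that always predicts the mean, which equals the variance of the targets. The sketch below uses two made-up samples, one wide and one tight, to show how the same naive model scores very differently on each.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using Statistics

# Hypothetical targets: a wide "training" sample and a tighter "test" sample
yTrainDemo = [-30.0, -10.0, 0.0, 10.0, 30.0]
yTestDemo  = [-6.0, -2.0, 0.0, 2.0, 6.0]

# MSE of a model that always predicts the sample mean
baselineMse(y) = mean(abs2, y .- mean(y))

trainBaseline = baselineMse(yTrainDemo)
testBaseline  = baselineMse(yTestDemo)
</code></pre></div></div>

<p>The tighter sample gets a much lower baseline MSE without the model being any better, so a test loss below the training loss is not by itself a sign that something has gone wrong.</p>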

<p>Let’s look at the predicted values of this first neural net and how they line up with reality. Plus, we can compare it to the linear model. For the linear model, you just need to run <code class="language-plaintext highlighter-rouge">predict</code> and pass in the test dataset. Similarly, with the neural net, we evaluate the trained model on the testing matrix.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">nnTest</span> <span class="o">=</span> <span class="n">DataFrame</span><span class="x">(</span><span class="n">t</span><span class="o">=</span><span class="n">tsTest</span><span class="x">,</span> <span class="n">delta_vNorm_nn</span> <span class="o">=</span> <span class="n">vec</span><span class="x">(</span><span class="n">flux_model</span><span class="x">(</span><span class="n">xTest</span><span class="x">)</span><span class="err">'</span><span class="x">))</span>
<span class="n">spyTest</span><span class="o">.</span><span class="n">delta_vNorm_lin</span> <span class="o">=</span> <span class="n">predict</span><span class="x">(</span><span class="n">linearModel</span><span class="x">,</span> <span class="n">spyTest</span><span class="x">)</span>
<span class="n">spyTest</span> <span class="o">=</span> <span class="n">leftjoin</span><span class="x">(</span><span class="n">spyTest</span><span class="x">,</span> <span class="n">nnTest</span><span class="x">,</span> <span class="n">on</span> <span class="o">=</span> <span class="o">:</span><span class="n">t</span><span class="x">);</span>
</code></pre></div></div>

<p>As we are predicting the change in the daily volume, we need to add back in the previous value to get our predicted daily volume.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">spyTest</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">spyTest</span><span class="x">,</span> <span class="o">:</span><span class="n">v_nn</span> <span class="o">=</span> <span class="o">:</span><span class="n">prev_vNorm</span> <span class="o">.+</span> <span class="o">:</span><span class="n">delta_vNorm_nn</span><span class="x">,</span> <span class="o">:</span><span class="n">v_lin</span> <span class="o">=</span> <span class="o">:</span><span class="n">prev_vNorm</span> <span class="o">.+</span> <span class="o">:</span><span class="n">delta_vNorm_lin</span><span class="x">);</span>
<span class="n">sort!</span><span class="x">(</span><span class="n">spyTest</span><span class="x">,</span> <span class="o">:</span><span class="n">t</span><span class="x">);</span>
</code></pre></div></div>

<p>And then we plot the results.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">p</span> <span class="o">=</span> <span class="n">plot</span><span class="x">(</span><span class="n">spyTest</span><span class="o">.</span><span class="n">t</span><span class="x">,</span> <span class="n">spyTest</span><span class="o">.</span><span class="n">vNorm</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"True"</span><span class="x">,</span>  <span class="n">dpi</span><span class="o">=</span><span class="mi">300</span><span class="x">,</span> <span class="n">background_color</span> <span class="o">=</span> <span class="o">:</span><span class="n">transparent</span><span class="x">)</span>
<span class="n">p</span> <span class="o">=</span> <span class="n">plot!</span><span class="x">(</span><span class="n">p</span><span class="x">,</span> <span class="n">spyTest</span><span class="o">.</span><span class="n">t</span><span class="x">,</span> <span class="n">spyTest</span><span class="o">.</span><span class="n">v_nn</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"NN"</span><span class="x">)</span>
<span class="n">p</span> <span class="o">=</span> <span class="n">plot!</span><span class="x">(</span><span class="n">p</span><span class="x">,</span> <span class="n">spyTest</span><span class="o">.</span><span class="n">t</span><span class="x">,</span> <span class="n">spyTest</span><span class="o">.</span><span class="n">v_lin</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Linear"</span><span class="x">)</span>
<span class="n">p</span>
</code></pre></div></div>

<p><img src="/assets/deeplearning/part1/layer1_results.png" alt="Line chart comparing predicted and actual daily trading volumes for SPY ETF over time. The chart shows three lines: one representing true daily volumes, another representing neural network predictions and another showing the linear model predictions. All the lines follow a similar pattern." width="80%" class="center-image" /></p>

<p>Things line up quite well, nothing outrageous.</p>

<p>In terms of performance, we calculate the MSE from the dataframe.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">@combine</span><span class="x">(</span><span class="n">dropmissing</span><span class="x">(</span><span class="n">spyTest</span><span class="x">),</span> 
          <span class="o">:</span><span class="n">NN</span> <span class="o">=</span> <span class="n">mean</span><span class="x">((</span><span class="o">:</span><span class="n">vNorm</span> <span class="o">.-</span> <span class="o">:</span><span class="n">v_nn</span><span class="x">)</span><span class="o">.^</span><span class="mi">2</span><span class="x">),</span> 
          <span class="o">:</span><span class="n">Lin</span> <span class="o">=</span> <span class="n">mean</span><span class="x">((</span><span class="o">:</span><span class="n">vNorm</span> <span class="o">.-</span> <span class="o">:</span><span class="n">v_lin</span><span class="x">)</span><span class="o">.^</span><span class="mi">2</span><span class="x">))</span>
</code></pre></div></div>

<table>
  <thead>
    <tr>
      <th>NN</th>
      <th>Lin</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>405.55</td>
      <td>370.57</td>
    </tr>
  </tbody>
</table>

<p>The linear model is doing better so far.</p>

<h3 id="2-layer-neural-nets">2 Layer Neural Nets</h3>

<p>We are now in the realm of multi-layer perceptrons (MLPs) and have introduced many more parameters into the model. We can also now build more complicated interactions with each layer.</p>

<p>In Flux, building out more layers is simple; you are chaining different dense layers together. We are choosing to have a fully connected MLP with 2 layers, with all the variables passed through.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">flux_model2</span> <span class="o">=</span> <span class="n">Flux</span><span class="o">.</span><span class="n">Chain</span><span class="x">(</span><span class="n">Dense</span><span class="x">(</span><span class="n">size</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="mi">1</span><span class="x">),</span> <span class="n">size</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="mi">1</span><span class="x">)),</span> <span class="n">Dense</span><span class="x">(</span><span class="n">size</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="mi">1</span><span class="x">),</span> <span class="mi">1</span><span class="x">))</span>

<span class="n">flux_model2</span><span class="x">,</span> <span class="n">allTrainLoss</span><span class="x">,</span> <span class="n">allTestLoss</span> <span class="o">=</span> <span class="n">train</span><span class="x">((</span><span class="n">xTrain</span><span class="x">,</span> <span class="n">yTrain</span><span class="x">),</span> <span class="x">(</span><span class="n">xTest</span><span class="x">,</span> <span class="n">yTest</span><span class="x">),</span> <span class="n">flux_model2</span><span class="x">,</span> <span class="n">flux_loss</span><span class="x">;</span> <span class="n">epochs</span> <span class="o">=</span> <span class="mi">1000</span><span class="x">,</span> <span class="n">eta</span> <span class="o">=</span> <span class="mf">1e-6</span><span class="x">);</span>
</code></pre></div></div>
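
<p>A quick sanity check, assuming the layers are used exactly as written with no activation function passed: as far as I know, a Flux <code class="language-plaintext highlighter-rouge">Dense</code> layer defaults to the identity activation, so two stacked layers still compose into a single linear map. A minimal sketch in plain Julia, with made-up weights standing in for the fitted layers (none of these names or numbers come from the model above):</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical weights and biases for a 2-input, 2-hidden, 1-output network.
W1 = [1.0 2.0; 3.0 4.0]
b1 = [0.5, -0.5]
W2 = [2.0 -1.0]
b2 = [0.25]

# Two dense layers with identity activation applied one after the other.
two_layer(x) = W2 * (W1 * x .+ b1) .+ b2

# The equivalent single linear layer.
W = W2 * W1
b = W2 * b1 .+ b2
one_layer(x) = W * x .+ b

x = [1.0, -2.0]
two_layer(x) ≈ one_layer(x)  # true
</code></pre></div></div>

<p>Passing a non-linear activation (for example <code class="language-plaintext highlighter-rouge">relu</code>) to the first <code class="language-plaintext highlighter-rouge">Dense</code> is what would let the extra layer represent something a linear model cannot.</p>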

<p>This trains in the same amount of time with the same train/test loss pattern. Again, we assess the MSE of this bigger model.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">nnhTest</span> <span class="o">=</span> <span class="n">DataFrame</span><span class="x">(</span><span class="n">t</span><span class="o">=</span><span class="n">tsTest</span><span class="x">,</span> <span class="n">delta_vNorm_nnh</span> <span class="o">=</span> <span class="n">vec</span><span class="x">(</span><span class="n">flux_model2</span><span class="x">(</span><span class="n">xTest</span><span class="x">)</span><span class="err">'</span><span class="x">))</span>
<span class="n">spyTest</span> <span class="o">=</span> <span class="n">leftjoin</span><span class="x">(</span><span class="n">spyTest</span><span class="x">,</span> <span class="n">nnhTest</span><span class="x">,</span> <span class="n">on</span> <span class="o">=</span> <span class="o">:</span><span class="n">t</span><span class="x">);</span>

<span class="n">spyTest</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">spyTest</span><span class="x">,</span> <span class="o">:</span><span class="n">v_nnh</span> <span class="o">=</span> <span class="o">:</span><span class="n">prev_vNorm</span> <span class="o">.+</span> <span class="o">:</span><span class="n">delta_vNorm_nnh</span><span class="x">)</span>
<span class="nd">@combine</span><span class="x">(</span><span class="n">dropmissing</span><span class="x">(</span><span class="n">spyTest</span><span class="x">),</span> <span class="o">:</span><span class="n">NN</span> <span class="o">=</span> <span class="n">mean</span><span class="x">((</span><span class="o">:</span><span class="n">vNorm</span> <span class="o">.-</span> <span class="o">:</span><span class="n">v_nn</span><span class="x">)</span><span class="o">.^</span><span class="mi">2</span><span class="x">),</span> 
                               <span class="o">:</span><span class="n">Lin</span> <span class="o">=</span> <span class="n">mean</span><span class="x">((</span><span class="o">:</span><span class="n">vNorm</span> <span class="o">.-</span> <span class="o">:</span><span class="n">v_lin</span><span class="x">)</span><span class="o">.^</span><span class="mi">2</span><span class="x">),</span>
                               <span class="o">:</span><span class="n">NNH</span> <span class="o">=</span> <span class="n">mean</span><span class="x">((</span><span class="o">:</span><span class="n">vNorm</span> <span class="o">.-</span> <span class="o">:</span><span class="n">v_nnh</span><span class="x">)</span><span class="o">.^</span><span class="mi">2</span><span class="x">))</span>
</code></pre></div></div>

<table>
  <thead>
    <tr>
      <th>NN</th>
      <th>Lin</th>
      <th>NNH</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>405.55</td>
      <td>370.57</td>
      <td>401.424</td>
    </tr>
  </tbody>
</table>

<p>This has improved on the 1-layer neural net, but it is still no better than the linear model.</p>

<h2 id="neural-net-regularisation">Neural Net Regularisation</h2>

<p>The linear model has 13 parameters, the 1-layer neural net has 42 parameters, and the 2-layer net has 1,764 parameters. This is a rapid growth in complexity which raises the likelihood that the model starts to overfit. How do we make sure the neural net models only pick out the key parameters and regularise themselves?</p>
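
<p>To see where the growth comes from, here is a rough count of weights and biases for fully connected layers. The feature count <code class="language-plaintext highlighter-rouge">n = 41</code> is my inference, chosen because it reproduces the 42 and 1,764 quoted above; it is not stated in this section, and the helper name is made up.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Parameters in a dense layer: one weight per input-output pair plus one bias per output.
dense_params(n_in, n_out) = n_in * n_out + n_out

n = 41  # assumed number of input features

one_layer_total = dense_params(n, 1)                       # 42
two_layer_total = dense_params(n, n) + dense_params(n, 1)  # 1764
</code></pre></div></div>

<p>The hidden layer grows with the square of the number of features, which is why one extra layer adds so many parameters.</p>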

<p>We have two options: add a penalisation score in the loss function that bounds the total size of the coefficients or introduce something called a dropout layer.</p>

<h3 id="penalising-the-loss-function">Penalising the Loss Function</h3>

<p>You can extend regularisation to neural networks the same way you do for linear models. You add an additional term to the loss function that penalises the total combined size of the coefficients.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">function</span><span class="nf"> flux_loss_reg</span><span class="x">(</span><span class="n">flux_model</span><span class="x">,</span> <span class="n">x</span><span class="x">,</span> <span class="n">y</span><span class="x">)</span>
    <span class="c"># MSE plus an L2 penalty: the sum of squares of every trainable array</span>
    <span class="n">flux_loss</span><span class="x">(</span><span class="n">flux_model</span><span class="x">,</span> <span class="n">x</span><span class="x">,</span> <span class="n">y</span><span class="x">)</span> <span class="o">+</span> <span class="n">sum</span><span class="x">(</span><span class="n">p</span><span class="o">-&gt;</span><span class="n">sum</span><span class="x">(</span><span class="n">abs2</span><span class="x">,</span> <span class="n">p</span><span class="x">),</span> <span class="n">Flux</span><span class="o">.</span><span class="n">trainables</span><span class="x">(</span><span class="n">flux_model</span><span class="x">))</span>
<span class="k">end</span>
</code></pre></div></div>

<p>Therefore, if the model wants to allocate more weight to one parameter, it has to pay for it in the loss, so it will only do so when the improvement in fit outweighs the penalty. This acts as a balancing mechanism and should reduce the chance of overfitting.</p>
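
<p>The same idea in plain Julia, without Flux, as a minimal sketch. The strength <code class="language-plaintext highlighter-rouge">lambda</code> is a hypothetical knob that the loss above does not have (there it is effectively fixed at 1), and the parameter arrays are made up.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Sum of squared entries across every parameter array.
l2_penalty(params) = sum(sum(abs2, p) for p in params)

# Penalised loss: the fit term plus lambda times the penalty.
penalised_loss(mse, params; lambda = 1.0) = mse + lambda * l2_penalty(params)

params = ([1.0 -2.0; 0.5 0.0], [3.0])  # made-up weight matrix and bias

l2_penalty(params)                          # 1 + 4 + 0.25 + 0 + 9 = 14.25
penalised_loss(10.0, params)                # 24.25
penalised_loss(10.0, params; lambda = 0.5)  # 17.125
</code></pre></div></div>

<p>A smaller <code class="language-plaintext highlighter-rouge">lambda</code> lets the fit term dominate, a larger one pulls the coefficients harder towards zero.</p>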

<p>We use this new loss function with the 2-layer net.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">flux_model</span> <span class="o">=</span> <span class="n">Flux</span><span class="o">.</span><span class="n">Chain</span><span class="x">(</span><span class="n">Dense</span><span class="x">(</span><span class="n">size</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="mi">1</span><span class="x">),</span> <span class="n">size</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="mi">1</span><span class="x">)),</span> <span class="n">Dense</span><span class="x">(</span><span class="n">size</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="mi">1</span><span class="x">),</span> <span class="mi">1</span><span class="x">))</span>
<span class="n">flux_model</span><span class="x">,</span> <span class="n">allTrainLoss</span><span class="x">,</span> <span class="n">allTestLoss</span> <span class="o">=</span> <span class="n">train</span><span class="x">((</span><span class="n">xTrain</span><span class="x">,</span> <span class="n">yTrain</span><span class="x">),</span> <span class="x">(</span><span class="n">xTest</span><span class="x">,</span> <span class="n">yTest</span><span class="x">),</span> <span class="n">flux_model</span><span class="x">,</span> <span class="n">flux_loss_reg</span><span class="x">;</span> <span class="n">epochs</span> <span class="o">=</span> <span class="mi">1000</span><span class="x">,</span> <span class="n">eta</span> <span class="o">=</span> <span class="mf">1e-6</span><span class="x">);</span>
</code></pre></div></div>

<table>
  <thead>
    <tr>
      <th>NN</th>
      <th>Lin</th>
      <th>NNH</th>
      <th>NNHR</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>405.55</td>
      <td>370.57</td>
      <td>401.424</td>
      <td>388.548</td>
    </tr>
  </tbody>
</table>

<p>So slightly better than the unregularised version.</p>

<h3 id="neural-net-dropout-layers">Neural Net Dropout Layers</h3>

<p>An alternative way of regularising a network is to introduce a dropout layer. Dropout randomly sets the output of a node to zero during the training phase, which means the net has fewer active parameters to optimise over on each pass and reduces the possibility of overfitting. When it comes to inference, all of the nodes are included: in the original paper the weights are rescaled by the probability of a node being kept, whereas Flux’s <code class="language-plaintext highlighter-rouge">Dropout</code> rescales the surviving outputs by \(1/(1-p)\) during training so that nothing needs rescaling at inference. The original dropout paper is an engaging read - <a href="https://jmlr.org/papers/v15/srivastava14a.html">Dropout: A Simple Way to Prevent Neural Networks from Overfitting</a>.</p>
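
<p>To make the mechanics concrete, here is a minimal sketch of the ‘inverted’ form of dropout in plain Julia. The function names are made up for illustration and this is not Flux’s implementation.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using Random

# Inverted dropout: zero each entry with probability p and scale survivors by 1 / (1 - p).
function dropout_train(rng, x, p)
    keep = rand(rng, length(x)) .≥ p
    return x .* keep ./ (1 - p)
end

# At inference nothing is dropped and nothing is rescaled.
dropout_test(x) = x

rng = MersenneTwister(1)
x = ones(10_000)
y = dropout_train(rng, x, 0.5)

sort(unique(y))      # only 0.0 and 2.0 appear
sum(y) / length(y)   # close to 1.0, the same scale as the inference output
</code></pre></div></div>

<p>As I understand it, Flux switches between these two behaviours for you: <code class="language-plaintext highlighter-rouge">Dropout</code> is only active while gradients are being taken, and <code class="language-plaintext highlighter-rouge">Flux.testmode!</code> can be used to force the inference behaviour.</p>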

<p>Again, it is very simple to use dropout in Julia and Flux; it is just another type of layer.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">flux_model3</span> <span class="o">=</span> <span class="n">Flux</span><span class="o">.</span><span class="n">Chain</span><span class="x">(</span><span class="n">Dense</span><span class="x">(</span><span class="n">size</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="mi">1</span><span class="x">),</span> <span class="n">size</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="mi">1</span><span class="x">)),</span> <span class="n">Dropout</span><span class="x">(</span><span class="mf">0.5</span><span class="x">),</span> <span class="n">Dense</span><span class="x">(</span><span class="n">size</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="mi">1</span><span class="x">),</span> <span class="mi">1</span><span class="x">))</span>

<span class="n">flux_model3</span><span class="x">,</span> <span class="n">allTrainLoss</span><span class="x">,</span> <span class="n">allTestLoss</span> <span class="o">=</span> <span class="n">train</span><span class="x">((</span><span class="n">xTrain</span><span class="x">,</span> <span class="n">yTrain</span><span class="x">),</span> <span class="x">(</span><span class="n">xTest</span><span class="x">,</span> <span class="n">yTest</span><span class="x">),</span> <span class="n">flux_model3</span><span class="x">,</span> <span class="n">flux_loss</span><span class="x">;</span> <span class="n">epochs</span> <span class="o">=</span> <span class="mi">250</span><span class="x">,</span> <span class="n">eta</span> <span class="o">=</span> <span class="mf">1e-6</span><span class="x">);</span>
</code></pre></div></div>

<p>For the final time, let’s evaluate this model on the test set and calculate the MSE.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">nndTest</span> <span class="o">=</span> <span class="n">DataFrame</span><span class="x">(</span><span class="n">t</span><span class="o">=</span><span class="n">tsTest</span><span class="x">,</span> <span class="n">delta_vNorm_nnd</span> <span class="o">=</span> <span class="n">vec</span><span class="x">(</span><span class="n">flux_model3</span><span class="x">(</span><span class="n">xTest</span><span class="x">)</span><span class="err">'</span><span class="x">))</span>
<span class="n">spyTest</span> <span class="o">=</span> <span class="n">leftjoin</span><span class="x">(</span><span class="n">spyTest</span><span class="x">,</span> <span class="n">nndTest</span><span class="x">,</span> <span class="n">on</span> <span class="o">=</span> <span class="o">:</span><span class="n">t</span><span class="x">);</span>

<span class="n">spyTest</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">spyTest</span><span class="x">,</span> <span class="o">:</span><span class="n">v_nnd</span> <span class="o">=</span> <span class="o">:</span><span class="n">prev_vNorm</span> <span class="o">.+</span> <span class="o">:</span><span class="n">delta_vNorm_nnd</span><span class="x">)</span>
<span class="nd">@combine</span><span class="x">(</span><span class="n">dropmissing</span><span class="x">(</span><span class="n">spyTest</span><span class="x">),</span> <span class="o">:</span><span class="n">NN</span> <span class="o">=</span> <span class="n">mean</span><span class="x">((</span><span class="o">:</span><span class="n">vNorm</span> <span class="o">.-</span> <span class="o">:</span><span class="n">v_nn</span><span class="x">)</span><span class="o">.^</span><span class="mi">2</span><span class="x">),</span> 
                               <span class="o">:</span><span class="n">Lin</span> <span class="o">=</span> <span class="n">mean</span><span class="x">((</span><span class="o">:</span><span class="n">vNorm</span> <span class="o">.-</span> <span class="o">:</span><span class="n">v_lin</span><span class="x">)</span><span class="o">.^</span><span class="mi">2</span><span class="x">),</span>
                               <span class="o">:</span><span class="n">NNH</span> <span class="o">=</span> <span class="n">mean</span><span class="x">((</span><span class="o">:</span><span class="n">vNorm</span> <span class="o">.-</span> <span class="o">:</span><span class="n">v_nnh</span><span class="x">)</span><span class="o">.^</span><span class="mi">2</span><span class="x">),</span>
                               <span class="o">:</span><span class="n">NND</span> <span class="o">=</span> <span class="n">mean</span><span class="x">((</span><span class="o">:</span><span class="n">vNorm</span> <span class="o">.-</span> <span class="o">:</span><span class="n">v_nnd</span><span class="x">)</span><span class="o">.^</span><span class="mi">2</span><span class="x">))</span>
</code></pre></div></div>

<table>
  <thead>
    <tr>
      <th>NN</th>
      <th>Lin</th>
      <th>NNH</th>
      <th>NNHR</th>
      <th>NNHD</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>405.55</td>
      <td>370.57</td>
      <td>401.424</td>
      <td>388.548</td>
      <td>411.105</td>
    </tr>
  </tbody>
</table>

<p>The worst model so far!</p>

<h2 id="conclusion">Conclusion</h2>

<p>So the linear model is still winning. The neural net and its various iterations haven’t improved on this simple model, and the best neural net was the 2-layer with regularisation.</p>

<p>It must be noted that this problem isn’t exactly hard, and the amount of data is relatively small, so it is unsurprising that the added complexity of the neural nets hasn’t added anything. It’s hardly a ‘deep learning’ problem!</p>

<p>I’ve also not gone crazy with the neural net optimisations. You can include more layers, change the number of nodes in the layers, change the activation functions, and change the loss function - all sorts of things that could be tweaked to improve the model.</p>

<p>Hopefully I’ve not just added to the slop of neural net finance tutorials and you’ve found something useful. Unfortunately, the neural nets haven’t beaten the linear model, which shows you can’t just jump into the fancy tools without looking at the simpler models.</p>

<h2 id="other-juliafinance-posts">Other Julia/Finance Posts</h2>

<p>For more quant finance tutorials check out some of my older posts.</p>

<ul>
  <li><a href="https://dm13450.github.io/2025/03/14/Fitting-Price-Impact-Models.html">Fitting Price Impact</a></li>
  <li><a href="https://dm13450.github.io/2024/02/08/Cross-Asset-Skew-A-Trading-Strategy.html">Cross Asset Skew - A Trading Strategy</a></li>
  <li><a href="https://dm13450.github.io/2023/07/15/Stat-Arb-Walkthrough.html">Stat Arb - An Easy Walkthrough</a></li>
</ul>]]></content><author><name>Dean Markwick</name></author><category term="julia" /><category term="quant" /><category term="deep-learning" /><summary type="html"><![CDATA[I’m fortunate enough to be participating in a lecture series at work that covers deep learning and its applications in finance. This will be a series of posts documenting what I learn and implementing the ‘homework’ (I’m 32, how am I still getting homework?) using Julia and Flux.]]></summary></entry><entry><title type="html">Cyclical Embedding</title><link href="https://dm13450.github.io/2025/06/16/Cyclical-Embedding.html" rel="alternate" type="text/html" title="Cyclical Embedding" /><published>2025-06-16T00:00:00+00:00</published><updated>2025-06-16T00:00:00+00:00</updated><id>https://dm13450.github.io/2025/06/16/Cyclical-Embedding</id><content type="html" xml:base="https://dm13450.github.io/2025/06/16/Cyclical-Embedding.html"><![CDATA[<p>Cyclical embedding (or encoding) is a basic transformation for numerical variables that follow a cycle. Let’s explore how they work.</p>

<p>I am currently attending a Deep Learning in Finance lecture series (lectured by Stefan Zohren in preparation for his new book). The ongoing homework is taking a basic time series model and applying the various deep learning techniques. In the process of doing this homework, I’ve come across Cyclical Embeddings and how they are used to transform variables that move in a cycle into something a model can understand.</p>

<p>Consider this blog post me reading this Kaggle notebook: <a href="https://www.kaggle.com/code/avanwyk/encoding-cyclical-features-for-deep-learning">Encoding Cyclical Features for Deep Learning</a>, converting it to Julia and using some examples to convince myself Cyclical Embeddings work and are useful.</p>


<p>Cyclical variables are especially pertinent in Finance. For example, you could include the day of the week in a model either as a factor (the label directly) or as a number (Mon=1, Tue=2, etc.). Using a factor, your model now includes 5 additional parameters. If you use the number, you’ll have to specify the form of the relationship (linear or using a GAM). Each has its ups and downs, but there is also a key piece of information missing: the days of the week form a cycle where 1 follows from 5. How can we translate this into something the model will understand?</p>

<p>As the name suggests, cyclical embeddings lead to a cycle, and the natural functions are the trigonometric sin and cos. We take the one-dimensional variable and transform it into two dimensions:</p>

\[\begin{align*}
x &amp; = \sin \left( \frac{2 \pi t}{\text{max} (t)} \right), \\
y &amp; = \cos \left( \frac{2 \pi t}{\text{max} (t)} \right).
\end{align*}\]

<p>If we apply this transformation to our day of the week we go from \(t \in [0, 4]\) to a circle in \(x\) and \(y\).</p>

<p><img src="/assets/CyclicalEmbedding/example.png" alt="A two-dimensional plot showing the cyclical embedding of days of the week, where each day is represented as a point on a circle using sine and cosine transformations. The points form a closed loop, visually demonstrating the cyclical nature of the days." width="80%" class="center-image" /></p>

<p>I am reminded of polar coordinates, and we can now see that Monday is the same distance from Friday as it is from Tuesday.
Crucially, the new variables are nicely bounded between -1 and 1, which is always helpful when building models.
All in, this looks like a sensible transformation; now to see if it makes a noticeable difference in modelling performance.</p>
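
<p>A small check of the distance claim, as a sketch with made-up helper names. It assumes the divisor is the length of the cycle (5 for a zero-based Monday to Friday), which is what makes the loop close up properly.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Encode a zero-based cyclical value t with the given period (number of distinct values).
cyclical_point(t, period) = (sin(2π * t / period), cos(2π * t / period))

points = [cyclical_point(t, 5) for t in 0:4]  # Monday = 0, ..., Friday = 4

# Euclidean distance between two encoded days.
dist(a, b) = hypot(a[1] - b[1], a[2] - b[2])

dist(points[1], points[2])  # Monday to Tuesday
dist(points[1], points[5])  # Monday to Friday, the same up to floating-point error

# Dividing by the largest value (4) instead puts Monday and Friday on the same point.
dist(cyclical_point(0, 4), cyclical_point(4, 4))  # effectively zero
</code></pre></div></div>

<p>So the value passed as the maximum matters: for zero-based values it wants to be one more than the largest value, otherwise the first and last points of the cycle coincide.</p>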

<h2 id="practical-cyclical-embeddings---daily-volumes">Practical Cyclical Embeddings - Daily Volumes</h2>

<p>Let’s model the daily trading volume of a stock. It feels logical that the day of the week (Mon-Fri), day of the month (1-31) and month (1-12) would affect the amount traded. The summer months might be quieter, the end of the month might be busier (month-end rebalancing) and Fridays might be quieter. All three of these time variables are cyclical so the cyclical embeddings should help.</p>

<p>We have 3 separate choices:</p>

<ol>
  <li>Everything as a number (3 free parameters)</li>
  <li>Days of the week and months as factors (5 + 12 + 1 free parameters)</li>
  <li>Cyclically embed the three variables (3x2=6 parameters)</li>
</ol>

<p>So a balance between the number of parameters and the flexibility of the model.</p>

<p>We will use a simple linear model, nothing fancy.</p>

<p>As always we will be in Julia.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">using</span> <span class="n">Dates</span><span class="x">,</span> <span class="n">AlpacaMarkets</span><span class="x">,</span> <span class="n">Plots</span><span class="x">,</span> <span class="n">StatsBase</span><span class="x">,</span> <span class="n">GLM</span>
<span class="k">using</span> <span class="n">DataFramesMeta</span><span class="x">,</span> <span class="n">CategoricalArrays</span><span class="x">,</span> <span class="n">ShiftedArrays</span>
</code></pre></div></div>

<p>To load the data in we will use my <a href="https://github.com/dm13450/AlpacaMarkets.jl">AlpacaMarkets.jl</a> API and pull in as much daily data as possible.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">aaplRaw</span><span class="x">,</span> <span class="n">npt</span> <span class="o">=</span> <span class="n">AlpacaMarkets</span><span class="o">.</span><span class="n">stock_bars</span><span class="x">(</span><span class="s">"AAPL"</span><span class="x">,</span> <span class="s">"1Day"</span><span class="x">;</span> <span class="n">startTime</span><span class="o">=</span><span class="kt">Date</span><span class="x">(</span><span class="s">"2000-01-01"</span><span class="x">),</span> <span class="n">endTime</span> <span class="o">=</span> <span class="n">today</span><span class="x">()</span> <span class="o">-</span> <span class="kt">Day</span><span class="x">(</span><span class="mi">2</span><span class="x">),</span> <span class="n">adjustment</span> <span class="o">=</span> <span class="s">"all"</span><span class="x">,</span> <span class="n">limit</span> <span class="o">=</span> <span class="mi">10000</span><span class="x">)</span>
</code></pre></div></div>

<p>Some basic cleaning and formatting.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">aapl</span> <span class="o">=</span> <span class="n">aaplRaw</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="x">[</span><span class="o">:</span><span class="n">t</span><span class="x">,</span> <span class="o">:</span><span class="n">v</span><span class="x">]]</span>
<span class="n">aapl</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"t"</span><span class="x">]</span> <span class="o">=</span> <span class="kt">DateTime</span><span class="o">.</span><span class="x">(</span><span class="n">chop</span><span class="o">.</span><span class="x">(</span><span class="n">aapl</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"t"</span><span class="x">]))</span>
</code></pre></div></div>

<p>Julia makes it easy to add the factor variables and the numeric versions. As the numeric values all start at 1 we subtract one so they begin at 0.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">aapl</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">DayName</span><span class="x">]</span> <span class="o">=</span> <span class="n">CategoricalArray</span><span class="x">(</span><span class="n">dayname</span><span class="o">.</span><span class="x">(</span><span class="n">aapl</span><span class="o">.</span><span class="n">t</span><span class="x">))</span>
<span class="n">aapl</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">MonthName</span><span class="x">]</span> <span class="o">=</span> <span class="n">CategoricalArray</span><span class="x">(</span><span class="n">monthname</span><span class="o">.</span><span class="x">(</span><span class="n">aapl</span><span class="o">.</span><span class="n">t</span><span class="x">))</span>

<span class="n">aapl</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">DayOfMonth</span><span class="x">]</span> <span class="o">=</span> <span class="n">dayofmonth</span><span class="o">.</span><span class="x">(</span><span class="n">aapl</span><span class="o">.</span><span class="n">t</span><span class="x">)</span> <span class="o">.-</span> <span class="mi">1</span>
<span class="n">aapl</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">DayOfWeek</span><span class="x">]</span> <span class="o">=</span> <span class="n">dayofweek</span><span class="o">.</span><span class="x">(</span><span class="n">aapl</span><span class="o">.</span><span class="n">t</span><span class="x">)</span> <span class="o">.-</span> <span class="mi">1</span>
<span class="n">aapl</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">MonthOfYear</span><span class="x">]</span> <span class="o">=</span> <span class="n">month</span><span class="o">.</span><span class="x">(</span><span class="n">aapl</span><span class="o">.</span><span class="n">t</span><span class="x">)</span> <span class="o">.-</span> <span class="mi">1</span><span class="x">;</span>
</code></pre></div></div>
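
<p>As a quick illustration of what these <code class="language-plaintext highlighter-rouge">Dates</code> functions return, here they are on a single date (picked arbitrarily, it is not taken from the data):</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using Dates

d = Date(2025, 6, 16)  # a Monday

dayname(d)         # "Monday"
monthname(d)       # "June"
dayofweek(d) - 1   # 0, since Monday is 1 in Julia
dayofmonth(d) - 1  # 15
month(d) - 1       # 5
</code></pre></div></div>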

<p>We normalise the volume to millions of shares and take the difference.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">aapl</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">vNorm</span><span class="x">]</span> <span class="o">=</span> <span class="n">aapl</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">v</span><span class="x">]</span> <span class="o">.*</span> <span class="mf">1e-6</span><span class="x">;</span>
<span class="n">aapl</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">delta_vNorm</span><span class="x">]</span> <span class="o">=</span> <span class="n">aapl</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">vNorm</span><span class="x">]</span> <span class="o">.-</span> <span class="n">ShiftedArrays</span><span class="o">.</span><span class="n">lag</span><span class="x">(</span><span class="n">aapl</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">vNorm</span><span class="x">]);</span>
</code></pre></div></div>
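
<p><code class="language-plaintext highlighter-rouge">ShiftedArrays.lag</code> shifts the series by one step and pads the start with <code class="language-plaintext highlighter-rouge">missing</code>, so the first difference is missing. A sketch of the same calculation using only Base Julia, on made-up volumes:</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>v = [10.0, 12.0, 9.0, 15.0]  # made-up volumes in millions

# Previous day's value, with missing for the first day.
prev = [missing; v[1:end-1]]

# Day-on-day change; the first entry stays missing.
delta = v .- prev  # [missing, 2.0, -3.0, 6.0]
</code></pre></div></div>

<p>The first row has no previous day, so it carries a missing value that needs handling before fitting.</p>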

<p>The regular volumes (<code class="language-plaintext highlighter-rouge">vNorm</code>) aren’t stationary: we can see a clear trend that changes over time, so it’s better to model the day-to-day difference in volumes.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">plot</span><span class="x">(</span><span class="n">plot</span><span class="x">(</span><span class="n">aapl</span><span class="o">.</span><span class="n">t</span><span class="x">,</span> <span class="n">aapl</span><span class="o">.</span><span class="n">vNorm</span><span class="x">,</span> <span class="n">title</span> <span class="o">=</span> <span class="s">"Volume"</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="o">:</span><span class="n">none</span><span class="x">),</span> 
     <span class="n">plot</span><span class="x">(</span><span class="n">aapl</span><span class="o">.</span><span class="n">t</span><span class="x">,</span> <span class="n">aapl</span><span class="o">.</span><span class="n">delta_vNorm</span><span class="x">,</span> <span class="n">title</span> <span class="o">=</span> <span class="s">"Volume Difference"</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="o">:</span><span class="n">none</span><span class="x">),</span> <span class="n">layout</span><span class="o">=</span><span class="x">(</span><span class="mi">2</span><span class="x">,</span><span class="mi">1</span><span class="x">))</span>
</code></pre></div></div>

<p><img src="/assets/CyclicalEmbedding/volumes.png" alt="Two line plots showing daily trading volumes for AAPL over time. The first plot displays significant fluctuations and trends, with periods of higher and lower trading activity. The second plot is the difference in trading volumes between the days and doesn't have a trend." width="80%" class="center-image" /></p>

<p>To apply the cyclical encoding we need to take one column and turn it into two.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">function</span><span class="nf"> cyclical_encode</span><span class="x">(</span><span class="n">df</span><span class="x">,</span> <span class="n">col</span><span class="x">,</span> <span class="n">max</span><span class="x">)</span>
    <span class="n">df</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="kt">Symbol</span><span class="x">(</span><span class="s">"</span><span class="si">$(col)</span><span class="s">_sin"</span><span class="x">)]</span> <span class="o">=</span> <span class="n">sin</span><span class="o">.</span><span class="x">(</span><span class="mi">2</span> <span class="o">.*</span> <span class="nb">pi</span> <span class="o">.*</span> <span class="n">df</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="kt">Symbol</span><span class="x">(</span><span class="n">col</span><span class="x">)]</span><span class="o">/</span><span class="n">max</span><span class="x">)</span>
    <span class="n">df</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="kt">Symbol</span><span class="x">(</span><span class="s">"</span><span class="si">$(col)</span><span class="s">_cos"</span><span class="x">)]</span> <span class="o">=</span> <span class="n">cos</span><span class="o">.</span><span class="x">(</span><span class="mi">2</span> <span class="o">.*</span> <span class="nb">pi</span> <span class="o">.*</span> <span class="n">df</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="kt">Symbol</span><span class="x">(</span><span class="n">col</span><span class="x">)]</span><span class="o">/</span><span class="n">max</span><span class="x">)</span>
    <span class="n">df</span>
<span class="k">end</span>

<span class="k">for</span> <span class="n">col</span> <span class="k">in</span> <span class="x">[</span><span class="s">"DayOfWeek"</span><span class="x">,</span> <span class="s">"DayOfMonth"</span><span class="x">,</span> <span class="s">"MonthOfYear"</span><span class="x">]</span>
    <span class="n">aapl</span> <span class="o">=</span> <span class="n">cyclical_encode</span><span class="x">(</span><span class="n">aapl</span><span class="x">,</span> <span class="n">col</span><span class="x">,</span> <span class="n">maximum</span><span class="x">(</span><span class="n">aapl</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="n">col</span><span class="x">]))</span>
<span class="k">end</span>
</code></pre></div></div>

<p>If you’ve not seen it before, <code class="language-plaintext highlighter-rouge">$</code> interpolates a variable into a string, much like Python f-strings.</p>
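
<p>To see what the encoding buys us, here is a small sketch on plain numbers rather than the data frame. The helper names <code class="language-plaintext highlighter-rouge">encode</code> and <code class="language-plaintext highlighter-rouge">dist</code> are made up for illustration, and the period of 5 assumes the days are numbered 1 (Monday) to 5 (Friday). As raw numbers Friday and Monday are 4 apart, but once encoded they should be as close as any other pair of neighbouring days.</p>

```julia
# Plain-number version of the sin/cos encoding (no DataFrame needed)
encode(x, period) = (sin(2 * pi * x / period), cos(2 * pi * x / period))

# Euclidean distance between two encoded points
dist(a, b) = sqrt((a[1] - b[1])^2 + (a[2] - b[2])^2)

period = 5                      # trading days, Monday = 1 ... Friday = 5
mon, tue, fri = encode(1, period), encode(2, period), encode(5, period)

# Friday to Monday is the same size step as Monday to Tuesday
println(isapprox(dist(fri, mon), dist(mon, tue)))

# String interpolation builds the new column names
col = "DayOfWeek"
println(Symbol("$(col)_sin"))
```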

<p>We do the normal test/train split.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">aaplTrain</span> <span class="o">=</span> <span class="n">aapl</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="mi">2000</span><span class="x">,</span><span class="o">:</span><span class="x">]</span>
<span class="n">aaplTest</span> <span class="o">=</span> <span class="n">aapl</span><span class="x">[</span><span class="mi">2001</span><span class="o">:</span><span class="k">end</span><span class="x">,</span><span class="o">:</span><span class="x">];</span>
</code></pre></div></div>

<p>Now to build the three models.</p>

<p>The numerical model takes in the numbers directly.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">numModel</span> <span class="o">=</span> <span class="n">lm</span><span class="x">(</span><span class="nd">@formula</span><span class="x">(</span><span class="n">delta_vNorm</span> <span class="o">~</span> <span class="n">DayOfWeek</span> <span class="o">+</span> <span class="n">MonthOfYear</span> <span class="o">+</span> <span class="n">DayOfMonth</span><span class="x">),</span> <span class="n">aaplTrain</span><span class="x">)</span>
</code></pre></div></div>

<p>The factor model represents the day of the week and month of the year as categories so each level gets a separate parameter, while the day of the month stays numeric.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">factorModel</span> <span class="o">=</span> <span class="n">lm</span><span class="x">(</span><span class="nd">@formula</span><span class="x">(</span><span class="n">delta_vNorm</span> <span class="o">~</span> <span class="n">DayName</span> <span class="o">+</span> <span class="n">MonthName</span> <span class="o">+</span> <span class="n">DayOfMonth</span> <span class="o">+</span> <span class="mi">0</span><span class="x">),</span> <span class="n">aaplTrain</span><span class="x">)</span>
</code></pre></div></div>

<p>The embedding model takes in the sin/cos transformation of each of the variables.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">embeddingModel</span> <span class="o">=</span> <span class="n">lm</span><span class="x">(</span><span class="nd">@formula</span><span class="x">(</span><span class="n">delta_vNorm</span> <span class="o">~</span> <span class="n">DayOfWeek_sin</span> <span class="o">+</span> <span class="n">DayOfWeek_cos</span> <span class="o">+</span> <span class="n">DayOfMonth_sin</span> <span class="o">+</span> <span class="n">DayOfMonth_cos</span> <span class="o">+</span> <span class="n">MonthOfYear_sin</span> <span class="o">+</span> <span class="n">MonthOfYear_cos</span><span class="x">),</span> <span class="n">aaplTrain</span><span class="x">);</span>
</code></pre></div></div>

<p>To assess how well the models perform we look at the RMSE (in sample and out of sample), AIC (in sample) and \(R^2\) (in sample and out of sample).</p>

<table>
  <thead>
    <tr>
      <th>Model</th>
      <th>Coefficients</th>
      <th>RMSE (in sample)</th>
      <th>RMSE (out of sample)</th>
      <th>AIC</th>
      <th>\(R^2\) (in sample)</th>
      <th>\(R^2\) (out of sample)</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Numeric</td>
      <td>4</td>
      <td>31.1041</td>
      <td>50.2975</td>
      <td>21346.9</td>
      <td>0.0336539</td>
      <td>0.0396665</td>
    </tr>
    <tr>
      <td>Factor</td>
      <td>17</td>
      <td>31.2978</td>
      <td>50.0453</td>
      <td>21352.8</td>
      <td>0.0433269</td>
      <td>0.0276647</td>
    </tr>
    <tr>
      <td>Embedding</td>
      <td>7</td>
      <td>31.7484</td>
      <td>51.1591</td>
      <td>21420.8</td>
      <td>0.0002655</td>
      <td>-0.000531</td>
    </tr>
  </tbody>
</table>

<p>Interestingly, the embedding model performs the worst both in sample and out of sample: it has the highest RMSE and AIC, and an \(R^2\) of essentially zero.</p>
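
<p>For reference, here is a sketch of the conventional RMSE and \(R^2\) definitions on toy numbers. The exact code used to produce the table isn’t shown, so treat these as the textbook versions rather than the precise calculation. Under this definition \(R^2\) compares the model against simply predicting the mean, which is why an out-of-sample value can dip below zero.</p>

```julia
using Statistics

# Root mean squared error between observations and predictions
rmse(y, yhat) = sqrt(mean((y .- yhat) .^ 2))

# R^2 = 1 - SS_res / SS_tot, measured against the mean of y
r2(y, yhat) = 1 - sum((y .- yhat) .^ 2) / sum((y .- mean(y)) .^ 2)

# Toy observations and predictions (illustrative values only)
y    = [1.0, 2.0, 3.0, 4.0]
yhat = [1.5, 1.5, 3.5, 3.5]

println((rmse = rmse(y, yhat), r2 = r2(y, yhat)))

# Predicting the mean everywhere gives an R^2 of zero
println(r2(y, fill(mean(y), length(y))))
```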

<p>When we pull out the Day of the Week effect it’s easy to see what the model has learnt.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">params</span> <span class="o">=</span> <span class="kt">Dict</span><span class="x">(</span><span class="n">zip</span><span class="x">(</span><span class="n">coefnames</span><span class="x">(</span><span class="n">embeddingModel</span><span class="x">),</span> <span class="n">coef</span><span class="x">(</span><span class="n">embeddingModel</span><span class="x">)))</span>

<span class="n">x</span> <span class="o">=</span> <span class="mi">0</span><span class="o">:</span><span class="mf">0.1</span><span class="o">:</span><span class="mi">4</span>
<span class="n">ySin</span> <span class="o">=</span> <span class="n">params</span><span class="x">[</span><span class="s">"DayOfWeek_sin"</span><span class="x">]</span> <span class="o">*</span> <span class="n">sin</span><span class="o">.</span><span class="x">(</span><span class="mi">2</span> <span class="o">.*</span> <span class="nb">pi</span> <span class="o">.*</span> <span class="n">x</span> <span class="o">./</span> <span class="n">maximum</span><span class="x">(</span><span class="n">x</span><span class="x">))</span>
<span class="n">yCos</span> <span class="o">=</span> <span class="n">params</span><span class="x">[</span><span class="s">"DayOfWeek_cos"</span><span class="x">]</span> <span class="o">*</span> <span class="n">cos</span><span class="o">.</span><span class="x">(</span><span class="mi">2</span> <span class="o">.*</span> <span class="nb">pi</span> <span class="o">.*</span> <span class="n">x</span> <span class="o">./</span> <span class="n">maximum</span><span class="x">(</span><span class="n">x</span><span class="x">))</span>


<span class="n">p</span> <span class="o">=</span> <span class="n">plot</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="n">ySin</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Sin"</span><span class="x">)</span>
<span class="n">plot!</span><span class="x">(</span><span class="n">p</span><span class="x">,</span> <span class="n">x</span><span class="x">,</span> <span class="n">yCos</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Cos"</span><span class="x">)</span>
<span class="n">plot!</span><span class="x">(</span><span class="n">p</span><span class="x">,</span> <span class="n">x</span><span class="x">,</span> <span class="n">yCos</span> <span class="o">.+</span> <span class="n">ySin</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Combined"</span><span class="x">)</span>
</code></pre></div></div>

<p><img src="/assets/CyclicalEmbedding/dayofweek.png" alt="Circular plot illustrating the cyclical embedding of days of the week effect from the model." width="80%" class="center-image" /></p>

<p>This indicates the lower volume changes are on Tuesday and the higher volume changes are on Thursday.</p>

<p>Based on the model performance it’s not a great showing for the embedding transformation. Let’s move on to another example where the cyclical nature might be more obvious.</p>

<h2 id="practical-cyclical-embeddings---intraday-volumes">Practical Cyclical Embeddings - Intraday Volumes</h2>

<p>Another example would be the flow of trades over the day. In this case, the hour is the variable we will cyclically embed. For this, we use hourly BTC/USD bars from AlpacaMarkets.jl and aggregate the volume by hour of the day.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">btcRaw</span><span class="x">,</span> <span class="n">token</span> <span class="o">=</span> <span class="n">AlpacaMarkets</span><span class="o">.</span><span class="n">crypto_bars</span><span class="x">(</span><span class="s">"BTC/USD"</span><span class="x">,</span> <span class="s">"1H"</span><span class="x">;</span> <span class="n">startTime</span><span class="o">=</span><span class="kt">Date</span><span class="x">(</span><span class="s">"2025-01-01"</span><span class="x">),</span> <span class="n">limit</span> <span class="o">=</span> <span class="mi">10000</span><span class="x">)</span>

<span class="n">res</span> <span class="o">=</span> <span class="x">[</span><span class="n">btcRaw</span><span class="x">]</span>
<span class="k">while</span> <span class="o">!</span><span class="x">(</span><span class="n">isnothing</span><span class="x">(</span><span class="n">token</span><span class="x">)</span> <span class="o">||</span> <span class="n">isempty</span><span class="x">(</span><span class="n">token</span><span class="x">))</span>
    <span class="n">println</span><span class="x">(</span><span class="n">token</span><span class="x">)</span>
    <span class="n">newtrades</span><span class="x">,</span> <span class="n">token</span> <span class="o">=</span> <span class="n">AlpacaMarkets</span><span class="o">.</span><span class="n">crypto_bars</span><span class="x">(</span><span class="s">"BTC/USD"</span><span class="x">,</span> <span class="s">"1H"</span><span class="x">;</span> <span class="n">startTime</span><span class="o">=</span><span class="kt">Date</span><span class="x">(</span><span class="s">"2025-01-01"</span><span class="x">),</span> <span class="n">limit</span> <span class="o">=</span> <span class="mi">10000</span><span class="x">,</span> <span class="n">page_token</span> <span class="o">=</span> <span class="n">token</span><span class="x">)</span>
    <span class="n">println</span><span class="x">((</span><span class="n">minimum</span><span class="x">(</span><span class="n">newtrades</span><span class="o">.</span><span class="n">t</span><span class="x">),</span> <span class="n">maximum</span><span class="x">(</span><span class="n">newtrades</span><span class="o">.</span><span class="n">t</span><span class="x">)))</span>
    <span class="n">append!</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="x">[</span><span class="n">newtrades</span><span class="x">])</span>
    <span class="n">sleep</span><span class="x">(</span><span class="n">AlpacaMarkets</span><span class="o">.</span><span class="n">SLEEP_TIME</span><span class="x">[])</span>
<span class="k">end</span>
<span class="n">res</span> <span class="o">=</span> <span class="n">vcat</span><span class="x">(</span><span class="n">res</span><span class="o">...</span><span class="x">);</span>
</code></pre></div></div>

<p>Sidenote, I do need to wrap this functionality into the package itself.</p>

<p>We get the raw data into a suitable state.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">btc</span> <span class="o">=</span> <span class="n">res</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="x">[</span><span class="o">:</span><span class="n">t</span><span class="x">,</span> <span class="o">:</span><span class="n">v</span><span class="x">]]</span>
<span class="n">btc</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"t"</span><span class="x">]</span> <span class="o">=</span> <span class="kt">DateTime</span><span class="o">.</span><span class="x">(</span><span class="n">chop</span><span class="o">.</span><span class="x">(</span><span class="n">btc</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"t"</span><span class="x">]));</span>

<span class="n">btc</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">btc</span><span class="x">,</span> <span class="o">:</span><span class="kt">Date</span> <span class="o">=</span> <span class="kt">Date</span><span class="o">.</span><span class="x">(</span><span class="o">:</span><span class="n">t</span><span class="x">),</span> <span class="o">:</span><span class="kt">Time</span> <span class="o">=</span> <span class="kt">Time</span><span class="o">.</span><span class="x">(</span><span class="o">:</span><span class="n">t</span><span class="x">),</span> <span class="o">:</span><span class="n">DayOfWeek</span> <span class="o">=</span> <span class="n">dayofweek</span><span class="o">.</span><span class="x">(</span><span class="o">:</span><span class="n">t</span><span class="x">),</span> <span class="o">:</span><span class="kt">Hour</span> <span class="o">=</span> <span class="n">hour</span><span class="o">.</span><span class="x">(</span><span class="o">:</span><span class="n">t</span><span class="x">))</span>
<span class="n">trainDates</span> <span class="o">=</span> <span class="n">unique</span><span class="x">(</span><span class="n">btc</span><span class="o">.</span><span class="kt">Date</span><span class="x">)[</span><span class="mi">1</span><span class="o">:</span><span class="mi">140</span><span class="x">]</span>
<span class="n">testDates</span> <span class="o">=</span> <span class="n">setdiff</span><span class="x">(</span><span class="n">unique</span><span class="x">(</span><span class="n">btc</span><span class="o">.</span><span class="kt">Date</span><span class="x">),</span> <span class="n">trainDates</span><span class="x">)</span>

<span class="n">trainDataRaw</span> <span class="o">=</span> <span class="n">btc</span><span class="x">[</span><span class="n">findall</span><span class="x">(</span><span class="k">in</span><span class="x">(</span><span class="n">trainDates</span><span class="x">),</span> <span class="n">btc</span><span class="o">.</span><span class="kt">Date</span><span class="x">),</span> <span class="o">:</span><span class="x">];</span>
<span class="n">testDataRaw</span> <span class="o">=</span> <span class="n">btc</span><span class="x">[</span><span class="n">findall</span><span class="x">(</span><span class="k">in</span><span class="x">(</span><span class="n">testDates</span><span class="x">),</span> <span class="n">btc</span><span class="o">.</span><span class="kt">Date</span><span class="x">),</span> <span class="o">:</span><span class="x">];</span>

<span class="n">trainData</span> <span class="o">=</span> <span class="nd">@combine</span><span class="x">(</span><span class="n">groupby</span><span class="x">(</span><span class="n">trainDataRaw</span><span class="x">,</span> <span class="x">[</span><span class="o">:</span><span class="kt">Hour</span><span class="x">]),</span> <span class="o">:</span><span class="n">v</span> <span class="o">=</span> <span class="n">sum</span><span class="x">(</span><span class="o">:</span><span class="n">v</span><span class="x">))</span>
<span class="n">trainData</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">trainData</span><span class="x">,</span> <span class="o">:</span><span class="n">total_v</span> <span class="o">=</span> <span class="n">sum</span><span class="x">(</span><span class="o">:</span><span class="n">v</span><span class="x">),</span> <span class="o">:</span><span class="n">frac</span> <span class="o">=</span> <span class="o">:</span><span class="n">v</span><span class="o">./</span><span class="n">sum</span><span class="x">(</span><span class="o">:</span><span class="n">v</span><span class="x">))</span>

<span class="n">testData</span> <span class="o">=</span> <span class="nd">@combine</span><span class="x">(</span><span class="n">groupby</span><span class="x">(</span><span class="n">testDataRaw</span><span class="x">,</span> <span class="x">[</span><span class="o">:</span><span class="kt">Hour</span><span class="x">]),</span> <span class="o">:</span><span class="n">v</span> <span class="o">=</span> <span class="n">sum</span><span class="x">(</span><span class="o">:</span><span class="n">v</span><span class="x">))</span>
<span class="n">testData</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">testData</span><span class="x">,</span> <span class="o">:</span><span class="n">total_v</span> <span class="o">=</span> <span class="n">sum</span><span class="x">(</span><span class="o">:</span><span class="n">v</span><span class="x">),</span> <span class="o">:</span><span class="n">frac</span> <span class="o">=</span> <span class="o">:</span><span class="n">v</span><span class="o">./</span><span class="n">sum</span><span class="x">(</span><span class="o">:</span><span class="n">v</span><span class="x">))</span>

<span class="n">sort!</span><span class="x">(</span><span class="n">trainData</span><span class="x">,</span> <span class="o">:</span><span class="kt">Hour</span><span class="x">);</span>
<span class="n">sort!</span><span class="x">(</span><span class="n">testData</span><span class="x">,</span> <span class="o">:</span><span class="kt">Hour</span><span class="x">);</span>
</code></pre></div></div>
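
<p>The model fitted next needs <code class="language-plaintext highlighter-rouge">Hour_sin</code> and <code class="language-plaintext highlighter-rouge">Hour_cos</code> columns, and the step that creates them isn’t shown. A minimal sketch on a plain vector of hours follows. It assumes the same convention as <code class="language-plaintext highlighter-rouge">cyclical_encode</code> (divide by the maximum observed value), and <code class="language-plaintext highlighter-rouge">hour_encode</code> is a hypothetical helper, not something from the post.</p>

```julia
# Hours of the day as they come out of hour.(t): 0, 1, ..., 23
hours = collect(0:23)

# Hypothetical helper mirroring cyclical_encode for a plain vector
hour_encode(h, period) = (sin.(2 .* pi .* h ./ period), cos.(2 .* pi .* h ./ period))

# Assumption: follow the earlier convention and divide by the maximum value
Hour_sin, Hour_cos = hour_encode(hours, maximum(hours))

# Each hour now lies on the unit circle
println(all(isapprox.(Hour_sin .^ 2 .+ Hour_cos .^ 2, 1.0)))
```

<p>With the data frames above this would presumably be a call like <code class="language-plaintext highlighter-rouge">cyclical_encode(trainData, "Hour", maximum(trainData.Hour))</code>, repeated for <code class="language-plaintext highlighter-rouge">testData</code>. Note that dividing by the maximum (23) maps hour 23 onto the same point as hour 0, so a period of 24 would be an equally reasonable choice if you want them to stay distinct.</p>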

<p>Again, using a linear model we fit the embedded hour variables to the fraction of the volume traded per hour.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">embedModelIntra</span> <span class="o">=</span> <span class="n">lm</span><span class="x">(</span><span class="nd">@formula</span><span class="x">(</span><span class="n">frac</span> <span class="o">~</span> <span class="n">Hour_sin</span> <span class="o">+</span> <span class="n">Hour_cos</span><span class="x">),</span> <span class="n">trainData</span><span class="x">)</span>
</code></pre></div></div>

<p>When comparing the results, we are now just looking at the intraday profile of the trades for both the train set and test set overlaid with the model.</p>

<p><img src="/assets/CyclicalEmbedding/intraEmbedd.png" alt="Line plot comparing actual and predicted intraday trading volume fractions by hour. The plot shows three lines: one representing the observed fraction of trading volume for each hour of the day from the training set, another from the test set and another representing the model's predicted values using cyclical embedding." width="80%" class="center-image" /></p>

<p>The model has done well to pick up the peak in the afternoon but has missed the peak in the early morning. The RMSE of this model is 0.029 vs 0.026 from using the training fractions directly, so again the encoded model has done worse.
This is the limiting factor with this embedding: we have a single frequency of sin/cos when in reality this problem needs more degrees of freedom, i.e. multiple components:</p>

\[\sum _i \left[ c^1_i \sin \left(\frac{2 \pi \omega _i x}{\max (x)}\right) + c^2_i \cos \left(\frac{2 \pi \omega _i x}{\max (x)}\right) \right].\]

<p>This is now a GAM with trigonometric splines so we can view the cyclical encoding as a 1-spline GAM.</p>
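
<p>As a rough sketch of the multi-component idea, the snippet below fits one and then two sin/cos pairs by ordinary least squares. It uses a synthetic two-peak hourly profile, not the BTC data, and <code class="language-plaintext highlighter-rouge">fourier_design</code> is a made-up helper with integer frequencies \(\omega_i = 1, \ldots, K\) and a period of 24, both of which are assumptions. The second harmonic should soak up the error that the single frequency leaves behind.</p>

```julia
using LinearAlgebra, Statistics

# Design matrix: intercept plus sin/cos pairs for frequencies 1..K
function fourier_design(x, period, K)
    cols = [ones(length(x))]
    for k in 1:K
        push!(cols, sin.(2 .* pi .* k .* x ./ period))
        push!(cols, cos.(2 .* pi .* k .* x ./ period))
    end
    hcat(cols...)
end

rmse(y, yhat) = sqrt(mean((y .- yhat) .^ 2))

# Synthetic hourly profile with two peaks (not the BTC data)
x = collect(0.0:23.0)
y = 1 .+ 0.5 .* sin.(2 .* pi .* x ./ 24) .+ 0.3 .* cos.(4 .* pi .* x ./ 24)

for K in 1:2
    X = fourier_design(x, 24, K)
    beta = X \ y                      # ordinary least squares
    println("K = $K, RMSE = ", rmse(y, X * beta))
end
```

<p>Each extra frequency costs two more coefficients, so on real data the number of components is a trade-off against overfitting.</p>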

<h2 id="conclusion">Conclusion</h2>

<p>It’s an interesting transformation of time-like variables and gives you a route to smoothing out the beginning and ending of the cycles.</p>

<p>In these toy models, the embedding hasn’t improved performance but it’s possible that it’s more relevant in deep learning architectures where there are more parameters and more interactions. In all the above models there’s much more groundwork to do before we start eking out performance gains from the time variables.</p>]]></content><author><name>Dean Markwick</name></author><category term="julia" /><category term="deep-learning" /><summary type="html"><![CDATA[Cyclical embedding (or encoding) is a basic transformation for numerical variables that follow a cycle. Let’s explore how they work.]]></summary></entry><entry><title type="html">Fitting Price Impact Models</title><link href="https://dm13450.github.io/2025/03/14/Fitting-Price-Impact-Models.html" rel="alternate" type="text/html" title="Fitting Price Impact Models" /><published>2025-03-14T00:00:00+00:00</published><updated>2025-03-14T00:00:00+00:00</updated><id>https://dm13450.github.io/2025/03/14/Fitting-Price-Impact-Models</id><content type="html" xml:base="https://dm13450.github.io/2025/03/14/Fitting-Price-Impact-Models.html"><![CDATA[<p>A big part of market microstructure is price impact and understanding how you move the market every time you trade. In the simplest sense, every trade upends the supply and demand of an asset even for a tiny amount of time. The market responds to this change, then responds to the response, then responds to that response, etc. You get the idea. It’s a cascading effect of interactions between all the people in the market.</p>


<p>Price impact happens at both the micro and macro level. At the micro level, each trade moves the market a little based on the instantaneous market conditions, commonly called ‘liquidity’. At the macro level, continuous trades in one direction have a compounding and overlapping effect. In reality, you can’t separate the two effects, so market impact models need to work at both small and large scales.</p>

<p>This post is inspired by two sources:</p>

<ol>
  <li><a href="https://www.routledge.com/Handbook-of-Price-Impact-Modeling/Webster/p/book/9781032328225">Handbook of Price Impact Modeling</a> - Chapter 7</li>
  <li><a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4286108">Stochastic Liquidity as a Proxy for Nonlinear Price Impact</a></li>
</ol>

<p>Both cover very similar models but one is a fairly expensive
book and the other is on SSRN for free. The same author is involved in
both of them too.</p>

<p>In terms of data, there are two routes you can go down.</p>

<ol>
  <li>You have your own, private, execution data and can build out a data set for the models.</li>
  <li>You use publicly available trades and adjust the models to account for the anonymous data.</li>
</ol>

<p>In the first case, you will know when an execution started and stopped so can record how the price changed. In the second case, the data will be made up of lots of trades and it is less obvious when a parent execution started and stopped.</p>

<p>We will take the 2nd route and use Bitcoin data to look at different price impact models.</p>

<p>As ever I will be using Julia with some of the standard packages.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">using</span> <span class="n">LibPQ</span>
<span class="k">using</span> <span class="n">DataFrames</span><span class="x">,</span> <span class="n">DataFramesMeta</span>
<span class="k">using</span> <span class="n">Dates</span>
<span class="k">using</span> <span class="n">Plots</span>
<span class="k">using</span> <span class="n">GLM</span><span class="x">,</span> <span class="n">Statistics</span><span class="x">,</span> <span class="n">Optim</span>
</code></pre></div></div>

<h2 id="bitcoin-price-impact-data">Bitcoin Price Impact Data</h2>

<p>We will use my old trusty Bitcoin data set that I collected
in 2021. It’s just over a day’s worth of Bitcoin trades and L1 prices
that I piped into QuestDB. Full details are in <a href="https://dm13450.github.io/2021/08/05/questdb-part-1.html">Using QuestDB to Build a Crypto Trade Database in Julia</a>.</p>

<p>First, we connect to the database.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">conn</span> <span class="o">=</span> <span class="n">LibPQ</span><span class="o">.</span><span class="n">Connection</span><span class="x">(</span><span class="s">"""
             dbname=qdb
             host=127.0.0.1
             password=quest
             port=8812
             user=admin"""</span><span class="x">);</span>
</code></pre></div></div>

<p>For each trade recorded in the database, we also want to join the best bid and offer immediately before it. This is where an <code class="language-plaintext highlighter-rouge">ASOF</code> join is useful. It joins two timestamped tables by matching each row of the first table with the latest row of the second table whose timestamp is not after it. It sounds more complicated than it really is. In short, it takes the trade table and adds in the prices using the price just before the trade.</p>
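
<p>To make the matching rule concrete, here is a small sketch of the same idea in plain Julia using <code class="language-plaintext highlighter-rouge">searchsortedlast</code> on sorted timestamps. The quotes and trades are made-up integers, and this only illustrates the logic rather than how QuestDB implements it.</p>

```julia
# Quote timestamps (sorted) with their bid/ask, and trade timestamps
quote_ts  = [1, 4, 7, 10]
quote_bid = [99.0, 99.5, 100.0, 100.5]
quote_ask = [101.0, 100.5, 101.0, 101.5]
trade_ts  = [5, 7, 12]

# For each trade, index of the last quote at or before its timestamp
idx = [searchsortedlast(quote_ts, t) for t in trade_ts]

# Mid-price prevailing at each trade
mid = 0.5 .* (quote_bid[idx] .+ quote_ask[idx])

println(idx)
println(mid)
```

<p>A trade that arrives before the first quote would get an index of 0 here, which is the equivalent of the missing rows that get dropped after the database join.</p>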

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">trades</span> <span class="o">=</span> <span class="n">execute</span><span class="x">(</span><span class="n">conn</span><span class="x">,</span> 
    <span class="s">"WITH
trades AS ( 
   SELECT * FROM coinbase_trades
   ),
prices as (
  select * from coinbase_bbo
)
select * from trades ASOF JOIN prices"</span><span class="x">)</span> <span class="o">|&gt;</span> <span class="n">DataFrame</span>
<span class="n">dropmissing!</span><span class="x">(</span><span class="n">trades</span><span class="x">);</span>
<span class="n">trades</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">trades</span><span class="x">,</span> <span class="o">:</span><span class="n">mid</span> <span class="o">=</span> <span class="mf">0.5</span><span class="o">*</span><span class="x">(</span><span class="o">:</span><span class="n">ask</span> <span class="o">.+</span> <span class="o">:</span><span class="n">bid</span><span class="x">))</span>
</code></pre></div></div>

<p>For these small tables, the query runs pretty much instantly and
returns a Julia data frame. We also calculate the mid-price for each row.</p>

<p>In all the price impact models we are aggregating this data:</p>
<ol>
  <li>Group the data by some time bucket (seconds or minutes etc.)</li>
  <li>Calculate the net amount, total absolute amount and open and close prices of the bucket.</li>
  <li>Calculate the price return using the close-to-close prices.</li>
</ol>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">function</span><span class="nf"> aggregate_data</span><span class="x">(</span><span class="n">trades</span><span class="x">,</span> <span class="n">smp</span><span class="x">)</span>
    <span class="n">tradesAgg</span> <span class="o">=</span> <span class="nd">@combine</span><span class="x">(</span><span class="n">groupby</span><span class="x">(</span><span class="nd">@transform</span><span class="x">(</span><span class="n">trades</span><span class="x">,</span> <span class="o">:</span><span class="n">ts</span> <span class="o">=</span> <span class="n">floor</span><span class="o">.</span><span class="x">(</span><span class="o">:</span><span class="n">timestamp</span><span class="x">,</span> <span class="n">smp</span><span class="x">)),</span> <span class="o">:</span><span class="n">ts</span><span class="x">),</span> 
             <span class="o">:</span><span class="n">q</span> <span class="o">=</span> <span class="n">sum</span><span class="x">(</span><span class="o">:</span><span class="n">size</span> <span class="o">.*</span> <span class="o">:</span><span class="n">side</span><span class="x">),</span> 
             <span class="o">:</span><span class="n">absq</span> <span class="o">=</span> <span class="n">sum</span><span class="x">(</span><span class="o">:</span><span class="n">size</span><span class="x">),</span> 
             <span class="o">:</span><span class="n">o</span> <span class="o">=</span> <span class="n">first</span><span class="x">(</span><span class="o">:</span><span class="n">mid</span><span class="x">),</span> 
             <span class="o">:</span><span class="n">c</span> <span class="o">=</span> <span class="n">last</span><span class="x">(</span><span class="o">:</span><span class="n">mid</span><span class="x">));</span>
    <span class="n">tradesAgg</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"price_return"</span><span class="x">]</span> <span class="o">.=</span> <span class="x">[</span><span class="nb">NaN</span><span class="x">;</span> <span class="x">(</span><span class="n">tradesAgg</span><span class="o">.</span><span class="n">c</span><span class="x">[</span><span class="mi">2</span><span class="o">:</span><span class="k">end</span><span class="x">]</span><span class="o">./</span> <span class="n">tradesAgg</span><span class="o">.</span><span class="n">c</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="x">(</span><span class="k">end</span><span class="o">-</span><span class="mi">1</span><span class="x">)])</span> <span class="o">.-</span> <span class="mi">1</span><span class="x">]</span>
    <span class="n">tradesAgg</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"ofi"</span><span class="x">]</span> <span class="o">.=</span> <span class="n">tradesAgg</span><span class="o">.</span><span class="n">q</span> <span class="o">./</span> <span class="n">tradesAgg</span><span class="o">.</span><span class="n">absq</span>

    <span class="n">tradesAgg</span>
<span class="k">end</span>
</code></pre></div></div>

<p>We are going to bucket the data by 10 seconds.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">aggData</span>  <span class="o">=</span> <span class="n">aggregate_data</span><span class="x">(</span><span class="n">trades</span><span class="x">,</span> <span class="n">Dates</span><span class="o">.</span><span class="kt">Second</span><span class="x">(</span><span class="mi">10</span><span class="x">))</span>
</code></pre></div></div>

<p>As ever, let’s split this data into a training and test set.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">aggDataTrain</span> <span class="o">=</span> <span class="n">aggData</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="mi">7500</span><span class="x">,</span> <span class="o">:</span><span class="x">]</span>
<span class="n">aggDataTest</span> <span class="o">=</span> <span class="n">aggData</span><span class="x">[</span><span class="mi">7501</span><span class="o">:</span><span class="k">end</span><span class="x">,</span> <span class="o">:</span><span class="x">];</span>
</code></pre></div></div>

<p>It’s just a simple split on time.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">plot</span><span class="x">(</span><span class="n">aggDataTrain</span><span class="o">.</span><span class="n">ts</span><span class="x">,</span> <span class="n">aggDataTrain</span><span class="o">.</span><span class="n">c</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Train"</span><span class="x">)</span>
<span class="n">plot!</span><span class="x">(</span><span class="n">aggDataTest</span><span class="o">.</span><span class="n">ts</span><span class="x">,</span> <span class="n">aggDataTest</span><span class="o">.</span><span class="n">c</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Test"</span><span class="x">)</span>
</code></pre></div></div>

<p><img src="/assets/priceimpact/traintest.png" alt="" width="80%" class="center-image" /></p>

<h2 id="calculating-the-volatility-and-adv">Calculating the Volatility and ADV</h2>

<p>All the models require a volatility and ADV calculation. My data runs just over a day, so we need to adjust for that.</p>

<p>For the ADV we take the sum of the total volume traded and divide by the length of time converted to days.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">deltaT</span> <span class="o">=</span> <span class="n">maximum</span><span class="x">(</span><span class="n">trades</span><span class="o">.</span><span class="n">timestamp</span><span class="x">)</span> <span class="o">-</span> <span class="n">minimum</span><span class="x">(</span><span class="n">trades</span><span class="o">.</span><span class="n">timestamp</span><span class="x">)</span>
<span class="n">deltaTDays</span> <span class="o">=</span> <span class="x">(</span><span class="n">deltaT</span><span class="o">.</span><span class="n">value</span> <span class="o">*</span> <span class="mf">1e-3</span><span class="x">)</span><span class="o">/</span><span class="x">(</span><span class="mi">24</span><span class="o">*</span><span class="mi">60</span><span class="o">*</span><span class="mi">60</span><span class="x">)</span>
<span class="n">adv</span> <span class="o">=</span> <span class="n">sum</span><span class="x">(</span><span class="n">trades</span><span class="o">.</span><span class="n">size</span><span class="x">)</span><span class="o">/</span><span class="n">deltaTDays</span>
<span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"ADV"</span><span class="x">]</span> <span class="o">.=</span> <span class="n">adv</span>
<span class="n">aggDataTest</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"ADV"</span><span class="x">]</span> <span class="o">.=</span> <span class="n">adv</span><span class="x">;</span>
</code></pre></div></div>

<p>For the volatility, we take the square root of the sum of the squared 5-minute returns. This should probably be annualised if we were comparing the parameters across different assets.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">min5Agg</span> <span class="o">=</span> <span class="n">aggregate_data</span><span class="x">(</span><span class="n">trades</span><span class="x">,</span> <span class="n">Dates</span><span class="o">.</span><span class="kt">Minute</span><span class="x">(</span><span class="mi">5</span><span class="x">))</span>
<span class="n">volatility</span> <span class="o">=</span> <span class="n">sqrt</span><span class="x">(</span><span class="n">sum</span><span class="x">(</span><span class="n">min5Agg</span><span class="o">.</span><span class="n">price_return</span><span class="x">[</span><span class="mi">2</span><span class="o">:</span><span class="k">end</span><span class="x">]</span> <span class="o">.*</span> <span class="n">min5Agg</span><span class="o">.</span><span class="n">price_return</span><span class="x">[</span><span class="mi">2</span><span class="o">:</span><span class="k">end</span><span class="x">]))</span>
<span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"Vol"</span><span class="x">]</span> <span class="o">.=</span> <span class="n">volatility</span><span class="x">;</span>
<span class="n">aggDataTest</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"Vol"</span><span class="x">]</span> <span class="o">.=</span> <span class="n">volatility</span><span class="x">;</span>
</code></pre></div></div>

<p>The ADV and volatility have a normalising effect across assets. So if we had multiple coins, we could use the same model even if one was a highly traded coin like BTC or ETH vs a lower volume coin (the rest of them?!). This would give us comparable model parameters to judge the impact effect.</p>

<p>As our data sample is so small, we are only calculating one volatility and one ADV. In reality, you would calculate the volatility/ADV on a rolling basis and then do the train/test split.</p>
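
<p>A minimal sketch of what the rolling version might look like, assuming a trailing window measured in buckets so that each value only uses past data. The helpers <code class="language-plaintext highlighter-rouge">rollingSum</code> and <code class="language-plaintext highlighter-rouge">rollingVol</code>, the window length and the toy numbers are all hypothetical and not part of the original analysis.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Trailing-window sum: out[i] only uses values[max(1, i-window+1):i],
# so there is no look-ahead into later buckets.
function rollingSum(values, window)
    out = zeros(length(values))
    for i in eachindex(values)
        lo = max(1, i - window + 1)
        out[i] = sum(values[lo:i])
    end
    out
end

# Rolling volatility: square root of the trailing sum of squared returns.
rollingVol(returns, window) = sqrt.(rollingSum(returns .^ 2, window))

# Toy inputs (hypothetical numbers, not the data from this post)
absq = [10.0, 20.0, 30.0, 40.0]
rets = [0.0, 0.01, -0.02, 0.01]

rollingVolume = rollingSum(absq, 2)
rollingSigma = rollingVol(rets, 2)
</code></pre></div></div>

<p>Dividing the rolling volume by the window length in days would then give a rolling ADV.</p>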

<h2 id="models-of-market-impact">Models of Market Impact</h2>

<p>The paper and book describe different market impact models that all follow a similar functional form. I’ve chosen four of them to illustrate the model fitting process.</p>

<ul>
  <li>The Order Flow Imbalance model (OFI)</li>
  <li>The Obizhaeva-Wang (OW) model</li>
  <li>The Concave Propagator model</li>
  <li>The Reduced Form model</li>
</ul>

<p>For all the models we will state the form of the market impact
\(\Delta I\) and use the price returns over the same period to find
the best parameters of the model.</p>

<p>The overarching idea is that the return in each bucket is proportional
to the amount of volume traded in that bucket plus some suitably
decayed contribution from the volumes in earlier buckets.</p>
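
<p>Written out, the models below can be read as variations on one recursion, where \(f\) is a model-specific function of the bucket’s volume (the \(f\) notation is mine, used here only as a summary):</p>

\[\Delta I = -\beta I_t + \lambda \sigma f(q_t)\]

<p>The \(-\beta I_t\) term decays the impact built up in earlier buckets, and the OFI model is the special case that drops it.</p>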

<h3 id="order-flow-imbalance">Order Flow Imbalance</h3>

<p>This is the simplest model as it just uses the imbalance over the
bucket to predict the return. For the OFI we are just using the trade
imbalance: the net volume divided by the total volume in the bucket.</p>

\[\Delta I = \lambda \sigma \frac{q_t}{| q_t | \text{ADV}}\]

<p>As there is no dependence on the previous returns, we can use simple linear regression to estimate \(\lambda\).</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_ofi"</span><span class="x">]</span> <span class="o">=</span> <span class="n">aggDataTrain</span><span class="o">.</span><span class="n">Vol</span> <span class="o">.*</span> <span class="x">(</span><span class="n">aggDataTrain</span><span class="o">.</span><span class="n">ofi</span> <span class="o">./</span> <span class="n">aggDataTrain</span><span class="o">.</span><span class="n">ADV</span><span class="x">)</span>
<span class="n">aggDataTest</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_ofi"</span><span class="x">]</span> <span class="o">=</span> <span class="n">aggDataTest</span><span class="o">.</span><span class="n">Vol</span> <span class="o">.*</span> <span class="x">(</span><span class="n">aggDataTest</span><span class="o">.</span><span class="n">ofi</span> <span class="o">./</span> <span class="n">aggDataTest</span><span class="o">.</span><span class="n">ADV</span><span class="x">)</span>

<span class="n">ofiModel</span> <span class="o">=</span> <span class="n">lm</span><span class="x">(</span><span class="nd">@formula</span><span class="x">(</span><span class="n">price_return</span> <span class="o">~</span> <span class="n">x_ofi</span> <span class="o">+</span> <span class="mi">0</span><span class="x">),</span> <span class="n">aggDataTrain</span><span class="x">[</span><span class="mi">2</span><span class="o">:</span><span class="k">end</span><span class="x">,</span> <span class="o">:</span><span class="x">])</span>
</code></pre></div></div>
<p>The model has returned a significant value of \(\lambda = 59\) and has an in-sample \(R^2\) of 11% and an out-of-sample RMSE of 0.0003. Encouraging and off to a good start!</p>
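
<p>The out-of-sample calculation is not shown above. One plausible way to do it, sketched in plain Julia, is to fit the slope of a regression through the origin on the training slice and score it on the test slice. The helpers <code class="language-plaintext highlighter-rouge">fitSlope</code> and <code class="language-plaintext highlighter-rouge">oosRmse</code> and the toy data are hypothetical, so treat this as an illustration rather than the exact code behind the numbers.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using Statistics

# Least-squares slope for a regression through the origin: y = lambda * x
fitSlope(x, y) = sum(x .* y) / sum(x .^ 2)

# RMSE of the predictions lambda * x against the observed y
oosRmse(lambda, x, y) = sqrt(mean((y .- lambda .* x) .^ 2))

# Toy data (hypothetical): fit on one slice, score on another
xTrain = [1.0, 2.0, 3.0]
yTrain = [2.0, 4.0, 6.0]
xTest = [4.0, 5.0]
yTest = [8.0, 11.0]

lambdaHat = fitSlope(xTrain, yTrain)
testRmse = oosRmse(lambdaHat, xTest, yTest)
</code></pre></div></div>

<p>With the real data, the same two steps would use the <code class="language-plaintext highlighter-rouge">x_ofi</code> and <code class="language-plaintext highlighter-rouge">price_return</code> columns of the train and test frames, skipping the first row where the return is <code class="language-plaintext highlighter-rouge">NaN</code>.</p>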

<p>Side note, I’ve written about Order Flow Imbalance before in <a href="https://dm13450.github.io/2022/02/02/Order-Flow-Imbalance.html">Order Flow Imbalance - A High Frequency Trading Signal</a>.</p>

<h3 id="the-obizhaeva-wang-ow-model">The Obizhaeva-Wang (OW) Model</h3>

<p>The OW model is a foundational model of market impact and you will see it frequently across different microstructure papers. It suggests a linear dependence between the signed order flow and price impact, again normalised by the ADV and volatility.</p>

\[\Delta I = -\beta I_t + \lambda \sigma \frac{q_t}{\text{ADV}}\]

<p>Again, we create the \(x\) variable in the data frame specific to this model, but this one will need special attention to fit.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_ow"</span><span class="x">]</span> <span class="o">=</span> <span class="n">aggDataTrain</span><span class="o">.</span><span class="n">Vol</span> <span class="o">.*</span> <span class="x">(</span><span class="n">aggDataTrain</span><span class="o">.</span><span class="n">q</span> <span class="o">./</span> <span class="n">aggDataTrain</span><span class="o">.</span><span class="n">ADV</span><span class="x">);</span>
<span class="n">aggDataTest</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_ow"</span><span class="x">]</span> <span class="o">=</span> <span class="n">aggDataTest</span><span class="o">.</span><span class="n">Vol</span> <span class="o">.*</span> <span class="x">(</span><span class="n">aggDataTest</span><span class="o">.</span><span class="n">q</span> <span class="o">./</span> <span class="n">aggDataTest</span><span class="o">.</span><span class="n">ADV</span><span class="x">);</span>
</code></pre></div></div>

<p>From the market impact formula, we can see that the relationship is
recursive. The impact at time \(t\) depends on the impact at time
\(t-1\). How much of the previous impact is carried over is controlled
by \(\beta\) and in the paper they fix this at \(\frac{\log 2}{\beta}
= 60 \text{ Minutes}\). This means we have to fit the model as:</p>

<ol>
  <li>Calculate the \(I\) given an estimate of \(\lambda\)</li>
  <li>Adjust the price returns by this impact</li>
  <li>Regress the adjusted price returns against the \(x\) variable.</li>
  <li>Repeat with the new estimate of \(\lambda\) until converged.</li>
</ol>

<p>This is a simple one-parameter optimisation where we minimise the RMSE.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">function</span><span class="nf"> calcImpact</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="n">beta</span><span class="x">,</span> <span class="n">lambda</span><span class="x">)</span>
    <span class="n">impact</span> <span class="o">=</span> <span class="n">zeros</span><span class="x">(</span><span class="n">length</span><span class="x">(</span><span class="n">x</span><span class="x">))</span>
    <span class="n">impact</span><span class="x">[</span><span class="mi">1</span><span class="x">]</span> <span class="o">=</span> <span class="n">lambda</span><span class="o">*</span><span class="n">x</span><span class="x">[</span><span class="mi">1</span><span class="x">]</span>
    <span class="k">for</span> <span class="n">i</span> <span class="k">in</span> <span class="mi">2</span><span class="o">:</span><span class="n">length</span><span class="x">(</span><span class="n">impact</span><span class="x">)</span>
        <span class="n">impact</span><span class="x">[</span><span class="n">i</span><span class="x">]</span> <span class="o">=</span> <span class="x">(</span><span class="mi">1</span><span class="o">-</span><span class="n">beta</span><span class="x">)</span><span class="o">*</span><span class="n">impact</span><span class="x">[</span><span class="n">i</span><span class="o">-</span><span class="mi">1</span><span class="x">]</span> <span class="o">+</span> <span class="n">lambda</span><span class="o">*</span><span class="n">x</span><span class="x">[</span><span class="n">i</span><span class="x">]</span>
    <span class="k">end</span>
    <span class="n">impact</span>
<span class="k">end</span>
	
<span class="k">function</span><span class="nf"> fitLambda</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="n">y</span><span class="x">,</span> <span class="n">beta</span><span class="x">,</span> <span class="n">lambda</span><span class="x">)</span>
    <span class="n">I</span> <span class="o">=</span> <span class="n">calcImpact</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="n">beta</span><span class="x">,</span> <span class="n">lambda</span><span class="x">)</span>
    <span class="n">y2</span> <span class="o">=</span> <span class="n">y</span> <span class="o">.+</span> <span class="x">(</span><span class="n">beta</span> <span class="o">.*</span> <span class="n">I</span><span class="x">)</span>
    <span class="n">model</span> <span class="o">=</span> <span class="n">lm</span><span class="x">(</span><span class="n">reshape</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="x">(</span><span class="n">length</span><span class="x">(</span><span class="n">x</span><span class="x">),</span> <span class="mi">1</span><span class="x">))[</span><span class="mi">2</span><span class="o">:</span><span class="k">end</span><span class="x">,</span> <span class="o">:</span><span class="x">],</span> <span class="n">y2</span><span class="x">[</span><span class="mi">2</span><span class="o">:</span><span class="k">end</span><span class="x">])</span>
    <span class="n">model</span>
<span class="k">end</span>

<span class="n">rmse</span><span class="x">(</span><span class="n">x</span><span class="x">)</span> <span class="o">=</span> <span class="n">sqrt</span><span class="x">(</span><span class="n">mean</span><span class="x">(</span><span class="n">residuals</span><span class="x">(</span><span class="n">x</span><span class="x">)</span> <span class="o">.^</span><span class="mi">2</span><span class="x">))</span>
</code></pre></div></div>
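
<p>The four steps above can also be read literally as a fixed-point iteration rather than an optimiser call. A rough sketch under that reading, in plain Julia with a regression through the origin in place of <code class="language-plaintext highlighter-rouge">lm</code>. The names <code class="language-plaintext highlighter-rouge">impactPath</code> and <code class="language-plaintext highlighter-rouge">iterateLambda</code> are hypothetical, the seed value of the impact is an assumption, and the toy series is made up so that the true \(\lambda\) is known.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Decayed impact path for a given lambda, seeded with lambda * x[1]
# (the seed is an assumption of this sketch).
function impactPath(x, beta, lambda)
    impact = zeros(length(x))
    impact[1] = lambda * x[1]
    for i in 2:length(x)
        impact[i] = (1 - beta) * impact[i-1] + lambda * x[i]
    end
    impact
end

# Steps 1 to 4 as a fixed-point iteration on lambda
function iterateLambda(x, y, beta, lambda; iterations = 50)
    for _ in 1:iterations
        imp = impactPath(x, beta, lambda)        # step 1: impact for the current lambda
        yAdj = y .+ beta .* imp                  # step 2: adjust the returns
        lambda = sum(x .* yAdj) / sum(x .^ 2)    # step 3: regression through the origin
    end                                          # step 4: repeat
    lambda
end

# Toy check: build returns from a known lambda, then try to recover it
x = [1.0, -2.0, 0.5, 1.5, -1.0]
trueLambda = 3.0
y = trueLambda .* x .- 0.01 .* impactPath(x, 0.01, trueLambda)
lambdaHat = iterateLambda(x, y, 0.01, 1.0)
</code></pre></div></div>

<p>On this toy series the iteration recovers the \(\lambda\) used to build the returns. On real, noisy data there is no guarantee it settles as cleanly, which is one reason to hand the search to an optimiser instead.</p>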

<p>We start with \(\lambda = 1\) and let the optimiser do the work.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">res</span> <span class="o">=</span> <span class="n">optimize</span><span class="x">(</span><span class="n">x</span> <span class="o">-&gt;</span> <span class="n">rmse</span><span class="x">(</span><span class="n">fitLambda</span><span class="x">(</span><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_ow"</span><span class="x">],</span> <span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"price_return"</span><span class="x">],</span> <span class="mf">0.01</span><span class="x">,</span> <span class="n">x</span><span class="x">[</span><span class="mi">1</span><span class="x">])),</span> <span class="x">[</span><span class="mf">1.0</span><span class="x">])</span>
</code></pre></div></div>

<p>It’s converged! We plot the different values of the objective function and show that this process can find the minimum.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">lambdaRes</span> <span class="o">=</span> <span class="n">rmse</span><span class="o">.</span><span class="x">(</span><span class="n">fitLambda</span><span class="o">.</span><span class="x">([</span><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_ow"</span><span class="x">]],</span> <span class="x">[</span><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"price_return"</span><span class="x">]],</span> <span class="mf">0.01</span><span class="x">,</span> <span class="mi">0</span><span class="o">:</span><span class="mi">1</span><span class="o">:</span><span class="mi">20</span><span class="x">))</span>
<span class="n">plot</span><span class="x">(</span><span class="mi">0</span><span class="o">:</span><span class="mi">1</span><span class="o">:</span><span class="mi">20</span><span class="x">,</span> <span class="n">lambdaRes</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="o">:</span><span class="n">none</span><span class="x">,</span> <span class="n">xlabel</span> <span class="o">=</span> <span class="s">L"\lambda"</span><span class="x">,</span> <span class="n">ylabel</span> <span class="o">=</span> <span class="s">"RMSE"</span><span class="x">,</span> <span class="n">title</span> <span class="o">=</span> <span class="s">"OW Model"</span><span class="x">)</span>
<span class="n">vline!</span><span class="x">(</span><span class="n">Optim</span><span class="o">.</span><span class="n">minimizer</span><span class="x">(</span><span class="n">res</span><span class="x">),</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Optimised Value"</span><span class="x">)</span>
</code></pre></div></div>

<p><img src="/assets/priceimpact/ow.png" alt="" width="80%" class="center-image" /></p>

<p>We have a nice convex relationship, which is always a good sign.
We then pull out the best-fitting model and estimate the \(R^2\).</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">owModel</span> <span class="o">=</span> <span class="n">fitLambda</span><span class="x">(</span><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_ow"</span><span class="x">],</span> <span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"price_return"</span><span class="x">],</span> <span class="mf">0.01</span><span class="x">,</span> <span class="n">first</span><span class="x">(</span><span class="n">Optim</span><span class="o">.</span><span class="n">minimizer</span><span class="x">(</span><span class="n">res</span><span class="x">)))</span>
</code></pre></div></div>

<p>This gives \(R^2 = 11\%\), so roughly the same as the OFI model. For the out-of-sample RMSE we get 0.0006.</p>
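
<p>A note on the decay: the fits in this post pass \(\beta = 0.01\) per bucket. A small sketch of how a half-life might be converted into a per-bucket \(\beta\) using the \(\frac{\log 2}{\beta}\) relation quoted earlier, assuming \(\beta\) is expressed per bucket. The helper <code class="language-plaintext highlighter-rouge">betaFromHalfLife</code> is a hypothetical name.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Convert a half-life into a per-bucket decay rate via log(2) / beta = half-life.
# Assumption: beta is per bucket, so the half-life is measured in buckets too.
betaFromHalfLife(halfLifeSeconds, bucketSeconds) = log(2) / (halfLifeSeconds / bucketSeconds)

beta60min = betaFromHalfLife(60 * 60, 10)   # 60-minute half-life, 10-second buckets
halfLifeBuckets = log(2) / 0.01             # half-life implied by beta = 0.01, in buckets
</code></pre></div></div>

<p>Under this relation, \(\beta = 0.01\) with 10-second buckets corresponds to a half-life of roughly 69 buckets, or about 11.5 minutes.</p>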

<h3 id="concave-propagator-model">Concave Propagator Model</h3>

<p>This model is built on the belief that market impact is a power law
with an exponent close to 0.5. Using the square root of the total amount
traded and the net direction gives us the \(x\) variable.</p>

\[\Delta I = -\beta I_t + \lambda \sigma \text{sign} (q_t) \sqrt
{\frac{| q_t |}{\text{ADV}}}\]

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_cp"</span><span class="x">]</span> <span class="o">=</span> <span class="n">aggDataTrain</span><span class="o">.</span><span class="n">Vol</span> <span class="o">.*</span> <span class="n">sign</span><span class="o">.</span><span class="x">(</span><span class="n">aggDataTrain</span><span class="o">.</span><span class="n">q</span><span class="x">)</span> <span class="o">.*</span> <span class="n">sqrt</span><span class="o">.</span><span class="x">((</span><span class="n">aggDataTrain</span><span class="o">.</span><span class="n">absq</span> <span class="o">./</span> <span class="n">aggDataTrain</span><span class="o">.</span><span class="n">ADV</span><span class="x">));</span>
<span class="n">aggDataTest</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_cp"</span><span class="x">]</span> <span class="o">=</span> <span class="n">aggDataTest</span><span class="o">.</span><span class="n">Vol</span> <span class="o">.*</span> <span class="n">sign</span><span class="o">.</span><span class="x">(</span><span class="n">aggDataTest</span><span class="o">.</span><span class="n">q</span><span class="x">)</span> <span class="o">.*</span> <span class="n">sqrt</span><span class="o">.</span><span class="x">((</span><span class="n">aggDataTest</span><span class="o">.</span><span class="n">absq</span> <span class="o">./</span> <span class="n">aggDataTest</span><span class="o">.</span><span class="n">ADV</span><span class="x">));</span>
</code></pre></div></div>

<p>Again, we optimise using the same methodology as above.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">res</span> <span class="o">=</span> <span class="n">optimize</span><span class="x">(</span><span class="n">x</span> <span class="o">-&gt;</span> <span class="n">rmse</span><span class="x">(</span><span class="n">fitLambda</span><span class="x">(</span><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_cp"</span><span class="x">],</span> <span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"price_return"</span><span class="x">],</span> <span class="mf">0.01</span><span class="x">,</span> <span class="n">x</span><span class="x">[</span><span class="mi">1</span><span class="x">])),</span> <span class="x">[</span><span class="mf">1.0</span><span class="x">])</span>
<span class="n">lambdaRes</span> <span class="o">=</span> <span class="n">rmse</span><span class="o">.</span><span class="x">(</span><span class="n">fitLambda</span><span class="o">.</span><span class="x">([</span><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_cp"</span><span class="x">]],</span> <span class="x">[</span><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"price_return"</span><span class="x">]],</span> <span class="mf">0.01</span><span class="x">,</span> <span class="mi">0</span><span class="o">:</span><span class="mf">0.1</span><span class="o">:</span><span class="mi">1</span><span class="x">))</span>
<span class="n">plot</span><span class="x">(</span><span class="mi">0</span><span class="o">:</span><span class="mf">0.1</span><span class="o">:</span><span class="mi">1</span><span class="x">,</span> <span class="n">lambdaRes</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="o">:</span><span class="n">none</span><span class="x">,</span> <span class="n">xlabel</span> <span class="o">=</span> <span class="s">L"\lambda"</span><span class="x">,</span> <span class="n">ylabel</span> <span class="o">=</span> <span class="s">"RMSE"</span><span class="x">,</span> <span class="n">title</span> <span class="o">=</span> <span class="s">"Concave Propagator Model"</span><span class="x">)</span>
<span class="n">vline!</span><span class="x">(</span><span class="n">Optim</span><span class="o">.</span><span class="n">minimizer</span><span class="x">(</span><span class="n">res</span><span class="x">),</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Optimised Value"</span><span class="x">)</span>
</code></pre></div></div>

<p><img src="/assets/priceimpact/concaveprop.png" alt="" width="80%" class="center-image" /></p>

<p>Another success! This time the \(R^2\) is 17%, so an improvement on the other two models. Its out-of-sample RMSE is 0.0008.</p>

<h3 id="reduced-form-model">Reduced Form Model</h3>

<p>The paper suggests that as the number of trades and the time increment
increase, the market impact function converges to a linear form with a
dependence on the stochastic volatility of the order flow.</p>

\[\Delta I = -\beta I_t + \lambda \sigma \frac{q_t}{\sqrt{v_t \cdot \text{ADV}}}\]

<p>For this, we need to calculate the stochastic liquidity parameter, \(v_t\), which is simply an exponentially decayed running sum of the absolute market volumes, \(v_t = (1 - \beta) v_{t-1} + | q_t |\).</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">function</span><span class="nf"> calcLiquidity</span><span class="x">(</span><span class="n">absq</span><span class="x">,</span> <span class="n">beta</span><span class="x">)</span>
    <span class="n">v</span> <span class="o">=</span> <span class="n">zeros</span><span class="x">(</span><span class="n">length</span><span class="x">(</span><span class="n">absq</span><span class="x">))</span>
    <span class="n">v</span><span class="x">[</span><span class="mi">1</span><span class="x">]</span> <span class="o">=</span> <span class="n">absq</span><span class="x">[</span><span class="mi">1</span><span class="x">]</span>
    <span class="k">for</span> <span class="n">i</span> <span class="k">in</span> <span class="mi">2</span><span class="o">:</span><span class="n">length</span><span class="x">(</span><span class="n">v</span><span class="x">)</span>
        <span class="n">v</span><span class="x">[</span><span class="n">i</span><span class="x">]</span> <span class="o">=</span> <span class="x">(</span><span class="mi">1</span><span class="o">-</span><span class="n">beta</span><span class="x">)</span><span class="o">*</span><span class="n">v</span><span class="x">[</span><span class="n">i</span><span class="o">-</span><span class="mi">1</span><span class="x">]</span> <span class="o">+</span> <span class="n">absq</span><span class="x">[</span><span class="n">i</span><span class="x">]</span>
    <span class="k">end</span>
    <span class="k">return</span> <span class="n">v</span>
<span class="k">end</span>

<span class="n">v</span> <span class="o">=</span> <span class="n">calcLiquidity</span><span class="x">(</span><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"absq"</span><span class="x">],</span> <span class="mf">0.01</span><span class="x">)</span>
<span class="n">vTest</span> <span class="o">=</span> <span class="n">calcLiquidity</span><span class="x">(</span><span class="n">aggDataTest</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"absq"</span><span class="x">],</span> <span class="mf">0.01</span><span class="x">)</span>

<span class="n">plot</span><span class="x">(</span><span class="n">aggDataTrain</span><span class="o">.</span><span class="n">ts</span><span class="x">,</span> <span class="n">v</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Stochastic Liquidity"</span><span class="x">)</span>
<span class="n">plot!</span><span class="x">(</span><span class="n">aggDataTest</span><span class="o">.</span><span class="n">ts</span><span class="x">,</span> <span class="n">vTest</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Test Set"</span><span class="x">)</span>
</code></pre></div></div>

<p><img src="/assets/priceimpact/stochliq.png" alt="" width="80%" class="center-image" /></p>

<p>Adding this into our data frame and calculating the \(x\) variable is simple.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"v"</span><span class="x">]</span> <span class="o">=</span> <span class="n">v</span>
<span class="n">aggDataTest</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"v"</span><span class="x">]</span> <span class="o">=</span> <span class="n">vTest</span>

<span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_rf"</span><span class="x">]</span> <span class="o">=</span> <span class="n">aggDataTrain</span><span class="o">.</span><span class="n">Vol</span> <span class="o">.*</span> <span class="n">aggDataTrain</span><span class="o">.</span><span class="n">q</span> <span class="o">./</span> <span class="n">sqrt</span><span class="o">.</span><span class="x">((</span><span class="n">aggDataTrain</span><span class="o">.</span><span class="n">ADV</span> <span class="o">.*</span> <span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"v"</span><span class="x">]));</span>
<span class="n">aggDataTest</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_rf"</span><span class="x">]</span> <span class="o">=</span> <span class="n">aggDataTest</span><span class="o">.</span><span class="n">Vol</span> <span class="o">.*</span> <span class="n">aggDataTest</span><span class="o">.</span><span class="n">q</span> <span class="o">./</span>
<span class="n">sqrt</span><span class="o">.</span><span class="x">((</span><span class="n">aggDataTest</span><span class="o">.</span><span class="n">ADV</span> <span class="o">.*</span> <span class="n">aggDataTest</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"v"</span><span class="x">]));</span>
</code></pre></div></div>

<p>And again, we repeat the fitting process.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">lambdaVals</span> <span class="o">=</span> <span class="mi">0</span><span class="o">:</span><span class="mf">0.1</span><span class="o">:</span><span class="mi">5</span>
<span class="n">res</span> <span class="o">=</span> <span class="n">optimize</span><span class="x">(</span><span class="n">x</span> <span class="o">-&gt;</span> <span class="n">rmse</span><span class="x">(</span><span class="n">fitLambda</span><span class="x">(</span><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_rf"</span><span class="x">],</span> <span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"price_return"</span><span class="x">],</span> <span class="mf">0.01</span><span class="x">,</span> <span class="n">x</span><span class="x">[</span><span class="mi">1</span><span class="x">])),</span> <span class="x">[</span><span class="mf">1.0</span><span class="x">])</span>
<span class="n">lambdaRes</span> <span class="o">=</span> <span class="n">rmse</span><span class="o">.</span><span class="x">(</span><span class="n">fitLambda</span><span class="o">.</span><span class="x">([</span><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"x_rf"</span><span class="x">]],</span> <span class="x">[</span><span class="n">aggDataTrain</span><span class="x">[</span><span class="o">!</span><span class="x">,</span> <span class="s">"price_return"</span><span class="x">]],</span> <span class="mf">0.01</span><span class="x">,</span> <span class="n">lambdaVals</span><span class="x">))</span>
<span class="n">plot</span><span class="x">(</span><span class="n">lambdaVals</span><span class="x">,</span> <span class="n">lambdaRes</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="o">:</span><span class="n">none</span><span class="x">,</span> <span class="n">xlabel</span> <span class="o">=</span> <span class="s">L"\lambda"</span><span class="x">,</span> <span class="n">ylabel</span> <span class="o">=</span> <span class="s">"RMSE"</span><span class="x">,</span> <span class="n">title</span> <span class="o">=</span> <span class="s">"Reduced Form Model"</span><span class="x">)</span>
<span class="n">vline!</span><span class="x">(</span><span class="n">Optim</span><span class="o">.</span><span class="n">minimizer</span><span class="x">(</span><span class="n">res</span><span class="x">),</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Optimised Value"</span><span class="x">)</span>
</code></pre></div></div>

<p><img src="/assets/priceimpact/rf.png" alt="" width="80%" class="center-image" /></p>

<p>This model gives an \(R^2 = 10\%\) and out-of-sample RMSE of 0.0009.</p>

<p>With all four models fitted, we can now look at the differences statistically and how the impact state evolves over the course of the day.</p>

<table>
  <thead>
    <tr>
      <th>Model</th>
      <th>\(\lambda\)</th>
      <th>\(R^2\)</th>
      <th>OOS RMSE</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>OFI</td>
      <td>43</td>
      <td>0.11</td>
      <td>0.0003</td>
    </tr>
    <tr>
      <td>OW</td>
      <td>14</td>
      <td>0.11</td>
      <td>0.0006</td>
    </tr>
    <tr>
      <td>Concave Propagator</td>
      <td>0.34</td>
      <td>0.17</td>
      <td>0.0008</td>
    </tr>
    <tr>
      <td>Reduced Form</td>
      <td>1.7</td>
      <td>0.10</td>
      <td>0.0009</td>
    </tr>
  </tbody>
</table>

<p>So, the concave propagator model has the highest \(R^2\) (0.17). The OFI and OW models are tied at 0.11 and the reduced form model is slightly lower at 0.10.
But, looking at the RMSE values from the out-of-sample performance, it’s
clear that the OFI model is the best, with the lowest error (0.0003).</p>

<p>When we plot the resulting impacts from the 4 models we see that
they generally agree, with the OFI model being the most different. This
difference comes from the lack of time decay from the previous volumes.</p>

<p><img src="/assets/priceimpact/priceimpact.png" alt="" width="80%" class="center-image" /></p>

<h2 id="conclusion">Conclusion</h2>

<p>Overall, I don’t think these results are that informative, as my data set is tiny
compared to the paper (1 day vs months). Instead, treat this as more of
an instructional guide on how to fit these models. We didn’t even explore
optimising the time decay (\(\beta\) values) for Bitcoin, which could
be substantially different from the paper’s equities dataset. So
there is plenty more to do!</p>]]></content><author><name>Dean Markwick</name></author><category term="julia" /><category term="quant" /><category term="microstructure" /><summary type="html"><![CDATA[A big part of market microstructure is price impact and understanding how you move the market every time you trade. In the simplest sense, every trade upends the supply and demand of an asset even for a tiny amount of time. The market responds to this change, then responds to the response, then responds to that response, etc. You get the idea. It’s a cascading effect of interactions between all the people in the market.]]></summary></entry><entry><title type="html">Importance Sampling, Reinforcement Learning and Getting More From The Data You Have</title><link href="https://dm13450.github.io/2024/12/17/Importance-Sampling-Reinforcement-Learning-and-Getting-More-From-The-Data-You-Have.html" rel="alternate" type="text/html" title="Importance Sampling, Reinforcement Learning and Getting More From The Data You Have" /><published>2024-12-17T00:00:00+00:00</published><updated>2024-12-17T00:00:00+00:00</updated><id>https://dm13450.github.io/2024/12/17/Importance-Sampling-Reinforcement-Learning-and-Getting-More-From-The-Data-You-Have</id><content type="html" xml:base="https://dm13450.github.io/2024/12/17/Importance-Sampling-Reinforcement-Learning-and-Getting-More-From-The-Data-You-Have.html"><![CDATA[<p>A new paper hit my feed <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5001783">Choosing trading strategies in electronic execution using
importance sampling</a>. I’ve only encountered importance sampling in a statistical computing course during my PhD, and I had never strayed away from plain Monte Carlo sampling, but this practical example gave me an intuitive understanding of its importance and utility.</p>


<p>The key tenet of the paper is to use the data you have to evaluate a strategy you are considering without actually running the new strategy in production. In real life, changing something like these strategies can take a long time, with limited upside but unlimited downside if it all goes wrong.</p>

<p>This blog post will run through the paper and replicate the main themes in Julia. I believe the author is a Julia user too; I remember enjoying their JuliaCon talk about high-frequency covariance matrices - <a href="https://www.youtube.com/watch?v=X_TCI02rgu0">HighFrequencyCovariance: Estimating Covariance Matrices in Julia</a> and the associated Julia package <a href="https://github.com/s-baumann/HighFrequencyCovariance.jl">HighFrequencyCovariance.jl</a>.</p>

<h2 id="the-execution-traders-problem">The Execution Trader’s Problem</h2>

<p>You are an execution trader with access to 4 different broker algorithms (algos) to execute your trade. With each trade you need to choose an algo and measure the trade’s overall slippage - the price you paid vs the price at the start of the order. You want to choose the best algo to ensure each of your trades gets the best price.</p>

<p>How do you choose which one to use? Do you have enough data to decide which one is best? Is any one algo better than the others? These are all difficult questions to answer, but with some data on how the algos perform you should be able to make a more informed decision.</p>

<p>We are trying to maximise the performance of each trade by choosing the correct algo. Our trade is described by a variable \(x\) and each algo performs differently depending on \(x\). The paper calls the performance ‘slippage’ but then tries to maximise the slippage which sounds weird to me - I always talk about minimising slippage! But that’s splitting hairs.</p>

<p>The performance of algo \(i\) is described by an analytical function with parameters \(\alpha _i, \beta _i\) plus some noise that depends on the duration of the trade \(d\) and the volatility \(\sigma\).</p>

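<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Packages needed for the distributions, plots and data frames used below</span>
<span class="k">using</span> <span class="n">Distributions</span><span class="x">,</span> <span class="n">Plots</span><span class="x">,</span> <span class="n">DataFrames</span><span class="x">,</span> <span class="n">DataFramesMeta</span><span class="x">,</span> <span class="n">StatsBase</span>

<span class="c"># Expected slippage: a downward parabola that peaks at x = beta</span>
<span class="k">function</span><span class="nf"> expSlippage</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="n">alpha</span><span class="x">,</span> <span class="n">beta</span><span class="x">)</span>
    <span class="nd">@.</span> <span class="o">-</span><span class="n">alpha</span><span class="o">*</span><span class="x">(</span><span class="n">x</span> <span class="o">-</span> <span class="n">beta</span><span class="x">)</span><span class="o">^</span><span class="mi">2</span>
<span class="k">end</span>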

<span class="k">function</span><span class="nf"> slippage</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="n">alpha</span><span class="x">,</span> <span class="n">beta</span><span class="x">,</span> <span class="n">d</span><span class="x">,</span> <span class="n">sigma</span><span class="x">)</span>
    <span class="n">expSlippage</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="n">alpha</span><span class="x">,</span> <span class="n">beta</span><span class="x">)</span> <span class="o">+</span> <span class="n">rand</span><span class="x">(</span><span class="n">Normal</span><span class="x">(</span><span class="mi">0</span><span class="x">,</span> <span class="n">d</span><span class="o">*</span><span class="n">sigma</span><span class="o">/</span><span class="mi">2</span><span class="x">))</span>
<span class="k">end</span>
</code></pre></div></div>
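
<p>Written out as equations, the two functions above amount to the following. This is my reading of the code rather than a quote from the paper: <code class="language-plaintext highlighter-rouge">Normal</code> from <code class="language-plaintext highlighter-rouge">Distributions.jl</code> takes a standard deviation as its second argument, so the noise has standard deviation \(d \sigma / 2\).</p>

\[\text{Slippage}_i(x) = -\alpha _i (x - \beta _i)^2 + \epsilon, \qquad \epsilon \sim \mathcal{N}\left(0, \left(\frac{d \sigma}{2}\right)^2\right)\]

<p>Each algo’s expected slippage is therefore a downward parabola that peaks at zero when \(x = \beta _i\), with \(\alpha _i\) controlling how quickly performance falls away on either side.</p>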

<p>The \(\alpha\)’s and \(\beta\)’s are simple constants set in the paper.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">alphas</span> <span class="o">=</span> <span class="x">[</span><span class="mi">5</span><span class="x">,</span><span class="mi">10</span><span class="x">,</span><span class="mi">15</span><span class="x">,</span><span class="mi">20</span><span class="x">]</span>
<span class="n">betas</span> <span class="o">=</span> <span class="x">[</span><span class="mf">0.2</span><span class="x">,</span> <span class="mf">0.4</span><span class="x">,</span> <span class="mf">0.6</span><span class="x">,</span> <span class="mf">0.8</span><span class="x">]</span>

<span class="n">x</span> <span class="o">=</span> <span class="n">collect</span><span class="x">(</span><span class="mi">0</span><span class="o">:</span><span class="mf">0.01</span><span class="o">:</span><span class="mi">1</span><span class="x">)</span>
<span class="n">p</span> <span class="o">=</span> <span class="n">plot</span><span class="x">(</span><span class="n">xlabel</span> <span class="o">=</span> <span class="s">"x"</span><span class="x">,</span> <span class="n">ylabel</span> <span class="o">=</span> <span class="s">"Expected Slippage"</span><span class="x">)</span>
<span class="k">for</span> <span class="n">i</span> <span class="k">in</span> <span class="n">eachindex</span><span class="x">(</span><span class="n">alphas</span><span class="x">)</span>
   <span class="n">plot!</span><span class="x">(</span><span class="n">p</span><span class="x">,</span> <span class="n">x</span><span class="x">,</span> <span class="n">expSlippage</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="n">alphas</span><span class="x">[</span><span class="n">i</span><span class="x">],</span> <span class="n">betas</span><span class="x">[</span><span class="n">i</span><span class="x">]),</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Algo "</span> <span class="o">*</span> <span class="n">string</span><span class="x">(</span><span class="n">i</span><span class="x">),</span> <span class="n">lw</span> <span class="o">=</span> <span class="mi">2</span><span class="x">)</span> 
<span class="k">end</span>
<span class="n">p</span>
</code></pre></div></div>

<p><img src="/assets/importancesampling/slippage_functions.png" alt="Slippage functions" title="Slippage functions" width="80%" class="center-image" /></p>

<p>Here we can see which algo is best for each \(x\). In reality, these curves are impossible to know, and such a clean relationship might not even exist.</p>
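
<p>As a rough sketch of what the plot implies, we can ask which algo has the highest expected slippage at a few values of \(x\). The helper <code class="language-plaintext highlighter-rouge">bestAlgo</code> is a hypothetical name of mine, not something from the paper, and it relies on knowing the true curves, which we never do in practice.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Same expected slippage function and constants as above, repeated so this runs on its own
expSlippage(x, alpha, beta) = @. -alpha * (x - beta)^2

alphas = [5, 10, 15, 20]
betas = [0.2, 0.4, 0.6, 0.8]

# Hypothetical helper: index of the algo with the highest expected slippage at x
bestAlgo(x) = argmax([expSlippage(x, alphas[i], betas[i]) for i in eachindex(alphas)])

best = bestAlgo.([0.1, 0.3, 0.5, 0.7, 0.9])
# best == [1, 1, 2, 3, 4]
</code></pre></div></div>

<p>So with these parameters Algo 1 is the best choice for small \(x\) and Algo 4 for large \(x\), with Algos 2 and 3 taking over in between.</p>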

<p>We are going to devise a rule of when we will select each trading algo:</p>

<ul>
  <li>
    <p>If \(x&lt;0.5\) then we will randomly select Algo 1 62.5% of the time and each of the others 12.5% of the time.</p>
  </li>
  <li>
    <p>If \(x \geq 0.5\) then we select Algo 3 62.5% of the time and each of the others 12.5% of the time.</p>
  </li>
</ul>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">function</span><span class="nf"> tradingRule</span><span class="x">(</span><span class="n">x</span><span class="x">)</span>
    <span class="k">if</span> <span class="n">x</span> <span class="o">&lt;</span> <span class="mf">0.5</span>
        <span class="k">return</span> <span class="x">[</span><span class="mf">0.625</span><span class="x">,</span> <span class="mf">0.125</span><span class="x">,</span> <span class="mf">0.125</span><span class="x">,</span> <span class="mf">0.125</span><span class="x">]</span>
    <span class="k">else</span> 
        <span class="k">return</span> <span class="x">[</span><span class="mf">0.125</span><span class="x">,</span> <span class="mf">0.125</span><span class="x">,</span> <span class="mf">0.625</span><span class="x">,</span> <span class="mf">0.125</span><span class="x">]</span>
    <span class="k">end</span>
<span class="k">end</span>
</code></pre></div></div>
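
<p>A quick sanity check on the rule, as a sketch: each branch should be a valid probability vector, and sampling from it with <code class="language-plaintext highlighter-rouge">Categorical</code> should pick the favoured algo roughly 62.5% of the time. The seed and the 10,000 draws are arbitrary choices of mine.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using Distributions, Random, Statistics

# Same rule as above, repeated so this snippet runs on its own
tradingRule(x) = x &lt; 0.5 ? [0.625, 0.125, 0.125, 0.125] : [0.125, 0.125, 0.625, 0.125]

# Both branches should sum to one
probSums = [sum(tradingRule(0.2)), sum(tradingRule(0.8))]

# Draw many algo choices for a single small-x trade
Random.seed!(42)
draws = rand(Categorical(tradingRule(0.2)), 10_000)

# Share of draws that picked Algo 1, which should be close to 0.625
freqAlgo1 = mean(draws .== 1)
</code></pre></div></div>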

<p>Julia’s broadcasting makes it easy to simulate many trades at once.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">x</span> <span class="o">=</span> <span class="n">rand</span><span class="x">(</span><span class="n">Uniform</span><span class="x">(),</span> <span class="mi">100</span><span class="x">)</span>
<span class="n">d</span> <span class="o">=</span> <span class="n">rand</span><span class="x">(</span><span class="n">Uniform</span><span class="x">(),</span> <span class="mi">100</span><span class="x">)</span>
<span class="n">stratProbs</span> <span class="o">=</span> <span class="n">tradingRule</span><span class="o">.</span><span class="x">(</span><span class="n">x</span><span class="x">)</span>
<span class="n">strat</span> <span class="o">=</span> <span class="n">rand</span><span class="o">.</span><span class="x">(</span><span class="n">Categorical</span><span class="o">.</span><span class="x">(</span><span class="n">stratProbs</span><span class="x">))</span>
<span class="n">stratProb</span> <span class="o">=</span> <span class="n">getindex</span><span class="o">.</span><span class="x">(</span><span class="n">stratProbs</span><span class="x">,</span> <span class="n">strat</span><span class="x">)</span>
<span class="n">slippageVal</span> <span class="o">=</span> <span class="n">slippage</span><span class="o">.</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="n">alphas</span><span class="x">[</span><span class="n">strat</span><span class="x">],</span> <span class="n">betas</span><span class="x">[</span><span class="n">strat</span><span class="x">],</span> <span class="n">d</span><span class="x">,</span> <span class="mi">5</span><span class="x">)</span>

<span class="n">res</span> <span class="o">=</span> <span class="n">DataFrame</span><span class="x">(</span><span class="n">x</span><span class="o">=</span><span class="n">x</span><span class="x">,</span> <span class="n">d</span><span class="o">=</span><span class="n">d</span><span class="x">,</span> <span class="n">strat</span><span class="o">=</span><span class="n">strat</span><span class="x">,</span> <span class="n">stratProb</span><span class="o">=</span><span class="n">stratProb</span><span class="x">,</span> <span class="n">prob</span><span class="o">=</span><span class="n">stratProb</span><span class="x">,</span> <span class="n">slippage</span><span class="o">=</span><span class="n">slippageVal</span><span class="x">)</span>
<span class="n">first</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="mi">3</span><span class="x">)</span>
</code></pre></div></div>

<table>
  <thead>
    <tr>
      <th style="text-align: right"><strong>x</strong></th>
      <th style="text-align: right"><strong>d</strong></th>
      <th style="text-align: right"><strong>strat</strong></th>
      <th style="text-align: right"><strong>stratProb</strong></th>
      <th style="text-align: right"><strong>prob</strong></th>
      <th style="text-align: right"><strong>slippage</strong></th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="text-align: right">0.0192748</td>
      <td style="text-align: right">0.95432</td>
      <td style="text-align: right">1</td>
      <td style="text-align: right">0.625</td>
      <td style="text-align: right">0.625</td>
      <td style="text-align: right">1.29969</td>
    </tr>
    <tr>
      <td style="text-align: right">0.0700494</td>
      <td style="text-align: right">0.930581</td>
      <td style="text-align: right">1</td>
      <td style="text-align: right">0.625</td>
      <td style="text-align: right">0.625</td>
      <td style="text-align: right">0.855019</td>
    </tr>
    <tr>
      <td style="text-align: right">0.925858</td>
      <td style="text-align: right">0.90087</td>
      <td style="text-align: right">3</td>
      <td style="text-align: right">0.625</td>
      <td style="text-align: right">0.625</td>
      <td style="text-align: right">-2.62943</td>
    </tr>
  </tbody>
</table>

<p>This is our ‘production data’ for 100 random trades. The aim of the game is to understand how good our trading rules are rather than trying to estimate how good the individual algos are.</p>

<p>Does our rule above do better than just randomly choosing an algo? This is where we can use importance sampling to take the 100 trades and specially weight them to assess a new trading rule.</p>

<h2 id="importance-sampling">Importance Sampling</h2>

<p>Importance sampling lets us estimate an expectation under one set of probabilities \(p\) using observations that were generated under a different set of probabilities \(q\), by weighting each observation by the ratio \(p/q\). In our case we want to calculate the expected slippage of a new trading strategy given the observations we have of the current strategy.</p>

\[\mathbb{E} [\text{Slippage}] = \frac{1}{N} \sum _i \text{Slippage}_i \frac{p_i(\text{New Strategy})}{q_i(\text{Current Strategy})}\]

<p>\(q_i(\text{Current Strategy})\) is equal to the <code class="language-plaintext highlighter-rouge">stratProb</code> column in the dataframe and \(p_i\) is the probability we would have chosen the given algo under the new strategy.</p>
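
<p>To make the formula concrete, here is a small sketch on four made-up trades (the numbers are purely illustrative and not from the paper). It computes the estimator exactly as written above, dividing by \(N\), alongside a self-normalised variant that divides by the sum of the ratios instead. The self-normalised version is what a weighted mean with <code class="language-plaintext highlighter-rouge">Weights</code> from <code class="language-plaintext highlighter-rouge">StatsBase.jl</code> computes.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using Statistics, StatsBase

# Made-up slippages and the probability with which each chosen algo was actually picked
slippage = [1.0, -2.0, 0.5, -1.0]
q = [0.625, 0.125, 0.625, 0.125]

# New strategy: pick each of the four algos with equal probability
p = fill(0.25, 4)
ratio = p ./ q

# Estimator as in the formula: average of slippage times likelihood ratio
plainIS = mean(slippage .* ratio)             # roughly -1.35

# Self-normalised estimator: weighted mean, dividing by sum(ratio) instead of N
selfNormIS = mean(slippage, Weights(ratio))   # roughly -1.125
</code></pre></div></div>

<p>The two agree when the ratios happen to sum to \(N\). With only a handful of trades they can differ noticeably, as they do here where the ratios sum to 4.8 rather than 4.</p>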

<p>For the importance sampling, we take equal probabilities (25% per algo) as the new strategy, calculate the likelihood ratio against the production probabilities, and then take the ratio-weighted average of the slippages.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">res</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="n">EqProb</span> <span class="o">=</span> <span class="mf">0.25</span><span class="x">)</span>
<span class="n">res</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="n">ratio</span> <span class="o">=</span> <span class="o">:</span><span class="n">EqProb</span> <span class="o">./</span> <span class="o">:</span><span class="n">stratProb</span><span class="x">)</span>
<span class="nd">@combine</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="n">StratSlippage</span> <span class="o">=</span> <span class="n">mean</span><span class="x">(</span><span class="o">:</span><span class="n">slippage</span><span class="x">),</span> <span class="o">:</span><span class="n">EqStratSlippage</span> <span class="o">=</span> <span class="n">mean</span><span class="x">(</span><span class="o">:</span><span class="n">slippage</span><span class="x">,</span> <span class="n">Weights</span><span class="x">(</span><span class="o">:</span><span class="n">ratio</span><span class="x">)))</span>
</code></pre></div></div>

<table>
  <thead>
    <tr>
      <th style="text-align: right"><strong>StratSlippage</strong></th>
      <th style="text-align: right"><strong>EqStratSlippage</strong></th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="text-align: right">-1.02243</td>
      <td style="text-align: right">-1.8774</td>
    </tr>
  </tbody>
</table>

<p>The importance-sampled average slippage of the equally weighted strategy over the 100 trades is worse (more negative) than that of the current strategy, -1.88 vs -1.02. This suggests that randomly choosing would perform <em>worse</em>.</p>

<p>Next, we plot the running average slippage across the orders.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">res</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="n">StratSlipapgeRolling</span> <span class="o">=</span> <span class="n">cumsum</span><span class="x">(</span><span class="o">:</span><span class="n">slippage</span><span class="x">)</span> <span class="o">./</span><span class="n">collect</span><span class="x">(</span><span class="mi">1</span><span class="o">:</span><span class="n">length</span><span class="x">(</span><span class="o">:</span><span class="n">slippage</span><span class="x">)))</span>
<span class="n">res</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="n">EqSlipapgeRolling</span> <span class="o">=</span> <span class="n">cumsum</span><span class="x">(</span><span class="o">:</span><span class="n">slippage</span> <span class="o">.*</span> <span class="o">:</span><span class="n">ratio</span><span class="x">)</span> <span class="o">./</span><span class="n">cumsum</span><span class="x">(</span><span class="o">:</span><span class="n">ratio</span><span class="x">))</span>

<span class="n">plot</span><span class="x">(</span><span class="n">res</span><span class="o">.</span><span class="n">StratSlipapgeRolling</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Production"</span><span class="x">,</span> <span class="n">lw</span> <span class="o">=</span><span class="mi">2</span><span class="x">)</span>
<span class="n">plot!</span><span class="x">(</span><span class="n">res</span><span class="o">.</span><span class="n">EqSlipapgeRolling</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Equal Weighted"</span><span class="x">,</span> <span class="n">lw</span> <span class="o">=</span><span class="mi">2</span><span class="x">)</span>
</code></pre></div></div>

<p><img src="/assets/importancesampling/simplestrat.png" alt="Simple strategy slippage" title="Simple strategy slippage" width="80%" class="center-image" /></p>

<p>The time series of the slippage shows that the equally weighted strategy is worse, which gives us confidence in the current strategy. When we observe a bad outcome, the likelihood ratio weights that outcome by how much more or less likely the new strategy is to choose that algo than the production strategy.</p>

<p>How can we use importance sampling to build better strategies?</p>

<h2 id="easy-reinforcement-learning-and-expected-slippage">Easy Reinforcement Learning and Expected Slippage</h2>

<p>Each trade is described by \(x\). In this toy model that is just a number but in real life this could correspond to the size of the order, the asset, the time of day and any combination of variables. In the original paper they use the spread, volatility, order size relative to the ADV and duration as descriptive variables of a random dataset. I’m going to keep it simple and stick to \(x\) being just a single number.</p>

<p>We want to understand if a particular \(x\) means we should use algo \(i\). For this, we need to build an ‘expected slippage’ model where we use the historical \(x\) values and outcomes of using algo \(i\).</p>

<p>For the modelling part, we will use <code class="language-plaintext highlighter-rouge">xgboost</code> through <code class="language-plaintext highlighter-rouge">MLJ.jl</code>.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">using</span> <span class="n">MLJ</span>
<span class="n">xgboostModel</span> <span class="o">=</span> <span class="nd">@load</span> <span class="n">XGBoostRegressor</span> <span class="n">pkg</span><span class="o">=</span><span class="n">XGBoost</span> <span class="n">verbosity</span> <span class="o">=</span> <span class="mi">0</span>
<span class="n">xgboostmodel</span> <span class="o">=</span> <span class="n">xgboostModel</span><span class="x">(</span><span class="n">eval_metric</span><span class="o">=</span><span class="x">[</span><span class="s">"rmse"</span><span class="x">]);</span>
</code></pre></div></div>

<p>The inputs are \(x\) and an indicator of the chosen algo.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">res2</span> <span class="o">=</span> <span class="n">coerce</span><span class="x">(</span><span class="n">res</span><span class="x">[</span><span class="o">:</span><span class="x">,[</span><span class="o">:</span><span class="n">x</span><span class="x">,</span> <span class="o">:</span><span class="n">strat</span><span class="x">,</span> <span class="o">:</span><span class="n">slippage</span><span class="x">]],</span> <span class="o">:</span><span class="n">strat</span><span class="o">=&gt;</span><span class="n">Multiclass</span><span class="x">);</span>

<span class="n">y</span><span class="x">,</span> <span class="n">X</span> <span class="o">=</span> <span class="n">unpack</span><span class="x">(</span><span class="n">res2</span><span class="x">,</span> <span class="o">==</span><span class="x">(</span><span class="o">:</span><span class="n">slippage</span><span class="x">);</span> <span class="n">rng</span><span class="o">=</span><span class="mi">123</span><span class="x">);</span>

<span class="n">encoder</span> <span class="o">=</span> <span class="n">ContinuousEncoder</span><span class="x">()</span>
<span class="n">encMach</span> <span class="o">=</span> <span class="n">machine</span><span class="x">(</span><span class="n">encoder</span><span class="x">,</span> <span class="n">X</span><span class="x">)</span> <span class="o">|&gt;</span> <span class="n">fit!</span>
<span class="n">X_encoded</span> <span class="o">=</span> <span class="n">MLJ</span><span class="o">.</span><span class="n">transform</span><span class="x">(</span><span class="n">encMach</span><span class="x">,</span> <span class="n">X</span><span class="x">);</span>

<span class="n">xgbMachine</span> <span class="o">=</span> <span class="n">machine</span><span class="x">(</span><span class="n">xgboostmodel</span><span class="x">,</span> <span class="n">X_encoded</span><span class="x">,</span> <span class="n">y</span><span class="x">)</span>

<span class="n">evaluate!</span><span class="x">(</span><span class="n">xgbMachine</span><span class="x">,</span>
          <span class="n">resampling</span><span class="o">=</span><span class="n">CV</span><span class="x">(</span><span class="n">nfolds</span> <span class="o">=</span> <span class="mi">6</span><span class="x">,</span> <span class="n">shuffle</span><span class="o">=</span><span class="nb">true</span><span class="x">),</span>
          <span class="n">measures</span><span class="o">=</span><span class="x">[</span><span class="n">rmse</span><span class="x">,</span> <span class="n">rsq</span><span class="x">],</span>
          <span class="n">verbosity</span><span class="o">=</span><span class="mi">0</span><span class="x">)</span>
</code></pre></div></div>
<p>The overall regression gets an \(R^2\) of 0.5 on our 100 trade dataset - a decent model.</p>

<p>In this new simulation, we will fit the xgboost model on the trades to build up an expected slippage model with all the data we have so far. <code class="language-plaintext highlighter-rouge">prepareData</code> and <code class="language-plaintext highlighter-rouge">fitSlippage</code> transform the data and fit the model.</p>

<p>We will then use this model to predict the expected slippage (<code class="language-plaintext highlighter-rouge">predictSlippage</code>) for each algo and use that to select which algo to use for a given trade.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">function</span><span class="nf"> prepareData</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="n">strat</span><span class="x">,</span> <span class="n">slippage</span><span class="x">)</span>
    <span class="n">res</span> <span class="o">=</span> <span class="n">coerce</span><span class="x">(</span><span class="n">DataFrame</span><span class="x">(</span><span class="n">x</span><span class="o">=</span><span class="n">x</span><span class="x">,</span> <span class="n">strat</span><span class="o">=</span><span class="n">strat</span><span class="x">,</span> <span class="n">slippage</span><span class="o">=</span><span class="n">slippage</span><span class="x">),</span> <span class="o">:</span><span class="n">strat</span><span class="o">=&gt;</span><span class="n">Multiclass</span><span class="x">);</span>
    <span class="n">y</span><span class="x">,</span> <span class="n">X</span> <span class="o">=</span> <span class="n">unpack</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">==</span><span class="x">(</span><span class="o">:</span><span class="n">slippage</span><span class="x">);</span> <span class="n">rng</span><span class="o">=</span><span class="mi">123</span><span class="x">);</span>
    <span class="n">encoder</span> <span class="o">=</span> <span class="n">ContinuousEncoder</span><span class="x">()</span>
    <span class="n">encMach</span> <span class="o">=</span> <span class="n">machine</span><span class="x">(</span><span class="n">encoder</span><span class="x">,</span> <span class="n">X</span><span class="x">)</span> <span class="o">|&gt;</span> <span class="n">fit!</span>
    <span class="n">X_encoded</span> <span class="o">=</span> <span class="n">MLJ</span><span class="o">.</span><span class="n">transform</span><span class="x">(</span><span class="n">encMach</span><span class="x">,</span> <span class="n">X</span><span class="x">);</span>
    <span class="c"># Return the fitted encoder as well so fitSlippage can hand it back</span>
    <span class="k">return</span> <span class="n">X_encoded</span><span class="x">,</span> <span class="n">y</span><span class="x">,</span> <span class="n">encMach</span>
<span class="k">end</span>

<span class="k">function</span><span class="nf"> fitSlippage</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="n">strat</span><span class="x">,</span> <span class="n">slippage</span><span class="x">,</span> <span class="n">xgboostmodel</span><span class="x">)</span>
    <span class="n">X_encoded</span><span class="x">,</span> <span class="n">y</span><span class="x">,</span> <span class="n">encMach</span> <span class="o">=</span> <span class="n">prepareData</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="n">strat</span><span class="x">,</span> <span class="n">slippage</span><span class="x">)</span>
    <span class="n">xgbMachine</span> <span class="o">=</span> <span class="n">machine</span><span class="x">(</span><span class="n">xgboostmodel</span><span class="x">,</span> <span class="n">X_encoded</span><span class="x">,</span> <span class="n">y</span><span class="x">)</span>

    <span class="n">evaluate!</span><span class="x">(</span><span class="n">xgbMachine</span><span class="x">,</span>
          <span class="n">resampling</span><span class="o">=</span><span class="n">CV</span><span class="x">(</span><span class="n">nfolds</span> <span class="o">=</span> <span class="mi">6</span><span class="x">,</span> <span class="n">shuffle</span><span class="o">=</span><span class="nb">true</span><span class="x">),</span>
          <span class="n">measures</span><span class="o">=</span><span class="x">[</span><span class="n">rmse</span><span class="x">,</span> <span class="n">rsq</span><span class="x">],</span>
          <span class="n">verbosity</span><span class="o">=</span><span class="mi">0</span><span class="x">)</span>
    <span class="k">return</span> <span class="x">(</span><span class="n">xgbMachine</span><span class="x">,</span> <span class="n">encMach</span><span class="x">)</span>
<span class="k">end</span>

<span class="k">function</span><span class="nf"> predictSlippage</span><span class="x">(</span><span class="n">x</span><span class="x">,</span> <span class="n">xgbMachine</span><span class="x">,</span> <span class="n">encMachine</span><span class="x">)</span>
    <span class="n">X_pred</span> <span class="o">=</span> <span class="n">DataFrame</span><span class="x">(</span><span class="n">x</span><span class="o">=</span><span class="n">x</span><span class="x">,</span> <span class="n">strat</span> <span class="o">=</span> <span class="x">[</span><span class="mi">1</span><span class="x">,</span><span class="mi">2</span><span class="x">,</span><span class="mi">3</span><span class="x">,</span><span class="mi">4</span><span class="x">],</span> <span class="n">slippage</span> <span class="o">=</span> <span class="nb">NaN</span><span class="x">)</span>
    <span class="n">X_pred</span> <span class="o">=</span> <span class="n">coerce</span><span class="x">(</span><span class="n">X_pred</span><span class="x">[</span><span class="o">:</span><span class="x">,[</span><span class="o">:</span><span class="n">x</span><span class="x">,</span> <span class="o">:</span><span class="n">strat</span><span class="x">,</span> <span class="o">:</span><span class="n">slippage</span><span class="x">]],</span> <span class="o">:</span><span class="n">strat</span><span class="o">=&gt;</span><span class="n">Multiclass</span><span class="x">)</span>
    <span class="n">X_pred</span> <span class="o">=</span> <span class="n">MLJ</span><span class="o">.</span><span class="n">transform</span><span class="x">(</span><span class="n">encMachine</span><span class="x">,</span> <span class="n">X_pred</span><span class="x">)</span>
    <span class="n">preds</span> <span class="o">=</span> <span class="n">MLJ</span><span class="o">.</span><span class="n">predict</span><span class="x">(</span><span class="n">xgbMachine</span><span class="x">,</span> <span class="n">X_pred</span><span class="x">)</span>
    <span class="k">return</span><span class="x">(</span><span class="n">preds</span><span class="x">)</span>
<span class="k">end</span>

<span class="k">function</span><span class="nf"> slippageToProb</span><span class="x">(</span><span class="n">preds</span><span class="x">)</span>
    <span class="n">scores</span> <span class="o">=</span> <span class="n">exp</span><span class="o">.</span><span class="x">(</span><span class="n">preds</span><span class="x">)</span> <span class="o">./</span> <span class="n">sum</span><span class="x">(</span><span class="n">exp</span><span class="o">.</span><span class="x">(</span><span class="n">preds</span><span class="x">))</span>
    <span class="n">p</span> <span class="o">=</span> <span class="x">((</span><span class="mf">0.9</span> <span class="o">.*</span> <span class="n">scores</span><span class="x">)</span> <span class="o">.+</span> <span class="mf">0.025</span><span class="x">)</span> <span class="o">./</span> <span class="n">sum</span><span class="x">((</span><span class="mf">0.9</span> <span class="o">.*</span> <span class="n">scores</span><span class="x">)</span> <span class="o">.+</span> <span class="mf">0.025</span><span class="x">)</span> 
    <span class="k">return</span> <span class="n">p</span>
<span class="k">end</span>
</code></pre></div></div>

<p>The predicted slippage is then transformed into a probability using the softmax function (<code class="language-plaintext highlighter-rouge">slippageToProb</code>), which maps the real-valued slippage estimates onto probabilities. The function also mixes in a floor of 0.025 so that every strategy keeps a non-zero chance of being picked. We then sample which strategy to use from these probabilities. Adding this element of randomness to the algo selection means we can later use the importance sampling framework to assess either a change of model (XGBoost to something else) or a change in how we build the probabilities (softmax to something else).</p>
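
<p>As a rough sketch of that step in isolation, the snippet below pushes a made-up vector of predicted slippages through the same transformation and then samples a strategy. The numbers in <code class="language-plaintext highlighter-rouge">preds</code> are purely illustrative, and the sampling uses an inverse-CDF lookup from the standard library as a stand-in for <code class="language-plaintext highlighter-rouge">rand(Categorical(...))</code>, so nothing beyond base Julia is needed.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using Random

# Hypothetical predicted slippages for the four strategies (illustrative numbers only)
preds = [-1.2, -1.6, -1.5, -2.0]

# Same transformation as slippageToProb: softmax, then mix in a 0.025 floor per strategy
scores = exp.(preds) ./ sum(exp.(preds))
p = (0.9 .* scores .+ 0.025) ./ sum(0.9 .* scores .+ 0.025)

# Sample a strategy index by inverting the cumulative distribution
# (stand-in for rand(Categorical(p)) from Distributions.jl)
rng = MersenneTwister(42)
stratVal = min(searchsortedfirst(cumsum(p), rand(rng)), length(p))

# Probability with which the chosen strategy was picked; this is what gets logged
stratProb = p[stratVal]
</code></pre></div></div>

<p>With four strategies the floor terms add up to 0.1, so the final normalisation leaves the values unchanged and no strategy can fall below a 2.5% chance of being selected. The strategy with the highest predicted slippage value gets the largest probability.</p>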

<p>To simulate the problem we will start by randomly choosing a strategy for the first 200 runs. After this we will start using the xgboost regression model to predict the expected slippage of each strategy and use this to decide what strategy to use.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">epsilon</span> <span class="o">=</span> <span class="mf">0.05</span>
<span class="n">volatility</span> <span class="o">=</span> <span class="mi">5</span>
<span class="n">N</span> <span class="o">=</span> <span class="mi">1000</span>

<span class="n">x</span> <span class="o">=</span> <span class="n">zeros</span><span class="x">(</span><span class="n">N</span><span class="x">)</span>
<span class="n">strat</span> <span class="o">=</span> <span class="n">zeros</span><span class="x">(</span><span class="n">N</span><span class="x">)</span>
<span class="n">slippages</span> <span class="o">=</span> <span class="n">zeros</span><span class="x">(</span><span class="n">N</span><span class="x">)</span>
<span class="n">d</span> <span class="o">=</span> <span class="n">zeros</span><span class="x">(</span><span class="n">N</span><span class="x">)</span>
<span class="n">stratProb</span> <span class="o">=</span> <span class="n">zeros</span><span class="x">(</span><span class="n">N</span><span class="x">)</span>

<span class="k">for</span> <span class="n">i</span> <span class="k">in</span> <span class="mi">1</span><span class="o">:</span><span class="n">N</span>
    <span class="n">xVal</span> <span class="o">=</span> <span class="n">rand</span><span class="x">(</span><span class="n">Uniform</span><span class="x">())</span>
    <span class="n">dVal</span> <span class="o">=</span> <span class="n">rand</span><span class="x">(</span><span class="n">Uniform</span><span class="x">())</span>

    <span class="k">if</span> <span class="n">i</span> <span class="o">&gt;</span> <span class="mi">200</span>
        <span class="c"># Fit on completed trades only: row i is not filled in until the end of this iteration</span>
        <span class="n">xgbMachine</span><span class="x">,</span> <span class="n">encMachine</span> <span class="o">=</span> <span class="n">fitSlippage</span><span class="x">(</span><span class="n">x</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="x">(</span><span class="n">i</span><span class="o">-</span><span class="mi">1</span><span class="x">)],</span> <span class="n">strat</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="x">(</span><span class="n">i</span><span class="o">-</span><span class="mi">1</span><span class="x">)],</span> <span class="n">slippages</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="x">(</span><span class="n">i</span><span class="o">-</span><span class="mi">1</span><span class="x">)],</span> <span class="n">xgboostmodel</span><span class="x">)</span>
        <span class="n">predCost</span> <span class="o">=</span> <span class="n">predictSlippage</span><span class="x">(</span><span class="n">xVal</span><span class="x">,</span> <span class="n">xgbMachine</span><span class="x">,</span> <span class="n">encMachine</span><span class="x">)</span>
        <span class="n">stratProbs</span> <span class="o">=</span> <span class="n">slippageToProb</span><span class="x">(</span><span class="n">predCost</span><span class="x">)</span>
    <span class="k">else</span>
        <span class="n">stratProbs</span> <span class="o">=</span> <span class="x">[</span><span class="mf">0.25</span><span class="x">,</span> <span class="mf">0.25</span><span class="x">,</span> <span class="mf">0.25</span><span class="x">,</span> <span class="mf">0.25</span><span class="x">]</span>
    <span class="k">end</span>

    <span class="n">stratVal</span> <span class="o">=</span> <span class="n">rand</span><span class="x">(</span><span class="n">Categorical</span><span class="x">(</span><span class="n">stratProbs</span><span class="x">))</span>
    <span class="n">slippageVal</span> <span class="o">=</span> <span class="n">slippage</span><span class="x">(</span><span class="n">xVal</span><span class="x">,</span> <span class="n">alphas</span><span class="x">[</span><span class="n">stratVal</span><span class="x">],</span> <span class="n">betas</span><span class="x">[</span><span class="n">stratVal</span><span class="x">],</span> <span class="n">dVal</span><span class="x">,</span> <span class="n">volatility</span><span class="x">)</span>
    
    <span class="n">x</span><span class="x">[</span><span class="n">i</span><span class="x">]</span> <span class="o">=</span> <span class="n">xVal</span>
    <span class="n">strat</span><span class="x">[</span><span class="n">i</span><span class="x">]</span> <span class="o">=</span> <span class="n">stratVal</span>
    <span class="n">stratProb</span><span class="x">[</span><span class="n">i</span><span class="x">]</span> <span class="o">=</span> <span class="n">stratProbs</span><span class="x">[</span><span class="n">stratVal</span><span class="x">]</span>
    <span class="n">slippages</span><span class="x">[</span><span class="n">i</span><span class="x">]</span> <span class="o">=</span> <span class="n">slippageVal</span>
    <span class="n">d</span><span class="x">[</span><span class="n">i</span><span class="x">]</span> <span class="o">=</span> <span class="n">dVal</span>
<span class="k">end</span>

<span class="n">res</span> <span class="o">=</span> <span class="n">DataFrame</span><span class="x">(</span><span class="n">x</span><span class="o">=</span><span class="n">x</span><span class="x">,</span> <span class="n">d</span><span class="o">=</span><span class="n">d</span><span class="x">,</span> <span class="n">strat</span><span class="o">=</span><span class="n">strat</span><span class="x">,</span> <span class="n">stratProb</span><span class="o">=</span><span class="n">stratProb</span><span class="x">,</span> <span class="n">slippage</span><span class="o">=</span><span class="n">slippages</span><span class="x">)</span>
</code></pre></div></div>

<p>Again, we record each strategy and the probability with which it was chosen. We then use importance sampling to estimate the slippage we would have seen from choosing an algo uniformly at random, which gives us a baseline to compare against the XGBoost method.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">res</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="n">EqProb</span> <span class="o">=</span> <span class="mf">0.25</span><span class="x">)</span>
<span class="n">res</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="n">EqRatio</span> <span class="o">=</span> <span class="o">:</span><span class="n">EqProb</span> <span class="o">./</span> <span class="o">:</span><span class="n">stratProb</span><span class="x">)</span>
<span class="n">res</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="n">StratSlipapgeRolling</span> <span class="o">=</span> <span class="n">cumsum</span><span class="x">(</span><span class="o">:</span><span class="n">slippage</span><span class="x">)</span> <span class="o">./</span><span class="n">collect</span><span class="x">(</span><span class="mi">1</span><span class="o">:</span><span class="n">length</span><span class="x">(</span><span class="o">:</span><span class="n">slippage</span><span class="x">)))</span>
<span class="n">res</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="n">EqSlipapgeRolling</span> <span class="o">=</span> <span class="n">cumsum</span><span class="x">(</span><span class="o">:</span><span class="n">slippage</span> <span class="o">.*</span> <span class="o">:</span><span class="n">EqRatio</span><span class="x">)</span> <span class="o">./</span><span class="n">cumsum</span><span class="x">(</span><span class="o">:</span><span class="n">EqRatio</span><span class="x">));</span>

<span class="n">plot</span><span class="x">(</span><span class="n">res</span><span class="o">.</span><span class="n">StratSlipapgeRolling</span><span class="x">[</span><span class="mi">50</span><span class="o">:</span><span class="k">end</span><span class="x">],</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Production"</span><span class="x">)</span>
<span class="n">plot!</span><span class="x">(</span><span class="n">res</span><span class="o">.</span><span class="n">EqSlipapgeRolling</span><span class="x">[</span><span class="mi">50</span><span class="o">:</span><span class="k">end</span><span class="x">],</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Equal Weighting"</span><span class="x">)</span>
</code></pre></div></div>
<p><img src="/assets/importancesampling/modelslippage.png" alt="model slippage" width="80%" class="center-image" /></p>

<p>For the first 200 trades we are just selecting randomly, so there is no difference in performance. Afterwards, the XGBoost model starts to outperform as it learns which algo is better for each \(x\).
So although we have only ever run the XGBoost model in production, the importance sampling method shows that it is doing better than random selection.</p>

<h2 id="testing-a-new-model-without-running-it-in-production">Testing a New Model Without Running it in Production</h2>

<p>The XGBoost model is doing well and out-performing an equal weighted model, but what if you wanted to change from XGBoost to something else? How can you build the case that this is something worth doing?</p>

<p>By computing the probability that the new model would have selected each strategy that was actually used (the new \(p_i\)’s) and combining it with the probability under the current model (the \(q_i\)’s), we can estimate the slippage of the new model without having to run any more trades.</p>
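
<p>To make the reweighting concrete, here is a minimal sketch of the self-normalised importance sampling estimate. The function name <code class="language-plaintext highlighter-rouge">isEstimate</code> and the numbers are made up for illustration; the calculation itself is the same ratio-weighted average used for the <code class="language-plaintext highlighter-rouge">EqRatio</code> columns above, just without the rolling <code class="language-plaintext highlighter-rouge">cumsum</code>.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Self-normalised importance sampling estimate of the mean slippage under a new policy.
# slippage: observed slippage of each trade
# p: probability the new policy would have picked the strategy that was actually used
# q: probability the production policy picked it (must be non-zero)
function isEstimate(slippage, p, q)
    w = p ./ q                          # importance weights
    return sum(w .* slippage) / sum(w)  # weighted average of the observed slippage
end

# Illustrative numbers only
slippage = [-1.0, -2.0, -1.5, -0.5]
q = [0.4, 0.1, 0.25, 0.25]
pEq = fill(0.25, 4)

eqEstimate = isEstimate(slippage, pEq, q)   # as if strategies had been picked uniformly
sameEstimate = isEstimate(slippage, q, q)   # p equal to q recovers the plain average
</code></pre></div></div>

<p>Trades that the new policy would have chosen more often than production did get a weight above one, and trades it would have chosen less often are down-weighted. This only works because production never assigns a strategy zero probability.</p>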

<p>With <code class="language-plaintext highlighter-rouge">MLJ.jl</code> we can create a new model and pass it into the functions to replicate running the strategy in production. This time we use a simple linear regression model with the same features. We run through the trades in the same order so there is no information leakage.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">@load</span> <span class="n">LinearRegressor</span> <span class="n">pkg</span><span class="o">=</span><span class="n">MLJLinearModels</span>

<span class="n">linreg</span> <span class="o">=</span> <span class="n">MLJLinearModels</span><span class="o">.</span><span class="n">LinearRegressor</span><span class="x">()</span>

<span class="n">newProb</span> <span class="o">=</span> <span class="n">ones</span><span class="x">(</span><span class="n">N</span><span class="x">)</span> <span class="o">*</span> <span class="mf">0.25</span>

<span class="k">for</span> <span class="n">i</span> <span class="k">in</span> <span class="mi">1</span><span class="o">:</span><span class="x">(</span><span class="n">N</span><span class="o">-</span><span class="mi">1</span><span class="x">)</span>

    <span class="k">if</span> <span class="n">i</span> <span class="o">&gt;</span> <span class="mi">200</span>
        <span class="n">linMachine</span><span class="x">,</span> <span class="n">enchMachine</span> <span class="o">=</span> <span class="n">fitSlippage</span><span class="x">(</span><span class="n">res</span><span class="o">.</span><span class="n">x</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="n">i</span><span class="x">],</span> <span class="n">res</span><span class="o">.</span><span class="n">strat</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="n">i</span><span class="x">],</span> <span class="n">res</span><span class="o">.</span><span class="n">slippage</span><span class="x">[</span><span class="mi">1</span><span class="o">:</span><span class="n">i</span><span class="x">],</span> <span class="n">linreg</span><span class="x">)</span>
        <span class="n">predSlippage</span> <span class="o">=</span> <span class="n">predictSlippage</span><span class="x">(</span><span class="n">res</span><span class="o">.</span><span class="n">x</span><span class="x">[</span><span class="n">i</span><span class="o">+</span><span class="mi">1</span><span class="x">],</span> <span class="n">linMachine</span><span class="x">,</span> <span class="n">enchMachine</span><span class="x">)</span>
        <span class="n">stratProbs</span> <span class="o">=</span> <span class="n">slippageToProb</span><span class="x">(</span><span class="n">predSlippage</span><span class="x">)</span>
        <span class="n">newProbVal</span> <span class="o">=</span> <span class="n">stratProbs</span><span class="x">[</span><span class="kt">Int</span><span class="x">(</span><span class="n">res</span><span class="o">.</span><span class="n">strat</span><span class="x">[</span><span class="n">i</span><span class="o">+</span><span class="mi">1</span><span class="x">])]</span>
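        <span class="c"># The probability belongs to trade i+1, so store it against that row</span>
        <span class="n">newProb</span><span class="x">[</span><span class="n">i</span><span class="o">+</span><span class="mi">1</span><span class="x">]</span> <span class="o">=</span> <span class="n">newProbVal</span>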
    <span class="k">end</span>
    
<span class="k">end</span>

<span class="n">res</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="o">:</span><span class="n">LinearProb</span><span class="x">]</span> <span class="o">=</span> <span class="n">newProb</span>

<span class="n">res</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="n">LinearRatio</span> <span class="o">=</span> <span class="o">:</span><span class="n">LinearProb</span> <span class="o">./</span> <span class="o">:</span><span class="n">stratProb</span><span class="x">)</span>
<span class="n">res</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="n">LinearSlipapgeRolling</span> <span class="o">=</span> <span class="n">cumsum</span><span class="x">(</span><span class="o">:</span><span class="n">slippage</span> <span class="o">.*</span> <span class="o">:</span><span class="n">LinearRatio</span><span class="x">)</span> <span class="o">./</span><span class="n">cumsum</span><span class="x">(</span><span class="o">:</span><span class="n">LinearRatio</span><span class="x">))</span>
<span class="n">plot</span><span class="x">(</span><span class="n">res</span><span class="o">.</span><span class="n">StratSlipapgeRolling</span><span class="x">[</span><span class="mi">50</span><span class="o">:</span><span class="k">end</span><span class="x">],</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Production"</span><span class="x">)</span>
<span class="n">plot!</span><span class="x">(</span><span class="n">res</span><span class="o">.</span><span class="n">EqSlipapgeRolling</span><span class="x">[</span><span class="mi">50</span><span class="o">:</span><span class="k">end</span><span class="x">],</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Equal Weighting"</span><span class="x">)</span>
<span class="n">plot!</span><span class="x">(</span><span class="n">res</span><span class="o">.</span><span class="n">LinearSlipapgeRolling</span><span class="x">[</span><span class="mi">50</span><span class="o">:</span><span class="k">end</span><span class="x">],</span> <span class="n">label</span> <span class="o">=</span> <span class="s">"Linear Model"</span><span class="x">)</span>
</code></pre></div></div>

<p><img src="/assets/importancesampling/linreg.png" alt="Linear regression strategy" title="Linear regression strategy" width="80%" class="center-image" /></p>

<p>Adding the linear regression decision rule to the data gives us a way of assessing this new model without having to run it directly in production. We can see that the linear model is better than XGBoost and also better than the equal weighting.</p>

<p>A simple bootstrap provides the simplest performance measure: we resample the trades after the initial 200 with replacement 1,000 times and take the (weighted) average slippage for each strategy in every resample.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">bs</span> <span class="o">=</span> <span class="n">mapreduce</span><span class="x">(</span><span class="n">x</span><span class="o">-&gt;</span> <span class="nd">@combine</span><span class="x">(</span><span class="n">res</span><span class="x">[</span><span class="n">sample</span><span class="x">(</span><span class="mi">201</span><span class="o">:</span><span class="n">nrow</span><span class="x">(</span><span class="n">res</span><span class="x">),</span> <span class="n">nrow</span><span class="x">(</span><span class="n">res</span><span class="x">)</span><span class="o">-</span><span class="mi">200</span><span class="x">),</span> <span class="o">:</span><span class="x">],</span> 
              <span class="o">:</span><span class="n">StratSlippage</span> <span class="o">=</span> <span class="n">mean</span><span class="x">(</span><span class="o">:</span><span class="n">slippage</span><span class="x">),</span> 
              <span class="o">:</span><span class="n">EqStratSlippage</span> <span class="o">=</span> <span class="n">mean</span><span class="x">(</span><span class="o">:</span><span class="n">slippage</span><span class="x">,</span> <span class="n">Weights</span><span class="x">(</span><span class="o">:</span><span class="n">EqRatio</span><span class="x">)),</span>
              <span class="o">:</span><span class="n">LinearStratSlippage</span> <span class="o">=</span> <span class="n">mean</span><span class="x">(</span><span class="o">:</span><span class="n">slippage</span><span class="x">,</span> <span class="n">Weights</span><span class="x">(</span><span class="o">:</span><span class="n">LinearRatio</span><span class="x">))),</span>
              <span class="n">vcat</span><span class="x">,</span> <span class="mi">1</span><span class="o">:</span><span class="mi">1000</span><span class="x">);</span>

<span class="nd">@combine</span><span class="x">(</span><span class="n">groupby</span><span class="x">(</span><span class="n">stack</span><span class="x">(</span><span class="n">bs</span><span class="x">),</span> <span class="o">:</span><span class="n">variable</span><span class="x">),</span> <span class="o">:</span><span class="n">avg</span> <span class="o">=</span> <span class="n">mean</span><span class="x">(</span><span class="o">:</span><span class="n">value</span><span class="x">),</span> <span class="o">:</span><span class="n">sd</span> <span class="o">=</span> <span class="n">std</span><span class="x">(</span><span class="o">:</span><span class="n">value</span><span class="x">))</span>
</code></pre></div></div>

<table>
  <thead>
    <tr>
      <th style="text-align: right"><strong>variable</strong></th>
      <th style="text-align: right"><strong>avg</strong></th>
      <th style="text-align: right"><strong>sd</strong></th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="text-align: right">StratSlippage</td>
      <td style="text-align: right">-1.55385</td>
      <td style="text-align: right">0.0967389</td>
    </tr>
    <tr>
      <td style="text-align: right">EqStratSlippage</td>
      <td style="text-align: right">-1.59169</td>
      <td style="text-align: right">0.119028</td>
    </tr>
    <tr>
      <td style="text-align: right">LinearStratSlippage</td>
      <td style="text-align: right">-1.52706</td>
      <td style="text-align: right">0.133231</td>
    </tr>
  </tbody>
</table>

<p>As it’s a toy problem, there is no significant difference between the models: the averages sit within one standard deviation of each other, although both models come out slightly ahead of the random allocation.</p>

<h2 id="conclusion">Conclusion</h2>

<p>Importance sampling gives you a way of getting more out of the current data and strategy you are using. By weighting the observations in a new way you can get an idea whether a new strategy is worth it or not.
By rethinking your current setup you can easily add a bit of randomness into decisions and use the importance sampling framework going forward.</p>]]></content><author><name>Dean Markwick</name></author><category term="julia" /><summary type="html"><![CDATA[A new paper hit my feed Choosing trading strategies in electronic execution using importance sampling. I’ve only encountered sampling as part of a statistical computing course as part of my PhD, and I had never strayed away from Monte Carlo sampling, but this practical example provided an intuitive understanding of its importance and utility.]]></summary></entry><entry><title type="html">Alpha Capture and Acquired</title><link href="https://dm13450.github.io/2024/09/19/Alpha-Capture-and-Acquired.html" rel="alternate" type="text/html" title="Alpha Capture and Acquired" /><published>2024-09-19T00:00:00+00:00</published><updated>2024-09-19T00:00:00+00:00</updated><id>https://dm13450.github.io/2024/09/19/Alpha-Capture-and-Acquired</id><content type="html" xml:base="https://dm13450.github.io/2024/09/19/Alpha-Capture-and-Acquired.html"><![CDATA[<p>People are never short of a trade idea. There is a whole industry of
researchers, salespeople and amateurs coming up with trading ideas and
making big calls on what stock will go up, what country will cut
interest rates and what the price of gold will do next. Alpha capture
is about systematically assessing ideas and working out who has
<em>alpha</em> and generates profitable ideas and who is just making it up as
they are going along.</p>


<p>Alpha capture started as a way of profiling brokers’ stock
recommendations. If you have 50 people recommending you 50 different
ideas, how do you know who is good? You’ll quickly run out of money if
you blindly follow all the recommendations that hit your
inbox. Instead, you need to profile each person’s ideas and see
who, on average, makes good recommendations. Whoever is good at
picking stocks probably deserves more of your business.</p>

<p>It has since expanded to the point where some hedge funds have internal desks that
do a similar analysis on their portfolio managers (PMs) to double
down on profitable bets and mitigate the risk of all the PMs picking the
same stock. Picking stocks and managing a portfolio across many PMs
are two different skills and different departments at your modern
hedge fund.</p>

<p>A simple way to measure the alpha of a PM or broker recommendation
is to see if the price of a stock they buy (or recommend) goes up
after the day they suggest it. Over a large enough sample, those with alpha
would see their picks move higher, while those without alpha
would average out to zero: some ideas would go higher, some
lower, and the net result would be zero alpha. If a PM has the opposite effect and
every stock they buy goes down, they are a contrarian
indicator, so take their idea and do the opposite!</p>

<p><img src="/assets/AlphaCapture/jc1.png" alt="Alpha capture markout graph" title="Alpha capture markout graph" class="center-image" /></p>
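
<p>As a minimal sketch of that measurement, the snippet below computes the return from each recommendation day to a fixed number of days later (a markout) and averages it. The prices, recommendation days and the <code class="language-plaintext highlighter-rouge">markout</code> helper are all made up for illustration and are not real data.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using Statistics

# Markout: return from the recommendation day to `horizon` days later.
# prices: daily closing prices, idx: index of the recommendation day
markout(prices, idx, horizon) = prices[idx + horizon] / prices[idx] - 1

# Illustrative prices and recommendation days (not real data)
prices = [100.0, 101.0, 103.0, 102.0, 105.0, 104.0, 107.0]
recDays = [1, 2, 4]

# One markout per recommendation, then the average across recommendations
markouts = [markout(prices, d, 3) for d in recDays]
avgMarkout = mean(markouts)
</code></pre></div></div>

<p>An average markout that stays positive over a large sample would suggest alpha, one near zero suggests none, and a persistently negative one is the contrarian indicator.</p>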

<p><a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3873884">Alpha Capture Systems: Past, Present, and Future
Directions</a>
goes through the history of alpha capture and is a good short read
that inspired this blog post.</p>

<h2 id="basic-alpha-capture">Basic Alpha Capture</h2>

<p>What if we wanted to try our own Alpha Capture? We need some stock recommendations and a way of calculating what happens to the price after the recommendation. This is where the <a href="https://www.acquired.fm/">Acquired</a> podcast comes in.</p>

<p><img src="https://img.transistor.fm/rc6ysihLHIou3_VscLeIvhCyPjvpQaGzKVeRnh5PnWc/rs:fill:3000:3000:1/q:60/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8zNDFk/ZWYwYjUyZWZiNjQ0/NTliYTI5NjJkOWZi/MmM1ZS5wbmc.jpg" alt="Acquired logo" width="30%" class="center-image" /></p>

<p>Acquired tells the stories and strategies of great companies (taken from their website). It’s a pretty popular podcast and each episode gets close to a million listeners. So this makes it an ideal Alpha Capture study - when they release an episode about a company does the stock price of that company go higher or lower on average? 
If it were to go higher then each time an episode is released call your broker and go long the stock!</p>

<p>They aren’t explicitly recommending a stock by talking about
it, as they say in their intro. So it’s just a toy exercise to see if
there is any correlation between the stock price and the release date
of an episode.</p>

<p>To systematically test this we need to get a list of the episodes and calculate a ‘markout’ from each episode.</p>

<h2 id="collecting-podcast-data">Collecting Podcast Data</h2>

<p>The internet is a wonderful thing and each episode of Acquired is
available as an XML feed from <a href="https://transistor.fm/">transistor.fm</a>. So with some fun XML
parsing I can get the full history of the podcast, with the date
and title of each episode.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">function</span><span class="nf"> parseEpisode</span><span class="x">(</span><span class="n">x</span><span class="x">)</span>
  <span class="n">rawDate</span> <span class="o">=</span> <span class="n">first</span><span class="x">(</span><span class="n">simplevalue</span><span class="o">.</span><span class="x">(</span><span class="n">x</span><span class="x">[</span><span class="n">tag</span><span class="o">.</span><span class="x">(</span><span class="n">x</span><span class="x">)</span> <span class="o">.==</span> <span class="s">"pubDate"</span><span class="x">]))</span>
  <span class="n">date</span> <span class="o">=</span> <span class="n">ZonedDateTime</span><span class="x">(</span><span class="n">rawDate</span><span class="x">,</span> <span class="n">dateformat</span><span class="s">"eee, dd uuu yyyy HH:MM:ss z"</span><span class="x">)</span>

  <span class="kt">Dict</span><span class="x">(</span><span class="s">"title"</span> <span class="o">=&gt;</span> <span class="n">first</span><span class="x">(</span><span class="n">simplevalue</span><span class="o">.</span><span class="x">(</span><span class="n">x</span><span class="x">[</span><span class="n">tag</span><span class="o">.</span><span class="x">(</span><span class="n">x</span><span class="x">)</span> <span class="o">.==</span> <span class="s">"title"</span><span class="x">])),</span>
       <span class="s">"date"</span> <span class="o">=&gt;</span><span class="n">date</span><span class="x">)</span>
<span class="k">end</span>

<span class="k">function</span><span class="nf"> parse_date</span><span class="x">(</span><span class="n">t</span><span class="x">)</span>
   <span class="kt">Date</span><span class="x">(</span><span class="n">string</span><span class="x">(</span><span class="n">split</span><span class="x">(</span><span class="n">t</span><span class="x">,</span> <span class="s">"T"</span><span class="x">)[</span><span class="mi">1</span><span class="x">]))</span>
<span class="k">end</span>

<span class="n">url</span> <span class="o">=</span> <span class="s">"https://feeds.transistor.fm/acquired"</span>

<span class="n">data</span> <span class="o">=</span> <span class="n">parse</span><span class="x">(</span><span class="n">Node</span><span class="x">,</span> <span class="kt">String</span><span class="x">(</span><span class="n">HTTP</span><span class="o">.</span><span class="n">get</span><span class="x">(</span><span class="n">url</span><span class="x">)</span><span class="o">.</span><span class="n">body</span><span class="x">))</span>

<span class="n">episodes</span> <span class="o">=</span> <span class="n">children</span><span class="x">(</span><span class="n">data</span><span class="x">[</span><span class="mi">3</span><span class="x">][</span><span class="mi">1</span><span class="x">])</span>
<span class="n">filter!</span><span class="x">(</span><span class="n">x</span> <span class="o">-&gt;</span> <span class="n">tag</span><span class="x">(</span><span class="n">x</span><span class="x">)</span> <span class="o">==</span> <span class="s">"item"</span><span class="x">,</span> <span class="n">episodes</span><span class="x">)</span>
<span class="n">episodes</span> <span class="o">=</span> <span class="n">children</span><span class="o">.</span><span class="x">(</span><span class="n">episodes</span><span class="x">)</span>

<span class="n">episodeData</span> <span class="o">=</span> <span class="n">parseEpisode</span><span class="o">.</span><span class="x">(</span><span class="n">episodes</span><span class="x">)</span>

<span class="n">episodeFrame</span> <span class="o">=</span> <span class="n">vcat</span><span class="x">(</span><span class="n">DataFrame</span><span class="o">.</span><span class="x">(</span><span class="n">episodeData</span><span class="x">)</span><span class="o">...</span><span class="x">)</span>
<span class="n">CSV</span><span class="o">.</span><span class="n">write</span><span class="x">(</span><span class="s">"episodeRaw.csv"</span><span class="x">,</span> <span class="n">episodeFrame</span><span class="x">)</span>
</code></pre></div></div>

<p>After writing the data to a CSV I need to somehow parse the episode
title into a stock ticker. This is a tricky task as the episode names
are human-friendly, not computer-friendly. So time for our LLM
overlords to lend a hand and do the heavy lifting. I drop the CSV into
<a href="https://www.perplexity.ai/">Perplexity</a> and prompt it to add the relevant stock ticker to the
file. I then reread the CSV into my notebook.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">episodeFrame</span> <span class="o">=</span> <span class="n">CSV</span><span class="o">.</span><span class="n">read</span><span class="x">(</span><span class="s">"episodeTicker.csv"</span><span class="x">,</span> <span class="n">DataFrame</span><span class="x">)</span>
<span class="n">episodeFrame</span><span class="o">.</span><span class="n">date</span> <span class="o">=</span> <span class="n">ZonedDateTime</span><span class="o">.</span><span class="x">(</span><span class="kt">String</span><span class="o">.</span><span class="x">(</span><span class="n">episodeFrame</span><span class="o">.</span><span class="n">date</span><span class="x">),</span> <span class="n">dateformat</span><span class="s">"yyyy-mm-ddTHH:MM:SS.sss-z"</span><span class="x">)</span>

<span class="n">vcat</span><span class="x">(</span><span class="n">first</span><span class="x">(</span><span class="nd">@subset</span><span class="x">(</span><span class="n">episodeFrame</span><span class="x">,</span> <span class="o">:</span><span class="n">stock_ticker</span> <span class="o">.!=</span> <span class="s">"-"</span><span class="x">),</span> <span class="mi">4</span><span class="x">),</span>
        <span class="n">last</span><span class="x">(</span><span class="nd">@subset</span><span class="x">(</span><span class="n">episodeFrame</span><span class="x">,</span> <span class="o">:</span><span class="n">stock_ticker</span> <span class="o">.!=</span> <span class="s">"-"</span><span class="x">),</span> <span class="mi">4</span><span class="x">))</span>
</code></pre></div></div>

<table>
  <thead>
    <tr>
      <th style="text-align: right"><strong>date</strong><br /><code class="language-plaintext highlighter-rouge">ZonedDateTime</code></th>
      <th style="text-align: right"><strong>title</strong><br /><code class="language-plaintext highlighter-rouge">String</code></th>
      <th style="text-align: right"><strong>stock_ticker</strong><br /><code class="language-plaintext highlighter-rouge">String15</code></th>
      <th style="text-align: right"><strong>sector_etf</strong><br /><code class="language-plaintext highlighter-rouge">String7</code></th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="text-align: right">2024-03-17T17:54:00.400+07:00</td>
      <td style="text-align: right">Renaissance Technologies</td>
      <td style="text-align: right">RNR</td>
      <td style="text-align: right">PSI</td>
    </tr>
    <tr>
      <td style="text-align: right">2024-02-19T17:56:00.410+08:00</td>
      <td style="text-align: right">Hermès</td>
      <td style="text-align: right">RMS.PA</td>
      <td style="text-align: right">GXLU</td>
    </tr>
    <tr>
      <td style="text-align: right">2024-01-21T17:59:00.450+08:00</td>
      <td style="text-align: right">Novo Nordisk (Ozempic)</td>
      <td style="text-align: right">NOVO-B.CO</td>
      <td style="text-align: right">IHE</td>
    </tr>
    <tr>
      <td style="text-align: right">2023-11-26T16:24:00.250+08:00</td>
      <td style="text-align: right">Visa</td>
      <td style="text-align: right">V</td>
      <td style="text-align: right">IPAY</td>
    </tr>
    <tr>
      <td style="text-align: right">2018-09-23T18:28:00.550+07:00</td>
      <td style="text-align: right">Season 3, Episode 5: Alibaba</td>
      <td style="text-align: right">BABA</td>
      <td style="text-align: right">KWEB</td>
    </tr>
    <tr>
      <td style="text-align: right">2018-08-20T09:20:00.370+07:00</td>
      <td style="text-align: right">Season 3, Episode 3: The Sonos IPO</td>
      <td style="text-align: right">SONO</td>
      <td style="text-align: right">GAMR</td>
    </tr>
    <tr>
      <td style="text-align: right">2018-08-05T18:15:00.030+07:00</td>
      <td style="text-align: right">Season 3, Episode 2: The Xiaomi IPO</td>
      <td style="text-align: right">XIACF</td>
      <td style="text-align: right">KWEB</td>
    </tr>
    <tr>
      <td style="text-align: right">2018-07-16T21:40:00.560+07:00</td>
      <td style="text-align: right">Season 3, Episode 1: Tesla</td>
      <td style="text-align: right">TSLA</td>
      <td style="text-align: right">TSLA</td>
    </tr>
  </tbody>
</table>

<p>It’s done an OK job. Most of the episodes seem to correspond to the
right ticker, but we can see it has hallucinated the RenTech stock
ticker as RNR. RenTech is a private company with no stock ticker, yet
Perplexity has decided that RNR (a reinsurance company) is the
correct one. So not 100% accurate. Still, it has saved me a
good chunk of time and we can move on to getting the stock price data.</p>

<p>We want to measure the average price move of a stock after an episode is released. If Acquired had stock-picking skill, you would expect the price to increase after the release of an episode, as they generally speak positively about the companies they cover.</p>

<p>So using <a href="https://github.com/dm13450/AlpacaMarkets.jl">AlpacaMarkets.jl</a> we get the stock price for the days before and after each episode. As AlpacaMarkets only has US stock data, only some of the episodes end up with a full dataset.</p>

<h2 id="what-is-a-markout">What is a Markout?</h2>

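<p>We calculate the percentage change in price relative to the price on the episode release date and then aggregate all the stock tickers together. In the code below the markout is scaled by 1e4, so it is expressed in basis points.</p>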

\[\text{Markout} = \frac{p - p_{\text{episode released}}}{p_{\text{episode released}}}\]
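
<p>As a rough illustration of the formula, here is a minimal sketch on made-up prices. The numbers and the <code class="language-plaintext highlighter-rouge">toyMarkout</code> helper are hypothetical and only there to show the arithmetic; they are not part of the actual dataset or pipeline.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical daily prices around an episode; index 3 is the release day.
prices = [98.0, 99.0, 100.0, 101.0, 99.5]
releaseInd = 3

# Markout relative to the release-day price, in basis points (1e4 = 100%).
function toyMarkout(prices, releaseInd)
    arrivalPrice = prices[releaseInd]
    1e4 .* (prices .- arrivalPrice) ./ arrivalPrice
end

toyMarkout(prices, releaseInd)
</code></pre></div></div>

<p>The release day itself has a markout of zero by construction, and in this toy series the day after the release shows a markout of +100 basis points.</p>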

<p>Acquired is about great companies and the hosts tend to speak favourably about them, so I think it’s reasonable to expect the stock price to increase once everyone gets round to listening to the episode.
Once we aggregate all the episodes we should hopefully have
enough data to decide if this is true.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">function</span><span class="nf"> getStockData</span><span class="x">(</span><span class="n">stock</span><span class="x">,</span> <span class="n">startDate</span><span class="x">)</span>
  <span class="n">prices</span> <span class="o">=</span> <span class="n">AlpacaMarkets</span><span class="o">.</span><span class="n">stock_bars</span><span class="x">(</span><span class="n">stock</span><span class="x">,</span> <span class="s">"1Day"</span><span class="x">,</span> <span class="n">startTime</span><span class="o">=</span><span class="n">startDate</span> <span class="o">-</span> <span class="kt">Month</span><span class="x">(</span><span class="mi">1</span><span class="x">),</span> <span class="n">limit</span><span class="o">=</span><span class="mi">10000</span><span class="x">)[</span><span class="mi">1</span><span class="x">]</span>
  <span class="n">prices</span><span class="o">.</span><span class="n">date</span> <span class="o">.=</span> <span class="n">startDate</span>
  <span class="n">prices</span><span class="o">.</span><span class="n">t</span> <span class="o">=</span> <span class="n">parse_date</span><span class="o">.</span><span class="x">(</span><span class="n">prices</span><span class="o">.</span><span class="n">t</span><span class="x">)</span>
  <span class="n">prices</span><span class="x">[</span><span class="o">:</span><span class="x">,</span> <span class="x">[</span><span class="o">:</span><span class="n">t</span><span class="x">,</span> <span class="o">:</span><span class="n">symbol</span><span class="x">,</span> <span class="o">:</span><span class="n">vw</span><span class="x">,</span> <span class="o">:</span><span class="n">date</span><span class="x">]]</span>
<span class="k">end</span>

<span class="k">function</span><span class="nf"> calcMarkout</span><span class="x">(</span><span class="n">data</span><span class="x">)</span>
   <span class="n">arrivalInd</span> <span class="o">=</span> <span class="n">findlast</span><span class="x">(</span><span class="n">data</span><span class="o">.</span><span class="n">t</span> <span class="o">.&lt;=</span> <span class="n">data</span><span class="o">.</span><span class="n">date</span><span class="x">)</span>
   <span class="n">arrivalPrice</span> <span class="o">=</span> <span class="n">data</span><span class="x">[</span><span class="n">arrivalInd</span><span class="x">,</span> <span class="o">:</span><span class="n">vw</span><span class="x">]</span>
   <span class="n">data</span><span class="o">.</span><span class="n">arrivalPrice</span> <span class="o">.=</span> <span class="n">arrivalPrice</span>
   <span class="n">data</span><span class="o">.</span><span class="n">ts</span> <span class="o">=</span> <span class="x">[</span><span class="n">x</span><span class="o">.</span><span class="n">value</span> <span class="k">for</span> <span class="n">x</span> <span class="k">in</span> <span class="x">(</span><span class="n">data</span><span class="o">.</span><span class="n">t</span> <span class="o">.-</span> <span class="n">data</span><span class="o">.</span><span class="n">date</span><span class="x">)]</span>
   <span class="n">data</span><span class="o">.</span><span class="n">markout</span> <span class="o">=</span> <span class="mf">1e4</span><span class="o">*</span><span class="x">(</span><span class="n">data</span><span class="o">.</span><span class="n">vw</span> <span class="o">.-</span> <span class="n">data</span><span class="o">.</span><span class="n">arrivalPrice</span><span class="x">)</span> <span class="o">./</span> <span class="n">data</span><span class="o">.</span><span class="n">arrivalPrice</span>
   <span class="n">data</span>
<span class="k">end</span>

<span class="n">res</span> <span class="o">=</span> <span class="x">[]</span>

<span class="k">for</span> <span class="n">row</span> <span class="k">in</span> <span class="n">eachrow</span><span class="x">(</span><span class="n">episodeFrame</span><span class="x">)</span>
    
    <span class="k">try</span> 
        <span class="n">stockData</span> <span class="o">=</span> <span class="n">getStockData</span><span class="x">(</span><span class="n">row</span><span class="o">.</span><span class="n">stock_ticker</span><span class="x">,</span> <span class="kt">Date</span><span class="x">(</span><span class="n">row</span><span class="o">.</span><span class="n">date</span><span class="x">))</span>
        <span class="n">stockData</span> <span class="o">=</span> <span class="n">calcMarkout</span><span class="x">(</span><span class="n">stockData</span><span class="x">)</span>
        <span class="n">append!</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="x">[</span><span class="n">stockData</span><span class="x">])</span>
    <span class="k">catch</span> <span class="n">e</span>
        <span class="n">println</span><span class="x">(</span><span class="n">row</span><span class="o">.</span><span class="n">stock_ticker</span><span class="x">)</span>
    <span class="k">end</span>
<span class="k">end</span>

<span class="n">res</span> <span class="o">=</span> <span class="n">vcat</span><span class="x">(</span><span class="n">res</span><span class="o">...</span><span class="x">)</span>
</code></pre></div></div>
<p>With the data pulled we now aggregate by each day before and after the episode.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">markoutRes</span> <span class="o">=</span> <span class="nd">@combine</span><span class="x">(</span><span class="n">groupby</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="o">:</span><span class="n">ts</span><span class="x">),</span> <span class="o">:</span><span class="n">n</span> <span class="o">=</span> <span class="n">length</span><span class="x">(</span><span class="o">:</span><span class="n">markout</span><span class="x">),</span> 
                                         <span class="o">:</span><span class="n">avgMarkout</span> <span class="o">=</span> <span class="n">mean</span><span class="x">(</span><span class="o">:</span><span class="n">markout</span><span class="x">),</span>
                                         <span class="o">:</span><span class="n">devMarkout</span> <span class="o">=</span> <span class="n">std</span><span class="x">(</span><span class="o">:</span><span class="n">markout</span><span class="x">))</span>
<span class="n">markoutRes</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">markoutRes</span><span class="x">,</span> <span class="o">:</span><span class="n">errMarkout</span> <span class="o">=</span> <span class="o">:</span><span class="n">devMarkout</span> <span class="o">./</span><span class="n">sqrt</span><span class="o">.</span><span class="x">(</span><span class="o">:</span><span class="n">n</span><span class="x">))</span>
</code></pre></div></div>

<p>Always need error bars as this data gets noisy.</p>
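
<p>The error bar used here is the standard error of the mean, the standard deviation divided by \(\sqrt{n}\). A minimal sketch of that calculation for a single day offset, using made-up markouts (the values and the two-standard-error rule of thumb are assumptions for illustration only):</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using Statistics

# Hypothetical markouts (in bps) for one day offset across five episodes.
markouts = [120.0, -80.0, 45.0, -30.0, 10.0]

n = length(markouts)
avgMarkout = mean(markouts)
errMarkout = std(markouts) / sqrt(n)   # standard error of the mean

# Rough check: is the average within two standard errors of zero?
consistentWithZero = isless(abs(avgMarkout), 2 * errMarkout)
</code></pre></div></div>

<p>If the average sits within a couple of standard errors of zero, as it does for these toy numbers, there is little evidence of a real effect on that day.</p>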

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="n">markoutResSub</span> <span class="o">=</span> <span class="nd">@subset</span><span class="x">(</span><span class="n">markoutRes</span><span class="x">,</span> <span class="o">:</span><span class="n">ts</span> <span class="o">.&lt;=</span> <span class="mi">60</span><span class="x">,</span> <span class="o">:</span><span class="n">n</span> <span class="o">.&gt;=</span> <span class="mi">10</span><span class="x">)</span>
<span class="n">plot</span><span class="x">(</span><span class="n">markoutResSub</span><span class="o">.</span><span class="n">ts</span><span class="x">,</span> <span class="n">markoutResSub</span><span class="o">.</span><span class="n">avgMarkout</span><span class="x">,</span> <span class="n">yerr</span><span class="o">=</span><span class="n">markoutResSub</span><span class="o">.</span><span class="n">errMarkout</span><span class="x">,</span> 
     <span class="n">xlabel</span> <span class="o">=</span> <span class="s">"Days"</span><span class="x">,</span> <span class="n">ylabel</span> <span class="o">=</span> <span class="s">"Markout"</span><span class="x">,</span> <span class="n">title</span> <span class="o">=</span> <span class="s">"Acquired Alpha Capture"</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="o">:</span><span class="n">none</span><span class="x">)</span>
<span class="n">hline!</span><span class="x">([</span><span class="mi">0</span><span class="x">],</span> <span class="n">ls</span> <span class="o">=</span> <span class="o">:</span><span class="n">dash</span><span class="x">,</span> <span class="n">color</span> <span class="o">=</span> <span class="s">"grey"</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="o">:</span><span class="n">none</span><span class="x">)</span>
<span class="n">vline!</span><span class="x">([</span><span class="mi">0</span><span class="x">],</span> <span class="n">ls</span> <span class="o">=</span> <span class="o">:</span><span class="n">dash</span><span class="x">,</span> <span class="n">color</span> <span class="o">=</span> <span class="s">"grey"</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="o">:</span><span class="n">none</span><span class="x">)</span>

</code></pre></div></div>

<p><img src="/assets/AlphaCapture/avgMarkout2.png" alt="Average markout" title="Average markouts" width="80%" class="center-image" /></p>

<p>Not really a pattern. The majority of the error bars overlap zero after the podcast is released.
If you squint a little bit there seems to be a slight downward trend post-episode, which would suggest they talk about a company at the peak of its stock price.</p>

<p>Beforehand there is a bit of positive momentum, again suggesting that
they release the podcast at the peak of the stock price. Now this is
even more of a stretch given there is only one podcast a month and it
takes more than 20 days to prepare an episode (I imagine!), so it is
more noise than signal.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">markoutIndRes</span> <span class="o">=</span> <span class="nd">@combine</span><span class="x">(</span><span class="n">groupby</span><span class="x">(</span><span class="n">res</span><span class="x">,</span> <span class="x">[</span><span class="o">:</span><span class="n">symbol</span><span class="x">,</span> <span class="o">:</span><span class="n">ts</span><span class="x">]),</span> <span class="o">:</span><span class="n">n</span> <span class="o">=</span> <span class="n">length</span><span class="x">(</span><span class="o">:</span><span class="n">markout</span><span class="x">),</span> 
                                         <span class="o">:</span><span class="n">avgMarkout</span> <span class="o">=</span> <span class="n">mean</span><span class="x">(</span><span class="o">:</span><span class="n">markout</span><span class="x">),</span>
                                         <span class="o">:</span><span class="n">devMarkout</span> <span class="o">=</span> <span class="n">std</span><span class="x">(</span><span class="o">:</span><span class="n">markout</span><span class="x">))</span>
<span class="n">markoutIndRes</span> <span class="o">=</span> <span class="nd">@transform</span><span class="x">(</span><span class="n">markoutIndRes</span><span class="x">,</span> <span class="o">:</span><span class="n">errMarkout</span> <span class="o">=</span> <span class="o">:</span><span class="n">devMarkout</span> <span class="o">./</span><span class="n">sqrt</span><span class="o">.</span><span class="x">(</span><span class="o">:</span><span class="n">n</span><span class="x">))</span>

<span class="n">p</span> <span class="o">=</span> <span class="n">plot</span><span class="x">()</span>
<span class="n">hline!</span><span class="x">(</span><span class="n">p</span><span class="x">,</span> <span class="x">[</span><span class="mi">0</span><span class="x">],</span> <span class="n">ls</span> <span class="o">=</span> <span class="o">:</span><span class="n">dash</span><span class="x">,</span> <span class="n">color</span> <span class="o">=</span> <span class="s">"grey"</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="o">:</span><span class="n">none</span><span class="x">)</span>
<span class="n">vline!</span><span class="x">(</span><span class="n">p</span><span class="x">,</span> <span class="x">[</span><span class="mi">0</span><span class="x">],</span> <span class="n">ls</span> <span class="o">=</span> <span class="o">:</span><span class="n">dash</span><span class="x">,</span> <span class="n">color</span> <span class="o">=</span> <span class="s">"grey"</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="o">:</span><span class="n">none</span><span class="x">)</span>
<span class="k">for</span> <span class="n">sym</span> <span class="k">in</span> <span class="x">[</span><span class="s">"TSLA"</span><span class="x">,</span> <span class="s">"V"</span><span class="x">,</span> <span class="s">"META"</span><span class="x">]</span>
   <span class="n">markoutResSub</span> <span class="o">=</span> <span class="n">sort</span><span class="x">(</span><span class="nd">@subset</span><span class="x">(</span><span class="n">markoutIndRes</span><span class="x">,</span> <span class="o">:</span><span class="n">symbol</span> <span class="o">.==</span> <span class="n">sym</span><span class="x">,</span> <span class="o">:</span><span class="n">ts</span> <span class="o">.&lt;=</span> <span class="mi">60</span><span class="x">,</span> <span class="o">:</span><span class="n">n</span> <span class="o">.&gt;=</span> <span class="mi">1</span><span class="x">),</span> <span class="o">:</span><span class="n">ts</span><span class="x">)</span>
    <span class="n">plot!</span><span class="x">(</span><span class="n">p</span><span class="x">,</span> <span class="n">markoutResSub</span><span class="o">.</span><span class="n">ts</span><span class="x">,</span> <span class="n">markoutResSub</span><span class="o">.</span><span class="n">avgMarkout</span><span class="x">,</span> <span class="n">yerr</span><span class="o">=</span><span class="n">markoutResSub</span><span class="o">.</span><span class="n">errMarkout</span><span class="x">,</span> 
     <span class="n">xlabel</span> <span class="o">=</span> <span class="s">"Days"</span><span class="x">,</span> <span class="n">ylabel</span> <span class="o">=</span> <span class="s">"Markout"</span><span class="x">,</span> <span class="n">title</span> <span class="o">=</span> <span class="s">"Acquired Alpha Capture"</span><span class="x">,</span> <span class="n">label</span> <span class="o">=</span> <span class="n">sym</span><span class="x">,</span> <span class="n">lw</span> <span class="o">=</span><span class="mi">2</span><span class="x">)</span> 
<span class="k">end</span>
<span class="n">p</span>
</code></pre></div></div>

<p><img src="/assets/AlphaCapture/indMarkout2.png" alt="Individual markouts" title="Individual markouts" width="80%" class="center-image" /></p>

<p>When we pull out 3 examples of episodes we can see the randomness and specifically the volatility of TSLA here.</p>

<h2 id="conclusion">Conclusion</h2>

<p>From this, we would not put any specific weight on the stock
performance after an episode is released. There doesn’t appear to be
any statistical pattern to exploit. No alpha means no alpha
capture. It is a nice exercise though and has hopefully explained the
concept of a markout.</p>]]></content><author><name>Dean Markwick</name></author><category term="julia" /><summary type="html"><![CDATA[People are never short of a trade idea. There is a whole industry of researchers, salespeople and amateurs coming up with trading ideas and making big calls on what stock will go up, what country will cut interest rates and what the price of gold will do next. Alpha capture is about systematically assessing ideas and working out who has alpha and generates profitable ideas and who is just making it up as they are going along.]]></summary></entry><entry><title type="html">Solving the Almgren Chris Model</title><link href="https://dm13450.github.io/2024/06/06/Solving-the-Almgren-Chris-Model.html" rel="alternate" type="text/html" title="Solving the Almgren Chris Model" /><published>2024-06-06T00:00:00+00:00</published><updated>2024-06-06T00:00:00+00:00</updated><id>https://dm13450.github.io/2024/06/06/Solving-the-Almgren-Chris-Model</id><content type="html" xml:base="https://dm13450.github.io/2024/06/06/Solving-the-Almgren-Chris-Model.html"><![CDATA[<p>The Almgren Chriss model from <a href="https://www.smallake.kr/wp-content/uploads/2016/03/optliq.pdf">Optimal Execution
of Portfolio Transactions</a> is the best-known optimal
execution model and provides the foundational maths for thinking
about trading some quantity of an asset. This blog post goes through
the maths: how we set the problem up and arrive at the various
solutions.</p>


<p>I first encountered the Almgren Chriss model in the first year of my PhD
through a Microstructure and Machine Learning course. It ran for 2 hours at 18:00 on a
Friday night on the other side of London from where I lived, so it was a bit of a pain
for me to attend. This post is in essence inspired by my notes from that course, as
I’ve always wanted to summarise them into a digital version. So this is a maths-heavy post that will act as a springboard for some
more future content.</p>

<h2 id="the-trading-problem">The Trading Problem</h2>

<p>We have an amount \(X_0\) of something to trade over the time from \(0\)
to \(T\), such that \(X_T = 0\). How should we slice and dice our
trades to minimise the execution cost?</p>

<p>We need a model of</p>

<ul>
  <li>How the price moves</li>
  <li>How our trading affects prices</li>
</ul>

<p>then we can build a trading cost function that we then optimise in different
ways.</p>

<h2 id="price-dynamics">Price Dynamics</h2>

<p>The price evolves like
\(S_t = \bar{S} _t + \eta v_t + \theta (X_0 - X_t),\) where</p>

<ul>
  <li>\(\bar{S} _t\) is the unperturbed stock price</li>
  <li>\(\eta \cdot v_t\) is the temporary market impact that scales with the
trading speed \(v_t\)</li>
  <li>\(\theta \cdot (X_0 - X_t)\) is the permanent market impact, which scales with the amount traded so far</li>
</ul>

<p>The unperturbed price is a simple Gaussian random walk with no drift:
\(\mathrm{d} \bar{S} _t = \sigma S_0 \mathrm{d} W_t\)</p>

<p>The trading rate is
\(v_t = - \frac{\mathrm{d} X_t}{\mathrm{d}t} = - \dot{X} _t\),
so it is simply the speed at which we are executing the trades.</p>

<p>So the fundamental price (\(\bar{S}\)) evolves as a random walk, but our
trading means that the observed price is higher by an amount that grows
with our trading speed and with how much we have already traded. The signs of the components are set
up such that we are buying, so the faster we trade the more we
distort the price from the true price by pushing it higher.</p>
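
<p>To make the three components concrete, here is a minimal simulation sketch of the observed price while buying at a constant speed. All parameter values (<code class="language-plaintext highlighter-rouge">S0</code>, <code class="language-plaintext highlighter-rouge">sigma</code>, <code class="language-plaintext highlighter-rouge">eta</code>, <code class="language-plaintext highlighter-rouge">theta</code>, the grid size and the seed) are made up for illustration, and the time discretisation is my own assumption rather than part of the model.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using Random

# Hypothetical parameters, chosen only for illustration.
S0, sigma, eta, theta = 100.0, 0.02, 0.001, 0.0005
X0, T, N = 1000.0, 1.0, 250
dt = T / N

rng = MersenneTwister(42)

# Unperturbed price: driftless Gaussian random walk, dS = sigma * S0 * dW.
dW = sqrt(dt) .* randn(rng, N)
Sbar = S0 .+ sigma * S0 .* cumsum(dW)

# Constant trading speed and the remaining position X_t.
v = fill(X0 / T, N)
X = X0 .- cumsum(v) .* dt

# Observed price: unperturbed price plus temporary and permanent impact.
S = Sbar .+ eta .* v .+ theta .* (X0 .- X)
</code></pre></div></div>

<p>With these signs the observed price <code class="language-plaintext highlighter-rouge">S</code> always sits above the unperturbed price <code class="language-plaintext highlighter-rouge">Sbar</code>, and the gap widens as more of the order is completed.</p>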

<h2 id="trading-costs">Trading Costs</h2>

<p>The final cost of the execution is the sum of the amount we traded
multiplied by the price of all the trades. In continuous time this is
simply the integral of this observed stock price multiplied by the
trading speed over the execution window:</p>

\[C_{0, T} = \int _0 ^T S_t v_t \mathrm{d} t,\]

<p>which after inserting the equation for the asset price gives us three different
components</p>

\[C_{0, T} = \underbrace {\int _0 ^T \bar{S_t} v_t \mathrm{d} t}_\text{(1)} + \underbrace{\int_0 ^T \eta
v_t ^2 \mathrm{d} t}_\text{(2)} + \underbrace{\int _0 ^T \theta (X_0 -
X_t) v_t \mathrm{d}t}_\text{(3)}\]

<p>For term \((1)\) we use integration by parts:</p>

\[\begin{align*} \int _0 ^T \bar{S_t} v_t \mathrm{d} t &amp; =- \int _0 ^T
\bar{S_t} \mathrm{d}X_t \\
&amp; = - \left[\bar{S_t} X_t \right]_0^T + \int _0 ^T X_t \mathrm{d} \bar{S_t} \\
&amp; = -(\bar{S}_TX_T - \bar{S}_0X_0) + \int _0 ^T X_t \sigma S_0
\mathrm{d} W_t \\
&amp; = \bar{S_0} X_0 + \int _0 ^T X_t \sigma S_0
\mathrm{d} W_t
\end{align*}\]

<p>The boundary term simplifies because \(X_T = 0\), and we write \(S_0\) for the
initial unperturbed price \(\bar{S}_0\), so term \((1)\) becomes</p>

\[X_0 S_0 + \int _0 ^T X_t \sigma S_0 \mathrm{d} W_t\]

<p>For term \((3)\) we substitute \(v_t \mathrm{d} t = - \mathrm{d} X_t\) and integrate directly, using \(X_T = 0\):</p>

\[\theta \int _0 ^T (X_0 - X_t) v_t \mathrm{d} t= -\theta \int _0 ^T (X_0 - X_t) \mathrm{d} X_t\]

\[= -\theta \left[ X_0 X_t - \frac{X_t^2}{2} \right]_0^T = \frac{\theta X_0^2}{2}\]

<p>which gives us a formula for \(C_{0, T}\)</p>

\[C_{0, T} = X_0 S_0 + \int _0 ^T X_t \sigma S_0 \mathrm{d} W_t + \eta \int _0 ^T v_t ^2 \mathrm{d}t + \frac{\theta X_0^2}{2}.\]

<p>This is our cost function and we want to find the \(v_t\)
that minimises the final cost.</p>

<h2 id="minimising-the-expected-cost">Minimising the Expected Cost</h2>

<p>If we take expectations (we want to minimise the <em>average</em> execution
cost, since each price path will be different as it is a stochastic problem) we
end up with just one term through which we can influence the expected cost:</p>

\[\mathbb{E}[C] = \underbrace{X_0 S_0 + \frac{\theta X_0^2}{2}}_{\text{Constant}} +
         \underbrace{\mathbb{E}
         \left[\int _0 ^T X_t \sigma S_0 \mathrm{d} W_t \right]}_{
        \mathbb{E}[ \mathrm{d}W_t] =  0} +
         \mathbb{E} \left[ \eta \int _0 ^T v_t ^2 \mathrm{d}t \right]\]

<p>So we minimise the expected cost by finding the trading speed that
minimises this term</p>

\[\min _{v_t} \eta \int _0 ^T v^2_t \mathrm{d} t.\]

<p>To solve this we apply the
<a href="https://en.wikipedia.org/wiki/Euler-Lagrange_equation">Euler-Lagrange equation</a>
to minimise the action. The action is the integral itself and \(f\) is the term inside the integral.</p>

\[\frac{\partial f}{\partial X} = \frac{\mathrm{d}}{\mathrm{d}t}
\frac{\partial f}{\partial v}\]

<p>And from the above</p>

\[\begin{align*} f &amp; = v^2_t \\
\frac{\partial f}{\partial X} &amp; = 0 \\
\frac{\partial f}{\partial v} &amp; = 2 v_t,
\end{align*}\]

<p>so</p>

\[\frac{\mathrm{d}}{\mathrm{d} t} v_t = 0,\]

<p>which means the speed of the execution must be constant, so the position \(X_t\) is linear in time:</p>

\[X_t = A + B t.\]

<p>We have the boundary conditions</p>

\[X_0 = A,\]

\[X_T = X_0 + BT = 0,\]

\[B = \frac{-X_0}{T},\]

\[X_t = X_0 - \frac{X_0}{T} t.\]

<p>Putting this trading schedule back into the expected cost formula gives
us an overall result</p>

\[\int _0 ^T v_t^2\mathrm{d} t = \frac{X^2_0}{T^2} (T - 0) =
\frac{X_0^2}{T}.\]
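
<p>A quick numerical check of this result, as a minimal sketch: the order size, horizon, grid and the front-loaded comparison schedule are all hypothetical choices of mine, used only to show that another schedule trading the same amount gives a larger impact integral than the TWAP.</p>

<div class="language-julia highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical inputs: 1000 units to trade over T = 10 time units.
X0, T, N = 1000.0, 10.0, 10_000
dt = T / N

# Approximate the integral of v_t^2 over [0, T] with a Riemann sum.
impactIntegral(v) = sum(v .^ 2) * dt

# TWAP: constant speed X0 / T.
vTwap = fill(X0 / T, N)

# A front-loaded alternative that trades the same total amount:
# the speed falls linearly from 2 X0 / T to zero.
ts = ((1:N) .- 0.5) .* dt
vFront = (2 * X0 / T) .* (1 .- ts ./ T)

impactIntegral(vTwap)    # close to X0^2 / T
impactIntegral(vFront)   # larger, about (4/3) X0^2 / T
</code></pre></div></div>

<p>Both schedules finish the full order by \(T\), but the front-loaded one pays roughly a third more in temporary impact, in line with the constant-speed schedule being the minimiser.</p>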

<p>When we plot this schedule we can see that the speed is constant and
we are simply running a TWAP (time-weighted average price).</p>

<p><img src="/assets/optexmaths/twap.png" alt="TWAP execution schedule" title="TWAP execution schedule" /></p>

<p>The maths is telling us:</p>

<ul>
  <li>To minimise the cost of trading an amount \(X_0\) you should run your
TWAP for an infinite amount of time, because the impact cost \(\eta X_0^2 / T\) shrinks towards zero as \(T\) grows.</li>
</ul>

<p>This neglects the price risk, so sure, run a very long TWAP but don’t
complain when the market trends against you!</p>

<p>How can we account for this price risk?</p>

<h2 id="mean-variance-optimisation-of-the-almgren-chriss-model">Mean-Variance Optimisation of the Almgren Chriss Model</h2>

<p>We now need to minimise both the expected cost and the <em>variance</em> of
the cost with our trading schedule. This means we will now be
sensitive to cases where the price moves far away from the starting
value.</p>

<p>We introduce a new
parameter, \(\lambda\), that controls our risk aversion. So now we are
worried about the price potentially running away from us if we take
too long to finish the trade</p>

\[\min _ {v_t} \left( \mathbb{E} [C] + \lambda \text{Var} [C] \right ),\]

<p>so now we want to minimise the average and the variation of the
trading cost and see what schedule that produces.</p>

<p>When we took the expectation, only the deterministic bits remained. When we calculate the variance only the random bits remain</p>

\[\text{Var} [C] = \mathbb{E} \left[ \left( \sigma \bar{S} _0 \int _0 ^T X_t \mathrm{d} W_t \right) ^2 \right] = \sigma ^2 \bar{S}_0^2 \int _0 ^T X_t ^2 \mathrm{d} t,\]

<p>which means our minimisation problem can be written as:</p>

\[\text{min} _{v_t} \left( \eta \int _0 ^T v_t ^2 \mathrm{d} t + \lambda \sigma ^2 \bar{S}_0^2 \int _0 ^T X_t ^2 \mathrm{d} t \right).\]
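
<p>To get a feel for the trade-off, it can help to plug the TWAP schedule \(X_t = X_0 \left(1 - \frac{t}{T}\right)\) into both terms. This is only a worked illustration for one fixed schedule, not the optimal solution:</p>

\[\begin{align*}
\eta \int _0 ^T v_t^2 \mathrm{d} t &amp; = \frac{\eta X_0^2}{T}, \\
\lambda \sigma ^2 \bar{S}_0^2 \int _0 ^T X_t^2 \mathrm{d} t &amp; = \lambda \sigma ^2 \bar{S}_0^2 X_0^2 \int _0 ^T \left(1 - \frac{t}{T}\right)^2 \mathrm{d} t = \frac{\lambda \sigma ^2 \bar{S}_0^2 X_0^2 T}{3}.
\end{align*}\]

<p>The impact term falls like \(1/T\) while the risk term grows linearly in \(T\), so an infinitely long TWAP is no longer free. For a TWAP the two terms balance at \(T^2 = \frac{3 \eta}{\lambda \sigma ^2 \bar{S}_0^2}\).</p>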

<p>Using the Euler-Lagrange equations again, with \(A = \eta\) and \(B = \lambda \sigma ^2 \bar{S}_0^2\),</p>

\[\begin{align*}
f &amp; = A v_t^2 + B X_t^2 \\
\frac{\partial f}{\partial X} &amp; = 2B X_t \\
\frac{\partial f}{\partial v} &amp; = 2A v_t \\
B X_t &amp; = A\frac{\mathrm{d} }{\mathrm{d} t} v_t \\
 &amp; = A \frac{\mathrm{d}^2}{\mathrm{d} t^2} X_t.
\end{align*}\]

<p>This is a second-order linear ordinary differential equation,
\(\frac{\mathrm{d}^2 X_t}{\mathrm{d} t^2} = \frac{B}{A} X_t\), with
solution</p>

\[X_t = c_1 e^{\sqrt{\frac{B}{A}} t} + c_2 e ^{- \sqrt{\frac{B}{A}} t}.\]

<p>Again, applying boundary conditions</p>

\[X_0 = c_1 + c_2,\]

\[X_T = 0 = c_1 e^{\sqrt{\frac{B}{A}} T} + c_2 e^{-\sqrt{\frac{B}{A}} T},\]

<p>and solving for \(c_1\) and \(c_2\) gives, with \(\kappa = \sqrt{\frac{B}{A}} = \sqrt{\frac{\lambda \sigma ^2 \bar{S}_0^2}{\eta}}\),</p>

\[X_t = X_0 \frac{\text{sinh} \left( \kappa (T-t) \right)}{\text{sinh} \left( \kappa T \right)}.\]

<p>This is a funny-looking expression, but underneath it is just a combination of two exponentials.</p>
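
<p>The schedule is simple to evaluate in code. The sketch below assumes the decay rate \(\kappa = \sqrt{\lambda \sigma ^2 \bar{S}_0^2 / \eta}\) that follows from the Euler-Lagrange equation. The function names and parameter values are hypothetical and chosen purely for illustration. It prints the inventory left at the halfway point for a few risk aversions.</p>

```python
import math


def kappa_from_params(risk_aversion, sigma, s0, eta):
    """Decay rate kappa = sqrt(lambda * sigma^2 * S0^2 / eta)."""
    return math.sqrt(risk_aversion * sigma ** 2 * s0 ** 2 / eta)


def almgren_chriss_schedule(x0, horizon, kappa, t):
    """Remaining inventory at time t under the sinh solution."""
    return x0 * math.sinh(kappa * (horizon - t)) / math.sinh(kappa * horizon)


x0, horizon = 100.0, 10.0
sigma, s0, eta = 0.02, 1.0, 0.1  # illustrative values only

for risk_aversion in (1e-6, 10.0, 100.0):
    kappa = kappa_from_params(risk_aversion, sigma, s0, eta)
    halfway = almgren_chriss_schedule(x0, horizon, kappa, horizon / 2)
    print(risk_aversion, round(halfway, 2))
```

<p>With a tiny risk aversion roughly half the position is left at the halfway point, which is the TWAP. As the risk aversion grows, less and less inventory remains by then.</p>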

<p>We now have the additional \(\lambda\) parameter and so plot the
execution schedule for different risk aversions</p>

<p><img src="/assets/optexmaths/ag.png" alt="Comparing the TWAP to the Almgren Chriss model" title="Comparing the TWAP to the Almgren Chriss model" /></p>

<p>A lower \(\lambda\) means a higher risk tolerance, so the schedule moves
closer to the TWAP. In general, we can see that the Almgren Chriss
solution is front-loaded - most of the trading is done early on in the
time window.</p>
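
<p>One way to check that the sinh schedule really does improve on the TWAP is to evaluate the mean-variance objective \(\eta \int _0 ^T v_t^2 \mathrm{d} t + \lambda \sigma ^2 \bar{S}_0^2 \int _0 ^T X_t^2 \mathrm{d} t\) for both. This is a rough sketch on a discretised time grid. The helper name <code>objective</code> and all the parameter values are assumptions made for the example, with \(\kappa = \sqrt{\lambda \sigma ^2 \bar{S}_0^2 / \eta}\).</p>

```python
import math


def objective(positions, horizon, eta, risk_aversion, sigma, s0):
    """Discretised eta * int v^2 dt + lambda * sigma^2 * S0^2 * int X^2 dt."""
    n_steps = len(positions) - 1
    dt = horizon / n_steps
    impact, risk = 0.0, 0.0
    for start, end in zip(positions[:-1], positions[1:]):
        impact += ((end - start) / dt) ** 2 * dt
        risk += 0.5 * (start ** 2 + end ** 2) * dt  # trapezoid rule
    return eta * impact + risk_aversion * sigma ** 2 * s0 ** 2 * risk


x0, horizon, n_steps = 100.0, 10.0, 2000
eta, risk_aversion, sigma, s0 = 0.1, 10.0, 0.02, 1.0  # illustrative values
kappa = math.sqrt(risk_aversion * sigma ** 2 * s0 ** 2 / eta)
times = [horizon * i / n_steps for i in range(n_steps + 1)]

twap = [x0 * (1 - t / horizon) for t in times]
sinh_schedule = [
    x0 * math.sinh(kappa * (horizon - t)) / math.sinh(kappa * horizon)
    for t in times
]

twap_cost = objective(twap, horizon, eta, risk_aversion, sigma, s0)
sinh_cost = objective(sinh_schedule, horizon, eta, risk_aversion, sigma, s0)
print(twap_cost, sinh_cost)  # the sinh schedule should give the smaller value
```

<p>The TWAP still has the lower impact term on its own, but once the risk term is included the front-loaded schedule comes out ahead.</p>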

<h2 id="summary">Summary</h2>

<p>Ok maths over, put down your pencils and breathe. We’ve gone through
the full problem set-up and shown how the TWAP minimises expected
costs for a risk-neutral investor and how an exponential execution
schedule minimises the mean-variance cost for a risk-averse investor.</p>

<p>Now we know the maths we can go on to do some interesting things.</p>]]></content><author><name>Dean Markwick</name></author><category term="maths" /><category term="quant" /><summary type="html"><![CDATA[The Almgren Chriss model from Optimal Execution of Portfolio Transactions is the most well known optimal execution model and provides the foundational math about how to think about trading some quantity of an asset. This blog post goes through the math and how we set the problem up and arrive at the various solutions.]]></summary></entry></feed>