u-boats

Investigating Causal Inference Modeling on Naval Operations during the Battle of the Atlantic





The Battle of the Atlantic

The ability to quantify the effect of a new tactic, technology or event would significantly enhance naval warfare operations. Emerging research in causal analysis offers insight and algorithmic opportunities to explore applications in naval warfare.

The Battle of the Atlantic during World War II ran from 1939 to 1945 as the German Navy conducted a U-boat offensive against the Allied effort to supply the United Kingdom. Total losses included 3,500 Allied merchant ships and 783 U-boats. The ongoing Battle was characterized by changes in tactics and technology on both sides: from convoys to radar receivers and new weapon systems, the Allies and the Axis would counter each other's advances to regain advantage. This period saw the development of innovative operational research methods, driven by the ability to apply mathematical models to the data.

Much has been written about the ebb and flow of the battle, with references to causes of increased success on either side. Anecdotal evidence viewed in graphs does tell a compelling story of the Allies eventually outbuilding the Axis' ability to sink enough supply tonnage. In more detailed accounts, specific events and technologies, such as the German breaking of the UK naval codes in March 1943, are credited with increases in sinkings, countered by the Allied cracking of the German naval Enigma code by December 1942.

The metric of interest for our analysis is the number of U-boat attacks, since it is directly related to safe passage across the Atlantic, and reducing these attacks is the desired effect. The graph below shows the number of U-boat attacks by month during the Battle.

This analysis explores how to quantify the effects of cited tactical and technological changes as interventions whose causal effect can be computed and examined for significance. With this approach we ask questions like:

Did the breaking of the Enigma code have a measurable effect on decreasing the number of U-boat attacks? Was it statistically significant?

To do this, we will explore recent approaches to causal inference with Bayesian models, which provide methods to analyze time series data and measure the statistical significance of the effects of events.

Causal Inference

It is well known that correlation does not equate to causation. But we also know effects are often seen in data that correspond to an action. Causal effects result from some type of intervention; for example, there would be a causal effect of a medicine (the intervention) curing a disease (the effect). Causal inference modeling provides an approach to model what would have happened if the intervention had not been implemented. This is referred to as a counterfactual. The approach allows us to create a model when it isn't possible, reasonable, or ethical to conduct trials in order to observe separate groups (intervention vs. non-intervention).

Causal inference modeling works by fitting a regression to data before an intervention and then analyzing deviations from that model going forward. Our approach builds a Bayesian structural time series model that filters and selects the most important variables to provide conditional probabilities from the data. The predictions for the counterfactual (what would have happened without the intervention) are based on previous states of our data. The model also uses related data not impacted by the intervention.

This analysis uses the R library CausalImpact2 to create these models. For our model, we are analyzing the number of U-boat attacks that would have occurred had an intervention not taken place. In the case of the Enigma, the counterfactual is the attacks that would have occurred had the British not cracked the code. All models are based on assumptions, and this inference rests on three:

  • The data used have predictive capability.
  • Data used as covariates are unaffected by any intervention that affects the primary response (attacks).
  • Data used as covariates maintain the relationship established in the pre-intervention period through the post-intervention period.
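
To make the workflow concrete, the sketch below shows the basic shape of a CausalImpact call on weekly data of this kind. All column names, dates, and simulated values here are illustrative placeholders rather than the actual data set used in this analysis.

```r
# Minimal CausalImpact sketch on illustrative weekly data (all columns and
# numbers are placeholders, not the real Battle of the Atlantic data).
library(CausalImpact)
library(zoo)

set.seed(42)
weeks  <- seq(as.Date("1941-01-05"), by = "week", length.out = 100)
weekly <- zoo(cbind(attacks      = rpois(100, 10),    # response: U-boat attacks
                    uboats_ops   = rpois(100, 80),    # covariate: operational U-boats
                    convoy_ships = rpois(100, 400)),  # covariate: ships in convoys
              order.by = weeks)

# The pre-period is used to fit the model; in the post-period the observed
# response is compared to the counterfactual (no-intervention) prediction.
pre.period  <- as.Date(c("1941-01-05", "1941-12-28"))
post.period <- as.Date(c("1942-01-04", "1942-11-29"))

impact <- CausalImpact(weekly, pre.period, post.period)
summary(impact)   # average and cumulative effects with credible intervals
plot(impact)      # original, pointwise, and cumulative panels
```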

Data

The causal modeling time series approach requires covariates, or control sets of time series data, that can be related to the response data (attacks). These data are assumed to be unaffected by the intervention; their relationship to the response is established in the pre-intervention period and is assumed to hold through the intervention and after. Four sets of data will be considered for our first exploration:

  • Number of Liberty ships built by the United States over the period
  • Number of operational U-boats available
  • Number of ships in convoys
  • Shipping tons available

In the graphs below are the data aggregated by week for our causal inference model. Since our data sources vary in frequency (daily, monthly, yearly), data preparation requires creating smooth curves to interpolate to a common frequency. Initial monthly aggregation produced good results, and the frequency was increased to weekly to provide more data for model training.
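
As a rough illustration of that preparation step, the sketch below interpolates a lower-frequency series onto a common weekly grid. The function name, column names, and grid bounds are assumptions for illustration, not the actual preprocessing code used here.

```r
# Sketch: interpolate a monthly (or yearly) series onto a common weekly grid.
to_weekly <- function(dates, values, week_grid) {
  # Fit a smooth interpolating spline through the observed points
  # and evaluate it at every week in the common grid.
  f <- splinefun(as.numeric(dates), values, method = "natural")
  f(as.numeric(week_grid))
}

# One common weekly grid spanning the Battle (bounds assumed).
week_grid <- seq(as.Date("1939-09-03"), as.Date("1945-05-06"), by = "week")

# Example (hypothetical columns): monthly Liberty-ship production to weekly values.
# liberty_weekly <- to_weekly(liberty$month, liberty$ships_built, week_grid)
```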

The first set of data, in the graph below, is the number of U-boats operational each week of the Battle3. To support our model, the assumption is that these data are not affected by any intervention we analyze.

The next data set is the number of Liberty ships built during WWII by month. Using these data as a control set presumes the shipbuilding plan was unchanged by the interventions analyzed, both before and after an intervention. For example, the number of ships built did not change when the Germans developed the radar detector.

The next set of data is the number of ships in all North Atlantic convoys. Each convoy had multiple ships, and these data are grouped by ships per week to align with our other data sets. As with the other data, we have to assume these are not impacted by the interventions we select to model.

The next set of data is the shipping tons available. These data come from interpreting hand-drawn graphs in the reports from Antisubmarine Warfare in World War II4.

Allied aerial bombing missions

The last set of data is the Allied aircraft bombing runs. These data come from the Theater History of Operations data on WWII5. We include these data because the shape of the curve appears similar to the U-boat attack data. While that may sound subjective, our modeling process will determine inclusion based on similarity. This also illustrates that the data do not have to be related in subject matter to our response variable.

Creating a probabilistic data model to predict attacks

Our goal is to use all the related data to create a prediction of U-boat attacks, and then measure deviations from those predictions after known intervention events. To include these data in our approach, we will build a probabilistic model by creating a Bayesian structural time series (BSTS) model. This modeling approach differs from other types of regression methods by considering other data in a conditional probability manner, where prior information is updated with new information. A key feature of the BSTS approach is that it allows us to use additional data sets to contribute to the regression and to determine which data will best contribute to our prediction model.

As we set up for causal inference, we first build a prediction BSTS model to understand how these other data will contribute to forecasting the number of U-boat attacks. The model accounts for seasonality and other trending or cyclical information. For brevity in this paper, we will not go into a full explanation of BSTS modeling. In the graphs below we can see how our model predicts near the tail end of the data and how it uses the covariate data. The ships in convoys and U-boats operational are the primary contributors to our model, while the number of Liberty ships being built contributes the least. We also note that these covariates all have low inclusion probabilities, a factor in determining the statistical significance of interventions.
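
A minimal sketch of such a prediction model using the bsts package is below, assuming the weekly series have been assembled into a data frame named battle with an attacks column; the spike-and-slab regression is what screens the covariates and produces the inclusion probabilities discussed above.

```r
# Sketch of a BSTS prediction model with spike-and-slab covariate selection.
# `battle` (weekly attacks plus covariate columns) is an assumed data frame.
library(bsts)

ss <- AddLocalLinearTrend(list(), battle$attacks)      # trend component
ss <- AddSeasonal(ss, battle$attacks, nseasons = 52)   # yearly cycle in weekly data

model <- bsts(attacks ~ .,
              state.specification = ss,
              data = battle,
              niter = 2000,               # MCMC draws
              expected.model.size = 2)    # prior: keep roughly two covariates

plot(model, "components")     # contribution of trend, seasonal, and regression terms
plot(model, "coefficients")   # posterior inclusion probabilities of the covariates
```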

Interventions

There were several changes in tactics and new technical developments on both sides of the Battle. At times these countered each other: for example, when the Allies introduced airborne radar, the Axis developed radar detectors so a U-boat could dive before being attacked. Some early interventions will not be usable for our approach since we need to establish relative patterns prior to the intervention; these include the adoption of convoys and early sonar. The following actions have been cited in many accounts of the battle, and some will be explored in this analysis:

  • Germans break UK Navy code March, 1942 - June, 1943
  • High-frequency direction-finding (HF/DF) in May, 1943
  • Breaking of the German Navy Enigma cipher code in July, 1941 - November, 1941
  • German new Enigma machine February, 1942 - December, 1942
  • Leigh Light January, 1942
  • German Metox Radar receiver August, 1942
  • UK breaks new Enigma Code in October, 1942
  • UK sea-scanning radar April, 1943
  • US B-24 Liberator Bombers close the gap for air cover March, 1943

Not all of these can be considered discrete interventions since some were implemented slowly over time. For example, HF/DF units began being installed on Allied ships in February, 1942 and were fully implemented in most vessels by early 1943. Our current modeling approach requires a shorter period, or moment in time, where effects change immediately. The deciphering and implementation of codes is an event that can occur with immediate results: if one side breaks the code of the other, from that date it can use the advantage in rerouting convoys or U-boats. These activities may be successfully modeled as interventions.

Enigma code

The Allied breaking of the Enigma code occurred in July, 1941, which we will consider as our model's first intervention. Narratives of this achievement credit it with a significant decrease in attacks through November, 1941. The end date may have been a result of the German Navy evolving its tactics even though it remained unaware the code was broken. In causal modeling this can be considered a decay of the intervention effect; for example, the effect of a marketing ad campaign would eventually decrease. Measuring the decay of tactical or technical changes could be an important feature of this model for naval warfare and similar applications.

The second intervention was in February, 1942, when the German Navy released a new Enigma machine that provided secure communications until December, 1942, when a unit was captured and the code was again broken, which we will model as our third intervention.

To set up our analysis we find the points in our attack data where these Enigma interventions occurred. The graph below shows these three periods. Visually there appears to be an increase in attacks after the new Enigma code is implemented, but visual impressions can be specious in these data given all of the other variables discussed above. For example, is the increase in attacks in the second period due to more ships in the Atlantic? More operational U-boats? New tactics? These other factors are considered confounders, and with them present in the data we may be unable to attribute any effect to a single event.

UK Breaks Enigma code in July, 1941

We examine the first intervention: the UK breaking of the German Navy Enigma code, which allowed the Allies to intercept U-boat communications and understand their locations. The code was broken in July, 1941, and its effect is considered to have continued to November, 1941.
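
A sketch of this first run is below, assuming the assembled weekly series are held in a zoo object named weekly (attacks in the first column, covariates after) and that the pre-period starts at the outbreak of the Battle in September, 1939.

```r
# First intervention: UK breaks Enigma, July 1941; effect modeled to November 1941.
# Dates follow the narrative above; `weekly` is the assumed weekly zoo data set.
library(CausalImpact)

pre.period  <- as.Date(c("1939-09-03", "1941-06-29"))
post.period <- as.Date(c("1941-07-06", "1941-11-30"))

enigma1 <- CausalImpact(weekly, pre.period, post.period)
summary(enigma1)            # average/cumulative effect, intervals, posterior p-value
summary(enigma1, "report")  # plain-language summary of the estimated effect
plot(enigma1)               # original, pointwise, and cumulative panels
```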

The top graph below shows the original data, with the vertical dotted lines indicating the beginning and end of our intervention period and the dotted prediction line representing the counterfactual without the intervention. We are interested in the counterfactual between the dotted lines, since our hypothesis from the narratives is that the effects of the Enigma code breaking (the intervention) decayed over this period.

The middle, "pointwise" graph displays the differences between the original data and the counterfactual predictions, providing a point-by-point estimate of the causal effect of the Enigma code compromise. In this case the number of attacks decreases after the intervention.

The last graph is a cumulative sum of the intervention effect over that period. The posterior inference output of the model is also shown, providing information on how well the model performed, with confidence intervals for the prediction results.

The model results show the breaking of the Enigma resulted in a 26% decrease in attacks over the period. The predicted number of attacks during the period would have averaged 12 per week versus the 9 observed, with a cumulative effect over the period of 165 actual attacks versus the 222 that would have occurred had the code not been broken, or 57 fewer attacks due to the intervention.

The confidence intervals provide information on the Monte Carlo runs. It is notable that these intervals widen over time due to the decaying effect of the intervention and the model's prediction ability. In our model the average effect was -26%, with confidence intervals spanning -85% to +33.51%. With a p-value of 0.183, above the customary 0.05 threshold, the effect of the intervention is not considered statistically significant. We can see in the data that there is a measurable decrease, with a small increase in attacks in September. This small increase is what changes the p-value from statistically significant to not; if we stop the post period at two months, the effect is significant. So we do not dismiss this impact, but could look for other factors that may have caused the increase prior to November.

The p-value is the probability of observing an effect at least this large if the null hypothesis were true. Our null hypothesis is: "the breaking of the code had no effect." If we reject this, then the effect is significant.

This means: if the breaking of the code had no effect on the number of attacks, there would be a 0.183 chance of seeing an effect at least this large. It is customary to consider p-values <= 0.05 to be statistically significant. However, we do not want to dismiss larger p-values outright; we consider them in the context of our goals.

Germany changes Enigma code in February, 1942

Next we examine the intervention of a new Enigma code, when the German Navy created a new machine that made its communications secure from February, 1942 to October, 1942, when the code was again broken upon the seizure of the German U-559. The plot below shows the same analysis from above applied to the new period.

In this model, we see a statistically significant effect (p = 0.003) of the intervention, with attacks increasing by 83%. On average there were 12 additional attacks per week, for a total of 397 additional attacks during the period attributable to the new Enigma code.

The models follow the historical description of how the Enigma code factored into the Battle of the Atlantic. When the code was originally broken by the UK, the number of attacks decreased. When the German Navy changed the machine and the Allies were unable to read messages, the number of attacks increased.

UK breaks new Enigma Code in October, 1942

Finally, we measure the effect of the new Enigma again being broken by Alan Turing and the code breakers at Bletchley Park. We will set the date to November, 1942 since the U-559 was captured October 30, 1942, and run the analysis to March, 1943, when another code change was enacted, resulting in a brief loss of decryption.

The second period of breaking the code shows a significant effect on the number of attacks, with a very small chance that it is attributable to random factors (p = 0.004). The result is a decrease in attacks of 46%. While other factors certainly influenced this final decrease in attacks, especially given the long post-treatment period, we are able to measure a causal impact from the date of the code break.

Confirming model performance

As discussed earlier, the models are built on strong assumptions, and we are aware of numerous factors that influenced the number of attacks. One recommended method to help validate our model is to create a fictitious intervention: a date prior to our actual intervention on which no intervention occurred. If we view the post-intervention period for this fictitious intervention (a non-intervention), we should not be able to discern an effect.
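
A sketch of such a placebo check is below, using the November, 1941 fictitious intervention described shortly for the new Enigma machine, and again assuming the weekly zoo object.

```r
# Placebo check: a fictitious intervention in November 1941, with the post
# period ending before the real February 1942 intervention (dates assumed).
placebo_pre  <- as.Date(c("1939-09-03", "1941-10-26"))
placebo_post <- as.Date(c("1941-11-02", "1942-01-25"))

placebo <- CausalImpact(weekly, placebo_pre, placebo_post)
summary(placebo)   # a sound model should report no significant effect here
```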

Confirming Intervention 1 - Initial code breaking

Since the effect of the initial breaking of the code was not statistically significant, no confirmation is required.

Confirming Intervention 2 - New Enigma code

Next we confirm the causal inference for the new four-rotor Enigma machine by creating a fictitious intervention prior to the actual intervention in February, 1942. We create a fictitious intervention in November, 1941 and ensure our post period does not include February, 1942. This did not result in any statistically significant effects, providing supporting evidence that the new code did cause an increase in attacks.

Confirming Intervention 3 - Final code breaking

Here again we seek to confirm the final code-breaking effect by creating a fictitious intervention prior to the actual event. We set this intervention to August, 1942 and end the period prior to November, 1942, when the actual intervention was modeled. The beginning and ending dates are important, since the real intervention would show up at some point if included in the post period and cause the model to register its effect.

Our fictitious intervention does produce a statistically significant effect in our data. This is an indication that the second breaking of the code by the Allies did not have the causal impact we calculated above. But before we dismiss the effect, we have to consider this in context: our post periods are small compared to the pre-periods, as we attempted not to include the actual dates of events, and we know other events (e.g., the German Metox radar receiver in August, 1942) were in play throughout our analysis.

Observations

In this analysis, causal inference modeling was applied to data from the Battle of the Atlantic to examine the effect of events related to the Enigma code on the number of U-boat attacks. We demonstrated measurable impact in all cases, and in two of the three code-related interventions we demonstrated statistically significant effects. To verify causal impact for the statistically significant interventions, corresponding fictitious interventions were modeled; one showed no significant effect, while the other did and must be weighed in the context discussed above. The causal analysis mostly follows the historical narrative of the importance of code breaking in the Battle of the Atlantic.

Areas for additional research include the selection of the intervention date and the end date of the intervention effect. Moving these slightly changes the impact and the statistical significance. To avoid confirmation bias, the selection of these time periods needs to be defensible based on reason; we chose the historical narrative and counter-impact actions (e.g., a new code being introduced). In marketing applications, the dates of an ad campaign are well defined and effects can be seen immediately. In naval warfare operations the intervention period is likely more gradual. Further research into the modeling parameters and additional data is required to improve the quantification of these interventions.

The key takeaway is our ability to build probabilistic data models of events and measure their impact. In determining statistical significance, p-values can be specious either way (for or against the null hypothesis), as are all cut-off values taken out of context. And in this analysis, confounding variables are most certainly at play.

This analysis on historical data is promising for other interventions - not necessarily historical or battle-related. The introduction of new technology, new tactics or training provides opportunities to examine cause-and-effect modeling. Below are some areas that would be interesting to explore to gain more insight and develop quantifiable feedback.

  • Impact of convoys and other tactics on pirate attacks.
  • Changes in maintenance requirements when operating procedures change.
  • Effects on safety by new systems or procedures (new landing approaches for aircraft).
  • Changes in readiness from new operational requirements.
  • Interventions against drug smuggling by U.S. Coast Guard.

  1. Tim Carrico is a retired Navy commander and currently an AI Engineer & Data Scientist at Ailantic, LLC. ↩︎

  2. CausalImpact 1.2.7, Brodersen et al., Annals of Applied Statistics (2015).↩︎

  3. Romney B Duffey & John Gallehawk (2019): Quantifying countermeasure and detection effectiveness to threats using U-boat data from the Second World War, Journal for Maritime Research, DOI: 10.1080/21533369.2019.1608665↩︎

  4. Sternhell, Charles M, and Alan M Thorndike. Antisubmarine warfare in World War II. Washington: Operations Evaluation Group, Office of the Chief of Naval Operations, Navy Dept, 1946. Pdf. https://www.loc.gov/item/2009655248/.↩︎

  5. https://data.world/datamil/world-war-ii-thor-data↩︎
