Structured machine-learning approach avoids mis-specification
The traditional approach to building a multi-factor model of the investment markets requires that the model factors be based on observed characteristics of the assets included in the model. For example, a global equity model will classify stocks by country and may take account of fundamental values such as P/E ratio, yield and market capitalisation.
Apart from the fact that many companies operate in multiple jurisdictions and that fundamental attributes may not be comparable where accounting standards differ, this method is restrictive in that data availability constrains the set of factors that can be modelled. For example, interest rate sensitivity could be an important factor, affecting both the demand for a company’s product and its financing cost. While measures of leverage might be available, they are unlikely to capture the full effect.
This data problem restricts any such model’s ability to portray the relationships between assets accurately, leading to model mis-specification through over-fitting of the human-proposed factors and omission of factors that have no associated fundamental metric. Further, the fundamentally derived factors are the only descriptive factors that can be applied in an analysis: if a factor is not defined in stage 1 (model construction), it is not available in stage 2 (risk analysis).
In contrast, the EMA model does not constrain the model-building algorithm with any pre-specified notion of what the best-fitting set of factors should be. The algorithm uses a machine-learning technique: it “guesses” a random set of factors as a seed, and then amends the factors over multiple steps until it locates the set of factors most likely to be the true factors explaining the structure of the market.
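The text does not name the algorithm, so the following is only an illustrative sketch of the general idea (random seed, then repeated amendment towards the most likely factors). It assumes a standard Gaussian factor model fitted by expectation-maximisation, which may differ from what the EMA model actually does. The function names, the number of factors and the simulated returns are all hypothetical.

```python
import numpy as np


def log_likelihood(sample_cov, loadings, specific_var, n_obs):
    """Gaussian log-likelihood under covariance = loadings @ loadings.T + diag(specific_var)."""
    n_assets = sample_cov.shape[0]
    model_cov = loadings @ loadings.T + np.diag(specific_var)
    _, logdet = np.linalg.slogdet(model_cov)
    fit = np.trace(np.linalg.solve(model_cov, sample_cov))
    return -0.5 * n_obs * (n_assets * np.log(2.0 * np.pi) + logdet + fit)


def fit_factors_em(returns, n_factors, n_iter=200, seed=0):
    """Assumed technique: EM for a latent factor model, starting from a random guess."""
    rng = np.random.default_rng(seed)
    centred = returns - returns.mean(axis=0)
    n_obs, n_assets = centred.shape
    sample_cov = centred.T @ centred / n_obs

    # Seed: a random "guess" at the factor loadings; no fundamental data is used.
    loadings = rng.normal(scale=0.01, size=(n_assets, n_factors))
    specific_var = np.diag(sample_cov).copy()
    history = [log_likelihood(sample_cov, loadings, specific_var, n_obs)]

    for _ in range(n_iter):
        # E-step: regression of the hidden factors on the observed returns.
        model_cov = loadings @ loadings.T + np.diag(specific_var)
        beta = np.linalg.solve(model_cov, loadings).T
        factor_moment = (np.eye(n_factors) - beta @ loadings
                         + beta @ sample_cov @ beta.T)
        # M-step: amend loadings and specific variances to raise the likelihood.
        loadings = sample_cov @ beta.T @ np.linalg.inv(factor_moment)
        residual = np.diag(sample_cov - loadings @ beta @ sample_cov)
        specific_var = np.maximum(residual, 1e-10)  # numerical safeguard
        history.append(log_likelihood(sample_cov, loadings, specific_var, n_obs))

    return loadings, specific_var, np.array(history)


# Simulated (not real) returns: 500 periods, 20 assets, 3 hidden factors.
sim = np.random.default_rng(42)
true_loadings = sim.normal(scale=0.02, size=(20, 3))
returns = sim.normal(size=(500, 3)) @ true_loadings.T + sim.normal(scale=0.01, size=(500, 20))

loadings, specific_var, history = fit_factors_em(returns, n_factors=3)
```

In this sketch the likelihood recorded in `history` should not fall from one step to the next, which is the sense in which each amendment moves the guessed factors towards the most likely set. The recovered loadings are identified only up to a rotation, so they need not match `true_loadings` column by column.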
This means that the EMA method can represent systematic relationships between assets that may not be identifiable by reference to traditional fundamental measures. At the descriptive stage, there is no limit to the ways in which the risk sources can be defined, so different sets of risk source “styles” can be established for different analytical purposes.
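As a rough illustration of separating model construction from description, the sketch below takes one fitted factor covariance structure and views it through two different sets of style descriptors. This is an assumption about how such a descriptive stage could work, not the EMA method itself. The function `style_covariance`, the placeholder model numbers and both style sets are hypothetical.

```python
import numpy as np


def style_covariance(loadings, specific_var, descriptors):
    """Covariance between style portfolios implied by a fitted factor model.

    descriptors has one column per style; each column holds asset weights or scores.
    """
    model_cov = loadings @ loadings.T + np.diag(specific_var)
    return descriptors.T @ model_cov @ descriptors


rng = np.random.default_rng(1)
n_assets, n_factors = 12, 3

# Placeholder for a model fitted in stage 1 (random numbers, not real estimates).
loadings = rng.normal(scale=0.05, size=(n_assets, n_factors))
specific_var = rng.uniform(0.0005, 0.002, size=n_assets)

# Style set A (hypothetical): two equal-weighted regional buckets.
region = np.zeros((n_assets, 2))
region[:6, 0] = 1.0 / 6.0
region[6:, 1] = 1.0 / 6.0

# Style set B (hypothetical): a standardised interest rate sensitivity score.
rate_score = rng.normal(size=(n_assets, 1))
rate_score = (rate_score - rate_score.mean()) / rate_score.std()

# The same fitted model is reused; only the descriptive lens changes.
region_cov = style_covariance(loadings, specific_var, region)
rate_cov = style_covariance(loadings, specific_var, rate_score)
```

Neither style set had to exist when the placeholder model was built, which is the point of the contrast with a fundamentally specified model.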