Preface
This article introduces the indicator measurement methods that the Cash Flow Research Institute team uses when analyzing cash flow data for industry segments. The methods have been applied in the "Water Implementation" series of industry cash flow analysis courses. Focusing on the cash flow cycle, the article discusses how to measure two cash flow analysis indicators. These indicators can be used to determine whether an enterprise follows the general cyclical pattern of its industry, to compare each enterprise against the industry baseline, and, building on that, to detect abnormal enterprises.
The cash cycle is the period from the time cash is spent to obtain raw materials to the time that cash is recovered. Its length not only reflects a company's asset management level, but also affects its profitability and debt repayment ability.
An enterprise's operating cycle is reflected in its cash flow data. When all transaction data are mixed together, however, a clear operating cycle is hard to find. When no cycle is obvious, it can be found more clearly by compiling statistics on single categories, or combinations of categories, such as operating, investment, financing, internal transfer, and water/electricity/tax flows. With the classification tags of the cash flow due diligence system, we can easily filter and separate cash inflows and outflows by category, such as operating income and expenditure, internal transfers, investment income and expenditure, and water, electricity, and tax payments.
Assuming that enterprises in the same sub-industry or industrial chain have similar operating cycles, the ways those cycles show up can be summarized in two situations:
- Cash flow synchronization cycle
- Some industries have specific periods that drive business changes (such as festival-driven cash flow changes in the tobacco and alcohol sales industry), or are subject to seasonal factors such as winter and summer patterns (such as the movie industry and the agricultural product production and processing industry). Companies in such industries generally have similar peak and trough periods.
Comment: Each line represents the change in a company's cash flow over time. Because a periodic function can generally be decomposed by Fourier analysis into a sum of trigonometric functions, a trigonometric function is used as the example here. The same applies below.
- Cash flow asynchronous cycle
- Although enterprises in the industry have similar production cycles, there are time differences between them (as in cement and other building materials industries and the construction industry). The cycle length of enterprises in this type of industry is the same, but their specific peak and trough periods differ.
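The two situations above can be illustrated with a minimal sketch. It assumes each enterprise's monthly cash flow follows one annual sine cycle; the function names, amplitudes, and phase shifts are hypothetical and chosen only for illustration.

```python
import math

def monthly_series(phase_months, amplitude=1.0, base=2.0):
    """12 monthly values following one annual sine cycle, shifted by phase_months."""
    return [
        base + amplitude * math.sin(2 * math.pi * (m - phase_months) / 12)
        for m in range(12)
    ]

def peak_month(series):
    """1-based month in which the series is largest."""
    return max(range(12), key=lambda m: series[m]) + 1

# Synchronous cycle: same period and same phase, only the amplitude differs.
synchronous = [monthly_series(0, amplitude=a) for a in (0.8, 1.0, 1.2)]
# Asynchronous cycle: same period, but each enterprise is shifted in time.
asynchronous = [monthly_series(p) for p in (0, 3, 6)]

sync_peaks = [peak_month(s) for s in synchronous]
async_peaks = [peak_month(s) for s in asynchronous]
print(sync_peaks, async_peaks)
```

In the synchronous group every enterprise peaks in the same month, while in the asynchronous group the cycle length is identical but the peak months are spread out.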
In actual analysis, a single category of flows may not show a similar pattern. In that case, we can also consider operating income and expenditure, internal transfers, investment income and expenditure, water, electricity, taxes and fees, and so on, classify them, and compile statistics by month, day, time of day, or other period to find possible cyclical patterns.
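As a rough sketch of this kind of classified, per-period aggregation, the snippet below sums signed amounts by month for a chosen set of categories. The transaction records, tag names, and the `monthly_totals` helper are all hypothetical; the actual tags of the due diligence system are not described in this article.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (booking date, classification tag, signed amount).
transactions = [
    (date(2023, 1, 5), "operating", 1200.0),
    (date(2023, 1, 9), "internal_transfer", -500.0),
    (date(2023, 1, 20), "operating", -300.0),
    (date(2023, 2, 3), "operating", 900.0),
    (date(2023, 2, 14), "utilities_tax", -150.0),
]

def monthly_totals(rows, tags):
    """Sum signed amounts per (year, month), keeping only the given tags."""
    totals = defaultdict(float)
    for booked_on, tag, amount in rows:
        if tag in tags:
            totals[(booked_on.year, booked_on.month)] += amount
    return dict(totals)

# A single category, then a combination of categories.
operating_only = monthly_totals(transactions, {"operating"})
operating_and_utilities = monthly_totals(transactions, {"operating", "utilities_tax"})
print(operating_only)
print(operating_and_utilities)
```

Switching the grouping key from `(year, month)` to a day or an hour bucket gives the daily or time-of-day statistics in the same way.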
Anomaly Enterprise Detection Method Based on Cash Flow Synchronous Cycle
Take monthly operating cash flow statistics as an example. How do we compare whether enterprises have similar cyclical patterns based on their operating cash flow? How do we judge whether they follow the general cyclical pattern of the industry?
The figure shows the monthly income and expenditure of an enterprise's operating cash flow
First, to offset the distortion caused by differences in operating scale, we selected a sample of 7 enterprises of similar scale in the same industry and calculated each month's turnover as a proportion of the full-year turnover, as shown in the figure below:
Note: In actual analysis, enterprises in different business scale ranges can be grouped and analyzed separately as needed, for example by distinguishing small and medium-sized enterprises from large or larger ones in the same industry.
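The normalization step can be sketched as follows. The turnover figures are hypothetical, and `monthly_shares` is an illustrative helper rather than part of any existing system.

```python
import math

def monthly_shares(monthly_turnover):
    """Convert 12 monthly turnover figures into shares of the annual total."""
    if len(monthly_turnover) != 12:
        raise ValueError("expected 12 monthly values")
    annual = sum(monthly_turnover)
    if annual <= 0:
        raise ValueError("annual turnover must be positive")
    return [value / annual for value in monthly_turnover]

# Two hypothetical enterprises with the same seasonal shape but different scale.
small = [10, 8, 9, 10, 12, 14, 15, 14, 12, 10, 9, 7]
large = [value * 100 for value in small]

small_shares = monthly_shares(small)
large_shares = monthly_shares(large)

# After normalization the two profiles coincide, so scale no longer matters.
same_profile = all(
    math.isclose(a, b) for a, b in zip(small_shares, large_shares)
)
print(same_profile)
```

Working with shares rather than raw amounts is what makes the lines of enterprises of different size directly comparable on one chart.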
The lines of these 7 companies are clearly tangled, so how can we measure and compare them? For the two business scenarios in this case, we give two measurement methods:
- Method 1: Find abnormal enterprises in the industry based on the probability density function (PDF).
Based on the monthly turnover statistics of enterprises in the same industry, a probability density function (PDF) is fitted to the distribution of each month's share of full-year turnover.
Note: The figure uses the probability density distribution of January cash flow as an example. The red lines are the warning boundaries and the green line is the industry baseline. The red lines can be moved left or right to select different thresholds according to the actual situation of the industry and the tolerance level.
In the figure above, the red lines indicate that the January operating cash flow of enterprises in this industry should account for 6% to 14% of the full year. Enterprises outside the two red lines are abnormal, and further investigation is needed to see what unusual behavior the enterprise showed in that month.
If an enterprise is abnormal in many months, it is advisable to look at its unusual behavior in that year and analyze whether something is abnormal in the substance of its business.
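A minimal sketch of Method 1 for a single month is given below. The article does not say which distribution family is fitted, so a normal distribution is assumed here purely for illustration; the `coverage` parameter, the enterprise names, and the share values are hypothetical as well.

```python
from statistics import NormalDist

# Hypothetical January shares of full-year operating turnover, one per enterprise.
january_shares = {
    "A": 0.09, "B": 0.10, "C": 0.11, "D": 0.10,
    "E": 0.09, "F": 0.11, "G": 0.10, "H": 0.25,
}

def warning_bounds(shares, coverage=0.95):
    """Fit a normal PDF (an assumption) and return (lower, baseline, upper)."""
    dist = NormalDist.from_samples(shares)
    tail = (1 - coverage) / 2
    return dist.inv_cdf(tail), dist.mean, dist.inv_cdf(1 - tail)

def abnormal_enterprises(shares_by_name, coverage=0.95):
    """Names whose share falls outside the two warning bounds."""
    lower, _, upper = warning_bounds(list(shares_by_name.values()), coverage)
    return sorted(
        name for name, share in shares_by_name.items()
        if not lower <= share <= upper
    )

lower, baseline, upper = warning_bounds(list(january_shares.values()))
flagged = abnormal_enterprises(january_shares)
print(flagged)
```

Lowering `coverage` moves the two bounds inward, which corresponds to adjusting the red lines for a stricter tolerance level. Repeating the check for each of the 12 months and counting how often an enterprise is flagged gives the multi-month view described above.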
- Method 2: Compare differences between enterprises and the industry, and between enterprises, based on the industry baseline and an enhanced distance measurement (EDM).
Comparing each enterprise with the industry baseline can be used to detect abnormal enterprises. The following example measures the difference between each enterprise and the industry baseline:
The industry baseline can be defined by data analysts based on the data and the application scenario. For example, it can be the average of the monthly turnover shares of enterprises in the industry, or the weighted average of the peaks of each month's distribution (the green line in the figure above).
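The first baseline definition mentioned above, the per-month average of turnover shares, can be sketched as follows. The two share profiles are hypothetical, and `industry_baseline` is an illustrative helper.

```python
import math
from statistics import fmean

# Hypothetical monthly turnover shares, one row of 12 values per enterprise.
share_matrix = [
    [0.06, 0.06, 0.08, 0.08, 0.10, 0.12, 0.12, 0.10, 0.08, 0.08, 0.06, 0.06],
    [0.08, 0.06, 0.08, 0.08, 0.10, 0.10, 0.12, 0.10, 0.08, 0.08, 0.06, 0.06],
]

def industry_baseline(rows):
    """Per-month average share across enterprises (simple, unweighted mean)."""
    return [fmean(month_values) for month_values in zip(*rows)]

baseline = industry_baseline(share_matrix)
print([round(value, 3) for value in baseline])
```

Because every row sums to 1, the averaged baseline also sums to 1 and can be read as the share profile of a typical enterprise in the sample.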
Let $N$ be the number of features (12 months in this example), and let $x = \{x_i\}$ and $y = \{y_i\}$, $i = 1, \dots, N$, be the samples to be compared.
Traditional Euclidean distance:
$$d(x, y) = \sqrt{\sum_{i=1}^{N} (x_i - y_i)^2}$$
Optimized measurement (EDM, Enhanced Distance Measurement):
(where the hyperparameter $C$ is determined by the order of magnitude of the indicator data, and $K$ is a positive constant set to keep the logarithm well defined.)
Actual effect (C=50, K=1):
If the traditional Euclidean distance is used, it is hard to quantify a standard for what counts as a "large difference" in practice. In this example, we judge a Euclidean distance greater than 14 between an enterprise and the baseline to be a large difference (that is, the gap between most companies in the sample and the baseline is less than 14). This standard is only valid for this sample, however, and it is hard to decide whether 10, 11, 12, or 13 should also count as a "large difference".
Comment: For two samples whose squared distance is 196, the traditional Euclidean distance is 14 and the corresponding optimized measurement (EDM) result is 0.9.
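The Euclidean figures quoted above can be reproduced with a small sketch. It covers only the traditional Euclidean distance, and the two monthly profiles (in percent of full-year turnover) are hypothetical values constructed so that the squared distance comes out to 196.

```python
import math

def euclidean_distance(x, y):
    """Traditional Euclidean distance between two equal-length samples."""
    if len(x) != len(y):
        raise ValueError("samples must have the same number of features")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Hypothetical monthly shares in percent; each profile sums to 100.
baseline = [8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 8, 8]
enterprise = [15, 15, 8, 8, 8, 8, 2, 2, 9, 9, 8, 8]

squared = sum((a - b) ** 2 for a, b in zip(enterprise, baseline))
distance = euclidean_distance(enterprise, baseline)
print(squared, distance)
```

The result depends directly on the unit of the inputs: expressing the same shares as fractions instead of percentages would shrink the distance by a factor of 100, which is one reason a fixed cutoff such as 14 does not carry over to other samples.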
If the optimized measurement is used, the difference is first normalized to the [0,1] interval, which gives us a relatively constant anomaly detection threshold: the distance between most normal data and the industry baseline is compressed to within 0.9. In addition, compared with similarity measurement methods (a small teaser here, we will cover them next time), this method retains the magnitude information of each indicator.