Global Journal of Management and Business Research, A: Administration and Management, Volume 22 Issue 1

$$s_i = \frac{b_i - a_i}{\max(a_i, b_i)} \qquad (3)$$

In (3), $a_i$ is the mean distance between point (node) $i$ and the other points in the same cluster, which represents the cluster density, and $b_i$ is the smallest mean distance between point $i$ and the points of any cluster of which $i$ is not a member, which represents the dissimilarity to neighboring clusters. Therefore, partitioning at the number of clusters that gives the largest mean silhouette value yields clusters that are as dense as possible and as well separated from one another as possible.

Finally, we describe the modeling and prediction procedures based on this clustering method.

① Modeling procedures: learning datasets (184 companies, with mixed bankruptcy statuses) > normalization > principal component analysis (PCA) > silhouette analysis to determine the number of clusters (K) > clustering with K as the predetermined number of clusters > saving the learning model.

② Prediction procedures: input the dataset (one non-bankrupt company) > normalization using the learning model > PCA using the learning model > predicting the cluster to which the dataset belongs with reference to the learning model.

The prediction outcomes are output for each company, so each company has one prediction result.

IV. Clustering Results and Discussions

a) Clustering results

We applied the above clustering model to the data described in Section 3.1, and as a result, four clusters were formed. Table 6 shows the member distributions. The prediction results are discussed in the next section. Here, as in a correlation analysis, we assume thresholds of a "ratio of bankrupt companies ≥ 0.7" for a cluster with a high likelihood of bankruptcy and a "ratio of bankrupt companies ≤ 0.3" for a cluster with a low likelihood. Under these thresholds, a non-bankrupt company is interpreted as unlikely to go bankrupt if its input dataset belongs to Cluster 1, and as likely to go bankrupt if the dataset falls within Cluster 3 or 4.
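The threshold rule can be checked against the counts reported in Table 6 with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the "indeterminate" label for a cluster that falls between the two thresholds (Cluster 2) is an assumption, since the text does not assign an interpretation to that case.

```python
# Counts per cluster as reported in Table 6: (bankrupt, non-bankrupt).
TABLE_6 = {1: (26, 74), 2: (21, 15), 3: (32, 9), 4: (5, 2)}

# Thresholds stated in the text.
HIGH, LOW = 0.7, 0.3


def bankruptcy_ratio(bankrupt, non_bankrupt):
    """Share of bankrupt companies among the members of a cluster."""
    return bankrupt / (bankrupt + non_bankrupt)


def interpret(cluster):
    """Map a cluster number to a bankruptcy-likelihood reading."""
    ratio = bankruptcy_ratio(*TABLE_6[cluster])
    if ratio >= HIGH:
        return "high"
    if ratio <= LOW:
        return "low"
    # Assumption: the text does not label clusters between the thresholds.
    return "indeterminate"


for c in sorted(TABLE_6):
    print(c, round(bankruptcy_ratio(*TABLE_6[c]), 2), interpret(c))
```

With the Table 6 counts, Cluster 1 reads as "low", Clusters 3 and 4 as "high", and Cluster 2 falls between the two thresholds.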
Table 6: Member distribution resulting from the clustering

| Cluster No. | No. of members | Bankrupt companies | Non-bankrupt companies | Ratio of bankrupt companies |
|---|---|---|---|---|
| 1 | 100 | 26 | 74 | 0.26 |
| 2 | 36 | 21 | 15 | 0.58 |
| 3 | 41 | 32 | 9 | 0.78 |
| 4 | 7 | 5 | 2 | 0.71 |

b) Corporate bankruptcy prediction using sample datasets

We ran the prediction (test) procedure ② using sample financial statement data for non-bankrupt companies 1 and 2; the results are shown in Figures 1 and 2, respectively. As indicated by the star symbols, Company 1 (Figure 1) belonged to Cluster 1, whereas Company 2 (Figure 2) belonged to Cluster 3.

Fig. 1: Clustering result (non-bankrupt company 1)
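The modeling procedure ① and the prediction procedure ② can be sketched with scikit-learn as follows. This is a minimal sketch under several assumptions, not the authors' implementation: k-means stands in for the clustering method, two principal components are retained, K is searched over 2 to 6, the data are synthetic (184 rows with six hypothetical features), and the model is "saved" by pickling it in memory.

```python
import pickle

import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def fit_learning_model(X, k_candidates=range(2, 7), n_components=2, seed=0):
    """Procedure 1: normalization > PCA > silhouette analysis > clustering."""
    # Normalize and project the learning data once for the silhouette analysis.
    Z = PCA(n_components=n_components).fit_transform(
        StandardScaler().fit_transform(X)
    )

    # Silhouette analysis: keep the K with the largest mean silhouette value.
    scores = {}
    for k in k_candidates:
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(Z)
        scores[k] = silhouette_score(Z, labels)
    best_k = max(scores, key=scores.get)

    # Refit the whole chain with K fixed, so that one object holds the
    # normalization, the PCA projection, and the cluster centers.
    model = Pipeline([
        ("scale", StandardScaler()),
        ("pca", PCA(n_components=n_components)),
        ("cluster", KMeans(n_clusters=best_k, n_init=10, random_state=seed)),
    ]).fit(X)
    return model, best_k, scores


# Synthetic stand-in for the learning dataset (assumption: 6 features).
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(4, 6))
X_learn = np.vstack([c + rng.normal(size=(46, 6)) for c in centers])

model, best_k, scores = fit_learning_model(X_learn)
saved = pickle.dumps(model)  # "saving the learning model"

# Procedure 2: one new company, transformed and assigned by the saved model.
loaded = pickle.loads(saved)
x_new = X_learn[:1] + 0.1
cluster = int(loaded.predict(x_new)[0])
```

Note that the labels returned by `predict` are zero-based and arbitrary, so they would still have to be mapped to the cluster numbers used in Table 6 before applying the bankruptcy-ratio thresholds.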