Accting., Mgmt. & Info. Tech. 8 (1998) 191–210

Managing complexity in large data bases using self-organizing maps

Barbro Back a,*, Kaisa Sere b,1, Hannu Vanharanta c,2

a Turku School of Economics and Business Administration, Turku, Finland
b Åbo Akademi University, Turku, Finland
c Lappeenranta University of Technology, Lappeenranta, Finland

Received 1 March 1997; received in revised form 1 July 1998; accepted 13 July 1998

* Corresponding author. E-mail: bback@abo.fi
1 kaisa.sere@abo.fi
2 hannu.vanharanta@lut.fi

0959-8022/98/$19.00 © 1998 Elsevier Science Ltd. All rights reserved. PII: S0959-8022(98)00009-5

Abstract

The amount of financial information in today’s sophisticated large data bases is substantial and makes comparisons between company performance, especially over time, difficult or at least very time consuming. The aim of this paper is to investigate whether neural networks in the form of self-organizing maps can be used to manage the complexity in large data bases. We structure and analyze accounting numbers in a large data base over several time periods. By using self-organizing maps, we overcome the problems associated with finding the appropriate underlying distribution and the functional form of the underlying data in the structuring task that is often encountered, for example, when using cluster analysis. The method chosen also offers a way of visualizing the results. The data base in this study consists of annual reports of more than 120 world wide pulp and paper companies with data from a five year time period. © 1998 Elsevier Science Ltd. All rights reserved.

Keywords: Complexity; Adaptive systems; Self-organizing maps; Financial benchmarking; Financial performance; Strategic management

1. Introduction

Today’s data bases hold a substantial amount of information about companies. The trick is to find patterns in the data that reveal important information about companies for different stakeholders, i.e., stockholders, creditors, auditors, financial analysts, and management. Finding patterns in financial performance can, for example, be helpful in identifying internal problems, firm evaluation by investors, and for benchmarking purposes.

In this paper, we focus on analysis of financial performance for benchmarking purposes. Benchmarking is an important company-internal process, in which the functions and performance of one company are compared with those of other companies. Financial competitive benchmarking uses financial information, most often in the form of ratios, to perform these comparisons. Financial competitive benchmarking is utilized, among other things, as a communication tool in strategic management, for example, in situations where company management must gain approval, from internal and external interest groups alike, for new functional objectives for the company.

Vanharanta (1995) has built a hyperknowledge-based system for financial benchmarking. The system contains a data base with financial data on more than 160 pulp and paper companies worldwide. This data base is used as a basis for the present study, too. The amount of financial information in this system is, however, so large and its structure so complex that comparisons between companies are difficult, or at least very time consuming.

Multivariate statistical methods, especially cluster analysis, have been used as a tool for analyzing company performance, although mostly in research contexts (Ketchen & Shook, 1996). However, many problems have been reported concerning these methods.
The two most important problems are the assumption of normality in the underlying distributions and difficulties in finding an appropriate functional form for the distributions (Trigueiros, 1995; Fernandes-Castro & Smith, 1994). Moreover, results of analyses are difficult to visualize when there are several explanatory variables (Vermeulen et al., 1994).

Many researchers have addressed these problems: Trigueiros (1995) reports on several studies that have shown the existence of positive or negative skewness in the ratios and on different remedies to overcome these difficulties. He also explains the existence of symmetrical and negatively skewed ratios and offers guidelines for achieving higher precision when using ratios in a statistical context.

Fernandes-Castro & Smith (1994) used a non-parametric model of corporate performance to overcome the need for specification of statistical distribution or functional form. Vermeulen et al. (1994) presented a way to visualize the results of interfirm comparison when the explanatory variable was explained by more than one firm characteristic. Successful use of visual information depends substantially on its acceptance by the user. Meyer (1997) states that visualized information makes the transfer of information easier and thus a bottleneck in human information processing is avoided.

Ketchen & Shook (1996) evaluate the past use of cluster analysis in strategic management research. One concern has been the extensive reliance on researcher judgment that is inherent in cluster analysis. As another concern they list that the applications lack an underlying theoretical rationale and that clustering dimensions seem to be selected haphazardly.
There has also been concern with the standardization of variables and problems with multicollinearity among variables.

Self-organizing maps, which are a form of artificial neural networks, are a promising new paradigm in information processing. One of the main features of neural networks is their ability to learn from examples and adapt their behavior to new situations. The theory of self-organizing maps facilitates the reduction and cluster analysis of high-dimensional feature spaces into two-dimensional arrays of representative weight vectors (Kohonen, 1997). The method does not need any specification of an underlying distribution or of the functional form of the financial indicators. Furthermore, one can visualize the results in a comprehensive way.

Neural networks have previously been suggested by Trigueiros (1995) for use with computerized accounting reports data bases, and by Chen et al. (1995) to define cluster structures in large data bases. Martin-del-Brio & Serrano-Cinca (1995) used self-organizing maps for analyzing the financial state of Spanish companies.

In a previous study (Back et al., 1998) we investigated the potential of self-organizing maps to structure 76 companies’ financial data in our data base and presented an approximated position of one company’s financial performance compared to that of other companies. The study was explorative and limited to Nordic and North-American companies. However, the results were very promising and that study served as a basis for this paper.

We use the self-organizing maps to structure the financial information on more than 120 companies, now also including Central-European companies, in our data base into clusters based on the underlying weight vectors. Each cluster is then named according to the financial characteristics of the cluster. We analyse the financial performance of the companies in the year 1985 and take a closer look at these clusters over the years 1985–89.
Even though we focus only on specific companies, any individual company or group of companies can be the focus of interest.

The rest of the paper is organized as follows: Section 2 describes the methodology we have used, the network structure, the data base, the list of companies in the study and the criteria for and the choice of financial ratios. Section 3 presents the construction of the self-organizing maps and Section 4 presents an analysis of the maps. The conclusions of our study are presented in Section 5.

2. Methodology

2.1. Benchmarking

Competitive benchmarking is a company-internal process in which the activities of a given company are measured against the best practices of other, best-in-class companies (Geber, 1990). In the process of competitive benchmarking, internal functions are analyzed and measured using financial (i.e. quantitative) and/or non-financial (i.e. qualitative) yardsticks. Functions measured from one company are compared with similar functions measured from leading competitors, or they are compared with the best practices in other industries. The differences between compared functions are measured. The overall management goal of competitive benchmarking within a given company is to close the measured “gap” by changing the company’s characteristics in ways that will improve company performance.

The generic benchmarking process consists of a planning phase, an analysis phase and an integration and action phase. The specific activity of financial competitive benchmarking is an integral part of the generic benchmarking process. In financial benchmarking, the aim is to compare the company with its competitors using available financial information, financial yardsticks.
At the beginning of a benchmarking process, in its planning phase, financial benchmarking plays an important role in the identification and selection of the right competitors and/or good performers, those that will act as the benchmarks in the non-financial benchmarking to be done later in the generic process. Financial benchmarking is also important in the analysis phase when performance gaps are being measured and future performance levels projected. In the integration and action phase, financial benchmarking is useful for monitoring and tracking progress and for re-calibrating the benchmarks. Financial benchmarking achieves its greatest potential, however, as a communications tool at times when company management must gain approval, from internal and external interest groups alike, for new functional objectives for the company, i.e. in strategic management.

The financial information needed for financial benchmarking work is, however, invariably available only from large commercial data bases or from specialized reports and publications, from where it must be gleaned with difficulty. Such information is thus far removed from its active users. If the needed financial information is to be brought closer to the active users, it must first be pre-processed, i.e. refined and classified. The overall objective of the present study is to pre-process, with the help of neural networks, the data and information needed for financial benchmarking purposes. Thus pre-processed, the information can be used in computerized benchmarking systems and executive support systems, making the task of competitive financial benchmarking easier and more effective.

2.2. Neural networks

A neural network is a computing device that is able to learn from examples. It consists of a set of simple processing units, neurons, that are connected to each other to form a network topology.
A neural network compares input data with output data, and tries to approximate some complicated, unknown functionality between the two. When developing a neural network, the first step is to find a suitable topology for the network and thereafter train it so that it gradually learns the desired input/output functionality.

There are two ways to train a network, supervised and unsupervised. In supervised learning the network is presented with examples of known input-output data pairs, after which it starts to mimic the presented input-output behavior. The network is then tested to see whether it is able to produce correct output when only input is presented to it. In unsupervised learning, the output data is not available and usually not even known beforehand. Instead, the network tries to find similarities between input data samples. Similar samples form clusters that constitute the output of the network. The user is responsible for giving an interpretation to each cluster.
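
The unsupervised case described above can be illustrated with a minimal self-organizing map sketch in Python. This is a generic Kohonen-style update written for illustration only, not the configuration used in the study: the function names (train_som, best_matching_unit), the parameters (epochs, lr0, radius0, seed), the linear decay schedule, the Gaussian neighborhood and the sample ratio values are all assumptions that do not come from the paper.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position of the node whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((wi - xi) ** 2 for wi, xi in zip(weights[node], x)),
    )


def train_som(samples, rows, cols, epochs=200, lr0=0.5, radius0=None, seed=0):
    """Train a small rectangular map; returns {(row, col): weight_vector}.

    Assumes the inputs are already scaled to comparable ranges, e.g. [0, 1].
    """
    rng = random.Random(seed)
    dim = len(samples[0])
    if radius0 is None:
        radius0 = max(rows, cols) / 2.0
    # One weight vector per map node, initialised with random values in [0, 1).
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                      # learning rate decays linearly
        radius = max(radius0 * (1.0 - frac), 0.5)    # neighborhood shrinks over time
        for x in rng.sample(samples, len(samples)):  # visit samples in random order
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * radius * radius))
                # Pull the node towards the sample, weighted by grid proximity.
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights


if __name__ == "__main__":
    # Hypothetical, pre-scaled ratio vectors (e.g. profitability, solvency).
    companies = {
        "A": [0.10, 0.15],
        "B": [0.12, 0.10],
        "C": [0.90, 0.85],
        "D": [0.88, 0.95],
    }
    som = train_som(list(companies.values()), rows=2, cols=2)
    for name, ratios in companies.items():
        print(name, "->", best_matching_unit(som, ratios))
```

No target output is supplied during training; each company is simply assigned to the map node whose weight vector it most resembles, and it is left to the user to interpret what the companies sharing a node, or neighboring nodes, have in common.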