Title: Improving transparency: extracting, visualising and analysing corporate relationships from SEC 10-K documents

Authors: Michael Gebbie, Kim Norlen, Gabriel Lucas, John Chuang

Addresses: School of Information Management and Systems, University of California at Berkeley, 102 South Hall, Berkeley, CA 94720–4600, USA (all four authors)

Abstract: We present a system to extract, visualise and analyse inter-corporation relationships disclosed by public companies in their annual reports to the US Securities and Exchange Commission (SEC). By improving the transparency of these disclosures, we allow policymakers, analysts, investors and the general public to analyse these relationships at both the firm level and the industry level. Using probabilistic information retrieval and extraction techniques, we automatically extract a dataset of 45,000 relationships between 26,000 companies from over 15 GB of SEC 10-K documents. These relationships range from ownership, agreements and personal connections to competition and legal disagreements. Information visualisation and social network analytic techniques can then be applied to explore and analyse the dataset.

Keywords: corporate transparency; information retrieval; information extraction; information visualisation; social network analysis; inter-corporation relationships; public companies; US Securities and Exchange Commission; SEC; USA; United States.

DOI: 10.1504/IJTPM.2007.012237

International Journal of Technology, Policy and Management, 2007 Vol.7 No.1, pp.15–31

Published online: 31 Jan 2007
