International Journal of Computational Economics and Econometrics (19 papers in press)
Reservoir computing vs. neural networks in financial forecasting
by Spyros P. Georgopoulos, Panagiotis Tziatzios, Stavros G. Stavrinides, Ioannis P. Antoniades, Michael P. Hanias
Abstract: Stock market prediction is a major research area, and extracting time-dependent patterns for predictive models is therefore of great significance. In this work, we compare the forecasting performance of recurrent neural networks (RNNs) in two implementations, LSTM and CNN-LSTM, with the relatively novel approach of reservoir computing (RC), specifically the class of echo state networks (ESNs). The comparison focuses on exploiting the latent dynamics of the data to achieve efficient training and high-quality predictions of the evolution of real-world financial data. Applying a multivariate scheme to a stock market index without any stationarity transformations, a clear advantage of the ESN over both types of RNN emerges, in computational efficiency as well as prediction quality. Finally, the implemented approach is trader-friendly, since specific values of a stock market time series provide a frame that allows timely forecasting under real-world conditions.
Keywords: deep learning; neural networks; reservoir computing; machine learning; time series analysis; financial-economic forecasting; algorithmic comparisons.
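As a flavour of the reservoir computing side of the comparison, the sketch below implements a minimal echo state network for one-step-ahead forecasting. The architecture, hyperparameters, and the toy sine-wave input are illustrative assumptions, not the authors' configuration.

```python
# Minimal ESN sketch: only the linear readout is trained (ridge regression);
# the reservoir itself is a fixed random recurrent network.
import numpy as np

rng = np.random.default_rng(0)

def make_reservoir(n_res, spectral_radius=0.9):
    # Rescale random recurrent weights so the largest eigenvalue magnitude is
    # below 1, a standard heuristic for the echo state property.
    W = rng.uniform(-0.5, 0.5, (n_res, n_res))
    return W * (spectral_radius / np.max(np.abs(np.linalg.eigvals(W))))

def esn_fit(series, n_res=200, ridge=1e-6, washout=50):
    u, y = series[:-1], series[1:]          # inputs and one-step-ahead targets
    W_in = rng.uniform(-0.5, 0.5, n_res)
    W = make_reservoir(n_res)
    x, states = np.zeros(n_res), []
    for t in range(len(u)):
        x = np.tanh(W_in * u[t] + W @ x)    # reservoir update (untrained)
        states.append(x.copy())
    X, Y = np.array(states)[washout:], y[washout:]
    W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ Y)
    return X @ W_out, Y

preds, targets = esn_fit(np.sin(0.1 * np.arange(500)))
print("in-sample RMSE:", np.sqrt(np.mean((preds - targets) ** 2)))
```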
The influence of financial and technological structure on eco-efficiency: an application of DDF bootstrapped framework in the Italian polluting industries
by Greta Falavigna, Alessandro Manello
Abstract: In this paper, we estimate environmentally corrected efficiency scores for a large sample of Italian firms operating in four polluting industrial sectors subject to the same European regulatory framework. Merging economic and emission data from reliable public sources, we measure overall performance through the non-parametric directional distance function and, to improve the robustness of the results, we apply an extension of the bootstrap proposed for standard efficiency scores. Results are analysed through a truncated regression, after testing the validity of the separability condition between the input-output space and the explanatory variables, and interpreted in light of industrial specificities. The results show that both the financial structure and the technological status of a firm have significant explanatory power for environmentally corrected efficiency scores. Policy makers should carefully consider both aspects as important levers for supporting sustainable practices.
Keywords: environmentally corrected efficiency; directional distance function; DDF; bootstrapping; two-stage procedure; separability conditions.
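For illustration, a directional distance function score can be obtained from a small linear programme. The sketch below assumes constant returns to scale, expands good outputs while contracting emissions in the direction g = (0, y_o, b_o), and imposes weak disposability of emissions; it is a generic DDF, not the paper's exact specification or its bootstrap.

```python
# DDF sketch via scipy's LP solver: beta = 0 means the firm is on the frontier.
import numpy as np
from scipy.optimize import linprog

def ddf_score(X, Y, B, o):
    """X: inputs (n x m), Y: good outputs (n x s), B: emissions (n x k); o: firm index."""
    n = X.shape[0]
    c = np.r_[-1.0, np.zeros(n)]                     # maximise beta
    A_ub = np.vstack([np.c_[np.zeros(X.shape[1]), X.T],   # sum lam x <= x_o
                      np.c_[Y[o], -Y.T]])                  # sum lam y >= (1+beta) y_o
    b_ub = np.r_[X[o], -Y[o]]
    A_eq = np.c_[B[o], B.T]                          # sum lam b = (1-beta) b_o
    b_eq = B[o]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (n + 1))
    return res.x[0]

X = np.array([[2.0], [3.0]])   # inputs (toy data)
Y = np.array([[4.0], [4.0]])   # good outputs
B = np.array([[1.0], [2.0]])   # emissions
print([round(ddf_score(X, Y, B, o), 3) for o in (0, 1)])
```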
The Stock Market–Oil Prices Variability Relationship in the U.S.A.: The Financial Crisis Effect
by Dimitrios Kartsonakis-Mademlis, Nikolaos Dritsakis
Abstract: This paper employs bivariate GARCH models to investigate the relationship between the Dow Jones Industrial Average index and Brent crude oil. The models are used to generate the conditional variances of the two indices and to test for volatility spillover effects. Our evidence indicates that, over the entire sample period, there is no causal relationship between the volatilities of the Dow Jones and Brent. For the period before the financial crisis, there is evidence of a unidirectional link in the transmission of shocks from the stock market to the oil market and a bidirectional link in the volatility spillover between the markets. During the crisis, bidirectional shock and volatility linkages are found. In contrast, for the period after the financial crisis, only the transmission of shocks and volatility spillover from the Dow Jones to Brent is significant. We also compute optimal portfolio weights and dynamic risk-minimising hedge ratios to highlight the practical importance of our empirical results.
Keywords: BEKK-GARCH model; financial crisis; stock market; oil prices; volatility spillover.
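Once a bivariate GARCH model has produced conditional (co)variances, the standard Kroner-Ng (1998) portfolio weight and Kroner-Sultan (1993) hedge ratio follow directly. The sketch below uses hypothetical conditional moments, not the paper's fitted BEKK output.

```python
# w_t = (h_oo - h_so) / (h_ss - 2 h_so + h_oo), truncated to [0, 1]
# beta_t = h_so / h_oo
import numpy as np

def portfolio_weight(h_ss, h_oo, h_so):
    """Weight of the stock index in a one-dollar stock-oil portfolio (Kroner and Ng, 1998)."""
    w = (h_oo - h_so) / (h_ss - 2 * h_so + h_oo)
    return np.clip(w, 0.0, 1.0)

def hedge_ratio(h_so, h_oo):
    """Dollars of oil shorted per dollar long in stocks (Kroner and Sultan, 1993)."""
    return h_so / h_oo

h_ss, h_oo, h_so = 1.4, 2.3, 0.35   # toy conditional variances/covariance
print(portfolio_weight(h_ss, h_oo, h_so), hedge_ratio(h_so, h_oo))
```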
Consumption per effective labour in Brazil: testing for optimising behaviour
by Ricardo Ramalhete Moreira
Abstract: We contribute to the literature on household consumption in two specific ways: first, we adopt a long-run cointegration analysis instead of the first-difference approach conventional in previous works; second, by building on monthly data, we enlarge the number of observations in our sample, thus overcoming the common problem of low-frequency data in studies based on quarterly or annual time series. We confirm the hypothesis of optimising behaviour in consumption for the Brazilian economy, although jointly with a relevant role for income. Based on a Markov-switching approach, we also infer that risk aversion lowered the optimality of consumption after the subprime crisis.
Keywords: consumption; households; interest rate; income; Brazil.
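A minimal sketch of the long-run cointegration step described above, using the Engle-Granger test from statsmodels on simulated stand-ins for consumption and income (the paper's data and exact test are not reproduced here).

```python
# Small p-value => the two I(1) series share a long-run (cointegrating) relationship.
import numpy as np
from statsmodels.tsa.stattools import coint

rng = np.random.default_rng(1)
income = np.cumsum(rng.normal(size=300))                    # simulated I(1) series
consumption = 0.8 * income + rng.normal(scale=0.5, size=300)

t_stat, p_value, _ = coint(consumption, income)
print(f"Engle-Granger t = {t_stat:.2f}, p = {p_value:.3f}")
```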
Unreplicated Factorial Experimental Designs for Off-Line Quality Improvement and Industrial Process Optimization
by Hager Farhoud, Lotfi Taleb
Abstract: A problem frequently encountered in industrial off-line quality improvement is identifying, from among many factors, those responsible for large changes in the quality characteristics, namely factors with active location and/or dispersion effects. Unreplicated experimental designs provide economical tools for discovering which manufacturing conditions minimise product variation, keep product measurements near the desired target value, and make the product insensitive to environmental changes. However, such designs leave no degrees of freedom for estimating the experimental error, and the residuals at the high and low levels of factor combinations exhibit a dependency structure. To address these issues, this study first presents a synthesis and critical analysis of existing methods for identifying location and dispersion effects. Second, a new method is proposed, and its robustness is checked on a real example and through an extensive simulation study. Third, practical issues are discussed to guide investigators in their decision making.
Keywords: statistical process optimization; quality engineering; screening designs; power; IER; EER; modified residuals.
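One of the existing identification methods such a review would cover is Lenth's (1989) pseudo-standard-error test, which judges effects in unreplicated two-level designs without residual degrees of freedom. The sketch below is that classical method with an assumed IER-style critical value, not the authors' new proposal.

```python
# Lenth's method: a robust "pseudo standard error" (PSE) built from the
# median absolute effect, then trimmed of likely-active effects.
import numpy as np

def lenth_active(effects, critical=2.3):
    """Flag effects exceeding critical * PSE (critical ~ an IER cut-off)."""
    e = np.asarray(effects, dtype=float)
    s0 = 1.5 * np.median(np.abs(e))
    pse = 1.5 * np.median(np.abs(e[np.abs(e) < 2.5 * s0]))  # trimmed re-estimate
    return np.abs(e) > critical * pse

effects = [11.2, -0.4, 0.7, 9.8, -0.2, 0.5, 0.3]  # hypothetical 2^3 contrasts
print(lenth_active(effects))                       # flags the two large effects
```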
Globalization and the Nigerian Environment: Empirical Evidence from Quantile Cointegration
by Olalekan Bashir Aworinde
Abstract: This paper examines the validity of the environmental Kuznets curve (EKC) in Nigeria by exploring the impact of economic growth and of overall, economic, social and political globalization on the Nigerian ecological footprint, using the recently developed quantile autoregressive distributed lag (Q-ARDL) technique over the period 1970Q1-2018Q4. The findings of the linear ARDL support the presence of a long-run relationship and the validity of the EKC in all four models considered. The Q-ARDL results show that the assumptions on the error-correction terms are met across all quantiles. The long-run results reveal evidence of an inverted U-shaped EKC in Nigeria. Additionally, in the long run, overall, social, economic and political globalization worsen the Nigerian environment. The study therefore recommends that the Nigerian government adopt energy-efficient environmental policies that promote green growth.
Keywords: globalization; environment; Nigeria; environmental Kuznets curve (EKC); quantile autoregressive distributed lag (Q-ARDL); ecological footprint.
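A simplified illustration of the long-run EKC test across quantiles: regress the ecological footprint on income and income squared at several quantiles. The data and variable names are simulated stand-ins, and the full Q-ARDL additionally includes lags and an error-correction term.

```python
# Inverted-U EKC at a quantile: positive coefficient on gdp, negative on gdp^2.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
gdp = rng.uniform(6, 10, 196)            # hypothetical log GDP per capita, 1970Q1-2018Q4
ef = -20 + 5.2 * gdp - 0.31 * gdp**2 + rng.normal(scale=0.3, size=196)
df = pd.DataFrame({"ef": ef, "gdp": gdp, "gdp2": gdp**2})

for q in (0.25, 0.5, 0.75):
    fit = smf.quantreg("ef ~ gdp + gdp2", df).fit(q=q)
    print(q, fit.params["gdp"].round(2), fit.params["gdp2"].round(3))
```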
An S-curve Efficient Frontier on Second-Hand Auto Price
by Fadzilah Salim, Nur Azman Abu
Abstract: The efficient frontier, a popular concept in capital asset pricing, is introduced here as a practical predictive model for second-hand automobile prices. It is a practically useful model that yields an upper price recommendation for the market, suggesting the maximum possible coverage by auto insurance. A non-linear model has been observed to give a better estimate of price appreciation while describing real-life phenomena. In this paper, an S-curve efficient frontier model is proposed as a simple non-linear model for estimating second-hand automobile prices. A dynamic S-shaped membership function (SMF) is used as the basis for constructing an S-curve algorithm for the selected auto model. The S-curve model is found to offer a useful and practical estimate of second-hand auto prices and is therefore expected to provide a better, practical estimate of second-hand auto pricing in Malaysia.
Keywords: S-curve model; efficient frontier; second-hand auto price; price modelling.
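For reference, the standard S-shaped membership function from fuzzy modelling, which a model of this kind adapts to trace a price curve, is sketched below; the breakpoints a and b and the vehicle-age interpretation are hypothetical.

```python
# Standard SMF: 0 up to a, a quadratic rise to 1/2 at (a+b)/2, then 1 from b onward.
import numpy as np

def smf(x, a, b):
    """Smooth S-curve rising from 0 at x <= a to 1 at x >= b."""
    x = np.asarray(x, dtype=float)
    m = (a + b) / 2.0
    return np.where(x <= a, 0.0,
           np.where(x <= m, 2 * ((x - a) / (b - a)) ** 2,
           np.where(x <= b, 1 - 2 * ((x - b) / (b - a)) ** 2, 1.0)))

age = np.linspace(0, 10, 6)        # vehicle age in years (toy grid)
print(smf(age, a=1.0, b=8.0))      # e.g., fraction of total price change realised
```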
The use of classification models to identify factors differentiating the competitiveness of the EU-15 and EU-13 countries
by Agnieszka Kleszcz
Abstract: This paper reports a study of the Global Competitiveness Index pillars, aiming to differentiate European Union countries, grouped by accession year, in terms of their competitiveness. Linear (regularised logistic regression) and nonlinear (random forest) classifiers are proposed to model the relationship between multidimensional economic condition indicators and a country's group. The key discriminators of the competitiveness of the EU-15 (accession before 2004) and the EU-13 (accession in or after 2004) are obtained by analysing feature importance in the classification models. From a study of 12 competitiveness indicators in the World Economic Forum reports (2007-2017 editions), we conclude that the largest disparities between the two groups of countries are observed in infrastructure. Innovation, market size and institutions are the next three most important differentiating factors. A major methodological contribution of the paper is the use of explainable statistical models for identifying key features differentiating groups of countries.
Keywords: logistic regression; random forest; European Union; Global Competitiveness Index; GCI; feature importance.
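A minimal sketch of the two classifiers and their feature importances, assuming a matrix of 12 pillar scores and an EU-15/EU-13 label; the data here are simulated, not the GCI panel.

```python
# Importance: |standardised coefficient| for the linear model,
# impurity-based feature_importances_ for the random forest.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(3)
X = rng.normal(size=(280, 12))               # country-year rows x 12 pillars (toy)
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(scale=0.5, size=280)) > 0

logit = LogisticRegression(penalty="l2", C=1.0).fit(StandardScaler().fit_transform(X), y)
forest = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, y)

print(np.abs(logit.coef_).ravel().argsort()[::-1][:4])   # top pillars, linear model
print(forest.feature_importances_.argsort()[::-1][:4])   # top pillars, forest
```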
Polarization, Institutional Quality, and Social Cohesion: Evidence in a Worldwide Scenario
by Muhammad Nadeem, Mumtaz Anwar, Zahid Pervaiz
Abstract: Diversity and socioeconomic deprivation have been widely discussed as determinants of social cohesion. The current study discusses other important factors that may also matter for social cohesion, with a particular focus on legal institutional quality. A fixed-effect model is estimated for a large number of countries, using five-year average panel data from 1990 to 2010. The results suggest that legal institutional quality augments social cohesion, while ethnic diversity, income inequality, and globalization threaten it. Gender equality and per capita income also augment social cohesion. Moreover, the threat to social cohesion is greater where legal institutional quality is low and ethnic diversity and income inequality are high than where legal institutional quality is high and ethnic diversity and income inequality are low. The results further suggest that the harmful effects of ethnic diversity, globalization, and inequality can not only be offset by institutional quality but can even be turned to the enhancement of social cohesion.
Keywords: social cohesion; institutional quality; diversity; income inequality; globalization.
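A minimal sketch of the within (fixed-effect) estimator used in studies of this kind: demean each variable by country, then regress. The panel and variable names are simulated stand-ins, not the authors' data.

```python
# Within estimator with country-clustered standard errors.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(4)
df = pd.DataFrame({
    "country": np.repeat(np.arange(100), 5),   # 5 five-year averages, 1990-2010
    "cohesion": rng.normal(size=500),
    "inst_quality": rng.normal(size=500),
    "diversity": rng.normal(size=500),
})
demeaned = df.groupby("country").transform(lambda s: s - s.mean())
fit = sm.OLS(demeaned["cohesion"], demeaned[["inst_quality", "diversity"]]).fit(
    cov_type="cluster", cov_kwds={"groups": df["country"]})
print(fit.summary().tables[1])
```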
Simple methods to handle missing data
by Ruzhdie Bici
Abstract: Missing data are a common problem in large data sets. In particular, missing data arise in surveys and other studies, increasing variance and producing unreliable results. While most researchers focus on more sophisticated methods, the simplest techniques are rarely treated in detail. This article explains the theoretical concepts behind the different types of missing data, the reasons they occur, and methods for dealing with them. The focus is on simple imputation techniques (mean imputation, regression imputation, and leaving missing values untreated). The analysis uses IHS5 2019-2020 survey data from Malawi. The aim is to estimate total property values (selling and renting) in the country, although these variables are only partly filled in. The results show how the different imputation methods influence the estimates and how missing values can sometimes be predicted from auxiliary variables.
Keywords: simple methods; missing data; handle missing data; imputation; regression; non-response.
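A short sketch of the simple techniques discussed: mean imputation versus regression imputation of a partly missing property value from an auxiliary variable. Column names and the 20% non-response rate are hypothetical stand-ins for the IHS5 variables.

```python
# Compare the estimated total under no treatment, mean imputation,
# and regression imputation from an auxiliary variable.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(5)
rooms = rng.integers(1, 8, 1000)                       # auxiliary variable
value = 500 * rooms + rng.normal(scale=200, size=1000)
value[rng.random(1000) < 0.2] = np.nan                 # 20% non-response
df = pd.DataFrame({"rooms": rooms, "value": value})

mean_imputed = df["value"].fillna(df["value"].mean())

obs = df.dropna()
model = LinearRegression().fit(obs[["rooms"]], obs["value"])
reg_imputed = df["value"].copy()
mask = reg_imputed.isna()
reg_imputed[mask] = model.predict(df.loc[mask, ["rooms"]])

print(df["value"].sum(), mean_imputed.sum(), reg_imputed.sum())
```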
The (relative) importance of the attack in the game of football: evidence from a team-level study of Italian Serie A
by Siyan Chen, Saul Desiderio
Abstract: Attackers are recognised as the most important players in football, and this is reflected in their high wages and market values. A natural question is whether such high financial costs are justified by the actual contribution of the attack to the success of a team. In this paper, we use team-level data covering 34 seasons of Italian Serie A to test whether the offensive sector as a whole is indeed the major determinant of a team's strength. Results show only a moderate prevalence of the attack over the defence, suggesting that offensive players are likely overvalued relative to their actual contribution to team strength. In addition, we find that a good defence is the key to finishing the season in the top spots, whereas a bad attack is the main reason for finishing at the bottom of the table.
Keywords: football; strikers; team success; Italian Serie A; pooled OLS; probit.
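A sketch of the probit step suggested by the keywords: model the probability of a top-table finish from goals scored and conceded per season. The data are simulated stand-ins, not the Serie A panel.

```python
# A larger |coefficient| on goals conceded would echo the defence-matters finding.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
scored = rng.normal(55, 12, 600)                 # team-season goals scored (toy)
conceded = rng.normal(55, 12, 600)               # team-season goals conceded (toy)
top = (0.03 * scored - 0.05 * conceded + rng.normal(size=600)) > 0

X = sm.add_constant(np.column_stack([scored, conceded]))
probit = sm.Probit(top.astype(int), X).fit(disp=0)
print(probit.params)
```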
Measuring tax administrations' efficiency using data envelopment analysis: evidence from 26 European countries
by Athanasios Anastasiou, Charalampos Kalligosfyris, Eleni Kalamara
Abstract: The purpose of this paper is to assess the efficiency of the tax administrations of 26 European countries using data envelopment analysis (DEA). Applying the output-oriented CCR DEA model, we quantify the performance of these tax administrations in the areas of taxpayer services, public revenue collection, strengthening of voluntary tax compliance, and targeted tax audits; we assess relative efficiency, evaluate the results, and identify the fully efficient and the inefficient tax administrations in which real improvements in efficiency can be achieved. For the tax administrations assessed as inefficient, we then identify the reference units, estimate the missing quantity of outputs and the excess amount of inputs needed to make them efficient, and propose a set of specific changes for improving their operation.
Keywords: tax administration; efficiency; data envelopment analysis; DEA; tax compliance; tax audit; tax revenue.
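A minimal output-oriented CCR DEA sketch: phi >= 1 is the proportional output expansion needed for unit o to reach the frontier (phi = 1 means efficient). The single input and output here are placeholders for the paper's staffing, revenue, audit and service measures.

```python
# Output-oriented CCR envelopment LP solved with scipy.
import numpy as np
from scipy.optimize import linprog

def ccr_output_score(X, Y, o):
    """X: inputs (n x m); Y: outputs (n x s); o: evaluated unit. Returns phi."""
    n = X.shape[0]
    c = np.r_[-1.0, np.zeros(n)]                          # maximise phi
    A_ub = np.vstack([np.c_[np.zeros(X.shape[1]), X.T],   # sum lam x <= x_o
                      np.c_[Y[o], -Y.T]])                 # phi y_o <= sum lam y
    b_ub = np.r_[X[o], np.zeros(Y.shape[1])]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(1, None)] + [(0, None)] * n)
    return res.x[0]

X = np.array([[3.0], [2.0], [4.0]])   # e.g., administrative cost (toy)
Y = np.array([[5.0], [4.0], [6.0]])   # e.g., revenue collected (toy)
print([round(ccr_output_score(X, Y, o), 3) for o in range(3)])
```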
Quantile regression-based seasonal adjustment
by Massimiliano Caporin, Mohammed Elseidi
Abstract: We introduce a seasonal adjustment method based on quantile regression that focuses on capturing different forms of deterministic seasonal patterns. Given a variable of interest, by describing its seasonal behaviour over an approximation of the entire conditional distribution, we can remove seasonal patterns affecting the mean and/or the variance, as well as seasonal patterns varying over quantiles of the conditional distribution. We provide empirical examples based on simulated and real data, through which we compare our proposal with least squares approaches.
Keywords: quantile regression; seasonal adjustment; deterministic seasonal patterns.
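A sketch of the core idea: estimate monthly seasonal dummies by quantile regression and subtract the fitted seasonal component. This is a stand-in for the paper's procedure, shown at the median only on simulated data.

```python
# Median-quantile seasonal adjustment with deterministic monthly dummies.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(7)
months = np.tile(np.arange(12), 20)
y = 10 + 3 * np.sin(2 * np.pi * months / 12) + rng.normal(scale=1.0, size=240)

D = pd.get_dummies(months, drop_first=True).astype(float)   # 11 seasonal dummies
X = sm.add_constant(D)
fit = sm.QuantReg(y, X).fit(q=0.5)
seasonal = fit.predict(X) - fit.params["const"]   # seasonal component at the median
adjusted = y - seasonal
print(adjusted[:12].round(2))
```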
Special Issue on: Computational and Statistical Modelling for Tackling the Emergence of the COVID-19 Pandemic
The management of the COVID-19 epidemic: estimating the actual infected population, the impact of social distancing and directions for an efficient testing strategy. The case of Italy
by Federico Brogi, Barbara Guardabascio, Giulio Barcaroli
Abstract: This work focuses on the so-called 'first wave' of the COVID-19 epidemic (21 February-10 April 2020) and aims at outlining a viable strategy to contain the spread of COVID-19 and to plan an efficient exit from lockdown measures. It offers a model that estimates the total number of actual infections in the population at the national and regional level by inferring from the lethality rate, filling the proven gap with the number of officially reported cases. The result is the reference population used in a forecasting exercise for new daily cases, compared with the reported ones. Any discrepancy is analysed in terms of compliance with the restrictive measures or of an insufficient number of tests performed. The simulation indicates that an efficient testing policy is the main actionable measure. Furthermore, the paper estimates the optimal number of tests to be performed at the national and regional level in order to release an increasing number of individuals from lockdown.
Keywords: COVID-19; policy evaluation; scenario analysis; infected population; testing strategy; compliance; Italy.
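A back-of-the-envelope version of the estimation step described above: infer actual infections from reported deaths and an assumed infection fatality rate (IFR), bearing in mind that the implied infections occurred roughly an onset-to-death lag earlier. The IFR value is illustrative, not the paper's calibration.

```python
# Deaths, being more reliably recorded than confirmed cases, anchor the estimate.
def infer_infections(deaths, ifr=0.013):
    """Infections implied by a death count under an assumed IFR of 1.3%."""
    return deaths / ifr

# 600 deaths on one day imply roughly this many infections ~2-3 weeks earlier,
# far above typical confirmed counts:
print(round(infer_infections(600)))
```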
Socio-economic and demographic factors influencing the spatial spread of COVID-19 in the USA
by Christopher F. Baum, Miguel Henry
Abstract: As the COVID-19 pandemic progressed in the USA, 'hotspots' shifted geographically over time towards suburban and rural counties showing a high prevalence of the disease. We analyse population-adjusted confirmed case rates based on daily US county-level variations in COVID-19 confirmed case counts during the first several months of the pandemic (1 March 2020 through 23 May 2020) to evaluate the spatial dependence between neighbouring counties and to quantify the overall spatial effect of socio-economic and demographic factors on the prevalence of COVID-19. We find strong evidence that county-level socio-economic and demographic factors, such as sex, race, ethnicity, population density, pollution, health conditions, and income, influence the spatial spread. The relevance of the spatial factors suggests that neighbouring counties have a significant and positive effect on the prevalence of COVID-19.
Keywords: COVID-19; coronavirus; spatial spillovers; socio-economic factors; demographics; spatial econometrics.
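A sketch of the basic spatial-dependence diagnostic behind models of this kind: Moran's I for county case rates under a row-standardised contiguity matrix W. The four-county chain and rates are toy stand-ins; the paper's spatial econometric specification is richer.

```python
# Moran's I: positive values indicate that similar case rates cluster in space.
import numpy as np

def morans_i(x, W):
    x = x - x.mean()
    W = W / W.sum(axis=1, keepdims=True)       # row-standardise
    n = len(x)
    return (n / W.sum()) * (x @ W @ x) / (x @ x)

# toy adjacency for a chain of counties 1-2-3-4
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
rates = np.array([10.0, 12.0, 30.0, 33.0])     # cases per 100k (toy)
print(round(morans_i(rates, W), 3))            # > 0: neighbours cluster
```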
An integrated K-means-GP approach for US stock fund diversification and its impact due to COVID-19
by Dinesh K. Sharma, H.S. Hota, Vineet Kumar Awasthi
Abstract: Diversifying a stock fund is a tedious task owing to the erratic nature of the stock market, made more challenging by expectations of high annual returns at low risk. This work explores the potential of goal programming (GP) combined with the K-means algorithm as an integrated K-means-GP approach to fund diversification: K-means is used to create groups of stocks based on their performance, and GP is then used to diversify the total fund across these groups to achieve a high annual return. Experiments were carried out on the 30 stocks of the DOW30 for the years 2017-18, 2018-19, and 2019-20. A comparative study was carried out for three different cases, based on individual-year data and on averages of two and three years of data. The empirical results show that the K-means-GP approach outperformed the plain GP approach for stock fund diversification; the annual return is highest for the K-means-GP approach using three years of averaged data, at 12.59% against an expected annual return of 20%. Owing to COVID-19, a few stocks moved in the negative direction, and the annual return after fund diversification was affected accordingly.
Keywords: k-means; goal programming; DOW30; fund diversification; COVID-19.
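A sketch of the integration: K-means groups the stocks by performance, then a small goal-programming LP splits the budget across groups to chase a target return. Returns are simulated and the single return goal is illustrative; the paper's goals and constraints are more detailed.

```python
# GP stage: minimise deviations (d-, d+) from the 20% return goal,
# subject to the group weights summing to one.
import numpy as np
from scipy.optimize import linprog
from sklearn.cluster import KMeans

rng = np.random.default_rng(8)
features = rng.normal(0.08, 0.12, size=(30, 2))    # 30 stocks x (return, volatility), toy
groups = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(features)
group_return = np.array([features[groups == g, 0].mean() for g in range(4)])

target = 0.20
c = np.r_[np.zeros(4), 1.0, 1.0]                   # minimise d- + d+
A_eq = np.vstack([np.r_[group_return, 1.0, -1.0],  # return + d- - d+ = target
                  np.r_[np.ones(4), 0.0, 0.0]])    # weights sum to 1
b_eq = np.array([target, 1.0])
res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, 1)] * 4 + [(0, None)] * 2)
print(res.x[:4].round(3), "achieved:", (res.x[:4] @ group_return).round(4))
```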
Non-parametric Bayesian updating and windowing with kernel density and the kudzu algorithm
by Robert L. Grant
Abstract: The concept of 'updating' parameter estimates and predictions as more data arrive is an important attraction for people adopting Bayesian methods, and essential in big data settings. Implementation via the hyperparameters of a joint prior distribution is challenging. This paper considers non-parametric updating, using a previous posterior sample as a new prior sample. Streaming data can be analysed in a moving window of time by subtracting old posterior samples with appropriate weights. We evaluate three forms of kernel density estimation, a sampling importance resampling implementation, and a novel algorithm called kudzu, which smooths density estimation trees. The methods are tested for distortion of illustrative prior distributions, long-run performance in a low-dimensional simulation study, and feasibility on a realistically large and fast dataset of taxi journeys. Kernel estimation appears to be useful in low-dimensional problems, and kudzu in high-dimensional problems, but careful tuning and monitoring are required. Areas for further research are outlined.
Keywords: Bayesian data analysis; big data; density estimation trees; kernel density estimation; non-parametric statistics; streaming data.
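A one-dimensional, deliberately simplified sketch of the sampling importance resampling (SIR) variant evaluated in the paper: treat a previous posterior sample as the prior via a Gaussian KDE, weight draws by the new likelihood, and resample. The Gaussian model and all numbers are illustrative.

```python
# Non-parametric update: KDE prior -> importance weights -> resampled posterior.
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
prior_sample = rng.normal(0.0, 1.0, 5000)      # posterior from the previous window
prior_kde = stats.gaussian_kde(prior_sample)

new_data = rng.normal(0.8, 1.0, 50)            # newly arrived batch

def log_lik(theta):
    return stats.norm.logpdf(new_data[:, None], loc=theta, scale=1.0).sum(axis=0)

proposals = prior_kde.resample(5000)[0]        # draw from the KDE prior
log_w = log_lik(proposals)
w = np.exp(log_w - log_w.max())                # stabilised importance weights
posterior = rng.choice(proposals, size=5000, p=w / w.sum())   # resample step
print(posterior.mean().round(3))               # prior mean pulled towards the data
```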
COVID-19 pandemic and the economy: sentiment analysis on Twitter data
by Shira Fano, Gianluca Toschi
Abstract: In the last decade, social networks have increasingly been used in the social sciences to monitor consumer preferences and citizens' opinion formation, as they produce a massive amount of data. In this paper, we collect and analyse data from Twitter posts to identify emerging patterns related to the COVID-19 outbreak and to evaluate the economic sentiment of users during the pandemic. Using the Twitter API, we collected tweets containing the term 'coronavirus' and at least one keyword related to the economy, selected from a predetermined list, obtaining a database of approximately two million tweets. We show that our Economic Twitter Index (ETI) is able to nowcast the current state of economic sentiment, exhibiting peaks and drops related to real-world events. Finally, we test our index and show that it is positively correlated with standard economic indicators.
Keywords: economic sentiment index; sentiment analysis; COVID-19 pandemic; Twitter; social media.
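A sketch of the scoring step: average a lexicon-based sentiment score over each day's economy-related tweets to build a daily index. NLTK's VADER is used as a stand-in for the paper's own sentiment pipeline, and the tweets are invented examples.

```python
# Daily ETI-style index: mean compound sentiment of matched tweets per day.
import pandas as pd
from nltk.sentiment import SentimentIntensityAnalyzer  # needs nltk.download("vader_lexicon")

tweets = pd.DataFrame({
    "day": ["2020-03-09", "2020-03-09", "2020-03-10"],
    "text": ["coronavirus will crush small businesses",
             "markets rally despite coronavirus fears",
             "coronavirus stimulus package brings some relief"],
})
sia = SentimentIntensityAnalyzer()
tweets["score"] = tweets["text"].apply(lambda t: sia.polarity_scores(t)["compound"])
print(tweets.groupby("day")["score"].mean())           # one index value per day
```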
Searching for the peak: Google Trends and the first COVID-19 wave in Italy
by Paolo Brunori, Giuliano Resce, Laura Serlenga
Abstract: One of the difficulties faced by policy makers during the COVID-19 outbreak in Italy was monitoring the diffusion of the virus. Owing to changes in the testing criteria and insufficient resources to test all suspected cases, the number of 'confirmed infected' reported by official statistics rapidly proved unreliable. We explore the possibility of using information from Google Trends to predict the evolution of the epidemic. Following the most recent developments in the statistical analysis of longitudinal data, we estimate a dynamic heterogeneous panel. This approach takes into account the presence of common shocks and unobserved components in the error term, both likely to occur in this context. We find that Google queries contain useful information for predicting the number of patients admitted to intensive care units, the number of deaths, and excess mortality in Italian regions.
Keywords: COVID-19; Google Trends; dynamic panel data; Italy.
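A sketch of the data-collection step using the pytrends package (an unofficial Google Trends client). The query terms, timeframe and region code are illustrative assumptions; the paper's query set is larger.

```python
# Pull regional search-intensity series (0-100 scale) for symptom-related queries.
from pytrends.request import TrendReq

pytrends = TrendReq(hl="it-IT")
pytrends.build_payload(["febbre", "tosse"],          # "fever", "cough"
                       timeframe="2020-02-01 2020-05-31",
                       geo="IT-25")                  # Lombardy
queries = pytrends.interest_over_time()
print(queries.head())
```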