International Journal of Cloud Computing (10 papers in press)
Task Scheduling and Virtual Resource Optimizing in Hadoop YARN-based Cloud Computing Environment
by Frederic Nzanywayingoma
Abstract: We are living in a data-driven world in which high volumes of data are changing how the traditional IT industry operates. Big data is generated everywhere around us at all times by cameras, mobile devices, sensors, and software logs, in volumes ranging from hundreds of terabytes to petabytes. Analyzing such massive data therefore requires new skills, data-intensive applications, and storage clusters. Apache Hadoop is one of the most popular recent tools developed for big data processing. It has been deployed by many large companies to stream large files in big datasets. The main purpose of this paper is to analyze different scheduling algorithms that can help to achieve better performance, efficiency, and reliability in a Hadoop YARN environment. We describe several task schedulers that operate at different levels of Hadoop, namely the FIFO (First In First Out) scheduler, fair scheduler, delay scheduler, deadline constraint scheduler, dynamic priority scheduler, and capacity scheduler, and we analyze the performance of these widely used Hadoop task schedulers in terms of makespan, turnaround time, and throughput. A reliable scheduling algorithm that can work efficiently in Hadoop environments is suggested. The paper concludes with experimental results.
Keywords: Hadoop; MapReduce; Task Scheduling; YARN; HDFS; JobTracker; TaskTracker.
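The three evaluation metrics named in the abstract above (makespan, turnaround time, and throughput) can be illustrated with a minimal sketch. It assumes a single execution slot served in FIFO order; the Job class, the fifo_metrics function, and the sample job values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    arrival: float   # submission time (seconds)
    duration: float  # processing time (seconds)


def fifo_metrics(jobs):
    """Run jobs one at a time in arrival order and return the three metrics."""
    clock = 0.0
    turnarounds = []
    for job in sorted(jobs, key=lambda j: j.arrival):
        start = max(clock, job.arrival)  # wait if the single slot is busy
        clock = start + job.duration     # completion time of this job
        turnarounds.append(clock - job.arrival)
    first_arrival = min(j.arrival for j in jobs)
    makespan = clock - first_arrival     # first submission to last completion
    return {
        "makespan": makespan,
        "avg_turnaround": sum(turnarounds) / len(turnarounds),
        "throughput": len(jobs) / makespan,  # jobs completed per second
    }


# Hypothetical workload: three jobs submitted at t = 0, 1, and 2 seconds.
jobs = [Job("a", 0, 4), Job("b", 1, 2), Job("c", 2, 6)]
metrics = fifo_metrics(jobs)
```

Other scheduling policies could be compared on the same three numbers by changing only the order in which jobs are served.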
A Survey of Scheduling Frameworks in Big Data Systems
by Ji LIU, Esther Pacitti, Patrick Valduriez
Abstract: Cloud and big data technologies are now converging to enable organizations to outsource data in the cloud and get value from data through big data analytics. Big data systems typically exploit computer clusters to gain scalability and obtain a good cost-performance ratio. However, scheduling a workload in a computer cluster remains a well-known open problem. Scheduling methods are typically implemented in a scheduling framework and may have different objectives. In this paper, we survey scheduling methods and frameworks for big data systems, propose a taxonomy and analyze the features of the different categories of scheduling frameworks. These frameworks have been designed initially for the cloud (MapReduce) to process Web data. We examine sixteen popular scheduling frameworks and discuss their features. Our study shows that different frameworks are proposed for different big data systems, different scales of computer clusters and different objectives. We propose the main dimensions for workloads and metrics for benchmarks to evaluate these scheduling frameworks. Finally, we analyze their limitations and propose new research directions.
Keywords: Big data; cloud computing; cluster computing; parallel processing; scheduling method; scheduling framework.
Optimal Cloud Resource Provisioning for Auto-scaling Enterprise Applications
by Satish Srirama, Alireza Ostovar
Abstract: Auto-scaling enterprise/workflow systems on the cloud requires dealing with both the scaling policy, which determines "when to scale", and the resource provisioning policy, which determines "how to scale". This paper presents a novel resource provisioning policy that can find the most cost-optimal combination of cloud instance types able to fulfill the incoming workload. The model considers all major factors involved in estimating the resource amount, such as the processing power, periodic cost, and configuration cost of each instance type, the lifetime of each running instance, and the capacity of clouds. Benchmark experiments were conducted on the Amazon cloud and compared against Amazon AutoScale, using a real load trace and two main control flow components of enterprise applications, AND and XOR. The experiments showed that the model is plausible for auto-scaling any web/services-based enterprise workflow/application on the cloud, and they also showed the effect of individual parameters on the optimal policy.
Keywords: Cloud computing; auto-scaling; enterprise applications; resource provisioning; optimization; control flows.
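The idea of choosing a cost-optimal mix of instance types for a given workload can be sketched as a small exhaustive search. This is a deliberately simplified illustration, not the model from the paper: it considers only capacity and hourly cost, ignores configuration cost and instance lifetime, and the instance names, capacities, and prices are hypothetical.

```python
from itertools import product

# Hypothetical instance types: (name, requests/sec capacity, cost per hour).
INSTANCE_TYPES = [("small", 10, 1.0), ("medium", 25, 2.2), ("large", 60, 5.0)]


def cheapest_mix(workload, max_per_type=5):
    """Return (cost, counts) for the cheapest mix whose capacity covers workload."""
    best = None
    # Enumerate every combination of 0..max_per_type instances of each type.
    for counts in product(range(max_per_type + 1), repeat=len(INSTANCE_TYPES)):
        capacity = sum(n * t[1] for n, t in zip(counts, INSTANCE_TYPES))
        if capacity < workload:
            continue  # this mix cannot serve the workload
        cost = sum(n * t[2] for n, t in zip(counts, INSTANCE_TYPES))
        if best is None or cost < best[0]:
            best = (cost, counts)
    return best


# Hypothetical workload of 70 requests/sec.
cost, counts = cheapest_mix(70)
```

Exhaustive search is only practical for a handful of instance types; a larger search space would typically call for an integer programming solver.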
Special Issue on: ICA CON 2016 & 2017 A Collaborative Community of Leaders Cloud Computing in Education
Extreme Value Analysis for Capacity Design
by Szilard Bozoki, Andras Pataricza
Abstract: Cloud computing has become the fundamental platform for service offerings. Such services frequently face peaks in their variable workload. Thus, the cloudification of critical applications with strict Service Level Agreements (e.g. performability) needs a properly engineered capacity to withstand peak loads. A core problem is predicting the value of peaks, especially in bursty workloads. These peaks originate in the cumulative effect of hard-to-predict rare and extreme events. Luckily, system monitoring collects enough vital information for prediction by statistical methods. Extreme value analysis focuses on the prediction of future peaks.
This paper investigates the use of extreme value theory for capacity planning in cloud platforms and services and assesses the technical metrology aspects as well.
Keywords: cloud computing; performability engineering; capacity design; extreme value analysis; Facebook Prophet.
A Formal Model Toward Scientific Workflow Security in the Cloud
by Donghoon Kim, Mladen Vouk
Abstract: Scientific workflow management systems (SWFMS) may be vulnerable in the Cloud since they may not yet have embraced practical security solutions. This paper presents an approach to formal modeling of scientific workflow security in the Cloud. We focus on the procedure to build secure data flows in a holistic way. This work suggests that a white-list approach to input validation can play a vital role in protecting the flows from zero-day attacks.
Keywords: Formal method; security; workflow; security property; input validation; access control; cloud.
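A white-list approach to input validation, as mentioned in the abstract above, accepts only inputs that match an explicitly permitted form and rejects everything else. The following is a minimal sketch under assumed conditions: the parameter names and patterns in ALLOWED_INPUTS are hypothetical and do not come from the paper.

```python
import re

# Hypothetical allow-list: parameter name -> pattern its value must fully match.
ALLOWED_INPUTS = {
    "dataset_id": re.compile(r"[A-Za-z0-9_-]{1,32}"),
    "iterations": re.compile(r"[0-9]{1,4}"),
}


def validate(params):
    """Return True only if every parameter is known and matches its pattern."""
    for name, value in params.items():
        pattern = ALLOWED_INPUTS.get(name)
        if pattern is None or pattern.fullmatch(value) is None:
            return False  # unknown field or unexpected value: reject
    return True


accepted = validate({"dataset_id": "run_42", "iterations": "100"})
rejected_value = validate({"dataset_id": "x; rm -rf /"})
rejected_field = validate({"cmd": "ls"})
```

Because anything not explicitly permitted is rejected, such a check does not depend on recognizing a specific attack in advance.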
Cloud-based Environment in Support of IoT Education
by Anand Singh, Yannis Viniotis
Abstract: Students taking an IoT curriculum need to acquire skills (among others) as (a) developers of IoT applications, (b) architects of IoT systems, and (c) administrators of such systems. At North Carolina State University, we have developed a cloud-based environment to support the development of such skills. The environment is based on IBM's Watson IoT Cloud Platform and uses components such as Intel's Edison boards, Raspberry Pis, Cisco IoT gateways, TI boards, sensors/actuators, and GitHub to give students an end-to-end experience in all aspects of IoT solution and system development. In this paper, we discuss the challenges we faced, how we overcame them, feedback from students, and plans for our next steps.
Keywords: IoT systems; Cloud platforms; Edge Computing; Curriculum development.
Special Issue on: ICBDCC2017 Big Data and Cloud Computing Technologies
A New Key Generation Technique Using GA for Enhancing Data Security in Cloud Environment
by D. I. George Amalarethinam, H.M. LEENA
Abstract: Cloud computing is a distributed and centralized network of interconnected systems that provides resources on a pay-per-use basis. This facility of ubiquitous computing attracts users to its various services. One of the major issues in the cloud is security. When users deploy services to store their sensitive data in the cloud, protecting that data is a crucial task. Cryptography plays an important role in securing data. Symmetric cryptographic algorithms are more suitable when a large amount of data is to be stored. Asymmetric algorithms, being slower, are preferred for encrypting keys rather than data. The technique for generating or selecting the key plays a vital role in securing the data. Genetic algorithms are a powerful tool for solving most optimization problems, such as the Traveling Salesperson Problem, the Knapsack Problem, and the Scheduling Problem. The proposed genetic algorithm is used to generate the best key that satisfies the specified fitness function. The generated key is sent to the Asymmetric Addition Chaining Cryptographic Algorithm (ACCA) for encryption. The encrypted key can be used by any symmetric algorithm, such as AES, DES, or Blowfish, for encrypting a large volume of data.
Keywords: Cloud Computing; Security; Cryptography; Encryption; Key Generation; Genetic Algorithms; Optimization Problems; Fitness Function; Data Security; Addition Chaining.
Confidential Storage of Medical Images: A Chaos Based Encryption Approach
by Mohamed Parvees M Y, Abdul Samath J, Parameswaran Bose B
Abstract: Recent developments in telehealth have increased the demand for clinical and non-clinical services, which motivates work on medical image security to provide better teleradiology services. One of the mandatory characteristics of image security is confidentiality. Traditional block and stream ciphers are suitable for encrypting small data or files that do not contain redundant information. Hence, this study proposes an encryption algorithm that provides confidentiality for medical images using chaotic maps. Different enhanced chaotic economic maps (ECEM) are derived by substituting sine and cosine functions into the basic chaotic economic map (CEM) equation. The ECEMs are studied in detail with respect to their bifurcation behavior and Lyapunov exponents to achieve greater robustness in encryption. The improved maps generate different chaotic sequences, which are employed in confusing, swapping, and diffusing 16-bit DICOM image pixels, thereby assuring confidentiality. After scrambling, different security analyses, such as statistical, entropy, differential, key space, key sensitivity, cropping attack, noise attack, and decryption efficacy analyses, are performed to prove the effectiveness of the proposed algorithm.
Keywords: Patient confidentiality; Chaotic map; DICOM encryption; Medical images.
A Secure Encryption Scheme based on Certificateless Proxy Signature
by Sudharani Kamaraj
Abstract: The Certificateless Public Key Cryptography (CL-PKC) scheme was introduced to solve the key escrow problem in identity-based cryptography and to eliminate the use of security certificates. By introducing the proxy signature concept into the certificateless cryptography scheme, the Certificateless Proxy Signature (CLPS) scheme has attracted the attention of more researchers. However, this scheme suffers from security issues and fails to achieve unforgeability against attacks. To overcome the security issues in the existing cryptographic scheme, this paper proposes an encryption scheme based on the certificateless proxy signature for sharing sensitive data in the public cloud in a secure manner. The proposed scheme is proven to be unforgeable against message attacks. Compared with the existing CLPS scheme without random oracles, the proposed scheme offers better data security while ensuring better data sharing performance. The experimental results show that the proposed scheme requires less encryption and decryption time than the existing scheme.
Keywords: Access Control; Cloud computing; Certificateless Public Key Cryptography (CL-PKC); Data Confidentiality; Malicious KGC Attack; Proxy Signature; Public Key Replacement Attack.
Automatic Cloud Service Monitoring and Management with Prediction based Service Provisioning
by Kirit Modi
Abstract: Cloud computing provides an efficient, on-demand, and scalable environment for end users by offering cloud services according to SLAs (Service Level Agreements) mutually agreed between users and cloud service providers. As the number of cloud users increases day by day, cloud service providers are sometimes unable to offer service as per the SLA, which results in SLA violations. To detect SLA violations and to ensure that the service provider fulfills user requirements, cloud services should be monitored. Current cloud service provisioning is based on the current workload. Because of unexpected future customer demands, the cloud service provider may not be able to maintain the promised QoS, leading to SLA violations. Thus, the future requirements of the customer need to be predicted, and cloud service provisioning must be done on that basis. Cloud service monitoring plays a critical role for both customers and service providers: the monitoring status helps service providers improve their services, and it also helps customers know whether they are receiving the QoS promised in the SLA. Most existing cloud service monitoring frameworks are developed for the service provider side, which raises the question of the correctness and fairness of the monitoring mechanism. On the other hand, if monitoring is applied at the user side, it becomes an overhead for the clients. To address this issue, this paper proposes an ontology-based Automatic Cloud Services Monitoring and Management (ACSMM) approach with prediction-based service provisioning, in which cloud service monitoring and management is performed at the cloud broker, an intermediate entity between the user and the service provider. In this approach, when an SLA violation is detected, an alert is sent to both clients and service providers, and a status report is generated.
Based on the status report, the broker automatically reschedules tasks to reduce further SLA violations. In our framework, cloud service provisioning is based on predicting future demands so that the cloud service provider can handle unexpected resource demands from customers, which will reduce SLA violations.
Keywords: Cloud Service Monitoring; Service Level Agreement; Cloud Service; Ontology; Rescheduling; Prediction based Service Provisioning.