A Comprehensive Survey on Big Data Analytics: Characteristics, Tools and Techniques
Abstract
Modern computing devices generate vast amounts of diverse data. It means that a fast transition through various computing devices leads to big data production. Big data with high velocity, volume, and variety presents challenges like data inconsistency, scalability, real-time analysis, and tool selection. Although numerous solutions have been proposed for big data processing, they are often limited in scope and effectiveness. This survey aims to address the lack of comprehensive analysis of big data challenges in relation to machine learning (ML) and the Internet of Things (IoT) environments, particularly concerning the 7Vs of big data. It emphasizes the significance of selecting suitable tools to address each unique big data characteristic, providing a structured approach to manage these challenges effectively. The article systematically reviews big data characteristics and associated techniques, with a detailed discussion of various tools and their applications. Additionally, it analyzes existing ML methods and techniques for IoT data analytics in big data contexts. Through a systematic literature review (SLR), we examine key aspects, including core concepts, benefits, limitations, and the impact of big data on ML algorithms and IoT data analytics. We highlight groundbreaking studies addressing big data challenges to impact future research and enhance big data-driven applications.