Global Journal of Science Frontier Research, A: Physics and Space Science, Volume 23 Issue 1

other insights like market trends and customer sentiments that enable organizations to make informed business decisions. We divide the entire analysis into three stages, namely data management, analytics, and visualization. This is shown in Figure 2, which also mentions a few of the tools used at the different stages.

Figure 2: Stages of data analytics and visualization

Hadoop (Apache Hadoop Documentation, 2014) is an open-source framework written in Java that provides tools to store and process large datasets, ranging in size from gigabytes to petabytes, and to generate new insight through techniques such as machine learning and data mining. The core of Apache Hadoop has a storage part, known as the Hadoop Distributed File System (HDFS), and a processing part, known as MapReduce, for parallel processing.

A few of the tools are: Apache Spark (Spark Core Programming, n.a.; Kannan, 2015), a cluster computing platform for batch applications, machine learning, streaming data processing, and interactive queries; MapReduce (Seema & Jha, 2015), a programming model that runs on the YARN framework to perform distributed processing in parallel across a Hadoop cluster; Apache Hive (Hiba et al. 2019), a data warehousing tool that uses a query language known as HQL or HiveQL; Apache Impala, an open-source SQL engine; Apache Mahout (Anil et al. 2020), which offers implementations of machine learning algorithms for classification, clustering, dimensionality reduction, and linear algebraic computations; Apache Pig, an open-source Apache library that runs on top of Hadoop and provides a scripting language for analysing massive datasets by representing them as dataflows (Swarna & Zahid 2017); HBase, a non-relational (NoSQL), distributed, column-oriented database that allows data to be analysed in real time, as it is entered (History of Apache HBase, n.a.); and Tableau, software widely used in the business intelligence industry to generate charts on interactive dashboards and worksheets (Tableau, n.a.). These tools, with short descriptions, are shown in Figure 3.

© 2023 Global Journals. Data-Driven Knowledge Agriculture: A Paradigm Shift for Enhancing Farm Productivity & Global Food Security
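The map, group, and reduce steps behind the MapReduce processing described above can be illustrated with a minimal single-process sketch in plain Python. It does not use Hadoop or YARN; the record values and the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `run_job`) are hypothetical and chosen only for illustration. In a real cluster, the map and reduce functions would run in parallel on different nodes, and the framework would perform the grouping step.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical records: (crop, yield in tonnes per hectare). Illustrative values only.
RECORDS = [
    ("wheat", 3.0),
    ("rice", 4.0),
    ("wheat", 5.0),
    ("maize", 6.0),
    ("rice", 2.0),
]


def map_phase(record):
    """Emit one (key, value) pair per record: crop -> (yield, count of 1)."""
    crop, yield_t_ha = record
    return crop, (yield_t_ha, 1)


def shuffle_phase(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the (yield, count) pairs for one key and return the mean yield."""
    total, count = reduce(lambda a, b: (a[0] + b[0], a[1] + b[1]), values)
    return key, total / count


def run_job(records):
    """Run map, shuffle, and reduce sequentially in one process."""
    pairs = map(map_phase, records)
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(run_job(RECORDS))
```

Emitting `(yield, 1)` pairs rather than bare yields keeps the reduce step associative, so partial sums could be combined in any order before the final division.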
