Global Journal of Computer Science and Technology, C: Software & Data Engineering, Volume 22 Issue 2

• Sandboxes: The analytical components that consume information from the Data Lake, described in the previous points, are distributed across sandboxes, which were originally mounted on Google Cloud. In the updated version, we do not use Google services, and the JupyterLab tools are hosted on the financial institution's own servers. Each sandbox holds specific permissions on the files in the Data Lake, according to the person responsible for the sandbox. This means that different Data Scientists can access different sandboxes, which in turn can access different governed files in the Data Lake. For the campaign pilot, we created a sandbox and set up the basic processing libraries in PySpark and Scala Toree (Python and Scala, respectively), together with the access and profiling modalities that the Data Scientists need to develop their processing and/or models. To finish the component validation, we integrated all the pieces described above in the Component Diagram of Figure 5.

Figure 3: Diagram of Components with Open Source Tools

Figure 4: Jupyter and Advanced Analytics

Figure 5: Components Diagram

Integration of the Big Data Environment in a Financial Sector Entity to Optimize Products, Services and Decision-Making. © 2022 Global Journals
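The sandbox permission model described above (Data Scientists are assigned to sandboxes, and each sandbox may read only certain governed files in the Data Lake) can be illustrated with a minimal Python sketch. This is not the institution's implementation: the class names `Sandbox` and `AccessRegistry`, the user names, and the `/datalake/governed/...` paths are all hypothetical, and a real deployment would likely enforce these rules at the file-system or platform level rather than in application code.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Sandbox:
    """Hypothetical sandbox: an owner plus the governed Data Lake paths it may read."""
    name: str
    owner: str
    readable_paths: frozenset[str] = field(default_factory=frozenset)

    def can_read(self, path: str) -> bool:
        # Grant access when the path equals a granted path or lies under a granted directory.
        return any(
            path == granted or path.startswith(granted.rstrip("/") + "/")
            for granted in self.readable_paths
        )


@dataclass
class AccessRegistry:
    """Hypothetical registry mapping Data Scientists to the sandboxes they may use."""
    sandboxes: dict[str, Sandbox] = field(default_factory=dict)
    members: dict[str, set[str]] = field(default_factory=dict)  # user -> sandbox names

    def add_sandbox(self, sandbox: Sandbox, users: set[str]) -> None:
        self.sandboxes[sandbox.name] = sandbox
        for user in users:
            self.members.setdefault(user, set()).add(sandbox.name)

    def can_access(self, user: str, path: str) -> bool:
        # A user reaches a file only through a sandbox they belong to.
        return any(
            self.sandboxes[name].can_read(path)
            for name in self.members.get(user, set())
        )


# Illustrative data only (assumed names and paths, not taken from the paper).
registry = AccessRegistry()
registry.add_sandbox(
    Sandbox("campaign_pilot", "alice", frozenset({"/datalake/governed/campaigns"})),
    users={"alice", "bob"},
)
registry.add_sandbox(
    Sandbox("risk", "carol", frozenset({"/datalake/governed/risk"})),
    users={"carol"},
)

print(registry.can_access("alice", "/datalake/governed/campaigns/2022.parquet"))
print(registry.can_access("alice", "/datalake/governed/risk/scores.parquet"))
```

The design choice in this sketch is that permissions attach to the sandbox rather than to the individual user, which mirrors the text: a Data Scientist gains access to governed files only by being assigned to a sandbox that holds those permissions.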
