Author
In today’s digital age, organizations are challenged to keep up with the unprecedented pace of data generation and the plethora of enterprise systems and digital technologies that collect all types of data. This is coupled with the need to rapidly and efficiently analyze these large volumes of data to generate insights and intelligence that maximize business value. As a result, big data platforms have become an essential foundation for organizations to efficiently deploy data solutions that support timely, data-driven business decisions and provide competitive advantage.
“Data analytics and intelligence solutions are proliferating across organizations to drive business growth. Organizations should build big data platforms as a solid foundation for deploying data solutions at scale. These data platforms should be purpose-built for the business, as they are only as good as the business insights and intelligence they enable; and they should be future-proof, benefiting from continuous advances in data infrastructure services and technologies.” Oussama Ahmad, Data Consulting Partner at Artefact
Key Objectives of the Big Data Platform
Big data platforms aim to break down data silos and integrate the different types of data sources needed to implement advanced data analytics and intelligence solutions. They provide a scalable and flexible infrastructure for collecting, storing, and analyzing large volumes of data from various sources. These platforms should use best-in-class data management services and technologies and fulfill three main objectives:
Big Data Platform Infrastructure
There are several infrastructure options for a big data platform: fully on-premise, fully cloud or hybrid cloud/on-premise, each with its own advantages and challenges. Organizations should consider a number of factors when choosing the most appropriate infrastructure option for their big data platform, including data security and residency requirements, data source integrations, functionality and scalability requirements, and cost and time. A fully cloud-based architecture offers lower and more predictable costs, out-of-the-box services and integrations, and rapid scalability, but lacks control over hardware and may not comply with data privacy and residency regulations. A fully on-premise architecture provides full control over hardware and data security, typically complies with privacy and residency regulations, but incurs higher costs and requires long-term planning for scaling. A hybrid cloud/on-premise architecture offers the best of both worlds, facilitating full migration to the cloud at a later date, but may require a more complex setup.
Many organizations choose a hybrid infrastructure for their big data platforms due to organizational requirements to keep highly sensitive data (such as customer and financial data) on their own servers, or due to the lack of government-certified cloud service providers (CSPs) that meet local data privacy and residency requirements. These organizations also prefer to keep cloud-native or non-sensitive data sources in the cloud to optimize storage and compute resource costs and take advantage of out-of-the-box data analytics and machine learning services available from CSPs. Other organizations that have no organizational or regulatory requirements for data residency within the company or country opt for fully cloud-based infrastructure for faster time to implement, optimized costs, and easily scalable resources.

Figure 1: Hybrid Cloud & On-Premise Data Platform Infrastructure
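The placement rule described above for hybrid setups can be made concrete with a small sketch. It is only an illustration under stated assumptions: the `DataSource` fields, the `choose_placement` function, and the two placement labels are hypothetical names introduced here, and a real decision would weigh further factors such as cost, integrations, and scalability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    """Hypothetical description of one data source feeding the platform."""
    name: str
    sensitive: bool           # e.g. customer or financial data
    residency_required: bool  # regulation requires in-country or in-company storage


def choose_placement(source: DataSource, certified_csp_available: bool) -> str:
    """Return "on_premise" or "cloud" for a single data source.

    Assumed rule, mirroring the hybrid reasoning in the text:
    - highly sensitive data stays on the organization's own servers;
    - data under residency rules stays on-premise when no certified
      cloud service provider (CSP) meets those rules;
    - everything else goes to the cloud.
    """
    if source.sensitive:
        return "on_premise"
    if source.residency_required and not certified_csp_available:
        return "on_premise"
    return "cloud"


# Example inventory (made-up sources, for illustration only).
sources = [
    DataSource("customer_accounts", sensitive=True, residency_required=True),
    DataSource("web_clickstream", sensitive=False, residency_required=False),
    DataSource("store_footfall", sensitive=False, residency_required=True),
]

for src in sources:
    print(src.name, "->", choose_placement(src, certified_csp_available=False))
```

Keeping the rule in one function makes it easy to revisit placements later, for example when a certified CSP becomes available locally and more sources can move to the cloud.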
A big data platform typically comprises seven main layers that reflect the data lifecycle from “raw data” through “information” to “insights”. Organizations should carefully consider which services and tools are required for each of these layers to ensure a seamless data flow and efficient generation of data insights. These services and tools should perform key functions in each layer of the big data platform, as shown in Figure 2: Big Data Platform Data Layers.

Figure 2: Big Data Platform Data Layers
Evolution of the Big Data Platform
The development of a big data platform should evolve through several stages, starting with a minimum viable platform (MVP) and continuing with incremental upgrades. An organization should synchronize the evolution of its big data platform with increased requirements for broader and faster data insights and intelligence for business decisions. These increased requirements affect the complexity of the big data platform in terms of data analytics solutions, data source volumes and types, and internal and external users. The evolution of the big data platform includes the addition of more storage and compute resources, advanced features and functionality, and improvements in platform security and management.

Figure 3: Big Data Platform Evolution
“We have seen that many organizations tend to build big data platforms with advanced and unnecessary features from day one, which increases the technology cost of ownership. Big data platform deployment should start with a minimum viable platform and evolve based on business and technology requirements. In the early stages of building the platform, organizations should implement a robust data governance and management layer that ensures data quality, privacy, security, and compliance with local and regional data laws.” Anthony Cassab, Data Consulting Director at Artefact
Guidelines for a Future-Proof Big Data Platform
A big data platform should be built according to key architectural guidelines to ensure that it is future-proof, allowing for easy scalability of resources, portability across different on-premise and cloud infrastructures, upgrade and replacement of services, and expansion of data collection and sharing mechanisms.
“An adaptable and modular platform that can scale as business needs evolve is preferable to a ‘black box’ platform that is well integrated but allows limited customization. These platform architectures can be built fully or partially in the cloud to leverage the benefits of cloud computing, such as scalability and cost efficiency, while also meeting the privacy and security requirements of data protection regulations.” Faisal Najmuddin, Data Engineering Manager at Artefact
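One common way to approach the portability and replaceability described above is to have pipeline code depend on a small interface rather than on a specific storage service. The sketch below is a minimal illustration, not a prescribed design: `ObjectStore`, `LocalDiskStore`, `InMemoryStore`, and `ingest` are hypothetical names, and the in-memory class merely stands in for a cloud object store that would, in practice, wrap a provider's SDK.

```python
import tempfile
from pathlib import Path
from typing import Protocol


class ObjectStore(Protocol):
    """Minimal storage contract that pipeline code depends on (assumed)."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class LocalDiskStore:
    """On-premise stand-in: each object is a file under a root directory."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def put(self, key: str, data: bytes) -> None:
        path = self.root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get(self, key: str) -> bytes:
        return (self.root / key).read_bytes()


class InMemoryStore:
    """Placeholder for a cloud object store; keeps objects in a dict."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


def ingest(store: ObjectStore, key: str, payload: bytes) -> int:
    """Write a payload and read it back, returning the stored size in bytes.

    The function only knows the ObjectStore contract, so the backend can be
    swapped without changing pipeline code.
    """
    store.put(key, payload)
    return len(store.get(key))


# The same pipeline step runs unchanged against either backend.
with tempfile.TemporaryDirectory() as tmp:
    for backend in (LocalDiskStore(Path(tmp)), InMemoryStore()):
        size = ingest(backend, "raw/events.json", b'{"id": 1}')
        print(type(backend).__name__, size)
```

Because `ingest` is written against the contract only, moving a workload between on-premise and cloud storage, or replacing one service with another, becomes a change of backend class rather than a rewrite of the pipeline.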
In summary, a big data platform brings multiple benefits to organizations, such as centralizing data sources, enabling advanced data analytics solutions, and providing enterprise-wide access to data analytics solutions and sources. However, implementing a big data platform entails a number of strategic decisions, such as choosing the right infrastructure(s), adopting a future-proof architecture, selecting standard and “migratable” services, carefully considering data protection regulations, and finally, defining an optimal evolution plan that’s closely linked to business requirements and maximizes return on data investment.
