Architecting a Modern Data Warehouse for Large Enterprises
Build Multi-cloud Modern Distributed Data Warehouses with Azure and AWS
Anjani Kumar
Gurgaon, India
Abhishek Mishra
Thane West, Maharashtra, India
Sanjeev Kumar
Gurgaon, Haryana, India
ISBN 979-8-8688-0028-3 e-ISBN 979-8-8688-0029-0 https://doi.org/10.1007/979-8-8688-0029-0
© Anjani Kumar, Abhishek Mishra, and Sanjeev Kumar 2024
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This Apress imprint is published by the registered company APress Media, LLC, part of Springer Nature. The registered company address is: 1 New York Plaza, New York, NY 10004, U.S.A.
I dedicate this book to my mother, Prabhawati; my aunt, Sunita; and my wife, Suchi.
— Anjani Kumar
I dedicate this book to my lovely daughter, Aaria.
Abhishek Mishra
I dedicate this book to my late father, Shri Mahinder Nath.
— Sanjeev Kumar
Any source code or other supplementary material referenced by the author in this book is available to readers on GitHub (https://github.com/Apress). For more detailed information, please visit https://www.apress.com/gp/services/source-code.
Acknowledgments
We would like to thank Apress for giving us the opportunity to work on this book. Also, thanks to the technical reviewer and the editor and the entire Apress team for supporting us on this journey.
Table of Contents
Chapter 1: Introduction
Objective
Origin of Data Processing and Storage in the Computer Era
Evolution of Databases and Codd Rules
Transitioning to the World of Data Warehouses
Data Warehouse Concepts
Data Sources (Data Format and Common Sources)
ETL (Extract, Transform, Load)
Data Mart
Data Modeling
Cubes and Reporting
OLAP
Metadata
Data Storage Techniques and Options
Evolution of Big Data Technologies and Data Lakes
Transition to the Modern Data Warehouse
Traditional Big Data Technologies
The Emergence of Data Lakes
Data Lake House and Data Mesh
Transformation and Optimization between New vs. Old (Evolution to Data Lake House)
A Wider Evolving Concept Called Data Mesh
Building an Effective Data Engineering Team
An Enterprise Scenario for Data Warehousing
Summary
Chapter 2: Modern Data Warehouses
Objectives
Introduction to Characteristics of Modern Data Warehouse
Data Velocity
Data Variety Volume
Data Value
Fault Tolerance
Scalability
Interoperability
Reliability
Modern Data Warehouse Features: Distributed Processing, Storage, Streaming, and Processing Data in the Cloud
Distributed Processing
Storage
Streaming and Processing
Autonomous Administration Capabilities
Multi-tenancy and Security
Performance
What Are NoSQL Databases?
Key–Value Pair Stores
Document Databases
Columnar DBs
Graph Databases
Case Study: Enterprise Scenario for Modern Cloud-based Data Warehouse
Advantages of Modern Data Warehouse over Traditional Data Warehouse
Summary
Chapter 3: Data Lake, Lake House, and Delta Lake
Structure
Objectives
Data Lake, Lake House, and Delta Lake Concepts
Data Lake, Storage, and Data Processing Engines Synergies and Dependencies
Implement Lake House in Azure
Create a Data Lake on Azure and Ingest the Health Data CSV File
Create an Azure Synapse Pipeline to Convert the CSV File to a Parquet File
Attach the Parquet File to the Lake Database
Implement Lake House in AWS
Create an S3 Bucket to Keep the Raw Data
Create an AWS Glue Job to Convert the Raw Data into a Delta Table
Query the Delta Table using the AWS Glue Job
Summary
Chapter 4: Data Mesh
Structure
Objectives
The Modern Data Problem and Data Mesh
Data Mesh Principles
Domain-driven Ownership
Data-as-a-Product
Self-Serve Data Platform
Federated Computational Governance
Design a Data Mesh on Azure
Create Data Products for the Domains
Create Self-Serve Data Platform
Federated Governance
Summary
Chapter 5: Data Orchestration Techniques
Structure
Objective
Data Orchestration Concepts
Modern Data Orchestration in Detail
Evolution of Data Orchestration
Data Integration
Middleware and ETL Tools
Enterprise Application Integration (EAI)
Service-Oriented Architecture (SOA)
Data Warehousing
Real-Time and Streaming Data Integration
Cloud-Based Data Integration
Data Integration for Big Data and NoSQL
Self-Service Data Integration
Data Pipelines
Data Processing using Data Pipelines
Benefits and Advantages of Data Pipelines
Common Use Cases for Data Pipelines
Data Governance Empowered by Data Orchestration: Enhancing Control and Compliance
Achieving Data Governance through Data Orchestration
Tools and Examples
Azure Data Factory
Azure Synapse
Summary
Chapter 6: Data Democratization, Governance, and Security
Objectives
Introduction to Data Democratization
Factors Driving Data Democratization
Layers of Democratization Architecture
Self-Service
Data Catalog and Data Sharing
People
Tools and Technology: Self-Service Tools
Data Governance Tools
Introduction to Data Governance
Ten Key Factors that Ensure Successful Data Governance
Data Stewardship
Models of Data Stewardship
Data Security Management
Security Layers
Data Security Approach
Types of Controls
Data Security in Outsourcing Mode
Popular Information Security Frameworks
Major Privacy and Security Regulations
Major Modern Security Management Concepts
Practical Use Case for Data Governance and Data Democratization
Problem Statement
High-Level Proposed Solution
Summary
Chapter 7: Business Intelligence
Structure
Objectives
Introduction to Business Intelligence
Descriptive Reports
Predictive Reports
Prescriptive Reports
Business Intelligence Tools
Query and Reporting Tools
Online Analytical Processing (OLAP) Tools
Analytical Applications
Trends in Business Intelligence (BI)
Business Decision Intelligence Analysis
Self-Service
Advanced BI Analytics
BI and Data Science Together
Data Strategy
Data and Analytics Approach and Strategy
Summary
Index
About the Authors
Anjani Kumar is the managing director and founder of MultiCloud4u, a rapidly growing startup that helps clients and partners seamlessly implement data-driven solutions for their digital businesses. With a background in computer science, Anjani began his career researching and developing multi-lingual systems that were powered by distributed processing and data synchronization across remote regions of India. He later collaborated with companies such as Mahindra Satyam, Microsoft, RBS, and Sapient to create data warehouses and other data-based systems that could handle high-volume data processing and transformation.
Dr. Abhishek Mishra is a cloud architect at a leading organization and has more than a decade and a half of experience building and architecting software solutions for large and complex enterprises across the globe. He has deep expertise in enabling digital transformations for his customers using the cloud and artificial intelligence.
Sanjeev Kumar heads up a global data and analytics practice at the leading and oldest multinational shoe company with headquarters in Switzerland. He has 19+ years of experience working for organizations in multiple industries modeling modern data solutions. He has consulted with some of the top multinational firms and enabled digital transformations for large enterprises using modern data warehouses in the cloud. He is an expert in multiple fields of modern data management and execution, including data strategy, automation, data governance, architecture, metadata, modeling, business intelligence, data management, and analytics.
About the Technical Reviewer
Viachaslau Matsukevich is an industry expert with over a decade of experience in various roles, including DevOps, cloud, solutions architecture, tech leadership, and infrastructure engineering.
As a cloud solutions architect, Viachaslau has delivered 20+ DevOps projects for a number of Fortune 500 and Global 2000 enterprises. He holds certifications from Microsoft, Google, and the Linux Foundation, including Solutions Architect Expert, Professional Cloud Architect, and Kubernetes Administrator.
Viachaslau authors technology articles about cloud-native technologies and Kubernetes, for platforms such as Red Hat Enable Architect, SD Times, Hackernoon, and Dzone.
In addition to his technical expertise, Viachaslau serves as a technical reviewer for technology books, ensuring the quality and accuracy of the latest publications.
He has also made significant contributions as an industry expert and judge for esteemed awards programs, including SIIA CODiE Awards and Globee Awards (including IT World Awards, Golden Bridge Awards, Disruptor Company Awards, and American Best in Business Awards).
Viachaslau has also lent his expertise as a judge in over 20 hackathons.
Viachaslau is also the author of online courses covering a wide array of topics related to cloud, DevOps and Kubernetes tools.
Follow Viachaslau on LinkedIn: https://www.linkedin.com/in/viachaslaumatsukevich/
A. Kumar et al., Architecting a Modern Data Warehouse for Large Enterprises
https://doi.org/10.1007/979-8-8688-0029-0_1
1. Introduction
Anjani Kumar (1), Abhishek Mishra (2), and Sanjeev Kumar (3)
(1) Gurgaon, India
(2) Thane West, Maharashtra, India
(3) Gurgaon, Haryana, India
In the early days of computing, businesses struggled to keep up with the flood of data. They had few options for storing and analyzing data, hindering their ability to make informed decisions. As technology improved, businesses recognized the value of data and needed a way to make sense of it. This led to the birth of data warehousing, coined by Bill Inmon in the 1980s. Inmon’s approach was focused on structured, relational data for reporting and analysis. Early data warehouses were basic but set the stage for more advanced solutions as businesses gained access to more data. Today, new technologies like Big Data and data lakes have emerged to help deal with the increasing volume and complexity of data. The data lakehouse combines the best of data lakes and warehouses for real-time processing of both structured and unstructured data, allowing for advanced analytics and machine learning. While the different chapters of this book cover all aspects of modern data warehousing, this chapter specifically focuses on the transformation of data warehousing techniques from past to present to future, and how it impacts building a modern data warehouse.
In this chapter we will explore the following:
History and Evolution of Data Warehouse
Basic Concepts and Features of Data Warehouse
Advantages and Examples of Cloud-based Data Warehouse
Enterprise Scenario for Data Warehouse
Objective
This chapter provides an overview of data warehouses and familiarizes the readers with the terminologies and concepts of data warehouses. The chapter further focuses on the transformation of data warehousing techniques from past to present to future, and how it impacts building a modern data warehouse.
After studying this chapter, you should be able to do the following:
Understand the basics of data warehousing, from the tools, processes, and techniques used in modern-day data warehousing to the different roles and responsibilities of a data warehouse team.
Set up a synergy between engineering and operational communities, even when they’re at different stages of learning and implementation maturity.
Determine what to adopt and what to ignore, ensuring your team stays up to date with the latest trends in data warehousing.
Whether you’re starting a data warehouse team or just looking to expand your knowledge, this guide is the perfect place to start. It will provide you with a background on the topics covered in detail in further chapters, allowing you to better understand the nuances of data warehousing and become an expert in the field.
Origin of Data Processing and Storage in the Computer Era
The history of data processing and storage dates back to the early 20th century when mechanical calculators were used for basic arithmetic operations. However, it wasn’t until the mid-20th century that electronic computers were developed, which revolutionized data processing and storage.
The first general-purpose electronic computer, the ENIAC (Electronic Numerical Integrator and Computer), was built in 1946 by J. Presper Eckert and John Mauchly. It was a massive machine that filled an entire room and used vacuum tubes to perform calculations. ENIAC was primarily used for military purposes, such as computing artillery firing tables.
In the 1950s and ’60s, the development of smaller and faster transistors led to the creation of smaller and more efficient computers. The introduction of magnetic tape and magnetic disks in the late 1950s allowed for the storage of large amounts of data, which could be accessed much more quickly than with punched cards or paper tape.
In the 1970s, the development of integrated circuits (ICs) made it possible to create even smaller and more powerful computers. This led to the development of personal computers in the 1980s, which were affordable and accessible to a wide range of users.
Today, data processing and storage are essential to nearly every aspect of modern life, from scientific research to business and commerce to entertainment. The rapid growth of modern storage solutions powered by SSDs and flash memory, together with the internet and cloud computing, has made it possible to store and access vast amounts of data from almost anywhere in the world.
In conclusion, the origin of data processing and storage can be traced back to the early 20th century, but it wasn’t until the development of electronic computers in the mid-20th century that these processes became truly revolutionary. From massive room-sized machines to powerful personal computers, data processing and storage have come a long way and are now essential to almost every aspect of modern life.
Evolution of Databases and Codd Rules
The evolution of databases began with IBM’s development of the first commercially successful database management system (DBMS) in the 1960s. The relational model of databases, introduced by E.F. Codd in the 1970s, organized data into tables consisting of rows and columns, leading to the development of Structured Query Language (SQL). The rise of the internet and e-commerce in the 1990s led to the development of NoSQL databases for handling vast amounts of unstructured data. The Chord protocol, proposed in 2001, is a distributed hash table (DHT) protocol that uses consistent hashing to locate the node responsible for a given key, which helps distributed data stores spread data reliably across multiple nodes.
Codd’s 12 principles for relational databases established a framework for designing and implementing a robust, flexible, and scalable data management system. These principles are relevant in data warehousing today because they provide a standard for evaluating data warehousing systems and ensuring that they can handle large volumes of data, support complex queries, maintain data integrity, and evolve over time to meet changing business needs.
Codd’s 12 rules for relational databases are as follows:
1. Information Rule: All data in the database should be represented as values in tables. This means that the database should be structured as a collection of tables, with each table representing a single entity or relationship.
2. Guaranteed Access Rule: Each value in the database should be accessible by specifying its table name, primary key value, and column name. This ensures that every piece of data in the database is uniquely identifiable and can be accessed efficiently.
3. Systematic Treatment of Null Values: The database should support the use of null values to represent missing or unknown data. These null values should be treated consistently throughout the system, with appropriate support for operations such as null comparisons and null concatenations.
4. Dynamic Online Catalog Based on the Relational Model: The database should provide a dynamic online catalog that describes the structure of the database in terms of tables, columns, indexes, and other relevant information. This catalog should be accessible to users and applications and should be based on the relational model.
5. Comprehensive Data Sublanguage Rule: The database should support a comprehensive data sublanguage that allows users to define, manipulate, and retrieve data in a variety of ways. This sublanguage should be able to express complex queries, data definitions, and data modifications.
6. View Updating Rule: The database should support the updating of views, which are virtual tables that are defined in terms of other tables. This allows users to modify data in a flexible and intuitive way, without having to worry about the underlying structure of the database.
7. High-Level Insert, Update, and Delete Rule: The database should support high-level insert, update, and delete operations that allow users to modify multiple rows or tables at once. This simplifies data management and improves performance by reducing the number of database interactions required.
8. Physical Data Independence: The database should be able to store and retrieve data without being affected by changes to the physical storage or indexing structure of the database. This allows the database to evolve over time without requiring significant changes to the application layer.
9. Logical Data Independence: The database should be able to store and retrieve data without being affected by changes to the logical structure of the database. This means that the database schema can be modified without requiring changes to the application layer.
10. Integrity Independence: The database should be able to enforce integrity constraints such as primary keys, foreign keys, and other business rules without being affected by changes to the application layer. This ensures that data is consistent and accurate at all times.
11. Distribution Independence: The database should be able to distribute data across multiple locations without being affected by changes to the application layer. This allows the database to scale horizontally and geographically without requiring changes to the application layer.
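12. Non-Subversion Rule: The database should not be susceptible to subversion by users or applications. If the system offers a low-level (record-at-a-time) interface, that interface must not be able to bypass the integrity rules and constraints expressed in the relational language. In practice, this goes together with access controls, encryption, and other security measures that protect against unauthorized access or modification of data.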
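Several of these rules can be observed directly in any SQL database. The following is a minimal sketch using Python's built-in sqlite3 module; the customer table, its columns, and the sample rows are hypothetical and exist only for illustration. It touches on Rule 2 (guaranteed access), Rule 3 (null handling), Rule 4 (online catalog), and Rule 7 (set-level updates).

```python
import sqlite3

# In-memory database; the table and its rows are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    " customer_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " email TEXT)"  # email may be NULL when it is unknown
)
conn.executemany(
    "INSERT INTO customer (customer_id, name, email) VALUES (?, ?, ?)",
    [(1, "Asha", "asha@example.com"), (2, "Ravi", None)],
)

# Rule 2: a value is reached by table name + primary key value + column name.
email = conn.execute(
    "SELECT email FROM customer WHERE customer_id = ?", (1,)
).fetchone()[0]

# Rule 3: NULL is handled systematically; "= NULL" never matches, IS NULL does.
missing = conn.execute(
    "SELECT COUNT(*) FROM customer WHERE email IS NULL"
).fetchone()[0]
equals_null = conn.execute(
    "SELECT COUNT(*) FROM customer WHERE email = NULL"
).fetchone()[0]

# Rule 4: the catalog (sqlite_master in SQLite) is queried with ordinary SQL.
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# Rule 7: one statement updates every matching row (set-at-a-time).
updated = conn.execute(
    "UPDATE customer SET email = 'unknown@example.com' WHERE email IS NULL"
).rowcount

print(email, missing, equals_null, tables, updated)
```

SQLite is used here only because it ships with Python; the same statements apply, with minor dialect differences, to the relational engines discussed in this book.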
Traditional tabular systems based on Codd rules were relevant, but with the rise of the internet and e-commerce, there was a huge increase in the volume and variety of data being generated. To handle this data, new NoSQL databases were developed, which are more flexible and scalable, especially for unstructured data. In building a universally accepted data warehouse, it’s important to consider the strengths and weaknesses of both traditional and NoSQL databases and follow best practices, such as data quality, data modeling, data governance, and security measures. In the upcoming section of this chapter, we will explore this transition in a step-by-step manner while giving special attention to the areas that remain relevant for creating a strong and widely accepted modern data warehouse.
Transitioning to the World of Data Warehouses
In the 1970s, the dominant form of database used in business was the hierarchical database, which organized data in a tree-like structure, with parent and child nodes. However, as businesses began to collect more data and as the need for complex querying and reporting increased, it became clear that the hierarchical database was not sufficient.
This led to the development of the network database, which allowed for more complex relationships between data, but it was still limited in its ability to handle large volumes of data and complex querying. As a result, the relational database model was developed, which organized data into tables consisting of rows and columns, allowing for more efficient storage and easier retrieval of information. However, the relational model was not without its limitations. As businesses continued to collect more data, the need for a centralized repository to store and manage data became increasingly important. This led to the development of the data warehouse, which is a large, centralized repository of data that is optimized for reporting and analysis.
The data warehouse is designed to handle large volumes of data from multiple sources and to provide a single source of truth for reporting and analytics. Data warehouses use specialized technologies, such as extract, transform, load (ETL) processes, to extract data from multiple sources, transform it into a common format, and load it into the data warehouse.
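The extract, transform, load flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production pipeline: the two sources (a CSV export and a list of web-shop records), their field names, and the fact_orders table are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source 1: a CSV export from a sales system (ISO dates).
SALES_CSV = (
    "order_id,amount,order_date\n"
    "1001,250.50,2024-01-15\n"
    "1002,99.00,2024-01-16\n"
)
# Hypothetical source 2: web-shop records with other field names and DD/MM/YYYY dates.
WEB_ORDERS = [{"id": "W-7", "total": "40", "date": "17/01/2024"}]


def extract():
    """Read raw rows from each source without changing them."""
    csv_rows = list(csv.DictReader(io.StringIO(SALES_CSV)))
    return csv_rows, WEB_ORDERS


def transform(csv_rows, web_rows):
    """Map both sources onto one common format: (order_id, amount, ISO date, source)."""
    unified = []
    for row in csv_rows:
        unified.append(
            (row["order_id"], float(row["amount"]), row["order_date"], "sales_csv")
        )
    for row in web_rows:
        day, month, year = row["date"].split("/")
        unified.append(
            (row["id"], float(row["total"]), f"{year}-{month}-{day}", "web_shop")
        )
    return unified


def load(conn, rows):
    """Write the unified rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders ("
        " order_id TEXT PRIMARY KEY, amount REAL, order_date TEXT, source TEXT)"
    )
    conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?, ?)", rows)
    conn.commit()


warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(*extract()))
order_count = warehouse.execute("SELECT COUNT(*) FROM fact_orders").fetchone()[0]
total_amount = warehouse.execute("SELECT SUM(amount) FROM fact_orders").fetchone()[0]
web_date = warehouse.execute(
    "SELECT order_date FROM fact_orders WHERE source = 'web_shop'"
).fetchone()[0]
print(order_count, total_amount, web_date)
```

Keeping extract, transform, and load as separate functions mirrors how ETL tools separate these stages, so each stage can be tested or replaced on its own.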
Data warehouses also use specialized tools for querying and reporting, such as online analytical processing (OLAP), which allows users to analyze data across multiple dimensions, and data mining, which uses statistical and machine learning techniques to identify patterns and relationships in the data.
The world began transitioning from databases to data warehousing in the 1980s as businesses realized the limitations of the hierarchical and network database models when handling large volumes of data and complex querying. The development of the data warehouse provided a centralized repository for storing and managing data, as well as specialized tools for reporting and analysis. Today, data warehouses are a critical component of modern businesses, enabling them to make data-driven decisions and stay competitive in a rapidly changing market.
During this pivotal transition in the world of data management, numerous scientists and experts made significant contributions to the field. Notable among them are Bill Inmon, revered as the originator of the data warehouse concept, which focuses on a single source of truth for reporting and analysis; Ralph Kimball, a renowned data warehousing expert who introduced dimensional modeling, which emphasizes optimized data modeling for reporting, star schemas, and fact tables; and Dan Linstedt, who invented the data vault modeling approach, which combines elements of Inmon and Kimball’s methodologies and is tailored for handling substantial data volumes and historical reporting. In addition, Claudia Imhoff, a business intelligence and data warehousing expert, founded the Boulder BI Brain Trust, offering thought leadership; Barry Devlin pioneered the business data warehouse concept, which highlights business metadata’s importance and aligns data warehousing with business objectives; and, lastly, Jim Gray, a computer scientist and database researcher, who contributed significantly by introducing the data cube, a multidimensional database structure for enhanced analysis and reporting. In conclusion, these luminaries represent just a fraction of the visionary minds that shaped modern data warehousing, empowering businesses to harness data for informed decision-making in a dynamic market landscape.
Data Warehouse Concepts
In today’s world, businesses collect more data than ever before. This data can come from a variety of sources, such as customer transactions, social media, and Internet of Things (IoT) devices. However, collecting data is only the first step; to truly unlock the value of this data, businesses must be able to analyze and report on it. This is where the data warehouse comes in. The following are aspects of the data warehouse:
A data warehouse is a large, centralized repository of data that is optimized for reporting and analysis. The data warehouse is designed to handle large volumes of data from multiple sources, and to provide a single source of truth for reporting and analytics. It is a critical component of modern business intelligence, enabling businesses to make data-driven decisions and stay competitive in a rapidly changing market.
Data warehouses use specialized technologies, such as extract, transform, load (ETL) processes, to extract data from multiple sources, transform it into a common format, and load it into the data warehouse. This allows businesses to bring together data from disparate sources and create a single, unified view of the data.
Data warehouses also use specialized tools for querying and reporting, such as online analytical processing (OLAP), which allows users to analyze data across multiple dimensions, and data mining, which uses statistical and machine learning techniques to identify patterns and relationships in the data.
One of the key features of the data warehouse is its ability to handle historical data. Traditional transactional databases are optimized for handling current data, but they are not well suited to handling large volumes of historical data. Data warehouses, however, are optimized for handling large volumes of historical data, which is critical for trend analysis and forecasting.
In addition, data warehouses are designed to be easy to use for business users. They use specialized reporting tools that allow users to create custom reports and dashboards, and to drill down into the data to gain deeper insights. This makes it easy for business users to access and analyze the data they need to make informed decisions. There are several common concepts in data warehouses that are essential to understanding their architecture. Here are some of the most important concepts:
Data Sources: A data warehouse collects data from a variety of sources, such as transactional databases, external data sources, and flat files. Data is extracted from these sources and transformed into a standardized format before being loaded into the data warehouse.
ETL (Extract, Transform, Load): This is the process used to collect data from various sources and prepare it for analysis in the data warehouse. During this process, data is extracted from the source systems, transformed into a common format, and loaded into the data warehouse.
Data Marts: A data mart is a subset of a data warehouse that is designed to meet the needs of a particular department or group within an organization. Data marts are typically organized around specific business processes or functions, such as sales or marketing.
Data Modeling: In the field of data warehousing, there are two main approaches to modeling data: tabular modeling and dimensional modeling. Tabular modeling is a relational approach to data modeling, which means it organizes data into tables with rows and columns. Dimensional modeling involves organizing data around dimensions (such as time, product, or location) and measures (such as sales revenue or customer count) and using a star or snowflake schema to represent the data.
OLAP (Online Analytical Processing): OLAP is a set of tools and techniques used to analyze data in a data warehouse. OLAP tools allow users to slice and dice data along different dimensions and to drill down into the data to gain deeper insights.
Data Mining: Data mining is the process of analyzing large datasets to identify patterns, trends, and relationships in the data. This technique uses statistical and machine learning algorithms to discover insights and make predictions based on the data.
Metadata: Metadata is data about the data in a data warehouse. It provides information about the source, structure, and meaning of the data in the warehouse, and is essential for ensuring that the data is accurate and meaningful.
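To make dimensional modeling and OLAP-style analysis more concrete, the following minimal sketch builds a tiny star schema and runs a roll-up and a slice against it. It uses Python's built-in sqlite3 module in place of a real warehouse or OLAP engine, and the table names (dim_product, dim_date, fact_sales) and sample figures are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A small star schema: one fact table surrounded by two dimension tables.
conn.executescript(
    """
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        date_id INTEGER REFERENCES dim_date(date_id),
        revenue REAL  -- the measure
    );
    INSERT INTO dim_product VALUES (1, 'Shoes'), (2, 'Bags');
    INSERT INTO dim_date VALUES (10, 2023, 'Q4'), (11, 2024, 'Q1');
    INSERT INTO fact_sales VALUES (1, 10, 100.0), (1, 11, 150.0), (2, 11, 50.0);
    """
)

# Roll-up: aggregate the revenue measure along the product dimension.
by_category = conn.execute(
    """
    SELECT p.category, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
    """
).fetchall()

# Slice: fix one member of the date dimension (year 2024), then break down by category.
slice_2024 = conn.execute(
    """
    SELECT p.category, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    JOIN dim_date d ON d.date_id = f.date_id
    WHERE d.year = 2024
    GROUP BY p.category
    ORDER BY p.category
    """
).fetchall()

print(by_category)
print(slice_2024)
```

Dedicated OLAP tools precompute and cache such aggregations, but the underlying idea is the same: measures in a fact table are grouped and filtered by attributes of the dimensions.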
Data Sources (Data Format and Common Sources)
In a data warehouse, data source refers to any system or application that provides data to the data warehouse. A data source can be any type of system or application that generates data, such as a transactional system, a customer relationship management (CRM) application, or an enterprise resource planning (ERP) system.
The data from these sources is extracted and transformed before it is loaded into the data warehouse. This process involves cleaning, standardizing, and consolidating the data to ensure that it is accurate, consistent, and reliable. Once the data has been transformed, it is then loaded into the data warehouse for storage and analysis.
In some cases, data sources may be connected to the data warehouse using extract, transform, and load (ETL) processes, while in other cases, they may be connected using other data integration methods, such as data replication, data federation, or data virtualization.
Note Data sources are a critical component of a data warehouse, as they provide the data that is needed to support business intelligence and analytics. By consolidating data from multiple sources into a single location, a data warehouse enables organizations to gain insights into their business operations and make more-informed decisions.
There are various types and formats of data sources that can be used in a data warehouse. Here are some examples:
Relational databases: A common data source for a data warehouse is a relational database, such as Oracle, Microsoft SQL Server, or MySQL. These databases store data in tables with defined schemas and can be queried using SQL.
Flat files: Data can also be sourced from flat files, such as CSV files or other delimited text files, as well as from file formats such as Parquet or Excel. Delimited files typically organize data into rows and columns separated by a delimiter.
Cloud storage services: Cloud storage services, such as Amazon S3 or Azure Data Lake Storage, can also be used as a data source for a data warehouse. These services can store data in a structured or unstructured format and can be accessed through APIs.
NoSQL databases: NoSQL databases, such as MongoDB or Cassandra, can be used as data sources for data warehouses. These databases are designed to handle large volumes of unstructured data and can be queried using NoSQL query languages.
Real-time data sources: Real-time data sources, such as message queues or event streams, can be used to stream data into a data warehouse in real-time. This type of data source is often used for applications that require up-to-date data.
APIs: APIs can also be used as a data source, providing access to data from third-party applications or web services.
The format of the data coming from multiple sources can also vary depending on the type of data. For example, data can be structured, unstructured, or semi-structured (such as JSON or XML). The data format needs to be considered when designing the data warehouse schema and the ETL processes. It is important to ensure that the data is properly transformed and loaded into the data warehouse in a format that is usable for analysis.
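As an illustration of why the source format matters for ETL design, the sketch below reads the same kind of customer record from a structured CSV file and from semi-structured JSON and XML, and maps all three onto one common record layout. The sample payloads and the field names (customer_id, city) are hypothetical; only Python standard-library parsers are used.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical payloads describing customers in three different formats.
CSV_TEXT = "customer_id,city\n1,Gurgaon\n"
JSON_TEXT = '{"customer": {"id": 2, "address": {"city": "Thane"}}}'
XML_TEXT = "<customers><customer id='3'><city>Pune</city></customer></customers>"


def from_csv(text):
    """Structured input: columns already match the target layout."""
    return [
        {"customer_id": int(row["customer_id"]), "city": row["city"]}
        for row in csv.DictReader(io.StringIO(text))
    ]


def from_json(text):
    """Semi-structured input: the city is nested and must be flattened."""
    doc = json.loads(text)["customer"]
    return [{"customer_id": int(doc["id"]), "city": doc["address"]["city"]}]


def from_xml(text):
    """Semi-structured input: the id is an attribute, the city a child element."""
    root = ET.fromstring(text)
    return [
        {"customer_id": int(node.get("id")), "city": node.findtext("city")}
        for node in root.findall("customer")
    ]


records = from_csv(CSV_TEXT) + from_json(JSON_TEXT) + from_xml(XML_TEXT)
print(records)
```

Each source needs its own parsing step, but once the records share one layout, the same load logic and warehouse schema can serve all of them.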
Data can flow to the data warehouse through different systems, the most commonly used of which include the following:
Transactional databases: Transactional databases are typically the primary source of data for a data warehouse. These databases capture and store business data generated by various systems, such as sales, finance, and operations.
ERP systems: Enterprise resource planning (ERP) systems are used by many organizations to manage their business processes. ERP systems can provide a wealth of data that can be used in a data warehouse, including information on customer orders, inventory, and financial transactions.
CRM systems: Customer relationship management (CRM) systems provide data on customer interactions that can be used to support business analytics and decision-making.
Legacy systems: Legacy systems are often used to store important historical data that needs to be incorporated into the data warehouse. This data may be stored in a variety of formats, including flat files or proprietary databases.
Cloud-based systems: Cloud-based systems, such as software-as-a-service (SaaS) applications, are becoming increasingly popular as data sources for data warehouses. These systems can provide access to a variety of data, including customer behavior, website traffic, and sales data.
Social media: Social media platforms are another source of data that can be used in a data warehouse. This data can be used to gain insights into customer behavior, sentiment analysis, and brand reputation.
One effective approach for documenting data-related artifacts, such as data sources and data flows, is using data dictionaries and data catalogs. These tools can capture relevant information about data elements, including their structure and meaning, as well as provide more comprehensive details about data sources, flows, lineage, and ownership. By leveraging these tools, implementation teams and data operations teams can gain a better understanding of this information, leading to improved data quality, consistency, and collaboration across various teams and departments within an organization.
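To make the idea concrete, here is a minimal sketch of what a catalog entry could record and how lineage might be traced from it. The class names, fields, and dataset names are hypothetical and chosen only for illustration; real catalog products define their own models.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ColumnDefinition:
    """Data-dictionary level detail: structure and meaning of one element."""
    name: str
    data_type: str
    description: str


@dataclass
class CatalogEntry:
    """Catalog level detail: source, ownership, and lineage of a dataset."""
    dataset: str
    source_system: str
    owner: str
    columns: List[ColumnDefinition] = field(default_factory=list)
    upstream: List[str] = field(default_factory=list)  # direct parent datasets


def trace_lineage(catalog: Dict[str, CatalogEntry], dataset: str) -> List[str]:
    """Return every upstream dataset, nearest first, without duplicates."""
    ordered: List[str] = []

    def visit(name: str) -> None:
        entry = catalog.get(name)
        if entry is None:
            return
        for parent in entry.upstream:
            if parent not in ordered:
                ordered.append(parent)
                visit(parent)

    visit(dataset)
    return ordered


# Hypothetical three-step flow: CRM source -> staging -> warehouse dimension.
catalog = {
    "crm.contacts": CatalogEntry("crm.contacts", "CRM", "sales-ops"),
    "staging.customers": CatalogEntry(
        "staging.customers", "ETL", "data-eng", upstream=["crm.contacts"]
    ),
    "dw.dim_customer": CatalogEntry(
        "dw.dim_customer",
        "Warehouse",
        "analytics",
        columns=[ColumnDefinition("customer_id", "INT", "Surrogate key")],
        upstream=["staging.customers"],
    ),
}
print(trace_lineage(catalog, "dw.dim_customer"))
```

Even a simple structure like this lets a team answer which source systems feed a given warehouse table and who owns each step.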
Note When categorizing data into structured or unstructured sources, you’ll find that older systems like transactional, ERP, CRM, and legacy tend to have well-organized and well-classified data compared to that sourced from cloud-based systems or social media. It’s not entirely accurate to say that all data from cloud platforms and website analytics activities are unstructured, but analyzing such
adept lodged with his friend in the Abbey of Westminster, where he worked, and perfected the stone which Cremer had so long unsuccessfully sought. He was duly presented to the King, who, previously informed of the talents of the illustrious stranger, received him with regard and attention.
When he “communicated his treasures,” the single condition which he made was that they should not be expended in the luxuries of a court or in war with a Christian prince, but that the King should go in person with an army against the infidels.
Edward, under pretence of doing honour to Raymond, gave him an apartment in the Tower of London, where the adept repeated his process. He transmuted base metal into gold, which was coined at the mint into six millions of nobles, each worth three pounds sterling at the present day. These coins are well known to antiquarians by the name of Rose Nobles. They prove in the assay of the test to be a purer gold than the Jacobus, or any other gold coin made in those times. Lully in his last testament declares that in a short time, while in London, he converted twenty-two tons weight of quicksilver, lead, and tin into the precious metal.
His lodging in the Tower proved only an honourable prison, and when Raymond had satisfied the desires of the King, the latter disregarded the object which the adept was so eager to see executed, and to regain his own liberty Lully was obliged to escape surreptitiously, when he quickly departed from England.
Cremer, whose intentions were sincere, was not less grieved than Raymond at this issue of the event, but he was subject to his sovereign, and could only groan in silence. He declares his extreme affliction in his testament, and his monastery daily offered up prayers to God for the success of Raymond’s cause. The Abbot lived long after this, and saw part of the reign of King Edward III. The course of operations which he proposes in his testament, with apparent sincerity, is not less veiled than are those in the most obscure authors.[N]
Now, in the first place, this story is not in harmony with itself. If Raymond Lully were at Vienna in 1311, how did John Cremer contrive to meet him in Italy at or about the same time? In the second place, the whole story concerning the manufacture of Rose Nobles is a series of blunders. The King who ascended the throne of England in 1307 was Edward II., and the Rose Nobles first appear in the history of numismatics during the reign of Edward IV., and in the year 1465.
“In the King’s fifth year, by another indenture with Lord Hastings, the gold coins were again altered, and it was ordered that forty-five nobles only, instead of fifty, as in the last two reigns, should be made of a pound of gold. This brought back the weight of the noble to one hundred and twenty grains, as it had been from 1351 to 1412, but its value was raised to 10s. At the same time, new coins impressed with angels were ordered to be made, sixty-seven and a half to be struck from one pound of gold, and each to be of the value of 6s. 8d.—that is to say, the new angel which weighed eighty grains was to be of the same value as the noble had been which weighed one hundred and eight grains. The new nobles to distinguish them from the old ones were called Rose Nobles, from the rose which is stamped on both sides of them, or ryals, or royals, a name borrowed from the French, who had given it to a coin which bore the figure of the King in his royal robes, which the English ryals did not. Notwithstanding its inappropriateness, however, the name of royal was given to these 10s. pieces, not only by the people, but also in several statutes of the realm.”[O]
In the third place, the testament ascribed to John Cremer, Abbot of Westminster, and to which we are indebted for the chief account of Lully’s visit to England, is altogether spurious. No person bearing that name ever filled the position of Abbot at any period of the history of the Abbey.
The only coinage of nobles which has been attributed to alchemy was that made by Edward III. in 1344. The gold used in this coinage is supposed to have been manufactured in the Tower; the adept in question was not Raymond Lully, but the English Ripley.
Whether the saint of Majorca was proficient in the Hermetic art or not, it is quite certain that he did not visit the British Isles. It is also certain that in the Ars Magna Sciendi, part 9, chapter on Elements, he states that one species of metal cannot be changed into another, and that the gold of alchemy has only the semblance of that metal; that is, it is simply a sophistication.
As all the treatises ascribed to Raymond Lully cannot possibly be his, and as his errant and turbulent life could have afforded him few opportunities for the long course of experiments which are generally involved in the search for the magnum opus, it is reasonable to suppose that his alchemical writings are spurious, or that two authors, bearing the same name, have been ignorantly confused. With regard to “the Jewish neophyte,” referred to by the Biographie Universelle, no particulars of his life are forthcoming. The whole question is necessarily involved in uncertainty, but it is a point of no small importance to have established for the first time the fabulous nature of the Cremer Testament. This production was first published by Michael Maier, in his Tripus Aureus, about the year 1614. The two treatises which accompany it appear to be genuine relics of Hermetic antiquity.
The “Clavicula, or Little Key” of Raymond Lully is generally considered to contain the arch secrets of alchemical adeptship; it elucidates the other treatises of its author, and undertakes to declare the whole art without any fiction. The transmutation of metals depends upon their previous reduction into volatile sophic argent vive, and the only metals worth reducing, for the attainment of this prima materia, are silver and gold. This argent vive is said to be dryer, hotter, and more digested than the common substance, but its extraction is enveloped in mystery and symbolism, and the recipes are impossible to follow for want of the materials so evasively and deceptively described. At the same time, it is clear that the operations are physical, and that the materials and objects are also physical, which points are sufficient for our purpose, and may be easily verified by research.
Moreover, the alchemist who calls himself Raymond Lully was acquainted with nitric acid and with its uses as a dissolvent of metals. He could form aqua regia by adding sal ammoniac, or common salt, to nitric acid, and he was aware of its property of dissolving gold. Spirit of wine was well known to him, says Gruelin; he strengthened it with dry carbonate of potash, and prepared vegetable tinctures by its means. He mentions alum from Rocca, marcasite, white and red mercurial precipitate. He knew the volatile alkali and its coagulations by means of alcohol. He was acquainted with cupellated silver, and first obtained rosemary oil by distilling the plant with water.[P]
FOOTNOTES:
[K] This illness is referred to by another writer, with details of a miraculous kind. “About 1275 (the chronology of all the biographers is a chaos of confusion) he fell ill a second time, and was reduced to such an extremity that he could take neither rest nor nourishment. On the feast of the Conversion of St. Paul, the crucified Saviour again appeared to him, glorified, and surrounded by a most exquisite odour, which surpassed musk, amber, and all other scents. In remembrance of this miracle, on the same day, in the same bed and place where he lived and slept, the same supernal odour is diffused.”
[L] The following variation is also related: “Finding him still alive when they bore him to the ship, the merchants put back towards Genoa to get help, but they were carried miraculously to Majorca, where the martyr expired in sight of his native island. The merchants resolved to say nothing of their precious burden, which they embalmed and preserved religiously, being determined to transport it to Genoa. Three times they put to sea with a wind that seemed favourable, but as often they were forced to return into port, which proved plainly the will of God, and obliged them to make known the martyrdom of the man whom they revered, who was stoned for the glory of God in the town of Bugia (?) in the year of grace 1318.” From this account it will be seen that the place of Lully’s violent death, as well as the date on which it occurred, are both involved in doubt. He was born under the pontificate of Honorius IV., and died, according to Genebrand, about 1304; but the author of the preface to the meditations of the Hermit Blaquerne positively fixes his decease on the feast of the martyrdom of SS. Peter and Paul, June 29, 1315, and declares that he was eighty-six years old.
[M] E.g., Jean-Marie de Vernon, who extends the lists to about three thousand, and, following the Père Pacifique de Provence, prolongs his life by the discovery of the universal medicine.
[N] “Lives of Alchemystical Philosophers,” ed. 1815.
[O] Kenyon, “Gold Coins of England,” pp. 57, 58.
[P] Gruelin, Geschichte der Chemie, i. 74.
NICHOLAS FLAMEL.
The name of this alchemical adept has been profoundly venerated not only in the memory of the Hermetists but in the hearts of the French people, among whom he is the central figure of many marvellous legends and traditions. “Whilst in all ages and nations the majority of hierophants have derived little but deception, ruination, and despair as the result of their devotion to alchemy, Nicholas Flamel enjoyed permanent good fortune and serenity. Far from expending his resources in the practice of the magnum opus, he added with singular suddenness a vast treasure to a moderate fortune. These he employed in charitable endowments and in pious foundations that long survived him and long sanctified his memory. He built churches and chapels which were adorned with statues of himself, accompanied by symbolical characters and mysterious crosses, which subsequent adepts long strove to decipher, that they might discover his secret history, and the kabbalistic description of the process by which he was conducted to the realisation of the Grand Magisterium.”
Whether Flamel was born at Paris or Pontoise is not more uncertain than the precise date of his nativity. This occurred some time during the reign of Philippe le Bel, the spoliator of the grand order of the Temple, and, on the whole, the most probable year is 1330. His parents were poor, and left him little more than the humble house in Paris which he continued to possess till his death, and which he eventually bequeathed to the Church. It stood in Notary Street, at the corner of Marivaux Street, opposite the Marivaux door of the Church of Saint-Jacques-la-Boucherie.
Authorities disagree as to the amount of education that Flamel obtained in his youth, but it was sufficient to qualify him for the business of a scrivener, which, in spite of his wealth and his accredited wisdom, he continued to follow through life. He was proficient in painting and poetry, and had a taste for architecture and the mathematical sciences; yet he applied himself steadily to business, and contracted a prudent marriage, his choice falling on a widow, named Pernelle, who, though handsome, was over forty years, but who brought a considerable dowry to her second husband.
In his capacity as a copyist before the age of printing, books of all classes fell into the hands of Flamel, and among them were many of those illuminated alchemical treatises which are reckoned among the rarest treasures of mediæval manuscripts. Acquainted with the Latin language, he insensibly accumulated an exoteric knowledge of the aims and theories of the adepts. His interest and curiosity were awakened, and he began studying them in his leisure moments. Now tradition informs us that, whether his application was great, his desire intense, or whether he was super-eminently fitted to be included by divine election among the illuminated Sons of the Doctrine, or for whatever other reason, the mystical Bath-Kôl appeared to him under the figure of an angel, bearing a remarkable book bound in well-wrought copper, the leaves of thin bark, graven right carefully with a pen of iron. An inscription in characters of gold contained a dedication addressed to the Jewish nation by Abraham the Jew, prince, priest, astrologer, and philosopher.
“Flamel,” cried the radiant apparition, “behold this book of which thou understandest nothing; to many others but thyself it would remain for ever unintelligible, but one day thou shalt discern in its pages what none but thyself will see!”
At these words Flamel eagerly stretched out his hands to take possession of the priceless gift, but book and angel disappeared in an auriferous tide of light. The scrivener awoke to be ravished henceforth by the divine dream of alchemy; but so long a time passed without any fulfilment of the angelic promise, that the ardour of his imagination cooled, the great hope dwindled gradually away, and he was settling once more into the commonplace existence of a plodding scribe, when, on a certain day of election in the year 1357, an event occurred which bore evidence of the veracity of his visionary promise-maker, and exalted his ambition and aspirations to a furnace heat. This event, with the consequences it entailed, are narrated in the last testament of Nicholas Flamel, which begins in the following impressive manner, but omits all reference to the legendary vision:—
“The Lord God of my life, who exalts the humble in spirit out of the most abject dust, and makes the hearts of such as hope in Him to rejoice, be eternally praised.
“Who, of His own grace, reveals to the believing souls the springs of His bounty, and subjugates beneath their feet the crowns of all earthly felicities and glories.
“In Him let us always put our confidence, in His fear let us place our happiness, and in His mercy the hope and glory of restoration from our fallen state.
“And in our supplications to Him let us demonstrate or show forth a faith unfeigned and stable, an assurance that shall not for ever be shaken.
“And Thou, O Lord God Almighty, as Thou, out of Thy infinite and most desirable goodness, hast condescended to open the earth and unlock Thy treasures unto me, Thy poor and unworthy servant, and hast given into my possession the fountains and well-springs of all the treasures and riches of this world.
“So, O Lord God, out of Thine abundant kindness, extend Thy mercies unto me, that when I shall cease to be any longer in the land of the living, Thou mayst open unto me the celestial riches, the divine treasures, and give me a part or portion in the heavenly inheritance for ever.
“Where I may behold Thy divine glory and the fulness of Thy Heavenly Majesty, a pleasure, so ineffable, and a joy, so ravishing, which no mortal can express or conceive.
“This I entreat of Thee, O Lord, for our Lord Jesus Christ, Thy wellbeloved Son’s sake, who in the unity of the Holy Spirit liveth with Thee, world without end. Amen.
“I, Nicholas Flamel, Scrivener, living at Paris, anno 1399, in the Notary Street, near St. James, of the Bouchery, though I learned not much Latin, because of the poorness and meanness of my parents, who were notwithstanding (by them that envy me most) accounted honest and good people.
“Yet, by the blessing of God, I have not wanted an understanding of the books of the philosophers, but learned them and attained to a certain kind of knowledge, even of their hidden secrets.
“For which cause sake there shall not any moment of my life pass, wherein remembering this so vast a good, I will not on my bare knees, if the place will permit of it, or otherwise in my heart, with all the entireness of my affections, render thanks to this my most good and precious God.
“Who never forsakes the righteous generation, or suffers the children of the just to beg their bread, nor deceives their expectations, but supports them with blessings who put their trust in Him.
“After the death of my parents, I, Nicholas Flamel, got my living by the art of writing, engrossing inventories, making up accounts, keeping of books, and the like.
“In this course of living there fell by chance into my hands a gilded book, very old and large, which cost me only two florins.
“It was not made of paper or parchment, as other books are, but of admirable rinds (as it seemed to me) of young trees. The cover of it was of brass; it was well bound, and graven all over with a strange kind of letters, which I take to be Greek characters, or some such like.
“This I know that I could not read them, nor were they either Latin or French letters, of which I understand something.
“But as to the matter which was written within, it was engraven (as I suppose) with an iron pencil or graver upon the said bark leaves, done admirably well, and in fair and neat Latin letters, and curiously coloured.
“It contained thrice seven leaves, for so they were numbered in the top of each folio, and every seventh leaf was without any writing, but in place thereof there were several images or figures painted.
“Upon the first seventh leaf was depicted—1. A Virgin. 2. Serpents swallowing her up. On the second seventh, a serpent crucified; and on the last seventh, a desert or wilderness, in midst whereof were seen many fair fountains, whence issued out a number of serpents here and there.
“Upon the first of the leaves was written in capital letters of gold, Abraham the Jew, Priest, Prince, Levite, Astrologer, and Philosopher, to the nation of the Jews dispersed by the wrath of God in France, wisheth health.
“After which words, it was filled with many execrations and curses, with this word M , which was oft repeated against any one that should look in to unfold it, except he were either Priest or Scribe.
“The person that sold me this book was ignorant of its worth as well as I who bought it. I judge it might have been stolen from some of the Jewish nation, or else found in some place where they anciently abode.
“In the second leaf of the book he consoled his nation, and gave them pious counsel to turn from their wickedness and evil ways, but above all to flee from idolatry, and to wait in patience for the coming of the Messiah, who, conquering all the kings and potentates of the earth, should reign in glory with his people to eternity. Without doubt, this was a very pious, wise, and understanding man.
“In the third leaf, and in all the writings that followed, he taught them, in plain words, the transmutation of metals, to the end that he might help and assist his dispersed people to pay their tribute to the Roman Emperors, and some other things not needful here to be repeated.
“He painted the vessels by the side or margin of the leaves, and discovered all the colours as they should arise or appear, with all the rest of the work.
“But of the prima materia or first matter, or agent, he spake not so much as one word; but only he told them that in the fourth and fifth leaves he had entirely painted or decyphered it, and depicted or figured it, with a desirable dexterity and workmanship.
“Now though it was singularly well and materially or intelligibly figured and painted, yet by that could no man ever have been able to understand it without having been well skilled in their Cabala, which is a series of old traditions, and also to have been well studied in their books.
“The fourth and fifth leaf thereof was without any writing, but full of fair figures, bright and shining, or, as it were, enlightened, and very exquisitely depicted.
“First, there was a young man painted, with wings at his ankles, having in his hand a caducean rod, writhen about with two serpents, wherewith he stroke upon an helmet covering his head.
“This seemed in my mean apprehension to be one of the heathen gods, namely, Mercury. Against him there came running and flying with open wings, a great old man with an hour-glass fixed upon his head, and a scythe in his hands, like Death, with which he would (as it were in indignation) have cut off the feet of Mercury.
“On the other side of the fourth leaf he painted a fair flower, on the top of a very high mountain, which was very much shaken by the north wind. Its footstalk was blue, its flowers white and red, and its leaves shining like fine gold, and round about it the dragons and griffins of the north made their nests and habitations.
“On the fifth leaf was a fair rose-tree, flowered, in the midst of a garden, growing up against a hollow oak, at the foot whereof bubbled forth a fountain of pure white water, which ran headlong down into the depths below.
“Yet it passed through the hands of a great number of people who digged in the earth, seeking after it, but, by reason of their blindness, none of them knew it, except a very few, who considered its weight.
“On the last side of the leaf was depicted a king, with a faulchion, who caused his soldiers to slay before him many infants, the mothers standing by, and weeping at the feet of their murderers.
“These infants’ blood being gathered up by other soldiers, was put into a great vessel wherein Sol and Luna came to bathe themselves.
“And because this history seemed to represent the destruction of the Innocents by Herod, and that I learned the chiefest part of the art in this book, therefore I placed in their churchyard these hieroglyphic figures of this learning. Thus have you that which was contained in the first five leaves.
“As for what was in all the rest of the written leaves, which was wrote in good and intelligible Latin, I must conceal, lest God being offended with me should send His plague and judgments upon me. It would be a wickedness much greater than he who wished that all men in the world had but one head, that he might cut it off at a blow.
“Having thus obtained this delicate and precious book, I did nothing else day and night but study it; conceiving very well all the operations it pointed forth, but wholly ignorant of the prima materia with which I should begin, which made me very sad and discontented.
“My wife, whose name was Perrenelle, whom I loved equally with myself, and whom I had but lately married, was mightily concerned for me, and, with many comforting words, earnestly desired to know how she might deliver me from this trouble.
“I could no longer keep counsel, but told her all, shewing her the very book, which, when she saw, she became as well pleased with it as myself, and with great delight beheld the admirable cover, the engraving, the images, and exquisite figures thereof, but understood them as little as I.
“Yet it was matter of consolation to me to discourse and entertain myself with her, and to think what we should do to find out the interpretation and meaning thereof.
“At length I caused to be painted within my chamber, as much to the life or original as I could, all the images and figures of the said fourth and fifth leaves.
“These I showed to the greatest scholars and most learned men in Paris, who understood thereof no more than myself: I told them they were found in a book which taught the philosophers’ stone.
“But the greatest part of them made a mock both of me and that most excellent secret, except one whose name was Anselm, a practiser of physic and a deep student in this art.
“He much desired to see my book, which he valued more than anything else in the world, but I always refused him, only making him a large demonstration of the method.
“He told me that the first figure represented Time, which devours all things, and that, according to the number of the six written leaves, there was required a space of six years to perfect the stone; and then, said he, we must turn the glass and see it no more.
“I told him this was not painted, but only to show the teacher the prima materia, or first agent, as was written in the book. He answered me that this digestion for six years was, as it were, a second agent, and that certainly the first agent was there painted, which was a white and heavy water.
“This, without doubt, was argent vive, which they could not fix; that is, cut off his feet, or take away his volubility, save by that long digestion in the pure blood of young infants.
“For in that this argent vive being joined with Sol and Luna was first turned with them into a plant, like that there painted, and afterwards by corruption into serpents, which serpents, being perfectly dried and digested, were made a fine powder of gold, which is the stone.
“This strange or foreign discourse to the matter was the cause of my erring, and that made me wander for the space of one and twenty years in a perfect meander from the verity; in which space of time I went through a thousand labyrinths or processes, but all in vain; yet never with the blood of infants, for that I accounted wicked and villainous.
“For I found in my book that the philosophers called blood the mineral spirit which is in the metals, chiefly in Sol, Luna, and Mercury, to which sense I always, in my own judgment, assented.