Saturday, January 11, 2020

2020-01-11 Saturday - Enterprise Data Lake Resources

This blog post is a placeholder for collecting interesting resources related to Enterprise Data Lakes:

Background Reading:
    • "Coined by James Dixon, CTO of Pentaho, the term “data lake” refers to the ad hoc nature of data in a data lake, as opposed to the clean and processed data stored in traditional data warehouse systems."

Suggested Amazon Books:

Additional Suggested Reading:


Amazon AWS Data Lakes:

Microsoft - Azure:

Microsoft - SQL Server 2019:

IBM Data Lake:


Apache Projects:

  • https://flume.apache.org/
    • "Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. It has a simple and flexible architecture based on streaming data flows. It is robust and fault tolerant with tunable reliability mechanisms and many failover and recovery mechanisms. It uses a simple extensible data model that allows for online analytic application."
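
To make the "streaming data flows" idea in the Flume description a little more concrete, here is a minimal Python sketch of a source -> channel -> sink pipeline. It is a toy model only and is not Flume's API (Flume itself is written in Java and is configured through properties files). The names Event, MemoryChannel, and run_flow are hypothetical, chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Event:
    """A unit of data moving through the flow: a body plus optional headers."""
    body: str
    headers: dict = field(default_factory=dict)


class MemoryChannel:
    """Hypothetical bounded in-memory buffer between a source and a sink."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._events = deque()

    def put(self, event: Event) -> bool:
        # Reject the event when the buffer is full so the caller can back off.
        if len(self._events) >= self.capacity:
            return False
        self._events.append(event)
        return True

    def take(self, batch_size: int) -> list:
        # Remove and return up to batch_size events in arrival order.
        batch = []
        while self._events and len(batch) < batch_size:
            batch.append(self._events.popleft())
        return batch


def run_flow(lines, channel, batch_size=2):
    """Push log lines through source -> channel -> sink; return delivered bodies."""
    delivered = []
    for line in lines:
        event = Event(body=line)
        # Source side: if the channel is full, drain one batch, then retry once.
        if not channel.put(event):
            delivered.extend(e.body for e in channel.take(batch_size))
            channel.put(event)
    # Sink side: drain whatever is still buffered.
    while True:
        batch = channel.take(batch_size)
        if not batch:
            break
        delivered.extend(e.body for e in batch)
    return delivered


if __name__ == "__main__":
    print(run_flow(["line 1", "line 2", "line 3"], MemoryChannel(capacity=2)))
```

The bounded channel is the part worth noticing: it decouples how fast events arrive from how fast they are delivered, which is the role a channel plays between a source and a sink in a Flume agent.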
   



Other Open Source Projects:

  • https://prestodb.io/
    • "Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes."
    • https://aws.amazon.com/emr/features/presto/
    • https://aws.amazon.com/big-data/what-is-presto/
      • "Presto (or PrestoDB) is an open source, distributed SQL query engine, designed from the ground up for fast analytic queries against data of any size. It supports both non-relational sources, such as the Hadoop Distributed File System (HDFS), Amazon S3, Cassandra, MongoDB, and HBase, and relational data sources such as MySQL, PostgreSQL, Amazon Redshift, Microsoft SQL Server, and Teradata."
      • "Presto can query data where it is stored, without needing to move data into a separate analytics system. Query execution runs in parallel over a pure memory-based architecture, with most results returning in seconds. You’ll find it used by many well-known companies like Facebook, Airbnb, Netflix, Atlassian, and Nasdaq."
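
As a rough illustration of the "query data where it is stored" point quoted above, here is a minimal Python sketch that only builds the text of a federated Presto query; it does not connect to a cluster. Presto addresses tables as catalog.schema.table, so a single statement can join tables served by different connectors. The catalog, schema, table, and column names used here (hive.web.page_views, mysql.crm.customers, customer_id, and so on) and the helper functions are hypothetical, chosen for illustration.

```python
def qualified(catalog: str, schema: str, table: str) -> str:
    """Return a Presto-style fully qualified table name: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


def build_federated_query(limit: int = 10) -> str:
    """Build one SQL statement that joins a Hive table with a MySQL table.

    All catalog, schema, table, and column names are hypothetical.
    """
    views = qualified("hive", "web", "page_views")      # e.g. files in HDFS or S3
    customers = qualified("mysql", "crm", "customers")  # e.g. a relational database
    return (
        "SELECT c.name, count(*) AS views\n"
        f"FROM {views} AS v\n"
        f"JOIN {customers} AS c ON v.customer_id = c.id\n"
        "GROUP BY c.name\n"
        "ORDER BY views DESC\n"
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    print(build_federated_query(limit=5))
```

The resulting string would be submitted through whichever Presto client or CLI you already use; the point of the sketch is only that both data sources appear in one query, with no intermediate copy into a separate analytics store.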

Video Resources:

Copyright

© 2001-2021 International Technology Ventures, Inc., All Rights Reserved.