Practical data science : a guide to building the technology stack for turning data lakes into business assets

Bibliographic details

Title
Practical data science : a guide to building the technology stack for turning data lakes into business assets
Responsibility
Vermeulen, Andreas François (author)
Published
Berkeley, CA: Apress, 2018
©2018
Year of publication
2018
Media type
E-Book
Data source
British Library Catalogue
Access

No further availability information can currently be provided for this title.

Contents:
  • Intro; Table of Contents; About the Author; About the Technical Reviewer; Acknowledgments; Introduction; Chapter 1: Data Science Technology Stack; Rapid Information Factory Ecosystem; Data Science Storage Tools; Schema-on-Write and Schema-on-Read; Schema-on-Write Ecosystems; Schema-on-Read Ecosystems; Data Lake; Data Vault; Hubs; Links; Satellites; Data Warehouse Bus Matrix; Data Science Processing Tools; Spark; Spark Core; Spark SQL; Spark Streaming; MLlib Machine Learning Library; GraphX; Mesos; Akka; Cassandra; Kafka; Kafka Core; Kafka Streams; Kafka Connect; Elastic Search; R; Scala.
  • Python; MQTT (MQ Telemetry Transport); What's Next?; Chapter 2: Vermeulen-Krennwallner-Hillman-Clark; Windows; Linux; It's Now Time to Meet Your Customer; Vermeulen PLC; Krennwallner AG; Hillman Ltd; Clark Ltd; Processing Ecosystem; Scala; Apache Spark; Apache Mesos; Akka; Apache Cassandra; Kafka; Message Queue Telemetry Transport; Example Ecosystem; Python; Ubuntu; CentOS/RHEL; Windows; Is Python3 Ready?; Python Libraries; Pandas; Ubuntu; CentOS/RHEL; PIP; Matplotlib; Ubuntu; CentOS/RHEL; PIP; NumPy; SymPy; Scikit-Learn; R; Ubuntu; CentOS/RHEL; Windows; Development Environment; R Studio.
  • Ubuntu; CentOS/RHEL; Windows; R Packages; Data.Table Package; ReadR Package; JSONLite Package; Ggplot2 Package; Amalgamation of R with Spark; Sample Data; IP Addresses Data Sets; Customer Data Sets; Logistics Data Sets; Post Codes; Warehouse Data Set; Shop Data Set; Exchange Rate Data Set; Profit-and-Loss Statement Data Set; Summary; Chapter 3: Layered Framework; Definition of Data Science Framework; Cross-Industry Standard Process for Data Mining (CRISP-DM); Business Understanding; Data Understanding; Data Preparation; Modeling; Evaluation; Deployment.
  • Homogeneous Ontology for Recursive Uniform Schema; The Top Layers of a Layered Framework; The Basics for Business Layer; The Basics for Utility Layer; The Basics for Operational Management Layer; The Basics for Audit, Balance, and Control Layer; Audit; Balance; Control; The Basics for Functional Layer; Layered Framework for High-Level Data Science and Engineering; Windows; Linux; Summary; Chapter 4: Business Layer; Business Layer; The Functional Requirements; General Functional Requirements; Specific Functional Requirements; Data Mapping Matrix; Sun Models; Dimensions.
  • SCD Type 1: Only Update; SCD Type 2: Keeps Complete History; SCD Type 3: Transition Dimension; SCD Type 4: Fast-Growing Dimension; Facts; Intra-Sun Model Consolidation Matrix; Sun Model One; Sun Model Two; Sun Model Three; The Nonfunctional Requirements; Accessibility Requirements; Audit and Control Requirements; Availability Requirements; Backup Requirements; Capacity, Current, and Forecast; Capacity; Concurrency; Throughput Capacity; Storage (Memory); Storage (Disk); Storage (GPU); Year-on-Year Growth Requirements; Configuration Management; Deployment; Documentation; Disaster Recovery.