After completing this module, students will be able to:
- Explain the purpose of R Server.
- Connect to R Server from R Client.
- Explain the purpose of the ScaleR functions.
After completing this module, students will be able to:
- Explain ScaleR data sources.
- Describe how to import XDF data.
- Describe how to summarize data held in XDF format.
After completing this module, students will be able to:
Lessons
- Using the RxLocalParallel compute context with rxExec
- Using the RevoPemaR package
Lab : Using rxExec and RevoPemaR to parallelize operations
- Using rxExec to maximize resource use
- Creating and using a PEMA class
After completing this module, students will be able to:
Module 6: Creating and Evaluating Regression Models
Explain how to build and evaluate regression models generated from big data.
Lessons
- Clustering Big Data
- Generating regression models and making predictions
Lab : Creating a linear regression model
- Creating a cluster
- Creating a regression model
- Generating data for making predictions
- Using the models to make predictions and compare the results
After completing this module, students will be able to:
Module 7: Creating and Evaluating Partitioning Models
Explain how to create and score partitioning models generated from big data.
Lessons
- Creating partitioning models based on decision trees
- Testing partitioning models by making and comparing predictions
Lab : Creating and evaluating partitioning models
- Splitting the dataset
- Building models
- Running predictions and testing the results
- Comparing results
After completing this module, students will be able to:
- Create partitioning models using the rxDTree, rxDForest, and rxBTrees algorithms.
- Test partitioning models by making and comparing predictions.
Module 8: Processing Big Data in SQL Server and Hadoop
Explain how to transform and clean big data sets.
Lessons
- Using R in SQL Server
- Using Hadoop MapReduce
- Using Hadoop Spark
Lab : Processing big data in SQL Server and Hadoop
- Creating a model and predicting outcomes in SQL Server
- Performing an analysis and plotting the results using Hadoop MapReduce
- Integrating a sparklyr script into a ScaleR workflow
After completing this module, students will be able to: