## Data Analysis and Visualization

## Data Analysis

With the data collected, we first plot how each measured quantity varies over time, using different colors to mark the phases of the experiment. The results are presented in the Conclusion section.

Next, we estimate the decay function of each quantity over time and report the observed decay rate. The plots show that the data are piecewise linear, so we fit the model with linear regression. Before fitting, we preprocess the data.
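As a minimal sketch of the fitting step, the following fits a straight line to one phase of the data by ordinary least squares and reads the decay rate off the slope. The function name `fit_line` and the sample readings are hypothetical and only illustrate the idea; they are not the project's actual code or data.

```python
def fit_line(times, values):
    """Ordinary least-squares fit of values = slope * times + intercept."""
    n = len(times)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    # Sum of squared deviations of the time stamps.
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("all time stamps are identical")
    # Sum of cross-deviations between time and value.
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_t
    return slope, intercept


# Hypothetical readings from a single phase (assumed, not measured data).
times = [0, 1, 2, 3, 4]
values = [10.0, 9.5, 9.0, 8.5, 8.0]
slope, intercept = fit_line(times, values)
decay_rate = -slope  # positive when the quantity is decreasing
```

For piecewise-linear data, the same fit would be applied separately to the samples of each phase, giving one decay rate per phase.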

The sensor is sensitive enough that the readings fluctuate drastically, so we first smooth the data with a moving average.
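A simple moving average can be sketched as follows. The window size of 3 and the sample readings are assumptions chosen for illustration; the window actually used in the experiments is not stated here.

```python
def moving_average(values, window):
    """Simple moving average over a sliding window.

    Returns len(values) - window + 1 smoothed samples.
    """
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    smoothed = [running / window]
    for i in range(window, len(values)):
        # Slide the window: add the newest sample, drop the oldest.
        running += values[i] - values[i - window]
        smoothed.append(running / window)
    return smoothed


# Hypothetical noisy readings (assumed, not measured data).
raw = [10.0, 12.0, 8.0, 11.0, 9.0, 10.0]
smoothed = moving_average(raw, window=3)
```

A larger window suppresses more of the fluctuation but also blurs the transitions between phases, so the window is best kept short relative to the length of a phase.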

We also find several noise points (outliers) in the data, so we use RANSAC (random sample consensus) for robust regression.

RANSAC is a non-deterministic iterative algorithm that estimates the parameters of a (supervised) machine learning model from a dataset that contains outliers. To do so, it divides the points in the dataset into two subsets, outliers and inliers, and then fits the model using only the inliers.

Generally speaking, RANSAC starts by randomly selecting a subset of points as hypothetical inliers. The subset is just large enough to fit the model; for linear regression, for example, at least n+1 points are needed, where n is the number of features. After fitting the model to the hypothetical inliers, RANSAC checks which points in the original dataset are consistent with the fitted model and, if this set of consistent points is better than the current one, updates the current inlier subset. The algorithm repeats until the inlier subset is large enough (the required size is an input to the algorithm) or the maximum number of iterations is reached.
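The procedure can be sketched for a one-feature line fit, where the minimal sample is n+1 = 2 points. This is one common variant written for illustration, not necessarily the implementation used in the project: it keeps the largest set of inliers seen so far and refits by least squares at the end. The names `ransac_line`, `threshold`, `max_iters` and `min_inliers`, and the sample points, are all assumptions.

```python
import random


def line_through(p, q):
    """Line through two points as (slope, intercept); None if vertical."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        return None
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1


def least_squares_line(points):
    """Ordinary least-squares line through a list of (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def ransac_line(points, threshold, max_iters=200, min_inliers=None, seed=0):
    """Return ((slope, intercept), inliers) for the best consensus set found."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    best_inliers = []
    for _ in range(max_iters):
        # 1. Pick a minimal sample as hypothetical inliers.
        model = line_through(*rng.sample(points, 2))
        if model is None:
            continue
        slope, intercept = model
        # 2. Collect every point consistent with the candidate model.
        inliers = [(x, y) for x, y in points
                   if abs(y - (slope * x + intercept)) <= threshold]
        # 3. Keep the largest consensus set seen so far.
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
        # 4. Stop early once the consensus set is large enough.
        if min_inliers is not None and len(best_inliers) >= min_inliers:
            break
    if len(best_inliers) < 2:
        raise ValueError("no usable model found")
    # Final model: refit on all inliers of the best consensus set.
    return least_squares_line(best_inliers), best_inliers


# Hypothetical data: 20 points on y = -0.5 x + 10 plus three outliers.
points = [(x, -0.5 * x + 10) for x in range(20)]
points += [(3, 30.0), (8, -5.0), (15, 25.0)]
(slope, intercept), inliers = ransac_line(points, threshold=0.1)
```

The residual `threshold` decides what counts as "consistent with the model" and has to be chosen relative to the noise level that remains after smoothing.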

The RANSAC algorithm rests on two fundamental assumptions:

1. There are enough inliers to agree on a good model.
2. The outliers do not vote consistently for any single model. This is vital because otherwise the outliers would consistently support a model of their own and the iteration would converge to a local optimum.

## Data Visualization

ThingSpeak is an open-source Internet of Things (IoT) application and API for storing and retrieving data from things using the HTTP protocol over the Internet or a local area network. The real-time data from our sensors are displayed through ThingSpeak. An example is shown below.
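Separately from the dashboard view, readings stored on ThingSpeak can also be retrieved over HTTP for offline analysis. The sketch below assumes the public channel feed endpoint (`/channels/<channel_id>/feeds.json`) and a feed response whose entries carry `created_at` and `field1` keys; the channel ID, the field name and the sample payload are placeholders, not this project's channel or data.

```python
import json
from urllib.parse import urlencode

THINGSPEAK_BASE = "https://api.thingspeak.com"


def feed_url(channel_id, results=10, api_key=None):
    """Build the URL of a channel's JSON feed.

    A read API key is only needed for private channels.
    """
    params = {"results": results}
    if api_key:
        params["api_key"] = api_key
    return f"{THINGSPEAK_BASE}/channels/{channel_id}/feeds.json?{urlencode(params)}"


def parse_feed(payload, field="field1"):
    """Extract (timestamp, value) pairs, skipping entries with no reading."""
    readings = []
    for entry in json.loads(payload).get("feeds", []):
        raw = entry.get(field)
        if raw is None:
            continue
        readings.append((entry["created_at"], float(raw)))
    return readings


# Hand-written sample in the assumed shape of a feed response (not real data).
sample = (
    '{"feeds": ['
    '{"created_at": "2020-01-01T00:00:00Z", "entry_id": 1, "field1": "21.5"},'
    '{"created_at": "2020-01-01T00:00:15Z", "entry_id": 2, "field1": null}'
    ']}'
)
url = feed_url(12345, results=2)  # 12345 is a placeholder channel ID
readings = parse_feed(sample)
```

In practice the payload would come from an HTTP GET on `url`, for example with `urllib.request.urlopen`, and the parsed readings could then be passed to the smoothing and regression steps described above.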
