Saturday, June 9, 2018

Python - Linear Regression with tolerance

OK, here is the situation: in linear regression we fit a hypothesis on a training dataset and use it to predict the value for a new input. Standard algorithms output a single real number as the prediction, but (as far as I know) the user gets no indication of how accurate that prediction is. So I decided to write some simple Python code to check how useful it would be to report a tolerance along with the prediction.
I wrote a simple script to create a fake dataset with a linear correlation. The output is shown in Figure 1.
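For readers who want to reproduce a dataset of this kind, here is a minimal sketch. It is not the code from the post: the function name `make_linear_dataset`, the slope, intercept, noise level and sample size are all assumptions chosen for illustration, and the line is fitted with NumPy's `polyfit` rather than any particular regression library.

```python
import numpy as np

def make_linear_dataset(n=200, slope=0.1, intercept=0.05, noise=0.05, seed=0):
    """Return x, y with a noisy linear relation (all parameters are illustrative)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 10.0, size=n)
    # Linear signal plus Gaussian noise.
    y = slope * x + intercept + rng.normal(0.0, noise, size=n)
    return x, y

x, y = make_linear_dataset()

# Ordinary least-squares fit of a straight line.
# For deg=1, polyfit returns the coefficients as [slope, intercept].
slope_hat, intercept_hat = np.polyfit(x, y, deg=1)
print(f"fitted slope={slope_hat:.3f}, intercept={intercept_hat:.3f}")
```

Plotting `x` against `y` together with the fitted line should give a picture similar in spirit to Figure 1, although the exact spread depends on the assumed noise.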

As we can see, at low values of X the data spread widely around the predicted value, while at high values of X the prediction is visibly lower than the average of the data. Knowing how much the predicted value can vary could therefore be useful in some cases. For this reason, the function predict() returns the final predicted value together with the distances from the predicted value to the minimum and maximum values in the nearby data. This gives better insight into the possible range and where the prediction sits within it. For demonstration purposes, the code prints the calculated values as follows:
We can see that for the given value 2, the predicted value is 0.258 and the mean of the nearby data around 2 is 0.2612. Averaging these two numbers gives a final prediction of 0.260, a slight adjustment based on the spread of the nearby data. At x=2 the prediction is almost in the middle of the max and min values, but for x=3 the distance from the prediction to the min is -0.21 and to the max is 0.14, which indicates that the prediction is farther from the min than from the max. The final value moves toward the min, so the adjustment goes in the right direction.

The function considers 5% of total data as nearby data.
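The idea described above can be sketched roughly as follows. This is not the code from the GitHub repository; it is a minimal illustration under several assumptions: the name `predict_with_tolerance` and the `frac` parameter are hypothetical, "nearby data" is taken to mean the 5% of training points whose x is closest to the new value, and the distances are measured from the regression-line prediction (min minus prediction, max minus prediction, which matches the signs quoted above).

```python
import numpy as np

def predict_with_tolerance(x_train, y_train, x_new, frac=0.05):
    """Return (final_prediction, distance_to_min, distance_to_max).

    Assumption: 'nearby data' = the frac * N training points closest to x_new.
    """
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)

    # The method only makes sense inside the range of the training data.
    if not (x_train.min() <= x_new <= x_train.max()):
        raise ValueError("x_new is outside the range of the training data")

    # Plain least-squares line; polyfit returns [slope, intercept] for deg=1.
    slope, intercept = np.polyfit(x_train, y_train, deg=1)
    line_pred = slope * x_new + intercept

    # Pick the k nearest training points along the x axis.
    k = max(1, int(round(frac * len(x_train))))
    nearest = np.argsort(np.abs(x_train - x_new))[:k]
    y_near = y_train[nearest]

    # Final value: average of the line prediction and the local mean,
    # e.g. (0.258 + 0.2612) / 2 = 0.2596, which rounds to 0.260.
    final_pred = (line_pred + y_near.mean()) / 2.0

    # Signed distances from the line prediction to the local min and max.
    dist_min = y_near.min() - line_pred
    dist_max = y_near.max() - line_pred
    return final_pred, dist_min, dist_max

# Usage on an illustrative noisy dataset (parameters are assumptions).
rng = np.random.default_rng(0)
x_demo = rng.uniform(0.0, 10.0, size=200)
y_demo = 0.1 * x_demo + 0.05 + rng.normal(0.0, 0.05, size=200)
final, d_min, d_max = predict_with_tolerance(x_demo, y_demo, 2.0)
print(f"final={final:.3f}, to min={d_min:.3f}, to max={d_max:.3f}")
```

Raising an error outside the training range is a design choice made here for clarity; returning the plain regression prediction in that case would be an equally reasonable alternative.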

Note: this method only works within the range of the training data. The function can't improve results outside that range because there is no nearby data there.

You can find the Python code on GitHub.


