The aim of this Kickstarter campaign analysis is to predict whether a campaign will succeed or fail. It is a multi-class classification problem in which the target has five levels: canceled, failed, live, successful and suspended. Knowing the likely outcome of a campaign before the project launches is extremely important, so this analysis is a useful input when setting the funding goal.
Each project in the dataset has one of the five statuses listed above. The project id and url columns do not appear to have any effect on model building, so they have been removed from both the training and test datasets. The levels column has been converted into a factor in both datasets.
Feature selection is an extremely important part of model building and can change the accuracy of a model considerably.
Since the data has both categorical and numerical attributes, the method chosen for feature selection is “Boruta” (Analytics Vidhya, 2016).
Boruta gives a clear verdict on the importance of each feature. It is known for following the all-relevant approach to feature selection, which captures every feature that is relevant to the target variable in at least some circumstances. Traditional feature selection algorithms, by contrast, follow the minimal-optimal approach, which relies on a small subset of features that yields minimal error on a chosen classifier.
Boruta checks all the features and reports which ones are strongly relevant and which are weakly relevant to the target variable. Because of this, the technique is widely applied in the medical field.
According to the Boruta results, the features strongly related to the target variable are category, subcategory, goal, levels, duration and location.
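The shadow-feature idea behind Boruta can be illustrated with a small sketch. It is written in Python rather than with the R Boruta package used for this analysis, and it makes two simplifying assumptions: absolute Pearson correlation stands in for the random forest importance that Boruta actually uses, and a plain hit count replaces Boruta's statistical test. The function name shadow_test and the toy columns goal and noise are hypothetical.

```python
import random
from statistics import mean

def abs_corr(xs, ys):
    """Absolute Pearson correlation; a stand-in for random forest importance."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov) / (vx * vy) ** 0.5

def shadow_test(features, target, rounds=50, seed=0):
    """Count how often each feature beats the best shuffled (shadow) copy."""
    rng = random.Random(seed)
    hits = {name: 0 for name in features}
    for _ in range(rounds):
        shadow_scores = []
        for values in features.values():
            shadow = values[:]
            rng.shuffle(shadow)  # shuffling breaks any link to the target
            shadow_scores.append(abs_corr(shadow, target))
        threshold = max(shadow_scores)
        for name, values in features.items():
            if abs_corr(values, target) > threshold:
                hits[name] += 1
    return hits

# Hypothetical toy data: 'goal' tracks the outcome, 'noise' does not.
rng = random.Random(1)
target = [i % 2 for i in range(200)]
features = {
    "goal": [t * 10 + rng.random() for t in target],
    "noise": [rng.random() for _ in target],
}
hits = shadow_test(features, target)
```

A feature that beats the best shadow copy in every round is a strong candidate to keep, while one that rarely does is a candidate to reject.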
Since this is a classification problem, candidate methods include KNN, Random Forest, Decision Tree and SVM. The method chosen here is a Decision Tree.
A decision tree classifier is a supervised learning method that poses a series of learned questions about the features of the training examples. Each answer leads to a follow-up question until a conclusion about the class label is reached.
After creating a data frame that contains all the important features, a model is built on it using the rpart library.
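How a tree learns one such question can be sketched as follows. This is a minimal Python illustration, not the rpart model built for this report: it scores every candidate threshold on a single numeric feature by weighted Gini impurity and keeps the best one. The funding goals and outcomes are invented for the example.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best 'value <= threshold?' question."""
    best = (None, gini(labels))
    # The largest value is skipped because it would leave the right side empty.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best

# Hypothetical funding goals and outcomes, for illustration only.
goals = [1000, 2000, 3000, 50000, 80000, 100000]
status = ["successful", "successful", "successful", "failed", "failed", "failed"]
threshold, impurity = best_split(goals, status)
```

On this toy data the chosen question is "goal <= 3000?", which separates the two outcomes completely. A full tree repeats the same search inside each resulting group.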
The accuracy obtained is 98% on the training data and 45% on the test data. A gap this large suggests that the tree is overfitting the training data.
To improve the results, a second model, SVM, has been applied.
SVM (Support Vector Machine) is an extremely popular discriminative algorithm that is also widely used in text classification problems. SVM tries to find the maximum-margin hyperplane that separates the data by class in a high-dimensional space. Finding this hyperplane is posed as an optimization problem (Ray Sunil, 2017).
SVM helps to limit overfitting, because maximizing the margin acts as a form of regularization, and it tends to cope well with high-dimensional data.
Class Separation: SVM looks for a hyperplane placed between two classes so that the margin to the closest points of each class is as large as possible. The points that lie on the edge of this margin are called support vectors.
Overlapping Classes: SVM reduces the weight of data points that fall on the wrong side, i.e. it tolerates some misclassification (a soft margin) and so handles overlapping classes.
Nonlinearity: SVM also handles problems that are not linearly separable (Siva, 2013).
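The margin and the penalty for overlapping points can be made concrete with a short Python sketch. The hyperplane here is hand-picked rather than trained, the five 2-D points are invented, and the hinge loss is used as the usual soft-margin penalty, so this is an illustration of the quantities involved and not the SVM fitted in this report.

```python
import math

def decision(w, b, x):
    """Signed score w.x + b; its sign is the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def hinge_loss(w, b, x, y):
    """Soft-margin penalty: zero for points on or beyond their margin line."""
    return max(0.0, 1.0 - y * decision(w, b, x))

def margin_width(w):
    """Distance between the margin lines w.x + b = +1 and w.x + b = -1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

# Hand-picked hyperplane x1 = 2 and hypothetical labeled points (label is -1 or +1).
w, b = (1.0, 0.0), -2.0
points = [
    ((1.0, 0.0), -1),
    ((0.0, 2.0), -1),
    ((3.0, 1.0), 1),
    ((5.0, 0.0), 1),
    ((2.5, 1.0), -1),  # sits on the wrong side: an overlapping point
]

# Points lying exactly on a margin line (y * score == 1).
on_margin = [x for x, y in points if abs(y * decision(w, b, x) - 1.0) < 1e-9]
losses = [hinge_loss(w, b, x, y) for x, y in points]
```

Only the last point receives a non-zero penalty. A trained SVM trades the total of such penalties against the width of the margin.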
When no linear separator can be found, the data points are projected into a higher-dimensional space in which they can be linearly separated. SVM usually needs little preprocessing, although scaling the numeric features is generally advisable. SVM is a supervised machine learning algorithm that can be used for both classification and regression problems. Its goal is to find the optimal hyperplane separating the two classes. SVM is efficient for both linear and non-linear classification; non-linear classification is handled using the kernel trick. Choosing the kernel for a non-linear problem is tricky and requires a trial-and-error approach.
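The kernel trick can be checked numerically on a small case. The Python sketch below uses the degree-2 polynomial kernel on 2-D points, chosen only because its feature map is easy to write out; the report does not state which kernel was used. It shows that the kernel value computed in the original space equals the dot product after an explicit projection into three dimensions.

```python
import math

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def phi(x):
    """Explicit degree-2 feature map: 2-D point -> 3-D point."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def poly_kernel(x, z):
    """Degree-2 polynomial kernel, computed in the original 2-D space."""
    return dot(x, z) ** 2

# Two arbitrary example points.
x, z = (1.0, 2.0), (3.0, 0.5)
explicit = dot(phi(x), phi(z))  # project first, then take the dot product
implicit = poly_kernel(x, z)    # same quantity without ever projecting
```

Both values agree up to floating-point rounding, which is why an SVM can work in the higher-dimensional space without computing the projection.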
The SVM has also been built on the pruned data, using only the variables that Boruta identified as important.
The results have been saved to the scores.csv file and the header has been changed to product.id and status. After the csv file was created, it was uploaded to Kaggle.
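Writing a file in that two-column layout might look like the following sketch. It uses Python's csv module rather than whatever R call produced the original file, and the ids and predictions are invented; only the header names product.id and status come from the report.

```python
import csv
import io

def write_scores(predictions, handle):
    """Write (id, predicted status) pairs under the product.id / status header."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["product.id", "status"])
    writer.writerows(predictions)

# Hypothetical ids and predictions, for illustration only.
predictions = [(101, "successful"), (102, "failed")]

# An in-memory buffer keeps the example free of side effects.
buffer = io.StringIO()
write_scores(predictions, buffer)
csv_text = buffer.getvalue()
```

To produce the actual file, the same function can be given a handle from open("scores.csv", "w", newline="").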
In this assignment we try to predict the class of a campaign using machine learning classification approaches. Several insights have been developed that can be used for better campaigning. The results suggest that better incorporation of the failed projects can significantly improve the robustness of the prediction model and lead to much better predictions. Another approach that could be used to predict the outcome of a campaign is to use Twitter data and analyze the impact of the campaign on social media. For this problem, Decision Trees can also be used alongside SVM, but they might take a very long time on a computer with average capabilities, and the chances are high that the results will be unsatisfactory.
Ray Sunil (2017) Available at: https://www.analyticsvidhya.com/blog/2017/09/understaing-support-vector-machine-example-code/ (Accessed: September 13, 2017)
Siva (2013) SVM Implementation Step By Step With R. Available at: https://sivaanalytics.wordpress.com/2013/06/15/svm-implementation-step-by-step-with-r-data-preparation/ (Accessed: June 15, 2013)
Saxena R (2017) How Decision Algorithm Works. Available at: http://dataaspirant.com/2017/01/30/how-decision-tree-algorithm-works/ (Accessed: January 30, 2017)
Analytics Vidhya (2016) How to perform feature selection (i.e. pick important variables) using Boruta Package in R? Available at: https://www.analyticsvidhya.com/blog/2016/03/select-important-variables-boruta-package/ (Accessed: March 22, 2016)