
Tuesday, March 22, 2016

Home Depot Kaggle competition started

Started working on the Home Depot Kaggle competition. It requires a lot of text cleaning before any significant improvement over the benchmark is possible.
I am running some cleaning, spell-checking, and initial feature generation on my 33-node AWS Spark cluster.
I might not be able to put a lot of effort into it, but I will make sure I make at least one submission with basic features.
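
To give a rough idea of what "cleaning" and "basic features" could mean here, below is a minimal sketch in plain Python. It is not the code running on the cluster: the function names `clean_text` and `common_word_count`, the cleaning rules, and the sample strings are all assumptions made up for illustration. It normalizes a piece of text and counts how many distinct words a search query shares with a product title.

```python
import re


def clean_text(text):
    """Lowercase the text and replace every run of non-alphanumeric
    characters with a single space (hypothetical cleaning rule)."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9]+", " ", text)
    return text.strip()


def common_word_count(query, title):
    """Count distinct words that appear in both the cleaned query
    and the cleaned title (a simple word-overlap feature)."""
    query_words = set(clean_text(query).split())
    title_words = set(clean_text(title).split())
    return len(query_words & title_words)


# Made-up example strings, not taken from the competition data.
print(clean_text("  Angle-Bracket, 12 in. "))
print(common_word_count("angle bracket", "Simpson Strong-Tie 12-Gauge Angle"))
```

On Spark, functions like these would typically be applied per row, for example through a UDF or a map over an RDD, but that wiring is left out to keep the sketch self-contained.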