The Data Mining Forum
This forum is about data mining, data science and big data: algorithms, source code, datasets, implementations, optimizations, etc. You are welcome to post calls for papers, data mining job ads, links to source code of data mining algorithms, or anything else related to data mining. The forum is hosted by P. Fournier-Viger. No registration is required to use this forum!
Software for Data Selection
Posted by: louissalome
Date: October 05, 2020 05:55AM

I am facing a Data Selection issue.
I have access to a large database on SAP BW with a large number of variables (hundreds of columns) and very little documentation. I want to go through every variable, discard the empty ones, and identify the useful ones.
So far I have done this very inefficiently: I load the data into Power BI Desktop and check each variable one at a time. This way I’m sure I’m not making mistakes, but it takes far too much time. I really need a first cleaning pass.
I’m looking for software that could help me select the interesting variables from a large dataset. At a minimum it should detect empty variables, and ideally it should also flag duplicates and correlations.
I’d really like to know your tips and good practices for this kind of data selection!
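
For illustration, the kind of first pass described above (empty variables, duplicated columns, strongly correlated numeric columns) might be sketched in plain Python as follows. This is only a sketch under assumptions: it supposes the table has been exported to CSV (the post does not say how the data leaves SAP BW), the names `profile_columns`, `pearson` and `SAMPLE` are hypothetical, the sample data is made up, and the 0.95 correlation threshold is arbitrary.

```python
import csv
import io
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists, or None if undefined."""
    n = len(xs)
    if n < 2:
        return None
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # a constant column has no defined correlation
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def profile_columns(rows, corr_threshold=0.95):
    """Flag empty, duplicated and strongly correlated columns.

    rows: list of dicts mapping column name -> string, e.g. from csv.DictReader.
    """
    columns = list(rows[0].keys()) if rows else []
    values = {c: [(r.get(c) or "").strip() for r in rows] for c in columns}

    # 1. Empty variables: every cell is blank.
    empty = [c for c in columns if not any(values[c])]
    kept = [c for c in columns if c not in empty]

    # 2. Duplicates: columns whose cell values are identical row by row.
    first_seen, duplicates = {}, []
    for c in kept:
        key = tuple(values[c])
        if key in first_seen:
            duplicates.append((first_seen[key], c))
        else:
            first_seen[key] = c
    duplicate_names = {dup for _, dup in duplicates}

    # 3. Correlations: only columns where every cell parses as a number.
    numeric = {}
    for c in kept:
        if c in duplicate_names:
            continue  # already reported as a duplicate
        try:
            numeric[c] = [float(v) for v in values[c]]
        except ValueError:
            continue  # text column, or numeric column with blanks
    correlated = []
    for a, b in combinations(numeric, 2):
        r = pearson(numeric[a], numeric[b])
        if r is not None and abs(r) >= corr_threshold:
            correlated.append((a, b, round(r, 3)))

    return {"empty": empty, "duplicates": duplicates, "correlated": correlated}


# Made-up sample standing in for a CSV export of the real table.
SAMPLE = """id,amount,amount_copy,amount_eur,notes,region
1,10,10,9.0,,north
2,20,20,18.0,,south
3,30,30,27.5,,north
4,40,40,36.0,,east
"""

if __name__ == "__main__":
    sample_rows = list(csv.DictReader(io.StringIO(SAMPLE)))
    for kind, found in profile_columns(sample_rows).items():
        print(kind, found)
```

A report like this only narrows down the candidates. Numeric columns that contain blanks are skipped by the correlation step in this sketch, and a high correlation between an identifier and a measure (as in the sample) can be a coincidence, so the flagged columns still deserve a manual look.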



This forum is powered by Phorum and provided by P. Fournier-Viger (© 2012).